Perform Text Mining of a Given Column — word_cloud_prep

Mines a user-specified text column and returns a dataframe that is ready for creating a word cloud. Any user-defined "bigrams" (i.e., two-word phrases) supplied as a vector are identified before single words are mined.

Usage

word_cloud_prep(
  data = NULL,
  text_column = NULL,
  word_count = 50,
  known_bigrams = c("working group")
)

Arguments

data: (dataframe) Data object containing at least one column
text_column: (character) Name of column in dataframe given to `data` that contains the text to be mined
word_count: (numeric) Number of words to return, counted from most to least frequent
known_bigrams: (character) Vector of all bigrams (two-word phrases) to be mined before mining for single words

Value

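dataframe that can be used for word cloud creation. One row per bigram supplied in `known_bigrams` or single word (not including "stop words"). As the example output below shows, the columns are 'word', 'n' (number of occurrences of that word), 'angle', and 'color_groups'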

Examples

# Create a dataframe containing some example text
text <- data.frame(article_num = 1:6,
                   article_title = c("Why pigeons are the best birds",
                                     "10 ways to show your pet budgie love",
                                     "Should you feed ducks at the park?",
                                     "Locations and tips for birdwatching",
                                     "How to tell which pet bird is right for you",
                                     "Do birds make good pets?"))
                                     
# Prepare the dataframe for word cloud plotting
word_cloud_prep(data = text, text_column = "article_title")
#> # A tibble: 11 × 4
#>    word             n angle color_groups
#>    <chr>        <int> <dbl> <fct>       
#>  1 bird             3     0 9           
#>  2 birdwatching     1    90 5           
#>  3 budgie           1   -45 5           
#>  4 duck             1     0 8           
#>  5 feed             1     0 2           
#>  6 location         1     0 7           
#>  7 love             1     0 5           
#>  8 park             1     0 5           
#>  9 pet              3     0 2           
#> 10 pigeon           1    90 6           
#> 11 tip              1    90 4
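
# Supply custom bigrams and limit the number of words returned
# (the bigrams and word count below are illustrative choices, not defaults)
bird_words <- word_cloud_prep(data = text, text_column = "article_title",
                              word_count = 5,
                              known_bigrams = c("pet budgie", "pet bird"))

# A possible next step, sketched here as an assumption rather than as part of
# this function: plot the prepared dataframe with the 'ggplot2' and
# 'ggwordcloud' packages (only runs if both are installed)
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("ggwordcloud", quietly = TRUE)) {
  # Map the returned columns onto the word cloud's label, size, angle, and color
  ggplot2::ggplot(bird_words,
                  ggplot2::aes(label = word, size = n,
                               angle = angle, color = color_groups)) +
    ggwordcloud::geom_text_wordcloud() +
    ggplot2::theme_minimal()
}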