Mines a user-specified text column to create a dataframe that is ready for word cloud plotting. Any user-defined "bigrams" (i.e., two-word phrases) supplied as a vector are identified as well.
Usage
word_cloud_prep(
  data = NULL,
  text_column = NULL,
  word_count = 50,
  known_bigrams = c("working group")
)
Arguments
- data
(dataframe) Data object containing at least one column
- text_column
(character) Name of column in dataframe given to `data` that contains the text to be mined
- word_count
(numeric) Number of words to be returned (counts from most to least frequent)
- known_bigrams
(character) Vector of all bigrams (two-word phrases) to be mined before mining for single words
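The order described for `known_bigrams` (bigrams first, then single words) can be illustrated with a minimal base R sketch. This is not the package's implementation: `count_terms()` is a hypothetical helper written only for illustration, it assumes the bigrams are supplied in lowercase, and it skips the stop word removal that `word_cloud_prep()` performs.

```r
# Hypothetical helper (not part of the package): count known bigrams first,
# then count the single words that remain
count_terms <- function(text, known_bigrams = c("working group")) {
  text <- tolower(text)
  counts <- integer(0)
  for (bigram in known_bigrams) {
    # Count literal occurrences of the bigram across all strings
    hits <- gregexpr(bigram, text, fixed = TRUE)
    n <- sum(vapply(hits, function(h) sum(h > 0), integer(1)))
    if (n > 0) counts[bigram] <- n
    # Remove the bigram so its two words are not counted again on their own
    text <- gsub(bigram, " ", text, fixed = TRUE)
  }
  # Split what is left into single words and tally them
  words <- unlist(strsplit(text, "[^a-z]+"))
  words <- words[nzchar(words)]
  word_counts <- table(words)
  c(counts, setNames(as.integer(word_counts), names(word_counts)))
}

res <- count_terms(c("The working group met", "Working group notes"))
```

Because each bigram is stripped from the text before the single-word pass, "working" and "group" do not appear as separate entries in `res`.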
Value
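dataframe that can be used for word cloud creation, with one row per bigram supplied in `known_bigrams` or single word (not including "stop words"). The words are in a column named 'word'; the example output below also includes the columns 'n', 'angle', and 'color_groups'.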
Examples
# Create a dataframe containing some example text
text <- data.frame(article_num = 1:6,
                   article_title = c("Why pigeons are the best birds",
                                     "10 ways to show your pet budgie love",
                                     "Should you feed ducks at the park?",
                                     "Locations and tips for birdwatching",
                                     "How to tell which pet bird is right for you",
                                     "Do birds make good pets?"))
# Prepare the dataframe for word cloud plotting
word_cloud_prep(data = text, text_column = "article_title")
#> # A tibble: 11 × 4
#>    word             n angle color_groups
#>    <chr>        <int> <dbl> <fct>
#>  1 bird             3     0 9
#>  2 birdwatching     1    90 5
#>  3 budgie           1   -45 5
#>  4 duck             1     0 8
#>  5 feed             1     0 2
#>  6 location         1     0 7
#>  7 love             1     0 5
#>  8 park             1     0 5
#>  9 pet              3     0 2
#> 10 pigeon           1    90 6
#> 11 tip              1    90 4
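
One possible way to plot a dataframe of this shape is sketched below. It assumes the `ggplot2` and `ggwordcloud` packages are installed; neither is named in this documentation, so treat the plotting choice as an illustration rather than the intended workflow. The dataframe is rebuilt by hand from the output printed above so that the snippet runs on its own, and `cloud_df` and `cloud` are names introduced here.

```r
library(ggplot2)
library(ggwordcloud)

# Hand-built copy of the example output above (assumption: same column types)
cloud_df <- data.frame(
  word = c("bird", "birdwatching", "budgie", "duck", "feed", "location",
           "love", "park", "pet", "pigeon", "tip"),
  n = c(3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 1L, 1L),
  angle = c(0, 90, -45, 0, 0, 0, 0, 0, 0, 90, 90),
  color_groups = factor(c(9, 5, 5, 8, 2, 7, 5, 5, 2, 6, 4))
)

# Map each column to the matching word cloud aesthetic
cloud <- ggplot(cloud_df,
                aes(label = word, size = n,
                    angle = angle, color = color_groups)) +
  geom_text_wordcloud() +
  scale_size_area(max_size = 12) +  # scale text area with word frequency
  theme_minimal()

cloud
```

In practice the hand-built `cloud_df` would be replaced by the object returned from `word_cloud_prep()`.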