@thoolihan
Last active August 29, 2015 14:09
Word Count in R
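# Pool of words to draw from; repeated entries make a word more likely to be sampled
elements <- c('the', 'story', 'was', 'the', 'point', 'of', 'the',
              'story', 'a', 'quick', 'brown', 'fox', 'the', 'a',
              'jumped', 'fence', 'over')
# Draw 20,000 words at random, with replacement
words <- sample(elements, 20000, replace = TRUE)

# Number of times `word` occurs in the global `words` vector
count <- function(word) {
  sum(words == word)
}

# unique: one row per distinct word (kept as character, not factor)
df <- data.frame(word = unique(words), stringsAsFactors = FALSE)
# count: occurrences of each distinct word
df$count <- sapply(df$word, count)
# sort: most frequent first
sorted_words <- df[order(df$count, decreasing = TRUE), ]
# show: bar chart of the (up to) 20 most frequent words
top20 <- head(sorted_words, 20)
barplot(height = top20$count, names.arg = top20$word)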
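
The script above rescans `words` once per distinct word. As a possible alternative (not part of the original gist), base R's `table()` can tally every distinct value in a single call. The sketch below rebuilds the same `word`/`count` data frame that way; the `set.seed(42)` call is an assumption added only so the random sample is repeatable.

```r
# Same word pool as the script above, repeated so this block runs on its own
elements <- c("the", "story", "was", "the", "point", "of", "the",
              "story", "a", "quick", "brown", "fox", "the", "a",
              "jumped", "fence", "over")
set.seed(42)  # assumption: fixed seed, only for a repeatable sample
words <- sample(elements, 20000, replace = TRUE)

# table() counts each distinct word; sort() puts the most frequent first
counts <- sort(table(words), decreasing = TRUE)

# Rebuild a data frame with the same columns as the loop-based version
sorted_words <- data.frame(word = names(counts),
                           count = as.vector(counts),
                           stringsAsFactors = FALSE)

top20 <- head(sorted_words, 20)
barplot(height = top20$count, names.arg = top20$word)
```

Both versions should yield the same counts for the same `words` vector; the `table()` form mainly saves the explicit helper function and the repeated comparisons.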