@gourab5139014
Created August 27, 2015 14:15
Prepare a wordcloud of opinions regarding selected current affairs in India
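# Assumes Twitter API credentials were already registered with setup_twitter_oauth().
library(twitteR)
library(tm)
library(wordcloud)
library(RColorBrewer)
tweets <- searchTwitter("hardik patel", n = 1000)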
text <- sapply(tweets, function(x) x$getText())
# Drop tweets containing non-ASCII characters; iconv() returns NA when conversion fails.
dat3 <- is.na(iconv(text, "latin1", "ASCII"))
dat4 <- text[!dat3]
dat5 <- paste(dat4, collapse = ", ")
corpus <- Corpus(VectorSource(dat5))
tdm <- TermDocumentMatrix(
  corpus,
  control = list(
    removePunctuation = TRUE,
    stopwords = c("gujrat", "modi", "hardik", "patel", stopwords("english")),
    removeNumbers = TRUE,
    tolower = TRUE
  )
)
m <- as.matrix(tdm)
# Total count of each term, most frequent first.
word_freqs <- sort(rowSums(m), decreasing = TRUE)
dm <- data.frame(word = names(word_freqs), freq = word_freqs, stringsAsFactors = FALSE)
wordcloud(dm$word, dm$freq, random.order = FALSE, colors = brewer.pal(8, "Dark2"))
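
The step that drops tweets with non-ASCII characters can be tried on its own, without Twitter access. The following is a minimal sketch in base R, assuming a made-up vector `sample_text` (hypothetical, not part of the gist). It also shows why logical indexing is the safer choice here: a negative index built from an empty `grep()` result removes every element instead of none.

```r
# Toy stand-in for the tweet text; these strings are made up for illustration.
sample_text <- c("plain ascii tweet", "caf\u00e9 tweet", "another plain tweet")

# With `sub` left at its default (NA), iconv() returns NA for any element
# that cannot be converted to the target encoding.
non_ascii <- is.na(iconv(sample_text, "UTF-8", "ASCII"))

# Logical indexing keeps working even when nothing is flagged.
kept <- sample_text[!non_ascii]
print(kept)

# Contrast: grep() with no matches returns integer(0), and x[-integer(0)]
# selects nothing, so every element would be dropped.
no_hits <- grep("zzz", sample_text)
print(length(sample_text[-no_hits]))  # 0
```

The `"UTF-8"` source encoding is used here because the `\u00e9` escape produces a UTF-8 string; the script above passes `"latin1"` for the tweet text instead.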