@gchacaltana
Created May 16, 2021 07:01
Show wordcloud from corpus (R)
library(tm)            # Corpus, VectorSource, TermDocumentMatrix
library(wordcloud)     # wordcloud
library(RColorBrewer)  # brewer.pal

# Function that draws a word cloud from text content
# @param Corpus corpus Example: Corpus(VectorSource(publication)) from the tm package
wordcloud_corpus <- function(corpus) {
  # Term-document matrix
  tdm <- TermDocumentMatrix(corpus, control = list(removePunctuation = TRUE, removeNumbers = TRUE, tolower = TRUE))
  matrix_tdm <- as.matrix(tdm)
  # Term frequencies, sorted from most to least frequent
  words_freqs <- sort(rowSums(matrix_tdm), decreasing = TRUE)
  # Build a data frame of terms and their frequencies
  data_words_freqs <- data.frame(word = names(words_freqs), freq = words_freqs)
  # Draw the word cloud
  wordcloud(data_words_freqs$word, data_words_freqs$freq, random.order = FALSE, colors = brewer.pal(8, "Dark2"))
}
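
A minimal usage sketch, assuming the tm package is installed. The `publication` vector is hypothetical sample text, not part of the original gist. The sketch builds a corpus, repeats the function's term-counting step so the frequency table can be inspected without plotting, and then calls `wordcloud_corpus` only if it has already been defined in the session.

```r
library(tm)

# Hypothetical sample text; any character vector works here
publication <- c(
  "Data science uses data to answer questions.",
  "Good data and clear questions make data analysis easier."
)

# Build the corpus the same way the @param comment suggests
corpus <- Corpus(VectorSource(publication))

# Same counting step as inside wordcloud_corpus, without the plot
tdm <- TermDocumentMatrix(corpus, control = list(removePunctuation = TRUE, removeNumbers = TRUE, tolower = TRUE))
words_freqs <- sort(rowSums(as.matrix(tdm)), decreasing = TRUE)
freq_table <- data.frame(word = names(words_freqs), freq = words_freqs)
print(head(freq_table))

# Draw the cloud only if the function above is already loaded
if (exists("wordcloud_corpus")) {
  wordcloud_corpus(corpus)
}
```

Note that `wordcloud()` defaults to `min.freq = 3`, so with very short inputs most terms fall below the threshold and are not drawn. For small corpora, one option is to pass a lower `min.freq` in the `wordcloud()` call.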