@alienfluid
Created April 25, 2014 14:22
topnwords.R
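#!/usr/bin/env Rscript
# Print the N most frequent words read from stdin; N is the first command-line argument.
library(tm)
# Take only the first argument and fail early if it is missing or not a positive integer.
num.words <- as.integer(commandArgs(trailingOnly = TRUE)[1])
if (is.na(num.words) || num.words < 1) stop("usage: topnwords.R <num.words>")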
# Read all of stdin and join it into a single lower-cased string.
f <- file("stdin")
input.lines <- readLines(f)
close(f)
full.text <- tolower(paste(input.lines, collapse = " "))
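# Count every term, including one-letter words, and sort by descending frequency.
freqs <- sort(termFreq(PlainTextDocument(full.text),
                       control = list(wordLengths = c(1, Inf))),
              decreasing = TRUE)
# head() avoids NA entries when the input has fewer distinct words than requested.
freqs <- head(freqs, num.words)
for (i in seq_along(freqs)) {
  cat(freqs[i], names(freqs)[i], "\n", sep = " ")
}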