@justgrimes
Created August 9, 2012 21:07
text mining in R snippet
library(tm)  # stemDocument() further down also needs the SnowballC package installed
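# Read every file in the directory into a corpus. The files are treated as English,
# matching the English stopword and stemming steps below.
a <- Corpus(DirSource("C:/Users/jgrimes/Desktop/text/"), readerControl = list(language = "en"))
# summary(a)
# Normalise the encoding to UTF-8; bytes that cannot be converted are kept as hex escapes.
a <- tm_map(a, content_transformer(function(x) iconv(enc2utf8(x), sub = "byte")))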
a <- tm_map(a, removePunctuation)
a <- tm_map(a, removeNumbers)
a <- tm_map(a, stripWhitespace)
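# tolower is a base R function, so wrap it to keep the tm document structure intact
a <- tm_map(a, content_transformer(tolower))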
a <- tm_map(a, removeWords, stopwords("english"))
a <- tm_map(a, stemDocument, language = "english")
adtm <- DocumentTermMatrix(a)
inspect(adtm)
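
The snippet above depends on a local directory of text files, so it cannot be run as posted. As a rough sketch of the same pipeline on data anyone can reproduce, the example below builds a corpus from three made-up sentences held in memory (the `docs` vector and the `dtm` and `term_freq` names are illustrative assumptions, not part of the original gist). Stemming is left out so that the SnowballC package is not required.

```r
library(tm)

# Hypothetical toy documents standing in for the files in the directory
docs <- c("Text mining in R is fun.",
          "R makes text mining easy!",
          "Mining 2 corpora takes time.")

# VectorSource treats each element of the character vector as one document
corpus <- VCorpus(VectorSource(docs))

# Same cleaning steps as the gist, with base R functions wrapped in content_transformer()
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stripWhitespace)

# Rows are documents, columns are terms, cells are term counts
dtm <- DocumentTermMatrix(corpus)
inspect(dtm)

# Total count of each term across all documents, most frequent first
term_freq <- sort(colSums(as.matrix(dtm)), decreasing = TRUE)
print(term_freq)

# Terms that occur at least twice in the whole corpus
print(findFreqTerms(dtm, lowfreq = 2))
```

Note that `DocumentTermMatrix()` by default keeps only terms of three or more characters, so the one-letter token "r" does not appear as a column here.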