@thiagomarzagao
Created May 31, 2016 01:07
library(tm)
library(Matrix)
setwd('/Users/thiagomarzagao/Dropbox/dataScience/UnB-CIC/aulaText/')
# Read the first 1,000 rows; columns get the default names V1, V2, ...
comprasnet <- read.table('subset.csv',
                         stringsAsFactors = FALSE,
                         sep = ',',
                         nrows = 1000)
# Build a corpus from the text column (V2) and weight terms by TF-IDF.
corpus <- Corpus(VectorSource(comprasnet$V2))
tfidf <- DocumentTermMatrix(corpus, control = list(weighting = weightTfIdf))
# Convert the document-term matrix to a dense matrix (documents x terms).
tfidf <- as.matrix(tfidf)
labels <- as.factor(comprasnet$V1)
library(caret)
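# 10-fold cross-validation.
fitControl <- trainControl(method = 'repeatedcv',
                           number = 10)
# Train caret's 'dnn' model on the TF-IDF features, passed as a sparse matrix.
trainedModel <- train(x = Matrix(tfidf, sparse = TRUE),
                      y = labels,
                      method = 'dnn',
                      trControl = fitControl)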