@dsilvadeepal
Last active August 14, 2018 20:18
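# Frequency cutoff, expressed as a percentage of the number of messages (rows in the DTM)
threshold <- 0.1
# Convert the percentage into an absolute minimum term frequency
min_freq <- round(nrow(sms_dtm) * (threshold / 100), 0)
min_freq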
# Create vector of most frequent words
freq_words <- findFreqTerms(x = sms_dtm, lowfreq = min_freq)
str(freq_words)
# Filter the training and test DTMs to keep only the frequent terms
sms_dtm_freq_train <- sms_dtm_train[ , freq_words]
sms_dtm_freq_test <- sms_dtm_test[ , freq_words]
dim(sms_dtm_freq_train)