@aaronkub
Created January 24, 2019 12:37
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.svm import LinearSVC
stop_words = ['in', 'of', 'at', 'a', 'the']
ngram_vectorizer = CountVectorizer(binary=True, ngram_range=(1, 3), stop_words=stop_words)
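# reviews_train_clean, reviews_test_clean and target are not defined in this
# snippet; they are expected to exist from earlier preprocessing.
ngram_vectorizer.fit(reviews_train_clean)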
X = ngram_vectorizer.transform(reviews_train_clean)
X_test = ngram_vectorizer.transform(reviews_test_clean)
X_train, X_val, y_train, y_val = train_test_split(
    X, target, train_size=0.75
)
for c in [0.001, 0.005, 0.01, 0.05, 0.1]:
    svm = LinearSVC(C=c)
    svm.fit(X_train, y_train)
    print("Accuracy for C=%s: %s"
          % (c, accuracy_score(y_val, svm.predict(X_val))))
# Accuracy for C=0.001: 0.88784
# Accuracy for C=0.005: 0.89456
# Accuracy for C=0.01: 0.89376
# Accuracy for C=0.05: 0.89264
# Accuracy for C=0.1: 0.8928
final = LinearSVC(C=0.01)
final.fit(X, target)
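# `target` is reused as y_true here, which assumes the test labels are
# ordered the same way as the training labels.
print("Final Accuracy: %s"
      % accuracy_score(target, final.predict(X_test)))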
# Final Accuracy: 0.90064
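To see what the vectorizer settings above actually produce, here is a minimal sketch on two made-up sentences (`toy_reviews` is a hypothetical stand-in, not the review data). It assumes scikit-learn's default tokenizer, which drops single-character tokens, and it relies on stop words being removed before n-grams are built.

```python
from sklearn.feature_extraction.text import CountVectorizer

stop_words = ['in', 'of', 'at', 'a', 'the']

# Hypothetical toy corpus, used only to illustrate the settings above.
toy_reviews = ["the best of the best", "a waste of time"]

vec = CountVectorizer(binary=True, ngram_range=(1, 3), stop_words=stop_words)
X_toy = vec.fit_transform(toy_reviews)

# Stop words are filtered out first, so n-grams span the remaining tokens:
# "waste of time" contributes the bigram "waste time".
features = sorted(vec.vocabulary_)
print(features)

# binary=True records presence (1) or absence (0), not counts, so the
# repeated word "best" in the first sentence is still stored as 1.
print(X_toy.toarray())
```

Each column is a unigram, bigram or trigram over the tokens left after stop-word removal, which is the same representation that `LinearSVC` is trained on above.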