@amitrani6
Created October 4, 2019 02:47
A Naive Bayes Classifier for NLP.
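# Create a Naive Bayes classifier from text data stored in a pandas DataFrame.
# `df` is expected to have a `lemmatize_text` column (features) and a `show_name` column (labels).
from sklearn import metrics
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Split the DataFrame into training (80%) and test (20%) sets
X_train, X_test, y_train, y_test = train_test_split(df.lemmatize_text, df.show_name, test_size=0.2, random_state=42)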
# Create a scikit-learn pipeline for Naive Bayes classification:
# raw text -> token counts -> TF-IDF weights -> multinomial Naive Bayes
text_clf = Pipeline([
    ('count_vectorizer', CountVectorizer()),
    ('tfidf_vectorizer', TfidfTransformer()),
    ('clf', MultinomialNB()),
])
# Fit the pipeline on the training data
text_clf.fit(X_train, y_train)
# Predict the categories of the test data
test_predictions = text_clf.predict(X_test)
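# Evaluate the predictions against the scripts' actual classes.
# `le` is not defined in this snippet; it is presumably a LabelEncoder fitted earlier on df.show_name.
print(metrics.classification_report(y_test, test_predictions,
                                    target_names=le.classes_))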
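
The snippet above depends on a `df` and an `le` that are created elsewhere. To try the same pipeline in isolation, here is a minimal self-contained sketch. The toy corpus, the label names, and the variable names (`texts`, `labels`, `toy_clf`, `new_lines`) are made up for illustration and are not part of the original gist; real script data will behave differently.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical toy corpus standing in for df.lemmatize_text and df.show_name
texts = [
    "dragon sword throne king battle",
    "king throne dragon winter wall",
    "sword battle knight dragon castle",
    "coffee apartment friend date joke",
    "friend joke coffee wedding apartment",
    "date friend apartment sandwich joke",
]
labels = ["fantasy_show"] * 3 + ["sitcom"] * 3

# Same three steps as the pipeline above: counts -> TF-IDF -> Naive Bayes
toy_clf = Pipeline([
    ("count_vectorizer", CountVectorizer()),
    ("tfidf_vectorizer", TfidfTransformer()),
    ("clf", MultinomialNB()),
])
toy_clf.fit(texts, labels)

# The pipeline accepts raw strings; vectorizing happens inside predict()
new_lines = [
    "the king rode a dragon into battle",
    "a joke over coffee with a friend",
]
predicted = [str(p) for p in toy_clf.predict(new_lines)]
print(predicted)

# The fitted classifier stores the class labels in sorted order
print(list(toy_clf.named_steps["clf"].classes_))
```

Because the labels here are plain strings, `classification_report` would print them directly and no `target_names` argument would be needed; that argument only matters when the labels have been encoded as integers.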