Sentiment analysis with NLTK and scikit-learn (sklearn)
# preprocessing, vec and clf_rfc are not defined in this snippet; they are
# assumed to come from earlier steps (text cleaner, fitted vectorizer,
# trained random forest classifier)
import pandas as pd

# https://twitter.com/realDonaldTrump/status/1309268149242597377
tweet = 'We are providing better care, and more choice, at lower cost. We are delivering a healthier, safer, brighter, and more prosperous future for EVERY citizen in our magnificent land – because we are proudly putting AMERICA FIRST!'
preprocess_tweet = preprocessing(tweet)
print(preprocess_tweet)
# Output:
# ['provide', 'better', 'care', 'choice', 'lower', 'cost', 'deliver', 'healthier', 'safer', 'brighter', 'prosperous', 'future', 'EVERY', 'citizen', 'magnificent', 'land', 'proudly', 'put', 'AMERICA', 'FIRST']

# vectorize the tweet
tweet_vec = vec.transform(pd.Series([tweet]))
# predict a label
tweet_prediction = clf_rfc.predict(tweet_vec.toarray())
# NOTE: this comparison assumes the class labels are the strings '1' and '0';
# with integer labels it is always False, so the result is always 'negative'
tweet_prediction = 'positive' if tweet_prediction[0] == '1' else 'negative'
print('{} has been predicted for the tweet {}'.format(tweet_prediction, tweet))
# Output:
# negative has been predicted for the tweet We are providing better care, and more choice, at lower cost. We are delivering a healthier, safer, brighter, and more prosperous future for EVERY citizen in our magnificent land – because we are proudly putting AMERICA FIRST!
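The mapping from the predicted label to 'positive' or 'negative' above compares against the string '1'. Whether the classifier was trained on string or integer labels is not shown here, so the following is only a sketch of a more tolerant mapping. `label_to_text` is a hypothetical helper, not part of the original snippet, and it assumes the labels are 0/1 given either as integers or as strings.

```python
def label_to_text(label):
    """Map a binary class label to 'positive' or 'negative'.

    Hypothetical helper (not in the original snippet). It compares the
    string form of the label, so 1, '1' and NumPy integers equal to 1
    are all treated as the positive class. Float labels such as 1.0
    are NOT handled, since str(1.0) is '1.0'.
    """
    return 'positive' if str(label).strip() == '1' else 'negative'


# The original expression only matches the string '1'; an integer label
# never equals a string in Python, so it falls through to 'negative'.
print('positive' if 1 == '1' else 'negative')  # negative

print(label_to_text(1))    # positive
print(label_to_text('1'))  # positive
print(label_to_text(0))    # negative
```

With this helper, the mapping step would read `tweet_prediction = label_to_text(tweet_prediction[0])`. This does not establish why the tweet was classified as negative; it only removes one possible cause that does not depend on the model itself.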