Understanding and Extracting Feelings from Data
Sentiment Analysis Flow:
- Split the text into individual words (tokenization)
- Count how many times each word appears (bag of words)
- Look up each word's sentiment value in a lexicon of pre-scored words, and use those values to classify the sentiment of the text
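
The three steps above can be sketched in plain Python. This is only an illustration of the flow, not how TextBlob is implemented: `TOY_LEXICON`, its scores, and the helper names `tokenize`, `bag_of_words`, and `lexicon_polarity` are all hypothetical, and averaging the scores is just one simple way to combine them.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon: these words and scores are made up for illustration.
TOY_LEXICON = {"good": 0.5, "great": 0.75, "bad": -0.5, "awful": -1.0}

def tokenize(text):
    """Step 1: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def bag_of_words(tokens):
    """Step 2: count how many times each token occurs."""
    return Counter(tokens)

def lexicon_polarity(text, lexicon=TOY_LEXICON):
    """Step 3: average the lexicon scores of the known words, weighted by count."""
    counts = bag_of_words(tokenize(text))
    scored = [(lexicon[word], n) for word, n in counts.items() if word in lexicon]
    if not scored:
        return 0.0  # no word is in the lexicon: treat the text as neutral
    total = sum(score * n for score, n in scored)
    return total / sum(n for _, n in scored)

print(bag_of_words(tokenize("Good, good; BAD!")))
print(lexicon_polarity("The food was great but the service was awful"))
```

Words missing from the lexicon are ignored here rather than counted as neutral, so a single scored word decides the result for an otherwise unknown sentence.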
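
TextBlob reports two scores:
- Polarity: how positive or negative the text is, from -1.0 (most negative) to 1.0 (most positive)
- Subjectivity: opinion vs. fact, from 0.0 (objective) to 1.0 (subjective)
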
from textblob import TextBlob
import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
wiki = TextBlob("how in the hell do you want me to stay calm?!")
wiki.tags
# [('how', 'WRB'), ('in', 'IN'), ('the', 'DT'), ('hell', 'NN'), ('do', 'VBP'), ('you', 'PRP'), ('want', 'VB'), ('me', 'PRP'), ('to', 'TO'), ('stay', 'VB'), ('calm', 'NNS')]
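wiki.words
# WordList(['how', 'in', 'the', 'hell', 'do', 'you', 'want', 'me', 'to', 'stay', 'calm'])
wiki.sentiment.polarity
# 0.37500000000000006

# The next two outputs apparently belong to a second, more negative sentence
# (its original text and punctuation are not shown; only its words are known):
wiki.words
# WordList(['i', 'am', 'so', 'unhappy', 'and', 'mad', 'this', 'is', 'totally', 'unacceptable'])
wiki.sentiment
# Sentiment(polarity=-0.4083333333333334, subjectivity=0.8833333333333333)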
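
A polarity score is often turned into a coarse label before it is used. The helper below is a minimal sketch of that step: `label_polarity` and its `threshold` of 0.1 are assumptions made for this example, not part of TextBlob.

```python
def label_polarity(polarity, threshold=0.1):
    """Map a polarity score in [-1.0, 1.0] to a coarse label.

    The threshold is an assumed cut-off for "close enough to neutral";
    TextBlob itself does not define one.
    """
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"

# Applied to the two polarity values recorded above
print(label_polarity(0.37500000000000006))   # positive
print(label_polarity(-0.4083333333333334))   # negative
```

A wider threshold sends more borderline texts to "neutral"; the right value depends on the data.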