- Tokenization
  - Lowercase
  - Other cleanup?
- Word Tagging (weight each word with a positive/negative score based on its term frequency and inverse document frequency)
- Other Approaches
  - Bigrams
  - Trigrams
  - Sentence Structure
  - Part-of-speech (POS) usage
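
The outline above could be sketched roughly as follows. This is only one possible reading of the notes, not a settled design: the function names (`tokenize`, `ngrams`, `word_scores`), the +1/-1 document labels, and the exact scoring formula (label times term frequency times inverse document frequency, summed over documents) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return all contiguous n-token sequences, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def word_scores(docs):
    """Score each word from labeled documents.

    docs is a list of (text, label) pairs, where label is +1 (positive)
    or -1 (negative). This labeling scheme is an assumption.
    """
    tokenized = [(tokenize(text), label) for text, label in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each word appear?
    df = Counter()
    for tokens, _ in tokenized:
        df.update(set(tokens))

    scores = Counter()
    for tokens, label in tokenized:
        tf = Counter(tokens)
        for word, count in tf.items():
            idf = math.log(n_docs / df[word])
            # Words that occur in every document get idf = 0 and so carry no weight.
            scores[word] += label * (count / len(tokens)) * idf
    return dict(scores)


if __name__ == "__main__":
    training = [("Great great movie", 1), ("Terrible movie", -1)]
    print(word_scores(training))
    print(ngrams(tokenize("Not a good film"), 2))
```

Bigrams and trigrams can be scored the same way by feeding the output of `ngrams` into the counting step in place of single tokens. Sentence structure and part-of-speech features would need a tagger, which this sketch does not cover.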
Created
October 13, 2015 04:58