- Introduction and Motivation
  - About me
- Ad profiling: What can be tracked
- Government tracking: What can be tracked
- Low-effort
- Medium-effort
# What is NLTK?
NLTK is a natural-language processing library written in Python. It is used for many applications, including analyzing [movie and restaurant reviews](http://crowdsourcing-class.org/assignments/downloads/pak-paroubek.pdf).
More on that [here](https://github.com/nltk/nltk/wiki/Sentiment-Analysis).
Here are [examples](http://www.laurentluce.com/posts/twitter-sentiment-analysis-using-python-and-nltk/) of how to do sentiment analysis in Python.
Note that the tweets in those examples are hand-labelled with regard to sentiment.
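As a rough sketch of what "hand-labelled" means in practice, the snippet below turns a few made-up labelled tweets into `(features, label)` pairs, which is the input shape NLTK's classifiers accept. The tweets, the `tokenize` and `extract_features` helpers, and the three-character word cutoff are all illustrative assumptions, not taken from the linked posts. The feature extraction itself uses only the standard library.

```python
# Hypothetical hand-labelled tweets: (text, sentiment label).
labelled_tweets = [
    ("I love this car", "positive"),
    ("This view is amazing", "positive"),
    ("I do not like this car", "negative"),
    ("This view is horrible", "negative"),
]


def tokenize(text):
    """Lowercase, split on whitespace, and drop words shorter than 3 characters."""
    return [w for w in text.lower().split() if len(w) >= 3]


# Vocabulary: every word seen anywhere in the labelled data.
vocabulary = sorted({w for text, _ in labelled_tweets for w in tokenize(text)})


def extract_features(text):
    """Map a tweet to a dict of boolean 'contains(word)' features."""
    words = set(tokenize(text))
    return {"contains(%s)" % w: (w in words) for w in vocabulary}


# One (feature dict, label) pair per hand-labelled tweet.
train_set = [(extract_features(text), label) for text, label in labelled_tweets]

# With NLTK installed, train_set can be handed to a classifier, for example:
#   import nltk
#   classifier = nltk.NaiveBayesClassifier.train(train_set)
#   classifier.classify(extract_features("I love this view"))
```

With only four tweets the resulting classifier would be a toy. The point is the shape of the data: a person supplies the labels, and the code only derives features from the text.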
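
Mapping class labels to integer codes with scikit-learn's `LabelEncoder`:

    from sklearn.preprocessing import LabelEncoder

    le = LabelEncoder()
    # Duplicate labels are fine; classes_ holds the sorted unique values
    le.fit(['a', 'b', 'c', 'c'])
    # Pair each class with its integer code
    dict(zip(le.classes_, range(len(le.classes_))))
    >>> {'a': 0, 'b': 1, 'c': 2}
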
Markdown is a lightweight markup language, in the same family as HTML but much simpler. If you use Word or HTML to write specs and documentation, Markdown may be a better, more lightweight option for you. Formatting something in Markdown can take much less time than wrangling Word, and if your development team agrees to host it on a server, all your documents will live in one central repository instead of sitting on your computer.
That said, there is a slight learning curve to Markdown if you've never used a markup language before.
Here are the recommendations I've come across:
- Markdown does not auto-generate tables of contents. You have to do it yourself.
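Building the table of contents by hand gets tedious, so here is a minimal sketch of a script that generates one from a document's `#` headings. The `slugify` and `build_toc` names are made up for this example, and the anchor format (lowercase, punctuation dropped, spaces turned into hyphens) is an assumption that approximates what GitHub-style renderers produce. Check it against whatever renders your Markdown.

```python
import re


def slugify(title):
    """Approximate a heading anchor: lowercase, drop punctuation, hyphenate spaces."""
    slug = title.strip().lower()
    slug = re.sub(r"[^\w\s-]", "", slug)
    return re.sub(r"\s+", "-", slug)


def build_toc(markdown_text):
    """Return a nested bullet list with one link per '#'-style heading."""
    entries = []
    in_fence = False
    for line in markdown_text.splitlines():
        # Skip anything inside fenced code blocks, where '#' is usually a comment.
        if line.startswith("```"):
            in_fence = not in_fence
            continue
        match = re.match(r"^(#{1,6})\s+(.+?)\s*$", line)
        if match and not in_fence:
            level = len(match.group(1))
            title = match.group(2)
            indent = "  " * (level - 1)
            entries.append("%s- [%s](#%s)" % (indent, title, slugify(title)))
    return "\n".join(entries)


sample = "# What is NLTK?\n\nSome text.\n\n## Sentiment analysis\n"
print(build_toc(sample))
```

Paste the printed list at the top of the document and re-run the script whenever the headings change.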
[mpltools](http://tonysyu.github.io/mpltools/index.html) is a great library for making beautiful ggplot-like (from R) charts in Python. Here are some examples. Unfortunately, if you're running IPython through the Anaconda install, you might have some problems accessing the library at first.
If you run:

    pip install mpltools

it will install the library into your system Python 2.7. But the IPython notebook viewer in Anaconda uses this Python:

    $ which python
    /Users/yourname/anaconda/bin/python

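One way to avoid the mismatch is to ask the running interpreter for its own path and invoke pip through that same interpreter. This is a small sketch rather than an Anaconda-specific recipe. It only builds and prints the command, so nothing is installed until you run it yourself.

```python
import sys

# Path of the interpreter executing this code (in a notebook, the kernel's Python).
print(sys.executable)

# Running pip as a module of that interpreter installs into its own site-packages,
# rather than into whichever Python the bare "pip" command happens to belong to.
install_cmd = [sys.executable, "-m", "pip", "install", "mpltools"]
print(" ".join(install_cmd))
```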
To see where mpltools is installed, you can run this in the interpreter: