File taken from http://cdn.cognitect.com/stateofclojure/2014/clj-feature.txt
$ cat /tmp/clj-feature.txt | grep -v '^$' | sort | uniq -c | sort -nr | head
10 "feature expressions"
7 "Types"
5 "pattern matching"
5 "fast startup"
5 "Feature expressions"
4 "debugger"
4 "Static typing"
4 "Faster startup"
3 "type checking"
3 "none"
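I really should split the lines into bigrams/trigrams and count frequencies from there. As it stands, this only groups responses whose entire line matches exactly, which is why "feature expressions" and "Feature expressions" are counted separately.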
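
A rough sketch of what the bigram/trigram version might look like, sticking with the same shell tools. The `ngrams` function is a hypothetical helper, not something from the survey data or any existing tool. It assumes words are whitespace-separated, and it lowercases the text and strips the double quotes so that case differences no longer split the counts.

```shell
# Hypothetical helper: print every n-word sequence found on each input line.
# Reads stdin; n defaults to 2 (bigrams). Case is folded and double quotes
# are removed before splitting on whitespace.
ngrams() {
  tr '[:upper:]' '[:lower:]' |
  tr -d '"' |
  awk -v n="${1:-2}" '{
    for (i = 1; i + n - 1 <= NF; i++) {
      gram = $i
      for (j = 1; j < n; j++) gram = gram " " $(i + j)
      print gram
    }
  }'
}

# Same counting pipeline as above, applied to bigrams instead of whole lines.
# Guarded so the block still runs where the file is absent.
if [ -r /tmp/clj-feature.txt ]; then
  ngrams 2 < /tmp/clj-feature.txt | sort | uniq -c | sort -nr | head
fi
```

Use `ngrams 3` for trigrams. Note that a line with fewer than n words (for example "Types" or "none") produces no n-grams at all, so single-word answers would need a separate unigram pass.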