@erochest
Created July 22, 2015 14:43
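import nltk
from nltk.corpus import brown

# Word frequencies for the Brown news category (not used below).
fd = nltk.FreqDist(brown.words(categories='news'))
# Map each word to a frequency distribution over the tags it appears with.
cfd = nltk.ConditionalFreqDist(brown.tagged_words(categories='news'))

avgs = []
for (word, freqs) in cfd.items():
    n = float(freqs.N())        # total occurrences of this word
    max_tag = freqs.max()       # the word's most frequent tag
    avg = freqs[max_tag] / n    # share of occurrences carrying that tag
    print('likelihood for {} = {}'.format(word, avg))
    avgs.append(avg)

print('')
# Unweighted mean over word types; rare and common words count equally.
print('average correct = {}'.format(sum(avgs) / len(avgs)))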
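
The figure printed above is an unweighted mean over word types, so a word seen once counts as much as a word seen thousands of times. A minimal sketch of how that differs from a token-weighted score is given below. It uses only the standard library and a small made-up `tagged_words` list in place of the Brown corpus; the helper names `per_type_average` and `per_token_accuracy` are hypothetical and do not come from the gist or from NLTK.

```python
from collections import Counter, defaultdict

# Hypothetical toy data standing in for brown.tagged_words(categories='news').
tagged_words = [
    ('the', 'AT'), ('run', 'NN'), ('the', 'AT'), ('run', 'VB'),
    ('the', 'AT'), ('run', 'VB'), ('dog', 'NN'), ('the', 'AT'),
]

# Count tags per word, similar in spirit to a conditional frequency distribution.
tag_counts = defaultdict(Counter)
for word, tag in tagged_words:
    tag_counts[word][tag] += 1


def per_type_average(tag_counts):
    """Unweighted mean over word types of the most frequent tag's share."""
    shares = [max(c.values()) / sum(c.values()) for c in tag_counts.values()]
    return sum(shares) / len(shares)


def per_token_accuracy(tag_counts):
    """Share of all tokens whose tag is their word's most frequent tag."""
    correct = sum(max(c.values()) for c in tag_counts.values())
    total = sum(sum(c.values()) for c in tag_counts.values())
    return correct / total


print('per-type average = {}'.format(per_type_average(tag_counts)))
print('per-token accuracy = {}'.format(per_token_accuracy(tag_counts)))
```

On real data the two numbers can diverge noticeably, because frequent words dominate the token-weighted score while the long tail of rare words dominates the per-type mean.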