@AlexDel
Created January 15, 2012 14:32
NLTK Ex 2.23.a: plotting Zipf's law for the Brown corpus
import nltk
import matplotlib.pyplot as plt

words = nltk.corpus.brown.words()  # take every word in the Brown corpus

def zipf_law(words):
    freq_dist = nltk.FreqDist(words)  # count the occurrences of each word
    xaxis = []
    yaxis = []
    # most_common() yields (word, count) pairs in descending order of frequency;
    # keys() is not guaranteed to be frequency-sorted in NLTK 3
    for rank, (word, count) in enumerate(freq_dist.most_common(), start=1):
        xaxis.append(rank)
        yaxis.append(count)
    # pair the two lists
    zipf_dist = (xaxis, yaxis)
    return zipf_dist

zipf_dist = zipf_law(words)
# draw the plot
plt.plot(zipf_dist[0], zipf_dist[1])
plt.show()
# for a clearer picture, limit the plot to the 1000 most frequent words
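
The closing comment suggests limiting the plot to the first 1000 words. A minimal sketch of that idea follows. The helper names `rank_frequency` and `plot_zipf` and the `limit` parameter are hypothetical, introduced only for illustration, and `collections.Counter` stands in for `nltk.FreqDist` so the counting step runs without NLTK. The log-log axes are an added assumption, not part of the original script; under Zipf's law the curve should look roughly like a straight line on that scale.

```python
from collections import Counter


def rank_frequency(words, limit=1000):
    """Return (ranks, freqs) for the `limit` most frequent words.

    Ranks start at 1, and freqs are in descending order of frequency.
    """
    top = Counter(words).most_common(limit)
    ranks = list(range(1, len(top) + 1))
    freqs = [count for _, count in top]
    return ranks, freqs


def plot_zipf(words, limit=1000):
    """Plot frequency against rank on log-log axes (an assumed choice)."""
    import matplotlib.pyplot as plt  # imported here so counting works without it

    ranks, freqs = rank_frequency(words, limit)
    plt.loglog(ranks, freqs)
    plt.xlabel("rank")
    plt.ylabel("frequency")
    plt.show()


# Usage, assuming the Brown corpus has already been downloaded:
#   import nltk
#   plot_zipf(nltk.corpus.brown.words())
```

Cutting the list off at `limit` drops the long tail of words that occur only once or twice, which otherwise dominates the right-hand side of the plot.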