@mikejs
Created April 23, 2010 20:09
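#!/usr/bin/env python
# Average words per sentence for each U.S. inaugural address (NLTK corpus).
from __future__ import division  # true division under Python 2
import nltk

# Machine-specific location of the NLTK data; adjust for your system.
nltk.data.path.append('/Users/mike/.nltk_data')
inaugural = nltk.corpus.inaugural

with open('./sentence_lengths.csv', 'w') as out:
    for fileid in inaugural.fileids():
        # Count only alphabetic tokens so punctuation and numbers are excluded.
        avg_len = (len([w for w in inaugural.words(fileid) if w.isalpha()]) /
                   len(inaugural.sents(fileid)))
        # File IDs begin with the four-digit year, e.g. "1789-Washington.txt".
        out.write("%s,%s\n" % (fileid[0:4], avg_len))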