@madfriend
Created December 3, 2014 21:42
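import glob

# Maps each word to its total number of occurrences across the collection.
index = dict()

for (i, filename) in enumerate(glob.glob('collection/*.txt')):
    # Progress output: distinct words seen so far, then the file number.
    print("%d %d" % (len(index), i))
    with open(filename) as f:
        for line in f:
            # split() with no argument splits on any whitespace and skips
            # empty strings, so blank lines do not add a '' entry.
            words = line.split()
            for word in words:
                try:
                    index[word] += 1
                except KeyError:
                    # First occurrence of this word.
                    index[word] = 1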
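
The same counting can be sketched more compactly with `collections.Counter` from the standard library, which handles the "first occurrence" case itself. This is only an alternative sketch, not the gist author's code: the helper names `count_words` and `build_index` are hypothetical, and it assumes words are separated by whitespace and files are readable with the default encoding.

```python
from collections import Counter
import glob


def count_words(lines):
    """Count whitespace-separated words in an iterable of text lines."""
    counts = Counter()
    for line in lines:
        # Counter.update adds one to the count of each word in the list.
        counts.update(line.split())
    return counts


def build_index(pattern):
    """Sum word counts over every file matching a glob pattern."""
    index = Counter()
    for filename in glob.glob(pattern):
        with open(filename) as f:
            # A file object iterates over its lines; updating a Counter
            # with another Counter adds the counts together.
            index.update(count_words(f))
    return index


if __name__ == "__main__":
    index = build_index('collection/*.txt')
    # Show the ten most frequent words with their counts.
    for word, count in index.most_common(10):
        print("%s %d" % (word, count))
```

Splitting the work into a function that takes lines and a function that takes a glob pattern keeps the counting logic usable on in-memory text as well as on files.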