@suminb
Created October 6, 2011 13:55
Counting word frequency
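# Python 3 version of the snippet.
import re

# Read the whole input file as UTF-8 text; the context manager closes it.
with open('freq.txt', 'r', encoding='utf-8') as f:
    data = f.read()

# Map each word to the number of times it occurs.
freq = {}
# \w+ matches runs of letters, digits, and underscores (Unicode-aware).
for word in re.findall(r'\w+', data, re.UNICODE):
    if word not in freq:
        freq[word] = 1
    else:
        freq[word] += 1

# Print each word followed by its count, in first-seen order.
for k in freq:
    print(k, freq[k])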
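
The counting loop above can also be written with `collections.Counter` from the standard library. The following is a minimal sketch, assuming Python 3; the function name `count_words` and the `sample` string are hypothetical and not part of the original gist. In Python 3, `str` patterns are Unicode-aware by default, so the `re.UNICODE` flag is omitted here.

```python
import re
from collections import Counter


def count_words(text):
    """Return a Counter mapping each word token in text to its frequency.

    Hypothetical helper for illustration; tokens are runs of word
    characters, the same pattern the original snippet uses.
    """
    return Counter(re.findall(r'\w+', text))


if __name__ == '__main__':
    # Illustrative input, not taken from the gist's freq.txt.
    sample = 'the cat and the hat and the bat'
    # most_common(n) returns the n most frequent words, highest count first.
    for word, count in count_words(sample).most_common(2):
        print(word, count)
```

Run as a script, this prints `the 3` and then `and 2`. Unlike the plain dictionary loop, `Counter.most_common` gives the output sorted by frequency, which is often what a word-frequency listing needs.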