@matpalm
Created January 26, 2013 05:33
#!/usr/bin/env python
# Python 2 script: print each bigram of a tweet with its estimated search result count.
import json
import urllib

def estimated_count_for(search_term):
    # Query the Google AJAX web search endpoint for the term.
    url = 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&%s' % urllib.urlencode({'q': search_term})
    results = json.loads(urllib.urlopen(url).read())
    try:
        return results['responseData']['cursor']['estimatedResultCount']
    except (KeyError, TypeError):
        # No estimate in the response, or responseData is not a dict: count it as 0.
        return 0

tweet = "Is it just me or has the Haskell just recently figured out that statistical conjugacy can be interpreted algebraically"
tokens = tweet.split(' ')
# Emit one "bigram<TAB>count" line per adjacent pair of tokens.
for i in range(len(tokens) - 1):
    bigram = " ".join(tokens[i:i+2])
    print "%s\t%s" % (bigram, estimated_count_for(bigram))
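The script above targets Python 2. As a rough Python 3 sketch of the same two steps (splitting into bigrams and building the query URL), the following may help; the helper names `bigrams` and `search_url` are hypothetical and not part of the gist, and the URL is only constructed, never fetched.

```python
from urllib.parse import urlencode

# Endpoint copied from the script; this sketch only builds the URL and does not fetch it.
BASE_URL = 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&%s'

def bigrams(text):
    """Return adjacent word pairs, splitting on single spaces as the script does."""
    tokens = text.split(' ')
    return [" ".join(tokens[i:i + 2]) for i in range(len(tokens) - 1)]

def search_url(search_term):
    """Build the query URL for one search term (hypothetical helper name)."""
    return BASE_URL % urlencode({'q': search_term})

if __name__ == '__main__':
    # One "bigram<TAB>url" line per adjacent pair of tokens.
    for bigram in bigrams("Is it just me"):
        print("%s\t%s" % (bigram, search_url(bigram)))
```

In Python 3, `urlencode` lives in `urllib.parse` and `print` is a function; the bigram logic itself is unchanged.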
$ bigrams.py | sort -t$'\t' -k2 -n | head -n5
Haskell just 1070000
statistical conjugacy 1420000
interpreted algebraically 1430000
the Haskell 1820000
conjugacy can 1970000
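The ranking step (numeric sort on the count column, then keep the first five rows) can also be done in Python rather than in the shell. This is a minimal Python 3 sketch, assuming the input is the script's tab-separated `bigram<TAB>count` lines; `parse_tsv` and `lowest_counts` are hypothetical names, and the sample rows are the five shown in the output above.

```python
def parse_tsv(lines):
    """Split 'bigram<TAB>count' lines into (bigram, count) pairs."""
    return [tuple(line.rstrip('\n').split('\t')) for line in lines]

def lowest_counts(rows, n=5):
    """Return the n rows with the smallest counts, in ascending order."""
    # int() because the counts arrive as text and must be compared numerically.
    return sorted(rows, key=lambda row: int(row[1]))[:n]

# The five rows from the output above, deliberately shuffled.
sample = [
    "the Haskell\t1820000",
    "conjugacy can\t1970000",
    "Haskell just\t1070000",
    "interpreted algebraically\t1430000",
    "statistical conjugacy\t1420000",
]

if __name__ == '__main__':
    for bigram, count in lowest_counts(parse_tsv(sample)):
        print("%s\t%s" % (bigram, count))
```

The rarest bigrams come first, which is the point of the exercise: low counts mark the word pairs that are least common on the web.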