A Python function to test if a noun is countable. Too many requests will get you locked out, so use sparingly.
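Since too many requests can get you locked out, one way to use the function sparingly is to cache answers and pause between uncached lookups. This is a minimal sketch, not part of the gist: `ThrottledLookup` is a hypothetical wrapper around any classifier (such as the `countable_noun` function below), and the 2-second default is an arbitrary assumption, not a documented Google limit.

```python
import time


class ThrottledLookup(object):
    """Hypothetical helper: caches results and spaces out uncached calls."""

    def __init__(self, classify, min_interval=2.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.classify = classify          # e.g. countable_noun
        self.min_interval = min_interval  # seconds between requests (assumed value)
        self.clock = clock                # injectable so the wrapper can be tested
        self.sleep = sleep
        self.cache = {}
        self.last_call = None

    def __call__(self, thing):
        key = thing.strip().lower()
        if key in self.cache:
            return self.cache[key]        # answered from cache, no request made
        if self.last_call is not None:
            # wait out whatever remains of the minimum interval
            wait = self.min_interval - (self.clock() - self.last_call)
            if wait > 0:
                self.sleep(wait)
        self.last_call = self.clock()
        result = self.classify(key)
        self.cache[key] = result
        return result
```

Usage would look like `lookup = ThrottledLookup(countable_noun)` followed by `lookup('cats')`; repeated words are then served from the cache instead of triggering a new request.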
# -*- coding: utf-8 -*-
# written for Python 3 (urllib2 only exists in Python 2)
import json
import re
import urllib.parse
import urllib.request


def countable_noun(thing):
    '''
    searches Google NGram to see if a word is a countable/mass noun
    returns True if countable, False if not
    ex: cats are countable (many cats)
        bread is not (much bread)
    '''
    # format into url (quote_plus replaces spaces with + and escapes the rest)
    term = urllib.parse.quote_plus(thing)
    url = ('https://books.google.com/ngrams/graph?content=many+' + term +
           '%2C+much+' + term + '&year_start=1800&year_end=2000')
    response = urllib.request.urlopen(url)
    html = response.read().decode('utf-8')

    def mean_frequency(quantifier):
        # extract timeseries data from html source
        # if nothing usable is found, it's likely there's no match for the term
        phrase = re.escape(quantifier + ' ' + thing)
        match = re.search(r'\{"ngram": "' + phrase + r'".*?\}', html, re.IGNORECASE)
        if match is None:
            return 0.0
        try:
            data = json.loads(match.group(0))['timeseries']
            return sum(data) / float(len(data))
        except (ValueError, KeyError, TypeError, ZeroDivisionError):
            return 0.0

    # return True if countable; False if not
    return mean_frequency('many') > mean_frequency('much')
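The extraction step can be tried without contacting Google by running the same regex-and-average logic on a saved page. This is a minimal sketch, assuming the page embeds one JSON object per phrase in the `{"ngram": ..., "timeseries": [...]}` shape that the function above searches for. The names `mean_frequency`, `is_countable`, and `SAMPLE_HTML` are introduced here for illustration, and the sample numbers are invented, not real Ngram data.

```python
import json
import re


def mean_frequency(html, phrase):
    """Average the 'timeseries' values for one phrase, or 0.0 if absent."""
    pattern = r'\{"ngram": "' + re.escape(phrase) + r'".*?\}'
    match = re.search(pattern, html, re.IGNORECASE)
    if match is None:
        return 0.0
    series = json.loads(match.group(0))['timeseries']
    return sum(series) / float(len(series)) if series else 0.0


def is_countable(html, thing):
    """True when 'many <thing>' is more frequent on average than 'much <thing>'."""
    return (mean_frequency(html, 'many ' + thing) >
            mean_frequency(html, 'much ' + thing))


# Made-up page fragment for illustration only; the numbers are not real data.
SAMPLE_HTML = (
    'var data = [{"ngram": "many cats", "timeseries": [2e-07, 4e-07]}, '
    '{"ngram": "much cats", "timeseries": [1e-09, 3e-09]}];'
)

print(is_countable(SAMPLE_HTML, 'cats'))   # prints True
print(is_countable(SAMPLE_HTML, 'bread'))  # prints False (no data, so 0.0 vs 0.0)
```

Note that a word with no data for either phrase comes out as "not countable", because both averages fall back to 0.0 and the comparison is strict.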
@iamsrk

commented Jul 14, 2019

Thank you very, very much. I had been searching for this code for a long time.

@jeffThompson

Owner Author

commented Jul 15, 2019

@iamsrk – glad it helped! What are you using this for? Would be curious to know why someone else needed this random thing :)

@iamsrk

commented Jul 15, 2019

Actually, I am trying to build an ITS for English articles. Based on its rules, I needed an API to classify the countability of the latter noun, so I searched for this, and you have finally shown the method here.
