@ptitzler
Last active August 30, 2017 00:07
import requests
from urllib.parse import urlencode


def getWikipediaURL(entity, redirect_match_only=False):
    '''
    input: entity - Wikipedia search term
    input: redirect_match_only - True or False
    output: the Wikipedia page URL for 'entity' if found; otherwise the
            Wikipedia search URL (redirect_match_only == False) or
            None (redirect_match_only == True)
    '''
    # Python 3: urlencode lives in urllib.parse and encodes str values as UTF-8 by default
    url = 'https://en.wikipedia.org/w/index.php?{}'.format(urlencode({'search': entity}))
    # requests.head does not follow redirects by default, so a 302 stays visible here
    r = requests.head(url)
    if r.status_code == 302:
        # Wikipedia responds with HTTP 302 (redirect) if a full or partial match was found;
        # the Location header identifies the Wikipedia page, so return it
        return r.headers['Location']
    elif redirect_match_only:
        # no redirect, and the caller asked for redirect matches only
        return None
    else:
        # return the pre-populated search URL
        return url
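
The URL construction and the three-way return logic of getWikipediaURL can be checked without touching the network. The sketch below assumes Python 3 (`urllib.parse.urlencode`). The helpers `build_search_url` and `pick_result` are hypothetical names introduced only for illustration; they are not part of the gist.

```python
from urllib.parse import urlencode


def build_search_url(entity):
    # Hypothetical helper: builds the same search URL that getWikipediaURL requests.
    # urlencode percent-encodes str values as UTF-8 and turns spaces into '+'.
    return 'https://en.wikipedia.org/w/index.php?{}'.format(urlencode({'search': entity}))


def pick_result(status_code, location, search_url, redirect_match_only=False):
    # Hypothetical helper: mirrors the function's branches, with the HTTP
    # status code and Location header passed in instead of fetched.
    if status_code == 302:
        return location          # match found: Wikipedia redirected to the page
    if redirect_match_only:
        return None              # no redirect and the caller wants matches only
    return search_url            # fall back to the pre-populated search URL


search_url = build_search_url('Ada Lovelace')
print(search_url)
print(pick_result(302, 'https://en.wikipedia.org/wiki/Ada_Lovelace', search_url))
print(pick_result(200, None, search_url, redirect_match_only=True))
```

Keeping the status code and header as plain arguments makes the branch behavior easy to exercise in a unit test; the real function would obtain them from `requests.head(url)`.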