@sergiolucero
Last active August 19, 2017 18:12
How to pull population figures from Wikipedia pages
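import wikipedia  # third-party package: pip install wikipedia

CITIES = ['Paris', 'Barcelona', 'Tokyo', 'New York City', 'Amsterdam', 'Copenhagen', 'San Francisco']
# Need to dig deeper for these: 'population' is not contained in the summary.
AMBIGUOUS_CITIES = ['Santiago']

for city in CITIES:
    citywiki = wikipedia.page(city)
    cwsum = citywiki.summary
    # First occurrence only (there may be others); find() returns -1 if absent,
    # whereas index() would raise ValueError and stop the loop.
    poploc = cwsum.find('population')
    if poploc == -1:
        print(city, "no 'population' in summary")
        continue
    print(city, cwsum[poploc:poploc + 30])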
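
The loop above prints a raw 30-character slice starting at the word 'population', so the number still has to be read out by eye. A minimal sketch of one way to turn such a snippet into an integer is shown below. The helper name `extract_population` and the regular expression are assumptions for illustration, not part of the original gist, and the pattern is deliberately naive: it takes the first number after 'population', so a phrase such as "population in 2016 was ..." would return the year instead.

```python
import re
from decimal import Decimal

# Hypothetical helper (not in the original gist). Matches the word
# 'population', then up to 40 non-digit characters, then a number such as
# '2,102,650' or '13.96', optionally followed by 'million' or 'billion'.
_POP_RE = re.compile(
    r'population\D{0,40}?(\d[\d,]*(?:\.\d+)?)\s*(million|billion)?',
    re.IGNORECASE,
)

_SCALE = {'million': 1_000_000, 'billion': 1_000_000_000}


def extract_population(text):
    """Return the first number following 'population' as an int, or None."""
    match = _POP_RE.search(text)
    if match is None:
        return None
    # Drop thousands separators before parsing; Decimal avoids float rounding.
    number = Decimal(match.group(1).replace(',', ''))
    unit = (match.group(2) or '').lower()
    return int(number * _SCALE.get(unit, 1))
```

Inside the loop it could be called as `extract_population(cwsum)` in place of the slice. A `None` result flags cities, like 'Santiago', whose summary needs a closer look.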