@Christopher-Thornton
Last active August 27, 2021 23:44
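import pandas as pd
import wikipediaapi


def wiki_page(page_name):
    """Return a one-row DataFrame for an English Wikipedia page, or None if it does not exist."""
    # Note: newer wikipedia-api releases may also require a user_agent argument here.
    wiki_api = wikipediaapi.Wikipedia(language='en',
                                      extract_format=wikipediaapi.ExtractFormat.WIKI)
    # Keep the page object separate from the requested title string.
    page = wiki_api.page(page_name)
    if not page.exists():
        print('Page {} does not exist.'.format(page_name))
        return None
    page_data = pd.DataFrame({
        'page': page_name,
        'text': page.text,
        'link': page.fullurl,
        # Keys look like 'Category:Foo'; drop the 9-character 'Category:' prefix.
        # The outer list keeps all categories together in a single row.
        'categories': [[y[9:] for y in page.categories.keys()]],
    })
    return page_data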