@rosswd
Last active September 13, 2019 21:02
Scraping a Wikipedia table and converting it to JSON format using pandas
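'''
Uses the pandas and wikipedia libraries to create a JSON file from a Wikipedia table.
Table: https://en.wikipedia.org/wiki/List_of_first-person_shooters
'''
import pandas as pd
import wikipedia as wp

# Fetch the rendered HTML of the article as UTF-8 bytes
html = wp.page('List_of_first-person_shooters').html().encode("UTF-8")
# read_html returns one DataFrame per table; index 1 is the second table parsed from the page
df = pd.read_html(html)[1]
# Write the rows to fps.json as a JSON array with one object per row
df.to_json('fps.json', orient='records')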
beautifulsoup4==4.8.0
html5lib==1.0.1
lxml==4.4.1
pandas==0.25.1
wikipedia==1.4.0
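
The script above picks its table by position (`[1]`), which can silently point at a different table if the article is edited. A minimal sketch of one alternative, choosing the table by its column headers, is shown below. It is not part of the original gist: `SAMPLE_HTML`, `select_table`, and the column names `Title` and `Year` are hypothetical placeholders, and the sample rows are invented so the example runs offline. It assumes pandas plus one of the HTML parsers pinned above (lxml, or beautifulsoup4 with html5lib) is installed.

```python
import json
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the article HTML; the rows are placeholders, not real data.
SAMPLE_HTML = """
<table>
  <thead><tr><th>Note</th></tr></thead>
  <tbody><tr><td>A table we do not want</td></tr></tbody>
</table>
<table>
  <thead><tr><th>Title</th><th>Year</th></tr></thead>
  <tbody>
    <tr><td>Example Game A</td><td>1993</td></tr>
    <tr><td>Example Game B</td><td>1998</td></tr>
  </tbody>
</table>
"""


def select_table(html, required_columns):
    """Return the first parsed table whose header contains every required column."""
    # read_html returns a list with one DataFrame per <table> element
    tables = pd.read_html(StringIO(html))
    for table in tables:
        if set(required_columns).issubset(map(str, table.columns)):
            return table
    raise ValueError("no table with columns %r" % (required_columns,))


df = select_table(SAMPLE_HTML, ["Title", "Year"])

# orient='records' serialises the frame as a JSON array with one object per row
records_json = df.to_json(orient="records")
records = json.loads(records_json)
print(records)
```

With the real article, the string returned by `wp.page(...).html()` would take the place of `SAMPLE_HTML`, and the required column names would need to match the headers of the table on the page at the time of scraping.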