@fogonwater
Last active August 29, 2015 14:26
Minimal Wikipedia table parser using Beautiful Soup, reading from a local file
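from pprint import pprint as pp

from bs4 import BeautifulSoup

report = []

# open the local copy of the page (assumed to be saved as UTF-8);
# the context manager closes the file once it has been parsed
with open('airports.html', encoding='utf-8') as f:
    soup = BeautifulSoup(f, 'html.parser')

# assumes the page has a table with class "wikitable";
# find() returns the first match only
table = soup.find('table', class_='wikitable')

for row in table.find_all('tr'):
    tds = row.find_all('td')
    items = [td.text.strip() for td in tds]
    # skip rows with no td cells (e.g. header rows made only of th elements)
    if items:
        report.append(items)

pp(report)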
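The script above discards the header row and fails with an AttributeError if no matching table is found. Below is a minimal sketch of one way to keep the headers and guard against a missing table. The function name `parse_wikitable` and the inline `SAMPLE_HTML` are hypothetical, introduced only for illustration (the sample stands in for `airports.html`); it also assumes the first row made only of th cells is the header row.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the local airports.html file
SAMPLE_HTML = """
<table class="wikitable">
  <tr><th>Airport</th><th>IATA</th></tr>
  <tr><td>Auckland</td><td> AKL </td></tr>
  <tr><td>Wellington</td><td>WLG</td></tr>
</table>
"""


def parse_wikitable(html):
    """Return (headers, rows) for the first table with class 'wikitable'.

    Hypothetical helper: returns two empty lists when no such table exists.
    """
    soup = BeautifulSoup(html, 'html.parser')
    table = soup.find('table', class_='wikitable')
    if table is None:
        return [], []

    headers, rows = [], []
    for row in table.find_all('tr'):
        ths = [th.get_text(strip=True) for th in row.find_all('th')]
        tds = [td.get_text(strip=True) for td in row.find_all('td')]
        if tds:
            # data row: keep the td cell text, as the original script does
            rows.append(tds)
        elif ths and not headers:
            # first row made only of th cells: treat it as the header row
            headers = ths
    return headers, rows


headers, rows = parse_wikitable(SAMPLE_HTML)
# pair each data row with the header names
records = [dict(zip(headers, row)) for row in rows]
```

With the sample markup, `records` holds one dict per airport keyed by column name. As in the original script, th cells that appear inside data rows are dropped, so tables that use th for row labels would need extra handling.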