@georgemarshall
Created September 9, 2013 20:53
HTML to JSON
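#!/usr/bin/env python
"""Convert the <li> items in gallery.html into gallery.json."""
import json

from bs4 import BeautifulSoup

# Layout-only CSS classes that should not be exported as tags.
LAYOUT_CLASSES = ('box-3', 'box-6', 'is-loading')

data = {'photos': []}

# Name the parser explicitly so the result does not depend on which parsers are installed.
with open('gallery.html') as fp:
    soup = BeautifulSoup(fp, 'html.parser')

for item in soup.find_all('li'):
    # bs4 returns the class attribute as a list; default to [] when it is absent.
    classes = item.get('class', [])
    # Skip testimony items; only photo entries are exported.
    if 'testimony' in classes:
        continue
    obj = {
        'description': item.a['data-description'],
        'featured': 'box-6' in classes,
        'href': item.a['href'],
        'img': item.a.img['src'],
        'tags': [tag for tag in classes if tag not in LAYOUT_CLASSES],
        'title': item.a['title'],
    }
    data['photos'].append(obj)

with open('gallery.json', 'w') as fp:
    json.dump(data, fp, sort_keys=True, indent=4)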
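
To make the expected input and output concrete, here is a minimal sketch that runs the same extraction on an inline sample. It swaps Beautiful Soup for the standard library's `html.parser` so it runs without extra packages. The sample markup, the `GalleryParser` class, and the `LAYOUT_CLASSES` name are assumptions made for illustration; the real `gallery.html` is not shown in this gist and may be structured differently.

```python
import json
from html.parser import HTMLParser

# Layout-only classes excluded from tags (same values as in the script above).
LAYOUT_CLASSES = ('box-3', 'box-6', 'is-loading')

# Hypothetical sample input; the real gallery.html may differ.
SAMPLE_HTML = """
<ul>
  <li class="box-6 wedding is-loading">
    <a href="/photos/1.jpg" title="First dance" data-description="Evening reception">
      <img src="/thumbs/1.jpg">
    </a>
  </li>
  <li class="box-3 testimony"><p>Great photographer!</p></li>
  <li class="box-3 portrait">
    <a href="/photos/2.jpg" title="Studio portrait" data-description="Natural light">
      <img src="/thumbs/2.jpg">
    </a>
  </li>
</ul>
"""


class GalleryParser(HTMLParser):
    """Hypothetical helper: build one record per non-testimony <li>."""

    def __init__(self):
        super().__init__()
        self.photos = []
        self._current = None  # record for the currently open <li>, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'li':
            classes = (attrs.get('class') or '').split()
            if 'testimony' in classes:
                self._current = None
                return
            self._current = {
                'featured': 'box-6' in classes,
                'tags': [c for c in classes if c not in LAYOUT_CLASSES],
            }
        elif self._current is None:
            return
        elif tag == 'a' and 'href' not in self._current:
            # Only the first <a> in the item is used, like item.a in bs4.
            self._current['description'] = attrs.get('data-description')
            self._current['href'] = attrs.get('href')
            self._current['title'] = attrs.get('title')
        elif tag == 'img' and 'img' not in self._current:
            self._current['img'] = attrs.get('src')

    def handle_endtag(self, tag):
        if tag == 'li' and self._current is not None:
            self.photos.append(self._current)
            self._current = None


parser = GalleryParser()
parser.feed(SAMPLE_HTML)
data = {'photos': parser.photos}
print(json.dumps(data, sort_keys=True, indent=4))
```

With this sample, the testimony item is dropped and two photo records remain; the first is marked as featured because its class list contains `box-6`.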