@boblannon
Created July 14, 2014 16:52
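import ast
import csv
import json
from io import StringIO

import requests
from lxml import etree

# Fetch the disclosure page and parse it as HTML.
resp = requests.get('http://www.barackobama.com/contribution-disclosure/')
parsed = etree.parse(StringIO(resp.text), parser=etree.HTMLParser())

# The fourth <script> in <body> assigns `letters`, whose values are data URLs.
data_url_dict = parsed.xpath('/html/body/script[4]')[0]
# literal_eval accepts only literals, so it cannot run code served by the page.
letters = ast.literal_eval(data_url_dict.text.strip().replace("letters = ", ""))

data = []
for letter in letters:
    # Each response is wrapped in a drawNames(...); call. Strip the wrapper.
    body = requests.get(letters[letter]).text
    new = ast.literal_eval(body.replace("drawNames(", "").replace(");", "").strip())
    data.extend(new)

with open('OFA_donors.json', 'w') as f:
    json.dump(data, f)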
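# Collect every key that appears in any record to use as the CSV header.
keyset = set()
for d in data:
    keyset.update(d.keys())

# newline='' (Python 3) keeps the csv module from writing blank rows on Windows.
with open('OFA_donors.csv', 'w', newline='') as f:
    dw = csv.DictWriter(f, sorted(keyset))  # sorted for a stable column order
    dw.writeheader()
    dw.writerows(data)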
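
The script above removes the drawNames(...) wrapper with plain string replacement. A stricter alternative is sketched below, assuming the payload inside the call is valid JSON, which the gist does not confirm. The helper name `unwrap_jsonp` and the sample record are hypothetical and used only for illustration.

```python
import json
import re


# Hypothetical helper, not part of the original gist.
def unwrap_jsonp(text, callback="drawNames"):
    """Return the parsed JSON payload of a `callback(...);` response."""
    pattern = r"^\s*" + re.escape(callback) + r"\s*\((.*)\)\s*;?\s*$"
    match = re.match(pattern, text, flags=re.DOTALL)
    if match is None:
        raise ValueError("response is not a %s(...) call" % callback)
    return json.loads(match.group(1))


# Made-up sample in the assumed response shape; the field names are invented.
sample = 'drawNames([{"name": "Jane Doe", "city": "Springfield"}]);'
records = unwrap_jsonp(sample)
```

Unlike chained `replace` calls, this raises an error when a response does not have the expected shape, so a changed or failed endpoint surfaces immediately rather than producing a confusing parse failure later.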