Scrape Wisconsin state representative bios

README.md
Markdown

python-legi-write-bio

A Python scraper that pulls district contact information and biographies for Wisconsin state senators and state representatives into a text file or CSV.
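For the CSV option, a minimal sketch using the standard `csv` module might look like the following. The `write_bios_csv` helper, the column names, and the sample row are hypothetical and only illustrate the shape of the output; they are not part of the script.

```python
import csv
import io


def write_bios_csv(rows, out):
    """Write (house, district, bio) tuples to a file-like object as CSV."""
    writer = csv.writer(out)
    # Header row; these column names are illustrative, not taken from the script
    writer.writerow(["house", "district", "bio"])
    for house, district, bio in rows:
        # Collapse runs of whitespace so each bio stays on a single CSV row
        writer.writerow([house, district, " ".join(bio.split())])


# Example with placeholder text rather than real scraped data
buffer = io.StringIO()
write_bios_csv([("Assembly", 1, "Placeholder bio,\nwith a comma and a newline.")], buffer)
csv_text = buffer.getvalue()
```

When writing to a real file instead of a `StringIO` buffer, open it with `newline=''` so the `csv` module controls line endings.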

write-bio.py
Python
import requests
from lxml import html

# First and last Assembly district numbers to scrape
district = 1
endpoint = 99

# Open the output text file; the with block closes it when the loop is done
with open('output.txt', 'w', encoding='utf-8') as output_file:
    while district <= endpoint:
        # Request the bio page for this district and assign the response to r
        r = requests.get('http://legis.wisconsin.gov/w3asp/contact/legislatorpages.aspx?house=Assembly&district=' + str(district) + '&display=bio')
        # Parse the response body into an element tree
        tree = html.fromstring(r.content)

        # Select every span inside div.indent (needs the cssselect package)
        elements = tree.cssselect("div.indent span")

        for el in elements:
            # Take the element's text with surrounding whitespace removed
            data = el.text_content().strip()

            # Display the data
            print(data)

            # Write the data to the file, one element per line
            output_file.write(data + '\n')

        district = district + 1
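The loop above only requests Assembly pages, while the README also mentions state senators. Below is a rough sketch of how the URLs for both chambers might be generated. It assumes the same endpoint accepts `house=Senate`, which the source does not confirm. The `bio_url` and `all_bio_urls` helpers are hypothetical names; the district counts (99 Assembly, 33 Senate) are the sizes of the two chambers.

```python
from urllib.parse import urlencode

BASE_URL = "http://legis.wisconsin.gov/w3asp/contact/legislatorpages.aspx"

# Districts per chamber; the "Senate" value for the house parameter is an assumption
DISTRICTS_PER_HOUSE = {"Assembly": 99, "Senate": 33}


def bio_url(house, district):
    """Build the bio page URL for one district of the given chamber."""
    query = urlencode({"house": house, "district": district, "display": "bio"})
    return BASE_URL + "?" + query


def all_bio_urls():
    """Yield (house, district, url) for every district in both chambers."""
    for house, count in DISTRICTS_PER_HOUSE.items():
        for district in range(1, count + 1):
            yield house, district, bio_url(house, district)


urls = list(all_bio_urls())
```

Each yielded URL can then be passed to `requests.get` in place of the hand-built string, and the `house` and `district` values kept alongside the scraped text.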
