@ateoto
Created February 26, 2015 19:09
import csv

import requests
from bs4 import BeautifulSoup

URL = "http://www.sos.arkansas.gov/corps/search_corps.php?SEARCH=1&run={}&corp_name=%27"

rows = []
# Fetch result pages 1 through 10 of the corporation search.
for page_num in range(1, 11):
    r = requests.get(URL.format(page_num))
    if r.status_code == requests.codes.ok:
        # Name the parser explicitly so bs4 does not guess (and warn).
        soup = BeautifulSoup(r.content, "html.parser")
        # Result cells carry the class 'alt' or 'light'.
        cells = soup.find_all("td", class_=lambda c: c in ("alt", "light"))
        # Step by 4 (assumes four cells per result row) so each row is read once.
        for i in range(0, len(cells), 4):
            rows.append([c.text for c in cells[i].parent.find_all("td")])

# Python 3: text mode with newline="" lets the csv module control line endings.
with open("output.csv", "w", newline="", encoding="utf-8") as csvfile:
    csv.writer(csvfile).writerows(rows)