@alejio
Created July 17, 2015 15:15
My first scraper: Scrape zipatlas.com for population statistics per zipcode
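import requests
from bs4 import BeautifulSoup


def get_IA_pop_dens_ZIP():
    # Scrapes the table from http://zipatlas.com/us/ia/zip-code-comparison/population-density.htm
    # and from pages 2-10 at http://zipatlas.com/us/ia/zip-code-comparison/population-density.?.htm
    url_base = "http://zipatlas.com/us/ia/zip-code-comparison/"
    page1 = "population-density."
    htm = "htm"
    tables_pages = []
    for page in range(1, 11):
        # Page 1 has no page number in its URL; later pages insert "<n>." before "htm".
        if page == 1:
            url = url_base + page1 + htm
        else:
            url = url_base + page1 + str(page) + "." + htm
        r = requests.get(url)
        # Name the parser explicitly so the result does not depend on which parsers are installed.
        soup = BeautifulSoup(r.text, "html.parser")
        # The code takes the first <table> with rules="all" as the data table.
        tables = soup.find_all("table", {"rules": "all"})
        # Build one list of cell strings per row, and one list of rows per page.
        tables_pages.append([[c.string for c in row.find_all("td")] for row in tables[0].find_all("tr")])
    return tables_pages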
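
The function returns one list of rows per page, so a caller will usually want to merge the pages into a single list before using the data. Below is a minimal sketch of that step. The helper name `flatten_pages` is hypothetical and not part of the original gist. The sample values are made up to show the shape of the return value. The sketch also assumes that rows without any `td` cells (for example a header row built from `th` cells) come back as empty lists.

```python
def flatten_pages(tables_pages):
    """Merge per-page row lists into one list, skipping rows with no cells."""
    rows = []
    for page_rows in tables_pages:
        for row in page_rows:
            if row:  # assumption: rows without <td> cells come back as []
                rows.append(row)
    return rows


# Made-up sample in the shape get_IA_pop_dens_ZIP() returns (not real data).
sample = [
    [[], ["1", "00001", "10.5"]],
    [[], ["2", "00002", "7.25"]],
]
flat = flatten_pages(sample)
```

With real data the call would be `flatten_pages(get_IA_pop_dens_ZIP())`. That call needs network access and depends on the site still serving the same page layout.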