mad-beach-status

Hacked-together attempt at a Python scraper for Madison, Wis., beach status using Beautiful Soup and Requests. It pulls beach status data from the Madison and Dane County Public Health website.

This could be made better in a number of ways. Two that come to mind:

  • Display the lake that each beach is on
  • Add an if/else for beaches that are monitored by another agency and have no status available
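The second improvement could be sketched as a small fallback helper (shown in Python 3; the function name and fallback wording are assumptions, not part of the gist):

```python
def format_status(status):
    """Return a printable status, falling back to a note when the beach
    is monitored by another agency and no status was scraped.
    (Hypothetical helper; the fallback text is an assumption.)"""
    if status:
        # a non-empty status string was scraped from the page
        return status
    else:
        # no status available from Public Health
        return "Monitored by another agency"

print(format_status("Open"))  # -> Open
print(format_status(""))      # -> Monitored by another agency
```

The scraper's loop could then call this helper on each scraped status before printing or writing it, instead of assuming a status span is always present.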
#import libraries
from BeautifulSoup import BeautifulSoup
import urllib2

#set the user agent the request will identify itself with
user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US) AppleWebKit/525.19 (KHTML, like Gecko) Chrome/1.0.154.53 Safari/525.19'
headers = {'User-Agent': user_agent}

#create a file to house the scraped data
f = open('beach-status.txt', 'w')

#make your header row
f.write("Beach" + "," + "Status" + "\n")

#request the page with the custom user agent and read its content
request = urllib2.Request('http://www.publichealthmdc.com/environmental/water/beaches/', headers=headers)
html = urllib2.urlopen(request).read()

#create soup object
soup = BeautifulSoup(html)

#find each beach
beach = {'class': 'outline_box_beaches'}
divs = soup.findAll(attrs=beach)

#loop through beaches, grabbing name and status
for el in divs:
    name = el.h2.text.strip()
    status = el.div.div.span.text.strip()
    print "%s - (%s)" % (name, status)

    #write each beach to the file
    f.write(name + "," + status + "\n")

#close file
f.close()
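Once the scraper has run, the resulting beach-status.txt can be read back with Python's csv module (a Python 3 sketch; the sample rows below are placeholders, not scraped data):

```python
import csv
import io

# Sample contents in the same "Beach,Status" shape the scraper writes
# (the beach names here are made up for illustration)
sample = "Beach,Status\nExample Beach,Open\nOther Beach,Closed\n"

# DictReader uses the header row as the keys for each row dict
reader = csv.DictReader(io.StringIO(sample))
rows = list(reader)
for row in rows:
    print(row["Beach"], "-", row["Status"])
```

Since a beach name or status could itself contain a comma, writing the file with csv.writer instead of manual string concatenation would keep the output reliably parseable.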