@KameronKales
Created March 2, 2018 23:49
re-try!
import requests
from bs4 import BeautifulSoup

leads = []  # school names, in page order
rates = []  # rating text, in page order

# Fetch the first few listing pages (page=0, 1, 2).
for i in range(3):
    url = "https://www.greatschools.org/virginia/manassas/prince-william-county-public-schools/schools/?page={}".format(i)
    # Send a browser-like User-Agent with the request.
    r = requests.get(url, headers={'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36'})
    soup = BeautifulSoup(r.text, 'lxml')
    print(url)
    # School names are the text of the rs-schoolName links.
    for sub_heading in soup.find_all("a", {"class": "open-sans_sb mbs font-size-medium rs-schoolName"}):
        lead = sub_heading.text
        leads.append(lead)
    # Ratings are the text of the gs-rating spans.
    for sub_headings in soup.find_all("span", {"class": "gs-rating"}):
        rate = sub_headings.text
        rates.append(rate)

print(leads, rates)
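
The script collects names and ratings into two separate lists, so one possible follow-up is to pair them by position. The sketch below is only an illustration: `pair_schools` and the sample values are hypothetical, not part of the original gist, and it assumes every school on a page has exactly one rating, which may not hold for unrated schools.

```python
def pair_schools(names, ratings):
    """Strip whitespace and pair each name with the rating at the same index.

    Hypothetical helper: assumes the two lists are aligned one-to-one.
    """
    clean_names = [n.strip() for n in names]
    clean_ratings = [r.strip() for r in ratings]
    if len(clean_names) != len(clean_ratings):
        # A length mismatch means positional pairing would be unreliable.
        raise ValueError(
            "got {} names but {} ratings".format(len(clean_names), len(clean_ratings))
        )
    return list(zip(clean_names, clean_ratings))


# Made-up sample values standing in for the scraped `leads` and `rates`.
sample_names = ["\n  Example Elementary School ", "Sample High School\n"]
sample_ratings = [" 7 ", "5\n"]
pairs = pair_schools(sample_names, sample_ratings)
print(pairs)
```

If the lengths differ, the helper raises rather than silently misaligning names and ratings; in that case, scraping each school's containing element would likely be the more robust route.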