@brianbancroft
Created May 11, 2016 18:36
parl.gc.ca scraper
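require 'nokogiri'
require 'open-uri'

# Scrape a parl.gc.ca listing page and return its ridings and MPs.
# `page` is the URL of the page to parse.
def getConstitInfo(page)
  doc = Nokogiri::HTML(URI.open(page))
  # The numeric ID is the run of digits just before a closing ")" at the end of each href.
  regexp = /(\d+)\)$/

  ridingsList = []
  doc.css('.constituency a').each do |constit|
    # constit['href'] yields the attribute value as a String, which Regexp#match needs.
    ridingsList.push({:name => constit.text, :riding_id => regexp.match(constit['href'])[1]})
  end

  mpList = []
  doc.css('.personName a').each do |person|
    mpList.push({:name => person.text, :mp_id => regexp.match(person['href'])[1]})
  end

  # Return both lists so the caller can use them.
  {:ridings => ridingsList, :mps => mpList}
end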
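
The ID extraction can be tried on its own, without Nokogiri or a network request. The sketch below is only an illustration: `extract_id` and `ID_PATTERN` are hypothetical names that are not part of the gist, and the hrefs are made up to match the shape the regular expression implies (digits followed by a closing parenthesis at the end of the string). Unlike indexing the match directly with `[1]`, which fails with an exception when a link has no trailing ID, this helper returns nil.

```ruby
# Same pattern as the scraper: digits immediately before a final ")".
ID_PATTERN = /(\d+)\)$/

# Hypothetical helper (not in the gist): returns the trailing numeric ID
# as a String, or nil when the href does not end in "(digits)".
def extract_id(href)
  match = ID_PATTERN.match(href.to_s)
  match && match[1]
end

# Made-up hrefs, for illustration only.
puts extract_id('/members/Jane-Doe(12345)').inspect      # "12345"
puts extract_id('/constituencies/Example(678)').inspect  # "678"
puts extract_id('/members/no-id-here').inspect           # nil
```

Inside the scraper, a nil result could be used to skip a link rather than abort the whole run.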