@o-sam-o
Created September 10, 2011 11:00
Shoalhaven Council Development Applications
require 'nokogiri'
require 'open-uri'
require 'date'
require "awesome_print"
url = "http://doc.shoalhaven.nsw.gov.au/RSS/SCCRSS.aspx?ID=OpenApps"
# URI.open is needed on Ruby 3.0+, where Kernel#open no longer fetches URLs.
doc = Nokogiri::XML(URI.open(url))
comment_url = 'TODO'
# Build one hash per <item> in the RSS feed.
das = doc.xpath('//channel/item').collect do |item|
  # Re-parse the item as its own document so the absolute XPaths below
  # match only this item.
  item = Nokogiri::XML(item.to_xml)
  # The description is expected to hold an HTML table of label/value rows.
  table = Nokogiri::HTML(item.at_xpath('//description').inner_text)
  table_values = Hash[table.css('tr').collect do |tr|
    tr.css('td').collect { |td| td.inner_text.strip }
  end]
  page_info = {}
  # The title is split on whitespace: the first token is the council
  # reference, the second is skipped, and the rest is the description.
  page_info[:council_reference] = item.at_xpath('//title').inner_text.split.first
  page_info[:info_url] = item.at_xpath('//link').inner_text
  page_info[:description] = item.at_xpath('//title').inner_text.split[2..-1].join(' ')
  page_info[:date_received] = Date.strptime(table_values['Date received:'], '%d %B %Y').to_s
  page_info[:address] = table_values['Address:']
  page_info[:on_notice_to] = Date.strptime(table_values['Submissions close:'], '%d %B %Y').to_s
  page_info[:date_scraped] = Date.today.to_s
  page_info[:comment_url] = comment_url
  page_info
end
ap das
p 'Done'
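
The script above calls `Date.strptime` directly on values looked up in `table_values`, so a feed item with a missing or oddly formatted date row would raise and abort the whole run. A minimal sketch of a guard follows. The helper name `parse_feed_date` and the sample values are hypothetical and not part of the original gist.

```ruby
require 'date'

# Hypothetical helper (not in the original gist): parse a date such as
# "10 September 2011" and return an ISO 8601 string, or nil when the
# value is missing, blank, or does not match the expected format.
def parse_feed_date(value, format = '%d %B %Y')
  return nil if value.nil? || value.strip.empty?
  Date.strptime(value.strip, format).to_s
rescue ArgumentError
  # Date.strptime raises Date::Error (an ArgumentError) on a bad match.
  nil
end

# Sample values made up for illustration; 'Submissions close:' is absent.
table_values = { 'Date received:' => '10 September 2011' }

p parse_feed_date(table_values['Date received:'])
p parse_feed_date(table_values['Submissions close:'])
```

With a helper like this, `page_info[:date_received]` and `page_info[:on_notice_to]` could be set to `nil` for incomplete items instead of stopping the scrape. Whether `nil` is acceptable downstream is an assumption that depends on how the records are consumed.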