chicago restaurant health inspections
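require 'net/http'
require 'uri'
require 'nokogiri'

# POST the search form with essentially blank fields to the inspection results page.
url = URI.parse('http://webapps.cityofchicago.org/healthinspection/inspectionresultrow.jsp')
request = Net::HTTP::Post.new(url.path)
request.set_form_data({"REST"=>" ", "STR_NBR"=>"", "STR_NBR2"=>"", "STR_DIRECTION"=>"", "STR_NM"=>"", "ZIP"=>""})
response = Net::HTTP.new(url.host, url.port).start {|http| http.request(request)}

# Parse the returned HTML and save the cleaned text of each results row.
doc = Nokogiri::HTML.parse(response.body)
doc.search('#results tr').each do |tr|
  # Strip each line, drop blank ones, and rejoin with a real newline.
  # Double quotes matter here: '\n' in single quotes is a literal backslash followed by n.
  info = tr.text.each_line.map(&:strip).delete_if {|l| l.strip == ""}.join("\n")
  ScraperWiki.save(unique_keys=['text'], data = {'text' => info})
end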
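
The line-cleaning step inside the loop can be tried on its own, without hitting the city's site. The following is a minimal sketch, not part of the original scraper: the helper name `clean_row_text` and the sample string are made up for illustration, and `reject(&:empty?)` is used as a shorter stand-in for the `delete_if` block, which should behave the same once every line has been stripped. It joins with a double-quoted `"\n"`, since a single-quoted `'\n'` in Ruby is two literal characters rather than a newline.

```ruby
# Hypothetical helper: the gist inlines this chain inside the loop.
def clean_row_text(text)
  stripped = text.each_line.map(&:strip)  # trim whitespace around every line
  non_blank = stripped.reject(&:empty?)   # drop lines that are now empty
  non_blank.join("\n")                    # rejoin with real newline characters
end

# Made-up sample standing in for the text of one table row (tr.text).
sample = "  Some Cafe \n\n   123 N Example St  \n \n  Pass \n"

puts clean_row_text(sample)
```

Pulling the chain into a small method like this also makes it easy to check the transform against a few sample rows before running the full scrape.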

This line alone

info = tr.text.each_line.map(&:strip).delete_if {|l| l.strip == ""}.join("\n")

took me a number of iterations and many lines of code to implement as a solid C++ object. I think the newer C++ standard library tools are getting stronger string parsers, but I haven't played with them yet.

Owner

maxogden commented Dec 5, 2010

pure functional transforms rule!
