
Chicago restaurant health inspections

chicago-inspections.rb
Ruby
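require 'net/http'
require 'uri'
require 'nokogiri'

# Search endpoint of the City of Chicago health inspection lookup.
url = URI.parse('http://webapps.cityofchicago.org/healthinspection/inspectionresultrow.jsp')

# Build the POST for the search form: REST is a single space, every other field is blank.
request = Net::HTTP::Post.new(url.path)
request.set_form_data({"REST"=>" ", "STR_NBR"=>"", "STR_NBR2"=>"", "STR_DIRECTION"=>"", "STR_NM"=>"", "ZIP"=>""})

response = Net::HTTP.new(url.host, url.port).start {|http| http.request(request)}

doc = Nokogiri::HTML.parse(response.body)

# Save each result row as one record: strip every line, drop the blanks, rejoin with newlines.
# ScraperWiki is not required above; it is assumed to be provided by the ScraperWiki runtime.
doc.search('#results tr').each do |tr|
  # Double quotes matter here: '\n' in single quotes is a literal backslash followed by n.
  info = tr.text.each_line.map(&:strip).delete_if {|l| l.strip == ""}.join("\n")
  ScraperWiki.save(['text'], {'text' => info})
end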

This line alone

info = tr.text.each_line.map(&:strip).delete_if {|l| l.strip == ""}.join("\n")

took me a number of iterations and many lines of code to slap into a solid C++ object. I think the new C++ standard library tools are getting stronger string parsers, but I haven't played with them yet.

pure functional transforms rule!
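
For anyone who wants to poke at that chain outside the scraper, here's a minimal sketch that runs it on a made-up string. The sample text and the `clean_row` helper name are inventions for illustration, not real inspection data or anything from the gist. It uses `reject(&:empty?)` in place of `delete_if`, which should behave the same on already-stripped lines.

```ruby
# Hypothetical stand-in for tr.text on one result row;
# the name and address below are made up for illustration.
sample = "\n   SAMPLE DINER  \n\n   123 N EXAMPLE ST \n   \n  Pass \n"

# clean_row is a helper name introduced here, not part of the gist.
def clean_row(text)
  text.each_line          # walk the string one line at a time
      .map(&:strip)       # trim leading/trailing whitespace on each line
      .reject(&:empty?)   # drop lines that are now blank
      .join("\n")         # real newline, so double quotes are required
end

info = clean_row(sample)
puts info
```

Each step returns a new value and nothing is mutated in place, which is why the whole cleanup fits in one expression.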
