chicago restaurant health inspections

chicago-inspections.rb
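require 'net/http'
require 'uri'
require 'nokogiri'

# Search form endpoint for the City of Chicago health inspection results.
url = URI.parse('http://webapps.cityofchicago.org/healthinspection/inspectionresultrow.jsp')

# Build a POST request with the search fields left empty (REST is a single space).
request = Net::HTTP::Post.new(url.path)
request.set_form_data({"REST"=>" ", "STR_NBR"=>"", "STR_NBR2"=>"", "STR_DIRECTION"=>"", "STR_NM"=>"", "ZIP"=>""})

# Send the request and parse the returned HTML.
response = Net::HTTP.new(url.host, url.port).start {|http| http.request(request)}
doc = Nokogiri::HTML.parse(response.body)

# Save one record per row of the results table.
doc.search('#results tr').each do |tr|
  # Strip each line, drop the blank ones, and join with real newlines
  # (double quotes: in single quotes '\n' is a literal backslash and n).
  info = tr.text.each_line.map(&:strip).delete_if {|l| l.strip == ""}.join("\n")
  # ScraperWiki is assumed to be provided by the hosting environment.
  ScraperWiki.save(unique_keys=['text'], data = {'text' => info})
end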

This line alone

info = tr.text.each_line.map(&:strip).delete_if {|l| l.strip == ""}.join('\n')

took me a number of iterations and many lines of code to slap into a solid C++ object. I think the newer C++ standard library tools are getting stronger string parsers, but I haven't played with them yet.
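
For anyone curious what that one-liner does step by step, here is a minimal sketch run on made-up text instead of a real table row. The helper name `clean_row_text` and the sample string are hypothetical, and `reject(&:empty?)` is swapped in for the original `delete_if` block, which should behave the same way on already-stripped lines. It joins with a double-quoted "\n", since a single-quoted '\n' in Ruby is a literal backslash followed by n rather than a newline.

```ruby
# Hypothetical helper (not part of the gist): reduce a block of text to
# its non-blank lines, each trimmed, joined by real newlines.
def clean_row_text(text)
  lines = text.each_line.map(&:strip)  # trim whitespace around every line
  lines = lines.reject(&:empty?)       # drop lines that were only whitespace
  lines.join("\n")                     # double quotes give a real newline
end

# Made-up sample standing in for tr.text from one results row.
sample = "\n   Example Cafe  \n\n\t123 N Example St\n   Pass \n\n"

puts clean_row_text(sample)
```

Each step returns a new array and leaves the input string untouched, so the chain can be read left to right as a pipeline.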

Owner

pure functional transforms rule!
