@max-mapper
Created December 3, 2010 22:49
Chicago restaurant health inspections
require 'net/http'
require 'uri'
require 'nokogiri'

# POST the search form with a single space as the restaurant name and every address field blank.
url = URI.parse('http://webapps.cityofchicago.org/healthinspection/inspectionresultrow.jsp')
request = Net::HTTP::Post.new(url.path)
request.set_form_data({"REST"=>" ", "STR_NBR"=>"", "STR_NBR2"=>"", "STR_DIRECTION"=>"", "STR_NM"=>"", "ZIP"=>""})
response = Net::HTTP.new(url.host, url.port).start { |http| http.request(request) }

# Save each results row as one record: its non-blank lines, stripped and joined with newlines.
doc = Nokogiri::HTML.parse(response.body)
doc.search('#results tr').each do |tr|
  # Double quotes matter here: single-quoted '\n' is a literal backslash followed by n.
  info = tr.text.each_line.map(&:strip).reject(&:empty?).join("\n")
  ScraperWiki.save(['text'], {'text' => info})
end
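
To check what the script sends without contacting the city's server, a minimal sketch like the following builds the same POST and inspects it locally. The helper name `build_inspection_request` is hypothetical and not part of the gist; the URL and form fields are copied from the script above.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper (not in the gist): builds the POST request from the
# script above but never sends it over the network.
def build_inspection_request(restaurant_name = " ")
  url = URI.parse('http://webapps.cityofchicago.org/healthinspection/inspectionresultrow.jsp')
  request = Net::HTTP::Post.new(url.path)
  # set_form_data URL-encodes the hash into the body and sets the content type.
  request.set_form_data({
    "REST" => restaurant_name,
    "STR_NBR" => "",
    "STR_NBR2" => "",
    "STR_DIRECTION" => "",
    "STR_NM" => "",
    "ZIP" => ""
  })
  request
end

request = build_inspection_request
puts request.content_type  # the form content type set by set_form_data
puts request.body          # the URL-encoded form fields
```

Printing the body before sending is a cheap way to confirm that the single-space `REST` value survives encoding.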
@victusfate

this line alone

info = tr.text.each_line.map(&:strip).delete_if {|l| l.strip == ""}.join('\n')

took me a number of iterations and many lines of code to slap into a solid C++ object. I think the new C++ standard library tools are getting higher-strength string parsers, but I haven't played with them yet.

@max-mapper
Author

pure functional transforms rule!
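
For anyone who wants to see the chain step by step, here is a small sketch that runs it on a made-up string. The helper name `clean_row_text` and the sample text are assumptions for illustration, not real inspection data. It uses `reject(&:empty?)` in place of the original `delete_if` block, and a double-quoted `"\n"`, because in Ruby a single-quoted `'\n'` is the two characters backslash and n rather than a newline.

```ruby
# Hypothetical helper (not in the gist): split into lines, strip each one,
# drop the blanks, and join what is left with real newlines.
def clean_row_text(text)
  text.each_line.map(&:strip).reject(&:empty?).join("\n")
end

# Made-up sample resembling the whitespace-heavy text of a table row.
sample = "\n   Example Diner  \n\n\t123 W Sample St\n   Pass \n"

puts clean_row_text(sample)
# Example Diner
# 123 W Sample St
# Pass
```

Each step returns a new value instead of mutating the input, so the chain can be tested on plain strings without Nokogiri or a network connection.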