@jnicho02
Created March 18, 2018 12:51
parse Nevada State historical markers
require 'json'
require 'nokogiri'
require 'open-uri'
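# Capture groups: 1 preamble, 2 marker title, 3 "No." label, 4 marker number, 5-9 trailing lines.
# The optional title words are non-capturing groups (square brackets would be character classes).
r = /([\w\W]*?)((?:NEVADA )?(?:STATE )?(?:CENTENNIAL )?(?:HISTORIC(?:AL)? )?MAR?KER)\s(No\.?|number)\W*(\d+)\W*(.*)\W*(.*)\W*(.*)\W*(.*)\W*(.*)\W*/i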
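# Fetch the JSON index of all markers (URI.open replaces Kernel#open for URLs, which Ruby 3.0 removed).
j = JSON.parse(URI.open('http://shpo.nv.gov/historical-markers-json').read)
j.each do |js|
  puts js['slug']
  # Fetch and parse this marker's page.
  output = Nokogiri::HTML(URI.open("http://shpo.nv.gov/nevadas-historical-markers/historical-markers/#{js['slug']}"))
  # Concatenate the paragraph text and the h3 text found directly under the article.
  contents = output.search('.//article/p').text.strip
  contents += output.search('.//article/h3').text.strip
  matches = r.match(contents)
  # matches is nil when the pattern does not match; nil.to_a is [], so nothing is printed.
  matches.to_a.each_with_index do |m, index|
    puts "matches[#{index}] = #{m}"
  end
end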
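
The matching step can be tried offline, without fetching any pages. The sketch below is a minimal illustration only: `SAMPLE` is made-up text (real marker pages may be laid out differently), and `MARKER_RE` and `parse_marker` are hypothetical names for a simplified pattern that keeps just the marker title and number. It is not the regex used in the script above.

```ruby
# Hypothetical sample text; not taken from a real marker page.
SAMPLE = "Some inscription text.\nNEVADA CENTENNIAL MARKER No. 1\nSTATE HISTORIC PRESERVATION OFFICE"

# Simplified pattern (an assumption for illustration): each optional word is a
# non-capturing group, and at least one digit is required for the number.
MARKER_RE = /(?<title>(?:NEVADA )?(?:STATE )?(?:CENTENNIAL )?(?:HISTORIC(?:AL)? )?MAR?KER)\s(?:No\.?|number)\W*(?<number>\d+)/i

# Returns a hash with the marker title and number, or nil when nothing matches.
def parse_marker(text)
  m = MARKER_RE.match(text)
  return nil unless m

  { title: m[:title], number: m[:number].to_i }
end

p parse_marker(SAMPLE)
```

Named captures keep the result readable when the pattern changes, since callers do not depend on group positions.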