@Konstantinusz
Last active September 2, 2020 20:51
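require "nokogiri"
require "json"
require "open-uri"

# Page to scrape; defaults to the Hungarian Wikipedia list of cities of Hungary.
url = ARGV[0] || "https://hu.wikipedia.org/wiki/Magyarorsz%C3%A1g_v%C3%A1rosai"
doc = Nokogiri::HTML(URI.open(url).read)

# Pick the largest .wikitable on the page, measured by HTML size.
table = doc.css(".wikitable").max_by { |t| t.to_html.size }
abort "No .wikitable found at #{url}" unless table

# Header names: the first alphabetic word of each <th>.
headers = table.css("tr th").map { |th| th.text[/[[:alpha:]]+/] }

# Data rows: cells made only of digits and (non-breaking) spaces become Integers.
rows = table.css("tr").drop(1).map do |row|
  row.css("td").map do |cell|
    data = cell.text.strip
    # \A and \z anchor the whole string; ^ and $ would match a single line of a multi-line cell.
    data.match?(/\A[0-9\s\u00a0]+\z/) ? data.gsub(/[\s\u00a0]+/, "").to_i : data
  end
end.reject(&:empty?)

# Print the header names as a leading comment, then the rows as pretty-printed JSON.
puts "/*#{headers.join("|")}*/", JSON.pretty_generate(rows)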
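
The cell-conversion step of the script can be tried offline, without Nokogiri or network access. The sketch below is only an illustration of that one step: the helper name `normalize_cell` is hypothetical and does not appear in the gist, and it assumes, as the script does, that numeric cells use ordinary or non-breaking spaces (U+00A0) as thousands separators.

```ruby
# Hypothetical helper (not part of the gist): mirrors the per-cell logic of the script.
# Returns an Integer when the text holds only digits and spaces, else the stripped text.
def normalize_cell(text)
  data = text.strip
  # \u00a0 is the non-breaking space often used as a thousands separator.
  if data.match?(/\A[0-9\s\u00a0]+\z/)
    data.gsub(/[\s\u00a0]+/, "").to_i
  else
    data
  end
end

p normalize_cell("1\u00a0752\u00a0286\n")  # a number with non-breaking separators
p normalize_cell(" Budapest ")             # ordinary text is only stripped
```

Keeping the conversion in a separate method like this would make it straightforward to cover edge cases (empty cells, mixed text and digits) before running the scraper against a live page.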