@amosshapira
Last active December 28, 2015 13:59
Convert HTML tables from input files (or standard input, if no file name is given) to CSV. Taken from a Stack Overflow answer that I have since lost track of.
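#!/usr/bin/ruby
# Print every <tr> of every table in the input as one CSV line.
require 'csv'
require 'nokogiri'

doc = Nokogiri::HTML(ARGF.read)
doc.xpath('//table//tr').each do |row|
  # Collapse newlines and runs of whitespace inside each cell to one space.
  cells = row.xpath('td').map { |cell| cell.text.gsub(/\s+/, ' ').strip }
  # generate_line doubles embedded quotes and omits the trailing separator;
  # force_quotes keeps every field quoted, as the original output did.
  print CSV.generate_line(cells, force_quotes: true)
end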
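Most CSV readers expect a double quote inside a quoted field to be doubled (`""`) rather than backslash-escaped, and expect no separator after the last field. The sketch below illustrates only that quoting rule on plain strings, without Nokogiri. The helper names `csv_field` and `csv_line` are hypothetical and are not part of the gist.

```ruby
# Hypothetical helpers for illustration; not part of the original script.

# Quote a single field: collapse whitespace runs (including newlines)
# to one space, then double any embedded double quotes.
def csv_field(text)
  collapsed = text.gsub(/\s+/, ' ')
  '"' + collapsed.gsub('"', '""') + '"'
end

# Join quoted fields with commas, leaving no trailing separator.
def csv_line(cells)
  cells.map { |cell| csv_field(cell) }.join(',')
end

puts csv_line(['say "hi"', "two\nlines"])
# prints: "say ""hi""","two lines"
```

In practice the standard `csv` library's `CSV.generate_line` covers the same rule; the hand-written version is only meant to make the escaping visible.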