@isorsa
Forked from sandys/table_to_csv.rb
Created January 20, 2016 09:10
Convert an HTML table to CSV using Ruby
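# run using ```rvm jruby-1.6.7 do jruby "-J-Xmx2000m" "--1.9" tej.rb```
require 'rubygems'
require 'nokogiri'
require 'csv'

# Parse the HTML; the block form closes the input file automatically.
doc = File.open("/tmp/preview.html") { |f| Nokogiri::HTML(f) }

# Options are passed as keywords (a positional options hash fails on Ruby 3+).
# The block form closes the CSV file even if an error is raised.
CSV.open("/tmp/output.csv", 'w', :col_sep => ",", :quote_char => '\'', :force_quotes => true) do |csv|
  # For a quick test, limit to the first 10 rows:
  # doc.xpath('//table/tbody/tr').take(10).each do |row|
  doc.xpath('//table/tbody/tr').each do |row|
    # One CSV row per <tr>, one field per <td>.
    csv << row.xpath('td').map { |cell| cell.text }
  end
end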
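The `:quote_char` and `:force_quotes` options determine how each row is written to the file. As a minimal sketch using only the standard `csv` library, the snippet below formats one row with the same options and reads it back. The `sample_row` values are made up for illustration and stand in for the text of the cells in one `<tr>`.

```ruby
require 'csv'

# Hypothetical sample row standing in for the cell text of one <tr>.
sample_row = ["Alice", "it's 5 o'clock", "a,b"]

# Same options as the script above, written as keyword arguments.
line = CSV.generate_line(sample_row, col_sep: ",", quote_char: "'", force_quotes: true)
puts line

# Reading the line back with the same quote_char recovers the original cells.
parsed = CSV.parse_line(line, col_sep: ",", quote_char: "'")
```

Every field is wrapped in single quotes, and a single quote inside a field is doubled, so whatever reads `/tmp/output.csv` needs to be told that the quote character is `'` rather than the default `"`.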