@sandys
Created October 18, 2012 10:04
Convert an HTML table to CSV using Ruby
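# Originally run with: rvm jruby-1.6.7 do jruby "-J-Xmx2000m" "--1.9" tej.rb
require 'rubygems'
require 'nokogiri'
require 'csv'

# Parse the HTML; the block form of File.open closes the file handle automatically.
doc = File.open("/tmp/preview.html") { |f| Nokogiri::HTML(f) }

# Comma-separated output with every field wrapped in single quotes.
# Options are passed as keywords so the call also works on Ruby 3.
CSV.open("/tmp/output.csv", 'w', col_sep: ",", quote_char: "'", force_quotes: true) do |csv|
  # For a quick test on the first 10 rows, add .take(10) before .each.
  # Note: this XPath only matches tables whose markup contains an explicit <tbody>.
  doc.xpath('//table/tbody/tr').each do |row|
    # One CSV row per <tr>: the text of each <td> cell, in order.
    csv << row.xpath('td').map(&:text)
  end
end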
@mejibyte

This saved me some time. Thanks!

@tkt028

tkt028 commented Jul 13, 2014

Thanks! It works for me. :D

@G-Square

thanks

@isorsa

isorsa commented Jan 20, 2016

Thanks!

@mahendhar9

Many Thanks!

@shrishti01

Hi, can anyone provide me with a script that converts a full HTML file to a CSV file, including both tables and text?

@debazav

debazav commented May 18, 2018

THANKS!!