Data conversion scripts

This is a collection of scripts to convert data from one format to another.

CSV to JSON

What? Turns CSV into JSON.
Why? JSON is nicer to navigate and query using jq or fx.
How?

# Pass the file on the command-line
./csv-to-json.rb $FILE
# or use stdin
./csv-to-json.rb -
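
To make the conversion concrete, here is a minimal sketch of the same idea applied to an inline sample instead of a file. The sample data and the `csv_to_json` helper name are assumptions for illustration only; they are not part of the script.

```ruby
require 'csv'
require 'json'

# Hypothetical helper: parse CSV text with a header row and
# serialize the rows as a JSON array of objects.
def csv_to_json(text)
  rows = CSV.parse(text, headers: true).map(&:to_h)
  JSON.pretty_generate(rows)
end

# Hypothetical sample input; the real script reads a file or stdin.
sample_csv = <<~DATA
  name,language
  Ada,Ruby
  Linus,C
DATA

puts csv_to_json(sample_csv)
```

Each CSV row becomes one JSON object keyed by the header names. Since no converters are configured, every value comes out as a string.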

HTML table to CSV

What? Turns an HTML table into CSV.
Why? Sometimes websites don't provide the right APIs and you just want to export data from a table.
How?

# Pass the file on the command-line
./table-to-csv.rb $FILE
# or use stdin
./table-to-csv.rb -
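
csv-to-json.rb

#!/usr/bin/env ruby
require 'csv'
require 'json'

# Read from stdin when the argument is '-', otherwise from the named file.
input = ARGV[0] == '-' ? $stdin.read : File.read(ARGV[0])

# Treat the first line as headers so each row maps header => value.
csv_table = CSV.new(input, headers: true)

values = []
csv_table.each do |row|
  values << row.to_hash
end

puts JSON.pretty_generate(values)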
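
table-to-csv.rb

#!/usr/bin/env ruby
require 'nokogiri'
require 'csv'

# Read from stdin when the argument is '-', otherwise from the named file.
input = ARGV[0] == '-' ? $stdin.read : File.read(ARGV[0])
doc = Nokogiri::HTML(input)

# Column names come from the <th> cells of every table in the document.
headers = doc.xpath('//table//th').map { |header| header.text }

rows = []
doc.xpath('//table//tr').each do |row|
  csv_row = CSV::Row.new([], [])
  row.xpath('td').each_with_index do |cell, index|
    # Pair each cell with the header at the same position.
    csv_row << [headers[index], cell.text.strip]
  end
  # Skip rows without <td> cells, such as the header row.
  next if csv_row.empty?
  rows << csv_row
end

print CSV::Table.new(rows).to_csv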
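
The row-building step above can be tried without Nokogiri. The following is a minimal sketch using only the standard `csv` library, with plain arrays standing in for the `<th>` and `<td>` text; the `rows_to_csv` helper name and the sample values are assumptions for illustration, not part of the script.

```ruby
require 'csv'

# Hypothetical helper: build one CSV::Row per table row by appending
# [header, value] pairs, then serialize all rows as one CSV::Table.
def rows_to_csv(headers, cells_per_row)
  rows = cells_per_row.map do |cells|
    csv_row = CSV::Row.new([], [])
    cells.each_with_index do |cell, index|
      csv_row << [headers[index], cell.strip]
    end
    csv_row
  end
  CSV::Table.new(rows).to_csv
end

# Hypothetical values standing in for what the XPath queries return.
sample_headers = ['City', 'Population']
sample_cells = [['Berlin ', '3,645,000'], ['Hamburg', '1,841,000']]

print rows_to_csv(sample_headers, sample_cells)
```

The table's header line is taken from the first row, and fields containing a comma are quoted on output, so the result parses back into the same cells.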