@johnbintz
Last active August 29, 2015 14:04
Join Kickstarter CSV files for shipping purposes
require 'csv'

# Columns to carry over into the joined file, in output order.
fields_to_keep = [
  "Backer Id", "Pledge Amount", "Shipping Name", "Shipping Address 1",
  "Shipping Address 2", "Shipping City", "Shipping State", "Shipping Postal Code",
  "Shipping Country Name"
]

# Column used to order the rows of the joined file.
sort_result_on = "Backer Id"

all_addresses = []
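
# Read every file in the current directory whose name starts with "Kickstarter".
Dir['Kickstarter*'].each do |file|
  first_row = true
  has_address = false
  column_indexes = {}

  CSV.foreach(file) do |row|
    if first_row
      # Header row: map each column index to the kept field whose name the
      # header contains. Columns matching no kept field are left out.
      # to_s guards against empty (nil) header cells.
      row.each_with_index do |hfield, index|
        field = fields_to_keep.find { |candidate| hfield.to_s.include?(candidate) }
        column_indexes[index] = field if field
      end
      first_row = false
      # Only collect rows from files whose header has a Shipping Name column.
      has_address = column_indexes.values.include?('Shipping Name')
    elsif has_address
      # Data row: keep only the mapped columns, keyed by field name.
      data = row.each_with_index
                .select { |_value, index| column_indexes.key?(index) }
                .collect { |value, index| [column_indexes[index], value] }
      all_addresses << Hash[data]
    end
  end
end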
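
# Write the header and the collected rows, sorted on sort_result_on.
# Values are compared as strings; to_s keeps a missing value from raising.
CSV.open('joined.csv', 'w') do |csv|
  csv << fields_to_keep
  all_addresses.sort_by { |address| address[sort_result_on].to_s }.each do |address|
    csv << fields_to_keep.collect { |field| address[field] }
  end
end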
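
The header matching and row extraction above can be tried on in-memory data. This is a minimal sketch, not part of the original script: the sample rows, the FIELDS constant, and the column_map and extract helpers are all hypothetical names made up for illustration. It also sorts numerically with to_i, which assumes Backer Id values are integers; the script above compares them as strings, so "10" would sort before "9".

```ruby
require 'csv'

# Hypothetical sample data; real Kickstarter exports have more columns.
SAMPLE = <<~SAMPLE_CSV
  Backer Id,Backer Name,Pledge Amount,Shipping Name
  10,Ann Example,$25.00,Ann Example
  9,Bob Example,$10.00,Bob Example
SAMPLE_CSV

# Hypothetical, shortened list of fields to keep.
FIELDS = ["Backer Id", "Pledge Amount", "Shipping Name"]

# Map column index => kept field name, based on the header row.
def column_map(header, fields)
  header.each_with_index.each_with_object({}) do |(name, index), map|
    field = fields.find { |candidate| name.to_s.include?(candidate) }
    map[index] = field if field
  end
end

# Turn one data row into a { field => value } hash using the map.
def extract(row, map)
  map.each_with_object({}) { |(index, field), out| out[field] = row[index] }
end

rows = CSV.parse(SAMPLE)
map = column_map(rows.first, FIELDS)
records = rows.drop(1).map { |row| extract(row, map) }

# Numeric sort; assumes every Backer Id parses as an integer.
sorted = records.sort_by { |record| record["Backer Id"].to_i }
```

Here the "Backer Name" column is dropped because no entry in FIELDS occurs in its header, and sorted lists backer 9 before backer 10.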