@voleinikov
Last active January 12, 2024 16:53
Ruby -- turn CSV file into array of hashes with headers as keys (rake task)
# Adapted from StackOverflow answers:
# http://stackoverflow.com/a/8477067/1319740
# and
# http://stackoverflow.com/a/18113090/1319740
# With help from:
# http://technicalpickles.com/posts/parsing-csv-with-ruby/
# A small rake task that validates that all required headers are present in a csv
# then converts the csv to an array of hashes with column headers as keys mapped
# to relevant row values.
#
# The data can then be passed wherever it is needed.
require 'csv'

namespace :csv do
  desc 'Take a csv file and transform it into an array of hashes with headers as keys'
  task :to_hash, [:csv_file_path] => :environment do |_task, args|
    # `abort` prints the message to $stderr and exits with a non-zero status.
    # A bare `return` inside a task block would raise LocalJumpError.
    unless args.csv_file_path.present?
      abort 'ERROR: Please include file path -- Usage: rake csv:to_hash[path/to/csv/file]'
    end

    # Check the csv to make sure all required headers are present before reading whole file
    required_headers = %i[id] # Add any required csv headers here
    headers = CSV.open(args.csv_file_path, 'r', headers: true, header_converters: :symbol) do |csv|
      first_row = csv.first
      first_row ? first_row.headers : [] # a file with no data rows yields no headers
    end
    missing_headers = required_headers - headers
    if missing_headers.any?
      abort "ERROR: Please include #{missing_headers.inspect} headers in your csv file"
    end

    # Now we read and transform file
    # First set up any custom converters -- This one turns blank row values to nils
    CSV::Converters[:blank_to_nil] = lambda do |field|
      field && field.empty? ? nil : field
    end

    # Then we create and populate our array
    attrs = []
    CSV.foreach(args.csv_file_path, headers: true, header_converters: :symbol,
                                    converters: %i[all blank_to_nil]) do |row|
      attrs << Hash[row.headers.zip(row.fields)]
    end

    # You can now pass the attrs array wherever you need -- for example a background job
    # that creates/validates/saves model objects
    attrs
  end
end
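
The same validate-then-convert steps can also be tried outside Rake and Rails. The following is a minimal sketch, not part of the original gist: the method name `csv_to_hashes` and its `required_headers:` keyword are hypothetical, and it raises `ArgumentError` instead of printing to `$stderr` so that callers can decide how to report the problem. The blank-to-nil converter is passed as a lambda rather than registered globally in `CSV::Converters`.

```ruby
require 'csv'

# Hypothetical helper (an assumption, not from the gist): validates the
# required headers, then returns the rows as an array of hashes keyed by
# symbolized header names.
def csv_to_hashes(path, required_headers: %i[id])
  # Read only the first data row to learn the headers.
  headers = CSV.open(path, 'r', headers: true, header_converters: :symbol) do |csv|
    first_row = csv.first
    first_row ? first_row.headers : []
  end

  missing = required_headers - headers
  raise ArgumentError, "missing headers: #{missing.inspect}" if missing.any?

  # Turns empty-string fields (for example a quoted "") into nil.
  blank_to_nil = ->(field) { field && field.empty? ? nil : field }

  rows = []
  CSV.foreach(path, headers: true, header_converters: :symbol,
                    converters: [:all, blank_to_nil]) do |row|
    rows << Hash[row.headers.zip(row.fields)]
  end
  rows
end
```

Calling `csv_to_hashes('people.csv')` on a file with an `id` column returns one hash per data row, with numeric-looking values converted by the built-in `:all` converters. A file without an `id` header raises `ArgumentError` before any row is converted.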