@jeremyolliver
Created July 19, 2013 02:12
Extensions to the csvscan gem that let each row be accessed as a hash keyed by the header row. The stock gem returns only an array for each row and doesn't support headers.
require 'csvscan'

module CSVScan
  # Raised when csvscan stops before reaching the end of the input.
  class ParseError < StandardError; end

  # Yields each data row of the CSV file at file_path. With :headers => :first_line,
  # the first non-blank row supplies the keys and every later row is yielded as a
  # Hash; otherwise every row is yielded as an Array.
  # Returns a Hash with the line, row, and record counts.
  def self.foreach(file_path, options = {})
    parsed_lines, parsed_rows, total_records = 0, 0, 0
    first_line, headers = true, []
    # CSVScan doesn't parse the last line, so append a blank one to make sure nothing is missed
    file_contents = File.read(file_path) + "\n"
    CSVScan.scan(file_contents) do |row_array|
      parsed_lines += 1 # total lines, regardless of whether they have content
      next if row_array.nil? || row_array == [nil] # skip blank lines
      parsed_rows += 1 # total rows, regardless of whether they are headers
      if options[:headers] == :first_line
        if first_line
          headers = row_array
          first_line = false
        else
          row_hash = {}
          row_array.each_with_index { |value, i| row_hash[headers[i]] = value }
          yield(row_hash)
          total_records += 1
        end
      else
        yield(row_array)
        total_records += 1
      end
    end
    total_lines = file_contents.lines.count
    if parsed_lines < total_lines
      # Built from plain string literals; the original called String#squish, which needs ActiveSupport
      raise ParseError, "Expected to parse #{total_lines} lines but only parsed #{parsed_lines}. " \
        "Invalid CSV row around lines #{[1, parsed_lines].max} to #{parsed_lines + 1}?"
    end
    { :lines => parsed_lines, :rows => parsed_rows, :records => total_records }
  end
end
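
The header handling above can be tried without installing csvscan. The following is a minimal sketch in plain Ruby, assuming the parser hands over rows as arrays and reports a blank line as `[nil]`, which is what the code above checks for. `rows_to_hashes` and the sample data are hypothetical names made up for illustration; they are not part of the gist or of the csvscan gem. It uses `Array#zip` instead of `each_with_index`, so a row with more cells than headers has its extra cells dropped rather than stored under a `nil` key.

```ruby
# Hypothetical helper (not part of the gist or the csvscan gem):
# turns already-parsed rows into hashes keyed by the first non-blank row.
def rows_to_hashes(rows)
  headers = nil
  rows.each_with_object([]) do |row, records|
    next if row.nil? || row == [nil] # skip blank lines, as the gist does
    if headers.nil?
      headers = row                  # first non-blank row supplies the keys
    else
      records << headers.zip(row).to_h
    end
  end
end

# Made-up sample rows standing in for what a CSV parser would yield.
sample_rows = [
  ["name", "email"],
  [nil],
  ["Ada", "ada@example.com"],
  ["Bob", nil]
]

records = rows_to_hashes(sample_rows)
records.each { |record| puts "#{record['name']}: #{record['email'].inspect}" }
```

Each element of `records` is a hash such as the one for "Ada", with `"name"` and `"email"` keys, which is the shape a block passed to `CSVScan.foreach(path, :headers => :first_line)` would receive.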