@phillipoertel
Created September 2, 2016 20:41
Ruby CSV library: using converters to clean up data
# given this CSV:
#
# agency_id;award_name;importance;years;description;ignore;is_further_award
# 5683;Deloitte Technology ;x;2014 2013;Fast 50 (2013), Fast 500 (2014);x;
#
# the following code will return:
#
# #<CSV::Row
# agency_id:5683
# award_name:"Deloitte Technology"
# importance::high
# years:"2014 2013"
# description:"Fast 50 (2013), Fast 500 (2014)"
# ignore:true
# is_further_award:nil
# >
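require 'csv'

IMPORTANCE_MAPPING = { 'x' => :high, 'o' => :low }

# NOTE: the subset keyword is accepted but not used yet.
def award_receivals(subset: nil)
  # Converters run in order; the chain stops once a value is no longer a String.
  stripper = lambda { |value| value.to_s.strip }
  make_x_true = lambda { |value| value == "x" ? true : value }
  # Two-argument converters also receive a CSV::FieldInfo (index, line, header).
  importance = lambda do |value, info|
    info.header == :importance ? IMPORTANCE_MAPPING[value] : value
  end

  options = {
    col_sep: ';',
    headers: true,
    converters: [:integer, stripper, importance, make_x_true],
    header_converters: :symbol
  }
  # Pass the options as keywords: Ruby 3 no longer turns a trailing positional hash into keyword arguments.
  CSV.read('award_receivals.csv', **options)
end

award_receivals.take(5).each do |row|
  p row
end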
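
To try the converter chain without an award_receivals.csv file on disk, here is a minimal sketch that parses the sample row from the header comment with CSV.parse. The names SAMPLE_CSV, SAMPLE_IMPORTANCE, sample_table and sample_row are made up for this illustration and are not part of the original gist.

```ruby
require 'csv'

# Sample data copied from the header comment, inlined so no file is needed.
# (SAMPLE_CSV and SAMPLE_IMPORTANCE are hypothetical names for this sketch.)
SAMPLE_CSV = <<~DATA
  agency_id;award_name;importance;years;description;ignore;is_further_award
  5683;Deloitte Technology ;x;2014 2013;Fast 50 (2013), Fast 500 (2014);x;
DATA

SAMPLE_IMPORTANCE = { 'x' => :high, 'o' => :low }.freeze

# One-argument converters receive only the field value.
strip = ->(value) { value.to_s.strip }
x_to_true = ->(value) { value == 'x' ? true : value }

# Two-argument converters also receive a CSV::FieldInfo with the column header.
importance = lambda do |value, info|
  info.header == :importance ? SAMPLE_IMPORTANCE[value] : value
end

sample_table = CSV.parse(
  SAMPLE_CSV,
  col_sep: ';',
  headers: true,
  converters: [:integer, strip, importance, x_to_true],
  header_converters: :symbol
)

sample_row = sample_table.first
p sample_row.to_h
```

The order of the converters matters: the importance column is mapped to a symbol before x_to_true runs, so its "x" becomes :high, while the "x" in the ignore column falls through to x_to_true and becomes true. The empty trailing field stays nil.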