@sparkertime
Created July 13, 2012 23:10
Thresher - Separates the Wheat from the Chaff
require "rake"
require "csv"
require_relative '../../lib/thresher_sketch'

namespace :chicago_new do
  namespace :contracts do
    desc "Load city contracts"
    task :load => :environment do
      file_name = path_to(ENV["contracts_file"] || "Contracts.csv")
      ContractsThresher.thresh file_name, :rejections_to => "rejected_contracts.csv"
    end
  end
end
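
The `:rejections_to` option in the task above suggests that rows failing a check are written to a separate CSV instead of being loaded. The following is only a minimal sketch of that idea using Ruby's standard `csv` library. `partition_rows` and `rejections_csv` are hypothetical helpers invented for illustration; they are not part of Thresher, whose implementation is not shown here.

```ruby
require "csv"

# Hypothetical helper (not Thresher's code): splits parsed CSV rows into
# [accepted, rejected], rejecting any row with a blank required column.
def partition_rows(csv_text, required_columns)
  table = CSV.parse(csv_text, headers: true)
  table.partition do |row|
    required_columns.none? { |column| row[column].to_s.strip.empty? }
  end
end

# Hypothetical helper: serializes rejected rows back to CSV text,
# the kind of content a rejections file might hold.
def rejections_csv(rows, headers)
  CSV.generate do |csv|
    csv << headers
    rows.each { |row| csv << row.fields }
  end
end

sample = <<~ROWS
  Vendor ID,Vendor Name
  101,Acme Supply
  ,Nameless Vendor
ROWS

headers = ["Vendor ID", "Vendor Name"]
accepted, rejected = partition_rows(sample, headers)
puts "accepted: #{accepted.size}, rejected: #{rejected.size}"
puts rejections_csv(rejected, headers)
```

Keeping the split separate from the write step makes it possible to inspect the rejected rows before deciding where they go.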

require 'date'
require_relative 'thresher'
require_relative '../app/models/vendor'
require_relative '../app/models/contract'

class ContractsThresher # named after the source rather than a model, but they're conflated in this case
  extend Thresher

  create_or_find Vendor, :by => :external_id do
    fields :external_id => "Vendor ID",
           :name => "Vendor Name",
           :address1 => "Address 1",
           :address2 => "Address 2",
           :city => "City",
           :state => "State",
           :zipcode => "Zip"
    reject_if_blank :all
  end

  upsert Contract, :by => [:purchase_order, :revision] do
    fields :purchase_order => "Purchase Order (Contract) Number",
           :vendor => {:association => Vendor, :by => {:external_id => "Vendor ID"}}, # a little repetitive, but anything else was too much magic for me
           :revision => "Revision Number",
           :description => "Purchase Order Description",
           :specification => "Specification Number",
           # Referencing format_award_amount here is possible by some
           # context/method_missing black magic. Would appreciate your thoughts
           # on this - it feels a little too evil, but the alternatives (wrapping
           # with procs, or accepting a symbol and calling #send behind the
           # scenes) feel less obvious to use. This approach also enforces a
           # standalone method to do the formatting, which I like.
           :award_amount => {:column => "Award Amount", :formatting => format_award_amount},
           :start_at => {:column => "Start Date", :formatting => format_date},
           :end_at => {:column => "End Date", :formatting => format_date},
           :approval_at => {:column => "Approval Date", :formatting => format_date},
           :contract_type => "Contract Type",
           :department => "Department",
           :procurement_type => "Procurement Type"
    reject_if_blank :all
  end

  # Trim surrounding whitespace from the raw award amount.
  def format_award_amount(raw_award)
    raw_award.strip
  end

  # Parse a date written as month/day/four-digit year, e.g. "07/13/2012".
  def format_date(raw_date)
    Date.strptime(raw_date, '%m/%d/%Y')
  end
end
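
The comment on `:award_amount` describes referring to a formatter by bare name inside the `fields` block, before any such method is callable there. One way this could work, sketched under assumptions, is for the block's context object to turn unknown `format_*` names into deferred references through `method_missing`. `FormatterRef`, `FieldContext`, and `Formatters` are hypothetical names made up for this sketch; nothing here is taken from Thresher itself.

```ruby
# Hypothetical: a deferred reference to a formatter method, stored by name.
FormatterRef = Struct.new(:name) do
  # Resolve the name against an object that defines the formatter.
  def call(target, raw)
    target.public_send(name, raw)
  end
end

# Hypothetical: the object a `fields` block would be evaluated against.
class FieldContext
  attr_reader :mapping

  def initialize
    @mapping = {}
  end

  def fields(mapping)
    @mapping.merge!(mapping)
  end

  # Only bare, argument-less names starting with "format_" become references;
  # everything else falls through to the normal NameError/NoMethodError.
  def method_missing(name, *args)
    return super unless name.to_s.start_with?("format_") && args.empty?
    FormatterRef.new(name)
  end

  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?("format_") || super
  end
end

# Hypothetical: where the standalone formatter methods live.
class Formatters
  def format_award_amount(raw)
    raw.strip
  end
end

context = FieldContext.new
context.instance_eval do
  fields :award_amount => {:column => "Award Amount", :formatting => format_award_amount}
end

ref = context.mapping[:award_amount][:formatting]
puts ref.call(Formatters.new, "  1200.50 ").inspect
```

Restricting the hook to a `format_` prefix keeps ordinary typos raising errors instead of silently becoming references, which addresses part of the "too evil" worry. Passing `:format_award_amount` as a plain symbol would avoid `method_missing` entirely at the cost of the bare-word syntax.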
@sparkertime
Author

Chad:

Good idea on the hooks. I'll make sure to add that. As for the other files, you can see them at https://github.com/citizenparker/chicago-finances/commit/139cded2d83ba2b04ce337d9c00fbd7e11c89ad4. As always, feedback welcome.
