@danielharan
Created April 1, 2012 16:22
Bicycle data missing time or location, as percentages
year: time location
2006: 1.59 0.26
2007: 1.74 0.37
2008: 2.97 0.42
2009: 2.86 0.68
2010: 4.60 1.49
From
http://www.montrealgazette.com/news/bike-accidents/index.html
http://blogs.montrealgazette.com/2012/03/26/mapping-bicycle-collisions-in-montreal/
Data: https://docs.google.com/spreadsheet/ccc?key=0AkHTOK71aJ7qdG0zeV9xRVkyUURkR2dKLVR6bi1XWFE#gid=0
# ruby 1.9
require 'csv'

# Tab-separated rows: year, date, time, civic number, street1, street2
data = CSV.read("data.tsv", col_sep: "\t")

# Per-year counters that default to 0
totals = Hash.new(0)
missing_time = Hash.new(0)
missing_location = Hash.new(0)

data.each do |year, date, time, civic, street1, street2|
  totals[year] += 1
  missing_time[year] += 1 if time.nil?
  # A location counts as missing only when all three address fields are empty
  missing_location[year] += 1 if civic.nil? && street1.nil? && street2.nil?
end

# Print, for each year, the percentage of records missing a time and a location
totals.keys.sort.each do |year|
  pct_time = missing_time[year].to_f / totals[year] * 100
  pct_location = missing_location[year].to_f / totals[year] * 100
  puts "%4d: %.2f %.2f" % [year, pct_time, pct_location]
end
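
The same tally can be wrapped in a method so it can be checked without the data file. What follows is only a sketch under stated assumptions: `missing_percentages` is a hypothetical helper that is not part of the original script, the sample rows are made up for illustration, and treating whitespace-only fields as missing is an assumption about the data (the script above only checks for nil).

```ruby
# Hypothetical helper (not in the original script). Expects rows shaped like
# [year, date, time, civic, street1, street2] and returns
# { year => [pct_missing_time, pct_missing_location] }.
def missing_percentages(rows)
  totals = Hash.new(0)
  missing_time = Hash.new(0)
  missing_location = Hash.new(0)

  # Assumption: nil and whitespace-only fields both count as missing.
  blank = ->(field) { field.nil? || field.strip.empty? }

  rows.each do |year, _date, time, civic, street1, street2|
    totals[year] += 1
    missing_time[year] += 1 if blank.(time)
    # Location is missing only when all three address fields are blank.
    missing_location[year] += 1 if [civic, street1, street2].all?(&blank)
  end

  totals.keys.sort.each_with_object({}) do |year, result|
    result[year] = [missing_time, missing_location].map do |counts|
      (counts[year].to_f / totals[year] * 100).round(2)
    end
  end
end

# Made-up sample rows, for illustration only (not taken from the real data).
sample = [
  ["2009", "2009-05-01", "08:15", "100", "Rue A", "Rue B"],
  ["2009", "2009-05-02", nil, nil, nil, nil],
  ["2010", "2010-06-03", " ", "200", "Rue C", nil],
  ["2010", "2010-06-04", "17:40", nil, "Rue D", "Rue E"],
]

missing_percentages(sample).each do |year, (pct_time, pct_location)|
  puts "%4d: %.2f %.2f" % [year, pct_time, pct_location]
end
```

Rows returned by `CSV.read` have the same array-of-arrays shape, so they could be passed to the helper directly in place of `sample`.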