@leonid-shevtsov
Last active March 8, 2024 21:24
1 Billion Row Challenge... in Ruby?
# Version 1: straightforward line-by-line parsing into a hash of hashes.
stats = {}
File.open('measurements.txt') do |f|
  f.each_line do |line|
    # Each line is "city;temperature\n"; to_f ignores the trailing newline.
    city, temp_s = line.split(';')
    temp = temp_s.to_f
    stats[city] ||= {
      min: 10_000,
      max: -10_000,
      sum: 0,
      count: 0
    }
    stats[city][:min] = temp if temp < stats[city][:min]
    stats[city][:max] = temp if temp > stats[city][:max]
    stats[city][:sum] += temp
    stats[city][:count] += 1
  end
end
# Print min/mean/max per city.
stats.each do |city, cstats|
  puts "#{city}=#{cstats[:min]}/#{cstats[:sum].to_f / cstats[:count]}/#{cstats[:max]}"
end
# Version 2: binary reads and a flat [min, max, sum, count] array per city.
stats = Hash.new { |h, k| h[k] = [10_000, -10_000, 0, 0] }
SEMI = String.new(';', encoding: Encoding::BINARY)
NEWLINE = String.new("\n", encoding: Encoding::BINARY)
File.open('measurements.txt', encoding: Encoding::BINARY) do |f|
  loop do
    city = f.readline(SEMI) # the returned string keeps the trailing ';'
    temp = f.readline(NEWLINE).to_f
    existing = stats[city]
    min, max, sum, count = existing
    existing[0] = temp if temp < min
    existing[1] = temp if temp > max
    existing[2] = sum + temp
    existing[3] = count + 1
  end
rescue EOFError
  # all good: readline raises EOFError at end of file, which ends the loop
end
# Print min/mean/max per city; city[0..-2] drops the trailing ';'.
stats.each do |city, cstats|
  puts "#{city[0..-2]}=#{cstats[0]}/#{cstats[2] / cstats[3]}/#{cstats[1]}"
end
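
The second script depends on `readline` keeping the separator in the string it returns, which is why the city is printed as `city[0..-2]`. Below is a minimal sketch of that behavior. It uses a `StringIO` with made-up sample rows in place of `measurements.txt`; the sample data and variable names are illustrative assumptions, not part of the gist.

```ruby
require 'stringio'

# Hypothetical sample data standing in for measurements.txt.
io = StringIO.new("Kyiv;-3.5\nLviv;12.0\n".b)

city = io.readline(';')         # separator is kept in the result
temp = io.readline("\n").to_f   # to_f ignores the trailing newline

trimmed = city[0..-2]           # drop the trailing ';' before printing
puts "#{trimmed}=#{temp}"
```

Trimming once per city at print time, rather than once per row while reading, keeps the extra string work out of the hot loop.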