@marten
Last active December 6, 2016 17:20
require 'csv'
require 'json'
require 'time'

# Per-user totals, keyed by user_id.
users = {}

# Read the CSV from the files named on the command line, or from STDIN.
CSV.new(ARGF, headers: true).each do |row|
  STDERR.print '.' # progress indicator, one dot per row

  # Skip rows without a user_id before doing any parsing.
  user_id = row["user_id"]
  next unless user_id

  metadata = JSON.parse(row["metadata"])
  started = Time.parse(metadata.fetch("started_at"))
  finish = Time.parse(metadata.fetch("finished_at"))

  data = users[user_id] ||= {time: 0, classifications: 0}
  data[:time] += finish - started # Time - Time yields seconds as a Float
  data[:classifications] += 1
  # Use min/max so the result does not depend on the order of the rows.
  data[:first_seen] = [data[:first_seen], started].compact.min
  data[:last_seen] = [data[:last_seen], finish].compact.max
end

STDERR.puts " done\n\n"

# Write one summary row per user to STDOUT.
CSV.new(STDOUT).tap do |csv|
  csv << ["user_id", "time_spent", "classifications_count", "first_seen", "last_seen"]
  users.each do |user_id, data|
    csv << [user_id, data[:time], data[:classifications], data[:first_seen], data[:last_seen]]
  end
end