@stuartlynn
Created May 27, 2014 16:39
Language stats for project
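require 'mongo'
require 'set' # Set is used below; older Rubies do not load it automatically

# Connect using the legacy (1.x) Ruby driver API with its default host and port.
client = Mongo::MongoClient.new
db = client["ouroboros"]
classifications = db["galaxy_zoo_classifications"]

stats = {}
total = classifications.count
done = 0

classifications.find.each do |classification|
  done += 1
  puts "done #{done} of #{total}" if done % 1000 == 0

  # Take the first annotation that carries a "lang" key; default to English.
  lang = (classification["annotations"].select { |ann| ann.keys.include? "lang" }.first || { "lang" => "en" })["lang"]
  stats[lang] ||= { logged_in: 0, logged_out: 0, user_ids: Set.new, user_ips: Set.new }

  if classification["user_id"]
    # Logged-in classification: count it and remember the distinct user.
    stats[lang][:logged_in] += 1
    stats[lang][:user_ids].add classification["user_id"]
  else
    # Anonymous classification: count it and remember the distinct IP instead.
    stats[lang][:logged_out] += 1
    stats[lang][:user_ips].add classification["user_ip"]
  end
end

# One line per language: lang, logged-in count, logged-out count, unique users, unique IPs.
File.open("galaxy_zoo_lang_stats.dat", "w") do |file|
  stats.each_pair do |lang, s|
    file.puts [lang, s[:logged_in], s[:logged_out], s[:user_ids].count, s[:user_ips].count].join(", ")
  end
end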
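
The per-language tally can be tried without a MongoDB connection. The following is a minimal sketch, not part of the original script: `tally_languages` and its `default_lang` parameter are hypothetical names, and the sample documents are invented for illustration. It assumes each classification is a hash with an optional "annotations" array and either a "user_id" or a "user_ip", as the script above implies.

```ruby
require 'set'

# Hypothetical helper (not in the original script): tallies classifications
# per language from any enumerable of hash-like documents.
def tally_languages(classifications, default_lang: "en")
  # Create the per-language record the first time a language is seen.
  stats = Hash.new do |hash, lang|
    hash[lang] = { logged_in: 0, logged_out: 0, user_ids: Set.new, user_ips: Set.new }
  end

  classifications.each do |classification|
    # Assumption: a missing "annotations" field is treated as an empty list.
    annotations = classification["annotations"] || []
    lang_annotation = annotations.find { |ann| ann.key?("lang") }
    lang = lang_annotation ? lang_annotation["lang"] : default_lang

    if classification["user_id"]
      stats[lang][:logged_in] += 1
      stats[lang][:user_ids].add(classification["user_id"])
    else
      stats[lang][:logged_out] += 1
      stats[lang][:user_ips].add(classification["user_ip"])
    end
  end

  stats
end

# Made-up sample documents, for illustration only.
sample = [
  { "annotations" => [{ "lang" => "es" }], "user_id" => 1 },
  { "annotations" => [{ "lang" => "es" }], "user_id" => 1 },
  { "annotations" => [{ "decision" => "spiral" }], "user_ip" => "10.0.0.1" },
  { "annotations" => [], "user_id" => 2 }
]

result = tally_languages(sample)
result.each_pair do |lang, s|
  puts [lang, s[:logged_in], s[:logged_out], s[:user_ids].count, s[:user_ips].count].join(", ")
end
# prints:
# es, 2, 0, 1, 0
# en, 1, 1, 1, 1
```

Keeping the tally in a function that accepts any enumerable means the same logic could be fed a Mongo cursor or an array of test fixtures; whether that separation is worth it for a one-off stats script is a judgment call.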