@mattb
Created August 22, 2011 20:50
require 'rubygems'
require 'nokogiri'
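# Map lowercased first name => numeric weight, one table per gender.
# Each input line holds the name in column 1 and the weight in column 2.
m = {}
f = {}
# File.foreach closes the file when done; open(...).readlines leaves it open.
File.foreach("dist.male.first") { |l|
  d = l.split(/ +/)
  m[d[0].downcase] = d[1].to_f
}
File.foreach("dist.female.first") { |l|
  d = l.split(/ +/)
  f[d[0].downcase] = d[1].to_f
}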
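# Collect the unique lowercased names from every saved page. The block
# parameter is called `path` so it cannot clash with the `f` hash above
# (on Ruby 1.8 a block parameter named `f` would overwrite that hash).
names = Dir.glob("page*").map { |path|
  doc = File.open(path) { |io| Nokogiri::HTML.parse(io) }
  doc.css("td.master").map { |td|
    td.children[2].text.split(/,/)[0].strip.downcase
  }
}.flatten.uniq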
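# Tally the names. A name found in only one table counts as 1 for that
# gender; a name found in both is split in proportion to its two weights.
# Output prefixes: B = both tables, M = male only, F = female only, U = unknown.
males = 0
females = 0
names.each { |n|
  n = n.split(/ /)[0]  # keep only the first word of the name
  if m.has_key?(n) && f.has_key?(n)
    ratio = 1.0 / (m[n] + f[n])
    puts "B #{n} #{ratio * m[n]} #{ratio * f[n]}"
    males += ratio * m[n]
    females += ratio * f[n]
  elsif m.has_key?(n)
    puts "M #{n}"
    males += 1
  elsif f.has_key?(n)
    puts "F #{n}"
    females += 1
  else
    puts "U #{n}"  # unknown names are printed but not counted
  end
}
puts "male: #{males}"
puts "female: #{females}"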
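
The tallying rule in the script can be tried without the distribution files or the saved pages. The following is a minimal sketch, assuming small made-up weight tables; the helper `split_weight`, the hash names `male_freq` and `female_freq`, and the sample numbers are all hypothetical and not part of the original gist.

```ruby
# Hypothetical helper (not in the original script): returns
# [male_share, female_share] for a first name, or nil when the
# name appears in neither table.
def split_weight(name, male_freq, female_freq)
  in_m = male_freq.key?(name)
  in_f = female_freq.key?(name)
  if in_m && in_f
    # Name is in both tables: split one count in proportion to the weights.
    total = male_freq[name] + female_freq[name]
    [male_freq[name] / total, female_freq[name] / total]
  elsif in_m
    [1.0, 0.0]
  elsif in_f
    [0.0, 1.0]
  end
end

# Made-up weights for illustration only (assumed values, not real data).
male_freq   = { "james" => 3.0, "jordan" => 0.3 }
female_freq = { "mary" => 2.5, "jordan" => 0.1 }

males = females = 0.0
%w[james mary jordan zzz].each do |name|
  share = split_weight(name, male_freq, female_freq)
  next if share.nil? # unknown names are not counted
  males += share[0]
  females += share[1]
end
puts "male: #{males}"
puts "female: #{females}"
```

With these assumed weights, "jordan" contributes three quarters of a count to the male total and one quarter to the female total, while "zzz" is skipped, mirroring the B and U branches of the script.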