@zcox
Created January 4, 2010 15:24
rootDir = "/home/zcox/dev/20_newsgroups"
raise rootDir + " does not exist" unless File.directory? rootDir
counts = Hash.new(0) #0 will be the default value for non-existent keys
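# Walk every regular file under rootDir and tally case-insensitive word counts.
# File.binread reads raw bytes, so files containing non-UTF-8 bytes do not raise during scan on Ruby 1.9+.
Dir["#{rootDir}/**/*"].reject{|file| File.directory? file}.each do |file|
  File.binread(file).scan(/\w+/) { |word| counts[word.downcase] += 1 }
end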
# Write "word<TAB>count" lines ordered from most to least frequent.
File.open("counts-descreasing-ruby", "w") do |out|
  counts.sort { |a, b| b[1] <=> a[1] }.each { |pair| out << "#{pair[0]}\t#{pair[1]}\n" }
end
# Write the same "word<TAB>count" lines ordered alphabetically by word.
File.open("counts-alphabetical-ruby", "w") do |out|
  counts.sort { |a, b| a[0] <=> b[0] }.each { |pair| out << "#{pair[0]}\t#{pair[1]}\n" }
end
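
The counting and sorting steps can be tried without the 20_newsgroups corpus. The following is a minimal sketch on an in-memory string, not the gist author's code: the helper names `count_words`, `by_decreasing_count`, and `by_word` and the sample sentence are assumptions made for illustration. It also breaks ties alphabetically in the by-count ordering, which the script above does not promise.

```ruby
# Hypothetical helpers for illustration; none of these names appear in the gist.

# Tally case-insensitive word counts from a string into a Hash defaulting to 0.
def count_words(text, counts = Hash.new(0))
  text.scan(/\w+/) { |word| counts[word.downcase] += 1 }
  counts
end

# Most frequent first; ties are broken alphabetically so the order is deterministic.
def by_decreasing_count(counts)
  counts.sort_by { |word, count| [-count, word] }
end

# Alphabetical by word.
def by_word(counts)
  counts.sort_by { |word, _count| word }
end

sample = "The quick brown fox. The lazy dog; the end."
counts = count_words(sample)

# Print the three most frequent words as "word<TAB>count".
by_decreasing_count(counts).first(3).each do |word, count|
  puts "#{word}\t#{count}"
end
```

For this sample, "the" is counted three times and every other word once, so the first printed line is `the` followed by a tab and `3`.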