@etagwerker
Created December 3, 2010 13:58
Ruby script to count words in a file and order them by occurrences
# Script to index words inside a text file.
# Words are separated by whitespace.
# Usage: ruby indexer.rb /path/to/file.txt
file_path = ARGV[0] || "votacion.txt"
# A default value of 0 lets us increment without checking for the key first.
WORDS_COUNT = Hash.new(0)
puts "Indexing #{file_path}"
# File.foreach reads one line at a time and closes the file when done.
File.foreach(file_path) do |line|
  line.split.each do |word|
    # Strip commas, parentheses, and quotes.
    word = word.gsub(/[,()'"]/, '')
    # Skip tokens that consisted only of punctuation.
    next if word.empty?
    WORDS_COUNT[word] += 1
  end
end
puts "Indexed #{file_path}"
puts "Words count: "
# Sort ascending by number of occurrences.
WORDS_COUNT.sort_by { |_word, count| count }.each do |key, value|
  puts "#{key} => #{value}"
end
puts "The end. "
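The same counting logic can also be wrapped in methods so it can be reused and tested without a file on disk. The following is a minimal sketch, not part of the original gist: the names count_words and sorted_by_occurrences are hypothetical, and it lists the most frequent words first, which is an assumption about what "order them by occurrences" should mean.

```ruby
require "stringio"

# Hypothetical helper: counts whitespace-separated words from any IO-like object.
def count_words(io)
  counts = Hash.new(0)
  io.each_line do |line|
    line.split.each do |word|
      # Same punctuation stripping as the script above.
      word = word.gsub(/[,()'"]/, "")
      next if word.empty?
      counts[word] += 1
    end
  end
  counts
end

# Hypothetical helper: most frequent words first, ties broken alphabetically.
def sorted_by_occurrences(counts)
  counts.sort_by { |word, count| [-count, word] }
end

# Usage with an in-memory sample instead of a real file.
sample = StringIO.new("si, no si\n(no) si abstencion\n")
sorted_by_occurrences(count_words(sample)).each do |word, count|
  puts "#{word} => #{count}"
end
```

To run it against a real file, something like File.open(path) { |f| count_words(f) } should work, since File responds to each_line as well.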
@bagwanpankaj

Don't you think this will blow up the heap if the text file is very large (say, 1 TB)?
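For context on this question: the script reads the file line by line, so memory use is driven by the number of distinct words and by the longest single line rather than by total file size. If very long lines are a concern, one possible approach is to read fixed-size chunks and carry any unfinished word over to the next chunk. The sketch below is an illustration under assumptions, not the gist author's method: count_words_chunked, tally, and the chunk_size default are hypothetical, and the input is assumed to be UTF-8.

```ruby
require "stringio"

# Hypothetical helper: adds the words found in text to the counts hash.
def tally(text, counts)
  text.split.each do |word|
    # io.read(n) returns binary strings, so tag the result as UTF-8 (assumed input encoding).
    word = word.gsub(/[,()'"]/, "").force_encoding(Encoding::UTF_8)
    next if word.empty?
    counts[word] += 1
  end
end

# Hypothetical helper: counts words while reading fixed-size chunks, so a
# single huge line does not have to fit in memory.
def count_words_chunked(io, chunk_size: 64 * 1024)
  counts = Hash.new(0)
  carry = "".b # unfinished word left over from the previous chunk
  while (chunk = io.read(chunk_size))
    buffer = carry + chunk
    cut = buffer.rindex(/\s/)
    if cut.nil?
      # No whitespace yet: the whole buffer is one unfinished word.
      carry = buffer
      next
    end
    # Everything after the last whitespace may continue in the next chunk.
    carry = buffer[(cut + 1)..-1]
    tally(buffer[0..cut], counts)
  end
  tally(carry, counts) # whatever remains at end of input is a complete word
  counts
end

# Usage with a deliberately tiny chunk size to exercise word boundaries.
io = StringIO.new("alpha beta alpha\ngamma (beta) alpha")
count_words_chunked(io, chunk_size: 4).each do |word, count|
  puts "#{word} => #{count}"
end
```

This bounds the read buffer, but the hash still grows with the number of distinct words, and an input with no whitespace at all would still accumulate in carry.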

@jaake

jaake commented Dec 23, 2014

This really came in handy while I was making my girlfriend's Xmas present. I tweaked it to seed a Rails database, and not having to spend extra time writing it from scratch was nice! http://jailee.us

Thanks!
