@vincentwoo
Created February 15, 2012 23:35
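# problems:
# \w+ splits words like "don't" into "don" and "t"
# reads the entire file into memory
# builds the reverse index in a second pass that re-searches the text for every
# occurrence, instead of recording positions while scanning, so it is pretty slow
# pros:
# EASY TO CODE FAST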
require 'pp'
# File.read opens and closes the file itself, so no handle is left open.
contents = File.read("wordinput.txt", mode: "rb").downcase
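# Word => number of occurrences.
index = contents.scan(/\w+/).group_by { |x| x }.map { |k, v| [k, v.length] }
# Word => offsets of every occurrence in contents.
reverse = index.map { |word, count|
  # Match whole words only, so "the" is not also found inside "there".
  pattern = /\b#{Regexp.escape(word)}\b/
  last = 0
  ret = []
  count.times do
    last = contents.index pattern, last
    ret.push last
    last += 1
  end
  [word, ret]
}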
# puts prints the label as plain text; p would print it with surrounding quotes.
puts "indices: "
pp index
puts "reverse indices: "
pp reverse
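
The problems listed at the top (contractions, whole-file reads, a second pass for the reverse index) could be addressed along the following lines. This is only a sketch, not the gist author's code: `build_indices` is a hypothetical helper, and the apostrophe-aware word pattern is an assumption about what should count as a word. It reads the input line by line and records each word's count and byte offset in the same pass.

```ruby
require 'stringio'

# Hypothetical helper (not in the original gist).
# Returns [counts, positions]:
#   counts    maps word => number of occurrences
#   positions maps word => byte offsets of each occurrence in the input
def build_indices(io)
  counts = Hash.new(0)
  positions = Hash.new { |hash, word| hash[word] = [] }
  offset = 0 # byte offset of the current line within the whole input

  io.each_line do |line|
    # Treat the line as raw bytes (like the "rb" mode above) and lowercase it.
    line = line.b.downcase
    # Assumed word pattern: runs of word characters, optionally joined by
    # apostrophes, so "don't" stays one word.
    line.scan(/\w+(?:'\w+)*/) do |word|
      counts[word] += 1
      positions[word] << offset + Regexp.last_match.begin(0)
    end
    offset += line.bytesize
  end

  [counts, positions]
end

# Small in-memory demo; a StringIO stands in for the file.
counts, positions = build_indices(StringIO.new("The cat\nthere the don't\n"))
pp counts
pp positions
```

For a real file, something like `File.open("wordinput.txt", "rb") { |f| build_indices(f) }` should work the same way, holding only one line in memory at a time.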