Create a gist now

Instantly share code, notes, and snippets.

What would you like to do?
#!/usr/bin/ruby
# == Usage
#
# This script reads a sorted list of words from standard input:
#
# cat words | ./word_pairs.rb
#
# To benchmark the code and save it's output to a file use:
#
# time (cat words | ./word_pairs.rb > word_matches)
#
# == About
#
# Given a dictionary, output all word pairs where both words
# share all their letters except the last two, which are
# distinct and reversed.
#
# Notes:
# - use a reasonable dictionary of your choice
# - potential word pairs are: "ear, era" ; "revies, revise" ; "burglaries, burglarise"
# - "shear, era" is not a valid pair because "she" != "e"
# - "bell, bell" is not a valid pair because the last two letters are not distinct
#
# == Assumptions
#
# This script assumes a sorted list because `sort` is very
# fast on POSIX boxes and being sorted lets us use several
# optimizations.
#
# Another reason for assuming a sorted wordlist is that I
# havn't seen one that isn't sorted by default anyways.
#
# == Author
#
# The script was written by Jordan Curzon (mailto:curzonj@gmail.com).
#
# The challange was issued by CitrusByte (www.citrusbyte.com) on the Ruby Inside
# job list in Nov 2009. http://ruby.jobamatic.com/a/jbb/job-details/151556
#
current = 0
big_list = {}
while word = STDIN.gets do
word.strip!
# Nothing less than 3 letters can meet
# the criteria
next unless word.size > 2
# We can't match anything across first letter
# transistions, so let it be GC'd
if word[0] != current
big_list = {}
current = word[0]
end
# We can only match on words of the same
# size, so per-length buckets
list = big_list[word.size] ||= [ ]
# Keep list list size down for speed, this
# only works because of the sorted word-length buckets
while (!(top = list.first).nil?) && top < word do
list.shift
end
match = word[0..-3]+word[-2..-1].reverse
if top == word
list.shift
puts(word + " ~~ " + match)
elsif (word[-2] != word[-1]) && word < match
# If we've already passed the potential match, there
# is no point adding it to the list, it doesn't exist
list << match
list.sort!
end
end
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment