Skip to content

Instantly share code, notes, and snippets.

@martinus
Forked from DataWraith/trie.rb
Created December 23, 2009 21:23
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save martinus/262815 to your computer and use it in GitHub Desktop.
Save martinus/262815 to your computer and use it in GitHub Desktop.
class TrieNode
def initialize
@sub_tries = {}
@end_of_word = false
end
def insert(input)
to_insert = input
# Convert input to array of chars if it's a string
to_insert = to_insert.split(//) if to_insert.is_a?(String)
# If to_insert is empty, we have reached the end of the word and add a
# end-of-word marker. Otherwise we would get partial words back when we're
# looking for a shorter word, for example "coverage", "cover?" => "covera"
if to_insert.empty?
@end_of_word = true
return
end
# Otherwise, create a sub_trie for the first character of the input and
# recursively insert the remainder of the word
cur_char = to_insert.shift
@sub_tries[cur_char] ||= TrieNode.new
@sub_tries[cur_char].insert(to_insert)
to_insert.unshift(cur_char)
end
def find_words(word)
to_find = word
to_find = to_find.split(//) if to_find.is_a?(String)
# If to_find is empty, we reached the end of the template. If the
# end-of-word marker is set, we return the empty string, which will be
# prepended with the characters of the word as we move upwards in the trie.
# Otherwise we return the empty array, which results in no word being added
# to the results.
if to_find.empty?
return [''] if @end_of_word
return []
end
cur_char = to_find.shift
result = []
if cur_char == '?'
# Search all sub-tries for words matching the remainder of the template
# and add them to the results
@sub_tries.each_pair do |key, value|
result += value.find_words(to_find).map { |w| key + w }
end
elsif @sub_tries[cur_char]
# We have a sub-trie that matches the current character of the template.
# Recursively get all words that match the remainder of the template
# below that sub-trie.
result = @sub_tries[cur_char].find_words(to_find).map { |w| cur_char + w }
else
# We don't have anything that matches the template.
result = []
end
to_find.unshift(cur_char)
return result
end
end
DICTIONARY = File.read('wordlist.txt').split("\n")
print "Building Tries... "
# create a separate Trie for each word length
@data = Hash.new do |h, k|
h[k] = TrieNode.new
end
DICTIONARY.each do |word|
@data[word.length].insert(word)
end
puts "done."
def query(q)
@data[q.length].find_words(q).each do |w|
yield w
end
end
puts "Searching trie..."
t = Time.now
query('????????????????e') do |word|
puts word
end
puts Time.now - t
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment