Skip to content

Instantly share code, notes, and snippets.

@namusyaka
Last active August 29, 2015 14:14
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
Star You must be signed in to star a gist
Save namusyaka/c92501557afb498ab7d3 to your computer and use it in GitHub Desktop.
Builds trie-ized regexp
require 'forwardable'
class Regexp
class Trie
def self.union(*patterns)
trie = new
patterns.each { |pattern| trie << pattern }
trie.to_regexp
end
extend Forwardable
def_delegators :@map, :include?, :[], :[]=
def initialize
@map = {}
end
def add(expr)
ref = self
expr.each_char do |char|
ref[char] = Trie.new unless ref.include?(char)
ref = ref[char]
end
ref[''] = nil
end
alias << add
def to_regexp
return if @map.include?("") && @map.length == 1
q = false
alt, cc = *@map.each_with_object([[], []]) do |(char, value), (alt, cc)|
quoted_char = Regexp.quote(char)
if self[char]
if recurse = value.to_regexp
alt << quoted_char + recurse
else
cc << quoted_char
end
else
q = true
end
end
cconly = alt.empty?
alt << (cc.size == 1 ? cc[0] : "[#{cc.join}]") unless cc.empty?
result = alt.size == 1 ? alt[0] : "(?:#{alt * '|'})"
result = cconly ? "#{result}?" : "(?:#{result})?" if q
result
end
end
end
if $0 == __FILE__
p Regexp::Trie.union("foobar", "fooxar", "foozap", "fooza")
p Regexp::Trie.union("a", "b", "c", "cca")
end
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment