@exlee
Created November 20, 2017 14:25
# encoding: utf-8
require 'whittle'
# Re-opening Whittle.
#
# Riddle: What's the difference between the reopened initialize method
#         below and the one in the original library?
#
# Answer: There is none; the code is identical.
#
# Reason: Regexps take the encoding of the source file that builds them, so
#         Regexps built in this file are UTF-8. Regexps built in the
#         original Whittle source are US-ASCII, which doesn't work with
#         UTF-8 queries. That's why the initializer is redefined here,
#         in a UTF-8 file, where the Regexps can happily parse Unicode.
module Whittle
  class Terminal
    def initialize(name, *components)
      raise ArgumentError, \
        "Rule #{name.inspect} is terminal and can only have one rule component" \
        unless components.length == 1
      super
      pattern = components.first
      @pattern = if pattern.kind_of?(Regexp)
        Regexp.new("\\G#{pattern}")
      else
        Regexp.new("\\G#{Regexp.escape(pattern)}")
      end
    end
  end
end
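
The anchoring done in the initializer above can be tried without loading Whittle. The following is a minimal sketch, not Whittle's own code: `anchored_pattern` is a hypothetical helper that copies the two `Regexp.new` branches, and the `match(input, offset)` calls are only an assumption about how a scanner might use the resulting pattern.

```ruby
# encoding: utf-8
# Sketch only: mirrors the \G anchoring from Terminal#initialize.

# Hypothetical helper (not part of Whittle): builds a \G-anchored Regexp
# from either a Regexp or a literal String, like the initializer does.
def anchored_pattern(pattern)
  source = pattern.kind_of?(Regexp) ? pattern.to_s : Regexp.escape(pattern)
  Regexp.new("\\G#{source}")
end

word  = anchored_pattern(/\p{L}+/)  # one or more Unicode letters
input = "żółw 42"

first = word.match(input, 0)  # try to match at character offset 0
later = word.match(input, 4)  # offset 4 is the space; \G forbids skipping ahead

puts first[0]
puts later.inspect
```

Because of `\G`, a match is only accepted exactly at the offset passed in, so the second call yields no match instead of scanning forward to a later position.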