Created
October 23, 2015 20:03
Translator of Unicode regex ranges
#!/usr/bin/env ruby
# Prints the characters covered by the Unicode regex ranges passed via STDIN
#
# Example input:
#
#   [\u0021-\u0027\u002A-\u002E\u003F\u0041-\u005A\u005C\u005F-\u007A\u00AA\u00B5\u00BA\u00C0-\u00D6\u00D8-\u00F6]
#
# Output:
#
#   0021 - 0027: !"#$%&'
#   002A - 002E: *+,-.
#   0041 - 005A: ABCDEFGHIJKLMNOPQRSTUVWXYZ
#   005F - 007A: _`abcdefghijklmnopqrstuvwxyz
#   00C0 - 00D6: ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ
#   00D8 - 00F6: ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö
#   003F: ?
#   005C: \
#   00AA: ª
#   00B5: µ
#   00BA: º

def unicode_regexp_to_characters(input)
  output = []
  # First pass: expand every \uXXXX-\uXXXX range and strip it from the string.
  # Second pass: report the single \uXXXX escapes that remain.
  # \h matches exactly one hex digit, so non-hex sequences are ignored.
  input
    .gsub(/\\u(\h{4})-\\u(\h{4})/) do
      chars = ($1.hex..$2.hex).map { |code| [code].pack('U') }.join
      output << "#{$1} - #{$2}: #{chars}"
      ''
    end
    .scan(/\\u(\h{4})/) do
      char = [$1.hex].pack('U')
      output << "#{$1}: #{char}"
    end
  output
end

STDIN.each_line do |line|
  puts unicode_regexp_to_characters(line).join("\n")
end
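
The script above lists every range first and the single code points afterwards, as its example output shows. If the original order of the character class matters, a single `scan` with an optional range part may be enough. The following is only a sketch under that assumption: `unicode_regexp_in_order` is a hypothetical name and is not part of the original gist.

```ruby
# Hypothetical variant (not from the original gist): reports ranges and
# single code points in the order they appear in the input.
def unicode_regexp_in_order(input)
  # Each match yields [first, last]; last is nil for a single code point.
  input.scan(/\\u(\h{4})(?:-\\u(\h{4}))?/).map do |first, last|
    if last
      chars = (first.hex..last.hex).map { |code| [code].pack('U') }.join
      "#{first} - #{last}: #{chars}"
    else
      "#{first}: #{[first.hex].pack('U')}"
    end
  end
end

# Single quotes keep the backslashes literal, as they would be on STDIN.
puts unicode_regexp_in_order('[\u0041-\u0043\u003F\u0061-\u0063]')
```

Because the optional group is tried greedily, an escape followed by `-\uXXXX` is always read as the start of a range, which matches how the two-pass version treats it.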