@sixtyfive
Last active September 15, 2021 13:31
Interpret bin/oct/dec/hex number(s) in specified character encoding and try to convert to UTF-8, then display the result if possible.
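The conversion described above can be sketched with the Ruby standard library alone. This is a minimal illustration, not the script itself: `number_to_utf8` and its keyword arguments are hypothetical names chosen for the example, and it handles a single number rather than a list.

```ruby
# Hypothetical helper for illustration; the name and keyword arguments are assumptions.
# Parses +text+ as a number in +base+, treats it as a code in +encoding+,
# and returns the resulting character transcoded to UTF-8.
def number_to_utf8(text, base: 16, encoding: 'ISO-8859-1')
  code = Integer(text, base)                 # raises ArgumentError on malformed input
  code.chr(encoding).encode(Encoding::UTF_8) # Integer#chr, then String#encode
rescue RangeError, EncodingError => e
  # RangeError: number outside the encoding's range.
  # EncodingError: no UTF-8 equivalent or no converter available.
  "(not convertible: #{e.message})"
end

puts number_to_utf8('e4')                          # 0xE4 read as ISO-8859-1
puts number_to_utf8('11100100', base: 2)           # same value, given in binary
puts number_to_utf8('80', encoding: 'Windows-1252')
puts number_to_utf8('100')                         # 256 does not fit a single-byte encoding
```

The same number can map to different characters depending on the encoding, which is why the script accepts a list of encodings to try.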
#!/usr/bin/env ruby
# -----------------------------------------------------------------------------
# chars: interpret bin/oct/dec/hex number(s) in specified character encoding
# and try to convert to UTF-8, then display the result if possible.
# (C) 2021 J. R. Schmid <jrs@weitnahbei.de>
# May be shared and used according to CC BY-SA 4.0
# (https://creativecommons.org/licenses/by-sa/4.0/)
# -----------------------------------------------------------------------------
require 'slop' # gem install slop
require 'colorize' # gem install colorize
require 'unicode/name' # gem install unicode-name
# Print the usage summary to stderr and terminate.
def print_usage
  warn "Usage: #{File.basename(__FILE__)} [options] <arg1> <arg2> <...>"
  warn ""
  warn "  options (unless -f is specified, numbers are expected as arguments):"
  warn ""
  warn "    -l|--list-encodings: do nothing except list all known encodings"
  warn "    -e|--encoding <comma separated list | all> (default: utf-8)"
  warn "    -b|--base <bin|oct|dec|hex> (default: numbers are interpreted as hexadecimals)"
  warn "    -n|--with-names: also show each codepoint's Unicode name if possible"
  warn "    -f|--forward: expect strings of characters as input instead of numbers; implies -n"
  warn ""
  exit
end
begin
  @opts = Slop.parse do |o|
    o.string '-e', '--encoding', ''
    o.string '-b', '--base', ''
    o.bool '-l', '--list-encodings', ''
    o.bool '-n', '--with-names', ''
    o.bool '-f', '--forward', ''
  end
rescue Slop::UnknownOption, Slop::MissingArgument, Slop::MissingRequiredOption
  print_usage
end
@opts[:encoding] ||= 'UTF-8'
@opts[:base] ||= 'hex'
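# Map lower-cased encoding names to Ruby's canonical names, sorted alphabetically.
@known_encodings = Encoding.list.map { |e| [e.to_s.downcase, e.to_s] }.to_h.sort.to_h
@encodings = []
if @opts[:list_encodings]
  @known_encodings.each_key { |k| puts k }
  exit
else
  print_usage unless @opts.arguments.any?
  @opts[:encoding].split(',').each do |encoding|
    (@encodings = @known_encodings.values; break) if encoding == 'all'
    known = @known_encodings[encoding.strip.downcase]
    # Skip names Ruby does not know instead of passing nil on to Integer#chr.
    known ? @encodings << known : warn("(unknown encoding: #{encoding})")
  end
end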
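# Convert a number given as a string in the base selected with -b to an Integer.
def basex2dec(n)
  base = case @opts[:base].to_sym
         when :bin then 2
         when :oct then 8
         when :dec then 10
         when :hex then 16
         else print_usage
         end
  n.to_s.to_i(base)
end

# Interpret the integer as a character code in the given encoding.
def encode(i, encoding)
  i.chr(encoding)
end

# Print each character (converted to UTF-8) together with its Unicode name.
def show_names(chars, indent: '')
  chars.each do |char|
    char = char.encode(Encoding::UTF_8)
    puts "#{indent}#{char.blue}:\t#{Unicode::Name.of(char)}"
  end
end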
def nums_to_chars(encoding)
  chars = []
  begin
    # The original gsub replaced every match with itself and so had no effect;
    # String#to_i already accepts a radix prefix such as "0x" that matches the base.
    @opts.arguments.each { |n| chars << encode(basex2dec(n.downcase), encoding) }
    print "#{encoding} codepoints as hexadecimals: "; pp chars
    puts "As displayed by terminal in #{encoding}: \"#{chars.join.blue}\""
    puts "As displayed by terminal after conversion from #{encoding} to UTF-8: \"#{chars.join.encode(Encoding::UTF_8).light_blue}\""
    show_names(chars, indent: "\t") if @opts[:with_names]
  rescue RangeError => e
    warn "(#{e.message})"
  rescue Encoding::UndefinedConversionError,
         Encoding::InvalidByteSequenceError,
         Encoding::CompatibilityError,
         Encoding::ConverterNotFoundError => e
    warn "(converting from #{encoding} to UTF-8 not possible: #{e.message.yellow})"
  end
  puts
end
if @opts[:forward]
  # Forward mode does not use the encoding list, so show the names only once.
  show_names @opts.arguments.join(' ').chars
else
  @encodings.each { |encoding| nums_to_chars(encoding) }
end
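
The forward direction (`-f`) goes from characters back to numbers. As a rough sketch of that idea without the `unicode-name` gem, the following lists each character's Unicode code point and its bytes in a target encoding. `describe_chars` is a hypothetical helper, not part of the script above, and it assumes the input string is UTF-8.

```ruby
# Hypothetical helper for illustration; assumes +text+ is a UTF-8 string.
# Returns one line per character: its code point and its bytes in +encoding+.
def describe_chars(text, encoding: 'UTF-8')
  text.each_char.map do |char|
    # Transcode the single character, then render each byte as two hex digits.
    bytes = char.encode(encoding).bytes.map { |b| format('%02X', b) }.join(' ')
    format('U+%04X %s', char.ord, bytes)
  end
end

puts describe_chars("\u00E4\u20AC")                    # UTF-8 bytes
puts describe_chars("\u20AC", encoding: 'Windows-1252') # single-byte encoding
```

Characters that do not exist in the target encoding raise `Encoding::UndefinedConversionError`, the same error the script rescues in the other direction.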