Created
October 16, 2012 12:31
Gist: mvidner/3898978
Ruby 1.9 encoding: Array#pack('Z*') is not binary
rbx-2.0.0-dev
Running with rubinius 2.0.0dev (1.9.3 9ad741f3 yyyy-mm-dd JI) [i686-pc-linux-gnu]
* A binary packet
ASCII-8BIT: � ("\xFF") 1 bytes
* A unicode string
UTF-8: Václav Havel ("V\xC3\xA1clav Havel") 13 bytes
* Plan B
** Adding NUL
UTF-8: Václav Havel ("V\xC3\xA1clav Havel\x00") 14 bytes
** Forcing 8-bit
ASCII-8BIT: Václav Havel ("V\xC3\xA1clav Havel\x00") 14 bytes
** Putting them together with String#+
ASCII-8BIT: �Václav Havel ("\xFFV\xC3\xA1clav Havel\x00") 15 bytes
* Plan A
** Converting that string with pack("Z*"):
US-ASCII: Václav Havel ("V\xC3\xA1clav Havel\x00") 14 bytes
** Putting them together with String#+
An exception occurred running rbx-encoding-pack-zstar.rb
undefined conversion for '"\xFF"' from ASCII-8BIT to US-ASCII (Rubinius::EncodingClass::Encoding::CompatibilityError)
Backtrace:
Rubinius::Type.compatible_encoding at kernel/common/type19.rb:47
String#<< at kernel/common/string19.rb:420
String#+ at kernel/common/string.rb:65
Object#__script__ at rbx-encoding-pack-zstar.rb:36
Rubinius::CodeLoader#load_script at kernel/delta/codeloader.rb:68
Rubinius::CodeLoader.load_script at kernel/delta/codeloader.rb:110
Rubinius::Loader#script at kernel/loader.rb:614
Rubinius::Loader#main at kernel/loader.rb:815
1.9.3-p194
Running with ruby 1.9.3p194 (2012-04-20 revision 35410) [i686-linux]
* A binary packet
ASCII-8BIT: � ("\xFF") 1 bytes
* A unicode string
UTF-8: Václav Havel ("Václav Havel") 13 bytes
* Plan B
** Adding NUL
UTF-8: Václav Havel ("Václav Havel\u0000") 14 bytes
** Forcing 8-bit
ASCII-8BIT: Václav Havel ("V\xC3\xA1clav Havel\x00") 14 bytes
** Putting them together with String#+
ASCII-8BIT: �Václav Havel ("\xFFV\xC3\xA1clav Havel\x00") 15 bytes
* Plan A
** Converting that string with pack("Z*"):
ASCII-8BIT: Václav Havel ("V\xC3\xA1clav Havel\x00") 14 bytes
** Putting them together with String#+
ASCII-8BIT: �Václav Havel ("\xFFV\xC3\xA1clav Havel\x00") 15 bytes
jruby-1.6.7
Running with jruby 1.6.7 (ruby-1.9.2-p312) (2012-02-22 3e82bc8) (Java HotSpot(TM) Client VM 1.7.0_07) [linux-i386-java]
* A binary packet
ASCII-8BIT: � ("\u00FF") 1 bytes
* A unicode string
UTF-8: Václav Havel ("Václav Havel") 13 bytes
* Plan B
** Adding NUL
UTF-8: Václav Havel ("Václav Havel\u0000") 14 bytes
** Forcing 8-bit
ASCII-8BIT: Václav Havel ("V\u00C3\u00A1clav Havel\u0000") 14 bytes
** Putting them together with String#+
ASCII-8BIT: �Václav Havel ("\u00FFV\u00C3\u00A1clav Havel\u0000") 15 bytes
* Plan A
** Converting that string with pack("Z*"):
ASCII-8BIT: Václav Havel ("V\u00C3\u00A1clav Havel\u0000") 14 bytes
** Putting them together with String#+
ASCII-8BIT: �Václav Havel ("\u00FFV\u00C3\u00A1clav Havel\u0000") 15 bytes
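The Rubinius failure above can probably be reproduced on other implementations by mimicking its pack("Z*") result by hand. The following is only a sketch under that assumption: the `mistagged` string is built with force_encoding rather than obtained from pack, and the variable names are made up for illustration. It shows that a US-ASCII string carrying bytes above 0x7F has no encoding compatible with a binary string that also carries high bytes, which is the check String#+ trips over.

```ruby
# Simulate what the Rubinius run printed for pack("Z*"):
# the UTF-8 bytes of the name, NUL-terminated, but tagged US-ASCII.
# (Assumption: built by hand here; the MRI and JRuby runs return ASCII-8BIT.)
binary    = 255.chr # "\xFF", ASCII-8BIT
mistagged = "V\xC3\xA1clav Havel\x00".dup.force_encoding('US-ASCII')

# Encoding.compatible? returns nil when no common encoding exists.
compatible = Encoding.compatible?(binary, mistagged)
puts "compatible encoding: #{compatible.inspect}"

raised = false
begin
  binary + mistagged
rescue Encoding::CompatibilityError => e
  raised = true
  puts "#{e.class}: #{e.message}"
end
```

The exact exception message differs between implementations and versions, so only the exception class should be relied on.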
#!/usr/bin/env ruby
# -*- coding: utf-8 -*-
puts "Running with #{RUBY_DESCRIPTION}"

# Print a string's encoding, its text, its inspect form and its byte size.
def dump(s)
  puts "#{s.encoding}: #{s} (#{s.inspect}) #{s.bytesize} bytes"
end

puts '* A binary packet'
binary = 255.chr
dump binary
puts '* A unicode string'
unicode = 'Václav Havel'
dump unicode
puts '* Plan B'
puts '** Adding NUL'
packed_b = unicode + 0.chr
dump packed_b
puts '** Forcing 8-bit'
packed_b.force_encoding('ASCII-8BIT')
dump packed_b
puts '** Putting them together with String#+'
together_b = binary + packed_b
dump together_b
puts '* Plan A'
puts '** Converting that string with pack("Z*"):'
packed_a = [unicode].pack('Z*')
dump packed_a # US-ASCII on rbx-2.0.0-dev, WTF; ASCII-8BIT on MRI 1.9.3 and JRuby 1.6.7
puts '** Putting them together with String#+'
together_a = binary + packed_a # raises CompatibilityError on rbx-2.0.0-dev
dump together_a
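
A possible way to make Plan A behave the same everywhere is to pin the encoding of the pack("Z*") result explicitly, much as Plan B does with force_encoding. This is a minimal sketch, not something the script above does: `pack_z_binary` is a hypothetical helper name, and whether it avoids the problem on the Rubinius build in the log is an assumption based on Plan B succeeding there.

```ruby
# -*- coding: utf-8 -*-
# Hypothetical helper (not part of the original script): pack a string as a
# NUL-terminated field and tag the result as binary, whatever encoding the
# implementation's pack("Z*") happened to return.
def pack_z_binary(str)
  [str].pack('Z*').force_encoding(Encoding::BINARY)
end

binary  = 255.chr        # "\xFF", ASCII-8BIT
unicode = 'Václav Havel' # UTF-8, 13 bytes
packet  = binary + pack_z_binary(unicode)
puts "#{packet.encoding}: #{packet.inspect} #{packet.bytesize} bytes"
```

Encoding::BINARY is an alias for ASCII-8BIT, so the concatenation joins two strings with the same encoding and no compatibility check can fail.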