masak/pack-unpack.md

## pack-unpack.md

      
Here is the difference between Perl 5 and Perl 6 pack/unpack:

    Perl 5                      Perl 6

    pack(List)  --> Str         pack(List)  --> Buf
    unpack(Str) --> List        unpack(Buf) --> List

Not only that, but some Perl 5 template rules assume an uncomplicated two-way street between Buf and Str. There simply is no real distinction in Perl 5 between Buf and Str, and Perl 5 makes use of that quite a bit.
Just looking at the description of the first template rule gives an indication of this:
   a   A string with arbitrary binary data, will be null padded.

In Perl 5, strings can contain arbitrary binary data. In Perl 6, strings contain characters, and not every sequence of bytes is a legal string.
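
Python 3 draws a similar line between byte buffers and character strings, so it can serve as a rough analogy for the Buf/Str split. The sketch below is an illustration in Python, not Perl 6 code, and the helper name `decodes_as_utf8` is made up for this example. It shows that an arbitrary byte sequence is not necessarily valid text.

```python
# Analogy only: Python's bytes/str split stands in for the Buf/Str split.
# 0xC3 announces a two-byte UTF-8 sequence, but 0x28 is not a continuation byte.
raw = bytes([0xC3, 0x28])


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if the bytes form valid UTF-8 text."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(decodes_as_utf8(b"hello"))  # plain ASCII text is valid
print(decodes_as_utf8(raw))       # arbitrary binary data is rejected
```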
Let us approach this from the descriptions in perldoc -f pack, and annotate each template rule with comments of what it could be in Perl 6. Since we are talking about pack, the discussion will be about what type of object can be passed in to each template.
   a   A string with arbitrary binary data, will be null padded.

        Since the purpose here is to insert arbitrary binary data, it
        seems that a Buf would do fine.

   A   A text (ASCII) string, will be space padded.

        A Buf, which will be space padded to the next byte boundary.
        We ignore the nonsense about ASCII. :-)

   Z   A null terminated (ASCIZ) string, will be null padded.

        Same as A above: a Buf, except that the padding is null bytes.
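
To make the three padding rules concrete, here is a minimal Python sketch that works on byte buffers, in line with the Buf suggestion above. The function names are hypothetical, and only the case with an explicit width is covered.

```python
def pack_a(data: bytes, width: int) -> bytes:
    """'a': truncate to width, then pad with null bytes."""
    return data[:width].ljust(width, b"\x00")


def pack_A(data: bytes, width: int) -> bytes:
    """'A': truncate to width, then pad with spaces."""
    return data[:width].ljust(width, b" ")


def pack_Z(data: bytes, width: int) -> bytes:
    """'Z': like 'a', but the final byte is always a null terminator."""
    if width == 0:
        return b""
    return data[:width - 1].ljust(width, b"\x00")


print(pack_a(b"hi", 5))     # b'hi\x00\x00\x00'
print(pack_A(b"hi", 5))     # b'hi   '
print(pack_Z(b"hello", 3))  # b'he\x00'
```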

   b   A bit string (ascending bit order inside each byte, like vec()).

        An Int. (Or possibly, wherever 'Int' is written here and
        below, anything that can be converted to an Int. Even a
        Rat or Num should be OK, except that it will be truncated
        to an Int.)

   B   A bit string (descending bit order inside each byte).

        An Int.

   h   A hex string (low nybble first).

        An Int.

   H   A hex string (high nybble first).

        An Int.
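
The bit-order and nybble-order wording in the four rules above is easy to misread, so here is a small Python sketch of the byte layouts as the descriptions state them. It takes digit strings, as Perl 5 does, rather than the Int proposed in the notes. The function names are hypothetical, and zero-filling a short final chunk is an assumption of this sketch.

```python
def pack_bits(bits: str, ascending: bool) -> bytes:
    """Pack a string of '0'/'1' characters, eight per byte.

    ascending=True  follows 'b' (first character is the lowest bit);
    ascending=False follows 'B' (first character is the highest bit).
    """
    out = bytearray()
    for start in range(0, len(bits), 8):
        chunk = bits[start:start + 8].ljust(8, "0")  # zero-fill a short final chunk
        if ascending:
            chunk = chunk[::-1]  # reverse so int(..., 2) reads the highest bit first
        out.append(int(chunk, 2))
    return bytes(out)


def pack_hex(digits: str, low_first: bool) -> bytes:
    """Pack a string of hex digits, two per byte.

    low_first=True  follows 'h' (first digit is the low nybble);
    low_first=False follows 'H' (first digit is the high nybble).
    """
    out = bytearray()
    for start in range(0, len(digits), 2):
        pair = digits[start:start + 2].ljust(2, "0")  # zero-fill an odd final digit
        if low_first:
            pair = pair[::-1]
        out.append(int(pair, 16))
    return bytes(out)


print(pack_bits("10000000", ascending=True).hex())   # 01
print(pack_bits("10000000", ascending=False).hex())  # 80
print(pack_hex("1f", low_first=True).hex())          # f1
print(pack_hex("1f", low_first=False).hex())         # 1f
```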


   c   A signed char (8-bit) value.

        An Int.

   C   An unsigned char (octet) value.

        An Int.

   W   An unsigned char value (can be greater than 255).

        An Int.


   s   A signed short (16-bit) value.

        An Int.

   S   An unsigned short value.

        An Int.


   l   A signed long (32-bit) value.

        An Int.

   L   An unsigned long value.

        An Int.


   q   A signed quad (64-bit) value.

        An Int.

   Q   An unsigned quad value.
         (Quads are available only if your system supports 64-bit
          integer values _and_ if Perl has been compiled to support those.
          Causes a fatal error otherwise.)

        An Int. Not sure the restriction applies, since we are doing this
        above the metal.
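
The fixed-width integer rules above all amount to squeezing an arbitrary-precision integer into a set number of bytes. Below is a minimal Python sketch of that step, under two assumptions made for this example: little-endian byte order, and an error on overflow. `pack_int` is a hypothetical helper. It also illustrates the point about quads: with arbitrary-precision integers, 64-bit values need no special platform support.

```python
def pack_int(value: int, bits: int, signed: bool) -> bytes:
    """Pack an arbitrary-precision int into bits/8 little-endian bytes.

    Raises OverflowError if the value does not fit in the requested width.
    """
    return value.to_bytes(bits // 8, byteorder="little", signed=signed)


print(pack_int(-1, 8, signed=True).hex())           # ff  (like 'c')
print(pack_int(255, 8, signed=False).hex())         # ff  (like 'C')
print(pack_int(-2, 16, signed=True).hex())          # feff (like 's')
print(pack_int(2**64 - 1, 64, signed=False).hex())  # ffffffffffffffff (like 'Q')

try:
    pack_int(256, 8, signed=False)
except OverflowError:
    print("256 does not fit in an unsigned 8-bit value")
```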


   i   A signed integer value.

        An Int.

   I   An unsigned integer value.
         (This 'integer' is _at_least_ 32 bits wide.  Its exact
          size depends on what a local C compiler calls 'int'.)

        An Int. (We have uint but no UInt. And if we allow Int we implicitly
        allow uint since uint gets promoted into Int.)


   n   An unsigned short (16-bit) in "network" (big-endian) order.

        An Int.

   N   An unsigned long (32-bit) in "network" (big-endian) order.

        An Int.

   v   An unsigned short (16-bit) in "VAX" (little-endian) order.

        An Int.

   V   An unsigned long (32-bit) in "VAX" (little-endian) order.

        An Int.
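
The four byte-order rules above differ only in width and in which end of the integer is written first. Python's `struct` module has direct equivalents, used here as an illustration of the layouts rather than as Perl code.

```python
import struct

value16 = 0x1234
value32 = 0x12345678

# ">" selects big-endian ("network") order, "<" selects little-endian ("VAX") order.
print(struct.pack(">H", value16).hex())  # 1234      (like 'n')
print(struct.pack(">I", value32).hex())  # 12345678  (like 'N')
print(struct.pack("<H", value16).hex())  # 3412      (like 'v')
print(struct.pack("<I", value32).hex())  # 78563412  (like 'V')
```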


   j   A Perl internal signed integer value (IV).

        An Int. (Would be fun to support this, but need to find out
        how the format works.)

   J   A Perl internal unsigned integer value (UV).

        An Int. (See note on 'j'.)


   f   A single-precision float in the native format.

        A Num. (But what is "the native format" here? The float format
        in the VM? Is that even exposed anywhere?)

   d   A double-precision float in the native format.

        A Num.
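
As a point of comparison for the "native format" question, the sketch below assumes IEEE 754 single and double precision, which is what Python's `struct` module uses for its standard-size `f` and `d` codes. Whether a Perl 6 VM exposes the same formats is left open by the notes above. The example shows the two widths and the precision lost in the single-precision case.

```python
import struct

x = 0.1

single = struct.pack("<f", x)  # IEEE 754 binary32, little-endian
double = struct.pack("<d", x)  # IEEE 754 binary64, little-endian

print(len(single), len(double))  # 4 8

# A double survives the round trip; a single is rounded to fewer bits.
roundtrip_single = struct.unpack("<f", single)[0]
roundtrip_double = struct.unpack("<d", double)[0]
print(roundtrip_double == x)  # True
print(roundtrip_single == x)  # False
```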


   F   A Perl internal floating point value (NV) in the native format

        A Num. (See note on 'j'.)

   D   A long double-precision float in the native format.
         (Long doubles are available only if your system supports long
          double values _and_ if Perl has been compiled to support those.
          Causes a fatal error otherwise.)

        A Num. (See note on 'j'.)


   p   A pointer to a null-terminated string.

        NO! Do not even *think* about it!

   P   A pointer to a structure (fixed-length string).

        NO!


   u   A uuencoded string.

        A Str. (Unsure about this one.)

   U   A Unicode character number.  Encodes to a character in character mode
       and UTF-8 (or UTF-EBCDIC in EBCDIC platforms) in byte mode.

        An Int.
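
Both rules above turn a value into a specific byte sequence, which a short Python sketch can show. It uses `binascii.b2a_uu` for uuencoding and `str.encode` for the byte-mode (UTF-8) case of 'U'. In Python the uuencoder takes bytes rather than text, which may bear on the Str-or-Buf question for 'u'.

```python
import binascii

# 'u': uuencode a short payload (b2a_uu handles at most 45 bytes per line).
payload = b"abc"
encoded = binascii.b2a_uu(payload)
decoded = binascii.a2b_uu(encoded)
print(encoded)
print(decoded == payload)  # True

# 'U' in byte mode: a code point number becomes its UTF-8 encoding.
code_point = 0x263A  # WHITE SMILING FACE
utf8 = chr(code_point).encode("utf-8")
print(utf8.hex())  # e298ba
```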


   w   A BER compressed integer (not an ASN.1 BER, see perlpacktut for
       details).  Its bytes represent an unsigned integer in base 128,
       most significant digit first, with as few digits as possible.  Bit
       eight (the high bit) is set on each byte except the last.

        An Int.
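
The description of 'w' translates directly into code: base-128 digits, most significant first, with the high bit set on every byte except the last. Here is a minimal Python sketch of that encoding and its inverse; the function names are hypothetical.

```python
def ber_encode(n: int) -> bytes:
    """Encode an unsigned integer as a BER compressed integer."""
    if n < 0:
        raise ValueError("BER compressed integers are unsigned")
    digits = [n & 0x7F]  # last byte: high bit clear
    n >>= 7
    while n:
        digits.append((n & 0x7F) | 0x80)  # earlier bytes: high bit set
        n >>= 7
    return bytes(reversed(digits))  # most significant digit first


def ber_decode(data: bytes) -> int:
    """Decode a single BER compressed integer."""
    n = 0
    for byte in data:
        n = (n << 7) | (byte & 0x7F)
    return n


print(ber_encode(127).hex())  # 7f
print(ber_encode(128).hex())  # 8100
print(ber_encode(300).hex())  # 822c
print(ber_decode(ber_encode(2**70)) == 2**70)  # True
```

Because the encoding has no fixed width, it fits an arbitrary-precision Int without any overflow check.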


   x   A null byte.
   X   Back up a byte.
   @   Null fill or truncate to absolute position, counted from the
       start of the innermost ()-group.
   .   Null fill or truncate to absolute position specified by value.

        x, X and @ require no arguments. The '.' rule takes its
        position "specified by value", so it presumably needs an Int.
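
The positioning rules act on the output buffer rather than on a packed value. A minimal Python sketch of 'x', 'X' and '@' on a `bytearray` follows. The helper names are hypothetical, and ()-groups are ignored, so positions are counted from the start of the buffer.

```python
def null_byte(buf: bytearray) -> None:
    """'x': append a null byte."""
    buf.append(0)


def back_up(buf: bytearray) -> None:
    """'X': back up one byte by dropping the last one."""
    if not buf:
        raise ValueError("cannot back up past the start of the buffer")
    buf.pop()


def move_to(buf: bytearray, pos: int) -> None:
    """'@': null fill or truncate so the buffer is exactly pos bytes long."""
    if len(buf) < pos:
        buf.extend(bytes(pos - len(buf)))  # bytes(n) is n null bytes
    else:
        del buf[pos:]


buf = bytearray(b"AB")
null_byte(buf)
print(bytes(buf))  # b'AB\x00'
back_up(buf)
print(bytes(buf))  # b'AB'
move_to(buf, 5)
print(bytes(buf))  # b'AB\x00\x00\x00'
move_to(buf, 1)
print(bytes(buf))  # b'A'
```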