Skip to content

Instantly share code, notes, and snippets.

@retorillo
Last active July 14, 2017 10:27
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save retorillo/1bb238023ecc29eed6cef636db930ba0 to your computer and use it in GitHub Desktop.
Save retorillo/1bb238023ecc29eed6cef636db930ba0 to your computer and use it in GitHub Desktop.
GIF and LZW References

LZW References

LZW data compression algorithm (US PAT.)

Compression of Individual Sequences via Variable Rate Coding (IEEE)

A Technique For High-Performance Data Compression (IEEE)

LZW Data Compression

The LZW algorithm (Data Compression Reference Center)

GIF Specification (W3C)

@retorillo
Copy link
Author

retorillo commented May 28, 2017

Variable-Length-Code LZW for GIF

  • The Variable-Length-Code aspect of the algorithm is based on an initial code size (LZW-initial code size), which specifies the initial number of bits used for the compression codes.
  • When the number of patterns detected by the compressor in the input stream exceeds the number of patterns encodable with the current
    number of bits, the number of bits per LZW code is increased by one.
  • The first byte of the Compressed Data stream is a value indicating the minimum number of bits required to represent the set of actual pixel values. Normally this will be the same as the number of color bits. Because of some algorithmic constraints however, black & white images which have one color bit must be indicated as having a code size of 2. This code size value also implies that the compression codes must start out one bit longer.

         7 6 5 4 3 2 1 0
        +---------------+
        | LZW code size |
        +---------------+

        +---------------+ ----+
        |  block size   |     |
        +---------------+     |
        |               |     +-- Repeated as many
        |  data bytes   |     |   times as necessary.
        |               |     |
        +---------------+ ----+

        . . .       . . . ------- The code that terminates the LZW
                                  compressed data must appear before
                                  Block Terminator.
        +---------------+
        |0 0 0 0 0 0 0 0|  Block Terminator
        +---------------+

Main differences

  • A special Clear code is defined which resets all compression/decompression parameters and tables to a start-up state. The value of this code is 2 ^ <CODE SIZE>. For example if the code size indicated was 4 (image was 4 bits/pixel) the Clear code value would be 16 (10000 binary). The Clear code can appear at any point in the image data stream and therefore requires the LZW algorithm to process succeeding codes as if a new data stream was starting. Encoders should output a Clear code as the first code of each image data stream.
  • An End of Information code is defined that explicitly indicates the end of the image data stream. LZW processing terminates when this code is encountered. It must be the last code output by the encoder for an image. The value of this code is <Clear code> + 1.
  • The first available compression code value is <Clear code> + 2. (<Clear code> + 1 is "End of Information code")
  • The output codes are of variable length, starting at <CODE SIZE> + 1 bits per code, up to 12 bits per code. This defines a maximum code value of 4095(0xFFF). Whenever the LZW code value would exceed the current code length, the code length is increased by one. The packing/unpacking of these codes must then be altered to reflect the new code length.

Clear Code

As the specification says, they may appear at anytime. There is not a requirement to send a clear code when the string table is full.
It is the encoder's decision as to when the table should be cleared. When the table is full, the encoder can chose to use the table as is, making no changes to it until the encoder chooses to clear it. The encoder during this time sends out codes that are of the maximum Code Size.

As we can see from the above, when the decoder's table is full, **it must not change the table until a clear code is received. ** The Code Size is that of the maximum Code Size. Processing other than this is done normally.

How to build byte array from variable-length-codes

Because the LZW compression used for GIF creates a series of variable length codes, of between 3 and 12 bits each, these codes must be reformed into aseries of 8-bit bytes that will be the characters actually stored or transmitted. This provides additional compression of the image. The codes are formed into a stream of bits as if they were packed right to left and then picked off 8 bits at a time to be output.

Assuming a character array of 8 bits per character and using 5 bit codes to be packed, an example layout would be similar to:


     +---------------+
  0  |               |    bbbaaaaa
     +---------------+
  1  |               |    dcccccbb
     +---------------+
  2  |               |    eeeedddd
     +---------------+
  3  |               |    ggfffffe
     +---------------+
  4  |               |    hhhhhggg
     +---------------+
           . . .
     +---------------+
  N  |               |
     +---------------+

Note that the physical packing arrangement will change as the number of bits per compression code change but the concept remains the same.

https://www.w3.org/Graphics/GIF/spec-gif89a.txt

@retorillo
Copy link
Author

retorillo commented May 30, 2017

GIF Grammar

<GIF Data Stream> ::=     Header <Logical Screen> <Data>* Trailer

<Logical Screen> ::=      Logical Screen Descriptor [Global Color Table]

<Data> ::=                <Graphic Block>  |
                          <Special-Purpose Block>

<Graphic Block> ::=       [Graphic Control Extension] <Graphic-Rendering Block>

<Graphic-Rendering Block> ::=  <Table-Based Image>  |
                               Plain Text Extension

<Table-Based Image> ::=   Image Descriptor [Local Color Table] Image Data

<Special-Purpose Block> ::=    Application Extension  |
                               Comment Extension

@retorillo
Copy link
Author

retorillo commented May 31, 2017

GIF Color Tables

Both color tables, the Global and the Local, are optional; if present, the Global Color Table is to be used with every image in the Data Stream for which a Local Color Table is not given; if present, a Local Color Table overrides the Global Color Table. However, if neither color table is present, the application program is free to use an arbitrary color table. If the graphics in several Data Streams are related and all use the same color table, an encoder could place the color table as the Global Color Table in the first Data Stream and leave subsequent Data Streams without a Global Color Table or any Local Color Tables; in this way, the overhead for the table is eliminated. It is recommended that the decoder save the previous Global Color Table to be used with the Data Stream that follows, in case it does not contain either a Global Color Table or any Local Color Tables. In general, this allows the application program to use past color tables, significantly reducing transmission overhead.

      7 6 5 4 3 2 1 0        Field Name                    Type
     +===============+
  0  |               |       Red 0                         Byte
     +-             -+
  1  |               |       Green 0                       Byte
     +-             -+
  2  |               |       Blue 0                        Byte
     +-             -+
  3  |               |       Red 1                         Byte
     +-             -+
     |               |       Green 1                       Byte
     +-             -+
 up  |               |
     +-   . . . .   -+       ...
 to  |               |
     +-             -+
     |               |       Green 255                     Byte
     +-             -+
767  |               |       Blue 255                      Byte
     +===============+

@retorillo
Copy link
Author

Animated-GIF Looping extension (unofficial)

http://www.vurdalakov.net/misc/gif/netscape-looping-application-extension

Syntax

    +---------------+
 0  |     0x21      |  Extension Label
    +---------------+
 1  |     0xFF      |  Application Extension Label
    +---------------+
 2  |     0x0B      |  Block Size
    +---------------+
 3  |               | 
    +-             -+
 4  |               | 
    +-             -+
 5  |               | 
    +-             -+
 6  |               | 
    +-  NETSCAPE   -+  Application Identifier (8 bytes)
 7  |               | 
    +-             -+
 8  |               | 
    +-             -+
 9  |               | 
    +-             -+
10  |               | 
    +---------------+
11  |               | 
    +-             -+
12  |      2.0      |  Application Authentication Code (3 bytes)
    +-             -+
13  |               | 
    +===============+                      --+
14  |     0x03      |  Sub-block Data Size   |
    +---------------+                        |
15  |     0x01      |  Sub-block ID          |
    +---------------+                        | Application Data Sub-block
16  |               |                        |
    +-             -+  Loop Count (2 bytes)  |
17  |               |                        |
    +===============+                      --+
18  |     0x00      |  Block Terminator
    +---------------+
  • Extension Label - Defines this block as an extension. This field contains the fixed value 0x21 (33).
  • Application Extension Label - Identifies the block as an Application Extension. This field contains the fixed value 0xFF (255).
  • Block Size - Number of bytes in this extension block, following the Block Size field, up to but not including the beginning of the Application Data. This field contains the fixed value 0x0B (11).
  • Application Identifier - Sequence of eight printable ASCII characters used to identify the application owning the Application Extension. This field contains the fixed value "NETSCAPE".
  • Application Authentication Code - Sequence of three bytes used to authenticate the Application Identifier. An Application program may use an algorithm to compute a binary code that uniquely identifies it as the application owning the Application Extension. This field contains the fixed value "2.0". Sometimes Application Identifier and Application Authentication Code fields are referred as one "NETSCAPE2.0" field.
  • Sub-block Data Size - Indicates the number of data bytes to follow. The size of the block does not account for the size byte itself. This field contains the fixed value 0x03 (3).
  • Sub-block ID - Identifies the Netscape Looping Extension. This field contains the fixed value 0x01 (1).
  • Loop Count - Indicates the number of iterations the animated GIF should be executed. This field is an unsigned 2-byte integer in little-endian (least significant byte first) byte order. 0x00 (0) means infinite loop.
  • Block Terminator - This zero-length data block marks the end of the Application Extension. This field contains the fixed value 0x00 (0).

@retorillo
Copy link
Author

Netscape Buffering Application Extension (GIF Unofficial Specification)

    +---------------+
 0  |     0x21      |  Extension Label
    +---------------+
 1  |     0xFF      |  Application Extension Label
    +---------------+
 2  |     0x0B      |  Block Size
    +---------------+
 3  |               | 
    +-             -+
 4  |               | 
    +-             -+
 5  |               | 
    +-             -+
 6  |               | 
    +-  NETSCAPE   -+  Application Identifier (8 bytes)
 7  |               | 
    +-             -+
 8  |               | 
    +-             -+
 9  |               | 
    +-             -+
10  |               | 
    +---------------+
11  |               | 
    +-             -+
12  |      2.0      |  Application Authentication Code (3 bytes)
    +-             -+
13  |               | 
    +===============+                      --+
14  |     0x05      |  Sub-block Data Size   |
    +---------------+                        |
15  |     0x02      |  Sub-block ID          |
    +---------------+                        |
16  |               |                        |
    +-             -+                        | Application Data Sub-block
17  |               |                        |
    +-             -+  Buffer Size (4 bytes) |
18  |               |                        |
    +-             -+                        |
19  |               |                        |
    +===============+                      --+
20  |     0x00      |  Block Terminator
    +---------------+

http://www.vurdalakov.net/misc/gif/netscape-buffering-application-extension

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment