@vshankar
Created April 3, 2012 19:00
Compression/DeCompression Translator

Usefulness

This translator reduces the amount of data transferred over the wire by compressing (deflating) it before it is written to the network. The compressed data is decompressed (inflated) on the client side. The translator therefore needs to be loaded on both the client and the server, with inverse operation modes.

Compression and decompression are referred to as deflate and inflate, respectively, in the rest of this document.

What is Deflated/Inflated

Data transferred over the network as a result of a file read operation is deflated: the translator implements a readv fop that deflates the data before passing it to the server protocol. On the client side, the deflated data is inflated after the client protocol hands the data read from the network over to the translator.

It may be worthwhile to deflate data for writev fop too. In this case, the operation mode would need to be flipped as the client would deflate and the server would inflate the data (for this fop).

Data format

Deflated data is transferred in Gzip format. The Zlib library is used to deflate and inflate data and to compute data checksums.

Gzip Format: <Gzip header> + <compressed data> + <Gzip trailer>

Gzip Header

10 byte header consisting of: '\037', '\213', Z_DEFLATED, 0, 0, 0, 0, 0, 0, 0x03

As of now, this header is transferred only as a debugging aid: in-memory deflated data can be written to a file on disk and examined to check that deflation was correct. (See Debugging later in this document.)
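For reference, the 10 header bytes listed above can be written out in Python, whose zlib module wraps the same Zlib library. This is only an illustrative sketch; the field names in the comments come from the gzip file format (RFC 1952), not from the translator source.

```python
import zlib

# The 10 header bytes from the list above, annotated with RFC 1952 field names.
GZIP_HEADER = bytes([
    0o037,          # ID1: first magic byte (0x1f)
    0o213,          # ID2: second magic byte (0x8b)
    zlib.DEFLATED,  # CM: compression method, 8 = deflate (Z_DEFLATED in C)
    0,              # FLG: no optional header fields
    0, 0, 0, 0,     # MTIME: modification time not recorded
    0,              # XFL: no extra flags
    0x03,           # OS: 3 = Unix
])

print(GZIP_HEADER.hex(" "))  # 1f 8b 08 00 00 00 00 00 00 03
```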

Deflated data is identified on the client side by the presence of a key:value pair in a dictionary, not by checking bits in the Gzip header. Commit 9d3af... introduces an extra dict in all fops that can be exchanged between all layers. The presence of a chosen key indicates deflated content.
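The dictionary-based identification can be pictured with a small Python sketch. Everything here is illustrative: the key name DEFLATED_KEY and both helper functions are hypothetical and are not taken from the translator, which works on the dict passed along with each fop.

```python
from typing import Optional, Tuple

# Hypothetical key name, chosen only for this sketch.
DEFLATED_KEY = "example.cdc.deflated"


def server_reply(payload: bytes, deflated: bool) -> Tuple[bytes, dict]:
    """Return the payload together with the extra dict that travels with the reply."""
    xdata = {}
    if deflated:
        xdata[DEFLATED_KEY] = "1"  # only the presence of the key matters
    return payload, xdata


def client_needs_inflate(xdata: Optional[dict]) -> bool:
    """Decide from the dict alone; the Gzip header bytes are never inspected."""
    return xdata is not None and DEFLATED_KEY in xdata
```

With this scheme, a reply that was passed through without deflation simply carries no key, and the client hands the payload up unchanged.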

Deflated Content

Deflation is handled by the deflate() routine in the Zlib library, and inflation by the inflate() routine in the same library. Interested readers may look at the Zlib documentation for details.

Both APIs need correct pointers to the input and output buffers, along with the lengths of those buffers, set in Zlib's stream structure (z_stream).
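The streaming use of deflate() and inflate() can be sketched with Python's zlib module, which manages the stream structure and buffer pointers internally. This is not the translator's C code. It assumes the compressed part is a raw deflate stream (wbits=-15), since the Gzip header and trailer are added separately, and BUFFER_SIZE simply mirrors the buffer-size value used in the volfiles.

```python
import zlib

BUFFER_SIZE = 16384  # mirrors the buffer-size option used in the volfiles


def deflate_chunks(data: bytes, level: int = -1):
    """Yield raw deflate output, feeding the stream BUFFER_SIZE bytes at a time."""
    # wbits=-15 requests a raw deflate stream without a zlib or gzip wrapper.
    stream = zlib.compressobj(level, zlib.DEFLATED, -15)
    for start in range(0, len(data), BUFFER_SIZE):
        out = stream.compress(data[start:start + BUFFER_SIZE])
        if out:
            yield out
    yield stream.flush()  # finish the stream and emit whatever is still buffered


def inflate_chunks(chunks) -> bytes:
    """Inflate an iterable of raw deflate chunks back into the original bytes."""
    stream = zlib.decompressobj(-15)
    parts = [stream.decompress(chunk) for chunk in chunks]
    parts.append(stream.flush())
    return b"".join(parts)


original = b"GlusterFS " * 10000
assert inflate_chunks(deflate_chunks(original)) == original
```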

Gzip trailer

The trailer is 8 bytes long: the first 4 bytes are the checksum of the original data and the next 4 bytes are its length. It is primarily used to validate the inflated data on the client side. gzip uses the trailer for the same purpose.
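Putting the three parts together, the following Python sketch assembles header, compressed data, and trailer, then checks the result with the standard gzip module. It assumes the gzip conventions of RFC 1952 (CRC-32 checksum, length modulo 2**32, both little-endian) and a raw deflate body. The helper names build_trailer, validate, and raw_deflate are made up for this example.

```python
import gzip
import struct
import zlib

HEADER = bytes([0o037, 0o213, zlib.DEFLATED, 0, 0, 0, 0, 0, 0, 0x03])


def build_trailer(original: bytes) -> bytes:
    """Pack CRC-32 of the original data, then its length mod 2**32 (little-endian)."""
    return struct.pack("<II", zlib.crc32(original) & 0xFFFFFFFF,
                       len(original) & 0xFFFFFFFF)


def validate(inflated: bytes, trailer: bytes) -> bool:
    """Client-side check: recompute checksum and length and compare with the trailer."""
    crc, size = struct.unpack("<II", trailer)
    return (crc == (zlib.crc32(inflated) & 0xFFFFFFFF)
            and size == (len(inflated) & 0xFFFFFFFF))


def raw_deflate(data: bytes) -> bytes:
    """Deflate without any wrapper (wbits=-15), as assumed for the middle part."""
    stream = zlib.compressobj(-1, zlib.DEFLATED, -15)
    return stream.compress(data) + stream.flush()


original = b"hello, gluster\n" * 1000
blob = HEADER + raw_deflate(original) + build_trailer(original)

# The assembled bytes form a valid gzip member that standard tools can read.
assert gzip.decompress(blob) == original
assert validate(original, blob[-8:])
```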

Volfile Configuration

Brick volfile

Load the translator above the server protocol in the volfile, with the operation mode set to compress. Other translators have been stripped out for brevity. (Translator options are described later in this document.)

volume colon-d-posix
    type storage/posix
    option directory /d0
    option volume-id 928515dd-fc50-4612-a87a-7440cb87c258
end-volume

volume colon-d-cdc
    type features/cdc
    option mode compress
    option buffer-size 16384
    subvolumes colon-d-posix
end-volume

volume /d0
    type debug/io-stats
    option latency-measurement off
    option count-fop-hits off
    subvolumes colon-d-cdc
end-volume

volume colon-d-server
    type protocol/server
    option transport-type tcp
    option auth.addr./d0.allow *
    subvolumes /d0
end-volume

Client volfile

Load the translator below dht in the volfile, with the operation mode set to decompress. Note that loading it below the client protocol makes more sense, as the dht readv callback should see the actual data. Try it out if you want that.

volume colon-d-client-0
    type protocol/client
    option remote-host 192.168.1.75
    option remote-subvolume /d0
    option transport-type tcp
end-volume

volume colon-d-client-1
    type protocol/client
    option remote-host 192.168.1.75
    option remote-subvolume /d1
    option transport-type tcp
end-volume

volume colon-d-dht
    type cluster/distribute
    subvolumes colon-d-client-0 colon-d-client-1
end-volume

volume colon-d-cdc
    type features/cdc
    option mode decompress
    option buffer-size 16384
    subvolumes colon-d-dht
end-volume

volume colon-d
    type debug/io-stats
    option latency-measurement on
    option count-fop-hits on
    subvolumes colon-d-cdc
end-volume

Xlator options

Some of the translator options correspond directly to Zlib options. The most relevant options are:

buffer-size: Internal buffer size used by Zlib (16K works best).

cdc-level: The compression level. Ranges from 0 (no compression) through 1 (best speed, weaker compression) up to 9 (best compression, slowest speed). Defaults to -1, which provides a good compromise between compression and speed.

mode: compress or decompress; depending on where the translator is loaded.
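The effect of the compression level can be observed with a short Python sketch using the same Zlib levels. The sample data and the deflate_with_level helper are made up for illustration, and actual sizes depend on the data being compressed.

```python
import zlib

data = b"the quick brown fox jumps over the lazy dog\n" * 2000


def deflate_with_level(payload: bytes, level: int) -> bytes:
    """Raw deflate at the given level; -1 selects Zlib's default level."""
    stream = zlib.compressobj(level, zlib.DEFLATED, -15)
    return stream.compress(payload) + stream.flush()


# Compare output sizes for no compression, best speed, best compression, default.
sizes = {level: len(deflate_with_level(data, level)) for level in (0, 1, 9, -1)}

for level, size in sizes.items():
    print(f"level {level:2d}: {len(data)} -> {size} bytes")
```

Level 0 stores the data uncompressed and adds a small framing overhead, so its output is slightly larger than the input.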

Debugging

Gzip data is painful to read directly, since the raw bytes make little sense to the eye. One way to check whether the data was deflated properly is to attach gdb to one of the brick processes and follow the steps outlined below:

gdb -p <pidof-glusterfsd>
(gdb) break cdc_readv_cbk
Breakpoint 1 at 0x7ff918b54832: file cdc.c, line 566.
(gdb) c
Continuing.

At this point, cat a file on the Gluster mount.

Breakpoint 1, cdc_readv_cbk (frame=0x7ff91b86e0d8, cookie=0x7ff91b86e184, this=0x67e5b0, op_ret=14, op_errno=2, vector=0x7fffd70231d0, count=1, buf=0x7fffd7023150, iobref=0x678d60, xdata=0x0) at cdc.c:566
(gdb) # 'n' a number of times till you reach
587             ret = cdc_compress (this, priv, &ci);
(gdb) n
588             if (ret)
(gdb) call cdc_dump_iovec_to_disk (this, &ci, "/tmp/file.gz")

At this point the compressed data is written to /tmp/file.gz. Running gzip -d /tmp/file.gz inflates the file and produces the actual data.
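As an alternative to gzip -d, the dumped file can be checked from Python. The sketch below assumes the dump is a single gzip member with the 8-byte trailer described earlier. The check_gzip_dump helper is hypothetical and not part of the translator.

```python
import gzip
import struct
import zlib


def check_gzip_dump(path):
    """Inflate a dumped file and compare the result against its own 8-byte trailer."""
    with open(path, "rb") as fh:
        blob = fh.read()
    data = gzip.decompress(blob)  # raises an exception if the data is not valid gzip
    crc, size = struct.unpack("<II", blob[-8:])
    ok = (crc == (zlib.crc32(data) & 0xFFFFFFFF)
          and size == (len(data) & 0xFFFFFFFF))
    return data, ok
```

Calling check_gzip_dump("/tmp/file.gz") returns the inflated bytes together with a flag telling whether the checksum and length in the trailer match.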
