@bbhavsar
Last active May 6, 2020 18:49
The _for128 and _for256 variants use blocks of 128 and 256 input integers, respectively, to calculate
the diff and min across each block, simulating the mechanism used in Kudu's encoding implementation.
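As a rough illustration of the block-wise idea, here is a minimal sketch. It assumes that "diff and min" means frame-of-reference encoding, where each block's minimum is subtracted from every value in that block. The codec-test.py source is not shown here, so the function names for_encode, for_decode and bits_needed are hypothetical and may not match what the script does.

```python
from typing import List, Tuple

Block = Tuple[int, List[int]]  # (block minimum, offsets from that minimum)


def for_encode(values: List[int], block_size: int = 128) -> List[Block]:
    """Split values into fixed-size blocks and store each as (min, offsets)."""
    blocks = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        base = min(block)  # the block's frame of reference
        blocks.append((base, [v - base for v in block]))
    return blocks


def for_decode(blocks: List[Block]) -> List[int]:
    """Rebuild the original values by adding each block's minimum back."""
    out = []
    for base, offsets in blocks:
        out.extend(base + d for d in offsets)
    return out


def bits_needed(blocks: List[Block]) -> List[int]:
    """Bit width of the largest offset in each block (what a bit-packer needs)."""
    return [max(offsets).bit_length() for _, offsets in blocks]


values = list(range(1000, 1300))       # a sequential input
blocks = for_encode(values, block_size=128)
assert for_decode(blocks) == values    # the round trip is lossless
print(bits_needed(blocks))             # per-block offset width in bits
```

In this sketch the offsets are small whenever the values within a block are close together, which is what a bit-packing codec can then exploit.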
$ python codec-test.py repeat_small_range.csv
+--------------------------+----------------------+------------------------+--------------+
| codec                    | comp_time(millisecs) | decomp_time(millisecs) | bits_per_int |
+--------------------------+----------------------+------------------------+--------------+
| bitshuffle               |               42.511 |                165.078 |       0.4868 |
| simdbinarypacking        |               33.127 |                 34.271 |       7.0821 |
| simdbinarypacking_for128 |             3237.606 |               1569.723 |       0.8262 |
| simdbinarypacking_for256 |             1721.381 |                815.763 |       0.7167 |
+--------------------------+----------------------+------------------------+--------------+
bankim@bankim-desktop:~/scripts$ python codec-test.py seq.csv
+--------------------------+----------------------+------------------------+--------------+
| codec                    | comp_time(millisecs) | decomp_time(millisecs) | bits_per_int |
+--------------------------+----------------------+------------------------+--------------+
| bitshuffle               |               44.718 |                165.852 |       0.8019 |
| simdbinarypacking        |               45.394 |                 42.402 |      22.4628 |
| simdbinarypacking_for128 |             3182.910 |               1443.879 |       7.3125 |
| simdbinarypacking_for256 |             1678.570 |                749.743 |       7.6876 |
+--------------------------+----------------------+------------------------+--------------+
bankim@bankim-desktop:~/scripts$ python codec-test.py seq_small_range.csv
+--------------------------+----------------------+------------------------+--------------+
| codec                    | comp_time(millisecs) | decomp_time(millisecs) | bits_per_int |
+--------------------------+----------------------+------------------------+--------------+
| bitshuffle               |               42.254 |                165.552 |       0.4180 |
| simdbinarypacking        |               32.660 |                 34.476 |       8.0626 |
| simdbinarypacking_for128 |             3127.136 |               1381.600 |       7.8125 |
| simdbinarypacking_for256 |             1655.436 |                720.877 |       8.1876 |
+--------------------------+----------------------+------------------------+--------------+
bankim@bankim-desktop:~/scripts$ python codec-test.py random.csv
+--------------------------+----------------------+------------------------+--------------+
| codec                    | comp_time(millisecs) | decomp_time(millisecs) | bits_per_int |
+--------------------------+----------------------+------------------------+--------------+
| bitshuffle               |              150.301 |                168.620 |      24.2334 |
| simdbinarypacking        |               47.270 |                 43.378 |      24.0626 |
| simdbinarypacking_for128 |             3214.220 |               1440.086 |      24.3126 |
| simdbinarypacking_for256 |             1697.012 |                761.511 |      24.1876 |
+--------------------------+----------------------+------------------------+--------------+
bankim@bankim-desktop:~/scripts$ python codec-test.py random_small_range.csv
+--------------------------+----------------------+------------------------+--------------+
| codec                    | comp_time(millisecs) | decomp_time(millisecs) | bits_per_int |
+--------------------------+----------------------+------------------------+--------------+
| bitshuffle               |               79.284 |                162.275 |       8.3802 |
| simdbinarypacking        |               36.865 |                 35.018 |       8.4536 |
| simdbinarypacking_for128 |             3270.415 |               1522.634 |       8.4654 |
| simdbinarypacking_for256 |             1727.991 |                810.403 |       8.4349 |
+--------------------------+----------------------+------------------------+--------------+