@dereksz
Last active July 24, 2021 00:25
Compression Tool Test

Base

436M Jul 23 22:59 julia5.1e10.varint

plzip

446.48s user 0.87s system 678% cpu 1:05.90 total
257M

pigz

37.69s user 0.42s system 512% cpu 7.444 total
277M

pbzip2

77.14s user 1.61s system 623% cpu 12.630 total
257M

lrzip

519.72s user 3.70s system 537% cpu 1:37.31 total
255M

pixz

464.59s user 1.10s system 683% cpu 1:08.15 total
257M

gzip

26.21s user 0.21s system 91% cpu 28.726 total
276M

bzip2

39.26s user 0.22s system 99% cpu 39.684 total
257M

xz

328.05s user 0.42s system 99% cpu 5:28.64 total
255M

lz4

0.40s user 0.38s system 16% cpu 4.657 total
435M

lzop

0.23s user 0.32s system 13% cpu 4.044 total
435M

Size Summary

-rw-r--r-- 1 derek derek 436M Jul 23 22:59  julia5.1e10.varint
-rw-r--r-- 1 derek derek 435M Jul 24 09:58  julia5.1e10.varint.lz4
-rw-r--r-- 1 derek derek 435M Jul 24 09:58  julia5.1e10.varint.lzop
-rw-r--r-- 1 derek derek 277M Jul 24 09:49  julia5.1e10.varint.pigz     <<< 7.4s !!!
-rw-r--r-- 1 derek derek 276M Jul 24 09:52  julia5.1e10.varint.gzip     <<< 28.7s
-rw-r--r-- 1 derek derek 257M Jul 24 09:49  julia5.1e10.varint.pbzip2   <<< 12.6s
-rw-r--r-- 1 derek derek 257M Jul 24 09:53  julia5.1e10.varint.bzip2    <<< First sub-1-minute (40s)
-rw-r--r-- 1 derek derek 257M Jul 24 09:52  julia5.1e10.varint.pixz
-rw-r--r-- 1 derek derek 257M Jul 24 09:48  julia5.1e10.varint.plzip
-rw-r--r-- 1 derek derek 255M Jul 24 09:58  julia5.1e10.varint.xz
-rw-r--r-- 1 derek derek 255M Jul 24 09:50  julia5.1e10.varint.lrzip
#!/usr/bin/zsh
# Test the performance of various compression tools.
#
# `julia5.1e10.varint` is a delta-compressed binary
# list of primes using le-7-varint encoding,
# which is dominated by bytes with low values.
BASE=${1:-julia5.1e10.varint}
alias ls='ls -lh'
COMPS=(plzip pigz pbzip2 lrzip pixz gzip bzip2 xz lz4 lzop)
for COMP in ${COMPS[@]}
do
# Install each tool that is not already on the PATH.
which $COMP > /dev/null || sudo apt-get install $COMP
done
echo '# Base'
echo
echo '```'
ls $BASE
echo '```'
echo
# Emit one Markdown section per tool with its timing and output size.
for COMP in ${COMPS[@]}
do
echo "# $COMP"
echo
echo '```'
time $COMP < $BASE > $BASE.$COMP
ls $BASE.$COMP
echo '```'
echo
done
# Second pass: rerun every tool, printing plain (non-Markdown) output.
for COMP in ${COMPS[@]}
do
echo $COMP
time $COMP < $BASE > $BASE.$COMP
ls $BASE.$COMP
done
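
The script's header comment says the input is dominated by low-valued bytes. The sketch below shows why that could be the case. It assumes "le-7-varint" means an unsigned LEB128-style encoding (7 payload bits per byte, low group first, high bit set on every byte except the last) applied to the gaps between consecutive primes. The gist does not spell out the exact format, so the layout and the `varint_hex` helper are illustrative assumptions only.

```shell
#!/bin/sh
# Print an unsigned integer as LEB128-style varint bytes in hex.
# ASSUMPTION: this is what "le-7-varint" refers to; the real file may differ.
varint_hex() {
    n=$1
    out=""
    while [ "$n" -ge 128 ]; do
        # Low 7 bits with the continuation bit (0x80) set.
        out="$out$(printf '%02x ' $(( (n & 127) | 128 )))"
        n=$(( n >> 7 ))
    done
    printf '%s%02x\n' "$out" "$n"
}

# Delta-encode the first few primes: each gap fits in a single small byte.
prev=0
for p in 2 3 5 7 11 13; do
    varint_hex $(( p - prev ))
    prev=$p
done
```

Under this assumption, any gap below 128 takes a single byte with the high bit clear, and prime gaps near 5e10 average only about 25.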