@jolynch
Last active March 19, 2021 14:50
Hashing and Compression Benchmarks
#!/bin/bash
# Pin every core's min and max frequency to 4 GHz and select the
# "performance" governor.
echo "Setting performance governor @ 4GHz"
cd /sys/devices/system/cpu || exit 1
FREQ=4000000  # cpufreq sysfs values are in kHz
for c in ./cpu[0-9]* ; do
    echo "$FREQ" | sudo tee "${c}/cpufreq/scaling_max_freq"
    echo "$FREQ" | sudo tee "${c}/cpufreq/scaling_min_freq"
    echo "performance" | sudo tee "${c}/cpufreq/scaling_governor"
done
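Before running the benchmarks it may be worth reading the settings back, since a driver can silently clamp or reject a requested frequency. The following is a minimal sketch, not part of the original script: `show_cpufreq` is a hypothetical helper name, and its optional directory argument exists only so the function can be pointed at a different sysfs root.

```shell
# Hypothetical helper: print governor and min/max frequency for each core.
# Assumes the same sysfs layout the script above writes to.
show_cpufreq() {
  local root="${1:-/sys/devices/system/cpu}"
  local c
  for c in "$root"/cpu[0-9]*; do
    # Skip entries without cpufreq support (or an unmatched glob).
    [ -r "$c/cpufreq/scaling_governor" ] || continue
    printf '%s governor=%s min=%s max=%s\n' \
      "$(basename "$c")" \
      "$(cat "$c/cpufreq/scaling_governor")" \
      "$(cat "$c/cpufreq/scaling_min_freq")" \
      "$(cat "$c/cpufreq/scaling_max_freq")"
  done
}

show_cpufreq
```

If the writes took effect, every line should report `governor=performance` with both `min` and `max` at 4000000 (kHz).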
# First the slow ones ... I need some coffee
$ time gzip yelp_academic_dataset_review.json
gzip yelp_academic_dataset_review.json 295.90s user 2.56s system 99% cpu 4:58.64 total
$ du -shc yelp_academic_dataset_review.json.gz
2.7G yelp_academic_dataset_review.json.gz
2.7G total
$ time gunzip yelp_academic_dataset_review.json.gz
gunzip yelp_academic_dataset_review.json.gz 38.42s user 4.29s system 92% cpu 46.354 total
$ time xz yelp_academic_dataset_review.json
yelp_academic_dataset_review.json (1/1)
4.9 % 89.0 MiB / 323.7 MiB = 0.275 1.3 MiB/s 4:08 1 h 30 min
5.3 % 96.6 MiB / 351.4 MiB = 0.275 1.3 MiB/s 4:29 1 h 20 min
5.9 % 107.5 MiB / 390.8 MiB = 0.275 1.3 MiB/s 4:59 1 h 20 min
# I got bored and had to kill it ... let's do a smaller file
$ time xz yelp_academic_dataset_checkin.json
yelp_academic_dataset_checkin.json (1/1)
22.5 % 15.8 MiB / 85.6 MiB = 0.185 1.3 MiB/s 1:05 3 min 50 s
90.2 % 64.3 MiB / 342.9 MiB = 0.187 1.3 MiB/s 4:31 30 s
xz yelp_academic_dataset_checkin.json 299.30s user 0.41s system 99% cpu 4:59.79 total
$ time xz -d yelp_academic_dataset_checkin.json.xz --stdout > /dev/null
xz -d yelp_academic_dataset_checkin.json.xz --stdout > /dev/null 4.02s user 0.02s system 99% cpu 4.049 total
# Now the fast ones!
$ time lz4 -q yelp_academic_dataset_review.json
lz4 yelp_academic_dataset_review.json 17.87s user 3.73s system 89% cpu 24.269 total
$ du -shc yelp_academic_dataset_review.json.lz4
4.2G yelp_academic_dataset_review.json.lz4
4.2G total
$ time lz4 -d yelp_academic_dataset_review.json.lz4 --stdout > /dev/null
lz4 -d yelp_academic_dataset_review.json.lz4 --stdout > /dev/null 3.67s user 1.51s system 71% cpu 7.289 total
$ time zstd yelp_academic_dataset_review.json
zstd yelp_academic_dataset_review.json 47.99s user 3.57s system 108% cpu 47.534 total
$ du -shc yelp_academic_dataset_review.json.zst
2.5G yelp_academic_dataset_review.json.zst
2.5G total
$ time zstd -d yelp_academic_dataset_review.json.zst --stdout > /dev/null
zstd -d yelp_academic_dataset_review.json.zst --stdout > /dev/null 8.84s user 0.30s system 99% cpu 9.140 total
$ time zstd -10 yelp_academic_dataset_checkin.json
yelp_academic_dataset_checkin.json : 26.69% (398272056 => 106301789 bytes, yelp_academic_dataset_checkin.json.zst)
zstd -10 yelp_academic_dataset_checkin.json 24.13s user 0.14s system 100% cpu 24.090 total
$ time zstd -d yelp_academic_dataset_checkin.json.zst
yelp_academic_dataset_checkin.json.zst: 398272056 bytes
zstd -d yelp_academic_dataset_checkin.json.zst 0.52s user 0.15s system 54% cpu 1.231 total
$ time zstd -19 yelp_academic_dataset_checkin.json
yelp_academic_dataset_checkin.json : 20.75% (398272056 => 82627859 bytes, yelp_academic_dataset_checkin.json.zst)
zstd -19 yelp_academic_dataset_checkin.json 319.36s user 0.23s system 100% cpu 5:19.41 total
$ time zstd -d yelp_academic_dataset_checkin.json.zst
yelp_academic_dataset_checkin.json.zst: 398272056 bytes
zstd -d yelp_academic_dataset_checkin.json.zst 0.94s user 0.20s system 53% cpu 2.152 total
# oof so slow.
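The byte counts zstd prints for the checkin file make it possible to turn the timings above into throughput and ratio figures. This is a rough sketch under stated assumptions: `mib_per_s` and `ratio_pct` are hypothetical helper names, and the seconds passed in are the wall-clock totals from the transcript, so they include I/O time as well as CPU time.

```shell
# Hypothetical helper: BYTES SECONDS -> throughput in MiB/s (one decimal).
mib_per_s() {
  awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / 1048576 / s }'
}

# Hypothetical helper: IN_BYTES OUT_BYTES -> compressed size as % of input.
ratio_pct() {
  awk -v i="$1" -v o="$2" 'BEGIN { printf "%.2f\n", 100 * o / i }'
}

mib_per_s 398272056 24.090      # zstd -10 compression
mib_per_s 398272056 319.41      # zstd -19 compression (5:19.41)
ratio_pct 398272056 106301789   # zstd -10
ratio_pct 398272056 82627859    # zstd -19
```

The two `ratio_pct` calls reproduce the 26.69% and 20.75% that zstd itself reported, which is a quick sanity check that the inputs were copied correctly.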
$ time sha256sum yelp_academic_dataset_review.json
a0da717437a033b688d89dda9a27a8d864f9583cf32ba5af239092896e04a6b4 yelp_academic_dataset_review.json
sha256sum yelp_academic_dataset_review.json 27.50s user 0.83s system 99% cpu 28.353 total
$ time md5sum yelp_academic_dataset_review.json
9b1f8fe95d1c589539fa0d2f2fe6b1e9 yelp_academic_dataset_review.json
md5sum yelp_academic_dataset_review.json 9.10s user 0.84s system 99% cpu 9.960 total
$ time sha1sum yelp_academic_dataset_review.json
16527d9a865f906b2b980e3fb76d04433d6db2fa yelp_academic_dataset_review.json
sha1sum yelp_academic_dataset_review.json 10.65s user 0.79s system 99% cpu 11.456 total
$ time openssl dgst -sha3-256 yelp_academic_dataset_review.json
SHA3-256(yelp_academic_dataset_review.json)= c2834f8b7d687e456ff39d9c399decddbbca8b1783334fb22e2a624ff756dfc8
openssl dgst -sha3-256 yelp_academic_dataset_review.json 16.81s user 0.83s system 99% cpu 17.644 total
$ time crc32 yelp_academic_dataset_review.json
e3472813
crc32 yelp_academic_dataset_review.json 4.80s user 0.89s system 99% cpu 5.721 total
# I have a need ... a need ... for speed :-D
$ time xxh64sum yelp_academic_dataset_review.json
72f18f574b9d3959 yelp_academic_dataset_review.json
xxh64sum yelp_academic_dataset_review.json 0.47s user 0.62s system 99% cpu 1.084 total
$ time xxh128sum yelp_academic_dataset_review.json
cb92d56c8eacfc50ead92ccd04736f85 yelp_academic_dataset_review.json
xxh128sum yelp_academic_dataset_review.json 0.39s user 0.63s system 99% cpu 1.022 total
# This is even faster if you let it use more than one thread
$ time b3sum --num-threads 1 yelp_academic_dataset_review.json
11f9fbe7dbe17591b5c02adab8e00277b7f36a60a812f372b44cae352e992691 yelp_academic_dataset_review.json
b3sum --num-threads 1 yelp_academic_dataset_review.json 1.81s user 0.37s system 99% cpu 2.178 total