Congratulations on the Badger 2.0 release!
Being 'interested' in data compression, I have a little feedback:
> It uses the `zstd` compression algorithm when Badger is built with `Cgo` enabled. When built without `Cgo` enabled, it uses the `snappy` algorithm.
This seems like an odd choice. IMO compatibility shouldn't be determined by compilation settings. I would humbly like to point out that there is a pure Go zstd implementation. While its performance is only close to the cgo version, if you go for the fastest setting there shouldn't be much of a difference.
I don't know the reason for choosing Snappy, but LZ4 typically outperforms Snappy.
If you are willing to go with a "non-standard" scheme, I have written S2, which is a Snappy extension that compresses better than Snappy and typically decompresses faster. You can see direct comparisons in the Block compression section. S2 can decompress Snappy blocks, but not the other way around.
I assume you have verification of blocks, since Snappy blocks have no integrity check and DataDog zstd doesn't read or write CRC info.
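To illustrate the kind of block verification I mean, here is a minimal sketch using only the Go standard library. It appends a CRC-32C of the compressed bytes to each block and checks it before decompression. The names `sealBlock` and `openBlock` and the 4-byte little-endian trailer are my own assumptions for illustration, not how Badger actually lays out its blocks.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

// CRC-32C (Castagnoli) table; this polynomial is an assumption for the sketch.
var castagnoli = crc32.MakeTable(crc32.Castagnoli)

var errChecksum = errors.New("block checksum mismatch")

// sealBlock (hypothetical helper) returns the compressed block followed by
// a 4-byte little-endian CRC-32C computed over the compressed bytes.
func sealBlock(compressed []byte) []byte {
	out := make([]byte, len(compressed)+4)
	copy(out, compressed)
	sum := crc32.Checksum(compressed, castagnoli)
	binary.LittleEndian.PutUint32(out[len(compressed):], sum)
	return out
}

// openBlock (hypothetical helper) verifies the trailing checksum and returns
// the compressed payload, which is then safe to hand to the decompressor.
func openBlock(sealed []byte) ([]byte, error) {
	if len(sealed) < 4 {
		return nil, errChecksum
	}
	n := len(sealed) - 4
	want := binary.LittleEndian.Uint32(sealed[n:])
	if crc32.Checksum(sealed[:n], castagnoli) != want {
		return nil, errChecksum
	}
	return sealed[:n], nil
}

func main() {
	sealed := sealBlock([]byte("compressed bytes"))

	payload, err := openBlock(sealed)
	fmt.Println(string(payload), err)

	// Flip one byte to simulate on-disk corruption.
	sealed[0] ^= 0xFF
	_, err = openBlock(sealed)
	fmt.Println(err)
}
```

Checksumming the compressed bytes means corruption is caught before the decoder ever sees the data. Checksumming the uncompressed bytes instead would also catch decoder bugs, at the cost of decompressing first.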
If you have some representative 'blocks' I can do a comparison between the different schemes.