@shyamsalimkumar
Last active March 14, 2023 21:59
Cassandra SSTable Format Version Numbers

Original Source

Finding all sstables not matching version “ib”

find /var/lib/cassandra/data/ -type f | grep -v -- -ib- | grep -v "/snapshots"
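To see which versions are actually present rather than only filtering one out, the version can be pulled from each file name. The sketch below assumes the pre-2.2 naming layout, where a file name ends in `-<version>-<generation>-<component>` (for example `ks-cf-ib-12-Data.db`). The function names `sstable_version` and `count_sstable_versions` are made up for illustration.

```shell
# Print the format version embedded in an SSTable file name.
# Assumption: the name ends in "-<version>-<generation>-<component>",
# so the version is the third dash-separated field from the end.
sstable_version() {
    basename "$1" | awk -F- 'NF >= 3 { print $(NF-2) }'
}

# Count Data.db files per format version under a data directory,
# skipping anything inside a snapshots directory.
count_sstable_versions() {
    find "$1" -type f -name '*-Data.db' ! -path '*/snapshots/*' |
        while IFS= read -r f; do sstable_version "$f"; done |
        sort | uniq -c
}
```

Calling `count_sstable_versions /var/lib/cassandra/data/` prints one line per version with the number of data files found for it.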

The version numbers to date are:

Version 0

  • b (0.7.0): added version to sstable filenames
  • c (0.7.0): bloom filter component computes hashes over raw key bytes instead of strings
  • d (0.7.0): row size in data component becomes a long instead of int
  • e (0.7.0): stores undecorated keys in data and index components
  • f (0.7.0): switched bloom filter implementations in data component
  • g (0.8): tracks flushed-at context in metadata component

Version 1

  • h (1.0): tracks max client timestamp in metadata component
  • hb (1.0.3): records compression ratio in metadata component
  • hc (1.0.4): records partitioner in metadata component
  • hd (1.0.10): includes row tombstones in maxtimestamp
  • he (1.1.3): includes ancestors generation in metadata component
  • hf (1.1.6): marker that replay position corresponds to 1.1.5+ millis-based id (see CASSANDRA-4782)
  • ia (1.2.0):
    • column indexes are promoted to the index file
    • records estimated histogram of deletion times in tombstones
    • bloom filter (keys and columns) upgraded to Murmur3
  • ib (1.2.1): tracks min client timestamp in metadata component
  • ic (1.2.5): omits per-row bloom filter of column names

Version 2

  • ja (2.0.0):
    • super columns are serialized as composites (note that there is no real format change, this is mostly a marker to know if we should expect super columns or not. We do need a major version bump however, because we should not allow streaming of super columns into this new format)
    • tracks max local deletiontime in sstable metadata
    • records bloom_filter_fp_chance in metadata component
    • remove data size and column count from data file (CASSANDRA-4180)
    • tracks max/min column values (according to comparator)
  • jb (2.0.1):
    • switch from crc32 to adler32 for compression checksums
    • checksum the compressed data
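
The grouping above can be looked up from a version string alone, which helps when deciding which files predate a major format change. This is a minimal sketch that covers only the versions listed here (b through jb); the function name `sstable_format_generation` is hypothetical.

```shell
# Map an SSTable version string to the group it is listed under above.
# Only the letters that appear in this list are covered.
sstable_format_generation() {
    case "$1" in
        b|c|d|e|f|g) echo "0" ;;        # single-letter versions, 0.7.0 to 0.8
        h|h?|i?)     echo "1" ;;        # h, hb..hf, ia..ic
        j?)          echo "2" ;;        # ja, jb
        *)           echo "unknown" ;;  # anything newer than this list
    esac
}
```

For example, `sstable_format_generation ib` prints `1`, while a later version such as `md` prints `unknown` because it falls outside this list.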

hkroger commented Feb 13, 2019

// md (3.0.18, 3.11.4): corrected sstable min/max clustering


ahmedjami commented Feb 25, 2021

The Cassandra Java class that lists some of the newer version numbers: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java#L118

Switch between Branches/Tags to view all versions.
