@jeffjirsa
Created June 11, 2016 22:01
## TimeWindowCompactionStrategy
TimeWindowCompactionStrategy is designed specifically for workloads where it's beneficial to have data on disk grouped by the timestamp of the data, a common goal when the workload is time-series in nature or when all data is written with a TTL. In an expiring/TTL workload, the contents of an entire SSTable likely expire at approximately the same time, so the whole SSTable can be dropped and its space reclaimed much more reliably than when using SizeTieredCompactionStrategy or LeveledCompactionStrategy. The basic concept is that TimeWindowCompactionStrategy will create one SSTable per window, where the size of a window is determined by the combination of two primary options:
* `compaction_window_unit`: A Java TimeUnit (MINUTES, HOURS, or DAYS). The default value is DAYS.
* `compaction_window_size`: The number of units that make up a window. The default value is 1.
Taken together, these options let the operator specify windows of virtually any size, and TimeWindowCompactionStrategy will work to create a single SSTable for the writes within each window. For efficiency during writing, the newest window will be compacted using SizeTieredCompactionStrategy.
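To make the idea of a window concrete, the following is a minimal sketch of how a write timestamp could be mapped to a window from the two options above. It assumes that windows are aligned to the Unix epoch and that timestamps are in seconds; the function name `window_bounds` and the `UNIT_SECONDS` table are illustrative and not part of Cassandra.

```python
# Seconds per supported compaction_window_unit value.
UNIT_SECONDS = {"MINUTES": 60, "HOURS": 3600, "DAYS": 86400}


def window_bounds(timestamp_s, unit="DAYS", size=1):
    """Return (lower, upper) bounds, in seconds, of the window holding timestamp_s.

    Assumption: windows are aligned to the Unix epoch, so the lower bound is
    the timestamp rounded down to a multiple of the window width.
    """
    width = UNIT_SECONDS[unit] * size
    lower = timestamp_s - (timestamp_s % width)
    return lower, lower + width - 1
```

Under these assumptions, any two writes whose timestamps produce the same bounds belong to the same window and would eventually end up in the same SSTable.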
Ideally, operators should select a `compaction_window_unit` and `compaction_window_size` pair that produces approximately 20-30 windows. If writing with a 90-day TTL, for example, a 3-day window (90 / 3 = 30 windows) would be a reasonable choice (`'compaction_window_unit':'DAYS','compaction_window_size':3`).
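The arithmetic behind that recommendation can be sketched as a small helper that counts the windows covering a TTL and renders the matching compaction options. This is only an illustration: the helper names are made up, and `my_keyspace.my_table` is a hypothetical table name.

```python
import math

# Seconds per supported compaction_window_unit value.
UNIT_SECONDS = {"MINUTES": 60, "HOURS": 3600, "DAYS": 86400}


def window_count(ttl_days, unit, size):
    """Approximate number of windows needed to cover a TTL given in days."""
    return math.ceil(ttl_days * 86400 / (UNIT_SECONDS[unit] * size))


def alter_table_cql(table, unit, size):
    """Render an ALTER TABLE statement that sets the window options."""
    options = (
        "{'class': 'TimeWindowCompactionStrategy', "
        f"'compaction_window_unit': '{unit}', "
        f"'compaction_window_size': {size}}}"
    )
    return f"ALTER TABLE {table} WITH compaction = {options};"


# A 90-day TTL with 3-day windows lands at the top of the 20-30 range.
count = window_count(90, "DAYS", 3)
# The table name below is hypothetical.
statement = alter_table_cql("my_keyspace.my_table", "DAYS", 3)
```

If the count falls well outside the 20-30 range, adjusting `size` (or switching `unit`) and recomputing is a quick way to compare candidates before changing the table.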
### Changing TimeWindowCompactionStrategy Options
Operators wishing to enable TimeWindowCompactionStrategy on existing data should consider running a major compaction first, placing all existing data into a single (old) window. Subsequent newer writes will then create typical SSTables as expected.
Operators wishing to change `compaction_window_unit` or `compaction_window_size` can do so, but the change may trigger additional compactions as adjacent windows are joined together. If the window size is decreased (for example, from 24 hours to 12 hours), then the existing SSTables will not be modified; TimeWindowCompactionStrategy cannot split existing SSTables into multiple windows.
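The effect of resizing windows can be illustrated with a small simulation. This sketch assumes that each SSTable is assigned to a window by its newest timestamp and that windows are aligned to the Unix epoch; the SSTable names and timestamps are invented for the example.

```python
from collections import defaultdict

HOUR = 3600


def group_by_window(sstable_max_ts, window_s):
    """Group SSTable names by the lower bound of the window they fall into."""
    groups = defaultdict(list)
    for name, max_ts in sstable_max_ts.items():
        groups[max_ts - max_ts % window_s].append(name)
    return dict(groups)


# Hypothetical SSTables, one per 24-hour window, keyed by newest timestamp.
sstables = {"a": 10 * HOUR, "b": 30 * HOUR, "c": 60 * HOUR, "d": 80 * HOUR}

daily = group_by_window(sstables, 24 * HOUR)     # current layout
two_day = group_by_window(sstables, 48 * HOUR)   # window size increased
half_day = group_by_window(sstables, 12 * HOUR)  # window size decreased
```

With the larger window, `a` and `b` share a window, as do `c` and `d`, so each pair becomes a candidate for compaction into one SSTable. With the smaller window, every SSTable is still alone in its window, so nothing is rewritten and each file keeps data that now spans two 12-hour windows.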