@sachin-j-joshi
Last active March 30, 2021 20:00
SLTS Garbage Collection Design

Background

Key Features

  • SLTS operations do not immediately delete chunks that are no longer needed.
  • Instead, chunks to be deleted are handled as follows:
    • The chunk is first marked for deletion in metadata, and that change is committed as part of the SLTS operation (e.g. during truncate, concat, or write).
    • The name of the chunk is put into the GC queue.
    • Each container has a dedicated background thread that polls this GC queue and deletes the chunks and the associated metadata.
    • The actual deletion task runs on a storage thread.
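
The ordering described above (mark in metadata, enqueue, then delete in the background) can be sketched as follows. This is only an illustration under stated assumptions: `TwoPhaseDeleteSketch` and all of its members are hypothetical in-memory stand-ins for the metadata store, LTS, and the GC queue, not the actual SLTS classes.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical in-memory stand-ins for the metadata store, LTS, and the GC queue.
final class TwoPhaseDeleteSketch {
    final Map<String, Boolean> metadataActive = new HashMap<>(); // chunk name -> active flag
    final Set<String> chunksOnLts = new HashSet<>();
    final Queue<String> gcQueue = new ArrayDeque<>();

    void addChunk(String name) {
        chunksOnLts.add(name);
        metadataActive.put(name, true);
    }

    // Performed by the SLTS operation (e.g. truncate): mark first, then enqueue.
    void markForDeletion(String name) {
        metadataActive.put(name, false); // stands in for the committed metadata change
        gcQueue.add(name);               // the chunk data itself is left untouched here
    }

    // Performed later by the background collector. Returns the number of chunks deleted.
    int collect() {
        int deleted = 0;
        String name;
        while ((name = gcQueue.poll()) != null) {
            // Skip names that are unknown or still marked active.
            if (!Boolean.FALSE.equals(metadataActive.get(name))) {
                continue;
            }
            chunksOnLts.remove(name);    // delete the chunk data
            metadataActive.remove(name); // then remove the associated metadata
            deleted++;
        }
        return deleted;
    }
}
```

In this sketch the foreground operation never touches chunk data, so its cost does not depend on how long the deletion on LTS takes.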

Garbage Collection Background Thread

In-memory GC queue

  • Each container has a dedicated background garbage collector instance.
  • Each garbage collector instance has a delay queue that holds the names of garbage chunks.
  • When a chunk becomes eligible, it is deleted and its metadata is removed from the metadata store.
  • Each garbage collector instance also has an overflow list that is periodically flushed to the persisted GC queue.
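
A delay queue of chunk names could be built on `java.util.concurrent.DelayQueue`, as in the minimal sketch below. The `GarbageChunk` entry type and the `drainEligible` helper are hypothetical names introduced for illustration; whether the actual collector uses this exact JDK class is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Hypothetical queue entry: a chunk name plus the time at which it becomes eligible.
final class GarbageChunk implements Delayed {
    final String chunkName;
    final long eligibleAtMillis;

    GarbageChunk(String chunkName, long eligibleAtMillis) {
        this.chunkName = chunkName;
        this.eligibleAtMillis = eligibleAtMillis;
    }

    @Override
    public long getDelay(TimeUnit unit) {
        // Zero or negative means the entry is eligible now.
        return unit.convert(eligibleAtMillis - System.currentTimeMillis(), TimeUnit.MILLISECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.MILLISECONDS), other.getDelay(TimeUnit.MILLISECONDS));
    }
}

final class InMemoryGcQueueSketch {
    // Removes and returns the names of all chunks whose delay has expired.
    static List<String> drainEligible(DelayQueue<GarbageChunk> queue) {
        List<String> names = new ArrayList<>();
        GarbageChunk chunk;
        // poll() returns null when no entry has an expired delay.
        while ((chunk = queue.poll()) != null) {
            names.add(chunk.chunkName);
        }
        return names;
    }
}
```

Entries whose delay has not yet expired stay in the queue, so a chunk is never handed to the deletion step before its delay has passed.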

Persisted GC queue

  • The persisted GC queue is formed by a linked list of metadata records (just like normal segments).
    • It is built by serializing the list from the overflow buffer to a new chunk on LTS.
    • Note that metadata about the persisted GC queue chunks is stored in the table segment, just like metadata for normal chunks.
  • When the in-memory queue is empty, the garbage collector repopulates it from the persisted GC queue.
  • As the persisted GC queue is processed, the already processed chunks (those containing GC queue data) become eligible for deletion and are added to the tail of the GC queue just like any normal chunk.
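
To make the serialization step concrete, here is a small sketch of turning an overflow list into the bytes of one queue chunk and reading it back. The wire format (newline-separated UTF-8 names) and the `GcQueueChunkCodec` class are assumptions made for this example; the design does not specify the actual format.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical codec; assumes chunk names never contain a newline character.
final class GcQueueChunkCodec {
    // Encodes the overflow list as the payload of one persisted GC queue chunk.
    static byte[] serialize(List<String> chunkNames) {
        return String.join("\n", chunkNames).getBytes(StandardCharsets.UTF_8);
    }

    // Decodes a persisted GC queue chunk back into the list of garbage chunk names.
    static List<String> deserialize(byte[] data) {
        List<String> names = new ArrayList<>();
        String text = new String(data, StandardCharsets.UTF_8);
        for (String line : text.split("\n")) {
            if (!line.isEmpty()) {
                names.add(line);
            }
        }
        return names;
    }
}
```

Once a queue chunk has been deserialized and its entries loaded into the in-memory queue, the queue chunk's own name would be enqueued as garbage, as described in the last bullet above.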

Throttling

  • The garbage collector uses no more than a fixed percentage of storage threads at any time.
  • When the in-memory queue reaches its maximum size:
    • No more items are added to the in-memory queue.
    • New items are added to an overflow buffer, which is then periodically drained into a persisted chunk.
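
Both throttling rules can be illustrated with a bounded queue plus a semaphore. This is a sketch under assumptions: `ThrottledGcQueue`, its constructor parameters, and the choice of a `Semaphore` to cap concurrent deletions are all hypothetical, not taken from the SLTS code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.Semaphore;

// Hypothetical sketch of queue-size and concurrency throttling.
final class ThrottledGcQueue {
    private final int maxQueueSize;
    private final Queue<String> inMemory = new ArrayDeque<>();
    private final List<String> overflow = new ArrayList<>();
    private final Semaphore deletePermits;

    ThrottledGcQueue(int maxQueueSize, int storageThreads, int maxPercent) {
        this.maxQueueSize = maxQueueSize;
        // At least one permit, so that collection can always make progress.
        this.deletePermits = new Semaphore(Math.max(1, storageThreads * maxPercent / 100));
    }

    // Adds to the in-memory queue, or to the overflow buffer once the queue is full.
    synchronized void add(String chunkName) {
        if (inMemory.size() < maxQueueSize) {
            inMemory.add(chunkName);
        } else {
            overflow.add(chunkName);
        }
    }

    // Returns the buffered overflow entries and clears the buffer,
    // ready to be written to a persisted chunk.
    synchronized List<String> drainOverflow() {
        List<String> batch = new ArrayList<>(overflow);
        overflow.clear();
        return batch;
    }

    synchronized int inMemorySize() {
        return inMemory.size();
    }

    // A deletion may start only if a permit is available.
    boolean tryStartDelete() {
        return deletePermits.tryAcquire();
    }

    void finishDelete() {
        deletePermits.release();
    }
}
```

For example, with 10 storage threads and a 20 percent cap, this sketch allows at most 2 deletions in flight; a third `tryStartDelete()` call returns `false` until one of them calls `finishDelete()`.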

Garbage Discovery Background Thread

  • This is a background thread that periodically (once a day or once every few hours) scans the entire storage metadata table segment by enumerating all entries and discovers all chunks that are marked for deletion but have not yet been deleted.
  • This thread also scans and enqueues system journal chunks.
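
The metadata scan amounts to filtering all entries by their deletion mark. The sketch below assumes a hypothetical `ChunkRecord` type with a single `active` flag and models the table segment as a plain `Map`; the real metadata records and enumeration API will differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class GarbageDiscoverySketch {
    // Hypothetical metadata record; only the field needed for this example.
    static final class ChunkRecord {
        final boolean active;

        ChunkRecord(boolean active) {
            this.active = active;
        }
    }

    // Enumerates every entry and returns the names of chunks marked for deletion.
    static List<String> findMarkedChunks(Map<String, ChunkRecord> metadataTable) {
        List<String> garbage = new ArrayList<>();
        for (Map.Entry<String, ChunkRecord> entry : metadataTable.entrySet()) {
            if (!entry.getValue().active) {
                garbage.add(entry.getKey());
            }
        }
        return garbage;
    }
}
```

The names returned by such a scan would then be enqueued for the garbage collector like any other garbage chunk.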

Garbage Admin Tool

  • This is an admin tool that uses the ChunkStorage::listChunks API to scan all chunks on LTS and discover any orphan chunks that are not in metadata.
  • The orphan chunks thus found are added to the persisted GC queue.
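
Orphan discovery is essentially a set difference between what exists on LTS and what metadata knows about. In the sketch below, the listing is passed in as a plain collection rather than obtained through `ChunkStorage::listChunks`, and `OrphanFinderSketch` is a hypothetical name.

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

final class OrphanFinderSketch {
    // Returns the chunk names present on LTS but absent from metadata, in sorted order.
    static Set<String> findOrphans(Collection<String> listedOnLts, Set<String> knownToMetadata) {
        Set<String> orphans = new TreeSet<>(listedOnLts);
        orphans.removeAll(knownToMetadata);
        return orphans;
    }
}
```

A real tool would likely need to guard against chunks that are created while the scan is running, since such chunks may not yet appear in metadata; how that is handled is not covered here.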

Deletion of System Journal Chunks

  • Once the SLTS instance boots up successfully, all system journal chunks created by previous epochs are added to the persisted GC queue.
  • When new snapshots are created, chunks containing older snapshots and truncated chunks from the journal are added to the persisted GC queue.

Garbage Collection Config Values

    /**
     * Minimum delay in seconds between when garbage chunks are marked for deletion and when they are actually deleted.
     */
    @Getter
    private final Duration garbageCollectionDelay;

    /**
     * Number of chunks deleted concurrently.
     * This number should be small enough that it does not interfere with foreground requests.
     */
    @Getter
    private final int garbageCollectionMaxConcurrency;

    /**
     * Max size of the garbage collection queue.
     */
    @Getter
    private final int garbageCollectionMaxQueueSize;

    /**
     * Duration for which the garbage collector sleeps if there is no work.
     */
    @Getter
    private final Duration garbageCollectionSleep;

    /**
     * Max number of attempts per chunk for garbage collection.
     */
    @Getter
    private final int garbageCollectionMaxAttempts;
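
As an illustration of how a per-chunk attempt limit such as `garbageCollectionMaxAttempts` might be enforced, the sketch below counts failed deletions per chunk. `AttemptTracker` is a hypothetical helper, and what happens to a chunk once its attempts are exhausted is not specified in this document.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper that tracks failed deletion attempts per chunk.
final class AttemptTracker {
    private final int maxAttempts;
    private final Map<String, Integer> attempts = new HashMap<>();

    AttemptTracker(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    // Records a failed attempt. Returns true if the chunk should be re-enqueued
    // for another try, false if the attempt limit has been reached.
    boolean recordFailure(String chunkName) {
        int count = attempts.merge(chunkName, 1, Integer::sum);
        if (count >= maxAttempts) {
            attempts.remove(chunkName); // stop tracking; the chunk is no longer retried
            return false;
        }
        return true;
    }
}
```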

Failure Modes

Failure before data is added to queue
