Compaction works transactionally with the following algorithm:
- We prepare a transaction, during which every object referenced by I/O through the API is tracked.
- We walk the chain and mark reachable objects, keeping 4 finalities of state roots and messages, plus all headers all the way back to genesis.
- Once the chain walk is complete, we write the marked objects to a new block store. (Depending on performance, this may be better done during the chain walk in the previous step.)
- The new block store, freshly loaded with the hot items, is atomically promoted to become the hot store.
- At this point there are two cold stores: the pre-existing cold store and the newly demoted one. They can be handled in any of the following ways:
  - The old cold store is deleted.
  - Old cold stores are kept around, and we maintain a list of them that we can query.
  - The newly demoted cold store is copied into the pre-existing cold store and then deleted from disk. While the copy is in progress, read operations query both datastores.
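
The last option requires reads to consult both cold stores while the copy is in progress. The following is a minimal sketch of such a fallback read, not the actual implementation: `Reader`, `MapStore`, `FallbackReader`, and `ErrNotFound` are hypothetical names introduced for illustration, and an in-memory map stands in for an on-disk datastore.

```go
package main

import "errors"

// ErrNotFound is a hypothetical sentinel error for a missing object.
var ErrNotFound = errors.New("object not found")

// Reader is a hypothetical minimal read interface over a datastore.
type Reader interface {
	Get(key string) ([]byte, error)
}

// MapStore is an in-memory stand-in for an on-disk datastore.
type MapStore map[string][]byte

// Get returns the stored bytes, or ErrNotFound if the key is absent.
func (m MapStore) Get(key string) ([]byte, error) {
	v, ok := m[key]
	if !ok {
		return nil, ErrNotFound
	}
	return v, nil
}

// FallbackReader queries its stores in order and returns the first hit.
// During the copy window it would wrap the newly demoted store and the
// pre-existing cold store.
type FallbackReader struct {
	Stores []Reader
}

// Get tries each store in turn. A miss falls through to the next store;
// any other error is returned immediately.
func (f FallbackReader) Get(key string) ([]byte, error) {
	for _, s := range f.Stores {
		v, err := s.Get(key)
		if err == nil {
			return v, nil
		}
		if !errors.Is(err, ErrNotFound) {
			return nil, err
		}
	}
	return nil, ErrNotFound
}
```

Once the copy finishes and the demoted store is deleted, the wrapper can be swapped back for the single remaining cold store.
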
This process lets us regularly create new datastores that contain only the hot items needed for consensus. It also removes the need to periodically GC a hot store, because each compaction produces a fresh one.
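
The compaction steps can be sketched roughly as follows. This is an illustrative, single-goroutine model under several assumptions, not the real implementation: `Object`, `BlockStore`, `SplitStore`, `Begin`, and `Commit` are hypothetical names, in-memory maps stand in for on-disk stores, locking is omitted, and the `roots` argument stands in for the retention boundary (4 finalities plus headers back to genesis). It also assumes that tracked objects are protected together with everything they link to, and it uses the "keep old cold stores in a queryable list" option.

```go
package main

// Object is a hypothetical chain object: a payload plus links to other objects.
type Object struct {
	Data  []byte
	Links []string
}

// BlockStore is an in-memory stand-in for an on-disk block store.
type BlockStore map[string]Object

// SplitStore holds the hot store and a list of demoted cold stores.
type SplitStore struct {
	Hot     BlockStore
	Cold    []BlockStore    // oldest first
	tracked map[string]bool // non-nil only while a compaction transaction is open
}

// Begin prepares the transaction: from now on, API reads are tracked.
func (s *SplitStore) Begin() {
	s.tracked = make(map[string]bool)
}

// Get reads from the hot store, then from cold stores, newest first.
// While a transaction is open, the requested key is recorded.
func (s *SplitStore) Get(key string) (Object, bool) {
	if s.tracked != nil {
		s.tracked[key] = true
	}
	if o, ok := s.Hot[key]; ok {
		return o, true
	}
	for i := len(s.Cold) - 1; i >= 0; i-- {
		if o, ok := s.Cold[i][key]; ok {
			return o, true
		}
	}
	return Object{}, false
}

// copyReachable walks links from roots and copies every object found in src
// into dst (marking and writing in a single pass).
func copyReachable(src BlockStore, roots []string, dst BlockStore) {
	stack := append([]string(nil), roots...)
	for len(stack) > 0 {
		k := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if _, done := dst[k]; done {
			continue
		}
		o, ok := src[k]
		if !ok {
			continue // not in the hot store, nothing to carry over
		}
		dst[k] = o
		stack = append(stack, o.Links...)
	}
}

// Commit walks from roots, carries over tracked objects, then promotes the
// new store to hot and demotes the old hot store to the cold list.
func (s *SplitStore) Commit(roots []string) {
	next := make(BlockStore)
	copyReachable(s.Hot, roots, next)
	for k := range s.tracked {
		copyReachable(s.Hot, []string{k}, next)
	}
	s.Cold = append(s.Cold, s.Hot)
	s.Hot = next
	s.tracked = nil
}
```

In this model an object that is neither reachable from the roots nor touched during the transaction drops out of the hot store but stays readable through the demoted cold store. A real implementation would additionally need the promotion step to be atomic with respect to concurrent readers and writers.
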