
  • VCL

    • Volume Complete LSN
    • The highest LSN for which the availability of all prior log records is guaranteed
    • During storage recovery, every log record with an LSN larger than the VCL must be truncated
  • VDL

    • Volume Durable LSN
    • The highest Consistency Point LSN (CPL) that is smaller than or equal to the VCL
    • All log records with an LSN greater than the VDL can be truncated
    • Completeness and durability are therefore different
  • SCL

    • Segment Complete LSN
    • Identifies the greatest LSN below which all log records of the PG have been received
    • Used by the storage nodes when they gossip with each other in order to find and exchange log records that they are missing
  • Page LSN

    • Identifies the log record associated with the latest change to the page

  • PGMRPL

    • Protection Group Min Read Point LSN
    • The “low water mark” below which all the log records of the PG are unnecessary
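The relationship between VCL, VDL, and truncation can be illustrated with a minimal sketch. It assumes LSNs are dense consecutive integers starting at 1, which is a simplification made for this example; the function names are hypothetical and not taken from any real system.

```python
from typing import Iterable, Optional, Set


def volume_complete_lsn(received: Set[int], first_lsn: int = 1) -> int:
    """Highest LSN such that every LSN from first_lsn up to it is present.

    Assumes dense integer LSNs (an assumption of this sketch).
    Returns first_lsn - 1 when even the first record is missing.
    """
    vcl = first_lsn - 1
    while vcl + 1 in received:
        vcl += 1
    return vcl


def volume_durable_lsn(cpls: Iterable[int], vcl: int) -> Optional[int]:
    """Highest consistency point LSN (CPL) that is <= vcl, or None if there is none."""
    eligible = [lsn for lsn in cpls if lsn <= vcl]
    return max(eligible) if eligible else None


def records_to_truncate(received: Set[int], vdl: int) -> Set[int]:
    """Log records with an LSN above the VDL, which recovery would discard."""
    return {lsn for lsn in received if lsn > vdl}


received = {1, 2, 3, 4, 5, 7, 8}   # record 6 never arrived
cpls = [2, 4, 7]                   # hypothetical consistency points
vcl = volume_complete_lsn(received)
vdl = volume_durable_lsn(cpls, vcl)
truncated = records_to_truncate(received, vdl)
```

Here record 5 is complete (all earlier records are present) but not durable, because the last consistency point at or below the VCL is 4. This is the sense in which completeness and durability differ.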


  • As the database receives acknowledgements that establish the write quorum for each batch of log records, it advances the current VDL
  • The database allocates a unique, ordered LSN for each log record
  • No LSN may be greater than the current VDL plus the LSN Allocation Limit (LAL)
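The allocation limit acts as back pressure: the database cannot run arbitrarily far ahead of what storage has made durable. A minimal sketch, assuming integer LSNs allocated one at a time; the class name `LsnAllocator` and its methods are hypothetical.

```python
from typing import Optional


class LsnAllocator:
    """Hands out ordered LSNs, never exceeding VDL + allocation limit."""

    def __init__(self, allocation_limit: int) -> None:
        self.allocation_limit = allocation_limit
        self.vdl = 0             # current Volume Durable LSN
        self.last_allocated = 0  # most recently allocated LSN

    def advance_vdl(self, new_vdl: int) -> None:
        # The VDL only moves forward as write quorums are acknowledged.
        self.vdl = max(self.vdl, new_vdl)

    def try_allocate(self) -> Optional[int]:
        """Return the next LSN, or None if the limit would be exceeded."""
        candidate = self.last_allocated + 1
        if candidate > self.vdl + self.allocation_limit:
            return None  # caller must wait until the VDL advances
        self.last_allocated = candidate
        return candidate


alloc = LsnAllocator(allocation_limit=3)
first_batch = [alloc.try_allocate() for _ in range(4)]  # fourth call is refused
alloc.advance_vdl(2)                                    # storage catches up
next_lsn = alloc.try_allocate()
```

Returning `None` rather than blocking is a choice made for this sketch; a real engine would more likely throttle or queue the writer.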


  • Commits are completed asynchronously
  • When a client commits a transaction, the thread handling the commit request
    • Sets the transaction aside by recording its “commit LSN” in a separate list of transactions waiting on commit
    • Moves on to perform other work
  • A commit completes if and only if the latest VDL is greater than or equal to the transaction’s commit LSN
  • As the VDL advances, the database identifies qualifying transactions that are waiting to be committed
  • A dedicated thread sends commit acknowledgements to waiting clients (like Early Lock Release?)
  • Worker threads do not pause for commits; they simply pull other pending requests and continue processing
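The commit steps above can be sketched as a queue ordered by commit LSN that is drained whenever the VDL advances. This is a single-threaded illustration under assumed names (`CommitQueue`, `wait_for_commit`, `on_vdl_advance`); the threading and the client acknowledgement itself are left out.

```python
import heapq
from typing import List, Tuple


class CommitQueue:
    """Transactions waiting for the VDL to reach their commit LSN."""

    def __init__(self) -> None:
        # Min-heap of (commit_lsn, txn_id), smallest commit LSN first.
        self._waiting: List[Tuple[int, str]] = []

    def wait_for_commit(self, commit_lsn: int, txn_id: str) -> None:
        # Called by the worker thread, which then moves on to other work.
        heapq.heappush(self._waiting, (commit_lsn, txn_id))

    def on_vdl_advance(self, vdl: int) -> List[str]:
        """Return the transactions whose commit LSN is now <= vdl."""
        acked = []
        while self._waiting and self._waiting[0][0] <= vdl:
            _, txn_id = heapq.heappop(self._waiting)
            acked.append(txn_id)  # a dedicated thread would notify the client
        return acked


queue = CommitQueue()
queue.wait_for_commit(30, "t3")
queue.wait_for_commit(10, "t1")
queue.wait_for_commit(20, "t2")
acked_at_20 = queue.on_vdl_advance(20)
```

A heap keeps the check cheap on every VDL advance, since only the head of the queue needs to be inspected.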


  • The database guarantees that a page in the buffer cache must always be of the latest version
  • A page is evicted from the cache only if its “page LSN” is greater than or equal to the VDL
    • The data page is not written back to the storage system; the corresponding memory is simply marked as free
  • This protocol ensures
    • All changes in the page have been hardened in the log
    • On a cache miss, it is sufficient to request a version of the page as of the current VDL to get its latest durable version.
  • The database does not need to establish consensus using a read quorum under normal circumstances
  • The database knows which segment is capable of satisfying a read (the segments whose SCL is greater than the read-point)
  • The database thus can issue a read request directly to a segment that has sufficient data
  • Each storage node segment is guaranteed that there will be no read page requests with a read-point lower than the PGMRPL
  • Each storage node learns the PGMRPL from the database and can therefore advance the materialized pages on disk
  • The actual concurrency control protocols are executed in the database engine exactly as though the database pages and undo segments were organized in local storage of the database
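Quorum-free read routing and the PGMRPL can be sketched as below. The sketch follows the notes in requiring a segment's SCL to be strictly greater than the read-point. Falling back to the VDL when no reads are outstanding is an assumption of this example, and the function names are hypothetical.

```python
from typing import Dict, Iterable, List


def segments_for_read(scl_by_segment: Dict[str, int], read_point: int) -> List[str]:
    """Segments whose SCL is greater than the read-point, so any one can serve the read."""
    return sorted(seg for seg, scl in scl_by_segment.items() if scl > read_point)


def pg_min_read_point(outstanding_read_points: Iterable[int], vdl: int) -> int:
    """Lowest read-point still in use for a protection group.

    With no outstanding reads, this sketch falls back to the VDL (an assumption).
    """
    return min(outstanding_read_points, default=vdl)


scls = {"seg-a": 100, "seg-b": 90, "seg-c": 120}  # hypothetical SCLs tracked by the database
candidates = segments_for_read(scls, read_point=95)
pgmrpl = pg_min_read_point([95, 110, 101], vdl=130)
```

Because the database already tracks each segment's SCL, it can pick any one of `candidates` and issue a single read instead of polling a read quorum. Storage nodes can treat log records below `pgmrpl` as no longer needed for reads.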
