LSNs
-
VCL
- Volume Complete LSN
- The highest LSN for which storage can guarantee availability of all prior log records
- During storage recovery, every log record with an LSN larger than the VCL must be truncated
-
VDL
- Volume Durable LSN
- The highest Consistency Point LSN (CPL) that is smaller than or equal to the VCL
- All log records with an LSN greater than the VDL can be truncated
- Completeness and durability are therefore different
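A minimal sketch of how VCL and VDL relate, assuming the set of allocated LSNs and the set of CPLs are both known up front. The function names and the sample LSNs are illustrative, not part of any real API.

```python
def volume_complete_lsn(received, allocated):
    """Highest LSN such that every allocated LSN at or below it was received."""
    vcl = 0
    for lsn in sorted(allocated):
        if lsn not in received:
            break  # first gap: nothing above this point counts as complete
        vcl = lsn
    return vcl


def volume_durable_lsn(vcl, cpls):
    """Highest consistency point LSN (CPL) that does not exceed the VCL."""
    return max((c for c in cpls if c <= vcl), default=0)


# Hypothetical volume state: the record with LSN 1050 never arrived.
allocated = [900, 950, 1000, 1007, 1050, 1100]
received = {900, 950, 1000, 1007, 1100}
cpls = {900, 1000, 1100}

vcl = volume_complete_lsn(received, allocated)   # complete up to 1007
vdl = volume_durable_lsn(vcl, cpls)              # durable only up to 1000
# Recovery keeps only received records at or below the VDL.
survivors = [l for l in allocated if l in received and l <= vdl]
```

Here the volume is complete to 1007 but durable only to 1000, which is why the two notions differ.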
-
SCL
- Segment Complete LSN
- Identifies the greatest LSN below which all log records of the PG have been received
- Used by the storage nodes when they gossip with each other in order to find and exchange log records that they are missing
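A rough sketch of SCL tracking and a gossip exchange between two segments of the same PG. It assumes the ordered chain of LSNs for the PG is available as a plain list (`pg_chain`); how a real segment learns that ordering is outside these notes, and all names here are hypothetical.

```python
def segment_complete_lsn(pg_chain, received):
    """Greatest LSN in the PG's chain with no gap at or before it."""
    scl = 0
    for lsn in pg_chain:
        if lsn not in received:
            break
        scl = lsn
    return scl


def missing_from_peer(mine, peer):
    """LSNs the peer holds that this segment lacks (what gossip would fetch)."""
    return sorted(peer - mine)


pg_chain = [10, 20, 30, 40, 50]   # assumed ordering of the PG's log records
seg_a = {10, 20, 40}              # segment A missed record 30
seg_b = {10, 20, 30}              # segment B has 30 but not 40

scl_before = segment_complete_lsn(pg_chain, seg_a)
fetched = missing_from_peer(seg_a, seg_b)
seg_a |= set(fetched)             # A fills its gap from B
scl_after = segment_complete_lsn(pg_chain, seg_a)
```

After the exchange, segment A's SCL moves from 20 to 40 because the gap at 30 has been filled.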
-
Page LSN
- Identifies the log record associated with the latest change to the page
-
PGMRPL
- Protection Group Min Read Point LSN
- The “low water mark” below which all the log records of the PG are unnecessary
Write
- As the database receives acknowledgements that establish the write quorum for each batch of log records, it advances the current VDL
- The database allocates a unique, ordered LSN for each log record
- An allocated LSN must be less than or equal to the current VDL + the LSN Allocation Limit (LAL)
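The write path above can be sketched as a small allocator with back-pressure. This is a simplified model under several assumptions: LSNs are dense integers, every record is treated as a consistency point, and acknowledgements are tracked per record rather than per batch. `LogWriter` and its parameters are hypothetical names.

```python
class LogWriter:
    def __init__(self, allocation_limit, write_quorum):
        self.lal = allocation_limit       # LSN Allocation Limit
        self.write_quorum = write_quorum  # acks needed per record
        self.vdl = 0
        self.next_lsn = 1
        self.acks = {}                    # lsn -> set of acknowledging node ids

    def allocate(self):
        """Return a new LSN, or None if it would exceed VDL + LAL."""
        if self.next_lsn > self.vdl + self.lal:
            return None                   # back-pressure: storage is lagging
        lsn = self.next_lsn
        self.next_lsn += 1
        self.acks[lsn] = set()
        return lsn

    def acknowledge(self, lsn, node_id):
        """Record an ack, then advance VDL over the quorum-acked prefix."""
        self.acks[lsn].add(node_id)
        while (self.vdl + 1 in self.acks
               and len(self.acks[self.vdl + 1]) >= self.write_quorum):
            self.vdl += 1
        return self.vdl


w = LogWriter(allocation_limit=2, write_quorum=2)
first, second = w.allocate(), w.allocate()
blocked = w.allocate()                    # LSN 3 > VDL 0 + LAL 2
w.acknowledge(2, "n1"); w.acknowledge(2, "n2")
vdl_with_gap = w.vdl                      # record 1 is not yet durable
w.acknowledge(1, "n1"); w.acknowledge(1, "n2")
unblocked = w.allocate()                  # VDL advanced, allocation resumes
```

Note that acknowledging record 2 alone does not move the VDL; it advances only once record 1 also reaches quorum.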
Commit
- Completed asynchronously
- When a client commits a transaction, the thread handling the commit request sets the transaction aside by recording its “commit LSN” in a separate list of transactions waiting on commit
- The thread then moves on to perform other work
- A commit completes if and only if the latest VDL is greater than or equal to the transaction’s commit LSN
- As the VDL advances, the database identifies qualifying transactions that are waiting to be committed
- Uses a dedicated thread to send commit acknowledgements to waiting clients (Like Early Lock Release?)
- Worker threads do not pause for commits; they simply pull other pending requests and continue processing
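A minimal sketch of the asynchronous commit list, assuming a min-heap keyed by commit LSN. `CommitQueue` and its methods are hypothetical; in the real system a dedicated thread would send the acknowledgements, which this sketch only returns as a list.

```python
import heapq


class CommitQueue:
    def __init__(self):
        self._waiting = []  # min-heap of (commit_lsn, txn_id)

    def request_commit(self, txn_id, commit_lsn):
        """Called by a worker thread; returns immediately without blocking."""
        heapq.heappush(self._waiting, (commit_lsn, txn_id))

    def on_vdl_advance(self, vdl):
        """Return transactions whose commit LSN is now at or below the VDL."""
        done = []
        while self._waiting and self._waiting[0][0] <= vdl:
            _, txn_id = heapq.heappop(self._waiting)
            done.append(txn_id)
        return done


q = CommitQueue()
q.request_commit("t1", 105)
q.request_commit("t2", 120)
q.request_commit("t3", 110)

acked_at_100 = q.on_vdl_advance(100)  # nothing is durable yet
acked_at_112 = q.on_vdl_advance(112)  # t1 and t3 qualify
acked_at_120 = q.on_vdl_advance(120)  # t2 qualifies
```

The heap keeps the check cheap: each VDL advance only inspects the transactions that actually qualify.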
Read
- The database guarantees that a page in the buffer cache is always the latest version
- It does so by evicting a page from the cache only if its “page LSN” is greater than or equal to the VDL
- Eviction does not write the data page back to the storage system; the corresponding memory is simply marked as free
- This protocol ensures that:
- All changes in the page have been hardened in the log
- On a cache miss, it is sufficient to request a version of the page as of the current VDL to get its latest durable version.
- The database does not need to establish consensus using a read quorum under normal circumstances
- The database knows which segment is capable of satisfying a read (the segments whose SCL is greater than the read-point)
- The database thus can issue a read request directly to a segment that has sufficient data
- A storage node segment is guaranteed that no read page request will arrive with a read-point lower than the PGMRPL
- Each storage node learns the PGMRPL from the database and can therefore advance the materialized pages on disk
- The actual concurrency control protocols are executed in the database engine exactly as though the database pages and undo segments were organized in local storage of the database
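A short sketch of the quorum-free read path and the PGMRPL low water mark. It assumes the database tracks each segment's SCL in a dictionary and treats the SCL as inclusive (`scl >= read_point`); both the data layout and the function names are assumptions of this sketch. Falling back to the VDL when no reads are outstanding is also an assumption.

```python
def segments_for_read(scl_by_segment, read_point):
    """Segments complete enough to serve a read at read_point."""
    return [seg for seg, scl in scl_by_segment.items() if scl >= read_point]


def pg_min_read_point(outstanding_read_points, vdl):
    """Lowest read-point still in use for the PG (the PGMRPL)."""
    return min(outstanding_read_points, default=vdl)


scl_by_segment = {"seg-a": 1007, "seg-b": 990, "seg-c": 1000}
candidates = segments_for_read(scl_by_segment, read_point=1000)

pgmrpl = pg_min_read_point([1000, 985, 995], vdl=1000)
log_lsns = [970, 980, 985, 990]
# Records below the PGMRPL are no longer needed by any reader.
collectable = [lsn for lsn in log_lsns if lsn < pgmrpl]
```

Any one of the candidate segments can serve the read directly, so no read quorum is needed in the normal case.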
See also: http://dbaplus.cn/news-21-1241-1.html