Created
November 10, 2013 07:45
-
-
Save pervognsen/7395107 to your computer and use it in GitHub Desktop.
lsm_tree.c
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
/* | |
** MVCC NOTES | |
** | |
** The in-memory tree structure supports SQLite-style MVCC. This means | |
** that while one client is writing to the tree structure, other clients | |
** may still be querying an older snapshot of the tree. | |
** | |
** One way to implement this is to use an append-only b-tree. In this | |
** case instead of modifying nodes in-place, a copy of the node is made | |
** and the required modifications made to the copy. The parent of the | |
** node is then modified (to update the pointer so that it points to | |
** the new copy), which causes a copy of the parent to be made, and so on. | |
** This means that each time the tree is written to a new root node is | |
** created. A snapshot is identified by the root node that it uses. | |
** | |
** The problem with the above is that each time the tree is written to, | |
** a copy of the node structure modified and all of its ancestor nodes | |
** is made. This may prove excessive with large tree structures. | |
** | |
** To reduce this overhead, the data structure used for a tree node is | |
** designed so that it may be edited in place exactly once without | |
** affecting existing users. In other words, the node structure is capable | |
** of storing two separate versions of the node at the same time. | |
** When a node is to be edited, if the node structure already contains | |
** two versions, a copy is made as in the append-only approach. Or, if | |
** it only contains a single version, it is edited in place. | |
** | |
** This reduces the overhead so that, roughly, one new node structure | |
** must be allocated for each write (on top of those allocations that | |
** would have been required by a non-MVCC tree). Logic: Assume that at | |
** any time, 50% of nodes in the tree already contain 2 versions. When | |
** a new entry is written to a node, there is a 50% chance that a copy | |
** of the node will be required. And a 25% chance that a copy of its | |
** parent is required. And so on. | |
** |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment