isaacdavis/id_abstract.md

## id_abstract.md

      
    Raw
  

              id_abstract.md
            
          
    Intro

This is a very rough sketch of an idea for a reimplementation of the illumos
ID Space mechanism. Take everything I say here with a grain of salt, please...
Overview

ID spaces will be implemented using two structs:
id_space_t
    node_t *root; /* root of tree */
    size_t height; /* the current height of the tree */
    id_t low;
    id_t high;
    size_t size; /* number of allocated IDs */
    id_t last_allocated; /* ID that was last allocated */
    size_t branch_factor /* not modifiable after creation; probably should be a compile-time constant */

id_node_t
    /* maybe not a long[] if branch_factor < 64 */
    long bitfield[branch_factor/(8 * sizeof(long))];
    node_t *children[branch_factor];
    size_t inverse_depth; /* this is weird: it equals tree height minus actual depth */

An id space will consist of a tree of id_node_ts, pointed to by an
id_space_t. The high and low ranges will be specified by the user, as normal.
The branch_factor is the number of children per parent node - this will either
be a compile-time constant, a tunable parameter, or optimized automatically for
the predicted size of the tree.
Each node has a bitfield, which will serve two different purposes depending on
whether or not the node is a leaf. If the node is a leaf, zeroes in the bitfield
will indicate unallocated IDs, and ones will indicate allocated IDs. If the node
is not a leaf, each bit will correspond to a child node. A zero will indicate
that the corresponding child (or one of the child's children or its childrens'
children, etc.) has an unallocated ID. A one will indicate that the child has
no children with unallocated IDs - its branch is completely full.
Example

The tree starts with just the root node and grows (bottom-up!) as
more space is required. Here's an example. For simplicity's sake, we'll use
a branching factor of 2 and assume that the ID range starts at 0.
In my hacked-together notation, N signifies a null pointer.
We start with a single node (labeled node A):
A: bitfield:[0 0]
   children:[N N]
   inverse_depth: 0

We allocate our first id, which is 0:
A: bitfield:[1 0]
   children:[N N]
   inverse_depth: 0

And another id, 1:
A: bitfield:[1 1]
   children:[N N]
   inverse_depth: 0

On the next allocation, we find that the root's bitfield is completely full -
this is how we know we've run out of space. We add a new root, node B:
             B: bitfield:[1 0]
                children:[A N]
                inverse_depth: 1

        /

A: bitfield:[1 1]
   children:[N N]
   inverse_depth: 0

Note that we didn't have to modify A at all.
When we add a new layer like this, we must also update the root and height
fields in our id_space_t struct.
On the next allocation, we see that B has a child with space remaining. This
child hasn't actually been allocated yet, so we allocate it:
             B: bitfield:[1 0]
                children:[A C]
                inverse_depth: 1

        /                       \

A: bitfield:[1 1]            C: bitfield:[1 0]
   children:[N N]               children:[N N]
   inverse_depth: 0             inverse_depth: 0

On the next allocation, we see that we have filled up C. This means that B is
now also full, so we update both nodes. Furthermore, the root is now full, so
we add another layer:
                         D: bitfield:[1 0]
                            children:[B N]
                            inverse_depth: 2

                      /

             B: bitfield:[1 1]
                children:[A C]
                inverse_depth: 1

        /                       \

A: bitfield:[1 1]            C: bitfield:[1 1]
   children:[N N]               children:[N N]
   inverse_depth: 0             inverse_depth: 0

And so on and so on...
Allocation/Deletion/Runtime

Note that the IDs that correspond to specific spots in each node's bitfield
never change - for example, the second slot in node C will always refer to ID
3. This allows us to be precise when traversing the tree, for first-fit
allocation, next-fit allocation, and deletion alike.
For first-fit allocation, we start at the root and iterate through the bitfield
until we find a non-full child. We then iterate through the child's bitfield,
and so on, until we reach a leaf. We can use each node's inverse depth and the
tree's total height to figure out when we reach a leaf. We can calculate the ID
corresponding to that leaf using the branching factor, current depth, and
path we take as we iterate.
For next-fit allocation, we do exactly the same thing, but we only consider
nodes that have the possibility of holding IDs larger than the last allocated
ID. This will occasionally result in a "miss" where we end in a leaf whose only
free IDs are smaller than the last allocated ID, but this case will be rare and
will just require one additional tree traversal.
For ID deletion, we know the exact location of the ID to be deleted, so we can
traverse the tree precisely until we reach that location.
In all of these cases, we almost always can get away with only performing one
tree traversal of length log_{branch_factor}n, iterating through at
most branch_factor elements at each node, for a total runtime of
branch_factor * log_{branch_factor}n, where n is
branch_factor^tree_height.
When adding a layer, we don't have to touch any nodes except those in the
direct path up to the root from the last filled-up leaf. This is why we use
inverse_depth instead of depth - otherwise, we would have to touch all nodes
to update their depth.
Range limiting

In id_space_create, the user may specifiy a minumum ID value larger than zero. In this case, we
should simply avoid allocating IDs from that part of the tree. We may initially
have to allocate a single node for each layer until the tree is tall enough to
support the user's minimum value, but we do not have to "flesh out" these layers
preemptively.
As an example, suppose our branching factor is 2 and the user specifies a
minimum ID value of 14. We begin with an empty tree:
A: bitfield:[0 0]
   children:[N N]
   inverse_depth: 0

But discover upon allocation that our tree isn't big enough to support ID 14.
We therefore expand the tree via our bottom-up new-root method until it is:
                                D: bitfield:[0 0]
                                    children:[C N]
                                    inverse_depth: 3

                                /

                         C: bitfield:[0 0]
                            children:[B N]
                            inverse_depth: 2

                      /

             B: bitfield:[0 0]
                children:[A N]
                inverse_depth: 1

        /

A: bitfield:[0 0]
   children:[N N]
   inverse_depth: 0

And then allocate as we would normally - ID 14 belongs in node G in the diagram
below:
                                D: bitfield:[0 0]
                                    children:[C E]
                                    inverse_depth: 3

                                /                      \

                         C: bitfield:[0 0]           E: bitfield:[0 0]
                            children:[B N]              children:[N F]
                            inverse_depth: 2            inverse_depth: 2

                      /                                           \

             B: bitfield:[0 0]                               F: bitfield:[0 0]
                children:[A N]                                  children:[N G]
                inverse_depth: 1                                inverse_depth: 1

        /                                                                 \

A: bitfield:[0 0]                                                G: bitfield:[1 0]
   children:[N N]                                                   children:[N N]
   inverse_depth: 0                                                 inverse_depth: 0

From here, we can allocate normally - we just need to ensure that we don't allow
allocations less than 14. Note that there is an overhead of at most an extra
log _{branch_factor}n nodes (in this case, nodes A, B, and C), where
n is the power of 2 immediately below the user's minimum ID value (in this
case, 8). This extra allocation could take place at the time of tree creation or
ID allocation, but would only have to take place once. If the user then reduced
the minimum ID, we would already have the structure in place to begin allocating
from the restricted portion of the tree.
It will be simple to prevent ID allocation beyond the user's specified maximum
value - we just avoid allocating IDs past that point and avoid unnecessarily
expanding the tree to allow IDs past that point. If the user expands the ID
range by increasing the maximum value, the tree can be allowed to grow normally.
Space complexity

A tree of height h will support branch_factor^h+1 IDs and
require at most branch_factor^h + branch_factor^h-1+
... + branch_factor⁰ nodes. This does not seem efficient for a low
branching factor (e.g. 2) but becomes increasingly efficient as the branching
factor increases. At any rate, this is more efficient than allocating a separate
vmem_seg_t for every ID, which is what we do now.
For example, suppose our branching factor is 16, and we want to allocate 4096
IDs. This means we need a tree of height 2, which will have at most 273 nodes.
Each node will take up 138 bytes (2 bytes for the bitfield, 16 child pointers,
and a size_t for the depth) for a total of (138 * 273) == 37674 bytes, or
about 37 kilobytes. As noted in OS-6013, the current ID space implementation uses
232 bytes per ID - this would entail a total of (232 * 4096) == 950,272
bytes, or about 950 kilobytes.
As another example, suppose our branching factor is 64, and we want to allocate
262144 IDs. This means we need a tree of height 2, which will have at most 4161
nodes. Each node will take up 528 bytes (1 long for the bitfield,
64 child pointers, and a size_t for the depth) for a total of
(528 * 4161) == 2,197,008 bytes, or about 2 megabytes. The current
implementation would require (232 * 262144) == 60,817,408 bytes, or about 60
megabytes.