paniq/twistpool.md

## twistpool.md

      
    Raw
  

              twistpool.md
            
          
    Twisting Pool Queue

by Leonard Ritter leonard.ritter@duangle.com, 2019-05-12
An implementation written in Scopes can be found here
We have the task of distributing IDs to objects in a pool that we can use to look up their contents. We want to ensure that:

Insert, access, and remove are operations with O(1) complexity.
The object array is compact, i.e. without any gaps, so we can parallelize work on it with full occupancy.
IDs are ideally allocated in sequential order, so that connected blocks of IDs can be allocated.
Sequential IDs ideally resolve to sequentially ordered objects, so that cache utilization is optimal, both in the lookup table and in the object array (at the cost of zero to one data move per obtain/release).
There is a simple way to resolve object index to its mapped ID (because why not).
We can tell if an ID is invalid even if it has been reissued, supporting weak referencing.

To achieve this, all we need (in addition to our circular object array) is

A LUT (Lookup table), which is a circular dynamic array that maps IDs to object indices (and as we will later see, simultaneously maps object indices to IDs) and to generations (in a data oriented design, they would likely be stored in separate arrays). It should be able to grow as big as the object array.
Two offsets into the object array: a push and a pop offset, labeled push_offset and pop_offset. These offsets are circular, i.e. they are modulated by lut_size.

Conceptually, Our ID is a tuple of two values:

The reusable LUT index ID.lut_index (which should use as many bits as we need for object array indices),
Its generation ID.generation (for N bits, we can reissue the LUT index 2^N times).

In an actual implementation, the ID tuple, as well as the LUT tuples could be encoded as a single integer.
Initialization

In the beginning, our LUT has zero elements. Push and pop offsets are set to zero as well.
Next, we will cover the four basic operations: Obtain, Access and Release.
Obtain I

There are two ways to obtain an ID: we can either get a new one, or reuse one. In the beginning we are only going to be able to obtain fresh IDs, so we will cover this part first. Reusing IDs will be covered after we have released a few IDs.
To obtain N new IDs, we grow the LUT so that all indices up to push_offset + N - 1 are valid indices in the LUT. Each new LUT element is initialized with an ID whose object index is the same as its index in the LUT, and whose generation is zero. We call this an identity mapping: reusable ID and index are the same. Finally, we increase push_offset by N and modulate it by lut_size.
After obtaining five fresh elements from a fresh LUT, the LUT will have this configuration (for easier reading, we split the LUT index and generation part of each ID into their own row):


LUT Index
00
01
02
03
04


Object Index
00
01
02
03
04


Generation
0
0
0
0
0


The push offset is 5. The pop offset remains at 0.
We obtain five more elements, resulting in this table:


LUT Index
00
01
02
03
04
05
06
07
08
09


Object Index
00
01
02
03
04
05
06
07
08
09


Generation
0
0
0
0
0
0
0
0
0
0


The push offset is 10. The pop offset remains at 0.
Release

Releasing an ID involves swapping out ("twisting") an item to close the gap, increasings its generation, and finally increasing the pop offset by 1.
Lets start with an example. We wish to release five of the ten IDs we have obtained in the previous example: {00, 0}, {01, 0}, {05, 0}, {09, 0} and {02, 0}.
Releasing {00, 0} is simple, as it is located right at the beginning of the LUT, where our pop offset is, and so releasing it won't create any gaps. We increase its generation counter by 1. We increase our pop offset by 1, and the new LUT now looks like this (unused indices into the object array are prefixed with an F for "free"):


LUT Index
00
01
02
03
04
05
06
07
08
09


Object Index
F00
01
02
03
04
05
06
07
08
09


Generation
1
0
0
0
0
0
0
0
0
0


The pop offset is now 1. As you can see, we still have a contiguous array of used objects.
Releasing {01, 0} is equally simple, as we have nearly the same situation, creating no gaps. We increase our pop offset by 1, and the LUT looks like this now:


LUT Index
00
01
02
03
04
05
06
07
08
09


Object Index
F00
F01
02
03
04
05
06
07
08
09


Generation
1
1
0
0
0
0
0
0
0
0


The pop offset is 2.
The next ID to be released is {05, 0}. Unfortunately, this index lies right in the middle of the array, so we will have to swap it with an item from the left side, located at pop offset 2. This is called a twist. The swap will be between LUT index 2 and LUT index 5. We swap out object indices but not generation fields. Likewise, in the object array, we need to move the object at index 2 to index 5 (index 5 has just been freed, so we don't need to do a swap here). Finally we increase the pop offset by 1:


LUT Index
00
01
02
03
04
05
06
07
08
09


Object Index
F00
F01
05
03
04
F02
06
07
08
09


Generation
1
1
0
0
0
1
0
0
0
0


The pop offset is now 3.
Releasing {09, 0} requires the same operation, even though it is located at the right border of the contiguous object array. This is so we can maintain the semi-FIFO ordering of our queue. We swap out the content at LUT index 3 with the content at LUT index 9 and increase the pop offset:


LUT Index
00
01
02
03
04
05
06
07
08
09


Object Index
F00
F01
05
09
04
F02
06
07
08
F03


Generation
1
1
0
0
0
1
0
0
0
1


The pop offset is now 4.
Finally, we release {02, 0}. This is the trickiest case: 2 has been remapped to object index 5: it is already twisted. The ID {05, 0} with object index 2 has been previously freed, and we are about to free the object at index 5, so we can untwist this mapping without needing to move any objects (This untwist operation can always be performed by the way, even when the entries in question have not been twisted, because an identity swap has no effect):


LUT Index
00
01
02
03
04
05
06
07
08
09


Object Index
F00
F01
F02
09
04
F05
06
07
08
F03


Generation
1
1
1
0
0
1
0
0
0
1


But we are not entirely done. In the second step, we still need to move the object at pop offset 4 to our gap at object offset 5, and increase the pop offset once more:


LUT Index
00
01
02
03
04
05
06
07
08
09


Object Index
F00
F01
F02
09
05
F04
06
07
08
F03


Generation
1
1
1
0
0
1
0
0
0
1


the pop offset is now 5.
If we wanted to release one more object, we would have to place it at pop offset 5, which, as you see, is already twisted - we would need to untwist these slots before we produce a new twist, but still only one move operation will be required in the end. By untwisting unused entries in this way, we also slowly disentangle the array mapping so that when the array is empty, all entries are back to their identity configuration.
Access

We are interested in three modes of access: we want to resolve an ID to its object index, we want to iterate the object array and we want to resolve an object index to its ID.
ID To Object

To check the validity of an ID, we compare its generation field ID.generation to the generation field in the LUT at LUT[ID.lut_index].generation. If the generations match, the ID is still valid.
To retrieve the index of the object that this ID is mapped to, we use LUT[ID.lut_index].object_index.
In addition, we can check if an ID is currently in use regardless of whether it has been reissued. If its associated object index lies between the modulated push and pop offset, that is if LUT[ID.lut_index].object_index >= (push_offset mod lut_size) and LUT[ID.lut_index].object_index < (pop_offset mod lut_size), then the ID is currently free.
Object Iteration

Object iteration begins at pop_offset and ends at push_offset (excluding push_offset itself). As our object buffer is circular, that is push_offset can be smaller than pop_offset, we need to modulate our current index by the total size of the LUT (which should be equal to the object buffer size) after each increment.
The iteration will always be contiguous, that is: as specified, each slot in our sequence will hold a valid object.
Object to ID

As LUT entries are either in identity mapping or twisted, our mapping is conservative and the predicate LUT[LUT[i].object_index].object_index == i holds. Therefore we can simply use LUT[object_index].object_index == lut_index to retrieve the LUT index for this object and query its generation.
Obtain II: Reuse

Now that we have seen how releases can produce twists and generate reusable IDs, we need to look at the second way of obtaining IDs, which is to reuse them. You can choose to either reuse objects right away, or, at the cost of needing more storage, wait until the object array has reached a threshold size before you start reusing them, to avoid depleting the generational counter too quickly.
Reusing IDs involves modulating the push offset by the LUT array size so that it becomes smaller than the pop offset, and untwisting the object index at the modulated push offset.
Let's look at our example. Our pop offset is 5, which means we can reuse up to five IDs before we need to generate fresh ones. Let's obtain all of those IDs, step by step.
Our current push offset, as we left it in Obtain I, is still 10. Modulated by the LUT size (which is 10), its real offset is computed by push_offset mod lut_size to the actual offset 0. This is the ID we are going to obtain. It is identity mapped, so no untwist is necessary. Its generation has already been incremented in the previous release step. We increment the push offset by 1. We obtain ID {00, 1}.


LUT Index
00
01
02
03
04
05
06
07
08
09


Object Index
00
F01
F02
09
05
F04
06
07
08
F03


Generation
1
1
1
0
0
1
0
0
0
1


The push offset is now 11. The pop offset remains at 5.
The next two IDs starting at push_offset mod lut_size are 1 and 2, and they are also identity mapped, so we can treat them the same way. We obtain IDs {01, 1} and {02, 1}.


LUT Index
00
01
02
03
04
05
06
07
08
09


Object Index
00
01
02
09
05
F04
06
07
08
F03


Generation
1
1
1
0
0
1
0
0
0
1


The push offset is now 13.
Finally an interesting case: Our newest ID is already mapped to object index 9, but ID 9 is still free (this is an invariant property: in a twist, the other ID is not in use). We untwist the object index and move the object from index 9 to index 3. We increase the push offset by 1, and obtain ID {09, 1}.


LUT Index
00
01
02
03
04
05
06
07
08
09


Object Index
00
01
02
03
05
F04
06
07
08
09


Generation
1
1
1
0
0
1
0
0
0
1


The push offset is now 14.
Once again, for our last obtain, we encounter another twisted entry. We handle it the same way as we have before. The object is moved from index 5 to index 4, and we obtain ID {05, 1}.


LUT Index
00
01
02
03
04
05
06
07
08
09


Object Index
00
01
02
03
04
05
06
07
08
09


Generation
1
1
1
0
0
1
0
0
0
1


And have you noticed? Once we reuse all IDs, the table is sorted again.
When The Array Is Full

When increasing the push offset would collide with the pop offset, i.e. the array contains (capacity - 1) elements, we need to append one more object to the object array, and another LUT entry with identity mapping. When the new id lies in the range between pop and push offset, it is already obtained, and we can return it. Otherwise we continue as usual.