Skip to content

Instantly share code, notes, and snippets.

@no-defun-allowed
Created August 8, 2021 23:25
Show Gist options
  • Save no-defun-allowed/d66146c4768a17feddc01763bc4a3f7e to your computer and use it in GitHub Desktop.
Save no-defun-allowed/d66146c4768a17feddc01763bc4a3f7e to your computer and use it in GitHub Desktop.
An immodest proposal for array-of-structs

Vaguely idiomatic array-of-structs in Common Lisp

Goals and non-goals

The aim is to nail down stuff in cache. The additions should all be safe still; no pointers to unused memory should be producable, types still needs to be checked, etc.

We would still like to play nicely with uniform reference semantics, so that things like setf will still work. This entails that it is not a goal to pass unboxed values around. Ideally, references should still work after redefinition or resizing of the array.

Ideally redefinition would just work, but change-class on an element of a packed array obviously won’t.

The space overhead should be fairly low, else we will blow out caches with the space overhead again.

Idea 1

For the sake of argument, let’s say each array has a storage vector, and it looks something like

---------------------------------------------------
| element-type | length | element | element | ... |
---------------------------------------------------

and we want to pack a structure like (defstruct point x y z).

Our first idea is to pack each struct in, with a back-pointer as a “header” for each object.

  ___________________________________________
 /                        \                  \
\v/                       |                  |
---------------------------------------------------------
| element-type | length | head | x | y | z | head | ... |
---------------------------------------------------------
                        /^\                /^\
                         |                  |
                 (aref array 0)     (aref array 1)

To take the aref of such an array, we compute the address of the appropriate object, and add a special tag to it. To access a slot, we merely have to add an offset to the address. To access type information, we can use the head address to recover the element-type of the vector.

The main problem with this design is that it is impossible to update the instances for a redefined class, or otherwise adjust the array. We can also see that the back-pointer per element also removes some space per cache line.

Idea 2

We instead handle each structure more like a standard instance, using a handle of sorts. Each handle has a reference to the packed array, and an offset measured in struct lengths:

-------------------------------------------------------
| element-type | length | x | y | z | x | y | z | ... |
-------------------------------------------------------
 /^\
  |
--+------------
| s-v | class |    <- the array
---------------
 /^\
  |
--+---------------
| array | offset |  <- an "element"
------------------

aref produces a handle rather than an interior pointer now. While we seem to have more indirections to the struct in question, we believe that the same optimisations used to, say, cache the rack of a standard instance or the storage vector of an array, are applicable to eliminating the use of handles.

Handles would have to be allocated once when an array is allocated, and partially reused when resizing an array. (Shrinking would allow us to have invalid handles, come to think of it.) We have more space overhead due to handles, and handles may not be contiguously allocated, but there are no backpointers wasting space in between packed structs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment