Skip to content

Instantly share code, notes, and snippets.

@paniq
Last active April 13, 2024 15:00
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save paniq/429c96446e3e7e01f0c860987049d1cf to your computer and use it in GitHub Desktop.
Save paniq/429c96446e3e7e01f0c860987049d1cf to your computer and use it in GitHub Desktop.

64-bit Pointer Basing in O(1)

A problem that occasionally comes up when walking a dynamically typed heap is to find the base address for a given pointer i.e. the start of an allocation. With this ability, the following features are possible:

  • Reference-counting slices of dynamic arrays
  • Faster lookup during conservative garbage collection
  • A copy-on-write that changes the pointer written to
  • Bidirectional edges within objects (not just only at the boundary)

A traditional way to support this is to maintain a search tree in the allocator, for which the search complexity is O(log n) for n allocations, or to use a block allocation scheme in conjunction with a hashmap.

But it is also possible to sacrifice some memory overhead towards a local and efficient O(1) solution: The allocator reserves 46 bits of addressing, and then maintains 2^6 pools (36 in practice, see further down) of 2^40 bytes each. We further require that an allocation of size S must be aligned to A(S) = ceil(log2(S)), where A(S) also determines in which pool the memory must be allocated. The resulting addresses will factually encode A(S) in the top 6 bits of the address range, and that allows us to base any offset address simply by aligning it.

The cost of this method however is memory overhead: each allocation of size S may have up to (2^A(S))/2-1 unused bytes, but less than 4096, as unused bytes that cover entire VRAM pages will not cause any memory to be committed. For the same reason, the same addressing scheme can be used for files when the file system supports sparse allocation (both ext4 and NTFS do), but since some file systems subtract the apparent size from the available space, it is recommended to map the pools individually.

Optimizations: as allocations must be aligned to 16 bytes on x86-64, and our upper bound is 2^40 bytes due to pool capacity, only 36 pools are required in practice.

Pointer Basing in Foreign Heaps

For situations where we have to work with an existing heap allocator and cannot provide our own, it is still possible to construct baseable pointers, using pointer tagging. See the following implementation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment