A memory pool is a data structure for allocating fixed-size memory blocks. This definition has stayed the same since I learned it around 2010. Although you can find more general definitions via Google, those are in the minority.
The APIs of memory pool implementations vary, but they generally look like:
```c
void *mp_init(size_t block_size);        // initialize a memory pool
void mp_destroy(void *mem_pool);         // free everything
void *mp_alloc(void *mem_pool);          // allocate one block from the pool
void mp_free(void *mem_pool, void *ptr); // return the block ptr points to
```
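To make the API concrete, here is a minimal sketch of a growing fixed-size pool behind that interface. This is not the author's kmempool code; the `mempool_t` layout and the 256-block growth unit are illustrative assumptions. It allocates blocks in large chunks and threads a free list through returned blocks.

```c
/* A minimal sketch of a growing fixed-size memory pool behind the mp_*
 * API. Illustrative only; not the kmempool implementation. */
#include <stdlib.h>
#include <stddef.h>

typedef struct mempool_s {
    size_t block_size;     /* size of each block; at least sizeof(void*) */
    size_t chunk_blocks;   /* number of blocks allocated per chunk */
    void *free_list;       /* singly linked list of returned blocks */
    void **chunks;         /* array of malloc'd chunks, for mp_destroy() */
    size_t n_chunks, next; /* next unused block index in the last chunk */
} mempool_t;

void *mp_init(size_t block_size)
{
    mempool_t *mp = calloc(1, sizeof(mempool_t));
    if (!mp) return NULL;
    /* a block must be large enough to hold a free-list pointer */
    mp->block_size = block_size < sizeof(void*) ? sizeof(void*) : block_size;
    mp->chunk_blocks = 256;       /* arbitrary growth unit */
    mp->next = mp->chunk_blocks;  /* force a chunk allocation on first alloc */
    return mp;
}

void *mp_alloc(void *mem_pool)
{
    mempool_t *mp = (mempool_t*)mem_pool;
    if (mp->free_list) {          /* reuse a freed block first */
        void *p = mp->free_list;
        mp->free_list = *(void**)p;
        return p;
    }
    if (mp->next == mp->chunk_blocks) { /* current chunk exhausted; grow */
        void **chunks = realloc(mp->chunks, (mp->n_chunks + 1) * sizeof(void*));
        if (!chunks) return NULL;
        mp->chunks = chunks;
        mp->chunks[mp->n_chunks] = malloc(mp->chunk_blocks * mp->block_size);
        if (!mp->chunks[mp->n_chunks]) return NULL;
        mp->n_chunks++, mp->next = 0;
    }
    return (char*)mp->chunks[mp->n_chunks - 1] + mp->next++ * mp->block_size;
}

void mp_free(void *mem_pool, void *ptr)
{
    mempool_t *mp = (mempool_t*)mem_pool;
    *(void**)ptr = mp->free_list; /* push the block onto the free list */
    mp->free_list = ptr;
}

void mp_destroy(void *mem_pool)
{
    mempool_t *mp = (mempool_t*)mem_pool;
    for (size_t i = 0; i < mp->n_chunks; ++i) free(mp->chunks[i]);
    free(mp->chunks);
    free(mp);
}
```

Note that a freed block is never handed back to the system until `mp_destroy()`; it is recycled by the next `mp_alloc()`, which is where the speedup comes from.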
When you need to allocate or free a memory block of size block_size, you call mp_alloc/mp_free instead of malloc/free.
A memory pool is the preferred choice for implementing a binary tree or a linked list
(including chaining-based hash tables) when deletions are frequent.
Without a memory pool, you would use free()
to deallocate deleted items.
Millions of such calls are slow.
A memory pool effectively buffers the memory used by deleted items
and quickly hands it back when the tree/list needs a new item.
It is usually faster than generic memory management.
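The buffering idea can be shown without a full pool API. The sketch below, for a hypothetical singly linked list of integers, pushes deleted nodes onto a free list instead of calling free(), and recycles them on the next insertion; the names are made up for illustration.

```c
/* Sketch: buffering deleted list nodes on a free list instead of
 * calling free(). Illustrative names, not from the post's test code. */
#include <stdlib.h>

typedef struct node_s { long key; struct node_s *next; } node_t;

static node_t *free_nodes = NULL; /* deleted nodes, kept for reuse */

static node_t *node_new(long key)
{
    node_t *p;
    if (free_nodes) {             /* recycle a deleted node: no malloc */
        p = free_nodes;
        free_nodes = free_nodes->next;
    } else p = malloc(sizeof(node_t));
    p->key = key, p->next = NULL;
    return p;
}

static void node_del(node_t *p)   /* push onto the free list; no free() */
{
    p->next = free_nodes;
    free_nodes = p;
}
```

When insertions and deletions alternate heavily, nearly every node_new() is satisfied from the free list, so malloc is rarely touched after warm-up.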
I have just implemented a simple memory pool in kmempool.h and kmempool.c in <100 lines of code. To see the effect of memory pools, I generated 10 million random 64-bit integers with duplicates and processed them sequentially: I insert an integer into an intrusive AVL tree if it is not in the tree, or delete it if it is already there. In the end, 1.25 million integers remain in the tree. You can find the test code here. The following table shows timing on three different machines, measured by hyperfine:
| malloc | without pool | with pool | speedup |
|---|---|---|---|
| Darwin libc | 8.38 s | 7.21 s | 16% |
| Linux glibc | 8.43 s | 7.53 s | 12% |
| Linux mimalloc | 7.24 s | 6.88 s | 5% |
Note that in this microbenchmark, AVL tree operations are the performance bottleneck. The speedup in memory management alone is much larger than the whole-task speedup shown in the table.
A memory pool is largely an arena specialized for fixed-size allocation. It is much simpler, yet tends to be faster than a general arena on deallocation-heavy workloads. A small data structure worth knowing about.
PS: my memory pool implementation dynamically grows with input. If the maximum pool size is known at initialization, this preprint provides a better algorithm.
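When the capacity is known up front, one common simplification is a single slab allocated at init time with a free list threaded through all blocks. This is a generic fixed-capacity sketch with made-up names, not necessarily the preprint's algorithm (which is more refined):

```c
/* Sketch of a fixed-capacity pool: one allocation up front, a free
 * list threaded through every block at init. Illustrative only. */
#include <stdlib.h>
#include <stddef.h>

typedef struct {
    char *slab;      /* one contiguous allocation for all blocks */
    void *free_list; /* linked through the blocks themselves */
} fixed_pool_t;

static int fp_init(fixed_pool_t *fp, size_t block_size, size_t max_blocks)
{
    if (block_size < sizeof(void*)) block_size = sizeof(void*);
    fp->slab = malloc(block_size * max_blocks);
    if (!fp->slab) return -1;
    fp->free_list = NULL;
    for (size_t i = max_blocks; i-- > 0; ) { /* thread the free list */
        void *p = fp->slab + i * block_size;
        *(void**)p = fp->free_list;
        fp->free_list = p;
    }
    return 0;
}

static void *fp_alloc(fixed_pool_t *fp)
{
    void *p = fp->free_list;
    if (p) fp->free_list = *(void**)p; /* pop; returns NULL when full */
    return p;
}

static void fp_free(fixed_pool_t *fp, void *p)
{
    *(void**)p = fp->free_list;        /* push the block back */
    fp->free_list = p;
}

static void fp_destroy(fixed_pool_t *fp) { free(fp->slab); }
```

Because there is exactly one malloc and one free over the pool's lifetime, both allocation and deallocation are a couple of pointer operations.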
PPS: on Reddit, someone pointed out that variable-size memory pools are more common than I thought. So even the definition of memory pool is controversial.
In general, the terminology around memory allocation is a holy mess. In my experience, more sources define a memory pool to have fixed-size blocks. What some developers call pool allocators, others call arena allocators, which many in turn treat as synonymous with linear allocators. In malloc implementations, however, allocation within an arena is far more capable than linear allocation. In glibc, for example, an arena contains heaps, with each heap being a contiguous region. In mimalloc, on the other hand, a heap means thread-local storage. Both heap definitions probably also differ from what we typically mean by the word. It is too late to sort all this out. Just be aware that the same term can mean different things depending on the context.