✔️ /❌ = satisfied/not satisfied by std.experimental.allocator
's allocator
interface
- ✔️ Enable generic containers that work with any allocator
- ✔️ User-defined allocators
- ❌ Combine
@safe
and@nogc
- ✔️ Composable allocators
- ✔️ Runtime polymorphism (
IAllocator
) - ✔️ No exceptions
- ✔️ BetterC compatibility
- ✔️ Replace direct calls to
new
/malloc
/etc. in existing code. - ✔️ Configurable global allocator.
D is a general-purpose programming language that aims to support a wide variety of use cases, from low-level system programming to high-level scripting.
Phobos's current container library, std.container
, fails to support many of
these use cases, largely because of its hard-coded dependency on the druntime
GC for memory allocation. As a result, we see a proliferation of container
libraries on code.dlang.org, many of which are incomplete or receive little
maintenance.
Enabling the creation of generic containers that work with any allocator would allow this shortcoming to be remedied.
Different approaches to memory allocation are suitable for different applications. To support as many use cases as possible, users must be able to define their own allocators.
Users of std.experimental.allocator
's allocator interface (and of libraries
like emsi_containers
that depend on it) are currently forced to choose
between @safe
and @nogc
. They should not have to.
The ability to define a new allocator through composition of reusable components vastly reduces the amount of code necessary to create complex allocators, or to experiment with different approaches to memory allocation.
While this is useful, few users are likely to take advantage of it, and the same results can be achieved by other means, so it is only "nice to have," not "must have."
Runtime polymorphism of allocators allows users of generic containers to write
code using a single concrete type, like Array!(int, IAllocator)
, rather that
dealing with a wide variety of types that differ only by choice of allocators
(e.g., Array!(int, Mallocator)
, Array!(int, GCAllocator)
, etc.).
While this is convenient, Phobos already has a feature for solving this
problem: ranges. Specifically, the interface types in std.range.interfaces
allow any container that implements the appropriate interface to be used by
code that accepts the concrete type InputRange
(or ForwardRange
,
RandomAccessRange
, etc).
Ranges do not cover every possible use case for allocator polymorphism, but they cover enough to demote this goal from "must have" to "nice to have."
While the choice of whether to use exceptions or not ultimately rests with the author of each individual allocator, the allocator interface should be designed to accomodate both choices.
Ideally, the allocator interface should not require the use of exceptions unless memory safety cannot possibly be guaranteed any other way. Since memory safety is a "must have," that makes this goal "nice to have."
Again, the choice of whether to depend on druntime rests ultimately with the author of each individual allocator, but the interface itself should be possible to use in BetterC.
The features necessary to make the allocator interface @safe
will also make
it significantly more cumbersome to use. Code that is already using a specific
allocator, like new
or malloc
, will not be able to easily adopt the new
interface, and likely would not benefit from doing so.
Global state is generally considered a code smell, so a compelling use case would be necessary to justify the inclusion of a configurable global allocator.
In practice, there is no evidence that such a use case exists.
std.experimental.allocator
provides both theAllocator
and
processAllocator
, but nobody uses them.
Allocation is inherently memory-safe; only deallocation (and by extension reallocation) has the potential to cause memory corruption or UB.
Since we want to support generic containers and user-defined allocators, we
cannot ask containers to use @trusted
when callling deallocate
--the
deallocate
method itself must be @safe
(or @trusted
). See this forum
thread for more detailed discussion.
For an allocator to free a block of memory safely, without GC, the following conditions must be met:
- Uniqueness: there must not be any other references to the block.
- Liveness: the block must not have been deallocated already.
- Matching: the block must have originated from the allocator's
allocate
method.
To make deallocate
@safe
, the allocator API must be designed such that
@safe
code which would violate these conditions does not compile.
The use of void[]
for blocks in std.experimental.allocator
's allocator
interface makes all 3 of the above conditions impossible for the compiler to
enforce.
- A
void[]
block can be freely copied. - Multiple copies of the same
void[]
block can be passed todeallocate
. - Any
void[]
block can be passed to anydeallocate
method, no matter where it originated.
This means that we cannot simply make std.experimental.allocator
@safe
by
changing its implementation. The allocator interface itself must be redesigned.
std.experimental.allocator
's allocator interface does not have any features
that depend on D's runtime, and much of its implementation is, in principle,
compatible with BetterC already. It should not present much challenge to ensure
that a new version of the allocator interface remains BetterC-compatible.
However, because std.experimental.allocator
is part of Phobos, and is
compiled into the Phobos shared library (libphobos.so
or equivalent), many
essential parts of it are excluded from BetterC programs at the linking stage.
So in spite of being BetterC-compatible in principle, it is not usable in
BetterC in practice.
The subject of this document is the design of a new allocator interface. Since the challenges here are implementation challenges, not interface design challenges, they will not be discussed further.
To guarantee uniqueness, we can replace void[]
blocks with a non-copyable
wrapper type:
struct Block
{
@system void[] payload;
@system this(void[] payload) { this.payload = payload; }
@disable this(ref inout Block) inout;
}
A @system
variable is used for payload
to prevent @safe
code from
creating untracked references to the block's memory, and a @system
constructor is used to prevent @safe
code from creating a Block
that
aliases an existing void[]
.
To allow temporary, controlled access to the memory in @safe
code, we can use
the "borrow pattern":
auto borrow(alias callback)(ref Block block)
{
scope void[] tmp = null;
() @trusted { swap(block.payload, tmp); }();
scope(exit) () @trusted { swap(block.payload, tmp); }();
return callback(tmp[]);
}
Swapping the payload with a null slice for the duration of the borrow maintains the invariant that there is only ever one reference to the block's underlying memory.
With uniqueness established, all that's necessary to guarantee liveness is to
ensure that the single, unique Block
referring to an allocation cannot be
used again after it's been successfully deallocated.
The simplest way to do this is to have deallocate
take its Block
argument
by reference, and overwrite it with the null Block
upon successful
deallocation:
void deallocate(ref Block block);
To check for a successful deallocation, instead of checking a boolean return
value, we now compare the block to Block.init
:
Block block = someAllocator.allocate(123);
// do stuff with block...
someAllocator.deallocate(block);
if (block == Block.init)
{
// deallocation was successful
}
else
{
// deallocation failed
}
Since checking for null is a common operation, let's add a helper method to make it less verbose:
struct Block
{
// ...
bool isNull() => this == typeof(this).init;
}
There are two cases to consider here:
- Allocators that implement
owns
. - Allocators that don't implement
owns
.
In case (1), guaranteeing matching is easy: we can have deallocate
call
owns
internally. For example:
void deallocate(ref Block block) @safe
{
if (block.isNull) return;
if (!this.owns(block)) assert(0, "Invalid block");
// perform actual deallocation...
}
Note that this requires owns
to always return true
or false
, not
Ternary.unknown
. In practice, this requirement is met by every allocator in
std.experimental.allocator
that implements owns
.
Case (2) requires a bit more work.
Fundamentally, in order to guarantee matching of allocations and deallocations,
we have to be able to answer the question, "did this block come from this
allocator?" Since this is the same question answered by owns
, it should be
clear that any method we come up with to answer it can also be used to
implement owns
.
The solution, then, is to find some way of implementing owns
for any
allocator, even one that doesn't support it "natively."
If possible, we would also like to do this with minimal runtime overhead, and
to avoid penalizing allocators that already implement owns
.
Let's start with an easy example: Mallocator
. Because Mallocator
only has a
single global instance, implementing owns
only requires us to associate one
bit of information with each block: was this block allocated by a Mallocator
,
or by something else?
One way to do this, without incurring any runtime overhead, is to give
Mallocator
its own block type:
struct MallocatorBlock { /* ... */ }
struct Mallocator
{
MallocatorBlock allocate(size_t n) @trusted
{
if (n == 0) return MallocatorBlock.init;
void* p = malloc(n);
if (p) return MallocatorBlock(p[0 .. n]);
else return MallocatorBlock.init;
}
void deallocate(ref MallocatorBlock block) @trusted
{
if (block.isNull) return;
if (!this.owns(block)) assert(0, "Invalid block");
free(block.payload.ptr);
block = MallocatorBlock.init;
}
// ...
}
If the only way to get a MallocatorBlock
is from Mallocator.allocate
, then
only blocks allocated by Mallocator
can be passed to Mallocator.deallocate
.
In other words, allocations and deallocations are guaranteed to match.
Because this is enforced at compile time, by the type system, owns
is trivial
to implement:
struct Mallocator
{
// ...
bool owns(ref MallocatorBlock block) @safe
{
return !block.isNull;
}
// ...
}
If every allocator we wished to support were either global or capable of
implementing owns
without assistance, then custom block types alone would be
sufficient to guarantee matching. And in practice, most real-world allocators
do indeed fall into one of these two categories--including every allocator
currently provided by std.experimental.allocator
.
Nevertheless, it's worth considering what it would take to implement owns
for
a non-global allocator that lacks it out of the box. If it's easy to do, then
we might as well; and if it's difficult, we'll have a good reason not to
bother.
As it happens, there is at least one real-world example of such an allocator: the Win32 private heap API.
To implement owns
for this allocator, we need enough information to uniquely
identify a specific instance of it, and we need to somehow associate that
information with each block it allocates.
Broadly speaking, there are two ways we could do this. One is to add the necessary information directly to a custom block type:
struct PrivateHeapBlock
{
// ...
@system HANDLE handle;
}
struct PrivateHeap
{
@system HANDLE handle;
bool owns(ref PrivateHeapBlock block) @trusted
{
return !block.isNull && block.handle == this.handle;
}
// ...
}
According to the Win32 API docs, each private heap is uniquely identified by
the handle returned from HeapCreate
upon its creation. So, if we extend our
custom block type to store the handle of the heap that allocated it, we can
implement owns
by simply checking whether the handles match.
The advantage of this approach is that it makes owns
extremely cheap to
execute. The disadvantage is that it increases the size of the block by 50%,
and this storage overhead will be imposed on any code that uses PrivateHeap
to allocate memory.
The other approach is to use an auxilliary data structure to keep track of the association between blocks and heaps:
struct PrivateHeap
{
@system bool[Block] ownedBlocks; // AA used as a set
bool owns(ref Block block) @trusted
{
return !block.isNull && block in ownedBlocks;
}
// ...
}
The advantage of this approach is that it does not require a custom block type,
and does not impose any storage overhead on outside code. The disadvantage is
that it adds significant storage overhead to PrivateHeap
itself, and makes
owns
a more expensive operation.
Interestingly, neither approach requires any changes to our proposed allocator
interface beyond what's already been done to support Mallocator
. In
particular, once we've decided to allow custom block types, we get the ability
to store extra data in them "for free."
-
D's allocator interface cannot be made
@safe
and@nogc
without a full redesign. -
The new design must replace
void[]
with a block type that maintains the invariants necessary for safe deallocation. -
Allowing allocators to define custom block types will allow all allocators to support
@safe
deallocation with no more runtime overhead than is strictly necessary.
@p0nce
The goal of this project is to design a
@safe
allocator interface that enables the creation of generic containers. If you don't care about@safe
and have no interest in generic containers, then yes, this project has nothing to offer you.Personally, I think
@safe
generic containers that are compatible with@nogc
and BetterC would be a valuable addition to D's standard library, so I'm going to continue working towards that goal.