As I understand it, the core of the Rust GC support functionality will be a function that looks like fn gimme_roots() -> Vec<..>
or fn walk_roots<F>(f: F) where F: Fn(..)
. This can be implemented via LLVM stack roots and then walking them.
The first question is: What does the compiler identify as a root? On the flipside, how do GC library writers tell the compiler that something should be included in walk_roots
?
(this solution is flawed, skip to the trace trait one if you want)
One way to do this is an attribute, #[root]
or something.
#[root]
struct Gc<T> {..}
If, for example, I have a struct Foo {x: Gc<T>, y: u8}
on the stack, walk_roots
will pick up x
.
But this design doesn't take into account heap objects. We could declare that putting a GCd object on the heap is invalid, create our own BoxGC<T>
and VecGc<T>
objects that are counted as roots, and introduce a !Gc
bound on Box and Vec.
This solution feels restrictive and hacky to me.
An alternative in this model is to treat pointers as special, if we notice a *const <thing-that-contains-gc>
transitively on the stack, stick extra information in the stack frame about the offset of the GC object or some such. This might get cumbersome, especially with double-pointers and such. It also will not distinguish between "borrow" pointers and "owning" pointers (we could involve Unique if we wished), though this will only make it a bit more conservative than it needs to be, which is ok.
This scheme also doesn't understand vectors. :(
It's pretty clear at this point that we need some way to include arbitrary functions which tell you how a GCthing is to be traversed. The best thing to do here is a trait, which I'll call Trace<T>
:
trait Trace<T: Tracer> {
fn trace(&self, tracer: &mut T);
}
trait Tracer {} // marker
fn walk_roots<T: Tracer>(tracer: &mut T) { /* dark llvm magic */ }
We have two traits for object safety reasons; we want to be able to do Box<Trace<MyTracer>>
to use trait objects with the GC. The : Tracer
bound could be replaced with a Fn
bound
Tracer itself is a trait because we don't want to restrict GC implementations. Multiple GC implementations may exist, each with their own Tracer
. Trace will be implemented generically on most types (impl<T> Trace<T> for ...
), however GC implementations will only implement the specific Trace
on their GC (impl<T> Trace<MyTracer> for MyGc<T>
).
When something that implements Trace
is encountered whilst compiling, the stack frame gets two entries. One has an offset pointing out where the object is, the second has function pointer to the Trace
implementations for each type of tracer being used in the program. (It could also have vtable offsets for each type of tracer being used)
Instead of the function pointer, we could store a single vtable offset (and structure all Trace vtables the same -- this bloats the vtable more, but bloats the stack less).
Note that we're actually using the Trace
trait to traverse roots, which is a distinct operation from tracing, though it is performed in a similar way. Some GC implementations may need two tracers, one for rooting and one for tracing. rust-gc handles this by having root()
and trace()
methods on Trace
, which both walk the struct but do different things when they come across a Gc<T>
.
Trace<T>
could be auto-derived, relying on the optimizer to get rid of trace
calls to structs not actually containing Gc<T>
.
An improvement on this is an OIBIT-like scenario, where Trace
is autoimplemented in most cases. However, OIBITs can't have methods. Something like this would be nice:
unsafe trait Trace<T> {
default fn trace<T>(&self, tracer: &mut T) {
intrinsics::trace_fields::<T>(&self, &mut T)
}
}
unsafe impl Trace for .. {}
/// GC code
unsafe impl<T: Trace<MyTracer>> Trace<MyTracer> for Gc<T> {
fn trace(&self, tracer: MyTracer) {
// ...
}
}
// should this be allowed?
// it may have coherence implications (??)
impl<T> !Trace<MyTracer> for RefCell<T> {}
This of course doesn't work right now, but would be a neat way to extend OIBITs. Furthermore, this lets us optimize and only look for types which directly implement Trace
.
Whether or not we want to allow negative impls is another question. rust-gc needed to be able to exclude RefCell
from the set of types allowed inside Gc<T>
, which it did by not implementing Trace
on that type. With the rust GC hooks, GC crates will no longer have this liberty since Trace
will be auto-implemented by derive or OIBIT. In the case of derive there isn't any solution here, but with OIBITs we can allow negative impls. However, negative impl'ing a specific instance of a generic type sounds like something that would break coherence badly. Code in parent crates can't assume that Foo<T>: Trace<U>
if T: Trace<U>
, etc -- we could disallow using Trace<U>
(over a concrete Trace<MyTracer>
) in bounds outside of libstd, but that might be too restrictive?
In the case of rust-gc, the "negative impl" of Trace
on RefCell
was necessary because we manually kept track of roots and refcell lets you undermine this tracking. With the rust GC hooks this will no longer be an issue, since you don't need to manually keep track of roots. However, for safety reasons it might still be necessary to exclude types like refcell to avoid mutable aliasing whilst tracing.
If we are to stick to pay-for-what-you-need, the above designs by default will not work; since most types will be Trace
, and the compiler will stick trace functions in the stack for all of them, which will bloat the stack massively.
We need to modify compilation to be aware of the GC types being used and affect the monomorphizations accordingly. Non-generic functions will contain GC stack maps if they use a type that indirectly contains something with a nontrivial Trace
implementation. Most types will have an effectively empty Trace
implementation (which we know because of the OIBIT) so most of these will go away.
There still is a bit more optimization that would need to be done -- types like Box<T>
and Vec<T>
have a nontrivial Trace
implementation, but most instances won't actually need it since T
won't have a nontrivial trace implementation. Since Box<T>
and Vec<T>
are used in libstd, not performing this optimization means that we must distribute libstd-with-gc binaries or have GC users recompile libstd with GC.
I'm not sure how this optimization is to be done. We could have an attribute (or OIBIT) #[gc]
, which is used to mark types which the stack map creator should care about. Types which are transitively #[gc]
(e.g. Box<type which is gc>
) are all put into the stack map.
There is a sort of distinction between Trace types that's cropping up here (and crops up later too):
- Types like
Gc<T>
which actually need tracing. They have a non-default trace impl, which is always important. They know what a GC type is, and can mess with it. This includes types which explicitly contain aGc<T>
(transitively). - Types like
Box<T>
, which sometimes need tracing. They may have a non-default trace impl too, but in many cases that trace impl is unnecessary ifT
itself has a nontrivial trace impl. They do not know what a GC type is, they only know what the operation of tracing is and faithfully pass through this operation. This includes types which don't contain any GC types at all (and have no type parameters) and thus have an effectively empty Trace impl.
Basically, the distinction is between "types that know they contain a GC" and "types that may or may not contain a GC".
The former category does not exist in libstd. The latter does, but the latter category is something that can be sorted out during monomorphization.
A major problem is destructors. If I have a cycle made from struct Foo {next: Gc<Foo>}
, and Foo itself has a destructor which reads next
, we have a problem. At some point whilst collecting the cycle, the struct being collected will have its destructor run, but its next
value will already have been collected, causing a segfault.
Solutions:
GC implementations could make this a panic, but that means they have to have a deref()
method that isn't actually a deref, which will probably hurt performance a lot.
(design by eternaleye)
It's actually impossible to create a cycle of struct Foo {next: Gc<Foo>}
. Rust doesn't support usable uninitialized types, so at some point while creating this you need to be able to refer to a value you haven't initialized yet. Usually, this is done with struct Foo {next: Option<Gc<Foo>>}
or some such. This leads to an observation that there's always a way to detach a cycle safely. You then add an unsafe detach
method which is implemented by all Trace things, with a nontrivial implementation on anything that can be detached (Some -> None for option, emptying a vector, etc). This is called on every object that is slated to be collected before the action
This is an interesting design, but it puts a lot of burden on the library user, which I don't think we want. If one was to use this, it would anyway be better to use a cycle collector.
We could make it so that an explicit trace
impl (and types containing these) are not allowed to have Drop
impls. This is overly restrictive, since it catches Box
and Vec
too.
We need a way to distinguish between types that know they contain a GC, and types which are oblivious to the GC they contain, as discussed in the previous section. We can then forbid Drop
impls on types that contain a GC.
This may still be overly restrictive. It would be nice if we could structure it so that GC implementors can decide to use this strategy for their GC types only. I'm not sure if this is possible without major specific changes to the type system.
mem::forget
is already safe. So API authors already assume that Drop::drop
may or may not get called (but will be called at most once). This is reminiscent of finalizers in GC languages, which may or may not be called, and should not be relied upon for cleanup.
In this situation, Rust does not need to change its type system at all. GC implementors are required to deal with this. A mark-and-sweep collector will have two mark cycles, and a finalizer cycle. After the first mark cycle, a finalizer will get called: each unmarked Gc<T>
will have Drop::drop(&mut *gc)
called (the type system doesn't let you call this directly, but there are intrinsics for this iirc -- if not we can add them). Now, you run the mark phase again, and sweep any Gc<T>
s (mem::forget
ing their interiors) that were marked in the second phase. Drop::drop
may alter the graph, making some reachable objects unreachable and vice versa. We need to avoid collecting the latter. We do collect the former -- they did not get a run of Drop::drop
however, so they may leak internals. This can't be avoided in any finalizer design that wishes to halt.