hotsphink/gc-embed.md

## gc-embed.md

      
    Raw
  

              gc-embed.md
            
          
    Embedding Spidermonkey GC: Managing pointers into the GC heap

When you have GC pointers (pointers to JS objects, strings, etc.; anything stored directly in the GC heap), you need to handle them specially if they could ever be alive (still valid/reachable) during a collection. I'll first describe what you need to do, then how to do it, and finally why and what happens if you get it wrong.


tracing - strong GC references (eg a struct member containing a GC pointer) must be traced so that the GC knows what to keep alive.


pre-write barriers aka delete barriers - if you overwrite a GC pointer, you'll need to let the GC know.


post-write barriers - after storing a GC pointer somewhere, you need to inform the GC of the new value (or more specifically, the address of the new value or the address of something it knows how to scan to find the value.)


read barriers - some GC pointers have different rules that make the above 3 mechanisms inadequate, and must be read-barriered instead (or in addition). In Gecko, these are GC pointers that participate in cycle collection (i.e., it is possible for a cycle to pass through these GC cells), or weak GC references. You need to inform the GC whenever you read one of these.


How to follow the rules

Tracing can be either unconditional or conditional. Unconditional tracing (or "rooting") is when you know that something is live for reasons other than it being pointed to by the GC heap. Conditional tracing is when you know that something is live because its containing GC thing is live. The two require different mechanisms.
rooting

stack rooting

If a GC pointer of type T is on the stack (eg it's in a local variable), hold onto it with a Rooted<T>, an RAII class that will add it to a list of things to trace on demand -- it will be traced if and only if there is a GC within that Rooted's lifetime. If the type T is not a GC cell subtype, you must also define a trace() function on that class/struct (or the class/struct it points to); see "regular tracing", below. Rooted<T>s must nest properly; if that is not doable (eg due to Maybe<> or similar), use PersistentRooted<T> instead. It's a little slower, may be used on the heap as well as the stack, and is much riskier in terms of accidentally keeping too much alive. (PersistentRooted is unconditionally traced, so it will keep things alive as long as it is alive, regardless of whether its container is alive.)
When using Rooted<T>, you will want to use Handle<T> as a parameter type. Handle<T> is essentially just T* (so Handle<JSObject*> is JSObject**), but wrapped in a C++ class so that the type system knows it points to a rooted pointer. Return values should be plain T; they will need to be stored somewhere (often in a Rooted<T>) before doing anything that might GC. Locals are Rooted<T>, parameters are Handle<T>, and return values are T. Pointers to rooted values that you will want to change -- outparams, basically -- are MutableHandle<T>.
heap rooting

Register a callback with the GC to be invoked at the beginning of a collection. In this callback, trace everything you know to be live. (Typically, this will involve iterating through your own (non-GC-heap) data structures to trace the GC pointers found therein.) For simple cases, use PersistenRooted<T> fields in your data structures; no callback is needed, as the GC knows to trace all PersistentRooteds itself. Be sure these are true roots; the containing structure must be known to be alive.
Various callback registration APIs are available depending on exactly which phase of GC requires your roots to be known:

JS_SetGCCallback
JS::SetGCSliceCallback
JS::SetDoCycleCollectionCallback
JS_AddExtraGCRootsTracer (and JS_RemoveExtraGCRootsTracer)

and for weak references

JS_AddWeakPointerZonesCallback / JS_RemoveWeakPointerZonesCallback
JS_AddWeakPointerCompartmentCallback / JS_RemoveWeakPointerCompartmentCallback

tracing

Use JS::TraceEdge to trace each GC pointer. Note that this takes a Heap<T>* or TenuredHeap<T>* because you nearly always need to barrier your GC pointers too. In the cases where this is not appropriate, you may need UnsafeTraceRoot or UnsafeTraceManuallyBarrieredEdge.
For weak pointers, call JS_UpdateWeakPointerAfterGC for Heap<JSObject*> or JS_UpdateWeakPointerAfterGCUnbarriered for JSObject** during your callback (registered with JS_AddWeakPointer{Zones,Compartment}Callback).
barriers

Store your GC pointers in a Heap<T> to have the appropriate write barriers fire automatically. If you know that what you are storing will never be nursery-allocated, TenuredHeap<T> will be a little faster because it can skip post-write barriers.
TODO read barriers
collections

TODO (GCVector, GCHashMap, GCHashSet)
Putting it all together (examples)

TODO
How/Why

tracing

The GC needs to know what is alive and what isn't, and figures that out by starting from a set of roots and tracing through everything reachable in the heap. Also, both generational and compacting GC will move cells, so all pointers into and within the heap may need to be updated.
pre-write barriers

Pre-write barriers (deletion barriers) are for incremental GC. The GC needs to trace through the entire heap, but it's mutating as we traverse it. We start from a set of roots and advance methodically, but if the graph is mutated so that we can't get to part of the live heap, we'll incorrectly discard the portion we can't get to. (eg you have an object graph A -> B -> C. You trace A, discovering B and pushing it onto the mark stack. Then you run some more code, which removes C from B and sticks it on A instead. Now when you trace B, you won't find C, and you'll never go back to A because you've already traced it. C will get thrown out even though it is still reachable from A.)
But to be part of the live heap, you must have come from somewhere. The only possibilities are things that were allocated since we started the incremental collection, and things that have been read out of a different part of the heap. We handle the first by saying that everything you allocate since the start of the incremental GC is live. We handle the second by observing that things you read out of the heap will eventually get marked, unless the pre-existing edge to them is removed from the graph (probably by being overwritten with something else). So whenever we remove an edge (i.e., write to a location in the heap), we immediately mark the cell being removed.
post-write barriers

Post-write barriers (insertion barriers) are for generational GC. The GC wants to be able to throw out garbage in the nursery quickly, by only tracing live cells (we want nursery collection time to be proportional to the number of live cells, independent of the number of garbage cells). We can do this if we know all roots pointing into the nursery, as well as any pointers from the tenured heap into the nursery. Roots are handled by tracing all roots. Tenured -> nursery pointers are the job of the insertion barrier: whenever you write into a tenured cell, we check if that cell is in the nursery, and if so add it to a "store buffer". When doing a minor (nursery) collection, we trace through all roots, followed by everything in the store buffer, recursing until we encounter a tenured pointer or run out of work.
read barriers

cycle collector barriers (unmark gray)

The cycle collector (CC) needs to be able to detect garbage cycles that pass through the GC heap. After marking all live objects black, we do another traversal from "gray roots" to mark everything that is reachable and non-black as "CC-gray" (GC literature commonly uses "gray" to refer to the frontier between the unmarked and marked portions of the GC heap; this is different.) Then when looking for cycles, the CC will traverse only CC-gray objects in the GC heap.
But a problem occurs if a CC-gray object is read out of one place in the heap and inserted as the child of a live object, or perhaps stored into some C++ (non-GC) object that is reachable from a regular root. We do not barrier such writes, so the CC could trace through this object, find out that it is a member of a cycle that is the only thing holding the cycle pieces alive, and free it. Freeing things that are reachable from live things is bad. To prevent this, we barrier all reads from within Gecko. If we read a CC-gray object, we un-gray it (recursively).
weak references

By definition, a weak reference should not hold a target live by itself. That means that we can't trace weak references when we encounter them during GC graph traversal. Instead, we append them to a list. After marking everything live we scan through the list to null out references to garbage.
But this also means that pre-barriers are inadequate to prevent incremental GC from missing edges due to graph mutation. Live cells will be marked either because they're reachable or because they were marked by a pre-barrier before being removed from the graph. But reachable weak references aren't traced! So you could read one out and write it behind the frontier and no pre-barriered removal will mark it (it's not being removed, just read). So we barrier the read instead; if you ever read a weak reference in the middle of an ongoing incremental GC, then the target of the weak ref will be marked.
Implementation

Rooted<T>

Rooted<T> needs to know how to trace its contained T. For builtin types (eg JSObject*, JS::Value), it uses JS::MapTypeToRootKind<T> to look up an appropriate root sublist to insert onto, and then at collection time it will perform the appropriate action for each sublist.
For anything else, JS::MapTypeToRootKind<T>::kind will be set to JS::RootKind::Traceable, in which case the root lists will store a trace function alongside each rooted value. The trace function will be determined by the relevant specialization of JS::GCPolicy<T>, specifically it will be JS::GCPolicy<T>::trace. You can specialize this however you like for your own types, though usually you will want to stick with the default, which is to invoke a trace(JSTracer*) function on your struct.
TODO: trace through templates and calls for other GCPolicy users? Sweeping?
Performance Notes

Rooted<T> is cheap. It is a singly-linked list append. Most of the time, you will not actually GC within its scope, so the total cost is one insertion and one deletion.
Pre-barriers are only needed when in the midst of an incremental collection, so most of the time they should be a single predictable branch.
Post-barriers add records to the store buffer, and come in two main flavors: edge records and whole cell records. A whole cell record means you just store the containing object, and trace the entire thing looking for nursery pointers (eg if it's a JSObject, you'd trace all of its properties.) This is simple, fast, and small to store (a single pointer), but tracing a stored object could be slow for large objects.
Edge records store the addresses of locations containing GC pointers, and are given both the previous and new values. If the previous value was already a nursery thing, they don't need to do anything (we know we've already stored that address). To save space and speed minor GC, if you replace a nursery pointer with a tenured pointer, the address will be removed from the store buffer. In practical terms, that means the common case is looking up whether the old and new values are in the nursery or not, which is done by masking the address to get to an arena, then loading in a footer. (Both the nursery and the tenured heap use arenas, and store a bit in their footer telling you which is which.) So an edge post-write barrier costs at a minimum two memory reads, probably far enough from the pointers that they won't be on the same cache line, plus a few branches. In the uncommon case, the store buffer will then need to be updated (for insertion or removal.)