asajeffrey/uprooting.md

## uprooting.md

      
    Uprooting

We root way too much in DOM. Let's not do that.
Magic DOM might fix a lot of problems here, and we're not certain what the performance hit of this is, so we shouldn't really do anything about this until after that. This is more of a dump of all the ideas that came up in this discussion.
Low hanging fruit: Escape analysis

There are a lot of .root()s left over from the pre-dereffable-JS<T> era. These values are transitively rooted and aren't being moved, so just using Deref should work. Simple escape analysis can catch these.
Interior mutability

Both the Cell (copying, no interior references) and RefCell (interior references with manual checks) models are compatible with transitive rooting/Dereffable JS<T>. (We whiteboarded this and couldn't find any flaw in it.)
Currently MutHeap is a Cell, but it must root on .get(). This is space efficient, but introduces more unnecessary roots for transitively rooted things. A .get() -> JS<T> API is unsafe, however, and .get() -> &T introduces interior references which can cause unsafety.
For MutHeaps which can afford to pay the cost of an extra flag, we can avoid rooting in these situations.
The idea is to introduce two MutHeaps:
MutHeapCell<T>:

Identical to current MutHeap
.get() -> Root<T>, roots every time
.set(&T)
thin wrapper around T
basically Cell<T>+rooting

MutHeapRefCell<T>:

Identical to RefCell<JS<T>> (?)
.borrow() -> &JS<T>
.borrow_mut() -> &mut JS<T> (It is important to return &mut JS<T> rather than &mut T so that we can manipulate the tree. This is safe and won't invalidate other transitively rooted things, since the RefCell's runtime check rules out any other interior reference through the same path.)
Has extra field. Can't be used in common structs like Node.
Has runtime check overhead. Ew.
No rooting

We use the MutHeaps in different places depending on the situation.
It's very unclear if the tradeoff is worth it here.
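To make the tradeoff concrete, here is a rough sketch of the two shapes using plain `std::cell` types. Everything in it is a stand-in: `Js` is a bare index rather than a real GC pointer, `Root` only bumps a thread-local counter, and `roots_created` is a hypothetical helper for observing rooting cost. The real `.set` takes `&T` and the real `.borrow()` is described above as returning `&JS<T>`; this sketch uses `Js` values and `Ref`/`RefMut` guards instead.

```rust
use std::cell::{BorrowMutError, Cell, Ref, RefCell, RefMut};

thread_local! {
    // Hypothetical stand-in for the root set: counts roots created on this thread.
    static ROOTS_CREATED: Cell<usize> = Cell::new(0);
}

/// Number of `Root`s created so far on this thread.
pub fn roots_created() -> usize {
    ROOTS_CREATED.with(|c| c.get())
}

/// Stand-in for an unrooted `JS<T>` pointer (just an index here).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Js(pub usize);

/// Stand-in for `Root<T>`: creating one is the cost we want to avoid.
pub struct Root(pub Js);

impl Root {
    fn new(ptr: Js) -> Root {
        ROOTS_CREATED.with(|c| c.set(c.get() + 1));
        Root(ptr)
    }
}

/// Cell flavour: thin wrapper, copies the pointer out and roots on every `get`.
pub struct MutHeapCell {
    inner: Cell<Js>,
}

impl MutHeapCell {
    pub fn new(value: Js) -> Self {
        MutHeapCell { inner: Cell::new(value) }
    }
    pub fn get(&self) -> Root {
        Root::new(self.inner.get())
    }
    pub fn set(&self, value: Js) {
        self.inner.set(value)
    }
}

/// RefCell flavour: no rooting, but an extra borrow flag and a runtime check.
pub struct MutHeapRefCell {
    inner: RefCell<Js>,
}

impl MutHeapRefCell {
    pub fn new(value: Js) -> Self {
        MutHeapRefCell { inner: RefCell::new(value) }
    }
    pub fn borrow(&self) -> Ref<'_, Js> {
        self.inner.borrow()
    }
    pub fn borrow_mut(&self) -> RefMut<'_, Js> {
        self.inner.borrow_mut()
    }
    pub fn try_borrow_mut(&self) -> Result<RefMut<'_, Js>, BorrowMutError> {
        self.inner.try_borrow_mut()
    }
}
```

In this model the Cell flavour pays one root per read, while the RefCell flavour pays in size (the borrow flag) and in a check that rejects a mutable borrow while a shared borrow is still alive.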
Parsing

We root twice in parsing. As long as the tree is constructed in preorder and the Document is rooted this should be fine.

get_or_create in servohtmlparser.rs, called by append in parse.rs (removed by Ms2ger)
create_element in parse.rs

Rooting into javascript

Often we create a value, root it, and directly return it to JavaScript. The root is wasted work, since the value is handed over immediately.
One common occurrence for this is constructors. We root in Constructor() via new(), and the root is immediately returned. Perhaps we should have separate new() and new_unrooted() (where the latter is used by constructors)? Ensuring that new_unrooted() is called last can be done by a lint.
We can similarly reduce rooting in methods whose return value goes straight back to JS. This can be done with an unwrap_ptr() method that converts an &JS<T> into a JS<T>, only allowed at the last position of a DOM method. We also forbid DOM methods from being called directly by Rust code without rooting.
Destructors can cause problems with this design since they run after the return value is computed.
fn SomeDOMMethod() -> JS<Foo> {
    let bomb = something_rooted;
    let x: &JS<Foo> = &bomb.x;

    unwrap_ptr(x)
    // bomb unroots, triggers GC, GC now thinks x is unreachable.
    // boom.
}

(not sure if this can be made safer with lints. Probably.)
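The ordering problem can be observed in plain Rust, without any GC. The sketch below is only an illustration: `RootGuard`, the event log, and the `u32` return value are hypothetical stand-ins for a real root, a real GC, and a real `JS<Foo>`. It records when the return value is computed and when the guard's destructor runs.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Hypothetical stand-in for a root whose destructor unroots (and may trigger a GC).
struct RootGuard {
    log: Log,
}

impl Drop for RootGuard {
    fn drop(&mut self) {
        self.log.borrow_mut().push("guard dropped (unroot, GC may run)");
    }
}

/// Stand-in for `unwrap_ptr`: produces the unrooted return value.
fn unwrap_ptr(log: &Log) -> u32 {
    log.borrow_mut().push("return value computed");
    42
}

fn some_dom_method(log: &Log) -> u32 {
    let _bomb = RootGuard { log: log.clone() };
    unwrap_ptr(log)
    // `_bomb` is dropped here, after the return value already exists.
}

/// Runs the method and returns the events in the order they happened.
pub fn event_order() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let _unrooted = some_dom_method(&log);
    log.borrow_mut().push("caller receives value");
    let events = log.borrow().clone();
    events
}
```

The guard's destructor runs between computing the value and the caller receiving it, which is exactly the window in which the unrooted pointer is exposed.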
Rooting constructors

For constructors, NEWBI<- (example) might be useful.
The rough idea is to ensure that no code can run between the constructor returning and the object being made reachable from a root. What we're trying to avoid is cases like:
fn bad(thing1: &Thing) {
    let thing2 = Thing::new();
    code_which_may_call_gc();
    *thing1.child.borrow_mut() = thing2;
}

The problem is that thing1 is reachable, but thing2 isn't reachable until we make it a child of thing1, so it might get GCed between being created and being attached.
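A toy mark-and-sweep heap makes the gap visible. This is a minimal sketch under heavy assumptions: `Heap`, its single-child objects, and the explicit `gc()` call standing in for `code_which_may_call_gc()` are all invented for illustration and bear no relation to how SpiderMonkey actually collects.

```rust
/// Toy heap: objects are indices, each with at most one child.
pub struct Heap {
    children: Vec<Option<usize>>,
    live: Vec<bool>,
    roots: Vec<usize>,
}

impl Heap {
    pub fn new() -> Self {
        Heap { children: Vec::new(), live: Vec::new(), roots: Vec::new() }
    }

    /// Allocates a new, unrooted object and returns its index.
    pub fn alloc(&mut self) -> usize {
        self.children.push(None);
        self.live.push(true);
        self.live.len() - 1
    }

    pub fn root(&mut self, id: usize) {
        self.roots.push(id);
    }

    pub fn set_child(&mut self, parent: usize, child: usize) {
        self.children[parent] = Some(child);
    }

    /// Marks everything reachable from the roots, then sweeps the rest.
    pub fn gc(&mut self) {
        let mut marked = vec![false; self.live.len()];
        for &root in &self.roots {
            let mut cur = Some(root);
            while let Some(id) = cur {
                if marked[id] {
                    break; // already visited, also guards against cycles
                }
                marked[id] = true;
                cur = self.children[id];
            }
        }
        for (id, live) in self.live.iter_mut().enumerate() {
            *live = *live && marked[id];
        }
    }

    pub fn is_live(&self, id: usize) -> bool {
        self.live[id]
    }
}

/// Mirrors `bad` above: a GC runs before thing2 is attached.
pub fn bad(heap: &mut Heap, thing1: usize) -> usize {
    let thing2 = heap.alloc();
    heap.gc(); // stands in for code_which_may_call_gc()
    heap.set_child(thing1, thing2);
    thing2
}

/// Attaches first, so the same GC finds thing2 through thing1.
pub fn good(heap: &mut Heap, thing1: usize) -> usize {
    let thing2 = heap.alloc();
    heap.set_child(thing1, thing2);
    heap.gc();
    thing2
}
```

In `bad`, the parent ends up pointing at an object the collector has already swept; in `good`, the child survives because it is reachable by the time the collection runs.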
The possible fix is to replace *thing1.child.borrow_mut() = thing2 by thing1.child.borrow_mut() <- thing2 and then implement the placement protocol as follows:

Thing::new() returns a Factory<Thing>, not a Thing or a Root<Thing>.
Factory<T> contains a method build(mut self) -> T.
thing1.child.borrow_mut() <- thing2 calls thing2.build() and immediately attaches the result.

As a result, there is no gap between thing2 being built and attached.
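The three steps above can be approximated without the placement protocol. The `<-` operator is not available in stable Rust, so this sketch uses an ordinary `place` method as a stand-in for `slot <- factory`. `Slot`, `place`, `is_filled`, and `map` are hypothetical names, the factory is modelled as a boxed closure, and `build` takes `self` rather than `mut self`.

```rust
use std::cell::RefCell;

#[derive(Debug, PartialEq)]
pub struct Thing {
    pub name: String,
}

/// A recipe for a `T`. No `T` exists until `build` runs.
pub struct Factory<T> {
    make: Box<dyn FnOnce() -> T>,
}

impl<T> Factory<T> {
    pub fn new(make: impl FnOnce() -> T + 'static) -> Self {
        Factory { make: Box::new(make) }
    }

    // Kept private so that only `Slot::place` can turn a factory into a value.
    fn build(self) -> T {
        (self.make)()
    }
}

impl Thing {
    /// Returns a factory, not a `Thing` or a rooted `Thing`.
    pub fn new(name: &str) -> Factory<Thing> {
        let name = name.to_string();
        Factory::new(move || Thing { name })
    }
}

/// Stand-in for a reachable field such as `thing1.child`.
pub struct Slot<T> {
    inner: RefCell<Option<T>>,
}

impl<T> Slot<T> {
    pub fn empty() -> Self {
        Slot { inner: RefCell::new(None) }
    }

    /// Stand-in for `slot <- factory`: builds and attaches in one step,
    /// so no caller code can run while the new value is unattached.
    pub fn place(&self, factory: Factory<T>) {
        *self.inner.borrow_mut() = Some(factory.build());
    }

    pub fn is_filled(&self) -> bool {
        self.inner.borrow().is_some()
    }

    /// Applies `f` to the stored value, if there is one.
    pub fn map<R>(&self, f: impl FnOnce(&T) -> R) -> Option<R> {
        self.inner.borrow().as_ref().map(f)
    }
}
```

A caller can hold a `Factory<Thing>` for as long as it likes, and run arbitrary code in the meantime, because nothing has been allocated yet. The object only comes into existence inside `place`, already attached.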
Some code by pnkfelix which shows how this can be implemented using the placement protocol is at https://gist.github.com/anonymous/0cbea83ee7fc018260ef. Note that it's the finalize method at https://gist.github.com/anonymous/0cbea83ee7fc018260ef#file-playground-rs-L48 that calls the build method on the factory.