Skip to content

Instantly share code, notes, and snippets.

@Gankra
Last active October 31, 2018 19:50
Show Gist options
  • Save Gankra/64c451ff30782a071c72001054483697 to your computer and use it in GitHub Desktop.
Save Gankra/64c451ff30782a071c72001054483697 to your computer and use it in GitHub Desktop.

----- Problem -----

When running the display list painting code, we have each item in the display list individually lookup all the live selections and try to determine if its frame is selected. This code (nsRange::IsNodeSelected) is really expensive. Expensive enough that doing this for every display item takes seconds on pages with lots of sibling nodes (such as 1000 comma seperated links). In this case I believe we take ~O(n^2) time computing selections, where n is the number of nodes in the DOM.

Mostly this isn't a problem for gecko because we have display item invalidation, letting us skip this work on nodes outside the range of the selection. However webrender has no such feature, and as such becomes incredibly unresponsive (checkerboarding due to rendering at ~0.2 fps).

Adding more caching to this code has improved it a lot, but we're still orders of magnitude off of where we need to be.

----- Solution -----

I believe we should be able to overcome this by amortizing the selection code by batching the selection computation. In particular, once we have all the state set up to compute whether a given node is selected, it becomes much simpler to compute if its neighbour is selected, as we just need to check if we're entering/leaving any of the selection ranges by moving over one node.

At that point we can store the computed selected-ness on the frame or its display items. Note that this does not propose caching and invalidating this information across multiple display lists. Rather the selectedness would only be cached for the current display list build.

However if desirable it would be a natural extension for us to start preserving and invalidating this information across display lists. As mattwoodrow noted: we already are technically caching/invalidating selection state in the display list, in that selection state is visible in the rasterization, and we are able to correctly invalidate the rasterization in a fine-grain way when selection changes.

But let's assume we're not doing any invalidation more fine-grain than "something changed, all selection state must be recomputed". The question then becomes in what way do we do this amortized selection pass. Here are a few options we discussed in IRC:

----- Range-Directed (bad) -----

The most naive option would be to simply iterate the selection ranges and mark all the frames inside of them as selected. We don't think this is a practical/viable solution because it would not scale well to very large pages, where most of the DOM is not actually visible. O(size-of-selection) is probably too expensive.

(Note that we could at least avoid the O(size-of-page) cost of clearning the selection bit on every frame by doing a sort of epoch-based approach, where each frame stores the last epoch where it was selected, and so IsSelected = mSelectionEpoch == sCurrentEpoch)

A better solution is to base our amortization off the display list, which is only the visible frames.

----- Naive-DL-Directed (impossible?) -----

Naively trying to run an amortized analysis on the DL can't really work because z-sorting clobbers the relationship between DL order and DOM order.

----- StackingContext-Directed (maybe ok?) -----

However z-sorting is a pass that is done in chunks: we build the sub-DL for a stacking context, z-sort it, then concatenate it onto the full DL. As such we have a point where we have a chunk of contiguous items in DOM-order. The kind of page we are struggling on tends to be sparse in stacking contexts (e.g. the comma seperated links example is all in one stacking context), and so this may work well enough in practice.

----- DL-Construction-Directed (maybe best?) -----

Alternatively, it may be possible to amortize this analysis over the entire DL construction by just holding onto this state as we walk over the frame tree and compute which ones should produce display items. Not sure if this would be easier or harder than the StackingContext-Directed approach.

It is unclear how exactly skipping over hidden frames should work -- should we be forced to walk over them to maintain our amortized state, or simply hard reset our state, and hope that visibility is "usually" contiguous enough that we don't need to reset state much? The latter sounds plausibly true?

---- Conclusions ----

The high level idea seems promising, but this is all based on a pretty fuzzy understanding of the whole system. In particular I personally have no idea how to implement this, and what assumptions actually hold true. Any insights would be greatly appreciated!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment