@bkardell
Last active May 6, 2024 05:07
Quick reply to Peter Rushforth's tweet https://twitter.com/prushforth/status/995743137225101312

Think about how rules are written, applied and work.

In talking about a tree, you have very few basic constructs: Element name, attribute, ancestor/descendant relationships.

You write a CSS selector like .foo bar * bat. Selectors are built, effectively, from element names and attributes (though some attributes are 'special': ID is really just an attribute with unusually high specificity, and class is really just an attribute that contains a serialized DOMTokenList). Beyond these, the only other things in your vocabulary are "descendant of", "child of", and a few things about your immediate siblings.

Now, think about what the browser has to do while the document is parsing. At some point it needs to figure out "which rules apply to this element right now" and then make sure they are applied, in specificity order, so that it can figure out what to paint.
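As a rough illustration of "applied in specificity order", the sketch below computes the usual (ids, classes and attributes, type names) triple for very simple selectors and picks the winning declaration. The regex-based `specificity` helper, `winning_value`, and the sample rules are all assumptions made for this example; pseudo-classes, pseudo-elements, and most of what a real engine handles are ignored.

```python
import re

def specificity(selector):
    """Return (ids, classes_and_attributes, types) for a very simple selector.

    Hypothetical helper: understands only #id, .class, [attr...] and type names.
    """
    # Set attribute selectors aside first so their contents are not miscounted.
    attrs = re.findall(r"\[[^\]]*\]", selector)
    rest = re.sub(r"\[[^\]]*\]", " ", selector)
    ids = re.findall(r"#[\w-]+", rest)
    classes = re.findall(r"\.[\w-]+", rest)
    # What is left after removing ids and classes are type names and '*'.
    rest = re.sub(r"[#.][\w-]+", " ", rest)
    types = re.findall(r"[A-Za-z][\w-]*", rest)  # '*' contributes nothing
    return (len(ids), len(classes) + len(attrs), len(types))

def winning_value(rules):
    """rules: (selector, value) pairs that are all assumed to match one element.

    The highest specificity wins; ties go to the rule that appears later.
    """
    best = max(enumerate(rules), key=lambda item: (specificity(item[1][0]), item[0]))
    return best[1][1]

rules = [("#main p", "red"), ("p", "blue"), ("[id=main] p", "green")]
winner = winning_value(rules)
```

Note that `#main p` outranks `[id=main] p` even though both test the same attribute, which is the "unusually high specificity" of ID mentioned above.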

Let's imagine that you have a few hundred rules; that's not entirely unusual. Let's imagine that your DOM tree has 2k elements (the single-page version of the WHATWG HTML living standard has very few extraneous elements that exist for styling purposes and is not highly interactive, yet it has more than 155k elements, so in my experience ~2k elements is not really a big ask). Every time the browser parses an element, it needs to ask of each rule, 'does this apply to me?' So, if we took a very simple approach, that is 2k checks times a few hundred, right? Well, it depends on what you mean by 'checks', because a check means asking "does this element meet this criterion?", which means establishing that the entire selector is true: walking up the tree from whatever depth, very frequently all the way to the root, before you can really paint.
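To make the cost of a single 'check' concrete, here is a minimal sketch of naive matching for selectors that use only the descendant combinator. It tests the rightmost part against the element and then walks up the ancestors, counting every comparison. The `Node` class, `matches`, and the `stats` counter are hypothetical names for this illustration and say nothing about how any real engine is structured.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    tag: str
    classes: frozenset = frozenset()
    parent: Optional["Node"] = None

def compound_matches(node, part):
    """part is '*', '.class' or a type name (a deliberately tiny subset)."""
    if part == "*":
        return True
    if part.startswith("."):
        return part[1:] in node.classes
    return node.tag == part

def matches(node, selector, stats):
    """Match a descendant-only selector such as '.foo bar * bat', right to left."""
    parts = selector.split()
    stats["checks"] += 1
    if not compound_matches(node, parts[-1]):
        return False
    i = len(parts) - 2
    ancestor = node.parent
    # Walk up the tree; each ancestor visited costs one more comparison.
    while i >= 0 and ancestor is not None:
        stats["checks"] += 1
        if compound_matches(ancestor, parts[i]):
            i -= 1
        ancestor = ancestor.parent
    return i < 0

# A 'bat' element nested under 50 divs, with no '.foo' anywhere above it.
root = Node("html")
node = root
for _ in range(50):
    node = Node("div", parent=node)
bat = Node("bat", parent=node)

stats = {"checks": 0}
result = matches(bat, ".foo bat", stats)  # fails only after reaching the root
```

Here one failed match on one element costs 52 comparisons (the element itself plus its 51 ancestors), so "2k times a few hundred" undercounts the work whenever the rightmost part of a selector matches.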

Just doing this performantly is wildly difficult; it takes all sorts of gymnastics to make it plausible. In fact, for a long time, until we hit on lots of common tricks, people actively worried a great deal about selector performance, for very good reasons.

Now, note that this could have been much worse still. You might have noticed that CSS selectors are constructed in such a way as to be pretty stable during this process. They are designed to really limit (and in many common cases eliminate) the chance that, as parsing goes on, something happens that invalidates what was figured out earlier and forces it to be re-evaluated. Because matching always goes in one direction, for example, you don't need more than what has very probably already been parsed, and the limit of your selection is more or less within your parent node. Having these kinds of rules enables all sorts of optimizations that make respecting your CSS somewhat practical, by taking advantage of ways to short-circuit lots of dead ends.
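The stability described above can be sketched with a streaming parser. For descendant-only selectors, whether an element matches depends only on the element and its open ancestors, so the decision can be made the moment the start tag is seen and never needs revisiting. This uses Python's standard `html.parser`. The `StreamingMatcher` class is a hypothetical illustration that handles only non-void elements and a tiny selector subset.

```python
from html.parser import HTMLParser

class StreamingMatcher(HTMLParser):
    """Decide matches for a descendant-only selector at start-tag time."""

    def __init__(self, selector):
        super().__init__()
        self.parts = selector.split()
        self.stack = []      # currently open ancestors as (tag, classes)
        self.decisions = []  # (tag, matched), recorded once and never changed

    @staticmethod
    def _compound(tag, classes, part):
        if part == "*":
            return True
        if part.startswith("."):
            return part[1:] in classes
        return tag == part

    def _matches(self, tag, classes):
        if not self._compound(tag, classes, self.parts[-1]):
            return False
        i = len(self.parts) - 2
        # Only ancestors that have already been parsed are consulted.
        for anc_tag, anc_classes in reversed(self.stack):
            if i < 0:
                break
            if self._compound(anc_tag, anc_classes, self.parts[i]):
                i -= 1
        return i < 0

    def handle_starttag(self, tag, attrs):
        classes = frozenset((dict(attrs).get("class") or "").split())
        self.decisions.append((tag, self._matches(tag, classes)))
        self.stack.append((tag, classes))

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1][0] == tag:
            self.stack.pop()

matcher = StreamingMatcher(".foo bat")
matcher.feed('<div class="foo"><bat>')        # only part of the document so far
early = list(matcher.decisions)
matcher.feed('hello</bat><p class="foo">later</p></div>')
late = list(matcher.decisions)
```

The decisions recorded after the first chunk are still the first entries of `late`: nothing that arrives afterwards can change them.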

This is the primary reason why, for example, CSS (as of this writing) still has no :has(...), despite this being an obvious need in describing how to attach style to a tree, and despite it having been in specs and discussion since 1998: adding it would mean that parsing any element could have larger ripple effects.
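The ripple effect can be shown with a small sketch. With a :has(...)-style test, the answer for an element that was parsed earlier can flip when a later descendant arrives, so every ancestor of a newly parsed node is a candidate for re-evaluation. `Node`, `has_descendant` (a rough stand-in for something like section:has(h2)), and `ancestors` are hypothetical names for this example only.

```python
class Node:
    def __init__(self, tag):
        self.tag = tag
        self.parent = None
        self.children = []

    def append(self, child):
        child.parent = self
        self.children.append(child)
        return child

def has_descendant(node, tag):
    """Rough stand-in for `node:has(tag)`: looks down the tree, not up."""
    return any(c.tag == tag or has_descendant(c, tag) for c in node.children)

def ancestors(node):
    """Every element whose downward-looking match could be affected by `node`."""
    found = []
    while node.parent is not None:
        node = node.parent
        found.append(node)
    return found

body = Node("body")
section = body.append(Node("section"))
section.append(Node("p"))

before = has_descendant(section, "h2")  # the answer once section and p are parsed

late_h2 = section.append(Node("h2"))    # a later element arrives
after = has_descendant(section, "h2")   # the earlier answer is now stale
to_recheck = [n.tag for n in ancestors(late_h2)]
```

Here `before` and `after` differ, and both `section` and `body` would have to be reconsidered. That is the kind of invalidation the upward-looking selectors described earlier avoid.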

Because it is allowed to express less than something like XPath, it is also considerably simpler in the number of concepts that you need to understand and recognize. In that sense, it is just "simpler for everyone". Because it was possible to explain and possible to implement and optimize, it was very successful with a large number of people, despite being notably under-powered.
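For a sense of the extra expressive power, the sketch below uses Python's standard `xml.etree.ElementTree`, which implements only a limited subset of XPath, to select elements by what they contain and to step from a child back to its parent. The sample document and variable names are made up for this illustration.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<body>"
    "<section id='a'><h2>Title</h2><p>text</p></section>"
    "<section id='b'><p>text</p></section>"
    "</body>"
)

# Predicate on children: sections that contain an h2 child.
sections_with_heading = [s.get("id") for s in doc.findall(".//section[h2]")]

# Parent step: start at each h2 and select its parent.
parents_of_headings = [e.get("id") for e in doc.findall(".//h2/..")]
```

Both queries pick out section 'a' by looking downward from, or back up to, the element being selected, which is exactly the direction the selectors discussed above avoid.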

If your only criterion were expressive power, XPath would be the clear winner, just as Betamax would be if your only criterion were quality, since it is technically superior to VHS. But that wasn't the fitness function that mattered.

So, could we make a style language in which XPath was the selector language? Sure. In fact, there were some, and there were even implementations. They were quite different from CSS, to be sure, but they lost. So maybe the question is "could we make CSS, but with XPath as the selector language, and with that as the only difference?" Here too, I don't think so: there is no proof that the complex bits of XPath could be implemented and work performantly, and there is an argument to be made that, if they could, we could simply add those features to CSS, which would be a smoother transition anyway.

@prushforth

Thank you for this explanation! I think this makes sense, and I can see why it is as it is. Also, if you don't / can't get the elements you want selected, you can always apply JavaScript. Actually, it was using querySelectorAll that has made me wish for xpath, not so much CSS issues. And I see that xpath does not address what people are talking about with global scope: the same thing would exist using xpath, afaict. Cheers!

@bkardell
Author

Two things: first, this is why CSS has introduced profiles recently (practically speaking, selectors that only have to work in qsa, i.e. querySelectorAll). Second, browsers still have XPath implementations.

@prushforth

I have played a little with the xpath implementation, will also have a look at this profiles thing of which you speak ;-)
