@bkardell
Last active May 13, 2019 21:57

As I mentioned in Harold Crick and the Web Platform, the HTML specification contains a section on “Other Embedded Content” which specifically mentions SVG and MathML. In that piece I explained their unique histories and how they wound up being “special”.

I think we need to talk about how that "specialness" relates to the larger Web Platform, and I'd like to make the case that we need a common vision for how to move forward as One Platform. Let me explain what I mean and why the current situation is problematic...

Let's talk about parsing

Imagine that I have a document:

<!-- this is the entirety of my.html -->
<p>This is awesome</p>
<em>This is awesome too
<awesome>And this is awesome</awesome>
<script>
  // Log the constructor of the first element matching the selector
  function logConstructor(selector) {
      let el = document.querySelector(selector)
      console.log(el.constructor)
  }
  // logConstructor already logs; wrapping it in console.log would also print "undefined"
  logConstructor('p')
  logConstructor('em')
  logConstructor('awesome')
  logConstructor('body')
</script>

It is a fabulously under-celebrated feat that this will log the same thing in all of the browsers:

HTMLParagraphElement() { [native code] }
HTMLElement() { [native code] }
HTMLUnknownElement() { [native code] }
HTMLBodyElement() { [native code] }

And it will yield the same tree: implied elements like <body> are placed in the tree appropriately, unclosed elements like the <em> are corrected in the same way, and so on.
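If you want to see that tree for yourself, here is a minimal sketch of a helper that prints an indented outline of whatever the parser built. The name `outline` is my own invention for illustration and not part of any API; it only assumes that each node exposes `nodeName` and `children`, as DOM elements do.

```javascript
// Build an indented outline of the element tree rooted at `node`.
// Only relies on `nodeName` and `children`, which DOM elements expose.
function outline(node, depth = 0) {
  const lines = ['  '.repeat(depth) + node.nodeName.toLowerCase()];
  for (const child of Array.from(node.children || [])) {
    lines.push(...outline(child, depth + 1));
  }
  return lines;
}

// In a browser, print the tree the parser actually built for the page.
// The guard lets the snippet load harmlessly outside a browser.
if (typeof document !== 'undefined') {
  console.log(outline(document.documentElement).join('\n'));
}
```

Pasting this into the console on my.html in a few different browsers should be a quick way to compare where the implied <html>, <head> and <body> elements landed and what the unclosed <em> ended up containing.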

Today, we just take all of this for granted. It seems like such a simple thing, but it very much isn't, and before "HTML 5" it wasn't so.

While we frequently celebrate the new features that 'HTML 5' efforts gave us, one of the biggest accomplishments of HTML5 was in finally codifying and requiring interoperable parsing rules.

Consider that for roughly half of the time the Web has been in existence, we lacked fundamental agreement on what today seem like very simple things. The most popular browser in the world, by far, for example, didn't even agree on something as basic as the idea that the DOM was necessarily 'tree shaped'.

If this interests you at all, Simon Pieters has started writing a book about this and his early intro chapter contains a lot of good stuff.

The "First Era of Polyfills" and Ways Forward

When 'The HTML 5 era' began, the extremely dominant browser in the market had disbanded its team. This left a big chicken-and-egg problem for new efforts. Despite the work to make things as backward compatible as possible, and as exciting as it all seemed, there was no obvious way forward that seemed plausible for wide adoption... That is, until polyfills.

I'm going to call this "The first era of polyfills".

We learned a lot during this era about what was great about it, and what was frustrating and problematic. Many element "polyfills" weren't so much polyfilling as they were 'progressive enhancement transforms' that sometimes created wholly different DOM.

Since every aspect of the platform reasons about things through the DOM, the fact that the original DOM would (maybe) disappear and become completely different DOM at some arbitrary point in time created a lot of frustration and problems. If you're polyfilling one feature by changing the DOM, you're breaking other features at the same time: it becomes unclear how or when it is safe to reason about the DOM, and you may need to reason about two potentially very different DOMs.

And so, it was around this time - somewhere around 2009/2010 - that a lot of fundamental new conversations began about how to address a whole lot of new issues. This led to a ton of new efforts and a kind of philosophical shift that was eventually laid out in the Extensible Web Manifesto: The platform should be layered, well explained and self-consistent. It has to eventually be possible for developers to polyfill by using similar machinery and 'plugging' things right into the right parts of the platform.

Since that shift, these ideas have informed development after development, explaining both 'up' and 'down' the platform: Fetch, streaming, custom elements, CSS custom properties, Shadow DOM, modules, HTML modules, CSS modules, Constructable Stylesheets, Houdini, common parsers like esprima and transpilers like Babel, Import Maps, standard libraries and built-in modules, and a hundred other in-process efforts and reforms. All of them are somehow geared toward thinking our way through related and common questions about how we move forward. The path forward for most new elements has to lead through custom elements.

Except... <record-scratch><record-scratch>

The fork in the road...

The HTML parser itself codified how to parse those other two "special" things as well. If you open a page containing some markup that is broken like this (because <p> is an HTML element which isn't allowed here):

<svg><p>Whoops</p></svg>

What you're going to get in all browsers is the same tree: an HTML paragraph that is the next sibling of the svg element (the <p> start tag implied the close of the svg), with the stray </svg> ignored. Further, the same thing is true if you replace <svg> with <math> - whether we say your browser 'supports' said element or not - because that's all part of the HTML parser's way of consuming the input, dealing with errors and understanding this 'special' embedded content. These two are special.
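You can check this without writing a file by handing the markup to `DOMParser`. The following is a rough sketch rather than a definitive test; `inspectBreakout` is a hypothetical helper name of mine, and it only uses standard DOM properties (`querySelector`, `contains`, `nextElementSibling`, `namespaceURI`).

```javascript
// Report where the first <p> ended up relative to the element
// matched by `outerSelector` in the given document.
function inspectBreakout(doc, outerSelector) {
  const outer = doc.querySelector(outerSelector);
  const p = doc.querySelector('p');
  return {
    insideOuter: outer.contains(p),              // is <p> a descendant of the outer element?
    nextSibling: outer.nextElementSibling === p, // or was it pushed out to be its sibling?
    namespace: p.namespaceURI,                   // which namespace the <p> was created in
  };
}

// In a browser, run the same check for both "special" elements.
if (typeof DOMParser !== 'undefined') {
  for (const name of ['svg', 'math']) {
    const markup = `<${name}><p>Whoops</p></${name}>`;
    const doc = new DOMParser().parseFromString(markup, 'text/html');
    console.log(name, inspectBreakout(doc, name));
  }
}
```

If the parsing behavior described above holds, both runs should report that the paragraph is not inside the outer element and is instead its next sibling, in the HTML namespace.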

The trouble is that all of this happened at just the wrong time in history, and in just the right 'other' way, to leave these two things largely excluded from all of those years of conversation imagining how we move forward.

So, as of right now, SVG has no concept of custom elements. As of right now, no element in SVG can have a programmatically defined Shadow DOM. There aren't really conversations specifically around SVG's relationship to modules or anything else.
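The Shadow DOM gap is easy to observe. Here is a small sketch, with a helper name (`tryAttachShadow`) that I made up for illustration, which attempts to attach a shadow root and reports the name of the error instead of throwing.

```javascript
const SVG_NS = 'http://www.w3.org/2000/svg';

// Try to attach an open shadow root to `el`.
// Returns 'ok' on success, or the name of the thrown error otherwise.
function tryAttachShadow(el) {
  try {
    el.attachShadow({ mode: 'open' });
    return 'ok';
  } catch (err) {
    return err.name;
  }
}

// In a browser, compare a plain HTML element with an SVG one.
// Fresh, detached elements are used so nothing on the page is modified.
if (typeof document !== 'undefined') {
  console.log('div:', tryAttachShadow(document.createElement('div')));
  console.log('svg g:', tryAttachShadow(document.createElementNS(SVG_NS, 'g')));
}
```

In current browsers I would expect the <div> to succeed and the SVG <g> to be rejected with a NotSupportedError, since `attachShadow` is limited to elements in the HTML namespace, but treat that as something to verify rather than a promise.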

The rub

The real rub here is that all of that effort and thought about how to move forward leaves these two out. Anything beyond what is currently implemented in SVG, or what is part of our new MathML refresh effort (to establish the common MathML Core and get an implementation in Chromium), is kind of left to figure things out on its own.

This seems really unfortunate. SVG, for example, is extraordinarily popular. For some context on just how popular it is: according to the best information we currently have to work with from the HTTPArchive (https://discuss.httparchive.org/t/use-of-html-elements/1438; note: we're working on better info soon), SVG appears as an element in markup (not even counting all uses via all methods) on more URLs in the HTTPArchive than the <main> element. It appears on (way) more than twice as many as the <video> element, about six times as many as <pre>, and more than 22 times as many as <canvas>, all of which are thought of as generally successful, common and useful elements in the main HTML vocabulary.

One Platform'ing

I believe that, because of their special historical circumstances, resolving the "other-ness" of these two things and gaining the ability to move forward as "one platform" is a vision we should work toward. They are part of the platform and we should work to make Math and SVG become less special, not more.

It's true that we haven't figured all of the things out in the main platform either, but we're well on our way. I believe that this would be a significant win, and doing it later will only be harder. This way, we can be sure we have these uses covered, central advancements can lift all boats, authors can reapply lessons as much as possible, we can reduce confusion and we can have a real 'starting point' from which all of these can move forward.

But it is almost certainly a lot of work.

As the parser parses, what is actually created and exposed (and how you create and manage things imperatively) is specified in WebIDL (Web Interface Definition Language). Some IDL alignments seem like they would be fairly easy on the surface, but they raise interesting and potentially difficult questions that we'd have to sort out. While the parser ignores namespaces, and CSS will match in similar fashion by default, the underlying mechanics of what is created and how still rely on XML-era namespacing. Sorting all of this out would take some time and investment. This is a topic that I have briefly approached in some small detail with my friend Amelia Bellamy-Royds and in less depth with some others.
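To make the namespacing point a little more concrete, here is a minimal sketch comparing two imperative ways of creating an element with the same local name. `describeElement` is a hypothetical helper of mine; `createElement`, `createElementNS`, `localName` and `namespaceURI` are standard DOM.

```javascript
const SVG_NAMESPACE = 'http://www.w3.org/2000/svg';

// Summarise the parts of an element that depend on its namespace:
// local name, namespace URI, and the interface it was created with.
function describeElement(el) {
  return `${el.localName} | ${el.namespaceURI} | ${el.constructor.name}`;
}

// In a browser: same local name, different namespaces.
if (typeof document !== 'undefined') {
  // createElement in an HTML document always uses the HTML namespace.
  console.log(describeElement(document.createElement('svg')));
  // createElementNS lets the caller choose the namespace explicitly.
  console.log(describeElement(document.createElementNS(SVG_NAMESPACE, 'svg')));
}
```

As far as I understand it, the first call gives you an unknown HTML element that merely happens to be named "svg", while only the second gives you a real SVG element, even though a type selector like `svg` in a stylesheet with no namespace rules would match both.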

It seems that before making such an argument or attempting such work, it would be prudent to verify and demonstrate that a significant number of people believe that this is potentially important or valuable work. And so that is what I am asking of all of you: Is this a thing that you think is good and necessary? Do you think that this is a thing that you would like to see happen? If you are involved in standards, would you be supportive of such efforts? Would you be willing to help out? What are your thoughts?

Let's start the discussion. You can tag @briankardell or @AmeliasBrain on Twitter with your comments (positive or negative) or statements of support (or opposition), and we'll also try to watch the hashtag #oneplatform and see where things land.
