@bkardell
Last active January 30, 2020 22:07
MathML Core Explainer

Authors:

  • Frédéric Wang
  • Brian Kardell

Draft specification:

MathML Core - W3C Editor's Draft

Abstract

MathML Core is a definition of a fundamental subset of features described in the current MathML 3 recommendation. It attempts to resolve several problems created by MathML's origins, history and complex status, and to rigorously define its integration into the modern Web Platform. The specific subset is derived from what is widely developed, deployed, proven and used in practice.

Goals

  • To provide users with efficient, natural, readable and high-quality rendering of mathematical notations, consistent with other text they encounter in the browser.

  • To provide authors with native, efficient and interoperable rendering of mathematical notations that they are able to reason about in a manner consistent with the rest of the Web Platform.

  • To rigorously define the necessary subset, how it works and properly integrates into the Web Platform and ensure testable and interoperable implementations.

  • To establish a productive and agreeable starting point for additional work and conversation going forward, and to make further exploration easier in ways consistent with the rest of the platform.

Non-Goals

  • To provide a self-contained solution to problems ultimately better explored through another area of the platform. MathML Core relies as much as possible on existing Web Platform features and provides a platform-aligned starting point to solve more problems. Examples include, but are not limited to:

    • Specific elements or attributes for styling which are better described by existing or new CSS features.

    • Complete and explicit description of semantics which are better described by extending ARIA.

    • Open-ended elements to allow implementation-specific features instead of standard techniques for customizations and extensions.

    • Native support for editing, interaction, exploration, simple input syntax or other advanced features that are better handled by DOM/JavaScript and math libraries.

    • Complex graphical layout which can instead be performed by embedding MathML in HTML/CSS or SVG.

      Figure 2: Formulas in a commutative diagram.

      Commutative diagram for the 'first isomorphism theorem'

  • To fully explain mathematical rendering via as-yet-to-be-defined low-level primitives. Rather, these serve as inputs to their possible definition and provide valuable insight into needs.
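
As a rough illustration of the "complex graphical layout" non-goal above, the sketch below positions two formulas inside an SVG diagram with foreignObject and lets SVG draw the arrow between them. The coordinates, sizes and formulas are arbitrary choices for illustration and are not taken from Figure 2.

```html
<svg width="260" height="60" viewBox="0 0 260 60">
  <!-- Left node: a MathML formula positioned by SVG -->
  <foreignObject x="0" y="15" width="80" height="30">
    <math><mi>G</mi></math>
  </foreignObject>
  <!-- The arrow is drawn by SVG, not by MathML -->
  <line x1="90" y1="30" x2="170" y2="30" stroke="black" />
  <polygon points="170,25 180,30 170,35" fill="black" />
  <!-- Right node: another formula -->
  <foreignObject x="190" y="15" width="70" height="30">
    <math>
      <mrow><mi>G</mi><mo>/</mo><mi>K</mi></mrow>
    </math>
  </foreignObject>
</svg>
```

The diagram geometry stays in SVG, and MathML is only responsible for the text of each node.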

Background: MathML

  • MathML is a standard developed at the W3C in the XML/XHTML era of the mid-to-late 1990s.

  • It received much attention and has created a vibrant ecosystem of implementations and integrations outside of web browsers.

  • At the time, CSS, the DOM, and the way we write specifications or prove support and interoperability were considerably under-defined. As a result, the MathML specifications contain several co-evolutionary, overlapping approaches to problems now better solved elsewhere in the modern platform, and they lack important levels of detail.

  • MathML was supported via a plugin in early IE, and it was integrated into the HTML / Parser specifications by WHATWG in the mid-2000s. All HTML-compliant parsers parse MathML specially, whether or not they support rendering it. All browsers (until now) uniquely present these elements in the DOM as simply "Element". MathML was thus explicitly disadvantaged.

  • It was implemented in Firefox at about the same time. It gained an implementation in WebKit shortly before the Blink split, when it was removed from Chrome's engine due to its complexity and early issues requiring significant attention while Chrome engineers were trying to rework the engine.

  • Spec work continued without implementation. As a result, the specification contains much that is theoretical, including over 150 elements.

Basic example...

The <math> element provides a standard for authors to express and work with text containing generalized relationships about mathematics, in a way very similar to how <table> does for expressing text containing relationships about tabular data.

<math>
  <mfrac>
    <msup>
      <mi>x</mi>
      <msqrt>
        <mn>5</mn>
      </msqrt>
    </msup>
    <mrow>
      <mi>α</mi>
      <mo>×</mo>
      <mn>7</mn>
    </mrow>
  </mfrac>  
</math>

Figure 1: MathML/DOM for the above

Visual MathML rendering as nested boxes representing the DOM tree, with corresponding tag name annotated for each box.

What is MathML Core?

MathML Core is an attempt to create a minimal version of MathML that is well aligned with the modern web platform. It aims to resolve long-standing issues with the split evolution of philosophies between MathML specifications and the larger web platform and create a well-defined starting point based on what is currently widely implemented and increase testability and interoperability.

The elements of MathML Core

MathML 3 contained 195 elements. MathML Core focuses on just 32. Several of these elements are kept in deprecated form, simply to map the elements and their attributes to newer concepts (which explain the actual magic), in much the same way that the font element remains in HTML. MathML Core provides a recommended UA stylesheet for implementations and adds a couple of new math-oriented display types.

Here is a brief rundown of what those elements are...

  • the math element itself
  • 3 elements called semantics, annotation and annotation-xml which simply provide other annotations or potential semantics in existing content but are generally not rendered.
  • 6 token elements - "Token elements in presentation markup are broadly intended to represent the smallest units of mathematical notation which carry meaning. Tokens are roughly analogous to words in text. However, because of the precise, symbolic nature of mathematical notation, the various categories and properties of token elements figure prominently in MathML markup. By contrast, in textual data, individual words rarely need to be marked up or styled specially." These are mtext, mi (identifier), mn (number), mo (operators in a broad sense), mspace and ms (string literal - for things like computer algebra systems).
  • Layout/relationship elements: mrow (for grouping sub-expressions), mfrac (for fractions and fraction-like objects such as binomial coefficients and Legendre symbols), msqrt and mroot for radicals
  • mstyle (legacy compat, deprecated - just maps to CSS)
  • merror (legacy compat - displays its contents as an "error message". The intent of this element is to provide a standard way for programs that generate MathML from other input to report syntax errors in their input.)
  • mpadded - a row-like grouping container which has attributes that map to CSS
  • mphantom - a co-evolutionary/legacy row-like container that just adds a UA style that maps to visibility: hidden;
  • menclose - a row-like element for various types of 'enclosure' renderings (see examples at https://developer.mozilla.org/en-US/docs/Web/MathML/Element/menclose)
  • 3 elements about subscripts and superscripts msub, msup and msubsup
  • 3 elements about underscripts and overscripts munder, mover and munderover
  • 1 element about prescripts and tensor indexes (mmultiscripts)
  • 3 elements about tabular math (mtable, mtr and mtd)

Design Discussion

Not reinventing the wheel

  • As explained in the introduction, MathML is already integrated into numerous standards and shipped in two Web engines. Consequently, a new format to replace MathML would be a drastic change of direction and a source of backward compatibility and interoperability issues.

  • For a native mathematical rendering to be possible, it must adhere to modern browser designs and a significant effort is being made to ensure that MathML Core achieves that goal. For example, all browsers use internal tree structures, follow CSS invariants or try to keep code size minimal to facilitate security, maintenance, testing, etc.

  • One must not duplicate existing Web Platform features. As explained in the non-goals section, MathML Core tries to rely as much as possible on existing Web Platform concepts from HTML5 or CSS to describe its implementation. Non-fundamental mathematical features that can be easily replaced with polyfills or extensions are removed.

  • Rendering of mathematical formulas follows well-established rules from TeX and OpenType, which are integrated into MathML Core. A naive box layout would be enough to get interoperable rendering but would likely lead to poor spacing, placement or text rendering inside mathematical formulas.

    Figure 3: Top: Chrome 23 using MathML3 rules and internal heuristics; Bottom: Igalia's Chromium build using only MathML Core rules.

    Screenshot of MathML in Chrome 23 and Igalia's Chromium build, showing the visual improvements when following MathML Core instead of MathML3.

Applying Extensible Web principles

The biggest design decisions centered on how to apply Extensible Web principles in our own work, because MathML sits in a unique place in history and in how it "fits" into the platform. It has existing implementations, very wide adoption and expectations, and integration through the HTML parser. At the same time, the standards that might in the future expose the magic for mathematical layout, such as the CSS Layout API and related Houdini standards, are still developing and significantly in flux.

In order to balance all of this we decided on the following:

  • Normalize the DOM. Because of when and how it was defined, MathML was exposed to the DOM in all browsers (through the parser) as simply Element. MathML is historically uniquely disadvantaged in this way. All elements in HTML descend from HTMLElement or are HTMLUnknownElement. All elements, even SVG elements, define some common surface (through a mixin which was called HTMLOrSVGElement). Without remedy, this means that MathML elements lack over 100 bits of API surface. For example, they have no .style property, even though they are stylable with CSS. This is unpredictable and confusing for authors who come to MathML, and it fundamentally limits the application of any real Extensible Web ideas. Aligning the IDL for MathML with the rest of the platform, however, allows all of our principles and separations (for example ARIA, AOM, Houdini, etc.) to move forward in tandem.

  • Acknowledge that some minimal math magic already exists in the platform in two browsers. Our goal, then, is not simply to block final implementations of high-level features but to apply Extensible Web principles reasonably and pragmatically: keep it minimal and carefully develop what serves as useful input to the ultimate definition of lower-level Houdini APIs.

  • Increase compatibility with CSS. We provide a design compatible with CSS layout and describe how CSS properties are interpreted, so that authors can reliably use them to customize math layout.

  • Where possible, expose to authors the information necessary for polyfills, libraries or extending the platform, through platform-consistent mechanisms.
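
A minimal sketch of what the normalized DOM could mean for authors, assuming an engine that implements the MathMLElement interface from the draft specification. The id value and the fallback branch are illustrative choices, not part of MathML Core.

```html
<math>
  <mi id="x">x</mi>
</math>
<script>
  const x = document.getElementById("x");
  if (typeof MathMLElement !== "undefined" && x instanceof MathMLElement) {
    // With the aligned IDL, MathML elements expose .style like HTML and SVG elements do.
    x.style.color = "blue";
  } else {
    // Plain Element has no .style property, but setAttribute is always available.
    x.setAttribute("style", "color: blue");
  }
</script>
```

The feature test keeps the page working in engines that still expose MathML elements as plain Element.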

Figure 4: Example of using CSS, JavaScript or the Layout API to enhance MathML Core with user-defined features.

<style>
  math {
     font-family: "STIX Two Math";
     color: blue;
  }
  mfrac {
     border: 1px dotted;
     padding: 1em;
  }
  .myFancyMathLayout {
     /* custom layout, assumed to be registered through the CSS Layout API */
     display: layout(myFancyMathLayout);
  }
</style>
<math>
  <mfrac>
    <mrow class="myFancyMathLayout">
        ...
    </mrow>
    <mrow onclick="myInteractiveAction()">
        ...
    </mrow>
  </mfrac>
</math>
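
Figure 4 does not define myInteractiveAction. One possible shape for such a handler is sketched below; the function body, the id and the data-highlighted attribute are hypothetical. The sketch uses addEventListener, which is available on every Element, so it does not depend on the onclick attribute being supported for MathML elements.

```html
<style>
  /* Styling hook toggled by the script below */
  [data-highlighted="true"] { color: red; }
</style>
<math>
  <mfrac>
    <mrow><mi>a</mi><mo>+</mo><mi>b</mi></mrow>
    <mrow id="denominator"><mi>c</mi></mrow>
  </mfrac>
</math>
<script>
  // Hypothetical handler: toggles a data attribute on the clicked sub-expression.
  function myInteractiveAction(event) {
    const target = event.currentTarget;
    const highlighted = target.getAttribute("data-highlighted") === "true";
    target.setAttribute("data-highlighted", String(!highlighted));
  }
  document.getElementById("denominator").addEventListener("click", myInteractiveAction);
</script>
```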

Considered Alternatives

Leave math reliant on SVGs and/or JavaScript libraries

Writing systems define how we share information. Mathematical notations form a fundamental aspect of writing systems. Math is text, and it is a normal part of text: Mathematical notations are found in all civilizations. They have been instrumental throughout history for the diffusion and development of scientific and technical knowledge. The need for browsers to natively render this kind of text was evident from the earliest days of the Web at CERN. We believe that according to the W3C TAG's Ethical Web Principles it is not good for either the Web, the directly impacted communities of authors, or ultimately society to specially disadvantage such an important aspect of communication.

Abandon MathML in favor of some new thing

There are numerous criticisms of MathML. As with all aspects of the existing platform, for example, more succinct forms of expression exist that many authors are more comfortable writing (e.g. the linear text syntax used in LaTeX or computer algebra systems). As with other aspects of the platform, it is also possible to be more semantic than MathML currently is.

A few things don't change though, and among them is the difficulty of rendering mathematical formulas interoperably and with good quality. Abandoning MathML would be a rejection of an entire ecosystem and decades of work in standardization and advancement, with little hope that any of the current state would change in any reasonable timeframe. This would be tragic, as we don't generally require that authors use complex libraries in order to lay out text, or recommend that text be inserted as images. We believe that getting native math rendering is the right thing to do and that a tree is good.

Why a tree is good...

Trees of text relationships aren't the most succinct or easy-to-type way to express things. However, this is true of all HTML too. That's why a lot of HTML is generated from simpler forms like Markdown or by tools like rich text editors or templating. A rich ecosystem of tooling has been developed over many years for generating and editing MathML too.

But expressing the content is only part of the challenge and the platform is heavily oriented toward solving these problems via just such a tree. Many benefits flow naturally from simply matching the platform here and expressing mathematics as a standard tree of relationships:

  • Browser implementations can natively handle their rendering, as text, efficiently and fluidly.
  • Authors can style individual aspects of the equation, for example for educational purposes.
  • Authors can ensure that their text, colors, etc. match and scale appropriately.
  • Authors can create interactivity with those elements or manipulate them (educational purposes are a good example here too).
  • Software can be used to derive more meaning from context in much the same way that search engines do (there are, in fact, applications that do this).
  • We are granted common, platform-fitting places to attach additional semantics through existing mechanisms.
  • 'Find' text works
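
For instance, the styling points above could look like the following sketch. The class name and color are arbitrary, and it assumes a browser with native MathML rendering.

```html
<style>
  /* Highlight only the exponent, for example in a lesson about powers */
  .exponent { color: crimson; }
</style>
<p>
  The area of a square of side <math><mi>s</mi></math> is
  <math>
    <msup>
      <mi>s</mi>
      <mn class="exponent">2</mn>
    </msup>
  </math>.
</p>
```

Because the formula is ordinary text in the tree, it inherits the paragraph's font size and color, and only the marked node is restyled.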

Building atop...

Given these abilities and this approach, building additional semantics, extensions, conversions and further explorations on top becomes very plausible. It is even entirely plausible to support shorthand expansion from forms like LaTeX or ASCII Math, in much the same way we can for Markdown. Patterns for extending shorthand notations like these are a common class of problem that should be well explored, and even if such notations were ever supported natively, they would probably still be rendered into a shadow tree.

Figure 5: LaTeX source in a custom element rendered using MathML in a shadow DOM, with the Latin Modern Math font; from top to bottom: Blink (Igalia's build), WebKit (r249360) and Gecko (Firefox 68)

<la-tex>
  {\Gamma(t)}
  = {\int_{0}^{+\infty} x^{t-1} e^{-x} dx}
  = {\frac{1}{t}
     \prod_{n=1}^\infty
     \frac{\left(1+\frac{1}{n}\right)^t}{1+\frac{t}{n}}}
  \sim {\sqrt{\frac{2\pi}{t}} \left(\frac{t}{e}\right)^t}
</la-tex>

Screenshot of a MathML formula in different browsers.
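
The general pattern behind Figure 5 can be sketched as a custom element that renders into a shadow tree. This is not the code used for the figure: texToMathML is a hypothetical placeholder that does no real conversion and only wraps the source in mtext, where a real page would call an actual LaTeX-to-MathML converter.

```html
<la-tex>\frac{1}{t}</la-tex>
<script>
  const MATHML_NS = "http://www.w3.org/1998/Math/MathML";

  // Hypothetical placeholder: returns a <math> element containing the raw source text.
  function texToMathML(tex) {
    const math = document.createElementNS(MATHML_NS, "math");
    const text = document.createElementNS(MATHML_NS, "mtext");
    text.textContent = tex.trim();
    math.appendChild(text);
    return math;
  }

  // Defined after the element appears in the page, so its text content
  // is already parsed when the element is upgraded.
  customElements.define("la-tex", class extends HTMLElement {
    connectedCallback() {
      const root = this.shadowRoot || this.attachShadow({ mode: "open" });
      // The LaTeX source stays in the light DOM; the MathML lives in the shadow tree.
      root.replaceChildren(texToMathML(this.textContent));
    }
  });
</script>
```

Keeping the source in the light DOM means the shorthand remains the authored content, while the rendered tree can be regenerated at any time.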

Focus instead solely on lacking primitives

A big part of the challenge of focusing on lacking primitives is that it leaves open the question of what is lacking. The main proposals here of things to focus on have to do with additional semantics, 'stretchy characters' and complex alignments. While we agree that these are all excellent goals, we believe that they are also very independently pursuable, and that both causes are boosted by doing so.

However, without also providing a detailed layout specification, pursuing native rendering in all browsers and performing interoperability tests, it becomes very hard to design a full browser-compatible math rendering implementation and to introduce the necessary web platform primitives. We would thus again be stuck with the current state of one of the hardest problems, in a way that we are not for other forms of text.

Enhance MathML3 but keep all or most features

Another approach would be to integrate the TeX/OpenType and HTML5/CSS improvements while preserving all or most features from MathML3. We discarded this approach for several reasons:

  • Some MathML3 features don't integrate well within the web platform and it is not clear how to keep them and at the same time try to align MathML with browser design. Other features duplicate existing web platform primitives without re-using them. As explained in section "Not reinventing the wheel", these are strong blockers to get new features accepted. Indeed, many of these have never been implemented in browsers or have been removed.

  • Some MathML3 features have very low or almost no usage. This makes it very difficult to justify the effort of implementing them natively and maintaining them while the rest of the codebase evolves. Instead, we prefer to focus on a small subset that is used in practice and, in agreement with Extensible Web principles, add the necessary APIs to let users build extensions on top of MathML Core.

  • MathML3 has many features, is underspecified, lacks automated tests and is only partially implemented in browsers. This means that keeping all the features and at the same time achieving interoperability would require a huge effort. Again, the choice was instead to consider a subset of manageable size, corresponding to what is used on web pages and implemented in both WebKit and Gecko.

Stakeholder Feedback

"What is MathML Core?" for people unfamiliar...

The <math> element provides a standard for authors to express and work with text containing generalized relationships about mathematics, in a way very similar to how <table> does for expressing text containing relationships about tabular data.

What does it look like?

Here is some MathML that renders a simple, but "big" sigma notation:

<math>
  <mrow>
    <munderover>
      <mo>∑</mo>
      <mrow><mi>k</mi><mo>=</mo><mn>1</mn></mrow>
      <mrow><mn>83924300</mn></mrow>
    </munderover>
    <msup>
      <mi>k</mi>
      <mn>1</mn>
    </msup>
    <mo>=</mo>
    <mn>3,521,644,107,207,150</mn>
  </mrow>
</math>

That seems verbose/Why a tree?

Trees of text relationships aren't the most succinct or easy-to-type way to express things. However, this is true of all HTML too. That's why a lot of HTML is generated from simpler forms like Markdown or by tools like rich text editors or templating. A rich ecosystem of tooling has been developed over many years for generating and editing MathML too.

But expressing the content is only part of the challenge and the platform is heavily oriented toward solving these problems via just such a tree. Many benefits flow naturally from simply matching the platform here and expressing mathematics as a standard tree of relationships:

  • Browser implementations can natively handle their rendering, as text, efficiently and fluidly.
  • Authors can style individual aspects of the equation, for example for educational purposes.
  • Authors can ensure that their text, colors, etc. match and scale appropriately.
  • Authors can create interactivity with those elements or manipulate them (educational purposes are a good example here too).
  • Software can be used to derive more meaning from context in much the same way that search engines do (there are, in fact, applications that do this).
  • We are granted common, platform-fitting places to attach additional semantics through existing mechanisms.
  • 'Find' text works

Text Rendering and writing systems

Regardless of how it is expressed, the rendering of mathematical text is an important part of many writing systems. How it is rendered matters quite a bit. Below is a comparatively simple equation, recognizable as mathematics, rendered in the block direction.

 same comparatively simple bit of typically rendered sigma notation as a reader might expect, inline

Equations are also frequently inline with text, where the space is optimized differently: the size is different, and baselines are managed appropriately with the text. Note how, as the overscript moves into superscript position inline, the k identifier shifts right and moves the text.

 comparatively simple bit of typically rendered sigma notation as a reader might expect

In practice, rendered math is largely a mix of the two, governed by non-trivial rules, as shown below...

a screenshot of some actual rendered mathematics text

If these visual relationships of text are wrong, they are no longer visually understandable.
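
In markup, the two modes described above are typically selected with the display attribute of the math element. The sketch below writes the same sum once inline and once as a block; it assumes an engine with native MathML support, and the sample sentence is made up.

```html
<p>
  Inline, the limits of
  <math>
    <munderover>
      <mo>∑</mo>
      <mrow><mi>k</mi><mo>=</mo><mn>1</mn></mrow>
      <mi>n</mi>
    </munderover>
    <mi>k</mi>
  </math>
  move to script positions so the line stays compact.
</p>

<!-- display="block": the same formula on its own line, with limits under and over the sigma -->
<math display="block">
  <munderover>
    <mo>∑</mo>
    <mrow><mi>k</mi><mo>=</mo><mn>1</mn></mrow>
    <mi>n</mi>
  </munderover>
  <mi>k</mi>
</math>
```

The markup is identical in both cases; only the display mode changes, which is why the layout rules need to be defined precisely.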

So what is MathML Core?

MathML has been in development since the mid-1990s.
It enjoyed a lot of success early on in tools, and then slowly in some browsers via extensions, and then some parts even in browsers themselves. Because of this success, development continued and authors attempted to solve increasingly hard problems.

However, its complex history leaves a lot to be resolved in practice. Just a few examples:

  • Because of its co-evolutionary past with the rest of the platform, it created early 'solutions' that were later better handled or specified elsewhere in the platform. It had to invent and define (not especially well) units for measurement, for example, while CSS now has a range of great units.
  • Because it came from an XML past, it had concepts from XML that didn't entirely fit, like an 'error' state.
  • Because of how and when it was integrated into browsers, it lacked proper interface definitions to match the platform (the missing .style property is an easy example).

MathML Core, then, is an attempt to resolve all of these things for a minimal, important subset that doesn't suffer from these problems, and to find a way forward such that:

  • vast amounts of existing content (there is a considerable amount - Wikipedia alone has over half a million math elements) continue to work
  • vast amounts of tooling continue to be useful
  • things realign with the modern platform in all other ways
  • things are very well defined

The elements of MathML

MathML Core focuses on 32 elements. It provides a recommended UA stylesheet for implementations and adds a couple of new math-oriented display types. Several of these elements are kept in deprecated form, simply to map the elements and their attributes to newer concepts (which explain the actual magic), in much the same way that the font element remains in HTML. Here is a brief rundown of what those elements are...

  • the math element itself
  • 3 elements called semantics, annotation and annotation-xml which simply provide other annotations or potential semantics in existing content but are generally not rendered.
  • 6 token elements - "Token elements in presentation markup are broadly intended to represent the smallest units of mathematical notation which carry meaning. Tokens are roughly analogous to words in text. However, because of the precise, symbolic nature of mathematical notation, the various categories and properties of token elements figure prominently in MathML markup. By contrast, in textual data, individual words rarely need to be marked up or styled specially." These are mtext, mi (identifier), mn (number), mo (operators in a broad sense), mspace and ms (string literal - for things like computer algebra systems).
  • Layout/relationship elements: mrow (for grouping sub-expressions), mfrac (for fractions and fraction-like objects such as binomial coefficients and Legendre symbols), msqrt and mroot for radicals
  • mstyle (legacy compat, deprecated - just maps to CSS)
  • merror (legacy compat - displays its contents as an "error message". The intent of this element is to provide a standard way for programs that generate MathML from other input to report syntax errors in their input.)
  • mpadded - a row-like grouping container which has attributes that map to CSS
  • mphantom - a co-evolutionary/legacy row-like container that just adds a UA style that maps to visibility: hidden;
  • menclose - a row-like element for various types of 'enclosure' renderings (see examples at https://developer.mozilla.org/en-US/docs/Web/MathML/Element/menclose)
  • 3 elements about subscripts and superscripts msub, msup and msubsup
  • 3 elements about underscripts and overscripts munder, mover and munderover
  • 1 element about prescripts and tensor indexes (mmultiscripts)
  • 3 elements about tabular math (mtable, mtr and mtd)

alice commented Jan 28, 2020

Some thoughts:

  • The "What does it look like?" heading is a bit of a non-sequitur. What does what look like? Particularly if this is going to be folded into the Explainer, maybe a restructuring could be something like

    • Abstract
      MathML Core is a minimal subset of MathML... (single paragraph, 2-3 sentence explanation of MathML Core)
    • Table of contents
    • Goals
    • Non-Goals
    • Background: MathML
      (Succinct explanation of MathML, with example.)
    • What is MathML Core?
      MathML Core is an attempt to create a minimal version of MathML based on modern web technology, in order to resolve long-standing issues with the increasingly dated existing MathML spec.
      • Issues with MathML today
        MathML has been in development since the mid-1990s. ...
      • The elements of MathML Core
        MathML Core is an attempt to resolve these issues via a minimum subset ...
        MathML Core focuses on 32 elements ...
      • Why these specific elements?
        (Brief explanation of the logic behind choosing those ones)
    • Design Discussion
    • Considered Alternatives
      • LaTeX
        (Discussion of why a tree is good, actually)
  • I don't know what the Text Rendering and writing systems section adds... is it an argument in favour of using Elements as opposed to some other syntax?
