@wbamberg
Created April 20, 2021 03:39

There are three pieces we need to think about in migrating MDN to Markdown:

  1. how we should represent MDN content using Markdown
  2. how we should convert MDN's HTML content into that Markdown representation
  3. updating/adapting Yari tooling to work with Markdown as the authoring format

Representing MDN in Markdown

For this we need to identify the places in MDN where we currently use HTML features that can't be represented in our chosen Markdown flavour (very likely GFM). For each of these features, we can choose among roughly four options:

  • stop doing the thing (for example, removing <div class="hidden">)
  • add custom extensions to GFM to support the thing (for example, supporting a custom syntax for notes)
  • drop into raw HTML (for example, using HTML for complex tables)
  • generate the thing using a macro (for example, generating spec tables using a macro)

The output of this is a specification describing how people should author content in Markdown in MDN, and how Yari should render that authored content as HTML. We've made a start on that here: https://developer.mozilla.org/en-US/docs/MDN/Contribute/Markdown_in_MDN and will continue to flesh it out as we work through the issues.

Converting MDN HTML to MDN Markdown

But the "MDN Markdown" spec is only part of the story: it assumes that our content is already in Markdown. We also need to define how we will convert our content from HTML to Markdown. So we need a second spec, one that says what the converter should do when it encounters various problematic constructs.

As with the "MDN Markdown" specification, the content of this conversion spec should come out of the issues I've been filing.

HTML elements

The conversion spec could list all HTML elements and, for each one, describe the behaviour of the converter. That behaviour could be one of the following, more or less:

  1. convert it to the Markdown form: this is the easy category, for elements that have a Markdown representation, like <li>
  2. throw an error: this is for elements that we need to have hand-cleaned out of the source before conversion
  3. keep the raw HTML: pass the element through into the Markdown unchanged
  4. discard the element
  5. something special (most likely, some combination of the above depending on some other factors)
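One way to picture this part of the spec is as a lookup table from element name to converter behaviour. The following is only a minimal sketch: `Action`, `ELEMENT_POLICY`, and `action_for_element` are hypothetical names, and apart from `<li>` (mentioned above) the category assigned to each element is an illustrative guess, not a decision.

```python
from enum import Enum


class Action(Enum):
    CONVERT = "convert"    # element has a Markdown form
    ERROR = "error"        # must be hand-cleaned out of the source first
    RAW_HTML = "raw_html"  # keep the element as raw HTML
    DISCARD = "discard"    # drop the element
    SPECIAL = "special"    # needs element-specific logic


# Hypothetical policy table. Only <li> comes from the text above;
# every other assignment is an illustrative guess.
ELEMENT_POLICY = {
    "li": Action.CONVERT,
    "p": Action.CONVERT,
    "font": Action.ERROR,
    "sup": Action.RAW_HTML,
    "span": Action.DISCARD,
    "table": Action.SPECIAL,
}


def action_for_element(tag: str) -> Action:
    """Look up the converter behaviour for an element name.

    Elements missing from the table map to ERROR (an assumption made
    here), so nothing is converted without an explicit decision.
    """
    return ELEMENT_POLICY.get(tag.lower(), Action.ERROR)
```

Defaulting unknown elements to an error is a design choice in this sketch: it forces the spec to cover every element that actually appears in the content before a conversion run can succeed.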

HTML attributes

The conversion spec could also list all the HTML attributes and, again, describe the behaviour of the converter. That behaviour could be one of the following, more or less:

  1. throw an error: this is for attributes that we need to have hand-cleaned out of the source before conversion. For example, I'd expect style to be in this category.
  2. discard the attribute
  3. something special (for example, an id attribute could throw an error except when it's attached to a heading, and then it could be discarded.)
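The "something special" case is easiest to see in code. This is a rough sketch of the two examples given above (style, and id on headings); the function name `action_for_attribute`, the string return values, and the error-by-default fallback are assumptions for illustration, not part of any agreed spec.

```python
HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


def action_for_attribute(tag: str, attr: str) -> str:
    """Return "error" or "discard" for an attribute on a given element."""
    tag, attr = tag.lower(), attr.lower()
    if attr == "style":
        # Expected to be hand-cleaned out of the source before conversion.
        return "error"
    if attr == "id":
        # Special case: discard on headings, error everywhere else.
        return "discard" if tag in HEADING_TAGS else "error"
    # Assumption: any attribute without an explicit rule is an error.
    return "error"
```

For example, `action_for_attribute("h2", "id")` gives `"discard"`, while the same attribute on a `div` gives `"error"`.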

Updating Yari tooling

This is a thing I've not thought about at all, and the dev team definitely has more insight than me. For example, the flaws system assumes it's dealing with HTML.
