@yvanzo
Last active July 30, 2018 22:56
Brainzed Markdown Draft #84884

Brainzed Markdown

  • MetaBrainz flavored Markdown
  • Markdown flavor for use in MetaBrainz projects

It aims to replace existing markup languages in *Brainz platforms with the following goals in mind:

  • Ease text editing and copying/pasting
  • Improve text markup capabilities
  • Simplify code development

It is based on GitHub Flavored Markdown (GFM), that is, CommonMark with the following extensions: autolink, table, strikethrough, and disallowed raw HTML. It additionally features automatic link conversion of *Brainz references such as MetaBrainz tickets, MusicBrainz documentation pages, and so on.

Specific features

MetaBrainz ticket reference (postprocessing)

Any reference to a ticket will be rendered as a link to that ticket. (This feature already exists as a BrainzBot plugin.)

OTHER-77
https://tickets.metabrainz.org/browse/OTHER-77

will both be rendered as:

<a href="https://tickets.metabrainz.org/browse/OTHER-77" class="improvement-issue" title="OTHER-77: Unify all markup syntax to simple markdown syntax">OTHER-77</a>

which looks like:

OTHER-77
  • Pros:
    • Automatically link to the issue.
    • Issue title is displayed on link rollover.
    • Displayed text is still short.
    • class attribute can be used to display issue type icon too.
    • Gracefully degrade on platforms not supporting this feature.
  • No con.

Implementation note: This requires calling JIRA’s REST API, which BrainzBot’s plugin already does: https://github.com/metabrainz/brainzbot-plugins/blob/master/botbot_plugins/plugins/jira.py
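
The postprocessing step above could be sketched as follows. This is only an illustration, not BrainzBot's code: `link_tickets` and the `lookup` callable are hypothetical names, and `lookup` stands in for the JIRA REST call by returning an issue type and summary for a ticket key (or `None` when the key is unknown).

```python
import html
import re

# A ticket key is an uppercase project key, a dash, and a number (e.g. OTHER-77).
# The optional tracker URL prefix is consumed so both forms render identically.
TICKET_RE = re.compile(
    r"(?:https://tickets\.metabrainz\.org/browse/)?\b([A-Z][A-Z0-9]*-\d+)\b"
)


def link_tickets(text, lookup):
    """Replace ticket references in `text` with HTML links.

    `lookup` is a hypothetical callable: key -> (issue_type, summary) or None.
    """
    def render(match):
        key = match.group(1)
        info = lookup(key)
        if info is None:
            # Unknown key (e.g. "UTF-8"): leave the text untouched.
            return match.group(0)
        issue_type, summary = info
        return (
            '<a href="https://tickets.metabrainz.org/browse/{key}" '
            'class="{cls}-issue" title="{title}">{key}</a>'
        ).format(
            key=key,
            cls=issue_type,
            title=html.escape(key + ": " + summary, quote=True),
        )

    return TICKET_RE.sub(render, text)
```

Returning the original text when the lookup fails keeps false positives such as "UTF-8" harmless.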

MusicBrainz edit URL (postprocessing)

Any raw (unbracketed) URL of a MusicBrainz edit will be rendered as a link with a title and short content.

https://musicbrainz.org/edit/123456

will be rendered as

<a href="https://musicbrainz.org/edit/123456" title="Edit #123456">#123456</a>

which looks like:

#123456

Besides musicbrainz.org, the host beta.musicbrainz.org (and the current host of the MusicBrainz server) will be recognized too. The rendered host will be the original one (or the current host of the MusicBrainz server).

Optionally, the title attribute of the rendered link can be localized.

To escape this special rendering, the URL can be put in angle brackets: <https://musicbrainz.org/edit/123456>.

  • Pros:
    • Rendered link content (hash and edit number) is shorter than the original link.
    • Rendered link is still self-explanatory with its title displayed on rollover.
    • Rendered link content integrates almost correctly into sentences of any language.
  • Cons:
    • Rendered link content cannot be localized, unless locale of the rest of the text is known.
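
A minimal sketch of this rule, assuming the two hosts named above are hard-coded (recognizing "the current host" is left out) and using a hypothetical function name `link_edits`:

```python
import re

# Raw edit URLs on musicbrainz.org or beta.musicbrainz.org.
# (?<!<) skips URLs written as <...>, which keeps angle brackets as an escape.
# (?![\w#]) skips longer URLs such as edit note URLs (.../edit/1#note-1-1).
EDIT_URL_RE = re.compile(
    r"(?<!<)https://((?:beta\.)?musicbrainz\.org)/edit/(\d+)(?![\w#])"
)


def link_edits(text):
    """Render raw edit URLs as short links such as #123456."""
    def render(match):
        host, edit_id = match.group(1), match.group(2)
        return '<a href="https://{0}/edit/{1}" title="Edit #{1}">#{1}</a>'.format(
            host, edit_id
        )

    return EDIT_URL_RE.sub(render, text)
```

The rendered host is taken from the matched URL, so a beta link stays a beta link.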

MusicBrainz edit note URL (postprocessing)

Any absolute URL of a MusicBrainz edit note will be rendered as a link whose content is a hash sign, the edit number, and the note number:

https://beta.musicbrainz.org/edit/12345670#note-12345670-1

will be rendered as:

<a href="https://beta.musicbrainz.org/edit/12345670#note-12345670-1" title="Note #1 to edit #12345670">#12345670 (note #1)</a>

which looks like:

#12345670 (note #1)

Optionally, the title attribute and the parenthesized content of the rendered link can be localized.

To escape this special rendering, the URL can be put in angle brackets: <https://musicbrainz.org/edit/12345670#note-12345670-1>.

  • Pros:
    • Rendered link content is shorter than the original link.
    • Rendered link is still self-explanatory with parenthesis content and the title displayed on rollover.
    • Rendered link content integrates approximately into sentences of any language.
  • Cons:
    • Rendered link content cannot be completely localized, unless locale of the rest of the text is known.
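
The edit note variant can be sketched the same way. The function name `link_edit_notes` is hypothetical, and the sketch assumes that the edit number is taken from the URL path and that the last number of the fragment is the note number.

```python
import re

# Edit note URLs look like https://<host>/edit/<edit>#note-<edit>-<note>.
# As for plain edit URLs, a leading "<" disables the special rendering.
EDIT_NOTE_URL_RE = re.compile(
    r"(?<!<)https://(?:beta\.)?musicbrainz\.org/edit/(\d+)#note-\d+-(\d+)(?![\w-])"
)


def link_edit_notes(text):
    """Render raw edit note URLs as links such as '#12345670 (note #1)'."""
    def render(match):
        edit_id, note_no = match.group(1), match.group(2)
        return '<a href="{0}" title="Note #{2} to edit #{1}">#{1} (note #{2})</a>'.format(
            match.group(0), edit_id, note_no
        )

    return EDIT_NOTE_URL_RE.sub(render, text)
```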

MusicBrainz entity URL (postprocessing)

Related ticket: https://tickets.metabrainz.org/browse/MBS-764

It would be very useful to automatically render any raw entity URL as a link that displays the entity name.

https://musicbrainz.org/artist/6a0b0138-dc06-4d5c-87b3-fab64f0fd326

will be rendered as:

<a href="https://musicbrainz.org/artist/6a0b0138-dc06-4d5c-87b3-fab64f0fd326" class="artist group-artist" title="Group: MusicBrainz Sound Team">MusicBrainz Sound Team</a>

which looks like:

MusicBrainz Sound Team

To escape this special rendering, the URL can be put in angle brackets: <https://musicbrainz.org/artist/6a0b0138-dc06-4d5c-87b3-fab64f0fd326>.

  • Pros:
    • Automatically link to the entity with entity name.
    • Entity type is displayed on link rollover.
    • class attribute can be used to display entity type icon too.
    • Gracefully degrade on platforms not supporting this feature.
  • Cons:
    • Rendered link content cannot be completely localized.
    • MBID is hidden.

Implementation note: Discourse MusicBrainz Onebox plugin already retrieves such information: https://github.com/phw/discourse-musicbrainz-onebox
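
As a rough sketch of the rendering step only (not the Onebox plugin's code), the entity URL can be split into an entity type and an MBID, with the name retrieval delegated to a hypothetical `lookup` callable. The names `link_entities` and `lookup`, and the way the class attribute is derived from the subtype, are assumptions made for illustration.

```python
import html
import re

ENTITY_TYPES = (
    "area|artist|event|instrument|label|place|recording|"
    "release-group|release|series|work"
)
MBID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
# (?<!<) keeps <...> as an escape; the lookahead skips subpages and fragments.
ENTITY_URL_RE = re.compile(
    r"(?<!<)https://(?:beta\.)?musicbrainz\.org/("
    + ENTITY_TYPES + r")/(" + MBID + r")(?![\w/#-])"
)


def link_entities(text, lookup):
    """Render raw entity URLs as links displaying the entity name.

    `lookup` is hypothetical: (entity_type, mbid) -> (subtype, name) or None.
    """
    def render(match):
        entity_type, mbid = match.group(1), match.group(2)
        info = lookup(entity_type, mbid)
        if info is None:
            return match.group(0)  # degrade gracefully to the raw URL
        subtype, name = info
        return '<a href="{url}" class="{t} {s}-{t}" title="{title}">{name}</a>'.format(
            url=match.group(0),
            t=entity_type,
            s=subtype.lower(),
            title=html.escape(subtype + ": " + name, quote=True),
            name=html.escape(name),
        )

    return ENTITY_URL_RE.sub(render, text)
```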

MusicBrainz documentation reference (shortcut syntax, autocompletion and postprocessing)

It is the only *Brainz reference that users might type by hand, so it makes sense to offer a shortcut syntax and autocompletion for it.

Syntax

A reference can be made using a URI with the doc scheme.

doc:WikiDocs
doc:Release_Group

It is currently case-sensitive because of a MediaWiki limitation.

It should become case-insensitive once the documentation is moved to Brainzed Markdown files.

Auto-completion

Search, as case-insensitive as possible, across every WikiDocs page: https://wiki.musicbrainz.org/Category:WikiDocs_Page

Postprocessing

doc:Release_Group

will be rendered as:

<a href="https://musicbrainz.org/doc/Release_Group" title="Documentation: Release Group">Release Group</a>

which looks like:

Release Group
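
A minimal sketch of this postprocessing, assuming the displayed label is simply the page name with underscores turned into spaces (`link_docs` is a hypothetical name):

```python
import re

# "doc:" followed by a page name; the lookbehind avoids matching inside
# longer words or URLs (e.g. "mbdoc:Foo" or "https://host/doc:Foo").
DOC_RE = re.compile(r"(?<![\w/:])doc:([A-Za-z][\w/-]*)")


def link_docs(text):
    """Render doc:Page_Name references as links to the documentation."""
    def render(match):
        page = match.group(1)
        label = page.replace("_", " ")
        return (
            '<a href="https://musicbrainz.org/doc/{0}" '
            'title="Documentation: {1}">{1}</a>'
        ).format(page, label)

    return DOC_RE.sub(render, text)
```

The page name is kept as typed, which matches the current case-sensitive behaviour.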

Existing markup languages in *Brainz

Community forums

  • Discourse implementation of Markdown supports CommonMark and many GFM extensions (autolink, table, strikethrough).

    It is based on markdown-js which is extensible.

    Discourse supports plugins too.

    It seems feasible and useful to add support for automatic link conversion of *Brainz references.

    TODO: Check Discourse (plugin?) support for disallowed raw HTML.

Tickets issue tracker

Wiki

BookBrainz

AFAIK, it doesn’t support any markup language. Annotation, editor biography, and revision note are unformatted plain text fields.

CritiqueBrainz

  • Review already supports CommonMark syntax, using Python-Markdown. It doesn’t support GFM extensions though. Some features could probably be useful: autolink and automatic link conversion of MusicBrainz entity reference. On the contrary, table and strikethrough seem to be worthless here.

MusicBrainz

  • Annotation comes with its own limited wiki formatting https://musicbrainz.org/doc/Annotation#Wiki_formatting

    Each feature has an equivalent in CommonMark:

    • italics ''…'', bold '''…''', and bold italics '''''…''''' can be respectively replaced with emphasis *…*, strong **…**, and strong emphasis ***…***
    • horizontal rule ---- is compatible with CommonMark
    • title three levels = … =, == … ==, and === … === can be respectively replaced with ATX headers # …, ## …, and ### …
    • bulleted list * (with four leading spaces) can be replaced with bulleted list * (without any leading space)
    • code block (with eight leading spaces) is compatible with CommonMark
    • link:
      • autolink url is compatible with GFM’s autolink extension to CommonMark, with small differences:
      • untitled bracketed link [url] can be replaced with angle bracketed link <url>, with the same small differences as above
      • undocumented wiki link [CamelCase] can be replaced with descriptive link [CamelCase](https://wiki.musicbrainz.org/CamelCase). Note: there are many wiki links that contain spaces, e.g. [Fictitious Artist]. Only wiki links that actually link to the wiki shall be replaced.
      • descriptive link [url|description] can be replaced with [description](url)
      • undocumented double-bracketed descriptive wiki link [[CamelCase|description]] can be replaced with [description](https://wiki.musicbrainz.org/CamelCase)
    • square bracket HTML characters entity references &#91; and &#93; are compatible with CommonMark, but they are not needed to write special entity name such as [unknown] in CommonMark
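
Part of the equivalences listed above can be sketched as an ordered list of regular expression rewrites. This is an illustrative sketch only, with hypothetical names (`RULES`, `convert_annotation`); it covers emphasis, headings, bulleted lists, and URL links, leaves wiki links alone, and does not protect the content of code blocks from rewriting.

```python
import re

RULES = [
    # Longest quote runs first so ''''' is not consumed as ''' plus ''.
    (re.compile(r"'''''(.+?)'''''"), r"***\1***"),
    (re.compile(r"'''(.+?)'''"), r"**\1**"),
    (re.compile(r"''(.+?)''"), r"*\1*"),
    # Headings, deepest level first, matched on whole lines.
    (re.compile(r"^=== (.+?) ===$", re.M), r"### \1"),
    (re.compile(r"^== (.+?) ==$", re.M), r"## \1"),
    (re.compile(r"^= (.+?) =$", re.M), r"# \1"),
    # Bulleted list items lose their four leading spaces.
    (re.compile(r"^ {4}\* ", re.M), "* "),
    # [url|description] becomes [description](url); [url] becomes <url>.
    (re.compile(r"\[(https?://[^\]|\s]+)\|([^\]]+)\]"), r"[\2](\1)"),
    (re.compile(r"\[(https?://[^\]|\s]+)\]"), r"<\1>"),
]


def convert_annotation(text):
    """Apply each rewrite rule in order to the whole annotation text."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text
```

Rule order matters here: handling the longest markers first avoids partially converting bold italics or third-level titles.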

    Migration path (of about 280K latest annotations and 229K previous annotations):

    • Display annotations prior to the syntax change with wiki markup.

    • Gradually enter edits with ModBot to convert latest annotations only.

    • Drop support for wiki markup rendering of annotations prior to the change.

    • Pros:

      • Keep track of changes, thus allowing editors to manually fix incorrect conversion by comparing with the original version.
      • No need to maintain wiki markup rendering code.
    • Cons:

      • A lot of annotation edits (should these not be notified to subscribers?).
      • Previous annotations won’t be correctly displayed anymore.
      • Heterogeneous data.
  • Edit note syntax is close to the one for annotation with a few differences: https://musicbrainz.org/doc/Edit_Note#Edit_note_syntax

    Basic features italics, bold, and autolink are the same as for annotation and have an equivalent in CommonMark, see the section above.

    Specific features are:

    • Linking to an edit can be replaced with plain edit URLs.

      More specifically,

      edit #123456
      edit:123456
      edit 123456
      

      can all be replaced with:

      edit https://musicbrainz.org/edit/123456

      See the section “MusicBrainz edit URL” for rendering.

    • Linking to WikiDocs page has two forms:

      • URI doc:Release_Group which is still supported, with autocompletion eventually.
      • WikiLink [Release_Group] which can be replaced by the above doc:Release_Group form.
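
The edit reference rewrite described above could look like the following sketch. The name `convert_edit_refs` is hypothetical, and only the three lowercase forms shown above are handled; converting [Release_Group] wiki links is left out because it requires knowing which bracketed names are actual WikiDocs pages.

```python
import re

# Matches "edit #123456", "edit:123456", and "edit 123456".
# Note: any "edit <number>" in running text is rewritten, whether or not
# it was meant as an edit reference.
EDIT_REF_RE = re.compile(r"\bedit(?: #|:| )(\d+)\b")


def convert_edit_refs(text):
    """Rewrite legacy edit references into plain edit URLs."""
    return EDIT_REF_RE.sub(r"edit https://musicbrainz.org/edit/\1", text)
```

Running the conversion twice gives the same result, since an already converted reference no longer matches the pattern.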

    Migration path (of about 29M edit notes):

    • Create a table old_edit_note to backup converted edit notes.

    • Gradually convert every edit note that requires any change. (The most reliable conversion is HTML rendering since limited HTML tags are allowed in Markdown.)

    • Pros:

      • Keep track of changes, thus allowing coders to eventually fix incorrect conversion by comparing with the original version.
      • No need to maintain wiki markup rendering code.
      • Homogeneous data.
    • Cons:

      • Potential conversion bug might damage edit notes, until reported by editor and fixed by coder.
      • Massive updates of the table edit_note.
      • Additional legacy table old_edit_note.
  • Editor biography syntax is close to the one for annotation too. (It is undocumented.)

    Migration path (of about 750K editor biographies, mostly spam) based on editor last updated date:

    • Display biography of outdated editor with wiki markup rendering.

    • When editor visits the website:

      • Make the conversion
      • Send both the previous version and the new one by email
    • After six months, drop support for wiki markup.

      • Optionally, every outdated editor biography may be converted without notification then.
    • Pros:

      • Keep track of changes, thus allowing editors to manually fix incorrect conversion by comparing with the original version.
      • Only one notification email sent to active editors only.
      • No need to maintain wiki markup rendering code.
    • Cons:

      • Potential conversion bug might break editor biography presentation, until noticed and fixed by editor.
      • Inactive editors won’t be notified of this change.
  • Event setlist comes with its own limited syntax: https://musicbrainz.org/doc/Event/Setlist#Syntax

    The syntax is mostly about the constrained structure of setlist. It is a loose legacy from setlist.fm which doesn’t matter anymore. We can use CommonMark instead and implement a constraints validator.

    Since event setlist is constrained, it is very similar to a tracklist. It would make more sense to have a dedicated visual editor instead of fiddling with a text editor. It would still be possible to have a setlist parser to copy/paste a setlist. It would make even more sense to store setlists in dedicated tables (with artist credit, durations, bells and whistles) rather than in a text field, but it would require far more work.

    Although this migration is an opportunity to modify the structure of event setlist at the same time, the following is only about 1:1 equivalents as a subset of CommonMark.

    • @ for artists: It can be replaced with CommonMark’s third level ATX header ###. If necessary, “Artist:” label can be displayed through additional postprocessing.

    • * for works/songs: It can be either dropped (just using a simple line) which has similar rendering, or kept as CommonMark’s bulleted list * with a better rendering.

    • # for additional info: It can be replaced either with parentheses (…) around the line with a different (but convenient?) rendering, or replaced with blockquotes > .

    • links [MusicBrainz ID|name to be displayed]: They can easily be replaced with CommonMark’s link with description, depending on the context. Examples:

      1. @ [e1f1e33e-2e4c-4d43-b91b-7064068d3283|KISS] would be converted into [KISS](https://musicbrainz.org/artist/e1f1e33e-2e4c-4d43-b91b-7064068d3283)
      2. * [8dddc197-9ef4-4b5a-9908-9310230c1237|What’s It Gonna Take?] would be converted into [What’s It Gonna Take?](https://musicbrainz.org/work/8dddc197-9ef4-4b5a-9908-9310230c1237)

      Implications:

      • Syntax is a bit more verbose. However, it is easier to copy/paste a URL rather than an MBID.
      • Any link of the form (((scheme)?host)?/)?entity_type/mbid shall be accepted and replaced with a proper link to https://musicbrainz.org/… before storing the event setlist.
      • Links point to musicbrainz.org rather than to the current host. A postprocessing replacement can be added to render current scheme/host rather than https/musicbrainz.org
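
The 1:1 conversion outlined above can be sketched line by line. This is a sketch under assumptions: `convert_setlist` is a hypothetical name, a marker is only recognized when followed by a space, `#` lines become blockquotes (one of the two options mentioned), and links on `#` lines are left untouched because their entity type is not known from context.

```python
import re

# [MBID|name] links; the entity type comes from the line marker.
MBID_LINK_RE = re.compile(r"\[([0-9a-f-]{36})\|([^\]]+)\]")

# Line marker -> (CommonMark prefix, entity type used for links).
MARKERS = {
    "@ ": ("### ", "artist"),
    "* ": ("* ", "work"),
    "# ": ("> ", None),
}


def convert_setlist(text):
    """Convert legacy setlist lines into the CommonMark subset."""
    converted = []
    for line in text.splitlines():
        marker = line[:2]
        if marker not in MARKERS:
            converted.append(line)  # unknown line: keep as is
            continue
        prefix, entity_type = MARKERS[marker]
        body = line[2:].strip()
        if entity_type is not None:
            body = MBID_LINK_RE.sub(
                lambda m: "[{0}](https://musicbrainz.org/{1}/{2})".format(
                    m.group(2), entity_type, m.group(1)
                ),
                body,
            )
        converted.append(prefix + body)
    return "\n".join(converted)
```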

    Migration path (of about 10K event setlists) based on event last updated date:

    • Display event setlists prior to the syntax change with previous renderer.

    • Gradually enter edits with ModBot to convert every event setlist.

      • Optionally, these edits can be ignored for notifications.
    • Pros:

      • Keep track of changes, thus allowing editors to eventually fix incorrect conversion by comparing with the original version.
      • No need to maintain old event setlist markup rendering code.
      • Homogeneous data.
    • Cons:

      • Potential conversion bug might damage event setlist, until reported and fixed by editor.

Postponed features

Tagging @username

It is already supported by Discourse for Community forums.

There is a similar feature in Tickets issue tracker with JIRA syntax [~username].

There is a feature request for this in MusicBrainz. https://tickets.metabrainz.org/browse/MBS-9240

However, it has implications far beyond text formatting, more specifically about username normalization and notifications policy.

Changes summary

In MusicBrainz:

  • For event setlist:
    • Syntax will be completely replaced, and extended with additional features such as linking to a recording.
  • For annotation, edit note, and editor biography:
    • Italics and bold have different syntax.
    • Linking to an edit won’t be supported anymore, but edit URLs will be displayed with the same result.
    • Linking to WikiDocs [CamelCase] won’t be supported anymore, but doc:CamelCase will.
    • Descriptive link [url|description] won’t be supported anymore, but [description](url) will.
    • Wiki links and entity links won’t be supported anymore, but

For existing content, see the migration paths in the section above.

@Zastai
Zastai commented Mar 3, 2018

For ticket links, it might be nice if that included additional markup based on the ticket state; for example, add the ticket-type icon in front, and use strikethrough for closed tickets (a bit like what you get for issue links in Jira itself). Priority and status are other potential items to include.

While I like the doc: and edit: prefixes, I do agree that things should not be MusicBrainz-centric. If the plan is to use the same terms everywhere (although it's already fairly clear from BB that this is not the case), it could just be edit:mb:xxx and doc:mb:Foo. Otherwise, I don't see a problem with mb:doc:Foo / mbdoc:Foo / mbedit:1234. I'd perhaps also expect this to be special URL syntax only (so just shorthand inside [], not just anywhere).
