@sharwell
Last active May 11, 2024 15:22
Documentation comments revised

Overview

Markdown documentation comments are a backwards-compatible replacement for XML documentation comments.

  • If the first non-whitespace character of the comment is <, it is treated as an XML documentation comment
  • Otherwise, the comment is treated as a Markdown documentation comment
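
The dispatch rule above can be sketched in a few lines. This is only an illustration: `DocCommentKind` and `DocCommentClassifier` are hypothetical names that do not come from the proposal, and the input is assumed to be the comment text with the `///` or `/** */` markers already stripped.

```csharp
// Hypothetical names; the proposal does not define this API.
public enum DocCommentKind { Xml, Markdown }

public static class DocCommentClassifier
{
    // Assumes commentText has already had its comment markers removed.
    public static DocCommentKind Classify(string commentText)
    {
        foreach (char c in commentText)
        {
            if (char.IsWhiteSpace(c))
            {
                continue; // skip leading whitespace
            }

            // The first non-whitespace character decides the comment kind.
            return c == '<' ? DocCommentKind.Xml : DocCommentKind.Markdown;
        }

        // Empty or all-whitespace text has no leading '<', so it falls under "otherwise".
        return DocCommentKind.Markdown;
    }
}
```

Under these assumptions, `Classify("  <summary>Text</summary>")` yields `Xml`, while `Classify("Adds two numbers.")` yields `Markdown`.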

Unlike XML documentation comments, Markdown documentation comments are allowed anywhere the language allows a line or block comment.

🔗 dotnet/csharplang#891

Language and Compiler

Language changes

While XML documentation files will remain the standard for shipping documentation with assemblies, the language will relax its rules governing the form of these comments in code.

  1. The behavior of a documentation comment whose first non-whitespace character is not < is implementation-defined.
  2. The behavior of a documentation comment not placed on a type or member is implementation-defined.
  3. Documentation comments are allowed to contain arbitrary valid XML. In addition to the elements defined in earlier versions of the C# language, documentation rendering tools are encouraged to support the following elements:
    • <em>
    • <strong>
    • <inheritdoc>
    • <a href="">
    • <see href="">
    • <see langword="keyword">
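
As an illustration of the elements listed above, the following sketch uses each of them in ordinary XML documentation comments. The types `IShape` and `Square` and the `example.com` URLs are placeholders invented for this example; how each element is rendered depends on the documentation tool.

```csharp
public interface IShape
{
    /// <summary>
    /// Computes the area of the shape, or returns <see langword="null"/> when the
    /// shape is <em>degenerate</em>. Callers <strong>must</strong> handle both cases.
    /// </summary>
    /// <remarks>
    /// Background reading: <a href="https://example.com/shapes">shape guide</a>
    /// and <see href="https://example.com/area"/>.
    /// </remarks>
    double? Area();
}

public sealed class Square : IShape
{
    private readonly double _side;

    public Square(double side) => _side = side;

    // Reuses the documentation written on IShape.Area.
    /// <inheritdoc/>
    public double? Area() => _side > 0 ? _side * _side : (double?)null;
}
```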

Compiler changes

The compiler translates documentation comments for exposed types and members to XML during the build. For new documentation comments, the compiler delegates the translation to a documentation analyzer, which is responsible for:

  1. Translation of documentation comments to XML form for inclusion in compiler outputs
  2. Analysis of documentation comments for any diagnostics

The compiler provides a default documentation analyzer which handles XML documentation comments. It may also provide a minimal documentation analyzer for non-XML documentation comments, based on a minimal subset of CommonMark behavior, which is used if no other documentation analyzer is provided.
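
One possible shape for the two analyzer responsibilities is sketched below. Everything here is an assumption made for illustration: `IDocumentationAnalyzer`, its members, and `PlainTextFallbackAnalyzer` are hypothetical and are not part of the proposal or of any existing compiler API. The fallback shown is also not a CommonMark implementation; it simply treats the whole comment as a plain-text summary.

```csharp
using System.Collections.Generic;
using System.Xml.Linq;

// Hypothetical interface covering the two responsibilities described above.
public interface IDocumentationAnalyzer
{
    // Responsibility 1: translate the comment to XML for inclusion in compiler outputs.
    string TranslateToXml(string commentText);

    // Responsibility 2: report diagnostics for the comment.
    IReadOnlyList<string> GetDiagnostics(string commentText);
}

// Stand-in fallback: wraps the entire comment in a <summary> element.
public sealed class PlainTextFallbackAnalyzer : IDocumentationAnalyzer
{
    public string TranslateToXml(string commentText) =>
        // XElement escapes characters such as '<' and '&' in the text content.
        new XElement("summary", commentText.Trim()).ToString();

    public IReadOnlyList<string> GetDiagnostics(string commentText) =>
        string.IsNullOrWhiteSpace(commentText)
            ? new[] { "Documentation comment is empty." }
            : new string[0];
}
```

A real analyzer would likely work with syntax nodes and symbols rather than raw strings; strings are used here only to keep the sketch self-contained.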

IDE extensibility

Documentation analyzers interact at a low level with the compiler. The documentation analyzer specifies a content type for the documentation contents, which IDEs may use to provide a default editing experience. A separate documentation presenter can be provided which interacts with higher-level IDE features. It is responsible for:

  • Classification
  • Find references
  • Get symbol info (determine the symbol(s) referenced by a specific location within the comment)
  • Complexification and simplification
  • Rename

Sections

Sections are treated as extensions to the thematic break behavior of CommonMark.

@summary

The first section of a comment is the summary. This section may optionally start with a @summary thematic break. The @summary element typically does not need to be specified explicitly. However, a user may want to include it for one of the following reasons:

  • The content of the summary section starts with a < character, which would otherwise cause the compiler to treat the comment as an XML documentation comment.
  • The content of the summary section includes more than one paragraph, and the comment does not include a remarks section.

@remarks

The remarks section starts immediately after the @remarks thematic break.
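
The following sketch shows how explicit section breaks might look on a member. It illustrates the proposed syntax only and is not supported by current compilers. The `Sorting` class is invented for the example. The explicit @summary is needed here because the summary text starts with a < character.

```csharp
using System.Collections.Generic;

public static class Sorting
{
    /// @summary
    /// <T> values are ordered using `Comparer<T>.Default`.
    ///
    /// @remarks
    /// The sort happens in place, so the caller's list is modified.
    public static void SortInPlace<T>(List<T> items) => items.Sort();
}
```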

Other sections

🚧 Other sections may be supported by this design. Possible approaches include:

  1. Restrict the sections to @summary and @remarks
  2. Allow additional sections, but restrict the set to an allowlist
  3. Allow any section in the form ^\s*@\w+\s*$, or perhaps a more restricted form focusing on identifiers
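
For the third approach, a minimal sketch of matching a section break line is shown below. The pattern is the one given in option 3, with a capture group added around the name; `SectionBreaks` and `TryGetSectionName` are hypothetical names. Note that `\w` in .NET also matches digits and underscores, so a line such as `@123` would match, which is one reason a more restricted identifier-based form might be preferred.

```csharp
using System.Text.RegularExpressions;

public static class SectionBreaks
{
    // ^\s*@\w+\s*$ from option 3, with a capture group around the section name.
    private static readonly Regex BreakPattern = new Regex(@"^\s*@(\w+)\s*$");

    // Returns true and the section name when the whole line is a section break.
    public static bool TryGetSectionName(string line, out string name)
    {
        Match match = BreakPattern.Match(line);
        name = match.Success ? match.Groups[1].Value : string.Empty;
        return match.Success;
    }
}
```

With this pattern, `"  @remarks  "` is recognized as the `remarks` section, while `"@param name"` and `"see @remarks"` are not section breaks because the line contains other text.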

Implicit breaks

If both the @summary and @remarks thematic breaks are omitted, a @remarks thematic break is implicitly added immediately following the first paragraph of the summary section.
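
The implicit break can be sketched as a split at the end of the first paragraph. This is a simplification under stated assumptions: the input contains no explicit @summary or @remarks line, and a paragraph is approximated as a run of non-blank lines, which is looser than the CommonMark definition. `ImplicitBreaks` is a hypothetical name.

```csharp
using System.Linq;

public static class ImplicitBreaks
{
    // Assumes the comment has no explicit section breaks.
    public static (string Summary, string Remarks) Split(string comment)
    {
        string[] lines = comment.Replace("\r\n", "\n").Split('\n');

        // Skip blank lines before the first paragraph.
        int start = 0;
        while (start < lines.Length && string.IsNullOrWhiteSpace(lines[start]))
        {
            start++;
        }

        // The first paragraph runs until the next blank line (approximation).
        int end = start;
        while (end < lines.Length && !string.IsNullOrWhiteSpace(lines[end]))
        {
            end++;
        }

        // Everything after the first paragraph becomes the remarks section.
        string summary = string.Join("\n", lines.Skip(start).Take(end - start)).Trim();
        string remarks = string.Join("\n", lines.Skip(end)).Trim();
        return (summary, remarks);
    }
}
```

A comment with a single paragraph produces an empty remarks section under this sketch.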

Parameters

Parameters are defined using an extension to the list syntax.

🚧 The delimiter syntax is not finalized for this, but may look like one of the following:

  • name:
  • @param name
  • @name

The documentation for a parameter follows the list delimiter under the same rules as bulleted or numbered lists.

Type parameters

Type parameters would be documented in a manner similar to parameters.

🚧 The name portion of the delimiter syntax is not finalized, but could be either T or <T> for a type parameter T.

Return values

Return values would be documented in the same list as parameters and/or type parameters.

🚧 The delimiter syntax is not finalized, but could be one of the following:

  • return:
  • returns:
  • @return
  • @returns
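
As a sketch only, here is how a generic method might be documented if the `name:` delimiter, the bare `T` form for type parameters, and the `return:` delimiter were chosen. None of these choices is final, and the `Lists` class is invented for the example.

```csharp
using System.Collections.Generic;

public static class Lists
{
    /// Creates a list that contains a single item.
    ///
    /// T: The type of the item
    /// item: The item to store in the list
    /// return: A new list containing only `item`
    public static List<T> Singleton<T>(T item) => new List<T> { item };
}
```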

Tuple elements

Tuple elements may be documented in the same manner as parameters, appearing as a nested list under the item whose type is a tuple.

/// point: The point to scale
///     x: The x-coordinate of the point to scale
///     y: The y-coordinate of the point to scale
/// scale: The amount by which to scale the point
/// return: The scaled point
///     x: The x-coordinate of the scaled point
///     y: The y-coordinate of the scaled point
(double x, double y) Scale((double x, double y) point, double scale);

Code and References

By default, code within a comment is validated. In their simplest forms, inline code and code blocks are treated as code in the same language as the containing document.

  • Inline code may be treated as "plain" code by using one more set of backticks than is necessary for escaping purposes.

    • `semantic`
    • ``"Semantic string with backtick (`)"``
    • ``plain``
    • ```plain backtick (`)```
  • Fenced code may be treated as "plain" code in the current language by including plain in the info string.

    ```
    // In a C# source file, this is treated as C# code and semantically validated
    void Method() { }
    ```
    ```csharp
    // This is semantically validated
    void Method() { }
    ```
    ```csharp plain
    // This is highlighted as C# code but not semantically validated
    void Method() { }
    ```
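
The two "plain" rules above can be sketched as follows. This is one interpretation, not a definitive implementation: it assumes that "necessary for escaping purposes" means the shortest delimiter CommonMark would accept for the content (the smallest length that does not equal any backtick run inside it), and it treats exactly one extra backtick as the plain marker. `CodeSpanRules` and its members are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CodeSpanRules
{
    // Smallest delimiter length that does not collide with a backtick run in the content.
    public static int MinimalDelimiterLength(string content)
    {
        var runLengths = new HashSet<int>();
        int run = 0;
        foreach (char c in content + " ") // trailing sentinel flushes the last run
        {
            if (c == '`')
            {
                run++;
            }
            else
            {
                if (run > 0) runLengths.Add(run);
                run = 0;
            }
        }

        int length = 1;
        while (runLengths.Contains(length)) length++;
        return length;
    }

    // Inline code is "plain" when it uses one more backtick than escaping requires.
    public static bool IsPlainInline(int delimiterLength, string content) =>
        delimiterLength == MinimalDelimiterLength(content) + 1;

    // Fenced code is "plain" when the info string contains the word "plain".
    public static bool IsPlainFence(string infoString) =>
        infoString
            .Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries)
            .Contains("plain");
}
```

Applied to the four inline examples above, this sketch classifies the first two as semantic and the last two as plain. For the fences, an empty info string and `csharp` are semantic, while `csharp plain` is plain.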

Resolving references

  • For comments not placed in a code block, resolve references in the comment from a pseudo-context "inside" the element (i.e. parameters resolve first, then the element name, then containers...)
  • For comments preceding a statement which can have child statements, resolve references from the beginning of the first child statement
  • For comments preceding a standalone statement, resolve references from the end of the statement
  • For comments at the end of a code block, resolve references from the current location
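
The four rules above are illustrated in the sketch below, which places a comment in each position and notes which names would be in scope under the proposal. The `Totals` class is invented for the example, and current compilers do not perform any of this resolution.

```csharp
public static class Totals
{
    /// Adds every element of `values`. Here `values` resolves as the parameter,
    /// because a comment on a member is resolved from a pseudo-context inside `Sum`.
    public static int Sum(int[] values)
    {
        int total = 0;

        // `value` resolves: this comment precedes a statement with child statements,
        // so lookup starts at the first child statement, where `value` is in scope.
        foreach (int value in values)
        {
            total += value;
        }

        // `result` resolves: this comment precedes a standalone statement,
        // so lookup starts at the end of that statement, after `result` is declared.
        int result = total;

        return result;
        // `total` and `result` resolve from this location at the end of the block.
    }
}
```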
@CyrusNajmabadi

A separate documentation presenter can be provided which interacts with higher-level IDE features. It is responsible for:

Note: this sounds very similar to the IEmbeddedLanguage system i built for embedded json/regex literals. When you get to this part, i would both like to be part of the discussion, and i think it would be good if we could evaluate how we might be able to build a system where we have one single concept here instead of multiple similar concepts.

Note: a recent thought i've been having here is that all of these areas should simply be represented as (Contained)?Documents. Documents already have a defined and understandable way to get at structure and to expose services. And we already have the concept of embedded documents in the ASP/razor system. The only real difference we need is:

  1. arbitrary nesting levels. We would expect to potentially have a 'markdown' doc embedded in a C# doc. Then we would expect to potentially have 'semantically inert c#' docs embedded in the 'markdown doc'.
  2. an appropriate registration/discovery system for the language processors here.
  3. a system to load/embed processing of the different sections to the different processors. Note that this would have to be collaborative. i.e. the markdown-provider would be the one that would have to figure out which sections of itself would then have to load and be processed by a different language processor.

--

The system i built was effectively this. Though i didn't reuse the Document abstraction as i was worried too much about the potential size impact on the rest of the system. I intentionally tried to keep things explicitly separated for simplicity. However, i would want to not do that in the future if this is a first-class concept in the workspace and presentation models.
