Over the course of my maintenance of sphinx and sphinx extensions (including myst-parser, myst-nb and sphinx-needs)
the question of using headings within directives, or more generally nested inside other nodes of the docutils document
tree has come up multiple times.
To avoid repeating myself, I decided to write this short doc to explain why this is not possible (or at least generally a bad idea).
Let's take the following reStructuredText document as an example:
Section 1
=========
Section 1.1
-----------
Section 1.1.1
.............
Section 1.2
-----------
Section 1.2.1
.............
more ...
If we convert this to a document tree (the first stage of processing undertaken by docutils/sphinx), we get the following (note that --no-transforms is not currently available in docutils):
rst2pseudoxml --no-transforms test.rst
<document source="test.rst">
    <section ids="section-1" names="section\ 1">
        <title>
            Section 1
        <section ids="section-1-1" names="section\ 1.1">
            <title>
                Section 1.1
            <section ids="section-1-1-1" names="section\ 1.1.1">
                <title>
                    Section 1.1.1
        <section ids="section-1-2" names="section\ 1.2">
            <title>
                Section 1.2
            <section ids="section-1-2-1" names="section\ 1.2.1">
                <title>
                    Section 1.2.1
                <paragraph>
                    more ...
If we run this through to a full HTML5 conversion, we get the following:
rst2html5 test.rst
<body>
    <main id="section-1">
        <h1 class="title">Section 1</h1>
        <section id="section-1-1">
            <h2>Section 1.1</h2>
            <section id="section-1-1-1">
                <h3>Section 1.1.1</h3>
            </section>
        </section>
        <section id="section-1-2">
            <h2>Section 1.2</h2>
            <section id="section-1-2-1">
                <h3>Section 1.2.1</h3>
                <p>more ...</p>
            </section>
        </section>
    </main>
</body>
This is all good 👍
Let's now "nest" some of these headings inside a directive, for example the note admonition:
Section 1
=========

Section 1.1
-----------

.. note::

    Section 1.1.1
    .............

    Section 1.2
    -----------

Section 1.2.1
.............

more ...
In regular docutils / sphinx processing, this will raise an error:
rst2pseudoxml --no-transforms test1.rst
test1.rst:10: (SEVERE/4) Unexpected section title.
Section 1.1.1
.............
Exiting due to level-4 (SEVERE) system message.
In sphinx there is a function nested_parse_with_titles (see: https://github.com/sphinx-doc/sphinx/blob/04bd0df100809de350be89b64bb85c3524867132/sphinx/util/nodes.py#L327), which "allows" you to do this (see below for why I think this should not be part of the public API).
So let's hack into docutils and replace the note directive with a custom directive that uses nested_parse_with_titles:
from docutils import nodes
from docutils.parsers.rst import Directive, directives
from docutils.parsers.rst.roles import set_classes
from sphinx.util.nodes import nested_parse_with_titles


class NoteWithTitles(Directive):
    # Mirrors the docutils admonition directive; only the content parsing differs.
    final_argument_whitespace = True
    option_spec = {'class': directives.class_option,
                   'name': directives.unchanged}
    has_content = True
    node_class = nodes.note

    def run(self):
        set_classes(self.options)
        self.assert_has_content()
        text = '\n'.join(self.content)
        admonition_node = self.node_class(text, **self.options)
        self.add_name(admonition_node)
        # Only reached for the generic admonition, which takes a title argument.
        if self.node_class is nodes.admonition:
            title_text = self.arguments[0]
            textnodes, messages = self.state.inline_text(title_text,
                                                         self.lineno)
            title = nodes.title(title_text, '', *textnodes)
            title.source, title.line = (
                self.state_machine.get_source_and_line(self.lineno))
            admonition_node += title
            admonition_node += messages
            if 'classes' not in self.options:
                admonition_node['classes'] += ['admonition-'
                                               + nodes.make_id(title_text)]
        # Parse the content with section titles allowed, rather than with
        # the usual self.state.nested_parse call.
        nested_parse_with_titles(self.state, self.content, admonition_node, self.content_offset)
        return [admonition_node]
Now let's run our modified docutils again:
rst2pseudoxml --no-transforms test1.rst
<document source="test1.rst">
    <section ids="section-1" names="section\ 1">
        <title>
            Section 1
        <section ids="section-1-1" names="section\ 1.1">
            <title>
                Section 1.1
            <note>
                <section ids="section-1-1-1" names="section\ 1.1.1">
                    <title>
                        Section 1.1.1
                    <section ids="section-1-2" names="section\ 1.2">
                        <title>
                            Section 1.2
            <section ids="section-1-2-1" names="section\ 1.2.1">
                <title>
                    Section 1.2.1
                <paragraph>
                    more ...
Maybe you can already see what the problem is, but let's convert this to HTML5, where the problem becomes more obvious:
rst2html5 test1.rst
<body>
    <main id="section-1">
        <h1 class="title">Section 1</h1>
        <p class="subtitle" id="section-1-1">Section 1.1</p>
        <aside class="admonition note">
            <p class="admonition-title">Note</p>
            <section id="section-1-1-1">
                <h2>Section 1.1.1</h2>
                <section id="section-1-2">
                    <h3>Section 1.2</h3>
                </section>
            </section>
        </aside>
        <section id="section-1-2-1">
            <h2>Section 1.2.1</h2>
            <p>more ...</p>
        </section>
    </main>
</body>
As you can see, Section 1.2 is now incorrectly an h3 element (nested under Section 1.1.1, inside the note), and Section 1.2.1 is now incorrectly an h2 element.
In short, allowing nested headings seriously messes up the document structure, in a way that may not be immediately apparent (since there are no warnings/errors raised).
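To make the mechanism concrete, here is a toy model of how heading levels get assigned. It is not docutils' actual code; it only assumes that an underline style's level is the order in which that style is first seen, and that the nested parse starts from an empty list of styles (which, as far as I understand it, is what the sphinx helper arranges). The function name assign_levels is made up for this example:

```python
def assign_levels(titles, styles=None):
    """Map (text, underline) pairs to (text, level) pairs.

    Toy model: a style's level is the position at which it first appears.
    Pass an empty list as `styles` to mimic a nested parse that starts
    with a fresh list of styles.
    """
    styles = [] if styles is None else styles
    levels = []
    for text, underline in titles:
        if underline not in styles:
            styles.append(underline)
        levels.append((text, styles.index(underline) + 1))
    return levels


# Flat document: one list of styles for the whole file.
flat = assign_levels([
    ("Section 1", "="), ("Section 1.1", "-"), ("Section 1.1.1", "."),
    ("Section 1.2", "-"), ("Section 1.2.1", "."),
])

# Same titles, but the two inside the note are parsed with a fresh list,
# so their levels are relative to the note rather than to the document.
outer_styles = []
before_note = assign_levels([("Section 1", "="), ("Section 1.1", "-")], outer_styles)
inside_note = assign_levels([("Section 1.1.1", "."), ("Section 1.2", "-")], [])
after_note = assign_levels([("Section 1.2.1", ".")], outer_styles)

print(flat)
print(before_note, inside_note, after_note)
```

In the flat case Section 1.2 sits above Section 1.1.1 in the hierarchy; inside the note the order is inverted, because "." is seen first and so becomes the top level there.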
This is even more of a problem for any document tree post-processing (a.k.a. transforms) that expects the correct document structure, and can lead to it failing in unexpected ways (for example, the processing of toctrees).
This is an intrinsic limitation of the way docutils/sphinx works. I would emphasise that I don't believe it is a bug in docutils/sphinx though, because it is not a new problem when it comes to markup parsing (see for example jgm/djot#213).
In my opinion, nested_parse_with_titles should probably be a private function in sphinx, as it is not really intended to be used by extensions or users, unless they really know what they are doing.
Plus, ideally this functionality would be upstreamed into docutils itself, so that it is less of a hack of the underlying docutils parser.
The core use cases for nested_parse_with_titles are:
- If you need to "include" a document fragment inside another document, but want the heading level of the document fragment adjusted to the current level. This is utilised, for example, when including docstrings from Python code into sphinx documentation.
- If your directive is only used at the "top-level" of a document, and the generated nodes are not wrapped in any kind of "container" node, or you are also doing some form of "post-processing" on the generated nodes, to fix the problems that arise from nested headings. This is utilised, for example, in the ifconfig directive.
These are both very specific and advanced use cases, and should be approached with great caution (and testing).
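As an illustration of the "post-processing" route, one option is to remove the wrapping node once parsing is done, so that its sections become direct children of the enclosing section again. This is only a sketch of the general idea, using the docutils replace_self method; it is not a copy of what any particular directive does, and the function name unwrap is made up for this example:

```python
from docutils import nodes


def unwrap(wrapper):
    """Replace `wrapper` with its children, lifting them into its parent."""
    wrapper.replace_self(wrapper.children)


# Hand-built tree: a section holding a container that wraps a nested section.
root = nodes.section(ids=["section-1-1"])
wrapper = nodes.container()
nested = nodes.section(ids=["section-1-1-1"])
wrapper += nested
root += wrapper

unwrap(wrapper)
print(nested.parent is root)
```

Note that this only repairs the parent/child relationship; whether the heading levels inside the fragment are the ones you want still depends on how the fragment was parsed.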