@tonyfast
Created June 28, 2022 18:46
{
"cells": [
{
"cell_type": "code",
"execution_count": 43,
"id": "84f6a1e4-041f-45df-a2ed-f35c71db4d9e",
"metadata": {
"jupyter": {
"source_hidden": true
},
"tags": []
},
"outputs": [
{
"data": {
"text/markdown": [
"# how general is the `nbformat`?\n",
"\n",
"_a seed for discussion_\n",
"\n",
"* `nbformat` hidden in plain sight keeps millions of notebooks honest\n",
"* literate programming and computing\n",
" * problems with plain text\n",
" * literate documents\n",
"* a format for notebooks -> a format for documents"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# how general is the `nbformat`?\n",
"\n",
"_a seed for discussion_\n",
"\n",
"* `nbformat` hidden in plain sight keeps millions of notebooks honest\n",
"* literate programming and computing\n",
" * problems with plain text\n",
" * literate documents\n",
"* a format for notebooks -> a format for documents"
]
},
{
"cell_type": "code",
"execution_count": 44,
"id": "bc3d190e-2655-4dd3-94b1-51649914856f",
"metadata": {
"jupyter": {
"source_hidden": true
},
"tags": []
},
"outputs": [],
"source": [
" \n",
" from mistune import markdown\n",
" from functools import partial\n",
" from textwrap import indent\n",
" from json import dumps\n",
" shell.weave.environment.filters[\"md\"] = markdown\n",
" shell.weave.environment.filters[\"in\"] = lambda x, i: indent(x, prefix=\" \"*i)\n",
" shell.weave.environment.filters[\"dumps\"] = partial(dumps, indent=2)"
]
},
{
"cell_type": "code",
"execution_count": 65,
"id": "84794bd8-fb6f-4299-98ee-cc10f28a87bd",
"metadata": {
"jupyter": {
"source_hidden": true
},
"tags": []
},
"outputs": [
{
"data": {
"text/markdown": [
"<details><summary>what systems use the <code>nbformat</code>?</summary>\n",
"<ul>\n",
"<li><code>jupyterlab</code>, <code>notebook</code>, <code>nbviewer</code>, <code>papermill</code>, <code>jupytext</code>, <code>black</code>, <code>nbqa</code>, <code>...</code></li>\n",
"<li>GitHub, GitLab, Colab, VS Code</li>\n",
"<li>publishers <a href=\"https://joss.theoj.org/papers/in/Jupyter%20Notebook\">JOSS</a></li>\n",
"<li>organizations, communities</li>\n",
"</ul>\n",
"\n",
"</details>\n",
"\n",
"\n",
"\n",
"[what systems don't?](https://discourse.jupyter.org/t/julia-community-is-creating-a-new-notebook-format/5422)"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"<details><summary>what systems use the <code>nbformat</code>?</summary>\n",
"{% filter md %}\n",
"* `jupyterlab`, `notebook`, `nbviewer`, `papermill`, `jupytext`, `black`, `nbqa`, `...`\n",
"* GitHub, GitLab, Colab, VS Code\n",
"* publishers [JOSS]\n",
"* organizations, communities\n",
"\n",
"[JOSS]: https://joss.theoj.org/papers/in/Jupyter%20Notebook\n",
"{% endfilter %}\n",
"</details>\n",
"\n",
"\n",
"\n",
"[what systems don't?](https://discourse.jupyter.org/t/julia-community-is-creating-a-new-notebook-format/5422)"
]
},
{
"cell_type": "code",
"execution_count": 66,
"id": "9f4c02ee-5ecf-46bd-a886-afad7d35a8ba",
"metadata": {
"jupyter": {
"source_hidden": true
},
"tags": []
},
"outputs": [
{
"data": {
"text/markdown": [
"## [literate ~~programming~~ computing][lc]\n",
"\n",
"> From this perspective, we therefore refer to the workflow exposed by these kinds of computational notebooks (not just IPython, but also Sage, Mathematica and others), as \"literate computing\": it is the weaving of a narrative directly into a live computation, interleaving text with code and results to construct a complete piece that relies equally on the textual explanations and the computational components.\n",
"\n",
"[lc]: http://blog.fperez.org/2013/04/literate-computing-and-computational.html"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"## [literate ~~programming~~ computing][lc]\n",
"\n",
"notebooks are for telling stories, even if you are only talking to yourself. in notebooks, code, data, and hypermedia become literary devices.\n",
"\n",
"> From this perspective, we therefore refer to the workflow exposed by these kinds of computational notebooks (not just IPython, but also Sage, Mathematica and others), as \"literate computing\": it is the weaving of a narrative directly into a live computation, interleaving text with code and results to construct a complete piece that relies equally on the textual explanations and the computational components.\n",
"\n",
"[lc]: http://blog.fperez.org/2013/04/literate-computing-and-computational.html"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "c744325a-8f73-4428-b381-c1f4ee5fdb06",
"metadata": {
"jupyter": {
"source_hidden": true
}
},
"outputs": [
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<!-- Generated by graphviz version 2.50.0 (20220117.2223)\n",
" -->\n",
"<!-- Pages: 1 -->\n",
"<svg width=\"387pt\" height=\"98pt\"\n",
" viewBox=\"0.00 0.00 387.48 98.00\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n",
"<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 94)\">\n",
"<polygon fill=\"white\" stroke=\"transparent\" points=\"-4,4 -4,-94 383.48,-94 383.48,4 -4,4\"/>\n",
"<!-- WEB -->\n",
"<g id=\"node1\" class=\"node\">\n",
"<title>WEB</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"33.8\" cy=\"-45\" rx=\"33.6\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"33.8\" y=\"-41.3\" font-family=\"Times,serif\" font-size=\"14.00\">WEB</text>\n",
"</g>\n",
"<!-- TEX -->\n",
"<g id=\"node2\" class=\"node\">\n",
"<title>TEX</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"194.49\" cy=\"-72\" rx=\"29.8\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"194.49\" y=\"-68.3\" font-family=\"Times,serif\" font-size=\"14.00\">TEX</text>\n",
"</g>\n",
"<!-- WEB&#45;&gt;TEX -->\n",
"<g id=\"edge1\" class=\"edge\">\n",
"<title>WEB&#45;&gt;TEX</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M66.15,-50.33C91.78,-54.7 128.11,-60.88 155.41,-65.52\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"154.95,-68.99 165.39,-67.22 156.12,-62.09 154.95,-68.99\"/>\n",
"<text text-anchor=\"middle\" x=\"116.09\" y=\"-66.8\" font-family=\"Times,serif\" font-size=\"14.00\">WEAVE</text>\n",
"</g>\n",
"<!-- PAS -->\n",
"<g id=\"node4\" class=\"node\">\n",
"<title>PAS</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"194.49\" cy=\"-18\" rx=\"28.7\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"194.49\" y=\"-14.3\" font-family=\"Times,serif\" font-size=\"14.00\">PAS</text>\n",
"</g>\n",
"<!-- WEB&#45;&gt;PAS -->\n",
"<g id=\"edge3\" class=\"edge\">\n",
"<title>WEB&#45;&gt;PAS</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M65.08,-38.1C71.82,-36.66 78.94,-35.21 85.59,-34 109.02,-29.73 135.47,-25.8 156.32,-22.9\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"156.93,-26.34 166.36,-21.52 155.98,-19.41 156.93,-26.34\"/>\n",
"<text text-anchor=\"middle\" x=\"116.09\" y=\"-37.8\" font-family=\"Times,serif\" font-size=\"14.00\">TANGLE</text>\n",
"</g>\n",
"<!-- DVI -->\n",
"<g id=\"node3\" class=\"node\">\n",
"<title>DVI</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"348.94\" cy=\"-72\" rx=\"27.9\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"348.94\" y=\"-68.3\" font-family=\"Times,serif\" font-size=\"14.00\">DVI</text>\n",
"</g>\n",
"<!-- TEX&#45;&gt;DVI -->\n",
"<g id=\"edge2\" class=\"edge\">\n",
"<title>TEX&#45;&gt;DVI</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M224.52,-72C249,-72 284.19,-72 310.74,-72\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"310.8,-75.5 320.8,-72 310.8,-68.5 310.8,-75.5\"/>\n",
"<text text-anchor=\"middle\" x=\"271.39\" y=\"-75.8\" font-family=\"Times,serif\" font-size=\"14.00\">TEX</text>\n",
"</g>\n",
"<!-- REL -->\n",
"<g id=\"node5\" class=\"node\">\n",
"<title>REL</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"348.94\" cy=\"-18\" rx=\"30.59\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"348.94\" y=\"-14.3\" font-family=\"Times,serif\" font-size=\"14.00\">REL</text>\n",
"</g>\n",
"<!-- PAS&#45;&gt;REL -->\n",
"<g id=\"edge4\" class=\"edge\">\n",
"<title>PAS&#45;&gt;REL</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M223.1,-18C246.81,-18 281.29,-18 308,-18\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"308.19,-21.5 318.19,-18 308.19,-14.5 308.19,-21.5\"/>\n",
"<text text-anchor=\"middle\" x=\"271.39\" y=\"-21.8\" font-family=\"Times,serif\" font-size=\"14.00\">PASCAL</text>\n",
"</g>\n",
"</g>\n",
"</svg>\n"
],
"text/plain": [
"<graphviz.sources.Source at 0x7f1f0004d5b0>"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
" \n",
" import graphviz\n",
" hommage = graphviz.Source(\n",
" \"\"\"digraph {rankdir=LR WEB->TEX[label=WEAVE] TEX->DVI[label=TEX] WEB->PAS[label=TANGLE] PAS->REL[label=PASCAL]}\"\"\"\n",
" )\n",
" md = graphviz.Source(\n",
" \"\"\"digraph {rankdir=LR MD->HTML[label=WEAVE]}\"\"\"\n",
" )\n",
" nb = graphviz.Source(\n",
" \"\"\"digraph {rankdir=LR IPYNB->JSON[label=WEAVE] JSON->HTML[label=nbconvert] \n",
" IPYNB->JSON[label=TANGLE] JSON->PY[label=nbconvert]}\"\"\"\n",
" )\n",
" hommage"
]
},
{
"cell_type": "code",
"execution_count": 74,
"id": "745ac9a9-2b46-492e-9a87-4aff546ea4d8",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/markdown": [
"* notebooks are literate programs that permit stories in multiple formal and informal languages\n",
"* _notebooks the literature_ will outlive _notebooks the software_.\n",
"* on the web, they enable multiple authors. \n",
"* `nbformat` is the rock for these affordances"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"* notebooks are literate programs that permit stories in multiple formal and informal languages\n",
"* _notebooks the literature_ will outlive _notebooks the software_.\n",
"* on the web, they enable multiple authors. \n",
"* `nbformat` is the rock for these affordances"
]
},
{
"cell_type": "code",
"execution_count": 68,
"id": "4c89854e-22ca-4540-a567-2a765cdd2082",
"metadata": {
"jupyter": {
"source_hidden": true
}
},
"outputs": [
{
"data": {
"text/markdown": [
"### json to plain text\n",
"\n",
"contemporary `notebook` applications are biased by the prevalence of POSIX systems.\n",
"\n",
"* `jupytext` weaves notebooks into different plain-text formats\n",
"* `nbexplore` models the notebook as a file system\n",
"* `importnb` uses python imports to find modules on the file system. _in `pyolite` contexts, it is a virtual file system._"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"### json to plain text\n",
"\n",
"contemporary `notebook` applications are biased by the prevalence of POSIX systems.\n",
"\n",
"* `jupytext` weaves notebooks into different plain-text formats\n",
"* `nbexplore` models the notebook as a file system\n",
"* `importnb` uses python imports to find modules on the file system. _in `pyolite` contexts, it is a virtual file system._"
]
},
{
"cell_type": "code",
"execution_count": 71,
"id": "9f381f7a-c575-4469-bdb6-a1feae966e23",
"metadata": {
"jupyter": {
"source_hidden": true
},
"tags": []
},
"outputs": [
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<!-- Generated by graphviz version 2.50.0 (20220117.2223)\n",
" -->\n",
"<!-- Pages: 1 -->\n",
"<svg width=\"438pt\" height=\"98pt\"\n",
" viewBox=\"0.00 0.00 437.68 98.00\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n",
"<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 94)\">\n",
"<polygon fill=\"white\" stroke=\"transparent\" points=\"-4,4 -4,-94 433.68,-94 433.68,4 -4,4\"/>\n",
"<!-- IPYNB -->\n",
"<g id=\"node1\" class=\"node\">\n",
"<title>IPYNB</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"40.95\" cy=\"-45\" rx=\"40.89\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"40.95\" y=\"-41.3\" font-family=\"Times,serif\" font-size=\"14.00\">IPYNB</text>\n",
"</g>\n",
"<!-- JSON -->\n",
"<g id=\"node2\" class=\"node\">\n",
"<title>JSON</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"214.64\" cy=\"-45\" rx=\"36\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"214.64\" y=\"-41.3\" font-family=\"Times,serif\" font-size=\"14.00\">JSON</text>\n",
"</g>\n",
"<!-- IPYNB&#45;&gt;JSON -->\n",
"<g id=\"edge1\" class=\"edge\">\n",
"<title>IPYNB&#45;&gt;JSON</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M82.2,-45C108.1,-45 141.8,-45 168.59,-45\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"168.86,-48.5 178.86,-45 168.86,-41.5 168.86,-48.5\"/>\n",
"<text text-anchor=\"middle\" x=\"130.39\" y=\"-48.8\" font-family=\"Times,serif\" font-size=\"14.00\">WEAVE</text>\n",
"</g>\n",
"<!-- IPYNB&#45;&gt;JSON -->\n",
"<g id=\"edge3\" class=\"edge\">\n",
"<title>IPYNB&#45;&gt;JSON</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M71.87,-33.19C80.75,-30.24 90.59,-27.49 99.89,-26 126.66,-21.72 134.19,-21.34 160.89,-26 166.35,-26.95 171.98,-28.42 177.44,-30.13\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"176.43,-33.48 187.02,-33.39 178.69,-26.85 176.43,-33.48\"/>\n",
"<text text-anchor=\"middle\" x=\"130.39\" y=\"-29.8\" font-family=\"Times,serif\" font-size=\"14.00\">TANGLE</text>\n",
"</g>\n",
"<!-- HTML -->\n",
"<g id=\"node3\" class=\"node\">\n",
"<title>HTML</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"390.03\" cy=\"-72\" rx=\"39.79\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"390.03\" y=\"-68.3\" font-family=\"Times,serif\" font-size=\"14.00\">HTML</text>\n",
"</g>\n",
"<!-- JSON&#45;&gt;HTML -->\n",
"<g id=\"edge2\" class=\"edge\">\n",
"<title>JSON&#45;&gt;HTML</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M249.07,-50.21C275.57,-54.33 312.9,-60.15 342.35,-64.73\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"341.99,-68.22 352.41,-66.3 343.06,-61.3 341.99,-68.22\"/>\n",
"<text text-anchor=\"middle\" x=\"300.39\" y=\"-65.8\" font-family=\"Times,serif\" font-size=\"14.00\">nbconvert</text>\n",
"</g>\n",
"<!-- PY -->\n",
"<g id=\"node4\" class=\"node\">\n",
"<title>PY</title>\n",
"<ellipse fill=\"none\" stroke=\"black\" cx=\"390.03\" cy=\"-18\" rx=\"27\" ry=\"18\"/>\n",
"<text text-anchor=\"middle\" x=\"390.03\" y=\"-14.3\" font-family=\"Times,serif\" font-size=\"14.00\">PY</text>\n",
"</g>\n",
"<!-- JSON&#45;&gt;PY -->\n",
"<g id=\"edge4\" class=\"edge\">\n",
"<title>JSON&#45;&gt;PY</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M247.65,-37.93C254.47,-36.53 261.65,-35.15 268.39,-34 296.69,-29.18 328.94,-24.97 352.97,-22.08\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"353.54,-25.53 363.06,-20.88 352.72,-18.58 353.54,-25.53\"/>\n",
"<text text-anchor=\"middle\" x=\"300.39\" y=\"-37.8\" font-family=\"Times,serif\" font-size=\"14.00\">nbconvert</text>\n",
"</g>\n",
"</g>\n",
"</svg>\n"
],
"text/plain": [
"<graphviz.sources.Source at 0x7f1ef063dfa0>"
]
},
"execution_count": 71,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/markdown": [
"### notebooks at web scale\n",
"\n",
"POSIX is dead! long live posix!\n",
"\n",
"* enter `pyscript` `pyolite` `jupyterlite`\n",
"* in the browser `jupyterlite` Jupyter contexts make `json` our lingua franca\n",
"* questions about `json`?\n",
" * is it valid json?\n",
" * is it a valid `nbformat`?\n",
" * how big is it? \n",
" * how will the content affect performance?\n",
" \n",
" nb"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"### notebooks at web scale\n",
"\n",
"POSIX is dead! long live posix!\n",
"\n",
"* enter `pyscript` `pyolite` `jupyterlite`\n",
"* in the browser `jupyterlite` Jupyter contexts make `json` our lingua franca\n",
"* questions about `json`?\n",
" * is it valid json?\n",
" * is it a valid `nbformat`?\n",
" * how big is it? \n",
" * how will the content affect performance?\n",
" \n",
" nb"
]
},
{
"cell_type": "markdown",
"id": "b454fa18-cd9d-4489-b608-917d240a0b7d",
"metadata": {},
"source": [
"## hypermedia collage\n",
"\n",
"the `nbformat` presents the ability to store many document types in a single data structure.\n",
"\n",
"* markdown and python go in cells\n",
"* other mimetypes go in the display\n",
"* other valid json can live in metadata"
]
},
{
"cell_type": "code",
"execution_count": 72,
"id": "efd356ec-5a94-4c3e-8e90-2fae94f1b0ae",
"metadata": {
"jupyter": {
"source_hidden": true
},
"tags": []
},
"outputs": [],
"source": [
" \n",
" from schemin import Object as O\n",
" base =\\\n",
"https://raw.githubusercontent.com/jupyter/nbformat/main/nbformat/v4/nbformat.v4.5.schema.json\n",
"\n",
" nbid = \"nbformat/v4/v4.5.schema.json\"\n",
" t0 = O.Id[base] + O.Vocabulary[{\n",
" nbid: True\n",
" }] + O.AllOf[\n",
" O.Ref[nbid],\n",
" ]\n",
" t1 = O.List[\n",
" O.Ref[\"#/definitions/cell\"]\n",
" ] + O.Id[\"nbformat/v5/v5.-100.schema.json\"]"
]
},
{
"cell_type": "code",
"execution_count": 73,
"id": "8f480a0d-21ee-4c76-b8ee-4efff1eae322",
"metadata": {
"jupyter": {
"source_hidden": true
},
"tags": []
},
"outputs": [
{
"data": {
"text/markdown": [
"## questions for the future?\n",
"\n",
"* do the newer `jsonschema` standards allow for more composable schema?\n",
"\n",
" {\n",
" \"allOf\": [\n",
" {\n",
" \"$ref\": \"nbformat/v4/v4.5.schema.json\"\n",
" }\n",
" ],\n",
" \"$vocabulary\": {\n",
" \"nbformat/v4/v4.5.schema.json\": true\n",
" },\n",
" \"$id\": \"https://raw.githubusercontent.com/jupyter/nbformat/main/nbformat/v4/nbformat.v4.5.schema.json\"\n",
" }\n",
" \n",
"* would a jsonlines notebook format help with large notebooks? what about parts of notebooks? search?\n",
"\n",
" {\n",
" \"$id\": \"nbformat/v5/v5.-100.schema.json\",\n",
" \"items\": {\n",
" \"$ref\": \"#/definitions/cell\"\n",
" },\n",
" \"type\": \"array\"\n",
" }\n",
"\n",
"* <details expand=\"false\"><summary>is <code>cell_type</code> real? are there better standards like mime/types?</summary>\n",
" <p><img src=\"https://c.tenor.com/JHjG5vxW9zIAAAAd/missy-elliott-work-it.gif\" alt=\"\"></p>\n",
"\n",
" </details>"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"## questions for the future?\n",
"\n",
"* do the newer `jsonschema` standards allow for more composable schema?\n",
"\n",
"{{t0.schema() | dumps | in(4)}}\n",
" \n",
"* would a jsonlines notebook format help with large notebooks? what about parts of notebooks? search?\n",
"\n",
"{{t1.schema() | dumps | in(4)}}\n",
"\n",
"* <details expand=\"false\"><summary>is <code>cell_type</code> real? are there better standards like mime/types?</summary>\n",
" {{\"![](https://c.tenor.com/JHjG5vxW9zIAAAAd/missy-elliott-work-it.gif)\" | md}}\n",
" </details>"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "pidgy",
"language": "markdown",
"name": "pidgy"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.10"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
@agoose77

Yo @tonyfast I missed this meeting and only just stumbled across this.

I'm also interested in the idea of where the notebook format goes - https://discourse.jupyter.org/t/notebook-cell-type-generalisation/10703/8

When looking at the existing schema from the perspective of "I want to completely break your assumptions about Markdown", it quickly feels as though what we have right now is too restrictive.

@tonyfast

thanks for the message @agoose77 . i like where you are going with these thoughts. stoked to see folks pushing this topic.

i definitely agree that the notebook is a rich structured document, but i feel that minimizes the polyglot nature. my interpretation is notebooks are "hypermedia collage".

what is funny is we probably agree, but i am actually of the mind that the notebook format is too permissive. my experience with pidgy has shown that when you elevate markdown to the primary language, you naturally lose the ternary of raw, code, and markdown cell conventions. instead, cells become a binary of ON and OFF cells.

i could see something added to the schema that makes statements like:

{
  ...
  "if": {
    "properties": {"cell_type": {"const": "code"}}
  },
  "then": {
    "properties": {"outputs": {"type": "array"}}
  },
  "else": {
    "properties": {"outputs": false}
  }
  ...
}
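
to make the intent of that rule concrete, here is a rough python sketch. it is not a real json schema validator: `is_valid` is a hypothetical helper that only understands the handful of keywords used in the rule (`const`, `type: array`, `properties`, boolean schemas, and `if`/`then`/`else`), and the `if` test is wrapped in `properties` so that it matches on the `cell_type` key.

```python
# Hypothetical rule: only code cells may carry an "outputs" array.
CELL_RULE = {
    "if": {"properties": {"cell_type": {"const": "code"}}},
    "then": {"properties": {"outputs": {"type": "array"}}},
    "else": {"properties": {"outputs": False}},
}


def is_valid(schema, instance):
    """Evaluate a tiny subset of JSON Schema (illustration only)."""
    # A boolean schema accepts everything (True) or nothing (False).
    if isinstance(schema, bool):
        return schema
    if "const" in schema and instance != schema["const"]:
        return False
    if schema.get("type") == "array" and not isinstance(instance, list):
        return False
    # "properties" only constrains keys that are actually present.
    if isinstance(instance, dict):
        for key, subschema in schema.get("properties", {}).items():
            if key in instance and not is_valid(subschema, instance[key]):
                return False
    # Pick the "then" or "else" branch depending on the "if" test.
    if "if" in schema:
        branch = "then" if is_valid(schema["if"], instance) else "else"
        if branch in schema and not is_valid(schema[branch], instance):
            return False
    return True


code_cell = {"cell_type": "code", "source": "1 + 1", "outputs": []}
prose_cell = {"cell_type": "markdown", "source": "# hello"}
bad_cell = {"cell_type": "markdown", "source": "# hello", "outputs": []}

for cell in (code_cell, prose_cell, bad_cell):
    print(cell["cell_type"], "outputs" in cell, is_valid(CELL_RULE, cell))
```

in this sketch the markdown cell that carries `outputs` is the only one rejected, which is the ON/OFF distinction expressed as a structural rule.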

i think y'all might have the best case to make if you can keep the vocabulary near to the existing vocab with minimal changes. the more that can be reused the better. the new json schema approaches are more composable and could allow us to maintain backwards compatibility while evolving the existing schema.

schema vs context

it's going to be hard to add a lot of contextual information to a schema, and that is not really the goal. schemas impose structural consistency, not contextual information. for that, tools like json-ld may be a better fit for describing the context of the markdown in a given document.

further i would suggest looking at the ietf rfcs as there are some recommendations in dealing with markdown variants https://www.rfc-editor.org/rfc/rfc7764.html
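
as a small illustration of what those rfcs offer: the `text/markdown` media type (rfc 7763) carries an optional `variant` parameter, and rfc 7764 registers variant names such as `GFM` and `CommonMark`. the sketch below reads that parameter with python's standard library. `markdown_variant` is a made-up helper name, and where a notebook would actually store such a media type is left open here.

```python
from email.message import Message
from typing import Optional


def markdown_variant(media_type: str) -> Optional[str]:
    """Return the `variant` parameter of a text/markdown media type, or None if absent."""
    # email.message.Message knows how to parse Content-Type style headers.
    msg = Message()
    msg["Content-Type"] = media_type
    if msg.get_content_type() != "text/markdown":
        raise ValueError(f"not a markdown media type: {media_type!r}")
    return msg.get_param("variant")


print(markdown_variant("text/markdown; charset=UTF-8; variant=GFM"))
print(markdown_variant("text/markdown"))
```

a tool could use a hint like this to decide which markdown parser to apply to a cell, without changing the structural schema at all.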
