currently the notebook schema mixes intrinsic notebook properties with application-specific definitions. it would be good to separate the pure schema information from the application-specific schema so that the result is extensible. this document explores a core set of definitions that can be extended to either a traditional notebook container or a streaming lines format.
a notebook is a collection of cells that can be contained in different ways.
the defs schema is reused to demonstrate different top-level schemas.
its purpose is to make the cell schema and the cell metadata schema extensible.
"#/$defs/metadata"
- notebook level metadata needed for either nested or streaming containers.
"#/$defs/cells"
- allows different combinations of cells to be used.
- these specific definitions can define their own cell metadata schema
"#/$defs/cell-metadata"
- constraints for metadata across any cell, like execution time or slide type.
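the extension point in "#/$defs/cells" can be sketched in python as plain dictionaries. this is a minimal sketch, not part of any existing tooling: the helper name with_cell_type is hypothetical, and the body of the sql cell schema (including the connection property) is invented purely for illustration.

```python
import copy

# the cell choices a core schema might start with (refs taken from this document)
defs = {
    "$defs": {
        "cells": {
            "oneOf": [
                {"$ref": "code-cell.json"},
                {"$ref": "md-cell.json"},
                {"$ref": "raw-cell.json"},
            ]
        }
    }
}

# hypothetical content for sql-cell.json: a new cell type that
# constrains its own metadata (property names are assumptions)
sql_cell = {
    "properties": {
        "cell_type": {"const": "sql"},
        "metadata": {"properties": {"connection": {"type": "string"}}},
    },
    "required": ["cell_type"],
}


def with_cell_type(defs, ref):
    """return a copy of defs whose cells definition also accepts the schema at ref."""
    extended = copy.deepcopy(defs)
    choices = extended["$defs"]["cells"]["oneOf"]
    if {"$ref": ref} not in choices:
        choices.append({"$ref": ref})
    return extended


extended = with_cell_type(defs, "sql-cell.json")
```

the core definitions stay untouched, and an application opts in to a new cell type by adding one reference.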
defs=\
"$defs":
  cells:
    "$comment": new cells can define their own metadata constraints
    oneOf:
      - "$ref": "code-cell.json"
      - "$ref": "md-cell.json"
      - "$ref": "raw-cell.json"
      - "$ref": "sql-cell.json"
  metadata:
    "$comment": notebook metadata
    type: object
  cell-metadata:
    "$comment": it is possible to add metadata that applies across all cells
    allOf:
      - "$comment": slide metadata would most likely be part of the core schema because it already exists
        "$ref": "cell-metadata-slides.json"
      - "$comment": cell execution information
        "$ref": "cell-metadata-execution.json"
this schema uses the defs above to recapture the current container format for the notebook.
nb=\
required: [metadata, cells]
properties:
  metadata:
    "$ref": "#/$defs/metadata"
  cells:
    items:
      allOf:
        - "$comment": default metadata properties can be defined
          "$ref": "cell.json"
        - "$ref": "#/$defs/cells"
        - "$ref": "#/$defs/cell-metadata"
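a container schema only makes sense once it is merged with the shared definitions, because its "#/$defs/..." references point into them. the sketch below shows that composition with plain python dictionaries standing in for the parsed yaml. the helper names (compose, local_refs, resolve, unresolved) are hypothetical, and the pointer resolution is simplified: it ignores json pointer escapes and external files such as cell.json.

```python
# shared definitions, trimmed to the parts needed here
defs = {
    "$defs": {
        "metadata": {"type": "object"},
        "cells": {"oneOf": [{"$ref": "code-cell.json"}, {"$ref": "md-cell.json"}]},
        "cell-metadata": {"allOf": [{"$ref": "cell-metadata-slides.json"}]},
    }
}

# the nested notebook container
nb = {
    "required": ["metadata", "cells"],
    "properties": {
        "metadata": {"$ref": "#/$defs/metadata"},
        "cells": {
            "items": {
                "allOf": [
                    {"$ref": "cell.json"},
                    {"$ref": "#/$defs/cells"},
                    {"$ref": "#/$defs/cell-metadata"},
                ]
            }
        },
    },
}


def compose(defs, container):
    """merge shared definitions with a container schema; container keys win on conflict."""
    return {**defs, **container}


def local_refs(node):
    """yield every "$ref" in a schema that points inside the same document."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$ref":
                if isinstance(value, str) and value.startswith("#/"):
                    yield value
            else:
                yield from local_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from local_refs(item)


def resolve(schema, ref):
    """follow a local reference like "#/$defs/cells" (no escape handling)."""
    target = schema
    for part in ref[2:].split("/"):
        target = target[part]
    return target


def unresolved(schema):
    """list the local references that do not resolve within the schema."""
    missing = []
    for ref in local_refs(schema):
        try:
            resolve(schema, ref)
        except (KeyError, TypeError):
            missing.append(ref)
    return missing


notebook_schema = compose(defs, nb)
```

checking unresolved(notebook_schema) is a cheap way to confirm that a new container only relies on definitions the core schema actually provides.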
another representation of the container is json lines.
in this schema, the first line captures top-level notebook information, like the metadata with the kernel spec information and notebook format. every line thereafter is a cell defined by the #/items schema.
lines=\
prefixItems:
  - "$comment": |
      let the first line contain the notebook level metadata.
      using a readline approach we can peek at the state of a notebook if it has a bunch of information in the metadata
    "$ref": "#/$defs/metadata"
items:
  allOf:
    - "$comment": default metadata properties can be defined
      "$ref": "cell.json"
    - "$ref": "#/$defs/cells"
    - "$ref": "#/$defs/cell-metadata"
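the readline idea can be sketched with the python standard library. this is an illustration under assumptions rather than a proposed api: the function names (dump_lines, peek_metadata, load_lines) are hypothetical, and the sample cells only carry enough fields to show the round trip.

```python
import io
import json


def dump_lines(notebook, stream):
    """write the notebook metadata as the first line, then one cell per line."""
    stream.write(json.dumps(notebook["metadata"]) + "\n")
    for cell in notebook["cells"]:
        stream.write(json.dumps(cell) + "\n")


def peek_metadata(stream):
    """read only the first line, leaving every cell unread."""
    return json.loads(stream.readline())


def load_lines(stream):
    """rebuild the nested container from the streaming representation."""
    metadata = peek_metadata(stream)
    cells = [json.loads(line) for line in stream if line.strip()]
    return {"metadata": metadata, "cells": cells}


# a small sample notebook in the nested container format
notebook = {
    "metadata": {"kernelspec": {"name": "python3"}},
    "cells": [
        {"cell_type": "markdown", "source": "# title", "metadata": {}},
        {"cell_type": "code", "source": "1 + 1", "metadata": {}},
    ],
}

buffer = io.StringIO()
dump_lines(notebook, buffer)

buffer.seek(0)
first = peek_metadata(buffer)

buffer.seek(0)
roundtrip = load_lines(buffer)
```

because json.dumps keeps each value on a single line, the nested and streaming containers hold the same information and can be converted in either direction.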