@erezsh
Last active August 17, 2020 20:53
Design idea for Omnidoc (working title)

Entities

TreeType

Nodes in the graph implement the interface TreeType{set of supported tree nodes}.

Examples of TreeType:

  • PandocTree
  • DocutilsTree
  • Tree (lark.Tree)

Tree nodes

  • named uniquely to their format (a name implies the types of its children)

  • when possible, use a shared namespace

  • Examples of node names:

    • pandoc_document
    • docutils_para
    • headline
  • provide validation of ontology? (i.e. list of parents / children types)
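
A minimal sketch of what the ontology validation could look like, assuming an ontology is simply a mapping from a node name to the child names it allows. The `Node` class, `ALLOWED_CHILDREN`, `validate_ontology` and the `pandoc_para` name are all hypothetical and only serve the illustration:

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)


# Hypothetical ontology: node name -> set of child node names it may contain.
ALLOWED_CHILDREN = {
    "pandoc_document": {"headline", "pandoc_para"},
    "headline": set(),
    "pandoc_para": set(),
}


def validate_ontology(node, allowed=ALLOWED_CHILDREN):
    """Return a list of (parent, child) name pairs that break the ontology."""
    errors = []
    for child in node.children:
        if child.name not in allowed.get(node.name, set()):
            errors.append((node.name, child.name))
        errors.extend(validate_ontology(child, allowed))
    return errors


doc = Node("pandoc_document", [
    Node("headline"),
    Node("pandoc_para", [Node("headline")]),  # a headline inside a paragraph
])
errors = validate_ontology(doc)
```

Returning the list of offending pairs, rather than raising on the first one, makes it easier to report every problem in a tree at once.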

Transformer

  • Edges in the graph
  • Works like a Lark Transformer, on Tree instances
  • Actions: Discard, Inline
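
To make the two actions concrete, here is a standalone sketch of a bottom-up transformer. It is not Lark's implementation (Lark ships its own `Transformer` and `Discard`). The `Tree`, `DISCARD` and `Inline` definitions below are hypothetical stand-ins, and the meaning of Inline (splice the children into the parent) is an assumption:

```python
from dataclasses import dataclass, field


@dataclass
class Tree:
    data: str                      # node name, mirroring lark.Tree.data
    children: list = field(default_factory=list)


DISCARD = object()  # hypothetical sentinel: drop this node from its parent


@dataclass
class Inline:
    """Hypothetical action: splice these children into the parent."""
    children: list


class Transformer:
    """Bottom-up visitor: calls the method named after each node, if any."""

    def transform(self, node):
        if not isinstance(node, Tree):
            return node  # leaf value, e.g. a string
        new_children = []
        for child in node.children:
            result = self.transform(child)
            if result is DISCARD:
                continue
            if isinstance(result, Inline):
                new_children.extend(result.children)
            else:
                new_children.append(result)
        callback = getattr(self, node.data, None)
        if callback is None:
            return Tree(node.data, new_children)
        return callback(new_children)


class StripComments(Transformer):
    def comment(self, children):
        return DISCARD           # remove comment nodes entirely

    def span(self, children):
        return Inline(children)  # replace the span by its own children


tree = Tree("para", ["a", Tree("comment", ["x"]), Tree("span", ["b", "c"])])
result = StripComments().transform(tree)
```

After the run, `result` is a `para` node whose children are the three strings `"a"`, `"b"` and `"c"`.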

Translation

Every transform defines two sets: Incoming and Outgoing

Before running, it validates that the tree's TreeType is a subset of Incoming. After running, it validates that the result is a subset of Outgoing.

Similarly, each reader defines the set of features it might produce (the result must be a subset), and each writer defines the set that it is capable of writing.

For convenience, we can group the node names into categories, which will simplify the notation. Example:

    OUTGOING = Nodes.Common | Nodes.HTML | {"unique_feature"}
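
The before/after validation could be sketched roughly like this, assuming the sets are plain `frozenset`s of node names. The `Transform` base class, `node_names`, the `DivToPara` example and the contents of the `Nodes` categories are all hypothetical:

```python
from dataclasses import dataclass, field


@dataclass
class Tree:
    data: str
    children: list = field(default_factory=list)


class Nodes:
    # Hypothetical categories of node names.
    Common = frozenset({"document", "para", "headline"})
    HTML = frozenset({"html_div", "html_span"})


def node_names(tree):
    """Set of node names actually used in the tree."""
    names = {tree.data}
    for child in tree.children:
        if isinstance(child, Tree):
            names |= node_names(child)
    return names


class Transform:
    INCOMING = frozenset()
    OUTGOING = frozenset()

    def run(self, tree):
        raise NotImplementedError

    def __call__(self, tree):
        # Before running: the input must be a subset of INCOMING.
        extra = node_names(tree) - self.INCOMING
        if extra:
            raise ValueError(f"unsupported input nodes: {sorted(extra)}")
        result = self.run(tree)
        # After running: the output must be a subset of OUTGOING.
        extra = node_names(result) - self.OUTGOING
        if extra:
            raise ValueError(f"undeclared output nodes: {sorted(extra)}")
        return result


class DivToPara(Transform):
    INCOMING = Nodes.Common | Nodes.HTML
    OUTGOING = Nodes.Common

    def run(self, tree):
        name = "para" if tree.data == "html_div" else tree.data
        children = [self.run(c) if isinstance(c, Tree) else c
                    for c in tree.children]
        return Tree(name, children)


out = DivToPara()(Tree("document", [Tree("html_div", ["hi"])]))
```

Since this toy transform leaves `html_span` untouched, a tree containing one would fail the outgoing check, which is exactly the kind of mistake the two sets are meant to catch.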

You start the graph search by giving it a tree and a target TreeType. It will pathfind a sequence of transforms such that, at each step, is_subset(src_node.outgoing, dst_node.incoming) == true.

The result will be a subset of the target TreeType.
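
One way the search could work is a breadth-first search over feature sets, sketched below. This assumes each transform is described only by its two sets. The `Edge` tuple, `find_route` and all node names in the usage example are hypothetical, and BFS is just one reasonable choice of pathfinding strategy:

```python
from collections import deque, namedtuple

# Hypothetical description of a transform: a name plus its two feature sets.
Edge = namedtuple("Edge", "name incoming outgoing")


def find_route(start, target, edges):
    """BFS for the shortest list of edges leading from `start` to `target`.

    `start` and `target` are frozensets of node names.
    Returns None when no route exists.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        features, path = queue.popleft()
        if features <= target:          # result is a subset of the target
            return path
        for edge in edges:
            # An edge is usable when it accepts everything we currently have.
            if features <= edge.incoming and edge.outgoing not in seen:
                seen.add(edge.outgoing)
                queue.append((edge.outgoing, path + [edge]))
    return None


md = frozenset({"md_para", "md_heading"})
common = frozenset({"para", "headline"})
html = frozenset({"html_p", "html_h"})
edges = [
    Edge("md_to_common", md, common),
    Edge("common_to_html", common, html),
    Edge("md_to_rst", md, frozenset({"rst_para"})),
]
route = find_route(md, html, edges)
```

The sketch is conservative: after an edge runs it assumes the full Outgoing set may be present, even if the actual tree uses fewer node types.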

But let's extend the concept just a little bit, as the graph actually accepts several types:

  • TreeType{set-of-features}
  • Markdown
  • HTML
  • rst
  • etc.

And we can write functions that implement the same Incoming -> Outgoing interface.

For example, here's how the Python code might look (technicalities aside):

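    import commonmark

    # edge, Markdown, TreeType and markdown_features belong to the proposed API
    @edge(Markdown, TreeType[markdown_features])
    def parse_commonmark(md):
        # commonmark.py exposes parsing through Parser().parse()
        return commonmark.Parser().parse(md)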
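
For illustration, the `edge` decorator could be little more than a registry. The sketch below assumes that `Markdown` is a marker type, that `TreeType[...]` builds a feature-set description, and that edges live in a module-level list. None of this is settled design, and `parse_stub` is a stand-in so the example runs without a Markdown parser installed:

```python
class Markdown(str):
    """Hypothetical marker type for Markdown source text."""


class TreeType:
    """Hypothetical: TreeType[features] describes trees limited to those nodes."""

    def __init__(self, features):
        self.features = frozenset(features)

    def __class_getitem__(cls, features):
        return cls(features)


EDGES = []  # hypothetical registry of (incoming, outgoing, function) triples


def edge(incoming, outgoing):
    """Register the decorated function as an edge from incoming to outgoing."""
    def decorator(func):
        EDGES.append((incoming, outgoing, func))
        return func
    return decorator


markdown_features = frozenset({"document", "paragraph"})


@edge(Markdown, TreeType[markdown_features])
def parse_stub(md):
    # Stand-in parser: every non-empty line becomes a paragraph.
    paragraphs = [("paragraph", [line]) for line in md.splitlines() if line]
    return ("document", paragraphs)


parsed = parse_stub(Markdown("a\n\nb"))
```

Keeping the decorated function unchanged (the decorator returns `func` itself) means an edge can still be called directly, outside the graph.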

This will enable you to eventually write something like this, to convert Markdown->HTML:

    start = Markdown('index.md')
    end = HTML(option1="yes", option2="no")
    graph.execute_route(start, end)