
@nvkv
Created April 8, 2013 12:02

Lit, a simple tool for language-agnostic literate programming

Literate programming is a technique introduced by Donald Knuth many years ago. Nowadays literate programming is almost dead, which is really sad in my opinion. This little application is designed to bring the literate programming approach to almost any programming language.

I strongly recommend reading Knuth's original paper on literate programming (http://www.literateprogramming.com/knuthweb.pdf).

Noweb.py by Jonathan Aquino was the inspiration for this humble piece of code.

Main idea

It was surprisingly easy to implement this tool. The main idea is to parse the file in a single pass, line by line, detecting chunks and using a map to store their names and bodies. The second stage recursively 'expands' the chunk bodies, replacing references to other chunks with their contents to produce the full program.

Used packages

To process files this application uses the os, io, bufio and regexp packages. The flag package is used to parse command line parameters. It's a bit crude, but it's OK.

Run flow

Parsing command line parameters

Right after start the application tries to parse the command line parameters. If some vital data is missing, the application shows usage and exits. There are four parameters overall:

  • --src-out: File name for code output (tangle output)
  • --doc-out: File name for document output (weave output)
  • --default-chunk: Default chunk name. The chunk with this name is considered to hold the main program code. By default its name is "*"
  • The first argument after all options is taken as the name of the file to parse

As mentioned above, we use the flag package to parse the command line. A variable is defined for every command line argument. The default value for the src-out and doc-out parameters is the empty string. In Go this is the "zero value" for a string, so we can detect when the user omits one parameter or the other. The default value for default-chunk is always "*".
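The option handling described above could look roughly like the sketch below. It is not the original source: the options struct, its field names and the parseArgs function are hypothetical names chosen for illustration, and a private flag.FlagSet is used instead of the global flag set only to keep the example easy to test.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options collects the parsed command line.
// The type and its field names are hypothetical, not taken from the original code.
type options struct {
	srcOut       string // --src-out, empty when omitted
	docOut       string // --doc-out, empty when omitted
	defaultChunk string // --default-chunk, "*" when omitted
	input        string // first positional argument, empty when omitted
}

// parseArgs parses args (without the program name).
func parseArgs(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("lit", flag.ContinueOnError)
	fs.StringVar(&o.srcOut, "src-out", "", "file name for code output (tangle output)")
	fs.StringVar(&o.docOut, "doc-out", "", "file name for document output (weave output)")
	fs.StringVar(&o.defaultChunk, "default-chunk", "*", "name of the chunk holding the main program")
	if err := fs.Parse(args); err != nil {
		return o, err
	}
	// Arg returns "" when the requested positional argument does not exist.
	o.input = fs.Arg(0)
	return o, nil
}

func main() {
	o, err := parseArgs(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("%+v\n", o)
}
```

Because the defaults for src-out and doc-out are empty strings, a later check only has to compare the fields against "" to see which outputs were requested.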

Check command line options validity

If there is no file to parse, we can't do anything except show usage. Another case is when both src-out and doc-out are missing. In this situation the application shows usage too, because it can't do anything useful with the given file.

But if /only one/ of them is missing, the application can dump the source code or the documentation without dumping the other part.

For example, if you want to generate both documentation and source from some file source.w, you should run:

lit --src-out source.c --doc-out source.tex source.w

But if you need only the source, you can omit the doc-out parameter:

lit --src-out source.c source.w

The same works for doc-out.
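The validity rules in this section fit in a few lines of Go. This is only a sketch: the function name checkOptions and the error messages are assumptions, not part of the original program.

```go
package main

import (
	"errors"
	"fmt"
)

// checkOptions reports why the given combination of parameters is unusable.
// An empty string means the parameter was omitted. The name is hypothetical.
func checkOptions(input, srcOut, docOut string) error {
	if input == "" {
		return errors.New("no file to parse")
	}
	if srcOut == "" && docOut == "" {
		return errors.New("neither src-out nor doc-out given")
	}
	return nil // at least one output was requested
}

func main() {
	fmt.Println(checkOptions("source.w", "source.c", "source.tex"))
	fmt.Println(checkOptions("source.w", "source.c", ""))
	fmt.Println(checkOptions("source.w", "", ""))
}
```

When the function returns an error, the application would show usage and exit.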

File parsing

File parsing is extremely straightforward. After the file is opened, we read it line by line, trying to match each line against one of the regular expressions below.

The expression "<<([^>]+)>>=" is used to match the beginning of a chunk, and "@" the end of a chunk.

When a chunk beginning is found, we extract its name from the submatches and store it in the variable chunkName. After that, any line not matched by either regular expression is appended to the map named chunks, with the value of chunkName as the key. If a line matches the end-of-chunk expression, chunkName is reset to its zero value. If neither expression matches a line and chunkName is the zero value, that line is appended to the document string variable.

As a result, the parseFile function returns the document string and the chunks map.

To simplify the processing of each line, a closure named processLine is defined. This closure decides where the line currently being processed goes: to a chunk body or to the documentation.
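A minimal sketch of the parsing pass described above follows. The names parseFile, processLine, chunkName and chunks come from the text; everything else is an assumption. In particular, the end-of-chunk expression is anchored here as ^@\s*$ so that an "@" in the middle of a line does not close a chunk (the text only gives "@"), the function also returns the scanner error, and parseString is a hypothetical convenience helper.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"regexp"
	"strings"
)

var (
	chunkStart = regexp.MustCompile(`<<([^>]+)>>=`)
	chunkEnd   = regexp.MustCompile(`^@\s*$`) // anchoring is an assumption
)

// parseFile reads r line by line and splits it into documentation text
// and a map from chunk name to chunk body.
func parseFile(r io.Reader) (string, map[string]string, error) {
	var document strings.Builder
	chunks := make(map[string]string)
	chunkName := "" // zero value means "not inside a chunk"

	// processLine decides where the current line goes.
	processLine := func(line string) {
		m := chunkStart.FindStringSubmatch(line)
		switch {
		case m != nil:
			chunkName = m[1] // beginning of a chunk
		case chunkEnd.MatchString(line):
			chunkName = "" // end of a chunk
		case chunkName != "":
			chunks[chunkName] += line + "\n"
		default:
			document.WriteString(line + "\n")
		}
	}

	scanner := bufio.NewScanner(r)
	for scanner.Scan() {
		processLine(scanner.Text())
	}
	return document.String(), chunks, scanner.Err()
}

// parseString is a hypothetical helper for trying parseFile on a literal.
func parseString(src string) (string, map[string]string, error) {
	return parseFile(strings.NewReader(src))
}

func main() {
	doc, chunks, err := parseString("Some prose.\n<<*>>=\nx := 1\n@\nMore prose.\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("document: %q\nchunks: %q\n", doc, chunks)
}
```

Note that a reference such as <<helper>> inside a chunk body does not match the chunk-beginning expression, because it lacks the trailing "=", so it is stored in the body untouched for the expansion step.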

Expanding chunks

Every chunk body can contain any number of links to other chunks. To build the whole program from the literate source we need to "expand" every chunk body by replacing links to other chunks with their bodies. First we define a data structure for the "final" expanded chunks, expandedChunks. After that we define a regular expression that matches "links" to other chunks.

The body-expanding closure, defined inside the expandChunks function, takes a body as an argument and searches it for links to other chunks. It then replaces every link with the result of a recursive self-invocation on the linked chunk's body. If there are no linked chunks, the closure just returns the given body. Maybe I should check whether expandedChunks already has an expanded body for the linked chunk, to avoid extra work.
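The recursive expansion can be sketched as follows. The names expandChunks and expandedChunks come from the text; the link expression <<([^>]+)>>, the closure name expandBody, and the handling of unknown chunk names (the link is left as is) are assumptions. The sketch does no caching, and it assumes chunks never reference each other in a cycle, since a cycle would recurse forever.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// chunkLink matches a reference to another chunk; the exact expression is an assumption.
var chunkLink = regexp.MustCompile(`<<([^>]+)>>`)

// expandChunks returns a copy of chunks in which every link to another
// chunk has been replaced by that chunk's fully expanded body.
func expandChunks(chunks map[string]string) map[string]string {
	expandedChunks := make(map[string]string, len(chunks))

	var expandBody func(body string) string
	expandBody = func(body string) string {
		// With no links in body, ReplaceAllStringFunc returns body unchanged.
		return chunkLink.ReplaceAllStringFunc(body, func(link string) string {
			name := chunkLink.FindStringSubmatch(link)[1]
			linked, ok := chunks[name]
			if !ok {
				return link // unknown chunk: leave the link untouched
			}
			// Drop the final newline so the line holding the link
			// does not end up followed by an empty line.
			return strings.TrimSuffix(expandBody(linked), "\n")
		})
	}

	for name, body := range chunks {
		expandedChunks[name] = expandBody(body)
	}
	return expandedChunks
}

func main() {
	chunks := map[string]string{
		"*":      "func main() {\n<<helper>>\n}\n",
		"helper": "x := 1\n",
	}
	fmt.Print(expandChunks(chunks)["*"])
}
```

The tangle output would then be the expanded body stored under the default chunk name.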

Main program structure
