@smondet
Created May 3, 2019 14:27
Genspio: Generating Shell Phrases In OCaml

Genspio is a typed EDSL to generate shell scripts from OCaml. The idea is to build values of type 'a Genspio.EDSL.t with the combinators in the Genspio.EDSL module, and compile them to POSIX shell scripts (or one-liners) with functions from the Genspio.Compile module.
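To give a feel for what "typed EDSL compiled to shell" means, here is a minimal, self-contained toy sketch. It is not Genspio's API: the type `t`, its constructors, and the functions `quote` and `compile` are all hypothetical names invented for this illustration, and the real library is much richer. The sketch only assumes the general idea described above, that script fragments are OCaml values indexed by the type of what they compute, and that a compile function turns them into POSIX shell text.

```ocaml
(* Toy illustration only: none of these names come from Genspio. *)

(* A script fragment indexed by the type of the value it computes. *)
type _ t =
  | Str : string -> string t               (* a string literal *)
  | Exec : string t list -> unit t         (* run a command with arguments *)
  | Seq : unit t list -> unit t            (* run commands in sequence *)
  | Succeeds : unit t -> bool t            (* true iff the command exits with 0 *)
  | If : bool t * unit t * unit t -> unit t

(* Quote a literal for POSIX sh: wrap it in single quotes and
   escape any embedded single quote as '\'' *)
let quote s =
  "'" ^ String.concat "'\\''" (String.split_on_char '\'' s) ^ "'"

(* Compile a fragment to a one-line POSIX shell phrase. *)
let rec compile : type a. a t -> string = function
  | Str s -> quote s
  | Exec args -> String.concat " " (List.map compile args)
  | Seq [] -> ":" (* the shell's no-op, so empty branches stay valid *)
  | Seq cmds -> String.concat " ; " (List.map compile cmds)
  | Succeeds c -> "{ " ^ compile c ^ " ; }"
  | If (c, t, e) ->
    "if " ^ compile c ^ " ; then " ^ compile t
    ^ " ; else " ^ compile e ^ " ; fi"

(* Passing a [unit t] where a [bool t] is expected, for example
   [If (Exec [...], ...)], is rejected by the OCaml type checker. *)
let script =
  If (Succeeds (Exec [Str "test"; Str "-d"; Str "/tmp"]),
      Exec [Str "echo"; Str "it's there"],
      Exec [Str "echo"; Str "missing"])

let () = print_endline (compile script)
```

The point of the design is that mistakes such as using a command where a condition is expected are caught when the OCaml program is compiled, before any shell script is generated.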

The project provides two compilers. The standard compiler generates strict but complex POSIX shell expressions, which can even be used as one-liners. The second compiler generates much slower but simpler scripts that target a subset of POSIX and aim to be portable to the older and buggy interpreters still found in the wild. In addition, a few generic optimizations have been implemented, and the API gives access to a generic AST visitor (using standard OCaml objects).

The most interesting parts of the project's implementation are actually its testing infrastructure and its documentation.

The tests evaluate the portability of the generated code by running an extensive test suite with various shells and on various systems; this includes infrastructure for trying older operating systems or other architectures with Qemu.

The documentation effort has also been extensive, and includes a browser-based experimentation environment in which one can try and modify the documented examples, see type errors, and inspect or even download the compilation outputs. This sub-project is based on the Tyxml and react OCaml libraries, builds with js_of_ocaml, and includes a full OCaml REPL running in a “Web-worker.”

Bigger proof-of-concept examples are also available, showing that the approach can scale. These are 100% generated GitHub repositories which do not depend on OCaml at all: see multi-git (scripts for dealing with a few git repositories) and cosc (a script to manage long-running processes using a hidden GNU-screen session).

Genspio was developed at the Hammer Lab of Mount Sinai to help data-analysts manage their own complex deployments of idiosyncratic bioinformatics software and data artifacts on cloud or local computing infrastructure (see Secotrec).
