@nilium
Last active July 8, 2018 21:22
Codf rationale draft

Codf

codf (godoc) is a personal config language for declaring structured config files with support for a range of built-in types. It's meant to scratch an itch I've had over configuration in Go programs for some time.

Rationale

Codf exists primarily to make expressive configuration easy to use. Expressive in this case means, more or less, relatively complex but easy to write. There are programs in the wild, such as nginx, where configuration of this kind is integral to using them. Codf takes inspiration from nginx in particular, along with other curly-braced config languages. The goal is to provide structure for programs where the status quo does not.

I’ve spent a fair amount of time on and off the job building programs that require configuration to define their runtime behavior. This includes programs whose configuration borders on scripting. These programs’ config files define not only inputs (listener addresses) and outputs (metrics, DBs, logs), but also tasks, data pipelines, state transitions, and so on. So far, these have all used JSON, YAML, or HCL (once), and they’ve all felt poorly defined as a result. Most of this comes down to what these languages attempt to do: they encapsulate data in simple data structures (objects and lists) with simple key-value pairs.

If we take a look at configuration in Go right now, the de facto standard is mainly key-value pairs. These take the form of CLI flags, environment variables, JSON, YAML, TOML, HCL (usually only seen in the HashiCorp sphere), and INI files. The amount of expression and structure in each varies, but each typically falls into one of two groups (these terms are my own, so naming is open for debate):

  • Unstructured — Values do not have keys of their own. This creates a simple, flat namespace. Includes environment variables, CLI flags, properties files, and most INI parsing.
  • Structured — Values may contain key-value pairs. As a result, namespaces may have keys referring to one or more nested namespaces with additional values. This includes JSON, YAML, TOML, and HCL. All non-JSON languages tend to mirror JSON in terms of what can be expressed (in HCL’s case, this is by design; in TOML’s case, this appears to be accidental).
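To make the two groups concrete, here is a minimal sketch using only the Go standard library. The `-listen` flag and the `server.listen` key are names made up for this illustration; they don’t come from any real program.

```go
package main

import (
	"encoding/json"
	"flag"
	"fmt"
)

// flatListen reads a listener address from a flat namespace: one flag name,
// one value, no nesting. The flag name "listen" is hypothetical.
func flatListen(args []string) (string, error) {
	fs := flag.NewFlagSet("demo", flag.ContinueOnError)
	listen := fs.String("listen", ":8080", "listener address")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return *listen, nil
}

// nestedListen reads the same value from a structured document, where the
// "server" key refers to a nested namespace holding "listen".
func nestedListen(doc []byte) (string, error) {
	var cfg struct {
		Server struct {
			Listen string `json:"listen"`
		} `json:"server"`
	}
	if err := json.Unmarshal(doc, &cfg); err != nil {
		return "", err
	}
	return cfg.Server.Listen, nil
}

func main() {
	flat, err := flatListen([]string{"-listen", ":9090"})
	if err != nil {
		panic(err)
	}
	nested, err := nestedListen([]byte(`{"server":{"listen":":9090"}}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(flat, nested)
}
```

Either way, the program ends up with a key and a value; the structured form only adds a path to reach it.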

Both are used for configuration in a wide variety of programs, so it can be taken for granted that they can be made to work for a wide range of configuration. I won’t be addressing the unstructured case: programs rarely use unstructured formats unless they already make sense, so they’re rarely entirely wrong. The argument for codf is that the structured formats are not always appropriate for modeling program configuration.

Languages like HCL try to address some of the problem of expressiveness by permitting section-like structures:

location "/" {
  root = "/var/www"
}

This maps to the JSON {"location":{"/":{"root":"/var/www"}}}. It helps a little, but breaks down if your nesting exceeds two levels (i.e., location and /). Locations in nginx can match in different ways, so a location like this gets a little dicey with HCL and other languages:

location ~ \.(png|jpg|gif)$ {
  root /var/www/static;
  expires 7d;
}

While the above can easily be expressed in nginx and codf (the above is valid codf, though the regexp is a word in this context), it’s harder to express in languages like HCL:

// A possible writing of the above location
location "\\.(png|jpg|gif)$" {
  regexp = true
  root = "/var/www/static"
  expires = "168h" // Unlike nginx, Go doesn't have a day unit
}

Or in YAML:

locations:
- regexp: \.(png|jpg|gif)$
  root: /var/www/static
  expires: 168h

The proposed solution with codf is fairly simple: don’t shoehorn configuration into data structures that don’t make sense. Instead, give programs configuration that fits their needs. With that in mind, this is the equivalent syntax for codf:

location ~ #/\.(png|jpg|gif)$/ {
  root /var/www/static;
  expires 168h; ' This is parsed as a time.Duration
}
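The 168h above stands in for nginx’s 7d because Go’s duration syntax stops at hours. A quick check with the standard library shows this; it is plain `time.ParseDuration`, not anything codf-specific, and `parseExpires` is just a name chosen for this sketch.

```go
package main

import (
	"fmt"
	"time"
)

// parseExpires wraps time.ParseDuration, whose largest unit is the hour.
func parseExpires(s string) (time.Duration, error) {
	return time.ParseDuration(s)
}

func main() {
	week, err := parseExpires("168h")
	if err != nil {
		panic(err)
	}
	fmt.Println(week == 7*24*time.Hour) // true

	// "d" is not a unit Go's parser knows, so this returns an error.
	_, err = parseExpires("7d")
	fmt.Println(err != nil) // true
}
```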

So this is where codf comes in. It tries to fill in what I see as a missing link in Go configuration — it gives me the tools to work with forms other than lists and key-value pairs.

First, with access to the AST, I can pre-process and validate files. It becomes easy to provide precise feedback and error messages for invalid configuration. (This is also possible with HCL if you want just key-value pairs.) A typo can be an error instead of a piece of data that didn’t match in unmarshaling, and the program can tell me where the offending token lives.
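As a rough sketch of what that validation can look like, assume a simplified statement node that carries its source position. The `Statement` type, the `known` table, and `validate` below are all hypothetical and are not codf’s actual API.

```go
package main

import "fmt"

// Statement is a hypothetical, simplified AST node; it is not codf's real type.
type Statement struct {
	Name      string
	Params    []string
	Line, Col int
}

// known maps each directive this imaginary program accepts to the exact
// number of parameters it takes.
var known = map[string]int{"root": 1, "expires": 1, "internal": 0}

// validate returns one positioned error per unknown directive or per
// directive with the wrong number of parameters.
func validate(stmts []Statement) []error {
	var errs []error
	for _, s := range stmts {
		want, ok := known[s.Name]
		switch {
		case !ok:
			errs = append(errs, fmt.Errorf("%d:%d: unknown directive %q", s.Line, s.Col, s.Name))
		case len(s.Params) != want:
			errs = append(errs, fmt.Errorf("%d:%d: %s takes %d parameter(s), got %d",
				s.Line, s.Col, s.Name, want, len(s.Params)))
		}
	}
	return errs
}

func main() {
	stmts := []Statement{
		{Name: "root", Params: []string{"/var/www/static"}, Line: 2, Col: 3},
		{Name: "expries", Params: []string{"168h"}, Line: 3, Col: 3}, // typo on purpose
	}
	for _, err := range validate(stmts) {
		fmt.Println(err) // 3:3: unknown directive "expries"
	}
}
```

With unmarshaling into a struct, the misspelled key would usually be dropped silently; here it is reported along with where it occurred.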

Second, I can express something other than assignment. I can write a single statement that exists on its own, such as internal;, or one with a variable number of parameters. There’s no need to wrap this up in a key = [list] format, so it can be written and read without the extra syntax to get around the limitations of other languages.
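To illustrate the shape of such statements, the sketch below splits a single line into a name and however many parameters follow. `splitStatement` is a deliberately naive stand-in written for this example; it ignores quoting, comments, and sections, and it is not how codf parses.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// splitStatement breaks "name param... ;" into its name and parameters.
// It is a toy: whitespace-separated words only, one statement per call.
func splitStatement(line string) (name string, params []string, err error) {
	line = strings.TrimSpace(line)
	if !strings.HasSuffix(line, ";") {
		return "", nil, fmt.Errorf("statement %q is missing its terminating ';'", line)
	}
	fields := strings.Fields(strings.TrimSuffix(line, ";"))
	if len(fields) == 0 {
		return "", nil, errors.New("empty statement")
	}
	return fields[0], fields[1:], nil
}

func main() {
	for _, line := range []string{"internal;", "index index.html index.htm;"} {
		name, params, err := splitStatement(line)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s has %d parameter(s)\n", name, len(params))
	}
}
```

A statement with zero parameters and one with two go through the same path, with no list syntax needed to carry the extra values.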

Third, I can use datatypes that don’t require additional parsing after the fact. Codf supports big integers, durations, regular expressions, and barewords as part of the syntax, along with a handful of other literals. As a result, it’s rarely necessary to stringify these values the way some languages require to fit their syntax.
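For contrast, this is roughly the second parsing pass a JSON-configured program needs for the same kinds of values. It uses only the standard library; the `max_size` field and the type names are invented for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math/big"
	"regexp"
	"time"
)

// rawLocation is what the JSON decoder can hand back directly: strings only.
type rawLocation struct {
	Pattern string `json:"pattern"`
	Expires string `json:"expires"`
	MaxSize string `json:"max_size"` // hypothetical field, too large for int64
}

// Location holds the typed values the program actually wants.
type Location struct {
	Pattern *regexp.Regexp
	Expires time.Duration
	MaxSize *big.Int
}

// decodeLocation unmarshals the strings, then parses each one a second time.
func decodeLocation(doc []byte) (*Location, error) {
	var raw rawLocation
	if err := json.Unmarshal(doc, &raw); err != nil {
		return nil, err
	}
	re, err := regexp.Compile(raw.Pattern)
	if err != nil {
		return nil, fmt.Errorf("pattern: %w", err)
	}
	d, err := time.ParseDuration(raw.Expires)
	if err != nil {
		return nil, fmt.Errorf("expires: %w", err)
	}
	n, ok := new(big.Int).SetString(raw.MaxSize, 10)
	if !ok {
		return nil, fmt.Errorf("max_size: %q is not a base-10 integer", raw.MaxSize)
	}
	return &Location{Pattern: re, Expires: d, MaxSize: n}, nil
}

func main() {
	doc := []byte(`{"pattern":"\\.(png|jpg|gif)$","expires":"168h","max_size":"18446744073709551616"}`)
	loc, err := decodeLocation(doc)
	if err != nil {
		panic(err)
	}
	fmt.Println(loc.Pattern.MatchString("logo.png"), loc.Expires, loc.MaxSize)
}
```

Each of the three conversions is a place where an error surfaces without any position in the config file, which is the cost the paragraph above is describing.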

Given all of that, codf lets me build programs where configuration reads and writes well for the user. So, that’s why I built codf: I dislike the status quo.

Other Formats

There are also cases like XML, Erlang terms, and S-expressions, and they all allow expression beyond key-value pairs. Erlang terms and S-expressions serve as good examples of data as config. XML on the other hand allows interleaving of text and structure (elements with attributes).

XML is fairly uncommon in Go as a configuration format. I believe it’s poorly suited for the job, but there are likely examples where it’s appropriate. As with codf, it’s capable of expressing more than the key-value languages, but working with XML poses more challenges for little benefit in Go.

Erlang terms and S-expressions are both interesting and fit well in their domains. Either is usable in Go if the work is put in to parse them, but without the tools their native languages provide for working with them, they would be cumbersome in Go. Erlang and the many Lisps out there likely don’t need help to do their work.
