edolstra/nix-lang.md

## nix-lang.md

      
This document contains some ideas for additions to the Nix language.
Motivation

The Nix package manager, Nixpkgs and NixOS currently have several
problems:


Poor discoverability of package options. Package functions have
function arguments like enableFoo, but there is no way for the Nix
UI to discover them, let alone to provide programmatic ways to
change them. Likewise, there is a global configuration mechanism
(config.*) but the Nix UI doesn't know about it.


Ad-hoc override/configuration mechanisms in Nixpkgs: .override,
.overrideDerivation, self: super:, overlays, config.*.


Very inefficient evaluation mechanism. To get metadata about a
package, nix-env needs to evaluate a derivation, just to be able
to extract its name attribute and other metadata. Quite apart from
being inefficient, it's not guaranteed that evaluation succeeds
(e.g. a package might have an assert stdenv.isLinux, so on macOS
you can't even see the package).
Also, derivations and package functions need to hang on to their
inputs to make .override and .overrideDerivation work. This
makes it impossible to garbage-collect inputs.


The functional style of package definitions is verbose and of
questionable value. That is, package functions need to specify
dependencies in three places: 1) at the use site, e.g. buildInputs = [ libfoo ]; 2) as a function argument, e.g. { libfoo }: ...;
and 3) at the call site, e.g. import package.nix { inherit libfoo; }. The last one can be elided using callPackage but that's an
ugly hack.


The NixOS module system is implemented as a library function, which
causes oddities like the need for a mkIf function (due to a lack
of laziness in the language when merging attribute sets). Evaluation
is also pretty expensive.


Inconsistent configuration styles between NixOS (the module system)
and Nixpkgs (function arguments, and an ad hoc config attribute
set).


nix-env has no notion of package plugins. For example, there is no
way to install flashplayer into the firefox package. Nix doesn't
know that firefox is a wrapper script generator that takes a list
of plugins and that flashplayer is something that can be plugged
into that list. Nixpkgs has numerous other examples of such
"composable" packages, e.g. TeX (texlive.combine) and GHC
(ghcWithPackages).
Such compositions can be expressed in Nix expressions
(e.g. ghcWithPackages (pkgs: [pkgs.mtl])), but how to do so is not
discoverable except by searching the Internet or grepping the
Nixpkgs sources.
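The point above about naming a dependency in three places can be seen in a few lines of present-day Nix. This is a minimal sketch that uses plain attribute sets in place of real derivations. The names libfoo, package and callPackageWith are illustrative, and the helper only approximates what callPackage does, namely filling in arguments by name from a scope.

```nix
let
  # Stand-in for a real dependency (an assumption for illustration).
  libfoo = { name = "libfoo"; };

  # 2) The dependency is named as a function argument ...
  package = { libfoo }: {
    name = "hello";
    # 1) ... and again at the use site.
    buildInputs = [ libfoo ];
  };

  # 3) The dependency is named a third time at the call site.
  explicit = package { inherit libfoo; };

  # Rough approximation of callPackage: look up the function's formal
  # arguments in a scope, then let explicit arguments take precedence.
  callPackageWith = scope: f: args:
    f (builtins.intersectAttrs (builtins.functionArgs f) scope // args);

  implicit = callPackageWith { inherit libfoo; unused = 1; } package { };
in
  # Both calls end up with the same dependency.
  map (p: (builtins.head p.buildInputs).name) [ explicit implicit ]
```

Both list elements evaluate to "libfoo". The helper removes the third mention of the dependency only by reflecting on the function's argument names, which is the hack referred to above.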


It's worth noting that the Nix language is intended as a DSL for
package and configuration management, but it has no notions of
"packages" or "configurations".
Solution: Configurations

The solution to these problems is to add a language feature named
configurations to make Nix more suitable as an actual DSL for
package/configuration management. Configurations are essentially an
implementation of the NixOS module system as a language feature. The
goal is that this provides us with a better override mechanism for
packages and can replace the current module system with a more
performant implementation.
A configuration is essentially a nested attribute set: it contains
values (plus some metadata) indexed by an attribute
path. Configurations are composed out of one or more configuration
modules. A configuration module is a syntactic construct that vaguely
looks like a recursive attribute set using angle brackets:
module1 = <
  foo = 123;
  bar = true;
  a.b.c = if bar then foo else foo * 2;
>;

This configuration module by itself describes a configuration with
three options named foo, bar and a.b.c. Selecting an attribute
from a configuration module causes it to be evaluated into a
configuration (which is an opaque value, not an attribute set!). E.g.
module1.a.b.c => 123

Note that option values can refer to other options in the
configuration, so the value of a.b.c can depend on the values of
foo and bar. (In contrast to the NixOS module system, you don't
have to write config. in front of every option use.)
Configuration modules can be composed with other configuration modules
to create more complex configurations. This is done by extending
another module. For example:
module2 = <
  extends module1;
  bar = false;
>;

Now,
module2.a.b.c => 246

Thus, modules can override the values of options declared in another
module.
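For comparison, the override behaviour of module1 and module2 can be approximated in present-day Nix with a fixed point over functions. This is only a sketch under stated assumptions: compose is a hypothetical helper, each module is written as a function of the final configuration (so references need an explicit final. prefix), and // merges shallowly, unlike the per-option merging proposed here.

```nix
let
  # Hypothetical helper: every module is a function from the final
  # configuration to a set of option values; later modules win.
  compose = modules:
    let final = builtins.foldl' (acc: m: acc // m final) { } modules;
    in final;

  module1 = final: {
    foo = 123;
    bar = true;
    a.b.c = if final.bar then final.foo else final.foo * 2;
  };

  # Overrides bar only; a.b.c is recomputed against the final value.
  module2 = final: { bar = false; };
in {
  m1 = (compose [ module1 ]).a.b.c;          # 123
  m2 = (compose [ module1 module2 ]).a.b.c;  # 246
}
```

Because every value is evaluated against the final set, module2 changes the result of a.b.c without touching its definition, which is the behaviour the extends construct is meant to provide directly.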
TBD: the alternative is this quasi-functional style:
module2 = module1 <
  bar = false;
>;

i.e. a configuration is extended by applying a module to it. This is
particularly nice in conjunction with derivation builders (see
below). This also allows extending a configuration with a plain old
attrset, e.g. module2 = module1 { bar = false; }.
A crucial property here is that forward references are not
allowed. That is, the option values in a particular module cannot
reference options that have not been declared in that module or one of
its ancestors (i.e. the modules it extends). For example, this doesn't
work:
module3 = <
  foo = bar;
>;

module4 = <
  extends module3;
  bar = 1;
>;

Here the evaluation of module3 will fail because there is no bar
in scope. This is in contrast to the NixOS module system, which allows
any module to reference any other option, which makes it hard to
reason about the dependencies of a module.
A few restrictions: option names cannot be computed dynamically. For
example, this is not allowed:
bad1 = <
  foo.${bar} = ...;
>;

Also, options cannot be a prefix of a previously defined option, as in
bad2 = <
  extends module1;
  a.b = { c = 100; }; # wrong
  a.b.c = 100; # right
>;

Configuration option fields

[Annotations? Attributes?]
Configuration options, like NixOS options, are not just values. They
can have some metadata such as documentation, a type, default values
etc. The syntax of an option is as follows:
<ATTR-PATH> (| <FIELD-NAME> <EXPR>?)* (= <EXPR>)?;

For example:
module5 = <

  networking.firewall.enable
    | doc "Whether to enable the firewall."
    | type types.bool
    | default false;

  networking.firewall.allowedTCPPorts
    | doc "TCP ports to be opened in the firewall."
    | type types.list types.int
    = [];

>;

TODO: syntax bikeshedding.
The following fields are supported:


value: An option value at priority level 100. = <X> is sugar for
| value <X>.


default: An option value at priority level 1500. Thus defaults are
ignored (not merged) when there are any higher-priority values.


doc: A description of the option in some sensible format.


example: An example value. May be repeated. Unlike in the NixOS
module system, there is no literalExample because the parser can
store the source text of the example field. (The example needs to be
a syntactically valid Expr, but it doesn't need to evaluate.)


type: The type of the option. This should be a configuration with
merge and check members. TODO: to support future incremental
evaluation, maybe merge should be a fold, rather than a function
that takes a list of all option values.


if: A Boolean value that determines whether the option value is to
be used, e.g.
<
  environment.systemPackages
    | if enableFoo
    = [ foo ];
>


prio: The numerical priority of the value. When computing the
final value of the option, all values that have a lower priority
than the highest-priority value(s) are discarded.


label, before and after: A string value used to
topologically sort option values prior to merging. The NixOS module
system uses a numeric value for this, but strings are nicer for
things like ordering of the activation script:
<
  activationScript
    | label "users-groups"
    = "... populate /etc/{passwd, group} ... ";

  # This activation script fragment needs to run after users/groups
  # have been created.
  activationScript
    | after "users-groups"
    = "install -d -o fnord -g /var/foo";
>


final: Specifies that the value of the option cannot be changed by
later modules.


scope: An attrset added to the lexical scope of the evaluation of
values of this option, equivalent to putting with <scope>; around
every value. Typical use case: allowing different pkgs sets to be
in scope for different options for cross-compilation (e.g. you can
have options hostBuildInputs and nativeBuildInputs with
different package sets in scope).


Any other field name gives a syntax error. Only the initial definition
of an option can specify doc, type and example.
Note: there is no merge field. If you want another merge policy
you should adapt the type.
TODO: flip priorities? "Higher priority" is ambiguous.
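How value, default, prio and if might interact can be sketched in present-day Nix. The model below is an assumption, not a specification: a numerically lower prio is taken to mean a higher priority (as in the NixOS module system), each definition is a plain attribute set, and list concatenation stands in for the merge function of the option's type. The name resolve is hypothetical.

```nix
let
  # Each definition has `value`, `prio` and an optional `cond`
  # (modelling the `if` field; it defaults to true).
  resolve = defs:
    let
      # Drop definitions whose `if` field is false.
      active = builtins.filter (d: d.cond or true) defs;
      # Numerically smallest prio among the remaining definitions.
      best = builtins.foldl' (m: d: if d.prio < m then d.prio else m)
        (builtins.head active).prio active;
      # Everything with a lower priority than the best is discarded.
      winners = builtins.filter (d: d.prio == best) active;
    in
      # Stand-in for the type's merge function: list concatenation.
      builtins.concatLists (map (d: d.value) winners);
in
  resolve [
    { prio = 1500; value = [ 1 ]; }                  # a `default`
    { prio = 100; value = [ 22 ]; }                  # a plain `= [ 22 ]`
    { prio = 100; value = [ 80 ]; }                  # another plain value
    { prio = 50; value = [ 9999 ]; cond = false; }   # disabled by `if`
  ]
```

This evaluates to [ 22 80 ]: the disabled definition is dropped first, the default is discarded because values at level 100 exist, and the two remaining values are merged. The sketch fails on an option with no active definitions, which a real implementation would have to report properly.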
Configuration module fields

Probably it's also nice to have some global fields for a configuration
module. E.g.


doc: A description of the configuration.


globalScope: Like scope, but for all options in this
configuration. Typical use case would be globalScope = pkgs.


fieldScope: Like scope, but for the evaluation of option fields
other than values/defaults. For example, fieldScope = types would
allow getting rid of types. in option types.


More syntactic sugar

Configuration modules could have an if construct to enable/disable a
group of values more concisely. This is similar to mkIf in the NixOS
module system, which pushes the conditional down into the options.
<
  enableFoo
    | type bool
    = false;

  if enableFoo {
    environment.systemPackages = [ foo ];
    systemd.units.foo = ...;
  };

  # The above is equivalent to:
  environment.systemPackages | if enableFoo = [ foo ];
  systemd.units | if enableFoo = { foo = ...; };
>

However, this might not be very useful if we get rid of most enable
options (see below).
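The reason for pushing the conditional down into the options can be shown in present-day Nix. In a plain expression such as if c then { ... } else { }, the condition must be evaluated before the attribute names are known. Attaching the condition to each definition keeps the names available without evaluating it. The helper pushDownIf and the quoted option names are illustrative only.

```nix
let
  # Hypothetical helper modelling the desugaring: attach the condition to
  # every definition instead of wrapping the whole set in `if`.
  pushDownIf = cond: defs:
    builtins.mapAttrs (_name: value: { inherit cond value; }) defs;

  # The condition throws when forced, to show that it is not needed
  # in order to enumerate the guarded options.
  guarded = pushDownIf (throw "condition forced too early") {
    "environment.systemPackages" = [ "foo" ];
    "systemd.units" = { foo = "..."; };
  };
in
  builtins.attrNames guarded
```

This evaluates to [ "environment.systemPackages" "systemd.units" ] without ever forcing the condition, so a condition that itself depends on the final configuration does not cause infinite recursion when the set of defined options is computed.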
Configurations as derivation builders

Currently Nix uses functions as the mechanism to abstract over common
derivation patterns. For example, stdenv.mkDerivation,
buildPythonPackage and fetchurl are all functions that take some
inputs and return a derivation. The problem with this approach is that
the act of applying a function consumes its inputs, leaving us without
the ability to override inputs or to query the inputs. This led to
hacks like .override, .overrideDerivation, passthru, meta and
so on. Also, functions lack a documentation mechanism.
We can replace such functions with configuration modules that produce
a drv output option. For example, here is a very bare-bones
derivation configuration module:
builders.derivation = <

  # Interface

  name
    | doc "Name of the derivation, used in the Nix store path."
    | type types.str
    | example "openssl";

  version
    | doc "Version of the derivation, used in the Nix store path."
    | type types.str
    | example "1.0.2"
    = "";

  builder
    | doc "Command to be executed to build the derivation."
    | type types.path
    | example "${bash}/bin/sh";

  args
    | doc "Arguments passed to the builder."
    | type types.unique (types.list types.str)
    = [];

  outputs
    | doc "Symbolic names of the outputs of this derivation."
    | type types.list types.str
    = [ "out" ];

  env
    | doc "Structured values passed to the builder."
    | type types.attrsOf types.str
    = { inherit outputs; };

  # Implementation

  drv
    | doc "The resulting store derivation."
    | final
    = builtins.derivation ({
        name = "${name}-${version}";
        inherit builder args;
      } // env);

>;

We can then layer further abstractions on top of it:
# This is basically stdenv.
builders.generic = <

  extends builders.derivation;

  # Interface

  buildInputs
    | doc "Dependencies of this derivation."
    | type types.list types.package # FIXME
    = [];

  phases
    | doc "Names of build phases."
    | type types.list types.str
    | example ["build" "install"];

  # Implementation

  builder = "${bash}/bin/bash";

  args = [ "-c" stdenv/generic/setup.sh ];

  env = { inherit phases buildInputs; };

  ...
>;

# A package is a derivation that can be installed by nix-env.
builders.package = <

  description
    | doc "A short (one-line) description of the package."
    | type types.str;

  homepage = ...;
  longDescription = ...;

  supportedPlatforms
    | doc "List of platforms on which this package is supported."
    | type types.list types.platform;

  # nix-env won't show this package if `supported` evaluates to false.
  supported
    | doc "Whether the package is supported on the target platform."
    | type types.bool
    = elem system supportedPlatforms;

  # Nix will refuse to build this package if `enabled` evaluates to false.
  enabled
    | doc "Whether the package can be built."
    | type types.bool
    = (free || config.allowUnfree) && supported;

>;

# Global Nixpkgs configuration.
config = <

  allowUnfree
    | doc "Whether to allow unfree software to be installed."
    | type types.bool
    = false;

>;

builders.unixPackage = <

  extends builders.generic builders.package;

  src
    | doc "The source tarball of the package."
    | type types.path;

  configureFlags
    | doc "Flags passed to the package's configure script."
    | type types.list types.str
    = [];

  # Create some phases.
  phases | label "unpack" = ["unpack"];
  phases | label "configure" | after "unpack" = ["configure"];
  phases | label "build" | after "configure" = ["build"];
  phases | label "install" | after "build" = ["install"];

  configurePhase
    | doc "Shell code to configure the package."
    | type types.lines
    = "./configure --prefix=${placeholder out} $configureFlags";

  buildPhase
    | doc "Shell code to build the package."
    | type types.lines
    = "make";

  installPhase
    | doc "Shell code to install the package."
    | type types.lines
    = "make install";

  env = { inherit configurePhase ...; };
>;

# An actual package.
pkgs.hello = <
  extends builders.unixPackage;

  name = "hello";
  version = "1.12";
  description = "A program that produces a familiar, friendly greeting";
  license = licenses.gpl;

  enableGUI
    | doc "Enable GTK+ support."
    | type types.bool
    | default false;

  # This uses the functional notation for extending a configuration:
  src = builders.fetchurl { url = ...; sha256 = ...; };

  buildInputs = if enableGUI then [ gtk ] else [];
  # or equivalently:
  buildInputs | if enableGUI = [ gtk ];
>;

Since configuration values are computed lazily, a tool like nix-env -qa only needs to evaluate attributes like name and enabled. It
doesn't need to evaluate drv.
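The laziness argument can be illustrated with an ordinary attribute set in present-day Nix, where attribute values are also evaluated only on demand. In this sketch the set hello is a hand-written stand-in for a package configuration, and drv deliberately throws so that any attempt to evaluate it would be visible.

```nix
let
  # Stand-in for a package configuration (not the proposed syntax).
  hello = {
    name = "hello-1.12";
    enabled = true;
    # Would abort evaluation if a query tool ever forced it.
    drv = throw "drv was evaluated";
  };
in
  # A query tool only selects the metadata it needs.
  { inherit (hello) name enabled; }
```

Evaluating this strictly yields the name and the enabled flag and never touches drv. The difference in the proposal is that such metadata would be reachable without first calling a package function that produces the whole derivation.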
Package overrides

# To override the source of Hello.
pkgs.hello = <
  src.url = http://example.org/my-hello.tar.gz;

  # Note: this is sugar for:
  src = < url = http://example.org/my-hello.tar.gz; >;
  # I.e. it adds a module to the configuration 'src'.
>;

# To enable a feature in Hello.
pkgs.hello = <
  enableGUI = true;
>;

# To add something to installPhase.
pkgs.hello = <
  installPhase | after default = "mv $out/bin/hello $out/bin/my-hello";
>;

# To change a dependency of Hello in Hello *only*.
pkgs.hello = <
  targetPackages = < extends pkgs; gtk = < src = ./my-gtk.tar.gz; >; >;
>;

# To replace Hello entirely. FIXME
pkgs.hello | prio 0 = pkgs.my-hello;
pkgs.my-hello = < ... >;

Plugins

TODO
pkgs.firefox = <
  extends builders.unixPackage;

  plugins
    | doc "Set of enabled plugins."
    # this type should implicitly set `scope` to the given attrset
    | type subsetOf pkgs.mozillaPlugins
    = [];

  installPhase = "... generate firefox wrapper script ...";
>;

pkgs.mozillaPlugins.flashplayer = < ... >;

To enable:
pkgs.firefox = <
  plugins = [ flashplayer ];
>;

User interface

Having explicit package options and attribute annotations allows nix
to show and modify options. E.g.
$ nix query-package nixpkgs.firefox
Available options:

- enablePulseAudio: Enable sound via PulseAudio.
Default value: true
Current value: N/A

- enableOfficialBranding
...

- plugins: Set of enabled plugins.
Possible values:
  nixpkgs.flashplayer
  nixpkgs.google-talk
  ...

$ nix install nixpkgs.firefox --with enableOfficialBranding --without enablePulseAudio

$ nix modify-package nixpkgs.firefox --with enablePulseAudio --add plugins nixpkgs.flashplayer

TBD: we probably want to show only end-user options like
enablePulseAudio by default, rather than every option inherited from
builders like buildInputs or name. So we need some way to specify
which options should be shown in what context (like the internal
flag in the current NixOS module system).
Nixpkgs/NixOS structure

TODO: how to hook everything up.
At top-level, Nixpkgs should evaluate to a set (or configuration?) of configurations:
{
  lib = <
    ...;
  >;

  builders = <
    derivation = ...;
    generic = ...;
    package = ...;
    unixPackage = ...;
    fetchurl = ...;
    ...;
  >;

  pkgs = <
    glibc = import .../glibc;
    hello = import .../hello;
  >;

  # NixOS modules
  modules = <
    etc = import modules/.../etc.nix;
    ...;
    top-level = import modules/.../top-level.nix;
    ...
    # The minimal NixOS system.
    base = < extends etc top-level initrd kernel ...; >;

    # Non-default stuff.
    kde = import modules/.../kde.nix;
  >;
}

We should no longer include every NixOS module by default. Instead of
using enable options, you just inherit NixOS modules into your
system configuration, e.g.
myConfig = <
  extends modules.base modules.kde modules.nginx ...;
  ...
>;

NixOS services

In the NixOS Of The Future, most modules shouldn't be top-level
configuration modules, since those 1) violate POLA: you don't want the
PostgreSQL module to have the ability to change your X11
configuration; and 2) are not functional: you can't easily instantiate
a module multiple times, e.g. to have multiple PostgreSQL instances.
Instead most NixOS modules should be implemented as extensions of
smaller configuration types. For example, builders.service allows
the construction of a service that declares some systemd unit, and
only has access to (say) /var/services/<service-name> and the
closure of the unit in the Nix store. (This is essentially a
light-weight container.)
builders.service = <
  name
    | doc "The identifier of the service."
    | type str;

  systemdUnits
    | doc "The systemd units that implement this service."
    | type attrsOf /* instances of builders.systemUnit */
    = {};

  ...
>;

So PostgreSQL can be an extension of this type:
services.postgresql = builders.service <
  # Interface
  port
    | doc "TCP port on which the server listens."
    | type int
    = 5432;

  # Implementation
  systemdUnits =
    ... a systemd service that runs postgresql, using
    /var/services/<service-name> as the data directory ...;

>;

NixOS at top-level has a systemServices option that allows services
to be hooked into the system:
modules.services = <

  systemServices
    | doc "Set of isolated system services"
    | type attrsOf /* instances of builders.service */;

  systemd.units =
    ... the concatenation of the systemdUnits values of all systemServices ...;

>;

A NixOS configuration can then instantiate multiple PostgreSQL
instances:
<
  systemServices.postgresql-prod = services.postgresql < >;
  systemServices.postgresql-test = services.postgresql < port = 12345; >;
>

Include mechanism

The idea is to provide an include mechanism as an alternative to
import. The expression include ./foo.nix is equivalent to
replacing the expression with the contents of ./foo.nix. Thus
foo.nix has access to the lexical scope at the site where it is
included.
This means that we can write all-packages.nix as a long list of
inputs:
<
  pkgs = <
    git = include ./git.nix;
    openssl = include ./openssl.nix;
    ...
  >;
>

where the package expressions look like
builders.unixPackage <
  name = "git";
  version = "2.3.4";
  ...
  buildInputs = [ openssl ];
>

That is, git.nix is no longer a function that takes builders as an
argument. Rather it just assumes that builders exists in the lexical
scope, thus preventing the need for ugly hacks like
callPackage. This might seem like a radical departure from how we
expressed dependencies in Nixpkgs, but in fact it's already widely in
use, see e.g. perl-packages.nix or python-packages.nix where most
dependencies are obtained from the surrounding rec set. include
simply makes it possible to split such package sets into multiple
files.
Note: The include mechanism must be lazy, i.e., the inclusion
shouldn't be done eagerly while parsing the including file, since we
don't want all-packages.nix to eagerly parse all its referenced files.
Note: This is not a token-based include mechanism like C's
#include. That is, include E is only valid as an expression
(i.e. the grammar will have a non-terminal expr: INCLUDE expr), and
the contents of the included file are parsed as an expr. Thus you
cannot write something like
{ x = 123;
  include ./other-attrs.nix
}

TODO: this may not be needed since option values are evaluated in a
dynamic scope anyway. E.g. in the example above openssl doesn't have
to be provided as a function argument anyway because it doesn't refer
to the lexical scope but to the scope set by the buildInputs
option. However, builders does need to come from somewhere, and it
would be annoying if we had to change the whole file into a function
just to pass in builders. OTOH, if we used the extends syntax
instead, then maybe this could be avoided: its argument could be
evaluated in a dynamic scope.
Discussion

Memory / performance impact

Hopefully Nix configurations can lead to less CPU- and memory-hungry
Nixpkgs/NixOS evaluation. There are some aspects to this:


Configuration values should be implemented internally by keeping
weak pointers to option values. Thus option values can be garbage
collected (and re-evaluated from the saved thunk if necessary). Of
course, this is a risky strategy because the thunk might be bigger
than the WHNF. So if an option value is "small" (like an integer or
small string) it makes sense to overwrite the thunk by a non-weak
pointer to the resulting value.


Getting rid of pass-thru derivation attributes like .override,
.overrideDerivation, .meta etc. means that after evaluating a
drv option value of a package configuration, we can
garbage-collect the entire package configuration (assuming nothing
else is keeping it alive).


Not including every NixOS module by default should save a huge
amount of time and memory.


Doing the module system in C++ rather than in a slow purely
functional language should give a big speedup.


Incremental evaluation?

Can we make NixOS module evaluation incremental? E.g. in a NixOps
module with multiple machines that share a base configuration:
myBase = <
  extends modules.nixos modules.kde ...;
  networking.firewall.allowedTCPPorts = [ 1234 ];
>;

machine1 = myBase < networking.hostname = "machine1"; >;
machine2 = myBase < networking.hostname = "machine2"; >;
...

it would be nice if the evaluation of the configurations of machine1
and machine2 only needs to recompute the options that are affected
by the change to networking.hostname.
Do we need "super"?

The self: super: style has a self attribute to refer to the final
set, and a super attribute to refer to the previous set in the
chain. (It should really be named final: previous:.) In
configuration modules, any reference is to the final configuration, so
no special notation for referring to self is required. But do we
need an equivalent for super?
The typical use case is to extend a previous value, e.g.
buildInputs = super.buildInputs ++ [ libfoo ];

However, merge functions remove the need for this: we just concatenate
all definitions of buildInputs, just as is done in the NixOS module
system, i.e.
buildInputs = [ libfoo ];

If we want to throw away previous definitions, we do:
buildInputs override = [ libfoo ];

(This corresponds with mkForce in the module system.)
The experience with the NixOS module system suggests that this is
sufficient and we don't need a super keyword.
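For reference, this is roughly what the super-based idiom looks like in present-day Nix. It is a sketch only: strings stand in for packages, the arguments are named final and prev as suggested above, and extend is a hypothetical helper that applies a single overlay.

```nix
let
  base = { buildInputs = [ "zlib" ]; };

  # An overlay in the self: super: style (named final: prev: here).
  # Extending a list requires an explicit reference to the previous set.
  overlay = final: prev: {
    buildInputs = prev.buildInputs ++ [ "libfoo" ];
  };

  # Hypothetical helper applying one overlay to a base set.
  extend = prev: o:
    let final = prev // o final prev; in final;
in
  (extend base overlay).buildInputs
```

The result is [ "zlib" "libfoo" ]. With a merge function attached to the option's type, the overlay would only need to state [ "libfoo" ], and the concatenation would no longer be spelled out at every use.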
Strict argument checking

Historically, Nixpkgs has been full of builder functions that accept
an open set of arguments (...), e.g. stdenv.mkDerivation. This is
bad for error checking since a typo in an argument won't be
detected. Thus, configuration modules like builders.package or
builders.generic have a closed set of inputs in the sense that they
cannot refer to options that haven't been declared yet. Also,
builders.derivation doesn't pass through arbitrary arguments to the
builder environment; instead the environment has to be specified
explicitly (by adding values to the env option).
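The difference between open and closed argument sets is easy to reproduce in present-day Nix. In this sketch open and closed are illustrative functions rather than real builders, and buildInptus is a deliberate typo.

```nix
let
  # Open argument set: unknown attributes are accepted and ignored.
  open = { name, buildInputs ? [ ], ... }: { inherit name buildInputs; };

  # Closed argument set: unknown attributes are an evaluation error.
  closed = { name, buildInputs ? [ ] }: { inherit name buildInputs; };
in {
  # The misspelt `buildInptus` is silently dropped, so this is [ ].
  typoAccepted =
    (open { name = "hello"; buildInptus = [ "libfoo" ]; }).buildInputs;

  # The same call against `closed` would abort evaluation, because
  # `buildInptus` is not a declared argument:
  #   closed { name = "hello"; buildInptus = [ "libfoo" ]; }
  spelledCorrectly =
    (closed { name = "hello"; buildInputs = [ "libfoo" ]; }).buildInputs;
}
```

The open variant builds a package without the intended dependency and gives no hint that anything went wrong. Configuration modules with declared options are meant to turn that case into an error.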
This should be fine because the use of environment variables really
predates the introduction of string interpolation. Originally, if
(say) buildPhase needed access to some variable foo, we would
write
derivation {
  inherit foo;
  buildPhase = "... $foo ... ";
}

But nowadays we can just write
<
  buildPhase = "... ${foo} ... ";
>

Migration path

TODO