@edolstra
Last active July 16, 2024 02:12
Nix language changes

This document contains some ideas for additions to the Nix language.

Motivation

The Nix package manager, Nixpkgs and NixOS currently have several problems:

  • Poor discoverability of package options. Package functions have function arguments like enableFoo, but there is no way for the Nix UI to discover them, let alone to provide programmatic ways to change them. Likewise, there is a global configuration mechanism (config.*) but the Nix UI doesn't know about it.

  • Ad-hoc override/configuration mechanisms in Nixpkgs: .override, .overrideDerivation, self: super:, overlays, config.*.

  • Very inefficient evaluation mechanism. To get metadata about a package, nix-env needs to evaluate a derivation, just to be able to extract its name attribute and other metadata. Quite apart from being inefficient, it's not guaranteed that evaluation succeeds (e.g. a package might have an assert stdenv.isLinux, so on macOS you can't even see the package).

    Also, derivations and package functions need to hang on to their inputs to make .override and .overrideDerivation work. This makes it impossible to garbage-collect inputs.

  • The functional style of package definitions is verbose and of questionable value. That is, package functions need to specify dependencies in three places: 1) at the use site, e.g. buildInputs = [ libfoo ]; 2) as a function argument, e.g. { libfoo }: ...; and 3) at the call site, e.g. import ./package.nix { inherit libfoo; }. The last one can be elided using callPackage but that's an ugly hack.

  • The NixOS module system is implemented as a library function, which causes oddities like the need for a mkIf function (due to a lack of laziness in the language when merging attribute sets). Evaluation is also pretty expensive.

  • Inconsistent configuration styles between NixOS (the module system) and Nixpkgs (function arguments, and an ad hoc config attribute set).

  • nix-env has no notion of package plugins. For example, there is no way to install flashplayer into the firefox package. Nix doesn't know that firefox is a wrapper script generator that takes a list of plugins and that flashplayer is something that can be plugged into that list. Nixpkgs has numerous other examples of such "composable" packages, e.g. TeX (texlive.combine) and GHC (ghcWithPackages).

    Such compositions can be expressed in Nix expressions (e.g. ghcWithPackages (pkgs: [pkgs.mtl])), but how to do so is not discoverable except by searching the Internet or grepping the Nixpkgs sources.

It's worth noting that the Nix language is intended as a DSL for package and configuration management, but it has no notions of "packages" or "configurations".

Solution: Configurations

The solution to these problems is to add a language feature named configurations to make Nix more suitable as an actual DSL for package/configuration management. Configurations are essentially an implementation of the NixOS module system as a language feature. The goal is that this provides us with a better override mechanism for packages and can replace the current module system with a more performant implementation.

A configuration is essentially a nested attribute set: it contains values (plus some metadata) indexed by an attribute path. Configurations are composed out of one or more configuration modules. A configuration module is a syntactic construct that vaguely looks like a recursive attribute set using angle brackets:

module1 = <
  foo = 123;
  bar = true;
  a.b.c = if bar then foo else foo * 2;
>;

This configuration module by itself describes a configuration with three options named foo, bar and a.b.c. Selecting an attribute from a configuration module causes it to be evaluated into a configuration (which is an opaque value, not an attribute set!). E.g.

module1.a.b.c => 123

Note that option values can refer to other options in the configuration, so the value of a.b.c can depend on the values of foo and bar. (In contrast to the NixOS module system, you don't have to write config. in front of every option use.)

Configuration modules can be composed with other configuration modules to create more complex configurations. This is done by extending another module. For example:

module2 = <
  extends module1;
  bar = false;
>;

Now,

module2.a.b.c => 246

Thus, modules can override the values of options declared in another module.
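
The behaviour of module1 and module2 can be modelled outside Nix. The following is a minimal Python sketch, not the proposed implementation: a module is represented as a dict of option paths mapped to functions of the final configuration, and the Config class and the "extends"/"options" keys are hypothetical names chosen for illustration. As a simplification, a later definition replaces an earlier one instead of being merged with it.

```python
# Hypothetical model: a module is a dict with an "options" mapping (option path
# -> function of the final configuration) and an optional "extends" list.

class Config:
    """Lazily evaluated view of a module and everything it extends."""

    def __init__(self, module):
        self._defs = {}
        self._cache = {}
        self._collect(module)

    def _collect(self, module):
        # Parents first, so that the extending module's definitions win.
        for parent in module.get("extends", []):
            self._collect(parent)
        self._defs.update(module["options"])

    def __getitem__(self, path):
        # Each option is evaluated at most once, and only when requested.
        if path not in self._cache:
            self._cache[path] = self._defs[path](self)
        return self._cache[path]


module1 = {
    "options": {
        "foo": lambda cfg: 123,
        "bar": lambda cfg: True,
        "a.b.c": lambda cfg: cfg["foo"] if cfg["bar"] else cfg["foo"] * 2,
    },
}

module2 = {
    "extends": [module1],
    "options": {"bar": lambda cfg: False},
}

print(Config(module1)["a.b.c"])  # 123
print(Config(module2)["a.b.c"])  # 246
```

Because a.b.c looks up bar in the final configuration rather than in module1, overriding bar in module2 changes the result without redefining a.b.c.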

TBD: the alternative is this quasi-functional style:

module2 = module1 <
  bar = false;
>;

i.e. a configuration is extended by applying a module to it. This is particularly nice in conjunction with derivation builders (see below). This also allows extending a configuration with a plain old attrset, e.g. module2 = module1 { bar = false; }.

A crucial property here is that forward references are not allowed. That is, the option values in a particular module cannot reference options that have not been declared in that module or one of its ancestors (i.e. the modules it extends). For example, this doesn't work:

module3 = <
  foo = bar;
>;

module4 = <
  extends module3;
  bar = 1;
>;

Here the evaluation of module3 will fail because there is no bar in scope. This is in contrast to the NixOS module system, which allows any module to reference any other option, which makes it hard to reason about the dependencies of a module.
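
One way to picture this scoping rule is to evaluate each definition against a view of the final configuration that only exposes the options declared by the defining module and its ancestors. The Python sketch below assumes that reading; the names declared, Scoped and ScopeError are hypothetical, and the check happens at evaluation time here, whereas a real implementation could presumably reject the module statically.

```python
class ScopeError(Exception):
    pass


def declared(module):
    """Option names declared by a module or by any module it extends."""
    names = set(module["options"])
    for parent in module.get("extends", []):
        names |= declared(parent)
    return names


class Scoped:
    """View of the final configuration restricted to one module's scope."""

    def __init__(self, config, names):
        self._config, self._names = config, names

    def __getitem__(self, name):
        if name not in self._names:
            raise ScopeError(f"option '{name}' is not in scope")
        return self._config[name]


class Config:
    def __init__(self, module):
        self._defs = {}  # name -> (definition, names in scope for it)
        self._collect(module)

    def _collect(self, module):
        for parent in module.get("extends", []):
            self._collect(parent)
        scope = declared(module)
        for name, fn in module["options"].items():
            self._defs[name] = (fn, scope)

    def __getitem__(self, name):
        fn, scope = self._defs[name]
        return fn(Scoped(self, scope))


module3 = {"options": {"foo": lambda cfg: cfg["bar"]}}
module4 = {"extends": [module3], "options": {"bar": lambda cfg: 1}}

try:
    Config(module4)["foo"]
except ScopeError as e:
    print(e)  # option 'bar' is not in scope
```

Even though module4 supplies bar, the definition of foo belongs to module3 and is evaluated with module3's scope, so the reference still fails.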

A few restrictions: option names cannot be computed dynamically. For example, this is not allowed:

bad1 = <
  foo.${bar} = ...;
>;

Also, options cannot be a prefix of a previously defined option, as in

bad2 = <
  extends module1;
  a.b = { c = 100; }; # wrong
  a.b.c = 100; # right
>;
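
The prefix restriction is easy to check mechanically once option names are static. A small Python sketch of such a check, with option paths represented as tuples of names (check_paths and is_strict_prefix are hypothetical helpers, not part of the proposal):

```python
def is_strict_prefix(a, b):
    """True if path a (a tuple of names) is a proper prefix of path b."""
    return len(a) < len(b) and b[:len(a)] == a


def check_paths(inherited, new):
    """Reject new option paths that are a prefix of an inherited option."""
    for path in new:
        for old in inherited:
            if is_strict_prefix(path, old):
                raise ValueError(
                    "option '%s' is a prefix of '%s'"
                    % (".".join(path), ".".join(old))
                )


# The options of module1.
inherited = [("foo",), ("bar",), ("a", "b", "c")]

check_paths(inherited, [("a", "b", "c")])  # fine: redefines the same option

try:
    check_paths(inherited, [("a", "b")])   # the "wrong" line of bad2
except ValueError as e:
    print(e)  # option 'a.b' is a prefix of 'a.b.c'
```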

Configuration option fields

[Annotations? Attributes?]

Configuration options, like NixOS options, are not just values. They can have some metadata such as documentation, a type, default values etc. The syntax of an option is as follows:

<ATTR-PATH> (| <FIELD-NAME> <EXPR>?)* (= <EXPR>)?;

For example:

module5 = <

  networking.firewall.enable
    | doc "Whether to enable the firewall."
    | type types.bool
    | default false;

  networking.firewall.allowedTCPPorts
    | doc "TCP ports to be opened in the firewall."
    | type types.list types.int
    = [];

>;

TODO: syntax bikeshedding.

The following fields are supported:

  • value: An option value at priority level 100. = <X> is sugar for | value <X>.

  • default: An option value at priority level 1500. Thus defaults are ignored (not merged) when there are any higher-priority values.

  • doc: A description of the option in some sensible format.

  • example: An example value. May be repeated. Unlike in the NixOS module system, there is no literalExample because the parser can store the source text of the example field. (The example needs to be a syntactically valid Expr, but it doesn't need to evaluate.)

  • type: The type of the option. This should be a configuration with merge and check members. TODO: to support future incremental evaluation, maybe merge should be a fold, rather than a function that takes a list of all option values.

  • if: A Boolean value that determines whether the option value is to be used, e.g.

    <
      environment.systemPackages
        | if enableFoo
        = [ foo ];
    >
    
  • prio: The numerical priority of the value. When computing the final value of the option, all values that have a lower priority than the highest-priority value(s) are discarded.

  • label, before and after: A string value used to topologically sort option values prior to merging. The NixOS module system uses a numeric value for this, but strings are nicer for things like ordering of the activation script:

    <
      activationScript
        | label "users-groups"
        = "... populate /etc/{passwd, group} ... ";
    
      # This activation script fragment needs to run after users/groups
      # have been created.
      activationScript
        | after "users-groups"
        = "install -d -o fnord -g /var/foo";
    >
    
  • final: Specifies that the value of the option cannot be changed by later modules.

  • scope: An attrset added to the lexical scope of the evaluation of values of this option, equivalent to putting with <scope>; around every value. Typical use case: allowing different pkgs sets to be in scope for different options for cross-compilation (e.g. you can have options hostBuildInputs and nativeBuildInputs with different package sets in scope).

Any other field name gives a syntax error. Only the initial definition of an option can specify doc, type and example.
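
The ordering implied by label, before and after amounts to a topological sort over the definitions of one option. A rough Python sketch using the standard library's graphlib (Python 3.9 or later); the dict representation of a definition and the function name order_definitions are assumptions made for illustration, and labels are assumed to be unique:

```python
from graphlib import TopologicalSorter


def order_definitions(defs):
    """Sort the definitions of one option so that after/before constraints hold.

    Each definition is a dict with a "value" and optional "label", "after"
    and "before" keys (all strings). Returns the values in merge order.
    """
    labelled = {d["label"]: i for i, d in enumerate(defs) if "label" in d}
    preds = {i: set() for i in range(len(defs))}  # index -> predecessors
    for i, d in enumerate(defs):
        if "after" in d:
            preds[i].add(labelled[d["after"]])
        if "before" in d:
            preds[labelled[d["before"]]].add(i)
    return [defs[i]["value"] for i in TopologicalSorter(preds).static_order()]


# Phase definitions given in an arbitrary order.
phases = [
    {"label": "install", "after": "build", "value": "install"},
    {"label": "configure", "after": "unpack", "value": "configure"},
    {"label": "unpack", "value": "unpack"},
    {"label": "build", "after": "configure", "value": "build"},
]

print(order_definitions(phases))  # ['unpack', 'configure', 'build', 'install']
```

A cycle between labels makes TopologicalSorter raise graphlib.CycleError, which seems like the behaviour one would want for contradictory constraints.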

Note: there is no merge field. If you want another merge policy you should adapt the type.
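
To make the role of a type concrete, here is a Python sketch of a type as a pair of check and merge members, with merge written as a binary operation that is folded over the definitions (the variant suggested in the TODO for the type field). OptionType, list_of, integer and merge_values are hypothetical names, and at least one definition is assumed to exist.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Any, Callable


@dataclass(frozen=True)
class OptionType:
    check: Callable[[Any], bool]      # is a single definition well-typed?
    merge: Callable[[Any, Any], Any]  # combines two values; used as a fold


def _conflict(a, b):
    raise ValueError(f"conflicting definitions: {a!r} and {b!r}")


# An integer can only be defined once (bool is excluded explicitly because
# it is a subclass of int in Python).
integer = OptionType(
    check=lambda v: isinstance(v, int) and not isinstance(v, bool),
    merge=_conflict,
)


def list_of(elem):
    """Lists are merged by concatenation."""
    return OptionType(
        check=lambda v: isinstance(v, list) and all(elem.check(x) for x in v),
        merge=lambda a, b: a + b,
    )


def merge_values(typ, values):
    """Type-check every definition, then fold them into one value."""
    for v in values:
        if not typ.check(v):
            raise TypeError(f"value {v!r} does not match the option type")
    return reduce(typ.merge, values)


ports = merge_values(list_of(integer), [[22], [80, 443]])
print(ports)  # [22, 80, 443]
```

A different merge policy is then just a different type, for example a list type whose merge keeps only the last definition.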

TODO: flip priorities? "Higher priority" is ambiguous.
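
Under the numbering used above (100 for ordinary values, 1500 for defaults), a numerically lower prio wins. A short Python sketch of the selection step that runs before merging, assuming each definition is a (priority, value) pair; the function name select is hypothetical:

```python
VALUE_PRIO = 100     # priority of "= <X>", i.e. "| value <X>"
DEFAULT_PRIO = 1500  # priority of "| default <X>"


def select(defs):
    """Keep only the values at the best (numerically lowest) priority."""
    best = min(prio for prio, _ in defs)
    return [value for prio, value in defs if prio == best]


# A default is ignored as soon as an ordinary value exists.
print(select([(DEFAULT_PRIO, False), (VALUE_PRIO, True)]))  # [True]

# Values at the same priority all survive and are merged afterwards.
print(select([(VALUE_PRIO, [80]), (VALUE_PRIO, [443])]))    # [[80], [443]]

# An explicit "| prio 0" discards ordinary values.
print(select([(VALUE_PRIO, [80]), (0, [8080])]))            # [[8080]]
```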

Configuration module fields

Probably it's also nice to have some global fields for a configuration module. E.g.

  • doc: A description of the configuration.

  • globalScope: Like scope, but for all options in this configuration. Typical use case would be globalScope = pkgs.

  • fieldScope: Like scope, but for the evaluation of option fields other than values/defaults. For example, fieldScope = types would allow getting rid of types. in option types.

More syntactic sugar

Configuration modules could have an if construct to enable/disable a group of values more concisely. This is similar to mkIf in the NixOS module system, which pushes the conditional down into the options.

<
  enableFoo
    | type bool
    = false;

  if enableFoo {
    environment.systemPackages = [ foo ];
    systemd.units.foo = ...;
  };

  # The above is equivalent to:
  environment.systemPackages | if enableFoo = [ foo ];
  systemd.units | if enableFoo = { foo = ...; };
>

However, this might not be very useful if we get rid of most enable options (see below).

Configurations as derivation builders

Currently Nix uses functions as the mechanism to abstract over common derivation patterns. For example, stdenv.mkDerivation, buildPythonPackage and fetchurl are all functions that take some inputs and return a derivation. The problem with this approach is that the act of applying a function consumes its inputs, leaving us without the ability to override inputs or to query the inputs. This led to hacks like .override, .overrideDerivation, passthru, meta and so on. Also, functions lack a documentation mechanism.

We can replace such functions with configuration modules that produce a drv output option. For example, here is a very bare-bones derivation configuration module:

builders.derivation = <

  # Interface

  name
    | doc "Name of the derivation, used in the Nix store path."
    | type types.str
    | example "openssl";

  version
    | doc "Version of the derivation, used in the Nix store path."
    | type types.str
    | example "1.0.2"
    = "";

  builder
    | doc "Command to be executed to build the derivation."
    | type types.path
    | example "${bash}/bin/sh";

  args
    | doc "Arguments passed to the builder."
    | type types.unique (types.list types.str)
    = [];

  outputs
    | doc "Symbolic names of the outputs of this derivation."
    | type types.list types.str
    = [ "out" ];

  env
    | doc "Structured values passed to the builder."
    | type types.attrsOf types.str
    = { inherit outputs; };

  # Implementation

  drv
    | doc "The resulting store derivation."
    | final
    = builtins.derivation ({
        name = "${name}-${version}";
        inherit builder args;
      } // env);

>;

We can then build other abstractions on top of this one:

# This is basically stdenv.
builders.generic = <

  extends builders.derivation;

  # Interface

  buildInputs
    | doc "Dependencies of this derivation."
    | type types.list types.package # FIXME
    = [];

  phases
    | doc "Names of build phases."
    | type types.list types.str
    | example ["build" "install"];

  # Implementation

  builder = "${bash}/bin/bash";

  args = [ "-c" stdenv/generic/setup.sh ];

  env = { inherit phases buildInputs; };

  ...
>;

# A package is a derivation that can be installed by nix-env.
builders.package = <

  description
    | doc "A short (one-line) description of the package."
    | type types.str;

  homepage = ...;
  longDescription = ...;

  supportedPlatforms
    | doc "List of platforms on which this package is supported."
    | type types.list types.platform;

  # nix-env won't show this package if `supported` evaluates to false.
  supported
    | doc "Whether the package is supported on the target platform."
    | type types.bool
    = elem system supportedPlatforms;

  # Nix will refuse to build this package if `enabled` evaluates to false.
  enabled
    | doc "Whether the package can be built."
    | type types.bool
    = (free || config.allowUnfree) && supported;

>;

# Global Nixpkgs configuration.
config = <

  allowUnfree
    | doc "Whether to allow unfree software to be installed."
    | type types.bool
    = false;

>;

builders.unixPackage = <

  extends builders.generic builders.package;

  src
    | doc "The source tarball of the package."
    | type types.path;

  configureFlags
    | doc "Flags passed to the package's configure script."
    | type types.list types.str
    = [];

  # Create some phases.
  phases | label "unpack" = ["unpack"];
  phases | label "configure" | after "unpack" = ["configure"];
  phases | label "build" | after "configure" = ["build"];
  phases | label "install" | after "build" = ["install"];

  configurePhase
    | doc "Shell code to configure the package."
    | type types.lines
    = "./configure --prefix=${placeholder out} $configureFlags";

  buildPhase
    | doc "Shell code to build the package."
    | type types.lines
    = "make";

  installPhase
    | doc "Shell code to install the package."
    | type types.lines
    = "make install";

  env = { inherit configurePhase ...; };
>;

# An actual package.
pkgs.hello = <
  extends builders.unixPackage;

  name = "hello";
  version = "1.12";
  description = "A program that produces a familiar, friendly greeting";
  license = licenses.gpl;

  enableGUI
    | doc "Enable GTK+ support."
    | type types.bool
    | default false;

  # This uses the functional notation for extending a configuration:
  src = builders.fetchurl { url = ...; sha256 = ...; };

  buildInputs = if enableGUI then [ gtk ] else [];
  # or equivalently:
  buildInputs | if enableGUI = [ gtk ];
>;

Since configuration values are computed lazily, a tool like nix-env -qa only needs to evaluate attributes like name and enabled. It doesn't need to evaluate drv.
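
The effect of laziness on a metadata query can be illustrated with a small Python model in which each option is a thunk and the configuration records which options were actually forced. LazyConfig, query and build_derivation are hypothetical names; the hello options mirror the example above.

```python
class LazyConfig:
    def __init__(self, defs):
        self._defs = defs
        self._cache = {}
        self.forced = []  # names of the options evaluated so far, in order

    def __getitem__(self, name):
        if name not in self._cache:
            self.forced.append(name)
            self._cache[name] = self._defs[name](self)
        return self._cache[name]


def build_derivation(cfg):
    # Stands in for the expensive part; a metadata query must never get here.
    raise RuntimeError("drv should not be evaluated by a metadata query")


hello = LazyConfig({
    "name": lambda cfg: "hello",
    "version": lambda cfg: "1.12",
    "supported": lambda cfg: True,
    "enabled": lambda cfg: cfg["supported"],
    "drv": build_derivation,
})


def query(cfg):
    """Roughly what a package listing needs for one package."""
    return (cfg["name"], cfg["enabled"])


print(query(hello))  # ('hello', True)
print(hello.forced)  # ['name', 'enabled', 'supported']
```

Only name, enabled and the option that enabled depends on are evaluated; version and drv stay untouched.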

Package overrides

# To override the source of Hello.
pkgs.hello = <
  src.url = http://example.org/my-hello.tar.gz;

  # Note: this is sugar for:
  src = < url = http://example.org/my-hello.tar.gz; >;
  # I.e. it adds a module to the configuration 'src'.
>;

# To enable a feature in Hello.
pkgs.hello = <
  enableGUI = true;
>;

# To add something to installPhase.
pkgs.hello = <
  installPhase | after default = "mv $out/bin/hello $out/bin/my-hello";
>;

# To change a dependency of Hello in Hello *only*.
pkgs.hello = <
  targetPackages = < extends pkgs; gtk = < src = ./my-gtk.tar.gz; >; >;
>;

# To replace Hello entirely. FIXME
pkgs.hello | prio 0 = pkgs.my-hello;
pkgs.my-hello = < ... >;

Plugins

TODO

pkgs.firefox = <
  extends builders.unixPackage;

  plugins
    | doc "Set of enabled plugins."
    | type subsetOf pkgs.mozillaPlugins
    # this type should implicitly set `scope` to the given attrset
    = [];

  installPhase = "... generate firefox wrapper script ...";
>;

pkgs.mozillaPlugins.flashplayer = < ... >;

To enable:

pkgs.firefox = <
  plugins = [ flashplayer ];
>;

User interface

Having explicit package options and attribute annotations allows nix to show and modify options. E.g.

$ nix query-package nixpkgs.firefox
Available options:

- enablePulseAudio: Enable sound via PulseAudio.
Default value: true
Current value: N/A

- enableOfficialBranding
...

- plugins: Set of enabled plugins.
Possible values:
  nixpkgs.flashplayer
  nixpkgs.google-talk
  ...

$ nix install nixpkgs.firefox --with enableOfficialBranding --without enablePulseAudio

$ nix modify-package nixpkgs.firefox --with enablePulseAudio --add plugins nixpkgs.flashplayer

TBD: we probably want to show only end-user options like enablePulseAudio by default, rather than every option inherited from builders, like buildInputs or name. So we need some way to specify which options should be shown in what context (like the internal flag in the current NixOS module system).

Nixpkgs/NixOS structure

TODO: how to hook everything up.

At top-level, Nixpkgs should evaluate to a set (or configuration?) of configurations:

{
  lib = <
    ...;
  >;

  builders = <
    derivation = ...;
    generic = ...;
    package = ...;
    unixPackage = ...;
    fetchurl = ...;
    ...;
  >;

  pkgs = <
    glibc = import .../glibc;
    hello = import .../hello;
  >;

  # NixOS modules
  modules = <
    etc = import modules/.../etc.nix;
    ...;
    top-level = import modules/.../top-level.nix;
    ...
    # The minimal NixOS system.
    base = < extends etc top-level initrd kernel ...; >;

    # Non-default stuff.
    kde = import modules/.../kde.nix;
  >;
}

We should no longer include every NixOS module by default. Instead of using enable options, you just inherit NixOS modules into your system configuration, e.g.

myConfig = <
  extends modules.base modules.kde modules.nginx ...;
  ...
>;

NixOS services

In the NixOS Of The Future, most modules shouldn't be top-level configuration modules, since those 1) violate POLA: you don't want the PostgreSQL module to have the ability to change your X11 configuration; and 2) are not functional: you can't easily instantiate a module multiple times, e.g. to have multiple PostgreSQL instances.

Instead most NixOS modules should be implemented as extensions of smaller configuration types. For example, builders.service allows the construction of a service that declares some systemd unit, and only has access to (say) /var/services/<service-name> and the closure of the unit in the Nix store. (This is essentially a light-weight container.)

builders.service = <
  name
    | doc "The identifier of the service."
    | type str;

  systemdUnits
    | doc "The systemd units that implement this service."
    | type attrsOf /* instances of builders.systemUnit */
    = {};

  ...
>;

So PostgreSQL can be an extension of this type:

services.postgresql = builders.service <
  # Interface
  port
    | doc "TCP port on which the server listens."
    | type int
    = 5432;

  # Implementation
  systemdUnits =
    ... a systemd service that runs postgresql, using
    /var/services/<service-name> as the data directory ...;

>;

NixOS at top-level has a systemServices option that allows services to be hooked into the system:

modules.services = <

  systemServices
    | doc "Set of isolated system services"
    | type attrsOf /* instances of builders.service */;

  systemd.units =
    ... the concatenation of the systemdUnits values of all systemServices ...;

>;

A NixOS configuration can then instantiate multiple PostgreSQL instances:

<
  systemServices.postgresql-prod = services.postgresql < >;
  systemServices.postgresql-test = services.postgresql < port = 12345; >;
>

Include mechanism

The idea is to provide an include mechanism as an alternative to import. The expression include ./foo.nix is equivalent to replacing the expression with the contents of ./foo.nix. Thus foo.nix has access to the lexical scope at the site where it is included.

This means that we can write all-packages.nix as a long list of inputs:

<
  pkgs = <
    git = include ./git.nix;
    openssl = include ./openssl.nix;
    ...
  >;
>

where the package expressions look like

builders.unixPackage <
  name = "git";
  version = "2.3.4";
  ...
  buildInputs = [ openssl ];
>

That is, git.nix is no longer a function that takes builders as an argument. Rather it just assumes that builders exists in the lexical scope, thus avoiding the need for ugly hacks like callPackage. This might seem like a radical departure from how we have expressed dependencies in Nixpkgs, but in fact it's already widely in use, see e.g. perl-packages.nix or python-packages.nix where most dependencies are obtained from the surrounding rec set. include simply makes it possible to split such package sets into multiple files.

Note: The include mechanism must be lazy, i.e., the inclusion shouldn't be done eagerly while parsing the including file, since we don't want all-packages.nix to eagerly parse all its referenced files.

Note: This is not a token-based include mechanism like C's #include. That is, include E is only valid as an expression (i.e. the grammar will have a non-terminal expr: INCLUDE expr), and the contents of the included file are parsed as an expr. Thus you cannot write something like

{ x = 123;
  include ./other-attrs.nix
}
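
The two properties asked of include (it is an expression evaluated in the scope of the including site, and the file is parsed only on first use) can be mimicked in Python with compile and eval. This is only an analogy under those assumptions: the Include class, the .expr file names and the openssl placeholder value are made up for the demonstration.

```python
import os
import tempfile


class Include:
    """Lazily parse a file as one expression, then evaluate it in the scope
    of the including site (represented here by a dict)."""

    parsed = 0  # how many files have actually been parsed

    def __init__(self, path):
        self.path = path
        self._code = None

    def evaluate(self, scope):
        if self._code is None:
            with open(self.path) as f:
                # Mode "eval" accepts exactly one expression, so a file
                # containing a bare fragment or several statements is rejected.
                self._code = compile(f.read().strip(), self.path, "eval")
            Include.parsed += 1
        return eval(self._code, {"__builtins__": {}}, scope)


with tempfile.TemporaryDirectory() as tmp:
    git = os.path.join(tmp, "git.expr")
    unused = os.path.join(tmp, "unused.expr")
    with open(git, "w") as f:
        f.write('{"name": "git", "buildInputs": [openssl]}')
    with open(unused, "w") as f:
        f.write("this file is never parsed (")

    # Creating the package set parses nothing.
    pkgs = {"git": Include(git), "unused": Include(unused)}

    # openssl is not passed as an argument; it comes from the including scope.
    scope = {"openssl": "openssl-placeholder"}
    result = pkgs["git"].evaluate(scope)

print(result)          # {'name': 'git', 'buildInputs': ['openssl-placeholder']}
print(Include.parsed)  # 1
```

The malformed second file causes no error because it is never used, which is the laziness the first note asks for.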

TODO: this may not be needed since option values are evaluated in a dynamic scope anyway. E.g. in the example above openssl doesn't have to be provided as a function argument anyway because it doesn't refer to the lexical scope but to the scope set by the buildInputs option. However, builders does need to come from somewhere, and it would be annoying if we had to change the whole file into a function just to pass in builders. OTOH, if we used the extends syntax instead, then maybe this could be avoided: its argument could be evaluated in a dynamic scope.

Discussion

Memory / performance impact

Hopefully Nix configurations can lead to less CPU- and memory-hungry Nixpkgs/NixOS evaluation. There are some aspects to this:

  • Configuration values should be implemented internally by keeping weak pointers to option values. Thus option values can be garbage collected (and re-evaluated from the saved thunk if necessary). Of course, this is a risky strategy because the thunk might be bigger than the WHNF. So if an option value is "small" (like an integer or small string) it makes sense to overwrite the thunk with a non-weak pointer to the resulting value.

  • Getting rid of pass-thru derivation attributes like .override, .overrideDerivation, .meta etc. means that after evaluating a drv option value of a package configuration, we can garbage-collect the entire package configuration (assuming nothing else is keeping it alive).

  • Not including every NixOS module by default should save a huge amount of time and memory.

  • Doing the module system in C++ rather than in a slow purely functional language should give a big speedup.
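
The weak-pointer idea from the first item can be sketched in Python with the weakref module. This is an illustration of the caching policy only, not of how the Nix evaluator manages memory: Option and Value are hypothetical classes, the small flag stands in for whatever size heuristic would be used, and the demonstration relies on the value being reclaimed once the last strong reference is dropped (immediate in CPython, with gc.collect() called for good measure).

```python
import gc
import weakref


class Value:
    """Box around an evaluated value; plain lists and ints cannot be
    weakly referenced in Python, instances of this class can."""

    def __init__(self, payload):
        self.payload = payload


class Option:
    def __init__(self, thunk, small=False):
        self._thunk = thunk          # always kept, so we can re-evaluate
        self._small = small
        self._strong = None          # strong pointer, used for small values
        self._weak = lambda: None    # stand-in for a dead weak reference
        self.evaluations = 0

    def force(self):
        value = self._strong if self._strong is not None else self._weak()
        if value is None:
            self.evaluations += 1
            value = Value(self._thunk())
            if self._small:
                self._strong = value
            else:
                self._weak = weakref.ref(value)
        return value


big = Option(lambda: list(range(1000)))
held = big.force()       # first evaluation; `held` keeps the value alive
big.force()              # served from the weak pointer
print(big.evaluations)   # 1

del held                 # nothing else refers to the value any more
gc.collect()
big.force()              # the weak pointer is dead: re-evaluate the thunk
print(big.evaluations)   # 2

small = Option(lambda: 42, small=True)
small.force()
gc.collect()
small.force()
print(small.evaluations)  # 1
```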

Incremental evaluation?

Can we make NixOS module evaluation incremental? E.g. in a NixOps module with multiple machines that share a base configuration:

myBase = <
  extends modules.nixos modules.kde ...;
  networking.firewall.allowedTCPPorts = [ 1234 ];
>;

machine1 = myBase < networking.hostname = "machine1"; >;
machine2 = myBase < networking.hostname = "machine2"; >;
...

it would be nice if the evaluation of the configurations of machine1 and machine2 only needs to recompute the options that are affected by the change to networking.hostname.
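
One conceivable approach, sketched here in Python purely as a thought experiment, is to record which options each option reads while it is evaluated, and to let an extended configuration reuse every cached value whose transitive dependencies do not include a changed option. The Config class, the option names and the replace-instead-of-merge semantics are all simplifying assumptions.

```python
class Config:
    def __init__(self, defs, base=None):
        self.defs = {**base.defs, **defs} if base else dict(defs)
        self.cache = {}
        self.deps = {}        # option -> options read while evaluating it
        self.evaluated = []   # options this instance had to evaluate itself
        self._stack = []
        if base:
            changed = set(defs)
            for name, value in base.cache.items():
                # Reuse a value only if nothing it depends on was changed.
                if not (base.closure(name) & changed):
                    self.cache[name] = value
                    self.deps[name] = base.deps[name]

    def closure(self, name):
        """The option itself plus everything it transitively read."""
        seen, todo = set(), [name]
        while todo:
            n = todo.pop()
            if n not in seen:
                seen.add(n)
                todo.extend(self.deps.get(n, ()))
        return seen

    def __getitem__(self, name):
        if self._stack:
            self.deps[self._stack[-1]].add(name)  # record the dependency
        if name not in self.cache:
            self.deps[name] = set()
            self._stack.append(name)
            try:
                self.cache[name] = self.defs[name](self)
                self.evaluated.append(name)
            finally:
                self._stack.pop()
        return self.cache[name]


base = Config({
    "ports": lambda c: [1234],
    "firewall": lambda c: "allow " + ",".join(map(str, c["ports"])),
    "hostname": lambda c: "localhost",
    "motd": lambda c: "welcome to " + c["hostname"],
})
base["firewall"]
base["motd"]

machine1 = Config({"hostname": lambda c: "machine1"}, base)
print(machine1["firewall"])  # allow 1234
print(machine1["motd"])      # welcome to machine1
print(machine1.evaluated)    # ['hostname', 'motd']
```

The firewall value is taken over from the base configuration, while motd is recomputed because it reads hostname. Whether this bookkeeping pays off with merged option values is exactly the open question.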

Do we need "super"?

The self: super: style has a self attribute to refer to the final set, and a super attribute to refer to the previous set in the chain. (It should really be named final: previous:.) In configuration modules, any reference is to the final configuration, so no special notation for referring to self is required. But do we need an equivalent for super?

The typical use case is to extend a previous value, e.g.

buildInputs = super.buildInputs ++ [ libfoo ];

However, merge functions remove the need for this: we just concatenate all definitions of buildInputs, just as is done in the NixOS module system, i.e.

buildInputs = [ libfoo ];

If we want to throw away previous definitions, we do:

buildInputs override = [ libfoo ];

(This corresponds with mkForce in the module system.)

The experience with the NixOS module system suggests that this is sufficient and we don't need a super keyword.

Strict argument checking

Historically, Nixpkgs has been full of builder functions that accept an open set of arguments (...), e.g. stdenv.mkDerivation. This is bad for error checking since a typo in an argument won't be detected. Thus, configuration modules like builders.package or builders.generic have a closed set of inputs in the sense that they cannot refer to options that haven't been declared yet. Also, builders.derivation doesn't pass through arbitrary arguments to the builder environment; instead the environment has to be specified explicitly (by adding values to the env option).

This should be fine because the use of environment variables really predates the introduction of string interpolation. Originally, if (say) buildPhase needed access to some variable foo, we would write

derivation {
  inherit foo;
  buildPhase = "... $foo ... ";
}

But nowadays we can just write

<
  buildPhase = "... ${foo} ... ";
>
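
The difference between an open and a closed set of inputs can be shown with a small Python comparison. Both functions are hypothetical stand-ins: mk_derivation plays the role of a builder function accepting arbitrary arguments, and extend plays the role of extending a configuration whose options have all been declared.

```python
def mk_derivation(name, **env):
    """Open argument set: anything extra is passed through unchecked."""
    return {"name": name, **env}


def extend(declared, definitions):
    """Closed option set: only declared options may be given values."""
    unknown = sorted(set(definitions) - set(declared))
    if unknown:
        raise ValueError("undeclared option(s): " + ", ".join(unknown))
    return {**declared, **definitions}


# The typo "buildInput" is silently accepted and ends up in the environment.
print(mk_derivation("hello", buildInput=["gtk"]))

generic = {"name": None, "buildInputs": [], "phases": []}
print(extend(generic, {"name": "hello", "buildInputs": ["gtk"]}))

try:
    extend(generic, {"buildInput": ["gtk"]})
except ValueError as e:
    print(e)  # undeclared option(s): buildInput
```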

Migration path

TODO

@versedwildcat

This is awesome. Is this still an "eventually" idea? Or has the nix ecosystem since matured/(become complex) enough that it is no longer feasible?

@roberth

roberth commented Dec 13, 2023

The Nix team is currently trying to finish other eventually-level ideas that already have implementations in the form of experimental features, that need to be made sustainable first.
