@CMCDragonkai
Last active June 23, 2022 17:31
An Illustration of a Nix Python Package #nix #python

An Illustration of a Nix Python Package

Every Python package (which is a collection of modules associated with version metadata) needs to have a setup.py.

So first create a relevant setup.py. Refer to my article here for more information: https://gist.github.com/CMCDragonkai/f0285cdc162758aaadb957c52f693819

Remember that setup.py is meant to store version-compatible ranges of dependencies, whereas requirements.txt is an optional list of exactly pinned packages generated by pip freeze > requirements.txt.

It is incorrect to use requirements.txt as the install_requires list in setup.py because they are not meant to be used for the same thing.
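To make the difference concrete, here is a minimal sketch that tells a pinned requirement apart from a range. The version numbers are hypothetical and chosen only for illustration, and the regular expression is a simplification of the full requirement syntax, not a complete parser.

```python
import re

# Hypothetical version numbers, chosen only for illustration.
install_requires = ["numpy>=1.15,<2"]   # setup.py: a compatible range
requirements_txt = ["numpy==1.15.4"]    # pip freeze: one exact version

# A requirement counts as pinned when it uses "==" with no "*" wildcard.
PINNED = re.compile(r"^[A-Za-z0-9._-]+==[^*,\s]+$")


def is_pinned(requirement: str) -> bool:
    """Return True if the requirement fixes exactly one version."""
    return PINNED.match(requirement.strip()) is not None


if __name__ == "__main__":
    for req in install_requires + requirements_txt:
        print(req, "pinned" if is_pinned(req) else "range")
```

Entries in install_requires should normally report as ranges, while every line of a frozen requirements.txt should report as pinned.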

The requirements.txt is optional, but is a good practice for allowing non-Nix users to attempt to build a reproducible environment.

However, because we install the current package in --editable mode, you should remove the line referencing the current package from the requirements.txt.
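Removing that line can be scripted. The following is a rough sketch, assuming the current package shows up in the pip freeze output either as an editable line (such as "-e ." or "-e ...#egg=name") or as a plain "name==version" pin. The exact format depends on the pip version, and the helper names are made up for this illustration.

```python
import re


def normalize(name: str) -> str:
    """Normalize a project name: runs of '-', '_', '.' become one '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def strip_current_package(lines, package):
    """Drop requirement lines that refer to the current package."""
    target = normalize(package)
    kept = []
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("-e "):
            spec = stripped[3:].strip()
            # "-e ." is the current checkout; "#egg=name" names the package.
            if spec == "." or normalize(spec.rsplit("egg=", 1)[-1]) == target:
                continue
        # A plain "name==version" style line for the package itself.
        name = re.split(r"[=<>!~\[; ]", stripped, maxsplit=1)[0]
        if name and normalize(name) == target:
            continue
        kept.append(line)
    return kept


if __name__ == "__main__":
    frozen = ["numpy==1.15.4", "-e .", "awesome_package==0.0.1"]
    print(strip_current_package(frozen, "awesome-package"))
```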

After that we need pkgs.nix, default.nix, shell.nix and release.nix.

The pkgs.nix will point to a content-addressed Nixpkgs distribution. The default.nix will be the derivation of the current package, and will work with nix-build. The shell.nix will import default.nix and override it with development-specific dependencies. Finally, release.nix will import default.nix and produce several release targets. One of them will be Docker.

You should also have a MANIFEST.in that includes all the Nix files.

The following is an example of the relevant files.

The setup.py:

#!/usr/bin/env python3

from setuptools import setup, find_packages

with open('README.md', 'r') as f:
    long_description = f.read()

setup(
    name='awesome-package',
    version='0.0.1',
    author='Your Name',
    author_email='your@email.com',
    description='Does something awesome',
    long_description=long_description,
    url='https://awesome-package.git',
    packages=find_packages(),
    scripts=['awesome-script'],
    install_requires=['numpy'])

The pkgs.nix:

import (fetchTarball https://github.com/NixOS/nixpkgs-channels/archive/e6b8eb0280be5b271a52c606e2365cb96b7bd9f1.tar.gz) {}

The default.nix:

{
  pkgs ? import ./pkgs.nix,
  pythonPath ? "python36"
}:
  with pkgs;
  let
    python = lib.getAttrFromPath (lib.splitString "." pythonPath) pkgs;
  in
    python.pkgs.buildPythonApplication {
      pname = "awesome-package";
      version = "0.0.1";
      src = lib.cleanSourceWith {
        filter = (path: type:
          ! (builtins.any
              (r: (builtins.match r (builtins.baseNameOf path)) != null)
              [
                "pip_packages"
                ".*\\.egg-info"
              ])
        );
        src = lib.cleanSource ./.;
      };
      propagatedBuildInputs = (with python.pkgs; [
        numpy
      ]);
    }
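The pythonPath argument above is split on dots and then looked up step by step inside pkgs. As a loose analogy only, the same lookup over nested Python dictionaries might look like this. The contents of the toy pkgs dictionary are hypothetical and do not reflect real Nixpkgs attribute names.

```python
from functools import reduce


def get_attr_from_path(path, attrs):
    """Walk nested dicts by a list of keys, similar to lib.getAttrFromPath."""
    return reduce(lambda node, key: node[key], path, attrs)


# A toy stand-in for the Nixpkgs attribute set (hypothetical contents).
pkgs = {
    "python36": "python-3.6-interpreter",
    "nested": {"python": "some-other-interpreter"},
}


def resolve(python_path: str):
    """Split a dotted path and look it up, as default.nix does with pythonPath."""
    return get_attr_from_path(python_path.split("."), pkgs)


if __name__ == "__main__":
    print(resolve("python36"))
    print(resolve("nested.python"))
```

Taking a dotted string rather than a fixed attribute means the interpreter can be swapped from the command line without editing the file.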

The shell.nix:

{
  pkgs ? import ./pkgs.nix,
  pythonPath ? "python36"
}:
  with pkgs;
  let
    python = lib.getAttrFromPath (lib.splitString "." pythonPath) pkgs;
    drv = import ./default.nix { inherit pkgs pythonPath; };
  in
    drv.overrideAttrs (attrs: {
      src = null;
      shellHook = ''
        echo 'Entering ${attrs.pname}'
        set -v

        # extra pip packages
        unset SOURCE_DATE_EPOCH
        export PIP_PREFIX="$(pwd)/pip_packages"
        PIP_INSTALL_DIR="$PIP_PREFIX/lib/python${python.majorVersion}/site-packages"
        export PYTHONPATH="$PIP_INSTALL_DIR:$PYTHONPATH"
        export PATH="$PIP_PREFIX/bin:$PATH"
        mkdir --parents "$PIP_INSTALL_DIR"
        pip install --editable .

        set +v
      '';
    })

The release.nix:

{
  pkgs ? import ./pkgs.nix,
  pythonPath ? "python36"
}:
  with pkgs;
  let
    drv = import ./default.nix { inherit pkgs pythonPath; };
  in
    {
      docker = dockerTools.buildImage {
        name = drv.pname;
        contents = drv;
        config = {
          Cmd = [ "/bin/awesome-script" ];
        };
      };
    }

The MANIFEST.in:

include README.md
include LICENSE
include requirements.txt
include pkgs.nix
include default.nix
include release.nix
include shell.nix

graft tests

global-exclude *.py[co]

Both buildPythonApplication and buildPythonPackage expect the source to be a legitimate Python package (a directory that contains a setup.py).

The difference between buildPythonApplication and buildPythonPackage is that buildPythonApplication does not prefix the resulting derivation name with the Python interpreter major version. Therefore buildPythonApplication should be used when the derivation is intended for applications that are not used like libraries (for example if they include scripts). However, there are packages that are used both as applications and as libraries. In that case refer to the Nixpkgs manual and look for the toPythonApplication documentation.

When using buildPythonApplication or buildPythonPackage make sure to use pname and not name. The name will be automatically created by joining the pname and version.

The numpy package is put into propagatedBuildInputs instead of buildInputs because Python is an interpreted language, which requires the numpy package to exist at runtime.

lib.cleanSource and lib.cleanSourceWith are used in order to ignore metadata files and directories such as .git, along with a number of generated files such as the ./result symlink created by nix-build. This avoids the "store-leak" caused by repeated invocations of nix-build while the src path keeps changing. For more information see: https://gist.github.com/CMCDragonkai/8d91e90c47d810cffe7e65af15a6824c
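The extra filter passed to lib.cleanSourceWith in default.nix tests the base name of each path against a list of regular expressions. A Python sketch of the same decision is shown below. It only mirrors the filter logic and is not part of the build; the function name keep is made up for this example.

```python
import os
import re

# The same two patterns as the Nix filter; "\." matches a literal dot.
EXCLUDED = [r"pip_packages", r".*\.egg-info"]


def keep(path: str) -> bool:
    """Return True if the path should be copied into the source."""
    base = os.path.basename(path)
    # builtins.match anchors the regex to the whole string, as fullmatch does.
    return not any(re.fullmatch(pattern, base) for pattern in EXCLUDED)


if __name__ == "__main__":
    for candidate in ["./setup.py", "./pip_packages", "./awesome_package.egg-info"]:
        print(candidate, "kept" if keep(candidate) else "filtered out")
```

Because the match is anchored to the whole base name, a file such as pip_packages_notes.txt is still kept.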

Note the override of src used in the shell.nix. It is there because the nix-shell environment does not need to create a temporary build directory. It just uses your current directory.

The unsetting of SOURCE_DATE_EPOCH is needed for building binary wheels.

We use pip_packages for pip-installed local packages. This is meant to mirror the node_modules directory used in Node.js. It makes it easier to delete if we need to wipe it out and install it again.
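The shell hook derives the install directory from PIP_PREFIX and the interpreter version. As a small sketch of that path arithmetic, assuming the version component has the form major.minor (for example 3.6) and with helper names invented for this example:

```python
import os
import sys


def pip_install_dir(project_root: str) -> str:
    """Build <root>/pip_packages/lib/pythonX.Y/site-packages."""
    prefix = os.path.join(project_root, "pip_packages")
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    return os.path.join(prefix, "lib", f"python{version}", "site-packages")


def is_on_pythonpath(directory: str, pythonpath: str) -> bool:
    """Check whether a directory is one of the PYTHONPATH entries."""
    return directory in pythonpath.split(os.pathsep)


if __name__ == "__main__":
    target = pip_install_dir(os.getcwd())
    print(target)
    print(is_on_pythonpath(target, os.environ.get("PYTHONPATH", "")))
```

Running this inside nix-shell is one way to check that the directory exported by the hook is actually on PYTHONPATH.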

Note how pip install --editable . ensures that within the development environment, module directories are "importable" using absolute names. It also means that updates to the Python source code will be automatically used by any installed scripts that are now in the $PATH environment variable.

Note that pip will ignore any dependencies that are already installed by Nix.
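To see which dependencies are already visible inside the environment, you can query the installed distribution metadata. This is only a rough check, assuming Python 3.8 or later for importlib.metadata. pip's own dependency handling is more involved than this lookup.

```python
from importlib import metadata  # in the standard library from Python 3.8


def installed_version(name: str):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for dist in ["numpy", "pip"]:
        print(dist, installed_version(dist))
```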

Sometimes you want to force pip to ignore installation of any dependencies, especially when you already have all the dependencies required, and pip with setuptools does dumb things. In those cases, use:

installFlags = [ "--no-deps" ];

When doing this, you will probably also need to disable tests:

doCheck = false;

Remember to add /result to your .gitignore to deal with nix-build.

This allows your whole project to be used by developers who are not using Nix, and you can submit it to PyPI.

Overrides

Sometimes you need to override a package in the package set for all dependencies.

This can happen due to dependency collision:

Package duplicates found in closure, see above. Usually this happens if two packages depend on different version of the same dependency.

A common example is setting the backend of matplotlib to Qt4.

To do this, create an overrides.nix like this:

{ pkgs, pythonPath }:
  with pkgs;
  let
    pythonPath_ = lib.splitString "." pythonPath;
  in
    import path
    {
      overlays = [(
        self: super:
          lib.setAttrByPath
          pythonPath_
          (
            lib.getAttrFromPath (pythonPath_ ++ ["override"]) super
            {
              packageOverrides = self: super:
                {
                  matplotlib = super.matplotlib.override { enableQt = true; };
                };
            }
          )
      )];
    }

Then in default.nix change to using pkgs_ instead of pkgs:

pkgs_ = import ./overrides.nix { inherit pkgs pythonPath; };

The main reason this is required for Python is that Python cannot keep multiple versions of the same dependency in the same project. All packages in a project, including transitive dependencies, must agree on a single version of each shared dependency.
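The "Package duplicates found in closure" error boils down to one package name appearing with more than one version. A minimal sketch of such a check, using a made-up closure of (name, version) pairs rather than the actual Nixpkgs implementation:

```python
from collections import defaultdict


def find_collisions(closure):
    """Return names that appear with more than one version in the closure."""
    versions = defaultdict(set)
    for name, version in closure:
        versions[name].add(version)
    return {name: sorted(found) for name, found in versions.items() if len(found) > 1}


if __name__ == "__main__":
    # Hypothetical closure: two packages pull in different versions of six.
    closure = [("numpy", "1.15"), ("six", "1.11"), ("six", "1.12")]
    print(find_collisions(closure))
```

Overriding the package once in the whole package set, as overrides.nix does, is what keeps every dependency pointing at the same version.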

Installation

Install into Nix user profile:

nix-env -f ./default.nix -i

Using Docker:

# load the container from serialised image
docker load --input "$(nix-build ./release.nix --attr docker)"
# run the container
docker run awesome-package
# run the container with alternative command (works even with no Cmd)
docker run awesome-package /bin/some-other-command
# view contents
docker save awesome-package | tar x --to-stdout --wildcards '*/layer.tar' | tar t --exclude="*/*/*/*"

Development

# explore the nixpkgs package set
nix-repl ./pkgs.nix
# development environment (if there are multiple attributes, one must be chosen with --attr)
nix-shell
# you can also use nix-shell against a different package set
nix-shell -I nixpkgs=/some/other/nixpkgs -p "python3.withPackages (ps: with ps; [ dask distributed ])"
nix-shell -I nixpkgs=https://github.com/NixOS/nixpkgs-channels/archive/06808d4a14017e4e1a3cbc20d50644549710286f.tar.gz -p "python3.withPackages (ps: with ps; [ dask distributed ])"
# test the build (if there are multiple attributes, all will be built as ./result, ./result-*)
nix-build
# check the derivation
nix-instantiate
# show immediate dependencies (derivations)
nix-store --query --references $(nix-instantiate)
# show transitive dependencies (derivations)
nix-store --query --requisites $(nix-instantiate)
# clean the build
link="$(readlink ./result)" \
&& rm ./result \
&& nix-store --delete "$link"
# show the derivation closure in JSON
nix show-derivation --file ./default.nix
# show the derivation closure recursively in JSON
nix show-derivation --file ./default.nix --recursive
# show how your derivation depends on another derivation
nix why-depends \
  --include 'pkgs=./pkgs.nix' \
  --include 'default=./default.nix' \
  --all \
  default \
  pkgs.python36.pkgs.numpy

You can also use nix-gc to help clean up result symlinks: https://gist.github.com/CMCDragonkai/0908706df9c9dbc45575a2345fab93f1


Note that the reason this doesn't use pypi2nix is that pypi2nix doesn't share derivations/store-paths with the nixpkgs pythonPackages package set. This is not ideal. However, if you really need to work on a custom Python package that depends on lots of packages not packaged in nixpkgs, then you can use pypi2nix instead. See nix-community/pypi2nix#222.


Dealing with data files is tricky in Python. Basically you have 3 options:

  1. Using package_data
  2. Using data_files
  3. Using MANIFEST.in

https://blog.ionelmc.ro/presentations/packaging

When using package_data, the files have to be inside a Python module directory. Once installed, they will sit in the installed Python module directory. To refer to these files properly you have to use the pkg_resources module, which is provided by the setuptools package. This means setuptools becomes a runtime dependency, which is not very nice. Since Python 3.7 there is also importlib.resources in the standard library. If the file is only loaded by module code within the same directory, it is sufficient to use this instead: https://gist.github.com/CMCDragonkai/2e0ae76e87537b708ed12ba05851d96b Note that you must use the name of the module directory as the key:

package_data={
    'module_dir': ['file_in_module_dir']
}

See: https://importlib-resources.readthedocs.io/en/latest/
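As a hedged sketch of reading such a file without pkg_resources, the following uses importlib.resources.files, which assumes Python 3.9 or later (older versions would need the importlib_resources backport). The package name module_dir_demo is made up, and the demo builds a throwaway package on disk so that it can run on its own.

```python
import importlib
import sys
import tempfile
from importlib import resources
from pathlib import Path


def read_packaged_text(package: str, filename: str) -> str:
    """Read a data file that lives inside an importable package."""
    return resources.files(package).joinpath(filename).read_text(encoding="utf-8")


def demo() -> str:
    """Build a throwaway package on disk and read its data file."""
    with tempfile.TemporaryDirectory() as root:
        pkg = Path(root) / "module_dir_demo"
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "file_in_module_dir").write_text("hello from package data\n")
        sys.path.insert(0, root)
        try:
            importlib.invalidate_caches()
            return read_packaged_text("module_dir_demo", "file_in_module_dir")
        finally:
            # Undo the temporary import so nothing leaks out of the demo.
            sys.path.remove(root)
            sys.modules.pop("module_dir_demo", None)


if __name__ == "__main__":
    print(demo(), end="")
```

In a real project only read_packaged_text would be needed, called with the module directory name used as the key in package_data.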

The data_files specification is designed to refer to files that aren't actually meant to be used from within Python. However it's also incredibly unreliable, as there are lots of ways of installing Python packages, and all of them may install them in different ways. For Nix, you can expect them to exist relative to the Nix store output path. You may use data_files for when you are building things that also involve non-Python code that expect things like man pages or things in the share directory.

Finally, there are files that you expect to exist in the source distribution but not in the final build. You need to specify these files in the MANIFEST.in. These are things like licenses, source documentation, and tests.
