@webknjaz
Last active November 28, 2021 08:17
A few thoughts on enabling reproducible builds with PEP 517

Huh?

I've got an idea to fill the void left by the lack of build requirements provisioning in PEP 517. The problem is that there's currently no first-class support for reproducible builds in the PEP 517 world. People usually set lower bounds on their build backend of choice and that's about it. Even if some set exact pins in their pyproject.toml, that's not enough because those entries may pull in unpinned (transitive) dependencies. One way to work around this with python -m build is to set the PIP_CONSTRAINT environment variable (I haven't actually tested this yet but I expect it to work).

That workaround is suboptimal and leaves users with no answer on how to manage the invocations and the constraints files. I would, of course, go for pip-tools, but that only covers part of the unsolved problem.
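To make the "lower bounds only" problem concrete, here's a minimal sketch that lists which direct build requirements are not exactly pinned. The helper names (`build_requires`, `unpinned`) are made up for illustration, and the pin check is a crude regex rather than a real PEP 508 parser. It also says nothing about transitive dependencies, which is exactly the part a constraints file has to cover.

```python
import re

try:
    import tomllib  # standard library on Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # assumption: tomli is installed on older Pythons

# Crude check: "name==version" (optionally with extras), no wildcard, no ranges.
_EXACT_PIN = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]*\])?\s*==\s*[^*,;\s]+$")


def build_requires(pyproject_text):
    """Return the `build-system.requires` list from pyproject.toml contents."""
    return tomllib.loads(pyproject_text).get("build-system", {}).get("requires", [])


def unpinned(requirements):
    """Return the requirements that are not pinned to one exact version."""
    result = []
    for req in requirements:
        spec = req.split(";", 1)[0].strip()  # ignore environment markers
        if not _EXACT_PIN.match(spec):
            result.append(req)
    return result
```

Feeding it a pyproject.toml that requires "setuptools>=40" and "wheel==0.37.0" would flag only the setuptools entry.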

Oh boi

So here's a list of all the things I could think of:

  • external tools invocation:

    1. the constraints should probably be managed separately and the calls would be

      $ BUILD_REQS=$(\
      > python3 -c '\
      >  from pathlib import Path; \
      > from sys import argv; \
      > from tomli import loads; \
      > print("\n".join(loads(Path(argv[1]).read_text())["build-system"].get("requires", []))) \
      > ' pyproject.toml\
      > )  # extract the dependencies from `pyproject.toml` somehow
      $ echo "${BUILD_REQS}" | python3 -m \
      >   piptools compile --allow-unsafe \
      >   --generate-hashes --strip-extras \
      >   --output-file constraints.txt \
      >   -  # this would need to be checked in Git
      $ PIP_CONSTRAINT=constraints.txt python3 -m build
    2. or maybe we want to fully manage the dependency provisioning, in which case we'd do

      $ python3 -m venv constraints-generation-env
      $ constraints-generation-env/bin/python -m pip install pip-tools tomli
      $ BUILD_REQS=$(\
      > constraints-generation-env/bin/python -c '\
      >  from pathlib import Path; \
      > from sys import argv; \
      > from tomli import loads; \
      > print("\n".join(loads(Path(argv[1]).read_text())["build-system"].get("requires", []))) \
      > ' pyproject.toml\
      > )  # extract the dependencies from `pyproject.toml` somehow
      $ echo "${BUILD_REQS}" | constraints-generation-env/bin/python -m \
      >   piptools compile --allow-unsafe \
      >   --generate-hashes --strip-extras \
      >   --output-file constraints.txt \
      >   -  # this would need to be checked in Git
      $ python3 -m venv build-env
      $ build-env/bin/python -m pip install $(echo ${BUILD_REQS}) -c constraints.txt
      $ PYTHONPATH=build-env/lib/python3.10/site-packages python3 -m build --no-isolation
  • PEP 517 aware tooling

    It's possible to have a custom build backend that wraps another (target) backend. It could maybe just take the constraints and return them as pinned dependencies via get_requires_for_build_wheel() and get_requires_for_build_sdist(), while every other hook would be proxied to the main backend transparently. This would require extra tool-specific pyproject.toml settings pointing at the constraints files (given that it wouldn't solve the lockfile problem, just aid in using them).

  • Multiple lockfiles?

    Letting package maintainers specify requirements that are needed only for building sdists or only for building wheels is not solved by PEP 517, so we won't try to do this either. But unless the whole project is supposed to be built only under one very specific environment, the dependency tree will be different across environments because some of the transitive dependencies may declare their own dependencies as conditional. This calls for more than one constraints file. So it's probably reasonable to allow overriding the lockfile via PEP 517 config settings. And maybe this means that having a setting in pyproject.toml pointing at just one file is pointless? Or does this mean that we need to have a mapping with some env conditionals in the config for pointing at different constraints files?

  • On a different note, what are the use-cases for building projects with non-pinned dependencies? Are there any other than somebody not caring to have reproducible builds?
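The backend-proxy idea from the list above could look roughly like the sketch below. Everything named here (`PinningBackendProxy`, `read_pins`) is hypothetical, and the constraints parsing is deliberately simplistic: it only understands plain `name==version` lines plus the `--hash` continuation lines that pip-compile emits. How a tool would locate the constraints file and the wrapped backend (say, via its own pyproject.toml table) is left out on purpose.

```python
class PinningBackendProxy:
    """Wrap a PEP 517 backend and append pins to its get_requires_* results."""

    def __init__(self, backend, pins):
        self._backend = backend
        self._pins = list(pins)

    def _requires(self, hook_name, config_settings):
        # The get_requires_* hooks are optional in PEP 517, so the wrapped
        # backend may not define them at all.
        hook = getattr(self._backend, hook_name, None)
        upstream = hook(config_settings) if hook is not None else []
        return [*upstream, *self._pins]

    def get_requires_for_build_wheel(self, config_settings=None):
        return self._requires("get_requires_for_build_wheel", config_settings)

    def get_requires_for_build_sdist(self, config_settings=None):
        return self._requires("get_requires_for_build_sdist", config_settings)

    def __getattr__(self, name):
        # Only reached for attributes not defined above, so build_wheel(),
        # build_sdist() and friends go straight to the wrapped backend.
        return getattr(self._backend, name)


def read_pins(constraints_text):
    """Extract `name==version` entries from constraints file contents."""
    pins = []
    for line in constraints_text.splitlines():
        line = line.split("#", 1)[0]         # drop comments
        line = line.split(" --hash", 1)[0]   # drop hashes (inline or continued)
        line = line.rstrip("\\ ").strip()    # drop trailing line continuations
        if line and not line.startswith("-"):
            pins.append(line)
    return pins
```

A project would then point `build-backend` at a module exposing such a proxy object, with the real backend imported inside it. Whether returning pins from the hooks is enough, as opposed to constraining the installer itself, is still an open question here.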

Now what?

I want to experiment with making a tool that would address those pain points. Not sure if it'll be just a CLI tool with subprocess invocations, or a PEP 517 build backend proxy, or 2-in-1, though. Extra idea: play with creating a tox plugin for this.
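For the "CLI tool with subprocess invocations" flavour, a first experiment could be as small as the sketch below. It mirrors the first shell recipe above: feed the build requirements to pip-compile over stdin, then run `python -m build` with pip pointed at the resulting file. The function names are invented for illustration, it assumes pip-tools and build are installed for the interpreter running it, and, like the shell version, I'd treat it as untested.

```python
import os
import subprocess
import sys


def compile_command(output_file):
    """Build the pip-compile invocation that reads requirements from stdin."""
    return [
        sys.executable, "-m", "piptools", "compile",
        "--allow-unsafe", "--generate-hashes", "--strip-extras",
        "--output-file", output_file,
        "-",  # read the requirements from stdin
    ]


def build_environment(base_env, constraints_file):
    """Return a copy of the environment that points pip at the constraints."""
    env = dict(base_env)
    # Absolute path, in case pip gets invoked from a different directory.
    env["PIP_CONSTRAINT"] = os.path.abspath(constraints_file)
    return env


def lock_and_build(requirements, constraints_file="constraints.txt"):
    """Pin the build requirements, then build under those pins."""
    subprocess.run(
        compile_command(constraints_file),
        input="\n".join(requirements) + "\n",
        text=True,
        check=True,
    )
    subprocess.run(
        [sys.executable, "-m", "build"],
        env=build_environment(os.environ, constraints_file),
        check=True,
    )
```

Calling `lock_and_build(["setuptools>=40", "wheel"])` from the project root would regenerate constraints.txt on every run; a real tool would probably split locking and building into separate subcommands so that the lockfile can be checked in Git.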

Package name ideas — help me solve one of the hardest problems in programming

  • prerequisite
  • prerequisites
  • toolchain
  • requirements
  • lock-in
  • consistent-build
  • build-bootstrap
  • predictable
  • consistent
  • build-jail
  • vendor-lock