This is a proposal for a solution to help alleviate some frequent pain points with Python packaging. Mainly:
- Fix bad dependency requirements in metadata
- Fix bad build system in dependencies
- Provide more options per dependency (--pre, --config-settings, --index-url, --find-links, etc.)
This document mostly focuses on the expected user experience, not on how this should be implemented in installers.
This proposition is born from real issue reports and questions.
In most cases, when "pip" is mentioned in this document, one should read "Python package installers" instead.
Main discussion thread:
python -m pip install Application --overrides overrides.toml
where overrides.toml looks like this:
[Library]
requirements = [
"LibraryNightly[cpu]",
]
[LibraryNightly]
index-url = "https://index.internal/simple/"
pre = true
In this scenario, the user wants to install Application.
But if it has a dependency on Library, the user wants pip to install LibraryNightly with the cpu extra instead.
Additionally, the user wants pip to look for LibraryNightly exclusively on an internal index, and to consider pre-releases of LibraryNightly.
Other potential dependencies should be handled as usual, including potential dependencies of LibraryNightly.
If the installer's dependency resolver encounters a dependency whose name is a top-level section title in overrides.toml, then the installer should replace this and any other occurrence of this dependency with the override defined in that section.
Exclude some specific version ranges, similar to pip's constraints.txt:
[Django]
requirements = [
"Django>=4",
]
Skip a library altogether, do not install it at all:
[importlib]
requirements = []
Enforce some extra:
[Something]
requirements = [
"Something[feature]",
]
Use a specific distribution file for a specific library:
[Something]
file-uri = "/path/to/Something.whl"
Install the CPU variant of PyTorch:
[torch]
index-url = "https://download.pytorch.org/whl/cpu"
Install a pre-release of the CPU variant of torchvision:
[torchvision]
pre = true
index-url = "https://download.pytorch.org/whl/nightly/cpu"
Install a nightly variant of TensorFlow:
[tensorflow]
requirements = [
"tf-nightly",
]
Fix build dependencies:
[pysam.build-system]
requires = [
"cython",
"setuptools",
]
Use alternative index for a specific dependency:
[requests]
index-url = "https://repository.internal/simple"
Use --find-links for a specific dependency:
[Something]
find-links = "/path/to/wheelhouse"
Let's assume TOML for now.
We need something that is easy to write for humans and easy to parse for computers.
There is tomllib to read TOML files in Python's standard library (since 3.11).
Suggested file name: overrides.toml.
Format: list of PEP-508 (or subset?) strings
Replace any requirement mentioning this project with the requirements listed here. List can contain the same project but with a different version range or different list of extras. List can also be empty.
Format: URI string
Install the dependency from this URI specifically.
Can be https: or file:.
Format: Path to a source code directory on file system as string
Maybe redundant with file-uri
Format: boolean
Option to install the dependency as editable.
Format: boolean
Format: URL string
Any search and/or download for this project name MUST happen on the specified index.
If a required dependency has a version already installed that satisfies the requirements, then no need to check that it was installed from a specific index.
Regarding index priority, this is not a concern in this proposal. If a required dependency has no version installed that satisfies the requirements, and there is an index override, then only this index shall be considered and no other.
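The index rules above could be expressed roughly as follows. The function name indexes_to_query and its parameters are hypothetical, and project names are matched exactly here for brevity (a real installer would normalize them first).

```python
def indexes_to_query(project, overrides, default_indexes, already_satisfied=False):
    # An installed version that satisfies the requirements needs no index at all,
    # and no check of where it originally came from.
    if already_satisfied:
        return []
    section = overrides.get(project, {})
    # With an index override, only that index is considered and no other.
    if "index-url" in section:
        return [section["index-url"]]
    # Otherwise the installer's usual index configuration applies.
    return list(default_indexes)


overrides = {"requests": {"index-url": "https://repository.internal/simple"}}
defaults = ["https://pypi.org/simple"]

print(indexes_to_query("requests", overrides, defaults))
print(indexes_to_query("flask", overrides, defaults))
```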
What happens in the installer's local cache (download or build)? Potentially there could be artifacts from multiple indexes for the same combination:
- project name
- version string
- distribution type (sdist or wheel)
How do pip and other installers handle this currently?
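To make the concern concrete, here is a small sketch of the collision. It does not describe how pip or any other installer actually keys its cache; both key types and the index URLs are made up for this illustration.

```python
from typing import NamedTuple


class NaiveCacheKey(NamedTuple):
    # Hypothetical key built only from the combination listed above.
    name: str
    version: str
    dist_type: str


class IndexAwareCacheKey(NamedTuple):
    # Hypothetical alternative that also records the index of origin.
    index_url: str
    name: str
    version: str
    dist_type: str


downloads = [
    ("https://pypi.org/simple", "something", "1.2.3", "wheel"),
    ("https://index.internal/simple/", "something", "1.2.3", "wheel"),
]

naive_cache = {}
index_aware_cache = {}
for index_url, name, version, dist_type in downloads:
    artifact = f"artifact downloaded from {index_url}"
    # The second download silently replaces the first under the naive key.
    naive_cache[NaiveCacheKey(name, version, dist_type)] = artifact
    index_aware_cache[IndexAwareCacheKey(index_url, name, version, dist_type)] = artifact

print(len(naive_cache), len(index_aware_cache))
```

With the naive key, an artifact cached from one index could later be served for a project whose override demands another index, which is exactly the situation the question above is asking about.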
Format as defined by:
A list of currently existing issues with Python packaging that could be solved by this suggestion.
Some packagers put upper bound version constraints (upper caps) on the dependencies of their libraries. They have good intentions in mind. But it breaks things, and it is difficult, if not impossible, to un-break. No one can predict the future. Putting upper bound version constraints before knowing if a potential new major update (in the semver meaning) will actually break the usage of the library can be counter-productive.
Further reading:
"dependency conflicts when developing an application, where two dependencies couldn’t be installed due to (unnecessary) upper version bounds on their dependencies. Haven’t had this problem since we switched from poetry
to pdm but what I remember is that you basically had to either wait for the maintainers to release a new version with updated dependencies or (temporarily) fork the dependency and bump the sub-dependency yourself. As an application developer I’d really appreciate an escape hatch that let me override whatever the dependency solver thought was correct and just let me use the version that I tell it to."
Source:
Examples:
Some packagers forget to list a dependency.
For example setuptools or pip, since it is often assumed that those are always available anyway.
Or the other way around: a dependency is wrongly listed and is not actually needed.
For example, listing something that is part of Python's standard library.
Some packages rely on a library that is no longer in Python's standard library but that has a drop-in replacement on PyPI.
Some projects have multiple (git) branches and dev teams try to let pip choose the right branch by putting the branch name in the version string. Maybe it would be better to have the branch name in the project name (this is easy-ish with the right CI/CD process) meaning that each dev should be able to override the actual dependency name for their own local dev environment.
Examples:
In some cases one might want to allow pre-releases for one specific dependency only. Examples:
"It may still happen, but I think the bigger issue is that some config_settings are unlikely to apply to all packages involved in an installation. So you end up needing a syntax for “when installing package X, use this config, otherwise ignore it”."
Source:
Often packagers want to force the users of their library to install dependencies from a specific index. Of course it is not up to the packager of library to enforce the choice of an index on the user doing the actual installation. Actually what they want is to communicate to their users that they should install some dependencies from some specific index. But there is no standard way, so it is a bit difficult to communicate this correctly.
This also covers prevention against dependency confusion attacks (or at least some aspects of it, a full proxy/mirror is the better solution).
Examples:
Unspecified build dependencies.
Is it possible to override the [build-system] of a dependency, or define one when it is missing?
Examples:
- https://stackoverflow.com/q/75372835
- https://stackoverflow.com/q/75409322
- https://stackoverflow.com/q/75583768
Examples:
- Use fork of a dependency for specific CPU architecture
This range of issues can usually be solved by setting up and using an alternate package index server.
For example, something as simple and straightforward as simpleindex can solve quite a lot of such issues.
But for some users, the hurdle to setting this up might be too high.
And if this could be solved directly in the installers (pip, and so on), then I believe the configuration should be done as proposed here.
So I guess, this is something that can be considered optional. This proposition has value with or without the handling of indexes. It could be left out entirely, it could be tackled in a later format update.
If we take the pip case (but I guess the impact would be similar for other package installers) the impact in terms of necessary code changes (without breaking other things) is most likely big.
But it seems to me like the impact in terms of improvement of the developer/user experience could be worth it.
pip (and the larger PyPA ecosystem) works on the principle that all distributions across all indexes for a specific combination of "project name" and "version string" are considered to be the same. It means that if "Something v1.2.3" is found on PyPI as well as on another package index server, then it should be considered that pip will pick a distribution from a random server; there is no concept of priority here. So a rather large conceptual change might be necessary to accommodate this proposition.
- pdm: https://pdm-project.org/latest/usage/config/#override-the-resolved-package-versions
- uv: https://github.com/astral-sh/uv?tab=readme-ov-file#dependency-overrides
- Core Metadata Provides-Dist and Obsoletes-Dist.
- Use of a proxy/mirror index
- Server containing package metadata only, where the metadata can be fixed after the distributions have been uploaded. This is available in conda ecosystem.
Maybe it could be useful if we could have some global settings, so that it would act like a pip.conf that is local to a project.
If we change the initial logic to "everything in the file should be installed" then it could be some kind of superset replacement for requirements.txt.
python -m pip install requirements.toml
This is kind of a last resort tool. Feature set should be small but very powerful.
In most cases, something is rejected because it would imply code complexity in the installer software beyond what seems necessary (complexity outweighs usefulness). If it turns out that the implementation is actually straightforward then the rejection can be reverted.
We could have a list of index server URLs, but it seems pointless for the use cases envisioned here. We should assume that whoever writes the override already knows exactly the one index that should be used for that library.
Use a proxy server instead (such as simpleindex).
Here we were trying to offer the possibility to specify multiple overrides for the same library. Based on the version range, for example:
- If Library has its version string in range <4 then use override A
- but if it is in range >=4 then use override B.
It could have looked something like this:
Library = [
    { condition = "<4", requirements = [] },
    { condition = ">=4", requirements = ["AnotherLibrary"] },
]
It would probably be a real mess for a dependency resolver. How would a dependency resolver be able to handle this?
But the more important question: Is that even necessary?
When we write overrides, we are in a situation where we know what the dependency resolver wants to give us, and we know that it is not what we want, and we know what we want instead.
So there is not much point in adding those conditions, we already know that pip is gonna try to give us Library<4.
There is also the risk of having logical inconsistencies (range overlaps and whatnot).
This would require some logic for conflict resolution in case a configuration exists in multiple files with different values.
Sure, if we could design something that is able to handle this, then maybe we should, but that seems unnecessary complexity.
Maybe we need three file names and/or locations:
- One file that can be pushed to the project's source code repository so that some configuration can be shared by all maintainers/developers. Especially if the project is an application where dependencies should be pinned and locked.
- One file that can be placed in the project's source code directory without pushing into the shared repository (something that will be added to .gitignore) so that this developer can have their own preferences for that particular project.
- One file that can be placed in a user location (typically .config/ on Linux, see XDG and platformdirs) for this user's preferred settings for all their projects.
Instead we want it to be always explicit. pip should not automatically pick up files; the user should always specify them explicitly on the command line. We also rejected the idea of having multiple files anyway, because of conflict resolution.
- https://discuss.python.org/t/adding-a-global-config-to-specify-package-indexes/8599
- https://discuss.python.org/t/dependency-notation-including-the-index-url/5659
- https://discuss.python.org/t/python-packaging-strategy-discussion-part-1/22420/127
- https://classic.yarnpkg.com/lang/en/docs/selective-version-resolutions/
- https://docs.npmjs.com/cli/v8/configuring-npm/package-json#overrides