This document discusses the issues with pip's upgrade behaviour and proposes a resolution for them.
There has been a lot of discussion and half-done work related to modifying pip and its behaviour when upgrading packages. Some of these discussions are spread across various GitHub issues and pull requests, and some are buried in mailing list archives. This document is the result of an attempt to bring all the ideas, proposals and problems discussed in these places together in one place.
It will be helpful to establish a minimum vocabulary to simplify the reading and discussion of this document. This section defines certain terms for the scope of such discussions.
- Upgrade
Install the latest allowed version of a package
- Upgrade Strategy
Logic used to determine which packages should be upgraded.
A few types of upgrade strategy are mentioned repeatedly in this document, so it makes sense to define them explicitly as well.
- Eager
Upgrade dependencies regardless of whether they currently satisfy the new parent version-specification.
- Non-Eager
Upgrade dependencies if and only if they do not satisfy the new parent version-specification.
- Non-Recursive
Ignore dependencies and do not upgrade them.
The phrase "<upgrade strategy type> upgrade" is equivalent to the phrase "upgrade using <upgrade strategy type> upgrade strategy".
Do note that the parent package is always upgraded. This is because that is the intent of running the upgrade on the parent package in the first place.
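The three strategies defined above can be sketched as a small function. This is only an illustration under stated assumptions, not pip's implementation: versions are modelled as plain integers, a version-specification is a predicate on a version, and the name dependencies_to_upgrade is hypothetical. Treating a dependency that is not installed at all as "not satisfying" the specification is also an assumption of this sketch.

```python
from typing import Callable, Dict, List

# Hypothetical toy model: versions are plain integers and a specifier is a
# predicate on a version. None of these names come from pip itself.
Specifier = Callable[[int], bool]


def dependencies_to_upgrade(
    strategy: str,
    new_parent_deps: Dict[str, Specifier],
    installed: Dict[str, int],
) -> List[str]:
    """Return the dependencies that the given strategy would upgrade.

    The parent package itself is always upgraded, so it is not modelled here.
    """
    if strategy == "non-recursive":
        # Dependencies are ignored entirely.
        return []
    if strategy == "eager":
        # Every dependency is upgraded, whether or not it is already satisfied.
        return sorted(new_parent_deps)
    if strategy == "non-eager":
        # Only dependencies that are missing (assumption) or that fail the
        # new parent's specifier are upgraded.
        return sorted(
            name
            for name, is_satisfied_by in new_parent_deps.items()
            if name not in installed or not is_satisfied_by(installed[name])
        )
    raise ValueError(f"unknown upgrade strategy: {strategy!r}")


# Example: the new parent version needs dep_a >= 2 and dep_b >= 1.
new_parent_deps = {"dep_a": lambda v: v >= 2, "dep_b": lambda v: v >= 1}
installed = {"dep_a": 1, "dep_b": 1}

for strategy in ("eager", "non-eager", "non-recursive"):
    print(strategy, dependencies_to_upgrade(strategy, new_parent_deps, installed))
```

In this toy environment only dep_a fails the new specification, so the non-eager strategy touches it alone, while the eager strategy also upgrades the already satisfied dep_b.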
pip currently has only one way to upgrade a package: install --upgrade. This command-option pair follows an eager upgrade strategy [1].
pip handles multiple version-specifiers for a single package in a "first found wins" manner: the first version-specifier is acknowledged and later version-specifiers are ignored [2].
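A "first found wins" policy can be illustrated with a few lines of Python. This is a simplified sketch of the idea rather than pip's actual code: requirements are modelled as (name, specifier) pairs in the order they are encountered, and the function name first_found_wins is hypothetical.

```python
from typing import Dict, Iterable, Tuple


def first_found_wins(requirements: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Keep only the first specifier seen for each package name."""
    chosen: Dict[str, str] = {}
    for name, specifier in requirements:
        # setdefault stores the specifier only if the name is not yet present,
        # so any later specifier for the same package is silently dropped.
        chosen.setdefault(name, specifier)
    return chosen


# "dep" is requested twice with different constraints.
requirements = [("dep", ">=1.0"), ("other", "==2.0"), ("dep", "<1.5")]
resolved = first_found_wins(requirements)
print(resolved)
```

Here the second constraint on dep ("<1.5") never takes part in the decision, which is how a later requirement can end up unsatisfied without any warning.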
The eager strategy followed by install --upgrade also uninstalls a package that currently satisfies the requirement of the new parent package, in order to install the latest version that matches the version-specifications. While this is useful on some occasions, it can cause unexpected behaviour on many others [3].
Eager upgrading may result in unnecessary upgrading of packages [4][5]. Such upgrades may even be undesirable in some cases (e.g. some packages may need extra configuration to work).
This issue is so problematic that various prominent scientific packages lie and don't list their dependencies if they are already installed, to (understandably) avoid a reinstall of such packages (scipy setup.py, statsmodels setup.py).
pip ignoring version-specifiers along with defaulting to eager upgrading can easily result in breaking the user's environment. It is possible that installing/upgrading a package can result in an environment in which a package does not have a correct version of a dependency [6]. Breaking the environment is bad enough. Doing so silently is even worse.
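The kind of breakage described above can be made concrete with a toy consistency check. The sketch below is an assumption-laden illustration, not something pip provides in this form: versions are integers, specifiers are predicates, and find_broken_dependencies, app_a, app_b and lib are all hypothetical names.

```python
from typing import Callable, Dict, List, Tuple

Specifier = Callable[[int], bool]


def find_broken_dependencies(
    installed: Dict[str, int],
    requirements: Dict[str, Dict[str, Specifier]],
) -> List[Tuple[str, str]]:
    """Return (package, dependency) pairs whose requirement is not met."""
    broken = []
    for package, deps in sorted(requirements.items()):
        if package not in installed:
            continue  # requirements of uninstalled packages do not matter
        for dep, is_satisfied_by in sorted(deps.items()):
            if dep not in installed or not is_satisfied_by(installed[dep]):
                broken.append((package, dep))
    return broken


# app_a needs lib < 2, while app_b accepts any lib >= 1.
requirements = {
    "app_a": {"lib": lambda v: v < 2},
    "app_b": {"lib": lambda v: v >= 1},
}

before = {"app_a": 1, "app_b": 1, "lib": 1}
# An eager upgrade of app_b also pulls lib up to its latest version (2 here),
# even though lib 1 already satisfied app_b.
after = {"app_a": 1, "app_b": 2, "lib": 2}

print(find_broken_dependencies(before, requirements))
print(find_broken_dependencies(after, requirements))
```

The environment is consistent before the upgrade and reports app_a's requirement on lib as unmet afterwards; a non-eager upgrade would have left lib alone in this example.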
Even if eager upgrades may be useful in some cases, currently they can cause surprising behaviour in other cases, which is undesirable. Thus, it does not serve as a good default upgrade strategy.
Non-eager upgrades are the most desirable behaviour we can provide without adding a dependency resolver. They behave less surprisingly than eager upgrades. Since they make fewer modifications to the user's environment, there is a reduced risk of breaking it. There is consensus that non-eager upgrades should be the default upgrade strategy [7][8]. The packages that lie about their dependencies can stop lying, as pip would no longer do unnecessary upgrades.
Change the behaviour of pip install to upgrade packages by default, non-eagerly. This behaviour is consistent with how various OS-level package managers work and simplifies the "just install this package" UX to pip install pkg.
pip would also stop providing support for the eager upgrade strategy. This is because the behaviour is highly undesirable and dangerous in nearly all cases. Usefulness of this behaviour should be re-evaluated after adding a proper dependency resolver in pip.
There is a wart in the --target option: it uses --upgrade as a flag to change its behaviour to overwrite, rather than skip, an item when the target location already exists. It is felt that this overwriting behaviour is not very useful and that the functionality can be removed. If it is needed at a future date, it can trivially be added back via a new option.
An unintended change in behaviour resulted from how this was implemented: an item passed as a path or URL would now always be installed. This behaviour had previously been discussed separately, and it was felt that this change would have been made eventually; any effort put into undoing it would later be undone in another patch. Thus this unintended change was not undone and shall be kept.
While this change would be backwards incompatible, it would be made without a deprecation cycle. However, to prevent immediate breakage of existing scripts, the --upgrade flag would not be removed immediately and would instead become a no-op in the major version of pip that ships this change.
It was proposed to change the behaviour of the "--upgrade" switch of pip install to be non-eager. While it caused minimal changes to the UX, it was rejected in favour of providing a behaviour consistent with existing package managers.
It was originally proposed to add an "upgrade" command that moves the upgrade-a-package UX to pip upgrade and defaults to non-eager upgrades. The current install --upgrade switch would be deprecated. While this was agreed upon by pip's developers at one point, the proposed upgrade command would have been largely similar to the existing install command, which seemed like unnecessary baggage. Moreover, the UX would possibly have been made unnecessarily complex by the addition of such a command.
It had been expressed that an opt-in to eager upgrades would be potentially useful in some cases [9]. On further discussion, it was felt that such behaviour need not be provided initially due to its environment-breaking nature, and that it may be revisited at a future date once pip gets a dependency resolver. Since it would be easier to add this behaviour later than to remove it, it would be fine not to provide it in the first iteration.
It was proposed to add a new flag as a replacement for --upgrade, meant for use with --target. For consistency, this flag would have had some effect (not replacing or modifying existing installations) even if --target was not passed. This was rejected, as neither the behaviour of --target <dir> --upgrade nor that of the new non-target no-replace installations was seen as important enough.
It was proposed to go through a deprecation cycle to ease the shift from the older eager upgrades to the new non-eager upgrades. It was felt that a deprecation period would be as cumbersome as a single major-version-release switch of behaviour. To save time, the backwards-incompatible changes shall be shipped without a deprecation period in the next major version of pip.
The assertion that "This behaviour is consistent with how various OS-level package managers work" needs citation. As far as I am aware, both apt-get install and dnf/yum install won't upgrade already installed packages just because a new version happens to be available - they have separate commands for that, while "install" will only potentially force an upgrade or downgrade when the current system state doesn't satisfy the installation request.