title: Python packaging is better than you think
created at: Mon Jun 20 2022 17:48:50 GMT+0000 (Coordinated Universal Time)
updated at: Mon Jun 20 2022 17:48:58 GMT+0000 (Coordinated Universal Time)
Alternative titles: "Stop saying Python packaging is terrible", "Python packaging for the 99 %"
Proof that there's an audience for this: https://twitter.com/juanluisback/status/1538936104824492033
Unbundle what people really mean when they say that "Python packaging is bad":
- bootstrapping Python for development
  - OS-specific
    - surprisingly, more difficult on Linux, since there are too many options and Python is also a core part of the system
  - hard only because there is no canonical method and the docs are bad
  - problem solved by Anaconda
  - pyenv works too; it is more intrusive and Linux-specific, but offers a wider range of Python versions
- diagnosing packaging problems
  - a real mess, because bootstrapping is hard and people therefore end up with chaotic Python installations
  - takes skill, but some simple tricks help: `which python` (tells you where the interpreter comes from), `which pip`, `python -m pip` to make sure pip matches the interpreter, and `import sys; print(sys.prefix)` to be really sure
- installing system-wide binaries based on Python
  - use pipx or fades and forget about it
  - avoid the system Python like the plague
  - you could use environments for this, but you'd have to remember to activate them, which is not very convenient: avoid it if you don't need it!
- managing environments
  - absolutely not OS-specific once the bootstrapping is done
  - only 2 kinds of environments exist:
    - conda environments, managed by conda
    - Python environments, managed by stdlib venv, pyenv, virtualenv, or PEP 582
- dealing with non-Python dependencies
  - Python's native solution for non-Python dependencies is bundling shared libraries inside wheels. mostly works!
  - however, wheels can be quite fat (TensorFlow, PyTorch), lack specificity (GPU vs non-GPU builds, etc.), be unavailable for certain packages (RAPIDS), or lead to incompatibilities (Cartopy & rasterio)
  - conda solves this, and it is unlikely that pip will ever solve it for the general case. use conda, it's fine!
- declaring environment dependencies
  - unlike Node.js, Python cannot install/import several versions of the same package in the same environment
  - that might be a good thing though! security patches are applied uniformly. too long to discuss here
  - but of course this leads to conflicts, which must be handled somehow
  - libraries doing weird things with dependencies is not Python's fault (upper version pins, for example, are now frowned upon)
  - pip solves dependencies these days! even though its backtracking is often not verbose enough for good diagnosis
  - mamba is a blazing-fast replacement for conda
- installing environment dependencies
  - conda, pip, poetry, pdm work fine, and there are probably others
  - but there's lots of outdated advice: Pipenv is largely dead
  - conda and pip don't interoperate very well, so they need to be combined with care
  - pip-tools and poetry are currently lagging behind in standards adoption and bug fixing, but they are excellent projects and will get there in time
- publishing packages
  - nowadays most needs are solved by a PEP 621 `pyproject.toml`
  - you can use setuptools, flit, hatch, or pdm and your metadata will look 90 % the same
  - a separate tool (i.e. twine) is needed for publishing, but is that really so bad?
- hot-reloading
  - editable installations are now standardized, so this is not a problem for the majority
  - unless you're using Meson, like SciPy does, in which case there's still no good solution
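To illustrate the "metadata will look 90 % the same" point, here is a minimal hypothetical PEP 621 `pyproject.toml` (the package name, version, and dependency are made up); swapping setuptools, flit, hatchling, or pdm as the backend mostly only changes the `[build-system]` table:

```toml
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "my-package"
version = "0.1.0"
description = "An example package"
requires-python = ">=3.8"
dependencies = ["requests>=2.0"]
```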
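The diagnosing tricks above can be collected into a small script. This is only a sketch using the stdlib, so it works in any installation with no extra packages:

```python
import sys

# Where is the interpreter actually running from?
# This answers the same question as `which python`.
print("executable:", sys.executable)

# sys.prefix points at the active environment;
# sys.base_prefix points at the base interpreter it was created from.
print("prefix:     ", sys.prefix)
print("base prefix:", sys.base_prefix)

# If the two prefixes differ, you are inside a virtual environment.
in_venv = sys.prefix != sys.base_prefix
print("inside a virtual environment:", in_venv)
```

Running this with the interpreter you think you are using (and with `python -m pip --version` alongside it) usually pinpoints a chaotic installation quickly.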
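As a sketch of how cheap a Python environment is to create, the stdlib `venv` module can build one programmatically (the directory name here is arbitrary); this is essentially what `python -m venv .venv` does:

```python
import sys
import tempfile
import venv
from pathlib import Path

# Create a throwaway virtual environment in a temporary directory.
# with_pip=False keeps it fast; pass with_pip=True for a usable env.
env_dir = Path(tempfile.mkdtemp()) / "demo-env"
venv.EnvBuilder(with_pip=False).create(env_dir)

# The environment gets its own interpreter and its own site-packages,
# isolated from whatever Python created it.
bin_dir = "Scripts" if sys.platform == "win32" else "bin"
print(sorted(p.name for p in (env_dir / bin_dir).iterdir()))
print("config file exists:", (env_dir / "pyvenv.cfg").exists())
```

The `pyvenv.cfg` file inside the environment records which base interpreter it came from, which is also handy when diagnosing a mystery environment.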
Looks good! A few thoughts while reading:
pyenv is Unix specific, not Linux. Also it's a fork of rbenv - the Ruby version is very standard, while the Python version is more of a "one of many" choice. Biggest issue with it is the shim mechanism is terrible and breaks python discovery for most tools (tox, nox, CMake, etc). I still use it though to grab a specific version of Python and just deal with the fact that nox "fails" on all the versions it can't actually access because they are shims. There are some projects out there for simpler binary distribution of Python that might help in the near future.
Strong second on pipx. That also solves the issue of using separate tools (`twine`, `build`): just use `pipx run twine` or `pixx run build`. Then they are never more than a week old and you don't have to pre-install anything. I use `pipx run` for pretty much everything that's not in homebrew and used daily.

For wheels, I'd say it's fine to use them unless you need something specific (like heavy data science work needing GPU PyTorch or something), then it's fine to use Conda. You don't need conda just for simple compiled dependencies; many libraries ship cibuildwheel-built wheels these days. I'd recommend at least mentioning cibuildwheel, as it's been huge in simplifying compiled wheel building. That reminds me: compiling your own code with conda is usually painful. The `compilers` conda-forge package really helps, but if you don't include it, you often mix system and conda compilers and segfault. pybind11 gets an issue about once a week on that, not counting Gitter, I think.

I'd mention hatch in the list of environment tools; it doesn't provide environment locking yet, but it does provide multiple environments (think nox/tox), which pdm/poetry do not. I'd probably call the PEP 517 backend "hatchling" as well, just to differentiate the two parts.
Is pip-tools lagging? At least for pip-compile, there's no standard lock file format, so I'm not sure there's a standard there to lag behind. And Poetry plans to support PEP 621 in Poetry 2.0 (but they've talked about 1.2 for years, and are over a year past its first alpha and it's still not out, so I'd expect 2.0 to be very far off).
I like the mention of no capping, obviously. :)
Scikit-build is likely to be rewritten like Meson is. But we still won't have a solution for live reloading. There's also a project to make extension building something pluggable into PEP 517 builders (`extensionlib` by @ofek is a start on that project).

PS: Not sure if all or any of the above needs to be in the article series, just pointing out things to make sure you know them; then you can and should decide on the perfect level of detail to include. :) I usually put too much. ;)
Couple of followups for @zooba's notes:
PEP 582 is used by PDM by default (it can be disabled), and I think I've seen it as an option in at least one other tool. It's a shame if it's completely dead, as manually enabling it as the default in Python and using it with PDM is really pleasant. Maybe a good replacement would be to mention the Python Launcher for Unix? It has native support for `.venv`, which is also great and fills a similar purpose.

Agree, Pipenv is maintained by the maintainer of PDM (@frostming), and is very much not dead. It's not the "one and only correct way to manage environments" either, which is where it went wrong for a while.
I second highlighting that the system Python is used for making system packages. It's also "okay" to use the system Python for venvs if you are happy with the version (I don't know if it's great practice, but it is easy and works without risking breaking anything).