This document lays out a set of Python packaging practices. I don't claim they are best practices, but they fit my needs, and might fit yours.
This document has been superseded as of January 2023.
This was written in July 2020, superseding this gist from 2019.
As of this writing, Python 3.5 is approaching its end-of-life and many packages have already set a minimum version of Python 3.6. This document should be superseded or disregarded no later than the Python 3.7 end-of-life. If you cite this as a justification for your behavior, please stop doing so at that time.
- For versioning, use versioneer
- For describing your build requirements, use pyproject.toml
- For all static (and some dynamic-ish) metadata, use setup.cfg
- A very small setup.py for dynamic metadata and to tie everything together
- Use MANIFEST.in for files that cannot otherwise be included in the sdist
- Updates to how to include data files in packages.
  - Reduce scope of MANIFEST.in
  - Do not use include_package_data = True in setup.cfg
- Dropped tests_require from setup.cfg
  - python setup.py test was deprecated in setuptools 41.5
- Make pyproject.toml more clearly optional; some use cases are made harder by it.
- Versioneer is starting to show its age, and has not accepted a change since 2017. There are alternatives that are worth considering, but I haven't evaluated them yet.
My position in the Python ecosystem will color my perspective and approach to packaging, and, presumably, how much weight you give what I think.
I am most active with the NIPY collection of projects, generally related to neuroimaging and neuroscience. I am currently the lead maintainer of NiBabel and do a reasonable amount of maintenance work for Nipype, PyBIDS, fMRIPrep and a few other packages closely related to the aforementioned.
I am not an active developer in CPython, PyPA or any packaging-related tools. I have not followed deep arguments, but have relied mostly on PEPs, documentation and sporadic searches to identify the current state of the art.
So my perspective is less concerned with what packaging should become, and more with what works today and where things appear to be heading, as I look to prepare or update packaging infrastructure for several tools. Infrastructure I hope to stop thinking about so much.
Motivating my recommendations are a few desiderata, in rough order of importance:
1. Installation should work, from source, on fairly old systems. Debian Stable (10; "buster") is my touchstone here.
2. Prefer declarative syntax, and limit dynamic metadata, as much as possible.
3. Enable revision-based versions, with minimal opportunity for error.
4. Limit custom code to absolute minimum. (Partially redundant with limiting dynamic metadata.)
To operationalize (1), the following approaches should all install correctly:
pip install .
python setup.py sdist && pip install dist/*.tar.gz
python setup.py bdist_wheel && pip install dist/*.whl
python setup.py install
And development/editable mode should work:
python setup.py develop
pip install -e .
This also means that newer, better build systems that do not rely on setuptools are not really under consideration here.
To operationalize (3), all of the above should produce an install with the same version string, and setting the version should be done from a version control tag if possible.
Assuming a git repository, the following should also work:
git archive -o archive.tar.gz $TAG && pip install archive.tar.gz
I recommend a setuptools-based approach, using setup.cfg to declare as much of the metadata as possible, along with an OPTIONAL pyproject.toml laid out in PEP 518.
Versioneer is used to handle versioning.
A bare minimum pyproject.toml is as follows:
[build-system]
requires = ["setuptools >= 30.3.0", "wheel"]
Additional build dependencies such as cython and numpy might be put here.
I would relax this to a mere suggestion this year. While it's mostly been fine, I have seen a case where pip install -e --user . fails, so it's not as consequence-free as I thought.
As of setuptools 30.3.0, most packaging metadata can be set declaratively in setup.cfg.
The following skeleton can be used as a model.
[metadata]
url = https://github.com/your/package
author = You
author_email = your@email.tld
maintainer = You
maintainer_email = your@email.tld
description = A package
long_description = file:README.rst
long_description_content_type = text/x-rst; charset=UTF-8
license = GPL
classifiers =
    Programming Language :: Python
[options]
python_requires = >= 3.6
install_requires =
packages = find:
[options.package_data]
* =
    data/*
[options.extras_require]
doc =
    sphinx
test =
    pytest
    coverage
all =
    %(doc)s
    %(test)s
I recommend against using the include_package_data option, which counterintuitively overrides the package_data options with the directives in MANIFEST.in.
I want to draw attention to the python_requires metadata, which will prevent pip from attempting to install on incompatible systems. When you drop 3.5, or any other version, update python_requires to avoid breaking downstream tools that still install on unsupported versions.
In addition to plain key-value pairs, there are some constrained options for common dynamic metadata. For example, long_description = file:<filename> allows you to place a long description in a separate file, to be included in your documentation. packages = find: replaces the find_packages() option often used in setup.py.
Finally, interpolated strings are used in extras_require to provide a meta-extra like all.
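The interpolation used for the all extra can be checked without running a build. The following is a minimal sketch, assuming setup.cfg is parsed with the standard library's configparser and its default %(name)s interpolation; the SETUP_CFG string and the read_extras helper are hypothetical names used only for illustration.

```python
import configparser

# Assumption: a trimmed copy of the extras section from the skeleton above.
SETUP_CFG = """
[options.extras_require]
doc =
    sphinx
test =
    pytest
    coverage
all =
    %(doc)s
    %(test)s
"""


def read_extras(text):
    """Return {extra_name: [requirement, ...]} with interpolation applied."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["options.extras_require"]
    extras = {}
    for name, value in section.items():
        # Multi-line values come back as one string; drop the blank lines
        # left by the empty first line and by the interpolated values.
        extras[name] = [line.strip() for line in value.splitlines() if line.strip()]
    return extras


extras = read_extras(SETUP_CFG)
print(extras["all"])
```

Because all is built from the other two entries, adding a requirement to doc or test updates it without further edits.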
I recommend not placing the version in setup.cfg.
The dynamic components of my package setup are as follows:
#!/usr/bin/env python
import sys
from setuptools import setup
import versioneer
# Give setuptools a hint to complain if it's too old a version
# 30.3.0 allows us to put most metadata in setup.cfg
# Should match pyproject.toml
SETUP_REQUIRES = ['setuptools >= 30.3.0']
# This enables setuptools to install wheel on-the-fly
SETUP_REQUIRES += ['wheel'] if 'bdist_wheel' in sys.argv else []
if __name__ == '__main__':
    setup(name='package',
          version=versioneer.get_version(),
          cmdclass=versioneer.get_cmdclass(),
          setup_requires=SETUP_REQUIRES,
          )
I place the package name in setup.py mostly because, without this, GitHub will not recognize your package to place it in its dependency graphs.
By using versioneer in setup.py as opposed to adding version = attr:package.__version__ to the setup.cfg, we avoid the issue of missing import-time dependencies. versioneer.get_cmdclass() tells setuptools how to encode the current version into various installation methods.
Finally, setup_requires is mostly here as a fall-back to let old versions of setuptools provide a user-readable explanation for failures.
Versioneer will set the version based on your git tag, and handle all of the install cases I described in desiderata.
This requires an additional section in your setup.cfg:
[versioneer]
VCS = git
style = pep440
versionfile_source = package/_version.py
versionfile_build = package/_version.py
tag_prefix =
parentdir_prefix =
It can then be installed from your repository root with:
pip install versioneer
versioneer install
Once done, it places a copy of itself in your repository root, so other users do not need to install it for it to be used correctly.
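To give a sense of what style = pep440 produces, here is a minimal sketch of the TAG[+DISTANCE.gHEX[.dirty]] rendering that Versioneer documents for that style. It is not Versioneer's own code: the render_pep440 function and its regular expression are written here for illustration, it assumes input in the form printed by git describe --tags --long --dirty, and it leaves out the untagged case.

```python
import re

# Assumption: input looks like "1.2.0-3-gabc1234" or "1.2.0-0-gabc1234-dirty",
# i.e. TAG-DISTANCE-gHEX with an optional "-dirty" suffix.
_DESCRIBE = re.compile(
    r"^(?P<tag>.+)-(?P<distance>\d+)-g(?P<sha>[0-9a-f]+)(?P<dirty>-dirty)?$"
)


def render_pep440(describe):
    """Render TAG[+DISTANCE.gHEX[.dirty]] from a `git describe --long` string."""
    match = _DESCRIBE.match(describe)
    if match is None:
        raise ValueError("unrecognized describe output: %r" % describe)
    version = match.group("tag")
    distance = int(match.group("distance"))
    dirty = match.group("dirty") is not None
    if distance or dirty:
        # Start a local version segment, or extend one the tag already has.
        version += "." if "+" in version else "+"
        version += "%d.g%s" % (distance, match.group("sha"))
        if dirty:
            version += ".dirty"
    return version


print(render_pep440("1.2.0-0-gabc1234"))
print(render_pep440("1.2.0-3-gabc1234"))
```

A clean checkout of a tag renders as the bare tag, while any commits past the tag or uncommitted changes add a local version segment, which is what makes untagged installs distinguishable from releases.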
N.B. Versioneer does not work out of the box with git archives for non-tag releases. If you need any archived revision, this will not be sufficient. I don't know of a general solution to that problem at this point, as git archive substitution is quite limited.
This was probably the most confusing thing to nail down, so I want to lay it out clearly.
The package_data metadata determines what data files inside your package directory will follow your Python files into their install location. Which is the same as saying these files will be packaged in a wheel, as that is (more-or-less) unzipped into your site-packages/ directory.
MANIFEST.in determines what data files are included in your sdist on top of what is included in your package_data. Use it to include anything outside your package directory that you want included in source. Note that there are some defaults that you don't need to specify.
DO NOT use include_package_data = True. That will change the rules of how this all works to something even less intuitive.
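One way to confirm which rule picked up a given file is to list what actually landed in each built archive. The following is a minimal sketch using only the standard library; list_archive_files is a hypothetical helper, and it assumes the wheel is a zip file and the sdist is a tar archive.

```python
import tarfile
import zipfile


def list_archive_files(path):
    """Return the sorted file names stored in a wheel (.whl) or sdist (.tar.gz)."""
    if zipfile.is_zipfile(path):
        # Wheels are zip files; their contents mirror the installed layout.
        with zipfile.ZipFile(path) as archive:
            return sorted(archive.namelist())
    # Otherwise assume a (possibly compressed) tar archive, as for an sdist.
    with tarfile.open(path) as archive:
        return sorted(m.name for m in archive.getmembers() if m.isfile())
```

Files matched by package_data should appear in both listings, while files added only through MANIFEST.in should appear in the sdist listing alone. For example, list_archive_files("dist/package-1.0.tar.gz") could be compared against the listing for the matching wheel; both paths are placeholders.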
The minimum setuptools version needed for setup.cfg to work is 30.3.0, although more fields have been defined since then, and the minimum pip needed for PEP 518 compatibility is 10.0.0.
As noted above, setuptools was the only system under serious consideration, simply because it has long been the standard to run python setup.py. Until pip 10+ is universal, alternative build systems will create headaches that I don't want to deal with.
CentOS 7, for instance, still packages pip 7.1 and setuptools 0.9.8, which means the above will not work out of the box (though this may be changing... I'm having a hard time reading pkgs.org). However, sticking with setuptools and setup_requires ensures that a user will at least be told to upgrade setuptools.
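The version floors above can be checked with a short comparison. The following is a minimal sketch that only handles purely numeric dotted versions such as those quoted here; parse_release and meets_minimum are hypothetical helpers, and anything with pre-release or local segments would need a real version parser such as the one in the packaging library.

```python
def parse_release(version):
    """Turn a purely numeric dotted version such as '30.3.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    installed_parts = parse_release(installed)
    minimum_parts = parse_release(minimum)
    # Pad the shorter tuple with zeros so that '7.1' compares like '7.1.0'.
    width = max(len(installed_parts), len(minimum_parts))
    installed_parts += (0,) * (width - len(installed_parts))
    minimum_parts += (0,) * (width - len(minimum_parts))
    return installed_parts >= minimum_parts


# The CentOS 7 versions mentioned above, checked against the stated minimums.
print(meets_minimum("0.9.8", "30.3.0"))  # setuptools
print(meets_minimum("7.1", "10.0.0"))    # pip
```

Both checks fail for the CentOS 7 versions, which matches the observation that the setup will not work there out of the box.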
To the extent copyright can be claimed, I disclaim it under CC0.
Thanks for writing these great guidelines!
Regarding this one:
We have been using setuptools_scm to manage versioning in a few software packages.
The main issue that we have encountered in our workflow, which also involves frequent uploads to Test PyPI, is that the version strings need to be monkeyed with to be compliant, which we do in our setup file.
Happy to hear your thoughts about this!