I am no Virgil, but having stumbled my way through Python packaging a few times already, I'm documenting the things I learn as I go here.
To help me learn, I took a survey of the top 15 Python packages on GitHub along with 12 other commonly referenced packages. I thought... if there are any best practices out there, surely these well-maintained packages must be using them. What I found was that it's very much a zoo. You have to find similar packages and average out what their setup.py scripts do.
A condensed version of what setup.py --help spits out. You should at least include these:
Information display options (just display information, ignore any commands)
--name print package name
--version (-V) print package version
--author print the author's name
--author-email print the author's email address
--url print the URL for this package
--license print the license of the package
--description print the package description
--long-description print the long package description
--classifiers print the list of classifiers
By far, the most common thing to do was to copy in the README (12 did this). But how? The example in the official docs (I can't find the link right now) does it with a with block:
with open('README.rst', 'r') as f:
    long_description = f.read()
Most just go for the one-liner: long_description=open('README.rst', 'r').read()
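Both forms assume setup.py is run from the project root and that the README decodes with the platform's default encoding. Here is a minimal sketch that avoids both assumptions. The helper name read_long_description is made up for illustration, and UTF-8 is my assumption about how the README is saved:

```python
import io
import os


def read_long_description(filename="README.rst", base_dir=None):
    """Return the README text, resolved relative to base_dir.

    base_dir defaults to the current directory; in a real setup.py you
    would probably pass os.path.dirname(os.path.abspath(__file__)).
    """
    if base_dir is None:
        base_dir = os.getcwd()
    path = os.path.join(base_dir, filename)
    # io.open accepts an encoding on both Python 2 and 3, and the with
    # block closes the file even if read() raises.
    with io.open(path, encoding="utf-8") as f:
        return f.read()
```

The result would then be passed along as long_description=read_long_description().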
- I don't know
- I don't care
Well... we can do better than that. How about just inspecting their docstrings?
distutils (10 used this):
distutils
The main package for the Python Module Distribution Utilities. Normally
used from a setup script as
from distutils.core import setup
setup (...)
setuptools (16 used this, 6 of those fell back on distutils):
Extensions to the 'distutils' for large or complex distributions
Ah, so setuptools is more complicated than distutils. What does it give us?
>>> setuptools.find_packages?
Type: function
String Form:<function find_packages at 0xa408b8c>
File: /home/crccheck/env/tt-dev/local/lib/python2.7/site-packages/distribute-0.6.27-py2.7.egg/setuptools/__init__.py
Definition: setuptools.find_packages(where='.', exclude=())
Docstring:
Return a list all Python packages found within directory 'where'
'where' should be supplied as a "cross-platform" (i.e. URL-style) path; it
will be converted to the appropriate local path syntax. 'exclude' is a
sequence of package names to exclude; '*' can be used as a wildcard in the
names, such that 'foo.*' will exclude all subpackages of 'foo' (but not
'foo' itself).
So if you have a simple project, just enumerate its packages explicitly in a list (9 used this):
...
packages=['flask', 'flask.ext', 'flask.testsuite'],
...
But if you have a complicated package, you may elect to use find_packages (8 used this):
...
packages=find_packages(exclude=('tests', 'example')),
...
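To get a feel for what find_packages is doing, here is a rough stdlib-only approximation. It is not the setuptools implementation, just a sketch of the behavior the docstring describes, and the name find_packages_sketch is my own. As far as I understand it, excluding 'tests' does not exclude 'tests.unit', which is why you often see both 'tests' and 'tests.*' in the exclude list.

```python
import os
from fnmatch import fnmatchcase


def find_packages_sketch(where=".", exclude=()):
    """Approximate setuptools.find_packages using only the stdlib."""
    found = []
    for root, dirs, _files in os.walk(where):
        # Only descend into directories that look like packages:
        # no dot in the name and an __init__.py inside.
        dirs[:] = sorted(
            d for d in dirs
            if "." not in d
            and os.path.isfile(os.path.join(root, d, "__init__.py"))
        )
        for d in dirs:
            rel = os.path.relpath(os.path.join(root, d), where)
            name = rel.replace(os.sep, ".")
            # An excluded package is left out of the result, but its
            # subpackages are still visited and matched on their own names.
            if not any(fnmatchcase(name, pattern) for pattern in exclude):
                found.append(name)
    return found
```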
On the opposite end of the spectrum, if you only have one .py file, you should just use py_modules instead of packages, package_data, and include_package_data.
For larger, more complicated packages, if you need extra files included, you can use package_data to make sure setup.py installs them, and a [MANIFEST](http://docs.python.org/release/1.6/dist/manifest.html) to make sure the package builder picks them up. Don't use either of those. There are shortcuts:
To make a MANIFEST, you can put a set of rules into a MANIFEST.in file. If you set include_package_data=True, it will be as if you put all of those into package_data. I think this is setuptools only.
TL;DR: package_data is for suckers. Warning: It will grab pyc/pyo files if you let it!
Survey results:
- Used package_data: 6 (only 1 uses a helper)
- Used include_package_data: 7
http://packages.python.org/distribute/setuptools.html#including-data-files
http://docs.python.org/2/distutils/sourcedist.html#manifest-template
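Since only one of the surveyed packages used a helper for package_data, here is a rough sketch of what such a helper might look like. The name collect_package_data and the list of skipped extensions are my assumptions, not taken from any of the surveyed packages. It walks a package directory and returns paths relative to it, skipping Python sources and compiled .pyc/.pyo files.

```python
import os


def collect_package_data(package_dir, skip_ext=(".py", ".pyc", ".pyo")):
    """Return data file paths relative to package_dir, for package_data."""
    paths = []
    for root, dirs, files in os.walk(package_dir):
        # Never descend into bytecode cache directories.
        dirs[:] = [d for d in dirs if d != "__pycache__"]
        for name in files:
            if os.path.splitext(name)[1] in skip_ext:
                continue
            rel = os.path.relpath(os.path.join(root, name), package_dir)
            # package_data paths are written with forward slashes.
            paths.append(rel.replace(os.sep, "/"))
    return sorted(paths)
```

With a hypothetical package directory called mypkg, that would be used as package_data={'mypkg': collect_package_data('mypkg')}.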
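Some packages won't let you install based on your version of Python. Instead of doing special logic in your setup.py, you should use the Programming Language :: Python classifiers. Keep in mind that classifiers are only advisory metadata; they will not stop anyone from installing on an unsupported version.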
But just in case you feel like you should, here are two I found:
if not hasattr(sys, 'version_info') or sys.version_info < (2, 6, 0, 'final'):
    raise SystemExit("Restkit requires Python 2.6 or later.")
if sys.version_info < (2,5):
    raise NotImplementedError("Sorry, you need at least Python 2.5 or Python 3.x to use bottle.")
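If you do go the explicit-check route, both snippets boil down to the same pattern: compare a version tuple and bail out with a readable message. A small sketch of that pattern follows; require_python is a made-up helper name, not something from either project, and the current parameter exists only so the check can be exercised without switching interpreters.

```python
import sys


def require_python(minimum, name, current=None):
    """Exit with a readable message if the interpreter is older than minimum."""
    if current is None:
        current = tuple(sys.version_info[:3])
    # Tuples compare element by element, so (2, 5, 4) < (2, 6) is True.
    if tuple(current) < tuple(minimum):
        wanted = ".".join(str(part) for part in minimum)
        raise SystemExit("%s requires Python %s or later." % (name, wanted))
```

Calling require_python((2, 6), "Restkit") near the top of setup.py would reproduce the first check.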
- Just putting the version in (9 did this):
  setup( ... version="1.2.3" ... )
- From reading __version__ of the package (9 did this):
  import bottle
  version=bottle.__version__
Extra fancy:
import os
# imp is deprecated on Python 3 and was removed in 3.12
from imp import load_source
version = load_source("version", os.path.join("restkit", "version.py"))
setup( ... version=version.__version__, ... )
- Sometimes I see __version__ derived from a tuple (2 did this):
  __version_info__ = (1, 2, 3)
  __version__ = '.'.join(map(str, __version_info__))
  But I haven't seen a justification for adding all that extra complexity. I guess the pattern is out there because core Python exposes sys.version_info as a named tuple. Example: https://github.com/benoitc/restkit/blob/master/restkit/version.py
- Just had to be different:
Django does it REALLY weird. Its setup.py:
# Dynamically calculate the version based on django.VERSION.
version = __import__('django').get_version()
setup(
    name = "Django",
    version = version,
    ...
)
django/__init__.py:
VERSION = (1, 6, 0, 'alpha', 0)
def get_version(*args, **kwargs):
    # Don't litter django/__init__.py with all the get_version stuff.
    # Only import if it's actually called.
    from django.utils.version import get_version
    return get_version(*args, **kwargs)
django/utils/version.py:
All to get this:
$ python setup.py --version
1.6.dev20130227014508
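Importing the package from setup.py only works if the package's own imports succeed at install time. One alternative, shown here as a sketch and not something I counted in the survey, is to read __version__ out of the file as plain text. This assumes the version is assigned as a simple string literal on a single line, so it will not work with the tuple-join pattern. The function name and the regex are illustrative.

```python
import io
import re

# Matches a whole line such as: __version__ = '1.2.3'
_VERSION_RE = re.compile(
    r"""^__version__\s*=\s*['"]([^'"]+)['"]\s*$""", re.MULTILINE
)


def read_version(path):
    """Extract __version__ from a Python source file without importing it."""
    with io.open(path, encoding="utf-8") as f:
        match = _VERSION_RE.search(f.read())
    if match is None:
        raise RuntimeError("No __version__ string found in %s" % path)
    return match.group(1)
```

In setup.py that would look something like version=read_version(os.path.join("restkit", "version.py")), assuming the file really does contain a literal version string.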
Do I have to use reStructuredText? I kinda hate it.
Yes.
But I already wrote everything in Markdown.
Try pandoc. It's not perfect, but it will get you 90% of the way there.
$ pandoc -o README.rst README.md
setup.py is complaining about a missing README or README.txt
warning: sdist: standard file not found: should have one of README, README.txt
Just ignore it, everyone else does.
Classifiers: use 'em. Use as many of them as you can. You are now an SEO expert.
See http://pypi.python.org/pypi?:action=list_classifiers
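If you support several Python versions, the Programming Language :: Python entries get repetitive. Here is a tiny sketch for generating them. The helper python_classifiers is made up for illustration, and it does no validation, so anything it produces should still be checked against the official list.

```python
def python_classifiers(versions):
    """Build 'Programming Language :: Python' classifiers for the given versions."""
    prefix = "Programming Language :: Python"
    classifiers = [prefix]
    # One entry per major version, e.g. ':: 2' and ':: 3'.
    for major in sorted({v.split(".")[0] for v in versions}):
        classifiers.append("%s :: %s" % (prefix, major))
    # One entry per specific minor version, e.g. ':: 2.7'.
    for version in versions:
        if "." in version:
            classifiers.append("%s :: %s" % (prefix, version))
    return classifiers
```

The returned list can be concatenated with your other classifiers (license, development status, and so on) before handing it to setup.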
- sdist -n (dry-run)
- sdist
- sdist upload
- build (I don't actually use this, but I like to run it once in a while because it makes me feel like I'm doing something.)
- clean --all
Make sure you can run setup.py in an empty virtualenv. I have a virtualenv just for diagnosing setup.py that only has:
- ipython: Because ipython everywhere
- lice: To help maintain license info
- restview: To preview restructured text
Start simple, stay simple. 95% of setup.py scripts could be written in under 100 lines, and most of those should probably be under 50 lines. After all, you're just putting a dozen keyword arguments into the setup function.
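To make the "dozen keyword arguments" point concrete, here is a sketch of the arguments for a small single-package project. Every value is a placeholder I invented (example-package, the author, the URL, the MIT license); only the keyword names are real setup() arguments. It builds the arguments as a dict so they can be inspected without actually running setup.

```python
def build_setup_kwargs(long_description):
    """Return keyword arguments for a small, single-package setup() call."""
    return dict(
        name="example-package",  # hypothetical project name
        version="0.1.0",
        description="One-line summary of what the package does.",
        long_description=long_description,
        author="Your Name",
        author_email="you@example.com",
        url="https://example.com/example-package",
        license="MIT",
        packages=["example_package"],
        include_package_data=True,
        classifiers=[
            "Programming Language :: Python",
        ],
    )


# In a real setup.py this would end with something like:
#     from setuptools import setup
#     setup(**build_setup_kwargs(open("README.rst").read()))
```

That is eleven arguments and well under 50 lines, which is about as much as most projects need.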
I tried to ignore it, but when I run pip install it cannot find a README file and fails. WTF. So if everyone else is ignoring this, why can't I? I'm converting README.md with pandoc and reading the result into long_description, but it seems I somehow also have to supply a README file for pip to be happy.
Oh, and I have a setup.cfg that tells setup about README.md, but it doesn't do anything. So far my experience with PyPI and trying to release a package has been a frustrating mess.