@crccheck
Last active March 2, 2024 16:39
Python Packaging

What the Hell? -- A Journey Through the Nine Circles of Python Packaging

Writing a setup.py

I am no Virgil, but having stumbled my way through Python packaging a few times already, I'm documenting the things I learn as I go here.

To help me learn, I took a survey of the top 15 Python packages on GitHub along with 12 other commonly referenced packages. I thought: if there are any best practices out there, surely these well-maintained packages must be using them. What I found was that it's very much a zoo. You have to find similar packages and average out what their setup.py scripts do.

Survey Results

Information You Should Include

A condensed version of what setup.py --help spits out. You should at least include these:

Information display options (just display information, ignore any commands)
  --name              print package name
  --version (-V)      print package version
  --author            print the author's name
  --author-email      print the author's email address
  --url               print the URL for this package
  --license           print the license of the package
  --description       print the package description
  --long-description  print the long package description
  --classifiers       print the list of classifiers

long_description

By far, the most common thing to do was to copy in the README (12 did this). But how? The official docs' example (I can't find the link right now) does it with a with statement:

with open('README.rst', 'r') as f:
    long_description = f.read()

Most just go for the one-liner: long_description=open('README.rst', 'r').read().
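
The one-liner resolves README.rst against the current working directory and leaves the file to be closed by the garbage collector. A minimal sketch of a helper that avoids both, assuming a hypothetical function name read_file and parameter base_dir (neither comes from the surveyed packages):

```python
import io
import os


def read_file(filename, base_dir=None):
    """Return the text of `filename`, resolved relative to `base_dir`.

    `read_file` and `base_dir` are hypothetical names for this sketch.
    In a real setup.py you would typically pass
    base_dir=os.path.dirname(os.path.abspath(__file__)).
    """
    if base_dir is None:
        base_dir = os.getcwd()
    path = os.path.join(base_dir, filename)
    # io.open behaves the same on Python 2 and 3 and lets us pin the encoding.
    with io.open(path, encoding="utf-8") as f:
        return f.read()
```

In setup() this would be used as long_description=read_file('README.rst', here), where here is the directory containing setup.py.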

Difference Between Setuptools and Distutils

  1. I don't know
  2. I don't care

Well... we can do better than that. How about just inspecting their docstrings?

distutils (10 used this):

distutils

The main package for the Python Module Distribution Utilities.  Normally
used from a setup script as

   from distutils.core import setup

   setup (...)

setuptools (16 used this, 6 of those fell back on distutils):

Extensions to the 'distutils' for large or complex distributions

Ah, so setuptools is more complicated than distutils. What does it give us?

Setuptools Has find_packages

>>> setuptools.find_packages?
Type:       function
String Form:<function find_packages at 0xa408b8c>
File:       /home/crccheck/env/tt-dev/local/lib/python2.7/site-packages/distribute-0.6.27-py2.7.egg/setuptools/__init__.py
Definition: setuptools.find_packages(where='.', exclude=())
Docstring:
Return a list all Python packages found within directory 'where'

'where' should be supplied as a "cross-platform" (i.e. URL-style) path; it
will be converted to the appropriate local path syntax.  'exclude' is a
sequence of package names to exclude; '*' can be used as a wildcard in the
names, such that 'foo.*' will exclude all subpackages of 'foo' (but not
'foo' itself).

So if you have a simple package, just enumerate them explicitly in a list (9 used this):

...
packages=['flask', 'flask.ext', 'flask.testsuite'],
...

But if you have a complicated package, you may elect to use find_packages (8 used this):

...
packages=find_packages(exclude=('tests', 'example')),
...
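
To make the exclude rules from the docstring concrete, here is a simplified, stdlib-only stand-in for find_packages. It is a sketch written for this document under the assumption that a package is any directory containing __init__.py; find_packages_sketch is a hypothetical name, not the setuptools implementation.

```python
import os
from fnmatch import fnmatchcase


def find_packages_sketch(where=".", exclude=()):
    """Simplified stand-in for setuptools.find_packages (not the real code).

    A directory counts as a package if it contains __init__.py; dotted names
    matching any pattern in `exclude` are left out of the result.
    """
    found = []
    stack = [(where, "")]
    while stack:
        directory, prefix = stack.pop()
        for name in sorted(os.listdir(directory)):
            path = os.path.join(directory, name)
            if "." in name or not os.path.isdir(path):
                continue
            if not os.path.isfile(os.path.join(path, "__init__.py")):
                continue
            dotted = prefix + name
            # Keep descending even when this name is excluded: patterns are
            # matched against each dotted name on its own.
            stack.append((path, dotted + "."))
            if not any(fnmatchcase(dotted, pat) for pat in exclude):
                found.append(dotted)
    return sorted(found)
```

In this sketch, excluding 'tests' drops only the top-level tests package; its subpackages need a separate 'tests.*' pattern, which is the mirror image of the docstring's note that 'foo.*' does not exclude 'foo' itself.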

On the opposite end of the spectrum, if you only have one .py file, you should just use py_modules instead of packages, package_data, and include_package_data.

MANIFEST.in

For larger, more complicated packages, if you need extra files included, you can list them in package_data to make sure setup.py installs them, and make sure the package builder picks them up using a MANIFEST (http://docs.python.org/release/1.6/dist/manifest.html). Don't use either of those directly. There are shortcuts:

To make a MANIFEST, you can put a set of rules into a MANIFEST.in file. If you set include_package_data=True, it will be as if you put all of those into package_data. I think this is setuptools only.

TL;DR: package_data is for suckers. Warning: It will grab pyc/pyo files if you let it!

Survey results:

  • Used package_data: 6 (only 1 uses a helper)
  • Used include_package_data: 7
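
For anyone who sticks with package_data, a helper of the kind counted in the survey might look roughly like the sketch below. The name find_package_data and its skip_extensions parameter are assumptions made for illustration; the point is only that compiled bytecode gets filtered out rather than shipped.

```python
import os


def find_package_data(package_dir, skip_extensions=(".py", ".pyc", ".pyo")):
    """Return paths of data files under `package_dir`, relative to it.

    Hypothetical helper for this sketch: Python sources and compiled
    bytecode are skipped so they never end up in package_data.
    """
    data_files = []
    for root, dirs, files in os.walk(package_dir):
        # Skip bytecode cache directories entirely.
        dirs[:] = [d for d in dirs if d != "__pycache__"]
        for filename in files:
            if filename.endswith(skip_extensions):
                continue
            full_path = os.path.join(root, filename)
            data_files.append(os.path.relpath(full_path, package_dir))
    return sorted(data_files)
```

It would be passed to setup() as package_data={'mypkg': find_package_data('mypkg')}, with mypkg standing in for your package directory.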

http://packages.python.org/distribute/setuptools.html#including-data-files

http://docs.python.org/2/distutils/sourcedist.html#manifest-template

Checking for Minimum Python Version (don't do this)

Some packages won't let you install based on your version of Python. Instead of doing special logic in your setup.py, you should use the Programming Language :: Python classifiers. But just in case you feel like you should, here are two I found:

restkit

if not hasattr(sys, 'version_info') or sys.version_info < (2, 6, 0, 'final'):
    raise SystemExit("Restkit requires Python 2.6 or later.")

bottle

if sys.version_info < (2,5):
    raise NotImplementedError("Sorry, you need at least Python 2.5 or Python 3.x to use bottle.")
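
Both checks work because Python compares tuples element by element. If you keep such a guard despite the advice above, it can be isolated in a small function. This is only a sketch; require_python and its parameters are hypothetical names, not something restkit or bottle provide.

```python
import sys


def require_python(minimum, current=None, name="this package"):
    """Raise SystemExit if `current` is older than `minimum`.

    Tuples compare element by element, so (2, 5, 4) < (2, 6) is True.
    `require_python` and its parameters are hypothetical names.
    """
    if current is None:
        current = tuple(sys.version_info[:3])
    if tuple(current) < tuple(minimum):
        wanted = ".".join(str(part) for part in minimum)
        raise SystemExit("%s requires Python %s or later." % (name, wanted))
```

Calling require_python((2, 6), name="Restkit") at the top of setup.py would reproduce the restkit behavior.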

Grabbing Version

  1. Just putting the version in (9 did this):

     setup(
         ...
         version="1.2.3"
         ...
     )
    
  2. From reading __version__ of the package (9 did this):

     import bottle
     version=bottle.__version__
    

Extra fancy:

    import os
    from imp import load_source

    # Load restkit/version.py directly instead of importing the restkit package.
    version = load_source("version", os.path.join("restkit", "version.py"))
    version = version.__version__

  3. Sometimes I see the version kept as a tuple (2 did this):

     __version_info__ = (1, 2, 3)
     __version__ = '.'.join(map(str, __version_info__))
    

But I haven't seen a justification for adding all that extra complexity. I guess the pattern mirrors sys.version_info, which core Python exposes as a named tuple. Example: https://github.com/benoitc/restkit/blob/master/restkit/version.py

  4. Just had to be different:

Django does it REALLY weird. Its setup.py:

    # Dynamically calculate the version based on django.VERSION.
    version = __import__('django').get_version()

    setup(
        name = "Django",
        version = version,
        ...
        )

django/__init__.py:

    VERSION = (1, 6, 0, 'alpha', 0)

    def get_version(*args, **kwargs):
        # Don't litter django/__init__.py with all the get_version stuff.
        # Only import if it's actually called.
        from django.utils.version import get_version
        return get_version(*args, **kwargs)

django/utils/version.py:

All to get this:

    $ python setup.py --version
    1.6.dev20130227014508
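
The "extra fancy" load_source approach above depends on the imp module, which was deprecated in Python 3.4 and removed in Python 3.12. A minimal sketch of the same idea using importlib, assuming a hypothetical helper name load_version and a version file that defines __version__:

```python
import importlib.util


def load_version(path):
    """Execute the file at `path` as a module and return its __version__.

    `load_version` is a hypothetical helper; only the named file runs,
    so the package's own imports are never triggered.
    """
    spec = importlib.util.spec_from_file_location("_version", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.__version__
```

In setup.py this would read version=load_version(os.path.join("restkit", "version.py")). It works equally well when the file builds __version__ from a __version_info__ tuple, as in the restkit example.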

reStructuredText

Do I have to use it? I kinda hate it.

Yes.

But I already wrote everything in Markdown.

Try pandoc. It's not perfect, but it will get you 90% of the way there.

$ pandoc -o README.rst README.md

setup.py is complaining about a missing README or README.txt

warning: sdist: standard file not found: should have one of README, README.txt

Just ignore it, everyone else does.

Classifiers

Use 'em. Use as many of them as you can. You are now an SEO expert.

See http://pypi.python.org/pypi?:action=list_classifiers

Running setup.py

  • sdist -n (dry-run)
  • sdist
  • sdist upload
  • build (I don't actually use this, but I like to run it once in a while because it makes me feel like I'm doing something.)
  • clean --all

Make sure you can run setup.py in an empty virtualenv. I have a virtualenv just for diagnosing setup.py that only has:

  • ipython: Because ipython everywhere
  • lice: To help maintain license info
  • restview: To preview reStructuredText

Keep It Simple, Stupid

Start simple, stay simple. The setup.py for 95% of Python packages could be written in under 100 lines, and most of those should probably be under 50 lines. After all, you're just putting a dozen keyword arguments into the setup function.
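
As a rough illustration of how little is needed, the sketch below collects the fields from the "Information You Should Include" list into one dictionary. Every value is a placeholder and build_setup_kwargs is a hypothetical name; it assumes a single-module project, so it uses py_modules rather than packages.

```python
def build_setup_kwargs(long_description=""):
    """Return keyword arguments for setup(); every value is a placeholder."""
    return dict(
        name="example-package",            # hypothetical project name
        version="0.1.0",
        author="Jane Doe",                 # placeholder
        author_email="jane@example.com",   # placeholder
        url="https://example.com/example-package",
        license="MIT",
        description="One-line summary of the package.",
        long_description=long_description,
        py_modules=["example"],            # single-file project
        classifiers=[
            "Development Status :: 3 - Alpha",
            "License :: OSI Approved :: MIT License",
            "Programming Language :: Python",
        ],
    )


# In a real setup.py this would end with:
#     from setuptools import setup
#     setup(**build_setup_kwargs(open("README.rst").read()))
```

Keeping the arguments in a plain function like this is a design choice for the example only, so the values can be inspected without running any setup.py command.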

ADraginda commented Dec 6, 2017

Newer versions of setuptools don't care if you use .md and will happily include it in your dist (tested with 38.2.4).
