Draft about Python infrastructure problems

Goals and Metrics

Before evaluating different tools and approaches, it’s important to make explicit the objectives and the measures of success.

Most importantly, a program needs to be correct enough to be useful. The “enough” aspect of this usefulness is the reliability of the program. A non-trivial program will typically have infinitely many possible results for a similarly infinite set of inputs. The relationship between the subset of inputs for which the program is correct and the set of plausible inputs is captured by robustness. It is common in computer science to treat non-terminating programs as erroneous. In practice, the longer a program takes to run, the more likely it is to be terminated forcefully before it produces any useful results, i.e. it is forced to be incorrect. In other words, the resource footprint of the program captures yet another way of obtaining incorrect results.

An important subset of robustness is the ability of the program to accept inputs drawn from large disjoint subsets, typically referred to as “platforms”; in other words, cross-platform support.

Some programs are meant to run indefinitely. They don’t produce results; rather, they have side effects that occur as a result of arriving at a particular internal state or of an interaction with external inputs. While their side effects can be interpreted as results, in practice such programs tend to work in cycles; the ability to maintain a steady resource footprint across cycles is a measure of the program’s performance.

A related aspect of a program’s execution (not necessarily an infinite one) is its ability to scale its resource footprint with the size of its input. This is often also called performance, or, more specifically, scalability.

Yet another aspect of performance is the ability of the program to utilize all available resources. This often includes the ability to execute in parallel, the ability to use secondary storage if primary storage fills up, etc.

Finally, but very importantly, the human ability to assess correctness and to find acceptable inputs is the ergonomics of the program.

Historical Perspective

In the early days of Python, the appeal of the language was its concise syntax and prototyping speed, especially enhanced by the ability to develop interactively and the shortened write-test cycle. This was contrasted with the red tape common in corporate environments, which stifled the process and created unwieldy, “pompous” programs.

The “art” of Python was in creating short and unsophisticated programs which may lack in robustness, while maximising ergonomics. This was also the fate of Python packaging.

The early Python community consisted of curious programmers wanting to escape the bureaucracy of corporate boredom. These were typically seasoned programmers who valued substance over form, functionality over safety, ability to modify a program over rich feature set.

This led to the early infrastructure tools being developed “on the back of an envelope”, with a minimal feature set and complete disregard for safety. After all, their authors expected their users to be their peers: programmers able to identify relevant safety risks, modify programs to add or remove features, etc.

This is how distutils was created. For the very little code this module contained, it was able to do a lot. But it couldn’t compete with tools like Maven.

Eternal October

The explosion of Python’s popularity diluted its community with newcomers. The first wave of newcomers were the “lazy” programmers: likely just as experienced as the founding community, but focused on their ability to capitalize on predecessors’ work rather than on putting in their own effort. This community wanted more features. And this is how setuptools happened. It was a tool created by members of the initial community for the novices. In order to reduce the effort of writing a new library and to maximise ergonomics by retaining the old interface, the library was hacked on top of distutils.

It was never intended to be as robust or as feature-rich as Maven since its authors didn’t value these qualities. It was a means of keeping the community going.

Similarly, at the same time and in a very similar way, the community created easy_install, distribute, pkg_resources, etc.

Suffering from Success

As time passed, the “lazy” fraction of the growing community completely replaced the original developers. Now the “lazy” programmers became the authority deciding the fate of the language and its tools. This brought the realization that the existing tools still required too much work to use comfortably, while providing too few features. The community started to fill up with “clueless” programmers who not only didn’t want to modify their tools, but were afraid of doing so. This was about the time when Python started being used as the language for teaching introductory computer science, in bootcamps, etc.

The “lazy” programmers saw various reasons to invest in infrastructure, but were prone to falling back on “easy” approaches, incrementalism, and fear of big changes. The last substantial change was pip, which promised a lot but, due to an incrementalist strategy and the 80/20 rule, kept under-delivering, and, unfortunately, still does.

The “lazy” programmers who saw an opportunity to capitalize on Python’s success while doing very little meaningful work created PyPA, which is the body that today decides how Python infrastructure should work.

The Schism

Disappointed with the slow development of Python infrastructure, which, instead of robustness, seemed only to offer a lot of flakiness (a kind of poor correctness where the program doesn’t always produce the same results for the same inputs), a group of “smart” and motivated Python programmers decided to split off and, possibly, make some money by providing better infrastructure for Python.

This is how Anaconda Python was born. Unfortunately, while trying to fix every problem pip has, conda, very ambitiously, promised way too much compared to what the programmers behind it could possibly deliver. The corporate nature of the project and the lack of engagement with the community at the design stage of the products offered by Continuum.io gave rise to “feature creep”: chasing every customer while catching none.

A Feast in Time of Plague

The various missteps mentioned above led to the tragic situation we are facing at present. While some lead developers from Continuum.io sit on PyPA’s board, the conda and pip styles of Python infrastructure mostly ignore each other. conda will never work with packages that pip understands; although, caving to popular demand, it implemented some interop with pip, the interop is but skin-deep and has no chance of getting better.

The pip community, on the other hand, keeps creating more and more ironically “generic” tools which are incompatible with everything in the conda community. Even more ironically, PyPA set out on a crusade against setuptools while providing substitutes which use setuptools under the hood.

Even more ironically, the “innovations” by PyPA, while completely meaningless, are hailed by the “clueless” wave of Python developers as huge improvements. The “clueless” Python programmers are the ones who absolutely need hands-off tools that ensure uniformity of approach, sacrificing every other desirable aspect of software to the ability to replace programmers working on a project at minimal cost.

Review of Available Tools and Ways to Compensate for their Shortcomings

conda, conda-build, meta.yaml

These tools ignore the pip world and everything that happens in PyPA’s meetings. They will never deal with Eggs or Wheels, nor will they ever respect requirements.txt, pyproject.toml or setup.cfg. Their developers have no incentive to do so.
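For context, conda-build is driven by a recipe file, meta.yaml, rather than any of the files above. A minimal sketch, assuming a locally checked-out project (the package name, version and dependencies are hypothetical):

```yaml
# meta.yaml -- minimal conda-build recipe sketch (hypothetical package)
package:
  name: example-pkg        # hypothetical name
  version: "0.1.0"

source:
  path: ..                 # build from the local source tree

requirements:
  host:
    - python
    - pip
    - setuptools
  run:
    - python
    - numpy                # hypothetical runtime dependency

build:
  script: python -m pip install . --no-deps -vv

test:
  imports:
    - example_pkg
```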

conda-build, however, has very poor community support: practically no documentation, and developer support is virtually non-existent. From their bug tracker it looks like they’re overwhelmed with bug reports, and any attempt to fix one generates more problems than it solves. All conda tools suffer from a huge resource footprint, oftentimes making builds prohibitively expensive (e.g. a machine running conda-build may run out of memory, or the build may be killed because the CI job times out). Cross-platform support leaves a lot to be desired. In particular, there is no cross-compilation support, and MS Windows builds must use the cmd shell.

conda-build also has a somewhat peculiar take on C/C++ compilation and, specifically, on CMake. This makes the tool hard to work with for developers who want to expose native extension modules to Python.

While it’s relatively easy to create package archives compatible with conda without using any of the conda tools, the lack of any formal documentation of the conda package format, and of any guarantees from Continuum.io as to the future of that format, makes rolling your own a dubious prospect.

So far, I’ve been grudgingly using conda-build when I have to create conda packages, but in a new project I will likely consider the possibility of writing a very simplified substitute in the interest of saving resources and reducing the flakiness of my builds.

pipenv, Pipfile

pipenv deserves a specific mention in this context. It was a huge PR attempt by the author of another very popular Python package: requests. Similar to requests, which is an alleged ergonomics wrapper around urllib3, pipenv was an attempted ergonomics wrapper around pip and virtualenv. The project pushed for early PyPA adoption and promotion, but eventually lost its association with PyPA.

This project was trying to bring in concepts from the Ruby world, such as “frozen requirements”. This comes from the observation that there is usually a difference between the exact set of packages against which the developer programmed when creating the program and the intended set of packages with which the program needs to work.
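For illustration, pipenv splits these two sets across two files: Pipfile records the intended, loose constraints, while the generated Pipfile.lock pins the exact, “frozen” versions. A minimal Pipfile sketch (the package names and constraints are hypothetical):

```toml
# Pipfile -- intended, loose constraints (hypothetical packages)
[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

[packages]
requests = ">=2.20"

[dev-packages]
pytest = "*"

# Running `pipenv lock` then writes Pipfile.lock, a JSON file that pins
# exact versions and sha256 hashes for every package and sub-dependency.
```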

While there are those who believe that “frozen requirements” confer some benefit to project reliability (because they naturally reduce the surface for compatibility errors), I see this as “cheating”, because quality software should aspire to work with all supported versions as much as possible. Programming against a multitude of versions forces fewer assumptions in interfaces and prevents hacks based on knowledge of internal functionality. Overall, it should create programs with a longer shelf life.

In conclusion: I see no value in Pipfiles and I don’t anticipate a PEP any time soon codifying this practice.

pip, requirements.txt

Before pip even existed it was advertised as an unmistakable improvement over setuptools. This advertisement was based not on real ability but on the promise of one; a promise that, for the most part, was never fulfilled. pip promised to be a package manager, but for a long time it wouldn’t even qualify as a package installer due to its inability to properly resolve dependencies. Starting with version 20.2, pip may be used as a package installer, but it is rarely used as such, in particular due to the default treatment of source packages, which are seen as valid candidates for distribution package installation.

Another promise pip made but never implemented was the ability to remove installed packages. While the command is there, the removal is fundamentally broken. This is tied to the fact that pip is not a package manager and is unaware of the dependencies in the environment where packages are being installed. Use of pip will thus often lead to the creation of inconsistent environments (i.e. environments that have conflicting dependencies installed simultaneously).

As mentioned above, pip is useless in the context of Anaconda Python.

It is, unfortunately, very common to use requirements.txt as a way of specifying project dependencies. This approach has the following drawbacks (illustrated with an example after the list):

  • Because the file is usually created by running pip freeze, it risks recording an inconsistent environment.
  • The resulting file needs manual editing to make it cross-platform, or to allow greater flexibility in the versions of required packages.
  • The file typically lists both the root dependencies and all their sub-dependencies. This makes it hard for the developer to understand which dependencies are actually referenced in the project and which were installed, perhaps by chance, as one possible choice of many.
  • The file allows repetitions with undefined results.
  • The file allows specification of unrelated command-line options for pip.
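As a hypothetical example, compare a raw pip freeze dump with a curated, cross-platform requirements file (the package names are real, but the versions and the project’s actual needs are made up):

```text
# requirements.txt produced by `pip freeze`: every sub-dependency pinned,
# no indication of what the project actually imports
certifi==2023.5.7
charset-normalizer==3.1.0
idna==3.4
requests==2.30.0
urllib3==2.0.2

# A curated alternative: only direct dependencies, with flexible versions
# and an environment marker for platform-specific needs
requests>=2.28,<3
pywin32>=305 ; sys_platform == "win32"
```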

Another noteworthy aspect of pip’s functionality, briefly mentioned above, is that by default pip will install source packages (i.e. tarballs) by executing the setup.py found in the tarball. This is both a security hazard and will typically result in the installation of the toolchain necessary to build the package, something the user may have had no intention of doing.

This lackadaisical attitude to what’s being installed on users’ machines led plenty of developers to assume that it’s OK to distribute source code only. Worse yet, a lot of packages come bundled with their test suites, documentation, code examples, build toolchains and so on, all installed unbeknownst to the user.

The hesitation about the contents of binary distributions and their poor support in the initial stages led to other unfortunate effects. For instance, while it is valid to distribute a package with byte-compiled Python files only, the common practice is to package both the source and the byte-compiled versions into the same archive.

Similarly, the scripts and the data directories from binary distributions are installed by default, which both creates bloat on users’ machines and may result in unanticipated conflicts, since pip will not verify whether two packages provide the same script.

On top of this, by default, pip doesn’t check checksums (hashes), which makes installations susceptible to transmission errors.

In conclusion: one should always keep in mind the drawbacks of using pip, and, as much as possible, minimize or completely remove all use of this tool in production environments. If pip must be used, pass --only-binary :all: to avoid building from source, --require-hashes when installing from a requirements file, and set --exists-action to abort so that installation fails instead of silently creating inconsistent environments.
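A sketch of what such a hardened invocation might look like (the requirements file and its contents are hypothetical):

```sh
# Install from wheels only, verify hashes, and abort on pre-existing paths
pip install \
    --only-binary :all: \
    --require-hashes \
    --exists-action a \
    -r requirements.txt

# --require-hashes expects each requirement to be pinned with a hash, e.g.:
#   requests==2.30.0 --hash=sha256:<hex digest>
```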

On top of this, if you cannot avoid requirements.txt altogether, try to avoid using PEP 508 URLs in it, particularly in production environments, as this side-steps whatever minimal auditing packages receive when uploaded to the public package index.

Poetry

I don’t have personal experience working with this tool. My understanding is that it is trying to be a substitute for pip when installing packages while also being a build tool. Unfortunately, compared to setuptools, Poetry seems to lack support for building native modules, which makes it a no-go for plenty of popular packages.

pyproject.toml

This file (introduced in PEP 518 and extended with project metadata in PEP 621) is an iteration on setup.cfg; however, both are often used together. There was never a real need for this PEP, as the perceived benefits of having this file lie in the future, when someone writes a tool to actually use this format. The idea was to have a configuration file describing how certain common interactions with a project should happen. In particular, this format is supposed to describe the “project’s dependencies” as well as the “build system”. However, neither is generic enough to accommodate actual alternative tools for installing dependencies or for building distribution packages. These are modeled on the corresponding settings from setuptools, but will never work with tools like conda or conda-build.
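For reference, a typical pyproject.toml using the setuptools backend might look like the sketch below (the project name and dependencies are hypothetical):

```toml
# pyproject.toml -- build backend (PEP 518) plus project metadata (PEP 621)
[build-system]
requires = ["setuptools>=61", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "example-project"           # hypothetical
version = "0.1.0"
dependencies = ["requests>=2.28"]  # hypothetical

[project.optional-dependencies]
test = ["pytest"]
```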

In light of the above, “project dependencies” deserve special treatment. While it’s easy to understand what package dependencies are, projects have many different needs: a special set of dependencies for editor integration, for testing, linting and other means of quality control, for packaging, for integration with other popular tools. There is no truly exhaustive list of what these may be. The format of pyproject.toml was not designed to accommodate this diversity. In its senseless battle against setuptools, PyPA engineered this format specifically to create an alternative configuration for setuptools, which is available under the [build-system] section.

Later, PyPA realized that different tools may want different dependency specifications, but they never distinguished these sets by purpose; instead, they decided to associate them with specific tools, e.g. [tool.poetry.dependencies]. Thus, in an attempt to create a generic format that could be exchanged between different tools, PyPA created a format that essentially contains settings for individual tools, making it no better than simply configuring the desired tools using their original configuration.

Later still, they realized that there’s no reason for Poetry and setuptools to use the same settings, so the contents under [build-system] are not prescribed by the PEP; but no real cooperation happens: if a developer configures pyproject.toml to work with Poetry, it’s unusable with setuptools, and conversely, projects configured for setuptools are not usable with Poetry.

More commonly, this file is used to record configurations for the different tools used by project developers. This is a mixed blessing, because translation of a complex configuration between the original tool’s format and TOML isn’t always straightforward, or even possible. This usually results in a reduced degree of control the developer has over the tools they use, while simultaneously increasing the number of indirections necessary to use a tool.
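A sketch of such per-tool sections, assuming pytest, black and Poetry happen to be in use (the specific values are hypothetical):

```toml
# pyproject.toml -- per-tool configuration sections
[tool.pytest.ini_options]
testpaths = ["tests"]        # hypothetical layout

[tool.black]
line-length = 100            # hypothetical preference

[tool.poetry.dependencies]   # only meaningful to Poetry; ignored by setuptools
python = "^3.9"
requests = "^2.28"
```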

In conclusion: I believe that pyproject.toml has no real value. There is no real need for developers working on a project to use multiple equivalent build toolchains, linters, etc. Thus, making this information generic is very hard on one hand, and completely unnecessary on the other.

Still, this format cannot bridge the gap between radically different, yet very popular, build tools, which makes developers who need to bridge that gap search for alternative solutions.

Old reliable: setuptools, setup.py, setup.cfg

Very early in its history, the setuptools developers understood that they’d made a mistake. pip was one attempt to rectify it; setuptools2 was another. The developers tried to kill this project through neglect and abandonment, scaring users with deprecation warnings. Yet even conda relies on this project to function. The code quality of this project is abysmal. Resource footprint and utilization are laughably bad. The documentation and feature design will make a grown man cry. Oh, and there’s also the version and commit history. Writing setup.py such that it can work with multiple versions of setuptools is a particularly masochistic exercise.
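As a sketch of what that exercise tends to look like (the fallbacks shown are a common pattern, not a recommendation; the package name and dependency are hypothetical):

```python
# setup.py -- guarding against differences between setuptools versions
# and environments where only distutils is available
try:
    from setuptools import setup, find_packages
    extra_kwargs = {
        # install_requires is a setuptools-only keyword; distutils rejects it
        "install_requires": ["requests>=2.28"],  # hypothetical dependency
        "packages": find_packages(),
    }
except ImportError:  # very old environments: fall back to distutils
    from distutils.core import setup
    extra_kwargs = {"packages": ["example_project"]}  # hypothetical package

setup(
    name="example-project",  # hypothetical
    version="0.1.0",
    **extra_kwargs,
)
```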

And yet it’s indispensable. No Python programmer has yet risen to the task of replacing setuptools. And, under PyPA’s leadership and given the current state of the Python community, my estimate is that such a programmer does not and will not exist.

Below is a non-exhaustive list of some of the bad ideas implemented in this tool.

  • The install command is wrong. Instead of building and then installing a package, it does some machination to copy files around, skipping the most important part of actually building the package. This results in inconsistencies between the environment developers work in and the environment in which users will install the built package (see the sketch after this list).
  • The develop command (a.k.a. pip install -e .) should have never existed. This is an even bigger hack than what install does. Instead of copying the source code (which is somewhat reminiscent of the distribution package), an egg-link file is created in platlib that links the source directory of the package being developed into platlib. This has countless footguns related to path resolution, location of shared libraries, accidental (dis-)inclusion of parts of the package in the distribution package, etc.
  • The build command begs a question: so what does bdist_wheel (or any other bdist command) do? Doesn’t it build a Wheel?
  • Inclusion of non-Python files in the source distribution and in the binary distribution uses different configuration settings and, subsequently, will work differently depending on whether you use the install command or install the built package. (Ha-ha, gotcha!)
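A sketch of the distinction from the first item, assuming a project with a setup.py in the current directory:

```sh
# What `setup.py install` skips: build a distribution package first,
# then install exactly that artifact
python setup.py bdist_wheel   # produces dist/<name>-<version>-*.whl
pip install dist/*.whl        # install the built package

# versus the shortcut that copies files straight into site-packages
python setup.py install
```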

And yet, despite all the warts and ugliness, I contend that this is the only tool on the market today that delivers the goods. The most important part is that it retained some of the qualities the early Python programmers put into it: if you know how, you can override and change everything. If things don’t work your way, you can make them.

A special note on setup.cfg: it’s not worth using. It was one of PyPA’s early (and evidently unsuccessful) attempts at killing setuptools by introducing what they thought would be a declarative way to configure project dependencies. The lack of understanding of what that may entail, combined with the lack of imagination about how other tools meant to be configured by this format would want to be configured, soon became apparent, and, to my knowledge, there are presently no plans to continue work in this direction.

However, before the poor cooperation between setup.cfg and the other tools commonly invoked from setup.py (such as test runners, linters, etc.) became apparent, an initial attempt was made to deny setuptools the support of those other tools (which would usually come in the form of those tools implementing setuptools commands). Notably, pytest and, very recently, Sphinx became victims of this shortsighted policy.

The suggested way is to… still use setuptools, but now, instead of accessing the necessary commands from setup.py, one needs to install mediator packages that connect pyproject.toml with those tools. This will, of course, unnecessarily increase the number of indirections and hinder debugging, while making repeated automated operations involving such tools flakier due to the astounding rate of change in the versions and feature sets of these tools.

In conclusion: while setuptools is a minefield, it’s the only option that can be made to work well. It is flexible enough that it’s possible to achieve a good resource footprint and utilization; with enough care it can be made reliable; and, of course, robustness is a function of how much the developer is willing to invest into their tools.

And while, in order to hit all those goals, one needs to replace virtually every part of setuptools, like the ship of Theseus it will sail on where others sink or run aground.
