@FooBarWidget
Last active June 15, 2016 09:40

The challenges of packaging Passenger for Debian and Ubuntu

Phusion provides Debian and Ubuntu packages for the Passenger application server. Debian and Ubuntu already package Passenger, but unfortunately their packages are constantly out of date. So we publish our own packages and we keep them up-to-date with our source releases.

Debian/Ubuntu packages -- we'll just call them "Debian packages" from now on, even if you're on Ubuntu -- are easy to install and a pleasure to use. But making and maintaining them is hard work. We strive to release new packages every time a new Debian or Ubuntu version comes out. But sometimes it is harder than expected, resulting in delays.

Why is maintaining Debian packages so hard? In this article we will explain how we maintain our Debian packages for Passenger. This article may also be valuable to you if you have ever thought about making your own Debian packages for your own software.

You can expect the following content in this article:

  • An explanation of the core problems. Why do we need to publish our own packages in the first place? Why can't distributions and third party packagers help us?
  • Packaging challenges. What sort of issues make packaging Passenger hard and/or a lot of work?
  • The Passenger Debian packaging automation system. In order to address the packaging challenges, we developed an automation system to help us. What does this system look like, and what are its features and limitations?
  • Manual procedures. Even with the help of an automation system, what kind of work do we have to do on a regular basis to keep our packages up-to-date?

Table of contents

  • The core problems
    • Distributions update conservatively
    • Third-party packagers introduce lag
    • Packages are typically made manually
    • Passenger releases often
    • Passenger Enterprise cannot be packaged by third parties
    • Striving for high packaging quality
    • Solution: automate all the things
  • Packaging challenges
    • Debian packaging has a steep learning curve
    • Sheer number of distribution versions
      • Difficulty of targeting multiple distribution versions
      • Large build time requirements
      • Large amounts of testing needed
    • The special problem of having to package Nginx
    • Complying with the Debian packaging policy
  • Packaging automation system
    • TODO
  • Conclusion
    • TODO

The core problems

The Debian and Ubuntu package repositories are incredible: they provide tens of thousands of packages for almost every piece of open source software you can think of. The APT package manager helps you easily keep your software up-to-date with the latest security fixes.

Unfortunately there is one drawback: a lot of packages lag behind the official releases by the software's authors. This is caused by three factors:

  1. Debian and Ubuntu are conservative within the context of a single distribution version.
  2. Debian packages are typically made by packagers, not by the software's original authors.
  3. Packaging is typically a manual process.

Distributions update conservatively

Debian and Ubuntu are conservative within the context of a single distribution version.

Once a package is published for a certain distribution version, Debian and Ubuntu won't update that package to the latest version of that software. Any newer versions will find their way into the next distribution version, but the package for the current distribution version will generally not be updated anymore, except for occasional security updates.

For example, Ubuntu Xenial 16.04 comes with Git 2.7.4. Git 2.8 was released shortly after, but there won't be a Git 2.8 update for Xenial. If you want Git 2.8, you will have to upgrade your entire distribution: to Ubuntu Yakkety 16.10. The most that you can expect is a Git 2.7.5 security update for Xenial.

Debian and Ubuntu might make an exception for some popular non-system software. Debian and Ubuntu have "backports" repositories which may contain newer versions of e.g. Firefox. But this is an exception to the rule. The rule is that software is not updated, unless there is a security update. But even then, Debian and Ubuntu prefer to backport security fixes to the older software versions, rather than to update to the latest software version.

Third-party packagers introduce lag

Debian packages are typically made by packagers, not by the software's original authors. The process is typically as follows:

  1. The software's author releases a source tarball.
  2. Some time later, a packager notices this release. S/he makes a Debian package and submits this package to Debian/Ubuntu. If the packager is an influential Debian/Ubuntu maintainer, then the process ends here, and the package is included in a repository: typically the repository for the next Debian/Ubuntu version; or if you are lucky, the "backports" repository.
  3. But if the packager is not so influential, then a review process is needed. Various Debian/Ubuntu maintainers will check the packaging work for compliance with rules and guidelines and will discuss the work. If the package is rejected then the packager has to submit an update. This is typically a long and tedious process.

For a lot of software, the "some time later" in step 2 is measured in months. Unless there is a security update, only the most zealous packagers will package more often. We can't blame them: as you will learn later in this article, packaging is hard work.

Step 3 can be avoided if the packager publishes to an unofficial repository instead of to the official Debian/Ubuntu repositories. Ubuntu PPAs (Personal Package Archives), for example, are unofficial repositories. This makes the process quicker, but step 2 is still a fundamental bottleneck.

Packages are typically made manually

It follows from factor 2 that packaging is typically a manual process. Very few parties generate their packages using automation. This is why packagers tend to need a few days to publish a new package.

Passenger releases often

Passenger is under constant development and improvement. On average we release a new version once a month. In the past we worked with various third-party packagers. These packagers did excellent work and we are grateful for it. Unfortunately, we release faster than they could keep up with.

Passenger Enterprise cannot be packaged by third parties

We provide a commercial Enterprise version of Passenger. It goes without saying that Passenger Enterprise cannot be included in the official Debian and Ubuntu repositories. We also can't rely on third-party volunteers to make packages for Passenger Enterprise. So for Enterprise customers there is no choice: we have to do it ourselves.

Striving for high packaging quality

We strive for a high-quality, consistent packaging experience for all users, even those on older distribution versions. This is in contrast to Debian and Ubuntu themselves, which mostly care about updating packages in the next distribution version only.

This means that on every Passenger release, we need to release packages for all distribution versions, not just the latest. And all Passenger features must be available on all distribution versions.

Our ambitions are high, but this comes at a cost. It makes packaging a difficult and challenging endeavor for us, as you will learn in the Packaging challenges section.

Solution: automate all the things

We want our open source and enterprise users to have easy access to Passenger Debian packages, and we want these packages to always be up-to-date. There is only one way to address the core problems: we have to maintain packages ourselves, and we have to automate the **** out of this.

As described in section Packaging automation system, we now have an automation system in place. But this system took a lot of time to develop. And even with automation, packaging is still challenging. In the next two sections you will learn why this is so.

Packaging challenges

Debian packaging has a steep learning curve

Debian packaging is not easy to learn. There are significant issues with the available documentation:

  • Most tutorials are targeted at third party packagers: that is, people who are interested in packaging software that they themselves did not write. However, this point of view may cause confusion for readers who are developers that want to package their own software; it certainly did for us.
    • Many tutorials also assume that the software to be packaged uses the Autoconf build system. Again, this makes sense for third party packagers because most Linux software uses Autoconf, but this is not the case for Passenger.
  • The packaging ecosystem consists of a large number of tools, some of which have (partially) overlapping functionality. The documentation contains a large number of references to every tool, but there are few documents that make an attempt to explain the whole.
  • A lot of documentation is written incoherently and makes no attempt at being friendly to novices. Debian wiki pages in particular -- which are among the first results found on Google -- appear to be written by multiple authors who dump random notes on a page, making no attempt to streamline the whole. While this is fine for experienced packagers, the result is hard to read for newcomers.
  • The documents that are written coherently are either presented in a hard-to-read format (such as an 81-page presentation PDF), or are too simplistic, with few explanations of the core concepts. Or both.

The result is that we spent 1 month full-time on learning Debian packaging. It took us another 2 months to actually create the packages. During the learning process, we had to scour 6-7 different documents, read scattered man pages, and download dozens of existing Debian packages in order to learn from undocumented examples.

Fortunately, we have compiled our findings in a conceptual overview document. If you are interested in Debian packaging, then our overview document may save you a lot of time.

Sheer number of distribution versions

Our packaging effort tries to support nearly all versions of Debian and Ubuntu that are actively supported by their vendor. At the time of writing, that means:

  • Debian Wheezy 7
  • Debian Jessie 8
  • Ubuntu Precise 12.04
  • Ubuntu Trusty 14.04
  • Ubuntu Xenial 16.04

We also try to support two architectures, namely x86 and x86_64.

The following factors make this hard:

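  • The tooling has poor support for targeting multiple distribution versions.
  • A large amount of build time is required, potentially making it an expensive endeavor.
  • A large amount of testing is needed, because every distribution version has to be tested separately.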

The difficulty of targeting multiple distribution versions

The tooling has bad support for targeting multiple distribution versions.

The Debian packaging tooling has little to no support for targeting multiple distribution versions. When you make a package, the tooling expects you to target a single distribution version only.

Why is this a problem? Because different distribution versions are structured differently. Roughly 70% of the structure remains the same throughout different versions while the rest differs. But a lot of packages have to be mindful of these small changes: dependency names can differ, directory locations can change, etc.

We want to be able to write a packaging specification that is 70% identical for all distribution versions, with only 30% difference on a per-distribution-version basis. We want to be able to tell the packaging tools "in general do this, but on Ubuntu 14.04 do this slightly differently, and on Ubuntu 16.04 do that slightly differently".

Unfortunately there is almost no support in the tooling to do this. So we had two choices:

  1. Make a unique packaging specification for each distribution version, e.g. by copy-pasting a previous one and changing things. This duplicates effort and is error-prone: if we want to make a change that applies to all distribution versions, then we have to update each distribution's packaging specification separately.

     More importantly: this approach places an unreasonable burden on our available manpower. Passenger is a complex project, and maintaining it is difficult enough as-is. Maintaining packages for multiple distributions in this manner can quickly turn into a full-time effort.

  2. Invent our own mechanism to handle this problem. We chose to do this.
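To illustrate the idea of "in general do this, but on one distribution do it slightly differently", here is a minimal Python sketch of a shared specification with small per-distribution overrides. It is not the actual Passenger packaging mechanism; the structure, the keys and every package name and path in it are hypothetical and chosen only for illustration.

```python
from copy import deepcopy

# Settings shared by every distribution version (the common part).
# All names and values here are made up for illustration.
BASE_SPEC = {
    "build_depends": ["example-build-tool", "example-lib-dev"],
    "doc_dir": "/usr/share/doc/example-package",
}

# Small per-distribution differences, keyed by distribution codename.
OVERRIDES = {
    "trusty": {"extra_build_depends": ["example-compat-lib-dev"]},
    "xenial": {"doc_dir": "/usr/share/doc/example-package-new"},
}


def spec_for(codename):
    """Return the shared spec with one distribution's overrides applied."""
    spec = deepcopy(BASE_SPEC)
    override = OVERRIDES.get(codename, {})
    # Extra dependencies are appended rather than replacing the shared list.
    spec["build_depends"] += override.get("extra_build_depends", [])
    # Every other key simply replaces the shared value.
    for key, value in override.items():
        if key != "extra_build_depends":
            spec[key] = value
    return spec
```

With a layout like this, a change that applies to all distribution versions is made once in `BASE_SPEC`, and a distribution without an entry in `OVERRIDES` gets the shared specification unchanged.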

Why does support for targeting multiple distribution versions from a single specification not exist? It's because the Debian and Ubuntu organizational structure does not need it. When a new software version is released by the upstream authors, the Debian and Ubuntu packagers take the previous packaging specification, modify it for the latest software version and, more importantly, for the next distribution version, and publish that. They rarely publish packages for previous distribution versions, so they have little need for tooling that supports targeting multiple distribution versions from a single packaging specification.

Debian and Ubuntu sometimes publish for older distributions through the backports repositories. This is hard, manual work and involves a lot of duplicate effort. Debian and Ubuntu have thousands of volunteers so this does not matter a lot to them, but it is unworkable for us.

Large build time requirements

At the moment we need to target 5 different distribution versions over two architectures. For each target, we need to generate at least 6 packages. This means that for each Passenger release we need to generate at least 5 * 2 * 6 = 60 packages.
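The arithmetic above can be spelled out as a small build matrix. The following Python sketch only enumerates the targets listed earlier in this article; the lowercase codenames and the variable names are illustrative assumptions, and the figure of 6 packages per target is the minimum mentioned in the text.

```python
from itertools import product

# Distribution versions and architectures listed in this article.
DISTRIBUTIONS = ["wheezy", "jessie", "precise", "trusty", "xenial"]
ARCHITECTURES = ["x86", "x86_64"]
PACKAGES_PER_TARGET = 6  # "at least 6", so this is a lower bound

# One build target per (distribution, architecture) combination.
targets = list(product(DISTRIBUTIONS, ARCHITECTURES))
total_packages = len(targets) * PACKAGES_PER_TARGET

print(len(targets))      # 10
print(total_packages)    # 60
```

Every additional distribution version adds two targets, and therefore at least 12 more packages per release.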

We need to generate so many packages because we need to package Nginx and need to comply with the Debian packaging guidelines, both of which are explained later in this article.

Anyway, building so many packages takes a tremendous amount of computing resources. In order to keep server costs reasonable, we had to optimize the **** out of our building process. This optimization effort took a lot of time: the current state is the result of 3 iterations, spread over 3 months of full-time work.

Large amounts of testing needed

Packaging for each distribution version is different, so Passenger needs to be tested on each distribution version separately. Doing this manually is not feasible, so we had to invest time into creating automated tests and continuous integration infrastructure.

The initial time investment was considerable and took a lot of tweaking to get right. The test suite also needs to be regularly updated. At the very least, it needs to be updated whenever Passenger has seen major structural changes, and whenever a new distribution version is released.

The special problem of having to package Nginx

A special problem that we have to deal with is the fact that, until very recently, Nginx did not support dynamically loadable modules. That means that the only way to extend Nginx with a module (as is the case with Passenger and its Nginx integration mode) is by recompiling Nginx.

On Debian and Ubuntu, installing an Apache module such as PHP is only an apt-get install <insert-apache-module-name-here> away. But you can't do that with Nginx, so Debian and Ubuntu supply an Nginx package in which Nginx is compiled with almost all third party modules that currently exist.

How can we package Passenger's Nginx integration mode then? The only solution is to supply our own Nginx package in which Nginx is compiled with Passenger support. In order to avoid compatibility problems with users who had already installed Debian's/Ubuntu's own Nginx package, our package must support seamless upgrading from Debian's/Ubuntu's version, and our package must also be fully compatible. This is not trivial: maintaining compatibility requires a lot of knowledge about the different distribution versions and about the way they package Nginx.

Because compatibility is such an important goal for us, we chose to base our Nginx package on Debian's and Ubuntu's version. We even hope that one day Debian and Ubuntu will merge our changes back to their repositories. But this brought us another challenge...

Complying with the Debian packaging policy

If you want a package to be eligible for submission to Debian and Ubuntu, then your package's structure has to comply with strict policies. The Debian packaging policies cover, among other things:

  • Package naming conventions.
  • The need to split functionality into subpackages. We are not allowed to put all functionality in a single package: we have to split it according to purpose. For example, documentation must be in a separate package, and development headers must be in a separate package.
  • Filesystem hierarchies. We have to install our files in a manner that complies with the Filesystem Hierarchy Standard. We can't put everything in a single directory: binaries have to be put at a certain location, documentation must be put at a different location, libraries must be put at a different location, etc.
  • We are not allowed to vendor (include) any dependencies. Dependencies have to be separate packages that are separately installable. If a dependency isn't already packaged by Debian/Ubuntu then we will have to package that too. If a dependency is already packaged by Debian/Ubuntu, but not the right version, then we need to work with Debian/Ubuntu to update that package. The latter is especially challenging if we need to have a newer version of a dependency on an older distribution, because as mentioned before, Debian and Ubuntu are not keen on updating packages for older distributions.
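As a rough illustration of the subpackage and filesystem rules above, the following Python sketch maps a file's role to a subpackage and an install location. The role names, the `placement` helper and the `example` package are hypothetical; the `-dev` and `-doc` suffixes and the directories follow common Debian conventions, but this is a simplification and not a statement of the full policy.

```python
# Hypothetical mapping from a file's role to (subpackage suffix, base directory).
LAYOUT = {
    "binary": ("", "/usr/bin"),
    "library": ("", "/usr/lib"),
    "header": ("-dev", "/usr/include"),
    "documentation": ("-doc", "/usr/share/doc"),
}


def placement(package, role, filename):
    """Return (subpackage name, install path) for one file."""
    suffix, directory = LAYOUT[role]
    subpackage = package + suffix
    # Documentation lives in a directory named after the package that ships it.
    if role == "documentation":
        directory = f"{directory}/{subpackage}"
    return subpackage, f"{directory}/{filename}"
```

For instance, `placement("example", "header", "example.h")` puts the header in an `example-dev` subpackage rather than in the main package.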

Complying with the packaging guidelines is a lot of work by itself. It is said that this can easily be 50% of the total packaging effort, which is in line with our experience.

Packaging automation system

TODO
