@FooBarWidget
Created June 10, 2016 15:18

Overview of Debian packages

Anatomy

So you are a Debian/Ubuntu user. You search for a package with apt-cache search. You install a package with apt-get install. You already intuitively know these:

  • Packages contain basic metadata such as names and descriptions.
  • Packages may have dependencies.
  • Packages contain files.

Indeed. A Debian package -- a .deb file -- is sort of like a tar.gz or zip file containing metadata and files. It's not actually a tar.gz or zip: the format is ar but that's not important.
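
To make the "it's really an ar archive" remark concrete, here is a minimal Python sketch that lists the members of an ar archive held in memory. It assumes the common ar layout (an 8-byte magic string followed by 60-byte member headers). The `make_ar` helper is hypothetical and exists only so the example runs without a real .deb. In a real .deb you would typically see three members: debian-binary, a control tarball (the metadata) and a data tarball (the files).

```python
AR_MAGIC = b"!<arch>\n"
HEADER_SIZE = 60


def list_ar_members(data: bytes) -> list[tuple[str, int]]:
    """Return (name, size) for each member of an ar archive held in memory."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    members = []
    offset = len(AR_MAGIC)
    while offset + HEADER_SIZE <= len(data):
        header = data[offset:offset + HEADER_SIZE]
        if header[58:60] != b"`\n":
            raise ValueError("corrupt member header")
        # Bytes 0-15 hold the name, bytes 48-57 the size in decimal.
        name = header[0:16].decode("ascii").rstrip().rstrip("/")
        size = int(header[48:58].decode("ascii").strip())
        members.append((name, size))
        # Member data is padded to an even number of bytes.
        offset += HEADER_SIZE + size + (size % 2)
    return members


def make_ar(files: dict[str, bytes]) -> bytes:
    """Hypothetical helper: build a tiny ar archive so the example is self-contained."""
    out = bytearray(AR_MAGIC)
    for name, payload in files.items():
        # Fields: name(16) mtime(12) uid(6) gid(6) mode(8) size(10) magic(2)
        header = f"{name:<16}{0:<12}{0:<6}{0:<6}{'100644':<8}{len(payload):<10}`\n"
        out += header.encode("ascii") + payload
        if len(payload) % 2:
            out += b"\n"
    return bytes(out)


# Placeholder payloads; a real .deb holds real tarballs here.
demo = make_ar({
    "debian-binary": b"2.0\n",
    "control.tar.gz": b"abc",
    "data.tar.xz": b"abcd",
})
print(list_ar_members(demo))
```

On a real system you would not write this yourself: `ar t package.deb` lists the same members.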

Building process

If you are a developer who wants to publish a Debian package, then you may intuitively think that the procedure is as follows:

  1. Write a file describing the metadata.
  2. Compile binaries.
  3. Run some tool to package the metadata and binaries together, sort of like tar -czf filename1 filename2 ...

This is sort of correct, but with a big caveat. It has got to do with who controls the compilation step.

If you have ever published Windows desktop software, OS X desktop software or Java software, then you may think that you need to generate binaries yourself, and then tell the packaging tool where your binaries are and where on the filesystem they need to be placed. Debian packages can be made this way, but almost nobody does it this way. And indeed, the tooling doesn't make it easy for you to do it this way.

Instead, the Debian packaging tools require you to specify:

  1. How your software is compiled.
  2. How your compiled binaries should be placed on the filesystem.

When you run the packaging tools, they compile the software for you according to your instructions.

The reason the Debian packaging tools work like this is that they evolved in an open source setting where packagers are not the original developers. As explained in Third-party packagers introduce lag, a third-party packager waits for the developer to release a source tarball, then writes a specification that describes how the software is compiled. This is good for repeatability and sharing. Instead of depending on the packager to run ad-hoc compilation commands, the compilation instructions are clearly specified so that other people can reproduce the work.

The debian directory

So the package building process expects metadata such as descriptions and dependency information. It also expects a specification describing how to compile the software. The packaging tools expect you to supply all of this information through a directory. This directory is typically called debian and placed in the top-level source code directory of the software you want to package. Inside this directory are at least two files:

  • control -- metadata such as descriptions and dependency information.
  • rules -- a makefile describing the compilation steps.
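
To give an idea of what these two files look like, here is a small Python sketch that writes a minimal debian directory. Everything in it is an illustrative assumption: the package name hello-example, the maintainer, and the use of debhelper's `dh` as the build driver. A real build also needs at least a debian/changelog file, which this sketch omits.

```python
from pathlib import Path

# Hypothetical metadata: a source stanza followed by one binary package stanza.
CONTROL = """\
Source: hello-example
Maintainer: Jane Doe <jane@example.com>
Build-Depends: debhelper (>= 9)

Package: hello-example
Architecture: any
Depends: ${shlibs:Depends}, ${misc:Depends}
Description: hypothetical example package
 Longer description lines are indented by one space.
"""

# debian/rules is a makefile; recipe lines must start with a tab.
# This catch-all rule hands every target to debhelper's dh tool.
RULES = "#!/usr/bin/make -f\n%:\n\tdh $@\n"


def write_debian_skeleton(source_root: Path) -> Path:
    """Create source_root/debian with a control file and a rules makefile."""
    debian = source_root / "debian"
    debian.mkdir(parents=True, exist_ok=True)
    (debian / "control").write_text(CONTROL)
    rules = debian / "rules"
    rules.write_text(RULES)
    rules.chmod(0o755)  # the build tools run debian/rules as an executable
    return debian
```

Calling `write_debian_skeleton(Path("hello-example-1.0"))` would create `hello-example-1.0/debian/control` and `hello-example-1.0/debian/rules`.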

This directory is supposed to contain everything related to packaging the software. That is, everything related to adapting the software for integration in Debian/Ubuntu. So the directory may contain additional files such as:

  • Man pages.
  • Init scripts.
  • Pre- and post-install scripts.
  • Patches.

dpkg-buildpackage: low-level package building tool

There are multiple Debian packaging tools that form a hierarchy. On the lowest level is dpkg-buildpackage. This tool expects as input a directory containing:

  • the software's source code.
  • a debian subdirectory.

dpkg-buildpackage compiles the software using the debian/rules makefile and generates a package with the metadata specified in debian/control.
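
As a rough sketch of how this step might be driven from a script, the Python below checks that the input directory has the expected debian files and then runs dpkg-buildpackage in it. The function names are made up for this example. The flags are my choice, not a requirement: `-b` builds only binary packages, and `-us -uc` skip signing.

```python
import subprocess
from pathlib import Path


def build_command(unsigned: bool = True) -> list[str]:
    """Return a dpkg-buildpackage command line for a binary-only build."""
    cmd = ["dpkg-buildpackage", "-b"]  # -b: binary packages only
    if unsigned:
        cmd += ["-us", "-uc"]  # do not sign the source package or .changes file
    return cmd


def build_package(source_root: Path) -> None:
    """Run dpkg-buildpackage inside source_root after a basic sanity check."""
    missing = [
        name for name in ("control", "rules")
        if not (source_root / "debian" / name).is_file()
    ]
    if missing:
        raise FileNotFoundError(f"missing debian files: {', '.join(missing)}")
    # dpkg-buildpackage must be run from the top-level source directory.
    subprocess.run(build_command(), cwd=source_root, check=True)
```

The resulting .deb files end up in the parent directory of the source tree.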

pbuilder: ensuring that the packaging specification is correct

dpkg-buildpackage operates directly on the current operating system environment. This is fairly straightforward, but it has a drawback: it depends implicitly on your system's state. What does this mean?

Most software has dependencies. Various dependencies need to be installed before a piece of software can be compiled. Since Debian packages are supposed to be repeatable (other people should be able to take your debian directory and generate a package too), Debian allows you to specify build dependencies in the debian/control file.
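
Build dependencies live in the Build-Depends field of the source stanza in debian/control. As a simplified sketch, here is how the package names could be pulled out of such a field in Python. The function name and the sample field value are hypothetical, and the parser ignores details such as architecture qualifiers like `:any`.

```python
import re


def build_dependency_names(field: str) -> list[str]:
    """Extract package names from the value of a Build-Depends field."""
    names = []
    for clause in field.split(","):
        # "a | b" means either package will do; keep the first alternative.
        first = clause.split("|")[0].strip()
        if not first:
            continue
        # Cut off version constraints "(>= 9)", architecture lists "[amd64]"
        # and build profiles "<!nocheck>".
        names.append(re.split(r"[\s(\[<]", first, maxsplit=1)[0])
    return names


print(build_dependency_names("debhelper (>= 9), libssl-dev, gcc | clang"))
```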

How do you know whether your list of build dependencies is correct and complete? What if you have a build dependency installed, but forgot to list it in the control file? dpkg-buildpackage cannot help you there.

To solve this problem, the pbuilder tool was invented. pbuilder creates a clean, isolated Debian/Ubuntu system inside a chroot environment, installs all your build dependencies in there, then runs dpkg-buildpackage in there. If you forgot to list a build dependency then you will encounter an error.

pbuilder-dist: building packages for multiple distributions

A drawback of both dpkg-buildpackage and pbuilder is that they can only build packages for the Debian/Ubuntu version that you are currently running. What if you need to publish packages for multiple distribution versions?

The pbuilder-dist tool addresses this problem. pbuilder-dist builds on top of pbuilder and allows you to choose which Debian/Ubuntu version to put in the chroot. This way you can build packages for multiple distribution versions without having to install each distribution version manually.
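
To illustrate, the Python sketch below generates the pbuilder-dist command lines for building one source package on several distribution versions. It assumes the usual `pbuilder-dist <distribution> <operation>` calling convention. The .dsc file name and the list of distributions are just examples, and the commands are only printed, not executed.

```python
def pbuilder_dist_commands(dsc_path: str, distributions: list[str]) -> list[list[str]]:
    """Return the commands that set up a chroot and build dsc_path per distribution."""
    commands = []
    for dist in distributions:
        # One-time setup of the chroot for this distribution version.
        commands.append(["pbuilder-dist", dist, "create"])
        # Build the source package inside that chroot.
        commands.append(["pbuilder-dist", dist, "build", dsc_path])
    return commands


for cmd in pbuilder_dist_commands("hello-example_1.0-1.dsc", ["trusty", "xenial"]):
    print(" ".join(cmd))
```

To actually run them, pass each list to `subprocess.run(cmd, check=True)`. The create step is only needed once per distribution version.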

Binary vs source packages: the art of repeatability

To be precise, a .deb file is a binary package: it contains the final compiled binaries. But there is also this thing called source packages. Every binary package has a corresponding source package. Confusingly, a source package is not a single file: it is actually three files, and the collection of these three files is called a source package.

What is a source package good for? Well, a source package contains everything you need to generate the corresponding binary package. A source package contains a complete specification, so it is repeatable.

A source package contains:

  • the source code of the software that's being packaged. This is called the "orig tarball".
  • an archive of the debian directory (either tar.gz or tar.xz).
  • a description file (.dsc) listing checksums of the orig tarball and the debian directory archive, usually signed with the packager's GPG key.
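
The .dsc file is what ties the three files together: it lists the other files along with their checksums. As a minimal sketch, the Python below reads the Checksums-Sha256 field of a .dsc file and checks each listed file that sits next to it. The function names are hypothetical, and this verifies checksums only, not the GPG signature.

```python
import hashlib
from pathlib import Path


def parse_sha256_entries(dsc_text: str) -> dict[str, str]:
    """Map file name to expected SHA-256 digest from a .dsc file's text."""
    entries = {}
    in_field = False
    for line in dsc_text.splitlines():
        if line.startswith("Checksums-Sha256:"):
            in_field = True
        elif in_field and line.startswith(" "):
            # Each continuation line reads: <digest> <size> <file name>
            digest, _size, filename = line.split()
            entries[filename] = digest
        else:
            in_field = False
    return entries


def verify_source_package(dsc_path: Path) -> dict[str, bool]:
    """Return, per listed file, whether it exists and matches its checksum."""
    results = {}
    for filename, expected in parse_sha256_entries(dsc_path.read_text()).items():
        candidate = dsc_path.parent / filename
        actual = None
        if candidate.is_file():
            actual = hashlib.sha256(candidate.read_bytes()).hexdigest()
        results[filename] = actual == expected
    return results
```

If every value in the returned dictionary is True, the orig tarball and the debian archive are the ones the .dsc describes.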

You think devops invented the concept of deployment repeatability? The Debian authors (and Red Hat authors, because RPM has a similar concept) thought of this a long time ago. :) Everybody can take a source package and use it to build a binary package without fail, no matter what they already have or haven't installed yet. If a source package is able to build inside a pbuilder environment, then it is able to build for everyone.
