So you are a Debian/Ubuntu user. You search for a package with apt-cache search. You install a package with apt-get install. You already intuitively know these:
- Packages contain basic metadata such as names and descriptions.
- Packages may have dependencies.
- Packages contain files.
Indeed. A Debian package -- a .deb file -- is sort of like a tar.gz or zip file containing metadata and files. It's not actually a tar.gz or zip: the format is ar but that's not important.
If you are a developer who wants to publish a Debian package, then you may intuitively think that the procedure is as follows:
- Write a file describing the metadata.
- Compile binaries.
- Run some tool to package the metadata and binaries together, sort of like
tar -czf filename1 filename2 ...
This is sort of correct, but with a big caveat. It has to do with who controls the compilation step.
If you have ever published Windows desktop software, OS X desktop software or Java software, then you may think that you need to generate binaries yourself, and then tell the packaging tool where your binaries are and where on the filesystem they need to be placed. Debian packages can be made this way, but almost nobody does it this way. And indeed, the tooling doesn't make it easy for you to do it this way.
Instead, the Debian packaging tools require you to specify:
- How your software is compiled.
- How your compiled binaries should be placed on the filesystem.
When you run the packaging tools, they compile the software for you according to your instructions.
The reason the Debian packaging tools work like this is that they evolved in an open source setting where packagers are not the original developers. As explained in Third-party packagers introduce lag, a third-party packager waits for the developer to release a source tarball, then writes a specification that describes how the software is compiled. This is good for repeatability and sharing. Instead of depending on the packager to run ad-hoc compilation commands, the compilation instructions are clearly specified so that other people can reproduce the work.
So the package building process expects metadata such as descriptions and dependency information. It also expects a specification describing how to compile the software. The packaging tools expect you to supply all of this information through a directory. This directory is typically called debian and placed in the top-level source code directory of the software you want to package. Inside this directory are at least two files:
- control -- metadata such as descriptions and dependency information.
- rules -- a makefile describing the compilation steps.
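As a rough illustration of the two files listed above, the following shell sketch creates a minimal debian directory for a made-up package. The package name hello-demo, the maintainer, and the use of debhelper (the dh command and the debhelper-compat build dependency) are assumptions for illustration only; nothing above prescribes them. Note also that dpkg-buildpackage additionally expects a debian/changelog file, which is left out here to keep the sketch short.

```shell
#!/bin/sh
# Sketch: create a minimal debian/ directory for a hypothetical package.

pkg_root=$(mktemp -d)          # stands in for your top-level source directory
mkdir -p "$pkg_root/debian"

# control: metadata such as descriptions and dependency information.
# All names and values below are placeholders.
cat > "$pkg_root/debian/control" <<'EOF'
Source: hello-demo
Section: misc
Priority: optional
Maintainer: Jane Doe <jane@example.com>
Build-Depends: debhelper-compat (= 13)

Package: hello-demo
Architecture: any
Depends: ${shlibs:Depends}, ${misc:Depends}
Description: hypothetical demo package
 Longer description goes here, indented by one space.
EOF

# rules: a makefile describing the compilation steps.
# Recipe lines must start with a tab, so printf is used to emit one.
# Delegating every target to dh is an assumption (requires debhelper).
printf '#!/usr/bin/make -f\n%%:\n\tdh $@\n' > "$pkg_root/debian/rules"
chmod +x "$pkg_root/debian/rules"

ls "$pkg_root/debian"
```

The rules file is an ordinary makefile, which is why it is marked executable and starts with a make shebang line.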
This directory is supposed to contain everything related to packaging the software. That is, everything related to adapting the software for integration into Debian/Ubuntu. So the directory may contain additional files such as:
- Man pages.
- Init scripts.
- Pre- and post-install scripts.
- Patches.
There are multiple Debian packaging tools that form a hierarchy. On the lowest level is dpkg-buildpackage. This tool expects as input a directory containing:
- the software's source code.
- a debian subdirectory.
dpkg-buildpackage compiles the software using the debian/rules makefile and generates a package with the metadata specified in debian/control.
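A minimal sketch of invoking dpkg-buildpackage on such a directory might look like this. The function name build_package is made up for this example. The flags -us and -uc (skip signing) and -b (build binary packages only) are real dpkg-buildpackage options, picked here on the assumption that you want a quick, unsigned local test build.

```shell
# Sketch: run dpkg-buildpackage on a source tree that contains debian/.
# build_package is a hypothetical helper name.
build_package() {
    src_dir="$1"

    # Refuse to run if the packaging files described above are missing.
    if [ ! -f "$src_dir/debian/control" ] || [ ! -f "$src_dir/debian/rules" ]; then
        echo "error: $src_dir lacks debian/control or debian/rules" >&2
        return 1
    fi

    # -us -uc: do not sign the source package or the .changes file.
    # -b: build binary packages only.
    # The subshell keeps the caller's working directory unchanged.
    (cd "$src_dir" && dpkg-buildpackage -us -uc -b)
}

# Usage: build_package /path/to/source-tree
```

The resulting .deb files are written to the parent directory of the source tree, not into the tree itself.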
dpkg-buildpackage operates directly on the current operating system environment. This is fairly straightforward, but it has a drawback: it depends implicitly on your system's state. What does this mean?
Most software has dependencies. Various dependencies need to be installed before a piece of software can be compiled. Since Debian packages are supposed to be repeatable (other people should be able to take your debian directory and generate a package too), Debian allows you to specify build dependencies in the debian/control file.
How do you know whether your list of build dependencies is correct and complete? What if you have a build dependency installed, but forgot to list it in the control file? dpkg-buildpackage cannot help you there.
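To make the build dependency list concrete, here is a small sketch that prints the entries of the Build-Depends field of a control file, one per line. The function name list_build_deps is hypothetical, and the parsing is deliberately naive: it strips version constraints in parentheses but does not interpret alternatives (|) or architecture restrictions.

```shell
# Sketch: print the Build-Depends entries of a debian/control file.
# list_build_deps is a hypothetical helper; the parsing is simplified.
list_build_deps() {
    control_file="$1"

    # Collect the Build-Depends line plus its indented continuation lines.
    awk '/^Build-Depends:/ { found = 1; sub(/^Build-Depends:/, ""); print; next }
         found && /^[ \t]/ { print; next }
                           { found = 0 }' "$control_file" |
        tr ',' '\n' |                       # one entry per line
        sed -e 's/([^)]*)//g' \
            -e 's/^[[:space:]]*//' \
            -e 's/[[:space:]]*$//' |        # drop "(>= 1.2)" and whitespace
        grep -v '^$'                        # drop empty entries
}

# Usage: list_build_deps debian/control
```

Such a listing only shows what you declared. It cannot tell you about a dependency that happens to be installed on your machine but is missing from the list.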
To solve this problem, the pbuilder tool was invented. pbuilder creates a clean, isolated Debian/Ubuntu system inside a chroot environment, installs all your build dependencies in there, then runs dpkg-buildpackage in there. If you forgot to list a build dependency then you will encounter an error.
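The workflow can be sketched roughly as follows. The helper names run, setup_chroot and build_in_chroot are invented for this example, and the DRY_RUN switch is an assumption added so that the sketch prints commands by default instead of touching your system. pbuilder create and pdebuild are real commands from the pbuilder package; pdebuild is a convenience wrapper that is run from inside the source tree.

```shell
# Sketch: build inside a clean pbuilder chroot.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# One-time setup: create the base chroot image (needs root).
setup_chroot() {
    run sudo pbuilder create
}

# Build the source tree in the chroot. pbuilder installs the listed
# build dependencies there before running dpkg-buildpackage.
build_in_chroot() {
    src_dir="$1"
    (cd "$src_dir" && run pdebuild)
}

# Usage:
#   DRY_RUN=0 setup_chroot
#   DRY_RUN=0 build_in_chroot /path/to/source-tree
```

Because the chroot starts out without your personal set of installed packages, a build that succeeds here is good evidence that the Build-Depends list is complete.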
A drawback of both dpkg-buildpackage and pbuilder is that they can only build packages for the Debian/Ubuntu version that you are currently running. What if you need to publish packages for multiple distribution versions?
The pbuilder-dist tool addresses this problem. pbuilder-dist builds on top of pbuilder and allows you to choose which Debian/Ubuntu version to put in the chroot. This way you can build packages for multiple distribution versions without having to install each distribution version manually.
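As a sketch of how this might be scripted: pbuilder-dist takes the distribution name first and the operation second. The helper names run, create_chroots and build_for_dists are invented, the DRY_RUN switch is an assumption that makes the sketch print commands by default, and the file name and the distribution names focal and jammy are only examples. The build operation takes a .dsc file, which is part of a source package.

```shell
# Sketch: build one source package for several distribution versions.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# One-time setup: create a chroot for each requested distribution.
create_chroots() {
    for dist in "$@"; do
        run pbuilder-dist "$dist" create
    done
}

# Build the given .dsc once per distribution.
build_for_dists() {
    dsc_file="$1"
    shift
    for dist in "$@"; do
        run pbuilder-dist "$dist" build "$dsc_file"
    done
}

# Usage:
#   DRY_RUN=0 create_chroots focal jammy
#   DRY_RUN=0 build_for_dists hello-demo_1.0-1.dsc focal jammy
```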
To be precise, a .deb file is a binary package: it contains the final compiled binaries. But there is also this thing called source packages. Every binary package has a corresponding source package. Confusingly, a source package is not a single file: it is actually three files, and the collection of these three files is called a source package.
What is a source package good for? Well, a source package contains everything you need to generate the corresponding binary package. A source package contains a complete specification, so it is repeatable.
A source package contains:
- the source code of the software that's being packaged. This is called the "orig tarball".
- a debian directory in an archive (either tar.gz or tar.xz).
- a description file (.dsc), usually cryptographically signed, containing checksums of the orig tarball and the debian directory archive.
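To illustrate how the .dsc file ties the other two files together, here is a sketch that checks the files next to a .dsc against the SHA-256 checksums recorded in its Checksums-Sha256 field. The function name verify_dsc is hypothetical, sha256sum from GNU coreutils is assumed to be available, and the sketch checks checksums only, not the cryptographic signature. In practice dpkg-source -x performs this kind of check when unpacking a source package.

```shell
# Sketch: check the files of a source package against the SHA-256
# checksums recorded in its .dsc file. verify_dsc is a hypothetical name.
verify_dsc() {
    dsc_file="$1"

    # Lines under "Checksums-Sha256:" look like " <hash> <size> <filename>".
    # Convert them to the "<hash>  <filename>" format that sha256sum -c
    # reads, then verify relative to the directory holding the .dsc.
    awk '/^Checksums-Sha256:/ { in_list = 1; next }
         in_list && /^ /      { print $1 "  " $3; next }
                              { in_list = 0 }' "$dsc_file" |
        (cd "$(dirname "$dsc_file")" && sha256sum -c -)
}

# Usage: verify_dsc hello-demo_1.0-1.dsc
```

The function returns a non-zero status if any listed file is missing or has been altered.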
You think devops invented the concept of deployment repeatability? The Debian authors (and Red Hat authors, because RPM has a similar concept) thought of this a long time ago. :) Everybody can take a source package and use it to build a binary package without fail, no matter what they already have or haven't installed yet. If a source package is able to build inside a pbuilder environment, then it is able to build for everyone.