Why use Buck2? — or: from Make to Buck2

Let's say you have a Makefile like this:

SRCS := $(shell find . -type f -iname '*.c')
OBJS := $(patsubst %.c,%.o,$(SRCS))

%.o: %.c
	$(CC) -c $(CFLAGS) -o $@ $<

a.exe: $(OBJS)
	$(CC) -o $@ $^

test: a.exe
	./a.exe

.PHONY: test

This leans on a few GNU Make features (such as $(shell ...) and $(patsubst ...)), but for the purposes here it should be readily understandable, or translatable to your favorite implementation of Make. This Makefile:

  • Turns a set of .c files into .o files.
  • Turns the full set of .o files into an .exe
  • Allows you to run the .exe

Make is actually a great tool to implement as an exercise, to test your programming chops; try it when you want to learn a new language. That's because Make has a pretty simple operational model at a glance, meaning the model that describes how Make executes step by step. When we think of Make operationally, we are thinking of its concrete algorithm, and to be extremely hand-wavy, that algorithm looks something like this:

  • Make is responsible for keeping files on disk up-to-date.
  • A Makefile associates a command with every file it describes; when run, that command creates the file.
  • A Makefile contains a representation of dependencies between files, i.e. this file needs to exist before this command can run, which in turn generates this other file.
  • When Make is run, it is given a top-level file to produce, e.g. make a.exe, and it looks at this file and the files it depends on.
  • Make recursively asks itself: is this file "up-to-date"? If it is, Make does nothing and continues on. If it isn't, it runs the command associated with that file to re-generate it. This task is recursive across all dependencies of the requested file, because if a dependency is out-of-date, then it needs to be brought up-to-date first.
  • It can do this in parallel, too. Some commands depend on different sets of input files, after all.
  • It continues this process until the desired file a.exe is "up-to-date".

The notion of 'up-to-date' is left abstract, though in practice Make looks at file timestamps: given a rule A: B..., A is considered up-to-date only if it exists and none of its dependencies B... has a newer timestamp. It does this recursively for all dependencies. This "existence+mtime" check is the core of how Make works for most users.

In this way, because Make only tracks files, it treats the filesystem as a caching layer: it assumes every action puts its result into a file on the filesystem, and it uses "existence+mtime" as the criterion for a "cache hit". This turns the filesystem into a simplistic cache and database of record. You can imagine a hypothetical world where the stat call also returned a SHA-256 hash of a file along with the mtime; then Make could compare hashes instead of mtimes. Without that, Make would have to record such data out of band.
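As a rough illustration of what "recording that data out of band" could look like, here is a minimal Python sketch. The function names and the plain dictionary standing in for a database are my own assumptions for illustration; this is not how Make (or Buck2) actually stores anything.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def is_up_to_date_by_hash(target, deps, db):
    """True if target exists and every dependency still hashes as recorded."""
    if not Path(target).exists():
        return False
    recorded = db.get(str(target))
    if recorded is None:
        return False  # never built under this bookkeeping
    current = {str(dep): file_digest(dep) for dep in deps}
    return current == recorded


def record_build(target, deps, db):
    """Remember the dependency digests that produced the current target."""
    db[str(target)] = {str(dep): file_digest(dep) for dep in deps}
```

With this scheme, a bare touch of a source file changes its mtime but not its digest, so nothing is considered stale; only an actual content change is.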

Also, some rules cannot be cached. These are .PHONY rules; the name of a phony rule doesn't have to correspond to a file, because no caching takes place.

But if we dissect it, Make actually embodies a particular set of semantics that we can pull apart. And I think there's a journey from there that roughly ends at something like Buck2.

Make, denotationally

So, the purpose of a Makefile is simple: run a set of commands. But doesn't bash do that? And didn't we determine Make caches things, and checks for "up-to-date"-ness? What separates it from a raw Bash script, then? You could just put every command sequentially into a .sh file and run that. But Make is useful precisely because it recognizes some common things about such a series of commands:

  • Individual commands are computations that result in outputs. For example, the report file generated by running a test suite, or the object file generated by compiling a piece of code.

  • Even though the total set of inputs to the bash script may be large when looked at in aggregate, individual commands often only need a small set of the total set of inputs to do their job — inputs which often are the outputs of other, previous commands that were run — or input files written by hand, perhaps.

  • For most commands, the command's output is purely a function of its inputs: files, command line arguments, stdin, environment variables, et cetera. If you run a command once with inputs A, B, and C, wherever they come from, running it again with the same inputs will always give the same output as the previous run. These are pure commands.

  • But some (few) commands need to be re-run every time, even with the same inputs. These are volatile commands. They often have some sort of side effect on the outside environment, or perform some ambient effect to be tracked (like running a test suite.)

  • Individual commands do not need to run sequentially. A fully defined sequential ordering might imply there is a total ordering on the set of commands. But there isn't, unless you arbitrarily break ties by some other total ordering, kicking the can down the road. Put another way, you can create a total ordering, but it is overly conservative and completely arbitrary.

  • Rather: any command may run once all of its needed inputs are available, including those that are outputs of previous commands. Therefore, if we consider the set of all commands S, there is a partial order on S, where the order relation is defined by input/output edges, and X < Y means that command X must be run before command Y can be run, because the output of X is an input to Y.

  • Because there is a partial ordering on the set of commands to run, with respect to their inputs/outputs — you may construct a directed acyclic graph (DAG) from that set.

  • Because there is a DAG on the set of commands, the DAG naturally represents a graph which can be computed in parallel. Two commands X and Y may be run in parallel as long as neither transitively depends on the other, that is, there is no chain X < ... < Y or Y < ... < X relating them.

  • In this sense, a Makefile doesn't describe a particular set of commands to run; rather, it is a specification of commands to run, of which there are many potential realizations. For example, if two nodes X and Y are not transitively related, then their commands can be run in a purely sequential order as X, Y ("version 1") or Y, X ("version 2"). Even though the physical trace of commands is different, these two versions are "observably equivalent"; hence the DAG is a specification, not a concrete instance of the build. Again, this is true even when no parallelism is involved. For instance, in the original Makefile above — Make, even with no parallelism, could simply randomize the order of the .o files every time, but the result will be the same binary.

  • Now, you can pick some command in the set S. This command A results in an output. To get that output, you just walk the transitive chain of all things that A depends on and build all of them. This has a natural specification:

    Build(A) =
      DependenciesOfA = Build(D) for each D in DependenciesOf(A)
      return RunCommand(A, DependenciesOfA)

    This definition is naturally recursive, because it represents the recursive walk of all dependencies.

  • Finally, the graph of commands is incremental in a natural, straightforward way: if a command is a pure command, then you simply need to check "up-to-date"ness of the inputs. (It is assumed this is less expensive than re-running the command itself.) We say this is incremental because after running Build(A) once, running it a second time after some input of A has changed is faster than recomputing everything from scratch, the time spent being "proportional", by some measure, to the size of the change.
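To make the partial order and the "many valid realizations" point a bit more concrete, here is a small Python sketch using the standard library's graphlib module (Python 3.9+). The graph and the file names (foo.c, bar.c) are made up for illustration, and grouping commands into "ready" batches is just one possible schedule, not a description of how Make schedules jobs.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency graph: each key depends on the nodes it maps to.
GRAPH = {
    "a.exe": {"foo.o", "bar.o"},
    "foo.o": {"foo.c"},
    "bar.o": {"bar.c"},
}


def parallel_batches(graph):
    """Group nodes into batches; everything within a batch could run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises CycleError if the graph is not a DAG
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to get stable output
        batches.append(ready)
        sorter.done(*ready)
    return batches


print(parallel_batches(GRAPH))
```

Within a batch, any order (or full parallelism) is a valid realization of the same specification; only the order between batches is forced by the input/output edges.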

So the final specification for Build looks something like this:

Build(A) =
  DependenciesOfA = Build(D) for each D in DependenciesOf(A)

  // pure commands can be skipped, if everything is up to date
  if IsPure(A) then {
    Result = IsUpToDate(A, DependenciesOfA)
    if Result is not null {
      return Result // it is, so return
    }
  }

  // otherwise just recompute; if it's pure, then we need
  // to record its up-to-date-ness
  Result = RunCommand(A, DependenciesOfA)
  if IsPure(A) then {
    RecordUpToDateResult(Result, A, DependenciesOfA)
  }
  return Result

This description might sound overly academic, but there is a purpose to it: it is roughly the set of insights, and the rough abstract algorithm, that informs every build system! The differences are mostly in details like "how do I recompute DependenciesOfA efficiently" and how up-to-date-ness is recorded, which we'll get to.

Also, note that this does not do any "invalidation" of consumers of A. If you call Build(A) and then call Build(B) and A < B, then it's assumed that IsUpToDate(B, DependenciesOfB) will return null — assuming A was changed according to the "up-to-date"-ness logic.

But this is a remarkably simple description! It's very important to understand this algorithm and the abstract description above.
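If it helps to see the abstract Build algorithm run, here is a minimal Python sketch of it. Everything concrete here is an assumption of mine for illustration: the cache is a plain dictionary keyed on the node plus the results of its dependencies (so results must be hashable), the "commands" just build strings, and source files are treated as volatile so they are re-read on every build.

```python
def build(node, deps_of, run, is_pure, cache):
    """Build(A): build dependencies first, then reuse or recompute A."""
    dep_results = tuple(
        build(dep, deps_of, run, is_pure, cache) for dep in deps_of.get(node, ())
    )
    key = (node, dep_results)            # assumed notion of "up-to-date"
    if is_pure(node) and key in cache:   # IsUpToDate
        return cache[key]
    result = run(node, dep_results)      # RunCommand
    if is_pure(node):
        cache[key] = result              # RecordUpToDateResult
    return result


# Hypothetical project: sources are re-read every time (volatile),
# everything derived from them is pure.
sources = {"foo.c": "v1", "bar.c": "v1"}
deps_of = {"a.exe": ["foo.o", "bar.o"], "foo.o": ["foo.c"], "bar.o": ["bar.c"]}
log = []  # records which commands actually ran


def run(node, dep_results):
    log.append(node)
    if node in sources:
        return sources[node]
    return f"{node}({','.join(dep_results)})"


def is_pure(node):
    return node not in sources


cache = {}
print(build("a.exe", deps_of, run, is_pure, cache))
```

Changing sources["foo.c"] and calling build again re-runs only foo.o and a.exe; bar.o is served from the cache. Note there is no explicit invalidation step anywhere: a changed input simply produces a key that misses.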

Teasing apart the description

This abstract description teases out very important parts of the design of make:

  • Make only thinks of "inputs" and "outputs" as files and nothing more. Files depend on files, and that's it. But in the above, the description of A is more abstract.

  • The check for "up-to-date"-ness is totally abstract, as alluded to earlier. make uses the filesystem as a simple cache to store the result of a command, and also uses it to check the "up-to-date"-ness of that result via mtime fields. This is part of its elegance (more on that later) but conflating these things has consequences in the large.

IsUpToDate(File, Dependencies) =
  if File does not exist {
    return null // needs to be built for the first time
  }

  t = mtime(File) // grab the file mtime

  for each DepFile in Dependencies {
    // if a dependency has a newer mtime than the
    // original file, it was rebuilt, so this needs
    // to be rebuilt too
    if mtime(DepFile) newer than t {
      return null
    }
  }

  // it exists, everything is up to date, so return
  // the file itself as the result
  return File

RecordUpToDateResult(Result, File, Dependencies) =
  // just write the result of the command to the filesystem,
  // and make sure the mtime is recorded. dependencies aren't
  // relevant
  WriteFileWithMtime(File, Result)
  return null
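For the curious, the mtime check above translates almost line for line into runnable Python. This is only a sketch of the "existence+mtime" idea; the function name is mine, and it assumes every dependency exists on disk (a missing one raises FileNotFoundError).

```python
import os


def is_up_to_date_by_mtime(target, deps):
    """Make-style check: target exists and no dependency is newer than it."""
    if not os.path.exists(target):
        return False  # needs to be built for the first time
    target_mtime = os.path.getmtime(target)
    # Any dependency with a newer mtime means the target is stale.
    return all(os.path.getmtime(dep) <= target_mtime for dep in deps)
```

Notice that the contents of the files never enter the picture: touching a dependency without changing it is enough to make the target look stale.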

However, this implementation is awkward. For example, a cache should be able to hold multiple versions of a file. In fact a cache is just a glorified key-value store, so it's largely a matter of picking a good cache key, right? But a "modern filesystem" is a very poor cache when used the way make does, because it only allows one file to exist at a given "cache key", i.e. filesystem path. This gives extremely poor granularity for caching and makes many things awkward.

For example, in the original Makefile in the beginning, only a single a.exe file can exist next to that Makefile on a filesystem. So if you only change the input file itself, the cache works. Yet you may recompile the same a.exe multiple times in different configurations, where caching doesn't work:

  • Compile with make CFLAGS=-O2 to create a.exe
  • Force a recompilation with touch *.c; make CFLAGS=-O3
  • Force a recompilation with touch *.c; make CFLAGS=-O2

Because only one a.exe exists on the filesystem at any time, you cannot cache multiple versions of a.exe built with different compiler flags. Therefore you must re-compile and re-link a.exe in step 3, even though it was previously done already in step 1, because it got overwritten in step 2. To have multiple copies of the exe file, you need to modify the build system to use a build directory, or some other prefix on the filename to disambiguate them, and also introduce some mechanism to create it:

OUTDIR := out

SRCS := $(shell find . -type f -iname '*.c')
OBJS := $(patsubst %.c,$(OUTDIR)/%.o,$(SRCS))

$(OUTDIR)/%.o: %.c
	mkdir -p $(@D) # do this as a hack
	$(CC) -c $(CFLAGS) -o $@ $<

$(OUTDIR)/a.exe: $(OBJS)
	$(CC) -o $@ $^

test: $(OUTDIR)/a.exe
	$(OUTDIR)/a.exe

.PHONY: test

Now we can run make CFLAGS=-O3 OUTDIR=out-O3 to separate the output directories. This introduces a prefix: in short, a kind of "prefix sharding" (by directory) of the keyspace (the filesystem). But this problem quickly proliferates when you have many combinations of options to mix and match:

  • Optimization flags
  • Sanitizers: thread, address, undefined
  • Fuzzing builds
  • Compilers (gcc vs clang)
  • Linkers (lld vs bfd)

It may be important to test all these configurations, and different combinations of them too. For example, you can combine address sanitizer and fuzzing, and -O3, but not with gcc. The space of potentially valid configurations is a subset of the cartesian product of all these options.
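To get a feel for how quickly that product grows, here is a tiny Python sketch. The option values and the single validity rule ("fuzzing builds need clang") are assumptions I picked for illustration; your real constraints will differ.

```python
from itertools import product

# Hypothetical option axes, loosely following the list above.
OPT_LEVELS = ["-O2", "-O3"]
SANITIZERS = ["none", "address", "thread", "undefined"]
FUZZING = [False, True]
COMPILERS = ["gcc", "clang"]
LINKERS = ["bfd", "lld"]


def is_valid(opt, sanitizer, fuzzing, compiler, linker):
    """Illustrative constraint only: pretend fuzzing builds require clang."""
    return not (fuzzing and compiler == "gcc")


ALL_CONFIGS = list(product(OPT_LEVELS, SANITIZERS, FUZZING, COMPILERS, LINKERS))
VALID_CONFIGS = [cfg for cfg in ALL_CONFIGS if is_valid(*cfg)]
print(len(ALL_CONFIGS), len(VALID_CONFIGS))
```

Even with these five small axes there are 64 combinations, 48 of them valid under the made-up rule, and each one would need its own output directory in the Makefile approach.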

By abusing the filesystem path as a cache key, the granularity of the cache is limited by how you construct file paths. In the above example, $(OUTDIR) had to be threaded through every rule. You'll also need to add special cases for all the sanitizer features, if you want those builds to coexist and not clobber each other.

By this metric, the above Makefile is still not foolproof, because it requires both CFLAGS and OUTDIR to be set by the user and doesn't stop you from overwriting entries in the "cache". You would also have to record CFLAGS into a file for it to be tracked by make properly, and then attach the CFLAGS to the OUTDIR path somehow. This is a lot of work to get granular caching at the level of these changes.

This circles back to the first point mentioned about make's design: the CFLAGS and LDFLAGS and compiler, and sanitizer mode — all those related variables are really inputs to the command, along with the .c file. But Make only recognizes files as inputs and only tracks files. Therefore we have to encode strings into filenames in order to abuse them as part of the cache key.
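To illustrate the idea that flags are inputs just like files, here is a hedged Python sketch of a cache key derived from everything that influences a command's output. The function name, the JSON encoding, and the dictionary standing in for a cache are all assumptions of mine, not a description of any real build tool.

```python
import hashlib
import json


def cache_key(command, flags, input_digests):
    """Hash everything that influences the output: tool, flags, and inputs."""
    payload = json.dumps(
        {
            "command": command,
            "flags": list(flags),                     # order-sensitive on purpose
            "inputs": sorted(input_digests.items()),  # path -> content digest
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# A plain dict stands in for the cache; both variants can live side by side.
cache = {}
inputs = {"main.c": hashlib.sha256(b"int main(void) { return 0; }").hexdigest()}
cache[cache_key("cc", ["-O2"], inputs)] = "a.exe built with -O2"
cache[cache_key("cc", ["-O3"], inputs)] = "a.exe built with -O3"
print(len(cache))
```

Because the flags are part of the key rather than part of a file path, switching from -O2 to -O3 and back never overwrites anything; the -O2 result is still there to be found.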

So how can we improve on this? Is there a better way to record results and check up-to-dateness? And can we improve on the conflation between "files" and "inputs"? We can.

Make, as a language

But first, I want to return to this Makefile:

OUTDIR := out

SRCS := $(shell find . -type f -iname '*.c')
OBJS := $(patsubst %.c,$(OUTDIR)/%.o,$(SRCS))

$(OUTDIR)/%.o: %.c
	mkdir -p $(@D) # do this as a hack
	$(CC) -c $(CFLAGS) -o $@ $<

$(OUTDIR)/a.exe: $(OBJS)
	$(CC) -o $@ $^

test: $(OUTDIR)/a.exe
	$(OUTDIR)/a.exe

.PHONY: test

This file is fine, but if we could experiment a bit... what would something better look like? For example, imagine you want to let the user say make ASAN=1 to enable address sanitizer. Something like this would work:

OUTDIR := out

SRCS := $(shell find . -type f -iname '*.c')
OBJS := $(patsubst %.c,$(OUTDIR)/%.o,$(SRCS))

ifdef ASAN
CFLAGS += -fsanitize=address
endif

$(OUTDIR)/%.o: %.c
	mkdir -p $(@D) # do this as a hack
	$(CC) -c $(CFLAGS) -o $@ $<

$(OUTDIR)/a.exe: $(OBJS)
	$(CC) -o $@ $^

test: $(OUTDIR)/a.exe
	$(OUTDIR)/a.exe

.PHONY: test

This modifies the CFLAGS variable based on whether the ASAN variable (a new input) is defined. But this approach reuses poorly: introducing new variables, composing them, or excluding invalid feature combinations all require ever longer chains of ifdefs.

More importantly, it doesn't separate concerns. If I just write C code, I have to go modify the build system, or structure it in such a way that the user can pass extra flags. I, as an author of the Makefile, need to have knowledge that this is something users will do.

But what if there was another way? Instead of having a chain of ifdef calls for many things, maybe reusability could come another way?

For the sake of exposition, let's imagine we could instead take a make rule like this

A: B C D
	cmd B C D > A

and abstract it out. For example, we could give it a name like a function, and then have placeholders for the parameters. Let's imagine a hypothetical syntax like the following for defining these "functions":

func do-cmd-rule(out, inputs)
  $out: $inputs
    cmd $inputs > $out

To create the equivalent of the original code, we just apply this function to concrete arguments:

func do-cmd-rule(out, inputs)
  $out: $inputs
    cmd $inputs > $out

do-cmd-rule(A, B C D)

This is hypothetical, but the intent is roughly clear. The call to the function instantiates the rule or "expands" it, as you'd expect. This looks like a lot of work for nothing, but it dramatically helps us reuse code almost immediately.

For example, we can completely abstract out the cflags rule from earlier:

func do-cc(out, input, cflags)
  $out: $input
    $(CC) -c $cflags -o $out $input

do-cc(%.o, %.c, $(CFLAGS))

We can imagine the pattern rules %.o working in a similar way to the original Makefile; it describes a set of files that match. But if we're adding things, why stop at these functions? We could use real for loops too!

SRCS=$(shell find . -type f -iname '*.c')

func do-cc(...)

for x in $(SRCS):
  do-cc(patsubst(%.c, %.o, $x), $x, $(CFLAGS))

We've moved away from the purely declarative nature of the Makefile with this. After all, pattern matching with the declarative make DSL is nice, but this adds a lot of flexibility. For example, we could just add another pattern with a prefix to the name for ASAN builds. Let's add variables too, why not?:

for x in $(SRCS):
  var opath = patsubst(%.c, %.o, $x)
  do-cc($opath, $x, $(CFLAGS))
  do-cc(asan-$opath, $x, $(CFLAGS) -fsanitize=address)

In short, we declare two rules for every object file, one named x.o and one named asan-x.o, and these are compiled with the same flags, but one includes -fsanitize=address. (You could also imagine an equivalent with the % pattern to this example, of course.)

This is actually a huge, huge step up, because no modification to any ambient state needs to happen. More importantly: this was implemented with no change to do-cc at all, where the original solution had to modify CFLAGS. It's not always clear under what conditions mutating CFLAGS is correct. Here, nothing is mutated at all, and passing a new argument is easy.

Now let's go one final step further and just separate the functions and this part of the code into two files. Most of your Makefiles could then look like this:


SRCS := shell(find . -type f -iname '*.c')
OBJS := patsubst(%.c, %.o, $(SRCS))

for x in $SRCS:
  var opath = patsubst(%.c, %.o, $x)
  do-cc($opath, $x, $(CFLAGS))

do-link(a.exe, $(OBJS))

But then why stop there? Let's just combine do-cc and do-link into one, and then why even bother making the SRCS variable when we can just pass it in:

func do-exe(out, srcs)
  for x in $srcs:
    var opath = patsubst(%.c, %.o, $x)
    do-cc($opath, $x, $(CFLAGS))

  do-link($out, patsubst(%.c, %.o, $srcs))

do-exe(a.exe, foo.c bar.c main.c)

This is a huge improvement! Why? Because we've separated the definition of the rule body (the function) from the definition of the rule (applying a function to arguments). This is key to abstracting builds at scale, because the implementation of do-link or do-cc or do-exe isn't revealed to any user.
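Here is roughly the same idea in an ordinary language, as a sketch. The Rule type and the Python functions do_cc, do_link and do_exe are my own stand-ins for the hypothetical syntax above; they only describe rules as data and don't execute anything.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    output: str
    inputs: tuple
    command: tuple


def do_cc(out, src, cflags):
    """One compile rule; flags arrive as an explicit argument, not ambient state."""
    return Rule(out, (src,), ("cc", "-c", *cflags, "-o", out, src))


def do_link(out, objs, ldflags=()):
    """One link rule over all object files."""
    return Rule(out, tuple(objs), ("cc", *ldflags, "-o", out, *objs))


def do_exe(out, srcs, cflags=(), ldflags=(), obj_prefix=""):
    """Compile every source and link the objects; callers never see the steps."""
    objs = [obj_prefix + os.path.splitext(src)[0] + ".o" for src in srcs]
    rules = [do_cc(obj, src, cflags) for obj, src in zip(objs, srcs)]
    rules.append(do_link(out, objs, ldflags))
    return rules


SRCS = ["foo.c", "bar.c", "main.c"]
plain = do_exe("a.exe", SRCS, cflags=["-O2"])
asan = do_exe(
    "asan-a.exe",
    SRCS,
    cflags=["-O2", "-fsanitize=address"],
    ldflags=["-fsanitize=address"],
    obj_prefix="asan-",
)
```

The caller of do_exe gets a second, non-clobbering ASAN build purely by passing different arguments; nothing inside do_cc or do_link had to change, and no shared variable was mutated.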

In contrast, the original Makefile, while brief, exposed many implementation details in the body of a rule. Brevity is actually an advantage in some ways, but when builds get large enough, it becomes a massive problem to value brevity over reuse and modularity.

Doesn't make have functions already? I mean, macros?

Yes, but hear me out: they're kind of bad for the purpose we want here, which is reusability.

First off, it isn't clear which make you're talking about, though in practice everybody means "GNU Make" when they ask something like this.

Second: they're bad at large scale. I say this with a lot of respect for make (again, more on that later). But they aren't a good unit of code reuse. Again, the goal is reuse. There are two basic reasons they aren't so hot at this:

  • Because make mostly values brevity over readability, they have poor syntax, especially as you add more macros that call each other, and begin to deal with things like escaping rules. Shell is already very bad at this, and Make pushes it to new limits when you get far enough.
  • More importantly, they aren't hygienic, and Make has no notion of lexical scope. Therefore when you reference a $(VARIABLE) it is simply assumed to exist in the current scope (possibly having been defined somewhere else by a previous function), i.e. it uses a form of dynamic scoping. We've moved away from this in larger scale programs for pretty good reason: it is easy to make a mess of it!

The first one is bad, but the second one is a killer at large scale; it also feeds into the first problem, where the lack of scope and lexical hygiene really exacerbates the readability problems.
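A rough way to feel the difference, sketched in Python: below, a shared table of variables emulates Make-style globals, next to a function that takes its flags as a named parameter. This is only an emulation I made up for illustration (Python itself is lexically scoped), and the macro names are hypothetical.

```python
# Emulating Make-style global variables: every "macro" reads and writes one
# shared table, so what a macro sees depends on whoever ran before it.
VARS = {"CFLAGS": "-O2"}


def enable_asan_macro():
    """Mutates shared state; silently affects every later compile_macro call."""
    VARS["CFLAGS"] += " -fsanitize=address"


def compile_macro(src):
    """Reads CFLAGS from the ambient table; the caller cannot tell from here."""
    return f"cc -c {VARS['CFLAGS']} {src}"


def compile_fn(src, cflags):
    """Lexically scoped alternative: the flags are a named parameter."""
    return f"cc -c {cflags} {src}"
```

Calling compile_macro("a.c") before and after enable_asan_macro() gives two different commands for the same call, while compile_fn("a.c", "-O2") always means the same thing no matter what ran earlier.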

Here's a good example of what I'm talking about; the Glasgow Haskell Compiler for a very long time used a large, powerful GNU Make based build system. GHC is a large, multi-language project, and the compiler has to bootstrap itself, compile its runtime, and support many features (code coverage, profiling, dynamic linking). This puts a lot of demand on the build system; you have lots of rules with subtle dependencies, and lots of similar-but-not-quite-same rules. Here's a snippet of that code (from rules/ in GHC 8.0.x):

$$($1_$2_depfile_haskell) : $$($1_$2_HS_SRCS) $$($1_$2_HS_BOOT_SRCS) $$$$($1_$2_HC_MK_DEPEND_DEP) | $$$$(dir $$$$@)/.
	$$(call removeFiles,$$@.tmp)
ifneq "$$($1_$2_HS_SRCS)" ""
	"$$($1_$2_HC_MK_DEPEND)" -M \
	    $$($1_$2_$$(firstword $$($1_$2_WAYS))_MOST_DIR_HC_OPTS) \
	    $$($1_$2_MKDEPENDHS_FLAGS) \

This is part of a macro body for a macro named build-dependencies, and it is insane. And it isn't that way for no reason. A big part of it is just because Make doesn't have a concept of lexical scope that matches any mainstream programming language, or things like "parameters that have actual names". You can't even tell what $1 and $2 are in this context; their real names, hidden in comments, are $dir and $distdir, which would at least tell you something.

An incredible amount of effort is just spent on the semantics of variable expansion in various conditions. By the way, the above Makefile needs to work on Windows, too.

I used to work on the Glasgow Haskell Compiler as a core maintainer. The build system was easily the most impenetrable part of the whole codebase and daunting for every new and experienced contributor — for a production compiler that has academic research getting pumped into it year-round! There was even an entire design document laying out the ideas, but when the code you are dealing with is like this, things begin breaking down.

And before you say it: yes, we tried very very hard to avoid the above. But we in practice had multiple programming languages, half a dozen needed tools from manually invoked preprocessors, to code generators, to documentation tools — rich dependencies, hundreds of thousands of lines of code. You want to try and not have too many external dependencies so bootstrapping and porting is easier. You still need to support this thing for 2 more years because you have some users who are on ancient Linux machines but will upgrade after the support expiry. The list goes on and on.

"Don't have an insane build system" is a good goal to strive for. In practice, it is competing with 50 other goals. Nobody is writing 10,000 lines of GNU Make for fun, I can tell you that.

When you have a large enough system, you need primitives that allow code reuse. If you're going to add a unit of code reuse, the easiest one we know of is to add functions, and programmers generally have expectations about functions with respect to scope and name hygiene.

To end this section I will make my call to tool designers — please, if you are designing a DSL: do not repeat this mistake. Just add functions and function calls, at minimum. Maybe you need more than this, and maybe you even need a full blown language. But please, I am begging you, at least have real functions with a real notion of lexical scope.


More to be written soon.

