Last active June 14, 2024 00:43
stumbling through buck2

Eric's Notes: Stumbling through Buck2

I tried to use Buck2 today, inspired by a tweet by @dtolnay.

I've never been especially fond of build systems. I've only written a couple of Bazel files by following existing patterns, and I've only used build systems when I absolutely had to. They felt like a necessary but annoying incantation. But this avoidance has caused me some discomfort in the past, especially on larger and multi-language projects. I'm imagining that fancy build systems may help:

  • speed up builds (reduce redundant and slow steps)
  • make environments easier to reproduce (less “it works on my machine!”)
  • standardize processes across a company (less language-specific tools)

Buck2 is, in theory, based on academic literature and years of experience in industry. I might try writing some multi-language and library builds today and take notes? In the ethos of learning and exchanging ideas, maybe algorithmic ideas from Buck2 could apply to other problems in computing.

First impressions

  • Introduction page has you run the tool and doesn't provide any fancy hand-holding. Intended for experts I guess (they have the binary as a zstd file!). It worked and seems pretty nice.
  • I don't know much about build systems but can still use it just like I do with Bazel projects.
  • There's a fair amount of cxx_library, cxx_binary boilerplate, but that's expected.
  • I don't know which rules are built in, but the prelude has a heck of a lot of stuff defined. The source is still inscrutable to me right now; I'll come back to it maybe. Why does the prelude need to be vendored in a submodule?
  • Given there's so much prelude magic happening, is it even practical to write your own Buck integrations for new languages? Unclear right now.
  • Target syntax and so on are similar to Bazel. Also the rules are written in Starlark, .bzl files.

How is Buck2 different from or better than Bazel?

  • ?? Hard to say, searching on Google didn't help. Still digging. Maybe it's obvious, and I don't know enough about Buck2 to understand the nuances yet?
  • Written in Rust instead of Java (Bazel's core is Java), but does that matter?
  • What makes it hermetic? Oh, it's not, that's fine.
    • "Meta uses an internal version of remote execution with builds always hooked up to remote execution. The open-source binding, which uses Buck2 without remote execution, may be less polished." — oof, makes sense :/
  • Oh, I'm looking at the original announcement now. Differences:
    • All language-specific rules are outside of core
    • "Single incremental dependency graph" (?)
    • Release of high-quality internal rules
    • Remote execution and virtual file systems; these seem more helpful for very large monorepo projects though.
    • Greater parallelism and speed than Buck1 (but what about vs Bazel?)
    • Console output format is prettier
    • Really strong primitives (dynamic_output, tsets) make the tool more flexible and therefore, hopefully, mean less boilerplate for specifying dependencies.
  • "We expect Buck2 will be most interesting to moderately sized multi-language projects." Hmm that's good, sounds like work? Maybe?
  • I want to nerd out about the primitives they mention. They're right up my alley for the PL work I did with Datalog, CFGs, incremental computation and so on. Will save this in the back of my mind for later. Let's actually use the tool first.
  • There's a paper, Build Systems à la carte. Reading for later. It seems like an important paper: it classifies build systems by their underlying principles, which gives a framework for talking about them. Good to know.
    • They mention Excel, which is really funny. All the best engineers seem to really admire Excel for what it does well.
    • That paper influenced other systems like Pants too.
    • This style of build system seems to be the meta these days. I guess it makes sense, with a very particular problem and group of experts who need control. Interface doesn't need to be the nicest because the companies have internal controls & dedicated experts.
  • Read a bunch of Reddit AMA answers. Lots of opinions, work in progress. I don't think I know enough to judge anything from this.
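
As a rough illustration of why a primitive like tsets (transitive sets) might cut down on work, here is a toy Python model. It is not Buck2's API: the `TSet` class, its `traverse` method, and the link flags are all made up for this sketch. The idea it tries to capture is that each node stores only its own value plus references to its children, so shared dependencies are stored once and visited once instead of being copied into every parent's flattened list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TSet:
    """Toy transitive set: one value plus references to child sets."""

    value: str
    children: tuple["TSet", ...] = ()

    def traverse(self):
        """Yield each value once, visiting a node before its children."""
        seen = set()
        stack = [self]
        while stack:
            node = stack.pop()
            if id(node) in seen:
                continue  # shared child already emitted
            seen.add(id(node))
            yield node.value
            # Reverse so the first child is visited first.
            stack.extend(reversed(node.children))


# Hypothetical link flags for a diamond-shaped dependency graph.
base = TSet("-lbase")
left = TSet("-lleft", (base,))
right = TSet("-lright", (base,))
app = TSet("-lapp", (left, right))

flags = list(app.traverse())
print(flags)
```

Even though `base` is reachable through both `left` and `right`, it appears once in the traversal, and neither `left` nor `right` had to hold a copy of a flattened list.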

More examples

  • The with_prelude examples are minimal starters for different languages. For Rust, there is no Cargo dependency support; you need a separate tool called reindeer to generate the BUCK files.
  • Buildbarn remote execution seems fun. There are other remote execution demos. Seems to involve a bit of more advanced setup though.
  • The most important thing to make Buck runnable in a work setting is build caching. I should be able to check if S3 caching would support that. It would speed things up a lot! The paper about build systems frames the goal as minimality (never redo work whose inputs haven't changed), and a shared cache extends that across machines.
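
To make the caching idea concrete, here is a minimal sketch of a content-addressed action cache in Python. Everything in it is an assumption for illustration: `action_key`, `CachedRunner`, and `fake_compile` are hypothetical names, the plain dict stands in for a shared store such as an S3 bucket, and this is not how Buck2 itself implements caching. The point is only that the key is derived from the command and the contents of its inputs, so a second machine with identical inputs can reuse the first machine's output.

```python
import hashlib


def action_key(command: str, inputs: dict[str, bytes]) -> str:
    """Digest the command plus every input's name and contents."""
    h = hashlib.sha256(command.encode())
    for name in sorted(inputs):
        h.update(name.encode())
        h.update(hashlib.sha256(inputs[name]).digest())
    return h.hexdigest()


class CachedRunner:
    """Runs an action only when its key is missing from the store."""

    def __init__(self, store: dict[str, bytes]):
        self.store = store  # stand-in for a shared bucket
        self.executions = 0

    def run(self, command, inputs, action):
        key = action_key(command, inputs)
        if key not in self.store:
            self.executions += 1
            self.store[key] = action(inputs)
        return self.store[key]


def fake_compile(inputs):
    # Pretend "compilation" just concatenates the sources.
    return b"".join(inputs[name] for name in sorted(inputs))


shared = {}
alice = CachedRunner(shared)
bob = CachedRunner(shared)

src = {"main.cpp": b"int main() { return 0; }"}
alice.run("c++ main.cpp", src, fake_compile)  # miss: alice executes
bob.run("c++ main.cpp", src, fake_compile)    # hit: reuses alice's output
changed = {"main.cpp": b"int main() { return 1; }"}
bob.run("c++ main.cpp", changed, fake_compile)  # miss: contents differ
```

A real system would also need to fold the toolchain version and environment into the key, which is part of why hermeticity matters for cache correctness.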

Writing a Buck2 project based on the guide

  • Okay, now creating a new directory from scratch and running init. Creates three things:
    • BUCK the main build file
    • prelude/** a submodule of the buck prelude
    • toolchains/BUCK what is this? it loads the system_genrule_toolchain rule from prelude (annotated @prelude) and uses that to produce a target named genrule
      • The prelude implementation is kind of magic. Don't know what it does.
      • See the definition of genrule for Bazel and genrule for Buck2. Okay, it sounds like the genrule toolchain is a customizable hook that affects the behavior of genrules?
      • The tutorial talks more about toolchains; you have to specify them here for each language to define how it gets built.
  • The tutorial went pretty smoothly. It compiled a C++ program. The program was kind of trivial though, and I'm left scratching my head — what else is there?
  • I read the cxx BUCK file and mostly understand it, though there's still boilerplate. The //third-party dependencies are all from another file generated by reindeer, and then they're specified manually again for every package inside the project.
  • I guess, if you don't have that many Rust crates, it shouldn't be a problem. Especially since Rust crates tend to be pretty big compilation units.
  • Reindeer is still kind of awkward for Rust projects though. You need to write a dummy Cargo.toml file with a weird hack like [lib] path="/dev/null" on Linux, and this manifest is only used for the dependencies list. Then the dependencies are specified again in the real crate's Cargo.toml and their BUCK files, making you repeat them 3 times.
  • I guess you could also just make the Rust compilation a genrule invoking Cargo, but I feel like you then lose the benefits of an incremental system like Buck. But maybe not! Maybe it's just fine if you specify the dependencies carefully. Who knows?
  • Worth noting that Bazel has cargo-raze, so a slightly better translation of Cargo projects into rules is possible. It's just a matter of implementation.

Original project

  • I think I'm ready to try something myself. Let's start with Rust / PyO3 + Python?
  • That should be a reasonable but small system that would be a little awkward to set up in a typical setting, since it involves multiple steps that aren't necessarily compatible with each other and a linking phase. But with Buck, is it any easier?
  • Lots of these build systems support many options and weird linking steps, but they ultimately produce a single binary or library artifact (think: Chromium, gVisor, anything named lib***, the Android kernel, etc.). This seems a bit different from the kind of problems in multi-language builds you do for a project that involves infrastructure and network deployments.
  • I am honestly a little bit worried that I might need to be a wizard to use Buck2. >:(
  • Okay fuck it. I'm going to try to expose triple_accel as a Python extension module via PyO3. This is the simplest case since the library is prebuilt, and it has no system dependencies other than Rust std.
  • If I'm understanding correctly, the steps are:
    1. Add pyo3 and triple_accel as dependencies in a dummy third-party dir.
    2. Use reindeer to vendor and buckify a third-party package with BUCK build file.
    3. Write a custom rule for pyo3 libraries (probably a genrule).
    4. Write a Python library target.
    5. Write a Python binary target.
  • Work log
    • Huh, I'm installing Reindeer with Cargo, even though it's for a build tool. I guess even the people making big monorepo build tools still use ordinary tools too.
    • Hm, I immediately ran into an issue on step 2. Running reindeer buckify gives a bunch of warnings because about 15 dependencies have build scripts, and reindeer says "I don't know what to do with it."
    • What to do. There's some incomplete docs about build scripts in the Reindeer README. I found a blog post from Steve Klabnik that mentions the issue, but this blog post promises a fix and doesn't offer it.
    • Ah, it looks like there's a fixups folder and a bunch of two-line TOML files that I need to add. Why are these necessary? Does it just run the buildscript? I'll try it.
    • That does indeed seem to fix the problem. I'll copy this TOML file once for each of the 15 dependencies with a build script, I guess. No more warnings. (!)
    • Ah. Tried running buck build and I felt super powerful for like 2 seconds. Oh no, it can't find the OUT_DIR environment variable that cargo sets.
    • Searched the Buck2 repository, using Buck2, for OUT_DIR and found something for tonic.
    • Okay, now I know how I would manually edit rust_library to add build environment variables. I cannot find a definition or docs for the fixups.toml file though, so I'm reading the source code of Reindeer, which parses it with serde. The buildscript.rustc_fixup thing is one of a number of enum options.
    • After reading the source code, I added an [env] fixup. It worked as expected and added env to the generated rust_library and rust_binary Buck2 rules. But it still doesn't build, because the $(location :target-lexicon-0.12.7) spec here causes an infinite loop when resolving…
    • There's some fbsource// code in Buck that looks internal to Meta.
    • Okay I'm still trying to figure out how to get the right value for OUT_DIR here, or why it wasn't set automatically in the first place by reindeer. The location macro keeps giving cyclic references no matter where I put it.
    • Oh wow, there's a cargo_env = true option, which is also undocumented. I only found the option from the source code, guessing that it would help. It does add some environment variables but not OUT_DIR still.
    • Okay it's been over a couple hours, I think I'm calling it quits today.
  • This is a bit unfortunate. I thought the hard part of this project would be getting Rust to compile to a shared object with PyO3 and import that from Python in the right way, which requires keeping track of where output artifacts live and so on as a multi-language build. Instead, I had to go through a dozen different steps, which generated thousands of lines of configuration, and couldn't even get one of the Rust dependencies target-lexicon to compile at the end of it. I didn't write a single line of code during the process.
  • Maybe this would be better in the future when Buck2 is better documented? Not sure. Bazel also looks like its support is pretty iffy, at best.
  • It kind of sucks: setting up builds for a project was so difficult that I wasn't able to try using Buck2 and observe how its incrementality systems, remote building, and workflow feel. Maybe I'll just read the Build Systems à la carte paper in the future to get the conceptual insights without having to toil with configuration all the time.
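
For reference, the five planned steps and the cyclic-reference failure can both be pictured as a dependency graph. The sketch below uses Python's standard `graphlib` rather than anything from Buck2, and every target label in it (`//third-party:pyo3`, `//ext:triple_accel_py`, and so on) is a hypothetical name chosen for illustration. The second half assumes that the cycle arose because the target's environment referred to the target's own output, which is how the work log above describes it.

```python
from graphlib import CycleError, TopologicalSorter

# Each hypothetical target maps to the targets it depends on.
planned = {
    "//third-party:pyo3": set(),
    "//third-party:triple_accel": set(),
    "//ext:triple_accel_py": {
        "//third-party:pyo3",
        "//third-party:triple_accel",
    },
    "//py:lib": {"//ext:triple_accel_py"},
    "//py:main": {"//py:lib"},
}

# A valid build order lists every target after its dependencies.
order = list(TopologicalSorter(planned).static_order())
print(order)

# A target whose env refers to its own output depends on itself,
# so no valid build order exists.
broken = {"//third-party:target-lexicon": {"//third-party:target-lexicon"}}
try:
    list(TopologicalSorter(broken).static_order())
    cycle_found = False
except CycleError:
    cycle_found = True
print("cycle detected:", cycle_found)
```

Seen this way, moving the location macro around inside the same target cannot help; the reference would have to point at a separate target (for example, one that runs the build script) for the graph to stay acyclic.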

dtolnay commented Jun 20, 2023

If the build script generates sources, you would need a gen_srcs fixup.

# fixups/target-lexicon/fixups.toml



ekzhang commented Jun 22, 2023

@dtolnay Thanks!
