@ekzhang
Last active February 27, 2024 07:14
stumbling through buck2

Eric's Notes: Stumbling through Buck2

I tried to use Buck2 today, inspired by a tweet by @dtolnay.

I've never been especially fond of build systems. I've only written a couple of Bazel files by following existing patterns, and only used them when I absolutely had to. They felt like a necessary but annoying incantation. But I have noticed some discomfort with this in the past, especially on larger, multi-language projects. I'm imagining that fancy build systems may help:

  • speed up builds (reduce redundant and slow steps)
  • make environments easier to reproduce (less “it works on my machine!”)
  • standardize processes across a company (less language-specific tools)

Buck2 is, in theory, based on academic literature and years of experience in industry. I might try writing some multi-language and library builds today and take notes? In the ethos of learning and exchanging ideas, maybe algorithmic ideas from Buck2 could apply to other problems in computing.

First impressions

  • Introduction page has you run the tool and doesn't provide any fancy hand-holding. Intended for experts I guess (they have the binary as a zstd file!). It worked and seems pretty nice.
  • I don't know much about build systems but can still use it just like I do with Bazel projects.
  • There's a fair amount of cxx_library, cxx_binary boilerplate, but that's expected.
  • I don't know what rules are builtin, but the prelude has a heck of a lot of stuff defined. Reading the source is still inscrutable right now, will come back to it maybe. Why does prelude need to be vendored in a submodule?
  • Given there's so much prelude magic happening, is it even practical to write your own Buck integrations for new languages? Unclear right now.
  • Target syntax and so on are similar to Bazel. Also the rules are written in Starlark, .bzl files.

How is Buck2 different from, or better than, Bazel?

  • ?? Hard to say, searching on Google didn't help. Still digging. Maybe it's obvious, and I don't know enough about Buck2 to understand the nuances yet?
  • Written in Rust instead of Java (Bazel and Buck1 are both mostly Java), but does that matter?
  • What makes it hermetic? Oh, it's not, that's fine.
    • "Meta uses an internal version of remote execution with builds always hooked up to remote execution. The open-source binding, which uses Buck2 without remote execution, may be less polished." — oof, makes sense :/
  • Oh, I'm looking at the original announcement now. Differences:
    • All language-specific rules are outside of core
    • "Single incremental dependency graph" (?)
    • Release of high-quality internal rules
    • Remote execution and virtual file systems; these seem more helpful for very large monorepo projects though.
    • Greater parallelism and speed than Buck1 (but what about vs Bazel?)
    • Console output format is prettier
    • Really strong primitives (dynamic_output, tsets) make the tool more flexible, and therefore, hopefully, mean less boilerplate for specifying dependencies.
  • "We expect Buck2 will be most interesting to moderately sized multi-language projects." Hmm that's good, sounds like work? Maybe?
  • I want to nerd out about the primitives they mention. They're right up my alley for the PL work I did with Datalog, CFGs, incremental computation and so on. Will save this in the back of my mind for later. Let's actually use the tool first.
  • There's a paper, Build Systems à la Carte. Reading for later. It seems like an important paper: it classifies build systems by the principles they rest on, which gives a framework for talking about them, so that's good to know.
    • They mention Excel, which is really funny. All the best engineers seem to really admire Excel for what it does well.
    • That paper influenced other systems like Pants too.
    • This style of build system seems to be the meta these days. I guess it makes sense, with a very particular problem and group of experts who need control. Interface doesn't need to be the nicest because the companies have internal controls & dedicated experts.
  • Read a bunch of Reddit AMA answers. Lots of opinions, work in progress. I don't think I know enough to judge anything from this.
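
To make the "single incremental dependency graph" idea above a bit more concrete, here is a minimal sketch in Python. It is a toy written for these notes, not Buck2's actual design: the Graph class, the digest helper, and the file names are all made up for illustration. Each rule remembers the content hashes of its inputs and only reruns when one of them changes, so an edit that leaves an intermediate output identical stops the rebuild early.

```python
import hashlib
from typing import Callable, Dict, List, Tuple


def digest(value: str) -> str:
    """Content hash used to decide whether an input really changed."""
    return hashlib.sha256(value.encode()).hexdigest()


class Graph:
    """Toy incremental build graph; illustrative only, not Buck2's design."""

    def __init__(self) -> None:
        self.sources: Dict[str, str] = {}
        self.rules: Dict[str, Tuple[List[str], Callable[..., str]]] = {}
        # target -> (digests of its inputs at the last run, cached output)
        self.memo: Dict[str, Tuple[Tuple[str, ...], str]] = {}
        self.executed: List[str] = []  # log of rules that actually ran

    def set_source(self, name: str, content: str) -> None:
        self.sources[name] = content

    def rule(self, name: str, deps: List[str], fn: Callable[..., str]) -> None:
        self.rules[name] = (deps, fn)

    def build(self, target: str) -> str:
        if target in self.sources:
            return self.sources[target]
        deps, fn = self.rules[target]
        values = [self.build(dep) for dep in deps]
        key = tuple(digest(v) for v in values)
        cached = self.memo.get(target)
        if cached is not None and cached[0] == key:
            return cached[1]  # inputs unchanged: skip the rule entirely
        output = fn(*values)
        self.executed.append(target)
        self.memo[target] = (key, output)
        return output


graph = Graph()
graph.set_source("a.c", "int a;")
graph.set_source("b.c", "int b;")
graph.rule("a.o", ["a.c"], lambda src: "obj(" + src.strip() + ")")
graph.rule("b.o", ["b.c"], lambda src: "obj(" + src.strip() + ")")
graph.rule("app", ["a.o", "b.o"], lambda a, b: "link(" + a + "," + b + ")")

graph.build("app")                   # first build runs a.o, b.o, app
graph.executed.clear()
graph.set_source("a.c", "int a;  ")  # whitespace-only edit
graph.build("app")                   # reruns a.o only; app is skipped
```

The second build reruns a.o because its source changed, but the "object file" comes out identical, so app is never relinked. Real systems have to do the same bookkeeping across processes, files, and machines, which is presumably where the hard parts are.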

More examples

  • The with_prelude examples are minimal starters for different languages. For Rust, there is no Cargo dependency support; you need a separate tool called reindeer to generate the BUCK files.
  • Buildbarn remote execution seems fun. There are other remote execution demos. Seems to involve a bit of more advanced setup though.
  • The most important thing to make Buck runnable in a work setting is build caching. I should be able to check if S3 caching would support that. It would speed things up a lot! The paper about build systems has a name for the related property of never rerunning a task whose inputs haven't changed: minimality.
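
As a rough picture of what build caching means here, the sketch below keys each build action by a hash of its command line plus the content hashes of its inputs, and reuses the stored output on a match. Everything in it is an assumption for illustration: ActionCache, action_key, and the plain dict standing in for a shared store such as an S3 bucket are hypothetical names, and this is not the protocol Buck2 actually speaks.

```python
import hashlib
import json
from typing import Callable, Dict, List, Optional

Executor = Callable[[List[str], Dict[str, bytes]], bytes]


def action_key(command: List[str], inputs: Dict[str, bytes]) -> str:
    """Hash the command plus the content digest of every input file."""
    payload = {
        "command": command,
        "inputs": {
            path: hashlib.sha256(content).hexdigest()
            for path, content in inputs.items()
        },
    }
    encoded = json.dumps(payload, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


class ActionCache:
    """Toy content-addressed cache; `store` stands in for shared storage."""

    def __init__(self, store: Optional[Dict[str, bytes]] = None) -> None:
        # An explicit None check so that an empty shared dict stays shared.
        self.store: Dict[str, bytes] = store if store is not None else {}
        self.hits = 0
        self.misses = 0

    def run(
        self, command: List[str], inputs: Dict[str, bytes], execute: Executor
    ) -> bytes:
        key = action_key(command, inputs)
        if key in self.store:
            self.hits += 1
            return self.store[key]
        self.misses += 1
        output = execute(command, inputs)
        self.store[key] = output
        return output


def fake_compile(command: List[str], inputs: Dict[str, bytes]) -> bytes:
    """Hypothetical stand-in for an expensive compiler invocation."""
    return b"object:" + b"".join(inputs[p] for p in sorted(inputs))


shared_store: Dict[str, bytes] = {}
ci_machine = ActionCache(shared_store)
laptop = ActionCache(shared_store)

sources = {"main.c": b"int main() { return 0; }"}
ci_machine.run(["cc", "-c", "main.c"], sources, fake_compile)  # miss: runs it
laptop.run(["cc", "-c", "main.c"], sources, fake_compile)      # hit: reused
```

The cache only pays off if actions are deterministic and declare all of their inputs, which is why the hermeticity question from earlier matters for caching too.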

Writing a Buck2 project based on the guide

  • Okay, now creating a new directory from scratch and running init. Creates three things:
    • BUCK the main build file
    • prelude/** a submodule of the buck prelude
    • toolchains/BUCK what is this? it loads the system_genrule_toolchain rule from prelude (annotated @prelude) and uses that to produce a target named genrule
      • The prelude implementation is kind of magic. Don't know what it does.
      • See the definition of genrule for Bazel and genrule for Buck2. Okay, it sounds like the genrule toolchain is a customizable hook that affects the behavior of genrules?
      • The tutorial talks more about toolchains; you have to specify them here for each language to define how it gets built.
  • The tutorial went pretty smoothly. It compiled a C++ program. The program was kind of trivial though, and I'm left scratching my head — what else is there?
  • Read the cxx BUCK file and mostly understood it, though there's still boilerplate. The //third-party dependencies all come from another file generated by reindeer, and then they're specified manually again for every package inside the project.
  • I guess, if you don't have that many Rust crates, it shouldn't be a problem. Especially since Rust crates tend to be pretty big compilation units.
  • Reindeer is still kind of awkward for Rust projects though. You need to write a dummy Cargo.toml file with a weird hack like [lib] path="/dev/null" on Linux, and this manifest is only used for the dependencies list. Then the dependencies are specified again in the real crate's Cargo.toml and their BUCK files, making you repeat them 3 times.
  • I guess you could also just make the Rust compilation a genrule invoking Cargo, but I feel like you then lose the benefits of an incremental system like Buck. But maybe not! Maybe it's just fine if you specify the dependencies carefully. Who knows?
  • Worth noting that Bazel has cargo-raze, so it is possible to get a slightly better translation of Cargo projects into rules. Just a matter of implementation.

Original project

  • I think I'm ready to try something myself. Let's start with Rust / PyO3 + Python?
  • That should be a reasonable but small system that would be a little awkward to set up in a typical setting, since it involves multiple steps that aren't necessarily compatible with each other and a linking phase. But with Buck, is it any easier?
  • Lots of these build systems support many options and weird linking steps, but they ultimately produce a single binary or library artifact (think: Chromium, gVisor, anything named lib***, the Android kernel, etc.). This seems a bit different from the kind of problems in multi-language builds you do for a project that involves infrastructure and network deployments.
  • I am honestly a little bit worried that I might need to be a wizard to use Buck2. >:(
  • Okay fuck it. I'm going to try to expose triple_accel as a Python extension module via PyO3. This is the simplest case since the library is prebuilt, and it has no system dependencies other than Rust std.
  • If I'm understanding correctly, the steps are:
    1. Add pyo3 and triple_accel as dependencies in a dummy third-party dir.
    2. Use reindeer to vendor and buckify a third-party package with BUCK build file.
    3. Write a custom rule for pyo3 libraries (probably a genrule).
    4. Write a Python library target.
    5. Write a Python binary target.
  • Work log
    • Huh, I'm installing Reindeer with Cargo, even though it's for a build tool. I guess even the people making big monorepo build tools still use ordinary tools too.
    • Hm, immediately I ran into an issue on step 2. Running reindeer buckify gives a bunch of warnings because about 15 dependencies have build.rs scripts, and reindeer says "I don't know what to do with it."
    • What to do? There are some incomplete docs about build scripts in the Reindeer README. I found a blog post from Steve Klabnik that mentions the issue, but it promises a fix without actually offering one.
    • Ah, it looks like there's a fixups folder and a bunch of two-line TOML files that I need to add. Why are these necessary? Does it just run the buildscript? I'll try it.
    • That does indeed seem to fix the problem. I'll copy this TOML file 15 times for each dependency with a build script, I guess. No more warnings. (!)
    • Ah. Tried running buck build and I felt super powerful for like 2 seconds. Oh no, it can't find the OUT_DIR environment variable that cargo sets.
    • Searched the Buck2 repository, using Buck2, for OUT_DIR and found something for tonic.
    • Okay, now I know how I would manually edit rust_library to add build environment variables. I cannot find a definition or docs for the fixups.toml file though, so I'm reading the source code of Reindeer, which parses it with serde. The buildscript.rustc_fixup thing is one of a number of enum options.
    • After reading the source code, I added an [env] fixup. It worked as expected and added env to the generated rust_library and rust_binary Buck2 rules. But then it's not working because the (location :target-lexicon-0.12.7) spec here is causing an infinite loop when resolving…
    • There's some fbsource// code in Buck that looks internal to Meta.
    • Okay I'm still trying to figure out how to get the right value for OUT_DIR here, or why it wasn't set automatically in the first place by reindeer. The location macro keeps giving cyclic references no matter where I put it.
    • Oh wow, there's a cargo_env = true option, which is also undocumented. I only found the option from the source code, guessing that it would help. It does add some environment variables but not OUT_DIR still.
    • Okay it's been over a couple hours, I think I'm calling it quits today.
  • This is a bit unfortunate. I thought the hard part of this project would be getting Rust to compile to a shared object with PyO3 and import that from Python in the right way, which requires keeping track of where output artifacts live and so on as a multi-language build. Instead, I had to go through a dozen different steps, which generated thousands of lines of configuration, and couldn't even get one of the Rust dependencies target-lexicon to compile at the end of it. I didn't write a single line of code during the process.
  • Maybe this would be better in the future when Buck2 is better documented? Not sure. Bazel also looks like its build.rs support is pretty iffy, at best.
  • It kind of sucks: setting up builds for a project was so difficult that I never got to try using Buck2 and observe how its incrementality systems, remote building, and workflow feel. Maybe I'll just read the Build Systems à la carte paper in the future to get the conceptual insights without having to toil with configuration all the time.
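
The "copy this TOML file 15 times" step from the work log could be scripted. Below is a small sketch under a few assumptions: the fixups/<crate>/fixups.toml layout and the default template are taken from the gen_srcs suggestion in the reply under these notes, the write_fixups helper is a made-up name, and the crate list is whatever reindeer buckify warned about. Swap in whichever two-line fixup actually works for a given crate.

```python
from pathlib import Path
from typing import Iterable, List

# Assumed default: the gen_srcs fixup quoted in the comment thread.
FIXUP_TEMPLATE = "[[buildscript]]\n[buildscript.gen_srcs]\n"


def write_fixups(
    root: Path, crates: Iterable[str], template: str = FIXUP_TEMPLATE
) -> List[Path]:
    """Create fixups/<crate>/fixups.toml under `root` for each crate.

    Existing files are left alone so hand-edited fixups are not clobbered.
    Returns the paths that were newly written.
    """
    written: List[Path] = []
    for crate in crates:
        path = root / "fixups" / crate / "fixups.toml"
        if path.exists():
            continue
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(template)
        written.append(path)
    return written


if __name__ == "__main__":
    # Hypothetical usage: run from the third-party directory and list the
    # crates that produced build-script warnings.
    for created in write_fixups(Path("."), ["target-lexicon"]):
        print("wrote", created)
```

This only removes the copy-paste; it does not answer the question of which fixup each build script needs, which still seems to require reading the Reindeer source.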

dtolnay commented Jun 20, 2023

If the build script generates sources, you would need a gen_srcs fixup.

# fixups/target-lexicon/fixups.toml

[[buildscript]]
[buildscript.gen_srcs]

Author

ekzhang commented Jun 22, 2023

@dtolnay Thanks!
