@rlee287
Last active January 22, 2022 22:03
rustc build comparison notes
--- config.toml.example 2022-01-04 02:07:27.613408300 +0000
+++ config.toml 2022-01-04 02:15:02.840182900 +0000
@@ -248,14 +248,14 @@
#locked-deps = false
# Indicate whether the vendored sources are used for Rust dependencies or not
-#vendor = false
+vendor = true
# Typically the build system will build the Rust compiler twice. The second
# compiler, however, will simply use its own libraries to link against. If you
# would rather to perform a full bootstrap, compiling the compiler three times,
# then you can set this option to true. You shouldn't ever need to set this
# option to true.
-#full-bootstrap = false
+full-bootstrap = true
# Enable a build of the extended Rust tool set which is not only the compiler
# but also tools such as Cargo. This will also produce "combined installers"
@@ -468,7 +468,7 @@
# The "channel" for the Rust build to produce. The stable/beta channels only
# allow using stable features, whereas the nightly and dev channels allow using
# nightly features
-#channel = "dev"
+channel = "stable"
# A descriptive string to be appended to `rustc --version` output, which is
# also used in places like debuginfo `DW_AT_producer`. This may be useful for
@@ -593,12 +593,12 @@
# default value is platform specific, and if not specified it may also depend on
# what platform is crossing to what platform.
# See `src/bootstrap/cc_detect.rs` for details.
-#cc = "cc" (path)
+cc = "clang-12"
# C++ compiler to be used to compile C++ code (e.g. LLVM and our LLVM shims).
# This is only used for host targets.
# See `src/bootstrap/cc_detect.rs` for details.
-#cxx = "c++" (path)
+cxx = "clang++-12"
# Archiver to be used to assemble static libraries compiled from C/C++ code.
# Note: an absolute path should be used, otherwise LLVM build will break.
@@ -612,7 +612,7 @@
# default value is platform specific, and if not specified it may also depend on
# what platform is crossing to what platform.
# Setting this will override the `use-lld` option for Rust code when targeting MSVC.
-#linker = "cc" (path)
+linker = "clang-12"
# Path to the `llvm-config` binary of the installation of a custom LLVM to link
# against. Note that if this is specified we don't compile LLVM at all for this
#!/bin/sh
curl --proto '=https' --tlsv1.2 -sSf https://static.rust-lang.org/dist/rustc-1.54.0-src.tar.gz -o rustc-1.54.0-src.tar.gz
tar -xzf rustc-1.54.0-src.tar.gz
# Keep the clean source for mrustc (?)
cp -r rustc-1.54.0-src rust-1.54.0-build

Last attempted with commit df08818.

Uncategorized notes and questions

  • Ask about (and hopefully find) a way to build LLVM just once and share it between the rustc and mrustc builds.
  • Debug race conditions involving macro expansion when using make -j6 where minicargo.mk is involved
  • Use clang instead of gcc when building mrustc stuff
#!/bin/sh
cd rust-1.54.0-build
# Set up the config.toml
cp config.toml.example config.toml
patch config.toml ../config_toml.patch
# Actually build rustc stage2 and stage3
# Stop on failure so that stale or partial artifacts are never hashed
RUSTFLAGS='--remap-path-prefix /home/rustc_ddc/rust-1.54.0-build/build/x86_64-unknown-linux-gnu/stage1-rustc/=/home/rustc_ddc/rust-1.54.0-build/build/x86_64-unknown-linux-gnu/stage#-rustc/ --remap-path-prefix /home/rustc_ddc/rust-1.54.0-build/build/x86_64-unknown-linux-gnu/stage2-rustc/=/home/rustc_ddc/rust-1.54.0-build/build/x86_64-unknown-linux-gnu/stage#-rustc/' python3 x.py build --stage=3 || { echo "Failed to build rustc" >&2; exit 1; }
# Compute hashes for comparison
cd build/x86_64-unknown-linux-gnu
sha256sum stage2/bin/rustc stage2/lib/*.so > ~/sha256sums_stage2
sha256sum stage3/bin/rustc stage3/lib/*.so > ~/sha256sums_stage3
cd
echo "sha256sum stage2"
cat sha256sums_stage2
echo "sha256sum stage3"
cat sha256sums_stage3

Ultimate goal: perform a DDC verification of rustc using mrustc as the other compiler.

Summary of unrelated config.toml changes (i.e. not made as part of trying to get the bootstrap to work)

  • Vendor the dependencies, as mrustc needs this.
  • Set full-bootstrap = true, as part of verifying that stage2 and stage3 are actually the same.
  • Set channel = "stable" as the ideal is to build a stable 1.54.0 toolchain.
  • Set clang-12 and clang++-12 as the C and C++ compilers, and clang-12 as the linker.
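
A quick sanity check after patching can confirm that the settings listed above actually ended up in config.toml. This is only a sketch: the helper name `check_config` is made up for illustration, and the expected lines are copied from config_toml.patch, assuming they appear unindented and uncommented.

```shell
#!/bin/sh
# Sketch: confirm that config.toml contains the settings from
# config_toml.patch. check_config is a hypothetical helper name.

# check_config FILE: print each expected setting that is not present as an
# exact, uncommented line; return non-zero if any is missing.
check_config() {
    missing=0
    for setting in \
        'vendor = true' \
        'full-bootstrap = true' \
        'channel = "stable"' \
        'cc = "clang-12"' \
        'cxx = "clang++-12"' \
        'linker = "clang-12"'
    do
        # -F: fixed string, -x: match the whole line, -q: no output
        if ! grep -Fxq -- "$setting" "$1"; then
            echo "missing: $setting"
            missing=1
        fi
    done
    return "$missing"
}
```

Running `check_config config.toml` from inside rust-1.54.0-build prints nothing and returns 0 when every setting is present.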

Summary of related changes:

  • Tried setting remap-debuginfo = true in config.toml to avoid leaking build paths (even though all of this runs in a dedicated VM), then reverted it because it rewrites a different part of the path than the one that differs between stages.

Steps:

  1. Confirm the equivalence of stage2 and stage3 rustc builds, both to familiarize myself with the build system and to verify (as per the theoretical bootstrapping model) that the equivalence does happen. (In progress)
  2. Build mrustc, reusing the built LLVM from the rustc build system.
  3. Figure out the source of the differences (if any) between rustc 1.54.0 (from rustc 1.53.0) and rustc 1.54.0 (from mrustc master).

Build and comparison instructions

TODO: finish

  1. Copy download_src.sh, config_toml.patch and rustc_build_stage_3.sh into the home directory.
  2. Run download_src.sh to download and extract the source of Rust 1.54.0.
  3. Run rustc_build_stage_3.sh to build stage2 and stage3 and get the sha256sums of the files.
  4. Copy the built compilers out of the VM.
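
The two checksum lists produced in step 3 can be compared mechanically rather than by eye. The following is a minimal sketch, assuming the default `sha256sum` output format (hash, two spaces, path) and that the recorded paths differ only in their leading `stage2/` or `stage3/` component; the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch: compare the checksum lists written by rustc_build_stage_3.sh.
# normalize_sums and compare_stages are hypothetical helper names.

# normalize_sums FILE: print the "hash  path" lines with the stage
# directory replaced by a placeholder, sorted by path.
normalize_sums() {
    sed -E 's#  stage[0-9]+/#  stageN/#' "$1" | sort -k2
}

# compare_stages FILE_A FILE_B: print the differing lines (diff format);
# return 0 if the lists match and non-zero otherwise.
compare_stages() {
    a=$(mktemp) && b=$(mktemp) || return 2
    normalize_sums "$1" > "$a"
    normalize_sums "$2" > "$b"
    diff "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return "$status"
}
```

For example, `compare_stages ~/sha256sums_stage2 ~/sha256sums_stage3 && echo "stage2 and stage3 match"`. If a library's file name itself differs between stages, it shows up as an unmatched line in the diff.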

Uncategorized notes from the attempts to resolve the discrepancies

rustc 1.54.0 from git does not produce identical stage2 and stage3 build artifacts. I've been trying to track down the reasons for this, with assistance from @bjorn3 and others on Zulip.

The rustc binary itself is bit-for-bit identical.

I examined librustc_driver-<hash>.so first. The divergences are in the .rustc section and originate from the .rmeta files of some of the included libraries. Luckily, the machine code in .text is identical between stage2 and stage3 for this file.
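
One way to narrow a difference down to individual sections is to hash each section's raw bytes separately. This is a sketch, not necessarily how the comparison above was done: it assumes an `objcopy` that supports `--dump-section` (GNU binutils or a compatible llvm-objcopy) and `sha256sum` on PATH, and the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch: compare two ELF files section by section.
# section_hash and compare_sections are hypothetical helper names.

# section_hash FILE SECTION: print the SHA-256 of one section's raw bytes.
# Prints nothing and returns non-zero if the section is missing or empty.
section_hash() {
    dir=$(mktemp -d) || return 2
    # --dump-section writes the section contents to a file. The regular
    # objcopy output goes to a scratch copy so FILE is never modified.
    objcopy --dump-section "$2=$dir/section.bin" "$1" "$dir/copy" 2>/dev/null
    if [ -s "$dir/section.bin" ]; then
        sha256sum < "$dir/section.bin" | cut -d' ' -f1
        status=0
    else
        status=1
    fi
    rm -rf "$dir"
    return "$status"
}

# compare_sections FILE_A FILE_B SECTION...: report each section as
# "same", "DIFFERENT", or "unavailable".
compare_sections() {
    file_a=$1 file_b=$2
    shift 2
    for sec in "$@"; do
        hash_a=$(section_hash "$file_a" "$sec")
        hash_b=$(section_hash "$file_b" "$sec")
        if [ -z "$hash_a" ] || [ -z "$hash_b" ]; then
            echo "$sec unavailable"
        elif [ "$hash_a" = "$hash_b" ]; then
            echo "$sec same"
        else
            echo "$sec DIFFERENT"
        fi
    done
}
```

A call such as `compare_sections stage2/lib/librustc_driver-<hash>.so stage3/lib/librustc_driver-<hash>.so .text .rustc` (with the real file names substituted) would then show which of the two sections differ.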

bjorn3 identified typenum metadata as having absolute paths in .rmeta, which we're trying to eliminate:

  • Adding typenum = { version = "1.12.0", features = ["force_unix_path_separator"] } to compiler/rustc_driver/Cargo.toml, and running cargo check -p bootstrap (with the 1.53.0 bootstrap toolchain) to update Cargo.lock (unsuccessful, and I don't think it added any new features into Cargo.lock)
  • Setting remap-debuginfo = true in config.toml (unsuccessful, replaces a different part of the path)
  • Passing RUSTFLAGS=--remap-path-prefix stage1-rustc=stage#-rustc --remap-path-prefix stage2-rustc=stage#-rustc into x.py, with config change (unsuccessful)
  • Passing RUSTFLAGS='--remap-path-prefix /home/rustc_ddc/rust-1.54.0-w-llvm/build/x86_64-unknown-linux-gnu/stage1-rustc/=/home/rustc_ddc/rust-1.54.0-w-llvm/build/x86_64-unknown-linux-gnu/stage#-rustc/ --remap-path-prefix /home/rustc_ddc/rust-1.54.0-w-llvm/build/x86_64-unknown-linux-gnu/stage2-rustc/=/home/rustc_ddc/rust-1.54.0-w-llvm/build/x86_64-unknown-linux-gnu/stage#-rustc/' into x.py, without config change (success)
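
The successful variant hard-codes the build directory twice per flag. A small helper can derive the same flags from a single path, which makes it harder for the flags and the actual build directory to drift apart. This is a sketch with a made-up function name, and it assumes the build path contains no whitespace, since the flags are passed as one whitespace-separated RUSTFLAGS string.

```shell
#!/bin/sh
# Sketch: build the --remap-path-prefix flags from the build directory.
# remap_flags is a hypothetical helper name.

# remap_flags BUILD_DIR: print the flags on one line. BUILD_DIR is the
# absolute path of build/<target triple>. Both stage directories are
# mapped to the same placeholder name, "stage#-rustc".
remap_flags() {
    flags=
    for stage in 1 2; do
        flags="$flags --remap-path-prefix $1/stage$stage-rustc/=$1/stage#-rustc/"
    done
    # Strip the leading space left by the loop.
    echo "${flags# }"
}
```

Usage from the source directory would look like `RUSTFLAGS="$(remap_flags "$PWD/build/x86_64-unknown-linux-gnu")" python3 x.py build --stage=3`.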

Extras for later:

  • Use a 1.54.0 src tarball (like mrustc uses). (done)
  • Use only LLVM tooling (e.g. lld), and see if GCC tools can be entirely removed.
  • Perform the same verification with crt-static=true and musl.

This is a detailed description of how I set up the VMs. Details that are confirmed to matter are bolded. Unformatted details shouldn't matter but are included in case they end up causing differences.

The machine name only refers to bootstrapping for now, but it will be updated once a complete DDC procedure is figured out.

  1. Launch VirtualBox version 6.1.30.
  2. Create a new VM with the following settings:
     a. Name: rustc_bootstrap_verify
     b. Type: Linux
     c. Version: Ubuntu (64-bit)
     d. Memory: 12288 MB (make sure that the VM won't run out of memory while compiling)
     e. Create a virtual hard disk of type VDI, dynamically allocated, with size 32 GB (TODO: see how much space is actually needed)
     f. CPU count: 6
  3. Launch the VM and insert ubuntu-20.04.1-live-server-amd64.iso when prompted. (TODO: use a more recent version?)
  4. Complete the setup with the following settings:
     a. Language: English
     b. Update to the latest installer when prompted
     c. Confirm keyboard layout
     d. Use default network settings, and skip proxy
     e. Use default Ubuntu archive mirror
     f. Use default non-LVM disk setup, and confirm disk layout when prompted
     g. Your name: rustc_ddc, Your server's name: rustc_ddc, Pick a username: rustc_ddc, and a password
     h. I skipped OpenSSH server, although you may want to enable this depending on how you want to extract files later
     i. Do not install any featured server snaps
     j. Wait for Ubuntu to finish installing, and hit "Reboot Now" once it's ready. You may have to forcefully close the VM at this point, but this is unimportant.
  5. Start the VM after the installation is done.
  6. Run sudo apt update && sudo apt upgrade to upgrade the system, and reboot.
  7. The server image already has git, ca-certificates, python3 and curl. We install the other needed packages with sudo apt install --no-install-recommends zlib1g-dev clang-12 cmake ninja-build libssl-dev pkg-config libgit2-dev.
  8. Set up a way to copy files into and out of the VM. We used scp to and from an artifact storage server to avoid having to install the VirtualBox Guest Additions into the VM.
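
Before starting a long build, it can be worth checking that the tools from step 7 are actually on PATH. The following is a minimal sketch with a made-up function name. The command names in the usage line are assumptions derived from the package names (for example, that the ninja-build package provides a `ninja` command) and may need adjusting.

```shell
#!/bin/sh
# Sketch: report build tools that are not installed.
# missing_tools is a hypothetical helper name.

# missing_tools NAME...: print each command that is not found on PATH;
# return non-zero if any is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}
```

For example: `missing_tools git python3 curl clang-12 clang++-12 cmake ninja pkg-config || echo "install the tools listed above first"`.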