matt-nervos/Compiling CKB Contracts using alternate C compilers.md

## Compiling CKB Contracts using alternate C compilers.md

      
    Compiling CKB Contracts using alternate C compilers

Unlike some other blockchains, CKB is designed to use the strictly standard-compliant RISC-V ISA as its language for building smart contracts. There was a period when we had to maintain forked toolchains, since RISC-V was still an immature platform back then, but the ultimate goal has always been to use standard RISC-V compilers / toolchains when building CKB smart contracts. I did some checks back in 2022, and some tweaks were still needed at that point. Now in 2023, the RISC-V software story has progressed tremendously. It’s time to revisit the questions here:

Can we use standard RISC-V compilers / toolchains to build CKB smart contracts?
More specifically, gcc is not the only C compiler in the ecosystem. Can we use other C compilers to build CKB smart contracts?

Compiling C contracts via clang

Let’s first get some terminology straight:

Architecture: this refers to an internal property of a particular CPU. Different CPUs might use different architectures, such as x86_64 (the most widely used today), aarch64 / arm64 / Apple Silicon (in most cases these 3 names refer to the same architecture; which one you see depends on who is talking), or riscv (the architecture we use in CKB as the language for smart contracts).
host: Host means the architecture of the CPU in your computer. When you are using an M1 / M2 Mac, the host of your environment is aarch64 / arm64 / Apple Silicon. In other cases, the host of your environment is most likely x86_64. When you are installing new software, you want software whose host matches your current machine. However, this is typically taken care of by the package manager, so we won’t need to worry about it.
target: In many cases, we want to compile a binary that runs on the same machine. However, there are cases where you want to compile a binary to run on a different machine with a different architecture. For example, when writing CKB smart contracts, you want to build a smart contract binary that runs on CKB-VM, which uses the riscv architecture, even though the machine you use might be x86_64 or aarch64. We use target here to refer to the architecture of the compiled binary.

When writing CKB smart contracts, we want a compiler whose host matches the architecture of our machine, and whose target is riscv. There are certainly more details than this (some of you might have heard about the different extensions of riscv), but the above provides enough background for us to move on.
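To check which target a compiled binary was actually built for, one option is to read the `e_machine` field of its ELF header. The sketch below assumes a little-endian ELF file (the default for RISC-V) and a POSIX `od`; `elf_machine` is a hypothetical helper name, not part of any CKB tooling.

```sh
# elf_machine FILE: print the numeric e_machine value of an ELF file.
# e_machine is a 2-byte little-endian field at byte offset 18 of the header.
# Well-known values: 243 = RISC-V, 62 = x86_64, 183 = aarch64.
elf_machine() {
    # Read the two bytes as unsigned decimals, e.g. " 243   0".
    set -- $(od -An -tu1 -j18 -N2 "$1")
    echo $(( $1 + 256 * $2 ))
}

# The host architecture, for comparison, is reported by uname.
echo "host: $(uname -m)"
```

Running `elf_machine foo` on a contract binary should print 243 if the target is riscv. The `Machine` field printed by `llvm-readelf-16 -h foo` carries the same information in readable form.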
The GNU toolchain is typically architected so that one compiler suite can only build binaries for one particular target. That’s why we typically see separate packages for each target, and why the actual compiler command carries a target-specific prefix, such as riscv64-unknown-elf-gcc. LLVM (with clang), on the other hand, is designed so that one compiler suite can generate binaries for many different targets. This can be confirmed using llc:
$ llc-16 --version
Ubuntu LLVM version 16.0.6
  Optimized build.
  Default target: x86_64-pc-linux-gnu
  Host CPU: tigerlake

  Registered Targets:
    aarch64     - AArch64 (little endian)
    aarch64_32  - AArch64 (little endian ILP32)
    aarch64_be  - AArch64 (big endian)
    amdgcn      - AMD GCN GPUs
    arm         - ARM
    arm64       - ARM64 (little endian)
    arm64_32    - ARM64 (little endian ILP32)
    armeb       - ARM (big endian)
    avr         - Atmel AVR Microcontroller
    bpf         - BPF (host endian)
    bpfeb       - BPF (big endian)
    bpfel       - BPF (little endian)
    hexagon     - Hexagon
    lanai       - Lanai
    loongarch32 - 32-bit LoongArch
    loongarch64 - 64-bit LoongArch
    m68k        - Motorola 68000 family
    mips        - MIPS (32-bit big endian)
    mips64      - MIPS (64-bit big endian)
    mips64el    - MIPS (64-bit little endian)
    mipsel      - MIPS (32-bit little endian)
    msp430      - MSP430 [experimental]
    nvptx       - NVIDIA PTX 32-bit
    nvptx64     - NVIDIA PTX 64-bit
    ppc32       - PowerPC 32
    ppc32le     - PowerPC 32 LE
    ppc64       - PowerPC 64
    ppc64le     - PowerPC 64 LE
    r600        - AMD GPUs HD2XXX-HD6XXX
    riscv32     - 32-bit RISC-V
    riscv64     - 64-bit RISC-V
    sparc       - Sparc
    sparcel     - Sparc LE
    sparcv9     - Sparc V9
    systemz     - SystemZ
    thumb       - Thumb
    thumbeb     - Thumb (big endian)
    ve          - VE
    wasm32      - WebAssembly 32-bit
    wasm64      - WebAssembly 64-bit
    x86         - 32-bit X86: Pentium-Pro and above
    x86-64      - 64-bit X86: EM64T and AMD64
    xcore       - XCore
    xtensa      - Xtensa 32
I’m using Ubuntu 22.04 with LLVM 16 installed from the official distribution, so all the binaries from LLVM 16 have a -16 suffix, such as clang-16, llc-16, llvm-readelf-16. If you are on other systems, the binaries might use different names. For example, LLVM installed via homebrew on macOS does not have this suffix, so clang would simply be clang.
When using gcc, what we would typically do is:
$ riscv64-unknown-elf-gcc foo.c -o foo
While using clang, we would use command line arguments to specify target:
$ clang-16 --target=riscv64 -march=rv64imc foo.c -o foo
Or for better performance, we can enable extensions available to CKB-VM:
$ clang-16 --target=riscv64 -march=rv64imc_zba_zbb_zbc_zbs foo.c -o foo
This is really the basis of instructing clang to build binaries for the riscv target. Of course, porting a project from gcc to clang involves more than that: we have to deal with the differences in flags between gcc and clang, and extensive testing is required to make sure clang generates correct code for the riscv target. Luckily, as of 2023, clang 16 seems to be in a mature enough state that we can build and run CKB smart contracts which used to be built with gcc, with non-trivial performance improvements:
$ git clone https://github.com/xxuejie/ckb-vm-bench-scripts
$ cd ckb-vm-bench-scripts
$ git checkout 434073e050ab5231acb451b7b45f1307838508c2
$ git submodule update --init --recursive
$ make -f clang.makefile
$ ckb-vm-runner build/secp256k1_bench_clang secp256k1 033f8cf9c4d51a33206a6c1c6b27d2cc5129daa19dbd1fc148d395284f6b26411f 304402203679d909f43f073c7c1dcf8468a485090589079ee834e6eed92fea9b09b06a2402201e46f1075afa18f306715e7db87493e7b7e779569aa13c64ab3d09980b3560a3 foo bar
asm exit=Ok(0) cycles=855032 r[a1]=1
$ ckb-vm-runner build/schnorr_bench_clang schnorr 4103c5b538d6f695a961e916e7308211c8c917e1e02ca28a21b0989596a9ffb6 e45408b5981ec7fd6e72faa161776fe5db17dd92226d1ad784816fb843e151127d9ccb615f364f317a35e2ddddc91bbf30ad103ddfd3ad7e839f508dbfe6298a foo bar
asm exit=Ok(0) cycles=872363 r[a1]=0
There are several points you should know before jumping to replicating the above results:

I’m running those steps on an Ubuntu 22.04 machine with LLVM 16 installed. If you are on other environments (macOS for example), you might run into errors saying things like clang-16 does not exist. In this case, you can alter the build command to make -f clang.makefile CC=clang DUMP=llvm-objdump. Either way, we are using the official distribution of clang packages here; no changes to the compilers are required for CKB smart contracts.
I’m running the generated binaries using ckb-vm-runner included in ckb-vm crate. If you don’t have it, you can install it via cargo install ckb-vm --example ckb-vm-runner --features=asm
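The difference in binary names between systems can be handled with a small lookup before invoking make. This is only a sketch: `first_available` is a hypothetical helper (not part of ckb-vm-bench-scripts), and the versioned name llvm-objdump-16 is assumed from the -16 suffix convention described earlier.

```sh
# first_available NAME...: print the first command name found on PATH,
# or return 1 if none of them exists.
first_available() {
    for name in "$@"; do
        if command -v "$name" >/dev/null 2>&1; then
            printf '%s\n' "$name"
            return 0
        fi
    done
    return 1
}

# Prefer the versioned Ubuntu names, fall back to the unsuffixed ones.
CC=$(first_available clang-16 clang) || echo "no clang found" >&2
DUMP=$(first_available llvm-objdump-16 llvm-objdump) || echo "no llvm-objdump found" >&2

# Print the build command that matches this machine.
echo "make -f clang.makefile CC=$CC DUMP=$DUMP"
```

The script only prints the resulting make command, so it is safe to run before deciding which variant to use.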

We can also do a direct comparison on the same code executed here with GCC(note the following build command assumes you have docker installed):
$ make all-via-docker
$ ckb-vm-runner build/secp256k1_bench secp256k1 033f8cf9c4d51a33206a6c1c6b27d2cc5129daa19dbd1fc148d395284f6b26411f 304402203679d909f43f073c7c1dcf8468a485090589079ee834e6eed92fea9b09b06a2402201e46f1075afa18f306715e7db87493e7b7e779569aa13c64ab3d09980b3560a3 foo bar
asm exit=Ok(0) cycles=1048934 r[a1]=0
$ ckb-vm-runner build/schnorr_bench schnorr 4103c5b538d6f695a961e916e7308211c8c917e1e02ca28a21b0989596a9ffb6 e45408b5981ec7fd6e72faa161776fe5db17dd92226d1ad784816fb843e151127d9ccb615f364f317a35e2ddddc91bbf30ad103ddfd3ad7e839f508dbfe6298a foo bar
asm exit=Ok(0) cycles=1065268 r[a1]=0
When running the same secp256k1/schnorr verification code, the binaries generated by clang 16 consume 855032/872363 cycles, while the binaries produced by gcc 12.2.0 consume 1048934/1065268 cycles. By switching to clang, not only can we get rid of forked toolchains, we also see roughly 18% fewer cycles.
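The percentage can be reproduced from the cycle counts reported above. A minimal sketch using awk; `cycle_reduction` is a hypothetical helper name:

```sh
# cycle_reduction BASELINE NEW: percentage of cycles saved relative to BASELINE.
cycle_reduction() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (base - new) / base * 100 }'
}

cycle_reduction 1048934 855032    # secp256k1: gcc vs clang
cycle_reduction 1065268 872363    # schnorr: gcc vs clang
```

This prints 18.5 for secp256k1 and 18.1 for schnorr.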
In addition to standalone smart contracts, clang can also be used to build dynamically linked libraries. We have modified a slightly older version of ckb-auth (an older version simply because ckb-auth is being developed at a fast pace) so that clang is used in the build process:
$ git clone https://github.com/xxuejie/ckb-auth
$ cd ckb-auth
$ git checkout c7c500749e95603a3ed53338d91e8da9bdd4f4ae
$ git submodule update --init --recursive
$ make
Similarly, if you run into problems, the clang binaries might go by different names on your system. In that case, you can try make CC=clang LD=ld.lld OBJCOPY=llvm-objcopy AR=llvm-ar.
This time, we can run the test cases provided by ckb-auth directly to verify the correctness of our binary built by clang:
$ cd tests/auth_rust && bash run.sh
(some log lines omitted..)

running 20 tests
test tests::ckbmultisig_verify_sing_size_failed ... ok
test tests::litecoin_verify_official ... ok
test tests::bitcoin_pubkey_recid_verify ... ok
test tests::convert_btc_error ... ok
test tests::convert_tron_error ... ok
test tests::convert_doge_error ... ok
test tests::convert_eos_error ... ok
test tests::convert_lite_error ... ok
test tests::convert_eth_error ... ok
test tests::abnormal_algorithm_type ... ok
test tests::ethereum_verify ... ok
test tests::ckb_verify ... ok
test tests::bitcoin_verify ... ok
test tests::dogecoin_verify ... ok
test tests::litecoin_verify ... ok
test tests::eos_verify ... ok
test tests::tron_verify ... ok
test tests::bitcoin_uncompress_verify ... ok
test tests::schnorr_verify ... ok
test tests::ckbmultisig_verify ... ok

test result: ok. 20 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 1.11s

     Running unittests src/bin/ckb-auth-cli.rs (target/debug/deps/ckb_auth_cli-6d91be255770f835)

(some more log lines omitted..)
We don’t have any performance numbers here yet, but I would personally expect the binary generated by clang to be at least at the same performance level as the binaries produced by GCC.
There are indeed some quirks with clang. One of the most noticeable is that the text section is not aligned at page boundaries, which causes some problems when dynamically loading libraries. We do have patches addressing this, which have already been integrated into ckb-std v0.14.3. Apart from those quirks, clang 16 does seem mature enough to provide highly performant yet robust binaries, and I would personally recommend that everyone switch to clang 16 for building CKB smart contracts. We might not need forked compilers anymore.
Compiling C contracts via CompCert

CompCert is another interesting C compiler. Citing from the authors of CompCert (retrieved in August 2023):

CompCert is the only production compiler that is formally verified, using machine-assisted mathematical proofs, to be exempt from miscompilation issues.

Using CompCert does not eliminate bugs in our C code. What it does instead is make sure that bugs buried deep in the compiler optimization pipeline are very rare or completely eliminated, meaning the generated binary does match the behavior of the original C code according to the C specification. As modern compilers are getting increasingly sophisticated, this is a truly rare virtue that can shine in security critical environments. For example, CompCert has been famously used in the aviation and the nuclear industries.
The wonderful news is that we can also leverage CompCert to build CKB smart contracts! However, this requires some tweaks for now.
The first thing we need to know, is that a C compiler actually has a lot of moving parts underneath. When we enter the following command:
$ riscv64-unknown-elf-gcc foo.c -o foo

A series of actions are actually happening underneath:

gcc first invokes a preprocessor that executes and (in the end) removes all C macros included in the source code. You can also manually trigger this process with riscv64-unknown-elf-gcc -E foo.c
The processed source code is then fed into the actual C compiler. A true C compiler only takes in processed C source code without any macros, and generates an assembly source file. The assembly source file contains assembly code for the specific compiler target
The assembly source file is then fed into an assembler (in the case of GCC, the accompanying assembler is named GNU as), which translates the assembly source file into an object binary file
The object binary file is finally linked with other supporting files by a linker, generating the final binary we can use.

With the -v option, you can inspect most of the above steps more closely:
$ riscv64-unknown-elf-gcc foo.c -o foo -v

The only missing part is that, since gcc includes both the preprocessor and the C compiler, it merges those two steps into a single one.
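The stages can also be run one at a time through the gcc driver, using its standard -E, -S and -c options. The sketch below is illustrative only: `build_in_stages` is a hypothetical helper, and the intermediate file names are arbitrary.

```sh
# build_in_stages CC SRC OUT: run each compilation stage separately,
# leaving the intermediate files OUT.i, OUT.s and OUT.o behind.
build_in_stages() {
    cc=$1; src=$2; out=$3
    "$cc" -E "$src" -o "$out.i" &&    # 1. preprocess: expand macros and includes
    "$cc" -S "$out.i" -o "$out.s" &&  # 2. compile: preprocessed C to assembly
    "$cc" -c "$out.s" -o "$out.o" &&  # 3. assemble: assembly to object file
    "$cc" "$out.o" -o "$out"          # 4. link: object file to final binary
}
```

For the toolchain used here, this would be invoked as `build_in_stages riscv64-unknown-elf-gcc foo.c foo`. The second stage is the only one CompCert replaces, as described next.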
With the full steps of the C compilation process explained, it is worth noting that CompCert is really only the true C compiler part. That is, CompCert only handles the work of compiling processed C source code, without any C macros, into an assembly source file. It uses glue code to invoke gcc for the rest of the parts. While it might sound counter-intuitive, it’s the C compiler proper that is most likely to introduce bugs or vulnerabilities, and formalizing this part alone can already greatly improve security. The complexity of the remaining parts, including the preprocessor, assembler and linker, has already been widely studied. For more details on this design choice, feel free to refer to CompCert related publications.
While clang also includes a preprocessor and an assembler, we will have to resort to GCC for this experiment. The reason is that CompCert is hardwired to use GCC for now. Here we will use Ubuntu 22.04 for demonstration purposes; if you are using other OSes, the exact steps might vary. Or you can create a docker container to run the experiments here.
First, let’s install gcc package for riscv64 on Ubuntu:
$ sudo apt-get install gcc-riscv64-unknown-elf binutils-riscv64-unknown-elf

For this particular experiment, we can simply use the official package provided by Ubuntu; there is no need to use the forked gcc.
There are still some tweaks required for CompCert to be productive in CKB’s environment. We do wish the relevant changes can land in upstream CompCert soon, but for now, we will need a patched CompCert for this to work.
To build CompCert, we will need to install a specific version of Coq (with a specific version of OCaml, managed via OPAM) here:
$ # install opam
$ bash -c "sh <(curl -fsSL https://raw.githubusercontent.com/ocaml/opam/master/shell/install.sh)"
$ # init opam
$ opam init
$ # install OCaml 4.14.1
$ opam switch create compcert 4.14.1
$ eval $(opam env --switch=compcert)
$ # install coq & menhir
$ opam pin coq 8.15.2
$ opam install menhir

Now let’s compile the patched CompCert:
$ git clone https://github.com/xxuejie/CompCert
$ cd CompCert
$ git checkout e9dccc8cfa0a3a1faa0711b5fba7391b21045a3f
$ mkdir dist_rv64
$ ./configure -toolprefix riscv64-unknown-elf- -prefix `pwd`/dist_rv64 rv64-linux
$ make
$ make install
$ export COMPCERT=`pwd`/dist_rv64

Now we can build the same secp256k1 & schnorr code above, but using CompCert as the core C compiler:
$ # if you haven't done those already
$ git clone https://github.com/xxuejie/ckb-vm-bench-scripts
$ cd ckb-vm-bench-scripts
$ git checkout 434073e050ab5231acb451b7b45f1307838508c2
$ git submodule update --init --recursive

$ make -f compcert.makefile CCOMP=$COMPCERT/bin/ccomp

Notice that this build step actually makes use of all 3 C compilers:

CompCert is used as the core C compiler
CompCert would leverage riscv64-unknown-elf-gcc as preprocessor and assembler
Some supporting C files(mainly the functions requiring inline assembly) are compiled using clang. The final binary is also linked via lld, which is the linker used by clang/LLVM

I tend to use clang 16 when I can, since I believe clang is the most mature one these days with the most eyes on it. However you can also choose to only use CompCert and gcc to build the final binary:
$ make -f compcert.makefile CCOMP=$COMPCERT/bin/ccomp \
	CLANG=riscv64-unknown-elf-gcc LD=riscv64-unknown-elf-ld \
	DUMP=riscv64-unknown-elf-objdump \
	CLANG_CFLAGS="-Wno-invalid-noreturn -ffunction-sections -fdata-sections -fvisibility=hidden -DCKB_DECLARATION_ONLY -O3 -g -I deps/secp256k1/src -nostdlib -I deps/ckb-c-stdlib/libc -I deps/ckb-c-stdlib"

If you have followed the steps here, you might notice that CompCert takes a very long time to compile the C code. This is one of the drawbacks of CompCert; just grab a cup of coffee and wait till it finishes. When it does, we can run the contracts the same way as above:
$ ckb-vm-runner build/secp256k1_bench_compcert secp256k1 033f8cf9c4d51a33206a6c1c6b27d2cc5129daa19dbd1fc148d395284f6b26411f 304402203679d909f43f073c7c1dcf8468a485090589079ee834e6eed92fea9b09b06a2402201e46f1075afa18f306715e7db87493e7b7e779569aa13c64ab3d09980b3560a3 foo bar
asm exit=Ok(0) cycles=3176926 r[a1]=4503595332402223
$ ckb-vm-runner build/schnorr_bench_compcert schnorr 4103c5b538d6f695a961e916e7308211c8c917e1e02ca28a21b0989596a9ffb6 e45408b5981ec7fd6e72faa161776fe5db17dd92226d1ad784816fb843e151127d9ccb615f364f317a35e2ddddc91bbf30ad103ddfd3ad7e839f508dbfe6298a foo bar
asm exit=Ok(0) cycles=3220551 r[a1]=4503595332402223

The C code compiled via CompCert consumes far more cycles than the clang and gcc versions: roughly 3x the gcc builds and about 3.7x the clang builds. This is because, for a compiler to be fully formally verified, only certain optimizations are available, and adding a new optimization also means building the proof that verifies it. So it really remains a tradeoff between using a secure compiler and using a fast compiler.
Finally, we do want to mention that CompCert is not free software. The source code of CompCert is available for research usage, but production usage of CompCert requires a commercial license from AbsInt. We kindly ask anyone in the Nervos ecosystem to buy a license should they choose to use CompCert in their tech stack.
Rust Tweaks

We have so far been talking about C compilers extensively. However, we should not forget that most people in the Nervos ecosystem write smart contracts in Rust. Only a handful of crypto heavy smart contracts remain in C. One might naturally ask: is it worthwhile to spend this much space talking about C compilers?
The answer is yes. We still have C code used in ckb-std, not to mention that some crates still use C code underneath. In the old days, one needed gcc, sometimes a forked gcc, to be able to compile Rust smart contracts. This means that even though we can now use stable Rust to build CKB smart contracts, we are still limited to environments with a riscv-enabled gcc. Now that clang can be used to build CKB smart contracts in C, can we leverage the same technique in Rust contracts?
We have already implemented a new feature, build-with-clang, which is shipped in ckb-std v0.14.3. When this feature is enabled, clang will be used to build the C code in ckb-std. What’s more, if you have a Rust crate including C code, you can use a similar technique, so your smart contracts can be built with clang as well. We are now entering a world where CKB smart contracts can be built with the official distribution of Rust and the official distribution of LLVM/clang. No forked toolchain is required anymore.
Official Rust vs Custom Rust

In a previous article, I wrote about a custom Rust toolchain that enables a set of specific optimizations:

Using clang to build C source code directly
B extension in codegen
std setup for CKB smart contracts
Some smaller tweaks

Maintaining a custom toolchain always means extra work, so with the above experiments in hand, let’s revisit this choice: do we really need a custom Rust toolchain?
First of all, the above mentioned work in ckb-std means that we can already use clang as the underlying C compiler when building Rust smart contracts.
Having the B extension certainly helps with codegen. However, upon further research, this is also achievable with the stock Rust compiler. Simply tweaking RUSTFLAGS can do the job: https://github.com/xxuejie/mmr-post-demo/commit/709da2bd83ea7f09c58cd60d5daa17c8d99be122.
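As a rough illustration of the RUSTFLAGS approach (the linked commit remains the authoritative version), the sketch below assumes rustc's standard `-C target-feature` option and the same Zba/Zbb/Zbc/Zbs extension names passed to clang earlier. `target_feature_flags` is a hypothetical helper, not something taken from the commit.

```sh
# target_feature_flags EXT...: build a "-C target-feature=+a,+b,..." string
# from a list of RISC-V extension names.
target_feature_flags() {
    features=""
    for ext in "$@"; do
        # Join with commas, prefixing each extension with "+" to enable it.
        features="${features:+$features,}+$ext"
    done
    printf '%s\n' "-C target-feature=$features"
}

RUSTFLAGS=$(target_feature_flags zba zbb zbc zbs)
export RUSTFLAGS
echo "RUSTFLAGS=$RUSTFLAGS"
```

With RUSTFLAGS exported this way, a regular cargo build for the RISC-V target should pick up the extra extensions.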
std is the tricky part. On the one hand, having std certainly helps simplify certain code. On the other hand, there are more problems intrinsic to porting Rust to a new operating system environment. In many places, crates do not check for std alone, but also make decisions based on target_os:

https://github.com/rust-random/getrandom/blob/2e483d68aaa57168a84489349d6473b492e05478/src/lib.rs#L219-L292
https://github.com/dylni/os_str_bytes/blob/71033fe1bec5610a57457d4d0e4626c3d5570cef/src/common/mod.rs#L9-L18

This means providing std alone is not enough to support as many of the crates out there as we want. Many crates like the ones above require manual tweaking, and persuading each crate to introduce ckb as a new OS environment would not be a feasible amount of work for us. For the time being, it might make more sense to keep CKB smart contracts in a no_std environment. In addition, the story of no_std in the Rust ecosystem has been getting much better in 2023. Many crates now have no_std options. It might not be that hard anymore to write code in a no_std environment.
Given those considerations, I’m reaching the conclusion that maintaining a custom Rust toolchain no longer provides enough benefits. I do recommend a path where we use official Rust + official clang when writing CKB smart contracts.