@htfy96
Last active April 2, 2024 16:48
C++ Build profiles for 2024 projects

Standard development profile

This profile reaches roughly 50%-80% of release-profile performance while still providing a reasonable amount of safety checks and debugging support. It should also be the profile for your CI build.

Compilation flags

-Og -Wall -Wextra -D_FORTIFY_SOURCE=2 -fstack-protector-strong -g -D_GLIBCXX_ASSERTIONS
  • -Wall -Wextra enables most warnings. Tune individual warnings with flags such as -Wimplicit-fallthrough or -Wno-implicit-fallthrough. Compiler-specific warning flags are left out because opinions on them vary.
  • -Og is usually about 30% slower than -O2 but gives a much better debugging experience.
  • -D_FORTIFY_SOURCE=2 (you can bump it to =3 with gcc >= 12) swaps certain memory and string functions for bounds-checked variants. It only takes effect when optimization is enabled; -Og should be enough.
  • -fstack-protector-strong catches a fraction of stack buffer overflows at very little performance cost.
  • -D_GLIBCXX_ASSERTIONS enables bounds checks on C++ standard library containers, plus the non-emptiness checks for unique_ptr/shared_ptr/optional/variant. It incurs ~10-20% overhead depending on your use case.
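It is easy to pass these flags and still end up with a build where some of them are inactive (for example, _FORTIFY_SOURCE without optimization). The sketch below is one way to report what the preprocessor actually sees. The function names (fortify_level, profile_summary, and so on) are made up for illustration, and fortify_effective only approximates what glibc decides internally; it assumes GCC or Clang, which define __OPTIMIZE__ whenever optimization is on.

```cpp
#include <string>

// 0 when -D_FORTIFY_SOURCE was not passed, otherwise the requested level.
// The "+ 0" keeps this valid even if the macro was defined with no value.
constexpr int fortify_level() {
#if defined(_FORTIFY_SOURCE)
    return _FORTIFY_SOURCE + 0;
#else
    return 0;
#endif
}

// GCC and Clang define __OPTIMIZE__ when optimization is enabled.
constexpr bool optimizing() {
#if defined(__OPTIMIZE__)
    return true;
#else
    return false;
#endif
}

// True when the libstdc++ assertion macro is defined for this translation unit.
constexpr bool glibcxx_assertions() {
#if defined(_GLIBCXX_ASSERTIONS)
    return true;
#else
    return false;
#endif
}

// Approximation: fortification needs both the macro and the optimizer.
constexpr bool fortify_effective() {
    return fortify_level() > 0 && optimizing();
}

// Human-readable summary, e.g. for printing once at startup or in a CI log.
inline std::string profile_summary() {
    std::string s = "_FORTIFY_SOURCE=" + std::to_string(fortify_level());
    s += fortify_effective() ? " (effective)" : " (inactive)";
    s += "\n_GLIBCXX_ASSERTIONS=";
    s += glibcxx_assertions() ? "on" : "off";
    return s;
}
```

Printing profile_summary() at program start (or from a tiny CI smoke test) makes it obvious when a build system silently dropped one of the flags.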

You may also want to look into Reproducible builds to reduce output flakiness.

Runtime configurations

GLIBC_TUNABLES=glibc.malloc.perturb=204

This makes glibc overwrite every freed memory region with 0xCC (204 in decimal), so that use-after-free bugs are easier to catch. I chose this value because 0xCC maps to INT 3 on x86-64 and lies outside the printable ASCII range. It also reads as a negative value when interpreted as an int8, which helps users spot it visually.
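To make the "catch it visually" part concrete, here is a small sketch of what a value looks like in a debugger or log once every byte of it has been filled with the perturb byte. The helper names (filled_with, alloc_fill) are hypothetical; alloc_fill follows the mallopt(3) description of M_PERTURB, where newly allocated memory gets the complement of the byte, and is included on that assumption.

```cpp
#include <cstdint>
#include <cstring>

// The value passed via glibc.malloc.perturb (204 == 0xCC).
constexpr unsigned char kPerturb = 204;

// Build a T whose bytes are all equal to `byte`, the way a freed and
// perturbed region would look if it were read back as a T.
template <typename T>
T filled_with(unsigned char byte) {
    T value;
    std::memset(&value, byte, sizeof value);
    return value;
}

// Assumption based on mallopt(3): fresh allocations (other than calloc)
// are filled with the complement of the perturb byte.
constexpr unsigned char alloc_fill(unsigned char perturb) {
    return static_cast<unsigned char>(perturb ^ 0xFF);
}
```

With these, filled_with<std::int8_t>(kPerturb) is -52 and filled_with<std::int32_t>(kPerturb) is -858993460, so a suspiciously large negative integer in a dump is a hint that you are reading freed memory. Under the same assumption, alloc_fill(kPerturb) is 0x33.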

Bonus: use a hardened malloc

hardened_malloc is a hardened malloc implementation designed to catch many common heap memory issues. Simply build it and run your program with ./preload.sh {PROGRAM} [ARGS...], and it will automatically replace the malloc/free implementation.

Release profile

tl;dr: Follow the Red Hat or OpenSSF recommendations.

Note that some of the recommendations have a small runtime performance cost, so you may want to tune them to your needs. My personal experience is:

  • 10%-20% slowdown with -D_GLIBCXX_ASSERTIONS
  • 2%-10% slowdown with -fno-delete-null-pointer-checks (I write a lot of low-level pointer manipulation code, so YMMV)
  • 0%-20% slowdown with -fno-strict-aliasing. It affects the vectorizer a lot, so it is probably worth dropping this flag for numerical calculation sources
  • -ftrivial-auto-var-init=zero: This is definitely safer for production if you have a lot of uninitialized variables that might be read, but personally I prefer initializing them manually with the clang-tidy check and opting out when necessary. It is still good defense in depth though.
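If you do drop -fno-strict-aliasing for hot numerical sources, the code in them must not rely on type punning through pointers. A minimal sketch of the aliasing-safe alternative follows; float_bits is a hypothetical helper, and the values in the usage note assume IEEE 754 binary32 floats.

```cpp
#include <cstdint>
#include <cstring>

// Aliasing-safe way to inspect the bits of a float: copy the object
// representation instead of reading the float through a uint32_t pointer.
inline std::uint32_t float_bits(float f) {
    static_assert(sizeof(std::uint32_t) == sizeof(float),
                  "sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// The pattern that strict aliasing makes undefined, and that
// -fno-strict-aliasing is usually there to tolerate:
//   std::uint32_t bits = *reinterpret_cast<std::uint32_t*>(&f);
```

For example, float_bits(1.0f) yields 0x3F800000 on such platforms. Compilers generally turn the memcpy into a plain register move, and C++20 code can use std::bit_cast from <bit> for the same purpose.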

Also don't forget to strip your binary if you don't want customers to find out the function names and sources. Stripping also results in a smaller binary.

Debug profile

This profile maximizes debug checks and the debugging experience. As a tradeoff, it's typically 2x-10x slower than the release profile. At my company, we run it once every day (because each run takes half a day :(). Only use it when you want to actively debug a program.

-Og -g -Wall -Wextra -D_FORTIFY_SOURCE=2 -fsanitize=undefined,address,float-divide-by-zero,nullability

You can also use -fsanitize=thread if your program is multithreaded, but it has to go in a separate build because TSAN conflicts with ASAN. Note that the nullability checks are Clang-only.
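Because ASAN and TSAN builds have to stay separate, it can help to let the code itself know which kind of build it is in, for instance to skip a test that is too slow under one of them. The sketch below is one way to do that. The BUILD_HAS_ASAN / BUILD_HAS_TSAN macros and sanitizer_build_name are names invented for this example; the detection relies on Clang's __has_feature and GCC's __SANITIZE_ADDRESS__ / __SANITIZE_THREAD__ macros.

```cpp
// Clang reports sanitizers through __has_feature.
#if defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define BUILD_HAS_ASAN 1
#  endif
#  if __has_feature(thread_sanitizer)
#    define BUILD_HAS_TSAN 1
#  endif
#endif

// GCC defines dedicated macros instead.
#if defined(__SANITIZE_ADDRESS__) && !defined(BUILD_HAS_ASAN)
#  define BUILD_HAS_ASAN 1
#endif
#if defined(__SANITIZE_THREAD__) && !defined(BUILD_HAS_TSAN)
#  define BUILD_HAS_TSAN 1
#endif

// Default to "off" when nothing was detected.
#ifndef BUILD_HAS_ASAN
#  define BUILD_HAS_ASAN 0
#endif
#ifndef BUILD_HAS_TSAN
#  define BUILD_HAS_TSAN 0
#endif

constexpr bool kAsan = BUILD_HAS_ASAN;
constexpr bool kTsan = BUILD_HAS_TSAN;

// The two sanitizers cannot share a build, so this should never fire.
static_assert(!(kAsan && kTsan), "ASAN and TSAN need separate builds");

// Short label for logs or test filters.
constexpr const char* sanitizer_build_name() {
    return kAsan ? "asan" : kTsan ? "tsan" : "plain";
}
```

A test harness could log sanitizer_build_name() once at startup so that a half-day CI run is easy to attribute to the right configuration.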

Runtime:

ASAN_OPTIONS=detect_stack_use_after_return=true:quarantine_size_mb=1024 UBSAN_OPTIONS=print_stacktrace=1

The default quarantine_size_mb (256MB) is often too low for programs that use a lot of memory. Bump it to a number large enough to cover a full lifecycle of your control logic.

Miscellaneous tricks

Even faster builds

-Os -fuse-ld=lld

This profile is about 30% faster to recompile, at the cost of having no warnings or safety checks. It is helpful for a compile-retry workflow.

Thin debug info

The default -g produces a lot of debug info. If you only want stack traces with source line numbers, pass -g1 to speed up compilation and reduce binary size.

dannyvankooten commented Mar 20, 2024

Nice summary, thank you for sharing! Doesn't -D_FORTIFY_SOURCE=2 only work when compiled with -O1 or higher though? (source)

htfy96 commented Mar 20, 2024

> Nice summary, thank you for sharing! Doesn't -D_FORTIFY_SOURCE=2 only work when compiled with -O1 or higher though? (source)

Yeah. _FORTIFY_SOURCE only works when optimization is enabled. I believe -Og suffices, as I have used it at work.
