@Pharap
Last active June 13, 2021 17:32
A rough description of C++ compilation. Good enough for an introduction, but not particularly technical.

The C++ build process is roughly as follows:

  1. All .cpp (and possibly .c) files are compiled:
    1. First the preprocessor processes the file. As required, it:
      • Parses and processes all #define directives, storing each macro's definition in a symbol table (roughly, a string-to-string dictionary) and expanding the macro wherever it is later used.
      • Parses and processes all conditional directives (#if, #ifdef, #ifndef, #elif, #else and #endif), thus performing conditional compilation.
      • Parses and processes all #include directives, inserting the contents of each header (.h) file at the point at which it was #included (or performing a process that achieves the equivalent effect).
      • Parses and processes all #pragmas, implementing their compiler-specific behaviour.
        • Note that the most common and widely supported #pragma is #pragma once, which acts as an alternative to include guards.
        • Other #pragma examples include #pragma omp, which is used by OpenMP ('Open Multi-Processing').
    2. The resulting preprocessed C++ source code (now free of all preprocessor directives and macros) is parsed and processed according to the rules of the C++ language. This phase results in the generation of ‘object files’ (file extensions and formats vary by OS and by compiler, e.g. .o files), which contain ‘object code’ (specially formatted machine code) and, optionally, debugging symbols. This phase typically includes multiple optimisations.
    3. An optional phase of optimisation
  2. All the object files resulting from the compilation of the C++ code are ‘linked’ together by the linker, thus generating an executable file, which again varies by compiler target.
    • On Windows this is typically a .dll or an .exe.
    • For code targeting an AVR/Arduino device, the generated result is actually an .elf file - an executable format popularly used by Linux.
  3. Mostly specific to embedded systems, the resulting executable is stripped of all unnecessary information (debugging symbols, function relocation information, et cetera) and transformed into a blob of raw data. For an AVR/Arduino device this is typically a .hex file, whilst for ARM this is a .bin file.
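
As a rough illustration of the preprocessor step described above, here is a minimal sketch. All of the names in it (EXAMPLE_CONFIG_H, BUFFER_SIZE, SQUARE, bufferKind and the two functions) are hypothetical and exist only for this example. The include guard is shown inline for brevity; in practice it would wrap the contents of a header file.

```cpp
// Include guard: the body is only processed the first time it is seen.
// (Hypothetical guard name; '#pragma once' is the common alternative.)
#ifndef EXAMPLE_CONFIG_H
#define EXAMPLE_CONFIG_H

// Object-like macro: later occurrences of BUFFER_SIZE are replaced with 64.
#define BUFFER_SIZE 64

// Function-like macro: arguments are substituted as text, which is why
// every use of the parameter is wrapped in parentheses.
#define SQUARE(x) ((x) * (x))

#endif

// Conditional compilation: only one of these two declarations ever
// reaches the compiler proper, the other is discarded by the preprocessor.
#if BUFFER_SIZE > 32
constexpr const char * bufferKind = "large";
#else
constexpr const char * bufferKind = "small";
#endif

// By the time the compiler sees this function, it simply reads 'return 64;'.
constexpr int bufferSize()
{
    return BUFFER_SIZE;
}

// After expansion the body reads 'return ((value) * (value));'.
constexpr int squareOf(int value)
{
    return SQUARE(value);
}

// SQUARE(1 + 2) expands to ((1 + 2) * (1 + 2)), which is 9.
static_assert(SQUARE(1 + 2) == 9, "macro arguments are substituted textually");
```

Without the parentheses in SQUARE, the expression 1 + 2 would expand to 1 + 2 * 1 + 2, which is 5 rather than 9. This is one reason macros are usually treated with caution.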

For a more abstract, technical and/or specific explanation, see:
https://en.cppreference.com/w/cpp/language/translation_phases
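
For a more hands-on view, the stages can usually be run one at a time. The sketch below assumes a GCC-style toolchain (g++, and avr-objcopy for the AVR case); other compilers use different commands. The file names stages.cpp, main.cpp, program.elf and program.hex are hypothetical, as are GREETING_LENGTH and greetingLength.

```cpp
// stages.cpp (hypothetical file name)
//
// Assuming a GCC-style toolchain, each stage can be requested separately:
//
//   g++ -E stages.cpp -o stages.ii    Preprocess only; stages.ii contains
//                                     the source with macros expanded.
//   g++ -c stages.cpp -o stages.o     Compile to an object file, no linking.
//   g++ -c main.cpp -o main.o         Likewise for a second (hypothetical)
//                                     file that defines main().
//   g++ main.o stages.o -o program    Link the object files into an executable.
//
// For an AVR target, the final transformation of a linked .elf file
// into a .hex file might look like:
//
//   avr-objcopy -O ihex program.elf program.hex

#define GREETING_LENGTH 5

// In the preprocessed output this function body reads 'return 5;'.
int greetingLength()
{
    return GREETING_LENGTH;
}
```

Comparing stages.cpp with the preprocessed stages.ii is a simple way to see exactly what the compiler proper receives.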

Also relevant is the difference between declarations and definitions, and the ‘one definition rule’:
https://en.cppreference.com/w/cpp/language/definition
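
A minimal sketch of the distinction, using hypothetical names (add, counter and twice). Everything is shown in one file for brevity; in a real project the declarations would typically live in a header and the definitions in a single .cpp file.

```cpp
// Declaration: tells the compiler that a function called 'add' exists
// and what its signature is, without providing a body.
int add(int left, int right);

// Repeating a declaration is allowed.
int add(int left, int right);

// Definition: provides the body. A program may contain only one
// definition of this function across all of its object files; a second
// definition in another .cpp file would typically be reported by the linker.
int add(int left, int right)
{
    return left + right;
}

// 'extern' makes this a declaration of a variable, not a definition.
extern int counter;

// The single definition of that variable.
int counter = 0;

// 'inline' functions are an exception: they may be defined in several
// translation units (e.g. via a header), provided every definition is identical.
inline int twice(int value)
{
    return value * 2;
}
```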
