@icyveins7
Last active September 13, 2023 04:17
On the C++ compiler with unused code

Introduction

Posts on this topic are few, and many of them simply fall back on 'oh, let the compiler handle this'. Some people want to learn what the compiler is actually doing.

My question started like this: while developing a library, I began by using template specializations to prevent 'disallowed' types from working. However, I soon realised that this may have been overkill and that simple overloading would have been enough, since I was specializing the template for every allowed type anyway.
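To make the two approaches concrete, here is a minimal sketch. The names `template_plus_one` and `overload_plus_one` are made up for illustration and are not from the library in question. It also shows one way the two approaches differ in how they treat a type that was never meant to be allowed, which may matter when choosing between them.

```cpp
#include <type_traits>

// Approach 1: the primary template is declared but never defined;
// only the allowed types get explicit specializations.
template <typename T>
T template_plus_one(T value);

template <>
int template_plus_one<int>(int value) { return value + 1; }

template <>
double template_plus_one<double>(double value) { return value + 1.0; }

// Approach 2: plain overloads for the allowed types.
int overload_plus_one(int value) { return value + 1; }
double overload_plus_one(double value) { return value + 1.0; }

// With the template, a float argument still compiles: T is deduced as
// float and the primary declaration is selected. The missing definition
// only shows up as an undefined reference at link time, and only if the
// call is really made (decltype does not evaluate it).
static_assert(std::is_same<decltype(template_plus_one(1.0f)), float>::value,
              "template deduces T = float");

// With overloads, a float argument is silently promoted to double.
static_assert(std::is_same<decltype(overload_plus_one(1.0f)), double>::value,
              "float promotes to the double overload");
```

So the declared-only template rejects `float` late (at link time), while the overload set accepts it through promotion. Neither is a hard compile-time error, which is worth keeping in mind if the goal is to 'disallow' types.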

But then one should begin to ask:

  1. Templates are supposedly compile-time constructs that only 'generate code' when they are invoked, so the compiled template specializations should contain less 'dead' code than a bunch of overloaded functions, right?
  2. In that case, when do unused overloaded functions get stripped out?
  3. Or more generally, when do unused functions/templates get stripped out?

Answers

I highly recommend reading this great post to start. Then let's test it ourselves as well.

The post above suggests using the compile flags -ffunction-sections -Wl,--gc-sections to remove unused functions. This indeed works (on macOS, you have to use -ffunction-sections -Wl,-dead_strip instead, as in the commands below). I wrote a slightly different piece of test code to try this.

Test code

#include <iostream>

// Declare the general template but with no definition
template <typename T>
void unused_template_print(T i);

// Only allow specializations
template <>
void unused_template_print<int>(int i)
{
    std::cout << i + 1 << std::endl;
}

template <>
void unused_template_print<double>(double i)
{
    std::cout << i + 1.0 << std::endl;
}

// Use function overloading instead
void unused_print(int i)
{
    std::cout << i << std::endl;
}

void unused_print(double i)
{
    std::cout << i << std::endl;
}

int main()
{
    unused_print(1.0);
    unused_template_print(1.0);

    return 0;
}

As you can see, here we are testing two separate ways of doing the 'same' thing: one via template specializations, and one via function overloads. In both cases I only call the double version.

Results

g++ test.cpp

objdump -tC a.out | grep "unused"
0000000100003e40 l     F __TEXT,__text __GLOBAL__sub_I_test_unused_func.cpp
0000000100003dac g     F __TEXT,__text unused_print(double)
0000000100003d78 g     F __TEXT,__text unused_print(int)
0000000100003d3c g     F __TEXT,__text void unused_template_print<double>(double)
0000000100003d00 g     F __TEXT,__text void unused_template_print<int>(int)

Binary size is 34562 bytes.

g++ test.cpp -ffunction-sections -Wl,-dead_strip

objdump -tC a.out | grep "unused"
0000000100003eb0 l     F __TEXT,__text __GLOBAL__sub_I_test_unused_func.cpp
0000000100003e1c g     F __TEXT,__text unused_print(double)
0000000100003de0 g     F __TEXT,__text void unused_template_print<double>(double)

Binary size is 34194 bytes.

g++ test.cpp -ffunction-sections -Wl,-dead_strip -O2

objdump -tC a.out | grep "unused"
0000000100003e44 l     F __TEXT,__text_startup __GLOBAL__sub_I_test_unused_func.cpp
0000000100003d90 g     F __TEXT,__text unused_print(double)
0000000100003d00 g     F __TEXT,__text void unused_template_print<double>(double)

Binary size is 34354 bytes.

g++ test.cpp -ffunction-sections -Wl,-dead_strip -O3

objdump -tC a.out | grep "unused"
0000000100003e84 l     F __TEXT,__text_startup __GLOBAL__sub_I_test_unused_func.cpp

Binary size is 34290 bytes.
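One thing the unoptimised listing shows is that unused_template_print<int> is emitted even though nothing calls it. The likely explanation is that an explicit (full) specialization is no longer a template in the 'generate on use' sense: it is an ordinary function definition, so the compiler emits it just as it emits an unused overload. Only a primary template with a generic body is instantiated lazily. The sketch below contrasts the two; `generic_plus_one` and `call_with_int` are hypothetical names, not part of the test file.

```cpp
// Primary template WITH a generic body. No machine code exists for it
// until some call instantiates it for a concrete type.
template <typename T>
T generic_plus_one(T value)
{
    return value + 1;
}

// Explicit (full) specialization: an ordinary, non-inline function.
// The compiler emits it in this translation unit whether or not
// anything calls it, just as it would for a plain overload.
template <>
double generic_plus_one<double>(double value)
{
    return value + 1.0;
}

// This call deduces T = int and triggers the implicit instantiation of
// generic_plus_one<int>. Other instantiations, such as
// generic_plus_one<long>, are never requested and so never generated.
int call_with_int()
{
    return generic_plus_one(41);
}
```

Compiled without optimisation, a file like this would be expected to contain symbols for `generic_plus_one<double>` (always) and `generic_plus_one<int>` (because of the call), but for no other instantiation.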

inline on template specializations

When templates are defined in separate headers, specializations are often defined inline (to avoid multiply defined symbols when the header is included in multiple .cpp files). Marking a specialization like this is supposed to 'almost literally' just swap the inlined definition in wherever it's called, although in practice the keyword mainly relaxes the one-definition rule and the optimiser decides whether a call is actually inlined. So here we retry the above but inline the specializations:

#include <iostream>

// Declare the general template but with no definition
template <typename T>
void unused_template_print(T i);

// Only allow specializations
template <>
inline void unused_template_print<int>(int i)
{
    std::cout << i + 1 << std::endl;
}

template <>
inline void unused_template_print<double>(double i)
{
    std::cout << i + 1.0 << std::endl;
}

// Use function overloading instead
void unused_print(int i)
{
    std::cout << i << std::endl;
}

void unused_print(double i)
{
    std::cout << i << std::endl;
}

int main()
{
    unused_print(1.0);
    unused_template_print(1.0);

    return 0;
}
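For reference, the header layout that motivates `inline` in the first place might look like the sketch below. The file name `plus_one.hpp`, the include guard and the function `plus_one` are all hypothetical. Without `inline`, every .cpp file that includes the header would define the same specializations, and the linker would be expected to report duplicate symbols.

```cpp
// plus_one.hpp: a hypothetical header, assumed to be included from
// several .cpp files in the same program.
#ifndef PLUS_ONE_HPP
#define PLUS_ONE_HPP

// Primary template: declared only, so unsupported types have no definition.
template <typename T>
T plus_one(T value);

// `inline` allows each translation unit that includes this header to
// carry an identical definition without a duplicate-symbol error.
template <>
inline int plus_one<int>(int value)
{
    return value + 1;
}

template <>
inline double plus_one<double>(double value)
{
    return value + 1.0;
}

#endif // PLUS_ONE_HPP
```

The test file above gets the same effect in a single .cpp file, which is enough to observe what the compiler emits.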

Results with inline

g++ test.cpp

objdump -tC a.out | grep "unused"
0000000100003e64 l     F __TEXT,__text __GLOBAL__sub_I_test_unused_func.cpp
0000000100003dd0 g     F __TEXT,__text unused_print(double)
0000000100003d9c g     F __TEXT,__text unused_print(int)
0000000100003d60  w    F __TEXT,__text void unused_template_print<double>(double)

Binary size is 34546 bytes. Cool, this left out the unused template specialization now (the remaining one is flagged w, a weak symbol, rather than g). But we could have done this with our linker optimizations above, so let's just check that again.

g++ test.cpp -ffunction-sections -Wl,-dead_strip

objdump -tC a.out | grep "unused"
0000000100003ea4 l     F __TEXT,__text __GLOBAL__sub_I_test_unused_func.cpp
0000000100003e10 g     F __TEXT,__text unused_print(double)
0000000100003dd4  w    F __TEXT,__text void unused_template_print<double>(double)

Binary size is 34242 bytes. The same functions are left in the binary as before, but it is somehow slightly bigger (34242 vs 34194 bytes previously). Let's try with the optimisations back on.

g++ test.cpp -ffunction-sections -Wl,-dead_strip -O2

objdump -tC a.out | grep "unused"
0000000100003e74 l     F __TEXT,__text_startup __GLOBAL__sub_I_test_unused_func.cpp
0000000100003d40 g     F __TEXT,__text unused_print(double)

Binary size is 34258 bytes. Interestingly, with -O2 the used template specialization is now completely inlined as well, whereas previously it was still there!

g++ test.cpp -ffunction-sections -Wl,-dead_strip -O3

objdump -tC a.out | grep "unused"
0000000100003e84 l     F __TEXT,__text_startup __GLOBAL__sub_I_test_unused_func.cpp

Binary size is 34290 bytes. With -O3, we are back to everything being inlined, and the binary size is identical to the non-inline -O3 build.
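The effect seen in these results probably comes from `inline` rather than from templates as such: a compiler need not emit an inline function that nothing in the translation unit uses. Under that assumption, marking the plain overloads `inline` should get the unused one dropped in the same way. This variant was not part of the measurements above, so treat it as a sketch to try rather than a result; `inline_print` and `call_inline_print` are hypothetical names.

```cpp
#include <iostream>

// Same bodies as the overloads in the test code, but declared inline.
// GCC normally emits an out-of-line copy of an inline function only if
// something in the translation unit actually needs it.
inline void inline_print(int i)
{
    std::cout << i << std::endl;
}

inline void inline_print(double i)
{
    std::cout << i << std::endl;
}

// The only caller. It uses the double overload, so the int overload is
// expected to be absent from the object file even without dead stripping.
void call_inline_print()
{
    inline_print(1.0);
}
```

Checking this with the same `objdump -tC a.out | grep` approach as above would confirm or refute the expectation.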

At a Glance

I put all 3 tests (templates, inlined templates, overloads) into one file and compiled every combination of optimisation level (none to -O3) with and without dead_strip. The results are in the table below.

| | No opt | -O1 | -O2 | -O3 |
| -------- | ------- | ------- | ------- | ------- |
| No dead_strip | TODO | TODO | TODO | TODO |
| With dead_strip | TODO | TODO | TODO | TODO |

Conclusions

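Marking the template specializations inline gets the compiler to drop function definitions earlier than it otherwise would: the unused specialization disappears even with no optimisation flags and no dead_strip, and the used one is inlined away at -O2 instead of -O3. At -O3 itself, everything is effectively the same.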

Versions Used in Testing

g++/gcc 13.2.0, Homebrew (if it matters) on macOS.

Windows and MSVC

A note on the MSVC equivalent of the linker optimization -ffunction-sections -Wl,-dead_strip: the default Release configuration in Visual Studio 2019 appears to enable /OPT:REF and /Gy automatically, which suggests that together they do the same thing (TODO: actually test it, along with the optimisation levels in MSVC).
