Beware macros when using C++ STL <algorithm> functions

Super short summary: Using a bare ntohs with std::transform resulted in huge bloat with both gcc and clang. The preprocessor only replaces ntohs with its built-in equivalent when it sees the function-call form, ntohs(...). So passing std::transform a name that is normally a macro but also has a library fallback sends every element through a real function call, which can slow your code down considerably (in my case, it could no longer keep up with a real-time processing flow).

https://gcc.godbolt.org/z/dM2Ge4 is an MWE (also attached to this gist). As of 4 May 2020, the trunk builds of both compilers with -std=c++17 -Ofast emit a call to the out-of-line library ntohs for the "naked" std::transform call. If your compiler supports lambdas, the second version works around the issue. If not, the hand-written loop in the third version produces exactly the same assembly as the second.

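This is because, if you dig deep enough, you will find in inet/netinet/in.h that ntohs is a function-like macro that expands to __bswap_16. A function-like macro is only expanded when its name is immediately followed by an opening parenthesis, so without the () the preprocessor leaves the name alone and the compiler has to use the library fallback function.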
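The expansion rule can be seen in isolation with a small sketch. The name which_path and its return values are hypothetical, chosen only to make the two paths distinguishable. They stand in for a library function that a header shadows with a function-like macro of the same name.

```cpp
#include <cstdint>

// Hypothetical stand-in for the out-of-line library fallback.
int which_path(std::uint16_t) { return 1; }

// Hypothetical function-like macro with the same name, standing in for the
// header's fast path. It is only expanded when the name is followed by '('.
#define which_path(x) 2

// Name followed by '(': the macro expands, so this returns 2.
int call_with_parens() { return which_path(0x0123); }

// Name followed by ';': no expansion, so the pointer refers to the real
// function and this returns 1. This is the std::transform(..., ntohs) case.
int call_via_pointer() {
    int (*fp)(std::uint16_t) = which_path;
    return fp(0x0123);
}

// Name followed by ')': parenthesizing the name also suppresses expansion.
int call_parenthesized_name() { return (which_path)(0x0123); }
```

Passing the bare name to an algorithm behaves like call_via_pointer: the algorithm only ever receives the address of the real function, so the macro's replacement never gets a chance to apply.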

#include <arpa/inet.h>
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// 1: "naked" name. ntohs is not followed by '(', so the macro is not expanded
//    and std::transform calls the library function for every element.
void my_byte_swap(const uint16_t *in1, const uint16_t *in1_end, uint16_t *out) {
    std::transform(in1, in1_end, out, ntohs);
}

// 2: lambda wrapper. ntohs(x) is a call expression, so the macro expands.
void my_byte_swap2(const uint16_t *in1, const uint16_t *in1_end, uint16_t *out) {
    std::transform(in1, in1_end, out, [](uint16_t x) { return ntohs(x); });
}

// 3: hand-written loop, for compilers without lambda support.
void my_byte_swap3(const uint16_t *in1, const uint16_t *in1_end, uint16_t *out) {
    for (; in1 != in1_end; ++in1, ++out)
        *out = ntohs(*in1);
}

/*
void mymain() {
    std::vector<uint16_t> tempo1(35, 0x0123), tempo2;
    tempo2.resize(tempo1.size());
    my_byte_swap(tempo1.data(), tempo1.data()+tempo1.size(), tempo2.data());
    // my_byte_swap2(tempo1.data(), tempo1.data()+tempo1.size(), tempo2.data());
    // my_byte_swap3(tempo1.data(), tempo1.data()+tempo1.size(), tempo2.data());
    assert(tempo2[34] == 0x2301);  // holds on a little-endian host
}
*/
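For compilers without lambdas, one alternative to the hand-written loop is a small function object that wraps the call. This is a sketch, not part of the original gist: the names NtohsFn and my_byte_swap_functor are made up here, and the generated assembly has not been compared against the lambda version.

```cpp
#include <arpa/inet.h>
#include <algorithm>
#include <cstdint>

// Hypothetical helper: the call inside operator() is written as ntohs(x),
// so the preprocessor can expand the macro there when the header provides one.
struct NtohsFn {
    std::uint16_t operator()(std::uint16_t x) const { return ntohs(x); }
};

// Same signature as the my_byte_swap variants, but passes the function
// object to std::transform instead of the bare ntohs name.
void my_byte_swap_functor(const std::uint16_t *in1, const std::uint16_t *in1_end,
                          std::uint16_t *out) {
    std::transform(in1, in1_end, out, NtohsFn());
}
```

Because NtohsFn is a distinct type with an inline call operator, std::transform is instantiated for it specifically, which gives the optimizer the same chance to inline the swap that the lambda does.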