32-bit byte swap
Version a:
uint32_t byteswap32(uint32_t x)
{
    uint32_t y = (x >> 24) & 0xff;
    y |= (x >> 8) & 0xff00;
    y |= (x << 8) & 0xff0000;
    y |= (x << 24) & 0xff000000u;
    return y;
}
(or any reordering thereof, or substitute |= with +=)
Version b:
uint32_t byteswap32(uint32_t x)
{
    uint32_t y = (x >> 24) & 0xff;
    y |= ((x >> 16) & 0xff) << 8;
    y |= ((x >> 8) & 0xff) << 16;
    y |= (x & 0xff) << 24;
    return y;
}
(or any reordering thereof, or substitute |= with +=)
Version c:
#include <cstdint> // uint8_t, uint32_t
#include <cstring> // memcpy
#include <utility> // std::swap
uint32_t byteswap32(uint32_t x)
{
    // static_assert that sizeof(uint32_t) == 4 if you want.
    uint8_t bytes[4];
    uint32_t y;
    memcpy(bytes, &x, 4);
    std::swap(bytes[0], bytes[3]);
    std::swap(bytes[1], bytes[2]);
    memcpy(&y, bytes, 4);
    return y;
}
(again, up to reordering. Or use type punning through a union. Or use two buffers and copy
with reordering instead of swapping in-place.)
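As a rough sketch of the "two buffers, copy with reordering" variation mentioned above (the function name here is made up for illustration, it is not from any particular codebase):

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical name, chosen only to distinguish this sketch from version c.
std::uint32_t byteswap32_two_buffers(std::uint32_t x)
{
    static_assert(sizeof(std::uint32_t) == 4, "uint32_t must be 4 bytes");
    unsigned char src[4];
    unsigned char dst[4];
    std::memcpy(src, &x, 4);   // bytes of x in memory order
    dst[0] = src[3];           // copy across in reversed order
    dst[1] = src[2];
    dst[2] = src[1];
    dst[3] = src[0];
    std::uint32_t y;
    std::memcpy(&y, dst, 4);
    return y;
}
```

Same result as version c, but the data flow looks quite different to a compiler that is trying to recognize the idiom.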
Version d:
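uint32_t byteswap32(uint32_t x)
{
    // Cast before shifting: if byteswap16 returns uint16_t, the result is
    // promoted to int, and shifting that left by 16 can overflow.
    return ((uint32_t)byteswap16(x & 0xffff) << 16) | byteswap16(x >> 16);
}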
(up to reordering. With multiple possible implementations for byteswap16.)
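For concreteness, here is one possible byteswap16 to go with version d. This is only a sketch of one of the many implementations alluded to above; the casts are there so the 16-bit intermediate never gets shifted as a signed int.

```cpp
#include <cstdint>

// One of several plausible byteswap16 implementations (an assumption,
// the text above deliberately leaves it open).
constexpr std::uint16_t byteswap16(std::uint16_t x)
{
    // x is promoted to int here; the cast truncates back to 16 bits.
    return static_cast<std::uint16_t>((x << 8) | (x >> 8));
}

// Version d built on top of it.
constexpr std::uint32_t byteswap32(std::uint32_t x)
{
    return (static_cast<std::uint32_t>(
                byteswap16(static_cast<std::uint16_t>(x & 0xffffu))) << 16)
         | byteswap16(static_cast<std::uint16_t>(x >> 16));
}
```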
Version e:
uint32_t byteswap32(uint32_t x)
{
    // This looks strange but happens to map directly to 3 PowerPC instructions
    // (rlwinm, rlwimi, rlwimi) that form the standard byte reverse sequence on
    // that target.
    uint32_t y = (x << 24) | (x >> 8); // rlwinm
    y = (y & ~0x00ff0000u) | ((x << 8) & 0x00ff0000u); // rlwimi
    y = (y & ~0x000000ffu) | ((x >> 24) & 0x000000ffu); // rlwimi
    return y;
}
I have seen all these basic variants (and many of the noted variations) in
production code. That's why "just pattern matching during instruction selection"
doesn't work: there is no canonical way this is always written. If you want to
handle this, you can either:
a) perform sufficient analysis to detect any of these patterns, or
b) introduce a canonical way to write it, make that fast, and recommend people use it.
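To illustrate option b), here is a minimal sketch of what a single blessed spelling could look like in user code today. The wrapper name is hypothetical. It assumes GCC or Clang for the `__builtin_bswap32` path and falls back to plain shifts elsewhere; C++23 also offers `std::byteswap` in `<bit>`, which is the same idea made part of the standard library.

```cpp
#include <cstdint>

// Hypothetical wrapper name; the point is that callers only ever write this.
constexpr std::uint32_t byteswap32_canonical(std::uint32_t x)
{
#if defined(__GNUC__) || defined(__clang__)
    // Compiler intrinsic: no pattern recognition required.
    return __builtin_bswap32(x);
#else
    // Portable fallback, equivalent to version a.
    return (x >> 24)
         | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u)
         | (x << 24);
#endif
}
```

With something like this, the compiler never has to guess which of versions a through e the programmer meant.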
Now, I've argued elsewhere that exposing byte swaps directly is kind of unfortunate
in the first place, since byte swaps mostly get used when loading data with a known
endianness on a target architecture that has a different endianness. The preferable
way to handle that is to state the endianness of the data directly, not to have logic
that figures out whether to swap or not. But the same concern applies to other constructs such as,
say, loading a little-endian value by doing
bytes[0] | (bytes[1] << 8) | (bytes[2] << 16) | (bytes[3] << 24)
It's great when you can get people to agree to always write it exactly that way, but there are
many variants floating around, and many such cases to catch. You can make pure
pattern matching work if you make clear from the outset that there is one blessed way
to do, say, an unaligned little-endian load (say, the code sequence above), and ensure that
all compilers handle it correctly. But with C/C++ that ship has sailed; there are many
variants in common use, and different compilers disagree on what the right thing to
pattern-match is, if they implement it at all!
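A sketch of what "state the endianness directly" can look like, with hypothetical helper names. The casts to uint32_t are my addition: with uint8_t inputs, each byte is promoted to int before shifting, so the cast keeps `<< 24` from shifting into the sign bit.

```cpp
#include <cstdint>

// Hypothetical helpers: the caller names the byte order of the data and
// never asks what the host's byte order is.
constexpr std::uint32_t load_le32(const std::uint8_t* p)
{
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

constexpr std::uint32_t load_be32(const std::uint8_t* p)
{
    return (static_cast<std::uint32_t>(p[0]) << 24)
         | (static_cast<std::uint32_t>(p[1]) << 16)
         | (static_cast<std::uint32_t>(p[2]) << 8)
         |  static_cast<std::uint32_t>(p[3]);
}
```

Both functions give the same answer on little-endian and big-endian hosts, and they work on unaligned pointers since they only ever read single bytes. Whether a given compiler turns them into a single load (plus a swap where needed) is exactly the pattern-matching problem described above.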
Again, this is a lot simpler if there's a known construct that people are supposed to
use, and making that an official part of the language is pretty much the only way you
get both the compilers and the users to actually handle it well.