Skip to content

Instantly share code, notes, and snippets.

@orlp
Last active April 21, 2024 15:11
Show Gist options
  • Star 73 You must be signed in to star a gist
  • Fork 10 You must be signed in to fork a gist
  • Save orlp/3551590 to your computer and use it in GitHub Desktop.
Save orlp/3551590 to your computer and use it in GitHub Desktop.
int64_t ipow(int64_t base, uint8_t exp) {
static const uint8_t highest_bit_set[] = {
0, 1, 2, 2, 3, 3, 3, 3,
4, 4, 4, 4, 4, 4, 4, 4,
5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5,
6, 6, 6, 6, 6, 6, 6, 6,
6, 6, 6, 6, 6, 6, 6, 6,
6, 6, 6, 6, 6, 6, 6, 6,
6, 6, 6, 6, 6, 6, 6, 255, // anything past 63 is a guaranteed overflow with base > 1
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
};
int64_t result = 1;
switch (highest_bit_set[exp]) {
case 255: // we use 255 as an overflow marker and return 0 on overflow/underflow
if (base == 1) {
return 1;
}
if (base == -1) {
return 1 - 2 * (exp & 1);
}
return 0;
case 6:
if (exp & 1) result *= base;
exp >>= 1;
base *= base;
case 5:
if (exp & 1) result *= base;
exp >>= 1;
base *= base;
case 4:
if (exp & 1) result *= base;
exp >>= 1;
base *= base;
case 3:
if (exp & 1) result *= base;
exp >>= 1;
base *= base;
case 2:
if (exp & 1) result *= base;
exp >>= 1;
base *= base;
case 1:
if (exp & 1) result *= base;
default:
return result;
}
}
@Clemapfel
Copy link

this can easily be made constexpr / consteval by moving highest_bit_set into the same namespace as the function and making that constexpr too which I recommend as it would be in the spirit of optimal runtime

@Zorgatone
Copy link

@orlp

Does this work with uint64_t as well?
Do I need to add elements to highest_bit_set past exponent 63?

@unclefunctor
Copy link

The optimization here is at a whole new level. Thanks for sharing and well done sir!

@xsbee
Copy link

xsbee commented Jun 18, 2022

Use switch fallthroughs.

@chemoelectric
Copy link

chemoelectric commented Feb 5, 2023

That is some dang clever loop unrolling.

@QVRE
Copy link

QVRE commented Sep 20, 2023

Wouldn't renaming 255 to 7 make this code easier to turn into an array lookup for the compiler ?

@MOJNICK
Copy link

MOJNICK commented Jan 29, 2024

@ian-abbott Have you tested performance for static and non-static "highest_bit_set version"?

@MOJNICK
Copy link

MOJNICK commented Jan 29, 2024

@Zorgatone yes you need to add, It will be equal to 7. And you need to add switch-case 7: . + Change signed parameters/values to unsigned. Or if you are sure the results you are aiming are <= int_64_max then of course you can use this function straight away ( in that case casting rules of C applies )

@ian-abbott
Copy link

ian-abbott commented Jan 30, 2024

@MOJNICK If highest_bit_set is non-static, the code initializes the array contents every time the function is called, so it is slower than the static version.

@nschaeff
Copy link

nschaeff commented Mar 8, 2024

the lookup table for highest bit set can be replaced with __builtin_clz if it is available (gcc, clang). It will make it faster (especially if the lookup table is not in cache, which is likely in a real-world application doing other things than that) and smaller.

Simply replace highest_bit_set[exp] with (exp>0) ? 32 - __builtin_clz(exp) : 0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment