Skip to content

Instantly share code, notes, and snippets.

@shuax
Forked from loicmolinari/fast_pow10_division.md
Created August 31, 2023 02:31
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save shuax/14819c243c9df152569b8553b9f44f53 to your computer and use it in GitHub Desktop.
Save shuax/14819c243c9df152569b8553b9f44f53 to your computer and use it in GitHub Desktop.
Fast power-of-10 division using multiply and shift magic constants in a look-up table instead of a costly division operation.
U64
DivPow10(U64 a, S64 index)
{
    ASSERT(index >= 0 && index < 19);

    // Generated using libdivide's reference implementation adapted to 64-bit unsigned integers.
    static const struct { U64 mul; U32 shr1; U32 shr2; } kPow10Magics[19] = {
        { 0xcccccccccccccccd,  0,  3 },  // 10^1
        { 0x28f5c28f5c28f5c3,  2,  2 },  // 10^2
        { 0x20c49ba5e353f7cf,  3,  4 },  // 10^3
        { 0x346dc5d63886594b,  0, 11 },  // 10^4
        { 0x0a7c5ac471b47843,  5,  7 },  // 10^5
        { 0x431bde82d7b634db,  0, 18 },  // 10^6
        { 0xd6bf94d5e57a42bd,  0, 23 },  // 10^7
        { 0xabcc77118461cefd,  0, 26 },  // 10^8
        { 0x0044b82fa09b5a53,  9, 11 },  // 10^9
        { 0xdbe6fecebdedd5bf,  0, 33 },  // 10^10
        { 0xafebff0bcb24aaff,  0, 36 },  // 10^11
        { 0x232f33025bd42233,  0, 37 },  // 10^12
        { 0x384b84d092ed0385,  0, 41 },  // 10^13
        { 0x0b424dc35095cd81,  0, 42 },  // 10^14
        { 0x00024075f3dceac3, 15, 20 },  // 10^15
        { 0x39a5652fb1137857,  0, 51 },  // 10^16
        { 0x00005c3bd5191b53, 17, 22 },  // 10^17
        { 0x000049c97747490f, 18, 24 },  // 10^18
        { 0x760f253edb4ab0d3,  0, 62 }   // 10^19
    };
    // '& 63' is a compiler hint indicating shr1 is in [0, 63]. Otherwise, depending on the
    // architecture, the compiler adds various workarounds since right shifting by a count greater
    // than or equal to the bit width is Undefined Behavior. Without the hint on x86-64, both GCC
    // and clang would emit 'shrd' (or 'shrx') followed by a conditional move to keep the original
    // value when the count is 64.
    const U128 r1 = (U128)a >> (kPow10Magics[index].shr1 & 63);
    const U128 r2 = r1 * (U128)kPow10Magics[index].mul;
    // '>> 64' is another compiler hint indicating it can simply get rid of the least significant
    // 64-bit register from the destination register pair used by the 'mul' above.
    const U64 r3 = r2 >> 64;
    return r3 >> (kPow10Magics[index].shr2 & 63);
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment