Skip to content

Instantly share code, notes, and snippets.

@foobaz
Last active September 1, 2024 07:25
Show Gist options
  • Save foobaz/3287f153d125277eefea to your computer and use it in GitHub Desktop.
Save foobaz/3287f153d125277eefea to your computer and use it in GitHub Desktop.
Integer square root algorithm
/*
Just like integer division, these functions round down to an integer.
Running this program should produce the following output:
sqrt(3735928559) == 61122
61122^2 == 3735898884
sqrt(244837814094590) == 15647294
15647294^2 == 244837809522436
This algorithm comes from Jack W. Crenshaw's 1998 article in Embedded:
http://www.embedded.com/electronics-blogs/programmer-s-toolbox/4219659/Integer-Square-Roots
*/
#include <stdio.h>
#include <stdint.h>
uint16_t gapsqrt32(uint32_t a);
uint32_t gapsqrt64(uint64_t a);
int main(void) {
uint32_t small = 0xdeadbeef;
uint32_t sqrt_small = gapsqrt32(small);
uint32_t check_small = sqrt_small * sqrt_small;
printf("sqrt(%u) == %u\n", small, sqrt_small);
printf("%u^2 == %u\n", sqrt_small, check_small);
uint64_t large = 0xdeadbeefcafe;
uint64_t sqrt_large = gapsqrt64(large);
uint64_t check_large = sqrt_large * sqrt_large;
printf("sqrt(%lu) == %lu\n", large, sqrt_large);
printf("%lu^2 == %lu\n", sqrt_large, check_large);
return 0;
}
uint16_t gapsqrt32(uint32_t a) {
uint32_t rem = 0, root = 0;
for (int i = 32 / 2; i > 0; i--) {
root <<= 1;
rem = (rem << 2) | (a >> (32 - 2));
a <<= 2;
if (root < rem) {
rem -= root | 1;
root += 2;
}
}
return root >> 1;
}
uint32_t gapsqrt64(uint64_t a) {
uint64_t rem = 0, root = 0;
for (int i = 64 / 2; i > 0; i--) {
root <<= 1;
rem = (rem << 2) | (a >> (64 - 2));
a <<= 2;
if (root < rem) {
rem -= root | 1;
root += 2;
}
}
return root >> 1;
}
@JensGrabner
Copy link

`// integer square root of a positive number
// http://ww1.microchip.com/downloads/en/AppNotes/91040a.pdf
uint32_t sqrt(uint64_t num) {
uint32_t res = 0; // result
uint32_t add = 1;
add <<= 31; // 'additional' bit is in position 31
uint32_t temp; // 'A'
uint64_t quad; // 'A^2'

while ( add > 0 ) { // 32x test and shift right
temp = res + add;
quad = temp;
quad *= temp;
if ( num >= quad ) {
res = temp;
}
add >>= 1; // shift right the 'additional' bit
}
return res;
}
`

@Aclassifier
Copy link

Aclassifier commented Aug 31, 2024

Thanks! The good thing about the code is that they produce the right answer for the below extreme MAX cases. Squaring these roots will produce a slightly lower value than the starting point. 32 bits root then squared do not leak into the 64 bits, and 64 bits of the same does not leak into 128 bits. Here are the two cases, each for decimal and hex:

//    UINT32_MAX         0xFFFFFFFF           4294967295
//    UINT64_MAX 0xFFFFFFFFFFFFFFFF 18446744073709551615
// 
//    gapsqrt32(4294967295) =  65535 squared = 4294836225 ( 131070 lower than UINT32_MAX)
//    gapsqrt32(0xFFFFFFFF) = 0xFFFF squared = 0xFFFE0001 (0x1FFFE lower than UINT32_MAX)
// 
//    gapsqrt64(18446744073709551615) = 4294967295 squared = 18446744065119617025  (8589934590 lower than UINT64_MAX)
//    gapsqrt64(0xFFFFFFFFFFFFFFFF)   = 0xFFFFFFFF squared =   0xFFFFFFFE00000001 (0x1FFFFFFFE lower than UINT64_MAX)

I thought this was obvious until I detected than the library I had used with a slightly different implementation had got this wrong for its dsp_math_int_sqrt function. The root was 65536 which of course when squared causes BIT32 to be set.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment