@rmunn
Created July 20, 2017 04:50
Bitcount (aka popcount) implementation in F#, for 32 and 64-bit ints
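let bitcount (n : int) =
    // Each 2-bit group now holds the number of set bits in that group.
    let count2 = n - ((n >>> 1) &&& 0x55555555)
    // Add adjacent 2-bit counts into 4-bit counts.
    let count4 = (count2 &&& 0x33333333) + ((count2 >>> 2) &&& 0x33333333)
    // Add adjacent 4-bit counts into 8-bit counts.
    let count8 = (count4 + (count4 >>> 4)) &&& 0x0f0f0f0f
    // Sum the four bytes into the top byte, then shift it down.
    (count8 * 0x01010101) >>> 24

let bitcount64 (n : int64) =
    let count2 = n - ((n >>> 1) &&& 0x5555555555555555L)
    let count4 = (count2 &&& 0x3333333333333333L) + ((count2 >>> 2) &&& 0x3333333333333333L)
    let count8 = (count4 + (count4 >>> 4)) &&& 0x0f0f0f0f0f0f0f0fL
    // Sum the eight bytes into the top byte, then shift it down.
    (count8 * 0x0101010101010101L) >>> 56 |> int

bitcount -1 // Result: 32
bitcount64 (-1L) // Result: 64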

rmunn commented Jul 20, 2017

The way this works is:

  1. First, divide the original number conceptually into groups of 2 bits, e.g. AABBCCDD. Subtracting the high bit of each group from the group's 2-bit value leaves that group's bit count, so count2 has the structure aabbccdd, where aa contains the bit count of AA. I.e., if AA was 11, aa will be 10; if AA was either 01 or 10, aa will be 01; and if AA was 00, aa will also be 00. (Check it for yourself via bit math if you want.)
  2. Now sum these 2-bit counts into 4-bit counts: count4 now has the structure bbbbdddd, where bbbb in count4 is equal to aa+bb in count2, dddd in count4 is equal to cc+dd in count2, and so on.
  3. Now sum these 4-bit counts into 8-bit counts the same way: count8 now has, every 8 bits, the bitcount of the corresponding 8 bits from the original number. Since the maximum number of bits set in 8 bits is, of course, 8, every 8 bits of count8 must have the pattern 0000nnnn, where nnnn can be, at most, 8 (or 1000). So the top four bits of every 8 bits of count8 are guaranteed to be 0.
  4. Finally, add up all those 8-bit counts in the top 8 bits of the integer, and then downshift them into the bottom 8 bits. There's the result.
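
The four steps above can be traced on a single value. This is a minimal sketch: bitcountSteps is a hypothetical helper (not part of the gist) that repeats the body of bitcount but returns every intermediate, and the input 0b11011000 is just an illustrative choice with AA = 11, BB = 01, CC = 10, DD = 00.

```fsharp
// Hypothetical helper: same arithmetic as bitcount, but it returns
// each intermediate value so the steps can be inspected.
let bitcountSteps (n : int) =
    let count2 = n - ((n >>> 1) &&& 0x55555555)
    let count4 = (count2 &&& 0x33333333) + ((count2 >>> 2) &&& 0x33333333)
    let count8 = (count4 + (count4 >>> 4)) &&& 0x0f0f0f0f
    let total = (count8 * 0x01010101) >>> 24
    (count2, count4, count8, total)

// Input 11 01 10 00 in binary (0xD8), which has four bits set.
let (c2, c4, c8, total) = bitcountSteps 0b11011000
printfn "count2 = %08x" c2     // count2 = 00000094  (pairs 10 01 01 00 -> 2,1,1,0)
printfn "count4 = %08x" c4     // count4 = 00000031  (nibbles 0011 0001 -> 3,1)
printfn "count8 = %08x" c8     // count8 = 00000004  (byte 00000100 -> 4)
printfn "total  = %d" total    // total  = 4
```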

Multiplying by 0x01010101 is just a clever, and more efficient, way of doing (count8 <<< 0) + (count8 <<< 8) + (count8 <<< 16) + (count8 <<< 24). (The parentheses matter in F#, where + binds more tightly than <<<.) The top 8 bits of that number end up being the sum of all those 8-bit values, and there's no danger of overflow interfering: every 8-bit segment of count8 is at most 8, so even the full sum of four segments is at most 32 and no byte position carries into the next.
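
To check the equivalence described above, here is a small sketch that computes the byte sum both ways. The function names sumBytesByShifts and sumBytesByMultiply are hypothetical, and the two inputs are illustrative values whose bytes are each at most 8, as count8 guarantees.

```fsharp
// Byte sum written out as four shifted additions.
let sumBytesByShifts (count8 : int) =
    ((count8 <<< 0) + (count8 <<< 8) + (count8 <<< 16) + (count8 <<< 24)) >>> 24

// The same sum via a single multiplication, as in bitcount.
let sumBytesByMultiply (count8 : int) =
    (count8 * 0x01010101) >>> 24

// Bytes 1, 2, 3, 4 sum to 10.
printfn "%d %d" (sumBytesByShifts 0x01020304) (sumBytesByMultiply 0x01020304)  // prints "10 10"
// Worst case for 32 bits: every byte is 8, so the sum is 32.
printfn "%d %d" (sumBytesByShifts 0x08080808) (sumBytesByMultiply 0x08080808)  // prints "32 32"
```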

The 64-bit function works exactly the same way, except that the final step sums eight 8-bit counts instead of four. Each count is still at most 8, so the total is at most 64 (or 01000000), which fits in the top 8 bits. That's enough to ensure no overflow in the final multiplication step.
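
One way to gain confidence in the 64-bit version is to compare it against a slow bit-by-bit count. The sketch below repeats bitcount64 so that it is self-contained; naiveBitcount64 and the sample list are hypothetical additions for illustration only.

```fsharp
open System

let bitcount64 (n : int64) =
    let count2 = n - ((n >>> 1) &&& 0x5555555555555555L)
    let count4 = (count2 &&& 0x3333333333333333L) + ((count2 >>> 2) &&& 0x3333333333333333L)
    let count8 = (count4 + (count4 >>> 4)) &&& 0x0f0f0f0f0f0f0f0fL
    (count8 * 0x0101010101010101L) >>> 56 |> int

// Hypothetical reference implementation: test each of the 64 bit positions.
let naiveBitcount64 (n : int64) =
    let mutable count = 0
    for i in 0 .. 63 do
        if ((n >>> i) &&& 1L) = 1L then count <- count + 1
    count

// Illustrative inputs, including negative values and both extremes.
let samples = [ 0L; 1L; -1L; Int64.MinValue; Int64.MaxValue; 0x0123456789ABCDEFL ]
let allMatch = samples |> List.forall (fun n -> bitcount64 n = naiveBitcount64 n)
printfn "%b" allMatch  // prints "true"
```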

If you have direct access to the processor, the CPU's popcnt instruction is the best way to go, but at the time of writing that was not exposed to F# (or C#), so this was the next best approach.
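
On newer runtimes the situation is different. As a minimal sketch, assuming .NET Core 3.0 or later, System.Numerics.BitOperations.PopCount can be called from F#. It takes unsigned values, so the signed inputs are converted first. The wrapper names popcount32 and popcount64 are hypothetical, and whether the call compiles down to a hardware popcnt depends on the runtime and CPU.

```fsharp
open System.Numerics

// Assumes .NET Core 3.0 or later, where BitOperations is available.
// uint32 / uint64 reinterpret the two's-complement bits, so -1 becomes all ones.
let popcount32 (n : int) = BitOperations.PopCount(uint32 n)
let popcount64 (n : int64) = BitOperations.PopCount(uint64 n)

printfn "%d" (popcount32 (-1))   // prints "32"
printfn "%d" (popcount64 (-1L))  // prints "64"
```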

Source: https://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel (which notes that this algorithm is in the public domain).
