pervognsen/bit_algebra.md

## bit_algebra.md

      
    Raw
  

              bit_algebra.md
            
          
    Here's an example of an algebraic approach to bitwise operations. I'm intentionally choosing something that is obvious
from the programmer's point of view to see what it corresponds to algebraically.
We know that bitwise rotation of an n-bit word x can be done with left/shift shifts:
(x << 1) | (x >> (n-1)) = (x << 1) ^ (x >> (n-1)).

Algebraically, this means that the rotation operator C satisfies C = L + R^(n-1). Since C is a rotation it must satisfy C^n = 1, i.e. if
we rotate n times we should end up where we started. This corresponds to the identity (L + R^(n-1))^n = 1.
What do we know about L and R algebraically as operators? They are shifts, so they satisfy L^n = 0 and R^n = 0: if we shift an n-bit word n times, it will yield 0 regardless of the initial value.  We'll discover further algebraic properties of L and R along the way.
The goal of this exercise isn't to prove the obvious fact about shifts and rotations but to explore the algebra.
To get warmed up, let's start with n = 2, so we want to prove (L + R)^2 = 1. Expanding this with the distributive law, we get
(L + R)^2 = L^2 + LR + RL + R^2 = LR + RL, where we used that L^2 = 0 and R^2 = 0. Now consider what LR and RL mean. LR means "right shift then left shift". This leaves the second bit in its place and sets the first bit to 0. Similarly, RL means "left shift then right shit" and leaves the first bit in its place and sets the second bit to 0. We call these projections onto the first and second bit. So LR = P_1 and RL = P_0. It's then obvious that 1 = P_1 + P_0. In bitwise terms, this says that x == (x & 1) ^ (x & 2), so P_i corresponds to bitwise masking by 1 << i.
To tackle the general case we can expand (L + R^(n-1))^n. But to avoid getting lost in the noise, we need to take a step back and consider
what will happen. Our expectation from the first example is that n of the terms will equal the P_i projections and everything else will be 0.
If I told you as a programmer to isolate the ith bit with left and right shifts you'd probably come up with this:
x & (1 << i) == ((x >> i) << (n-1)) >> (n-i-1)

In algebraic terms, this means that P_i = R^(n-i-1) L^(n-1) R^i. However, note that our definition of C only involves R^(n-1), so can't compute arbitrary powers like R^i. So let's use an alternative expression where we started by left shifting:
x & (1 << i) == ((x << (n-i-1)) >> (n-1)) << i

This corresponds to P_i = L^(n-i-1) R^(n-1) L^i, which only involves R^(n-1).
For the n = 3 case we have:
P_0 = L^2 R^2
P_1 = L R^2 L
P_2 = R^2 L^2

You can see the first and last case only requires two shifts.
You should have noticed that these P_i projections involve exactly one R^2 = R^(n-1) factor. In the expansion of (L + R^(n-1))^n, we can therefore usefully classify terms in three groups:

Term with no R^(n-1) factor: The only such term is L^n, and L^n = 0.
Term with one R^(n-1) factor: There are n such terms. These are our projections P_i.
Term with more than one R^(n-1) factor. We want to prove these are all 0.

The proof of this last claim should be pretty clear: if a term contains two or more R^(n-1) factors, after the first R^(n-1) shift, only 1 entry of the bit vector can be nonzero. Because there's not enough factors of L to have an intervening L^(n-1) factor to cancel out the first R^(n-1) factor (with 2 or more factors of R^(n-2) there's at most n-2 factors of L), the second R^(n-1) factor is guaranteed to kill the last surviving entry.