Skip to content

Instantly share code, notes, and snippets.

@corburn
Created January 10, 2014 19:36
Show Gist options
  • Save corburn/8361068 to your computer and use it in GitHub Desktop.
Save corburn/8361068 to your computer and use it in GitHub Desktop.
MIT 6.172 Performance Engineering of Software Systems Open Courseware Lecture 2
// Swap two integers x and y without a temporary
// Performance poor with instruction-level parallelism (ILP)
// Each step has to wait for the previous step to compute before it can execute
x = x ^ y
y = x ^ y
x = x ^ y
// Minimum of two integers x and y without a branch.
// Can execute in about 4 cycles. Without a branch, the CPU doesn't have to clear
// the pipeline if it predicts the wrong branch.
min = y ^ ((x ^ y) & -(x ^ y))
// Modular Addition r = (x + y) % n
r = z - (n & -(z >= n))
// Round up to a power of 2
// Example for a 64 bit integer
// This will flood the lower bits with one and increment to get the next power of 2.
// The first step is to decrement so it will not round if the number is already a
// power of 2
--n
n |= n >> 1
n |= n >> 2
n |= n >> 4
n |= n >> 8
n |= n >> 16
n |= n >> 32
++n
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment