@FrankNiemeyer
Created June 9, 2015 09:01
for j = 0 to (m / 4) - 1 do // assume m % 4 = 0
    let i = 4 * j
    // load four consecutive components of each coordinate, e.g. [x0, x1, x2, x3], into a SIMD register
    let xs = u.x.[i .. i+3]
    let ys = u.y.[i .. i+3]
    let zs = u.z.[i .. i+3]
    // component-wise multiplication and addition -> 4 dot products (each vector with itself)
    dp.[i .. i+3] <- xs * xs + ys * ys + zs * zs
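
The loop above is pseudocode: F# array slices do not map to SIMD registers on their own. A minimal runnable sketch of the same idea, assuming `System.Numerics.Vector<float32>` as the SIMD type and a structure-of-arrays record, might look like the following. The names `Vectors` and `selfDotProducts` are hypothetical and not from the original snippet, and the lane count comes from `Vector<float32>.Count` rather than being fixed at 4, so a scalar tail loop replaces the `m % 4 = 0` assumption.

```fsharp
open System.Numerics

// Hypothetical structure-of-arrays layout: one array per coordinate.
type Vectors = { x: float32[]; y: float32[]; z: float32[] }

/// Dot product of every vector in `u` with itself (its squared length).
let selfDotProducts (u: Vectors) : float32[] =
    let m = u.x.Length
    let dp = Array.zeroCreate<float32> m
    // Number of float32 lanes per SIMD register; hardware-dependent, not necessarily 4.
    let lanes = Vector<float32>.Count
    let mutable i = 0
    // Vectorized part: process `lanes` vectors per iteration.
    while i + lanes <= m do
        let xs = Vector<float32>(u.x, i)
        let ys = Vector<float32>(u.y, i)
        let zs = Vector<float32>(u.z, i)
        // Component-wise multiply and add, then store `lanes` results at once.
        (xs * xs + ys * ys + zs * zs).CopyTo(dp, i)
        i <- i + lanes
    // Scalar tail for the elements that do not fill a whole register.
    while i < m do
        dp.[i] <- u.x.[i] * u.x.[i] + u.y.[i] * u.y.[i] + u.z.[i] * u.z.[i]
        i <- i + 1
    dp
```

Whether the vectorized loop actually runs on SIMD hardware depends on the runtime and JIT; `Vector.IsHardwareAccelerated` reports this at run time.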