Navigation Menu

Skip to content

Instantly share code, notes, and snippets.

@rygorous
Last active June 19, 2021 21:06
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save rygorous/f5e11f6589088f42dddd2711582afb1e to your computer and use it in GitHub Desktop.
Save rygorous/f5e11f6589088f42dddd2711582afb1e to your computer and use it in GitHub Desktop.
Single-precision FMA using double-precision mul/add ops
We agreed beforehand that we don't care about tininess-before-or-after-rounding shenanigans. :)
With that out of the way, the idea is to do something like this:
fma(af, bf, cf):
# All arithmetic done with RN
ad = float_to_double(af)
bd = float_to_double(bf)
cd = float_to_double(cf)
# exact: double has sufficient exponent range and mantissa bits
prod = mul_double(ad, bd)
# not necessarily exact!
sum0 = add_double(prod, cd)
# round-to-odd fixup
# ultimately we just need to determine whether the sum above was
# exact or not
fixup0 = sub_double(sum0, prod)
if False:
# VERSION 1 (would be nicer, but alas... it doesn't work)
# take prod = 1.0, cd = 2^54 for a counterexample
fixup1 = cmpneq_double(fixup0, cd)
else:
# VERSION 2 (more work but this one I'm pretty confident about)
# this is the rest of two-sum(prod, cd)
err1 = sub_double(sum0, fixup0)
err2 = sub_double(prod, err1)
err3 = sub_double(cd, fixup0)
err4 = add_double(err2, err3)
# err4 is now the exact error in the sum above
# check whether it's nonzero
fixup1 = cmpneq_double(err4, 0.0)
fixup2 = bitwise_and(fixup1, 1)
sum1 = bitwise_or(sum0, fixup2)
# round result to float
# round-to-odd above makes double rounding give the correct result
return convert_to_float(sum1)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment