@simonbyrne
Created December 3, 2014 17:25
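# thin wrappers around the LLVM rounding intrinsics, called via llvmcall
# rint: round to an integer using the current rounding mode (ties to even by default)
function rint_llvm(x::Float64)
    Base.llvmcall("""%x = call double @llvm.rint.f64(double %0)
    ret double %x""",Float64,(Float64,),x)
end
# trunc: round towards zero
function trunc_llvm(x::Float64)
    Base.llvmcall("""%x = call double @llvm.trunc.f64(double %0)
    ret double %x""",Float64,(Float64,),x)
end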
# warm up llvmcall: ignore errors
rint_llvm(1.0)
rint_llvm(1.0)
trunc_llvm(1.0)
trunc_llvm(1.0)
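# use "add remainder" trick: add the fractional part x-t a second time,
# so the sum crosses the next integer when |x-t| >= 0.5, then truncate
function round_rem(x::Float64)
    t = trunc_llvm(x)
    trunc_llvm(x+(x-t))
end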
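# use "prevfloat(0.5)" trick: add 0.49999999999999994 == prevfloat(0.5),
# the largest Float64 below 0.5, with the sign of x, then truncate
round_prev(x::Float64) = trunc_llvm(x+copysign(0.49999999999999994,x))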
round_rem(1.0)
round_rem(1.0)
round_prev(1.0)
round_prev(1.0)
function test_rint_llvm(N)
    t = 1.0
    s = 0.0
    for i = 1:N
        t += eps()
        s += rint_llvm(t)
    end
    s
end
function test_round_rem(N)
    t = 1.0
    s = 0.0
    for i = 1:N
        t += eps()
        s += round_rem(t)
    end
    s
end
function test_round_prev(N)
    t = 1.0
    s = 0.0
    for i = 1:N
        t += eps()
        s += round_prev(t)
    end
    s
end
@time test_rint_llvm(100_000_000);
@time test_rint_llvm(100_000_000);
@time test_round_rem(100_000_000);
@time test_round_rem(100_000_000);
@time test_round_prev(100_000_000);
@time test_round_prev(100_000_000);
@simonbyrne
Author

Times after warm-up:

julia> @time test_rint_llvm(100_000_000);
elapsed time: 0.212698866 seconds (96 bytes allocated)

julia> @time test_round_rem(100_000_000);
elapsed time: 0.536767063 seconds (96 bytes allocated)

julia> @time test_round_prev(100_000_000);
elapsed time: 0.219526082 seconds (96 bytes allocated)

julia> versioninfo()
Julia Version 0.4.0-dev+1916
Commit 11d5e8c* (2014-12-02 12:53 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin13.4.0)
  CPU: Intel(R) Core(TM) i7-2677M CPU @ 1.80GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Sandybridge)
  LAPACK: libopenblas
  LIBM: libopenlibm
  LLVM: libLLVM-3.3

Looking at the @code_native output, only scalar (unvectorised) vroundsd instructions were used.
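
A minimal sketch of why the second trick adds prevfloat(0.5) rather than 0.5, and of how the two tricks compare with rint on exact ties. It assumes a current Julia (1.x) and uses Base.trunc in place of the trunc_llvm wrapper from the gist; the names naive_round, prev_round and rem_round are hypothetical and exist only for this illustration.

```julia
# Sketch only: Base.trunc stands in for the gist's trunc_llvm wrapper.
naive_round(x::Float64) = trunc(x + copysign(0.5, x))             # hypothetical: adds 0.5 itself
prev_round(x::Float64)  = trunc(x + copysign(prevfloat(0.5), x))  # same idea as round_prev
rem_round(x::Float64)   = (t = trunc(x); trunc(x + (x - t)))      # same idea as round_rem

x = prevfloat(0.5)   # 0.49999999999999994, the largest Float64 below 0.5

naive_round(x)   # 1.0: x + 0.5 is not representable and rounds up to 1.0
prev_round(x)    # 0.0
rem_round(x)     # 0.0

# On the handful of values tried here, both tricks agree with
# round-half-away-from-zero as provided by Base.
for v in (0.5, 1.5, 2.5, -0.5, -2.5, x, -x, 1.0 + eps())
    @assert prev_round(v) == rem_round(v) == round(v, RoundNearestTiesAway)
end

# rint follows the default ties-to-even mode, so it differs on exact ties.
round(2.5)        # 2.0 (ties to even, like rint)
prev_round(2.5)   # 3.0 (ties away from zero)
```

The benchmark loops above start just over 1.0 and step by eps(), so they never hit an exact tie; the timings compare speed only, while the results of rint and the two trunc-based versions would differ on inputs such as 0.5 or 2.5.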
