@toivoh
Created November 2, 2014 09:08
Trying to get Julia to emit an efficient sequence of SIMD instructions with the aid of llvmcall
module TestSIMD2

# requires Julia 0.4 for llvmcall

# Pair of 64-bit unsigned integers, intended to map onto an LLVM <2 x i64> vector
typealias Uint64x2 NTuple{2, Uint64}

# Elementwise xor of two pairs, written as a single LLVM vector xor
function ($)(x::Uint64x2, y::Uint64x2)
    Base.llvmcall("""%3 = xor <2 x i64> %1, %0
                     ret <2 x i64> %3""",
                  Uint64x2, (Uint64x2, Uint64x2), x, y)
end

# xor the pair at src_ofs into the pair at dest_ofs (offsets are counted in pairs)
function innerloop!{T}(dest::Vector{T}, dest_ofs, src::Vector{T}, src_ofs)
    @inbounds s = ( src[1 + 2*src_ofs],  src[2 + 2*src_ofs])
    @inbounds d = (dest[1 + 2*dest_ofs], dest[2 + 2*dest_ofs])
    d $= s
    @inbounds (dest[1 + 2*dest_ofs], dest[2 + 2*dest_ofs]) = d
end

T = Uint64
# Print the native code generated for innerloop!
code_native(innerloop!, (Vector{T}, Int, Vector{T}, Int))

end
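
The gist above targets a 2014 development build of Julia 0.4. For readers on Julia 1.x, the same vector xor can be sketched roughly as below. This is a hedged adaptation, not the author's code: it assumes a standard Julia 1.x build, uses `VecElement` tuples (which current Julia lowers to LLVM vector types), and the helper names `vxor` and `load2` are hypothetical, introduced here only for illustration.

```julia
# A homogeneous tuple of VecElement is lowered to an LLVM vector, here <2 x i64>.
const UInt64x2 = NTuple{2, VecElement{UInt64}}

# Hypothetical helper: the gist overloads `$`, which was the xor operator at the time.
@inline function vxor(x::UInt64x2, y::UInt64x2)
    # %0 and %1 are the two arguments; %3 is the first unnamed result value.
    Base.llvmcall("""%3 = xor <2 x i64> %1, %0
                     ret <2 x i64> %3""",
                  UInt64x2, Tuple{UInt64x2, UInt64x2}, x, y)
end

# Hypothetical helper: pack two consecutive array elements into one vector value.
# Like the gist, this skips bounds checks, so the caller must pass a valid offset.
@inline function load2(a::Vector{UInt64}, ofs::Int)
    @inbounds return (VecElement(a[1 + 2*ofs]), VecElement(a[2 + 2*ofs]))
end

# xor the pair at src_ofs into the pair at dest_ofs (offsets are counted in pairs).
function innerloop!(dest::Vector{UInt64}, dest_ofs::Int,
                    src::Vector{UInt64}, src_ofs::Int)
    s = load2(src, src_ofs)
    d = vxor(load2(dest, dest_ofs), s)
    @inbounds dest[1 + 2*dest_ofs] = d[1].value
    @inbounds dest[2 + 2*dest_ofs] = d[2].value
    return dest
end

# Example call on two small arrays.
dest = UInt64[0x0f, 0xf0, 1, 2]
src  = UInt64[0xff, 0xff, 3, 3]
innerloop!(dest, 0, src, 0)
```

In the REPL, `@code_native innerloop!(dest, 0, src, 0)` plays the role of the `code_native` call in the gist. Whether the loads and stores are fused into vector instructions still depends on the Julia and LLVM versions in use.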

Keno commented Nov 2, 2014

Unfortunately, Julia's arrays can also hold unaligned data, so the compiler cannot assume that the loads and stores here are aligned. For fun, I told LLVM that the data is aligned anyway, but that does not seem to make a difference in the generated code.

Keno commented Nov 2, 2014

Note: If you really want to, you can use llvmcall with a load <2 x i64> ... align 16 to get the desired behavior.
