@toivoh
Last active August 29, 2015 14:08
Trying out the SLP vectorizer with Julia
module TestSLP
# Read-modify-write of two adjacent elements through raw pointers.
function rmw!{T}(dest::Ptr{T}, src::Ptr{T})
    s1 = unsafe_load(src, 1)
    s2 = unsafe_load(src, 2)
    d1 = unsafe_load(dest, 1)
    d2 = unsafe_load(dest, 2)
    d1 $= s1  # `$` is bitwise xor in this Julia version
    d2 $= s2
    unsafe_store!(dest, d1, 1)
    unsafe_store!(dest, d2, 2)
    nothing
end
# Same operation on Vectors, with both loads and stores going through pointers.
function rmw2!{T}(dest::Vector{T}, src::Vector{T})
    psrc, pdest = pointer(src), pointer(dest)
    s1 = unsafe_load(psrc, 1)
    s2 = unsafe_load(psrc, 2)
    d1 = unsafe_load(pdest, 1)
    d2 = unsafe_load(pdest, 2)
    d1 $= s1
    d2 $= s2
    unsafe_store!(pdest, d1, 1)
    unsafe_store!(pdest, d2, 2)
    nothing
end
# Mixed variant: @inbounds array loads, but stores through a pointer.
function rmw2b!{T}(dest::Vector{T}, src::Vector{T})
    @inbounds s1 = src[1]
    @inbounds s2 = src[2]
    @inbounds d1 = dest[1]
    @inbounds d2 = dest[2]
    d1 $= s1
    d2 $= s2
    pdest = pointer(dest)
    unsafe_store!(pdest, d1, 1)
    unsafe_store!(pdest, d2, 2)
    nothing
end
# Pure array variant: @inbounds indexing for both loads and stores.
function rmw3!{T}(dest::Vector{T}, src::Vector{T})
    @inbounds s1 = src[1]
    @inbounds s2 = src[2]
    @inbounds d1 = dest[1]
    @inbounds d2 = dest[2]
    d1 $= s1
    d2 $= s2
    @inbounds dest[1] = d1
    @inbounds dest[2] = d2
    nothing
end
println("\n\nrmw!: ")
code_native(rmw!, (Ptr{Uint64}, Ptr{Uint64}))
println("\n\nrmw2!: ")
code_native(rmw2!, (Vector{Uint64}, Vector{Uint64}))
println("\n\nrmw2b!: ")
code_native(rmw2b!, (Vector{Uint64}, Vector{Uint64}))
println("\n\nrmw3!: ")
code_native(rmw3!, (Vector{Uint64}, Vector{Uint64}))
end # module
toivoh (author) commented Nov 11, 2014

The problem seems to have something to do with the stores, though: with LLVM 3.5.0 and the adr/slpvector branch I can also vectorize rmw2b! (just added above), which uses @inbounds for the loads but unsafe_store! for the stores. I tried to compare the code_llvm output of rmw2b! and rmw3!, but the main difference I see is that the first has been vectorized :) Maybe there should be a version of code_llvm that shows the LLVM code before optimization.
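
For readers on a current Julia release, the wish for unoptimized output can be sketched as follows. This is a minimal sketch, not something the gist's Julia version supports: it assumes Julia 1.x, where code_llvm accepts an optimize keyword, and it rewrites rmw3! in modern syntax (xor instead of `$`, UInt64 instead of Uint64, `where {T}` instead of `{T}` after the function name). The helper name llvm_ir is hypothetical and exists only for this illustration.

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

# Modern-syntax stand-in for rmw3! from the gist (assumption: Julia 1.x).
function rmw3!(dest::Vector{T}, src::Vector{T}) where {T}
    @inbounds begin
        d1 = xor(dest[1], src[1])
        d2 = xor(dest[2], src[2])
        dest[1] = d1
        dest[2] = d2
    end
    nothing
end

# Hypothetical helper: capture the LLVM IR as a string so that the
# optimized and unoptimized versions can be compared side by side.
function llvm_ir(f, types; optimize::Bool)
    io = IOBuffer()
    code_llvm(io, f, types; optimize=optimize)
    String(take!(io))
end

argtypes = (Vector{UInt64}, Vector{UInt64})
unopt = llvm_ir(rmw3!, argtypes; optimize=false)  # IR before LLVM's optimization passes
opt   = llvm_ir(rmw3!, argtypes; optimize=true)   # IR after optimization

println(unopt)
```

Diffing unopt against opt shows what the optimization passes changed; whether the SLP vectorizer fires at all still depends on the Julia and LLVM versions in use.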
