Skip to content

Instantly share code, notes, and snippets.

@simonster
Last active August 29, 2015 14:11
Show Gist options
  • Save simonster/b1b4cc2ad0daa8e20a99 to your computer and use it in GitHub Desktop.
Save simonster/b1b4cc2ad0daa8e20a99 to your computer and use it in GitHub Desktop.
missingness sum benchmark
immutable CheapMulBool
x::Bool
end
immutable Negated{T}
x::T
end
immutable BitWrapper
x::Vector{Uint64}
end
Base.getindex(x::BitWrapper, i) = CheapMulBool(!Base.unsafe_bitgetindex(x.x, i))
Base.getindex(x::Negated, i) = CheapMulBool(unsafe_load(convert(Ptr{Uint8}, pointer(x.x)), i) == 0)
*(x::CheapMulBool, y::Number) = ifelse(x.x, y, zero(y))
function f(x, y)
z = zero(eltype(x))
@simd for i = 1:length(x)
@inbounds z += y[i]*x[i]
end
z
end
x = randn(10000000)
ybool = rand(Bool, length(x))
ybit = bitpack(ybool)
yfloat = convert(Vector{Float64}, !ybool)
f(x, Negated(ybool))
f(x, BitWrapper(ybit.chunks))
f(x, yfloat)
@time for i = 1:50 f(x, Negated(ybool)) end
@time for i = 1:50 f(x, BitWrapper(ybit.chunks)) end
@time for i = 1:50 f(x, yfloat) end
@simonster
Copy link
Author

And for comparison, the NaN-based approach, again on LLVM SVN:

function g(x)
    z = zero(eltype(x))
    @simd for i = 1:length(x)
        @inbounds z += ifelse(isnan(x[i]), zero(eltype(x[i])), x[i])
    end
    z
end
julia> @time for i = 1:50; g(x); end
elapsed time: 0.206984494 seconds (800 bytes allocated)

(This is basically the same as an ordinary sum.)

@nalimilan
Copy link

So the moral seems to be that the BitArray approach can potentially be made quite fast in the future. Any idea why you observe a slow down with LLVM SVN compared to 3.4 (or 3.5?) when not unrolling?

@simonster
Copy link
Author

The comparison is to LLVM 3.3, which partially vectorizes the loop whereas LLVM SVN doesn't vectorize it at all. I can't explain why.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment