@lcw
Created August 23, 2019 20:51
broken?
using GPUifyLoops, Cthulhu, CuArrays, CUDAnative

function kernel!(A, B)
    @inbounds @loop for i in (1:size(A,1);
                              (blockIdx().x-1)*blockDim().x + threadIdx().x)
        A[i] = B[i]
    end
    nothing
end

a = CuArray(rand(Float32, 10^3))
b = similar(a)

threads = 1024
blocks = ceil(Int, size(a,1)/threads)

@descend @launch(CUDA(), threads=threads, blocks=blocks, kernel!(a, b))