@sklam
Created February 13, 2015 19:58
Numba concurrent kernels
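from numba import cuda
import numpy as np


@cuda.jit
def foo(arr):
    # Launched below with one block of one thread, so a single GPU thread
    # walks the whole array serially.
    for i in range(arr.size):
        arr[i] += i


A = np.arange(10000)
B = np.arange(10000)

# Copy each host array to its own device buffer.
dA = cuda.to_device(A)
dB = cuda.to_device(B)
cuda.synchronize()

# One stream per array, so the two sequences of launches are independent
# and the GPU may overlap them.
streamA = cuda.stream()
streamB = cuda.stream()

for _ in range(100):
    foo[1, 1, streamA](dA)
    foo[1, 1, streamB](dB)

# Wait for all queued kernels on both streams before reading results.
cuda.synchronize()

# Copy the results back into the original host arrays.
dA.copy_to_host(A)
dB.copy_to_host(B)
print(A)
print(B)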
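
As a sanity check on the printed arrays, the expected result can be computed on the host without a GPU. The sketch below assumes each kernel is launched 100 times on an array that starts as `np.arange(10000)`, as in the script above; the helper name `expected_after_launches` is hypothetical and not part of Numba.

```python
import numpy as np


def expected_after_launches(n=10000, launches=100):
    # Hypothetical helper: mimics the kernel on the host.
    # Each launch adds i to arr[i], so after `launches` launches
    # element i holds i + launches * i.
    arr = np.arange(n)
    for _ in range(launches):
        arr += np.arange(n)
    return arr


expected = expected_after_launches()
print(expected[:5])
```

If the streams behave correctly, both `A` and `B` should compare equal to `expected` via `np.array_equal`. Note that this checks correctness only; whether the two streams actually overlapped has to be confirmed with a profiler.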