James Bradbury jekbradbury

jekbradbury / softmax.jl
Last active June 11, 2018 03:42
CUDAnative softmax, ported from Marian-NMT
# ported from Marian-NMT
# https://github.com/marian-nmt/marian-dev/blob/8fbfa656/src/tensors/gpu/tensor_operators.cu#L206-L320
# licensed under MIT
using CUDAdrv
using CUDAnative
using BenchmarkTools
const MAX_THREADS = 256 # threads per block; 256 seems to work best (hardware limit is 1024)
const MAX_BLOCKS = 2^31 - 1 # upper bound on blocks per grid; the benchmark only exercises 2048