dtype = numpy.float32
batch_size = 2
in_channels = 3
out_channels = 2
kernel_size = (3, 3, 1)
stride = (1, 1, 1)
pad = (0, 0, 0)
in_dims = (1000, 768, 3)
out_dims = (998, 766, 3)
x_shape = (batch_size, in_channels) + in_dims
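The `out_dims` value above can be checked against the usual convolution output-size formula, `(in + 2 * pad - kernel) // stride + 1` per spatial axis. The sketch below is only an illustration: the helper name `conv_out_dims` is hypothetical, it assumes no dilation and floor rounding, and the `w_shape` / `y_shape` layouts are an assumed channels-first convention that the original snippet does not show.

```python
def conv_out_dims(in_dims, kernel_size, stride, pad):
    """Output size per spatial axis (hypothetical helper; assumes no dilation)."""
    return tuple(
        (i + 2 * p - k) // s + 1
        for i, k, s, p in zip(in_dims, kernel_size, stride, pad)
    )


batch_size = 2
in_channels = 3
out_channels = 2
kernel_size = (3, 3, 1)
stride = (1, 1, 1)
pad = (0, 0, 0)
in_dims = (1000, 768, 3)

# Derive the output dimensions instead of hard-coding them.
out_dims = conv_out_dims(in_dims, kernel_size, stride, pad)

x_shape = (batch_size, in_channels) + in_dims
# Assumed layouts (not in the original snippet): channels-first weights and output.
w_shape = (out_channels, in_channels) + kernel_size
y_shape = (batch_size, out_channels) + out_dims

print(out_dims)
```

Deriving `out_dims` this way keeps it consistent if `kernel_size`, `stride`, or `pad` are changed later.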
View numo-linalg with openblas at
$ brew install openblas
==> Caveats
This formula is keg-only, which means it was not symlinked into /usr/local,
because macOS provides BLAS and LAPACK in the Accelerate framework.

For compilers to find this software you may need to set:
    LDFLAGS:  -L/usr/local/opt/openblas/lib
    CPPFLAGS: -I/usr/local/opt/openblas/include
For pkg-config to find this software you may need to set:
View gist:bdce8e970227eeb7bb0d29d0fa03452c
Is it asynchronous or synchronous? How can we run it asynchronously?
View Ruby without GVL in C

Use rb_thread_call_without_gvl.


nvrtcResult status = nvrtcCreateProgram(&_prog, _src, _name, _numHeaders, _headers, _includeNames);

To call a C function such as the one above with the GVL released,

View gdb with

Running gdb through bundle exec is convenient because the environment variables are passed along.

$ bundle exec gdb ruby
gdb> run test.rb

To also pass arguments to ruby on the command line, use the --args option. Incidentally, this option does not appear in man gdb, but it does appear in gdb --help.

View CUDA Bus

It occurred when I did not wait for the GPU process to finish.

#include <stdio.h>
#include <cuda_runtime.h>

void my_kernel(int val, int *A, int N)
    int i = threadIdx.x;
View preprocessor.pyx
# A trick to embed preprocessors in cython
cdef extern from *:
    cdef void EMIT_IF_PYTHON_VERSION_HEX_LT_37 "#if PY_VERSION_HEX < 0x03070000 //" ()
    cdef void EMIT_ELSE "#else //" ()
    cdef void EMIT_ENDIF "#endif //" ()
| Integration | Auto-Disable? | Notify? | What we need to do | NOTE |
|---|---|---|---|---|
| email | No | | | Amazingly, it is not disabled |
| github | Yes | Yes | Click enable link on the notification | |
| google calendar | don't know | | | |
| hubot | don't know | | | |
View bm_string_interpolation.rb
b = 'b'
1_000_000.times { "a#{b}c" }
$ nvcc -O2 -o simpleCallback
$ nvprof -f -o simpleCallback.nvvp ./simpleCallback | grep elapsed
No callback: elapsed time = 1.534s
One callback: elapsed time = 1.498s
Two callback: elapsed time = 3.718s
Four callback: elapsed time = 5.194s

As the number of callbacks increases, execution gets slower: about 1.5 s with zero or one callback, 3.7 s with two, and 5.2 s with four.