View gist:bdce8e970227eeb7bb0d29d0fa03452c
Is it asynchronous or synchronous? How can we run it asynchronously?
View Ruby without GVL in C

Use rb_thread_call_without_gvl.


nvrtcResult status = nvrtcCreateProgram(&_prog, _src, _name, _numHeaders, _headers, _includeNames);

To make a C function call like the one above with the GVL released,

View gdb with

Running gdb through bundle exec is convenient because the environment variables are passed along.

$ bundle exec gdb ruby
gdb> run test.rb

To also pass arguments to ruby on the command line, use the --args option. Incidentally, this option does not appear in man gdb, but it does appear in gdb --help.

View CUDA Bus

It occurred when I did not wait for the GPU process to finish.

#include <stdio.h>
#include <cuda_runtime.h>

void my_kernel(int val, int *A, int N)
    int i = threadIdx.x;
View Ruby GC in C
View preprocessor.pyx
# A trick to embed preprocessors in cython
cdef extern from *:
    cdef void EMIT_IF_PYTHON_VERSION_HEX_LT_37 "#if PY_VERSION_HEX < 0x03070000 //" ()
    cdef void EMIT_ELSE "#else //" ()
    cdef void EMIT_ENDIF "#endif //" ()
| Integration | Auto-Disable? | Notify? | What we need to do | NOTE |
|---|---|---|---|---|
| email | No | | | Amazingly, it is not disabled |
| github | Yes | Yes | Click enable link on the notification | |
| google calendar | don't know | | | |
| hubot | don't know | | | |
View bm_string_interpolation.rb
b = 'b'
1_000_000.times { "a#{b}c" }
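The loop above only does the work and does not report a time. A minimal sketch of how it could be timed with Ruby's standard Benchmark module follows. The method names interpolate and time_interpolation are hypothetical helpers introduced here for illustration; they are not part of the original snippet.

```ruby
require 'benchmark'

# Hypothetical helper: builds the same string as the loop above.
def interpolate(b)
  "a#{b}c"
end

# Hypothetical helper: returns the wall-clock seconds needed to run
# the interpolation n times.
def time_interpolation(n, b = 'b')
  Benchmark.realtime { n.times { "a#{b}c" } }
end

puts format('%.3fs', time_interpolation(1_000_000))
```

The printed number depends on the machine and the Ruby version, so it is only meaningful when compared across runs on the same setup.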
$ nvcc -O2 -o simpleCallback
$ nvprof -f -o simpleCallback.nvvp ./simpleCallback | grep elapsed
No callback: elapsed time = 1.534s
One callback: elapsed time = 1.498s
Two callback: elapsed time = 3.718s
Four callback: elapsed time = 5.194s

As the number of callbacks increases, it becomes slower...


Usually, located at /usr/local/cuda/bin

Non-Visual Profiler

$ nvprof python

I prefer to use --print-gpu-trace.