@ilkerkesen
Created December 3, 2017 18:33
RNNLM Profiling
──────────────────────────────────────────────────────────────────────────────────
                                            Time                 Allocations
                                    ─────────────────────  ───────────────────────
Tot / % measured:                       207s / 26.0%           99.3MiB / 55.3%
Section                     ncalls    time   %tot     avg    alloc   %tot      avg
──────────────────────────────────────────────────────────────────────────────────
back1.Knet.logp              1.00k   10.2s  19.0%  10.2ms  2.94MiB  5.36%  3.01KiB
forw.*                       1.00k   10.1s  18.8%  10.1ms  2.77MiB  5.06%  2.84KiB
back2.AutoGrad.broadcast#+   1.00k   6.05s  11.3%  6.05ms   652KiB  1.16%        -
forw.Knet.logp               1.00k   5.50s  10.2%  5.50ms  3.52MiB  6.41%  3.60KiB
back2.*                      1.00k   4.42s  8.24%  4.42ms   903KiB  1.61%        -
forw.Ac_mul_B                1.00k   4.41s  8.22%  4.41ms   464KiB  0.83%        -
forw.AutoGrad.broadcast#+    1.00k   4.17s  7.77%  4.17ms  2.11MiB  3.85%  2.16KiB
back2.Knet.rnnforw           1.00k   4.08s  7.61%  4.08ms  3.19MiB  5.82%  3.27KiB
sumg1.getindex               7.00k   2.99s  5.58%   428μs  13.0MiB  23.7%  1.90KiB
back1.*                      1.00k   2.88s  5.36%  2.88ms   903KiB  1.61%        -
forw.A_mul_Bc                1.00k   2.87s  5.35%  2.87ms   464KiB  0.83%        -
forw.Knet.rnnforw            1.00k   2.45s  4.58%  2.45ms  3.61MiB  6.58%  3.69KiB
forw.getindex                9.00k   638ms  1.19%  70.8μs  16.9MiB  30.8%  1.92KiB
back1.sum                    1.00k   101ms  0.19%   101μs   538KiB  0.96%        -
forw.sum                     1.00k  35.7ms  0.07%  35.7μs   547KiB  0.97%        -
back1.getindex               7.00k  18.5ms  0.03%  2.65μs   625KiB  1.11%        -
back3.Knet.rnnforw           1.00k  6.69ms  0.01%  6.69μs   359KiB  0.64%        -
forw.reshape                 1.00k  5.74ms  0.01%  5.74μs   639KiB  1.14%        -
forw./                       1.00k  5.41ms  0.01%  5.41μs   936KiB  1.67%        -
back1.-                      1.00k  3.75ms  0.01%  3.75μs  93.8KiB  0.17%        -
forw.-                       1.00k  3.71ms  0.01%  3.71μs   563KiB  1.00%        -
back1.reshape                1.00k  3.38ms  0.01%  3.38μs  93.1KiB  0.17%        -
back1.AutoGrad.broadcast#+   1.00k  3.04ms  0.01%  3.04μs  31.3KiB  0.06%        -
back1./                      1.00k  1.60ms  0.00%  1.60μs   109KiB  0.19%        -
sumg1.-                      1.00k   697μs  0.00%   697ns  15.6KiB  0.03%        -
sumg1.sum                    1.00k   617μs  0.00%   617ns        -  0.00%        -
sumg1.Knet.logp              1.00k   611μs  0.00%   611ns        -  0.00%        -
GRAD                         18.0k   590μs  0.00%  32.8ns        -  0.00%        -
NODE                         16.0k   567μs  0.00%  35.4ns        -  0.00%        -
sumg1.reshape                1.00k   468μs  0.00%   468ns        -  0.00%        -
sumg2.Knet.rnnforw           1.00k   301μs  0.00%   301ns        -  0.00%        -
sumg1.*                      1.00k   228μs  0.00%   228ns        -  0.00%        -
sumg2.*                      1.00k   214μs  0.00%   214ns        -  0.00%        -
sumg2.AutoGrad.broadcast#+   1.00k   162μs  0.00%   162ns        -  0.00%        -
sumg1.AutoGrad.broadcast#+   1.00k   146μs  0.00%   146ns        -  0.00%        -
sumg3.Knet.rnnforw           1.00k   138μs  0.00%   138ns        -  0.00%        -
sumg1./                      1.00k   133μs  0.00%   133ns  15.6KiB  0.03%        -
──────────────────────────────────────────────────────────────────────────────────