Created
December 3, 2017 18:33
RNNLM Profiling
───────────────────────────────────────────────────────────────────────────────────────
                                             Time                   Allocations
                                   ────────────────────────  ──────────────────────────
Tot / % measured:                        207s / 26.0%             99.3MiB / 55.3%
Section                    ncalls     time    %tot      avg     alloc    %tot       avg
───────────────────────────────────────────────────────────────────────────────────────
back1.Knet.logp             1.00k    10.2s   19.0%   10.2ms   2.94MiB   5.36%   3.01KiB
forw.*                      1.00k    10.1s   18.8%   10.1ms   2.77MiB   5.06%   2.84KiB
back2.AutoGrad.broadcast#+  1.00k    6.05s   11.3%   6.05ms    652KiB   1.16%         -
forw.Knet.logp              1.00k    5.50s   10.2%   5.50ms   3.52MiB   6.41%   3.60KiB
back2.*                     1.00k    4.42s   8.24%   4.42ms    903KiB   1.61%         -
forw.Ac_mul_B               1.00k    4.41s   8.22%   4.41ms    464KiB   0.83%         -
forw.AutoGrad.broadcast#+   1.00k    4.17s   7.77%   4.17ms   2.11MiB   3.85%   2.16KiB
back2.Knet.rnnforw          1.00k    4.08s   7.61%   4.08ms   3.19MiB   5.82%   3.27KiB
sumg1.getindex              7.00k    2.99s   5.58%    428μs   13.0MiB   23.7%   1.90KiB
back1.*                     1.00k    2.88s   5.36%   2.88ms    903KiB   1.61%         -
forw.A_mul_Bc               1.00k    2.87s   5.35%   2.87ms    464KiB   0.83%         -
forw.Knet.rnnforw           1.00k    2.45s   4.58%   2.45ms   3.61MiB   6.58%   3.69KiB
forw.getindex               9.00k    638ms   1.19%   70.8μs   16.9MiB   30.8%   1.92KiB
back1.sum                   1.00k    101ms   0.19%    101μs    538KiB   0.96%         -
forw.sum                    1.00k   35.7ms   0.07%   35.7μs    547KiB   0.97%         -
back1.getindex              7.00k   18.5ms   0.03%   2.65μs    625KiB   1.11%         -
back3.Knet.rnnforw          1.00k   6.69ms   0.01%   6.69μs    359KiB   0.64%         -
forw.reshape                1.00k   5.74ms   0.01%   5.74μs    639KiB   1.14%         -
forw./                      1.00k   5.41ms   0.01%   5.41μs    936KiB   1.67%         -
back1.-                     1.00k   3.75ms   0.01%   3.75μs   93.8KiB   0.17%         -
forw.-                      1.00k   3.71ms   0.01%   3.71μs    563KiB   1.00%         -
back1.reshape               1.00k   3.38ms   0.01%   3.38μs   93.1KiB   0.17%         -
back1.AutoGrad.broadcast#+  1.00k   3.04ms   0.01%   3.04μs   31.3KiB   0.06%         -
back1./                     1.00k   1.60ms   0.00%   1.60μs    109KiB   0.19%         -
sumg1.-                     1.00k    697μs   0.00%    697ns   15.6KiB   0.03%         -
sumg1.sum                   1.00k    617μs   0.00%    617ns         -   0.00%         -
sumg1.Knet.logp             1.00k    611μs   0.00%    611ns         -   0.00%         -
GRAD                        18.0k    590μs   0.00%   32.8ns         -   0.00%         -
NODE                        16.0k    567μs   0.00%   35.4ns         -   0.00%         -
sumg1.reshape               1.00k    468μs   0.00%    468ns         -   0.00%         -
sumg2.Knet.rnnforw          1.00k    301μs   0.00%    301ns         -   0.00%         -
sumg1.*                     1.00k    228μs   0.00%    228ns         -   0.00%         -
sumg2.*                     1.00k    214μs   0.00%    214ns         -   0.00%         -
sumg2.AutoGrad.broadcast#+  1.00k    162μs   0.00%    162ns         -   0.00%         -
sumg1.AutoGrad.broadcast#+  1.00k    146μs   0.00%    146ns         -   0.00%         -
sumg3.Knet.rnnforw          1.00k    138μs   0.00%    138ns         -   0.00%         -
sumg1./                     1.00k    133μs   0.00%    133ns   15.6KiB   0.03%         -
───────────────────────────────────────────────────────────────────────────────────────
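
The header and the rows can be cross-checked with a little arithmetic. The sketch below is in Python and assumes that each `%tot` value under Time is the section's share of the measured time (26.0% of the 207s total), not of the full wall time. The variable and function names are illustrative and are not part of the profiled code.

```python
# Values copied from the profiling table.
total_time_s = 207.0    # "Tot" in the Time column group
measured_frac = 0.26    # "% measured" in the Time column group

# Time spent inside instrumented sections.
# Assumption: the per-row %tot is relative to this figure.
measured_s = total_time_s * measured_frac

# (section, time in seconds, reported %tot) for the five most expensive rows.
rows = [
    ("back1.Knet.logp", 10.2, 19.0),
    ("forw.*", 10.1, 18.8),
    ("back2.AutoGrad.broadcast#+", 6.05, 11.3),
    ("forw.Knet.logp", 5.50, 10.2),
    ("back2.*", 4.42, 8.24),
]


def share_of_measured(seconds):
    """Return a section's time as a percentage of the measured time."""
    return 100.0 * seconds / measured_s


for name, seconds, reported in rows:
    computed = share_of_measured(seconds)
    print(f"{name:28s} computed {computed:5.2f}%  reported {reported:5.2f}%")
```

The computed shares agree with the reported ones up to the rounding already present in the table, which supports reading `%tot` as a fraction of the measured time. On that reading, `back1.Knet.logp` and `forw.*` together account for roughly 38% of the measured time.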