Flux RNN View vs Slice
LSTM CPU c=32 n=32 ts=4 | |
forward | |
40.249 μs (53 allocations: 88.14 KiB) | |
backward | |
155.457 μs (608 allocations: 233.70 KiB) | |
forw and back | |
336.703 μs (1495 allocations: 405.92 KiB) | |
LSTM CUDA c=32 n=32 ts=4 | |
forward | |
449.046 μs (1289 allocations: 70.70 KiB) | |
backward | |
1.079 ms (3075 allocations: 208.31 KiB) | |
forw and back | |
1.702 ms (5005 allocations: 321.98 KiB) | |
LSTM CPU c=32 n=32 ts=16 | |
forward | |
177.645 μs (209 allocations: 364.23 KiB) | |
backward | |
584.899 μs (2300 allocations: 950.73 KiB) | |
forw and back | |
1.219 ms (5575 allocations: 1.59 MiB) | |
LSTM CUDA c=32 n=32 ts=16 | |
forward | |
1.384 ms (5153 allocations: 282.67 KiB) | |
backward | |
3.566 ms (12219 allocations: 831.17 KiB) | |
forw and back | |
5.574 ms (19597 allocations: 1.23 MiB) | |
LSTM CPU c=32 n=32 ts=32 | |
forward | |
360.835 μs (417 allocations: 732.36 KiB) | |
backward | |
1.154 ms (4557 allocations: 1.86 MiB) | |
forw and back | |
2.370 ms (11017 allocations: 3.19 MiB) | |
LSTM CUDA c=32 n=32 ts=32 | |
forward | |
2.631 ms (10305 allocations: 565.30 KiB) | |
backward | |
6.852 ms (24412 allocations: 1.62 MiB) | |
forw and back | |
10.746 ms (39055 allocations: 2.46 MiB) | |
LSTM CPU c=32 n=32 ts=64 | |
forward | |
733.224 μs (833 allocations: 1.43 MiB) | |
backward | |
2.329 ms (9069 allocations: 3.73 MiB) | |
forw and back | |
4.713 ms (21897 allocations: 6.38 MiB) | |
LSTM CUDA c=32 n=32 ts=64 | |
forward | |
5.068 ms (20609 allocations: 1.10 MiB) | |
backward | |
13.402 ms (48796 allocations: 3.24 MiB) | |
forw and back | |
21.120 ms (77967 allocations: 4.91 MiB) | |
LSTM CPU c=32 n=128 ts=4 | |
forward | |
153.963 μs (53 allocations: 342.64 KiB) | |
backward | |
273.555 μs (608 allocations: 722.83 KiB) | |
forw and back | |
597.819 μs (1495 allocations: 1.16 MiB) | |
LSTM CUDA c=32 n=128 ts=4 | |
forward | |
467.039 μs (1297 allocations: 70.83 KiB) | |
backward | |
1.109 ms (3119 allocations: 209.00 KiB) | |
forw and back | |
1.761 ms (5049 allocations: 322.67 KiB) | |
LSTM CPU c=32 n=128 ts=16 | |
forward | |
644.988 μs (209 allocations: 1.38 MiB) | |
backward | |
1.065 ms (2300 allocations: 2.86 MiB) | |
forw and back | |
2.241 ms (5575 allocations: 4.70 MiB) | |
LSTM CUDA c=32 n=128 ts=16 | |
forward | |
1.426 ms (5185 allocations: 283.17 KiB) | |
backward | |
3.689 ms (12395 allocations: 833.92 KiB) | |
forw and back | |
5.764 ms (19773 allocations: 1.24 MiB) | |
LSTM CPU c=32 n=128 ts=32 | |
forward | |
1.302 ms (417 allocations: 2.79 MiB) | |
backward | |
2.168 ms (4557 allocations: 5.73 MiB) | |
forw and back | |
4.473 ms (11017 allocations: 9.41 MiB) | |
LSTM CUDA c=32 n=128 ts=32 | |
forward | |
2.663 ms (10369 allocations: 566.30 KiB) | |
backward | |
7.051 ms (24764 allocations: 1.63 MiB) | |
forw and back | |
10.977 ms (39407 allocations: 2.46 MiB) | |
LSTM CPU c=32 n=128 ts=64 | |
forward | |
2.627 ms (833 allocations: 5.59 MiB) | |
backward | |
4.497 ms (9069 allocations: 11.46 MiB) | |
forw and back | |
8.940 ms (21897 allocations: 18.84 MiB) | |
LSTM CUDA c=32 n=128 ts=64 | |
forward | |
5.145 ms (20737 allocations: 1.11 MiB) | |
backward | |
13.790 ms (49500 allocations: 3.26 MiB) | |
forw and back | |
21.547 ms (78671 allocations: 4.92 MiB) | |
LSTM CPU c=32 n=512 ts=4 | |
forward | |
1.477 ms (64 allocations: 1.32 MiB) | |
backward | |
1.701 ms (636 allocations: 2.60 MiB) | |
forw and back | |
3.646 ms (1546 allocations: 4.18 MiB) | |
LSTM CUDA c=32 n=512 ts=4 | |
forward | |
468.979 μs (1334 allocations: 71.66 KiB) | |
backward | |
1.175 ms (3147 allocations: 209.44 KiB) | |
forw and back | |
1.834 ms (5146 allocations: 325.08 KiB) | |
LSTM CPU c=32 n=512 ts=16 | |
forward | |
5.619 ms (256 allocations: 5.46 MiB) | |
backward | |
7.090 ms (2412 allocations: 10.51 MiB) | |
forw and back | |
14.776 ms (5782 allocations: 16.99 MiB) | |
LSTM CUDA c=32 n=512 ts=16 | |
forward | |
1.435 ms (5342 allocations: 286.62 KiB) | |
backward | |
3.871 ms (12519 allocations: 835.86 KiB) | |
forw and back | |
6.144 ms (20134 allocations: 1.24 MiB) | |
LSTM CPU c=32 n=512 ts=32 | |
forward | |
12.595 ms (512 allocations: 10.98 MiB) | |
backward | |
14.100 ms (4781 allocations: 21.06 MiB) | |
forw and back | |
29.444 ms (11432 allocations: 34.07 MiB) | |
LSTM CUDA c=32 n=512 ts=32 | |
forward | |
2.721 ms (10686 allocations: 573.25 KiB) | |
backward | |
7.438 ms (25016 allocations: 1.63 MiB) | |
forw and back | |
11.761 ms (40120 allocations: 2.48 MiB) | |
LSTM CPU c=32 n=512 ts=64 | |
forward | |
25.441 ms (1024 allocations: 22.03 MiB) | |
backward | |
28.548 ms (9517 allocations: 42.15 MiB) | |
forw and back | |
58.672 ms (22728 allocations: 68.23 MiB) | |
LSTM CUDA c=32 n=512 ts=64 | |
forward | |
5.208 ms (21374 allocations: 1.12 MiB) | |
backward | |
14.488 ms (50008 allocations: 3.26 MiB) | |
forw and back | |
23.170 ms (80088 allocations: 4.94 MiB) | |
LSTM CPU c=128 n=32 ts=4 | |
forward | |
53.703 μs (53 allocations: 88.14 KiB) | |
backward | |
196.536 μs (608 allocations: 413.70 KiB) | |
forw and back | |
399.235 μs (1495 allocations: 537.92 KiB) | |
LSTM CUDA c=128 n=32 ts=4 | |
forward | |
484.813 μs (1289 allocations: 70.70 KiB) | |
backward | |
1.125 ms (3075 allocations: 208.31 KiB) | |
forw and back | |
1.769 ms (5005 allocations: 321.98 KiB) | |
LSTM CPU c=128 n=32 ts=16 | |
forward | |
232.433 μs (209 allocations: 364.23 KiB) | |
backward | |
741.138 μs (2300 allocations: 1.67 MiB) | |
forw and back | |
1.420 ms (5575 allocations: 2.14 MiB) | |
LSTM CUDA c=128 n=32 ts=16 | |
forward | |
1.469 ms (5153 allocations: 282.67 KiB) | |
backward | |
3.697 ms (12219 allocations: 831.17 KiB) | |
forw and back | |
5.811 ms (19597 allocations: 1.23 MiB) | |
LSTM CPU c=128 n=32 ts=32 | |
forward | |
475.835 μs (417 allocations: 732.36 KiB) | |
backward | |
1.496 ms (4557 allocations: 3.35 MiB) | |
forw and back | |
2.802 ms (11017 allocations: 4.30 MiB) | |
LSTM CUDA c=128 n=32 ts=32 | |
forward | |
2.735 ms (10305 allocations: 565.30 KiB) | |
backward | |
7.067 ms (24412 allocations: 1.62 MiB) | |
forw and back | |
11.125 ms (39055 allocations: 2.46 MiB) | |
LSTM CPU c=128 n=32 ts=64 | |
forward | |
959.431 μs (833 allocations: 1.43 MiB) | |
backward | |
3.159 ms (9069 allocations: 6.72 MiB) | |
forw and back | |
5.664 ms (21897 allocations: 8.62 MiB) | |
LSTM CUDA c=128 n=32 ts=64 | |
forward | |
5.234 ms (20609 allocations: 1.10 MiB) | |
backward | |
13.496 ms (48796 allocations: 3.24 MiB) | |
forw and back | |
21.335 ms (77967 allocations: 4.91 MiB) | |
LSTM CPU c=128 n=128 ts=4 | |
forward | |
311.561 μs (53 allocations: 342.64 KiB) | |
backward | |
1.003 ms (616 allocations: 1.16 MiB) | |
forw and back | |
1.899 ms (1499 allocations: 1.43 MiB) | |
LSTM CUDA c=128 n=128 ts=4 | |
forward | |
476.439 μs (1297 allocations: 70.83 KiB) | |
backward | |
1.125 ms (3119 allocations: 209.00 KiB) | |
forw and back | |
1.769 ms (5049 allocations: 322.67 KiB) | |
LSTM CPU c=128 n=128 ts=16 | |
forward | |
2.236 ms (209 allocations: 1.38 MiB) | |
backward | |
4.116 ms (2332 allocations: 4.72 MiB) | |
forw and back | |
7.460 ms (5591 allocations: 5.81 MiB) | |
LSTM CUDA c=128 n=128 ts=16 | |
forward | |
1.438 ms (5185 allocations: 283.17 KiB) | |
backward | |
3.694 ms (12395 allocations: 833.92 KiB) | |
forw and back | |
5.800 ms (19773 allocations: 1.24 MiB) | |
LSTM CPU c=128 n=128 ts=32 | |
forward | |
4.504 ms (417 allocations: 2.79 MiB) | |
backward | |
8.002 ms (4621 allocations: 9.46 MiB) | |
forw and back | |
14.503 ms (11049 allocations: 11.65 MiB) | |
LSTM CUDA c=128 n=128 ts=32 | |
forward | |
2.700 ms (10369 allocations: 566.30 KiB) | |
backward | |
7.140 ms (24764 allocations: 1.63 MiB) | |
forw and back | |
11.057 ms (39407 allocations: 2.46 MiB) | |
LSTM CPU c=128 n=128 ts=64 | |
forward | |
9.043 ms (833 allocations: 5.59 MiB) | |
backward | |
16.704 ms (9197 allocations: 18.94 MiB) | |
forw and back | |
29.730 ms (21961 allocations: 23.32 MiB) | |
LSTM CUDA c=128 n=128 ts=64 | |
forward | |
5.205 ms (20737 allocations: 1.11 MiB) | |
backward | |
14.014 ms (49500 allocations: 3.26 MiB) | |
forw and back | |
21.858 ms (78671 allocations: 4.92 MiB) | |
LSTM CPU c=128 n=512 ts=4 | |
forward | |
1.537 ms (64 allocations: 1.32 MiB) | |
backward | |
2.000 ms (636 allocations: 4.18 MiB) | |
forw and back | |
3.908 ms (1546 allocations: 5.01 MiB) | |
LSTM CUDA c=128 n=512 ts=4 | |
forward | |
498.242 μs (1334 allocations: 71.66 KiB) | |
backward | |
1.193 ms (3147 allocations: 209.44 KiB) | |
forw and back | |
1.856 ms (5146 allocations: 325.08 KiB) | |
LSTM CPU c=128 n=512 ts=16 | |
forward | |
6.646 ms (256 allocations: 5.46 MiB) | |
backward | |
8.319 ms (2412 allocations: 16.88 MiB) | |
forw and back | |
15.881 ms (5782 allocations: 20.35 MiB) | |
LSTM CUDA c=128 n=512 ts=16 | |
forward | |
1.488 ms (5342 allocations: 286.62 KiB) | |
backward | |
3.926 ms (12519 allocations: 835.86 KiB) | |
forw and back | |
6.031 ms (20134 allocations: 1.24 MiB) | |
LSTM CPU c=128 n=512 ts=32 | |
forward | |
13.529 ms (512 allocations: 10.98 MiB) | |
backward | |
16.747 ms (4781 allocations: 33.80 MiB) | |
forw and back | |
31.685 ms (11432 allocations: 40.81 MiB) | |
LSTM CUDA c=128 n=512 ts=32 | |
forward | |
2.830 ms (10686 allocations: 573.25 KiB) | |
backward | |
7.573 ms (25016 allocations: 1.63 MiB) | |
forw and back | |
11.650 ms (40120 allocations: 2.48 MiB) | |
LSTM CPU c=128 n=512 ts=64 | |
forward | |
27.174 ms (1024 allocations: 22.03 MiB) | |
backward | |
33.567 ms (9517 allocations: 67.64 MiB) | |
forw and back | |
63.216 ms (22728 allocations: 81.71 MiB) | |
LSTM CUDA c=128 n=512 ts=64 | |
forward | |
5.410 ms (21374 allocations: 1.12 MiB) | |
backward | |
14.782 ms (50008 allocations: 3.26 MiB) | |
forw and back | |
22.877 ms (80088 allocations: 4.94 MiB) | |
LSTM CPU c=512 n=32 ts=4 | |
forward | |
296.655 μs (53 allocations: 88.14 KiB) | |
backward | |
902.290 μs (623 allocations: 1.11 MiB) | |
forw and back | |
1.507 ms (1506 allocations: 1.04 MiB) | |
LSTM CUDA c=512 n=32 ts=4 | |
forward | |
495.875 μs (1309 allocations: 71.02 KiB) | |
backward | |
1.152 ms (3107 allocations: 208.81 KiB) | |
forw and back | |
1.801 ms (5057 allocations: 322.80 KiB) | |
LSTM CPU c=512 n=32 ts=16 | |
forward | |
1.234 ms (209 allocations: 364.23 KiB) | |
backward | |
3.671 ms (2363 allocations: 4.62 MiB) | |
forw and back | |
5.887 ms (5622 allocations: 4.34 MiB) | |
LSTM CUDA c=512 n=32 ts=16 | |
forward | |
1.492 ms (5233 allocations: 283.92 KiB) | |
backward | |
3.738 ms (12347 allocations: 833.17 KiB) | |
forw and back | |
5.898 ms (19805 allocations: 1.24 MiB) | |
LSTM CPU c=512 n=32 ts=32 | |
forward | |
2.483 ms (417 allocations: 732.36 KiB) | |
backward | |
7.368 ms (4684 allocations: 9.29 MiB) | |
forw and back | |
11.759 ms (11112 allocations: 8.75 MiB) | |
LSTM CUDA c=512 n=32 ts=32 | |
forward | |
2.761 ms (10465 allocations: 567.80 KiB) | |
backward | |
7.149 ms (24668 allocations: 1.63 MiB) | |
forw and back | |
11.274 ms (39471 allocations: 2.46 MiB) | |
LSTM CPU c=512 n=32 ts=64 | |
forward | |
5.011 ms (833 allocations: 1.43 MiB) | |
backward | |
12.625 ms (9324 allocations: 18.65 MiB) | |
forw and back | |
23.286 ms (22088 allocations: 17.56 MiB) | |
LSTM CUDA c=512 n=32 ts=64 | |
forward | |
5.330 ms (20929 allocations: 1.11 MiB) | |
backward | |
13.971 ms (49308 allocations: 3.25 MiB) | |
forw and back | |
22.006 ms (78799 allocations: 4.92 MiB) | |
LSTM CPU c=512 n=128 ts=4 | |
forward | |
598.161 μs (53 allocations: 342.64 KiB) | |
backward | |
1.382 ms (623 allocations: 2.99 MiB) | |
forw and back | |
2.212 ms (1506 allocations: 2.51 MiB) | |
LSTM CUDA c=512 n=128 ts=4 | |
forward | |
529.644 μs (1317 allocations: 71.14 KiB) | |
backward | |
1.173 ms (3151 allocations: 209.50 KiB) | |
forw and back | |
1.868 ms (5101 allocations: 323.48 KiB) | |
LSTM CPU c=512 n=128 ts=16 | |
forward | |
2.532 ms (209 allocations: 1.38 MiB) | |
backward | |
5.782 ms (2363 allocations: 12.17 MiB) | |
forw and back | |
8.975 ms (5622 allocations: 10.26 MiB) | |
LSTM CUDA c=512 n=128 ts=16 | |
forward | |
1.601 ms (5265 allocations: 284.42 KiB) | |
backward | |
3.813 ms (12523 allocations: 835.92 KiB) | |
forw and back | |
6.066 ms (19981 allocations: 1.24 MiB) | |
LSTM CPU c=512 n=128 ts=32 | |
forward | |
5.206 ms (417 allocations: 2.79 MiB) | |
backward | |
9.077 ms (4684 allocations: 24.41 MiB) | |
forw and back | |
17.836 ms (11112 allocations: 20.59 MiB) | |
LSTM CUDA c=512 n=128 ts=32 | |
forward | |
2.990 ms (10529 allocations: 568.80 KiB) | |
backward | |
7.328 ms (25020 allocations: 1.63 MiB) | |
forw and back | |
11.564 ms (39823 allocations: 2.47 MiB) | |
LSTM CPU c=512 n=128 ts=64 | |
forward | |
10.575 ms (833 allocations: 5.59 MiB) | |
backward | |
23.239 ms (9324 allocations: 48.88 MiB) | |
forw and back | |
35.565 ms (22088 allocations: 41.27 MiB) | |
LSTM CUDA c=512 n=128 ts=64 | |
forward | |
5.804 ms (21057 allocations: 1.11 MiB) | |
backward | |
14.272 ms (50012 allocations: 3.26 MiB) | |
forw and back | |
22.629 ms (79503 allocations: 4.93 MiB) | |
LSTM CPU c=512 n=512 ts=4 | |
forward | |
1.775 ms (64 allocations: 1.32 MiB) | |
backward | |
3.575 ms (643 allocations: 10.51 MiB) | |
forw and back | |
5.246 ms (1553 allocations: 8.34 MiB) | |
LSTM CUDA c=512 n=512 ts=4 | |
forward | |
524.592 μs (1354 allocations: 71.97 KiB) | |
backward | |
1.235 ms (3179 allocations: 209.94 KiB) | |
forw and back | |
1.906 ms (5198 allocations: 325.89 KiB) | |
LSTM CPU c=512 n=512 ts=16 | |
forward | |
7.986 ms (256 allocations: 5.46 MiB) | |
backward | |
14.480 ms (2443 allocations: 42.33 MiB) | |
forw and back | |
20.785 ms (5813 allocations: 33.80 MiB) | |
LSTM CUDA c=512 n=512 ts=16 | |
forward | |
1.586 ms (5422 allocations: 287.88 KiB) | |
backward | |
4.138 ms (12647 allocations: 837.86 KiB) | |
forw and back | |
6.273 ms (20342 allocations: 1.25 MiB) | |
LSTM CPU c=512 n=512 ts=32 | |
forward | |
16.118 ms (512 allocations: 10.98 MiB) | |
backward | |
29.159 ms (4844 allocations: 84.75 MiB) | |
forw and back | |
41.383 ms (11495 allocations: 67.75 MiB) | |
LSTM CUDA c=512 n=512 ts=32 | |
forward | |
3.017 ms (10846 allocations: 575.75 KiB) | |
backward | |
7.970 ms (25272 allocations: 1.64 MiB) | |
forw and back | |
12.031 ms (40536 allocations: 2.48 MiB) | |
LSTM CPU c=512 n=512 ts=64 | |
forward | |
32.323 ms (1024 allocations: 22.03 MiB) | |
backward | |
58.327 ms (9644 allocations: 169.59 MiB) | |
forw and back | |
82.442 ms (22855 allocations: 135.66 MiB) | |
LSTM CUDA c=512 n=512 ts=64 | |
forward | |
5.773 ms (21694 allocations: 1.12 MiB) | |
backward | |
16.019 ms (50520 allocations: 3.27 MiB) | |
forw and back | |
23.525 ms (80920 allocations: 4.96 MiB) | |
GRU CPU c=32 n=32 ts=4 | |
forward | |
37.056 μs (49 allocations: 61.70 KiB) | |
backward | |
171.009 μs (844 allocations: 246.17 KiB) | |
forw and back | |
350.779 μs (1737 allocations: 428.23 KiB) | |
GRU CUDA c=32 n=32 ts=4 | |
forward | |
486.435 μs (1429 allocations: 82.20 KiB) | |
backward | |
1.490 ms (3466 allocations: 213.53 KiB) | |
forw and back | |
2.230 ms (5809 allocations: 364.19 KiB) | |
GRU CPU c=32 n=32 ts=16 | |
forward | |
164.148 μs (193 allocations: 264.30 KiB) | |
backward | |
672.080 μs (3149 allocations: 1.00 MiB) | |
forw and back | |
1.278 ms (6479 allocations: 1.72 MiB) | |
GRU CUDA c=32 n=32 ts=16 | |
forward | |
1.452 ms (5713 allocations: 328.67 KiB) | |
backward | |
4.948 ms (13607 allocations: 844.50 KiB) | |
forw and back | |
7.461 ms (22671 allocations: 1.39 MiB) | |
GRU CPU c=32 n=32 ts=32 | |
forward | |
338.151 μs (385 allocations: 534.42 KiB) | |
backward | |
1.315 ms (6221 allocations: 2.02 MiB) | |
forw and back | |
2.501 ms (12799 allocations: 3.44 MiB) | |
GRU CUDA c=32 n=32 ts=32 | |
forward | |
2.712 ms (11425 allocations: 657.30 KiB) | |
backward | |
9.540 ms (27127 allocations: 1.65 MiB) | |
forw and back | |
14.249 ms (45151 allocations: 2.77 MiB) | |
GRU CPU c=32 n=32 ts=64 | |
forward | |
682.412 μs (769 allocations: 1.05 MiB) | |
backward | |
2.647 ms (12365 allocations: 4.06 MiB) | |
forw and back | |
4.982 ms (25439 allocations: 6.90 MiB) | |
GRU CUDA c=32 n=32 ts=64 | |
forward | |
5.276 ms (22849 allocations: 1.28 MiB) | |
backward | |
18.634 ms (54167 allocations: 3.29 MiB) | |
forw and back | |
27.950 ms (90111 allocations: 5.54 MiB) | |
GRU CPU c=32 n=128 ts=4 | |
forward | |
145.167 μs (49 allocations: 238.02 KiB) | |
backward | |
298.838 μs (844 allocations: 747.30 KiB) | |
forw and back | |
595.941 μs (1737 allocations: 1.15 MiB) | |
GRU CUDA c=32 n=128 ts=4 | |
forward | |
475.030 μs (1437 allocations: 82.33 KiB) | |
backward | |
1.497 ms (3502 allocations: 214.09 KiB) | |
forw and back | |
2.254 ms (5845 allocations: 364.75 KiB) | |
GRU CPU c=32 n=128 ts=16 | |
forward | |
616.850 μs (193 allocations: 1.00 MiB) | |
backward | |
1.153 ms (3149 allocations: 3.09 MiB) | |
forw and back | |
2.215 ms (6479 allocations: 4.83 MiB) | |
GRU CUDA c=32 n=128 ts=16 | |
forward | |
1.444 ms (5745 allocations: 329.17 KiB) | |
backward | |
5.056 ms (13751 allocations: 846.75 KiB) | |
forw and back | |
7.498 ms (22815 allocations: 1.39 MiB) | |
GRU CPU c=32 n=128 ts=32 | |
forward | |
1.242 ms (385 allocations: 2.02 MiB) | |
backward | |
2.346 ms (6221 allocations: 6.23 MiB) | |
forw and back | |
4.434 ms (12799 allocations: 9.73 MiB) | |
GRU CUDA c=32 n=128 ts=32 | |
forward | |
2.699 ms (11489 allocations: 658.30 KiB) | |
backward | |
9.703 ms (27415 allocations: 1.65 MiB) | |
forw and back | |
14.495 ms (45439 allocations: 2.78 MiB) | |
GRU CPU c=32 n=128 ts=64 | |
forward | |
2.507 ms (769 allocations: 4.07 MiB) | |
backward | |
4.859 ms (12365 allocations: 12.51 MiB) | |
forw and back | |
8.842 ms (25439 allocations: 19.54 MiB) | |
GRU CUDA c=32 n=128 ts=64 | |
forward | |
5.261 ms (22977 allocations: 1.29 MiB) | |
backward | |
18.872 ms (54743 allocations: 3.30 MiB) | |
forw and back | |
28.257 ms (90687 allocations: 5.55 MiB) | |
GRU CPU c=32 n=512 ts=4 | |
forward | |
1.406 ms (56 allocations: 933.47 KiB) | |
backward | |
1.806 ms (880 allocations: 2.67 MiB) | |
forw and back | |
3.487 ms (1788 allocations: 4.06 MiB) | |
GRU CUDA c=32 n=512 ts=4 | |
forward | |
462.937 μs (1458 allocations: 82.66 KiB) | |
backward | |
1.541 ms (3594 allocations: 217.84 KiB) | |
forw and back | |
2.293 ms (5998 allocations: 370.22 KiB) | |
GRU CPU c=32 n=512 ts=16 | |
forward | |
5.263 ms (224 allocations: 3.93 MiB) | |
backward | |
7.605 ms (3305 allocations: 11.35 MiB) | |
forw and back | |
14.322 ms (6698 allocations: 17.15 MiB) | |
GRU CUDA c=32 n=512 ts=16 | |
forward | |
1.437 ms (5838 allocations: 330.62 KiB) | |
backward | |
5.297 ms (14131 allocations: 861.94 KiB) | |
forw and back | |
7.771 ms (23400 allocations: 1.41 MiB) | |
GRU CPU c=32 n=512 ts=32 | |
forward | |
11.747 ms (448 allocations: 7.95 MiB) | |
backward | |
15.211 ms (6537 allocations: 22.91 MiB) | |
forw and back | |
28.691 ms (13242 allocations: 34.60 MiB) | |
GRU CUDA c=32 n=512 ts=32 | |
forward | |
2.719 ms (11678 allocations: 661.25 KiB) | |
backward | |
10.270 ms (28179 allocations: 1.68 MiB) | |
forw and back | |
15.028 ms (46600 allocations: 2.82 MiB) | |
GRU CPU c=32 n=512 ts=64 | |
forward | |
23.613 ms (896 allocations: 15.99 MiB) | |
backward | |
30.489 ms (13001 allocations: 46.05 MiB) | |
forw and back | |
56.960 ms (26330 allocations: 69.50 MiB) | |
GRU CUDA c=32 n=512 ts=64 | |
forward | |
5.290 ms (23358 allocations: 1.29 MiB) | |
backward | |
20.239 ms (56275 allocations: 3.36 MiB) | |
forw and back | |
30.735 ms (93000 allocations: 5.62 MiB) | |
GRU CPU c=128 n=32 ts=4 | |
forward | |
48.161 μs (49 allocations: 61.70 KiB) | |
backward | |
215.723 μs (844 allocations: 405.17 KiB) | |
forw and back | |
408.124 μs (1737 allocations: 539.23 KiB) | |
GRU CUDA c=128 n=32 ts=4 | |
forward | |
482.053 μs (1429 allocations: 82.20 KiB) | |
backward | |
1.496 ms (3466 allocations: 213.53 KiB) | |
forw and back | |
2.217 ms (5809 allocations: 364.19 KiB) | |
GRU CPU c=128 n=32 ts=16 | |
forward | |
209.448 μs (193 allocations: 264.30 KiB) | |
backward | |
795.439 μs (3149 allocations: 1.65 MiB) | |
forw and back | |
1.444 ms (6479 allocations: 2.18 MiB) | |
GRU CUDA c=128 n=32 ts=16 | |
forward | |
1.489 ms (5713 allocations: 328.67 KiB) | |
backward | |
4.985 ms (13607 allocations: 844.50 KiB) | |
forw and back | |
7.435 ms (22671 allocations: 1.39 MiB) | |
GRU CPU c=128 n=32 ts=32 | |
forward | |
426.944 μs (385 allocations: 534.42 KiB) | |
backward | |
1.595 ms (6221 allocations: 3.33 MiB) | |
forw and back | |
2.840 ms (12799 allocations: 4.37 MiB) | |
GRU CUDA c=128 n=32 ts=32 | |
forward | |
2.745 ms (11425 allocations: 657.30 KiB) | |
backward | |
9.446 ms (27127 allocations: 1.65 MiB) | |
forw and back | |
14.175 ms (45151 allocations: 2.77 MiB) | |
GRU CPU c=128 n=32 ts=64 | |
forward | |
859.223 μs (769 allocations: 1.05 MiB) | |
backward | |
3.385 ms (12365 allocations: 6.68 MiB) | |
forw and back | |
5.770 ms (25439 allocations: 8.77 MiB) | |
GRU CUDA c=128 n=32 ts=64 | |
forward | |
5.302 ms (22849 allocations: 1.28 MiB) | |
backward | |
18.317 ms (54167 allocations: 3.29 MiB) | |
forw and back | |
27.563 ms (90111 allocations: 5.54 MiB) | |
GRU CPU c=128 n=128 ts=4 | |
forward | |
494.162 μs (49 allocations: 238.02 KiB) | |
backward | |
1.078 ms (852 allocations: 1.17 MiB) | |
forw and back | |
1.902 ms (1741 allocations: 1.40 MiB) | |
GRU CUDA c=128 n=128 ts=4 | |
forward | |
465.380 μs (1437 allocations: 82.33 KiB) | |
backward | |
1.477 ms (3502 allocations: 214.09 KiB) | |
forw and back | |
2.239 ms (5845 allocations: 364.75 KiB) | |
GRU CPU c=128 n=128 ts=16 | |
forward | |
2.125 ms (193 allocations: 1.00 MiB) | |
backward | |
4.355 ms (3181 allocations: 4.86 MiB) | |
forw and back | |
7.485 ms (6495 allocations: 5.85 MiB) | |
GRU CUDA c=128 n=128 ts=16 | |
forward | |
1.477 ms (5745 allocations: 329.17 KiB) | |
backward | |
5.129 ms (13751 allocations: 846.75 KiB) | |
forw and back | |
7.632 ms (22815 allocations: 1.39 MiB) | |
GRU CPU c=128 n=128 ts=32 | |
forward | |
4.230 ms (385 allocations: 2.02 MiB) | |
backward | |
8.788 ms (6285 allocations: 9.78 MiB) | |
forw and back | |
14.835 ms (12831 allocations: 11.78 MiB) | |
GRU CUDA c=128 n=128 ts=32 | |
forward | |
2.758 ms (11489 allocations: 658.30 KiB) | |
backward | |
9.839 ms (27415 allocations: 1.65 MiB) | |
forw and back | |
14.647 ms (45439 allocations: 2.78 MiB) | |
GRU CPU c=128 n=128 ts=64 | |
forward | |
8.546 ms (769 allocations: 4.07 MiB) | |
backward | |
17.662 ms (12493 allocations: 19.62 MiB) | |
forw and back | |
30.030 ms (25503 allocations: 23.65 MiB) | |
GRU CUDA c=128 n=128 ts=64 | |
forward | |
5.346 ms (22977 allocations: 1.29 MiB) | |
backward | |
19.017 ms (54743 allocations: 3.30 MiB) | |
forw and back | |
28.545 ms (90687 allocations: 5.55 MiB) | |
GRU CPU c=128 n=512 ts=4 | |
forward | |
1.467 ms (56 allocations: 933.47 KiB) | |
backward | |
2.078 ms (880 allocations: 4.23 MiB) | |
forw and back | |
3.707 ms (1788 allocations: 4.87 MiB) | |
GRU CUDA c=128 n=512 ts=4 | |
forward | |
471.027 μs (1458 allocations: 82.66 KiB) | |
backward | |
1.575 ms (3594 allocations: 217.84 KiB) | |
forw and back | |
2.332 ms (5998 allocations: 370.22 KiB) | |
GRU CPU c=128 n=512 ts=16 | |
forward | |
6.161 ms (224 allocations: 3.93 MiB) | |
backward | |
8.740 ms (3305 allocations: 17.62 MiB) | |
forw and back | |
15.252 ms (6698 allocations: 20.42 MiB) | |
GRU CUDA c=128 n=512 ts=16 | |
forward | |
1.481 ms (5838 allocations: 330.62 KiB) | |
backward | |
5.411 ms (14131 allocations: 861.94 KiB) | |
forw and back | |
7.896 ms (23400 allocations: 1.41 MiB) | |
GRU CPU c=128 n=512 ts=32 | |
forward | |
10.098 ms (448 allocations: 7.95 MiB) | |
backward | |
17.682 ms (6537 allocations: 35.47 MiB) | |
forw and back | |
30.554 ms (13242 allocations: 41.15 MiB) | |
GRU CUDA c=128 n=512 ts=32 | |
forward | |
2.800 ms (11678 allocations: 661.25 KiB) | |
backward | |
10.446 ms (28179 allocations: 1.68 MiB) | |
forw and back | |
15.398 ms (46600 allocations: 2.82 MiB) | |
GRU CPU c=128 n=512 ts=64 | |
forward | |
25.142 ms (896 allocations: 15.99 MiB) | |
backward | |
35.435 ms (13001 allocations: 71.16 MiB) | |
forw and back | |
60.727 ms (26330 allocations: 82.62 MiB) | |
GRU CUDA c=128 n=512 ts=64 | |
forward | |
5.479 ms (23358 allocations: 1.29 MiB) | |
backward | |
20.468 ms (56275 allocations: 3.36 MiB) | |
forw and back | |
30.247 ms (93000 allocations: 5.62 MiB) | |
GRU CPU c=512 n=32 ts=4 | |
forward | |
311.081 μs (49 allocations: 61.70 KiB) | |
backward | |
937.930 μs (859 allocations: 1.02 MiB) | |
forw and back | |
1.183 ms (1748 allocations: 982.38 KiB) | |
GRU CUDA c=512 n=32 ts=4 | |
forward | |
500.746 μs (1449 allocations: 82.52 KiB) | |
backward | |
1.524 ms (3498 allocations: 214.03 KiB) | |
forw and back | |
2.272 ms (5861 allocations: 365.00 KiB) | |
GRU CPU c=512 n=32 ts=16 | |
forward | |
1.297 ms (193 allocations: 264.30 KiB) | |
backward | |
3.842 ms (3212 allocations: 4.24 MiB) | |
forw and back | |
6.129 ms (6526 allocations: 4.01 MiB) | |
GRU CUDA c=512 n=32 ts=16 | |
forward | |
1.538 ms (5793 allocations: 329.92 KiB) | |
backward | |
5.103 ms (13735 allocations: 846.50 KiB) | |
forw and back | |
7.769 ms (22879 allocations: 1.40 MiB) | |
GRU CPU c=512 n=32 ts=32 | |
forward | |
2.623 ms (385 allocations: 534.42 KiB) | |
backward | |
7.902 ms (6348 allocations: 8.53 MiB) | |
forw and back | |
12.426 ms (12894 allocations: 8.08 MiB) | |
GRU CUDA c=512 n=32 ts=32 | |
forward | |
2.858 ms (11585 allocations: 659.80 KiB) | |
backward | |
9.831 ms (27383 allocations: 1.65 MiB) | |
forw and back | |
14.704 ms (45567 allocations: 2.78 MiB) | |
GRU CPU c=512 n=32 ts=64 | |
forward | |
5.302 ms (769 allocations: 1.05 MiB) | |
backward | |
15.853 ms (12620 allocations: 17.12 MiB) | |
forw and back | |
24.895 ms (25630 allocations: 16.22 MiB) | |
GRU CUDA c=512 n=32 ts=64 | |
forward | |
5.513 ms (23169 allocations: 1.29 MiB) | |
backward | |
18.977 ms (54679 allocations: 3.30 MiB) | |
forw and back | |
28.441 ms (90943 allocations: 5.55 MiB) | |
GRU CPU c=512 n=128 ts=4 | |
forward | |
587.296 μs (49 allocations: 238.02 KiB) | |
backward | |
1.421 ms (859 allocations: 2.91 MiB) | |
forw and back | |
2.214 ms (1748 allocations: 2.40 MiB) | |
GRU CUDA c=512 n=128 ts=4 | |
forward | |
490.172 μs (1457 allocations: 82.64 KiB) | |
backward | |
1.532 ms (3534 allocations: 214.59 KiB) | |
forw and back | |
2.317 ms (5897 allocations: 365.56 KiB) | |
GRU CPU c=512 n=128 ts=16 | |
forward | |
2.443 ms (193 allocations: 1.00 MiB) | |
backward | |
6.062 ms (3212 allocations: 11.94 MiB) | |
forw and back | |
8.953 ms (6526 allocations: 9.94 MiB) | |
GRU CUDA c=512 n=128 ts=16 | |
forward | |
1.482 ms (5825 allocations: 330.42 KiB) | |
backward | |
5.141 ms (13879 allocations: 848.75 KiB) | |
forw and back | |
7.694 ms (23023 allocations: 1.40 MiB) | |
GRU CPU c=512 n=128 ts=32 | |
forward | |
4.997 ms (385 allocations: 2.02 MiB) | |
backward | |
12.083 ms (6348 allocations: 23.99 MiB) | |
forw and back | |
17.802 ms (12894 allocations: 19.99 MiB) | |
GRU CUDA c=512 n=128 ts=32 | |
forward | |
2.768 ms (11649 allocations: 660.80 KiB) | |
backward | |
9.899 ms (27671 allocations: 1.65 MiB) | |
forw and back | |
14.722 ms (45855 allocations: 2.79 MiB) | |
GRU CPU c=512 n=128 ts=64 | |
forward | |
10.132 ms (769 allocations: 4.07 MiB) | |
backward | |
24.323 ms (12620 allocations: 48.07 MiB) | |
forw and back | |
35.536 ms (25630 allocations: 40.10 MiB) | |
GRU CUDA c=512 n=128 ts=64 | |
forward | |
5.351 ms (23297 allocations: 1.29 MiB) | |
backward | |
19.270 ms (55255 allocations: 3.31 MiB) | |
forw and back | |
28.835 ms (91519 allocations: 5.56 MiB) | |
GRU CPU c=512 n=512 ts=4 | |
forward | |
1.769 ms (56 allocations: 933.47 KiB) | |
backward | |
3.704 ms (887 allocations: 10.48 MiB) | |
forw and back | |
5.138 ms (1795 allocations: 8.11 MiB) | |
GRU CUDA c=512 n=512 ts=4 | |
forward | |
522.495 μs (1478 allocations: 82.97 KiB) | |
backward | |
1.630 ms (3626 allocations: 218.34 KiB) | |
forw and back | |
2.422 ms (6050 allocations: 371.03 KiB) | |
GRU CPU c=512 n=512 ts=16 | |
forward | |
7.603 ms (224 allocations: 3.93 MiB) | |
backward | |
14.938 ms (3336 allocations: 42.71 MiB) | |
forw and back | |
20.314 ms (6729 allocations: 33.50 MiB) | |
GRU CUDA c=512 n=512 ts=16 | |
forward | |
1.633 ms (5918 allocations: 331.88 KiB) | |
backward | |
5.625 ms (14259 allocations: 863.94 KiB) | |
forw and back | |
8.189 ms (23608 allocations: 1.42 MiB) | |
GRU CPU c=512 n=512 ts=32 | |
forward | |
15.317 ms (448 allocations: 7.95 MiB) | |
backward | |
30.130 ms (6600 allocations: 85.68 MiB) | |
forw and back | |
40.710 ms (13305 allocations: 67.36 MiB) | |
GRU CUDA c=512 n=512 ts=32 | |
forward | |
3.070 ms (11838 allocations: 663.75 KiB) | |
backward | |
10.983 ms (28435 allocations: 1.68 MiB) | |
forw and back | |
15.912 ms (47016 allocations: 2.82 MiB) | |
GRU CPU c=512 n=512 ts=64 | |
forward | |
30.749 ms (896 allocations: 15.99 MiB) | |
backward | |
60.205 ms (13128 allocations: 171.62 MiB) | |
forw and back | |
80.993 ms (26457 allocations: 135.07 MiB) | |
GRU CUDA c=512 n=512 ts=64 | |
forward | |
6.003 ms (23678 allocations: 1.30 MiB) | |
backward | |
22.428 ms (56787 allocations: 3.37 MiB) | |
forw and back | |
31.418 ms (93832 allocations: 5.64 MiB) |
branch | rnn_type | device | features | batch_size | timesteps | passes | time_μs | alloc_num | alloc_KiB | |
---|---|---|---|---|---|---|---|---|---|---|
master | LSTM | CPU | 32 | 32 | 4 | forward | 39.939998626709 | 53 | 88.1399993896484 | |
master | LSTM | CPU | 32 | 32 | 4 | backward | 156.791000366211 | 608 | 233.699996948242 | |
master | LSTM | CPU | 32 | 32 | 4 | forw and back | 334.136993408203 | 1495 | 405.920013427734 | |
master | LSTM | CUDA | 32 | 32 | 4 | forward | 443.264007568359 | 1289 | 70.6999969482422 | |
master | LSTM | CUDA | 32 | 32 | 4 | backward | 1068.0 | 3075 | 208.309997558594 | |
master | LSTM | CUDA | 32 | 32 | 4 | forw and back | 1699.0 | 5005 | 321.980010986328 | |
master | LSTM | CPU | 32 | 32 | 16 | forward | 176.345993041992 | 209 | 364.230010986328 | |
master | LSTM | CPU | 32 | 32 | 16 | backward | 587.888977050781 | 2300 | 950.72998046875 | |
master | LSTM | CPU | 32 | 32 | 16 | forw and back | 1216.0 | 5575 | 1590.0 | |
master | LSTM | CUDA | 32 | 32 | 16 | forward | 1384.0 | 5153 | 282.670013427734 | |
master | LSTM | CUDA | 32 | 32 | 16 | backward | 3591.0 | 12219 | 831.169982910156 | |
master | LSTM | CUDA | 32 | 32 | 16 | forw and back | 5630.0 | 19597 | 1230.0 | |
master | LSTM | CPU | 32 | 32 | 32 | forward | 361.653015136719 | 417 | 732.359985351562 | |
master | LSTM | CPU | 32 | 32 | 32 | backward | 1169.0 | 4557 | 1860.0 | |
master | LSTM | CPU | 32 | 32 | 32 | forw and back | 2377.0 | 11017 | 3190.0 | |
master | LSTM | CUDA | 32 | 32 | 32 | forward | 2631.0 | 10305 | 565.299987792969 | |
master | LSTM | CUDA | 32 | 32 | 32 | backward | 6895.0 | 24412 | 1620.0 | |
master | LSTM | CUDA | 32 | 32 | 32 | forw and back | 10829.0 | 39055 | 2460.0 | |
master | LSTM | CPU | 32 | 32 | 64 | forward | 734.004028320312 | 833 | 1430.0 | |
master | LSTM | CPU | 32 | 32 | 64 | backward | 2354.0 | 9069 | 3730.0 | |
master | LSTM | CPU | 32 | 32 | 64 | forw and back | 4719.0 | 21897 | 6380.0 | |
master | LSTM | CUDA | 32 | 32 | 64 | forward | 5097.0 | 20609 | 1100.0 | |
master | LSTM | CUDA | 32 | 32 | 64 | backward | 13517.0 | 48796 | 3240.0 | |
master | LSTM | CUDA | 32 | 32 | 64 | forw and back | 21213.0 | 77967 | 4910.0 | |
master | LSTM | CPU | 32 | 128 | 4 | forward | 155.343994140625 | 53 | 342.640014648437 | |
master | LSTM | CPU | 32 | 128 | 4 | backward | 275.822998046875 | 608 | 722.830017089844 | |
master | LSTM | CPU | 32 | 128 | 4 | forw and back | 600.094970703125 | 1495 | 1160.0 | |
master | LSTM | CUDA | 32 | 128 | 4 | forward | 462.463989257812 | 1297 | 70.8300018310547 | |
master | LSTM | CUDA | 32 | 128 | 4 | backward | 1115.0 | 3119 | 209.0 | |
master | LSTM | CUDA | 32 | 128 | 4 | forw and back | 1754.0 | 5049 | 322.670013427734 | |
master | LSTM | CPU | 32 | 128 | 16 | forward | 651.596984863281 | 209 | 1380.0 | |
master | LSTM | CPU | 32 | 128 | 16 | backward | 1069.0 | 2300 | 2860.0 | |
master | LSTM | CPU | 32 | 128 | 16 | forw and back | 2247.0 | 5575 | 4700.0 | |
master | LSTM | CUDA | 32 | 128 | 16 | forward | 1414.0 | 5185 | 283.170013427734 | |
master | LSTM | CUDA | 32 | 128 | 16 | backward | 3729.0 | 12395 | 833.919982910156 | |
master | LSTM | CUDA | 32 | 128 | 16 | forw and back | 5785.0 | 19773 | 1240.0 | |
master | LSTM | CPU | 32 | 128 | 32 | forward | 1309.0 | 417 | 2790.0 | |
master | LSTM | CPU | 32 | 128 | 32 | backward | 2192.0 | 4557 | 5730.0 | |
master | LSTM | CPU | 32 | 128 | 32 | forw and back | 4490.0 | 11017 | 9410.0 | |
master | LSTM | CUDA | 32 | 128 | 32 | forward | 2656.0 | 10369 | 566.299987792969 | |
master | LSTM | CUDA | 32 | 128 | 32 | backward | 7144.0 | 24764 | 1630.0 | |
master | LSTM | CUDA | 32 | 128 | 32 | forw and back | 11062.0 | 39407 | 2460.0 | |
master | LSTM | CPU | 32 | 128 | 64 | forward | 2630.0 | 833 | 5590.0 | |
master | LSTM | CPU | 32 | 128 | 64 | backward | 4550.0 | 9069 | 11460.0 | |
master | LSTM | CPU | 32 | 128 | 64 | forw and back | 8997.0 | 21897 | 18840.0 | |
master | LSTM | CUDA | 32 | 128 | 64 | forward | 5159.0 | 20737 | 1110.0 | |
master | LSTM | CUDA | 32 | 128 | 64 | backward | 13942.0 | 49500 | 3260.0 | |
master | LSTM | CUDA | 32 | 128 | 64 | forw and back | 21668.0 | 78671 | 4920.0 | |
master | LSTM | CPU | 32 | 512 | 4 | forward | 1488.0 | 64 | 1320.0 | |
master | LSTM | CPU | 32 | 512 | 4 | backward | 1690.0 | 636 | 2600.0 | |
master | LSTM | CPU | 32 | 512 | 4 | forw and back | 3664.0 | 1546 | 4180.0 | |
master | LSTM | CUDA | 32 | 512 | 4 | forward | 461.363006591797 | 1334 | 71.6600036621094 | |
master | LSTM | CUDA | 32 | 512 | 4 | backward | 1166.0 | 3147 | 209.440002441406 | |
master | LSTM | CUDA | 32 | 512 | 4 | forw and back | 1813.0 | 5146 | 325.079986572266 | |
master | LSTM | CPU | 32 | 512 | 16 | forward | 6258.0 | 256 | 5460.0 | |
master | LSTM | CPU | 32 | 512 | 16 | backward | 7021.0 | 2412 | 10510.0 | |
master | LSTM | CPU | 32 | 512 | 16 | forw and back | 14799.0 | 5782 | 16990.0 | |
master | LSTM | CUDA | 32 | 512 | 16 | forward | 1418.0 | 5342 | 286.619995117187 | |
master | LSTM | CUDA | 32 | 512 | 16 | backward | 3877.0 | 12519 | 835.859985351563 | |
master | LSTM | CUDA | 32 | 512 | 16 | forw and back | 6108.0 | 20134 | 1240.0 | |
master | LSTM | CPU | 32 | 512 | 32 | forward | 12599.0 | 512 | 10980.0 | |
master | LSTM | CPU | 32 | 512 | 32 | backward | 14229.0 | 4781 | 21060.0 | |
master | LSTM | CPU | 32 | 512 | 32 | forw and back | 29445.0 | 11432 | 34070.0 | |
master | LSTM | CUDA | 32 | 512 | 32 | forward | 2690.0 | 10686 | 573.25 | |
master | LSTM | CUDA | 32 | 512 | 32 | backward | 7451.0 | 25016 | 1630.0 | |
master | LSTM | CUDA | 32 | 512 | 32 | forw and back | 11705.0 | 40120 | 2480.0 | |
master | LSTM | CPU | 32 | 512 | 64 | forward | 25296.0 | 1024 | 22030.0 | |
master | LSTM | CPU | 32 | 512 | 64 | backward | 28496.0 | 9517 | 42150.0 | |
master | LSTM | CPU | 32 | 512 | 64 | forw and back | 58946.0 | 22728 | 68230.0 | |
master | LSTM | CUDA | 32 | 512 | 64 | forward | 5197.0 | 21374 | 1120.0 | |
master | LSTM | CUDA | 32 | 512 | 64 | backward | 14618.0 | 50008 | 3260.0 | |
master | LSTM | CUDA | 32 | 512 | 64 | forw and back | 23281.0 | 80088 | 4940.0 | |
master | LSTM | CPU | 128 | 32 | 4 | forward | 54.4169998168945 | 53 | 88.1399993896484 | |
master | LSTM | CPU | 128 | 32 | 4 | backward | 199.080001831055 | 608 | 413.700012207031 | |
master | LSTM | CPU | 128 | 32 | 4 | forw and back | 399.412994384766 | 1495 | 537.919982910156 | |
master | LSTM | CUDA | 128 | 32 | 4 | forward | 480.368011474609 | 1289 | 70.6999969482422 | |
master | LSTM | CUDA | 128 | 32 | 4 | backward | 1125.0 | 3075 | 208.309997558594 | |
master | LSTM | CUDA | 128 | 32 | 4 | forw and back | 1774.0 | 5005 | 321.980010986328 | |
master | LSTM | CPU | 128 | 32 | 16 | forward | 232.804992675781 | 209 | 364.230010986328 | |
master | LSTM | CPU | 128 | 32 | 16 | backward | 751.56201171875 | 2300 | 1670.0 | |
master | LSTM | CPU | 128 | 32 | 16 | forw and back | 1429.0 | 5575 | 2140.0 | |
master | LSTM | CUDA | 128 | 32 | 16 | forward | 1463.0 | 5153 | 282.670013427734 | |
master | LSTM | CUDA | 128 | 32 | 16 | backward | 3738.0 | 12219 | 831.169982910156 | |
master | LSTM | CUDA | 128 | 32 | 16 | forw and back | 5875.0 | 19597 | 1230.0 | |
master | LSTM | CPU | 128 | 32 | 32 | forward | 475.282989501953 | 417 | 732.359985351562 | |
master | LSTM | CPU | 128 | 32 | 32 | backward | 1525.0 | 4557 | 3350.0 | |
master | LSTM | CPU | 128 | 32 | 32 | forw and back | 2822.0 | 11017 | 4300.0 | |
master | LSTM | CUDA | 128 | 32 | 32 | forward | 2733.0 | 10305 | 565.299987792969 | |
master | LSTM | CUDA | 128 | 32 | 32 | backward | 7187.0 | 24412 | 1620.0 | |
master | LSTM | CUDA | 128 | 32 | 32 | forw and back | 11243.0 | 39055 | 2460.0 | |
master | LSTM | CPU | 128 | 32 | 64 | forward | 960.460998535156 | 833 | 1430.0 | |
master | LSTM | CPU | 128 | 32 | 64 | backward | 3211.0 | 9069 | 6720.0 | |
master | LSTM | CPU | 128 | 32 | 64 | forw and back | 5705.0 | 21897 | 8620.0 | |
master | LSTM | CUDA | 128 | 32 | 64 | forward | 5253.0 | 20609 | 1100.0 | |
master | LSTM | CUDA | 128 | 32 | 64 | backward | 13694.0 | 48796 | 3240.0 | |
master | LSTM | CUDA | 128 | 32 | 64 | forw and back | 21541.0 | 77967 | 4910.0 | |
master | LSTM | CPU | 128 | 128 | 4 | forward | 531.2509765625 | 53 | 342.640014648437 | |
master | LSTM | CPU | 128 | 128 | 4 | backward | 1021.0 | 616 | 1160.0 | |
master | LSTM | CPU | 128 | 128 | 4 | forw and back | 1912.0 | 1499 | 1430.0 | |
master | LSTM | CUDA | 128 | 128 | 4 | forward | 472.183013916016 | 1297 | 70.8300018310547 | |
master | LSTM | CUDA | 128 | 128 | 4 | backward | 1133.0 | 3119 | 209.0 | |
master | LSTM | CUDA | 128 | 128 | 4 | forw and back | 1789.0 | 5049 | 322.670013427734 | |
master | LSTM | CPU | 128 | 128 | 16 | forward | 1185.0 | 209 | 1380.0 | |
master | LSTM | CPU | 128 | 128 | 16 | backward | 4145.0 | 2332 | 4720.0 | |
master | LSTM | CPU | 128 | 128 | 16 | forw and back | 7534.0 | 5591 | 5810.0 | |
master | LSTM | CUDA | 128 | 128 | 16 | forward | 1443.0 | 5185 | 283.170013427734 | |
master | LSTM | CUDA | 128 | 128 | 16 | backward | 3749.0 | 12395 | 833.919982910156 | |
master | LSTM | CUDA | 128 | 128 | 16 | forw and back | 5830.0 | 19773 | 1240.0 | |
master | LSTM | CPU | 128 | 128 | 32 | forward | 4505.0 | 417 | 2790.0 | |
master | LSTM | CPU | 128 | 128 | 32 | backward | 8352.0 | 4621 | 9460.0 | |
master | LSTM | CPU | 128 | 128 | 32 | forw and back | 15025.0 | 11049 | 11650.0 | |
master | LSTM | CUDA | 128 | 128 | 32 | forward | 2710.0 | 10369 | 566.299987792969 | |
master | LSTM | CUDA | 128 | 128 | 32 | backward | 7201.0 | 24764 | 1630.0 | |
master | LSTM | CUDA | 128 | 128 | 32 | forw and back | 11155.0 | 39407 | 2460.0 | |
master | LSTM | CPU | 128 | 128 | 64 | forward | 9102.0 | 833 | 5590.0 | |
master | LSTM | CPU | 128 | 128 | 64 | backward | 16729.0 | 9197 | 18940.0 | |
master | LSTM | CPU | 128 | 128 | 64 | forw and back | 30072.0 | 21961 | 23320.0 | |
master | LSTM | CUDA | 128 | 128 | 64 | forward | 5187.0 | 20737 | 1110.0 | |
master | LSTM | CUDA | 128 | 128 | 64 | backward | 14091.0 | 49500 | 3260.0 | |
master | LSTM | CUDA | 128 | 128 | 64 | forw and back | 21876.0 | 78671 | 4920.0 | |
master | LSTM | CPU | 128 | 512 | 4 | forward | 1558.0 | 64 | 1320.0 | |
master | LSTM | CPU | 128 | 512 | 4 | backward | 1252.0 | 636 | 4180.0 | |
master | LSTM | CPU | 128 | 512 | 4 | forw and back | 3926.0 | 1546 | 5010.0 | |
master | LSTM | CUDA | 128 | 512 | 4 | forward | 496.510009765625 | 1334 | 71.6600036621094 | |
master | LSTM | CUDA | 128 | 512 | 4 | backward | 1194.0 | 3147 | 209.440002441406 | |
master | LSTM | CUDA | 128 | 512 | 4 | forw and back | 1863.0 | 5146 | 325.079986572266 | |
master | LSTM | CPU | 128 | 512 | 16 | forward | 6626.0 | 256 | 5460.0 | |
master | LSTM | CPU | 128 | 512 | 16 | backward | 7625.0 | 2412 | 16880.0 | |
master | LSTM | CPU | 128 | 512 | 16 | forw and back | 15896.0 | 5782 | 20350.0 | |
master | LSTM | CUDA | 128 | 512 | 16 | forward | 1492.0 | 5342 | 286.619995117187 | |
master | LSTM | CUDA | 128 | 512 | 16 | backward | 3982.0 | 12519 | 835.859985351563 | |
master | LSTM | CUDA | 128 | 512 | 16 | forw and back | 6097.0 | 20134 | 1240.0 | |
master | LSTM | CPU | 128 | 512 | 32 | forward | 12597.0 | 512 | 10980.0 | |
master | LSTM | CPU | 128 | 512 | 32 | backward | 16735.0 | 4781 | 33800.0 | |
master | LSTM | CPU | 128 | 512 | 32 | forw and back | 20492.0 | 11432 | 40810.0 | |
master | LSTM | CUDA | 128 | 512 | 32 | forward | 2865.0 | 10686 | 573.25 | |
master | LSTM | CUDA | 128 | 512 | 32 | backward | 7670.0 | 25016 | 1630.0 | |
master | LSTM | CUDA | 128 | 512 | 32 | forw and back | 11737.0 | 40120 | 2480.0 | |
master | LSTM | CPU | 128 | 512 | 64 | forward | 27121.0 | 1024 | 22030.0 | |
master | LSTM | CPU | 128 | 512 | 64 | backward | 33522.0 | 9517 | 67640.0 | |
master | LSTM | CPU | 128 | 512 | 64 | forw and back | 63381.0 | 22728 | 81710.0 | |
master | LSTM | CUDA | 128 | 512 | 64 | forward | 5428.0 | 21374 | 1120.0 | |
master | LSTM | CUDA | 128 | 512 | 64 | backward | 14902.0 | 50008 | 3260.0 | |
master | LSTM | CUDA | 128 | 512 | 64 | forw and back | 23089.0 | 80088 | 4940.0 | |
master | LSTM | CPU | 512 | 32 | 4 | forward | 299.046997070312 | 53 | 88.1399993896484 | |
master | LSTM | CPU | 512 | 32 | 4 | backward | 904.810974121094 | 623 | 1110.0 | |
master | LSTM | CPU | 512 | 32 | 4 | forw and back | 1529.0 | 1506 | 1040.0 | |
master | LSTM | CUDA | 512 | 32 | 4 | forward | 494.730987548828 | 1309 | 71.0199966430664 | |
master | LSTM | CUDA | 512 | 32 | 4 | backward | 1159.0 | 3107 | 208.809997558594 | |
master | LSTM | CUDA | 512 | 32 | 4 | forw and back | 1807.0 | 5057 | 322.799987792969 | |
master | LSTM | CPU | 512 | 32 | 16 | forward | 1239.0 | 209 | 364.230010986328 | |
master | LSTM | CPU | 512 | 32 | 16 | backward | 3695.0 | 2363 | 4620.0 | |
master | LSTM | CPU | 512 | 32 | 16 | forw and back | 5932.0 | 5622 | 4340.0 | |
master | LSTM | CUDA | 512 | 32 | 16 | forward | 1487.0 | 5233 | 283.920013427734 | |
master | LSTM | CUDA | 512 | 32 | 16 | backward | 3770.0 | 12347 | 833.169982910156 | |
master | LSTM | CUDA | 512 | 32 | 16 | forw and back | 5973.0 | 19805 | 1240.0 | |
master | LSTM | CPU | 512 | 32 | 32 | forward | 2487.0 | 417 | 732.359985351562 | |
master | LSTM | CPU | 512 | 32 | 32 | backward | 7416.0 | 4684 | 9290.0 | |
master | LSTM | CPU | 512 | 32 | 32 | forw and back | 11767.0 | 11112 | 8750.0 | |
master | LSTM | CUDA | 512 | 32 | 32 | forward | 2779.0 | 10465 | 567.799987792969 | |
master | LSTM | CUDA | 512 | 32 | 32 | backward | 7274.0 | 24668 | 1630.0 | |
master | LSTM | CUDA | 512 | 32 | 32 | forw and back | 11415.0 | 39471 | 2460.0 | |
master | LSTM | CPU | 512 | 32 | 64 | forward | 4289.0 | 833 | 1430.0 | |
master | LSTM | CPU | 512 | 32 | 64 | backward | 14788.0 | 9324 | 18650.0 | |
master | LSTM | CPU | 512 | 32 | 64 | forw and back | 23512.0 | 22088 | 17560.0 | |
master | LSTM | CUDA | 512 | 32 | 64 | forward | 5363.0 | 20929 | 1110.0 | |
master | LSTM | CUDA | 512 | 32 | 64 | backward | 14163.0 | 49308 | 3250.0 | |
master | LSTM | CUDA | 512 | 32 | 64 | forw and back | 22199.0 | 78799 | 4920.0 | |
master | LSTM | CPU | 512 | 128 | 4 | forward | 603.39599609375 | 53 | 342.640014648437 | |
master | LSTM | CPU | 512 | 128 | 4 | backward | 1379.0 | 623 | 2990.0 | |
master | LSTM | CPU | 512 | 128 | 4 | forw and back | 2237.0 | 1506 | 2510.0 | |
master | LSTM | CUDA | 512 | 128 | 4 | forward | 527.679992675781 | 1317 | 71.1399993896484 | |
master | LSTM | CUDA | 512 | 128 | 4 | backward | 1188.0 | 3151 | 209.5 | |
master | LSTM | CUDA | 512 | 128 | 4 | forw and back | 1904.0 | 5101 | 323.480010986328 | |
master | LSTM | CPU | 512 | 128 | 16 | forward | 2544.0 | 209 | 1380.0 | |
master | LSTM | CPU | 512 | 128 | 16 | backward | 5842.0 | 2363 | 12170.0 | |
master | LSTM | CPU | 512 | 128 | 16 | forw and back | 9017.0 | 5622 | 10260.0 | |
master | LSTM | CUDA | 512 | 128 | 16 | forward | 1623.0 | 5265 | 284.420013427734 | |
master | LSTM | CUDA | 512 | 128 | 16 | backward | 3877.0 | 12523 | 835.919982910156 | |
master | LSTM | CUDA | 512 | 128 | 16 | forw and back | 6127.0 | 19981 | 1240.0 | |
master | LSTM | CPU | 512 | 128 | 32 | forward | 5252.0 | 417 | 2790.0 | |
master | LSTM | CPU | 512 | 128 | 32 | backward | 11673.0 | 4684 | 24410.0 | |
master | LSTM | CPU | 512 | 128 | 32 | forw and back | 17942.0 | 11112 | 20590.0 | |
master | LSTM | CUDA | 512 | 128 | 32 | forward | 3000.0 | 10529 | 568.799987792969 | |
master | LSTM | CUDA | 512 | 128 | 32 | backward | 7391.0 | 25020 | 1630.0 | |
master | LSTM | CUDA | 512 | 128 | 32 | forw and back | 11691.0 | 39823 | 2470.0 | |
master | LSTM | CPU | 512 | 128 | 64 | forward | 10627.0 | 833 | 5590.0 | |
master | LSTM | CPU | 512 | 128 | 64 | backward | 23273.0 | 9324 | 48880.0 | |
master | LSTM | CPU | 512 | 128 | 64 | forw and back | 35678.0 | 22088 | 41270.0 | |
master | LSTM | CUDA | 512 | 128 | 64 | forward | 5809.0 | 21057 | 1110.0 | |
master | LSTM | CUDA | 512 | 128 | 64 | backward | 14461.0 | 50012 | 3260.0 | |
master | LSTM | CUDA | 512 | 128 | 64 | forw and back | 22961.0 | 79503 | 4930.0 | |
master | LSTM | CPU | 512 | 512 | 4 | forward | 1885.0 | 64 | 1320.0 | |
master | LSTM | CPU | 512 | 512 | 4 | backward | 3582.0 | 643 | 10510.0 | |
master | LSTM | CPU | 512 | 512 | 4 | forw and back | 5252.0 | 1553 | 8340.0 | |
master | LSTM | CUDA | 512 | 512 | 4 | forward | 526.255004882812 | 1354 | 71.9700012207031 | |
master | LSTM | CUDA | 512 | 512 | 4 | backward | 1243.0 | 3179 | 209.940002441406 | |
master | LSTM | CUDA | 512 | 512 | 4 | forw and back | 1923.0 | 5198 | 325.890014648437 | |
master | LSTM | CPU | 512 | 512 | 16 | forward | 8037.0 | 256 | 5460.0 | |
master | LSTM | CPU | 512 | 512 | 16 | backward | 14470.0 | 2443 | 42330.0 | |
master | LSTM | CPU | 512 | 512 | 16 | forw and back | 20734.0 | 5813 | 33800.0 | |
master | LSTM | CUDA | 512 | 512 | 16 | forward | 1580.0 | 5422 | 287.880004882812 | |
master | LSTM | CUDA | 512 | 512 | 16 | backward | 4161.0 | 12647 | 837.859985351563 | |
master | LSTM | CUDA | 512 | 512 | 16 | forw and back | 6324.0 | 20342 | 1250.0 | |
master | LSTM | CPU | 512 | 512 | 32 | forward | 16144.0 | 512 | 10980.0 | |
master | LSTM | CPU | 512 | 512 | 32 | backward | 28997.0 | 4844 | 84750.0 | |
master | LSTM | CPU | 512 | 512 | 32 | forw and back | 41463.0 | 11495 | 67750.0 | |
master | LSTM | CUDA | 512 | 512 | 32 | forward | 3018.0 | 10846 | 575.75 | |
master | LSTM | CUDA | 512 | 512 | 32 | backward | 8192.0 | 25272 | 1640.0 | |
master | LSTM | CUDA | 512 | 512 | 32 | forw and back | 12107.0 | 40536 | 2480.0 | |
master | LSTM | CPU | 512 | 512 | 64 | forward | 32463.0 | 1024 | 22030.0 | |
master | LSTM | CPU | 512 | 512 | 64 | backward | 58051.0 | 9644 | 169590.0 | |
master | LSTM | CPU | 512 | 512 | 64 | forw and back | 82756.0 | 22855 | 135660.0 | |
master | LSTM | CUDA | 512 | 512 | 64 | forward | 5804.0 | 21694 | 1120.0 | |
master | LSTM | CUDA | 512 | 512 | 64 | backward | 16487.0 | 50520 | 3270.0 | |
master | LSTM | CUDA | 512 | 512 | 64 | forw and back | 23959.0 | 80920 | 4960.0 | |
master | GRU | CPU | 32 | 32 | 4 | forward | 36.9179992675781 | 49 | 61.7000007629395 | |
master | GRU | CPU | 32 | 32 | 4 | backward | 168.076995849609 | 844 | 246.169998168945 | |
master | GRU | CPU | 32 | 32 | 4 | forw and back | 357.493011474609 | 1737 | 428.230010986328 | |
master | GRU | CUDA | 32 | 32 | 4 | forward | 488.684997558594 | 1429 | 82.1999969482422 | |
master | GRU | CUDA | 32 | 32 | 4 | backward | 1466.0 | 3466 | 213.529998779297 | |
master | GRU | CUDA | 32 | 32 | 4 | forw and back | 2199.0 | 5809 | 364.190002441406 | |
master | GRU | CPU | 32 | 32 | 16 | forward | 164.274993896484 | 193 | 264.299987792969 | |
master | GRU | CPU | 32 | 32 | 16 | backward | 642.880981445312 | 3149 | 1000.0 | |
master | GRU | CPU | 32 | 32 | 16 | forw and back | 1290.0 | 6479 | 1720.0 | |
master | GRU | CUDA | 32 | 32 | 16 | forward | 1474.0 | 5713 | 328.670013427734 | |
master | GRU | CUDA | 32 | 32 | 16 | backward | 4901.0 | 13607 | 844.5 | |
master | GRU | CUDA | 32 | 32 | 16 | forw and back | 7431.0 | 22671 | 1390.0 | |
master | GRU | CPU | 32 | 32 | 32 | forward | 337.221008300781 | 385 | 534.419982910156 | |
master | GRU | CPU | 32 | 32 | 32 | backward | 1263.0 | 6221 | 2020.0 | |
master | GRU | CPU | 32 | 32 | 32 | forw and back | 2519.0 | 12799 | 3440.0 | |
master | GRU | CUDA | 32 | 32 | 32 | forward | 2788.0 | 11425 | 657.299987792969 | |
master | GRU | CUDA | 32 | 32 | 32 | backward | 9441.0 | 27127 | 1650.0 | |
master | GRU | CUDA | 32 | 32 | 32 | forw and back | 14165.0 | 45151 | 2770.0 | |
master | GRU | CPU | 32 | 32 | 64 | forward | 682.10302734375 | 769 | 1050.0 | |
master | GRU | CPU | 32 | 32 | 64 | backward | 2534.0 | 12365 | 4060.0 | |
master | GRU | CPU | 32 | 32 | 64 | forw and back | 5012.0 | 25439 | 6900.0 | |
master | GRU | CUDA | 32 | 32 | 64 | forward | 5414.0 | 22849 | 1280.0 | |
master | GRU | CUDA | 32 | 32 | 64 | backward | 18435.0 | 54167 | 3290.0 | |
master | GRU | CUDA | 32 | 32 | 64 | forw and back | 27756.0 | 90111 | 5540.0 | |
master | GRU | CPU | 32 | 128 | 4 | forward | 145.934005737305 | 49 | 238.020004272461 | |
master | GRU | CPU | 32 | 128 | 4 | backward | 290.269012451172 | 844 | 747.299987792969 | |
master | GRU | CPU | 32 | 128 | 4 | forw and back | 601.927978515625 | 1737 | 1150.0 | |
master | GRU | CUDA | 32 | 128 | 4 | forward | 469.497985839844 | 1437 | 82.3300018310547 | |
master | GRU | CUDA | 32 | 128 | 4 | backward | 1481.0 | 3502 | 214.089996337891 | |
master | GRU | CUDA | 32 | 128 | 4 | forw and back | 2228.0 | 5845 | 364.75 | |
master | GRU | CPU | 32 | 128 | 16 | forward | 617.14697265625 | 193 | 1000.0 | |
master | GRU | CPU | 32 | 128 | 16 | backward | 1132.0 | 3149 | 3090.0 | |
master | GRU | CPU | 32 | 128 | 16 | forw and back | 2237.0 | 6479 | 4830.0 | |
master | GRU | CUDA | 32 | 128 | 16 | forward | 1491.0 | 5745 | 329.170013427734 | |
master | GRU | CUDA | 32 | 128 | 16 | backward | 4994.0 | 13751 | 846.75 | |
master | GRU | CUDA | 32 | 128 | 16 | forw and back | 7488.0 | 22815 | 1390.0 | |
master | GRU | CPU | 32 | 128 | 32 | forward | 1244.0 | 385 | 2020.0 | |
master | GRU | CPU | 32 | 128 | 32 | backward | 2306.0 | 6221 | 6230.0 | |
master | GRU | CPU | 32 | 128 | 32 | forw and back | 4461.0 | 12799 | 9730.0 | |
master | GRU | CUDA | 32 | 128 | 32 | forward | 2786.0 | 11489 | 658.299987792969 | |
master | GRU | CUDA | 32 | 128 | 32 | backward | 9673.0 | 27415 | 1650.0 | |
master | GRU | CUDA | 32 | 128 | 32 | forw and back | 14565.0 | 45439 | 2780.0 | |
master | GRU | CPU | 32 | 128 | 64 | forward | 2508.0 | 769 | 4070.0 | |
master | GRU | CPU | 32 | 128 | 64 | backward | 4782.0 | 12365 | 12510.0 | |
master | GRU | CPU | 32 | 128 | 64 | forw and back | 8927.0 | 25439 | 19540.0 | |
master | GRU | CUDA | 32 | 128 | 64 | forward | 5382.0 | 22977 | 1290.0 | |
master | GRU | CUDA | 32 | 128 | 64 | backward | 18704.0 | 54743 | 3300.0 | |
master | GRU | CUDA | 32 | 128 | 64 | forw and back | 28415.0 | 90687 | 5550.0 | |
master | GRU | CPU | 32 | 512 | 4 | forward | 1414.0 | 56 | 933.469970703125 | |
master | GRU | CPU | 32 | 512 | 4 | backward | 1809.0 | 880 | 2670.0 | |
master | GRU | CPU | 32 | 512 | 4 | forw and back | 3539.0 | 1788 | 4060.0 | |
master | GRU | CUDA | 32 | 512 | 4 | forward | 471.503997802734 | 1458 | 82.6600036621094 | |
master | GRU | CUDA | 32 | 512 | 4 | backward | 1545.0 | 3594 | 217.839996337891 | |
master | GRU | CUDA | 32 | 512 | 4 | forw and back | 2299.0 | 5998 | 370.220001220703 | |
master | GRU | CPU | 32 | 512 | 16 | forward | 5816.0 | 224 | 3930.0 | |
master | GRU | CPU | 32 | 512 | 16 | backward | 7534.0 | 3305 | 11350.0 | |
master | GRU | CPU | 32 | 512 | 16 | forw and back | 14328.0 | 6698 | 17150.0 | |
master | GRU | CUDA | 32 | 512 | 16 | forward | 1482.0 | 5838 | 330.619995117187 | |
master | GRU | CUDA | 32 | 512 | 16 | backward | 5323.0 | 14131 | 861.940002441406 | |
master | GRU | CUDA | 32 | 512 | 16 | forw and back | 8156.0 | 23400 | 1410.0 | |
master | GRU | CPU | 32 | 512 | 32 | forward | 11724.0 | 448 | 7950.0 | |
master | GRU | CPU | 32 | 512 | 32 | backward | 15170.0 | 6537 | 22910.0 | |
master | GRU | CPU | 32 | 512 | 32 | forw and back | 28593.0 | 13242 | 34600.0 | |
master | GRU | CUDA | 32 | 512 | 32 | forward | 2800.0 | 11678 | 661.25 | |
master | GRU | CUDA | 32 | 512 | 32 | backward | 10281.0 | 28179 | 1680.0 | |
master | GRU | CUDA | 32 | 512 | 32 | forw and back | 15715.0 | 46600 | 2820.0 | |
master | GRU | CPU | 32 | 512 | 64 | forward | 23528.0 | 896 | 15990.0 | |
master | GRU | CPU | 32 | 512 | 64 | backward | 30385.0 | 13001 | 46050.0 | |
master | GRU | CPU | 32 | 512 | 64 | forw and back | 57075.0 | 26330 | 69500.0 | |
master | GRU | CUDA | 32 | 512 | 64 | forward | 5475.0 | 23358 | 1290.0 | |
master | GRU | CUDA | 32 | 512 | 64 | backward | 20310.0 | 56275 | 3360.0 | |
master | GRU | CUDA | 32 | 512 | 64 | forw and back | 30778.0 | 93000 | 5620.0 | |
master | GRU | CPU | 128 | 32 | 4 | forward | 49.060001373291 | 49 | 61.7000007629395 | |
master | GRU | CPU | 128 | 32 | 4 | backward | 203.59700012207 | 844 | 405.170013427734 | |
master | GRU | CPU | 128 | 32 | 4 | forw and back | 408.595001220703 | 1737 | 539.22998046875 | |
master | GRU | CUDA | 128 | 32 | 4 | forward | 499.382995605469 | 1429 | 82.1999969482422 | |
master | GRU | CUDA | 128 | 32 | 4 | backward | 1507.0 | 3466 | 213.529998779297 | |
master | GRU | CUDA | 128 | 32 | 4 | forw and back | 2247.0 | 5809 | 364.190002441406 | |
master | GRU | CPU | 128 | 32 | 16 | forward | 211.781997680664 | 193 | 264.299987792969 | |
master | GRU | CPU | 128 | 32 | 16 | backward | 763.973022460938 | 3149 | 1650.0 | |
master | GRU | CPU | 128 | 32 | 16 | forw and back | 1447.0 | 6479 | 2180.0 | |
master | GRU | CUDA | 128 | 32 | 16 | forward | 1528.0 | 5713 | 328.670013427734 | |
master | GRU | CUDA | 128 | 32 | 16 | backward | 5001.0 | 13607 | 844.5 | |
master | GRU | CUDA | 128 | 32 | 16 | forw and back | 7507.0 | 22671 | 1390.0 | |
master | GRU | CPU | 128 | 32 | 32 | forward | 430.852996826172 | 385 | 534.419982910156 | |
master | GRU | CPU | 128 | 32 | 32 | backward | 1544.0 | 6221 | 3330.0 | |
master | GRU | CPU | 128 | 32 | 32 | forw and back | 2857.0 | 12799 | 4370.0 | |
master | GRU | CUDA | 128 | 32 | 32 | forward | 2827.0 | 11425 | 657.299987792969 | |
master | GRU | CUDA | 128 | 32 | 32 | backward | 9425.0 | 27127 | 1650.0 | |
master | GRU | CUDA | 128 | 32 | 32 | forw and back | 14271.0 | 45151 | 2770.0 | |
master | GRU | CPU | 128 | 32 | 64 | forward | 868.632995605469 | 769 | 1050.0 | |
master | GRU | CPU | 128 | 32 | 64 | backward | 3253.0 | 12365 | 6680.0 | |
master | GRU | CPU | 128 | 32 | 64 | forw and back | 5797.0 | 25439 | 8770.0 | |
master | GRU | CUDA | 128 | 32 | 64 | forward | 5454.0 | 22849 | 1280.0 | |
master | GRU | CUDA | 128 | 32 | 64 | backward | 18304.0 | 54167 | 3290.0 | |
master | GRU | CUDA | 128 | 32 | 64 | forw and back | 27904.0 | 90111 | 5540.0 | |
master | GRU | CPU | 128 | 128 | 4 | forward | 504.920013427734 | 49 | 238.020004272461 | |
master | GRU | CPU | 128 | 128 | 4 | backward | 1054.0 | 852 | 1170.0 | |
master | GRU | CPU | 128 | 128 | 4 | forw and back | 1904.0 | 1741 | 1400.0 | |
master | GRU | CUDA | 128 | 128 | 4 | forward | 476.441986083984 | 1437 | 82.3300018310547 | |
master | GRU | CUDA | 128 | 128 | 4 | backward | 1482.0 | 3502 | 214.089996337891 | |
master | GRU | CUDA | 128 | 128 | 4 | forw and back | 2258.0 | 5845 | 364.75 | |
master | GRU | CPU | 128 | 128 | 16 | forward | 2126.0 | 193 | 1000.0 | |
master | GRU | CPU | 128 | 128 | 16 | backward | 4301.0 | 3181 | 4860.0 | |
master | GRU | CPU | 128 | 128 | 16 | forw and back | 4592.0 | 6495 | 5850.0 | |
master | GRU | CUDA | 128 | 128 | 16 | forward | 1507.0 | 5745 | 329.170013427734 | |
master | GRU | CUDA | 128 | 128 | 16 | backward | 5061.0 | 13751 | 846.75 | |
master | GRU | CUDA | 128 | 128 | 16 | forw and back | 7577.0 | 22815 | 1390.0 | |
master | GRU | CPU | 128 | 128 | 32 | forward | 4291.0 | 385 | 2020.0 | |
master | GRU | CPU | 128 | 128 | 32 | backward | 8731.0 | 6285 | 9780.0 | |
master | GRU | CPU | 128 | 128 | 32 | forw and back | 14893.0 | 12831 | 11780.0 | |
master | GRU | CUDA | 128 | 128 | 32 | forward | 2798.0 | 11489 | 658.299987792969 | |
master | GRU | CUDA | 128 | 128 | 32 | backward | 9674.0 | 27415 | 1650.0 | |
master | GRU | CUDA | 128 | 128 | 32 | forw and back | 14475.0 | 45439 | 2780.0 | |
master | GRU | CPU | 128 | 128 | 64 | forward | 8624.0 | 769 | 4070.0 | |
master | GRU | CPU | 128 | 128 | 64 | backward | 17444.0 | 12493 | 19620.0 | |
master | GRU | CPU | 128 | 128 | 64 | forw and back | 29794.0 | 25503 | 23650.0 | |
master | GRU | CUDA | 128 | 128 | 64 | forward | 5462.0 | 22977 | 1290.0 | |
master | GRU | CUDA | 128 | 128 | 64 | backward | 18837.0 | 54743 | 3300.0 | |
master | GRU | CUDA | 128 | 128 | 64 | forw and back | 28386.0 | 90687 | 5550.0 | |
master | GRU | CPU | 128 | 512 | 4 | forward | 1494.0 | 56 | 933.469970703125 | |
master | GRU | CPU | 128 | 512 | 4 | backward | 2102.0 | 880 | 4230.0 | |
master | GRU | CPU | 128 | 512 | 4 | forw and back | 3756.0 | 1788 | 4870.0 | |
master | GRU | CUDA | 128 | 512 | 4 | forward | 475.424987792969 | 1458 | 82.6600036621094 | |
master | GRU | CUDA | 128 | 512 | 4 | backward | 1550.0 | 3594 | 217.839996337891 | |
master | GRU | CUDA | 128 | 512 | 4 | forw and back | 2331.0 | 5998 | 370.220001220703 | |
master | GRU | CPU | 128 | 512 | 16 | forward | 6206.0 | 224 | 3930.0 | |
master | GRU | CPU | 128 | 512 | 16 | backward | 8760.0 | 3305 | 17620.0 | |
master | GRU | CPU | 128 | 512 | 16 | forw and back | 15116.0 | 6698 | 20420.0 | |
master | GRU | CUDA | 128 | 512 | 16 | forward | 1505.0 | 5838 | 330.619995117187 | |
master | GRU | CUDA | 128 | 512 | 16 | backward | 5327.0 | 14131 | 861.940002441406 | |
master | GRU | CUDA | 128 | 512 | 16 | forw and back | 7866.0 | 23400 | 1410.0 | |
master | GRU | CPU | 128 | 512 | 32 | forward | 12577.0 | 448 | 7950.0 | |
master | GRU | CPU | 128 | 512 | 32 | backward | 17580.0 | 6537 | 35470.0 | |
master | GRU | CPU | 128 | 512 | 32 | forw and back | 30524.0 | 13242 | 41150.0 | |
master | GRU | CUDA | 128 | 512 | 32 | forward | 2848.0 | 11678 | 661.25 | |
master | GRU | CUDA | 128 | 512 | 32 | backward | 10367.0 | 28179 | 1680.0 | |
master | GRU | CUDA | 128 | 512 | 32 | forw and back | 15281.0 | 46600 | 2820.0 | |
master | GRU | CPU | 128 | 512 | 64 | forward | 25270.0 | 896 | 15990.0 | |
master | GRU | CPU | 128 | 512 | 64 | backward | 35290.0 | 13001 | 71160.0 | |
master | GRU | CPU | 128 | 512 | 64 | forw and back | 60744.0 | 26330 | 82620.0 | |
master | GRU | CUDA | 128 | 512 | 64 | forward | 5532.0 | 23358 | 1290.0 | |
master | GRU | CUDA | 128 | 512 | 64 | backward | 20284.0 | 56275 | 3360.0 | |
master | GRU | CUDA | 128 | 512 | 64 | forw and back | 30075.0 | 93000 | 5620.0 | |
master | GRU | CPU | 512 | 32 | 4 | forward | 319.101989746094 | 49 | 61.7000007629395 | |
master | GRU | CPU | 512 | 32 | 4 | backward | 923.757995605469 | 859 | 1020.0 | |
master | GRU | CPU | 512 | 32 | 4 | forw and back | 1572.0 | 1748 | 982.380004882813 | |
master | GRU | CUDA | 512 | 32 | 4 | forward | 507.290008544922 | 1449 | 82.5199966430664 | |
master | GRU | CUDA | 512 | 32 | 4 | backward | 1492.0 | 3498 | 214.029998779297 | |
master | GRU | CUDA | 512 | 32 | 4 | forw and back | 2245.0 | 5861 | 365.0 | |
master | GRU | CPU | 512 | 32 | 16 | forward | 1164.0 | 193 | 264.299987792969 | |
master | GRU | CPU | 512 | 32 | 16 | backward | 3791.0 | 3212 | 4240.0 | |
master | GRU | CPU | 512 | 32 | 16 | forw and back | 6120.0 | 6526 | 4010.0 | |
master | GRU | CUDA | 512 | 32 | 16 | forward | 1564.0 | 5793 | 329.920013427734 | |
master | GRU | CUDA | 512 | 32 | 16 | backward | 5027.0 | 13735 | 846.5 | |
master | GRU | CUDA | 512 | 32 | 16 | forw and back | 7642.0 | 22879 | 1400.0 | |
master | GRU | CPU | 512 | 32 | 32 | forward | 2637.0 | 385 | 534.419982910156 | |
master | GRU | CPU | 512 | 32 | 32 | backward | 7665.0 | 6348 | 8530.0 | |
master | GRU | CPU | 512 | 32 | 32 | forw and back | 12242.0 | 12894 | 8080.0 | |
master | GRU | CUDA | 512 | 32 | 32 | forward | 2881.0 | 11585 | 659.799987792969 | |
master | GRU | CUDA | 512 | 32 | 32 | backward | 9533.0 | 27383 | 1650.0 | |
master | GRU | CUDA | 512 | 32 | 32 | forw and back | 14572.0 | 45567 | 2780.0 | |
master | GRU | CPU | 512 | 32 | 64 | forward | 5305.0 | 769 | 1050.0 | |
master | GRU | CPU | 512 | 32 | 64 | backward | 15232.0 | 12620 | 17120.0 | |
master | GRU | CPU | 512 | 32 | 64 | forw and back | 24258.0 | 25630 | 16220.0 | |
master | GRU | CUDA | 512 | 32 | 64 | forward | 5537.0 | 23169 | 1290.0 | |
master | GRU | CUDA | 512 | 32 | 64 | backward | 18534.0 | 54679 | 3300.0 | |
master | GRU | CUDA | 512 | 32 | 64 | forw and back | 28153.0 | 90943 | 5550.0 | |
master | GRU | CPU | 512 | 128 | 4 | forward | 593.072998046875 | 49 | 238.020004272461 | |
master | GRU | CPU | 512 | 128 | 4 | backward | 1418.0 | 859 | 2910.0 | |
master | GRU | CPU | 512 | 128 | 4 | forw and back | 2217.0 | 1748 | 2400.0 | |
master | GRU | CUDA | 512 | 128 | 4 | forward | 493.325988769531 | 1457 | 82.6399993896484 | |
master | GRU | CUDA | 512 | 128 | 4 | backward | 1512.0 | 3534 | 214.589996337891 | |
master | GRU | CUDA | 512 | 128 | 4 | forw and back | 2295.0 | 5897 | 365.559997558594 | |
master | GRU | CPU | 512 | 128 | 16 | forward | 2444.0 | 193 | 1000.0 | |
master | GRU | CPU | 512 | 128 | 16 | backward | 5984.0 | 3212 | 11940.0 | |
master | GRU | CPU | 512 | 128 | 16 | forw and back | 8924.0 | 6526 | 9940.0 | |
master | GRU | CUDA | 512 | 128 | 16 | forward | 1503.0 | 5825 | 330.420013427734 | |
master | GRU | CUDA | 512 | 128 | 16 | backward | 5069.0 | 13879 | 848.75 | |
master | GRU | CUDA | 512 | 128 | 16 | forw and back | 7608.0 | 23023 | 1400.0 | |
master | GRU | CPU | 512 | 128 | 32 | forward | 4212.0 | 385 | 2020.0 | |
master | GRU | CPU | 512 | 128 | 32 | backward | 11972.0 | 6348 | 23990.0 | |
master | GRU | CPU | 512 | 128 | 32 | forw and back | 17700.0 | 12894 | 19990.0 | |
master | GRU | CUDA | 512 | 128 | 32 | forward | 2811.0 | 11649 | 660.799987792969 | |
master | GRU | CUDA | 512 | 128 | 32 | backward | 9804.0 | 27671 | 1650.0 | |
master | GRU | CUDA | 512 | 128 | 32 | forw and back | 14645.0 | 45855 | 2790.0 | |
master | GRU | CPU | 512 | 128 | 64 | forward | 10127.0 | 769 | 4070.0 | |
master | GRU | CPU | 512 | 128 | 64 | backward | 23969.0 | 12620 | 48070.0 | |
master | GRU | CPU | 512 | 128 | 64 | forw and back | 35305.0 | 25630 | 40100.0 | |
master | GRU | CUDA | 512 | 128 | 64 | forward | 5483.0 | 23297 | 1290.0 | |
master | GRU | CUDA | 512 | 128 | 64 | backward | 18967.0 | 55255 | 3310.0 | |
master | GRU | CUDA | 512 | 128 | 64 | forw and back | 28818.0 | 91519 | 5560.0 | |
master | GRU | CPU | 512 | 512 | 4 | forward | 1767.0 | 56 | 933.469970703125 | |
master | GRU | CPU | 512 | 512 | 4 | backward | 3335.0 | 887 | 10480.0 | |
master | GRU | CPU | 512 | 512 | 4 | forw and back | 5143.0 | 1795 | 8110.0 | |
master | GRU | CUDA | 512 | 512 | 4 | forward | 533.997009277344 | 1478 | 82.9700012207031 | |
master | GRU | CUDA | 512 | 512 | 4 | backward | 1617.0 | 3626 | 218.339996337891 | |
master | GRU | CUDA | 512 | 512 | 4 | forw and back | 2420.0 | 6050 | 371.029998779297 | |
master | GRU | CPU | 512 | 512 | 16 | forward | 7644.0 | 224 | 3930.0 | |
master | GRU | CPU | 512 | 512 | 16 | backward | 15006.0 | 3336 | 42710.0 | |
master | GRU | CPU | 512 | 512 | 16 | forw and back | 20369.0 | 6729 | 33500.0 | |
master | GRU | CUDA | 512 | 512 | 16 | forward | 1656.0 | 5918 | 331.880004882812 | |
master | GRU | CUDA | 512 | 512 | 16 | backward | 5598.0 | 14259 | 863.940002441406 | |
master | GRU | CUDA | 512 | 512 | 16 | forw and back | 8168.0 | 23608 | 1420.0 | |
master | GRU | CPU | 512 | 512 | 32 | forward | 15379.0 | 448 | 7950.0 | |
master | GRU | CPU | 512 | 512 | 32 | backward | 30088.0 | 6600 | 85680.0 | |
master | GRU | CPU | 512 | 512 | 32 | forw and back | 40615.0 | 13305 | 67360.0 | |
master | GRU | CUDA | 512 | 512 | 32 | forward | 3145.0 | 11838 | 663.75 | |
master | GRU | CUDA | 512 | 512 | 32 | backward | 10914.0 | 28435 | 1680.0 | |
master | GRU | CUDA | 512 | 512 | 32 | forw and back | 15935.0 | 47016 | 2820.0 | |
master | GRU | CPU | 512 | 512 | 64 | forward | 30859.0 | 896 | 15990.0 | |
master | GRU | CPU | 512 | 512 | 64 | backward | 60045.0 | 13128 | 171620.0 | |
master | GRU | CPU | 512 | 512 | 64 | forw and back | 81051.0 | 26457 | 135070.0 | |
master | GRU | CUDA | 512 | 512 | 64 | forward | 6096.0 | 23678 | 1300.0 | |
master | GRU | CUDA | 512 | 512 | 64 | backward | 22248.0 | 56787 | 3370.0 | |
master | GRU | CUDA | 512 | 512 | 64 | forw and back | 31378.0 | 93832 | 5640.0 | |
pr1761_multigate | LSTM | CPU | 32 | 32 | 4 | forward | 42.0589981079102 | 21 | 54.1399993896484 | |
pr1761_multigate | LSTM | CPU | 32 | 32 | 4 | backward | 151.651000976562 | 824 | 176.080001831055 | |
pr1761_multigate | LSTM | CPU | 32 | 32 | 4 | forw and back | 321.837005615234 | 1779 | 347.859985351562 | |
pr1761_multigate | LSTM | CUDA | 32 | 32 | 4 | forward | 248.850006103516 | 557 | 49.3300018310547 | |
pr1761_multigate | LSTM | CUDA | 32 | 32 | 4 | backward | 1016.99993896484 | 3199 | 212.309997558594 | |
pr1761_multigate | LSTM | CUDA | 32 | 32 | 4 | forw and back | 1550.0 | 4733 | 328.299987792969 | |
pr1761_multigate | LSTM | CPU | 32 | 32 | 16 | forward | 176.811996459961 | 81 | 228.229995727539 | |
pr1761_multigate | LSTM | CPU | 32 | 32 | 16 | backward | 568.111999511719 | 3164 | 720.22998046875 | |
pr1761_multigate | LSTM | CPU | 32 | 32 | 16 | forw and back | 1157.0 | 6711 | 1370.0 | |
pr1761_multigate | LSTM | CUDA | 32 | 32 | 16 | forward | 687.651977539062 | 2225 | 197.169998168945 | |
pr1761_multigate | LSTM | CUDA | 32 | 32 | 16 | backward | 3388.0 | 12715 | 847.169982910156 | |
pr1761_multigate | LSTM | CUDA | 32 | 32 | 16 | forw and back | 5056.0 | 18509 | 1260.0 | |
pr1761_multigate | LSTM | CPU | 32 | 32 | 32 | forward | 360.375 | 161 | 460.359985351563 | |
pr1761_multigate | LSTM | CPU | 32 | 32 | 32 | backward | 1118.0 | 6285 | 1410.0 | |
pr1761_multigate | LSTM | CPU | 32 | 32 | 32 | forw and back | 2261.0 | 13289 | 2740.0 | |
pr1761_multigate | LSTM | CUDA | 32 | 32 | 32 | forward | 1288.0 | 4449 | 394.299987792969 | |
pr1761_multigate | LSTM | CUDA | 32 | 32 | 32 | backward | 6496.0 | 25404 | 1650.0 | |
pr1761_multigate | LSTM | CUDA | 32 | 32 | 32 | forw and back | 9757.0 | 36879 | 2510.0 | |
pr1761_multigate | LSTM | CPU | 32 | 32 | 64 | forward | 724.122009277344 | 321 | 924.619995117188 | |
pr1761_multigate | LSTM | CPU | 32 | 32 | 64 | backward | 2240.0 | 12525 | 2830.0 | |
pr1761_multigate | LSTM | CPU | 32 | 32 | 64 | forw and back | 4488.0 | 26441 | 5480.0 | |
pr1761_multigate | LSTM | CUDA | 32 | 32 | 64 | forward | 2456.0 | 8897 | 788.559997558594 | |
pr1761_multigate | LSTM | CUDA | 32 | 32 | 64 | backward | 12751.0 | 50780 | 3310.0 | |
pr1761_multigate | LSTM | CUDA | 32 | 32 | 64 | forw and back | 19279.0 | 73615 | 5000.0 | |
pr1761_multigate | LSTM | CPU | 32 | 128 | 4 | forward | 154.302993774414 | 21 | 210.639999389648 | |
pr1761_multigate | LSTM | CPU | 32 | 128 | 4 | backward | 247.195999145508 | 824 | 473.200012207031 | |
pr1761_multigate | LSTM | CPU | 32 | 128 | 4 | forw and back | 543.414978027344 | 1779 | 887.22998046875 | |
pr1761_multigate | LSTM | CUDA | 32 | 128 | 4 | forward | 254.785003662109 | 561 | 49.3899993896484 | |
pr1761_multigate | LSTM | CUDA | 32 | 128 | 4 | backward | 1052.0 | 3243 | 213.0 | |
pr1761_multigate | LSTM | CUDA | 32 | 128 | 4 | forw and back | 1597.0 | 4777 | 328.980010986328 | |
pr1761_multigate | LSTM | CPU | 32 | 128 | 16 | forward | 635.35302734375 | 81 | 890.22998046875 | |
pr1761_multigate | LSTM | CPU | 32 | 128 | 16 | backward | 957.734008789063 | 3164 | 1880.0 | |
pr1761_multigate | LSTM | CPU | 32 | 128 | 16 | forw and back | 2027.0 | 6711 | 3530.0 | |
pr1761_multigate | LSTM | CUDA | 32 | 128 | 16 | forward | 714.284973144531 | 2241 | 197.419998168945 | |
pr1761_multigate | LSTM | CUDA | 32 | 128 | 16 | backward | 3451.0 | 12891 | 849.919982910156 | |
pr1761_multigate | LSTM | CUDA | 32 | 128 | 16 | forw and back | 5187.0 | 18685 | 1260.0 | |
pr1761_multigate | LSTM | CPU | 32 | 128 | 32 | forward | 1288.0 | 161 | 1750.0 | |
pr1761_multigate | LSTM | CPU | 32 | 128 | 32 | backward | 1938.0 | 6285 | 3770.0 | |
pr1761_multigate | LSTM | CPU | 32 | 128 | 32 | forw and back | 4065.0 | 13289 | 7070.0 | |
pr1761_multigate | LSTM | CUDA | 32 | 128 | 32 | forward | 1306.0 | 4481 | 394.799987792969 | |
pr1761_multigate | LSTM | CUDA | 32 | 128 | 32 | backward | 6603.0 | 25756 | 1660.0 | |
pr1761_multigate | LSTM | CUDA | 32 | 128 | 32 | forw and back | 9957.0 | 37231 | 2510.0 | |
pr1761_multigate | LSTM | CPU | 32 | 128 | 64 | forward | 2591.0 | 321 | 3520.0 | |
pr1761_multigate | LSTM | CPU | 32 | 128 | 64 | backward | 4008.0 | 12525 | 7560.0 | |
pr1761_multigate | LSTM | CPU | 32 | 128 | 64 | forw and back | 8139.0 | 26441 | 14170.0 | |
pr1761_multigate | LSTM | CUDA | 32 | 128 | 64 | forward | 2502.0 | 8961 | 789.559997558594 | |
pr1761_multigate | LSTM | CUDA | 32 | 128 | 64 | backward | 12875.0 | 51484 | 3320.0 | |
pr1761_multigate | LSTM | CUDA | 32 | 128 | 64 | forw and back | 19406.0 | 74319 | 5020.0 | |
pr1761_multigate | LSTM | CPU | 32 | 512 | 4 | forward | 1443.0 | 32 | 833.780029296875 | |
pr1761_multigate | LSTM | CPU | 32 | 512 | 4 | backward | 1446.0 | 836 | 1610.0 | |
pr1761_multigate | LSTM | CPU | 32 | 512 | 4 | forw and back | 3283.0 | 1814 | 2950.0 | |
pr1761_multigate | LSTM | CUDA | 32 | 512 | 4 | forward | 258.283996582031 | 582 | 49.7200012207031 | |
pr1761_multigate | LSTM | CUDA | 32 | 512 | 4 | backward | 1099.0 | 3271 | 213.440002441406 | |
pr1761_multigate | LSTM | CUDA | 32 | 512 | 4 | forw and back | 1651.0 | 4858 | 330.890014648437 | |
pr1761_multigate | LSTM | CPU | 32 | 512 | 16 | forward | 6000.0 | 128 | 3440.0 | |
pr1761_multigate | LSTM | CPU | 32 | 512 | 16 | backward | 6059.0 | 3212 | 6540.0 | |
pr1761_multigate | LSTM | CPU | 32 | 512 | 16 | forw and back | 13348.0 | 6854 | 12080.0 | |
pr1761_multigate | LSTM | CUDA | 32 | 512 | 16 | forward | 718.882019042969 | 2334 | 198.880004882812 | |
pr1761_multigate | LSTM | CUDA | 32 | 512 | 16 | backward | 3616.0 | 13015 | 851.859985351563 | |
pr1761_multigate | LSTM | CUDA | 32 | 512 | 16 | forw and back | 5308.0 | 18982 | 1270.0 | |
pr1761_multigate | LSTM | CPU | 32 | 512 | 32 | forward | 12162.0 | 256 | 6950.0 | |
pr1761_multigate | LSTM | CPU | 32 | 512 | 32 | backward | 12209.0 | 6381 | 13120.0 | |
pr1761_multigate | LSTM | CPU | 32 | 512 | 32 | forw and back | 26654.0 | 13576 | 24240.0 | |
pr1761_multigate | LSTM | CUDA | 32 | 512 | 32 | forward | 1328.0 | 4670 | 397.75 | |
pr1761_multigate | LSTM | CUDA | 32 | 512 | 32 | backward | 6839.0 | 26008 | 1660.0 | |
pr1761_multigate | LSTM | CUDA | 32 | 512 | 32 | forw and back | 10249.0 | 37816 | 2520.0 | |
pr1761_multigate | LSTM | CPU | 32 | 512 | 64 | forward | 24408.0 | 512 | 13960.0 | |
pr1761_multigate | LSTM | CPU | 32 | 512 | 64 | backward | 24593.0 | 12717 | 26270.0 | |
pr1761_multigate | LSTM | CPU | 32 | 512 | 64 | forw and back | 53268.0 | 27016 | 48570.0 | |
pr1761_multigate | LSTM | CUDA | 32 | 512 | 64 | forward | 2553.0 | 9342 | 795.52001953125 | |
pr1761_multigate | LSTM | CUDA | 32 | 512 | 64 | backward | 13356.0 | 51992 | 3330.0 | |
pr1761_multigate | LSTM | CUDA | 32 | 512 | 64 | forw and back | 20183.0 | 75480 | 5030.0 | |
pr1761_multigate | LSTM | CPU | 128 | 32 | 4 | forward | 55.5870018005371 | 21 | 54.1399993896484 | |
pr1761_multigate | LSTM | CPU | 128 | 32 | 4 | backward | 191.156005859375 | 824 | 356.079986572266 | |
pr1761_multigate | LSTM | CPU | 128 | 32 | 4 | forw and back | 383.257995605469 | 1779 | 479.859985351562 | |
pr1761_multigate | LSTM | CUDA | 128 | 32 | 4 | forward | 267.013000488281 | 557 | 49.3300018310547 | |
pr1761_multigate | LSTM | CUDA | 128 | 32 | 4 | backward | 1084.0 | 3199 | 212.309997558594 | |
pr1761_multigate | LSTM | CUDA | 128 | 32 | 4 | forw and back | 1636.0 | 4733 | 328.299987792969 | |
pr1761_multigate | LSTM | CPU | 128 | 32 | 16 | forward | 234.233001708984 | 81 | 228.229995727539 | |
pr1761_multigate | LSTM | CPU | 128 | 32 | 16 | backward | 715.557983398438 | 3164 | 1440.0 | |
pr1761_multigate | LSTM | CPU | 128 | 32 | 16 | forw and back | 1358.0 | 6711 | 1920.0 | |
pr1761_multigate | LSTM | CUDA | 128 | 32 | 16 | forward | 720.684020996094 | 2225 | 197.169998168945 | |
pr1761_multigate | LSTM | CUDA | 128 | 32 | 16 | backward | 3442.0 | 12715 | 847.169982910156 | |
pr1761_multigate | LSTM | CUDA | 128 | 32 | 16 | forw and back | 5143.0 | 18509 | 1260.0 | |
pr1761_multigate | LSTM | CPU | 128 | 32 | 32 | forward | 474.350006103516 | 161 | 460.359985351563 | |
pr1761_multigate | LSTM | CPU | 128 | 32 | 32 | backward | 1439.0 | 6285 | 2900.0 | |
pr1761_multigate | LSTM | CPU | 128 | 32 | 32 | forw and back | 2681.0 | 13289 | 3850.0 | |
pr1761_multigate | LSTM | CUDA | 128 | 32 | 32 | forward | 1330.0 | 4449 | 394.299987792969 | |
pr1761_multigate | LSTM | CUDA | 128 | 32 | 32 | backward | 6586.0 | 25404 | 1650.0 | |
pr1761_multigate | LSTM | CUDA | 128 | 32 | 32 | forw and back | 9891.0 | 36879 | 2510.0 | |
pr1761_multigate | LSTM | CPU | 128 | 32 | 64 | forward | 952.992980957031 | 321 | 924.619995117188 | |
pr1761_multigate | LSTM | CPU | 128 | 32 | 64 | backward | 3052.0 | 12525 | 5820.0 | |
pr1761_multigate | LSTM | CPU | 128 | 32 | 64 | forw and back | 5437.0 | 26441 | 7710.0 | |
pr1761_multigate | LSTM | CUDA | 128 | 32 | 64 | forward | 2529.0 | 8897 | 788.559997558594 | |
pr1761_multigate | LSTM | CUDA | 128 | 32 | 64 | backward | 12845.0 | 50780 | 3310.0 | |
pr1761_multigate | LSTM | CUDA | 128 | 32 | 64 | forw and back | 19381.0 | 73615 | 5000.0 | |
pr1761_multigate | LSTM | CPU | 128 | 128 | 4 | forward | 507.778015136719 | 21 | 210.639999389648 | |
pr1761_multigate | LSTM | CPU | 128 | 128 | 4 | backward | 963.617004394531 | 832 | 940.580017089844 | |
pr1761_multigate | LSTM | CPU | 128 | 128 | 4 | forw and back | 1800.0 | 1783 | 1140.0 | |
pr1761_multigate | LSTM | CUDA | 128 | 128 | 4 | forward | 262.748992919922 | 561 | 49.3899993896484 | |
pr1761_multigate | LSTM | CUDA | 128 | 128 | 4 | backward | 1070.0 | 3243 | 213.0 | |
pr1761_multigate | LSTM | CUDA | 128 | 128 | 4 | forw and back | 1622.0 | 4777 | 328.980010986328 | |
pr1761_multigate | LSTM | CPU | 128 | 128 | 16 | forward | 2136.0 | 81 | 890.22998046875 | |
pr1761_multigate | LSTM | CPU | 128 | 128 | 16 | backward | 3893.0 | 3196 | 3740.0 | |
pr1761_multigate | LSTM | CPU | 128 | 128 | 16 | forw and back | 7014.0 | 6727 | 4640.0 | |
pr1761_multigate | LSTM | CUDA | 128 | 128 | 16 | forward | 737.192016601562 | 2241 | 197.419998168945 | |
pr1761_multigate | LSTM | CUDA | 128 | 128 | 16 | backward | 3495.0 | 12891 | 849.919982910156 | |
pr1761_multigate | LSTM | CUDA | 128 | 128 | 16 | forw and back | 5242.0 | 18685 | 1260.0 | |
pr1761_multigate | LSTM | CPU | 128 | 128 | 32 | forward | 4290.0 | 161 | 1750.0 | |
pr1761_multigate | LSTM | CPU | 128 | 128 | 32 | backward | 7861.0 | 6349 | 7510.0 | |
pr1761_multigate | LSTM | CPU | 128 | 128 | 32 | forw and back | 14093.0 | 13321 | 9310.0 | |
pr1761_multigate | LSTM | CUDA | 128 | 128 | 32 | forward | 1329.0 | 4481 | 394.799987792969 | |
pr1761_multigate | LSTM | CUDA | 128 | 128 | 32 | backward | 6652.0 | 25756 | 1660.0 | |
pr1761_multigate | LSTM | CUDA | 128 | 128 | 32 | forw and back | 10033.0 | 37231 | 2510.0 | |
pr1761_multigate | LSTM | CPU | 128 | 128 | 64 | forward | 8724.0 | 321 | 3520.0 | |
pr1761_multigate | LSTM | CPU | 128 | 128 | 64 | backward | 15784.0 | 12653 | 15040.0 | |
pr1761_multigate | LSTM | CPU | 128 | 128 | 64 | forw and back | 28257.0 | 26505 | 18650.0 | |
pr1761_multigate | LSTM | CUDA | 128 | 128 | 64 | forward | 2550.0 | 8961 | 789.559997558594 | |
pr1761_multigate | LSTM | CUDA | 128 | 128 | 64 | backward | 12912.0 | 51484 | 3320.0 | |
pr1761_multigate | LSTM | CUDA | 128 | 128 | 64 | forw and back | 19573.0 | 74319 | 5020.0 | |
pr1761_multigate | LSTM | CPU | 128 | 512 | 4 | forward | 1513.0 | 32 | 833.780029296875 | |
pr1761_multigate | LSTM | CPU | 128 | 512 | 4 | backward | 1726.0 | 836 | 3190.0 | |
pr1761_multigate | LSTM | CPU | 128 | 512 | 4 | forw and back | 3499.0 | 1814 | 3780.0 | |
pr1761_multigate | LSTM | CUDA | 128 | 512 | 4 | forward | 285.451995849609 | 582 | 49.7200012207031 | |
pr1761_multigate | LSTM | CUDA | 128 | 512 | 4 | backward | 1142.0 | 3271 | 213.440002441406 | |
pr1761_multigate | LSTM | CUDA | 128 | 512 | 4 | forw and back | 1721.0 | 4858 | 330.890014648437 | |
pr1761_multigate | LSTM | CPU | 128 | 512 | 16 | forward | 6449.0 | 128 | 3440.0 | |
pr1761_multigate | LSTM | CPU | 128 | 512 | 16 | backward | 7435.0 | 3212 | 12910.0 | |
pr1761_multigate | LSTM | CPU | 128 | 512 | 16 | forw and back | 14501.0 | 6854 | 15440.0 | |
pr1761_multigate | LSTM | CUDA | 128 | 512 | 16 | forward | 788.846008300781 | 2334 | 198.880004882812 | |
pr1761_multigate | LSTM | CUDA | 128 | 512 | 16 | backward | 3723.0 | 13015 | 851.859985351563 | |
pr1761_multigate | LSTM | CUDA | 128 | 512 | 16 | forw and back | 5460.0 | 18982 | 1270.0 | |
pr1761_multigate | LSTM | CPU | 128 | 512 | 32 | forward | 13083.0 | 256 | 6950.0 | |
pr1761_multigate | LSTM | CPU | 128 | 512 | 32 | backward | 14822.0 | 6381 | 25860.0 | |
pr1761_multigate | LSTM | CPU | 128 | 512 | 32 | forw and back | 28938.0 | 13576 | 30980.0 | |
pr1761_multigate | LSTM | CUDA | 128 | 512 | 32 | forward | 1454.0 | 4670 | 397.75 | |
pr1761_multigate | LSTM | CUDA | 128 | 512 | 32 | backward | 7158.0 | 26008 | 1660.0 | |
pr1761_multigate | LSTM | CUDA | 128 | 512 | 32 | forw and back | 10586.0 | 37816 | 2520.0 | |
pr1761_multigate | LSTM | CPU | 128 | 512 | 64 | forward | 26329.0 | 512 | 13960.0 | |
pr1761_multigate | LSTM | CPU | 128 | 512 | 64 | backward | 29689.0 | 12717 | 51760.0 | |
pr1761_multigate | LSTM | CPU | 128 | 512 | 64 | forw and back | 57559.0 | 27016 | 62060.0 | |
pr1761_multigate | LSTM | CUDA | 128 | 512 | 64 | forward | 2789.0 | 9342 | 795.52001953125 | |
pr1761_multigate | LSTM | CUDA | 128 | 512 | 64 | backward | 14019.0 | 51992 | 3330.0 | |
pr1761_multigate | LSTM | CUDA | 128 | 512 | 64 | forw and back | 20952.0 | 75480 | 5030.0 | |
pr1761_multigate | LSTM | CPU | 512 | 32 | 4 | forward | 281.427001953125 | 21 | 54.1399993896484 | |
pr1761_multigate | LSTM | CPU | 512 | 32 | 4 | backward | 886.869995117188 | 839 | 1050.0 | |
pr1761_multigate | LSTM | CPU | 512 | 32 | 4 | forw and back | 1483.0 | 1790 | 1007.0 | |
pr1761_multigate | LSTM | CUDA | 512 | 32 | 4 | forward | 274.860992431641 | 577 | 49.6399993896484 | |
pr1761_multigate | LSTM | CUDA | 512 | 32 | 4 | backward | 1097.0 | 3231 | 212.809997558594 | |
pr1761_multigate | LSTM | CUDA | 512 | 32 | 4 | forw and back | 1673.0 | 4785 | 329.109985351562 | |
pr1761_multigate | LSTM | CPU | 512 | 32 | 16 | forward | 1197.0 | 81 | 228.229995727539 | |
pr1761_multigate | LSTM | CPU | 512 | 32 | 16 | backward | 3595.0 | 3227 | 4390.0 | |
pr1761_multigate | LSTM | CPU | 512 | 32 | 16 | forw and back | 5754.0 | 6758 | 4120.0 | |
pr1761_multigate | LSTM | CUDA | 512 | 32 | 16 | forward | 741.932983398437 | 2305 | 198.419998168945 | |
pr1761_multigate | LSTM | CUDA | 512 | 32 | 16 | backward | 3532.0 | 12843 | 849.169982910156 | |
pr1761_multigate | LSTM | CUDA | 512 | 32 | 16 | forw and back | 5294.0 | 18717 | 1260.0 | |
pr1761_multigate | LSTM | CPU | 512 | 32 | 32 | forward | 2410.0 | 161 | 460.359985351563 | |
pr1761_multigate | LSTM | CPU | 512 | 32 | 32 | backward | 7267.0 | 6412 | 8840.0 | |
pr1761_multigate | LSTM | CPU | 512 | 32 | 32 | forw and back | 11516.0 | 13384 | 8300.0 | |
pr1761_multigate | LSTM | CUDA | 512 | 32 | 32 | forward | 1359.0 | 4609 | 396.799987792969 | |
pr1761_multigate | LSTM | CUDA | 512 | 32 | 32 | backward | 6715.0 | 25660 | 1660.0 | |
pr1761_multigate | LSTM | CUDA | 512 | 32 | 32 | forw and back | 10137.0 | 37295 | 2510.0 | |
pr1761_multigate | LSTM | CPU | 512 | 32 | 64 | forward | 3643.0 | 321 | 924.619995117188 | |
pr1761_multigate | LSTM | CPU | 512 | 32 | 64 | backward | 14499.0 | 12780 | 17750.0 | |
pr1761_multigate | LSTM | CPU | 512 | 32 | 64 | forw and back | 22928.0 | 26632 | 16650.0 | |
pr1761_multigate | LSTM | CUDA | 512 | 32 | 64 | forward | 2553.0 | 9217 | 793.559997558594 | |
pr1761_multigate | LSTM | CUDA | 512 | 32 | 64 | backward | 12995.0 | 51292 | 3320.0 | |
pr1761_multigate | LSTM | CUDA | 512 | 32 | 64 | forw and back | 19851.0 | 74447 | 5020.0 | |
pr1761_multigate | LSTM | CPU | 512 | 128 | 4 | forward | 575.971984863281 | 21 | 210.639999389648 | |
pr1761_multigate | LSTM | CPU | 512 | 128 | 4 | backward | 1308.0 | 839 | 2750.0 | |
pr1761_multigate | LSTM | CPU | 512 | 128 | 4 | forw and back | 2122.0 | 1790 | 2210.0 | |
pr1761_multigate | LSTM | CUDA | 512 | 128 | 4 | forward | 311.31201171875 | 581 | 49.7000007629395 | |
pr1761_multigate | LSTM | CUDA | 512 | 128 | 4 | backward | 1118.0 | 3275 | 213.5 | |
pr1761_multigate | LSTM | CUDA | 512 | 128 | 4 | forw and back | 1723.0 | 4829 | 329.799987792969 | |
pr1761_multigate | LSTM | CPU | 512 | 128 | 16 | forward | 2427.0 | 81 | 890.22998046875 | |
pr1761_multigate | LSTM | CPU | 512 | 128 | 16 | backward | 5588.0 | 3227 | 11190.0 | |
pr1761_multigate | LSTM | CPU | 512 | 128 | 16 | forw and back | 8536.0 | 6758 | 9090.0 | |
pr1761_multigate | LSTM | CUDA | 512 | 128 | 16 | forward | 877.736022949219 | 2321 | 198.669998168945 | |
pr1761_multigate | LSTM | CUDA | 512 | 128 | 16 | backward | 3624.0 | 13019 | 851.919982910156 | |
pr1761_multigate | LSTM | CUDA | 512 | 128 | 16 | forw and back | 5539.0 | 18893 | 1260.0 | |
pr1761_multigate | LSTM | CPU | 512 | 128 | 32 | forward | 4971.0 | 161 | 1750.0 | |
pr1761_multigate | LSTM | CPU | 512 | 128 | 32 | backward | 11162.0 | 6412 | 22460.0 | |
pr1761_multigate | LSTM | CPU | 512 | 128 | 32 | forw and back | 17002.0 | 13384 | 18260.0 | |
pr1761_multigate | LSTM | CUDA | 512 | 128 | 32 | forward | 1584.0 | 4641 | 397.299987792969 | |
pr1761_multigate | LSTM | CUDA | 512 | 128 | 32 | backward | 6910.0 | 26012 | 1660.0 | |
pr1761_multigate | LSTM | CUDA | 512 | 128 | 32 | forw and back | 10632.0 | 37647 | 2520.0 | |
pr1761_multigate | LSTM | CPU | 512 | 128 | 64 | forward | 10123.0 | 321 | 3520.0 | |
pr1761_multigate | LSTM | CPU | 512 | 128 | 64 | backward | 22419.0 | 12780 | 44980.0 | |
pr1761_multigate | LSTM | CPU | 512 | 128 | 64 | forw and back | 33834.0 | 26632 | 36600.0 | |
pr1761_multigate | LSTM | CUDA | 512 | 128 | 64 | forward | 3038.0 | 9281 | 794.559997558594 | |
pr1761_multigate | LSTM | CUDA | 512 | 128 | 64 | backward | 13390.0 | 51996 | 3330.0 | |
pr1761_multigate | LSTM | CUDA | 512 | 128 | 64 | forw and back | 20672.0 | 75151 | 5030.0 | |
pr1761_multigate | LSTM | CPU | 512 | 512 | 4 | forward | 1791.0 | 32 | 833.780029296875 | |
pr1761_multigate | LSTM | CPU | 512 | 512 | 4 | backward | 3316.0 | 843 | 9520.0 | |
pr1761_multigate | LSTM | CPU | 512 | 512 | 4 | forw and back | 3526.0 | 1821 | 7110.0 | |
pr1761_multigate | LSTM | CUDA | 512 | 512 | 4 | forward | 312.209014892578 | 602 | 50.0299987792969 | |
pr1761_multigate | LSTM | CUDA | 512 | 512 | 4 | backward | 1201.0 | 3303 | 213.940002441406 | |
pr1761_multigate | LSTM | CUDA | 512 | 512 | 4 | forw and back | 1778.0 | 4910 | 331.700012207031 | |
pr1761_multigate | LSTM | CPU | 512 | 512 | 16 | forward | 7796.0 | 128 | 3440.0 | |
pr1761_multigate | LSTM | CPU | 512 | 512 | 16 | backward | 13284.0 | 3243 | 38360.0 | |
pr1761_multigate | LSTM | CPU | 512 | 512 | 16 | forw and back | 19260.0 | 6885 | 28890.0 | |
pr1761_multigate | LSTM | CUDA | 512 | 512 | 16 | forward | 869.986999511719 | 2414 | 200.119995117187 | |
pr1761_multigate | LSTM | CUDA | 512 | 512 | 16 | backward | 3947.0 | 13143 | 853.859985351563 | |
pr1761_multigate | LSTM | CUDA | 512 | 512 | 16 | forw and back | 5743.0 | 19190 | 1270.0 | |
pr1761_multigate | LSTM | CPU | 512 | 512 | 32 | forward | 15670.0 | 256 | 6950.0 | |
pr1761_multigate | LSTM | CPU | 512 | 512 | 32 | backward | 26938.0 | 6444 | 76810.0 | |
pr1761_multigate | LSTM | CPU | 512 | 512 | 32 | forw and back | 38295.0 | 13639 | 57930.0 | |
pr1761_multigate | LSTM | CUDA | 512 | 512 | 32 | forward | 1604.0 | 4830 | 400.25 | |
pr1761_multigate | LSTM | CUDA | 512 | 512 | 32 | backward | 7653.0 | 26264 | 1670.0 | |
pr1761_multigate | LSTM | CUDA | 512 | 512 | 32 | forw and back | 11028.0 | 38232 | 2530.0 | |
pr1761_multigate | LSTM | CPU | 512 | 512 | 64 | forward | 31388.0 | 512 | 13960.0 | |
pr1761_multigate | LSTM | CPU | 512 | 512 | 64 | backward | 54138.0 | 12844 | 153700.0 | |
pr1761_multigate | LSTM | CPU | 512 | 512 | 64 | forw and back | 76616.0 | 27143 | 116000.0 | |
pr1761_multigate | LSTM | CUDA | 512 | 512 | 64 | forward | 3067.0 | 9662 | 800.52001953125 | |
pr1761_multigate | LSTM | CUDA | 512 | 512 | 64 | backward | 15159.0 | 52504 | 3330.0 | |
pr1761_multigate | LSTM | CUDA | 512 | 512 | 64 | forw and back | 21667.0 | 76312 | 5050.0 | |
pr1761_multigate | GRU | CPU | 32 | 32 | 4 | forward | 33.560001373291 | 25 | 39.1100006103516 | |
pr1761_multigate | GRU | CPU | 32 | 32 | 4 | backward | 167.455001831055 | 1117 | 194.770004272461 | |
pr1761_multigate | GRU | CPU | 32 | 32 | 4 | forw and back | 346.714996337891 | 1894 | 369.230010986328 | |
pr1761_multigate | GRU | CUDA | 32 | 32 | 4 | forward | 278.77099609375 | 597 | 46.7700004577637 | |
pr1761_multigate | GRU | CUDA | 32 | 32 | 4 | backward | 1348.0 | 3455 | 214.830001831055 | |
pr1761_multigate | GRU | CUDA | 32 | 32 | 4 | forw and back | 1918.0 | 4782 | 346.049987792969 | |
pr1761_multigate | GRU | CPU | 32 | 32 | 16 | forward | 149.580001831055 | 97 | 165.199996948242 | |
pr1761_multigate | GRU | CPU | 32 | 32 | 16 | backward | 659.940979003906 | 4238 | 796.590026855469 | |
pr1761_multigate | GRU | CPU | 32 | 32 | 16 | forw and back | 1254.0 | 7104 | 1450.0 | |
pr1761_multigate | GRU | CUDA | 32 | 32 | 16 | forward | 762.4990234375 | 2385 | 186.919998168945 | |
pr1761_multigate | GRU | CUDA | 32 | 32 | 16 | backward | 4445.0 | 13560 | 849.549987792969 | |
pr1761_multigate | GRU | CUDA | 32 | 32 | 16 | forw and back | 6255.0 | 18560 | 1320.0 | |
pr1761_multigate | GRU | CPU | 32 | 32 | 32 | forward | 306.828002929687 | 193 | 333.329986572266 | |
pr1761_multigate | GRU | CPU | 32 | 32 | 32 | backward | 1292.0 | 8398 | 1560.0 | |
pr1761_multigate | GRU | CPU | 32 | 32 | 32 | forw and back | 2424.0 | 14048 | 2900.0 | |
pr1761_multigate | GRU | CUDA | 32 | 32 | 32 | forward | 1399.0 | 4769 | 373.799987792969 | |
pr1761_multigate | GRU | CUDA | 32 | 32 | 32 | backward | 8518.0 | 27032 | 1660.0 | |
pr1761_multigate | GRU | CUDA | 32 | 32 | 32 | forw and back | 11977.0 | 36928 | 2630.0 | |
pr1761_multigate | GRU | CPU | 32 | 32 | 64 | forward | 619.963989257812 | 385 | 669.590026855469 | |
pr1761_multigate | GRU | CPU | 32 | 32 | 64 | backward | 2591.0 | 16718 | 3130.0 | |
pr1761_multigate | GRU | CPU | 32 | 32 | 64 | forw and back | 4806.0 | 27936 | 5810.0 | |
pr1761_multigate | GRU | CUDA | 32 | 32 | 64 | forward | 2662.0 | 9537 | 747.559997558594 | |
pr1761_multigate | GRU | CUDA | 32 | 32 | 64 | backward | 16727.0 | 53976 | 3310.0 | |
pr1761_multigate | GRU | CUDA | 32 | 32 | 64 | forw and back | 23488.0 | 73664 | 5250.0 | |
pr1761_multigate | GRU | CPU | 32 | 128 | 4 | forward | 130.85400390625 | 25 | 151.110000610352 | |
pr1761_multigate | GRU | CPU | 32 | 128 | 4 | backward | 274.640014648437 | 1117 | 506.890014648437 | |
pr1761_multigate | GRU | CPU | 32 | 128 | 4 | forw and back | 553.372985839844 | 1894 | 831.109985351563 | |
pr1761_multigate | GRU | CUDA | 32 | 128 | 4 | forward | 271.454986572266 | 609 | 46.9500007629395 | |
pr1761_multigate | GRU | CUDA | 32 | 128 | 4 | backward | 1353.0 | 3491 | 215.389999389648 | |
pr1761_multigate | GRU | CUDA | 32 | 128 | 4 | forw and back | 1936.0 | 4818 | 346.609985351562 | |
pr1761_multigate | GRU | CPU | 32 | 128 | 16 | forward | 546.9990234375 | 97 | 640.200012207031 | |
pr1761_multigate | GRU | CPU | 32 | 128 | 16 | backward | 1046.0 | 4238 | 2040.0 | |
pr1761_multigate | GRU | CPU | 32 | 128 | 16 | forw and back | 2017.0 | 7104 | 3330.0 | |
pr1761_multigate | GRU | CUDA | 32 | 128 | 16 | forward | 767.724975585938 | 2433 | 187.669998168945 | |
pr1761_multigate | GRU | CUDA | 32 | 128 | 16 | backward | 4547.0 | 13704 | 851.799987792969 | |
pr1761_multigate | GRU | CUDA | 32 | 128 | 16 | forw and back | 6358.0 | 18704 | 1320.0 | |
pr1761_multigate | GRU | CPU | 32 | 128 | 32 | forward | 1106.0 | 193 | 1260.0 | |
pr1761_multigate | GRU | CPU | 32 | 128 | 32 | backward | 2110.0 | 8398 | 4100.0 | |
pr1761_multigate | GRU | CPU | 32 | 128 | 32 | forw and back | 3990.0 | 14048 | 6680.0 | |
pr1761_multigate | GRU | CUDA | 32 | 128 | 32 | forward | 1409.0 | 4865 | 375.299987792969 | |
pr1761_multigate | GRU | CUDA | 32 | 128 | 32 | backward | 8790.0 | 27320 | 1660.0 | |
pr1761_multigate | GRU | CUDA | 32 | 128 | 32 | forw and back | 12198.0 | 37216 | 2640.0 | |
pr1761_multigate | GRU | CPU | 32 | 128 | 64 | forward | 2227.0 | 385 | 2540.0 | |
pr1761_multigate | GRU | CPU | 32 | 128 | 64 | backward | 4352.0 | 16718 | 8230.0 | |
pr1761_multigate | GRU | CPU | 32 | 128 | 64 | forw and back | 7996.0 | 27936 | 13380.0 | |
pr1761_multigate | GRU | CUDA | 32 | 128 | 64 | forward | 2695.0 | 9729 | 750.559997558594 | |
pr1761_multigate | GRU | CUDA | 32 | 128 | 64 | backward | 17108.0 | 54552 | 3320.0 | |
pr1761_multigate | GRU | CUDA | 32 | 128 | 64 | forw and back | 23983.0 | 74240 | 5260.0 | |
pr1761_multigate | GRU | CPU | 32 | 512 | 4 | forward | 1264.0 | 32 | 594.559997558594 | |
pr1761_multigate | GRU | CPU | 32 | 512 | 4 | backward | 1575.0 | 1132 | 1700.0 | |
pr1761_multigate | GRU | CPU | 32 | 512 | 4 | forw and back | 2672.0 | 1924 | 2590.0 | |
pr1761_multigate | GRU | CUDA | 32 | 512 | 4 | forward | 269.457000732422 | 630 | 47.2799987792969 | |
pr1761_multigate | GRU | CUDA | 32 | 512 | 4 | backward | 1416.0 | 3583 | 219.139999389648 | |
pr1761_multigate | GRU | CUDA | 32 | 512 | 4 | forw and back | 1990.0 | 4971 | 352.079986572266 | |
pr1761_multigate | GRU | CPU | 32 | 512 | 16 | forward | 5298.0 | 128 | 2460.0 | |
pr1761_multigate | GRU | CPU | 32 | 512 | 16 | backward | 6639.0 | 4301 | 7040.0 | |
pr1761_multigate | GRU | CPU | 32 | 512 | 16 | forw and back | 12694.0 | 7230 | 10730.0 | |
pr1761_multigate | GRU | CUDA | 32 | 512 | 16 | forward | 764.281982421875 | 2526 | 189.119995117188 | |
pr1761_multigate | GRU | CUDA | 32 | 512 | 16 | backward | 4834.0 | 14084 | 866.97998046875 | |
pr1761_multigate | GRU | CUDA | 32 | 512 | 16 | forw and back | 6935.0 | 19289 | 1340.0 | |
pr1761_multigate | GRU | CPU | 32 | 512 | 32 | forward | 10722.0 | 256 | 4970.0 | |
pr1761_multigate | GRU | CPU | 32 | 512 | 32 | backward | 13356.0 | 8525 | 14160.0 | |
pr1761_multigate | GRU | CPU | 32 | 512 | 32 | forw and back | 22247.0 | 14302 | 21570.0 | |
pr1761_multigate | GRU | CUDA | 32 | 512 | 32 | forward | 1424.0 | 5054 | 378.25 | |
pr1761_multigate | GRU | CUDA | 32 | 512 | 32 | backward | 9320.0 | 28084 | 1690.0 | |
pr1761_multigate | GRU | CUDA | 32 | 512 | 32 | forw and back | 12783.0 | 38377 | 2670.0 | |
pr1761_multigate | GRU | CPU | 32 | 512 | 64 | forward | 21675.0 | 512 | 9990.0 | |
pr1761_multigate | GRU | CPU | 32 | 512 | 64 | backward | 26814.0 | 16973 | 28400.0 | |
pr1761_multigate | GRU | CPU | 32 | 512 | 64 | forw and back | 50952.0 | 28446 | 43270.0 | |
pr1761_multigate | GRU | CUDA | 32 | 512 | 64 | forward | 2725.0 | 10110 | 756.52001953125 | |
pr1761_multigate | GRU | CUDA | 32 | 512 | 64 | backward | 18307.0 | 56084 | 3380.0 | |
pr1761_multigate | GRU | CUDA | 32 | 512 | 64 | forw and back | 25988.0 | 76553 | 5340.0 | |
pr1761_multigate | GRU | CPU | 128 | 32 | 4 | forward | 46.640998840332 | 25 | 39.1100006103516 | |
pr1761_multigate | GRU | CPU | 128 | 32 | 4 | backward | 214.24299621582 | 1117 | 353.769989013672 | |
pr1761_multigate | GRU | CPU | 128 | 32 | 4 | forw and back | 411.119995117187 | 1894 | 480.230010986328 | |
pr1761_multigate | GRU | CUDA | 128 | 32 | 4 | forward | 281.294006347656 | 597 | 46.7700004577637 | |
pr1761_multigate | GRU | CUDA | 128 | 32 | 4 | backward | 1362.0 | 3455 | 214.830001831055 | |
pr1761_multigate | GRU | CUDA | 128 | 32 | 4 | forw and back | 1935.0 | 4782 | 346.049987792969 | |
pr1761_multigate | GRU | CPU | 128 | 32 | 16 | forward | 196.511993408203 | 97 | 165.199996948242 | |
pr1761_multigate | GRU | CPU | 128 | 32 | 16 | backward | 787.031005859375 | 4238 | 1430.0 | |
pr1761_multigate | GRU | CPU | 128 | 32 | 16 | forw and back | 1428.0 | 7104 | 1910.0 | |
pr1761_multigate | GRU | CUDA | 128 | 32 | 16 | forward | 787.830017089844 | 2385 | 186.919998168945 | |
pr1761_multigate | GRU | CUDA | 128 | 32 | 16 | backward | 4583.0 | 13560 | 849.549987792969 | |
pr1761_multigate | GRU | CUDA | 128 | 32 | 16 | forw and back | 6438.0 | 18560 | 1320.0 | |
pr1761_multigate | GRU | CPU | 128 | 32 | 32 | forward | 398.618011474609 | 193 | 333.329986572266 | |
pr1761_multigate | GRU | CPU | 128 | 32 | 32 | backward | 1575.0 | 8398 | 2870.0 | |
pr1761_multigate | GRU | CPU | 128 | 32 | 32 | forw and back | 2794.0 | 14048 | 3830.0 | |
pr1761_multigate | GRU | CUDA | 128 | 32 | 32 | forward | 1449.0 | 4769 | 373.799987792969 | |
pr1761_multigate | GRU | CUDA | 128 | 32 | 32 | backward | 8759.0 | 27032 | 1660.0 | |
pr1761_multigate | GRU | CUDA | 128 | 32 | 32 | forw and back | 12227.0 | 36928 | 2630.0 | |
pr1761_multigate | GRU | CPU | 128 | 32 | 64 | forward | 798.705993652344 | 385 | 669.590026855469 | |
pr1761_multigate | GRU | CPU | 128 | 32 | 64 | backward | 3280.0 | 16718 | 5750.0 | |
pr1761_multigate | GRU | CPU | 128 | 32 | 64 | forw and back | 5621.0 | 27936 | 7680.0 | |
pr1761_multigate | GRU | CUDA | 128 | 32 | 64 | forward | 2768.0 | 9537 | 747.559997558594 | |
pr1761_multigate | GRU | CUDA | 128 | 32 | 64 | backward | 16799.0 | 53976 | 3310.0 | |
pr1761_multigate | GRU | CUDA | 128 | 32 | 64 | forw and back | 23627.0 | 73664 | 5250.0 | |
pr1761_multigate | GRU | CPU | 128 | 128 | 4 | forward | 470.582000732422 | 25 | 151.110000610352 | |
pr1761_multigate | GRU | CPU | 128 | 128 | 4 | backward | 1024.0 | 1125 | 953.27001953125 | |
pr1761_multigate | GRU | CPU | 128 | 128 | 4 | forw and back | 1813.0 | 1898 | 1060.0 | |
pr1761_multigate | GRU | CUDA | 128 | 128 | 4 | forward | 271.450012207031 | 609 | 46.9500007629395 | |
pr1761_multigate | GRU | CUDA | 128 | 128 | 4 | backward | 1353.0 | 3491 | 215.389999389648 | |
pr1761_multigate | GRU | CUDA | 128 | 128 | 4 | forw and back | 1928.0 | 4818 | 346.609985351562 | |
pr1761_multigate | GRU | CPU | 128 | 128 | 16 | forward | 1955.0 | 97 | 640.200012207031 | |
pr1761_multigate | GRU | CPU | 128 | 128 | 16 | backward | 4127.0 | 4270 | 3810.0 | |
pr1761_multigate | GRU | CPU | 128 | 128 | 16 | forw and back | 7025.0 | 7120 | 4350.0 | |
pr1761_multigate | GRU | CUDA | 128 | 128 | 16 | forward | 762.539978027344 | 2433 | 187.669998168945 | |
pr1761_multigate | GRU | CUDA | 128 | 128 | 16 | backward | 4574.0 | 13704 | 851.799987792969 | |
pr1761_multigate | GRU | CUDA | 128 | 128 | 16 | forw and back | 6396.0 | 18704 | 1320.0 | |
pr1761_multigate | GRU | CPU | 128 | 128 | 32 | forward | 2435.0 | 193 | 1260.0 | |
pr1761_multigate | GRU | CPU | 128 | 128 | 32 | backward | 8386.0 | 8462 | 7650.0 | |
pr1761_multigate | GRU | CPU | 128 | 128 | 32 | forw and back | 14122.0 | 14080 | 8730.0 | |
pr1761_multigate | GRU | CUDA | 128 | 128 | 32 | forward | 1421.0 | 4865 | 375.299987792969 | |
pr1761_multigate | GRU | CUDA | 128 | 128 | 32 | backward | 8819.0 | 27320 | 1660.0 | |
pr1761_multigate | GRU | CUDA | 128 | 128 | 32 | forw and back | 12320.0 | 37216 | 2640.0 | |
pr1761_multigate | GRU | CPU | 128 | 128 | 64 | forward | 7949.0 | 385 | 2540.0 | |
pr1761_multigate | GRU | CPU | 128 | 128 | 64 | backward | 16790.0 | 16846 | 15330.0 | |
pr1761_multigate | GRU | CPU | 128 | 128 | 64 | forw and back | 28142.0 | 28000 | 17490.0 | |
pr1761_multigate | GRU | CUDA | 128 | 128 | 64 | forward | 2723.0 | 9729 | 750.559997558594 | |
pr1761_multigate | GRU | CUDA | 128 | 128 | 64 | backward | 17311.0 | 54552 | 3320.0 | |
pr1761_multigate | GRU | CUDA | 128 | 128 | 64 | forw and back | 24091.0 | 74240 | 5260.0 | |
pr1761_multigate | GRU | CPU | 128 | 512 | 4 | forward | 1377.0 | 32 | 594.559997558594 | |
pr1761_multigate | GRU | CPU | 128 | 512 | 4 | backward | 1902.0 | 1132 | 3260.0 | |
pr1761_multigate | GRU | CPU | 128 | 512 | 4 | forw and back | 3432.0 | 1924 | 3400.0 | |
pr1761_multigate | GRU | CUDA | 128 | 512 | 4 | forward | 275.136993408203 | 630 | 47.2799987792969 | |
pr1761_multigate | GRU | CUDA | 128 | 512 | 4 | backward | 1423.0 | 3583 | 219.139999389648 | |
pr1761_multigate | GRU | CUDA | 128 | 512 | 4 | forw and back | 2009.00012207031 | 4971 | 352.079986572266 | |
pr1761_multigate | GRU | CPU | 128 | 512 | 16 | forward | 5680.0 | 128 | 2460.0 | |
pr1761_multigate | GRU | CPU | 128 | 512 | 16 | backward | 7926.0 | 4301 | 13310.0 | |
pr1761_multigate | GRU | CPU | 128 | 512 | 16 | forw and back | 13797.0 | 7230 | 14000.0 | |
pr1761_multigate | GRU | CUDA | 128 | 512 | 16 | forward | 780.648010253906 | 2526 | 189.119995117188 | |
pr1761_multigate | GRU | CUDA | 128 | 512 | 16 | backward | 4848.0 | 14084 | 866.97998046875 | |
pr1761_multigate | GRU | CUDA | 128 | 512 | 16 | forw and back | 6705.0 | 19289 | 1340.0 | |
pr1761_multigate | GRU | CPU | 128 | 512 | 32 | forward | 11500.0 | 256 | 4970.0 | |
pr1761_multigate | GRU | CPU | 128 | 512 | 32 | backward | 15866.0 | 8525 | 26710.0 | |
pr1761_multigate | GRU | CPU | 128 | 512 | 32 | forw and back | 27474.0 | 14302 | 28130.0 | |
pr1761_multigate | GRU | CUDA | 128 | 512 | 32 | forward | 1439.0 | 5054 | 378.25 | |
pr1761_multigate | GRU | CUDA | 128 | 512 | 32 | backward | 9411.0 | 28084 | 1690.0 | |
pr1761_multigate | GRU | CUDA | 128 | 512 | 32 | forw and back | 12827.0 | 38377 | 2670.0 | |
pr1761_multigate | GRU | CPU | 128 | 512 | 64 | forward | 23182.0 | 512 | 9990.0 | |
pr1761_multigate | GRU | CPU | 128 | 512 | 64 | backward | 31875.0 | 16973 | 53520.0 | |
pr1761_multigate | GRU | CPU | 128 | 512 | 64 | forw and back | 54836.0 | 28446 | 56380.0 | |
pr1761_multigate | GRU | CUDA | 128 | 512 | 64 | forward | 2746.0 | 10110 | 756.52001953125 | |
pr1761_multigate | GRU | CUDA | 128 | 512 | 64 | backward | 18130.0 | 56084 | 3380.0 | |
pr1761_multigate | GRU | CUDA | 128 | 512 | 64 | forw and back | 25132.0 | 76553 | 5340.0 | |
pr1761_multigate | GRU | CPU | 512 | 32 | 4 | forward | 315.055999755859 | 25 | 39.1100006103516 | |
pr1761_multigate | GRU | CPU | 512 | 32 | 4 | backward | 946.289001464844 | 1132 | 988.590026855469 | |
pr1761_multigate | GRU | CPU | 512 | 32 | 4 | forw and back | 1564.0 | 1905 | 923.380004882813 | |
pr1761_multigate | GRU | CUDA | 512 | 32 | 4 | forward | 283.980010986328 | 617 | 47.0800018310547 | |
pr1761_multigate | GRU | CUDA | 512 | 32 | 4 | backward | 1379.0 | 3487 | 215.330001831055 | |
pr1761_multigate | GRU | CUDA | 512 | 32 | 4 | forw and back | 1958.0 | 4834 | 346.859985351562 | |
pr1761_multigate | GRU | CPU | 512 | 32 | 16 | forward | 1263.0 | 97 | 165.199996948242 | |
pr1761_multigate | GRU | CPU | 512 | 32 | 16 | backward | 3815.0 | 4301 | 4010.00024414062 | |
pr1761_multigate | GRU | CPU | 512 | 32 | 16 | forw and back | 6029.0 | 7151 | 3750.0 | |
pr1761_multigate | GRU | CUDA | 512 | 32 | 16 | forward | 790.833984375 | 2465 | 188.169998168945 | |
pr1761_multigate | GRU | CUDA | 512 | 32 | 16 | backward | 4549.0 | 13688 | 851.549987792969 | |
pr1761_multigate | GRU | CUDA | 512 | 32 | 16 | forw and back | 6443.0 | 18768 | 1320.0 | |
pr1761_multigate | GRU | CPU | 512 | 32 | 32 | forward | 2536.0 | 193 | 333.329986572266 | |
pr1761_multigate | GRU | CPU | 512 | 32 | 32 | backward | 6099.0 | 8525 | 8069.99951171875 | |
pr1761_multigate | GRU | CPU | 512 | 32 | 32 | forw and back | 11996.0 | 14143 | 7540.0 | |
pr1761_multigate | GRU | CUDA | 512 | 32 | 32 | forward | 1443.0 | 4929 | 376.299987792969 | |
pr1761_multigate | GRU | CUDA | 512 | 32 | 32 | backward | 8692.0 | 27288 | 1660.0 | |
pr1761_multigate | GRU | CUDA | 512 | 32 | 32 | forw and back | 12216.0 | 37344 | 2640.0 | |
pr1761_multigate | GRU | CPU | 512 | 32 | 64 | forward | 5089.0 | 385 | 669.590026855469 | |
pr1761_multigate | GRU | CPU | 512 | 32 | 64 | backward | 15242.0 | 16973 | 16190.0009765625 | |
pr1761_multigate | GRU | CPU | 512 | 32 | 64 | forw and back | 23703.0 | 28127 | 15130.0 | |
pr1761_multigate | GRU | CUDA | 512 | 32 | 64 | forward | 2723.0 | 9857 | 752.559997558594 | |
pr1761_multigate | GRU | CUDA | 512 | 32 | 64 | backward | 17028.0 | 54488 | 3320.0 | |
pr1761_multigate | GRU | CUDA | 512 | 32 | 64 | forw and back | 24133.0 | 74496 | 5270.0 | |
pr1761_multigate | GRU | CPU | 512 | 128 | 4 | forward | 559.637023925781 | 25 | 151.110000610352 | |
pr1761_multigate | GRU | CPU | 512 | 128 | 4 | backward | 1399.0 | 1132 | 2680.0 | |
pr1761_multigate | GRU | CPU | 512 | 128 | 4 | forw and back | 2139.0 | 1905 | 2060.0 | |
pr1761_multigate | GRU | CUDA | 512 | 128 | 4 | forward | 276.165008544922 | 629 | 47.2700004577637 | |
pr1761_multigate | GRU | CUDA | 512 | 128 | 4 | backward | 1374.0 | 3523 | 215.889999389648 | |
pr1761_multigate | GRU | CUDA | 512 | 128 | 4 | forw and back | 1981.0 | 4870 | 347.420013427734 | |
pr1761_multigate | GRU | CPU | 512 | 128 | 16 | forward | 2258.0 | 97 | 640.200012207031 | |
pr1761_multigate | GRU | CPU | 512 | 128 | 16 | backward | 5840.0 | 4301 | 10900.0 | |
pr1761_multigate | GRU | CPU | 512 | 128 | 16 | forw and back | 8515.0 | 7151 | 8430.0 | |
pr1761_multigate | GRU | CUDA | 512 | 128 | 16 | forward | 788.554016113281 | 2513 | 188.919998168945 | |
pr1761_multigate | GRU | CUDA | 512 | 128 | 16 | backward | 4650.0 | 13832 | 853.799987792969 | |
pr1761_multigate | GRU | CUDA | 512 | 128 | 16 | forw and back | 6462.0 | 18912 | 1330.0 | |
pr1761_multigate | GRU | CPU | 512 | 128 | 32 | forward | 4651.0 | 193 | 1260.0 | |
pr1761_multigate | GRU | CPU | 512 | 128 | 32 | backward | 11692.0 | 8525 | 21860.0 | |
pr1761_multigate | GRU | CPU | 512 | 128 | 32 | forw and back | 16916.0 | 14143 | 16940.0 | |
pr1761_multigate | GRU | CUDA | 512 | 128 | 32 | forward | 1434.0 | 5025 | 377.799987792969 | |
pr1761_multigate | GRU | CUDA | 512 | 128 | 32 | backward | 8944.0 | 27576 | 1660.0 | |
pr1761_multigate | GRU | CUDA | 512 | 128 | 32 | forw and back | 12441.0 | 37632 | 2640.0 | |
pr1761_multigate | GRU | CPU | 512 | 128 | 64 | forward | 9432.0 | 385 | 2540.0 | |
pr1761_multigate | GRU | CPU | 512 | 128 | 64 | backward | 23472.0 | 16973 | 43790.0 | |
pr1761_multigate | GRU | CPU | 512 | 128 | 64 | forw and back | 33648.0 | 28127 | 33950.0 | |
pr1761_multigate | GRU | CUDA | 512 | 128 | 64 | forward | 2722.0 | 10049 | 755.559997558594 | |
pr1761_multigate | GRU | CUDA | 512 | 128 | 64 | backward | 17468.0 | 55064 | 3330.0 | |
pr1761_multigate | GRU | CUDA | 512 | 128 | 64 | forw and back | 24193.0 | 75072 | 5280.0 | |
pr1761_multigate | GRU | CPU | 512 | 512 | 4 | forward | 1650.0 | 32 | 594.559997558594 | |
pr1761_multigate | GRU | CPU | 512 | 512 | 4 | backward | 3454.0 | 1139 | 9510.0 | |
pr1761_multigate | GRU | CPU | 512 | 512 | 4 | forw and back | 4823.0 | 1931 | 6650.0 | |
pr1761_multigate | GRU | CUDA | 512 | 512 | 4 | forward | 314.127014160156 | 650 | 47.5900001525879 | |
pr1761_multigate | GRU | CUDA | 512 | 512 | 4 | backward | 1488.0 | 3615 | 219.639999389648 | |
pr1761_multigate | GRU | CUDA | 512 | 512 | 4 | forw and back | 2077.0 | 5023 | 352.890014648437 | |
pr1761_multigate | GRU | CPU | 512 | 512 | 16 | forward | 7131.0 | 128 | 2460.0 | |
pr1761_multigate | GRU | CPU | 512 | 512 | 16 | backward | 14083.0 | 4332 | 38400.0 | |
pr1761_multigate | GRU | CPU | 512 | 512 | 16 | forw and back | 18896.0 | 7261 | 27090.0 | |
pr1761_multigate | GRU | CUDA | 512 | 512 | 16 | forward | 903.867980957031 | 2606 | 190.380004882812 | |
pr1761_multigate | GRU | CUDA | 512 | 512 | 16 | backward | 5050.0 | 14212 | 868.97998046875 | |
pr1761_multigate | GRU | CUDA | 512 | 512 | 16 | forw and back | 6913.0 | 19497 | 1350.0 | |
pr1761_multigate | GRU | CPU | 512 | 512 | 32 | forward | 14344.0 | 256 | 4970.0 | |
pr1761_multigate | GRU | CPU | 512 | 512 | 32 | backward | 28299.0 | 8588 | 76920.0 | |
pr1761_multigate | GRU | CPU | 512 | 512 | 32 | forw and back | 37622.0 | 14365 | 54340.0 | |
pr1761_multigate | GRU | CUDA | 512 | 512 | 32 | forward | 1677.0 | 5214 | 380.75 | |
pr1761_multigate | GRU | CUDA | 512 | 512 | 32 | backward | 9893.0 | 28340 | 1690.0 | |
pr1761_multigate | GRU | CUDA | 512 | 512 | 32 | forw and back | 13357.0 | 38793 | 2680.0 | |
pr1761_multigate | GRU | CPU | 512 | 512 | 64 | forward | 28770.0 | 512 | 9990.0 | |
pr1761_multigate | GRU | CPU | 512 | 512 | 64 | backward | 56732.0 | 17100 | 153970.0 | |
pr1761_multigate | GRU | CPU | 512 | 512 | 64 | forw and back | 74510.0 | 28573 | 108840.0 | |
pr1761_multigate | GRU | CUDA | 512 | 512 | 64 | forward | 3243.0 | 10430 | 761.52001953125 | |
pr1761_multigate | GRU | CUDA | 512 | 512 | 64 | backward | 19664.0 | 56596 | 3390.0 | |
pr1761_multigate | GRU | CUDA | 512 | 512 | 64 | forw and back | 26862.0 | 77385 | 5350.0 |
LSTM CPU c=32 n=32 ts=4 | |
forward | |
36.511 μs (37 allocations: 71.14 KiB) | |
backward | |
157.997 μs (608 allocations: 233.70 KiB) | |
forw and back | |
328.358 μs (1495 allocations: 390.67 KiB) | |
LSTM CUDA c=32 n=32 ts=4 | |
forward | |
331.403 μs (809 allocations: 56.45 KiB) | |
backward | |
1.063 ms (3075 allocations: 208.31 KiB) | |
forw and back | |
1.585 ms (4541 allocations: 309.48 KiB) | |
LSTM CPU c=32 n=32 ts=16 | |
forward | |
163.047 μs (145 allocations: 296.23 KiB) | |
backward | |
588.481 μs (2300 allocations: 950.73 KiB) | |
forw and back | |
1.193 ms (5575 allocations: 1.53 MiB) | |
LSTM CUDA c=32 n=32 ts=16 | |
forward | |
997.360 μs (3233 allocations: 225.67 KiB) | |
backward | |
3.549 ms (12219 allocations: 831.17 KiB) | |
forw and back | |
5.160 ms (17741 allocations: 1.18 MiB) | |
LSTM CPU c=32 n=32 ts=32 | |
forward | |
335.327 μs (289 allocations: 596.36 KiB) | |
backward | |
1.162 ms (4557 allocations: 1.86 MiB) | |
forw and back | |
2.329 ms (11017 allocations: 3.07 MiB) | |
LSTM CUDA c=32 n=32 ts=32 | |
forward | |
1.884 ms (6465 allocations: 451.30 KiB) | |
backward | |
6.835 ms (24412 allocations: 1.62 MiB) | |
forw and back | |
9.937 ms (35343 allocations: 2.36 MiB) | |
LSTM CPU c=32 n=32 ts=64 | |
forward | |
677.271 μs (577 allocations: 1.17 MiB) | |
backward | |
2.348 ms (9069 allocations: 3.73 MiB) | |
forw and back | |
4.628 ms (21897 allocations: 6.14 MiB) | |
LSTM CUDA c=32 n=32 ts=64 | |
forward | |
3.638 ms (12929 allocations: 902.56 KiB) | |
backward | |
13.431 ms (48796 allocations: 3.24 MiB) | |
forw and back | |
19.508 ms (70543 allocations: 4.71 MiB) | |
LSTM CPU c=32 n=128 ts=4 | |
forward | |
139.940 μs (37 allocations: 276.64 KiB) | |
backward | |
275.686 μs (608 allocations: 722.83 KiB) | |
forw and back | |
582.452 μs (1495 allocations: 1.10 MiB) | |
LSTM CUDA c=32 n=128 ts=4 | |
forward | |
343.437 μs (817 allocations: 56.58 KiB) | |
backward | |
1.101 ms (3119 allocations: 209.00 KiB) | |
forw and back | |
1.629 ms (4585 allocations: 310.17 KiB) | |
LSTM CPU c=32 n=128 ts=16 | |
forward | |
588.313 μs (145 allocations: 1.13 MiB) | |
backward | |
1.068 ms (2300 allocations: 2.86 MiB) | |
forw and back | |
2.170 ms (5575 allocations: 4.44 MiB) | |
LSTM CUDA c=32 n=128 ts=16 | |
forward | |
1.043 ms (3265 allocations: 226.17 KiB) | |
backward | |
3.662 ms (12395 allocations: 833.92 KiB) | |
forw and back | |
5.349 ms (17917 allocations: 1.19 MiB) | |
LSTM CPU c=32 n=128 ts=32 | |
forward | |
1.190 ms (289 allocations: 2.27 MiB) | |
backward | |
2.180 ms (4557 allocations: 5.73 MiB) | |
forw and back | |
4.349 ms (11017 allocations: 8.91 MiB) | |
LSTM CUDA c=32 n=128 ts=32 | |
forward | |
1.917 ms (6529 allocations: 452.30 KiB) | |
backward | |
6.973 ms (24764 allocations: 1.63 MiB) | |
forw and back | |
10.145 ms (35695 allocations: 2.36 MiB) | |
LSTM CPU c=32 n=128 ts=64 | |
forward | |
2.394 ms (577 allocations: 4.56 MiB) | |
backward | |
4.499 ms (9069 allocations: 11.46 MiB) | |
forw and back | |
8.673 ms (21897 allocations: 17.84 MiB) | |
LSTM CUDA c=32 n=128 ts=64 | |
forward | |
3.725 ms (13057 allocations: 904.56 KiB) | |
backward | |
13.667 ms (49500 allocations: 3.26 MiB) | |
forw and back | |
19.937 ms (71247 allocations: 4.72 MiB) | |
LSTM CPU c=32 n=512 ts=4 | |
forward | |
1.348 ms (48 allocations: 1.07 MiB) | |
backward | |
1.674 ms (636 allocations: 2.60 MiB) | |
forw and back | |
3.579 ms (1546 allocations: 3.93 MiB) | |
LSTM CUDA c=32 n=512 ts=4 | |
forward | |
349.663 μs (838 allocations: 56.91 KiB) | |
backward | |
1.156 ms (3147 allocations: 209.44 KiB) | |
forw and back | |
1.698 ms (4666 allocations: 312.08 KiB) | |
LSTM CPU c=32 n=512 ts=16 | |
forward | |
5.774 ms (192 allocations: 4.45 MiB) | |
backward | |
7.088 ms (2412 allocations: 10.51 MiB) | |
forw and back | |
14.627 ms (5782 allocations: 15.99 MiB) | |
LSTM CUDA c=32 n=512 ts=16 | |
forward | |
1.073 ms (3358 allocations: 227.62 KiB) | |
backward | |
3.879 ms (12519 allocations: 835.86 KiB) | |
forw and back | |
5.569 ms (18214 allocations: 1.19 MiB) | |
LSTM CPU c=32 n=512 ts=32 | |
forward | |
9.449 ms (384 allocations: 8.97 MiB) | |
backward | |
14.147 ms (4781 allocations: 21.06 MiB) | |
forw and back | |
28.861 ms (11432 allocations: 32.07 MiB) | |
LSTM CUDA c=32 n=512 ts=32 | |
forward | |
1.981 ms (6718 allocations: 455.25 KiB) | |
backward | |
7.394 ms (25016 allocations: 1.63 MiB) | |
forw and back | |
10.652 ms (36280 allocations: 2.37 MiB) | |
LSTM CPU c=32 n=512 ts=64 | |
forward | |
23.437 ms (768 allocations: 17.99 MiB) | |
backward | |
28.574 ms (9517 allocations: 42.15 MiB) | |
forw and back | |
57.866 ms (22728 allocations: 64.22 MiB) | |
LSTM CUDA c=32 n=512 ts=64 | |
forward | |
3.811 ms (13438 allocations: 910.52 KiB) | |
backward | |
14.319 ms (50008 allocations: 3.26 MiB) | |
forw and back | |
20.900 ms (72408 allocations: 4.74 MiB) | |
LSTM CPU c=128 n=32 ts=4 | |
forward | |
50.745 μs (37 allocations: 71.14 KiB) | |
backward | |
199.522 μs (608 allocations: 413.70 KiB) | |
forw and back | |
396.784 μs (1495 allocations: 522.67 KiB) | |
LSTM CUDA c=128 n=32 ts=4 | |
forward | |
364.986 μs (809 allocations: 56.45 KiB) | |
backward | |
1.135 ms (3075 allocations: 208.31 KiB) | |
forw and back | |
1.676 ms (4541 allocations: 309.48 KiB) | |
LSTM CPU c=128 n=32 ts=16 | |
forward | |
218.906 μs (145 allocations: 296.23 KiB) | |
backward | |
755.577 μs (2300 allocations: 1.67 MiB) | |
forw and back | |
1.420 ms (5575 allocations: 2.08 MiB) | |
LSTM CUDA c=128 n=32 ts=16 | |
forward | |
1.058 ms (3233 allocations: 225.67 KiB) | |
backward | |
3.679 ms (12219 allocations: 831.17 KiB) | |
forw and back | |
5.374 ms (17741 allocations: 1.18 MiB) | |
LSTM CPU c=128 n=32 ts=32 | |
forward | |
447.689 μs (289 allocations: 596.36 KiB) | |
backward | |
1.522 ms (4557 allocations: 3.35 MiB) | |
forw and back | |
2.791 ms (11017 allocations: 4.18 MiB) | |
LSTM CUDA c=128 n=32 ts=32 | |
forward | |
1.960 ms (6465 allocations: 451.30 KiB) | |
backward | |
7.010 ms (24412 allocations: 1.62 MiB) | |
forw and back | |
10.229 ms (35343 allocations: 2.36 MiB) | |
LSTM CPU c=128 n=32 ts=64 | |
forward | |
906.688 μs (577 allocations: 1.17 MiB) | |
backward | |
3.214 ms (9069 allocations: 6.72 MiB) | |
forw and back | |
5.648 ms (21897 allocations: 8.38 MiB) | |
LSTM CUDA c=128 n=32 ts=64 | |
forward | |
3.743 ms (12929 allocations: 902.56 KiB) | |
backward | |
13.742 ms (48796 allocations: 3.24 MiB) | |
forw and back | |
19.897 ms (70543 allocations: 4.71 MiB) | |
LSTM CPU c=128 n=128 ts=4 | |
forward | |
493.348 μs (37 allocations: 276.64 KiB) | |
backward | |
996.974 μs (616 allocations: 1.16 MiB) | |
forw and back | |
1.866 ms (1499 allocations: 1.36 MiB) | |
LSTM CUDA c=128 n=128 ts=4 | |
forward | |
357.844 μs (817 allocations: 56.58 KiB) | |
backward | |
1.130 ms (3119 allocations: 209.00 KiB) | |
forw and back | |
1.680 ms (4585 allocations: 310.17 KiB) | |
LSTM CPU c=128 n=128 ts=16 | |
forward | |
2.087 ms (145 allocations: 1.13 MiB) | |
backward | |
4.089 ms (2332 allocations: 4.72 MiB) | |
forw and back | |
7.342 ms (5591 allocations: 5.56 MiB) | |
LSTM CUDA c=128 n=128 ts=16 | |
forward | |
1.053 ms (3265 allocations: 226.17 KiB) | |
backward | |
3.694 ms (12395 allocations: 833.92 KiB) | |
forw and back | |
5.376 ms (17917 allocations: 1.19 MiB) | |
LSTM CPU c=128 n=128 ts=32 | |
forward | |
4.222 ms (289 allocations: 2.27 MiB) | |
backward | |
8.297 ms (4621 allocations: 9.46 MiB) | |
forw and back | |
14.760 ms (11049 allocations: 11.14 MiB) | |
LSTM CUDA c=128 n=128 ts=32 | |
forward | |
1.931 ms (6529 allocations: 452.30 KiB) | |
backward | |
7.032 ms (24764 allocations: 1.63 MiB) | |
forw and back | |
10.200 ms (35695 allocations: 2.36 MiB) | |
LSTM CPU c=128 n=128 ts=64 | |
forward | |
8.167 ms (577 allocations: 4.56 MiB) | |
backward | |
16.636 ms (9197 allocations: 18.94 MiB) | |
forw and back | |
29.544 ms (21961 allocations: 22.32 MiB) | |
LSTM CUDA c=128 n=128 ts=64 | |
forward | |
3.787 ms (13057 allocations: 904.56 KiB) | |
backward | |
13.967 ms (49500 allocations: 3.26 MiB) | |
forw and back | |
20.377 ms (71247 allocations: 4.72 MiB) | |
LSTM CPU c=128 n=512 ts=4 | |
forward | |
1.435 ms (48 allocations: 1.07 MiB) | |
backward | |
1.989 ms (636 allocations: 4.18 MiB) | |
forw and back | |
3.844 ms (1546 allocations: 4.76 MiB) | |
LSTM CUDA c=128 n=512 ts=4 | |
forward | |
380.586 μs (838 allocations: 56.91 KiB) | |
backward | |
1.202 ms (3147 allocations: 209.44 KiB) | |
forw and back | |
1.767 ms (4666 allocations: 312.08 KiB) | |
LSTM CPU c=128 n=512 ts=16 | |
forward | |
3.478 ms (192 allocations: 4.45 MiB) | |
backward | |
8.329 ms (2412 allocations: 16.88 MiB) | |
forw and back | |
15.553 ms (5782 allocations: 19.35 MiB) | |
LSTM CUDA c=128 n=512 ts=16 | |
forward | |
1.120 ms (3358 allocations: 227.62 KiB) | |
backward | |
3.991 ms (12519 allocations: 835.86 KiB) | |
forw and back | |
5.734 ms (18214 allocations: 1.19 MiB) | |
LSTM CPU c=128 n=512 ts=32 | |
forward | |
12.533 ms (384 allocations: 8.97 MiB) | |
backward | |
16.724 ms (4781 allocations: 33.80 MiB) | |
forw and back | |
30.750 ms (11432 allocations: 38.80 MiB) | |
LSTM CUDA c=128 n=512 ts=32 | |
forward | |
2.100 ms (6718 allocations: 455.25 KiB) | |
backward | |
7.663 ms (25016 allocations: 1.63 MiB) | |
forw and back | |
10.944 ms (36280 allocations: 2.37 MiB) | |
LSTM CPU c=128 n=512 ts=64 | |
forward | |
25.193 ms (768 allocations: 17.99 MiB) | |
backward | |
33.564 ms (9517 allocations: 67.64 MiB) | |
forw and back | |
61.978 ms (22728 allocations: 77.71 MiB) | |
LSTM CUDA c=128 n=512 ts=64 | |
forward | |
4.050 ms (13438 allocations: 910.52 KiB) | |
backward | |
14.960 ms (50008 allocations: 3.26 MiB) | |
forw and back | |
21.488 ms (72408 allocations: 4.74 MiB) | |
LSTM CPU c=512 n=32 ts=4 | |
forward | |
282.930 μs (37 allocations: 71.14 KiB) | |
backward | |
867.287 μs (623 allocations: 1.11 MiB) | |
forw and back | |
1.494 ms (1506 allocations: 1.03 MiB) | |
LSTM CUDA c=512 n=32 ts=4 | |
forward | |
365.976 μs (829 allocations: 56.77 KiB) | |
backward | |
1.133 ms (3107 allocations: 208.81 KiB) | |
forw and back | |
1.678 ms (4593 allocations: 310.30 KiB) | |
LSTM CPU c=512 n=32 ts=16 | |
forward | |
1.185 ms (145 allocations: 296.23 KiB) | |
backward | |
3.641 ms (2363 allocations: 4.62 MiB) | |
forw and back | |
5.847 ms (5622 allocations: 4.28 MiB) | |
LSTM CUDA c=512 n=32 ts=16 | |
forward | |
1.056 ms (3313 allocations: 226.92 KiB) | |
backward | |
3.712 ms (12347 allocations: 833.17 KiB) | |
forw and back | |
5.441 ms (17949 allocations: 1.19 MiB) | |
LSTM CPU c=512 n=32 ts=32 | |
forward | |
2.399 ms (289 allocations: 596.36 KiB) | |
backward | |
7.361 ms (4684 allocations: 9.29 MiB) | |
forw and back | |
11.652 ms (11112 allocations: 8.63 MiB) | |
LSTM CUDA c=512 n=32 ts=32 | |
forward | |
1.982 ms (6625 allocations: 453.80 KiB) | |
backward | |
7.102 ms (24668 allocations: 1.63 MiB) | |
forw and back | |
10.382 ms (35759 allocations: 2.37 MiB) | |
LSTM CPU c=512 n=32 ts=64 | |
forward | |
4.867 ms (577 allocations: 1.17 MiB) | |
backward | |
14.773 ms (9324 allocations: 18.65 MiB) | |
forw and back | |
23.629 ms (22088 allocations: 17.32 MiB) | |
LSTM CUDA c=512 n=32 ts=64 | |
forward | |
3.763 ms (13249 allocations: 907.56 KiB) | |
backward | |
13.788 ms (49308 allocations: 3.25 MiB) | |
forw and back | |
20.283 ms (71375 allocations: 4.72 MiB) | |
LSTM CPU c=512 n=128 ts=4 | |
forward | |
558.467 μs (37 allocations: 276.64 KiB) | |
backward | |
1.349 ms (623 allocations: 2.99 MiB) | |
forw and back | |
2.169 ms (1506 allocations: 2.44 MiB) | |
LSTM CUDA c=512 n=128 ts=4 | |
forward | |
406.861 μs (837 allocations: 56.89 KiB) | |
backward | |
1.166 ms (3151 allocations: 209.50 KiB) | |
forw and back | |
1.748 ms (4637 allocations: 310.98 KiB) | |
LSTM CPU c=512 n=128 ts=16 | |
forward | |
2.391 ms (145 allocations: 1.13 MiB) | |
backward | |
5.799 ms (2363 allocations: 12.17 MiB) | |
forw and back | |
8.837 ms (5622 allocations: 10.01 MiB) | |
LSTM CUDA c=512 n=128 ts=16 | |
forward | |
1.194 ms (3345 allocations: 227.42 KiB) | |
backward | |
3.846 ms (12523 allocations: 835.92 KiB) | |
forw and back | |
5.701 ms (18125 allocations: 1.19 MiB) | |
LSTM CPU c=512 n=128 ts=32 | |
forward | |
4.913 ms (289 allocations: 2.27 MiB) | |
backward | |
11.581 ms (4684 allocations: 24.41 MiB) | |
forw and back | |
17.497 ms (11112 allocations: 20.09 MiB) | |
LSTM CUDA c=512 n=128 ts=32 | |
forward | |
2.245 ms (6689 allocations: 454.80 KiB) | |
backward | |
7.364 ms (25020 allocations: 1.63 MiB) | |
forw and back | |
10.827 ms (36111 allocations: 2.37 MiB) | |
LSTM CPU c=512 n=128 ts=64 | |
forward | |
9.983 ms (577 allocations: 4.56 MiB) | |
backward | |
23.183 ms (9324 allocations: 48.88 MiB) | |
forw and back | |
35.427 ms (22088 allocations: 40.26 MiB) | |
LSTM CUDA c=512 n=128 ts=64 | |
forward | |
4.330 ms (13377 allocations: 909.56 KiB) | |
backward | |
14.335 ms (50012 allocations: 3.26 MiB) | |
forw and back | |
21.230 ms (72079 allocations: 4.73 MiB) | |
LSTM CPU c=512 n=512 ts=4 | |
forward | |
1.688 ms (48 allocations: 1.07 MiB) | |
backward | |
3.557 ms (643 allocations: 10.51 MiB) | |
forw and back | |
5.102 ms (1553 allocations: 8.09 MiB) | |
LSTM CUDA c=512 n=512 ts=4 | |
forward | |
399.813 μs (858 allocations: 57.22 KiB) | |
backward | |
1.241 ms (3179 allocations: 209.94 KiB) | |
forw and back | |
1.815 ms (4718 allocations: 312.89 KiB) | |
LSTM CPU c=512 n=512 ts=16 | |
forward | |
7.528 ms (192 allocations: 4.45 MiB) | |
backward | |
14.394 ms (2443 allocations: 42.33 MiB) | |
forw and back | |
20.404 ms (5813 allocations: 32.80 MiB) | |
LSTM CUDA c=512 n=512 ts=16 | |
forward | |
1.216 ms (3438 allocations: 228.88 KiB) | |
backward | |
4.185 ms (12647 allocations: 837.86 KiB) | |
forw and back | |
5.924 ms (18422 allocations: 1.20 MiB) | |
LSTM CPU c=512 n=512 ts=32 | |
forward | |
15.272 ms (384 allocations: 8.97 MiB) | |
backward | |
29.294 ms (4844 allocations: 84.75 MiB) | |
forw and back | |
41.290 ms (11495 allocations: 65.75 MiB) | |
LSTM CUDA c=512 n=512 ts=32 | |
forward | |
2.254 ms (6878 allocations: 457.75 KiB) | |
backward | |
8.176 ms (25272 allocations: 1.64 MiB) | |
forw and back | |
11.253 ms (36696 allocations: 2.38 MiB) | |
LSTM CPU c=512 n=512 ts=64 | |
forward | |
30.405 ms (768 allocations: 17.99 MiB) | |
backward | |
58.086 ms (9644 allocations: 169.59 MiB) | |
forw and back | |
81.369 ms (22855 allocations: 131.65 MiB) | |
LSTM CUDA c=512 n=512 ts=64 | |
forward | |
4.307 ms (13758 allocations: 915.52 KiB) | |
backward | |
16.188 ms (50520 allocations: 3.27 MiB) | |
forw and back | |
22.430 ms (73240 allocations: 4.75 MiB) | |
GRU CPU c=32 n=32 ts=4 | |
forward | |
33.804 μs (25 allocations: 39.11 KiB) | |
backward | |
178.609 μs (844 allocations: 250.83 KiB) | |
forw and back | |
367.103 μs (1721 allocations: 424.92 KiB) | |
GRU CUDA c=32 n=32 ts=4 | |
forward | |
382.705 μs (1005 allocations: 117.64 KiB) | |
backward | |
1.490 ms (3466 allocations: 219.28 KiB) | |
forw and back | |
2.087 ms (5025 allocations: 355.75 KiB) | |
GRU CPU c=32 n=32 ts=16 | |
forward | |
146.961 μs (97 allocations: 165.20 KiB) | |
backward | |
717.275 μs (3149 allocations: 1.02 MiB) | |
forw and back | |
1.371 ms (6415 allocations: 1.69 MiB) | |
GRU CUDA c=32 n=32 ts=16 | |
forward | |
1.158 ms (4017 allocations: 470.42 KiB) | |
backward | |
4.959 ms (13607 allocations: 867.50 KiB) | |
forw and back | |
6.879 ms (19535 allocations: 1.36 MiB) | |
GRU CPU c=32 n=32 ts=32 | |
forward | |
305.891 μs (193 allocations: 333.33 KiB) | |
backward | |
1.411 ms (6221 allocations: 2.06 MiB) | |
forw and back | |
2.689 ms (12671 allocations: 3.40 MiB) | |
GRU CUDA c=32 n=32 ts=32 | |
forward | |
2.169 ms (8033 allocations: 940.80 KiB) | |
backward | |
9.623 ms (27127 allocations: 1.69 MiB) | |
forw and back | |
13.479 ms (38879 allocations: 2.71 MiB) | |
GRU CPU c=32 n=32 ts=64 | |
forward | |
617.131 μs (385 allocations: 669.59 KiB) | |
backward | |
2.846 ms (12365 allocations: 4.13 MiB) | |
forw and back | |
5.336 ms (25183 allocations: 6.81 MiB) | |
GRU CUDA c=32 n=32 ts=64 | |
forward | |
4.278 ms (16065 allocations: 1.84 MiB) | |
backward | |
18.976 ms (54167 allocations: 3.38 MiB) | |
forw and back | |
26.450 ms (77567 allocations: 5.41 MiB) | |
GRU CPU c=32 n=128 ts=4 | |
forward | |
127.137 μs (25 allocations: 151.11 KiB) | |
backward | |
312.210 μs (844 allocations: 751.95 KiB) | |
forw and back | |
612.394 μs (1721 allocations: 1.09 MiB) | |
GRU CUDA c=32 n=128 ts=4 | |
forward | |
382.322 μs (1021 allocations: 117.89 KiB) | |
backward | |
1.505 ms (3502 allocations: 219.84 KiB) | |
forw and back | |
2.110 ms (5061 allocations: 356.31 KiB) | |
GRU CPU c=32 n=128 ts=16 | |
forward | |
546.475 μs (97 allocations: 640.20 KiB) | |
backward | |
1.208 ms (3149 allocations: 3.10 MiB) | |
forw and back | |
2.254 ms (6415 allocations: 4.53 MiB) | |
GRU CUDA c=32 n=128 ts=16 | |
forward | |
1.167 ms (4081 allocations: 471.42 KiB) | |
backward | |
5.157 ms (13751 allocations: 869.75 KiB) | |
forw and back | |
7.216 ms (19679 allocations: 1.36 MiB) | |
GRU CPU c=32 n=128 ts=32 | |
forward | |
1.094 ms (193 allocations: 1.26 MiB) | |
backward | |
2.450 ms (6221 allocations: 6.26 MiB) | |
forw and back | |
4.498 ms (12671 allocations: 9.12 MiB) | |
GRU CUDA c=32 n=128 ts=32 | |
forward | |
2.173 ms (8161 allocations: 942.80 KiB) | |
backward | |
9.933 ms (27415 allocations: 1.70 MiB) | |
forw and back | |
13.795 ms (39167 allocations: 2.71 MiB) | |
GRU CPU c=32 n=128 ts=64 | |
forward | |
2.214 ms (385 allocations: 2.54 MiB) | |
backward | |
5.052 ms (12365 allocations: 12.58 MiB) | |
forw and back | |
8.956 ms (25183 allocations: 18.30 MiB) | |
GRU CUDA c=32 n=128 ts=64 | |
forward | |
4.272 ms (16321 allocations: 1.84 MiB) | |
backward | |
19.251 ms (54743 allocations: 3.39 MiB) | |
forw and back | |
26.773 ms (78143 allocations: 5.42 MiB) | |
GRU CPU c=32 n=512 ts=4 | |
forward | |
650.535 μs (32 allocations: 594.56 KiB) | |
backward | |
1.806 ms (880 allocations: 2.68 MiB) | |
forw and back | |
3.443 ms (1772 allocations: 3.74 MiB) | |
GRU CUDA c=32 n=512 ts=4 | |
forward | |
380.510 μs (1042 allocations: 118.22 KiB) | |
backward | |
1.581 ms (3594 allocations: 223.59 KiB) | |
forw and back | |
2.205 ms (5214 allocations: 361.78 KiB) | |
GRU CPU c=32 n=512 ts=16 | |
forward | |
5.215 ms (128 allocations: 2.46 MiB) | |
backward | |
7.626 ms (3305 allocations: 11.37 MiB) | |
forw and back | |
14.006 ms (6634 allocations: 15.76 MiB) | |
GRU CUDA c=32 n=512 ts=16 | |
forward | |
1.169 ms (4174 allocations: 472.88 KiB) | |
backward | |
5.440 ms (14131 allocations: 884.94 KiB) | |
forw and back | |
7.500 ms (20264 allocations: 1.38 MiB) | |
GRU CPU c=32 n=512 ts=32 | |
forward | |
10.522 ms (256 allocations: 4.97 MiB) | |
backward | |
15.234 ms (6537 allocations: 22.95 MiB) | |
forw and back | |
27.659 ms (13114 allocations: 31.77 MiB) | |
GRU CUDA c=32 n=512 ts=32 | |
forward | |
2.190 ms (8350 allocations: 945.75 KiB) | |
backward | |
10.514 ms (28179 allocations: 1.73 MiB) | |
forw and back | |
14.553 ms (40328 allocations: 2.75 MiB) | |
GRU CPU c=32 n=512 ts=64 | |
forward | |
21.181 ms (512 allocations: 9.99 MiB) | |
backward | |
30.551 ms (13001 allocations: 46.12 MiB) | |
forw and back | |
55.245 ms (26074 allocations: 63.80 MiB) | |
GRU CUDA c=32 n=512 ts=64 | |
forward | |
4.341 ms (16702 allocations: 1.85 MiB) | |
backward | |
20.676 ms (56275 allocations: 3.45 MiB) | |
forw and back | |
28.481 ms (80456 allocations: 5.49 MiB) | |
GRU CPU c=128 n=32 ts=4 | |
forward | |
45.944 μs (25 allocations: 39.11 KiB) | |
backward | |
227.995 μs (844 allocations: 409.83 KiB) | |
forw and back | |
436.912 μs (1721 allocations: 535.92 KiB) | |
GRU CUDA c=128 n=32 ts=4 | |
forward | |
398.092 μs (1005 allocations: 117.64 KiB) | |
backward | |
1.530 ms (3466 allocations: 219.28 KiB) | |
forw and back | |
2.135 ms (5025 allocations: 355.75 KiB) | |
GRU CPU c=128 n=32 ts=16 | |
forward | |
193.138 μs (97 allocations: 165.20 KiB) | |
backward | |
845.771 μs (3149 allocations: 1.67 MiB) | |
forw and back | |
1.542 ms (6415 allocations: 2.15 MiB) | |
GRU CUDA c=128 n=32 ts=16 | |
forward | |
1.207 ms (4017 allocations: 470.42 KiB) | |
backward | |
5.151 ms (13607 allocations: 867.50 KiB) | |
forw and back | |
7.240 ms (19535 allocations: 1.36 MiB) | |
GRU CPU c=128 n=32 ts=32 | |
forward | |
394.700 μs (193 allocations: 333.33 KiB) | |
backward | |
1.702 ms (6221 allocations: 3.36 MiB) | |
forw and back | |
3.038 ms (12671 allocations: 4.33 MiB) | |
GRU CUDA c=128 n=32 ts=32 | |
forward | |
2.262 ms (8033 allocations: 940.80 KiB) | |
backward | |
9.877 ms (27127 allocations: 1.69 MiB) | |
forw and back | |
13.811 ms (38879 allocations: 2.71 MiB) | |
GRU CPU c=128 n=32 ts=64 | |
forward | |
796.808 μs (385 allocations: 669.59 KiB) | |
backward | |
3.573 ms (12365 allocations: 6.75 MiB) | |
forw and back | |
6.142 ms (25183 allocations: 8.68 MiB) | |
GRU CUDA c=128 n=32 ts=64 | |
forward | |
4.383 ms (16065 allocations: 1.84 MiB) | |
backward | |
19.106 ms (54167 allocations: 3.38 MiB) | |
forw and back | |
26.794 ms (77567 allocations: 5.41 MiB) | |
GRU CPU c=128 n=128 ts=4 | |
forward | |
459.966 μs (25 allocations: 151.11 KiB) | |
backward | |
1.041 ms (852 allocations: 1.17 MiB) | |
forw and back | |
1.103 ms (1725 allocations: 1.34 MiB) | |
GRU CUDA c=128 n=128 ts=4 | |
forward | |
389.419 μs (1021 allocations: 117.89 KiB) | |
backward | |
1.522 ms (3502 allocations: 219.84 KiB) | |
forw and back | |
2.154 ms (5061 allocations: 356.31 KiB) | |
GRU CPU c=128 n=128 ts=16 | |
forward | |
1.207 ms (97 allocations: 640.20 KiB) | |
backward | |
4.282 ms (3181 allocations: 4.87 MiB) | |
forw and back | |
7.375 ms (6431 allocations: 5.55 MiB) | |
GRU CUDA c=128 n=128 ts=16 | |
forward | |
1.176 ms (4081 allocations: 471.42 KiB) | |
backward | |
5.240 ms (13751 allocations: 869.75 KiB) | |
forw and back | |
7.321 ms (19679 allocations: 1.36 MiB) | |
GRU CPU c=128 n=128 ts=32 | |
forward | |
3.869 ms (193 allocations: 1.26 MiB) | |
backward | |
6.375 ms (6285 allocations: 9.81 MiB) | |
forw and back | |
12.397 ms (12703 allocations: 11.17 MiB) | |
GRU CUDA c=128 n=128 ts=32 | |
forward | |
2.221 ms (8161 allocations: 942.80 KiB) | |
backward | |
10.115 ms (27415 allocations: 1.70 MiB) | |
forw and back | |
14.042 ms (39167 allocations: 2.71 MiB) | |
GRU CPU c=128 n=128 ts=64 | |
forward | |
7.791 ms (385 allocations: 2.54 MiB) | |
backward | |
17.688 ms (12493 allocations: 19.69 MiB) | |
forw and back | |
29.597 ms (25247 allocations: 22.41 MiB) | |
GRU CUDA c=128 n=128 ts=64 | |
forward | |
4.373 ms (16321 allocations: 1.84 MiB) | |
backward | |
19.570 ms (54743 allocations: 3.39 MiB) | |
forw and back | |
26.788 ms (78143 allocations: 5.42 MiB) | |
GRU CPU c=128 n=512 ts=4 | |
forward | |
1.295 ms (32 allocations: 594.56 KiB) | |
backward | |
2.074 ms (880 allocations: 4.24 MiB) | |
forw and back | |
3.677 ms (1772 allocations: 4.56 MiB) | |
GRU CUDA c=128 n=512 ts=4 | |
forward | |
389.225 μs (1042 allocations: 118.22 KiB) | |
backward | |
1.583 ms (3594 allocations: 223.59 KiB) | |
forw and back | |
2.208 ms (5214 allocations: 361.78 KiB) | |
GRU CPU c=128 n=512 ts=16 | |
forward | |
5.538 ms (128 allocations: 2.46 MiB) | |
backward | |
8.798 ms (3305 allocations: 17.64 MiB) | |
forw and back | |
14.756 ms (6634 allocations: 19.03 MiB) | |
GRU CUDA c=128 n=512 ts=16 | |
forward | |
1.176 ms (4174 allocations: 472.88 KiB) | |
backward | |
5.420 ms (14131 allocations: 884.94 KiB) | |
forw and back | |
7.479 ms (20264 allocations: 1.38 MiB) | |
GRU CPU c=128 n=512 ts=32 | |
forward | |
11.260 ms (256 allocations: 4.97 MiB) | |
backward | |
17.652 ms (6537 allocations: 35.50 MiB) | |
forw and back | |
29.339 ms (13114 allocations: 38.32 MiB) | |
GRU CUDA c=128 n=512 ts=32 | |
forward | |
2.246 ms (8350 allocations: 945.75 KiB) | |
backward | |
10.603 ms (28179 allocations: 1.73 MiB) | |
forw and back | |
14.578 ms (40328 allocations: 2.75 MiB) | |
GRU CPU c=128 n=512 ts=64 | |
forward | |
22.706 ms (512 allocations: 9.99 MiB) | |
backward | |
35.436 ms (13001 allocations: 71.24 MiB) | |
forw and back | |
59.069 ms (26074 allocations: 76.92 MiB) | |
GRU CUDA c=128 n=512 ts=64 | |
forward | |
4.426 ms (16702 allocations: 1.85 MiB) | |
backward | |
20.861 ms (56275 allocations: 3.45 MiB) | |
forw and back | |
28.711 ms (80456 allocations: 5.49 MiB) | |
GRU CPU c=512 n=32 ts=4 | |
forward | |
308.929 μs (25 allocations: 39.11 KiB) | |
backward | |
949.953 μs (859 allocations: 1.02 MiB) | |
forw and back | |
1.598 ms (1732 allocations: 979.06 KiB) | |
GRU CUDA c=512 n=32 ts=4 | |
forward | |
396.805 μs (1025 allocations: 117.95 KiB) | |
backward | |
1.551 ms (3498 allocations: 219.78 KiB) | |
forw and back | |
2.157 ms (5077 allocations: 356.56 KiB) | |
GRU CPU c=512 n=32 ts=16 | |
forward | |
1.251 ms (97 allocations: 165.20 KiB) | |
backward | |
3.884 ms (3212 allocations: 4.26 MiB) | |
forw and back | |
6.193 ms (6462 allocations: 3.99 MiB) | |
GRU CUDA c=512 n=32 ts=16 | |
forward | |
1.193 ms (4097 allocations: 471.67 KiB) | |
backward | |
5.133 ms (13735 allocations: 869.50 KiB) | |
forw and back | |
7.216 ms (19743 allocations: 1.36 MiB) | |
GRU CPU c=512 n=32 ts=32 | |
forward | |
2.510 ms (193 allocations: 333.33 KiB) | |
backward | |
7.843 ms (6348 allocations: 8.57 MiB) | |
forw and back | |
12.336 ms (12766 allocations: 8.04 MiB) | |
GRU CUDA c=512 n=32 ts=32 | |
forward | |
2.221 ms (8193 allocations: 943.30 KiB) | |
backward | |
9.822 ms (27383 allocations: 1.70 MiB) | |
forw and back | |
13.770 ms (39295 allocations: 2.71 MiB) | |
GRU CPU c=512 n=32 ts=64 | |
forward | |
5.059 ms (385 allocations: 669.59 KiB) | |
backward | |
15.663 ms (12620 allocations: 17.20 MiB) | |
forw and back | |
24.714 ms (25374 allocations: 16.13 MiB) | |
GRU CUDA c=512 n=32 ts=64 | |
forward | |
4.358 ms (16385 allocations: 1.84 MiB) | |
backward | |
19.248 ms (54679 allocations: 3.39 MiB) | |
forw and back | |
26.556 ms (78399 allocations: 5.42 MiB) | |
GRU CPU c=512 n=128 ts=4 | |
forward | |
535.009 μs (25 allocations: 151.11 KiB) | |
backward | |
1.459 ms (859 allocations: 2.92 MiB) | |
forw and back | |
2.211 ms (1732 allocations: 2.33 MiB) | |
GRU CUDA c=512 n=128 ts=4 | |
forward | |
388.542 μs (1041 allocations: 118.20 KiB) | |
backward | |
1.544 ms (3534 allocations: 220.34 KiB) | |
forw and back | |
2.170 ms (5113 allocations: 357.12 KiB) | |
GRU CPU c=512 n=128 ts=16 | |
forward | |
2.239 ms (97 allocations: 640.20 KiB) | |
backward | |
6.088 ms (3212 allocations: 11.96 MiB) | |
forw and back | |
8.836 ms (6462 allocations: 9.64 MiB) | |
GRU CUDA c=512 n=128 ts=16 | |
forward | |
1.186 ms (4161 allocations: 472.67 KiB) | |
backward | |
5.222 ms (13879 allocations: 871.75 KiB) | |
forw and back | |
7.237 ms (19887 allocations: 1.36 MiB) | |
GRU CPU c=512 n=128 ts=32 | |
forward | |
4.584 ms (193 allocations: 1.26 MiB) | |
backward | |
12.185 ms (6348 allocations: 24.02 MiB) | |
forw and back | |
17.606 ms (12766 allocations: 19.38 MiB) | |
GRU CUDA c=512 n=128 ts=32 | |
forward | |
2.194 ms (8321 allocations: 945.30 KiB) | |
backward | |
10.055 ms (27671 allocations: 1.70 MiB) | |
forw and back | |
13.803 ms (39583 allocations: 2.72 MiB) | |
GRU CPU c=512 n=128 ts=64 | |
forward | |
9.301 ms (385 allocations: 2.54 MiB) | |
backward | |
24.435 ms (12620 allocations: 48.14 MiB) | |
forw and back | |
35.196 ms (25374 allocations: 38.87 MiB) | |
GRU CUDA c=512 n=128 ts=64 | |
forward | |
4.295 ms (16641 allocations: 1.85 MiB) | |
backward | |
19.502 ms (55255 allocations: 3.40 MiB) | |
forw and back | |
26.931 ms (78975 allocations: 5.43 MiB) | |
GRU CPU c=512 n=512 ts=4 | |
forward | |
1.608 ms (32 allocations: 594.56 KiB) | |
backward | |
3.340 ms (887 allocations: 10.48 MiB) | |
forw and back | |
5.084 ms (1779 allocations: 7.80 MiB) | |
GRU CUDA c=512 n=512 ts=4 | |
forward | |
435.326 μs (1062 allocations: 118.53 KiB) | |
backward | |
1.656 ms (3626 allocations: 224.09 KiB) | |
forw and back | |
2.307 ms (5266 allocations: 362.59 KiB) | |
GRU CPU c=512 n=512 ts=16 | |
forward | |
7.007 ms (128 allocations: 2.46 MiB) | |
backward | |
15.175 ms (3336 allocations: 42.73 MiB) | |
forw and back | |
19.999 ms (6665 allocations: 32.12 MiB) | |
GRU CUDA c=512 n=512 ts=16 | |
forward | |
1.338 ms (4254 allocations: 474.12 KiB) | |
backward | |
5.731 ms (14259 allocations: 886.94 KiB) | |
forw and back | |
7.789 ms (20472 allocations: 1.38 MiB) | |
GRU CPU c=512 n=512 ts=32 | |
forward | |
14.150 ms (256 allocations: 4.97 MiB) | |
backward | |
30.405 ms (6600 allocations: 85.71 MiB) | |
forw and back | |
39.853 ms (13177 allocations: 64.54 MiB) | |
GRU CUDA c=512 n=512 ts=32 | |
forward | |
2.516 ms (8510 allocations: 948.25 KiB) | |
backward | |
11.757 ms (28435 allocations: 1.73 MiB) | |
forw and back | |
15.063 ms (40744 allocations: 2.76 MiB) | |
GRU CPU c=512 n=512 ts=64 | |
forward | |
28.400 ms (512 allocations: 9.99 MiB) | |
backward | |
60.943 ms (13128 allocations: 171.69 MiB) | |
forw and back | |
79.459 ms (26201 allocations: 129.37 MiB) | |
GRU CUDA c=512 n=512 ts=64 | |
forward | |
4.888 ms (17022 allocations: 1.85 MiB) | |
backward | |
22.380 ms (56787 allocations: 3.46 MiB) | |
forw and back | |
29.432 ms (81288 allocations: 5.50 MiB) |
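# Benchmark driver: times Flux LSTM and GRU layers on CPU and GPU across
# feature sizes (c), batch sizes (n) and sequence lengths (ts).
# NOTE: `run_benchmark` is not defined in this file and must be loaded separately.
using Flux
using CUDA  # assumed to be required for `gpu`; some Flux versions load it themselves

# Wraps a recurrent layer so that one call consumes a whole sequence.
struct RNNWrapper{T}
    rnn::T
end
Flux.@functor RNNWrapper

# Reset the hidden state, run every timestep, and return the last output.
function (r::RNNWrapper)(xs)
    Flux.reset!(r.rnn)
    [r.rnn(x) for x in xs][end]
end

for rnn_type in [Flux.LSTM, Flux.GRU]
    for c in [32, 128, 512]
        rnn = RNNWrapper(rnn_type(c, 8))
        grnn = gpu(rnn)
        for n in [32, 128, 512]
            for ts in [4, 16, 64]
                # Sequence of `ts` inputs, each of size features × batch.
                xs = [randn(Float32, c, n) for _ in 1:ts]
                println("$rnn_type CPU c=$c n=$n ts=$ts")
                run_benchmark(rnn, xs, cuda=false)
                println("$rnn_type CUDA c=$c n=$n ts=$ts")
                run_benchmark(grnn, xs, cuda=true)
            end
        end
    end
end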
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "params": [
    {
      "name": "device",
      "select": {
        "type": "point",
        "fields": ["device"]
      },
      "bind": "legend"
    }
  ],
  "data": {
    "name": "source",
    "url": "view_vs_slice.csv",
    "format": {
      "type": "csv"
    }
  },
  "mark": "point",
  "encoding": {
    "y": {
      "field": "timesteps",
      "type": "ordinal",
      "sort": [4, 16, 64]
    },
    "x": {
      "field": "time_μs",
      "type": "quantitative",
      "scale": {
        "type": "log"
      }
    },
    "column": {
      "field": "features",
      "type": "quantitative"
    },
    "row": {
      "field": "batch_size",
      "type": "quantitative"
    },
    "color": {
      "field": "branch",
      "type": "nominal"
    },
    "shape": {
      "field": "device",
      "type": "nominal"
    },
    "opacity": {
      "condition": {"param": "device", "value": 1},
      "value": 0.2
    }
  }
}
branch | rnn_type | device | features | batch_size | timesteps | passes | time_μs | alloc_num | alloc_KiB | |
---|---|---|---|---|---|---|---|---|---|---|
master | LSTM | CPU | 32 | 32 | 4 | forward | 40.2490005493164 | 53 | 88.1399993896484 | |
master | LSTM | CPU | 32 | 32 | 4 | backward | 155.457000732422 | 608 | 233.699996948242 | |
master | LSTM | CPU | 32 | 32 | 4 | forw and back | 336.703002929688 | 1495 | 405.920013427734 | |
master | LSTM | CUDA | 32 | 32 | 4 | forward | 449.045989990234 | 1289 | 70.6999969482422 | |
master | LSTM | CUDA | 32 | 32 | 4 | backward | 1079.0 | 3075 | 208.309997558594 | |
master | LSTM | CUDA | 32 | 32 | 4 | forw and back | 1702.0 | 5005 | 321.980010986328 | |
master | LSTM | CPU | 32 | 32 | 16 | forward | 177.645004272461 | 209 | 364.230010986328 | |
master | LSTM | CPU | 32 | 32 | 16 | backward | 584.898986816406 | 2300 | 950.72998046875 | |
master | LSTM | CPU | 32 | 32 | 16 | forw and back | 1219.0 | 5575 | 1590.0 | |
master | LSTM | CUDA | 32 | 32 | 16 | forward | 1384.0 | 5153 | 282.670013427734 | |
master | LSTM | CUDA | 32 | 32 | 16 | backward | 3566.0 | 12219 | 831.169982910156 | |
master | LSTM | CUDA | 32 | 32 | 16 | forw and back | 5574.0 | 19597 | 1230.0 | |
master | LSTM | CPU | 32 | 32 | 32 | forward | 360.834991455078 | 417 | 732.359985351562 | |
master | LSTM | CPU | 32 | 32 | 32 | backward | 1154.0 | 4557 | 1860.0 | |
master | LSTM | CPU | 32 | 32 | 32 | forw and back | 2370.0 | 11017 | 3190.0 | |
master | LSTM | CUDA | 32 | 32 | 32 | forward | 2631.0 | 10305 | 565.299987792969 | |
master | LSTM | CUDA | 32 | 32 | 32 | backward | 6852.0 | 24412 | 1620.0 | |
master | LSTM | CUDA | 32 | 32 | 32 | forw and back | 10746.0 | 39055 | 2460.0 | |
master | LSTM | CPU | 32 | 32 | 64 | forward | 733.223999023438 | 833 | 1430.0 | |
master | LSTM | CPU | 32 | 32 | 64 | backward | 2329.0 | 9069 | 3730.0 | |
master | LSTM | CPU | 32 | 32 | 64 | forw and back | 4713.0 | 21897 | 6380.0 | |
master | LSTM | CUDA | 32 | 32 | 64 | forward | 5068.0 | 20609 | 1100.0 | |
master | LSTM | CUDA | 32 | 32 | 64 | backward | 13402.0 | 48796 | 3240.0 | |
master | LSTM | CUDA | 32 | 32 | 64 | forw and back | 21120.0 | 77967 | 4910.0 | |
master | LSTM | CPU | 32 | 128 | 4 | forward | 153.962997436523 | 53 | 342.640014648437 | |
master | LSTM | CPU | 32 | 128 | 4 | backward | 273.554992675781 | 608 | 722.830017089844 | |
master | LSTM | CPU | 32 | 128 | 4 | forw and back | 597.818969726562 | 1495 | 1160.0 | |
master | LSTM | CUDA | 32 | 128 | 4 | forward | 467.039001464844 | 1297 | 70.8300018310547 | |
master | LSTM | CUDA | 32 | 128 | 4 | backward | 1109.0 | 3119 | 209.0 | |
master | LSTM | CUDA | 32 | 128 | 4 | forw and back | 1761.0 | 5049 | 322.670013427734 | |
master | LSTM | CPU | 32 | 128 | 16 | forward | 644.987976074219 | 209 | 1380.0 | |
master | LSTM | CPU | 32 | 128 | 16 | backward | 1065.0 | 2300 | 2860.0 | |
master | LSTM | CPU | 32 | 128 | 16 | forw and back | 2241.0 | 5575 | 4700.0 | |
master | LSTM | CUDA | 32 | 128 | 16 | forward | 1426.0 | 5185 | 283.170013427734 | |
master | LSTM | CUDA | 32 | 128 | 16 | backward | 3689.0 | 12395 | 833.919982910156 | |
master | LSTM | CUDA | 32 | 128 | 16 | forw and back | 5764.0 | 19773 | 1240.0 | |
master | LSTM | CPU | 32 | 128 | 32 | forward | 1302.0 | 417 | 2790.0 | |
master | LSTM | CPU | 32 | 128 | 32 | backward | 2168.0 | 4557 | 5730.0 | |
master | LSTM | CPU | 32 | 128 | 32 | forw and back | 4473.0 | 11017 | 9410.0 | |
master | LSTM | CUDA | 32 | 128 | 32 | forward | 2663.0 | 10369 | 566.299987792969 | |
master | LSTM | CUDA | 32 | 128 | 32 | backward | 7051.0 | 24764 | 1630.0 | |
master | LSTM | CUDA | 32 | 128 | 32 | forw and back | 10977.0 | 39407 | 2460.0 | |
master | LSTM | CPU | 32 | 128 | 64 | forward | 2627.0 | 833 | 5590.0 | |
master | LSTM | CPU | 32 | 128 | 64 | backward | 4497.0 | 9069 | 11460.0 | |
master | LSTM | CPU | 32 | 128 | 64 | forw and back | 8940.0 | 21897 | 18840.0 | |
master | LSTM | CUDA | 32 | 128 | 64 | forward | 5145.0 | 20737 | 1110.0 | |
master | LSTM | CUDA | 32 | 128 | 64 | backward | 13790.0 | 49500 | 3260.0 | |
master | LSTM | CUDA | 32 | 128 | 64 | forw and back | 21547.0 | 78671 | 4920.0 | |
master | LSTM | CPU | 32 | 512 | 4 | forward | 1477.0 | 64 | 1320.0 | |
master | LSTM | CPU | 32 | 512 | 4 | backward | 1701.0 | 636 | 2600.0 | |
master | LSTM | CPU | 32 | 512 | 4 | forw and back | 3646.0 | 1546 | 4180.0 | |
master | LSTM | CUDA | 32 | 512 | 4 | forward | 468.97900390625 | 1334 | 71.6600036621094 | |
master | LSTM | CUDA | 32 | 512 | 4 | backward | 1175.0 | 3147 | 209.440002441406 | |
master | LSTM | CUDA | 32 | 512 | 4 | forw and back | 1834.0 | 5146 | 325.079986572266 | |
master | LSTM | CPU | 32 | 512 | 16 | forward | 5619.0 | 256 | 5460.0 | |
master | LSTM | CPU | 32 | 512 | 16 | backward | 7090.0 | 2412 | 10510.0 | |
master | LSTM | CPU | 32 | 512 | 16 | forw and back | 14776.0 | 5782 | 16990.0 | |
master | LSTM | CUDA | 32 | 512 | 16 | forward | 1435.0 | 5342 | 286.619995117187 | |
master | LSTM | CUDA | 32 | 512 | 16 | backward | 3871.0 | 12519 | 835.859985351563 | |
master | LSTM | CUDA | 32 | 512 | 16 | forw and back | 6144.0 | 20134 | 1240.0 | |
master | LSTM | CPU | 32 | 512 | 32 | forward | 12595.0 | 512 | 10980.0 | |
master | LSTM | CPU | 32 | 512 | 32 | backward | 14100.0 | 4781 | 21060.0 | |
master | LSTM | CPU | 32 | 512 | 32 | forw and back | 29444.0 | 11432 | 34070.0 | |
master | LSTM | CUDA | 32 | 512 | 32 | forward | 2721.0 | 10686 | 573.25 | |
master | LSTM | CUDA | 32 | 512 | 32 | backward | 7438.0 | 25016 | 1630.0 | |
master | LSTM | CUDA | 32 | 512 | 32 | forw and back | 11761.0 | 40120 | 2480.0 | |
master | LSTM | CPU | 32 | 512 | 64 | forward | 25441.0 | 1024 | 22030.0 | |
master | LSTM | CPU | 32 | 512 | 64 | backward | 28548.0 | 9517 | 42150.0 | |
master | LSTM | CPU | 32 | 512 | 64 | forw and back | 58672.0 | 22728 | 68230.0 | |
master | LSTM | CUDA | 32 | 512 | 64 | forward | 5208.0 | 21374 | 1120.0 | |
master | LSTM | CUDA | 32 | 512 | 64 | backward | 14488.0 | 50008 | 3260.0 | |
master | LSTM | CUDA | 32 | 512 | 64 | forw and back | 23170.0 | 80088 | 4940.0 | |
master | LSTM | CPU | 128 | 32 | 4 | forward | 53.7029991149902 | 53 | 88.1399993896484 | |
master | LSTM | CPU | 128 | 32 | 4 | backward | 196.535995483398 | 608 | 413.700012207031 | |
master | LSTM | CPU | 128 | 32 | 4 | forw and back | 399.234985351562 | 1495 | 537.919982910156 | |
master | LSTM | CUDA | 128 | 32 | 4 | forward | 484.81298828125 | 1289 | 70.6999969482422 | |
master | LSTM | CUDA | 128 | 32 | 4 | backward | 1125.0 | 3075 | 208.309997558594 | |
master | LSTM | CUDA | 128 | 32 | 4 | forw and back | 1769.0 | 5005 | 321.980010986328 | |
master | LSTM | CPU | 128 | 32 | 16 | forward | 232.432998657227 | 209 | 364.230010986328 | |
master | LSTM | CPU | 128 | 32 | 16 | backward | 741.138000488281 | 2300 | 1670.0 | |
master | LSTM | CPU | 128 | 32 | 16 | forw and back | 1420.0 | 5575 | 2140.0 | |
master | LSTM | CUDA | 128 | 32 | 16 | forward | 1469.0 | 5153 | 282.670013427734 | |
master | LSTM | CUDA | 128 | 32 | 16 | backward | 3697.0 | 12219 | 831.169982910156 | |
master | LSTM | CUDA | 128 | 32 | 16 | forw and back | 5811.0 | 19597 | 1230.0 | |
master | LSTM | CPU | 128 | 32 | 32 | forward | 475.834991455078 | 417 | 732.359985351562 | |
master | LSTM | CPU | 128 | 32 | 32 | backward | 1496.0 | 4557 | 3350.0 | |
master | LSTM | CPU | 128 | 32 | 32 | forw and back | 2802.0 | 11017 | 4300.0 | |
master | LSTM | CUDA | 128 | 32 | 32 | forward | 2735.0 | 10305 | 565.299987792969 | |
master | LSTM | CUDA | 128 | 32 | 32 | backward | 7067.0 | 24412 | 1620.0 | |
master | LSTM | CUDA | 128 | 32 | 32 | forw and back | 11125.0 | 39055 | 2460.0 | |
master | LSTM | CPU | 128 | 32 | 64 | forward | 959.431030273438 | 833 | 1430.0 | |
master | LSTM | CPU | 128 | 32 | 64 | backward | 3159.0 | 9069 | 6720.0 | |
master | LSTM | CPU | 128 | 32 | 64 | forw and back | 5664.0 | 21897 | 8620.0 | |
master | LSTM | CUDA | 128 | 32 | 64 | forward | 5234.0 | 20609 | 1100.0 | |
master | LSTM | CUDA | 128 | 32 | 64 | backward | 13496.0 | 48796 | 3240.0 | |
master | LSTM | CUDA | 128 | 32 | 64 | forw and back | 21335.0 | 77967 | 4910.0 | |
master | LSTM | CPU | 128 | 128 | 4 | forward | 311.561004638672 | 53 | 342.640014648437 | |
master | LSTM | CPU | 128 | 128 | 4 | backward | 1003.0 | 616 | 1160.0 | |
master | LSTM | CPU | 128 | 128 | 4 | forw and back | 1899.0 | 1499 | 1430.0 | |
master | LSTM | CUDA | 128 | 128 | 4 | forward | 476.438995361328 | 1297 | 70.8300018310547 | |
master | LSTM | CUDA | 128 | 128 | 4 | backward | 1125.0 | 3119 | 209.0 | |
master | LSTM | CUDA | 128 | 128 | 4 | forw and back | 1769.0 | 5049 | 322.670013427734 | |
master | LSTM | CPU | 128 | 128 | 16 | forward | 2236.0 | 209 | 1380.0 | |
master | LSTM | CPU | 128 | 128 | 16 | backward | 4116.0 | 2332 | 4720.0 | |
master | LSTM | CPU | 128 | 128 | 16 | forw and back | 7460.0 | 5591 | 5810.0 | |
master | LSTM | CUDA | 128 | 128 | 16 | forward | 1438.0 | 5185 | 283.170013427734 | |
master | LSTM | CUDA | 128 | 128 | 16 | backward | 3694.0 | 12395 | 833.919982910156 | |
master | LSTM | CUDA | 128 | 128 | 16 | forw and back | 5800.0 | 19773 | 1240.0 | |
master | LSTM | CPU | 128 | 128 | 32 | forward | 4504.0 | 417 | 2790.0 | |
master | LSTM | CPU | 128 | 128 | 32 | backward | 8002.0 | 4621 | 9460.0 | |
master | LSTM | CPU | 128 | 128 | 32 | forw and back | 14503.0 | 11049 | 11650.0 | |
master | LSTM | CUDA | 128 | 128 | 32 | forward | 2700.0 | 10369 | 566.299987792969 | |
master | LSTM | CUDA | 128 | 128 | 32 | backward | 7140.0 | 24764 | 1630.0 | |
master | LSTM | CUDA | 128 | 128 | 32 | forw and back | 11057.0 | 39407 | 2460.0 | |
master | LSTM | CPU | 128 | 128 | 64 | forward | 9043.0 | 833 | 5590.0 | |
master | LSTM | CPU | 128 | 128 | 64 | backward | 16704.0 | 9197 | 18940.0 | |
master | LSTM | CPU | 128 | 128 | 64 | forw and back | 29730.0 | 21961 | 23320.0 | |
master | LSTM | CUDA | 128 | 128 | 64 | forward | 5205.0 | 20737 | 1110.0 | |
master | LSTM | CUDA | 128 | 128 | 64 | backward | 14014.0 | 49500 | 3260.0 | |
master | LSTM | CUDA | 128 | 128 | 64 | forw and back | 21858.0 | 78671 | 4920.0 | |
master | LSTM | CPU | 128 | 512 | 4 | forward | 1537.0 | 64 | 1320.0 | |
master | LSTM | CPU | 128 | 512 | 4 | backward | 2000.0 | 636 | 4180.0 | |
master | LSTM | CPU | 128 | 512 | 4 | forw and back | 3908.0 | 1546 | 5010.0 | |
master | LSTM | CUDA | 128 | 512 | 4 | forward | 498.242004394531 | 1334 | 71.6600036621094 | |
master | LSTM | CUDA | 128 | 512 | 4 | backward | 1193.0 | 3147 | 209.440002441406 | |
master | LSTM | CUDA | 128 | 512 | 4 | forw and back | 1856.0 | 5146 | 325.079986572266 | |
master | LSTM | CPU | 128 | 512 | 16 | forward | 6646.0 | 256 | 5460.0 | |
master | LSTM | CPU | 128 | 512 | 16 | backward | 8319.0 | 2412 | 16880.0 | |
master | LSTM | CPU | 128 | 512 | 16 | forw and back | 15881.0 | 5782 | 20350.0 | |
master | LSTM | CUDA | 128 | 512 | 16 | forward | 1488.0 | 5342 | 286.619995117187 | |
master | LSTM | CUDA | 128 | 512 | 16 | backward | 3926.0 | 12519 | 835.859985351563 | |
master | LSTM | CUDA | 128 | 512 | 16 | forw and back | 6031.0 | 20134 | 1240.0 | |
master | LSTM | CPU | 128 | 512 | 32 | forward | 13529.0 | 512 | 10980.0 | |
master | LSTM | CPU | 128 | 512 | 32 | backward | 16747.0 | 4781 | 33800.0 | |
master | LSTM | CPU | 128 | 512 | 32 | forw and back | 31685.0 | 11432 | 40810.0 | |
master | LSTM | CUDA | 128 | 512 | 32 | forward | 2830.0 | 10686 | 573.25 | |
master | LSTM | CUDA | 128 | 512 | 32 | backward | 7573.0 | 25016 | 1630.0 | |
master | LSTM | CUDA | 128 | 512 | 32 | forw and back | 11650.0 | 40120 | 2480.0 | |
master | LSTM | CPU | 128 | 512 | 64 | forward | 27174.0 | 1024 | 22030.0 | |
master | LSTM | CPU | 128 | 512 | 64 | backward | 33567.0 | 9517 | 67640.0 | |
master | LSTM | CPU | 128 | 512 | 64 | forw and back | 63216.0 | 22728 | 81710.0 | |
master | LSTM | CUDA | 128 | 512 | 64 | forward | 5410.0 | 21374 | 1120.0 | |
master | LSTM | CUDA | 128 | 512 | 64 | backward | 14782.0 | 50008 | 3260.0 | |
master | LSTM | CUDA | 128 | 512 | 64 | forw and back | 22877.0 | 80088 | 4940.0 | |
master | LSTM | CPU | 512 | 32 | 4 | forward | 296.654998779297 | 53 | 88.1399993896484 | |
master | LSTM | CPU | 512 | 32 | 4 | backward | 902.289978027344 | 623 | 1110.0 | |
master | LSTM | CPU | 512 | 32 | 4 | forw and back | 1507.0 | 1506 | 1040.0 | |
master | LSTM | CUDA | 512 | 32 | 4 | forward | 495.875 | 1309 | 71.0199966430664 | |
master | LSTM | CUDA | 512 | 32 | 4 | backward | 1152.0 | 3107 | 208.809997558594 | |
master | LSTM | CUDA | 512 | 32 | 4 | forw and back | 1801.0 | 5057 | 322.799987792969 | |
master | LSTM | CPU | 512 | 32 | 16 | forward | 1234.0 | 209 | 364.230010986328 | |
master | LSTM | CPU | 512 | 32 | 16 | backward | 3671.0 | 2363 | 4620.0 | |
master | LSTM | CPU | 512 | 32 | 16 | forw and back | 5887.0 | 5622 | 4340.0 | |
master | LSTM | CUDA | 512 | 32 | 16 | forward | 1492.0 | 5233 | 283.920013427734 | |
master | LSTM | CUDA | 512 | 32 | 16 | backward | 3738.0 | 12347 | 833.169982910156 | |
master | LSTM | CUDA | 512 | 32 | 16 | forw and back | 5898.0 | 19805 | 1240.0 | |
master | LSTM | CPU | 512 | 32 | 32 | forward | 2483.0 | 417 | 732.359985351562 | |
master | LSTM | CPU | 512 | 32 | 32 | backward | 7368.0 | 4684 | 9290.0 | |
master | LSTM | CPU | 512 | 32 | 32 | forw and back | 11759.0 | 11112 | 8750.0 | |
master | LSTM | CUDA | 512 | 32 | 32 | forward | 2761.0 | 10465 | 567.799987792969 | |
master | LSTM | CUDA | 512 | 32 | 32 | backward | 7149.0 | 24668 | 1630.0 | |
master | LSTM | CUDA | 512 | 32 | 32 | forw and back | 11274.0 | 39471 | 2460.0 | |
master | LSTM | CPU | 512 | 32 | 64 | forward | 5011.0 | 833 | 1430.0 | |
master | LSTM | CPU | 512 | 32 | 64 | backward | 12625.0 | 9324 | 18650.0 | |
master | LSTM | CPU | 512 | 32 | 64 | forw and back | 23286.0 | 22088 | 17560.0 | |
master | LSTM | CUDA | 512 | 32 | 64 | forward | 5330.0 | 20929 | 1110.0 | |
master | LSTM | CUDA | 512 | 32 | 64 | backward | 13971.0 | 49308 | 3250.0 | |
master | LSTM | CUDA | 512 | 32 | 64 | forw and back | 22006.0 | 78799 | 4920.0 | |
master | LSTM | CPU | 512 | 128 | 4 | forward | 598.161010742188 | 53 | 342.640014648437 | |
master | LSTM | CPU | 512 | 128 | 4 | backward | 1382.0 | 623 | 2990.0 | |
master | LSTM | CPU | 512 | 128 | 4 | forw and back | 2212.0 | 1506 | 2510.0 | |
master | LSTM | CUDA | 512 | 128 | 4 | forward | 529.643981933594 | 1317 | 71.1399993896484 | |
master | LSTM | CUDA | 512 | 128 | 4 | backward | 1173.0 | 3151 | 209.5 | |
master | LSTM | CUDA | 512 | 128 | 4 | forw and back | 1868.0 | 5101 | 323.480010986328 | |
master | LSTM | CPU | 512 | 128 | 16 | forward | 2532.0 | 209 | 1380.0 | |
master | LSTM | CPU | 512 | 128 | 16 | backward | 5782.0 | 2363 | 12170.0 | |
master | LSTM | CPU | 512 | 128 | 16 | forw and back | 8975.0 | 5622 | 10260.0 | |
master | LSTM | CUDA | 512 | 128 | 16 | forward | 1601.0 | 5265 | 284.420013427734 | |
master | LSTM | CUDA | 512 | 128 | 16 | backward | 3813.0 | 12523 | 835.919982910156 | |
master | LSTM | CUDA | 512 | 128 | 16 | forw and back | 6066.0 | 19981 | 1240.0 | |
master | LSTM | CPU | 512 | 128 | 32 | forward | 5206.0 | 417 | 2790.0 | |
master | LSTM | CPU | 512 | 128 | 32 | backward | 9077.0 | 4684 | 24410.0 | |
master | LSTM | CPU | 512 | 128 | 32 | forw and back | 17836.0 | 11112 | 20590.0 | |
master | LSTM | CUDA | 512 | 128 | 32 | forward | 2990.0 | 10529 | 568.799987792969 | |
master | LSTM | CUDA | 512 | 128 | 32 | backward | 7328.0 | 25020 | 1630.0 | |
master | LSTM | CUDA | 512 | 128 | 32 | forw and back | 11564.0 | 39823 | 2470.0 | |
master | LSTM | CPU | 512 | 128 | 64 | forward | 10575.0 | 833 | 5590.0 | |
master | LSTM | CPU | 512 | 128 | 64 | backward | 23239.0 | 9324 | 48880.0 | |
master | LSTM | CPU | 512 | 128 | 64 | forw and back | 35565.0 | 22088 | 41270.0 | |
master | LSTM | CUDA | 512 | 128 | 64 | forward | 5804.0 | 21057 | 1110.0 | |
master | LSTM | CUDA | 512 | 128 | 64 | backward | 14272.0 | 50012 | 3260.0 | |
master | LSTM | CUDA | 512 | 128 | 64 | forw and back | 22629.0 | 79503 | 4930.0 | |
master | LSTM | CPU | 512 | 512 | 4 | forward | 1775.0 | 64 | 1320.0 | |
master | LSTM | CPU | 512 | 512 | 4 | backward | 3575.0 | 643 | 10510.0 | |
master | LSTM | CPU | 512 | 512 | 4 | forw and back | 5246.0 | 1553 | 8340.0 | |
master | LSTM | CUDA | 512 | 512 | 4 | forward | 524.591979980469 | 1354 | 71.9700012207031 | |
master | LSTM | CUDA | 512 | 512 | 4 | backward | 1235.0 | 3179 | 209.940002441406 | |
master | LSTM | CUDA | 512 | 512 | 4 | forw and back | 1906.0 | 5198 | 325.890014648437 | |
master | LSTM | CPU | 512 | 512 | 16 | forward | 7986.0 | 256 | 5460.0 | |
master | LSTM | CPU | 512 | 512 | 16 | backward | 14480.0 | 2443 | 42330.0 | |
master | LSTM | CPU | 512 | 512 | 16 | forw and back | 20785.0 | 5813 | 33800.0 | |
master | LSTM | CUDA | 512 | 512 | 16 | forward | 1586.0 | 5422 | 287.880004882812 | |
master | LSTM | CUDA | 512 | 512 | 16 | backward | 4138.0 | 12647 | 837.859985351563 | |
master | LSTM | CUDA | 512 | 512 | 16 | forw and back | 6273.0 | 20342 | 1250.0 | |
master | LSTM | CPU | 512 | 512 | 32 | forward | 16118.0 | 512 | 10980.0 | |
master | LSTM | CPU | 512 | 512 | 32 | backward | 29159.0 | 4844 | 84750.0 | |
master | LSTM | CPU | 512 | 512 | 32 | forw and back | 41383.0 | 11495 | 67750.0 | |
master | LSTM | CUDA | 512 | 512 | 32 | forward | 3017.0 | 10846 | 575.75 | |
master | LSTM | CUDA | 512 | 512 | 32 | backward | 7970.0 | 25272 | 1640.0 | |
master | LSTM | CUDA | 512 | 512 | 32 | forw and back | 12031.0 | 40536 | 2480.0 | |
master | LSTM | CPU | 512 | 512 | 64 | forward | 32323.001953125 | 1024 | 22030.0 | |
master | LSTM | CPU | 512 | 512 | 64 | backward | 58327.0 | 9644 | 169590.0 | |
master | LSTM | CPU | 512 | 512 | 64 | forw and back | 82442.0 | 22855 | 135660.0 | |
master | LSTM | CUDA | 512 | 512 | 64 | forward | 5773.0 | 21694 | 1120.0 | |
master | LSTM | CUDA | 512 | 512 | 64 | backward | 16018.9990234375 | 50520 | 3270.0 | |
master | LSTM | CUDA | 512 | 512 | 64 | forw and back | 23525.0 | 80920 | 4960.0 | |
master | GRU | CPU | 32 | 32 | 4 | forward | 37.0559997558594 | 49 | 61.7000007629395 | |
master | GRU | CPU | 32 | 32 | 4 | backward | 171.009002685547 | 844 | 246.169998168945 | |
master | GRU | CPU | 32 | 32 | 4 | forw and back | 350.778991699219 | 1737 | 428.230010986328 | |
master | GRU | CUDA | 32 | 32 | 4 | forward | 486.434997558594 | 1429 | 82.1999969482422 | |
master | GRU | CUDA | 32 | 32 | 4 | backward | 1490.0 | 3466 | 213.529998779297 | |
master | GRU | CUDA | 32 | 32 | 4 | forw and back | 2230.0 | 5809 | 364.190002441406 | |
master | GRU | CPU | 32 | 32 | 16 | forward | 164.147994995117 | 193 | 264.299987792969 | |
master | GRU | CPU | 32 | 32 | 16 | backward | 672.080017089844 | 3149 | 1000.0 | |
master | GRU | CPU | 32 | 32 | 16 | forw and back | 1278.0 | 6479 | 1720.0 | |
master | GRU | CUDA | 32 | 32 | 16 | forward | 1452.0 | 5713 | 328.670013427734 | |
master | GRU | CUDA | 32 | 32 | 16 | backward | 4948.0 | 13607 | 844.5 | |
master | GRU | CUDA | 32 | 32 | 16 | forw and back | 7461.0 | 22671 | 1390.0 | |
master | GRU | CPU | 32 | 32 | 32 | forward | 338.151000976562 | 385 | 534.419982910156 | |
master | GRU | CPU | 32 | 32 | 32 | backward | 1315.0 | 6221 | 2020.0 | |
master | GRU | CPU | 32 | 32 | 32 | forw and back | 2501.0 | 12799 | 3440.0 | |
master | GRU | CUDA | 32 | 32 | 32 | forward | 2712.0 | 11425 | 657.299987792969 | |
master | GRU | CUDA | 32 | 32 | 32 | backward | 9540.0 | 27127 | 1650.0 | |
master | GRU | CUDA | 32 | 32 | 32 | forw and back | 14249.0 | 45151 | 2770.0 | |
master | GRU | CPU | 32 | 32 | 64 | forward | 682.411987304688 | 769 | 1050.0 | |
master | GRU | CPU | 32 | 32 | 64 | backward | 2647.0 | 12365 | 4060.0 | |
master | GRU | CPU | 32 | 32 | 64 | forw and back | 4982.0 | 25439 | 6900.0 | |
master | GRU | CUDA | 32 | 32 | 64 | forward | 5276.0 | 22849 | 1280.0 | |
master | GRU | CUDA | 32 | 32 | 64 | backward | 18634.0 | 54167 | 3290.0 | |
master | GRU | CUDA | 32 | 32 | 64 | forw and back | 27950.0 | 90111 | 5540.0 | |
master | GRU | CPU | 32 | 128 | 4 | forward | 145.167007446289 | 49 | 238.020004272461 | |
master | GRU | CPU | 32 | 128 | 4 | backward | 298.838012695312 | 844 | 747.299987792969 | |
master | GRU | CPU | 32 | 128 | 4 | forw and back | 595.940979003906 | 1737 | 1150.0 | |
master | GRU | CUDA | 32 | 128 | 4 | forward | 475.029998779297 | 1437 | 82.3300018310547 | |
master | GRU | CUDA | 32 | 128 | 4 | backward | 1497.0 | 3502 | 214.089996337891 | |
master | GRU | CUDA | 32 | 128 | 4 | forw and back | 2254.0 | 5845 | 364.75 | |
master | GRU | CPU | 32 | 128 | 16 | forward | 616.849975585938 | 193 | 1000.0 | |
master | GRU | CPU | 32 | 128 | 16 | backward | 1153.0 | 3149 | 3090.0 | |
master | GRU | CPU | 32 | 128 | 16 | forw and back | 2215.0 | 6479 | 4830.0 | |
master | GRU | CUDA | 32 | 128 | 16 | forward | 1444.0 | 5745 | 329.170013427734 | |
master | GRU | CUDA | 32 | 128 | 16 | backward | 5056.0 | 13751 | 846.75 | |
master | GRU | CUDA | 32 | 128 | 16 | forw and back | 7498.0 | 22815 | 1390.0 | |
master | GRU | CPU | 32 | 128 | 32 | forward | 1242.0 | 385 | 2020.0 | |
master | GRU | CPU | 32 | 128 | 32 | backward | 2346.0 | 6221 | 6230.0 | |
master | GRU | CPU | 32 | 128 | 32 | forw and back | 4434.0 | 12799 | 9730.0 | |
master | GRU | CUDA | 32 | 128 | 32 | forward | 2699.0 | 11489 | 658.299987792969 | |
master | GRU | CUDA | 32 | 128 | 32 | backward | 9703.0 | 27415 | 1650.0 | |
master | GRU | CUDA | 32 | 128 | 32 | forw and back | 14495.0 | 45439 | 2780.0 | |
master | GRU | CPU | 32 | 128 | 64 | forward | 2507.0 | 769 | 4070.00024414062 | |
master | GRU | CPU | 32 | 128 | 64 | backward | 4859.0 | 12365 | 12510.0 | |
master | GRU | CPU | 32 | 128 | 64 | forw and back | 8842.0 | 25439 | 19540.0 | |
master | GRU | CUDA | 32 | 128 | 64 | forward | 5261.0 | 22977 | 1290.0 | |
master | GRU | CUDA | 32 | 128 | 64 | backward | 18872.0 | 54743 | 3300.0 | |
master | GRU | CUDA | 32 | 128 | 64 | forw and back | 28257.0 | 90687 | 5550.0 | |
master | GRU | CPU | 32 | 512 | 4 | forward | 1406.0 | 56 | 933.469970703125 | |
master | GRU | CPU | 32 | 512 | 4 | backward | 1806.0 | 880 | 2670.0 | |
master | GRU | CPU | 32 | 512 | 4 | forw and back | 3487.0 | 1788 | 4060.0 | |
master | GRU | CUDA | 32 | 512 | 4 | forward | 462.93701171875 | 1458 | 82.6600036621094 | |
master | GRU | CUDA | 32 | 512 | 4 | backward | 1541.0 | 3594 | 217.839996337891 | |
master | GRU | CUDA | 32 | 512 | 4 | forw and back | 2293.0 | 5998 | 370.220001220703 | |
master | GRU | CPU | 32 | 512 | 16 | forward | 5263.0 | 224 | 3930.0 | |
master | GRU | CPU | 32 | 512 | 16 | backward | 7605.0 | 3305 | 11350.0 | |
master | GRU | CPU | 32 | 512 | 16 | forw and back | 14322.0 | 6698 | 17150.0 | |
master | GRU | CUDA | 32 | 512 | 16 | forward | 1437.0 | 5838 | 330.619995117187 | |
master | GRU | CUDA | 32 | 512 | 16 | backward | 5297.0 | 14131 | 861.940002441406 | |
master | GRU | CUDA | 32 | 512 | 16 | forw and back | 7771.0 | 23400 | 1410.0 | |
master | GRU | CPU | 32 | 512 | 32 | forward | 11747.0 | 448 | 7950.0 | |
master | GRU | CPU | 32 | 512 | 32 | backward | 15211.0 | 6537 | 22910.0 | |
master | GRU | CPU | 32 | 512 | 32 | forw and back | 28691.0 | 13242 | 34600.0 | |
master | GRU | CUDA | 32 | 512 | 32 | forward | 2719.0 | 11678 | 661.25 | |
master | GRU | CUDA | 32 | 512 | 32 | backward | 10270.0 | 28179 | 1680.0 | |
master | GRU | CUDA | 32 | 512 | 32 | forw and back | 15028.0 | 46600 | 2820.0 | |
master | GRU | CPU | 32 | 512 | 64 | forward | 23613.0 | 896 | 15990.0 | |
master | GRU | CPU | 32 | 512 | 64 | backward | 30489.0 | 13001 | 46050.0 | |
master | GRU | CPU | 32 | 512 | 64 | forw and back | 56960.0 | 26330 | 69500.0 | |
master | GRU | CUDA | 32 | 512 | 64 | forward | 5290.0 | 23358 | 1290.0 | |
master | GRU | CUDA | 32 | 512 | 64 | backward | 20239.0 | 56275 | 3360.0 | |
master | GRU | CUDA | 32 | 512 | 64 | forw and back | 30735.0 | 93000 | 5620.0 | |
master | GRU | CPU | 128 | 32 | 4 | forward | 48.1609992980957 | 49 | 61.7000007629395 | |
master | GRU | CPU | 128 | 32 | 4 | backward | 215.723007202148 | 844 | 405.170013427734 | |
master | GRU | CPU | 128 | 32 | 4 | forw and back | 408.123992919922 | 1737 | 539.22998046875 | |
master | GRU | CUDA | 128 | 32 | 4 | forward | 482.053009033203 | 1429 | 82.1999969482422 | |
master | GRU | CUDA | 128 | 32 | 4 | backward | 1496.0 | 3466 | 213.529998779297 | |
master | GRU | CUDA | 128 | 32 | 4 | forw and back | 2217.0 | 5809 | 364.190002441406 | |
master | GRU | CPU | 128 | 32 | 16 | forward | 209.447998046875 | 193 | 264.299987792969 | |
master | GRU | CPU | 128 | 32 | 16 | backward | 795.439025878906 | 3149 | 1650.0 | |
master | GRU | CPU | 128 | 32 | 16 | forw and back | 1444.0 | 6479 | 2180.0 | |
master | GRU | CUDA | 128 | 32 | 16 | forward | 1489.0 | 5713 | 328.670013427734 | |
master | GRU | CUDA | 128 | 32 | 16 | backward | 4985.0 | 13607 | 844.5 | |
master | GRU | CUDA | 128 | 32 | 16 | forw and back | 7435.0 | 22671 | 1390.0 | |
master | GRU | CPU | 128 | 32 | 32 | forward | 426.944000244141 | 385 | 534.419982910156 | |
master | GRU | CPU | 128 | 32 | 32 | backward | 1595.0 | 6221 | 3330.0 | |
master | GRU | CPU | 128 | 32 | 32 | forw and back | 2840.0 | 12799 | 4370.0 | |
master | GRU | CUDA | 128 | 32 | 32 | forward | 2745.0 | 11425 | 657.299987792969 | |
master | GRU | CUDA | 128 | 32 | 32 | backward | 9446.0 | 27127 | 1650.0 | |
master | GRU | CUDA | 128 | 32 | 32 | forw and back | 14175.0 | 45151 | 2770.0 | |
master | GRU | CPU | 128 | 32 | 64 | forward | 859.223022460937 | 769 | 1050.0 | |
master | GRU | CPU | 128 | 32 | 64 | backward | 3385.0 | 12365 | 6680.0 | |
master | GRU | CPU | 128 | 32 | 64 | forw and back | 5770.0 | 25439 | 8770.0 | |
master | GRU | CUDA | 128 | 32 | 64 | forward | 5302.0 | 22849 | 1280.0 | |
master | GRU | CUDA | 128 | 32 | 64 | backward | 18317.0 | 54167 | 3290.0 | |
master | GRU | CUDA | 128 | 32 | 64 | forw and back | 27563.0 | 90111 | 5540.0 | |
master | GRU | CPU | 128 | 128 | 4 | forward | 494.161987304687 | 49 | 238.020004272461 | |
master | GRU | CPU | 128 | 128 | 4 | backward | 1078.0 | 852 | 1170.0 | |
master | GRU | CPU | 128 | 128 | 4 | forw and back | 1902.0 | 1741 | 1400.0 | |
master | GRU | CUDA | 128 | 128 | 4 | forward | 465.380004882812 | 1437 | 82.3300018310547 | |
master | GRU | CUDA | 128 | 128 | 4 | backward | 1477.0 | 3502 | 214.089996337891 | |
master | GRU | CUDA | 128 | 128 | 4 | forw and back | 2239.0 | 5845 | 364.75 | |
master | GRU | CPU | 128 | 128 | 16 | forward | 2125.0 | 193 | 1000.0 | |
master | GRU | CPU | 128 | 128 | 16 | backward | 4355.0 | 3181 | 4860.0 | |
master | GRU | CPU | 128 | 128 | 16 | forw and back | 7485.0 | 6495 | 5850.0 | |
master | GRU | CUDA | 128 | 128 | 16 | forward | 1477.0 | 5745 | 329.170013427734 | |
master | GRU | CUDA | 128 | 128 | 16 | backward | 5129.0 | 13751 | 846.75 | |
master | GRU | CUDA | 128 | 128 | 16 | forw and back | 7632.0 | 22815 | 1390.0 | |
master | GRU | CPU | 128 | 128 | 32 | forward | 4230.0 | 385 | 2020.0 | |
master | GRU | CPU | 128 | 128 | 32 | backward | 8788.0 | 6285 | 9780.0 | |
master | GRU | CPU | 128 | 128 | 32 | forw and back | 14835.0 | 12831 | 11780.0 | |
master | GRU | CUDA | 128 | 128 | 32 | forward | 2758.0 | 11489 | 658.299987792969 | |
master | GRU | CUDA | 128 | 128 | 32 | backward | 9839.0 | 27415 | 1650.0 | |
master | GRU | CUDA | 128 | 128 | 32 | forw and back | 14647.0 | 45439 | 2780.0 | |
master | GRU | CPU | 128 | 128 | 64 | forward | 8546.0 | 769 | 4070.00024414062 | |
master | GRU | CPU | 128 | 128 | 64 | backward | 17662.0 | 12493 | 19620.0 | |
master | GRU | CPU | 128 | 128 | 64 | forw and back | 30030.0 | 25503 | 23650.0 | |
master | GRU | CUDA | 128 | 128 | 64 | forward | 5346.0 | 22977 | 1290.0 | |
master | GRU | CUDA | 128 | 128 | 64 | backward | 19017.0 | 54743 | 3300.0 | |
master | GRU | CUDA | 128 | 128 | 64 | forw and back | 28545.0 | 90687 | 5550.0 | |
master | GRU | CPU | 128 | 512 | 4 | forward | 1467.0 | 56 | 933.469970703125 | |
master | GRU | CPU | 128 | 512 | 4 | backward | 2078.0 | 880 | 4230.0 | |
master | GRU | CPU | 128 | 512 | 4 | forw and back | 3707.0 | 1788 | 4870.0 | |
master | GRU | CUDA | 128 | 512 | 4 | forward | 471.027008056641 | 1458 | 82.6600036621094 | |
master | GRU | CUDA | 128 | 512 | 4 | backward | 1575.0 | 3594 | 217.839996337891 | |
master | GRU | CUDA | 128 | 512 | 4 | forw and back | 2332.0 | 5998 | 370.220001220703 | |
master | GRU | CPU | 128 | 512 | 16 | forward | 6161.0 | 224 | 3930.0 | |
master | GRU | CPU | 128 | 512 | 16 | backward | 8740.0 | 3305 | 17620.0 | |
master | GRU | CPU | 128 | 512 | 16 | forw and back | 15252.0 | 6698 | 20420.0 | |
master | GRU | CUDA | 128 | 512 | 16 | forward | 1481.0 | 5838 | 330.619995117187 | |
master | GRU | CUDA | 128 | 512 | 16 | backward | 5411.0 | 14131 | 861.940002441406 | |
master | GRU | CUDA | 128 | 512 | 16 | forw and back | 7896.0 | 23400 | 1410.0 | |
master | GRU | CPU | 128 | 512 | 32 | forward | 10098.0 | 448 | 7950.0 | |
master | GRU | CPU | 128 | 512 | 32 | backward | 17682.0 | 6537 | 35470.0 | |
master | GRU | CPU | 128 | 512 | 32 | forw and back | 30554.0 | 13242 | 41150.0 | |
master | GRU | CUDA | 128 | 512 | 32 | forward | 2800.0 | 11678 | 661.25 | |
master | GRU | CUDA | 128 | 512 | 32 | backward | 10446.0 | 28179 | 1680.0 | |
master | GRU | CUDA | 128 | 512 | 32 | forw and back | 15398.0 | 46600 | 2820.0 | |
master | GRU | CPU | 128 | 512 | 64 | forward | 25142.0 | 896 | 15990.0 | |
master | GRU | CPU | 128 | 512 | 64 | backward | 35435.0 | 13001 | 71160.0 | |
master | GRU | CPU | 128 | 512 | 64 | forw and back | 60727.0 | 26330 | 82620.0 | |
master | GRU | CUDA | 128 | 512 | 64 | forward | 5479.0 | 23358 | 1290.0 | |
master | GRU | CUDA | 128 | 512 | 64 | backward | 20468.0 | 56275 | 3360.0 | |
master | GRU | CUDA | 128 | 512 | 64 | forw and back | 30247.0 | 93000 | 5620.0 | |
master | GRU | CPU | 512 | 32 | 4 | forward | 311.080993652344 | 49 | 61.7000007629395 | |
master | GRU | CPU | 512 | 32 | 4 | backward | 937.929992675781 | 859 | 1020.0 | |
master | GRU | CPU | 512 | 32 | 4 | forw and back | 1183.0 | 1748 | 982.380004882813 | |
master | GRU | CUDA | 512 | 32 | 4 | forward | 500.746002197266 | 1449 | 82.5199966430664 | |
master | GRU | CUDA | 512 | 32 | 4 | backward | 1524.0 | 3498 | 214.029998779297 | |
master | GRU | CUDA | 512 | 32 | 4 | forw and back | 2272.0 | 5861 | 365.0 | |
master | GRU | CPU | 512 | 32 | 16 | forward | 1297.0 | 193 | 264.299987792969 | |
master | GRU | CPU | 512 | 32 | 16 | backward | 3842.0 | 3212 | 4240.0 | |
master | GRU | CPU | 512 | 32 | 16 | forw and back | 6129.0 | 6526 | 4010.00024414062 | |
master | GRU | CUDA | 512 | 32 | 16 | forward | 1538.0 | 5793 | 329.920013427734 | |
master | GRU | CUDA | 512 | 32 | 16 | backward | 5103.0 | 13735 | 846.5 | |
master | GRU | CUDA | 512 | 32 | 16 | forw and back | 7769.0 | 22879 | 1400.0 | |
master | GRU | CPU | 512 | 32 | 32 | forward | 2623.0 | 385 | 534.419982910156 | |
master | GRU | CPU | 512 | 32 | 32 | backward | 7902.0 | 6348 | 8530.0 | |
master | GRU | CPU | 512 | 32 | 32 | forw and back | 12426.0 | 12894 | 8080.0 | |
master | GRU | CUDA | 512 | 32 | 32 | forward | 2858.0 | 11585 | 659.799987792969 | |
master | GRU | CUDA | 512 | 32 | 32 | backward | 9831.0 | 27383 | 1650.0 | |
master | GRU | CUDA | 512 | 32 | 32 | forw and back | 14704.0 | 45567 | 2780.0 | |
master | GRU | CPU | 512 | 32 | 64 | forward | 5302.0 | 769 | 1050.0 | |
master | GRU | CPU | 512 | 32 | 64 | backward | 15853.0 | 12620 | 17120.0 | |
master | GRU | CPU | 512 | 32 | 64 | forw and back | 24895.0 | 25630 | 16219.9990234375 | |
master | GRU | CUDA | 512 | 32 | 64 | forward | 5513.0 | 23169 | 1290.0 | |
master | GRU | CUDA | 512 | 32 | 64 | backward | 18977.0 | 54679 | 3300.0 | |
master | GRU | CUDA | 512 | 32 | 64 | forw and back | 28441.0 | 90943 | 5550.0 | |
master | GRU | CPU | 512 | 128 | 4 | forward | 587.296020507812 | 49 | 238.020004272461 | |
master | GRU | CPU | 512 | 128 | 4 | backward | 1421.0 | 859 | 2910.0 | |
master | GRU | CPU | 512 | 128 | 4 | forw and back | 2214.0 | 1748 | 2400.0 | |
master | GRU | CUDA | 512 | 128 | 4 | forward | 490.171997070312 | 1457 | 82.6399993896484 | |
master | GRU | CUDA | 512 | 128 | 4 | backward | 1532.0 | 3534 | 214.589996337891 | |
master | GRU | CUDA | 512 | 128 | 4 | forw and back | 2317.0 | 5897 | 365.559997558594 | |
master | GRU | CPU | 512 | 128 | 16 | forward | 2443.0 | 193 | 1000.0 | |
master | GRU | CPU | 512 | 128 | 16 | backward | 6062.0 | 3212 | 11940.0 | |
master | GRU | CPU | 512 | 128 | 16 | forw and back | 8953.0 | 6526 | 9940.0 | |
master | GRU | CUDA | 512 | 128 | 16 | forward | 1482.0 | 5825 | 330.420013427734 | |
master | GRU | CUDA | 512 | 128 | 16 | backward | 5141.0 | 13879 | 848.75 | |
master | GRU | CUDA | 512 | 128 | 16 | forw and back | 7694.0 | 23023 | 1400.0 | |
master | GRU | CPU | 512 | 128 | 32 | forward | 4997.0 | 385 | 2020.0 | |
master | GRU | CPU | 512 | 128 | 32 | backward | 12083.0 | 6348 | 23990.0 | |
master | GRU | CPU | 512 | 128 | 32 | forw and back | 17802.0 | 12894 | 19990.0 | |
master | GRU | CUDA | 512 | 128 | 32 | forward | 2768.0 | 11649 | 660.799987792969 | |
master | GRU | CUDA | 512 | 128 | 32 | backward | 9899.0 | 27671 | 1650.0 | |
master | GRU | CUDA | 512 | 128 | 32 | forw and back | 14722.0 | 45855 | 2790.0 | |
master | GRU | CPU | 512 | 128 | 64 | forward | 10132.0 | 769 | 4070.00024414062 | |
master | GRU | CPU | 512 | 128 | 64 | backward | 24323.0 | 12620 | 48070.0 | |
master | GRU | CPU | 512 | 128 | 64 | forw and back | 35536.0 | 25630 | 40100.0 | |
master | GRU | CUDA | 512 | 128 | 64 | forward | 5351.0 | 23297 | 1290.0 | |
master | GRU | CUDA | 512 | 128 | 64 | backward | 19270.0 | 55255 | 3310.0 | |
master | GRU | CUDA | 512 | 128 | 64 | forw and back | 28835.0 | 91519 | 5560.0 | |
master | GRU | CPU | 512 | 512 | 4 | forward | 1769.0 | 56 | 933.469970703125 | |
master | GRU | CPU | 512 | 512 | 4 | backward | 3704.0 | 887 | 10480.0 | |
master | GRU | CPU | 512 | 512 | 4 | forw and back | 5138.0 | 1795 | 8109.99951171875 | |
master | GRU | CUDA | 512 | 512 | 4 | forward | 522.494995117187 | 1478 | 82.9700012207031 | |
master | GRU | CUDA | 512 | 512 | 4 | backward | 1630.0 | 3626 | 218.339996337891 | |
master | GRU | CUDA | 512 | 512 | 4 | forw and back | 2422.0 | 6050 | 371.029998779297 | |
master | GRU | CPU | 512 | 512 | 16 | forward | 7603.0 | 224 | 3930.0 | |
master | GRU | CPU | 512 | 512 | 16 | backward | 14938.0 | 3336 | 42710.0 | |
master | GRU | CPU | 512 | 512 | 16 | forw and back | 20314.0 | 6729 | 33500.0 | |
master | GRU | CUDA | 512 | 512 | 16 | forward | 1633.0 | 5918 | 331.880004882812 | |
master | GRU | CUDA | 512 | 512 | 16 | backward | 5625.0 | 14259 | 863.940002441406 | |
master | GRU | CUDA | 512 | 512 | 16 | forw and back | 8189.0 | 23608 | 1420.0 | |
master | GRU | CPU | 512 | 512 | 32 | forward | 15317.0 | 448 | 7950.0 | |
master | GRU | CPU | 512 | 512 | 32 | backward | 30130.0 | 6600 | 85680.0 | |
master | GRU | CPU | 512 | 512 | 32 | forw and back | 40710.0 | 13305 | 67360.0 | |
master | GRU | CUDA | 512 | 512 | 32 | forward | 3070.0 | 11838 | 663.75 | |
master | GRU | CUDA | 512 | 512 | 32 | backward | 10983.0 | 28435 | 1680.0 | |
master | GRU | CUDA | 512 | 512 | 32 | forw and back | 15912.0 | 47016 | 2820.0 | |
master | GRU | CPU | 512 | 512 | 64 | forward | 30749.0 | 896 | 15990.0 | |
master | GRU | CPU | 512 | 512 | 64 | backward | 60205.0 | 13128 | 171620.0 | |
master | GRU | CPU | 512 | 512 | 64 | forw and back | 80993.0 | 26457 | 135070.0 | |
master | GRU | CUDA | 512 | 512 | 64 | forward | 6003.0 | 23678 | 1300.0 | |
master | GRU | CUDA | 512 | 512 | 64 | backward | 22428.0 | 56787 | 3370.0 | |
master | GRU | CUDA | 512 | 512 | 64 | forw and back | 31418.0 | 93832 | 5640.0 | |
pr1761 | LSTM | CPU | 32 | 32 | 4 | forward | 36.5110015869141 | 37 | 71.1399993896484 | |
pr1761 | LSTM | CPU | 32 | 32 | 4 | backward | 157.996994018555 | 608 | 233.699996948242 | |
pr1761 | LSTM | CPU | 32 | 32 | 4 | forw and back | 328.358001708984 | 1495 | 390.670013427734 | |
pr1761 | LSTM | CUDA | 32 | 32 | 4 | forward | 331.403015136719 | 809 | 56.4500007629395 | |
pr1761 | LSTM | CUDA | 32 | 32 | 4 | backward | 1063.0 | 3075 | 208.309997558594 | |
pr1761 | LSTM | CUDA | 32 | 32 | 4 | forw and back | 1585.0 | 4541 | 309.480010986328 | |
pr1761 | LSTM | CPU | 32 | 32 | 16 | forward | 163.046997070312 | 145 | 296.230010986328 | |
pr1761 | LSTM | CPU | 32 | 32 | 16 | backward | 588.481018066406 | 2300 | 950.72998046875 | |
pr1761 | LSTM | CPU | 32 | 32 | 16 | forw and back | 1193.0 | 5575 | 1530.0 | |
pr1761 | LSTM | CUDA | 32 | 32 | 16 | forward | 997.359985351562 | 3233 | 225.669998168945 | |
pr1761 | LSTM | CUDA | 32 | 32 | 16 | backward | 3549.0 | 12219 | 831.169982910156 | |
pr1761 | LSTM | CUDA | 32 | 32 | 16 | forw and back | 5160.0 | 17741 | 1180.0 | |
pr1761 | LSTM | CPU | 32 | 32 | 32 | forward | 335.326995849609 | 289 | 596.359985351562 | |
pr1761 | LSTM | CPU | 32 | 32 | 32 | backward | 1162.0 | 4557 | 1860.0 | |
pr1761 | LSTM | CPU | 32 | 32 | 32 | forw and back | 2329.0 | 11017 | 3070.0 | |
pr1761 | LSTM | CUDA | 32 | 32 | 32 | forward | 1884.0 | 6465 | 451.299987792969 | |
pr1761 | LSTM | CUDA | 32 | 32 | 32 | backward | 6835.0 | 24412 | 1620.0 | |
pr1761 | LSTM | CUDA | 32 | 32 | 32 | forw and back | 9937.0 | 35343 | 2360.0 | |
pr1761 | LSTM | CPU | 32 | 32 | 64 | forward | 677.27099609375 | 577 | 1170.0 | |
pr1761 | LSTM | CPU | 32 | 32 | 64 | backward | 2348.0 | 9069 | 3730.0 | |
pr1761 | LSTM | CPU | 32 | 32 | 64 | forw and back | 4628.0 | 21897 | 6140.0 | |
pr1761 | LSTM | CUDA | 32 | 32 | 64 | forward | 3638.0 | 12929 | 902.559997558594 | |
pr1761 | LSTM | CUDA | 32 | 32 | 64 | backward | 13431.0 | 48796 | 3240.0 | |
pr1761 | LSTM | CUDA | 32 | 32 | 64 | forw and back | 19508.0 | 70543 | 4710.0 | |
pr1761 | LSTM | CPU | 32 | 128 | 4 | forward | 139.940002441406 | 37 | 276.640014648437 | |
pr1761 | LSTM | CPU | 32 | 128 | 4 | backward | 275.686004638672 | 608 | 722.830017089844 | |
pr1761 | LSTM | CPU | 32 | 128 | 4 | forw and back | 582.452026367187 | 1495 | 1100.0 | |
pr1761 | LSTM | CUDA | 32 | 128 | 4 | forward | 343.43701171875 | 817 | 56.5800018310547 | |
pr1761 | LSTM | CUDA | 32 | 128 | 4 | backward | 1101.0 | 3119 | 209.0 | |
pr1761 | LSTM | CUDA | 32 | 128 | 4 | forw and back | 1629.0 | 4585 | 310.170013427734 | |
pr1761 | LSTM | CPU | 32 | 128 | 16 | forward | 588.31298828125 | 145 | 1130.0 | |
pr1761 | LSTM | CPU | 32 | 128 | 16 | backward | 1068.0 | 2300 | 2860.0 | |
pr1761 | LSTM | CPU | 32 | 128 | 16 | forw and back | 2170.0 | 5575 | 4440.0 | |
pr1761 | LSTM | CUDA | 32 | 128 | 16 | forward | 1043.0 | 3265 | 226.169998168945 | |
pr1761 | LSTM | CUDA | 32 | 128 | 16 | backward | 3662.0 | 12395 | 833.919982910156 | |
pr1761 | LSTM | CUDA | 32 | 128 | 16 | forw and back | 5349.0 | 17917 | 1190.0 | |
pr1761 | LSTM | CPU | 32 | 128 | 32 | forward | 1190.0 | 289 | 2270.0 | |
pr1761 | LSTM | CPU | 32 | 128 | 32 | backward | 2180.0 | 4557 | 5730.0 | |
pr1761 | LSTM | CPU | 32 | 128 | 32 | forw and back | 4349.0 | 11017 | 8910.0 | |
pr1761 | LSTM | CUDA | 32 | 128 | 32 | forward | 1917.0 | 6529 | 452.299987792969 | |
pr1761 | LSTM | CUDA | 32 | 128 | 32 | backward | 6973.0 | 24764 | 1630.0 | |
pr1761 | LSTM | CUDA | 32 | 128 | 32 | forw and back | 10145.0 | 35695 | 2360.0 | |
pr1761 | LSTM | CPU | 32 | 128 | 64 | forward | 2394.0 | 577 | 4560.0 | |
pr1761 | LSTM | CPU | 32 | 128 | 64 | backward | 4499.0 | 9069 | 11460.0 | |
pr1761 | LSTM | CPU | 32 | 128 | 64 | forw and back | 8673.0 | 21897 | 17840.0 | |
pr1761 | LSTM | CUDA | 32 | 128 | 64 | forward | 3725.0 | 13057 | 904.559997558594 | |
pr1761 | LSTM | CUDA | 32 | 128 | 64 | backward | 13667.0 | 49500 | 3260.0 | |
pr1761 | LSTM | CUDA | 32 | 128 | 64 | forw and back | 19937.0 | 71247 | 4720.0 | |
pr1761 | LSTM | CPU | 32 | 512 | 4 | forward | 1348.0 | 48 | 1070.0 | |
pr1761 | LSTM | CPU | 32 | 512 | 4 | backward | 1674.0 | 636 | 2600.0 | |
pr1761 | LSTM | CPU | 32 | 512 | 4 | forw and back | 3579.0 | 1546 | 3930.0 | |
pr1761 | LSTM | CUDA | 32 | 512 | 4 | forward | 349.662994384766 | 838 | 56.9099998474121 | |
pr1761 | LSTM | CUDA | 32 | 512 | 4 | backward | 1156.0 | 3147 | 209.440002441406 | |
pr1761 | LSTM | CUDA | 32 | 512 | 4 | forw and back | 1698.0 | 4666 | 312.079986572266 | |
pr1761 | LSTM | CPU | 32 | 512 | 16 | forward | 5774.0 | 192 | 4450.0 | |
pr1761 | LSTM | CPU | 32 | 512 | 16 | backward | 7088.0 | 2412 | 10510.0 | |
pr1761 | LSTM | CPU | 32 | 512 | 16 | forw and back | 14627.0 | 5782 | 15990.0 | |
pr1761 | LSTM | CUDA | 32 | 512 | 16 | forward | 1073.0 | 3358 | 227.619995117188 | |
pr1761 | LSTM | CUDA | 32 | 512 | 16 | backward | 3879.0 | 12519 | 835.859985351563 | |
pr1761 | LSTM | CUDA | 32 | 512 | 16 | forw and back | 5569.0 | 18214 | 1190.0 | |
pr1761 | LSTM | CPU | 32 | 512 | 32 | forward | 9449.0 | 384 | 8970.0 | |
pr1761 | LSTM | CPU | 32 | 512 | 32 | backward | 14147.0 | 4781 | 21060.0 | |
pr1761 | LSTM | CPU | 32 | 512 | 32 | forw and back | 28861.0 | 11432 | 32070.0 | |
pr1761 | LSTM | CUDA | 32 | 512 | 32 | forward | 1981.0 | 6718 | 455.25 | |
pr1761 | LSTM | CUDA | 32 | 512 | 32 | backward | 7394.0 | 25016 | 1630.0 | |
pr1761 | LSTM | CUDA | 32 | 512 | 32 | forw and back | 10652.0 | 36280 | 2370.0 | |
pr1761 | LSTM | CPU | 32 | 512 | 64 | forward | 23437.0 | 768 | 17990.0 | |
pr1761 | LSTM | CPU | 32 | 512 | 64 | backward | 28574.0 | 9517 | 42150.0 | |
pr1761 | LSTM | CPU | 32 | 512 | 64 | forw and back | 57866.0 | 22728 | 64220.0 | |
pr1761 | LSTM | CUDA | 32 | 512 | 64 | forward | 3811.0 | 13438 | 910.52001953125 | |
pr1761 | LSTM | CUDA | 32 | 512 | 64 | backward | 14319.0 | 50008 | 3260.0 | |
pr1761 | LSTM | CUDA | 32 | 512 | 64 | forw and back | 20900.0 | 72408 | 4740.0 | |
pr1761 | LSTM | CPU | 128 | 32 | 4 | forward | 50.7449989318848 | 37 | 71.1399993896484 | |
pr1761 | LSTM | CPU | 128 | 32 | 4 | backward | 199.522003173828 | 608 | 413.700012207031 | |
pr1761 | LSTM | CPU | 128 | 32 | 4 | forw and back | 396.783996582031 | 1495 | 522.669982910156 | |
pr1761 | LSTM | CUDA | 128 | 32 | 4 | forward | 364.985992431641 | 809 | 56.4500007629395 | |
pr1761 | LSTM | CUDA | 128 | 32 | 4 | backward | 1135.0 | 3075 | 208.309997558594 | |
pr1761 | LSTM | CUDA | 128 | 32 | 4 | forw and back | 1676.0 | 4541 | 309.480010986328 | |
pr1761 | LSTM | CPU | 128 | 32 | 16 | forward | 218.906005859375 | 145 | 296.230010986328 | |
pr1761 | LSTM | CPU | 128 | 32 | 16 | backward | 755.577026367188 | 2300 | 1670.0 | |
pr1761 | LSTM | CPU | 128 | 32 | 16 | forw and back | 1420.0 | 5575 | 2080.0 | |
pr1761 | LSTM | CUDA | 128 | 32 | 16 | forward | 1058.0 | 3233 | 225.669998168945 | |
pr1761 | LSTM | CUDA | 128 | 32 | 16 | backward | 3679.0 | 12219 | 831.169982910156 | |
pr1761 | LSTM | CUDA | 128 | 32 | 16 | forw and back | 5374.0 | 17741 | 1180.0 | |
pr1761 | LSTM | CPU | 128 | 32 | 32 | forward | 447.688995361328 | 289 | 596.359985351562 | |
pr1761 | LSTM | CPU | 128 | 32 | 32 | backward | 1522.0 | 4557 | 3350.0 | |
pr1761 | LSTM | CPU | 128 | 32 | 32 | forw and back | 2791.0 | 11017 | 4180.0 | |
pr1761 | LSTM | CUDA | 128 | 32 | 32 | forward | 1960.0 | 6465 | 451.299987792969 | |
pr1761 | LSTM | CUDA | 128 | 32 | 32 | backward | 7010.0 | 24412 | 1620.0 | |
pr1761 | LSTM | CUDA | 128 | 32 | 32 | forw and back | 10229.0 | 35343 | 2360.0 | |
pr1761 | LSTM | CPU | 128 | 32 | 64 | forward | 906.68798828125 | 577 | 1170.0 | |
pr1761 | LSTM | CPU | 128 | 32 | 64 | backward | 3214.0 | 9069 | 6720.0 | |
pr1761 | LSTM | CPU | 128 | 32 | 64 | forw and back | 5648.0 | 21897 | 8380.0 | |
pr1761 | LSTM | CUDA | 128 | 32 | 64 | forward | 3743.0 | 12929 | 902.559997558594 | |
pr1761 | LSTM | CUDA | 128 | 32 | 64 | backward | 13742.0 | 48796 | 3240.0 | |
pr1761 | LSTM | CUDA | 128 | 32 | 64 | forw and back | 19897.0 | 70543 | 4710.0 | |
pr1761 | LSTM | CPU | 128 | 128 | 4 | forward | 493.347991943359 | 37 | 276.640014648437 | |
pr1761 | LSTM | CPU | 128 | 128 | 4 | backward | 996.973999023438 | 616 | 1160.0 | |
pr1761 | LSTM | CPU | 128 | 128 | 4 | forw and back | 1866.0 | 1499 | 1360.0 | |
pr1761 | LSTM | CUDA | 128 | 128 | 4 | forward | 357.843994140625 | 817 | 56.5800018310547 | |
pr1761 | LSTM | CUDA | 128 | 128 | 4 | backward | 1130.0 | 3119 | 209.0 | |
pr1761 | LSTM | CUDA | 128 | 128 | 4 | forw and back | 1680.0 | 4585 | 310.170013427734 | |
pr1761 | LSTM | CPU | 128 | 128 | 16 | forward | 2087.0 | 145 | 1130.0 | |
pr1761 | LSTM | CPU | 128 | 128 | 16 | backward | 4089.00024414062 | 2332 | 4720.0 | |
pr1761 | LSTM | CPU | 128 | 128 | 16 | forw and back | 7342.0 | 5591 | 5560.0 | |
pr1761 | LSTM | CUDA | 128 | 128 | 16 | forward | 1053.0 | 3265 | 226.169998168945 | |
pr1761 | LSTM | CUDA | 128 | 128 | 16 | backward | 3694.0 | 12395 | 833.919982910156 | |
pr1761 | LSTM | CUDA | 128 | 128 | 16 | forw and back | 5376.0 | 17917 | 1190.0 | |
pr1761 | LSTM | CPU | 128 | 128 | 32 | forward | 4222.0 | 289 | 2270.0 | |
pr1761 | LSTM | CPU | 128 | 128 | 32 | backward | 8297.0 | 4621 | 9460.0 | |
pr1761 | LSTM | CPU | 128 | 128 | 32 | forw and back | 14760.0 | 11049 | 11140.0 | |
pr1761 | LSTM | CUDA | 128 | 128 | 32 | forward | 1931.0 | 6529 | 452.299987792969 | |
pr1761 | LSTM | CUDA | 128 | 128 | 32 | backward | 7032.0 | 24764 | 1630.0 | |
pr1761 | LSTM | CUDA | 128 | 128 | 32 | forw and back | 10200.0 | 35695 | 2360.0 | |
pr1761 | LSTM | CPU | 128 | 128 | 64 | forward | 8167.0 | 577 | 4560.0 | |
pr1761 | LSTM | CPU | 128 | 128 | 64 | backward | 16636.0 | 9197 | 18940.0 | |
pr1761 | LSTM | CPU | 128 | 128 | 64 | forw and back | 29544.0 | 21961 | 22320.0 | |
pr1761 | LSTM | CUDA | 128 | 128 | 64 | forward | 3787.0 | 13057 | 904.559997558594 | |
pr1761 | LSTM | CUDA | 128 | 128 | 64 | backward | 13967.0 | 49500 | 3260.0 | |
pr1761 | LSTM | CUDA | 128 | 128 | 64 | forw and back | 20377.0 | 71247 | 4720.0 | |
pr1761 | LSTM | CPU | 128 | 512 | 4 | forward | 1435.0 | 48 | 1070.0 | |
pr1761 | LSTM | CPU | 128 | 512 | 4 | backward | 1989.0 | 636 | 4180.0 | |
pr1761 | LSTM | CPU | 128 | 512 | 4 | forw and back | 3844.0 | 1546 | 4760.0 | |
pr1761 | LSTM | CUDA | 128 | 512 | 4 | forward | 380.585998535156 | 838 | 56.9099998474121 | |
pr1761 | LSTM | CUDA | 128 | 512 | 4 | backward | 1202.0 | 3147 | 209.440002441406 | |
pr1761 | LSTM | CUDA | 128 | 512 | 4 | forw and back | 1767.0 | 4666 | 312.079986572266 | |
pr1761 | LSTM | CPU | 128 | 512 | 16 | forward | 3478.0 | 192 | 4450.0 | |
pr1761 | LSTM | CPU | 128 | 512 | 16 | backward | 8329.0 | 2412 | 16880.0 | |
pr1761 | LSTM | CPU | 128 | 512 | 16 | forw and back | 15553.0 | 5782 | 19350.0 | |
pr1761 | LSTM | CUDA | 128 | 512 | 16 | forward | 1120.0 | 3358 | 227.619995117188 | |
pr1761 | LSTM | CUDA | 128 | 512 | 16 | backward | 3991.0 | 12519 | 835.859985351563 | |
pr1761 | LSTM | CUDA | 128 | 512 | 16 | forw and back | 5734.0 | 18214 | 1190.0 | |
pr1761 | LSTM | CPU | 128 | 512 | 32 | forward | 12533.0 | 384 | 8970.0 | |
pr1761 | LSTM | CPU | 128 | 512 | 32 | backward | 16724.0 | 4781 | 33800.0 | |
pr1761 | LSTM | CPU | 128 | 512 | 32 | forw and back | 30750.0 | 11432 | 38800.0 | |
pr1761 | LSTM | CUDA | 128 | 512 | 32 | forward | 2100.0 | 6718 | 455.25 | |
pr1761 | LSTM | CUDA | 128 | 512 | 32 | backward | 7663.0 | 25016 | 1630.0 | |
pr1761 | LSTM | CUDA | 128 | 512 | 32 | forw and back | 10944.0 | 36280 | 2370.0 | |
pr1761 | LSTM | CPU | 128 | 512 | 64 | forward | 25193.0 | 768 | 17990.0 | |
pr1761 | LSTM | CPU | 128 | 512 | 64 | backward | 33564.0 | 9517 | 67640.0 | |
pr1761 | LSTM | CPU | 128 | 512 | 64 | forw and back | 61978.0 | 22728 | 77710.0 | |
pr1761 | LSTM | CUDA | 128 | 512 | 64 | forward | 4050.00024414062 | 13438 | 910.52001953125 | |
pr1761 | LSTM | CUDA | 128 | 512 | 64 | backward | 14960.0 | 50008 | 3260.0 | |
pr1761 | LSTM | CUDA | 128 | 512 | 64 | forw and back | 21488.0 | 72408 | 4740.0 | |
pr1761 | LSTM | CPU | 512 | 32 | 4 | forward | 282.929992675781 | 37 | 71.1399993896484 | |
pr1761 | LSTM | CPU | 512 | 32 | 4 | backward | 867.286987304688 | 623 | 1110.0 | |
pr1761 | LSTM | CPU | 512 | 32 | 4 | forw and back | 1494.0 | 1506 | 1030.0 | |
pr1761 | LSTM | CUDA | 512 | 32 | 4 | forward | 365.976013183594 | 829 | 56.7700004577637 | |
pr1761 | LSTM | CUDA | 512 | 32 | 4 | backward | 1133.0 | 3107 | 208.809997558594 | |
pr1761 | LSTM | CUDA | 512 | 32 | 4 | forw and back | 1678.0 | 4593 | 310.299987792969 | |
pr1761 | LSTM | CPU | 512 | 32 | 16 | forward | 1185.0 | 145 | 296.230010986328 | |
pr1761 | LSTM | CPU | 512 | 32 | 16 | backward | 3641.0 | 2363 | 4620.0 | |
pr1761 | LSTM | CPU | 512 | 32 | 16 | forw and back | 5847.0 | 5622 | 4280.0 | |
pr1761 | LSTM | CUDA | 512 | 32 | 16 | forward | 1056.0 | 3313 | 226.919998168945 | |
pr1761 | LSTM | CUDA | 512 | 32 | 16 | backward | 3712.0 | 12347 | 833.169982910156 | |
pr1761 | LSTM | CUDA | 512 | 32 | 16 | forw and back | 5441.0 | 17949 | 1190.0 | |
pr1761 | LSTM | CPU | 512 | 32 | 32 | forward | 2399.0 | 289 | 596.359985351562 | |
pr1761 | LSTM | CPU | 512 | 32 | 32 | backward | 7361.0 | 4684 | 9290.0 | |
pr1761 | LSTM | CPU | 512 | 32 | 32 | forw and back | 11652.0 | 11112 | 8630.0 | |
pr1761 | LSTM | CUDA | 512 | 32 | 32 | forward | 1982.0 | 6625 | 453.799987792969 | |
pr1761 | LSTM | CUDA | 512 | 32 | 32 | backward | 7102.0 | 24668 | 1630.0 | |
pr1761 | LSTM | CUDA | 512 | 32 | 32 | forw and back | 10382.0 | 35759 | 2370.0 | |
pr1761 | LSTM | CPU | 512 | 32 | 64 | forward | 4867.0 | 577 | 1170.0 | |
pr1761 | LSTM | CPU | 512 | 32 | 64 | backward | 14773.0 | 9324 | 18650.0 | |
pr1761 | LSTM | CPU | 512 | 32 | 64 | forw and back | 23629.0 | 22088 | 17320.0 | |
pr1761 | LSTM | CUDA | 512 | 32 | 64 | forward | 3763.0 | 13249 | 907.559997558594 | |
pr1761 | LSTM | CUDA | 512 | 32 | 64 | backward | 13788.0 | 49308 | 3250.0 | |
pr1761 | LSTM | CUDA | 512 | 32 | 64 | forw and back | 20283.0 | 71375 | 4720.0 | |
pr1761 | LSTM | CPU | 512 | 128 | 4 | forward | 558.466979980469 | 37 | 276.640014648437 | |
pr1761 | LSTM | CPU | 512 | 128 | 4 | backward | 1349.0 | 623 | 2990.0 | |
pr1761 | LSTM | CPU | 512 | 128 | 4 | forw and back | 2169.0 | 1506 | 2440.0 | |
pr1761 | LSTM | CUDA | 512 | 128 | 4 | forward | 406.860992431641 | 837 | 56.8899993896484 | |
pr1761 | LSTM | CUDA | 512 | 128 | 4 | backward | 1166.0 | 3151 | 209.5 | |
pr1761 | LSTM | CUDA | 512 | 128 | 4 | forw and back | 1748.0 | 4637 | 310.980010986328 | |
pr1761 | LSTM | CPU | 512 | 128 | 16 | forward | 2391.0 | 145 | 1130.0 | |
pr1761 | LSTM | CPU | 512 | 128 | 16 | backward | 5799.0 | 2363 | 12170.0 | |
pr1761 | LSTM | CPU | 512 | 128 | 16 | forw and back | 8837.0 | 5622 | 10010.0 | |
pr1761 | LSTM | CUDA | 512 | 128 | 16 | forward | 1194.0 | 3345 | 227.419998168945 | |
pr1761 | LSTM | CUDA | 512 | 128 | 16 | backward | 3846.0 | 12523 | 835.919982910156 | |
pr1761 | LSTM | CUDA | 512 | 128 | 16 | forw and back | 5701.0 | 18125 | 1190.0 | |
pr1761 | LSTM | CPU | 512 | 128 | 32 | forward | 4913.0 | 289 | 2270.0 | |
pr1761 | LSTM | CPU | 512 | 128 | 32 | backward | 11581.0 | 4684 | 24410.0 | |
pr1761 | LSTM | CPU | 512 | 128 | 32 | forw and back | 17497.0 | 11112 | 20090.0 | |
pr1761 | LSTM | CUDA | 512 | 128 | 32 | forward | 2245.0 | 6689 | 454.799987792969 | |
pr1761 | LSTM | CUDA | 512 | 128 | 32 | backward | 7364.0 | 25020 | 1630.0 | |
pr1761 | LSTM | CUDA | 512 | 128 | 32 | forw and back | 10827.0 | 36111 | 2370.0 | |
pr1761 | LSTM | CPU | 512 | 128 | 64 | forward | 9983.0 | 577 | 4560.0 | |
pr1761 | LSTM | CPU | 512 | 128 | 64 | backward | 23183.0 | 9324 | 48880.0 | |
pr1761 | LSTM | CPU | 512 | 128 | 64 | forw and back | 35427.0 | 22088 | 40260.0 | |
pr1761 | LSTM | CUDA | 512 | 128 | 64 | forward | 4330.0 | 13377 | 909.559997558594 | |
pr1761 | LSTM | CUDA | 512 | 128 | 64 | backward | 14335.0 | 50012 | 3260.0 | |
pr1761 | LSTM | CUDA | 512 | 128 | 64 | forw and back | 21230.0 | 72079 | 4730.0 | |
pr1761 | LSTM | CPU | 512 | 512 | 4 | forward | 1688.0 | 48 | 1070.0 | |
pr1761 | LSTM | CPU | 512 | 512 | 4 | backward | 3557.0 | 643 | 10510.0 | |
pr1761 | LSTM | CPU | 512 | 512 | 4 | forw and back | 5102.0 | 1553 | 8090.0 | |
pr1761 | LSTM | CUDA | 512 | 512 | 4 | forward | 399.81298828125 | 858 | 57.2200012207031 | |
pr1761 | LSTM | CUDA | 512 | 512 | 4 | backward | 1241.0 | 3179 | 209.940002441406 | |
pr1761 | LSTM | CUDA | 512 | 512 | 4 | forw and back | 1815.0 | 4718 | 312.890014648437 | |
pr1761 | LSTM | CPU | 512 | 512 | 16 | forward | 7528.0 | 192 | 4450.0 | |
pr1761 | LSTM | CPU | 512 | 512 | 16 | backward | 14394.0 | 2443 | 42330.0 | |
pr1761 | LSTM | CPU | 512 | 512 | 16 | forw and back | 20404.0 | 5813 | 32800.0 | |
pr1761 | LSTM | CUDA | 512 | 512 | 16 | forward | 1216.0 | 3438 | 228.880004882812 | |
pr1761 | LSTM | CUDA | 512 | 512 | 16 | backward | 4185.0 | 12647 | 837.859985351563 | |
pr1761 | LSTM | CUDA | 512 | 512 | 16 | forw and back | 5924.0 | 18422 | 1200.0 | |
pr1761 | LSTM | CPU | 512 | 512 | 32 | forward | 15272.0 | 384 | 8970.0 | |
pr1761 | LSTM | CPU | 512 | 512 | 32 | backward | 29294.0 | 4844 | 84750.0 | |
pr1761 | LSTM | CPU | 512 | 512 | 32 | forw and back | 41290.0 | 11495 | 65750.0 | |
pr1761 | LSTM | CUDA | 512 | 512 | 32 | forward | 2254.0 | 6878 | 457.75 | |
pr1761 | LSTM | CUDA | 512 | 512 | 32 | backward | 8175.99951171875 | 25272 | 1640.0 | |
pr1761 | LSTM | CUDA | 512 | 512 | 32 | forw and back | 11253.0 | 36696 | 2380.0 | |
pr1761 | LSTM | CPU | 512 | 512 | 64 | forward | 30405.0 | 768 | 17990.0 | |
pr1761 | LSTM | CPU | 512 | 512 | 64 | backward | 58086.0 | 9644 | 169590.0 | |
pr1761 | LSTM | CPU | 512 | 512 | 64 | forw and back | 81369.0 | 22855 | 131650.0 | |
pr1761 | LSTM | CUDA | 512 | 512 | 64 | forward | 4307.0 | 13758 | 915.52001953125 | |
pr1761 | LSTM | CUDA | 512 | 512 | 64 | backward | 16188.0 | 50520 | 3270.0 | |
pr1761 | LSTM | CUDA | 512 | 512 | 64 | forw and back | 22430.0 | 73240 | 4750.0 | |
pr1761 | GRU | CPU | 32 | 32 | 4 | forward | 33.8040008544922 | 25 | 39.1100006103516 | |
pr1761 | GRU | CPU | 32 | 32 | 4 | backward | 178.608993530273 | 844 | 250.830001831055 | |
pr1761 | GRU | CPU | 32 | 32 | 4 | forw and back | 367.102996826172 | 1721 | 424.920013427734 | |
pr1761 | GRU | CUDA | 32 | 32 | 4 | forward | 382.704986572266 | 1005 | 117.639999389648 | |
pr1761 | GRU | CUDA | 32 | 32 | 4 | backward | 1490.0 | 3466 | 219.279998779297 | |
pr1761 | GRU | CUDA | 32 | 32 | 4 | forw and back | 2087.0 | 5025 | 355.75 | |
pr1761 | GRU | CPU | 32 | 32 | 16 | forward | 146.960998535156 | 97 | 165.199996948242 | |
pr1761 | GRU | CPU | 32 | 32 | 16 | backward | 717.275024414063 | 3149 | 1020.0 | |
pr1761 | GRU | CPU | 32 | 32 | 16 | forw and back | 1371.0 | 6415 | 1690.0 | |
pr1761 | GRU | CUDA | 32 | 32 | 16 | forward | 1158.0 | 4017 | 470.420013427734 | |
pr1761 | GRU | CUDA | 32 | 32 | 16 | backward | 4959.0 | 13607 | 867.5 | |
pr1761 | GRU | CUDA | 32 | 32 | 16 | forw and back | 6879.0 | 19535 | 1360.0 | |
pr1761 | GRU | CPU | 32 | 32 | 32 | forward | 305.890991210937 | 193 | 333.329986572266 | |
pr1761 | GRU | CPU | 32 | 32 | 32 | backward | 1411.0 | 6221 | 2060.0 | |
pr1761 | GRU | CPU | 32 | 32 | 32 | forw and back | 2689.0 | 12671 | 3400.0 | |
pr1761 | GRU | CUDA | 32 | 32 | 32 | forward | 2169.0 | 8033 | 940.799987792969 | |
pr1761 | GRU | CUDA | 32 | 32 | 32 | backward | 9623.0 | 27127 | 1690.0 | |
pr1761 | GRU | CUDA | 32 | 32 | 32 | forw and back | 13479.0 | 38879 | 2710.0 | |
pr1761 | GRU | CPU | 32 | 32 | 64 | forward | 617.130981445313 | 385 | 669.590026855469 | |
pr1761 | GRU | CPU | 32 | 32 | 64 | backward | 2846.0 | 12365 | 4130.0 | |
pr1761 | GRU | CPU | 32 | 32 | 64 | forw and back | 5336.0 | 25183 | 6810.0 | |
pr1761 | GRU | CUDA | 32 | 32 | 64 | forward | 4278.0 | 16065 | 1840.0 | |
pr1761 | GRU | CUDA | 32 | 32 | 64 | backward | 18976.0 | 54167 | 3380.0 | |
pr1761 | GRU | CUDA | 32 | 32 | 64 | forw and back | 26450.0 | 77567 | 5410.0 | |
pr1761 | GRU | CPU | 32 | 128 | 4 | forward | 127.137001037598 | 25 | 151.110000610352 | |
pr1761 | GRU | CPU | 32 | 128 | 4 | backward | 312.209991455078 | 844 | 751.950012207031 | |
pr1761 | GRU | CPU | 32 | 128 | 4 | forw and back | 612.393981933594 | 1721 | 1090.0 | |
pr1761 | GRU | CUDA | 32 | 128 | 4 | forward | 382.321990966797 | 1021 | 117.889999389648 | |
pr1761 | GRU | CUDA | 32 | 128 | 4 | backward | 1505.0 | 3502 | 219.839996337891 | |
pr1761 | GRU | CUDA | 32 | 128 | 4 | forw and back | 2110.0 | 5061 | 356.309997558594 | |
pr1761 | GRU | CPU | 32 | 128 | 16 | forward | 546.474975585937 | 97 | 640.200012207031 | |
pr1761 | GRU | CPU | 32 | 128 | 16 | backward | 1208.0 | 3149 | 3100.0 | |
pr1761 | GRU | CPU | 32 | 128 | 16 | forw and back | 2254.0 | 6415 | 4530.0 | |
pr1761 | GRU | CUDA | 32 | 128 | 16 | forward | 1167.0 | 4081 | 471.420013427734 | |
pr1761 | GRU | CUDA | 32 | 128 | 16 | backward | 5157.0 | 13751 | 869.75 | |
pr1761 | GRU | CUDA | 32 | 128 | 16 | forw and back | 7216.0 | 19679 | 1360.0 | |
pr1761 | GRU | CPU | 32 | 128 | 32 | forward | 1094.0 | 193 | 1260.0 | |
pr1761 | GRU | CPU | 32 | 128 | 32 | backward | 2450.0 | 6221 | 6260.0 | |
pr1761 | GRU | CPU | 32 | 128 | 32 | forw and back | 4498.0 | 12671 | 9120.0 | |
pr1761 | GRU | CUDA | 32 | 128 | 32 | forward | 2173.0 | 8161 | 942.799987792969 | |
pr1761 | GRU | CUDA | 32 | 128 | 32 | backward | 9933.0 | 27415 | 1700.0 | |
pr1761 | GRU | CUDA | 32 | 128 | 32 | forw and back | 13795.0 | 39167 | 2710.0 | |
pr1761 | GRU | CPU | 32 | 128 | 64 | forward | 2214.0 | 385 | 2540.0 | |
pr1761 | GRU | CPU | 32 | 128 | 64 | backward | 5052.0 | 12365 | 12580.0 | |
pr1761 | GRU | CPU | 32 | 128 | 64 | forw and back | 8956.0 | 25183 | 18300.0 | |
pr1761 | GRU | CUDA | 32 | 128 | 64 | forward | 4272.0 | 16321 | 1840.0 | |
pr1761 | GRU | CUDA | 32 | 128 | 64 | backward | 19251.0 | 54743 | 3390.0 | |
pr1761 | GRU | CUDA | 32 | 128 | 64 | forw and back | 26773.0 | 78143 | 5420.0 | |
pr1761 | GRU | CPU | 32 | 512 | 4 | forward | 650.534973144531 | 32 | 594.559997558594 | |
pr1761 | GRU | CPU | 32 | 512 | 4 | backward | 1806.0 | 880 | 2680.0 | |
pr1761 | GRU | CPU | 32 | 512 | 4 | forw and back | 3443.0 | 1772 | 3740.0 | |
pr1761 | GRU | CUDA | 32 | 512 | 4 | forward | 380.510009765625 | 1042 | 118.220001220703 | |
pr1761 | GRU | CUDA | 32 | 512 | 4 | backward | 1581.0 | 3594 | 223.589996337891 | |
pr1761 | GRU | CUDA | 32 | 512 | 4 | forw and back | 2205.0 | 5214 | 361.779998779297 | |
pr1761 | GRU | CPU | 32 | 512 | 16 | forward | 5215.0 | 128 | 2460.0 | |
pr1761 | GRU | CPU | 32 | 512 | 16 | backward | 7626.0 | 3305 | 11370.0 | |
pr1761 | GRU | CPU | 32 | 512 | 16 | forw and back | 14006.0 | 6634 | 15760.0 | |
pr1761 | GRU | CUDA | 32 | 512 | 16 | forward | 1169.0 | 4174 | 472.880004882812 | |
pr1761 | GRU | CUDA | 32 | 512 | 16 | backward | 5440.0 | 14131 | 884.940002441406 | |
pr1761 | GRU | CUDA | 32 | 512 | 16 | forw and back | 7500.0 | 20264 | 1380.0 | |
pr1761 | GRU | CPU | 32 | 512 | 32 | forward | 10522.0 | 256 | 4970.0 | |
pr1761 | GRU | CPU | 32 | 512 | 32 | backward | 15234.0 | 6537 | 22950.0 | |
pr1761 | GRU | CPU | 32 | 512 | 32 | forw and back | 27659.0 | 13114 | 31770.0 | |
pr1761 | GRU | CUDA | 32 | 512 | 32 | forward | 2190.0 | 8350 | 945.75 | |
pr1761 | GRU | CUDA | 32 | 512 | 32 | backward | 10514.0 | 28179 | 1730.0 | |
pr1761 | GRU | CUDA | 32 | 512 | 32 | forw and back | 14553.0 | 40328 | 2750.0 | |
pr1761 | GRU | CPU | 32 | 512 | 64 | forward | 21181.0 | 512 | 9990.0 | |
pr1761 | GRU | CPU | 32 | 512 | 64 | backward | 30551.0 | 13001 | 46120.0 | |
pr1761 | GRU | CPU | 32 | 512 | 64 | forw and back | 55245.0 | 26074 | 63800.0 | |
pr1761 | GRU | CUDA | 32 | 512 | 64 | forward | 4341.0 | 16702 | 1850.0 | |
pr1761 | GRU | CUDA | 32 | 512 | 64 | backward | 20676.0 | 56275 | 3450.0 | |
pr1761 | GRU | CUDA | 32 | 512 | 64 | forw and back | 28481.0 | 80456 | 5490.0 | |
pr1761 | GRU | CPU | 128 | 32 | 4 | forward | 45.9440002441406 | 25 | 39.1100006103516 | |
pr1761 | GRU | CPU | 128 | 32 | 4 | backward | 227.994995117187 | 844 | 409.829986572266 | |
pr1761 | GRU | CPU | 128 | 32 | 4 | forw and back | 436.911987304687 | 1721 | 535.919982910156 | |
pr1761 | GRU | CUDA | 128 | 32 | 4 | forward | 398.092010498047 | 1005 | 117.639999389648 | |
pr1761 | GRU | CUDA | 128 | 32 | 4 | backward | 1530.0 | 3466 | 219.279998779297 | |
pr1761 | GRU | CUDA | 128 | 32 | 4 | forw and back | 2135.0 | 5025 | 355.75 | |
pr1761 | GRU | CPU | 128 | 32 | 16 | forward | 193.138000488281 | 97 | 165.199996948242 | |
pr1761 | GRU | CPU | 128 | 32 | 16 | backward | 845.77099609375 | 3149 | 1670.0 | |
pr1761 | GRU | CPU | 128 | 32 | 16 | forw and back | 1542.0 | 6415 | 2150.0 | |
pr1761 | GRU | CUDA | 128 | 32 | 16 | forward | 1207.0 | 4017 | 470.420013427734 | |
pr1761 | GRU | CUDA | 128 | 32 | 16 | backward | 5151.0 | 13607 | 867.5 | |
pr1761 | GRU | CUDA | 128 | 32 | 16 | forw and back | 7240.0 | 19535 | 1360.0 | |
pr1761 | GRU | CPU | 128 | 32 | 32 | forward | 394.700012207031 | 193 | 333.329986572266 | |
pr1761 | GRU | CPU | 128 | 32 | 32 | backward | 1702.0 | 6221 | 3360.0 | |
pr1761 | GRU | CPU | 128 | 32 | 32 | forw and back | 3038.0 | 12671 | 4330.0 | |
pr1761 | GRU | CUDA | 128 | 32 | 32 | forward | 2262.0 | 8033 | 940.799987792969 | |
pr1761 | GRU | CUDA | 128 | 32 | 32 | backward | 9877.0 | 27127 | 1690.0 | |
pr1761 | GRU | CUDA | 128 | 32 | 32 | forw and back | 13811.0 | 38879 | 2710.0 | |
pr1761 | GRU | CPU | 128 | 32 | 64 | forward | 796.807983398438 | 385 | 669.590026855469 | |
pr1761 | GRU | CPU | 128 | 32 | 64 | backward | 3573.0 | 12365 | 6750.0 | |
pr1761 | GRU | CPU | 128 | 32 | 64 | forw and back | 6142.0 | 25183 | 8680.0 | |
pr1761 | GRU | CUDA | 128 | 32 | 64 | forward | 4383.0 | 16065 | 1840.0 | |
pr1761 | GRU | CUDA | 128 | 32 | 64 | backward | 19106.0 | 54167 | 3380.0 | |
pr1761 | GRU | CUDA | 128 | 32 | 64 | forw and back | 26794.0 | 77567 | 5410.0 | |
pr1761 | GRU | CPU | 128 | 128 | 4 | forward | 459.966003417969 | 25 | 151.110000610352 | |
pr1761 | GRU | CPU | 128 | 128 | 4 | backward | 1041.0 | 852 | 1170.0 | |
pr1761 | GRU | CPU | 128 | 128 | 4 | forw and back | 1103.0 | 1725 | 1340.0 | |
pr1761 | GRU | CUDA | 128 | 128 | 4 | forward | 389.419006347656 | 1021 | 117.889999389648 | |
pr1761 | GRU | CUDA | 128 | 128 | 4 | backward | 1522.0 | 3502 | 219.839996337891 | |
pr1761 | GRU | CUDA | 128 | 128 | 4 | forw and back | 2154.0 | 5061 | 356.309997558594 | |
pr1761 | GRU | CPU | 128 | 128 | 16 | forward | 1207.0 | 97 | 640.200012207031 | |
pr1761 | GRU | CPU | 128 | 128 | 16 | backward | 4282.0 | 3181 | 4870.0 | |
pr1761 | GRU | CPU | 128 | 128 | 16 | forw and back | 7375.0 | 6431 | 5550.0 | |
pr1761 | GRU | CUDA | 128 | 128 | 16 | forward | 1176.0 | 4081 | 471.420013427734 | |
pr1761 | GRU | CUDA | 128 | 128 | 16 | backward | 5240.0 | 13751 | 869.75 | |
pr1761 | GRU | CUDA | 128 | 128 | 16 | forw and back | 7321.0 | 19679 | 1360.0 | |
pr1761 | GRU | CPU | 128 | 128 | 32 | forward | 3869.0 | 193 | 1260.0 | |
pr1761 | GRU | CPU | 128 | 128 | 32 | backward | 6375.0 | 6285 | 9810.0 | |
pr1761 | GRU | CPU | 128 | 128 | 32 | forw and back | 12397.0 | 12703 | 11170.0 | |
pr1761 | GRU | CUDA | 128 | 128 | 32 | forward | 2221.0 | 8161 | 942.799987792969 | |
pr1761 | GRU | CUDA | 128 | 128 | 32 | backward | 10115.0 | 27415 | 1700.0 | |
pr1761 | GRU | CUDA | 128 | 128 | 32 | forw and back | 14042.0 | 39167 | 2710.0 | |
pr1761 | GRU | CPU | 128 | 128 | 64 | forward | 7791.0 | 385 | 2540.0 | |
pr1761 | GRU | CPU | 128 | 128 | 64 | backward | 17688.0 | 12493 | 19690.0 | |
pr1761 | GRU | CPU | 128 | 128 | 64 | forw and back | 29597.0 | 25247 | 22410.0 | |
pr1761 | GRU | CUDA | 128 | 128 | 64 | forward | 4373.0 | 16321 | 1840.0 | |
pr1761 | GRU | CUDA | 128 | 128 | 64 | backward | 19570.0 | 54743 | 3390.0 | |
pr1761 | GRU | CUDA | 128 | 128 | 64 | forw and back | 26788.0 | 78143 | 5420.0 | |
pr1761 | GRU | CPU | 128 | 512 | 4 | forward | 1295.0 | 32 | 594.559997558594 | |
pr1761 | GRU | CPU | 128 | 512 | 4 | backward | 2074.0 | 880 | 4240.0 | |
pr1761 | GRU | CPU | 128 | 512 | 4 | forw and back | 3677.0 | 1772 | 4560.0 | |
pr1761 | GRU | CUDA | 128 | 512 | 4 | forward | 389.225006103516 | 1042 | 118.220001220703 | |
pr1761 | GRU | CUDA | 128 | 512 | 4 | backward | 1583.0 | 3594 | 223.589996337891 | |
pr1761 | GRU | CUDA | 128 | 512 | 4 | forw and back | 2208.0 | 5214 | 361.779998779297 | |
pr1761 | GRU | CPU | 128 | 512 | 16 | forward | 5538.0 | 128 | 2460.0 | |
pr1761 | GRU | CPU | 128 | 512 | 16 | backward | 8798.0 | 3305 | 17640.0 | |
pr1761 | GRU | CPU | 128 | 512 | 16 | forw and back | 14756.0 | 6634 | 19030.0 | |
pr1761 | GRU | CUDA | 128 | 512 | 16 | forward | 1176.0 | 4174 | 472.880004882812 | |
pr1761 | GRU | CUDA | 128 | 512 | 16 | backward | 5420.0 | 14131 | 884.940002441406 | |
pr1761 | GRU | CUDA | 128 | 512 | 16 | forw and back | 7479.0 | 20264 | 1380.0 | |
pr1761 | GRU | CPU | 128 | 512 | 32 | forward | 11260.0 | 256 | 4970.0 | |
pr1761 | GRU | CPU | 128 | 512 | 32 | backward | 17652.0 | 6537 | 35500.0 | |
pr1761 | GRU | CPU | 128 | 512 | 32 | forw and back | 29339.0 | 13114 | 38320.0 | |
pr1761 | GRU | CUDA | 128 | 512 | 32 | forward | 2246.0 | 8350 | 945.75 | |
pr1761 | GRU | CUDA | 128 | 512 | 32 | backward | 10603.0 | 28179 | 1730.0 | |
pr1761 | GRU | CUDA | 128 | 512 | 32 | forw and back | 14578.0 | 40328 | 2750.0 | |
pr1761 | GRU | CPU | 128 | 512 | 64 | forward | 22706.0 | 512 | 9990.0 | |
pr1761 | GRU | CPU | 128 | 512 | 64 | backward | 35436.0 | 13001 | 71240.0 | |
pr1761 | GRU | CPU | 128 | 512 | 64 | forw and back | 59069.0 | 26074 | 76920.0 | |
pr1761 | GRU | CUDA | 128 | 512 | 64 | forward | 4426.0 | 16702 | 1850.0 | |
pr1761 | GRU | CUDA | 128 | 512 | 64 | backward | 20861.0 | 56275 | 3450.0 | |
pr1761 | GRU | CUDA | 128 | 512 | 64 | forw and back | 28711.0 | 80456 | 5490.0 | |
pr1761 | GRU | CPU | 512 | 32 | 4 | forward | 308.928985595703 | 25 | 39.1100006103516 | |
pr1761 | GRU | CPU | 512 | 32 | 4 | backward | 949.953002929688 | 859 | 1020.0 | |
pr1761 | GRU | CPU | 512 | 32 | 4 | forw and back | 1598.0 | 1732 | 979.059997558594 | |
pr1761 | GRU | CUDA | 512 | 32 | 4 | forward | 396.804992675781 | 1025 | 117.949996948242 | |
pr1761 | GRU | CUDA | 512 | 32 | 4 | backward | 1551.0 | 3498 | 219.779998779297 | |
pr1761 | GRU | CUDA | 512 | 32 | 4 | forw and back | 2157.0 | 5077 | 356.559997558594 | |
pr1761 | GRU | CPU | 512 | 32 | 16 | forward | 1251.0 | 97 | 165.199996948242 | |
pr1761 | GRU | CPU | 512 | 32 | 16 | backward | 3884.0 | 3212 | 4260.0 | |
pr1761 | GRU | CPU | 512 | 32 | 16 | forw and back | 6193.0 | 6462 | 3990.0 | |
pr1761 | GRU | CUDA | 512 | 32 | 16 | forward | 1193.0 | 4097 | 471.670013427734 | |
pr1761 | GRU | CUDA | 512 | 32 | 16 | backward | 5133.0 | 13735 | 869.5 | |
pr1761 | GRU | CUDA | 512 | 32 | 16 | forw and back | 7216.0 | 19743 | 1360.0 | |
pr1761 | GRU | CPU | 512 | 32 | 32 | forward | 2510.0 | 193 | 333.329986572266 | |
pr1761 | GRU | CPU | 512 | 32 | 32 | backward | 7843.0 | 6348 | 8570.0 | |
pr1761 | GRU | CPU | 512 | 32 | 32 | forw and back | 12336.0 | 12766 | 8040.0 | |
pr1761 | GRU | CUDA | 512 | 32 | 32 | forward | 2221.0 | 8193 | 943.299987792969 | |
pr1761 | GRU | CUDA | 512 | 32 | 32 | backward | 9822.0 | 27383 | 1700.0 | |
pr1761 | GRU | CUDA | 512 | 32 | 32 | forw and back | 13770.0 | 39295 | 2710.0 | |
pr1761 | GRU | CPU | 512 | 32 | 64 | forward | 5059.0 | 385 | 669.590026855469 | |
pr1761 | GRU | CPU | 512 | 32 | 64 | backward | 15663.0 | 12620 | 17200.0 | |
pr1761 | GRU | CPU | 512 | 32 | 64 | forw and back | 24714.0 | 25374 | 16129.9990234375 | |
pr1761 | GRU | CUDA | 512 | 32 | 64 | forward | 4358.0 | 16385 | 1840.0 | |
pr1761 | GRU | CUDA | 512 | 32 | 64 | backward | 19248.0 | 54679 | 3390.0 | |
pr1761 | GRU | CUDA | 512 | 32 | 64 | forw and back | 26556.0 | 78399 | 5420.0 | |
pr1761 | GRU | CPU | 512 | 128 | 4 | forward | 535.008972167969 | 25 | 151.110000610352 | |
pr1761 | GRU | CPU | 512 | 128 | 4 | backward | 1459.0 | 859 | 2920.0 | |
pr1761 | GRU | CPU | 512 | 128 | 4 | forw and back | 2211.0 | 1732 | 2330.0 | |
pr1761 | GRU | CUDA | 512 | 128 | 4 | forward | 388.5419921875 | 1041 | 118.199996948242 | |
pr1761 | GRU | CUDA | 512 | 128 | 4 | backward | 1544.0 | 3534 | 220.339996337891 | |
pr1761 | GRU | CUDA | 512 | 128 | 4 | forw and back | 2170.0 | 5113 | 357.119995117187 | |
pr1761 | GRU | CPU | 512 | 128 | 16 | forward | 2239.0 | 97 | 640.200012207031 | |
pr1761 | GRU | CPU | 512 | 128 | 16 | backward | 6088.0 | 3212 | 11960.0 | |
pr1761 | GRU | CPU | 512 | 128 | 16 | forw and back | 8836.0 | 6462 | 9640.0 | |
pr1761 | GRU | CUDA | 512 | 128 | 16 | forward | 1186.0 | 4161 | 472.670013427734 | |
pr1761 | GRU | CUDA | 512 | 128 | 16 | backward | 5222.0 | 13879 | 871.75 | |
pr1761 | GRU | CUDA | 512 | 128 | 16 | forw and back | 7237.0 | 19887 | 1360.0 | |
pr1761 | GRU | CPU | 512 | 128 | 32 | forward | 4584.0 | 193 | 1260.0 | |
pr1761 | GRU | CPU | 512 | 128 | 32 | backward | 12185.0 | 6348 | 24020.0 | |
pr1761 | GRU | CPU | 512 | 128 | 32 | forw and back | 17606.0 | 12766 | 19380.0 | |
pr1761 | GRU | CUDA | 512 | 128 | 32 | forward | 2194.0 | 8321 | 945.299987792969 | |
pr1761 | GRU | CUDA | 512 | 128 | 32 | backward | 10055.0 | 27671 | 1700.0 | |
pr1761 | GRU | CUDA | 512 | 128 | 32 | forw and back | 13803.0 | 39583 | 2720.0 | |
pr1761 | GRU | CPU | 512 | 128 | 64 | forward | 9301.0 | 385 | 2540.0 | |
pr1761 | GRU | CPU | 512 | 128 | 64 | backward | 24435.0 | 12620 | 48140.0 | |
pr1761 | GRU | CPU | 512 | 128 | 64 | forw and back | 35196.0 | 25374 | 38870.0 | |
pr1761 | GRU | CUDA | 512 | 128 | 64 | forward | 4295.0 | 16641 | 1850.0 | |
pr1761 | GRU | CUDA | 512 | 128 | 64 | backward | 19502.0 | 55255 | 3400.0 | |
pr1761 | GRU | CUDA | 512 | 128 | 64 | forw and back | 26931.0 | 78975 | 5430.0 | |
pr1761 | GRU | CPU | 512 | 512 | 4 | forward | 1608.0 | 32 | 594.559997558594 | |
pr1761 | GRU | CPU | 512 | 512 | 4 | backward | 3340.0 | 887 | 10480.0 | |
pr1761 | GRU | CPU | 512 | 512 | 4 | forw and back | 5084.0 | 1779 | 7800.0 | |
pr1761 | GRU | CUDA | 512 | 512 | 4 | forward | 435.325988769531 | 1062 | 118.529998779297 | |
pr1761 | GRU | CUDA | 512 | 512 | 4 | backward | 1656.0 | 3626 | 224.089996337891 | |
pr1761 | GRU | CUDA | 512 | 512 | 4 | forw and back | 2307.0 | 5266 | 362.589996337891 | |
pr1761 | GRU | CPU | 512 | 512 | 16 | forward | 7007.0 | 128 | 2460.0 | |
pr1761 | GRU | CPU | 512 | 512 | 16 | backward | 15175.0 | 3336 | 42730.0 | |
pr1761 | GRU | CPU | 512 | 512 | 16 | forw and back | 19999.0 | 6665 | 32119.998046875 | |
pr1761 | GRU | CUDA | 512 | 512 | 16 | forward | 1338.0 | 4254 | 474.119995117187 | |
pr1761 | GRU | CUDA | 512 | 512 | 16 | backward | 5731.0 | 14259 | 886.940002441406 | |
pr1761 | GRU | CUDA | 512 | 512 | 16 | forw and back | 7789.0 | 20472 | 1380.0 | |
pr1761 | GRU | CPU | 512 | 512 | 32 | forward | 14150.0 | 256 | 4970.0 | |
pr1761 | GRU | CPU | 512 | 512 | 32 | backward | 30405.0 | 6600 | 85710.0 | |
pr1761 | GRU | CPU | 512 | 512 | 32 | forw and back | 39853.0 | 13177 | 64540.0 | |
pr1761 | GRU | CUDA | 512 | 512 | 32 | forward | 2516.0 | 8510 | 948.25 | |
pr1761 | GRU | CUDA | 512 | 512 | 32 | backward | 11757.0 | 28435 | 1730.0 | |
pr1761 | GRU | CUDA | 512 | 512 | 32 | forw and back | 15063.0 | 40744 | 2760.0 | |
pr1761 | GRU | CPU | 512 | 512 | 64 | forward | 28400.0 | 512 | 9990.0 | |
pr1761 | GRU | CPU | 512 | 512 | 64 | backward | 60943.0 | 13128 | 171690.0 | |
pr1761 | GRU | CPU | 512 | 512 | 64 | forw and back | 79459.0 | 26201 | 129369.9921875 | |
pr1761 | GRU | CUDA | 512 | 512 | 64 | forward | 4888.0 | 17022 | 1850.0 | |
pr1761 | GRU | CUDA | 512 | 512 | 64 | backward | 22380.0 | 56787 | 3460.0 | |
pr1761 | GRU | CUDA | 512 | 512 | 64 | forw and back | 29432.0 | 81288 | 5500.0 |