Operator | Inputs | Avg. Forward Time (ms) (1.4.1) | Avg. Forward Time (ms) (1.5.0) | Avg. Backward Time (ms) (1.4.1) | Avg. Backward Time (ms) (1.5.0) | Max Mem Usage (Storage) (Bytes) (1.4.1) | Max Mem Usage (Storage) (Bytes) (1.5.0) |
---|---|---|---|---|---|---|---|
BatchNorm | {'beta': (3,), 'moving_mean': (3,), 'gamma': (3,), 'data': (32, 3, 256, 256), 'moving_var': (3,)} | 8.0009 | 7.9855 | 7.9907 | 8.014 | 12582.9238 | 12582.9238 |
BatchNorm | {'beta': (3,), 'moving_mean': (3,), 'gamma': (3,), 'data': (32, 3, 10000, 10), 'moving_var': (3,)} | 14.4459 | 14.5993 | 12.363 | 12.5098 | 38400.0117 | 19200.0117 |
BlockGrad | {'data': (1024, 1024)} | 0.9345 | 1.0886 | --- | --- | 2097.1521 | 2097.1521 |
BlockGrad | {'data': (10000, 1)} | 0.0038 | 0.0054 | --- | --- | 20.0 | 20.0 |
BlockGrad | {'data': (10000, 100)} | 1.1126 | 1.1151 | --- | --- | 2000.0 | 2000.0 |
Convolution | {'kernel': (3,), 'weight': (64, 3, 3), 'bias': (64,), 'data': (32, 3, 256), 'layout': 'NCW', 'pad': (0,), 'dilate': (1,), 'num_filter': 64, 'stride': (1,)} | 1.7578 | 0.242 | 3.3886 | 0.5067 | 2080.7681 | 2080.7681 |
Dropout | {'mode': 'always', 'data': (32, 3, 256, 256), 'p': 0.5} | 2.1245 | 1.8425 | 0.7703 | 0.6013 | 37748.7344 | 37748.7344 |
Dropout | {'mode': 'always', 'data': (10000, 10), 'p': 0.5} | 0.0241 | 0.0365 | 0.0078 | 0.0121 | 400.0 | 400.0 |
Flatten | {'data': (1024, 1024)} | 1.181 | 1.1202 | --- | --- | 2097.1521 | 2097.1521 |
Flatten | {'data': (10000, 1)} | 0.0035 | 0.0103 | --- | --- | 20.0 | 20.0 |
Flatten | {'data': (10000, 100)} | 1.1459 | 1.0995 | --- | --- | 2000.0 | 2000.0 |
FullyConnected | {'num_hidden': 64, 'flatten': True, 'weight': (64, 196608), 'bias': (64,), 'data': (32, 3, 256, 256)} | 1.7245 | 1.6666 | 4.664 | 4.5218 | 8.192 | 8.192 |
FullyConnected | {'num_hidden': 64, 'flatten': False, 'weight': (64, 256), 'bias': (64,), 'data': (32, 3, 256, 256)} | 0.7191 | 0.4847 | 1.7348 | 1.8349 | 6291.4561 | 6291.4561 |
LeakyReLU | {'slope': 0.1, 'act_type': 'leaky', 'data': (1024, 1024)} | 0.433 | 0.162 | 0.0546 | 0.0178 | 2097.1521 | 2097.1521 |
LeakyReLU | {'slope': 0.1, 'act_type': 'leaky', 'data': (10000, 1)} | 0.0105 | 0.0294 | 0.0108 | 0.0147 | 20.0 | 20.0 |
LeakyReLU | {'slope': 0.1, 'act_type': 'leaky', 'data': (10000, 100)} | 0.1542 | 0.1556 | 0.0203 | 0.0132 | 2000.0 | 2000.0 |
LeakyReLU | {'slope': 0.1, 'act_type': 'elu', 'data': (1024, 1024)} | 0.4995 | 0.497 | 0.1719 | 0.1792 | 4194.3042 | 4194.3042 |
LeakyReLU | {'slope': 0.1, 'act_type': 'elu', 'data': (10000, 1)} | 0.0105 | 0.0119 | 0.0074 | 0.0456 | 20.0 | 20.0 |
LeakyReLU | {'slope': 0.1, 'act_type': 'elu', 'data': (10000, 100)} | 0.4669 | 0.4832 | 0.1624 | 0.1688 | 2000.0 | 2000.0 |
LeakyReLU | {'act_type': 'selu', 'data': (1024, 1024)} | 0.4989 | 0.5195 | 0.171 | 0.1744 | 2097.1521 | 2097.1521 |
LeakyReLU | {'act_type': 'selu', 'data': (10000, 1)} | 0.0099 | 0.0108 | 0.0074 | 0.047 | 20.0 | 20.0 |
LeakyReLU | {'act_type': 'selu', 'data': (10000, 100)} | 0.4729 | 0.4954 | 0.1598 | 0.1659 | 2000.0 | 2000.0 |
LeakyReLU | {'act_type': 'prelu', 'gamma': (1, 1024), 'data': (1024, 1024)} | 0.1798 | 0.2583 | 4.8877 | 2.3837 | 2097.1521 | 2097.1521 |
LeakyReLU | {'act_type': 'prelu', 'gamma': (1, 1), 'data': (10000, 1)} | 0.0148 | 0.0116 | 0.5144 | 0.2028 | 20.0 | 20.0 |
LeakyReLU | {'act_type': 'prelu', 'gamma': (1, 100), 'data': (10000, 100)} | 0.1752 | 0.2649 | 3.8858 | 2.1013 | 2000.0 | 2000.0 |
Pooling | {'kernel': 3, 'data': (32, 3, 256), 'global_pool': 0, 'pad': 1, 'pool_type': 'avg', 'stride': 1} | 0.1805 | 0.1746 | 0.2323 | 0.2315 | 49.152 | 49.152 |
abs | {'data': (1024, 1024)} | 0.0197 | 0.0207 | 0.1646 | 0.1722 | 2097.1521 | 2097.1521 |
abs | {'data': (10000, 1)} | 0.0074 | 0.0091 | 0.0071 | 0.038 | 20.0 | 20.0 |
abs | {'data': (10000, 100)} | 0.019 | 0.0205 | 0.1572 | 0.1638 | 2000.0 | 2000.0 |
arccos | {'data': (1024, 1024)} | 0.5484 | 0.574 | 0.3664 | 0.3713 | 2097.1521 | 2097.1521 |
arccos | {'data': (10000, 1)} | 0.0125 | 0.0137 | 0.0973 | 0.0928 | 20.0 | 20.0 |
arccos | {'data': (10000, 100)} | 0.5266 | 0.5505 | 0.3972 | 0.3564 | 2000.0 | 2000.0 |
arccosh | {'data': (1024, 1024)} | 0.4695 | 0.3511 | 0.4696 | 0.3398 | 2097.1521 | 2097.1521 |
arccosh | {'data': (10000, 1)} | 0.0158 | 0.0116 | 0.01 | 0.0108 | 20.0 | 20.0 |
arccosh | {'data': (10000, 100)} | 0.4494 | 0.4338 | 0.451 | 0.4419 | 2000.0 | 2000.0 |
arcsin | {'data': (1024, 1024)} | 0.5129 | 0.3933 | 0.3663 | 0.2937 | 2097.1521 | 2097.1521 |
arcsin | {'data': (10000, 1)} | 0.0136 | 0.0126 | 0.104 | 0.0958 | 20.0 | 20.0 |
arcsin | {'data': (10000, 100)} | 0.493 | 0.37 | 0.3513 | 0.2768 | 2000.0 | 2000.0 |
arcsinh | {'data': (1024, 1024)} | 1.0431 | 1.0381 | 0.2859 | 0.279 | 2097.1521 | 2097.1521 |
arcsinh | {'data': (10000, 1)} | 0.0141 | 0.0209 | 0.0117 | 0.0108 | 20.0 | 20.0 |
arcsinh | {'data': (10000, 100)} | 0.9962 | 0.9878 | 0.2718 | 0.2684 | 2000.0 | 2000.0 |
arctan | {'data': (1024, 1024)} | 0.6258 | 0.4473 | 0.0396 | 0.0318 | 2097.1521 | 2097.1521 |
arctan | {'data': (10000, 1)} | 0.0103 | 0.0115 | 0.0067 | 0.0163 | 20.0 | 20.0 |
arctan | {'data': (10000, 100)} | 0.6085 | 0.4275 | 0.0318 | 0.0138 | 4000.0 | 2000.0 |
arctanh | {'data': (1024, 1024)} | 0.8912 | 0.6664 | 0.0383 | 0.0382 | 2097.1521 | 2097.1521 |
arctanh | {'data': (10000, 1)} | 0.0176 | 0.0154 | 0.0077 | 0.0164 | 20.0 | 20.0 |
arctanh | {'data': (10000, 100)} | 0.8508 | 0.6387 | 0.0309 | 0.0194 | 2000.0 | 2000.0 |
argmax_channel | {'data': (1024, 1024)} | 0.4417 | 0.1278 | --- | --- | 4.096 | 2.048 |
argmax_channel | {'data': (10000, 1)} | 0.0174 | 0.012 | --- | --- | 20.0 | 20.0 |
argmax_channel | {'data': (10000, 100)} | 0.447 | 0.1417 | --- | --- | 20.0 | 20.0 |
batch_dot | {'lhs': (32, 1024, 1024), 'rhs': (32, 1024, 1024)} | 60.4961 | 56.3872 | 72.3274 | 70.2494 | 134217.7344 | 134217.7344 |
batch_dot | {'lhs': (32, 1000, 10), 'rhs': (32, 1000, 10), 'transpose_b': True} | 32.023 | 35.8877 | 6.6535 | 6.3897 | 128000.0 | 128000.0 |
batch_dot | {'lhs': (32, 1000, 1), 'rhs': (32, 100, 1000), 'transpose_b': True, 'transpose_a': True} | 0.207 | 0.6677 | 1.1905 | 1.2125 | 12.8 | 12.8 |
broadcast_add | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0025 | 0.0046 | 0.0021 | 0.0035 | 0.012 | 0.012 |
broadcast_div | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0025 | 0.0044 | 0.003 | 0.0049 | 0.012 | 0.012 |
broadcast_equal | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0025 | 0.0034 | --- | --- | 0.012 | 0.012 |
broadcast_greater | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0018 | 0.0036 | --- | --- | 0.012 | 0.012 |
broadcast_greater_equal | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0027 | 0.0034 | --- | --- | 0.012 | 0.012 |
broadcast_hypot | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0021 | 0.0045 | 0.0024 | 0.0051 | 0.012 | 0.012 |
broadcast_lesser | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0018 | 0.0036 | --- | --- | 0.012 | 0.012 |
broadcast_lesser_equal | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0018 | 0.0035 | --- | --- | 0.012 | 0.012 |
broadcast_logical_and | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0018 | 0.0034 | --- | --- | 0.012 | 0.012 |
broadcast_logical_or | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0027 | 0.0038 | --- | --- | 0.012 | 0.012 |
broadcast_logical_xor | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0025 | 0.0033 | --- | --- | 0.012 | 0.012 |
broadcast_maximum | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.002 | 0.0043 | 0.0022 | 0.0052 | 0.012 | 0.012 |
broadcast_minimum | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.002 | 0.0045 | 0.0022 | 0.0047 | 0.012 | 0.012 |
broadcast_minus | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | --- | --- | --- | --- | 0.012 | 0.012 |
broadcast_mod | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0021 | 0.0045 | 0.0022 | 0.0051 | 0.012 | 0.012 |
broadcast_mul | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0019 | 0.0047 | 0.0022 | 0.0052 | 0.012 | 0.012 |
broadcast_not_equal | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0026 | 0.0034 | --- | --- | 0.012 | 0.012 |
broadcast_plus | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | --- | --- | --- | --- | 0.012 | 0.012 |
broadcast_power | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0028 | 0.005 | 0.0037 | 0.0059 | 0.012 | 0.012 |
broadcast_sub | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0025 | 0.0045 | 0.0022 | 0.0037 | 0.012 | 0.012 |
cbrt | {'data': (1024, 1024)} | 0.9478 | 0.6315 | 0.0332 | 0.0239 | 2097.1521 | 2097.1521 |
cbrt | {'data': (10000, 1)} | 0.0154 | 0.0154 | 0.01 | 0.0179 | 20.0 | 20.0 |
cbrt | {'data': (10000, 100)} | 0.9075 | 0.6018 | 0.0308 | 0.0233 | 2000.0 | 2000.0 |
ceil | {'data': (1024, 1024)} | 0.1269 | 0.0813 | --- | --- | 2097.1521 | 2097.1521 |
ceil | {'data': (10000, 1)} | 0.0213 | 0.0243 | --- | --- | 20.0 | 20.0 |
ceil | {'data': (10000, 100)} | 0.0767 | 0.0781 | --- | --- | 2000.0 | 2000.0 |
cos | {'data': (1024, 1024)} | 0.3447 | 0.2635 | 0.3667 | 0.2383 | 2097.1521 | 2097.1521 |
cos | {'data': (10000, 1)} | 0.0093 | 0.0097 | 0.0096 | 0.01 | 20.0 | 20.0 |
cos | {'data': (10000, 100)} | 0.3281 | 0.2539 | 0.3493 | 0.2285 | 2000.0 | 2000.0 |
cosh | {'data': (1024, 1024)} | 0.6565 | 0.654 | 1.1456 | 1.149 | 2097.1521 | 2097.1521 |
cosh | {'data': (10000, 1)} | 0.0167 | 0.0116 | 0.0154 | 0.0156 | 20.0 | 20.0 |
cosh | {'data': (10000, 100)} | 0.6261 | 0.6237 | 1.0902 | 1.0988 | 2000.0 | 2000.0 |
degrees | {'data': (1024, 1024)} | 0.0211 | 0.023 | 0.0176 | 0.0205 | 2097.1521 | 2097.1521 |
degrees | {'data': (10000, 1)} | 0.0038 | 0.011 | 0.0067 | 0.0128 | 20.0 | 20.0 |
degrees | {'data': (10000, 100)} | 0.0204 | 0.0223 | 0.0143 | 0.0164 | 2000.0 | 2000.0 |
dot | {'lhs': (1024, 1024), 'rhs': (1024, 1024)} | 1.0055 | 0.8465 | 2.3081 | 1.9458 | 4194.3042 | 2097.1521 |
dot | {'lhs': (1000, 10), 'rhs': (1000, 10), 'transpose_b': True} | 0.1167 | 0.0817 | 0.1221 | 0.1437 | 2000.0 | 2000.0 |
dot | {'lhs': (1000, 1), 'rhs': (100, 1000), 'transpose_b': True, 'transpose_a': True} | 0.0102 | 0.0161 | 0.0238 | 0.0429 | 0.2 | 0.2 |
elemwise_add | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0214 | 0.0187 | --- | --- | 0.012 | 0.024 |
elemwise_div | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0025 | 0.0033 | --- | --- | 0.012 | 0.012 |
elemwise_mul | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0026 | 0.0034 | --- | --- | 0.012 | 0.012 |
elemwise_sub | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0026 | 0.0035 | --- | --- | 0.012 | 0.012 |
erf | {'data': (1024, 1024)} | 0.6751 | 0.5095 | 0.6064 | 0.4381 | 2097.1521 | 2097.1521 |
erf | {'data': (10000, 1)} | 0.0146 | 0.0128 | 0.0139 | 0.0125 | 20.0 | 20.0 |
erf | {'data': (10000, 100)} | 0.6425 | 0.5344 | 0.5882 | 0.4183 | 4000.0 | 2000.0 |
exp | {'data': (1024, 1024)} | 0.4367 | 0.2743 | --- | --- | 2097.1521 | 2097.1521 |
exp | {'data': (10000, 1)} | 0.0123 | 0.0103 | --- | --- | 20.0 | 20.0 |
exp | {'data': (10000, 100)} | 0.4143 | 0.2635 | --- | --- | 2000.0 | 2000.0 |
expm1 | {'data': (1024, 1024)} | 0.7395 | 0.7458 | 0.4674 | 0.4593 | 2097.1521 | 2097.1521 |
expm1 | {'data': (10000, 1)} | 0.0224 | 0.0139 | 0.0115 | 0.0118 | 20.0 | 20.0 |
expm1 | {'data': (10000, 100)} | 0.7062 | 0.7087 | 0.4415 | 0.4374 | 2000.0 | 2000.0 |
fix | {'data': (1024, 1024)} | 0.2035 | 0.2007 | --- | --- | 2097.1521 | 2097.1521 |
fix | {'data': (10000, 1)} | 0.0084 | 0.0425 | --- | --- | 20.0 | 20.0 |
fix | {'data': (10000, 100)} | 0.1926 | 0.2019 | --- | --- | 2000.0 | 4000.0 |
flatten | {'data': (1024, 1024)} | --- | --- | --- | --- | 2097.1521 | 2097.1521 |
flatten | {'data': (10000, 1)} | --- | --- | --- | --- | 20.0 | 20.0 |
flatten | {'data': (10000, 100)} | --- | --- | --- | --- | 2000.0 | 2000.0 |
floor | {'data': (1024, 1024)} | 0.0855 | 0.0771 | --- | --- | 2097.1521 | 2097.1521 |
floor | {'data': (10000, 1)} | 0.0231 | 0.0218 | --- | --- | 20.0 | 20.0 |
floor | {'data': (10000, 100)} | 0.0786 | 0.0743 | --- | --- | 2000.0 | 2000.0 |
gamma | {'data': (1024, 1024)} | 2.7952 | 2.7983 | 5.2844 | 5.1828 | 4194.3042 | 2097.1521 |
gamma | {'data': (10000, 1)} | 0.0419 | 0.0317 | 0.0561 | 0.0564 | 20.0 | 20.0 |
gamma | {'data': (10000, 100)} | 2.6624 | 2.2738 | 5.0535 | 4.1741 | 4000.0 | 2000.0 |
gammaln | {'data': (1024, 1024)} | 37.6309 | 17.0666 | 2.0739 | 2.5733 | 4194.3042 | 2097.1521 |
gammaln | {'data': (10000, 1)} | 0.3838 | 0.1885 | 0.0302 | 0.0325 | 20.0 | 20.0 |
gammaln | {'data': (10000, 100)} | 35.9658 | 19.2171 | 1.8351 | 1.9163 | 4000.0 | 2000.0 |
hard_sigmoid | {'alpha': 0.25, 'beta': 0.5, 'data': (1024, 1024)} | 0.0579 | 0.0663 | 0.0506 | 0.0825 | 2097.1521 | 2097.1521 |
hard_sigmoid | {'alpha': 0.25, 'beta': 0.5, 'data': (10000, 1)} | 0.0077 | 0.0076 | 0.0075 | 0.0079 | 20.0 | 20.0 |
hard_sigmoid | {'alpha': 0.25, 'beta': 0.5, 'data': (10000, 100)} | 0.0555 | 0.0632 | 0.0479 | 0.0792 | 2000.0 | 2000.0 |
identity | {'data': (1024, 1024)} | --- | --- | --- | --- | 2097.1521 | 2097.1521 |
identity | {'data': (10000, 1)} | --- | --- | --- | --- | 20.0 | 20.0 |
identity | {'data': (10000, 100)} | --- | --- | --- | --- | 2000.0 | 2000.0 |
log | {'data': (1024, 1024)} | 0.5515 | 0.5511 | 0.0288 | 0.0297 | 2097.1521 | 2097.1521 |
log | {'data': (10000, 1)} | 0.0137 | 0.0131 | 0.0074 | 0.015 | 20.0 | 20.0 |
log | {'data': (10000, 100)} | 0.5272 | 0.5276 | 0.0165 | 0.0178 | 2000.0 | 2000.0 |
log10 | {'data': (1024, 1024)} | 0.6513 | 0.6611 | 0.0368 | 0.0321 | 2097.1521 | 2097.1521 |
log10 | {'data': (10000, 1)} | 0.0147 | 0.0143 | 0.0072 | 0.0126 | 20.0 | 20.0 |
log10 | {'data': (10000, 100)} | 0.6318 | 0.6224 | 0.0311 | 0.0245 | 2000.0 | 2000.0 |
log1p | {'data': (1024, 1024)} | 0.6822 | 0.7117 | 0.0354 | 0.0337 | 2097.1521 | 2097.1521 |
log1p | {'data': (10000, 1)} | 0.0131 | 0.0114 | 0.0076 | 0.0128 | 20.0 | 20.0 |
log1p | {'data': (10000, 100)} | 0.6593 | 0.6774 | 0.0272 | 0.028 | 2000.0 | 2000.0 |
log2 | {'data': (1024, 1024)} | 0.5325 | 0.378 | 0.0306 | 0.0283 | 2097.1521 | 2097.1521 |
log2 | {'data': (10000, 1)} | 0.0141 | 0.0121 | 0.0073 | 0.0149 | 20.0 | 20.0 |
log2 | {'data': (10000, 100)} | 0.5092 | 0.3642 | 0.0181 | 0.0132 | 2000.0 | 2000.0 |
log_softmax | {'temperature': 0.5, 'axis': -1, 'data': (1024, 1024)} | 1.176 | 1.1758 | 0.5789 | 0.5908 | 4194.3042 | 2097.1521 |
log_softmax | {'temperature': 0.5, 'axis': -1, 'data': (10000, 1)} | 0.0213 | 0.0172 | 0.0184 | 0.0139 | 20.0 | 20.0 |
log_softmax | {'temperature': 0.5, 'axis': -1, 'data': (10000, 100)} | 0.9789 | 1.1135 | 0.4785 | 0.5578 | 2000.0 | 2000.0 |
logical_not | {'data': (1024, 1024)} | 0.0342 | 0.0328 | --- | --- | 2097.1521 | 2097.1521 |
logical_not | {'data': (10000, 1)} | 0.0048 | 0.0175 | --- | --- | 40.0 | 20.0 |
logical_not | {'data': (10000, 100)} | 0.056 | 0.0314 | --- | --- | 4000.0 | 2000.0 |
make_loss | {'data': (1024, 1024)} | 1.2697 | 1.1637 | --- | --- | 2097.1521 | 2097.1521 |
make_loss | {'data': (10000, 1)} | 0.0058 | 0.0051 | --- | --- | 20.0 | 20.0 |
make_loss | {'data': (10000, 100)} | 1.49 | 1.0948 | --- | --- | 4000.0 | 2000.0 |
max | {'axis': (), 'data': (1024, 1024)} | 19.1602 | 7.1825 | 1.8421 | 0.3114 | 0.002 | 0.002 |
max | {'axis': 0, 'data': (10000, 1)} | 0.218 | 0.1065 | 0.0256 | 0.0121 | 0.002 | 0.002 |
max | {'axis': (0, 1), 'data': (10000, 100)} | 18.2947 | 6.8747 | 1.7538 | 0.3819 | 0.002 | 0.002 |
max_axis | {'axis': (), 'data': (1024, 1024)} | 21.3432 | 7.3994 | --- | --- | 0.004 | 0.004 |
max_axis | {'axis': 0, 'data': (10000, 1)} | 0.1923 | 0.0794 | --- | --- | 0.002 | 0.002 |
max_axis | {'axis': (0, 1), 'data': (10000, 100)} | 19.08 | 7.1096 | --- | --- | 0.004 | 0.002 |
mean | {'axis': (), 'data': (1024, 1024)} | 20.3092 | 7.6116 | 1.7335 | 0.8835 | 0.004 | 0.002 |
mean | {'axis': 0, 'data': (10000, 1)} | 0.2564 | 0.1173 | 0.0272 | 0.0185 | 0.002 | 0.002 |
mean | {'axis': (0, 1), 'data': (10000, 100)} | 22.8492 | 7.2628 | 1.7059 | 0.7517 | 0.004 | 0.002 |
min | {'axis': (), 'data': (1024, 1024)} | 19.0906 | 7.3093 | 1.8435 | 0.4186 | 0.002 | 0.004 |
min | {'axis': 0, 'data': (10000, 1)} | 0.2145 | 0.0732 | 0.0256 | 0.0195 | 0.002 | 0.002 |
min | {'axis': (0, 1), 'data': (10000, 100)} | 18.2143 | 7.62 | 1.754 | 0.525 | 0.002 | 0.004 |
min_axis | {'axis': (), 'data': (1024, 1024)} | 19.0678 | 7.1708 | --- | --- | 0.002 | 0.004 |
min_axis | {'axis': 0, 'data': (10000, 1)} | 0.1914 | 0.0726 | --- | --- | 0.002 | 0.002 |
min_axis | {'axis': (0, 1), 'data': (10000, 100)} | 18.1892 | 7.2402 | --- | --- | 0.002 | 0.002 |
nanprod | {'axis': (), 'data': (1024, 1024)} | 18.9492 | 7.2374 | 1.8323 | 0.4292 | 0.002 | 0.002 |
nanprod | {'axis': 0, 'data': (10000, 1)} | 0.2108 | 0.1044 | 0.0252 | 0.014 | 0.002 | 0.002 |
nanprod | {'axis': (0, 1), 'data': (10000, 100)} | 18.0501 | 6.9665 | 1.7485 | 0.4629 | 0.002 | 0.004 |
nansum | {'axis': (), 'data': (1024, 1024)} | 19.3539 | 9.5764 | 0.9261 | 0.3318 | 0.002 | 0.004 |
nansum | {'axis': 0, 'data': (10000, 1)} | 0.218 | 0.1204 | 0.0165 | 0.0128 | 0.002 | 0.002 |
nansum | {'axis': (0, 1), 'data': (10000, 100)} | 18.4616 | 8.982 | 0.8795 | 0.3825 | 0.002 | 0.004 |
negative | {'data': (1024, 1024)} | 0.0309 | 0.0313 | --- | --- | 2097.1521 | 2097.1521 |
negative | {'data': (10000, 1)} | 0.0038 | 0.0099 | --- | --- | 20.0 | 20.0 |
negative | {'data': (10000, 100)} | 0.0294 | 0.0307 | --- | --- | 2000.0 | 2000.0 |
ones_like | {'data': (1024, 1024)} | 0.0273 | 0.0279 | --- | --- | 2097.1521 | 2097.1521 |
ones_like | {'data': (10000, 1)} | 0.0031 | 0.0099 | --- | --- | 20.0 | 20.0 |
ones_like | {'data': (10000, 100)} | 0.0264 | 0.0278 | --- | --- | 2000.0 | 2000.0 |
prod | {'axis': (), 'data': (1024, 1024)} | 18.3052 | 7.3333 | 1.8268 | 0.6137 | 0.002 | 0.004 |
prod | {'axis': 0, 'data': (10000, 1)} | 0.2063 | 0.0969 | 0.0251 | 0.0165 | 0.002 | 0.002 |
prod | {'axis': (0, 1), 'data': (10000, 100)} | 17.4361 | 8.0385 | 1.746 | 0.6575 | 0.002 | 0.004 |
radians | {'data': (1024, 1024)} | 0.0229 | 0.0205 | 0.0192 | 0.0194 | 2097.1521 | 2097.1521 |
radians | {'data': (10000, 1)} | 0.0086 | 0.0071 | 0.0068 | 0.0095 | 20.0 | 20.0 |
radians | {'data': (10000, 100)} | 0.0222 | 0.0202 | 0.0155 | 0.0143 | 2000.0 | 2000.0 |
random_exponential | {'shape': (1024, 1024)} | 1.6658 | 1.1995 | --- | --- | 4194.3042 | 4194.3042 |
random_exponential | {'shape': (10000, 1)} | 0.026 | 0.0245 | --- | --- | 20.0 | 20.0 |
random_exponential | {'shape': (10000, 100)} | 1.762 | 1.1383 | --- | --- | 4000.0 | 2000.0 |
random_gamma | {'shape': (1024, 1024)} | 4.4205 | 5.1698 | --- | --- | 2097.1521 | 2097.1521 |
random_gamma | {'shape': (10000, 1)} | 0.058 | 0.0654 | --- | --- | 20.0 | 20.0 |
random_gamma | {'shape': (10000, 100)} | 4.2231 | 4.949 | --- | --- | 4000.0 | 4000.0 |
random_generalized_negative_binomial | {'shape': (1024, 1024)} | 6.2619 | 7.1031 | --- | --- | 4194.3042 | 4194.3042 |
random_generalized_negative_binomial | {'shape': (10000, 1)} | 0.0681 | 0.0883 | --- | --- | 20.0 | 20.0 |
random_generalized_negative_binomial | {'shape': (10000, 100)} | 5.11 | 6.7376 | --- | --- | 2000.0 | 4000.0 |
random_negative_binomial | {'shape': (1024, 1024), 'k': 1, 'p': 1} | 5.9661 | 4.019 | --- | --- | 2097.1521 | 2097.1521 |
random_negative_binomial | {'shape': (10000, 1), 'k': 1, 'p': 1} | 0.0731 | 0.0558 | --- | --- | 20.0 | 20.0 |
random_negative_binomial | {'shape': (10000, 100), 'k': 1, 'p': 1} | 5.7027 | 3.9245 | --- | --- | 4000.0 | 2000.0 |
random_normal | {'shape': (1024, 1024)} | 2.1622 | 1.8507 | --- | --- | 2097.1521 | 2097.1521 |
random_normal | {'shape': (10000, 1)} | 0.0295 | 0.034 | --- | --- | 20.0 | 20.0 |
random_normal | {'shape': (10000, 100)} | 2.0216 | 1.7918 | --- | --- | 2000.0 | 4000.0 |
random_poisson | {'shape': (1024, 1024)} | 1.8634 | 1.6017 | --- | --- | 2097.1521 | 4194.3042 |
random_poisson | {'shape': (10000, 1)} | 0.0265 | 0.0258 | --- | --- | 20.0 | 40.0 |
random_poisson | {'shape': (10000, 100)} | 1.7835 | 1.5337 | --- | --- | 2000.0 | 4000.0 |
random_randint | {'low': 0, 'shape': (1024, 1024), 'high': 5} | 1.0431 | 0.8278 | --- | --- | 4194.3042 | 2097.1521 |
random_randint | {'low': 0, 'shape': (10000, 1), 'high': 5} | 0.0183 | 0.0167 | --- | --- | 20.0 | 20.0 |
random_randint | {'low': 0, 'shape': (10000, 100), 'high': 5} | 1.0256 | 0.7907 | --- | --- | 4000.0 | 2000.0 |
random_uniform | {'low': 0, 'shape': (1024, 1024), 'high': 5} | 0.5809 | 0.4009 | --- | --- | 2097.1521 | 4194.3042 |
random_uniform | {'low': 0, 'shape': (10000, 1), 'high': 5} | 0.0128 | 0.0124 | --- | --- | 20.0 | 20.0 |
random_uniform | {'low': 0, 'shape': (10000, 100), 'high': 5} | 0.5552 | 0.4814 | --- | --- | 2000.0 | 2000.0 |
rcbrt | {'data': (1024, 1024)} | 1.0 | 0.6241 | 1.0904 | 0.6415 | 4194.3042 | 2097.1521 |
rcbrt | {'data': (10000, 1)} | 0.0148 | 0.0148 | 0.0152 | 0.0142 | 20.0 | 20.0 |
rcbrt | {'data': (10000, 100)} | 0.9322 | 0.5979 | 1.0092 | 0.613 | 4000.0 | 2000.0 |
reciprocal | {'data': (1024, 1024)} | 0.0313 | 0.0214 | 0.0566 | 0.0312 | 2097.1521 | 2097.1521 |
reciprocal | {'data': (10000, 1)} | 0.0043 | 0.0134 | 0.007 | 0.0159 | 20.0 | 20.0 |
reciprocal | {'data': (10000, 100)} | 0.0302 | 0.0213 | 0.0296 | 0.0155 | 2000.0 | 2000.0 |
relu | {'data': (1024, 1024)} | 0.022 | 0.0353 | 0.0225 | 0.037 | 2097.1521 | 2097.1521 |
relu | {'data': (10000, 1)} | 0.0036 | 0.0144 | 0.0068 | 0.0187 | 20.0 | 20.0 |
relu | {'data': (10000, 100)} | 0.0213 | 0.034 | 0.0175 | 0.0345 | 2000.0 | 2000.0 |
rint | {'data': (1024, 1024)} | 0.2766 | 0.1615 | --- | --- | 2097.1521 | 2097.1521 |
rint | {'data': (10000, 1)} | 0.0078 | 0.0483 | --- | --- | 20.0 | 20.0 |
rint | {'data': (10000, 100)} | 0.158 | 0.1544 | --- | --- | 2000.0 | 2000.0 |
round | {'data': (1024, 1024)} | 0.2226 | 0.1733 | --- | --- | 2097.1521 | 2097.1521 |
round | {'data': (10000, 1)} | 0.0082 | 0.0596 | --- | --- | 20.0 | 20.0 |
round | {'data': (10000, 100)} | 0.212 | 0.1669 | --- | --- | 2000.0 | 2000.0 |
rsqrt | {'data': (1024, 1024)} | 0.4565 | 0.3737 | 0.4722 | 0.3615 | 2097.1521 | 2097.1521 |
rsqrt | {'data': (10000, 1)} | 0.0115 | 0.0105 | 0.01 | 0.0108 | 20.0 | 20.0 |
rsqrt | {'data': (10000, 100)} | 0.4381 | 0.3602 | 0.4502 | 0.3447 | 2000.0 | 2000.0 |
sample_exponential | {'shape': (1024, 1024), 'lam': [1.0, 8.5]} | 3.6523 | 3.1681 | --- | --- | 8388.6084 | 8388.6084 |
sample_exponential | {'shape': (10000, 1), 'lam': [1.0, 8.5]} | 0.0404 | 0.0384 | --- | --- | 40.0 | 40.0 |
sample_exponential | {'shape': (10000, 100), 'lam': [1.0, 8.5]} | 2.7679 | 2.9097 | --- | --- | 4000.0 | 8000.0 |
sample_gamma | {'shape': (1024, 1024), 'alpha': [0.0, 2.5], 'beta': [1.0, 0.7]} | 13.1229 | 12.3537 | --- | --- | 8388.6084 | 8388.6084 |
sample_gamma | {'shape': (10000, 1), 'alpha': [0.0, 2.5], 'beta': [1.0, 0.7]} | 0.1378 | 0.1359 | --- | --- | 80.0 | 40.0 |
sample_gamma | {'shape': (10000, 100), 'alpha': [0.0, 2.5], 'beta': [1.0, 0.7]} | 12.4871 | 11.8388 | --- | --- | 8000.0 | 4000.0 |
sample_generalized_negative_binomial | {'shape': (1024, 1024), 'alpha': [0.0, 2.5], 'mu': [2.0, 2.5]} | 27.3342 | 27.56 | --- | --- | 4194.3042 | 8388.6084 |
sample_generalized_negative_binomial | {'shape': (10000, 1), 'alpha': [0.0, 2.5], 'mu': [2.0, 2.5]} | 0.2789 | 0.2808 | --- | --- | 40.0 | 40.0 |
sample_generalized_negative_binomial | {'shape': (10000, 100), 'alpha': [0.0, 2.5], 'mu': [2.0, 2.5]} | 26.0722 | 25.7926 | --- | --- | 4000.0 | 4000.0 |
sample_negative_binomial | {'shape': (1024, 1024), 'k': [20, 49], 'p': [0.4, 0.77]} | 335.0974 | 270.0913 | --- | --- | 8388.6084 | 4194.3042 |
sample_negative_binomial | {'shape': (10000, 1), 'k': [20, 49], 'p': [0.4, 0.77]} | 3.1917 | 2.5851 | --- | --- | 80.0 | 80.0 |
sample_negative_binomial | {'shape': (10000, 100), 'k': [20, 49], 'p': [0.4, 0.77]} | 318.5452 | 257.783 | --- | --- | 8000.0 | 8000.0 |
sample_normal | {'shape': (1024, 1024), 'sigma': [1.0, 3.7], 'mu': [2.0, 2.5]} | 2.8849 | 3.7052 | --- | --- | 4194.3042 | 8388.6084 |
sample_normal | {'shape': (10000, 1), 'sigma': [1.0, 3.7], 'mu': [2.0, 2.5]} | 0.0461 | 0.0449 | --- | --- | 40.0 | 40.0 |
sample_normal | {'shape': (10000, 100), 'sigma': [1.0, 3.7], 'mu': [2.0, 2.5]} | 4.0236 | 3.5366 | --- | --- | 8000.0 | 8000.0 |
sample_poisson | {'shape': (1024, 1024), 'lam': [1.0, 8.5]} | 5.1852 | 5.1714 | --- | --- | 4194.3042 | 8388.6084 |
sample_poisson | {'shape': (10000, 1), 'lam': [1.0, 8.5]} | 0.0598 | 0.0661 | --- | --- | 40.0 | 40.0 |
sample_poisson | {'shape': (10000, 100), 'lam': [1.0, 8.5]} | 4.9439 | 4.9404 | --- | --- | 4000.0 | 8000.0 |
sample_uniform | {'low': [0.0, 2.5], 'shape': (1024, 1024), 'high': [1.0, 3.7]} | 0.8399 | 0.7192 | --- | --- | 4194.3042 | 4194.3042 |
sample_uniform | {'low': [0.0, 2.5], 'shape': (10000, 1), 'high': [1.0, 3.7]} | 0.0184 | 0.0155 | --- | --- | 40.0 | 40.0 |
sample_uniform | {'low': [0.0, 2.5], 'shape': (10000, 100), 'high': [1.0, 3.7]} | 0.8011 | 0.692 | --- | --- | 4000.0 | 4000.0 |
shuffle | {'data': (1024, 1024)} | 1.5753 | 1.4351 | --- | --- | 4194.3042 | 2097.1521 |
shuffle | {'data': (10000, 1)} | 0.2571 | 0.2901 | --- | --- | 20.0 | 20.0 |
shuffle | {'data': (10000, 100)} | 1.8581 | 1.9192 | --- | --- | 2000.0 | 2000.0 |
sigmoid | {'data': (1024, 1024)} | 0.4888 | 0.6225 | 0.0278 | 0.0293 | 4194.3042 | 2097.1521 |
sigmoid | {'data': (10000, 1)} | 0.0148 | 0.0132 | 0.0108 | 0.0184 | 20.0 | 20.0 |
sigmoid | {'data': (10000, 100)} | 0.4766 | 0.4633 | 0.0192 | 0.0179 | 2000.0 | 2000.0 |
sign | {'data': (1024, 1024)} | 0.1648 | 0.1591 | 0.0183 | 0.0176 | 2097.1521 | 2097.1521 |
sign | {'data': (10000, 1)} | 0.0322 | 0.0497 | 0.0066 | 0.0129 | 20.0 | 20.0 |
sign | {'data': (10000, 100)} | 0.158 | 0.1531 | 0.0159 | 0.0123 | 2000.0 | 2000.0 |
sin | {'data': (1024, 1024)} | 0.3474 | 0.3467 | 0.349 | 0.3563 | 2097.1521 | 2097.1521 |
sin | {'data': (10000, 1)} | 0.0083 | 0.0108 | 0.0077 | 0.0106 | 20.0 | 20.0 |
sin | {'data': (10000, 100)} | 0.3324 | 0.3732 | 0.3335 | 0.4071 | 2000.0 | 2000.0 |
sinh | {'data': (1024, 1024)} | 1.1067 | 1.1262 | 0.6819 | 0.6829 | 2097.1521 | 2097.1521 |
sinh | {'data': (10000, 1)} | 0.0191 | 0.0165 | 0.0124 | 0.0134 | 20.0 | 20.0 |
sinh | {'data': (10000, 100)} | 1.0553 | 1.065 | 0.6514 | 0.6535 | 2000.0 | 2000.0 |
size_array | {'data': (1024, 1024)} | 0.0015 | 0.0026 | --- | --- | 0.004 | 0.004 |
size_array | {'data': (10000, 1)} | 0.0016 | 0.0026 | --- | --- | 0.004 | 0.008 |
size_array | {'data': (10000, 100)} | 0.0016 | 0.0026 | --- | --- | 0.004 | 0.004 |
softmax | {'temperature': 0.5, 'axis': -1, 'data': (1024, 1024)} | 1.2038 | 1.2071 | 0.0934 | 0.0876 | 4194.3042 | 2097.1521 |
softmax | {'temperature': 0.5, 'axis': -1, 'data': (10000, 1)} | 0.02 | 0.0168 | 0.015 | 0.0117 | 20.0 | 20.0 |
softmax | {'temperature': 0.5, 'axis': -1, 'data': (10000, 100)} | 1.1498 | 1.1432 | 0.096 | 0.0936 | 2000.0 | 2000.0 |
softsign | {'data': (1024, 1024)} | 0.0248 | 0.0259 | 0.0358 | 0.0348 | 2097.1521 | 2097.1521 |
softsign | {'data': (10000, 1)} | 0.0053 | 0.0154 | 0.0078 | 0.0171 | 20.0 | 20.0 |
softsign | {'data': (10000, 100)} | 0.0244 | 0.0252 | 0.0198 | 0.0193 | 2000.0 | 2000.0 |
sqrt | {'data': (1024, 1024)} | 0.4656 | 0.4432 | 0.0223 | 0.0221 | 2097.1521 | 2097.1521 |
sqrt | {'data': (10000, 1)} | 0.0099 | 0.0097 | 0.0088 | 0.0147 | 20.0 | 20.0 |
sqrt | {'data': (10000, 100)} | 0.4424 | 0.4254 | 0.0186 | 0.0172 | 2000.0 | 2000.0 |
square | {'data': (1024, 1024)} | 0.0219 | 0.0191 | 0.0274 | 0.0246 | 2097.1521 | 2097.1521 |
square | {'data': (10000, 1)} | 0.0037 | 0.0063 | 0.0069 | 0.0107 | 20.0 | 20.0 |
square | {'data': (10000, 100)} | 0.0222 | 0.0194 | 0.0172 | 0.015 | 2000.0 | 2000.0 |
stop_gradient | {'data': (1024, 1024)} | --- | --- | --- | --- | 4194.3042 | 2097.1521 |
stop_gradient | {'data': (10000, 1)} | --- | --- | --- | --- | 40.0 | 20.0 |
stop_gradient | {'data': (10000, 100)} | --- | --- | --- | --- | 4000.0 | 2000.0 |
sum | {'axis': (), 'data': (1024, 1024)} | 18.7897 | 7.5997 | 0.9073 | 0.4699 | 0.002 | 0.002 |
sum | {'axis': 0, 'data': (10000, 1)} | 0.2124 | 0.1119 | 0.0149 | 0.0131 | 0.002 | 0.004 |
sum | {'axis': (0, 1), 'data': (10000, 100)} | 17.9343 | 8.0291 | 0.863 | 0.4388 | 0.002 | 0.004 |
sum_axis | {'axis': (), 'data': (1024, 1024)} | 21.5143 | 7.6222 | --- | --- | 0.004 | 0.004 |
sum_axis | {'axis': 0, 'data': (10000, 1)} | 0.1886 | 0.0834 | --- | --- | 0.002 | 0.002 |
sum_axis | {'axis': (0, 1), 'data': (10000, 100)} | 19.2096 | 7.2101 | --- | --- | 0.004 | 0.002 |
tan | {'data': (1024, 1024)} | 0.7864 | 0.7897 | 0.033 | 0.0304 | 2097.1521 | 2097.1521 |
tan | {'data': (10000, 1)} | 0.014 | 0.0148 | 0.0097 | 0.0165 | 20.0 | 20.0 |
tan | {'data': (10000, 100)} | 0.7533 | 0.753 | 0.0314 | 0.0314 | 2000.0 | 2000.0 |
tanh | {'data': (1024, 1024)} | 1.0414 | 1.0349 | 0.0361 | 0.0312 | 2097.1521 | 2097.1521 |
tanh | {'data': (10000, 1)} | 0.0184 | 0.0142 | 0.0156 | 0.014 | 40.0 | 20.0 |
tanh | {'data': (10000, 100)} | 1.0187 | 0.9849 | 0.0523 | 0.0288 | 2000.0 | 2000.0 |
trunc | {'data': (1024, 1024)} | 0.2086 | 0.2099 | --- | --- | 2097.1521 | 2097.1521 |
trunc | {'data': (10000, 1)} | 0.0079 | 0.053 | --- | --- | 20.0 | 20.0 |
trunc | {'data': (10000, 100)} | 0.1992 | 0.2015 | --- | --- | 2000.0 | 2000.0 |
zeros_like | {'data': (1024, 1024)} | 0.0326 | 0.0336 | --- | --- | 2097.1521 | 2097.1521 |
zeros_like | {'data': (10000, 1)} | 0.0035 | 0.0048 | --- | --- | 20.0 | 20.0 |
zeros_like | {'data': (10000, 100)} | 0.0315 | 0.0329 | --- | --- | 2000.0 | 2000.0 |
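One way to read the table above is as a per-operator ratio of the 1.4.1 time to the 1.5.0 time, where a value above 1 means the newer release was faster for that input. The following is a minimal sketch of that arithmetic, assuming rows are copied as pipe-delimited text exactly as they appear here. The names `COLUMNS`, `parse_row`, and `speedup` are hypothetical helpers for illustration and are not part of MXNet or its benchmark tooling.

```python
from typing import Optional

# Hypothetical short names for the eight table columns, in order.
COLUMNS = (
    "operator", "inputs",
    "fwd_old", "fwd_new",   # avg forward time (ms): 1.4.1, 1.5.0
    "bwd_old", "bwd_new",   # avg backward time (ms): 1.4.1, 1.5.0
    "mem_old", "mem_new",   # max storage memory: 1.4.1, 1.5.0
)


def parse_row(line: str) -> dict:
    """Split one pipe-delimited table row into named fields.

    Numeric cells become floats; the '---' placeholder becomes None.
    """
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    if len(cells) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} cells, got {len(cells)}")
    row = dict(zip(COLUMNS, cells))
    for key in COLUMNS[2:]:
        row[key] = None if row[key] == "---" else float(row[key])
    return row


def speedup(old: Optional[float], new: Optional[float]) -> Optional[float]:
    """Return old / new, or None when either measurement is missing."""
    if old is None or new is None or new == 0:
        return None
    return old / new


# Row copied verbatim from the table above.
dot_row = parse_row(
    "dot | {'lhs': (1024, 1024), 'rhs': (1024, 1024)} "
    "| 1.0055 | 0.8465 | 2.3081 | 1.9458 | 4194.3042 | 2097.1521 |"
)
print("forward speedup:", speedup(dot_row["fwd_old"], dot_row["fwd_new"]))
print("backward speedup:", speedup(dot_row["bwd_old"], dot_row["bwd_new"]))
```

Ratios on the sub-0.01 ms rows are likely dominated by timer noise, so they are best treated as indicative rather than as evidence of a regression.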
Last active July 2, 2019 06:43
MXNet operator performance 1.4.1 and 1.5.RC2 - CPU (c5.18x) - MXNet-MKL
Operator | Inputs | Avg. Forward Time (ms) (1.4.1) | Avg. Forward Time (ms) (1.5.RC2) | Avg. Backward Time (ms) (1.4.1) | Avg. Backward Time (ms) (1.5.RC2) | Max Mem Usage (Storage) (Bytes) (1.4.1) | Max Mem Usage (Storage) (Bytes) (1.5.RC2) |
---|---|---|---|---|---|---|---|
BatchNorm | {'moving_mean': (3,), 'moving_var': (3,), 'beta': (3,), 'gamma': (3,), 'data': (32, 3, 256, 256)} | 1.1285 | 1.1327 | 2.3064 | 2.3057 | 12582.9238 | 12582.9238 |
BatchNorm | {'moving_mean': (3,), 'moving_var': (3,), 'beta': (3,), 'gamma': (3,), 'data': (32, 3, 10000, 10)} | 1.716 | 1.7203 | 3.5197 | 3.5286 | 19200.0117 | 19200.0117 |
BlockGrad | {'data': (1024, 1024)} | 0.0322 | 0.0329 | --- | --- | 2097.1521 | 2097.1521 |
BlockGrad | {'data': (10000, 1)} | 0.0203 | 0.0202 | --- | --- | 20.0 | 20.0 |
BlockGrad | {'data': (10000, 100)} | 0.0344 | 0.0344 | --- | --- | 2000.0 | 2000.0 |
Convolution | {'weight': (64, 3, 3), 'stride': (1,), 'layout': 'NCW', 'kernel': (3,), 'num_filter': 64, 'data': (32, 3, 256), 'bias': (64,), 'pad': (0,), 'dilate': (1,)} | 0.05 | 0.0515 | 0.0995 | 0.1014 | 1040.384 | 1040.384 |
Dropout | {'p': 0.5, 'mode': 'always', 'data': (32, 3, 256, 256)} | 5.3855 | 0.1185 | 0.11 | 0.0945 | 25165.8242 | 25165.8242 |
Dropout | {'p': 0.5, 'mode': 'always', 'data': (10000, 10)} | 0.0856 | 0.0589 | 0.0189 | 0.022 | 400.0 | 400.0 |
Flatten | {'data': (1024, 1024)} | 0.0321 | 0.0334 | --- | --- | 2097.1521 | 2097.1521 |
Flatten | {'data': (10000, 1)} | 0.0204 | 0.0216 | --- | --- | 20.0 | 20.0 |
Flatten | {'data': (10000, 100)} | 0.0342 | 0.0357 | --- | --- | 2000.0 | 2000.0 |
FullyConnected | {'bias': (64,), 'weight': (64, 196608), 'num_hidden': 64, 'flatten': True, 'data': (32, 3, 256, 256)} | 0.423 | 0.3097 | 0.3726 | 0.3474 | 4.096 | 4.096 |
FullyConnected | {'bias': (64,), 'weight': (64, 256), 'num_hidden': 64, 'flatten': False, 'data': (32, 3, 256, 256)} | 0.1327 | 0.1261 | 0.5494 | 0.2547 | 3145.728 | 3145.728 |
LeakyReLU | {'act_type': 'leaky', 'slope': 0.1, 'data': (1024, 1024)} | 0.0291 | 0.0302 | 0.0305 | 0.0318 | 2097.1521 | 2097.1521 |
LeakyReLU | {'act_type': 'leaky', 'slope': 0.1, 'data': (10000, 1)} | 0.0249 | 0.0256 | 0.0171 | 0.0183 | 20.0 | 20.0 |
LeakyReLU | {'act_type': 'leaky', 'slope': 0.1, 'data': (10000, 100)} | 0.0286 | 0.0292 | 0.03 | 0.031 | 2000.0 | 2000.0 |
LeakyReLU | {'act_type': 'elu', 'slope': 0.1, 'data': (1024, 1024)} | 0.0294 | 0.0307 | 0.0309 | 0.032 | 2097.1521 | 2097.1521 |
LeakyReLU | {'act_type': 'elu', 'slope': 0.1, 'data': (10000, 1)} | 0.018 | 0.0189 | 0.017 | 0.0181 | 20.0 | 20.0 |
LeakyReLU | {'act_type': 'elu', 'slope': 0.1, 'data': (10000, 100)} | 0.0291 | 0.0296 | 0.0311 | 0.0315 | 2000.0 | 2000.0 |
LeakyReLU | {'act_type': 'selu', 'data': (1024, 1024)} | 0.0319 | 0.0304 | 0.034 | 0.0322 | 2097.1521 | 2097.1521 |
LeakyReLU | {'act_type': 'selu', 'data': (10000, 1)} | 0.0183 | 0.0189 | 0.0172 | 0.0183 | 20.0 | 20.0 |
LeakyReLU | {'act_type': 'selu', 'data': (10000, 100)} | 0.0287 | 0.0298 | 0.0299 | 0.031 | 2000.0 | 2000.0 |
LeakyReLU | {'act_type': 'prelu', 'gamma': (1, 1024), 'data': (1024, 1024)} | 0.0305 | 0.0319 | 0.0935 | 0.0909 | 2097.1521 | 2097.1521 |
LeakyReLU | {'act_type': 'prelu', 'gamma': (1, 1), 'data': (10000, 1)} | 0.0201 | 0.0219 | 0.0435 | 0.0453 | 20.0 | 20.0 |
LeakyReLU | {'act_type': 'prelu', 'gamma': (1, 100), 'data': (10000, 100)} | 0.0298 | 0.0315 | 0.1041 | 0.0993 | 4000.0 | 2000.0 |
Pooling | {'pool_type': 'avg', 'stride': 1, 'kernel': 3, 'data': (32, 3, 256), 'global_pool': 0, 'pad': 1} | 0.0188 | 0.0202 | 0.0227 | 0.0242 | 49.152 | 49.152 |
abs | {'data': (1024, 1024)} | 0.029 | 0.0289 | 0.031 | 0.0307 | 2097.1521 | 2097.1521 |
abs | {'data': (10000, 1)} | 0.0178 | 0.0177 | 0.0171 | 0.0172 | 20.0 | 20.0 |
abs | {'data': (10000, 100)} | 0.0287 | 0.0281 | 0.0302 | 0.0305 | 2000.0 | 2000.0 |
arccos | {'data': (1024, 1024)} | 0.0292 | 0.0295 | 0.0345 | 0.0349 | 2097.1521 | 2097.1521 |
arccos | {'data': (10000, 1)} | 0.0181 | 0.0178 | 0.0175 | 0.0177 | 20.0 | 20.0 |
arccos | {'data': (10000, 100)} | 0.0291 | 0.0286 | 0.0331 | 0.0333 | 2000.0 | 2000.0 |
arccosh | {'data': (1024, 1024)} | 0.0313 | 0.0316 | 0.0339 | 0.0341 | 2097.1521 | 2097.1521 |
arccosh | {'data': (10000, 1)} | 0.0179 | 0.0182 | 0.0175 | 0.0177 | 20.0 | 20.0 |
arccosh | {'data': (10000, 100)} | 0.0309 | 0.0309 | 0.0323 | 0.0328 | 2000.0 | 2000.0 |
arcsin | {'data': (1024, 1024)} | 0.0287 | 0.0295 | 0.0333 | 0.0338 | 2097.1521 | 2097.1521 |
arcsin | {'data': (10000, 1)} | 0.0174 | 0.018 | 0.0171 | 0.0177 | 20.0 | 20.0 |
arcsin | {'data': (10000, 100)} | 0.0284 | 0.0289 | 0.0318 | 0.0326 | 2000.0 | 2000.0 |
arcsinh | {'data': (1024, 1024)} | 0.0303 | 0.0326 | 0.033 | 0.0339 | 2097.1521 | 4194.3042 |
arcsinh | {'data': (10000, 1)} | 0.0182 | 0.0197 | 0.0174 | 0.0182 | 20.0 | 20.0 |
arcsinh | {'data': (10000, 100)} | 0.0301 | 0.0315 | 0.0315 | 0.0325 | 2000.0 | 2000.0 |
arctan | {'data': (1024, 1024)} | 0.0308 | 0.0294 | 0.036 | 0.0325 | 2097.1521 | 2097.1521 |
arctan | {'data': (10000, 1)} | 0.0181 | 0.0179 | 0.0176 | 0.0173 | 20.0 | 20.0 |
arctan | {'data': (10000, 100)} | 0.0291 | 0.0289 | 0.0314 | 0.0314 | 2000.0 | 2000.0 |
arctanh | {'data': (1024, 1024)} | 0.0294 | 0.0297 | 0.0324 | 0.0323 | 2097.1521 | 2097.1521 |
arctanh | {'data': (10000, 1)} | 0.0178 | 0.0179 | 0.017 | 0.0172 | 20.0 | 20.0 |
arctanh | {'data': (10000, 100)} | 0.0288 | 0.0289 | 0.0312 | 0.0313 | 2000.0 | 2000.0 |
argmax_channel | {'data': (1024, 1024)} | 0.282 | 0.2803 | --- | --- | 2.048 | 2.048 |
argmax_channel | {'data': (10000, 1)} | 0.019 | 0.0189 | --- | --- | 20.0 | 20.0 |
argmax_channel | {'data': (10000, 100)} | 0.0449 | 0.0359 | --- | --- | 20.0 | 20.0 |
batch_dot | {'lhs': (32, 1024, 1024), 'rhs': (32, 1024, 1024)} | 4.6829 | 4.7 | 9.2628 | 9.2645 | 67108.8672 | 67108.8672 |
batch_dot | {'lhs': (32, 1000, 10), 'transpose_b': True, 'rhs': (32, 1000, 10)} | 0.2658 | 0.291 | 1.2259 | 1.2288 | 64000.0 | 64000.0 |
batch_dot | {'transpose_a': True, 'lhs': (32, 1000, 1), 'transpose_b': True, 'rhs': (32, 100, 1000)} | 0.0436 | 0.0459 | 0.0614 | 0.0643 | 6.4 | 6.4 |
broadcast_add | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0242 | 0.0197 | 0.0261 | 0.0203 | 0.012 | 0.024 |
broadcast_div | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0257 | 0.0184 | 0.0289 | 0.022 | 0.012 | 0.012 |
broadcast_equal | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0178 | 0.0185 | --- | --- | 0.012 | 0.012 |
broadcast_greater | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0237 | 0.0185 | --- | --- | 0.024 | 0.012 |
broadcast_greater_equal | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0233 | 0.0186 | --- | --- | 0.012 | 0.012 |
broadcast_hypot | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0206 | 0.0184 | 0.0227 | 0.0218 | 0.012 | 0.012 |
broadcast_lesser | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0235 | 0.0185 | --- | --- | 0.012 | 0.012 |
broadcast_lesser_equal | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0239 | 0.0199 | --- | --- | 0.012 | 0.012 |
broadcast_logical_and | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0238 | 0.0189 | --- | --- | 0.012 | 0.012 |
broadcast_logical_or | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.024 | 0.0184 | --- | --- | 0.012 | 0.012 |
broadcast_logical_xor | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0236 | 0.018 | --- | --- | 0.012 | 0.012 |
broadcast_maximum | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0253 | 0.0187 | 0.028 | 0.0214 | 0.012 | 0.012 |
broadcast_minimum | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0254 | 0.0187 | 0.0279 | 0.0214 | 0.012 | 0.012 |
broadcast_minus | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | --- | --- | --- | --- | 0.012 | 0.012 |
broadcast_mod | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0184 | 0.019 | 0.0212 | 0.0223 | 0.012 | 0.012 |
broadcast_mul | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0253 | 0.0187 | 0.0281 | 0.0216 | 0.012 | 0.012 |
broadcast_not_equal | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.018 | 0.0183 | --- | --- | 0.012 | 0.012 |
broadcast_plus | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | --- | --- | --- | --- | 0.012 | 0.012 |
broadcast_power | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0184 | 0.0189 | 0.0217 | 0.0218 | 0.012 | 0.012 |
broadcast_sub | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0252 | 0.0187 | 0.0273 | 0.0206 | 0.012 | 0.012 |
cbrt | {'data': (1024, 1024)} | 0.0288 | 0.0289 | 0.0324 | 0.0325 | 2097.1521 | 2097.1521 |
cbrt | {'data': (10000, 1)} | 0.0176 | 0.0181 | 0.0171 | 0.0167 | 20.0 | 20.0 |
cbrt | {'data': (10000, 100)} | 0.0283 | 0.0286 | 0.0314 | 0.0309 | 2000.0 | 2000.0 |
ceil | {'data': (1024, 1024)} | 0.0292 | 0.0292 | --- | --- | 2097.1521 | 2097.1521 |
ceil | {'data': (10000, 1)} | 0.0178 | 0.0177 | --- | --- | 20.0 | 20.0 |
ceil | {'data': (10000, 100)} | 0.0285 | 0.0283 | --- | --- | 2000.0 | 2000.0 |
cos | {'data': (1024, 1024)} | 0.0291 | 0.0311 | 0.0306 | 0.0323 | 2097.1521 | 2097.1521 |
cos | {'data': (10000, 1)} | 0.0176 | 0.0196 | 0.0163 | 0.0181 | 20.0 | 20.0 |
cos | {'data': (10000, 100)} | 0.0286 | 0.0302 | 0.03 | 0.0317 | 2000.0 | 2000.0 |
cosh | {'data': (1024, 1024)} | 0.0291 | 0.0294 | 0.0316 | 0.0313 | 2097.1521 | 2097.1521 |
cosh | {'data': (10000, 1)} | 0.0178 | 0.0179 | 0.0173 | 0.0171 | 20.0 | 20.0 |
cosh | {'data': (10000, 100)} | 0.0286 | 0.032 | 0.0308 | 0.0345 | 2000.0 | 2000.0 |
degrees | {'data': (1024, 1024)} | 0.0289 | 0.029 | 0.0256 | 0.0251 | 2097.1521 | 2097.1521 |
degrees | {'data': (10000, 1)} | 0.0176 | 0.0177 | 0.0172 | 0.0169 | 20.0 | 20.0 |
degrees | {'data': (10000, 100)} | 0.0286 | 0.0286 | 0.0245 | 0.0242 | 2000.0 | 2000.0 |
dot | {'lhs': (1024, 1024), 'rhs': (1024, 1024)} | 0.2165 | 0.2162 | 0.4085 | 0.4054 | 2097.1521 | 2097.1521 |
dot | {'lhs': (1000, 10), 'transpose_b': True, 'rhs': (1000, 10)} | 0.0308 | 0.0303 | 0.0604 | 0.0606 | 2000.0 | 2000.0 |
dot | {'transpose_a': True, 'lhs': (1000, 1), 'transpose_b': True, 'rhs': (100, 1000)} | 0.0421 | 0.041 | 0.0375 | 0.038 | 0.2 | 0.2 |
elemwise_add | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0186 | 0.0185 | --- | --- | 0.012 | 0.012 |
elemwise_div | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0188 | 0.0181 | --- | --- | 0.012 | 0.012 |
elemwise_mul | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0188 | 0.0181 | --- | --- | 0.012 | 0.012 |
elemwise_sub | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0186 | 0.0181 | --- | --- | 0.012 | 0.012 |
erf | {'data': (1024, 1024)} | 0.0287 | 0.0293 | 0.0307 | 0.0309 | 2097.1521 | 2097.1521 |
erf | {'data': (10000, 1)} | 0.017 | 0.0181 | 0.0164 | 0.0172 | 20.0 | 20.0 |
erf | {'data': (10000, 100)} | 0.0279 | 0.0286 | 0.0298 | 0.0306 | 2000.0 | 2000.0 |
exp | {'data': (1024, 1024)} | 0.0291 | 0.0294 | --- | --- | 2097.1521 | 2097.1521 |
exp | {'data': (10000, 1)} | 0.0178 | 0.0178 | --- | --- | 20.0 | 20.0 |
exp | {'data': (10000, 100)} | 0.0286 | 0.0286 | --- | --- | 2000.0 | 2000.0 |
expm1 | {'data': (1024, 1024)} | 0.0293 | 0.0294 | 0.0312 | 0.0305 | 2097.1521 | 2097.1521 |
expm1 | {'data': (10000, 1)} | 0.0177 | 0.018 | 0.0169 | 0.0171 | 20.0 | 20.0 |
expm1 | {'data': (10000, 100)} | 0.0287 | 0.0286 | 0.0303 | 0.0305 | 2000.0 | 2000.0 |
fix | {'data': (1024, 1024)} | 0.029 | 0.0299 | --- | --- | 4194.3042 | 2097.1521 |
fix | {'data': (10000, 1)} | 0.0177 | 0.0183 | --- | --- | 20.0 | 20.0 |
fix | {'data': (10000, 100)} | 0.0284 | 0.0292 | --- | --- | 2000.0 | 2000.0 |
flatten | {'data': (1024, 1024)} | --- | --- | --- | --- | 2097.1521 | 2097.1521 |
flatten | {'data': (10000, 1)} | --- | --- | --- | --- | 20.0 | 20.0 |
flatten | {'data': (10000, 100)} | --- | --- | --- | --- | 2000.0 | 2000.0 |
floor | {'data': (1024, 1024)} | 0.0289 | 0.0291 | --- | --- | 2097.1521 | 2097.1521 |
floor | {'data': (10000, 1)} | 0.0175 | 0.0183 | --- | --- | 20.0 | 20.0 |
floor | {'data': (10000, 100)} | 0.0282 | 0.0297 | --- | --- | 2000.0 | 2000.0 |
gamma | {'data': (1024, 1024)} | 0.0323 | 0.0343 | 0.0683 | 0.0683 | 2097.1521 | 2097.1521 |
gamma | {'data': (10000, 1)} | 0.0181 | 0.0186 | 0.0197 | 0.0199 | 20.0 | 20.0 |
gamma | {'data': (10000, 100)} | 0.0321 | 0.032 | 0.0663 | 0.0656 | 2000.0 | 2000.0 |
gammaln | {'data': (1024, 1024)} | 0.0365 | 0.0371 | 0.0589 | 0.0593 | 2097.1521 | 2097.1521 |
gammaln | {'data': (10000, 1)} | 0.0181 | 0.0182 | 0.0189 | 0.0191 | 20.0 | 20.0 |
gammaln | {'data': (10000, 100)} | 0.037 | 0.036 | 0.0575 | 0.057 | 2000.0 | 2000.0 |
hard_sigmoid | {'alpha': 0.25, 'beta': 0.5, 'data': (1024, 1024)} | 0.0285 | 0.0294 | 0.0319 | 0.0325 | 2097.1521 | 2097.1521 |
hard_sigmoid | {'alpha': 0.25, 'beta': 0.5, 'data': (10000, 1)} | 0.0173 | 0.0183 | 0.0166 | 0.0167 | 20.0 | 20.0 |
hard_sigmoid | {'alpha': 0.25, 'beta': 0.5, 'data': (10000, 100)} | 0.0277 | 0.0286 | 0.0308 | 0.0309 | 2000.0 | 2000.0 |
identity | {'data': (1024, 1024)} | --- | --- | --- | --- | 2097.1521 | 2097.1521 |
identity | {'data': (10000, 1)} | --- | --- | --- | --- | 20.0 | 20.0 |
identity | {'data': (10000, 100)} | --- | --- | --- | --- | 2000.0 | 2000.0 |
log | {'data': (1024, 1024)} | 0.0289 | 0.0291 | 0.0325 | 0.0324 | 2097.1521 | 2097.1521 |
log | {'data': (10000, 1)} | 0.0175 | 0.0178 | 0.0169 | 0.0172 | 20.0 | 20.0 |
log | {'data': (10000, 100)} | 0.0286 | 0.0285 | 0.031 | 0.0314 | 2000.0 | 2000.0 |
log10 | {'data': (1024, 1024)} | 0.0287 | 0.0294 | 0.0321 | 0.0326 | 2097.1521 | 2097.1521 |
log10 | {'data': (10000, 1)} | 0.0174 | 0.0179 | 0.0165 | 0.0172 | 20.0 | 20.0 |
log10 | {'data': (10000, 100)} | 0.0282 | 0.0284 | 0.0307 | 0.0313 | 2000.0 | 2000.0 |
log1p | {'data': (1024, 1024)} | 0.029 | 0.0296 | 0.0323 | 0.0328 | 2097.1521 | 2097.1521 |
log1p | {'data': (10000, 1)} | 0.0178 | 0.0181 | 0.017 | 0.0175 | 20.0 | 20.0 |
log1p | {'data': (10000, 100)} | 0.0287 | 0.0287 | 0.0309 | 0.0316 | 2000.0 | 2000.0 |
log2 | {'data': (1024, 1024)} | 0.029 | 0.0291 | 0.0327 | 0.0326 | 2097.1521 | 2097.1521 |
log2 | {'data': (10000, 1)} | 0.0176 | 0.0179 | 0.0171 | 0.0169 | 20.0 | 20.0 |
log2 | {'data': (10000, 100)} | 0.0285 | 0.0287 | 0.0313 | 0.0315 | 2000.0 | 2000.0 |
log_softmax | {'temperature': 0.5, 'axis': -1, 'data': (1024, 1024)} | 0.0395 | 0.0406 | 0.0406 | 0.0367 | 2097.1521 | 2097.1521 |
log_softmax | {'temperature': 0.5, 'axis': -1, 'data': (10000, 1)} | 0.0604 | 0.0521 | 0.0443 | 0.0367 | 20.0 | 20.0 |
log_softmax | {'temperature': 0.5, 'axis': -1, 'data': (10000, 100)} | 0.0852 | 0.0773 | 0.0605 | 0.0526 | 2000.0 | 2000.0 |
logical_not | {'data': (1024, 1024)} | 0.0292 | 0.0285 | --- | --- | 2097.1521 | 2097.1521 |
logical_not | {'data': (10000, 1)} | 0.0177 | 0.017 | --- | --- | 20.0 | 20.0 |
logical_not | {'data': (10000, 100)} | 0.0284 | 0.0288 | --- | --- | 2000.0 | 2000.0 |
make_loss | {'data': (1024, 1024)} | 0.0316 | 0.0326 | --- | --- | 2097.1521 | 2097.1521 |
make_loss | {'data': (10000, 1)} | 0.0197 | 0.02 | --- | --- | 20.0 | 20.0 |
make_loss | {'data': (10000, 100)} | 0.034 | 0.0341 | --- | --- | 2000.0 | 2000.0 |
max | {'axis': (), 'data': (1024, 1024)} | 0.0693 | 0.0643 | 0.0453 | 0.0973 | 0.002 | 0.002 |
max | {'axis': 0, 'data': (10000, 1)} | 0.0303 | 0.0307 | 0.02 | 0.0204 | 0.002 | 0.002 |
max | {'axis': (0, 1), 'data': (10000, 100)} | 0.068 | 0.0639 | 0.0444 | 0.0938 | 0.002 | 0.002 |
max_axis | {'axis': (), 'data': (1024, 1024)} | 0.0626 | 0.0545 | --- | --- | 0.002 | 0.002 |
max_axis | {'axis': 0, 'data': (10000, 1)} | 0.0311 | 0.0298 | --- | --- | 0.002 | 0.002 |
max_axis | {'axis': (0, 1), 'data': (10000, 100)} | 0.062 | 0.0539 | --- | --- | 0.002 | 0.002 |
mean | {'axis': (), 'data': (1024, 1024)} | 0.0856 | 0.077 | 0.0417 | 0.04 | 0.002 | 0.002 |
mean | {'axis': 0, 'data': (10000, 1)} | 0.0359 | 0.0351 | 0.0223 | 0.0236 | 0.002 | 0.002 |
mean | {'axis': (0, 1), 'data': (10000, 100)} | 0.0834 | 0.0756 | 0.0403 | 0.0394 | 0.004 | 0.002 |
min | {'axis': (), 'data': (1024, 1024)} | 0.0724 | 0.064 | 0.0461 | 0.0976 | 0.002 | 0.002 |
min | {'axis': 0, 'data': (10000, 1)} | 0.0325 | 0.0304 | 0.0209 | 0.0206 | 0.002 | 0.002 |
min | {'axis': (0, 1), 'data': (10000, 100)} | 0.0716 | 0.0633 | 0.0453 | 0.0943 | 0.002 | 0.002 |
min_axis | {'axis': (), 'data': (1024, 1024)} | 0.0622 | 0.0547 | --- | --- | 0.002 | 0.002 |
min_axis | {'axis': 0, 'data': (10000, 1)} | 0.0304 | 0.0298 | --- | --- | 0.002 | 0.002 |
min_axis | {'axis': (0, 1), 'data': (10000, 100)} | 0.0615 | 0.0542 | --- | --- | 0.002 | 0.002 |
nanprod | {'axis': (), 'data': (1024, 1024)} | 0.0726 | 0.0681 | 0.0503 | 0.098 | 0.002 | 0.002 |
nanprod | {'axis': 0, 'data': (10000, 1)} | 0.0315 | 0.0314 | 0.0209 | 0.0204 | 0.002 | 0.002 |
nanprod | {'axis': (0, 1), 'data': (10000, 100)} | 0.0714 | 0.0674 | 0.0491 | 0.0943 | 0.002 | 0.002 |
nansum | {'axis': (), 'data': (1024, 1024)} | 0.0764 | 0.0707 | 0.0362 | 0.0984 | 0.002 | 0.002 |
nansum | {'axis': 0, 'data': (10000, 1)} | 0.0341 | 0.0338 | 0.0203 | 0.0205 | 0.002 | 0.002 |
nansum | {'axis': (0, 1), 'data': (10000, 100)} | 0.0753 | 0.0699 | 0.0359 | 0.095 | 0.002 | 0.002 |
negative | {'data': (1024, 1024)} | 0.0285 | 0.0291 | --- | --- | 2097.1521 | 2097.1521 |
negative | {'data': (10000, 1)} | 0.0175 | 0.0174 | --- | --- | 20.0 | 20.0 |
negative | {'data': (10000, 100)} | 0.0281 | 0.0285 | --- | --- | 2000.0 | 2000.0 |
ones_like | {'data': (1024, 1024)} | 0.0248 | 0.0248 | --- | --- | 2097.1521 | 2097.1521 |
ones_like | {'data': (10000, 1)} | 0.0175 | 0.0174 | --- | --- | 20.0 | 20.0 |
ones_like | {'data': (10000, 100)} | 0.0246 | 0.0243 | --- | --- | 4000.0 | 2000.0 |
prod | {'axis': (), 'data': (1024, 1024)} | 0.0691 | 0.0651 | 0.0493 | 0.0985 | 0.002 | 0.002 |
prod | {'axis': 0, 'data': (10000, 1)} | 0.0307 | 0.0309 | 0.0207 | 0.0207 | 0.002 | 0.002 |
prod | {'axis': (0, 1), 'data': (10000, 100)} | 0.0679 | 0.0633 | 0.0493 | 0.0947 | 0.002 | 0.002 |
radians | {'data': (1024, 1024)} | 0.029 | 0.0289 | 0.0255 | 0.0252 | 2097.1521 | 2097.1521 |
radians | {'data': (10000, 1)} | 0.0176 | 0.0179 | 0.0171 | 0.0172 | 20.0 | 20.0 |
radians | {'data': (10000, 100)} | 0.0286 | 0.0286 | 0.0237 | 0.0243 | 2000.0 | 2000.0 |
random_exponential | {'shape': (1024, 1024)} | 18.9581 | 15.748 | --- | --- | 2097.1521 | 2097.1521 |
random_exponential | {'shape': (10000, 1)} | 0.1939 | 0.1881 | --- | --- | 20.0 | 20.0 |
random_exponential | {'shape': (10000, 100)} | 18.1409 | 15.0313 | --- | --- | 2000.0 | 2000.0 |
random_gamma | {'shape': (1024, 1024)} | 57.6804 | 46.4024 | --- | --- | 2097.1521 | 2097.1521 |
random_gamma | {'shape': (10000, 1)} | 0.5812 | 0.4907 | --- | --- | 20.0 | 20.0 |
random_gamma | {'shape': (10000, 100)} | 55.0833 | 44.321 | --- | --- | 2000.0 | 2000.0 |
random_generalized_negative_binomial | {'shape': (1024, 1024)} | 71.1868 | 63.0437 | --- | --- | 2097.1521 | 2097.1521 |
random_generalized_negative_binomial | {'shape': (10000, 1)} | 0.7127 | 0.6538 | --- | --- | 20.0 | 20.0 |
random_generalized_negative_binomial | {'shape': (10000, 100)} | 68.1477 | 60.159 | --- | --- | 2000.0 | 2000.0 |
random_negative_binomial | {'p': 1, 'k': 1, 'shape': (1024, 1024)} | 61.784 | 52.0455 | --- | --- | 2097.1521 | 2097.1521 |
random_negative_binomial | {'p': 1, 'k': 1, 'shape': (10000, 1)} | 0.62 | 0.548 | --- | --- | 20.0 | 20.0 |
random_negative_binomial | {'p': 1, 'k': 1, 'shape': (10000, 100)} | 59.2521 | 49.6821 | --- | --- | 2000.0 | 2000.0 |
random_normal | {'shape': (1024, 1024)} | 19.0541 | 16.3413 | --- | --- | 2097.1521 | 2097.1521 |
random_normal | {'shape': (10000, 1)} | 0.2101 | 0.1922 | --- | --- | 20.0 | 20.0 |
random_normal | {'shape': (10000, 100)} | 18.2038 | 15.5969 | --- | --- | 2000.0 | 2000.0 |
random_poisson | {'shape': (1024, 1024)} | 18.8666 | 15.4039 | --- | --- | 2097.1521 | 2097.1521 |
random_poisson | {'shape': (10000, 1)} | 0.2039 | 0.1787 | --- | --- | 20.0 | 20.0 |
random_poisson | {'shape': (10000, 100)} | 18.2745 | 14.9178 | --- | --- | 2000.0 | 2000.0 |
random_randint | {'high': 5, 'low': 0, 'shape': (1024, 1024)} | 10.889 | 9.045 | --- | --- | 2097.1521 | 2097.1521 |
random_randint | {'high': 5, 'low': 0, 'shape': (10000, 1)} | 0.1155 | 0.1178 | --- | --- | 20.0 | 20.0 |
random_randint | {'high': 5, 'low': 0, 'shape': (10000, 100)} | 10.4834 | 8.6332 | --- | --- | 2000.0 | 2000.0 |
random_uniform | {'high': 5, 'low': 0, 'shape': (1024, 1024)} | 6.141 | 3.1128 | --- | --- | 2097.1521 | 2097.1521 |
random_uniform | {'high': 5, 'low': 0, 'shape': (10000, 1)} | 0.0798 | 0.0632 | --- | --- | 20.0 | 20.0 |
random_uniform | {'high': 5, 'low': 0, 'shape': (10000, 100)} | 6.0672 | 2.9739 | --- | --- | 2000.0 | 2000.0 |
rcbrt | {'data': (1024, 1024)} | 0.0294 | 0.0298 | 0.0317 | 0.0319 | 2097.1521 | 2097.1521 |
rcbrt | {'data': (10000, 1)} | 0.0177 | 0.0179 | 0.0165 | 0.0172 | 20.0 | 20.0 |
rcbrt | {'data': (10000, 100)} | 0.0285 | 0.0286 | 0.0298 | 0.0311 | 2000.0 | 2000.0 |
reciprocal | {'data': (1024, 1024)} | 0.0285 | 0.0298 | 0.0323 | 0.0325 | 2097.1521 | 2097.1521 |
reciprocal | {'data': (10000, 1)} | 0.0174 | 0.0182 | 0.0166 | 0.0174 | 20.0 | 20.0 |
reciprocal | {'data': (10000, 100)} | 0.028 | 0.0284 | 0.0308 | 0.0314 | 2000.0 | 2000.0 |
relu | {'data': (1024, 1024)} | 0.0283 | 0.029 | 0.0306 | 0.0305 | 2097.1521 | 2097.1521 |
relu | {'data': (10000, 1)} | 0.0175 | 0.0177 | 0.0168 | 0.0171 | 20.0 | 20.0 |
relu | {'data': (10000, 100)} | 0.0279 | 0.0281 | 0.0296 | 0.0297 | 2000.0 | 2000.0 |
rint | {'data': (1024, 1024)} | 0.0292 | 0.0296 | --- | --- | 2097.1521 | 2097.1521 |
rint | {'data': (10000, 1)} | 0.0178 | 0.0186 | --- | --- | 20.0 | 20.0 |
rint | {'data': (10000, 100)} | 0.0287 | 0.0292 | --- | --- | 2000.0 | 2000.0 |
round | {'data': (1024, 1024)} | 0.0282 | 0.029 | --- | --- | 2097.1521 | 2097.1521 |
round | {'data': (10000, 1)} | 0.0172 | 0.0177 | --- | --- | 20.0 | 20.0 |
round | {'data': (10000, 100)} | 0.0276 | 0.0284 | --- | --- | 2000.0 | 2000.0 |
rsqrt | {'data': (1024, 1024)} | 0.0292 | 0.0314 | 0.0343 | 0.0357 | 2097.1521 | 2097.1521 |
rsqrt | {'data': (10000, 1)} | 0.0174 | 0.0201 | 0.0172 | 0.0185 | 20.0 | 20.0 |
rsqrt | {'data': (10000, 100)} | 0.029 | 0.0308 | 0.0328 | 0.034 | 2000.0 | 2000.0 |
sample_exponential | {'lam': [1.0, 8.5], 'shape': (1024, 1024)} | 0.1351 | 0.1445 | --- | --- | 4194.3042 | 4194.3042 |
sample_exponential | {'lam': [1.0, 8.5], 'shape': (10000, 1)} | 0.0752 | 0.0819 | --- | --- | 40.0 | 40.0 |
sample_exponential | {'lam': [1.0, 8.5], 'shape': (10000, 100)} | 0.1329 | 0.1421 | --- | --- | 4000.0 | 4000.0 |
sample_gamma | {'alpha': [0.0, 2.5], 'beta': [1.0, 0.7], 'shape': (1024, 1024)} | 0.3482 | 0.3107 | --- | --- | 4194.3042 | 4194.3042 |
sample_gamma | {'alpha': [0.0, 2.5], 'beta': [1.0, 0.7], 'shape': (10000, 1)} | 0.2858 | 0.2473 | --- | --- | 40.0 | 40.0 |
sample_gamma | {'alpha': [0.0, 2.5], 'beta': [1.0, 0.7], 'shape': (10000, 100)} | 0.346 | 0.3003 | --- | --- | 4000.0 | 4000.0 |
sample_generalized_negative_binomial | {'mu': [2.0, 2.5], 'alpha': [0.0, 2.5], 'shape': (1024, 1024)} | 1.0807 | 1.0976 | --- | --- | 4194.3042 | 4194.3042 |
sample_generalized_negative_binomial | {'mu': [2.0, 2.5], 'alpha': [0.0, 2.5], 'shape': (10000, 1)} | 0.5691 | 0.5931 | --- | --- | 40.0 | 40.0 |
sample_generalized_negative_binomial | {'mu': [2.0, 2.5], 'alpha': [0.0, 2.5], 'shape': (10000, 100)} | 0.9628 | 0.9685 | --- | --- | 8000.0 | 4000.0 |
sample_negative_binomial | {'p': [0.4, 0.77], 'k': [20, 49], 'shape': (1024, 1024)} | 1.3997 | 1.4196 | --- | --- | 4194.3042 | 4194.3042 |
sample_negative_binomial | {'p': [0.4, 0.77], 'k': [20, 49], 'shape': (10000, 1)} | 0.8096 | 0.8071 | --- | --- | 40.0 | 40.0 |
sample_negative_binomial | {'p': [0.4, 0.77], 'k': [20, 49], 'shape': (10000, 100)} | 1.3842 | 1.4067 | --- | --- | 4000.0 | 8000.0 |
sample_normal | {'mu': [2.0, 2.5], 'sigma': [1.0, 3.7], 'shape': (1024, 1024)} | 0.1451 | 0.1323 | --- | --- | 8388.6084 | 4194.3042 |
sample_normal | {'mu': [2.0, 2.5], 'sigma': [1.0, 3.7], 'shape': (10000, 1)} | 0.0538 | 0.0834 | --- | --- | 40.0 | 40.0 |
sample_normal | {'mu': [2.0, 2.5], 'sigma': [1.0, 3.7], 'shape': (10000, 100)} | 0.1429 | 0.1298 | --- | --- | 8000.0 | 4000.0 |
sample_poisson | {'lam': [1.0, 8.5], 'shape': (1024, 1024)} | 0.402 | 0.4752 | --- | --- | 4194.3042 | 4194.3042 |
sample_poisson | {'lam': [1.0, 8.5], 'shape': (10000, 1)} | 0.3301 | 0.4037 | --- | --- | 40.0 | 40.0 |
sample_poisson | {'lam': [1.0, 8.5], 'shape': (10000, 100)} | 0.4018 | 0.4745 | --- | --- | 8000.0 | 4000.0 |
sample_uniform | {'high': [1.0, 3.7], 'low': [0.0, 2.5], 'shape': (1024, 1024)} | 0.1466 | 0.1485 | --- | --- | 4194.3042 | 4194.3042 |
sample_uniform | {'high': [1.0, 3.7], 'low': [0.0, 2.5], 'shape': (10000, 1)} | 0.0683 | 0.0584 | --- | --- | 40.0 | 40.0 |
sample_uniform | {'high': [1.0, 3.7], 'low': [0.0, 2.5], 'shape': (10000, 100)} | 0.144 | 0.1426 | --- | --- | 4000.0 | 8000.0 |
shuffle | {'data': (1024, 1024)} | 0.0826 | 0.086 | --- | --- | 2097.1521 | 2097.1521 |
shuffle | {'data': (10000, 1)} | 0.4134 | 0.2593 | --- | --- | 100.0 | 60.0 |
shuffle | {'data': (10000, 100)} | 0.4204 | 0.2638 | --- | --- | 2000.0 | 2000.0 |
sigmoid | {'data': (1024, 1024)} | 0.029 | 0.0328 | 0.0309 | 0.0341 | 2097.1521 | 2097.1521 |
sigmoid | {'data': (10000, 1)} | 0.0177 | 0.0182 | 0.0171 | 0.0175 | 20.0 | 20.0 |
sigmoid | {'data': (10000, 100)} | 0.0285 | 0.0286 | 0.03 | 0.03 | 2000.0 | 2000.0 |
sign | {'data': (1024, 1024)} | 0.0291 | 0.029 | 0.025 | 0.0252 | 2097.1521 | 2097.1521 |
sign | {'data': (10000, 1)} | 0.0175 | 0.0179 | 0.0162 | 0.017 | 20.0 | 20.0 |
sign | {'data': (10000, 100)} | 0.0285 | 0.0285 | 0.0234 | 0.0243 | 2000.0 | 2000.0 |
sin | {'data': (1024, 1024)} | 0.0298 | 0.0301 | 0.032 | 0.0314 | 2097.1521 | 2097.1521 |
sin | {'data': (10000, 1)} | 0.0253 | 0.0184 | 0.0176 | 0.0172 | 20.0 | 20.0 |
sin | {'data': (10000, 100)} | 0.0289 | 0.0291 | 0.031 | 0.031 | 2000.0 | 2000.0 |
sinh | {'data': (1024, 1024)} | 0.0295 | 0.0314 | 0.0311 | 0.032 | 2097.1521 | 2097.1521 |
sinh | {'data': (10000, 1)} | 0.0177 | 0.0198 | 0.0169 | 0.0182 | 20.0 | 20.0 |
sinh | {'data': (10000, 100)} | 0.0286 | 0.0298 | 0.0303 | 0.0315 | 2000.0 | 2000.0 |
size_array | {'data': (1024, 1024)} | 0.0201 | 0.0159 | --- | --- | 0.004 | 0.004 |
size_array | {'data': (10000, 1)} | 0.0284 | 0.0228 | --- | --- | 0.004 | 0.004 |
size_array | {'data': (10000, 100)} | 0.0199 | 0.016 | --- | --- | 0.004 | 0.004 |
softmax | {'temperature': 0.5, 'axis': -1, 'data': (1024, 1024)} | 0.0431 | 0.0435 | 0.0448 | 0.0423 | 2097.1521 | 2097.1521 |
softmax | {'temperature': 0.5, 'axis': -1, 'data': (10000, 1)} | 0.0605 | 0.0531 | 0.0432 | 0.0362 | 20.0 | 20.0 |
softmax | {'temperature': 0.5, 'axis': -1, 'data': (10000, 100)} | 0.0856 | 0.0776 | 0.0606 | 0.0595 | 2000.0 | 2000.0 |
softsign | {'data': (1024, 1024)} | 0.0291 | 0.029 | 0.0315 | 0.0324 | 2097.1521 | 2097.1521 |
softsign | {'data': (10000, 1)} | 0.0179 | 0.0177 | 0.0161 | 0.0171 | 20.0 | 20.0 |
softsign | {'data': (10000, 100)} | 0.0285 | 0.0283 | 0.0301 | 0.0313 | 2000.0 | 2000.0 |
sqrt | {'data': (1024, 1024)} | 0.0293 | 0.0291 | 0.0337 | 0.0334 | 2097.1521 | 2097.1521 |
sqrt | {'data': (10000, 1)} | 0.0183 | 0.0179 | 0.0177 | 0.0174 | 20.0 | 20.0 |
sqrt | {'data': (10000, 100)} | 0.0289 | 0.0285 | 0.0323 | 0.0321 | 2000.0 | 2000.0 |
square | {'data': (1024, 1024)} | 0.0285 | 0.0292 | 0.0306 | 0.0309 | 2097.1521 | 2097.1521 |
square | {'data': (10000, 1)} | 0.0173 | 0.0177 | 0.0166 | 0.0174 | 20.0 | 20.0 |
square | {'data': (10000, 100)} | 0.0285 | 0.028 | 0.03 | 0.0303 | 2000.0 | 2000.0 |
stop_gradient | {'data': (1024, 1024)} | --- | --- | --- | --- | 2097.1521 | 2097.1521 |
stop_gradient | {'data': (10000, 1)} | --- | --- | --- | --- | 40.0 | 20.0 |
stop_gradient | {'data': (10000, 100)} | --- | --- | --- | --- | 2000.0 | 2000.0 |
sum | {'axis': (), 'data': (1024, 1024)} | 0.0836 | 0.0752 | 0.0328 | 0.0329 | 0.002 | 0.002 |
sum | {'axis': 0, 'data': (10000, 1)} | 0.0334 | 0.0344 | 0.0184 | 0.0197 | 0.002 | 0.002 |
sum | {'axis': (0, 1), 'data': (10000, 100)} | 0.0811 | 0.0737 | 0.0321 | 0.0319 | 0.002 | 0.002 |
sum_axis | {'axis': (), 'data': (1024, 1024)} | 0.0681 | 0.0616 | --- | --- | 0.002 | 0.002 |
sum_axis | {'axis': 0, 'data': (10000, 1)} | 0.033 | 0.0328 | --- | --- | 0.002 | 0.002 |
sum_axis | {'axis': (0, 1), 'data': (10000, 100)} | 0.0673 | 0.0611 | --- | --- | 0.002 | 0.002 |
tan | {'data': (1024, 1024)} | 0.0304 | 0.0303 | 0.031 | 0.0305 | 2097.1521 | 2097.1521 |
tan | {'data': (10000, 1)} | 0.0178 | 0.0177 | 0.0168 | 0.017 | 20.0 | 20.0 |
tan | {'data': (10000, 100)} | 0.0299 | 0.0296 | 0.0301 | 0.0301 | 2000.0 | 2000.0 |
tanh | {'data': (1024, 1024)} | 0.0292 | 0.0301 | 0.0311 | 0.031 | 2097.1521 | 2097.1521 |
tanh | {'data': (10000, 1)} | 0.0177 | 0.0255 | 0.017 | 0.0174 | 20.0 | 20.0 |
tanh | {'data': (10000, 100)} | 0.0285 | 0.029 | 0.0301 | 0.0302 | 2000.0 | 2000.0 |
trunc | {'data': (1024, 1024)} | 0.0292 | 0.0297 | --- | --- | 2097.1521 | 2097.1521 |
trunc | {'data': (10000, 1)} | 0.0179 | 0.0184 | --- | --- | 20.0 | 20.0 |
trunc | {'data': (10000, 100)} | 0.0288 | 0.0292 | --- | --- | 2000.0 | 2000.0 |
zeros_like | {'data': (1024, 1024)} | 0.025 | 0.0254 | --- | --- | 2097.1521 | 2097.1521 |
zeros_like | {'data': (10000, 1)} | 0.0179 | 0.0185 | --- | --- | 20.0 | 20.0 |
zeros_like | {'data': (10000, 100)} | 0.025 | 0.0252 | --- | --- | 4000.0 | 2000.0 |
Is MKL-DNN used to get the performance on CPU?
Yes, these numbers are using MXNet-MKL.
It's alarming how different these results are for many ops.
It would be nice to see a diff and diff % column. Can you supply an xls or csv for easy analysis?
Thanks. I'm doing this with an Excel sheet for now and will update the tool to do this automatically going forward.
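In the meantime, here is a minimal sketch of how a diff % column could be computed from the tables above, assuming they are saved as pipe-delimited text exactly as shown. The helper names (`parse_row`, `pct_diff`, `add_diff_columns`) are hypothetical and are not part of the benchmark tool.

```python
def parse_row(line):
    """Split one pipe-delimited table row into stripped cells."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def pct_diff(old, new):
    """Percent change from old to new; None if a value is missing ('---') or old is 0."""
    try:
        old_v, new_v = float(old), float(new)
    except ValueError:
        return None
    if old_v == 0:
        return None
    return (new_v - old_v) / old_v * 100.0


def add_diff_columns(lines):
    """Yield (operator, inputs, forward diff %, backward diff %) for each data row."""
    for line in lines:
        cells = parse_row(line)
        # Skip the header row, the separator row, and anything that is not an 8-column row.
        if len(cells) != 8 or cells[0] in ("Operator", "---"):
            continue
        op, inputs, fwd_old, fwd_new, bwd_old, bwd_new = cells[:6]
        yield op, inputs, pct_diff(fwd_old, fwd_new), pct_diff(bwd_old, bwd_new)


if __name__ == "__main__":
    sample = [
        "Dropout | {'p': 0.5, 'mode': 'always', 'data': (32, 3, 256, 256)} | 5.3855 | 0.1185 | 0.11 | 0.0945 | 25165.8242 | 25165.8242 |",
    ]
    for op, inputs, fwd, bwd in add_diff_columns(sample):
        print(op, inputs, fwd, bwd)
```

A negative value means the second version was faster. Rows with `---` in a timing column yield `None` for that diff rather than a number.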