@roywei
Created June 28, 2019 16:19
Epoch 1.4.1 Duration(ms) 1.5.0 Duration(ms)
0 230.757866 236.492851
1 222.984238 210.10698
2 201.988141 216.398906
3 180.265748 193.985564
4 186.75808 184.309184
5 175.55141 189.248856
6 176.789087 181.657551
7 175.327701 178.745348
8 170.667204 177.222914
9 171.072025 179.044087
10 165.677896 180.915839
11 167.697753 179.945776
12 169.017815 179.334626
13 171.944214 174.748444
14 171.18974 176.74482
15 170.85772 175.943712
16 169.797115 173.363115
17 171.359856 179.799984
18 165.101575 173.906772
19 166.644768 171.334257
20 165.398719 170.746719
21 163.872557 174.593472
22 163.192172 173.862411
23 164.590395 170.003541
24 165.034882 169.336788
25 168.192282 174.610201
26 168.798925 169.416318
27 166.190207 172.121464
28 168.176571 168.235358
29 161.458971 165.334522
30 164.072138 169.794838
31 162.390821 163.857784
32 165.762179 169.176509
33 165.757065 171.972283
34 165.252418 166.855528
35 162.706386 167.900561
36 162.505737 164.058693
37 159.717428 165.824952
38 166.977237 164.575574
39 165.665132 166.82324
40 165.222445 166.34205
41 160.153508 169.027454
42 162.431273 168.888793
43 157.583668 167.751243
44 162.019849 166.216257
45 162.513379 165.081609
46 159.829102 168.513828
47 161.873211 169.194472
48 161.379906 163.675929
49 160.033012 167.857614
Average of epochs 10-49 164.9510007 170.4421838 (-3%)
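
The summary row can be reproduced from the per-epoch numbers. The sketch below averages epochs 10-49 for each version and computes the relative change. It assumes the percentage is (1.4.1 - 1.5.0) / 1.4.1, which is consistent with the figures here but is not stated in the source; the variable and function names are illustrative only.

```python
from statistics import mean

# Per-epoch durations (ms) for epochs 10-49, copied from the table above.
MXNET_1_4_1 = [
    165.677896, 167.697753, 169.017815, 171.944214, 171.18974,
    170.85772, 169.797115, 171.359856, 165.101575, 166.644768,
    165.398719, 163.872557, 163.192172, 164.590395, 165.034882,
    168.192282, 168.798925, 166.190207, 168.176571, 161.458971,
    164.072138, 162.390821, 165.762179, 165.757065, 165.252418,
    162.706386, 162.505737, 159.717428, 166.977237, 165.665132,
    165.222445, 160.153508, 162.431273, 157.583668, 162.019849,
    162.513379, 159.829102, 161.873211, 161.379906, 160.033012,
]
MXNET_1_5_0 = [
    180.915839, 179.945776, 179.334626, 174.748444, 176.74482,
    175.943712, 173.363115, 179.799984, 173.906772, 171.334257,
    170.746719, 174.593472, 173.862411, 170.003541, 169.336788,
    174.610201, 169.416318, 172.121464, 168.235358, 165.334522,
    169.794838, 163.857784, 169.176509, 171.972283, 166.855528,
    167.900561, 164.058693, 165.824952, 164.575574, 166.82324,
    166.34205, 169.027454, 168.888793, 167.751243, 166.216257,
    165.081609, 168.513828, 169.194472, 163.675929, 167.857614,
]


def speed_improvement_pct(old_ms, new_ms):
    """Relative change in percent; positive means new_ms is faster.

    Assumed formula (not stated in the source): (old - new) / old.
    """
    return (old_ms - new_ms) / old_ms * 100.0


avg_old = mean(MXNET_1_4_1)
avg_new = mean(MXNET_1_5_0)
change = speed_improvement_pct(avg_old, avg_new)

print(f"1.4.1 avg: {avg_old:.3f} ms")
print(f"1.5.0 avg: {avg_new:.3f} ms")
print(f"change:    {change:.0f}%")
```

The first ten epochs are left out to match the range used in the summary row.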

roywei commented Jun 28, 2019

Note: in the speed improvement columns, a positive value means 1.5.0 is faster (speedup) and a negative value means 1.5.0 is slower (regression).

| Operator | Inputs | Avg Forward Time (ms) (1.4.1) | Avg Forward Time (ms) (1.5.0) | Forward speed improvement | Avg Backward Time (ms) (1.4.1) | Avg Backward Time (ms) (1.5.0) | Backward speed improvement |
| --- | --- | --- | --- | --- | --- | --- | --- |
| BatchNorm | {'beta': (3,), 'moving_mean': (3,), 'gamma': (3,), 'data': (32, 3, 256, 256), 'moving_var': (3,)} | 8.0009 | 7.9855 | 0% | 7.9907 | 8.014 | 0% |
| BatchNorm | {'beta': (3,), 'moving_mean': (3,), 'gamma': (3,), 'data': (32, 3, 10000, 10), 'moving_var': (3,)} | 14.4459 | 14.599 | -1% | 12.363 | 12.51 | -1% |
| Pooling | {'kernel': 3, 'data': (32, 3, 256), 'global_pool': 0, 'pad': 1, 'pool_type': 'avg', 'stride': 1} | 0.1805 | 0.1746 | 3% | 0.2323 | 0.2315 | 0% |
| FullyConnected | {'num_hidden': 64, 'flatten': True, 'weight': (64, 196608), 'bias': (64,), 'data': (32, 3, 256, 256)} | 1.7245 | 1.6666 | 3% | 4.664 | 4.5218 | 3% |
| FullyConnected | {'num_hidden': 64, 'flatten': False, 'weight': (64, 256), 'bias': (64,), 'data': (32, 3, 256, 256)} | 0.7191 | 0.4847 | 33% | 1.7348 | 1.8349 | -6% |
| batch_dot | {'lhs': (32, 1024, 1024), 'rhs': (32, 1024, 1024)} | 60.4961 | 56.387 | 7% | 72.327 | 70.249 | 3% |
| batch_dot | {'lhs': (32, 1000, 10), 'rhs': (32, 1000, 10), 'transpose_b': True} | 32.023 | 35.888 | -12% | 6.6535 | 6.3897 | 4% |
| batch_dot | {'lhs': (32, 1000, 1), 'rhs': (32, 100, 1000), 'transpose_b': True, 'transpose_a': True} | 0.207 | 0.6677 | -223% | 1.1905 | 1.2125 | -2% |
| dot | {'lhs': (1024, 1024), 'rhs': (1024, 1024)} | 1.0055 | 0.8465 | 16% | 2.3081 | 1.9458 | 16% |
| dot | {'lhs': (1000, 10), 'rhs': (1000, 10), 'transpose_b': True} | 0.1167 | 0.0817 | 30% | 0.1221 | 0.1437 | -18% |
| dot | {'lhs': (1000, 1), 'rhs': (100, 1000), 'transpose_b': True, 'transpose_a': True} | 0.0102 | 0.0161 | -58% | 0.0238 | 0.0429 | -80% |
| broadcast_mul | {'lhs': [(1024, 1024), (10000, 10), (10000, 1)], 'rhs': [(1024, 1024), (10000, 10), (10000, 1)]} | 0.0019 | 0.0047 | -147% | 0.0022 | 0.0052 | -136% |
| log_softmax | {'temperature': 0.5, 'axis': -1, 'data': (1024, 1024)} | 1.176 | 1.1758 | 0% | 0.5789 | 0.5908 | -2% |
| log_softmax | {'temperature': 0.5, 'axis': -1, 'data': (10000, 1)} | 0.0213 | 0.0172 | 19% | 0.0184 | 0.0139 | 24% |
| log_softmax | {'temperature': 0.5, 'axis': -1, 'data': (10000, 100)} | 0.9789 | 1.1135 | -14% | 0.4785 | 0.5578 | -17% |
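
The speed improvement values for the forward pass can be recomputed from the two timing columns. This sketch assumes the percentage is (1.4.1 time - 1.5.0 time) / 1.4.1 time, rounded to a whole percent, which matches every row here but is not spelled out in the source. The short row labels and the 10% regression threshold are illustrative choices, not part of the original benchmark.

```python
# (label, forward ms on 1.4.1, forward ms on 1.5.0), copied from the table.
# Labels are abbreviated by hand for readability.
FORWARD_TIMES = [
    ("BatchNorm (32,3,256,256)", 8.0009, 7.9855),
    ("BatchNorm (32,3,10000,10)", 14.4459, 14.599),
    ("Pooling avg", 0.1805, 0.1746),
    ("FullyConnected flatten=True", 1.7245, 1.6666),
    ("FullyConnected flatten=False", 0.7191, 0.4847),
    ("batch_dot (32,1024,1024)", 60.4961, 56.387),
    ("batch_dot transpose_b", 32.023, 35.888),
    ("batch_dot transpose_a+b", 0.207, 0.6677),
    ("dot (1024,1024)", 1.0055, 0.8465),
    ("dot transpose_b", 0.1167, 0.0817),
    ("dot transpose_a+b", 0.0102, 0.0161),
    ("broadcast_mul", 0.0019, 0.0047),
    ("log_softmax (1024,1024)", 1.176, 1.1758),
    ("log_softmax (10000,1)", 0.0213, 0.0172),
    ("log_softmax (10000,100)", 0.9789, 1.1135),
]

# Hypothetical cutoff for calling a slowdown a notable regression.
REGRESSION_THRESHOLD_PCT = -10.0


def speed_improvement_pct(old_ms, new_ms):
    """Positive means the new version is faster, negative means slower."""
    return (old_ms - new_ms) / old_ms * 100.0


# Collect rows whose forward pass slowed down by at least the threshold.
regressions = [
    (label, speed_improvement_pct(old, new))
    for label, old, new in FORWARD_TIMES
    if speed_improvement_pct(old, new) <= REGRESSION_THRESHOLD_PCT
]

for label, pct in regressions:
    print(f"{label}: {pct:.0f}%")
```

Several of the largest percentage swings come from operators that run in well under a millisecond, so the absolute differences are small even where the percentages are large.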