@codecircuit
Created October 19, 2016 14:51
Benchmark of a Matrix Multiplication with Mekong's automatic partitioning workflow
gpus N app_htod_time[s] app_dtoh_time[s] app_kernel_time[s] dependency_resolution_calculation_time[s] dependency_resolution_memcpy_time[s] linearization_time[s]
1 1024 0.00117856 0.000707669 0.298789 N/A N/A N/A
1 1024 0.00119194 0.000687675 0.310544 N/A N/A N/A
1 1024 0.001198 0.000679569 0.301593 N/A N/A N/A
1 1024 0.00177307 0.000818265 0.304178 N/A N/A N/A
1 1024 0.00178578 0.000770523 0.297574 N/A N/A N/A
1 12288 0.180836 0.0876262 351.563 N/A N/A N/A
1 12288 0.188598 0.0859708 351.569 N/A N/A N/A
1 12288 0.2054 0.0847952 351.524 N/A N/A N/A
1 12288 0.205829 0.0876807 351.641 N/A N/A N/A
1 12288 0.205945 0.0843384 351.55 N/A N/A N/A
1 16384 0.30739 0.124854 832.093 N/A N/A N/A
1 16384 0.308914 0.148667 832.137 N/A N/A N/A
1 16384 0.309542 0.150355 832.119 N/A N/A N/A
1 16384 0.314142 0.154277 832.162 N/A N/A N/A
1 16384 0.33887 0.152925 832.191 N/A N/A N/A
1 2048 0.00504232 0.00658214 1.80885 N/A N/A N/A
1 2048 0.00515761 0.00637629 1.79833 N/A N/A N/A
1 2048 0.00516392 0.00643581 1.76621 N/A N/A N/A
1 2048 0.00598976 0.00641998 1.82543 N/A N/A N/A
1 2048 0.00603625 0.00644166 1.77671 N/A N/A N/A
1 20480 0.460992 0.238697 1624.62 N/A N/A N/A
1 20480 0.464403 0.265974 1624.29 N/A N/A N/A
1 20480 0.47702 0.235815 1624.41 N/A N/A N/A
1 20480 1.38213 0.269189 1624.42 N/A N/A N/A
1 20480 1.39669 0.250381 1624.56 N/A N/A N/A
1 24576 0.655541 0.429884 2804.33 N/A N/A N/A
1 24576 0.657028 0.429112 2804.09 N/A N/A N/A
1 24576 0.657325 0.427222 2804.41 N/A N/A N/A
1 24576 2.0712 0.401655 2739.03 N/A N/A N/A
1 24576 2.07397 0.333131 2804.92 N/A N/A N/A
1 28672 N/A N/A N/A N/A N/A N/A
1 28672 N/A N/A N/A N/A N/A N/A
1 28672 N/A N/A N/A N/A N/A N/A
1 28672 N/A N/A N/A N/A N/A N/A
1 28672 N/A N/A N/A N/A N/A N/A
1 4096 0.0231579 0.00999118 13.1715 N/A N/A N/A
1 4096 0.0232468 0.00982912 13.163 N/A N/A N/A
1 4096 0.023328 0.00991858 13.1621 N/A N/A N/A
1 4096 0.0234837 0.009828 13.1503 N/A N/A N/A
1 4096 0.0235541 0.00984335 13.2036 N/A N/A N/A
1 8192 0.0784476 0.0372159 104.284 N/A N/A N/A
1 8192 0.0920494 0.038134 104.241 N/A N/A N/A
1 8192 0.0921484 0.0383662 104.223 N/A N/A N/A
1 8192 0.0924822 0.0389563 104.229 N/A N/A N/A
1 8192 0.092939 0.0385861 104.276 N/A N/A N/A
3 1024 0.00328311 0.00926315 0.112731 1.8791e-05 0 0.00835358
3 1024 0.00329086 0.00922859 0.112675 2.0921e-05 0 0.00833679
3 1024 0.00440994 0.00924595 0.11267 2.0884e-05 0 0.0082301
3 1024 0.00445893 0.009489 0.112683 1.7858e-05 0 0.00848686
3 1024 0.0072908 0.00907376 0.112673 2.0147e-05 0 0.00811212
3 12288 0.561703 0.176868 118.13 2.7916e-05 0 0.0913347
3 12288 0.654852 0.263932 118.119 3.4159e-05 0 0.0895775
3 12288 0.657381 0.255674 118.093 3.5371e-05 0 0.0979694
3 12288 0.833826 0.145477 118.046 4.7146e-05 0 0.0907535
3 12288 0.835135 0.227539 118.027 4.1051e-05 0 0.0903243
3 16384 0.824688 0.309549 278.873 3.8552e-05 0 0.123685
3 16384 1.1249 0.326178 278.747 4.0411e-05 0 0.121556
3 16384 1.15225 0.256605 278.873 3.8743e-05 0 0.121879
3 16384 1.1647 0.346013 278.732 3.91e-05 0 0.122295
3 16384 1.1786 0.339201 278.917 4.1596e-05 0 0.134053
3 2048 0.0167991 0.0194809 0.722691 2.3263e-05 0 0.016274
3 2048 0.0182448 0.0199675 0.744829 3.1988e-05 0 0.0177133
3 2048 0.0182918 0.0181698 0.714351 3.221e-05 0 0.0163193
3 2048 0.0233924 0.0182318 0.714997 2.5599e-05 0 0.0162061
3 2048 0.0282978 0.0181439 0.711991 3.1811e-05 0 0.0155482
3 20480 1.60845 0.48895 544.991 2.7212e-05 0 0.157349
3 20480 1.60874 0.397964 545.077 2.7745e-05 0 0.155793
3 20480 1.82533 0.660884 545.003 3.6109e-05 0 0.153028
3 20480 2.10579 0.346034 545.062 0.000132797 0 0.155545
3 20480 2.2901 0.347463 544.927 4.3593e-05 0 0.154901
3 24576 1.67869 0.488468 942.637 3.9924e-05 0 0.206598
3 24576 1.75217 0.479492 942.81 3.83e-05 0 0.199061
3 24576 2.3042 0.696774 942.718 2.8111e-05 0 0.193464
3 24576 2.31521 0.555205 942.651 2.8201e-05 0 0.19697
3 24576 2.46059 0.472228 942.6 4.0313e-05 0 0.192095
3 28672 2.23796 0.557628 1492.88 0.000120167 0 0.222137
3 28672 2.24296 0.555431 1492.77 0.00010871 0 0.226154
3 28672 2.24822 0.561127 1493.07 2.5955e-05 0 0.227551
3 28672 3.7768 0.569882 1492.53 9.7885e-05 0 0.236113
3 28672 4.4029 0.555564 1493.21 0.000121551 0 0.224804
3 4096 0.0531384 0.0372385 4.54067 2.889e-05 0 0.0307754
3 4096 0.0532432 0.0374889 4.56923 2.5956e-05 0 0.0311661
3 4096 0.0730919 0.0396416 4.52502 2.6071e-05 0 0.0330979
3 4096 0.0930817 0.0400355 4.56738 3.4291e-05 0 0.0332124
3 4096 0.093738 0.0374264 4.54704 2.5769e-05 0 0.0308935
3 8192 0.213381 0.155906 35.1825 2.6688e-05 0 0.060014
3 8192 0.250138 0.0930525 35.1869 2.4349e-05 0 0.0584394
3 8192 0.291396 0.149044 35.2114 2.754e-05 0 0.0588106
3 8192 0.370589 0.152634 35.152 2.6861e-05 0 0.0625358
3 8192 0.373011 0.122715 35.1778 3.9765e-05 0 0.0611953
4 1024 0.00424934 0.00798465 0.0813198 2.011e-05 0 0.00680675
4 1024 0.00500353 0.00853599 0.0815848 1.704e-05 0 0.00732984
4 1024 0.00539431 0.00775142 0.0816003 1.9872e-05 0 0.00650655
4 1024 0.00665654 0.00808147 0.0813291 2.1891e-05 0 0.00651826
4 1024 0.00829979 0.00822191 0.0817169 1.8329e-05 0 0.00677913
4 12288 0.631718 0.262403 88.7396 4.4024e-05 0 0.0699107
4 12288 0.813772 0.230718 88.719 2.7439e-05 0 0.0693203
4 12288 0.992681 0.130307 88.6013 3.3891e-05 0 0.071082
4 12288 0.993428 0.218056 88.7077 3.1979e-05 0 0.076728
4 12288 1.17092 0.208521 88.706 2.9883e-05 0 0.0772021
4 16384 1.12747 0.253658 209.896 2.9474e-05 0 0.0929773
4 16384 1.35109 0.270686 209.831 3.8084e-05 0 0.0895341
4 16384 1.42424 0.329146 208.832 3.8942e-05 0 0.0941216
4 16384 1.43119 0.217872 209.915 3.4299e-05 0 0.092969
4 16384 2.76237 0.302438 209.97 4.4246e-05 0 0.0983166
4 2048 0.0175523 0.0147529 0.589609 3.0612e-05 0 0.0123659
4 2048 0.0176086 0.0141757 0.58641 3.3362e-05 0 0.0121677
4 2048 0.0176143 0.0145419 0.58342 3.4418e-05 0 0.0124449
4 2048 0.0223678 0.0148413 0.588195 2.5134e-05 0 0.0124515
4 2048 0.0326701 0.0153402 0.585664 3.2062e-05 0 0.0121571
4 20480 2.0976 0.357787 409.589 2.7768e-05 0 0.117828
4 20480 2.15988 0.358864 409.63 3.4147e-05 0 0.115995
4 20480 2.17189 0.352906 409.649 3.637e-05 0 0.116119
4 20480 2.6963 0.352784 409.548 3.2969e-05 0 0.118445
4 20480 2.78637 0.445335 409.636 7.0237e-05 0 0.12109
4 24576 3.12427 0.520685 708.174 3.7485e-05 0 0.148618
4 24576 3.80379 0.436116 708.3 0.000100087 0 0.147245
4 24576 3.85687 0.437108 708.006 0.000112201 0 0.145439
4 24576 4.58185 0.437335 708.269 8.3601e-05 0 0.14578
4 24576 4.62506 0.414518 708.226 6.7973e-05 0 0.143999
4 28672 2.94256 0.569035 1123.76 9.8789e-05 0 0.170686
4 28672 3.10711 0.575125 1123.53 4.0118e-05 0 0.170874
4 28672 3.91941 0.576468 1123.85 8.0959e-05 0 0.173558
4 28672 4.0694 1.11628 1124.02 3.972e-05 0 0.171265
4 28672 5.38295 0.562529 1123.76 0.000112977 0 0.166384
4 4096 0.0749119 0.029967 3.44954 3.4427e-05 0 0.0234991
4 4096 0.0816261 0.0299385 3.45589 3.471e-05 0 0.0235342
4 4096 0.0904446 0.0297327 3.44028 3.3619e-05 0 0.0232289
4 4096 0.128435 0.0300389 3.43012 4.2979e-05 0 0.023315
4 4096 0.131614 0.031896 3.44401 3.175e-05 0 0.025098
4 8192 0.360992 0.0995545 26.3889 3.2557e-05 0 0.0461854
4 8192 0.442019 0.0998839 26.3889 4.3557e-05 0 0.0440068
4 8192 0.4432 0.0698227 26.4246 3.0847e-05 0 0.0459277
4 8192 0.445117 0.0723922 26.4163 3.0758e-05 0 0.0483383
4 8192 0.498962 0.122846 26.3866 3.2248e-05 0 0.0491055
5 1024 0.00626026 0.00711719 0.072746 1.9767e-05 0 0.00604534
5 1024 0.00629822 0.00714563 0.0724659 2.1621e-05 0 0.00606253
5 1024 0.00636468 0.00694546 0.072463 1.9969e-05 0 0.0058803
5 1024 0.0064734 0.00719031 0.0727505 1.7806e-05 0 0.0059734
5 1024 0.00733282 0.00709627 0.0716674 1.9791e-05 0 0.00586863
5 12288 0.966264 0.224839 71.2201 4.2827e-05 0 0.0583948
5 12288 0.975717 0.221451 71.1773 3.9983e-05 0 0.0554821
5 12288 0.975889 0.1438 71.2071 3.9109e-05 0 0.0581037
5 12288 1.14896 0.218581 71.2074 3.3951e-05 0 0.0598956
5 12288 1.14907 0.119704 71.165 3.3599e-05 0 0.0654929
5 16384 1.68516 0.326561 167.924 5.3951e-05 0 0.0803435
5 16384 1.71246 0.227562 167.921 2.8604e-05 0 0.0752803
5 16384 1.7667 0.229319 167.953 3.2901e-05 0 0.07822
5 16384 2.00706 0.238478 167.902 4.1727e-05 0 0.0759504
5 16384 2.02655 0.190534 167.985 3.1777e-05 0 0.0774029
5 2048 0.0218087 0.0126118 0.488012 2.5232e-05 0 0.0101371
5 2048 0.02669 0.0126952 0.484806 3.2503e-05 0 0.0101045
5 2048 0.0266934 0.013037 0.462031 3.131e-05 0 0.0105198
5 2048 0.031632 0.012919 0.479333 3.1838e-05 0 0.0102177
5 2048 0.0318694 0.0125402 0.46691 3.2788e-05 0 0.0102146
5 20480 2.58632 0.344877 327.741 3.2379e-05 0 0.0968003
5 20480 2.59351 0.318252 327.77 2.8883e-05 0 0.098235
5 20480 2.67112 0.310584 327.77 3.4275e-05 0 0.0932044
5 20480 2.67286 0.309997 327.778 3.4117e-05 0 0.0931279
5 20480 3.96178 0.35332 327.805 4.3926e-05 0 0.0954063
5 24576 2.84004 0.427904 568.146 8.8899e-05 0 0.12408
5 24576 3.58511 0.416318 568.18 0.00013624 0 0.122975
5 24576 4.40517 0.411551 568.122 9.9022e-05 0 0.11279
5 24576 5.13899 0.413256 568.148 0.000121056 0 0.117069
5 24576 5.21971 0.421656 568.14 6.2848e-05 0 0.119361
5 28672 3.69871 0.508519 898.196 0.000113927 0 0.134087
5 28672 4.83615 0.508001 898.271 4.3925e-05 0 0.13358
5 28672 5.16482 0.803332 898.345 3.3206e-05 0 0.134895
5 28672 5.91757 0.506895 898.164 4.1868e-05 0 0.133536
5 28672 5.94058 0.624703 898.439 5.8852e-05 0 0.13737
5 4096 0.0881631 0.02589 2.86295 3.5837e-05 0 0.0194658
5 4096 0.110979 0.0260996 2.8703 3.3729e-05 0 0.0195574
5 4096 0.150328 0.028534 2.84367 2.3779e-05 0 0.0216342
5 4096 0.167718 0.0282552 2.83265 3.6147e-05 0 0.021278
5 4096 0.194351 0.025488 2.85847 3.0074e-05 0 0.0189319
5 8192 0.355962 0.119344 21.3627 2.7211e-05 0 0.0373668
5 8192 0.408536 0.0725277 21.3434 3.7949e-05 0 0.0382299
5 8192 0.465773 0.0755287 21.3497 4.3627e-05 0 0.0396097
5 8192 0.520241 0.0610231 21.344 3.9839e-05 0 0.0368117
5 8192 0.60072 0.10109 21.3277 4.216e-05 0 0.0371998
6 1024 0.00868206 0.00693985 0.063555 1.8213e-05 0 0.00544578
6 1024 0.00951519 0.00664554 0.0635664 2.55e-05 0 0.00548882
6 1024 0.0102805 0.00711752 0.0635424 2.1641e-05 0 0.00602058
6 1024 0.0103384 0.0068385 0.0635665 2.2948e-05 0 0.00572505
6 1024 0.01305 0.00686724 0.0639296 2.1714e-05 0 0.0053011
6 12288 1.15126 0.130873 59.3305 2.3829e-05 0 0.0473259
6 12288 1.15758 0.1231 59.2722 2.2994e-05 0 0.0470939
6 12288 1.30832 0.202558 59.3446 3.2172e-05 0 0.0467794
6 12288 1.31029 0.213432 59.3045 3.9033e-05 0 0.0492881
6 12288 1.3326 0.154162 59.3528 2.7425e-05 0 0.0462263
6 16384 1.96666 0.164346 140.297 2.7546e-05 0 0.062182
6 16384 1.99509 0.281598 140.31 3.2348e-05 0 0.0639401
6 16384 2.05101 0.199853 140.349 2.3131e-05 0 0.0662001
6 16384 2.91927 0.213653 140.284 4.3901e-05 0 0.062205
6 16384 2.92359 0.220366 140.327 5.293e-05 0 0.0637846
6 2048 0.0314814 0.011556 0.414018 3.2451e-05 0 0.00882496
6 2048 0.0319555 0.011903 0.414314 2.4681e-05 0 0.00911698
6 2048 0.032945 0.0120916 0.417465 2.9634e-05 0 0.00880646
6 2048 0.0477218 0.0116474 0.423377 3.1195e-05 0 0.00869036
6 2048 0.0488801 0.0133713 0.419442 3.4214e-05 0 0.00876038
6 20480 3.16503 0.299299 273.96 3.0388e-05 0 0.0790437
6 20480 3.23449 0.29552 274.001 2.2567e-05 0 0.0781281
6 20480 4.04068 0.327656 274.046 5.6017e-05 0 0.0811092
6 20480 4.04695 0.329623 274.074 2.8345e-05 0 0.0787502
6 20480 4.05313 0.330453 272.467 2.6294e-05 0 0.0797482
6 24576 4.33369 0.398697 473.502 6.7977e-05 0 0.0938073
6 24576 4.38721 0.604233 473.434 3.094e-05 0 0.0934315
6 24576 4.44323 0.401801 473.526 2.3107e-05 0 0.102057
6 24576 4.92742 0.378692 473.565 1.976e-05 0 0.0989457
6 24576 5.05733 0.41492 473.513 4.225e-05 0 0.0955489
6 28672 5.52412 0.481071 749.927 4.3788e-05 0 0.112869
6 28672 6.13378 0.792268 749.922 2.391e-05 0 0.110839
6 28672 7.71523 0.468687 749.773 3.8279e-05 0 0.109595
6 28672 7.7397 0.474836 750.098 2.3126e-05 0 0.1176
6 28672 8.84258 0.472475 749.856 3.3455e-05 0 0.111949
6 4096 0.108538 0.0238656 2.44185 3.4194e-05 0 0.0163448
6 4096 0.130683 0.0264544 2.42098 2.3734e-05 0 0.0163579
6 4096 0.150465 0.0231551 2.40158 3.6294e-05 0 0.0164829
6 4096 0.173965 0.0234687 2.39762 3.329e-05 0 0.016687
6 4096 0.189678 0.0235607 2.40704 3.3561e-05 0 0.0163821
6 8192 0.422717 0.0544958 17.7945 3.1691e-05 0 0.0307614
6 8192 0.504093 0.0661697 17.8199 2.6698e-05 0 0.0320695
6 8192 0.515602 0.0714152 17.8147 4.0408e-05 0 0.0316571
6 8192 0.523375 0.0676408 17.7866 3.0214e-05 0 0.0320871
6 8192 0.59181 0.0559596 17.8015 2.6927e-05 0 0.0322152
7 1024 0.00978998 0.00636856 0.0556308 1.7669e-05 0 0.00510286
7 1024 0.00994451 0.00632937 0.0556765 2.5176e-05 0 0.00513655
7 1024 0.0101994 0.00626071 0.0556742 2.3285e-05 0 0.0050703
7 1024 0.0102756 0.00592422 0.0557043 2.4316e-05 0 0.00476276
7 1024 0.0133859 0.00633544 0.0556772 2.3259e-05 0 0.0049827
7 12288 1.29007 0.11987 51.0541 3.1887e-05 0 0.0408036
7 12288 1.33134 0.124977 51.0485 3.1385e-05 0 0.0432223
7 12288 1.64445 0.0918237 51.0042 4.1037e-05 0 0.0395923
7 12288 1.64553 0.187148 50.991 4.5013e-05 0 0.0401444
7 12288 2.00733 0.0943252 50.9144 4.5447e-05 0 0.0416955
7 16384 2.34954 0.190777 120.631 2.3275e-05 0 0.0547237
7 16384 2.55598 0.281225 120.64 3.7475e-05 0 0.0559481
7 16384 3.21057 0.262668 120.571 3.5592e-05 0 0.0533235
7 16384 3.2183 0.263928 120.749 5.1073e-05 0 0.0584708
7 16384 3.25534 0.156623 120.572 4.2661e-05 0 0.0528344
7 2048 0.0354187 0.0140891 0.381142 3.2133e-05 0 0.00810734
7 2048 0.0356042 0.0109749 0.38771 3.0884e-05 0 0.00827697
7 2048 0.0404762 0.0107848 0.356061 2.9271e-05 0 0.00800876
7 2048 0.0405318 0.0110416 0.376421 3.1508e-05 0 0.00837162
7 2048 0.0456832 0.0108622 0.388067 3.3607e-05 0 0.00813526
7 20480 3.38207 0.251762 235.599 3.389e-05 0 0.0681499
7 20480 3.94381 0.255014 235.542 4.3054e-05 0 0.068471
7 20480 4.03254 0.359563 235.599 4.4439e-05 0 0.0684681
7 20480 4.41082 0.305668 235.621 7.542e-05 0 0.072346
7 20480 4.48182 0.312956 235.574 7.4034e-05 0 0.0667228
7 24576 5.09756 0.393157 405.839 3.945e-05 0 0.0876702
7 24576 5.3717 0.451435 405.806 4.9784e-05 0 0.0819838
7 24576 5.54643 0.40494 405.867 3.4677e-05 0 0.0820604
7 24576 6.26637 0.396145 406.04 7.8521e-05 0 0.0840211
7 24576 6.34736 0.393793 405.884 6.926e-05 0 0.0799823
7 28672 6.28508 0.482708 644.129 5.057e-05 0 0.0949674
7 28672 7.07537 0.533277 644.226 3.1667e-05 0 0.0949927
7 28672 7.07588 0.776838 644.154 3.0978e-05 0 0.0941872
7 28672 7.3636 0.489632 644.239 4.1981e-05 0 0.0994521
7 28672 7.39247 0.505683 648.631 4.4679e-05 0 0.0959285
7 4096 0.12874 0.0210448 2.10497 2.596e-05 0 0.0144786
7 4096 0.145725 0.0223648 2.11394 3.5547e-05 0 0.0156106
7 4096 0.203679 0.0212802 2.1314 3.3857e-05 0 0.0142137
7 4096 0.204631 0.0370565 2.11569 3.5902e-05 0 0.0148838
7 4096 0.224664 0.0216003 2.11325 3.5107e-05 0 0.0143194
7 8192 0.498628 0.0617634 15.3457 3.1834e-05 0 0.0272779
7 8192 0.499254 0.056156 15.3637 2.5814e-05 0 0.0269268
7 8192 0.731396 0.0806124 15.3347 4.2377e-05 0 0.0271821
7 8192 0.745145 0.0525766 15.3322 3.8066e-05 0 0.0281036
7 8192 0.808026 0.0625311 15.3727 3.8282e-05 0 0.0272685
8 1024 0.0092223 0.00729795 0.0411018 2.156e-05 0 0.0061383
8 1024 0.00957748 0.0054867 0.041105 2.1652e-05 0 0.00397303
8 1024 0.0101997 0.00519405 0.041086 1.9914e-05 0 0.00393559
8 1024 0.0116801 0.00518894 0.04113 2.0763e-05 0 0.00393224
8 1024 0.0135127 0.00523433 0.0410069 2.6709e-05 0 0.00404032
8 12288 1.62613 0.207884 44.5242 4.7126e-05 0 0.0358382
8 12288 1.63204 0.0874535 44.6654 3.586e-05 0 0.0353357
8 12288 1.65318 0.10326 44.6873 4.0311e-05 0 0.0357071
8 12288 1.98619 0.138443 44.6811 4.6583e-05 0 0.0353805
8 12288 2.47477 0.114656 44.6391 2.9835e-05 0 0.0351976
8 16384 2.81796 0.188824 105.576 3.059e-05 0 0.0456644
8 16384 2.8519 0.275621 105.628 3.9977e-05 0 0.0467352
8 16384 3.16222 0.145513 105.488 2.582e-05 0 0.0474477
8 16384 3.26047 0.269743 105.522 4.0806e-05 0 0.0473275
8 16384 3.53387 0.157261 105.51 6.59e-05 0 0.047903
8 2048 0.0346667 0.00951557 0.309715 3.5083e-05 0 0.00678287
8 2048 0.0499709 0.0100524 0.308353 3.3529e-05 0 0.00718865
8 2048 0.0500415 0.0123193 0.316272 3.0703e-05 0 0.00934259
8 2048 0.0565316 0.0119153 0.30486 3.1608e-05 0 0.00718995
8 2048 0.0605222 0.0100491 0.312657 3.0876e-05 0 0.00683571
8 20480 4.02431 0.293802 205.42 7.4664e-05 0 0.0598119
8 20480 4.2869 0.277996 205.993 5.2543e-05 0 0.0588491
8 20480 4.2901 0.413242 205.978 5.139e-05 0 0.0612281
8 20480 4.33891 0.306369 206.005 3.278e-05 0 0.0583006
8 20480 5.25432 0.282571 206.007 0.000113913 0 0.0585491
8 24576 5.35024 0.34207 355.156 4.3358e-05 0 0.0703982
8 24576 5.80158 0.411689 356.121 3.2988e-05 0 0.0707825
8 24576 6.14205 0.423966 356.153 3.9844e-05 0 0.0725949
8 24576 6.1629 0.47001 356.199 3.6903e-05 0 0.0750209
8 24576 6.87922 0.335357 356.11 4.0466e-05 0 0.0700159
8 28672 8.08551 0.533685 565.265 4.4994e-05 0 0.0826947
8 28672 8.08641 0.484666 565.159 4.3735e-05 0 0.0866256
8 28672 8.19656 0.764136 565.077 2.2186e-05 0 0.0846491
8 28672 8.35491 0.785607 563.635 3.0479e-05 0 0.0854427
8 28672 9.2499 0.484437 565.126 4.2301e-05 0 0.0812056
8 4096 0.142966 0.0235583 1.84659 3.3614e-05 0 0.0124814
8 4096 0.169629 0.0356551 1.79912 2.6554e-05 0 0.0131801
8 4096 0.200391 0.0196807 1.80042 3.2571e-05 0 0.0126221
8 4096 0.202738 0.0194258 1.82419 2.7929e-05 0 0.0123935
8 4096 0.261749 0.0200054 1.83626 3.7544e-05 0 0.0123137
8 8192 0.590385 0.061555 13.339 3.2753e-05 0 0.0238377
8 8192 0.641999 0.0496042 13.37 3.3167e-05 0 0.025731
8 8192 0.695969 0.0631701 13.3185 2.4224e-05 0 0.0239738
8 8192 0.723803 0.0603306 13.3479 2.5732e-05 0 0.0242032
8 8192 1.17781 0.0593103 13.331 2.7234e-05 0 0.0244892
9 1024 0.010447 0.00521514 0.0410959 2.3747e-05 0 0.00397907
9 1024 0.011976 0.00671016 0.0409182 2.4568e-05 0 0.00554693
9 1024 0.0129689 0.00572037 0.0409595 1.9036e-05 0 0.00439408
9 1024 0.0138766 0.0054796 0.0410329 1.9175e-05 0 0.00392319
9 1024 0.0145504 0.00536505 0.0410971 1.8858e-05 0 0.00400398
9 12288 1.61952 0.200256 39.9477 4.2282e-05 0 0.0359929
9 12288 1.6471 0.208693 39.8066 5.0237e-05 0 0.0358112
9 12288 1.95013 0.108836 39.9159 3.1749e-05 0 0.0317545
9 12288 2.14742 0.12831 39.8664 5.011e-05 0 0.0309462
9 12288 2.3583 0.141778 39.94 6.6167e-05 0 0.0329552
9 16384 2.93255 0.172581 94.0631 6.0937e-05 0 0.0415141
9 16384 3.35622 0.218024 94.0094 7.2064e-05 0 0.042682
9 16384 3.62784 0.184272 94.04 0.000123064 0 0.0418996
9 16384 3.62935 0.218039 94.0543 4.5792e-05 0 0.041814
9 16384 4.25166 0.292918 93.6736 4.0361e-05 0 0.0453459
9 2048 0.0450266 0.0117536 0.306365 3.0105e-05 0 0.00700229
9 2048 0.049274 0.0147161 0.302245 4.0062e-05 0 0.0119291
9 2048 0.0509845 0.00998514 0.311185 2.6331e-05 0 0.00691135
9 2048 0.0513843 0.00976248 0.30886 3.8005e-05 0 0.00675168
9 2048 0.0518574 0.0100607 0.300983 3.6971e-05 0 0.00715928
9 20480 4.47293 0.241737 183.3 2.5293e-05 0 0.057027
9 20480 4.80797 0.304212 183.303 8.4123e-05 0 0.0539045
9 20480 4.97572 0.253039 183.29 5.4334e-05 0 0.0535539
9 20480 5.29565 0.224923 183.268 4.2589e-05 0 0.051252
9 20480 5.44186 0.228345 183.162 6.1097e-05 0 0.0521539
9 24576 6.38807 0.406517 315.419 4.1902e-05 0 0.0650924
9 24576 7.26259 0.400136 317.32 4.1289e-05 0 0.0656122
9 24576 7.71945 0.377894 317.349 3.4786e-05 0 0.0641976
9 24576 8.93686 0.410238 317.348 9.135e-05 0 0.0705314
9 24576 9.84167 0.352018 317.353 4.0035e-05 0 0.0628436
9 28672 10.0777 0.518594 501.815 3.5143e-05 0 0.0744184
9 28672 11.5716 0.513493 501.788 4.9636e-05 0 0.072472
9 28672 14.9434 0.804742 501.822 4.0131e-05 0 0.0746685
9 28672 9.08121 0.486069 501.701 5.2514e-05 0 0.0812852
9 28672 9.71272 0.613169 501.844 2.515e-05 0 0.0772959
9 4096 0.164913 0.0187859 1.69781 3.8673e-05 0 0.0116227
9 4096 0.179098 0.0202342 1.71836 3.1727e-05 0 0.011784
9 4096 0.182106 0.0251585 1.69786 2.9178e-05 0 0.0117544
9 4096 0.248445 0.0348751 1.72749 3.6525e-05 0 0.0119745
9 4096 0.250187 0.0215842 1.7074 5.2063e-05 0 0.0115217
9 8192 0.749907 0.0510509 12.0703 4.7571e-05 0 0.0252916
9 8192 0.754166 0.0531318 12.0485 6.0616e-05 0 0.0220936
9 8192 0.807215 0.0935573 12.0965 3.9173e-05 0 0.0217854
9 8192 0.815185 0.0465991 12.0491 4.8674e-05 0 0.021194
9 8192 0.982861 0.0473382 12.0889 5.796e-05 0 0.0212581
10 1024 0.0118196 0.00566997 0.0409166 2.1467e-05 0 0.00442736
10 1024 0.0122263 0.00591856 0.0409174 2.3675e-05 0 0.00430895
10 1024 0.0123964 0.00572356 0.0409216 2.1107e-05 0 0.00449114
10 1024 0.0162042 0.00563924 0.040951 2.1372e-05 0 0.00405171
10 1024 0.0168784 0.0074921 0.0409121 1.8981e-05 0 0.00610693
10 12288 1.80307 0.186799 36.1517 5.3859e-05 0 0.0284313
10 12288 1.94779 0.0965577 36.1093 4.7709e-05 0 0.0287085
10 12288 2.18095 0.0867267 36.0807 3.5966e-05 0 0.028591
10 12288 2.18833 0.0994824 36.1577 2.3018e-05 0 0.0292206
10 12288 2.21606 0.0895323 36.1045 4.1941e-05 0 0.0289283
10 16384 3.57314 0.288934 84.8144 2.8243e-05 0 0.0376346
10 16384 3.83585 0.148747 84.7932 4.3836e-05 0 0.0397525
10 16384 3.86047 0.284114 84.7892 5.6746e-05 0 0.0388182
10 16384 4.0207 0.176444 84.8794 6.3853e-05 0 0.0422696
10 16384 4.39897 0.17487 84.8847 4.8618e-05 0 0.0433282
10 2048 0.0523775 0.0109854 0.283196 3.2167e-05 0 0.006069
10 2048 0.0602048 0.00917639 0.283318 3.0568e-05 0 0.00615315
10 2048 0.060923 0.0110302 0.272862 3.2572e-05 0 0.00596007
10 2048 0.0609538 0.00944057 0.279725 3.311e-05 0 0.00637061
10 2048 0.0624495 0.00949621 0.282959 3.392e-05 0 0.00612606
10 20480 5.3949 0.31351 164.779 4.735e-05 0 0.0623895
10 20480 5.80225 0.216068 164.194 3.5295e-05 0 0.0503599
10 20480 6.07147 0.312206 164.79 3.1391e-05 0 0.0501938
10 20480 6.59491 0.265057 164.796 7.668e-05 0 0.0459431
10 20480 7.71279 0.425863 164.712 8.5541e-05 0 0.0473143
10 24576 10.8661 0.537575 285.757 4.5588e-05 0 0.0580571
10 24576 8.03428 0.386758 285.726 7.1604e-05 0 0.0561189
10 24576 8.07657 0.420733 285.728 7.5383e-05 0 0.0782313
10 24576 8.38963 0.539206 285.673 7.5723e-05 0 0.0567282
10 24576 9.34607 0.535903 285.719 4.4639e-05 0 0.0572931
10 28672 10.6183 0.488973 453.049 3.5735e-05 0 0.0671033
10 28672 12.1514 0.486729 452.941 3.8841e-05 0 0.0677162
10 28672 12.2107 0.505038 452.945 5.519e-05 0 0.0689063
10 28672 13.6148 0.501769 453.049 5.3554e-05 0 0.065589
10 28672 9.34503 0.684615 453.083 3.2381e-05 0 0.0665902
10 4096 0.205755 0.0176239 1.49592 3.8007e-05 0 0.0105184
10 4096 0.22783 0.0179037 1.52174 3.2916e-05 0 0.0103612
10 4096 0.242318 0.0193037 1.50269 2.8358e-05 0 0.0119223
10 4096 0.24957 0.0296639 1.49877 3.413e-05 0 0.0105473
10 4096 0.453904 0.0345178 1.53628 3.2771e-05 0 0.0104767
10 8192 0.747052 0.0568009 10.8424 4.3121e-05 0 0.0194785
10 8192 0.824395 0.0479963 10.8707 4.8311e-05 0 0.0211941
10 8192 0.976857 0.0558349 10.8608 6.0252e-05 0 0.0300963
10 8192 0.986692 0.100021 10.877 3.355e-05 0 0.0206851
10 8192 1.08281 0.0539042 10.856 2.8779e-05 0 0.0196749
11 1024 0.0113772 0.00499354 0.0327481 2.2454e-05 0 0.00372369
11 1024 0.0145301 0.00799248 0.0324496 2.4498e-05 0 0.00638608
11 1024 0.0149662 0.00500604 0.032669 2.0855e-05 0 0.00373281
11 1024 0.0154073 0.00550857 0.0327679 2.3644e-05 0 0.00379999
11 1024 0.0165681 0.00491443 0.032648 2.2067e-05 0 0.00362003
11 12288 2.06768 0.175955 32.6327 6.4034e-05 0 0.026375
11 12288 2.19954 0.179611 32.5605 6.2687e-05 0 0.0260807
11 12288 2.37741 0.0864746 32.6198 4.3859e-05 0 0.0267845
11 12288 2.4011 0.0960507 32.6081 3.4763e-05 0 0.0259305
11 12288 2.56692 0.148615 32.5647 4.3454e-05 0 0.0263445
11 16384 4.15841 0.352785 77.2306 4.2366e-05 0 0.0370394
11 16384 4.18744 0.145352 77.3489 5.1116e-05 0 0.0346049
11 16384 4.46718 0.175925 77.4007 0.00012471 0 0.038174
11 16384 4.51359 0.259789 77.3234 5.6957e-05 0 0.0349544
11 16384 5.31529 0.163189 77.3034 4.57e-05 0 0.0355305
11 2048 0.0545391 0.00893402 0.243962 3.915e-05 0 0.00591591
11 2048 0.0553515 0.0128652 0.234017 3.1435e-05 0 0.00815661
11 2048 0.0670339 0.00901555 0.243126 3.4164e-05 0 0.00584443
11 2048 0.0694886 0.0121089 0.239582 3.4611e-05 0 0.0059836
11 2048 0.069629 0.0088683 0.243665 2.9622e-05 0 0.00567299
11 20480 5.33119 0.229026 150.214 4.1518e-05 0 0.0441507
11 20480 6.29995 0.272582 150.195 8.1327e-05 0 0.0444271
11 20480 6.93877 0.230544 150.154 4.3126e-05 0 0.046766
11 20480 7.45189 0.230833 150.154 7.0103e-05 0 0.0442488
11 20480 8.258 0.354066 150.154 4.1754e-05 0 0.0457484
11 24576 10.6629 0.531697 259.833 4.019e-05 0 0.0508907
11 24576 7.48769 0.375216 259.816 8.729e-05 0 0.051507
11 24576 8.24965 0.375443 259.753 8.2481e-05 0 0.0536789
11 24576 9.79317 0.314904 259.046 3.6743e-05 0 0.0523497
11 24576 9.8652 0.604804 259.783 4.0272e-05 0 0.0593574
11 28672 10.9718 0.51196 411.543 4.1945e-05 0 0.0637007
11 28672 11.6984 0.486215 409.466 4.1957e-05 0 0.0609077
11 28672 12.1393 0.492071 411.655 3.6278e-05 0 0.0638882
11 28672 12.6852 0.531143 411.497 5.5527e-05 0 0.0623295
11 28672 14.3241 0.74993 411.499 4.9083e-05 0 0.0614225
11 4096 0.208608 0.018207 1.4202 3.3528e-05 0 0.00998151
11 4096 0.230784 0.0249496 1.39855 3.28e-05 0 0.0104623
11 4096 0.249878 0.0184134 1.42527 3.4644e-05 0 0.010188
11 4096 0.262544 0.0176569 1.39191 3.416e-05 0 0.00976143
11 4096 0.288371 0.0264374 1.41136 3.2868e-05 0 0.00997102
11 8192 0.918467 0.0971354 10.0515 3.7919e-05 0 0.0195729
11 8192 0.956453 0.044473 10.0395 3.7652e-05 0 0.0183412
11 8192 0.966686 0.0540371 10.0246 3.6389e-05 0 0.0184672
11 8192 1.06022 0.0530892 10.035 4.1876e-05 0 0.0180642
11 8192 1.15022 0.0515529 10.0171 3.5064e-05 0 0.0186616
12 1024 0.0157875 0.00652607 0.0327831 2.1963e-05 0 0.00517856
12 1024 0.0158754 0.00834344 0.0328047 2.1918e-05 0 0.00632297
12 1024 0.0163812 0.00769418 0.0327411 2.2711e-05 0 0.00573383
12 1024 0.017436 0.00767006 0.0327912 2.1884e-05 0 0.00637613
12 1024 0.0174906 0.00936965 0.032655 2.1632e-05 0 0.00807744
12 12288 2.00275 0.18604 29.837 7.2092e-05 0 0.0307638
12 12288 2.18509 0.176338 29.8933 4.4588e-05 0 0.028003
12 12288 2.26161 0.172435 29.8664 4.6744e-05 0 0.0241607
12 12288 2.50471 0.0857314 29.8435 4.3062e-05 0 0.0249493
12 12288 2.50978 0.164697 29.8445 4.6779e-05 0 0.0302272
12 16384 4.19553 0.145553 70.8817 8.3017e-05 0 0.0320378
12 16384 4.37005 0.167415 70.9042 3.1945e-05 0 0.0319775
12 16384 4.44523 0.276829 70.9368 4.2685e-05 0 0.0320474
12 16384 4.48921 0.159992 70.9073 7.72e-05 0 0.0320587
12 16384 4.87289 0.259718 70.8541 8.1083e-05 0 0.0325521
12 2048 0.0550004 0.0124719 0.236739 0.000255503 0 0.00927521
12 2048 0.0602412 0.00907621 0.239364 3.0447e-05 0 0.00583203
12 2048 0.0634286 0.00893433 0.245196 3.6116e-05 0 0.00568191
12 2048 0.0663616 0.0121261 0.239939 3.2585e-05 0 0.00883011
12 2048 0.0726708 0.0145411 0.24141 3.3168e-05 0 0.0111945
12 20480 6.22067 0.244143 138.252 4.1958e-05 0 0.0416499
12 20480 6.38124 0.256283 138.236 3.7427e-05 0 0.0391648
12 20480 6.85248 0.310152 138.199 6.5033e-05 0 0.0443265
12 20480 7.12109 0.237293 138.243 7.3023e-05 0 0.0413077
12 20480 7.53844 0.312626 138.301 4.5524e-05 0 0.0421662
12 24576 10.8845 0.357228 237.516 3.1202e-05 0 0.046956
12 24576 11.6956 0.579367 237.476 4.2819e-05 0 0.0500108
12 24576 8.10779 0.389026 239.126 0.000104177 0 0.0476484
12 24576 9.42167 0.385237 237.488 4.6433e-05 0 0.0483899
12 24576 9.6398 0.370355 237.496 4.6577e-05 0 0.0473706
12 28672 10.8119 0.515466 377.573 2.3299e-05 0 0.0577231
12 28672 11.0224 0.470938 378.54 3.8656e-05 0 0.0571406
12 28672 12.5291 0.550508 378.616 6.687e-05 0 0.054503
12 28672 13.3604 0.479701 378.516 3.822e-05 0 0.0567346
12 28672 13.7837 0.481046 378.673 4.2168e-05 0 0.0568319
12 4096 0.242403 0.01716 1.32256 3.2896e-05 0 0.00934471
12 4096 0.26103 0.0172991 1.2947 3.3291e-05 0 0.00951779
12 4096 0.270135 0.0175867 1.3203 3.5953e-05 0 0.0092752
12 4096 0.30485 0.0223182 1.3294 3.9992e-05 0 0.00980626
12 4096 0.332151 0.0260754 1.31254 3.4862e-05 0 0.0108628
12 8192 0.98204 0.0761152 9.19105 3.6254e-05 0 0.0174774
12 8192 0.985682 0.107163 9.19927 6.602e-05 0 0.016919
12 8192 1.03955 0.0482524 9.18105 4.0465e-05 0 0.0198375
12 8192 1.11561 0.0503265 9.19363 3.3991e-05 0 0.0171941
12 8192 1.2679 0.0551346 9.20809 3.3273e-05 0 0.0170883
13 1024 0.0141219 0.00504992 0.0327887 2.1167e-05 0 0.00368629
13 1024 0.0165892 0.00518001 0.0324151 2.2303e-05 0 0.00383371
13 1024 0.0172646 0.00819419 0.0327777 2.1882e-05 0 0.00689461
13 1024 0.0181837 0.00649685 0.032764 3.0644e-05 0 0.00520185
13 1024 0.0191637 0.00510031 0.032808 2.1751e-05 0 0.00367274
13 12288 2.42576 0.133846 27.9096 4.6501e-05 0 0.0260042
13 12288 2.53565 0.166793 27.8939 4.0888e-05 0 0.0229015
13 12288 2.71922 0.0868456 27.8795 9.0927e-05 0 0.0228361
13 12288 2.75706 0.107891 27.8942 3.3181e-05 0 0.0230405
13 12288 3.31919 0.136248 27.8055 5.9424e-05 0 0.0235973
13 16384 4.46811 0.145824 65.5505 7.3703e-05 0 0.0308667
13 16384 4.90846 0.144006 65.5755 4.1244e-05 0 0.0290859
13 16384 5.19191 0.153245 65.6454 6.4972e-05 0 0.0296711
13 16384 5.45287 0.148135 65.5805 4.0462e-05 0 0.0315877
13 16384 5.47126 0.246187 65.5619 7.2547e-05 0 0.0294535
13 2048 0.0649667 0.0125169 0.208033 3.4063e-05 0 0.00899358
13 2048 0.0653653 0.0153812 0.208453 3.0977e-05 0 0.00917184
13 2048 0.0698156 0.00854566 0.205943 3.2791e-05 0 0.00510846
13 2048 0.0703806 0.00887444 0.20956 3.1073e-05 0 0.00539778
13 2048 0.0771586 0.0143574 0.208213 3.3918e-05 0 0.00888759
13 20480 6.28918 0.245489 127.979 3.3543e-05 0 0.0367334
13 20480 6.35196 0.240573 128.017 4.1167e-05 0 0.0369409
13 20480 6.93261 0.229744 127.959 4.6403e-05 0 0.0369355
13 20480 7.52785 0.29612 127.972 4.2114e-05 0 0.0365136
13 20480 7.55045 0.314724 127.955 4.316e-05 0 0.040023
13 24576 11.4352 0.376254 219.814 4.2367e-05 0 0.0444239
13 24576 12.1441 0.410473 221.053 6.0628e-05 0 0.0453313
13 24576 16.0188 0.554499 220.977 3.9746e-05 0 0.0441387
13 24576 9.39837 0.350787 221.147 3.8389e-05 0 0.0464806
13 24576 9.41232 0.354648 221.047 3.9312e-05 0 0.0501947
13 28672 13.1942 0.451587 348.231 4.737e-05 0 0.056574
13 28672 13.5581 0.479849 348.258 5.8138e-05 0 0.0573354
13 28672 16.1358 0.447596 348.3 4.9042e-05 0 0.053756
13 28672 16.3704 0.476568 348.293 8.7767e-05 0 0.0508068
13 28672 16.4093 0.474777 348.284 4.2431e-05 0 0.0514204
13 4096 0.264759 0.0236253 1.2131 3.5644e-05 0 0.0156321
13 4096 0.288243 0.0249962 1.2355 3.4366e-05 0 0.0166254
13 4096 0.291331 0.0333268 1.21316 3.7829e-05 0 0.0152825
13 4096 0.300827 0.0172934 1.22526 3.4591e-05 0 0.00892213
13 4096 0.338216 0.0192848 1.21784 3.503e-05 0 0.01012
13 8192 0.982052 0.0747517 8.42938 3.4558e-05 0 0.0170992
13 8192 1.0374 0.0441074 8.41793 3.6059e-05 0 0.0170111
13 8192 1.07515 0.0435969 8.40369 3.6105e-05 0 0.0160217
13 8192 1.07748 0.0550817 8.41766 5.8388e-05 0 0.0268055
13 8192 1.42517 0.0693349 8.46771 3.5559e-05 0 0.0154495
14 1024 0.0163956 0.00640055 0.0327976 2.2831e-05 0 0.00505714
14 1024 0.0175147 0.00518225 0.0326664 2.1333e-05 0 0.00381957
14 1024 0.0189276 0.00635031 0.0327962 1.9774e-05 0 0.00496975
14 1024 0.0189801 0.00634894 0.0327858 2.0265e-05 0 0.00493328
14 1024 0.0231647 0.00666622 0.0328137 2.1611e-05 0 0.00526554
14 12288 2.39566 0.183854 26.0621 9.1621e-05 0 0.0294115
14 12288 2.57621 0.101684 26.0243 4.1523e-05 0 0.0243319
14 12288 2.63015 0.0892506 26.0456 3.8881e-05 0 0.0223838
14 12288 3.10769 0.150181 26.0406 3.924e-05 0 0.0210704
14 12288 3.41311 0.0917025 26.0366 0.000103175 0 0.0232604
14 16384 4.0691 0.285918 61.1203 3.6266e-05 0 0.0298715
14 16384 4.47891 0.149849 61.0128 4.1026e-05 0 0.0344987
14 16384 4.48091 0.153889 61.0708 3.4161e-05 0 0.0292226
14 16384 4.78924 0.26271 60.9727 7.4667e-05 0 0.0285102
14 16384 5.76103 0.292678 60.9814 8.6557e-05 0 0.029275
14 2048 0.0739896 0.0115632 0.208135 3.3526e-05 0 0.00786334
14 2048 0.0788899 0.00866077 0.207509 3.2272e-05 0 0.0050138
14 2048 0.0801486 0.0111662 0.208771 3.6178e-05 0 0.0052525
14 2048 0.0806543 0.0119577 0.207098 3.3967e-05 0 0.00835334
14 2048 0.0906487 0.00978621 0.208055 2.8719e-05 0 0.00567106
14 20480 10.0154 0.228277 118.442 7.0444e-05 0 0.0355235
14 20480 7.80426 0.238005 118.453 5.7567e-05 0 0.0405457
14 20480 8.41849 0.297227 118.55 4.8662e-05 0 0.0366638
14 20480 8.68198 0.289523 118.459 4.6125e-05 0 0.0349886
14 20480 8.80183 0.298525 118.446 4.7067e-05 0 0.0362656
14 24576 10.5097 0.688141 204.105 4.433e-05 0 0.0429765
14 24576 10.5274 0.341119 204.176 4.41e-05 0 0.0457383
14 24576 11.0204 0.345817 205.564 4.3298e-05 0 0.0430237
14 24576 11.8489 0.346996 204.142 4.0277e-05 0 0.0400961
14 24576 9.27584 0.373531 204.12 4.6921e-05 0 0.0417695
14 28672 13.4555 0.470451 323.089 7.373e-05 0 0.0495043
14 28672 13.9928 0.452172 323.261 4.0625e-05 0 0.0563536
14 28672 14.1796 0.550689 327.053 4.7245e-05 0 0.0473633
14 28672 15.4264 0.4626 323.27 4.8807e-05 0 0.0506323
14 28672 18.1975 0.434439 323.266 3.9936e-05 0 0.0491519
14 4096 0.262075 0.0231091 1.22477 3.5194e-05 0 0.0150523
14 4096 0.307876 0.0211178 1.18792 3.7383e-05 0 0.00905298
14 4096 0.314308 0.022787 1.22345 3.4692e-05 0 0.0141475
14 4096 0.354839 0.02467 1.22249 3.7899e-05 0 0.0154371
14 4096 0.371384 0.0176312 1.20689 3.7208e-05 0 0.00860153
14 8192 1.05329 0.055816 7.99837 4.0373e-05 0 0.026734
14 8192 1.15665 0.0524923 7.97292 7.6259e-05 0 0.021119
14 8192 1.19566 0.0666341 7.99458 6.3127e-05 0 0.0155769
14 8192 1.23702 0.0746882 7.99224 7.3637e-05 0 0.0271013
14 8192 1.38677 0.044731 7.97993 4.2443e-05 0 0.0152768
15 1024 0.0161589 0.00824197 0.032421 2.2406e-05 0 0.00691175
15 1024 0.0198294 0.00537996 0.0324107 2.195e-05 0 0.00399246
15 1024 0.0199789 0.00865753 0.0319536 2.3409e-05 0 0.00654165
15 1024 0.0218227 0.00705932 0.0320379 2.5851e-05 0 0.00569885
15 1024 0.021835 0.00583671 0.0324129 2.3688e-05 0 0.00390384
15 12288 2.71805 0.0887294 24.2866 8.5527e-05 0 0.0239141
15 12288 2.72177 0.0873442 24.3134 7.4904e-05 0 0.0226113
15 12288 2.78439 0.155394 24.2788 8.598e-05 0 0.0203633
15 12288 2.93861 0.152257 24.273 8.0597e-05 0 0.0213718
15 12288 3.04931 0.0903447 24.2614 4.1309e-05 0 0.0239666
15 16384 4.98799 0.159311 57.0266 3.508e-05 0 0.0411124
15 16384 5.04058 0.207666 57.0049 0.000155019 0 0.0261835
15 16384 5.13234 0.264635 57.1147 4.4775e-05 0 0.0303196
15 16384 5.84646 0.14718 57.0081 7.1337e-05 0 0.0267187
15 16384 5.90402 0.256453 57.0547 3.7515e-05 0 0.0383067
15 2048 0.073194 0.0113504 0.20794 3.0451e-05 0 0.00761848
15 2048 0.0765489 0.0169707 0.2078 3.1714e-05 0 0.0107339
15 2048 0.0797466 0.0122393 0.206918 3.23e-05 0 0.00834775
15 2048 0.0808551 0.00924651 0.20212 3.3046e-05 0 0.00553264
15 2048 0.109753 0.0122018 0.205945 2.6801e-05 0 0.00813609
15 20480 7.26947 0.219212 110.697 5.2641e-05 0 0.0324127
15 20480 7.66419 0.225684 110.778 4.7128e-05 0 0.0331132
15 20480 8.50506 0.232655 110.73 8.4093e-05 0 0.041221
15 20480 9.18718 0.266721 110.759 4.4526e-05 0 0.0340054
15 20480 9.31444 0.253022 110.742 4.7097e-05 0 0.0331792
15 24576 11.2298 0.41687 191.958 4.502e-05 0 0.0726341
15 24576 11.3405 0.364389 192.022 5.0656e-05 0 0.045049
15 24576 11.3468 0.35766 192.045 4.4189e-05 0 0.0395877
15 24576 11.4285 0.348551 192.023 4.1601e-05 0 0.0437666
15 24576 12.2355 0.388339 191.997 6.0605e-05 0 0.0620124
15 28672 14.0317 0.477264 302.928 4.6592e-05 0 0.0453229
15 28672 15.2172 0.478707 302.967 5.6713e-05 0 0.0494285
15 28672 16.3053 0.488534 302.921 8.5393e-05 0 0.0472139
15 28672 17.7038 0.439698 302.883 0.00010325 0 0.0450487
15 28672 17.8971 0.463878 302.93 4.1243e-05 0 0.0459916
15 4096 0.31493 0.0177173 1.12156 4.6826e-05 0 0.0092523
15 4096 0.33909 0.025782 1.08543 5.144e-05 0 0.0174281
15 4096 0.353873 0.0234725 1.09031 3.6279e-05 0 0.0139719
15 4096 0.368304 0.0231414 1.13293 5.0216e-05 0 0.0139453
15 4096 0.373368 0.0171535 1.09573 4.1308e-05 0 0.00833586
15 8192 1.14622 0.0446628 7.52594 4.4049e-05 0 0.0142566
15 8192 1.15608 0.0837472 7.48503 5.0621e-05 0 0.0158097
15 8192 1.21329 0.0872059 7.52496 5.0905e-05 0 0.0260531
15 8192 1.33322 0.0613306 7.49565 4.3174e-05 0 0.0158449
15 8192 1.42747 0.0571394 7.51928 4.1176e-05 0 0.0262279
16 1024 0.018058 0.00705435 0.0250679 3.2311e-05 0 0.00525284
16 1024 0.018735 0.00780459 0.0246568 2.5327e-05 0 0.00574723
16 1024 0.0192737 0.00679666 0.0249923 2.2741e-05 0 0.00532022
16 1024 0.0215749 0.00683937 0.025013 2.3435e-05 0 0.00501188
16 1024 0.0264108 0.00637034 0.0247005 2.4665e-05 0 0.00442317
16 12288 2.85428 0.0952901 22.4339 9.0783e-05 0 0.0281517
16 12288 2.9323 0.0979191 22.6021 8.6757e-05 0 0.0299791
16 12288 2.94347 0.17306 22.6537 4.6926e-05 0 0.0352455
16 12288 3.04971 0.0934075 22.6191 8.4954e-05 0 0.0274354
16 12288 3.93247 0.12379 22.5815 4.23e-05 0 0.019702
16 16384 5.14732 0.2264 53.3402 3.7637e-05 0 0.0299162
16 16384 5.24943 0.278606 52.8332 4.3181e-05 0 0.0466409
16 16384 5.76606 0.248045 53.3304 3.6644e-05 0 0.0257925
16 16384 6.06234 0.233675 52.7527 4.5991e-05 0 0.0275103
16 16384 6.98808 0.141606 52.8713 5.4073e-05 0 0.0251837
16 2048 0.0771676 0.00914446 0.16877 3.2521e-05 0 0.00465447
16 2048 0.0798271 0.0160488 0.168401 4.3771e-05 0 0.012127
16 2048 0.0938543 0.0148762 0.164851 4.3308e-05 0 0.00874862
16 2048 0.0956234 0.0140983 0.168052 3.4697e-05 0 0.00994559
16 2048 0.0975838 0.0129819 0.166474 2.737e-05 0 0.00812585
16 20480 7.8668 0.335799 102.982 4.2521e-05 0 0.0322092
16 20480 8.50305 0.225248 102.989 4.1094e-05 0 0.0345455
16 20480 9.07232 0.317549 104.203 5.0536e-05 0 0.0380492
16 20480 9.28583 0.296744 103.055 3.6887e-05 0 0.0386388
16 20480 9.3435 0.245702 103.106 5.4204e-05 0 0.0576071
16 24576 11.1379 0.364032 178.27 5.5852e-05 0 0.0477894
16 24576 12.8786 0.377563 178.221 6.3939e-05 0 0.0402393
16 24576 12.8906 0.383655 180.318 4.606e-05 0 0.0502214
16 24576 13.5936 0.352214 178.271 5.0589e-05 0 0.0391314
16 24576 13.6519 0.329328 178.277 5.2739e-05 0 0.0358353
16 28672 15.0012 0.523574 286.157 6.0828e-05 0 0.0457185
16 28672 15.0777 0.9295 282.83 5.0592e-05 0 0.051158
16 28672 16.8141 0.736723 282.896 5.9178e-05 0 0.0440097
16 28672 17.6807 0.451817 282.931 4.3026e-05 0 0.0504648
16 28672 20.4175 0.440182 282.896 5.8651e-05 0 0.0489988
16 4096 0.301531 0.0233709 1.01224 7.2057e-05 0 0.0142224
16 4096 0.348398 0.0231625 1.01983 5.1015e-05 0 0.0136973
16 4096 0.352424 0.0211441 1.01501 3.8069e-05 0 0.00844014
16 4096 0.363361 0.0217679 1.01581 3.5375e-05 0 0.0127513
16 4096 0.464541 0.0171848 1.03325 3.336e-05 0 0.00761018
16 8192 1.25849 0.07746 6.79874 6.0624e-05 0 0.0164414
16 8192 1.29372 0.0517142 6.81189 4.5141e-05 0 0.0237433
16 8192 1.32563 0.0491464 6.82101 3.7653e-05 0 0.0191644
16 8192 1.59578 0.0464798 6.78245 3.6346e-05 0 0.013551
16 8192 1.63527 0.0659677 6.78988 5.3697e-05 0 0.0150587
@codecircuit
Author

Matrix Multiplication Benchmark

The kernel analysis was done by hand, because Polly fails on a
matrix multiplication kernel written in OpenCL.
The data file contains the following columns:

gpus = number of GPUs used

N = width of the matrix (N^2 elements in total)

app_htod_time[s] = time the user measures for the host-to-device
memory copy in their single-GPU code. The user copies 2N^2 floating-point
numbers to the GPU.

double timestamp = Clock::now();  
cuMemcpyHtoD(..., N*N*sizeof(float));  
cuMemcpyHtoD(..., N*N*sizeof(float));  
double app_htod_time = Clock::now() - timestamp;
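
The timing snippets in this comment are shorthand: they store a time point in a double, which would not compile with std::chrono as written. A minimal sketch of the same measurement pattern follows. It assumes that Clock stands for a monotonic clock such as std::chrono::steady_clock; the helper names seconds_since and time_seconds are hypothetical and do not come from the benchmark code.

```cpp
#include <cassert>
#include <chrono>

// Assumption: Clock in the snippets behaves like a monotonic wall clock.
using Clock = std::chrono::steady_clock;

// Seconds elapsed since `start`, returned as a double.
inline double seconds_since(Clock::time_point start) {
    const std::chrono::duration<double> elapsed = Clock::now() - start;
    // A monotonic clock never runs backwards.
    assert(elapsed.count() >= 0.0);
    return elapsed.count();
}

// Runs `work` once and returns its wall-clock duration in seconds.
template <typename Work>
double time_seconds(Work&& work) {
    const Clock::time_point start = Clock::now();
    work();
    return seconds_since(start);
}
```

The callable passed to time_seconds would wrap the two copies. For work that is issued asynchronously, the callable would have to wait for completion before returning, otherwise only the submission is timed.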

app_kernel_time[s] = time the user measures for the kernel launch loop
in their single-GPU code.

double timestamp = Clock::now();  
for (int i = 0; i < 1000; ++i) {  
  cuLaunchKernel(...);  
  ...  
}  
double app_kernel_time = Clock::now() - timestamp;  
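
Because app_kernel_time covers the whole loop of 1000 launches, two derived quantities are handy when reading the data file: the average time per loop iteration, and the speedup of a multi-GPU run over the single-GPU run at the same N. The sketch below is one way to compute them; the function names are hypothetical, and it assumes the same loop is timed in every configuration.

```cpp
#include <cassert>

// Number of launches in the timed loop.
constexpr int kLaunches = 1000;

// Average wall-clock time per loop iteration (everything in the loop body).
inline double time_per_launch(double app_kernel_time) {
    return app_kernel_time / kLaunches;
}

// Speedup of a multi-GPU run relative to the single-GPU run at the same N.
inline double speedup(double single_gpu_time, double multi_gpu_time) {
    assert(multi_gpu_time > 0.0);
    return single_gpu_time / multi_gpu_time;
}

// Parallel efficiency: speedup divided by the number of GPUs used.
inline double efficiency(double single_gpu_time, double multi_gpu_time, int gpus) {
    assert(gpus > 0);
    return speedup(single_gpu_time, multi_gpu_time) / gpus;
}
```

For N = 16384, the data file lists 832.093 s for one GPU and 53.3402 s for 16 GPUs (first row of each group), which gives a speedup of about 15.6 and an efficiency of about 0.97.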

app_dtoh_time[s] = time the user measures for the device-to-host
memory copy in their single-GPU code. The user copies N^2 floating-point
numbers from the device to the host.

double timestamp = Clock::now();  
cuMemcpyDtoH(..., N*N*sizeof(float));  
double app_dtoh_time = Clock::now() - timestamp;
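
To relate the copy times to data volume, the sizes requested by the user code can be computed from N. This is a sketch with hypothetical helper names; it assumes a 4-byte float, and it counts only the bytes the user code requests, not whatever the partitioned runtime actually moves between host and devices.

```cpp
#include <cassert>
#include <cstdint>

// Bytes in one N x N matrix of single-precision floats.
inline std::uint64_t matrix_bytes(std::uint64_t n) {
    return n * n * sizeof(float);
}

// Host to device: two input matrices.
inline std::uint64_t htod_bytes(std::uint64_t n) {
    return 2 * matrix_bytes(n);
}

// Device to host: one result matrix.
inline std::uint64_t dtoh_bytes(std::uint64_t n) {
    return matrix_bytes(n);
}

// Effective transfer rate relative to the user-requested bytes.
inline double bytes_per_second(std::uint64_t bytes, double seconds) {
    assert(seconds > 0.0);
    return static_cast<double>(bytes) / seconds;
}
```

For the largest size, N = 28672, this gives 6,576,668,672 bytes (6.125 GiB) for the host-to-device copies and half of that for the device-to-host copy.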

dependency_resolution_calculation_time[s] = time to calculate the accessed intervals
and create the memory copies that are required to resolve inter-kernel dependencies.
This application needs no dependency resolution, so the measured time
is only a small overhead. This time is included in app_kernel_time.

dependency_resolution_memcpy_time[s] = zero, as there are no inter-kernel dependencies.

linearization_time[s] = time to linearize the 2D access patterns. This time is
included in app_dtoh_time.
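
Since two of the overhead columns are contained in other columns, they can be subtracted to separate the overhead from the remaining time. The sketch below assumes one struct per data row, with fields in the order of the data file; the struct and function names are hypothetical.

```cpp
#include <cassert>

// One row of the data file; field order follows the file's columns.
struct Row {
    int gpus;
    int n;
    double app_htod_time;
    double app_dtoh_time;
    double app_kernel_time;
    double dependency_resolution_calculation_time;
    double dependency_resolution_memcpy_time;
    double linearization_time;
};

// Kernel loop time with the dependency resolution calculation removed.
inline double kernel_time_without_resolution(const Row& r) {
    return r.app_kernel_time - r.dependency_resolution_calculation_time;
}

// Device-to-host time with the linearization overhead removed.
inline double dtoh_time_without_linearization(const Row& r) {
    return r.app_dtoh_time - r.linearization_time;
}

// Fraction of the device-to-host time spent on linearization.
inline double linearization_share(const Row& r) {
    assert(r.app_dtoh_time > 0.0);
    return r.linearization_time / r.app_dtoh_time;
}
```

For the first 16 GPU row with N = 1024 (app_dtoh_time 0.00705435 s, linearization_time 0.00525284 s), linearization accounts for roughly 74% of the measured device-to-host time.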
