Last active June 7, 2016 22:24
[==========] Running 4 benchmarks.
[ RUN ] CPUVector.dot_explict_eigen3(const SGVector<T> &A = data.A, const SGVector<T> &B = data.B) (10 runs, 1000 iterations per run)
[ DONE ] CPUVector.dot_explict_eigen3(const SGVector<T> &A = data.A, const SGVector<T> &B = data.B) (0.000550 ms)
[ RUNS ] Average time: 0.055 us
Fastest: 0.000 us (-0.055 us / -100.000 %)
Slowest: 0.313 us (+0.258 us / +469.091 %)
Average performance: 18181818.18182 runs/s
Best performance: inf runs/s (+inf runs/s / +inf %)
Worst performance: 3194888.17891 runs/s (-14986930.00290 runs/s / -82.42812 %)
[ITERATIONS] Average time: 0.000 us
Fastest: 0.000 us (-0.000 us / -100.000 %)
Slowest: 0.000 us (+0.000 us / +469.091 %)
Average performance: 18181818181.81818 iterations/s
Best performance: inf iterations/s (+inf iterations/s / +inf %)
Worst performance: 3194888178.91374 iterations/s (-14986930002.90444 iterations/s / -82.42812 %)
[ RUN ] CPUVector.dot_eigen3(BaseVector<T> *A = data.Ac.get(), BaseVector<T> *B = data.Bc.get()) (10 runs, 1000 iterations per run)
[ DONE ] CPUVector.dot_eigen3(BaseVector<T> *A = data.Ac.get(), BaseVector<T> *B = data.Bc.get()) (0.160564 ms)
[ RUNS ] Average time: 16.056 us
Fastest: 13.424 us (-2.632 us / -16.395 %)
Slowest: 25.153 us (+9.097 us / +56.654 %)
Average performance: 62280.46137 runs/s
Best performance: 74493.44458 runs/s (+12212.98320 runs/s / +19.60965 %)
Worst performance: 39756.68906 runs/s (-22523.77231 runs/s / -36.16507 %)
[ITERATIONS] Average time: 0.016 us
Fastest: 0.013 us (-0.003 us / -16.395 %)
Slowest: 0.025 us (+0.009 us / +56.654 %)
Average performance: 62280461.37366 iterations/s
Best performance: 74493444.57688 iterations/s (+12212983.20322 iterations/s / +19.60965 %)
Worst performance: 39756689.06293 iterations/s (-22523772.31072 iterations/s / -36.16507 %)
[ RUN ] GPU_Vector.dot_explict_viennacl(const VCLVectorBase &A = data.Av, const VCLVectorBase &B = data.Bv) (10 runs, 1000 iterations per run)
[ DONE ] GPU_Vector.dot_explict_viennacl(const VCLVectorBase &A = data.Av, const VCLVectorBase &B = data.Bv) (0.008346 ms)
[ RUNS ] Average time: 0.835 us
Fastest: 0.000 us (-0.835 us / -100.000 %)
Slowest: 8.186 us (+7.351 us / +880.829 %)
Average performance: 1198178.76827 runs/s
Best performance: inf runs/s (+inf runs/s / +inf %)
Worst performance: 122159.78500 runs/s (-1076018.98327 runs/s / -89.80454 %)
[ITERATIONS] Average time: 0.001 us
Fastest: 0.000 us (-0.001 us / -100.000 %)
Slowest: 0.008 us (+0.007 us / +880.829 %)
Average performance: 1198178768.27223 iterations/s
Best performance: inf iterations/s (+inf iterations/s / +inf %)
Worst performance: 122159784.99878 iterations/s (-1076018983.27345 iterations/s / -89.80454 %)
[ RUN ] GPU_Vector.dot_viennacl(BaseVector<T> *A = data.Ag.get(), BaseVector<T> *B = data.Bg.get()) (10 runs, 1000 iterations per run)
[ DONE ] GPU_Vector.dot_viennacl(BaseVector<T> *A = data.Ag.get(), BaseVector<T> *B = data.Bg.get()) (3027.772602 ms)
[ RUNS ] Average time: 302777.260 us
Fastest: 281885.242 us (-20892.018 us / -6.900 %)
Slowest: 323467.212 us (+20689.952 us / +6.833 %)
Average performance: 3.30276 runs/s
Best performance: 3.54754 runs/s (+0.24479 runs/s / +7.41153 %)
Worst performance: 3.09150 runs/s (-0.21125 runs/s / -6.39631 %)
[ITERATIONS] Average time: 302.777 us
Fastest: 281.885 us (-20.892 us / -6.900 %)
Slowest: 323.467 us (+20.690 us / +6.833 %)
Average performance: 3302.75794 iterations/s
Best performance: 3547.54294 iterations/s (+244.78500 iterations/s / +7.41153 %)
Worst performance: 3091.50344 iterations/s (-211.25450 iterations/s / -6.39631 %)
[==========] Ran 4 benchmarks.
[==========] Running 4 benchmarks.
[ RUN ] CPUVector.dot_explict_eigen3(const SGVector<T> &A = data.A, const SGVector<T> &B = data.B) (10 runs, 1000 iterations per run)
[ DONE ] CPUVector.dot_explict_eigen3(const SGVector<T> &A = data.A, const SGVector<T> &B = data.B) (0.065981 ms)
[ RUNS ] Average time: 6.598 us
Fastest: 1.345 us (-5.253 us / -79.615 %)
Slowest: 37.185 us (+30.587 us / +463.571 %)
Average performance: 151558.78207 runs/s
Best performance: 743494.42379 runs/s (+591935.64172 runs/s / +390.56506 %)
Worst performance: 26892.56421 runs/s (-124666.21787 runs/s / -82.25602 %)
[ITERATIONS] Average time: 0.007 us
Fastest: 0.001 us (-0.005 us / -79.615 %)
Slowest: 0.037 us (+0.031 us / +463.571 %)
Average performance: 151558782.07363 iterations/s
Best performance: 743494423.79182 iterations/s (+591935641.71819 iterations/s / +390.56506 %)
Worst performance: 26892564.20600 iterations/s (-124666217.86763 iterations/s / -82.25602 %)
[ RUN ] CPUVector.dot_eigen3(BaseVector<T> *A = data.Ac.get(), BaseVector<T> *B = data.Bc.get()) (10 runs, 1000 iterations per run)
[ DONE ] CPUVector.dot_eigen3(BaseVector<T> *A = data.Ac.get(), BaseVector<T> *B = data.Bc.get()) (4707.436846 ms)
[ RUNS ] Average time: 470743.685 us
Fastest: 450431.075 us (-20312.610 us / -4.315 %)
Slowest: 518870.429 us (+48126.744 us / +10.224 %)
Average performance: 2.12430 runs/s
Best performance: 2.22010 runs/s (+0.09580 runs/s / +4.50959 %)
Worst performance: 1.92726 runs/s (-0.19703 runs/s / -9.27529 %)
[ITERATIONS] Average time: 470.744 us
Fastest: 450.431 us (-20.313 us / -4.315 %)
Slowest: 518.870 us (+48.127 us / +10.224 %)
Average performance: 2124.29828 iterations/s
Best performance: 2220.09549 iterations/s (+95.79721 iterations/s / +4.50959 %)
Worst performance: 1927.26342 iterations/s (-197.03486 iterations/s / -9.27529 %)
[ RUN ] GPU_Vector.dot_explict_viennacl(const VCLVectorBase &A = data.Av, const VCLVectorBase &B = data.Bv) (10 runs, 1000 iterations per run)
[ DONE ] GPU_Vector.dot_explict_viennacl(const VCLVectorBase &A = data.Av, const VCLVectorBase &B = data.Bv) (0.002278 ms)
[ RUNS ] Average time: 0.228 us
Fastest: 0.144 us (-0.084 us / -36.787 %)
Slowest: 0.334 us (+0.106 us / +46.620 %)
Average performance: 4389815.62774 runs/s
Best performance: 6944444.44444 runs/s (+2554628.81670 runs/s / +58.19444 %)
Worst performance: 2994011.97605 runs/s (-1395803.65170 runs/s / -31.79641 %)
[ITERATIONS] Average time: 0.000 us
Fastest: 0.000 us (-0.000 us / -36.787 %)
Slowest: 0.000 us (+0.000 us / +46.620 %)
Average performance: 4389815627.74364 iterations/s
Best performance: 6944444444.44444 iterations/s (+2554628816.70081 iterations/s / +58.19444 %)
Worst performance: 2994011976.04790 iterations/s (-1395803651.69573 iterations/s / -31.79641 %)
[ RUN ] GPU_Vector.dot_viennacl(BaseVector<T> *A = data.Ag.get(), BaseVector<T> *B = data.Bg.get()) (10 runs, 1000 iterations per run)
[ DONE ] GPU_Vector.dot_viennacl(BaseVector<T> *A = data.Ag.get(), BaseVector<T> *B = data.Bg.get()) (5877.549632 ms)
[ RUNS ] Average time: 587754.963 us
Fastest: 551186.885 us (-36568.078 us / -6.222 %)
Slowest: 664480.472 us (+76725.509 us / +13.054 %)
Average performance: 1.70139 runs/s
Best performance: 1.81427 runs/s (+0.11288 runs/s / +6.63442 %)
Worst performance: 1.50494 runs/s (-0.19645 runs/s / -11.54669 %)
[ITERATIONS] Average time: 587.755 us
Fastest: 551.187 us (-36.568 us / -6.222 %)
Slowest: 664.480 us (+76.726 us / +13.054 %)
Average performance: 1701.38929 iterations/s
Best performance: 1814.26668 iterations/s (+112.87739 iterations/s / +6.63442 %)
Worst performance: 1504.93512 iterations/s (-196.45417 iterations/s / -11.54669 %)
[==========] Ran 4 benchmarks.