@OXPHOS
Last active June 7, 2016 22:24
[==========] Running 4 benchmarks.
[ RUN ] CPUVector.dot_explict_eigen3(const SGVector<T> &A = data.A, const SGVector<T> &B = data.B) (10 runs, 1000 iterations per run)
[ DONE ] CPUVector.dot_explict_eigen3(const SGVector<T> &A = data.A, const SGVector<T> &B = data.B) (0.000550 ms)
[ RUNS ] Average time: 0.055 us
Fastest: 0.000 us (-0.055 us / -100.000 %)
Slowest: 0.313 us (+0.258 us / +469.091 %)
Average performance: 18181818.18182 runs/s
Best performance: inf runs/s (+inf runs/s / +inf %)
Worst performance: 3194888.17891 runs/s (-14986930.00290 runs/s / -82.42812 %)
[ITERATIONS] Average time: 0.000 us
Fastest: 0.000 us (-0.000 us / -100.000 %)
Slowest: 0.000 us (+0.000 us / +469.091 %)
Average performance: 18181818181.81818 iterations/s
Best performance: inf iterations/s (+inf iterations/s / +inf %)
Worst performance: 3194888178.91374 iterations/s (-14986930002.90444 iterations/s / -82.42812 %)
[ RUN ] CPUVector.dot_eigen3(BaseVector<T> *A = data.Ac.get(), BaseVector<T> *B = data.Bc.get()) (10 runs, 1000 iterations per run)
[ DONE ] CPUVector.dot_eigen3(BaseVector<T> *A = data.Ac.get(), BaseVector<T> *B = data.Bc.get()) (0.160564 ms)
[ RUNS ] Average time: 16.056 us
Fastest: 13.424 us (-2.632 us / -16.395 %)
Slowest: 25.153 us (+9.097 us / +56.654 %)
Average performance: 62280.46137 runs/s
Best performance: 74493.44458 runs/s (+12212.98320 runs/s / +19.60965 %)
Worst performance: 39756.68906 runs/s (-22523.77231 runs/s / -36.16507 %)
[ITERATIONS] Average time: 0.016 us
Fastest: 0.013 us (-0.003 us / -16.395 %)
Slowest: 0.025 us (+0.009 us / +56.654 %)
Average performance: 62280461.37366 iterations/s
Best performance: 74493444.57688 iterations/s (+12212983.20322 iterations/s / +19.60965 %)
Worst performance: 39756689.06293 iterations/s (-22523772.31072 iterations/s / -36.16507 %)
[ RUN ] GPU_Vector.dot_explict_viennacl(const VCLVectorBase &A = data.Av, const VCLVectorBase &B = data.Bv) (10 runs, 1000 iterations per run)
[ DONE ] GPU_Vector.dot_explict_viennacl(const VCLVectorBase &A = data.Av, const VCLVectorBase &B = data.Bv) (0.008346 ms)
[ RUNS ] Average time: 0.835 us
Fastest: 0.000 us (-0.835 us / -100.000 %)
Slowest: 8.186 us (+7.351 us / +880.829 %)
Average performance: 1198178.76827 runs/s
Best performance: inf runs/s (+inf runs/s / +inf %)
Worst performance: 122159.78500 runs/s (-1076018.98327 runs/s / -89.80454 %)
[ITERATIONS] Average time: 0.001 us
Fastest: 0.000 us (-0.001 us / -100.000 %)
Slowest: 0.008 us (+0.007 us / +880.829 %)
Average performance: 1198178768.27223 iterations/s
Best performance: inf iterations/s (+inf iterations/s / +inf %)
Worst performance: 122159784.99878 iterations/s (-1076018983.27345 iterations/s / -89.80454 %)
[ RUN ] GPU_Vector.dot_viennacl(BaseVector<T> *A = data.Ag.get(), BaseVector<T> *B = data.Bg.get()) (10 runs, 1000 iterations per run)
[ DONE ] GPU_Vector.dot_viennacl(BaseVector<T> *A = data.Ag.get(), BaseVector<T> *B = data.Bg.get()) (3027.772602 ms)
[ RUNS ] Average time: 302777.260 us
Fastest: 281885.242 us (-20892.018 us / -6.900 %)
Slowest: 323467.212 us (+20689.952 us / +6.833 %)
Average performance: 3.30276 runs/s
Best performance: 3.54754 runs/s (+0.24479 runs/s / +7.41153 %)
Worst performance: 3.09150 runs/s (-0.21125 runs/s / -6.39631 %)
[ITERATIONS] Average time: 302.777 us
Fastest: 281.885 us (-20.892 us / -6.900 %)
Slowest: 323.467 us (+20.690 us / +6.833 %)
Average performance: 3302.75794 iterations/s
Best performance: 3547.54294 iterations/s (+244.78500 iterations/s / +7.41153 %)
Worst performance: 3091.50344 iterations/s (-211.25450 iterations/s / -6.39631 %)
[==========] Ran 4 benchmarks.
[==========] Running 4 benchmarks.
[ RUN ] CPUVector.dot_explict_eigen3(const SGVector<T> &A = data.A, const SGVector<T> &B = data.B) (10 runs, 1000 iterations per run)
[ DONE ] CPUVector.dot_explict_eigen3(const SGVector<T> &A = data.A, const SGVector<T> &B = data.B) (0.065981 ms)
[ RUNS ] Average time: 6.598 us
Fastest: 1.345 us (-5.253 us / -79.615 %)
Slowest: 37.185 us (+30.587 us / +463.571 %)
Average performance: 151558.78207 runs/s
Best performance: 743494.42379 runs/s (+591935.64172 runs/s / +390.56506 %)
Worst performance: 26892.56421 runs/s (-124666.21787 runs/s / -82.25602 %)
[ITERATIONS] Average time: 0.007 us
Fastest: 0.001 us (-0.005 us / -79.615 %)
Slowest: 0.037 us (+0.031 us / +463.571 %)
Average performance: 151558782.07363 iterations/s
Best performance: 743494423.79182 iterations/s (+591935641.71819 iterations/s / +390.56506 %)
Worst performance: 26892564.20600 iterations/s (-124666217.86763 iterations/s / -82.25602 %)
[ RUN ] CPUVector.dot_eigen3(BaseVector<T> *A = data.Ac.get(), BaseVector<T> *B = data.Bc.get()) (10 runs, 1000 iterations per run)
[ DONE ] CPUVector.dot_eigen3(BaseVector<T> *A = data.Ac.get(), BaseVector<T> *B = data.Bc.get()) (4707.436846 ms)
[ RUNS ] Average time: 470743.685 us
Fastest: 450431.075 us (-20312.610 us / -4.315 %)
Slowest: 518870.429 us (+48126.744 us / +10.224 %)
Average performance: 2.12430 runs/s
Best performance: 2.22010 runs/s (+0.09580 runs/s / +4.50959 %)
Worst performance: 1.92726 runs/s (-0.19703 runs/s / -9.27529 %)
[ITERATIONS] Average time: 470.744 us
Fastest: 450.431 us (-20.313 us / -4.315 %)
Slowest: 518.870 us (+48.127 us / +10.224 %)
Average performance: 2124.29828 iterations/s
Best performance: 2220.09549 iterations/s (+95.79721 iterations/s / +4.50959 %)
Worst performance: 1927.26342 iterations/s (-197.03486 iterations/s / -9.27529 %)
[ RUN ] GPU_Vector.dot_explict_viennacl(const VCLVectorBase &A = data.Av, const VCLVectorBase &B = data.Bv) (10 runs, 1000 iterations per run)
[ DONE ] GPU_Vector.dot_explict_viennacl(const VCLVectorBase &A = data.Av, const VCLVectorBase &B = data.Bv) (0.002278 ms)
[ RUNS ] Average time: 0.228 us
Fastest: 0.144 us (-0.084 us / -36.787 %)
Slowest: 0.334 us (+0.106 us / +46.620 %)
Average performance: 4389815.62774 runs/s
Best performance: 6944444.44444 runs/s (+2554628.81670 runs/s / +58.19444 %)
Worst performance: 2994011.97605 runs/s (-1395803.65170 runs/s / -31.79641 %)
[ITERATIONS] Average time: 0.000 us
Fastest: 0.000 us (-0.000 us / -36.787 %)
Slowest: 0.000 us (+0.000 us / +46.620 %)
Average performance: 4389815627.74364 iterations/s
Best performance: 6944444444.44444 iterations/s (+2554628816.70081 iterations/s / +58.19444 %)
Worst performance: 2994011976.04790 iterations/s (-1395803651.69573 iterations/s / -31.79641 %)
[ RUN ] GPU_Vector.dot_viennacl(BaseVector<T> *A = data.Ag.get(), BaseVector<T> *B = data.Bg.get()) (10 runs, 1000 iterations per run)
[ DONE ] GPU_Vector.dot_viennacl(BaseVector<T> *A = data.Ag.get(), BaseVector<T> *B = data.Bg.get()) (5877.549632 ms)
[ RUNS ] Average time: 587754.963 us
Fastest: 551186.885 us (-36568.078 us / -6.222 %)
Slowest: 664480.472 us (+76725.509 us / +13.054 %)
Average performance: 1.70139 runs/s
Best performance: 1.81427 runs/s (+0.11288 runs/s / +6.63442 %)
Worst performance: 1.50494 runs/s (-0.19645 runs/s / -11.54669 %)
[ITERATIONS] Average time: 587.755 us
Fastest: 551.187 us (-36.568 us / -6.222 %)
Slowest: 664.480 us (+76.726 us / +13.054 %)
Average performance: 1701.38929 iterations/s
Best performance: 1814.26668 iterations/s (+112.87739 iterations/s / +6.63442 %)
Worst performance: 1504.93512 iterations/s (-196.45417 iterations/s / -11.54669 %)
[==========] Ran 4 benchmarks.