@alexcrichton
Created October 14, 2018 20:42
+ [[ '' != 1 ]]
+ hash hyperfine
+ echo ''
+ grep -q ispc
+ for dir in examples/*/
+ dir=examples/aobench
+ cd examples/aobench
+ '[' -f benchmark.sh ']'
+ ./benchmark.sh
+ export WIDTH=800
+ WIDTH=800
+ export HEIGHT=600
+ HEIGHT=600
+ [[ '' != 1 ]]
+ hash hyperfine
+ ALGS=("scalar" "scalar_par" "vector" "vector_par" "tiled" "tiled_par")
+ echo ''
+ grep -q ispc
+ echo 'Benchmark 256-bit wide vectors'
Benchmark 256-bit wide vectors
+ RUSTFLAGS='-C target-cpu=native '
+ cargo build --release --no-default-features --features=,256bit
Finished release [optimized] target(s) in 0.07s
+ [[ '' == \1 ]]
+ [[ '' == \1 ]]
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/aobench 800 600 --algo scalar'
Benchmark #1: target/release/aobench 800 600 --algo scalar
Time (mean ± σ): 3.085 s ± 0.176 s [User: 3.082 s, System: 0.003 s]
Range (min … max): 2.941 s … 3.359 s
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/aobench 800 600 --algo scalar_par'
Benchmark #1: target/release/aobench 800 600 --algo scalar_par
Time (mean ± σ): 178.5 ms ± 5.4 ms [User: 4.300 s, System: 0.059 s]
Range (min … max): 172.5 ms … 192.4 ms
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/aobench 800 600 --algo vector'
Benchmark #1: target/release/aobench 800 600 --algo vector
Time (mean ± σ): 1.069 s ± 0.017 s [User: 1.065 s, System: 0.004 s]
Range (min … max): 1.041 s … 1.089 s
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/aobench 800 600 --algo vector_par'
Benchmark #1: target/release/aobench 800 600 --algo vector_par
Time (mean ± σ): 81.3 ms ± 2.7 ms [User: 1.700 s, System: 0.059 s]
Range (min … max): 79.4 ms … 94.3 ms
Warning: The first benchmarking run for this command was significantly slower than the rest (94.3 ms). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/aobench 800 600 --algo tiled'
Benchmark #1: target/release/aobench 800 600 --algo tiled
Time (mean ± σ): 1.029 s ± 0.020 s [User: 1.025 s, System: 0.004 s]
Range (min … max): 1.004 s … 1.074 s
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/aobench 800 600 --algo tiled_par'
Benchmark #1: target/release/aobench 800 600 --algo tiled_par
Time (mean ± σ): 81.7 ms ± 2.9 ms [User: 1.725 s, System: 0.030 s]
Range (min … max): 79.5 ms … 95.5 ms
Warning: The first benchmarking run for this command was significantly slower than the rest (95.5 ms). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.
+ echo 'Benchmark 128-bit wide vectors'
Benchmark 128-bit wide vectors
+ RUSTFLAGS='-C target-cpu=native '
+ cargo build --release --no-default-features --features=
Finished release [optimized] target(s) in 0.06s
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/aobench 800 600 --algo scalar'
Benchmark #1: target/release/aobench 800 600 --algo scalar
Time (mean ± σ): 2.905 s ± 0.040 s [User: 2.899 s, System: 0.006 s]
Range (min … max): 2.859 s … 3.003 s
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/aobench 800 600 --algo scalar_par'
Benchmark #1: target/release/aobench 800 600 --algo scalar_par
Time (mean ± σ): 176.2 ms ± 4.9 ms [User: 4.276 s, System: 0.046 s]
Range (min … max): 171.6 ms … 188.6 ms
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/aobench 800 600 --algo vector'
Benchmark #1: target/release/aobench 800 600 --algo vector
Time (mean ± σ): 1.207 s ± 0.022 s [User: 1.203 s, System: 0.003 s]
Range (min … max): 1.169 s … 1.238 s
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/aobench 800 600 --algo vector_par'
Benchmark #1: target/release/aobench 800 600 --algo vector_par
Time (mean ± σ): 91.4 ms ± 3.7 ms [User: 1.985 s, System: 0.045 s]
Range (min … max): 88.8 ms … 108.6 ms
Warning: The first benchmarking run for this command was significantly slower than the rest (108.6 ms). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/aobench 800 600 --algo tiled'
Benchmark #1: target/release/aobench 800 600 --algo tiled
Time (mean ± σ): 1.135 s ± 0.024 s [User: 1.131 s, System: 0.004 s]
Range (min … max): 1.106 s … 1.190 s
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/aobench 800 600 --algo tiled_par'
Benchmark #1: target/release/aobench 800 600 --algo tiled_par
Time (mean ± σ): 90.5 ms ± 2.9 ms [User: 1.974 s, System: 0.030 s]
Range (min … max): 88.2 ms … 103.2 ms
+ cd -
/home/alex/code/packed_simd
+ for dir in examples/*/
+ dir=examples/dot_product
+ cd examples/dot_product
+ '[' -f benchmark.sh ']'
+ cd -
/home/alex/code/packed_simd
+ for dir in examples/*/
+ dir=examples/fannkuch_redux
+ cd examples/fannkuch_redux
+ '[' -f benchmark.sh ']'
+ cd -
/home/alex/code/packed_simd
+ for dir in examples/*/
+ dir=examples/mandelbrot
+ cd examples/mandelbrot
+ '[' -f benchmark.sh ']'
+ ./benchmark.sh
+ WIDTH=800
+ HEIGHT=800
+ [[ '' != 1 ]]
+ hash hyperfine
+ echo ''
+ grep -q ispc
+ RUSTFLAGS='-C target-cpu=native '
+ cargo build --release --features=
Updating crates.io index
Compiling version_check v0.1.5
Compiling proc-macro2 v0.4.20
Compiling cfg-if v0.1.5
Compiling unicode-xid v0.1.0
Compiling nodrop v0.1.12
Compiling memoffset v0.2.1
Compiling scopeguard v0.3.3
Compiling rayon-core v1.4.1
Compiling unicode-width v0.1.5
Compiling libc v0.2.43
Compiling rayon v1.0.2
Compiling vec_map v0.8.1
Compiling ansi_term v0.11.0
Compiling strsim v0.7.0
Compiling bitflags v1.0.4
Compiling mandelbrot v0.1.0 (/home/alex/code/packed_simd/examples/mandelbrot)
Compiling either v1.5.0
Compiling crossbeam-utils v0.2.2
Compiling packed_simd v0.3.0 (/home/alex/code/packed_simd)
Compiling arrayvec v0.4.7
Compiling textwrap v0.10.0
Compiling num_cpus v1.8.0
Compiling atty v0.2.11
Compiling clap v2.32.0
Compiling lazy_static v1.1.0
Compiling crossbeam-epoch v0.3.1
Compiling crossbeam-deque v0.2.0
Compiling quote v0.6.8
Compiling syn v0.15.11
Compiling structopt-derive v0.2.12
Compiling structopt v0.2.12
Finished release [optimized] target(s) in 36.22s
+ [[ '' == \1 ]]
+ [[ '' == \1 ]]
+ hyperfine 'target/release/mandelbrot 800 800 --algo scalar'
Benchmark #1: target/release/mandelbrot 800 800 --algo scalar
Time (mean ± σ): 11.8 ms ± 1.4 ms [User: 194.5 ms, System: 20.6 ms]
Range (min … max): 8.7 ms … 16.6 ms
+ hyperfine 'target/release/mandelbrot 800 800 --algo simd'
Benchmark #1: target/release/mandelbrot 800 800 --algo simd
Time (mean ± σ): 8.9 ms ± 0.4 ms [User: 82.6 ms, System: 27.2 ms]
Range (min … max): 8.3 ms … 11.5 ms
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
+ echo ''
+ grep -q ispc
+ cd -
/home/alex/code/packed_simd
+ for dir in examples/*/
+ dir=examples/matrix_inverse
+ cd examples/matrix_inverse
+ '[' -f benchmark.sh ']'
+ cd -
/home/alex/code/packed_simd
+ for dir in examples/*/
+ dir=examples/nbody
+ cd examples/nbody
+ '[' -f benchmark.sh ']'
+ cd -
/home/alex/code/packed_simd
+ for dir in examples/*/
+ dir=examples/options_pricing
+ cd examples/options_pricing
+ '[' -f benchmark.sh ']'
+ ./benchmark.sh
+ NUM_OPTIONS_BLACK_SCHOLES=10000000
+ [[ '' != 1 ]]
+ hash hyperfine
+ ALGS=("black_scholes_scalar" "black_scholes_simd" "black_scholes_simd_par")
+ echo ''
+ grep -q ispc
+ RUSTFLAGS='-C target-cpu=native '
+ cargo build --release --features=
Updating crates.io index
Compiling version_check v0.1.5
Compiling cfg-if v0.1.5
Compiling nodrop v0.1.12
Compiling memoffset v0.2.1
Compiling scopeguard v0.3.3
Compiling rayon-core v1.4.1
Compiling libc v0.2.43
Compiling rayon v1.0.2
Compiling either v1.5.0
Compiling options_pricing v0.1.0 (/home/alex/code/packed_simd/examples/options_pricing)
Compiling crossbeam-utils v0.2.2
Compiling packed_simd v0.3.0 (/home/alex/code/packed_simd)
Compiling arrayvec v0.4.7
Compiling num_cpus v1.8.0
Compiling time v0.1.40
Compiling lazy_static v1.1.0
Compiling crossbeam-epoch v0.3.1
Compiling crossbeam-deque v0.2.0
Finished release [optimized] target(s) in 29.48s
+ [[ '' == \1 ]]
+ ALGS=("binomial_put_scalar" "binomial_put_simd" "binomial_put_simd_par")
+ echo ''
+ grep -q ispc
+ NUM_OPTIONS_BINOMIAL_PUT=500000
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/options_pricing 500000 binomial_put_scalar'
Benchmark #1: target/release/options_pricing 500000 binomial_put_scalar
Time (mean ± σ): 1.240 s ± 0.017 s [User: 1.232 s, System: 0.008 s]
Range (min … max): 1.207 s … 1.264 s
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/options_pricing 500000 binomial_put_simd'
Benchmark #1: target/release/options_pricing 500000 binomial_put_simd
Time (mean ± σ): 379.1 ms ± 18.4 ms [User: 372.6 ms, System: 6.3 ms]
Range (min … max): 356.5 ms … 426.0 ms
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/options_pricing 500000 binomial_put_simd_par'
Benchmark #1: target/release/options_pricing 500000 binomial_put_simd_par
Time (mean ± σ): 30.9 ms ± 3.4 ms [User: 716.5 ms, System: 13.8 ms]
Range (min … max): 29.4 ms … 54.1 ms
Warning: The first benchmarking run for this command was significantly slower than the rest (54.1 ms). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.
+ cd -
/home/alex/code/packed_simd
+ for dir in examples/*/
+ dir=examples/slice_sum
+ cd examples/slice_sum
+ '[' -f benchmark.sh ']'
+ cd -
/home/alex/code/packed_simd
+ for dir in examples/*/
+ dir=examples/spectral_norm
+ cd examples/spectral_norm
+ '[' -f benchmark.sh ']'
+ cd -
/home/alex/code/packed_simd
+ for dir in examples/*/
+ dir=examples/stencil
+ cd examples/stencil
+ '[' -f benchmark.sh ']'
+ ./benchmark.sh
+ [[ '' != 1 ]]
+ hash hyperfine
+ algs=("0" "1" "2")
+ echo ''
+ grep -q ispc
+ RUSTFLAGS='-C target-cpu=native '
+ cargo build --release --no-default-features --features=
Updating crates.io index
Compiling version_check v0.1.5
Compiling nodrop v0.1.12
Compiling cfg-if v0.1.5
Compiling scopeguard v0.3.3
Compiling memoffset v0.2.1
Compiling rayon-core v1.4.1
Compiling libc v0.2.43
Compiling rayon v1.0.2
Compiling stencil v0.1.0 (/home/alex/code/packed_simd/examples/stencil)
Compiling either v1.5.0
Compiling crossbeam-utils v0.2.2
Compiling packed_simd v0.3.0 (/home/alex/code/packed_simd)
Compiling arrayvec v0.4.7
Compiling num_cpus v1.8.0
Compiling time v0.1.40
Compiling lazy_static v1.1.0
Compiling crossbeam-epoch v0.3.1
Compiling crossbeam-deque v0.2.0
Finished release [optimized] target(s) in 29.72s
+ [[ '' == \1 ]]
+ [[ '' == \1 ]]
+ for alg in "${algs[@]}"
+ hyperfine 'target/release/stencil 0'
Benchmark #1: target/release/stencil 0
Time (mean ± σ): 1.280 s ± 0.014 s [User: 1.216 s, System: 0.063 s]
Range (min … max): 1.257 s … 1.298 s
+ for alg in "${algs[@]}"
+ hyperfine 'target/release/stencil 1'
Benchmark #1: target/release/stencil 1
Time (mean ± σ): 282.5 ms ± 5.7 ms [User: 210.6 ms, System: 71.7 ms]
Range (min … max): 277.1 ms … 297.1 ms
+ for alg in "${algs[@]}"
+ hyperfine 'target/release/stencil 2'
Benchmark #1: target/release/stencil 2
Time (mean ± σ): 122.5 ms ± 5.2 ms [User: 1.361 s, System: 0.075 s]
Range (min … max): 119.0 ms … 144.1 ms
Warning: The first benchmarking run for this command was significantly slower than the rest (144.1 ms). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.
+ cd -
/home/alex/code/packed_simd
+ for dir in examples/*/
+ dir=examples/triangle_xform
+ cd examples/triangle_xform
+ '[' -f benchmark.sh ']'
+ cd -
/home/alex/code/packed_simd
+ [[ '' != 1 ]]
+ hash hyperfine
+ echo ''
+ grep -q ispc
+ for dir in examples/*/
+ dir=examples/aobench
+ cd examples/aobench
+ '[' -f benchmark.sh ']'
+ ./benchmark.sh
+ export WIDTH=800
+ WIDTH=800
+ export HEIGHT=600
+ HEIGHT=600
+ [[ '' != 1 ]]
+ hash hyperfine
+ ALGS=("scalar" "scalar_par" "vector" "vector_par" "tiled" "tiled_par")
+ echo ''
+ grep -q ispc
+ echo 'Benchmark 256-bit wide vectors'
Benchmark 256-bit wide vectors
+ RUSTFLAGS='-C target-cpu=native '
+ cargo build --release --no-default-features --features=,256bit
Compiling lazy_static v1.1.0
Compiling packed_simd v0.3.0 (/home/alex/code/packed_simd)
Compiling backtrace-sys v0.1.24
Compiling atty v0.2.11
Compiling num_cpus v1.8.0
Compiling time v0.1.40
Compiling proc-macro2 v0.4.20
Compiling num-integer v0.1.39
Compiling clap v2.32.0
Compiling crossbeam-epoch v0.3.1
Compiling num-iter v0.1.37
Compiling png v0.12.0
Compiling crossbeam-deque v0.2.0
Compiling rayon-core v1.4.1
Compiling backtrace v0.3.9
Compiling rayon v1.0.2
Compiling quote v0.6.8
Compiling syn v0.14.9
Compiling syn v0.15.11
Compiling structopt-derive v0.2.12
Compiling synstructure v0.9.0
Compiling structopt v0.2.12
Compiling failure_derive v0.1.2
Compiling failure v0.1.2
Compiling aobench v0.1.0 (/home/alex/code/packed_simd/examples/aobench)
Finished release [optimized] target(s) in 45.04s
+ [[ '' == \1 ]]
+ [[ '' == \1 ]]
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/aobench 800 600 --algo scalar'
Benchmark #1: target/release/aobench 800 600 --algo scalar
Time (mean ± σ): 2.979 s ± 0.032 s [User: 2.973 s, System: 0.007 s]
Range (min … max): 2.921 s … 3.044 s
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/aobench 800 600 --algo scalar_par'
Benchmark #1: target/release/aobench 800 600 --algo scalar_par
Time (mean ± σ): 181.5 ms ± 8.2 ms [User: 4.336 s, System: 0.052 s]
Range (min … max): 173.6 ms … 205.1 ms
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/aobench 800 600 --algo vector'
Benchmark #1: target/release/aobench 800 600 --algo vector
Time (mean ± σ): 1.079 s ± 0.018 s [User: 1.074 s, System: 0.005 s]
Range (min … max): 1.054 s … 1.118 s
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/aobench 800 600 --algo vector_par'
Benchmark #1: target/release/aobench 800 600 --algo vector_par
Time (mean ± σ): 81.5 ms ± 2.7 ms [User: 1.714 s, System: 0.044 s]
Range (min … max): 79.6 ms … 94.8 ms
Warning: The first benchmarking run for this command was significantly slower than the rest (94.8 ms). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/aobench 800 600 --algo tiled'
Benchmark #1: target/release/aobench 800 600 --algo tiled
Time (mean ± σ): 1.046 s ± 0.015 s [User: 1.042 s, System: 0.003 s]
Range (min … max): 1.024 s … 1.065 s
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/aobench 800 600 --algo tiled_par'
Benchmark #1: target/release/aobench 800 600 --algo tiled_par
Time (mean ± σ): 81.4 ms ± 2.5 ms [User: 1.714 s, System: 0.035 s]
Range (min … max): 79.4 ms … 93.4 ms
Warning: The first benchmarking run for this command was significantly slower than the rest (93.4 ms). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.
+ echo 'Benchmark 128-bit wide vectors'
Benchmark 128-bit wide vectors
+ RUSTFLAGS='-C target-cpu=native '
+ cargo build --release --no-default-features --features=
Compiling aobench v0.1.0 (/home/alex/code/packed_simd/examples/aobench)
Finished release [optimized] target(s) in 13.06s
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/aobench 800 600 --algo scalar'
Benchmark #1: target/release/aobench 800 600 --algo scalar
Time (mean ± σ): 2.896 s ± 0.050 s [User: 2.890 s, System: 0.006 s]
Range (min … max): 2.824 s … 2.960 s
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/aobench 800 600 --algo scalar_par'
Benchmark #1: target/release/aobench 800 600 --algo scalar_par
Time (mean ± σ): 181.8 ms ± 8.1 ms [User: 4.332 s, System: 0.059 s]
Range (min … max): 173.9 ms … 198.2 ms
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/aobench 800 600 --algo vector'
Benchmark #1: target/release/aobench 800 600 --algo vector
Time (mean ± σ): 1.201 s ± 0.032 s [User: 1.197 s, System: 0.004 s]
Range (min … max): 1.161 s … 1.261 s
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/aobench 800 600 --algo vector_par'
Benchmark #1: target/release/aobench 800 600 --algo vector_par
Time (mean ± σ): 91.6 ms ± 3.5 ms [User: 1.985 s, System: 0.048 s]
Range (min … max): 89.0 ms … 106.9 ms
Warning: The first benchmarking run for this command was significantly slower than the rest (106.9 ms). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/aobench 800 600 --algo tiled'
Benchmark #1: target/release/aobench 800 600 --algo tiled
Time (mean ± σ): 1.126 s ± 0.018 s [User: 1.122 s, System: 0.004 s]
Range (min … max): 1.102 s … 1.160 s
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/aobench 800 600 --algo tiled_par'
Benchmark #1: target/release/aobench 800 600 --algo tiled_par
Time (mean ± σ): 90.7 ms ± 3.7 ms [User: 1.966 s, System: 0.030 s]
Range (min … max): 87.8 ms … 107.6 ms
Warning: The first benchmarking run for this command was significantly slower than the rest (107.6 ms). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.
+ cd -
/home/alex/code/packed_simd
+ for dir in examples/*/
+ dir=examples/dot_product
+ cd examples/dot_product
+ '[' -f benchmark.sh ']'
+ cd -
/home/alex/code/packed_simd
+ for dir in examples/*/
+ dir=examples/fannkuch_redux
+ cd examples/fannkuch_redux
+ '[' -f benchmark.sh ']'
+ cd -
/home/alex/code/packed_simd
+ for dir in examples/*/
+ dir=examples/mandelbrot
+ cd examples/mandelbrot
+ '[' -f benchmark.sh ']'
+ ./benchmark.sh
+ WIDTH=800
+ HEIGHT=800
+ [[ '' != 1 ]]
+ hash hyperfine
+ echo ''
+ grep -q ispc
+ RUSTFLAGS='-C target-cpu=native '
+ cargo build --release --features=
Compiling version_check v0.1.5
Compiling proc-macro2 v0.4.20
Compiling nodrop v0.1.12
Compiling unicode-xid v0.1.0
Compiling cfg-if v0.1.5
Compiling memoffset v0.2.1
Compiling scopeguard v0.3.3
Compiling libc v0.2.43
Compiling unicode-width v0.1.5
Compiling rayon-core v1.4.1
Compiling vec_map v0.8.1
Compiling ansi_term v0.11.0
Compiling rayon v1.0.2
Compiling strsim v0.7.0
Compiling bitflags v1.0.4
Compiling either v1.5.0
Compiling mandelbrot v0.1.0 (/home/alex/code/packed_simd/examples/mandelbrot)
Compiling crossbeam-utils v0.2.2
Compiling packed_simd v0.3.0 (/home/alex/code/packed_simd)
Compiling arrayvec v0.4.7
Compiling textwrap v0.10.0
Compiling atty v0.2.11
Compiling num_cpus v1.8.0
Compiling clap v2.32.0
Compiling lazy_static v1.1.0
Compiling crossbeam-epoch v0.3.1
Compiling crossbeam-deque v0.2.0
Compiling quote v0.6.8
Compiling syn v0.15.11
Compiling structopt-derive v0.2.12
Compiling structopt v0.2.12
Finished release [optimized] target(s) in 36.96s
+ [[ '' == \1 ]]
+ [[ '' == \1 ]]
+ hyperfine 'target/release/mandelbrot 800 800 --algo scalar'
Benchmark #1: target/release/mandelbrot 800 800 --algo scalar
Time (mean ± σ): 12.3 ms ± 1.3 ms [User: 197.9 ms, System: 21.9 ms]
Range (min … max): 9.5 ms … 15.4 ms
+ hyperfine 'target/release/mandelbrot 800 800 --algo simd'
Benchmark #1: target/release/mandelbrot 800 800 --algo simd
Time (mean ± σ): 9.0 ms ± 0.6 ms [User: 77.9 ms, System: 29.1 ms]
Range (min … max): 7.4 ms … 13.0 ms
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
+ echo ''
+ grep -q ispc
+ cd -
/home/alex/code/packed_simd
+ for dir in examples/*/
+ dir=examples/matrix_inverse
+ cd examples/matrix_inverse
+ '[' -f benchmark.sh ']'
+ cd -
/home/alex/code/packed_simd
+ for dir in examples/*/
+ dir=examples/nbody
+ cd examples/nbody
+ '[' -f benchmark.sh ']'
+ cd -
/home/alex/code/packed_simd
+ for dir in examples/*/
+ dir=examples/options_pricing
+ cd examples/options_pricing
+ '[' -f benchmark.sh ']'
+ ./benchmark.sh
+ NUM_OPTIONS_BLACK_SCHOLES=10000000
+ [[ '' != 1 ]]
+ hash hyperfine
+ ALGS=("black_scholes_scalar" "black_scholes_simd" "black_scholes_simd_par")
+ echo ''
+ grep -q ispc
+ RUSTFLAGS='-C target-cpu=native '
+ cargo build --release --features=
Compiling version_check v0.1.5
Compiling cfg-if v0.1.5
Compiling nodrop v0.1.12
Compiling scopeguard v0.3.3
Compiling memoffset v0.2.1
Compiling libc v0.2.43
Compiling rayon-core v1.4.1
Compiling rayon v1.0.2
Compiling options_pricing v0.1.0 (/home/alex/code/packed_simd/examples/options_pricing)
Compiling either v1.5.0
Compiling crossbeam-utils v0.2.2
Compiling packed_simd v0.3.0 (/home/alex/code/packed_simd)
Compiling arrayvec v0.4.7
Compiling num_cpus v1.8.0
Compiling time v0.1.40
Compiling lazy_static v1.1.0
Compiling crossbeam-epoch v0.3.1
Compiling crossbeam-deque v0.2.0
Finished release [optimized] target(s) in 30.18s
+ [[ '' == \1 ]]
+ ALGS=("binomial_put_scalar" "binomial_put_simd" "binomial_put_simd_par")
+ echo ''
+ grep -q ispc
+ NUM_OPTIONS_BINOMIAL_PUT=500000
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/options_pricing 500000 binomial_put_scalar'
Benchmark #1: target/release/options_pricing 500000 binomial_put_scalar
Time (mean ± σ): 1.263 s ± 0.089 s [User: 1.257 s, System: 0.006 s]
Range (min … max): 1.211 s … 1.511 s
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/options_pricing 500000 binomial_put_simd'
Benchmark #1: target/release/options_pricing 500000 binomial_put_simd
Time (mean ± σ): 376.8 ms ± 16.3 ms [User: 370.0 ms, System: 6.7 ms]
Range (min … max): 349.1 ms … 412.2 ms
+ for alg in "${ALGS[@]}"
+ hyperfine 'target/release/options_pricing 500000 binomial_put_simd_par'
Benchmark #1: target/release/options_pricing 500000 binomial_put_simd_par
Time (mean ± σ): 31.4 ms ± 3.3 ms [User: 723.1 ms, System: 12.3 ms]
Range (min … max): 29.5 ms … 53.8 ms
Warning: The first benchmarking run for this command was significantly slower than the rest (53.8 ms). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.
+ cd -
/home/alex/code/packed_simd
+ for dir in examples/*/
+ dir=examples/slice_sum
+ cd examples/slice_sum
+ '[' -f benchmark.sh ']'
+ cd -
/home/alex/code/packed_simd
+ for dir in examples/*/
+ dir=examples/spectral_norm
+ cd examples/spectral_norm
+ '[' -f benchmark.sh ']'
+ cd -
/home/alex/code/packed_simd
+ for dir in examples/*/
+ dir=examples/stencil
+ cd examples/stencil
+ '[' -f benchmark.sh ']'
+ ./benchmark.sh
+ [[ '' != 1 ]]
+ hash hyperfine
+ algs=("0" "1" "2")
+ echo ''
+ grep -q ispc
+ RUSTFLAGS='-C target-cpu=native '
+ cargo build --release --no-default-features --features=
Compiling version_check v0.1.5
Compiling cfg-if v0.1.5
Compiling nodrop v0.1.12
Compiling scopeguard v0.3.3
Compiling memoffset v0.2.1
Compiling libc v0.2.43
Compiling rayon-core v1.4.1
Compiling rayon v1.0.2
Compiling either v1.5.0
Compiling stencil v0.1.0 (/home/alex/code/packed_simd/examples/stencil)
Compiling crossbeam-utils v0.2.2
Compiling packed_simd v0.3.0 (/home/alex/code/packed_simd)
Compiling arrayvec v0.4.7
Compiling num_cpus v1.8.0
Compiling time v0.1.40
Compiling lazy_static v1.1.0
Compiling crossbeam-epoch v0.3.1
Compiling crossbeam-deque v0.2.0
Finished release [optimized] target(s) in 30.53s
+ [[ '' == \1 ]]
+ [[ '' == \1 ]]
+ for alg in "${algs[@]}"
+ hyperfine 'target/release/stencil 0'
Benchmark #1: target/release/stencil 0
Time (mean ± σ): 1.270 s ± 0.015 s [User: 1.197 s, System: 0.072 s]
Range (min … max): 1.245 s … 1.291 s
+ for alg in "${algs[@]}"
+ hyperfine 'target/release/stencil 1'
Benchmark #1: target/release/stencil 1
Time (mean ± σ): 282.9 ms ± 3.8 ms [User: 207.1 ms, System: 75.7 ms]
Range (min … max): 277.0 ms … 289.9 ms
+ for alg in "${algs[@]}"
+ hyperfine 'target/release/stencil 2'
Benchmark #1: target/release/stencil 2
Time (mean ± σ): 129.9 ms ± 4.2 ms [User: 1.343 s, System: 0.086 s]
Range (min … max): 126.3 ms … 146.7 ms
Warning: The first benchmarking run for this command was significantly slower than the rest (146.7 ms). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.
+ cd -
/home/alex/code/packed_simd
+ for dir in examples/*/
+ dir=examples/triangle_xform
+ cd examples/triangle_xform
+ '[' -f benchmark.sh ']'
+ cd -
/home/alex/code/packed_simd