- CPU: Ryzen 7 5700G (Zen3)
- RAM: 16 GiB
- DISK: SSD 470GiB (format: btrfs)
- OS: Arch Linux (last update installed: 20-05-2024)
- D: `dub build -b release`
  - note: using latest ldc2 (2.108.1) compiler!
- Zig: `zig build -Doptimize=ReleaseFast`
  - (zig v0.12.0)
- C++: `cmake -B build -DCMAKE_BUILD_TYPE=Release`
  - libstdc++ [shared]: `g++14` (archlinux)
  - libstdc++ [static]: `g++14 -static-libstdc++` (archlinux) [add: `-DCMAKE_EXE_LINKER_FLAGS=-static-libstdc++`]
  - libc++ [shared]: `clang++17 -stdlib=libc++` (archlinux)
  - libc++ [static]: `zig c++` (a.k.a. `clang++17 -stdlib=libc++ -static` [default])
poop/perf bench
$ poop \
-d 100 \
'./sieve-cache-d/benchmark/benchmark' \
'./sieve-cache-cpp/build_libstdcpp/SieveCache_bench' \
'./sieve-cache-cpp/build_libcpp/SieveCache_bench' \
'./sieve-cache-cpp/build_libstdcpp_static/SieveCache_bench' \
'./sieve-cache-cpp/build_libcpp_static/SieveCache_bench' \
'./zig-sieve/zig-out/bin/sieve_bench'
Benchmark 1 (37 runs): ./sieve-cache-d/benchmark/benchmark
measurement mean ± σ min … max outliers delta
wall_time 1.42ms ± 151us 1.15ms … 1.94ms 2 ( 5%) 0%
peak_rss 3.19MB ± 481KB 2.74MB … 5.14MB 3 ( 8%) 0%
cpu_cycles 2.02M ± 101K 1.94M … 2.39M 3 ( 8%) 0%
instructions 4.50M ± 19.0K 4.46M … 4.53M 0 ( 0%) 0%
cache_references 56.1K ± 1.60K 53.9K … 61.0K 4 (11%) 0%
cache_misses 14.2K ± 1.06K 13.2K … 17.1K 4 (11%) 0%
branch_misses 16.7K ± 269 16.2K … 17.7K 1 ( 3%) 0%
Benchmark 2 (64 runs): ./sieve-cache-cpp/build_libstdcpp/SieveCache_bench
measurement mean ± σ min … max outliers delta
wall_time 1.54ms ± 128us 1.29ms … 1.88ms 1 ( 2%) 💩+ 8.2% ± 3.9%
peak_rss 3.79MB ± 35.4KB 3.67MB … 3.80MB 8 (13%) 💩+ 18.8% ± 3.8%
cpu_cycles 3.04M ± 161K 2.92M … 4.03M 4 ( 6%) 💩+ 50.2% ± 2.9%
instructions 7.20M ± 23.5K 7.15M … 7.25M 0 ( 0%) 💩+ 60.2% ± 0.2%
cache_references 63.5K ± 6.77K 60.3K … 101K 7 (11%) 💩+ 13.1% ± 4.0%
cache_misses 17.1K ± 622 16.1K … 19.3K 4 ( 6%) 💩+ 20.2% ± 2.3%
branch_misses 22.6K ± 159 22.3K … 23.1K 3 ( 5%) 💩+ 35.2% ± 0.5%
Benchmark 3 (15 runs): ./sieve-cache-cpp/build_libcpp/SieveCache_bench
measurement mean ± σ min … max outliers delta
wall_time 6.91ms ± 192us 6.64ms … 7.33ms 0 ( 0%) 💩+385.4% ± 7.1%
peak_rss 3.22MB ± 82.6KB 3.15MB … 3.41MB 0 ( 0%) + 1.1% ± 7.9%
cpu_cycles 26.5M ± 159K 26.3M … 26.9M 0 ( 0%) 💩+1211.6% ± 3.6%
instructions 37.5M ± 151K 37.2M … 37.7M 0 ( 0%) 💩+732.7% ± 1.1%
cache_references 139K ± 15.3K 121K … 167K 0 ( 0%) 💩+147.9% ± 9.0%
cache_misses 15.6K ± 852 14.0K … 17.1K 0 ( 0%) 💩+ 10.0% ± 4.3%
branch_misses 22.9K ± 445 22.3K … 24.0K 1 ( 7%) 💩+ 37.2% ± 1.2%
Benchmark 4 (25 runs): ./sieve-cache-cpp/build_libstdcpp_static/SieveCache_bench
measurement mean ± σ min … max outliers delta
wall_time 4.06ms ± 186us 3.74ms … 4.57ms 0 ( 0%) 💩+184.8% ± 6.0%
peak_rss 3.08MB ± 107KB 2.88MB … 3.28MB 0 ( 0%) - 3.4% ± 6.1%
cpu_cycles 14.2M ± 180K 14.0M … 14.8M 3 (12%) 💩+601.0% ± 3.5%
instructions 25.5M ± 136K 25.3M … 25.9M 0 ( 0%) 💩+467.2% ± 1.0%
cache_references 93.9K ± 8.80K 75.2K … 113K 0 ( 0%) 💩+ 67.2% ± 5.3%
cache_misses 11.4K ± 1.01K 10.2K … 13.1K 0 ( 0%) ⚡- 19.7% ± 3.8%
branch_misses 15.2K ± 340 14.7K … 16.0K 0 ( 0%) ⚡- 9.1% ± 0.9%
Benchmark 5 (16 runs): ./sieve-cache-cpp/build_libcpp_static/SieveCache_bench
measurement mean ± σ min … max outliers delta
wall_time 6.46ms ± 182us 6.13ms … 6.66ms 0 ( 0%) 💩+353.4% ± 6.8%
peak_rss 2.11MB ± 93.0KB 1.96MB … 2.22MB 0 ( 0%) ⚡- 33.9% ± 7.7%
cpu_cycles 25.3M ± 142K 25.0M … 25.6M 0 ( 0%) 💩+1150.1% ± 3.4%
instructions 35.0M ± 153K 34.7M … 35.3M 0 ( 0%) 💩+678.7% ± 1.1%
cache_references 161K ± 4.37K 152K … 167K 0 ( 0%) 💩+187.4% ± 2.9%
cache_misses 10.5K ± 1.21K 8.60K … 12.3K 0 ( 0%) ⚡- 25.8% ± 4.7%
branch_misses 15.5K ± 320 14.9K … 16.1K 0 ( 0%) ⚡- 7.3% ± 1.0%
Benchmark 6 (195 runs): ./zig-sieve/zig-out/bin/sieve_bench
measurement mean ± σ min … max outliers delta
wall_time 484us ± 75.0us 356us … 925us 1 ( 1%) ⚡- 66.0% ± 2.2%
peak_rss 909KB ± 1.17KB 893KB … 909KB 1 ( 1%) ⚡- 71.5% ± 2.1%
cpu_cycles 512K ± 60.8K 425K … 728K 14 ( 7%) ⚡- 74.7% ± 1.2%
instructions 1.39M ± 1.36K 1.39M … 1.39M 5 ( 3%) ⚡- 69.2% ± 0.1%
cache_references 5.50K ± 600 4.97K … 11.8K 7 ( 4%) ⚡- 90.2% ± 0.5%
cache_misses 663 ± 73.6 559 … 1.22K 11 ( 6%) ⚡- 95.3% ± 1.0%
branch_misses 5.16K ± 954 3.37K … 6.28K 0 ( 0%) ⚡- 69.1% ± 1.9%
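For reading the `delta` column: a minimal sketch, assuming the delta is the relative difference between each command's mean and the mean of the first command (the D binary here). The dictionary keys are labels made up for this example, and the values are the rounded `wall_time` means printed above, so the recomputed percentages land within a point or so of the printed ones rather than matching exactly.

```python
# wall_time means copied from the poop output above, in seconds.
# The keys are hypothetical labels for the six binaries.
wall_time_mean = {
    "d": 1.42e-3,
    "cpp_libstdcpp": 1.54e-3,
    "cpp_libcpp": 6.91e-3,
    "cpp_libstdcpp_static": 4.06e-3,
    "cpp_libcpp_static": 6.46e-3,
    "zig": 484e-6,
}


def delta_percent(mean, baseline):
    """Relative change of `mean` against `baseline`, in percent."""
    return (mean - baseline) / baseline * 100.0


# The first command passed to poop is taken as the baseline.
baseline = wall_time_mean["d"]
deltas = {name: delta_percent(m, baseline) for name, m in wall_time_mean.items()}

for name, d in deltas.items():
    print(f"{name:22s} {d:+8.1f}%")
```

The same function can be applied to any other row (`cpu_cycles`, `instructions`, ...) by swapping in that row's means.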
runtime test
$ time ./sieve-cache-d/benchmark/benchmark;\
time ./sieve-cache-cpp/build_libstdcpp/SieveCache_bench;\
time ./sieve-cache-cpp/build_libcpp/SieveCache_bench;\
time ./sieve-cache-cpp/build_libstdcpp_static/SieveCache_bench;\
time ./sieve-cache-cpp/build_libcpp_static/SieveCache_bench;\
time ./zig-sieve/zig-out/bin/sieve_bench
Sequence: 136 μs and 5 hnsecs
Composite: 123 μs and 4 hnsecs
CompositeNormal: 182 μs and 7 hnsecs
real 0m0,002s
user 0m0,002s
sys 0m0,000s
Sequence: 101us
Composite: 240us
Composite Normal: 393us
real 0m0,002s
user 0m0,000s
sys 0m0,002s
Sequence: 1.28e+03us
Composite: 3.57e+03us
Composite Normal: 5.9e+03us
real 0m0,007s
user 0m0,007s
sys 0m0,000s
Sequence: 922us
Composite: 1.82e+03us
Composite Normal: 3.06e+03us
real 0m0,004s
user 0m0,004s
sys 0m0,000s
Sequence: 1.4e+03us
Composite: 3.71e+03us
Composite Normal: 5.96e+03us
real 0m0,007s
user 0m0,007s
sys 0m0,000s
Sequence: 36.618us
Composite: 45.895us
Composite Normal: 124.349us
real 0m0,001s
user 0m0,001s
sys 0m0,000s
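The three programs print their internal timings in different formats, which makes the rows hard to compare by eye. Below is a small sketch that normalizes the `Sequence` values above to microseconds. It assumes only the two string shapes seen in this output; D's `hnsecs` unit is 100 ns, so 1 hnsec = 0.1 µs. The function name and the dictionary keys are made up for this example.

```python
import re


def to_microseconds(text):
    """Convert one timing string from the output above to microseconds.

    Handles the two shapes seen in this run:
      "136 μs and 5 hnsecs"       (D; 1 hnsec = 100 ns = 0.1 µs)
      "1.28e+03us" or "36.618us"  (C++ and Zig)
    """
    text = text.strip()
    # Accept both the Greek mu and the micro sign.
    m = re.fullmatch(r"(\d+) [μµ]s and (\d+) hnsecs?", text)
    if m:
        return int(m.group(1)) + int(m.group(2)) * 0.1
    m = re.fullmatch(r"([0-9.]+(?:e[+-]?\d+)?)us", text)
    if m:
        return float(m.group(1))
    raise ValueError(f"unrecognized timing: {text!r}")


# "Sequence" rows copied from the output above; keys are hypothetical labels.
sequence = {
    "d": "136 μs and 5 hnsecs",
    "cpp_libstdcpp": "101us",
    "cpp_libcpp": "1.28e+03us",
    "cpp_libstdcpp_static": "922us",
    "cpp_libcpp_static": "1.4e+03us",
    "zig": "36.618us",
}
sequence_us = {name: to_microseconds(t) for name, t in sequence.items()}

for name, us in sequence_us.items():
    print(f"{name:22s} {us:10.1f} us")
```

Note that these in-process timers cover only the measured loops, while `real`/`user`/`sys` from `time` cover the whole process, so the two sets of numbers are not directly comparable.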
Hyperfine bench
$ hyperfine \
-M 100 \
-N \
-w 5 \
'./sieve-cache-d/benchmark/benchmark' \
'./sieve-cache-cpp/build_libstdcpp/SieveCache_bench' \
'./sieve-cache-cpp/build_libcpp/SieveCache_bench' \
'./sieve-cache-cpp/build_libstdcpp_static/SieveCache_bench' \
'./sieve-cache-cpp/build_libcpp_static/SieveCache_bench' \
'./zig-sieve/zig-out/bin/sieve_bench'
Benchmark 1: ./sieve-cache-d/benchmark/benchmark
Time (mean ± σ): 1.3 ms ± 0.1 ms [User: 0.9 ms, System: 0.3 ms]
Range (min … max): 1.1 ms … 1.7 ms 100 runs
Benchmark 2: ./sieve-cache-cpp/build_libstdcpp/SieveCache_bench
Time (mean ± σ): 1.4 ms ± 0.1 ms [User: 1.1 ms, System: 0.3 ms]
Range (min … max): 1.2 ms … 1.8 ms 100 runs
Benchmark 3: ./sieve-cache-cpp/build_libcpp/SieveCache_bench
Time (mean ± σ): 6.9 ms ± 0.2 ms [User: 6.1 ms, System: 0.7 ms]
Range (min … max): 6.6 ms … 7.4 ms 100 runs
Benchmark 4: ./sieve-cache-cpp/build_libstdcpp_static/SieveCache_bench
Time (mean ± σ): 4.0 ms ± 0.2 ms [User: 3.5 ms, System: 0.4 ms]
Range (min … max): 3.6 ms … 4.4 ms 100 runs
Benchmark 5: ./sieve-cache-cpp/build_libcpp_static/SieveCache_bench
Time (mean ± σ): 6.4 ms ± 0.2 ms [User: 5.7 ms, System: 0.5 ms]
Range (min … max): 6.0 ms … 7.2 ms 100 runs
Benchmark 6: ./zig-sieve/zig-out/bin/sieve_bench
Time (mean ± σ): 398.3 µs ± 52.9 µs [User: 323.0 µs, System: 29.6 µs]
Range (min … max): 298.9 µs … 559.2 µs 100 runs
Summary
./zig-sieve/zig-out/bin/sieve_bench ran
3.25 ± 0.52 times faster than ./sieve-cache-d/benchmark/benchmark
3.61 ± 0.56 times faster than ./sieve-cache-cpp/build_libstdcpp/SieveCache_bench
10.04 ± 1.41 times faster than ./sieve-cache-cpp/build_libstdcpp_static/SieveCache_bench
16.02 ± 2.19 times faster than ./sieve-cache-cpp/build_libcpp_static/SieveCache_bench
17.33 ± 2.36 times faster than ./sieve-cache-cpp/build_libcpp/SieveCache_bench
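As a sanity check on the summary, the "times faster" figures can be approximately reproduced from the means and standard deviations printed above. This is a sketch that assumes first-order error propagation for a quotient; the inputs are the rounded values from the output, so small differences from the printed ratios are expected. The function name is made up for this example.

```python
import math


def speedup(mean, sigma, ref_mean, ref_sigma):
    """Ratio mean/ref_mean with a first-order propagated uncertainty.

    Assumes the two measurements are independent, so the relative
    errors add in quadrature.
    """
    ratio = mean / ref_mean
    err = ratio * math.sqrt((sigma / mean) ** 2 + (ref_sigma / ref_mean) ** 2)
    return ratio, err


# (mean, sigma) in microseconds, copied from the hyperfine output above.
zig = (398.3, 52.9)
others = {
    "d": (1300.0, 100.0),
    "cpp_libstdcpp": (1400.0, 100.0),
    "cpp_libstdcpp_static": (4000.0, 200.0),
    "cpp_libcpp_static": (6400.0, 200.0),
    "cpp_libcpp": (6900.0, 200.0),
}

results = {name: speedup(m, s, *zig) for name, (m, s) in others.items()}
for name, (ratio, err) in results.items():
    print(f"zig is {ratio:5.2f} ± {err:4.2f} times faster than {name}")
```

The relative uncertainty is dominated by the Zig run itself (about 13% of its mean), which is why every ratio carries a similar relative error.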
tools: poop, hyperfine
Brief analysis.
And Rust?
The `cargo bench` command does not show me what really happens under the hood (see rust-sieve-cache).
cc: @1a1a11a @kubo39