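Minor improvement from the Hilbert-curve traversal when built with -O3 and -march=native: about 1.1% with GCC (73458 ms vs 72654 ms) and about 3.1% with Clang (75060 ms vs 72768 ms). Without -march=native the gain is about 0.2% with GCC and about 3.5% with Clang. Machine: AMD Ryzen 7 5800X.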
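For context, a Hilbert-order traversal visits the cells of a power-of-two square grid so that consecutive indices are always neighbouring cells, which is why it is tried here as a cache-friendlier alternative to row-major order. The sketch below is the standard index-to-coordinate mapping, not necessarily what hilbert.cc does; the name hilbert_d2xy and its signature are assumptions made for illustration. The 1024 x 1024 "Dim" in the logs is a power of two, as this mapping requires.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical helper (not taken from hilbert.cc): map a Hilbert index d in
// [0, n*n) to grid coordinates (x, y) on an n x n grid. n must be a power of two.
std::pair<std::uint32_t, std::uint32_t> hilbert_d2xy(std::uint32_t n, std::uint32_t d) {
    std::uint32_t x = 0, y = 0, t = d;
    for (std::uint32_t s = 1; s < n; s *= 2) {
        // Two bits of the index select the quadrant at this scale.
        std::uint32_t rx = 1 & (t / 2);
        std::uint32_t ry = 1 & (t ^ rx);
        // Rotate/flip the sub-square so the curve stays continuous.
        if (ry == 0) {
            if (rx == 1) {
                x = s - 1 - x;
                y = s - 1 - y;
            }
            std::swap(x, y);
        }
        // Move into the selected quadrant.
        x += s * rx;
        y += s * ry;
        t /= 4;
    }
    return {x, y};
}
// For n = 2 the visiting order is (0,0), (0,1), (1,1), (1,0).
```

A sliding-window filter could iterate d from 0 to n*n - 1 and process the window anchored at hilbert_d2xy(n, d), so that successive windows overlap heavily.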
Built with (without -march=native):
c++ -o hilbert-gcc hilbert.cc -O3 -std=c++17 -g -fno-omit-frame-pointer -DGOLDEN -DHILBERT
clang++ -o hilbert-clang hilbert.cc -O3 -std=c++17 -g -fno-omit-frame-pointer -DGOLDEN -DHILBERT
(base) [nimalan@localhost ctest]$ ./hilbert-gcc 1124 50
Percent masked: 0.200070
Image size 1124 1124
Window size 50 50
Result size 1023 1023
Time taken CPU Median filter: 73317 ms
Dim: 1024 1024
Size: 1048576
Res Size: 1023 1023
Time taken Hilbert Median filter: 73159 ms
(base) [nimalan@localhost ctest]$ ./hilbert-clang 1124 50
Percent masked: 0.200070
Image size 1124 1124
Window size 50 50
Result size 1023 1023
Time taken CPU Median filter: 74025 ms
Dim: 1024 1024
Size: 1048576
Res Size: 1023 1023
Time taken Hilbert Median filter: 71439 ms
# Event 'cycles:u'
#
# Baseline  Delta Abs  Shared Object         Symbol
# ........  .........  ....................  ......
#
              +84.96%  hilbert-only-clang    [.] std::__introselect<__gnu_cxx::__normal_iterator<float*, std::vector<float, std::allocator<float> > >, long, __gnu_cxx::__ops::_Iter_less_iter>
              +14.98%  hilbert-only-clang    [.] hilbert_sliding_window
     0.10%     -0.04%  [unknown]             [k] 0xffffffffa3a06beb
               +0.00%  hilbert-only-clang    [.] experiment
     0.00%     +0.00%  ld-linux-x86-64.so.2  [.] handle_amd
    84.17%             hilbert-only          [.] std::__introselect<__gnu_cxx::__normal_iterator<float*, std::vector<float, std::allocator<float> > >, long, __gnu_cxx::__ops::_Iter_less_iter>
    15.73%             hilbert-only          [.] hilbert_sliding_window
     0.00%             hilbert-only          [.] experiment
     0.00%             ld-linux-x86-64.so.2  [.] _dl_start
(base) [nimalan@localhost ctest]$ perf report -g -i perf.data
(base) [nimalan@localhost ctest]$ clang++ -o hilbert-only-clang hilbert.cc -O3 -std=c++17 -g -fno-omit-frame-pointer -DHILBERT -march=native
--------------------------------------------------------------------
Built with (with -march=native):
c++ -o hilbert-gcc hilbert.cc -O3 -std=c++17 -g -fno-omit-frame-pointer -DGOLDEN -DHILBERT -march=native
clang++ -o hilbert-clang hilbert.cc -O3 -std=c++17 -g -fno-omit-frame-pointer -DGOLDEN -DHILBERT -march=native
(base) [nimalan@localhost ctest]$ ./hilbert-gcc 1124 50
Percent masked: 0.200070
Image size 1124 1124
Window size 50 50
Result size 1023 1023
Time taken CPU Median filter: 73458 ms
Dim: 1024 1024
Size: 1048576
Res Size: 1023 1023
Time taken Hilbert Median filter: 72654 ms
(base) [nimalan@localhost ctest]$ ./hilbert-clang 1124 50
Percent masked: 0.200070
Image size 1124 1124
Window size 50 50
Result size 1023 1023
Time taken CPU Median filter: 75060 ms
Dim: 1024 1024
Size: 1048576
Res Size: 1023 1023
Time taken Hilbert Median filter: 72768 ms
(base) [nimalan@localhost ctest]$ perf record -g -F99 ./hilbert-only 1124 50
Percent masked: 0.200070
Image size 1124 1124
Window size 50 50
Result size 1023 1023
Dim: 1024 1024
Size: 1048576
Res Size: 1023 1023
Time taken Hilbert Median filter: 72190 ms
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.485 MB perf.data (7154 samples) ]
(base) [nimalan@localhost ctest]$ perf record -g -F99 ./hilbert-only-clang 1124 50
Percent masked: 0.200070
Image size 1124 1124
Window size 50 50
Result size 1023 1023
Dim: 1024 1024
Size: 1048576
Res Size: 1023 1023
Time taken Hilbert Median filter: 71963 ms
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.437 MB perf.data (7128 samples) ]
# Event 'cycles:u'
#
# Baseline  Delta Abs  Shared Object         Symbol
# ........  .........  ....................  ......
#
              +83.71%  hilbert-only-clang    [.] std::__introselect<__gnu_cxx::__normal_iterator<float*, std::vector<float, std::allocator<float> > >, long, __gnu_cxx::__ops::_Iter_less_iter>
              +16.23%  hilbert-only-clang    [.] hilbert_sliding_window
     0.07%     -0.01%  [unknown]             [k] 0xffffffffa3a06beb
               +0.00%  hilbert-only-clang    [.] experiment
     0.00%     +0.00%  ld-linux-x86-64.so.2  [.] handle_amd
    84.28%             hilbert-only          [.] std::__introselect<__gnu_cxx::__normal_iterator<float*, std::vector<float, std::allocator<float> > >, long, __gnu_cxx::__ops::_Iter_less_iter>
    15.64%             hilbert-only          [.] hilbert_sliding_window
     0.00%             hilbert-only          [.] experiment
     0.00%             ld-linux-x86-64.so.2  [.] _dl_start
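
Reading the profile: std::__introselect is the libstdc++ routine behind std::nth_element, so in both builds roughly 84% of user cycles go to selecting the median and only about 16% to hilbert_sliding_window itself. That would explain why changing the traversal order moves the total time so little. The sketch below shows median selection done this way; window_median is a hypothetical name, and copying the window into a std::vector<float> is an assumption, not something confirmed from hilbert.cc.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper (not taken from hilbert.cc): median of one window's values.
// The vector is taken by value because std::nth_element reorders its input.
// Assumes values is non-empty.
float window_median(std::vector<float> values) {
    auto mid = values.begin() + values.size() / 2;
    // Partial sort: afterwards *mid is the element that a full sort would
    // place at that position. Average cost is linear in the window size.
    std::nth_element(values.begin(), mid, values.end());
    return *mid;  // upper median when the count is even
}
```

With a 50 x 50 window each call selects from up to 2500 values, once per output pixel, so any further speed-up would most likely have to come from reusing work between overlapping windows rather than from the traversal order alone.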