@ellishg
Last active June 14, 2021 16:51
Measure Size and Runtime Performance of MIP

Machine IR Profile (MIP)

This document goes over the steps to measure the size and runtime performance of MIP-instrumented binaries using llvm-test-suite. For details on MIP, please see the patches at https://reviews.llvm.org/D104060.

llvm-project

Check out the MIP patches in https://github.com/ellishg/llvm-project. To measure binary size fairly in MIP, we first need to extract the __llvm_mipdata section. Since there isn't an obvious way to extract this section in llvm-test-suite, apply ExtractMapSection.patch to llvm-project. Now we can build normally.

ninja clang compiler-rt

llvm-test-suite

Check out the patches in https://github.com/ellishg/llvm-test-suite, which include flags to enable MIP and also a patch I needed to build locally. Then configure a build without MIP, one with MIP, and one with SanitizerCoverage.

# At build/
cmake -GNinja -DCMAKE_C_COMPILER=/path/to/llvm-project/build/bin/clang -C../cmake/caches/Oz.cmake ..
# At build-mip/
cmake -GNinja -DCMAKE_C_COMPILER=/path/to/llvm-project/build/bin/clang -DTEST_SUITE_MACHINE_PROFILE_GENERATE=On -DTEST_SUITE_MACHINE_FUNCTION_COVERAGE=On -DTEST_SUITE_MACHINE_PROFILE_RUNTIME=Off -C../cmake/caches/Oz.cmake ..
# At build-sancov/
cmake -GNinja -DCMAKE_C_COMPILER=/path/to/llvm-project/build/bin/clang -DTEST_SUITE_SANCOV=On -C../cmake/caches/Oz.cmake ..
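
The three configurations share the compiler and the Oz cache file and differ only in their -D options. A minimal Python sketch that assembles the same three command lines is shown here; the helper name cmake_command and the VARIANTS table are hypothetical, and the clang path is the placeholder used above.

```python
# Hypothetical helper: rebuilds the cmake invocations listed above.
CLANG = "/path/to/llvm-project/build/bin/clang"  # placeholder path from the text

# Build directory -> extra -D options (copied from the commands above).
VARIANTS = {
    "build": [],
    "build-mip": [
        "-DTEST_SUITE_MACHINE_PROFILE_GENERATE=On",
        "-DTEST_SUITE_MACHINE_FUNCTION_COVERAGE=On",
        "-DTEST_SUITE_MACHINE_PROFILE_RUNTIME=Off",
    ],
    "build-sancov": ["-DTEST_SUITE_SANCOV=On"],
}


def cmake_command(build_dir):
    """Return the cmake argument list for one build directory."""
    return (
        ["cmake", "-GNinja", f"-DCMAKE_C_COMPILER={CLANG}"]
        + VARIANTS[build_dir]
        + ["-C../cmake/caches/Oz.cmake", ".."]
    )


# Print the commands in the same "# At <dir>/" layout used above.
for build_dir in VARIANTS:
    print(f"# At {build_dir}/")
    print(" ".join(cmake_command(build_dir)))
```

Keeping the variants in one table makes it harder for the baseline and instrumented builds to drift apart in anything other than the instrumentation flags.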

Now we run the tests.

# At build/, build-mip/, and build-sancov/
ninja
llvm-lit -v -j 1 -o results.json MultiSource/Benchmarks/

And finally we can view the results.

python3.6 utils/compare.py build/results.json 'vs' build-mip/results.json -m size
python3.6 utils/compare.py build/results.json 'vs' build-mip/results.json -m exec_time --filter-short
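
For a rough idea of what such a comparison involves, here is a small Python sketch that reads two results.json files and ranks tests by relative change. It is not compare.py itself. It assumes the lit JSON layout of a top-level "tests" list whose entries carry "name" and "metrics"; the function names are hypothetical.

```python
import json
import sys


def load_metric(path, metric):
    """Map test name -> metric value from a lit JSON results file (assumed layout)."""
    with open(path) as f:
        results = json.load(f)
    return {
        t["name"]: t["metrics"][metric]
        for t in results.get("tests", [])
        if metric in t.get("metrics", {})
    }


def relative_diffs(lhs, rhs):
    """Return {name: rhs/lhs - 1} for tests present on both sides."""
    return {
        name: rhs[name] / lhs[name] - 1.0
        for name in lhs
        if name in rhs and lhs[name] != 0
    }


def main(argv):
    base_path, other_path, metric = argv
    diffs = relative_diffs(load_metric(base_path, metric),
                           load_metric(other_path, metric))
    # Largest increases first, 15 rows, similar to the tables in this document.
    ranked = sorted(diffs.items(), key=lambda kv: kv[1], reverse=True)
    for name, diff in ranked[:15]:
        print(f"{name:60s} {diff:7.1%}")


if __name__ == "__main__" and len(sys.argv) == 4:
    main(sys.argv[1:])
```

Saved under a name of your choice, it would be called with the two result files and a metric name, for example build/results.json build-mip/results.json size. Tests missing from either file are skipped rather than reported.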

ExtractMapSection.patch

diff --git a/llvm/lib/CodeGen/MIPSectionEmitter.cpp b/llvm/lib/CodeGen/MIPSectionEmitter.cpp
index 715b595ec1eb..865fb18c75b4 100644
--- a/llvm/lib/CodeGen/MIPSectionEmitter.cpp
+++ b/llvm/lib/CodeGen/MIPSectionEmitter.cpp
@@ -328,6 +328,8 @@ void MIPSectionEmitter::serializeToMIPRawSection() {
 }

 void MIPSectionEmitter::serializeToMIPMapSection() {
+  // Do not create the `__llvm_mipmap` section when measuring size and runtime performance.
+  return;
   if (!MIRInstrumentation::EnableMachineInstrumentation)
     return;

Base vs MIP

-fmachine-profile-generate -fno-machine-profile-runtime -fmachine-profile-function-coverage

Size

Program                                        lhs      rhs      diff
 test-suite...rks/tramp3d-v4/tramp3d-v4.test    1272304  1999496 57.2%
 test-suite...marks/7zip/7zip-benchmark.test    1126464  1441656 28.0%
 test-suite.../Benchmarks/Ptrdist/ft/ft.test    23488    29080   23.8%
 test-suite...yApps-C++/PENNANT/PENNANT.test    117032   143776  22.9%
 test-suite...nchmarks/McCat/09-vor/vor.test    28208    34344   21.8%
 test-suite...oxyApps-C++/miniFE/miniFE.test    116160   141320  21.7%
 test-suite...ProxyApps-C++/CLAMR/CLAMR.test    635600   762456  20.0%
 test-suite...chmarks/MallocBench/gs/gs.test    213248   252680  18.5%
 test-suite...-flt/LinearDependence-flt.test    31432    36488   16.1%
 test-suite...oops-dbl/ControlLoops-dbl.test    31384    36392   16.0%
 test-suite...pansion-dbl/Expansion-dbl.test    31368    36360   15.9%
 test-suite.../Benchmarks/Bullet/bullet.test    1653984  1909600 15.5%
 test-suite...lowfish/security-blowfish.test    36632    41264   12.6%
 test-suite...CI_Purple/SMG2000/smg2000.test    203416   228456  12.3%
 test-suite...s-C/unix-smail/unix-smail.test    50200    56144   11.8%
 Geomean difference                                               4.8%
                lhs           rhs        diff
count  1.740000e+02  1.740000e+02  174.000000
mean   7.232818e+04  8.306837e+04  0.049142
std    1.875064e+05  2.427410e+05  0.062169
min    1.791200e+04  1.820800e+04  0.005310
25%    2.277800e+04  2.360800e+04  0.019100
50%    3.113600e+04  3.181600e+04  0.028891
75%    4.415400e+04  4.599800e+04  0.049896
max    1.653984e+06  1.999496e+06  0.571555
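
The diff column is consistent with rhs/lhs - 1, which can be checked against the first rows of the table above. The sketch below does that arithmetic. Treating "Geomean difference" as the geometric mean of the rhs/lhs ratios minus one is an assumption about compare.py, not something its output states. Only 15 of the 174 tests are listed, so a geomean over the listed rows will not reproduce the 4.8% reported for the full set.

```python
import math

# (lhs, rhs) sizes copied from the first three rows of the table above.
ROWS = {
    "tramp3d-v4": (1272304, 1999496),
    "7zip-benchmark": (1126464, 1441656),
    "ft": (23488, 29080),
}


def rel_diff(lhs, rhs):
    """Relative change of rhs with respect to lhs."""
    return rhs / lhs - 1.0


def geomean_diff(pairs):
    """Geometric mean of rhs/lhs ratios, minus one (assumed definition)."""
    logs = [math.log(rhs / lhs) for lhs, rhs in pairs]
    return math.exp(sum(logs) / len(logs)) - 1.0


for name, (lhs, rhs) in ROWS.items():
    print(f"{name:16s} {rel_diff(lhs, rhs):6.1%}")
# Prints 57.2%, 28.0% and 23.8%, matching the diff column.
```

The same rel_diff value for tramp3d-v4, about 0.571555, also appears as the max entry in the summary rows, which suggests the summary's diff column is a fraction rather than a percentage.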

Execution Time

Program                                        lhs    rhs    diff
 test-suite...ks/BitBench/five11/five11.test     1.23   1.37 11.7%
 test-suite.../Benchmarks/Bullet/bullet.test     6.32   6.82  7.8%
 test-suite...s-C/Pathfinder/PathFinder.test     2.56   2.73  6.8%
 test-suite...marks/7zip/7zip-benchmark.test     7.23   7.69  6.4%
 test-suite.../Benchmarks/Ptrdist/ft/ft.test     1.07   1.14  6.3%
 test-suite...ce/Benchmarks/Olden/bh/bh.test     1.07   1.13  5.9%
 test-suite...s/ASC_Sequoia/IRSmk/IRSmk.test     2.67   2.83  5.9%
 test-suite...stones-3.1/fhourstones3.1.test     1.05   1.11  5.4%
 test-suite...oxyApps-C/miniGMG/miniGMG.test     0.68   0.72  5.4%
 test-suite...pps-C/SimpleMOC/SimpleMOC.test     1.83   1.93  5.2%
 test-suite...s/Fhourstones/fhourstones.test     0.74   0.78  4.7%
 test-suite...ProxyApps-C++/CLAMR/CLAMR.test     1.83   1.91  4.2%
 test-suite.../Trimaran/enc-pc1/enc-pc1.test     0.62   0.64  4.1%
 test-suite...ing-flt/Equivalencing-flt.test     1.33   1.39  3.9%
 test-suite...mbolics-dbl/Symbolics-dbl.test     2.89   3.00  3.7%
 Geomean difference                                           nan%
             lhs        rhs       diff
count  86.000000  85.000000  85.000000
mean   3.871901   3.843920  -0.008880
std    6.949257   6.679763   0.049787
min    0.615100   0.640100  -0.245520
25%    1.109200   1.369400  -0.035905
50%    2.731500   2.806400  -0.015810
75%    3.559450   3.604500   0.026691
max    51.471100  49.254200  0.116784

MIP vs SanitizerCoverage

-fsanitize-coverage=func,inline-bool-flag,pc-table

Size

Program                                        lhs    rhs      diff
 test-suite...hmarks/Prolangs-C++/NP/np.test    18584  3405176 18223.2%
 test-suite...rks/Olden/treeadd/treeadd.test    18208  3310592 18082.1%
 test-suite...rolangs-C++/primes/primes.test    18752  3405304 18059.7%
 test-suite...nia/pathfinder/pathfinder.test    18304  3310624 17986.9%
 test-suite...hmarks/VersaBench/bmm/bmm.test    18368  3310656 17924.0%
 test-suite.../Prolangs-C++/vcirc/vcirc.test    19016  3405432 17808.2%
 test-suite.../Trimaran/enc-rc4/enc-rc4.test    18656  3311096 17648.2%
 test-suite...ks/VersaBench/8b10b/8b10b.test    19136  3311416 17204.6%
 test-suite...rks/FreeBench/mason/mason.test    22184  3314664 14841.7%
 test-suite...nchmarks/llubenchmark/llu.test    22272  3310624 14764.5%
 test-suite...comm-CRC32/telecomm-CRC32.test    22344  3314768 14735.2%
 test-suite...adpcm/rawcaudio/rawcaudio.test    22384  3310720 14690.6%
 test-suite...adpcm/rawdaudio/rawdaudio.test    22384  3310720 14690.6%
 test-suite...comm-adpcm/telecomm-adpcm.test    22384  3310720 14690.6%
 test-suite...chmarks/Rodinia/srad/srad.test    22416  3314760 14687.5%
 Geomean difference                                            8379.7%
                lhs           rhs        diff
count  1.740000e+02  1.740000e+02  174.000000
mean   8.306837e+04  3.382465e+06  101.201312
std    2.427410e+05  2.153891e+05  44.367202
min    1.820800e+04  3.310592e+06  1.389606
25%    2.360800e+04  3.315804e+06  71.566076
50%    3.181600e+04  3.324672e+06  103.471712
75%    4.599800e+04  3.358998e+06  141.420633
max    1.999496e+06  5.133680e+06  182.231597
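
With lhs values this small, the percentages are driven almost entirely by the absolute increase. A short Python sketch that subtracts lhs from rhs for a few rows copied from the table above is given here. It only restates the table and makes no claim about where the extra size comes from; the short row labels are abbreviations chosen for this example.

```python
# (lhs, rhs) sizes copied from rows of the table above.
ROWS = {
    "np": (18584, 3405176),
    "treeadd": (18208, 3310592),
    "pathfinder": (18304, 3310624),
    "mason": (22184, 3314664),
}


def absolute_increase(lhs, rhs):
    """Size added on the rhs side, in the same unit as the table."""
    return rhs - lhs


for name, (lhs, rhs) in ROWS.items():
    print(f"{name:12s} +{absolute_increase(lhs, rhs)}")
```

For these rows the increase falls between roughly 3.29 and 3.39 million, regardless of how large the MIP binary was to begin with.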

Execution Time

Program                                        lhs    rhs    diff
 test-suite...rks/tramp3d-v4/tramp3d-v4.test     0.91   1.06 16.5%
 test-suite...ks/Prolangs-C++/life/life.test     0.95   1.08 13.5%
 test-suite...ks/BitBench/five11/five11.test     1.37   1.48  8.3%
 test-suite.../Benchmarks/Ptrdist/ks/ks.test     0.66   0.72  8.3%
 test-suite.../Trimaran/enc-rc4/enc-rc4.test     0.84   0.91  7.6%
 test-suite...s/ASC_Sequoia/IRSmk/IRSmk.test     2.83   3.02  6.9%
 test-suite...lFlow-dbl/ControlFlow-dbl.test     3.08   3.28  6.4%
 test-suite...quoia/CrystalMk/CrystalMk.test     4.44   4.72  6.4%
 test-suite...arching-flt/Searching-flt.test     3.14   3.32  5.8%
 test-suite...flt/LoopRestructuring-flt.test     4.07   4.29  5.3%
 test-suite...CI_Purple/SMG2000/smg2000.test     1.57   1.65  4.9%
 test-suite...arks/VersaBench/dbms/dbms.test     0.96   1.01  4.9%
 test-suite.../Trimaran/enc-md5/enc-md5.test     1.40   1.46  4.5%
 test-suite...-flt/LinearDependence-flt.test     2.65   2.76  4.3%
 test-suite...-dbl/LinearDependence-dbl.test     3.36   3.50  4.3%
 Geomean difference                                           0.8%
             lhs        rhs       diff
count  85.000000  85.000000  85.000000
mean   3.843920   3.870114   0.008829
std    6.679763   6.729640   0.045385
min    0.640100   0.646800  -0.170256
25%    1.369400   1.282500  -0.015408
50%    2.806400   2.771800   0.005737
75%    3.604500   3.586900   0.035476
max    49.254200  49.413000  0.164675

ellishg commented Jun 11, 2021

MIP vs -fcs-profile-generate

-fcs-profile-generate -mllvm -disable-vp=true

Size

Program                                        lhs    rhs     diff
 test-suite...rks/Olden/treeadd/treeadd.test    18208  227504 1149.5%
 test-suite...nia/pathfinder/pathfinder.test    18304  227720 1144.1%
 test-suite...hmarks/VersaBench/bmm/bmm.test    18368  227872 1140.6%
 test-suite...hmarks/Prolangs-C++/NP/np.test    18584  228024 1127.0%
 test-suite.../Trimaran/enc-rc4/enc-rc4.test    18656  227968 1122.0%
 test-suite...rolangs-C++/primes/primes.test    18752  228192 1116.9%
 test-suite.../Prolangs-C++/vcirc/vcirc.test    19016  228592 1102.1%
 test-suite...ks/VersaBench/8b10b/8b10b.test    19136  228344 1093.3%
 test-suite...rks/FreeBench/mason/mason.test    22184  231688 944.4%
 test-suite...nchmarks/llubenchmark/llu.test    22272  231728 940.4%
 test-suite...comm-CRC32/telecomm-CRC32.test    22344  231560 936.3%
 test-suite...adpcm/rawcaudio/rawcaudio.test    22384  231800 935.6%
 test-suite...comm-adpcm/telecomm-adpcm.test    22384  231800 935.6%
 test-suite...adpcm/rawdaudio/rawdaudio.test    22384  231800 935.6%
 test-suite...chmarks/Rodinia/srad/srad.test    22416  231888 934.5%
 Geomean difference                                           581.6%
                lhs           rhs        diff
count  1.740000e+02  1.740000e+02  174.000000
mean   8.306837e+04  3.127011e+05  6.492764
std    2.427410e+05  3.395851e+05  2.725244
min    1.820800e+04  2.275040e+05  0.310679
25%    2.360800e+04  2.331420e+05  4.627070
50%    3.181600e+04  2.430160e+05  6.659682
75%    4.599800e+04  2.599580e+05  8.878372
max    1.999496e+06  3.327304e+06  11.494728

Execution Time

Program                                        lhs    rhs    diff
 test-suite...ing-flt/Equivalencing-flt.test     1.39   5.69 310.5%
 test-suite...enchmarks/Olden/em3d/em3d.test     2.13   8.27 287.8%
 test-suite...ing-dbl/Equivalencing-dbl.test     1.83   5.46 197.6%
 test-suite...arching-flt/Searching-flt.test     3.14   8.53 172.2%
 test-suite...arching-dbl/Searching-dbl.test     3.17   8.51 168.7%
 test-suite...mbolics-flt/Symbolics-flt.test     2.07   5.45 163.4%
 test-suite...C/Packing-flt/Packing-flt.test     3.24   7.36 126.9%
 test-suite.../Benchmarks/Ptrdist/ks/ks.test     0.66   1.46 120.9%
 test-suite...C/Packing-dbl/Packing-dbl.test     3.57   7.65 114.3%
 test-suite...ProxyApps-C++/HPCCG/HPCCG.test     0.64   1.29 101.7%
 test-suite...marks/Ptrdist/yacr2/yacr2.test     0.72   1.39 91.6%
 test-suite...mbolics-dbl/Symbolics-dbl.test     3.00   5.53 84.7%
 test-suite...s/ASC_Sequoia/AMGmk/AMGmk.test     5.57  10.08 81.1%
 test-suite...lFlow-flt/ControlFlow-flt.test     2.60   4.65 79.0%
 test-suite.../Trimaran/enc-rc4/enc-rc4.test     0.84   1.50 78.3%
 Geomean difference                                           nan%
             lhs        rhs       diff
count  85.000000  88.000000  85.000000
mean   3.843920   5.324403   0.391462
std    6.679763   10.985809  0.597457
min    0.640100   0.633600  -0.130966
25%    1.369400   1.394525   0.064466
50%    2.806400   3.348350   0.145232
75%    3.604500   5.451200   0.333469
max    49.254200  86.189600  3.104777

ellishg commented Jun 12, 2021

To make the results a little clearer, I filtered for only the tests in MultiSource/Benchmarks/FreeBench/ and show only the size differences of the .text section. Comparing the .data section wouldn't be fair, since MIP injects data into the __llvm_mipraw section rather than the .data section.
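
A minimal Python sketch of that filtering step on an already loaded results file is given here. The metric name size..text and the JSON layout ("tests" entries with "name" and "metrics") are assumptions about what llvm-test-suite writes, so adjust them to whatever your results.json contains. The function name is hypothetical.

```python
import json


def text_sizes(results,
               subdir="MultiSource/Benchmarks/FreeBench/",
               metric="size..text"):
    """Return {test name: metric value} for tests whose name contains subdir."""
    return {
        t["name"]: t["metrics"][metric]
        for t in results.get("tests", [])
        if subdir in t["name"] and metric in t.get("metrics", {})
    }


def load_results(path):
    """Load a lit JSON results file from disk."""
    with open(path) as f:
        return json.load(f)
```

Calling text_sizes(load_results("build/results.json")) and the same for build-mip/results.json would give two dictionaries that can be compared test by test.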

Base vs MIP

Text Size

Program             lhs    rhs    diff
 fourinarow          4629   4741   2.4%
 analyzer            6053   6165   1.9%
 pifft               33093  33621  1.6%
 pcompress2          4020   4084   1.6%
 neural              3381   3429   1.4%
 mason               2085   2101   0.8%
 distray             4325   4357   0.7%

Base vs -fcs-profile-generate

Text Size

Program             lhs    rhs    diff
 mason               2085   19346 827.9%
 neural              3381   20994 520.9%
 pcompress2          4020   21556 436.2%
 fourinarow          4629   24322 425.4%
 distray             4325   21490 396.9%
 analyzer            6053   23762 292.6%
 pifft               33093  52866 59.7%

ellishg commented Jun 13, 2021

Base vs XRay

https://www.llvm.org/docs/XRay.html#minimizing-binary-size

-fxray-instrument -fxray-instrumentation-bundle=function-entry

Text Size

Program             lhs    rhs     diff
 mason               2085   152098 7194.9%
 neural              3381   153426 4437.9%
 pcompress2          4020   154068 3732.5%
 distray             4325   154338 3468.5%
 fourinarow          4629   154802 3244.2%
 analyzer            6053   156178 2480.2%
 pifft               33093  183682 455.0%
