Last active
August 8, 2019 20:37
How to get fp events and fp ops per second in skylake?
Followed this tutorial:
http://www.bnikolic.co.uk/blog/hpc-howto-measure-flops.html

Get the latest version of perfmon2/libpfm (h/t this developerworks article):
git clone git://perfmon2.git.sourceforge.net/gitroot/perfmon2/libpfm4
cd libpfm4
make

Run the showevtinfo program (in the examples subdirectory) to get a list of all available events and the masks and modifiers each one supports.

Decide which events, masks and modifiers you want to use. Masks are prefixed by Umask and are given as hexadecimal numbers, with symbolic names in square brackets. Modifiers are prefixed by Modif, and their names are also in square brackets.

Use the check_events program (also in the examples subdirectory) to convert the event, umask and modifiers into a raw code. Run it as:
check_events <event name>:<umask>[(:modifiers)*]
That is, you supply the event name, the umask and any number of modifiers, all separated by colons. The program then prints, among other things, a raw event specification, for example:
Codes : 0x531003
This hexadecimal code can be passed to the GNU/Linux perf tools, for example to perf stat with the option -e r531003.
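
To see what a raw code such as 0x531003 actually encodes, it can be split into its bit fields. The sketch below assumes the code follows the layout of Intel's IA32_PERFEVTSELx register (event select in the low byte, umask in the next byte, flag bits above that); the helper name decode_raw is made up for illustration and is not part of libpfm4 or perf.

```python
# Hypothetical helper (not part of libpfm4 or perf): split a raw code into
# the fields of Intel's IA32_PERFEVTSELx layout, which is an assumption
# about how check_events packs its "Codes" value.

def decode_raw(code: int) -> dict:
    """Return the event-select fields packed into a raw event code."""
    return {
        "event": code & 0xFF,              # bits 0-7: event select
        "umask": (code >> 8) & 0xFF,       # bits 8-15: unit mask
        "usr": bool(code & (1 << 16)),     # count in user mode (u=1)
        "os": bool(code & (1 << 17)),      # count in kernel mode (k=1)
        "edge": bool(code & (1 << 18)),    # edge detect (e)
        "int": bool(code & (1 << 20)),     # interrupt on overflow
        "enable": bool(code & (1 << 22)),  # counter enable
        "invert": bool(code & (1 << 23)),  # invert counter-mask test (i)
        "cmask": (code >> 24) & 0xFF,      # bits 24-31: counter mask (c)
    }


if __name__ == "__main__":
    fields = decode_raw(0x531003)
    print(f"event=0x{fields['event']:02x} umask=0x{fields['umask']:02x}")
```

Under that assumption, 0x531003 is event select 0x03 with umask 0x10, counted in both user and kernel mode.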

FP events to profile:
FP_ARITH:SCALAR_DOUBLE
FP_ARITH:SCALAR_SINGLE
FP_ARITH:128B_PACKED_DOUBLE
FP_ARITH:128B_PACKED_SINGLE
FP_ARITH:256B_PACKED_DOUBLE
FP_ARITH:256B_PACKED_SINGLE
FP_ARITH:512B_PACKED_DOUBLE
FP_ARITH:512B_PACKED_SINGLE

Getting the codes:
./check_events FP_ARITH:SCALAR_DOUBLE FP_ARITH:SCALAR_SINGLE FP_ARITH:128B_PACKED_DOUBLE FP_ARITH:128B_PACKED_SINGLE FP_ARITH:256B_PACKED_DOUBLE FP_ARITH:256B_PACKED_SINGLE FP_ARITH:512B_PACKED_DOUBLE FP_ARITH:512B_PACKED_SINGLE | grep Codes

Full output, without the grep filter:
Requested Event: FP_ARITH:SCALAR_DOUBLE
Actual Event: skl::FP_ARITH_INST_RETIRED:SCALAR_DOUBLE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0
PMU : Intel Skylake
IDX : 419430470
Codes : 0x5301c7

Requested Event: FP_ARITH:SCALAR_SINGLE
Actual Event: skl::FP_ARITH_INST_RETIRED:SCALAR_SINGLE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0
PMU : Intel Skylake
IDX : 419430470
Codes : 0x5302c7

Requested Event: FP_ARITH:128B_PACKED_DOUBLE
Actual Event: skl::FP_ARITH_INST_RETIRED:128B_PACKED_DOUBLE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0
PMU : Intel Skylake
IDX : 419430470
Codes : 0x5304c7

Requested Event: FP_ARITH:128B_PACKED_SINGLE
Actual Event: skl::FP_ARITH_INST_RETIRED:128B_PACKED_SINGLE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0
PMU : Intel Skylake
IDX : 419430470
Codes : 0x5308c7

Requested Event: FP_ARITH:256B_PACKED_DOUBLE
Actual Event: skl::FP_ARITH_INST_RETIRED:256B_PACKED_DOUBLE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0
PMU : Intel Skylake
IDX : 419430470
Codes : 0x5310c7

Requested Event: FP_ARITH:256B_PACKED_SINGLE
Actual Event: skl::FP_ARITH_INST_RETIRED:256B_PACKED_SINGLE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0
PMU : Intel Skylake
IDX : 419430470
Codes : 0x5320c7

Requested Event: FP_ARITH:512B_PACKED_DOUBLE
Actual Event: skl::FP_ARITH_INST_RETIRED:512B_PACKED_DOUBLE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0
PMU : Intel Skylake
IDX : 419430470
Codes : 0x5340c7

Requested Event: FP_ARITH:512B_PACKED_SINGLE
Actual Event: skl::FP_ARITH_INST_RETIRED:512B_PACKED_SINGLE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0
PMU : Intel Skylake
IDX : 419430470
Codes : 0x5380c7
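
The eight codes listed here differ only in the umask byte, and each umask is a single bit. A minimal sketch that rebuilds the codes from that pattern and assembles the perf stat arguments follows; the constants are read off the Codes lines, the function names are illustrative, and ./main stands in for the program being profiled.

```python
# Assumption: all FP_ARITH_INST_RETIRED codes share event select 0xc7 and
# flag byte 0x53, as the Codes lines suggest; only the umask changes.
EVENT_SELECT = 0xC7
FLAGS = 0x53

UMASKS = {
    "SCALAR_DOUBLE": 0x01,
    "SCALAR_SINGLE": 0x02,
    "128B_PACKED_DOUBLE": 0x04,
    "128B_PACKED_SINGLE": 0x08,
    "256B_PACKED_DOUBLE": 0x10,
    "256B_PACKED_SINGLE": 0x20,
    "512B_PACKED_DOUBLE": 0x40,
    "512B_PACKED_SINGLE": 0x80,
}


def raw_code(umask: int) -> int:
    """Pack flags, umask and event select into one raw code."""
    return (FLAGS << 16) | (umask << 8) | EVENT_SELECT


def perf_args() -> list:
    """Build the '-e rXXXXXX' argument pairs for perf stat."""
    args = []
    for umask in UMASKS.values():
        args += ["-e", f"r{raw_code(umask):x}"]
    return args


if __name__ == "__main__":
    print("perf stat " + " ".join(perf_args()) + " -- ./main")
```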

perf stat -e r5301c7 -e r5302c7 -e r5304c7 -e r5308c7 -e r5310c7 -e r5320c7 -e r5340c7 -e r5380c7 -- ./main

output:
Performance counter stats for './main':

 202502044 r5301c7 (49.86%)
       749 r5302c7 (49.94%)
         2 r5304c7 (50.01%)
         2 r5308c7 (50.09%)
         0 r5310c7 (50.14%)
         0 r5320c7 (50.06%)
         0 r5340c7 (49.99%)
         0 r5380c7 (49.91%)

5.268380671 seconds time elapsed
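
The raw codes in this output can be mapped back to event names with a few lines of Python. This is only a sketch: the name table is copied from the check_events results, the regular expression is an assumption about the perf stat line format seen in this run, and parse_counts is a made-up helper name. The percentages in parentheses show that each event was counted for only about half of the run, so perf reports scaled estimates rather than exact counts.

```python
import re

# Raw code -> event name, copied from the check_events results.
NAMES = {
    "r5301c7": "FP_ARITH:SCALAR_DOUBLE",
    "r5302c7": "FP_ARITH:SCALAR_SINGLE",
    "r5304c7": "FP_ARITH:128B_PACKED_DOUBLE",
    "r5308c7": "FP_ARITH:128B_PACKED_SINGLE",
    "r5310c7": "FP_ARITH:256B_PACKED_DOUBLE",
    "r5320c7": "FP_ARITH:256B_PACKED_SINGLE",
    "r5340c7": "FP_ARITH:512B_PACKED_DOUBLE",
    "r5380c7": "FP_ARITH:512B_PACKED_SINGLE",
}

# Matches lines such as "202502044 r5301c7 (49.86%)".
LINE = re.compile(r"^\s*([\d,]+)\s+(r[0-9a-f]+)\b")


def parse_counts(text: str) -> dict:
    """Return {event name: count} for every raw-event line in text."""
    counts = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            value = int(match.group(1).replace(",", ""))
            code = match.group(2)
            counts[NAMES.get(code, code)] = value
    return counts


SAMPLE = """\
Performance counter stats for './main':
202502044 r5301c7 (49.86%)
749 r5302c7 (49.94%)
2 r5304c7 (50.01%)
2 r5308c7 (50.09%)
0 r5310c7 (50.14%)
0 r5320c7 (50.06%)
0 r5340c7 (49.99%)
0 r5380c7 (49.91%)
5.268380671 seconds time elapsed
"""

counts = parse_counts(SAMPLE)
for name, value in counts.items():
    print(f"{value:>12} {name}")
print(f"{sum(counts.values()):,} total FP operation events")
```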

modified output (raw codes replaced by event names):
Performance counter stats for './main':

 202502044 FP_ARITH:SCALAR_DOUBLE
       749 FP_ARITH:SCALAR_SINGLE
         2 FP_ARITH:128B_PACKED_DOUBLE
         2 FP_ARITH:128B_PACKED_SINGLE
         0 FP_ARITH:256B_PACKED_DOUBLE
         0 FP_ARITH:256B_PACKED_SINGLE
         0 FP_ARITH:512B_PACKED_DOUBLE
         0 FP_ARITH:512B_PACKED_SINGLE

202,502,797 total FP operation events

5.268380671 seconds time elapsed
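
The event total is not yet a count of FP operations, because a packed instruction works on several elements at once. A rough sketch of the conversion to FP ops per second follows. It assumes each counted instruction performs one operation per vector lane (vector width divided by element width) and that the elapsed wall-clock time is an acceptable denominator; the names LANES and total_flops are illustrative. As far as I understand Intel's event description, FMA instructions are already counted twice by these events, so no extra factor is applied for them.

```python
# Lanes per counted instruction: vector width / element width.
# Assumption: one FP operation per lane per counted instruction.
LANES = {
    "SCALAR_DOUBLE": 1,
    "SCALAR_SINGLE": 1,
    "128B_PACKED_DOUBLE": 2,
    "128B_PACKED_SINGLE": 4,
    "256B_PACKED_DOUBLE": 4,
    "256B_PACKED_SINGLE": 8,
    "512B_PACKED_DOUBLE": 8,
    "512B_PACKED_SINGLE": 16,
}

# Counts and elapsed time taken from the perf stat run on ./main.
COUNTS = {
    "SCALAR_DOUBLE": 202502044,
    "SCALAR_SINGLE": 749,
    "128B_PACKED_DOUBLE": 2,
    "128B_PACKED_SINGLE": 2,
    "256B_PACKED_DOUBLE": 0,
    "256B_PACKED_SINGLE": 0,
    "512B_PACKED_DOUBLE": 0,
    "512B_PACKED_SINGLE": 0,
}
ELAPSED_SECONDS = 5.268380671


def total_flops(counts: dict) -> int:
    """Weight each event count by the number of lanes it operates on."""
    return sum(LANES[name] * count for name, count in counts.items())


flops = total_flops(COUNTS)
rate = flops / ELAPSED_SECONDS
print(f"{flops:,} FP operations, {rate / 1e6:.2f} MFLOP/s")
```

For this run the estimate comes to 202,502,805 FP operations, or roughly 38.4 MFLOP/s. Since the counters were multiplexed and the elapsed time covers the whole program, treat the figure as approximate.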