@agostini01
Last active August 8, 2019 20:37
How to get FP events and FP ops per second on Skylake?
Followed this tutorial:
http://www.bnikolic.co.uk/blog/hpc-howto-measure-flops.html
Get the latest version of perfmon2/libpfm (h/t this developerworks article):
git clone git://perfmon2.git.sourceforge.net/gitroot/perfmon2/libpfm4
cd libpfm4
make
Run the showevtinfo program (in the examples subdirectory) to list all available events, along with the masks and modifiers each one supports.
Decide which events, masks and modifiers you want to use. Masks are prefixed by Umask and are given as hexadecimal numbers, with symbolic names in square brackets. Modifiers are prefixed by Modif, and their names are also in square brackets.
Use the check_events program (also in examples sub-directory) to convert the event, umask and modifiers into a raw code. You can do this by running the command as:
check_events <event name>:<umask>[(:modifiers)*]
i.e., you supply the event name, the umask and any number of modifiers, all separated by the colon character. The program will then print out, amongst other things, a raw event specification, for example:
Codes : 0x531003
This hexadecimal code can be passed as a parameter to the GNU/Linux perf tools, for example to perf stat via the option -e r531003
FP events to profile:
FP_ARITH:SCALAR_DOUBLE
FP_ARITH:SCALAR_SINGLE
FP_ARITH:128B_PACKED_DOUBLE
FP_ARITH:128B_PACKED_SINGLE
FP_ARITH:256B_PACKED_DOUBLE
FP_ARITH:256B_PACKED_SINGLE
FP_ARITH:512B_PACKED_DOUBLE
FP_ARITH:512B_PACKED_SINGLE
Getting the codes
./check_events FP_ARITH:SCALAR_DOUBLE FP_ARITH:SCALAR_SINGLE FP_ARITH:128B_PACKED_DOUBLE FP_ARITH:128B_PACKED_SINGLE FP_ARITH:256B_PACKED_DOUBLE FP_ARITH:256B_PACKED_SINGLE FP_ARITH:512B_PACKED_DOUBLE FP_ARITH:512B_PACKED_SINGLE | grep Codes
Requested Event: FP_ARITH:SCALAR_DOUBLE
Actual Event: skl::FP_ARITH_INST_RETIRED:SCALAR_DOUBLE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0
PMU : Intel Skylake
IDX : 419430470
Codes : 0x5301c7
Requested Event: FP_ARITH:SCALAR_SINGLE
Actual Event: skl::FP_ARITH_INST_RETIRED:SCALAR_SINGLE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0
PMU : Intel Skylake
IDX : 419430470
Codes : 0x5302c7
Requested Event: FP_ARITH:128B_PACKED_DOUBLE
Actual Event: skl::FP_ARITH_INST_RETIRED:128B_PACKED_DOUBLE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0
PMU : Intel Skylake
IDX : 419430470
Codes : 0x5304c7
Requested Event: FP_ARITH:128B_PACKED_SINGLE
Actual Event: skl::FP_ARITH_INST_RETIRED:128B_PACKED_SINGLE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0
PMU : Intel Skylake
IDX : 419430470
Codes : 0x5308c7
Requested Event: FP_ARITH:256B_PACKED_DOUBLE
Actual Event: skl::FP_ARITH_INST_RETIRED:256B_PACKED_DOUBLE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0
PMU : Intel Skylake
IDX : 419430470
Codes : 0x5310c7
Requested Event: FP_ARITH:256B_PACKED_SINGLE
Actual Event: skl::FP_ARITH_INST_RETIRED:256B_PACKED_SINGLE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0
PMU : Intel Skylake
IDX : 419430470
Codes : 0x5320c7
Requested Event: FP_ARITH:512B_PACKED_DOUBLE
Actual Event: skl::FP_ARITH_INST_RETIRED:512B_PACKED_DOUBLE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0
PMU : Intel Skylake
IDX : 419430470
Codes : 0x5340c7
Requested Event: FP_ARITH:512B_PACKED_SINGLE
Actual Event: skl::FP_ARITH_INST_RETIRED:512B_PACKED_SINGLE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0
PMU : Intel Skylake
IDX : 419430470
Codes : 0x5380c7
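All eight codes above end in c7 and differ only in the second-lowest byte. A minimal sketch of why, assuming the standard Intel event-select layout (bits 0-7 hold the event select, bits 8-15 hold the umask); the variable names are illustrative only:

```shell
# Split each raw code into its event-select and umask fields.
# Assumption: bits 0-7 are the event select, bits 8-15 are the umask.
for code in 0x5301c7 0x5302c7 0x5304c7 0x5308c7 \
            0x5310c7 0x5320c7 0x5340c7 0x5380c7; do
    event=$(( $code & 0xff ))
    umask=$(( ($code >> 8) & 0xff ))
    printf 'code=%s event=0x%02x umask=0x%02x\n' "$code" "$event" "$umask"
done
```

Under that assumption every line reports event 0xc7 (FP_ARITH_INST_RETIRED), and the umask selects the precision and vector width.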
perf stat -e r5301c7 -e r5302c7 -e r5304c7 -e r5308c7 -e r5310c7 -e r5320c7 -e r5340c7 -e r5380c7 -- ./main
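The -e arguments in the command above are the check_events codes with the 0x prefix replaced by r. A small sketch of that conversion, using two of the Codes lines as sample input (the variable names are hypothetical, and the command is only printed, not run):

```shell
# Sample "Codes" lines as printed by check_events.
check_events_output='Codes : 0x5301c7
Codes : 0x5302c7'

# Turn each "Codes : 0x..." line into a "-e r..." argument for perf stat.
events=$(printf '%s\n' "$check_events_output" |
    awk '/^Codes/ { sub(/^0x/, "", $3); printf " -e r%s", $3 }')

# Print the resulting command line.
echo "perf stat$events -- ./main"
```

In practice the sample string could be replaced by the real output of ./check_events piped through grep Codes.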
output:
Performance counter stats for './main':
202502044 r5301c7 (49.86%)
749 r5302c7 (49.94%)
2 r5304c7 (50.01%)
2 r5308c7 (50.09%)
0 r5310c7 (50.14%)
0 r5320c7 (50.06%)
0 r5340c7 (49.99%)
0 r5380c7 (49.91%)
5.268380671 seconds time elapsed
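The percentages in parentheses show the share of the run during which each event was actually being counted. With eight events requested, perf time-multiplexes them and scales the counts, so the numbers above are estimates. One way to avoid this is to split the events over two runs. The sketch below assumes that four events fit on the hardware counters at once, which is an assumption about this machine and not something the output confirms:

```shell
# Assumption: four events can be counted simultaneously without multiplexing.
group1="r5301c7 r5302c7 r5304c7 r5308c7"
group2="r5310c7 r5320c7 r5340c7 r5380c7"

for group in "$group1" "$group2"; do
    args=""
    for ev in $group; do
        args="$args -e $ev"
    done
    # Print the command for this group; run it by hand to collect the counts.
    echo "perf stat$args -- ./main"
done
```

If the percentages disappear from the perf stat output, the events were counted for the whole run.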
modified output:
Performance counter stats for './main':
202502044 FP_ARITH:SCALAR_DOUBLE
749 FP_ARITH:SCALAR_SINGLE
2 FP_ARITH:128B_PACKED_DOUBLE
2 FP_ARITH:128B_PACKED_SINGLE
0 FP_ARITH:256B_PACKED_DOUBLE
0 FP_ARITH:256B_PACKED_SINGLE
0 FP_ARITH:512B_PACKED_DOUBLE
0 FP_ARITH:512B_PACKED_SINGLE
202,502,797 total FP operation events
5.268380671 seconds time elapsed
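The total above counts events, not floating-point operations: a packed instruction works on several elements at once. A rough sketch of the conversion to FP ops per second, assuming each event is weighted by the number of elements in the vector (1 for scalar, 2/4 for 128-bit double/single, 4/8 for 256-bit, 8/16 for 512-bit). These weights and the variable names are assumptions, not part of the measured output:

```shell
# Counts and elapsed time from the perf stat run above.
scalar_double=202502044; scalar_single=749
p128_double=2; p128_single=2
p256_double=0; p256_single=0
p512_double=0; p512_single=0
elapsed=5.268380671

# Assumed weights: one operation per vector element.
flops=$(( $scalar_double + $scalar_single ))
flops=$(( $flops + 2 * $p128_double + 4 * $p128_single ))
flops=$(( $flops + 4 * $p256_double + 8 * $p256_single ))
flops=$(( $flops + 8 * $p512_double + 16 * $p512_single ))
echo "FP ops: $flops"

# Shell arithmetic is integer-only, so awk does the division.
awk -v f="$flops" -v t="$elapsed" \
    'BEGIN { printf "MFLOP/s: %.2f\n", f / t / 1e6 }'
```

For this run the weighted total is 202502805 operations, roughly 38.4 MFLOP/s over the whole elapsed time, which includes any non-FP work done by ./main.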