Intel(R) VTune(TM) Profiler Self Check Utility
Copyright (C) 2009-2020 Intel Corporation. All rights reserved.
Build Number: 613804
Ignored warnings: ['To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location.', 'To enable hardware event-based sampling, PRODUCT_LEGAL_SHORT_NAME has disabled the NMI watchdog timer. The watchdog timer will be re-enabled after collection completes.']
Check of files: Ok
================================================================================
Context values:
Command line:
/opt/intel/oneapi/vtune/2021.1.1/bin64/amplxe-runss --context-value-list
Stdout:
targetOS: Linux
OS: Linux
OSBuildNumber: 9
OSBitness: 64
RootPrivileges: false
isPtraceScopeLimited: false
isCATSupportedByCPU: true
isL3CATAvailable: true
L3CATDetails: COS=16;ways=11
isL2CATAvailable: false
isL3MonitoringSupportedByCPU: true
LLCSize: 28835840
cacheMonitoringUpscalingFactor: 81920
isL3CacheOccupancyAvailable: true
isL3TotalBWAvailable: true
isL3LocalBWAvailable: true
isTSXAvailable: true
isPTAvailable: true
isHTEnabled: true
fpgaOnBoard: None
omniPathOnBoard: None
genArchOnBoard: 0
pciClassParts:
isSGXAvailable: false
LinuxRelease: 5.2.9-229_fbk15_hardened_4185_g357f49b36602
is3DXPPresent: false
is3DXP2LMMode: false
is3DXPAppDirectMode: false
IsNUMANodeWithoutCPUsPresent: false
Hypervisor: None
PerfmonVersion: 4
isMaxDRAMBandwidthMeasurementSupported: true
tidValuesForIO: 0x1d8;0x1f0;0x1f8
preferedGpuAdapter: none
isPtraceAvailable: true
i915Status: MissingDriver
isFtraceAvailable: ftraceUnknownError
isMdfEtwAvailable: false
isCSwitchAvailable: no
isGpuBusynessAvailable: unsupportedHardware
isGpuWaitAvailable: no
isFunctionTracingAvailable: no
isIowaitTracingAvailable: no
isVSyncAvailable: no
HypervisorType: None
isDeviceOrCredentialGuardEnabled: false
isSEPDriverAvailable: false
isPAXDriverLoaded: false
platformType: 111
CPU_NAME: Intel(R) Xeon(R) Processor code named Skylake
PMU: skylake_server
availablePmuTypes: core,cha,imc,pcu,qpi,r3qpi,ubox,m2pcie,m2m,irp,iio,power,hfi_rxe,hfi_txe
referenceFrequency: 2000000000
isPStateAvailable: true
isVTSSPPDriverAvailable: false
isNMIWatchDogTimerRunning: true
LinuxPerfCredentials: Unlimited
LinuxPerfCapabilities: breakpoint:raw;cpu:raw,format,events,ldlat,frontend;cstate_core:raw,format,events;cstate_pkg:raw,format,events;intel_pt:raw,format;kprobe:raw,format;msr:raw,format,events;power:raw,format,events;software:raw;tracepoint:raw;uncore_cha:20,raw,format;uncore_iio:6,raw,format;uncore_iio_free_running:6,raw,format,events;uncore_imc:6,raw,format,events;uncore_irp:6,raw,format;uncore_m2m:2,raw,format;uncore_m3upi:3,raw,format;uncore_pcu:raw,format;uncore_ubox:raw,format;uncore_upi:3,raw,format;uprobe:raw,format
LinuxPerfStackCapabilities: fp,dwarf,lbr
areKernelPtrsRestricted: no
LinuxPerfMuxIntervalMs: 1
isAOCLAvailable: false
isTPSSAvailable: true
isPytraceAvailable: true
forceShowInlines: false
isSTTAvailable: no
isNnpiHwTraceToolAvailable: false
isNnpiTraceToolAvailable: false
isEnergyCollectionSupported: true
isSocwatchDriverLoaded: false
isCPUSupportedBySocwatch: true
isIPMWatchReady: false
isNvdimmAvailable: false
isOsCountersCollectorAvailable: true
Getting context values: OK
================================================================================
Check driver:
isSEPDriverAvailable: false
isPAXDriverLoaded: false
Command line:
lsmod
Is SEP in lsmod: False
The SEP driver is not available.
================================================================================
SEP version:
Command line:
/opt/intel/oneapi/vtune/2021.1.1/bin64/sep -version
Stdout:
Sampling Enabling Product Version: 5.22 Beta built on Nov 10 2020 18:15:43
SEP Driver Version: PAX Driver Version: Platform type: 111
CPU name: Intel(R) Xeon(R) Processor code named Skylake
PMU: skylake_server
Stderr:
Error retrieving SEP driver version
Error retrieving PAX driver version
Check driver with sep -version: Fail
================================================================================
Running collection...
Command line:
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -collect performance-snapshot -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ps -data-limit 0 -finalization-mode none -source-search-dir /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/src -- /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/matrix
Stdout:
Addr of buf1 = 0x7fac0e55c010
Offs of buf1 = 0x7fac0e55c180
Addr of buf2 = 0x7fac0c55b010
Offs of buf2 = 0x7fac0c55b1c0
Addr of buf3 = 0x7fac0a55a010
Offs of buf3 = 0x7fac0a55a100
Addr of buf4 = 0x7fac08559010
Offs of buf4 = 0x7fac08559140
Threads #: 16 Pthreads
Matrix size: 2048
Using multiply kernel: multiply1
Execution time = 3.310 seconds
Stderr:
vtune: Peak bandwidth measurement started.
vtune: Peak bandwidth measurement finished.
vtune: Collection started. To stop the collection, either press CTRL-C or enter from another console window: vtune -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ps -command stop.
vtune: Collection stopped.
vtune: Using result path `/tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ps'
vtune: Executing actions 0 %
vtune: Executing actions 100 %
vtune: Executing actions 100 % done
HW event-based analysis (counting mode) (Perf)
Example of analysis types: Performance Snapshot
Collection: Ok
--------------------------------------------------------------------------------
Running finalization...
Command line:
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -finalize -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ps
Stderr:
vtune: Using result path `/tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ps'
vtune: Executing actions 0 %
vtune: Executing actions 0 % Finalizing results
vtune: Executing actions 0 % Finalizing the result
vtune: Executing actions 0 % Clearing the database
vtune: Executing actions 14 % Clearing the database
vtune: Executing actions 14 % Loading raw data to the database
vtune: Executing actions 14 % Loading 'systemcollector-699845-devbig055.ftw5.fa
vtune: Executing actions 25 % Loading 'systemcollector-699845-devbig055.ftw5.fa
vtune: Executing actions 25 % Loading '699871.stat.perf' file
vtune: Executing actions 25 % Updating precomputed scalar metrics
vtune: Executing actions 28 % Updating precomputed scalar metrics
vtune: Executing actions 28 % Processing profile metrics and debug information
vtune: Executing actions 39 % Processing profile metrics and debug information
vtune: Executing actions 39 % Setting data model parameters
vtune: Executing actions 39 % Resolving module symbols
vtune: Executing actions 39 % Resolving thread name information
vtune: Executing actions 43 % Resolving thread name information
vtune: Executing actions 43 % Resolving call target names for dynamic code
vtune: Executing actions 48 % Resolving call target names for dynamic code
vtune: Executing actions 48 % Resolving interrupt name information
vtune: Executing actions 53 % Resolving interrupt name information
vtune: Executing actions 53 % Processing profile metrics and debug information
vtune: Executing actions 56 % Processing profile metrics and debug information
vtune: Executing actions 57 % Processing profile metrics and debug information
vtune: Executing actions 58 % Processing profile metrics and debug information
vtune: Executing actions 59 % Processing profile metrics and debug information
vtune: Executing actions 60 % Processing profile metrics and debug information
vtune: Executing actions 62 % Processing profile metrics and debug information
vtune: Executing actions 63 % Processing profile metrics and debug information
vtune: Executing actions 63 % Preparing output tree
vtune: Executing actions 63 % Parsing columns in input tree
vtune: Executing actions 64 % Parsing columns in input tree
vtune: Executing actions 64 % Creating top-level columns
vtune: Executing actions 65 % Creating top-level columns
vtune: Executing actions 65 % Creating top-level rows
vtune: Executing actions 67 % Creating top-level rows
vtune: Executing actions 67 % Preparing output tree
vtune: Executing actions 67 % Parsing columns in input tree
vtune: Executing actions 67 % Creating top-level columns
vtune: Executing actions 69 % Creating top-level columns
vtune: Executing actions 69 % Creating top-level rows
vtune: Executing actions 70 % Creating top-level rows
vtune: Executing actions 70 % Setting data model parameters
vtune: Executing actions 71 % Setting data model parameters
vtune: Executing actions 71 % Precomputing frequently used data
vtune: Executing actions 71 % Precomputing frequently used data
vtune: Executing actions 72 % Precomputing frequently used data
vtune: Executing actions 73 % Precomputing frequently used data
vtune: Executing actions 74 % Precomputing frequently used data
vtune: Executing actions 75 % Precomputing frequently used data
vtune: Executing actions 76 % Precomputing frequently used data
vtune: Executing actions 77 % Precomputing frequently used data
vtune: Executing actions 78 % Precomputing frequently used data
vtune: Executing actions 79 % Precomputing frequently used data
vtune: Executing actions 80 % Precomputing frequently used data
vtune: Executing actions 81 % Precomputing frequently used data
vtune: Executing actions 82 % Precomputing frequently used data
vtune: Executing actions 83 % Precomputing frequently used data
vtune: Executing actions 83 % Updating precomputed scalar metrics
vtune: Executing actions 85 % Updating precomputed scalar metrics
vtune: Executing actions 85 % Discarding redundant overtime data
vtune: Executing actions 89 % Discarding redundant overtime data
vtune: Executing actions 89 % Saving the result
vtune: Executing actions 92 % Saving the result
vtune: Executing actions 96 % Saving the result
vtune: Executing actions 99 % Saving the result
vtune: Executing actions 100 % Saving the result
vtune: Executing actions 100 % done
Finalization: Ok
--------------------------------------------------------------------------------
Command line:
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -R summary -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ps
Stdout:
Elapsed Time: 3.407s
IPC: 0.456
| The IPC may be too low. This could be caused by issues such as memory
| stalls, instruction starvation, branch misprediction or long latency
| instructions. Explore the other hardware-related metrics to identify what
| is causing low IPC.
|
DP GFLOPS: 4.703
Average CPU Frequency: 2.933 GHz
Effective Logical Core Utilization: 19.0% (15.193 out of 80)
| The metric value is low, which may signal a poor logical CPU cores
| utilization. Consider improving physical core utilization as the first step
| and then look at opportunities to utilize logical cores, which in some cases
| can improve processor throughput and overall performance of multi-threaded
| applications.
|
Effective Physical Core Utilization: 19.0% (7.596 out of 40)
| The metric value is low, which may signal a poor physical CPU cores
| utilization caused by:
| - load imbalance
| - threading runtime overhead
| - contended synchronization
| - thread/process underutilization
| - incorrect affinity that utilizes logical cores instead of physical
| cores
| Explore sub-metrics to estimate the efficiency of MPI and OpenMP
| parallelism or run the Locks and Waits analysis to identify parallel
| bottlenecks for other parallel runtimes.
|
Microarchitecture Usage: 0.0% of Pipeline Slots
| You code efficiency on this platform is too low.
|
| Possible cause: memory stalls, instruction starvation, branch misprediction
| or long latency instructions.
|
| Next steps: Run Microarchitecture Exploration analysis to identify the cause
| of the low microarchitecture usage efficiency.
|
Retiring: 0.0% of Pipeline Slots
Front-End Bound: 0.0% of Pipeline Slots
Back-End Bound: 100.0% of Pipeline Slots
| A significant portion of pipeline slots are remaining empty. When
| operations take too long in the back-end, they introduce bubbles in the
| pipeline that ultimately cause fewer pipeline slots containing useful
| work to be retired per cycle than the machine is capable to support. This
| opportunity cost results in slower execution. Long-latency operations
| like divides and memory operations can cause this, as can too many
| operations being directed to a single execution port (for example, more
| multiply operations arriving in the back-end per cycle than the execution
| unit can support).
|
Memory Bound: 0.0% of Pipeline Slots
Core Bound: 100.0% of Pipeline Slots
| This metric represents how much Core non-memory issues were of a
| bottleneck. Shortage in hardware compute resources, or dependencies
| software's instructions are both categorized under Core Bound. Hence
| it may indicate the machine ran out of an OOO resources, certain
| execution units are overloaded or dependencies in program's data- or
| instruction- flow are limiting the performance (e.g. FP-chained long-
| latency arithmetic operations).
|
Bad Speculation: 0.0% of Pipeline Slots
Memory Bound: 0.0% of Pipeline Slots
L1 Bound: 0.0% of Clockticks
L2 Bound: 0.0% of Clockticks
L3 Bound: 0.0% of Clockticks
DRAM Bound: 0.0% of Clockticks
DRAM Bandwidth Bound: 0.0% of Elapsed Time
Store Bound: 0.0% of Clockticks
NUMA: % of Remote Accesses: 0.0%
Vectorization: 0.0% of Packed FP Operations
Instruction Mix
SP FLOPs: 0.0% of uOps
Packed: 0.0% from SP FP
128-bit: 0.0% from SP FP
256-bit: 0.0% from SP FP
512-bit: 0.0% from SP FP
Scalar: 0.0% from SP FP
DP FLOPs
Packed: 0.0% from DP FP
128-bit: 0.0% from DP FP
256-bit: 0.0% from DP FP
512-bit: 0.0% from DP FP
Scalar: 100.0% from DP FP
x87 FLOPs: 0.0% of uOps
Non-FP
FP Arith/Mem Rd Instr. Ratio
FP Arith/Mem Wr Instr. Ratio
Collection and Platform Info
Application Command Line: /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/matrix
Operating System: 5.2.9-229_fbk15_hardened_4185_g357f49b36602 \S Kernel \r on an \m Rotor 2021-04-09 15:07
Computer Name: devbig055.ftw5.facebook.com
Result Size: 3.7 MB
Collection start time: 22:11:01 09/04/2021 UTC
Collection stop time: 22:11:04 09/04/2021 UTC
Collector Type: Driverless Perf per-process counting
CPU
Name: Intel(R) Xeon(R) Processor code named Skylake
Frequency: 1.995 GHz
Logical CPU Count: 80
Max DRAM Single-Package Bandwidth: 77.000 GB/s
Cache Allocation Technology
Level 2 capability: not detected
Level 3 capability: available
Recommendations:
Hotspots: Start with Hotspots analysis to understand the efficiency of your algorithm.
| Use Hotspots analysis to identify the most time consuming functions.
| Drill down to see the time spent on every line of code.
Microarchitecture Exploration: There is low microarchitecture usage (0.0%) of available hardware resources.
| Run Microarchitecture Exploration analysis to analyze CPU
| microarchitecture bottlenecks that can affect application performance.
Threading: There is poor utilization of logical CPU cores (19.0%) in your application.
| Use Threading to explore more opportunities to increase parallelism in
| your application.
If you want to skip descriptions of detected performance issues in the report,
enter: vtune -report summary -report-knob show-issues=false -r <my_result_dir>.
Alternatively, you may view the report in the csv format: vtune -report
<report_name> -format=csv.
Stderr:
vtune: Using result path `/tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ps'
vtune: Executing actions 0 %
vtune: Executing actions 0 % Finalizing results
vtune: Executing actions 50 % Finalizing results
vtune: Executing actions 50 % Generating a report
vtune: Executing actions 50 % Setting data model parameters
vtune: Executing actions 75 % Setting data model parameters
vtune: Executing actions 75 % Generating a report
vtune: Executing actions 100 % Generating a report
vtune: Executing actions 100 % done
Report: Ok
================================================================================
Running collection...
Command line:
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -collect hotspots -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_tpss -data-limit 0 -finalization-mode none -source-search-dir /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/src -- /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/matrix
Stderr:
vtune: Collection started. To stop the collection, either press CTRL-C or enter from another console window: vtune -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_tpss -command stop.
vtune: Collection stopped.
vtune: Using result path `/tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_tpss'
vtune: Executing actions 0 %
vtune: Executing actions 100 %
vtune: Executing actions 100 % done
Instrumentation based analysis checkIntel(R) VTune(TM) Profiler Self Check Utility | |
Copyright (C) 2009-2020 Intel Corporation. All rights reserved. | |
Build Number: 613804 | |
Ignored warnings: ['To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location.', 'To enable hardware event-based sampling, PRODUCT_LEGAL_SHORT_NAME has disabled the NMI watchdog timer. The watchdog timer will be re-enabled after collection completes.'] | |
Check of files: Ok | |
================================================================================ | |
Context values: | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/amplxe-runss --context-value-list | |
Stdout: | |
targetOS: Linux | |
OS: Linux | |
OSBuildNumber: 9 | |
OSBitness: 64 | |
RootPrivileges: false | |
isPtraceScopeLimited: false | |
isCATSupportedByCPU: true | |
isL3CATAvailable: true | |
L3CATDetails: COS=16;ways=11 | |
isL2CATAvailable: false | |
isL3MonitoringSupportedByCPU: true | |
LLCSize: 28835840 | |
cacheMonitoringUpscalingFactor: 81920 | |
isL3CacheOccupancyAvailable: true | |
isL3TotalBWAvailable: true | |
isL3LocalBWAvailable: true | |
isTSXAvailable: true | |
isPTAvailable: true | |
isHTEnabled: true | |
fpgaOnBoard: None | |
omniPathOnBoard: None | |
genArchOnBoard: 0 | |
pciClassParts: | |
isSGXAvailable: false | |
LinuxRelease: 5.2.9-229_fbk15_hardened_4185_g357f49b36602 | |
is3DXPPresent: false | |
is3DXP2LMMode: false | |
is3DXPAppDirectMode: false | |
IsNUMANodeWithoutCPUsPresent: false | |
Hypervisor: None | |
PerfmonVersion: 4 | |
isMaxDRAMBandwidthMeasurementSupported: true | |
tidValuesForIO: 0x1d8;0x1f0;0x1f8 | |
preferedGpuAdapter: none | |
isPtraceAvailable: true | |
i915Status: MissingDriver | |
isFtraceAvailable: ftraceUnknownError | |
isMdfEtwAvailable: false | |
isCSwitchAvailable: no | |
isGpuBusynessAvailable: unsupportedHardware | |
isGpuWaitAvailable: no | |
isFunctionTracingAvailable: no | |
isIowaitTracingAvailable: no | |
isVSyncAvailable: no | |
HypervisorType: None | |
isDeviceOrCredentialGuardEnabled: false | |
isSEPDriverAvailable: false | |
isPAXDriverLoaded: false | |
platformType: 111 | |
CPU_NAME: Intel(R) Xeon(R) Processor code named Skylake | |
PMU: skylake_server | |
availablePmuTypes: core,cha,imc,pcu,qpi,r3qpi,ubox,m2pcie,m2m,irp,iio,power,hfi_rxe,hfi_txe | |
referenceFrequency: 2000000000 | |
isPStateAvailable: true | |
isVTSSPPDriverAvailable: false | |
isNMIWatchDogTimerRunning: true | |
LinuxPerfCredentials: Unlimited | |
LinuxPerfCapabilities: breakpoint:raw;cpu:raw,format,events,ldlat,frontend;cstate_core:raw,format,events;cstate_pkg:raw,format,events;intel_pt:raw,format;kprobe:raw,format;msr:raw,format,events;power:raw,format,events;software:raw;tracepoint:raw;uncore_cha:20,raw,format;uncore_iio:6,raw,format;uncore_iio_free_running:6,raw,format,events;uncore_imc:6,raw,format,events;uncore_irp:6,raw,format;uncore_m2m:2,raw,format;uncore_m3upi:3,raw,format;uncore_pcu:raw,format;uncore_ubox:raw,format;uncore_upi:3,raw,format;uprobe:raw,format | |
LinuxPerfStackCapabilities: fp,dwarf,lbr | |
areKernelPtrsRestricted: no | |
LinuxPerfMuxIntervalMs: 1 | |
isAOCLAvailable: false | |
isTPSSAvailable: true | |
isPytraceAvailable: true | |
forceShowInlines: false | |
isSTTAvailable: no | |
isNnpiHwTraceToolAvailable: false | |
isNnpiTraceToolAvailable: false | |
isEnergyCollectionSupported: true | |
isSocwatchDriverLoaded: false | |
isCPUSupportedBySocwatch: true | |
isIPMWatchReady: false | |
isNvdimmAvailable: false | |
isOsCountersCollectorAvailable: true | |
Getting context values: OK | |
================================================================================ | |
Check driver: | |
isSEPDriverAvailable: false | |
isPAXDriverLoaded: false | |
Command line: | |
lsmod | |
Is SEP in lsmod: False | |
The SEP driver is not available. | |
================================================================================ | |
SEP version: | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/sep -version | |
Stdout: | |
Sampling Enabling Product Version: 5.22 Beta built on Nov 10 2020 18:15:43 | |
SEP Driver Version: PAX Driver Version: Platform type: 111 | |
CPU name: Intel(R) Xeon(R) Processor code named Skylake | |
PMU: skylake_server | |
Stderr: | |
Error retrieving SEP driver version | |
Error retrieving PAX driver version | |
Check driver with sep -version: Fail | |
================================================================================ | |
Running collection... | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -collect performance-snapshot -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ps -data-limit 0 -finalization-mode none -source-search-dir /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/src -- /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/matrix | |
Stdout: | |
Addr of buf1 = 0x7fac0e55c010 | |
Offs of buf1 = 0x7fac0e55c180 | |
Addr of buf2 = 0x7fac0c55b010 | |
Offs of buf2 = 0x7fac0c55b1c0 | |
Addr of buf3 = 0x7fac0a55a010 | |
Offs of buf3 = 0x7fac0a55a100 | |
Addr of buf4 = 0x7fac08559010 | |
Offs of buf4 = 0x7fac08559140 | |
Threads #: 16 Pthreads | |
Matrix size: 2048 | |
Using multiply kernel: multiply1 | |
Execution time = 3.310 seconds | |
Stderr: | |
vtune: Peak bandwidth measurement started. | |
vtune: Peak bandwidth measurement finished. | |
vtune: Collection started. To stop the collection, either press CTRL-C or enter from another console window: vtune -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ps -command stop. | |
vtune: Collection stopped. | |
vtune: Using result path `/tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ps' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 100 % | |
vtune: Executing actions 100 % done | |
HW event-based analysis (counting mode) (Perf) | |
Example of analysis types: Performance Snapshot | |
Collection: Ok | |
-------------------------------------------------------------------------------- | |
Running finalization... | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -finalize -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ps | |
Stderr: | |
vtune: Using result path `/tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ps' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 0 % Finalizing results | |
vtune: Executing actions 0 % Finalizing the result | |
vtune: Executing actions 0 % Clearing the database | |
vtune: Executing actions 14 % Clearing the database | |
vtune: Executing actions 14 % Loading raw data to the database | |
vtune: Executing actions 14 % Loading 'systemcollector-699845-devbig055.ftw5.fa | |
vtune: Executing actions 25 % Loading 'systemcollector-699845-devbig055.ftw5.fa | |
vtune: Executing actions 25 % Loading '699871.stat.perf' file | |
vtune: Executing actions 25 % Updating precomputed scalar metrics | |
vtune: Executing actions 28 % Updating precomputed scalar metrics | |
vtune: Executing actions 28 % Processing profile metrics and debug information | |
vtune: Executing actions 39 % Processing profile metrics and debug information | |
vtune: Executing actions 39 % Setting data model parameters | |
vtune: Executing actions 39 % Resolving module symbols | |
vtune: Executing actions 39 % Resolving thread name information | |
vtune: Executing actions 43 % Resolving thread name information | |
vtune: Executing actions 43 % Resolving call target names for dynamic code | |
vtune: Executing actions 48 % Resolving call target names for dynamic code | |
vtune: Executing actions 48 % Resolving interrupt name information | |
vtune: Executing actions 53 % Resolving interrupt name information | |
vtune: Executing actions 53 % Processing profile metrics and debug information | |
vtune: Executing actions 56 % Processing profile metrics and debug information | |
vtune: Executing actions 57 % Processing profile metrics and debug information | |
vtune: Executing actions 58 % Processing profile metrics and debug information | |
vtune: Executing actions 59 % Processing profile metrics and debug information | |
vtune: Executing actions 60 % Processing profile metrics and debug information | |
vtune: Executing actions 62 % Processing profile metrics and debug information | |
vtune: Executing actions 63 % Processing profile metrics and debug information | |
vtune: Executing actions 63 % Preparing output tree | |
vtune: Executing actions 63 % Parsing columns in input tree | |
vtune: Executing actions 64 % Parsing columns in input tree | |
vtune: Executing actions 64 % Creating top-level columns | |
vtune: Executing actions 65 % Creating top-level columns | |
vtune: Executing actions 65 % Creating top-level rows | |
vtune: Executing actions 67 % Creating top-level rows | |
vtune: Executing actions 67 % Preparing output tree | |
vtune: Executing actions 67 % Parsing columns in input tree | |
vtune: Executing actions 67 % Creating top-level columns | |
vtune: Executing actions 69 % Creating top-level columns | |
vtune: Executing actions 69 % Creating top-level rows | |
vtune: Executing actions 70 % Creating top-level rows | |
vtune: Executing actions 70 % Setting data model parameters | |
vtune: Executing actions 71 % Setting data model parameters | |
vtune: Executing actions 71 % Precomputing frequently used data | |
vtune: Executing actions 71 % Precomputing frequently used data | |
vtune: Executing actions 72 % Precomputing frequently used data | |
vtune: Executing actions 73 % Precomputing frequently used data | |
vtune: Executing actions 74 % Precomputing frequently used data | |
vtune: Executing actions 75 % Precomputing frequently used data | |
vtune: Executing actions 76 % Precomputing frequently used data | |
vtune: Executing actions 77 % Precomputing frequently used data | |
vtune: Executing actions 78 % Precomputing frequently used data | |
vtune: Executing actions 79 % Precomputing frequently used data | |
vtune: Executing actions 80 % Precomputing frequently used data | |
vtune: Executing actions 81 % Precomputing frequently used data | |
vtune: Executing actions 82 % Precomputing frequently used data | |
vtune: Executing actions 83 % Precomputing frequently used data | |
vtune: Executing actions 83 % Updating precomputed scalar metrics | |
vtune: Executing actions 85 % Updating precomputed scalar metrics | |
vtune: Executing actions 85 % Discarding redundant overtime data | |
vtune: Executing actions 89 % Discarding redundant overtime data | |
vtune: Executing actions 89 % Saving the result | |
vtune: Executing actions 92 % Saving the result | |
vtune: Executing actions 96 % Saving the result | |
vtune: Executing actions 99 % Saving the result | |
vtune: Executing actions 100 % Saving the result | |
vtune: Executing actions 100 % done | |
Finalization: Ok | |
-------------------------------------------------------------------------------- | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -R summary -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ps | |
Stdout: | |
Elapsed Time: 3.407s | |
IPC: 0.456 | |
| The IPC may be too low. This could be caused by issues such as memory | |
| stalls, instruction starvation, branch misprediction or long latency | |
| instructions. Explore the other hardware-related metrics to identify what | |
| is causing low IPC. | |
| | |
DP GFLOPS: 4.703 | |
Average CPU Frequency: 2.933 GHz | |
Effective Logical Core Utilization: 19.0% (15.193 out of 80) | |
| The metric value is low, which may signal a poor logical CPU cores | |
| utilization. Consider improving physical core utilization as the first step | |
| and then look at opportunities to utilize logical cores, which in some cases | |
| can improve processor throughput and overall performance of multi-threaded | |
| applications. | |
| | |
Effective Physical Core Utilization: 19.0% (7.596 out of 40) | |
| The metric value is low, which may signal a poor physical CPU cores | |
| utilization caused by: | |
| - load imbalance | |
| - threading runtime overhead | |
| - contended synchronization | |
| - thread/process underutilization | |
| - incorrect affinity that utilizes logical cores instead of physical | |
| cores | |
| Explore sub-metrics to estimate the efficiency of MPI and OpenMP | |
| parallelism or run the Locks and Waits analysis to identify parallel | |
| bottlenecks for other parallel runtimes. | |
| | |
Microarchitecture Usage: 0.0% of Pipeline Slots | |
| You code efficiency on this platform is too low. | |
| | |
| Possible cause: memory stalls, instruction starvation, branch misprediction | |
| or long latency instructions. | |
| | |
| Next steps: Run Microarchitecture Exploration analysis to identify the cause | |
| of the low microarchitecture usage efficiency. | |
| | |
Retiring: 0.0% of Pipeline Slots | |
Front-End Bound: 0.0% of Pipeline Slots | |
Back-End Bound: 100.0% of Pipeline Slots | |
| A significant portion of pipeline slots are remaining empty. When | |
| operations take too long in the back-end, they introduce bubbles in the | |
| pipeline that ultimately cause fewer pipeline slots containing useful | |
| work to be retired per cycle than the machine is capable to support. This | |
| opportunity cost results in slower execution. Long-latency operations | |
| like divides and memory operations can cause this, as can too many | |
| operations being directed to a single execution port (for example, more | |
| multiply operations arriving in the back-end per cycle than the execution | |
| unit can support). | |
| | |
Memory Bound: 0.0% of Pipeline Slots | |
Core Bound: 100.0% of Pipeline Slots | |
| This metric represents how much Core non-memory issues were a | |
| bottleneck. A shortage of hardware compute resources and dependencies | |
| in the software's instructions are both categorized under Core Bound. | |
| Hence it may indicate that the machine ran out of OOO resources, that | |
| certain execution units are overloaded, or that dependencies in the | |
| program's data or instruction flow are limiting performance (e.g. | |
| FP-chained long-latency arithmetic operations). | |
| | |
Bad Speculation: 0.0% of Pipeline Slots | |
Memory Bound: 0.0% of Pipeline Slots | |
L1 Bound: 0.0% of Clockticks | |
L2 Bound: 0.0% of Clockticks | |
L3 Bound: 0.0% of Clockticks | |
DRAM Bound: 0.0% of Clockticks | |
DRAM Bandwidth Bound: 0.0% of Elapsed Time | |
Store Bound: 0.0% of Clockticks | |
NUMA: % of Remote Accesses: 0.0% | |
Vectorization: 0.0% of Packed FP Operations | |
Instruction Mix | |
SP FLOPs: 0.0% of uOps | |
Packed: 0.0% from SP FP | |
128-bit: 0.0% from SP FP | |
256-bit: 0.0% from SP FP | |
512-bit: 0.0% from SP FP | |
Scalar: 0.0% from SP FP | |
DP FLOPs | |
Packed: 0.0% from DP FP | |
128-bit: 0.0% from DP FP | |
256-bit: 0.0% from DP FP | |
512-bit: 0.0% from DP FP | |
Scalar: 100.0% from DP FP | |
x87 FLOPs: 0.0% of uOps | |
Non-FP | |
FP Arith/Mem Rd Instr. Ratio | |
FP Arith/Mem Wr Instr. Ratio | |
Collection and Platform Info | |
Application Command Line: /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/matrix | |
Operating System: 5.2.9-229_fbk15_hardened_4185_g357f49b36602 \S Kernel \r on an \m Rotor 2021-04-09 15:07 | |
Computer Name: devbig055.ftw5.facebook.com | |
Result Size: 3.7 MB | |
Collection start time: 22:11:01 09/04/2021 UTC | |
Collection stop time: 22:11:04 09/04/2021 UTC | |
Collector Type: Driverless Perf per-process counting | |
CPU | |
Name: Intel(R) Xeon(R) Processor code named Skylake | |
Frequency: 1.995 GHz | |
Logical CPU Count: 80 | |
Max DRAM Single-Package Bandwidth: 77.000 GB/s | |
Cache Allocation Technology | |
Level 2 capability: not detected | |
Level 3 capability: available | |
Recommendations: | |
Hotspots: Start with Hotspots analysis to understand the efficiency of your algorithm. | |
| Use Hotspots analysis to identify the most time consuming functions. | |
| Drill down to see the time spent on every line of code. | |
Microarchitecture Exploration: There is low microarchitecture usage (0.0%) of available hardware resources. | |
| Run Microarchitecture Exploration analysis to analyze CPU | |
| microarchitecture bottlenecks that can affect application performance. | |
Threading: There is poor utilization of logical CPU cores (19.0%) in your application. | |
| Use Threading to explore more opportunities to increase parallelism in | |
| your application. | |
If you want to skip descriptions of detected performance issues in the report, | |
enter: vtune -report summary -report-knob show-issues=false -r <my_result_dir>. | |
Alternatively, you may view the report in the csv format: vtune -report | |
<report_name> -format=csv. | |
Stderr: | |
vtune: Using result path `/tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ps' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 0 % Finalizing results | |
vtune: Executing actions 50 % Finalizing results | |
vtune: Executing actions 50 % Generating a report | |
vtune: Executing actions 50 % Setting data model parameters | |
vtune: Executing actions 75 % Setting data model parameters | |
vtune: Executing actions 75 % Generating a report | |
vtune: Executing actions 100 % Generating a report | |
vtune: Executing actions 100 % done | |
Report: Ok | |
================================================================================ | |
Running collection... | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -collect hotspots -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_tpss -data-limit 0 -finalization-mode none -source-search-dir /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/src -- /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/matrix | |
Stderr: | |
vtune: Collection started. To stop the collection, either press CTRL-C or enter from another console window: vtune -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_tpss -command stop. | |
vtune: Collection stopped. | |
vtune: Using result path `/tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_tpss' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 100 % | |
vtune: Executing actions 100 % done | |
Instrumentation based analysis check | |
Example of analysis types: Hotspots with default knob sampling-mode=sw, Threading with default knob sampling-and-waits=sw | |
Collection: Fail | |
================================================================================ | |
Running collection... | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -collect hotspots -knob sampling-mode=hw -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ah -data-limit 0 -finalization-mode none -source-search-dir /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/src -- /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/matrix | |
Stdout: | |
Addr of buf1 = 0x7f83ae032010 | |
Offs of buf1 = 0x7f83ae032180 | |
Addr of buf2 = 0x7f83ac031010 | |
Offs of buf2 = 0x7f83ac0311c0 | |
Addr of buf3 = 0x7f83aa030010 | |
Offs of buf3 = 0x7f83aa030100 | |
Addr of buf4 = 0x7f83a802f010 | |
Offs of buf4 = 0x7f83a802f140 | |
Threads #: 16 Pthreads | |
Matrix size: 2048 | |
Using multiply kernel: multiply1 | |
Execution time = 3.337 seconds | |
Stderr: | |
vtune: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location. | |
vtune: Collection started. To stop the collection, either press CTRL-C or enter from another console window: vtune -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ah -command stop. | |
vtune: Collection stopped. | |
vtune: Using result path `/tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ah' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 100 % | |
vtune: Executing actions 100 % done | |
HW event-based analysis check (Perf) | |
Example of analysis types: Hotspots with knob sampling-mode=hw, HPC Performance Characterization, etc. | |
Collection: Ok | |
vtune: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location. | |
-------------------------------------------------------------------------------- | |
Running finalization... | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -finalize -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ah | |
Stderr: | |
vtune: Using result path `/tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ah' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 0 % Finalizing results | |
vtune: Executing actions 0 % Finalizing the result | |
vtune: Executing actions 0 % Clearing the database | |
vtune: Executing actions 14 % Clearing the database | |
vtune: Executing actions 14 % Loading raw data to the database | |
vtune: Executing actions 14 % Loading 'systemcollector-700676-devbig055.ftw5.fa | |
vtune: Executing actions 25 % Loading 'systemcollector-700676-devbig055.ftw5.fa | |
vtune: Executing actions 25 % Loading '700685.perf' file | |
vtune: Executing actions 25 % Updating precomputed scalar metrics | |
vtune: Executing actions 28 % Updating precomputed scalar metrics | |
vtune: Executing actions 28 % Processing profile metrics and debug information | |
vtune: Executing actions 39 % Processing profile metrics and debug information | |
vtune: Executing actions 39 % Setting data model parameters | |
vtune: Executing actions 39 % Resolving module symbols | |
vtune: Executing actions 39 % Resolving information for `matrix' | |
vtune: Executing actions 41 % Resolving information for `matrix' | |
vtune: Executing actions 41 % Resolving information for `vmlinux' | |
vtune: Executing actions 44 % Resolving information for `vmlinux' | |
vtune: Executing actions 44 % Resolving bottom user stack information | |
vtune: Executing actions 45 % Resolving bottom user stack information | |
vtune: Executing actions 46 % Resolving bottom user stack information | |
vtune: Executing actions 46 % Resolving thread name information | |
vtune: Executing actions 47 % Resolving thread name information | |
vtune: Executing actions 48 % Resolving thread name information | |
vtune: Executing actions 48 % Resolving call target names for dynamic code | |
vtune: Executing actions 49 % Resolving call target names for dynamic code | |
vtune: Executing actions 49 % Resolving interrupt name information | |
vtune: Executing actions 53 % Resolving interrupt name information | |
vtune: Executing actions 53 % Processing profile metrics and debug information | |
vtune: Executing actions 54 % Processing profile metrics and debug information | |
vtune: Executing actions 55 % Processing profile metrics and debug information | |
vtune: Executing actions 56 % Processing profile metrics and debug information | |
vtune: Executing actions 57 % Processing profile metrics and debug information | |
vtune: Executing actions 58 % Processing profile metrics and debug information | |
vtune: Executing actions 60 % Processing profile metrics and debug information | |
vtune: Executing actions 60 % Setting data model parameters | |
vtune: Executing actions 60 % Precomputing frequently used data | |
vtune: Executing actions 60 % Precomputing frequently used data | |
vtune: Executing actions 62 % Precomputing frequently used data | |
vtune: Executing actions 63 % Precomputing frequently used data | |
vtune: Executing actions 64 % Precomputing frequently used data | |
vtune: Executing actions 65 % Precomputing frequently used data | |
vtune: Executing actions 66 % Precomputing frequently used data | |
vtune: Executing actions 67 % Precomputing frequently used data | |
vtune: Executing actions 68 % Precomputing frequently used data | |
vtune: Executing actions 69 % Precomputing frequently used data | |
vtune: Executing actions 71 % Precomputing frequently used data | |
vtune: Executing actions 72 % Precomputing frequently used data | |
vtune: Executing actions 72 % Updating precomputed scalar metrics | |
vtune: Executing actions 75 % Updating precomputed scalar metrics | |
vtune: Executing actions 75 % Discarding redundant overtime data | |
vtune: Executing actions 78 % Discarding redundant overtime data | |
vtune: Executing actions 78 % Saving the result | |
vtune: Executing actions 82 % Saving the result | |
vtune: Executing actions 85 % Saving the result | |
vtune: Executing actions 100 % Saving the result | |
vtune: Executing actions 100 % done | |
Finalization: Ok | |
-------------------------------------------------------------------------------- | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -limit 5 -format csv -csv-delimiter comma -report hotspots -group-by function -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ah | |
Stdout: | |
Function,CPU Time,CPU Time:Effective Time,CPU Time:Effective Time:Idle,CPU Time:Effective Time:Poor,CPU Time:Effective Time:Ok,CPU Time:Effective Time:Ideal,CPU Time:Effective Time:Over,CPU Time:Spin Time,CPU Time:Overhead Time,Instructions Retired,Microarchitecture Usage(%),Microarchitecture Usage:Microarchitecture Usage(%),Microarchitecture Usage:CPI Rate,Module,Function (Full),Source File,Start Address | |
multiply1,50.849268,50.849268,0.0,50.849268,0.0,0.0,0.0,0.0,0.0,67940000000,0.0,0.0,2.183397,matrix,multiply1,multiply.c,0x401550 | |
__read_once_size,0.095223,0.095223,0.0,0.095223,0.0,0.0,0.0,0.0,0.0,10000000,0.0,0.0,35.000000,vmlinux,__read_once_size,compiler.h,0xffffffff8110f674 | |
__read_seqcount_begin,0.045106,0.045106,0.0,0.045106,0.0,0.0,0.0,0.0,0.0,0,0.0,0.0,,vmlinux,__read_seqcount_begin,seqlock.h,0xffffffff81132053 | |
prepare_exit_to_usermode,0.045106,0.045106,0.0,0.045106,0.0,0.0,0.0,0.0,0.0,10000000,0.0,0.0,9.000000,vmlinux,prepare_exit_to_usermode,common.c,0xffffffff810024a0 | |
interrupt_entry,0.035082,0.035082,0.0,0.035082,0.0,0.0,0.0,0.0,0.0,10000000,0.0,0.0,11.000000,vmlinux,interrupt_entry,entry_64.S,0xffffffff81c00910 | |
Stderr: | |
vtune: Using result path `/tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ah' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 0 % Finalizing results | |
vtune: Executing actions 50 % Finalizing results | |
vtune: Executing actions 50 % Generating a report | |
vtune: Executing actions 50 % Setting data model parameters | |
vtune: Executing actions 75 % Setting data model parameters | |
vtune: Executing actions 75 % Generating a report | |
vtune: Executing actions 100 % Generating a report | |
vtune: Executing actions 100 % done | |
Report: Ok | |
================================================================================ | |
Running collection... | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -collect uarch-exploration -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ge -data-limit 0 -finalization-mode none -source-search-dir /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/src -- /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/matrix | |
Stdout: | |
Addr of buf1 = 0x7fa935421010 | |
Offs of buf1 = 0x7fa935421180 | |
Addr of buf2 = 0x7fa933420010 | |
Offs of buf2 = 0x7fa9334201c0 | |
Addr of buf3 = 0x7fa93141f010 | |
Offs of buf3 = 0x7fa93141f100 | |
Addr of buf4 = 0x7fa92f41e010 | |
Offs of buf4 = 0x7fa92f41e140 | |
Threads #: 16 Pthreads | |
Matrix size: 2048 | |
Using multiply kernel: multiply1 | |
Execution time = 3.735 seconds | |
Stderr: | |
vtune: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location. | |
vtune: Collection started. To stop the collection, either press CTRL-C or enter from another console window: vtune -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ge -command stop. | |
vtune: Collection stopped. | |
vtune: Using result path `/tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ge' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 100 % | |
vtune: Executing actions 100 % done | |
HW event-based analysis check (Perf) | |
Example of analysis types: Microarchitecture Exploration | |
Collection: Ok | |
vtune: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location. | |
-------------------------------------------------------------------------------- | |
Running finalization... | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -finalize -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ge | |
Stderr: | |
vtune: Using result path `/tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ge' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 0 % Finalizing results | |
vtune: Executing actions 0 % Finalizing the result | |
vtune: Executing actions 0 % Clearing the database | |
vtune: Executing actions 14 % Clearing the database | |
vtune: Executing actions 14 % Loading raw data to the database | |
vtune: Executing actions 14 % Loading 'systemcollector-701369-devbig055.ftw5.fa | |
vtune: Executing actions 25 % Loading 'systemcollector-701369-devbig055.ftw5.fa | |
vtune: Executing actions 25 % Loading 'system-wide.perf' file | |
vtune: Executing actions 25 % Updating precomputed scalar metrics | |
vtune: Executing actions 28 % Updating precomputed scalar metrics | |
vtune: Executing actions 28 % Processing profile metrics and debug information | |
vtune: Executing actions 39 % Processing profile metrics and debug information | |
vtune: Executing actions 39 % Setting data model parameters | |
vtune: Executing actions 39 % Resolving module symbols | |
vtune: Executing actions 39 % Resolving information for `matrix' | |
vtune: Executing actions 41 % Resolving information for `matrix' | |
vtune: Executing actions 41 % Resolving information for `vmlinux' | |
vtune: Executing actions 44 % Resolving information for `vmlinux' | |
vtune: Executing actions 44 % Resolving bottom user stack information | |
vtune: Executing actions 45 % Resolving bottom user stack information | |
vtune: Executing actions 46 % Resolving bottom user stack information | |
vtune: Executing actions 46 % Resolving thread name information | |
vtune: Executing actions 47 % Resolving thread name information | |
vtune: Executing actions 48 % Resolving thread name information | |
vtune: Executing actions 48 % Resolving call target names for dynamic code | |
vtune: Executing actions 49 % Resolving call target names for dynamic code | |
vtune: Executing actions 49 % Resolving interrupt name information | |
vtune: Executing actions 53 % Resolving interrupt name information | |
vtune: Executing actions 53 % Processing profile metrics and debug information | |
vtune: Executing actions 54 % Processing profile metrics and debug information | |
vtune: Executing actions 56 % Processing profile metrics and debug information | |
vtune: Executing actions 57 % Processing profile metrics and debug information | |
vtune: Executing actions 58 % Processing profile metrics and debug information | |
vtune: Executing actions 60 % Processing profile metrics and debug information | |
vtune: Executing actions 60 % Setting data model parameters | |
vtune: Executing actions 60 % Precomputing frequently used data | |
vtune: Executing actions 60 % Precomputing frequently used data | |
vtune: Executing actions 62 % Precomputing frequently used data | |
vtune: Executing actions 63 % Precomputing frequently used data | |
vtune: Executing actions 64 % Precomputing frequently used data | |
vtune: Executing actions 65 % Precomputing frequently used data | |
vtune: Executing actions 66 % Precomputing frequently used data | |
vtune: Executing actions 67 % Precomputing frequently used data | |
vtune: Executing actions 68 % Precomputing frequently used data | |
vtune: Executing actions 69 % Precomputing frequently used data | |
vtune: Executing actions 71 % Precomputing frequently used data | |
vtune: Executing actions 72 % Precomputing frequently used data | |
vtune: Executing actions 72 % Updating precomputed scalar metrics | |
vtune: Executing actions 75 % Updating precomputed scalar metrics | |
vtune: Executing actions 75 % Discarding redundant overtime data | |
vtune: Executing actions 78 % Discarding redundant overtime data | |
vtune: Executing actions 78 % Saving the result | |
vtune: Executing actions 82 % Saving the result | |
vtune: Executing actions 85 % Saving the result | |
vtune: Executing actions 100 % Saving the result | |
vtune: Executing actions 100 % done | |
Finalization: Ok | |
-------------------------------------------------------------------------------- | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -limit 5 -format csv -csv-delimiter comma -report hotspots -group-by function -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ge | |
Stdout: | |
Function,CPU Time,Clockticks,Instructions Retired,CPI Rate,Retiring(%),Retiring:Light Operations(%),Retiring:Light Operations:FP Arithmetic(%),Retiring:Light Operations:FP Arithmetic:FP x87(%),Retiring:Light Operations:FP Arithmetic:FP Scalar(%),Retiring:Light Operations:FP Arithmetic:FP Vector(%),Retiring:Light Operations:Other(%),Retiring:Heavy Operations(%),Retiring:Heavy Operations:Microcode Sequencer(%),Retiring:Heavy Operations:Microcode Sequencer:Assists(%),Front-End Bound(%),Front-End Bound:Front-End Latency(%),Front-End Bound:Front-End Latency:ICache Misses(%),Front-End Bound:Front-End Latency:ITLB Overhead(%),Front-End Bound:Front-End Latency:Branch Resteers(%),Front-End Bound:Front-End Latency:Branch Resteers:Mispredicts Resteers(%),Front-End Bound:Front-End Latency:Branch Resteers:Clears Resteers(%),Front-End Bound:Front-End Latency:Branch Resteers:Unknown Branches(%),Front-End Bound:Front-End Latency:DSB Switches(%),Front-End Bound:Front-End Latency:Length Changing Prefixes(%),Front-End Bound:Front-End Latency:MS Switches(%),Front-End Bound:Front-End Bandwidth(%),Front-End Bound:Front-End Bandwidth:Front-End Bandwidth MITE(%),Front-End Bound:Front-End Bandwidth:Front-End Bandwidth DSB(%),Front-End Bound:Front-End Bandwidth:(Info) DSB Coverage(%),Bad Speculation(%),Bad Speculation:Branch Mispredict(%),Bad Speculation:Machine Clears(%),Back-End Bound(%),Back-End Bound:Memory Bound(%),Back-End Bound:Memory Bound:L1 Bound(%),Back-End Bound:Memory Bound:L1 Bound:DTLB Overhead(%),Back-End Bound:Memory Bound:L1 Bound:DTLB Overhead:Load STLB Hit(%),Back-End Bound:Memory Bound:L1 Bound:DTLB Overhead:Load STLB Miss(%),Back-End Bound:Memory Bound:L1 Bound:Loads Blocked by Store Forwarding(%),Back-End Bound:Memory Bound:L1 Bound:Lock Latency(%),Back-End Bound:Memory Bound:L1 Bound:Split Loads(%),Back-End Bound:Memory Bound:L1 Bound:4K Aliasing(%),Back-End Bound:Memory Bound:L1 Bound:FB Full(%),Back-End Bound:Memory Bound:L2 Bound(%),Back-End Bound:Memory Bound:L3 Bound(%),Back-End Bound:Memory Bound:L3 Bound:Contested Accesses(%),Back-End Bound:Memory Bound:L3 Bound:Data Sharing(%),Back-End Bound:Memory Bound:L3 Bound:L3 Latency(%),Back-End Bound:Memory Bound:L3 Bound:SQ Full(%),Back-End Bound:Memory Bound:DRAM Bound(%),Back-End Bound:Memory Bound:DRAM Bound:Memory Bandwidth(%),Back-End Bound:Memory Bound:DRAM Bound:Memory Latency(%),Back-End Bound:Memory Bound:DRAM Bound:Memory Latency:Local DRAM(%),Back-End Bound:Memory Bound:DRAM Bound:Memory Latency:Remote DRAM(%),Back-End Bound:Memory Bound:DRAM Bound:Memory Latency:Remote Cache(%),Back-End Bound:Memory Bound:Store Bound(%),Back-End Bound:Memory Bound:Store Bound:Store Latency(%),Back-End Bound:Memory Bound:Store Bound:False Sharing(%),Back-End Bound:Memory Bound:Store Bound:Split Stores(%),Back-End Bound:Memory Bound:Store Bound:DTLB Store Overhead(%),Back-End Bound:Memory Bound:Store Bound:DTLB Store Overhead:Store STLB Hit(%),Back-End Bound:Memory Bound:Store Bound:DTLB Store Overhead:Store STLB Hit(%),Back-End Bound:Core Bound(%),Back-End Bound:Core Bound:Divider(%),Back-End Bound:Core Bound:Port Utilization(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 0 Ports Utilized(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 0 Ports Utilized:Serializing Operations(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 0 Ports Utilized:Mixing Vectors(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 1 Port Utilized(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 2 Ports Utilized(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 3+ Ports Utilized(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 3+ Ports Utilized:ALU Operation Utilization(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 3+ Ports Utilized:ALU Operation Utilization:Port 0(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 3+ Ports Utilized:ALU Operation Utilization:Port 1(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 3+ Ports Utilized:ALU Operation Utilization:Port 5(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 3+ Ports Utilized:ALU Operation Utilization:Port 6(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 3+ Ports Utilized:Load Operation Utilization(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 3+ Ports Utilized:Load Operation Utilization:Port 2(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 3+ Ports Utilized:Load Operation Utilization:Port 3(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 3+ Ports Utilized:Store Operation Utilization(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 3+ Ports Utilized:Store Operation Utilization:Port 4(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 3+ Ports Utilized:Store Operation Utilization:Port 7(%),Back-End Bound:Core Bound:Port Utilization:Vector Capacity Usage (FPU)(%),Average CPU Frequency,Module,Function (Full),Source File,Start Address | |
multiply1,55.259611,148800000000,68760000000,2.164049,11.0,11.4,17.0,0.0,17.0,0.0,83.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,100.0,0.0,-0.0,0.0,89.2,73.1,1.0,98.7,0.0,98.7,0.0,0.0,0.0,0.0,14.3,0.0,63.1,0.0,4.1,100.0,0.0,1.8,44.6,53.3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,16.0,0.0,14.5,33.2,0.0,0.0,8.1,5.3,2.3,22.7,37.9,34.9,6.4,11.4,4.7,5.9,6.7,3.2,3.2,0.0,12.5,2692744237.678215,matrix,multiply1,multiply.c,0x401550 | |
__read_once_size,0.180423,490000000,0,,0.0,0.0,0.0,0.0,0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,38.2,0.0,38.2,61.8,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,61.8,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2715837254.083333,vmlinux,__read_once_size,compiler.h,0xffffffff8110f674 | |
__read_seqcount_begin,0.040094,140000000,0,,0.0,0.0,0.0,0.0,0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,100.0,0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,3491790755.250000,vmlinux,__read_seqcount_begin,seqlock.h,0xffffffff81132053 | |
interrupt_entry,0.035082,110000000,10000000,11.000000,0.0,4.5,0.0,0.0,0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,100.0,0.0,0.0,33.3,0.0,33.3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,3135485576.142857,vmlinux,interrupt_entry,entry_64.S,0xffffffff81c00910 | |
__read_once_size,0.030071,120000000,0,,0.0,0.0,0.0,0.0,0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,100.0,0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,3990618006.000000,vmlinux,__read_once_size,compiler.h,0xffffffff8110f5d9 | |
Stderr: | |
vtune: Using result path `/tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ge' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 0 % Finalizing results | |
vtune: Executing actions 50 % Finalizing results | |
vtune: Executing actions 50 % Generating a report | |
vtune: Executing actions 50 % Setting data model parameters | |
vtune: Executing actions 75 % Setting data model parameters | |
vtune: Executing actions 75 % Generating a report | |
vtune: Executing actions 100 % Generating a report | |
vtune: Executing actions 100 % done | |
Report: Ok | |
================================================================================ | |
Running collection... | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -collect memory-access -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ma -data-limit 0 -finalization-mode none -source-search-dir /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/src -- /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/matrix | |
Stdout: | |
Addr of buf1 = 0x7f60e1486010 | |
Offs of buf1 = 0x7f60e1486180 | |
Addr of buf2 = 0x7f60df485010 | |
Offs of buf2 = 0x7f60df4851c0 | |
Addr of buf3 = 0x7f60dd484010 | |
Offs of buf3 = 0x7f60dd484100 | |
Addr of buf4 = 0x7f60db483010 | |
Offs of buf4 = 0x7f60db483140 | |
Threads #: 16 Pthreads | |
Matrix size: 2048 | |
Using multiply kernel: multiply1 | |
Execution time = 3.688 seconds | |
Stderr: | |
vtune: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location. | |
vtune: Peak bandwidth measurement started. | |
vtune: Peak bandwidth measurement finished. | |
vtune: Collection started. To stop the collection, either press CTRL-C or enter from another console window: vtune -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ma -command stop. | |
vtune: Collection stopped. | |
vtune: Using result path `/tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ma' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 100 % | |
vtune: Executing actions 100 % done | |
HW event-based analysis with uncore events (Perf) | |
Example of analysis types: Memory Access | |
Collection: Ok | |
vtune: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location. | |
-------------------------------------------------------------------------------- | |
Running finalization... | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -finalize -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ma | |
Stderr: | |
vtune: Using result path `/tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ma' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 0 % Finalizing results | |
vtune: Executing actions 0 % Finalizing the result | |
vtune: Executing actions 0 % Clearing the database | |
vtune: Executing actions 14 % Clearing the database | |
vtune: Executing actions 14 % Loading raw data to the database | |
vtune: Executing actions 14 % Loading 'systemcollector-702713-devbig055.ftw5.fa | |
vtune: Executing actions 25 % Loading 'systemcollector-702713-devbig055.ftw5.fa | |
vtune: Executing actions 25 % Loading 'system-wide.perf' file | |
vtune: Executing actions 25 % Loading 'system-wide.stat.perf' file | |
vtune: Executing actions 25 % Updating precomputed scalar metrics | |
vtune: Executing actions 28 % Updating precomputed scalar metrics | |
vtune: Executing actions 28 % Processing profile metrics and debug information | |
vtune: Executing actions 39 % Processing profile metrics and debug information | |
vtune: Executing actions 39 % Setting data model parameters | |
vtune: Executing actions 39 % Resolving module symbols | |
vtune: Executing actions 39 % Resolving information for dangling locations | |
vtune: Executing actions 39 % Resolving information for `matrix' | |
vtune: Executing actions 41 % Resolving information for `matrix' | |
vtune: Executing actions 43 % Resolving information for `matrix' | |
vtune: Executing actions 43 % Resolving information for `vmlinux' | |
vtune: Executing actions 45 % Resolving information for `vmlinux' | |
vtune: Executing actions 45 % Resolving bottom user stack information | |
vtune: Executing actions 46 % Resolving bottom user stack information | |
vtune: Executing actions 46 % Resolving thread name information | |
vtune: Executing actions 47 % Resolving thread name information | |
vtune: Executing actions 48 % Resolving thread name information | |
vtune: Executing actions 48 % Resolving call target names for dynamic code | |
vtune: Executing actions 49 % Resolving call target names for dynamic code | |
vtune: Executing actions 49 % Resolving interrupt name information | |
vtune: Executing actions 53 % Resolving interrupt name information | |
vtune: Executing actions 53 % Processing profile metrics and debug information | |
vtune: Executing actions 54 % Processing profile metrics and debug information | |
vtune: Executing actions 56 % Processing profile metrics and debug information | |
vtune: Executing actions 57 % Processing profile metrics and debug information | |
vtune: Executing actions 58 % Processing profile metrics and debug information | |
vtune: Executing actions 60 % Processing profile metrics and debug information | |
vtune: Executing actions 62 % Processing profile metrics and debug information | |
vtune: Executing actions 63 % Processing profile metrics and debug information | |
vtune: Executing actions 63 % Preparing output tree | |
vtune: Executing actions 63 % Parsing columns in input tree | |
vtune: Executing actions 64 % Parsing columns in input tree | |
vtune: Executing actions 64 % Creating top-level columns | |
vtune: Executing actions 65 % Creating top-level columns | |
vtune: Executing actions 65 % Creating top-level rows | |
vtune: Executing actions 67 % Creating top-level rows | |
vtune: Executing actions 67 % Preparing output tree | |
vtune: Executing actions 67 % Parsing columns in input tree | |
vtune: Executing actions 67 % Creating top-level columns | |
vtune: Executing actions 69 % Creating top-level columns | |
vtune: Executing actions 69 % Creating top-level rows | |
vtune: Executing actions 70 % Creating top-level rows | |
vtune: Executing actions 71 % Creating top-level rows | |
vtune: Executing actions 72 % Creating top-level rows | |
vtune: Executing actions 74 % Creating top-level rows | |
vtune: Executing actions 74 % Preparing output tree | |
vtune: Executing actions 74 % Parsing columns in input tree | |
vtune: Executing actions 74 % Creating top-level columns | |
vtune: Executing actions 76 % Creating top-level columns | |
vtune: Executing actions 76 % Creating top-level rows | |
vtune: Executing actions 77 % Creating top-level rows | |
vtune: Executing actions 78 % Creating top-level rows | |
vtune: Executing actions 78 % Preparing output tree | |
vtune: Executing actions 78 % Parsing columns in input tree | |
vtune: Executing actions 78 % Creating top-level columns | |
vtune: Executing actions 79 % Creating top-level columns | |
vtune: Executing actions 79 % Creating top-level rows | |
vtune: Executing actions 81 % Creating top-level rows | |
vtune: Executing actions 81 % Preparing output tree | |
vtune: Executing actions 81 % Parsing columns in input tree | |
vtune: Executing actions 81 % Creating top-level columns | |
vtune: Executing actions 83 % Creating top-level columns | |
vtune: Executing actions 83 % Creating top-level rows | |
vtune: Executing actions 84 % Creating top-level rows | |
vtune: Executing actions 85 % Creating top-level rows | |
vtune: Executing actions 85 % Preparing output tree | |
vtune: Executing actions 85 % Parsing columns in input tree | |
vtune: Executing actions 85 % Creating top-level columns | |
vtune: Executing actions 86 % Creating top-level columns | |
vtune: Executing actions 86 % Creating top-level rows | |
vtune: Executing actions 88 % Creating top-level rows | |
vtune: Executing actions 88 % Preparing output tree | |
vtune: Executing actions 88 % Parsing columns in input tree | |
vtune: Executing actions 89 % Parsing columns in input tree | |
vtune: Executing actions 89 % Creating top-level columns | |
vtune: Executing actions 90 % Creating top-level columns | |
vtune: Executing actions 90 % Creating top-level rows | |
vtune: Executing actions 91 % Creating top-level rows | |
vtune: Executing actions 92 % Creating top-level rows | |
vtune: Executing actions 92 % Preparing output tree | |
vtune: Executing actions 92 % Parsing columns in input tree | |
vtune: Executing actions 92 % Creating top-level columns | |
vtune: Executing actions 93 % Creating top-level columns | |
vtune: Executing actions 93 % Creating top-level rows | |
vtune: Executing actions 95 % Creating top-level rows | |
vtune: Executing actions 95 % Preparing output tree | |
vtune: Executing actions 95 % Parsing columns in input tree | |
vtune: Executing actions 96 % Parsing columns in input tree | |
vtune: Executing actions 96 % Creating top-level columns | |
vtune: Executing actions 97 % Creating top-level columns | |
vtune: Executing actions 97 % Creating top-level rows | |
vtune: Executing actions 98 % Creating top-level rows | |
vtune: Executing actions 99 % Creating top-level rows | |
vtune: Executing actions 99 % Preparing output tree | |
vtune: Executing actions 99 % Parsing columns in input tree | |
vtune: Executing actions 99 % Creating top-level columns | |
vtune: Executing actions 99 % Creating top-level rows | |
vtune: Executing actions 99 % Preparing output tree | |
vtune: Executing actions 99 % Parsing columns in input tree | |
vtune: Executing actions 99 % Creating top-level columns | |
vtune: Executing actions 99 % Creating top-level rows | |
vtune: Executing actions 99 % Preparing output tree | |
vtune: Executing actions 99 % Parsing columns in input tree | |
vtune: Executing actions 99 % Creating top-level columns | |
vtune: Executing actions 99 % Creating top-level rows | |
vtune: Executing actions 99 % Preparing output tree | |
vtune: Executing actions 99 % Parsing columns in input tree | |
vtune: Executing actions 99 % Creating top-level columns | |
vtune: Executing actions 99 % Creating top-level rows | |
vtune: Executing actions 99 % Preparing output tree | |
vtune: Executing actions 99 % Parsing columns in input tree | |
vtune: Executing actions 99 % Creating top-level columns | |
vtune: Executing actions 99 % Creating top-level rows | |
vtune: Executing actions 99 % Preparing output tree | |
vtune: Executing actions 99 % Parsing columns in input tree | |
vtune: Executing actions 99 % Creating top-level columns | |
vtune: Executing actions 99 % Creating top-level rows | |
vtune: Executing actions 99 % Preparing output tree | |
vtune: Executing actions 99 % Parsing columns in input tree | |
vtune: Executing actions 99 % Creating top-level columns | |
vtune: Executing actions 99 % Creating top-level rows | |
vtune: Executing actions 99 % Preparing output tree | |
vtune: Executing actions 99 % Parsing columns in input tree | |
vtune: Executing actions 99 % Creating top-level columns | |
vtune: Executing actions 99 % Creating top-level rows | |
vtune: Executing actions 99 % Preparing output tree | |
vtune: Executing actions 99 % Parsing columns in input tree | |
vtune: Executing actions 99 % Creating top-level columns | |
vtune: Executing actions 99 % Creating top-level rows | |
vtune: Executing actions 99 % Preparing output tree | |
vtune: Executing actions 99 % Parsing columns in input tree | |
vtune: Executing actions 99 % Creating top-level columns | |
vtune: Executing actions 99 % Creating top-level rows | |
vtune: Executing actions 99 % Preparing output tree | |
vtune: Executing actions 99 % Parsing columns in input tree | |
vtune: Executing actions 99 % Creating top-level columns | |
vtune: Executing actions 99 % Creating top-level rows | |
vtune: Executing actions 99 % Setting data model parameters | |
vtune: Executing actions 99 % Precomputing frequently used data | |
vtune: Executing actions 99 % Precomputing frequently used data | |
vtune: Executing actions 99 % Updating precomputed scalar metrics | |
vtune: Executing actions 99 % Discarding redundant overtime data | |
vtune: Executing actions 99 % Saving the result | |
vtune: Executing actions 100 % Saving the result | |
vtune: Executing actions 100 % done | |
Finalization: Ok | |
-------------------------------------------------------------------------------- | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -limit 5 -format csv -csv-delimiter comma -report hotspots -group-by function -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ma | |
Stdout: | |
Function,CPU Time,Memory Bound(%),Memory Bound:L1 Bound(%),Memory Bound:L2 Bound(%),Memory Bound:L3 Bound(%),Memory Bound:DRAM Bound(%),Memory Bound:Store Bound(%),Loads,Stores,LLC Miss Count,LLC Miss Count:Local DRAM Access Count,LLC Miss Count:Remote DRAM Access Count,LLC Miss Count:Remote Cache Access Count,Average Latency (cycles),Module,Function (Full),Source File,Start Address | |
multiply1,55.084203,71.9,0.0,0.0,63.5,1.8,0.0,21040417712,12816713749,29757138,0,0,20047399,151.517598,matrix,multiply1,multiply.c,0x401550 | |
__read_once_size,0.200470,,0.0,0.0,0.0,0.0,0.0,9979007,0,0,0,0,0,0.0,vmlinux,__read_once_size,compiler.h,0xffffffff8110f674 | |
__read_once_size,0.055129,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0,0,0,0.0,vmlinux,__read_once_size,compiler.h,0xffffffff8110f5d9 | |
__read_seqcount_begin,0.055129,100.0,100.0,0.0,0.0,0.0,0.0,0,0,0,0,0,0,0.0,vmlinux,__read_seqcount_begin,seqlock.h,0xffffffff81132053 | |
prepare_exit_to_usermode,0.055129,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0,0,0,0.0,vmlinux,prepare_exit_to_usermode,common.c,0xffffffff810024a0 | |
Stderr: | |
vtune: Using result path `/tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ma' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 0 % Finalizing results | |
vtune: Executing actions 50 % Finalizing results | |
vtune: Executing actions 50 % Generating a report | |
vtune: Executing actions 50 % Setting data model parameters | |
vtune: Executing actions 75 % Setting data model parameters | |
vtune: Executing actions 75 % Generating a report | |
vtune: Executing actions 100 % Generating a report | |
vtune: Executing actions 100 % done | |
Report: Ok | |
================================================================================ | |
Running collection... | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -collect hotspots -knob sampling-mode=hw -knob enable-stack-collection=true -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ah_with_stacks -data-limit 0 -finalization-mode none -source-search-dir /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/src -- /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/matrix | |
Stdout: | |
Addr of buf1 = 0x7fa0aca00010 | |
Offs of buf1 = 0x7fa0aca00180 | |
Addr of buf2 = 0x7fa0aa9ff010 | |
Offs of buf2 = 0x7fa0aa9ff1c0 | |
Addr of buf3 = 0x7fa0a89fe010 | |
Offs of buf3 = 0x7fa0a89fe100 | |
Addr of buf4 = 0x7fa0a69fd010 | |
Offs of buf4 = 0x7fa0a69fd140 | |
Threads #: 16 Pthreads | |
Matrix size: 2048 | |
Using multiply kernel: multiply1 | |
Execution time = 3.513 seconds | |
Stderr: | |
vtune: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location. | |
vtune: Collection started. To stop the collection, either press CTRL-C or enter from another console window: vtune -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ah_with_stacks -command stop. | |
vtune: Collection stopped. | |
vtune: Using result path `/tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ah_with_stacks' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 100 % | |
vtune: Executing actions 100 % done | |
HW event-based analysis with stacks (Perf) | |
Example of analysis types: Hotspots with knob sampling-mode=hw and knob enable-stack-collection=true, etc. | |
Collection: Ok | |
vtune: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location. | |
-------------------------------------------------------------------------------- | |
Running finalization... | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -finalize -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ah_with_stacks | |
Stderr: | |
vtune: Using result path `/tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ah_with_stacks' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 0 % Finalizing results | |
vtune: Executing actions 0 % Finalizing the result | |
vtune: Executing actions 0 % Clearing the database | |
vtune: Executing actions 14 % Clearing the database | |
vtune: Executing actions 14 % Loading raw data to the database | |
vtune: Executing actions 14 % Loading 'systemcollector-704202-devbig055.ftw5.fa | |
vtune: Executing actions 25 % Loading 'systemcollector-704202-devbig055.ftw5.fa | |
vtune: Executing actions 25 % Loading '704356.perf' file | |
vtune: Executing actions 25 % Updating precomputed scalar metrics | |
vtune: Executing actions 28 % Updating precomputed scalar metrics | |
vtune: Executing actions 28 % Processing profile metrics and debug information | |
vtune: Executing actions 39 % Processing profile metrics and debug information | |
vtune: Executing actions 39 % Setting data model parameters | |
vtune: Executing actions 39 % Resolving module symbols | |
vtune: Executing actions 39 % Resolving information for dangling locations | |
vtune: Executing actions 39 % Resolving information for `libpthread-2.28.so' | |
vtune: Executing actions 39 % Resolving information for `matrix' | |
vtune: Executing actions 39 % Resolving information for `cls_bpf.ko' | |
vtune: Warning: Cannot locate debugging information for file `/usr/lib64/libpthread-2.28.so'. | |
vtune: Executing actions 40 % Resolving information for `cls_bpf.ko' | |
vtune: Executing actions 40 % Resolving information for `libc-2.28.so' | |
vtune: Executing actions 41 % Resolving information for `libc-2.28.so' | |
vtune: Warning: Cannot locate debugging information for file `/usr/lib64/libc-2.28.so'. | |
vtune: Executing actions 42 % Resolving information for `libc-2.28.so' | |
vtune: Executing actions 43 % Resolving information for `libc-2.28.so' | |
vtune: Executing actions 44 % Resolving information for `libc-2.28.so' | |
vtune: Executing actions 44 % Resolving information for `vmlinux' | |
vtune: Executing actions 45 % Resolving information for `vmlinux' | |
vtune: Executing actions 45 % Resolving bottom user stack information | |
vtune: Executing actions 46 % Resolving bottom user stack information | |
vtune: Executing actions 47 % Resolving bottom user stack information | |
vtune: Executing actions 47 % Resolving thread name information | |
vtune: Executing actions 48 % Resolving thread name information | |
vtune: Executing actions 48 % Resolving call target names for dynamic code | |
vtune: Executing actions 50 % Resolving call target names for dynamic code | |
vtune: Executing actions 50 % Resolving interrupt name information | |
vtune: Executing actions 53 % Resolving interrupt name information | |
vtune: Executing actions 53 % Processing profile metrics and debug information | |
vtune: Executing actions 54 % Processing profile metrics and debug information | |
vtune: Executing actions 55 % Processing profile metrics and debug information | |
vtune: Executing actions 56 % Processing profile metrics and debug information | |
vtune: Executing actions 57 % Processing profile metrics and debug information | |
vtune: Executing actions 58 % Processing profile metrics and debug information | |
vtune: Executing actions 60 % Processing profile metrics and debug information | |
vtune: Executing actions 60 % Setting data model parameters | |
vtune: Executing actions 60 % Precomputing frequently used data | |
vtune: Executing actions 60 % Precomputing frequently used data | |
vtune: Executing actions 62 % Precomputing frequently used data | |
vtune: Executing actions 63 % Precomputing frequently used data | |
vtune: Executing actions 64 % Precomputing frequently used data | |
vtune: Executing actions 65 % Precomputing frequently used data | |
vtune: Executing actions 66 % Precomputing frequently used data | |
vtune: Executing actions 67 % Precomputing frequently used data | |
vtune: Executing actions 68 % Precomputing frequently used data | |
vtune: Executing actions 69 % Precomputing frequently used data | |
vtune: Executing actions 71 % Precomputing frequently used data | |
vtune: Executing actions 72 % Precomputing frequently used data | |
vtune: Executing actions 72 % Updating precomputed scalar metrics | |
vtune: Executing actions 75 % Updating precomputed scalar metrics | |
vtune: Executing actions 75 % Discarding redundant overtime data | |
vtune: Executing actions 78 % Discarding redundant overtime data | |
vtune: Executing actions 78 % Saving the result | |
vtune: Executing actions 82 % Saving the result | |
vtune: Executing actions 85 % Saving the result | |
vtune: Executing actions 99 % Saving the result | |
vtune: Executing actions 100 % Saving the result | |
vtune: Executing actions 100 % done | |
Finalization: Ok | |
-------------------------------------------------------------------------------- | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -limit 5 -format csv -csv-delimiter comma -report hotspots -group-by function -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ah_with_stacks | |
Stdout: | |
Function,CPU Time,CPU Time:Effective Time,CPU Time:Effective Time:Idle,CPU Time:Effective Time:Poor,CPU Time:Effective Time:Ok,CPU Time:Effective Time:Ideal,CPU Time:Effective Time:Over,CPU Time:Spin Time,CPU Time:Overhead Time,Instructions Retired,Microarchitecture Usage(%),Microarchitecture Usage:Microarchitecture Usage(%),Microarchitecture Usage:CPI Rate,Module,Function (Full),Source File,Start Address | |
multiply1,53.550606,53.550606,0.010024,53.540583,0.0,0.0,0.0,0.0,0.0,67750000000,0.0,0.0,2.231439,matrix,multiply1,multiply.c,0x401550 | |
page_fault,0.320752,0.320752,0.005012,0.315741,0.0,0.0,0.0,0.0,0.0,80000000,0.0,0.0,12.875000,vmlinux,page_fault,entry_64.S,0xffffffff81c00ea0 | |
apic_timer_interrupt,0.220517,0.220517,0.0,0.220517,0.0,0.0,0.0,0.0,0.0,240000000,0.0,0.0,3.125000,vmlinux,apic_timer_interrupt,entry_64.S,0xffffffff81c01800 | |
call_function_interrupt,0.065153,0.065153,0.0,0.065153,0.0,0.0,0.0,0.0,0.0,20000000,0.0,0.0,15.500000,vmlinux,call_function_interrupt,entry_64.S,0xffffffff81c018c0 | |
pthread_create,0.030071,0.0,0.0,0.0,0.0,0.0,0.0,0.030071,0.0,50000000,0.0,0.0,1.800000,libpthread-2.28.so,pthread_create,[Unknown],0x8430 | |
Stderr: | |
vtune: Using result path `/tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ah_with_stacks' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 0 % Finalizing results | |
vtune: Executing actions 50 % Finalizing results | |
vtune: Executing actions 50 % Generating a report | |
vtune: Executing actions 50 % Setting data model parameters | |
vtune: Executing actions 75 % Setting data model parameters | |
vtune: Executing actions 75 % Generating a report | |
vtune: Executing actions 100 % Generating a report | |
vtune: Executing actions 100 % done | |
Report: Ok | |
================================================================================ | |
Running collection... | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -collect threading -knob sampling-and-waits=hw -knob enable-stack-collection=false -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_th -data-limit 0 -finalization-mode none -source-search-dir /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/src -- /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/matrix | |
Stdout: | |
Addr of buf1 = 0x7ff9cd995010 | |
Offs of buf1 = 0x7ff9cd995180 | |
Addr of buf2 = 0x7ff9cb994010 | |
Offs of buf2 = 0x7ff9cb9941c0 | |
Addr of buf3 = 0x7ff9c9993010 | |
Offs of buf3 = 0x7ff9c9993100 | |
Addr of buf4 = 0x7ff9c7992010 | |
Offs of buf4 = 0x7ff9c7992140 | |
Threads #: 16 Pthreads | |
Matrix size: 2048 | |
Using multiply kernel: multiply1 | |
Execution time = 3.377 seconds | |
Stderr: | |
vtune: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location. | |
vtune: Collection started. To stop the collection, either press CTRL-C or enter from another console window: vtune -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_th -command stop. | |
vtune: Collection stopped. | |
vtune: Using result path `/tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_th' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 100 % | |
vtune: Executing actions 100 % done | |
HW event-based analysis with context switches (Perf) | |
Example of analysis types: Threading with knob sampling-and-waits=hw | |
Collection: Ok | |
vtune: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location. | |
-------------------------------------------------------------------------------- | |
Running finalization... | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -finalize -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_th | |
Stderr: | |
vtune: Using result path `/tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_th' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 0 % Finalizing results | |
vtune: Executing actions 0 % Finalizing the result | |
vtune: Executing actions 0 % Clearing the database | |
vtune: Executing actions 14 % Clearing the database | |
vtune: Executing actions 14 % Loading raw data to the database | |
vtune: Executing actions 14 % Loading 'systemcollector-705829-devbig055.ftw5.fa | |
vtune: Executing actions 25 % Loading 'systemcollector-705829-devbig055.ftw5.fa | |
vtune: Executing actions 25 % Loading '705874.perf' file | |
vtune: Executing actions 25 % Updating precomputed scalar metrics | |
vtune: Executing actions 28 % Updating precomputed scalar metrics | |
vtune: Executing actions 28 % Processing profile metrics and debug information | |
vtune: Executing actions 39 % Processing profile metrics and debug information | |
vtune: Executing actions 39 % Setting data model parameters | |
vtune: Executing actions 39 % Resolving module symbols | |
vtune: Executing actions 39 % Resolving information for `matrix' | |
vtune: Executing actions 41 % Resolving information for `matrix' | |
vtune: Executing actions 41 % Resolving information for `vmlinux' | |
vtune: Executing actions 44 % Resolving information for `vmlinux' | |
vtune: Executing actions 44 % Resolving bottom user stack information | |
vtune: Executing actions 45 % Resolving bottom user stack information | |
vtune: Executing actions 46 % Resolving bottom user stack information | |
vtune: Executing actions 46 % Resolving thread name information | |
vtune: Executing actions 47 % Resolving thread name information | |
vtune: Executing actions 48 % Resolving thread name information | |
vtune: Executing actions 48 % Resolving call target names for dynamic code | |
vtune: Executing actions 49 % Resolving call target names for dynamic code | |
vtune: Executing actions 49 % Resolving interrupt name information | |
vtune: Executing actions 53 % Resolving interrupt name information | |
vtune: Executing actions 53 % Processing profile metrics and debug information | |
vtune: Executing actions 54 % Processing profile metrics and debug information | |
vtune: Executing actions 55 % Processing profile metrics and debug information | |
vtune: Executing actions 56 % Processing profile metrics and debug information | |
vtune: Executing actions 57 % Processing profile metrics and debug information | |
vtune: Executing actions 58 % Processing profile metrics and debug information | |
vtune: Executing actions 60 % Processing profile metrics and debug information | |
vtune: Executing actions 62 % Processing profile metrics and debug information | |
vtune: Executing actions 63 % Processing profile metrics and debug information | |
vtune: Executing actions 63 % Setting data model parameters | |
vtune: Executing actions 64 % Setting data model parameters | |
vtune: Executing actions 64 % Precomputing frequently used data | |
vtune: Executing actions 64 % Precomputing frequently used data | |
vtune: Executing actions 66 % Precomputing frequently used data | |
vtune: Executing actions 67 % Precomputing frequently used data | |
vtune: Executing actions 68 % Precomputing frequently used data | |
vtune: Executing actions 69 % Precomputing frequently used data | |
vtune: Executing actions 70 % Precomputing frequently used data | |
vtune: Executing actions 71 % Precomputing frequently used data | |
vtune: Executing actions 72 % Precomputing frequently used data | |
vtune: Executing actions 73 % Precomputing frequently used data | |
vtune: Executing actions 74 % Precomputing frequently used data | |
vtune: Executing actions 76 % Precomputing frequently used data | |
vtune: Executing actions 76 % Updating precomputed scalar metrics | |
vtune: Executing actions 78 % Updating precomputed scalar metrics | |
vtune: Executing actions 78 % Discarding redundant overtime data | |
vtune: Executing actions 82 % Discarding redundant overtime data | |
vtune: Executing actions 82 % Saving the result | |
vtune: Executing actions 85 % Saving the result | |
vtune: Executing actions 89 % Saving the result | |
vtune: Executing actions 99 % Saving the result | |
vtune: Executing actions 100 % Saving the result | |
vtune: Executing actions 100 % done | |
Finalization: Ok | |
-------------------------------------------------------------------------------- | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -limit 5 -format csv -csv-delimiter comma -report hotspots -group-by function -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_th | |
Stdout: | |
Function,CPU Time,CPU Time:Effective Time,CPU Time:Effective Time:Idle,CPU Time:Effective Time:Poor,CPU Time:Effective Time:Ok,CPU Time:Effective Time:Ideal,CPU Time:Effective Time:Over,CPU Time:Spin Time,CPU Time:Overhead Time,Inactive Wait Time,Inactive Wait Time:Inactive Sync Wait Time,Inactive Wait Time:Inactive Sync Wait Time:Idle,Inactive Wait Time:Inactive Sync Wait Time:Poor,Inactive Wait Time:Inactive Sync Wait Time:Ok,Inactive Wait Time:Inactive Sync Wait Time:Ideal,Inactive Wait Time:Inactive Sync Wait Time:Over,Inactive Wait Time:Preemption Wait Time,Inactive Wait Time:Preemption Wait Time:Idle,Inactive Wait Time:Preemption Wait Time:Poor,Inactive Wait Time:Preemption Wait Time:Ok,Inactive Wait Time:Preemption Wait Time:Ideal,Inactive Wait Time:Preemption Wait Time:Over,Inactive Wait Count,Inactive Wait Count:Inactive Sync Wait Count,Inactive Wait Count:Inactive Sync Wait Count:Idle,Inactive Wait Count:Inactive Sync Wait Count:Poor,Inactive Wait Count:Inactive Sync Wait Count:Ok,Inactive Wait Count:Inactive Sync Wait Count:Ideal,Inactive Wait Count:Inactive Sync Wait Count:Over,Inactive Wait Count:Preemption Wait Count,Inactive Wait Count:Preemption Wait Count:Idle,Inactive Wait Count:Preemption Wait Count:Poor,Inactive Wait Count:Preemption Wait Count:Ok,Inactive Wait Count:Preemption Wait Count:Ideal,Inactive Wait Count:Preemption Wait Count:Over,Module,Function (Full),Source File,Start Address | |
multiply1,51.049737,51.049737,0.005012,51.044726,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,matrix,multiply1,multiply.c,0x401550 | |
__read_once_size,0.130306,0.130306,0.0,0.130306,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,vmlinux,__read_once_size,compiler.h,0xffffffff8110f674 | |
perf_iterate_ctx,0.050118,0.050118,0.0,0.050118,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,vmlinux,perf_iterate_ctx,core.c,0xffffffff811e3dc0 | |
interrupt_entry,0.040094,0.040094,0.0,0.040094,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,vmlinux,interrupt_entry,entry_64.S,0xffffffff81c00910 | |
apic_timer_interrupt,0.030071,0.030071,0.0,0.030071,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,vmlinux,apic_timer_interrupt,entry_64.S,0xffffffff81c01800 | |
Stderr: | |
vtune: Using result path `/tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_th' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 0 % Finalizing results | |
vtune: Executing actions 50 % Finalizing results | |
vtune: Executing actions 50 % Generating a report | |
vtune: Executing actions 50 % Setting data model parameters | |
vtune: Executing actions 75 % Setting data model parameters | |
vtune: Executing actions 75 % Generating a report | |
vtune: Executing actions 100 % Generating a report | |
vtune: Executing actions 100 % done | |
Report: Ok | |
The check observed a product failure on your system. | |
Review errors in the output above to fix a problem or contact Intel technical support. | |
Log location: /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/log.txt | |
Example of analysis types: Hotspots with default knob sampling-mode=sw, Threading with default knob sampling-and-waits=sw | |
Collection: Fail | |
================================================================================ | |
Running collection... | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -collect hotspots -knob sampling-mode=hw -r /tmp/vtune-tmp-user/self-checker-2021.04.09_15.10.52/result_ah -data-limit 0 -finalization-mode none -source-search-dir /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/src -- /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/matrix | |
Stdout: | |
Addr of buf1 = 0x7f83ae032010 | |
Offs of buf1 = 0x7f83ae032180 | |
Addr of buf2 = 0x7f83ac031010 | |
Offs of buf2 = 0x7f83ac0311c0 | |
Addr of buf3 = 0x7f83aa030010 | |
Offs of buf3 = 0x7f83aa030100 | |
Addr of buf4 = 0x7f83a802f010 | |
Offs of buf4 = 0x7f83a802f140 | |
Threads #: 16 Pthreads | |
Matrix size: 2048 | |
Using multiply kernel: multiply1 | |
Execution time = 3.337 seconds | |
Stderr: | |
vtune: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location. | |
vtune: Collection started. To stop the collection, either press CTRL-C or enter from another console window: vtune -r /tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ah -command stop. | |
vtune: Collection stopped. | |
vtune: Using result path `/tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ah' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 100 % | |
vtune: Executing actions 100 % done | |
HW event-based analysis check (Perf) | |
Example of analysis types: Hotspots with knob sampling-mode=hw, HPC Performance Characterization, etc. | |
Collection: Ok | |
vtune: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location. | |
-------------------------------------------------------------------------------- | |
Running finalization... | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -finalize -r /tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ah | |
Stderr: | |
vtune: Using result path `/tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ah' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 0 % Finalizing results | |
vtune: Executing actions 0 % Finalizing the result | |
vtune: Executing actions 0 % Clearing the database | |
vtune: Executing actions 14 % Clearing the database | |
vtune: Executing actions 14 % Loading raw data to the database | |
vtune: Executing actions 14 % Loading 'systemcollector-700676-devbig055.ftw5.fa | |
vtune: Executing actions 25 % Loading 'systemcollector-700676-devbig055.ftw5.fa | |
vtune: Executing actions 25 % Loading '700685.perf' file | |
vtune: Executing actions 25 % Updating precomputed scalar metrics | |
vtune: Executing actions 28 % Updating precomputed scalar metrics | |
vtune: Executing actions 28 % Processing profile metrics and debug information | |
vtune: Executing actions 39 % Processing profile metrics and debug information | |
vtune: Executing actions 39 % Setting data model parameters | |
vtune: Executing actions 39 % Resolving module symbols | |
vtune: Executing actions 39 % Resolving information for `matrix' | |
vtune: Executing actions 41 % Resolving information for `matrix' | |
vtune: Executing actions 41 % Resolving information for `vmlinux' | |
vtune: Executing actions 44 % Resolving information for `vmlinux' | |
vtune: Executing actions 44 % Resolving bottom user stack information | |
vtune: Executing actions 45 % Resolving bottom user stack information | |
vtune: Executing actions 46 % Resolving bottom user stack information | |
vtune: Executing actions 46 % Resolving thread name information | |
vtune: Executing actions 47 % Resolving thread name information | |
vtune: Executing actions 48 % Resolving thread name information | |
vtune: Executing actions 48 % Resolving call target names for dynamic code | |
vtune: Executing actions 49 % Resolving call target names for dynamic code | |
vtune: Executing actions 49 % Resolving interrupt name information | |
vtune: Executing actions 53 % Resolving interrupt name information | |
vtune: Executing actions 53 % Processing profile metrics and debug information | |
vtune: Executing actions 54 % Processing profile metrics and debug information | |
vtune: Executing actions 55 % Processing profile metrics and debug information | |
vtune: Executing actions 56 % Processing profile metrics and debug information | |
vtune: Executing actions 57 % Processing profile metrics and debug information | |
vtune: Executing actions 58 % Processing profile metrics and debug information | |
vtune: Executing actions 60 % Processing profile metrics and debug information | |
vtune: Executing actions 60 % Setting data model parameters | |
vtune: Executing actions 60 % Precomputing frequently used data | |
vtune: Executing actions 60 % Precomputing frequently used data | |
vtune: Executing actions 62 % Precomputing frequently used data | |
vtune: Executing actions 63 % Precomputing frequently used data | |
vtune: Executing actions 64 % Precomputing frequently used data | |
vtune: Executing actions 65 % Precomputing frequently used data | |
vtune: Executing actions 66 % Precomputing frequently used data | |
vtune: Executing actions 67 % Precomputing frequently used data | |
vtune: Executing actions 68 % Precomputing frequently used data | |
vtune: Executing actions 69 % Precomputing frequently used data | |
vtune: Executing actions 71 % Precomputing frequently used data | |
vtune: Executing actions 72 % Precomputing frequently used data | |
vtune: Executing actions 72 % Updating precomputed scalar metrics | |
vtune: Executing actions 75 % Updating precomputed scalar metrics | |
vtune: Executing actions 75 % Discarding redundant overtime data | |
vtune: Executing actions 78 % Discarding redundant overtime data | |
vtune: Executing actions 78 % Saving the result | |
vtune: Executing actions 82 % Saving the result | |
vtune: Executing actions 85 % Saving the result | |
vtune: Executing actions 100 % Saving the result | |
vtune: Executing actions 100 % done | |
Finalization: Ok | |
-------------------------------------------------------------------------------- | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -limit 5 -format csv -csv-delimiter comma -report hotspots -group-by function -r /tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ah | |
Stdout: | |
Function,CPU Time,CPU Time:Effective Time,CPU Time:Effective Time:Idle,CPU Time:Effective Time:Poor,CPU Time:Effective Time:Ok,CPU Time:Effective Time:Ideal,CPU Time:Effective Time:Over,CPU Time:Spin Time,CPU Time:Overhead Time,Instructions Retired,Microarchitecture Usage(%),Microarchitecture Usage:Microarchitecture Usage(%),Microarchitecture Usage:CPI Rate,Module,Function (Full),Source File,Start Address | |
multiply1,50.849268,50.849268,0.0,50.849268,0.0,0.0,0.0,0.0,0.0,67940000000,0.0,0.0,2.183397,matrix,multiply1,multiply.c,0x401550 | |
__read_once_size,0.095223,0.095223,0.0,0.095223,0.0,0.0,0.0,0.0,0.0,10000000,0.0,0.0,35.000000,vmlinux,__read_once_size,compiler.h,0xffffffff8110f674 | |
__read_seqcount_begin,0.045106,0.045106,0.0,0.045106,0.0,0.0,0.0,0.0,0.0,0,0.0,0.0,,vmlinux,__read_seqcount_begin,seqlock.h,0xffffffff81132053 | |
prepare_exit_to_usermode,0.045106,0.045106,0.0,0.045106,0.0,0.0,0.0,0.0,0.0,10000000,0.0,0.0,9.000000,vmlinux,prepare_exit_to_usermode,common.c,0xffffffff810024a0 | |
interrupt_entry,0.035082,0.035082,0.0,0.035082,0.0,0.0,0.0,0.0,0.0,10000000,0.0,0.0,11.000000,vmlinux,interrupt_entry,entry_64.S,0xffffffff81c00910 | |
Stderr: | |
vtune: Using result path `/tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ah' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 0 % Finalizing results | |
vtune: Executing actions 50 % Finalizing results | |
vtune: Executing actions 50 % Generating a report | |
vtune: Executing actions 50 % Setting data model parameters | |
vtune: Executing actions 75 % Setting data model parameters | |
vtune: Executing actions 75 % Generating a report | |
vtune: Executing actions 100 % Generating a report | |
vtune: Executing actions 100 % done | |
Report: Ok | |
================================================================================ | |
Running collection... | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -collect uarch-exploration -r /tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ge -data-limit 0 -finalization-mode none -source-search-dir /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/src -- /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/matrix | |
Stdout: | |
Addr of buf1 = 0x7fa935421010 | |
Offs of buf1 = 0x7fa935421180 | |
Addr of buf2 = 0x7fa933420010 | |
Offs of buf2 = 0x7fa9334201c0 | |
Addr of buf3 = 0x7fa93141f010 | |
Offs of buf3 = 0x7fa93141f100 | |
Addr of buf4 = 0x7fa92f41e010 | |
Offs of buf4 = 0x7fa92f41e140 | |
Threads #: 16 Pthreads | |
Matrix size: 2048 | |
Using multiply kernel: multiply1 | |
Execution time = 3.735 seconds | |
Stderr: | |
vtune: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location. | |
vtune: Collection started. To stop the collection, either press CTRL-C or enter from another console window: vtune -r /tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ge -command stop. | |
vtune: Collection stopped. | |
vtune: Using result path `/tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ge' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 100 % | |
vtune: Executing actions 100 % done | |
HW event-based analysis check (Perf) | |
Example of analysis types: Microarchitecture Exploration | |
Collection: Ok | |
vtune: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location. | |
-------------------------------------------------------------------------------- | |
Running finalization... | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -finalize -r /tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ge | |
Stderr: | |
vtune: Using result path `/tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ge' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 0 % Finalizing results | |
vtune: Executing actions 0 % Finalizing the result | |
vtune: Executing actions 0 % Clearing the database | |
vtune: Executing actions 14 % Clearing the database | |
vtune: Executing actions 14 % Loading raw data to the database | |
vtune: Executing actions 14 % Loading 'systemcollector-701369-devbig055.ftw5.fa | |
vtune: Executing actions 25 % Loading 'systemcollector-701369-devbig055.ftw5.fa | |
vtune: Executing actions 25 % Loading 'system-wide.perf' file | |
vtune: Executing actions 25 % Updating precomputed scalar metrics | |
vtune: Executing actions 28 % Updating precomputed scalar metrics | |
vtune: Executing actions 28 % Processing profile metrics and debug information | |
vtune: Executing actions 39 % Processing profile metrics and debug information | |
vtune: Executing actions 39 % Setting data model parameters | |
vtune: Executing actions 39 % Resolving module symbols | |
vtune: Executing actions 39 % Resolving information for `matrix' | |
vtune: Executing actions 41 % Resolving information for `matrix' | |
vtune: Executing actions 41 % Resolving information for `vmlinux' | |
vtune: Executing actions 44 % Resolving information for `vmlinux' | |
vtune: Executing actions 44 % Resolving bottom user stack information | |
vtune: Executing actions 45 % Resolving bottom user stack information | |
vtune: Executing actions 46 % Resolving bottom user stack information | |
vtune: Executing actions 46 % Resolving thread name information | |
vtune: Executing actions 47 % Resolving thread name information | |
vtune: Executing actions 48 % Resolving thread name information | |
vtune: Executing actions 48 % Resolving call target names for dynamic code | |
vtune: Executing actions 49 % Resolving call target names for dynamic code | |
vtune: Executing actions 49 % Resolving interrupt name information | |
vtune: Executing actions 53 % Resolving interrupt name information | |
vtune: Executing actions 53 % Processing profile metrics and debug information | |
vtune: Executing actions 54 % Processing profile metrics and debug information | |
vtune: Executing actions 56 % Processing profile metrics and debug information | |
vtune: Executing actions 57 % Processing profile metrics and debug information | |
vtune: Executing actions 58 % Processing profile metrics and debug information | |
vtune: Executing actions 60 % Processing profile metrics and debug information | |
vtune: Executing actions 60 % Setting data model parameters | |
vtune: Executing actions 60 % Precomputing frequently used data | |
vtune: Executing actions 60 % Precomputing frequently used data | |
vtune: Executing actions 62 % Precomputing frequently used data | |
vtune: Executing actions 63 % Precomputing frequently used data | |
vtune: Executing actions 64 % Precomputing frequently used data | |
vtune: Executing actions 65 % Precomputing frequently used data | |
vtune: Executing actions 66 % Precomputing frequently used data | |
vtune: Executing actions 67 % Precomputing frequently used data | |
vtune: Executing actions 68 % Precomputing frequently used data | |
vtune: Executing actions 69 % Precomputing frequently used data | |
vtune: Executing actions 71 % Precomputing frequently used data | |
vtune: Executing actions 72 % Precomputing frequently used data | |
vtune: Executing actions 72 % Updating precomputed scalar metrics | |
vtune: Executing actions 75 % Updating precomputed scalar metrics | |
vtune: Executing actions 75 % Discarding redundant overtime data | |
vtune: Executing actions 78 % Discarding redundant overtime data | |
vtune: Executing actions 78 % Saving the result | |
vtune: Executing actions 82 % Saving the result | |
vtune: Executing actions 85 % Saving the result | |
vtune: Executing actions 100 % Saving the result | |
vtune: Executing actions 100 % done | |
Finalization: Ok | |
-------------------------------------------------------------------------------- | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -limit 5 -format csv -csv-delimiter comma -report hotspots -group-by function -r /tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ge | |
Stdout: | |
Function,CPU Time,Clockticks,Instructions Retired,CPI Rate,Retiring(%),Retiring:Light Operations(%),Retiring:Light Operations:FP Arithmetic(%),Retiring:Light Operations:FP Arithmetic:FP x87(%),Retiring:Light Operations:FP Arithmetic:FP Scalar(%),Retiring:Light Operations:FP Arithmetic:FP Vector(%),Retiring:Light Operations:Other(%),Retiring:Heavy Operations(%),Retiring:Heavy Operations:Microcode Sequencer(%),Retiring:Heavy Operations:Microcode Sequencer:Assists(%),Front-End Bound(%),Front-End Bound:Front-End Latency(%),Front-End Bound:Front-End Latency:ICache Misses(%),Front-End Bound:Front-End Latency:ITLB Overhead(%),Front-End Bound:Front-End Latency:Branch Resteers(%),Front-End Bound:Front-End Latency:Branch Resteers:Mispredicts Resteers(%),Front-End Bound:Front-End Latency:Branch Resteers:Clears Resteers(%),Front-End Bound:Front-End Latency:Branch Resteers:Unknown Branches(%),Front-End Bound:Front-End Latency:DSB Switches(%),Front-End Bound:Front-End Latency:Length Changing Prefixes(%),Front-End Bound:Front-End Latency:MS Switches(%),Front-End Bound:Front-End Bandwidth(%),Front-End Bound:Front-End Bandwidth:Front-End Bandwidth MITE(%),Front-End Bound:Front-End Bandwidth:Front-End Bandwidth DSB(%),Front-End Bound:Front-End Bandwidth:(Info) DSB Coverage(%),Bad Speculation(%),Bad Speculation:Branch Mispredict(%),Bad Speculation:Machine Clears(%),Back-End Bound(%),Back-End Bound:Memory Bound(%),Back-End Bound:Memory Bound:L1 Bound(%),Back-End Bound:Memory Bound:L1 Bound:DTLB Overhead(%),Back-End Bound:Memory Bound:L1 Bound:DTLB Overhead:Load STLB Hit(%),Back-End Bound:Memory Bound:L1 Bound:DTLB Overhead:Load STLB Miss(%),Back-End Bound:Memory Bound:L1 Bound:Loads Blocked by Store Forwarding(%),Back-End Bound:Memory Bound:L1 Bound:Lock Latency(%),Back-End Bound:Memory Bound:L1 Bound:Split Loads(%),Back-End Bound:Memory Bound:L1 Bound:4K Aliasing(%),Back-End Bound:Memory Bound:L1 Bound:FB Full(%),Back-End Bound:Memory Bound:L2 Bound(%),Back-End Bound:Memory Bound:L3 Bound(%),Back-End Bound:Memory Bound:L3 Bound:Contested Accesses(%),Back-End Bound:Memory Bound:L3 Bound:Data Sharing(%),Back-End Bound:Memory Bound:L3 Bound:L3 Latency(%),Back-End Bound:Memory Bound:L3 Bound:SQ Full(%),Back-End Bound:Memory Bound:DRAM Bound(%),Back-End Bound:Memory Bound:DRAM Bound:Memory Bandwidth(%),Back-End Bound:Memory Bound:DRAM Bound:Memory Latency(%),Back-End Bound:Memory Bound:DRAM Bound:Memory Latency:Local DRAM(%),Back-End Bound:Memory Bound:DRAM Bound:Memory Latency:Remote DRAM(%),Back-End Bound:Memory Bound:DRAM Bound:Memory Latency:Remote Cache(%),Back-End Bound:Memory Bound:Store Bound(%),Back-End Bound:Memory Bound:Store Bound:Store Latency(%),Back-End Bound:Memory Bound:Store Bound:False Sharing(%),Back-End Bound:Memory Bound:Store Bound:Split Stores(%),Back-End Bound:Memory Bound:Store Bound:DTLB Store Overhead(%),Back-End Bound:Memory Bound:Store Bound:DTLB Store Overhead:Store STLB Hit(%),Back-End Bound:Memory Bound:Store Bound:DTLB Store Overhead:Store STLB Hit(%),Back-End Bound:Core Bound(%),Back-End Bound:Core Bound:Divider(%),Back-End Bound:Core Bound:Port Utilization(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 0 Ports Utilized(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 0 Ports Utilized:Serializing Operations(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 0 Ports Utilized:Mixing Vectors(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 1 Port Utilized(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 2 Ports Utilized(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 3+ Ports Utilized(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 3+ Ports Utilized:ALU Operation Utilization(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 3+ Ports Utilized:ALU Operation Utilization:Port 0(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 3+ Ports Utilized:ALU Operation Utilization:Port 1(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 3+ Ports Utilized:ALU Operation Utilization:Port 5(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 3+ Ports Utilized:ALU Operation Utilization:Port 6(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 3+ Ports Utilized:Load Operation Utilization(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 3+ Ports Utilized:Load Operation Utilization:Port 2(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 3+ Ports Utilized:Load Operation Utilization:Port 3(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 3+ Ports Utilized:Store Operation Utilization(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 3+ Ports Utilized:Store Operation Utilization:Port 4(%),Back-End Bound:Core Bound:Port Utilization:Cycles of 3+ Ports Utilized:Store Operation Utilization:Port 7(%),Back-End Bound:Core Bound:Port Utilization:Vector Capacity Usage (FPU)(%),Average CPU Frequency,Module,Function (Full),Source File,Start Address | |
multiply1,55.259611,148800000000,68760000000,2.164049,11.0,11.4,17.0,0.0,17.0,0.0,83.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,100.0,0.0,-0.0,0.0,89.2,73.1,1.0,98.7,0.0,98.7,0.0,0.0,0.0,0.0,14.3,0.0,63.1,0.0,4.1,100.0,0.0,1.8,44.6,53.3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,16.0,0.0,14.5,33.2,0.0,0.0,8.1,5.3,2.3,22.7,37.9,34.9,6.4,11.4,4.7,5.9,6.7,3.2,3.2,0.0,12.5,2692744237.678215,matrix,multiply1,multiply.c,0x401550 | |
__read_once_size,0.180423,490000000,0,,0.0,0.0,0.0,0.0,0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,38.2,0.0,38.2,61.8,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,61.8,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2715837254.083333,vmlinux,__read_once_size,compiler.h,0xffffffff8110f674 | |
__read_seqcount_begin,0.040094,140000000,0,,0.0,0.0,0.0,0.0,0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,100.0,0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,3491790755.250000,vmlinux,__read_seqcount_begin,seqlock.h,0xffffffff81132053 | |
interrupt_entry,0.035082,110000000,10000000,11.000000,0.0,4.5,0.0,0.0,0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,100.0,0.0,0.0,33.3,0.0,33.3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,3135485576.142857,vmlinux,interrupt_entry,entry_64.S,0xffffffff81c00910 | |
__read_once_size,0.030071,120000000,0,,0.0,0.0,0.0,0.0,0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,100.0,0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,3990618006.000000,vmlinux,__read_once_size,compiler.h,0xffffffff8110f5d9 | |
Stderr: | |
vtune: Using result path `/tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ge' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 0 % Finalizing results | |
vtune: Executing actions 50 % Finalizing results | |
vtune: Executing actions 50 % Generating a report | |
vtune: Executing actions 50 % Setting data model parameters | |
vtune: Executing actions 75 % Setting data model parameters | |
vtune: Executing actions 75 % Generating a report | |
vtune: Executing actions 100 % Generating a report | |
vtune: Executing actions 100 % done | |
Report: Ok | |
================================================================================ | |
Running collection... | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -collect memory-access -r /tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ma -data-limit 0 -finalization-mode none -source-search-dir /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/src -- /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/matrix | |
Stdout: | |
Addr of buf1 = 0x7f60e1486010 | |
Offs of buf1 = 0x7f60e1486180 | |
Addr of buf2 = 0x7f60df485010 | |
Offs of buf2 = 0x7f60df4851c0 | |
Addr of buf3 = 0x7f60dd484010 | |
Offs of buf3 = 0x7f60dd484100 | |
Addr of buf4 = 0x7f60db483010 | |
Offs of buf4 = 0x7f60db483140 | |
Threads #: 16 Pthreads | |
Matrix size: 2048 | |
Using multiply kernel: multiply1 | |
Execution time = 3.688 seconds | |
Stderr: | |
vtune: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location. | |
vtune: Peak bandwidth measurement started. | |
vtune: Peak bandwidth measurement finished. | |
vtune: Collection started. To stop the collection, either press CTRL-C or enter from another console window: vtune -r /tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ma -command stop. | |
vtune: Collection stopped. | |
vtune: Using result path `/tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ma' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 100 % | |
vtune: Executing actions 100 % done | |
HW event-based analysis with uncore events (Perf) | |
Example of analysis types: Memory Access | |
Collection: Ok | |
vtune: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location. | |
-------------------------------------------------------------------------------- | |
Running finalization... | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -finalize -r /tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ma | |
Stderr: | |
vtune: Using result path `/tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ma' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 0 % Finalizing results | |
vtune: Executing actions 0 % Finalizing the result | |
vtune: Executing actions 0 % Clearing the database | |
vtune: Executing actions 14 % Clearing the database | |
vtune: Executing actions 14 % Loading raw data to the database | |
vtune: Executing actions 14 % Loading 'systemcollector-702713-devbig055.ftw5.fa | |
vtune: Executing actions 25 % Loading 'systemcollector-702713-devbig055.ftw5.fa | |
vtune: Executing actions 25 % Loading 'system-wide.perf' file | |
vtune: Executing actions 25 % Loading 'system-wide.stat.perf' file | |
vtune: Executing actions 25 % Updating precomputed scalar metrics | |
vtune: Executing actions 28 % Updating precomputed scalar metrics | |
vtune: Executing actions 28 % Processing profile metrics and debug information | |
vtune: Executing actions 39 % Processing profile metrics and debug information | |
vtune: Executing actions 39 % Setting data model parameters | |
vtune: Executing actions 39 % Resolving module symbols | |
vtune: Executing actions 39 % Resolving information for dangling locations | |
vtune: Executing actions 39 % Resolving information for `matrix' | |
vtune: Executing actions 41 % Resolving information for `matrix' | |
vtune: Executing actions 43 % Resolving information for `matrix' | |
vtune: Executing actions 43 % Resolving information for `vmlinux' | |
vtune: Executing actions 45 % Resolving information for `vmlinux' | |
vtune: Executing actions 45 % Resolving bottom user stack information | |
vtune: Executing actions 46 % Resolving bottom user stack information | |
vtune: Executing actions 46 % Resolving thread name information | |
vtune: Executing actions 47 % Resolving thread name information | |
vtune: Executing actions 48 % Resolving thread name information | |
vtune: Executing actions 48 % Resolving call target names for dynamic code | |
vtune: Executing actions 49 % Resolving call target names for dynamic code | |
vtune: Executing actions 49 % Resolving interrupt name information | |
vtune: Executing actions 53 % Resolving interrupt name information | |
vtune: Executing actions 53 % Processing profile metrics and debug information | |
vtune: Executing actions 54 % Processing profile metrics and debug information | |
vtune: Executing actions 56 % Processing profile metrics and debug information | |
vtune: Executing actions 57 % Processing profile metrics and debug information | |
vtune: Executing actions 58 % Processing profile metrics and debug information | |
vtune: Executing actions 60 % Processing profile metrics and debug information | |
vtune: Executing actions 62 % Processing profile metrics and debug information | |
vtune: Executing actions 63 % Processing profile metrics and debug information | |
vtune: Executing actions 63 % Preparing output tree | |
vtune: Executing actions 63 % Parsing columns in input tree | |
vtune: Executing actions 64 % Parsing columns in input tree | |
vtune: Executing actions 64 % Creating top-level columns | |
vtune: Executing actions 65 % Creating top-level columns | |
vtune: Executing actions 65 % Creating top-level rows | |
vtune: Executing actions 67 % Creating top-level rows | |
vtune: Executing actions 67 % Preparing output tree | |
vtune: Executing actions 67 % Parsing columns in input tree | |
vtune: Executing actions 67 % Creating top-level columns | |
vtune: Executing actions 69 % Creating top-level columns | |
vtune: Executing actions 69 % Creating top-level rows | |
vtune: Executing actions 70 % Creating top-level rows | |
vtune: Executing actions 71 % Creating top-level rows | |
vtune: Executing actions 72 % Creating top-level rows | |
vtune: Executing actions 74 % Creating top-level rows | |
vtune: Executing actions 74 % Preparing output tree | |
vtune: Executing actions 74 % Parsing columns in input tree | |
vtune: Executing actions 74 % Creating top-level columns | |
vtune: Executing actions 76 % Creating top-level columns | |
vtune: Executing actions 76 % Creating top-level rows | |
vtune: Executing actions 77 % Creating top-level rows | |
vtune: Executing actions 78 % Creating top-level rows | |
vtune: Executing actions 78 % Preparing output tree | |
vtune: Executing actions 78 % Parsing columns in input tree | |
vtune: Executing actions 78 % Creating top-level columns | |
vtune: Executing actions 79 % Creating top-level columns | |
vtune: Executing actions 79 % Creating top-level rows | |
vtune: Executing actions 81 % Creating top-level rows | |
vtune: Executing actions 81 % Preparing output tree | |
vtune: Executing actions 81 % Parsing columns in input tree | |
vtune: Executing actions 81 % Creating top-level columns | |
vtune: Executing actions 83 % Creating top-level columns | |
vtune: Executing actions 83 % Creating top-level rows | |
vtune: Executing actions 84 % Creating top-level rows | |
vtune: Executing actions 85 % Creating top-level rows | |
vtune: Executing actions 85 % Preparing output tree | |
vtune: Executing actions 85 % Parsing columns in input tree | |
vtune: Executing actions 85 % Creating top-level columns | |
vtune: Executing actions 86 % Creating top-level columns | |
vtune: Executing actions 86 % Creating top-level rows | |
vtune: Executing actions 88 % Creating top-level rows | |
vtune: Executing actions 88 % Preparing output tree | |
vtune: Executing actions 88 % Parsing columns in input tree | |
vtune: Executing actions 89 % Parsing columns in input tree | |
vtune: Executing actions 89 % Creating top-level columns | |
vtune: Executing actions 90 % Creating top-level columns | |
vtune: Executing actions 90 % Creating top-level rows | |
vtune: Executing actions 91 % Creating top-level rows | |
vtune: Executing actions 92 % Creating top-level rows | |
vtune: Executing actions 92 % Preparing output tree | |
vtune: Executing actions 92 % Parsing columns in input tree | |
vtune: Executing actions 92 % Creating top-level columns | |
vtune: Executing actions 93 % Creating top-level columns | |
vtune: Executing actions 93 % Creating top-level rows | |
vtune: Executing actions 95 % Creating top-level rows | |
vtune: Executing actions 95 % Preparing output tree | |
vtune: Executing actions 95 % Parsing columns in input tree | |
vtune: Executing actions 96 % Parsing columns in input tree | |
vtune: Executing actions 96 % Creating top-level columns | |
vtune: Executing actions 97 % Creating top-level columns | |
vtune: Executing actions 97 % Creating top-level rows | |
vtune: Executing actions 98 % Creating top-level rows | |
vtune: Executing actions 99 % Creating top-level rows | |
vtune: Executing actions 99 % Preparing output tree | |
vtune: Executing actions 99 % Parsing columns in input tree | |
vtune: Executing actions 99 % Creating top-level columns | |
vtune: Executing actions 99 % Creating top-level rows | |
vtune: Executing actions 99 % Preparing output tree | |
vtune: Executing actions 99 % Parsing columns in input tree | |
vtune: Executing actions 99 % Creating top-level columns | |
vtune: Executing actions 99 % Creating top-level rows | |
vtune: Executing actions 99 % Preparing output tree | |
vtune: Executing actions 99 % Parsing columns in input tree | |
vtune: Executing actions 99 % Creating top-level columns | |
vtune: Executing actions 99 % Creating top-level rows | |
vtune: Executing actions 99 % Preparing output tree | |
vtune: Executing actions 99 % Parsing columns in input tree | |
vtune: Executing actions 99 % Creating top-level columns | |
vtune: Executing actions 99 % Creating top-level rows | |
vtune: Executing actions 99 % Preparing output tree | |
vtune: Executing actions 99 % Parsing columns in input tree | |
vtune: Executing actions 99 % Creating top-level columns | |
vtune: Executing actions 99 % Creating top-level rows | |
vtune: Executing actions 99 % Preparing output tree | |
vtune: Executing actions 99 % Parsing columns in input tree | |
vtune: Executing actions 99 % Creating top-level columns | |
vtune: Executing actions 99 % Creating top-level rows | |
vtune: Executing actions 99 % Preparing output tree | |
vtune: Executing actions 99 % Parsing columns in input tree | |
vtune: Executing actions 99 % Creating top-level columns | |
vtune: Executing actions 99 % Creating top-level rows | |
vtune: Executing actions 99 % Preparing output tree | |
vtune: Executing actions 99 % Parsing columns in input tree | |
vtune: Executing actions 99 % Creating top-level columns | |
vtune: Executing actions 99 % Creating top-level rows | |
vtune: Executing actions 99 % Preparing output tree | |
vtune: Executing actions 99 % Parsing columns in input tree | |
vtune: Executing actions 99 % Creating top-level columns | |
vtune: Executing actions 99 % Creating top-level rows | |
vtune: Executing actions 99 % Preparing output tree | |
vtune: Executing actions 99 % Parsing columns in input tree | |
vtune: Executing actions 99 % Creating top-level columns | |
vtune: Executing actions 99 % Creating top-level rows | |
vtune: Executing actions 99 % Preparing output tree | |
vtune: Executing actions 99 % Parsing columns in input tree | |
vtune: Executing actions 99 % Creating top-level columns | |
vtune: Executing actions 99 % Creating top-level rows | |
vtune: Executing actions 99 % Setting data model parameters | |
vtune: Executing actions 99 % Precomputing frequently used data | |
vtune: Executing actions 99 % Precomputing frequently used data | |
vtune: Executing actions 99 % Updating precomputed scalar metrics | |
vtune: Executing actions 99 % Discarding redundant overtime data | |
vtune: Executing actions 99 % Saving the result | |
vtune: Executing actions 100 % Saving the result | |
vtune: Executing actions 100 % done | |
Finalization: Ok | |
-------------------------------------------------------------------------------- | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -limit 5 -format csv -csv-delimiter comma -report hotspots -group-by function -r /tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ma | |
Stdout: | |
Function,CPU Time,Memory Bound(%),Memory Bound:L1 Bound(%),Memory Bound:L2 Bound(%),Memory Bound:L3 Bound(%),Memory Bound:DRAM Bound(%),Memory Bound:Store Bound(%),Loads,Stores,LLC Miss Count,LLC Miss Count:Local DRAM Access Count,LLC Miss Count:Remote DRAM Access Count,LLC Miss Count:Remote Cache Access Count,Average Latency (cycles),Module,Function (Full),Source File,Start Address | |
multiply1,55.084203,71.9,0.0,0.0,63.5,1.8,0.0,21040417712,12816713749,29757138,0,0,20047399,151.517598,matrix,multiply1,multiply.c,0x401550 | |
__read_once_size,0.200470,,0.0,0.0,0.0,0.0,0.0,9979007,0,0,0,0,0,0.0,vmlinux,__read_once_size,compiler.h,0xffffffff8110f674 | |
__read_once_size,0.055129,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0,0,0,0.0,vmlinux,__read_once_size,compiler.h,0xffffffff8110f5d9 | |
__read_seqcount_begin,0.055129,100.0,100.0,0.0,0.0,0.0,0.0,0,0,0,0,0,0,0.0,vmlinux,__read_seqcount_begin,seqlock.h,0xffffffff81132053 | |
prepare_exit_to_usermode,0.055129,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0,0,0,0.0,vmlinux,prepare_exit_to_usermode,common.c,0xffffffff810024a0 | |
Stderr: | |
vtune: Using result path `/tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ma' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 0 % Finalizing results | |
vtune: Executing actions 50 % Finalizing results | |
vtune: Executing actions 50 % Generating a report | |
vtune: Executing actions 50 % Setting data model parameters | |
vtune: Executing actions 75 % Setting data model parameters | |
vtune: Executing actions 75 % Generating a report | |
vtune: Executing actions 100 % Generating a report | |
vtune: Executing actions 100 % done | |
Report: Ok | |
================================================================================ | |
Running collection... | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -collect hotspots -knob sampling-mode=hw -knob enable-stack-collection=true -r /tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ah_with_stacks -data-limit 0 -finalization-mode none -source-search-dir /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/src -- /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/matrix | |
Stdout: | |
Addr of buf1 = 0x7fa0aca00010 | |
Offs of buf1 = 0x7fa0aca00180 | |
Addr of buf2 = 0x7fa0aa9ff010 | |
Offs of buf2 = 0x7fa0aa9ff1c0 | |
Addr of buf3 = 0x7fa0a89fe010 | |
Offs of buf3 = 0x7fa0a89fe100 | |
Addr of buf4 = 0x7fa0a69fd010 | |
Offs of buf4 = 0x7fa0a69fd140 | |
Threads #: 16 Pthreads | |
Matrix size: 2048 | |
Using multiply kernel: multiply1 | |
Execution time = 3.513 seconds | |
Stderr: | |
vtune: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location. | |
vtune: Collection started. To stop the collection, either press CTRL-C or enter from another console window: vtune -r /tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ah_with_stacks -command stop. | |
vtune: Collection stopped. | |
vtune: Using result path `/tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ah_with_stacks' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 100 % | |
vtune: Executing actions 100 % done | |
HW event-based analysis with stacks (Perf) | |
Example of analysis types: Hotspots with knob sampling-mode=hw and knob enable-stack-collection=true, etc. | |
Collection: Ok | |
vtune: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location. | |
-------------------------------------------------------------------------------- | |
Running finalization... | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -finalize -r /tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ah_with_stacks | |
Stderr: | |
vtune: Using result path `/tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ah_with_stacks' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 0 % Finalizing results | |
vtune: Executing actions 0 % Finalizing the result | |
vtune: Executing actions 0 % Clearing the database | |
vtune: Executing actions 14 % Clearing the database | |
vtune: Executing actions 14 % Loading raw data to the database | |
vtune: Executing actions 14 % Loading 'systemcollector-704202-devbig055.ftw5.fa | |
vtune: Executing actions 25 % Loading 'systemcollector-704202-devbig055.ftw5.fa | |
vtune: Executing actions 25 % Loading '704356.perf' file | |
vtune: Executing actions 25 % Updating precomputed scalar metrics | |
vtune: Executing actions 28 % Updating precomputed scalar metrics | |
vtune: Executing actions 28 % Processing profile metrics and debug information | |
vtune: Executing actions 39 % Processing profile metrics and debug information | |
vtune: Executing actions 39 % Setting data model parameters | |
vtune: Executing actions 39 % Resolving module symbols | |
vtune: Executing actions 39 % Resolving information for dangling locations | |
vtune: Executing actions 39 % Resolving information for `libpthread-2.28.so' | |
vtune: Executing actions 39 % Resolving information for `matrix' | |
vtune: Executing actions 39 % Resolving information for `cls_bpf.ko' | |
vtune: Warning: Cannot locate debugging information for file `/usr/lib64/libpthread-2.28.so'. | |
vtune: Executing actions 40 % Resolving information for `cls_bpf.ko' | |
vtune: Executing actions 40 % Resolving information for `libc-2.28.so' | |
vtune: Executing actions 41 % Resolving information for `libc-2.28.so' | |
vtune: Warning: Cannot locate debugging information for file `/usr/lib64/libc-2.28.so'. | |
vtune: Executing actions 42 % Resolving information for `libc-2.28.so' | |
vtune: Executing actions 43 % Resolving information for `libc-2.28.so' | |
vtune: Executing actions 44 % Resolving information for `libc-2.28.so' | |
vtune: Executing actions 44 % Resolving information for `vmlinux' | |
vtune: Executing actions 45 % Resolving information for `vmlinux' | |
vtune: Executing actions 45 % Resolving bottom user stack information | |
vtune: Executing actions 46 % Resolving bottom user stack information | |
vtune: Executing actions 47 % Resolving bottom user stack information | |
vtune: Executing actions 47 % Resolving thread name information | |
vtune: Executing actions 48 % Resolving thread name information | |
vtune: Executing actions 48 % Resolving call target names for dynamic code | |
vtune: Executing actions 50 % Resolving call target names for dynamic code | |
vtune: Executing actions 50 % Resolving interrupt name information | |
vtune: Executing actions 53 % Resolving interrupt name information | |
vtune: Executing actions 53 % Processing profile metrics and debug information | |
vtune: Executing actions 54 % Processing profile metrics and debug information | |
vtune: Executing actions 55 % Processing profile metrics and debug information | |
vtune: Executing actions 56 % Processing profile metrics and debug information | |
vtune: Executing actions 57 % Processing profile metrics and debug information | |
vtune: Executing actions 58 % Processing profile metrics and debug information | |
vtune: Executing actions 60 % Processing profile metrics and debug information | |
vtune: Executing actions 60 % Setting data model parameters | |
vtune: Executing actions 60 % Precomputing frequently used data | |
vtune: Executing actions 60 % Precomputing frequently used data | |
vtune: Executing actions 62 % Precomputing frequently used data | |
vtune: Executing actions 63 % Precomputing frequently used data | |
vtune: Executing actions 64 % Precomputing frequently used data | |
vtune: Executing actions 65 % Precomputing frequently used data | |
vtune: Executing actions 66 % Precomputing frequently used data | |
vtune: Executing actions 67 % Precomputing frequently used data | |
vtune: Executing actions 68 % Precomputing frequently used data | |
vtune: Executing actions 69 % Precomputing frequently used data | |
vtune: Executing actions 71 % Precomputing frequently used data | |
vtune: Executing actions 72 % Precomputing frequently used data | |
vtune: Executing actions 72 % Updating precomputed scalar metrics | |
vtune: Executing actions 75 % Updating precomputed scalar metrics | |
vtune: Executing actions 75 % Discarding redundant overtime data | |
vtune: Executing actions 78 % Discarding redundant overtime data | |
vtune: Executing actions 78 % Saving the result | |
vtune: Executing actions 82 % Saving the result | |
vtune: Executing actions 85 % Saving the result | |
vtune: Executing actions 99 % Saving the result | |
vtune: Executing actions 100 % Saving the result | |
vtune: Executing actions 100 % done | |
Finalization: Ok | |
-------------------------------------------------------------------------------- | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -limit 5 -format csv -csv-delimiter comma -report hotspots -group-by function -r /tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ah_with_stacks | |
Stdout: | |
Function,CPU Time,CPU Time:Effective Time,CPU Time:Effective Time:Idle,CPU Time:Effective Time:Poor,CPU Time:Effective Time:Ok,CPU Time:Effective Time:Ideal,CPU Time:Effective Time:Over,CPU Time:Spin Time,CPU Time:Overhead Time,Instructions Retired,Microarchitecture Usage(%),Microarchitecture Usage:Microarchitecture Usage(%),Microarchitecture Usage:CPI Rate,Module,Function (Full),Source File,Start Address | |
multiply1,53.550606,53.550606,0.010024,53.540583,0.0,0.0,0.0,0.0,0.0,67750000000,0.0,0.0,2.231439,matrix,multiply1,multiply.c,0x401550 | |
page_fault,0.320752,0.320752,0.005012,0.315741,0.0,0.0,0.0,0.0,0.0,80000000,0.0,0.0,12.875000,vmlinux,page_fault,entry_64.S,0xffffffff81c00ea0 | |
apic_timer_interrupt,0.220517,0.220517,0.0,0.220517,0.0,0.0,0.0,0.0,0.0,240000000,0.0,0.0,3.125000,vmlinux,apic_timer_interrupt,entry_64.S,0xffffffff81c01800 | |
call_function_interrupt,0.065153,0.065153,0.0,0.065153,0.0,0.0,0.0,0.0,0.0,20000000,0.0,0.0,15.500000,vmlinux,call_function_interrupt,entry_64.S,0xffffffff81c018c0 | |
pthread_create,0.030071,0.0,0.0,0.0,0.0,0.0,0.0,0.030071,0.0,50000000,0.0,0.0,1.800000,libpthread-2.28.so,pthread_create,[Unknown],0x8430 | |
Stderr: | |
vtune: Using result path `/tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_ah_with_stacks' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 0 % Finalizing results | |
vtune: Executing actions 50 % Finalizing results | |
vtune: Executing actions 50 % Generating a report | |
vtune: Executing actions 50 % Setting data model parameters | |
vtune: Executing actions 75 % Setting data model parameters | |
vtune: Executing actions 75 % Generating a report | |
vtune: Executing actions 100 % Generating a report | |
vtune: Executing actions 100 % done | |
Report: Ok | |
================================================================================ | |
Running collection... | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -collect threading -knob sampling-and-waits=hw -knob enable-stack-collection=false -r /tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_th -data-limit 0 -finalization-mode none -source-search-dir /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/src -- /opt/intel/oneapi/vtune/2021.1.1/samples/en/C++/matrix/matrix | |
Stdout: | |
Addr of buf1 = 0x7ff9cd995010 | |
Offs of buf1 = 0x7ff9cd995180 | |
Addr of buf2 = 0x7ff9cb994010 | |
Offs of buf2 = 0x7ff9cb9941c0 | |
Addr of buf3 = 0x7ff9c9993010 | |
Offs of buf3 = 0x7ff9c9993100 | |
Addr of buf4 = 0x7ff9c7992010 | |
Offs of buf4 = 0x7ff9c7992140 | |
Threads #: 16 Pthreads | |
Matrix size: 2048 | |
Using multiply kernel: multiply1 | |
Execution time = 3.377 seconds | |
Stderr: | |
vtune: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location. | |
vtune: Collection started. To stop the collection, either press CTRL-C or enter from another console window: vtune -r /tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_th -command stop. | |
vtune: Collection stopped. | |
vtune: Using result path `/tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_th' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 100 % | |
vtune: Executing actions 100 % done | |
HW event-based analysis with context switches (Perf) | |
Example of analysis types: Threading with knob sampling-and-waits=hw | |
Collection: Ok | |
vtune: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location. | |
-------------------------------------------------------------------------------- | |
Running finalization... | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -finalize -r /tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_th | |
Stderr: | |
vtune: Using result path `/tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_th' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 0 % Finalizing results | |
vtune: Executing actions 0 % Finalizing the result | |
vtune: Executing actions 0 % Clearing the database | |
vtune: Executing actions 14 % Clearing the database | |
vtune: Executing actions 14 % Loading raw data to the database | |
vtune: Executing actions 14 % Loading 'systemcollector-705829-devbig055.ftw5.fa | |
vtune: Executing actions 25 % Loading 'systemcollector-705829-devbig055.ftw5.fa | |
vtune: Executing actions 25 % Loading '705874.perf' file | |
vtune: Executing actions 25 % Updating precomputed scalar metrics | |
vtune: Executing actions 28 % Updating precomputed scalar metrics | |
vtune: Executing actions 28 % Processing profile metrics and debug information | |
vtune: Executing actions 39 % Processing profile metrics and debug information | |
vtune: Executing actions 39 % Setting data model parameters | |
vtune: Executing actions 39 % Resolving module symbols | |
vtune: Executing actions 39 % Resolving information for `matrix' | |
vtune: Executing actions 41 % Resolving information for `matrix' | |
vtune: Executing actions 41 % Resolving information for `vmlinux' | |
vtune: Executing actions 44 % Resolving information for `vmlinux' | |
vtune: Executing actions 44 % Resolving bottom user stack information | |
vtune: Executing actions 45 % Resolving bottom user stack information | |
vtune: Executing actions 46 % Resolving bottom user stack information | |
vtune: Executing actions 46 % Resolving thread name information | |
vtune: Executing actions 47 % Resolving thread name information | |
vtune: Executing actions 48 % Resolving thread name information | |
vtune: Executing actions 48 % Resolving call target names for dynamic code | |
vtune: Executing actions 49 % Resolving call target names for dynamic code | |
vtune: Executing actions 49 % Resolving interrupt name information | |
vtune: Executing actions 53 % Resolving interrupt name information | |
vtune: Executing actions 53 % Processing profile metrics and debug information | |
vtune: Executing actions 54 % Processing profile metrics and debug information | |
vtune: Executing actions 55 % Processing profile metrics and debug information | |
vtune: Executing actions 56 % Processing profile metrics and debug information | |
vtune: Executing actions 57 % Processing profile metrics and debug information | |
vtune: Executing actions 58 % Processing profile metrics and debug information | |
vtune: Executing actions 60 % Processing profile metrics and debug information | |
vtune: Executing actions 62 % Processing profile metrics and debug information | |
vtune: Executing actions 63 % Processing profile metrics and debug information | |
vtune: Executing actions 63 % Setting data model parameters | |
vtune: Executing actions 64 % Setting data model parameters | |
vtune: Executing actions 64 % Precomputing frequently used data | |
vtune: Executing actions 64 % Precomputing frequently used data | |
vtune: Executing actions 66 % Precomputing frequently used data | |
vtune: Executing actions 67 % Precomputing frequently used data | |
vtune: Executing actions 68 % Precomputing frequently used data | |
vtune: Executing actions 69 % Precomputing frequently used data | |
vtune: Executing actions 70 % Precomputing frequently used data | |
vtune: Executing actions 71 % Precomputing frequently used data | |
vtune: Executing actions 72 % Precomputing frequently used data | |
vtune: Executing actions 73 % Precomputing frequently used data | |
vtune: Executing actions 74 % Precomputing frequently used data | |
vtune: Executing actions 76 % Precomputing frequently used data | |
vtune: Executing actions 76 % Updating precomputed scalar metrics | |
vtune: Executing actions 78 % Updating precomputed scalar metrics | |
vtune: Executing actions 78 % Discarding redundant overtime data | |
vtune: Executing actions 82 % Discarding redundant overtime data | |
vtune: Executing actions 82 % Saving the result | |
vtune: Executing actions 85 % Saving the result | |
vtune: Executing actions 89 % Saving the result | |
vtune: Executing actions 99 % Saving the result | |
vtune: Executing actions 100 % Saving the result | |
vtune: Executing actions 100 % done | |
Finalization: Ok | |
-------------------------------------------------------------------------------- | |
Command line: | |
/opt/intel/oneapi/vtune/2021.1.1/bin64/vtune -limit 5 -format csv -csv-delimiter comma -report hotspots -group-by function -r /tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_th | |
Stdout: | |
Function,CPU Time,CPU Time:Effective Time,CPU Time:Effective Time:Idle,CPU Time:Effective Time:Poor,CPU Time:Effective Time:Ok,CPU Time:Effective Time:Ideal,CPU Time:Effective Time:Over,CPU Time:Spin Time,CPU Time:Overhead Time,Inactive Wait Time,Inactive Wait Time:Inactive Sync Wait Time,Inactive Wait Time:Inactive Sync Wait Time:Idle,Inactive Wait Time:Inactive Sync Wait Time:Poor,Inactive Wait Time:Inactive Sync Wait Time:Ok,Inactive Wait Time:Inactive Sync Wait Time:Ideal,Inactive Wait Time:Inactive Sync Wait Time:Over,Inactive Wait Time:Preemption Wait Time,Inactive Wait Time:Preemption Wait Time:Idle,Inactive Wait Time:Preemption Wait Time:Poor,Inactive Wait Time:Preemption Wait Time:Ok,Inactive Wait Time:Preemption Wait Time:Ideal,Inactive Wait Time:Preemption Wait Time:Over,Inactive Wait Count,Inactive Wait Count:Inactive Sync Wait Count,Inactive Wait Count:Inactive Sync Wait Count:Idle,Inactive Wait Count:Inactive Sync Wait Count:Poor,Inactive Wait Count:Inactive Sync Wait Count:Ok,Inactive Wait Count:Inactive Sync Wait Count:Ideal,Inactive Wait Count:Inactive Sync Wait Count:Over,Inactive Wait Count:Preemption Wait Count,Inactive Wait Count:Preemption Wait Count:Idle,Inactive Wait Count:Preemption Wait Count:Poor,Inactive Wait Count:Preemption Wait Count:Ok,Inactive Wait Count:Preemption Wait Count:Ideal,Inactive Wait Count:Preemption Wait Count:Over,Module,Function (Full),Source File,Start Address | |
multiply1,51.049737,51.049737,0.005012,51.044726,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,matrix,multiply1,multiply.c,0x401550 | |
__read_once_size,0.130306,0.130306,0.0,0.130306,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,vmlinux,__read_once_size,compiler.h,0xffffffff8110f674 | |
perf_iterate_ctx,0.050118,0.050118,0.0,0.050118,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,vmlinux,perf_iterate_ctx,core.c,0xffffffff811e3dc0 | |
interrupt_entry,0.040094,0.040094,0.0,0.040094,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,vmlinux,interrupt_entry,entry_64.S,0xffffffff81c00910 | |
apic_timer_interrupt,0.030071,0.030071,0.0,0.030071,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,vmlinux,apic_timer_interrupt,entry_64.S,0xffffffff81c01800 | |
Stderr: | |
vtune: Using result path `/tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/result_th' | |
vtune: Executing actions 0 % | |
vtune: Executing actions 0 % Finalizing results | |
vtune: Executing actions 50 % Finalizing results | |
vtune: Executing actions 50 % Generating a report | |
vtune: Executing actions 50 % Setting data model parameters | |
vtune: Executing actions 75 % Setting data model parameters | |
vtune: Executing actions 75 % Generating a report | |
vtune: Executing actions 100 % Generating a report | |
vtune: Executing actions 100 % done | |
Report: Ok | |
The check observed a product failure on your system. | |
Review errors in the output above to fix a problem or contact Intel technical support. | |
Log location: /tmp/vtune-tmp-atem/self-checker-2021.04.09_15.10.52/log.txt |