@larryxiao
Created August 31, 2014 07:56
SCC SJTU
# Performance Monitoring Events for Generation Intel Core Processors Code Name Knights Corner Core-KNC V4
# 8/9/2013 7:33:41 AM
CODE UMASK NAME DESCRIPTION COMMON COUNTER OVERFLOW OTHER DEFAULT EM_TRIGGER
0x00 0x00 DATA_READ Number of memory data reads which hit the internal data cache (L1). Cache accesses resulting from prefetch instructions are included. 20013 "0,1" 1000003 0x53 0 0
0x01 0x00 DATA_WRITE Number of memory data writes which hit the internal data cache (L1). 20013 "0,1" 1000003 0x53 0 0
0x02 0x00 DATA_PAGE_WALK Number of data page walks 20013 "0,1" 1000003 0x53 0 0
0x03 0x00 DATA_READ_MISS Number of memory read accesses that miss the internal data cache whether or not the access is cacheable or noncacheable. Cache accesses resulting from prefetch instructions are included. 20013 "0,1" 1000003 0x53 0 0
0x04 0x00 DATA_WRITE_MISS Number of memory write accesses that miss the internal data cache whether or not the access is cacheable or noncacheable 20013 "0,1" 1000003 0x53 0 0
0x06 0x00 DATA_CACHE_LINES_WRITTEN_BACK "Number of dirty lines (all) that are written back, regardless of the cause" 20013 "0,1" 1000003 0x53 0 0
0x09 0x00 MEMORY_ACCESSES_IN_BOTH_PIPES Number of data memory reads or writes that are paired in both pipes of the pipeline 20013 "0,1" 1000003 0x53 0 0
0x0A 0x00 BANK_CONFLICTS Number of actual bank conflicts 20013 "0,1" 1000003 0x53 0 0
0x0C 0x00 CODE_READ Number of instruction reads; whether the read is cacheable or noncacheable 20013 "0,1" 1000003 0x53 0 0
0x11 0x00 L1_DATA_PF1 Number of data vprefetch1 requests seen by the L1. 20013 "0,1" 1000003 0x53 0 0
0x12 0x00 BRANCHES "Number of taken and not taken branches, including: conditional branches, jumps, calls, returns, software interrupts, and interrupt returns" 20013 "0,1" 1000003 0x53 0 0
0x15 0x00 PIPELINE_FLUSHES Number of pipeline flushes that occur 20013 "0,1" 1000003 0x53 0 0
0x16 0x00 INSTRUCTIONS_EXECUTED Number of instructions executed (up to two per clock) 20013 "0,1" 2000003 0x53 1 0
0x17 0x00 INSTRUCTIONS_EXECUTED_V_PIPE Number of instructions executed in the V_pipe. The event indicates the number of instructions that were paired. 20013 "0,1" 1000003 0x53 0 0
0x1C 0x00 L1_DATA_PF1_MISS Number of data vprefetch1 requests seen by the L1 which missed L1. 20013 "0,1" 1000003 0x53 0 0
0x1F 0x00 PIPELINE_AGI_STALLS Number of address generation interlock (AGI) stalls. An AGI occurring in both the U- and V- pipelines in the same clock signals this event twice. 20013 "0,1" 1000003 0x53 0 0
0x20 0x00 L1_DATA_HIT_INFLIGHT_PF1 Number of data requests which hit an in-flight vprefetch1. The in-flight vprefetch1 was not necessarily issued from the same thread as the data request. 20013 "0,1" 1000003 0x53 0 0
0x21 0x00 PIPELINE_SG_AGI_STALLS Number of address generation interlock (AGI) stalls due to vscatter* and vgather* instructions. 20013 "0,1" 1000003 0x53 0 0
0x27 0x00 HARDWARE_INTERRUPTS Number of taken INTR and NMI interrupts 20013 "0,1" 1000003 0x53 0 0
0x28 0x00 DATA_READ_OR_WRITE Number of memory data reads and/or writes (internal data cache hit and miss combined) 20013 "0,1" 1000003 0x53 0 0
0x29 0x00 DATA_READ_MISS_OR_WRITE_MISS "Number of memory read and/or write accesses that miss the internal data cache, whether or not the access is cacheable or noncacheable" 20013 "0,1" 1000003 0x53 0 0
0x2A 0x00 CPU_CLK_UNHALTED Number of cycles during which the processor is not halted. 20013 "0,1" 2000003 0x53 1 2
0x2B 0x00 BRANCHES_MISPREDICTED Number of branch mispredictions that occurred on BTB hits. BTB misses are not considered branch mispredicts because no prediction exists for them yet. 20013 "0,1" 1000003 0x53 0 0
0x2C 0x00 MICROCODE_CYCLES "The number of cycles microcode is executing. While microcode is executing, all other threads are stalled." 20013 "0,1" 1000003 0x53 0 0
0x2D 0x00 FE_STALLED "Number of cycles where the front-end could not advance. Any multi-cycle instructions which delay pipeline advance and apply backpressure to the front-end will be included, e.g. read-modify-write instructions. Includes cycles when the front-end did not hav" 20013 "0,1" 1000003 0x53 0 0
0x2E 0x00 EXEC_STAGE_CYCLES "Number of E-stage cycles that were successfully completed. Includes cycles generated by multi-cycle E-stage instructions. For instructions destined for the FPU or VPU pipelines, this event only counts occupancy in the integer E-stage." 20013 "0,1" 1000003 0x53 0 0
0x37 0x00 L1_DATA_PF2 Number of data vprefetch2 requests seen by the L1. This is not necessarily the same number as seen by the L2 because this count includes requests that are dropped by the core. A vprefetch2 can be dropped by the core if the requested address matches anothe 20013 "0,1" 1000003 0x53 0 0
0x3A 0x00 LONG_DATA_PAGE_WALK "Number of ''long'' data page walks, i.e. page walks that also missed the L2 uTLB. Subset of DATA_PAGE_WALK event" 20013 "0,1" 1000003 0x53 0 0
0xC4 0x10 HWP_L2MISS Hardware Prefetch L2 MISS 20013 0 1000003 0x53 0 0
0xC8 0x10 L2_READ_HIT_E L2 Read Hit E State 20013 0 1000003 0x53 0 0
0xC9 0x10 L2_READ_HIT_M L2 Read Hit M State 20013 0 1000003 0x53 0 0
0xCA 0x10 L2_READ_HIT_S L2 Read Hit S State 20013 0 1000003 0x53 0 0
0xCB 0x10 L2_READ_MISS L2 Read Miss 20013 0 1000003 0x53 0 0
0xCC 0x10 L2_WRITE_HIT L2 Write HIT 20013 0 1000003 0x53 0 0
0xD7 0x10 L2_VICTIM_REQ_WITH_DATA L2 received a victim request and responded with data 20013 0 1000003 0x53 0 0
0xE6 0x10 SNP_HIT_L2 Snoop HIT in L2 20013 0 1000003 0x53 0 0
0xE7 0x10 SNP_HITM_L2 Snoop HITM in L2 20013 0 1000003 0x53 0 0
0xF1 0x10 L2_DATA_READ_MISS_CACHE_FILL Number of data read accesses that missed the L2 cache and were satisfied by another L2 cache. Can include promoted read misses that started as CODE accesses. 20013 0 1000003 0x53 0 0
0xF2 0x10 L2_DATA_WRITE_MISS_CACHE_FILL Number of data write (RFO) accesses that missed the L2 cache and were satisfied by another L2 cache. 20013 0 1000003 0x53 0 0
0xF6 0x10 L2_DATA_READ_MISS_MEM_FILL Number of data read accesses that missed the L2 cache and were satisfied by main memory. Can include promoted read misses that started as CODE accesses. 20013 0 1000003 0x53 0 0
0xF7 0x10 L2_DATA_WRITE_MISS_MEM_FILL Number of data write (RFO) accesses that missed the L2 cache and were satisfied by main memory. 20013 0 1000003 0x53 0 0
0xFC 0x10 L2_DATA_PF2 Number of data vprefetch2 requests seen by the L2. 20013 0 1000003 0x53 0 0
0xFD 0x10 L2_DATA_PF2_MISS Number of data vprefetch2 requests seen by the L2 which missed L2. 20013 0 1000003 0x53 0 0
0x00 0x20 VPU_DATA_READ "Number of read transactions that were issued. In general each read transaction will read 1 64B cacheline. If there are alignment issues, then reads against multiple cache lines will each be counted individually." 20013 "0,1" 1000003 0x53 0 0
0x01 0x20 VPU_DATA_WRITE "Number of write transactions that were issued. In general each write transaction will write 1 64B cacheline. If there are alignment issues, then writes against multiple cache lines will each be counted individually." 20013 "0,1" 1000003 0x53 0 0
0x03 0x20 VPU_DATA_READ_MISS VPU L1 data cache read miss. Counts the number of occurrences. 20013 "0,1" 1000003 0x53 0 0
0x04 0x20 VPU_DATA_WRITE_MISS VPU L1 data cache write miss. Counts the number of occurrences. 20013 "0,1" 1000003 0x53 0 0
0x05 0x20 VPU_STALL_REG "VPU stall on Register Dependency. Counts the number of occurrences. Dependencies will include RAW, WAW, WAR." 20013 "0,1" 1000003 0x53 0 0
0x16 0x20 VPU_INSTRUCTIONS_EXECUTED Counts the number of VPU instructions executed in both u- and v-pipes. 20013 "0,1" 1000003 0x53 0 0
0x17 0x20 VPU_INSTRUCTIONS_EXECUTED_V_PIPE Counts the number of VPU instructions that paired and executed in the v-pipe. 20013 "0,1" 1000003 0x53 0 0
0x18 0x20 VPU_ELEMENTS_ACTIVE tbd 20013 0 1000003 0x53 0 0
0xCE 0x10 L2_STRONGLY_ORDERED_STREAMING_VSTORES_MISS Number of strongly ordered streaming vector stores that missed the L2 and were sent to the ring. 20013 0 1000000 0x53 0 0
0xCF 0x10 L2_WEAKLY_ORDERED_STREAMING_VSTORE_MISS Number of weakly ordered streaming vector stores that missed the L2 and were sent to the ring. 20013 0 1000000 0x53 0 0
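The rows above share a fixed layout: three leading fields (CODE, UMASK, NAME), a free-text DESCRIPTION, and six trailing single-token fields (COMMON, COUNTER, OVERFLOW, OTHER, DEFAULT, EM_TRIGGER). A minimal Python sketch for reading one row into a record might look like the following. The names `PmuEvent` and `parse_event_line` are hypothetical and not part of any Intel tool, and the sketch assumes fields are separated by whitespace (tabs or spaces) and that only the description can contain spaces.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PmuEvent:
    """One row of the event list (hypothetical record type for illustration)."""
    code: int
    umask: int
    name: str
    description: str
    common: int
    counters: Tuple[int, ...]
    overflow: int
    other: int
    default: int
    em_trigger: int


def parse_event_line(line: str) -> Optional[PmuEvent]:
    """Parse one data row; return None for comments, the header, or other text.

    Assumption: every field except DESCRIPTION is a single whitespace-free
    token, so the first three and last six tokens can be peeled off and the
    remainder rejoined as the description.
    """
    tokens = line.split()
    # Data rows start with a hex event code and have at least 3 + 6 fields.
    if len(tokens) < 9 or not tokens[0].lower().startswith("0x"):
        return None

    code, umask, name = tokens[:3]
    common, counter, overflow, other, default, em_trigger = tokens[-6:]

    # Descriptions containing commas are wrapped in double quotes in the file.
    description = " ".join(tokens[3:-6]).strip('"').strip()
    # COUNTER is either a bare index (0) or a quoted list ("0,1").
    counters = tuple(int(c) for c in counter.strip('"').split(","))

    return PmuEvent(
        code=int(code, 16),
        umask=int(umask, 16),
        name=name,
        description=description,
        common=int(common),
        counters=counters,
        overflow=int(overflow),
        other=int(other, 16),
        default=int(default),
        em_trigger=int(em_trigger),
    )
```

Splitting on whitespace and peeling fields from both ends avoids depending on whether the file uses tabs or spaces. Rows whose description is truncated in the file still parse, since the description is simply whatever lies between the third and the sixth-from-last token.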