@thakkarV
Last active September 17, 2020 09:52

PMC diff between Zen1 and Zen2:

Format Notes:

{CounterType} 0x{CounterCode} : {Status according to the PPR} ({CounterName})
    {OptionalNotes}
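
The entry format above is regular enough to parse mechanically. The following is a minimal sketch, assuming each entry line looks like the ones in this note (optional check mark, counter type, hex code, status text, optional counter name in trailing parentheses, optional `@done(...)` tag). The names `ENTRY_RE` and `parse_entry` are hypothetical and not part of the patchset.

```python
import re

# Hypothetical pattern for one entry line of this note, for example:
#   "✔ PMC 0x0E : Added with 4 sub-counters (FP Dispatch Faults) @done(19-12-19 13:33)"
ENTRY_RE = re.compile(
    r"^\s*(?:✔\s+)?"                          # optional check mark
    r"(?P<type>\w+)\s+0x(?P<code>[0-9A-Fa-f]+)"  # counter type and hex code
    r"\s*:\s*(?P<rest>.*?)"                   # status text, maybe with a name
    r"(?:\s*@done\((?P<done>[^)]*)\))?\s*$"   # optional @done(...) tag
)


def parse_entry(line):
    """Return a dict describing one entry line, or None if it does not match."""
    m = ENTRY_RE.match(line)
    if m is None:
        return None
    rest = m.group("rest").strip()
    status, name = rest, None
    # The counter name, when present, is the trailing parenthesised text.
    # Splitting at the first "(" keeps nested parentheses inside the name.
    idx = rest.find("(")
    if idx != -1 and rest.endswith(")"):
        status = rest[:idx].strip()
        name = rest[idx + 1:-1]
    return {
        "type": m.group("type"),
        "code": int(m.group("code"), 16),
        "status": status,
        "name": name,
        "done": m.group("done"),
    }


entry = parse_entry(
    "✔ PMC 0x0E : Added with 4 sub-counters (FP Dispatch Faults) @done(19-12-19 13:33)"
)
```

Indented note lines do not start with a counter type and hex code, so `parse_entry` returns `None` for them and a caller can attach them to the preceding entry.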

FP Events:

✔ PMC 0x00 : Removed (FPU Pipe Assignment) @done(19-12-19 13:29)
    fpu_pipe_assignment.dual: Measurable but always samples to zero in zen2, removed in this patchset. Sub-counters by pipe missing in zen1, added in this patchset.
    fpu_pipe_assignment.total: Measurable and non-zero in zen2, not removed in this patchset. Sub-counters missing in zen1, added for both zen1 and zen2.
✔ PMC 0x01 : Removed (FP Scheduler Empty) @done(19-12-19 13:29)
    Measurable but always samples to zero in zen2, removed in this patchset. Confirmed working for zen1.
✔ PMC 0x02 : Removed (Retired x87 Floating Point Operations) @done(19-12-19 13:29)
    Measurable but always samples to zero in zen2, removed in this patchset.
✔ PMC 0x03 : Removed top 4 sub-counters for sp/dp split, merged into bottom for (SP and DP flops) @done(19-12-19 13:30)
    As per f17m71 PPR section 2.1.15.3 (Large Increment per Cycle Events), this uses merge counters PMCxFFF and requires an MSR write to MSR::PERF_CTL. Current mainline for zen1 does not seem to use the MSR, and I cannot test whether this produces actual output on zen1 systems. Confirmed working.
✔ PMC 0x04 : Removed (Number of Move Elimination and Scalar Op Optimization) @done(19-12-19 13:32)
    Measurable and non-zero in zen2, not removed in this patchset. Confirmed working.
✔ PMC 0x05 : Changed; reversed sub-counter order for bottom nibble (Retired Serializing Ops) @done(19-12-19 13:32)
✔ PMC 0x0E : Added with 4 sub-counters (FP Dispatch Faults) @done(19-12-19 13:33)
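
The note on PMC 0x03 mentions the merge event PMCxFFF and a write to MSR::PERF_CTL. An event select that wide does not fit into a single byte, so the sketch below shows how a 12-bit event select and an 8-bit unit mask could be packed. It assumes the usual AMD PERF_CTL layout (EventSelect[7:0] in bits 7:0, UnitMask in bits 15:8, EventSelect[11:8] in bits 35:32), which should be checked against the PPR. The function name `perf_ctl_event_bits` is hypothetical, the unit mask 0x0F is only an illustrative value, and the enable, user and OS bits are deliberately left out.

```python
def perf_ctl_event_bits(event_select, unit_mask=0):
    """Pack event select and unit mask into their PERF_CTL bit positions.

    Assumed layout (verify against the PPR before relying on it):
      bits  7:0   EventSelect[7:0]
      bits 15:8   UnitMask
      bits 35:32  EventSelect[11:8]
    Control bits (enable, user/OS mode, and so on) are not set here.
    """
    if not 0 <= event_select <= 0xFFF:
        raise ValueError("event_select must fit in 12 bits")
    if not 0 <= unit_mask <= 0xFF:
        raise ValueError("unit_mask must fit in 8 bits")
    low = event_select & 0xFF           # EventSelect[7:0]
    high = (event_select >> 8) & 0xF    # EventSelect[11:8]
    return low | (unit_mask << 8) | (high << 32)


# PMC 0x03 with an illustrative unit mask covering the bottom nibble.
flops_bits = perf_ctl_event_bits(0x03, 0x0F)
# The merge event PMCxFFF with no unit mask.
merge_bits = perf_ctl_event_bits(0xFFF)
```

Events with codes above 0xFF, such as the merge event, only differ from the 8-bit case in the upper nibble that lands in bits 35:32.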

LS Events:

✔ PMC 0x24 : Added as one sub-counter (STLI other) @done(19-12-19 13:38)
✔ PMC 0x26 : Added (Retired CLFLUSH Instructions) @done(19-12-19 23:11)
✔ PMC 0x29 : Changed sub-counter names for more consistency (Load/Store Dispatch) @done(19-12-23 21:18)
✔ PMC 0x27 : Added (Retired CPUID Instructions) @done(19-12-19 23:11)
✔ PMC 0x2B : Added (SMIs Received) @done(19-12-19 23:12)
✔ PMC 0x2C : Added (Interrupts Taken) @done(19-12-19 23:12)
✔ PMC 0x2D : Added (Time Stamp Counter Reads) @done(19-12-19 23:12)
✔ PMC 0x41 : Added in zen1 and zen2 (MAB Allocation by Pipe) @done(19-12-19 23:17)
    Added for both zen1 and zen2 in this patchset. Confirmed working.
✔ PMC 0x43 : Added with 5 sub-counters (Data Cache Refills from System) @done(19-12-20 06:00)
✔ PMC 0x46 : Removed (Tablewalker Allocation) @done(19-12-20 06:01)
    Measurable and non-zero in zen2, not removed in this patchset. Confirmed working.
✔ PMC 0x52 : Changed 2 sub-counter brief descriptions @done(19-12-20 06:05)
✔ PMC 0x59 : Added with 5 sub-counters (Software Prefetch Data Cache Fills) @done(19-12-20 06:08)
✔ PMC 0x5A : Added with 5 sub-counters (Hardware Prefetch Data Cache Fills) @done(19-12-20 09:10)
✔ PMC 0x78 : Added (All TLB Flushes) @done(19-12-20 09:10)

IC and BP Events:

✔ PMC 0x80 : Removed (32 Byte Instruction Cache Fetch) @done(19-12-20 09:12)
    Measurable and non-zero in zen2, not removed in this patchset. Confirmed working.
✔ PMC 0x81 : Removed (32 Byte Instruction Cache Misses) @done(19-12-20 09:12)
    Measurable and non-zero in zen2, not removed in this patchset. Confirmed working.
✔ PMC 0x85 : Added with 3 sub-counters (L1 ITLB Miss, L2 ITLB Miss) @done(19-12-20 09:22)
    ✔ @TODO: Make sure the umask 0xFF works as a cumulative of sub-counters @done(19-12-22 05:05)
✔ PMC 0x86 : Removed (Pipeline Restart Due to Instruction Stream Probe) @done(19-12-20 09:26)
    Measurable and non-zero in zen2, but not in PPR for f17m01::B2 or f17m71::B1. Not removed in this patchset from either zen1 or zen2. Confirmed working.
✔ PMC 0x87 : Removed (Instruction Pipe Stall) @done(19-12-20 09:26)
    Measurable and non-zero in zen2, but not in PPR for f17m01::B2 or f17m71::B1. Not removed in this patchset from either zen1 or zen2. Confirmed working.
✔ PMC 0x8C : Removed (Instruction Cache Lines Invalidated) @done(19-12-20 09:26)
    Measurable and non-zero in zen2, not removed in this patchset. Confirmed working.
✔ PMC 0x8E : Added (Dynamic Indirect Predictions) @done(19-12-20 09:32)
✔ PMC 0x91 : Added (Decoder Overrides Existing Branch Prediction (speculative)) @done(19-12-20 09:34)
✔ PMC 0x94 : Added with 3 sub-counters (ITLB Instruction Fetch Hits) @done(19-12-20 09:40)
    ✔ @TODO: Make sure the umask 0xFF works as a cumulative of sub-counters @done(19-12-22 05:16)
✔ PMC 0x99 : Removed (ITLB Reloads) @done(19-12-20 09:27)
    Measurable and non-zero in zen2, not removed in this patchset but moved from cache.json to branch.json due to name prefix `bp`. Confirmed working.
✔ PMC 0x28A : Removed (IcOcModeSwitch) @done(19-12-21 01:57)
    Measurable and non-zero in zen2, not removed in this patchset but moved to cache.json due to semantic grouping. Confirmed working.
✔ PMC 0xA9 : Added (Micro-Op Queue Empty) @done(19-12-21 08:29)
✔ PMC 0xAA : Added with 2 sub-counters (UOps Dispatched From Decoder) @done(19-12-21 08:29)
    ✔ @TODO: Make sure the umask 0xFF works as a cumulative of sub-counters @done(19-12-22 05:11)
✔ PMC 0xAE : Added with 8 sub-counters (Dispatch Resource Stall Cycles 1) @done(19-12-21 02:04)
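
Several entries above carry a TODO to check that umask 0xFF behaves as the cumulative of the sub-counters. A minimal sketch of that check follows, assuming each sub-counter is one bit of the 8-bit unit mask and that the counts were collected separately, so they only agree approximately. The helper names and the 5% tolerance are assumptions, not values from the patchset, and the check only makes sense if the sub-counters do not overlap.

```python
import math


def umask_from_bits(bits):
    """OR together the unit-mask bits of the given sub-counter bit positions."""
    mask = 0
    for bit in bits:
        if not 0 <= bit <= 7:
            raise ValueError("unit mask bit positions are 0..7")
        mask |= 1 << bit
    return mask


def looks_cumulative(total_count, sub_counts, rel_tol=0.05):
    """True if the umask-0xFF count is close to the sum of sub-counter counts.

    The counts usually come from separate runs, so an exact match is not
    expected; rel_tol is an arbitrary tolerance chosen for illustration.
    """
    return math.isclose(total_count, sum(sub_counts), rel_tol=rel_tol)


# Three sub-counters on bits 0..2 correspond to umask 0x07.
three_subcounters = umask_from_bits([0, 1, 2])
# Made-up counts, only to show how the comparison is used.
plausible = looks_cumulative(1000, [400, 350, 260])
implausible = looks_cumulative(1000, [400, 100])
```

If the sub-counters can fire for the same occurrence, their sum can exceed the 0xFF count, and a failed comparison does not by itself mean the umask is broken.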

EX (SC) Events:

✔ PMC 0xC7 : Removed (Retired Branch Resyncs) @done(19-12-21 09:08)
    Measurable and non-zero in zen2, but not in PPR for f17m01::B2 or f17m71::B1. Not removed in this patchset from either zen1 or zen2. Confirmed working.
✔ PMC 0xD2 : Removed (Retired Conditional Branch Instructions Mispredicted) @done(19-12-21 08:44)
    Measurable but confirmed always zero on both zen1 and zen2, and not in PPR for f17m01::B2 or f17m71::B1. Removed for both zen1/zen2 in this patchset.

Cache Events:

### L2 Events:
    ✔ PMC 0x60 : Changed; minor name change for sub-counter 2. @done(19-12-21 20:56)
        Not changed in this patchset, too minor of a change.
    ✔ PMC 0x62 : Removed (L2 Latency) @done(19-12-21 08:46)
        Measurable and non-zero in zen2, not removed in this patchset. Confirmed working.
    ✔ PMC 0x63 : Removed (LS to L2 WBC requests) @done(19-12-21 20:50)
        Measurable and non-zero in zen2, not removed in this patchset. Confirmed working.
    ✔ PMC 0x6D : Removed (Cycles with fill pending from L2) @done(19-12-21 08:48)
        Measurable and non-zero in zen2, not removed in this patchset. Confirmed working.
    ✔ PMC 0x70 : Added (L2 Prefetch Hit in L2) @done(19-12-21 08:54)
    ✔ PMC 0x71 : Added (L2 Prefetcher Hits in L3) @done(19-12-21 08:54)
    ✔ PMC 0x72 : Added (L2 Prefetcher Misses in L3) @done(19-12-21 08:54)

### L3 Events:
    ✔ L3PMC 0x01 : Removed (L3 Cache Accesses) @done(19-12-21 08:59)
        Measurable and non-zero in zen2 with elevated privileges, not removed in this patchset. Confirmed working.
    ✔ L3PMC 0x06 : Removed (L3 Miss) @done(19-12-21 09:08)
        Measurable and non-zero in zen2 with elevated privileges, not removed in this patchset. Confirmed working.

Weirdness:

[] Why is ls_ret_cpuid (PMC 0x27) categorized under load/store?
[] For PMC 0x29, why is the naming of sub-counters so inconsistent? Should we change the counter names to be more consistent at the cost of diverging from the PPR?
    + 2: LdStDispatch
    + 1: StoreDispatch
    + 0: LdDispatch
[] Why is the public description the same as the brief description for some counters, such as memory:0x45?
[] PMCs 0x86, 0x87, 0xC7 and 0xD2 are only listed in the PPR for f17m1 stepping B1 from 2017, and not in the latest 2019 guide for B2, but are still sampleable and non-zero on both zen1 and zen2 and are implemented in the current perf version.