@travisdowns
Last active November 30, 2019 19:33
MLP (memory-level parallelism) vs size for SKL, SKX and Zen2. Inspired by similar charts produced by Andrei at AnandTech.

Size here is in 8-byte units (sorry), so 10^7 means 80,000,000 bytes.
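As a quick aid for reading the x-axis, the unit conversion can be sketched as below. The helper names `elements_to_bytes` and `bytes_to_mib` are made up for this illustration and are not part of MemoryLanes.

```python
# Hypothetical helpers for reading the chart's x-axis; not part of MemoryLanes.
WORD_BYTES = 8  # the x-axis counts 8-byte units


def elements_to_bytes(elements):
    """Convert a size in 8-byte units to a size in bytes."""
    return elements * WORD_BYTES


def bytes_to_mib(num_bytes):
    """Convert bytes to MiB (2**20 bytes), handy for comparing with cache sizes."""
    return num_bytes / (1024 * 1024)


if __name__ == "__main__":
    size = elements_to_bytes(10**7)
    print(size)  # 80000000
    print(round(bytes_to_mib(size), 2))
```

So a point at 10^7 on the x-axis is a working set of roughly 76 MiB, well beyond the last-level cache of a typical desktop or workstation part.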

Data is generated using the console version of MemoryLanes:

MLP_CSV=2 MLP_START=1 MLP_STOP=20000 ./testingmlp > out

Charts are generated like this:

./plot-csv.py --no-legend --xscale log --ylabel Speedup --title "Zen2 MLP (no THP)" out

Skylake-S (i7-6700HQ)

4k pages:

SKL no THP

2MB pages:

SKL yes THP

Skylake-X (W-2104)

4k pages:

SKX no THP

2MB pages:

SKX yes THP

Zen2 (EPYC 7262)

4k pages:

Zen2 no THP

2MB pages:

Zen2 yes THP

Disas

Disassembly of core loop for the 10-chain method naked_access_10:

3c0:
add    rcx,0x1
mov    r14,QWORD PTR [rsi+r14*8]
mov    r13,QWORD PTR [rsi+r13*8]
cmp    rdx,rcx
mov    r12,QWORD PTR [rsi+r12*8]
mov    rbp,QWORD PTR [rsi+rbp*8]
mov    rbx,QWORD PTR [rsi+rbx*8]
mov    r11,QWORD PTR [rsi+r11*8]
mov    r10,QWORD PTR [rsi+r10*8]
mov    r9,QWORD PTR [rsi+r9*8]
mov    r8,QWORD PTR [rsi+r8*8]
mov    rdi,QWORD PTR [rsi+rdi*8]
jne    3c0 <naked_access_10(unsigned long const*, u
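The loop above carries ten independent index registers, each updated by a load of the form `idx = array[idx]`, so up to ten cache misses can be in flight at once. A minimal Python model of that logic is sketched below. It only illustrates the access pattern and is not a benchmark; the names `make_cycle` and `chase`, the use of a single random cycle, and the returned list of cursors are assumptions made for this sketch rather than details taken from MemoryLanes.

```python
import random


def make_cycle(n, seed=0):
    """Build a random single-cycle permutation of range(n) (Sattolo's algorithm).

    Assumption for this sketch: the benchmark array forms one big cycle, so a
    chase never gets stuck in a short loop.
    """
    rng = random.Random(seed)
    perm = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # 0 <= j < i, which guarantees a single cycle
        perm[i], perm[j] = perm[j], perm[i]
    return perm


def chase(array, starts, iterations):
    """Advance every lane by `iterations` dependent loads: idx = array[idx].

    Each lane depends only on its own previous value, mirroring the ten
    `mov reg, [rsi + reg*8]` instructions in the disassembly.
    """
    cursors = list(starts)
    for _ in range(iterations):
        for lane in range(len(cursors)):
            cursors[lane] = array[cursors[lane]]
    return cursors


if __name__ == "__main__":
    arr = make_cycle(1000)
    # Ten lanes with distinct (arbitrarily chosen) starting indices.
    print(chase(arr, range(0, 1000, 100), 50))
```

Within one lane every load needs the result of the previous one, so a single chain exposes the full memory latency. Running several lanes side by side is what lets the hardware overlap misses, which is the speedup the charts plot.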