Driver: intel_pstate, governor: performance
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) W-2104 CPU @ 3.20GHz
intel_pstate/no_turbo reports that turbo is already disabled
Using timer: libpfc
Reloading pfc.ko kernel module
USE_LIBPFC=1
sudo sh -c "echo 2 > /sys/bus/event_source/devices/cpu/rdpmc"
! lsmod | grep -q pfc || sudo rmmod pfc
sudo insmod libpfc/pfc.ko
Welcome to uarch-bench (e9437bd-dirty)
Supported CPU features: SSE3 PCLMULQDQ VMX SMX EST TM2 SSSE3 FMA CX16 SSE4_1 SSE4_2 MOVBE POPCNT AES AVX RDRND TSC_ADJ BMI1 HLE AVX2 BMI2 ERMS RTM MPX PQE AVX512F AVX512DQ RDSEED ADX CLFLUSHOPT CLWB INTEL_PT AVX512CD AVX512BW AVX512VL
Pinned to CPU 0
lipfc init OK
Running benchmarks groups using timer libpfc
** Inverse throughput for load/16-bit **
offset 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0 : 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
16 : 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
32 : 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
48 : 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
** Inverse throughput for load/32-bit **
offset 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0 : 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5
16 : 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5
32 : 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5
48 : 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 1.0 1.0 1.0
** Inverse throughput for load/64-bit **
offset 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0 : 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5
16 : 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5
32 : 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5
48 : 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 1.0 1.0 1.0 1.0 1.0 1.0 1.0
** Inverse throughput for load/128-bit **
offset 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0 : 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5
16 : 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5
32 : 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5
48 : 0.5 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
** Inverse throughput for load/256-bit **
offset 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0 : 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5
16 : 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5
32 : 0.5 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
48 : 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
** Inverse throughput for load/512-bit **
offset 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0 : 0.5 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
16 : 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
32 : 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
48 : 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
------ STORE --------
** Inverse throughput for store/16-bit **
offset 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0 : 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
16 : 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
32 : 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
48 : 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 2.0
** Inverse throughput for store/32-bit **
offset 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0 : 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
16 : 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
32 : 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
48 : 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 2.0 2.0 2.0
** Inverse throughput for store/64-bit **
offset 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0 : 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
16 : 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
32 : 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
48 : 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0
** Inverse throughput for store/128-bit **
offset 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0 : 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
16 : 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
32 : 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
48 : 1.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0
** Inverse throughput for store/256-bit **
offset 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0 : 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
16 : 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
32 : 1.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0
48 : 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0
** Inverse throughput for store/512-bit **
offset 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0 : 1.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0
16 : 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0
32 : 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0
48 : 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0
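
One way to read the tables above: each row label plus column header gives a byte offset from 0 to 63, and the slower cells appear to be exactly the offsets at which an access of that width would span two cache lines. The log itself does not say this. The sketch below is an interpretation that assumes a 64-byte cache line, and `crosses_line` and `crossing_grid` are hypothetical helper names, not part of uarch-bench. It prints the predicted crossing pattern in the same row/column layout so it can be compared against the measured grids.

```python
LINE_BYTES = 64  # assumed cache-line size; not stated in the log itself


def crosses_line(offset: int, size_bits: int, line_bytes: int = LINE_BYTES) -> bool:
    """Return True if an access of size_bits starting at offset spans two lines."""
    return (offset % line_bytes) + size_bits // 8 > line_bytes


def crossing_grid(size_bits: int) -> list[str]:
    """Render predicted crossings in the log's layout: rows 0/16/32/48, 16 columns."""
    rows = []
    for base in range(0, LINE_BYTES, 16):
        marks = " ".join(
            "X" if crosses_line(base + col, size_bits) else "." for col in range(16)
        )
        rows.append(f"{base:2d} : {marks}")
    return rows


if __name__ == "__main__":
    for size in (16, 32, 64, 128, 256, 512):
        print(f"** Predicted line crossings for {size}-bit accesses **")
        print("\n".join(crossing_grid(size)))
```

Under that assumption the `X` cells line up with the 1.0 cells in the load tables and the 2.0 cells in the store tables for every width except 16-bit loads, where the log reads 1.0 at every offset and so neither confirms nor contradicts the prediction.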
Driver: intel_pstate, governor: performance
Vendor ID: GenuineIntel
Model name: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
intel_pstate/no_turbo reports that turbo is already disabled
Using timer: libpfc
Reloading pfc.ko kernel module
USE_LIBPFC=1
sudo sh -c "echo 2 > /sys/bus/event_source/devices/cpu/rdpmc"
! lsmod | grep -q pfc || sudo rmmod pfc
sudo insmod libpfc/pfc.ko
Welcome to uarch-bench (0a51d90-dirty)
Supported CPU features: SSE3 PCLMULQDQ VMX EST TM2 SSSE3 FMA CX16 SSE4_1 SSE4_2 MOVBE POPCNT AES AVX RDRND TSC_ADJ SGX BMI1 HLE AVX2 BMI2 ERMS RTM MPX RDSEED ADX CLFLUSHOPT INTEL_PT
libpfm4 initialized successfully
Event 'skl::MEM_INST_RETIRED.SPLIT_LOADS' resolved to 'skl::MEM_INST_RETIRED:SPLIT_LOADS:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0, short name: 'MEM_IN' with code 0x5341d0
Pinned to CPU 0
lipfc init OK
Running benchmarks groups using timer libpfc
** Running group memory/load-serial : Serial loads from fixed-size regions **
Benchmark Cycles MEM_IN
16-KiB serial loads 4.00 0.00
24-KiB serial loads 4.00 0.00
30-KiB serial loads 4.00 0.00
31-KiB serial loads 4.00 0.00
32-KiB serial loads 4.04 0.00
33-KiB serial loads 6.08 0.00
34-KiB serial loads 8.14 0.00
35-KiB serial loads 10.07 0.00
40-KiB serial loads 11.99 0.00
48-KiB serial loads 12.00 0.00
56-KiB serial loads 12.00 0.00
64-KiB serial loads 12.00 0.00
80-KiB serial loads 11.99 0.00
96-KiB serial loads 11.99 0.00
112-KiB serial loads 12.00 0.00
128-KiB serial loads 12.00 0.00
196-KiB serial loads 12.00 0.00
252-KiB serial loads 12.02 0.00
256-KiB serial loads 12.02 0.00
260-KiB serial loads 12.92 0.00
384-KiB serial loads 28.13 0.00
512-KiB serial loads 30.28 0.00
1024-KiB serial loads 34.04 0.00
2048-KiB serial loads 35.54 0.00
4096-KiB serial loads 36.26 0.00
8192-KiB serial loads 103.14 0.00
16384-KiB serial loads 141.98 0.00
32768-KiB serial loads 96.49 0.00
65536-KiB serial loads 94.72 0.00
131072-KiB serial loads 135.22 0.00
262144-KiB serial loads 163.91 0.00
** Running group memory/load-serial-crossing : Cacheline crossing loads from fixed-size regions **
Benchmark Cycles MEM_IN
8-KiB serial loads 11.00 1.00
16-KiB serial loads 11.00 1.00
32-KiB serial loads 11.08 1.00
64-KiB serial loads 22.31 1.00
128-KiB serial loads 24.16 1.00
256-KiB serial loads 24.79 1.00
512-KiB serial loads 40.31 1.00
1024-KiB serial loads 43.80 1.00
2048-KiB serial loads 45.36 1.00
4096-KiB serial loads 46.18 1.00
8192-KiB serial loads 137.21 1.00
16384-KiB serial loads 188.85 1.00
32768-KiB serial loads 211.99 1.00
65536-KiB serial loads 219.83 1.00
131072-KiB serial loads 203.40 1.00
262144-KiB serial loads 212.85 1.00
Driver: intel_pstate, governor: performance
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) W-2104 CPU @ 3.20GHz
intel_pstate/no_turbo reports that turbo is already disabled
Using timer: libpfc
Reloading pfc.ko kernel module
USE_LIBPFC=1
sudo sh -c "echo 2 > /sys/bus/event_source/devices/cpu/rdpmc"
[sudo] password for travis:
! lsmod | grep -q pfc || sudo rmmod pfc
sudo insmod libpfc/pfc.ko
Welcome to uarch-bench (97c09a3)
Supported CPU features: SSE3 PCLMULQDQ VMX SMX EST TM2 SSSE3 FMA CX16 SSE4_1 SSE4_2 MOVBE POPCNT AES AVX RDRND TSC_ADJ BMI1 HLE AVX2 BMI2 ERMS RTM MPX PQE AVX512F AVX512DQ RDSEED ADX CLFLUSHOPT CLWB INTEL_PT AVX512CD AVX512BW AVX512VL
libpfm4 initialized successfully
WARNING: Event 'skl::MEM_INST_RETIRED.SPLIT_LOADS' could not be resolved and will be ignored. Reason: event not found
Use --list-events to list available events.
Pinned to CPU 0
lipfc init OK
Running benchmarks groups using timer libpfc
** Running group memory/load-serial : Serial loads from fixed-size regions **
Benchmark Cycles
16-KiB serial loads 4.00
24-KiB serial loads 4.01
30-KiB serial loads 4.01
31-KiB serial loads 4.01
32-KiB serial loads 4.01
33-KiB serial loads 6.66
34-KiB serial loads 9.23
35-KiB serial loads 11.65
40-KiB serial loads 13.96
48-KiB serial loads 13.98
56-KiB serial loads 14.00
64-KiB serial loads 13.99
80-KiB serial loads 13.99
96-KiB serial loads 14.00
112-KiB serial loads 14.01
128-KiB serial loads 14.00
196-KiB serial loads 14.00
252-KiB serial loads 14.00
256-KiB serial loads 14.00
260-KiB serial loads 14.00
384-KiB serial loads 14.01
512-KiB serial loads 14.00
1024-KiB serial loads 14.13
2048-KiB serial loads 76.74
4096-KiB serial loads 71.73
8192-KiB serial loads 76.11
16384-KiB serial loads 81.42
32768-KiB serial loads 86.46
65536-KiB serial loads 87.68
131072-KiB serial loads 94.23
262144-KiB serial loads 95.59
** Running group memory/load-serial-crossing : Cacheline crossing loads from fixed-size regions **
Benchmark Cycles
8-KiB serial loads 11.00
16-KiB serial loads 11.00
32-KiB serial loads 11.20
64-KiB serial loads 22.24
128-KiB serial loads 23.97
256-KiB serial loads 24.50
512-KiB serial loads 24.80
1024-KiB serial loads 24.97
2048-KiB serial loads 78.63
4096-KiB serial loads 86.56
8192-KiB serial loads 110.50
16384-KiB serial loads 252.92
32768-KiB serial loads 279.98
65536-KiB serial loads 292.90
131072-KiB serial loads 311.31
262144-KiB serial loads 313.89