@ilya-pirogov
Created April 1, 2019 16:48
+ cat /proc/meminfo
MemTotal: 16380188 kB
MemFree: 5078732 kB
MemAvailable: 7298544 kB
Buffers: 197704 kB
Cached: 2314136 kB
SwapCached: 108680 kB
Active: 7717128 kB
Inactive: 2753152 kB
Active(anon): 6676140 kB
Inactive(anon): 1468604 kB
Active(file): 1040988 kB
Inactive(file): 1284548 kB
Unevictable: 288 kB
Mlocked: 288 kB
SwapTotal: 8850424 kB
SwapFree: 5169804 kB
Dirty: 616 kB
Writeback: 0 kB
AnonPages: 7824320 kB
Mapped: 691720 kB
Shmem: 186304 kB
Slab: 443820 kB
SReclaimable: 231288 kB
SUnreclaim: 212532 kB
KernelStack: 27168 kB
PageTables: 131932 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 17040516 kB
Committed_AS: 22429260 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
HardwareCorrupted: 0 kB
AnonHugePages: 2871296 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 773584 kB
DirectMap2M: 15960064 kB
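
The snapshot above was taken before either THP toggle, so its AnonHugePages line shows how much anonymous memory was backed by transparent huge pages at that moment. Below is a minimal Python sketch of that arithmetic. `parse_meminfo` and `SAMPLE` are hypothetical names introduced here, the three values are copied from the listing, and reading the ratio as a "share" assumes the kernel counts AnonHugePages inside AnonPages, which may not hold on every kernel version.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: integer value} dict.

    Values keep the unit printed in the file (kB for most fields).
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not a "Name: value" line
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


# Three lines copied from the meminfo dump above (hypothetical constant name).
SAMPLE = """AnonPages: 7824320 kB
AnonHugePages: 2871296 kB
Hugepagesize: 2048 kB"""

info = parse_meminfo(SAMPLE)

# Ratio of THP-backed anonymous memory to all anonymous memory.
# Assumption: AnonHugePages is accounted as part of AnonPages.
thp_ratio = info["AnonHugePages"] / info["AnonPages"]

# Number of huge pages of size Hugepagesize needed to cover AnonHugePages.
huge_pages = info["AnonHugePages"] // info["Hugepagesize"]

print(f"AnonHugePages / AnonPages: {thp_ratio:.1%}")
print(f"huge pages in use: {huge_pages}")
```

On a live system the same function could be fed the contents of /proc/meminfo directly instead of `SAMPLE`.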
+ sudo bash -c 'echo never > /sys/kernel/mm/transparent_hugepage/enabled'
+ perf stat -dddd ./habr-tests -timeout 10m 1000000 10000000 100000000 1000000000
................................
+---+-------------------------------------------+-----------+------------+-------------+---------------+
|   | IMPLEMENTATION                            | 1,000,000 | 10,000,000 | 100,000,000 | 1,000,000,000 |
+---+-------------------------------------------+-----------+------------+-------------+---------------+
| 1 | Rust: release; multi threads              | 35ms      | 608ms      | 7.872s      | 1m40.8s       |
| 2 | Rust: release; single thread              | 39ms      | 623ms      | 7.758s      | 1m42.071s     |
| 3 | Go: NumCPU/2 goroutines + chan            | 50ms      | 673ms      | 8.462s      | 1m49.417s     |
| 4 | Go: sqrt(n) parallel goroutines           | 42ms      | 665ms      | 8.249s      | 1m49.959s     |
| 5 | C++: gcc -O3 -march-native; 10 threads    | 39ms      | 604ms      | 8.54s       | 1m52.528s     |
| 6 | C++: gcc -O3 -march-native; single thread | 49ms      | 700ms      | 10.028s     | 2m48.328s     |
| 7 | Haskell: ghc -O3; single thread           | 54ms      | 759ms      | 10.359s     | 2m50.296s     |
| 8 | Go: single thread                         | 121ms     | 1.82s      | 23.854s     | 5m9.563s      |
+---+-------------------------------------------+-----------+------------+-------------+---------------+
Performance counter stats for './habr-tests -timeout 10m 1000000 10000000 100000000 1000000000':
4584149.785694 task-clock (msec) # 3.595 CPUs utilized
935,862 context-switches # 0.204 K/sec
15,653 cpu-migrations # 0.003 K/sec
27,188,953 page-faults # 0.006 M/sec
14,258,282,945,500 cycles # 3.110 GHz (24.68%)
12,766,890,240,056 stalled-cycles-frontend # 89.54% frontend cycles idle (25.29%)
10,965,485,737,118 stalled-cycles-backend # 76.91% backend cycles idle (25.24%)
2,557,049,503,069 instructions # 0.18 insn per cycle
# 4.99 stalled cycles per insn (31.91%)
434,911,088,770 branches # 94.873 M/sec (31.77%)
573,113,723 branch-misses # 0.13% of all branches (31.83%)
603,175,247,157 L1-dcache-loads # 131.578 M/sec (23.71%)
98,134,972,345 L1-dcache-load-misses # 16.27% of all L1-dcache hits (16.42%)
55,623,750,834 LLC-loads # 12.134 M/sec (15.35%)
<not supported> LLC-load-misses
<not supported> L1-icache-loads
904,308,773 L1-icache-load-misses (19.23%)
552,672,638,316 dTLB-loads # 120.562 M/sec (17.03%)
46,119,207,398 dTLB-load-misses # 8.34% of all dTLB cache hits (15.45%)
140,221,506 iTLB-loads # 0.031 M/sec (13.74%)
117,778,257 iTLB-load-misses # 83.99% of all iTLB cache hits (19.13%)
<not supported> L1-dcache-prefetches
25,092,389,819 L1-dcache-prefetch-misses # 5.474 M/sec (25.00%)
1274.973839883 seconds time elapsed
+ sudo bash -c 'echo always > /sys/kernel/mm/transparent_hugepage/enabled'
+ perf stat -dddd ./habr-tests -timeout 10m 1000000 10000000 100000000 1000000000
................................
+---+-------------------------------------------+-----------+------------+-------------+---------------+
|   | IMPLEMENTATION                            | 1,000,000 | 10,000,000 | 100,000,000 | 1,000,000,000 |
+---+-------------------------------------------+-----------+------------+-------------+---------------+
| 1 | Rust: release; single thread              | 30ms      | 595ms      | 7.64s       | 1m40.789s     |
| 2 | Rust: release; multi threads              | 41ms      | 615ms      | 7.507s      | 1m41.168s     |
| 3 | Go: NumCPU/2 goroutines + chan            | 36ms      | 574ms      | 8.329s      | 1m45.345s     |
| 4 | C++: gcc -O3 -march-native; 10 threads    | 35ms      | 541ms      | 7.694s      | 1m52.607s     |
| 5 | Go: sqrt(n) parallel goroutines           | 33ms      | 651ms      | 8.118s      | 1m54.565s     |
| 6 | C++: gcc -O3 -march-native; single thread | 25ms      | 597ms      | 8.766s      | 2m40.784s     |
| 7 | Haskell: ghc -O3; single thread           | 33ms      | 627ms      | 9.265s      | 2m46.827s     |
| 8 | Go: single thread                         | 101ms     | 1.498s     | 21.732s     | 5m6.502s      |
+---+-------------------------------------------+-----------+------------+-------------+---------------+
Performance counter stats for './habr-tests -timeout 10m 1000000 10000000 100000000 1000000000':
4565563.131020 task-clock (msec) # 3.642 CPUs utilized
967,544 context-switches # 0.212 K/sec
15,057 cpu-migrations # 0.003 K/sec
23,943,284 page-faults # 0.005 M/sec
14,151,559,802,105 cycles # 3.100 GHz (24.57%)
12,667,369,983,515 stalled-cycles-frontend # 89.51% frontend cycles idle (25.22%)
10,878,467,375,902 stalled-cycles-backend # 76.87% backend cycles idle (25.14%)
2,539,516,856,089 instructions # 0.18 insn per cycle
# 4.99 stalled cycles per insn (31.81%)
431,151,465,699 branches # 94.436 M/sec (31.66%)
541,025,963 branch-misses # 0.13% of all branches (31.73%)
551,418,025,677 L1-dcache-loads # 120.778 M/sec (22.87%)
96,483,125,769 L1-dcache-load-misses # 17.50% of all L1-dcache hits (17.98%)
54,993,318,876 LLC-loads # 12.045 M/sec (15.43%)
<not supported> LLC-load-misses
<not supported> L1-icache-loads
852,026,173 L1-icache-load-misses (19.26%)
553,669,048,576 dTLB-loads # 121.271 M/sec (17.04%)
43,736,579,907 dTLB-load-misses # 7.90% of all dTLB cache hits (15.54%)
134,811,018 iTLB-loads # 0.030 M/sec (13.72%)
116,231,234 iTLB-load-misses # 86.22% of all iTLB cache hits (19.13%)
<not supported> L1-dcache-prefetches
25,461,940,406 L1-dcache-prefetch-misses # 5.577 M/sec (24.97%)
1253.674439726 seconds time elapsed
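
To put the two runs side by side, the relative change of a few counters between THP `never` and THP `always` can be computed directly from the figures printed above. The following is a rough Python sketch. The dictionary keys and the `relative_change` helper are names made up for this illustration, and only the numbers are taken from the logs. Two caveats apply: each configuration was run once, and perf multiplexed most hardware counters (the percentages in parentheses show how long each was actually counting), so small differences may well be noise.

```python
# Counter values copied from the two `perf stat` reports above.
thp_never = {
    "elapsed_s": 1274.973839883,
    "page_faults": 27_188_953,
    "dtlb_load_misses": 46_119_207_398,
    "cycles": 14_258_282_945_500,
    "instructions": 2_557_049_503_069,
}
thp_always = {
    "elapsed_s": 1253.674439726,
    "page_faults": 23_943_284,
    "dtlb_load_misses": 43_736_579_907,
    "cycles": 14_151_559_802_105,
    "instructions": 2_539_516_856_089,
}


def relative_change(before, after):
    """Return (after - before) / before; negative means the value dropped."""
    return (after - before) / before


# Change of each counter when going from THP=never to THP=always.
changes = {
    name: relative_change(thp_never[name], thp_always[name])
    for name in thp_never
}

for name, delta in changes.items():
    print(f"{name:>18}: {delta:+.2%}")
```

All five values decrease with THP set to `always`: page faults and dTLB load misses drop the most, while total elapsed time changes by less than two percent.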