Last active
December 23, 2019 15:17
CTRE issue 78 results
Command line:
time g++ -std=c++17 -O3 min_repro.cpp
No forced inlining:
real 0m2,210s
user 0m2,094s
sys 0m0,104s
Binary size: 280064
Forced inlining (current master):
real 1m1,159s
user 0m57,839s
sys 0m1,849s
Binary size: 5370904
== WITH -O3 ==
Command line:
time g++ -std=c++17 -O3 min_repro.cpp
No forced inlining:
real 0m4,076s
user 0m3,986s
sys 0m0,076s
Binary size: 47288
Forced inlining (current master):
real 0m9,929s
user 0m9,458s
sys 0m0,173s
Binary size: 16624
Bonus: optimized debug build (only for modified version):
real 0m5,381s
user 0m5,244s
sys 0m0,112s
Binary size: 1103872
[hiro@hiro-pc ctre_issue_78]$ time g++ -g -ftime-report -O3 -std=c++17 ctre.cpp
Time variable usr sys wall GGC
phase setup : 0.02 ( 0%) 0.02 ( 0%) 0.08 ( 0%) 1366 kB ( 0%)
phase parsing : 5.81 ( 0%) 2.44 ( 2%) 8.38 ( 1%) 136206 kB ( 4%)
phase lang. deferred : 0.08 ( 0%) 0.04 ( 0%) 0.11 ( 0%) 4805 kB ( 0%)
phase opt and generate :1262.80 (100%) 117.71 ( 98%)1404.25 ( 99%) 3678493 kB ( 96%)
phase last asm : 0.06 ( 0%) 0.00 ( 0%) 0.07 ( 0%) 1208 kB ( 0%)
|name lookup : 0.28 ( 0%) 0.10 ( 0%) 0.35 ( 0%) 2686 kB ( 0%)
|overload resolution : 5.07 ( 0%) 1.91 ( 2%) 7.07 ( 1%) 96445 kB ( 3%)
garbage collection : 7.98 ( 1%) 0.03 ( 0%) 8.09 ( 1%) 0 kB ( 0%)
dump files : 0.03 ( 0%) 0.00 ( 0%) 0.05 ( 0%) 0 kB ( 0%)
callgraph construction : 0.69 ( 0%) 0.01 ( 0%) 0.76 ( 0%) 8127 kB ( 0%)
callgraph optimization : 0.98 ( 0%) 0.01 ( 0%) 1.11 ( 0%) 11 kB ( 0%)
ipa function summary : 0.30 ( 0%) 0.00 ( 0%) 0.31 ( 0%) 22 kB ( 0%)
ipa cp : 0.05 ( 0%) 0.00 ( 0%) 0.05 ( 0%) 4 kB ( 0%)
ipa inlining heuristics : 0.06 ( 0%) 0.00 ( 0%) 0.06 ( 0%) 0 kB ( 0%)
ipa function splitting : 0.23 ( 0%) 0.00 ( 0%) 0.23 ( 0%) 263 kB ( 0%)
ipa pure const : 0.46 ( 0%) 0.00 ( 0%) 0.49 ( 0%) 1 kB ( 0%)
ipa icf : 0.06 ( 0%) 0.00 ( 0%) 0.07 ( 0%) 0 kB ( 0%)
cfg construction : 0.02 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 5 kB ( 0%)
cfg cleanup : 1.29 ( 0%) 0.00 ( 0%) 1.29 ( 0%) 29 kB ( 0%)
trivially dead code :1095.40 ( 86%) 0.25 ( 0%)1115.72 ( 79%) 0 kB ( 0%)
df scan insns : 0.27 ( 0%) 0.00 ( 0%) 0.27 ( 0%) 0 kB ( 0%)
df multiple defs : 0.12 ( 0%) 0.00 ( 0%) 0.11 ( 0%) 0 kB ( 0%)
df reaching defs : 0.11 ( 0%) 0.00 ( 0%) 0.12 ( 0%) 0 kB ( 0%)
df live regs : 1.08 ( 0%) 0.00 ( 0%) 1.12 ( 0%) 0 kB ( 0%)
df live&initialized regs : 0.14 ( 0%) 0.00 ( 0%) 0.13 ( 0%) 0 kB ( 0%)
df must-initialized regs : 0.02 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 0 kB ( 0%)
df use-def / def-use chains : 0.08 ( 0%) 0.00 ( 0%) 0.10 ( 0%) 0 kB ( 0%)
df reg dead/unused notes : 0.43 ( 0%) 0.00 ( 0%) 0.45 ( 0%) 11 kB ( 0%)
register information : 0.04 ( 0%) 0.00 ( 0%) 0.05 ( 0%) 0 kB ( 0%)
alias analysis : 0.31 ( 0%) 0.00 ( 0%) 0.32 ( 0%) 49 kB ( 0%)
alias stmt walking : 5.38 ( 0%) 1.02 ( 1%) 6.41 ( 0%) 18792 kB ( 0%)
register scan : 0.20 ( 0%) 0.00 ( 0%) 0.21 ( 0%) 4 kB ( 0%)
rebuild jump labels : 0.15 ( 0%) 0.00 ( 0%) 0.16 ( 0%) 0 kB ( 0%)
preprocessing : 0.08 ( 0%) 0.17 ( 0%) 0.23 ( 0%) 1338 kB ( 0%)
parser (global) : 0.22 ( 0%) 0.18 ( 0%) 0.40 ( 0%) 14497 kB ( 0%)
parser struct body : 0.17 ( 0%) 0.06 ( 0%) 0.21 ( 0%) 9373 kB ( 0%)
parser function body : 0.00 ( 0%) 0.01 ( 0%) 0.06 ( 0%) 305 kB ( 0%)
parser inl. func. body : 0.01 ( 0%) 0.03 ( 0%) 0.13 ( 0%) 1264 kB ( 0%)
parser inl. meth. body : 0.11 ( 0%) 0.04 ( 0%) 0.12 ( 0%) 3505 kB ( 0%)
template instantiation : 3.21 ( 0%) 1.01 ( 1%) 4.16 ( 0%) 99016 kB ( 3%)
constant expression evaluation : 1.98 ( 0%) 0.94 ( 1%) 3.07 ( 0%) 2819 kB ( 0%)
early inlining heuristics : 0.01 ( 0%) 0.00 ( 0%) 0.02 ( 0%) 224 kB ( 0%)
inline parameters : 4.23 ( 0%) 0.00 ( 0%) 4.21 ( 0%) 1803 kB ( 0%)
integration : 36.96 ( 3%) 51.67 ( 43%) 91.90 ( 7%) 2311970 kB ( 60%)
tree gimplify : 0.01 ( 0%) 0.01 ( 0%) 0.04 ( 0%) 1262 kB ( 0%)
tree eh : 0.37 ( 0%) 0.00 ( 0%) 0.40 ( 0%) 63 kB ( 0%)
tree CFG cleanup : 3.37 ( 0%) 3.52 ( 3%) 7.22 ( 1%) 1765 kB ( 0%)
tree tail merge : 0.06 ( 0%) 0.00 ( 0%) 0.06 ( 0%) 17 kB ( 0%)
tree VRP : 1.50 ( 0%) 0.57 ( 0%) 2.04 ( 0%) 35085 kB ( 1%)
tree Early VRP : 7.06 ( 1%) 0.04 ( 0%) 7.26 ( 1%) 6803 kB ( 0%)
tree copy propagation : 0.93 ( 0%) 0.01 ( 0%) 0.92 ( 0%) 0 kB ( 0%)
tree PTA : 2.55 ( 0%) 0.06 ( 0%) 2.58 ( 0%) 6741 kB ( 0%)
tree SSA rewrite : 2.67 ( 0%) 1.46 ( 1%) 4.45 ( 0%) 231036 kB ( 6%)
tree SSA other : 0.00 ( 0%) 0.01 ( 0%) 0.02 ( 0%) 2332 kB ( 0%)
tree SSA incremental : 3.21 ( 0%) 0.09 ( 0%) 3.38 ( 0%) 1887 kB ( 0%)
tree operand scan : 29.94 ( 2%) 48.74 ( 41%) 76.02 ( 5%) 382866 kB ( 10%)
dominator optimization : 1.44 ( 0%) 1.19 ( 1%) 2.79 ( 0%) 89406 kB ( 2%)
backwards jump threading : 0.02 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 0 kB ( 0%)
tree SRA : 4.92 ( 0%) 1.82 ( 2%) 6.82 ( 0%) 294070 kB ( 8%)
isolate eroneous paths : 0.11 ( 0%) 0.00 ( 0%) 0.12 ( 0%) 0 kB ( 0%)
tree CCP : 6.50 ( 1%) 3.37 ( 3%) 10.76 ( 1%) 806 kB ( 0%)
tree reassociation : 0.37 ( 0%) 0.00 ( 0%) 0.39 ( 0%) 26 kB ( 0%)
tree PRE : 0.26 ( 0%) 0.00 ( 0%) 0.29 ( 0%) 25 kB ( 0%)
tree FRE : 5.05 ( 0%) 1.49 ( 1%) 6.59 ( 0%) 19300 kB ( 1%)
tree code sinking : 0.02 ( 0%) 0.00 ( 0%) 0.02 ( 0%) 16 kB ( 0%)
tree linearize phis : 0.08 ( 0%) 0.00 ( 0%) 0.08 ( 0%) 4 kB ( 0%)
tree backward propagate : 0.03 ( 0%) 0.00 ( 0%) 0.03 ( 0%) 0 kB ( 0%)
tree forward propagate : 3.82 ( 0%) 0.17 ( 0%) 4.11 ( 0%) 243 kB ( 0%)
tree conservative DCE : 0.91 ( 0%) 0.20 ( 0%) 1.22 ( 0%) 102 kB ( 0%)
tree aggressive DCE : 2.42 ( 0%) 0.04 ( 0%) 2.51 ( 0%) 10499 kB ( 0%)
tree buildin call DCE : 0.00 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 0 kB ( 0%)
tree DSE : 2.66 ( 0%) 0.01 ( 0%) 2.79 ( 0%) 4221 kB ( 0%)
PHI merge : 0.21 ( 0%) 0.73 ( 1%) 0.82 ( 0%) 121 kB ( 0%)
tree loop invariant motion : 0.08 ( 0%) 0.00 ( 0%) 0.09 ( 0%) 0 kB ( 0%)
tree canonical iv : 0.00 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 1 kB ( 0%)
complete unrolling : 0.44 ( 0%) 0.65 ( 1%) 1.19 ( 0%) 42025 kB ( 1%)
tree vectorization : 0.00 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 0 kB ( 0%)
tree slp vectorization : 0.35 ( 0%) 0.02 ( 0%) 0.37 ( 0%) 25 kB ( 0%)
tree loop distribution : 0.03 ( 0%) 0.00 ( 0%) 0.03 ( 0%) 0 kB ( 0%)
tree iv optimization : 0.01 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 18 kB ( 0%)
tree SSA uncprop : 0.01 ( 0%) 0.00 ( 0%) 0.00 ( 0%) 0 kB ( 0%)
tree switch conversion : 0.10 ( 0%) 0.00 ( 0%) 0.06 ( 0%) 0 kB ( 0%)
gimple CSE sin/cos : 0.05 ( 0%) 0.00 ( 0%) 0.05 ( 0%) 0 kB ( 0%)
gimple widening/fma detection : 0.05 ( 0%) 0.00 ( 0%) 0.05 ( 0%) 0 kB ( 0%)
tree strlen optimization : 0.02 ( 0%) 0.00 ( 0%) 0.02 ( 0%) 0 kB ( 0%)
dominance frontiers : 0.00 ( 0%) 0.00 ( 0%) 0.02 ( 0%) 0 kB ( 0%)
dominance computation : 0.02 ( 0%) 0.01 ( 0%) 0.04 ( 0%) 0 kB ( 0%)
out of ssa : 0.07 ( 0%) 0.00 ( 0%) 0.07 ( 0%) 0 kB ( 0%)
expand vars : 0.03 ( 0%) 0.00 ( 0%) 0.02 ( 0%) 14 kB ( 0%)
expand : 0.52 ( 0%) 0.10 ( 0%) 0.65 ( 0%) 151144 kB ( 4%)
post expand cleanups : 0.17 ( 0%) 0.00 ( 0%) 0.18 ( 0%) 17 kB ( 0%)
varconst : 0.02 ( 0%) 0.01 ( 0%) 0.00 ( 0%) 4 kB ( 0%)
forward prop : 0.18 ( 0%) 0.00 ( 0%) 0.19 ( 0%) 7 kB ( 0%)
CSE : 0.68 ( 0%) 0.01 ( 0%) 0.71 ( 0%) 17 kB ( 0%)
dead code elimination : 0.10 ( 0%) 0.00 ( 0%) 0.11 ( 0%) 0 kB ( 0%)
dead store elim1 : 0.18 ( 0%) 0.02 ( 0%) 0.20 ( 0%) 7 kB ( 0%)
dead store elim2 : 0.16 ( 0%) 0.00 ( 0%) 0.16 ( 0%) 7 kB ( 0%)
loop init : 0.07 ( 0%) 0.00 ( 0%) 0.05 ( 0%) 413 kB ( 0%)
loop invariant motion : 0.01 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 0 kB ( 0%)
loop fini : 0.02 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 0 kB ( 0%)
CPROP : 1.14 ( 0%) 0.00 ( 0%) 1.14 ( 0%) 22 kB ( 0%)
PRE : 0.05 ( 0%) 0.00 ( 0%) 0.05 ( 0%) 0 kB ( 0%)
CSE 2 : 0.43 ( 0%) 0.00 ( 0%) 0.44 ( 0%) 5 kB ( 0%)
branch prediction : 3.08 ( 0%) 0.01 ( 0%) 3.18 ( 0%) 311 kB ( 0%)
combiner : 0.08 ( 0%) 0.00 ( 0%) 0.07 ( 0%) 27 kB ( 0%)
if-conversion : 0.04 ( 0%) 0.00 ( 0%) 0.04 ( 0%) 7 kB ( 0%)
integrated RA : 0.42 ( 0%) 0.00 ( 0%) 0.43 ( 0%) 137 kB ( 0%)
LRA non-specific : 0.59 ( 0%) 0.07 ( 0%) 0.70 ( 0%) 8 kB ( 0%)
LRA virtuals elimination : 0.08 ( 0%) 0.00 ( 0%) 0.07 ( 0%) 0 kB ( 0%)
LRA reload inheritance : 0.03 ( 0%) 0.00 ( 0%) 0.03 ( 0%) 1 kB ( 0%)
LRA create live ranges : 0.02 ( 0%) 0.00 ( 0%) 0.03 ( 0%) 1 kB ( 0%)
reload CSE regs : 0.18 ( 0%) 0.00 ( 0%) 0.18 ( 0%) 15 kB ( 0%)
load CSE after reload : 0.10 ( 0%) 0.00 ( 0%) 0.11 ( 0%) 0 kB ( 0%)
ree : 0.03 ( 0%) 0.00 ( 0%) 0.03 ( 0%) 0 kB ( 0%)
thread pro- & epilogue : 0.01 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 2 kB ( 0%)
combine stack adjustments : 0.02 ( 0%) 0.00 ( 0%) 0.02 ( 0%) 0 kB ( 0%)
peephole 2 : 0.03 ( 0%) 0.00 ( 0%) 0.02 ( 0%) 0 kB ( 0%)
hard reg cprop : 0.02 ( 0%) 0.00 ( 0%) 0.02 ( 0%) 0 kB ( 0%)
scheduling 2 : 1.71 ( 0%) 0.25 ( 0%) 1.99 ( 0%) 36570 kB ( 1%)
machine dep reorg : 0.05 ( 0%) 0.00 ( 0%) 0.05 ( 0%) 0 kB ( 0%)
reorder blocks : 0.01 ( 0%) 0.00 ( 0%) 0.02 ( 0%) 5 kB ( 0%)
shorten branches : 0.02 ( 0%) 0.00 ( 0%) 0.01 ( 0%) 0 kB ( 0%)
final : 0.13 ( 0%) 0.00 ( 0%) 0.13 ( 0%) 4547 kB ( 0%)
symout : 0.15 ( 0%) 0.03 ( 0%) 0.19 ( 0%) 11130 kB ( 0%)
variable tracking : 0.61 ( 0%) 0.00 ( 0%) 0.61 ( 0%) 727 kB ( 0%)
var-tracking dataflow : 0.15 ( 0%) 0.00 ( 0%) 0.16 ( 0%) 0 kB ( 0%)
var-tracking emit : 0.23 ( 0%) 0.00 ( 0%) 0.23 ( 0%) 10650 kB ( 0%)
tree if-combine : 0.01 ( 0%) 0.00 ( 0%) 0.02 ( 0%) 2 kB ( 0%)
straight-line strength reduction : 0.04 ( 0%) 0.00 ( 0%) 0.04 ( 0%) 0 kB ( 0%)
store merging : 0.03 ( 0%) 0.00 ( 0%) 0.04 ( 0%) 0 kB ( 0%)
initialize rtl : 0.01 ( 0%) 0.00 ( 0%) 0.02 ( 0%) 12 kB ( 0%)
address lowering : 0.05 ( 0%) 0.00 ( 0%) 0.05 ( 0%) 0 kB ( 0%)
rest of compilation : 0.90 ( 0%) 0.00 ( 0%) 0.87 ( 0%) 24 kB ( 0%)
remove unused locals : 10.20 ( 1%) 0.01 ( 0%) 10.24 ( 1%) 0 kB ( 0%)
address taken : 1.33 ( 0%) 0.03 ( 0%) 1.26 ( 0%) 0 kB ( 0%)
rebuild frequencies : 0.04 ( 0%) 0.00 ( 0%) 0.04 ( 0%) 0 kB ( 0%)
repair loop structures : 0.01 ( 0%) 0.01 ( 0%) 0.00 ( 0%) 0 kB ( 0%)
TOTAL :1268.77 120.21 1412.90 3822089 kB
real 23m33,527s
user 21m9,014s
sys 2m0,373s