-
-
Save navyxliu/6239ce24f1ae447060302cc8562cbb71 to your computer and use it in GitHub Desktop.
import java.lang.Math; | |
class Example1_Point { | |
static class Point { | |
double x; | |
double y; | |
public Point(double x, double y) { | |
this.x = x; | |
this.y = y; | |
} | |
// distance from zero | |
public double length() { | |
return Math.sqrt(x * x + y * y); | |
} | |
} | |
private static Point _cache; | |
public double test(boolean cond, double x, double y) { | |
Point pt = new Point(x, y); | |
pt.x = 3.0d; | |
if (cond) { | |
_cache = pt; // materialize here. pt1 = Mat(pt) | |
} | |
// pt2 = phi(pt, pt1) | |
return pt.length(); | |
} | |
public static void main(String[] args) { | |
var kase = new Example1_Point(); | |
long iterations = 0; | |
try { | |
while (true) { | |
kase.test(0 == (iterations & 0xf), 2.0d, 1.0d); | |
iterations++; | |
} | |
} finally { | |
System.err.println("Epsilon Test: " + iterations); | |
} | |
} | |
} |
Still with PEA, it's possible that we skip passive materialization. it's still correct because we use the original Allocation.
1 public double test(boolean cond, double x, double y) {
2 Point pt = new Point(x, y);
3 pt.x = 3.0d;
4
5 if (cond) {
6 pt1 = Mat(pt);
7 _cache = pt1;
8 }
9 pt2 = Phi(pt, pt1);
10 return pt2.length();
11 }
One problem of this approach is that we have a partial redundancy. when cond is true, we allocate 2 objects while the orignal program only has one allocation.
here is ideal graph before EA with PEA but without passive materialization. 171 Phi merges the original object and the clone object at line 9 above.
JO(4) is the original object. It is NSR so SR can't replace it.
JavaObject(4) NoEscape(NoEscape) NSR [ 101F 95F 197F 192F [ 43 48 171 ]] 31 Allocate === 5 6 7 8 1 (29 27 28 1 1 1 11 12 1 14 1 1 ) [[ 32 33 34 41 42 43 ]] rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *, bool, top, bool ) Example1_Point::test @ bci:0 (line 22) Type:{0:control, 1:abIO, 2:memory, 3:rawptr:BotPTR, 4:return_address, 5:rawptr:NotNull} !jvms: Example1_Point::test @ bci:0 (line 22)
LocalVar(14) NoEscape(NoEscape) [ 31P [ 48 101b 95b ]] 43 Proj === 31 [[ 44 48 95 101 ]] #5 Type:rawptr:NotNull !jvms: Example1_Point::test @ bci:0 (line 22)
LocalVar(16) NoEscape(NoEscape) [ 43 31P [ 171 ]] 48 CheckCastPP === 45 43 [[ 171 ]] Oop:Example1_Point$Point:NotNull:exact * !jvms: Example1_Point::test @ bci:0 (line 22)
LocalVar(18) NoEscape(NoEscape) [ 150 48 31P 126P [ 197b 192b ]] 171 Phi === 122 150 48 [[ 197 197 192 192 ]] Oop:Example1_Point$Point:NotNull:exact * !jvms: Example1_Point::test @ bci:29 (line 29)
I believe JDK-8287061 intends to make the NonEscaped/NSR objects like JO(4) eligible for scalar replacement. That's why I think Cesar's work will have synergy effect with PEA. C2PEA splits the original object into 2 or 3 objects, so C2 can distinct what are really escaped from what are not escaped. if JDK-8287061 can guarantee all those PHIs are processed, the original object is either scalar-replaced or removed. we would get away from partial redundancy issue!
I cherry-picked Cesar's patch to PEA_parser branch.
As expected, this patch picks up Phi 171 and unravel it. in this way, EA/SR manages to eliminate 31 Allocate eventually. we eliminate the partial redundancy
in this case.
here is the final result of C2 EA/SR.
Can reduce Phi 171 during invocation 0:
======== Connection graph for Example1_Point::test
invocation #0: 2 iterations and 0.000021 sec to build connection graph with 220 nodes and worklist size 21
JavaObject(4) NoEscape(NoEscape) [ 101F 95F 197F 192F [ 43 48 171 ]] 31 Allocate === 5 6 7 8 1 (29 27 28 1 1 1 11 12 1 14 1 1 ) [[ 32 33 34 41 42 43 ]] rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *, bool, top, bool ) Example1_Point::test @ bci:0 (line 22) Type:{0:control, 1:abIO, 2:memory, 3:rawptr:BotPTR, 4:return_address, 5:rawptr:NotNull} !jvms: Example1_Point::test @ bci:0 (line 22)
LocalVar(14) NoEscape(NoEscape) [ 31P [ 48 101b 95b ]] 43 Proj === 31 [[ 44 48 95 101 ]] #5 Type:rawptr:NotNull !jvms: Example1_Point::test @ bci:0 (line 22)
LocalVar(16) NoEscape(NoEscape) [ 43 31P [ 171 ]] 48 CheckCastPP === 45 43 [[ 171 ]] Oop:Example1_Point$Point:NotNull:exact * !jvms: Example1_Point::test @ bci:0 (line 22)
LocalVar(18) NoEscape(NoEscape) [ 150 48 31P 126P [ 197b 192b ]] 171 Phi === 122 150 48 [[ 197 197 192 192 ]] Oop:Example1_Point$Point:NotNull:exact * !jvms: Example1_Point::test @ bci:29 (line 29)
JavaObject(5) GlobalEscape(GlobalEscape) NSR [ 160F 155F 197F 192F [ 145 150 162 171 125 ]] 126 Allocate === 117 42 56 8 1 (29 27 28 1 1 1 11 12 1 14 1 1 ) [[ 129 130 131 143 144 145 ]] rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *, bool, top, bool ) Example1_Point::test @ bci:0 (line 22) Type:{0:control, 1:abIO, 2:memory, 3:rawptr:BotPTR, 4:return_address, 5:rawptr:NotNull} !jvms: Example1_Point::test @ bci:26 (line 26)
LocalVar(15) GlobalEscape(GlobalEscape) [ 126P [ 150 160b 155b ]] 145 Proj === 126 [[ 146 150 155 160 ]] #5 Type:rawptr:NotNull !jvms: Example1_Point::test @ bci:26 (line 26)
LocalVar(17) GlobalEscape(GlobalEscape) [ 145 126P [ 162 171 ]] 150 CheckCastPP === 147 145 [[ 171 162 ]] Oop:Example1_Point$Point:NotNull:exact * !jvms: Example1_Point::test @ bci:26 (line 26)
LocalVar(19) GlobalEscape(GlobalEscape) [ 150 126P [ 125 ]] 162 EncodeP === _ 150 [[ 163 ]] Type:narrowoop: Example1_Point$Point:NotNull:exact * !jvms: Example1_Point::test @ bci:26 (line 26)
LocalVar(18) NoEscape(NoEscape) [ 150 48 31P 126P [ 197b 192b ]] 171 Phi === 122 150 48 [[ 197 197 192 192 ]] Oop:Example1_Point$Point:NotNull:exact * !jvms: Example1_Point::test @ bci:29 (line 29)
Field(6) GlobalEscape(GlobalEscape) oop +112 ( 123P )[ 162 1P 126P [ ]] 125 AddP === _ 123 123 124 [[ 163 ]] Oop:java/lang/Class (java/io/Serializable,java/lang/constant/Constable,java/lang/reflect/AnnotatedElement,java/lang/invoke/TypeDescriptor,java/lang/reflect/GenericDeclaration,java/lang/reflect/Type,java/lang/invoke/TypeDescriptor$OfField) java.lang.Class {0x00000000fe3acef8} - klass: public final synchronized 'java/lang/Class' - ---- fields (total size 15 words): - private volatile transient 'classRedefinedCount' 'I' @12 0 (0x00000000) - abstract internal 'klass' 'J' @16 4295764992 (0x00000001000c2c00) - abstract internal 'array_klass' 'J' @24 0 (0x0000000000000000) - abstract internal 'oop_size' 'I' @32 15 (0x0000000f) - abstract internal 'static_oop_field_count' 'I' @36 1 (0x00000001) - private volatile transient 'cachedConstructor' 'Ljava/lang/reflect/Constructor;' @40 nullptr (0x00000000) - private transient 'name' 'Ljava/lang/String;' @44 "Example1_Point"{0x00000000fe3acfa8} (0xfe3acfa8) - private transient 'module' 'Ljava/lang/Module;' @48 a 'java/lang/Module'{0x00000000fe2e6f20} (0xfe2e6f20) - private final 'classLoader' 'Ljava/lang/ClassLoader;' @52 a 'jdk/internal/loader/ClassLoaders$AppClassLoader'{0x00000000fe2e6d60} (0xfe2e6d60) - private transient 'classData' 'Ljava/lang/Object;' @56 nullptr (0x00000000) - private transient 'packageName' 'Ljava/lang/String;' @60 ""{0x00000000fe00c6f8} (0xfe00c6f8) - private final 'componentType' 'Ljava/lang/Class;' @64 nullptr (0x00000000) - private volatile transient 'reflectionData' 'Ljava/lang/ref/SoftReference;' @68 a 'java/lang/ref/SoftReference'{0x00000000fe3ad178} (0xfe3ad178) - private volatile transient 'genericInfo' 'Lsun/reflect/generics/repository/ClassRepository;' @72 nullptr (0x00000000) - private volatile transient 'enumConstants' '[Ljava/lang/Object;' @76 nullptr (0x00000000) - private volatile transient 'enumConstantDirectory' 'Ljava/util/Map;' @80 nullptr (0x00000000) - private volatile transient 'annotationData' 'Ljava/lang/Class$AnnotationData;' @84 nullptr (0x00000000) - private volatile transient 'annotationType' 'Lsun/reflect/annotation/AnnotationType;' @88 nullptr (0x00000000) - transient 'classValueMap' 'Ljava/lang/ClassValue$ClassValueMap;' @92 nullptr (0x00000000) - abstract internal 'protection_domain' 'Ljava/lang/Object;' @96 a 'java/security/ProtectionDomain'{0x00000000fe3ac998} (0xfe3ac998) - abstract internal 'signers_name' 'Ljava/lang/Object;' @100 nullptr (0x00000000) - abstract internal 'source_file' 'Ljava/lang/Object;' @104 nullptr (0x00000000) - signature: LExample1_Point; - ---- static fields (1): - private static '_cache' 'LExample1_Point$Point;' @112 a 'Example1_Point$Point'{0x00000000fe3aeaf8} (0xfe3aeaf8):Constant:exact+112 * [narrow] !jvms: Example1_Point::test @ bci:26 (line 26)
Scalar 31 Allocate === 5 6 7 8 1 (29 27 28 1 1 1 11 12 1 14 1 1 ) [[ 32 33 34 41 42 43 ]] rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *, bool, top, bool ) Example1_Point::test @ bci:0 (line 22) Type:{0:control, 1:abIO, 2:memory, 3:rawptr:BotPTR, 4:return_address, 5:rawptr:NotNull} !jvms: Example1_Point::test @ bci:0 (line 22)
++++ Eliminated: 31 Allocate
Statement: PEA without passive materialization doesn't duplicate the allocation when C2 applies to JDK-8287061
Prove:
o denotes the original object: o0 = Allocate(Class).
-
if o0 never encounters an escaping point, PEA doesn't change o0. no duplication.
-
if o0 encounters an escaping point in a branch, PEA materializes it
o1 = Mat(Obj)
. o1 is Escape- o0 is dead before reaching to a merging point, C2 parse doesn't generate phi node for it. o0 is NonEscape and no phi node consumes it. Current C2 guarantees to perform scalar replacement for it. o0 is gone, so no duplication in the path that PEA materializes the object.
- o0 is live at the merging point. C2 parse creates a new revision of it due to SSA property. o2 = PHI(o0, o1). o0 is NonEscape and NSR in C2 EA. JDK-8287061 marks o2 as a reducible phi and conducts scalar replacement for o0. no duplication.
It's natural to extend this proof to multiple branches. As long as we retain SSA property, we can prove that the original allocation is either dead or get scalar replaced if PEA does materialize the object.
JDK-8287061 determines whether o2 is a reducible phi based on both inputs and outputs. it's possible that JDK-8287061 opts out phi reduction because the consumer of o2 is not recognizable. I think this problem is amendable. we can extend it to include all possible codeshape in the future.
With PEA, we clone the object at both the escaping point and the merging point.
We end up with 2 Allocations. One is for cond = true and the other is for cond = false.
Here is the ideal graph after optimizer. please note the Phi Node 164. It merges 2 objects for pt.length().
Because of this 164 PHI, EA marks that NonEscaped object
NSR
. It prevents ScalarReplacement from replacing it. Here is relevant message from -XX:+PrintEscapeAnalysis. JO(5) is from the Passive_Mat(pt) at line 9.