Table of Contents
- Memory Access Utilities
- Introduction
- Layers of abstraction in memory access utilities
- Typical usage in SIL passes
- TODO
- Extend SmallProjectionPath to handle object tail indices
- Use SmallProjectionPath in AccessStorage
- Use SmallProjectionPath in AccessPath
- Generalize findReferenceRoot
- Handle casts and summarize cast info
- Simplify ReleaseDevirtualizer
- Simplify LICM load splitting
- Simplify Loop Opt Passes
- Rewrite LoadStoreOptUtils
- Preserve access markers
- Fix and verify MemBehavior's isLetAddress query
- Eliminate complexity in SIL
- Appendix: Proposed Swift API
- Appendix: How AccessPath verification works
- Appendix: Structural SIL requirements and def-use patterns
The Swift compiler often maps SIL addresses to storage locations and reasons about relationships between addresses. Memory access in SIL is structured. Invariants govern the role of references, pointers, addresses, and access scopes. Code throughout the optimizer should handle memory access consistently and reliably. SIL analysis should not view SIL as an arbitrary def-use chain of operations, but as part of a structured access, typically of the form:
%root = alloc_ref $C
%base = ref_element_addr %root : $C, #C.prop
%access = begin_access [read] [static] %base : $*S
%address = struct_element_addr %access : $*S, #.field
%value = load [trivial] %address : $*Int64
end_access %access : $*S
This is a proposal for utility APIs and abstractions that all SIL passes should use when working with memory access. New optimization logic can be introduced by filling the covered-set of cases provided by the abstractions. This sharply contrasts with the failed strategy of guessing, and hard-coding, which SIL opcodes need to be handled at each point.
Much of this is has been implemented and proved its effectiveness. For example, a complete and self-checked AccessStorage utility was essential for finding issues with the load borrow checker. It has also been essential in bringing up many OSSA-based optimizations and move-only diagnostics.
AccessPath is documented in the SIL Programmers’ Manual http://github.com/apple/swift/blob/main/docs/SILMemoryAccess.md. It should continue to be extended to better handle escape analysis, alias analysis and load/store optimization passes.
TODO: After review, append this document to docs/SILMemoryAccess.md.
Each layer of abstraction that the utilities are built from is well-defined and independently useful. Client code should always use the lowest level of abstraction necessary for clarity.
Although each level of abstraction can be customized, it's meaning is unchanged by that customization. Customization only allows new concepts to be built on top of the more fundamental concepts. This is in sharp contrast to other utilities in the SIL code base, in which a fantastically complex black box can be customized by callbacks to perform much simpler tasks.
These three forms of SSA traversal are are at core of all SIL address analysis. It is the single source of truth for identifying the operations that participate in a use-def chain at each stage of access from the memory instruction's operand to the storage root.
AccessUseDefChainVisitor
-
finds the base address of a memory access
-
identifies the storage kind (box, stack, global, class, argument, etc.)
-
clients must handle:
visitBase
visitNonAccess
(eventually illegal)visitPhi
(in case of pointer_to_address casts)visitStorageCast
visitAccessProjection
findReferenceRoot
-
finds the root object for a reference (previously referred to as RC-identity)
-
this could be generalized as a visitor
AccessUseVisitor
-
clients only need to handle
visitUse
-
clients can distinguish between exact, inner, and overlapping uses
-
currently only used by these top-level APIs:
visitAccessStorageUses
,visitAccessPathUses
, andAccessPath::collectUses
-
traversal logic is encapsulated by an underlying AccessPathDefUseTraversal. This could be exposed for other customizations.
The next level of memory access utilities are built on top of these SSA traversal visitors to handle several common use cases...
A compact unique ID for each storage object. For any SIL address, uniquely identify the storage kind and either base address or storage root.
Provides structural SIL verification. The access base must be reliably obtained from any SIL address:
-
to verify that access markers are correctly emitted by SILGen and correctly preserved across SIL transformations.
-
for complete diagnostics. For example, "load borrow" checking was simply "bailing out" in many cases before it was rewritten to use AccessedStorage. This means we were simply failing to diagnose ownership violations.
-
for IRGen optimization based on address origin, such as bitfield compaction
-
for reliable static access enforcement of locals. I don’t want the introduction of new SIL patterns involving addresses to silently result in emitting dynamic runtime checks for local variables where they used to be analyzed away.
-
to allow much simpler analysis in general, especially where completeness is required. There are many examples, but consider AccessEnforcementWMO. This currently implementation is trivial, based on recognizing access markers and relying on AccessedStorage.
(This API is currently over-complicated by the fact that a single global base can have multiple base addresses (because of global_addr cloning).
Nothing more than a composition of AccessStorage
and projection path for identifying more precise memory locations.
Both AccessStorage
and AccessPath
are minimal representations because they are intended to be stored in data structures as unique IDs. However, clients typically want to preserve information that was obtained during the use-def traversal in addition to the unique identifier. We currently have other "wrapper" classes that deliver more information to clients:
AccessStorageWithBase
preserves the base address if it is unique.
RelativeAccessStorageWithBase
preserves the original memory address and some cast information. This should probably be renamed AccessPathInfo
and generalized to include all potentially useful information: projection path, along with information about the kind of casts that occurred on that path.
AccessUseVisitor
is def-use traversal interface on top of AccessPathDefUseTraversal
.
UniqueStorageUseVisitor
extends AccessUseVisitor
to handle most use cases without any opcode matching.
AccessPath::collectUses
surfaces this logic is a minimal API.
AccessPathVerification
uses this API both to test internal AccessPath logic and to provide structural SIL verification for all storage uses. Many SIL utilities rely on the property that the use-def traversal is consistent with a corresponding def-use traversal.
For any reference, this attempts to find a common uniquely identified object, such as an alloc_ref or argument. Although it is a best effort analysis, it is important for this to be as consistent throughout the pipeline and stable across SIL changes as possible. Failing to find a uniquely identified object is found can block critical optimizations. This is complicated by phi analysis, which is important to
-
ensures that optimizations based on AccessedStorage can't be arbitrarily defeated by CFG transforms that simply clone blocks.
-
improves verification of the memory access utility themselves. Particularly "known use" verification.
Note: this has API has not yet been extended to a customizable abstraction. See "Generalize findReferenceRoot" below.
Escape analysis builds a chain of access paths across multiple objects. Then it propagations information across the SSA use-def chain using that "projection chain" as a guide.
Escape analysis only adds the following two concepts on top of the existing abstractions:
-
Projection chains
-
Alternative use-def and def-use traversals based on the same projection chain
Escape analysis chains multiple accesses together into a single projection chain using a single SmallProjectionPath instance. The "class projection" component links the projection path of each object.
For example, the path s0.c2.e1
spans two objects. The first object is a struct that holds a class reference. s0
projects a class reference out of the struct field #0. c2.e1
resolves the level of indirection between the class reference and the referenced object, then within that object projects the stored property #2 and enum payload #1.
Building a projection chain can be accomplished by directly chaining multiple calls to the fundamental SSA traversal visitors. It's only additional purpose is to add the following logic:
-
at a store, trigger a use-def traversal
-
at an either an access base or reference root for classes, trigger a def-use traversal
-
Cache each traversal to avoid backtracking when changing direction
Multiple variations customize the escape analysis utility:
-
Function effect escape summary
-
Object reference escape analysis
-
Address exposure analysis
-
A specific operation triggers the optimization. Grab its memory address.
-
Analyze storage: AccessStorage::compute(address)
-
Analyze all uses of the storage: UniqueStorageUseVisitor(storage)
See the SSADestroyHoisting
pass.
Legacy SIL passes could be drastically simplified by following this pattern.
In AccessPath
, this is referred to as the path "offset" (as opposed to the subobject index). Offsets must only occur at the head of the projection path or alias analysis is fundamentally broken.
findReferenceRoot (and by extension, AccessStorage::compute) should handle composition followed by projection, by popping the projection path as it goes:
%r = alloc_ref // root
%s = struct (%_, %r, ...)
%f = struct_extract %s, #1
%a = ref_element_addr %f // base
Simplify AccessPathDefUseTraversal
by popping
This, in turn makes UniqueStorageUseVisitor efficient.
(An exercise for the implementer: determine which logic to duplicate in Swift and C++ vs. expose via bridging.)
Combine these variations under a single abstraction:
- findOwnershipReferenceRoot
- findGuaranteedReferenceRoots
- findOwnershipReferenceAggregate
"Look through" composition of roots when the user provides a projection path.
Replace all RC-identity operations with this.
Currently, it is far too easy for clients of utilities to assume that an access path has no cast operations.
Summarize cast information along side SmallProjectionPath. Possibly in an AccessPathInfo
. Distinguish layout compatible casts, ownership preserving casts, etc. Verify no illegal casts.
This is a classic example of the need for AccessStorage::compute().getRoot()
.
Use an abstraction over SmallProjectionPath (converting AccessPath to SmallProjectionPath solves this). Simplify comparing load and store paths and rematerializing projections based on the cached projection kind.
COWArrayOpt and ArrayPropertyOpt should use the standard pattern of:
storage = AccessPath::compute()
UniqueStorageUseVisitor(storage)
Use the new abstractions to redesign dead store and redundant load elimination passes.
Markers should be preserved throughout the SIL pipeline.
-
Markers need to be supported throughout the pipeline anyway for checks that can't be eliminated
-
Removing most but not all of the markers hides many optimizer bugs where access scopes are improperly handled.
-
Preserving markers allows verification that exclusivity checks aren't accidentally dropped.
-
Preserving markers allows stronger optimization exclusivity checks later in the pipeline
-
Preserving markers allows much stronger and more efficient alias analysis. Aliasing only needs to check whether the other memory operation is within the same access scope.
-
Preserving marker will solve a major source of complexity in the design and interface of memory utilities. The complexity stems from not having a consistent SIL representation throughout the pipeline. The utilities current need to handle SIL with or without access markers. Sometimes the client wants to "see-through" the markers, sometimes it wants to find the markers, etc.
MemBehavior logic (used by SILInstruction::mayWriteToMemory
) assumes
that a valid AccessBase exists for every store that initializes a
'let' variables (see isLetAddress()
. But this was never checked. I
believe the current logic can cause miscompiles. Note that property
and type wrappers use initialization functions.
A complete fix requires that all addresses have a valid AccessBase.
To fix MemBehavior:
-
narrowly identify all the kinds of instructions that can initialize a variable (lets call this InitializeAddress)
-
identify the address operand that is written to
-
teach MemoryBehavior that any instruction other than InitializeAddress cannot write to a 'let' address
-
teach MemoryBehavior that any InitializeAddress instruction cannot write to a 'let' address if its AddressBase is a non-let.
Longer term:
-
generate and preserve
begin_access [init]
for all let initialization -
verify that only an InitializeAddress can use an address produced by a begin_access [init]
-
verify that that
begin_access [init]
initializes a let once on all paths
In addition to access markers inconsistency, there are other complications:
-
Storage casts (changing the address type) are currently allowed inside a formal access. It should be trivial to recover the access scope from a memory operation. But instead, there are potential phis and changes to the address type!
-
When the storage base is valid but unidentified, we currently allow different edges of a pointer phi to project different paths. This means we could have a valid AccessedStorage without a valid AccessPath.
-
We currently allow a 'load' instruction to take part in an address projection: unchecked_take_enum_data_addr -> load -> project_box. This makes the implementation quite confusing.
-
KeyPath generation incorrectly generates a pointer_to_address
$UnsafePointer to $ *X without the necessary struct_extract.
Strawman AccessPath API (the details will change). These typed wrappers allow building type-safe APIs with assertions in the right places.
enum AccessStorageKind {
case Box
case Stack
case Global
case Class
case Tail
case Argument
case Yield
case Nested
case Pointer
}
// Every SIL address has a single valid base. It is the origin of the
// address--the first address type that appears in a def-use walk. It identifies
// the kind of storage being accessed. The kind of storage must be known for
// optimizer and code generation correctness.
// The base address is never inside an access scope.
struct AccessBase {
var baseAddress: Value
init(forAddress: Value)
var storageKind: AccessStorageKind { get }
// non-nil for Box, Class, and Tail
var reference: Value? { get }
var isLet: Bool { get }
var isLocal: Bool { get }
var isUniquelyIdentified: Bool { get }
func isDistinct(from: AccessBase) -> Bool
}
enum AccessKind {
case Read
case Modify
func mayConflict(with: AccessKind) -> Bool
}
struct AccessScope {
// begin_access or begin_access_unpaired
var beginAccess: Instruction
var accessedAddress: Value { get }
var accessKind: AccessKind { get }
var scopeEndingInstructions: IteratorProtocol<Instruction> { get }
}
// Find the enclosing access scope or base address if there is no enclosing
// access scope.
enum EnclosingScope {
case scope(AccessScope)
case base(AccessBase)
init(forAddress: Value)
}
// For any SIL address, identify both the base and the field within the
// storage identified by the base. This provides both the kind of storage and
// location within that storage.
struct AccessPath {
var base: AccessBase
// address projections only
var projectionPath: SmallProjectionPath
init(forAddress: Value)
func isDistinct(from: AccessPath) -> Bool
}
// The origin of a reference.
struct AccessRoot {
var root: Value
// Sequence of: (value extractions, class projection, address projections)
// in that order. Only one class projection is allowed.
var projectionPath: SmallProjectionPath
}
// For any SIL address, identify a common reference root, base address, and
// field.
struct AccessStorage {
// valid for Box, Class, and Tail
var commonRoot: AccessRoot?
var base: AccessBase
init(forAddress: Value)
func isDistinct(from: AccessStorage) -> Bool
}
extension AccessBase {
// `forProjection` contains address projections only
//
// Returns a non-empty list for Box, Class, and Tail
func findRoots(forProjection: SmallProjectionPath?)
-> IteratorProtocol<AccesPath>
}
extension AccessPath {
// Returns a non-empty list for Box, Class, and Tail
func findRoots() -> IteratorProtocol<AccessRoot> {
base.findRoots(forProjection: projectionPath)
}
}
// `projectionPath` is the path from the address base to the use.
// If this is the use of a root, then the full path can be derived by appending
// this path to the root's extraction path and class projection.
struct AccessUse {
var scope: EnclosingScope
var projectionPath: SmallProjectionPath
var operand: Operand
}
extension AccessBase {
// `overlapping` contains address projections only
//
// Compare the resulting AccessPath to determine whether the use exactly
// matches `overlapping`, is contained by `overlapping`, or may overlap in
// an unknown way.
//
// Symmetric with init(forAddress:). `self` is the base of each use.
// (AccessPathVerification checks this on the C++ side.)
func findUses(overlapping: SmallProjectionPath? = nil)
-> IteratorProtocol<AccessUse>
// `matching` contains address projections only
func findUses(matching: SmallProjectionPath) -> IteratorProtocol<AccessUse>
}
extension AccessPath {
func findOverlappingUses() -> IteratorProtocol<AccessUse> {
base.findUses(overlapping: projectionPath)
}
func findMatchingUses() -> IteratorProtocol<AccessUse> {
base.findUses(matching: projectionPath)
}
}
// Symmetric with `findRoots`. `self` is one of the roots of each use.
extension AccessRoot {
func findUses(overlapping: SmallProjectionPath? = nil)
-> IteratorProtocol<AccessUse>
func findUses(matching: SmallProjectionPath) -> IteratorProtocol<AccessUse>
}
extension AccessStorage {
func findOverlappingUses() -> IteratorProtocol<AccessUse> {
root.findUses(overlapping: projectionPath)
}
func findMatchingUses() -> IteratorProtocol<AccessUse> {
root.findUses(matching: projectionPath)
}
}
AccessPath can be used to identify any addressible location without keeping a reference to the original address value. An address projection can always be rematerialized as long as its enclosing access scope is known.
Class properties are not part of the projection path in AccessPath. They are identified by the baseAddress instruction (ref_element_addr
). Unlike address projections, a ref_element_addr
cannot be rematerialized directly from the root. The root may have a different object type. The root may also be within a different SIL "scope".
PROBLEM: stored properties don't currently have unique indices. So summarizing class property info in SmallProjectionPath won't work.
AccessStorage has a unique identity for any set of accesses which must alias.
PROBLEM: if ref_element_addr
is part of AccessStorage identity, then we can't tell that multiple accesses to the same property must alias just by the identity.
The model for structurally valid SIL is expressed by the def-use and use-def visitors. The AccessPath type is just a convenient/efficient interface that computes itself using those visitors, producing a value that can be easily checked. It's mainly a wrapper around AccessedStorage, which has the key information for verification.
checkAccessedAddress() verifies that every memory access is within a formal scope. It currently lives in DiagnoseStaticExclusivity but it was supposed to be moved to SIL verification as soon as possible. It's currently hidden underneath -enable-verify-exclusivity, but we should run it by default at least whenever SIL is in OSSA form.
identifyFormalAccess() is a lesser form of verification. It only checks that access markers themselves have recognizable AccessedStorage. This is the bare minimum to check the assumptions made by exclusivity optimization. The correctness of access marker optimization depends on that verification being fully enabled. The SILVerifier currently calls identifyFormalAccess() to make sure it doesn't assert, but the verification itself is disabled!
The AccessPathVerification pass performs round-trip verification. This is a self-check that the AccessPath verification itself is complete. It also checks the def-use logic (as opposed to only the use-def logic), and it checks consistency of use-def and def-use logic. It runs a couple of times in the non-release pipeline.
SIL verification relies on the memory access utilities to enforce valid SIL. Any pattern that the SSA traversals can't handle is illegal SIL by definition. Verification guaranteed that:
-
Every address-type SILValue is associated with a either a Swift formal access to a recognized variable or an explicitly "unsafe" compiler-generated access. In short, we need to reliably recognize the access scope and storage being accessed.
-
Every address-type SILValue has exactly one valid AccessedStorage. A phi that originates in two non-identical AccessPaths is illegal.
-
If a SILValue's AccessedStorage is identified (not pointer loaded from UnsafePointer for example) then all aliasing memory access in the entire program has the same identity after resolving function arguments.
-
An AccessPath cannot index into a subobject. e.g. struct_element_addr -> index_addr is illegal.
TODO: I'm certainly missing some other rules.
Memory access utilities are currently more permissive than they should be because it will take more time to fix all SIL producers that currently generate problematic patterns. If the rules for valid SIL can be simplified, then the implementation of AccessPath will also be simplified.
For example, phi's of type RawPointer are currently allowed which leads to all sorts of problematic patterns. The same problem occurs with box-type phis.
Examples of SIL that cannot currently be allowed:
Address phi
bb1:
%1 = ref_element_addr %0 : $C, #C.prop1
br bb3(%1 : $*Int64)
bb2:
%2 = ref_element_addr %0 : $C, #C.prop2
br bb3(%2 : $*Int64)
bb3(%phi : $*Int64):
%3 = begin_access [read] %phi : $*Int64
pointer (or box) phi to distinct storage
bb1:
%1 = ref_element_addr %0 : $C, #C.prop1
%2 = address_to_pointer %2 : $*Int64 to $Builtin.RawPointer
br bb3(%2 : $Builtin.RawPointer)
bb2:
%4 = ref_element_addr %0 : $C, #C.prop2
%5 = address_to_pointer %4 : $*Int64 to $Builtin.RawPointer
br bb3(%2 : $Builtin.RawPointer)
bb3(%phi : $Builtin.RawPointer):
%7 = pointer_to_address %3 : $Builtin.RawPointer to $*Int64
%8 = begin_access [read] %7 : $*Int64
pointer (or box) phi to distinct named subobjects
bb0:
%1 = begin_access [read] %0 : $*S
bb1:
%2 = struct_element_addr %1 : $*S, #S.field1
%3 = address_to_pointer %2 : $*Int64 to $Builtin.RawPointer
br bb3(%3 : $Builtin.RawPointer)
bb2:
%4 = struct_element_addr %1 : $*S, #S.field2
%5 = address_to_pointer %4 : $*Int64 to $Builtin.RawPointer
br bb3(%3 : $Builtin.RawPointer)
bb3(%phi : $Builtin.RawPointer):
%7 = pointer_to_address %3 : $Builtin.RawPointer to $*Int64
%8 = load %7 : $*Int64
phi cycle with non-index projection
bb1(%phi : $Builtin.RawPointer):
cond_br %cond1, bb2, bb3
bb2:
%1 = pointer_to_address %phi : $Builtin.RawPointer to $*S
%2 = struct_element_addr %phi : $S, #S.field1
%3 = address_to_pointer %2 : $*Int64 to $Builtin.RawPointer
br bb4(%3 : $Builtin.RawPointer)
bb3:
br bb4(%phi : $Builtin.RawPointer)
bb4(%merge : $Builtin.RawPointer)
%5 = load %merge : $*Builtin.Int64
cond_br %cond2, bb1, bb5
bb5:
Note that phi cycles with index projections are allowed.
Examples of SIL that should not be allowed but currently are allowed until verification is improved:
missing access scope
%1 = ref_element_addr %0 : $C, #C.prop
%2 = load %1 : $*Int64
We could introduce the concept of implicit access scopes for specific patterns, but that concept doesn't exist yet.
pointer cast of a named property within an access
%1 = ref_element_addr %0 : $C, #C.prop
%2 = address_to_pointer %1 : $*Int64 to $Builtin.RawPointer
%3 = pointer_to_address %2 : $*Int64 to $Builtin.RawPointer
%4 = begin_access [read] %3 : $*Int64
%5 = load %4 : $*Builtin.Int64
TODO: We could fill out these examples with many more cases. We should show some examples of valid/invalid address or pointer type function arguments.
@eeckstein thanks for reading. I'm not sure from your response whether you got all the way to the TODO section.
The job of the SIL verifier is to check the structural rules of SIL. It is deeply important to me that any assumptions made by passes are enforced by the SIL verifier.
(although some non-structural rules are actually enforced earlier by asserts in the physical SIL representation or the SIL builder)
SIL has many structural rules. The ones proposed here are no different. There are really only two big broad rules:
all memory operations are part of an access scope
all access scopes have a recognized base address, which indicates
the kind of storage being accessed
We may need add some other small rules, like legal positions for index_addr.
There's no separate verification for this. Just normal SIL verification.
I think the confusion comes from
AccessUseDefChainVisitor
being both a utility and used during verification.I have been thinking of everything in the /SIL directory as part of SIL and necessary for SIL verification, but that isn't true. Some of it is just fundamental abstraction "on top" of SIL as you say.
MemAccessUtils should probably be split into two pieces:
Bucket #1 (SIL/Verifier). a small amount of logic that is required enforce SIL invariants
Bucket #2 (SIL/Utils). other related logic that we want to be consistent across all SIL passes, and that is self-verified, but is not necessary to determine whether SIL itself is legal
In bucket #2 would be
all the findReferenceRoot related logic (which should itself be a
customizable visitor--like
walkUp
, but without any addresses orpointers.
all the phi analysis logic for finding a single common root
all the projection path book-keeping (which should be converted to SmallProjectionPath)
all the find-all-known-uses-or-bail logic
Structural rules are rules that are not confined to a single operation. SIL has many of these. All rules involving control flow or block arguments are structural. As are all rules involving scopes, stack-based operations, and lifetimes.
Of course, there are multiple ways to represent the same semantics. But fusing opcodes and making semantics implicit within an operation does not make those semantics more a part of SIL.
That might be a good idea, but it isn't necessary for defining the invariant.
(also note that it's not that simple because you'll need to invent new opcodes for unsafe pointers)
We can make an example of that for now, since I do consider that outrageous. But I'm actually referring to a repeated practice on the C++ side that I'd like to not see repeated on the Swift side.
That's fine. It's the concepts and requirements that are important, not the implementation.
It hides these things, which makes it impossible to understand the algorithm at a conceptual level. I was trying to reverse engineer the meaning of EscapeInfo for a week. I'm still in the middle of writing a design document to try to make sense of it. If the implementation were composed from smaller utilities that each modeled a well-understood concept, then this wouldn't be a problem.
It's impossible to begin to understand what
isEscaping
means without talking about the difference between addresses and values.It's impossible to begin to understand how to work with EscapeInfo without undersanding that each class projection is a level of indirection between objects.
That's a meaningless comparison. I could implement a collectUses in minutes, just like EscapeInfo, that doesn't support literally every SIL pattern that we've ever seen, isn't round-trip verified against the use-def abstraction, doesn't support partial path matches, doesn't support index_addr, and isn't attempting to map back and forth between path vectors and a prefix trie. Almost all the implementation complexity is the prefix trie mapping and partial paths.
In fact, we have many versions of exactly that scattered throughout the C++. The point is that we need one version, which should meet all of the requirements above, however it is implemented.
I cannot disagree with this strongly enough. EscapeInfo adds two features to a basic SSA traversal, both are nonsense in any other context where we otherwise need exactly the same SSA traversal. This is all explained above--
Incremental Escape Analysis--in fact, the main purpose of this document was to explain that.
AccessStorage was generalized about a year ago so that it is not specific to exclusivity. Since then, it seems to be the perfect utility for all the SIL analyses that we've needed. It has really saved us on countless occasions now.
The AccessPath interface is ideal for replacing legacy passes: LICM, RLE, DSE except for a couple problems:
our alias analysis interface doesn't accept it yet
manipluating the path requires converting between trie nodes and vectors
I never said EscapeInfo should use the AccessPath interface. Here is what should happen:
They should both use the same SSA traversal logic. That logic should be expressed as two independent visitors. One for reference-containing values. One for addresses and pointers.
Those visitors should be built on low level SIL abstractions, like for example: ProjectionInstruction, AddressCastInstruction, StorageCastInstruction. The customizations can easily intercept any particular opcode, but should almost never need to in practice.
None of that functionally changes how EscapeInfo builds its projection chain. Each step in the algorithm proceeds the same way. However, the code would be readable and the algorithm understandable. Logic that deals with address projections would be naturally separate from logic that composes references. The customization callbacks should not change the algorithm. They should only perform extra book-keeping and feedback whether to continue processing.
Yes, of course. That's exactly what I'm trying to work toward. Except that the abstractions we have aren't quite the ones we want. We don't need so many, just a few useful ones. I'm dying to replace all the "mixin" classes with a single, simple ForwardingInstruction.