Memory Access Utilities
(by @atrick; last active October 10, 2022)

Introduction

The Swift compiler often maps SIL addresses to storage locations and reasons about relationships between addresses. Memory access in SIL is structured. Invariants govern the role of references, pointers, addresses, and access scopes. Code throughout the optimizer should handle memory access consistently and reliably. SIL analysis should not view SIL as an arbitrary def-use chain of operations, but as part of a structured access, typically of the form:

  %root    = alloc_ref $C
  %base    = ref_element_addr %root : $C, #C.prop
  %access  = begin_access [read] [static] %base : $*S
  %address = struct_element_addr %access : $*S, #S.field
  %value   = load [trivial] %address : $*Int64
  end_access %access : $*S
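
For orientation, Swift source along the following lines (names chosen to match the SIL above; the exact lowering depends on SILGen and the pipeline stage) produces roughly this access sequence:

// Illustrative source only; the SIL above is what the optimizer actually sees.
struct S {
  var field: Int64
}

final class C {
  var prop = S(field: 0)
}

// Reading c.prop.field gives rise to the ref_element_addr / begin_access /
// struct_element_addr / load sequence shown above.
func read(_ c: C) -> Int64 {
  return c.prop.field
}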

This is a proposal for utility APIs and abstractions that all SIL passes should use when working with memory access. New optimization logic can be introduced by filling in the covered set of cases provided by the abstractions. This sharply contrasts with the failed strategy of guessing and hard-coding which SIL opcodes need to be handled at each point.

Much of this has already been implemented and has proved its effectiveness. For example, a complete and self-checked AccessStorage utility was essential for finding issues with the load borrow checker. It has also been essential in bringing up many OSSA-based optimizations and move-only diagnostics.

AccessPath is documented in the SIL Programmers’ Manual http://github.com/apple/swift/blob/main/docs/SILMemoryAccess.md. It should continue to be extended to better handle escape analysis, alias analysis and load/store optimization passes.

TODO: After review, append this document to docs/SILMemoryAccess.md.

Layers of abstraction in memory access utilities

Each layer of abstraction that the utilities are built from is well-defined and independently useful. Client code should always use the lowest level of abstraction necessary for clarity.

Although each level of abstraction can be customized, its meaning is unchanged by that customization. Customization only allows new concepts to be built on top of the more fundamental concepts. This is in sharp contrast to other utilities in the SIL code base, in which a fantastically complex black box can be customized by callbacks to perform much simpler tasks.

I. SSA traversal

These three forms of SSA traversal are at the core of all SIL address analysis. They are the single source of truth for identifying the operations that participate in a use-def chain at each stage of access, from the memory instruction's operand to the storage root.

AccessUseDefChainVisitor

  • finds the base address of a memory access

  • identifies the storage kind (box, stack, global, class, argument, etc.)

  • clients must handle: visitBase, visitNonAccess (eventually illegal), visitPhi (in the case of pointer_to_address casts), visitStorageCast, and visitAccessProjection
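
As an illustration only, that client contract could be rendered as a Swift protocol roughly like the following (the real visitor lives in C++, in MemAccessUtils; these names and signatures are invented for exposition):

// Invented Swift rendering of the client contract above; not an existing API.
protocol AccessUseDefChainVisiting {
  associatedtype Result

  // The walk ended at a recognized base address of the given storage kind.
  func visitBase(_ base: Value, kind: AccessStorageKind) -> Result
  // The walk ended at something that is not a recognized access (eventually illegal SIL).
  func visitNonAccess(_ value: Value) -> Result
  // The walk reached a phi (possible in the case of pointer_to_address casts).
  func visitPhi(_ phi: Value) -> Result
  // A cast that changes the address or pointer representation but not the storage.
  func visitStorageCast(_ cast: Value, operand: Operand) -> Result
  // An address projection within the accessed storage.
  func visitAccessProjection(_ projection: Value, operand: Operand) -> Result
}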

findReferenceRoot

  • finds the root object for a reference (previously referred to as RC-identity)

  • this could be generalized as a visitor

AccessUseVisitor

  • clients only need to handle visitUse

  • clients can distinguish between exact, inner, and overlapping uses

  • currently only used by these top-level APIs: visitAccessStorageUses, visitAccessPathUses, and AccessPath::collectUses

  • traversal logic is encapsulated by an underlying AccessPathDefUseTraversal. This could be exposed for other customizations.

The next level of memory access utilities is built on top of these SSA traversal visitors to handle several common use cases...

II. AccessStorage

A compact, unique ID for each storage object. For any SIL address, it uniquely identifies the storage kind and either the base address or the storage root.

Provides structural SIL verification. The access base must be reliably obtained from any SIL address:

  • to verify that access markers are correctly emitted by SILGen and correctly preserved across SIL transformations.

  • for complete diagnostics. For example, "load borrow" checking was simply "bailing out" in many cases before it was rewritten to use AccessedStorage. This means we were simply failing to diagnose ownership violations.

  • for IRGen optimization based on address origin, such as bitfield compaction

  • for reliable static access enforcement of locals. I don’t want the introduction of new SIL patterns involving addresses to silently result in emitting dynamic runtime checks for local variables where they used to be analyzed away.

  • to allow much simpler analysis in general, especially where completeness is required. There are many examples, but consider AccessEnforcementWMO. Its current implementation is trivial, based on recognizing access markers and relying on AccessedStorage.

(This API is currently over-complicated by the fact that a single global can have multiple base addresses, because of global_addr cloning.)
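
As a sketch of the intended client-facing shape (pseudocode against the strawman Swift API in the appendix below; none of these names are final), a must-not-alias query reduces to comparing storage identities:

// Pseudocode against the strawman API from the appendix; not an existing API.
func provablyDistinct(_ addr1: Value, _ addr2: Value) -> Bool {
  let storage1 = AccessStorage(forAddress: addr1)
  let storage2 = AccessStorage(forAddress: addr2)
  // If the storage identities are known to be distinct, no projection of one
  // can alias any projection of the other.
  return storage1.isDistinct(from: storage2)
}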

III. AccessPath

Nothing more than a composition of AccessStorage and a projection path for identifying more precise memory locations.

Both AccessStorage and AccessPath are minimal representations because they are intended to be stored in data structures as unique IDs. However, clients typically want to preserve information that was obtained during the use-def traversal in addition to the unique identifier. We currently have other "wrapper" classes that deliver more information to clients:

AccessStorageWithBase preserves the base address if it is unique.

RelativeAccessStorageWithBase preserves the original memory address and some cast information. This should probably be renamed AccessPathInfo and generalized to include all potentially useful information: projection path, along with information about the kind of casts that occurred on that path.

IV. AccessUseVisitor customization

AccessUseVisitor is a def-use traversal interface on top of AccessPathDefUseTraversal.

UniqueStorageUseVisitor extends AccessUseVisitor to handle most use cases without any opcode matching.

AccessPath::collectUses surfaces this logic in a minimal API.

AccessPathVerification uses this API both to test internal AccessPath logic and to provide structural SIL verification for all storage uses. Many SIL utilities rely on the property that the use-def traversal is consistent with a corresponding def-use traversal.

V. findReferenceRoot

For any reference, this attempts to find a common uniquely identified object, such as an alloc_ref or argument. Although it is a best-effort analysis, it is important for this to be as consistent throughout the pipeline and as stable across SIL changes as possible. Failing to find a uniquely identified object can block critical optimizations. This is complicated by phi analysis, which is important because it:

  • ensures that optimizations based on AccessedStorage can't be arbitrarily defeated by CFG transforms that simply clone blocks.

  • improves verification of the memory access utilities themselves, particularly "known use" verification.

Note: this API has not yet been extended to a customizable abstraction. See "Generalize findReferenceRoot" below.

VI. Incremental escape analysis utility

Escape analysis builds a chain of access paths across multiple objects. Then it propagates information across the SSA use-def chain using that "projection chain" as a guide.

Escape analysis only adds the following two concepts on top of the existing abstractions:

  1. Projection chains

  2. Alternative use-def and def-use traversals based on the same projection chain

Escape analysis chains multiple accesses together into a single projection chain using a single SmallProjectionPath instance. The "class projection" component links the projection path of each object.

For example, the path s0.c2.e1 spans two objects. The first object is a struct that holds a class reference. s0 projects a class reference out of the struct field #0. c2.e1 resolves the level of indirection between the class reference and the referenced object, then within that object projects the stored property #2 and enum payload #1.
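
For illustration, s0.c2.e1 could describe the following invented Swift declarations, where field #0 of the struct holds the class reference, stored property #2 of the class is an enum, and the payload of interest belongs to case #1:

// Invented types for the s0.c2.e1 example; indices refer to declaration order.
enum E {
  case empty           // case #0
  case value(Int64)    // case #1: the payload selected by "e1"
}

final class C {
  var a = 0.0          // property #0
  var b = 0.0          // property #1
  var c = E.empty      // property #2: selected by "c2"
}

struct S {
  var ref: C           // field #0: the class reference selected by "s0"
  var other = 0
}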

Building a projection chain can be accomplished by directly chaining multiple calls to the fundamental SSA traversal visitors. Its only additional purpose is to add the following logic:

  • at a store, trigger a use-def traversal

  • at either an access base or a reference root (for classes), trigger a def-use traversal

  • Cache each traversal to avoid backtracking when changing direction

Multiple variations customize the escape analysis utility:

  • Function effect escape summary

  • Object reference escape analysis

  • Address exposure analysis

Typical usage in SIL passes

  1. A specific operation triggers the optimization. Grab its memory address.

  2. Analyze storage: AccessStorage::compute(address)

  3. Analyze all uses of the storage: UniqueStorageUseVisitor(storage)

See the SSADestroyHoisting pass.

Legacy SIL passes could be drastically simplified by following this pattern.
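
A sketch of that three-step pattern, written as pseudocode against the strawman Swift API proposed in the appendix (today's entry points are the C++ AccessStorage::compute and UniqueStorageUseVisitor):

// Pseudocode against the strawman API; `process` is a placeholder for
// whatever the optimization does with each use.
func tryOptimize(triggerAddress address: Value) {
  // 1.-2. The triggering operation supplies a memory address; analyze its storage.
  let path = AccessPath(forAddress: address)
  // 3. Visit every use that may overlap that storage location.
  var uses = path.findOverlappingUses()
  while let use = uses.next() {
    // Comparing use.projectionPath with path.projectionPath distinguishes
    // exact, contained, and unknown-overlap uses.
    process(use)
  }
}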

TODO

Extend SmallProjectionPath to handle object tail indices

In AccessPath, this is referred to as the path "offset" (as opposed to the subobject index). Offsets must only occur at the head of the projection path or alias analysis is fundamentally broken.

Use SmallProjectionPath in AccessStorage

findReferenceRoot (and by extension, AccessStorage::compute) should handle composition followed by projection, by popping the projection path as it goes:

  %r = alloc_ref             // root
  %s = struct (%_, %r, ...)
  %f = struct_extract %s, #1
  %a = ref_element_addr %f   // base
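
For context, Swift source roughly like the following (names invented) gives rise to that compose-then-project pattern, where the reference root is wrapped into a struct value before being projected again:

// Invented example: the class reference is composed into a struct value and
// later extracted again before forming the element address.
final class C {
  var x: Int64 = 0
}

struct Wrapper {
  var tag: Int64
  var ref: C           // field #1, matching the struct_extract ..., #1 above
}

func readThroughWrapper() -> Int64 {
  let w = Wrapper(tag: 0, ref: C())   // alloc_ref is the root
  return w.ref.x                      // ref_element_addr is the base
}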

Use SmallProjectionPath in AccessPath

Simplify AccessPathDefUseTraversal by popping the projection path as it walks. This, in turn, makes UniqueStorageUseVisitor efficient.

(An exercise for the implementer: determine which logic to duplicate in Swift and C++ vs. expose via bridging.)

Generalize findReferenceRoot

Combine these variations under a single abstraction:

  • findOwnershipReferenceRoot
  • findGuaranteedReferenceRoots
  • findOwnershipReferenceAggregate

"Look through" composition of roots when the user provides a projection path.

Replace all RC-identity operations with this.

Handle casts and summarize cast info

Currently, it is far too easy for clients of utilities to assume that an access path has no cast operations.

Summarize cast information alongside SmallProjectionPath, possibly in an AccessPathInfo. Distinguish layout-compatible casts, ownership-preserving casts, etc. Verify no illegal casts.

Simplify ReleaseDevirtualizer

This is a classic example of the need for AccessStorage::compute().getRoot().

Simplify LICM load splitting

Use an abstraction over SmallProjectionPath (converting AccessPath to SmallProjectionPath solves this). Simplify comparing load and store paths and rematerializing projections based on the cached projection kind.

Simplify Loop Opt Passes

COWArrayOpt and ArrayPropertyOpt should use the standard pattern of:

storage = AccessPath::compute()
UniqueStorageUseVisitor(storage)

Rewrite LoadStoreOptUtils

Use the new abstractions to redesign dead store and redundant load elimination passes.

Preserve access markers

Markers should be preserved throughout the SIL pipeline.

  • Markers need to be supported throughout the pipeline anyway for checks that can't be eliminated

  • Removing most but not all of the markers hides many optimizer bugs where access scopes are improperly handled.

  • Preserving markers allows verification that exclusivity checks aren't accidentally dropped.

  • Preserving markers allows stronger optimization of exclusivity checks later in the pipeline

  • Preserving markers allows much stronger and more efficient alias analysis. Aliasing only needs to check whether the other memory operation is within the same access scope.

  • Preserving markers will remove a major source of complexity in the design and interface of the memory utilities. The complexity stems from not having a consistent SIL representation throughout the pipeline. The utilities currently need to handle SIL with or without access markers. Sometimes the client wants to "see through" the markers, sometimes it wants to find the markers, etc.

Fix and verify MemBehavior's isLetAddress query

MemBehavior logic (used by SILInstruction::mayWriteToMemory) assumes that a valid AccessBase exists for every store that initializes a 'let' variable (see isLetAddress()). But this was never checked. I believe the current logic can cause miscompiles. Note that property and type wrappers use initialization functions.

A complete fix requires that all addresses have a valid AccessBase.

To fix MemBehavior:

  • narrowly identify all the kinds of instructions that can initialize a variable (let's call this InitializeAddress)

  • identify the address operand that is written to

  • teach MemoryBehavior that any instruction other than InitializeAddress cannot write to a 'let' address

  • teach MemoryBehavior that any InitializeAddress instruction cannot write to a 'let' address if its AccessBase is a non-let.
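
A minimal sketch of those two rules, assuming hypothetical InitializeAddress, isLetAddress, isLetBase, and accessBase helpers (none of these are existing APIs; the names just follow the list above):

// Hypothetical sketch only; InitializeAddress, isLetAddress, isLetBase, and
// accessBase(of:) are placeholders for the concepts named in the list above.
func mayWrite(_ inst: Instruction, to address: Value) -> Bool {
  // Defer to the normal analysis for anything that isn't a 'let' address.
  guard isLetAddress(address) else { return true }
  // Rule 1: only an InitializeAddress can ever write to a 'let' address.
  guard let initializer = inst as? InitializeAddress else { return false }
  // Rule 2: even an InitializeAddress cannot write to this 'let' if the base
  // of its destination is known to be non-let storage.
  return isLetBase(accessBase(of: initializer.destinationAddress))
}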

Longer term:

  • generate and preserve begin_access [init] for all let initialization

  • verify that only an InitializeAddress can use an address produced by a begin_access [init]

  • verify that begin_access [init] initializes a let once on all paths

Eliminate complexity in SIL

In addition to access marker inconsistency, there are other complications:

  • Storage casts (changing the address type) are currently allowed inside a formal access. It should be trivial to recover the access scope from a memory operation. But instead, there are potential phis and changes to the address type!

  • When the storage base is valid but unidentified, we currently allow different edges of a pointer phi to project different paths. This means we could have a valid AccessedStorage without a valid AccessPath.

  • We currently allow a 'load' instruction to take part in an address projection: unchecked_take_enum_data_addr -> load -> project_box. This makes the implementation quite confusing.

  • KeyPath generation incorrectly generates a pointer_to_address $UnsafePointer to $*X without the necessary struct_extract.

Appendix: Proposed Swift API

Strawman AccessPath API (the details will change). These typed wrappers allow building type-safe APIs with assertions in the right places.

enum AccessStorageKind {
  case Box
  case Stack
  case Global
  case Class
  case Tail
  case Argument
  case Yield
  case Nested
  case Pointer
}

// Every SIL address has a single valid base. It is the origin of the
// address--the first address type that appears in a def-use walk. It identifies
// the kind of storage being accessed. The kind of storage must be known for
// optimizer and code generation correctness.
// The base address is never inside an access scope.
struct AccessBase {
  var baseAddress: Value

  init(forAddress: Value)
  
  var storageKind: AccessStorageKind { get }

  // non-nil for Box, Class, and Tail
  var reference: Value? { get }
  
  var isLet: Bool { get }
  
  var isLocal: Bool { get }
  
  var isUniquelyIdentified: Bool { get }
  
  func isDistinct(from: AccessBase) -> Bool
}

enum AccessKind {
  case Read
  case Modify
  
  func mayConflict(with: AccessKind) -> Bool
}

struct AccessScope {
  // begin_access or begin_access_unpaired
  var beginAccess: Instruction

  var accessedAddress: Value { get }

  var accessKind: AccessKind { get }
  
  var scopeEndingInstructions: IteratorProtocol<Instruction> { get }
}

// Find the enclosing access scope or base address if there is no enclosing
// access scope.
enum EnclosingScope {
  case scope(AccessScope)
  case base(AccessBase)
  
  init(forAddress: Value)
}

// For any SIL address, identify both the base and the field within the
// storage identified by the base. This provides both the kind of storage and
// location within that storage.
struct AccessPath {
  var base: AccessBase

  // address projections only
  var projectionPath: SmallProjectionPath
  
  init(forAddress: Value)

  func isDistinct(from: AccessPath) -> Bool
}

// The origin of a reference.
struct AccessRoot {
  var root: Value
  
  // Sequence of: (value extractions, class projection, address projections)
  // in that order. Only one class projection is allowed.
  var projectionPath: SmallProjectionPath
}

// For any SIL address, identify a common reference root, base address, and
// field.
struct AccessStorage {
  // valid for Box, Class, and Tail
  var commonRoot: AccessRoot?

  var base: AccessBase
  
  init(forAddress: Value)

  func isDistinct(from: AccessStorage) -> Bool
}

extension AccessBase {
  // `forProjection` contains address projections only
  //
  // Returns a non-empty list for Box, Class, and Tail
  func findRoots(forProjection: SmallProjectionPath?)
    -> IteratorProtocol<AccessRoot>
}

extension AccessPath {
  // Returns a non-empty list for Box, Class, and Tail
  func findRoots() -> IteratorProtocol<AccessRoot> {
    base.findRoots(forProjection: projectionPath)
  }
}

// `projectionPath` is the path from the address base to the use.
// If this is the use of a root, then the full path can be derived by appending
// this path to the root's extraction path and class projection.
struct AccessUse {
  var scope: EnclosingScope
  var projectionPath: SmallProjectionPath
  var operand: Operand
}

extension AccessBase {
  // `overlapping` contains address projections only
  //
  // Compare the resulting AccessPath to determine whether the use exactly
  // matches `overlapping`, is contained by `overlapping`, or may overlap in
  // an unknown way.
  //
  // Symmetric with init(forAddress:). `self` is the base of each use.
  // (AccessPathVerification checks this on the C++ side.)
  func findUses(overlapping: SmallProjectionPath? = nil)
    -> IteratorProtocol<AccessUse>

  // `matching` contains address projections only
  func findUses(matching: SmallProjectionPath) -> IteratorProtocol<AccessUse>
}

extension AccessPath {
  func findOverlappingUses() -> IteratorProtocol<AccessUse> {
    base.findUses(overlapping: projectionPath)
  }
  func findMatchingUses() -> IteratorProtocol<AccessUse> {
    base.findUses(matching: projectionPath)
  }
}

// Symmetric with `findRoots`. `self` is one of the roots of each use.
extension AccessRoot {
  func findUses(overlapping: SmallProjectionPath? = nil)
    -> IteratorProtocol<AccessUse>

  func findUses(matching: SmallProjectionPath) -> IteratorProtocol<AccessUse>
}

extension AccessStorage {
  func findOverlappingUses() -> IteratorProtocol<AccessUse> {
    root.findUses(overlapping: projectionPath)
  }
  func findMatchingUses() -> IteratorProtocol<AccessUse> {
    root.findUses(matching: projectionPath)
  }
}
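
A brief usage sketch tying these pieces together (again pseudocode against the strawman above; the iterator-returning signatures are placeholders): check whether any overlapping use sits inside an access scope that conflicts with a proposed access kind.

// Pseudocode against the strawman API above; not an existing API.
func hasConflictingScopedUse(of address: Value, kind: AccessKind) -> Bool {
  let path = AccessPath(forAddress: address)
  var uses = path.findOverlappingUses()
  while let use = uses.next() {
    // EnclosingScope distinguishes uses inside a begin_access scope from
    // uses that reach the base directly.
    if case .scope(let scope) = use.scope,
       scope.accessKind.mayConflict(with: kind) {
      return true
    }
  }
  return false
}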

AccessPath Notes

AccessPath can be used to identify any addressable location without keeping a reference to the original address value. An address projection can always be rematerialized as long as its enclosing access scope is known.

Class properties are not part of the projection path in AccessPath. They are identified by the baseAddress instruction (ref_element_addr). Unlike address projections, a ref_element_addr cannot be rematerialized directly from the root. The root may have a different object type. The root may also be within a different SIL "scope".

PROBLEM: stored properties don't currently have unique indices. So summarizing class property info in SmallProjectionPath won't work.

AccessStorage has a unique identity for any set of accesses which must alias.

PROBLEM: if ref_element_addr is part of AccessStorage identity, then we can't tell that multiple accesses to the same property must alias just by the identity.

Appendix: How AccessPath verification works

The model for structurally valid SIL is expressed by the def-use and use-def visitors. The AccessPath type is just a convenient/efficient interface that computes itself using those visitors, producing a value that can be easily checked. It's mainly a wrapper around AccessedStorage, which has the key information for verification.

checkAccessedAddress() verifies that every memory access is within a formal scope. It currently lives in DiagnoseStaticExclusivity but it was supposed to be moved to SIL verification as soon as possible. It's currently hidden underneath -enable-verify-exclusivity, but we should run it by default at least whenever SIL is in OSSA form.

identifyFormalAccess() is a lesser form of verification. It only checks that access markers themselves have recognizable AccessedStorage. This is the bare minimum to check the assumptions made by exclusivity optimization. The correctness of access marker optimization depends on that verification being fully enabled. The SILVerifier currently calls identifyFormalAccess() to make sure it doesn't assert, but the verification itself is disabled!

The AccessPathVerification pass performs round-trip verification. This is a self-check that the AccessPath verification itself is complete. It also checks the def-use logic (as opposed to only the use-def logic), and it checks consistency of use-def and def-use logic. It runs a couple of times in the non-release pipeline.

Appendix: Structural SIL requirements and def-use patterns

SIL verification relies on the memory access utilities to enforce valid SIL. Any pattern that the SSA traversals can't handle is illegal SIL by definition. Verification guarantees that:

  • Every address-type SILValue is associated with either a Swift formal access to a recognized variable or an explicitly "unsafe" compiler-generated access. In short, we need to reliably recognize the access scope and storage being accessed.

  • Every address-type SILValue has exactly one valid AccessedStorage. A phi that originates in two non-identical AccessPaths is illegal.

  • If a SILValue's AccessedStorage is identified (not a pointer loaded from an UnsafePointer, for example), then all aliasing memory access in the entire program has the same identity after resolving function arguments.

  • An AccessPath cannot index into a subobject. e.g. struct_element_addr -> index_addr is illegal.

TODO: I'm certainly missing some other rules.

Memory access utilities are currently more permissive than they should be because it will take more time to fix all SIL producers that currently generate problematic patterns. If the rules for valid SIL can be simplified, then the implementation of AccessPath will also be simplified.

For example, phis of type RawPointer are currently allowed, which leads to all sorts of problematic patterns. The same problem occurs with box-type phis.

Examples of SIL that cannot currently be allowed:

Address phi

bb1:
  %1 = ref_element_addr %0 : $C, #C.prop1
  br bb3(%1 : $*Int64)
bb2:
  %2 = ref_element_addr %0 : $C, #C.prop2
  br bb3(%2 : $*Int64)
bb3(%phi : $*Int64):
  %3 = begin_access [read] %phi : $*Int64

pointer (or box) phi to distinct storage

bb1:
  %1 = ref_element_addr %0 : $C, #C.prop1
  %2 = address_to_pointer %1 : $*Int64 to $Builtin.RawPointer
  br bb3(%2 : $Builtin.RawPointer)
bb2:
  %4 = ref_element_addr %0 : $C, #C.prop2
  %5 = address_to_pointer %4 : $*Int64 to $Builtin.RawPointer
  br bb3(%5 : $Builtin.RawPointer)
bb3(%phi : $Builtin.RawPointer):
  %7 = pointer_to_address %phi : $Builtin.RawPointer to $*Int64
  %8 = begin_access [read] %7 : $*Int64

pointer (or box) phi to distinct named subobjects

bb0:
  %1 = begin_access [read] %0 : $*S
bb1:
  %2 = struct_element_addr %1 : $*S, #S.field1
  %3 = address_to_pointer %2 : $*Int64 to $Builtin.RawPointer
  br bb3(%3 : $Builtin.RawPointer)
bb2:
  %4 = struct_element_addr %1 : $*S, #S.field2
  %5 = address_to_pointer %4 : $*Int64 to $Builtin.RawPointer
  br bb3(%5 : $Builtin.RawPointer)
bb3(%phi : $Builtin.RawPointer):
  %7 = pointer_to_address %phi : $Builtin.RawPointer to $*Int64
  %8 = load %7 : $*Int64

phi cycle with non-index projection

bb1(%phi : $Builtin.RawPointer):
  cond_br %cond1, bb2, bb3
bb2:
  %1 = pointer_to_address %phi : $Builtin.RawPointer to $*S
  %2 = struct_element_addr %1 : $*S, #S.field1
  %3 = address_to_pointer %2 : $*Int64 to $Builtin.RawPointer
  br bb4(%3 : $Builtin.RawPointer)
bb3:
  br bb4(%phi : $Builtin.RawPointer)
bb4(%merge : $Builtin.RawPointer):
  %4 = pointer_to_address %merge : $Builtin.RawPointer to $*Int64
  %5 = load %4 : $*Int64
  cond_br %cond2, bb1(%merge : $Builtin.RawPointer), bb5
bb5:

Note that phi cycles with index projections are allowed.

Examples of SIL that should not be allowed but currently are allowed until verification is improved:

missing access scope

  %1 = ref_element_addr %0 : $C, #C.prop
  %2 = load %1 : $*Int64

We could introduce the concept of implicit access scopes for specific patterns, but that concept doesn't exist yet.

pointer cast of a named property within an access

  %1 = ref_element_addr %0 : $C, #C.prop
  %2 = address_to_pointer %1 : $*Int64 to $Builtin.RawPointer
  %3 = pointer_to_address %2 : $Builtin.RawPointer to $*Int64
  %4 = begin_access [read] %3 : $*Int64
  %5 = load %4 : $*Int64

TODO: We could fill out these examples with many more cases. We should show some examples of valid/invalid address or pointer type function arguments.

@eeckstein commented Apr 28, 2022

Memory access in SIL is structured. Invariants govern the role of references, pointers, addresses, and access scopes. Code throughout the optimizer should handle memory access consistently and reliably. SIL analysis should not view SIL as an arbitrary def-use chain of operations, but as part of a structured access, typically of the form:

This sounds like there are structural rules on top of SIL, which I think is problematic. All the structural rules should be defined within SIL and its type system - and verified by the SIL verifier. (IIUC, it's not the case anyway because access/path can always yield an "unidentified" access, which is still legal SIL).
If we want to enforce structural rules, like "every ref_element_addr must have a begin_access", then ideally we should do that in SIL, for example by combining ref_element_addr and begin_access into a single instruction. Last time we talked, you gave the example of index_addr, which must not be used for arbitrary address values, but only for tail-allocated elements (and for unsafe pointers, which I'll ignore for now). We could combine index_addr into the ref_tail_addr instruction, which would enforce that rule in SIL.

AccessPath is documented in the SIL Programmers’ Manual http://github.com/apple/swift/blob/main/docs/SILMemoryAccess.md.

Nice document! It explains the concepts very well.

Layers of abstraction in memory access utilities

The first thing we should consider when discussing utilities is: are we talking about C++ or swift utilities?
I think for many of our "small" utilities (and I include def-use/use-def walking utilities) it's better to not bridge them from C++ but re-implement them in swift. It's often easier than bridging existing utilities, because often C++ APIs do not really fit well into the swift world. Also, it gives us the opportunity to improve the design without having to rewrite a lot of code.

This is in sharp contrast to other utilities in the SIL code base, in which a fantastically complex black box can be customized by callbacks to perform much simpler tasks.

I guess you are referring to the use of EscapeInfo in the ReleaseDevirtualizer to find the root object of a release (apple/swift@8ad18c1).

I'd like to clarify a few things about EscapeInfo: It is completely different than the old EscapeAnalysis (which is an infamous complex black box). EscapeInfo is rather a very simple combination of a use-def and def-use walk. The beauty of it is that despite its simplicity it's powerful enough to do a "real" escape analysis - and even better and more precise than the old EscapeAnalysis.
It is even much less heavyweight than the AccessPath et al utilities. Not only does its implementation have a fraction of MemAccessUtils.{h,cpp}'s lines of code, it is also simpler in many other aspects, e.g.

  • It doesn't need to distinguish between different access kinds
  • or between different parts (root, base, address) of a def-use chain
  • It doesn't even need to distinguish between address and value projections
  • Due to its simpler design and its implementation in swift, it's also easier to understand (okay, I'm biased here, because I implemented it 🙂). Just as an example, from the top-level API (isEscaping) it's just two function calls to get to the "meat" of the def-use/use-def walk in walkUp/walkDown. I've had a hard time figuring that out for AccessPath::collectUses.

That means, EscapeInfo is the utility for any kind of use-def/def-use walk in swift passes.
Now, why can't we separate the def-use walk (walkDown) from the use-def walk (walkUp) into two separate utilities?
First of all, it's not easy and would complicate the design.
Second, it makes sense to have both directions in a single utility, because it gives us the benefit of tracking values through memory (which needs both walkUp and walkDown), e.g.

  %1 = alloc_ref
  %2 = alloc_stack
  store %1 to %2
  ...
  %3 = load %2

EscapeInfo will identify %3 as a use of %1 in a def-use walk (of course, also checking if there are any aliases, escapes, etc.).

IMO, AccessPath et al are the perfect utilities for access/exclusivity related jobs, but it doesn't make much sense to bridge and retrofit them to be used in EscapeInfo.

customized by callbacks

I thought for a very long time about how to do the customization, e.g. class methods, protocols with default implementation, etc. But in the end I came to the conclusion that callback closures are really the simplest way to do it (from the implementation and the API point of view).

Escape analysis builds a chain of access paths across multiple objects.

I don't like the term "chain of access paths" when talking about EscapeInfo. For EscapeInfo there is no difference between address or value projections and class field projections. So it does not make sense to view a SmallProjectionPath in EscapeInfo as a two-dimensional thing. It just complicates things.

[...] and hard-coding, which SIL opcodes need to be handled at each point.

This is a valid point. Though it's not a matter of correctness, it can still be a problem for optimization effectiveness. Let's assume we introduce a new cast instruction; then we should not require all places in the optimizer to add that opcode.
I think the best approach to solve this problem is to do something similar to ApplySite (which is a very successful abstraction). We can have protocols to which instruction classes can conform, e.g. OwnershipPreservingCast, ProjectionInstruction, etc.
Then EscapeInfo and other utilities (like a swift implementation of AccessPath) can cast to these protocols and not to concrete instruction classes.

@atrick (Author) commented Apr 29, 2022

@eeckstein thanks for reading. I'm not sure from your response whether you got all the way to the TODO section.

This sounds like there are structural rules on top of SIL, which I think is problematic. All the structural rules should be defined within SIL and its type system - and verified by the SIL verifier. (IIUC, it's not the case anyway because access/path can always yield an "unidentified" access, which is still legal SIL).

The job of the SIL verifier is to check the structural rules of SIL. It is deeply important to me that any assumptions made by passes are enforced by the SIL verifier.

(although some non-structural rules are actually enforced earlier by asserts in the physical SIL representation or the SIL builder)

SIL has many structural rules. The ones proposed here are no different. There are really only two big broad rules:

  1. all memory operations are part of an access scope

  2. all access scopes have a recognized base address, which indicates
    the kind of storage being accessed

We may need to add some other small rules, like legal positions for index_addr.

There's no separate verification for this. Just normal SIL verification.

I think the confusion comes from AccessUseDefChainVisitor being both a utility and used during verification.

I have been thinking of everything in the /SIL directory as part of SIL and necessary for SIL verification, but that isn't true. Some of it is just fundamental abstraction "on top" of SIL as you say.

MemAccessUtils should probably be split into two pieces:

Bucket #1 (SIL/Verifier). a small amount of logic that is required to enforce SIL invariants

Bucket #2 (SIL/Utils). other related logic that we want to be consistent across all SIL passes, and that is self-verified, but is not necessary to determine whether SIL itself is legal

In bucket #2 would be

  • all the findReferenceRoot related logic (which should itself be a customizable visitor, like walkUp, but without any addresses or pointers)

  • all the phi analysis logic for finding a single common root

  • all the projection path book-keeping (which should be converted to SmallProjectionPath)

  • all the find-all-known-uses-or-bail logic

If we want to enforce structural rules, like "every ref_element_addr must have a begin_access", then ideally, we should do that in SIL. For example, by combining ref_element_addr and begin_access into a single instruction.

Structural rules are rules that are not confined to a single operation. SIL has many of these. All rules involving control flow or block arguments are structural. As are all rules involving scopes, stack-based operations, and lifetimes.

Of course, there are multiple ways to represent the same semantics. But fusing opcodes and making semantics implicit within an operation does not make those semantics more a part of SIL.

Last time we talked, you gave the example of index_addr which must not be used for arbitrary address value, but only for tail allocated elements (and for unsafe pointers, which I'll ignore for now). We could combine index_addr into the ref_tail_addr instruction, which would enforce that rule in SIL.

That might be a good idea, but it isn't necessary for defining the invariant.

(also note that it's not that simple because you'll need to invent new opcodes for unsafe pointers)

This is in sharp contrast to other utilities in the SIL code base, in which a fantasticaly complex black box can be customized by callbacks to perform much simpler tasks.

I guess you are referring to the use of EscapeInfo in the ReleaseDevirtualizer to find the root object of a release (apple/swift@8ad18c1).

We can make an example of that for now, since I do consider that outrageous. But I'm actually referring to a repeated practice on the C++ side that I'd like to not see repeated on the Swift side.

it's better to not bridge them from C++ but re-implement them in swift. It's often easier than bridging existing utilities, because often C++ APIs do not really fit well into the swift world. Also, it gives us the opportunity to improve the design without having to rewrite a lot of code.

That's fine. It's the concepts and requirements that are important, not the implementation.

It doesn't need to distinguish between different access kinds
or between different parts (root, base, address) of a def-use chain
It even doesn't need to distinguish between address and value projections

It hides these things, which makes it impossible to understand the algorithm at a conceptual level. I was trying to reverse engineer the meaning of EscapeInfo for a week. I'm still in the middle of writing a design document to try to make sense of it. If the implementation were composed from smaller utilities that each modeled a well-understood concept, then this wouldn't be a problem.

It's impossible to begin to understand what isEscaping means without talking about the difference between addresses and values.

It's impossible to begin to understand how to work with EscapeInfo without understanding that each class projection is a level of indirection between objects.

Due to it's simpler design and its implementation in swift, it's also easier to understand (okay, I'm biased here, because I implemented it 🙂). Just as an example, from the top-level-API (isEscaping) it's just two function calls to get to the "meat" of the def-use/use-def walk in walkUp/walkDown. I've had a hard time to figure that out for AccessPath::collectUses.

That's a meaningless comparison. I could implement a collectUses in minutes, just like EscapeInfo, that doesn't support literally every SIL pattern that we've ever seen, isn't round-trip verified against the use-def abstraction, doesn't support partial path matches, doesn't support index_addr, and isn't attempting to map back and forth between path vectors and a prefix trie. Almost all the implementation complexity is the prefix trie mapping and partial paths.

In fact, we have many versions of exactly that scattered throughout the C++. The point is that we need one version, which should meet all of the requirements above, however it is implemented.

That means, EscapeInfo is the utility for any kind of use-def/def-use walk in swift passes.

I cannot disagree with this strongly enough. EscapeInfo adds two features to a basic SSA traversal, both of which are nonsense in any other context where we otherwise need exactly the same SSA traversal. This is all explained above (Incremental escape analysis utility); in fact, the main purpose of this document was to explain that.

IMO, AccessPath et al are the perfect utilities for access/exclusivity related jobs, but it doesn't make much sense to bridge and retrofit them to be used in EscapeInfo.

AccessStorage was generalized about a year ago so that it is not specific to exclusivity. Since then, it seems to be the perfect utility for all the SIL analyses that we've needed. It has really saved us on countless occasions now.

The AccessPath interface is ideal for replacing legacy passes: LICM, RLE, DSE except for a couple problems:

  • our alias analysis interface doesn't accept it yet

  • manipulating the path requires converting between trie nodes and vectors

I never said EscapeInfo should use the AccessPath interface. Here is what should happen:

They should both use the same SSA traversal logic. That logic should be expressed as two independent visitors. One for reference-containing values. One for addresses and pointers.

Those visitors should be built on low level SIL abstractions, like for example: ProjectionInstruction, AddressCastInstruction, StorageCastInstruction. The customizations can easily intercept any particular opcode, but should almost never need to in practice.

None of that functionally changes how EscapeInfo builds its projection chain. Each step in the algorithm proceeds the same way. However, the code would be readable and the algorithm understandable. Logic that deals with address projections would be naturally separate from logic that composes references. The customization callbacks should not change the algorithm. They should only perform extra book-keeping and feedback whether to continue processing.

I think the best approach to solve this problem is to do something similar to ApplySite (which is a very successful abstraction). We can have protocols to which instruction classes can conform to, e.g. OwnershipPreservingCast, ProjectionInstruction, etc.
Then EscapeInfo and other utilities (like a swift implementation of AccessPath) can cast to these protocols and not to concrete instruction classes.

Yes, of course. That's exactly what I'm trying to work toward. Except that the abstractions we have aren't quite the ones we want. We don't need so many, just a few useful ones. I'm dying to replace all the "mixin" classes with a single, simple ForwardingInstruction.

@atrick (Author) commented Apr 29, 2022

(IIUC, it's not the case anyway because access/path can always yield an "unidentified" access, which is still legal SIL).

An invalid AccessedStorage is illegal! The SILVerifier should check that. If we have no information about an address, then we're in big trouble. I've explained that before in other design docs.

An "unidentified" AccessStorage means we got an address from an unsafe pointer. This is not a bailout! Combined with other structural SIL rules, that gives us important information. Either the pointer is to some "unnamed" piece of memory that doesn't have any formal access, like array storage. Or the pointer must already be within an access scope. In that latter case, the address of that access scope will be seen by the compiler during the same analysis, even it's in a caller. The point is that no accesses are potentially hidden from the compiler. For every formal access, we'll eventually see an AccessStorage that tells us the accessed variable: class, global, local.

@eeckstein commented May 2, 2022

About the TODOs: I'm not sure I understood how you mean that. IMO, it's not worth backporting swift things to C++, like SmallProjectionPath. So "Use SmallProjectionPath in AccessPath" would actually be "Implement AccessPath in swift using SmallProjectionPath".

all memory operations are part of an access scope

Hm, currently this is not the case, right? The compiler sometimes inserts temporary alloc_stacks, e.g. in reabstraction thunks, whose memory accesses are not guarded by begin_access. Also, we are stripping static access instructions at some point in the pipeline.

all access scopes have a recognized base address, which indicates the kind of storage being accessed

We could also introduce a third SILType category (beside Object and Address) to distinguish between the addresses "before" and "after" a begin_access. Something for the future.

MemAccessUtils should probably be split into two pieces:

Yes, that would probably make sense.

If the implementation were composed from smaller utilities that each modeled a well-understood concept, then this wouldn't be a problem.

I was thinking of splitting it into two utilities - one for walkUp and one for walkDown. But that turned out to overly complicate the implementation.
But what we can really do is extract some of the opcode checking in walkUp/walkDown into protocols, like I suggested.

It's impossible to begin to understand what isEscaping means without talking about the difference between addresses and values.

Better naming can probably help here. I was considering naming it ValueTracker instead of EscapeInfo, but this name is already used for another type of utility. Another idea I had is DefUseWalker (although it's both a def-use and a use-def walker):

  • EscapeInfo -> DefUseWalker
  • isEscaping -> walkUp or visitDefs - returning true on "success" (which is the inverted "isEscaping" result)
  • isEscapingWhenWalkingDown -> walkDown or visitUses - returning true on "success"
  • Use/DefCallbackResult.markEscaping -> .fail

Originally I implemented the "address-escaping" API on top of the value-escaping API, just putting the logic in the callback closures. Although that would be a nice layering, I decided to add the analyzeAddresses flag for pragmatic reasons - it makes things much simpler.

It's impossible to begin to understand how to work with EscapeInfo without understanding that each class projection is a level of indirection between objects.

That's not true. The fact that class references are pointers and value projections are not is just an implementation detail. We could, e.g., implement structs/tuples as aggregates of pointers to their sub-values - it would not change SIL. The only semantic difference between values and classes is that class instances can be shared. But that is not relevant for most uses of EscapeInfo, e.g. for stack promotion or the use of EscapeInfo in alias analysis.

The point is that we need one version, which should meet all of the requirements above, however it is implemented.

How would we implement a walker like AccessPath::collectUses in swift?

  • Instead of returning the results in an Array we would handle uses in a provided closure. It's more efficient and also more convenient in swift. Also, it allows terminating the walk at an arbitrary position based on the closure's return value. In addition to the current value, the closure would also get the current path (I'm surprised that AccessPath::collectUses doesn't return a path together with each use).
  • When handling phi-terms we'll need a cache to handle cycles.

Well, then we are already almost at what EscapeInfo is. Therefore I see EscapeInfo as the minimum viable product to do def-use/use-def walks. EscapeInfo can do both directions, but this is okay. Both directions are clearly separated in different function blocks and it would just add more complexity to extract both into separate utilities. Also - as I wrote before - it allows tracking values through memory, which is a nice improvement.

I never said EscapeInfo should use the AccessPath interface.

Okay, that's what I misunderstood.

They should both use the same SSA traversal logic.

Yes, though I believe it's better to not include the surrounding for-all-uses/defs-loop in the common traversal utility, because it complicates things. Instead we should extract the common logic on the opcode level - as I suggested, e.g. by using protocols.

That logic should be expressed as two independent visitors. One for reference-containing values. One for addresses and pointers.

That's important for access related stuff. But it would make EscapeInfo unnecessarily complicated. As I said, for EscapeInfo it's not required to distinguish address and values. Doing so would only add complexity.

Except that the abstractions we have aren't quite the ones we want. We don't need so many, just a few useful ones. I'm dying to replace all the "mixin" classes with a single, simple ForwardingInstruction.

👍

An invalid AccessedStorage is illegal!

ok
