atrick/sil-proposal-1.md

## sil-proposal-1.md

      
SIL Opaque Values

Introduction

A SIL type is either loadable or address-only. A loadable type is one
whose object size and layout can be determined by the compiler and
whose values are not "pinned" to a memory address. Types are most
commonly address-only because their layout is opaque by
abstraction. Generic type parameters are address-only because their
concrete type is not statically identified. Resilient types are
statically identified, but have opaque layout for binary
compatibility.
Compiled code must access and pass address-only objects via their
memory addresses. We refer to this as "physical" access, as expressed
by LLVM IR. Currently, SIL also reflects physical access. For example,
SIL for a generic function relies on address types:
sil @identity : $<T> (@in T) -> @out T {
bb0(%0 : $*T, %1 : $*T):
  copy_addr %1 to [initialization] %0 : $*T
  destroy_addr %1 : $*T
  %4 = tuple ()
  return %4 : $()
}

If the generic type is bound to a loadable type, then the SIL values
are promoted to value types:
// generic specialization <Swift.Int> of identity <T> (T) -> T
sil @identity : $(Int) -> Int {
bb0(%0 : $Int):
  return %0 : $Int
}

This leads to drastically different SIL patterns in code that is
semantically almost identical. A key aspect of optimizing SIL is
converting values from an in-memory form to an SSA value. Being
address-only currently prevents opaque values from taking part in
normal optimization. Naturally, optimization is limited in the
presence of opaque types. For example, protocol methods can't be
inlined without specializing the generic type, nor can methods on
resilient types. However, an important job of the SIL optimizer is
eliminating copies and, by extension, the ARC operations they imply.
That job applies equally to opaque and loadable types.
This SIL code creates an unnecessary local copy:
%copy = alloc_stack $T
copy_addr %arg to [initialization] %copy : $*T
...
%ret = apply %callee<T>(%copy) : $@convention(thin) <τ_0_0> (@in τ_0_0) -> ()
...
dealloc_stack %copy : $*T
destroy_addr %arg : $*T

The same code expressed in SSA reveals that the copy is redundant:
%copy = copy_value %arg : $T
...
%ret = apply %callee<T>(%copy) : $@convention(thin) <τ_0_0> (@in τ_0_0) -> ()
...
destroy_value %arg : $T

Redundancy can be inferred from these facts: %arg by definition cannot be mutated, it is consumed in scope by a destroy_value, and %arg is not copied again after %copy has been consumed.
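
The inference above can be sketched on a toy, straight-line instruction list. This is only an illustration in standalone C++, not the SIL optimizer's implementation: `ToyOp`, `isCopyRedundant`, and `forwardCopy` are hypothetical names, and the check is deliberately conservative (the copied source may have no later use other than its single destroy_value).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy instruction: `result = opcode operand`.
struct ToyOp {
  std::string opcode;   // e.g. "copy_value", "destroy_value", "apply"
  std::string operand;  // the value this instruction uses
  std::string result;   // the value it defines, empty if none
};

// True if body[copyIdx] is a copy_value whose source is never used again
// except by exactly one later destroy_value.
inline bool isCopyRedundant(const std::vector<ToyOp> &body, std::size_t copyIdx) {
  assert(copyIdx < body.size() && "copy index out of range");
  const ToyOp &copy = body[copyIdx];
  if (copy.opcode != "copy_value")
    return false;
  bool sourceDestroyed = false;
  for (std::size_t i = copyIdx + 1; i < body.size(); ++i) {
    if (body[i].operand != copy.operand)
      continue;
    // Any later use of the source other than one final destroy blocks the rewrite.
    if (body[i].opcode != "destroy_value" || sourceDestroyed)
      return false;
    sourceDestroyed = true;
  }
  return sourceDestroyed;
}

// Forwards the source to the users of the copy, then drops the copy and the
// source's destroy_value. Ownership of the source moves to the copy's consumer.
inline std::vector<ToyOp> forwardCopy(const std::vector<ToyOp> &body, std::size_t copyIdx) {
  assert(isCopyRedundant(body, copyIdx) && "copy is not known to be redundant");
  const ToyOp copy = body[copyIdx];
  std::vector<ToyOp> out;
  for (std::size_t i = 0; i < body.size(); ++i) {
    if (i == copyIdx)
      continue;  // drop the copy itself
    if (i > copyIdx && body[i].opcode == "destroy_value" && body[i].operand == copy.operand)
      continue;  // drop the source's destroy
    ToyOp op = body[i];
    if (op.operand == copy.result)
      op.operand = copy.operand;  // users of the copy now use the source
    out.push_back(op);
  }
  return out;
}
```

Applied to the three-instruction SSA example, the sketch leaves only the apply, now consuming %arg directly. A real pass would also have to reason about control flow and non-consuming uses, which this model ignores.
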
The address-only property of arguments determines physical calling
conventions in LLVM IR. These conventions cannot be completely hidden
from SIL code. In particular, SIL is responsible for handling
reabstraction of function types. If the caller and callee have
different views of an argument type, then a SIL-level thunk is
required to bridge between the two conventions.
For example, a concrete function may satisfy a protocol with generic
constraints. A protocol witness thunk will be generated to load the
address-only arguments and store address-only results. An extra copy
is also generated to convert a guaranteed self argument to owned:
sil @foo : $@convention(method) (Int, S) -> Int {
bb0(%0 : $Int, %1 : $S):
  return %0 : $Int
}

// protocol witness for P.foo (A.T) -> A.T in conformance S : P
sil [thunk] @_TTWV1t1SS_1PS_FS1_3foofwx1TwxS2_
  : $@convention(witness_method) (@in Int, @in_guaranteed S) -> @out Int {
bb0(%0 : $*Int, %1 : $*Int, %2 : $*S):
  %3 = load [trivial] %1 : $*Int
  %4 = load [trivial] %2 : $*S
  %5 = function_ref @foo : $@convention(method) (Int, S) -> Int
  %6 = apply %5(%3, %4) : $@convention(method) (Int, S) -> Int
  store %6 to [trivial] %0 : $*Int
  %8 = tuple ()
  return %8 : $()
}

Conceivably, the address-only types in the thunk could be expressed as SSA values. However, the @in, @in_guaranteed parameter conventions must remain to communicate indirection to the SIL compiler:
// protocol witness for P.foo (A.T) -> A.T in conformance S : P
sil [thunk] @_TTWV1t1SS_1PS_FS1_3foofwx1TwxS2_
  : $@convention(witness_method) (@in Int, @in_guaranteed S) -> @out Int {
bb0(%0 : $Int, %1 : $S):
  %2 = function_ref @foo : $@convention(method) (Int, S) -> Int
  %3 = apply %2(%0, %1) : $@convention(method) (Int, S) -> Int
  return %3 : $Int
}

This proposal argues that address-only SIL values should be
represented within SIL function bodies as SSA values, unless SIL
semantics would otherwise require an address even if the value's type
were loadable. SIL parameter and result conventions will continue to
reflect argument indirection. Two SIL function signatures with
the same SIL types and conventions will always have the same ABI.

Motivation and Goals


Optimize generic and resilient code. Primarily done by avoiding
unnecessary copies.


Make ownership verification more efficient.


Simplify the SIL optimizer. SSA analyses for SILValues should apply
to opaque types. Avoid developing non-SSA memory optimizations "in
parallel". Avoid many redundant peepholes for address-type
operations.


Simplify SILGen. It should be a straightforward translation of the
AST. A lot of the complexity currently has to do with lowering
address-type values on-the-fly.


Simplify IRGen. It should be a straightforward translation of
lowered SIL into LLVM IR. Some on-the-fly logic for lowering
addresses can be removed.


Design

Address Lowering

The loadable and address-only property of SIL types will not
change. However, address-only will only refer to the physical
properties of the type and will no longer determine the SIL-level
representation.
A new "lowered" SIL state will be introduced as a preparation for
IRGen. Lowered SIL will reflect the physical constraints of a type,
just as SIL currently does at all stages.
Generic code before address lowering:
sil @identity : $<T> (@in T) -> @out T {
bb0(%0 : $T):
  %2 = copy_value %0 : $T
  destroy_value %0 : $T
  return %2 : $T
}

Generic code after address lowering:
sil @identity : $<T> (@in T) -> @out T {
bb0(%0 : $*T, %1 : $*T):
  copy_addr %1 to [initialization] %0 : $*T
  destroy_addr %1 : $*T
  %4 = tuple ()
  return %4 : $()
}

The “lowered” SIL stage is conceptually part of IRGen (it is not
serialized in the module). Optimizations that allocate storage for SIL
types, handle physical calling conventions, and fold away computation
based on known sizes can be done here. The actual LLVM IR generation
can be a fairly literal translation of SIL.
This actually makes canonical SIL much more canonical by moving most
of the SIL address representation down to IRGen. alloc_stack will
still exist in canonical SIL for @inout and captures but operations
will be SSA-based.
The primary motivation is to optimize generic and resilient types, but
the side effect will be a drastically simplified SILGen and somewhat
simplified IRGen. SIL ownership verification will also be much more
efficient in practice.
SIL will continue supporting address types prior to lowering. inout
arguments, and by extension captured variables, must have a memory
location. These objects are semantically associated with a memory
location in SIL. This is entirely independent of whether the type is
address-only.

SIL Types and Argument Conventions

Users of SILFunctionType will now need to distinguish between
parameter and result indirection for the purpose of evaluating calling
conventions and handling reabstraction as opposed to the indirection
of SIL values within the function. For example, a function type with
an @out T result has an indirect storage convention, but the
indirection of the SIL argument depends on the SIL module's current
conventions for representing returned values. Before address lowering,
the result will be returned directly in SIL, so will not have an
address type. After address lowering, the result will be returned via
a SIL address type argument to match the SIL value's "storage" type.
SILFunctionType only knows how to map formal types to SIL storage
types. It does not know about the SILModule's conventions for SIL
values. To make this distinction clear, its methods are named:
getIndirectFormalResults(), getDirectFormalResults(), etc.
Querying SIL argument conventions requires wrapping the function type
with an instance of SILFunctionConventions. This provides an API
with methods named: getIndirectSILResults(),
getDirectSILResults(), etc.
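
The split between formal and SIL-level indirection can be sketched with a small standalone model. This is an assumption-laden toy in C++, not the compiler's classes: `ToyFunctionType`, `ToyFunctionConventions`, `ToyStage`, and every method name below are hypothetical stand-ins that only mirror the naming idea described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: these are NOT the Swift compiler's real classes.
enum class ResultConvention { Direct, Indirect };  // think @owned vs. @out
enum class ToyStage { OpaqueValues, LoweredAddresses };

// Stand-in for a function type. It knows only the formal conventions.
struct ToyFunctionType {
  std::vector<ResultConvention> results;

  bool isFormalIndirect(std::size_t i) const {
    assert(i < results.size() && "result index out of range");
    return results[i] == ResultConvention::Indirect;
  }
  std::size_t numIndirectFormalResults() const {
    std::size_t n = 0;
    for (std::size_t i = 0; i < results.size(); ++i)
      if (isFormalIndirect(i))
        ++n;
    return n;
  }
};

// Stand-in for the conventions wrapper: the formal type plus the current stage.
struct ToyFunctionConventions {
  const ToyFunctionType &type;
  ToyStage stage;

  // Before address lowering every result is a direct SIL value; afterwards
  // SIL-level indirection matches the formal (storage) convention.
  std::size_t numIndirectSILResults() const {
    return stage == ToyStage::LoweredAddresses ? type.numIndirectFormalResults() : 0;
  }
  std::size_t numDirectSILResults() const {
    return type.results.size() - numIndirectSILResults();
  }
};
```

For a function with a single @out result, the formal query reports one indirect result at every stage, while the SIL-level query reports zero before address lowering and one after it.
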
Here's how this design fits into the broader picture of the layers of
abstraction in the type system:


Formal types: The AST types directly exposed by the language.
(e.g. T?)


Canonical types: desugared formal AST types.
(e.g. Optional<T>)


Lowered types: Canonical types in the ASTContext that can be
directly referenced by SIL. These "formalize" some properties of the
ABI. For example, they make the ownership and indirection of
function arguments explicit. These formalized conventions must match
on both the caller and callee side of every call. Lowered types
include types that aren't part of the language's formal type
system. See SILFunctionType.  Although these types have been lowered
for use by SIL, they exist independent of a SILModule. (e.g. @in Optional<T>)


SIL types: The actual type associated with a SIL value. These merely
wrap a lowered type with a flag indicating whether the SIL value
has indirect SIL semantics. i.e. whether the value is an address or
an object type. SIL types are part of a SILModule, and reflect the
SILModule's conventions. Mapping lowered types to SIL types is
specific to the current SIL stage.
(e.g. $Optional<T>)


SIL storage types: These are SIL types with lowered addresses.  They
represent the ABI requirements for indirection and storage of SIL
objects. In the "lowered" SIL stage, the SIL type of every value is
its storage type. Lowered types directly correspond to SIL storage
types. For example, if a function parameter has an @in lowered type, then
the storage type of the corresponding SIL argument is an address.
(e.g. $*Optional<T>)


LLVM types: Represent the ABI requirements in terms of C types. This
could introduce additional indirection, but I'd like to handle most
if not all of that in SIL address lowering.
(e.g. %Sq* noalias nocapture, %swift.type* %T)


So to recap, if you ask for the SIL type corresponding to a formal
convention, you'll get the SIL storage type
(e.g. $*Optional<T>). If you ask for the SIL type for a function
argument corresponding to the same formal parameter, you will get the
right level of indirection for the current SIL stage
(e.g. $Optional<T>). In short the lowered type may specify a calling
convention that expects indirect storage, while the SIL type may be
direct.

Alternatives

Ignore address-only completely during SIL generation.

This would make formal calling conventions consistent with SIL-level
conventions. For example, @in T and @owned T arguments would both
be considered to be passed directly. This way, whether two conventions
are compatible could be determined merely by comparing the SIL types
of the arguments. A SIL address type would map to an indirect
convention and a SIL object type would map to a direct convention.
This would hide part of the ABI from SIL. However, reabstraction must
be exposed to SIL. Doing so simplifies IRGen, allows the SIL optimizer
to improve code within thunks, and allows the SIL optimizer to
perform function signature optimizations across calls.
Instead, the proposed approach introduces two separate notions of
indirection. An indirect formal parameter requires indirection at the
ABI level as determined by the function's formal type. An indirect SIL
argument requires a SIL address type as determined by constraints
within the SIL code. @in T will be a formally indirect parameter
that, prior to address lowering, accepts a direct SIL value from the
caller and provides a direct SIL value in the callee.

Lower addresses during IRGen.

This would avoid the need to support two SIL representations depending
on the compilation stage. This could be done by analyzing SIL values
and supplying the information needed for address lowering to IRGen via
a side-channel.
Integrating lowering within IRGen is not seamless. It contradicts our
important goal of simplifying the IR bridging phases of the
compiler. Handling address lowering as an independent pass avoids
adding complexity to IRGen. Once in place, some of the existing
IRGen complexity can even be moved to the lowering pass.
By introducing a new SIL stage, the proposed approach opens up multiple
additional opportunities. It will be useful to allow some SIL
passes to operate on lowered SIL and have access to physical
properties of types. We have already added an AllocStackHoisting
pass and plan to add more.

Plan


Introduce SILFunctionConventions.

Update all code that deals with SILFunctionTypes and clarify the
separation between formal parameter/result types vs. SIL types.
This step was extremely invasive and time consuming. I believe that
it set the groundwork to make it easy to roll out the rest of the
feature. I hope the process of migrating to this API shook out most
of the bugs that we would have hit later when turning on the new
feature.
See PR 6922.

Introduce EnableSILOpaqueValues option.

Under this option SILGen will directly generate SSA values for
opaque types. The SIL optimizer will be modified to handle opaque
SSA values.

Introduce an AddressLowering pass.

This will be the last stage in the SIL pipeline. It will run after
SIL serialization, providing IRGen support for opaque SIL
values. This way IRGen can continue to function almost unchanged.
After AddressLowering, the SIL module will be in a lowered stage.

Verify SIL Opaque Values

SIL operations on address types will now be prohibited in canonical
SIL with the exception of alloc_stack, load, and store.
There is still much work to be done in SILGen here.
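
As a rough sketch of that rule, a verifier check could look like the following standalone C++ toy. It is not the SIL verifier: `ToyInst`, `firstAddressViolation`, and `verifyOpaqueValues` are hypothetical, and the allowed list is taken literally from the sentence above, so the real exception list may well be longer.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy instruction: an opcode name plus a flag saying whether any
// operand or result has an address type ($*T).
struct ToyInst {
  std::string opcode;
  bool touchesAddress;
};

// Opcodes the text lists as still permitted on addresses in canonical SIL.
inline bool isAllowedOnAddress(const std::string &opcode) {
  return opcode == "alloc_stack" || opcode == "load" || opcode == "store";
}

// Returns the index of the first instruction that violates the rule, or -1.
inline int firstAddressViolation(const std::vector<ToyInst> &body) {
  for (std::size_t i = 0; i < body.size(); ++i)
    if (body[i].touchesAddress && !isAllowedOnAddress(body[i].opcode))
      return static_cast<int>(i);
  return -1;
}

// Verifier-style entry point: aborts in debug builds on the first violation.
inline void verifyOpaqueValues(const std::vector<ToyInst> &body) {
  assert(firstAddressViolation(body) == -1 &&
         "address-type operation not allowed in canonical SIL");
}
```

Under this model a body that uses copy_value passes, while one that still contains copy_addr is rejected at that instruction.
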

Optimized address lowering.

The AddressLowering pass will be optimized to avoid allocating
storage for temporaries.
John McCall's example:
  try_apply %someFunction() normal %cont, unwind %handler
cont(%value: $T):
  %enum = enum #MyEnum.foo, %value : $T
  %any = existential $Any, %enum
  %fn = function_ref @bar
  apply %fn(%any)
handler(%error: $Error):
  throw %error

"Naive allocation here is going to introduce a lot of moves.
Optimally, we would receive the return value from %someFunction
directly in the payload of %enum, which we want to build directly
into the allocated existential buffer of %any.  But to do this, we
actually need to allocate that existential buffer before executing
the try_apply; and if the try_apply throws, we need to deallocate
that existential buffer in the handler block.  The need to
retroactively insert this kind of clean-up code adds a lot of
complexity to this allocation approach.  Moreover, it's quite
possible that complex intermediate control — for example, if there's
a loop somewhere between the definition of a value and its consuming
use — will tend to block this kind of analysis and cause more
unnecessary moves."
I will be focusing on this design work over the next couple months.

Copy Optimization.

The copy forwarding pass will be redesigned to coalesce SSA values
by eliminating copies.
This will prepare SIL for "Semantic ARC Optimization".
The existing CopyForwarding pass can be eliminated!
I have been looking forward to doing this for a long time. As soon
as we have performance parity with the address lowering pass, I can
begin this work.