@philipturner
Created April 22, 2022 01:28
// I suspect that the crash is caused by a memory leak at runtime. It keeps
// crashing at random places. Once, it even crashed at the end of `llvm::orc::
// runAsMain`.
// I transformed it into a cast that I am 100% sure is allowed, and it still
// crashed, but less than 100% of the time. In the possibly illegal form, it
// crashed exactly 100% of the time.
// It seems like a valid cast in the code. It is a differentiable_function_expr
// but it is cached somewhere. I noticed that removing the previous declaration
// of `MyType` decreased the crash rate to something below 100%. Perhaps the
// crash is caused by the duplication of a type object, or dual-referencing of
// similar type objects - one with and one without @noDerivative.
// There are a few possible explanations:
// Explanation #1
// 1) Two type objects that can theoretically match the function must exist.
// 2) There must already be a proper, working way to distinguish them, but that
// way is corrupted or circumvented due to unique circumstances.
// 3) The corruption or circumvention results in a memory leak at runtime.
// Explanation #2
// 1) Two type objects that can theoretically match the function must exist.
// 2) There is no way to distinguish them.
// 3) (2) results in a memory leak at runtime.
// Explanation #3
// 1) Only one matching type object may exist at any one time.
// 2) The existing type object is either modified or overwritten at runtime.
// 3) (2) results in a memory leak at runtime.
// Explanation #4
// 1) Only one matching type object can exist and is immutable.
// 2) The casting operation attempts to access functionality of the type object
// that does not exist, at runtime.
// 3) The attempt to access nonexistent functionality results in an invalid
// memory access at runtime.
// How can I test all of these hypotheses?
// 1) Examine SIL code
// 2) Run the Swift program itself (not the compiler) through LLDB and track
// the state of every variable or object present.
// On both ARM and x86, it crashes in the middle of a call to
// libswiftCore.dylib`swift_retain
// The assembly instruction it crashes on can vary.
// I can't tell with certainty whether the call to Retain happens inside the
// call to print or after it. From the stack trace, it seems to be 5 stack
// frames inside `print`. From the SIL, there are 3 strong retains of functions
// after the call to `print`. So I'm not 100% sure where the call happens.
// The error I got on ARM was much more descriptive than the one on x86. It said
// EXC_BAD_ACCESS (code=2, address=0x1aa325e58). On both architectures, it seems
// to happen when accessing the stack pointer. From online forums, the error
// with code=2 happens mostly when the stack runs out of memory. This matches
// what I saw in assembly.
// Enabling Zombie Objects doesn't catch it
// Calling a function that works with strings suppresses the bug
// Using Address Sanitizer suppresses the bug (it's a Schrodinger's bug!)
// The probability of the bug happening is inversely proportional to the length
// of time of a call to `Foundation.usleep` happening before the `print`
// statement. On x86, this is measurable, while on ARM, the probability drops
// to <1% by the time I reach 1.00 seconds.
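// A minimal sketch of how the crash rate could be measured as a function of
// the sleep duration. Assumptions (not from the original investigation): the
// reproducer is compiled to a hypothetical executable that takes the `usleep`
// duration in microseconds as its first argument, and `crashRate` is an
// illustrative helper name.
import Foundation

/// Runs `executable` `trials` times and returns the fraction of runs that
/// were terminated by a signal (for example, the one behind EXC_BAD_ACCESS).
func crashRate(
  executable: String, arguments: [String], trials: Int
) throws -> Double {
  precondition(trials > 0, "need at least one trial")
  var crashes = 0
  for _ in 0..<trials {
    let process = Process()
    process.executableURL = URL(fileURLWithPath: executable)
    process.arguments = arguments
    try process.run()
    process.waitUntilExit()
    // A normal exit reports `.exit`; a crash reports `.uncaughtSignal`.
    if process.terminationReason == .uncaughtSignal {
      crashes += 1
    }
  }
  return Double(crashes) / Double(trials)
}
// Possible usage, with a hypothetical path, sweeping the delay:
// for micros in [0, 1_000, 10_000, 100_000, 1_000_000] {
//   let rate = try crashRate(
//     executable: "./reproducer", arguments: ["\(micros)"], trials: 100)
//   print(micros, rate)
// }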
// This may just be an extremely weird edge case, with no solution. There is an
// example of this from Apple Developer Forums:
//
// I had the same crash but it occurred when accessing the auto-generated initializer for a struct. I'd do a clean, and then build, and it would crash with an "exc_bad_access" code=2. The stack trace ends with a call to libswiftCore.dylib`swift_retain I'd then manually code the initializer and it'd be fine. If I commented out the code, without cleaning, it'd still be fine which was odd. But if you clean it again, and then build it using the auto-generated initializer, it'd crash again. It seems that the auto-generated code is causing some issues. This also seems to be the case for codedby.pm too as he discussed changing a property from: public private(set) var foo: Bar to: public var foo: Bar removed the crash.
//
// The developer changed the declaration of a property. The crash involved
// auto-generated code and went away if the build products were previously
// cached. My situation may be analogous, so it may not be solvable.
// If there is any auto-generated code within print, maybe I can narrow this
// down further. Let's look at code for the Swift Stdlib.
// I have success! I narrowed down the bug to something smaller!
import Differentiation
typealias MyType = @differentiable(reverse) (Float, @noDerivative Int) -> Float
@differentiable(reverse)
func myFunc(_ x: Float, _ y: @noDerivative Int) -> Float { x }
let item = myFunc as MyType
let array = [item]
let obj = array.withUnsafeBufferPointer { pointer in
return pointer[0]
}
// I found another reproducer. It requires that there are 2 attempts to
// reference the function pointer. Also, Box must be a class.
// Even weirder:
import Differentiation
typealias MyType = @differentiable(reverse) (Float, @noDerivative Int) -> Float
@differentiable(reverse)
func myFunc(_ x: Float, _ y: @noDerivative Int) -> Float { x }
let item = myFunc as MyType
class Box<T> {
var value: T
init(value: T) {
self.value = value
}
}
let box = Box(value: item)
let x = box.value // crash HERE
let www = withUnsafePointer(to: item) { pointer in
return pointer[0]
}
// The crash happens before www is initialized. If I comment out the code that
// initializes www, the crash does not happen.
// That leads to this reproducer:
import Differentiation
typealias MyType = @differentiable(reverse) (Float, @noDerivative Int) -> Float
@differentiable(reverse)
func myFunc(_ x: Float, _ y: @noDerivative Int) -> Float { x }
let item = myFunc as MyType
withExtendedLifetime(item) {
class Box<T> {
var value: T
init(value: T) {
self.value = value
}
}
let box = Box(value: item)
_ = box.value
}
// The crash involves Swift reference counting (retain), and that is not a
// coincidence, even though previous crashes may have occurred somewhere
// outside the retain function body (I'm not sure whether my observation with
// llvm::orc::runAsMain was accurate).
// Interestingly enough, removing the generic requirement and instead manually
// specifying the type as `MyType` stops the crash.
// Cleaning up the reproducer:
import Differentiation
@differentiable(reverse)
func myFunc(_ x: Float, _ y: @noDerivative Int) -> Float { x }
let item = myFunc as
@differentiable(reverse) (Float, @noDerivative Int) -> Float
class Box<T> {
var value: T
init(value: T) {
self.value = value
}
}
withExtendedLifetime(item) {
let box = Box(value: item)
let retrieved = box.value
}
// The crash happens regardless of whether `box` is declared inside the
// `withExtendedLifetime` statement.
// Even worse:
import Differentiation
@differentiable(reverse)
func myFunc(_ x: Float, _ y: @noDerivative Int) -> Float { x }
let item = myFunc as
@differentiable(reverse) (Float, @noDerivative Int) -> Float
class Box<T> {
var value: T
init(value: T) {
self.value = value
}
}
let box = Box(value: item)
withExtendedLifetime(item) {}
let retrieved = box.value // crash here
// Does this change the refcount of the function pointer?
// This just happened
import Differentiation
@differentiable(reverse)
func myFunc(_ x: Float, _ y: @noDerivative Int) -> Float { x }
var item = myFunc as
@differentiable(reverse) (Float, @noDerivative Int) -> Float
class Box<T> {
var value: T
init(value: T) {
self.value = value
}
}
let box = Box(value: item)
let retrieved = box.value // crash here
// That means there is no need to reference the object twice with `www` or
// `withExtendedLifetime`.
// This has no effect on where the crash happens:
import Differentiation
@differentiable(reverse)
func myFunc(_ x: Float, _ y: @noDerivative Int) -> Float { x }
var item = myFunc as
@differentiable(reverse) (Float, @noDerivative Int) -> Float
var ob: Any?
class Box<T> {
var value: T
init(value: T) {
self.value = value
ob = value
}
}
let box = Box(value: item)
print(ob)
let retrieved = box.value // crash here
// And casting `item` to Any in its declaration suppresses the crash.
// This suppresses the crash as well:
import Differentiation
@differentiable(reverse)
func myFunc(_ x: Float, _ y: @noDerivative Int) -> Float { x }
var item = myFunc as
@differentiable(reverse) (Float, @noDerivative Int) -> Float
class Box<T> {
var value: Any
init(value: T) {
self.value = value
}
}
let box = Box(value: item)
let retrieved = box.value as! (@differentiable(reverse) (Float, @noDerivative Int) -> Float) // crash here
// From the LLVM IR and assembly, there are 3 direct calls to _swift_retain
// in the final code. I don't know if it indirectly calls Retain by calling
// into the Swift Stdlib. Assuming it doesn't, I've found something interesting.
// It loads all 6 components of the 48-byte function pointer. The components in
// %5, %7, and %9 look the most like actual reference-counted Swift function
// pointers. Coincidentally, they are the arguments of the 3 calls to Retain.
// Maybe instead, it's ref-counting the context pointer. I can't be sure which
// of the 6 pointers were for the function, and which were for the context. The
// ones that weren't refcounted appeared first in memory, which intuitively
// should be the function pointer.
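// A small sketch of the size arithmetic behind the "6 components". It assumes
// a 64-bit platform and that a @differentiable(reverse) function value is
// stored as three ordinary (thick) function values: the original, the JVP, and
// the VJP. `PlainFunction` is an illustrative name, not part of the reproducer.
typealias PlainFunction = (Float) -> Float

// As far as I understand, a thick Swift function value is two words: a code
// pointer followed by a reference-counted context pointer.
let wordSize = MemoryLayout<UnsafeRawPointer>.size
let plainSize = MemoryLayout<PlainFunction>.size
let bundleSize = 3 * plainSize
print("plain:", plainSize / wordSize, "words; bundle:", bundleSize / wordSize, "words")
// Under that assumption, only the three context words would need a retain,
// which would be consistent with seeing exactly 3 calls to swift_retain.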
// Is the SIL, IR, or Assembly different when I omit the @noDerivative
// attribute?
// No. I can only omit the attribute from the function declaration, and it still
// crashes in the same place.
// Omitting @differentiable(reverse) from the function declaration does not stop
// the crash either.
// At least I have an even more minimized reproducer:
import Differentiation
func myFunc(_ x: Float, _ y: Int) -> Float { x }
let item = myFunc as
@differentiable(reverse) (Float, @noDerivative Int) -> Float
class Box<T> {
var value: T
init(value: T) {
self.value = value
}
}
let box = Box(value: item)
let retrieved = box.value // crash here
// Now, I'm going to confirm that the crash during swift_retain happens during
// one of the swift_retain calls explicitly mentioned in the assembly.
// I can say with confidence that no other function indirectly calls
// swift_retain.
// I have been having trouble reproducing the crash with the above reproducer
// today, but adding the `withExtendedLifetime` clause right above the crash
// site makes the crash much more likely to happen. How does that change the
// final output code?
// This just happened!!!! @noDerivative is not part of the bug!
import Differentiation
func myFunc(_ x: Float) -> Float { x }
let item = myFunc as
@differentiable(reverse) (Float) -> Float
class Box<T> {
var value: T
init(value: T) {
self.value = value
}
}
let box = Box(value: item)
withExtendedLifetime(item) { }
let retrieved = box.value
// And this causes the compiler to crash! I think I'm on to something!
import Differentiation
func myFunc(_ x: Float, _ y: Float) -> Float { x + y }
let item = myFunc as
@differentiable(reverse) (Float, Float) -> Float
class Box<T> {
var value: T
init(value: T) {
self.value = value
}
}
let box = Box(value: item)
let retrieved = box.value
// Some weird stuff is going on in Xcode, but I reproduced the crash on my
// custom toolchain with all the recent bug fixes implemented. Awesome!!!!!
// The reabstraction stuff works the first time, but crashes the second time. I
// think it even takes the same execution path. I should track the state of all
// variables through the 1st time (which succeeds) and the 2nd time (which
// fails).
// TODO: set an assert guard for similar conditions around `plan`. See if
// they're hit in the test suite.
// TODO: it looks like call #8 of `plan` is similar to a component of call #7
// SIL:
// thunk for @escaping @callee_guaranteed (@in_guaranteed Float, @in_guaranteed Float) -> (@out Float, @owned @escaping @callee_guaranteed (@in_guaranteed Float) -> (@out Float, @out Float))
sil shared [transparent] [serialized] [reabstraction_thunk] [ossa] @$sS6fIegnrr_Iegnnro_S6fIegydd_Iegyydo_TR : $@convention(thin) (Float, Float, @guaranteed @callee_guaranteed (@in_guaranteed Float, @in_guaranteed Float) -> (@out Float, @owned @callee_guaranteed (@in_guaranteed Float) -> (@out Float, @out Float))) -> (Float, @owned @callee_guaranteed (Float) -> (Float, Float)) {
// %0 // user: %4
// %1 // user: %6
// %2 // user: %8
bb0(%0 : $Float, %1 : $Float, %2 : @guaranteed $@callee_guaranteed (@in_guaranteed Float, @in_guaranteed Float) -> (@out Float, @owned @callee_guaranteed (@in_guaranteed Float) -> (@out Float, @out Float))):
%3 = alloc_stack $Float // users: %8, %4
store %0 to [trivial] %3 : $*Float // id: %4
%5 = alloc_stack $Float // users: %8, %6
store %1 to [trivial] %5 : $*Float // id: %6
%7 = alloc_stack $Float // users: %9, %8
%8 = apply %2(%7, %3, %5) : $@callee_guaranteed (@in_guaranteed Float, @in_guaranteed Float) -> (@out Float, @owned @callee_guaranteed (@in_guaranteed Float) -> (@out Float, @out Float))
%9 = load [trivial] %7 : $*Float
} // end sil function '$sS6fIegnrr_Iegnnro_S6fIegydd_Iegyydo_TR'
// ThunkType:
(sil_function_type type=@convention(thin) (Float, @guaranteed @callee_guaranteed (@in_guaranteed Float) -> (@out Float, @out Float)) -> (Float, Float)
(input=struct_type decl=Swift.(file).Float)
(input=sil_function_type type=@callee_guaranteed (@in_guaranteed Float) -> (@out Float, @out Float)
(input=struct_type decl=Swift.(file).Float)
(result=struct_type decl=Swift.(file).Float)
(result=struct_type decl=Swift.(file).Float)
(substitution_map generic_signature=<nullptr>)
(substitution_map generic_signature=<nullptr>))
(result=struct_type decl=Swift.(file).Float)
(result=struct_type decl=Swift.(file).Float)
(substitution_map generic_signature=<nullptr>)
(substitution_map generic_signature=<nullptr>))
// I found a way to work around one case of the crash. When creating the
// reabstraction thunk, the code assumes it is never possible for innerResult
// to be indirect, outerOrigType to be a tuple, and innerResult to not be a
// tuple, all at once. However, that is exactly the case when myFunc is
// (Float, Float) -> Float.
// Fix:
// swiftSILGen/SILGenPoly.cpp, circa line 2213
// ResultPlanner::plan(AbstractionPattern innerOrigType,
// CanType innerSubstType,
// AbstractionPattern outerOrigType,
// CanType outerSubstType,
// PlanData &planData)
//
// Change the two-part conditional below this comment:
// // If the inner result is a tuple, we need to expand from a temporary.
// to:
// if (innerResult.getInterfaceType()->is<TupleType>()) {
//
// Below this:
// auto outerResult = claimNextOuterResult(planData);
// Add this:
// if (innerResult.isFormalIndirect() && outerOrigType.isTuple()) {
// assert(!(SGF.silConv.isSILIndirect(outerResult.first)));
// assert(SGF.silConv.isSILIndirect(innerResult));
// SILValue innerResultAddr =
// addInnerIndirectResultTemporary(planData, innerResult);
// addIndirectToDirect(innerResultAddr, outerResult.first);
//
// auto innerResult2 = claimNextInnerResult(planData);
// auto outerResult2 = claimNextOuterResult(planData);
// SILValue innerResultAddr2 =
// addInnerIndirectResultTemporary(planData, innerResult2);
// addIndirectToDirect(innerResultAddr2, outerResult2.first);
// return;
// }
// But that doesn't stop the crash for this signature:
// (Float) -> Float
// Or this signature:
// (Float, @noDerivative Int) -> Float
// Observing the execution path through a few cases:
// C = a call to ResultPlanner::plan
//
// (Float, Float) -> Float
// repeat x 2 {
// C
// JVP {
// C
// }
// VJP {
// C
// C
// }
// }
// Crash at compile-time (manually averted)
//
// (Float) -> Float
// repeat x 2 {
// C
// JVP {
// C
// }
// VJP {}
// }
// Crash at runtime, which is flaky
//
// (Float, @noDerivative Int) -> Float
// repeat x 2 {
// C
// JVP {
// C
// C
// }
// VJP {}
// }
// Crash at runtime, which is flaky
//
// (Float) -> (Float, Float)
// ~ Exact same as (Float, Float) -> Float.
// Crash at compile-time (not averted)
// What's worse: (Float) -> (Float, Float) crashes during the 8th "C" because
// something isn't a `ReferenceStorageType`. The crash for (Float, Float) ->
// Float happened because something isn't a `TupleType`. That means the problem
// isn't in the code that handles reabstraction thunks; it's something more
// chronic that happens earlier on.
// Compilation proceeds just fine for (Float, Float) -> Float and even proceeds
// through runtime without crashing. I should further scrutinize that working
// case, by calling the function at runtime after loading it. I suspect the
// function will be in a corrupted state. And, I can try this with cases #2
// and #3 - which don't crash 100% of the time at runtime.
// Sure enough, I got something!
import _Differentiation
func myFunc(_ x: Float, _ y: @noDerivative Int) -> Float { x }
let item = myFunc as @differentiable(reverse) (Float, @noDerivative Int) -> Float
class Box<T> {
var value: T
init(value: T) {
self.value = value
}
}
let box = Box(value: item)
print("hello world")
let retrieved = box.value // crash here 40% of the time
let arg1: Float = 2
let arg2: Int = 2
print("hello world 2")
let (ret1, grad1) = valueWithGradient(at: arg1) { arg1 in
print("hello world 2.1")
let ret1 = retrieved(arg1, arg2) // crash here 60% of the time
print("hello world 2.2")
return ret1
}
print("hello world 3")
print(ret1, grad1)
// This also crashes! In fact, the first time it crashed seemed to be during
// COMPILE time. It was either IRGen at compile time, or runtime execution of
// some code that resolves the device's architecture.
import _Differentiation
func myFunc(_ x: Float, _ y: Float) -> Float { x + y }
let item: @differentiable(reverse) (Float, Float) -> Float = myFunc
class Box<T> {
var value: T
init(value: T) {
self.value = value
}
}
let box = Box(value: item)
print("hello world")
let retrieved = box.value // crash here
print("hello world 2")
let arg1: Float = 2
let arg2: Float = 3
let ret = retrieved(arg1, arg2)
// Conclusion:
// The compiler is producing corrupted code, but the problem's source happens
// before everything I just studied. I need to remove my workaround and
// determine when the compiler's state becomes corrupted.
// Execution path of (Float) -> (Float, Float)
// 1:
// dest: $*@differentiable(reverse) @callee_guaranteed (Float) -> (Float, Float)
// destTL: $@differentiable(reverse) @callee_guaranteed (Float) -> (Float, Float)
// 2 (no conditional precedent):
// dest: $*T
// destTL: $*T
// C jvp { C } vjp { C C }
// 3:
// dest: $*@differentiable(reverse) @callee_guaranteed @substituted <τ_0_0, τ_0_1> (@in_guaranteed τ_0_0) -> @out τ_0_1 for <Float, (Float, Float)>
// destTL: $@differentiable(reverse) @callee_guaranteed @substituted <τ_0_0, τ_0_1> (@in_guaranteed τ_0_0) -> @out τ_0_1 for <Float, (Float, Float)>
// 4:
// dest: $*Box<@differentiable(reverse) (Float) -> (Float, Float)>
// destTL: $Box<@differentiable(reverse) (Float) -> (Float, Float)>
// 5 (no conditional precedent):
// dest: $*Float
// destTL: $Float
// C jvp {
// 6 (no conditional precedent):
// dest: $*Float
// destTL: $Float
// C } vjp {
// 7 (no conditional precedent):
// dest: $*Float
// destTL: $Float
// C
// 8 (no conditional precedent):
// dest: $*Float
// destTL: $Float
// 9 (no conditional precedent):
// dest: $*Float
// destTL: $Float
// C }
// 10:
// dest: $*@differentiable(reverse) @callee_guaranteed (Float) -> (Float, Float)
// destTL: $@differentiable(reverse) @callee_guaranteed (Float) -> (Float, Float)
// rvalue: $@differentiable(reverse) @callee_guaranteed (Float) -> (Float, @noDerivative Float)
// crash
// Execution path of (Float, Float) -> (Float) with workaround
// 1:
// dest: $*@differentiable(reverse) @callee_guaranteed (Float, Float) -> Float
// destTL: $@differentiable(reverse) @callee_guaranteed (Float, Float) -> Float
// 2 (no conditional precedent):
// dest: $*T
// destTL: $*T
// C jvp { C } vjp { C C }
// 3:
// dest: $*@differentiable(reverse) @callee_guaranteed @substituted <τ_0_0, τ_0_1, τ_0_2> (@in_guaranteed τ_0_0, @in_guaranteed τ_0_1) -> @out τ_0_2 for <Float, Float, Float>
// destTL: $@differentiable(reverse) @callee_guaranteed @substituted <τ_0_0, τ_0_1, τ_0_2> (@in_guaranteed τ_0_0, @in_guaranteed τ_0_1) -> @out τ_0_2 for <Float, Float, Float>
// 4:
// dest: $*Box<@differentiable(reverse) (Float, Float) -> Float>
// destTL: $Box<@differentiable(reverse) (Float, Float) -> Float>
// 5 (no conditional precedent):
// dest: $*Float
// destTL: $Float
// 6 (no conditional precedent):
// dest: $*Float
// destTL: $Float
// C jvp {
// 7 (no conditional precedent):
// dest: $*Float
// destTL: $Float
// 8 (no conditional precedent):
// dest: $*Float
// destTL: $Float
// C } vjp {
// 9 (no conditional precedent):
// dest: $*Float
// destTL: $Float
// 10 (no conditional precedent):
// dest: $*Float
// destTL: $Float
// C
// 11 (no conditional precedent):
// dest: $*Float
// destTL: $Float
// C }
// 12:
// dest: $*@differentiable(reverse) @callee_guaranteed (Float, Float) -> Float
// destTL: $@differentiable(reverse) @callee_guaranteed (Float, Float) -> Float
// rvalue: $@differentiable(reverse) @callee_guaranteed (Float, Float) -> Float
// 13 (no conditional precedent):
// dest: $*T
// destTL: $*T
// Execution path of (Float) -> Float
// 1:
// dest: $*@differentiable(reverse) @callee_guaranteed (Float) -> Float
// destTL: $@differentiable(reverse) @callee_guaranteed (Float) -> Float
// 2 (no conditional precedent):
// dest: $*T
// destTL: $*T
// C jvp { C } vjp {}
// 3:
// dest: $*@differentiable(reverse) @callee_guaranteed @substituted <τ_0_0, τ_0_1> (@in_guaranteed τ_0_0) -> @out τ_0_1 for <Float, Float>
// destTL: $@differentiable(reverse) @callee_guaranteed @substituted <τ_0_0, τ_0_1> (@in_guaranteed τ_0_0) -> @out τ_0_1 for <Float, Float>
// 4:
// dest: $*Box<@differentiable(reverse) (Float) -> Float>
// destTL: $Box<@differentiable(reverse) (Float) -> Float>
// 5 (no conditional precedent):
// dest: $*Float
// destTL: $Float
// C jvp {
// 6 (no conditional precedent):
// dest: $*Float
// destTL: $Float
// C } vjp {}
// 7:
// dest: $*@differentiable(reverse) @callee_guaranteed (Float) -> Float
// destTL: $@differentiable(reverse) @callee_guaranteed (Float) -> Float
// rvalue: $@differentiable(reverse) @callee_guaranteed (Float) -> Float
// 8 (no conditional precedent):
// dest: $*T
// destTL: $*T
// Execution path of (Float, @noDerivative Int) -> Float - myFunc decl is (Float, Int) -> Float
// 1:
// dest: $*@differentiable(reverse) @callee_guaranteed (Float, @noDerivative Int) -> Float
// destTL: $@differentiable(reverse) @callee_guaranteed (Float, @noDerivative Int) -> Float
// 2 (no conditional precedent):
// dest: $*T
// destTL: $*T
// C jvp { C } vjp {}
// 3:
// dest: $*@differentiable(reverse) @callee_guaranteed @substituted <τ_0_0, τ_0_1, τ_0_2> (@in_guaranteed τ_0_0, @noDerivative @in_guaranteed τ_0_1) -> @out τ_0_2 for <Float, Int, Float>
// destTL: $@differentiable(reverse) @callee_guaranteed @substituted <τ_0_0, τ_0_1, τ_0_2> (@in_guaranteed τ_0_0, @noDerivative @in_guaranteed τ_0_1) -> @out τ_0_2 for <Float, Int, Float>
// 4:
// dest: $*Box<@differentiable(reverse) (Float, @noDerivative Int) -> Float>
// destTL: $Box<@differentiable(reverse) (Float, @noDerivative Int) -> Float>
// 5 (no conditional precedent):
// dest: $*Float
// destTL: $Float
// 6 (no conditional precedent):
// dest: $*Int
// destTL: $Int
// C jvp {
// 7 (no conditional precedent):
// dest: $*Float
// destTL: $Float
// 8 (no conditional precedent):
// dest: $*Int
// destTL: $Int
// C }
// 9 (no conditional precedent):
// dest: $*Float
// destTL: $Float
// vjp {}
// 10:
// dest: $*@differentiable(reverse) @callee_guaranteed (Float, @noDerivative Int) -> Float
// destTL: $@differentiable(reverse) @callee_guaranteed (Float, @noDerivative Int) -> Float
// rvalue: $@differentiable(reverse) @callee_guaranteed (Float, @noDerivative Int) -> Float
// 11 (no conditional precedent):
// dest: $*T
// destTL: $*T
// In case #1 of the above execution graphs, there is something peculiar. Right
// before it crashes, rvalue dumps this:
%58 = differentiable_function [parameters 0] [results 0] %49 : $@callee_guaranteed (Float) -> (Float, Float) with_derivative {%53 : $@callee_guaranteed (Float) -> (Float, Float, @owned @callee_guaranteed (Float) -> (Float, Float)), %57 : $@callee_guaranteed (Float) -> (Float, Float, @owned @callee_guaranteed (Float, Float) -> Float)}
// But its type is this:
$@differentiable(reverse) @callee_guaranteed (Float) -> (Float, @noDerivative Float)
// And its type lowering is this:
Type Lowering for lowered type: $@differentiable(reverse) @callee_guaranteed (Float) -> (Float, @noDerivative Float).
Expansion: Maximal
isTrivial: false.
isFixedABI: true.
isAddressOnly: false.
isResilient: false.
// Is that even legal? Why does the type not match what it dumps?
expr rvalue->getDefiningInstruction()->dump();
%58 = differentiable_function [parameters 0] [results 0] %49 : $@callee_guaranteed (Float) -> (Float, Float) with_derivative {%53 : $@callee_guaranteed (Float) -> (Float, Float, @owned @callee_guaranteed (Float) -> (Float, Float)), %57 : $@callee_guaranteed (Float) -> (Float, Float, @owned @callee_guaranteed (Float, Float) -> Float)}
// Problem: where the @noDerivative appears
// It creates a DifferentiableFunctionInst.
// ParameterIndices: [0]
// ResultIndices: [0]
// type: (Float) -> (Float, Float)
// Iterating through getWithDifferentiability:
// param #1: Float, present in indices
// differentiability: DifferentiableOrNotApplicable
// result #1: Float, present in indices
// differentiability: DifferentiableOrNotApplicable
// result #2: Float, NOT present in indices
// differentiability: NotDifferentiable
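// A toy model of the marking step above, to show how a stray @noDerivative can
// appear. This is not the compiler's implementation: `ToyDifferentiability`
// and `markResults` are illustrative names. The point is that result index set
// [0] means "the whole result" when the result list is [@out (Float, Float)],
// but means "only the first scalar" once the tuple has been expanded into
// [Float, Float].
enum ToyDifferentiability {
  case differentiableOrNotApplicable
  case notDifferentiable
}

/// Marks every result whose position is missing from `indices` as not
/// differentiable, mirroring the iteration listed above.
func markResults(
  _ results: [String], indices: Set<Int>
) -> [(String, ToyDifferentiability)] {
  results.enumerated().map { position, name in
    (name, indices.contains(position)
      ? .differentiableOrNotApplicable : .notDifferentiable)
  }
}

// One indirect tuple result: nothing is marked as not differentiable.
let unexpanded = markResults(["(Float, Float)"], indices: [0])
// Two direct scalar results with the same indices: the second one is marked.
let expanded = markResults(["Float", "Float"], indices: [0])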
// Source of problem:
// sourceType is a SILFunctionType. It dumps:
(sil_function_type type=@differentiable(reverse) @callee_guaranteed (@in_guaranteed Float) -> @out (Float, Float)
(input=struct_type decl=Swift.(file).Float)
(result=tuple_type num_elements=2
(tuple_type_elt
(struct_type decl=Swift.(file).Float))
(tuple_type_elt
(struct_type decl=Swift.(file).Float)))
(substitution_map generic_signature=<nullptr>)
(substitution_map generic_signature=<nullptr>))
// paramIndices returns [0]
// parameters are [@in_guaranteed Float]
// resultIndices returns [0]
// results are [@out (Float, Float)]
// when iterating over parameters, it thinks they are:
// [Float]
// when iterating over results, it thinks they are:
// [Float, Float]
// What? Is this even the same SILFunction on which we're calling
// getParameters() and getResults()? It is not.
// sourceType:
(sil_function_type type=@differentiable(reverse) @callee_guaranteed (@in_guaranteed Float) -> @out (Float, Float)
(input=struct_type decl=Swift.(file).Float)
(result=tuple_type num_elements=2
(tuple_type_elt
(struct_type decl=Swift.(file).Float))
(tuple_type_elt
(struct_type decl=Swift.(file).Float)))
(substitution_map generic_signature=<nullptr>)
(substitution_map generic_signature=<nullptr>))
// originalThunk is created entirely independently of sourceType. It has the
// following properties:
// dump()
[cleanup] %49 = partial_apply [callee_guaranteed] %48(%47) : $@convention(thin) (Float, @guaranteed @callee_guaranteed (@in_guaranteed Float) -> @out (Float, Float)) -> (Float, Float)
// getType.dump()
$@callee_guaranteed (Float) -> (Float, Float)
// forward(SGF).dump()
%49 = partial_apply [callee_guaranteed] %48(%47) : $@convention(thin) (Float, @guaranteed @callee_guaranteed (@in_guaranteed Float) -> @out (Float, Float)) -> (Float, Float)
// forward(SGF).getType().dump()
$@callee_guaranteed (Float) -> (Float, Float)
// getType().castTo<SILFunctionType>().dump()
(sil_function_type type=@callee_guaranteed (Float) -> (Float, Float)
(input=struct_type decl=Swift.(file).Float)
(result=struct_type decl=Swift.(file).Float)
(result=struct_type decl=Swift.(file).Float)
(substitution_map generic_signature=<nullptr>)
(substitution_map generic_signature=<nullptr>))
// So the sourceType does not match everything else describing it. I need to
// prove that this is incorrect behavior.
// fn.getType().dump()
$@differentiable(reverse) @callee_guaranteed (@in_guaranteed Float) -> @out (Float, Float)
// inputOrigType.dump();
AP::Type<τ_0_0>((generic_type_param_type depth=0 index=0)
)
// inputSubstType.dump()
(function_type escaping
(input=function_params num_params=1
(param
(struct_type decl=Swift.(file).Float)))
(output=tuple_type num_elements=2
(tuple_type_elt
(struct_type decl=Swift.(file).Float))
(tuple_type_elt
(struct_type decl=Swift.(file).Float))))
// outputOrigType.dump()
AP::Type((function_type escaping
(input=function_params num_params=1
(param
(struct_type decl=Swift.(file).Float)))
(output=tuple_type num_elements=2
(tuple_type_elt
(struct_type decl=Swift.(file).Float))
(tuple_type_elt
(struct_type decl=Swift.(file).Float))))
)
// outputSubstType.dump()
(function_type escaping
(input=function_params num_params=1
(param
(struct_type decl=Swift.(file).Float)))
(output=tuple_type num_elements=2
(tuple_type_elt
(struct_type decl=Swift.(file).Float))
(tuple_type_elt
(struct_type decl=Swift.(file).Float))))
// originalThunk.getType().dump()
$@callee_guaranteed (Float) -> (Float, Float)
// sourceType->getParameters().size()
1
// sourceType->getResults().size()
1
// originalThunk.getType().castTo<SILFunctionType>()->getParameters().size()
1
// originalThunk.getType().castTo<SILFunctionType>()->getResults().size()
2
// The assertions are never triggered in the tests! I have proof of how the
// compiler's state is corrupted! That also explains why there's a strange
// scenario where a Tuple should exist, but is in fact 2 scalars.
// It seems like this is a problem of not manually bridging between the Tuple
// and 2 scalars. I'll trace it earlier on, and see where it begins.
// For the (Float, Float) -> Float case, in the 8th call to C, there is a
// transformation of `innerOrigType` == AP::Opaque -> [@out Float, @out Float].
// During the 4th call to C, there is a transformation of `outerOrigType` from
// AP::Opaque -> [@out (Float, Float)]. The other characteristics of the two
// calls mirror each other.
// Also, calls 3 and 7 repeat that dynamic. It's about whether a VJP function
// type returns @out (Float, Float) or (@out Float, @out Float).
// The mislabeling of @out (Float, Float) to (@out Float, @out Float) does not
// happen in C's #1, 2, 3, 5, 6, and 7 for the (Float) -> (Float, Float) case.
// The crash for that case happens after the transformation.
// For (Float, Float) -> Float, these two instructions are where I think it goes
// awry:
%54 = differentiable_function_extract [vjp] %45 : $@differentiable(reverse) @callee_guaranteed (@in_guaranteed Float, @in_guaranteed Float) -> @out Float // user: %55
%55 = copy_value %54 : $@callee_guaranteed (@in_guaranteed Float, @in_guaranteed Float) -> (@out Float, @owned @callee_guaranteed (@in_guaranteed Float) -> (@out Float, @out Float))
// The problem starts just before the 1st C of the 2nd VJP (C #7). The location
// I backtraced to was inside `Transform::transformFunction` in
// swiftSILGen/SILGenPoly.cpp, which starts circa line 4085.
// Now examine the case of (Float) -> (Float, Float). Can I narrow down its
// origin further?
// It fails almost immediately, after a custom set of asserts. The location is
// swiftSILGen/SILGenPoly.cpp, ManagedValue createThunk, which starts circa line
// 3123. In the conditional that only fires if sourceType->isDifferentiable(), I
// was asserting that the sourceType and expectedType's results have the same
// size.
// sourceType
(sil_function_type type=@differentiable(reverse) @callee_guaranteed (Float) -> (Float, Float)
(input=struct_type decl=Swift.(file).Float)
(result=struct_type decl=Swift.(file).Float)
(result=struct_type decl=Swift.(file).Float)
(substitution_map generic_signature=<nullptr>)
(substitution_map generic_signature=<nullptr>))
// expectedType
(sil_function_type type=@differentiable(reverse) @callee_guaranteed (@in_guaranteed Float) -> @out (Float, Float)
(input=struct_type decl=Swift.(file).Float)
(result=tuple_type num_elements=2
(tuple_type_elt
(struct_type decl=Swift.(file).Float))
(tuple_type_elt
(struct_type decl=Swift.(file).Float)))
(substitution_map generic_signature=<nullptr>)
(substitution_map generic_signature=<nullptr>))
// A deeper custom assertion triggers right before the crash would otherwise
// happen. There, the compared types are:
// sourceType
(sil_function_type type=@differentiable(reverse) @callee_guaranteed @substituted <τ_0_0, τ_0_1> (@in_guaranteed τ_0_0) -> @out τ_0_1 for <Float, (Float, Float)>
(input=generic_type_param_type depth=0 index=0)
(result=generic_type_param_type depth=0 index=1)
(substitution_map generic_signature=<τ_0_0, τ_0_1>
(substitution τ_0_0 -> Float)
(substitution τ_0_1 -> (Float, Float)))
(substitution_map generic_signature=<nullptr>))
// expectedType
(sil_function_type type=@differentiable(reverse) @callee_guaranteed (Float) -> (Float, Float)
(input=struct_type decl=Swift.(file).Float)
(result=struct_type decl=Swift.(file).Float)
(result=struct_type decl=Swift.(file).Float)
(substitution_map generic_signature=<nullptr>)
(substitution_map generic_signature=<nullptr>))
// Going back to (Float, Float) -> Float:
// After 8th C:
// sourceType
(sil_function_type type=@callee_guaranteed (@in_guaranteed Float) -> (@out Float, @out Float)
(input=struct_type decl=Swift.(file).Float)
(result=struct_type decl=Swift.(file).Float)
(result=struct_type decl=Swift.(file).Float)
(substitution_map generic_signature=<nullptr>)
(substitution_map generic_signature=<nullptr>))
// expectedType
(sil_function_type type=@callee_guaranteed (Float) -> (Float, Float)
(input=struct_type decl=Swift.(file).Float)
(result=struct_type decl=Swift.(file).Float)
(result=struct_type decl=Swift.(file).Float)
(substitution_map generic_signature=<nullptr>)
(substitution_map generic_signature=<nullptr>))
// After 7th C: remember, compiler's state is supposedly corrupted before this
// sourceType
(sil_function_type type=@callee_guaranteed (@in_guaranteed Float, @in_guaranteed Float) -> (@out Float, @owned @callee_guaranteed (@in_guaranteed Float) -> (@out Float, @out Float))
(input=struct_type decl=Swift.(file).Float)
(input=struct_type decl=Swift.(file).Float)
(result=struct_type decl=Swift.(file).Float)
(result=sil_function_type type=@callee_guaranteed (@in_guaranteed Float) -> (@out Float, @out Float)
(input=struct_type decl=Swift.(file).Float)
(result=struct_type decl=Swift.(file).Float)
(result=struct_type decl=Swift.(file).Float)
(substitution_map generic_signature=<nullptr>)
(substitution_map generic_signature=<nullptr>))
(substitution_map generic_signature=<nullptr>)
(substitution_map generic_signature=<nullptr>))
// expectedType
(sil_function_type type=@callee_guaranteed (Float, Float) -> (Float, @owned @callee_guaranteed (Float) -> (Float, Float))
(input=struct_type decl=Swift.(file).Float)
(input=struct_type decl=Swift.(file).Float)
(result=struct_type decl=Swift.(file).Float)
(result=sil_function_type type=@callee_guaranteed (Float) -> (Float, Float)
(input=struct_type decl=Swift.(file).Float)
(result=struct_type decl=Swift.(file).Float)
(result=struct_type decl=Swift.(file).Float)
(substitution_map generic_signature=<nullptr>)
(substitution_map generic_signature=<nullptr>))
(substitution_map generic_signature=<nullptr>)
(substitution_map generic_signature=<nullptr>))
// After 6th C:
// sourceType
(sil_function_type type=@callee_guaranteed (@in_guaranteed Float, @in_guaranteed Float) -> @out Float
(input=struct_type decl=Swift.(file).Float)
(input=struct_type decl=Swift.(file).Float)
(result=struct_type decl=Swift.(file).Float)
(substitution_map generic_signature=<nullptr>)
(substitution_map generic_signature=<nullptr>))
// expectedType
(sil_function_type type=@callee_guaranteed (Float, Float) -> Float
(input=struct_type decl=Swift.(file).Float)
(input=struct_type decl=Swift.(file).Float)
(result=struct_type decl=Swift.(file).Float)
(substitution_map generic_signature=<nullptr>)
(substitution_map generic_signature=<nullptr>))
// I see the nesting between C7 and C8.
// Let's try seeing code that successfully compiles, and figuring out why it
// might cause a crash. Examine case (Float) -> Float.
// Why does the AST always include <<error type>>? That's supposed to be bad,
// right?
// In the case of (Float) -> (Float, Float), the compiler shoots it down as
// not differentiable, only after I type `valueWithGradient`. When I add
// `valueWithGradient`, the error fires between AST and SILGen. That means it's
// in Sema.
// What about the case of (Float, Float) -> Float? Can a tuple be labeled
// @out (Float, Float) or a function's return type be distinguished between
// a tuple and two return types, during the Sema stage?
// I got some more interesting behavior. Apparently there are some cases where
// you're not supposed to be able to extract the gradient from a differentiable
// function. But storing and retrieving the function from a `Box` bypasses that
// and allows the compilation to proceed.
// Hypothesis: it should be illegal to store and retrieve a differentiable
// function, then take the gradient of it. There is a missed check in the logic
// during Sema (type checking).
// Locate where it bans the `pullback(at:of:)` of a differentiable function
// reference that isn't its exact declaration. For example, this doesn't
// compile:
@differentiable(reverse)
func myFunc(_ x: Float) -> Float { x }
let item = myFunc
let pb = pullback(at: 2, of: item)
// Neither does this:
@differentiable(reverse)
func myFunc(_ x: Float) -> Float { x }
class Box<T> {
  var value: T
  init(value: T) {
    self.value = value
  }
}
let box = Box(value: myFunc)
let retrieved = box.value
let pb = pullback(at: 2, of: retrieved)
// But this does:
@differentiable(reverse)
func myFunc(_ x: Float) -> Float { x }
let item = myFunc as @differentiable(reverse) (Float) -> Float
let pb = pullback(at: 2, of: item)
// And this does:
@differentiable(reverse)
func myFunc(_ x: Float) -> Float { x }
class Box<T> {
  var value: T
  init(value: T) {
    self.value = value
  }
}
let box = Box(value: myFunc as @differentiable(reverse) (Float) -> Float)
let retrieved = box.value
let pb = pullback(at: 2, of: retrieved)
// On second thought, the difference between the two is as simple as adding a
// type coercion. Perhaps the last case should be allowed, but has never been
// accounted for before. What's different between it and the successfully
// compiling case before it?
// This is allowed in DifferentiableProgrammingImplementation.md, but clearly
// isn't according to the compiler!
let function: (Float) -> Float = { x in x }
let diffFunction: @differentiable(reverse) (Float) -> Float = function
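// For contrast, a minimal sketch of the form I would expect to compile. This
// is my own assumption, not something quoted from
// DifferentiableProgrammingImplementation.md: a closure literal is given the
// differentiable type directly, so the compiler can see the body it has to
// differentiate. `diffLiteral` is a hypothetical name, and a toolchain that
// ships the experimental `_Differentiation` module is required.
import _Differentiation
let diffLiteral: @differentiable(reverse) (Float) -> Float = { x in x * x }
// Mathematically, d/dx (x * x) at x = 3 is 6.
print(gradient(at: 3, of: diffLiteral))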
// Problem 1: a differentiable function cannot return a tuple. Previously, this
// was not a problem. The error would only be sparked when calling into a
// differential operator. But now, it causes a bad problem with SILGen code
// that performs reabstractions. This violates built-in assumptions and causes
// the nightmarish (Float) -> (Float, @noDerivative Float) phenomenon.
// Solutions:
// 1) throw an error when you add @differentiable(reverse) to a declaration that
// returns a tuple. This will wreck the ability to use JVP in the compiler right
// now.
// 2) in the SILGen code that reabstracts differentiable functions, assert that
// the return type is never a tuple. This may also stop JVP from working, but
// I'm not 100% sure it will.
// Problem 2: extracting a stored differentiable function pointer has not been
// tested extensively enough. My reproducer touches on a lot of edge cases,
// which may not have been thought of before.
// Solution:
// add a lot of new functionality to Sema that manually detects loading
// differentiable functions, which is different from trivial storing to and from
// a local variable.
// * The solution to problem 2 may stop problem 1 from happening, or create an
// opportunity to solve problem 1 right then and there.
// How is my reproducer different from what can already work in the compiler?
// 1) You must store and load the function pointer from RAM. This is why the
// crash never occurs if `Box` is a struct instead of a class.
// 2) There is inconsistency in the function pointer's type on the two ends of
// loading and storing.
// What other edge cases can I produce? How about unsafe bit casting of a
// differentiable function pointer? Or writing and reading from malloc'd memory?
// Any analogues I can make to standard Swift function pointers, regarding the
// context pointers?
// Two levels of indirection may be what causes this. It doesn't crash when I
// load and store the original function (is this true?), just when I load and
// store the local variable duplicate. Yet it compiles without crashing either
// way.
// No! It doesn't compile! I just massively narrowed down the source of the
// problem! Although fixing this may not fix problem #1. Or, for that matter,
// what causes the compiler crash during C8 of (Float, Float) -> Float.
// This fails to compile:
@differentiable(reverse) func myFunc(_ x: Float) -> Float { x }
class Box<T> {
  var value: T
  init(value: T) {
    self.value = value
  }
}
let box = Box(value: myFunc)
let retrieved = box.value
let pb = pullback(at: 2, of: retrieved)
// While this succeeds at compiling, usually crashing at runtime:
@differentiable(reverse) func myFunc(_ x: Float) -> Float { x }
let item = myFunc as @differentiable(reverse) (Float) -> Float
class Box<T> {
  var value: T
  init(value: T) {
    self.value = value
  }
}
let box = Box(value: item)
let retrieved = box.value
let pb = pullback(at: 2, of: retrieved)
// What else shows that weird behavior?
// This compiles successfully and does not crash at runtime:
@differentiable(reverse) func myFunc(_ x: Float) -> Float { x }
class Box {
  var value: @differentiable(reverse) (Float) -> Float
  init(value: @escaping @differentiable(reverse) (Float) -> Float) {
    self.value = value
  }
}
let box = Box(value: myFunc)
let retrieved = box.value
let pb = pullback(at: 2, of: retrieved)
// This fails to compile:
@differentiable(reverse) func myFunc(_ x: Float) -> Float { x }
struct Box<T> {
  private var pointer: UnsafeMutablePointer<T>
  var value: T { pointer.pointee }
  init(value: T) {
    pointer = .allocate(capacity: 1)
    pointer.pointee = value
  }
}
let box = Box(value: myFunc)
let retrieved = box.value
let pb = pullback(at: 2, of: retrieved)
// This compiles and I have yet to see it crash at runtime:
@differentiable(reverse) func myFunc(_ x: Float) -> Float { x }
let item = myFunc as @differentiable(reverse) (Float) -> Float
struct Box<T> {
  private var pointer: UnsafeMutablePointer<T>
  var value: T { pointer.pointee }
  init(value: T) {
    pointer = .allocate(capacity: 1)
    pointer.pointee = value
  }
}
let box = Box(value: item)
let retrieved = box.value
let pb = pullback(at: 2, of: retrieved)
// In the case that crashes at compile time, the type of Box is:
// Box<(Float) -> Float>
// This is true with either the class or the struct version of `Box`.
// In the case that succeeds at compilation, the type is:
// Box<@differentiable(reverse) (Float) -> Float>
// Since this is known to crash even without extracting the pullback, the
// problem may be:
// The compiler assumes that it is referring to the function's original
// declaration in source code. But now it is not.
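// A sketch for checking the inferred generic argument directly. It assumes
// that `type(of:)` prints differentiable function types faithfully, which I
// have not verified. `plainBox` and `diffBox` are hypothetical names, and the
// `Box` here is a plain struct that stores the function inline, which is the
// variant that has not crashed.
import _Differentiation
@differentiable(reverse) func myFunc(_ x: Float) -> Float { x }
struct Box<T> {
  var value: T
}
let plainBox = Box(value: myFunc)
let diffBox = Box(value: myFunc as @differentiable(reverse) (Float) -> Float)
print(type(of: plainBox)) // T inferred from the bare reference
print(type(of: diffBox)) // T inferred from the coerced reference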
// This fails at runtime because of bit casting between types of different
// sizes:
@differentiable(reverse) func myFunc(_ x: Float) -> Float { x }
struct MStruct {
  var a: Int64
  var b: Int64
  var c: Int64
  var d: Int64
  var e: Int64
  var f: Int64
}
let xF = unsafeBitCast(myFunc, to: MStruct.self)
// But this runs just fine:
@differentiable(reverse) func myFunc(_ x: Float) -> Float { x }
struct MStruct {
  var a: Int64
  var b: Int64
}
let xF = unsafeBitCast(myFunc, to: MStruct.self)
// This fails at runtime because of bit casting between types of different
// sizes:
@differentiable(reverse) func myFunc(_ x: Float) -> Float { x }
struct MStruct {
  var a: Int64
  var b: Int64
}
let xF = unsafeBitCast(myFunc as
  @differentiable(reverse) (Float) -> Float, to: MStruct.self)
// But this runs just fine:
@differentiable(reverse) func myFunc(_ x: Float) -> Float { x }
struct MStruct {
  var a: Int64
  var b: Int64
  var c: Int64
  var d: Int64
  var e: Int64
  var f: Int64
}
let xF = unsafeBitCast(myFunc as
  @differentiable(reverse) (Float) -> Float, to: MStruct.self)
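// The size pattern in these four snippets can be checked before bit casting.
// A sketch, assuming a differentiable function value is stored as a bundle of
// several ordinary function values. That is my reading of the 2-vs-6 `Int64`
// results, not something I confirmed in the runtime. `PlainFn`, `DiffFn`, and
// `TwoWords` are hypothetical names.
import _Differentiation
typealias PlainFn = (Float) -> Float
typealias DiffFn = @differentiable(reverse) (Float) -> Float
struct TwoWords {
  var a: Int64
  var b: Int64
}
print(MemoryLayout<PlainFn>.size) // compare against TwoWords
print(MemoryLayout<DiffFn>.size) // compare against six Int64 fields
print(MemoryLayout<TwoWords>.size)
// `unsafeBitCast` traps unless the source and destination sizes match, so
// comparing these values first predicts which of the casts can run at all.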
// These crash at runtime. Now, can I ensure it's a crash inside `swift_retain`?
// Case 1:
@differentiable(reverse) func myFunc(_ x: Float) -> Float { x }
struct MStruct {
  var a: Int64
  var b: Int64
  var c: Int64
  var d: Int64
  var e: Int64
  var f: Int64
}
print("hello world 1")
let xF = unsafeBitCast(myFunc as
  @differentiable(reverse) (Float) -> Float, to: MStruct.self)
print("hello world 2") // crash at declaration of `xZ`
let xZ = unsafeBitCast(xF, to: (@differentiable(reverse) (Float) -> Float).self)
// Case 2:
func myFunc(_ x: Float) -> Float { x }
struct MStruct {
  var a: Int64
  var b: Int64
}
print("hello world 1")
let xF = unsafeBitCast(myFunc, to: MStruct.self)
print("hello world 2") // crash at declaration of `xZ`
let xZ = unsafeBitCast(xF, to: ((Float) -> Float).self)
// Case 1 crashes during swift::RefCounts<swift::RefCountBitsT<(swift::
// RefCountInlinedness)1> >::incrementSlow, with EXC_BAD_ACCESS code 1.
// Case 2 crashes during tiny_malloc_from_free_list, with EXC_BAD_ACCESS code 2.
// The `swift_retain` crash was similar.
// There are multiple symptoms that point to the same problem:
// 1) Suffering from improper reference counting, basically unsafe bit casting
// 2) Type mismanagement in the compiler
// 3) No standardized or documented way to handle differentiable function
// pointers in memory, at least in the way mentioned in the reproducer
// The problem: there might be no built-in support for reading differentiable
// functions from device RAM.
// This is likely a much bigger problem than I can handle alone. I'm giving up
// for now and hoping I can defer it to people with more experience in the
// future.