@jckarter
Created December 17, 2015 17:26
Swift property behaviors

Property Behaviors

Introduction

There are property implementation patterns that come up repeatedly. Rather than hardcode a fixed set of patterns into the compiler, we should provide a general "property behavior" mechanism to allow these patterns to be defined as libraries.

Motivation

We've tried to accommodate several important property patterns with targeted language support, but this support has been narrow in scope and utility. For instance, Swift 1 and 2 provide lazy properties as a primitive language feature, since lazy initialization is common and is often necessary to avoid having properties be exposed as Optional. Without this language support, it takes a lot of boilerplate to get the same effect:

class Foo {
  // lazy var foo = 1738
  private var _foo: Int?
  var foo: Int {
    get {
      if let value = _foo { return value }
      let initialValue = 1738
      _foo = initialValue
      return initialValue
    }
    set {
      _foo = newValue
    }
  }
}

Building lazy into the language has several disadvantages. It makes the language and compiler more complex and less orthogonal. It's also inflexible; there are many variations on lazy initialization that make sense, but we wouldn't want to hardcode language support for all of them. For instance, some applications may want the lazy initialization to be synchronized, but lazy only provides single-threaded initialization. The standard implementation of lazy is also problematic for value types. A lazy getter must be mutating, which means it can't be accessed from an immutable value. Inline storage is also suboptimal for many memoization tasks, since the cache cannot be reused across copies of the value. A value-oriented memoized property implementation might look very different:

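class MemoizationBox<T> {
  var value: T? = nil
  init() {}
  func getOrEvaluate(fn: () -> T) -> T {
    if let value = value { return value }
    // Perform initialization in a thread-safe way.
    // Implementation of `sync` not shown here.
    return sync {
      // Re-check under the lock: another thread may have initialized the
      // value between the unsynchronized check above and taking the lock.
      if let value = self.value { return value }
      let initialValue = fn()
      self.value = initialValue
      return initialValue
    }
  }
}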

struct Person {
  let firstName: String
  let lastName: String

  let _cachedFullName = MemoizationBox<String>()

  var fullName: String {
    return _cachedFullName.getOrEvaluate { "\(firstName) \(lastName)" }
  }
}

Lazy properties are also unable to surface any additional operations over a regular property. It would be useful to be able to reset a lazy property's storage to be recomputed again, for instance, but this isn't possible with lazy.

There are important property patterns outside of lazy initialization. It often makes sense to have "delayed", once-assignable-then-immutable properties to support multi-phase initialization:

class Foo {
  let immediatelyInitialized = "foo"
  var _initializedLater: String?

  // We want initializedLater to present like a non-optional 'let' to user code;
  // it can only be assigned once, and can't be accessed before being assigned.
  var initializedLater: String {
    get { return _initializedLater! }
    set {
      assert(_initializedLater == nil)
      _initializedLater = newValue
    }
  }
}

Implicitly-unwrapped optionals allow this in a pinch, but compared to a non-optional 'let' they give up both immutability and nil-safety.

We also have other application-specific property features like didSet/willSet and array addressors that add language complexity for limited functionality. Beyond what we've baked into the language already, there's a seemingly endless set of common property behaviors, including resetting, synchronized access, and various kinds of proxying, all begging for language attention to eliminate their boilerplate.

Proposed solution

I suggest we allow for property behaviors to be implemented within the language. A var or let declaration can specify its behavior in parens after the keyword:

var (lazy) foo = 1738

which acts as sugar for something like this:

var `foo.lazy` = lazy(var: Int.self, initializer: { 1738 })
var foo: Int {
  get {
    return `foo.lazy`[varIn: self,
                      initializer: { 1738 }]
  }
  set {
    `foo.lazy`[varIn: self,
               initializer: { 1738 }] = newValue
  }
}

Furthermore, the behavior can provide additional operations, such as clearing a lazy property, which are accessed with property.behavior syntax:

foo.lazy.clear()

(The syntax for declaring and accessing the behavior is up for grabs; I'm offering these only as a starting point.)

Property behaviors obviate the need for special language support for lazy, observers, addressors, and other special-case property behavior, letting us move their functionality into libraries and support new behaviors as well.

Examples

Before describing the detailed design, I'll run through some examples of potential applications for behaviors.

Lazy

The current lazy property feature can be reimplemented as a property behavior:

public struct Lazy<Value> {
  var value: Value?

  public init() {
    value = nil
  }

  public subscript<Container>(varIn _: Container,
                              initializer initial: () -> Value) -> Value {
    mutating get {
      if let existingValue = value {
        return existingValue
      }
      let initialValue = initial()
      value = initialValue
      return initialValue
    }
    set {
      value = newValue
    }
  }
}

public func lazy<Value>(var type: Value.Type, initializer _: () -> Value)
    -> Lazy<Value> {
  return Lazy()
}

As mentioned above, lazy in Swift 2 doesn't provide a way to reset a lazy value to reclaim memory and let it be recomputed later. A behavior can provide additional operations on properties that use the behavior; for instance, to clear a lazy property:

extension Lazy {
  public mutating func clear() {
    value = nil
  }
}

var (lazy) x = somethingThatEatsMemory()
use(x)
x.lazy.clear()
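The expansion can also be written out by hand in current Swift, which makes it easier to experiment with. The following is a minimal sketch under stated assumptions, not the proposal's implementation: the `var (lazy)` syntax does not exist, so the backing property is declared manually, and the names `LazyStorage`, `Resource`, `dataLazy`, and `loadCount` are hypothetical.

```swift
// Hand-written approximation of `var (lazy) data = ...` in current Swift.
// All names here (LazyStorage, Resource, dataLazy, loadCount) are illustrative.
struct LazyStorage<Value> {
    private var value: Value?

    init() {}

    // Same shape as the proposed subscript(varIn:initializer:).
    subscript<Container>(varIn _: Container,
                         initializer initial: () -> Value) -> Value {
        mutating get {
            if let existing = value { return existing }
            let initialValue = initial()
            value = initialValue
            return initialValue
        }
        set { value = newValue }
    }

    // Extra operation the behavior exposes on the backing property.
    mutating func clear() { value = nil }
}

final class Resource {
    private(set) var loadCount = 0

    // Backing property; the proposal would spell this `data.lazy`.
    var dataLazy = LazyStorage<[Int]>()

    var data: [Int] {
        get { return dataLazy[varIn: self, initializer: { self.load() }] }
        set { dataLazy[varIn: self, initializer: { self.load() }] = newValue }
    }

    // Stand-in for an expensive computation.
    private func load() -> [Int] {
        loadCount += 1
        return Array(1...3)
    }
}

let resource = Resource()
print(resource.data)        // first read runs load()
resource.dataLazy.clear()   // drop the cached value
print(resource.data)        // recomputed on the next read
```

Note that the initializer closure is passed at every access rather than stored, which matches the proposal's goal of keeping closures out of the property's inline storage.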

Memoization

Variations of lazy can be implemented that are more appropriate for certain situations. For instance, here's a memoized behavior that stores the cached value indirectly, making it suitable for immutable value types:

public class MemoizationBox<Value> {
  var value: Value? = nil
  init() {}
  func getOrEvaluate(fn: () -> Value) -> Value {
    if let value = value { return value }
    // Perform the initialization in a thread-safe way.
    // Implementation of 'sync' not shown here.
    return sync {
      // Re-check under the lock in case another thread got here first.
      if let value = self.value { return value }
      let initialValue = fn()
      self.value = initialValue
      return initialValue
    }
  }
  func clear() {
    value = nil
  }

  public subscript<Container>(letIn _: Container,
                              initializer initial: () -> Value) -> Value {
    // The box is its own backing storage, so evaluate directly on self.
    return getOrEvaluate(initial)
  }
}

public func memoized<Value>(let type: Value.Type, initializer: () -> Value)
    -> MemoizationBox<Value> {
  return MemoizationBox()
}

Which can then be used like this:

struct Location {
  let street, city, postalCode: String

  let (memoized) address = "\(street)\n\(city) \(postalCode)"
}
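A runnable approximation of the memoized pattern in current Swift might look like the sketch below. It assumes Foundation's `NSLock` in place of the unspecified `sync` helper, and the names `MemoBox`, `addressMemo`, and `evaluations` are hypothetical; the backing property is written by hand because the `let (memoized)` syntax is only proposed.

```swift
import Foundation

// Illustrative reference-typed cache; `MemoBox` and `addressMemo` are
// hypothetical names standing in for the proposal's backing property.
final class MemoBox<Value> {
    private var value: Value?
    private let lock = NSLock()
    private(set) var evaluations = 0

    // Returns the cached value, running `compute` only if nothing is cached.
    func getOrEvaluate(_ compute: () -> Value) -> Value {
        lock.lock()
        defer { lock.unlock() }
        if let cached = value { return cached }
        let computed = compute()
        value = computed
        evaluations += 1
        return computed
    }
}

struct Location {
    let street, city, postalCode: String

    // A `let` holding a class reference: copies of the struct share one cache.
    let addressMemo = MemoBox<String>()

    var address: String {
        return addressMemo.getOrEvaluate { "\(street)\n\(city) \(postalCode)" }
    }
}

let office = Location(street: "1 Main St", city: "Springfield", postalCode: "12345")
let sameOffice = office
print(office.address)
print(sameOffice.address)   // served from the cache shared with `office`
```

Sharing the cache between copies is only sound here because every input to the computation is an immutable `let`; a struct with mutable inputs would need to invalidate or replace the box on mutation.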

Delayed Initialization

A property behavior can model "delayed" initialization behavior, where the definite initialization (DI) rules for var and let properties are enforced dynamically rather than at compile time:

public func delayed<Value>(let type: Value.Type) -> Delayed<Value> {
  return Delayed()
}
public func delayed<Value>(var type: Value.Type) -> Delayed<Value> {
  return Delayed()
}

public struct Delayed<Value> {
  var value: Value? = nil

  /// DI rules for vars:
  /// - Must be assigned before being read
  public subscript<Container>(varIn container: Container) -> Value {
    get {
      if let value = value {
        return value
      }
      fatalError("delayed var used before being initialized")
    }
    set {
      value = newValue
    }
  }

  /// DI rules for lets:
  /// - Must be initialized once before being read
  /// - Cannot be reassigned
  public subscript<Container>(letIn container: Container) -> Value {
    get {
      if let value = value {
        return value
      }
      fatalError("delayed let used before being initialized")
    }
  }

  /// Behavior operation to initialize a delayed variable
  /// or constant.
  public mutating func initialize(value: Value) {
    // Check the stored property explicitly; the parameter shadows it.
    if self.value != nil {
      fatalError("delayed property already initialized")
    }
    self.value = value
  }
}

which can be used like this:

class Foo {
  let (delayed) x: Int

  init() {
    // We don't know "x" yet, and we don't have to set it
  }

  func initializeX(x: Int) {
    self.x.delayed.initialize(x) // Will crash if 'self.x' is already initialized
  }

  func getX() -> Int {
    return x // Will crash if 'self.x' wasn't initialized
  }
}
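The same dynamic enforcement can be approximated in current Swift by declaring the backing storage by hand. This is a sketch only: `DelayedStorage`, `isInitialized`, `current`, and `xDelayed` are hypothetical names, and a plain computed property replaces the proposed subscript(letIn:) forwarding.

```swift
// Hand-written stand-in for `let (delayed) x: Int` in current Swift.
// `DelayedStorage`, `isInitialized`, and `xDelayed` are illustrative names.
struct DelayedStorage<Value> {
    private var value: Value?

    init() {}

    var isInitialized: Bool { return value != nil }

    // Read access: traps if the value has not been provided yet.
    var current: Value {
        guard let value = value else {
            fatalError("delayed property used before being initialized")
        }
        return value
    }

    // One-time initialization: traps on a second attempt.
    mutating func initialize(_ newValue: Value) {
        precondition(value == nil, "delayed property already initialized")
        value = newValue
    }
}

final class Foo {
    var xDelayed = DelayedStorage<Int>()

    // Presents as a read-only, non-optional property to user code.
    var x: Int { return xDelayed.current }

    func initializeX(_ x: Int) {
        xDelayed.initialize(x)
    }
}

let foo = Foo()
print(foo.xDelayed.isInitialized)
foo.initializeX(42)
print(foo.x)
```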

Resettable properties

There's a common pattern in Cocoa where properties are used as optional customization points, but can be reset to nil to fall back to a non-public default value. In Swift, properties that follow this pattern currently must be imported as ImplicitlyUnwrappedOptional, even though nil can only be assigned to the property and is never returned when reading it. If expressed as a behavior, the reset operation can be decoupled from the type, allowing the property to be exported as non-optional:

public func resettable<Value>(var type: Value.Type,
                      initializer fallback: () -> Value) -> Resettable<Value> {
  return Resettable(value: fallback())
}
public struct Resettable<Value> {
  var value: Value?

  public subscript<Container>(varIn container: Container,
                              initializer fallback: () -> Value) -> Value {
    get {
      if let value = value { return value }
      return fallback()
    }
    set {
      value = newValue
    }
  }

  public mutating func reset() {
    value = nil
  }
}

var (resettable) foo: Int = 22
print(foo) // => 22
foo = 44
print(foo) // => 44
foo.resettable.reset()
print(foo) // => 22
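A hand-expanded version that compiles in current Swift could look like the sketch below. The names `ResettableStorage`, `Theme`, and `fooResettable` are hypothetical, and unlike the behavior function above, this variant starts with no stored value and relies on the fallback closure from the first read.

```swift
// Hand-written stand-in for `var (resettable) foo: Int = 22`.
// `ResettableStorage` and `fooResettable` are illustrative names.
struct ResettableStorage<Value> {
    private var value: Value?

    init() {}

    // The fallback is passed at each access instead of being stored.
    subscript(initializer fallback: () -> Value) -> Value {
        get {
            if let value = value { return value }
            return fallback()
        }
        set { value = newValue }
    }

    // Forget any explicit value so reads use the fallback again.
    mutating func reset() { value = nil }
}

struct Theme {
    var fooResettable = ResettableStorage<Int>()

    var foo: Int {
        get { return fooResettable[initializer: { 22 }] }
        set { fooResettable[initializer: { 22 }] = newValue }
    }
}

var theme = Theme()
print(theme.foo)
theme.foo = 44
print(theme.foo)
theme.fooResettable.reset()
print(theme.foo)
```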

Synchronized Property Access

Objective-C supports atomic properties, which take a lock on get and set to synchronize accesses to a property. This is occasionally useful, and it can be brought to Swift as a behavior:

// A class that owns a mutex that can be used to synchronize access to its
// properties.
//
// `NSObject` could theoretically be extended to implement this using the
// object's `@synchronized` lock.
public protocol Synchronizable: class {
  func withLock<R>(@noescape body: () -> R) -> R
}

public func synchronized<Value>(var _: Value.Type,
                                initializer initial: () -> Value)
    -> Synchronized<Value> {
  return Synchronized(value: initial())
}

public struct Synchronized<Value> {
  var value: Value

  public subscript<Container: Synchronizable>(varIn container: Container,
                                              initializer _: () -> Value)
      -> Value {
    get {
      return container.withLock {
        return value
      }
    }
    set {
      container.withLock {
        value = newValue
      }
    }
  }
}
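To see how a container would adopt this, here is a minimal sketch in current Swift. It assumes Foundation's `NSLock` as the mutex, and `SynchronizedStorage`, `Account`, `balanceSynchronized`, and `lockCount` are hypothetical names; the backing property and forwarding accessors are written by hand in place of the proposed syntax.

```swift
import Foundation

// Illustrative names throughout: `SynchronizedStorage`, `Account`,
// `balanceSynchronized`, `lockCount`.
protocol Synchronizable: AnyObject {
    func withLock<R>(_ body: () -> R) -> R
}

struct SynchronizedStorage<Value> {
    private var value: Value

    init(value: Value) { self.value = value }

    // Only containers that can provide a lock may host this storage.
    subscript<Container: Synchronizable>(varIn container: Container) -> Value {
        get { return container.withLock { value } }
        set { container.withLock { value = newValue } }
    }
}

final class Account: Synchronizable {
    private let lock = NSLock()
    private(set) var lockCount = 0   // counts lock acquisitions, for illustration

    var balanceSynchronized = SynchronizedStorage(value: 0)

    var balance: Int {
        get { return balanceSynchronized[varIn: self] }
        set { balanceSynchronized[varIn: self] = newValue }
    }

    func withLock<R>(_ body: () -> R) -> R {
        lock.lock()
        defer { lock.unlock() }
        lockCount += 1
        return body()
    }
}

let account = Account()
account.balance = 10
print(account.balance)
```

As with Objective-C atomic properties, this only synchronizes each individual get and set. A read-modify-write such as `account.balance += 1` takes the lock twice and is not atomic as a whole.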

NSCopying

Many Cocoa classes implement value-like objects that require explicit copying. Swift currently provides an @NSCopying attribute for properties to give them behavior like Objective-C's @property(copy), invoking the copy method on new objects when the property is set. We can turn this into a behavior:

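public func copying<Value: NSCopying>(var _: Value.Type,
                                      initializer initial: () -> Value)
    -> Copying<Value> {
  // copy() returns an untyped object, so cast back to the property type.
  return Copying(value: initial().copy() as! Value)
}

// The storage type needs the same NSCopying constraint as the behavior
// function, since its setter also copies.
public struct Copying<Value: NSCopying> {
  var value: Value

  public subscript<Container>(varIn container: Container,
                              initializer _: () -> Value)
      -> Value {
    get {
      return value
    }
    set {
      value = newValue.copy() as! Value
    }
  }
}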

Referencing Properties with Pointers

We provide some affordances for interfacing properties with pointers for C interop and performance reasons, such as withUnsafePointer and implicit argument conversions. These affordances come with a lot of caveats and limitations. A property behavior can be defined that implements properties with manually-allocated memory, guaranteeing that pointers to the property can be freely taken and used:

public func pointable<Value>(var _: Value.Type,
                             initializer initial: () -> Value)
    -> Pointable<Value> {
  return Pointable(value: initial())
}

public class Pointable<Value> {
  public let pointer: UnsafeMutablePointer<Value>

  init(value: Value) {
    pointer = .alloc(1)
    pointer.initialize(value)
  }

  deinit {
    pointer.destroy()
    pointer.dealloc(1)
  }

  public subscript<Container>(varIn _: Container,
                              initializer _: () -> Value)
      -> Value {
    get {
      return pointer.memory
    }
    set {
      pointer.memory = newValue
    }
  }
}

var (pointable) x = 22
var (pointable) y = 44

memcpy(x.pointable.pointer, y.pointable.pointer, sizeof(Int.self))
print(x) // => 44

(Manually allocating and deallocating a pointer in a class is obviously not ideal, but is shown as an example. A production-quality stdlib implementation could use compiler magic to ensure the property is stored in-line in an addressable way.)
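The pointer API has since been renamed, so a version of the same idea against the current `UnsafeMutablePointer` interface may be useful for experimentation. This is a sketch: `PointableStorage` is a hypothetical name, the storage object is used directly rather than through the proposed `x.pointable` syntax, and a plain `pointee` assignment stands in for `memcpy`.

```swift
// Translation of the idea to the current UnsafeMutablePointer API.
// `PointableStorage` is an illustrative name.
final class PointableStorage<Value> {
    let pointer: UnsafeMutablePointer<Value>

    init(value: Value) {
        pointer = UnsafeMutablePointer<Value>.allocate(capacity: 1)
        pointer.initialize(to: value)
    }

    deinit {
        // Tear down the stored value before releasing the memory.
        pointer.deinitialize(count: 1)
        pointer.deallocate()
    }

    var value: Value {
        get { return pointer.pointee }
        set { pointer.pointee = newValue }
    }
}

let xStorage = PointableStorage(value: 22)
let yStorage = PointableStorage(value: 44)

// Write through x's pointer using the value read through y's pointer.
xStorage.pointer.pointee = yStorage.pointer.pointee
print(xStorage.value)
```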

Property Observers

A property behavior can also replicate the built-in behavior of didSet/willSet observers:

typealias ObservingAccessor = (oldValue: Value, newValue: Value) -> ()

public func observed<Value>(var _: Value.Type,
                            initializer initial: () -> Value,
                            didSet _: ObservingAccessor = {},
                            willSet _: ObservingAccessor = {})
    -> Observed<Value> {
  return Observed(value: initial())
}

public struct Observed<Value> {
  var value: Value

  public subscript<Container>(varIn _: Container,
                              initializer _: () -> Value,
                              didSet didSet: ObservingAccessor = {},
                              willSet willSet: ObservingAccessor = {})
      -> Value {
    get { return value }
    set {
      let oldValue = value
      willSet(oldValue, newValue)
      value = newValue
      didSet(oldValue, newValue)
    }
  }
}

A common complaint with didSet/willSet is that the observers fire on every write, not only on writes that actually change the value. A new behavior can support a didChange accessor that is invoked only when the property is set to a value not equal to the old one:

public func changeObserved<Value: Equatable>(var _: Value.Type,
                                             initializer initial: () -> Value,
                                             didChange _: ObservingAccessor = {})
    -> ChangeObserved<Value> {
  return ChangeObserved(value: initial())
}

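public struct ChangeObserved<Value: Equatable> {
  var value: Value

  public subscript<Container>(varIn _: Container,
                              initializer _: () -> Value,
                              didChange didChange: ObservingAccessor = {})
      -> Value {
    get { return value }
    set {
      // Capture the old value before overwriting it, and skip the
      // observer entirely when the write doesn't change anything.
      let oldValue = value
      if oldValue == newValue { return }
      value = newValue
      didChange(oldValue, newValue)
    }
  }
}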
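The change-only observer can be tried out in current Swift with a hand-written backing property. The sketch below is an approximation under stated assumptions: `ChangeObservedStorage`, `Model`, `nameObserved`, and `log` are hypothetical names, and the observer closure is passed explicitly at each access because accessor clauses like `didChange { ... }` are only proposed.

```swift
// Hand-written stand-in for a `var (changeObserved) name` property.
// `ChangeObservedStorage`, `Model`, `nameObserved`, and `log` are illustrative.
struct ChangeObservedStorage<Value: Equatable> {
    private var value: Value

    init(value: Value) { self.value = value }

    subscript(didChange didChange: (_ oldValue: Value, _ newValue: Value) -> Void)
        -> Value {
        get { return value }
        set {
            let oldValue = value
            // Writes that leave the value unchanged do not notify.
            guard oldValue != newValue else { return }
            value = newValue
            didChange(oldValue, newValue)
        }
    }
}

final class Model {
    private(set) var log: [String] = []

    var nameObserved = ChangeObservedStorage(value: "a")

    var name: String {
        get { return nameObserved[didChange: { _, _ in }] }
        set {
            nameObserved[didChange: { old, new in
                self.log.append("\(old) -> \(new)")
            }] = newValue
        }
    }
}

let model = Model()
model.name = "a"   // same value: observer is skipped
model.name = "b"   // real change: observer runs
print(model.log)
```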

This is a small sampling of the possibilities of behaviors. Let's look at how they can be implemented:

Detailed design

A property declaration can declare a behavior after the var or let keyword in parens:

var (runcible) foo: Int

(Possible alternatives to var (behavior) are discussed later.) Inside the parens is a dotted declaration reference that must refer to a behavior function that accepts the property attributes (such as its name, type, initial value (if any), and accessor methods) as parameters. How attributes map to parameters is discussed below.

When a property declares a behavior, the compiler expands this into a backing property, which is initialized by invoking the behavior function with the property's attributes as arguments. The backing property takes on whatever type is returned by the behavior function. The declared property forwards to the accessors of the backing property's subscript(varIn:...) (or subscript(letIn:...)) member, with self as the first argument (or () for a free variable declaration). The subscript may also accept any or all of the property's attributes as arguments. Approximately, the expansion looks like this:

var `foo.runcible` = runcible(var: Int.self)
var foo: Int {
  return `foo.runcible`[varIn: self]
}

with the fine print that the property directly receives the get, set, materializeForSet, etc. accessors from the behavior's subscript declaration. By forwarding to a subscript instead of separate get and set methods, property behaviors preserve all of the mutable property optimizations we support now and in the future for free. The subscript also determines the mutability of the declared property.

The behavior function is resolved by building a call with the following keyword arguments, based on the property declaration:

  • The metatype of the declared property's type is passed as an argument labeled var for a var, or labeled let for a let.
  • If the declared property provides an initial value, the initial value expression is passed as a () -> T closure to an argument labeled initializer.
  • If the property is declared with accessors, their bodies are passed by named parameters corresponding to their names. Accessor names can be arbitrary identifiers.

For example, a property with a behavior and initial value:

var (runcible) foo = 1738

gets its backing property initialized as follows:

var `foo.runcible` = runcible(var: Int.self, initializer: { 1738 })

A property that declares accessor methods:

var (runcible) foo: Int {
  bar { print("bar") }
  bas(x) { print("bas \(x)") }
}

passes those accessors on to its behavior function:

private func `foo.bar`() { print("bar") }
private func `foo.bas`(x: T) { print("bas \(x)") }

var `foo.runcible` = runcible(var: Int.self,
                              bar: self.`foo.bar`,
                              bas: self.`foo.bas`)

Contextual types from the selected behavior function can be used to infer types for the accessors' parameters as well as their default names. For example, if the behavior function is declared as:

func runcible<T>(var type: T.Type, bar: (newValue: T) -> ())
  -> RuncibleProperty<T>

then a bar accessor using this behavior can implicitly receive newValue as a parameter:

var (runcible) x: Int {
  bar { print("\(newValue.dynamicType)") } // prints Int
}

Once the behavior function has been resolved, its return type is searched for a matching subscript member with labeled index arguments:

  • The self value that contains the property is passed to a labeled varIn argument for a var, or a letIn argument for a let. This may be the metatype for a static property, or () for a global or local property.
  • After these arguments, the subscript must take the same labeled initializer and/or accessor closure arguments as the behavior function.

It is an error if a matching subscript can't be found on the type. By constraining what types are allowed to be passed to the varIn or letIn parameter of the subscript, a behavior can constrain what kinds of container it is allowed to appear in.

By passing the initializer and accessor bodies to both the behavior function and subscript, the backing property can avoid requiring storage for closures it doesn't need immediately at initialization time. It would be unacceptable if every lazy property needed to store its initialization closure in-line, for instance. The tradeoff is that there is potentially redundant work done forming these closures at both initialization and access time, and many of the arguments are not needed by both. However, if the behavior function and subscript are both inlineable, the optimizer ought to be able to eliminate dead arguments and simplify closures. For most applications, the attribute closures ought to be able to be @noescape as well.

Some behaviors may have special operations associated with them; for instance, a lazy property may provide a way to clear itself to reclaim memory and allow the value to be recomputed later when needed. The underlying backing property may be accessed by referencing it as property.behavior.

var (lazy) x = somethingThatEatsMemory()

use(x)
x.lazy.clear() // free the memory

The backing property has internal visibility by default (or private if the declared property is private). If the backing property should have higher visibility, the visibility can be declared next to the behavior:

public var (public lazy) x = somethingThatEatsMemory()

However, the backing property cannot have higher visibility than the declared property.

The backing property is always a stored var property. It is the responsibility of a let property behavior's implementation to provide the expected behavior of an immutable property over it. A well-behaved let should produce an identical value every time it is loaded, or die trying, as in the case of an uninitialized delayed let. A let should be safe to read concurrently from multiple threads. (In the fullness of time, an effects system might be able to enforce this, with escape hatches for internally-impure things like memoization of course.)

Impact on existing code

By itself, this is an additive feature that doesn't impact existing code. However, it potentially obsoletes lazy, willSet/didSet, and @NSCopying as hardcoded language features. We could grandfather these in, but my preference would be to phase them out by migrating them to library-based property behavior implementations. (Removing them should be its own separate proposal, though.)

It's also worth exploring whether property behaviors could replace the "addressor" mechanism used by the standard library to implement Array efficiently. It'd be great if the language only needed to expose the core conservative access pattern (get/set/materializeForSet) and let all variations be implemented as library features. Note that superseding didSet/willSet and addressors completely would require being able to apply behaviors to subscripts in addition to properties, which seems like a reasonable generalization.

Alternatives considered/to consider

Declaration syntax

Alternatives to the proposed var (behavior) propertyName syntax include:

  • An attribute, such as @behavior(lazy) or behavior(lazy) var. This is the most conservative answer, but is clunky.
  • Use the behavior function name directly as an attribute, so that e.g. @lazy works. This injects functions into the attribute namespace, which is problematic (but maybe not as much if the function itself also has to be marked with a @behavior_function attribute too).
  • Use a new keyword, as in var x: T by behavior.
  • Something on the right side of the colon, such as var x: lazy(T). To me this reads like lazy(T) is a type of some kind, which it really isn't.
  • Something following the property name, such as var x«lazy»: T or var x¶lazy: T (picking your favorite ASCII characters to replace «»¶). One nice thing about this approach is that it suggests self.x«lazy» as a declaration-follows-use way of accessing the backing property.

Syntax for accessing the backing property

The proposal suggests x.behaviorName for accessing the underlying backing property of var (behaviorName) x. The main disadvantage of this is that it complicates name lookup, which must be aware of the behavior in order to resolve the name, and is potentially ambiguous, since the behavior name could of course also be the name of a member of the property's type. Some alternatives to consider:

  • Reserving a keyword and syntactic form to refer to the backing property, such as foo.x.behavior or foo.behavior(x). The problems with this are that reserving a keyword is undesirable, and that behavior is a vague term that requires more context for a reader to understand what's going on. If we support multiple behaviors on a property, it also doesn't provide a mechanism to distinguish between behaviors.
  • Something following the property name, such as foo.x«lazy» or foo.x¶lazy (choosing your favorite ASCII substitution for «»¶, again), to match the similar proposed declaration syntax above.
  • "Overloading" the property name to refer to both the declared property and its backing property, and doing member lookup in both (favoring the declared property when there are conflicts). If foo.x is known to be lazy, it's attractive for foo.x.clear() to Just Work without annotation. This has the usual ambiguity problems of overloading, of course; if the behavior's members are shadowed by the fronting type, something inconvenient like (foo.x as Lazy).clear() would be necessary to disambiguate.

Defining behavior requirements using a protocol

It's reasonable to ask why the behavior interface proposed here is ad-hoc rather than modeled as a formal protocol. It's my feeling that a protocol would be too constraining:

  • Different behaviors need the flexibility to require different sets of property attributes. Some kinds of property support initializers; some kinds of property have special accessors; some kinds of property support many different configurations. Allowing overloading (and adding new functionality via extensions and overloading) is important for expressivity.
  • Different behaviors place different constraints on what containers are allowed to contain properties using the behavior, meaning that subscript needs the freedom to impose different generic constraints on its varIn/letIn parameter for different behaviors.

It's true that there are type system features we could theoretically add to support these features in a protocol, but increasing the complexity of the type system has its own tradeoffs. I think it's unlikely that behaviors would be useful in generics either.

A behavior declaration

Instead of relying entirely on an informal protocol, we could add a new declaration to the language to declare a behavior, something like this:

behavior lazy<T> {
  func lazy(...) -> Lazy { ... }
  struct Lazy { var value: T; ... }
}

Doing this has some potential advantages:

  • It provides clear namespacing for things that are intended to be behaviors.
  • If the functions and types that implement the behavior can be nested under the behavior declaration somehow, then they don't need to pollute the global function/type namespace.
  • The behavior declaration can explicitly provide metadata about the behavior, such as what container and value types it supports and what kinds of accessors properties can provide to it, all of which are discovered by overload resolution in this proposal. It'd also be a natural place to put extensions such as how a behavior behaves with overriding, what behaviors it can or can't compose with, etc.

Naming convention for behaviors

This proposal doesn't discuss the naming convention that behaviors should follow. Should they be random adjectives like lazy? Should we try to follow an -ing or -able suffix convention? Does it matter, if behaviors have their own syntax namespace?

TODO

When do properties with behaviors get included in the memberwise initializer of structs or classes, if ever? Can properties with behaviors be initialized from init rather than with inline initializers?

Can behaviors be composed, e.g. (lazy, observed), or (lazy, atomic)? How? Composition necessarily has to have an ordering, and some orderings will be wrong; e.g. one of (lazy, atomic) or (atomic, lazy) will be broken.

To be able to fully supplant didSet/willSet (and addressors), we'd need to be able to give behaviors to subscripts as well. The special override behavior of didSet/willSet in subclasses needs to be accounted for as well.

It's worth considering what the "primitive" interface for properties is; after all, theoretically even computed properties could be considered a behavior if you unstack enough turtles. One key thing to support that I don't think our current special-case accessors handle is conditional physical access. For instance, a behavior might want to pass through to its physical property, unless some form of transactionality is enabled. As a strawman, if there were an inout accessor, which received the continuation of the property access as an (inout T) -> Void parameter, that might be expressed like this:

var _x = 0
var x: Int {
  inout(continuation) {
    // If we're not logging, short-circuit to a physical access of `_x`.
    if !logging {
      continuation(&_x)
      return
    }
    // Otherwise, save the oldValue and log before and after.
    // Read the physical `_x` here; reading `x` would re-enter this accessor.
    let oldValue = _x
    var newValue = _x
    continuation(&newValue)
    print("--- changing _x from \(oldValue) to \(newValue)")
    _x = newValue
    print("--- changed! _x from \(oldValue) to \(newValue)")
  }
}

An implementation of inout as proposed like this could be unrolled into a materializeForSet implementation using a SIL state machine transform, similar to what one would do to implement yield or await, which would check that continuation always gets called exactly once on all paths and capture the control flow after the continuation call in the materializeForSet continuation.

dhoepfl commented Dec 18, 2015

I see huge potential in this proposal!

Some random thoughts:

  • Goodbye will/didSet, welcome ObservableBehavior.
  • Composability is desirable. Why not have observable variables to be atomic and/or initialized lazily?
  • The attribute-like notations remind me of Java’s annotations. Annotations being more flexible regarding where to use them but there is no need to have this in the first step. I think it would be good to have a syntax that has the potential to be expanded for function behaviors, class behaviors, etc later on.
  • Talking about function behaviors: Does this remind anyone of aspect oriented programming?
  • Is it possible to add (remove?!?) behaviors in subclasses/extensions? (So your property is not observable? Let me change that.)
  • Is it behavior or behaviour?

muescha commented Jan 18, 2017

it is on Swift Evolution as Proposal: SE-0030
