Skip to content

Instantly share code, notes, and snippets.

@atrick
Last active May 10, 2024 22:36
Show Gist options
  • Save atrick/6b69e03f3b39a3a3680749314328c341 to your computer and use it in GitHub Desktop.
Save atrick/6b69e03f3b39a3a3680749314328c341 to your computer and use it in GitHub Desktop.
OwnedSpan

UniqueSpan

This pitch is an active brainstorm. It may eventually be converted into an SE proposal.

Introduction

This proposal introduces a nonescapable, noncopyable UniqueSpan<T> type. Like Span<T>, which is also nonescapable, it does not manage the underlying storage. But unlike Span<T>, it is noncopyable and takes ownership of all the elements within the span, allowing them to be moved, consumed, or replaced.

Motivation

Safe Access to Contiguous Storage introduces the Span<T> type: a read-only view of always-initialized contiguous storage. [Roadmap] Language support for BufferView referred the same concept as BufferView<T>. Compile-time Lifetime Dependency Annotations introduces language support for the type.

A "span" does not own its elements. This allows multiple spans to view the same storage. A span can also be sliced, mutated, or reassigned without affecting its elements. A given container may have a span property that yields a Span<Element> over a "read access" on the container, and the resulting span can be sliced, copied, and reassigned as follows:

var span = container.span
let prefix = span[span.startIndex..<index]
span = span[index..<span.endIndex]
// 'span' and 'prefix' are valid over the read access on 'container'.

Span values can even be embedded in other structs or enums as long as they never outlive the initial read access:

struct SpanPair {
  let span1: Span
  let span2: Span
}

extension Span {
  func splitAsPair(index: Span.Index) -> SpanPair {
    SpanPair(span1: self[startIndex..<index], span2: self[index..<endIndex])
  }
}

let pair = container.span.splitAsPair(index: index)
// 'pair' is valid over the read access on 'container'.

Span's flexibility is possible because it is a view of immutable storage. But we also need a common data type for mutating and consuming elements of contiguous storage independent of the container. This is where UniqueSpan<T> comes in. A "unique span" gains the capability to mutate and consume elements by giving up the flexibility of being able to make copies of the span and giving up the ability to modify the extent of the span without consuming its elements.

Design

A unique span requires exclusive ownership of the stored elements. This means a unique span is itself noncopyable.

In a mutating context (as an inout parameter) OutputSpan<T> provides a "mutable" span in which any element may be replaced. Imagine that a given container has an elements property that yields an UniqueSpan<Element> over a "write access" on the container:

container.elements[index] = value

A routine can now modify the elements without knowing the container type:

func update<Element>(elements: inout UniqueSpan<Element>, at index: UniqueSpan<Element>.Index, value: Element) {
  elements[index] = value
}

update(elements: &container.elements, at: index, value: value)

In an owning context, UniqueSpan provides an "input" span in which all elements must be consumed:

let ownedSpan = container.consumeElements()
// 'container' is now empty, but its storage is locked in place.
_ = ownedSpan
// After the last use of 'ownedSpan', it destroys its elements.
// `container` now unlocks and potentially frees its storage.

This allows moving elements across containers:

container2.elements[container2.startIndex..<container1.count] = container1.consumeElements()

In a borrowing context, UniqueSpan provides an "borrowed" span in which the source of the span shares ownership of the elements:

func scan<Element>(elements: borrowing UniqueSpan<Element>) {
  elements[index] = value
}

scan(elements: container.elements)

Future directions shows how borrowing and inout (or mutating) bindings improve the usability of noncopyable properties, like container.elements, by allowing local variables to refer to the property in conjunction with _read/_modify accessor coroutines.

Reassignment

A var declaration combines the owned and mutable contexts described above. As with any value type, variable reassignment destroys the previous span and takes ownership of the new value:

var ownedSpan = container1.consumeElements()
// 'container1' is now empty, but its storage is locked in place.
ownedSpan = container2.consumeElements()
// `container1`s elements are now destroyed.
// `container1` unlocks and potentially frees its storage.
_ = ownedSpan
// `container2`s elements are now destroyed.
// `container2` unlocks and potentially frees its storage.

In general, inout variables can also be reassigned. But that almost never makes sense for a unique span. The new value of the mutable span would refer to storage from a different container. Fortunately, the compiler diagnostics for nonescaping normally prevent it from happening:

func reasign<Element>(elements: inout UniqueSpan<Element>, newElements: UniqueSpan<Element>) {
  elements = newElements // ERROR: lifetime-dependent variable 'newElements' escapes its scope
}

reassign(elements: &destContainer.elements, newElements: sourceContainer.elements)

A programmer could still intentionally force a reassignment that reuses storage by requiring a lifetime dependence from the destination container to the source:

func reasign<Element>(elements: dependsOn(newElements) inout UniqueSpan<Element>, newElements: UniqueSpan<Element>) {
  elements = newElements // ERROR: lifetime-dependent variable 'newElements' escapes its scope
}

var destContainer = sourceContainer.consumeSomeElements()
// destContainer now depends on sourceContainer allowing it to reuse storage:
reassign(elements: &destContainer.elements, newElements: sourceContainer.elements)

It's up to the container type to decide whether it's valid to refer to another container's storage. Typically, this would be prevented, and checked in the range subscript's _modify accessor as shown in "Detailed design".

Paritioning

A subset of elements can only be moved out of a unique span by partitioning the span and destroying all unused elements. For example:

extension UniqueSpan where Element: ~Copyable & ~Escapable {
  // Consumes 'self'.
  public consuming func split(at index: Index) -> (Self, Self) {
    return (Self(baseAddress: baseAddress, count: index),
            Self(baseAddress: baseAddress + offset(of: count), count: endIndex - index))
  }

  // Destroys the suffix.
  public func prefix(_ maxLength: Int) -> Self {
    precondition(maxLength >= 0, "Can't have a prefix of negative length.")
    let idx = maxLength < count ? maxLength : count
    return split(at: idx).0
  }
}

Borrowing slices

A read-only span can always be created by "slicing" an owned-span. This creates immutable access scope for the duration of the slice and does not destroy any elements:

var ownedSpan = container.elements
let prefix = ownedSpan.slice(ownedSpan.startIndex..<index)
// 'elements' are immutable up the the last use of 'prefix'.
scan(prefix)

Detailed Design

For reference, this is a minimal straw-man implementation of UniqueSpan<T>:

public struct UniqueSpan<Element: ~Copyable>: ~Copyable, ~Escapable {
  // Using an integer index for demonstration purposes only.
  public typealias Index = Int

  let baseAddress: UnsafeMutableRawPointer
  public let count: Int

  @_unsafeNonescapableResult
  public init(baseAddress: UnsafeMutableRawPointer, count: Int) {
    self.baseAddress = baseAddress
    self.count = count
  }

  deinit {
    // TODO: if !_isPOD(Element.self) {
      let p = baseAddress.assumingMemoryBound(to: Element.self)
      p.deinitialize(count: count)
    // }
  }

  consuming func _discard() {
    discard self
  }
}

extension UniqueSpan where Element: ~Copyable {
  public var startIndex: Index { 0 }

  public var endIndex: Index { count }

  public var isEmpty: Bool { count == 0 }

  public func distance(from start: Index, to end: Index) -> Int {
    start.distance(to: end)
  }

  func _pointer(to position: Index) -> UnsafeMutableRawPointer {
    _checkBounds(position)
    return baseAddress + (position * MemoryLayout<Element>.stride)
  }

  func _checkBounds(_ position: Index) {
    precondition(
      distance(from: startIndex, to: position) >= 0 &&
      distance(from: position, to: endIndex) > 0,
      "Index out of bounds"
    )
  }
}

// Support for accessing single elements.
//
// For brevity, we don't consider unaligned access.
extension UniqueSpan where Element: ~Copyable {
  public subscript(position: Index) -> Element {
    _read {
      _checkBounds(position)
      yield self[unchecked: position]
    }
    _modify {
      _checkBounds(position)
      yield &self[unchecked: position]
    }
  }  
  public subscript(unchecked position: Index) -> Element {
    _read {
      let rp = _pointer(to: position)._rawValue
      let binding = Builtin.bindMemory(rp, 1._builtinWordValue, Element.self)
      defer {
        Builtin.rebindMemory(rp, binding)
      }
      yield UnsafeMutablePointer(rp).pointee
    }
    _modify {
      let rp = _pointer(to: position)._rawValue
      let binding = Builtin.bindMemory(rp, 1._builtinWordValue, Element.self)
      defer {
        Builtin.rebindMemory(rp, binding)
      }
      yield &(UnsafeMutablePointer(rp).pointee)
    }
  }
}

// Support for borrowing and mutating slices.
extension UniqueSpan where Element: ~Copyable {
  func _checkBounds(_ bounds: Range<Index>) {
    precondition(
      distance(from: startIndex, to: bounds.lowerBound) >= 0 &&
      distance(from: bounds.lowerBound, to: bounds.upperBound) >= 0 &&
      distance(from: bounds.upperBound, to: endIndex) >= 0,
      "Range of indices out of bounds"
    )
  }

  public subscript(bounds: Range<Index>) -> Self {
    _read {
      _checkBounds(bounds)
      yield self[unchecked: bounds]
    }
    _modify {
      _checkBounds(bounds)
      yield &self[unchecked: bounds]
    }
  }

  public subscript(unchecked bounds: Range<Index>) -> Self {
    _read {
      // Temporarily share element ownership. Multiple owned spans may think they own their elements as long as they
      // are all borrowed.
      let subspan = UniqueSpan<Element>(baseAddress: _pointer(to:bounds.lowerBound), count: bounds.count)
      yield subspan
      // Give 'self' back exclusive ownership its elements.
      subspan._discard()
    }
    _modify {
      let rangeAddress = _pointer(to:bounds.lowerBound)
      var subspan = UniqueSpan<Element>(baseAddress: rangeAddress, count: bounds.count)
      yield &subspan

      if subspan.baseAddress != rangeAddress {
        // Move the elements of 'newValue' into 'self'.
        precondition(bounds.count == subspan.count, "Range replacement must have the same count.")
        let count = bounds.count
        let dest = UnsafeMutableRawPointer(mutating: _pointer(to: bounds.lowerBound))
        let source = UnsafeMutableRawPointer(subspan.baseAddress)
        dest.withMemoryRebound(to: Element.self, capacity: count) {
          source.withMemoryRebound(to: Element.self, capacity: count) {
            [dest = $0 ] source in
            dest.moveUpdate(from: source, count: subspan.count)
          }
        }
      }
      subspan._discard()
    }
  }
}

// Support for partitioning.
extension UniqueSpan where Element: ~Copyable {
  // TODO: noncopyable tuples are not yet supported
  public struct Pair: ~Copyable, ~Escapable {
    let first: UniqueSpan
    let second: UniqueSpan

    init(_ first: consuming UniqueSpan<Element>, _ second: consuming UniqueSpan<Element>)
      -> dependsOn(first, second) Self {
      self.first = first
      self.second = second
    }
  }
  
  // Consumes 'self'.
  public consuming func split(at index: Index) -> Pair {
    return Pair(Self(baseAddress: baseAddress, count: index),
                Self(baseAddress: _pointer(to: index), count: endIndex - index))
  }

  // Destroys the suffix.
  public consuming func prefix(_ maxLength: Int) -> Self {
    precondition(maxLength >= 0, "Can't have a prefix of negative length.")
    let idx = maxLength < count ? maxLength : count
    return split(at: idx).first
  }
}

extension Array {
  var elements: UniqueSpan<Element> {
    _read {
      // Temporarily lie about element ownership. Multiple owned spans may think they own their elements as long as they
      // are all borrowed.
      let span = UniqueSpan<Element>(baseAddress: _buffer.firstElementAddress, count: count)
      yield span
      // Give the array back its elements.
      span._discard()
      // Manually enforce lifetime dependence on firstElementAddress.
      _fixLifetime(self)
    }
    _modify {
      _makeMutableAndUnique()
      var span = UniqueSpan<Element>(baseAddress: _buffer.firstElementAddress, count: count)
      yield &span
      _precondition(
        span.baseAddress == _buffer.firstElementAddress && span.count == count,
        "Array elements cannot be reassigned")
      // Give the array back its elements.
      span._discard()
      // Manually enforce lifetime dependence on firstElementAddress.
      _fixLifetime(self)
      _endMutation()
    }
  }
}

Alternatives Considered

MutableSpan

For some internal data structure management, we may want a MutableSpan type that only temporarily owns it's elements, and does not require all elements to be transferred back to a container when it relinquishes ownership.

container.withMutableElements { mutableSpan in
  // container.elements is *temporarily* inaccessible.
  mutableSpan[index] = newElement
}

A MutableSpan allows elements to be moved out of the span:

container.withMutableElements { mutableSpan in
  let consumeElement = mutableSpan[index]
}

Consequently, this is an unsafe type because it allows elements to be in an uninitialized state. That additional flexibility would make MutableSpan safer to use with other unsafe types. We may want to introduce it as an intermediate type that provides a withUnsafeMutableBufferPointer API.

Borrowing range subscript

The range subscript could be implemented as a borrow that returns an immutable Span<T>. This would make read-only access more usable:

readSpans(ownedSpan[range1], ownedSpan[range2])

But this would prevent using the range subscript for mutation, because all implementations of an accessor must yield the same type.

Future directions

'borrowing' and 'mutating' bindings

Closures, function nesting, and explicit writeback can be avoided by introducing borrowing and mutating bindings, as shown in the Performance predictability roadmap.

borrowing borrowedSpan = container.elements
// 'elements' is now read-only.
ownedSpan.update(at: index, value: newElement)
// After the last use of `borrowedSpan`, elements are unchanged and writable again.
mutating ownedSpan = container.elements
// 'elements' is now inaccessible.
ownedSpan.update(at: index, value: newElement)
// After it's last use, 'ownedSpan' is implicitly written back to 'elements'
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment