- Proposal: SE-NNNN
- Authors: Daryle Walker
- Review Manager: TBD
- Status: Awaiting implementation
During the review process, add the following fields as needed:
- Implementation: apple/swift#NNNNN
- Decision Notes: Rationale, Additional Commentary
- Bugs: SR-NNNN, SR-MMMM
- Previous Revision: 1
- Previous Proposal: SE-XXXX
This proposal introduces two new protocols to the standard library,
RepeatableSequence
and PersistentSequence
. They are placed between
Sequence
and Collection
in the collection hierarchy, to mark that a sequence
type supports multiple passes.
Swift-evolution thread: Discussion thread topic for that proposal
The documentation for Sequence
includes the following "Repeated Access"
section:
The
Sequence
protocol makes no requirement on conforming types regarding whether they will be destructively consumed by iteration. As a consequence, don't assume that multiplefor
-in
loops on a sequence will either resume iteration or restart from the beginning:for element in sequence { if ... some condition { break } } for element in sequence { // No defined behavior }
In this case, you cannot assume either that a sequence will be consumable and will resume iteration, or that a sequence is a collection and will restart iteration from the first element. A conforming sequence that is not a collection is allowed to produce an arbitrary sequence of elements in the second
for
-in
loop.To establish that a type you've created supports nondestructive iteration, add conformance to the
Collection
protocol.
With the documentation for IteratorProtocol
having a similar "Using Multiple
Iterators" section:
Whenever you use multiple iterators (or
for
-in
loops) over a single sequence, be sure you know that the specific sequence supports repeated iteration, either because you know its concrete type or because the sequence is also constrained to theCollection
protocol.Obtain each separate iterator from separate calls to the sequence's
makeIterator()
method rather than by copying. Copying an iterator is safe, but advancing one copy of an iterator by calling itsnext()
method may invalidate other copies of that iterator.for
-in
loops are safe in this regard.
These notes suggest room for a protocol between Sequence
and Collection
, one
for sequences that shouldn't support indexing or other collection semantics but
can still support multiple iteration passes. Using a protocol instead of
"knowing the concrete type" allows the creation of algorithms that can safely
assume multi-pass semantics on its sequence arguments.
The size of the solution depends on specifying an unanswered policy on sequences.
Is a sequence allowed to model an unstable underlying sequence? These
unstable sequences publish a different string of elements each iteration cycle,
even though the sequence's source of truth isn't considered mutated in between
calls. This can happen when iterators access an unstable source of truth, like
a random number generator, or faked when a Sequence
-conforming type is a
class
type and therefore suspends enforcement of immutability during
makeIterator()
calls.
If the answer is "yes," then two intermediate protocols are needed, one for
allowing multiple iteration passes, and one for flagging that the underlying
sequence doesn't change its string of returned elements between iterations. If
the answer is "no," then only the later intermediate protocol is needed.
Neither protocol adds required members, because they differ from Sequence
in
restrictions in state sharing between Iterator
instances or additionally their
underlying sequence's stability.
As a demonstration, methods to detect cycles in (infinite) sequences shall be added. They eagerly find cycles in the data, so if a sequence of values has mini-cycles in its non-repeating prefix and/or the actual loop, they may be fooled.
Add the following protocols:
/// A sequence that supports nondestructive iteration.
///
/// Repeatable sequences can be used in multiple `for`-`in` loops, where each
/// loop iterates from the beginning. And they can receive multiple methods
/// that iterate through their elements.
///
/// Conforming to the RepeatableSequence Protocol
/// =============================================
///
/// The `RepeatableSequence` protocol adds a constraint on the conforming type
/// and its `Iterator` type, but otherwise imposes no additional requirements
/// over the `Sequence` protocol. Any return from `makeIterator()`, or one of
/// its copies, must maintain independent iteration state from any separate
/// return (or their copies) from that method.
public protocol RepeatableSequence: Sequence {}
/// A sequence that supports repeated iteration over the same elements.
///
/// A persistent sequence has to produce the same string of returned elements,
/// element count and equivalence for corresponding elements, per call to
/// iteration, as long as the sequence doesn't receive a method that mutates
/// its underlying sequence. In other words, the string of elements
/// represented by the underlying sequence must be stable. This enables
/// algorithms that use iteration multiple times and exploit synchronization
/// between the runs.
///
/// Conforming to the PersistentSequence Protocol
/// =============================================
///
/// The `PersistentSequence` protocol adds a constraint on the conforming
/// type, but otherwise imposes no additional requirements over the
/// `RepeatableSequence` protocol. Between calls to methods that mutate the
/// sequence's underlying string of elements, all returns of `makeIterator()`
/// must generate the same string of elements.
public protocol PersistentSequence: RepeatableSequence {}
For the following types, change their primary conformance from Sequence
to
PersistentSequence
:
StrideTo
StrideThrough
For the following protocols, change their primary refinement from Sequence
to
PersistentSequence
:
Collection
(The way LazySequenceProtocol
uses its elements
property isn't specified
enough to enforce the property allowing multiple passes, so this protocol can
neither fully nor conditionally refine RepeatableSequence
or
PersistentSequence
. Conforming types can individually evaluate whether to add
conformance for multiple passes, possibly conditionally.)
For the following types, set them to conditionally conform to
RepeatableSequence
when their wrapped sequences do, and to
PersistentSequence
too when their wrapped sequences do too.
DropFirstSequence
DropWhileSequence
EnumeratedSequence
FlattenSequence
JoinedSequence
LazyDropWhileSequence
LazyFilterSequence
LazyMapSequence
LazyPrefixWhileSequence
LazySequence
PrefixSequence
UnfoldSequence
Zip2Sequence
(Some of these types currently need non-trivial changes to support multiple
passes. For FlattenSequence
and JoinedSequence
, both the wrapped sequence
and its element type need to conform. For Zip2Sequence
, both component
sequences need to conform. IteratorSequence
can never conform, since there's
no way to make a disconnected copy of the internal iterator.)
Add the following existential sequence types:
/// A type-erased repeatable sequence.
///
/// An instance of `AnyRepeatableSequence` forwards its operations to an
/// underlying base sequence having the same `Element` type, hiding the
/// specifics of the underlying sequence.
public struct AnyRepeatableSequence<Element>: RepeatableSequence {
public init<S: RepeatableSequence>(_ base: S) where S.Element == Element
public func makeIterator() -> AnyIterator<Element>
public var underestimatedCount: Int
}
/// A type-erased persistent sequence.
///
/// An instance of `AnyPersistentSequence` forwards its operations to an
/// underlying base sequence having the same `Element` type, hiding the
/// specifics of the underlying sequence.
public struct AnyPersistentSequence<Element>: PersistentSequence {
public init<S: PersistentSequence>(_ base: S) where S.Element == Element
public func makeIterator() -> AnyIterator<Element>
public var underestimatedCount: Int
}
(Since the new protocols have no new required interface, the internals of the
existential wrappers could reuse the implementation of AnySequence
.)
Add the following methods:
extension PersistentSequence {
/// Find the period and initial delay for this sequence, if possible,
/// using the given predicate as the equivalence test within the given
/// attempt limit.
///
/// The algorithm may be fooled if either the preface and/or the cyclical
/// part have mini-cycles. The algorithm may stop after the minimal
/// length to find an answer, even if either the actual delay and period
/// are larger or the sequence isn't truly infinite or periodic.
///
/// The predicate must be an *equivalence relation* over the elements.
/// That is, for any elements `a`, `b`, and `c`, the following conditions
/// must hold:
///
/// - `areEquivalent(a, a)` is always `true`. (Reflexivity)
/// - `areEquivalent(a, b)` implies `areEquivalent(b, a)`. (Symmetry)
/// - If `areEquivalent(a, b)` and `areEquivalent(b, c)` are both `true`,
/// then `areEquivalent(a, c)` is also `true`. (Transitivity)
///
/// - Precondition: `maxAttempts > 0`.
///
/// - Parameter maxAttempts: The maximum number of times the underlying
/// sequence will be read before giving up. If not given, defaults to
/// `Int.max`.
/// - Parameter areEquivalent: A predicate that returns `true` if its two
/// arguments are equivalent; otherwise, `false`.
///
/// - Returns: A tuple of the cycle period and count of elements delaying
/// the start of the first cycle.
///
/// - Complexity: O(*λ* + *μ*), where *λ* is the cycle length and *μ* is
/// the initial delay.
func detectCycle(maxAttempts: Int = Int.max, by areEquivalent: (Element,
Element) throws -> Bool) rethrows -> (delay: Int, period: Int)?
}
extension PersistentSequence where Element: Equatable {
/// Find the period and initial delay for this sequence, if possible,
/// within the given attempt limit.
///
/// The algorithm may be fooled if either the preface and/or the cyclical
/// part have mini-cycles. The algorithm may stop after the minimal
/// length to find an answer, even if either the actual delay and period
/// are larger or the sequence isn't truly infinite or periodic.
///
/// - Precondition: `maxAttempts > 0`.
///
/// - Parameter maxAttempts: The maximum number of times the underlying
/// sequence will be read before giving up. If omitted, defaults to
/// `Int.max`.
///
/// - Returns: A tuple of the cycle period and count of elements delaying
/// the start of the first cycle.
///
/// - Complexity: O(*λ* + *μ*), where *λ* is the cycle length and *μ* is
/// the initial delay.
func detectCycle(maxAttempts: Int = Int.max) -> (delay: Int, period: Int)?
{ return detectCycle(maxAttempts: maxAttempts, by: ==) }
}
Looking at Wikipedia's Cycle detection page,
detectCycle(maxAttempts: by:)
could be implemented by algorithms such as
Floyd's cycle-finding algorithm or Brent's algorithm.
Most changes are additive. Some of the existing sequence types dual-conform to
IteratorProtocol
, which would have to be removed in order for a type to
conform to PersistentSequence
.
The hierarchy for Sequence
and Collection
is locked because of ABI
stability, so proposals involving intermediate protocols are stalled until a new
version of the ABI is authorized, including a transition plan.
All changes are additive, modulo concerns in the ABI-stability section above.
The main alternative is to do nothing. The new protocols come from filling in
the hierarchy than direct need. But their concepts behind them are already
implied in the documentation, but this proposal gives them force to be
recognized by the compiler instead of by convention, helping infinite sequences
and other sequences otherwise not compatible with indexing (or other parts of
Collection
) support.
Due to the ABI freeze, an alternative is to introduce the concepts here in a parallel protocol hierarchy that users have to additionally conform their types.