Navigation Menu

Skip to content

Instantly share code, notes, and snippets.

@CTMacUser
Last active May 14, 2020 04:25
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save CTMacUser/f6e7df33c53fadf8361f97dd86cba2bc to your computer and use it in GitHub Desktop.
Save CTMacUser/f6e7df33c53fadf8361f97dd86cba2bc to your computer and use it in GitHub Desktop.
A proposal to generalize the Sequence comparison methods, with specializations for Collections.

Locating Sequence-Commonality Breaks

Introduction

This proposal expands the surface of the Standard Library by adding a generalization of finding mismatches that the Sequence comparison methods do. Specializations of this for Collections are also proposed.

Swift-evolution threads:

Motivation

The comparison methods of Sequence:

  • starts(with:) and starts(with: by:)
  • elementsEqual(_:) and elementsEqual(_: by:)
  • lexicographicallyPrecedes(_:) and lexicographicallyPrecedes(_: by:)

are based around the same algorithm:

  1. Find the first pair of elements from each of the two input sequences that do not match, and/or at least one sequence has reached its end.
  2. Return the result based on the values and/or at-the-end status of the element pair you stopped at.

Instead of repeating that search (i.e. Step 1) in every applicable algorithm, it can be segregated out to a method that even users can call for custom needs.

If we call that common algorithm findDivergence(from: by:), then we can implement the existing comparison methods by:

extension Sequence {

    func starts<PossiblePrefix: Sequence>(
        with possiblePrefix: PossiblePrefix,
        by areEquivalent: (Element, PossiblePrefix.Element) throws -> Bool
        ) rethrows -> Bool {
        return try findDivergence(from: possiblePrefix, by: areEquivalent).otherDivergingElement == nil
    }

    func elementsEqual<OtherSequence: Sequence>(
        _ other: OtherSequence,
        by areEquivalent: (Element, OtherSequence.Element) throws -> Bool
        ) rethrows -> Bool {

        let (e, oe) = try findDivergence(from: other, by: areEquivalent)
        return e == nil && oe == nil
    }

    func lexicographicallyPrecedes<OtherSequence: Sequence>(
        _ other: OtherSequence,
        by areInIncreasingOrder: (Element, Element) throws -> Bool
        ) rethrows -> Bool
        where OtherSequence.Element == Element {
            let (e, oe) = try findDivergence(from: other, by: { try !areInIncreasingOrder($0, $1) && !areInIncreasingOrder($1, $0) })
            switch (e, oe) {
            case (.some(let ee), .some(let ooe)):
                // The result is the same as the ordering of these unequal values.
                return try areInIncreasingOrder(ee, ooe)
            case (.none, .some):
                // This sequence is a prefix of the other.
                return true
            case (_, .none):
                // The other sequence is either identical to or shorter than this one.
                return false
            }
    }

}

The signature of findDivergence(from: by:) above returns the values of the elements of the two sequences where the commonalities stopped. For a Collection, a more useful return could be the index in each collection where values diverged. Then the common prefix count and the differing element values can be derived from the returned indexes.

A method that indicates where the common start of two collections diverged can be reversed to indicate where the common end of two collections converged. This requires collections that also conform to BidirectionalCollection in order to search backwards.

Proposed solution

The proposed solution is the add six methods across the Sequence hierarchy. One is the generalized algorithm, to live as an extension method in Sequence. Next is a Collection variant to return the indexes of the divergent point. Then another variant for BidirectionalCollection to return the indexes where two collections start being the same. Finally, there will be conditional refinements of the three methods that default to == as the comparison operation when the Element type is Equatable.

Detailed design

A reference implementation for the methods are included here:

extension Sequence {

    /// Returns the first corresponding pair of elements from this sequence and
    /// the given sequence that are not equivalent, using the given predicate as
    /// the equivalence test.
    ///
    /// The predicate must be an equivalence relation over the elements.
    ///
    /// To know how many elements in each sequence were skipped until a possible
    /// non-match was found, attach a counter in the predicate.  It will be
    /// called exactly once per corresponding pair, but only when neither
    /// sequence ran short that iteration.
    ///
    /// - Precondition: Either at least one of the involved sequences is finite,
    ///   or there is at least one pair of corresponding elements that are not
    ///   equivalent.
    ///
    /// - Parameters:
    ///     - other: A sequence to compare to this sequence.
    ///     - areEquivalent: A predicate that returns `true` if its two
    ///       arguments are equivalent; otherwise, `false`.
    /// - Returns: A tuple of two optional elements, corresponding to the first
    ///   non-matching element pair from this sequence and `other`,
    ///   respectively.  When both members are `nil`, then the sequences have
    ///   the same length and are equivalent all the way through.  When only one
    ///   member is `nil`, then the corresponding sequence is shorter and also
    ///   equivalent to the prefix of the other sequence.  (If neither is `nil`,
    ///   then the values must differ according to `areEquivalent`.)
    ///
    /// - Complexity: O(*n*), where *n* is the shorter of this sequence and
    ///   `other`.
    public func firstDelta<S: Sequence>(
        against other: S,
        by areEquivalent: (Element, S.Element) throws -> Bool
    ) rethrows -> (Element?, S.Element?) {
        var iterator = makeIterator(), otherIterator = other.makeIterator()
        while true {
            let first = iterator.next(), second = otherIterator.next()
            if let f = first, let s = second, try areEquivalent(f, s) {
                continue
            } else {
                return (first, second)
            }
        }
    }

}

extension Sequence where Element: Equatable {

    /// Returns the first corresponding pair of elements from this sequence and
    /// the given sequence that are not equal.
    ///
    /// - Precondition: Either at least one of the involved sequences is finite,
    ///   or there is at least one pair of corresponding elements that are not
    ///   equal.
    ///
    /// - Parameters:
    ///     - other: A sequence to compare to this sequence.
    /// - Returns: A tuple of two optional elements, corresponding to the first
    ///   unequal element pair from this sequence and `other`, respectively.
    ///   When both members are `nil`, then the sequences have the same length
    ///   and are equal all the way through.  When only one member is `nil`,
    ///   then the corresponding sequence is shorter and also equal to the
    ///   prefix of the other sequence.  (If neither is `nil`, then the values
    ///   must differ.)
    ///
    /// - Complexity: O(*n*), where *n* is the shorter of this sequence and
    ///   `other`.
    @inlinable
    public func firstDelta<S: Sequence>(against other: S)
        -> (Element?, S.Element?) where S.Element == Element {
        return firstDelta(against: other, by: ==)
    }

}

extension Collection {

    /// Returns the past-the-end indices of this collection and the given
    /// collection for their longest common prefix, using the given predicate as
    /// the equivalence test.
    ///
    /// The predicate must be an equivalence relation over the elements.
    ///
    /// - Parameters:
    ///     - other: A collection to compare to this collection.
    ///     - areEquivalent: A predicate that returns `true` if its two
    ///       arguments are equivalent; otherwise, `false`.
    /// - Returns: A tuple of two indices, corresponding to the locations for
    ///   the first non-matching element pair from this collection and `other`,
    ///   respectively.  If a particular collection ran out of elements before a
    ///   mismatch, the corresponding index in the returned tuple is that
    ///   collection's `endIndex`.
    ///
    /// - Complexity: O(*n*), where *n* is the shorter of this collection and
    ///   `other`.
    @inlinable
    public func diverges<C: Collection>(
        from other: C,
        by areEquivalent: (Element, C.Element) throws -> Bool
    ) rethrows -> (Index, C.Index) {
        let result = try indices.firstDelta(against: other.indices, by: {
            try areEquivalent(self[$0], other[$1])
        })
        return (result.0 ?? endIndex, result.1 ?? other.endIndex)
    }

}

extension Collection where Element: Equatable {

    /// Returns the past-the-end indices of this collection and the given
    /// collection for the longest common prefix.
    ///
    /// - Parameters:
    ///     - other: A collection to compare to this collection.
    /// - Returns: A tuple of two indices, corresponding to the locations for
    ///   the first unequal element pair from this collection and `other`,
    ///   respectively.  If a particular collection ran out of elements before a
    ///   mismatch, the corresponding index in the returned tuple is that
    ///   collection's `endIndex`.
    ///
    /// - Complexity: O(*n*), where *n* is the shorter of this collection and
    ///   `other`.
    @inlinable
    public func diverges<C: Collection>(from other: C) -> (Index, C.Index)
        where C.Element == Element {
        return diverges(from: other, by: ==)
    }

}

extension BidirectionalCollection {

    /// Returns the starting indices of this collection and the given collection
    /// for their longest common suffix, using the given predicate as the
    /// equivalence test.
    ///
    /// The predicate must be an equivalence relation over the elements.
    ///
    /// - Parameters:
    ///     - other: A collection to compare to this collection.
    ///     - areEquivalent: A predicate that returns `true` if its two
    ///       arguments are equivalent; otherwise, `false`.
    /// - Returns: A tuple of two indices, corresponding to one location after
    ///   the last non-matching element pair from this collection and `other`,
    ///   respectively, when aligning the collections by their ends.  If a
    ///   particular collection ran out of elements before a backwards-iterating
    ///   mismatch, the corresponding index in the returned tuple is that
    ///   collection's `startIndex`.
    ///
    /// - Complexity: O(*n*), where *n* is the shorter of this collection and
    ///   `other`.
    @inlinable
    public func converges<C: BidirectionalCollection>(
        with other: C,
        by areEquivalent: (Element, C.Element) throws -> Bool
    ) rethrows -> (Index, C.Index) {
        let result = try reversed().diverges(from: other.reversed(), by: areEquivalent)
        return (result.0.base, result.1.base)
    }

}

extension BidirectionalCollection where Element: Equatable {

    /// Returns the starting indices of this collection and the given collection
    /// for the longest common suffix.
    ///
    /// - Parameters:
    ///     - other: A collection to compare to this collection.
    /// - Returns: A tuple of two indices, corresponding to one location after
    ///   the last unequal element pair from this collection and `other`,
    ///   respectively, when aligning the collections by their ends.  If a
    ///   particular collection ran out of elements before a backwards-iterating
    ///   mismatch, the corresponding index in the returned tuple is that
    ///   collection's `startIndex`.
    ///
    /// - Complexity: O(*n*), where *n* is the shorter of this collection and
    ///   `other`.
    @inlinable
    public func converges<C: BidirectionalCollection>(with other: C)
        -> (Index, C.Index) where C.Element == Element {
        return converges(with: other, by: ==)
    }

}

Here's some example usage:

let orange = Array("orange")
let predetermined = Array("predetermined")
let predecessor = Array("predecessor")
let successor = Array("successor")
assert(orange.firstDelta(against: orange) == (nil, nil))
assert(orange.diverges(from: orange) == (6, 6))
assert(orange.converges(with: orange) == (0, 0))
assert(predetermined.firstDelta(against: predecessor) == ("t", "c"))
assert(predetermined.diverges(from: predecessor) == (5, 5))
assert(predecessor.converges(with: successor) == (5, 3))
assert(predetermined.diverges(from: successor) == (0, 0))
assert(predetermined.converges(with: successor) == (13, 9))
assert("add".firstDelta(against: "addend") == (nil, "e"))
assert("addend".firstDelta(against: "add") == ("e", nil))

Source compatibility

This change is additive only.

Effect on ABI stability

This change is additive only.

Effect on API resilience

This change is additive only.

Alternatives considered

The main alternative is to not add these methods to the Standard Library. I had found a need for diverges(from:) in other code I've written, with firstDelta(against:) and converges(with:) being easy extensions from diverges.

Earlier versions of this proposal had the firstDelta(against:) method called findDivergence(from:). The older method also retuned the common prefix count. That count is rarely needed and complicates the interface, which is why the current version removed it. The countCommonPrefix(with:) method was removed for the same reason. The counting functionality can be brought back as needed with a custom state-full predicate (or use the variants for collections).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment