Skip to content

Instantly share code, notes, and snippets.

@xwu
Last active April 23, 2017 05:51
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save xwu/e864ffdf343160a8a26839388f677768 to your computer and use it in GitHub Desktop.
Save xwu/e864ffdf343160a8a26839388f677768 to your computer and use it in GitHub Desktop.
An alternative design for Comparable

Proposed solution

Comparison

Map Foundation's NSComparisonResult as

@objc public enum Comparison : Int, Equatable {
  case less = -1
  case equal = 0
  case greater = 1
}

One justification for breaking from Objective-C precedent for case names is that the names proposed here align with IEEE 754 and with existing practice in other languages (e.g., Rust).

Moreover, it is a deliberate choice to avoid terminology such as "ascending" and "descending" because, by accommodating types in which two values may be unordered relative to each other (see below), .less and .greater are not intended to be suitable for sorting.

Indeed, even for types which have only a single, well-defined total order, .equal (or NSComparisonResult.orderedSame) cannot be suitable for sorting because equivalence is not irreflexive, and using terminology to suggest that a comparison result can simply double as a sort order would be misleading.

Comparable

Comparable will gain a new comparison method which is to be the "main" dispatch point of Comparable. It will be spelled as infix operator <=> with comparison precedence:

public static func <=> (lhs: Self, rhs: Self) -> Comparison?

The function will return nil if lhs and rhs are unordered relative to each other. Otherwise,

  • (x <=> y) == .some(.less) if x compares less than y
  • (x <=> y) == .some(.equal) if x compares equal to y
  • (x <=> y) == .some(.greater) if x compares greater than y

Note: A conforming type for which every value is ordered with respect to every other value may satisfy the protocol requirements of Comparable by implementing <=> with return type Comparison instead of Comparison?. The compiler should use this information for optimization.

Default comparison operators

Most code will continue to use <, ==, or >. These will be default implementations defined in terms of <=> as follows:

public static func < (lhs: Self, rhs: Self) -> Bool {
  return (lhs <=> rhs)! == .less
}

public static func == (lhs: Self, rhs: Self) -> Bool {
  return (lhs <=> rhs)! == .equal
}

public static func > (lhs: Self, rhs: Self) -> Bool {
  return (lhs <=> rhs)! == .greater
}

(<= and >= will be defined along similar lines.)

Quiet comparison operators

Few types are expected to have values that are unordered relative to each other. In the standard library, FloatingPoint.nan (NaN, or "not a number") is a value that is unordered relative to any other floating point value, a requirement of the IEEE standard for floating point. Operations with NaN as an operand will often themselves produce NaN, but otherwise they are the result of computing:

  • .infinity - .infinity (or, equivalently, addition involving -.infinity)
  • 0 * .infinity or .infinity * 0 (or variations involving the sign of zero or infinity)
  • 0 / 0 or .infinity / .infinity (or variations involving the sign of zero or infinity)

Just as Swift arithmetic operators trap by default on overflow, comparison operators will trap by default on evaluation of values that are unordered relative to each other.

By analogy with Swift overflow arithmetic operators such as &+, C-style behavior (which, for floating point, corresponds to IEEE "quiet" comparison predicates) will be made available as &<, &==, &>, &<=, and &>=. These functions will return false instead of trapping when two values are unordered relative to each other; likewise, &!= will return true instead of trapping. They will have default implementations in terms of <=> as follows:

public static func &< (lhs: Self, rhs: Self) -> Bool {
  return (lhs <=> rhs) == .some(.less)
}

// etc.

These operators will be defined on Comparable and not on FloatingPoint. A key rationale for this design is to permit types other than FloatingPoint, including third-party types, to distinguish between signaling and quiet comparison of values when some values may be unordered with respect to others. (Another rationale for this design is that &< corresponds to what is currently spelled as <, which can be used as a predicate for Sequence.sorted.)

Backwards compatibility

For compatibility with existing code, <=> will have a default implementation defined in terms of < and ==. Compiler magic will be used to ensure that either <=> or both < and == are implemented by a conforming type.

Comparisons of floating point values will be migrated to use &< instead of <, etc., ensuring that all code continues to behave identically.

FloatingPoint

As discussed above, comparison operators such as < become IEEE "signaling" predicates that trap on comparison of NaN with any other value. Quiet comparisons are made available with the same behavior as Swift 3 operators, but with alternative spellings: &<, &==, etc.

In this design, only one ordering exists for floating point types, which corresponds to the "level 1" approximation of real arithmetic in the IEEE standard. In other words, there are no two values that compare equal in some context but unequal in another, and no generic algorithm will return a different value if the context "knows" about only comparable types or "knows" about floating point.

In all contexts:

  • (+0.0 <=> -0.0) == .some(.equal), and therefore:

    • +0.0 == -0.0 is true and +0.0 > -0.0 is false
    • +0.0 &== -0.0 is true and +0.0 &> -0.0 is false
  • (.nan <=> .nan) == .none, and therefore:

    • .nan == .nan traps and .nan > .nan traps
    • .nan &== .nan is false and .nan &> .nan is false

The results of these comparison operators are strictly in adherence with IEEE standards.

Changing the default comparison operators to trap when evaluating NaN is a "speed bump" with respect to porting numerical recipes from C, which is an important use case. However, faithfully porting any numerical recipes from C that involve integer types already requires translating + to &+ and accounting for differences in the order of operations for bitwise manipulations. Moreover, the accepted integer protocols proposal normalizes the behavior of bitshifting operators and introduces &<< and &>>. The migration from < to &<, which can be automated, will be a comparatively minor change.

Generic algorithms

Generic algorithms that make use of comparison need to account for the possibility that (a <=> b) == .none for two values a and b.

Another proposal suggests that, for these purposes, any floating point NaN should be deemed equivalent to any other floating point NaN but greater than any other floating point value. While this permits a stable sort, it yields potentially undesirable results in the context of max.

Instead, this proposal suggests that the handling of unordered comparisons should be left to each function. One practical consideration for this suggestion is that, empirically, many functions require only adjustments for spelling to yield a reasonable result. A minor re-design is proposed for sorted and sort.

The default implementation for contains, elementsEqual, split, and starts(with:) on Sequence where Iterator.Element : Equatable, and for index(of:) on Collection where Iterator.Element : Equatable, will (notionally) use the following predicate:

{
  ($0 &== $1) || (($0 <=> $0) == .none && ($1 <=> $1) == .none)
}

One consequence of this design is that [Double.nan].contains(.nan) returns true, as ordinarily expected. Consideration may be given to improving the naming of elementsEqual so as not to suggest that all elements of the receiver and argument literally compare .some(.equal).

The default implementation for lexicographicallyPrecedes and min on Sequence where Iterator.Element : Comparable will (notionally) use the following predicate, unchanged except for spelling from the current implementation:

{
  $0 &< $1
}

The default implementation for max on Sequence where Iterator.Element : Comparable will (notionally) use the following predicate, unchanged except for spelling from the current implementation:

{
  $0 &> $1
}

Preserving these predicates for min and max ensures that NaN is never returned as either the minimum or maximum value in a collection, which is unlikely to be the user's intention.

sorted and sort

The default implementation for sorted on Sequence where Iterator.Element : Comparable and for sort on MutableCollection where Iterator.Element : Comparable will gain an optional argument of type SortOrder with default value .ascending. The type is to be declared as follows:

public enum SortOrder : Int, Equatable {
  case ascending = -1
  case none = 0
  case descending = 1
}

A type distinct from Comparison and case names distinct from those in Comparison are proposed to emphasize that two values which do not compare .less may nevertheless be sorted apart from each other in .ascending sort order.

While sorted(.none) is a trivial operation, this enum case serves to maximize the re-usability of SortOrder. In other languages, for example, SortOrder is used in the context of data tables, with some columns sorted ascending or descending and other columns unsorted.

The (notional) predicate used for sorted and sort will be:

{
  switch sortOrder {
  case .ascending:
    return $0 &< $1 || (($1 <=> $1) == .none && ($0 <=> $0) != .none)
  case .none:
    return true
  case .descending:
    return $1 &< $0 || (($0 <=> $0) == .none && ($1 <=> $1) != .none)
  }
}

There is no a-priori reason why sorting a list of floating point values should place all NaN values at the end or beginning. Arguably, that decision is an arbitrary one intrinsic to the sorting algorithm and not to the floating point values themselves. In this design, such behavior is to be documented on the functions sorted and sort rather than types conforming to FloatingPoint.

A migration tool can suggest sorted(.ascending) in place of sorted(by: <) for all sequences with comparable elements. An even more intelligent migration tool can ignore uses of sorted(by: <) for types that implement <=>(_:_:) -> Comparison instead of <=>(_:_:) -> Comparison? (see above).

Users that continue to write sorted(by: <) or sorted(by: >) in generic code, or in the context of types that have elements unordered relative to each other, should be warned about the possibility of trapping. The warning would then be silenced by re-writing with a closure; for example, sorted { $0 < $1 }.

Overloading to sort by (T, T) -> Comparison?

Simultaneously, or as part of a future API expansion, sorted and sort can be overloaded with the following:

public func sorted(
  _: SortOrder,
  by: (Iterator.Element, Iterator.Element) -> Comparison?
) -> [Iterator.Element]

public mutating func sort(
  _: SortOrder,
  by: (Iterator.Element, Iterator.Element) -> Comparison?
)

This would permit, for example, localized sorting of an instance of type [String] by calling sorted(.ascending) { $0.localizedCompare($1) }.

Minor API renaming

APIs designed at different times should, if possible, use consistent terminology.

This proposal suggests two sort orders named ascending and descending. To minimize confusion, values that are "in order" should not be referred to as "increasing" since they may actually be in descending order. For clarity, the internal name for sorted(by:) should be changed from areInIncreasingOrder to areInSortOrder.

The standard library uses less than and greater than for comparisons. With respect to values in order, it refers to them as preceding one another, as in lexicographicallyPrecedes. (Previous versions of Swift also used the term successor in relation to sequences.) It does not, however, generally refer to values as being "above" or "below" one another. For this reason, rename FloatingPoint.isTotallyOrdered(belowOrEqualTo:) to FloatingPoint.isTotallyOrdered(lessThanOrEqualTo:).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment