Skip to content

Instantly share code, notes, and snippets.

@calda
Created April 10, 2019 14:16
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save calda/bde161eaa234594c6905194496234c8a to your computer and use it in GitHub Desktop.
Save calda/bde161eaa234594c6905194496234c8a to your computer and use it in GitHub Desktop.
Map Sorting

Map Sorting

Introduction

This proposal presents an addition to the Standard Library in an effort to make map sorting and, eventually, closureless key-path sorting a functional prerequisite for Swift.

Swift-evolution thread: Discussion thread topic for that proposal

Motivation

Today, the typical method for sorting arrays is to use an areInIncreasingOrder predicate. This closure takes the form (Element, Element) -> Bool, so it welds together both the accessing and comparison of values.

struct Person {
  ...
  var age: Int
}

struct Chat {
  ...
  var lastMsg: Message
}

var people: [Person] = ...
var chats: [Chat] = ...

people.sort { $0.age < $1.age }
chats.sort { $0.lastMsg.date > $1.lastMsg.date }

In many circumstances, this syntax can cause issues:

  • The $0.property < $1.property syntax often leads to copy-and-paste bugs, since it requires duplicating the property accessor before and after the comparison operator.
  • For long property names, this syntax can be especially verbose since it requires repeating the property name twice.
  • When the values are expensive to retrieve or calculate, this type of predicate becomes an obstacle for optimizations. It may be desirable to trade memory for speed such that each value is computed once per element rather than O(n logn) times. For an ϴ(n) operation, this optimization can theoretically speed up sorting by a factor of 10 for an array of 1000 elements. This is called the Schwartzian Transform.

Thereby, the goal of this proposal is to introduce an API that decouples the comparison of values from their retrieval, bringing ergonomic benefits for an ample range of cases, and opening the door for new optimizations.

Proposed solution

The authors propose to add an overload for both the nonmutating sorted and in-place sort methods on Sequence and MutableCollection respectively. A mapping closure on Element will lead the argument list, followed by the well known areInIncreasingOrder predicate and, finally, a flag for opting into the already mentioned Schwartzian Transform optimization.

  • transform is deliberately positioned before the predicate to respect logical and type-checking order.
  • Following the precedent set by sort and sorted, the areInIncreasingOrder comparator has a default value of < when the Value type being sorted on is Comparable.

Here are some example usages:

chats.sort(on: { $0.lastMsg.date })
chats.sort(on: { $0.lastMsg.date }, by: >)

fileUnits.sort(
  on: { $0.raw.count(of: "func") },
  by: <,
  isExpensiveTransform: true
)

Following the acceptance of SE-0249, the examples above can also be written with a key-path:

chats.sort(on: \.lastMsg.date)
chats.sort(on: \.lastMsg.date, by: >)

Implementation

extension Sequence {
    
    @inlinable
    public func sorted<Value>(
        on transform: (Element) throws -> Value,
        by areInIncreasingOrder: (Value, Value) throws -> Bool,
        isExpensiveTransform: Bool = false
        ) rethrows -> [Element] {
        guard isExpensiveTransform else {
            return try sorted {
                try areInIncreasingOrder(transform($0), transform($1))
            }
        }
        var pairs = try map {
            try (element: $0, value: transform($0))
        }
        try pairs.sort {
            try areInIncreasingOrder($0.value, $1.value)
        }
        
        return pairs.map { $0.element }
    }
    
    @inlinable
    public func sorted<Value: Comparable>(
        on transform: (Element) throws -> Value,
        isExpensiveTransform: Bool = false
        ) rethrows -> [Element] {
        return try self.sorted(on: transform, by: <, isExpensiveTransform: isExpensiveTransform)
    }
    
}


extension MutableCollection where Self: RandomAccessCollection {
    
    @inlinable
    public mutating func sort<Value>(
        on transform: (Element) throws -> Value,
        by areInIncreasingOrder: (Value, Value) throws -> Bool,
        isExpensiveTransform: Bool = false
        ) rethrows {
        guard isExpensiveTransform else {
            return try sort {
                try areInIncreasingOrder(transform($0), transform($1))
            }
        }
        var pairs = try map {
            try (element: $0, value: transform($0))
        }
        try pairs.sort {
            try areInIncreasingOrder($0.value, $1.value)
        }
        
        for (i, j) in zip(indices, pairs.indices) {
            self[i] = pairs[j].element
        }
    }
    
    @inlinable
    public mutating func sort<Value: Comparable>(
        on transform: (Element) throws -> Value,
        isExpensiveTransform: Bool = false
        ) rethrows {
        try self.sorted(on: transform, by: <, isExpensiveTransform: isExpensiveTransform)
    }
    
}

Source compatibility & ABI stability

This is an ABI-compatible addition with no impact on source compatibility.

Alternatives considered

Alternative argument labels

people.sort(over: { $0.age }, by: <)

over has more of a mathematical flavor to it, where the transform argument is read as a set of values with an injective (one-to-one) relation to the elements of the sequence. For that matter, «map sorting» can be thought of as sorting the set of transformed values and permuting the elements of the original sequence accordingly. While this variant emphasizes the strong correlation of mathematics and computer science, Swift as a multi-paradigm language should strive to settle with generally understandable names that, ideally, can be easily recognized regardless of the user's background.

people.sort(by: { $0.age }, using: <)

The by label is a perfect candidate to describe the metric used to sort the sequence. using, on its turn, is just as fitting for a predicate or an operator. The pair in question is perhaps the only one that always boils down to a proper sentence - «Sort(ed) by a metric using a predicate». Nevertheless, the author is convinced in the superior importance of preserving API uniformity and consistency with existing API the Standard Library developers have worked so hard to keep. With ABI Stability kicking in, we no longer have the opportunity for amendments to public API signatures and must be especially careful in this regard.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment