Skip to content

Instantly share code, notes, and snippets.

@rocketnia
Created November 13, 2019 04:08
Show Gist options
  • Save rocketnia/5e319a54e4fd980ea5e26d024645628d to your computer and use it in GitHub Desktop.
Save rocketnia/5e319a54e4fd980ea5e26d024645628d to your computer and use it in GitHub Desktop.
A lazy merge sort of Racket streams, potentially useful for Rebellion
#lang racket/base
(require racket/contract/base)
(provide
(contract-out
[in-merge-sorted (-> (-> any/c any/c boolean?) stream? stream?)]))
(require racket/match
racket/promise
racket/stream)
;@------------------------------------------------------------------------------
;; Returns a nonempty stream that lazily forces the given promise. The promise
;; must return a nonempty stream.
(define (nonempty-stream-promise->nonempty-stream nonempty-stream-promise)
(stream-cons
(stream-first (force nonempty-stream-promise))
(stream-rest (force nonempty-stream-promise))))
;; Returns a stream that interleaves the elements of two given streams. If the
;; given comparison procedure behaves similarly to `<=` and the given streams
;; are already sorted according to that comparison procedure, then the result
;; will be sorted as well, and the sort will be stable: Elements from the first
;; stream will occur before elements from the second stream.
;;
;; The result stream is returned in `O(1)` time to begin with. The first time
;; the stream is forced with `stream-first` or `stream-rest`, it will force up
;; to two elements of the input streams, make up to one comparison, and other
;; than that, return in constant time. The first time each nonempty tail of the
;; stream is forced with `stream-first` or `stream-rest`, it will force up to
;; one element of the input streams, make up to one comparison, and other than
;; that, return in constant time.
;;
(define (in-merge-step <=? as bs)
;; If either input stream is empty, merging them is trivial.
(cond
[(stream-empty? as) bs]
[(stream-empty? bs) as]
[else
;; Otherwise, we return a stream that lazily forces the first elements of
;; the two input streams, compares them to determine which one is the
;; first element of the result, and uses recursion to merge the remaining
;; elements.
(nonempty-stream-promise->nonempty-stream
(delay
(let ([a (stream-first as)] [b (stream-first bs)])
(if (<=? a b)
(stream-cons a (in-merge-step <=? (stream-rest as) bs))
(stream-cons b (in-merge-step <=? as (stream-rest bs)))))))]))
;; Given a nonempty list and a two-argument merge function, this zig-zags
;; through the list, merging the elements pairwise. On the first pass, this
;; merges the first and second elements, the third and fourth elements, etc. If
;; there's a leftover element at the end, the first pass doesn't modify it. On
;; the second pass, this merges the second to last element with the last, the
;; fourth to last element with the third to last, etc. If there's a leftover
;; elementat the beginning, the second pass doesn't modify it. The third pass is
;; the same as the first. At some point there will be only one element left
;; over, and that element will be the return value.
;;
;; As a result of this approach, when this is called with `(N + 1)` elements, it
;; takes `O(N)` time and calls the merge function `N` times along the way. Those
;; `N` calls depend on each other in a balanced binary tree shape.
;;
;; The merging is stable: The order this passes arguments to the merge function
;; always reflects the order the values' ancestors appeared in the original
;; list.
;;
(define (merge-back-and-forth elems merge)
(let loop ([elems elems]
[merge merge]
[rev-elems (list)]
[rev-merge (λ (a b) (merge b a))])
(match elems
[(list elem)
(match rev-elems
;; If there's only one element left altogether, we return it.
[(list) elem]
;; If there's only one element left in the current pass but there are
;; still one or more elements that need to be processed in the next
;; pass, we add this element to those, and we flip our perspective:
;; What we called `rev-elems` and `rev-merge`, we now call `elems` and
;; `merge`, and vice versa. This begins the next pass, going the other
;; way through the list.
[_ (loop (cons elem rev-elems) rev-merge (list) merge)])]
;; If there are two or more elements left in the current pass, we merge
;; them. We put the result in `rev-elems`, the collection of values that
;; will be merged again on the next pass.
[(list* a b elems)
(loop elems merge (cons (merge a b) rev-elems) rev-merge)])))
;; Given a list of `N` instances of the value `#f` and a promise of a stream,
;; returns a stream that iterates through the first `N` elements of the given
;; stream. Forcing the result stream using `stream-rest` doesn't force any tails
;; of the input stream; the tails of the input are only forced when the elements
;; of the result stream are forced.
;;
;; If the given stream has fewer than `N` elements, the result stream will still
;; have a length of `N`, but forcing elements that don't exist in the original
;; stream will cause an error.
;;
(define (stream-adopt-length n xs-promise)
(match n
[(list) empty-stream]
[(cons #f n)
(stream-cons (stream-first (force xs-promise))
(stream-adopt-length n (delay (stream-rest (force xs-promise)))))]))
;; Returns a stream that iterates through the elements of the given stream in
;; some order. If the given comparison procedure behaves similarly to `<=`, the
;; resulting stream will be a stably sorted permutation of the original stream.
;;
;; (NOTE: The following computational complexity claims are pretty tricky and
;; might be incorrect.)
;;
;; This returns the result stream in `O(1)` time to begin with. The first time
;; the result stream is forced with `stream-first` or `stream-rest`, it will
;; force the entire input stream and take `O(N)` time aside from that, where `N`
;; is the input stream's length. Forcing the other tails of the result stream
;; will take `O(1)` time. Forcing the first element will make `(N - 1)`
;; comparisons. Forcing element `(K + 1)` will also force element `K`, and on
;; top of that cost, it will take up to `O(log N)` additional time and make up
;; to `(ceil(log_2 N) - 1)` additional comparisons. As `K` increases, the actual
;; amount of time and number of comparisons taken will tend to decrease.
;;
;; As a result of those properties, for nonzero `K`, using `in-merge-sorted` to
;; find the first `K` elements of a stream of length `N` will take up to
;; `O(max(N, K * log N))` time and make no more than
;; `(N - 1 + (K - 1) * (ceil(log_2 N) - 1))` comparisons, which simplifies to
;; `(N - K + (K - 1) * ceil(log_2 N))`. The actual bound is a bit tighter,
;; presumably closer to the *sorting numbers* as K approaches N.
;;
;; NOTE: We have some options for what computational complexity to expect of the
;; `stream-rest` operations:
;;
;; * On the first `stream-rest`, compute the entire length of the input in
;; `O(N)` time. Use that to determine the rest of the tails' `stream-rest`
;; operations in constant time and without memory leaks. (This is the one
;; we've chosen. This is cheaper than what we'd have to settle for if Racket
;; streams caused the head of a stream to be forced whenever the tail was.)
;;
;; * Compute every `stream-rest` in constant time, but use a reference to the
;; original stream to detect when we're at the end. While this is very cheap
;; in terms of time, it means the tails of the original stream might be
;; retained longer than necessary by the garbage collector.
;;
;; Using a combination of `delay/sync`, `wrap-evt`, `choice-evt`, and
;; `delay/idle`, we might be able to get the best of both worlds, at first
;; retaining the tails of the original stream, but then leaving those behind as
;; soon as the length is computed some other way. However, this approach would
;; make use of short-lived threads, which might have higher constant factor
;; costs and would likely be more difficult to maintain.
;;
(define (in-merge-sorted <=? xs)
(if (stream-empty? xs)
;; If the original stream is empty, it's already sorted.
empty-stream
;; Otherwise, we return a stream that, the first time it's forced, forces
;; the entire input stream, wraps each element in a single-element sorted
;; stream, and performs a balanced binary tree of merge operations to make
;; them all into one sorted stream.
;;
;; This expression returns in `O(N)` time, where `N` is the length of the
;; stream. It doesn't perform any comparisons. Instead, further computation
;; and comparisons are performed when parts of the resulting stream are
;; forced.
;;
(nonempty-stream-promise->nonempty-stream
(delay
(let ([xs (stream->list xs)])
(stream-adopt-length (for/list ([x (in-list xs)]) #f)
(merge-back-and-forth (for/list ([x (in-list xs)]) (stream x))
(λ (as bs) (in-merge-step <=? as bs)))))))))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment