Skip to content

Instantly share code, notes, and snippets.

@jstepien
Last active March 6, 2017 11:21
Show Gist options
  • Star 2 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save jstepien/70accee59eef8b899c72ee49b11ae95d to your computer and use it in GitHub Desktop.
Save jstepien/70accee59eef8b899c72ee49b11ae95d to your computer and use it in GitHub Desktop.
reduce-and-gc.clj
;; Let's experiment with Clojure 1.8.0 running with a relatively small heap
;; of 10MB. To avoid any extra dependencies, we'll start it as follows:
;;
;; java -Xmx10m -jar clojure-1.8.0.jar
;;
;; Copy following expressions into your REPL one by one.
;;
;; Let's take a small number, for instance one million.
(def small-num (* 1000 1000))
;; We can build a lazy sequence of the given length and get its last element
;; not exceeding our heap limits.
(last (range small-num))
;; We can define the lazy sequence in a let binding. It might look as if we
;; were holding onto its head, but the compiler is smart enough. We calculate
;; the last element of the sequence in the last expression of the let body.
;; This allows the compiler to drop the reference to the head of the sequence,
;; thus avoiding materialising the entire sequence in memory.
(let [coll (range small-num)]
(last coll))
;; Let's try to cause an out-of-memory error. We'll firstly calculate the last
;; element, ignore the result, and return the first element instead. The head of
;; the sequence (and the entire sequence itself) cannot be garbage collected.
;; We end up with an OutOfMemoryError.
(let [coll (range small-num)]
(last coll)
(first coll))
;; We can cause the same error by attempting to fully evaluate the sequence
;; before returning its last element. We get an OutOfMemoryError again.
(first (doall (range small-num)))
;; Now we've learned that a range of one million elements won't fit into our
;; 10MB heap. So let's continue our experiments with a range which is 100 times
;; longer.
(def large-num (* 100 small-num))
;; We can get the last element of a 100 times longer sequence not exceeding our
;; 10MB heap.
(last (range large-num))
;; The same result can be achieved with reduce.
(reduce (fn [_ current] current) (range large-num))
;; Similarly, we can ask for the last element in a subsequence of adjacent
;; elements satisfying a predicate. We can achieve it either with take-while
;; or with reduce.
(defn predicate [elem]
(< elem (/ large-num 2)))
(->> (range large-num)
(take-while predicate)
(last))
(->> (range large-num)
(reduce (fn [last elem] (if (predicate elem) elem (reduced last)))))
;; Summing elements of the long sequence? Sure, we can fit it in a 10MB heap too.
;; Let's initialise our reduction with a BigDecimal to avoid integer overflows.
(->> (range large-num)
(reduce + 0M))
;; Alternatively,
(->> (range)
(take large-num)
(reduce + 0M))
;; I hope this helps. Questions, comments, critique? Let me know! Reach me at
;; jan@stepien.cc or at @janstepien on Twitter.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment