Created
October 2, 2021 07:26
-
-
Save bsless/0d9863424a00abbd325327cff1ea0a04 to your computer and use it in GitHub Desktop.
Idiomatically refactor Clojure code to improve performance
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
;; http://johnj.com/from-elegance-to-speed.html | |
(def times (iterate #(+ % (rand-int 1000)) 0)) | |
(def times-v (into [] (take 1e6) (iterate #(+ % (rand-int 1000)) 0))) | |
(defn smt-8 [times] | |
(->> times | |
(partition 8 1) | |
(map (juxt identity | |
(comp (partial apply -) | |
(juxt last first)))) | |
(filter (comp (partial > 1000) second)) | |
doall)) | |
(cc/quick-bench (smt-8 times-v)) | |
(defn sliding | |
([^long n] | |
(sliding n 1)) | |
([^long n ^long step] | |
(fn [rf] | |
(let [a (java.util.ArrayDeque. n)] | |
(fn | |
([] (rf)) | |
([result] | |
(let [result (if (.isEmpty a) | |
result | |
(let [v (vec (.toArray a))] | |
;;clear first! | |
(.clear a) | |
(unreduced (rf result v))))] | |
(rf result))) | |
([result input] | |
(.add a input) | |
(if (= n (.size a)) | |
(let [v (clojure.lang.LazilyPersistentVector/createOwning (.toArray a))] | |
(dotimes [_ step] (.removeFirst a)) | |
(rf result v)) | |
result))))))) | |
(def smt-xf0 | |
(comp | |
(sliding 8 1) | |
(map (juxt identity | |
(comp (partial apply -) | |
(juxt last first)))) | |
(filter (comp (partial > 1000) second)))) | |
(cc/quick-bench (doall (sequence smt-xf0 times-v))) | |
(def smt-xf1 | |
(comp | |
(sliding 8 1) | |
(map (fn [v] [v (- (peek v) (nth v 0))])) | |
(filter (fn [v] (> 1000 (nth v 1)))))) | |
(cc/quick-bench (doall (sequence smt-xf1 times-v))) | |
(def smt-xf2 | |
(comp | |
(sliding 8 1) | |
(keep (fn [v] (let [d (- (peek v) (nth v 0))] | |
(when (> 1000 d) | |
[v d])))))) | |
(cc/quick-bench (doall (sequence smt-xf2 times-v))) | |
(import 'java.util.ArrayDeque) | |
(defn sliding* | |
([^long n] | |
(sliding* n 1)) | |
([^long n ^long step] | |
(fn [rf] | |
(let [a (ArrayDeque. n)] | |
(fn | |
([] (rf)) | |
([result] | |
(let [result (if (.isEmpty a) | |
result | |
(let [v (.toArray ^ArrayDeque a)] | |
;;clear first! | |
(.clear a) | |
(unreduced (rf result v))))] | |
(rf result))) | |
([result input] | |
(.add a input) | |
(if (= n (.size a)) | |
(let [v (.toArray ^ArrayDeque a)] | |
(dotimes [_ step] (.removeFirst a)) | |
(rf result v)) | |
result))))))) | |
(def smt-xf3 | |
(comp | |
(sliding* 8 1) | |
(keep (fn [^objects arr] | |
(let [d (- ^long (aget arr (unchecked-dec-int (alength arr))) ^long (aget arr 0))] | |
(when (> 1000 d) | |
[arr d])))))) | |
(do | |
(cc/quick-bench (smt-8 times-v)) ;; 1.2 sec | |
(cc/quick-bench (doall (sequence smt-xf0 times-v))) ;; 470 ms | |
(cc/quick-bench (doall (sequence smt-xf1 times-v))) ;; 54 ms | |
(cc/quick-bench (doall (sequence smt-xf2 times-v))) ;; 49 ms | |
(cc/quick-bench (doall (sequence smt-xf3 times-v))));; 30 ms |
Sorry, no, with your times-v
payload, I get:
(c/quick-bench
(=>> times-v
(x/partition 512)
(x/partition 2 1)
(map #(concat (first %) (take 7 (second %))))
(map #(x>> %
(x/partition 8 1)
(map (fn [x] [x (- (last x) (first x))]))
(filter (comp (partial > 1000) second))))
(apply concat)
count))
; Evaluation count : 6 in 6 samples of 1 calls.
; Execution time mean : 264.569217 ms
; Execution time std-deviation : 5.173518 ms
; Execution time lower quantile : 260.041933 ms ( 2.5%)
; Execution time upper quantile : 272.853110 ms (97.5%)
; Overhead used : 2.237442 ns
0.264
seconds.
All the other times above reflected a larger payload (by 10 times) I was working against.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Speaking of idiomatic, injest has some tricks ;)
18.776
seconds. Now using just transducers:11.110
seconds.To try with parallism, 10 million is too big - we run out of memory. But we can chunk up our work:
3.467
seconds on my 8 core. Not bad for such simple code. I'd argue this looks more idiomatic though:And that's
2.744
seconds - even better.With your enhancements:
I get
2.263
seconds. Not quite java speed, but waaay easier to work with. And whereas the CL version is fine-tuned for for this super-light-weight transformation, if the work got any heavier, we'd likely start pulling ahead, with our idiomatic code.