Skip to content

Instantly share code, notes, and snippets.

@green-coder
Forked from mfikes/tree-seq.clj
Last active April 8, 2018 20:43
Show Gist options
  • Save green-coder/3adf11660b7b0ca83648c5be69de2a3b to your computer and use it in GitHub Desktop.
Save green-coder/3adf11660b7b0ca83648c5be69de2a3b to your computer and use it in GitHub Desktop.
Directly-reducible PoC in Clojure
(defn nth' [coll n]
(transduce (drop n) (completing #(reduced %2)) nil coll))
(defn tree-seq'
[branch? children root]
(eduction (take-while some?) (map first)
(iterate (fn [[node pair]]
(when-some [[[node' & r] cont] (if (branch? node)
(if-some [cs (not-empty (children node))]
[cs pair]
pair)
pair)]
(if (some? r)
[node' [r cont]]
[node' cont])))
[root nil])))
;; Adapted from nth' because we now have an transducer and not a sequence.
(defn nth'' [xf coll n]
(transduce (comp xf (drop n))
(completing #(reduced %2))
nil
coll))
;; My transducer implementation of `clojure.core.tree-seq`
(defn tree-seq''
([branch? children]
(fn [rf]
(fn xf
([] (rf))
([result] (rf result))
([result input]
(let [result (rf result input)]
(if (reduced? result) result
(if (branch? input)
(loop [r result
cs (children input)]
(if (seq cs)
(let [r' (xf r (first cs))]
(if (reduced? r') r'
(recur r' (rest cs))))
result))
result))))))))
(time (dotimes [x 10000] (nth' (tree-seq seq? identity (repeat 10 (repeat 100 [1 2 (repeat 100 3)]))) 1000)))
;; "Elapsed time: 6161.53902 msecs"
(time (dotimes [x 10000] (nth' (tree-seq' seq? identity (repeat 10 (repeat 100 [1 2 (repeat 100 3)]))) 1000)))
;; "Elapsed time: 2830.583306 msecs"
;; Please test on your CPU, it should be a lot faster.
;; About less than half the time above.
(time (dotimes [x 10000]
(nth'' (tree-seq'' sequential? identity)
(repeat 10 (repeat 100 [1 2 (repeat 100 3)]))
1000)))
@green-coder
Copy link
Author

There might be a local optimization around the seq, first and rest at Line 39 where we iterate on the sequence instead of the vector. To be tested.

@reborg
Copy link

reborg commented Feb 19, 2018

Good job! The speed boost seems related more to the different implementation of tree-seq than transducers, which probably calls for a new name, as your tree-seq'' is not producing a lazy seq. Naming apart, there could be scenarios where your fully realized tree-walk (here's a better name) could be useful. For instance:

(let [coll ["a" [1 2 [:a [] :b] [:c] 4] :d]]
  (sequence
    (comp
      (mapcat #(tree-seq vector? identity %))
      (filter keyword?)
      (map name))
    coll))

Is not that far from yours:

(let [coll ["a" [1 2 [:a [] :b] [:c] 4] :d]]
  (sequence
    (comp
      (tree-seq'' vector? identity)
      (filter keyword?)
      (map name))
    coll))

but the former has the needlessly lazy (and slower) tree-seq step.

@green-coder
Copy link
Author

Thank you for your feedback. I too agree that the name has to be changed. It was named tree-seq'' here only for comparison between implementations.

This code will be re-worked a little bit and may be added to the xform library, let's move the conversation there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment