@briansunter
Last active May 15, 2023 22:28
distinct-by clojure transducer
(defn distinct-by
  "Returns a lazy sequence of the elements of coll, removing duplicates of (f item).
  Returns a stateful transducer when no collection is provided."
  {:added "1.0"}
  ([f]
   (fn [rf]
     (let [seen (volatile! #{})]
       (fn
         ([] (rf))
         ;; Completion must delegate to rf so downstream steps can flush their state.
         ([result] (rf result))
         ([result input]
          (let [value (f input)]
            (if (contains? @seen value)
              result
              (do (vswap! seen conj value)
                  (rf result input)))))))))
  ([f coll]
   (let [step (fn step [xs seen]
                (lazy-seq
                 ;; recur cannot cross the lazy-seq boundary, so duplicates are
                 ;; skipped inside an inner fn, as clojure.core/distinct does.
                 ((fn [xs seen]
                    (when-let [s (seq xs)]
                      (let [h (first s)
                            value (f h)]
                        (if (contains? seen value)
                          (recur (rest s) seen)
                          (cons h (step (rest s) (conj seen value)))))))
                  xs seen)))]
     (step coll #{}))))
@raymond-w-ko

For anyone not looking closely, the transducer version (distinct-by f) in the first half does not work properly. input should probably be (f input) in both places, and ideally it should be (let [value (f input)] ... in case f has side effects.
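
To illustrate the point about side effects, here is a minimal sketch of a transducer that computes (f input) once per item and uses that value for both the lookup and the update. The names distinct-by-xf, calls, counting-name, and people are hypothetical and exist only for this example; it assumes nothing beyond clojure.core.

```clojure
(defn distinct-by-xf
  "Hypothetical name: stateful transducer keeping the first item per (f item)."
  [f]
  (fn [rf]
    (let [seen (volatile! #{})]
      (fn
        ([] (rf))
        ([result] (rf result))
        ([result input]
         (let [value (f input)]          ; f is called exactly once per input
           (if (contains? @seen value)
             result
             (do (vswap! seen conj value)
                 (rf result input)))))))))

;; Hypothetical counter used to observe how often the key fn runs.
(def calls (atom 0))

(defn counting-name
  "Hypothetical key fn with a side effect: counts its invocations."
  [m]
  (swap! calls inc)
  (:name m))

(def people [{:name "Bob" :id 1} {:name "Jane" :id 2} {:name "Bob" :id 3}])

(def result (into [] (distinct-by-xf counting-name) people))
;; result => [{:name "Bob" :id 1} {:name "Jane" :id 2}]
;; @calls => 3
```

Binding the key in a let means a side-effecting f runs once per element rather than once for the lookup and again for the update.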

@thenonameguy

Since this gist has been blessed by Google when you search for "distinct-by transducer", here is a correct implementation based on @raymond-w-ko's feedback:
https://gist.github.com/thenonameguy/714b4a4aa5dacc204af60ca0cb15db43
Copy-paste away!

@raspasov

raspasov commented Apr 9, 2021

The excellent medley library has a correct (distinct-by f) transducer:

https://github.com/weavejester/medley/

@bdevel

bdevel commented May 12, 2023

Here's a simple version for anyone else who comes across this page.

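(defn uniq-by
  "Returns a vector of items with duplicates of (f item) removed, keeping the
  first occurrence of each. Eager: consumes all of items."
  [f items]
  (:uniq (reduce (fn [acc m]
                   (let [k (f m)]
                     (if (get-in acc [:seen k])
                       acc
                       (-> acc
                           (update :seen assoc k true)
                           (update :uniq conj m)))))
                 {:seen {} :uniq []}
                 items)))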

(comment
  (uniq-by :name [{:name "Bob"} {:name "Jane"} {:name "Bob"}])
  ;; => [{:name "Bob"} {:name "Jane"}]
  )
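
One difference worth noting: a reduce-based version like this is eager, so it consumes the whole input before returning anything. Below is a rough sketch of a lazy variant, under the hypothetical name lazy-uniq-by and assuming only clojure.core, which can also be applied to an infinite sequence.

```clojure
(defn lazy-uniq-by
  "Hypothetical name: lazily yields the first item seen for each (f item)."
  [f items]
  (letfn [(step [xs seen]
            (lazy-seq
             ;; recur cannot cross the lazy-seq boundary, so duplicates are
             ;; skipped inside an inner fn, as clojure.core/distinct does.
             ((fn [xs seen]
                (when-let [s (seq xs)]
                  (let [x (first s)
                        k (f x)]
                    (if (contains? seen k)
                      (recur (rest s) seen)
                      (cons x (step (rest s) (conj seen k)))))))
              xs seen)))]
    (step items #{})))

;; Works on an infinite input because only the needed prefix is realized.
(def first-three (doall (take 3 (lazy-uniq-by #(quot % 2) (range)))))
;; first-three => (0 2 4)
```

The trade-off is that the seen set lives for as long as the sequence is being consumed, so memory grows with the number of distinct keys.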
