Last active
November 19, 2017 17:06
-
-
Save reborg/37c76a61c9a60ddae8de737170e2e675 to your computer and use it in GitHub Desktop.
Explorations around frequencies as a reducing function for transducers, sequential or parallel.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
(require '[criterium.core :refer [quick-bench]]) | |
(require '[clojure.core.reducers :as r]) | |
(import '[java.util HashMap Collections Map] | |
'java.util.concurrent.atomic.AtomicInteger | |
'java.util.concurrent.ConcurrentHashMap) | |
(set! *warn-on-reflection* true) | |
;; 500k maps with the same key. value are overlapping 1/5 of the time. | |
(def data | |
(into [] | |
(map hash-map | |
(repeat :samplevalue) | |
(concat | |
(range 1e5) | |
(range 1e5) | |
(range 1e5) | |
(range 1e5) | |
(range 1e5))))) | |
;; Standard version. Transformations as transducers, frequencies | |
;; applied on the results. | |
(quick-bench | |
(frequencies | |
(eduction | |
(keep :samplevalue) | |
(map int) | |
data))) | |
;; Execution time mean : 236.744558 ms | |
;; frequencies implementation is used as a reducing function. A transient is updated | |
;; during reduction, it is made persistent as the last step. | |
(quick-bench | |
(transduce | |
(comp | |
(keep :samplevalue) | |
(map int)) | |
(completing #(assoc! %1 %2 (inc (get %1 %2 0))) persistent!) | |
(transient {}) | |
data)) | |
;; Execution time mean : 207.890362 ms | |
;; Going parallel on a mutable Java concurrent-hashmap takes the time down consistently. | |
;; Restriction1: transducers (if any) need to be stateless. | |
;; Restriction2: nil is not allowed, so they are always removed. | |
;; Bonus: the output is converted into persistent at the end. You can | |
;; remove that step for an additional speedup (~20ms). | |
(quick-bench | |
(let [m (ConcurrentHashMap. (quot (count data) 2) 0.75 (.availableProcessors (Runtime/getRuntime))) | |
combinef (fn ([] m) ([_ _])) | |
rf (fn [^Map m k] | |
(let [^AtomicInteger v (or (.get m k) (.putIfAbsent m k (AtomicInteger. 1)))] | |
(when v (.incrementAndGet v)) | |
m)) | |
reducef ((comp (keep :samplevalue) (map int)) rf)] | |
(do (r/fold combinef reducef data) | |
(into {} m)))) | |
;; Execution time mean : 45.498409 ms |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment