(ns benchmark-assoc.core
  (:require [criterium.core :as crit]))

(defn no-assoc [items]
  (apply hash-map (vec (flatten items))))

(defn assoc-all [items]
  (apply assoc {} (vec (flatten items))))

(defn assoc-each-one [items]
  (reduce (fn [h [k v]] (assoc h k v)) {} (map vec items)))

(defn benchmark-assoc []
  (doseq [n [1e4 1e5 1e6]]
    (let [pairs (take n (iterate #(map inc %) [0 0]))]
      (println (str "--- " n " pairs"))
      (println)
      (println "Initialize with all")
      (crit/bench (no-assoc pairs))
      (println)
      (println "Associng in one go")
      (crit/bench (assoc-all pairs))
      (println)
      (println "Associng one-by-one")
      (crit/bench (assoc-each-one pairs)))))
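As a quick sanity check (my addition, not part of the original gist), the three strategies should all build the same map from a small input generated the same way as the benchmark's pairs:

```clojure
;; Sanity check (my addition, not from the gist): the three strategies
;; should build identical maps from the same small input.
(defn no-assoc [items]
  (apply hash-map (vec (flatten items))))

(defn assoc-all [items]
  (apply assoc {} (vec (flatten items))))

(defn assoc-each-one [items]
  (reduce (fn [h [k v]] (assoc h k v)) {} (map vec items)))

;; Three pairs built like the benchmark does: ([0 0] (1 1) (2 2))
(def pairs (take 3 (iterate #(map inc %) [0 0])))

(println (no-assoc pairs))               ; => {0 0, 1 1, 2 2} (order may vary)
(println (= (no-assoc pairs)
            (assoc-all pairs)
            (assoc-each-one pairs)))     ; => true
```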
--- 10000.0 pairs

Initialize with all
Evaluation count : 960 in 60 samples of 16 calls.
             Execution time mean : 67.632676 ms
    Execution time std-deviation : 2.608085 ms
   Execution time lower quantile : 62.879875 ms ( 2.5%)
   Execution time upper quantile : 72.020153 ms (97.5%)

Associng in one go
Evaluation count : 960 in 60 samples of 16 calls.
             Execution time mean : 73.413894 ms
    Execution time std-deviation : 4.324156 ms
   Execution time lower quantile : 68.028500 ms ( 2.5%)
   Execution time upper quantile : 81.069625 ms (97.5%)
Found 1 outliers in 60 samples (1.6667 %)
	low-severe	 1 (1.6667 %)
Variance from outliers : 43.4800 % Variance is moderately inflated by outliers

Associng one-by-one
Evaluation count : 2400 in 60 samples of 40 calls.
             Execution time mean : 24.898720 ms
    Execution time std-deviation : 897.388513 us
   Execution time lower quantile : 23.186725 ms ( 2.5%)
   Execution time upper quantile : 26.579045 ms (97.5%)

--- 100000.0 pairs

Initialize with all
Evaluation count : 120 in 60 samples of 2 calls.
             Execution time mean : 711.468392 ms
    Execution time std-deviation : 37.029515 ms
   Execution time lower quantile : 641.727000 ms ( 2.5%)
   Execution time upper quantile : 799.149188 ms (97.5%)
Found 2 outliers in 60 samples (3.3333 %)
	low-severe	 2 (3.3333 %)
Variance from outliers : 38.4618 % Variance is moderately inflated by outliers

Associng in one go
Evaluation count : 120 in 60 samples of 2 calls.
             Execution time mean : 773.466508 ms
    Execution time std-deviation : 33.222215 ms
   Execution time lower quantile : 730.681725 ms ( 2.5%)
   Execution time upper quantile : 832.623088 ms (97.5%)
Found 1 outliers in 60 samples (1.6667 %)
	low-severe	 1 (1.6667 %)
Variance from outliers : 28.7296 % Variance is moderately inflated by outliers

Associng one-by-one
Evaluation count : 180 in 60 samples of 3 calls.
             Execution time mean : 345.703717 ms
    Execution time std-deviation : 48.639714 ms
   Execution time lower quantile : 307.542333 ms ( 2.5%)
   Execution time upper quantile : 458.668208 ms (97.5%)
Found 4 outliers in 60 samples (6.6667 %)
	low-severe	 2 (3.3333 %)
	low-mild	 2 (3.3333 %)
Variance from outliers : 82.4198 % Variance is severely inflated by outliers
Just a minor comment, as I was also curious what was going on:
I tested this benchmark script under 1.4 and 1.5RC4, and things are MUCH slower on 1.5RC4; it would be interesting to know why...
UPDATE: when I ran https://gist.github.com/4695353 against 1.5RC4, things got much more consistent: about a 1-5% performance loss for all test cases.
Good point, @laczoka, I ran it on 1.4.
Another possible reason is that @lesbroot and I have different numbers of cores in our machines. I'm not sure whether the underlying code leverages multiple cores, but if it does, that could really cause a difference. I have a 2-core machine. All this really proves my point that microbenchmarks are hard (but fascinating).
I cleaned up the code a bit. I factored out `flatten` and `vec`, so the benchmarked functions only contain code that we actually want to benchmark. I also added a `doall` call to realize the flattened list, which implicitly realizes the pairs as well (it does not make that much sense, though, as it only removes a first-call penalty).

After these modifications `no-assoc` wins the race hands down (4.2 ms with 10k pairs).
(defn no-assoc [items]
  (apply hash-map items))

(defn assoc-all [items]
  (apply assoc {} items))

(defn assoc-each-one [items]
  (reduce #(apply assoc % %2) {} items))

(defn benchmark-assoc []
  (doseq [n [1e4 1e5 1e6]]
    (let [pairs (for [i (range n)] [i i])
          flattened (doall (flatten pairs))]
      (println (str "--- " n " pairs"))
      (println)
      (println "Initialize with all")
      (crit/bench (no-assoc flattened))
      (println)
      (println "Associng in one go")
      (crit/bench (assoc-all flattened))
      (println)
      (println "Associng one-by-one")
      (crit/bench (assoc-each-one pairs)))))
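To make the refactoring concrete (my illustration, not part of the comment above): the cleaned-up benchmark feeds two shapes of the same data, pair vectors for `assoc-each-one` and a pre-realized flattened key/value seq for the other two, and all three paths still agree:

```clojure
;; Illustration (my addition): the two input shapes used by the
;; cleaned-up benchmark, built from the same three pairs.
(def pairs (for [i (range 3)] [i i]))          ; ([0 0] [1 1] [2 2])
(def flattened (doall (flatten pairs)))        ; (0 0 1 1 2 2)

(def m1 (apply hash-map flattened))            ; no-assoc path
(def m2 (apply assoc {} flattened))            ; assoc-all path
(def m3 (reduce #(apply assoc % %2) {} pairs)) ; assoc-each-one path

(println (= m1 m2 m3 {0 0, 1 1, 2 2}))         ; => true
```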
I also felt that generating the pairs looks better with a list comprehension. :)
@steerio It's nice to have it factored out and I like your code a lot better, but as far as I can see it should not make a difference as long as I'm only concerned about relative performance. In both the original and your version, `no-assoc` and `assoc-all` work on the same data, so the relative numbers should be the same.
They are not. In my case they are on par, while in yours `no-assoc` is 2x as fast.
To address the original question (`assoc-all` vs. `assoc-each-one`), they seem to be equally performant for both 1e5 and 1e6 pairs.
I guess the reason lies somewhere else. Even if I enforce the realization of the lazy seqs (right before & after the creation of `assoc`) with the `doall` function, the `hash-map` version remains the fastest (and the performance loss is only slightly noticeable).

(disclosure: my version operates on `assoc` with size n/2, unlike your solution (n), but it does not change the fact that it is faster with `no-assoc` and `assoc-all`)
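The realization point discussed in this thread can be observed directly (a small sketch of my own, not from the thread): `doall` walks a lazy seq once, so a benchmarked call on the result no longer pays the realization cost on first use.

```clojure
;; Sketch (my addition): doall forces a lazy seq up front, so later
;; calls don't also pay the realization cost the first time through.
(def kvs (flatten (for [i (range 4)] [i i]))) ; lazy; nothing computed yet
(println (realized? kvs))                     ; => false
(doall kvs)                                   ; walk the seq, realizing it
(println (realized? kvs))                     ; => true
(println (count (apply hash-map kvs)))        ; => 4
```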