@maacl
Last active July 19, 2020 13:58
r/fold parallelism
(ns csv2summap.core
  (:require [clojure.data.csv :as csv]
            [clojure.java.io :as io]
            [clojure.core.reducers :as r]))

(with-open [writer (io/writer "numbers.csv")]
  (csv/write-csv
   writer
   (take 1000000 (repeatedly #(vector (char (+ 65 (rand-int 26))) (rand-int 1000))))))

(defn sum-vals
  ([] {})
  ([m [k v]]
   (update m k (fnil + 0) (Integer/parseInt v))))

(defn merge-sums
  ([] {})
  ([& m] (apply merge-with + m)))

;; r/fold version, should run in parallel on all cores, but does not
(with-open [reader (io/reader "numbers.csv")]
  (doall
   (r/fold
    (/ 1000000 12)
    merge-sums
    sum-vals
    (csv/read-csv reader))))
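
A likely reason the fold above stays on one core: `r/fold` only splits work for collections that implement `CollFold` (in core, persistent vectors and maps). `csv/read-csv` returns a lazy sequence, and for those `r/fold` falls back to a plain sequential `reduce`. The sketch below is not the gist's code. It assumes in-memory rows of the same shape as the CSV (the `rows`, `folded`, and `reduced` names and the chunk size 8192 are illustrative) and uses a two-argument combiner instead of the variadic one, so it can run without the CSV file.

```clojure
(require '[clojure.core.reducers :as r])

;; Illustrative stand-in for the parsed CSV: a vector of [letter number-string] rows.
(def rows
  (vec (repeatedly 100000
                   #(vector (str (char (+ 65 (rand-int 26))))
                            (str (rand-int 1000))))))

;; Reducing step: add one row's value to the running per-key sum.
(defn sum-vals
  ([] {})
  ([m [k v]]
   (update m k (fnil + 0) (Integer/parseInt v))))

;; Combining step: r/fold calls it with no args (seed) and with two partial maps.
(defn merge-sums
  ([] {})
  ([a b] (merge-with + a b)))

;; A vector is foldable, so r/fold may split it into chunks of at most
;; 8192 rows and reduce them on the fork/join pool.
(def folded (r/fold 8192 merge-sums sum-vals rows))

;; Sequential reference result for comparison.
(def reduced (reduce sum-vals {} rows))
```

Applied to the gist, the equivalent change would be to realize the rows first, for example `(vec (csv/read-csv reader))` inside the `with-open`. Reading and parsing the file still happens sequentially, so only the summing step can run in parallel.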
(defproject csv2summap "0.1.0-SNAPSHOT"
  :description "FIXME: write description"
  :url "http://example.com/FIXME"
  :license {:name "EPL-2.0 OR GPL-2.0-or-later WITH Classpath-exception-2.0"
            :url "https://www.eclipse.org/legal/epl-2.0/"}
  :dependencies [[org.clojure/clojure "1.10.1"]
                 [org.clojure/data.csv "1.0.0"]]
  :repl-options {:init-ns csv2summap.core})