Skip to content

Instantly share code, notes, and snippets.

@annashipman
Last active September 20, 2021 21:22
Show Gist options
  • Save annashipman/d3b0533ce26df1e4dd84fbc3001e98dc to your computer and use it in GitHub Desktop.
Save annashipman/d3b0533ce26df1e4dd84fbc3001e98dc to your computer and use it in GitHub Desktop.
Debugging Clojure/Vagrant for 7 concurrency models in 7 weeks
I am trying to replicate an example from this book: https://uk.bookshop.org/books/seven-concurrency-models-in-seven-weeks-when-threads-unravel/9781937785659
Running the below commands in a lein repl.
In the book, the second `parallel-sum` function executes 2.5x as fast as `sum` but on my Vagrant box (which has 4 cores, and demonstrates other parallelised code having a speed up effect) the two functions take pretty much the same time every time. My Vagrantfile is here https://github.com/annashipman/7weeks-concurrency/blob/main/Vagrantfile
This works in a non-Vagrant Clojure environment, i.e parallel-sum runs much faster.
So, what should I do differently in Vagrant to get it to work?
(ns sum.core
(:require [clojure.core.reducers :as r]))
(defn sum [numbers]
(reduce + numbers))
(defn parallel-sum [numbers]
(r/fold + numbers))
(def numbers (into [] (range 0 10000000)))
(time (sum numbers))
(time (sum numbers))
(time (parallel-sum numbers))
(time (parallel-sum numbers))
@gildegoma
Copy link

After learning of LEIN_JVM_OPTS, I did following tests on a Vagrant/Virtualbox VM with 4Gb of RAM (note that I intentionally kept the default garbage collector options, and just set the Heap Size in the second example).

With Leiningen default JVM Heap Size the sum function is hammered by the garbage collector:

$ LEIN_JVM_OPTS="-verbose:gc" lein repl
[...]
Clojure 1.10.3
OpenJDK 64-Bit Server VM 1.8.0_292-8u292-b10-0ubuntu1~16.04.1-b10
[...]

user=> (load-file "test-sum.clj")
[GC (Allocation Failure)  253938K->141790K(367616K), 0.0845617 secs]
[Full GC (Ergonomics)  141790K->141199K(452096K), 0.6545601 secs]
[GC (Allocation Failure)  365967K->323434K(436736K), 0.1540708 secs]
[Full GC (Ergonomics)  323434K->322577K(588288K), 1.0697589 secs]
[GC (Allocation Failure)  435217K->322882K(648192K), 0.0675298 secs]
"Elapsed time: 1445.108388 msecs"
[GC (Allocation Failure)  435522K->322890K(645120K), 0.0043745 secs]
[GC (Allocation Failure)  435530K->322890K(646656K), 0.0047949 secs]
"Elapsed time: 153.090624 msecs"
49999995000000

On the other hand, parallel-sum function runs smoothly:

$ LEIN_JVM_OPTS="-verbose:gc" lein repl
[...]
Clojure 1.10.3
OpenJDK 64-Bit Server VM 1.8.0_292-8u292-b10-0ubuntu1~16.04.1-b10
[...]
user=> (load-file "test-parallel-sum.clj")
[GC (Allocation Failure)  239162K->78419K(293376K), 0.0558904 secs]
[Full GC (Ergonomics)  78419K->77898K(360448K), 0.2834019 secs]
[GC (Allocation Failure)  279626K->279372K(421376K), 0.1765328 secs]
[Full GC (Ergonomics)  279372K->278413K(570368K), 0.9371647 secs]
[GC (Allocation Failure)  419725K->323027K(639488K), 0.0614067 secs]
[GC (Allocation Failure)  464339K->323027K(634880K), 0.0221538 secs]
"Elapsed time: 227.864323 msecs"
[GC (Allocation Failure)  465875K->323027K(637952K), 0.0186388 secs]
"Elapsed time: 108.615041 msecs"
49999995000000

Now if you let clojure take advantage of 3Gb of memory, you see well that the "random" slownesses are caused by the GC work.

$ LEIN_JVM_OPTS="-verbose:gc -Xms3g -Xmx3g" lein repl
[...]
Clojure 1.10.3
OpenJDK 64-Bit Server VM 1.8.0_292-8u292-b10-0ubuntu1~16.04.1-b10
[...]
user=> (load-file "test-sum.clj")
"Elapsed time: 132.713757 msecs"
[GC (Allocation Failure)  827748K->325129K(3014656K), 0.2017542 secs]
"Elapsed time: 325.798749 msecs"
49999995000000

Same for parallel-sum usage:

$ LEIN_JVM_OPTS="-verbose:gc -Xms3g -Xmx3g" lein repl
[...]
Clojure 1.10.3
OpenJDK 64-Bit Server VM 1.8.0_292-8u292-b10-0ubuntu1~16.04.1-b10
[...]
user=> (load-file "test-parallel-sum.clj")
"Elapsed time: 116.657475 msecs"
[GC (Allocation Failure)  827812K->325299K(3014656K), 0.1895846 secs]
"Elapsed time: 270.828002 msecs"
49999995000000

Notes: The test-sum.clj and test-parallel-sum.clj files are copy of the above script ending by only calling the two calls of the function in the filename (used execute runs independently to avoid some memory caching effects).

@gildegoma
Copy link

Running the same test cases on a hardware machine (here a MBP Core i7 2.7GHz 16Gb RAM) and OpenJDK 17 (not OpenJDK 1.8), the garbage collector is now G1 by default and the results are:

$ lein repl 
[...]
Clojure 1.10.3
OpenJDK 64-Bit Server VM 17+0
[...]
user=> (load-file "test-sum.clj")
"Elapsed time: 307.124373 msecs"
"Elapsed time: 359.415788 msecs"
49999995000000
$ lein repl 
[...]
Clojure 1.10.3
OpenJDK 64-Bit Server VM 17+0
[...]
user=> (load-file "test-parallel-sum.clj")
"Elapsed time: 175.177959 msecs"
"Elapsed time: 137.79619 msecs"
49999995000000

and with -Xms3g -Xmx3g:

$ LEIN_JVM_OPTS="-Xms3g -Xmx3g" lein repl
[...]
Clojure 1.10.3
OpenJDK 64-Bit Server VM 1.8.0_292-8u292-b10-0ubuntu1~16.04.1-b10
[...]
user=> (load-file "test-sum.clj")
"Elapsed time: 144.63988 msecs"
"Elapsed time: 117.451834 msecs"
49999995000000
$ LEIN_JVM_OPTS="-Xms3g -Xmx3g" lein repl
[...]
Clojure 1.10.3
OpenJDK 64-Bit Server VM 1.8.0_292-8u292-b10-0ubuntu1~16.04.1-b10
[...]
user=> (load-file "test-parallel-sum.clj")
"Elapsed time: 178.446205 msecs"
"Elapsed time: 74.989178 msecs"
49999995000000

@gildegoma
Copy link

⚠️ The above examples are only ran 1 time each, I don't pretend to present any concurrency benchmarking, but demonstrate that GC impact is tremendous and can prevent to compare parallelism gains.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment