This gist compares the performance of Julia, Nim, C++ and R - the latter using either POMP, or LibBi in a simple simulation of an SIR epidemiological model. In addition to keeping track of susceptibles, infecteds and recovereds, I also store the cumulative number of infections. Time moves in discrete steps, and the algorithm avoids language-specific syntax features to make the comparison as fair as possible, including using the same algorithm for generating binomial random numbers and the same random number generator; the exception are the R versions, POMP uses the standard R Mersenne Twister for the random number generator; I'm not sure what LibBi uses. The algorithm for generating random binomial numbers is only really suitable for small np.
Benchmarks were run on a Mac Pro (Late 2013), with 3 Ghz 8-core Intel Xeon E3, 64GB 1866 Mhz RAM, running OSX v 10.11.3 (El Capitan), using Julia v0.6.1, Nim v0.17.3, clang v5.0.0, emscripten v1.37.34, node v8.9.1, and R v3.3.2.
Compile using:
nim c -d:release --passC="-flto -ffast-math" sir
Run:
./sir
Output:
0.038819
20.75
The first number is time taken in s, the second the result of averaging the cumulative number of infections (Y
in the code).
Based on the comments below, I wrote an optimised version, sir_opt.nim
.
nim c -d:release --passC="-flto -ffast-math" sir_opt
./sir_opt
Output:
0.031404
20.964
Nim can also compile to Webassembly via Emscripten; using nim.cfg
linked below:
nim c -d:release -d:wasm -o:sir_opt.js sir_opt
node ./sir_opt.js
Output:
0.052
20.964
Run:
julia sir.jl
Output:
0.058961 seconds (2.09 k allocations: 232.998 KiB)
21.019
There is also a modified version that uses tuples (and hence is also a bit neater), adds in a couple of macros, and uses an algorithm to work out random binomials that is faster for small np. Thanks to @DNF2 and @ChrisRackauckas (see below) for tips here.
julia sir_opt.jl
0.045802 seconds (87 allocations: 14.123 KiB)
21.028
Compilation:
g++ -std=c++11 -O3 -o sir_cpp sir.cpp
Running:
./sir_cpp
Output:
0.050181
20.708
To compile the C++ code into Webassembly, I use Emscripten.
Compilation:
em++ -s WASM=1 -s SINGLE_FILE=1 -std=c++11 -O3 -o sir_cpp.js sir.cpp
Running:
node sir_cpp.js
Output:
0.07
20.708
This C++ code uses code from random_real.c, copyright Taylor G. Campbell (2014) and a portable implementation of clzll.
Running:
Rscript sir_pomp.R
Output:
Loading required package: methods
user system elapsed
0.741 0.108 0.849
Running:
Rscript sir_bi.R
Output:
Model: sir
Run time: 0.073982 seconds
Number of samples: 1000
[1] 20.251
Note that it takes much longer in practice than above for rbi to run the model, due to the need for compilation, which takes a lot longer than Julia. Thanks to @sbfnk for some excellent examples.
Any suggestions for improvements welcome!
As it turns out, the benchmarks were much closer on my Mac Pro, so I've edited the readme accordingly. Still trying to work out why there was a difference on my laptop...