Created: May 3, 2020 19:31
Benchmarking D's exception implementation
/// Context: I am currently working on improving debug info formatting in druntime.
/// There are a few obvious things which are **very** wrong, and performance seems abysmal.
/// The following is the first benchmark code I came up with:
import std.stdio;

void main ()
{
    try
    {
        foo();
    }
    catch (Exception e)
    {
        writeln(e);
    }
}

void foo ()
{
    // Using an empty delegate and a pre-allocated exception to avoid benchmarking
    // the GC or IO. We only want the overhead of throwing / catching exceptions
    scope void delegate(const(char)[] chunk) devNull = (const(char)[] chunk) {};
    Exception e = new Exception("Statically allocated");
    foreach (i; 0 .. 100_000) // Started with 1M, was a bad idea
    {
        try bar(e);
        catch (Exception ce)
        {
            // Also check the overhead of a no-op `toString`
            // This triggers the backtrace code in `rt.backtrace.dwarf`
            version (WithStackTrace) ce.toString(devNull);
        }
    }
}

void bar (Exception e)
{
    if (e !is null)
        throw e;
}
// Benchmarking time!
// I used https://github.com/sharkdp/hyperfine with `--warmup 3` as argument
// (`hyperfine --warmup 3 ./benchmark_exceptions.d`)
//
// Our test matrix has 4 dimensions:
// - Platform: OSX 10.15.4 or Alpine Linux Edge (on Docker for Mac, so expect worse results)
// - Compiler: dmd v2.091.1 or ldc v1.21.0
// - With or without debug info (`-g` flag)
// - With or without `version=WithStackTrace`
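The matrix above can be enumerated mechanically. Here is a sketch of a driver script; the loop structure, the `&&` compile-and-run chaining, and the output binary name are my assumptions — it only prints the commands, nothing is compiled or executed:

```shell
#!/bin/sh
# Print the hyperfine invocation for every combination of the matrix above
# (compiler x -g x WithStackTrace). Assumes the compiler emits the default
# binary name `benchmark_exceptions`; this script only echoes the commands.
SRC=benchmark_exceptions.d
count=0
for compiler in dmd ldc2; do
    # dmd and ldc2 spell the version flag differently
    if [ "$compiler" = dmd ]; then
        vflag=-version=WithStackTrace
    else
        vflag=--d-version=WithStackTrace
    fi
    for g in "" -g; do
        for v in "" "$vflag"; do
            echo "hyperfine --warmup 3 '$compiler $g $v $SRC && ./benchmark_exceptions'"
            count=$((count + 1))
        done
    done
done
echo "$count combinations per platform"
```

That is 8 combinations per compiler/flag axis on each platform, matching the 4-dimensional matrix once the platform is fixed.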
//////////////////// OSX ////////////////////
// Command: dmd benchmark_exceptions.d
// Time (mean ± σ): 291.4 ms ± 17.3 ms [User: 289.1 ms, System: 0.8 ms]
// Range (min … max): 270.3 ms … 320.9 ms 10 runs
//
// Command: dmd -g benchmark_exceptions.d
// Time (mean ± σ): 288.5 ms ± 11.4 ms [User: 286.5 ms, System: 0.7 ms]
// Range (min … max): 270.9 ms … 303.5 ms 10 runs
//
// As we can see, there is not much difference when `-g` is added, because the debug info is simply not used.
//
// Command: dmd -version=WithStackTrace benchmark_exceptions.d
// Time (mean ± σ): 7.883 s ± 0.683 s [User: 7.851 s, System: 0.014 s]
// Range (min … max): 7.261 s … 9.121 s 10 runs
//
// Wow, that's quite a slowdown. Bear in mind that without `-g`, printing a stack trace
// will simply print the names of the functions, not the file/line information.
// The difference with the previous case is 7.6s, that is, 76µs per `toString` call.
// There is a large variance, which suggests that GC pauses are involved.
//
// Command: dmd -g -version=WithStackTrace benchmark_exceptions.d
// Time (mean ± σ): 25.005 s ± 1.646 s [User: 24.879 s, System: 0.058 s]
// Range (min … max): 22.518 s … 27.120 s 10 runs
//
// Yep, this benchmark took *5 minutes* to complete across its 10 runs. When both `-g` and
// `-version=WithStackTrace` are used, file and line information is read and printed,
// bringing the total cost to almost 250µs per thrown exception. A quarter of a millisecond
// per exception adds up quickly, and having this be the default formatting is not doing us a favor.
//
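To make the per-call figures concrete, here is the arithmetic behind them, using the hyperfine means measured above on OSX with dmd (the breakdown into per-call microseconds is my calculation):

```python
# Per-call overhead derived from the OSX/dmd hyperfine means above.
iterations = 100_000

baseline_s = 0.2914        # plain throw/catch, no toString
with_trace_s = 7.883       # -version=WithStackTrace, no -g
with_trace_g_s = 25.005    # -version=WithStackTrace and -g

# Cost of the no-op toString (function names only, no file/line):
tostring_us = (with_trace_s - baseline_s) / iterations * 1e6
print(f"toString: ~{tostring_us:.0f} us per call")          # ~76 us

# Extra cost of resolving file/line info from the debug info (-g):
file_line_us = (with_trace_g_s - with_trace_s) / iterations * 1e6
print(f"-g file/line lookup: +{file_line_us:.0f} us per call")  # ~171 us
```

Together these add up to roughly 250µs per thrown-and-printed exception in the `-g` case.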
// Time to benchmark LDC...
// Command: ldc2 benchmark_exceptions.d
// Time (mean ± σ): 92.9 ms ± 1.5 ms [User: 90.8 ms, System: 1.8 ms]
// Range (min … max): 90.4 ms … 96.2 ms 31 runs
//
// Things look good! It's already more than 3 times faster than DMD.
// Now, DMD is touted as the development compiler, while LDC is the production compiler,
// so it's only fair that we include `-O` in our tests.
// However, for this test, I've not seen any difference between `-O`, `-O2` and `-O3`.
// I also tested with `-g`. While it did slow things down a tiny bit, it is not worth dwelling on.
//
// Command: ldc2 --d-version=WithStackTrace benchmark_exceptions.d
// Time (mean ± σ): 28.095 s ± 0.927 s [User: 27.960 s, System: 0.064 s]
// Range (min … max): 26.622 s … 29.746 s 10 runs
//
// This is an interesting result. While plain exception throwing was faster than with DMD,
// attempting to print the stack trace absolutely destroys performance,
// making LDC more than 3 times slower than DMD here.
// I was quite surprised by this, so I did a run with `-O3`.
// It did improve things, but the results were still >= 25s.
//
// Command: ldc2 -g --d-version=WithStackTrace benchmark_exceptions.d
// Time (mean ± σ): 32.255 s ± 1.591 s [User: 31.996 s, System: 0.088 s]
// Range (min … max): 30.589 s … 35.812 s 10 runs
//
// Command: ldc2 -O2 -g --d-version=WithStackTrace benchmark_exceptions.d
// Time (mean ± σ): 24.505 s ± 1.482 s [User: 24.396 s, System: 0.052 s]
// Range (min … max): 22.569 s … 26.980 s 10 runs
//
// So as we can see, things are *extremely* slow, with every single run taking more than 22 seconds.
// Of course this benchmark is artificial and incomplete.
// First, because you usually don't need to throw 100k exceptions and print their stack traces.
// Second, because it was done on a work machine that was in use, so system load affects the results.
// It does however show a trend, and suggests there are many low-hanging fruits here.
//
//////////////////// Linux ////////////////////
// Command: dmd benchmark_exceptions.d
// Time (mean ± σ): 303.4 ms ± 13.7 ms [User: 292.3 ms, System: 2.4 ms]
// Range (min … max): 284.4 ms … 323.5 ms 10 runs
//
// Not much difference (bear in mind we're running inside a VM).
//
// Command: dmd -version=WithStackTrace benchmark_exceptions.d
// A run was > 12 minutes, so I gave up...
// I then reduced the sample size from 100_000 to 1_000:
// Time (mean ± σ): 6.797 s ± 0.351 s [User: 284.1 ms, System: 827.5 ms]
// Range (min … max): 6.214 s … 7.419 s 10 runs
//
// We can see it scales roughly linearly: 100 times the iterations would be ~11 minutes,
// in line with the aborted run above.
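A quick check of that extrapolation, using the reduced-run mean from the hyperfine output above:

```python
# Extrapolate the reduced Linux run (1_000 iterations) back to 100_000,
# assuming the cost scales linearly with the iteration count.
mean_1k_s = 6.797             # hyperfine mean with 1_000 iterations
scaled_s = mean_1k_s * 100    # 100x the iterations
print(f"~{scaled_s / 60:.0f} minutes for 100_000 iterations")  # ~11 minutes
```

The aborted full-size run (> 12 minutes including hyperfine's warmup) is consistent with this estimate.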
//
// Command: dmd -g -version=WithStackTrace benchmark_exceptions.d
// Time (mean ± σ): 7.337 s ± 0.551 s [User: 530.5 ms, System: 880.0 ms]
// Range (min … max): 6.497 s … 8.351 s 10 runs
//
// Not bad considering how slow it is on Mac...
// Command: ldc2 --d-version=WithStackTrace benchmark_exceptions.d
// Time (mean ± σ): 17.5 ms ± 2.0 ms [User: 8.7 ms, System: 1.9 ms]
// Range (min … max): 13.6 ms … 23.8 ms 185 runs
//
// This result should make anyone suspicious. And indeed, it simply doesn't work:
// one does not simply get a stack trace with LDC on Linux (or at least on Alpine).
// See https://github.com/ldc-developers/ldc/issues/863
/////////////////////////////////////////////