@oskopek
Last active September 3, 2015 13:17
Statistical benchmarking

  • Goal: Rank solvers independently of PCs, JVMs, processes, OSs, runs, … (as much as possible), so that the ranking depends only on the Solver. ✓

  • Means: Run each single benchmark multiple (N) times (on different JVMs) and evaluate the results statistically. ✓

    • How big should N be? (calculate statistically) (Bloch: 30+) X

      • Student’s t-test ✓ (on the fly? + append another run) X

  • Statistics - results

    • Have to be modular ✓

      • e.g.: avg, min, max, median, geom. mean, std. dev. ✓

    • Discussion with Jirka: evaluate against a function (1 hard ~ 300 soft) X

    • Library support? Implement our own? X

  • Report ✓

    • Examples ✓

      • Box plot ✓ JFreeChart? ✓

      • Candlestick diagram ✓ JFreeChart? ✓

        • Difference vs. box plot? ✓

      • Violin plot ✓ JFreeChart? X

        • Box plot vs. Violin plot? Choose one. See this article for inspiration.

        • I prefer the Violin plot, but it doesn’t have an implementation in JFreeChart.

    • Show as a layer above the current summary/other graphs (tabs in tabs) ?

    • Do we have support in JFreeChart? Do we need additional libraries? ✓

  • After resolution ?

    • Test reliability: Single thread vs. 2 vs. 4 ?

    • Validate old performance blog post benchmarks ?

  • Research ✓

    • Read up on performance and statistics ✓

  • Implementation ✓
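
The "Student's t-test" item above can be illustrated with a small sketch. It assumes each solver's N runs are reduced to one numeric result per run and that the two-sample, pooled-variance form of the test is the one meant; the outline does not say which variant is intended. The class and method names (`TTestSketch`, `pooledT`) and the sample data are hypothetical. The sketch only computes the t statistic and the degrees of freedom; turning that into a p-value needs a t-distribution table or a statistics library.

```java
/** Hypothetical sketch: two-sample Student's t statistic over benchmark runs. */
final class TTestSketch {

    /** Arithmetic mean of the samples. */
    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    /** Unbiased sample variance (divides by n - 1). */
    static double sampleVariance(double[] xs) {
        double m = mean(xs);
        double squares = 0.0;
        for (double x : xs) {
            squares += (x - m) * (x - m);
        }
        return squares / (xs.length - 1);
    }

    /** Degrees of freedom of the pooled two-sample test. */
    static int degreesOfFreedom(double[] a, double[] b) {
        return a.length + b.length - 2;
    }

    /** Student's t statistic with pooled variance (assumes equal variances). */
    static double pooledT(double[] a, double[] b) {
        int na = a.length;
        int nb = b.length;
        double pooledVariance =
                ((na - 1) * sampleVariance(a) + (nb - 1) * sampleVariance(b))
                        / degreesOfFreedom(a, b);
        double standardError = Math.sqrt(pooledVariance * (1.0 / na + 1.0 / nb));
        return (mean(a) - mean(b)) / standardError;
    }

    public static void main(String[] args) {
        // Made-up results of five runs per solver, for illustration only.
        double[] solverA = {10.0, 12.0, 11.0, 13.0, 14.0};
        double[] solverB = {8.0, 9.0, 10.0, 9.0, 9.0};
        System.out.println("t  = " + pooledT(solverA, solverB));
        System.out.println("df = " + degreesOfFreedom(solverA, solverB));
    }
}
```

For the "on the fly" idea, one possible approach is to recompute the statistic after each appended run and stop once it clears a chosen critical value. Repeatedly testing while adding runs inflates the false-positive rate, so that stopping rule would need its own correction.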

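The "have to be modular" requirement for the result statistics could look roughly like the following sketch: a registry that maps a name to a function over the run results, so adding a statistic is one extra entry. All names here (`RunStatistics`, `STATISTICS`, `summarize`) are hypothetical and not taken from any existing code. It assumes one `double` per run, and the geometric mean is only meaningful when every value is positive.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToDoubleFunction;

/** Hypothetical sketch: pluggable summary statistics over N benchmark runs. */
final class RunStatistics {

    /** Registry of named statistics, kept in insertion order for reporting. */
    static final Map<String, ToDoubleFunction<double[]>> STATISTICS = new LinkedHashMap<>();

    static {
        STATISTICS.put("avg", RunStatistics::mean);
        STATISTICS.put("min", xs -> Arrays.stream(xs).min().getAsDouble());
        STATISTICS.put("max", xs -> Arrays.stream(xs).max().getAsDouble());
        STATISTICS.put("median", RunStatistics::median);
        STATISTICS.put("geomMean", RunStatistics::geometricMean);
        STATISTICS.put("stdDev", RunStatistics::standardDeviation);
    }

    static double mean(double[] xs) {
        return Arrays.stream(xs).sum() / xs.length;
    }

    /** Median of a sorted copy; averages the two middle values for even n. */
    static double median(double[] xs) {
        double[] sorted = xs.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    /** Geometric mean via logarithms; assumes all values are positive. */
    static double geometricMean(double[] xs) {
        double logSum = 0.0;
        for (double x : xs) {
            logSum += Math.log(x);
        }
        return Math.exp(logSum / xs.length);
    }

    /** Sample standard deviation (divides by n - 1). */
    static double standardDeviation(double[] xs) {
        double m = mean(xs);
        double squares = 0.0;
        for (double x : xs) {
            squares += (x - m) * (x - m);
        }
        return Math.sqrt(squares / (xs.length - 1));
    }

    /** Applies every registered statistic to the given run results. */
    static Map<String, Double> summarize(double[] runs) {
        Map<String, Double> summary = new LinkedHashMap<>();
        STATISTICS.forEach((name, statistic) -> summary.put(name, statistic.applyAsDouble(runs)));
        return summary;
    }

    public static void main(String[] args) {
        // Made-up run results, for illustration only.
        double[] runs = {1.0, 2.0, 4.0, 8.0};
        summarize(runs).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

A box plot needs little beyond this: the median plus the quartiles and the extremes, so quartile functions could be registered the same way. Whether to hand-roll these or take them from a library is the open "Library support?" question above.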