chadbrewbaker/easyPerf5Dec2021.md

## easyPerf5Dec2021.md

      
Guest: Andrey Akinshin (@andrey_akinshin)


Benchmarks should have a question about a business decision in mind.
Corner cases are routine.
If a few samples go from 100 ms to 5 seconds, then we don't need fancy methods.
What are practically significant differences in the context of your problem?
Avoid measurements from cold starts - unless that is what you are trying to benchmark.
Measure of central tendency - you usually want the median not the mean.
Median trick - take an odd number of samples to have one number in the middle.
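
A minimal sketch of why the median is usually the better summary, using made-up timings (the `samples_ms` values are hypothetical, not from the talk). With an odd number of samples the median is one of the actual measurements.

```python
import statistics

# Hypothetical timings in milliseconds: mostly ~100 ms plus one 5-second outlier.
samples_ms = [101, 99, 100, 102, 98, 100, 5000]

mean_ms = statistics.mean(samples_ms)      # dragged far upward by the single outlier
median_ms = statistics.median(samples_ms)  # odd count, so this is an observed sample

print(f"mean   = {mean_ms:.1f} ms")
print(f"median = {median_ms} ms")
```

One slow run moves the mean a long way, while the median stays at a value that was really measured.
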
Efficiency of the Harrell-Davis quantile estimator
New Arxiv paper
Perfolizer
Try median absolute deviation instead of standard deviation. Plot the distribution of the medians from each experiment.
Effect size - the difference between medians divided by the median absolute deviation. Understand what noise levels are.
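
A rough sketch of both ideas with the standard library. The helper names `mad` and `effect_size` and the sample data are assumptions for illustration. Dividing by the baseline's MAD is one simple choice; other definitions (for example a pooled MAD) exist.

```python
import statistics

def mad(xs):
    """Median absolute deviation around the median (unscaled)."""
    m = statistics.median(xs)
    return statistics.median(abs(x - m) for x in xs)

def effect_size(baseline, candidate):
    """Shift in medians, expressed in units of the baseline's MAD.

    Assumes mad(baseline) is non-zero; a real tool would have to handle that case.
    """
    shift = statistics.median(candidate) - statistics.median(baseline)
    return shift / mad(baseline)

# Hypothetical timings in milliseconds.
baseline_ms = [100, 102, 98, 101, 99, 100, 103]
candidate_ms = [110, 112, 108, 111, 109, 110, 113]

print("baseline MAD:", mad(baseline_ms))
print("effect size :", effect_size(baseline_ms, candidate_ms))
```

An effect size far above the usual noise level is worth investigating. A value near zero is probably just noise.
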
Take four medians (quantiles) etc to get more resolution. Take 4*(2n+1) samples so you have four exact numbers.
Use sequential analysis - if you are getting steady samples then stop - don't burn your AWS bill.
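
A crude sketch of the "stop when samples are steady" idea. It is only a stopping heuristic, not a formal sequential test with controlled error rates. The function `run_until_stable`, its parameters, and the fake timing source are all assumptions for illustration.

```python
import statistics
from itertools import cycle

def run_until_stable(measure, batch=5, rel_tol=0.01, stable_batches=3, max_samples=200):
    """Sample in batches; stop once the running median stops moving."""
    samples = []
    previous = None
    streak = 0
    while len(samples) < max_samples:
        samples.extend(measure() for _ in range(batch))
        current = statistics.median(samples)
        if previous is not None and abs(current - previous) <= rel_tol * abs(previous):
            streak += 1
            if streak >= stable_batches:
                break  # median has been steady for several batches in a row
        else:
            streak = 0
        previous = current
    return statistics.median(samples), len(samples)

# Deterministic stand-in for a real benchmark run (values in ms).
fake_timings = cycle([100.0, 101.0, 99.0, 100.5, 99.5])
median_ms, used = run_until_stable(lambda: next(fake_timings))
print(f"median = {median_ms} ms after {used} samples")
```

The `max_samples` cap is there so a noisy benchmark cannot run (and bill) forever.
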
Big four parameters: false positive rate, false negative rate, effect size, number of measurements.
Watch for p-hacking and repeated-sampling bias.
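
To see how the four parameters constrain each other, here is a sketch using the textbook normal-approximation formula for comparing two groups. This assumes roughly normal noise, which the talk warns is rarely true for performance data, so treat the result as a ballpark only. The function name `samples_per_group` is made up for this example.

```python
import math
from statistics import NormalDist

def samples_per_group(false_positive_rate, false_negative_rate, effect_size):
    """Normal-approximation sample size for a two-sided, two-group comparison.

    effect_size is the shift measured in units of the noise (standardized).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - false_positive_rate / 2)  # two-sided threshold
    z_beta = z.inv_cdf(1 - false_negative_rate)       # desired power
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

n = samples_per_group(false_positive_rate=0.05, false_negative_rate=0.20, effect_size=0.5)
print("measurements per group:", n)
```

Fix any three of the parameters and the fourth follows. Halving the effect size you want to detect roughly quadruples the number of measurements.
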
Plotting distributions is helpful. Histogram, density estimation.
Performance is almost never a normal distribution - they are usually multimodal.
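
A small sketch of a text histogram that makes a bimodal distribution visible without any plotting library. The data and the helper `text_histogram` are hypothetical; imagine a fast path near 110 ms and a slow path near 310 ms.

```python
from collections import Counter

def text_histogram(samples, bin_width):
    """Count integer samples per fixed-width bin and render one text row per bin."""
    bins = Counter((x // bin_width) * bin_width for x in samples)
    rows = []
    for start in range(min(bins), max(bins) + bin_width, bin_width):
        rows.append(f"{start:>4}-{start + bin_width - 1:<4} {'#' * bins[start]}")
    return bins, rows

# Hypothetical timings in milliseconds with two modes.
samples_ms = [108, 111, 113, 109, 115, 112, 107, 114, 110, 311, 308, 315, 309, 312]

bins, rows = text_histogram(samples_ms, bin_width=20)
print("\n".join(rows))
print(f"mean = {sum(samples_ms) / len(samples_ms):.1f} ms")
```

The mean lands between the two modes, in a region where no sample was ever observed, which is why a single summary number can mislead here.
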
Quantile-respectful density estimation
Sheather-Jones bandwidth estimation
Make sure your machines have enough free disk space before running benchmarks. Also be aware of thermal throttling.
Ask questions about the slowest and longest runs. /usr/bin/time -v
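
If you want similar numbers from inside a script, this sketch collects wall time, CPU time, and peak memory for a child process. It assumes a Unix-like system (the `resource` module is not available on Windows), and `run_and_measure` is a made-up helper name. It is not a replacement for /usr/bin/time -v, which reports much more.

```python
import resource
import subprocess
import sys
import time

def run_and_measure(cmd):
    """Run cmd and report wall time, CPU time, and peak RSS of child processes."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "wall_s": wall,
        "user_s": after.ru_utime - before.ru_utime,
        "sys_s": after.ru_stime - before.ru_stime,
        # High-water mark over all waited-for children so far.
        # Units differ by platform: KiB on Linux, bytes on macOS.
        "max_rss": after.ru_maxrss,
    }

# Hypothetical workload: a tiny Python one-liner.
stats = run_and_measure([sys.executable, "-c", "sum(range(10**6))"])
print(stats)
```

Logging these per run makes it easier to ask why the slowest runs were slow: high system time or a jump in peak memory points in a different direction than high user time.
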
Close your other programs when benchmarking - especially the browser.
Especially watch browser extensions that are noisy, Spotlight on macOS, Windows Defender.
Use a faraday cage like a microwave to isolate phones for benchmarking.
Eliminate as much noise as you can - we want repeatability.
Story - all unit tests sped up by 10% on Saturday, slowed down on Monday. VMs had separate CPU cores, but shared disk. No disk contention on weekends.
Change point detectors on time series. Need to use a suite as no one algorithm is best for all distributions.
ED-PELT works well for short time series but can produce a lot of false positives.
JetBrains has about 10 standard checkpoints to compare performance.
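
To make the idea concrete, here is a deliberately simple brute-force detector for a single shift in the median. It is not ED-PELT and has no penalty term, so on noisy data it will over-report. The function names and the nightly timings are assumptions for illustration.

```python
import statistics

def l1_cost(xs):
    """Sum of absolute deviations from the segment's median."""
    m = statistics.median(xs)
    return sum(abs(x - m) for x in xs)

def best_single_change_point(series, min_segment=3):
    """Return the split index that most reduces the L1 cost, or None if no split helps."""
    best_index, best_cost = None, l1_cost(series)
    for k in range(min_segment, len(series) - min_segment + 1):
        cost = l1_cost(series[:k]) + l1_cost(series[k:])
        if cost < best_cost:
            best_index, best_cost = k, cost
    return best_index

# Hypothetical nightly benchmark medians in ms: a regression lands at index 8.
nightly_ms = [100, 101, 99, 100, 102, 100, 99, 101, 120, 121, 119, 120, 122, 120]
print("change point at index:", best_single_change_point(nightly_ms))
```

Real detectors handle multiple change points and penalize extra splits, which is where the trade-off between sensitivity and false positives comes from.
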
Evergreen
MP² quantile estimator: estimating the moving median without storing values - constant memory footprint.
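
For intuition about the constant-memory part, below is a sketch of the classic P² estimator (Jain and Chlamtac), which tracks a quantile with five markers. This is plain P² over the whole stream, not the moving MP² variant from the note. The class name and test data are made up.

```python
import random

class P2Quantile:
    """P² estimator: approximates one quantile using five markers, no sample storage."""

    def __init__(self, p=0.5):
        self.q = []                                    # marker heights
        self.n = [0, 1, 2, 3, 4]                       # actual marker positions
        self.desired = [0, 2 * p, 4 * p, 2 + 2 * p, 4] # ideal marker positions
        self.step = [0, p / 2, p, (1 + p) / 2, 1]      # ideal position growth per sample

    def add(self, x):
        q, n = self.q, self.n
        if len(q) < 5:                                 # bootstrap with the first five samples
            q.append(x)
            q.sort()
            return
        if x < q[0]:
            q[0], k = x, 0
        elif x >= q[4]:
            q[4], k = x, 3
        else:
            k = max(i for i in range(4) if q[i] <= x)  # cell with q[k] <= x < q[k+1]
        for i in range(k + 1, 5):
            n[i] += 1
        for i in range(5):
            self.desired[i] += self.step[i]
        for i in (1, 2, 3):                            # nudge the inner markers if they drifted
            d = self.desired[i] - n[i]
            if (d >= 1 and n[i + 1] - n[i] > 1) or (d <= -1 and n[i - 1] - n[i] < -1):
                d = 1 if d > 0 else -1
                new = q[i] + d / (n[i + 1] - n[i - 1]) * (
                    (n[i] - n[i - 1] + d) * (q[i + 1] - q[i]) / (n[i + 1] - n[i])
                    + (n[i + 1] - n[i] - d) * (q[i] - q[i - 1]) / (n[i] - n[i - 1])
                )                                      # parabolic prediction
                if not q[i - 1] < new < q[i + 1]:      # fall back to linear interpolation
                    new = q[i] + d * (q[i + d] - q[i]) / (n[i + d] - n[i])
                q[i] = new
                n[i] += d

    def value(self):
        if len(self.q) < 5:
            raise ValueError("need at least five samples")
        return self.q[2]                               # middle marker tracks the p-quantile

# Hypothetical stream of timings (ms) from a seeded generator.
rng = random.Random(42)
estimator = P2Quantile(0.5)
for _ in range(10_000):
    estimator.add(rng.uniform(90.0, 110.0))
print(f"streaming median estimate: {estimator.value():.2f} ms")
```

The state is five heights and five positions no matter how many samples arrive, at the price of an approximate answer.
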
Book/podcast recommendations - write some code after reading or you will forget.