
JMH Duke ![plus](https://user-images.githubusercontent.com/448788/136684602-3f83cdb9-89ca-4b01-856c-6663723f582d.png) Lucene Logo

profile, approximate luceneutil, compare and introspect

A flexible, developer-friendly, microbenchmark framework



Overview

JMH is a Java microbenchmark framework from some of the developers who work on OpenJDK. Not surprisingly, OpenJDK is where you will find JMH's home today, alongside some useful little Java libraries such as JOL (Java Object Layout). The significant value in JMH is that you get to stand on the shoulders of some brilliant engineers who have done the tricky groundwork that many an ambitious Java benchmark writer has merrily wandered past.

Rather than merely provide a boilerplate framework for driving iterations and measuring elapsed times, which JMH happily does, the focus is on the many forces that deceive and disorient the earnest benchmark enthusiast. From spinning your benchmark into a whole new algorithmically generated set of benchmark source code in an attempt to avoid falling victim to undesirable optimizations, to offering Blackholes and a solid collection of conventions and cleverly thought-out yet simple boilerplate, the goal of JMH is to lift the developer off the microbenchmark floor and at least to their knees.
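
As a hedged illustration of the dead-code problem mentioned above, the sketch below (the class and method names are illustrative, not part of this module) shows why an unconsumed result is dangerous and how JMH's Blackhole keeps the measured work honest:

import java.util.concurrent.ThreadLocalRandom;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.infra.Blackhole;

@State(Scope.Thread)
public class DeadCodeExample {

  private double input = ThreadLocalRandom.current().nextDouble();

  @Benchmark
  public void broken() {
    // The result is never used, so HotSpot may eliminate the computation
    // and the benchmark ends up measuring an empty method.
    Math.log(input);
  }

  @Benchmark
  public void consumed(Blackhole bh) {
    // Handing the result to the Blackhole prevents dead-code elimination.
    bh.consume(Math.log(input));
  }
}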

JMH reaches out a hand to both the best and the most regular among us in a solid, cautious effort to promote us into the real, often obscured game of the microbenchmark.

The Module Contains a Few Distinct Focuses

JMH micro-benchmarks & introspection

Random Data generation boilerplate

Luceneutil benchmark approximation included

Getting Started

Running JMH is handled via the jmh.sh shell script. This script uses Gradle to extract the correct classpath and configures a handful of helpful Java command-line arguments and system properties. For the most part, the jmh.sh script passes any arguments it receives directly to JMH. Run the script from the root JMH module directory.

Running jmh.sh with no Arguments

# run all benchmarks found in subdirectories
$ ./jmh.sh

Pass a regex pattern or name after the command to select the benchmark(s) to run

$ ./jmh.sh BenchmarkClass 

The argument -l will list all the available benchmarks

$ ./jmh.sh -l

Check which benchmarks will run by entering a pattern after the -l argument

Use the full benchmark class name, the simple class name, the benchmark method name, or a substring.

./jmh.sh -l Ben

Further Pattern Examples

$ ./jmh.sh -l org.apache.lucene.jmh.benchmarks.search.BenchmarkClass
$ ./jmh.sh -l BenchmarkClass
$ ./jmh.sh -l BenchmarkClass.benchmethod
$ ./jmh.sh -l Bench
$ ./jmh.sh -l benchme

jmh.sh accepts all of the standard arguments that the JMH main class handles

Here we tell JMH to run the trial iterations twice, forking a new JVM for each trial. We also explicitly set the number of warmup iterations as well as the measured iterations to 2.

$ ./jmh.sh -f 2 -wi 2 -i 2 BenchmarkClass

Overriding Benchmark Parameters

@Param("1000")
private int numDocs;

The state objects that can be specified in benchmark classes will often have a number of input parameters that benchmark method calls will access. The notation above will default numDocs to 1000 and also allow you to override that value using the -p argument. A benchmark might also use a @Param annotation such as:

@Param("1000","5000","1000")
private int numDocs;

By default, that would cause the benchmark to be run enough times to use each of the specified values. If multiple input parameters are specified this way, the number of runs needed will quickly expand. You can pass multiple -p arguments and each will completely replace the behavior of any default annotation values.

# use 2000 docs instead of 1000
$ ./jmh.sh BenchmarkClass -p numDocs=2000


# use 5 docs, then 50, then 500
$ ./jmh.sh BenchmarkClass -p numDocs=5,50,500


# run the benchmark enough times to satisfy every combination of the two
# multi-valued input parameters
$ ./jmh.sh BenchmarkClass -p numDocs=10,20,30 -p docSize=250,500
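
To make the combination expansion concrete, here is a minimal sketch of a state class with two multi-valued parameters (the class and field names are illustrative, not taken from this module); with three values for numDocs and two for docSize, JMH runs each benchmark method for all six combinations unless -p overrides them:

import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class DocState {

  // 3 values x 2 values = 6 parameter combinations per benchmark method
  @Param({"10", "20", "30"})
  public int numDocs;

  @Param({"250", "500"})
  public int docSize;

  @Setup
  public void setup() {
    // Build the per-combination test data here, for example numDocs documents
    // of roughly docSize bytes each, so data generation is not measured.
  }
}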

Format and Write Results to Files

Rather than just dumping benchmark results to the console, you can specify the -rf argument to control the output format; for example, you can choose CSV or JSON. The -rff argument dictates the filename and output location.

# format output to JSON and write the file to the `work` directory relative to
# the JMH working directory
$ ./jmh.sh BenchmarkClass -rf json -rff work/jmh-results.json

💡 If you pass only the -rf argument, JMH will write out a file to the current working directory with the appropriate extension, e.g., jmh-result.csv.

JMH Command-Line Arguments

The JMH Command-Line Syntax

Usage: ./jmh.sh [regexp*] [options]
[opt] means optional argument.
<opt> means required argument.
"+" means comma-separated list of values.
"time" arguments accept time suffixes, like "100ms".

Command-line options usually take precedence over annotations.

The Full List of JMH Arguments

Expand the following collapsible section for the entire list of JMH arguments and their descriptions.

🗣️ Click for more

 Usage: ./jmh.sh [regexp*] [options]
 [opt] means optional argument.
 <opt> means required argument.
 "+" means a comma-separated list of values.
 "time" arguments accept time suffixes, like "100ms".

Command-line options usually take precedence over annotations.

  [arguments]                 Benchmarks to run (regexp+). (default: .*) 

  -bm <mode>                  Benchmark mode. Available modes are: 
                              [Throughput/thrpt,  AverageTime/avgt, 
                              SampleTime/sample, SingleShotTime/ss, 
                              All/all]. (default: Throughput) 

  -bs <int>                   Batch size: number of benchmark method calls per 
                              operation. Some benchmark modes may ignore this 
                              setting; please check this separately. 
                              (default: 1) 

  -e <regexp+>                Benchmarks to exclude from the run. 

  -f <int>                    How many times to fork a single benchmark. Use 0
                              to disable forking altogether. Warning: disabling
                              forking may have a detrimental impact on benchmark
                              and infrastructure reliability. You might want to
                              use a different warmup mode instead. (default: 5) 

  -foe <bool>                 Should JMH fail immediately if any benchmark has
                              experienced an unrecoverable error? Failing fast
                              helps to make quick sanity tests for benchmark
                              suites and allows automated runs to do error
                              checking. (default: false) 

  -gc <bool>                  Should JMH force GC between iterations? Forcing 
                              GC may help lower the noise in GC-heavy benchmarks
                              at the expense of jeopardizing GC ergonomics
                              decisions. 
                              Use with care. (default: false) 

  -h                          Displays this help output and exits. 

  -i <int>                    Number of measurement iterations to do.
                              Measurement iterations are counted towards the
                              benchmark score. (default: 1 for SingleShotTime,
                              and 5 for all other modes) 

  -jvm <string>               Use given JVM for runs. This option only affects
                              forked  runs. 

  -jvmArgs <string>           Use given JVM arguments. Most options are
                              inherited from the host VM options, but in some
                              cases, you want to pass the options only to a
                              forked VM. Either a single space-separated option
                              line or multiple options are accepted. This option
                              only affects forked runs. 

  -jvmArgsAppend <string>     Same as jvmArgs, but append these options after
                              the already given JVM args. 

  -jvmArgsPrepend <string>    Same as jvmArgs, but prepend these options before
                              the already given JVM args. 

  -l                          List the benchmarks that match a filter and exit. 

  -lp                         List the benchmarks that match a filter, along
                              with parameters, and exit. 

  -lprof                      List profilers and exit. 

  -lrf                        List machine-readable result formats and exit. 

  -o <filename>               Redirect human-readable output to a given file. 

  -opi <int>                  Override operations per invocation, see
                              @OperationsPerInvocation Javadoc for details.
                              (default: 1) 

  -p <param={v,}*>            Benchmark parameters. This option is expected to
                              be used once per parameter. The parameter name and
                              parameter values should be separated with an equal
                              sign. Parameter values should be separated with
                              commas. 

  -prof <profiler>            Use profilers to collect additional benchmark
                              data. Some profilers are not available on all
                              JVMs or all OSes. '-lprof' will list the
                              profilers that are available and can run with
                              the current OS configuration and installed
                              dependencies. 

  -r <time>                   Minimum time to spend at each measurement
                              iteration. Benchmarks may generally run longer
                              than the iteration duration. (default: 10 s) 

  -rf <type>                  Format type for machine-readable results. These 
                              results are written to a separate file
                              (see -rff).  See the list of available result
                              formats with -lrf. 
                              (default: CSV) 

  -rff <filename>             Write machine-readable results to a given file. 
                              The -rf option controls the file format. Please
                              see  the list of result formats available. 
                              (default: jmh-result.<result-format>) 

  -si <bool>                  Should JMH synchronize iterations? Doing so would
                               significantly lower the noise in multithreaded
                               tests by ensuring that the measured part happens
                               when all workers are running.
                              (default: true) 

  -t <int>                    Number of worker threads to run with. 'max' means
                              the maximum number of hardware threads available
                              on the machine, figured out by JMH itself.
                              (default: 1) 

  -tg <int+>                  Override thread group distribution for asymmetric 
                              benchmarks. This option expects a comma-separated 
                              list of thread counts within the group. See 
                              @Group/@GroupThreads 
                              Javadoc for more information. 

  -to <time>                  Timeout for benchmark iteration. After reaching
                              this timeout, JMH will try to interrupt the
                              running tasks. Non-cooperating benchmarks may
                              ignore this timeout. (default: 10 min) 

  -tu <TU>                    Override time unit in benchmark results. Available
                              time units are: [m, s, ms, us, ns].
                              (default: SECONDS) 

  -v <mode>                   Verbosity mode. Available modes are: [SILENT,
                              NORMAL, EXTRA]. (default: NORMAL) 

  -w <time>                   Minimum time to spend at each warmup iteration.
                              Benchmarks may generally run longer than the
                              iteration duration. (default: 10 s) 

  -wbs <int>                  Warmup batch size: number of benchmark method
                               calls  per operation. Some benchmark modes may
                               ignore this  setting. (default: 1) 

  -wf <int>                   How many warmup forks to make for a single
                              benchmark.   All benchmark iterations within the
                              warmup fork do not count towards the benchmark score.
                              Use 0 to disable warmup forks. (default: 0) 

  -wi <int>                   Number of warmup iterations to do. Warmup
                              iterations do not count towards the benchmark
                              score. (default: 0 for SingleShotTime, and 5 for
                              all other modes) 

  -wm <mode>                  Warmup mode for warming up selected benchmarks. 
                              Warmup modes are INDI = Warmup each benchmark
                              individually, 
                              then measure it. BULK = Warm up all benchmarks
                              first, then do all the measurements. BULK_INDI =
                              warmup all benchmarks first, then re-warm up each
                              benchmark individually, then measure it. 
                              (default: INDI) 

  -wmb <regexp+>              Warmup benchmarks to include in the run, in
                              addition to those already selected by the primary
                              filters. The harness will not measure these
                              benchmarks; they are only used for warmup. 

Writing JMH benchmarks

For additional insight into writing correct JMH benchmarks, the best place to start is the sample code provided by the JMH project.

JMH is highly configurable, and users are encouraged to look through the samples to see what options are available. A good tutorial for learning the JMH basics can be found here
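
As a rough starting point, a minimal benchmark class in this style might look like the following sketch (the class, method, and field names are illustrative only, not part of this module):

import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
public class DocLengthBenchmark {

  @Param("1000")
  private int numDocs;

  private String[] docs;

  @Setup
  public void setup() {
    // Prepare input data once per trial so that data generation is not measured.
    docs = new String[numDocs];
    for (int i = 0; i < numDocs; i++) {
      docs[i] = "doc-" + i;
    }
  }

  @Benchmark
  public int measureLengths() {
    // Return the result so the JIT cannot discard the work as dead code.
    int total = 0;
    for (String doc : docs) {
      total += doc.length();
    }
    return total;
  }
}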

Continued Documentation

📚 JMH and Writing Lucene Benchmarks


JMH Profilers

Introduction

Some may think that the appeal of a micro-benchmark lies in the relative ease with which one can be put together, and in the often isolated nature of what is being measured. But this perspective is actually what can make them dangerous. Benchmarking is easy to approach from a non-rigorous, casual angle that leaves the feeling that it is a relatively straightforward part of the developer's purview. From this viewpoint, microbenchmarks can appear downright easy. But good benchmarking is hard. Microbenchmarks are very hard. Java and HotSpot make hard even harder.

JMH was developed by engineers who understand the dark side of benchmarks very well. They also work on OpenJDK, so they are unusually well suited to building a Java microbenchmark framework that tackles many of the common issues that naive approaches and go-it-alone efforts are likely to trip on. Even so, they will tell you: JMH is a sharp blade. Best to be cautious and careful when swinging it around.

The good folks working on JMH did not just build a better-than-average Java micro-benchmark framework and then leave us to the still-numerous wolves, though. They also built in first-class support for the essential tools that the ambitious developer absolutely needs for defense when bravely trying to understand performance. This brings us to the JMH profiler options.

Using JMH Profilers

Using JMH with the Async-Profiler

It's good practice to check profiler output for micro-benchmarks in order to verify that they represent the expected application behavior and measure what you expect to measure. Some example pitfalls include the use of expensive mocks or accidental inclusion of test setup code in the benchmarked code. JMH includes async-profiler integration that makes this easy:

$ ./jmh.sh -prof async:libPath=/path/to/libasyncProfiler.so\;dir=profile-results

Run a specific test with async and GC profilers on Linux and flame graph output.

$ ./jmh.sh -prof gc -prof async:libPath=/path/to/libasyncProfiler.so\;output=flamegraph\;dir=profile-results BenchmarkClass

With flame graph output:

$ ./jmh.sh -prof async:libPath=/path/to/libasyncProfiler.so\;output=flamegraph\;dir=profile-results

Simultaneous CPU, allocation, and lock profiling with async profiler 2.0 and Java Flight Recorder output:

$ ./jmh.sh -prof async:libPath=/path/to/libasyncProfiler.so\;output=jfr\;alloc\;lock\;dir=profile-results BenchmarkClass

A number of arguments can be passed to configure async profiler, run the following for a description:

$ ./jmh.sh -prof async:help

You can also skip specifying libPath if you place the async-profiler library in a predefined location, such as one of the paths in the LD_LIBRARY_PATH environment variable if it has been set (many Linux distributions set this variable; Arch by default does not), or in /usr/lib, which should also work.

OS Permissions for Async-Profiler

Async Profiler uses perf to profile native code in addition to Java code. It will need the following for the necessary access.

# echo 0 > /proc/sys/kernel/kptr_restrict
# echo 1 > /proc/sys/kernel/perf_event_paranoid

or

$ sudo sysctl -w kernel.kptr_restrict=0
$ sudo sysctl -w kernel.perf_event_paranoid=1

Using JMH with the GC Profiler

You can run a benchmark with -prof gc to measure its allocation rate:

$ ./jmh.sh -prof gc:dir=profile-results

Of particular importance are the norm alloc rates, which measure allocations per operation rather than allocations per second.
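
As a hedged illustration (the class name is made up for this example), a benchmark like the following allocates a fixed amount per invocation, so the norm figure reported by the GC profiler reflects that per-operation cost directly rather than a rate that scales with throughput:

import org.openjdk.jmh.annotations.Benchmark;

public class AllocBenchmark {

  @Benchmark
  public byte[] allocate() {
    // Allocates one 128-byte array per call; with '-prof gc' the per-operation
    // (norm) allocation figure stays roughly constant even if throughput varies.
    return new byte[128];
  }
}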

Using JMH with the Java Flight Recorder Profiler

JMH comes with a variety of built-in profilers. Here is an example of using JFR:

$ ./jmh.sh -prof jfr:dir=profile-results\;configName=jfr-profile.jfc BenchmarkClass

In this example, we point to the included configuration file with configName, but you could also pass something like settings=default or settings=profile.
