some FS micro-benchmarks

F1. FS 128k streaming writes

Benchmark: fio write

Command: fio --name=seqwrite --rw=write --bs=128k --size=4g --end_fsync=1 --loops=4 # aggrb tput

Rationale: Measure the performance of a single-threaded streaming write of a reasonably large file. The aim is to measure how well the file system and platform can sustain a write workload, which depends on how well they can group and dispatch writes. It is difficult to benchmark buffered file system writes both quickly and repeatably, as performance depends heavily on whether and when the page cache begins to flush dirty data. As a workaround, fsync() is called at the end of the benchmark to ensure that flushing always occurs, and the benchmark repeats four times. While this provides a much more reliable measurement, it is somewhat worst-case (applications don't always fsync), so it is closer to a minimum rate, rather than a maximum rate, that you should expect.
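
A minimal shell sketch of running this test and pulling out the aggregate write throughput, assuming a fio version that supports --output-format=json and that jq is installed; the TEST_DIR path is illustrative, not part of the original command.

```bash
# Illustrative wrapper: run the F1 streaming-write test in a scratch directory
# on the file system under test and report the aggregate write bandwidth.
TEST_DIR=/data/fio-test        # hypothetical path; point at the FS under test
mkdir -p "$TEST_DIR" && cd "$TEST_DIR"

fio --name=seqwrite --rw=write --bs=128k --size=4g --end_fsync=1 --loops=4 \
    --output-format=json --output=seqwrite.json

# fio's JSON output reports bandwidth in KiB/s in the "bw" field
# (field names can vary slightly between fio versions).
jq '.jobs[0].write.bw' seqwrite.json
```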

F2. FS cached 4k random reads

Benchmark: fio randread

Command: fio --name=randread --rw=randread --pre_read=1 --norandommap --bs=4k --size=256m --runtime=30 --loops=1000 # calc IOPS

Rationale: Measure the performance of a single-threaded cached random read workload. Notes: The --loops option is necessary to keep fio running for --runtime=30; otherwise it aborts once it has performed --size=256m worth of random reads. Ensure the instance has at least 256 MB of free DRAM for the page cache so the working set stays fully cached; if not, reduce --size to fit.
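
A short sketch of calculating the IOPS result, again assuming fio's JSON output and jq; the output filename is illustrative.

```bash
# Run the cached random-read test and print read IOPS from fio's JSON output.
fio --name=randread --rw=randread --pre_read=1 --norandommap --bs=4k \
    --size=256m --runtime=30 --loops=1000 \
    --output-format=json --output=randread.json

jq '.jobs[0].read.iops' randread.json
```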

F3. FS multi-threaded cached 4k random reads

Benchmark: fio randread

Command: fio --numjobs=#CPUs --name=randread --rw=randread --pre_read=1 --norandommap --bs=4k --size=256m/#CPUs --runtime=30 --loops=1000 # calc IOPS

Rationale: Test the scalability of the file system compared to the single-threaded version of this test (F2). Notes: The --loops option is necessary to keep fio running for --runtime=30; otherwise each job aborts once it has performed its --size worth of random reads.
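
A sketch of expanding the #CPUs placeholders, assuming a Linux host with nproc available; the per-job size keeps the total working set at roughly 256 MB.

```bash
# One job per CPU; split the 256 MB working set evenly across jobs.
CPUS=$(nproc)
SIZE_MB=$((256 / CPUS))        # per-job --size in MiB

fio --numjobs="$CPUS" --name=randread --rw=randread --pre_read=1 --norandommap \
    --bs=4k --size="${SIZE_MB}m" --runtime=30 --loops=1000
# Sum the per-job read IOPS (or use fio's aggregate line) for the result.
```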

F4. FS partially cached 4k random reads

Benchmark: fio randread

Command: fio --numjobs=#CPUs --name=partial --rw=randread --norandommap --random_distribution=pareto:0.9 --bs=4k --size=2xDRAM/#CPUs --runtime=60 --loops=1000 # calc IOPS

Rationale: Use a working set size that can only be partially cached, to test the performance of file system caching. This test also uses a Pareto access distribution to resemble real-world workloads, and it may reveal additional storage I/O cache available to the instance beyond its own page cache. Notes: This is deliberately tuned not to be a disk benchmark. The --size option is per job (thread): the intended total working set is 2 x DRAM, so divide that amount by the number of CPUs for --size. You may also need a recent version of fio for the --random_distribution option: https://github.com/axboe/fio
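
A sketch of sizing the working set, assuming a Linux host where DRAM can be read from /proc/meminfo (MemTotal is in KiB) and nproc is available.

```bash
# Total working set = 2 x DRAM, split across one job per CPU.
CPUS=$(nproc)
MEM_KB=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
SIZE_MB=$(( (MEM_KB * 2 / 1024) / CPUS ))   # per-job --size in MiB

fio --numjobs="$CPUS" --name=partial --rw=randread --norandommap \
    --random_distribution=pareto:0.9 --bs=4k --size="${SIZE_MB}m" \
    --runtime=60 --loops=1000
```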
