~/local/pytorch (rec_fn_bench_models)$ PYTHONPATH="$(pwd)/benchmarks/experimental_components" python benchmarks/record_function_benchmark/record_function_bench.py
Benchmarking RecordFunction overhead for resnet50_jit
Running warmup... finished
Benchmarking with RecordFunction, 1 threads ... finished
<utils.common.Measurement object at 0x7fd828431668>
Record function overhead: with_rec_fn
resnet50_jit
Median: 6.80 s
IQR: 1.01 s (6.32 to 7.34)
9 measurements, 1 runs per measurement, 1 thread
WARNING: Interquartile range is 14.9% of the median measurement.
This could indicate system fluctuation.
Benchmarking without RecordFunction, 1 threads ... finished
<utils.common.Measurement object at 0x7fd828431208>
Record function overhead: without_rec_fn
resnet50_jit
Median: 6.34 s
IQR: 0.24 s (6.18 to 6.42)
10 measurements, 1 runs per measurement, 1 thread
Benchmarking with RecordFunction, 2 threads ... finished
<utils.common.Measurement object at 0x7fd828431780>
Record function overhead: with_rec_fn
resnet50_jit
Median: 3.85 s
IQR: 0.03 s (3.83 to 3.87)
16 measurements, 1 runs per measurement, 2 threads
Benchmarking without RecordFunction, 2 threads ... finished
<utils.common.Measurement object at 0x7fd828431160>
Record function overhead: without_rec_fn
resnet50_jit
Median: 3.86 s
IQR: 0.07 s (3.81 to 3.88)
16 measurements, 1 runs per measurement, 2 threads
Benchmarking with RecordFunction, 4 threads ... finished
<utils.common.Measurement object at 0x7fd828431eb8>
Record function overhead: with_rec_fn
resnet50_jit
Median: 2.09 s
IQR: 0.04 s (2.07 to 2.11)
29 measurements, 1 runs per measurement, 4 threads
Benchmarking without RecordFunction, 4 threads ... finished
<utils.common.Measurement object at 0x7fd828431e48>
Record function overhead: without_rec_fn
resnet50_jit
Median: 2.09 s
IQR: 0.03 s (2.08 to 2.12)
29 measurements, 1 runs per measurement, 4 threads
Benchmarking with RecordFunction, 8 threads ... finished
<utils.common.Measurement object at 0x7fd828431d30>
Record function overhead: with_rec_fn
resnet50_jit
Median: 1.14 s
IQR: 0.03 s (1.13 to 1.15)
53 measurements, 1 runs per measurement, 8 threads
Benchmarking without RecordFunction, 8 threads ... finished
<utils.common.Measurement object at 0x7fd8284311d0>
Record function overhead: without_rec_fn
resnet50_jit
Median: 1.15 s
IQR: 0.04 s (1.12 to 1.16)
53 measurements, 1 runs per measurement, 8 threads
Benchmarking with RecordFunction, 16 threads ... finished
<utils.common.Measurement object at 0x7fd828431da0>
Record function overhead: with_rec_fn
resnet50_jit
Median: 777.56 ms
IQR: 43.58 ms (758.04 to 801.62)
76 measurements, 1 runs per measurement, 16 threads
Benchmarking without RecordFunction, 16 threads ... finished
<utils.common.Measurement object at 0x7fd828431b70>
Record function overhead: without_rec_fn
resnet50_jit
Median: 781.93 ms
IQR: 45.45 ms (763.96 to 809.40)
76 measurements, 1 runs per measurement, 16 threads
Benchmarking with RecordFunction, 32 threads ... finished
<utils.common.Measurement object at 0x7fd828431320>
Record function overhead: with_rec_fn
resnet50_jit
Median: 739.81 ms
IQR: 29.29 ms (726.06 to 755.35)
81 measurements, 1 runs per measurement, 32 threads
Benchmarking without RecordFunction, 32 threads ... finished
<utils.common.Measurement object at 0x7fd828431828>
Record function overhead: without_rec_fn
resnet50_jit
Median: 734.16 ms
IQR: 25.87 ms (721.52 to 747.38)
82 measurements, 1 runs per measurement, 32 threads
Benchmarking RecordFunction overhead for lstm_jit
Running warmup... finished
Benchmarking with RecordFunction, 1 threads ... finished
<utils.common.Measurement object at 0x7fd8284d66d8>
Record function overhead: with_rec_fn
lstm_jit
Median: 483.40 ms
IQR: 34.93 ms (456.20 to 491.13)
126 measurements, 1 runs per measurement, 1 thread
Benchmarking without RecordFunction, 1 threads ... finished
<utils.common.Measurement object at 0x7fd8284d6a20>
Record function overhead: without_rec_fn
lstm_jit
Median: 480.92 ms
IQR: 12.51 ms (475.84 to 488.34)
125 measurements, 1 runs per measurement, 1 thread
Benchmarking with RecordFunction, 2 threads ... finished
<utils.common.Measurement object at 0x7fd8284d6d30>
Record function overhead: with_rec_fn
lstm_jit
Median: 305.54 ms
IQR: 7.96 ms (301.91 to 309.87)
196 measurements, 1 runs per measurement, 2 threads
Benchmarking without RecordFunction, 2 threads ... finished
<utils.common.Measurement object at 0x7fd8284d6780>
Record function overhead: without_rec_fn
lstm_jit
Median: 303.78 ms
IQR: 6.41 ms (300.25 to 306.66)
198 measurements, 1 runs per measurement, 2 threads
Benchmarking with RecordFunction, 4 threads ... finished
<utils.common.Measurement object at 0x7fd8284d6940>
Record function overhead: with_rec_fn
lstm_jit
Median: 209.73 ms
IQR: 9.19 ms (207.78 to 216.97)
283 measurements, 1 runs per measurement, 4 threads
Benchmarking without RecordFunction, 4 threads ... finished
<utils.common.Measurement object at 0x7fd8284d6748>
Record function overhead: without_rec_fn
lstm_jit
Median: 206.83 ms
IQR: 4.41 ms (205.82 to 210.23)
29 measurements, 10 runs per measurement, 4 threads
Benchmarking with RecordFunction, 8 threads ... finished
<utils.common.Measurement object at 0x7fd8284d6b38>
Record function overhead: with_rec_fn
lstm_jit
Median: 153.40 ms
IQR: 4.71 ms (150.55 to 155.26)
388 measurements, 1 runs per measurement, 8 threads
Benchmarking without RecordFunction, 8 threads ... finished
<utils.common.Measurement object at 0x7fd8284d62b0>
Record function overhead: without_rec_fn
lstm_jit
Median: 147.51 ms
IQR: 2.09 ms (146.48 to 148.58)
405 measurements, 1 runs per measurement, 8 threads
Benchmarking with RecordFunction, 16 threads ... finished
<utils.common.Measurement object at 0x7fd8284d6dd8>
Record function overhead: with_rec_fn
lstm_jit
Median: 134.52 ms
IQR: 3.26 ms (133.17 to 136.43)
441 measurements, 1 runs per measurement, 16 threads
Benchmarking without RecordFunction, 16 threads ... finished
<utils.common.Measurement object at 0x7fd8284d6e48>
Record function overhead: without_rec_fn
lstm_jit
Median: 133.48 ms
IQR: 4.61 ms (132.35 to 136.97)
441 measurements, 1 runs per measurement, 16 threads
Benchmarking with RecordFunction, 32 threads ... finished
<utils.common.Measurement object at 0x7fd8284d69e8>
Record function overhead: with_rec_fn
lstm_jit
Median: 175.92 ms
IQR: 10.16 ms (174.21 to 184.37)
334 measurements, 1 runs per measurement, 32 threads
Benchmarking without RecordFunction, 32 threads ... finished
<utils.common.Measurement object at 0x7fd8284d6898>
Record function overhead: without_rec_fn
lstm_jit
Median: 179.15 ms
IQR: 9.90 ms (172.68 to 182.57)
333 measurements, 1 runs per measurement, 32 threads
[----------- Record function overhead -----------]
                      |  resnet50_jit  |  lstm_jit
1 threads: ---------------------------------------
      with_rec_fn     |  7000 (! 7%)   |    480
      without_rec_fn  |  6300          |    480
2 threads: ---------------------------------------
      with_rec_fn     |  3900          |    310
      without_rec_fn  |  3900          |    300
4 threads: ---------------------------------------
      with_rec_fn     |  2100          |    210
      without_rec_fn  |  2100          |    210
8 threads: ---------------------------------------
      with_rec_fn     |  1100          |    150
      without_rec_fn  |  1100          |    148
16 threads: --------------------------------------
      with_rec_fn     |   780          |    135
      without_rec_fn  |   780          |    130
32 threads: --------------------------------------
      with_rec_fn     |   740          |    180
      without_rec_fn  |   730          |    180

Times are in milliseconds (ms).
(! XX%) Measurement has high variance, where XX is the median / IQR * 100.
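
The summary table rounds the medians, which hides small differences between the two configurations. As a rough aid for reading the log, the sketch below recomputes the relative overhead from the unrounded medians printed above. The dictionary layout and the `relative_overhead_pct` helper are illustrative assumptions and are not part of the benchmark script.

```python
# Medians copied from the log above, converted to milliseconds.
# Each entry maps thread count -> (with_rec_fn, without_rec_fn).
medians_ms = {
    "resnet50_jit": {
        1: (6800.0, 6340.0),
        2: (3850.0, 3860.0),
        4: (2090.0, 2090.0),
        8: (1140.0, 1150.0),
        16: (777.56, 781.93),
        32: (739.81, 734.16),
    },
    "lstm_jit": {
        1: (483.40, 480.92),
        2: (305.54, 303.78),
        4: (209.73, 206.83),
        8: (153.40, 147.51),
        16: (134.52, 133.48),
        32: (175.92, 179.15),
    },
}


def relative_overhead_pct(with_ms, without_ms):
    """Hypothetical helper: extra time with RecordFunction, as a percent of the baseline."""
    return (with_ms - without_ms) / without_ms * 100.0


for model, by_threads in medians_ms.items():
    for threads, (with_ms, without_ms) in by_threads.items():
        pct = relative_overhead_pct(with_ms, without_ms)
        print(f"{model:>12} {threads:>2} threads: {pct:+.1f}%")
```

A negative value means the run with RecordFunction had the lower median. The single-thread resnet50_jit difference comes out at roughly 7%, which is smaller than that run's reported IQR (14.9% of the median), so it may reflect system fluctuation more than RecordFunction cost.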