@ilia-cher
Created November 20, 2020 15:00
python benchmarks/profiler_benchmark/profiler_bench.py --with_cuda
Payload: loop, 256 iterations; timer min. runtime = 10
Profiling disabled, tensor size 1x1, use cuda: True, use kineto: False, with stacks: False, use script: False
<torch.utils.benchmark.utils.common.Measurement object at 0x7fb85eb71b80>
payload()
Median: 3.20 ms
IQR: 0.16 ms (3.11 to 3.27)
3127 measurements, 1 runs per measurement, 1 thread
Profiling enabled, tensor size 1x1, use cuda: True, use kineto: False, with stacks: False, use script: False
<torch.utils.benchmark.utils.common.Measurement object at 0x7fb85eb71970>
payload()
Median: 55.22 ms
IQR: 1.35 ms (54.48 to 55.84)
179 measurements, 1 runs per measurement, 1 thread
python benchmarks/profiler_benchmark/profiler_bench.py --with_cuda
Payload: loop, 256 iterations; timer min. runtime = 10
Profiling disabled, tensor size 1x1, use cuda: True, use kineto: False, with stacks: False, use script: False
<torch.utils.benchmark.utils.common.Measurement object at 0x7f51cfc2cb80>
payload()
Median: 3.40 ms
IQR: 0.17 ms (3.31 to 3.48)
2931 measurements, 1 runs per measurement, 1 thread
Profiling enabled, tensor size 1x1, use cuda: True, use kineto: False, with stacks: False, use script: False
<torch.utils.benchmark.utils.common.Measurement object at 0x7f51cfc2c970>
payload()
Median: 56.25 ms
IQR: 1.24 ms (55.60 to 56.84)
179 measurements, 1 runs per measurement, 1 thread
python benchmarks/profiler_benchmark/profiler_bench.py --with_cuda --use_kineto
Payload: loop, 256 iterations; timer min. runtime = 10
Profiling disabled, tensor size 1x1, use cuda: True, use kineto: True, with stacks: False, use script: False
<torch.utils.benchmark.utils.common.Measurement object at 0x7f5de8259b80>
payload()
Median: 3.34 ms
IQR: 0.16 ms (3.26 to 3.42)
2987 measurements, 1 runs per measurement, 1 thread
Profiling enabled, tensor size 1x1, use cuda: True, use kineto: True, with stacks: False, use script: False
<torch.utils.benchmark.utils.common.Measurement object at 0x7f5eba5700d0>
payload()
Median: 71.56 ms
IQR: 1.72 ms (70.57 to 72.29)
141 measurements, 1 runs per measurement, 1 thread
python benchmarks/profiler_benchmark/profiler_bench.py --with_cuda --use_kineto
Payload: loop, 256 iterations; timer min. runtime = 10
Profiling disabled, tensor size 1x1, use cuda: True, use kineto: True, with stacks: False, use script: False
<torch.utils.benchmark.utils.common.Measurement object at 0x7fc5d7128b80>
payload()
Median: 5.12 ms
IQR: 1.92 ms (3.50 to 5.42)
2201 measurements, 1 runs per measurement, 1 thread
WARNING: Interquartile range is 37.4% of the median measurement.
This suggests significant environmental influence.
Profiling enabled, tensor size 1x1, use cuda: True, use kineto: True, with stacks: False, use script: False
<torch.utils.benchmark.utils.common.Measurement object at 0x7fc5d00a4310>
payload()
Median: 77.10 ms
IQR: 3.58 ms (75.78 to 79.36)
129 measurements, 1 runs per measurement, 1 thread
python benchmarks/profiler_benchmark/profiler_bench.py --with_cuda --use_kineto --cuda_only
Payload: loop, 256 iterations; timer min. runtime = 10
Profiling disabled, tensor size 1x1, use cuda: True, use kineto: True, with stacks: False, use script: False
<torch.utils.benchmark.utils.common.Measurement object at 0x7fe133b99b80>
payload()
Median: 3.32 ms
IQR: 0.18 ms (3.23 to 3.41)
3004 measurements, 1 runs per measurement, 1 thread
Profiling enabled, tensor size 1x1, use cuda: True, use kineto: True, with stacks: False, use script: False
<torch.utils.benchmark.utils.common.Measurement object at 0x7fe12c0b6eb0>
payload()
Median: 20.00 ms
IQR: 4.01 ms (17.03 to 21.04)
517 measurements, 1 runs per measurement, 1 thread
WARNING: Interquartile range is 20.1% of the median measurement.
This could indicate system fluctuation.
python benchmarks/profiler_benchmark/profiler_bench.py --with_cuda --use_kineto --cuda_only
Payload: loop, 256 iterations; timer min. runtime = 10
Profiling disabled, tensor size 1x1, use cuda: True, use kineto: True, with stacks: False, use script: False
<torch.utils.benchmark.utils.common.Measurement object at 0x7f27930a9b80>
payload()
Median: 3.53 ms
IQR: 0.18 ms (3.44 to 3.62)
2824 measurements, 1 runs per measurement, 1 thread
Profiling enabled, tensor size 1x1, use cuda: True, use kineto: True, with stacks: False, use script: False
<torch.utils.benchmark.utils.common.Measurement object at 0x7f27845dae80>
payload()
Median: 20.67 ms
IQR: 0.70 ms (20.33 to 21.03)
482 measurements, 1 runs per measurement, 1 thread
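
The medians above can be condensed into a slowdown factor and an approximate per-iteration cost for each configuration. The following is a minimal sketch, not part of profiler_bench.py: the `runs` table and the `overhead` helper are hypothetical names introduced here, the numbers are copied by hand from the log, and it assumes each `payload()` call executes the 256 loop iterations reported in the "Payload" line.

```python
# Median runtimes in ms, copied by hand from the log above:
# (command-line flags, run number) -> (profiling disabled, profiling enabled)
runs = {
    ("--with_cuda", 1): (3.20, 55.22),
    ("--with_cuda", 2): (3.40, 56.25),
    ("--with_cuda --use_kineto", 1): (3.34, 71.56),
    ("--with_cuda --use_kineto", 2): (5.12, 77.10),  # baseline flagged as noisy (IQR 37.4%)
    ("--with_cuda --use_kineto --cuda_only", 1): (3.32, 20.00),
    ("--with_cuda --use_kineto --cuda_only", 2): (3.53, 20.67),
}

# Assumption: one payload() call runs this many loop iterations.
ITERATIONS = 256


def overhead(disabled_ms, enabled_ms, iterations=ITERATIONS):
    """Return (slowdown factor, added microseconds per loop iteration)."""
    slowdown = enabled_ms / disabled_ms
    added_us_per_iter = (enabled_ms - disabled_ms) / iterations * 1000.0
    return slowdown, added_us_per_iter


for (flags, run), (disabled_ms, enabled_ms) in runs.items():
    slowdown, added_us = overhead(disabled_ms, enabled_ms)
    print(f"{flags} (run {run}): {slowdown:.1f}x slower, "
          f"about {added_us:.0f} us added per iteration")
```

The slowdown factor for the second `--use_kineto` run should be read with caution, since its profiling-disabled baseline carries the interquartile-range warning.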