import torch
import torch.nn as nn
import torch.nn.parallel
import torch.backends.cudnn as cudnn
import torch.optim
import torch.utils.data
import torchvision
import torchvision.transforms as T
import torchvision.datasets as datasets
import torchvision.models as models
import torch
from torch.profiler import *

for with_cuda in [True, False]:
    with profile() as prof:
        x = torch.randn(2, 2)
        if with_cuda:
            x = x.cuda()
        x = x.matmul(x)
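The snippet above stops after the profiled region, so it never shows what was recorded. The following is a minimal sketch of one way to inspect the results, assuming the `key_averages().table(...)` summary API of `torch.profiler`. The `modes` list and the `tables` dictionary are illustrative names that are not in the original, and the CUDA pass is skipped when no GPU is available, which the original does not do.

```python
import torch
from torch.profiler import profile

# Assumption: run the CUDA variant only when a GPU is present,
# so the sketch also works on CPU-only machines.
modes = [False] + ([True] if torch.cuda.is_available() else [])

tables = {}  # hypothetical container: with_cuda flag -> summary table text
for with_cuda in modes:
    with profile() as prof:
        x = torch.randn(2, 2)
        if with_cuda:
            x = x.cuda()
        x = x.matmul(x)

    # Aggregate recorded events by operator name and render them as text.
    tables[with_cuda] = prof.key_averages().table(
        sort_by="self_cpu_time_total", row_limit=10
    )
    print(f"with_cuda={with_cuda}")
    print(tables[with_cuda])
```

Sorting by `self_cpu_time_total` is only one choice; other sort keys accepted by `table()` may be more useful for GPU runs.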
(pytorch) iliacher@devgpu083:~/local/pytorch (activities_default)$ python test_resnet50.py
Files already downloaded and verified
step:0
step:1
step:2
step:3
step:4
step:5
step:6
step:7
before:
~/local/pytorch (flops_warnings)$ python benchmarks/profiler_benchmark/profiler_bench.py
Payload: loop, 256 iterations; timer min. runtime = 10
Profiling disabled, tensor size 1x1, use cuda: False, use kineto: False, with stacks: False, use script: False
<torch.utils.benchmark.utils.common.Measurement object at 0x7fcc4ca50490>
payload()
  Median: 688.50 us
  IQR:    7.70 us (684.29 to 691.99)
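The `Measurement` output above (median and IQR) comes from `torch.utils.benchmark`. As a rough sketch of how such a measurement can be produced, the example below times a small loop with `Timer.blocked_autorange`. The `payload` function here is a hypothetical stand-in and is not the payload used by `profiler_bench.py`; the 0.5 second minimum run time is also an arbitrary choice for illustration.

```python
import torch
from torch.utils.benchmark import Timer


def payload(iterations=256):
    # Hypothetical payload: repeated elementwise adds on 1x1 tensors.
    x = torch.ones(1, 1)
    y = torch.ones(1, 1)
    for _ in range(iterations):
        z = x + y
    return z


timer = Timer(stmt="payload()", globals={"payload": payload})

# blocked_autorange keeps sampling until at least min_run_time seconds elapse.
measurement = timer.blocked_autorange(min_run_time=0.5)
print(measurement)

# median and iqr are reported in seconds; convert to microseconds for display.
print(f"median: {measurement.median * 1e6:.2f} us")
print(f"IQR:    {measurement.iqr * 1e6:.2f} us")
```

Comparing the median with profiling disabled against the median with profiling enabled gives the per-payload overhead that the logs on this page are tracking.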
commit edc815cb94e4a1cc501cda87c6e05a73137e4593 (HEAD -> extra_sampling_2, origin/gh/ilia-cher/89/orig)
Author: ilia-cher <iliacher@fb.com>
Date:   Wed Dec 9 14:34:44 2020 -0800
(pytorch) iliacher@devgpu083:~/local/pytorch (extra_sampling_2)$ python
Python 3.8.5 (default, Sep 4 2020, 07:30:14)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
iliacher@devgpu083:~/fbcode (20ae4497)$ ./buck-out/gen/caffe2/binaries/record_function_benchmark
Warm up
Tensor GEMM benchmark (1x1, 10000): 22792 us.
Tensor GEMM benchmark (16x16, 10000): 31387 us.
Pure RecordFunction benchmark (10000): 44 us.
Running without observers
Tensor GEMM benchmark (1x1, 10000): 7626 us.
Tensor GEMM benchmark (16x16, 10000): 10927 us.
Pure RecordFunction benchmark (10000): 84 us.
WARNING: Logging before InitGoogleLogging() is written to STDERR
(pytorch) iliacher@devgpu083:~/local/pytorch (feeee76e)$ ./build/bin/record_function_benchmark
Warmup time: 335 us.
Running without observers
Tensor GEMM benchmark (1x1, 10000): 11665 us.
Tensor GEMM benchmark (16x16, 10000): 52187 us.
Pure RecordFunction benchmark (10000): 155 us.
Running with empty observers
Tensor GEMM benchmark (1x1, 10000): 21440 us.
Tensor GEMM benchmark (16x16, 10000): 61519 us.
Pure RecordFunction benchmark (10000): 1561 us.
----------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ---------------------------------------------------------------------------
Name                            Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg    # of Calls  Source Location
----------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ---------------------------------------------------------------------------
aten::mkldnn_convolution            98.47%      30.425ms        99.07%      30.610ms      30.610ms             1  ...s/iliacher/pytorch/torch/nn/modules/conv.py(389): _conv_forward (Conv2d)
                                                                                                                  ...a/users/iliacher/pytorch/torch/nn/modules/conv.py(393): forward (Conv2d)
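A table with a Source Location column like the one above can be approximated with the current `torch.profiler` API. This is a minimal sketch, assuming `with_stack=True` on `profile` and `group_by_stack_n` on `key_averages()`; it is not necessarily the command that produced the output above. The `Conv2d` shape and input size are hypothetical, and depending on the PyTorch version the stack column may come out empty.

```python
import torch
import torch.nn as nn
from torch.profiler import profile, ProfilerActivity

# Hypothetical model and input, chosen only to trigger a convolution op.
model = nn.Conv2d(3, 8, kernel_size=3)
inputs = torch.randn(1, 3, 32, 32)

# with_stack=True asks the profiler to record Python source locations.
with profile(activities=[ProfilerActivity.CPU], with_stack=True) as prof:
    model(inputs)

# Group events by operator name plus the top 2 stack frames,
# then render the summary sorted by self CPU time.
table = prof.key_averages(group_by_stack_n=2).table(
    sort_by="self_cpu_time_total", row_limit=5
)
print(table)
```

Recording stacks adds overhead of its own, which is presumably why the benchmark logs on this page report "with stacks" as a separate setting.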
python benchmarks/profiler_benchmark/profiler_bench.py --with_cuda
Payload: loop, 256 iterations; timer min. runtime = 10
Profiling disabled, tensor size 1x1, use cuda: True, use kineto: False, with stacks: False, use script: False
<torch.utils.benchmark.utils.common.Measurement object at 0x7fb85eb71b80>
payload()
  Median: 3.20 ms
  IQR:    0.16 ms (3.11 to 3.27)
  3127 measurements, 1 runs per measurement, 1 thread
Profiling enabled, tensor size 1x1, use cuda: True, use kineto: False, with stacks: False, use script: False
USE_MKLDNN=1 BLAS=MKL BUILD_BINARY=1 USE_CUDA=0 python setup.py develop install --cmake
base:
python benchmarks/profiler_benchmark/profiler_bench.py --use_timer
Payload: loop; 256 iterations, N = 100
Profiling disabled, tensor size 1x1, use cuda: False, with stacks: False, use script: False
<torch.utils._benchmark.utils.common.Measurement object at 0x7f082441f0d0>
payload()