Profiling CPU Resnet model
--------------------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- -----------------------------------
Name Self CPU total % Self CPU total CPU total % CPU total CPU time avg CPU Mem Self CPU Mem CUDA Mem Self CUDA Mem Number of Calls Input Shapes
--------------------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- -----------------------------------
empty 0.69% 505.080us 0.69% 505.080us 4.106us 94.79 Mb 94.79 Mb 0 b 0 b 123 []
resize_ 0.01% 6.565us 0.01% 6.565us 3.283us 11.48 Mb 11.48 Mb 0 b 0 b 2 [[0]]
addmm 0.40% 293.587us 0.42% 309.148us 309.148us 19.53 Kb 19.53 Kb 0 b 0 b 1 [[1000], [5, 512], [512, 1000], [],
empty_strided 0.01% 4.246us 0.01% 4.246us 4.246us 4 b 4 b 0 b 0 b 1 []
conv2d 0.03% 22.653us 6.92% 5.067ms 5.067ms 15.31 Mb 0 b 0 b 0 b 1 [[5, 3, 224, 224], [64, 3, 7, 7], [
convolution 0.02% 14.274us 6.89% 5.045ms 5.045ms 15.31 Mb 0 b 0 b 0 b 1 [[5, 3, 224, 224], [64, 3, 7, 7], [
_convolution 0.09% 69.545us 6.87% 5.030ms 5.030ms 15.31 Mb 0 b 0 b 0 b 1 [[5, 3, 224, 224], [64, 3, 7, 7], [
size 0.00% 2.810us 0.00% 2.810us 0.468us 0 b 0 b 0 b 0 b 6 [[5, 3, 224, 224]]
contiguous 0.01% 5.473us 0.01% 5.473us 5.473us 0 b 0 b 0 b 0 b 1 [[64, 3, 7, 7]]
contiguous 0.00% 2.583us 0.00% 2.583us 2.583us 0 b 0 b 0 b 0 b 1 [[5, 3, 224, 224]]
mkldnn_convolution 6.73% 4.926ms 6.76% 4.950ms 4.950ms 15.31 Mb 0 b 0 b 0 b 1 [[5, 3, 224, 224], [64, 3, 7, 7], [
output_nr 0.00% 0.528us 0.00% 0.528us 0.528us 0 b 0 b 0 b 0 b 1 [[5, 3, 224, 224]]
is_leaf 0.00% 1.190us 0.00% 1.190us 0.595us 0 b 0 b 0 b 0 b 2 [[5, 3, 224, 224]]
output_nr 0.00% 0.388us 0.00% 0.388us 0.388us 0 b 0 b 0 b 0 b 1 [[64, 3, 7, 7]]
is_leaf 0.00% 0.700us 0.00% 0.700us 0.350us 0 b 0 b 0 b 0 b 2 [[64, 3, 7, 7]]
as_strided_ 0.00% 3.049us 0.00% 3.049us 3.049us 0 b 0 b 0 b 0 b 1 [[5, 64, 112, 112]]
is_complex 0.03% 22.506us 0.03% 22.506us 1.072us 0 b 0 b 0 b 0 b 21 [[]]
add 0.30% 220.125us 0.38% 278.617us 13.931us 160 b 0 b 0 b 0 b 20 [[], [], []]
batch_norm 0.02% 18.191us 3.93% 2.876ms 2.876ms 15.31 Mb 0 b 0 b 0 b 1 [[5, 64, 112, 112], [64], [64], [64
_batch_norm_impl_index 0.01% 10.287us 3.90% 2.858ms 2.858ms 15.31 Mb 0 b 0 b 0 b 1 [[5, 64, 112, 112], [64], [64], [64
native_batch_norm 3.49% 2.559ms 3.87% 2.836ms 2.836ms 15.31 Mb 0 b 0 b 0 b 1 [[5, 64, 112, 112], [64], [64], [64
output_nr 0.00% 2.915us 0.00% 2.915us 0.486us 0 b 0 b 0 b 0 b 6 [[5, 64, 112, 112]]
is_leaf 0.00% 3.410us 0.00% 3.410us 0.487us 0 b 0 b 0 b 0 b 7 [[5, 64, 112, 112]]
output_nr 0.02% 14.270us 0.02% 14.270us 0.571us 0 b 0 b 0 b 0 b 25 [[64]]
is_leaf 0.12% 90.816us 0.12% 90.816us 1.816us 0 b 0 b 0 b 0 b 50 [[64]]
size 0.01% 10.880us 0.01% 10.880us 0.725us 0 b 0 b 0 b 0 b 15 [[5, 64, 112, 112]]
select 0.06% 44.604us 0.32% 236.246us 26.250us 0 b 0 b 0 b 0 b 9 [[5, 64, 112, 112]]
as_strided 0.25% 183.916us 0.25% 183.916us 20.435us 0 b 0 b 0 b 0 b 9 [[5, 64, 112, 112]]
empty_like 0.01% 4.219us 0.03% 24.755us 24.755us 15.31 Mb 0 b 0 b 0 b 1 [[5, 64, 112, 112]]
stride 0.03% 19.496us 0.03% 19.496us 0.650us 0 b 0 b 0 b 0 b 30 [[64]]
relu_ 0.04% 28.348us 0.44% 319.337us 319.337us 0 b 0 b 0 b 0 b 1 [[5, 64, 112, 112]]
threshold_ 0.39% 288.665us 0.39% 288.665us 288.665us 0 b 0 b 0 b 0 b 1 [[5, 64, 112, 112], [], []]
max_pool2d 0.03% 25.551us 9.52% 6.969ms 6.969ms 11.48 Mb 0 b 0 b 0 b 1 [[5, 64, 112, 112]]
max_pool2d_with_indices 9.46% 6.927ms 9.48% 6.944ms 6.944ms 11.48 Mb 0 b 0 b 0 b 1 [[5, 64, 112, 112]]
contiguous 0.00% 1.421us 0.00% 1.421us 1.421us 0 b 0 b 0 b 0 b 1 [[5, 64, 112, 112]]
output_nr 0.03% 19.481us 0.03% 19.481us 0.590us 0 b 0 b 0 b 0 b 33 [[5, 64, 56, 56]]
is_leaf 0.02% 16.830us 0.02% 16.830us 0.467us 0 b 0 b 0 b 0 b 36 [[5, 64, 56, 56]]
conv2d 0.02% 14.013us 13.47% 9.864ms 2.466ms 15.31 Mb 0 b 0 b 0 b 4 [[5, 64, 56, 56], [64, 64, 3, 3], [
convolution 0.02% 14.090us 13.45% 9.850ms 2.463ms 15.31 Mb 0 b 0 b 0 b 4 [[5, 64, 56, 56], [64, 64, 3, 3], [
_convolution 0.11% 78.629us 13.44% 9.836ms 2.459ms 15.31 Mb 0 b 0 b 0 b 4 [[5, 64, 56, 56], [64, 64, 3, 3], [
size 0.06% 42.599us 0.06% 42.599us 0.532us 0 b 0 b 0 b 0 b 80 [[5, 64, 56, 56]]
contiguous 0.01% 4.514us 0.01% 4.514us 1.129us 0 b 0 b 0 b 0 b 4 [[64, 64, 3, 3]]
contiguous 0.00% 2.543us 0.00% 2.543us 0.424us 0 b 0 b 0 b 0 b 6 [[5, 64, 56, 56]]
mkldnn_convolution 13.22% 9.676ms 13.31% 9.742ms 2.435ms 15.31 Mb 0 b 0 b 0 b 4 [[5, 64, 56, 56], [64, 64, 3, 3], [
output_nr 0.00% 1.526us 0.00% 1.526us 0.381us 0 b 0 b 0 b 0 b 4 [[64, 64, 3, 3]]
is_leaf 0.00% 2.848us 0.00% 2.848us 0.356us 0 b 0 b 0 b 0 b 8 [[64, 64, 3, 3]]
as_strided_ 0.02% 12.852us 0.02% 12.852us 3.213us 0 b 0 b 0 b 0 b 4 [[5, 64, 56, 56]]
batch_norm 0.03% 22.325us 3.20% 2.346ms 586.431us 15.31 Mb 0 b 0 b 0 b 4 [[5, 64, 56, 56], [64], [64], [64],
_batch_norm_impl_index 0.03% 21.496us 3.17% 2.323ms 580.850us 15.31 Mb 0 b 0 b 0 b 4 [[5, 64, 56, 56], [64], [64], [64],
native_batch_norm 2.43% 1.780ms 3.11% 2.275ms 568.646us 15.31 Mb 0 b 0 b 0 b 4 [[5, 64, 56, 56], [64], [64], [64],
select 0.30% 221.274us 0.43% 312.597us 8.683us 0 b 0 b 0 b 0 b 36 [[5, 64, 56, 56]]
as_strided 0.09% 67.998us 0.09% 67.998us 1.889us 0 b 0 b 0 b 0 b 36 [[5, 64, 56, 56]]
empty_like 0.01% 9.575us 0.05% 38.260us 9.565us 15.31 Mb 0 b 0 b 0 b 4 [[5, 64, 56, 56]]
relu_ 0.08% 55.948us 0.37% 274.132us 68.533us 0 b 0 b 0 b 0 b 4 [[5, 64, 56, 56]]
threshold_ 0.28% 208.551us 0.28% 208.551us 52.138us 0 b 0 b 0 b 0 b 4 [[5, 64, 56, 56], [], []]
add_ 0.35% 254.120us 0.35% 256.760us 128.380us 0 b 0 b 0 b 0 b 2 [[5, 64, 56, 56], [5, 64, 56, 56],
conv2d 0.00% 3.486us 2.33% 1.702ms 1.702ms 1.91 Mb 0 b 0 b 0 b 1 [[5, 64, 56, 56], [128, 64, 3, 3],
convolution 0.00% 3.320us 2.32% 1.699ms 1.699ms 1.91 Mb 0 b 0 b 0 b 1 [[5, 64, 56, 56], [128, 64, 3, 3],
_convolution 0.02% 16.236us 2.32% 1.696ms 1.696ms 1.91 Mb 0 b 0 b 0 b 1 [[5, 64, 56, 56], [128, 64, 3, 3],
contiguous 0.00% 1.004us 0.00% 1.004us 1.004us 0 b 0 b 0 b 0 b 1 [[128, 64, 3, 3]]
mkldnn_convolution 2.27% 1.664ms 2.29% 1.675ms 1.675ms 1.91 Mb 0 b 0 b 0 b 1 [[5, 64, 56, 56], [128, 64, 3, 3],
output_nr 0.00% 0.349us 0.00% 0.349us 0.349us 0 b 0 b 0 b 0 b 1 [[128, 64, 3, 3]]
is_leaf 0.00% 0.917us 0.00% 0.917us 0.458us 0 b 0 b 0 b 0 b 2 [[128, 64, 3, 3]]
as_strided_ 0.01% 7.670us 0.01% 7.670us 1.534us 0 b 0 b 0 b 0 b 5 [[5, 128, 28, 28]]
batch_norm 0.04% 26.424us 3.43% 2.510ms 502.059us 9.58 Mb 0 b 0 b 0 b 5 [[5, 128, 28, 28], [128], [128], [1
_batch_norm_impl_index 0.04% 32.328us 3.39% 2.484ms 496.774us 9.58 Mb 0 b 0 b 0 b 5 [[5, 128, 28, 28], [128], [128], [1
native_batch_norm 1.74% 1.273ms 3.33% 2.436ms 487.282us 9.58 Mb 0 b 0 b 0 b 5 [[5, 128, 28, 28], [128], [128], [1
output_nr 0.02% 14.944us 0.02% 14.944us 0.467us 0 b 0 b 0 b 0 b 32 [[5, 128, 28, 28]]
is_leaf 0.02% 16.584us 0.02% 16.584us 0.488us 0 b 0 b 0 b 0 b 34 [[5, 128, 28, 28]]
output_nr 0.01% 10.286us 0.01% 10.286us 0.411us 0 b 0 b 0 b 0 b 25 [[128]]
is_leaf 0.03% 18.603us 0.03% 18.603us 0.372us 0 b 0 b 0 b 0 b 50 [[128]]
size 0.30% 217.754us 0.30% 217.754us 1.675us 0 b 0 b 0 b 0 b 130 [[5, 128, 28, 28]]
select 0.88% 640.659us 1.41% 1.034ms 11.491us 0 b 0 b 0 b 0 b 90 [[5, 128, 28, 28]]
as_strided 0.27% 194.153us 0.27% 194.153us 2.157us 0 b 0 b 0 b 0 b 90 [[5, 128, 28, 28]]
empty_like 0.01% 10.400us 0.04% 31.671us 6.334us 9.57 Mb 0 b 0 b 0 b 5 [[5, 128, 28, 28]]
stride 0.05% 34.717us 0.05% 34.717us 0.579us 0 b 0 b 0 b 0 b 60 [[128]]
relu_ 0.08% 61.652us 0.30% 222.402us 55.601us 0 b 0 b 0 b 0 b 4 [[5, 128, 28, 28]]
threshold_ 0.21% 150.844us 0.21% 150.844us 37.711us 0 b 0 b 0 b 0 b 4 [[5, 128, 28, 28], [], []]
conv2d 0.01% 10.362us 7.68% 5.625ms 1.875ms 5.74 Mb 0 b 0 b 0 b 3 [[5, 128, 28, 28], [128, 128, 3, 3]
convolution 0.01% 9.222us 7.67% 5.615ms 1.872ms 5.74 Mb 0 b 0 b 0 b 3 [[5, 128, 28, 28], [128, 128, 3, 3]
_convolution 0.07% 48.068us 7.66% 5.606ms 1.869ms 5.74 Mb 0 b 0 b 0 b 3 [[5, 128, 28, 28], [128, 128, 3, 3]
contiguous 0.00% 2.938us 0.00% 2.938us 0.979us 0 b 0 b 0 b 0 b 3 [[128, 128, 3, 3]]
contiguous 0.00% 2.256us 0.00% 2.256us 0.451us 0 b 0 b 0 b 0 b 5 [[5, 128, 28, 28]]
mkldnn_convolution 7.52% 5.507ms 7.57% 5.546ms 1.849ms 5.74 Mb 0 b 0 b 0 b 3 [[5, 128, 28, 28], [128, 128, 3, 3]
output_nr 0.00% 1.170us 0.00% 1.170us 0.390us 0 b 0 b 0 b 0 b 3 [[128, 128, 3, 3]]
is_leaf 0.00% 2.340us 0.00% 2.340us 0.390us 0 b 0 b 0 b 0 b 6 [[128, 128, 3, 3]]
conv2d 0.00% 3.141us 1.95% 1.428ms 1.428ms 1.91 Mb 0 b 0 b 0 b 1 [[5, 64, 56, 56], [128, 64, 1, 1],
convolution 0.00% 2.822us 1.95% 1.425ms 1.425ms 1.91 Mb 0 b 0 b 0 b 1 [[5, 64, 56, 56], [128, 64, 1, 1],
_convolution 0.02% 16.119us 1.94% 1.422ms 1.422ms 1.91 Mb 0 b 0 b 0 b 1 [[5, 64, 56, 56], [128, 64, 1, 1],
contiguous 0.00% 0.929us 0.00% 0.929us 0.929us 0 b 0 b 0 b 0 b 1 [[128, 64, 1, 1]]
mkldnn_convolution 1.90% 1.391ms 1.92% 1.402ms 1.402ms 1.91 Mb 0 b 0 b 0 b 1 [[5, 64, 56, 56], [128, 64, 1, 1],
output_nr 0.00% 0.500us 0.00% 0.500us 0.500us 0 b 0 b 0 b 0 b 1 [[128, 64, 1, 1]]
is_leaf 0.00% 0.788us 0.00% 0.788us 0.394us 0 b 0 b 0 b 0 b 2 [[128, 64, 1, 1]]
add_ 0.21% 156.852us 0.22% 159.337us 79.668us 0 b 0 b 0 b 0 b 2 [[5, 128, 28, 28], [5, 128, 28, 28]
conv2d 0.01% 10.101us 2.20% 1.610ms 1.610ms 980.00 Kb 0 b 0 b 0 b 1 [[5, 128, 28, 28], [256, 128, 3, 3]
convolution 0.00% 3.251us 2.19% 1.600ms 1.600ms 980.00 Kb 0 b 0 b 0 b 1 [[5, 128, 28, 28], [256, 128, 3, 3]
_convolution 0.02% 15.261us 2.18% 1.597ms 1.597ms 980.00 Kb 0 b 0 b 0 b 1 [[5, 128, 28, 28], [256, 128, 3, 3]
contiguous 0.00% 0.947us 0.00% 0.947us 0.947us 0 b 0 b 0 b 0 b 1 [[256, 128, 3, 3]]
mkldnn_convolution 2.14% 1.567ms 2.16% 1.578ms 1.578ms 980.00 Kb 0 b 0 b 0 b 1 [[5, 128, 28, 28], [256, 128, 3, 3]
output_nr 0.00% 0.465us 0.00% 0.465us 0.465us 0 b 0 b 0 b 0 b 1 [[256, 128, 3, 3]]
is_leaf 0.00% 0.944us 0.00% 0.944us 0.472us 0 b 0 b 0 b 0 b 2 [[256, 128, 3, 3]]
as_strided_ 0.01% 7.662us 0.01% 7.662us 1.532us 0 b 0 b 0 b 0 b 5 [[5, 256, 14, 14]]
batch_norm 0.03% 22.351us 4.80% 3.511ms 702.250us 4.79 Mb 0 b 0 b 0 b 5 [[5, 256, 14, 14], [256], [256], [2
_batch_norm_impl_index 0.04% 25.725us 4.77% 3.489ms 697.780us 4.79 Mb 0 b 0 b 0 b 5 [[5, 256, 14, 14], [256], [256], [2
native_batch_norm 2.00% 1.463ms 4.71% 3.448ms 689.534us 4.79 Mb 0 b 0 b 0 b 5 [[5, 256, 14, 14], [256], [256], [2
output_nr 0.02% 14.737us 0.02% 14.737us 0.461us 0 b 0 b 0 b 0 b 32 [[5, 256, 14, 14]]
is_leaf 0.02% 16.926us 0.02% 16.926us 0.498us 0 b 0 b 0 b 0 b 34 [[5, 256, 14, 14]]
output_nr 0.01% 9.750us 0.01% 9.750us 0.390us 0 b 0 b 0 b 0 b 25 [[256]]
is_leaf 0.02% 18.275us 0.02% 18.275us 0.365us 0 b 0 b 0 b 0 b 50 [[256]]
size 0.21% 155.625us 0.21% 155.625us 0.759us 0 b 0 b 0 b 0 b 205 [[5, 256, 14, 14]]
select 1.79% 1.313ms 2.49% 1.823ms 11.047us 0 b 0 b 0 b 0 b 165 [[5, 256, 14, 14]]
as_strided 0.51% 374.809us 0.51% 374.809us 2.272us 0 b 0 b 0 b 0 b 165 [[5, 256, 14, 14]]
empty_like 0.01% 10.128us 0.04% 30.307us 6.061us 4.79 Mb 0 b 0 b 0 b 5 [[5, 256, 14, 14]]
stride 0.08% 61.793us 0.08% 61.793us 0.562us 0 b 0 b 0 b 0 b 110 [[256]]
relu_ 0.09% 62.566us 0.26% 193.869us 48.467us 0 b 0 b 0 b 0 b 4 [[5, 256, 14, 14]]
threshold_ 0.17% 121.053us 0.17% 121.053us 30.263us 0 b 0 b 0 b 0 b 4 [[5, 256, 14, 14], [], []]
conv2d 0.01% 10.139us 7.87% 5.761ms 1.920ms 2.87 Mb 0 b 0 b 0 b 3 [[5, 256, 14, 14], [256, 256, 3, 3]
convolution 0.01% 10.011us 7.86% 5.751ms 1.917ms 2.87 Mb 0 b 0 b 0 b 3 [[5, 256, 14, 14], [256, 256, 3, 3]
_convolution 0.09% 66.666us 7.84% 5.741ms 1.914ms 2.87 Mb 0 b 0 b 0 b 3 [[5, 256, 14, 14], [256, 256, 3, 3]
contiguous 0.00% 3.284us 0.00% 3.284us 1.095us 0 b 0 b 0 b 0 b 3 [[256, 256, 3, 3]]
contiguous 0.00% 2.513us 0.00% 2.513us 0.503us 0 b 0 b 0 b 0 b 5 [[5, 256, 14, 14]]
mkldnn_convolution 7.69% 5.630ms 7.73% 5.659ms 1.886ms 2.87 Mb 0 b 0 b 0 b 3 [[5, 256, 14, 14], [256, 256, 3, 3]
output_nr 0.00% 1.273us 0.00% 1.273us 0.424us 0 b 0 b 0 b 0 b 3 [[256, 256, 3, 3]]
is_leaf 0.00% 2.220us 0.00% 2.220us 0.370us 0 b 0 b 0 b 0 b 6 [[256, 256, 3, 3]]
conv2d 0.00% 3.361us 1.30% 949.595us 949.595us 980.00 Kb 0 b 0 b 0 b 1 [[5, 128, 28, 28], [256, 128, 1, 1]
convolution 0.00% 3.193us 1.29% 946.234us 946.234us 980.00 Kb 0 b 0 b 0 b 1 [[5, 128, 28, 28], [256, 128, 1, 1]
_convolution 0.02% 16.435us 1.29% 943.041us 943.041us 980.00 Kb 0 b 0 b 0 b 1 [[5, 128, 28, 28], [256, 128, 1, 1]
contiguous 0.00% 1.058us 0.00% 1.058us 1.058us 0 b 0 b 0 b 0 b 1 [[256, 128, 1, 1]]
mkldnn_convolution 1.25% 912.103us 1.26% 922.395us 922.395us 980.00 Kb 0 b 0 b 0 b 1 [[5, 128, 28, 28], [256, 128, 1, 1]
output_nr 0.00% 0.442us 0.00% 0.442us 0.442us 0 b 0 b 0 b 0 b 1 [[256, 128, 1, 1]]
is_leaf 0.00% 0.707us 0.00% 0.707us 0.353us 0 b 0 b 0 b 0 b 2 [[256, 128, 1, 1]]
add_ 0.18% 128.148us 0.18% 130.722us 65.361us 0 b 0 b 0 b 0 b 2 [[5, 256, 14, 14], [5, 256, 14, 14]
conv2d 0.00% 3.304us 2.57% 1.882ms 1.882ms 490.00 Kb 0 b 0 b 0 b 1 [[5, 256, 14, 14], [512, 256, 3, 3]
convolution 0.00% 3.296us 2.57% 1.879ms 1.879ms 490.00 Kb 0 b 0 b 0 b 1 [[5, 256, 14, 14], [512, 256, 3, 3]
_convolution 0.02% 16.281us 2.56% 1.875ms 1.875ms 490.00 Kb 0 b 0 b 0 b 1 [[5, 256, 14, 14], [512, 256, 3, 3]
contiguous 0.00% 1.013us 0.00% 1.013us 1.013us 0 b 0 b 0 b 0 b 1 [[512, 256, 3, 3]]
mkldnn_convolution 2.52% 1.845ms 2.53% 1.855ms 1.855ms 490.00 Kb 0 b 0 b 0 b 1 [[5, 256, 14, 14], [512, 256, 3, 3]
output_nr 0.00% 0.405us 0.00% 0.405us 0.405us 0 b 0 b 0 b 0 b 1 [[512, 256, 3, 3]]
is_leaf 0.00% 0.759us 0.00% 0.759us 0.379us 0 b 0 b 0 b 0 b 2 [[512, 256, 3, 3]]
as_strided_ 0.01% 7.783us 0.01% 7.783us 1.557us 0 b 0 b 0 b 0 b 5 [[5, 512, 7, 7]]
batch_norm 0.03% 22.837us 8.81% 6.449ms 1.290ms 2.41 Mb 0 b 0 b 0 b 5 [[5, 512, 7, 7], [512], [512], [512
_batch_norm_impl_index 0.04% 26.706us 8.78% 6.426ms 1.285ms 2.41 Mb 0 b 0 b 0 b 5 [[5, 512, 7, 7], [512], [512], [512
native_batch_norm 3.58% 2.618ms 8.72% 6.384ms 1.277ms 2.41 Mb 0 b 0 b 0 b 5 [[5, 512, 7, 7], [512], [512], [512
output_nr 0.02% 14.038us 0.02% 14.038us 0.484us 0 b 0 b 0 b 0 b 29 [[5, 512, 7, 7]]
is_leaf 0.02% 14.278us 0.02% 14.278us 0.476us 0 b 0 b 0 b 0 b 30 [[5, 512, 7, 7]]
output_nr 0.01% 9.751us 0.01% 9.751us 0.390us 0 b 0 b 0 b 0 b 25 [[512]]
is_leaf 0.02% 17.939us 0.02% 17.939us 0.359us 0 b 0 b 0 b 0 b 50 [[512]]
size 0.38% 279.889us 0.38% 279.889us 0.773us 0 b 0 b 0 b 0 b 362 [[5, 512, 7, 7]]
select 3.35% 2.452ms 4.84% 3.544ms 10.739us 0 b 0 b 0 b 0 b 330 [[5, 512, 7, 7]]
as_strided 1.13% 827.936us 1.13% 827.936us 2.509us 0 b 0 b 0 b 0 b 330 [[5, 512, 7, 7]]
empty_like 0.01% 9.976us 0.04% 28.651us 5.730us 2.39 Mb 0 b 0 b 0 b 5 [[5, 512, 7, 7]]
stride 0.17% 123.846us 0.17% 123.846us 0.563us 0 b 0 b 0 b 0 b 220 [[512]]
relu_ 0.08% 58.973us 0.26% 187.328us 46.832us 0 b 0 b 0 b 0 b 4 [[5, 512, 7, 7]]
threshold_ 0.16% 118.399us 0.16% 118.399us 29.600us 0 b 0 b 0 b 0 b 4 [[5, 512, 7, 7], [], []]
conv2d 0.01% 9.823us 10.39% 7.609ms 2.536ms 1.44 Mb 0 b 0 b 0 b 3 [[5, 512, 7, 7], [512, 512, 3, 3],
convolution 0.01% 10.222us 10.38% 7.599ms 2.533ms 1.44 Mb 0 b 0 b 0 b 3 [[5, 512, 7, 7], [512, 512, 3, 3],
_convolution 0.07% 48.605us 10.37% 7.589ms 2.530ms 1.44 Mb 0 b 0 b 0 b 3 [[5, 512, 7, 7], [512, 512, 3, 3],
contiguous 0.00% 2.841us 0.00% 2.841us 0.947us 0 b 0 b 0 b 0 b 3 [[512, 512, 3, 3]]
contiguous 0.00% 2.945us 0.00% 2.945us 0.736us 0 b 0 b 0 b 0 b 4 [[5, 512, 7, 7]]
mkldnn_convolution 10.24% 7.497ms 10.28% 7.528ms 2.509ms 1.44 Mb 0 b 0 b 0 b 3 [[5, 512, 7, 7], [512, 512, 3, 3],
output_nr 0.00% 1.177us 0.00% 1.177us 0.392us 0 b 0 b 0 b 0 b 3 [[512, 512, 3, 3]]
is_leaf 0.00% 2.343us 0.00% 2.343us 0.391us 0 b 0 b 0 b 0 b 6 [[512, 512, 3, 3]]
conv2d 0.00% 3.051us 0.89% 649.848us 649.848us 490.00 Kb 0 b 0 b 0 b 1 [[5, 256, 14, 14], [512, 256, 1, 1]
convolution 0.00% 2.928us 0.88% 646.797us 646.797us 490.00 Kb 0 b 0 b 0 b 1 [[5, 256, 14, 14], [512, 256, 1, 1]
_convolution 0.02% 15.670us 0.88% 643.869us 643.869us 490.00 Kb 0 b 0 b 0 b 1 [[5, 256, 14, 14], [512, 256, 1, 1]
contiguous 0.00% 0.992us 0.00% 0.992us 0.992us 0 b 0 b 0 b 0 b 1 [[512, 256, 1, 1]]
mkldnn_convolution 0.84% 614.094us 0.85% 624.382us 624.382us 490.00 Kb 0 b 0 b 0 b 1 [[5, 256, 14, 14], [512, 256, 1, 1]
output_nr 0.00% 0.357us 0.00% 0.357us 0.357us 0 b 0 b 0 b 0 b 1 [[512, 256, 1, 1]]
is_leaf 0.00% 0.732us 0.00% 0.732us 0.366us 0 b 0 b 0 b 0 b 2 [[512, 256, 1, 1]]
add_ 0.16% 119.187us 0.17% 121.948us 60.974us 0 b 0 b 0 b 0 b 2 [[5, 512, 7, 7], [5, 512, 7, 7], []
adaptive_avg_pool2d 0.03% 23.718us 0.38% 280.146us 280.146us 10.00 Kb 0 b 0 b 0 b 1 [[5, 512, 7, 7]]
view 0.04% 32.851us 0.05% 34.231us 34.231us 0 b 0 b 0 b 0 b 1 [[5, 512, 7, 7]]
_version 0.00% 1.441us 0.00% 1.441us 0.721us 0 b 0 b 0 b 0 b 2 [[2560, 49]]
mean 0.05% 36.089us 0.28% 206.474us 206.474us 10.00 Kb 0 b 0 b 0 b 1 [[2560, 49]]
output_nr 0.00% 0.736us 0.00% 0.736us 0.736us 0 b 0 b 0 b 0 b 1 [[2560, 49]]
size 0.00% 0.832us 0.00% 0.832us 0.832us 0 b 0 b 0 b 0 b 1 [[2560, 49]]
sum_out 0.13% 94.595us 0.15% 107.012us 107.012us 10.00 Kb 0 b 0 b 0 b 1 [[], [2560, 49]]
as_strided 0.00% 1.290us 0.00% 1.290us 1.290us 0 b 0 b 0 b 0 b 1 [[2560]]
fill_ 0.01% 5.478us 0.01% 5.478us 5.478us 0 b 0 b 0 b 0 b 1 [[2560, 1], []]
to 0.01% 7.911us 0.04% 29.021us 29.021us 4 b 0 b 0 b 0 b 1 [[]]
copy_ 0.02% 16.864us 0.02% 16.864us 16.864us 0 b 0 b 0 b 0 b 1 [[], []]
view 0.01% 9.991us 0.01% 10.886us 10.886us 0 b 0 b 0 b 0 b 1 [[2560]]
output_nr 0.00% 0.450us 0.00% 0.450us 0.450us 0 b 0 b 0 b 0 b 1 [[2560]]
_version 0.00% 1.255us 0.00% 1.255us 0.628us 0 b 0 b 0 b 0 b 2 [[5, 512, 1, 1]]
flatten 0.04% 32.328us 0.09% 69.254us 69.254us 0 b 0 b 0 b 0 b 1 [[5, 512, 1, 1]]
size 0.00% 1.050us 0.00% 1.050us 1.050us 0 b 0 b 0 b 0 b 1 [[5, 512, 1, 1]]
reshape 0.01% 6.157us 0.05% 35.876us 35.876us 0 b 0 b 0 b 0 b 1 [[5, 512, 1, 1]]
view 0.03% 19.516us 0.04% 29.719us 29.719us 0 b 0 b 0 b 0 b 1 [[5, 512, 1, 1]]
output_nr 0.00% 0.760us 0.00% 0.760us 0.760us 0 b 0 b 0 b 0 b 1 [[5, 512, 1, 1]]
_version 0.01% 9.405us 0.01% 9.405us 3.135us 0 b 0 b 0 b 0 b 3 [[5, 512]]
t 0.03% 19.873us 0.04% 26.051us 26.051us 0 b 0 b 0 b 0 b 1 [[1000, 512]]
transpose 0.00% 3.461us 0.01% 5.625us 5.625us 0 b 0 b 0 b 0 b 1 [[1000, 512]]
as_strided 0.00% 2.164us 0.00% 2.164us 2.164us 0 b 0 b 0 b 0 b 1 [[1000, 512]]
_version 0.00% 1.543us 0.00% 1.543us 0.514us 0 b 0 b 0 b 0 b 3 [[512, 1000]]
output_nr 0.00% 0.806us 0.00% 0.806us 0.403us 0 b 0 b 0 b 0 b 2 [[5, 512]]
output_nr 0.00% 0.784us 0.00% 0.784us 0.392us 0 b 0 b 0 b 0 b 2 [[512, 1000]]
is_leaf 0.00% 0.831us 0.00% 0.831us 0.416us 0 b 0 b 0 b 0 b 2 [[5, 512]]
is_leaf 0.00% 1.200us 0.00% 1.200us 0.600us 0 b 0 b 0 b 0 b 2 [[512, 1000]]
size 0.00% 1.022us 0.00% 1.022us 1.022us 0 b 0 b 0 b 0 b 1 [[5, 512]]
size 0.00% 0.660us 0.00% 0.660us 0.660us 0 b 0 b 0 b 0 b 1 [[512, 1000]]
expand 0.01% 5.202us 0.01% 7.346us 7.346us 0 b 0 b 0 b 0 b 1 [[1000]]
as_strided 0.00% 2.144us 0.00% 2.144us 2.144us 0 b 0 b 0 b 0 b 1 [[1000]]
stride 0.00% 1.150us 0.00% 1.150us 1.150us 0 b 0 b 0 b 0 b 1 [[5, 1000]]
div_ 0.04% 31.259us 0.08% 61.060us 61.060us 0 b -4 b 0 b 0 b 1 [[2560], []]
root 4.85% 3.551ms 100.00% 73.213ms 73.213ms 160 b -106.30 Mb 0 b 0 b 1 []
--------------------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- -----------------------------------
Self CPU time total: 73.213ms
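The "Self CPU total %" column in the table above appears to be each row's self CPU time divided by the "Self CPU time total" of 73.213ms. The sketch below checks that reading on three rows copied from the table. The helper names (`to_ms`, `self_cpu_percent`) are hypothetical and are not part of the profiler API; only the standard library is used.

```python
# Hypothetical helpers for illustration; not part of any profiler API.
# All time values are copied from the CPU table above.
UNIT_TO_MS = {"us": 1e-3, "ms": 1.0, "s": 1e3}


def to_ms(text):
    """Convert a profiler time string such as '505.080us' to milliseconds."""
    for unit in ("us", "ms", "s"):  # test 'us' and 'ms' before the bare 's'
        if text.endswith(unit):
            return float(text[: -len(unit)]) * UNIT_TO_MS[unit]
    raise ValueError(f"unrecognized time string: {text!r}")


def self_cpu_percent(self_time, total_time):
    """Share of total self CPU time, rounded to two decimals like the table."""
    return round(100.0 * to_ms(self_time) / to_ms(total_time), 2)


TOTAL = "73.213ms"  # 'Self CPU time total' line of the CPU run
rows = {
    "empty": "505.080us",
    "mkldnn_convolution (first conv)": "4.926ms",
    "max_pool2d_with_indices": "6.927ms",
}

shares = {name: self_cpu_percent(t, TOTAL) for name, t in rows.items()}
for name, pct in shares.items():
    print(f"{name}: {pct:.2f}%")
```

The computed shares can be compared against the 0.69%, 6.73% and 9.46% reported for the same rows in the table.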
Profiling CUDA Resnet model
--------------------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- -----------------------------------
Name Self CPU total % Self CPU total CPU total % CPU total CPU time avg CPU Mem Self CPU Mem CUDA Mem Self CUDA Mem Number of Calls Input Shapes
--------------------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- -----------------------------------
empty 0.60% 4.779ms 0.60% 4.779ms 33.418us 0 b 0 b 161.98 Mb 161.98 Mb 143 []
resize_ 0.00% 9.756us 0.00% 9.756us 4.878us 0 b 0 b 12.41 Mb 12.41 Mb 2 [[0]]
addmm 24.42% 195.618ms 24.42% 195.633ms 195.633ms 0 b 0 b 20.00 Kb 20.00 Kb 1 [[1000], [5, 512], [512, 1000], [],
conv2d 0.00% 8.285us 73.90% 592.071ms 592.071ms 0 b 0 b 16.00 Mb 0 b 1 [[5, 3, 224, 224], [64, 3, 7, 7], [
convolution 0.00% 7.180us 73.90% 592.063ms 592.063ms 0 b 0 b 16.00 Mb 0 b 1 [[5, 3, 224, 224], [64, 3, 7, 7], [
_convolution 0.00% 39.672us 73.90% 592.055ms 592.055ms 0 b 0 b 16.00 Mb 0 b 1 [[5, 3, 224, 224], [64, 3, 7, 7], [
size 0.00% 19.240us 0.00% 19.240us 1.924us 0 b 0 b 0 b 0 b 10 [[5, 3, 224, 224]]
contiguous 0.00% 3.208us 0.00% 3.208us 1.604us 0 b 0 b 0 b 0 b 2 [[5, 3, 224, 224]]
output_nr 0.00% 0.747us 0.00% 0.747us 0.747us 0 b 0 b 0 b 0 b 1 [[5, 3, 224, 224]]
is_leaf 0.00% 1.420us 0.00% 1.420us 0.710us 0 b 0 b 0 b 0 b 2 [[5, 3, 224, 224]]
output_nr 0.00% 0.445us 0.00% 0.445us 0.445us 0 b 0 b 0 b 0 b 1 [[64, 3, 7, 7]]
is_leaf 0.00% 0.679us 0.00% 0.679us 0.340us 0 b 0 b 0 b 0 b 2 [[64, 3, 7, 7]]
contiguous 0.00% 1.967us 0.00% 1.967us 1.967us 0 b 0 b 0 b 0 b 1 [[64, 3, 7, 7]]
resize_ 0.00% 5.448us 0.00% 5.448us 5.448us 0 b 0 b 0 b 0 b 1 [[64, 3, 7, 7]]
resize_ 0.00% 0.482us 0.00% 0.482us 0.482us 0 b 0 b 0 b 0 b 1 [[5, 3, 224, 224]]
stride 0.00% 2.049us 0.00% 2.049us 0.512us 0 b 0 b 0 b 0 b 4 [[5, 3, 224, 224]]
size 0.00% 2.371us 0.00% 2.371us 0.296us 0 b 0 b 0 b 0 b 8 [[64, 3, 7, 7]]
is_complex 0.00% 19.207us 0.00% 19.207us 0.960us 0 b 0 b 0 b 0 b 20 [[]]
add 0.06% 473.972us 0.07% 563.815us 28.191us 0 b 0 b 10.00 Kb 0 b 20 [[], [], []]
batch_norm 0.00% 7.336us 0.06% 519.660us 519.660us 0 b 0 b 16.00 Mb 0 b 1 [[5, 64, 112, 112], [64], [64], [64
_batch_norm_impl_index 0.00% 21.293us 0.06% 512.324us 512.324us 0 b 0 b 16.00 Mb 0 b 1 [[5, 64, 112, 112], [64], [64], [64
size 0.00% 3.478us 0.00% 3.478us 0.497us 0 b 0 b 0 b 0 b 7 [[5, 64, 112, 112]]
contiguous 0.00% 14.472us 0.00% 14.472us 0.724us 0 b 0 b 0 b 0 b 20 [[64]]
contiguous 0.00% 1.567us 0.00% 1.567us 0.784us 0 b 0 b 0 b 0 b 2 [[5, 64, 112, 112]]
cudnn_batch_norm 0.01% 107.848us 0.06% 485.897us 485.897us 0 b 0 b 16.00 Mb 0 b 1 [[5, 64, 112, 112], [64], [64], [64
output_nr 0.00% 2.533us 0.00% 2.533us 0.422us 0 b 0 b 0 b 0 b 6 [[5, 64, 112, 112]]
is_leaf 0.00% 3.271us 0.00% 3.271us 0.467us 0 b 0 b 0 b 0 b 7 [[5, 64, 112, 112]]
output_nr 0.00% 9.392us 0.00% 9.392us 0.376us 0 b 0 b 0 b 0 b 25 [[64]]
is_leaf 0.00% 16.477us 0.00% 16.477us 0.330us 0 b 0 b 0 b 0 b 50 [[64]]
empty_like 0.00% 3.861us 0.04% 354.047us 354.047us 0 b 0 b 16.00 Mb 0 b 1 [[5, 64, 112, 112]]
view 0.00% 15.945us 0.00% 15.945us 3.189us 0 b 0 b 0 b 0 b 5 [[64]]
output_nr 0.00% 6.337us 0.00% 6.337us 0.317us 0 b 0 b 0 b 0 b 20 [[0]]
is_leaf 0.00% 12.135us 0.00% 12.135us 0.303us 0 b 0 b 0 b 0 b 40 [[0]]
relu_ 0.00% 18.927us 0.01% 71.842us 71.842us 0 b 0 b 0 b 0 b 1 [[5, 64, 112, 112]]
threshold_ 0.01% 50.757us 0.01% 50.757us 50.757us 0 b 0 b 0 b 0 b 1 [[5, 64, 112, 112], [], []]
max_pool2d 0.00% 5.498us 0.01% 87.233us 87.233us 0 b 0 b 12.41 Mb 0 b 1 [[5, 64, 112, 112]]
max_pool2d_with_indices 0.01% 61.681us 0.01% 81.735us 81.735us 0 b 0 b 12.41 Mb 0 b 1 [[5, 64, 112, 112]]
stride 0.00% 1.491us 0.00% 1.491us 0.373us 0 b 0 b 0 b 0 b 4 [[5, 64, 112, 112]]
output_nr 0.01% 68.122us 0.01% 68.122us 2.064us 0 b 0 b 0 b 0 b 33 [[5, 64, 56, 56]]
is_leaf 0.00% 29.424us 0.00% 29.424us 0.817us 0 b 0 b 0 b 0 b 36 [[5, 64, 56, 56]]
conv2d 0.00% 15.755us 0.10% 792.535us 198.134us 0 b 0 b 16.17 Mb 0 b 4 [[5, 64, 56, 56], [64, 64, 3, 3], [
convolution 0.00% 13.692us 0.10% 776.780us 194.195us 0 b 0 b 16.17 Mb 0 b 4 [[5, 64, 56, 56], [64, 64, 3, 3], [
_convolution 0.01% 55.752us 0.10% 763.088us 190.772us 0 b 0 b 16.17 Mb 0 b 4 [[5, 64, 56, 56], [64, 64, 3, 3], [
size 0.01% 41.297us 0.01% 41.297us 0.574us 0 b 0 b 0 b 0 b 72 [[5, 64, 56, 56]]
contiguous 0.00% 12.176us 0.00% 12.176us 0.761us 0 b 0 b 0 b 0 b 16 [[5, 64, 56, 56]]
output_nr 0.00% 1.430us 0.00% 1.430us 0.358us 0 b 0 b 0 b 0 b 4 [[64, 64, 3, 3]]
is_leaf 0.00% 2.671us 0.00% 2.671us 0.334us 0 b 0 b 0 b 0 b 8 [[64, 64, 3, 3]]
contiguous 0.00% 3.651us 0.00% 3.651us 0.913us 0 b 0 b 0 b 0 b 4 [[64, 64, 3, 3]]
resize_ 0.00% 3.565us 0.00% 3.565us 0.891us 0 b 0 b 0 b 0 b 4 [[64, 64, 3, 3]]
resize_ 0.00% 2.813us 0.00% 2.813us 0.469us 0 b 0 b 0 b 0 b 6 [[5, 64, 56, 56]]
stride 0.00% 13.914us 0.00% 13.914us 0.580us 0 b 0 b 0 b 0 b 24 [[5, 64, 56, 56]]
size 0.00% 9.960us 0.00% 9.960us 0.311us 0 b 0 b 0 b 0 b 32 [[64, 64, 3, 3]]
batch_norm 0.00% 18.267us 0.11% 908.070us 227.018us 0 b 0 b 15.32 Mb 0 b 4 [[5, 64, 56, 56], [64], [64], [64],
_batch_norm_impl_index 0.01% 46.027us 0.11% 889.803us 222.451us 0 b 0 b 15.32 Mb 0 b 4 [[5, 64, 56, 56], [64], [64], [64],
cudnn_batch_norm 0.04% 328.777us 0.10% 827.404us 206.851us 0 b 0 b 15.32 Mb 0 b 4 [[5, 64, 56, 56], [64], [64], [64],
empty_like 0.00% 8.278us 0.05% 422.781us 105.695us 0 b 0 b 15.31 Mb 0 b 4 [[5, 64, 56, 56]]
relu_ 0.01% 53.044us 0.02% 133.412us 33.353us 0 b 0 b 0 b 0 b 4 [[5, 64, 56, 56]]
threshold_ 0.01% 58.932us 0.01% 58.932us 14.733us 0 b 0 b 0 b 0 b 4 [[5, 64, 56, 56], [], []]
add_ 0.01% 65.366us 0.01% 67.750us 33.875us 0 b 0 b 0 b 0 b 2 [[5, 64, 56, 56], [5, 64, 56, 56],
conv2d 0.00% 3.549us 0.03% 213.820us 213.820us 0 b 0 b 1.91 Mb 0 b 1 [[5, 64, 56, 56], [128, 64, 3, 3],
convolution 0.00% 2.934us 0.03% 210.271us 210.271us 0 b 0 b 1.91 Mb 0 b 1 [[5, 64, 56, 56], [128, 64, 3, 3],
_convolution 0.00% 22.169us 0.03% 207.337us 207.337us 0 b 0 b 1.91 Mb 0 b 1 [[5, 64, 56, 56], [128, 64, 3, 3],
output_nr 0.00% 0.337us 0.00% 0.337us 0.337us 0 b 0 b 0 b 0 b 1 [[128, 64, 3, 3]]
is_leaf 0.00% 0.673us 0.00% 0.673us 0.337us 0 b 0 b 0 b 0 b 2 [[128, 64, 3, 3]]
contiguous 0.00% 0.879us 0.00% 0.879us 0.879us 0 b 0 b 0 b 0 b 1 [[128, 64, 3, 3]]
resize_ 0.00% 0.897us 0.00% 0.897us 0.897us 0 b 0 b 0 b 0 b 1 [[128, 64, 3, 3]]
size 0.00% 2.520us 0.00% 2.520us 0.315us 0 b 0 b 0 b 0 b 8 [[128, 64, 3, 3]]
batch_norm 0.01% 40.545us 0.11% 877.500us 175.500us 0 b 0 b 9.58 Mb 0 b 5 [[5, 128, 28, 28], [128], [128], [1
_batch_norm_impl_index 0.01% 52.653us 0.10% 836.955us 167.391us 0 b 0 b 9.58 Mb 0 b 5 [[5, 128, 28, 28], [128], [128], [1
size 0.01% 41.579us 0.01% 41.579us 0.640us 0 b 0 b 0 b 0 b 65 [[5, 128, 28, 28]]
contiguous 0.00% 17.769us 0.00% 17.769us 0.888us 0 b 0 b 0 b 0 b 20 [[128]]
contiguous 0.00% 10.594us 0.00% 10.594us 0.706us 0 b 0 b 0 b 0 b 15 [[5, 128, 28, 28]]
cudnn_batch_norm 0.04% 316.076us 0.09% 752.344us 150.469us 0 b 0 b 9.58 Mb 0 b 5 [[5, 128, 28, 28], [128], [128], [1
output_nr 0.00% 18.278us 0.00% 18.278us 0.571us 0 b 0 b 0 b 0 b 32 [[5, 128, 28, 28]]
is_leaf 0.00% 16.469us 0.00% 16.469us 0.484us 0 b 0 b 0 b 0 b 34 [[5, 128, 28, 28]]
output_nr 0.00% 8.702us 0.00% 8.702us 0.348us 0 b 0 b 0 b 0 b 25 [[128]]
is_leaf 0.00% 15.634us 0.00% 15.634us 0.313us 0 b 0 b 0 b 0 b 50 [[128]]
empty_like 0.00% 9.997us 0.04% 343.082us 68.616us 0 b 0 b 9.57 Mb 0 b 5 [[5, 128, 28, 28]]
view 0.00% 12.846us 0.00% 12.846us 2.569us 0 b 0 b 0 b 0 b 5 [[128]]
relu_ 0.01% 53.069us 0.02% 127.187us 31.797us 0 b 0 b 0 b 0 b 4 [[5, 128, 28, 28]]
threshold_ 0.01% 59.635us 0.01% 59.635us 14.909us 0 b 0 b 0 b 0 b 4 [[5, 128, 28, 28], [], []]
conv2d 0.00% 10.173us 0.08% 658.278us 219.426us 0 b 0 b 5.74 Mb 0 b 3 [[5, 128, 28, 28], [128, 128, 3, 3]
convolution 0.00% 8.284us 0.08% 648.105us 216.035us 0 b 0 b 5.74 Mb 0 b 3 [[5, 128, 28, 28], [128, 128, 3, 3]
_convolution 0.00% 39.000us 0.08% 639.821us 213.274us 0 b 0 b 5.74 Mb 0 b 3 [[5, 128, 28, 28], [128, 128, 3, 3]
output_nr 0.00% 1.185us 0.00% 1.185us 0.395us 0 b 0 b 0 b 0 b 3 [[128, 128, 3, 3]]
is_leaf 0.00% 2.055us 0.00% 2.055us 0.342us 0 b 0 b 0 b 0 b 6 [[128, 128, 3, 3]]
contiguous 0.00% 2.306us 0.00% 2.306us 0.769us 0 b 0 b 0 b 0 b 3 [[128, 128, 3, 3]]
resize_ 0.00% 2.419us 0.00% 2.419us 0.806us 0 b 0 b 0 b 0 b 3 [[128, 128, 3, 3]]
resize_ 0.00% 1.998us 0.00% 1.998us 0.400us 0 b 0 b 0 b 0 b 5 [[5, 128, 28, 28]]
stride 0.00% 7.016us 0.00% 7.016us 0.351us 0 b 0 b 0 b 0 b 20 [[5, 128, 28, 28]]
size 0.00% 7.308us 0.00% 7.308us 0.304us 0 b 0 b 0 b 0 b 24 [[128, 128, 3, 3]]
conv2d 0.00% 3.269us 0.01% 119.554us 119.554us 0 b 0 b 2.77 Mb 0 b 1 [[5, 64, 56, 56], [128, 64, 1, 1],
convolution 0.00% 2.859us 0.01% 116.285us 116.285us 0 b 0 b 2.77 Mb 0 b 1 [[5, 64, 56, 56], [128, 64, 1, 1],
_convolution 0.00% 15.278us 0.01% 113.426us 113.426us 0 b 0 b 2.77 Mb 0 b 1 [[5, 64, 56, 56], [128, 64, 1, 1],
output_nr 0.00% 0.338us 0.00% 0.338us 0.338us 0 b 0 b 0 b 0 b 1 [[128, 64, 1, 1]]
is_leaf 0.00% 0.666us 0.00% 0.666us 0.333us 0 b 0 b 0 b 0 b 2 [[128, 64, 1, 1]]
contiguous 0.00% 0.764us 0.00% 0.764us 0.764us 0 b 0 b 0 b 0 b 1 [[128, 64, 1, 1]]
resize_ 0.00% 0.856us 0.00% 0.856us 0.856us 0 b 0 b 0 b 0 b 1 [[128, 64, 1, 1]]
size 0.00% 2.480us 0.00% 2.480us 0.310us 0 b 0 b 0 b 0 b 8 [[128, 64, 1, 1]]
add_ 0.01% 50.608us 0.01% 52.851us 26.426us 0 b 0 b 0 b 0 b 2 [[5, 128, 28, 28], [5, 128, 28, 28]
conv2d 0.00% 3.374us 0.01% 107.086us 107.086us 0 b 0 b 980.00 Kb 0 b 1 [[5, 128, 28, 28], [256, 128, 3, 3]
convolution 0.00% 2.616us 0.01% 103.712us 103.712us 0 b 0 b 980.00 Kb 0 b 1 [[5, 128, 28, 28], [256, 128, 3, 3]
_convolution 0.00% 11.694us 0.01% 101.096us 101.096us 0 b 0 b 980.00 Kb 0 b 1 [[5, 128, 28, 28], [256, 128, 3, 3]
output_nr 0.00% 3.741us 0.00% 3.741us 3.741us 0 b 0 b 0 b 0 b 1 [[256, 128, 3, 3]]
is_leaf 0.00% 0.616us 0.00% 0.616us 0.308us 0 b 0 b 0 b 0 b 2 [[256, 128, 3, 3]]
contiguous 0.00% 0.716us 0.00% 0.716us 0.716us 0 b 0 b 0 b 0 b 1 [[256, 128, 3, 3]]
resize_ 0.00% 0.730us 0.00% 0.730us 0.730us 0 b 0 b 0 b 0 b 1 [[256, 128, 3, 3]]
size 0.00% 2.332us 0.00% 2.332us 0.291us 0 b 0 b 0 b 0 b 8 [[256, 128, 3, 3]]
batch_norm 0.00% 21.654us 0.17% 1.384ms 276.851us 0 b 0 b 4.79 Mb 0 b 5 [[5, 256, 14, 14], [256], [256], [2
_batch_norm_impl_index 0.01% 83.581us 0.17% 1.363ms 272.520us 0 b 0 b 4.79 Mb 0 b 5 [[5, 256, 14, 14], [256], [256], [2
size 0.00% 25.925us 0.00% 25.925us 0.399us 0 b 0 b 0 b 0 b 65 [[5, 256, 14, 14]]
contiguous 0.02% 139.290us 0.02% 139.290us 6.964us 0 b 0 b 0 b 0 b 20 [[256]]
contiguous 0.00% 12.802us 0.00% 12.802us 0.853us 0 b 0 b 0 b 0 b 15 [[5, 256, 14, 14]]
cudnn_batch_norm 0.04% 332.621us 0.14% 1.131ms 226.289us 0 b 0 b 4.79 Mb 0 b 5 [[5, 256, 14, 14], [256], [256], [2
output_nr 0.00% 19.658us 0.00% 19.658us 0.614us 0 b 0 b 0 b 0 b 32 [[5, 256, 14, 14]]
is_leaf 0.00% 20.439us 0.00% 20.439us 0.601us 0 b 0 b 0 b 0 b 34 [[5, 256, 14, 14]]
output_nr 0.00% 11.528us 0.00% 11.528us 0.461us 0 b 0 b 0 b 0 b 25 [[256]]
is_leaf 0.00% 15.916us 0.00% 15.916us 0.318us 0 b 0 b 0 b 0 b 50 [[256]]
empty_like 0.00% 14.464us 0.09% 703.294us 140.659us 0 b 0 b 4.79 Mb 0 b 5 [[5, 256, 14, 14]]
view 0.00% 13.441us 0.00% 13.441us 2.688us 0 b 0 b 0 b 0 b 5 [[256]]
relu_ 0.01% 70.222us 0.02% 158.959us 39.740us 0 b 0 b 0 b 0 b 4 [[5, 256, 14, 14]]
threshold_ 0.01% 73.271us 0.01% 73.271us 18.318us 0 b 0 b 0 b 0 b 4 [[5, 256, 14, 14], [], []]
conv2d 0.00% 10.235us 0.08% 624.915us 208.305us 0 b 0 b 2.87 Mb 0 b 3 [[5, 256, 14, 14], [256, 256, 3, 3]
convolution 0.00% 8.136us 0.08% 614.680us 204.893us 0 b 0 b 2.87 Mb 0 b 3 [[5, 256, 14, 14], [256, 256, 3, 3]
_convolution 0.00% 36.136us 0.08% 606.544us 202.181us 0 b 0 b 2.87 Mb 0 b 3 [[5, 256, 14, 14], [256, 256, 3, 3]
output_nr 0.00% 1.059us 0.00% 1.059us 0.353us 0 b 0 b 0 b 0 b 3 [[256, 256, 3, 3]]
is_leaf 0.00% 2.146us 0.00% 2.146us 0.358us 0 b 0 b 0 b 0 b 6 [[256, 256, 3, 3]]
contiguous 0.00% 6.138us 0.00% 6.138us 2.046us 0 b 0 b 0 b 0 b 3 [[256, 256, 3, 3]]
resize_ 0.00% 2.426us 0.00% 2.426us 0.809us 0 b 0 b 0 b 0 b 3 [[256, 256, 3, 3]]
resize_ 0.00% 2.027us 0.00% 2.027us 0.405us 0 b 0 b 0 b 0 b 5 [[5, 256, 14, 14]]
stride 0.00% 6.931us 0.00% 6.931us 0.347us 0 b 0 b 0 b 0 b 20 [[5, 256, 14, 14]]
size 0.00% 9.277us 0.00% 9.277us 0.387us 0 b 0 b 0 b 0 b 24 [[256, 256, 3, 3]]
conv2d 0.00% 3.078us 0.05% 435.402us 435.402us 0 b 0 b 980.00 Kb 0 b 1 [[5, 128, 28, 28], [256, 128, 1, 1]
convolution 0.00% 2.686us 0.05% 432.324us 432.324us 0 b 0 b 980.00 Kb 0 b 1 [[5, 128, 28, 28], [256, 128, 1, 1]
_convolution 0.00% 11.495us 0.05% 429.638us 429.638us 0 b 0 b 980.00 Kb 0 b 1 [[5, 128, 28, 28], [256, 128, 1, 1]
output_nr 0.00% 0.329us 0.00% 0.329us 0.329us 0 b 0 b 0 b 0 b 1 [[256, 128, 1, 1]]
is_leaf 0.00% 0.591us 0.00% 0.591us 0.296us 0 b 0 b 0 b 0 b 2 [[256, 128, 1, 1]]
contiguous 0.00% 1.370us 0.00% 1.370us 1.370us 0 b 0 b 0 b 0 b 1 [[256, 128, 1, 1]]
resize_ 0.00% 0.855us 0.00% 0.855us 0.855us 0 b 0 b 0 b 0 b 1 [[256, 128, 1, 1]]
size 0.00% 2.365us 0.00% 2.365us 0.296us 0 b 0 b 0 b 0 b 8 [[256, 128, 1, 1]]
add_ 0.01% 52.917us 0.01% 55.269us 27.634us 0 b 0 b 0 b 0 b 2 [[5, 256, 14, 14], [5, 256, 14, 14]
conv2d 0.00% 3.498us 0.02% 136.447us 136.447us 0 b 0 b 490.00 Kb 0 b 1 [[5, 256, 14, 14], [512, 256, 3, 3]
convolution 0.00% 2.788us 0.02% 132.949us 132.949us 0 b 0 b 490.00 Kb 0 b 1 [[5, 256, 14, 14], [512, 256, 3, 3]
_convolution 0.00% 12.592us 0.02% 130.161us 130.161us 0 b 0 b 490.00 Kb 0 b 1 [[5, 256, 14, 14], [512, 256, 3, 3]
output_nr 0.00% 0.310us 0.00% 0.310us 0.310us 0 b 0 b 0 b 0 b 1 [[512, 256, 3, 3]]
is_leaf 0.00% 14.981us 0.00% 14.981us 7.490us 0 b 0 b 0 b 0 b 2 [[512, 256, 3, 3]]
contiguous 0.00% 0.704us 0.00% 0.704us 0.704us 0 b 0 b 0 b 0 b 1 [[512, 256, 3, 3]]
resize_ 0.00% 0.793us 0.00% 0.793us 0.793us 0 b 0 b 0 b 0 b 1 [[512, 256, 3, 3]]
size 0.00% 4.346us 0.00% 4.346us 0.543us 0 b 0 b 0 b 0 b 8 [[512, 256, 3, 3]]
batch_norm 0.00% 21.957us 0.11% 875.645us 175.129us 0 b 0 b 2.41 Mb 0 b 5 [[5, 512, 7, 7], [512], [512], [512
_batch_norm_impl_index 0.01% 59.302us 0.11% 853.688us 170.738us 0 b 0 b 2.41 Mb 0 b 5 [[5, 512, 7, 7], [512], [512], [512
size 0.00% 20.762us 0.00% 20.762us 0.424us 0 b 0 b 0 b 0 b 49 [[5, 512, 7, 7]]
contiguous 0.00% 20.677us 0.00% 20.677us 1.034us 0 b 0 b 0 b 0 b 20 [[512]]
contiguous 0.00% 8.507us 0.00% 8.507us 0.709us 0 b 0 b 0 b 0 b 12 [[5, 512, 7, 7]]
cudnn_batch_norm 0.04% 300.266us 0.10% 767.178us 153.436us 0 b 0 b 2.41 Mb 0 b 5 [[5, 512, 7, 7], [512], [512], [512
output_nr 0.00% 18.389us 0.00% 18.389us 0.634us 0 b 0 b 0 b 0 b 29 [[5, 512, 7, 7]]
is_leaf 0.00% 11.728us 0.00% 11.728us 0.391us 0 b 0 b 0 b 0 b 30 [[5, 512, 7, 7]]
output_nr 0.00% 13.743us 0.00% 13.743us 0.550us 0 b 0 b 0 b 0 b 25 [[512]]
is_leaf 0.00% 18.531us 0.00% 18.531us 0.371us 0 b 0 b 0 b 0 b 50 [[512]]
empty_like 0.00% 16.065us 0.05% 367.617us 73.523us 0 b 0 b 2.39 Mb 0 b 5 [[5, 512, 7, 7]]
view 0.00% 12.997us 0.00% 12.997us 2.599us 0 b 0 b 0 b 0 b 5 [[512]]
relu_ 0.01% 85.920us 0.02% 160.393us 40.098us 0 b 0 b 0 b 0 b 4 [[5, 512, 7, 7]]
threshold_ 0.01% 66.152us 0.01% 66.152us 16.538us 0 b 0 b 0 b 0 b 4 [[5, 512, 7, 7], [], []]
conv2d 0.00% 10.257us 0.12% 980.922us 326.974us 0 b 0 b 1.44 Mb 0 b 3 [[5, 512, 7, 7], [512, 512, 3, 3],
convolution 0.00% 8.409us 0.12% 970.665us 323.555us 0 b 0 b 1.44 Mb 0 b 3 [[5, 512, 7, 7], [512, 512, 3, 3],
_convolution 0.01% 47.663us 0.12% 962.256us 320.752us 0 b 0 b 1.44 Mb 0 b 3 [[5, 512, 7, 7], [512, 512, 3, 3],
output_nr 0.00% 0.967us 0.00% 0.967us 0.322us 0 b 0 b 0 b 0 b 3 [[512, 512, 3, 3]]
is_leaf 0.00% 2.025us 0.00% 2.025us 0.337us 0 b 0 b 0 b 0 b 6 [[512, 512, 3, 3]]
contiguous 0.00% 2.924us 0.00% 2.924us 0.975us 0 b 0 b 0 b 0 b 3 [[512, 512, 3, 3]]
resize_ 0.00% 5.685us 0.00% 5.685us 1.895us 0 b 0 b 0 b 0 b 3 [[512, 512, 3, 3]]
resize_ 0.00% 1.190us 0.00% 1.190us 0.397us 0 b 0 b 0 b 0 b 3 [[5, 512, 7, 7]]
stride 0.00% 15.763us 0.00% 15.763us 1.314us 0 b 0 b 0 b 0 b 12 [[5, 512, 7, 7]]
size 0.00% 14.329us 0.00% 14.329us 0.597us 0 b 0 b 0 b 0 b 24 [[512, 512, 3, 3]]
conv2d 0.00% 3.248us 0.02% 124.055us 124.055us 0 b 0 b 490.00 Kb 0 b 1 [[5, 256, 14, 14], [512, 256, 1, 1]
convolution 0.00% 2.649us 0.02% 120.807us 120.807us 0 b 0 b 490.00 Kb 0 b 1 [[5, 256, 14, 14], [512, 256, 1, 1]
_convolution 0.00% 11.461us 0.01% 118.158us 118.158us 0 b 0 b 490.00 Kb 0 b 1 [[5, 256, 14, 14], [512, 256, 1, 1]
output_nr 0.00% 0.326us 0.00% 0.326us 0.326us 0 b 0 b 0 b 0 b 1 [[512, 256, 1, 1]]
is_leaf 0.00% 0.618us 0.00% 0.618us 0.309us 0 b 0 b 0 b 0 b 2 [[512, 256, 1, 1]]
contiguous 0.00% 0.793us 0.00% 0.793us 0.793us 0 b 0 b 0 b 0 b 1 [[512, 256, 1, 1]]
resize_ 0.00% 0.813us 0.00% 0.813us 0.813us 0 b 0 b 0 b 0 b 1 [[512, 256, 1, 1]]
size 0.00% 2.271us 0.00% 2.271us 0.284us 0 b 0 b 0 b 0 b 8 [[512, 256, 1, 1]]
add_ 0.01% 52.932us 0.01% 55.218us 27.609us 0 b 0 b 0 b 0 b 2 [[5, 512, 7, 7], [5, 512, 7, 7], []
adaptive_avg_pool2d 0.00% 13.622us 0.01% 94.946us 94.946us 0 b 0 b 10.00 Kb 0 b 1 [[5, 512, 7, 7]]
view 0.00% 12.170us 0.00% 13.182us 13.182us 0 b 0 b 0 b 0 b 1 [[5, 512, 7, 7]]
_version 0.00% 0.957us 0.00% 0.957us 0.478us 0 b 0 b 0 b 0 b 2 [[2560, 49]]
mean 0.01% 48.108us 0.01% 54.856us 54.856us 0 b 0 b 10.00 Kb 0 b 1 [[2560, 49]]
output_nr 0.00% 0.309us 0.00% 0.309us 0.309us 0 b 0 b 0 b 0 b 1 [[2560, 49]]
as_strided 0.00% 1.924us 0.00% 1.924us 1.924us 0 b 0 b 0 b 0 b 1 [[2560]]
view 0.00% 9.572us 0.00% 10.354us 10.354us 0 b 0 b 0 b 0 b 1 [[2560]]
output_nr 0.00% 0.380us 0.00% 0.380us 0.380us 0 b 0 b 0 b 0 b 1 [[2560]]
_version 0.00% 0.690us 0.00% 0.690us 0.345us 0 b 0 b 0 b 0 b 2 [[5, 512, 1, 1]]
flatten 0.00% 6.140us 0.00% 17.509us 17.509us 0 b 0 b 0 b 0 b 1 [[5, 512, 1, 1]]
size 0.00% 0.477us 0.00% 0.477us 0.477us 0 b 0 b 0 b 0 b 1 [[5, 512, 1, 1]]
reshape 0.00% 3.012us 0.00% 10.892us 10.892us 0 b 0 b 0 b 0 b 1 [[5, 512, 1, 1]]
view 0.00% 6.953us 0.00% 7.880us 7.880us 0 b 0 b 0 b 0 b 1 [[5, 512, 1, 1]]
output_nr 0.00% 0.340us 0.00% 0.340us 0.340us 0 b 0 b 0 b 0 b 1 [[5, 512, 1, 1]]
_version 0.00% 0.903us 0.00% 0.903us 0.301us 0 b 0 b 0 b 0 b 3 [[5, 512]]
t 0.00% 10.484us 0.00% 15.294us 15.294us 0 b 0 b 0 b 0 b 1 [[1000, 512]]
transpose 0.00% 2.729us 0.00% 4.471us 4.471us 0 b 0 b 0 b 0 b 1 [[1000, 512]]
as_strided 0.00% 1.742us 0.00% 1.742us 1.742us 0 b 0 b 0 b 0 b 1 [[1000, 512]]
_version 0.00% 0.935us 0.00% 0.935us 0.312us 0 b 0 b 0 b 0 b 3 [[512, 1000]]
output_nr 0.00% 0.650us 0.00% 0.650us 0.325us 0 b 0 b 0 b 0 b 2 [[5, 512]]
output_nr 0.00% 0.580us 0.00% 0.580us 0.290us 0 b 0 b 0 b 0 b 2 [[512, 1000]]
is_leaf 0.00% 6.889us 0.00% 6.889us 3.444us 0 b 0 b 0 b 0 b 2 [[5, 512]]
is_leaf 0.00% 0.593us 0.00% 0.593us 0.296us 0 b 0 b 0 b 0 b 2 [[512, 1000]]
size 0.00% 0.501us 0.00% 0.501us 0.501us 0 b 0 b 0 b 0 b 1 [[5, 512]]
size 0.00% 0.329us 0.00% 0.329us 0.329us 0 b 0 b 0 b 0 b 1 [[512, 1000]]
expand 0.00% 3.425us 0.00% 4.466us 4.466us 0 b 0 b 0 b 0 b 1 [[1000]]
as_strided 0.00% 1.041us 0.00% 1.041us 1.041us 0 b 0 b 0 b 0 b 1 [[1000]]
cudnn_convolution 0.01% 80.192us 0.01% 114.033us 114.033us 0 b 0 b 490.00 Kb -512 b 1 [[5, 256, 14, 14], [512, 256, 3, 3]
cudnn_convolution 0.01% 86.486us 0.01% 103.607us 103.607us 0 b 0 b 490.00 Kb -512 b 1 [[5, 256, 14, 14], [512, 256, 1, 1]
cudnn_convolution 0.01% 65.716us 0.01% 86.196us 86.196us 0 b 0 b 980.00 Kb -1.50 Kb 1 [[5, 128, 28, 28], [256, 128, 3, 3]
cudnn_convolution 0.01% 76.627us 0.05% 415.116us 415.116us 0 b 0 b 980.00 Kb -1.50 Kb 1 [[5, 128, 28, 28], [256, 128, 1, 1]
cudnn_convolution 0.01% 89.671us 0.02% 169.932us 169.932us 0 b 0 b 1.91 Mb -5.00 Kb 1 [[5, 64, 56, 56], [128, 64, 3, 3]]
cudnn_convolution 0.01% 76.599us 0.01% 94.970us 94.970us 0 b 0 b 2.77 Mb -5.00 Kb 1 [[5, 64, 56, 56], [128, 64, 1, 1]]
cudnn_convolution 73.83% 591.544ms 73.89% 592.010ms 592.010ms 0 b 0 b 16.00 Mb -74.00 Kb 1 [[5, 3, 224, 224], [64, 3, 7, 7]]
cudnn_convolution 0.04% 297.932us 0.09% 694.063us 173.516us 0 b 0 b 16.17 Mb -1.00 Mb 4 [[5, 64, 56, 56], [64, 64, 3, 3]]
cudnn_convolution 0.03% 249.074us 0.07% 591.193us 197.064us 0 b 0 b 5.74 Mb -3.00 Mb 3 [[5, 128, 28, 28], [128, 128, 3, 3]
cudnn_convolution 0.03% 211.863us 0.07% 560.540us 186.847us 0 b 0 b 2.87 Mb -12.00 Mb 3 [[5, 256, 14, 14], [256, 256, 3, 3]
cudnn_convolution 0.03% 212.955us 0.11% 904.809us 301.603us 0 b 0 b 1.44 Mb -48.00 Mb 3 [[5, 512, 7, 7], [512, 512, 3, 3]]
root 0.38% 3.029ms 100.00% 801.172ms 801.172ms 0 b 0 b 0 b -110.33 Mb 1 []
--------------------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- -----------------------------------
Self CPU time total: 801.172ms