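using Metalhead
using Metalhead: classify
using Images        # assumed source of `load` for reading the image file
using TimerOutputs

vgg = VGG19()
img = load("190px-Asian_elephant_-_melbourne_zoo.jpg")

# Time a single forward pass with the default global timer
begin
    reset_timer!()
    classify(vgg, img)
    print_timer()
end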
Initial:
───────────────────────────────────────────────────────────────────────────────
                                   Time                   Allocations
                          ──────────────────────   ───────────────────────
Tot / % measured:              766ms / 89.0%           675MiB / 96.4%

Section             ncalls     time   %tot     avg     alloc   %tot      avg
───────────────────────────────────────────────────────────────────────────────
conv                    16    466ms  68.3%  29.1ms    510MiB  78.4%  31.9MiB
conv2d!                 16    366ms  53.7%  22.9ms    396MiB  60.9%  24.8MiB
im2col2d!               16    195ms  28.6%  12.2ms   39.6MiB  6.09%  2.48MiB
im2col_2d!              16    183ms  26.8%  11.4ms         -  0.00%        -
gemm                    16    141ms  20.6%  8.79ms         -  0.00%        -
maxpool2d!               5    139ms  20.3%  27.7ms    140MiB  21.5%  28.0MiB
max_pooling2d_fwd!       5    139ms  20.3%  27.7ms    140MiB  21.5%  28.0MiB
dense                    3   77.3ms  11.3%  25.8ms   72.6KiB  0.01%  24.2KiB
───────────────────────────────────────────────────────────────────────────────
With diff:
-        y[pw, ph, c, n] = maximum(x[wstart:wend, hstart:hend, c, n])
+        m = typemin(T)
+        for j in hstart:hend
+            for i in wstart:wend
+                m = max(m, x[i, j, c, n])
+            end
+        end
+
+        y[pw, ph, c, n] = m
     end
 end
───────────────────────────────────────────────────────────────────────────────
                                   Time                   Allocations
                          ──────────────────────   ───────────────────────
Tot / % measured:              643ms / 88.6%           532MiB / 95.9%

Section             ncalls     time   %tot     avg     alloc   %tot      avg
───────────────────────────────────────────────────────────────────────────────
conv                    16    487ms  85.4%  30.4ms    510MiB   100%  31.8MiB
conv2d!                 16    406ms  71.1%  25.3ms    396MiB  77.7%  24.8MiB
im2col2d!               16    195ms  34.1%  12.2ms   39.6MiB  7.78%  2.48MiB
im2col_2d!              16    182ms  31.9%  11.4ms         -  0.00%        -
gemm                    16    122ms  21.3%  7.60ms         -  0.00%        -
dense                    3   78.3ms  13.7%  26.1ms   72.6KiB  0.01%  24.2KiB
maxpool2d!               5   4.64ms  0.81%   928μs   1.53KiB  0.00%        -
max_pooling2d_fwd!       5   4.64ms  0.81%   927μs         -  0.00%        -
───────────────────────────────────────────────────────────────────────────────
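
The drop in maxpool2d! allocations (140MiB down to about zero) is consistent with how Julia indexing works: indexing with ranges, as in x[wstart:wend, hstart:hend, c, n], copies the window into a new array before maximum runs, while the explicit loop reads elements in place. Below is a minimal standalone sketch of that difference. The helper names window_max_slice and window_max_loop are hypothetical and not part of NNlib; the array sizes are arbitrary.

```julia
# Hypothetical helpers for illustration only; not NNlib code.

# Slice-based: range indexing copies the window into a temporary array.
function window_max_slice(x::AbstractArray{T,4}, ws, hs, c, n) where T
    return maximum(x[ws, hs, c, n])
end

# Loop-based: reads each element in place, no temporary array.
function window_max_loop(x::AbstractArray{T,4}, ws, hs, c, n) where T
    m = typemin(T)
    for j in hs, i in ws          # i is the inner loop (column-major order)
        m = max(m, x[i, j, c, n])
    end
    return m
end

x = rand(Float32, 8, 8, 3, 1)
ws, hs = 1:2, 3:4

# Run both once so compilation is not counted in the measurement.
window_max_slice(x, ws, hs, 1, 1)
window_max_loop(x, ws, hs, 1, 1)

# Both versions agree on the result.
@assert window_max_slice(x, ws, hs, 1, 1) == window_max_loop(x, ws, hs, 1, 1)

# Compare bytes allocated per call.
println("slice: ", @allocated(window_max_slice(x, ws, hs, 1, 1)), " bytes")
println("loop:  ", @allocated(window_max_loop(x, ws, hs, 1, 1)), " bytes")
```

Wrapping the slice in @view would be another way to avoid the copy; the loop form is the one used in the diff above.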