Last active
June 4, 2018 14:38
-
-
Save ih4cku/d53f691cc79b9b433e3703740195b9f7 to your computer and use it in GitHub Desktop.
calc DeepLab receptive field
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#!/usr/bin/env python | |
# net architecture of http://ccvl.stat.ucla.edu/ccvl/DeepLab/train.prototxt | |
# format: [kernel, stride] | |
deep_lab_vgg = [ | |
# 1 | |
[3, 1], | |
[3, 1], | |
[2, 2], | |
# 2 | |
[3, 1], | |
[3, 1], | |
[2, 2], | |
# 3 | |
[3, 1], | |
[3, 1], | |
[3, 1], | |
[2, 2], | |
# 4 | |
[3, 1], | |
[3, 1], | |
[3, 1], | |
[2, 1], | |
# 5 | |
[5, 1], | |
[5, 1], | |
[5, 1], | |
[3, 1], | |
# fc6, kernel=4, hole=4 | |
[13, 1] | |
] | |
# K: composed kernel, also the receptive field | |
# S: composed stride | |
K, S = 1, 1 | |
for i in range(len(deep_lab_vgg)): | |
k, s = deep_lab_vgg[i] | |
K = (k-1) * S + K | |
S = S * s | |
print 'layer:', i, deep_lab_vgg[i], '[', K, S, ']' |
Hi
The deeplab paper said "In the case of the VGG-16 net we consider, its receptive field is 224x224 (with zero-padding) and 404x404 pixels if the net is applied convolutionally" and "We have addressed this practical problem by spatially subsampling (by simple decimation) the first FC layer to 4x4 (or 3x3) spatial size. This has reduced the receptive field of the network down to 128x128 (with zero-padding) or 308x308 (in convolutional mode) "
How do they calculate and get the results of "224x224" and "128x128" ? What does "with zero-padding" mean? Zero-padding doesn't influence receptive field, does it?
I guess you guys might have answers. Thanks a lot!
i have the same questions!
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
P.S.:
receptive filed of hole_conv:
r.f. = kernel + (kernel-1)*(hole-1)