@ksimonyan
Last active June 16, 2018 12:39
CNN_M_1024 model from the BMVC-2014 paper "Return of the Devil in the Details: Delving Deep into Convolutional Nets"

## Information

name: CNN_M_1024 model from the BMVC-2014 paper: "Return of the Devil in the Details: Delving Deep into Convolutional Nets"

mean_file_mat: http://www.robots.ox.ac.uk/~vgg/software/deep_eval/releases/bvlc/VGG_mean.mat

mean_file_proto: http://www.robots.ox.ac.uk/~vgg/software/deep_eval/releases/bvlc/VGG_mean.binaryproto

caffemodel: VGG_CNN_M_1024

caffemodel_url: http://www.robots.ox.ac.uk/~vgg/software/deep_eval/releases/bvlc/VGG_CNN_M_1024.caffemodel

license: non-commercial use only

caffe_version: trained using a custom Caffe-based framework

gist_id: f0f3d010e6d5f0100274

## Description

The CNN_M_1024 model is trained on the ILSVRC-2012 dataset. The details can be found in the following BMVC-2014 paper:

Return of the Devil in the Details: Delving Deep into Convolutional Nets
K. Chatfield, K. Simonyan, A. Vedaldi, A. Zisserman
British Machine Vision Conference, 2014 (arXiv:1405.3531)

Please cite the paper if you use the model.

The model is trained on 224x224 crops sampled from images that have been rescaled so that the smallest side is 256 pixels, preserving the aspect ratio. The released mean BGR image should be subtracted from each 224x224 crop.
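
The rescaling and cropping geometry described above can be sketched in a few lines. This is only an illustration: the function names `rescale_size` and `centre_crop_box` are hypothetical, and rounding to the nearest pixel is an assumption, since the exact rounding used during training is not stated here.

```python
def rescale_size(width, height, smallest_side=256):
    """Return (new_width, new_height) so that the smallest side equals
    `smallest_side` and the aspect ratio is preserved."""
    scale = smallest_side / min(width, height)
    # Assumption: round to the nearest pixel; never go below the target side.
    new_w = max(smallest_side, round(width * scale))
    new_h = max(smallest_side, round(height * scale))
    return new_w, new_h


def centre_crop_box(width, height, crop=224):
    """Return (left, top, right, bottom) of the central crop-by-crop window."""
    left = (width - crop) // 2
    top = (height - crop) // 2
    return left, top, left + crop, top + crop


new_w, new_h = rescale_size(640, 480)   # (341, 256)
box = centre_crop_box(new_w, new_h)     # (58, 16, 282, 240)
```

The mean image would then be subtracted from the pixels inside `box`, with the channels in BGR order.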

Further details can be found in the paper and on the project website: http://www.robots.ox.ac.uk/~vgg/research/deep_eval/

## Note

The model is stored in a different format than the one released at http://www.robots.ox.ac.uk/~vgg/software/deep_eval/ to make it compatible with BVLC Caffe and BGR images (the network weights are the same). The class order is also different; the one used here corresponds to synsets.txt in http://dl.caffe.berkeleyvision.org/caffe_ilsvrc12.tar.gz
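
Because the class order differs between the two releases, scores produced under one ordering have to be permuted before they can be compared with scores under the other. A minimal sketch, assuming both orderings are available as lists of the same WordNet synset IDs; the helper `reorder_scores` and the three-entry lists are hypothetical and purely illustrative:

```python
def reorder_scores(scores, source_order, target_order):
    """Rearrange `scores` (aligned with `source_order`) so that they
    follow `target_order`. Both orders must contain the same synset IDs."""
    position = {wnid: i for i, wnid in enumerate(source_order)}
    return [scores[position[wnid]] for wnid in target_order]


# Illustrative three-class example; the real lists have 1000 entries.
source_order = ["n01484850", "n01440764", "n01443537"]
target_order = ["n01440764", "n01443537", "n01484850"]
scores = [0.7, 0.1, 0.2]  # aligned with source_order

reordered = reorder_scores(scores, source_order, target_order)
```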

## ILSVRC-2012 performance

Using 10 test crops (corners, centre, and horizontal flips), the top-5 classification error on the validation set of ILSVRC-2012 is 13.7%.
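
The 10 crop windows (four corners and the centre, each with and without a horizontal flip) can be enumerated as below. This is a sketch only: `ten_crop_boxes` is a hypothetical helper, and how the per-crop scores are combined is left to the paper.

```python
def ten_crop_boxes(width, height, crop=224):
    """Return 10 (box, flipped) pairs: four corners and the centre,
    first unflipped, then horizontally flipped.
    Each box is (left, top, right, bottom)."""
    right, bottom = width - crop, height - crop
    origins = [
        (0, 0),                     # top-left corner
        (right, 0),                 # top-right corner
        (0, bottom),                # bottom-left corner
        (right, bottom),            # bottom-right corner
        (right // 2, bottom // 2),  # centre
    ]
    boxes = [(x, y, x + crop, y + crop) for x, y in origins]
    return [(box, flipped) for flipped in (False, True) for box in boxes]


crops = ten_crop_boxes(256, 256)
```

One such set of 10 crops fits the batch size of 10 declared by the first `input_dim` in the deploy definition.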

Using a single central crop, the top-5 classification error on the validation set of ILSVRC-2012 is 16.0%.

The details of the evaluation can be found in the paper.
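
For reference, top-5 error counts an image as misclassified when its ground-truth label is not among the five highest-scoring classes. A minimal sketch of that metric on toy data; the function name `top5_error` and the scores are illustrative, not the evaluation code used for the figures above:

```python
def top5_error(scores, labels, k=5):
    """Fraction of samples whose true label is not among the k highest scores.
    `scores` is a list of per-class score lists, `labels` the true class indices."""
    errors = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores in this row.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label not in top_k:
            errors += 1
    return errors / len(labels)


# Toy example with 6 classes and 2 images.
toy_scores = [
    [0.10, 0.20, 0.30, 0.05, 0.25, 0.12],  # true class 3 has the lowest score
    [0.10, 0.20, 0.30, 0.05, 0.25, 0.12],  # true class 2 has the highest score
]
toy_labels = [3, 2]
error = top5_error(toy_scores, toy_labels)
```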

name: "VGG_CNN_M_1024"
input: "data"
input_dim: 10
input_dim: 3
input_dim: 224
input_dim: 224
layers {
  bottom: "data"
  top: "conv1"
  name: "conv1"
  type: CONVOLUTION
  convolution_param {
    num_output: 96
    kernel_size: 7
    stride: 2
  }
}
layers {
  bottom: "conv1"
  top: "conv1"
  name: "relu1"
  type: RELU
}
layers {
  bottom: "conv1"
  top: "norm1"
  name: "norm1"
  type: LRN
  lrn_param {
    local_size: 5
    alpha: 0.0005
    beta: 0.75
    k: 2
  }
}
layers {
  bottom: "norm1"
  top: "pool1"
  name: "pool1"
  type: POOLING
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layers {
  bottom: "pool1"
  top: "conv2"
  name: "conv2"
  type: CONVOLUTION
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 5
    stride: 2
  }
}
layers {
  bottom: "conv2"
  top: "conv2"
  name: "relu2"
  type: RELU
}
layers {
  bottom: "conv2"
  top: "norm2"
  name: "norm2"
  type: LRN
  lrn_param {
    local_size: 5
    alpha: 0.0005
    beta: 0.75
    k: 2
  }
}
layers {
  bottom: "norm2"
  top: "pool2"
  name: "pool2"
  type: POOLING
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layers {
  bottom: "pool2"
  top: "conv3"
  name: "conv3"
  type: CONVOLUTION
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
  }
}
layers {
  bottom: "conv3"
  top: "conv3"
  name: "relu3"
  type: RELU
}
layers {
  bottom: "conv3"
  top: "conv4"
  name: "conv4"
  type: CONVOLUTION
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
  }
}
layers {
  bottom: "conv4"
  top: "conv4"
  name: "relu4"
  type: RELU
}
layers {
  bottom: "conv4"
  top: "conv5"
  name: "conv5"
  type: CONVOLUTION
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
  }
}
layers {
  bottom: "conv5"
  top: "conv5"
  name: "relu5"
  type: RELU
}
layers {
  bottom: "conv5"
  top: "pool5"
  name: "pool5"
  type: POOLING
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layers {
  bottom: "pool5"
  top: "fc6"
  name: "fc6"
  type: INNER_PRODUCT
  inner_product_param {
    num_output: 4096
  }
}
layers {
  bottom: "fc6"
  top: "fc6"
  name: "relu6"
  type: RELU
}
layers {
  bottom: "fc6"
  top: "fc6"
  name: "drop6"
  type: DROPOUT
  dropout_param {
    dropout_ratio: 0.5
  }
}
layers {
  bottom: "fc6"
  top: "fc7"
  name: "fc7"
  type: INNER_PRODUCT
  inner_product_param {
    num_output: 1024
  }
}
layers {
  bottom: "fc7"
  top: "fc7"
  name: "relu7"
  type: RELU
}
layers {
  bottom: "fc7"
  top: "fc7"
  name: "drop7"
  type: DROPOUT
  dropout_param {
    dropout_ratio: 0.5
  }
}
layers {
  bottom: "fc7"
  top: "fc8"
  name: "fc8"
  type: INNER_PRODUCT
  inner_product_param {
    num_output: 1000
  }
}
layers {
  bottom: "fc8"
  top: "prob"
  name: "prob"
  type: SOFTMAX
}
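
As a sanity check on the definition above, the spatial size of each feature map can be traced from the 224x224 input using the kernel, stride and pad values listed in the layers. The sketch below assumes the usual BVLC Caffe conventions, namely that convolution output sizes round down and pooling output sizes round up; the helper names are hypothetical.

```python
def conv_out(size, kernel, stride=1, pad=0):
    # Assumption: Caffe convolution rounds the output size down.
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride=1, pad=0):
    # Assumption: Caffe pooling rounds the output size up (ceiling division).
    return -(-(size + 2 * pad - kernel) // stride) + 1


# (name, size function, kernel, stride, pad), copied from the prototxt.
LAYERS = [
    ("conv1", conv_out, 7, 2, 0),
    ("pool1", pool_out, 3, 2, 0),
    ("conv2", conv_out, 5, 2, 1),
    ("pool2", pool_out, 3, 2, 0),
    ("conv3", conv_out, 3, 1, 1),
    ("conv4", conv_out, 3, 1, 1),
    ("conv5", conv_out, 3, 1, 1),
    ("pool5", pool_out, 3, 2, 0),
]


def trace_spatial_sizes(input_size=224):
    """Return {layer name: output height/width} for a square input."""
    sizes = {}
    size = input_size
    for name, fn, kernel, stride, pad in LAYERS:
        size = fn(size, kernel, stride, pad)
        sizes[name] = size
    return sizes


sizes = trace_spatial_sizes(224)
# fc6 sees the flattened pool5 output: 512 channels times height times width.
fc6_inputs = 512 * sizes["pool5"] ** 2
```

Under these assumptions `pool5` comes out at 6x6, so `fc6` receives 512 x 6 x 6 = 18432 inputs.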