Skip to content

Instantly share code, notes, and snippets.

@mavenlin
Last active November 20, 2024 12:20
Show Gist options
  • Save mavenlin/e56253735ef32c3c296d to your computer and use it in GitHub Desktop.
Save mavenlin/e56253735ef32c3c296d to your computer and use it in GitHub Desktop.
Network in Network CIFAR10

Info

name: Network in Network CIFAR10 Model

caffemodel: cifar10_nin.caffemodel

caffemodel_url: https://www.dropbox.com/s/blrajqirr1p31v0/cifar10_nin.caffemodel?dl=1

license: BSD

sha1: 8e89c8fcd46e02780e16c867a5308e7bb7af0803

caffe_commit: c69b3b49084b503e23b95dc387329975245949c2

gist_id: e56253735ef32c3c296d

Descriptions

This model is a 3 layer Network in Network model trained on CIFAR10 dataset.

The performance of this model on validation set is 89.6% The detailed descriptions are in the paper Network in Network

The preprocessed CIFAR10 data is downloadable in lmdb format here:

License

The data used to train this model comes from http://www.cs.toronto.edu/~kriz/cifar.html Please follow the license there if used.

net: "train_test.prototxt"
test_iter: 100
test_interval: 500
base_lr: 0.1
momentum: 0.9
weight_decay: 0.0001
lr_policy: "step"
gamma: 0.1
stepsize: 100000
display: 100
max_iter: 120000
snapshot: 10000
snapshot_prefix: "cifar10_nin"
solver_mode: GPU
name: "CIFAR10_full"
layers {
name: "cifar"
type: DATA
top: "data"
top: "label"
data_param {
source: "cifar-train-leveldb"
batch_size: 128
}
include: { phase: TRAIN }
}
layers {
name: "cifar"
type: DATA
top: "data"
top: "label"
data_param {
source: "cifar-test-leveldb"
batch_size: 100
}
include: { phase: TEST }
}
layers {
name: "conv1"
type: CONVOLUTION
bottom: "data"
top: "conv1"
blobs_lr: 1
blobs_lr: 2
weight_decay: 1.
weight_decay: 0.
convolution_param {
num_output: 192
pad: 2
kernel_size: 5
weight_filler {
type: "gaussian"
std: 0.05
}
bias_filler {
type: "constant"
}
}
}
layers {
name: "relu1"
type: RELU
bottom: "conv1"
top: "conv1"
}
layers {
name: "cccp1"
type: CONVOLUTION
bottom: "conv1"
top: "cccp1"
blobs_lr: 1
blobs_lr: 2
weight_decay: 1
weight_decay: 0
convolution_param {
num_output: 160
group: 1
kernel_size: 1
weight_filler {
type: "gaussian"
std: 0.05
}
bias_filler {
type: "constant"
value: 0
}
}
}
layers {
name: "relu_cccp1"
type: RELU
bottom: "cccp1"
top: "cccp1"
}
layers {
name: "cccp2"
type: CONVOLUTION
bottom: "cccp1"
top: "cccp2"
blobs_lr: 1
blobs_lr: 2
weight_decay: 1
weight_decay: 0
convolution_param {
num_output: 96
group: 1
kernel_size: 1
weight_filler {
type: "gaussian"
std: 0.05
}
bias_filler {
type: "constant"
value: 0
}
}
}
layers {
name: "relu_cccp2"
type: RELU
bottom: "cccp2"
top: "cccp2"
}
layers {
name: "pool1"
type: POOLING
bottom: "cccp2"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layers {
name: "drop3"
type: DROPOUT
bottom: "pool1"
top: "pool1"
dropout_param {
dropout_ratio: 0.5
}
}
layers {
name: "conv2"
type: CONVOLUTION
bottom: "pool1"
top: "conv2"
blobs_lr: 1
blobs_lr: 2
weight_decay: 1.
weight_decay: 0.
convolution_param {
num_output: 192
pad: 2
kernel_size: 5
weight_filler {
type: "gaussian"
std: 0.05
}
bias_filler {
type: "constant"
}
}
}
layers {
name: "relu2"
type: RELU
bottom: "conv2"
top: "conv2"
}
layers {
name: "cccp3"
type: CONVOLUTION
bottom: "conv2"
top: "cccp3"
blobs_lr: 1
blobs_lr: 2
weight_decay: 1
weight_decay: 0
convolution_param {
num_output: 192
group: 1
kernel_size: 1
weight_filler {
type: "gaussian"
std: 0.05
}
bias_filler {
type: "constant"
value: 0
}
}
}
layers {
name: "relu_cccp3"
type: RELU
bottom: "cccp3"
top: "cccp3"
}
layers {
name: "cccp4"
type: CONVOLUTION
bottom: "cccp3"
top: "cccp4"
blobs_lr: 1
blobs_lr: 2
weight_decay: 1
weight_decay: 0
convolution_param {
num_output: 192
group: 1
kernel_size: 1
weight_filler {
type: "gaussian"
std: 0.05
}
bias_filler {
type: "constant"
value: 0
}
}
}
layers {
name: "relu_cccp4"
type: RELU
bottom: "cccp4"
top: "cccp4"
}
layers {
name: "pool2"
type: POOLING
bottom: "cccp4"
top: "pool2"
pooling_param {
pool: AVE
kernel_size: 3
stride: 2
}
}
layers {
name: "drop6"
type: DROPOUT
bottom: "pool2"
top: "pool2"
dropout_param {
dropout_ratio: 0.5
}
}
layers {
name: "conv3"
type: CONVOLUTION
bottom: "pool2"
top: "conv3"
blobs_lr: 1.
blobs_lr: 2.
weight_decay: 1.
weight_decay: 0.
convolution_param {
num_output: 192
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.05
}
bias_filler {
type: "constant"
}
}
}
layers {
name: "relu3"
type: RELU
bottom: "conv3"
top: "conv3"
}
layers {
name: "cccp5"
type: CONVOLUTION
bottom: "conv3"
top: "cccp5"
blobs_lr: 1
blobs_lr: 2
weight_decay: 1
weight_decay: 0
convolution_param {
num_output: 192
group: 1
kernel_size: 1
weight_filler {
type: "gaussian"
std: 0.05
}
bias_filler {
type: "constant"
value: 0
}
}
}
layers {
name: "relu_cccp5"
type: RELU
bottom: "cccp5"
top: "cccp5"
}
layers {
name: "cccp6"
type: CONVOLUTION
bottom: "cccp5"
top: "cccp6"
blobs_lr: 0.1
blobs_lr: 0.1
weight_decay: 1
weight_decay: 0
convolution_param {
num_output: 10
group: 1
kernel_size: 1
weight_filler {
type: "gaussian"
std: 0.05
}
bias_filler {
type: "constant"
value: 0
}
}
}
layers {
name: "relu_cccp6"
type: RELU
bottom: "cccp6"
top: "cccp6"
}
layers {
name: "pool3"
type: POOLING
bottom: "cccp6"
top: "pool3"
pooling_param {
pool: AVE
kernel_size: 8
stride: 1
}
}
layers {
name: "accuracy"
type: ACCURACY
bottom: "pool3"
bottom: "label"
top: "accuracy"
include: { phase: TEST }
}
layers {
name: "loss"
type: SOFTMAX_LOSS
bottom: "pool3"
bottom: "label"
top: "loss"
}
@mollahosseini
Copy link

@kgl-prml, we followed the paper outline and did global contrast normalization and ZCA whitening and we were able to reproduce %10.4 error rate on cifar-10

@mollahosseini
Copy link

Has anybody been able to reproduce the paper results on cifar-100? We could get the paper result on cifar-10 (%10 error rate) but the same network doesn't converge to less than %75 error rate on cifar-100! Do you have any suggestion?

@rockstone533
Copy link

Hi, @hengck23. I try to reproduce NIN with caffe and get nearly 88% accuracy follow your advice.Now, I encounter some questions.
I want to use the network architecture to test a new image in python,
1.How should I implement GCN? I guess subtract mean and divide std, is it right?
2.Is the dropout layer needed to test a new image?
By the way, do you know how to visualize the feature map as shown in the paper? The patch size seems smaller than the input.

@tingtinglu
Copy link

I also want to implement this paper (NIN). But i am new in caffe and deep learning. I download the pre-trained model and the cifar-test-leveldb file from the website of the author(https://gist.github.com/mavenlin/e56253735ef32c3c296d), but my caffe cannot read the cifar-test-leveldb. What is wrong with this? and how can implement this experiment easily? thanks so much!

@Perseus14
Copy link

@mavenlin Can you upload the deploy.prototxt or show how to convert train_val.prototxt to deploy.prototxt as I don't know the input params. I want to take the model weights and biases without downloading the entire dataset.

@happyzhouch
Copy link

@hengck23
Hello,I also meet the problem that I can't read the downloaded data,I see you change your leveldb.Can you tell me how to change it ? I don't understand it.
I have to change my leveldb version to this "https://github.com/maxd/leveldbwin".

@happyzhouch
Copy link

@hengck23
How to use the leveldbwin? I can't understand it.I am a fresher in caffe and hope to get the solution from you sincerely.

@ectg
Copy link

ectg commented Feb 13, 2017

@Perseus14, did you find the deploy.prototxt ?

@sayadyaghoobi
Copy link

i want to finetune the nin model just for 3 classes, but i dont know what layer must be changed? i tried to rename cccp6 but it didnt work. anyone has idea about it? please share

@sayadyaghoobi
Copy link

i'm trying to train nin with my data, and i don't want to use pretrained weights, instead i'm going to try with random weights and use the nin network, my data consists of 3 classes and i just tried to to change the output of cccp6 from 10 to 3. when i run it i got wrong constant accuracy 0.3368 in every testing process. so if anyone has idea to help me with this error?
i did'nt change anything except the cccp6' output 10 to 3. thnx very much

@6eanut
Copy link

6eanut commented May 22, 2024

Does anyone have a deploy.prototxt for this model?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment