CaffeNet fine-tuned on the Oxford 102 category flower dataset

name: CaffeNet fine-tuned on the Oxford 102 category flower dataset
caffemodel: oxford102.caffemodel
caffemodel_url: https://s3.amazonaws.com/jgoode/oxford102.caffemodel
gist_id: 0179e52305ca768a601f
license: BSD-3

See https://github.com/jimgoo/caffe-oxford102 for full code.

The CNN is a BVLC reference CaffeNet fine-tuned for the Oxford 102 category flower dataset. The number of outputs in the final inner product layer has been set to 102 to reflect the number of flower categories. Hyperparameter choices follow those in the Caffe example "Fine-tuning CaffeNet for Style Recognition on Flickr Style Data": the global learning rate is reduced, while the learning rate for the final fully connected layer is increased relative to the other layers.
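For reference, a minimal pycaffe sketch of how a fine-tuning run like this can be launched (the CaffeNet weights path is an assumption; the equivalent "caffe train -solver ... -weights ..." CLI does the same thing):

# Sketch: fine-tune CaffeNet on Oxford 102 via pycaffe.
import caffe

caffe.set_mode_gpu()
solver = caffe.SGDSolver('models/oxford102/solver.prototxt')
# Layers whose names match CaffeNet's are initialized from the pretrained
# weights; fc8_oxford_102 is new, so it keeps its random initialization.
solver.net.copy_from('models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel')
solver.solve()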

The split file (setid.mat) lists 6,149 images in the test set and 1,020 images in the training set. We have instead trained this model on the larger set of 6,149 images and tested against the smaller set of 1,020 images.
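A minimal sketch of inspecting the split file with SciPy (the trnid/valid/tstid field names follow the dataset's setid.mat layout, to the best of my knowledge):

# Sketch: inspect the Oxford 102 split file.
from scipy.io import loadmat

setid = loadmat('setid.mat')
train_ids = setid['trnid'].ravel()  # 1,020 images (used here for testing)
valid_ids = setid['valid'].ravel()  # 1,020 validation images
test_ids = setid['tstid'].ravel()   # 6,149 images (used here for training)
print(len(train_ids), len(valid_ids), len(test_ids))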

After 50,000 iterations, the top-1 error is about 7% on the test set of 1,020 images:

I0215 15:28:06.417726  6585 solver.cpp:246] Iteration 50000, loss = 0.000120038
I0215 15:28:06.417789  6585 solver.cpp:264] Iteration 50000, Testing net (#0)
I0215 15:28:30.834987  6585 solver.cpp:315]     Test net output #0: accuracy = 0.9326
I0215 15:28:30.835072  6585 solver.cpp:251] Optimization Done.
I0215 15:28:30.835083  6585 caffe.cpp:121] Optimization Done.

Note that this uses the mean file for ILSVRC 2012 instead of the mean for the actual Oxford dataset.
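A minimal sketch of loading that mean file through pycaffe (assuming the standard data/ilsvrc12 path from the Caffe repo):

# Sketch: load the ILSVRC 2012 mean image from its binaryproto file.
import caffe
from caffe.proto import caffe_pb2

blob = caffe_pb2.BlobProto()
with open('data/ilsvrc12/imagenet_mean.binaryproto', 'rb') as f:
    blob.ParseFromString(f.read())
mean = caffe.io.blobproto_to_array(blob)[0]  # (3, 256, 256), BGR channel order
print(mean.mean(axis=(1, 2)))  # per-channel means, roughly (104, 117, 123)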

net: "models/oxford102/train_val.prototxt"
test_iter: 100
test_interval: 1000
# lr for fine-tuning should be lower than when starting from scratch
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
# stepsize should also be lower, as we're closer to being done
stepsize: 20000
display: 20
max_iter: 50000
momentum: 0.9
weight_decay: 0.0005
snapshot: 10000
snapshot_prefix: "models/oxford102/oxford102"
# solver mode: CPU or GPU
solver_mode: GPU
name: "Oxford102CaffeNet"
layers {
  name: "data"
  type: IMAGE_DATA
  top: "data"
  top: "label"
  image_data_param {
    # TRAIN reads test.txt: the larger 6,149-image split is used for training (see note above).
    source: "data/oxford102/test.txt"
    batch_size: 50
    new_height: 256
    new_width: 256
  }
  transform_param {
    crop_size: 227
    mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
    mirror: true
  }
  include: { phase: TRAIN }
}
layers {
  name: "data"
  type: IMAGE_DATA
  top: "data"
  top: "label"
  image_data_param {
    # TEST reads train.txt: the smaller 1,020-image split is used for testing.
    source: "data/oxford102/train.txt"
    batch_size: 50
    new_height: 256
    new_width: 256
  }
  transform_param {
    crop_size: 227
    mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
    mirror: false
  }
  include: { phase: TEST }
}
layers {
  name: "conv1"
  type: CONVOLUTION
  bottom: "data"
  top: "conv1"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layers {
  name: "relu1"
  type: RELU
  bottom: "conv1"
  top: "conv1"
}
layers {
  name: "pool1"
  type: POOLING
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layers {
  name: "norm1"
  type: LRN
  bottom: "pool1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layers {
  name: "conv2"
  type: CONVOLUTION
  bottom: "norm1"
  top: "conv2"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layers {
  name: "relu2"
  type: RELU
  bottom: "conv2"
  top: "conv2"
}
layers {
  name: "pool2"
  type: POOLING
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layers {
  name: "norm2"
  type: LRN
  bottom: "pool2"
  top: "norm2"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layers {
  name: "conv3"
  type: CONVOLUTION
  bottom: "norm2"
  top: "conv3"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layers {
  name: "relu3"
  type: RELU
  bottom: "conv3"
  top: "conv3"
}
layers {
  name: "conv4"
  type: CONVOLUTION
  bottom: "conv3"
  top: "conv4"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layers {
  name: "relu4"
  type: RELU
  bottom: "conv4"
  top: "conv4"
}
layers {
  name: "conv5"
  type: CONVOLUTION
  bottom: "conv4"
  top: "conv5"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layers {
  name: "relu5"
  type: RELU
  bottom: "conv5"
  top: "conv5"
}
layers {
  name: "pool5"
  type: POOLING
  bottom: "conv5"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layers {
  name: "fc6"
  type: INNER_PRODUCT
  bottom: "pool5"
  top: "fc6"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layers {
  name: "relu6"
  type: RELU
  bottom: "fc6"
  top: "fc6"
}
layers {
  name: "drop6"
  type: DROPOUT
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layers {
  name: "fc7"
  type: INNER_PRODUCT
  bottom: "fc6"
  top: "fc7"
  # Note that blobs_lr can be set to 0 to disable fine-tuning of this, or any other, layer.
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layers {
  name: "relu7"
  type: RELU
  bottom: "fc7"
  top: "fc7"
}
layers {
  name: "drop7"
  type: DROPOUT
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layers {
  name: "fc8_oxford_102"
  type: INNER_PRODUCT
  bottom: "fc7"
  top: "fc8_oxford_102"
  # blobs_lr is set higher than for other layers because this layer starts from
  # random weights while the others are already trained.
  blobs_lr: 10
  blobs_lr: 20
  weight_decay: 1
  weight_decay: 0
  inner_product_param {
    num_output: 102
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layers {
  name: "loss"
  type: SOFTMAX_LOSS
  bottom: "fc8_oxford_102"
  bottom: "label"
}
layers {
  name: "accuracy"
  type: ACCURACY
  bottom: "fc8_oxford_102"
  bottom: "label"
  top: "accuracy"
  include: { phase: TEST }
}
@Darwin2011

I used your caffemodel to make predictions, but afterwards I don't know how to get the classification names for the results.
Could you help me with that?

Best Regards

@jimgoo
Author

jimgoo commented Apr 20, 2015

I have the same question myself, as I couldn't find the mappings between integer categories and actual flower names anywhere on the website (http://www.robots.ox.ac.uk/~vgg/data/flowers/). I did a quick sanity check by counting the number of integer labels for a given class and comparing that number to the category counts shown here: http://www.robots.ox.ac.uk/~vgg/data/flowers/102/categories.html. This, however, is not an effective strategy for all categories because some have the same counts (for instance, corn poppy, globe-flower, and grape hyacinth all have 41 images, so you can't tell which integer ID with 41 occurrences is which).
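The check looks something like this (assuming imagelabels.mat stores its 1-indexed class labels in a 'labels' field):

# Sketch of the count-based sanity check.
from scipy.io import loadmat
import numpy as np

labels = loadmat('imagelabels.mat')['labels'].ravel()
counts = np.bincount(labels)[1:]  # counts[i - 1] = number of images in class i
for cls, n in enumerate(counts, start=1):
    print(cls, n)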

I suppose the authors could provide it if emailed. If you are able to get this info, please let me know.

@m-co

m-co commented Jul 6, 2015

In case anyone else was interested, I inferred the classes by matching the thumbnail image names on this page against the labelled image set (imagelabels.mat). This list should match the model output:

['pink primrose', 'hard-leaved pocket orchid', 'canterbury bells', 'sweet pea', 'english marigold', 'tiger lily', 'moon orchid', 'bird of paradise', 'monkshood', 'globe thistle', 'snapdragon', "colt's foot", 'king protea', 'spear thistle', 'yellow iris', 'globe-flower', 'purple coneflower', 'peruvian lily', 'balloon flower', 'giant white arum lily', 'fire lily', 'pincushion flower', 'fritillary', 'red ginger', 'grape hyacinth', 'corn poppy', 'prince of wales feathers', 'stemless gentian', 'artichoke', 'sweet william', 'carnation', 'garden phlox', 'love in the mist', 'mexican aster', 'alpine sea holly', 'ruby-lipped cattleya', 'cape flower', 'great masterwort', 'siam tulip', 'lenten rose', 'barbeton daisy', 'daffodil', 'sword lily', 'poinsettia', 'bolero deep blue', 'wallflower', 'marigold', 'buttercup', 'oxeye daisy', 'common dandelion', 'petunia', 'wild pansy', 'primula', 'sunflower', 'pelargonium', 'bishop of llandaff', 'gaura', 'geranium', 'orange dahlia', 'pink-yellow dahlia?', 'cautleya spicata', 'japanese anemone', 'black-eyed susan', 'silverbush', 'californian poppy', 'osteospermum', 'spring crocus', 'bearded iris', 'windflower', 'tree poppy', 'gazania', 'azalea', 'water lily', 'rose', 'thorn apple', 'morning glory', 'passion flower', 'lotus', 'toad lily', 'anthurium', 'frangipani', 'clematis', 'hibiscus', 'columbine', 'desert-rose', 'tree mallow', 'magnolia', 'cyclamen ', 'watercress', 'canna lily', 'hippeastrum ', 'bee balm', 'ball moss', 'foxglove', 'bougainvillea', 'camellia', 'mallow', 'mexican petunia', 'bromelia', 'blanket flower', 'trumpet creeper', 'blackberry lily']
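A minimal sketch of mapping a prediction onto this list (the deploy file and image path are placeholders, and mean subtraction is omitted for brevity):

# Sketch: classify an image and look up the flower name.
import caffe

flower_names = ['pink primrose', 'hard-leaved pocket orchid', ...]  # the full 102-entry list above

net = caffe.Classifier('deploy.prototxt', 'oxford102.caffemodel',
                       image_dims=(256, 256), raw_scale=255,
                       channel_swap=(2, 1, 0))
probs = net.predict([caffe.io.load_image('flower.jpg')])[0]
print(flower_names[int(probs.argmax())])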

@elmorg

elmorg commented Jul 7, 2015

I'm trying to use this with deepdream; however, the model specification requires a 'deploy.prototxt' file. How do I generate this? Thanks.

@crypdick

crypdick commented Jul 7, 2015

I'm also here wondering how to use the net with Google's deepdream

@m-co

m-co commented Jul 7, 2015

This deploy.prototxt should work: http://pastie.org/10278385

Note you'll need to change the end layer parameter in deepdream to one of the layer names in this model, e.g. pool5, conv4, conv5, etc.

It seems to give somewhat less defined results with the default settings, presumably because the code is optimised for the GoogLeNet model rather than the AlexNet-derived model this one is based on.
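Loading the model for deepdream looks roughly like this (paths are placeholders; patching force_backward follows the deepdream notebook's approach):

# Sketch: load the fine-tuned net the way the deepdream notebook does.
import caffe
from caffe.proto import caffe_pb2
from google.protobuf import text_format

model_def = caffe_pb2.NetParameter()
text_format.Merge(open('deploy.prototxt').read(), model_def)
model_def.force_backward = True  # deepdream needs gradients w.r.t. the input
open('tmp.prototxt', 'w').write(str(model_def))

net = caffe.Net('tmp.prototxt', 'oxford102.caffemodel', caffe.TEST)
print(list(net.blobs.keys()))  # candidate 'end' layers: conv4, conv5, pool5, ...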

@drewlustro

thank you @m-co 💯

@jimgoo
Author

jimgoo commented Aug 4, 2015

@m-co Awesome, thanks!

@linnanwang

Hello,

I tried to train the network from scratch. However, the network cannot reach 80% accuracy as the pretrained model does, even after 100,000 iterations. Do you have any suggestions for how I can match the accuracy of the pretrained model? Thank you.

@ProGamerGov

Has anyone tested this model for use in neural-style: https://github.com/jcjohnson/neural-style yet? Would it work?

@masaff

masaff commented Nov 10, 2016

Hi,
I'm new to deep learning. I want to fine-tune a large number of layers. How should I do that? Should I change the lr_mult in those layers? Should I change them to numbers larger than 1? What about decay_mult? Why are all of the decay_mult values equal to zero? I just changed the last layer's name and the number of outputs, and also set lr_mult to 10. I'm so confused, can you please help me?

@eventsbeyondme

The link to the deploy.prototxt appears to be broken, any chance of another?

@lynnwilliam

The license says non-commercial?
Who can I contact about a commercial license for this?

@jimgoo
Author

jimgoo commented Jul 15, 2022

@eventsbeyondme For the deploy.prototxt files, I updated the link to the full repo: https://github.com/jimgoo/caffe-oxford102. There you'll find the two files for AlexNet and VGG:

https://github.com/jimgoo/caffe-oxford102/blob/master/AlexNet/deploy.prototxt
https://github.com/jimgoo/caffe-oxford102/blob/master/VGG_S/deploy.prototxt

@lynnwilliam I've changed the license to BSD-3.

@lynnwilliam

lynnwilliam commented Oct 11, 2022 via email
