@kevinlin311tw
Last active September 12, 2018 06:15
Deep Learning of Binary Hash Codes CIFAR10

## Deep Learning of Binary Hash Codes for Fast Image Retrieval

name: Binary Hash Codes CIFAR10
caffemodel: KevinNet_CIFAR10_48.caffemodel
caffemodel_url: https://www.dropbox.com/s/1om7xa8mz93wkzh/KevinNet_CIFAR10_48.caffemodel?d
gist_id: 266d4150a1db5810398e

## Description

This model is a replication of the model described in the paper: http://www.iis.sinica.edu.tw/~kevinlin311.tw/cvprw15.pdf
The model is the iteration-50,000 snapshot trained on CIFAR-10.
The latent layer has 48 neurons, in order to learn 48-bit binary hash codes.

## License

The data used to train this model comes from the ImageNet project, which distributes its database to researchers who agree to the following term of access:
"Researcher shall use the Database only for non-commercial research and educational purposes."
Accordingly, this model is distributed under a non-commercial license.

## GitHub

Download our model and source code here: https://github.com/kevinlin311tw/caffe-cvprw15
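Retrieval codes come from thresholding the 48 sigmoid activations of the latent layer (`fc8_kevin_encode` in the prototxt below) at 0.5, following the paper. A minimal NumPy sketch of that binarization step, using made-up activations in place of real network outputs:

```python
import numpy as np

def binarize(activations, threshold=0.5):
    """Threshold sigmoid activations (values in [0, 1]) into a binary hash code."""
    return (np.asarray(activations) >= threshold).astype(np.uint8)

# Hypothetical stand-in for the 48-dim output of the fc8_kevin_encode layer.
act = np.linspace(0.0, 1.0, 48)
code = binarize(act)
print(code.shape, int(code.sum()))  # (48,) 24
```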

# Deploy network definition (old Caffe V1 "layers" prototxt syntax)
name: "KevinNet_CIFAR10"
input: "data"
input_dim: 10
input_dim: 3
input_dim: 227
input_dim: 227
layers {
name: "conv1"
type: CONVOLUTION
bottom: "data"
top: "conv1"
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
}
}
layers {
name: "relu1"
type: RELU
bottom: "conv1"
top: "conv1"
}
layers {
name: "pool1"
type: POOLING
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layers {
name: "norm1"
type: LRN
bottom: "pool1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layers {
name: "conv2"
type: CONVOLUTION
bottom: "norm1"
top: "conv2"
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
}
}
layers {
name: "relu2"
type: RELU
bottom: "conv2"
top: "conv2"
}
layers {
name: "pool2"
type: POOLING
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layers {
name: "norm2"
type: LRN
bottom: "pool2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layers {
name: "conv3"
type: CONVOLUTION
bottom: "norm2"
top: "conv3"
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
}
}
layers {
name: "relu3"
type: RELU
bottom: "conv3"
top: "conv3"
}
layers {
name: "conv4"
type: CONVOLUTION
bottom: "conv3"
top: "conv4"
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
}
}
layers {
name: "relu4"
type: RELU
bottom: "conv4"
top: "conv4"
}
layers {
name: "conv5"
type: CONVOLUTION
bottom: "conv4"
top: "conv5"
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
}
}
layers {
name: "relu5"
type: RELU
bottom: "conv5"
top: "conv5"
}
layers {
name: "pool5"
type: POOLING
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layers {
name: "fc6"
type: INNER_PRODUCT
bottom: "pool5"
top: "fc6"
inner_product_param {
num_output: 4096
}
}
layers {
name: "relu6"
type: RELU
bottom: "fc6"
top: "fc6"
}
layers {
name: "drop6"
type: DROPOUT
bottom: "fc6"
top: "fc6"
dropout_param {
dropout_ratio: 0.5
}
}
layers {
name: "fc7"
type: INNER_PRODUCT
bottom: "fc6"
top: "fc7"
inner_product_param {
num_output: 4096
}
}
layers {
name: "relu7"
type: RELU
bottom: "fc7"
top: "fc7"
}
layers {
name: "drop7"
type: DROPOUT
bottom: "fc7"
top: "fc7"
dropout_param {
dropout_ratio: 0.5
}
}
layers {
name: "fc8_kevin"
type: INNER_PRODUCT
bottom: "fc7"
top: "fc8_kevin"
inner_product_param {
num_output: 48
}
}
layers {
name: "fc8_kevin_encode"
bottom: "fc8_kevin"
top: "fc8_kevin_encode"
type: SIGMOID
}
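At retrieval time, the binarized `fc8_kevin_encode` outputs are compared to the database by Hamming distance and the nearest codes are returned first. A self-contained sketch with random stand-in codes (not actual model outputs):

```python
import numpy as np

def hamming_dist(a, b):
    """Number of differing bits between two equal-length binary codes."""
    return int(np.count_nonzero(np.asarray(a) != np.asarray(b)))

rng = np.random.default_rng(0)
db = rng.integers(0, 2, size=(1000, 48), dtype=np.uint8)  # stand-in database codes
query = db[42].copy()                                     # query identical to entry 42

dists = np.count_nonzero(db != query, axis=1)             # Hamming distance to every entry
ranking = np.argsort(dists, kind="stable")                # nearest entries first
print(int(dists[ranking[0]]))                             # the best match is at distance 0
```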
# Solver configuration
train_net: "train_CIFAR10_48.prototxt"
test_net: "test_CIFAR10_48.prototxt"
test_iter: 100
test_interval: 100
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 25000
display: 20
max_iter: 50000
momentum: 0.9
weight_decay: 0.0005
snapshot: 2000
snapshot_prefix: "KevinNet_CIFAR10_48"
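With lr_policy "step", gamma 0.1, and stepsize 25000, the learning rate drops by a factor of 10 exactly once over the 50,000-iteration run. Caffe's step policy computes base_lr * gamma^floor(iter / stepsize), which can be sketched as:

```python
def step_lr(iteration, base_lr=0.001, gamma=0.1, stepsize=25000):
    """Caffe "step" policy: base_lr * gamma ** floor(iteration / stepsize)."""
    return base_lr * gamma ** (iteration // stepsize)

print(step_lr(0))      # 0.001 for iterations 0..24999
print(step_lr(25000))  # drops to ~0.0001 for the rest of training
```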
# test_CIFAR10_48.prototxt: validation network (pre-V1 Caffe syntax)
name: "KevinNet_CIFAR10"
layers {
layer {
name: "data"
type: "data"
source: "../../cifar10_val_leveldb"
meanfile: "/home/iis/deep/rcnn_packages/caffe-new/data/ilsvrc12/imagenet_mean.binaryproto"
batchsize: 32
cropsize: 227
mirror: true
det_context_pad: 16
det_crop_mode: "warp"
det_fg_threshold: 0.5
det_bg_threshold: 0.5
det_fg_fraction: 0.25
}
top: "data"
top: "label"
}
layers {
layer {
name: "conv1"
type: "conv"
num_output: 96
kernelsize: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.
}
blobs_lr: 1.
blobs_lr: 2.
weight_decay: 1.
weight_decay: 0.
}
bottom: "data"
top: "conv1"
}
layers {
layer {
name: "relu1"
type: "relu"
}
bottom: "conv1"
top: "conv1"
}
layers {
layer {
name: "pool1"
type: "pool"
pool: MAX
kernelsize: 3
stride: 2
}
bottom: "conv1"
top: "pool1"
}
layers {
layer {
name: "norm1"
type: "lrn"
local_size: 5
alpha: 0.0001
beta: 0.75
}
bottom: "pool1"
top: "norm1"
}
layers {
layer {
name: "pad2"
type: "padding"
pad: 2
}
bottom: "norm1"
top: "pad2"
}
layers {
layer {
name: "conv2"
type: "conv"
num_output: 256
group: 2
kernelsize: 5
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1.
}
blobs_lr: 1.
blobs_lr: 2.
weight_decay: 1.
weight_decay: 0.
}
bottom: "pad2"
top: "conv2"
}
layers {
layer {
name: "relu2"
type: "relu"
}
bottom: "conv2"
top: "conv2"
}
layers {
layer {
name: "pool2"
type: "pool"
pool: MAX
kernelsize: 3
stride: 2
}
bottom: "conv2"
top: "pool2"
}
layers {
layer {
name: "norm2"
type: "lrn"
local_size: 5
alpha: 0.0001
beta: 0.75
}
bottom: "pool2"
top: "norm2"
}
layers {
layer {
name: "pad3"
type: "padding"
pad: 1
}
bottom: "norm2"
top: "pad3"
}
layers {
layer {
name: "conv3"
type: "conv"
num_output: 384
kernelsize: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.
}
blobs_lr: 1.
blobs_lr: 2.
weight_decay: 1.
weight_decay: 0.
}
bottom: "pad3"
top: "conv3"
}
layers {
layer {
name: "relu3"
type: "relu"
}
bottom: "conv3"
top: "conv3"
}
layers {
layer {
name: "pad4"
type: "padding"
pad: 1
}
bottom: "conv3"
top: "pad4"
}
layers {
layer {
name: "conv4"
type: "conv"
num_output: 384
group: 2
kernelsize: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1.
}
blobs_lr: 1.
blobs_lr: 2.
weight_decay: 1.
weight_decay: 0.
}
bottom: "pad4"
top: "conv4"
}
layers {
layer {
name: "relu4"
type: "relu"
}
bottom: "conv4"
top: "conv4"
}
layers {
layer {
name: "pad5"
type: "padding"
pad: 1
}
bottom: "conv4"
top: "pad5"
}
layers {
layer {
name: "conv5"
type: "conv"
num_output: 256
group: 2
kernelsize: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1.
}
blobs_lr: 1.
blobs_lr: 2.
weight_decay: 1.
weight_decay: 0.
}
bottom: "pad5"
top: "conv5"
}
layers {
layer {
name: "relu5"
type: "relu"
}
bottom: "conv5"
top: "conv5"
}
layers {
layer {
name: "pool5"
type: "pool"
kernelsize: 3
pool: MAX
stride: 2
}
bottom: "conv5"
top: "pool5"
}
layers {
layer {
name: "fc6"
type: "innerproduct"
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1.
}
blobs_lr: 1.
blobs_lr: 2.
weight_decay: 1.
weight_decay: 0.
}
bottom: "pool5"
top: "fc6"
}
layers {
layer {
name: "relu6"
type: "relu"
}
bottom: "fc6"
top: "fc6"
}
layers {
layer {
name: "drop6"
type: "dropout"
dropout_ratio: 0.5
}
bottom: "fc6"
top: "fc6"
}
layers {
layer {
name: "fc7"
type: "innerproduct"
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1.
}
blobs_lr: 1.
blobs_lr: 2.
weight_decay: 1.
weight_decay: 0.
}
bottom: "fc6"
top: "fc7"
}
layers {
layer {
name: "relu7"
type: "relu"
}
bottom: "fc7"
top: "fc7"
}
layers {
layer {
name: "drop7"
type: "dropout"
dropout_ratio: 0.5
}
bottom: "fc7"
top: "fc7"
}
layers {
layer {
name: "fc8_kevin"
type: "innerproduct"
num_output: 48
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1.
}
blobs_lr: 1.
blobs_lr: 2.
weight_decay: 1.
weight_decay: 0.
}
bottom: "fc7"
top: "fc8_kevin"
}
layers {
layer {
name: "fc8_kevin_encode"
type: "sigmoid"
}
bottom: "fc8_kevin"
top: "fc8_kevin_encode"
}
layers {
layer {
# We name this fc8_pascal so that the initialization
# network doesn't populate this layer with its fc8
name: "fc8_pascal"
type: "innerproduct"
num_output: 10
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
blobs_lr: 10.
blobs_lr: 20.
weight_decay: 1.
weight_decay: 0.
}
bottom: "fc8_kevin_encode"
top: "fc8_pascal"
}
layers {
layer {
name: "prob"
type: "softmax"
}
bottom: "fc8_pascal"
top: "prob"
}
layers {
layer {
name: "accuracy"
type: "accuracy"
}
bottom: "prob"
bottom: "label"
top: "accuracy"
}
# train_CIFAR10_48.prototxt: training network (pre-V1 Caffe syntax)
name: "KevinNet_CIFAR10"
layers {
layer {
name: "data"
type: "data"
source: "../../cifar10_train_leveldb"
meanfile: "/home/iis/deep/rcnn_packages/caffe-new/data/ilsvrc12/imagenet_mean.binaryproto"
batchsize: 32
cropsize: 227
mirror: true
det_context_pad: 16
det_crop_mode: "warp"
det_fg_threshold: 0.5
det_bg_threshold: 0.5
det_fg_fraction: 0.25
}
top: "data"
top: "label"
}
layers {
layer {
name: "conv1"
type: "conv"
num_output: 96
kernelsize: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.
}
blobs_lr: 1.
blobs_lr: 2.
weight_decay: 1.
weight_decay: 0.
}
bottom: "data"
top: "conv1"
}
layers {
layer {
name: "relu1"
type: "relu"
}
bottom: "conv1"
top: "conv1"
}
layers {
layer {
name: "pool1"
type: "pool"
pool: MAX
kernelsize: 3
stride: 2
}
bottom: "conv1"
top: "pool1"
}
layers {
layer {
name: "norm1"
type: "lrn"
local_size: 5
alpha: 0.0001
beta: 0.75
}
bottom: "pool1"
top: "norm1"
}
layers {
layer {
name: "pad2"
type: "padding"
pad: 2
}
bottom: "norm1"
top: "pad2"
}
layers {
layer {
name: "conv2"
type: "conv"
num_output: 256
group: 2
kernelsize: 5
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1.
}
blobs_lr: 1.
blobs_lr: 2.
weight_decay: 1.
weight_decay: 0.
}
bottom: "pad2"
top: "conv2"
}
layers {
layer {
name: "relu2"
type: "relu"
}
bottom: "conv2"
top: "conv2"
}
layers {
layer {
name: "pool2"
type: "pool"
pool: MAX
kernelsize: 3
stride: 2
}
bottom: "conv2"
top: "pool2"
}
layers {
layer {
name: "norm2"
type: "lrn"
local_size: 5
alpha: 0.0001
beta: 0.75
}
bottom: "pool2"
top: "norm2"
}
layers {
layer {
name: "pad3"
type: "padding"
pad: 1
}
bottom: "norm2"
top: "pad3"
}
layers {
layer {
name: "conv3"
type: "conv"
num_output: 384
kernelsize: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.
}
blobs_lr: 1.
blobs_lr: 2.
weight_decay: 1.
weight_decay: 0.
}
bottom: "pad3"
top: "conv3"
}
layers {
layer {
name: "relu3"
type: "relu"
}
bottom: "conv3"
top: "conv3"
}
layers {
layer {
name: "pad4"
type: "padding"
pad: 1
}
bottom: "conv3"
top: "pad4"
}
layers {
layer {
name: "conv4"
type: "conv"
num_output: 384
group: 2
kernelsize: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1.
}
blobs_lr: 1.
blobs_lr: 2.
weight_decay: 1.
weight_decay: 0.
}
bottom: "pad4"
top: "conv4"
}
layers {
layer {
name: "relu4"
type: "relu"
}
bottom: "conv4"
top: "conv4"
}
layers {
layer {
name: "pad5"
type: "padding"
pad: 1
}
bottom: "conv4"
top: "pad5"
}
layers {
layer {
name: "conv5"
type: "conv"
num_output: 256
group: 2
kernelsize: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1.
}
blobs_lr: 1.
blobs_lr: 2.
weight_decay: 1.
weight_decay: 0.
}
bottom: "pad5"
top: "conv5"
}
layers {
layer {
name: "relu5"
type: "relu"
}
bottom: "conv5"
top: "conv5"
}
layers {
layer {
name: "pool5"
type: "pool"
kernelsize: 3
pool: MAX
stride: 2
}
bottom: "conv5"
top: "pool5"
}
layers {
layer {
name: "fc6"
type: "innerproduct"
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1.
}
blobs_lr: 1.
blobs_lr: 2.
weight_decay: 1.
weight_decay: 0.
}
bottom: "pool5"
top: "fc6"
}
layers {
layer {
name: "relu6"
type: "relu"
}
bottom: "fc6"
top: "fc6"
}
layers {
layer {
name: "drop6"
type: "dropout"
dropout_ratio: 0.5
}
bottom: "fc6"
top: "fc6"
}
layers {
layer {
name: "fc7"
type: "innerproduct"
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1.
}
blobs_lr: 1.
blobs_lr: 2.
weight_decay: 1.
weight_decay: 0.
}
bottom: "fc6"
top: "fc7"
}
layers {
layer {
name: "relu7"
type: "relu"
}
bottom: "fc7"
top: "fc7"
}
layers {
layer {
name: "drop7"
type: "dropout"
dropout_ratio: 0.5
}
bottom: "fc7"
top: "fc7"
}
layers {
layer {
name: "fc8_kevin"
type: "innerproduct"
num_output: 48
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1.
}
blobs_lr: 1.
blobs_lr: 2.
weight_decay: 1.
weight_decay: 0.
}
bottom: "fc7"
top: "fc8_kevin"
}
layers {
layer {
name: "fc8_kevin_encode"
type: "sigmoid"
}
bottom: "fc8_kevin"
top: "fc8_kevin_encode"
}
layers {
layer {
# We name this fc8_pascal so that the initialization
# network doesn't populate this layer with its fc8
name: "fc8_pascal"
type: "innerproduct"
num_output: 10
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
blobs_lr: 10.
blobs_lr: 20.
weight_decay: 1.
weight_decay: 0.
}
bottom: "fc8_kevin_encode"
top: "fc8_pascal"
}
layers {
layer {
name: "loss"
type: "softmax_loss"
}
bottom: "fc8_pascal"
bottom: "label"
}
@PeterPan1990

Hello Kevin, I have two questions about your KevinNet:

  1. Which Caffe version did you use for this work? (A screenshot of the prototxt syntax was attached here.) I have not seen this syntax before.
  2. How do you preprocess the CIFAR-10 data? AlexNet takes 227x227 input, but CIFAR-10 images are 32x32. Thanks a lot.

@kevinlin311tw (Author)

Sorry for the late reply.

  1. Caffe has had many updates. Since we started this work a year ago, it is hard to pin down the exact version we used. You can download our source code (with the matching Caffe) here: https://github.com/kevinlin311tw/caffe-cvprw15
  2. Images from all datasets are resized to 256x256 and then center-cropped to 227x227 as the network input.
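The preprocessing described above (resize to 256x256, then center-crop to 227x227) can be sketched as follows. The nearest-neighbor resize here is a stand-in for illustration; the actual pipeline presumably used OpenCV/PIL interpolation:

```python
import numpy as np

def resize_nearest(img, size=256):
    # Nearest-neighbor resize via index mapping (stand-in for real interpolation).
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]

def center_crop(img, size=227):
    h, w = img.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]

img = np.zeros((32, 32, 3), dtype=np.uint8)        # CIFAR-10 sized input
out = center_crop(resize_nearest(img, 256), 227)
print(out.shape)  # (227, 227, 3)
```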

@prathmeshrmadhu

Hello @kevinlin311tw,
In the create_imagenet.sh file you use the $DATA/train.txt path to convert the dataset into LevelDB format. I am not able to convert the images: when I looked at that path, there was no train.txt or val.txt file. Instead there were four files, data and labels for both train and test. Could you provide guidelines on how to convert the dataset from these four files, or provide the train.txt and val.txt files?

@Nperv

Nperv commented Aug 3, 2016

Hi Kevin, I think Peter's question is how you can perform transfer learning with a model pre-trained on larger images (e.g. 227x227) when the target images are much smaller (e.g. 32x32). Doesn't the image get distorted when upscaled that much? What do the convolutional feature maps look like if you plot them? Were you able to see what the image looks like after passing through the CNN?
