---
title: Feature extraction with Caffe C++ code.
description: Extract CaffeNet / AlexNet features using the Caffe utility.
category: example
include_in_docs: true
priority: 10
---
Extracting Features
===================
In this tutorial, we will extract features using a pre-trained model with the included C++ utility.
Note that we recommend using the Python interface for this task, as demonstrated in the [filter visualization example](http://nbviewer.ipython.org/github/BVLC/caffe/blob/master/examples/filter_visualization.ipynb).
Follow the instructions for [installing Caffe](../../installation.html), then run

    scripts/download_model_binary.py models/bvlc_reference_caffenet

from the Caffe root directory.
If you need detailed information about the tools below, please consult their source code, in which additional documentation is usually provided.
Select data to run on
---------------------
We'll make a temporary folder to store things in.

    mkdir examples/_temp
Generate a list of the files to process. We're going to use the images that ship with Caffe.

    find `pwd`/examples/images -type f -exec echo {} \; > examples/_temp/temp.txt
The `ImageDataLayer` we'll use expects a label after each filename, so let's add a 0 to the end of each line.

    sed "s/$/ 0/" examples/_temp/temp.txt > examples/_temp/file_list.txt
Define the Feature Extraction Network Architecture
--------------------------------------------------
In practice, subtracting the mean image from a dataset significantly improves classification accuracies.
Download the mean image of the ILSVRC dataset:

    ./data/ilsvrc12/get_ilsvrc_aux.sh
We will use `data/ilsvrc12/imagenet_mean.binaryproto` in the network definition prototxt.
Let's copy and modify the network definition. We'll be using the `ImageDataLayer`, which will load and resize images for us.

    cp examples/feature_extraction/imagenet_val.prototxt examples/_temp
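For orientation, the data layer at the top of the copied prototxt looks roughly like the sketch below (paraphrased from the stock example file, so field values may differ in your Caffe version). The parts to notice are the `source`, which points at the file list we just wrote, and the `mean_file` we just downloaded:

    layer {
      name: "data"
      type: "ImageData"
      top: "data"
      top: "label"
      transform_param {
        mirror: false
        crop_size: 227
        mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
      }
      image_data_param {
        source: "examples/_temp/file_list.txt"
        batch_size: 50
        new_height: 256
        new_width: 256
      }
    }

If your file list or mean file lives elsewhere, edit these paths accordingly.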
Extract Features
----------------
Now everything necessary is in place.

    ./build/tools/extract_features.bin models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel examples/_temp/imagenet_val.prototxt fc7 examples/_temp/features 10 lmdb
The name of the feature blob that we extract is `fc7`, which represents the highest-level feature of the reference model. We can use any other blob as well, such as `conv5` or `pool3`. The second-to-last parameter (`10`) is the number of data mini-batches to run, and the last parameter picks the output database format. Since we passed `lmdb`, the features are stored to the LMDB `examples/_temp/features`, ready for access by some other code.
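For reference, the tool's argument order is roughly the following (a paraphrase; the authoritative usage string lives in `tools/extract_features.cpp`):

    ./build/tools/extract_features.bin \
      pretrained_net_param  feature_extraction_proto_file \
      extract_feature_blob_name1[,name2,...]  save_feature_dataset_name1[,name2,...] \
      num_mini_batches  db_type  [CPU/GPU] [DEVICE_ID=0]

Several blobs can be extracted in one pass by giving a comma-separated list of blob names and a matching list of output dataset names. With the data layer's `batch_size` of 50, the 10 mini-batches above should yield 500 feature vectors.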
If you encounter the error "Check failed: status.ok() Failed to open leveldb examples/_temp/features", it is because the directory `examples/_temp/features` was already created the last time you ran the command. Remove it and run again:

    rm -rf examples/_temp/features/
If you'd like to use the Python wrapper for extracting features, check out the [layer visualization notebook](http://nbviewer.ipython.org/github/BVLC/caffe/blob/master/examples/filter_visualization.ipynb).
Clean Up
--------
Let's remove the temporary directory now.

    rm -r examples/_temp
@melvincabatuan (Author) commented:
../build/tools/caffe: /root/anaconda/lib/libtiff.so.5: no version information available (required by /usr/local/lib/libopencv_imgcodecs.so.3.0)
I0529 16:42:23.045181 25165 caffe.cpp:117] Use CPU.
I0529 16:42:23.045523 25165 caffe.cpp:121] Starting Optimization
I0529 16:42:23.045619 25165 solver.cpp:32] Initializing solver from parameters:
test_iter: 250
test_interval: 1000
base_lr: 0.01
display: 1000
max_iter: 10000
lr_policy: "step"
gamma: 0.1
momentum: 0.9
weight_decay: 0.0005
stepsize: 5000
snapshot: 10000
snapshot_prefix: "hdf5_classification/data/train"
solver_mode: CPU
net: "hdf5_classification/train_val.prototxt"
I0529 16:42:23.045701 25165 solver.cpp:70] Creating training net from net file: hdf5_classification/train_val.prototxt
I0529 16:42:23.045889 25165 net.cpp:287] The NetState phase (0) differed from the phase (1) specified by a rule in layer data
I0529 16:42:23.045915 25165 net.cpp:287] The NetState phase (0) differed from the phase (1) specified by a rule in layer accuracy
I0529 16:42:23.045977 25165 net.cpp:42] Initializing net from parameters:
name: "LogisticRegressionNet"
state {
phase: TRAIN
}
layer {
name: "data"
type: "HDF5Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
hdf5_data_param {
source: "hdf5_classification/data/train.txt"
batch_size: 10
}
}
layer {
name: "fc1"
type: "InnerProduct"
bottom: "data"
top: "fc1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc1"
bottom: "label"
top: "loss"
}
I0529 16:42:23.046212 25165 layer_factory.hpp:74] Creating layer data
I0529 16:42:23.046237 25165 net.cpp:90] Creating Layer data
I0529 16:42:23.046254 25165 net.cpp:368] data -> data
I0529 16:42:23.046289 25165 net.cpp:368] data -> label
I0529 16:42:23.046314 25165 net.cpp:120] Setting up data
I0529 16:42:23.046329 25165 hdf5_data_layer.cpp:80] Loading list of HDF5 filenames from: hdf5_classification/data/train.txt
I0529 16:42:23.046409 25165 hdf5_data_layer.cpp:94] Number of HDF5 files: 2
I0529 16:42:23.048463 25165 net.cpp:127] Top shape: 10 4 (40)
I0529 16:42:23.048490 25165 net.cpp:127] Top shape: 10 (10)
I0529 16:42:23.048504 25165 layer_factory.hpp:74] Creating layer fc1
I0529 16:42:23.048527 25165 net.cpp:90] Creating Layer fc1
I0529 16:42:23.048542 25165 net.cpp:410] fc1 <- data
I0529 16:42:23.048565 25165 net.cpp:368] fc1 -> fc1
I0529 16:42:23.048586 25165 net.cpp:120] Setting up fc1
I0529 16:42:23.049022 25165 net.cpp:127] Top shape: 10 2 (20)
I0529 16:42:23.049048 25165 layer_factory.hpp:74] Creating layer loss
I0529 16:42:23.049069 25165 net.cpp:90] Creating Layer loss
I0529 16:42:23.049088 25165 net.cpp:410] loss <- fc1
I0529 16:42:23.049101 25165 net.cpp:410] loss <- label
I0529 16:42:23.049116 25165 net.cpp:368] loss -> loss
I0529 16:42:23.049134 25165 net.cpp:120] Setting up loss
I0529 16:42:23.049151 25165 layer_factory.hpp:74] Creating layer loss
I0529 16:42:23.049185 25165 net.cpp:127] Top shape: (1)
I0529 16:42:23.049198 25165 net.cpp:129] with loss weight 1
I0529 16:42:23.049226 25165 net.cpp:192] loss needs backward computation.
I0529 16:42:23.049239 25165 net.cpp:192] fc1 needs backward computation.
I0529 16:42:23.049252 25165 net.cpp:194] data does not need backward computation.
I0529 16:42:23.049263 25165 net.cpp:235] This network produces output loss
I0529 16:42:23.049276 25165 net.cpp:482] Collecting Learning Rate and Weight Decay.
I0529 16:42:23.049291 25165 net.cpp:247] Network initialization done.
I0529 16:42:23.049302 25165 net.cpp:248] Memory required for data: 284
I0529 16:42:23.049466 25165 solver.cpp:154] Creating test net (#0) specified by net file: hdf5_classification/train_val.prototxt
I0529 16:42:23.049494 25165 net.cpp:287] The NetState phase (1) differed from the phase (0) specified by a rule in layer data
I0529 16:42:23.049556 25165 net.cpp:42] Initializing net from parameters:
name: "LogisticRegressionNet"
state {
phase: TEST
}
layer {
name: "data"
type: "HDF5Data"
top: "data"
top: "label"
include {
phase: TEST
}
hdf5_data_param {
source: "hdf5_classification/data/test.txt"
batch_size: 10
}
}
layer {
name: "fc1"
type: "InnerProduct"
bottom: "data"
top: "fc1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc1"
bottom: "label"
top: "loss"
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "fc1"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
I0529 16:42:23.049821 25165 layer_factory.hpp:74] Creating layer data
I0529 16:42:23.049840 25165 net.cpp:90] Creating Layer data
I0529 16:42:23.049854 25165 net.cpp:368] data -> data
I0529 16:42:23.049873 25165 net.cpp:368] data -> label
I0529 16:42:23.049890 25165 net.cpp:120] Setting up data
I0529 16:42:23.049901 25165 hdf5_data_layer.cpp:80] Loading list of HDF5 filenames from: hdf5_classification/data/test.txt
I0529 16:42:23.049927 25165 hdf5_data_layer.cpp:94] Number of HDF5 files: 1
I0529 16:42:23.051518 25165 net.cpp:127] Top shape: 10 4 (40)
I0529 16:42:23.051543 25165 net.cpp:127] Top shape: 10 (10)
I0529 16:42:23.051558 25165 layer_factory.hpp:74] Creating layer label_data_1_split
I0529 16:42:23.051575 25165 net.cpp:90] Creating Layer label_data_1_split
I0529 16:42:23.051589 25165 net.cpp:410] label_data_1_split <- label
I0529 16:42:23.051602 25165 net.cpp:368] label_data_1_split -> label_data_1_split_0
I0529 16:42:23.051619 25165 net.cpp:368] label_data_1_split -> label_data_1_split_1
I0529 16:42:23.051633 25165 net.cpp:120] Setting up label_data_1_split
I0529 16:42:23.051651 25165 net.cpp:127] Top shape: 10 (10)
I0529 16:42:23.051662 25165 net.cpp:127] Top shape: 10 (10)
I0529 16:42:23.051674 25165 layer_factory.hpp:74] Creating layer fc1
I0529 16:42:23.051689 25165 net.cpp:90] Creating Layer fc1
I0529 16:42:23.051702 25165 net.cpp:410] fc1 <- data
I0529 16:42:23.051715 25165 net.cpp:368] fc1 -> fc1
I0529 16:42:23.051731 25165 net.cpp:120] Setting up fc1
I0529 16:42:23.051751 25165 net.cpp:127] Top shape: 10 2 (20)
I0529 16:42:23.051769 25165 layer_factory.hpp:74] Creating layer fc1_fc1_0_split
I0529 16:42:23.051782 25165 net.cpp:90] Creating Layer fc1_fc1_0_split
I0529 16:42:23.051795 25165 net.cpp:410] fc1_fc1_0_split <- fc1
I0529 16:42:23.051807 25165 net.cpp:368] fc1_fc1_0_split -> fc1_fc1_0_split_0
I0529 16:42:23.051822 25165 net.cpp:368] fc1_fc1_0_split -> fc1_fc1_0_split_1
I0529 16:42:23.051836 25165 net.cpp:120] Setting up fc1_fc1_0_split
I0529 16:42:23.051851 25165 net.cpp:127] Top shape: 10 2 (20)
I0529 16:42:23.051863 25165 net.cpp:127] Top shape: 10 2 (20)
I0529 16:42:23.051874 25165 layer_factory.hpp:74] Creating layer loss
I0529 16:42:23.051889 25165 net.cpp:90] Creating Layer loss
I0529 16:42:23.051901 25165 net.cpp:410] loss <- fc1_fc1_0_split_0
I0529 16:42:23.051914 25165 net.cpp:410] loss <- label_data_1_split_0
I0529 16:42:23.051928 25165 net.cpp:368] loss -> loss
I0529 16:42:23.051941 25165 net.cpp:120] Setting up loss
I0529 16:42:23.051955 25165 layer_factory.hpp:74] Creating layer loss
I0529 16:42:23.051975 25165 net.cpp:127] Top shape: (1)
I0529 16:42:23.051986 25165 net.cpp:129] with loss weight 1
I0529 16:42:23.052001 25165 layer_factory.hpp:74] Creating layer accuracy
I0529 16:42:23.052017 25165 net.cpp:90] Creating Layer accuracy
I0529 16:42:23.052028 25165 net.cpp:410] accuracy <- fc1_fc1_0_split_1
I0529 16:42:23.052042 25165 net.cpp:410] accuracy <- label_data_1_split_1
I0529 16:42:23.052054 25165 net.cpp:368] accuracy -> accuracy
I0529 16:42:23.052073 25165 net.cpp:120] Setting up accuracy
I0529 16:42:23.052090 25165 net.cpp:127] Top shape: (1)
I0529 16:42:23.052103 25165 net.cpp:194] accuracy does not need backward computation.
I0529 16:42:23.052114 25165 net.cpp:192] loss needs backward computation.
I0529 16:42:23.052126 25165 net.cpp:192] fc1_fc1_0_split needs backward computation.
I0529 16:42:23.052137 25165 net.cpp:192] fc1 needs backward computation.
I0529 16:42:23.052163 25165 net.cpp:194] label_data_1_split does not need backward computation.
I0529 16:42:23.052177 25165 net.cpp:194] data does not need backward computation.
I0529 16:42:23.052187 25165 net.cpp:235] This network produces output accuracy
I0529 16:42:23.052199 25165 net.cpp:235] This network produces output loss
I0529 16:42:23.052216 25165 net.cpp:482] Collecting Learning Rate and Weight Decay.
I0529 16:42:23.052227 25165 net.cpp:247] Network initialization done.
I0529 16:42:23.052240 25165 net.cpp:248] Memory required for data: 528
I0529 16:42:23.052271 25165 solver.cpp:42] Solver scaffolding done.
I0529 16:42:23.052291 25165 solver.cpp:226] Solving LogisticRegressionNet
I0529 16:42:23.052304 25165 solver.cpp:227] Learning Rate Policy: step
I0529 16:42:23.052317 25165 solver.cpp:270] Iteration 0, Testing net (#0)
I0529 16:42:23.054368 25165 solver.cpp:319] Test net output #0: accuracy = 0.4024
I0529 16:42:23.054397 25165 solver.cpp:319] Test net output #1: loss = 0.696094 (* 1 = 0.696094 loss)
I0529 16:42:23.054430 25165 solver.cpp:189] Iteration 0, loss = 0.703411
I0529 16:42:23.054451 25165 solver.cpp:204] Train net output #0: loss = 0.703411 (* 1 = 0.703411 loss)
I0529 16:42:23.054472 25165 solver.cpp:467] Iteration 0, lr = 0.01
I0529 16:42:23.062499 25165 solver.cpp:270] Iteration 1000, Testing net (#0)
I0529 16:42:23.064363 25165 solver.cpp:319] Test net output #0: accuracy = 0.73
I0529 16:42:23.064388 25165 solver.cpp:319] Test net output #1: loss = 0.59733 (* 1 = 0.59733 loss)
I0529 16:42:23.064414 25165 solver.cpp:189] Iteration 1000, loss = 0.683364
I0529 16:42:23.064431 25165 solver.cpp:204] Train net output #0: loss = 0.683364 (* 1 = 0.683364 loss)
I0529 16:42:23.064445 25165 solver.cpp:467] Iteration 1000, lr = 0.01
I0529 16:42:23.072008 25165 solver.cpp:270] Iteration 2000, Testing net (#0)
I0529 16:42:23.073807 25165 solver.cpp:319] Test net output #0: accuracy = 0.7548
I0529 16:42:23.073832 25165 solver.cpp:319] Test net output #1: loss = 0.609978 (* 1 = 0.609978 loss)
I0529 16:42:23.073858 25165 solver.cpp:189] Iteration 2000, loss = 0.542591
I0529 16:42:23.073875 25165 solver.cpp:204] Train net output #0: loss = 0.542591 (* 1 = 0.542591 loss)
I0529 16:42:23.073889 25165 solver.cpp:467] Iteration 2000, lr = 0.01
I0529 16:42:23.081267 25165 solver.cpp:270] Iteration 3000, Testing net (#0)
I0529 16:42:23.083039 25165 solver.cpp:319] Test net output #0: accuracy = 0.7236
I0529 16:42:23.083065 25165 solver.cpp:319] Test net output #1: loss = 0.597759 (* 1 = 0.597759 loss)
I0529 16:42:23.083569 25165 solver.cpp:189] Iteration 3000, loss = 0.414657
I0529 16:42:23.083593 25165 solver.cpp:204] Train net output #0: loss = 0.414657 (* 1 = 0.414657 loss)
I0529 16:42:23.083607 25165 solver.cpp:467] Iteration 3000, lr = 0.01
I0529 16:42:23.092474 25165 solver.cpp:270] Iteration 4000, Testing net (#0)
I0529 16:42:23.095293 25165 solver.cpp:319] Test net output #0: accuracy = 0.73
I0529 16:42:23.095335 25165 solver.cpp:319] Test net output #1: loss = 0.59733 (* 1 = 0.59733 loss)
I0529 16:42:23.095366 25165 solver.cpp:189] Iteration 4000, loss = 0.683365
I0529 16:42:23.095383 25165 solver.cpp:204] Train net output #0: loss = 0.683364 (* 1 = 0.683364 loss)
I0529 16:42:23.095397 25165 solver.cpp:467] Iteration 4000, lr = 0.01
I0529 16:42:23.102486 25165 solver.cpp:270] Iteration 5000, Testing net (#0)
I0529 16:42:23.104099 25165 solver.cpp:319] Test net output #0: accuracy = 0.7548
I0529 16:42:23.104121 25165 solver.cpp:319] Test net output #1: loss = 0.609978 (* 1 = 0.609978 loss)
I0529 16:42:23.104143 25165 solver.cpp:189] Iteration 5000, loss = 0.542591
I0529 16:42:23.104159 25165 solver.cpp:204] Train net output #0: loss = 0.542591 (* 1 = 0.542591 loss)
I0529 16:42:23.104171 25165 solver.cpp:467] Iteration 5000, lr = 0.001
I0529 16:42:23.110808 25165 solver.cpp:270] Iteration 6000, Testing net (#0)
I0529 16:42:23.112450 25165 solver.cpp:319] Test net output #0: accuracy = 0.7668
I0529 16:42:23.112471 25165 solver.cpp:319] Test net output #1: loss = 0.597519 (* 1 = 0.597519 loss)
I0529 16:42:23.112946 25165 solver.cpp:189] Iteration 6000, loss = 0.405885
I0529 16:42:23.112968 25165 solver.cpp:204] Train net output #0: loss = 0.405885 (* 1 = 0.405885 loss)
I0529 16:42:23.112982 25165 solver.cpp:467] Iteration 6000, lr = 0.001
I0529 16:42:23.119367 25165 solver.cpp:270] Iteration 7000, Testing net (#0)
I0529 16:42:23.120959 25165 solver.cpp:319] Test net output #0: accuracy = 0.7648
I0529 16:42:23.120983 25165 solver.cpp:319] Test net output #1: loss = 0.596939 (* 1 = 0.596939 loss)
I0529 16:42:23.121013 25165 solver.cpp:189] Iteration 7000, loss = 0.672852
I0529 16:42:23.121038 25165 solver.cpp:204] Train net output #0: loss = 0.672852 (* 1 = 0.672852 loss)
I0529 16:42:23.121050 25165 solver.cpp:467] Iteration 7000, lr = 0.001
I0529 16:42:23.127130 25165 solver.cpp:270] Iteration 8000, Testing net (#0)
I0529 16:42:23.128564 25165 solver.cpp:319] Test net output #0: accuracy = 0.76
I0529 16:42:23.128587 25165 solver.cpp:319] Test net output #1: loss = 0.598483 (* 1 = 0.598483 loss)
I0529 16:42:23.128607 25165 solver.cpp:189] Iteration 8000, loss = 0.571235
I0529 16:42:23.128628 25165 solver.cpp:204] Train net output #0: loss = 0.571234 (* 1 = 0.571234 loss)
I0529 16:42:23.128639 25165 solver.cpp:467] Iteration 8000, lr = 0.001
I0529 16:42:23.135079 25165 solver.cpp:270] Iteration 9000, Testing net (#0)
I0529 16:42:23.136579 25165 solver.cpp:319] Test net output #0: accuracy = 0.764
I0529 16:42:23.136606 25165 solver.cpp:319] Test net output #1: loss = 0.597175 (* 1 = 0.597175 loss)
I0529 16:42:23.137107 25165 solver.cpp:189] Iteration 9000, loss = 0.411332
I0529 16:42:23.137130 25165 solver.cpp:204] Train net output #0: loss = 0.411332 (* 1 = 0.411332 loss)
I0529 16:42:23.137142 25165 solver.cpp:467] Iteration 9000, lr = 0.001
I0529 16:42:23.143101 25165 solver.cpp:337] Snapshotting to hdf5_classification/data/train_iter_10000.caffemodel
I0529 16:42:23.143313 25165 solver.cpp:345] Snapshotting solver state to hdf5_classification/data/train_iter_10000.solverstate
I0529 16:42:23.143407 25165 solver.cpp:252] Iteration 10000, loss = 0.671619
I0529 16:42:23.143424 25165 solver.cpp:270] Iteration 10000, Testing net (#0)
I0529 16:42:23.144834 25165 solver.cpp:319] Test net output #0: accuracy = 0.7644
I0529 16:42:23.144852 25165 solver.cpp:319] Test net output #1: loss = 0.59689 (* 1 = 0.59689 loss)
I0529 16:42:23.144863 25165 solver.cpp:257] Optimization Done.
I0529 16:42:23.144886 25165 caffe.cpp:134] Optimization Done.
