VGG-16 pre-trained model for Keras

## VGG16 model for Keras

This is the Keras model of the 16-layer network used by the VGG team in the ILSVRC-2014 competition.

It has been obtained by directly converting the Caffe model provided by the authors.

Details about the network architecture can be found in the following arXiv paper:

Very Deep Convolutional Networks for Large-Scale Image Recognition
K. Simonyan, A. Zisserman
arXiv:1409.1556

In the paper, the VGG-16 model is denoted as configuration D. It achieves 7.5% top-5 error on ILSVRC-2012-val and 7.4% top-5 error on ILSVRC-2012-test.

Please cite the paper if you use the models.

### Contents:

model and usage demo: see vgg-16_keras.py

weights: vgg16_weights.h5

from keras.models import Sequential
from keras.layers.core import Flatten, Dense, Dropout
from keras.layers.convolutional import Convolution2D, MaxPooling2D, ZeroPadding2D
from keras.optimizers import SGD
import cv2, numpy as np


def VGG_16(weights_path=None):
    model = Sequential()

    # Block 1: two 64-filter 3x3 convolutions
    model.add(ZeroPadding2D((1,1), input_shape=(3,224,224)))
    model.add(Convolution2D(64, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(64, 3, 3, activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))

    # Block 2: two 128-filter convolutions
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(128, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(128, 3, 3, activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))

    # Block 3: three 256-filter convolutions
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(256, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(256, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(256, 3, 3, activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))

    # Block 4: three 512-filter convolutions
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(512, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(512, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(512, 3, 3, activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))

    # Block 5: three 512-filter convolutions
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(512, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(512, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(512, 3, 3, activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))

    # Fully-connected classifier
    model.add(Flatten())
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1000, activation='softmax'))

    if weights_path:
        model.load_weights(weights_path)

    return model


if __name__ == "__main__":
    # Preprocess: resize to 224x224, subtract the ImageNet mean pixel
    # (BGR order, matching cv2.imread), reorder to channels-first, and
    # add a batch dimension
    im = cv2.resize(cv2.imread('cat.jpg'), (224, 224)).astype(np.float32)
    im[:,:,0] -= 103.939
    im[:,:,1] -= 116.779
    im[:,:,2] -= 123.68
    im = im.transpose((2,0,1))
    im = np.expand_dims(im, axis=0)

    # Test pretrained model
    model = VGG_16('vgg16_weights.h5')
    sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
    model.compile(optimizer=sgd, loss='categorical_crossentropy')
    out = model.predict(im)
    print np.argmax(out)
@rodrigob

Was this converted model tested? What accuracy did you get?

@mikedewar

@baraldilorenzo (and @Zebreu!) thanks for this!

Is there a way I can use this to actually label images? I can get an index into the class vector (my 'cat.jpg' comes out to class 669) but I've no idea how to actually find the class labels the index refers to...

@mikedewar

I'm showing this network a bunch of different images and using model.predict_classes(im) to get class indices (I think). I'm using the pre-trained weights.

Most things I'm showing it (cat, dog, shark, mushroom) come out to index 669. A cup of coffee returns 999. Am I being too naive here?

@Zebreu
Zebreu commented Nov 9, 2015

@mikedewar You can download it at http://dl.caffe.berkeleyvision.org/caffe_ilsvrc12.tar.gz (there are other files too, but one of them has a thousand lines with synset IDs and names, so you can just use that as an index). Works well in my case.
Also, from the out array, you can do numpy.argmax(out) to get the class with the highest probability.

Also, there is no normalization done in the gist above. If you want accurate results, you should apply these steps to any input image:

    img = cv2.resize(cv2.imread('../../Downloads/cat2.jpg'), (224, 224))

    mean_pixel = [103.939, 116.779, 123.68]
    img = img.astype(np.float32, copy=False)
    for c in range(3):
        img[:, :, c] = img[:, :, c] - mean_pixel[c]
    img = img.transpose((2,0,1))
    img = np.expand_dims(img, axis=0)

The mean pixel values are taken from the VGG authors; they are the per-channel means computed over the training dataset.

@Zebreu
Zebreu commented Nov 9, 2015

@arushk1 You should reinstall Theano directly from its GitHub repo (and Keras as well). Do not rely on Python package managers for them.

@mikedewar

@Zebreu thanks! I have included the image pre-processing in my script. For some reason my network now thinks everything is n03724870 mask, assuming (as I am) that the line number in that file corresponds to the index.

Can I confirm that I should be able to show this network arbitrary JPEGs and expect to see different classes for different pictures, using the supplied weights and image preprocessing? I'd like to differentiate between a simple dumb bug on my part and a fundamental misunderstanding of what magic this network ought to be able to perform.

@mikedewar

just got this working - thanks everyone for the help! I think the last hurdle was that I was using skimage's imread instead of opencv's. I'm not 100% sure this is the fix, but it's working fantastically now. Thanks so much @Zebreu and @baraldilorenzo !

@Zebreu
Zebreu commented Nov 11, 2015

I can explain the scikit-image vs. OpenCV difference: scikit-image reads the color channels in RGB order, while OpenCV outputs them as BGR.

And yes, you can show anything, I had it read my webcam output and it worked quite well.
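
For anyone hitting this with scikit-image, a minimal sketch of the channel fix (an illustration, assuming an image file such as cat.jpg; the converted weights expect BGR input because the original Caffe model was trained on BGR images):

    import numpy as np
    from skimage.io import imread

    img = imread('cat.jpg').astype(np.float32)  # skimage returns RGB, shape (H, W, 3)
    img = img[:, :, ::-1]                       # reverse the channel axis: RGB -> BGR
    # ...then resize, mean-subtract, transpose and expand_dims as in the gist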

@arushk1
arushk1 commented Dec 8, 2015

VGG should give a 4096-D vector as output, right? I'm getting a 1000-D one.

@lireagan

@arushk1 1000-D is the number of classes...

@victoriastuart

Broken link (weights: vgg16_weights.h5)?

https://drive.google.com/file/d/0Bz7KyqmuGsilT0J5dmRCM0ROVHc/view?usp=sharing

"/tmp/mozilla_victoria0/7xCqqjnk.bin.part could not be saved, because the source file could not be read. Try again later, or contact the server administrator."

Update: downloads in Chrome (v47) but not in my version of Firefox (v43 + numerous add-ons, blockers). Left as a comment in case others have the same 'problem.'

Needed 'vgg16_weights.h5' for the example in this blog post/tutorial:
http://blog.christianperone.com/2016/01/convolutional-hypercolumns-in-python/

@manakatie

I'm getting an error on model.compile. It returns the following error. Any suggestions? Thanks.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-23-c87da87eedb0> in <module>()
     10     model = VGG_16('vgg16_weights.h5')
     11     sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
---> 12     model.compile(optimizer=sgd, loss='categorical_crossentropy')
     13     out = model.predict(im)
     14     print np.argmax(out)

/usr/local/lib/python2.7/site-packages/Keras-0.3.1-py2.7.egg/keras/models.pyc in compile(self, optimizer, loss, class_mode)
    433         self.X_test = self.get_input(train=False)
    434 
--> 435         self.y_train = self.get_output(train=True)
    436         self.y_test = self.get_output(train=False)
    437 

/usr/local/lib/python2.7/site-packages/Keras-0.3.1-py2.7.egg/keras/layers/containers.pyc in get_output(self, train)
    126 
    127     def get_output(self, train=False):
--> 128         return self.layers[-1].get_output(train)
    129 
    130     def set_input(self):

/usr/local/lib/python2.7/site-packages/Keras-0.3.1-py2.7.egg/keras/layers/core.pyc in get_output(self, train)
    960 
    961     def get_output(self, train=False):
--> 962         X = self.get_input(train)
    963         output = self.activation(K.dot(X, self.W) + self.b)
    964         return output

/usr/local/lib/python2.7/site-packages/Keras-0.3.1-py2.7.egg/keras/layers/core.pyc in get_input(self, train)
    171                 if previous_layer_id in self.layer_cache:
    172                     return self.layer_cache[previous_layer_id]
--> 173             previous_output = self.previous.get_output(train=train)
    174             if hasattr(self, 'layer_cache') and self.cache_enabled:
    175                 previous_layer_id = '%s_%s' % (id(self.previous), train)

/usr/local/lib/python2.7/site-packages/Keras-0.3.1-py2.7.egg/keras/layers/core.pyc in get_output(self, train)
    635 
    636     def get_output(self, train=False):
--> 637         X = self.get_input(train)
    638         if self.p > 0.:
    639             if train:

/usr/local/lib/python2.7/site-packages/Keras-0.3.1-py2.7.egg/keras/layers/core.pyc in get_input(self, train)
    171                 if previous_layer_id in self.layer_cache:
    172                     return self.layer_cache[previous_layer_id]
--> 173             previous_output = self.previous.get_output(train=train)
    174             if hasattr(self, 'layer_cache') and self.cache_enabled:
    175                 previous_layer_id = '%s_%s' % (id(self.previous), train)

/usr/local/lib/python2.7/site-packages/Keras-0.3.1-py2.7.egg/keras/layers/core.pyc in get_output(self, train)
    960 
    961     def get_output(self, train=False):
--> 962         X = self.get_input(train)
    963         output = self.activation(K.dot(X, self.W) + self.b)
    964         return output

/usr/local/lib/python2.7/site-packages/Keras-0.3.1-py2.7.egg/keras/layers/core.pyc in get_input(self, train)
    171                 if previous_layer_id in self.layer_cache:
    172                     return self.layer_cache[previous_layer_id]
--> 173             previous_output = self.previous.get_output(train=train)
    174             if hasattr(self, 'layer_cache') and self.cache_enabled:
    175                 previous_layer_id = '%s_%s' % (id(self.previous), train)

/usr/local/lib/python2.7/site-packages/Keras-0.3.1-py2.7.egg/keras/layers/core.pyc in get_output(self, train)
    635 
    636     def get_output(self, train=False):
--> 637         X = self.get_input(train)
    638         if self.p > 0.:
    639             if train:

/usr/local/lib/python2.7/site-packages/Keras-0.3.1-py2.7.egg/keras/layers/core.pyc in get_input(self, train)
    171                 if previous_layer_id in self.layer_cache:
    172                     return self.layer_cache[previous_layer_id]
--> 173             previous_output = self.previous.get_output(train=train)
    174             if hasattr(self, 'layer_cache') and self.cache_enabled:
    175                 previous_layer_id = '%s_%s' % (id(self.previous), train)

/usr/local/lib/python2.7/site-packages/Keras-0.3.1-py2.7.egg/keras/layers/core.pyc in get_output(self, train)
    960 
    961     def get_output(self, train=False):
--> 962         X = self.get_input(train)
    963         output = self.activation(K.dot(X, self.W) + self.b)
    964         return output

/usr/local/lib/python2.7/site-packages/Keras-0.3.1-py2.7.egg/keras/layers/core.pyc in get_input(self, train)
    171                 if previous_layer_id in self.layer_cache:
    172                     return self.layer_cache[previous_layer_id]
--> 173             previous_output = self.previous.get_output(train=train)
    174             if hasattr(self, 'layer_cache') and self.cache_enabled:
    175                 previous_layer_id = '%s_%s' % (id(self.previous), train)

/usr/local/lib/python2.7/site-packages/Keras-0.3.1-py2.7.egg/keras/layers/core.pyc in get_output(self, train)
    833     def get_output(self, train=False):
    834         X = self.get_input(train)
--> 835         return K.batch_flatten(X)
    836 
    837 

/usr/local/lib/python2.7/site-packages/Keras-0.3.1-py2.7.egg/keras/backend/tensorflow_backend.pyc in batch_flatten(x)
    307     the first dimension is conserved.
    308     '''
--> 309     x = tf.reshape(x, [-1, np.prod(x.get_shape()[1:].as_list())])
    310     return x
    311 

/usr/local/lib/python2.7/site-packages/numpy/core/fromnumeric.pyc in prod(a, axis, dtype, out, keepdims)
   2479         except AttributeError:
   2480             return _methods._prod(a, axis=axis, dtype=dtype,
-> 2481                                   out=out, keepdims=keepdims)
   2482         return prod(axis=axis, dtype=dtype, out=out)
   2483     else:

/usr/local/lib/python2.7/site-packages/numpy/core/_methods.pyc in _prod(a, axis, dtype, out, keepdims)
     33 
     34 def _prod(a, axis=None, dtype=None, out=None, keepdims=False):
---> 35     return umr_prod(a, axis, dtype, out, keepdims)
     36 
     37 def _any(a, axis=None, dtype=None, out=None, keepdims=False):

TypeError: unsupported operand type(s) for *: 'int' and 'NoneType'
@dragosbulugean

@manakatie It only does that if you use the TensorFlow backend. Trying to fix it; will post if I do.

@MarcBS
MarcBS commented Feb 10, 2016

If anybody is interested in converting other kinds of Caffe models to Keras you can use this fork which has a conversion module:
https://github.com/MarcBS/keras

Best,
Marc

@wadhwasahil

I have converted this model to a Graph model. However, when I try to load the weights, the following error is shown.
KeyError: "Unable to open object (Object 'graph' doesn't exist)"

@hasnainv

I have extended @MarcBS's fork to add support for Sequential models. This is the fork and this gist might be helpful.

@atique81

Hi @baraldilorenzo, @Zebreu, @mikedewar
I have run this model on the ImageNet2012 validation set. If I normalize the images (by dividing by 255), the output class for all images is the same (class index 669). If I do not normalize, the output classes differ across images, but the predictions are still all wrong.
Could you advise what's going wrong with the normalization, or why the results are so wrong? Please note that the result is the same (wrong predictions) regardless of the channel order of the input images (i.e. wrong predictions with both RGB and BGR).

@baraldilorenzo
Owner

Hi @atique81, could you please post your code?

@atique81

Hi @baraldilorenzo,

I have figured out that I was actually validating against a wrong ground-truth file for the ImageNet2012 validation set. Now that I am using the correct one, it's giving me the expected results, though I am still not able to run it on the TensorFlow backend, as reported above by @manakatie.

Thank you very much for replying.

@baraldilorenzo
Owner

@atique81 ok, that seems more reasonable now.

@atique81 @manakatie regarding Tensorflow, it seems that Keras has an issue with TF when the first layer is a ZeroPadding one. See: fchollet/keras#1135

@atique81

thanks a lot @baraldilorenzo

@mdda
mdda commented Mar 13, 2016

If you use Keras from git, the ZeroPadding issue has been fixed for TensorFlow (the Theano backend was always OK).
Funny thing is, though, that I'm getting different output classes when I switch to TensorFlow (it's simple to A/B test the backends). The Theano results are as expected...

@issey173

Hi! I'm trying to run your example with the weights you provide and I'm getting the following error:

Traceback (most recent call last):
  File "vgg-16_keras.py", line 69, in <module>
    model = VGG_16('vgg16_weights.h5')
  File "vgg-16_keras.py", line 48, in VGG_16
    model.add(Dense(4096, activation='relu'))
  File "build/bdist.linux-x86_64/egg/keras/layers/containers.py", line 70, in add
  File "build/bdist.linux-x86_64/egg/keras/layers/core.py", line 153, in set_previous
  File "build/bdist.linux-x86_64/egg/keras/layers/core.py", line 1015, in build
  File "build/bdist.linux-x86_64/egg/keras/initializations.py", line 59, in glorot_uniform
  File "build/bdist.linux-x86_64/egg/keras/initializations.py", line 31, in uniform
  File "build/bdist.linux-x86_64/egg/keras/backend/theano_backend.py", line 34, in variable
  File "/imatge/imasuda/kerasenv/local/lib/python2.7/site-packages/numpy/core/numeric.py", line 474, in asarray
    return array(a, dtype, copy=False, order=order)
MemoryError

Even though it says MemoryError, I've run it with 3 GB of RAM dedicated to it (with GPU acceleration) and it still throws the same error.
Any suggestions on how to fix this? Thanks!

@atique81

Hi @baraldilorenzo,
Using your model, I am trying to fine-tune on the ImageNet dataset for a smaller number of classes. I am normalizing the data by dividing each image pixel by 255 and then feeding the images into your network. But it seems the training is sometimes OK and sometimes not: the loss sometimes goes down and sometimes does not (it rather jumps around).

Also, it seems that Keras has issues with GPU memory, as a batch size larger than 5 throws a memory error, even though my GPU has 2 GB of memory.

Could you please advise on these?

@baraldilorenzo
Owner

@mdda i will look into that.
@issey173, @atique81 you should ask usage questions at https://groups.google.com/forum/#!forum/keras-users

@wongjingping

Hi @baraldilorenzo, just checking: is there any scaling of the inputs involved? It seems there's only centering, with no division by the standard deviation. If there isn't, I'm wondering whether the original VGG model was trained with only centered (but not scaled) inputs?

@baraldilorenzo
Owner

Hi @wongjingping, yes, in the original VGG paper the only preprocessing step was mean subtraction. No scaling.

@bayraktare

I am a little bit confused. What is the output of this code that we print on the last line with "print np.argmax(out)"?
I got a value of 621, and I think this has to be the class of the object in the given image; am I right?
If I am right, where can I find which numbers belong to which object classes? Otherwise, what is the meaning of this number?
In addition, can I find an example of top-5 predictions as the output of the code?

@baraldilorenzo
Owner

Hi @bayraktare, that's correct. The printed number is the index of the predicted class, i.e. the class with the highest score. There is a comment above by @Zebreu explaining how to convert these indices into class names.
To get the top-5 predictions, just sort out in descending order and take the first 5 elements. Something like np.argsort(out)[::-1][:5]
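
For reference, a minimal sketch of that suggestion (assuming out comes from model.predict(im) as in the gist; note that out is 2-D with shape (1, 1000), so take the first row before sorting):

    probs = out[0]                      # scores for the single input image
    top5 = np.argsort(probs)[::-1][:5]  # indices of the 5 highest-scoring classes
    print top5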

@bayraktare

Ok @baraldilorenzo I got it very well and thank you very much for your fast reply.

@shubham1310

I am getting this error:

model = VGG_16('vgg16_weights.h5')
File "cnn.py", line 9, in VGG_16
model.add(ZeroPadding2D((1,1),input_shape=(3,224,224)))
TypeError: __init__() got an unexpected keyword argument 'input_shape'

The Keras docs don't even mention an attribute like 'input_shape'. Please help, as I am a beginner.

@baraldilorenzo
Owner

@shubham1310 you seem to use an old version of Keras. Updating Keras should solve the problem.

@shubham1310

Thanks @baraldilorenzo, there were some issues with my pip updating Python 3 libraries, hence it wasn't working. Now it is working fine.
I want to get the feature vector instead of the softmax result. How can I change the code to remove that layer and still use the pre-trained weights?

@111hypo
111hypo commented Apr 7, 2016

I have met exactly the same problem as @shubham1310. I would like to know how you solved it. By the way, I am using Python 2.7; should I update the Python 2.7 libraries as you did?

@111hypo
111hypo commented Apr 7, 2016

Eh... in fact, every time the code runs to "model.compile(optimizer=sgd, loss='categorical_crossentropy')", the error "TypeError: __init__() got an unexpected keyword argument 'input_shape'" comes up. Should I reinstall Keras and Python 2.7? @baraldilorenzo

@shubham1310

@111hypo Yes, you should reinstall/update both Keras and Theano, and make sure you use sudo pip2 so that it installs for Python 2.x.

@morningsky

If I want to use this weights file, must I use images of size 224x224?

@dcodecasa

Thank you @baraldilorenzo for the conversion.
Note that if you use TF backend you get different and less accurate results.

@jerpint
jerpint commented Apr 13, 2016

Hi,
I would like to use this pretrained net on my own dataset with only 15 labels. Also, this dataset is grayscale, so only 1 channel. I am thinking of stacking the images to be of size (3,224,224), i.e. 3 identical channels, as opposed to (1,224,224). Would this work?

Also, how should I modify the last line of the model to output only 15 labels? if I change

model.add(Dense(1000, activation='softmax'))
to
model.add(Dense(15, activation='softmax'))

this obviously generates an error when loading the weights. Should I instead try adding a new layer, and change the previous one to relu only once I have loaded the weights? I am a bit new to this.

Thanks!

J

@zo7
zo7 commented Apr 15, 2016

@jerpint Take the code above and load the weights (without changing anything), then pop the last layer off the model (since that only does ImageNet classification on the outputs of the last 4096 layer) and add a new one sized for your own dataset.

# Load weights...
model.layers.pop()
model.add(Dense(15, activation='softmax'))

Then you can go and fit the network to your data -- either by fixing the weights of everything except the layer you added, or by allowing back-propagation to update the entire network. (this is known as fine-tuning)

What you're doing for grayscale images should be fine, since that's how a gray image would be represented in RGB anyway. There might be some color-specific filters learned in the VGG network that won't work as well, but I don't think it'll be too much of a problem.
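
A sketch of the freezing step @zo7 describes (an illustration, assuming a Keras version where layers expose a trainable flag; X_train and Y_train are hypothetical arrays, and you need to re-compile after changing trainability for it to take effect):

    # Freeze every layer except the newly added classifier
    for layer in model.layers[:-1]:
        layer.trainable = False

    model.compile(optimizer='sgd', loss='categorical_crossentropy')
    model.fit(X_train, Y_train, batch_size=32, nb_epoch=15)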

@jerpint
jerpint commented Apr 15, 2016

Hi @zo7, thanks for the reply. Just to be sure I understand: before I run the command

model.fit(X_train, Y_train, batch_size=32, nb_epoch=15 ,show_accuracy=True)

How do I specify that the weights of the previous layers should be fixed, versus letting back-propagation update them?

@arktrin
arktrin commented Apr 21, 2016

Is there any way in Keras to get the actual activation values after forward propagation at a specific layer of the model?

@gulcincaner

Hi

I use the code above to classify into 10 classes.
After convolution layers, I add the following:

f = h5py.File(weights_path)
for k in range(f.attrs['nb_layers']):
    if k >= len(model.layers):
        # we don't look at the last (fully-connected) layers in the savefile
        break
    g = f['layer_{}'.format(k)]
    weights = [g['param_{}'.format(p)] for p in range(g.attrs['nb_params'])]
    model.layers[k].set_weights(weights)
f.close()
print('Model loaded.')

model.add(Flatten())
model.add(Dense(4096, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(4096, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))  #### Instead of 1000 classes, I have 10 classes.

I have the following error, which I haven't been able to figure out for some time now. Any help is appreciated! Thank you so much.

ValueError: GpuCorrMM images and kernel must have the same stack size
Apply node that caused the error: GpuCorrMM{valid, (1, 1)}(GpuContiguous.0, GpuContiguous.0)
Toposort index: 177
Inputs types: [CudaNdarrayType(float32, 4D), CudaNdarrayType(float32, 4D)]
Inputs shapes: [(32, 1, 26, 34), (64, 3, 3, 3)]
Inputs strides: [(884, 0, 34, 1), (27, 9, 3, 1)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[GpuElemwise{add,no_inplace}(GpuCorrMM{valid, (1, 1)}.0, GpuReshape{4}.0)]]

@Cospel
Cospel commented Apr 28, 2016 edited

Hi this is the output for cat image in tensorflow:
Top1: n02483362 gibbon, Hylobates lar
Top5: ['n02483362 gibbon, Hylobates lar', 'n01580077 jay', 'n02277742 ringlet, ringlet butterfly', 'n02489166 proboscis monkey, Nasalis larvatus', 'n01833805 hummingbird']

and this is for theano:
Top1: n02119789 kit fox, Vulpes macrotis
Top5: ['n02119789 kit fox, Vulpes macrotis', 'n02119022 red fox, Vulpes vulpes', 'n02123159 tiger cat', 'n02124075 Egyptian cat', 'n02123045 tabby, tabby cat']

  1. Do you know how to solve this? I would like to use the TensorFlow backend but the output is wrong :)
  2. Do you know of another implementation of VGG in Keras that works with TensorFlow?
@lyjgeorge

Hi everyone, I have a question regarding the prediction process.
Since the function VGG_16 loads the weights as well as the network structure,
why would we still do "compile" with "sgd" before the "predict" step?

thanks!

@worldwar2008

Why does the result not change during the fine-tuning process?
14351/14351 [==============================] - 17224s - loss: 14.3683 - acc: 0.1086 - val_loss: 14.3176 - val_acc: 0.1117
Epoch 3/12
14351/14351 [==============================] - 14717s - loss: 14.3649 - acc: 0.1088 - val_loss: 14.3176 - val_acc: 0.1117
Epoch 4/12
14351/14351 [==============================] - 12891s - loss: 14.3649 - acc: 0.1088 - val_loss: 14.3176 - val_acc: 0.1117
Epoch 5/12
14351/14351 [==============================] - 12926s - loss: 14.3649 - acc: 0.1088 - val_loss: 14.3176 - val_acc: 0.1117

@michelbl
michelbl commented May 4, 2016

@baraldilorenzo Thanks a lot !

@preksha12

After testing with an image, say a cat, I get 286 as output, which signifies the class in ImageNet. The posts above mention http://dl.caffe.berkeleyvision.org/caffe_ilsvrc12.tar.gz as the link for the files listing each label, but I am not able to find that list; what I see is a different list.

@111hypo
111hypo commented May 13, 2016

Can we see your fine-tuning code, @worldwar2008?

@preksha12

It is not giving the appropriate class when I run this code; it gives some other class as output.

@dspoka
dspoka commented May 19, 2016 edited

I'm having a similar issue to @preksha12 and @mikedewar. This pretrained model is definitely meant for ILSVRC2014, which has 1000 categories, but I can't find the file online that maps the category number to the category name (the one posted above is from the '12 competition and has only 200 categories). If anyone has the link it would be greatly appreciated.

Ran it on an image of a seal and got these 5 top tags: [685 922 925 675 986]; couldn't find anything that made sense.

@moon5648

Why is it im.transpose((2,0,1))?
Isn't OpenCV's color order BGR, hence im.transpose((2,1,0)) to change it to RGB?

@wongjingping

@moon5648 transpose shuffles the dimensions. If you have an image read in as an array of shape (height, width, channels), transpose((2,0,1)) would convert that array into one of shape (channels, height, width). The cv2 docs are rather sparse but transpose has nothing to do with rgb/bgr in this case.

On a side note you can read in the image data as bgr (using cv2.imread) or rgb (using PIL or skimage imread), but you would have to change the order of the subtraction of mean pixels for each of the channels as per your choice.
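
Tying those two points together, a minimal sketch of the preprocessing when the image is read as RGB (an illustration; img is assumed to be a float32 RGB array already resized to 224x224):

    import numpy as np

    mean_rgb = [123.68, 116.779, 103.939]  # the BGR mean list from above, reversed
    for c in range(3):
        img[:, :, c] -= mean_rgb[c]
    img = img[:, :, ::-1]              # RGB -> BGR, matching the Caffe-trained filters
    img = img.transpose((2, 0, 1))     # (height, width, channels) -> (channels, height, width)
    img = np.expand_dims(img, axis=0)  # add the batch dimension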

@wongjingping

@worldwar2008

I ran into a similar symptom, and the issue turned out to be some unexpected behaviour with the newer version of Keras. Check out the comment from @zo7 in this thread on how to resolve it.

@yinniyu
yinniyu commented Jun 7, 2016

Hi, why does the raw input image data have to be np.float32? What if my image is already in uint8 format? Does it matter if I don't convert to float32 just for the decimal part? Thanks.

@hawkerpl

If anyone wonders why it won't work with scikit-image (skimage) instead of OpenCV: you need to convert the image to OpenCV's representation.
I tweaked this demo a bit here https://gist.github.com/hawkerpl/4da12874c860fec77edefed2b17e64ff
It is compatible with the newest OpenCV and gives rather sane results.

@Honglin-Zheng

How do I interpret the printed result? I got 285 for a cat and 837 for a person. I downloaded the file @Zebreu mentioned and got some lists of words, but none of them matches the result I got. I would appreciate it if you could help me out with that.

@baraldilorenzo
Owner

@Honglin-Zheng in http://dl.caffe.berkeleyvision.org/caffe_ilsvrc12.tar.gz there is a file called synsets.txt. If you have 285 as the predicted class id, simply go to line 285 in that file and get the synset (which is, in your case, n02123597). If you then want the words corresponding to that synset, look it up in synset_words.txt ("Siamese cat, Siamese").
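
A sketch of that lookup (an illustration, assuming the files from the caffe_ilsvrc12 archive sit in the working directory; verify the 0- vs 1-based indexing against a known image, since np.argmax returns a 0-based index):

    import numpy as np

    with open('synset_words.txt') as f:
        labels = [line.strip() for line in f]

    class_id = np.argmax(out)  # out from model.predict(im)
    print labels[class_id]     # synset id and words for the predicted class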

@baraldilorenzo
Owner
baraldilorenzo commented Jul 14, 2016 edited

@hawkerpl It seems that you did not convert your image to BGR (see also @wongjingping's comment).

@baraldilorenzo
Owner

@yinniyu you have to, in order to subtract the mean from each color channel.

@shaayaansayed

@worldwar2008 @wongjingping
I believe the issue was resolved in keras 1.0, but the network does not seem to be outputting correct values. Can anyone else confirm?

@tlehman
tlehman commented Jul 25, 2016

Does anyone here think that VGG-16 could run on a raspberry pi 3?

@libphy
libphy commented Aug 18, 2016

I can't even load the weights into the model, or maybe it's compile that fails.
I keep getting this error: InternalError: Dst tensor is not initialized.
I'm using the TF backend. Has anyone had success with the TF backend yet?

@Arsey
Arsey commented Sep 6, 2016

@lyjgeorge, @baraldilorenzo you don't need to compile the model for prediction. Just load the weights and predict.
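
A minimal sketch of what that means in practice (assuming a Keras version where predict() works without a prior compile()):

    model = VGG_16('vgg16_weights.h5')  # builds the net and loads the weights
    out = model.predict(im)             # no SGD/compile step needed for inference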

@GodOfProbability
GodOfProbability commented Sep 17, 2016 edited

How exactly did you convert the VGG model to a Keras model? I am trying to convert the VGG-M model using the code by @MarcBS, but due to a version problem I am not able to do it. Is there a better way to do it?

@halwai
halwai commented Sep 24, 2016

How can I get hidden-layer outputs?

@hamedf
hamedf commented Oct 8, 2016

Hi,
I ran the code but got the exception below. What's the problem?
Thanks.

Using Theano backend.
Traceback (most recent call last):
File "C:/Users/sf3052/PycharmProjects/untitled2/problem2/test.py", line 67, in
model = VGG_16('vgg16_weights.h5')
File "C:/Users/sf3052/PycharmProjects/untitled2/problem2/test.py", line 46, in VGG_16
model.add(Flatten())
File "C:\Users\sf3052\AppData\Local\Continuum\Anaconda2\lib\site-packages\keras\models.py", line 308, in add
output_tensor = layer(self.outputs[0])
File "C:\Users\sf3052\AppData\Local\Continuum\Anaconda2\lib\site-packages\keras\engine\topology.py", line 514, in call
self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
File "C:\Users\sf3052\AppData\Local\Continuum\Anaconda2\lib\site-packages\keras\engine\topology.py", line 572, in add_inbound_node
Node.create_node(self, inbound_layers, node_indices, tensor_indices)
File "C:\Users\sf3052\AppData\Local\Continuum\Anaconda2\lib\site-packages\keras\engine\topology.py", line 152, in create_node
output_shapes = to_list(outbound_layer.get_output_shape_for(input_shapes[0]))
File "C:\Users\sf3052\AppData\Local\Continuum\Anaconda2\lib\site-packages\keras\layers\core.py", line 402, in get_output_shape_for
'(got ' + str(input_shape[1:]) + '. '
Exception: The shape of the input to "Flatten" is not fully defined (got (0, 7, 512). Make sure to pass a complete "input_shape" or "batch_input_shape" argument to the first layer in your model.

@adwin5
adwin5 commented Oct 23, 2016

@libphy did you fix it? Same problem here with TF as the backend.

@sergii-bond

To make it work for TF backend I had to make these changes:

  1. Change the input shape in the first layer: model.add(ZeroPadding2D((1,1),input_shape=(224,224,3)))
  2. Instead of model.load_weights(), load weights layer by layer manually:
f = h5py.File(weights_path)
for k in range(f.attrs['nb_layers']):
    if k >= len(model.layers) - 1:
        # we don't look at the last two layers in the savefile (fully-connected and activation)
        break
    g = f['layer_{}'.format(k)]
    weights = [g['param_{}'.format(p)] for p in range(g.attrs['nb_params'])]
    layer = model.layers[k]

    if layer.__class__.__name__ in ['Convolution1D', 'Convolution2D', 'Convolution3D', 'AtrousConvolution2D']:
        weights[0] = np.transpose(weights[0], (2, 3, 1, 0))

    layer.set_weights(weights)

f.close()

The solution in the Keras wiki didn't work for me, as model.load_weights() throws an error about dimension incompatibility.

@u-phoria

To get this working with a TF backend, I added K.set_image_dim_ordering('th') at the beginning.

@titikid
titikid commented Nov 2, 2016

@hamedf you can fix it by using @u-phoria's comment above.
Can anyone provide code for fine-tuning?

@antumin
antumin commented Nov 30, 2016

I have a question regarding the implementation of the net. Isn't the L2 regularization missing? In the paper they write:

The training was regularised by weight decay (the L2 penalty multiplier set to 5·10^-4) and dropout regularisation for the first two fully-connected layers (dropout ratio set to 0.5)

@Ripppah
Ripppah commented Nov 30, 2016

How do I use this pre-trained model and the existing weights file to train on new data?

@Denzelmon

Could you share a dataset (i.e. labeled images) to test the VGG16 network?

@srv902
srv902 commented Dec 6, 2016

In Keras, the built-in VGGNet-16 says the default input shape is 224x224x3. Can I change it to a different value?

@blythed
blythed commented Dec 6, 2016 edited

I think there is something up with the implementation. It gives reasonable results but doesn't agree (in the negative sense) with the Lasagne pretrained model (https://github.com/Lasagne/Recipes/blob/master/examples/ImageNet%20Pretrained%20Network%20(VGG_S).ipynb)

I tried on this image (a bus).

https://upload.wikimedia.org/wikipedia/commons/thumb/a/a8/Blue_Bird_Vision_Montevideo_54.jpg/250px-Blue_Bird_Vision_Montevideo_54.jpg

I got top5 of:

['crane', 'school bus', 'moving van', 'minibus', 'power drill']

But with the lasagne version I got:

['school bus', 'minibus', 'amphibian, amphibious vehicle', 'passenger car, coach, carriage', 'trolleybus, trolley coach, trackless trolley']

Can someone confirm this?
On the face of it the lasagne results seem more reasonable.

@pr3mar
pr3mar commented Dec 7, 2016

@GodOfProbability did you convert the VGG-M model for Keras? If so, could you please post a link?
Thanks

@Euphemiasama
Euphemiasama commented Dec 7, 2016 edited

Hello, I would like to know the difference between these weight files of VGG16 and VGG19 trained on ImageNet for Keras provided by @baraldilorenzo, and the ones provided by Keras here. Could somebody please explain?

@clu2033
clu2033 commented Dec 20, 2016

Where is the cat.jpg used for testing?

@adwin5
adwin5 commented Dec 22, 2016

@issey173 what GPU did you use? I think the memory error means the memory on the GPU is too small.

@adwin5
adwin5 commented Dec 22, 2016

@Ripppah The code loads the pre-trained weights from the file "vgg16_weights.h5" in the same directory as the code. You can download it from the link mentioned in the readme.md; here is the link again: vgg16_weights.h5

@adwin5
adwin5 commented Dec 22, 2016

@srv902
I think if you want to load the pre-trained weights into your VGG, you can't change the input size. If you don't need the pre-trained weights, you can set whatever value you want.

@mxbi
mxbi commented Dec 26, 2016 edited

I am also experiencing what @hamedf is seeing (Exception: The shape of the input to "Flatten" is not fully defined (got (0, 7, 512))). Theano backend, GPU. This bug occurs in every version of Keras 1.1.0+ and does not occur with any version prior to that (I downgraded to 1.0.8). Maybe there was a change in the API that breaks this model?

EDIT: This can be fixed in later versions of Keras by adding "image_dim_ordering": "th" in ~/.keras/keras.json. Hopefully this helps someone :)

@redeian
redeian commented Dec 28, 2016

@mxbi OMG you're a life saver!!

@kolachalama
kolachalama commented Jan 4, 2017 edited

I am trying to load the vgg16 pre-trained model in Keras but getting this IO Error. Can anyone please help?

[screenshot of the IOError]

@suresh-chinta
suresh-chinta commented Jan 16, 2017 edited

The issue below has been fixed by a re-upload of the weights file.

I am getting the same IOError as above; can someone please suggest a solution? Thanks.

/home/ubuntu/anaconda2/lib/python2.7/site-packages/h5py/_hl/files.pyc in make_fid(name, mode, userblock_size, fapl, fcpl, swmr)
90 if swmr and swmr_support:
91 flags |= h5f.ACC_SWMR_READ
---> 92 fid = h5f.open(name, flags, fapl=fapl)
93 elif mode == 'r+':
94 fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)

h5py/_objects.pyx in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/work/h5py/_objects.c:2696)()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/work/h5py/_objects.c:2654)()

h5py/h5f.pyx in h5py.h5f.open (/home/ilan/minonda/conda-bld/work/h5py/h5f.c:1942)()

IOError: Unable to open file (Truncated file: eof = 102580224, sblock->base_addr = 0, stored_eoa = 553479920)

@sounakdey

I don't think the above issue is fixed; I still get the same error with Keras 1.2.2. Please suggest a solution.

@dmarx
dmarx commented Feb 21, 2017

Experiencing this issue with 1.1.0; keras.json is already set per @mxbi's suggestion:

{
    "image_dim_ordering": "th",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "theano"
}
@parikshit95
parikshit95 commented Feb 25, 2017 edited

Hi, I want to apply the model described above to multiple images in a directory. While I have a way to loop through the images, I don't know how to run prediction on them and compute accuracy.
Help would be appreciated.

@cswwp
cswwp commented Mar 2, 2017

Hello, I used this code, but I have a bug in load_weights: "Negative dimension size caused by subtracting 2 from 1 for 'MaxPool_11' (op: 'MaxPool') with input shapes: [?,1,112,128]." I use the TensorFlow backend. How do I fix it?

@warchildmd

@cswwp just insert this code at the top.

from keras import backend as K

K.set_image_dim_ordering('th')
@meshiguge

Is there any reference paper for this visualization method? Thanks.

@emurina
emurina commented Mar 13, 2017

Is there a VGG version with weights for the TensorFlow backend? So far, I couldn't find anything...

@MarcoForte

@sergii-bond thanks for your code. I just want to let people know that in Keras 2 the code must be altered slightly: replace Convolution2D with Conv2D. Otherwise you get the error, ValueError: Layer weight shape (3, 3, 3, 64) not compatible with provided weight shape (64, 3, 3, 3)
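
For anyone porting the gist, a minimal sketch of the changed layer call in Keras 2 (only the first two layers shown; the remaining convolutions follow the same pattern):

    from keras.layers import Conv2D, ZeroPadding2D

    model.add(ZeroPadding2D((1, 1), input_shape=(3, 224, 224)))
    model.add(Conv2D(64, (3, 3), activation='relu'))  # kernel size is now a tuple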

@alexminnaar

I'm getting the error KeyError: "Can't open attribute (Can't locate attribute: 'layer_names')" on the line model = VGG_16('vgg16_weights.h5'). I'm using the TensorFlow backend.

@Banyutong

Same issue as @sounakdey:
ValueError: The shape of the input to "Flatten" is not fully defined (got (0, 7, 512).

@NerminSalem

nermin@Nermin:~/deep-learning-models$ python test_imagenet.py --image images/dog_beagle.png
Using TensorFlow backend.
[INFO] loading and preprocessing image...
[INFO] loading network...
Traceback (most recent call last):
File "test_imagenet.py", line 40, in
model = VGG16(weights="imagenet")
File "/home/nermin/deep-learning-models/vgg16.py", line 170, in VGG16
model.load_weights(weights_path)
File "/usr/local/lib/python2.7/dist-packages/Keras-2.0.2-py2.7.egg/keras/engine/topology.py", line 2489, in load_weights
f = h5py.File(filepath, mode='r')
File "/usr/lib/python2.7/dist-packages/h5py/_hl/files.py", line 272, in init
fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
File "/usr/lib/python2.7/dist-packages/h5py/_hl/files.py", line 92, in make_fid
fid = h5f.open(name, flags, fapl=fapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/build/h5py-nQFNYZ/h5py-2.6.0/h5py/_objects.c:2577)
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/build/h5py-nQFNYZ/h5py-2.6.0/h5py/_objects.c:2536)
File "h5py/h5f.pyx", line 76, in h5py.h5f.open (/build/h5py-nQFNYZ/h5py-2.6.0/h5py/h5f.c:1811)
IOError: Unable to open file (Truncated file: eof = 11853824, sblock->base_addr = 0, stored_eoa = 553467096)
I keep getting this error when I'm trying to run ImageNet classification with Python and Keras. I'm using the TensorFlow backend.
