Fine-tuning a Keras model. Updated to the Keras 2.0 API.
'''This script goes along the blog post
"Building powerful image classification models using very little data"
from blog.keras.io.
It uses data that can be downloaded at:
https://www.kaggle.com/c/dogs-vs-cats/data
In our setup, we:
- created a data/ folder
- created train/ and validation/ subfolders inside data/
- created cats/ and dogs/ subfolders inside train/ and validation/
- put the cat pictures index 0-999 in data/train/cats
- put the cat pictures index 1000-1400 in data/validation/cats
- put the dogs pictures index 12500-13499 in data/train/dogs
- put the dog pictures index 13500-13900 in data/validation/dogs
So that we have 1000 training examples for each class, and 400 validation examples for each class.
In summary, this is our directory structure:
```
data/
    train/
        dogs/
            dog001.jpg
            dog002.jpg
            ...
        cats/
            cat001.jpg
            cat002.jpg
            ...
    validation/
        dogs/
            dog001.jpg
            dog002.jpg
            ...
        cats/
            cat001.jpg
            cat002.jpg
            ...
```
'''
from keras import applications
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras.models import Sequential
from keras.layers import Dropout, Flatten, Dense

# path to the model weights files.
weights_path = '../keras/examples/vgg16_weights.h5'
top_model_weights_path = 'fc_model.h5'
# dimensions of our images.
img_width, img_height = 150, 150

train_data_dir = 'cats_and_dogs_small/train'
validation_data_dir = 'cats_and_dogs_small/validation'
nb_train_samples = 2000
nb_validation_samples = 800
epochs = 50
batch_size = 16

# build the VGG16 network
model = applications.VGG16(weights='imagenet', include_top=False)
print('Model loaded.')

# build a classifier model to put on top of the convolutional model
top_model = Sequential()
top_model.add(Flatten(input_shape=model.output_shape[1:]))
top_model.add(Dense(256, activation='relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(1, activation='sigmoid'))

# note that it is necessary to start with a fully-trained
# classifier, including the top classifier,
# in order to successfully do fine-tuning
top_model.load_weights(top_model_weights_path)

# add the model on top of the convolutional base
model.add(top_model)

# set the first 25 layers (up to the last conv block)
# to non-trainable (weights will not be updated)
for layer in model.layers[:25]:
    layer.trainable = False

# compile the model with a SGD/momentum optimizer
# and a very slow learning rate.
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])

# prepare data augmentation configuration
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')

# fine-tune the model
model.fit_generator(
    train_generator,
    samples_per_epoch=nb_train_samples,
    epochs=epochs,
    validation_data=validation_generator,
    nb_val_samples=nb_validation_samples)
SchumErik commented Jul 18, 2016

Great example, thanks for sharing, Francois! Aside from yurkor's comment regarding the proper VGG image pre-processing (which I agree with), I have a few questions about the general procedure:

  1. Would it not be better to train the top model on different samples than the ones you later use to fine-tune the model? In the cats/dogs example that would mean putting aside 10-20% as a validation set, then using 50% of the remainder to build the top model and the other 50% for fine-tuning. Any views?
  2. Do you have any insights regarding the best size of the top model when the final goal is classification into 10 classes rather than 2? I tried going back to the original VGG16 architecture, experimenting with two fully connected layers of 1024 and 512, but there still appears to be substantial overfitting. Any advice?
junfenglx commented Aug 27, 2016

@SchumErik

  1. I think that approach needs more data.
  2. Again, more data if you train a large network. You can use regularization to fight overfitting.

By the way, if you have a dataset the size of ImageNet, you can even train VGG16 from scratch and not use fine-tuning at all.

Arsey commented Sep 11, 2016

@SchumErik, about the second question: my experiments show that the top model from the original VGG16 works better than a model with decreased dimensions. Here I classify 102 classes: https://github.com/Arsey/keras-transfer-learning-for-oxford102

alinagithub commented Sep 27, 2016

Hi Francois, first of all, thanks for all this great material.
Quick question... is there a place from which we can download fc_model.h5?
Many thanks,
AL.

vishsangale commented Sep 29, 2016

@alinagithub - You have to train this model yourself before fine-tuning. There is another tutorial from Francois about how to generate that model.

bayraktare commented Oct 1, 2016

If we want to categorize 10 or 20 classes, is it enough to fine-tune with the same code, just adding the classes as sub-folders of the train and validation data?

@alinagithub

@vishsangale - many thanks!

AakashKumarNain commented Oct 1, 2016

I implemented the above model and everything worked fine. Now I have a sample of 1000 images in a folder named test_data which I want to predict on using the model.predict() method. I first generated the test features the same way we generated the train data above, as shown below:

generator = datagen.flow_from_directory(test_dir, batch_size=32, target_size=(img_rows, img_columns), classes=None, shuffle=False)

test_data_features = model.predict_generator(generator, 1000)
np.save(open('test_data_features.npy', 'wb'), test_data_features)
test_data = np.load(open('test_data_features.npy', 'rb'))

Now when I make predictions with model.predict(test_data), I get a numpy array of predictions. How can I tell what each prediction is representing? The predictions look like this:

array([[ 9.99770999e-01], [ 7.65574304e-03], [ 2.06944350e-07], [ 9.96615469e-01], [ 4.59789817e-07], [ 9.93980138e-05], [ 5.27667798e-05],..........)
How can I compare my predictions to my images?
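One way to line predictions up with their source images (a sketch, not from the thread: it assumes the generator was built with shuffle=False, and nb_test_samples stands in for your test-set size): a flow_from_directory generator exposes a filenames attribute listing files in the order batches are drawn, so you can zip it with the prediction array.

predictions = model.predict_generator(generator, nb_test_samples)
# generator.filenames is ordered the same way as the predictions
# because shuffle=False keeps the directory iteration order stable.
for fname, prob in zip(generator.filenames, predictions[:, 0]):
    label = 'dog' if prob > 0.5 else 'cat'  # assumes class indices {cats: 0, dogs: 1}
    print(fname, prob, label)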

anujshah1003 commented Oct 2, 2016

you can try model.predict_classes(test_data)

bayraktare commented Oct 2, 2016

In addition to yesterday's question: if we want to classify an object that does not belong to any of the categories in ImageNet, what do I have to do?

Arsey commented Oct 3, 2016

I'm trying to adapt this approach to multiclass classification. The first step, where we train the top model from bottleneck features, works fine and I get 76% accuracy for 102 classes. But when I proceed to the fine-tuning step, the model's accuracy decreases with each epoch. So if at the end of the 1st epoch of fine-tuning I have about 45% accuracy, on the 2nd epoch I get 40%, etc. Here's the repo with the code: https://github.com/Arsey/keras-oxford102. Can anyone help? I've been trying to find the issue for a few weeks but with no results :(

apapiu commented Oct 5, 2016

Great post, I am learning a lot. I do have a question about the top model. If I understand correctly, I need to have its weights properly tuned already before sticking it onto the convolutional layers. But how do I compute those weights to begin with?

To be more specific: in the line top_model_weights_path = 'fc_model.h5', how did you compute the weights for the top model?

CharlesNord commented Oct 6, 2016

@alinagithub Did you find the tutorial about how to train fc_model.h5? I can't find it.

Golly commented Oct 8, 2016

@Arsey Same problem with fine-tuning. I'm trying it now and the model's accuracy decreased from 87% to 50%.

Euphemiasama commented Oct 9, 2016

@apapiu @CharlesNord You can compute the weights for the 'fc_model.h5' file by running part 2 of this tutorial, entitled "Using the bottleneck features of a pre-trained network: 90% accuracy in a minute". It creates a file named 'bottleneck_fc_model.h5'; rename it to 'fc_model.h5' and run the code.
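For readers who don't have part 2 handy, here is a condensed sketch of that procedure (paraphrased from the tutorial referenced above, with the validation half omitted; paths and constants follow the script at the top, and details may differ from the original):

import numpy as np

# 1. run the conv-only VGG16 (`model`, include_top=False) once over the data
#    and cache its outputs ("bottleneck features")
datagen = ImageDataGenerator(rescale=1. / 255)
generator = datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode=None,   # data only, no labels
    shuffle=False)     # keep ordering so the labels below line up
bottleneck_features_train = model.predict_generator(generator, nb_train_samples)

# 2. train the small classifier on the cached features
#    (labels assume 1000 cats followed by 1000 dogs, as in the directory setup)
train_labels = np.array([0] * 1000 + [1] * 1000)

top_model = Sequential()
top_model.add(Flatten(input_shape=bottleneck_features_train.shape[1:]))
top_model.add(Dense(256, activation='relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(1, activation='sigmoid'))
top_model.compile(optimizer='rmsprop', loss='binary_crossentropy',
                  metrics=['accuracy'])
top_model.fit(bottleneck_features_train, train_labels,
              epochs=epochs, batch_size=batch_size)
top_model.save_weights('bottleneck_fc_model.h5')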

jamescfli commented Oct 10, 2016

@fchollet
In line 156 and 162, target_size=(img_height, img_width) -- which should go first, 'img_height' or 'img_width'?
In classifier_from_little_data_script_2.py it is in the reverse order. I am confused here.

CharlesNord commented Oct 13, 2016

@Euphemiasama Thank you very much. Another question: has anyone reached 95% accuracy using the fine-tuning method? Before fine-tuning my accuracy is about 90%, and after fine-tuning it reaches 92%; I cannot make the result any better.

aquibjaved commented Oct 20, 2016

How do I run predict_classes on a new image (one that is not in the dataset)? I am trying the code below to load the model and pass a new image through predict_classes, but I am getting an error:

import keras
from keras.models import load_model
from keras.models import Sequential
import cv2
import numpy as np
from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img

model = Sequential()

model = load_model('firstmodel.h5')
model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

img = cv2.imread('cat.jpg', 0).astype(np.float32) / 255
img = cv2.resize(img, (150, 150))
img = np.expand_dims(img, axis=0)
classes = model.predict_classes(img)
print classes

The error:

Using Theano backend.
Traceback (most recent call last):
  File "detect.py", line 28, in <module>
    classes = model.predict_classes(img)
  File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 780, in predict_classes
    proba = self.predict(x, batch_size=batch_size, verbose=verbose)
  File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 672, in predict
    return self.model.predict(x, batch_size=batch_size, verbose=verbose)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1174, in predict
    check_batch_dim=False)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 100, in standardize_input_data
    str(array.shape))
Exception: Error when checking : expected convolution2d_input_1 to have 4 dimensions, but got array with shape (1, 150, 150)
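The shape in the error is the clue: cv2.imread('cat.jpg', 0) reads the image as single-channel grayscale, so after expand_dims the array is (1, 150, 150) rather than the 4-D batch the model expects. A minimal sketch of a fix, assuming a Theano-style channels-first model input of (3, 150, 150):

img = cv2.imread('cat.jpg')                        # read in color: (H, W, 3)
img = cv2.resize(img, (150, 150)).astype(np.float32) / 255
img = img.transpose((2, 0, 1))                     # (150, 150, 3) -> (3, 150, 150)
img = np.expand_dims(img, axis=0)                  # add batch axis: (1, 3, 150, 150)
classes = model.predict_classes(img)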

nathmo commented Dec 14, 2016

Hi, I made a derived script from your code and get the following error.
I'm running TensorFlow 0.12.0-rc1 with Python 2.7 and have installed the latest version of Keras from git.
What have I done wrong?

Source code: http://pastebin.com/JEqcZe5x
Full error: http://pastebin.com/yEFVHWEK

"ValueError: Error when checking model input: expected convolution2d_input_1 to have shape (None, 3, 32, 32) but got array with shape (32, 32, 32, 3)"

Thanks in advance,

Nathann
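The expected shape (None, 3, 32, 32) is channels-first (Theano ordering) while the arrays are channels-last, so the model and the data disagree on dimension ordering. A minimal sketch of one way to reconcile them, assuming Keras 1.x where set_image_dim_ordering is available:

from keras import backend as K
K.set_image_dim_ordering('tf')  # channels-last, matching (32, 32, 3) arrays

(Equivalently, set "image_dim_ordering": "tf" in ~/.keras/keras.json.)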

Abusnina commented Jan 5, 2017

How can we use the modified pre-trained model to predict on test data? I compiled the model successfully, but when I try to predict on my test data I get the following error:

Error when checking : expected flatten_input_1 to have shape (None, 4, 4, 512) but got array with shape (1, 3, 150, 150)

Any advice on this?
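The expected shape (None, 4, 4, 512) is the conv base's output, so the error suggests raw images are being fed to the top classifier on its own rather than to the combined model. A sketch of the distinction, using hypothetical names base_model and top_model for the two halves:

# predict with the combined fine-tuned model (images in, probabilities out):
preds = model.predict(img_batch)            # img_batch shaped like the model input

# or, if only the top classifier is loaded, extract features first:
features = base_model.predict(img_batch)    # (n, 4, 4, 512)
preds = top_model.predict(features)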

austinchencym commented Jan 10, 2017

@Golly
@Arsey

Same problem: before fine-tuning my model for 5 classes reached 98% accuracy, but the first epoch of fine-tuning dropped to 20%. Did you, or does anyone, work it out for the multi-class problem? I guess we need more training data to feed the model.

varunagrawal commented Jan 12, 2017

Is the slice of :25 correct for setting the non-trainable parameter? If you do model.layers[:26] you see that the last layer is also a Conv layer.

KamalOthman commented Jan 27, 2017

Hi everyone,
I am following this example with GRAYSCALE images, which I converted from color using PIL's convert method. I expected the input_shape in the model to be (1, w, h), but I get this error:

Exception: Error when checking model input: expected convolution2d_input_1 to have shape (None, 1, 150, 150) but got array with shape (32, 3, 150, 150)

When I check the image shape, I get only (w, h). However, it still works with input_shape=(3, w, h). Isn't that strange?

My goal is to compare learning on gray images with learning on color images, but I do not know how to expand the dimension of gray images within flow_from_directory.

Can anyone help, please?

Thanks,
Kamal
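flow_from_directory has a color_mode argument (default 'rgb') that controls this, which is why converted images still come through with 3 channels. A minimal sketch, assuming the channels-first setup from the error above:

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    color_mode='grayscale',   # yields batches of shape (n, 1, 150, 150) here
    batch_size=batch_size,
    class_mode='binary')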

cobir commented Jan 30, 2017

Hi,

The input shape of the model is (3, img_width, img_height). Does this mean that we can only work with Theano as the backend when working with this pre-trained VGG16 model?

Thanks,
OC

austinchencym commented Jan 31, 2017

@cobir

I used the TensorFlow backend and it works fine.

biswagsingh commented Feb 2, 2017

I get the following error when fine-tuning an 8-class classification. Any ideas? Please help.

nb_val_samples=nb_validation_samples)

File "C:\Users\bgsingh\Anaconda2\lib\site-packages\keras\models.py", line 935, in fit_generator
    initial_epoch=initial_epoch)
File "C:\Users\bgsingh\Anaconda2\lib\site-packages\keras\engine\training.py", line 1553, in fit_generator
    class_weight=class_weight)
File "C:\Users\bgsingh\Anaconda2\lib\site-packages\keras\engine\training.py", line 1310, in train_on_batch
    check_batch_axis=True)
File "C:\Users\bgsingh\Anaconda2\lib\site-packages\keras\engine\training.py", line 1034, in _standardize_user_data
    exception_prefix='model target')
File "C:\Users\bgsingh\Anaconda2\lib\site-packages\keras\engine\training.py", line 124, in standardize_input_data
    str(array.shape))
ValueError: Error when checking model target: expected sequential_2 to have shape (None, 1) but got array with shape (32L, 8L)

Thanks,
Biswa
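The (None, 1) vs (32, 8) mismatch says the top model still ends in a single sigmoid unit while the generator is producing 8-way one-hot targets. A sketch of the usual multi-class changes (an assumption about the poster's setup, not a confirmed fix):

top_model.add(Dense(8, activation='softmax'))   # instead of Dense(1, activation='sigmoid')

model.compile(loss='categorical_crossentropy',  # instead of binary_crossentropy
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])

# ...and pass class_mode='categorical' in both flow_from_directory calls.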

a-ozbek commented Feb 5, 2017

Excuse me if this issue was brought up before about this script; I couldn't find a resolution to it in the comments.

In this script, during fine-tuning, the "train_generator" and "validation_generator" do not seem to do the VGG16 pre-processing, which is:

# 'RGB'->'BGR'
x = x[:, :, :, ::-1]
# Zero-center by mean pixel
x[:, :, :, 0] -= 103.939
x[:, :, :, 1] -= 116.779
x[:, :, :, 2] -= 123.68

Isn't it wrong to fine-tune VGG16 without this pre-processing step?

aidiary commented Feb 16, 2017

@a-ozbek

I have the same question.
I have experimented with and without this pre-processing and got a slightly better result without it:

without pre-processing => val_acc = 93.5% @ epoch 50
with pre-processing    => val_acc = 92.8% @ epoch 50

I would have thought this pre-processing is essential when using keras.applications.vgg16.
Does anyone know why?

# with pre-processing version
# Use Keras 1.2.2 for preprocessing_function

# x = 3D tensor version
def preprocess_input(x):
    # 'RGB'->'BGR'
    x = x[:, :, ::-1]
    # Zero-center by mean pixel
    x[:, :, 0] -= 103.939
    x[:, :, 1] -= 116.779
    x[:, :, 2] -= 123.68
    return x

train_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

test_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input)

jdelange commented Feb 25, 2017

@CharlesNord: same here, at best 91.75%. Also tried the VGG16 net included in keras\applications, but that gave similar results.

embanner commented Mar 5, 2017

See this notebook for an example of fine-tuning a keras.applications.vgg16.VGG16 that hooks keras.preprocessing.image.ImageDataGenerator together with keras.applications.vgg16.preprocess_input() for image preprocessing.

Note, I'm using the Theano backend.

Original gist at https://gist.github.com/embanner/6149bba89c174af3bfd69537b72bca74.

cswwp commented Mar 7, 2017

@biswatherockstar I have the same problem. Did you solve it?

Irtza commented Mar 15, 2017

The applications.VGG16 model is defined using the Functional API. When I try to model.add the instance returned by the VGG16 base class on ImageNet weights, I get an AttributeError saying keras.engine.training.Model has no attribute 'add'. Has there been a change, or am I missing something?

sampathweb commented Mar 15, 2017

@Irtza Yes, keras.applications.vgg16 uses the Functional API. You can only use the "add" method on a Sequential model. The Functional API is actually more flexible, since you can build out a graph. For an example of how to add your own layers on top, check out the notebook posted by @embanner a couple of posts above.

fwahhab89 commented Mar 17, 2017

I am trying to fine-tune VGG16 for the cats-vs-dogs dataset by exactly replicating the gist above, using TensorFlow as my backend, but I am getting an error at the line:

top_model.add(Flatten(input_shape=model.output_shape[1:]))

The error is:

ValueError: The shape of the input to "Flatten" is not fully defined (got (None, None, 512). Make sure to pass a complete "input_shape" or "batch_input_shape" argument to the first layer in your model.

I am already up to date with the Keras 2 API. Please help me out here.

rikenmehta03 commented Mar 17, 2017

While trying the above code, I get this error when adding the top layers:

AttributeError: 'Model' object has no attribute 'add'

telesphore commented Mar 18, 2017

@fwahhab89

I ran into a similar issue. I think you may need to specify the input shape for your VGG16 model. I'm not doing "cats and dogs" here, but for my case I did:

model = applications.VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

Remember the TensorFlow vs Theano dimension ordering.

kevinpatricksmith commented Mar 19, 2017

I am working through this set of three tutorials with TensorFlow 1.0 on GPU and Keras 2.

When I try to create the top_model and call Flatten as the first add step, I get the following error:

ValueError: The shape of the input to "Flatten" is not fully defined (got (None, None, 512). Make sure to pass a complete "input_shape" or "batch_input_shape" argument to the first layer in your model

When I fix that as per @fwahhab89 with input_shape=(150, 150, 3), I then get the same error as @rikenmehta03:

model.add(top_model)
AttributeError: 'Model' object has no attribute 'add'

A Sequential model has an add function, but VGG16 is based on Model.

Omegamon commented Mar 19, 2017

Got an error like this:

base_model.add(top_model)
AttributeError: 'Model' object has no attribute 'add'

kevinpatricksmith commented Mar 19, 2017

This "AttributeError: 'Model' object has no attribute 'add'" issue seems to be related to keras-team/keras#3465.

Their approach would work if we weren't trying to reload the top model weights from part 2 of the tutorial.

Should it perhaps be something like:

model = Model(input=model.input, output=top_model)

Omegamon commented Mar 20, 2017

@kevinpatricksmith
Thanks!
Fixed by this:

input_tensor = Input(shape=(150, 150, 3))
base_model = VGG16(weights='imagenet', include_top=False, input_tensor=input_tensor)
top_model = Sequential()
top_model.add(Flatten(input_shape=base_model.output_shape[1:]))
top_model.add(Dense(256, activation='relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(1, activation='sigmoid'))
top_model.load_weights('bottleneck_fc_model.h5')
model = Model(input=base_model.input, output=top_model(base_model.output))

hiroyachiba commented Mar 20, 2017

I guess this part needs to be updated for the Keras 2 API:

# fine-tune the model
model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size)

kevinpatricksmith commented Mar 21, 2017

@Omegamon and @hiroyachiba:
Thanks. I used both your suggestions and can now get training to complete.

I am now getting 92% training accuracy and almost 90% validation accuracy. However, it does not really improve much from epoch to epoch.

When I use model.summary() to print out the model, I see only one additional layer after the last set of VGG16 convolution/pooling layers. It is called sequential_1. I don't know if the top_model was added correctly:

model.summary()

Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 150, 150, 3)       0
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 150, 150, 64)      1792
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 150, 150, 64)      36928
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 75, 75, 64)        0
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 75, 75, 128)       73856
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 75, 75, 128)       147584
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 37, 37, 128)       0
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 37, 37, 256)       295168
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 37, 37, 256)       590080
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 37, 37, 256)       590080
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 18, 18, 256)       0
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 18, 18, 512)       1180160
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 18, 18, 512)       2359808
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 18, 18, 512)       2359808
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 9, 9, 512)         0
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 9, 9, 512)         2359808
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 9, 9, 512)         2359808
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 9, 9, 512)         2359808
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 4, 4, 512)         0
_________________________________________________________________
sequential_1 (Sequential)    (None, 1)                 2097665
=================================================================

shnayder commented Mar 21, 2017

This code, and https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html, say to freeze the first 25 layers, but applications.VGG16 doesn't have that many! Pretty sure you need to freeze the first 15, not 25. Here's my model.layers (including the top dense sequential model):

[<keras.engine.topology.InputLayer at 0x11aa49290>,
 <keras.layers.convolutional.Conv2D at 0x11aa49610>,
 <keras.layers.convolutional.Conv2D at 0x11a710490>,
 <keras.layers.pooling.MaxPooling2D at 0x11aa6c5d0>,
 <keras.layers.convolutional.Conv2D at 0x11aa49310>,
 <keras.layers.convolutional.Conv2D at 0x11aa78310>,
 <keras.layers.pooling.MaxPooling2D at 0x119c18fd0>,
 <keras.layers.convolutional.Conv2D at 0x119ca7fd0>,
 <keras.layers.convolutional.Conv2D at 0x11aa13510>,
 <keras.layers.convolutional.Conv2D at 0x11ac1da90>,
 <keras.layers.pooling.MaxPooling2D at 0x11b4b6f90>,
 <keras.layers.convolutional.Conv2D at 0x11b4c2c90>,
 <keras.layers.convolutional.Conv2D at 0x11b535990>,
 <keras.layers.convolutional.Conv2D at 0x11b518cd0>,
 <keras.layers.pooling.MaxPooling2D at 0x11b57dd50>,  # want to freeze up to and including this layer
 <keras.layers.convolutional.Conv2D at 0x11b5a9950>,
 <keras.layers.convolutional.Conv2D at 0x11b84f990>,
 <keras.layers.convolutional.Conv2D at 0x11b58af90>,
 <keras.layers.pooling.MaxPooling2D at 0x11bb59bd0>,
 <keras.models.Sequential at 0x1179c58d0>]

Omegamon commented Mar 21, 2017

@kevinpatricksmith
My model structure is the same as yours. I froze the first 15 layers with:

for layer in model.layers[:15]:
    layer.trainable = False

After 100 epochs I got 99.8% training accuracy and 93.75% validation accuracy. After epoch 50 the validation accuracy stabilized in the 93% ~ 94% range.

codemukul95 commented Mar 25, 2017

Hi! If we look at the VGG16 source code (https://github.com/fchollet/keras/blob/master/keras/applications/vgg16.py), the number of layers up to the last convolutional block is 18, so why is trainable set to False for the first 25 layers in the above code?

for layer in model.layers[:25]:
    layer.trainable = False

Please clarify ASAP.
Thanks in anticipation.

liangxiao05 commented Apr 1, 2017

for layer in model.layers[:25]
[:25] should be [:15]

block5_conv1 (Convolution2D) (None, 512, 9, 9)  2359808  block4_pool[0][0]
block5_conv2 (Convolution2D) (None, 512, 9, 9)  2359808  block5_conv1[0][0]
block5_conv3 (Convolution2D) (None, 512, 9, 9)  2359808  block5_conv2[0][0]
block5_pool (MaxPooling2D)   (None, 512, 4, 4)  0        block5_conv3[0][0]
sequential_1 (Sequential)    (None, 1)          2097665  block5_pool[0][0]

Total params: 16,812,353
Trainable params: 9,177,089  # 9177089 = 2359808*3 + 2097665
Non-trainable params: 7,635,264

BoldinovaNatalya commented Apr 4, 2017

@Abusnina How did you solve this problem?

Prakashvanapalli commented Apr 6, 2017

What if we want to use multi-gpu?

CP121 commented Apr 8, 2017

What is the Keras API 2.0 version of this:

model = Model(input=base_model.input, output=top_model(base_model.output))

This line produces the user warning:

UserWarning: Update your `Model` call to the Keras 2 API: `Model(outputs=Tensor("se..., inputs=Tensor("in...)`

lusob commented Apr 16, 2017

@JWGS1 Use plurals in the args:

model = Model(inputs=base_model.input, outputs=top_model(base_model.output))

bhavsarpratik commented Apr 18, 2017

It was quite a struggle getting this to run. Finally did it with your inputs @Omegamon @hiroyachiba @liangxiao05 :)

saulthu commented Apr 19, 2017

Working code (below), updated as per the comments to run with Keras 2.

from keras import applications
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras.models import Sequential
from keras.models import Model
from keras.layers import Dropout, Flatten, Dense

# path to the model weights files.
weights_path = '../keras/examples/vgg16_weights.h5'
top_model_weights_path = 'bottleneck_fc_model.h5'
# dimensions of our images.
img_width, img_height = 150, 150

train_data_dir = 'data/train'
validation_data_dir = 'data/validation'
nb_train_samples = 2000
nb_validation_samples = 800
epochs = 50
batch_size = 16

# build the VGG16 network
base_model = applications.VGG16(weights='imagenet', include_top=False, input_shape=(150, 150, 3))
print('Model loaded.')

# build a classifier model to put on top of the convolutional model
top_model = Sequential()
top_model.add(Flatten(input_shape=base_model.output_shape[1:]))
top_model.add(Dense(256, activation='relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(1, activation='sigmoid'))

# note that it is necessary to start with a fully-trained
# classifier, including the top classifier,
# in order to successfully do fine-tuning
top_model.load_weights(top_model_weights_path)

# add the model on top of the convolutional base
model = Model(inputs=base_model.input, outputs=top_model(base_model.output))

# set the first 15 layers (up to the last conv block)
# to non-trainable (weights will not be updated)
for layer in model.layers[:15]:
    layer.trainable = False

# compile the model with a SGD/momentum optimizer
# and a very slow learning rate.
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])

# prepare data augmentation configuration
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')

model.summary()

# fine-tune the model
model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size,
    verbose=2)

@bis-carbon


bis-carbon Apr 25, 2017

Can anyone tell me what the "predict_class" equivalent is in Keras 2.0.3?
model.predict(x) gives me an array of probabilities.

Thank you.


@akaashagarwal


akaashagarwal Apr 25, 2017

@bis-carbon Believe it's predict_classes now. If you use anaconda, you can take a look at models.py under ~/anaconda2/lib/python2.7/site-packages/keras/
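
For a functional Model like the fine-tuned one in this gist, there is no predict_classes method (it only exists on Sequential), so a minimal workaround is to threshold the sigmoid outputs yourself. A sketch, assuming a binary model with a single sigmoid unit and an input batch x:

import numpy as np

probas = model.predict(x)                  # shape (n_samples, 1), values in [0, 1]
classes = (probas > 0.5).astype('int32')   # 0/1 class labels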


@raaju-shiv


raaju-shiv Apr 27, 2017

Hi, I have a dimension mismatch error on running the code:
import numpy as np
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dropout, Flatten, Dense
from keras import applications
from keras import backend as K
K.set_image_dim_ordering('tf')
from keras import optimizers
from keras.models import Model

# path to the model weights files.
weights_path = '../keras/examples/vgg16_weights.h5'
top_model_weights_path = 'bottleneck_fc_model.h5'
img_width, img_height = 150, 150
train_data_dir = 'data4/train'
validation_data_dir = 'data4/validation'
nb_train_samples = 20
nb_validation_samples = 20
epochs = 5
batch_size = 1

base_model = applications.VGG16(weights='imagenet', include_top=False, input_shape=(150,150,3))
print('Model loaded.')

top_model = Sequential()
top_model.add(Flatten(input_shape=base_model.output_shape[1:]))
top_model.add(Dense(128, activation='relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(1, activation='sigmoid'))
top_model.load_weights('bottleneck_fc_model.h5')

model = Model(inputs=base_model.input, outputs=top_model(base_model.output))

for layer in model.layers[:15]:
    layer.trainable = False

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])

# prepare data augmentation configuration
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')
model.summary()

# fine-tune the model
model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size,
    verbose=2)

The error is: Dimension 0 in both shapes must be equal, but are 8192 and 115200 for 'Assign_734' (op: 'Assign') with input shapes: [8192,128], [115200,128].
The error occurs when I try to execute the line:
top_model.load_weights('bottleneck_fc_model.h5').

Kindly help.


@skoch9


skoch9 Apr 28, 2017

I'm running a slightly modified version of this example, which only fine-tunes the top layers (with Keras 2.0.3/TensorFlow on Ubuntu). It looks like the following:

img_width, img_height = 150, 150
train_data_dir = 'data/train_s'
validation_data_dir = 'data/val_s'
nb_train_samples = 2000
nb_validation_samples = 800
epochs = 10
batch_size = 16

base_model = applications.VGG16(weights='imagenet', include_top=False, input_shape=(img_width, img_height, 3))

top_model = Sequential()
top_model.add(Flatten(input_shape=base_model.output_shape[1:]))
top_model.add(Dense(256, activation='relu'))
top_model.add(Dense(1, activation='sigmoid'))

model = Model(inputs=base_model.input, outputs=top_model(base_model.output))
model.compile(loss='binary_crossentropy', optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])

train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary', shuffle=False)

model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size,
    verbose=2, workers=12)

score = model.evaluate_generator(validation_generator, nb_validation_samples/batch_size, workers=12)

scores = model.predict_generator(validation_generator, nb_validation_samples/batch_size, workers=12)

correct = 0
for i, n in enumerate(validation_generator.filenames):
    if n.startswith("cats") and scores[i][0] <= 0.5:
        correct += 1
    if n.startswith("dogs") and scores[i][0] > 0.5:
        correct += 1

print("Correct:", correct, " Total: ", len(validation_generator.filenames))
print("Loss: ", score[0], "Accuracy: ", score[1])

With this, I get arbitrary validation accuracy results. For example, predict_generator predicts 640 out of 800 (80%) classes correctly, whereas evaluate_generator produces an accuracy score of 95%. Someone in #3477 suggests removing the rescale=1. / 255 parameter from the validation generator; with that I get results of 365/800 = 45% and 89% from evaluate_generator.

Is there something wrong with my evaluation, or is this due to a bug? There are many similar issues (e.g. #3849, #6245) where the stated accuracy (during training and afterwards) doesn't match the actual predictions. Could someone experienced maybe shed some light on this problem?
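
One likely culprit, assuming Keras 2.0.x generator behaviour: after fit_generator has been drawing validation batches, validation_generator is left mid-epoch, so the order of predict_generator's outputs no longer lines up with validation_generator.filenames, and multiple workers can reorder batches further. A hedged sketch of the usual fix:

import numpy as np

validation_generator.reset()  # rewind so predictions line up with .filenames
steps = int(np.ceil(nb_validation_samples / float(batch_size)))
scores = model.predict_generator(validation_generator, steps, workers=1)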


@JinnyZhao


JinnyZhao May 1, 2017

@Golly
@Arsey
@austinchencym

Same problem: before fine-tuning, my model for 10 classes reached 68% accuracy, but in the first epoch of fine-tuning it dropped to 30%. Has anyone worked this out for the multi-class problem? Thanks!


@sakvaua


sakvaua May 3, 2017

Why do you guys rescale by 1. / 255 instead of de-meaning the images, when the pretrained VGG models require only de-meaning?
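
If you want the canonical VGG preprocessing instead of rescaling, ImageDataGenerator's preprocessing_function hook can apply it per image. A sketch, where vgg_preprocess is a hypothetical helper using the standard ImageNet channel means; note the top model would then also need to be (re)trained on features computed this way:

from keras.preprocessing.image import ImageDataGenerator

def vgg_preprocess(x):
    x = x[..., ::-1].astype('float32')  # RGB -> BGR (makes a copy, so in-place ops are safe)
    x[..., 0] -= 103.939                # ImageNet channel means, BGR order
    x[..., 1] -= 116.779
    x[..., 2] -= 123.68
    return x

train_datagen = ImageDataGenerator(
    preprocessing_function=vgg_preprocess,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)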


@chairath


chairath May 5, 2017

After fine-tuning, how can I extract features from the added layer (sequential_1 > dense)?
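
One common pattern is to build a second Model whose output is the intermediate tensor you want. A sketch, assuming the combined fine-tuned model from this gist (block5_pool is the last layer of the VGG16 base; x is a placeholder for a batch of preprocessed images). The Dense layers added here live inside a nested Sequential, so if you need their activations it is easiest to rebuild the top with the functional API and name the layers:

from keras.models import Model

feat_extractor = Model(inputs=model.input,
                       outputs=model.get_layer('block5_pool').output)
conv_features = feat_extractor.predict(x)  # shape (n, 4, 4, 512) for 150x150 inputs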


@jinopallickal


jinopallickal May 11, 2017

What is the equivalent of predict_classes in Keras 2.0?


@mahernadar


mahernadar May 12, 2017

Hello there :)

1- I would like to know (please) what changes are needed to make this work on Python 3.5.

2- Also, I would like to know why, in the code of tutorial 2 (for the bottleneck pre-trained features), we set the input shape to 'train_data.shape' inside the Flatten layer, whereas in this code we choose the input shape to be 'model.output_shape' (see the line below for the two particular pieces of code I am talking about):

"model.add(Flatten(input_shape=train_data.shape[1:]))" versus "top_model.add(Flatten(input_shape=model.output_shape[1:]))"

Thank you for your time :)


@mahernadar


mahernadar May 12, 2017

@saulthu thank you for sharing your working code :) u da man!!
And yeah, I don't understand why in the initially posted code all 24 layers were set to non-trainable when we wanted to keep the last conv block free.
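
For what it's worth, the 25 appears to be a holdover from the original hand-built VGG16 (which counted its ZeroPadding layers); with applications.VGG16 the last conv block starts at roughly index 15, which is why the working versions above freeze model.layers[:15]. An easy way to check where to cut, whatever the architecture:

for i, layer in enumerate(model.layers):
    print(i, layer.name, layer.trainable)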


@raaju-shiv


raaju-shiv May 12, 2017

@saulthu
@akaashagarwal
Hi guys,
What do you feel is the best way to fine-tune the Keras Xception network?

# set the first 100 layers
for layer in model.layers[:100]:
    layer.trainable = False

Can I do it like this, or do I have to begin the fine-tuning specifically where the exit_flow starts, and not in the entry and middle flows?
Looking forward...


@micklexqg


micklexqg May 17, 2017

@raaju-shiv, I ran into the same error: "Dimension 0 in both shapes must be equal, but are 25088 and 8192 for 'Assign_26'".
Have you solved it?
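
Errors like these are consistent with the saved top-model weights having been trained on bottleneck features of a different shape than the conv base now produces: with include_top=False, VGG16's output flattens to 7*7*512 = 25088 for 224x224 inputs but to 4*4*512 = 8192 for the 150x150 inputs used here. A quick sanity check (hedged; the exact shapes depend on your own setup):

print(base_model.output_shape)  # e.g. (None, 4, 4, 512) -> Flatten gives 4*4*512 = 8192

If that doesn't match what the weights file expects, regenerate the bottleneck features and retrain the top model with the same target_size used for fine-tuning.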


@jroberayalas


jroberayalas May 20, 2017

Quick questions:

  • Once you've trained your model and want to try it on new images, is there a fast way to convert them to the corresponding input dimensions, i.e. (150, 150, 3)? Can we use the image generator somehow to achieve this?
  • model.predict(x) computes the class for a given input x. Is there a way to get the probability of belonging to that class? (See the sketch below.)
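
For both questions, the keras.preprocessing.image helpers can resize a single file to the training dimensions, and with a sigmoid output model.predict already returns the probability of the positive class. A sketch ('test.jpg' is a placeholder path; the 1/255 rescaling mirrors the training generator):

import numpy as np
from keras.preprocessing import image

img = image.load_img('test.jpg', target_size=(150, 150))  # resize on load
x = image.img_to_array(img) / 255.
x = np.expand_dims(x, axis=0)                             # shape (1, 150, 150, 3)

proba = model.predict(x)[0][0]  # P(class 1); 1 - proba is P(class 0)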

@kai06046


kai06046 Jun 4, 2017

Hi there, can anyone tell me why starting with a fully-trained classifier is necessary?

top_model.load_weights(top_model_weights_path)


@oxydron


oxydron Jun 6, 2017

Line 75 doesn't work for application ResNet50. Any tips?


@Tchaikovic


Tchaikovic Jun 20, 2017

How do you add auc computation to this code?
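
Keras of this vintage has no built-in AUC metric, but you can compute it after training from the validation predictions with scikit-learn, provided the validation generator uses shuffle=False. A hedged sketch:

import numpy as np
from sklearn.metrics import roc_auc_score

validation_generator.reset()
steps = int(np.ceil(nb_validation_samples / float(batch_size)))
y_score = model.predict_generator(validation_generator, steps).ravel()
y_true = validation_generator.classes  # ground-truth labels, in directory order
print('AUC:', roc_auc_score(y_true, y_score))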


@HodaGH


HodaGH Jun 29, 2017

I really couldn't find any solution to get rid of this error: Error when checking : expected flatten_input_1 to have shape (None, 4, 4, 512) but got array with shape (1, 3, 150, 150). Any ideas? I'd appreciate any help :)


@val314159


val314159 Jul 3, 2017

I forked and updated this gist for Keras 2.0.5

I still get some warnings but, hey, they're not errors (yet!)

classifier_from_little_data_script_3.py:75: UserWarning: Update your 'Model' call to the Keras 2 API: 'Model(outputs=Tensor("se..., inputs=Tensor("in...)'
model = Model(input= base_model.input, output= top_model(base_model.output))
Model loaded.
Found 20000 images belonging to 2 classes.
Found 5000 images belonging to 2 classes.
classifier_from_little_data_script_3.py:117: UserWarning: Update your 'fit_generator' call to the Keras 2 API: 'fit_generator(<keras.pre..., validation_data=<keras.pre..., steps_per_epoch=125, epochs=50, validation_steps=800)' nb_val_samples=nb_validation_samples)


@radhakrishnancegit


radhakrishnancegit Jul 4, 2017

How do we create the 'bottleneck_fc_model.h5'?

Is that the same model we save by transfer-learning in Step 2 of the blog post? https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html


@mahernadar


mahernadar Jul 6, 2017

Dear @fchollet, @saulthu, (or others :) )

Could you please explain to me why we do not need the training and validation labels in this example?

Is it because the labeling is done automatically through the sub-folders within the database folder? If so, why did we use labeling in the tutorial on the bottleneck features?

Thanks


@mahernadar


@radhakrishnancegit yes it is the same :)

@ardianumam


ardianumam Jul 10, 2017

@hiroyachiba @kevinpatricksmith: For mine, setting "steps_per_epoch=nb_train_samples // batch_size" with nb_train_samples = 1998 and batch_size = 18, the number of steps per epoch comes out as 111 // 18 = 6 instead. When I set "steps_per_epoch=nb_train_samples", I get 111 steps per epoch. I don't know why, but we should expect 111 (from 1998 // 18) in this case, right? I use Keras 2.0.2 with TensorFlow 1.2.0.

nb_train_samples = 1998
nb_validation_samples = 800
epochs = 50
batch_size = 18
#......
#......
#......
# fine-tune the model
model.fit_generator(
    train_generator,
    samples_per_epoch=nb_train_samples,
    epochs=epochs,
    validation_data=validation_generator,
    nb_val_samples=nb_validation_samples)

@mahernadar


mahernadar Jul 17, 2017

@Golly, @Arsey and @austinchencym,

I am also experiencing the same issue. When I applied the bottleneck example to my 5-class project, I got respectable accuracies, around 90%.

But when I switch to fine-tuning the 15th layer and above (as this example prescribes), I start with an accuracy of around 30% in the 1st epoch, and it keeps going down as the epoch progresses.

Have any of you figured out the solution for this weird behavior?

Thanks :)
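
One thing worth ruling out for the multi-class reports above (a hedged sketch, not a confirmed fix): the binary head in this gist doesn't transfer to N classes, so the final layer, the loss, and the generators all need to switch to categorical mode, and the class ordering must match between the bottleneck run and fine-tuning (flow_from_directory sorts class subfolders alphabetically). For example, with 5 classes:

top_model.add(Dense(5, activation='softmax'))  # instead of Dense(1, activation='sigmoid')

model.compile(loss='categorical_crossentropy',
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical')  # instead of 'binary'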


@GitHubKay


GitHubKay Jul 19, 2017

Hello,

I'm struggling to get this code to run properly. I followed the suggestions and modified my code to reduce the errors step by step.
However, now I have no idea how to proceed. Any help would be really appreciated :)

Setup

  • Win10
  • Python 3.6
  • Keras 2.0.5/ Theano 0.9.0
  • tensorflow-gpu 1.2.0

Actual Code

from keras import applications
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras.models import Sequential
from keras.models import Model
from keras.layers import Dropout, Flatten, Dense

# path to the model weights files.
weights_path = 'vgg16_weights.h5'
top_model_weights_path = 'fc_model.h5'
# dimensions of our images.
img_width, img_height = 150, 150

train_data_dir = 'data/sample/train'
validation_data_dir = 'data/sample/valid'
nb_train_samples = 2000
nb_validation_samples = 800
epochs = 50
batch_size = 16

# build the VGG16 network
model = applications.VGG16(weights='imagenet', include_top=False, input_shape=(150,150,3))
#model = applications.VGG16(weights='imagenet', include_top=False)
print('Model loaded.')

# build a classifier model to put on top of the convolutional model
input_tensor = Input(shape=(150,150,3))
base_model = VGG16(weights='imagenet',include_top= False,input_tensor=input_tensor)
#top_model = Sequential()
#top_model.add(Flatten(input_shape=base_model.output_shape[1:]))
#top_model.add(Dense(256, activation='relu'))
#top_model.add(Dropout(0.5))
#top_model.add(Dense(1, activation='sigmoid'))
#top_model.load_weights('bootlneck_fc_model.h5')
#model = Model(input= base_model.input, output= top_model(base_model.output))
top_model = Sequential()
top_model.add(Flatten(input_shape=base_model.output_shape[1:]))
top_model.add(Dense(256, activation='relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(1, activation='sigmoid'))



# note that it is necessary to start with a fully-trained
# classifier, including the top classifier,
# in order to successfully do fine-tuning
top_model.load_weights(top_model_weights_path)

# add the model on top of the convolutional base
#model.add(top_model)
model = Model(inputs=base_model.input, outputs=top_model(base_model.output))

# set the first 15 layers (up to the last conv block)
# to non-trainable (weights will not be updated)
for layer in model.layers[:15]:
    layer.trainable = False

# compile the model with a SGD/momentum optimizer
# and a very slow learning rate.
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])

# prepare data augmentation configuration
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')

model.summary()

# fine-tune the model
#model.fit_generator(
#    train_generator,
#    samples_per_epoch=nb_train_samples,
#    epochs=epochs,
#    validation_data=validation_generator,
#    nb_val_samples=nb_validation_samples)

model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size, verbose=2)

Error

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-2-234711fb2433> in <module>()
     26 
     27 # build a classifier model to put on top of the convolutional model
---> 28 input_tensor = Input(shape=(150,150,3))
     29 base_model = VGG16(weights='imagenet',include_top= False,input_tensor=input_tensor)
     30 #top_model = Sequential()

NameError: name 'Input' is not defined

@GitHubKay


GitHubKay Jul 20, 2017

@skoch9 I tried your code now, but it seems like my GPU ran out of memory, even though it says:

  • The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
  • Error polling for event status: failed to query event: CUDA_ERROR_LAUNCH_TIMEOUT

The process starts and then breaks down, saying that Python has stopped working.

Using TensorFlow backend.
Found 2000 images belonging to 2 classes.
Found 800 images belonging to 2 classes.
Epoch 1/10

Here the full Report:

name: GeForce GTX 950
major: 5 minor: 2 memoryClockRate (GHz) 1.2785
pciBusID 0000:04:00.0
Total memory: 2.00GiB
Free memory: 1.64GiB
2017-07-20 10:24:50.315152: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:961] DMA: 0
2017-07-20 10:24:50.317399: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:971] 0:   Y
2017-07-20 10:24:50.317852: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 950, pci bus id: 0000:04:00.0)
2017-07-20 10:24:53.389471: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\common_runtime\bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.52GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
2017-07-20 10:24:53.500072: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\common_runtime\bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.14GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
2017-07-20 10:24:53.605247: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\common_runtime\bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.29GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
2017-07-20 10:24:53.696293: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\common_runtime\bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.10GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
2017-07-20 10:24:53.795242: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\common_runtime\bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.19GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
2017-07-20 10:24:53.911425: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\common_runtime\bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.10GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
2017-07-20 10:24:54.011991: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\common_runtime\bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.20GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
2017-07-20 10:24:54.012182: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\common_runtime\bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.14GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
2017-07-20 10:24:54.095455: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\common_runtime\bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.14GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
2017-07-20 10:24:54.574779: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\common_runtime\bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.14GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
[I 10:26:18.720 NotebookApp] Saving file at /Masterarbeit/KerasCNN/Different Finetuning.ipynb
2017-07-20 10:28:09.415272: E c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\stream_executor\cuda\cuda_event.cc:49] Error polling for event status: failed to query event: CUDA_ERROR_LAUNCH_TIMEOUT
2017-07-20 10:28:09.415378: F c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\common_runtime\gpu\gpu_event_mgr.cc:203] Unexpected Event status: 1
[I 10:28:31.838 NotebookApp] KernelRestarter: restarting kernel (1/5)
WARNING:root:kernel 2ccf2624-b385-4814-aa25-44a53a86dd02 restarted

Thx guys!


@fadam


fadam Aug 1, 2017

@GitHubKay

NameError: name 'Input' is not defined

Seems like you forgot to import Input, add this
from keras.models import Input
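
(If that import doesn't resolve on your Keras version, Input is also exported from keras.layers, so from keras.layers import Input should work as well.)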


@GitHubKay


GitHubKay Aug 5, 2017

@fadam
Yeah! I proceeded today with my files and now everything is running fine. After all these issues I couldn't spot the small ones anymore! Anyway, I split the code up into small batches within Jupyter and managed to run it successfully!
Thanks to everybody who shared their code here!!

Any suggestions on where I could find more tuning tips for little data? I have my own landscape pictures and would like to classify them, according to my own preferences, into bad and good pictures. But I only have 800 good ones.

Any help is appreciated!


@peachthiefmedia


peachthiefmedia Aug 10, 2017

@sakvaua
I'm trying to work that out as well. In my testing so far it looks like de-meaning doesn't help but rescaling does, though I'm not sure why that would be the case if these are the standard weights.


@S1M0N38


S1M0N38 Aug 10, 2017

I'm stuck on this problem

top_model = Sequential()
top_model.add(Flatten(input_shape=model.output_shape[1:]))
top_model.add(Dense(256, activation='relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(1, activation='sigmoid'))


ValueError Traceback (most recent call last)
in ()
1 top_model = Sequential()
----> 2 top_model.add(Flatten(input_shape=model.output_shape[1:]))
3 top_model.add(Dense(256, activation='relu'))
4 top_model.add(Dropout(0.5))
5 top_model.add(Dense(1, activation='sigmoid'))

/Users/Simo/.conda/envs/main/lib/python3.5/site-packages/keras/models.py in add(self, layer)
434 # and create the node connecting the current layer
435 # to the input layer we just created.
--> 436 layer(x)
437
438 if len(layer.inbound_nodes) != 1:

/Users/Simo/.conda/envs/main/lib/python3.5/site-packages/keras/engine/topology.py in call(self, inputs, **kwargs)
613 # Infering the output shape is only relevant for Theano.
614 if all([s is not None for s in _to_list(input_shape)]):
--> 615 output_shape = self.compute_output_shape(input_shape)
616 else:
617 if isinstance(input_shape, list):

/Users/Simo/.conda/envs/main/lib/python3.5/site-packages/keras/layers/core.py in compute_output_shape(self, input_shape)
475 raise ValueError('The shape of the input to "Flatten" '
476 'is not fully defined '
--> 477 '(got ' + str(input_shape[1:]) + '. '
478 'Make sure to pass a complete "input_shape" '
479 'or "batch_input_shape" argument to the first '

ValueError: The shape of the input to "Flatten" is not fully defined (got (None, None, 512). Make sure to pass a complete "input_shape" or "batch_input_shape" argument to the first layer in your model.


@edumucelli


edumucelli Aug 12, 2017

@S1M0N38 your problem is that you should set the input_shape parameter when using VGG16 with include_top=False. Take a look into the code here where it is explained.
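
Concretely, mirroring the working snippets earlier in the thread:

model = applications.VGG16(weights='imagenet', include_top=False, input_shape=(150, 150, 3))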


@edumucelli


edumucelli Aug 12, 2017

@skoch9 be aware that, since you are not including the top layers, you are replacing them with a much simpler layer than the original VGG16's. Just for comparison's sake, take the number of parameters from the summary of your model and from the original VGG16.

Your model's top is like:

block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         
_________________________________________________________________
sequential_2 (Sequential)    (None, 1)                 6423041   
=================================================================
Total params: 21,137,729
Trainable params: 21,137,729
Non-trainable params: 0

While original VGG16 is

block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         
_________________________________________________________________
flatten (Flatten)            (None, 25088)             0         
_________________________________________________________________
fc1 (Dense)                  (None, 4096)              102764544 
_________________________________________________________________
fc2 (Dense)                  (None, 4096)              16781312  
_________________________________________________________________
predictions (Dense)          (None, 1000)              4097000   
=================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0

You are not fine-tuning, but simplifying the model.


@Niladri365


Niladri365 Aug 15, 2017

Can I use callbacks (e.g. early stopping) with model.fit_generator as I have used with model.fit?
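
Yes: fit_generator accepts the same callbacks argument as fit. A sketch with early stopping (the patience value is arbitrary):

from keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor='val_loss', patience=3)
model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size,
    callbacks=[early_stop])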


@WangCharlie


WangCharlie Aug 16, 2017

Who can point out how to implement something like this:

data/
    train/
        dogs1/
            dog001.jpg
            dog002.jpg
            ...
        dogs2/
            dog001.jpg
            dog002.jpg
            ...
    validation/
        dogs/
            dog001.jpg
            dog002.jpg
            ...
        dogs2/
            dog001.jpg
            dog002.jpg
            ...

and set the name for dogs1 to dd and for dogs2 to ff;
then when I input an image, it will show a result like: this is a dog named dd&ff


@taozhiiq


taozhiiq Aug 21, 2017

Hi guys:
I have a question. When I train the model under Windows using TensorFlow as the backend, I get 89% accuracy for 2 classes. But when I train the model under Linux, the accuracy drops from 89% to 15%. Then, when I use the weights trained under Windows to fine-tune the model under Linux, I get 91% accuracy. It seems that 'save_weights' behaves differently between Windows and Linux, but I have no idea why.
Any advice on this?


@nimpy


nimpy Aug 21, 2017

@edumucelli are you sure @skoch9 isn't including the top layers?
I thought the same thing, so I used this code instead to create a model:

base_model = applications.VGG16(weights='imagenet', include_top=False, input_shape=input_shape)

x = Flatten()(base_model.output)
x = Dense(256, activation='relu')(x)
x = Dropout(0.5)(x)
predictions = Dense(1, activation = 'sigmoid')(x)

model = Model(input = base_model.input, output = predictions)

and when you look at the model.summary(), the last layers:

flatten_8 (Flatten)          (None, 25088)             0         
_________________________________________________________________
dense_15 (Dense)             (None, 256)               6422784   
_________________________________________________________________
dropout_8 (Dropout)          (None, 256)               0         
_________________________________________________________________
dense_16 (Dense)             (None, 1)                 257            

have the same number of parameters (in total) as in @skoch9 's model's last layer:

sequential_1 (Sequential)    (None, 1)                 6423041   

I believe it's the same thing, but please correct me if I'm wrong.

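The totals do line up; a quick check of the parameter counts shown above:

```python
dense_15 = 25088 * 256 + 256    # weights + biases = 6,422,784
dense_16 = 256 * 1 + 1          # = 257
assert dense_15 + dense_16 == 6423041   # sequential_1's total in @skoch9's summary
```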

all3xfx commented Sep 5, 2017

Thanks @saulthu

HyunJoonChoi Sep 6, 2017

Thanks for providing helpful source code like this. But I need your help now: I'm trying binary classification on the cats and dogs dataset. The code runs without any errors, but acc and loss do not improve at all. I need you guys' powerful brains.

Keras 2.0.7 (backend tensorflow)
tensorflow-gpu 1.2.1

dataset:

```
data/
    train/
        dogs/
            dog001.jpg
            dog002.jpg
            ...
        cats/
            cat001.jpg
            cat002.jpg
            ...
    validation/
        dogs/
            dog001.jpg
            dog002.jpg
            ...
        cats/
            cat001.jpg
            cat002.jpg
            ...
```

The code:

```python
from keras import applications
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras.models import Sequential, Model
from keras.layers import Dropout, Flatten, Dense

img_width, img_height, img_channel = 150, 150, 3

train_data_dir = './data/train'
validation_data_dir = './data/validation'
nb_train_samples = 2000
nb_validation_samples = 800
epochs = 50
batch_size = 16

# build the VGG16 network
base_model = applications.VGG16(weights='imagenet',
                                input_shape=(img_width, img_height, img_channel),
                                include_top=False)
print('Model loaded.')

# build a classifier model to put on top of the convolutional model
vgg16_output = base_model.output
flatten = Flatten(name='flatten')(vgg16_output)
dense = Dense(256, activation='relu', kernel_initializer='he_normal', name='fc1')(flatten)
dense = Dense(256, activation='relu', kernel_initializer='he_normal', name='fc2')(dense)
pred = Dense(units=1, activation='softmax', kernel_initializer='he_normal', name='prediction')(dense)

new_model = Model(inputs=base_model.input, outputs=pred)
new_model.summary()

# set the base layers to non-trainable (weights will not be updated)
for layer in new_model.layers:
    if layer.name in ['fc1', 'fc2', 'prediction']:
        continue
    layer.trainable = False

# compile the model with a SGD/momentum optimizer
# and a very slow learning rate.
new_model.compile(loss='binary_crossentropy',
                  optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
                  metrics=['accuracy'])
new_model.summary()

# prepare data augmentation configuration
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')

# fine-tune the model
new_model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size)
```

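One likely culprit in the code above, offered as a guess: Dense(units=1, activation='softmax') applies softmax over a single unit, which always outputs 1.0 regardless of the input, so the predictions can never change and neither acc nor loss can improve. A one-line sketch of the fix for a binary problem trained with binary_crossentropy:

```python
# sigmoid instead of softmax for a single-unit binary output
pred = Dense(units=1, activation='sigmoid',
             kernel_initializer='he_normal', name='prediction')(dense)
```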

rabeehkarimi Sep 19, 2017

Hi, I cannot download the data. Can someone help me, please? Thanks.


pebbleshx Sep 28, 2017

Hi, just want to share that the number of layers to freeze really is 25, not 15.
It is not as simple as counting the layers listed in model.summary(); please look at the VGG16 at https://gist.github.com/baraldilorenzo/07d7802847aaad0a35d3 and count the layers (pay attention to those ZeroPadding2D layers).

Thanks and have a nice day! Ciao.


SecretSquirrel123 Oct 22, 2017

@taozhiiq I see the same results: 89% accuracy using saved weights with Keras 2.0 and the TF backend on Windows.
Something is wrong.


jennyluciav Oct 23, 2017

Hi everyone. After running this part of the tutorial, I get this error:

```
Traceback (most recent call last):
  File "dogandcats6.py", line 34, in <module>
    top_model.load_weights(top_model_weights_path)
  File "C:\Users\USUARIO\Anaconda3\envs\tensorflow\lib\site-packages\keras\models.py", line 719, in load_weights
    topology.load_weights_from_hdf5_group(f, layers)
  File "C:\Users\USUARIO\Anaconda3\envs\tensorflow\lib\site-packages\keras\engine\topology.py", line 3095, in load_weights_from_hdf5_group
    K.batch_set_value(weight_value_tuples)
  File "C:\Users\USUARIO\Anaconda3\envs\tensorflow\lib\site-packages\keras\backend\tensorflow_backend.py", line 2188, in batch_set_value
    assign_op = x.assign(assign_placeholder)
  File "C:\Users\USUARIO\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\variables.py", line 522, in assign
    return state_ops.assign(self._variable, value, use_locking=use_locking)
  File "C:\Users\USUARIO\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\gen_state_ops.py", line 47, in assign
    use_locking=use_locking, name=name)
  File "C:\Users\USUARIO\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "C:\Users\USUARIO\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\ops.py", line 2329, in create_op
    set_shapes_for_outputs(ret)
  File "C:\Users\USUARIO\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\ops.py", line 1717, in set_shapes_for_outputs
    shapes = shape_func(op)
  File "C:\Users\USUARIO\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\ops.py", line 1667, in call_with_requiring
    return call_cpp_shape_fn(op, require_shape_fn=True)
  File "C:\Users\USUARIO\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\common_shapes.py", line 610, in call_cpp_shape_fn
    debug_python_shape_fn, require_shape_fn)
  File "C:\Users\USUARIO\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\common_shapes.py", line 676, in _call_cpp_shape_fn_impl
    raise ValueError(err.message)
ValueError: Dimension 1 in both shapes must be equal, but are 1 and 2 for 'Assign_28' (op: 'Assign') with input shapes: [256,1], [256,2].
```

This is the code:

```python
from keras import applications
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras.models import Sequential
from keras.layers import Dropout, Flatten, Dense

weights_path = '../keras/examples/vgg16_weights.h5'
top_model_weights_path = 'bottleneck_fc_model.h5'

img_width, img_height = 350, 350

train_data_dir = 'data/train'
validation_data_dir = 'data/validation'
nb_train_samples = 100
nb_validation_samples = 20
epochs = 50
batch_size = 16

model = applications.VGG16(weights='imagenet', include_top=False, input_shape=(350, 350, 3))
print('Model loaded.')

top_model = Sequential()
top_model.add(Flatten(input_shape=model.output_shape[1:]))
top_model.add(Dense(256, activation='relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(1, activation='sigmoid'))

top_model.load_weights(top_model_weights_path)

model.add(top_model)

for layer in model.layers[:25]:
    layer.trainable = False

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])

train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')

model.fit_generator(
    train_generator,
    samples_per_epoch=nb_train_samples,
    epochs=epochs,
    validation_data=validation_generator,
    nb_val_samples=nb_validation_samples)
```

Please, help me!

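A hedged reading of that error: the [256, 2] shape suggests bottleneck_fc_model.h5 was saved from a top model whose final layer was Dense(2) (e.g. trained with categorical labels), while this script rebuilds the top with Dense(1, activation='sigmoid'). load_weights only works when the architecture matches the model that saved the weights, so the top must be rebuilt exactly as it was in script 2. A sketch, assuming the saved top really did end in a 2-unit softmax:

```python
# rebuild the same top that was saved, then load the weights
top_model = Sequential()
top_model.add(Flatten(input_shape=model.output_shape[1:]))
top_model.add(Dense(256, activation='relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(2, activation='softmax'))   # matches the [256, 2] weight shape
top_model.load_weights(top_model_weights_path)
```

With a 2-unit softmax top, the loss should be categorical_crossentropy and the generators need class_mode='categorical'; alternatively, retrain script 2's top with a single sigmoid unit to match this script.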

MathiasKahlen Nov 5, 2017

@GitHubKay How did you fix the memory problem? I think I'm getting the same error. Jupyter says "Dead kernel" when I'm running the last part: model.fit_generator(...)


mandliya Nov 7, 2017

Getting the exact same error as @MathiasKahlen . Any help please.


kavousanos Nov 12, 2017

@saulthu Excellent! Works like a charm.


khankuan Nov 13, 2017

@fchollet thanks for the example! I've trained and saved a model with 1 epoch. I intend to load the model back up to train for the 2nd epoch. Any tips on how I would approach it with finetuning? Cheers :)

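A minimal sketch of one way to resume: save the full model (architecture, weights, and optimizer state) after the first run, then reload it and continue, letting initial_epoch pick up the counter. Filenames and epoch numbers are illustrative:

```python
from keras.models import load_model

model.save('finetune_epoch1.h5')            # after the first epoch

# later, in a new session:
model = load_model('finetune_epoch1.h5')    # restores weights and optimizer state
model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=2,                # run up to epoch 2 ...
    initial_epoch=1,         # ... starting after epoch 1
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size)
```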

DaniRuiz92 Nov 17, 2017

@pebbleshx thanks for your comment, but I'm not sure what you say is correct. If you look at that code, the VGG16 is composed of fewer than 25 layers; I suppose it is possible to use different implementations of VGG16.
In fact, when you load the VGG16 model from keras.applications with model = applications.VGG16(weights='imagenet', include_top=False, input_shape=(img_width, img_height, 3)) and look at the structure in model.layers, you will find only 20 layers. When you set the flag trainable = False, you do it on the layers in model.layers, so if you choose to freeze 25 layers you will freeze all of them. You can check this by calling model.summary() and looking at the number of trainable weights.

That's only my interpretation of how the freezing should be done.

Probably @fchollet can clarify this.


Moondra Nov 19, 2017

I'm running this script for the first time, and it's downloading the VGG weights or model structure (I'm assuming). It's taking forever. Is this normal behavior?

Also, these lines:

weights_path = '../keras/examples/vgg16_weights.h5'
top_model_weights_path = 'fc_model.h5'

Are these the same as these weights:
vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5

If so, I have those weights in a different directory; should I change the weights_path?


richajain44 Nov 21, 2017

I did follow the tutorial and was successfully able to create updated weights. I save the model and weights with:

```python
model_json = model.to_json()
with open('model_def_gen1.1.json', 'w') as json_file:
    json_file.write(model_json)
model.save_weights(top_model_weights_path)
```

However, when I predict I get this error: You are trying to load a weight file containing 14 layers into a model with 2 layers.
Below is my code for prediction:

```python
# reading the json file created above
with open('model_def_gen1.json') as json_file:
    model_json = json_file.read()
mymodel = keras.models.model_from_json(model_json)
# loading the weights created at the end
mymodel.load_weights('bottleneck_fc_model1.h5')

img_array1 = []
# path to my test images directory
img_path_test = "//test-images//*.jpg"
for img1 in glob.glob(img_path_test):
    img_array1.append(img1)

img_w = 150
img_h = 150
preds_gender_img = []
# creating my base model for extracting features
base_model = applications.VGG16(weights='imagenet', include_top=False)
# accessing each image in my test folder
for i2 in img_array1:
    l = load_img(i2, target_size=(img_w, img_h))
    x_test = image.img_to_array(l)
    x_test = np.expand_dims(x_test, axis=0)
    x_test = x_test / 255
    # extracting features using vgg16
    features = base_model.predict(x_test)
    # using my weights to predict the class; I am doing binary classification
    preds_gender_img.append(mymodel.predict_classes(features))
```

Can anyone please tell me what I am doing wrong? I followed another GitHub link to predict classes: keras-team/keras#6408
@KamalOthman can you please tell me what is wrong here?

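A hedged reading of that error: the JSON architecture and the .h5 weights appear to come from different models (the JSON describes a 2-layer model while the weight file holds 14 layers), and load_weights only works when the model you rebuild matches the model that called save_weights. The safest pattern is to keep architecture and weights together (the filename is illustrative):

```python
from keras.models import load_model

# at training time: one file holds architecture + weights + optimizer state
model.save('model_and_weights.h5')

# at prediction time: no JSON needed, so no mismatch is possible
mymodel = load_model('model_and_weights.h5')
```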

richajain44 Nov 21, 2017

@AakashKumarNain can you tell me how did you load the model and the weights


Ksen17 Nov 21, 2017

After performing this tutorial, accuracy falls from 0.7089 down to 0.0195.
Can someone help me with this problem?
Also, I want to make the input shape (150, 150, 3) to convert this model to CoreML.

Any help would be great.

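For the fixed input shape, one sketch mirroring the applications API used elsewhere in this thread: passing input_shape gives the network concrete dimensions instead of (None, None, 3), which converters such as the CoreML tools need to see:

```python
from keras import applications

base_model = applications.VGG16(weights='imagenet',
                                include_top=False,
                                input_shape=(150, 150, 3))
```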

savinay Nov 26, 2017

Can anybody please tell me how to predict the class for a new image? I have a multiclass problem and I am running the following code:

```python
model = load_model('my_model.h5')

model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

img = array(Image.open('yalefaces/subject01.glasses'))
img = cv2.resize(img, (150, 150), interpolation=cv2.INTER_AREA)
img = np.reshape(img, [1, 150, 150, 3])
classes = model.predict_classes(img)
print classes
```

But I get the following error:
ValueError: cannot reshape array of size 22500 into shape (1,150,150,3)

Any help would be appreciated.



chiragyeole Dec 2, 2017

@Golly
@Arsey
@austinchencym
@JinnyZhao
@mahernadar

For bottleneck features, accuracy is ~90%, but after fine-tuning it's 50%.
Were you guys able to figure out what the issue was?


MathiasKahlen Dec 15, 2017

Is it just me, or is weights_path never used? Does this cause the confusion about freezing 15 vs 25 layers? When I print all layers in the VGG16 model, there aren't actually 25. The keras.applications implementation seems to have fewer layers than the implementation in this code: https://gist.github.com/baraldilorenzo/07d7802847aaad0a35d3. I'm truly confused.


MathiasKahlen Dec 15, 2017

@nimpy If you're doing that, don't you then skip the bottleneck and top_model training parts and go directly to the fine-tuning part? And if so, what is the difference between the two methods?


trejkaz Dec 23, 2017

class_mode in script 3 is 'binary' but in script 2 it's None. Is that right, or are they supposed to match?


trejkaz Dec 23, 2017

@savinay I hit that problem myself and figured out that the image was only a single channel. i.e., it was a 150,150,1 image, but what I wanted was a 150,150,3 image. I fixed it by re-saving it in another format.

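A sketch of the same fix done in code rather than by re-saving the file (PIL's convert replicates the single channel into three; the path is the one from the question above):

```python
from PIL import Image
import numpy as np

img = Image.open('yalefaces/subject01.glasses')
img = img.convert('RGB')                     # 1 channel -> 3 channels
img = img.resize((150, 150))
x = np.asarray(img).reshape(1, 150, 150, 3)  # now 150*150*3 values, as required
```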

PaulTonData Dec 24, 2017

@MathiasKahlen yeah, you're right, weights_path is never used. Instead, as you say, in the most recent code the VGG16 model is loaded via the keras.applications implementation. For that implementation, you need to freeze only 15 layers to leave the last convolution block tunable. The implementation from the link is for an older version of Keras, and that one had 25 layers before the last convolution block.

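A quick way to check this on whatever VGG16 build is loaded (a sketch; `model` is the applications.VGG16 instance from the script):

```python
# list every layer with its index; for keras.applications VGG16 with
# include_top=False, block5_conv1 sits at index 15
for i, layer in enumerate(model.layers):
    print(i, layer.name)

# freeze everything before the last conv block
for layer in model.layers[:15]:
    layer.trainable = False
```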

PaulTonData Dec 24, 2017

@Golly
@Arsey
@austinchencym
@JinnyZhao
@mahernadar
@chiragyeole

For me, the issue was with the SGD hyper-parameters. Both the learning rate and the momentum were too high, and as a result, the cost function was blowing up. For the tutorial model, you can try:

optimizer=optimizers.SGD(lr=1e-5, momentum=0.5)
or
optimizer=optimizers.SGD(lr=5e-6, momentum=0.5)

That got me to ~96% (the bottleneck model I used was already at 95% using @yurkor's suggestion).
For a multi-class model, you'd probably have to find a lr and momentum specific to the problem, although tuning is painfully time-consuming (it took me ~10 min. per epoch).


saksham789 Jan 2, 2018

I am solving a dog breed classification problem which has 120 classes in all. The training data has 6000 examples and the validation data has 3000 examples. I have tried the technique above, i.e. training the top layers on the features obtained by passing the training and validation sets through the bottom layers of VGG16:

```python
def VGG16(Input_shape, classes):
    X_Input = Input(Input_shape)
    X = Flatten()(X_Input)

    X = Dense(4096, activation='relu')(X)
    X = Dropout(.5)(X)
    X = Dense(4096, activation='relu')(X)
    X = Dropout(.5)(X)
    X = Dense(classes, activation='softmax')(X)

    model = Model(inputs=X_Input, outputs=X)
    return model
```

After 150 epochs:
Training accuracy: 99.98%
Test accuracy: 39%
and the loss would not decrease any further.
Even after fine-tuning with an L2 penalty there was a huge variance.
I am trying data augmentation but can't figure out how to do it for a multiclass problem, because I can't save the labels the way it is shown in the tutorial (using data augmentation).

Please help!
Have already spent weeks on this.
Thanks in advance.

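For the multiclass-labels part, a sketch: with class_mode='categorical', flow_from_directory yields one-hot labels on the fly, so nothing has to be saved by hand the way the bottleneck tutorial does. The paths, sizes, and the trained `model` here are illustrative assumptions:

```python
from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

train_generator = train_datagen.flow_from_directory(
    'data/train',                 # one subfolder per breed (120 of them)
    target_size=(150, 150),
    batch_size=32,
    class_mode='categorical')     # one-hot labels for 120 classes, generated on the fly

model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
model.fit_generator(train_generator,
                    steps_per_epoch=6000 // 32,
                    epochs=10)
```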

sachsom Jan 4, 2018

I am doing this for different images. I ran the second script to get the h5 file, and now when I run the 3rd (fine-tuning) script I get this error:

ValueError: Dimension 1 in both shapes must be equal, but are 1 and 2 for 'Assign_366' (op: 'Assign') with input shapes: [256,1], [256,2].


junkwhinger Jan 7, 2018

@aidiary

I found this page while looking for an answer to whether I should use the original preprocessing function when fine-tuning VGG16.
I made two models: one with rescaling (divided by 255.0) and one with your preprocessing function.
The bottom line is that the model with the preprocessing function did not converge at all, while the one with rescaling worked perfectly. But I still don't understand what causes this difference.

Anyway, thanks for your input.


hylaarborea Jan 9, 2018

I have implemented the model above, actually following the update of @saulthu .
Everything works fine with vgg16/vgg19 but I have problems with InceptionV3/Xception/InceptionResNetV2; the problem is subtle.

Let's start from the results of classifier_from_little_data_script_2.py (I consider in this example InceptionV3). Here you can see the results for loss/acc for the last six epochs:

loss: 0.0503 - acc: 0.9875 - val_loss: 0.2732 - val_acc: 0.9000
loss: 0.0401 - acc: 0.9913 - val_loss: 0.2899 - val_acc: 0.9050
loss: 0.0374 - acc: 0.9925 - val_loss: 0.2932 - val_acc: 0.9150
loss: 0.0322 - acc: 0.9950 - val_loss: 0.2888 - val_acc: 0.9000
loss: 0.0331 - acc: 0.9950 - val_loss: 0.2854 - val_acc: 0.8950
loss: 0.0301 - acc: 0.9963 - val_loss: 0.2896 - val_acc: 0.9050

We see from here that acc for train is very high (0.99, clearly overfitting), while acc for validation is more reasonable ~0.90. So everything works well.

I have created the file bottleneck_fc_model_inceptionv3.h5 which I will use in the script classifier_from_little_data_script_3.py

I now run script_3 but freeze ALL layers, so that the number of trainable params is zero. At this point I would expect no change in acc (I should see basically the same numbers I got from script_2 for the last epoch). This is exactly what I see using vgg16/vgg19 (acc for validation really is constant across epochs; for train it fluctuates a bit, which I think is because the images are shuffled into different batches each time, so even with unchanged weights the measured acc can vary).

(Note that when I use script_3 I expect acc ~0.90 for both train and validation, because the images have been shuffled with respect to script_2, so I cannot expect overfitting now.)

This is what I see for the first epoch:

1/16 [>.............................] - ETA: 43s - loss: 0.7263 - acc: 0.6000
2/16 [==>...........................] - ETA: 35s - loss: 0.7661 - acc: 0.5600
3/16 [====>.........................] - ETA: 31s - loss: 0.7433 - acc: 0.5800
4/16 [======>.......................] - ETA: 28s - loss: 0.7649 - acc: 0.5800
5/16 [========>.....................] - ETA: 25s - loss: 0.7589 - acc: 0.5880
6/16 [==========>...................] - ETA: 22s - loss: 0.7724 - acc: 0.5867
7/16 [============>.................] - ETA: 20s - loss: 0.7962 - acc: 0.5800
8/16 [==============>...............] - ETA: 18s - loss: 0.7927 - acc: 0.5800
9/16 [===============>..............] - ETA: 15s - loss: 0.7749 - acc: 0.5844
10/16 [=================>............] - ETA: 13s - loss: 0.7716 - acc: 0.5760
11/16 [===================>..........] - ETA: 11s - loss: 0.7814 - acc: 0.5655
12/16 [=====================>........] - ETA: 8s - loss: 0.7876 - acc: 0.5600
13/16 [=======================>......] - ETA: 6s - loss: 0.7810 - acc: 0.5600
14/16 [=========================>....] - ETA: 4s - loss: 0.7855 - acc: 0.5571
15/16 [===========================>..] - ETA: 2s - loss: 0.7882 - acc: 0.5560
16/16 [==============================] - 46s 3s/step - loss: 0.8105 - acc: 0.5450 - val_loss: 0.3043 - val_acc: 0.8850

and for the followings:

loss: 0.8105 - acc: 0.5450 - val_loss: 0.3043 - val_acc: 0.8850
loss: 0.8119 - acc: 0.5488 - val_loss: 0.3432 - val_acc: 0.8500
loss: 0.8019 - acc: 0.5563 - val_loss: 0.3779 - val_acc: 0.8350
loss: 0.8021 - acc: 0.5350 - val_loss: 0.4651 - val_acc: 0.7900
loss: 0.8203 - acc: 0.5588 - val_loss: 0.5278 - val_acc: 0.7500
loss: 0.8301 - acc: 0.5600 - val_loss: 0.5778 - val_acc: 0.6950
loss: 0.8162 - acc: 0.5363 - val_loss: 0.6122 - val_acc: 0.6750
loss: 0.8075 - acc: 0.5750 - val_loss: 0.6514 - val_acc: 0.6450
loss: 0.8428 - acc: 0.5350 - val_loss: 0.6832 - val_acc: 0.6400
loss: 0.8187 - acc: 0.5500 - val_loss: 0.7096 - val_acc: 0.6150

As you can see, something strange is happening:

  1. acc for train changes abruptly from ~0.90 to 0.54
  2. acc for validation slowly goes down, but starts from 0.885, which is very close to ~0.90
  3. acc for validation is changing, but (as I saw for vgg16/vgg19) it should be constant!

I have the impression that the model is using the correct weights for validation but the wrong ones for train: this would explain why acc falls abruptly for train and drifts slowly down for validation. I am not able to see the problem in the code. Can someone understand what is going on?

Here the code I am using (my version of keras is 2.1.2):

from keras import applications
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras.models import Sequential, Model
from keras.layers import Dropout, Flatten, Dense


# path to the model weights files.
top_model_weights_path = 'bottleneck_fc_model_inceptionv3.h5'
# dimensions of our images.
img_width, img_height = 139, 139


train_data_dir = './keras/data_test/train'
validation_data_dir = './keras/data_test/validation'
nb_train_samples = 800
nb_validation_samples = 200
epochs = 10
batch_size = 50

# build the network
base_model = applications.InceptionV3(weights='imagenet', include_top=False, input_shape=(img_width, img_height, 3))
print('Model loaded.')

# build a classifier model to put on top of the convolutional model
top_model = Sequential()
top_model.add(Flatten(input_shape=base_model.output_shape[1:]))
top_model.add(Dense(256, activation='relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(1, activation='sigmoid'))

# note that it is necessary to start with a fully-trained
# classifier, including the top classifier,
# in order to successfully do fine-tuning
top_model.load_weights(top_model_weights_path)

# add the model on top of the convolutional base
model = Model(inputs=base_model.input, outputs=top_model(base_model.output))


# set all layers (312) to non-trainable (weights will not be updated)
for layer in model.layers[:312]:
    layer.trainable = False


# incptv3: 312 -> 0 params
#          311 -> 4,719,105 params



ii=0
for layer in model.layers:
    ii+=1
    print('ii: ',ii,', layer.trainable',layer.trainable)

# compile the model with a SGD/momentum optimizer
# and a very slow learning rate.
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])

# info about the layers
for i, layer in enumerate(base_model.layers):
    print(i, layer.name)

# more info about the layers
model.summary()

# prepare data augmentation configuration
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)



test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')

# fine-tune the model
history=model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size)

print('acc : ', history.history['acc'] )
print('loss: ', history.history['loss'] )

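A plausible explanation, hedged because nothing in this thread confirms it: InceptionV3/Xception/InceptionResNetV2 contain BatchNormalization layers, while VGG16/19 contain none. In Keras around version 2.1.2, a BatchNormalization layer with trainable=False still normalizes with the current mini-batch statistics during training and keeps updating its moving averages; validation uses the (drifting) moving averages. That would produce exactly this pattern: train acc collapses immediately, because train-time normalization no longer matches what the top model saw, while val acc starts near 0.90 and slowly erodes as the moving averages drift. A quick check that the architectural difference is real:

```python
from keras import applications

base_model = applications.InceptionV3(weights='imagenet', include_top=False,
                                      input_shape=(139, 139, 3))
# VGG16/19 report 0 here; InceptionV3 reports dozens of BatchNormalization layers
n_bn = sum(type(layer).__name__ == 'BatchNormalization'
           for layer in base_model.layers)
print(n_bn)
```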
iliya-hajjar Jan 12, 2018

Hi,
Is there any way to import images from directories without augmentation?
I augmented my data beforehand, and now I want to import the images and train on them without augmentation.

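Yes: an ImageDataGenerator with only rescaling applies no augmentation, so each image is read from disk unchanged. A minimal sketch (the path mirrors the tutorial's layout):

```python
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1. / 255)   # rescaling only, no augmentation

train_generator = datagen.flow_from_directory(
    'data/train',
    target_size=(150, 150),
    batch_size=16,
    class_mode='binary')
```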

Saumya7 Mar 5, 2018

@raaju-shiv @micklexqg Did you find any solution for the error you guys are stuck at? I'm also facing the same issue.


artusi-a Mar 19, 2018

Hi, thanks a lot for your code. I have a couple of questions that may clarify the fine-tuning process for me:

1. weights_path = '../keras/examples/vgg16_weights.h5'
   This path is not used anywhere in the code, or at least I do not see it used. What is the purpose of having it?

2. "note that it is necessary to start with a fully-trained classifier, including the top classifier, in order to successfully do fine-tuning" / top_model.load_weights(top_model_weights_path)
   The weights of the top layer have been pre-trained, right? So I also need to pre-train the top layer first? How exactly does this work in the end? Could you give a clearer example?

Thanks


tushar-ghule Apr 9, 2018

top_model_weights_path = 'fc_model.h5'

What is the significance of 'fc_model.h5', and where is it generated from?


zzaibi Apr 10, 2018

Are the top-model weights 'fc_model.h5' here the same as the 'bottleneck_fc_model.h5' produced by script_2?


philippHRO commented Apr 12, 2018

@zzaibi yes

ssetty Apr 15, 2018

Hi, I am using transfer learning with the Keras Xception model. The loss is about 0.2-0.5. How do I decrease the loss value? Thanks


fnando1995 Apr 20, 2018

Hello, quick question...
I don't see where the "cats_and_dogs_small" dir is used... or at least I have not read anything about this directory. Can someone tell me if this is one created with images different from the ones used for training and validation in script1.py?

Thanks


KennethYCK Apr 29, 2018

Sorry, I got an error like this:
The shape of the input to "Flatten" is not fully defined (got (None, None, 512)). Make sure to pass a complete "input_shape" or "batch_input_shape" argument to the first layer in your model.

Also, I cannot find the weight file '../keras/examples/vgg16_weights.h5' to download.
Thanks


JinwenJay May 5, 2018

Why did I get the same val_acc in every epoch:
Epoch 1/50
125/125 [==============================] - 14s 109ms/step - loss: 0.5215 - acc: 0.9285 - val_loss: 1.0519 - val_acc: 0.8838
Epoch 2/50
125/125 [==============================] - 13s 101ms/step - loss: 0.5790 - acc: 0.9245 - val_loss: 1.0519 - val_acc: 0.8838
Epoch 3/50
125/125 [==============================] - 13s 102ms/step - loss: 0.5965 - acc: 0.9265 - val_loss: 1.0519 - val_acc: 0.8838
Epoch 4/50
125/125 [==============================] - 13s 101ms/step - loss: 0.6562 - acc: 0.9135 - val_loss: 1.0519 - val_acc: 0.8838
Epoch 5/50
125/125 [==============================] - 13s 102ms/step - loss: 0.5102 - acc: 0.9315 - val_loss: 1.0519 - val_acc: 0.8838
Epoch 6/50
125/125 [==============================] - 13s 101ms/step - loss: 0.6308 - acc: 0.9305 - val_loss: 1.0519 - val_acc: 0.8838
Epoch 7/50
125/125 [==============================] - 13s 101ms/step - loss: 0.6134 - acc: 0.9230 - val_loss: 1.0519 - val_acc: 0.8838
Epoch 8/50
125/125 [==============================] - 12s 100ms/step - loss: 0.6208 - acc: 0.9190 - val_loss: 1.0519 - val_acc: 0.8838
Epoch 9/50
125/125 [==============================] - 13s 100ms/step - loss: 0.5764 - acc: 0.9295 - val_loss: 1.0519 - val_acc: 0.8838
Epoch 10/50
125/125 [==============================] - 13s 101ms/step - loss: 0.6356 - acc: 0.9185 - val_loss: 1.0519 - val_acc: 0.8838


@KennethYCK

KennethYCK May 8, 2018

Sorry, not sure this is a good place to ask:
with classifier_from_little_data_script_2.py I can get around 90% accuracy,
but if I follow script 3 for fine-tuning, the accuracy is only 53%. I don't know why.

Also, if I later want to add one more class, do I need to train again from the beginning?
Thanks

@michaelscheinfeild

michaelscheinfeild Jun 1, 2018

Hi, I'm new here.
If I do the transfer learning, I want to save the model after training has completed
and then load it in real time in another app or on a mobile phone.
Can you explain how?
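
A minimal sketch of one way to do this (the filename is illustrative): save the whole model after fine-tuning, then reload it in the other application.

```
# Saves architecture, weights, and optimizer state in one HDF5 file.
model.save('fine_tuned_vgg16.h5')

# In the other application:
from keras.models import load_model
model = load_model('fine_tuned_vgg16.h5')
```

For mobile deployment the saved model usually needs a further conversion step (for example to TensorFlow Lite or Core ML), which is outside the scope of this script.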

@gitskp

gitskp Jun 2, 2018

How do I download fc_model.h5? Please tell me.

@gitskp

gitskp Jun 2, 2018

Please upload the fc_model.h5

@naren142

naren142 Jun 26, 2018

@aquibjaved use the expand_dims function to add the extra (batch) dimension to the input.
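
A minimal sketch of that suggestion, assuming a single 150x150 RGB image: the model expects a 4-D batch, so the image needs a leading batch axis.

```
import numpy as np

img = np.random.rand(150, 150, 3).astype('float32')  # placeholder image array
batch = np.expand_dims(img, axis=0)                   # shape becomes (1, 150, 150, 3)
# model.predict(batch) now receives the 4-D input Keras expects
```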

@rahulkulhalli

rahulkulhalli Jun 28, 2018

I've got a small query. While performing transfer learning in order to train ONLY the bottleneck layers, how many epochs should you train it for?

The example given here says 'train for a few epochs'. Can anyone give me a general rule of thumb?
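
For what it's worth, there is no universal epoch count; a common rule of thumb is to set an upper bound and let validation loss stop training early. A sketch using Keras's EarlyStopping callback with the generators defined above (the patience value is illustrative):

```
from keras.callbacks import EarlyStopping

# Stop once val_loss has not improved for 5 consecutive epochs.
early_stop = EarlyStopping(monitor='val_loss', patience=5)

model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=50,  # an upper bound; early stopping usually ends sooner
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size,
    callbacks=[early_stop])
```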

@aafmhh

aafmhh Jul 3, 2018

Hi,
I'm running the code at https://github.com/imatge-upc/detection-2016-nipsws/commits/master
I installed Keras 2.0.2 and Theano 0.9.0 with (Anaconda 3) Python 3.5, coding in PyCharm on Windows 10,
but I'm getting this error:

```
  File "C:/Users/heram/PycharmProjects/Hirarchical obj detec/scripts/image_zooms_training.py", line 78, in
    model_vgg = obtain_compiled_vgg_16(path_vgg)
  File "C:\Users\heram\PycharmProjects\Hirarchical obj detec\scripts\features.py", line 251, in obtain_compiled_vgg_16
    model = vgg_16(vgg_weights_path)
  File "C:\Users\heram\PycharmProjects\Hirarchical obj detec\scripts\features.py", line 295, in vgg_16
    model.add(Flatten())
  File "C:\Users\heram\Anaconda3\envs\Hirarchical obj detec\lib\site-packages\keras\models.py", line 455, in add
    output_tensor = layer(self.outputs[0])
  File "C:\Users\heram\Anaconda3\envs\Hirarchical obj detec\lib\site-packages\keras\engine\topology.py", line 559, in call
    output_shape = self.compute_output_shape(input_shape)
  File "C:\Users\heram\Anaconda3\envs\Hirarchical obj detec\lib\site-packages\keras\layers\core.py", line 488, in compute_output_shape
    '(got ' + str(input_shape[1:]) + '. '
ValueError: The shape of the input to "Flatten" is not fully defined (got (0, 7, 512). Make sure to pass a complete "input_shape" or "batch_input_shape" argument to the first layer in your model.
```

The code, in image_zooms_training.py:

```
model_vgg = obtain_compiled_vgg_16(path_vgg)
```

and in features.py:

```
def obtain_compiled_vgg_16(vgg_weights_path):
    model = vgg_16(vgg_weights_path)
    sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
    model.compile(optimizer=sgd, loss='categorical_crossentropy')
    return model

def vgg_16(weights_path=None):
    model = Sequential()
    model.add(ZeroPadding2D((1, 1), input_shape=(3, 224, 224)))
    model.add(Conv2D(64, (3, 3), activation="relu"))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(64, (3, 3), activation="relu"))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(128, (3, 3), activation="relu"))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(128, (3, 3), activation="relu"))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(256, (3, 3), activation="relu"))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(256, (3, 3), activation="relu"))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(256, (3, 3), activation="relu"))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(512, (3, 3), activation="relu"))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(512, (3, 3), activation="relu"))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(512, (3, 3), activation="relu"))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(512, (3, 3), activation="relu"))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(512, (3, 3), activation="relu"))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Conv2D(512, (3, 3), activation="relu"))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(Flatten())
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1000, activation='softmax'))

    if weights_path:
        model.load_weights(weights_path)

    return model
```
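
For reference, that model definition assumes Theano-style channels-first tensors (input_shape=(3, 224, 224)), but under a channels-last image_data_format Keras reads the 3 as the image height, which the successive pooling layers collapse to zero, hence the (0, 7, 512) shape. A likely fix, sketched here without testing against that repository, is to switch Keras to channels-first before building the model:

```
from keras import backend as K

# The VGG definition above uses (channels, height, width) tensors,
# so Keras must be told to use the Theano dimension ordering.
K.set_image_data_format('channels_first')
```

The same setting can be made permanent with "image_data_format": "channels_first" in ~/.keras/keras.json.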

@francescopelizza-omega

francescopelizza-omega Jul 18, 2018

Hello there,

So when the training is concluded and I am happy enough... how can I then use the trained model to predict new, unknown pictures of cats/dogs?

Thanks
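
A minimal prediction sketch under this script's setup (the file name and threshold are illustrative; flow_from_directory assigns class indices alphabetically, so cats = 0 and dogs = 1):

```
import numpy as np
from keras.preprocessing import image

img = image.load_img('new_picture.jpg', target_size=(150, 150))
x = image.img_to_array(img) / 255.      # same rescaling as during training
x = np.expand_dims(x, axis=0)           # add the batch dimension
prob = model.predict(x)[0][0]           # sigmoid output in [0, 1]
print('dog' if prob > 0.5 else 'cat')
```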
