'''This script goes along with the blog post
"Building powerful image classification models using very little data"
from blog.keras.io.
It uses data that can be downloaded at:
https://www.kaggle.com/c/dogs-vs-cats/data
In our setup, we:
- created a data/ folder
- created train/ and validation/ subfolders inside data/
- created cats/ and dogs/ subfolders inside train/ and validation/
- put the cat pictures index 0-999 in data/train/cats
- put the cat pictures index 1000-1400 in data/validation/cats
- put the dog pictures index 12500-13499 in data/train/dogs
- put the dog pictures index 13500-13900 in data/validation/dogs
So that we have 1000 training examples for each class, and 400 validation examples for each class.
In summary, this is our directory structure:
```
data/
    train/
        dogs/
            dog001.jpg
            dog002.jpg
            ...
        cats/
            cat001.jpg
            cat002.jpg
            ...
    validation/
        dogs/
            dog001.jpg
            dog002.jpg
            ...
        cats/
            cat001.jpg
            cat002.jpg
            ...
```
'''
import numpy as np
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dropout, Flatten, Dense
from keras import applications

# dimensions of our images.
img_width, img_height = 150, 150

top_model_weights_path = 'bottleneck_fc_model.h5'
train_data_dir = 'data/train'
validation_data_dir = 'data/validation'
nb_train_samples = 2000
nb_validation_samples = 800
epochs = 50
batch_size = 16


def save_bottlebeck_features():
    datagen = ImageDataGenerator(rescale=1. / 255)

    # build the VGG16 network (convolutional base only, ImageNet weights)
    model = applications.VGG16(include_top=False, weights='imagenet')

    # shuffle=False keeps the images in directory order (all cats, then all
    # dogs), which is what the hard-coded labels in train_top_model() assume
    generator = datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)
    bottleneck_features_train = model.predict_generator(
        generator, nb_train_samples // batch_size)
    # binary mode ('wb') is required: text mode breaks on Python 3
    # and corrupts the file on Windows
    with open('bottleneck_features_train.npy', 'wb') as f:
        np.save(f, bottleneck_features_train)

    generator = datagen.flow_from_directory(
        validation_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)
    bottleneck_features_validation = model.predict_generator(
        generator, nb_validation_samples // batch_size)
    with open('bottleneck_features_validation.npy', 'wb') as f:
        np.save(f, bottleneck_features_validation)


def train_top_model():
    # read the saved features back in binary mode ('rb')
    with open('bottleneck_features_train.npy', 'rb') as f:
        train_data = np.load(f)
    # the features are in directory order: first half cats (0), second half
    # dogs (1); "//" keeps the counts integers on Python 3
    train_labels = np.array(
        [0] * (nb_train_samples // 2) + [1] * (nb_train_samples // 2))

    with open('bottleneck_features_validation.npy', 'rb') as f:
        validation_data = np.load(f)
    validation_labels = np.array(
        [0] * (nb_validation_samples // 2) + [1] * (nb_validation_samples // 2))

    # small fully-connected classifier trained on the bottleneck features
    model = Sequential()
    model.add(Flatten(input_shape=train_data.shape[1:]))
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1, activation='sigmoid'))

    model.compile(optimizer='rmsprop',
                  loss='binary_crossentropy', metrics=['accuracy'])

    model.fit(train_data, train_labels,
              epochs=epochs,
              batch_size=batch_size,
              validation_data=(validation_data, validation_labels))
    model.save_weights(top_model_weights_path)


save_bottlebeck_features()
train_top_model()
I get an error at the point where it starts
It seems to suggest that the numpy array saved in the |
How do I get the labels when training on more than two categories? Is it possible to infer the labels from the generator?
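One possible answer, as a sketch: with `shuffle=False`, `flow_from_directory` reads the class subfolders in alphabetical order, and the iterator it returns exposes `classes` and `class_indices`, so the labels do not have to be hard-coded. The code below reproduces that ordering with the standard library only. `labels_from_directory` is a hypothetical helper (not part of Keras), and it assumes that every file inside a class folder is an image.

```python
import os
import tempfile


def labels_from_directory(root):
    """Return (labels, class_indices) for a folder with one subfolder per class.

    Hypothetical helper: classes are indexed in sorted order of folder name,
    and every file inside a class folder counts as one sample.
    """
    class_names = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d)))
    class_indices = {name: i for i, name in enumerate(class_names)}
    labels = []
    for name in class_names:
        n_files = len(os.listdir(os.path.join(root, name)))
        labels.extend([class_indices[name]] * n_files)
    return labels, class_indices


# Tiny demo tree with three classes: 2 birds, 3 cats, 1 dog.
demo_root = tempfile.mkdtemp()
for name, count in [('cats', 3), ('dogs', 1), ('birds', 2)]:
    os.makedirs(os.path.join(demo_root, name))
    for i in range(count):
        open(os.path.join(demo_root, name, '%s%03d.jpg' % (name, i)), 'w').close()

demo_labels, demo_indices = labels_from_directory(demo_root)
print(demo_indices)
print(demo_labels)
```

Because the labels follow the actual file counts, this also works when the classes have different sizes.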
I'm getting an error in the line More generally, I get a similar unexpected keyword error if I uncomment any of the commented lines in the following, which is taken from the
I'm using Python 2.7 with the TensorFlow backend. I'd appreciate any help. (Francois, these gists and your blog posts have been really helpful to me, thanks for taking the time to share them!)
@jdelange To solve the ValueError
If you are on Windows, you must open the file in binary mode when saving a NumPy array, so just add the character 'b' to the mode (for example 'wb' instead of 'w'). This odd Python behavior is not cross-platform.
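A minimal sketch of the binary-mode fix described above, using a small array in place of the real bottleneck features. The temporary path and the file name are assumptions made for the demo.

```python
import os
import tempfile

import numpy as np

features = np.arange(12, dtype=np.float32).reshape(3, 4)
path = os.path.join(tempfile.mkdtemp(), 'bottleneck_features_demo.npy')

# 'wb' / 'rb' give a binary handle on every platform and on Python 3.
with open(path, 'wb') as f:
    np.save(f, features)
with open(path, 'rb') as f:
    restored = np.load(f)

print(np.array_equal(features, restored))
```

The same two mode strings apply to the `np.save` and `np.load` calls in the script.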
It's awesome! Thank you so much.
Ideally we want to use the original VGG image preprocessing mean and no scaling.
It gives a final accuracy of 95 instead of 90. Epoch 3/10
I really love this example. Thank You Very Much! |
How do I use this model to predict a classification?
weights_path = 'vgg16_weights.h5'
def VGG16_bottlebeck_features():
if __name__ == '__main__':
And the error: HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
I really love this simple example, but I have a problem now. Can anyone tell me how to make a prediction? Here is what I did.
# load json and create model
json_file = open('model_vgg.json', 'r')
# load weights into new model
assert os.path.exists(weights_path), 'Model weights not found (see "weights_path" variable in script).'
for k in range(f.attrs['nb_layers']):
# load json and create model
json_file = open('model_catdog.json', 'r')
# load weights into new model
model2.load_weights("bottleneck_fc_model.h5")
img = cv2.imread("dog.1000.jpg")
feature = model1.predict(img)
I have done exactly what this code does. By the end of the 50th epoch I am getting high training accuracy but ≈ 50% validation accuracy; in fact, the validation accuracy hovers around 50% throughout. See the last 3 epochs, with their unchanging val_loss and val_acc. Epoch 48/50 Facts -
What am I doing wrong?
I also wonder how to make predictions. But I got an error:
Question: how do I use this for multiclass classification? In order to apply it to a multiclass problem, I changed the loss function from 'binary_crossentropy' to 'categorical_crossentropy', but the accuracy could only reach 50% or so. I used 700 images for training 69 classes.
@srikar2097, did you find out a solution? Facing the same issue. |
I carefully followed the procedure given both on the blog and here. Unfortunately, I am getting the following error, which I couldn't find a way to solve:
@paho67 I am also facing the same issue |
@paho67 @AakashKumarNain When I do the prediction, it looks like this:
I am not sure this is 100% correct since I am not getting accurate predictions, but I am getting a prediction nonetheless. |
@srikar2097, I found the reason for that issue. flow_from_directory reads from one folder at a time, and since there are more than 2000 images in the first folder (13k for me), it only takes images from one folder/class, while the labels still pretend that the val data is half from one class and half from the other. See line 145.
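A small sketch of why that produces roughly 50% validation accuracy. It assumes two classes, an unshuffled generator, and the hard-coded "first half 0, second half 1" labels from the script. `label_mismatch_rate` is a hypothetical helper written for this illustration.

```python
def label_mismatch_rate(images_in_first_class, nb_samples):
    """Fraction of assumed labels that are wrong when the generator is read
    in directory order (shuffle=False) and only nb_samples images are used."""
    # Labels in the order the unshuffled generator yields them
    # (everything after the first class is treated as class 1).
    n_first = min(images_in_first_class, nb_samples)
    actual = [0] * n_first + [1] * (nb_samples - n_first)
    # Labels the script assumes: first half class 0, second half class 1.
    half = nb_samples // 2
    assumed = [0] * half + [1] * (nb_samples - half)
    wrong = sum(a != b for a, b in zip(actual, assumed))
    return wrong / float(nb_samples)


too_many_images = label_mismatch_rate(13000, 2000)  # folder holds 13k images
matching_setup = label_mismatch_rate(1000, 2000)    # folder holds exactly 1000
print(too_many_images, matching_setup)
```

When the first folder holds far more images than the script assumes, half of the labels are attached to the wrong class, so the classifier cannot do better than chance on them.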
I'm getting the following error for line 79:
Hi everybody! |
@jessicaowensby I don't think this is a solution for my problem. I wonder how to change this so that I could give it a 150x150 image (an array with shape (1, 3, 150, 150))?
@bayraktare Thanks for this fix! |
@srikar2097 and @shackenberg So the specific way to fix this is to enter the correct number in nb_train_samples and nb_validation_samples. @bayraktare Thanks--that is a better way than what I'm doing which is modifying every file with:
FINAL NOTE: it appears that numpy has changed its interface so that np.save now takes a filename. The np.save appears to add '.npy' to the end (whereas load does not), so instead of opening the file, you have to use this interface: I was getting: |
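To illustrate the file-name behavior mentioned in the final note, here is a short sketch. The array contents and the temporary path are made up for the demo; only the `.npy` handling is the point.

```python
import os
import tempfile

import numpy as np

features = np.zeros((2, 4, 4, 512), dtype=np.float32)
base = os.path.join(tempfile.mkdtemp(), 'bottleneck_features_demo')

np.save(base, features)          # no extension given ...
saved_path = base + '.npy'       # ... so NumPy appends '.npy' itself
print(os.path.exists(saved_path))

restored = np.load(saved_path)   # np.load needs the full file name
print(restored.shape)
```

Passing file names instead of `open(...)` handles also sidesteps the text-mode versus binary-mode problem entirely.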
I'm just going from fixing one bug to the next and not getting the code to work. With the installation etc. I am easily at 20 hours of work now (Vagrant on Mac), installing one package after the other and fixing things. I don't think the code is fixable with Theano; I am now going to install TensorFlow, but I am growing tired. Would it be possible to make a VM which just runs and has a complete set of everything? I would be willing to work on that. I have a PhD in AI, and I'm 60 years old, but this situation is more for a 15-year-old with time on his hands and knowledge of Python version issues.
Here is the error I am now getting |
@edmundronald: I think you have an installation problem. I use Ubuntu 16.04, Python 3.5 and TensorFlow, and the only modification I had to make was changing from "tf" to "th" in ~/.keras/keras.json
@xcmax (or anyone who attempted applying the code for multiple classes) did you figure out how to make it work for multi-class? Thanks |
I am getting the following error Using TensorFlow backend. This seems to be a numpy error not a keras error. My settings are exactly like the example except the number of training and validation examples. Has anyone else encountered this before? |
@ManasHardas I am having exactly the same issue on my own dataset. Did you find a solution? The problem seems to be related to What I don't understand is why the original code was supposed to work like this, since |
Problem solved! Both my |
@tlind I am still getting error. Using TensorFlow backend. |
@davidpickup I think the errors might be due to a TensorFlow/Theano issue. It took me a while, but I put together a notebook that runs all the code with a TensorFlow backend. It's available here: https://github.com/rajshah4/image_keras
I resolved --- TypeError: write() argument must be str, not bytes --- by following @daveselinger suggestion. Create model without .npy extension For prediction I am doing the following But getting this error @rajivshah4 - were you able to load the classifier layer weights and actually predict the class for a single test image? |
Don't you have to pass the test images through the same generator if you want to predict on them, like the train and validation images? That way you get the (None, 512L, 2L, 2L) shape. Like:
generator = datagen.flow(
test = np.load('bottleneck_features_test.npy')
predictions = model.predict(test, verbose=1)
@MayorPain - I created the generator and sent its output through the convolutional layers as follows: generator = datagen.flow_from_directory( My question is that when this generator searches the test folder, it asks for a subfolder and counts that subfolder as a class. My program runs smoothly, but the prediction accuracy is really low. I trained on the entire cats-and-dogs dataset for 50 epochs, and the accuracy on the training and validation data reaches 99.2%, but the predictions on the test data are much worse. Could you please tell me what is wrong with my test images, or whether the generator introduces some ambiguity?
Thanks for the tutorial, it's a nice one. I get this error:
@bluelight773 |
I got File "data/Classifier2.py", line 100, in train_top_model UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 142: character maps to <undefined>
Did anyone have the same issue and get it solved?
For multiclass prediction you need to convert your labels via to_categorical. For more info see this: |
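As a rough illustration of what that conversion produces, here is a NumPy-only stand-in. `one_hot` is a hypothetical helper written for this sketch; in practice the `to_categorical` utility from `keras.utils` does the same job.

```python
import numpy as np


def one_hot(labels, num_classes):
    """NumPy stand-in for one-hot encoding: one row per sample, with a 1
    in the column of that sample's class and 0 elsewhere."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


encoded = one_hot([0, 2, 1, 2], num_classes=3)
print(encoded)
```

With labels in this form, the last layer of the top model would typically become `Dense(num_classes, activation='softmax')` and the loss `'categorical_crossentropy'`, instead of the single sigmoid unit and `'binary_crossentropy'` used in the script.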
I am getting some errors while predicting on test data. How do I predict the images for test data in the "Using the bottleneck features of a pre-trained network" section? I have tried the following:
test_data_dir = '/Dogs_Cats/data/test' ## There are 100 images in test data
test_data_features = model.predict_generator(generator, 100)
I am getting errors:
That's because our little 2-layer network is not meant to take pictures as input.
Hi folks, does anyone know why I get loss: nan for some of the final epochs? I was trying this code with a multiclass classification. Epoch 44/50 Thanks,
Hi, first of all, thanks for a great tutorial! I have only just discovered Keras, and this example has shown me how simple and powerful development using the Keras framework can be.
I have followed the tutorial (and made the modification to use the published VGG16 image preprocessing means suggested by @yurkor), but I seem to get significantly different results using the Theano backend and the TensorFlow backend. Does this happen to anybody else?
Using the VGG16 image preprocessing means and the Theano backend I get a validation accuracy of ~95% after 50 epochs, but using the TensorFlow backend the validation accuracy is between 88 and 90% after 50 epochs. Without the VGG16 image preprocessing means the accuracy is around 90% for the Theano backend and 86% for TensorFlow.
Any ideas why Theano seems to outperform TensorFlow? I thought that they would be almost identical. Regards, Alex
Follow up: If I use the vgg16 example with imagenet weights provided with keras:
to do the feature extraction and then build the classifier on top of that (as in this tutorial), then I get the same accuracy using the tensorflow backend as with using the theano backend. I assume I'm messing up the conversion of the vgg 16 weights from theano format to tensorflow format. At the moment I am doing:
which came from: I want to use something like convert_kernel - but am having difficulty going from the h5 file to the numpy array. Any ideas how I should be doing it? Regards, Alex |
TypeError: can't multiply sequence by non-int of type 'float'
TypeError Traceback (most recent call last) in train_top_model()
TypeError: can't multiply sequence by non-int of type 'float'
Help please
@rk-ka Are you running this under Python 3.5? I had this exact issue when I ran it under Python 3.5, however I was able to find a solution to the problem. Change the following line: To: And also change the following line: To: |
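The exact lines quoted in that reply did not survive on this page, so the following is only a sketch of the usual fix for this TypeError, based on the label-building lines in the script: on Python 3, `/` returns a float, while `//` keeps the count an integer.

```python
nb_train_samples = 2000

# Python 3: "/" is true division and returns a float, which cannot repeat a list.
try:
    [0] * (nb_train_samples / 2)
    float_division_works = True   # only reached on Python 2
except TypeError:
    float_division_works = False

# "//" is floor division and yields an int on both Python 2 and 3.
half = nb_train_samples // 2
train_labels = [0] * half + [1] * half
print(len(train_labels))
```

The same change applies to the validation labels built from `nb_validation_samples`.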
@srikar2097 I had the same problem. My problem was not the number of images in the folders, however removing the file type in the filename solved the problem, see example below.
Loading the bottleneck features with @rk-ka Changing the second argument to |
Here is my code for building the model:
import h5py
def save_bottleneck_features():
def train_top_model():
#save_bottleneck_features()
I used the h5 format to save the weights. For testing, I created a test directory with 20 images:
import h5py
datagen = ImageDataGenerator(rescale=1. / 255)
bottleneck_features_test = base_model.predict_generator(generator, 20)
json_file = open('model_catsVSdogs_bottleNeck1.json', 'r')
model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
I am getting an error when trying to run the save_bottlebeck_features() function:
I am using Python 3.5 with Anaconda on Ubuntu 16.04. Can someone help me with that?
@Wheele9 It seems NumPy changed the save/load methods to use a filename. You can change each np.save and np.load (2 of each) to something like
I've got an error running the program. Can someone help me understand why this happened?
@galactica147 You need to change input_shape=(3, img_width, img_height) to input_shape=(img_width, img_height,3) |
@xiaokaisq, thank you for the tip, but now I ran into another error:
Any ideas how to fix this? (I'm new to Keras.)
ok, I fixed the crash by this tip: keras-team/keras#3945 |
@yurkor @daveselinger Thanks for your hints! Worked like a charm. @yurkor I had to change the .reshape for Tensorflow to:
Could improve the accuracy like you suggested. |
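The reshape mentioned above did not survive on this page, so here is only a sketch of the idea: subtract per-channel means and shape the mean vector according to the backend's image layout. The three values are the ImageNet channel means commonly quoted for the original VGG weights, in B, G, R order. That order, and the helper `subtract_mean`, are assumptions to check against how your images are loaded.

```python
import numpy as np

# Per-channel ImageNet means commonly quoted for the original VGG weights,
# listed in B, G, R order (assumption: verify the order your loader uses).
VGG_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def subtract_mean(batch, data_format):
    """Subtract the channel means from a 4D image batch with values in 0..255."""
    if data_format == 'channels_first':    # Theano-style: (n, 3, h, w)
        mean = VGG_MEAN.reshape(1, 3, 1, 1)
    else:                                  # TensorFlow-style: (n, h, w, 3)
        mean = VGG_MEAN.reshape(1, 1, 1, 3)
    return batch - mean


tf_batch = np.full((2, 150, 150, 3), 128.0, dtype=np.float32)
th_batch = np.full((2, 3, 150, 150), 128.0, dtype=np.float32)
tf_centered = subtract_mean(tf_batch, 'channels_last')
th_centered = subtract_mean(th_batch, 'channels_first')
print(tf_centered[0, 0, 0])
print(th_centered[0, :, 0, 0])
```

Note that with mean subtraction the images are not rescaled by 1/255 first, which matches the "mean and no scale" suggestion earlier in the thread.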
Maybe I'm too stupid, but I'm not able to test my model. The training runs fine, but how should I test my model after that? I always run into some exceptions while trying to predict my own classes.
What I do to predict is this: just put the image through VGG16 again and use the output bottleneck features to predict. if __name__ == '__main__':
Guys, please help me. I followed this example, and when I try to predict with the model from "train_top_model" using the following code: I get ValueError: Error when checking : expected flatten_2_input to have shape (None, 4, 4, 512) but got array with shape (1, 150, 150, 3)
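The shapes in that ValueError suggest what is going on: the top model was trained on VGG16 bottleneck features, not on raw images, so a 150x150 image has to pass through the convolutional base first. The sketch below only does the shape arithmetic (no Keras involved), assuming channels-last ordering. `vgg16_feature_shape` is a hypothetical helper.

```python
def vgg16_feature_shape(img_width, img_height):
    """Shape of the VGG16 convolutional output (channels last, no batch axis).

    The five convolution blocks each end in a 2x2 max-pool with stride 2,
    so every block halves the spatial size (rounding down); the last block
    has 512 filters.
    """
    w, h = img_width, img_height
    for _ in range(5):
        w, h = w // 2, h // 2
    return (w, h, 512)


script_shape = vgg16_feature_shape(150, 150)
imagenet_shape = vgg16_feature_shape(224, 224)
print(script_shape, imagenet_shape)
```

For 150x150 inputs this gives (4, 4, 512), which is exactly the input shape the Flatten layer reports in the error. It also shows why 150x150 images work without the fully-connected top: the convolutional layers accept other sizes and simply produce a smaller feature map than the 7x7 one obtained from 224x224 images.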
I got the first tutorial running fine, but I'm having trouble with this second one. The script runs, I get these messages on the command prompt: https://image.ibb.co/c729GF/cmd_script2.png The script just keeps running, taking up 100% CPU and doing nothing after these messages appear. I tried running in python 2.7 on Ubuntu and python 3.5 on Windows. Both times the same thing happened. How do I fix this? @SoftwareentwicklungHell @BoldinovaNatalya @liangxiao05 |
@JamesKeras. I'm having a similar problem to yours. The script keeps running and I don't get the desired output. Were you able to find a solution to this issue? Thanks. |
First of all thanks for a great tutorial! I ran into some problems while running the code.
It turned out though, that it has nothing to do with a memory error instead, the error was caused by the img = img.resize(wh_tuple) in the def load_img found in the image.py file. Running this block of code separately resulted in the following error message.
I haven't figured out why this particular image makes the function fail since it too comes in JPEG format. I can get the code to work by adding a try and except in the def load_img and in the class DirectoryIterator(Iterator), but I guess that some of you might have a better solution. Also, some of you might know how to include a fix in the original code, such that other users of Keras don't run into the same problem as me. |
I have a GTX 1080 GPU, but predict_generator is dramatically slow.
@biswatherockstar I faced the same problem. I solved it by adding 'wb' and 'rb' in the save and load functions respectively. It has something to do with Windows and binary files. |
Hi @fchollet and everyone Now, I wrote the below code for predicting new image. The problem is Is there any mistake in merging the two models? I know Any suggestion?
Hi @michelyang Thanks |
Hi Everyone and @michelyang
I just normalized the image values
and I removed: Thanks |
@abderhassan I abandoned keras. I used tensorflow alone to create a deep learning object detector. I adapted this tutorial: https://www.tensorflow.org/tutorials/image_retraining Keras was too hard to get working. |
Hi everyone! I need your help :)
img_width, img_height = 150, 150
top_model_weights_path = 'bottleneck_fc_model.h5'
def save_bottlebeck_features():
def train_top_model():
save_bottlebeck_features()
I am getting different error messages:
I tried with and without '.npy', with and without 'wb' and 'rb', with and without 'import gc; gc.collect()', but nothing works...
@Wheele9 The approach proposed by @BrianMowrey obviously avoids the problem altogether. |
I think there's an error with the
Otherwise you're creating around 32,000 samples. |
Is the function name 'save_bottlebeck_features' a mistake? Do you mean 'bottleneck'? |
@LaetiM I get the same error. |
Hi, I have problem and need your help!! "ValueError: Input arrays should have the same number of samples as target arrays. Found 180 input samples and 200 target samples." Im using: my code
then I got this Error
I think this error cause |
This is my code. It even predicts the class for a new image |
Hi All, I got a problem while using this example. I am using the same code. Can anyone help? keras 2.0.4 Error is: ValueError: Input dimension mis-match. (input[1].shape[1] = 3, input[2].shape[1] = 64) HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'. |
Hello,
Does anybody have an idea why this isn't working?
Is it mandatory to have an equal number of samples for each class? The setup states that there are 1000 images each of cats and dogs in the training set and 400 images each in the validation set. What if I don't have an equal number of instances for each class? Will my params for the predict_generator method call change? If yes, how do I do that? Thank you.
@tomisuker The simple resolution is |
Hi All, How to find test accuracy using this code? I already have folder structure for test. How can I use model.evaluate? |
Hi All, I have a question: |
@elsadarwin: Nope, you can have different sizes of each training set |
All good
The problem was a spelling mistake on my side within the class.
System
Error
I'm running into this error and can't solve the issue:
Edit: And it takes ages for the first process. Is that normal? @tomisuker did the solution from @zixingyan help in your case?
In my case I still get the same error as before.
Is there anybody who could help?
@GitHubKay You can solve the "ValueError: Input arrays should have the same number of samples as target arrays...." error by modifying the 'model.predict_generator' lines to the following,
and,
@Thimira yeah, thx i already solved this problem ;) |
I encountered a similar problem to @GitHubKay and @tomisuker.
Reading the code, it looks like the training sample number and the validation sample number must be whole multiples of the batch_size, 16. So in my case I just changed the number of validation samples from 200 to 208 (13 * 16), and that solved the problem. The training sample number, 2000, is already a whole multiple of the batch_size.
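The arithmetic behind that observation can be sketched as follows. `samples_covered` and `steps_to_cover_all` are hypothetical helpers; the second one reflects another commonly suggested option (rounding the step count up), which assumes the generator yields a shorter final batch rather than wrapping around, so check that against your Keras version.

```python
import math


def samples_covered(nb_samples, batch_size):
    """Samples produced when the generator runs nb_samples // batch_size steps."""
    return (nb_samples // batch_size) * batch_size


def steps_to_cover_all(nb_samples, batch_size):
    """Smallest number of steps that visits every sample at least once."""
    return int(math.ceil(nb_samples / float(batch_size)))


print(samples_covered(2000, 16))     # 2000: features and labels line up
print(samples_covered(200, 16))      # 192: 8 features short of 200 labels
print(samples_covered(208, 16))      # 208: the workaround described above
print(steps_to_cover_all(200, 16))   # 13
```

Whenever the number of saved features differs from the number of labels, `model.fit` raises the "Input arrays should have the same number of samples as target arrays" error.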
Hello, while running this code I am getting the following error:
Using TensorFlow backend.
File "", line 75, in
File "", line 34, in save_bottlebeck_features
File "/home/nd/anaconda3/lib/python3.6/site-packages/numpy/lib/npyio.py", line 509, in save
File "/home/nd/anaconda3/lib/python3.6/site-packages/numpy/lib/format.py", line 555, in write_array
File "/home/nd/anaconda3/lib/python3.6/site-packages/numpy/lib/format.py", line 328, in _write_array_header
TypeError: write() argument must be str, not bytes
I am using Ubuntu. Tried solution: change 'w' to 'wb' in lines 70 and 81. Thanks in advance.
one more question: in starting they asked to arrange data like
but at end it is given: |
Image Augmentation
Hey, I wonder why we use the same ImageDataGenerator for the train and validation data when generating the weights for the optimization process. Within the optimization process (OP) we use different augmentations for our data. Since we use the weights of this model to fine-tune them in the OP, couldn't we do the same here to generate more generalized weights? Edit: Cheers! I'm speaking about this part:
please someone reply |
I've created a tutorial on how to get this working for multi-class classification, and how to use it to make predictions once trained. You can find the tutorial here: Using Bottleneck Features for Multi-Class Classification in Keras and TensorFlow |
@Thimira great work, thank you for that tutorial, exactly, what I was looking for |
I'm trying to run this example on AWS Machine learning AMI but I get the error when executing predict_generator: Any help or pointers where to look for more info would be highly appreciated. |
@softberries as you found, this was an issue with the Keras version. The AWS pre-built Deep Learning AMI only has Keras 1.2.2 |
@Thimira, the solution you gave for "ValueError: Input arrays should have the same number of samples as target arrays...." does solve the problem, but my loss and accuracy no longer change.
Thank you so much @rajshah4 ! |
UnicodeDecodeError Traceback (most recent call last) in train_top_model()
/usr/local/lib/python3.6/dist-packages/numpy/lib/npyio.py in load(file, mmap_mode, allow_pickle, fix_imports, encoding)
/usr/lib/python3.6/codecs.py in decode(self, input, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x93 in position 0: invalid start byte
Does anyone have an idea what I should do?
Thank you for a great tutorial! After some more brainstorming (and a lot more googling) I decided to move to the third part, BUT I got an error. Any help would be great, and I would be thankful.
Does anybody know why saving the bottleneck features prevents us from using image augmentation (as François writes)? I have to use it, but when I do so in the bottleneck-saving part it does not work, and I just don't get why. How can I still use it?
re-arranged the methods and added plotting of training history. Overfitting is evident. https://github.com/aspiringguru/fastai_bmt_reworking/blob/master/kerasBlog/Keras_cats_dogs_2.ipynb |
Besides the multiple errors on Python 3, like @aspiringguru said there's definitely overfitting. It's clearly visible in the second section of the Keras blog tutorial. The Epoch 1/50 |
I get this error too:
But I already applied all fixes previously mentioned and I haven't changed the values from |
I have got an error
Why this error?
Hi everyone! I have an Intel 2.7 GHz with 8 GB. It seems to take forever. How long should it take? What am I doing wrong? Thanks,
I am solving a dog breed classification problem which has 120 classes in all. The training data has 6000 examples and the validation data has 3000 examples. I have tried using the above technique but it gives me:
def VGG16(Input_shape,classes):
X=Dense(4096,activation='relu')(X)
model=Model(inputs=X_Input,outputs=X)
Please help!!!
Hey, how can I get the confusion_matrix using the original code (classifier_from_little_data_script_2.py)? Thanks! |
Hi all, I just got this script running using Python 3.6. I wanted to provide some guidance, as I ran into some of the same errors that others did. The first error I got was: …to avoid this error, just remove the open() calls nested in the np.save() and np.load() calls. For example, replace: With: And replace: With: The second error I received was: …this can be solved by ensuring that the number of sample images you set up in the directory, for both the training and the test sets, is evenly divisible by the batch size you select, so that there is a whole number of iterations. |
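The two fixes described above can be sketched as follows. This is only a minimal sketch: the file name is the one used in the script, the small random array stands in for the real bottleneck features (its shape is an assumption), and `check_divisible` is a hypothetical helper that is not part of the script.

```python
import os
import tempfile

import numpy as np


def check_divisible(nb_samples, batch_size):
    """Hypothetical helper: return the number of whole batches, or raise."""
    if nb_samples % batch_size != 0:
        raise ValueError(
            "%d samples is not a multiple of batch size %d"
            % (nb_samples, batch_size))
    return nb_samples // batch_size


# Stand-in for the real bottleneck features (assumed shape, random values).
features = np.random.rand(32, 4, 4, 512).astype("float32")

# Fix 1: pass the file name straight to np.save / np.load instead of
# wrapping it in open(), which opens the file in text mode on Python 3.
path = os.path.join(tempfile.mkdtemp(), "bottleneck_features_train.npy")
np.save(path, features)
restored = np.load(path)

# Fix 2: make sure the sample counts split into a whole number of batches.
train_steps = check_divisible(2000, 16)
validation_steps = check_divisible(800, 16)
```

With the sample counts from the script (2000 and 800) and a batch size of 16, both checks pass; a count such as 2001 would raise instead of silently dropping or duplicating images.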
Update: After that I got another error, mentioned by @rk-ka: can't multiply sequence by non-int of type 'float'. I fixed the new error by following @jamiejwsmith's solution posted above! The code ran fine after this.
Original:
UnicodeDecodeError Traceback (most recent call last)
in train_top_model()
/usr/local/lib/python3.6/dist-packages/numpy/lib/npyio.py in load(file, mmap_mode, allow_pickle, fix_imports, encoding)
/usr/lib/python3.6/codecs.py in decode(self, input, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x93 in position 0: invalid start byte |
VGG-16 uses 224 by 224 as its input image size, but the images here are 150 by 150. Does this not throw an error? |
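One way to see why 150 by 150 can work: with `include_top=False`, the fully connected layers that expect a fixed-size input are dropped, and the remaining convolutional base only halves the spatial size at each of its five pooling stages. The sketch below is plain arithmetic, assuming 'same'-padded convolutions and 2x2 pooling with stride 2 as in the standard VGG16 architecture; `vgg16_feature_map_size` is a hypothetical helper, not a Keras function.

```python
def vgg16_feature_map_size(side, n_pool_stages=5):
    """Spatial side length left after the pooling stages (floor division)."""
    for _ in range(n_pool_stages):
        side //= 2
    return side


for side in (224, 150):
    out = vgg16_feature_map_size(side)
    print("%dx%d input -> %dx%dx512 bottleneck features"
          % (side, side, out, out))
```

The top model in the script takes its input shape from the saved features (`train_data.shape[1:]`), so it adapts to whatever size the convolutional base produces.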
@orrimoch it can take up to 10 minutes. Here's the expected output using a MacBook Pro: https://asciinema.org/a/rYD3A5TIpfKyOh0x3alQPFTqr |
I am stuck on the encoding problem at this line: train_data = np.load(open('bottleneck_features_train.npy')). Please help me. Do I need any extra arguments for encoding? I use Windows. |
So there seem to be a lot of people struggling to predict using the model. It's simple, just use the following code:
|
@fchollet I think there is a typo on lines 56 and 110? |
I am struggling with a problem with the line:
It gives me an error: I already went through the comments and changed: Can anyone guide me on this? Thanks in advance. |
@dAmnation69 it would be better to save it with |
I have this problem: AssertionError: Model weights not found (see "weights_path" variable in script). How can I solve it? |
I did not understand what exactly the output of model.predict_generator() is. As far as I understand, it returns the probabilities of the predicted output. If that is the case, it should give two probabilities (because in my case I have just two classes, CATS and DOGS). Below is the code.
OUTPUT- [[0. 0. 0.00883913 ... 0.40895757 0.44412684 0. ] [[0.8991803 0. 0.21461597 ... 0. 0. 0. ] [[1.3933742 0. 0.21758774 ... 0. 0.69323623 0. ] I do not understand what this output of print bottleneck_features_train[0] represents. |
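For what it is worth, in the script the VGG16 model is built with `include_top=False`, so `predict_generator()` returns the activations of the last pooling layer (the "bottleneck features"), not class probabilities; the probability only appears later, at the sigmoid output of the small top model. The sketch below uses a random stand-in array rather than real features, and its shape assumes 150 by 150 inputs, so treat it as an illustration of the shapes only.

```python
import numpy as np

# Stand-in for the array returned by predict_generator (assumed shape,
# random values clipped at zero to mimic ReLU followed by max pooling).
rng = np.random.RandomState(0)
bottleneck_features_train = np.maximum(rng.randn(8, 4, 4, 512), 0.0)

# One block of feature-map activations per image, not two probabilities.
print(bottleneck_features_train.shape)

first = bottleneck_features_train[0]   # features of the first image
flat = first.reshape(-1)               # what Flatten() passes to Dense(256)
print(first.shape, flat.shape)
```

Printing `bottleneck_features_train[0]` therefore shows many non-negative activation values, with many exact zeros, which is consistent with the output quoted above.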
This is the code that ran successfully after applying all the modifications above. (Updated at 2018-09-23 16:47:09 UTC+8)
# This script goes along the blog post
# "Building powerful image classification models using very little data"
# from blog.keras.io.
# It uses data that can be downloaded at:
# https://www.kaggle.com/c/dogs-vs-cats/data
# In our setup, we:
# - created a data/ folder
# - created train/ and validation/ subfolders inside data/
# - created cats/ and dogs/ subfolders inside train/ and validation/
# - put the cat pictures index 0-999 in data/train/cats
# - put the cat pictures index 1000-1400 in data/validation/cats
# - put the dogs pictures index 12500-13499 in data/train/dogs
# - put the dog pictures index 13500-13900 in data/validation/dogs
# So that we have 1000 training examples for each class, and 400 validation examples for each class.
# In summary, this is our directory structure:
#
# data/
#     train/
#         dogs/
#             dog001.jpg
#             dog002.jpg
#             ...
#         cats/
#             cat001.jpg
#             cat002.jpg
#             ...
#     validation/
#         dogs/
#             dog001.jpg
#             dog002.jpg
#             ...
#         cats/
#             cat001.jpg
#             cat002.jpg
#             ...
import numpy as np
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dropout, Flatten, Dense
from keras import applications
# dimensions of our images.
img_width, img_height = 150, 150
top_model_weights_path = 'bottleneck_fc_model.h5'
train_data_dir = 'data/train'
validation_data_dir = 'data/validation'
nb_train_samples = 2000
nb_validation_samples = 800
epochs = 50
batch_size = 16
def save_bottlebeck_features():
    datagen = ImageDataGenerator(rescale=1. / 255)

    # build the VGG16 network
    model = applications.VGG16(include_top=False, weights='imagenet')

    generator = datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)
    bottleneck_features_train = model.predict_generator(
        generator, nb_train_samples // batch_size)
    np.save('bottleneck_features_train.npy', bottleneck_features_train)

    generator = datagen.flow_from_directory(
        validation_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)
    bottleneck_features_validation = model.predict_generator(
        generator, nb_validation_samples // batch_size)
    np.save('bottleneck_features_validation.npy', bottleneck_features_validation)


def train_top_model():
    train_data = np.load('bottleneck_features_train.npy')
    train_labels = np.array(
        [0] * (nb_train_samples // 2) + [1] * (nb_train_samples // 2))

    validation_data = np.load('bottleneck_features_validation.npy')
    validation_labels = np.array(
        [0] * (nb_validation_samples // 2) + [1] * (nb_validation_samples // 2))

    model = Sequential()
    model.add(Flatten(input_shape=train_data.shape[1:]))
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1, activation='sigmoid'))

    model.compile(optimizer='rmsprop',
                  loss='binary_crossentropy', metrics=['accuracy'])

    model.fit(train_data, train_labels,
              epochs=epochs,
              batch_size=batch_size,
              validation_data=(validation_data, validation_labels))
    model.save_weights(top_model_weights_path)


save_bottlebeck_features()
train_top_model()
|
Hi @fchollet, I have a question about how to use pre-trained models in Keras. After reading a little on Stack Overflow, I came to understand that in order to extract the "bottleneck features" from a pre-trained model, one has to preprocess the images using the function I think this is in agreement with what you explain in the documentation for the In the code for this tutorial, however, you simply divide the input by 255 (which, as far as I understand, is different from what the As I am a newbie to Keras, this left me a bit confused. Can you please confirm that the right way is indeed to use If this is the case, I think that adding a note in the documentation for the "Applications" module and fixing it in the tutorial would greatly help new Keras users. I would be happy to make a small contribution by doing it myself if you let me know how to propose modifications to the documentation. |
I get the same, have you found the solution? |
@fchollet, other contributors Despite MobileNet being a much smaller model. |
My computer (running on CPU) takes 30 minutes from reading the training data: to the validation data: and then 15 more minutes until it starts the epochs. Any idea why it's so slow? Thanks! |
ValueError: Error when checking target: expected dense_2 to have shape (10,) but got array with shape (1,)
Also, using Need help to fix the error. |
I am running this exact code on the exact same data set, and the training accuracy increases as it should. However, the validation accuracy doesn't really change, but fluctuates at around 0.9 throughout the epochs as shown below. Epoch 1/50 |
Hi, I have a dataset of 48 classes with 1000 images in each. With each epoch, memory usage in RAM slowly increases, and training halts when my RAM gets full. Please help me out. |
Please help me fix the following errors; I am new to Python.
InvalidArgumentError Traceback (most recent call last)
InvalidArgumentError: Negative dimension size caused by subtracting 2 from 1 for 'max_pooling2d_2/MaxPool' (op: 'MaxPool') with input shapes: [?,1,148,32].
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/keras/engine/sequential.py in add(self, layer)
/usr/local/lib/python3.6/dist-packages/keras/engine/base_layer.py in call(self, inputs, **kwargs)
/usr/local/lib/python3.6/dist-packages/keras/layers/pooling.py in call(self, inputs)
/usr/local/lib/python3.6/dist-packages/keras/layers/pooling.py in _pooling_function(self, inputs, pool_size, strides, padding, data_format)
/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py in pool2d(x, pool_size, strides, padding, data_format, pool_mode)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_ops.py in max_pool(value, ksize, strides, padding, data_format, name)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_nn_ops.py in max_pool(input, ksize, strides, padding, data_format, name)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py in _apply_op_helper(self, op_type_name, name, **keywords)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py in new_func(*args, **kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py in create_op(failed resolving arguments)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py in init(self, node_def, g, inputs, output_types, control_inputs, input_types, original_op, op_def)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py in _create_c_op(graph, node_def, inputs, control_inputs)
ValueError: Negative dimension size caused by subtracting 2 from 1 for 'max_pooling2d_2/MaxPool' (op: 'MaxPool') with input shapes: [?,1,148,32]. |
When I change the optimizer to adam, it works fine. |
Sometimes my accuracy comes out at around 90 to 94%, but most of the time, after running the same code, it comes out at exactly 50%. I don' from keras.applications.vgg19 import VGG19 ) |
I am using this code to classify ten classes of faces using VGG FaceNet, but I am getting an error. Can someone help? TypeError Traceback (most recent call last) in train_top_model() TypeError: can't multiply sequence by non-int of type 'float' |
Problem solved. I had created labels for two classes, whereas my model has to classify among 10 classes, so I changed the labels to train_label = np.array([0] * (nb_train_samples // 10) + ....... + [9] * (nb_train_samples // 10)) |
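The per-class label list described above can also be built without writing out each term. This is a minimal sketch under stated assumptions: every class has the same number of images, the features were saved in class order (as with `shuffle=False` in the script), and the names `nb_classes`, `per_class` and `one_hot_labels` are illustrative, not from the script.

```python
import numpy as np

nb_classes = 10
nb_train_samples = 2000  # assumed to be a multiple of nb_classes

# Integer division keeps the repeat count an int on Python 3; a plain "/"
# gives a float and triggers "can't multiply sequence by non-int".
per_class = nb_train_samples // nb_classes

# Same result as [0] * per_class + [1] * per_class + ... + [9] * per_class.
train_labels = np.repeat(np.arange(nb_classes), per_class)

# One-hot rows, which is the usual target format for a final
# Dense(nb_classes, activation='softmax') layer trained with
# categorical_crossentropy (an assumed change to the top model).
one_hot_labels = np.eye(nb_classes, dtype="float32")[train_labels]
```

With more than two classes, the single sigmoid unit in the script's top model no longer fits, so the output layer and loss would need to change along with the labels.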
Do we need an equal number of images in each class? |
Dear @biswagsingh @srikar2097 @drewszurko @aspiringguru |
Prediction code for fine-tuned models. I checked this code myself and it all worked fine.
This code is inspired by a Stack Overflow answer.
|
Yes, because if we provide an equal number of images, the model will be able to generalize well. |
I really love this example. I had some problems getting it running on Python 3.5, but everything ran smoothly on 2.7. Great work.