'''This script goes along the blog post
"Building powerful image classification models using very little data"
from blog.keras.io.
It uses data that can be downloaded at:
https://www.kaggle.com/c/dogs-vs-cats/data
In our setup, we:
- created a data/ folder
- created train/ and validation/ subfolders inside data/
- created cats/ and dogs/ subfolders inside train/ and validation/
- put the cat pictures index 0-999 in data/train/cats
- put the cat pictures index 1000-1400 in data/validation/cats
- put the dogs pictures index 12500-13499 in data/train/dogs
- put the dog pictures index 13500-13900 in data/validation/dogs
So that we have 1000 training examples for each class, and 400 validation examples for each class.
In summary, this is our directory structure:
```
data/
    train/
        dogs/
            dog001.jpg
            dog002.jpg
            ...
        cats/
            cat001.jpg
            cat002.jpg
            ...
    validation/
        dogs/
            dog001.jpg
            dog002.jpg
            ...
        cats/
            cat001.jpg
            cat002.jpg
            ...
```
'''
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras import backend as K

# dimensions of our images.
img_width, img_height = 150, 150

train_data_dir = 'data/train'
validation_data_dir = 'data/validation'
nb_train_samples = 2000
nb_validation_samples = 800
epochs = 50
batch_size = 16

if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)

model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

# this is the augmentation configuration we will use for training
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

# this is the augmentation configuration we will use for testing:
# only rescaling
test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary')

model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size)

model.save_weights('first_try.h5')
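The step counts passed to `fit_generator` in the script come from integer division of the sample counts by the batch size. A minimal sketch of that arithmetic, using the sample counts from the script; the helper name `steps_for` is hypothetical and not part of Keras:

```python
import math


def steps_for(num_samples, batch_size, drop_remainder=True):
    """Number of batches needed to cover num_samples once.

    drop_remainder=True mirrors the script's `//` (a final partial batch is
    skipped); False rounds up so every sample is seen once per epoch.
    """
    if drop_remainder:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


batch_size = 16
train_steps = steps_for(2000, batch_size)   # training steps per epoch
val_steps = steps_for(800, batch_size)      # validation steps
print(train_steps, val_steps)
```

If the folders hold a different number of images than the script assumes, these values need to be recomputed from the real counts.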
How to check model accuracy?
@marco-zorzi thank you!
Hi @fchollet and anyone,
I am new to deep learning and Keras.
Thanks for your tutorial!
I have successfully run the code.
Now, how can I test it?
I am using the TensorFlow backend.
If anyone can help me with code, it would be helpful.
Thanks
Thanks for the great tutorial! I think the last line of code should be model_save_weights('first_try.h5').
It's model.save_weights('first_try.h5').
Good learning example.
I am a greenhorn with Keras, figuring my way through. How does one add:
a) K-fold cross validation
b) checkpoint
with the generators
and
c) change the example to more than 2 classes
I tried
model.add(Dense(3)) # 3 classes
model.add(Activation('softmax'))
Not sure if this is the only change. Will the class_mode change?
I am still confused about how one adds a) K-fold cross validation.
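On question a): the generators read images from directories, so one way to get K folds is to split the list of image files into K parts and use each part once for validation. Below is a minimal sketch of only the splitting step, using NumPy. The function name `kfold_indices` and the sample count are illustrative assumptions, and feeding each fold to Keras (for example through per-fold directories) is left out.

```python
import numpy as np


def kfold_indices(num_samples, k, seed=0):
    """Yield (train_idx, val_idx) index arrays for k shuffled folds."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(num_samples)
    folds = np.array_split(order, k)
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train_idx, val_idx


# 2000 is the training-set size used in the script; k=5 is an arbitrary choice.
splits = list(kfold_indices(num_samples=2000, k=5))
for fold, (train_idx, val_idx) in enumerate(splits):
    print(fold, len(train_idx), len(val_idx))
```

Each index refers to a position in a list of file paths, so every image lands in the validation part of exactly one fold.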
@fchollet hope you are doing well.
With reference to your Keras code for the powerful image classifier, I was doing binary classification of Mac and other laptops. My training set contains 1000 examples each of Mac and other laptops (2000 training examples in total). I trained it for 50 epochs. When training ended, the accuracy shown on screen was 0.93, but when I test the model on my test set it gives bad results. Can you please help me with what should be done in this situation, as I need above 95% accuracy to classify between Mac and other brand laptops?
I test my model with this code:
preds = laptop_model.predict_classes(x)
prob = laptop_model.predict_proba(x)
print(preds, prob)
if(prob <= 0.5):  # My first folder in train directory was Mac
    label = "MacBook " + str(prob)
    color = (0, 0, 255)
else:
    label = "Laptop" + str(prob)
    color = (0, 255, 0)
print(label)
Kindly check the code.
Here is the result of what my naughty model has learned so far:
model result
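One thing worth checking in a setup like this is which folder ended up as class 0. With `class_mode='binary'` and a sigmoid output, the predicted value is the probability of class index 1, and `flow_from_directory` numbers the classes in alphanumeric order of the folder names. A minimal sketch of the mapping follows; the folder names `mac` and `other` and the helper `label_from_probability` are assumptions for illustration, and the real mapping comes from `train_generator.class_indices`.

```python
def label_from_probability(prob, class_indices, threshold=0.5):
    """Map a sigmoid output to a class name.

    prob is the model's estimate that the sample belongs to class index 1;
    class_indices has the same shape as a generator's `class_indices` dict.
    """
    index_to_name = {index: name for name, index in class_indices.items()}
    predicted_index = 1 if prob > threshold else 0
    return index_to_name[predicted_index]


# Hypothetical mapping; read the real one from train_generator.class_indices.
class_indices = {'mac': 0, 'other': 1}
low = label_from_probability(0.08, class_indices)
high = label_from_probability(0.93, class_indices)
print(low, high)
```

The post does not show how `x` is prepared. Since the script trains on images rescaled by `1. / 255`, the test images need the same rescaling and the same target size before prediction.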
I would like to implement a hierarchical ResNet architecture. However, I could not find any solution for this. For example, my data structure is like:
class A
    Subclass 1
    Subclass 2
    ....
class B
    Subclass 6
    ........
So I would like to train and predict the main class and then the subclass of the predicted main class. Let's say we first predict the age group of a person (40-60); then the classifier knows that the age is somewhere between 40 and 60 years old, and it predicts the apparent age, 53. I cannot get the idea behind the implementation for this. Can someone provide a simple example of how to do this with generators?
I know how to compile the model and call fit_generator, but I do not get the intuition behind the model creation.
base_model = ResNet50(include_top=False, weights='imagenet', input_shape=(224, 224, 3), pooling="avg")  # ResNet50
for layer in base_model.layers:
    if isinstance(layer, Conv2D) or isinstance(layer, Dense):
        layer.add_loss(l2(0.0005)(layer.kernel))
    if hasattr(layer, 'bias_regularizer') and layer.use_bias:
        layer.add_loss(l2(0.0005)(layer.bias))
prediction = Dense(units=100, kernel_initializer="he_normal", use_bias=False, activation="softmax",
                   name="pred_age", kernel_regularizer=l2(0.0005))(base_model.output)
model = Model(inputs=base_model.input, outputs=prediction)
What is validation data? Is it the same as the training data?
My validation data is the same as train data. I use ImageDataGenerator for splitting.
train_datagen = ImageDataGenerator(validation_split=val_split, rescale=1./255)  # set validation split
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_size, img_size),
    batch_size=batch_size,
    color_mode='rgb',
    shuffle=True,
    subset='training')  # set as training data
validation_generator = train_datagen.flow_from_directory(
    train_data_dir,  # same directory as training data
    target_size=(img_size, img_size),
    color_mode='rgb',
    shuffle=False,
    subset='validation')  # set as validation data
I was thinking of first training a model on N age groups (0-10, 11-18, ...), then training N models, one per age group (model 1 (0-10), model 2 (11-18), model 3 (..)), and then combining those models into one model. However, I do not know how to combine them. If one model has 10 prediction outputs and another has 15, the final model must have 25 prediction outputs, so the prediction outputs are also combined into one large output (a 100-way prediction output for the combination of all the models).
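One way to read this combination step, offered only as a sketch and not as an established recipe from this thread: treat the group model's output as P(group) and each per-group model's output as P(age | group), multiply them, and concatenate the results into one long vector. The NumPy example below uses made-up numbers and a hypothetical helper `combine_hierarchical`; it does not involve Keras or generators.

```python
import numpy as np


def combine_hierarchical(group_probs, within_group_probs):
    """Combine coarse and fine predictions into one distribution.

    group_probs: shape (n_groups,), probabilities over the coarse groups.
    within_group_probs: list of arrays, one per group, each a distribution
        over that group's fine classes (lengths may differ per group).
    Returns P(fine class) = P(group) * P(fine class | group), concatenated.
    """
    parts = [p_group * np.asarray(p_fine)
             for p_group, p_fine in zip(group_probs, within_group_probs)]
    return np.concatenate(parts)


# Made-up numbers: two groups with 2 and 3 fine classes.
group_probs = np.array([0.25, 0.75])
within = [np.array([0.5, 0.5]), np.array([0.2, 0.2, 0.6])]
combined = combine_hierarchical(group_probs, within)
print(combined, combined.sum())
```

Because each per-group output sums to 1 and the group probabilities sum to 1, the concatenated vector is again a probability distribution over all fine classes.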
I see some samples that do not use validation data.
They only use training data and training labels.
Is validation data necessary?
I see that the number of validation images is smaller than the number of training images.
Has anyone written a small prediction script for testing the data under a test folder (if given one)?
Hello everyone,
I ran the same code and got this error:
ValueError: Error when checking input: expected conv2d_11_input to have shape (28, 28, 2) but got array with shape (150, 150, 3)
Any help?
Hello
How do you use .load_weights('example.h5')? Do you have an example?
Thanks!
Hello!
I wonder whether the file type must be .h5 in the statement "model.save_weights('first_try.h5')"?
I also have the same question as @MasterWas.
Thanks!!
Hi, I generated the confusion matrix on the prediction results and it is [[2500 0] [2500 0]]. I think it's not up to the mark. I'm training on dogs and cats; there are 20000 training images and 5000 validation images.
Thanks
Can you tell me how you generated the confusion matrix?
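A matrix like [[2500 0] [2500 0]] means every image was assigned to class 0. The post does not show the code, so the following is only one possible cause: the model ends in `Dense(1)` with a sigmoid, so its output has a single column, and `argmax` over a single column always returns 0. A minimal NumPy sketch with made-up probabilities; `confusion_matrix_2x2` is a hypothetical helper written for this example:

```python
import numpy as np


def confusion_matrix_2x2(y_true, y_pred):
    """Rows are true classes, columns are predicted classes."""
    matrix = np.zeros((2, 2), dtype=int)
    for t, p in zip(y_true, y_pred):
        matrix[t, p] += 1
    return matrix


# Made-up sigmoid outputs with shape (n_samples, 1), as Dense(1) produces.
probs = np.array([[0.1], [0.2], [0.9], [0.7]])
y_true = np.array([0, 0, 1, 1])

wrong = np.argmax(probs, axis=1)            # always 0 for a single column
right = (probs[:, 0] > 0.5).astype(int)     # threshold the probability

print(confusion_matrix_2x2(y_true, wrong))
print(confusion_matrix_2x2(y_true, right))
```

When the true labels are taken from a generator, the generator also needs `shuffle=False`, otherwise the prediction order does not match the label order.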
@austinchencym
Hi
Actually, I got no more than 70% when I increased the size of the dataset. However, it does not look stable.
Any news on your side? How can you use one-hot encoding based on the example?
Hello. Did you implement one-hot encoding in your code? If so, will you please help me out? And is it a must to use one-hot encoding in multi-class image classification?
Thank you
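On one-hot encoding with the directory-based generators: passing `class_mode='categorical'` to `flow_from_directory` makes the generator yield one-hot labels, so they do not have to be built by hand, and that mode pairs with a softmax output and the `categorical_crossentropy` loss. With `class_mode='sparse'` the labels stay integers and `sparse_categorical_crossentropy` is used instead, so one-hot encoding is not strictly required. The sketch below only illustrates what the encoding looks like in NumPy; `one_hot` is a hypothetical helper, not a Keras function.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


encoded = one_hot([0, 2, 1], num_classes=3)
print(encoded)
```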
I wonder, don't we need the 'label'? There is no vector such as "y_train" being used in the code.
I have the same question. Can anyone please clarify?
Thanks
Accuracy not rising over 0.5000
I ran into the same problem! Did you solve it?
Hi, I am doing both Python and classification for the first time. Can anyone tell me whether it is necessary to keep the index as part of the image name, or will it work as long as each image has a unique name?
This line of code takes care of the label (i.e. the name of the subdirectory is the label)
train_datagen.flow_from_directory
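To make that concrete: `flow_from_directory` treats each subfolder as a class and numbers the classes in alphanumeric order of the folder names. The sketch below mimics that mapping with the standard library only. `infer_class_indices` is a hypothetical helper, not a Keras function; the real mapping is available as `train_generator.class_indices`.

```python
import os
import tempfile


def infer_class_indices(directory):
    """Mimic how labels follow from subfolder names: sorted order -> 0, 1, ..."""
    class_names = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    return {name: index for index, name in enumerate(class_names)}


# Build a throwaway folder layout like data/train/ to demonstrate.
with tempfile.TemporaryDirectory() as root:
    for name in ('dogs', 'cats'):
        os.makedirs(os.path.join(root, name))
    demo_indices = infer_class_indices(root)

print(demo_indices)
```

With the layout from the script, cats get label 0 and dogs get label 1, regardless of the individual file names.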
How can I test this code on new images?
Thank you
Hello, excellent tutorial. While trying the code I came across this error. Do you have any idea? It's the same code.
Thank you
Found 21 images belonging to 2 classes.
/home/abdou/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py:1905: UserWarning: `Model.predict_generator` is deprecated and will be removed in a future version. Please use `Model.predict`, which supports generators.
warnings.warn('`Model.predict_generator` is deprecated and '
WARNING:tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches (in this case, 125.0 batches). You may need to use the repeat() function when building your dataset.
TypeError Traceback (most recent call last)
in ()
24
25
---> 26 save_bottlebeck_features()
27 train_top_model()
in save_bottlebeck_features()
26 generator, nb_train_samples / batch_size)
27 np.save(open('bottleneck_features_train.npy', 'w'),
---> 28 bottleneck_features_train)
29
30 generator = datagen.flow_from_directory(
<array_function internals> in save(*args, **kwargs)
/home/abdou/.local/lib/python3.6/site-packages/numpy/lib/npyio.py in save(file, arr, allow_pickle, fix_imports)
527 arr = np.asanyarray(arr)
528 format.write_array(fid, arr, allow_pickle=allow_pickle,
--> 529 pickle_kwargs=dict(fix_imports=fix_imports))
530
531
/home/abdou/.local/lib/python3.6/site-packages/numpy/lib/format.py in write_array(fp, array, version, allow_pickle, pickle_kwargs)
646 """
647 _check_version(version)
--> 648 _write_array_header(fp, header_data_from_array_1_0(array), version)
649
650 if array.itemsize == 0:
/home/abdou/.local/lib/python3.6/site-packages/numpy/lib/format.py in _write_array_header(fp, d, version)
426 else:
427 header = _wrap_header(header, version)
--> 428 fp.write(header)
429
430 def write_array_header_1_0(fp, d):
TypeError: write() argument must be str, not bytes
@diouck I think that your dataset is too small to use this CNN. Try to classify with more samples and update your packages!
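Separately from dataset size, the TypeError at the bottom of the traceback comes from `np.save(open('bottleneck_features_train.npy', 'w'), ...)`: `np.save` writes bytes, and under Python 3 a file opened in text mode (`'w'`) only accepts `str`. Opening the file in binary mode (`'wb'`), or passing the file name directly to `np.save`, avoids it. A minimal sketch with a stand-in array in place of the real bottleneck features:

```python
import os
import tempfile

import numpy as np

features = np.arange(6, dtype=np.float32).reshape(2, 3)  # stand-in array

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'bottleneck_features_train.npy')

    # Text mode reproduces the error: np.save writes bytes.
    try:
        with open(path, 'w') as f:
            np.save(f, features)
        text_mode_error = None
    except TypeError as exc:
        text_mode_error = exc

    # Binary mode (or passing the path itself) works.
    with open(path, 'wb') as f:
        np.save(f, features)
    with open(path, 'rb') as f:
        loaded = np.load(f)

print(type(text_mode_error).__name__)
print(np.array_equal(loaded, features))
```

The matching `np.load(open(..., 'r'))` calls need `'rb'` for the same reason.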
data/
    train/
        dogs/
            dog001.jpg
            dog002.jpg
            ...
        cats/
            cat001.jpg
            cat002.jpg
            ...
    validation/
        dogs/
            dog001.jpg
            dog002.jpg
            ...
        cats/
            cat001.jpg
            cat002.jpg
            ...
'''
Showing the directory structure as an image, including simple examples, is so helpful! Tysm, this helped me a lot!!
Damn the Getúlio Vargas Foundation for demanding this example in a high-level public tender
page 7
we will never forget it...
I got this error. Please give a bit of detail on how to solve this problem:
Found 0 images belonging to 0 classes.
Found 0 images belonging to 0 classes.
:70: UserWarning: `Model.fit_generator` is deprecated and will be removed in a future version. Please use `Model.fit`, which supports generators.
model.fit_generator(
ValueError Traceback (most recent call last)
in <cell line: 70>()
68 class_mode='binary')
69
---> 70 model.fit_generator(
71 train_generator,
72 steps_per_epoch=nb_train_samples // batch_size,
2 frames
/usr/local/lib/python3.9/dist-packages/keras/preprocessing/image.py in __getitem__(self, idx)
101 def __getitem__(self, idx):
102 if idx >= len(self):
--> 103 raise ValueError(
104 "Asked to retrieve element {idx}, "
105 "but the Sequence "
ValueError: Asked to retrieve element 0, but the Sequence has length 0
RuntimeError Traceback (most recent call last)
in <cell line: 54>()
52 model = Sequential()
53
---> 54 model.fit(
55 train_generator,
56 steps_per_epoch=2000,
1 frames
/usr/local/lib/python3.9/dist-packages/keras/engine/training.py in _assert_compile_was_called(self)
3683 # (i.e. whether the model is built and its inputs/outputs are set).
3684 if not self._is_compiled:
-> 3685 raise RuntimeError(
3686 "You must compile your model before "
3687 "training/testing. "
RuntimeError: You must compile your model before training/testing. Use `model.compile(optimizer, loss)`.
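"Found 0 images belonging to 0 classes." indicates that the generator found no class subfolders under the directory it was given, for example because the path points to an empty folder or to a folder that holds the images directly. A minimal sketch for checking the layout before training, using only the standard library; `count_images_per_class` is a hypothetical helper, and the extension list is an assumption that may not match what Keras accepts:

```python
import os
import tempfile

IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png', '.bmp', '.gif')  # assumed list


def count_images_per_class(directory):
    """Return {class_folder: number_of_image_files} for one data directory."""
    if not os.path.isdir(directory):
        return {}
    counts = {}
    for name in sorted(os.listdir(directory)):
        class_dir = os.path.join(directory, name)
        if os.path.isdir(class_dir):
            counts[name] = sum(
                1 for f in os.listdir(class_dir)
                if f.lower().endswith(IMAGE_EXTENSIONS)
            )
    return counts


# Throwaway layout for demonstration; point this at 'data/train' in practice.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'cats'))
    os.makedirs(os.path.join(root, 'dogs'))
    open(os.path.join(root, 'cats', 'cat001.jpg'), 'w').close()
    demo_counts = count_images_per_class(root)
    missing_counts = count_images_per_class(os.path.join(root, 'no_such_dir'))

print(demo_counts, missing_counts)
```

The second traceback is a separate issue: it shows `model.fit` being called right after `model = Sequential()`, without the layers and the `model.compile(...)` call from the script.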
I am using this fit_generator call:
classifier.fit_generator(training_set,
                         steps_per_epoch = 250,
                         epochs = 25,
                         validation_data = test_set,
                         validation_steps = 63)
and getting this error:
File "", line 5, in
validation_steps = 63)
File "C:\Users\dell\Anaconda3\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "C:\Users\dell\Anaconda3\lib\site-packages\keras\engine\training.py", line 1418, in fit_generator
initial_epoch=initial_epoch)
File "C:\Users\dell\Anaconda3\lib\site-packages\keras\engine\training_generator.py", line 40, in fit_generator
model._make_train_function()
File "C:\Users\dell\Anaconda3\lib\site-packages\keras\engine\training.py", line 519, in _make_train_function
**self._function_kwargs)
File "C:\Users\dell\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py", line 2744, in function
return Function(inputs, outputs, updates=updates, **kwargs)
File "C:\Users\dell\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py", line 2575, in __init__
'time: %s', session_kwargs.keys())
ValueError: ('Some keys in session_kwargs are not supported at this time: %s', dict_keys(['matris']))
Can anybody help?