Gist: fchollet/0830affa1f7f19fd47b06d4cf89ed44d
'''This script goes along the blog post
"Building powerful image classification models using very little data"
from blog.keras.io.
It uses data that can be downloaded at:
https://www.kaggle.com/c/dogs-vs-cats/data
In our setup, we:
- created a data/ folder
- created train/ and validation/ subfolders inside data/
- created cats/ and dogs/ subfolders inside train/ and validation/
- put the cat pictures index 0-999 in data/train/cats
- put the cat pictures index 1000-1400 in data/validation/cats
- put the dog pictures index 12500-13499 in data/train/dogs
- put the dog pictures index 13500-13900 in data/validation/dogs
So that we have 1000 training examples for each class, and 400 validation examples for each class.
In summary, this is our directory structure:
```
data/
    train/
        dogs/
            dog001.jpg
            dog002.jpg
            ...
        cats/
            cat001.jpg
            cat002.jpg
            ...
    validation/
        dogs/
            dog001.jpg
            dog002.jpg
            ...
        cats/
            cat001.jpg
            cat002.jpg
            ...
```
'''
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras import backend as K

# dimensions of our images.
img_width, img_height = 150, 150

train_data_dir = 'data/train'
validation_data_dir = 'data/validation'
nb_train_samples = 2000
nb_validation_samples = 800
epochs = 50
batch_size = 16

if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)

model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

# this is the augmentation configuration we will use for training
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

# this is the augmentation configuration we will use for testing:
# only rescaling
test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary')

model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size)

model.save_weights('first_try.h5')
My validation data is the same as my training data; I use ImageDataGenerator for splitting.
I see that the number of validation images is smaller than the number of training images.
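The splitting the commenter mentions can be done with `ImageDataGenerator`'s `validation_split` argument plus the `subset=` parameter of `flow_from_directory`. A hedged sketch (the Keras calls are shown as comments since they need the dataset on disk; the paths and split fraction are assumptions), followed by the arithmetic that explains why the validation subset is smaller:

```python
# Sketch: one generator serves both subsets from the same directory.
#
#   datagen = ImageDataGenerator(rescale=1. / 255, validation_split=0.2)
#   train_generator = datagen.flow_from_directory(
#       'data/train', target_size=(150, 150), batch_size=16,
#       class_mode='binary', subset='training')      # ~80% of the images
#   validation_generator = datagen.flow_from_directory(
#       'data/train', target_size=(150, 150), batch_size=16,
#       class_mode='binary', subset='validation')    # remaining ~20%
#
# The validation set is smaller by construction: it is just the split fraction.
n_images, split = 2000, 0.2           # hypothetical dataset size
n_validation = round(n_images * split)
n_train = n_images - n_validation
print(n_train, n_validation)
```

Note that with this approach the validation subset passes through the same augmentation pipeline as training, which is usually undesirable; the gist avoids that by using a separate `test_datagen` with only rescaling.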
Has anyone written a small prediction script for testing the data under a test folder (if given one)?
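A minimal prediction sketch, assuming the model above has been rebuilt and `model.load_weights('first_try.h5')` called; the test-folder path is hypothetical, and the Keras I/O calls are shown as comments since they need a trained model and images on disk. The thresholding helper is real, runnable code:

```python
# Sketch of scoring every image in a flat test folder:
#
#   import os
#   import numpy as np
#   from keras.preprocessing import image
#
#   for fname in sorted(os.listdir('data/test')):
#       img = image.load_img(os.path.join('data/test', fname),
#                            target_size=(150, 150))
#       x = np.expand_dims(image.img_to_array(img) / 255.0, axis=0)  # same rescale as training
#       prob = float(model.predict(x)[0][0])
#       print(fname, label_for(prob))

def label_for(prob, threshold=0.5):
    """Map the sigmoid output to a class name. Check train_generator.class_indices
    to confirm which class is 1; alphabetically, 'dogs' maps to 1 here."""
    return 'dog' if prob > threshold else 'cat'
```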
Hello everyone,
I ran the same code and got this error:
ValueError: Error when checking input: expected conv2d_11_input to have shape (28, 28, 2) but got array with shape (150, 150, 3)
any help?
Hello
How do you use .load_weights('example.h5')? Do you have an example?
Thanks !
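A hedged sketch of the usual pattern: `load_weights` restores weights into an already-built model with the identical architecture, so the `Sequential()` stack from the script above must be rebuilt first (`build_model` below is a hypothetical helper wrapping that stack, not something in the gist):

```python
# Sketch (needs the architecture from the script above):
#
#   model = build_model()                  # hypothetical helper repeating the Sequential() stack
#   model.compile(loss='binary_crossentropy',
#                 optimizer='rmsprop',
#                 metrics=['accuracy'])
#   model.load_weights('first_try.h5')     # same file written by save_weights
#
# For prediction alone, compile() can be skipped; it is required before
# calling fit() or evaluate().
```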
hello!
I wonder if the file type must be .h5 in the line model.save_weights('first_try.h5')?
I also have the same question as @MasterWas.
Thanks !!
Hi, I generated the confusion matrix on the prediction results and got [[2500 0] [2500 0]]. I think that's not up to the mark. I'm training on dogs and cats, with 20000 training images and 5000 validation images.
Thanks
Can you tell me how you generated the confusion matrix?
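The usual tool is `sklearn.metrics.confusion_matrix`; a minimal pure-Python equivalent for the binary case is sketched below (the toy labels are made up for illustration). With the real model, you would predict over a validation generator created with `shuffle=False`, threshold the sigmoid outputs at 0.5, and compare against `validation_generator.classes`:

```python
def confusion_2x2(y_true, y_pred):
    """cm[i][j] = number of samples with true class i predicted as class j."""
    cm = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

# Toy example with made-up labels:
y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
cm = confusion_2x2(y_true, y_pred)
```

A result like [[2500 0] [2500 0]] means every sample was predicted as class 0, i.e. the model collapsed to always predicting one class, which matches the ~50% accuracy others report.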
@austinchencym
Hi
Actually I got no more than 70% accuracy when I increased the dataset size, and it doesn't look stable.
Any news on your end? How can you use one-hot encoding based on this example?
Hello, did you implement one-hot encoding in your code? If so, will you please help me out? And is it a must to use one-hot encoding in multi-class image classification?
Thank you
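One-hot encoding is not something to add by hand here: with `class_mode='categorical'`, `flow_from_directory` yields one-hot label vectors automatically, and the model's head would change to `Dense(n_classes)` + `softmax` with `categorical_crossentropy` (the 3-class setup below is a hypothetical illustration, not part of the gist). What "one-hot" means, in plain NumPy:

```python
import numpy as np

# Hypothetical multi-class variant of the generator (shown as a comment,
# since it needs a directory with one subfolder per class):
#
#   train_generator = train_datagen.flow_from_directory(
#       'data/train', target_size=(150, 150), batch_size=16,
#       class_mode='categorical')   # yields one-hot label vectors

labels = np.array([0, 2, 1])        # integer class ids for 3 classes
one_hot = np.eye(3)[labels]         # each row is a row of the identity matrix
```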
I wonder, don't we need the labels? There is no such vector as y_train being used in the code.
I have this same question. Can anyone please clarify?
Thanks
Accuracy not rising over 0.5000
I met the same problem! Did you solve it?
Hi, I am doing both Python and classification for the first time. Can anyone tell me whether it is necessary to keep the index as part of the image name, or will it work as long as I have a unique image name?
I wonder, don't we need the labels? There is no such vector as y_train being used in the code.
I have this same question. Can anyone please clarify?
Thanks
This line of code takes care of the labels (i.e. the name of the subdirectory is the label):
train_datagen.flow_from_directory
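To expand on that: `flow_from_directory` infers labels from the subdirectory names, assigning indices alphabetically, and `train_generator.class_indices` exposes the mapping. The same mapping can be reproduced without Keras (the directory names below are the gist's own):

```python
# What listdir('data/train') would return, sorted alphabetically:
class_names = sorted(['dogs', 'cats'])
class_indices = {name: i for i, name in enumerate(class_names)}
# With class_mode='binary', the generator's labels 0/1 follow this mapping,
# so there is no need for an explicit y_train vector.
```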
How can I test this code on new images?
Thank you.
Hello, excellent tutorial. While trying the code I came across this error; do you have an idea? It's the same code.
Thank you
Found 21 images belonging to 2 classes.
/home/abdou/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py:1905: UserWarning: `Model.predict_generator` is deprecated and will be removed in a future version. Please use `Model.predict`, which supports generators.
WARNING:tensorflow: Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches (in this case, 125.0 batches). You may need to use the repeat() function when building your dataset.
```
TypeError                        Traceback (most recent call last)
in ()
     24
     25
---> 26 save_bottlebeck_features()
     27 train_top_model()

in save_bottlebeck_features()
     26     generator, nb_train_samples / batch_size)
     27     np.save(open('bottleneck_features_train.npy', 'w'),
---> 28         bottleneck_features_train)
     29
     30     generator = datagen.flow_from_directory(

<__array_function__ internals> in save(*args, **kwargs)

/home/abdou/.local/lib/python3.6/site-packages/numpy/lib/npyio.py in save(file, arr, allow_pickle, fix_imports)
    527     arr = np.asanyarray(arr)
    528     format.write_array(fid, arr, allow_pickle=allow_pickle,
--> 529         pickle_kwargs=dict(fix_imports=fix_imports))
    530
    531

/home/abdou/.local/lib/python3.6/site-packages/numpy/lib/format.py in write_array(fp, array, version, allow_pickle, pickle_kwargs)
    646     """
    647     _check_version(version)
--> 648     _write_array_header(fp, header_data_from_array_1_0(array), version)
    649
    650     if array.itemsize == 0:

/home/abdou/.local/lib/python3.6/site-packages/numpy/lib/format.py in _write_array_header(fp, d, version)
    426     else:
    427         header = _wrap_header(header, version)
--> 428     fp.write(header)
    429
    430 def write_array_header_1_0(fp, d):

TypeError: write() argument must be str, not bytes
```
@diouck I think your dataset is too small for this CNN. Try classifying with more samples, and update your packages!
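Besides the dataset-size point, the TypeError itself has a mechanical fix: the file was opened in text mode, and `np.save` writes bytes, which fails under Python 3. A sketch with a stand-in array (the filename is taken from the traceback; the array contents are made up):

```python
import numpy as np

features = np.zeros((4, 8))   # stand-in for bottleneck_features_train

# Original: np.save(open('bottleneck_features_train.npy', 'w'), features)  -> TypeError
# Fix: pass the filename directly (or open the file in binary mode, 'wb'):
np.save('bottleneck_features_train.npy', features)
loaded = np.load('bottleneck_features_train.npy')
```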
Showing the directory structure as an image is so helpful, including simple examples! Thank you so much, this helped me a lot!!
Damn the Getúlio Vargas Foundation for demanding this example in a high-level public tender
page 7
we will never forget it...
I got this error; please give a bit of detail on how to solve it:

Found 0 images belonging to 0 classes.
Found 0 images belonging to 0 classes.
:70: UserWarning: `Model.fit_generator` is deprecated and will be removed in a future version. Please use `Model.fit`, which supports generators.
  model.fit_generator(
```
ValueError                       Traceback (most recent call last)
in <cell line: 70>()
     68     class_mode='binary')
     69
---> 70 model.fit_generator(
     71     train_generator,
     72     steps_per_epoch=nb_train_samples // batch_size,

2 frames
/usr/local/lib/python3.9/dist-packages/keras/preprocessing/image.py in __getitem__(self, idx)
    101     def __getitem__(self, idx):
    102         if idx >= len(self):
--> 103             raise ValueError(
    104                 "Asked to retrieve element {idx}, "
    105                 "but the Sequence "

ValueError: Asked to retrieve element 0, but the Sequence has length 0
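"Found 0 images belonging to 0 classes" usually means the directory passed to `flow_from_directory` has no class subdirectories, or the path itself is wrong, since the generator expects one subfolder per class. A small diagnostic sketch (the helper name is an illustration, not from the gist):

```python
import os

def count_class_images(data_dir):
    """Count files per class subdirectory, the way flow_from_directory sees them."""
    if not os.path.isdir(data_dir):
        raise FileNotFoundError('No such directory: ' + data_dir)
    return {cls: len(os.listdir(os.path.join(data_dir, cls)))
            for cls in sorted(os.listdir(data_dir))
            if os.path.isdir(os.path.join(data_dir, cls))}
```

If this prints an empty dict or zero counts for the path you pass to `flow_from_directory`, the directory layout is the problem, not the model.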
```
RuntimeError                     Traceback (most recent call last)
in <cell line: 54>()
     52 model = Sequential()
     53
---> 54 model.fit(
     55     train_generator,
     56     steps_per_epoch=2000,

1 frames
/usr/local/lib/python3.9/dist-packages/keras/engine/training.py in _assert_compile_was_called(self)
   3683     # (i.e. whether the model is built and its inputs/outputs are set).
   3684     if not self._is_compiled:
-> 3685         raise RuntimeError(
   3686             "You must compile your model before "
   3687             "training/testing. "

RuntimeError: You must compile your model before training/testing. Use `model.compile(optimizer, loss)`.
```
I see some samples that don't use validation data.
They only use training data and training labels.
Is validation data necessary?