GoogLeNet in Keras

Here is a Keras model of GoogLeNet (a.k.a. Inception V1). I created it by converting the GoogLeNet model from Caffe.

GoogLeNet paper:

Going deeper with convolutions.
Szegedy, Christian, et al. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.

Requirements

The code now runs with Python 3.6, Keras 2.2.4, and either Theano 1.0.4 or Tensorflow 1.14.0. You will also need to install the following:

pip install pillow numpy imageio

To switch to the Theano backend, change your ~/.keras/keras.json file to

{"epsilon": 1e-07, "floatx": "float32", "backend": "theano", "image_data_format": "channels_first"}

Or for the Tensorflow backend,

{"epsilon": 1e-07, "floatx": "float32", "backend": "tensorflow", "image_data_format": "channels_first"}

Note that in either case, the code requires the channels_first option for image_data_format.
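If you would rather not edit keras.json, the same setting can also be applied programmatically at the top of your script, before the model is built (a minimal sketch using standard Keras backend calls):

from keras import backend as K
K.set_image_data_format('channels_first')
assert K.image_data_format() == 'channels_first'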

Running the Demo (googlenet.py)

To create a GoogLeNet model, call the following from within Python:

from googlenet import create_googlenet
model = create_googlenet()

googlenet.py also contains a demo image classification. To run the demo, you will need to download the pre-trained weights and the class labels. You will also need the test image (cat.jpg). Once these are downloaded and moved to the working directory, you can run googlenet.py from the terminal:

$ python googlenet.py

which will output the predicted class label for the image.
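If you want more than the top-1 prediction, a small extension of the demo (a sketch, reusing the out and labels variables defined in googlenet.py below) prints the five highest-scoring classes from the final classifier:

top5 = np.argsort(out[2][0])[::-1][:5]  # out[2] is the main (loss3) classifier output
for idx in top5:
    print(idx, labels[idx])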

googlenet.py

from __future__ import print_function
import imageio
from PIL import Image
import numpy as np
import keras
from keras.layers import Input, Dense, Conv2D, MaxPooling2D, AveragePooling2D, ZeroPadding2D, Dropout, Flatten, Concatenate, Reshape, Activation
from keras.models import Model
from keras.regularizers import l2
from keras.optimizers import SGD
from pool_helper import PoolHelper
from lrn import LRN

if keras.backend.backend() == 'tensorflow':
    from keras import backend as K
    import tensorflow as tf
    from keras.utils.conv_utils import convert_kernel
def create_googlenet(weights_path=None):
    # creates GoogLeNet a.k.a. Inception v1 (Szegedy, 2015)
    input = Input(shape=(3, 224, 224))
    input_pad = ZeroPadding2D(padding=(3, 3))(input)
    conv1_7x7_s2 = Conv2D(64, (7,7), strides=(2,2), padding='valid', activation='relu', name='conv1/7x7_s2', kernel_regularizer=l2(0.0002))(input_pad)
    conv1_zero_pad = ZeroPadding2D(padding=(1, 1))(conv1_7x7_s2)
    pool1_helper = PoolHelper()(conv1_zero_pad)
    pool1_3x3_s2 = MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='valid', name='pool1/3x3_s2')(pool1_helper)
    pool1_norm1 = LRN(name='pool1/norm1')(pool1_3x3_s2)
    conv2_3x3_reduce = Conv2D(64, (1,1), padding='same', activation='relu', name='conv2/3x3_reduce', kernel_regularizer=l2(0.0002))(pool1_norm1)
    conv2_3x3 = Conv2D(192, (3,3), padding='same', activation='relu', name='conv2/3x3', kernel_regularizer=l2(0.0002))(conv2_3x3_reduce)
    conv2_norm2 = LRN(name='conv2/norm2')(conv2_3x3)
    conv2_zero_pad = ZeroPadding2D(padding=(1, 1))(conv2_norm2)
    pool2_helper = PoolHelper()(conv2_zero_pad)
    pool2_3x3_s2 = MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='valid', name='pool2/3x3_s2')(pool2_helper)

    inception_3a_1x1 = Conv2D(64, (1,1), padding='same', activation='relu', name='inception_3a/1x1', kernel_regularizer=l2(0.0002))(pool2_3x3_s2)
    inception_3a_3x3_reduce = Conv2D(96, (1,1), padding='same', activation='relu', name='inception_3a/3x3_reduce', kernel_regularizer=l2(0.0002))(pool2_3x3_s2)
    inception_3a_3x3_pad = ZeroPadding2D(padding=(1, 1))(inception_3a_3x3_reduce)
    inception_3a_3x3 = Conv2D(128, (3,3), padding='valid', activation='relu', name='inception_3a/3x3', kernel_regularizer=l2(0.0002))(inception_3a_3x3_pad)
    inception_3a_5x5_reduce = Conv2D(16, (1,1), padding='same', activation='relu', name='inception_3a/5x5_reduce', kernel_regularizer=l2(0.0002))(pool2_3x3_s2)
    inception_3a_5x5_pad = ZeroPadding2D(padding=(2, 2))(inception_3a_5x5_reduce)
    inception_3a_5x5 = Conv2D(32, (5,5), padding='valid', activation='relu', name='inception_3a/5x5', kernel_regularizer=l2(0.0002))(inception_3a_5x5_pad)
    inception_3a_pool = MaxPooling2D(pool_size=(3,3), strides=(1,1), padding='same', name='inception_3a/pool')(pool2_3x3_s2)
    inception_3a_pool_proj = Conv2D(32, (1,1), padding='same', activation='relu', name='inception_3a/pool_proj', kernel_regularizer=l2(0.0002))(inception_3a_pool)
    inception_3a_output = Concatenate(axis=1, name='inception_3a/output')([inception_3a_1x1,inception_3a_3x3,inception_3a_5x5,inception_3a_pool_proj])
    inception_3b_1x1 = Conv2D(128, (1,1), padding='same', activation='relu', name='inception_3b/1x1', kernel_regularizer=l2(0.0002))(inception_3a_output)
    inception_3b_3x3_reduce = Conv2D(128, (1,1), padding='same', activation='relu', name='inception_3b/3x3_reduce', kernel_regularizer=l2(0.0002))(inception_3a_output)
    inception_3b_3x3_pad = ZeroPadding2D(padding=(1, 1))(inception_3b_3x3_reduce)
    inception_3b_3x3 = Conv2D(192, (3,3), padding='valid', activation='relu', name='inception_3b/3x3', kernel_regularizer=l2(0.0002))(inception_3b_3x3_pad)
    inception_3b_5x5_reduce = Conv2D(32, (1,1), padding='same', activation='relu', name='inception_3b/5x5_reduce', kernel_regularizer=l2(0.0002))(inception_3a_output)
    inception_3b_5x5_pad = ZeroPadding2D(padding=(2, 2))(inception_3b_5x5_reduce)
    inception_3b_5x5 = Conv2D(96, (5,5), padding='valid', activation='relu', name='inception_3b/5x5', kernel_regularizer=l2(0.0002))(inception_3b_5x5_pad)
    inception_3b_pool = MaxPooling2D(pool_size=(3,3), strides=(1,1), padding='same', name='inception_3b/pool')(inception_3a_output)
    inception_3b_pool_proj = Conv2D(64, (1,1), padding='same', activation='relu', name='inception_3b/pool_proj', kernel_regularizer=l2(0.0002))(inception_3b_pool)
    inception_3b_output = Concatenate(axis=1, name='inception_3b/output')([inception_3b_1x1,inception_3b_3x3,inception_3b_5x5,inception_3b_pool_proj])

    inception_3b_output_zero_pad = ZeroPadding2D(padding=(1, 1))(inception_3b_output)
    pool3_helper = PoolHelper()(inception_3b_output_zero_pad)
    pool3_3x3_s2 = MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='valid', name='pool3/3x3_s2')(pool3_helper)

    inception_4a_1x1 = Conv2D(192, (1,1), padding='same', activation='relu', name='inception_4a/1x1', kernel_regularizer=l2(0.0002))(pool3_3x3_s2)
    inception_4a_3x3_reduce = Conv2D(96, (1,1), padding='same', activation='relu', name='inception_4a/3x3_reduce', kernel_regularizer=l2(0.0002))(pool3_3x3_s2)
    inception_4a_3x3_pad = ZeroPadding2D(padding=(1, 1))(inception_4a_3x3_reduce)
    inception_4a_3x3 = Conv2D(208, (3,3), padding='valid', activation='relu', name='inception_4a/3x3', kernel_regularizer=l2(0.0002))(inception_4a_3x3_pad)
    inception_4a_5x5_reduce = Conv2D(16, (1,1), padding='same', activation='relu', name='inception_4a/5x5_reduce', kernel_regularizer=l2(0.0002))(pool3_3x3_s2)
    inception_4a_5x5_pad = ZeroPadding2D(padding=(2, 2))(inception_4a_5x5_reduce)
    inception_4a_5x5 = Conv2D(48, (5,5), padding='valid', activation='relu', name='inception_4a/5x5', kernel_regularizer=l2(0.0002))(inception_4a_5x5_pad)
    inception_4a_pool = MaxPooling2D(pool_size=(3,3), strides=(1,1), padding='same', name='inception_4a/pool')(pool3_3x3_s2)
    inception_4a_pool_proj = Conv2D(64, (1,1), padding='same', activation='relu', name='inception_4a/pool_proj', kernel_regularizer=l2(0.0002))(inception_4a_pool)
    inception_4a_output = Concatenate(axis=1, name='inception_4a/output')([inception_4a_1x1,inception_4a_3x3,inception_4a_5x5,inception_4a_pool_proj])
    loss1_ave_pool = AveragePooling2D(pool_size=(5,5), strides=(3,3), name='loss1/ave_pool')(inception_4a_output)
    loss1_conv = Conv2D(128, (1,1), padding='same', activation='relu', name='loss1/conv', kernel_regularizer=l2(0.0002))(loss1_ave_pool)
    loss1_flat = Flatten()(loss1_conv)
    loss1_fc = Dense(1024, activation='relu', name='loss1/fc', kernel_regularizer=l2(0.0002))(loss1_flat)
    loss1_drop_fc = Dropout(rate=0.7)(loss1_fc)
    loss1_classifier = Dense(1000, name='loss1/classifier', kernel_regularizer=l2(0.0002))(loss1_drop_fc)
    loss1_classifier_act = Activation('softmax')(loss1_classifier)

    inception_4b_1x1 = Conv2D(160, (1,1), padding='same', activation='relu', name='inception_4b/1x1', kernel_regularizer=l2(0.0002))(inception_4a_output)
    inception_4b_3x3_reduce = Conv2D(112, (1,1), padding='same', activation='relu', name='inception_4b/3x3_reduce', kernel_regularizer=l2(0.0002))(inception_4a_output)
    inception_4b_3x3_pad = ZeroPadding2D(padding=(1, 1))(inception_4b_3x3_reduce)
    inception_4b_3x3 = Conv2D(224, (3,3), padding='valid', activation='relu', name='inception_4b/3x3', kernel_regularizer=l2(0.0002))(inception_4b_3x3_pad)
    inception_4b_5x5_reduce = Conv2D(24, (1,1), padding='same', activation='relu', name='inception_4b/5x5_reduce', kernel_regularizer=l2(0.0002))(inception_4a_output)
    inception_4b_5x5_pad = ZeroPadding2D(padding=(2, 2))(inception_4b_5x5_reduce)
    inception_4b_5x5 = Conv2D(64, (5,5), padding='valid', activation='relu', name='inception_4b/5x5', kernel_regularizer=l2(0.0002))(inception_4b_5x5_pad)
    inception_4b_pool = MaxPooling2D(pool_size=(3,3), strides=(1,1), padding='same', name='inception_4b/pool')(inception_4a_output)
    inception_4b_pool_proj = Conv2D(64, (1,1), padding='same', activation='relu', name='inception_4b/pool_proj', kernel_regularizer=l2(0.0002))(inception_4b_pool)
    inception_4b_output = Concatenate(axis=1, name='inception_4b/output')([inception_4b_1x1,inception_4b_3x3,inception_4b_5x5,inception_4b_pool_proj])

    inception_4c_1x1 = Conv2D(128, (1,1), padding='same', activation='relu', name='inception_4c/1x1', kernel_regularizer=l2(0.0002))(inception_4b_output)
    inception_4c_3x3_reduce = Conv2D(128, (1,1), padding='same', activation='relu', name='inception_4c/3x3_reduce', kernel_regularizer=l2(0.0002))(inception_4b_output)
    inception_4c_3x3_pad = ZeroPadding2D(padding=(1, 1))(inception_4c_3x3_reduce)
    inception_4c_3x3 = Conv2D(256, (3,3), padding='valid', activation='relu', name='inception_4c/3x3', kernel_regularizer=l2(0.0002))(inception_4c_3x3_pad)
    inception_4c_5x5_reduce = Conv2D(24, (1,1), padding='same', activation='relu', name='inception_4c/5x5_reduce', kernel_regularizer=l2(0.0002))(inception_4b_output)
    inception_4c_5x5_pad = ZeroPadding2D(padding=(2, 2))(inception_4c_5x5_reduce)
    inception_4c_5x5 = Conv2D(64, (5,5), padding='valid', activation='relu', name='inception_4c/5x5', kernel_regularizer=l2(0.0002))(inception_4c_5x5_pad)
    inception_4c_pool = MaxPooling2D(pool_size=(3,3), strides=(1,1), padding='same', name='inception_4c/pool')(inception_4b_output)
    inception_4c_pool_proj = Conv2D(64, (1,1), padding='same', activation='relu', name='inception_4c/pool_proj', kernel_regularizer=l2(0.0002))(inception_4c_pool)
    inception_4c_output = Concatenate(axis=1, name='inception_4c/output')([inception_4c_1x1,inception_4c_3x3,inception_4c_5x5,inception_4c_pool_proj])
    inception_4d_1x1 = Conv2D(112, (1,1), padding='same', activation='relu', name='inception_4d/1x1', kernel_regularizer=l2(0.0002))(inception_4c_output)
    inception_4d_3x3_reduce = Conv2D(144, (1,1), padding='same', activation='relu', name='inception_4d/3x3_reduce', kernel_regularizer=l2(0.0002))(inception_4c_output)
    inception_4d_3x3_pad = ZeroPadding2D(padding=(1, 1))(inception_4d_3x3_reduce)
    inception_4d_3x3 = Conv2D(288, (3,3), padding='valid', activation='relu', name='inception_4d/3x3', kernel_regularizer=l2(0.0002))(inception_4d_3x3_pad)
    inception_4d_5x5_reduce = Conv2D(32, (1,1), padding='same', activation='relu', name='inception_4d/5x5_reduce', kernel_regularizer=l2(0.0002))(inception_4c_output)
    inception_4d_5x5_pad = ZeroPadding2D(padding=(2, 2))(inception_4d_5x5_reduce)
    inception_4d_5x5 = Conv2D(64, (5,5), padding='valid', activation='relu', name='inception_4d/5x5', kernel_regularizer=l2(0.0002))(inception_4d_5x5_pad)
    inception_4d_pool = MaxPooling2D(pool_size=(3,3), strides=(1,1), padding='same', name='inception_4d/pool')(inception_4c_output)
    inception_4d_pool_proj = Conv2D(64, (1,1), padding='same', activation='relu', name='inception_4d/pool_proj', kernel_regularizer=l2(0.0002))(inception_4d_pool)
    inception_4d_output = Concatenate(axis=1, name='inception_4d/output')([inception_4d_1x1,inception_4d_3x3,inception_4d_5x5,inception_4d_pool_proj])

    loss2_ave_pool = AveragePooling2D(pool_size=(5,5), strides=(3,3), name='loss2/ave_pool')(inception_4d_output)
    loss2_conv = Conv2D(128, (1,1), padding='same', activation='relu', name='loss2/conv', kernel_regularizer=l2(0.0002))(loss2_ave_pool)
    loss2_flat = Flatten()(loss2_conv)
    loss2_fc = Dense(1024, activation='relu', name='loss2/fc', kernel_regularizer=l2(0.0002))(loss2_flat)
    loss2_drop_fc = Dropout(rate=0.7)(loss2_fc)
    loss2_classifier = Dense(1000, name='loss2/classifier', kernel_regularizer=l2(0.0002))(loss2_drop_fc)
    loss2_classifier_act = Activation('softmax')(loss2_classifier)

    inception_4e_1x1 = Conv2D(256, (1,1), padding='same', activation='relu', name='inception_4e/1x1', kernel_regularizer=l2(0.0002))(inception_4d_output)
    inception_4e_3x3_reduce = Conv2D(160, (1,1), padding='same', activation='relu', name='inception_4e/3x3_reduce', kernel_regularizer=l2(0.0002))(inception_4d_output)
    inception_4e_3x3_pad = ZeroPadding2D(padding=(1, 1))(inception_4e_3x3_reduce)
    inception_4e_3x3 = Conv2D(320, (3,3), padding='valid', activation='relu', name='inception_4e/3x3', kernel_regularizer=l2(0.0002))(inception_4e_3x3_pad)
    inception_4e_5x5_reduce = Conv2D(32, (1,1), padding='same', activation='relu', name='inception_4e/5x5_reduce', kernel_regularizer=l2(0.0002))(inception_4d_output)
    inception_4e_5x5_pad = ZeroPadding2D(padding=(2, 2))(inception_4e_5x5_reduce)
    inception_4e_5x5 = Conv2D(128, (5,5), padding='valid', activation='relu', name='inception_4e/5x5', kernel_regularizer=l2(0.0002))(inception_4e_5x5_pad)
    inception_4e_pool = MaxPooling2D(pool_size=(3,3), strides=(1,1), padding='same', name='inception_4e/pool')(inception_4d_output)
    inception_4e_pool_proj = Conv2D(128, (1,1), padding='same', activation='relu', name='inception_4e/pool_proj', kernel_regularizer=l2(0.0002))(inception_4e_pool)
    inception_4e_output = Concatenate(axis=1, name='inception_4e/output')([inception_4e_1x1,inception_4e_3x3,inception_4e_5x5,inception_4e_pool_proj])
    inception_4e_output_zero_pad = ZeroPadding2D(padding=(1, 1))(inception_4e_output)
    pool4_helper = PoolHelper()(inception_4e_output_zero_pad)
    pool4_3x3_s2 = MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='valid', name='pool4/3x3_s2')(pool4_helper)

    inception_5a_1x1 = Conv2D(256, (1,1), padding='same', activation='relu', name='inception_5a/1x1', kernel_regularizer=l2(0.0002))(pool4_3x3_s2)
    inception_5a_3x3_reduce = Conv2D(160, (1,1), padding='same', activation='relu', name='inception_5a/3x3_reduce', kernel_regularizer=l2(0.0002))(pool4_3x3_s2)
    inception_5a_3x3_pad = ZeroPadding2D(padding=(1, 1))(inception_5a_3x3_reduce)
    inception_5a_3x3 = Conv2D(320, (3,3), padding='valid', activation='relu', name='inception_5a/3x3', kernel_regularizer=l2(0.0002))(inception_5a_3x3_pad)
    inception_5a_5x5_reduce = Conv2D(32, (1,1), padding='same', activation='relu', name='inception_5a/5x5_reduce', kernel_regularizer=l2(0.0002))(pool4_3x3_s2)
    inception_5a_5x5_pad = ZeroPadding2D(padding=(2, 2))(inception_5a_5x5_reduce)
    inception_5a_5x5 = Conv2D(128, (5,5), padding='valid', activation='relu', name='inception_5a/5x5', kernel_regularizer=l2(0.0002))(inception_5a_5x5_pad)
    inception_5a_pool = MaxPooling2D(pool_size=(3,3), strides=(1,1), padding='same', name='inception_5a/pool')(pool4_3x3_s2)
    inception_5a_pool_proj = Conv2D(128, (1,1), padding='same', activation='relu', name='inception_5a/pool_proj', kernel_regularizer=l2(0.0002))(inception_5a_pool)
    inception_5a_output = Concatenate(axis=1, name='inception_5a/output')([inception_5a_1x1,inception_5a_3x3,inception_5a_5x5,inception_5a_pool_proj])

    inception_5b_1x1 = Conv2D(384, (1,1), padding='same', activation='relu', name='inception_5b/1x1', kernel_regularizer=l2(0.0002))(inception_5a_output)
    inception_5b_3x3_reduce = Conv2D(192, (1,1), padding='same', activation='relu', name='inception_5b/3x3_reduce', kernel_regularizer=l2(0.0002))(inception_5a_output)
    inception_5b_3x3_pad = ZeroPadding2D(padding=(1, 1))(inception_5b_3x3_reduce)
    inception_5b_3x3 = Conv2D(384, (3,3), padding='valid', activation='relu', name='inception_5b/3x3', kernel_regularizer=l2(0.0002))(inception_5b_3x3_pad)
    inception_5b_5x5_reduce = Conv2D(48, (1,1), padding='same', activation='relu', name='inception_5b/5x5_reduce', kernel_regularizer=l2(0.0002))(inception_5a_output)
    inception_5b_5x5_pad = ZeroPadding2D(padding=(2, 2))(inception_5b_5x5_reduce)
    inception_5b_5x5 = Conv2D(128, (5,5), padding='valid', activation='relu', name='inception_5b/5x5', kernel_regularizer=l2(0.0002))(inception_5b_5x5_pad)
    inception_5b_pool = MaxPooling2D(pool_size=(3,3), strides=(1,1), padding='same', name='inception_5b/pool')(inception_5a_output)
    inception_5b_pool_proj = Conv2D(128, (1,1), padding='same', activation='relu', name='inception_5b/pool_proj', kernel_regularizer=l2(0.0002))(inception_5b_pool)
    inception_5b_output = Concatenate(axis=1, name='inception_5b/output')([inception_5b_1x1,inception_5b_3x3,inception_5b_5x5,inception_5b_pool_proj])
    pool5_7x7_s1 = AveragePooling2D(pool_size=(7,7), strides=(1,1), name='pool5/7x7_s2')(inception_5b_output)
    loss3_flat = Flatten()(pool5_7x7_s1)
    pool5_drop_7x7_s1 = Dropout(rate=0.4)(loss3_flat)
    loss3_classifier = Dense(1000, name='loss3/classifier', kernel_regularizer=l2(0.0002))(pool5_drop_7x7_s1)
    loss3_classifier_act = Activation('softmax', name='prob')(loss3_classifier)

    googlenet = Model(inputs=input, outputs=[loss1_classifier_act,loss2_classifier_act,loss3_classifier_act])

    if weights_path:
        googlenet.load_weights(weights_path)

    if keras.backend.backend() == 'tensorflow':
        # convert the convolutional kernels for tensorflow
        ops = []
        for layer in googlenet.layers:
            if layer.__class__.__name__ == 'Conv2D':
                original_w = K.get_value(layer.kernel)
                converted_w = convert_kernel(original_w)
                ops.append(tf.assign(layer.kernel, converted_w).op)
        K.get_session().run(ops)

    return googlenet
if __name__ == "__main__":
    img = imageio.imread('cat.jpg', pilmode='RGB')
    img = np.array(Image.fromarray(img).resize((224, 224))).astype(np.float32)
    # subtract the ImageNet mean from each (R, G, B) channel
    img[:, :, 0] -= 123.68
    img[:, :, 1] -= 116.779
    img[:, :, 2] -= 103.939
    # convert RGB to BGR (the weights were converted from Caffe), then to channels_first
    img[:,:,[0,1,2]] = img[:,:,[2,1,0]]
    img = img.transpose((2, 0, 1))
    img = np.expand_dims(img, axis=0)

    # Test pretrained model
    model = create_googlenet('googlenet_weights.h5')
    sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
    model.compile(optimizer=sgd, loss='categorical_crossentropy')
    out = model.predict(img)  # note: the model has three outputs
    labels = np.loadtxt('synset_words.txt', str, delimiter='\t')
    predicted_label = np.argmax(out[2])
    predicted_class_name = labels[predicted_label]
    print('Predicted Class: ', predicted_label, ', Class Name: ', predicted_class_name)
lrn.py

from keras.layers.core import Layer
from keras import backend as K

if K.backend() == 'theano':
    import theano.tensor as T
elif K.backend() == 'tensorflow':
    import tensorflow as tf
else:
    raise NotImplementedError


class LRN(Layer):

    def __init__(self, alpha=0.0001, k=1, beta=0.75, n=5, **kwargs):
        self.alpha = alpha
        self.k = k
        self.beta = beta
        self.n = n
        super(LRN, self).__init__(**kwargs)

    def call(self, x, mask=None):
        b, ch, r, c = x.shape
        half_n = self.n // 2  # half the local region
        input_sqr = K.square(x)  # square the input
        if K.backend() == 'theano':
            # make an empty tensor with zero pads along channel dimension
            zeros = T.alloc(0., b, ch + 2*half_n, r, c)
            # set the center to be the squared input
            input_sqr = T.set_subtensor(zeros[:, half_n:half_n+ch, :, :], input_sqr)
        else:
            input_sqr = tf.pad(input_sqr, [[0, 0], [half_n, half_n], [0, 0], [0, 0]])
        scale = self.k  # offset for the scale
        norm_alpha = self.alpha / self.n  # normalized alpha
        for i in range(self.n):
            scale += norm_alpha * input_sqr[:, i:i+ch, :, :]
        scale = scale ** self.beta
        x = x / scale
        return x

    def get_config(self):
        config = {"alpha": self.alpha,
                  "k": self.k,
                  "beta": self.beta,
                  "n": self.n}
        base_config = super(LRN, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))
pool_helper.py

from keras.layers.core import Layer


class PoolHelper(Layer):

    def __init__(self, **kwargs):
        super(PoolHelper, self).__init__(**kwargs)

    def call(self, x, mask=None):
        # drop the first row and column to mimic Caffe's pooling offsets
        return x[:,:,1:,1:]

    def get_config(self):
        config = {}
        base_config = super(PoolHelper, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))
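
A side note, not part of the gist: if you save the entire model with model.save(...) rather than only the weights, the two custom layers have to be registered when reloading. A minimal sketch, where 'googlenet_full.h5' is a hypothetical file name:

from keras.models import load_model
from pool_helper import PoolHelper
from lrn import LRN

model = load_model('googlenet_full.h5', custom_objects={'PoolHelper': PoolHelper, 'LRN': LRN})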
@rikkuporta rikkuporta commented Jul 7, 2016

Hi!

I changed the last layers to Dense(8) instead of Dense(1000).
Now I am trying to retrain the network with

ghist = googlemodel.fit_generator(
        get_train_gen(224,64),
        samples_per_epoch=4416,
        nb_epoch=30,
        validation_data=get_val_gen(224),
        nb_val_samples=1152)

but I get an Error Message:
Exception: The model expects 3 input arrays, but only received one array. Found: array with shape (64, 8).
[64 = BatchSize, 8=Classes]

Could you help me or give me a hint?

@joelouismarino joelouismarino commented Jul 11, 2016

The GoogLeNet model requires three output vectors, one for each of the classifiers. You can see this from the line

googlenet = Model(input=input, output=[loss1_classifier_act,loss2_classifier_act,loss3_classifier_act])

in googlenet.py. In order to train GoogLeNet in Keras, you need to feed three copies of your labels into the model. I'm not sure what your get_train_gen() function is doing, but it should be returning an ImageDataGenerator object. If you have X_train and Y_train and a generator datagen defined using

datagen = ImageDataGenerator()
datagen.fit(X_train)

then somewhere in get_train_gen(), you should be running something like

datagen.flow(X_train,Y_train, batch_size=64).

Something similar should be done in your get_val_gen() function. Now, I'm not sure you can use the fit_generator method with GoogLeNet. I would instead suggest using a for loop with the ImageDataGenerator flow method, as shown here. The three outputs come in when using one of the model's training methods, such as train_on_batch:

loss = model.train_on_batch(X_batch, [Y_batch,Y_batch,Y_batch]).

This would look something like:

for e in range(nb_epoch):
    batches = 0
    for X_batch, Y_batch in datagen.flow(X_train, Y_train, batch_size=64):
        loss = model.train_on_batch(X_batch, [Y_batch,Y_batch,Y_batch]) # note the three outputs
        batches += 1
        if batches >= len(X_train) / 64:
            # we need to break the loop by hand because
            # the generator loops indefinitely
            break
@rikkuporta rikkuporta commented Jul 14, 2016

Thank you for your answer!
Now I understand, but unfortunately I use flow_from_directory, which reads from a directory. Do you know of a way to handle this?
I also tried to use the flow method, but this also seems to fail with the error X (images tensor) and y (labels) should have the same length. Found: X.shape = (17999, 3, 224, 224), y.shape = (3, 17999). I'm not sure what to make of this message.

This is the code:

def get_train_gen_google(batch):
    (X, Y) = load_data('training')

    datagen = ImageDataGenerator(
        rescale=1. / 255,
        shear_range=0.2,
        zoom_range=0.2)

    datagen.fit(X)

    datagen = datagen.flow(X, [Y, Y, Y], batch_size=batch)

    return datagen
def load_data(what):
    file = pp + '/dataset_2/'+what+'.txt'
    labels = list()
    data = list()
    with open(file, 'r') as f:
        reader = csv.reader(f, delimiter=';')
        for row in reader:
            labels.append(row[1])

            i = imread(pp+'/dataset_2/'+what+'/'+row[0], mode='RGB')
            img = imresize(i, (224, 224)).astype(np.float32)
            img[:, :, [0, 1, 2]] = img[:, :, [2, 1, 0]]
            img = img.transpose((2, 0, 1))
            nimg = np.asarray(img, dtype="uint8")
            data.append(nimg)
    Y = np.asarray(labels)
    Xb = np.asarray(data)
    X = np.array(Xb)
    return (X,Y)

Edit:
I also tried transposing, but then I get object of type 'ImageDataGenerator' has no len() error

@joelouismarino joelouismarino commented Jul 27, 2016

My apologies! I had confused ImageDataGenerator's flow method with the model's fit method. I updated my previous comment to reflect this.

@jsalbert jsalbert commented Oct 6, 2016

Hi, I have this error when creating the net:
in create_googlenet
mode='concat', concat_axis=1, name='inception_3a/output')
File "/imatge/ajimenez/workspace/ITR/keras_env/local/lib/python2.7/site-packages/keras/engine/topology.py", line 1528, in merge
name=name)
File "/imatge/ajimenez/workspace/ITR/keras_env/local/lib/python2.7/site-packages/keras/engine/topology.py", line 1186, in init
node_indices, tensor_indices)
File "/imatge/ajimenez/workspace/ITR/keras_env/local/lib/python2.7/site-packages/keras/engine/topology.py", line 1253, in _arguments_validation
'Layer shapes: %s' % (input_shapes))

Exception: "concat" mode can only merge layers with matching output shapes except for the concat axis. Layer shapes: [(None, 1, 28, 64), (None, 1, 28, 128), (None, 1, 28, 32), (None, 1, 28, 32)]

Could you help me to solve it?

@leviswind leviswind commented Oct 25, 2016

The configuration of your Keras is not correct; follow the instructions here.
In "your user folder"/.keras/keras.json, the default setting is:
{
"image_dim_ordering": "tf",
"epsilon": 1e-07,
"floatx": "float32",
"backend": "tensorflow"
}
Just set "backend" to "theano" and set "image_dim_ordering" to "th".

@amanrana20 amanrana20 commented Oct 29, 2016

Can you please tell me what ZeroPadding2D() and PoolHelper() are doing?
Thanks

@JGuillaumin JGuillaumin commented Dec 21, 2016

This is explained in this post: http://joelouismarino.github.io/blog_posts/blog_googlenet_keras.html

ZeroPadding2D: add zeros around feature maps after convolution layers.
PoolHelper: remove the first row and column.
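
As an illustration (a sketch not from the original comment, using GoogLeNet's conv1/pool1 sizes): Caffe computes pooling output sizes with ceil, while Keras' 'valid' pooling uses floor, so padding with zeros and then dropping the first row and column reproduces Caffe's result.

import math

h = 112                                    # conv1 output size for a 224x224 input
caffe = math.ceil((h - 3) / 2) + 1         # Caffe ceil-mode 3x3/2 pooling -> 56
keras_valid = math.floor((h - 3) / 2) + 1  # plain Keras 'valid' pooling -> 55
# ZeroPadding2D(1) gives 114, PoolHelper drops the first row/column -> 113, then pool:
workaround = math.floor((113 - 3) / 2) + 1  # -> 56, matches Caffe
print(caffe, keras_valid, workaround)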

@zaid478 zaid478 commented Jan 13, 2017

Can anyone tell me how to perform a localization task with this model? I want bounding-box parameters for all the objects in the image. Can anyone help, please?

@Miladiouss Miladiouss commented Mar 27, 2017

When I run this code, I get the following error:

.../googlenet_custom_layers.py in call(self, x, mask)
     12 
     13     def call(self, x, mask=None):
---> 14         b, ch, r, c = x.shape
     15         half_n = self.n // 2 # half the local region
     16         input_sqr = T.sqr(x) # square the input

AttributeError: 'Tensor' object has no attribute 'shape'

What should I do?

@mdda mdda commented Apr 19, 2017

Would you be open to someone (like me, for instance) trying to get this model into https://github.com/fchollet/deep-learning-models (which is MIT licensed)?

Like you, I think the Googlenet original model has a nice mix of low-parameters vs. performance (and FLOPs), so it would be good to have it in the 'standard library'.

@ernest-s ernest-s commented May 10, 2017

Can you please let me know how to use this model for new dataset with 16 classes?

@infiniteperplexity infiniteperplexity commented Feb 13, 2018

I get TypeError: ('The following error happened while compiling the node', DotModulo(A, s, m, A2, s2, m2), '\n', "can't concat bytes to str")

@Mohamedsabry109 Mohamedsabry109 commented Oct 7, 2018

This code gives different outputs when running with the Theano backend and the TensorFlow backend!

@akshay-varshney akshay-varshney commented Feb 14, 2019

ValueError: Improper config format: {'l2': 0.00019999999494757503, 'name': 'WeightRegularizer', 'l1': 0.0}
I am getting this error; please help me resolve it.

@williamsnick606 williamsnick606 commented Feb 16, 2019

I tried to run this and it gave me endless errors. I think it might be due to version issues. Which versions of libraries did you use?

@mamangxtc mamangxtc commented Feb 27, 2019

Hi, I'm new to Python and getting this error while trying to run the code:

model = create_googlenet('googlenet_weights.h5')
TypeError: 'module' object is not callable

What should I do? Please help.
I'm using Python 3.6 and Keras 2.2.4 in Visual Studio 2017.

@m-dawoud m-dawoud commented Apr 12, 2019

How can I retrain using this code and weights? Urgent HELP needed please!

@yw3379 yw3379 commented Jun 4, 2019

Hi I get the error when I run the code:
theano.tensor.var.AsTensorError: ('Cannot convert Tensor("pool1/3x3_s2/MaxPool:0", shape=(?, 1, 56, 63), dtype=float32) to TensorType', <class 'tensorflow.python.framework.ops.Tensor'>)
what does it mean?plz help
I am using python 3.6 and keras2.2.4 in pycharm

@joelouismarino joelouismarino commented Jun 4, 2019

I've updated the Gist to work with the latest versions of Keras and Theano in Python 3.6. I have confirmed that this runs on the CPU.

@GianniDanese GianniDanese commented Jun 16, 2019

Hi I get the error when I run the code:
theano.tensor.var.AsTensorError: ('Cannot convert Tensor("pool1/3x3_s2/MaxPool:0", shape=(?, 1, 56, 63), dtype=float32) to TensorType', <class 'tensorflow.python.framework.ops.Tensor'>)
what does it mean?plz help
I am using python 3.6 and keras2.2.4 in pycharm

I have the same problem

@joelouismarino joelouismarino commented Jun 16, 2019

@yw3379 @GianniDanese it looks like its trying to convert between Theano and Tensorflow tensors. Currently, the LRN layer only works with Theano. Are you using the Theano backend in Keras?

@GianniDanese GianniDanese commented Jun 16, 2019

@yw3379 @GianniDanese it looks like its trying to convert between Theano and Tensorflow tensors. Currently, the LRN layer only works with Theano. Are you using the Theano backend in Keras?

Yes, now. I've added the following lines at the start of the code:

import os
os.environ['KERAS_BACKEND'] = 'theano'

But I get the error when I run the code:

ValueError: A Concatenate layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(None, 1, 28, 64), (None, 1, 28, 128), (None, 1, 28, 32), (None, 1, 28, 32)]

@joelouismarino joelouismarino commented Jun 16, 2019

Are you resizing your input image to (1, 3, 224, 224)?

@GianniDanese GianniDanese commented Jun 16, 2019

Are you resizing your input image to (1, 3, 224, 224)?

Yes, I haven't changed the code.

img = imageio.imread('cat.jpg', pilmode='RGB')
img = np.array(Image.fromarray(img).resize((224, 224))).astype(np.float32)

That's correct, right? This performs the resize.

@joelouismarino joelouismarino commented Jun 16, 2019

Yes, that's correct. I did some testing, and your error is actually coming from your configuration of channels. Theano uses channels as the first dimension, whereas Tensorflow uses channels as the last dimension. You can change this using the image_data_format entry in your keras.json file in the ~/.keras directory. It should look like the following:

{"epsilon": 1e-07, "floatx": "float32", "backend": "theano", "image_data_format": "channels_first"}
@GianniDanese GianniDanese commented Jun 16, 2019

Yes, that's correct. I did some testing, and your error is actually coming from your configuration of channels. Theano uses channels as the first dimension, whereas Tensorflow uses channels as the last dimension. You can change this using the image_data_format entry in your keras.json file in the ~/.keras directory. It should look like the following:

{"epsilon": 1e-07, "floatx": "float32", "backend": "theano", "image_data_format": "channels_first"}

I'm sorry, I did as you told me, but now this new error appears:

OSError: Unable to open file (unable to open file: name = 'googlenet_weights.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

@joelouismarino joelouismarino commented Jun 16, 2019

You will need to download the weights (see the description of the Gist) and move them to your working directory. The main script calls

model = create_googlenet('googlenet_weights.h5')

which assumes the weights file is in the same directory. If you don't want to use a pre-trained model, you can instead just call

model = create_googlenet()
@GianniDanese GianniDanese commented Jun 16, 2019

You will need to download the weights (see the description of the Gist) and move them to your working directory. The main script calls

model = create_googlenet('googlenet_weights.h5')

which assumes the weights file is in the same directory. If you don't want to use a pre-trained model, you can instead just call

model = create_googlenet()

Sorry for the trouble, I'm a beginner with these things. thank you so much!

@joelouismarino joelouismarino commented Jun 16, 2019

No problem; I'm happy to help. Glad you're getting some use out of it!

@joelouismarino joelouismarino commented Jun 18, 2019

@utkarsh1148 you're getting the same error as above. This is due to not changing the image_data_format field in the ~/.keras/keras.json file to channels_first. It is set to channels_last by default. See the readme.md file for more details.

@utkarsh1148 utkarsh1148 commented Jun 18, 2019

Using Theano backend.
WARNING (theano.configdefaults): g++ not available, if using conda: conda install m2w64-toolchain

Warning (from warnings module):
File "C:\Users\Utkarsh\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\configdefaults.py", line 560
warnings.warn("DeprecationWarning: there is no c++ compiler."
UserWarning: DeprecationWarning: there is no c++ compiler.This is deprecated and with Theano 0.11 a c++ compiler will be mandatory
WARNING (theano.configdefaults): g++ not detected ! Theano will be unable to execute optimized C-implementations (for both CPU and GPU) and will default to Python implementations. Performance will be severely degraded. To remove this warning, set Theano flags cxx to an empty string.
WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
ERROR (theano.gof.opt): Optimization failure due to: local_abstractconv_check
ERROR (theano.gof.opt): node: AbstractConv2d{convdim=2, border_mode='half', subsample=(2, 2), filter_flip=True, imshp=(None, 3, 224, 224), kshp=(64, 3, 7, 7), filter_dilation=(1, 1), num_groups=1, unshared=False}(/input_1, InplaceDimShuffle{3,2,0,1}.0)
ERROR (theano.gof.opt): TRACEBACK:
ERROR (theano.gof.opt): Traceback (most recent call last):
File "C:\Users\Utkarsh\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\gof\opt.py", line 2034, in process_node
replacements = lopt.transform(node)
File "C:\Users\Utkarsh\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\tensor\nnet\opt.py", line 500, in local_abstractconv_check
node.op.class.name)
theano.gof.opt.LocalMetaOptimizerSkipAssertionError: AbstractConv2d Theano optimization failed: there is no implementation available supporting the requested options. Did you exclude both "conv_dnn" and "conv_gemm" from the optimizer? If on GPU, is cuDNN available and does the GPU support it? If on CPU, do you have a BLAS library installed Theano can link against? On the CPU we do not support float16.

Traceback (most recent call last):
File "E:/DL/google_net.py", line 164, in
out = model.predict(img) # note: the model has three outputs
File "C:\Users\Utkarsh\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\training.py", line 1164, in predict
self._make_predict_function()
File "C:\Users\Utkarsh\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\training.py", line 554, in _make_predict_function
**kwargs)
File "C:\Users\Utkarsh\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\backend\theano_backend.py", line 1397, in function
return Function(inputs, outputs, updates=updates, **kwargs)
File "C:\Users\Utkarsh\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\backend\theano_backend.py", line 1383, in init
**kwargs)
File "C:\Users\Utkarsh\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\compile\function.py", line 317, in function
output_keys=output_keys)
File "C:\Users\Utkarsh\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\compile\pfunc.py", line 486, in pfunc
output_keys=output_keys)
File "C:\Users\Utkarsh\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\compile\function_module.py", line 1839, in orig_function
name=name)
File "C:\Users\Utkarsh\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\compile\function_module.py", line 1519, in init
optimizer_profile = optimizer(fgraph)
File "C:\Users\Utkarsh\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\gof\opt.py", line 108, in call
return self.optimize(fgraph)
File "C:\Users\Utkarsh\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\gof\opt.py", line 97, in optimize
ret = self.apply(fgraph, *args, **kwargs)
File "C:\Users\Utkarsh\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\gof\opt.py", line 251, in apply
sub_prof = optimizer.optimize(fgraph)
File "C:\Users\Utkarsh\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\gof\opt.py", line 97, in optimize
ret = self.apply(fgraph, *args, **kwargs)
File "C:\Users\Utkarsh\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\gof\opt.py", line 2143, in apply
nb += self.process_node(fgraph, node)
File "C:\Users\Utkarsh\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\gof\opt.py", line 2039, in process_node
lopt, node)
File "C:\Users\Utkarsh\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\gof\opt.py", line 1933, in warn_inplace
return NavigatorOptimizer.warn(exc, nav, repl_pairs, local_opt, node)
File "C:\Users\Utkarsh\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\gof\opt.py", line 1919, in warn
raise exc
File "C:\Users\Utkarsh\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\gof\opt.py", line 2034, in process_node
replacements = lopt.transform(node)
File "C:\Users\Utkarsh\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\tensor\nnet\opt.py", line 500, in local_abstractconv_check
node.op.class.name)
theano.gof.opt.LocalMetaOptimizerSkipAssertionError: AbstractConv2d Theano optimization failed: there is no implementation available supporting the requested options. Did you exclude both "conv_dnn" and "conv_gemm" from the optimizer? If on GPU, is cuDNN available and does the GPU support it? If on CPU, do you have a BLAS library installed Theano can link against? On the CPU we do not support float16.

pls help

@joelouismarino joelouismarino commented Jun 18, 2019

This looks to be an issue with Theano. To test things, I would try running on the CPU. Make sure you're using float32 (again, see readme.md). If that still doesn't work, I would go back and check your installation here.

@utkarsh1148 utkarsh1148 commented Jun 19, 2019

@joelouismarino joelouismarino commented Jun 19, 2019

Did you follow the installation instructions for Theano? First, install Anaconda. Then install the requirements:

conda install numpy scipy mkl

(this will install mkl, which is a BLAS library). Then install Theano:

conda install theano
@Flamingwizard4 Flamingwizard4 commented Jul 4, 2019

Hi so I am trying to implement GoogleNet for 6 classes and have added the following before each activation layer:

E.g.

loss1_classifier = Dense(1000, name='loss1/classifier', kernel_regularizer=l2(0.0002))(loss1_drop_fc)
loss1_classifier_final = Dense(6, name='loss1/classifier_final', kernel_regularizer=l2(0.0002))(loss1_classifier)
loss1_classifier_act = Activation('softmax')(loss1_classifier_final)

However, I am getting this error:

ValueError: Error when checking target: expected activation_1 to have shape (6,) but got array with shape (1,)

Is this a problem with my labels? Do I have to reformat them so the network can properly compare its predicted activations or is it something else?

Thanks

@Flamingwizard4 Flamingwizard4 commented Jul 4, 2019


Nevermind, I just had to use sparse_categorical_crossentropy :)
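
For reference, a minimal sketch of what that looks like with this three-output model (the Dense(6) heads are from the comment above; X_train and y_train are placeholder arrays, and the 0.3 auxiliary loss weights are an assumption following the GoogLeNet paper):

model.compile(optimizer='sgd',
              loss='sparse_categorical_crossentropy',
              loss_weights=[0.3, 0.3, 1.0])
# y_train holds integer class labels; pass it once per output head:
model.fit(X_train, [y_train, y_train, y_train], batch_size=64, epochs=10)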

@swghosh swghosh commented Jul 30, 2019

After a lot of trial and error, I could set up the LRN layer even with the TensorFlow backend. The main obstacle is that the latest Keras API lacks an LRN layer. Also, the keras.backend.* functions didn't help much and only produced incomplete-shape errors.

Setting the batch_input_shape for the old LRN2D class also does not appear to work, and produces the same error "Cannot convert a partially known TensorShape to a Tensor".

The workaround is actually pretty simple and is based on wrapping the tf.nn.local_response_normalization function in keras.layers.Lambda.

googlenet_custom_layers.py

from keras.layers.core import Layer, Lambda
import tensorflow as tf

# wraps up the tf.nn.local_response_normalisation
# into the keras.layers.Lambda so
# we can have a custom keras layer as
# a class that will perform LRN ops     
class LRN(Lambda):

    def __init__(self, alpha=0.0001, beta=0.75, depth_radius=5, **kwargs):
        # using parameter defaults as per GoogLeNet
        params = {
            "alpha": alpha,
            "beta": beta,
            "depth_radius": depth_radius
        }
        # construct a function for use with Keras Lambda
        lrn_fn = lambda inputs: tf.nn.local_response_normalization(inputs, **params)

        # pass the function to Keras Lambda
        return super().__init__(lrn_fn, **kwargs)

# this layer is also required by GoogLeNet (same as above)
class PoolHelper(Layer):
    
    def __init__(self, **kwargs):
        super(PoolHelper, self).__init__(**kwargs)
    
    def call(self, x, mask=None):
        return x[:,:,1:,1:]
    
    def get_config(self):
        config = {}
        base_config = super(PoolHelper, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

Hope this helps!

@joelouismarino joelouismarino commented Aug 1, 2019

@swghosh thanks! I looked into wrapping Tensorflow's LRN layer, but unfortunately, there are some slight differences in how things are implemented. However, you motivated me enough to spend the time implementing the LRN layer with Tensorflow-compatible operations. I also had to make some modifications to googlenet.py because 1) padding for convolutions is somewhat inconsistent across different backends, and 2) Tensorflow uses rotated convolutional kernels relative to Theano.

I've updated the gist with Tensorflow compatibility, and I've confirmed that the outputs are consistent across backends.

@swghosh swghosh commented Aug 3, 2019

Thanks a lot @joelouismarino, much appreciated!

@NilakshanKunananthaseelan NilakshanKunananthaseelan commented Aug 5, 2019

Why is LRN used here? The paper doesn't mention such a function. I got an output filter count of 63 instead of 64.

@joelouismarino joelouismarino commented Aug 5, 2019

Look at Figure 3 from Szegedy et al.. Though it's not explained in the text, they do indeed use local response normalization (twice).

@NilakshanKunananthaseelan NilakshanKunananthaseelan commented Aug 7, 2019

Hi,
To work with the TensorFlow backend and the channels_last convention, I have modified the three scripts a little. Is there any weight file available that was trained using the channels_last convention, or any way to load transposed weights into the model?
I got the following error. It seems like the weight dimensions are swapped.
ValueError: Dimension 2 in both shapes must be equal, but are 3 and 64. Shapes are [7,7,3,64] and [7,7,64,3]. for 'Assign' (op: 'Assign') with input shapes: [7,7,3,64], [7,7,64,3].

@joelouismarino joelouismarino commented Aug 7, 2019

Hi @NilakshanKunananthaseelan yeah you will have to use numpy's transpose function to change the dimension order in the convolutional weights from channels first to channels last. I assume you've already found the other changes: changing the input image dimensions, the concatenation axis in Concatenate, the pool_helper axes, and the axis along which the LRN is applied. If you really need this functionality, it's not too bad to implement, but the code already runs with the tensorflow backend by setting channels_first in your keras.json file (see the README).

@swghosh swghosh commented Aug 7, 2019

Hello @joelouismarino, there appear to be certain errors with the model architecture (not sure though why or how).
I keep receiving this error just after the first LRN layer (pool1/norm1).

The error is shown below.

conv2_3x3_reduce = Conv2D(64, (1,1), padding='same', activation='relu', name='conv2/3x3_reduce', kernel_regularizer=l2(0.0002))(pool1_norm1)
- ValueError: Depth of input (63) is not a multiple of input depth of filter (64) for 'conv2/3x3_reduce/convolution' (op: 'Conv2D') with input shapes: [?,1,56,63], [1,1,64,64].

Any help in this regard will be appreciated. Thanks in advance.

@joelouismarino joelouismarino commented Aug 7, 2019

Hi @swghosh, you currently have keras running with convolutions performed with "image_data_format": "channels_first". However, the code is currently set up to run with "image_data_format": "channels_last". The README document above explains how to change this (note: it works with either backend).

@swghosh swghosh commented Aug 8, 2019

The model appears to be working fine with "image_data_format": "channels_last" and no such errors!
Thanks for all the help!

@20130353 20130353 commented Nov 11, 2019

Hi guys,
I downloaded your code and added the following lines:

img_rows, img_cols = 224, 224
if K.image_data_format() == 'channels_first':
    input_shape = (3, img_rows, img_cols)
else:
    input_shape = (img_rows, img_cols, 3)

but it raised ValueError: number of input channels does not match corresponding dimension of filter, 63 != 64.

The detail is:
File "D:/workspace/ME/CNN_test/googlenet/googlenet.py", line 280, in
model = create_googlenet('googlenet_weights.h5')
File "D:/workspace/ME/CNN_test/googlenet/googlenet.py", line 40, in create_googlenet
kernel_regularizer=l2(0.0002))(pool1_norm1)
File "C:\Users\smx\Anaconda3\envs\tfg1.12p3.6\lib\site-packages\keras\engine\base_layer.py", line 457, in call
output = self.call(inputs, **kwargs)
File "C:\Users\smx\Anaconda3\envs\tfg1.12p3.6\lib\site-packages\keras\layers\convolutional.py", line 171, in call
dilation_rate=self.dilation_rate)
File "C:\Users\smx\Anaconda3\envs\tfg1.12p3.6\lib\site-packages\keras\backend\tensorflow_backend.py", line 3650, in conv2d
data_format=tf_data_format)
File "C:\Users\smx\Anaconda3\envs\tfg1.12p3.6\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 779, in convolution
data_format=data_format)
File "C:\Users\smx\Anaconda3\envs\tfg1.12p3.6\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 839, in init
filter_shape[num_spatial_dims]))
ValueError: number of input channels does not match corresponding dimension of filter, 63 != 64

Platform is Win10, Python 3.6, tensorflow-gpu 1.12.

could you help me? thanks a lot!

@swghosh swghosh commented Nov 11, 2019


You can edit the keras configuration file located at C:\Users\<username>\.keras\keras.json (<user_home_directory>/.keras/keras.json in Linux/UNIX) to use "channels_first" instead of "channels_last" under "image_data_format".
I hope that it will resolve the problem.

@ShayanRezvaninia ShayanRezvaninia commented Nov 30, 2019

Hi,
I changed the last layers to Dense(4).
I want to use the pretrained weights (googlenet_weights.h5) for all layers except the last layer.
How can I do that?

I get this error
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimension 1 in both shapes must be equal, but are 4 and 1000. Shapes are [1024,4] and [1024,1000]. for 'Assign_126' (op: 'Assign') with input shapes: [1024,4], [1024,1000].
Thanks.

@swghosh swghosh commented Dec 1, 2019

Are you trying to load_weights('googlenet_weights.h5') after adding the Dense(4) layer to your model?

@joelouismarino joelouismarino commented Dec 4, 2019

Hi @ShayanRezvaninia, you have to create the full googlenet model to load the weights. To use a different output size, you will need to create separate layers that take the corresponding layers in googlenet as input, then create a new model with those layers as the outputs. You will have to make a couple of simple modifications to the code, e.g. in addition to the current model definition, create:

loss1_classifier_new = Dense(4, name='loss1/classifier', kernel_regularizer=l2(0.0002))(loss1_drop_fc)
loss1_classifier_act_new = Activation('softmax')(loss1_classifier_new)
...
loss2_classifier_new = Dense(4, name='loss2/classifier', kernel_regularizer=l2(0.0002))(loss2_drop_fc)
loss2_classifier_act_new = Activation('softmax')(loss2_classifier_new)
...
loss3_classifier_new = Dense(4, name='loss3/classifier', kernel_regularizer=l2(0.0002))(loss3_drop_fc)
loss3_classifier_act_new = Activation('softmax')(loss3_classifier_new)
...
googlenet = Model(inputs=input, outputs=[loss1_classifier_act,loss2_classifier_act,loss3_classifier_act])
googlenet_new = Model(inputs=input, outputs=[loss1_classifier_act_new,loss2_classifier_act_new,loss3_classifier_act_new])
googlenet.load_weights(weights_path)

Then you can use googlenet_new for your task.
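
If only the new heads should be trained, a possible follow-up (an assumption, not something stated above) is to freeze every pre-trained layer before compiling:

new_heads = {'loss1/classifier', 'loss2/classifier', 'loss3/classifier'}
for layer in googlenet_new.layers:
    # only the freshly added Dense(4) classifier layers remain trainable
    layer.trainable = layer.name in new_heads
googlenet_new.compile(optimizer='sgd', loss='categorical_crossentropy')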

@colinrgodsey colinrgodsey commented Dec 12, 2019

What exactly is going on here with the image normalization? Can't find any information on what normalization was used when those weights were initially trained.

    img = np.array(Image.fromarray(img).resize((224, 224))).astype(np.float32)
    img[:, :, 0] -= 123.68
    img[:, :, 1] -= 116.779
    img[:, :, 2] -= 103.939
    img[:,:,[0,1,2]] = img[:,:,[2,1,0]]
    img = img.transpose((2, 0, 1))
    img = np.expand_dims(img, axis=0)

Seems like Google originally used ((img / 255) - 1.0) * 2.0

nvm, found it: https://groups.google.com/forum/#!topic/lasagne-users/cCFVeT5rw-o

@swghosh swghosh commented Dec 12, 2019

ImageNet mean normalization has been applied to the input image.
This is equivalent to using

from keras.applications.imagenet_utils import preprocess_input
img = preprocess_input(img, mode='caffe')  # also converts RGB to BGR

ImageNet models trained using Caffe usually have such normalization applied on the input image. (eg: VGG16 model in keras_applications, weights ported from caffe)

In this case, the Inception v1 weights have been ported from a Caffe-trained model, hence the RGB-to-BGR conversion as well as ImageNet mean normalization are applied to the original image before passing it to the network.

The original GoogLeNet weights can be either obtained from BVLC or from GitHub Inception repository by Google.

Reference:
https://github.com/keras-team/keras-applications/blob/master/keras_applications/imagenet_utils.py#L18

https://github.com/keras-team/keras-applications/blob/master/keras_applications/vgg16.py#L20

Later versions of the Inception network (e.g. InceptionV3, as present in keras_applications) use weights trained with TensorFlow models, and the images are normalized to the range [-1, 1] instead of [0, 255]: x /= 127.5; x -= 1

Reference:

https://github.com/keras-team/keras-applications/blob/master/keras_applications/inception_v3.py#L407

https://github.com/keras-team/keras-applications/blob/master/keras_applications/imagenet_utils.py#L116-L119

@mohamed-nafea mohamed-nafea commented Jan 25, 2020

Hi @swghosh, you currently have keras running with convolutions performed with "image_data_format": "channels_first". However, the code is currently set up to run with "image_data_format": "channels_last". The README document above explains how to change this (note: it works with either backend).

Hi,

Kindly note that I've faced the same error when running on Google Colab, and I've checked the Keras settings and found that "image_data_format": "channels_last" is set, as mentioned here.

So can you please help me overcome this issue, or let me know if I'm missing something?
Thanks,

@joelouismarino joelouismarino commented Jan 27, 2020

Hi @mohamed-nafea, just to make sure, when you check keras.backend.image_data_format() from inside the colab notebook, you get 'channels_last'?

@swghosh swghosh commented Jan 29, 2020

Hello @mohamed-nafea,

In order to set the image data format to channels first in the latest version of Keras (and while using Colab), you can use the following code:

import keras.backend as K
K.set_image_data_format('channels_first')
@mohamed-nafea mohamed-nafea commented Jan 29, 2020

Hi @joelouismarino,
I've checked the value returned from keras.backend.image_data_format() and it is "channels_last", and unfortunately I got the same error message :(.

@swghosh
I've tried setting the image data format to "channels_first", and that got me past the previous error, but I then faced the error message below:


ValueError: Layer #98 (named "loss1/fc"), weight <tf.Variable 'loss1/fc/kernel:0' shape=(3200, 1024) dtype=float32_ref> has shape (3200, 1024), but the saved weight has shape (2048, 1024).

Have you faced something like this before?

Thanks so much to @joelouismarino and @swghosh for the help.

@joelouismarino joelouismarino commented Jan 29, 2020

Hi @mohamed-nafea. My apologies; the code should indeed be run in "channels_first" format.

As for your new error, the flatten layers in the network expect an output that is of size 2048. My guess is that you may have changed the input size to the network, and now the output dimensions don't line up properly. Have you modified the network structure in any way?

@mohamed-nafea mohamed-nafea commented Jan 30, 2020

Hi @joelouismarino,
Thanks so much for the follow up :).

Actually, the issue has been fixed and the model now works in a new clean notebook on Google Colab. I don't know what caused the error, but maybe I made a change in the code that broke the functionality.

@sunnybear sunnybear commented Feb 29, 2020

Hello everybody. With the Python classes for PoolHelper and LRN, plus

keras.backend.set_image_data_format('channels_first')

for the TensorFlow backend, the model loaded successfully under Keras/TensorFlow.

Also, in my case I had to change
yield(x,y)
to
yield(x,[y,y,y])
in the generator passed to fit_generator to correctly provide targets for all three output (activation) layers.
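For anyone adapting an existing generator, a minimal sketch of wrapping it so the same targets feed all three outputs (the base_generator and train_gen names are just placeholders, not from the original code):

def three_output_generator(base_generator):
    # GoogLeNet has two auxiliary classifiers plus the main one,
    # so each batch of targets is repeated three times
    for x, y in base_generator:
        yield x, [y, y, y]

# usage: model.fit_generator(three_output_generator(train_gen), steps_per_epoch=100, epochs=25)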

@sandeepnmenon sandeepnmenon commented Mar 1, 2020

Error in this line
conv2_3x3_reduce = Conv2D(64, (1,1), padding='same', activation='relu', name='conv2/3x3_reduce', kernel_regularizer=l2(0.0002))(pool1_norm1)

ValueError: Depth of input (63) is not a multiple of input depth of filter (64) for 'conv2/3x3_reduce/convolution' (op: 'Conv2D') with input shapes: [?,1,56,63], [1,1,64,64].

@swghosh swghosh commented Mar 1, 2020

@sandeepnmenon, please try with the image data format set to channels_first instead of channels_last, as described in the thread above.

@sunnybear sunnybear commented Mar 1, 2020

Are there any docs/examples on training GoogLeNet (Inception v1) on custom images/classes? It constantly shows no improvement after 25 epochs (total loss changes by about 1%, no change in MAE), whether I use 10, 1,000, or 10,000 samples with a binary classifier.

@mg0721 mg0721 commented Mar 4, 2020

Hello. I'm currently studying GoogLeNet using Keras, and your code has been very helpful to me, thanks. If it's okay, can I ask one question? I saw some image pre-processing in your code when loading the cat test image. Could you explain why you converted the image from RGB to BGR, and why you transposed it?

@MarwanGriffin MarwanGriffin commented Mar 17, 2020

So I'm trying to do transfer learning for my dataset. Since your model is not a Sequential object, I ran into trouble when I wanted to add one last Dense layer such as Dense(2, activation="softmax", name='dense25'), because I have two classes.
Since you are using functional (tensor) layers, I cannot add a Sequential object on top; when I tried Model([model_base.input, top_model.input]) it didn't work,
so I tried Concatenate, but I couldn't compile the model, it says the Concatenate object doesn't have a compile attribute.

@sunnybear sunnybear commented Mar 17, 2020

You need to replace all three outputs of GoogLeNet. Also, a sigmoid can work better for a multi-class problem.

so i'm trying to do transfert learning for my dataset , since your model is not Sequential object i got troubles when i want to add one last Dense layer as this Dense(2, activation="softmax",name='dense25') because i have two classes

@hamdymahmoud2019 hamdymahmoud2019 commented Mar 18, 2020

How do I extract features from this code?

@youssefkhaled2019 youssefkhaled2019 commented Mar 19, 2020

Can you help me with this problem?

in create_googlenet(weights_path)
6 conv1_7x7_s2 = Conv2D(64, (7,7), strides=(2,2), padding='valid', activation='relu', name='conv1/7x7_s2', kernel_regularizer=l2(0.0002))(input_pad)
7 conv1_zero_pad = ZeroPadding2D(padding=(1, 1))(conv1_7x7_s2)
----> 8 pool1_helper = PoolHelper()(conv1_zero_pad)
9 pool1_3x3_s2 = MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='valid', name='pool1/3x3_s2')(pool1_helper)
10 pool1_norm1 = LRN(name='pool1/norm1')(pool1_3x3_s2)

/usr/local/lib/python3.6/dist-packages/keras/engine/base_layer.py in _collect_previous_mask(input_tensors)
1303 inbound_layer, node_index, tensor_index = x._keras_history
1304 node = inbound_layer._inbound_nodes[node_index]
-> 1305 mask = node.output_masks[tensor_index]
1306 masks.append(mask)
1307 else:

AttributeError: 'Node' object has no attribute 'output_masks'

@sunnybear sunnybear commented Mar 19, 2020

The first link in Google: https://stackoverflow.com/questions/51821537/attributeerror-node-object-has-no-attribute-output-masks
It seems there is a conflict between the Keras and TensorFlow versions somewhere. Check your imports and library versions.

@carlosbar carlosbar commented Jun 15, 2020

Can you provide info on why you are doing this:

    img[:, :, 0] -= 123.68
    img[:, :, 1] -= 116.779
    img[:, :, 2] -= 103.939

And how did you calculate the current weights?

@joelouismarino joelouismarino commented Jun 15, 2020

@carlosbar subtraction is performed channel-wise to normalize the image inputs using the average channel values across the ImageNet dataset. See the comment from @swghosh above.

@carlosbar carlosbar commented Jun 16, 2020

@joelouismarino thank you!

@AnubhavCR7 AnubhavCR7 commented Jul 15, 2020

Hi, can anyone tell me how I can chop off the fully connected layer of the Inception v1 model by passing an include_top argument in the function definition (like in Keras)? I want to use only the convolutional blocks of the model. And can anyone give me the link to the pre-trained ImageNet weights file suitable for this model?
Regards.

@swghosh swghosh commented Jul 15, 2020

Hi, Can anyone tell me how can I chop off the Fully Connected layer of Inceptionv1 model, by passing an include_top argument in the function defination (like in keras) ? I want to use only the convolutional blocks of the model. And can anyone give me the link to the pre-trained Imagenet weights file suitable for this model ?
Regards.

Hi @AnubhavCR7,

As this is a custom implementation, the include_top=True|False argument is not provided (at least as of now), unlike the networks that form part of keras.applications and come with that standardized API.

As a workaround, for using the convolutional base only, I'd recommend using the following code snippet:

googlenet = create_googlenet('googlenet_weights.h5')
base_conv_model = keras.Model(inputs=googlenet.input, outputs=googlenet.get_layer('inception_5b/output').output)

# in case you're looking to train a custom classifier on top,
# this requires a bit of additional code
base_conv_model.trainable = False
new_classifier = keras.Sequential([
     base_conv_model,
     keras.layers.GlobalAveragePooling2D(),
     keras.layers.Dropout(rate=0.4),
     keras.layers.Dense(new_num_classes, activation='softmax')
])

new_classifier.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
new_classifier.fit(X, y)

Please make sure that images in X (above code) are: (ref: https://gist.github.com/joelouismarino/a2ede9ab3928f999575423b9887abd14#file-googlenet-py-L186-L193)

  • mean normalized using [R: 123.68, G: 116.779, B: 103.939]
  • in BGR format
  • in channels_first format, i.e. NCHW, not NHWC!

An easy way to do all of this using a single function call would be:

X = keras.applications.imagenet_utils.preprocess_input(images, mode='caffe')

PS: some parts of the code might not be compatible with the latest version of Keras (v2.4, as of writing this comment), a.k.a. tf.keras (the new default), and might need some additional changes to the code base.

And regarding the pre-trained (ImageNet) weights for GoogLeNet, a Google Drive link is provided by the author in the readme at the top of this gist.

googlenet.py also contains a demo image classification. To run the demo, you will need to install the pre-trained weights and the class labels. You will also need this test image.

@AnubhavCR7 AnubhavCR7 commented Jul 16, 2020

@swghosh Thank you for your response. I was actually thinking of fine-tuning the GoogLeNet model on my own custom dataset. I am using TensorFlow 1.15.0, Keras 2.3.1 and Python 3.7.7. Will the modifications you mentioned (mean normalization, BGR format, channels_first format, etc.) still be required?

And one more thing: if I want to train the last 2 conv blocks of the base model, I should start training from layer number 102, right?
Regards.

@swghosh swghosh commented Jul 16, 2020

@swghosh Thank you for your response. I was actually thinking of fine-tuning the GoogleNet model to run on my own custom dataset. I am using Tensorflow 1.15.0, Keras 2.3.1 and Python 3.7.7 The modifications that you said like, Mean normalization, using BGR format, channels first format etc will still be required ?

And one more thing, if I want to train the last 2 conv blocks of the base model, I should start training from layer number 102, right ?
Regards.

The data pre-processing steps (mean normalization, BGR, channels first) are a must and will still be required. Data pre-processing should work right away; prefer using keras.applications.imagenet_utils.preprocess_input(image, mode='caffe'). It also works with keras.preprocessing.image.ImageDataGenerator: something like gen = ImageDataGenerator(preprocessing_function=lambda x: preprocess_input(x, mode='caffe')) should suffice.
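Putting that together, a minimal sketch of a generator with Caffe-style preprocessing built in (the directory path, target size, and batch size are placeholders):

from keras.applications.imagenet_utils import preprocess_input
from keras.preprocessing.image import ImageDataGenerator

# BGR conversion + ImageNet mean subtraction applied on the fly;
# with "image_data_format": "channels_first" in keras.json, batches come out as NCHW
gen = ImageDataGenerator(preprocessing_function=lambda x: preprocess_input(x, mode='caffe'))

train_flow = gen.flow_from_directory('data/train',        # placeholder path
                                     target_size=(224, 224),
                                     batch_size=32,
                                     class_mode='categorical')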

Also, since you are using TF v1.x + Keras v2.3.x, I hope you shouldn't run into any errors (if you do, feel free to ping here with the error details).

Lastly, for training the last 2 conv blocks (fine-tuning): yes, you could choose the layer number from the codebase (PS: I didn't check the layer number for the last 2 conv blocks, please verify it yourself), or you may use layer names as an alternative, which is a slightly cleaner way to do it; see the sketch below.
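As an example of the layer-name route, a minimal sketch that freezes everything up to a chosen layer (the choice of 'inception_5a/1x1' as the first trainable layer is just an illustrative assumption; pick whichever layer your fine-tuning should start from):

googlenet = create_googlenet('googlenet_weights.h5')

set_trainable = False
for layer in googlenet.layers:
    # layers before the chosen one stay frozen; everything from it onwards is trained
    if layer.name == 'inception_5a/1x1':
        set_trainable = True
    layer.trainable = set_trainable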

@AnubhavCR7 AnubhavCR7 commented Jul 23, 2020


@swghosh Thank you. The weights file for GoogLeNet will also contain the weights of the top (classification) layers. So if I try to load that weights file into the GoogLeNet architecture with its top layers chopped off, i.e. only the conv base, that should not create any problem, right?

@swghosh swghosh commented Jul 23, 2020

@swghosh Thank you. The weights file for GoogleNet will consist of the weights of the top_layer or classification layer also. So if I try to load that weights file into the GoogleNet architecture with it's top_layer chopped off, i.e. only the conv base will be there, that should not create any problem, right ?

It won't work with the top layers chopped off, as Keras expects the number of layers in the weights file to exactly equal the number of layers in the architecture into which you are loading the weights. Please load the weights on the complete architecture as given in the create_googlenet function, then remove the unnecessary layers using:

base_conv_model = keras.Model(inputs=googlenet.input, outputs=googlenet.get_layer('inception_5b/output').output)

as shared above.

@AnubhavCR7 AnubhavCR7 commented Jul 24, 2020

@swghosh
Ok, I see. I am using the TensorFlow backend with the "channels_last" data format, but the code requires "channels_first". Setting the Keras backend to "channels_first" at the very beginning of the code should solve this problem, right?

@swghosh swghosh commented Jul 24, 2020

@swghosh
Ok, I see, I am using Tensorflow backend with "channels_last" data format but the code requires "channels_first" data format. Setting the keras backend to "channels_first" at the very beginning of the code should solve this problem, right ?

Right (see the comments above).

@AnubhavCR7 AnubhavCR7 commented Jul 24, 2020

if keras.backend.image_data_format() == 'channels_last':
    shape = (224, 224, 3)
else:
    shape = (3, 224, 224)

And then in the very first line where the input shape is specified in the model architecture: input = Input(shape=shape)

This snippet should work, I guess.
Regards.

@3dimaging 3dimaging commented Aug 4, 2020

Thank you for your nice work! The code works very well for me.

@mansi-aggarwal-2504 mansi-aggarwal-2504 commented Sep 8, 2020

out = model.predict(img) # note: the model has three outputs

What do these three outputs denote? Also, when I retrieve the predicted class using out[2], I get the same predicted class for all inputs.
Can someone help me with this?

@swghosh swghosh commented Sep 8, 2020

out = model.predict(img) # note: the model has three outputs

What do these three outputs denote? Also, when I retrieve the predicted class using out[2], I get the same predicted class for all inputs.
Can someone help me this?

@mansi-aggarwal-2504, I'm not sure why out[2] is producing the same predicted class for all images. It does seem like a problem (if the pre-trained weights are loaded correctly).

PS: out[2] is the softmax prediction from the actual classifier (the primary classifier of interest), whereas out[0] and out[1] are the outputs of the auxiliary classifiers, which are just used to make the model converge faster during training. There is little to no use for the auxiliary classifiers at inference time.
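In code, a minimal sketch of reading the prediction from the main classifier (assuming img has already been preprocessed as described earlier in the thread):

import numpy as np

out = model.predict(img)             # a list of three softmax arrays, one per classifier
predicted_class = np.argmax(out[2])  # out[2] comes from the main (non-auxiliary) classifier
print('Predicted Class:', predicted_class)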

However, if the pre-trained weights are loaded correctly and valid images (containing objects from the 1000 ImageNet classes) are used for inference, the problem of getting the same predicted class (even from the auxiliary outputs) should not occur.

If you think this is a problem with the pre-trained weights or the implementation here, feel free to post a Colab notebook link with the minimal code required to replicate the issue. I'll look into it accordingly and try to figure it out.

@mansi-aggarwal-2504 mansi-aggarwal-2504 commented Sep 8, 2020

@swghosh thank you for your reply. Actually, I am not using the pre-trained model. I am using the architecture to train a model and I have two classes. And I am getting the same predicted class for all inputs.
Also, I will try to provide my code snippet soon. Thank you.

@swghosh swghosh commented Sep 9, 2020

UPDATE (09/09/2020)

TensorFlow Hub is a library for the publication, discovery, and consumption of reusable parts of machine learning models. A module is a self-contained piece of a TensorFlow graph, along with its weights and assets, that can be reused across different tasks in a process known as transfer learning.

A clean and easier alternative to the codebase and pre-trained weights provided here would be to make use of InceptionV1 from TensorFlow Hub as that also allows for interoperability with the latest Keras (tf.keras) and TensorFlow 2.x frameworks. Typically, it should serve the same purpose as this gist.

Imagenet Inception V1 on TensorFlow Hub: https://tfhub.dev/google/imagenet/inception_v1/classification/1
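For instance, a minimal sketch of using that module with tf.keras (assumes tensorflow_hub is installed; note that TF Hub image modules expect channels_last float inputs scaled to [0, 1] at 224x224, unlike the Caffe-style preprocessing used in this gist):

import tensorflow as tf
import tensorflow_hub as hub

inception_v1 = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(224, 224, 3)),
    hub.KerasLayer("https://tfhub.dev/google/imagenet/inception_v1/classification/1")
])

# scores = inception_v1.predict(images)  # images: float32, shape (n, 224, 224, 3), values in [0, 1]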

@mikechen66 mikechen66 commented Sep 12, 2020

After a lot of trial and errors I could setup the LRN layer even with the TensorFlow backend. The major resistance was being offered by the fact that the latest Keras API lacks the LRN layer. Also, keras.backend.* set of functions couldn't help it much but only produce incomplete shape errors.

Setting the batch_input_shape for the old LRN2D class also do not appear to be working, and produces the same error "Cannot convert a partially known TensorShape to a Tensor".

The workaround is actually pretty simple and based on use of keras.layers.Lambda with the tf.nn.local_response_normalization function.

googlenet_custom_layers.py

from keras.layers.core import Layer, Lambda
import tensorflow as tf

# wraps up the tf.nn.local_response_normalisation
# into the keras.layers.Lambda so
# we can have a custom keras layer as
# a class that will perform LRN ops     
class LRN(Lambda):

    def __init__(self, alpha=0.0001, beta=0.75, depth_radius=5, **kwargs):
        # using parameter defaults as per GoogLeNet
        params = {
            "alpha": alpha,
            "beta": beta,
            "depth_radius": depth_radius
        }
        # construct a function for use with Keras Lambda
        lrn_fn = lambda inputs: tf.nn.local_response_normalization(inputs, **params)

        # pass the function to Keras Lambda
        return super().__init__(lrn_fn, **kwargs)

# this layer is also required by GoogLeNet (same as above)
class PoolHelper(Layer):
    
    def __init__(self, **kwargs):
        super(PoolHelper, self).__init__(**kwargs)
    
    def call(self, x, mask=None):
        return x[:,:,1:,1:]
    
    def get_config(self):
        config = {}
        base_config = super(PoolHelper, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

Hope this helps!

It generated the same error as the original script with the TensorFlow backend.

If I change the shape from (3, 224, 224) to (224, 224, 3):
input = Input(shape=(224, 224, 3))

It gives the following error:

ValueError: A Concatenate layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(None, 28, 28, 64), (None, 28, 28, 128), (None, 28, 28, 32), (None, 28, 28, 32)]

Or, if the shape is kept unchanged:
input = Input(shape=(3, 224, 224))

It gives the following error:

ValueError: A Concatenate layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(None, 1, 28, 64), (None, 1, 28, 128), (None, 1, 28, 32), (None, 1, 28, 32)]

@swghosh swghosh commented Sep 12, 2020

@mikechen66 (re: your comment above with the Lambda-based LRN workaround and the Concatenate shape error)

https://gist.github.com/joelouismarino/a2ede9ab3928f999575423b9887abd14#gistcomment-3451596

Prefer to use the LRN layer definition provided in this gist (instead of my custom code), as it works well with (at least older versions of) TensorFlow, so that you can correctly use the pre-trained weights provided here.

In case you're training your model from scratch, feel free to use a Keras Lambda layer wrapper around tf.nn.local_response_normalization.

To repeat the docs: this implementation requires the channels_first format; channels_last won't work.

@3dimaging 3dimaging commented Sep 12, 2020

I have tried both the "tensorflow" and "theano" backends to train the model from scratch. I didn't see any issues.

@mikechen66 mikechen66 commented Sep 13, 2020

Thanks for your reply. I use Keras 2.4.3 and TensorFlow 2.3, so the issue might be one of version compatibility. After updating the following lines of code to adapt to the above-mentioned environment, I can run the script and get the correct classification. But I get the wrong total parameter number when running googlenet.summary().

1. Modify the script to adapt to Keras 2.4.3 and TensorFlow 2.3

Predicted Class: 282 , Class Name: n02123159 tiger cat

I made the following modifications.

Modify the import statements

if keras.backend.backend() == 'tensorflow':
    # -from keras import backend as K
    from keras import backend
    # -import tensorflow as tf
    import tensorflow.compat.v1 as tf
    tf.compat.v1.disable_eager_execution()
    from keras.utils.conv_utils import convert_kernel

Set up the GPU to avoid the runtime error: Could not create cuDNN handle...

gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)

Delete the following lines of code highlighted with "# -" and add the new lines of code below the highlighted code.

    if keras.backend.backend() == 'tensorflow':
        # convert the convolutional kernels for tensorflow
        ops = []
        for layer in googlenet.layers:
            if layer.__class__.__name__ == 'Conv2D':
                # -original_w = K.get_value(layer.kernel)
                original_w = keras.backend.get_value(layer.kernel)
                converted_w = convert_kernel(original_w)
                # -ops.append(tf.assign(layer.kernel, converted_w).op)
                ops.append(tf.compat.v1.assign(layer.kernel, converted_w).op)
        # -K.get_session().run(ops)
        tf.compat.v1.keras.backend.get_session().run(ops)

2. Total parameter number (wrong)

I get a total parameter number of 13,378,280, but the original GoogLeNet Inception v1 has 5.79+ million parameters in total. What explains the huge gap between the parameter counts?

This is after adding the three lines of code below and deleting the sections from "if weights_path..." to the main section.

input = Input(shape=(3, 224, 224))
googlenet = create_googlenet(input)
googlenet.summary()

Reference:
Google Inception v1 Paper: Page: 5/9
https://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Szegedy_Going_Deeper_With_2015_CVPR_paper.pdf

Cheers

@3dimaging 3dimaging commented Sep 14, 2020

@mikechen66 is right! The parameter count is way more than reported. I got 10,309,430 parameters. However, the GoogLeNet model from the Keras repository gives me 5,587,394 parameters, which is close to the reported number.

@debaditya-unimelb debaditya-unimelb commented Sep 18, 2020

@swghosh Thank you for the code, I really appreciate it. However, when I try an input image size different from 224 x 224 (the new image size is 742 x 224), I get matrix multiplication errors like the following:

Matrix size-incompatible: In[0]: [50,7168], In[1]: [7680,1024] [[{{node loss1/fcNew/MatMul}}]]

where loss1/fcNew is a newly added layer with 1024 dimensions.

I was able to get rid of the error when I deleted all the PoolHelper calls from the network. Any thoughts why? What exactly is the PoolHelper function doing?

@debaditya-unimelb debaditya-unimelb commented Sep 18, 2020

@swghosh Also, I notice that I could perhaps remove all the ZeroPadding2D layers and give the following convolutional layer padding='same', and the code will still work. Would it be wrong to do so? If not, what would the difference be?

@mikechen66 mikechen66 commented Oct 5, 2020

I changed the whole code to comply with TensorFlow 2.x as follows. But users need to generate new GoogLeNet weights in the h5 format in order to match the TensorFlow data_format.

1. Change the data_format setting based on TensorFlow 2.x

Change "axis = 1" to "axis = -1" to comply with TensorFlow. It corresponds to "channel_dim = -1 if K.image_data_format() == 'channels_last' ". It is very annoying to change the parameter data_format=='channels_first' in keras.json regularly because most of the applications being developed by TensorFlow. It is the most popular setting in the Machine Learning. '

2. Remove flatten layers

The flatten layers indirectly generate huge parameter counts in the dense layers of both the main and auxiliary classifiers. After removing the expensive flatten layers, I get a total size of 9+ million parameters; it is much closer to the total size that Google announced.

3. Change the padding parameter

Change the parameter padding='valid' to padding='same' in the Inception sections to fully comply with the official GoogLeNet paper.

4. Delete the zero-padding layers

The ZeroPadding2D layers are not needed in the TF realization below.

5. GoogLeNet weights

The original GoogLeNet weights cannot be used with the following new script because of the incompatible data_format. Therefore, users need to generate new GoogLeNet weights in h5 format that match the TensorFlow (height, width, channels) layout.

6. The script is modified as follows.

import imageio
from PIL import Image
import numpy as np
import tensorflow as tf

import keras
from keras.models import Model
from keras.layers import Input, Dense, Conv2D, MaxPooling2D, AveragePooling2D, \
    Dropout, Flatten, Concatenate, Reshape, Activation

from keras.regularizers import l2
from keras.optimizers import SGD
from lrn import LRN 
from keras import backend


# Set up the GPU to avoid the runtime error: Could not create cuDNN handle...
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)

# Define the Googlenet class 
class Googlenet(object):

    # Adopt a static method to enable an elegant realization of the model
    @staticmethod
    # Build the GoogLeNet Inception v1
    def build(input_shape, num_classes):

        input = Input(shape=input_shape)
 
        conv1_7x7_s2 = Conv2D(64, kernel_size=(7,7), strides=(2,2), padding='same', activation='relu', name='conv1/7x7_s2', kernel_regularizer=l2(0.0002))(input)
        pool1_3x3_s2 = MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='same', name='pool1/3x3_s2')(conv1_7x7_s2)
        pool1_norm1 = LRN(name='pool1/norm1')( pool1_3x3_s2)
        conv2_3x3_reduce = Conv2D(64, kernel_size=(1,1), padding='valid', activation='relu', name='conv2/3x3_reduce', kernel_regularizer=l2(0.0002))(pool1_norm1)
        conv2_3x3 = Conv2D(192, kernel_size=(3,3), padding='same', activation='relu', name='conv2/3x3', kernel_regularizer=l2(0.0002))(conv2_3x3_reduce)
        conv2_norm2 = LRN(name='conv2/norm2')(conv2_3x3)
        pool2_3x3_s2 = MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='same', name='pool2/3x3_s2')(conv2_norm2)

        inception_3a_1x1 = Conv2D(64, kernel_size=(1,1), padding='same', activation='relu', name='inception_3a/1x1', kernel_regularizer=l2(0.0002))(pool2_3x3_s2)
        inception_3a_3x3_reduce = Conv2D(96, kernel_size=(1,1), padding='same', activation='relu', name='inception_3a/3x3_reduce', kernel_regularizer=l2(0.0002))(pool2_3x3_s2)
        inception_3a_3x3 = Conv2D(128, kernel_size=(3,3), padding='same', activation='relu', name='inception_3a/3x3', kernel_regularizer=l2(0.0002))(inception_3a_3x3_reduce)
        inception_3a_5x5_reduce = Conv2D(16, kernel_size=(1,1), padding='same', activation='relu', name='inception_3a/5x5_reduce', kernel_regularizer=l2(0.0002))(pool2_3x3_s2)
        inception_3a_5x5 = Conv2D(32, kernel_size=(5,5), padding='same', activation='relu', name='inception_3a/5x5', kernel_regularizer=l2(0.0002))(inception_3a_5x5_reduce)
        inception_3a_pool = MaxPooling2D(pool_size=(3,3), strides=(1,1), padding='same', name='inception_3a/pool')(pool2_3x3_s2)
        inception_3a_pool_proj = Conv2D(32, kernel_size=(1,1), padding='same', activation='relu', name='inception_3a/pool_proj', kernel_regularizer=l2(0.0002))(inception_3a_pool)
        inception_3a_output = Concatenate(axis=-1, name='inception_3a/output')([inception_3a_1x1, inception_3a_3x3, inception_3a_5x5, inception_3a_pool_proj])

        inception_3b_1x1 = Conv2D(128, kernel_size=(1,1), padding='same', activation='relu', name='inception_3b/1x1', kernel_regularizer=l2(0.0002))(inception_3a_output)
        inception_3b_3x3_reduce = Conv2D(128, kernel_size=(1,1), padding='same', activation='relu', name='inception_3b/3x3_reduce', kernel_regularizer=l2(0.0002))(inception_3a_output)
        inception_3b_3x3 = Conv2D(192, kernel_size=(3,3), padding='same', activation='relu', name='inception_3b/3x3', kernel_regularizer=l2(0.0002))(inception_3b_3x3_reduce)
        inception_3b_5x5_reduce = Conv2D(32, kernel_size=(1,1), padding='same', activation='relu', name='inception_3b/5x5_reduce', kernel_regularizer=l2(0.0002))(inception_3a_output)
        inception_3b_5x5 = Conv2D(96, kernel_size=(5,5), padding='same', activation='relu', name='inception_3b/5x5', kernel_regularizer=l2(0.0002))(inception_3b_5x5_reduce)
        inception_3b_pool = MaxPooling2D(pool_size=(3,3), strides=(1,1), padding='same', name='inception_3b/pool')(inception_3a_output)
        inception_3b_pool_proj = Conv2D(64, kernel_size=(1,1), padding='same', activation='relu', name='inception_3b/pool_proj', kernel_regularizer=l2(0.0002))(inception_3b_pool)
        inception_3b_output = Concatenate(axis=-1, name='inception_3b/output')([inception_3b_1x1, inception_3b_3x3, inception_3b_5x5, inception_3b_pool_proj])

        inception_4a_1x1 = Conv2D(192, kernel_size=(1,1), padding='same', activation='relu', name='inception_4a/1x1', kernel_regularizer=l2(0.0002))(inception_3b_output)
        inception_4a_3x3_reduce = Conv2D(96, kernel_size=(1,1), padding='same', activation='relu', name='inception_4a/3x3_reduce', kernel_regularizer=l2(0.0002))(inception_3b_output)
        inception_4a_3x3 = Conv2D(208,kernel_size=(3,3), padding='same', activation='relu', name='inception_4a/3x3' ,kernel_regularizer=l2(0.0002))(inception_4a_3x3_reduce)
        inception_4a_5x5_reduce = Conv2D(16, kernel_size=(1,1), padding='same', activation='relu', name='inception_4a/5x5_reduce', kernel_regularizer=l2(0.0002))(inception_3b_output)
        inception_4a_5x5 = Conv2D(48, kernel_size=(5,5), padding='same', activation='relu', name='inception_4a/5x5', kernel_regularizer=l2(0.0002))(inception_4a_5x5_reduce)
        inception_4a_pool = MaxPooling2D(pool_size=(3,3), strides=(1,1), padding='same', name='inception_4a/pool')(inception_3b_output)
        inception_4a_pool_proj = Conv2D(64, kernel_size=(1,1), padding='same', activation='relu', name='inception_4a/pool_proj', kernel_regularizer=l2(0.0002))(inception_4a_pool)
        inception_4a_output = Concatenate(axis=-1, name='inception_4a/output')([inception_4a_1x1, inception_4a_3x3, inception_4a_5x5, inception_4a_pool_proj])

        loss1_ave_pool = AveragePooling2D(pool_size=(5,5), strides=(3,3), name='loss1/ave_pool')(inception_4a_output)
        loss1_conv = Conv2D(128, kernel_size=(1,1), padding='same', activation='relu', name='loss1/conv', kernel_regularizer=l2(0.0002))(loss1_ave_pool)
        loss1_fc = Dense(1024, activation='relu', name='loss1/fc', kernel_regularizer=l2(0.0002))(loss1_conv)
        loss1_drop_fc = Dropout(rate=0.7)(loss1_fc)
        loss1_classifier = Dense(num_classes, name='loss1/classifier', kernel_regularizer=l2(0.0002))(loss1_drop_fc)
        loss1_classifier_act = Activation('softmax')(loss1_classifier)

        inception_4b_1x1 = Conv2D(160, kernel_size=(1,1), padding='same', activation='relu', name='inception_4b/1x1', kernel_regularizer=l2(0.0002))(inception_4a_output)
        inception_4b_3x3_reduce = Conv2D(112, kernel_size=(1,1), padding='same', activation='relu', name='inception_4b/3x3_reduce', kernel_regularizer=l2(0.0002))(inception_4a_output)
        inception_4b_3x3 = Conv2D(224, kernel_size=(3,3), padding='same', activation='relu', name='inception_4b/3x3', kernel_regularizer=l2(0.0002))(inception_4b_3x3_reduce)
        inception_4b_5x5_reduce = Conv2D(24, kernel_size=(1,1), padding='same', activation='relu', name='inception_4b/5x5_reduce', kernel_regularizer=l2(0.0002))(inception_4a_output)
        inception_4b_5x5 = Conv2D(64, kernel_size=(5,5), padding='same', activation='relu', name='inception_4b/5x5', kernel_regularizer=l2(0.0002))(inception_4b_5x5_reduce)
        inception_4b_pool = MaxPooling2D(pool_size=(3,3), strides=(1,1), padding='same', name='inception_4b/pool')(inception_4a_output)
        inception_4b_pool_proj = Conv2D(64, kernel_size=(1,1), padding='same', activation='relu', name='inception_4b/pool_proj', kernel_regularizer=l2(0.0002))(inception_4b_pool)
        inception_4b_output = Concatenate(axis=-1, name='inception_4b/output')([inception_4b_1x1, inception_4b_3x3, inception_4b_5x5, inception_4b_pool_proj])
        
        inception_4c_1x1 = Conv2D(128, kernel_size=(1,1), padding='same', activation='relu', name='inception_4c/1x1', kernel_regularizer=l2(0.0002))(inception_4b_output)
        inception_4c_3x3_reduce = Conv2D(128, kernel_size=(1,1), padding='same', activation='relu', name='inception_4c/3x3_reduce', kernel_regularizer=l2(0.0002))(inception_4b_output)
        inception_4c_3x3 = Conv2D(256, kernel_size=(3,3), padding='same', activation='relu', name='inception_4c/3x3', kernel_regularizer=l2(0.0002))(inception_4c_3x3_reduce)
        inception_4c_5x5_reduce = Conv2D(24, kernel_size=(1,1), padding='same', activation='relu', name='inception_4c/5x5_reduce', kernel_regularizer=l2(0.0002))(inception_4b_output)
        inception_4c_5x5 = Conv2D(64, kernel_size=(5,5), padding='same', activation='relu', name='inception_4c/5x5', kernel_regularizer=l2(0.0002))(inception_4c_5x5_reduce)
        inception_4c_pool = MaxPooling2D(pool_size=(3,3), strides=(1,1), padding='same', name='inception_4c/pool')(inception_4b_output)
        inception_4c_pool_proj = Conv2D(64, kernel_size=(1,1), padding='same', activation='relu', name='inception_4c/pool_proj', kernel_regularizer=l2(0.0002))(inception_4c_pool)
        inception_4c_output = Concatenate(axis=-1, name='inception_4c/output')([inception_4c_1x1, inception_4c_3x3, inception_4c_5x5, inception_4c_pool_proj])

        inception_4d_1x1 = Conv2D(112, kernel_size=(1,1), padding='same', activation='relu', name='inception_4d/1x1', kernel_regularizer=l2(0.0002))(inception_4c_output)
        inception_4d_3x3_reduce = Conv2D(144, kernel_size=(1,1), padding='same', activation='relu', name='inception_4d/3x3_reduce', kernel_regularizer=l2(0.0002))(inception_4c_output)
        inception_4d_3x3 = Conv2D(288, kernel_size=(3,3), padding='same', activation='relu', name='inception_4d/3x3', kernel_regularizer=l2(0.0002))(inception_4d_3x3_reduce)
        inception_4d_5x5_reduce = Conv2D(32, kernel_size=(1,1), padding='same', activation='relu', name='inception_4d/5x5_reduce', kernel_regularizer=l2(0.0002))(inception_4c_output)
        inception_4d_5x5 = Conv2D(64, kernel_size=(5,5), padding='same', activation='relu', name='inception_4d/5x5', kernel_regularizer=l2(0.0002))(inception_4d_5x5_reduce)
        inception_4d_pool = MaxPooling2D(pool_size=(3,3), strides=(1,1), padding='same', name='inception_4d/pool')(inception_4c_output)
        inception_4d_pool_proj = Conv2D(64, kernel_size=(1,1), padding='same', activation='relu', name='inception_4d/pool_proj', kernel_regularizer=l2(0.0002))(inception_4d_pool)
        inception_4d_output = Concatenate(axis=-1, name='inception_4d/output')([inception_4d_1x1, inception_4d_3x3, inception_4d_5x5, inception_4d_pool_proj])
    
        loss2_ave_pool = AveragePooling2D(pool_size=(5,5), strides=(3,3), name='loss2/ave_pool')(inception_4d_output)
        loss2_conv = Conv2D(128, kernel_size=(1,1), padding='same', activation='relu', name='loss2/conv', kernel_regularizer=l2(0.0002))(loss2_ave_pool)
        loss2_fc = Dense(1024, activation='relu', name='loss2/fc', kernel_regularizer=l2(0.0002))(loss2_conv)
        loss2_drop_fc = Dropout(rate=0.7)(loss2_fc)
        loss2_classifier = Dense(num_classes, name='loss2/classifier', kernel_regularizer=l2(0.0002))(loss2_drop_fc)
        loss2_classifier_act = Activation('softmax')(loss2_classifier)


        inception_4e_1x1 = Conv2D(256, kernel_size=(1,1), padding='same', activation='relu', name='inception_4e/1x1', kernel_regularizer=l2(0.0002))(inception_4d_output)
        inception_4e_3x3_reduce = Conv2D(160, kernel_size=(1,1), padding='same', activation='relu', name='inception_4e/3x3_reduce', kernel_regularizer=l2(0.0002))(inception_4d_output)
        inception_4e_3x3 = Conv2D(320, kernel_size=(3,3), padding='same', activation='relu', name='inception_4e/3x3', kernel_regularizer=l2(0.0002))(inception_4e_3x3_reduce)
        inception_4e_5x5_reduce = Conv2D(32, kernel_size=(1,1), padding='same', activation='relu', name='inception_4e/5x5_reduce', kernel_regularizer=l2(0.0002))(inception_4d_output)
        inception_4e_5x5 = Conv2D(128, kernel_size=(5,5), padding='same', activation='relu', name='inception_4e/5x5', kernel_regularizer=l2(0.0002))(inception_4e_5x5_reduce)
        inception_4e_pool = MaxPooling2D(pool_size=(3,3), strides=(1,1), padding='same', name='inception_4e/pool')(inception_4d_output)
        inception_4e_pool_proj = Conv2D(128, kernel_size=(1,1), padding='same', activation='relu', name='inception_4e/pool_proj', kernel_regularizer=l2(0.0002))(inception_4e_pool)
        inception_4e_output = Concatenate(axis=-1, name='inception_4e/output')([inception_4e_1x1, inception_4e_3x3, inception_4e_5x5, inception_4e_pool_proj])


        inception_5a_1x1 = Conv2D(256, kernel_size=(1,1), padding='same', activation='relu', name='inception_5a/1x1', kernel_regularizer=l2(0.0002))(inception_4e_output)
        inception_5a_3x3_reduce = Conv2D(160, kernel_size=(1,1), padding='same', activation='relu', name='inception_5a/3x3_reduce', kernel_regularizer=l2(0.0002))(inception_4e_output)
        inception_5a_3x3 = Conv2D(320, kernel_size=(3,3), padding='same', activation='relu', name='inception_5a/3x3', kernel_regularizer=l2(0.0002))(inception_5a_3x3_reduce)
        inception_5a_5x5_reduce = Conv2D(32, kernel_size=(1,1), padding='same', activation='relu', name='inception_5a/5x5_reduce', kernel_regularizer=l2(0.0002))(inception_4e_output)
        inception_5a_5x5 = Conv2D(128, kernel_size=(5,5), padding='same', activation='relu', name='inception_5a/5x5', kernel_regularizer=l2(0.0002))(inception_5a_5x5_reduce)
        inception_5a_pool = MaxPooling2D(pool_size=(3,3), strides=(1,1), padding='same', name='inception_5a/pool')(inception_4e_output)
        inception_5a_pool_proj = Conv2D(128, kernel_size=(1,1), padding='same', activation='relu', name='inception_5a/pool_proj', kernel_regularizer=l2(0.0002))(inception_5a_pool)
        inception_5a_output = Concatenate(axis=-1, name='inception_5a/output')([inception_5a_1x1, inception_5a_3x3, inception_5a_5x5, inception_5a_pool_proj])


        inception_5b_1x1 = Conv2D(384, kernel_size=(1,1), padding='same', activation='relu', name='inception_5b/1x1', kernel_regularizer=l2(0.0002))(inception_5a_output)
        inception_5b_3x3_reduce = Conv2D(192, kernel_size=(1,1), padding='same', activation='relu', name='inception_5b/3x3_reduce', kernel_regularizer=l2(0.0002))(inception_5a_output)
        inception_5b_3x3 = Conv2D(384, kernel_size=(3,3), padding='same', activation='relu', name='inception_5b/3x3', kernel_regularizer=l2(0.0002))(inception_5b_3x3_reduce)
        inception_5b_5x5_reduce = Conv2D(48, kernel_size=(1,1), padding='same', activation='relu', name='inception_5b/5x5_reduce', kernel_regularizer=l2(0.0002))(inception_5a_output)
        inception_5b_5x5 = Conv2D(128, kernel_size=(5,5), padding='same', activation='relu', name='inception_5b/5x5', kernel_regularizer=l2(0.0002))(inception_5b_5x5_reduce)
        inception_5b_pool = MaxPooling2D(pool_size=(3,3), strides=(1,1), padding='same', name='inception_5b/pool')(inception_5a_output)
        inception_5b_pool_proj = Conv2D(128, kernel_size=(1,1), padding='same', activation='relu', name='inception_5b/pool_proj', kernel_regularizer=l2(0.0002))(inception_5b_pool)
        inception_5b_output = Concatenate(axis=-1, name='inception_5b/output')([inception_5b_1x1, inception_5b_3x3, inception_5b_5x5, inception_5b_pool_proj])


        pool5_7x7_s1 = AveragePooling2D(pool_size=(7,7), strides=(1,1), name='pool5/7x7_s2')(inception_5b_output)
        pool5_drop_7x7_s1 = Dropout(rate=0.4)(pool5_7x7_s1)
        loss3_classifier = Dense(num_classes, name='loss3/classifier', kernel_regularizer=l2(0.0002))(pool5_drop_7x7_s1)
        loss3_classifier_act = Activation('softmax', name='prob')(loss3_classifier)

        inception_v1 = Model(inputs=input, outputs=[loss1_classifier_act, loss2_classifier_act, loss3_classifier_act])

        return inception_v1


if __name__ == "__main__":

    input_shape = (224, 224, 3)
    num_classes = 1000

    inception_v1 = Googlenet.build(input_shape, num_classes)

    inception_v1.summary()

@mikechen66 mikechen66 commented Oct 5, 2020

Even though the plain model of Inception v1 spells out all of the layers, I prefer the simplified model below, with a total size of 6+ million parameters (auxiliary classifiers removed). Since swghosh provided googlenet_custom_layers.py, I have renamed it to lrn.py and use it as a library.

import tensorflow as tf 
from tensorflow.keras.layers import Input, Conv2D, Dense, Dropout, MaxPooling2D, AveragePooling2D
from tensorflow.keras.layers import concatenate
from tensorflow.keras.models import Model
from tensorflow.keras.regularizers import l2
from lrn import LRN 

# Set up the GPU to avoid the runtime error: Could not create cuDNN handle...
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)

def googlenet(input_shape, num_classes):

    input = Input(shape=input_shape)

    conv1_7x7 = Conv2D(filters=64, kernel_size=(7,7), strides=(2,2), padding='same', activation='relu', 
                       kernel_regularizer=l2(0.01))(input)
    maxpool1_3x3 = MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='same')(conv1_7x7)
    pool1_norm1 = LRN()(maxpool1_3x3)
    conv2_3x3_reduce = Conv2D(filters=64, kernel_size=(1,1),  strides=(1,1), padding='valid', activation='relu', 
                       kernel_regularizer=l2(0.01))(pool1_norm1)
    conv2_3x3 = Conv2D(filters=192, kernel_size=(3,3), strides=(1,1), padding='same', activation='relu', 
                       kernel_regularizer=l2(0.01))(conv2_3x3_reduce)
    conv2_norm2 = LRN()(conv2_3x3)
    maxpool2_3x3 = MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='same')(conv2_norm2)

    inception_3a = inception(input=maxpool2_3x3, axis=3, params=[(64,),(96,128),(16,32),(32,)])
    inception_3b = inception(input=inception_3a, axis=3, params=[(128,),(128,192),(32,96),(64,)])
    maxpool3_3x3 = MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='same')(inception_3b)

    inception_4a = inception(input=maxpool3_3x3, axis=3, params=[(192,),(96,208),(16,48),(64,)])
    inception_4b = inception(input=inception_4a, axis=3, params=[(160,),(112,224),(24,64),(64,)])
    inception_4c = inception(input=inception_4b, axis=3, params=[(128,),(128,256),(24,64),(64,)])
    inception_4d = inception(input=inception_4c, axis=3, params=[(112,),(144,288),(32,64),(64,)])
    inception_4e = inception(input=inception_4d, axis=3, params=[(256,),(160,320),(32,128),(128,)])
    maxpool4_3x3 = MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='same')(inception_4e)

    inception_5a = inception(input=maxpool4_3x3, axis=3, params=[(256,),(160,320),(32,128),(128,)])
    inception_5b = inception(input=inception_5a, axis=3, params=[(384,),(192,384),(48,128),(128,)]) 
    avgpool1_7x7 = AveragePooling2D(pool_size=(7,7), strides=(7,7), padding='same')(inception_5b)

    drop = Dropout(rate=0.4)(avgpool1_7x7)
    linear = Dense(num_classes, activation='softmax', kernel_regularizer=l2(0.01))(drop)
    
    model = Model(inputs=input, outputs=linear)

    return model 

def inception(input, axis, params):

    # Bind the vertical cells together for an elegant realization
    [branch1, branch2, branch3, branch4] = params

    conv_11 = Conv2D(filters=branch1[0], kernel_size=(1,1), padding='same', activation='relu', 
                     kernel_regularizer=l2(0.01))(input)

    conv_12 = Conv2D(filters=branch2[0], kernel_size=(1,1), padding='same', activation='relu', 
                     kernel_regularizer=l2(0.01))(input)
    conv_22 = Conv2D(filters=branch2[1], kernel_size=(3,3), padding='same', activation='relu', 
                     kernel_regularizer=l2(0.01))(conv_12)

    conv_13 = Conv2D(filters=branch3[0], kernel_size=(1,1), padding='same', activation='relu', 
                     kernel_regularizer=l2(0.01))(input)
    conv_23 = Conv2D(filters=branch3[1], kernel_size=(5,5), padding='same', activation='relu', 
                     kernel_regularizer=l2(0.01))(conv_13)

    maxpool_14 = MaxPooling2D(pool_size=(3,3), strides=(1,1), padding='same')(input)
    maxpool_proj_24 = Conv2D(filters=branch4[0], kernel_size=(1,1), strides=(1,1), padding='same', 
                             activation='relu', kernel_regularizer=l2(0.01))(maxpool_14)

    inception_output = concatenate([conv_11, conv_22, conv_23, maxpool_proj_24], axis=3)  

    return inception_output

if __name__ == "__main__":

    input_shape = (224, 224, 3)
    num_classes = 1000

    # Assign the values 
    model = googlenet(input_shape, num_classes)

    model.summary()

Cheers!

@babloogpb1 babloogpb1 commented Oct 29, 2020

(quoting @mikechen66's TensorFlow 2.x modifications above)

Thank you for the help!
