Last active December 8, 2020 09:40
Quantization-aware training in Keras
import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Conv2D, Flatten
from tensorflow.keras.optimizers import RMSprop
# Download MNIST to '~/.keras/datasets/' the first time this is called
# x_train shape (60000, 28, 28), y_train shape (60000,); x_test shape (10000, 28, 28), y_test shape (10000,)
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# data pre-processing
x_train = x_train.reshape(x_train.shape[0], x_train.shape[1], x_train.shape[2], 1) / 255. # normalize
x_test = x_test.reshape(x_test.shape[0], x_test.shape[1], x_test.shape[2], 1) / 255. # normalize
y_train = to_categorical(y_train, num_classes=10) #one hot
y_test = to_categorical(y_test, num_classes=10) #one hot
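# Create model
model = Sequential()
model.add(Conv2D(16, (3, 3), input_shape=(28, 28, 1)))
# Flatten and project to 10 classes so the softmax output matches the one-hot labels
model.add(Flatten())
model.add(Dense(10))
model.add(Activation('softmax', name='pred'))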
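# Quantization aware training
sess = tf.keras.backend.get_session()
# Insert fake-quantization nodes into the training graph (TF 1.x contrib API),
# then initialize the variables they introduce
tf.contrib.quantize.create_training_graph(sess.graph)
sess.run(tf.global_variables_initializer())
# You can plot the quantize training graph on tensorboard
# tf.summary.FileWriter('/workspace/tensorboard', graph=sess.graph)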
# Define optimizer
rmsprop = RMSprop(lr=0.001, rho=0.9, epsilon=1e-08, decay=0.0)
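# Add metrics to report more than just the loss
model.compile(optimizer=rmsprop,
              loss='categorical_crossentropy',
              metrics=['accuracy'])
# Train on the one-hot MNIST labels
model.fit(x_train, y_train, epochs=1, batch_size=256)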
print('\nTesting ------------')
# Evaluate the model with the metrics we defined earlier
loss, accuracy = model.evaluate(x_test, y_test)
print('test loss: ', loss)
print('test accuracy: ', accuracy)
# Print the min/max values tracked by the weight fake-quant nodes
for node in sess.graph.as_graph_def().node:
    if 'weights_quant/AssignMaxLast' in node.name \
            or 'weights_quant/AssignMinLast' in node.name:
        tensor = sess.graph.get_tensor_by_name(node.name + ':0')
        print('{} = {}'.format(node.name, sess.run(tensor)))
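
The min and max printed above bound the range onto which a fake-quant node snaps the weights. As a rough illustration of that arithmetic, here is a minimal pure-Python sketch. The function name `fake_quantize` and the range values are hypothetical, and the sketch skips the range nudging that TensorFlow applies to keep zero exactly representable, so treat it as an approximation and not as the library's implementation.

```python
def fake_quantize(x, min_val, max_val, num_bits=8):
    """Snap x to a uniform grid over [min_val, max_val] and return it as a float."""
    levels = 2 ** num_bits - 1               # number of steps, 255 for 8 bits
    scale = (max_val - min_val) / levels     # width of one quantization step
    clamped = min(max(x, min_val), max_val)  # values outside the range saturate
    q = round((clamped - min_val) / scale)   # integer code in [0, levels]
    return min_val + q * scale               # back to float ("dequantize")


# Hypothetical range, standing in for values read from the min/max nodes
w_min, w_max = -1.0, 1.0
for w in (-1.5, -0.3, 0.0, 0.7):
    print(w, '->', fake_quantize(w, w_min, w_max))
```

Values outside the range saturate at the nearest bound, and values inside it move by at most half a step, which is the error the network learns to tolerate during quantization-aware training.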

I tried this code with no success. I used TensorFlow r1.13.
Which version did you use?


I can run it in the official TensorFlow 1.13 Docker image.
Do you get any error message?


martinmatak commented Jun 10, 2019

Don't you need to set up quantization explicitly for the evaluation phase, as is done in the official description of the quantization-aware training process:


SandorSeres commented Jun 10, 2019 via email


> Don't you need to set up quantization explicitly for the evaluation phase, as is done in the official description of the quantization-aware training process:

I think the training graph supports the forward pass as well as the backward pass.
Hence, we can still fetch each tensor from the training graph.
The 'pred' output tensor in the training graph should be the same as in the eval graph.


It seems I did something wrong last time. :(
This time I used
docker run -it -v ${PWD}:/work tensorflow/tensorflow python /work/
and it ran fine.
Next I need to find out how to deploy this model on the Google Coral Dev Board TPU.
Have you tried that already?


Hi @rocking5566,

I am having trouble freezing the trained model. Did you manage to freeze the model for later inference?


I am trying to quantize a segmentation model. The model is fully convolutional, yet only the last layer gets a fake quantization node. All the other convolutional layers are conv+bn+relu; the only layer with a fake quantization node is a plain conv without bn or relu.
Did you manage to get fake quantization nodes on all the convolutional layers?
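
For context on why conv+bn layers are special: quantization rewriters typically fold the batch norm into the preceding convolution before quantizing the weights, so a conv+bn pattern the rewriter does not recognize may end up without a fake quantization node. Whether that is the cause here is an assumption. The folding arithmetic itself can be sketched for a single scalar channel as follows; `fold_batch_norm` and `conv_then_bn` are hypothetical helper names, and the numbers are made up for illustration.

```python
import math


def conv_then_bn(x, weight, bias, gamma, beta, mean, var, eps=1e-3):
    """Reference path: a scalar 'convolution' followed by batch norm."""
    return gamma * (weight * x + bias - mean) / math.sqrt(var + eps) + beta


def fold_batch_norm(weight, bias, gamma, beta, mean, var, eps=1e-3):
    """Return (weight', bias') so that weight' * x + bias' equals conv_then_bn."""
    factor = gamma / math.sqrt(var + eps)
    return weight * factor, beta + (bias - mean) * factor


# Made-up parameters for one channel
params = dict(weight=0.8, bias=0.1, gamma=1.5, beta=-0.2, mean=0.3, var=0.25)
w_fold, b_fold = fold_batch_norm(**params)
for x in (-1.0, 0.0, 2.0):
    print(x, conv_then_bn(x, **params), w_fold * x + b_fold)
```

The folded weight is what gets quantized, which is why the rewriter has to see the convolution and its batch norm together.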


anniezhi commented Aug 9, 2019

Hi @SandorSeres, did you succeed in deploying your model on Google Coral? I'm using TF instead of Keras, but I'm also facing quantization problems (BatchNorm specifically).

