```python
import tensorflow as tf

# workaround to fix optimizer bug in tensorflow
optimizer = tf.keras.optimizers.Adam(
    learning_rate=tf.Variable(0.001),
    beta_1=tf.Variable(0.9),
    beta_2=tf.Variable(0.999),
    epsilon=tf.Variable(1e-7),
)
optimizer.iterations  # this access invokes optimizer._iterations and creates the optimizer.iter attribute
optimizer.decay = tf.Variable(0.0)  # Adam.__init__ assumes ``decay`` is a float, so it must be converted to tf.Variable **after** __init__
```
The root problem is that `Adam.__init__` initializes these hyperparameters as plain Python floats, which TensorFlow does not track.
We need them to be tracked and to appear in `Adam._checkpoint_dependencies` so that weights can be loaded without actually calling the optimizer first.
Converting the Python floats to `tf.Variable` makes them tracked, because `tf.Variable` is a subclass of ``trackable.Trackable``.
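To illustrate the tracking mechanism, here is a minimal pure-Python sketch (an analogy, not TensorFlow itself): dependency collection only sees attributes that are themselves `Trackable` objects, so a plain float hyperparameter is invisible to the checkpoint until it is wrapped in a `Variable`-like object. All class names here are hypothetical stand-ins for their TensorFlow counterparts.

```python
class Trackable:
    """Stand-in for tensorflow's trackable.Trackable base class."""
    pass


class Variable(Trackable):
    """Stand-in for tf.Variable: a value wrapped in a Trackable object."""
    def __init__(self, value):
        self.value = value


class Optimizer(Trackable):
    """Stand-in for a Keras optimizer that collects checkpoint dependencies."""
    @property
    def checkpoint_dependencies(self):
        # Only Trackable attributes are visible to the checkpointing machinery.
        return {name: attr for name, attr in vars(self).items()
                if isinstance(attr, Trackable)}


opt = Optimizer()

# A plain Python float is NOT tracked ...
opt.beta_1 = 0.9
assert "beta_1" not in opt.checkpoint_dependencies

# ... but wrapping it in a Variable makes it a checkpoint dependency.
opt.beta_1 = Variable(0.9)
assert "beta_1" in opt.checkpoint_dependencies
```

This mirrors why the workaround passes `tf.Variable(...)` for every hyperparameter: the real `tf.Variable` is `Trackable`, a raw float is not.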
@lixuanhng Hmm, it seems like the optimizer's internal variables (like `optimizer.beta_1`) and slot variables (e.g. `model.biLSTM_2.backward_layer.cell.bias`) are all not being resolved.
Is there any particular reason not to use the `tf.keras` API for saving and restoring the weights? If not, I would recommend using it.
```python
model = REModel(
    batch_size=batch_num,
    vocab_size=re_args.vocab_size,
    embedding_size=re_args.embedding_size,
    num_classes=re_args.num_classes,
    pos_num=re_args.pos_num,
    pos_size=re_args.pos_size,
    gru_units=re_args.gru_units,
    embedding_matrix=wordembedding,
)
optimizer = tf.keras.optimizers.Adam(
    learning_rate=tf.Variable(0.01),
    beta_1=tf.Variable(0.9),
    beta_2=tf.Variable(0.999),
    epsilon=tf.Variable(1e-7),
)
optimizer.iterations
optimizer.decay = tf.Variable(0.0)
model.compile(optimizer=optimizer)
# Or, if you also want to specify a loss:
# model.compile(loss=loss, optimizer=optimizer)

# Run the model once.
inputs_x = [sin_word_tensor, sin_pos1_tensor, sin_pos2_tensor]
predictions = model(inputs_x)
# You need to run the model at least once so that tensorflow creates all the variables.
# Alternatively, call the `model.build` method manually; then you don't have to run the model.

# Save and restore.
ckpt_path = 'ckpt'
model.save_weights(ckpt_path)
model.load_weights(ckpt_path)
# If you want to be sure that everything is working as expected,
# you can insert an `assert_consumed` call here.

# Inference.
inputs_x = [sin_word_tensor, sin_pos1_tensor, sin_pos2_tensor]
predictions = model(inputs_x)
```
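The comment about running the model once deserves a word of explanation: Keras layers create their weights lazily, on the first call (or on an explicit `build`), so before that first call there is nothing for `load_weights` to restore into. A minimal pure-Python sketch of this lazy-build pattern (an analogy, not Keras; `LazyLayer` is a hypothetical class):

```python
class LazyLayer:
    """Toy layer that, like a Keras layer, defers weight creation
    until the input shape is known."""
    def __init__(self, units):
        self.units = units
        self.weights = None  # not created yet

    def build(self, input_dim):
        # Weight creation happens here, once the input size is known.
        self.weights = [[0.0] * self.units for _ in range(input_dim)]

    def __call__(self, inputs):
        if self.weights is None:
            self.build(len(inputs))  # first call triggers build
        # Plain matrix-vector product over the (zero-initialized) weights.
        return [sum(x * w for x, w in zip(inputs, col))
                for col in zip(*self.weights)]


layer = LazyLayer(units=2)
assert layer.weights is None      # nothing to restore into yet
layer([1.0, 2.0, 3.0])            # one forward pass ...
assert layer.weights is not None  # ... and now the variables exist
```

This is why the snippet above runs `model(inputs_x)` (or suggests `model.build`) before `load_weights`.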
The problem with not using the `tf.keras` API for saving/restoring is that the core TensorFlow API and the `tf.keras` API handle variable objects in saved data differently, if I remember correctly.
So you have to manually bridge the gaps between the two (native TensorFlow and `tf.keras`), which is very tedious.
Hi, just wanted to know if you have resolved these problems by now? I just encountered the same issue. Thank you very much!
Hi @lixuanhng, were you able to solve this problem? I'm encountering the same issue in another project. Yours is the most detailed account of this error that I've found.
Do reply if you were ever able to solve it; that would be really helpful. Thanks!!
The warning message is like:
Seems like the optimizer objects are not resolved?