Created July 17, 2020 08:43
Code snippet to fix ``WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.beta_1`` in TensorFlow 2.2
# Workaround for the optimizer checkpoint bug in TensorFlow 2.2:
# wrap each float hyperparameter in a tf.Variable so it gets tracked.
optimizer = tf.keras.optimizers.Adam(
    learning_rate=tf.Variable(0.001),
    beta_1=tf.Variable(0.9),
    beta_2=tf.Variable(0.999),
    epsilon=tf.Variable(1e-7),
)
optimizer.iterations  # accessing this invokes optimizer._iterations and creates the optimizer.iter attribute
optimizer.decay = tf.Variable(0.0)  # Adam.__init__ assumes ``decay`` is a float, so it must be converted to a tf.Variable **after** __init__
The root problem is that Adam.__init__ initializes these hyperparameters as plain Python floats, which TensorFlow does not track.
We need them to be tracked so that they appear in Adam._checkpoint_dependencies, which lets the weights load without actually calling the optimizer.
Converting the Python floats to tf.Variable makes them tracked, because tf.Variable is a subclass of ``trackable.Trackable``.
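The tracking mechanism behind the fix can be illustrated with `tf.Module` (the `Holder` class below is purely illustrative, not part of the gist): TensorFlow's object tracking only picks up `Trackable` attributes, so a plain float hyperparameter is invisible to checkpointing while a `tf.Variable` holding the same value is tracked.

```python
import tensorflow as tf

class Holder(tf.Module):
    """Illustrative container: stores one attribute, float or tf.Variable."""
    def __init__(self, value):
        super().__init__()
        self.value = value

# A plain Python float is not tracked by TensorFlow's object graph...
plain = Holder(0.9)
print(len(plain.variables))    # 0 -- nothing to checkpoint

# ...but a tf.Variable (a Trackable subclass) is picked up automatically.
tracked = Holder(tf.Variable(0.9))
print(len(tracked.variables))  # 1 -- appears as a checkpoint dependency
```

This is exactly why wrapping `beta_1` and friends in `tf.Variable` makes them show up in the optimizer's checkpoint dependencies.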
Hi, just wanted to know whether you have resolved this problem. I just encountered the same issue. Thank you very much!
Hi @lixuanhng, were you able to solve this problem? I'm encountering the same issue in another project, and yours is the most detailed account of this error that I've found.
Do reply if you were ever able to solve it, that would be really helpful! Thanks!!
@lixuanhng Hmm, it seems like the optimizer's internal variables (like optimizer.beta_1) and slot variables (e.g. model.biLSTM_2.backward_layer.cell.bias) are all not being resolved.
Is there any particular reason not to use the tf.keras API for saving and restoring the weights? If not, I would recommend using it.
The problem with not using the tf.keras API for saving/restoring is that the core TensorFlow API and the tf.keras API handle variable objects differently in saved data, if I remember correctly.
So you have to manually bridge those gaps between the two (native TensorFlow and tf.keras), which is very tedious.
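To make the recommendation above concrete, here is a minimal sketch of saving and restoring weights through the tf.keras API; the file name `demo.weights.h5` and the toy architecture are assumptions for illustration, not from the thread.

```python
import numpy as np
import tensorflow as tf

def build_model():
    # Small illustrative model; any tf.keras model works the same way.
    return tf.keras.Sequential([
        tf.keras.Input(shape=(8,)),
        tf.keras.layers.Dense(4),
    ])

model = build_model()
model.save_weights("demo.weights.h5")   # illustrative path

restored = build_model()                # rebuild the same architecture
restored.load_weights("demo.weights.h5")

# The restored kernel matches the saved one exactly.
assert np.allclose(model.layers[0].get_weights()[0],
                   restored.layers[0].get_weights()[0])
```

Because both saving and loading go through the same tf.keras machinery, the variable objects in the saved data line up automatically, with no manual bridging between formats.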