Last active
March 24, 2021 07:38
Reinforcement Learning Tutorial in Tensorflow: Model-based RL
Hi Arthur,
Thank you very much for making these tutorials! They are awesome!
However, there seem to be a number of incompatibilities/bugs in this notebook. I had to make the following modifications to get the notebook running on Tensorflow 1.0.0:
- I had to comment out the line:
from modelAny import *
because no script named modelAny was provided, and the rest of the code does not need anything from it.
- rnn_cell seems to have been removed from tensorflow.python.ops in the current version. It was also never used in the rest of the code, so I commented out:
from tensorflow.python.ops import rnn_cell
- tf.concat() has a different syntax now: the list of tensors comes first and the axis second. I had to make the following modification:
predicted_state = tf.concat([predicted_observation,predicted_reward,predicted_done],1)
- tf.mul() had to be replaced by tf.multiply() as follows:
done_loss = tf.multiply(predicted_done, true_done) + tf.multiply(1-predicted_done, 1-true_done)
And everything executed as expected :)
Thank you
Anirban
RuntimeWarning: overflow encountered in multiply x = um.multiply(x, x, out=x).
Then the reward starts to have large values like 11062986271742011518222336.000000.
@krystynak
Your actions array has to have a second dimension. For instance, the following works and returns an empty array of shape (0, 5):
np.hstack([np.empty(0).reshape(0,4), np.empty(0).reshape(0,1)])
while the following gives you your error:
np.hstack([np.empty(0).reshape(0,4),np.empty(0).reshape(0,)])
I would reshape actions to (-1,1), or initialize it as np.empty(0).reshape(0,1) and append using np.vstack.
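A minimal sketch of that suggestion, assuming 4-dimensional observations and one scalar action per step. The names states, actions, and record_step are made up for illustration and are not taken from the notebook:

```python
import numpy as np

# Hypothetical buffers: 4-dimensional observations and one scalar action per step.
states = np.empty(0).reshape(0, 4)
actions = np.empty(0).reshape(0, 1)  # keep the second dimension from the start


def record_step(states, actions, observation, action):
    """Append one (observation, action) pair as a new row in each buffer."""
    states = np.vstack([states, np.reshape(observation, (1, 4))])
    actions = np.vstack([actions, np.reshape(action, (1, 1))])
    return states, actions


for step in range(3):
    states, actions = record_step(states, actions, np.zeros(4), step % 2)

# Both buffers are 2-D, so hstack can join them column-wise.
state_action = np.hstack([states, actions])
print(state_action.shape)  # (3, 5)

# A 1-D actions array can be repaired with reshape(-1, 1) before stacking.
flat_actions = np.array([0, 1, 0])
repaired = np.hstack([states, flat_actions.reshape(-1, 1)])
print(repaired.shape)  # (3, 5)
```

Either route should avoid the dimension mismatch, since np.hstack only ever sees 2-D arrays with the same number of rows.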