@mphielipp I think you should check your version of TF first:
python -c 'import tensorflow as tf; print(tf.__version__)'
The version should be 0.12.x
@mphielipp Did it work for you? I installed the latest TensorFlow (0.12.1), and pip show tensorflow reports 0.12.1, but I still get the same error as you.
@mphielipp Replace that line with:
self.AW = tf.Variable(tf.random_normal([h_size // 2, env.actions]))
It expects an integer, not a float.
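The difference between the two operators can be checked in plain Python, without TensorFlow. The sketch below assumes Python 3, where `/` always returns a float and `//` returns an int for int operands; under Python 2 the original `h_size/2` would have produced an int. The value 512 for `h_size` is a hypothetical choice for illustration only.

```python
# Hypothetical hidden-layer size, chosen only for illustration.
h_size = 512

# True division: in Python 3 this is always a float, even for even numbers.
float_half = h_size / 2

# Floor division: stays an int when both operands are ints,
# so it is safe to use as a tensor dimension.
int_half = h_size // 2

print(type(float_half).__name__, float_half)
print(type(int_half).__name__, int_half)
```

A shape such as `[h_size // 2, env.actions]` therefore contains only integers, which is what the int32/int64 constraint in the error message asks for.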
Hi, first, thanks so much for your detailed write-ups and commented implementations. I have been working through them while developing my own RL environment outside of gym.
I have a few questions regarding the implementation for Double-DQN here:
- The Double-DQN paper (https://arxiv.org/pdf/1511.06581.pdf) algorithm mentions updating \theta with each step t. It looks like the implementation here updates \theta every update_freq steps, and updates \theta- immediately afterwards. Is there something I don't understand? I guess it ends up being a heuristic decision when to perform these updates; I'm just wondering what your intuition is for the \theta, \theta- update cycle.
- Second is your nice TensorFlow hack to update the targetQ weights. Does it rely on the order of initialization? Might there be a more verbose but explicit way to do it, maybe storing the targetQ ops by name in a dictionary?
- Last, is there a reason for not using a nonlinearity/activation in the network?
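To make the second question concrete, here is a rough, framework-free sketch of what a name-keyed target update could look like. It is not the gist's code: it assumes the update is a soft blend controlled by a rate `tau`, and the function name `soft_update`, the variable names, and the plain-list weights are all hypothetical stand-ins for real tensors.

```python
def soft_update(main_weights, target_weights, tau):
    """Return target weights moved a fraction tau toward the main weights.

    Both arguments are dicts mapping a variable name to a list of floats,
    so variables are paired by name rather than by creation order.
    """
    unmatched = set(main_weights) ^ set(target_weights)
    if unmatched:
        raise KeyError("unmatched variable names: %s" % sorted(unmatched))

    updated = {}
    for name, main_values in main_weights.items():
        target_values = target_weights[name]
        # Blend element-wise: tau * main + (1 - tau) * target.
        updated[name] = [
            tau * m + (1.0 - tau) * t
            for m, t in zip(main_values, target_values)
        ]
    return updated


# Hypothetical weights for two named variables.
main = {"conv1/W": [1.0, 2.0], "AW": [0.5]}
target = {"conv1/W": [0.0, 0.0], "AW": [0.0]}
target = soft_update(main, target, tau=0.5)
print(target)
```

Because the pairing is done by key, a mismatch between the two networks raises an error instead of silently copying weights into the wrong variable.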
I would like to ask a question: do we have to split the inputs in order to achieve dueling DQN?
Why can't I just feed all the inputs into both the value layer and the advantage layer?
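For reference, the step that combines the two streams can be sketched in a few lines. This assumes the mean-subtracted aggregation, Q(s, a) = V(s) + (A(s, a) - mean over a' of A(s, a')); the function name `dueling_q_values` and the example numbers are hypothetical and not taken from the gist.

```python
def dueling_q_values(value, advantages):
    """Combine a scalar state value with per-action advantages.

    Subtracting the mean advantage keeps the value/advantage
    decomposition identifiable.
    """
    mean_advantage = sum(advantages) / len(advantages)
    return [value + (a - mean_advantage) for a in advantages]


# Hypothetical outputs of the value and advantage streams for one state.
q_values = dueling_q_values(1.0, [0.5, 1.5, 1.0])
print(q_values)
```

In this sketch the aggregation only needs a scalar value and one advantage per action, so it places no constraint on whether each stream was fed half of the features or all of them.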
I'm getting this message:
----> 2 mainQN = Qnetwork(h_size)
---> 16 self.AW = tf.Variable(tf.random_normal([h_size/2,env.actions]))
---> 77 seed2=seed2)
--> 189 name=name)
--> 582 _Attr(op_def, input_arg.type_attr))
lib\site-packages\tensorflow\python\framework\op_def_library.py in _SatisfiesTypeConstraint(dtype, attr_def)
58 "DataType %s for attr '%s' not in list of allowed values: %s" %
59 (dtypes.as_dtype(dtype).name, attr_def.name,
---> 60 ", ".join(dtypes.as_dtype(x).name for x in allowed_list)))
TypeError: DataType float32 for attr 'T' not in list of allowed values: int32, int64