Hi Ying,
This is a really great approach to solving the CartPole problem. Would you be willing to share more information about the DQN architecture, such as a report or references?
I am trying to understand why your implementation is so efficient.
Hi dylan, HaiyangChen,
I'm not associated with yingzwang, but I can give some information. This is an implementation of the DQN algorithm (https://deepmind.com/research/dqn/), so the architecture is essentially the same as the one presented in the paper. The difference is a soft update of the target network's weights, using an exponential moving average parameterized by tau. She also uses a decreasing exploration strategy, which clearly helps in this problem. The rest is just good hyperparameter tuning.
The code is also very good, with good coding practices all around.
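To make those two ideas concrete, here is a minimal sketch in plain Python. It is not taken from the gist: the function names, the exponential form of the decay, and the default values for `tau`, `eps_start`, `eps_end`, and `decay_rate` are all assumptions chosen for illustration.

```python
import math


def soft_update(target_weights, online_weights, tau=0.01):
    """Return target weights moved a fraction tau toward the online weights.

    This is an exponential moving average: with a small tau the target
    network trails the online network slowly. The default tau is a
    hypothetical value, not the one used in the gist.
    """
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_weights, online_weights)]


def epsilon_at(step, eps_start=1.0, eps_end=0.01, decay_rate=0.001):
    """Exploration rate at a given step, decaying from eps_start to eps_end.

    An exponential schedule is assumed here; the gist may use a
    different decreasing schedule.
    """
    return eps_end + (eps_start - eps_end) * math.exp(-decay_rate * step)


if __name__ == "__main__":
    target = [0.0, 1.0]
    online = [1.0, 1.0]
    # Each call nudges the target weights toward the online weights.
    target = soft_update(target, online, tau=0.1)
    print(target)
    # Exploration shrinks as training progresses.
    print(epsilon_at(0), epsilon_at(1000), epsilon_at(10000))
```

With a small tau the regression targets change gradually instead of jumping every N steps as in the hard copy used in the original DQN paper, and the shrinking epsilon shifts the agent from exploring to exploiting what it has learned.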
DQN graph (generated by TensorBoard)