@steveKapturowski
Created June 4, 2017 22:02
async dqn+cts

Generated using the DQN+CTS implementation from https://github.com/steveKapturowski/tensorflow-rl. Trained for 60M agent steps with 16 agents, with each agent's final epsilon value sampled from (.1, .5, .01) as in the async Q-learning method from Asynchronous Methods for Deep Reinforcement Learning. A rough sketch of that per-agent epsilon scheme is shown below.
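The following is a minimal, hypothetical sketch of how each worker might draw its own final epsilon and anneal toward it; the function names, the uniform sampling over the three values, and the anneal horizon are assumptions for illustration, not code from the repo.

```python
import numpy as np

# Candidate final epsilon values, as described above. Sampling uniformly here is an
# assumption; the gist does not state the sampling probabilities used.
FINAL_EPSILONS = [0.1, 0.5, 0.01]

def sample_final_epsilon(rng=np.random):
    """Each of the 16 worker agents draws its own final epsilon once at startup."""
    return rng.choice(FINAL_EPSILONS)

def current_epsilon(step, final_epsilon, anneal_steps=4_000_000, initial_epsilon=1.0):
    """Linearly anneal epsilon from its initial value down to the sampled final value
    over `anneal_steps` agent steps (horizon chosen here for illustration)."""
    frac = min(step / anneal_steps, 1.0)
    return initial_epsilon + frac * (final_epsilon - initial_epsilon)
```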

Evaluation was produced with an epsilon of .01, and Q-learning updates continued to be performed every 16 agent steps with a learning rate of 4e-7 to aid exploration (see the sketch below). Since I neglected to save the RMSProp variables, performance was degraded compared to the mean score of around 2200-2500 observed for the .01 agents during the main training phase.
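The snippet below is a hypothetical sketch of such an evaluation loop, where the agent acts mostly greedily but keeps taking small Q-learning updates; `env`, `agent`, and methods like `greedy_action`, `store_transition`, and `q_learning_update` are illustrative stand-ins, not the actual API of the tensorflow-rl repo.

```python
import numpy as np

def evaluate_with_updates(env, agent, num_episodes=30, epsilon=0.01,
                          update_interval=16, learning_rate=4e-7):
    """Evaluate while still learning: epsilon-greedy with epsilon = .01, plus a
    Q-learning update every `update_interval` agent steps at a tiny learning rate."""
    scores, step = [], 0
    for _ in range(num_episodes):
        obs, done, episode_reward = env.reset(), False, 0.0
        while not done:
            if np.random.rand() < epsilon:
                action = env.action_space.sample()      # occasional random action
            else:
                action = agent.greedy_action(obs)       # argmax_a Q(obs, a)
            next_obs, reward, done, _ = env.step(action)
            agent.store_transition(obs, action, reward, next_obs, done)
            step += 1
            if step % update_interval == 0:
                agent.q_learning_update(learning_rate)  # small update to aid exploration
            obs, episode_reward = next_obs, episode_reward + reward
        scores.append(episode_reward)
    return np.mean(scores)
```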
