@steveKapturowski
Created June 6, 2017 19:42
async dqn+cts

Generated using the dqn+cts implementation from https://github.com/steveKapturowski/tensorflow-rl. Trained for 80M agent steps with 16 agents, with final epsilon values sampled from (.1, .5, .01) as in the async Q-learning method from Asynchronous Methods for Deep Reinforcement Learning.
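
For reference, a minimal sketch of the per-thread epsilon scheme described above, assuming the setup from Asynchronous Methods for Deep Reinforcement Learning: each learner thread draws its final epsilon once at startup and anneals toward it linearly. The sampling probabilities (0.4, 0.3, 0.3) follow that paper; whether this run used the same weights, and the exact helper names, are assumptions rather than the gist's actual code.

```python
import numpy as np

# Final epsilon candidates and sampling weights from the async paper
# (assumed; the gist only states the candidate values .1, .5, .01).
FINAL_EPSILONS = [0.1, 0.5, 0.01]
SAMPLING_PROBS = [0.4, 0.3, 0.3]

def sample_final_epsilon(rng=np.random):
    """Draw one final epsilon per learner thread at startup."""
    return rng.choice(FINAL_EPSILONS, p=SAMPLING_PROBS)

def annealed_epsilon(step, final_epsilon, initial_epsilon=1.0,
                     annealing_steps=400_000):
    """Linearly anneal epsilon over 400k steps, per the gist description."""
    frac = min(step / annealing_steps, 1.0)
    return initial_epsilon + frac * (final_epsilon - initial_epsilon)
```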

Double DQN parameters match Deep Reinforcement Learning with Double Q-learning, except for the replay buffer size and the per-thread epsilon annealing schedule, which were both set to 400k. Pseudocount / CTS density model parameters should match those of Unifying Count-Based Exploration and Intrinsic Motivation.
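
For readers unfamiliar with the pseudocount bonus referenced above, here is a rough sketch of how it is computed from a CTS density model, following Unifying Count-Based Exploration and Intrinsic Motivation. The `density_model` interface and the bonus coefficient `beta` (0.05 in that paper's DQN experiments) are assumptions, not the tensorflow-rl API:

```python
import math

def pseudocount_bonus(density_model, frame, beta=0.05):
    """Exploration bonus from a CTS density model via pseudo-counts.

    `density_model` is assumed to expose log_prob(frame) and update(frame);
    the actual tensorflow-rl CTS implementation may differ.
    """
    # Probability of the frame before the model has seen it.
    rho = math.exp(density_model.log_prob(frame))
    # Update the model on the frame, then query the "recoding" probability.
    density_model.update(frame)
    rho_prime = math.exp(density_model.log_prob(frame))

    # Pseudo-count: N(x) = rho * (1 - rho') / (rho' - rho).
    pseudocount = rho * (1.0 - rho_prime) / max(rho_prime - rho, 1e-12)

    # Bonus added to the environment reward.
    return beta / math.sqrt(pseudocount + 0.01)
```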

Evaluation was performed with an epsilon of .01.

@jazcollins

Hi there, I am hoping to reproduce this result with your code! In terms of the command to run, so far I've got:

python main.py MontezumaRevenge-v0 --alg_type dqn-cts -n 16 --epsilon_annealing_steps=400000 --replay_size=400000

Am I missing anything? For example, how can I set the final epsilon values to sample from, or is that already built into the codebase? Thanks!
