  1. Install simple_dqn.
  2. Run `./train.sh Breakout-v0 --environment gym`.
  3. Check `results/Breakout-v0.csv` for the best performing epoch (in my case it was 61); a small helper script for this step is sketched after the list.
  4. Run `./test_gym.sh snapshots/Breakout-v0_61.pkl` (replace 61 with your best epoch).
  5. Optional: run `./upload_gym.sh results/Breakout-v0 --api_key <your_key>` to upload the results.
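
If you would rather not eyeball the CSV in step 3, a few lines of Python can pick the best epoch for you. This is only a sketch: the column names `epoch` and `average_reward` are assumptions about the results file's header, so check your own results/Breakout-v0.csv and adjust them if simple_dqn names the columns differently.

```python
import csv

# Assumed column names -- verify against the header row of your results CSV.
EPOCH_COL = "epoch"
REWARD_COL = "average_reward"

def best_epoch(path):
    """Return (epoch, reward) for the row with the highest average reward."""
    with open(path) as f:
        rows = list(csv.DictReader(f))
    best = max(rows, key=lambda row: float(row[REWARD_COL]))
    return int(best[EPOCH_COL]), float(best[REWARD_COL])

if __name__ == "__main__":
    epoch, reward = best_epoch("results/Breakout-v0.csv")
    print("Best epoch: %d (average reward %.2f)" % (epoch, reward))
```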

The Simple DQN implementation uses the network architecture and hyperparameters from the DeepMind Nature paper.
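
For reference, the Nature-paper network is three convolutional layers over a stack of four 84x84 frames, a 512-unit fully connected layer, and a linear output with one Q-value per action. The sketch below writes that shape out in Keras purely for illustration; simple_dqn itself is not a Keras project, so treat this as a description of the architecture rather than its actual code.

```python
from keras.models import Sequential
from keras.layers import Conv2D, Dense, Flatten

def nature_dqn(num_actions, input_shape=(84, 84, 4)):
    """DQN architecture from Mnih et al. (Nature, 2015):
    three conv layers, a 512-unit dense layer, linear Q-value output."""
    return Sequential([
        Conv2D(32, 8, strides=4, activation="relu", input_shape=input_shape),
        Conv2D(64, 4, strides=2, activation="relu"),
        Conv2D(64, 3, strides=1, activation="relu"),
        Flatten(),
        Dense(512, activation="relu"),
        Dense(num_actions),  # one Q-value per action, linear activation
    ])
```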

Used normalized advantage functions (NAF) from this paper:

Continuous Deep Q-Learning with Model-based Acceleration
Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine
http://arxiv.org/abs/1603.00748

The command line used was:

python naf.py Pendulum-v0 --l2_reg 0.001
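
NAF makes continuous-action Q-learning tractable by restricting the advantage to a quadratic in the action: Q(s, a) = V(s) + A(s, a) with A(s, a) = -1/2 (a - mu(s))^T P(s) (a - mu(s)), where P(s) = L(s) L(s)^T is positive definite because L(s) is lower triangular with a positive diagonal. Since the quadratic is maximised exactly at a = mu(s), the greedy action comes for free. A small NumPy sketch of that formula (illustrative only, not the code in naf.py):

```python
import numpy as np

def naf_q_value(action, value, mu, L):
    """Q(s,a) = V(s) - 0.5 * (a - mu(s))^T P(s) (a - mu(s)), with P = L L^T."""
    P = L @ L.T
    diff = action - mu
    return value - 0.5 * diff @ P @ diff

# Toy 2-dimensional action space; the numbers are made up for illustration.
L = np.tril(np.random.rand(2, 2)) + np.eye(2)   # lower triangular, positive diagonal
mu = np.array([0.1, -0.3])                      # the network's proposed action
print(naf_q_value(np.zeros(2), value=1.5, mu=mu, L=L))
print(naf_q_value(mu, value=1.5, mu=mu, L=L))   # maximised at a = mu(s): prints 1.5
```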

Used normalized advantage functions (NAF) from this paper:

Continuous Deep Q-Learning with Model-based Acceleration
Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine
http://arxiv.org/abs/1603.00748

The command line used was:

python naf.py InvertedPendulum-v1 --batch_norm --optimizer_lr 0.0001 --noise fixed --noise_scale 0.01 --tau 1 --l2_reg 0.001 --batch_size 1000

Used the dueling network architecture with Q-learning, as outlined in this paper:

Dueling Network Architectures for Deep Reinforcement Learning
Ziyu Wang, Tom Schaul, Matteo Hessel, Hado van Hasselt, Marc Lanctot, Nando de Freitas
http://arxiv.org/abs/1511.06581

Command line:

python duel.py CartPole-v0 --gamma 0.995
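
The dueling head splits the network into a state-value stream V(s) and an advantage stream A(s, ·), then recombines them as Q(s, a) = V(s) + (A(s, a) - mean_a' A(s, a')); subtracting the mean advantage keeps V and A identifiable. A tiny NumPy sketch of that aggregation step (illustrative, not duel.py's code):

```python
import numpy as np

def dueling_q(value, advantages):
    """Q(s,a) = V(s) + (A(s,a) - mean_a A(s,a)), as in Wang et al. (2016)."""
    return value + (advantages - advantages.mean())

# CartPole-v0 has two discrete actions; the numbers below are made up.
print(dueling_q(value=1.2, advantages=np.array([0.4, -0.1])))
```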

Used the dueling network architecture with Q-learning, as outlined in this paper:

Dueling Network Architectures for Deep Reinforcement Learning
Ziyu Wang, Tom Schaul, Matteo Hessel, Hado van Hasselt, Marc Lanctot, Nando de Freitas
http://arxiv.org/abs/1511.06581

Command line:

python duel.py Acrobot-v0

Used the dueling network architecture with Q-learning, as outlined in this paper:

Dueling Network Architectures for Deep Reinforcement Learning
Ziyu Wang, Tom Schaul, Matteo Hessel, Hado van Hasselt, Marc Lanctot, Nando de Freitas
http://arxiv.org/abs/1511.06581

Command line:

python duel.py MountainCar-v0

Used the dueling network architecture with Q-learning, as outlined in this paper:

Dueling Network Architectures for Deep Reinforcement Learning
Ziyu Wang, Tom Schaul, Matteo Hessel, Hado van Hasselt, Marc Lanctot, Nando de Freitas
http://arxiv.org/abs/1511.06581

Refer to the code for hyperparameter values.

Used normalized advantage functions (NAF) from this paper:

Continuous Deep Q-Learning with Model-based Acceleration
Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine
http://arxiv.org/abs/1603.00748

Refer to the code for hyperparameter values.

Used the dueling network architecture with Q-learning, as outlined in this paper:

Dueling Network Architectures for Deep Reinforcement Learning
Ziyu Wang, Tom Schaul, Matteo Hessel, Hado van Hasselt, Marc Lanctot, Nando de Freitas
http://arxiv.org/abs/1511.06581

Refer to the code for hyperparameter values.

  1. Install simple_dqn.
  2. Run `./train.sh Pong-v0 --environment gym`.
  3. Check `results/Pong-v0.csv` for the best performing epoch (in my case it was 81).
  4. Run `./test_gym.sh snapshots/Pong-v0_81.pkl` (replace 81 with your best epoch).
  5. Optional: run `./upload_gym.sh results/Pong-v0 --api_key <your_key>` to upload the results.

The Simple DQN implementation uses the network architecture and hyperparameters from the DeepMind Nature paper.