@chuchro3
Last active June 14, 2017 20:29
This is a submission of the second training epoch of Flappy Bird using Deep Q-Learning, built with Python and TensorFlow.
The first epoch was trained with the following settings:
- initial replay buffer: 1000 random iterations
- training episodes: 20000
- learning rate for the Adam optimizer: 1e-5
- our sinusoidal epsilon function (a sketch follows this list), with parameters:
  - starting epsilon: 1.0
  - epsilon decay rate: .9998
  - number of epsilon cycles: 7
- replay memory size: 75000
- batch size: 32
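
The exact form of the sinusoidal epsilon schedule is only described in the forthcoming paper, so the function below is just a plausible sketch under stated assumptions: a geometric decay envelope (starting epsilon multiplied by the decay rate each episode) modulated by a cosine that completes the configured number of cycles over the training run. The function name and the cosine modulation are assumptions for illustration, not the authors' exact formula.

```python
import math

def sinusoidal_epsilon(episode, total_episodes=20000,
                       start_eps=1.0, decay_rate=0.9998, num_cycles=7):
    """Hypothetical sinusoidal epsilon schedule (sketch, not the paper's formula).

    A per-episode geometric decay envelope (start_eps * decay_rate ** episode)
    is modulated by a cosine that completes `num_cycles` oscillations over
    `total_episodes` training episodes.
    """
    envelope = start_eps * (decay_rate ** episode)
    # Cosine term oscillates between 0 and 1, completing num_cycles full cycles.
    phase = 2.0 * math.pi * num_cycles * episode / total_episodes
    oscillation = 0.5 * (1.0 + math.cos(phase))
    return envelope * oscillation

# Example: epsilon values at a few points of the first epoch's run.
for ep in (0, 5000, 10000, 20000):
    print(ep, round(sinusoidal_epsilon(ep), 4))
```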
The second epoch was trained with the following settings (a minimal training-step sketch using these hyperparameters follows the list):
- initial replay buffer: 5000 forward passes through the network, without weight updates
- training episodes: 20000
- learning rate for the Adam optimizer: 9e-7
- our sinusoidal epsilon function, with parameters:
  - starting epsilon: .2
  - epsilon decay rate: .9997
  - number of epsilon cycles: 4
- replay memory size: 75000
- batch size: 32
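
The gist does not include the training code itself, so the following is only a minimal sketch of a single Deep Q-Learning update using the listed hyperparameters (replay memory of 75000, batch size 32, Adam at 9e-7). The network architecture, discount factor, state dimension, and action count are assumptions, not the authors' setup, and the code targets TensorFlow 2's Keras API rather than the 2017-era TensorFlow the project likely used.

```python
import random
from collections import deque

import numpy as np
import tensorflow as tf

REPLAY_MEMORY_SIZE = 75000
BATCH_SIZE = 32
LEARNING_RATE = 9e-7   # second-epoch setting from the list above
GAMMA = 0.99           # discount factor (not stated in the gist; assumed)

def build_q_network(state_dim, num_actions):
    # Hypothetical Q-network: the gist does not describe the architecture,
    # so a small dense network over a flattened state stands in for it.
    return tf.keras.Sequential([
        tf.keras.Input(shape=(state_dim,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(num_actions),
    ])

q_net = build_q_network(state_dim=8, num_actions=2)  # Flappy Bird: flap / no-op
optimizer = tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE)
replay_memory = deque(maxlen=REPLAY_MEMORY_SIZE)

# Transitions are appended as (state, action, reward, next_state, done) tuples:
# replay_memory.append((s, a, r, s_next, float(done)))

def train_step():
    """One Q-learning update on a random minibatch from replay memory."""
    if len(replay_memory) < BATCH_SIZE:
        return
    batch = random.sample(replay_memory, BATCH_SIZE)
    states, actions, rewards, next_states, dones = map(np.array, zip(*batch))

    # Bootstrap target: r + gamma * max_a' Q(s', a'), zeroed at terminal states.
    next_q = q_net(next_states.astype(np.float32))
    targets = (rewards + GAMMA * np.max(next_q.numpy(), axis=1) * (1.0 - dones))
    targets = targets.astype(np.float32)

    with tf.GradientTape() as tape:
        q_values = q_net(states.astype(np.float32))
        # Select the Q-value of the action actually taken in each transition.
        action_mask = tf.one_hot(actions, q_values.shape[1])
        chosen_q = tf.reduce_sum(q_values * action_mask, axis=1)
        loss = tf.reduce_mean(tf.square(targets - chosen_q))
    grads = tape.gradient(loss, q_net.trainable_variables)
    optimizer.apply_gradients(zip(grads, q_net.trainable_variables))
```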
More details about the network architecture, the sinusoidal decay function, and other aspects of the setup can be found in our paper (coming soon).
Link to Poster: <http://web.stanford.edu/~chuchro3/projects/openaigym/OpenAiGymDeepLearningPoster.pdf>