This is a submission of the second training epoch of Flappy Bird using Deep Q-Learning, built with Python and TensorFlow.
The first epoch was trained with the following settings:
Replay buffer pre-filled with 1000 random iterations
training episodes: 20000
learning rate for Adam Optimizer: 1e-5
Used a sinusoidal epsilon function with parameters:
starting epsilon: 1.0
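The gist does not include the sinusoidal epsilon function itself, so the sketch below is one hypothetical shape consistent with a starting epsilon of 1.0: a linear decay modulated by a sine wave, so exploration periodically rises again during training. The function name, the final epsilon floor, and the cycle count are all assumptions, not values from the gist.

```python
import math

def sinusoidal_epsilon(step, total_steps, eps_start=1.0, eps_end=0.05, cycles=5):
    """Hypothetical sinusoidal epsilon schedule: linear decay from
    eps_start to eps_end, modulated by a sine wave whose amplitude
    shrinks as training progresses."""
    frac = min(step / total_steps, 1.0)
    base = eps_start + (eps_end - eps_start) * frac          # linear decay
    wave = 0.5 * (1 - frac) * math.sin(2 * math.pi * cycles * frac)
    # Clamp so epsilon always stays within [eps_end, eps_start].
    return max(eps_end, min(eps_start, base + wave))
```

One appeal of a sinusoidal schedule over plain decay is that the agent gets recurring bursts of exploration late in training, which can help it escape locally optimal flapping policies.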
@chuchro3
chuchro3 / frozenlake
Created February 24, 2017 21:14
FrozenLake-v0
Model free Q-Learning in an MDP style environment
Utilized code from Berkeley's CS188 Reinforcement Learning project
Introduced an epsilon decay to offer a transition between early exploration and late exploitation
Q-Learning parameters:
alpha = 0.1
epsilon = 1.0
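As a rough illustration of the ingredients listed above (model-free tabular Q-Learning, alpha = 0.1, epsilon starting at 1.0 with a decay from early exploration toward late exploitation), here is a minimal sketch. The Gym-style `env_reset`/`env_step` callables, gamma, and the decay constants are assumptions for illustration, not values from the gist.

```python
import random
from collections import defaultdict

def q_learning(env_step, env_reset, n_actions, episodes=2000,
               alpha=0.1, gamma=0.9, eps_start=1.0, eps_end=0.01,
               eps_decay=0.999):
    """Tabular Q-learning with epsilon-greedy action selection and
    multiplicative epsilon decay after each episode."""
    Q = defaultdict(float)  # maps (state, action) -> value
    eps = eps_start
    for _ in range(episodes):
        s, done = env_reset(), False
        while not done:
            # Epsilon-greedy: explore with probability eps, else act greedily.
            if random.random() < eps:
                a = random.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda x: Q[(s, x)])
            s2, r, done = env_step(s, a)
            best_next = max(Q[(s2, x)] for x in range(n_actions))
            target = r if done else r + gamma * best_next
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2
        eps = max(eps_end, eps * eps_decay)  # decay exploration each episode
    return Q
```

On FrozenLake-v0 the same loop would wrap `env.reset()` and `env.step(a)` from Gym; the toy interface here just keeps the sketch self-contained.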
@chuchro3
chuchro3 / cartpole
Last active February 24, 2017 21:07
CartPole-v1
Tabular Q-Learning on CartPole-v1:
Utilized code from Berkeley's CS188 Q-Learning project
Discretized the state space from continuous values
Bins per dimension (cart_x, cart_velocity, pole_theta, pole_velocity): [5, 10, 20, 10]
Introduced an epsilon decay to offer a transition between early exploration and late exploitation
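The discretization step above can be sketched as follows. The bin counts [5, 10, 20, 10] come from the gist; the clipping bounds for each continuous dimension are assumptions (chosen near CartPole's observation limits), and the function name is hypothetical.

```python
# Bin counts from the gist: cart_x, cart_velocity, pole_theta, pole_velocity
BINS = (5, 10, 20, 10)
# Assumed clipping ranges for each continuous dimension (not from the gist)
BOUNDS = ((-2.4, 2.4), (-3.0, 3.0), (-0.21, 0.21), (-3.5, 3.5))

def discretize(obs, bins=BINS, bounds=BOUNDS):
    """Map a continuous CartPole observation to a tuple of bin indices,
    usable as a key in a tabular Q-function."""
    state = []
    for x, n, (lo, hi) in zip(obs, bins, bounds):
        x = min(max(x, lo), hi)               # clip to the assumed range
        idx = int((x - lo) / (hi - lo) * n)   # uniform-width bins
        state.append(min(idx, n - 1))         # top edge falls in the last bin
    return tuple(state)
```

With these bin counts the table has 5 × 10 × 20 × 10 = 10,000 states, small enough for tabular Q-Learning to cover with epsilon-greedy exploration.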