Deep reinforcement learning for environments with small state spaces.
The CartPole Gym environment was mastered in just a few minutes of training. Training and evaluation code is available at github.com/andreimuntean/deep-q-learning-lite.
Requirements:
- OpenAI Gym 0.8
- TensorFlow 1.0
Uses environments provided by OpenAI Gym.
The network has a single hidden layer of 40 rectified linear units (ReLUs). The output layer has one node per action, and each output node estimates the expected utility (the Q-value) of taking that action in the current state.
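As a rough illustration of this architecture, here is a minimal forward-pass sketch in plain Python. It is not the repository's TensorFlow code. The function names (`init_network`, `q_values`, `greedy_action`), the weight initialization range, and the seed are assumptions made for the example. Only the layer sizes follow the description above.

```python
import random


def init_network(num_inputs, num_actions, num_hidden=40, seed=0):
    """Create parameters for one hidden layer and a linear output layer.

    The uniform(-0.1, 0.1) initialization is an assumption for this sketch.
    """
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "w1": matrix(num_hidden, num_inputs),   # hidden layer weights
        "b1": [0.0] * num_hidden,               # hidden layer biases
        "w2": matrix(num_actions, num_hidden),  # output layer weights
        "b2": [0.0] * num_actions,              # output layer biases
    }


def affine(weights, biases, inputs):
    """Compute weights @ inputs + biases for list-based matrices."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def q_values(params, state):
    """Return one estimated utility per action for the given state."""
    hidden = [max(0.0, h) for h in affine(params["w1"], params["b1"], state)]  # ReLU
    return affine(params["w2"], params["b2"], hidden)  # linear output layer


def greedy_action(params, state):
    """Pick the index of the action with the highest estimated utility."""
    values = q_values(params, state)
    return max(range(len(values)), key=values.__getitem__)


# CartPole observations have 4 components and there are 2 actions.
params = init_network(num_inputs=4, num_actions=2)
example_state = [0.0, 0.1, -0.02, 0.05]  # made-up observation for illustration
action = greedy_action(params, example_state)
```

With untrained weights the chosen action is arbitrary. Training would adjust the parameters so that each output approaches the expected utility of its action.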
Heavily influenced by DeepMind's seminal papers 'Playing Atari with Deep Reinforcement Learning' (Mnih et al., 2013) and 'Human-level control through deep reinforcement learning' (Mnih et al., 2015).