Human-level control through deep reinforcement learning
Deep Reinforcement Learning with Double Q-learning
Dueling Network Architectures for Deep Reinforcement Learning
Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition
Online articles:
Evolution Strategies as a Scalable Alternative to Reinforcement Learning
Deep Reinforcement Learning: Pong from Pixels (Andrej Karpathy)
arXiv:
High-Dimensional Continuous Control Using Generalized Advantage Estimation
Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic
Proximal Policy Optimization Algorithms
From project:
Continuous control with deep reinforcement learning
Benchmarking Deep Reinforcement Learning for Continuous Control
Setting up a Reinforcement Learning Task with a Real-World Robot