
Alvaro avalcarce

  • Nokia Bell-Labs
  • Paris
@avalcarce
avalcarce / README.md
Last active September 24, 2017 17:11
RL DQN solution for MountainCar-v0, CartPole-v0 and CartPole-v1 on OpenAI's Gym

Synopsis

This is a Deep Reinforcement Learning solution to some classic control problems. I've used it to solve the MountainCar-v0, CartPole-v0 and [CartPole-v1](https://gym.openai.com/envs/CartPole-v1) problems in OpenAI's Gym. This code uses Tensorflow to model a value function for a Reinforcement Learning agent. The code is fundamentally a translation of necnec's Theano & Lasagne algorithm to Tensorflow. I've run it on Python 3.5 under Windows 7.
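The gist itself holds the full agent; as a rough illustration of the kind of Tensorflow value-function model involved, here is a minimal sketch of a Q-network and its TD loss, written against the TensorFlow 1.x graph API. The layer sizes, scope name and function names are illustrative assumptions, not the gist's actual code.

```python
import tensorflow as tf

def build_q_network(state_dim, n_actions, hidden1=64, hidden2=64, scope="q_net"):
    """Illustrative Q-network: maps a state vector to one Q-value per action."""
    with tf.variable_scope(scope):
        states = tf.placeholder(tf.float32, [None, state_dim], name="states")
        h1 = tf.layers.dense(states, hidden1, activation=tf.nn.relu)
        h2 = tf.layers.dense(h1, hidden2, activation=tf.nn.relu)
        q_values = tf.layers.dense(h2, n_actions, activation=None)
    return states, q_values

def build_training_ops(q_values, n_actions, learning_rate=1e-3):
    """Squared TD-error loss against externally computed targets."""
    actions = tf.placeholder(tf.int32, [None], name="actions")
    targets = tf.placeholder(tf.float32, [None], name="targets")
    # Q-value of the action actually taken in each transition.
    q_taken = tf.reduce_sum(q_values * tf.one_hot(actions, n_actions), axis=1)
    loss = tf.reduce_mean(tf.square(targets - q_taken))
    train_op = tf.train.AdamOptimizer(learning_rate).minimize(loss)
    return actions, targets, loss, train_op
```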

References

  1. Deep Learning tutorial, David Silver, Google DeepMind.
  2. necnec's algorithm
@avalcarce
avalcarce / README.md
Last active March 27, 2018 14:53
Solving MountainCar-v0

Synopsis

This is a Deep Reinforcement Learning solution to some classic control problems. I've used it to solve the MountainCar-v0, CartPole-v0 and [CartPole-v1](https://gym.openai.com/envs/CartPole-v1) problems in OpenAI's Gym. This code uses Tensorflow to model a value function for a Reinforcement Learning agent. I've run it with Tensorflow 1.0 on Python 3.5 under Windows 7.

Some of the hyperparameters used in the main.py script to solve MountainCar-v0 have been obtained partly through exhaustive search and partly via Bayesian optimization with Scikit-Optimize (a sketch of such a search follows the list). The optimized hyperparameters and their values are:

  • Size of 1st fully connected layer: 198
  • Size of 2nd fully connected layer: 96
  • Learning rate: 2.33E-4
  • Period (in steps) for the update of the target network parameters as per the DQN algorithm: 999
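The gist lists only the optimized values; for context, below is a minimal sketch of how such a search can be driven with Scikit-Optimize's gp_minimize. The search ranges and the train_and_evaluate helper are assumptions for illustration, not the actual objective used in main.py.

```python
from skopt import gp_minimize
from skopt.space import Integer, Real

def train_and_evaluate(fc1_size, fc2_size, learning_rate, target_update_period):
    """Hypothetical stand-in for the gist's training loop: train a DQN agent
    with these hyperparameters and return its average evaluation reward."""
    raise NotImplementedError("plug in the actual training run here")

def objective(params):
    """Score one candidate hyperparameter vector; gp_minimize minimizes this."""
    fc1_size, fc2_size, learning_rate, target_update_period = params
    avg_reward = train_and_evaluate(fc1_size, fc2_size, learning_rate, target_update_period)
    return -avg_reward  # negate the reward so that minimizing improves it

search_space = [
    Integer(32, 256),                       # size of the 1st fully connected layer
    Integer(32, 256),                       # size of the 2nd fully connected layer
    Real(1e-5, 1e-2, prior="log-uniform"),  # learning rate
    Integer(100, 2000),                     # target network update period (steps)
]

result = gp_minimize(objective, search_space, n_calls=50, random_state=0)
print("best hyperparameters:", result.x, "best score:", result.fun)
```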
@avalcarce
avalcarce / README.md
Created February 24, 2017 14:06
Solving CartPole-v0 with DQN

Synopsis

This is a Deep Reinforcement Learning solution to the CartPole-v0 environment in OpenAI's Gym. This code uses Tensorflow to model a value function for a Reinforcement Learning agent. I've run it with Tensorflow 1.0 on Python 3.5 under Windows 7.

Some of the hyperparameters used in the main.py script have been obtained via Bayesian optimization with Scikit-Optimize. The optimized hyperparameters and their values are:

  • Size of 1st fully connected layer: 208
  • Size of 2nd fully connected layer: 71
  • Learning rate: 1.09E-3
  • Period (in steps) for the update of the target network parameters as per the DQN algorithm: 800 (see the sketch after this list)
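For context on the last item: DQN keeps a periodically refreshed copy of the online network (the target network) and uses it to compute TD targets. A minimal TensorFlow 1.x sketch of that copy, assuming the two networks live in variable scopes named q_net and target_net (illustrative names, not necessarily those used in the gist):

```python
import tensorflow as tf

def make_target_update_op(online_scope="q_net", target_scope="target_net"):
    """Build an op that copies every online-network variable into the target network."""
    online_vars = sorted(
        tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope=online_scope),
        key=lambda v: v.name)
    target_vars = sorted(
        tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope=target_scope),
        key=lambda v: v.name)
    return tf.group(*[t.assign(o) for o, t in zip(online_vars, target_vars)])

# In the training loop, refresh the target network every `update_period` steps:
#     if step % update_period == 0:
#         sess.run(target_update_op)
```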
@avalcarce
avalcarce / README.md
Created February 27, 2017 09:27
Solving MountainCar-v0 with DQN in the least possible number of learning episodes for a minimum average reward of -110.

Synopsis

This is a Deep Reinforcement Learning solution to some classic control problems. I've used it to solve the MountainCar-v0, CartPole-v0 and [CartPole-v1](https://gym.openai.com/envs/CartPole-v1) problems in OpenAI's Gym. This code uses Tensorflow to model a value function for a Reinforcement Learning agent. I've run it with Tensorflow 1.0 on Python 3.5 under Windows 7.

Some of the hyperparameters used in the main.py script to solve MountainCar-v0 have been obtained via Bayesian optimization with Scikit-Optimize. The optimized hyperparameters and their values are:

  • Size of 1st fully connected layer: 47
  • Size of 2nd fully connected layer: 197
  • Epsilon (as in epsilon-greedy exploration) decay factor: 0.8513032459 (see the sketch after this list)
  • Minimum epsilon: 1.872686e-05
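The epsilon values above control an epsilon-greedy exploration schedule. A minimal sketch of the usual multiplicative decay with a floor follows; whether the decay is applied per step or per episode is an assumption here, not something the gist listing states.

```python
import random
import numpy as np

def epsilon_greedy_action(q_values, epsilon):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return int(np.argmax(q_values))

epsilon = 1.0
epsilon_decay = 0.8513032459  # decay factor listed above
epsilon_min = 1.872686e-05    # minimum epsilon listed above

for episode in range(400):
    # ... run one episode, selecting actions with epsilon_greedy_action(q, epsilon) ...
    epsilon = max(epsilon_min, epsilon * epsilon_decay)  # decay once per episode (assumed)
```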
@avalcarce
avalcarce / README.md
Created March 3, 2017 14:43
Solving CartPole-v0 with DQN and Prioritized Experience Replay

Synopsis

This is a Deep Reinforcement Learning solution to the CartPole-v0 environment in OpenAI's Gym. This code uses Tensorflow to model a value function for a Reinforcement Learning agent. I've run it with Tensorflow 1.0 on Python 3.5 under Windows 7.

The algorithm is a Deep Q Network (DQN) with Prioritized Experience Replay (PER). Most hyperparameters have been chosen by hand based on past experience; however, the learning rate, the prioritization exponent alpha and the initial importance sampling exponent beta0 have been obtained via Bayesian optimization with Scikit-Optimize.
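For context on alpha and beta0: in PER, a transition with priority p_i is sampled with probability proportional to p_i^alpha, and the induced bias is corrected with importance-sampling weights (N * P(i))^(-beta), where beta is annealed from beta0 towards 1. A minimal NumPy sketch of that sampling step, not the gist's implementation:

```python
import numpy as np

def per_sample(priorities, batch_size, alpha, beta):
    """Sample transition indices proportionally to priority**alpha and
    return the corresponding importance-sampling weights."""
    priorities = np.asarray(priorities, dtype=np.float64)
    probs = priorities ** alpha
    probs /= probs.sum()
    indices = np.random.choice(len(priorities), size=batch_size, p=probs)
    weights = (len(priorities) * probs[indices]) ** (-beta)
    weights /= weights.max()  # normalize by the maximum weight for stability
    return indices, weights

# Example: five stored transitions with TD-error-based priorities.
indices, weights = per_sample([0.5, 2.0, 0.1, 1.0, 3.0], batch_size=3, alpha=0.6, beta=0.4)
```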

The hyperparameters are:

@avalcarce
avalcarce / README.md
Created March 3, 2017 17:02
Solving MountainCar-v0 with DQN and Prioritized Experience Replay

Synopsis

This is a Deep Reinforcement Learning solution to the MountainCar-v0 environment in OpenAI's Gym. This code uses Tensorflow to model a value function for a Reinforcement Learning agent. I've run it with Tensorflow 1.0 on Python 3.5 under Windows 7.

The algorithm is a Deep Q Network (DQN) with Prioritized Experience Replay (PER). Most hyperparameters have been chosen by hand based on past experience; however, the learning rate, the prioritization exponent alpha and the initial importance sampling exponent beta0 have been obtained via Bayesian optimization with Scikit-Optimize.

The hyperparameters are:

@avalcarce
avalcarce / README.md
Created March 6, 2017 12:09
Solving CartPole-v1 with DQN and Prioritized Experience Replay

Synopsis

This is a Deep Reinforcement Learning solution to the CartPole-v1 environment in OpenAI's Gym. This code uses Tensorflow to model a value function for a Reinforcement Learning agent. I've run it with Tensorflow 1.0 on Python 3.5 under Windows 7.

The algorithm is a Deep Q Network (DQN) with Prioritized Experience Replay (PER). Most hyperparameters have been chosen by hand based on past experience; however, the learning rate, the prioritization exponent alpha and the initial importance sampling exponent beta0 have been obtained via Bayesian optimization with Scikit-Optimize.

The hyperparameters are:

@avalcarce
avalcarce / README.md
Created March 7, 2017 13:06
Solving MountainCar-v0 with Double DQN and Prioritized Experience Replay (with proportional prioritization)

Synopsis

This is a Deep Reinforcement Learning solution to the MountainCar-v0 environment in OpenAI's Gym. This code uses Tensorflow to model a value function for a Reinforcement Learning agent. I've run it with Tensorflow 1.0 on Python 3.5 under Windows 7.

The algorithm is a Double Deep Q Network (DQN) with Prioritized Experience Replay (PER), where the proportional prioritization variant has been implemented. Most hyperparameters have been chosen by hand based on several experiments; however, the learning rate, the prioritization exponent alpha and the initial importance sampling exponent beta0 have been obtained via Bayesian optimization with Scikit-Optimize.
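The Double DQN modification only changes how the TD target is formed: the online network selects the best next action, while the target network evaluates it. A minimal NumPy sketch of that target computation (the function and argument names are illustrative, not taken from the gist):

```python
import numpy as np

def double_dqn_targets(rewards, dones, q_next_online, q_next_target, gamma=0.99):
    """Compute Double DQN targets for a batch of transitions.

    q_next_online and q_next_target have shape [batch, n_actions] and hold the
    online and target networks' Q-value estimates for the next states."""
    best_actions = np.argmax(q_next_online, axis=1)          # online net picks the action
    batch_index = np.arange(len(best_actions))
    next_values = q_next_target[batch_index, best_actions]   # target net provides its value
    return rewards + gamma * (1.0 - dones) * next_values
```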

The hyperparameters are:

@avalcarce
avalcarce / README.md
Created March 7, 2017 13:13
Solving CartPole-v0 with Double DQN and Prioritized Experience Replay (with proportional prioritization)

Synopsis

This is a Deep Reinforcement Learning solution to the CartPole-v0 environment in OpenAI's Gym. This code uses Tensorflow to model a value function for a Reinforcement Learning agent. I've run it with Tensorflow 1.0 on Python 3.5 under Windows 7.

The algorithm is a Double Deep Q Network (DQN) with Prioritized Experience Replay (PER), where the proportional prioritization variant has been implemented. Most hyperparameters have been chosen by hand based on several experiments; however, the learning rate, the prioritization exponent alpha and the initial importance sampling exponent beta0 have been obtained via Bayesian optimization with Scikit-Optimize.

The hyperparameters are:

@avalcarce
avalcarce / README.md
Created March 7, 2017 13:15
Solving CartPole-v1 with Double DQN and Prioritized Experience Replay (with proportional prioritization)

Synopsis

This is a Deep Reinforcement Learning solution to the CartPole-v1 environment in OpenAI's Gym. This code uses Tensorflow to model a value function for a Reinforcement Learning agent. I've run it with Tensorflow 1.0 on Python 3.5 under Windows 7.

The algorithm is a Double Deep Q Network (DQN) with Prioritized Experience Replay (PER), where the proportional prioritization variant has been implemented. Most hyperparameters have been chosen by hand based on several experiments; however, the learning rate, the prioritization exponent alpha and the initial importance sampling exponent beta0 have been obtained via Bayesian optimization with Scikit-Optimize.

The hyperparameters are: