Benjamin Ellenberger benelot

## README.md

Created October 19, 2016 20:43, forked from tambetm/README.md

Used normalized advantage functions (NAF) from this paper:

Continuous Deep Q-Learning with Model-based Acceleration

Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine

http://arxiv.org/abs/1603.00748

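
For readers unfamiliar with NAF, here is a minimal NumPy sketch of how the paper builds the Q-value: a state value plus a quadratic advantage term that peaks at the policy action `mu`. It is not taken from `naf.py`. The function name `naf_q_value` and its arguments are hypothetical, and in a real agent `mu`, `l_entries` and `value` would be outputs of a neural network for the current state.

```python
import numpy as np

def naf_q_value(action, mu, l_entries, value):
    """Q(s, a) = V(s) + A(s, a), with A(s, a) = -1/2 (a - mu)^T P (a - mu).

    Hypothetical helper for illustration only. `l_entries` holds the
    dim * (dim + 1) / 2 lower-triangular entries of L in row-major order.
    """
    dim = mu.shape[0]
    L = np.zeros((dim, dim))
    L[np.tril_indices(dim)] = l_entries
    # Exponentiate the diagonal so that P = L L^T is positive definite.
    diag = np.diag_indices(dim)
    L[diag] = np.exp(L[diag])
    P = L @ L.T
    diff = action - mu
    advantage = -0.5 * diff @ P @ diff
    return value + advantage

mu = np.array([0.5, -0.5])
l_entries = np.zeros(3)  # L becomes the identity after exponentiation
q_at_mu = naf_q_value(mu, mu, l_entries, value=2.0)
q_off_mu = naf_q_value(mu + np.array([1.0, 0.0]), mu, l_entries, value=2.0)
```

Because the advantage is never positive and is zero exactly at `action == mu`, the greedy action is always `mu`, which is what makes Q-learning tractable with continuous actions.
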
The command line used was:

    python naf.py InvertedPendulum-v1 --batch_norm --optimizer_lr 0.0001 --noise fixed --noise_scale 0.01 --tau 1 --l2_reg 0.001 --batch_size 1000
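
A note on `--tau 1`, assuming the flag is the target-network interpolation coefficient from the paper's algorithm (this README does not confirm that). With a soft update of the form `target = tau * online + (1 - tau) * target`, a value of 1 copies the online weights into the target network on every update. The helper `soft_update` below is a hypothetical sketch, not code from `naf.py`.

```python
import numpy as np

def soft_update(target_weights, online_weights, tau):
    """Return new target weights blended toward the online weights.

    Hypothetical helper: both arguments are lists of NumPy arrays with
    matching shapes, one array per layer.
    """
    return [tau * w + (1.0 - tau) * t
            for w, t in zip(online_weights, target_weights)]

online = [np.array([1.0, 2.0]), np.array([[3.0]])]
target = [np.array([0.0, 0.0]), np.array([[1.0]])]

copied = soft_update(target, online, tau=1.0)   # target becomes the online weights
blended = soft_update(target, online, tau=0.5)  # halfway between the two
```
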