# Policy Resnet!

This is based on Andrej Karpathy's RL tutorial (http://karpathy.github.io/2016/05/31/rl/), but uses a residual neural network written in theano+lasagne to approximate the policy, and adam to optimize the parameters.
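
A minimal sketch of how the adam update might be wired up in theano+lasagne. The REINFORCE-style loss, the learning rate, and the stand-in one-layer `net` (the actual residual net is described below) are assumptions for illustration, not the gist's actual code:

```python
import theano
import theano.tensor as T
import lasagne
from lasagne.layers import InputLayer, DenseLayer
from lasagne.nonlinearities import sigmoid

# Stand-in policy network; the real residual net is sketched further down.
net = DenseLayer(InputLayer((None, 80 * 80)), 1, nonlinearity=sigmoid)

observations = T.matrix('observations')
actions = T.vector('actions')        # sampled actions in {0, 1}
advantages = T.vector('advantages')  # discounted, normalized returns

# P(action = 1) for each observation in the batch.
p_up = lasagne.layers.get_output(net, observations).flatten()
# REINFORCE-style loss: negative log-likelihood of the taken actions,
# weighted by their advantages.
log_lik = actions * T.log(p_up) + (1 - actions) * T.log(1 - p_up)
loss = -T.mean(log_lik * advantages)

params = lasagne.layers.get_all_params(net, trainable=True)
updates = lasagne.updates.adam(loss, params, learning_rate=1e-4)
train_fn = theano.function([observations, actions, advantages], loss,
                           updates=updates)
```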

The architecture is similar to the original resnet paper (arxiv.org/abs/1512.03385), but with no global pooling and a 512-unit fully connected layer after all the residual blocks:

  • 4 residual blocks: 16 filters -> 32 filters -> 32 filters -> 64 filters
  • all filters are 3x3
  • where the number of filters increases, stride 2x2 is used to halve height/width (see the sketch after this list)
  • ReLU nonlinearity
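
Putting those bullets together, a minimal sketch of what the network might look like in lasagne. Batch normalization, the 80x80 single-channel input, the initial 3x3 conv to reach 16 channels, and the sigmoid output head are assumptions carried over from the resnet paper and Karpathy's Pong setup, not details given above:

```python
import lasagne
from lasagne.layers import (InputLayer, Conv2DLayer, DenseLayer,
                            ElemwiseSumLayer, NonlinearityLayer, batch_norm)
from lasagne.nonlinearities import rectify, sigmoid

def residual_block(l_in, num_filters, downsample=False):
    # Two 3x3 convs plus a shortcut; stride 2 (and a 1x1 projection on the
    # shortcut) when the filter count increases, to halve height/width.
    stride = (2, 2) if downsample else (1, 1)
    conv1 = batch_norm(Conv2DLayer(l_in, num_filters, (3, 3), stride=stride,
                                   pad='same', nonlinearity=rectify))
    conv2 = batch_norm(Conv2DLayer(conv1, num_filters, (3, 3),
                                   pad='same', nonlinearity=None))
    shortcut = l_in
    if downsample:
        # Project the input so its shape matches the block's output.
        shortcut = Conv2DLayer(l_in, num_filters, (1, 1), stride=(2, 2),
                               nonlinearity=None)
    return NonlinearityLayer(ElemwiseSumLayer([conv2, shortcut]),
                             nonlinearity=rectify)

net = InputLayer(shape=(None, 1, 80, 80))  # assumed preprocessed frames
net = batch_norm(Conv2DLayer(net, 16, (3, 3), pad='same',
                             nonlinearity=rectify))
net = residual_block(net, 16)                     # 16 filters
net = residual_block(net, 32, downsample=True)    # 16 -> 32, stride 2x2
net = residual_block(net, 32)                     # 32 filters
net = residual_block(net, 64, downsample=True)    # 32 -> 64, stride 2x2
net = DenseLayer(net, 512, nonlinearity=rectify)  # 512-unit fc layer
net = DenseLayer(net, 1, nonlinearity=sigmoid)    # P(action = 1)
```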

# cem.py
This is a basic Python implementation of the Cross-Entropy Method (CEM) for reinforcement learning on OpenAI Gym's CartPole environment.
import gym
import numpy as np
import matplotlib.pyplot as plt

env = gym.make('CartPole-v0')
env.render(close=True)  # close any stray render window (old gym API)
state = env.reset()     # initial observation; its shape gives the parameter count

# Vectors of means (mu) and standard deviations (sigma), one per policy parameter
mu = np.random.uniform(size=state.shape)
sigma = np.random.uniform(low=0.001, size=state.shape)
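
A minimal sketch of the CEM loop that could follow this setup. The `evaluate` helper, the population size, and the elite fraction are illustrative assumptions, not the gist's actual values:

```python
def evaluate(theta, n_steps=200):
    # Hypothetical helper: run one CartPole episode with a linear threshold
    # policy parameterized by theta and return the total reward.
    obs = env.reset()
    total = 0.0
    for _ in range(n_steps):
        action = int(np.dot(theta, obs) > 0)
        obs, reward, done, info = env.step(action)
        total += reward
        if done:
            break
    return total

n_iters, batch_size, elite_frac = 20, 40, 0.2
n_elite = int(batch_size * elite_frac)
for it in range(n_iters):
    # Sample a population of parameter vectors from the current Gaussian.
    thetas = np.random.normal(mu, sigma, size=(batch_size,) + mu.shape)
    rewards = np.array([evaluate(t) for t in thetas])
    # Refit mu/sigma to the elite (top-scoring) samples.
    elite = thetas[rewards.argsort()[-n_elite:]]
    mu, sigma = elite.mean(axis=0), elite.std(axis=0)
    print('iter %d: mean reward %.1f' % (it, rewards.mean()))
```

The matplotlib import above could then be used to plot the mean reward per iteration and watch the distribution concentrate on good parameters.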