# Policy Resnet!

This is based on Andrej Karpathy's RL tutorial (http://karpathy.github.io/2016/05/31/rl/), but uses a residual neural network written in theano+lasagne to approximate the policy, and adam to optimize the parameters.
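
A minimal sketch of how the adam update might be wired up in theano+lasagne. The REINFORCE-style loss, the learning rate, and the stand-in one-layer `net` (the actual residual net is described below) are assumptions for illustration, not the gist's actual code:

```python
import theano
import theano.tensor as T
import lasagne
from lasagne.layers import InputLayer, DenseLayer
from lasagne.nonlinearities import sigmoid

# Stand-in policy network; the real residual net is sketched further down.
net = DenseLayer(InputLayer((None, 80 * 80)), 1, nonlinearity=sigmoid)

observations = T.matrix('observations')
actions = T.vector('actions')        # sampled actions in {0, 1}
advantages = T.vector('advantages')  # discounted, normalized returns

# P(action = 1) for each observation in the batch.
p_up = lasagne.layers.get_output(net, observations).flatten()
# REINFORCE-style loss: negative log-likelihood of the taken actions,
# weighted by their advantages.
log_lik = actions * T.log(p_up) + (1 - actions) * T.log(1 - p_up)
loss = -T.mean(log_lik * advantages)

params = lasagne.layers.get_all_params(net, trainable=True)
updates = lasagne.updates.adam(loss, params, learning_rate=1e-4)
train_fn = theano.function([observations, actions, advantages], loss,
                           updates=updates)
```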

The architecture is similar to the original resnet paper (arxiv.org/abs/1512.03385), but with no global pooling and a 512-unit fully connected layer after all the residual blocks:

  • 4 residual blocks: 16 filters -> 32 filters -> 32 filters -> 64 filters
  • all filters are 3x3
  • where the number of filters increases, stride 2x2 is used to halve height/width (see the sketch after this list)
  • ReLU nonlinearity
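
Putting those bullets together, a minimal sketch of what the network might look like in lasagne. Batch normalization, the 80x80 single-channel input, the initial 3x3 conv to reach 16 channels, and the sigmoid output head are assumptions carried over from the resnet paper and Karpathy's Pong setup, not details given above:

```python
import lasagne
from lasagne.layers import (InputLayer, Conv2DLayer, DenseLayer,
                            ElemwiseSumLayer, NonlinearityLayer, batch_norm)
from lasagne.nonlinearities import rectify, sigmoid

def residual_block(l_in, num_filters, downsample=False):
    # Two 3x3 convs plus a shortcut; stride 2 (and a 1x1 projection on the
    # shortcut) when the filter count increases, to halve height/width.
    stride = (2, 2) if downsample else (1, 1)
    conv1 = batch_norm(Conv2DLayer(l_in, num_filters, (3, 3), stride=stride,
                                   pad='same', nonlinearity=rectify))
    conv2 = batch_norm(Conv2DLayer(conv1, num_filters, (3, 3),
                                   pad='same', nonlinearity=None))
    shortcut = l_in
    if downsample:
        # Project the input so its shape matches the block's output.
        shortcut = Conv2DLayer(l_in, num_filters, (1, 1), stride=(2, 2),
                               nonlinearity=None)
    return NonlinearityLayer(ElemwiseSumLayer([conv2, shortcut]),
                             nonlinearity=rectify)

net = InputLayer(shape=(None, 1, 80, 80))  # assumed preprocessed frames
net = batch_norm(Conv2DLayer(net, 16, (3, 3), pad='same',
                             nonlinearity=rectify))
net = residual_block(net, 16)                     # 16 filters
net = residual_block(net, 32, downsample=True)    # 16 -> 32, stride 2x2
net = residual_block(net, 32)                     # 32 filters
net = residual_block(net, 64, downsample=True)    # 32 -> 64, stride 2x2
net = DenseLayer(net, 512, nonlinearity=rectify)  # 512-unit fc layer
net = DenseLayer(net, 1, nonlinearity=sigmoid)    # P(action = 1)
```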

# cem.py
This is a basic Python implementation of the Cross-Entropy Method (CEM) for reinforcement learning on OpenAI Gym's CartPole environment.
import gym
import numpy as np
import matplotlib.pyplot as plt

env = gym.make('CartPole-v0')
env.render(close=True)  # close any stray render window (old gym API)
state = env.reset()     # initial observation; its shape gives the parameter count

# Vectors of means (mu) and standard deviations (sigma), one per policy parameter
mu = np.random.uniform(size=state.shape)
sigma = np.random.uniform(low=0.001, size=state.shape)
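
A minimal sketch of the CEM loop that could follow this setup. The `evaluate` helper, the population size, and the elite fraction are illustrative assumptions, not the gist's actual values:

```python
def evaluate(theta, n_steps=200):
    # Hypothetical helper: run one CartPole episode with a linear threshold
    # policy parameterized by theta and return the total reward.
    obs = env.reset()
    total = 0.0
    for _ in range(n_steps):
        action = int(np.dot(theta, obs) > 0)
        obs, reward, done, info = env.step(action)
        total += reward
        if done:
            break
    return total

n_iters, batch_size, elite_frac = 20, 40, 0.2
n_elite = int(batch_size * elite_frac)
for it in range(n_iters):
    # Sample a population of parameter vectors from the current Gaussian.
    thetas = np.random.normal(mu, sigma, size=(batch_size,) + mu.shape)
    rewards = np.array([evaluate(t) for t in thetas])
    # Refit mu/sigma to the elite (top-scoring) samples.
    elite = thetas[rewards.argsort()[-n_elite:]]
    mu, sigma = elite.mean(axis=0), elite.std(axis=0)
    print('iter %d: mean reward %.1f' % (it, rewards.mean()))
```

The matplotlib import above could then be used to plot the mean reward per iteration and watch the distribution concentrate on good parameters.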