Siddharthan siddpr

@siddpr
siddpr / Evolutionary Strategy
Last active May 24, 2017
Implementation of Evolution Strategies (ES), a scalable alternative to reinforcement learning
## Continuous cart pole using evolution strategies (ES)
## This is an implementation of the paper https://arxiv.org/pdf/1703.03864.pdf
## Running this script should do the trick
import gym
import numpy as np
from gym import wrappers
env = gym.make('InvertedPendulum-v1')
env = wrappers.Monitor(env, '/home/sid/ccp_pg', force=True)  # overwrite existing monitor files
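The snippet above only shows the environment setup. The core of the paper's method is a simple loop: perturb the policy parameters with Gaussian noise, score each perturbation by its return, and step the parameters along the return-weighted average of the noise. A minimal sketch of that update, using a toy quadratic objective in place of the episodic return (all names here are illustrative, not from the gist):

```python
import numpy as np

def es_optimize(f, theta, npop=50, sigma=0.1, alpha=0.01, iters=300):
    """Sketch of the ES update from arXiv:1703.03864 on a toy objective."""
    for _ in range(iters):
        eps = np.random.randn(npop, theta.size)            # Gaussian perturbations
        returns = np.array([f(theta + sigma * e) for e in eps])
        # Normalize returns to advantages for a better-conditioned update
        adv = (returns - returns.mean()) / (returns.std() + 1e-8)
        # Stochastic estimate of the gradient of expected return
        theta = theta + alpha / (npop * sigma) * eps.T @ adv
    return theta

# Toy objective: maximize -(theta - 3)^2, so the optimum is theta = 3
f = lambda th: -np.sum((th - 3.0) ** 2)
theta = es_optimize(f, np.zeros(3))
```

In the real script, `f` would be a full cart-pole rollout returning the episode reward; the paper's key point is that this loop parallelizes trivially, since workers only need to exchange scalar returns and random seeds.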
@siddpr
siddpr / Policy gradient.
Created May 20, 2017
The following uses the simple vanilla REINFORCE algorithm with the average reward as baseline b
# Continuous cart pole using policy gradients (PG)
# Running this script does the trick!
import gym
import numpy as np
from gym import wrappers
env = gym.make('InvertedPendulum-v1')
env = wrappers.Monitor(env, '/home/sid/ccp_pg', force=True)  # overwrite existing monitor files
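The preview above again stops at the environment setup. The described method, vanilla REINFORCE with a running average reward as baseline b, can be sketched on a two-armed bandit standing in for the cart-pole environment (a hedged illustration; the policy, learning rates, and reward model here are assumptions, not from the gist):

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.zeros(2)                 # logits of a softmax policy over 2 arms
arm_means = np.array([0.2, 0.8])    # arm 1 pays more on average
baseline, alpha, beta = 0.0, 0.1, 0.05

for _ in range(2000):
    # Softmax policy pi(a | theta)
    probs = np.exp(theta - theta.max())
    probs /= probs.sum()
    a = rng.choice(2, p=probs)
    r = rng.normal(arm_means[a], 0.1)            # sampled reward

    # grad of log pi(a | theta) for a softmax policy: e_a - probs
    grad_logp = -probs
    grad_logp[a] += 1.0

    theta += alpha * (r - baseline) * grad_logp  # REINFORCE update with baseline b
    baseline += beta * (r - baseline)            # b tracks the average reward

best_arm = int(np.argmax(theta))
```

Subtracting the baseline leaves the gradient estimate unbiased but reduces its variance, which is why the gist's description singles it out; in the full script the same update would be applied per episode with the episodic return in place of `r`.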