@korymath
Created May 24, 2016 18:58
import gym

env = gym.make('Reacher-v1')
env.reset()
env.render()

# Record results and video to /tmp/reacher-1 (force=True overwrites old data).
env.monitor.start('/tmp/reacher-1', force=True)

for i_episode in range(101):
    # Start each episode from a fresh initial state.
    observation = env.reset()
    for t in range(100):
        env.render()
        # print(observation)
        # Action selection: sample a random action from the action space.
        action = env.action_space.sample()
        # Take the action and observe the reward and next state.
        observation, reward, done, info = env.step(action)
        if done:
            print("Episode finished after {} timesteps".format(t + 1))
            break

env.monitor.close()
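
The loop above observes a reward at every step but never uses it. As a minimal sketch of how the same random-agent loop could accumulate a per-episode return, the block below swaps in a hypothetical `StubEnv` class (not part of gym, with a made-up reward) so that it runs without gym or MuJoCo installed. The names `StubEnv`, `sample_action`, and `run_random_episodes` are assumptions for illustration only; with a real environment, `env.action_space.sample()` would take the place of `sample_action()`.

```python
import random


class StubEnv(object):
    """Hypothetical stand-in for a gym environment (not part of gym)."""

    def __init__(self, horizon=100, seed=0):
        self.horizon = horizon          # episode ends after this many steps
        self.rng = random.Random(seed)  # private RNG so runs are repeatable
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0

    def sample_action(self):
        # Plays the role of env.action_space.sample() in the gist.
        return self.rng.uniform(-1.0, 1.0)

    def step(self, action):
        self.t += 1
        observation = float(self.t)
        reward = -abs(action)  # made-up reward: penalize large actions
        done = self.t >= self.horizon
        return observation, reward, done, {}


def run_random_episodes(env, n_episodes, max_steps):
    """Run a random policy; return a (total_reward, steps) pair per episode."""
    results = []
    for _ in range(n_episodes):
        env.reset()
        total_reward, steps = 0.0, 0
        for _ in range(max_steps):
            _, reward, done, _ = env.step(env.sample_action())
            total_reward += reward
            steps += 1
            if done:
                break
        results.append((total_reward, steps))
    return results


results = run_random_episodes(StubEnv(horizon=50), n_episodes=3, max_steps=100)
for total_reward, steps in results:
    print("return {:.2f} after {} timesteps".format(total_reward, steps))
```

Returning the totals from a function, rather than printing inside the loop, keeps the episode logic separate from reporting, which makes it easier to compare a random baseline against a learned policy later.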