Last active November 13, 2020 21:12
OpenAI Gym tutorial

Getting set up: follow the instructions on

git clone
cd gym
pip install -e . # minimal install

Basic Example using CartPole-v0:

Level 1: Getting environment up and running

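import gym
env = gym.make('CartPole-v0')
env.reset() # an environment must be reset before the first step
for _ in range(1000): # run for 1000 steps
    action = env.action_space.sample() # pick a random action
    env.step(action) # take action
env.close() # release the environment's resources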

Level 2: Running trials (AKA episodes)

import gym
env = gym.make('CartPole-v0')
for i_episode in range(20):
    observation = env.reset() # reset for each new trial
    for t in range(100): # run for 100 timesteps or until done, whichever is first
        action = env.action_space.sample() # select a random action (see
        observation, reward, done, info = env.step(action)
        if done:
            print("Episode finished after {} timesteps".format(t+1))
            break # stop this episode and move on to the next one
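The loop above unpacks four values from env.step(), which matches the Gym releases this tutorial was written against. Newer releases (Gym 0.26 and later, and Gymnasium) return five values instead, splitting done into terminated and truncated. A minimal sketch of a helper that accepts either shape is shown here; the name unpack_step is hypothetical and not part of Gym, and the tuples below are hand-written stand-ins for real step results.

```python
def unpack_step(result):
    """Normalize an env.step() result to (observation, reward, done, info)."""
    if len(result) == 5:
        # Newer API: (observation, reward, terminated, truncated, info)
        observation, reward, terminated, truncated, info = result
        return observation, reward, terminated or truncated, info
    # Older API, as used in this tutorial: (observation, reward, done, info)
    observation, reward, done, info = result
    return observation, reward, done, info


# Hand-written tuples standing in for real env.step() results (assumed values)
old_style = ([0.0, 0.0, 0.01, 0.0], 1.0, False, {})
new_style = ([0.0, 0.0, 0.01, 0.0], 1.0, False, True, {})

print(unpack_step(old_style)[2])  # False
print(unpack_step(new_style)[2])  # True
```

With this in place, the step line in the loop would read observation, reward, done, info = unpack_step(env.step(action)). Note that in those newer releases env.reset() also returns an (observation, info) pair rather than the observation alone.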

Level 3: Non-random actions

import gym
env = gym.make('CartPole-v0')
highscore = 0
for i_episode in range(20): # run 20 episodes
  observation = env.reset()
  points = 0 # keep track of the reward each episode
  while True: # run until episode is done
    action = 1 if observation[2] > 0 else 0 # if the pole angle is positive, move right; otherwise, move left
    observation, reward, done, info = env.step(action)
    points += reward
    if done:
      if points > highscore: # record high score
        highscore = points
      break # end this episode whether or not it set a new high score
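The rule used in Level 3 can be pulled out into a small function, which makes it easy to check without running the simulator. This is only a sketch: choose_action is a hypothetical name, and it assumes the CartPole observation layout of [cart position, cart velocity, pole angle, pole angular velocity], with action 0 pushing the cart left and action 1 pushing it right.

```python
def choose_action(observation):
    """Push the cart toward the side the pole is leaning to."""
    pole_angle = observation[2]  # third entry is the pole angle
    return 1 if pole_angle > 0 else 0  # 1 = push right, 0 = push left


# Hand-written observations (assumed values, not taken from a real run)
print(choose_action([0.0, 0.0, 0.05, 0.0]))   # 1
print(choose_action([0.0, 0.0, -0.05, 0.0]))  # 0
print(choose_action([0.0, 0.0, 0.0, 0.0]))    # 0
```

Inside the episode loop, the action line would then become action = choose_action(observation). An angle of exactly zero falls through to a left push, the same as in the original one-liner.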
Shumakriss commented Jun 17, 2017

Thank you for the tutorial!

Hate to be picky but it is code and there's a typo on line 6 in the code block for Level 1:

action = env.action_space.sampe() # pick a random action

should be:

action = env.action_space.sample() # pick a random action

Also, I think the break in Level 3 is indented one level too many since it will not restart the episode when it's "done".

royerk commented Oct 28, 2017

Hi, thanks for the tuto. I believe there is a small mistake, the break in the final example needs one less indentation.

if done:
      if points > highscore: # record high score
           highscore = points
      break

kschultz1986 commented Dec 17, 2017

Basic tutorial question:

import gym
env = gym.make('CartPole-v0')
for _ in range(1000): # run for 1000 steps
   action = env.action_space.sample() # pick a random action
   env.step(action) # take action

What am I supposed to do with this? Paste it to command line? Paste it to a file and run it with some command?

jvmncs commented Dec 23, 2017

@kschultz1986 you should probably learn how to use Python first. you're not going to be able to use Gym if you don't know how to write and run a Python program, which seems to be the case here.

but if you insist...
assuming you've installed python and gym and all the dependencies correctly on your system, you can paste that code into a text file (say, and then run python

Amazing! Thank you for this tutorial

