@iambrian
Last active November 13, 2020 21:12
OpenAI gym tutorial

Getting set up: follow the instructions at https://gym.openai.com/docs

git clone https://github.com/openai/gym
cd gym
pip install -e . # minimal install

Basic Example using CartPole-v0:

Level 1: Getting environment up and running

import gym
env = gym.make('CartPole-v0')
env.reset()
for _ in range(1000): # run for 1000 steps
    env.render()
    action = env.action_space.sample() # pick a random action
    env.step(action) # take action

Level 2: Running trials (AKA episodes)

import gym
env = gym.make('CartPole-v0')
for i_episode in range(20):
    observation = env.reset() # reset for each new trial
    for t in range(100): # run for 100 timesteps or until done, whichever is first
        env.render()
        action = env.action_space.sample() # select a random action (see https://github.com/openai/gym/wiki/CartPole-v0)
        observation, reward, done, info = env.step(action)
        if done:
            print("Episode finished after {} timesteps".format(t+1))
            break
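
The Level 2 loop depends on a small contract: reset() returns the first observation, and step(action) returns four values, with done signalling the end of the episode. The sketch below exercises that contract without installing gym. FixedLengthEnv and TwoActionSpace are hypothetical stand-ins written for this illustration, not part of gym, and the four-value step() return is assumed to match the gym version this tutorial targets (newer Gym releases and Gymnasium return five values).

```python
import random


class TwoActionSpace:
    """Hypothetical stand-in for a discrete action space with actions 0 and 1."""

    def sample(self):
        return random.choice([0, 1])


class FixedLengthEnv:
    """Hypothetical environment whose episodes always end after `length` steps."""

    def __init__(self, length):
        self.length = length
        self.action_space = TwoActionSpace()
        self.steps_taken = 0

    def reset(self):
        self.steps_taken = 0
        return [0.0, 0.0, 0.0, 0.0]  # placeholder observation

    def step(self, action):
        self.steps_taken += 1
        done = self.steps_taken >= self.length
        return [0.0, 0.0, 0.0, 0.0], 1.0, done, {}


def run_random_episode(env, max_steps=100):
    """Return how many timesteps the episode lasted, capped at max_steps."""
    env.reset()
    for t in range(max_steps):
        action = env.action_space.sample()  # random action, as in Level 2
        observation, reward, done, info = env.step(action)
        if done:
            return t + 1
    return max_steps  # the cap was reached before the environment said done


print(run_random_episode(FixedLengthEnv(12)))   # episode ends on its own
print(run_random_episode(FixedLengthEnv(500)))  # cut off by the step cap
```

Passing the environment in as an argument means the same function should also accept a real environment created with gym.make, provided it follows the same reset()/step() contract.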

Level 3: Non-random actions

import gym
env = gym.make('CartPole-v0')
highscore = 0
for i_episode in range(20): # run 20 episodes
  observation = env.reset()
  points = 0 # keep track of the reward each episode
  while True: # run until episode is done
    env.render()
    action = 1 if observation[2] > 0 else 0 # if the pole angle is positive, push right; otherwise push left
    observation, reward, done, info = env.step(action)
    points += reward
    if done:
      if points > highscore: # record high score
        highscore = points
      break # leave the while loop so the next episode can start
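
The Level 3 rule can also be pulled out into a function so it can be checked without rendering anything. This is a rough sketch: angle_policy and run_episode are names chosen here for illustration, and ToyPoleEnv is a hypothetical scripted stand-in that only mimics the CartPole observation layout (cart position, cart velocity, pole angle, pole angular velocity), not its physics. It assumes the same four-value step() return used above.

```python
def angle_policy(observation):
    """Push right (1) when the pole leans right, otherwise push left (0)."""
    pole_angle = observation[2]
    return 1 if pole_angle > 0 else 0


class ToyPoleEnv:
    """Hypothetical scripted environment; NOT the real CartPole dynamics.

    The pole angle follows a fixed script. The episode ends when the agent
    pushes away from the lean or when the script runs out.
    """

    def __init__(self, angles):
        self.angles = list(angles)
        self.t = 0

    def _observation(self):
        return [0.0, 0.0, self.angles[self.t], 0.0]

    def reset(self):
        self.t = 0
        return self._observation()

    def step(self, action):
        pushed_toward_lean = action == (1 if self.angles[self.t] > 0 else 0)
        self.t += 1
        out_of_script = self.t >= len(self.angles)
        done = (not pushed_toward_lean) or out_of_script
        if out_of_script:
            self.t = len(self.angles) - 1  # keep the last observation valid
        return self._observation(), 1.0, done, {}


def run_episode(env, policy, max_steps=200):
    """Run one episode with `policy` and return the total reward collected."""
    observation = env.reset()
    points = 0.0
    for _ in range(max_steps):
        observation, reward, done, info = env.step(policy(observation))
        points += reward
        if done:
            break  # same level as the reward update, so the episode really ends
    return points


env = ToyPoleEnv([0.1, -0.2, 0.3])
highscore = max(run_episode(env, angle_policy) for _ in range(3))
print(highscore)
```

Keeping the policy separate from the loop makes it easy to swap in a different rule and compare scores.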
@Shumakriss
Shumakriss commented Jun 17, 2017

Thank you for the tutorial!

Hate to be picky, but it is code and there's a typo on line 6 in the code block for Level 1:

action = env.action_space.sampe() # pick a random action

should be:

action = env.action_space.sample() # pick a random action

Also, I think the break in Level 3 is indented one level too many since it will not restart the episode when it's "done".

@fjshields1221
FFS iambrian

@royerk
royerk commented Oct 28, 2017

Hi, thanks for the tutorial. I believe there is a small mistake: the break in the final example needs one less level of indentation.

if done:
  if points > highscore: # record high score
    highscore = points
  break

@kschultz1986
kschultz1986 commented Dec 17, 2017

Basic tutorial question:

import gym
env = gym.make('CartPole-v0')
env.reset()
for _ in range(1000): # run for 1000 steps
   env.render()
   action = env.action_space.sampe() # pick a random action
   env.step(action) # take action

What am I supposed to do with this? Paste it to command line? Paste it to a file and run it with some command?

@jvmncs
jvmncs commented Dec 23, 2017

@kschultz1986 you should probably learn how to use Python first. you're not going to be able to use Gym if you don't know how to write and run a Python program, which seems to be the case here.

but if you insist...
assuming you've installed python and gym and all the dependencies correctly on your system, you can paste that code into a text file (say, test.py) and then run python test.py

@JaeDukSeo
Amazing! Thank you for this tutorial