@davidADSP
Last active January 23, 2021 20:07
Training a PPO model on Pendulum
import gym
from stable_baselines import PPO1
from stable_baselines.common.policies import MlpPolicy
from stable_baselines.common.callbacks import EvalCallback

# Training environment
env = gym.make('Pendulum-v0')
model = PPO1(MlpPolicy, env)

# Separate evaluation env for the callback
eval_env = gym.make('Pendulum-v0')
eval_callback = EvalCallback(eval_env, best_model_save_path='./logs/',
                             log_path='./logs/', eval_freq=500,
                             deterministic=True, render=False)

# Train for 5000 timesteps; the callback evaluates the policy every
# 500 steps and saves the best model so far to ./logs/
model.learn(5000, callback=eval_callback)
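Under the hood, EvalCallback periodically runs the current policy on the separate evaluation env and tracks the mean episode reward, saving a checkpoint whenever it improves. A minimal sketch of that evaluation loop, using a hypothetical toy environment and policy (neither is part of stable-baselines; they just make the sketch self-contained):

```python
def evaluate_policy(predict, env, n_eval_episodes=5):
    """Run the policy for n_eval_episodes and return the mean episode reward."""
    episode_rewards = []
    for _ in range(n_eval_episodes):
        obs = env.reset()
        done, total_reward = False, 0.0
        while not done:
            action = predict(obs)  # deterministic action, as in the callback above
            obs, reward, done, _info = env.step(action)
            total_reward += reward
        episode_rewards.append(total_reward)
    return sum(episode_rewards) / len(episode_rewards)


class ToyEnv:
    """Hypothetical stand-in env: episodes last 3 steps, reward 1.0 per step."""
    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        obs, reward, done, info = 0.0, 1.0, self.t >= 3, {}
        return obs, reward, done, info


mean_reward = evaluate_policy(lambda obs: 0.0, ToyEnv())
# mean_reward == 3.0 (three steps per episode, reward 1.0 each)
```

EvalCallback does essentially this every `eval_freq` training steps, comparing the mean reward against the best seen so far before saving to `best_model_save_path`.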