@araffin · Created August 18, 2018 08:57
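Note: this snippet targets stable-baselines (the TensorFlow-based predecessor of Stable-Baselines3); Reacher-v2 additionally requires a MuJoCo-enabled gym installation.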
import gym
from stable_baselines.common.policies import MlpPolicy
from stable_baselines.common.vec_env import DummyVecEnv, VecNormalize
from stable_baselines import PPO2
# Create a vectorized environment (VecNormalize requires a VecEnv)
env = DummyVecEnv([lambda: gym.make("Reacher-v2")])
# Automatically normalize the input features: observations are normalized
# with a running average and clipped; rewards are left untouched here
env = VecNormalize(env, norm_obs=True, norm_reward=False,
                   clip_obs=10.)

model = PPO2(MlpPolicy, env)
model.learn(total_timesteps=2000)

# Don't forget to save the running average when saving the agent,
# otherwise the loaded model will receive unnormalized observations
log_dir = "/tmp/"
model.save(log_dir + "ppo_reacher")
env.save_running_average(log_dir)
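
To use the trained agent later, the normalization statistics must be restored into a freshly built VecNormalize wrapper before loading the model. A minimal sketch, assuming this version of VecNormalize exposes load_running_average as the counterpart of the save_running_average call used above:

# Rebuild the wrapped environment with the same normalization settings
env = DummyVecEnv([lambda: gym.make("Reacher-v2")])
env = VecNormalize(env, norm_obs=True, norm_reward=False,
                   clip_obs=10.)
# Restore the saved observation running average
# (assumed API: load_running_average, mirroring save_running_average)
env.load_running_average(log_dir)
model = PPO2.load(log_dir + "ppo_reacher", env=env)

# Sanity check: run the loaded policy on a normalized observation
obs = env.reset()
action, _states = model.predict(obs)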