@Paulescu
Last active April 28, 2022 12:54
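from tqdm import tqdm

# NOTE: `env` (a Gym-style environment) and `agent` (an object exposing
# `act(state)`) are expected to be defined before this snippet runs.
n_episodes = 100

# Per-episode metrics collected during evaluation.
reward_per_episode = []
success_per_episode = []

for i in tqdm(range(n_episodes)):
    state = env.reset()
    total_reward = 0
    done = False
    reward = None

    # Run one episode until the environment signals termination.
    while not done:
        action = agent.act(state)
        # Classic Gym step API: `step` returns a 4-tuple.
        next_state, reward, done, info = env.step(action)
        total_reward += reward
        state = next_state

    reward_per_episode.append(total_reward)
    # An episode counts as a success if its final step reward is positive.
    success_per_episode.append(1 if reward > 0 else 0)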
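
Once the evaluation loop has filled `reward_per_episode` and `success_per_episode`, the two lists can be reduced to a pair of summary numbers. The sketch below is one possible way to do that, not part of the original gist: the `summarize` helper and the sample values passed to it are hypothetical and only illustrate the shape of the data.

```python
from statistics import mean


def summarize(reward_per_episode, success_per_episode):
    """Return (average total reward, success rate) over the evaluated episodes."""
    if not reward_per_episode:
        raise ValueError("no episodes recorded")
    avg_reward = mean(reward_per_episode)
    # Successes are stored as 0/1 flags, so their mean is the success rate.
    success_rate = sum(success_per_episode) / len(success_per_episode)
    return avg_reward, success_rate


# Hypothetical values standing in for the lists filled by the loop above.
avg_reward, success_rate = summarize([-10, 5, 20], [0, 1, 1])
print(f"Average reward: {avg_reward:.2f}, success rate: {success_rate:.0%}")
```

In practice the call would be `summarize(reward_per_episode, success_per_episode)` after the loop finishes.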