@tankala
Last active October 19, 2018 12:51
Playing the CartPole game with a deep learning model
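import random

import numpy as np

# Assumes env (the CartPole Gym environment), goal_steps and trained_model
# are already defined earlier in the script, and that env has been reset once.
scores = []
choices = []
for each_game in range(100):
    score = 0
    prev_obs = []
    for step_index in range(goal_steps):
        env.render()
        if len(prev_obs) == 0:
            # No observation yet: start the game with a random action.
            action = random.randrange(0, 2)
        else:
            # Reshape the observation into a batch of one and take the highest-scoring action.
            action = np.argmax(trained_model.predict(prev_obs.reshape(-1, len(prev_obs)))[0])

        choices.append(action)
        new_observation, reward, done, info = env.step(action)
        prev_obs = new_observation
        score += reward
        if done:
            break

    # Start a fresh episode before the next game.
    env.reset()
    scores.append(score)

print(scores)
print('Average Score:', sum(scores) / len(scores))
# Fraction of all actions taken that were 1 and 0.
print('choice 1:{} choice 0:{}'.format(choices.count(1) / len(choices), choices.count(0) / len(choices)))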
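
The action-selection step in the loop above can be tried on its own, without Gym or a trained network. The following is a minimal sketch under stated assumptions: `StubModel` and `choose_action` are hypothetical names that do not appear in the original, and the stub only mimics the shape of a `predict` result (one row of per-action scores per sample). It is not the trained model.

```python
import random

import numpy as np


class StubModel:
    """Hypothetical stand-in for trained_model; not a real network."""

    def predict(self, batch):
        # batch has shape (n_samples, 4). This stub treats a positive value in
        # column 2 (the pole angle in CartPole) as a cue for action 1.
        prefers_one = (batch[:, 2] > 0).astype(float)
        # One row per sample: [score for action 0, score for action 1].
        return np.stack([1.0 - prefers_one, prefers_one], axis=1)


def choose_action(model, prev_obs):
    """Random action when there is no observation yet, else the model's top-scoring action."""
    if len(prev_obs) == 0:
        return random.randrange(0, 2)
    # Turn the flat observation into a batch of one sample, shape (1, 4).
    batch = np.asarray(prev_obs).reshape(-1, len(prev_obs))
    return int(np.argmax(model.predict(batch)[0]))


print(choose_action(StubModel(), np.array([0.0, 0.0, 0.05, 0.0])))   # prints 1
print(choose_action(StubModel(), np.array([0.0, 0.0, -0.05, 0.0])))  # prints 0
```

The reshape matters because `predict` expects a batch, so a single observation of length 4 becomes an array of shape (1, 4), and `[0]` then selects the scores for that one sample.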