Last active Jun 16, 2016
OpenAI CartPole evaluation à la iaroslav-ai

This gist documents my OpenAI evaluations at

I attempted to reproduce the quickest documented CartPole-v0 solution to date, which reports 29 episodes to solve, by iaroslav-ai, documented at

I don't know why I had a different result the first time, or indeed why it was faster the second time.

I also saw it fail with a traceback:

/srv/s/openai/gym/gym/envs/classic_control/ in render(self, return_rgb_array)
---> 95             arr = arr.reshape(self.height, self.width, 4)
ValueError: total size of new array must be unchanged

Note: no GPU/CUDA was used.


@rohinarora rohinarora commented Jun 14, 2016

Can someone from the community comment on why running the same algorithm might have produced such different results?
As stated above, it solved once in 29 episodes, once in 23 episodes, and once not at all. I am curious about it.


@JKCooper2 JKCooper2 commented Jun 16, 2016

Variance in solve time comes from randomness in both the environment and the agent. If you set the random seeds in both, then the result should be the same every time.
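
A minimal sketch of the seeding point, using only the Python standard library. `ToyEnv`, `make_random_agent`, and `evaluate` are hypothetical stand-ins invented for illustration, not the gym API or iaroslav-ai's agent. The environment and the agent each own a separate random generator, and fixing both seeds makes the run repeatable:

```python
import random


class ToyEnv:
    """Hypothetical stand-in for an environment with a random start state."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)  # the environment's own RNG

    def run_episode(self, policy):
        # Episode length depends on the random start state, the random
        # noise in the dynamics, and the agent's (random) actions.
        state = self.rng.uniform(-0.05, 0.05)
        steps = 0
        while abs(state) < 1.0 and steps < 200:
            action = policy(state)
            state += 0.1 * action + self.rng.uniform(-0.02, 0.02)
            steps += 1
        return steps


def make_random_agent(seed=None):
    """Hypothetical agent that picks actions from its own seeded RNG."""
    rng = random.Random(seed)  # the agent's own RNG
    return lambda state: rng.choice([-1, 1])


def evaluate(env_seed, agent_seed, episodes=5):
    """Run a few episodes and return the list of episode lengths."""
    env = ToyEnv(env_seed)
    agent = make_random_agent(agent_seed)
    return [env.run_episode(agent) for _ in range(episodes)]


# Same seeds on both sides: the two runs produce identical episode lengths.
print(evaluate(env_seed=0, agent_seed=1) == evaluate(env_seed=0, agent_seed=1))
```

With a real environment and learning library, every source of randomness would need the same treatment (for example, weight initialisation and any sampling inside the agent), which this toy does not model.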

This is partly why the 'Algorithms' section was created: so that performance can be measured over multiple evaluations (an algorithm's true ability should be independent of randomness).
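
As a rough sketch of what measuring over multiple evaluations could look like, the snippet below summarises several seeded runs by their mean and standard deviation. `episodes_to_solve` is a hypothetical placeholder that returns a made-up number, not real CartPole data; in practice it would be replaced by an actual training run:

```python
import random
import statistics


def episodes_to_solve(seed):
    """Hypothetical stand-in for one full training run.

    Returns a placeholder episode count; these numbers are not real results.
    """
    rng = random.Random(seed)
    return rng.randint(20, 40)


def summarize(seeds):
    """Return (mean, sample standard deviation) of solve times across seeds."""
    results = [episodes_to_solve(s) for s in seeds]
    return statistics.mean(results), statistics.stdev(results)


mean, spread = summarize(range(10))
print(f"mean={mean:.1f} stdev={spread:.1f} over 10 seeded runs")
```

Reporting the spread alongside the mean makes it clearer whether a single fast run, such as one that solves in 23 episodes, is typical or just lucky.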
