
@icoxfog417
Last active October 14, 2016 01:53

cartpole-q-learning

Solve the CartPole environment on OpenAI Gym with a simple Q-learning algorithm (parameter-tuned version).

  • gamma: 0.99
  • bin size: [3, 3, 8, 5]
  • low bound: [None, -0.5, None, -math.radians(50)]
  • high bound: [None, 0.5, None, math.radians(50)]
  • learning rate update rule: max(0.1, min(0.5, 1.0 - math.log10((t + 1) / 25)))
  • epsilon update rule: max(0.01, min(1.0, 1.0 - math.log10((t + 1) / 25)))
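The parameters above can be sketched in code. This is a minimal, self-contained illustration, not the gist's actual implementation: the function names (`make_bins`, `discretize`, `learning_rate`, `epsilon`) are hypothetical, and where the gist uses `None` to mean "take the bound from the environment", CartPole-v0's documented limits (±2.4 for cart position, ±12° for pole angle) are substituted as an assumption.

```python
import math
import numpy as np

# Bin counts per observation dimension:
# [cart position, cart velocity, pole angle, pole angular velocity]
BIN_SIZES = [3, 3, 8, 5]

# The gist uses None to mean "use the environment's own bound"; here we
# substitute CartPole-v0's limits (+-2.4 position, +-12 deg angle) as
# stand-in assumptions so the sketch is self-contained.
LOW = [-2.4, -0.5, -math.radians(12), -math.radians(50)]
HIGH = [2.4, 0.5, math.radians(12), math.radians(50)]


def make_bins(low, high, n):
    # n bins need n - 1 interior cut points
    return np.linspace(low, high, n + 1)[1:-1]


BINS = [make_bins(lo, hi, n) for lo, hi, n in zip(LOW, HIGH, BIN_SIZES)]


def discretize(observation):
    """Map a continuous observation to a tuple of bin indices."""
    return tuple(int(np.digitize(x, b)) for x, b in zip(observation, BINS))


def learning_rate(t):
    # Decays from 0.5 toward a floor of 0.1 as the episode index t grows
    return max(0.1, min(0.5, 1.0 - math.log10((t + 1) / 25)))


def epsilon(t):
    # Exploration rate decays from 1.0 toward a floor of 0.01
    return max(0.01, min(1.0, 1.0 - math.log10((t + 1) / 25)))
```

The discretized state tuple indexes a Q-table, and the standard Q-learning update `Q[s][a] += alpha * (r + 0.99 * max(Q[s']) - Q[s][a])` is applied each step, with `alpha` and the epsilon-greedy exploration rate taken from the schedules above.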

These parameter settings are based on sakulkar's algorithm.
You can compare with the version without parameter tuning, available here, to see the effect of these settings.

Model Overview

(Figure: q-learning.PNG — diagram of the Q-learning model)
