Solve the CartPole environment on OpenAI Gym with a simple Q-learning algorithm (parameter-tuned version)
- gamma: 0.99
- bin size: [3, 3, 8, 5]
- low bound: [None, -0.5, None, -math.radians(50)]
- high bound: [None, 0.5, None, math.radians(50)]
- learning rate update rule: max(0.1, min(0.5, 1.0 - math.log10((t + 1) / 25)))
- epsilon update rule: max(0.01, min(1.0, 1.0 - math.log10((t + 1) / 25)))
These parameter settings follow sakulkar's algorithm.
A version without parameter tuning is also available here, so you can see the effect of this tuning for yourself.
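The parameters above can be sketched in code as follows. This is a minimal illustration, not the repository's actual implementation: the helper names (`discretize`, `learning_rate`, `epsilon`) are made up for this example, and where the bound list says `None` (meaning "use the environment's own limit"), CartPole's position and angle thresholds are substituted as placeholders.

```python
import math
import numpy as np

BINS = [3, 3, 8, 5]  # bins per state dimension, as listed above
# "None" bounds are replaced here by CartPole's own thresholds
# (position +/-2.4, pole angle +/-12 degrees) as an assumption.
LOW  = [-2.4, -0.5, -math.radians(12), -math.radians(50)]
HIGH = [ 2.4,  0.5,  math.radians(12),  math.radians(50)]

def discretize(obs):
    """Map a continuous observation to a tuple of bin indices."""
    idx = []
    for x, lo, hi, n in zip(obs, LOW, HIGH, BINS):
        edges = np.linspace(lo, hi, n + 1)[1:-1]  # n - 1 interior edges
        idx.append(int(np.digitize(x, edges)))
    return tuple(idx)

def learning_rate(t):
    # Decays from 0.5 toward a floor of 0.1 as episode t grows.
    return max(0.1, min(0.5, 1.0 - math.log10((t + 1) / 25)))

def epsilon(t):
    # Decays from 1.0 toward a floor of 0.01.
    return max(0.01, min(1.0, 1.0 - math.log10((t + 1) / 25)))
```

Both schedules stay at their upper clamp for the first 25 episodes (since `log10((t + 1) / 25)` is negative there) and then decay logarithmically, so exploration and the learning rate shrink together as training progresses.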
Model Overview