Last active May 9, 2023 06:15
My code is exactly the same, but I am getting a total reward of only 143 over 10,000 episodes. The accuracy is very low.
Hey there, this code is obsolete. Check this instead: https://huggingface.co/learn/deep-rl-course/unit2/introduction
---> 30 qtable[state, action] = qtable[state, action] + learning_rate * (reward + gamma * np.max(qtable[new_state, :]) - qtable[state, action])
31
32
IndexError: arrays used as indices must be of integer (or boolean) type
Any idea why this is happening?
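One likely cause, assuming the notebook is running on Gym 0.26 or later (or Gymnasium): `env.reset()` now returns an `(observation, info)` tuple instead of the bare observation, so `state` is a tuple and NumPy refuses to use it as an index. The sketch below reproduces the error without Gym by using a stand-in tuple, then shows one way to unwrap it. The helper `as_state_index`, the table sizes, and the hyperparameter values are hypothetical and are not taken from the original gist.

```python
import numpy as np


def as_state_index(reset_result):
    """Return an integer state from either reset() convention.

    Older Gym returns the observation alone; Gym >= 0.26 and Gymnasium
    return an (observation, info) tuple. (Hypothetical helper.)
    """
    if isinstance(reset_result, tuple):
        observation, _info = reset_result
    else:
        observation = reset_result
    return int(observation)


# Hypothetical sizes and hyperparameters, chosen only for illustration.
n_states, n_actions = 16, 4
learning_rate, gamma = 0.8, 0.95
qtable = np.zeros((n_states, n_actions))

# Stand-in for what env.reset() returns under the newer API.
reset_result = (0, {"prob": 1.0})

# Indexing with the raw tuple fails, as in the traceback above.
try:
    qtable[reset_result, 1]
    raised = False
except IndexError as err:
    raised = True
    print("IndexError:", err)

# Unwrap the state first, then apply the same Q-learning update.
state = as_state_index(reset_result)
action, reward, new_state = 1, 1.0, 4
qtable[state, action] = qtable[state, action] + learning_rate * (
    reward + gamma * np.max(qtable[new_state, :]) - qtable[state, action]
)
print(qtable[state, action])
```

If this is the cause, note that under the same newer API `env.step(action)` returns five values (`observation, reward, terminated, truncated, info`) rather than four, so the training loop would need the same kind of adjustment.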