Last active
July 19, 2022 11:55
Pseudocode for SARSA(lambda)
Procedure SARSA(lambda)
  Initialize Q(s,a) arbitrarily and e(s,a)=0 for all s,a pairs
  Repeat (per episode):
    Initialize s; choose a from s via epsilon-greedy policy on Q
    e(s,a)=0 for all s,a pairs
    Repeat (per timestep in each episode) until s is terminal:
      Take action a, observe r,s`
      Choose a` from s` via epsilon-greedy policy on Q
      error = r+discount*Q(s`,a`)-Q(s,a)    (use Q(s`,a`)=0 if s` is terminal)
      e(s,:)=0    (replacing traces: clear the traces of all actions in s)
      e(s,a)=1
      Q = Q + learn_rate*error*e    (applied to all s,a pairs)
      e = e*discount*lambda
      s = s`; a = a`
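
The procedure above can be sketched in Python roughly as follows. This is a minimal tabular sketch, not a reference implementation: the corridor environment `chain_step`, the helper `epsilon_greedy`, and all default parameter values are assumptions made for illustration and do not come from the pseudocode. It also assumes that Q of a terminal state counts as 0 and that ties in the greedy choice are broken at random.

```python
import numpy as np

def epsilon_greedy(Q, s, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(Q.shape[1]))
    best = np.flatnonzero(Q[s] == Q[s].max())
    return int(rng.choice(best))  # break ties at random

def chain_step(s, a, n_states):
    """Hypothetical toy environment: a corridor with action 0 = left, 1 = right.
    Reaching the rightmost state ends the episode with reward 1."""
    s_next = min(max(s + (1 if a == 1 else -1), 0), n_states - 1)
    done = s_next == n_states - 1
    return (1.0 if done else 0.0), s_next, done

def sarsa_lambda(n_states=6, n_actions=2, episodes=200, learn_rate=0.1,
                 discount=0.9, lam=0.8, epsilon=0.1, seed=0):
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, n_actions))          # "arbitrary" init: zeros here
    for _ in range(episodes):
        e = np.zeros_like(Q)                     # reset all traces per episode
        s = 0                                    # every episode starts at the left end
        a = epsilon_greedy(Q, s, epsilon, rng)
        done = False
        while not done:
            r, s_next, done = chain_step(s, a, n_states)
            a_next = epsilon_greedy(Q, s_next, epsilon, rng)
            # Assumption: no bootstrapping from a terminal state
            target = r if done else r + discount * Q[s_next, a_next]
            error = target - Q[s, a]
            e[s, :] = 0.0                        # clear traces of all actions in s
            e[s, a] = 1.0                        # replacing trace for the taken action
            Q += learn_rate * error * e          # update every (s, a) pair at once
            e *= discount * lam                  # decay all traces
            s, a = s_next, a_next
    return Q

if __name__ == "__main__":
    print(np.round(sarsa_lambda(), 3))
```

Keeping `Q` and `e` as arrays of the same shape lets the last two update lines of the pseudocode map onto single elementwise NumPy operations.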