@adityavk
Last active March 29, 2024 22:33
Example implementation of value iteration for a toy discrete MDP
# Assumes `env` is a discrete environment whose transition model is exposed as `env.P`
planner = Planner(env.P)
# Extract the value function, the history of log V(s), and the optimal policy learnt by VI
value_func_vi, log_value_func_history_vi, value_iteration_policy = planner.value_iteration(gamma=0.99)
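
For readers who want to see what a call like this computes, here is a minimal sketch of value iteration written from scratch. It is not the `Planner` implementation. It assumes the transition model follows the Gym-style layout `P[s][a] = [(prob, next_state, reward, done), ...]`. The function name `value_iteration_sketch`, the `theta` and `max_iters` parameters, and the two-state `toy_P` model are all hypothetical and exist only for illustration. The returned history holds the raw V(s) from each sweep.

```python
import numpy as np


def value_iteration_sketch(P, gamma=0.99, theta=1e-10, max_iters=1000):
    """Value iteration over a Gym-style model P[s][a] = [(prob, next_s, reward, done), ...].

    Hypothetical helper for illustration; not the Planner API.
    """
    n_states = len(P)
    n_actions = len(P[0])
    V = np.zeros(n_states)
    history = []  # V(s) after each sweep

    for _ in range(max_iters):
        # Bellman optimality backup: Q(s, a) = sum_s' p * (r + gamma * V(s'))
        Q = np.zeros((n_states, n_actions))
        for s in range(n_states):
            for a in range(n_actions):
                for prob, next_s, reward, done in P[s][a]:
                    # Terminal transitions contribute no future value
                    Q[s, a] += prob * (reward + gamma * V[next_s] * (not done))

        new_V = Q.max(axis=1)
        history.append(new_V.copy())
        delta = np.max(np.abs(new_V - V))
        V = new_V
        if delta < theta:  # stop once a sweep changes V by less than theta
            break

    policy = Q.argmax(axis=1)  # greedy policy with respect to the final Q
    return V, np.array(history), policy


# Hypothetical two-state MDP: state 1 is terminal.
# In state 0, action 0 stays put (reward 0); action 1 moves to state 1 (reward 1).
toy_P = {
    0: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 1, 1.0, True)]},
    1: {0: [(1.0, 1, 0.0, True)], 1: [(1.0, 1, 0.0, True)]},
}

toy_V, toy_history, toy_policy = value_iteration_sketch(toy_P, gamma=0.99)
```

In this toy model the greedy policy in state 0 picks action 1, since collecting the reward of 1 immediately beats staying in place.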