def fill_in_transition(self):
    """
    Compute the transition matrix of the grid
    input: /
    output: T {np.array} -- the transition matrix of the grid
    """
    T = np.zeros((self.state_size, self.state_size, self.action_size))  # Empty matrix of dimension S*S*A
    ####
    # Add your code here
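
One possible way to fill in a matrix of this shape is sketched below. It is only an illustration under several assumptions that the snippet above does not state: the grid is a plain rectangle with no obstacles, there are four deterministic actions (north, east, south, west), a move that would leave the grid keeps the agent in place, and the index order is `T[s_next, s, a]`. The standalone function `build_transition` and its parameters `n_rows` and `n_cols` are hypothetical names, not part of the original class.

```python
import numpy as np


def build_transition(n_rows, n_cols):
    """Return T with T[s_next, s, a] = P(s_next | s, a) for a deterministic grid.

    Assumed layout: states are numbered row by row, actions are
    0 = north, 1 = east, 2 = south, 3 = west.
    """
    n_states = n_rows * n_cols
    moves = [(-1, 0), (0, 1), (1, 0), (0, -1)]  # (row delta, column delta) per action
    T = np.zeros((n_states, n_states, len(moves)))  # S * S * A, as in the stub above

    for s in range(n_states):
        r, c = divmod(s, n_cols)
        for a, (dr, dc) in enumerate(moves):
            nr, nc = r + dr, c + dc
            # Assumption: bumping into the border leaves the agent where it is.
            if not (0 <= nr < n_rows and 0 <= nc < n_cols):
                nr, nc = r, c
            T[nr * n_cols + nc, s, a] = 1.0
    return T


T = build_transition(3, 4)
```

Whatever the dynamics, a useful sanity check is that the probabilities over next states sum to 1 for every state-action pair, here `np.allclose(T.sum(axis=0), 1.0)`.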
def policy_iteration(self, threshold=0.0001, gamma=0.8):
    """
    Policy iteration on GridWorld
    input:
    - threshold {float} -- threshold value used to stop the policy iteration algorithm
    - gamma {float} -- discount factor
    output:
    - policy {np.array} -- policy found using the policy iteration algorithm
    - V {np.array} -- value function corresponding to the policy
    - epochs {int} -- number of epochs to find this policy
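
The preview above is cut off before the function body. For reference, a minimal standalone sketch of policy iteration with the same `threshold` and `gamma` defaults and the same three return values might look like the following. Everything beyond the signature is an assumption: the function takes explicit `T` and `R` arrays (both indexed `[s_next, s, a]`) instead of `self`, the policy is stored as one action index per state, and `epochs` counts evaluate-then-improve rounds. The original gist may define any of these differently.

```python
import numpy as np


def policy_iteration(T, R, threshold=0.0001, gamma=0.8):
    """Sketch of policy iteration; T and R are assumed to be indexed [s_next, s, a]."""
    n_states, _, n_actions = T.shape
    policy = np.zeros(n_states, dtype=int)  # assumption: deterministic policy, one action per state
    V = np.zeros(n_states)
    states = np.arange(n_states)
    epochs = 0

    def q_values(V):
        # Q[s, a] = sum over s_next of T[s_next, s, a] * (R[s_next, s, a] + gamma * V[s_next])
        return np.einsum("nsa,nsa->sa", T, R + gamma * V[:, None, None])

    while True:
        epochs += 1
        # Policy evaluation: sweep until the largest change in V drops below threshold.
        while True:
            V_new = q_values(V)[states, policy]
            delta = np.max(np.abs(V_new - V))
            V = V_new
            if delta < threshold:
                break
        # Policy improvement: switch only where another action is strictly better.
        Q = q_values(V)
        improved = Q.max(axis=1) > Q[states, policy] + 1e-12
        if not improved.any():
            return policy, V, epochs
        policy = np.where(improved, Q.argmax(axis=1), policy)


# Tiny hypothetical MDP: action 0 stays put, action 1 moves to the other state;
# arriving in state 1 pays a reward of 1.
T = np.zeros((2, 2, 2))
T[0, 0, 0] = T[1, 0, 1] = T[1, 1, 0] = T[0, 1, 1] = 1.0
R = np.zeros((2, 2, 2))
R[1, :, :] = 1.0

policy, V, epochs = policy_iteration(T, R)
```

In this toy problem the best behaviour is to move to state 1 and stay there, so both states should end up with a value close to 1 / (1 - 0.8) = 5. Keeping the current action on ties is a design choice that avoids flipping between equally good policies when the evaluation is only approximate.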
@chinmay8bit
chinmay8bit / fibonacii_formual.ipynb
Last active September 16, 2024 14:23
Untitled3.ipynb