@YannBerthelot
Created September 20, 2023 07:52
Compute the state values of the taxi driver MDP using a linear equation solver
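import numpy as np

# Each row holds the coefficients of one Bellman equation, rearranged so that
# all value terms are on the left-hand side: (0.9 * P - I) V = -r.
# Columns are ordered as states A, B, C; 0.9 is the discount factor.
_coefficients_equation_A = np.array(
    [0.9 * (13 / 48) - 1, 0.9 * (3 / 8), 0.9 * (17 / 48)]
)
_coefficients_equation_B = np.array(
    [0.9 * (9 / 32), 0.9 * (7 / 16) - 1, 0.9 * (9 / 32)]
)
_coefficients_equation_C = np.array(
    [0.9 * (3 / 8), 0.9 * (17 / 48), 0.9 * (13 / 48) - 1]
)
# Stack the three equations into the 3x3 system matrix.
coefficients_variables = np.array(
    [_coefficients_equation_A, _coefficients_equation_B, _coefficients_equation_C]
)
# Right-hand side: the negated constant term of each equation.
coefficients_constants = np.array([-5, -31 / 2, -31 / 6])
# Solve the linear system for the values of states A, B and C.
solutions = np.linalg.solve(coefficients_variables, coefficients_constants)
print(solutions)
# Expected solution ("solution attendue" in French), given as exact fractions.
solution_attendue = np.array([156420 / 1789, 5113540 / 51881, 13602460 / 155643])
assert np.allclose(solutions, solution_attendue)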
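The same system can be written more compactly in matrix form. The sketch below assumes that the coefficients above encode the Bellman equation V = r + γPV for a fixed policy, so that the values solve (I − γP)V = r. The names `gamma`, `P`, `r` and `values` are illustrative and do not appear in the original script; the entries of `P` and `r` are read off from the coefficients and constants above.

```python
import numpy as np

# Assumed discount factor, taken from the 0.9 multiplier in the script above.
gamma = 0.9

# Assumed transition matrix under the fixed policy (rows and columns: A, B, C),
# read off from the per-equation coefficients.
P = np.array(
    [
        [13 / 48, 3 / 8, 17 / 48],
        [9 / 32, 7 / 16, 9 / 32],
        [3 / 8, 17 / 48, 13 / 48],
    ]
)

# Assumed expected immediate reward per state, read off from the constants.
r = np.array([5, 31 / 2, 31 / 6])

# Sanity check: each row of a transition matrix must sum to 1.
assert np.allclose(P.sum(axis=1), 1.0)

# Solve (I - gamma * P) V = r for the state values V.
values = np.linalg.solve(np.eye(3) - gamma * P, r)
print(values)
```

Multiplying both sides by −1 gives back the original formulation, so `values` should match `solutions` up to floating-point error. Writing the system in terms of `P` and `r` makes it easier to check that each row of probabilities sums to 1 and to change the discount factor in one place.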