@jangirrishabh
Created June 14, 2018 11:41
Snippet for toyCarIRL, blog usage, not executable
import logging

# irlAgent, NUM_STATES, FRAMES and BEHAVIOR are not defined in this snippet;
# they are expected to come from the rest of the toyCarIRL code.

if __name__ == '__main__':
    logger = logging.getLogger()
    logger.setLevel(logging.INFO)

    # Feature expectations of the random policy
    randomPolicyFE = [7.74363107, 4.83296402, 6.1289194, 0.39292849, 2.0488831, 0.65611318, 6.90207523, 2.46475348]
    # Feature expectations for the "follow Yellow obstacles" behavior
    expertPolicyYellowFE = [7.5366e+00, 4.6350e+00, 7.4421e+00, 3.1817e-01, 8.3398e+00, 1.3710e-08, 1.3419e+00, 0.0000e+00]
    # Feature expectations for the "follow Red obstacles" behavior
    expertPolicyRedFE = [7.9100e+00, 5.3745e-01, 5.2363e+00, 2.8652e+00, 3.3120e+00, 3.6478e-06, 3.82276074e+00, 1.0219e-17]
    # Feature expectations for the "follow Brown obstacles" behavior
    expertPolicyBrownFE = [5.2210e+00, 5.6980e+00, 7.7984e+00, 4.8440e-01, 2.0885e-04, 9.2215e+00, 2.9386e-01, 4.8498e-17]
    # Feature expectations for the "nasty bumping" behavior
    expertPolicyBumpingFE = [7.5313e+00, 8.2716e+00, 8.0021e+00, 2.5849e-03, 2.4300e+01, 9.5962e+01, 1.5814e+01, 1.5538e+03]

    epsilon = 0.1
    irlearner = irlAgent(randomPolicyFE, expertPolicyRedFE, epsilon, NUM_STATES, FRAMES, BEHAVIOR)
    print(irlearner.optimalWeightFinder())
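
The snippet above does not show how epsilon is used inside irlAgent. A minimal sketch, assuming epsilon acts as a stopping threshold on the Euclidean distance between the expert's and the current policy's feature expectations (as is common in apprenticeship-learning-style IRL), might look like the following. The helper name fe_distance and the snake_case variable names are hypothetical and not part of toyCarIRL; the numbers are copied from the snippet.

```python
import math

# Values copied from the snippet above.
random_policy_fe = [7.74363107, 4.83296402, 6.1289194, 0.39292849,
                    2.0488831, 0.65611318, 6.90207523, 2.46475348]
expert_policy_red_fe = [7.9100e+00, 5.3745e-01, 5.2363e+00, 2.8652e+00,
                        3.3120e+00, 3.6478e-06, 3.82276074e+00, 1.0219e-17]
epsilon = 0.1


def fe_distance(expert_fe, policy_fe):
    """Euclidean distance between two feature-expectation vectors (hypothetical helper)."""
    if len(expert_fe) != len(policy_fe):
        raise ValueError("feature expectation vectors must have the same length")
    return math.sqrt(sum((e - p) ** 2 for e, p in zip(expert_fe, policy_fe)))


distance = fe_distance(expert_policy_red_fe, random_policy_fe)
# Assumption: learning would stop once the distance is at or below epsilon.
converged = distance <= epsilon
print(distance, converged)
```

Under this assumption the random policy starts far from the "follow Red obstacles" expert, so a learner would need to keep iterating until the gap shrinks to epsilon or below.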