# https://github.com/JuliaML/Reinforce.jl/blob/master/examples/mountain_car.jl
using Reinforce
using Reinforce.MountainCarEnv: MountainCar
using Plots
gr()
# Environment setup
env = MountainCar()
function episode!(env, π = RandomPolicy())
  ep = Episode(env, π)
  for (s, a, r, s′) in ep
    gui(plot(env))
  end
  ep.total_reward, ep.niter
end
# Deterministic policy that solves the problem:
# picks action 1 when the velocity is negative and action 3 otherwise
mutable struct BasicCarPolicy <: Reinforce.AbstractPolicy end
Reinforce.action(policy::BasicCarPolicy, r, s, A) = s.velocity < 0 ? 1 : 3
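To see why such a simple rule can be enough, the sketch below replays the same decision rule outside of Reinforce.jl. It is only an illustration under stated assumptions: the update in `step_car` is the classic mountain-car formulation and is not copied from the library source, and the names `step_car`, `basic_action`, and `simulate`, the start state, and the upper-case constants are hypothetical helpers introduced here. The constant values mirror the MountainCar excerpt in this document.

```julia
# Constants mirroring the MountainCar excerpt in this document.
const MIN_POSITION = -1.2
const MAX_POSITION = 0.6
const MAX_SPEED = 0.07
const GOAL_POSITION = 0.5

# ASSUMPTION: classic mountain-car update, not taken from the Reinforce.jl source.
# Action 1 pushes left, 2 does nothing, 3 pushes right.
function step_car(position, velocity, a)
  velocity += (a - 2) * 0.001 - 0.0025 * cos(3 * position)
  velocity = clamp(velocity, -MAX_SPEED, MAX_SPEED)
  position = clamp(position + velocity, MIN_POSITION, MAX_POSITION)
  # Hitting the left wall stops the car.
  if position <= MIN_POSITION && velocity < 0
    velocity = 0.0
  end
  return position, velocity
end

# Same decision rule as BasicCarPolicy: keep pushing in the direction of motion.
basic_action(velocity) = velocity < 0 ? 1 : 3

# Run one episode and return the number of steps taken to reach the goal,
# or `nothing` if the goal is not reached within `maxn` steps.
function simulate(position = -0.5, velocity = 0.0; maxn = 1000)
  for n in 1:maxn
    position, velocity = step_car(position, velocity, basic_action(velocity))
    position >= GOAL_POSITION && return n
  end
  return nothing
end

steps = simulate()
println(steps === nothing ? "goal not reached" : "goal reached after $steps steps")
```

Pushing in the direction of motion adds energy on every swing, so under these assumed dynamics the car eventually carries enough momentum to clear the right-hand hill.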
# Main part
R, n = episode!(env, BasicCarPolicy())
println("reward: $R, iter: $n")
# src/Reinforce.jl
abstract type AbstractPolicy end
"""
    a = action(policy, r, s, A)

Take in the last reward `r`, current state `s`,
and set of valid actions `A = actions(env, s)`,
mutable struct Episode{E<:AbstractEnvironment,P<:AbstractPolicy,F<:AbstractFloat}
  env::E
  policy::P
  total_reward::F  # total reward of the episode
  last_reward::F
  niter::Int       # current step in this episode
  freq::Int        # number of steps between choosing actions
  maxn::Int        # max steps in an episode - should be constant during an episode
end
const min_position = -1.2
const max_position = 0.6
const max_speed = 0.07
const goal_position = 0.5
const min_start = -0.6
const max_start = 0.4
const car_width = 0.05
const car_height = car_width/2.0
const clearance = 0.2*car_height
actions(env::MountainCar, s) = DiscreteSet(1:3)
finished(env::MountainCar, s′) = env.state.position >= goal_position