@zsunberg
Last active June 14, 2020 08:56
A common RL environment interface

The only code in the entire package initially is

abstract type CommonEnv end

function reset! end
function step! end
function actions end

(of course there will be extensive documentation, etc.)

What does it mean for an RL Framework to "support" CommonEnv?

Suppose you have an environment type in your package called YourEnv. Support for CommonEnv means:

  1. You provide a constructor method

    YourEnv(env::CommonEnv) # might require extra args and keyword args in some cases
  2. You implement your framework's own interface functions for that YourEnv using only functions from CommonRL

  3. You provide a CommonEnv constructor method

    CommonEnv(env::YourEnv) # might require extra args and keyword args

    which returns a YourCommonEnv <: CommonEnv

  4. You implement at minimum

    • CommonRL.reset!(::YourCommonEnv)
    • CommonRL.step!(::YourCommonEnv, a)
    • CommonRL.actions(::YourCommonEnv)

    and as many of the other functions (see below) as you'd like to support.
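
The four requirements above might be sketched as follows. This is only an illustration under stated assumptions: the CommonRL module is a local stand-in for the proposed package, YourEnv is assumed to be an abstract type, and your_reset!, your_act!, your_actions, Counter, and FromCommon are hypothetical names invented for the example.

```julia
# Local stand-in for the proposed package so the sketch runs on its own.
module CommonRL
    abstract type CommonEnv end
    function reset! end
    function step! end
    function actions end
end
using .CommonRL: CommonEnv

# Hypothetical framework with its own interface (all names are made up).
abstract type YourEnv end
function your_reset! end
function your_act! end       # returns (observation, reward, done)
function your_actions end

# A native environment: add `a` to a counter until `limit` is reached.
mutable struct Counter <: YourEnv
    count::Int
    limit::Int
end
your_reset!(e::Counter) = (e.count = 0)
function your_act!(e::Counter, a)
    e.count += a
    return e.count, -1.0, e.count >= e.limit
end
your_actions(::Counter) = (1, 2)

# Items 1 and 2: YourEnv(::CommonEnv), implemented only with CommonRL functions.
struct FromCommon{E<:CommonEnv} <: YourEnv
    env::E
end
YourEnv(env::CommonEnv) = FromCommon(env)
your_reset!(e::FromCommon) = CommonRL.reset!(e.env)
function your_act!(e::FromCommon, a)
    o, r, done, _ = CommonRL.step!(e.env, a)   # drop the info field
    return o, r, done
end
your_actions(e::FromCommon) = CommonRL.actions(e.env)

# Items 3 and 4: CommonEnv(::YourEnv) returning a YourCommonEnv <: CommonEnv.
struct YourCommonEnv{E<:YourEnv} <: CommonEnv
    env::E
end
CommonRL.CommonEnv(env::YourEnv) = YourCommonEnv(env)
CommonRL.reset!(w::YourCommonEnv) = your_reset!(w.env)
function CommonRL.step!(w::YourCommonEnv, a)
    o, r, done = your_act!(w.env, a)
    return o, r, done, NamedTuple()            # (observation, reward, done, info)
end
CommonRL.actions(w::YourCommonEnv) = your_actions(w.env)

# Usage: native -> common, then common -> native interface again.
wrapped = CommonEnv(Counter(0, 3))
CommonRL.reset!(wrapped)
o, r, done, info = CommonRL.step!(wrapped, 2)
roundtrip = YourEnv(wrapped)
o2, r2, done2 = your_act!(roundtrip, 1)
```

Wrapping in both directions keeps each package's native types untouched; whether a real framework would prefer wrapper types or direct method definitions is a design choice this sketch does not settle.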

Can people implement problems directly with the CommonEnv interface?

Yes! And this might often be the easiest and most well-documented way for them to implement a simple new environment that works with your algorithms. For example, a 1-D LQR problem with discrete actions might look like this:

mutable struct LQREnv <: CommonEnv
    s::Float64
end

function CommonRL.reset!(m::LQREnv)
    m.s = 0.0
end

function CommonRL.step!(m::LQREnv, a)
    r = -m.s^2 - a^2  # quadratic cost on the current state and the action
    sp = m.s = m.s + a + randn()
    return sp, r, false, NamedTuple()
end

CommonRL.actions(m::LQREnv) = (-1.0, 0.0, 1.0)

What does it mean for an algorithm to "support" CommonEnv?

You should have a method of your solver or algorithm that accepts a CommonEnv, perhaps handling it by converting it to your framework first, e.g.

solve(env::CommonEnv) = solve(YourEnv(env))
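
As an alternative to converting first, an algorithm could also be written directly against the three required functions. The following is a minimal sketch, not a proposed API: the CommonRL module is a local stand-in for the package, LineEnv is a hypothetical deterministic test environment, and this solve is a toy that simply picks the constant action with the highest return over a short horizon.

```julia
# Local stand-in for the proposed package so the sketch runs on its own.
module CommonRL
    abstract type CommonEnv end
    function reset! end
    function step! end
    function actions end
end
using .CommonRL: CommonEnv

# Hypothetical deterministic environment: walk along the integers toward 3.
mutable struct LineEnv <: CommonEnv
    s::Int
end
CommonRL.reset!(m::LineEnv) = (m.s = 0)
function CommonRL.step!(m::LineEnv, a)
    m.s += a
    return m.s, -abs(m.s - 3), m.s == 3, NamedTuple()
end
CommonRL.actions(::LineEnv) = (-1, 1)

# Toy "solver": try each action as a constant policy and keep the best one.
# It touches the environment only through reset!, step!, and actions.
function solve(env::CommonEnv; horizon=5)
    best_a, best_ret = nothing, -Inf
    for a in CommonRL.actions(env)
        CommonRL.reset!(env)
        ret = 0.0
        for _ in 1:horizon
            _, r, done, _ = CommonRL.step!(env, a)
            ret += r
            done && break
        end
        if ret > best_ret
            best_a, best_ret = a, ret
        end
    end
    return best_a
end

best = solve(LineEnv(0))
```

For a stochastic environment such as the LQR example, a single rollout per action would be noisy, so a real algorithm would average over several episodes.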

What about other functions?

Other functions, for example clone, render, or observationspace, might be made available. An algorithm or another framework can check if a function is available with, e.g.

provides(env, clone)

By default, provides(::CommonEnv, ::Function) = false.

To provide an optional function with your environment, you will write, for example,

@provide CommonRL.clone(m::LQREnv) = LQREnv(m.s)

or

@provide CommonRL.clone(m::CommonEnv{YourEnv}) = CommonEnv(YourPackage.makecopy(m.env))

(The macro automatically implements provides(::Type{<:YourCommonEnv}, ::typeof(clone)) = true.)
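
One way provides and @provide might fit together is sketched below. Everything here is an assumption about a package that does not exist yet: the CommonRL module is a local stand-in, the instance method of provides forwards to a type-level method (which is one way to reconcile the instance-level default with the type-level method the macro generates), and the macro only handles the short-form definition with the environment as the first argument.

```julia
# Local stand-in for the proposed package so the sketch runs on its own.
module CommonRL
    export CommonEnv, provides, @provide

    abstract type CommonEnv end
    function reset! end
    function step! end
    function actions end
    function clone end          # one of the optional functions

    # Nothing optional is provided unless an environment says so.
    provides(::Type{<:CommonEnv}, ::Function) = false
    # Assumption: querying an instance forwards to its type.
    provides(env::CommonEnv, f::Function) = provides(typeof(env), f)

    # Sketch of the macro: emit the definition unchanged, plus a
    # provides method for (first argument type, function).
    # The caller must have the name CommonRL in scope.
    macro provide(def)
        sig = def.args[1]           # e.g. CommonRL.clone(m::LQREnv)
        f = sig.args[1]             # e.g. CommonRL.clone
        T = sig.args[2].args[end]   # e.g. LQREnv
        return esc(quote
            $def
            CommonRL.provides(::Type{<:$T}, ::typeof($f)) = true
        end)
    end
end
using .CommonRL

mutable struct LQREnv <: CommonEnv
    s::Float64
end

@provide CommonRL.clone(m::LQREnv) = LQREnv(m.s)

env = LQREnv(1.5)
copy_env = CommonRL.clone(env)
has_clone = provides(env, CommonRL.clone)
has_reset = provides(env, CommonRL.reset!)
```

An algorithm can then branch on provides(env, CommonRL.clone) before calling clone, for example to fall back to a method that does not need copies of the environment.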
