Zachary Sunberg (zsunberg)
from julia.CommonRLSpaces import Box
from julia.Main import Float64
from julia.POMDPs import solve, pdf, action
from julia.QMDP import QMDPSolver
from julia.POMCPOW import POMCPOWSolver
from julia.POMDPTools import stepthrough, alphavectors, Uniform, Deterministic
from julia.Distributions import Normal, AbstractMvNormal, MvNormal
from quickpomdps import QuickPOMDP
@zsunberg
zsunberg / lunar_lander.jl
Created July 7, 2023 13:49
Lunar Lander model for POMDPs.jl (happy for someone to make a package for this!)
struct LunarLander <: POMDP{Vector{Float64}, Vector{Float64}, Vector{Float64}}
    dt::Float64
    m::Float64
    I::Float64
    Q::Vector{Float64}
    R::Vector{Float64}
end

function LunarLander(;dt::Float64=0.1, m::Float64=1.0, I::Float64=10.0)
    Q = [0.0, 0.0, 0.0, 0.1, 0.1, 0.01]
@zsunberg
zsunberg / multithread_comparison.jl
Created December 24, 2021 01:50
Is it better to launch 10000 tasks or nthreads() tasks in Julia?
using BenchmarkTools
function operate!(shared, locks)
    i = rand(1:length(shared))
    lock(locks[i]) do
        shared[i] += 1
    end
end

function operate_many!(shared, locks, channel)
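The question this gist benchmarks can be sketched in Python with threads: one thread per operation versus a fixed pool of workers draining a shared work count. The function names, the pool size, and the work-counter scheme below are my own illustrative choices, not the gist's code, which uses Julia tasks and a Channel.

```python
import random
import threading

def operate(shared, locks):
    """Pick a random slot and increment it under that slot's lock."""
    i = random.randrange(len(shared))
    with locks[i]:
        shared[i] += 1

def run_many_tasks(n_ops, n_slots):
    """Launch one thread per operation (analogue of spawning 10000 tasks)."""
    shared = [0] * n_slots
    locks = [threading.Lock() for _ in range(n_slots)]
    threads = [threading.Thread(target=operate, args=(shared, locks))
               for _ in range(n_ops)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared

def run_worker_pool(n_ops, n_slots, n_workers=4):
    """Launch a fixed pool of workers that drain a shared work counter
    (analogue of nthreads() tasks pulling work from a Channel)."""
    shared = [0] * n_slots
    locks = [threading.Lock() for _ in range(n_slots)]
    remaining = n_ops
    count_lock = threading.Lock()

    def worker():
        nonlocal remaining
        while True:
            with count_lock:
                if remaining == 0:
                    return
                remaining -= 1
            operate(shared, locks)

    workers = [threading.Thread(target=worker) for _ in range(n_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return shared
```

Both variants perform exactly n_ops increments in total; the difference the benchmark probes is scheduling overhead, not correctness.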
@zsunberg
zsunberg / jrl_error.jl
Created October 18, 2020 04:51
JuliaReinforcementLearning script that produces a bounds error
using ReinforcementLearningZoo
using ReinforcementLearningBase
using ReinforcementLearningCore: NeuralNetworkApproximator, EpsilonGreedyExplorer, QBasedPolicy, CircularCompactSARTSATrajectory
using ReinforcementLearning
using Flux
using Flux: glorot_uniform, huber_loss
import Random
import BSON
RL = ReinforcementLearningBase
@zsunberg
zsunberg / model.mof.json
Created August 27, 2020 21:00
A MOI Model that GLPK fails on
{
  "name": "MathOptFormat Model",
  "version": {
    "major": 0,
    "minor": 4
  },
  "variables": [
    {
      "name": "x[1,1]"
    },
@zsunberg
zsunberg / CommonRL.md
Last active June 14, 2020 08:56
A common RL environment interface

The only code in the entire package initially is

abstract type CommonEnv end

function reset! end
function step! end
function actions end

(of course there will be extensive documentation, etc.)
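The three-function interface above could be mirrored in Python roughly as follows. The class names, the CountdownEnv toy environment, and the (observation, reward, done) return shape of step are my own illustrative assumptions, not part of the CommonRL proposal:

```python
from abc import ABC, abstractmethod

class CommonEnv(ABC):
    """Python analogue of the minimal reset!/step!/actions interface."""

    @abstractmethod
    def reset(self):
        """Reset the environment and return the initial observation."""

    @abstractmethod
    def step(self, action):
        """Apply an action; return (observation, reward, done)."""

    @abstractmethod
    def actions(self):
        """Return the collection of available actions."""

class CountdownEnv(CommonEnv):
    """Toy environment: start at n, each action subtracts 1 or 2,
    and the episode ends when the state reaches zero or below."""

    def __init__(self, n=5):
        self.n = n
        self.state = n

    def reset(self):
        self.state = self.n
        return self.state

    def step(self, action):
        self.state -= action
        done = self.state <= 0
        return self.state, -1.0, done

    def actions(self):
        return [1, 2]
```

The point of the sketch is the same as the proposal's: the interface itself is tiny, and everything else lives in documentation and optional extensions.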

Here are the two ways of augmenting the state space that I was referring to (these are illustrative rather than efficient or complete implementations):

  1. Add a single new terminal state

struct VariableDiscountWrapper1{S, A, F<:Function} <: MDP{Union{S, TerminalState}, A}
    m::MDP{S, A}
    discount::F
end
@zsunberg
zsunberg / ekf_usage.jl
Last active June 19, 2019 18:01
Sketch of Extended Kalman Filter package usage
using ExtendedKalmanFilters
using Distributions
using DelimitedFiles
# We may also want to look at DynamicalSystems.jl
# The package should accept AbstractArrays wherever possible so people can use StaticArrays
# Model semantics
# x_{t+1} = f(x_t, u_t) + w_t
# y_t = h(x_t) + v_t # should the control be an argument of h?
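Under the model semantics in the comments above, one scalar predict/update cycle of an extended Kalman filter might look like the following Python sketch. The function ekf_step and all of its argument names are hypothetical; the gist only sketches the intended package API, and this is a 1-D illustration of the math, not the package:

```python
def ekf_step(mu, P, u, y, f, h, df_dx, dh_dx, Q, R):
    """One scalar EKF predict/update cycle for the model
    x_{t+1} = f(x_t, u_t) + w_t,  y_t = h(x_t) + v_t,
    with w ~ N(0, Q) and v ~ N(0, R)."""
    # Predict: propagate the mean through f and the variance
    # through the Jacobian of f at the current estimate.
    mu_pred = f(mu, u)
    F = df_dx(mu, u)
    P_pred = F * P * F + Q
    # Update: linearize h at the predicted mean.
    H = dh_dx(mu_pred)
    S = H * P_pred * H + R      # innovation variance
    K = P_pred * H / S          # Kalman gain
    mu_new = mu_pred + K * (y - h(mu_pred))
    P_new = (1.0 - K * H) * P_pred
    return mu_new, P_new
```

In the linear case (f(x, u) = x + u, h(x) = x, Jacobians equal to 1) this reduces exactly to the ordinary Kalman filter, which makes it easy to sanity-check.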
@zsunberg
zsunberg / transmat.jl
Created October 29, 2018 18:14
Procedure to generate a transition matrix from an MDP
using POMDPs
using POMDPModelTools
function transition_matrix_a_s_sp(mdp::MDP)
    na = n_actions(mdp)
    ns = n_states(mdp)
    mat = zeros(na, ns, ns) # this should be sparse
    for a in actions(mdp)
        ai = actionindex(mdp, a)
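The same procedure can be sketched in plain Python for a generic tabular model. Here transition(s, a) is a hypothetical callback returning a dict of successor-state probabilities; this mirrors the idea of the POMDPs.jl version above but is not its API:

```python
def transition_matrix_a_s_sp(n_states, n_actions, transition):
    """Build a dense T[a][s][sp] array of transition probabilities,
    where transition(s, a) returns a dict {sp: probability}."""
    T = [[[0.0] * n_states for _ in range(n_states)]
         for _ in range(n_actions)]
    for a in range(n_actions):
        for s in range(n_states):
            for sp, p in transition(s, a).items():
                T[a][s][sp] = p
    return T

# Usage: a two-state chain where every action moves to either state
# with equal probability.
coin_flip = lambda s, a: {s: 0.5, 1 - s: 0.5}
T = transition_matrix_a_s_sp(2, 1, coin_flip)
```

Each row T[a][s] should sum to one for a well-formed model, which is a cheap check to run after building the matrix.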
@zsunberg
zsunberg / gw_bench.jl
Created August 31, 2018 23:58
Grid world benchmark showing that the current Julia compiler cannot handle multiple state types. Output for Julia 1.0 at bottom.
using POMDPs
using POMDPModelTools
using POMDPSimulators
using POMDPPolicies
using StaticArrays
using Parameters
using Random
using BenchmarkTools
using POMDPModels
using Test