@wojzaremba
Created April 24, 2016 17:48
git: https://github.com/wojzaremba/trpo commit: 6bb9fe32d5bb3413cd76e60518d49c58b2716ad1
This is an implementation of TRPO with a poor man's memory: it concatenates the previous observation and state to the current observation.
This lets it solve tasks that require very short memory (e.g. Reverse-v0). Run the script on 3 tasks (Copy-v0,
DuplicatedInput-v0 and Reverse-v0) by calling:
python run.py
It starts 3 screen instances. Training takes ~1 min.
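
The "poor man's memory" idea described above can be sketched roughly as follows. This is a minimal illustration and not the code from the repository: the class name `PoorMansMemory`, its methods, and the zero-filled history at episode start are all assumptions, and the sketch covers only the previous observation (the source does not show what the extra "state" contains).

```python
class PoorMansMemory:
    """Hypothetical helper: appends the previous observation to the current one."""

    def __init__(self, obs_dim):
        self.obs_dim = obs_dim
        # Assumption: with no history yet, the "previous" observation is all zeros.
        self.prev_obs = [0.0] * obs_dim

    def reset(self, first_obs):
        """Clear the history at the start of an episode and augment the first observation."""
        self.prev_obs = [0.0] * self.obs_dim
        return self.augment(first_obs)

    def augment(self, obs):
        """Return current observation + previous observation, then remember the current one."""
        obs = list(obs)
        if len(obs) != self.obs_dim:
            raise ValueError("expected %d values, got %d" % (self.obs_dim, len(obs)))
        augmented = obs + self.prev_obs
        self.prev_obs = obs
        return augmented


# Usage: the policy would see a vector twice as long as the raw observation.
memory = PoorMansMemory(obs_dim=2)
first = memory.reset([1.0, 0.0])     # history is empty, so the second half is zeros
second = memory.augment([0.0, 1.0])  # second half now carries the first observation
```

A feed-forward policy fed these augmented vectors can react to what it saw one step earlier, which is why this kind of trick only helps on tasks that need very short memory.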