@tilarids tilarids/README
Created Aug 30, 2016

TRPO (Trust Region Policy Optimization) with an additional neural network that predicts the state value (used for advantage calculation).
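
To illustrate how a predicted value can feed into advantage calculation, here is a minimal sketch. It assumes a single finished episode, Monte Carlo discounted returns, and an advantage defined as return minus predicted value; the estimator actually used for the result may differ. The function names and the default `gamma` are hypothetical, and `value_predictions` stands in for the output of the value network.

```python
from typing import List


def discounted_returns(rewards: List[float], gamma: float) -> List[float]:
    """Discounted sum of future rewards for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def advantages(rewards: List[float],
               value_predictions: List[float],
               gamma: float = 0.99) -> List[float]:
    """Advantage estimate: discounted return minus the predicted value.

    Assumption: value_predictions[t] is the value network's output for
    the state visited at step t (hypothetical input, one per reward).
    """
    if len(rewards) != len(value_predictions):
        raise ValueError("rewards and value_predictions must have equal length")
    returns = discounted_returns(rewards, gamma)
    return [g - v for g, v in zip(returns, value_predictions)]


if __name__ == "__main__":
    # Toy episode with three steps and a constant value prediction.
    print(advantages([1.0, 1.0, 1.0], [1.0, 1.0, 1.0], gamma=0.5))
```

Subtracting the predicted value acts as a baseline: it leaves the expected policy gradient unchanged while typically reducing its variance.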
More details and steps to reproduce:
Commit used to produce the result: