Install the exact dependency versions listed in requirements.txt:
pip install -r requirements.txt
Download the cached agent (alternatively, skip this step and re-train the agent by uncommenting the learn and save steps above):
wget https://minio.thelio.carlboettiger.info/shared-data/sac_tuned.zip
Run the example:
python fishing_SAC_tuned.py
Example output:
mean reward: 7.756014 std: 0.0 tuned_value: 7.755079
mean reward from sims: 1.327337048649788
Note that the mean reward reported by evaluate_policy() (7.756) is a nearly perfect match to the (claimed) tuned value (7.755), while the mean reward from env.simulate() (1.327) is far lower. Also see the example simulation figures in the results directory, which show the fish stock crashing.
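One way to investigate a gap like this is to compute per-episode reward totals with a hand-written rollout loop and compare the result against both reported numbers. The sketch below is not the code in fishing_SAC_tuned.py and does not use the real environment or the trained agent: `ToyFishingEnv`, `constant_policy`, and `rollout_rewards` are hypothetical stand-ins, and the logistic growth parameters are arbitrary assumptions chosen only so the loop runs.

```python
from statistics import mean, pstdev


class ToyFishingEnv:
    """Hypothetical stand-in for a fishing environment (NOT the real one)."""

    def __init__(self, n_steps=5, r=0.3, K=1.0, init_stock=0.75):
        self.n_steps = n_steps        # episode length (assumed)
        self.r = r                    # growth rate (assumed)
        self.K = K                    # carrying capacity (assumed)
        self.init_stock = init_stock  # starting stock (assumed)
        self.stock = init_stock
        self.t = 0

    def reset(self):
        self.stock = self.init_stock
        self.t = 0
        return self.stock

    def step(self, action):
        # Action is the fraction of the current stock to harvest.
        fraction = min(max(action, 0.0), 1.0)
        harvest = fraction * self.stock
        self.stock -= harvest
        # Deterministic logistic growth after harvest.
        self.stock += self.r * self.stock * (1.0 - self.stock / self.K)
        self.t += 1
        done = self.t >= self.n_steps
        return self.stock, harvest, done  # reward is the harvest


def constant_policy(obs):
    """Hypothetical policy: always harvest 10% of the stock."""
    return 0.1


def rollout_rewards(env, policy, n_episodes):
    """Return the total reward of each episode, one entry per episode."""
    totals = []
    for _ in range(n_episodes):
        obs = env.reset()
        done = False
        total = 0.0
        while not done:
            obs, reward, done = env.step(policy(obs))
            total += reward
        totals.append(total)
    return totals


rewards = rollout_rewards(ToyFishingEnv(), constant_policy, n_episodes=10)
print(f"mean reward: {mean(rewards):.6f} std: {pstdev(rewards):.6f}")
```

Because this toy environment and policy are both deterministic, every episode returns the same total and the standard deviation is zero. Swapping in the real environment and the loaded agent's predict call would show whether a manual loop agrees with evaluate_policy() or with env.simulate().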