Last active
October 23, 2016 03:25
How to reproduce Montezuma's Revenge score by Itsukara
# See the following for an explanation of this algorithm:
# * https://github.com/Itsukara/async_deep_reinforce/blob/master/README.md (English)
# * http://www.slideshare.net/ItsukaraIitsuka/drl-challenge-on-montezumas-revenge (English)
# * http://www.slideshare.net/ItsukaraIitsuka/deepmind20166-unifying-countbased-exploration-and-intrinsic-motivation-pseudocount-montezumas-revenge (Japanese)
# Preparation of environment
sudo apt-get install git
git clone https://github.com/Itsukara/async_deep_reinforce.git
mkdir Download
cp async_deep_reinforce/gcp-install/gcp-install-a3c-env.tgz Download/
cd Download/
tar zxvf gcp-install-a3c-env.tgz
bash -x install-Anaconda.sh
. ~/.bashrc
bash -x install-tensorflow.sh
bash -x install-opencv3.sh
bash -x install-ALE.sh
bash -x install-additionals.sh
cd ../async_deep_reinforce
# Training
./run-option-gym montezuma-j-tes30-b0020-ff-fs2 &> log.montezuma-j-tes30-b0020-ff-fs2
# You can stop training with Ctrl-C.
# In that case, the latest checkpoint is not the one with the best average score.
# You have to find the checkpoint with the best average score in log.montezuma-j-tes30-b0020-ff-fs2.
# You can resume training with the command above ( ./run-option-gym ... ).
# The following command displays the training curve during or after training
python plot.py log.montezuma-j-tes30-b0020-ff-fs2
# The training curve is updated periodically (every 10 seconds).
# plot.py has many options. The following one is useful for analyzing training status:
#   -i {r,lives,s,tes,v,pr}   selects which information to display
# The following command creates movies of high-score plays during or after training
./run-avconv-and-rm screen.new-record
# The following command creates movies of visits to new rooms during or after training
./run-avconv-and-rm screen.new-room
# The following command outputs the visited rooms
python rooms.py log.montezuma-j-tes30-b0020-ff-fs2
# The following command periodically (every 5 minutes) writes various reports to /tmp
report log.montezuma-j-tes30-b0020-ff-fs2
# It is useful when you run training on many machines.
# Just run "nohup report ..." on every machine; you can then watch all the training curves
# by periodically copying "/tmp/*.png" to your local machine with scp. (I use a shell script for this.)
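# A minimal sketch of such a collection script, to be run on your local machine.
# This is an assumption about how it could look, not the author's actual script.
# HOSTS, DEST and INTERVAL are hypothetical placeholders; replace them with your own values.
HOSTS="${HOSTS:-train1 train2}"   # hypothetical host names of the training machines
DEST="${DEST:-./reports}"         # local directory that receives the PNG files
INTERVAL="${INTERVAL:-300}"       # seconds between copies (report writes every 5 minutes)

collect_reports() {
  for host in $HOSTS; do
    # One subdirectory per machine, so files with the same name do not overwrite each other
    mkdir -p "$DEST/$host"
    # The quoted glob is expanded on the remote side; a failing host does not stop the loop
    scp -q "$host:/tmp/*.png" "$DEST/$host/" || echo "scp from $host failed" >&2
  done
}

# Usage: copy once with "collect_reports", or keep copying in a loop:
# while true; do collect_reports; sleep "$INTERVAL"; done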