Created January 28, 2018 22:51
InvalidArgumentError
$ python run_job.py -n 5 -g 60 -c 12 --use_sync --name neptune_job_name
args.offline: False | |
('bash command: ', 'srun -A luna -N 5 -n 5 -c 12 -t 6:00:00 distributed_tensorpack_mkl.sh 35083 45394 Breakout-v0 adam 1 "3nodes 12cores" "neptune_job_name_1517179219.79" 0.00015 128 60 0 None 256 100 1 uniform normal False . True 1 False /net/archive/groups/plggluna/intel_2/logs/ 1e-08 0.9 0.999 0 False False False False 120 False') | |
SLURM_JOB_ID 9486553 ; SLURM_JOB_NAME distributed_tensorpack_mkl.sh ; SLURM_JOB_NODELIST p[1340-1341,1343,1347-1348] ; SLURMD_NODENAME p1341 ; SLURM_JOB_NUM_NODES 5 | |
SLURM_JOB_ID 9486553 ; SLURM_JOB_NAME distributed_tensorpack_mkl.sh ; SLURM_JOB_NODELIST p[1340-1341,1343,1347-1348] ; SLURMD_NODENAME p1343 ; SLURM_JOB_NUM_NODES 5 | |
SLURM_JOB_ID 9486553 ; SLURM_JOB_NAME distributed_tensorpack_mkl.sh ; SLURM_JOB_NODELIST p[1340-1341,1343,1347-1348] ; SLURMD_NODENAME p1347 ; SLURM_JOB_NUM_NODES 5 | |
SLURM_JOB_ID 9486553 ; SLURM_JOB_NAME distributed_tensorpack_mkl.sh ; SLURM_JOB_NODELIST p[1340-1341,1343,1347-1348] ; SLURMD_NODENAME p1340 ; SLURM_JOB_NUM_NODES 5 | |
SLURM_JOB_ID 9486553 ; SLURM_JOB_NAME distributed_tensorpack_mkl.sh ; SLURM_JOB_NODELIST p[1340-1341,1343,1347-1348] ; SLURMD_NODENAME p1348 ; SLURM_JOB_NUM_NODES 5 | |
plgrid/tools/python/2.7.13 unloaded. | |
plgrid/tools/python/2.7.13 unloaded. | |
plgrid/tools/python/2.7.13 unloaded. | |
plgrid/tools/python/2.7.13 loaded. | |
plgrid/tools/python/2.7.13 loaded. | |
plgrid/tools/python/2.7.13 unloaded. | |
plgrid/tools/python/2.7.13 loaded. | |
plgrid/tools/python/2.7.13 loaded. | |
plgrid/tools/python/2.7.13 unloaded. | |
plgrid/tools/python/2.7.13 loaded. | |
plgrid/libs/mkl/11.3.1 unloaded. | |
plgrid/libs/mkl/11.3.1 unloaded. | |
plgrid/libs/mkl/11.3.1 unloaded. | |
plgrid/libs/mkl/2017.0.0 loaded. | |
The following have been reloaded with a version change: | |
1) plgrid/libs/mkl/11.3.1 => plgrid/libs/mkl/2017.0.0 | |
plgrid/libs/mkl/2017.0.0 loaded. | |
plgrid/libs/mkl/2017.0.0 loaded. | |
The following have been reloaded with a version change: | |
1) plgrid/libs/mkl/11.3.1 => plgrid/libs/mkl/2017.0.0 | |
The following have been reloaded with a version change: | |
1) plgrid/libs/mkl/11.3.1 => plgrid/libs/mkl/2017.0.0 | |
plgrid/libs/mkl/11.3.1 unloaded. | |
plgrid/libs/mkl/11.3.1 unloaded. | |
plgrid/libs/mkl/2017.0.0 loaded. | |
The following have been reloaded with a version change: | |
1) plgrid/libs/mkl/11.3.1 => plgrid/libs/mkl/2017.0.0 | |
plgrid/libs/mkl/2017.0.0 loaded. | |
The following have been reloaded with a version change: | |
1) plgrid/libs/mkl/11.3.1 => plgrid/libs/mkl/2017.0.0 | |
tools/gcc/6.2.0 loaded. | |
tools/gcc/6.2.0 loaded. | |
tools/gcc/6.2.0 loaded. | |
tools/gcc/6.2.0 loaded. | |
tools/gcc/6.2.0 loaded. | |
PROGRAM_ARGS: --mkl 0 --dummy 0 --sync 0 --cpu 1 --artificial_slowdown 0 --queue_size 1 --my_sim_master_queue 1 --train_log_path /net/archive/groups/plggluna/adam/experiments/neptune_job_name_1517179219.79/storage//atari_trainlog/ --predict_batch_size 16 --dummy_predictor 0 --do_train 1 --simulator_procs 100 --env Breakout-v0 --nr_towers 1 --nr_predict_towers 3 --steps_per_epoch 1000 --fc_neurons 256 --batch_size 128 --learning_rate 0.00015 --port 35083 --tf_port 45394 --optimizer adam --use_sync_opt 1 --num_grad 60 --early_stopping None --ps 1 --fc_init uniform --conv_init normal --replace_with_conv True --fc_splits 1 --debug_charts False --epsilon 1e-08 --beta1 0.9 --beta2 0.999 --save_every 0 --models_dir /net/archive/groups/plggluna/adam/experiments/neptune_job_name_1517179219.79/models/ --experiment_dir /net/archive/groups/plggluna/adam/experiments/neptune_job_name_1517179219.79 --adam_debug False --eval_node False --record_node False --schedule_hyper False | |
OFFLINE: False | |
PROGRAM_ARGS: --mkl 0 --dummy 0 --sync 0 --cpu 1 --artificial_slowdown 0 --queue_size 1 --my_sim_master_queue 1 --train_log_path /net/archive/groups/plggluna/adam/experiments/neptune_job_name_1517179219.79/storage//atari_trainlog/ --predict_batch_size 16 --dummy_predictor 0 --do_train 1 --simulator_procs 100 --env Breakout-v0 --nr_towers 1 --nr_predict_towers 3 --steps_per_epoch 1000 --fc_neurons 256 --batch_size 128 --learning_rate 0.00015 --port 35083 --tf_port 45394 --optimizer adam --use_sync_opt 1 --num_grad 60 --early_stopping None --ps 1 --fc_init uniform --conv_init normal --replace_with_conv True --fc_splits 1 --debug_charts False --epsilon 1e-08 --beta1 0.9 --beta2 0.999 --save_every 0 --models_dir /net/archive/groups/plggluna/adam/experiments/neptune_job_name_1517179219.79/models/ --experiment_dir /net/archive/groups/plggluna/adam/experiments/neptune_job_name_1517179219.79 --adam_debug False --eval_node False --record_node False --schedule_hyper False | |
OFFLINE: False | |
PROGRAM_ARGS: --mkl 0 --dummy 0 --sync 0 --cpu 1 --artificial_slowdown 0 --queue_size 1 --my_sim_master_queue 1 --train_log_path /net/archive/groups/plggluna/adam/experiments/neptune_job_name_1517179219.79/storage//atari_trainlog/ --predict_batch_size 16 --dummy_predictor 0 --do_train 1 --simulator_procs 100 --env Breakout-v0 --nr_towers 1 --nr_predict_towers 3 --steps_per_epoch 1000 --fc_neurons 256 --batch_size 128 --learning_rate 0.00015 --port 35083 --tf_port 45394 --optimizer adam --use_sync_opt 1 --num_grad 60 --early_stopping None --ps 1 --fc_init uniform --conv_init normal --replace_with_conv True --fc_splits 1 --debug_charts False --epsilon 1e-08 --beta1 0.9 --beta2 0.999 --save_every 0 --models_dir /net/archive/groups/plggluna/adam/experiments/neptune_job_name_1517179219.79/models/ --experiment_dir /net/archive/groups/plggluna/adam/experiments/neptune_job_name_1517179219.79 --adam_debug False --eval_node False --record_node False --schedule_hyper False | |
OFFLINE: False | |
PROGRAM_ARGS: --mkl 0 --dummy 0 --sync 0 --cpu 1 --artificial_slowdown 0 --queue_size 1 --my_sim_master_queue 1 --train_log_path /net/archive/groups/plggluna/adam/experiments/neptune_job_name_1517179219.79/storage//atari_trainlog/ --predict_batch_size 16 --dummy_predictor 0 --do_train 1 --simulator_procs 100 --env Breakout-v0 --nr_towers 1 --nr_predict_towers 3 --steps_per_epoch 1000 --fc_neurons 256 --batch_size 128 --learning_rate 0.00015 --port 35083 --tf_port 45394 --optimizer adam --use_sync_opt 1 --num_grad 60 --early_stopping None --ps 1 --fc_init uniform --conv_init normal --replace_with_conv True --fc_splits 1 --debug_charts False --epsilon 1e-08 --beta1 0.9 --beta2 0.999 --save_every 0 --models_dir /net/archive/groups/plggluna/adam/experiments/neptune_job_name_1517179219.79/models/ --experiment_dir /net/archive/groups/plggluna/adam/experiments/neptune_job_name_1517179219.79 --adam_debug False --eval_node False --record_node False --schedule_hyper False | |
OFFLINE: False | |
PROGRAM_ARGS: --mkl 0 --dummy 0 --sync 0 --cpu 1 --artificial_slowdown 0 --queue_size 1 --my_sim_master_queue 1 --train_log_path /net/archive/groups/plggluna/adam/experiments/neptune_job_name_1517179219.79/storage//atari_trainlog/ --predict_batch_size 16 --dummy_predictor 0 --do_train 1 --simulator_procs 100 --env Breakout-v0 --nr_towers 1 --nr_predict_towers 3 --steps_per_epoch 1000 --fc_neurons 256 --batch_size 128 --learning_rate 0.00015 --port 35083 --tf_port 45394 --optimizer adam --use_sync_opt 1 --num_grad 60 --early_stopping None --ps 1 --fc_init uniform --conv_init normal --replace_with_conv True --fc_splits 1 --debug_charts False --epsilon 1e-08 --beta1 0.9 --beta2 0.999 --save_every 0 --models_dir /net/archive/groups/plggluna/adam/experiments/neptune_job_name_1517179219.79/models/ --experiment_dir /net/archive/groups/plggluna/adam/experiments/neptune_job_name_1517179219.79 --adam_debug False --eval_node False --record_node False --schedule_hyper False | |
OFFLINE: False | |
2018-01-28 23:40:28.823877: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations. | |
2018-01-28 23:40:28.823904: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations. | |
2018-01-28 23:40:28.823923: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations. | |
2018-01-28 23:40:28.823930: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations. | |
2018-01-28 23:40:28.823937: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations. | |
Traceback (most recent call last): | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module> | |
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index) | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__ | |
self._server_def.SerializeToString(), status) | |
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__ | |
self.gen.next() | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status | |
pywrap_tensorflow.TF_GetCode(status)) | |
InvalidArgumentError: Task 2 was not defined in job "worker" | |
2018-01-28 23:40:29.068483: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations. | |
2018-01-28 23:40:29.068537: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations. | |
2018-01-28 23:40:29.068556: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations. | |
2018-01-28 23:40:29.068563: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations. | |
2018-01-28 23:40:29.068570: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations. | |
2018-01-28 23:40:29.078694: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:215] Initialize GrpcChannelCache for job ps -> {0 -> p1340:45394} | |
2018-01-28 23:40:29.078734: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:215] Initialize GrpcChannelCache for job worker -> {0 -> localhost:45395, 1 -> p1343:45395} | |
2018-01-28 23:40:29.080302: I tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:316] Started server with target: grpc://localhost:45395 | |
[2018-01-28 23:40:29,081] Making new env: Breakout-v0 | |
{'ps': ['p1340:45394'], 'worker': ['p1341:45395', 'p1343:45395']} | |
[worker:0] Starting the TF server | |
args.mkl == 0 | |
2018-01-28 23:40:29.153339: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations. | |
2018-01-28 23:40:29.153368: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations. | |
2018-01-28 23:40:29.153386: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations. | |
2018-01-28 23:40:29.153394: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations. | |
2018-01-28 23:40:29.153401: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations. | |
Traceback (most recent call last): | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module> | |
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index) | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__ | |
self._server_def.SerializeToString(), status) | |
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__ | |
self.gen.next() | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status | |
pywrap_tensorflow.TF_GetCode(status)) | |
InvalidArgumentError: Task 3 was not defined in job "worker" | |
using tensorflow convolution | |
2018-01-28 23:40:29.278708: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations. | |
2018-01-28 23:40:29.278740: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations. | |
2018-01-28 23:40:29.278759: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations. | |
2018-01-28 23:40:29.278766: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations. | |
2018-01-28 23:40:29.278773: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations. | |
2018-01-28 23:40:29.288344: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:215] Initialize GrpcChannelCache for job ps -> {0 -> localhost:45394} | |
2018-01-28 23:40:29.288398: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:215] Initialize GrpcChannelCache for job worker -> {0 -> p1341:45395, 1 -> p1343:45395} | |
2018-01-28 23:40:29.289763: I tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:316] Started server with target: grpc://localhost:45394 | |
{'ps': ['p1340:45394'], 'worker': ['p1341:45395', 'p1343:45395']} | |
[ps:0] Starting the TF server | |
[0128 23:40:29 @train.py:84] [ps:0] joining the server. | |
Traceback (most recent call last): | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module> | |
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index) | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__ | |
self._server_def.SerializeToString(), status) | |
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__ | |
self.gen.next() | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status | |
pywrap_tensorflow.TF_GetCode(status)) | |
InvalidArgumentError: Task 2 was not defined in job "worker" | |
Traceback (most recent call last): | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module> | |
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index) | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__ | |
self._server_def.SerializeToString(), status) | |
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__ | |
self.gen.next() | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status | |
pywrap_tensorflow.TF_GetCode(status)) | |
InvalidArgumentError: Task 3 was not defined in job "worker" | |
Traceback (most recent call last): | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module> | |
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index) | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__ | |
self._server_def.SerializeToString(), status) | |
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__ | |
self.gen.next() | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status | |
pywrap_tensorflow.TF_GetCode(status)) | |
InvalidArgumentError: Task 2 was not defined in job "worker" | |
[2018-01-28 23:40:31,077] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,084] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,091] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,097] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,104] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,110] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,116] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,123] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,130] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,136] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,143] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,150] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,156] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,164] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,170] Making new env: Breakout-v0 | |
Traceback (most recent call last): | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module> | |
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index) | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__ | |
self._server_def.SerializeToString(), status) | |
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__ | |
self.gen.next() | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status | |
pywrap_tensorflow.TF_GetCode(status)) | |
InvalidArgumentError: Task 3 was not defined in job "worker" | |
[2018-01-28 23:40:31,177] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,184] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,190] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,197] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,204] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,211] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,218] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,224] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,231] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,238] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,245] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,252] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,259] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,266] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,273] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,280] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,287] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,294] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,301] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,308] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,315] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,322] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,329] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,336] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,343] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,350] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,357] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,364] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,372] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,379] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,386] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,393] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,400] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,408] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,415] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,423] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,429] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,437] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,444] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,452] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,459] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,466] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,474] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,481] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,488] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,496] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,503] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,511] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,518] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,526] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,533] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,541] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,548] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,556] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,563] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,570] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,579] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,587] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,594] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,602] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,609] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,617] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,625] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,632] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,640] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,648] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,655] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,663] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,671] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,678] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,687] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,694] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,702] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,710] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,718] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,725] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,733] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,741] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,749] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,757] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,764] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,772] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,780] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,789] Making new env: Breakout-v0 | |
[2018-01-28 23:40:31,796] Making new env: Breakout-v0 | |
None <type 'NoneType'> | |
worker host: grpc://localhost:45395 | |
[0128 23:40:31 @train.py:717] [BA3C] Train on gpu 0 and infer on gpu 0,0,0 | |
[0128 23:40:31 @train.py:723] using async version | |
DUMMY PREDICTOR 0 | |
MultiGPUTrainer __init__ dummy = 0 | |
[0128 23:40:31 @multigpu.py:57] Training a model of 1 tower | |
[0128 23:40:31 @multigpu.py:67] Building graph for training tower 0..., /cpu:0 | |
===== [p1341] PRINTING BUILD GRAPH STACK AT 1517179231.85==============
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 727, in <module> | |
trainer.train() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 137, in train | |
grad_list = self._multi_tower_grads() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 81, in _multi_tower_grads | |
self.model.build_graph(model_inputs) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/models/model_desc.py", line 140, in build_graph | |
self._build_graph(model_inputs) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 278, in _build_graph | |
traceback.print_stack(file=sys.stderr) | |
Traceback (most recent call last): | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module> | |
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index) | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__ | |
self._server_def.SerializeToString(), status) | |
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__ | |
self.gen.next() | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status | |
pywrap_tensorflow.TF_GetCode(status)) | |
InvalidArgumentError: Task 2 was not defined in job "worker" | |
12 | |
[0128 23:40:31 @_common.py:61] conv0 input: [None, 84, 84, 16] | |
Tensor("tower0/conv0/output:0", shape=(?, 80, 80, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[0128 23:40:31 @_common.py:70] conv0 output: [None, 80, 80, 32] | |
[0128 23:40:31 @_common.py:61] pool0 input: [None, 80, 80, 32] | |
Tensor("tower0/pool0/MaxPool:0", shape=(?, 40, 40, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[0128 23:40:31 @_common.py:70] pool0 output: [None, 40, 40, 32] | |
[0128 23:40:31 @_common.py:61] conv1 input: [None, 40, 40, 32] | |
Tensor("tower0/conv1/output:0", shape=(?, 36, 36, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[0128 23:40:31 @_common.py:70] conv1 output: [None, 36, 36, 32] | |
[0128 23:40:31 @_common.py:61] pool1 input: [None, 36, 36, 32] | |
Tensor("tower0/pool1/MaxPool:0", shape=(?, 18, 18, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[0128 23:40:31 @_common.py:70] pool1 output: [None, 18, 18, 32] | |
[0128 23:40:31 @_common.py:61] conv2 input: [None, 18, 18, 32] | |
Tensor("tower0/conv2/output:0", shape=(?, 14, 14, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[0128 23:40:31 @_common.py:70] conv2 output: [None, 14, 14, 64] | |
[0128 23:40:31 @_common.py:61] pool2 input: [None, 14, 14, 64] | |
Tensor("tower0/pool2/MaxPool:0", shape=(?, 7, 7, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[0128 23:40:31 @_common.py:70] pool2 output: [None, 7, 7, 64] | |
[0128 23:40:31 @_common.py:61] conv3 input: [None, 7, 7, 64] | |
Tensor("tower0/conv3/output:0", shape=(?, 5, 5, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[0128 23:40:31 @_common.py:70] conv3 output: [None, 5, 5, 64] | |
[0128 23:40:31 @_common.py:61] fc1_0 input: [None, 5, 5, 64] | |
Tensor("tower0/fc1_0/output:0", shape=(?, 1, 1, 256), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[0128 23:40:31 @_common.py:70] fc1_0 output: [None, 1, 1, 256] | |
[0128 23:40:32 @_common.py:61] fc-pi input: [None, 256] | |
Tensor("tower0/fc-pi/output:0", shape=(?, 6), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[0128 23:40:32 @_common.py:70] fc-pi output: [None, 6] | |
[0128 23:40:32 @_common.py:61] fc-v input: [None, 256] | |
Traceback (most recent call last): | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module> | |
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index) | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__ | |
self._server_def.SerializeToString(), status) | |
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__ | |
self.gen.next() | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status | |
pywrap_tensorflow.TF_GetCode(status)) | |
InvalidArgumentError: Task 3 was not defined in job "worker" | |
Traceback (most recent call last): | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module> | |
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index) | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__ | |
self._server_def.SerializeToString(), status) | |
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__ | |
self.gen.next() | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status | |
pywrap_tensorflow.TF_GetCode(status)) | |
InvalidArgumentError: Task 2 was not defined in job "worker" | |
Tensor("tower0/fc-v/output:0", shape=(?, 1), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[0128 23:40:32 @_common.py:70] fc-v output: [None, 1] | |
Traceback (most recent call last): | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module> | |
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index) | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__ | |
self._server_def.SerializeToString(), status) | |
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__ | |
self.gen.next() | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status | |
pywrap_tensorflow.TF_GetCode(status)) | |
InvalidArgumentError: Task 3 was not defined in job "worker" | |
MOVING_SUMMARY_VARIABLES | |
[] | |
[0128 23:40:33 @modelutils.py:22] Model Parameters: | |
conv0/W:0: shape=[5, 5, 16, 32], dim=12800 | |
conv1/W:0: shape=[5, 5, 32, 32], dim=25600 | |
conv2/W:0: shape=[5, 5, 32, 64], dim=51200 | |
conv3/W:0: shape=[3, 3, 64, 64], dim=36864 | |
fc1_0/W:0: shape=[5, 5, 64, 256], dim=409600 | |
fc-pi/W:0: shape=[256, 6], dim=1536 | |
fc-pi/b:0: shape=[6], dim=6 | |
fc-v/W:0: shape=[256, 1], dim=256 | |
fc-v/b:0: shape=[1], dim=1 | |
Total param=537863 (2.051785 MB assuming all float32) | |
Traceback (most recent call last): | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module> | |
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index) | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__ | |
self._server_def.SerializeToString(), status) | |
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__ | |
self.gen.next() | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status | |
pywrap_tensorflow.TF_GetCode(status)) | |
InvalidArgumentError: Task 2 was not defined in job "worker" | |
Traceback (most recent call last): | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module> | |
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index) | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__ | |
self._server_def.SerializeToString(), status) | |
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__ | |
self.gen.next() | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status | |
pywrap_tensorflow.TF_GetCode(status)) | |
InvalidArgumentError: Task 3 was not defined in job "worker" | |
2018-01-28 23:40:34.823812: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations. | |
2018-01-28 23:40:34.823850: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations. | |
2018-01-28 23:40:34.823859: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations. | |
2018-01-28 23:40:34.823867: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations. | |
2018-01-28 23:40:34.823874: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations. | |
2018-01-28 23:40:34.834021: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:215] Initialize GrpcChannelCache for job ps -> {0 -> p1340:45394} | |
2018-01-28 23:40:34.834060: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:215] Initialize GrpcChannelCache for job worker -> {0 -> p1341:45395, 1 -> localhost:45395} | |
2018-01-28 23:40:34.835825: I tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:316] Started server with target: grpc://localhost:45395 | |
[2018-01-28 23:40:34,837] Making new env: Breakout-v0 | |
Traceback (most recent call last): | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module> | |
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index) | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__ | |
self._server_def.SerializeToString(), status) | |
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__ | |
self.gen.next() | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status | |
pywrap_tensorflow.TF_GetCode(status)) | |
InvalidArgumentError: Task 2 was not defined in job "worker" | |
{'ps': ['p1340:45394'], 'worker': ['p1341:45395', 'p1343:45395']} | |
[worker:1] Starting the TF server | |
args.mkl == 0 | |
using tensorflow convolution | |
[2018-01-28 23:40:35,057] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,062] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,068] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,073] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,079] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,086] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,092] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,098] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,104] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,111] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,117] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,123] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,129] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,135] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,142] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,148] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,155] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,161] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,167] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,173] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,180] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,187] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,194] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,200] Making new env: Breakout-v0 | |
Traceback (most recent call last): | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module> | |
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index) | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__ | |
self._server_def.SerializeToString(), status) | |
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__ | |
self.gen.next() | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status | |
pywrap_tensorflow.TF_GetCode(status)) | |
InvalidArgumentError: Task 3 was not defined in job "worker" | |
[2018-01-28 23:40:35,207] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,214] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,220] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,227] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,233] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,240] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,246] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,253] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,260] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,267] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,273] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,280] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,287] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,295] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,302] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,309] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,316] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,322] Making new env: Breakout-v0 | |
[0128 23:40:35 @multigpu.py:228] Setup callbacks ... | |
[2018-01-28 23:40:35,329] Making new env: Breakout-v0 | |
Creating Predictorfactor 0 | |
[0128 23:40:35 @base.py:132] Building graph for predictor tower 0... | |
===== [p1341] PRINTING BUILD GRAPH STACK AT 1517179235.33============== File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 727, in <module> | |
trainer.train() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 230, in train | |
callbacks.setup_graph(self) # TODO use weakref instead? | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph | |
self._setup_graph() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/group.py", line 66, in _setup_graph | |
cb.setup_graph(self.trainer) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph | |
self._setup_graph() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 366, in _setup_graph | |
self.trainer.get_predict_funcs(['state'], ['logitsT', 'pred_value'], self.predictor_threads), | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 331, in get_predict_funcs | |
return [self.get_predict_func(input_names, output_names, k) for k in range(n)] | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 328, in get_predict_func | |
return self.predictor_factory.get_predictor(input_names, output_names, tower) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 54, in get_predictor | |
self._build_predict_tower() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 71, in _build_predict_tower | |
self.model, self.towers) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/predict/base.py", line 134, in build_multi_tower_prediction_graph | |
model.build_graph(input_vars) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/models/model_desc.py", line 140, in build_graph | |
self._build_graph(model_inputs) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 278, in _build_graph | |
traceback.print_stack(file=sys.stderr) | |
[2018-01-28 23:40:35,336] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,343] Making new env: Breakout-v0 | |
12 | |
[2018-01-28 23:40:35,350] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,357] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,364] Making new env: Breakout-v0 | |
Tensor("towerp0/conv0/output:0", shape=(?, 80, 80, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[2018-01-28 23:40:35,371] Making new env: Breakout-v0 | |
Tensor("towerp0/pool0/MaxPool:0", shape=(?, 40, 40, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
Tensor("towerp0/conv1/output:0", shape=(?, 36, 36, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[2018-01-28 23:40:35,378] Making new env: Breakout-v0 | |
Tensor("towerp0/pool1/MaxPool:0", shape=(?, 18, 18, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
Tensor("towerp0/conv2/output:0", shape=(?, 14, 14, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[2018-01-28 23:40:35,386] Making new env: Breakout-v0 | |
Tensor("towerp0/pool2/MaxPool:0", shape=(?, 7, 7, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
Tensor("towerp0/conv3/output:0", shape=(?, 5, 5, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[2018-01-28 23:40:35,393] Making new env: Breakout-v0 | |
Tensor("towerp0/fc1_0/output:0", shape=(?, 1, 1, 256), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[2018-01-28 23:40:35,400] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,408] Making new env: Breakout-v0 | |
Tensor("towerp0/fc-pi/output:0", shape=(?, 6), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[2018-01-28 23:40:35,415] Making new env: Breakout-v0 | |
Tensor("towerp0/fc-v/output:0", shape=(?, 1), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[2018-01-28 23:40:35,422] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,429] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,436] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,442] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,449] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,456] Making new env: Breakout-v0 | |
[0128 23:40:35 @base.py:132] Building graph for predictor tower 0... | |
===== [p1341] PRINTING BUILD GRAPH STACK AT 1517179235.46============== File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 727, in <module> | |
trainer.train() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 230, in train | |
callbacks.setup_graph(self) # TODO use weakref instead? | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph | |
self._setup_graph() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/group.py", line 66, in _setup_graph | |
cb.setup_graph(self.trainer) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph | |
self._setup_graph() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 366, in _setup_graph | |
self.trainer.get_predict_funcs(['state'], ['logitsT', 'pred_value'], self.predictor_threads), | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 331, in get_predict_funcs | |
return [self.get_predict_func(input_names, output_names, k) for k in range(n)] | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 328, in get_predict_func | |
return self.predictor_factory.get_predictor(input_names, output_names, tower) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 54, in get_predictor | |
self._build_predict_tower() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 71, in _build_predict_tower | |
self.model, self.towers) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/predict/base.py", line 134, in build_multi_tower_prediction_graph | |
model.build_graph(input_vars) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/models/model_desc.py", line 140, in build_graph | |
self._build_graph(model_inputs) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 278, in _build_graph | |
traceback.print_stack(file=sys.stderr) | |
[2018-01-28 23:40:35,464] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,471] Making new env: Breakout-v0 | |
12 | |
[2018-01-28 23:40:35,478] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,485] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,493] Making new env: Breakout-v0 | |
Tensor("towerp0_1/conv0/output:0", shape=(?, 80, 80, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
Tensor("towerp0_1/pool0/MaxPool:0", shape=(?, 40, 40, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
Tensor("towerp0_1/conv1/output:0", shape=(?, 36, 36, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[2018-01-28 23:40:35,500] Making new env: Breakout-v0 | |
Tensor("towerp0_1/pool1/MaxPool:0", shape=(?, 18, 18, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
Tensor("towerp0_1/conv2/output:0", shape=(?, 14, 14, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[2018-01-28 23:40:35,508] Making new env: Breakout-v0 | |
Tensor("towerp0_1/pool2/MaxPool:0", shape=(?, 7, 7, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
Tensor("towerp0_1/conv3/output:0", shape=(?, 5, 5, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[2018-01-28 23:40:35,515] Making new env: Breakout-v0 | |
Tensor("towerp0_1/fc1_0/output:0", shape=(?, 1, 1, 256), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[2018-01-28 23:40:35,523] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,530] Making new env: Breakout-v0 | |
Tensor("towerp0_1/fc-pi/output:0", shape=(?, 6), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[2018-01-28 23:40:35,537] Making new env: Breakout-v0 | |
Tensor("towerp0_1/fc-v/output:0", shape=(?, 1), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[2018-01-28 23:40:35,545] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,552] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,559] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,566] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,574] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,581] Making new env: Breakout-v0 | |
[0128 23:40:35 @base.py:132] Building graph for predictor tower 0... | |
===== [p1341] PRINTING BUILD GRAPH STACK AT 1517179235.58============== File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 727, in <module> | |
trainer.train() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 230, in train | |
callbacks.setup_graph(self) # TODO use weakref instead? | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph | |
self._setup_graph() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/group.py", line 66, in _setup_graph | |
cb.setup_graph(self.trainer) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph | |
self._setup_graph() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 366, in _setup_graph | |
self.trainer.get_predict_funcs(['state'], ['logitsT', 'pred_value'], self.predictor_threads), | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 331, in get_predict_funcs | |
return [self.get_predict_func(input_names, output_names, k) for k in range(n)] | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 328, in get_predict_func | |
return self.predictor_factory.get_predictor(input_names, output_names, tower) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 54, in get_predictor | |
self._build_predict_tower() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 71, in _build_predict_tower | |
self.model, self.towers) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/predict/base.py", line 134, in build_multi_tower_prediction_graph | |
model.build_graph(input_vars) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/models/model_desc.py", line 140, in build_graph | |
self._build_graph(model_inputs) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 278, in _build_graph | |
traceback.print_stack(file=sys.stderr) | |
[2018-01-28 23:40:35,589] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,596] Making new env: Breakout-v0 | |
12 | |
[2018-01-28 23:40:35,603] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,611] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,619] Making new env: Breakout-v0 | |
Tensor("towerp0_2/conv0/output:0", shape=(?, 80, 80, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
Tensor("towerp0_2/pool0/MaxPool:0", shape=(?, 40, 40, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
Tensor("towerp0_2/conv1/output:0", shape=(?, 36, 36, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[2018-01-28 23:40:35,626] Making new env: Breakout-v0 | |
Tensor("towerp0_2/pool1/MaxPool:0", shape=(?, 18, 18, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[2018-01-28 23:40:35,634] Making new env: Breakout-v0 | |
Tensor("towerp0_2/conv2/output:0", shape=(?, 14, 14, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
Tensor("towerp0_2/pool2/MaxPool:0", shape=(?, 7, 7, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[2018-01-28 23:40:35,641] Making new env: Breakout-v0 | |
Tensor("towerp0_2/conv3/output:0", shape=(?, 5, 5, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
Tensor("towerp0_2/fc1_0/output:0", shape=(?, 1, 1, 256), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[2018-01-28 23:40:35,648] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,656] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,663] Making new env: Breakout-v0 | |
Tensor("towerp0_2/fc-pi/output:0", shape=(?, 6), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[2018-01-28 23:40:35,670] Making new env: Breakout-v0 | |
Tensor("towerp0_2/fc-v/output:0", shape=(?, 1), dtype=float32, device=/job:worker/task:0/device:CPU:0) | |
[2018-01-28 23:40:35,678] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,685] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,693] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,701] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,708] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,716] Making new env: Breakout-v0 | |
[0128 23:40:35 @base.py:177] =============================================================== | |
[0128 23:40:35 @base.py:179] CHIEF! | |
[0128 23:40:35 @base.py:180] [p1341] Creating the session | |
[0128 23:40:35 @base.py:181] =============================================================== | |
[2018-01-28 23:40:35,724] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,731] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,739] Making new env: Breakout-v0 | |
[2018-01-28 23:40:35,747] Making new env: Breakout-v0 | |
None <type 'NoneType'> | |
worker host: grpc://localhost:45395 | |
[0128 23:40:35 @train.py:717] [BA3C] Train on gpu 0 and infer on gpu 0,0,0 | |
[0128 23:40:35 @train.py:723] using async version | |
DUMMY PREDICTOR 0 | |
MultiGPUTrainer __init__ dummy = 0 | |
[0128 23:40:35 @multigpu.py:57] Training a model of 1 tower | |
[0128 23:40:35 @multigpu.py:67] Building graph for training tower 0..., /cpu:0 | |
===== [p1343] PRINTING BUILD GRAPH STACK AT 1517179235.78============== File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 727, in <module> | |
trainer.train() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 137, in train | |
grad_list = self._multi_tower_grads() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 81, in _multi_tower_grads | |
self.model.build_graph(model_inputs) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/models/model_desc.py", line 140, in build_graph | |
self._build_graph(model_inputs) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 278, in _build_graph | |
traceback.print_stack(file=sys.stderr) | |
12 | |
[0128 23:40:35 @_common.py:61] conv0 input: [None, 84, 84, 16] | |
Tensor("tower0/conv0/output:0", shape=(?, 80, 80, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
[0128 23:40:35 @_common.py:70] conv0 output: [None, 80, 80, 32] | |
[0128 23:40:35 @_common.py:61] pool0 input: [None, 80, 80, 32] | |
Tensor("tower0/pool0/MaxPool:0", shape=(?, 40, 40, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
[0128 23:40:35 @_common.py:70] pool0 output: [None, 40, 40, 32] | |
[0128 23:40:35 @_common.py:61] conv1 input: [None, 40, 40, 32] | |
Tensor("tower0/conv1/output:0", shape=(?, 36, 36, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
[0128 23:40:35 @_common.py:70] conv1 output: [None, 36, 36, 32] | |
[0128 23:40:35 @_common.py:61] pool1 input: [None, 36, 36, 32] | |
Tensor("tower0/pool1/MaxPool:0", shape=(?, 18, 18, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
[0128 23:40:35 @_common.py:70] pool1 output: [None, 18, 18, 32] | |
[0128 23:40:35 @_common.py:61] conv2 input: [None, 18, 18, 32] | |
Traceback (most recent call last): | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module> | |
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index) | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__ | |
self._server_def.SerializeToString(), status) | |
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__ | |
self.gen.next() | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status | |
pywrap_tensorflow.TF_GetCode(status)) | |
InvalidArgumentError: Task 2 was not defined in job "worker" | |
Tensor("tower0/conv2/output:0", shape=(?, 14, 14, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
[0128 23:40:35 @_common.py:70] conv2 output: [None, 14, 14, 64] | |
[0128 23:40:35 @_common.py:61] pool2 input: [None, 14, 14, 64] | |
Tensor("tower0/pool2/MaxPool:0", shape=(?, 7, 7, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
[0128 23:40:35 @_common.py:70] pool2 output: [None, 7, 7, 64] | |
[0128 23:40:35 @_common.py:61] conv3 input: [None, 7, 7, 64] | |
Tensor("tower0/conv3/output:0", shape=(?, 5, 5, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
[0128 23:40:35 @_common.py:70] conv3 output: [None, 5, 5, 64] | |
[0128 23:40:35 @_common.py:61] fc1_0 input: [None, 5, 5, 64] | |
Tensor("tower0/fc1_0/output:0", shape=(?, 1, 1, 256), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
[0128 23:40:35 @_common.py:70] fc1_0 output: [None, 1, 1, 256] | |
[0128 23:40:35 @_common.py:61] fc-pi input: [None, 256] | |
Tensor("tower0/fc-pi/output:0", shape=(?, 6), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
[0128 23:40:35 @_common.py:70] fc-pi output: [None, 6] | |
[0128 23:40:35 @_common.py:61] fc-v input: [None, 256] | |
Tensor("tower0/fc-v/output:0", shape=(?, 1), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
[0128 23:40:36 @_common.py:70] fc-v output: [None, 1] | |
Traceback (most recent call last): | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module> | |
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index) | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__ | |
self._server_def.SerializeToString(), status) | |
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__ | |
self.gen.next() | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status | |
pywrap_tensorflow.TF_GetCode(status)) | |
InvalidArgumentError: Task 3 was not defined in job "worker" | |
2018-01-28 23:40:36.604219: I tensorflow/core/distributed_runtime/master_session.cc:999] Start master session a398c17478e77696 with config: | |
[0128 23:40:36 @base.py:189] =============================================================== | |
[0128 23:40:36 @base.py:190] [p1341] Session created | |
[0128 23:40:36 @base.py:191] =============================================================== | |
[0128 23:40:36 @base.py:112] [p1341] Initializing graph variables ... | |
[0128 23:40:36 @base.py:119] [p1341] Starting concurrency... | |
[0128 23:40:36 @base.py:198] Starting all threads & procs ... | |
[0128 23:40:36 @base.py:122] [p1341] Setting default session | |
[0128 23:40:36 @base.py:125] [p1341] Getting global step | |
[0128 23:40:36 @base.py:127] [p1341] Start training with global_step=0 | |
MOVING_SUMMARY_VARIABLES | |
[] | |
Traceback (most recent call last): | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module> | |
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index) | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__ | |
self._server_def.SerializeToString(), status) | |
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__ | |
self.gen.next() | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status | |
pywrap_tensorflow.TF_GetCode(status)) | |
InvalidArgumentError: Task 2 was not defined in job "worker" | |
server main loop | |
before socket bind... tcp://*:35083 | |
receiving | |
[0128 23:40:36 @modelutils.py:22] Model Parameters: | |
conv0/W:0: shape=[5, 5, 16, 32], dim=12800 | |
conv1/W:0: shape=[5, 5, 32, 32], dim=25600 | |
conv2/W:0: shape=[5, 5, 32, 64], dim=51200 | |
conv3/W:0: shape=[3, 3, 64, 64], dim=36864 | |
fc1_0/W:0: shape=[5, 5, 64, 256], dim=409600 | |
fc-pi/W:0: shape=[256, 6], dim=1536 | |
fc-pi/b:0: shape=[6], dim=6 | |
fc-v/W:0: shape=[256, 1], dim=256 | |
fc-v/b:0: shape=[1], dim=1 | |
Total param=537863 (2.051785 MB assuming all float32) | |
Traceback (most recent call last): | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module> | |
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index) | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__ | |
self._server_def.SerializeToString(), status) | |
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__ | |
self.gen.next() | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status | |
pywrap_tensorflow.TF_GetCode(status)) | |
InvalidArgumentError: Task 3 was not defined in job "worker" | |
Traceback (most recent call last): | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module> | |
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index) | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__ | |
self._server_def.SerializeToString(), status) | |
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__ | |
self.gen.next() | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status | |
pywrap_tensorflow.TF_GetCode(status)) | |
InvalidArgumentError: Task 2 was not defined in job "worker" | |
Traceback (most recent call last): | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module> | |
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index) | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__ | |
self._server_def.SerializeToString(), status) | |
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__ | |
self.gen.next() | |
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status | |
pywrap_tensorflow.TF_GetCode(status)) | |
InvalidArgumentError: Task 3 was not defined in job "worker" | |
[0128 23:40:38 @multigpu.py:228] Setup callbacks ... | |
Creating Predictorfactor 0 | |
[0128 23:40:38 @base.py:132] Building graph for predictor tower 0... | |
===== [p1343] PRINTING BUILD GRAPH STACK AT 1517179238.46============== File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 727, in <module> | |
trainer.train() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 230, in train | |
callbacks.setup_graph(self) # TODO use weakref instead? | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph | |
self._setup_graph() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/group.py", line 66, in _setup_graph | |
cb.setup_graph(self.trainer) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph | |
self._setup_graph() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 366, in _setup_graph | |
self.trainer.get_predict_funcs(['state'], ['logitsT', 'pred_value'], self.predictor_threads), | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 331, in get_predict_funcs | |
return [self.get_predict_func(input_names, output_names, k) for k in range(n)] | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 328, in get_predict_func | |
return self.predictor_factory.get_predictor(input_names, output_names, tower) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 54, in get_predictor | |
self._build_predict_tower() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 71, in _build_predict_tower | |
self.model, self.towers) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/predict/base.py", line 134, in build_multi_tower_prediction_graph | |
model.build_graph(input_vars) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/models/model_desc.py", line 140, in build_graph | |
self._build_graph(model_inputs) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 278, in _build_graph | |
traceback.print_stack(file=sys.stderr) | |
12 | |
Tensor("towerp0/conv0/output:0", shape=(?, 80, 80, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0/pool0/MaxPool:0", shape=(?, 40, 40, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0/conv1/output:0", shape=(?, 36, 36, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0/pool1/MaxPool:0", shape=(?, 18, 18, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0/conv2/output:0", shape=(?, 14, 14, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0/pool2/MaxPool:0", shape=(?, 7, 7, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0/conv3/output:0", shape=(?, 5, 5, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0/fc1_0/output:0", shape=(?, 1, 1, 256), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0/fc-pi/output:0", shape=(?, 6), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0/fc-v/output:0", shape=(?, 1), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
[0128 23:40:38 @base.py:132] Building graph for predictor tower 0... | |
===== [p1343] PRINTING BUILD GRAPH STACK AT 1517179238.6============== File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 727, in <module> | |
trainer.train() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 230, in train | |
callbacks.setup_graph(self) # TODO use weakref instead? | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph | |
self._setup_graph() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/group.py", line 66, in _setup_graph | |
cb.setup_graph(self.trainer) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph | |
self._setup_graph() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 366, in _setup_graph | |
self.trainer.get_predict_funcs(['state'], ['logitsT', 'pred_value'], self.predictor_threads), | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 331, in get_predict_funcs | |
return [self.get_predict_func(input_names, output_names, k) for k in range(n)] | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 328, in get_predict_func | |
return self.predictor_factory.get_predictor(input_names, output_names, tower) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 54, in get_predictor | |
self._build_predict_tower() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 71, in _build_predict_tower | |
self.model, self.towers) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/predict/base.py", line 134, in build_multi_tower_prediction_graph | |
model.build_graph(input_vars) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/models/model_desc.py", line 140, in build_graph | |
self._build_graph(model_inputs) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 278, in _build_graph | |
traceback.print_stack(file=sys.stderr) | |
12 | |
Tensor("towerp0_1/conv0/output:0", shape=(?, 80, 80, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0_1/pool0/MaxPool:0", shape=(?, 40, 40, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0_1/conv1/output:0", shape=(?, 36, 36, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0_1/pool1/MaxPool:0", shape=(?, 18, 18, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0_1/conv2/output:0", shape=(?, 14, 14, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0_1/pool2/MaxPool:0", shape=(?, 7, 7, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0_1/conv3/output:0", shape=(?, 5, 5, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0_1/fc1_0/output:0", shape=(?, 1, 1, 256), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0_1/fc-pi/output:0", shape=(?, 6), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0_1/fc-v/output:0", shape=(?, 1), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
[0128 23:40:38 @base.py:132] Building graph for predictor tower 0... | |
===== [p1343] PRINTING BUILD GRAPH STACK AT 1517179238.73============== File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 727, in <module> | |
trainer.train() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 230, in train | |
callbacks.setup_graph(self) # TODO use weakref instead? | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph | |
self._setup_graph() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/group.py", line 66, in _setup_graph | |
cb.setup_graph(self.trainer) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph | |
self._setup_graph() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 366, in _setup_graph | |
self.trainer.get_predict_funcs(['state'], ['logitsT', 'pred_value'], self.predictor_threads), | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 331, in get_predict_funcs | |
return [self.get_predict_func(input_names, output_names, k) for k in range(n)] | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 328, in get_predict_func | |
return self.predictor_factory.get_predictor(input_names, output_names, tower) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 54, in get_predictor | |
self._build_predict_tower() | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 71, in _build_predict_tower | |
self.model, self.towers) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/predict/base.py", line 134, in build_multi_tower_prediction_graph | |
model.build_graph(input_vars) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/models/model_desc.py", line 140, in build_graph | |
self._build_graph(model_inputs) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 278, in _build_graph | |
traceback.print_stack(file=sys.stderr) | |
12 | |
Tensor("towerp0_2/conv0/output:0", shape=(?, 80, 80, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0_2/pool0/MaxPool:0", shape=(?, 40, 40, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0_2/conv1/output:0", shape=(?, 36, 36, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0_2/pool1/MaxPool:0", shape=(?, 18, 18, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0_2/conv2/output:0", shape=(?, 14, 14, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0_2/pool2/MaxPool:0", shape=(?, 7, 7, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0_2/conv3/output:0", shape=(?, 5, 5, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0_2/fc1_0/output:0", shape=(?, 1, 1, 256), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0_2/fc-pi/output:0", shape=(?, 6), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
Tensor("towerp0_2/fc-v/output:0", shape=(?, 1), dtype=float32, device=/job:worker/task:1/device:CPU:0) | |
[0128 23:40:38 @base.py:177] =============================================================== | |
[0128 23:40:38 @base.py:180] [p1343] Creating the session | |
[0128 23:40:38 @base.py:181] =============================================================== | |
[2018-01-28 23:40:38,884] Making new env: Breakout-v0 | |
{'ps': ['p1340:45394'], 'worker': ['p1341:45395', 'p1343:45395']} | |
[worker:2] Starting the TF server | |
========= EXCEPTION WHILE STARTING TF SERVER [p1347] ===== | |
[worker:2] Starting the TF server | |
========= EXCEPTION WHILE STARTING TF SERVER [p1347] ===== | |
[worker:2] Starting the TF server | |
========= EXCEPTION WHILE STARTING TF SERVER [p1347] ===== | |
[worker:2] Starting the TF server | |
========= EXCEPTION WHILE STARTING TF SERVER [p1347] ===== | |
[worker:2] Starting the TF server | |
========= EXCEPTION WHILE STARTING TF SERVER [p1347] ===== | |
[worker:2] Starting the TF server | |
========= EXCEPTION WHILE STARTING TF SERVER [p1347] ===== | |
[worker:2] Starting the TF server | |
========= EXCEPTION WHILE STARTING TF SERVER [p1347] ===== | |
[worker:2] Starting the TF server | |
========= EXCEPTION WHILE STARTING TF SERVER [p1347] ===== | |
[worker:2] Starting the TF server | |
========= EXCEPTION WHILE STARTING TF SERVER [p1347] ===== | |
[worker:2] Starting the TF server | |
========= EXCEPTION WHILE STARTING TF SERVER [p1347] ===== | |
args.mkl == 0 | |
using tensorflow convolution | |
[2018-01-28 23:40:39,212] Making new env: Breakout-v0 | |
{'ps': ['p1340:45394'], 'worker': ['p1341:45395', 'p1343:45395']} | |
[worker:3] Starting the TF server | |
========= EXCEPTION WHILE STARTING TF SERVER [p1348] ===== | |
[worker:3] Starting the TF server | |
========= EXCEPTION WHILE STARTING TF SERVER [p1348] ===== | |
[worker:3] Starting the TF server | |
========= EXCEPTION WHILE STARTING TF SERVER [p1348] ===== | |
[worker:3] Starting the TF server | |
========= EXCEPTION WHILE STARTING TF SERVER [p1348] ===== | |
[worker:3] Starting the TF server | |
========= EXCEPTION WHILE STARTING TF SERVER [p1348] ===== | |
[worker:3] Starting the TF server | |
========= EXCEPTION WHILE STARTING TF SERVER [p1348] ===== | |
[worker:3] Starting the TF server | |
========= EXCEPTION WHILE STARTING TF SERVER [p1348] ===== | |
[worker:3] Starting the TF server | |
========= EXCEPTION WHILE STARTING TF SERVER [p1348] ===== | |
[worker:3] Starting the TF server | |
========= EXCEPTION WHILE STARTING TF SERVER [p1348] ===== | |
[worker:3] Starting the TF server | |
========= EXCEPTION WHILE STARTING TF SERVER [p1348] ===== | |
args.mkl == 0 | |
using tensorflow convolution | |
2018-01-28 23:40:39.673658: I tensorflow/core/distributed_runtime/master_session.cc:999] Start master session 3cea52acb9284400 with config: | |
[0128 23:40:39 @base.py:189] =============================================================== | |
[0128 23:40:39 @base.py:190] [p1343] Session created | |
[0128 23:40:39 @base.py:191] =============================================================== | |
[0128 23:40:39 @base.py:112] [p1343] Initializing graph variables ... | |
[0128 23:40:39 @base.py:119] [p1343] Starting concurrency... | |
[0128 23:40:39 @base.py:198] Starting all threads & procs ... | |
[0128 23:40:39 @base.py:122] [p1343] Setting default session | |
[0128 23:40:39 @base.py:125] [p1343] Getting global step | |
[0128 23:40:39 @base.py:127] [p1343] Start training with global_step=0 | |
[2018-01-28 23:40:42,460] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,463] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,465] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,469] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,471] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,474] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,476] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,481] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,483] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,487] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,488] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,493] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,494] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,499] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,500] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,506] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,506] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,512] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,513] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,518] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,519] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,525] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,525] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,531] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,532] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,538] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,538] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,544] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,544] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,550] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,550] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,556] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,557] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,563] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,563] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,569] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,570] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,575] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,576] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,582] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,582] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,588] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,589] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,595] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,596] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,601] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,602] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,608] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,609] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,614] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,615] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,621] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,621] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,627] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,628] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,634] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,634] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,640] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,646] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,653] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,660] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,666] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,662] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,662] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,661] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,667] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,662] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,673] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,674] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,680] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,681] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,686] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,688] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,693] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,694] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,699] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,700] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,706] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,707] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,712] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,718] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,719] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,720] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,726] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,727] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,732] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,739] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,745] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,746] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,752] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,753] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,753] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,759] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,760] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,766] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,767] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,773] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,775] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,779] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,782] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,786] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,789] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,793] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,797] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,800] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,805] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,807] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,811] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,813] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,818] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,820] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,825] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,827] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,832] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,834] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,839] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,840] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,846] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,847] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,852] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,854] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,859] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,861] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,866] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,868] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,873] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,874] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,880] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,881] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,887] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,889] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,894] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,895] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,901] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,902] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,908] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,910] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,915] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,917] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,922] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,924] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,929] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,931] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,936] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,938] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,943] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,945] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,950] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,952] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,957] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,959] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,964] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,966] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,971] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,973] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,978] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,980] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,985] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,987] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,992] Making new env: Breakout-v0 | |
[2018-01-28 23:40:42,995] Making new env: Breakout-v0 | |
[0128 23:40:42 @multigpu.py:323] ERR [p1343] step: count(1), step_time 6033.08, mean_step_time 6033.08, it/s 0.17 | |
[2018-01-28 23:40:42,999] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,002] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,006] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,009] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,013] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,016] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,020] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,023] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,027] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,030] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,034] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,037] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,041] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,044] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,048] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,051] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,056] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,058] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,063] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,066] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,070] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,073] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,077] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,080] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,084] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,087] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,091] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,095] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,099] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,101] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,106] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,109] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,113] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,116] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,120] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,124] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,128] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,130] Making new env: Breakout-v0 | |
[2018-01-28 23:40:43,135] Making new env: Breakout-v0 | |
None <type 'NoneType'> | |
Traceback (most recent call last): | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 711, in <module> | |
config = get_config(args, is_chief, my_task_index, chief_worker_hostname, len(cluster['worker'])) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 570, in get_config | |
'worker_host' : server.target, | |
NameError: global name 'server' is not defined | |
None <type 'NoneType'> | |
Traceback (most recent call last): | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 711, in <module> | |
config = get_config(args, is_chief, my_task_index, chief_worker_hostname, len(cluster['worker'])) | |
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 570, in get_config | |
'worker_host' : server.target, | |
NameError: global name 'server' is not defined | |
DONE | |
DONE | |
[0128 23:40:45 @multigpu.py:323] ERR [p1343] step: count(2), step_time 2104.99, mean_step_time 4069.04, it/s 0.25 | |
[0128 23:40:46 @multigpu.py:323] ERR [p1343] step: count(3), step_time 1081.13, mean_step_time 3073.07, it/s 0.33 | |
[0128 23:40:47 @multigpu.py:323] ERR [p1343] step: count(4), step_time 1069.16, mean_step_time 2572.09, it/s 0.39 | |
[0128 23:40:48 @multigpu.py:323] ERR [p1343] step: count(5), step_time 1080.06, mean_step_time 2273.69, it/s 0.44 |