InvalidArgumentError
$ python run_job.py -n 5 -g 60 -c 12 --use_sync --name neptune_job_name
args.offline: False
('bash command: ', 'srun -A luna -N 5 -n 5 -c 12 -t 6:00:00 distributed_tensorpack_mkl.sh 35083 45394 Breakout-v0 adam 1 "3nodes 12cores" "neptune_job_name_1517179219.79" 0.00015 128 60 0 None 256 100 1 uniform normal False . True 1 False /net/archive/groups/plggluna/intel_2/logs/ 1e-08 0.9 0.999 0 False False False False 120 False')
SLURM_JOB_ID 9486553 ; SLURM_JOB_NAME distributed_tensorpack_mkl.sh ; SLURM_JOB_NODELIST p[1340-1341,1343,1347-1348] ; SLURMD_NODENAME p1341 ; SLURM_JOB_NUM_NODES 5
SLURM_JOB_ID 9486553 ; SLURM_JOB_NAME distributed_tensorpack_mkl.sh ; SLURM_JOB_NODELIST p[1340-1341,1343,1347-1348] ; SLURMD_NODENAME p1343 ; SLURM_JOB_NUM_NODES 5
SLURM_JOB_ID 9486553 ; SLURM_JOB_NAME distributed_tensorpack_mkl.sh ; SLURM_JOB_NODELIST p[1340-1341,1343,1347-1348] ; SLURMD_NODENAME p1347 ; SLURM_JOB_NUM_NODES 5
SLURM_JOB_ID 9486553 ; SLURM_JOB_NAME distributed_tensorpack_mkl.sh ; SLURM_JOB_NODELIST p[1340-1341,1343,1347-1348] ; SLURMD_NODENAME p1340 ; SLURM_JOB_NUM_NODES 5
SLURM_JOB_ID 9486553 ; SLURM_JOB_NAME distributed_tensorpack_mkl.sh ; SLURM_JOB_NODELIST p[1340-1341,1343,1347-1348] ; SLURMD_NODENAME p1348 ; SLURM_JOB_NUM_NODES 5
plgrid/tools/python/2.7.13 unloaded.
plgrid/tools/python/2.7.13 unloaded.
plgrid/tools/python/2.7.13 unloaded.
plgrid/tools/python/2.7.13 loaded.
plgrid/tools/python/2.7.13 loaded.
plgrid/tools/python/2.7.13 unloaded.
plgrid/tools/python/2.7.13 loaded.
plgrid/tools/python/2.7.13 loaded.
plgrid/tools/python/2.7.13 unloaded.
plgrid/tools/python/2.7.13 loaded.
plgrid/libs/mkl/11.3.1 unloaded.
plgrid/libs/mkl/11.3.1 unloaded.
plgrid/libs/mkl/11.3.1 unloaded.
plgrid/libs/mkl/2017.0.0 loaded.
The following have been reloaded with a version change:
1) plgrid/libs/mkl/11.3.1 => plgrid/libs/mkl/2017.0.0
plgrid/libs/mkl/2017.0.0 loaded.
plgrid/libs/mkl/2017.0.0 loaded.
The following have been reloaded with a version change:
1) plgrid/libs/mkl/11.3.1 => plgrid/libs/mkl/2017.0.0
The following have been reloaded with a version change:
1) plgrid/libs/mkl/11.3.1 => plgrid/libs/mkl/2017.0.0
plgrid/libs/mkl/11.3.1 unloaded.
plgrid/libs/mkl/11.3.1 unloaded.
plgrid/libs/mkl/2017.0.0 loaded.
The following have been reloaded with a version change:
1) plgrid/libs/mkl/11.3.1 => plgrid/libs/mkl/2017.0.0
plgrid/libs/mkl/2017.0.0 loaded.
The following have been reloaded with a version change:
1) plgrid/libs/mkl/11.3.1 => plgrid/libs/mkl/2017.0.0
tools/gcc/6.2.0 loaded.
tools/gcc/6.2.0 loaded.
tools/gcc/6.2.0 loaded.
tools/gcc/6.2.0 loaded.
tools/gcc/6.2.0 loaded.
PROGRAM_ARGS: --mkl 0 --dummy 0 --sync 0 --cpu 1 --artificial_slowdown 0 --queue_size 1 --my_sim_master_queue 1 --train_log_path /net/archive/groups/plggluna/adam/experiments/neptune_job_name_1517179219.79/storage//atari_trainlog/ --predict_batch_size 16 --dummy_predictor 0 --do_train 1 --simulator_procs 100 --env Breakout-v0 --nr_towers 1 --nr_predict_towers 3 --steps_per_epoch 1000 --fc_neurons 256 --batch_size 128 --learning_rate 0.00015 --port 35083 --tf_port 45394 --optimizer adam --use_sync_opt 1 --num_grad 60 --early_stopping None --ps 1 --fc_init uniform --conv_init normal --replace_with_conv True --fc_splits 1 --debug_charts False --epsilon 1e-08 --beta1 0.9 --beta2 0.999 --save_every 0 --models_dir /net/archive/groups/plggluna/adam/experiments/neptune_job_name_1517179219.79/models/ --experiment_dir /net/archive/groups/plggluna/adam/experiments/neptune_job_name_1517179219.79 --adam_debug False --eval_node False --record_node False --schedule_hyper False
OFFLINE: False
2018-01-28 23:40:28.823877: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-28 23:40:28.823904: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-28 23:40:28.823923: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-01-28 23:40:28.823930: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-28 23:40:28.823937: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
Traceback (most recent call last):
  File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module>
    server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
    self._server_def.SerializeToString(), status)
  File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 2 was not defined in job "worker"
2018-01-28 23:40:29.068483: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-28 23:40:29.068537: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-28 23:40:29.068556: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-01-28 23:40:29.068563: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-28 23:40:29.068570: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2018-01-28 23:40:29.078694: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:215] Initialize GrpcChannelCache for job ps -> {0 -> p1340:45394}
2018-01-28 23:40:29.078734: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:215] Initialize GrpcChannelCache for job worker -> {0 -> localhost:45395, 1 -> p1343:45395}
2018-01-28 23:40:29.080302: I tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:316] Started server with target: grpc://localhost:45395
[2018-01-28 23:40:29,081] Making new env: Breakout-v0
{'ps': ['p1340:45394'], 'worker': ['p1341:45395', 'p1343:45395']}
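Every InvalidArgumentError in this log follows from the cluster spec above: `srun -N 5 -n 5` launched five tasks, but the spec defines only one `ps` task and two `worker` tasks, so the ranks assigned `task_index` 2 and 3 in job `"worker"` have no address in the spec and `tf.train.Server` refuses to start. A plain-Python sketch of the lookup that fails (illustrative only; the real validation happens inside TensorFlow's gRPC server):

```python
# Cluster spec exactly as printed in the log: 1 ps task, 2 worker tasks.
cluster_spec = {
    'ps': ['p1340:45394'],
    'worker': ['p1341:45395', 'p1343:45395'],
}

def resolve_task(cluster_spec, job_name, task_index):
    """Mimic the task lookup that tf.train.Server performs (sketch)."""
    tasks = cluster_spec[job_name]
    if task_index >= len(tasks):
        raise ValueError(
            'Task %d was not defined in job "%s"' % (task_index, job_name))
    return tasks[task_index]

print(resolve_task(cluster_spec, 'worker', 1))   # p1343:45395 - fine
# resolve_task(cluster_spec, 'worker', 2)        # raises, like the Task 2/3
#                                                # tracebacks in this log
```

The fix is to make the number of launched tasks match the spec: either request 3 SLURM tasks, or list all 4 worker hosts in the `worker` entry.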
[worker:0] Starting the TF server
args.mkl == 0
2018-01-28 23:40:29.153339: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-28 23:40:29.153368: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-28 23:40:29.153386: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-01-28 23:40:29.153394: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-28 23:40:29.153401: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
Traceback (most recent call last):
  File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module>
    server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
    self._server_def.SerializeToString(), status)
  File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 3 was not defined in job "worker"
using tensorflow convolution
2018-01-28 23:40:29.278708: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-28 23:40:29.278740: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-28 23:40:29.278759: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-01-28 23:40:29.278766: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-28 23:40:29.278773: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2018-01-28 23:40:29.288344: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:215] Initialize GrpcChannelCache for job ps -> {0 -> localhost:45394}
2018-01-28 23:40:29.288398: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:215] Initialize GrpcChannelCache for job worker -> {0 -> p1341:45395, 1 -> p1343:45395}
2018-01-28 23:40:29.289763: I tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:316] Started server with target: grpc://localhost:45394
{'ps': ['p1340:45394'], 'worker': ['p1341:45395', 'p1343:45395']}
[ps:0] Starting the TF server
[0128 23:40:29 @train.py:84] [ps:0] joining the server.
Traceback (most recent call last):
  File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module>
    server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
    self._server_def.SerializeToString(), status)
  File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 2 was not defined in job "worker"
Traceback (most recent call last):
  File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module>
    server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
    self._server_def.SerializeToString(), status)
  File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 3 was not defined in job "worker"
Traceback (most recent call last):
  File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module>
    server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
    self._server_def.SerializeToString(), status)
  File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 2 was not defined in job "worker"
[2018-01-28 23:40:31,077] Making new env: Breakout-v0
[2018-01-28 23:40:31,084] Making new env: Breakout-v0
[2018-01-28 23:40:31,091] Making new env: Breakout-v0
[2018-01-28 23:40:31,097] Making new env: Breakout-v0
[2018-01-28 23:40:31,104] Making new env: Breakout-v0
[2018-01-28 23:40:31,110] Making new env: Breakout-v0
[2018-01-28 23:40:31,116] Making new env: Breakout-v0
[2018-01-28 23:40:31,123] Making new env: Breakout-v0
[2018-01-28 23:40:31,130] Making new env: Breakout-v0
[2018-01-28 23:40:31,136] Making new env: Breakout-v0
[2018-01-28 23:40:31,143] Making new env: Breakout-v0
[2018-01-28 23:40:31,150] Making new env: Breakout-v0
[2018-01-28 23:40:31,156] Making new env: Breakout-v0
[2018-01-28 23:40:31,164] Making new env: Breakout-v0
[2018-01-28 23:40:31,170] Making new env: Breakout-v0
Traceback (most recent call last):
  File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module>
    server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
    self._server_def.SerializeToString(), status)
  File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 3 was not defined in job "worker"
[2018-01-28 23:40:31,177] Making new env: Breakout-v0
[2018-01-28 23:40:31,184] Making new env: Breakout-v0
[2018-01-28 23:40:31,190] Making new env: Breakout-v0
[2018-01-28 23:40:31,197] Making new env: Breakout-v0
[2018-01-28 23:40:31,204] Making new env: Breakout-v0
[2018-01-28 23:40:31,211] Making new env: Breakout-v0
[2018-01-28 23:40:31,218] Making new env: Breakout-v0
[2018-01-28 23:40:31,224] Making new env: Breakout-v0
[2018-01-28 23:40:31,231] Making new env: Breakout-v0
[2018-01-28 23:40:31,238] Making new env: Breakout-v0
[2018-01-28 23:40:31,245] Making new env: Breakout-v0
[2018-01-28 23:40:31,252] Making new env: Breakout-v0
[2018-01-28 23:40:31,259] Making new env: Breakout-v0
[2018-01-28 23:40:31,266] Making new env: Breakout-v0
[2018-01-28 23:40:31,273] Making new env: Breakout-v0
[2018-01-28 23:40:31,280] Making new env: Breakout-v0
[2018-01-28 23:40:31,287] Making new env: Breakout-v0
[2018-01-28 23:40:31,294] Making new env: Breakout-v0
[2018-01-28 23:40:31,301] Making new env: Breakout-v0
[2018-01-28 23:40:31,308] Making new env: Breakout-v0
[2018-01-28 23:40:31,315] Making new env: Breakout-v0
[2018-01-28 23:40:31,322] Making new env: Breakout-v0
[2018-01-28 23:40:31,329] Making new env: Breakout-v0
[2018-01-28 23:40:31,336] Making new env: Breakout-v0
[2018-01-28 23:40:31,343] Making new env: Breakout-v0
[2018-01-28 23:40:31,350] Making new env: Breakout-v0
[2018-01-28 23:40:31,357] Making new env: Breakout-v0
[2018-01-28 23:40:31,364] Making new env: Breakout-v0
[2018-01-28 23:40:31,372] Making new env: Breakout-v0
[2018-01-28 23:40:31,379] Making new env: Breakout-v0
[2018-01-28 23:40:31,386] Making new env: Breakout-v0
[2018-01-28 23:40:31,393] Making new env: Breakout-v0
[2018-01-28 23:40:31,400] Making new env: Breakout-v0
[2018-01-28 23:40:31,408] Making new env: Breakout-v0
[2018-01-28 23:40:31,415] Making new env: Breakout-v0
[2018-01-28 23:40:31,423] Making new env: Breakout-v0
[2018-01-28 23:40:31,429] Making new env: Breakout-v0
[2018-01-28 23:40:31,437] Making new env: Breakout-v0
[2018-01-28 23:40:31,444] Making new env: Breakout-v0
[2018-01-28 23:40:31,452] Making new env: Breakout-v0
[2018-01-28 23:40:31,459] Making new env: Breakout-v0
[2018-01-28 23:40:31,466] Making new env: Breakout-v0
[2018-01-28 23:40:31,474] Making new env: Breakout-v0
[2018-01-28 23:40:31,481] Making new env: Breakout-v0
[2018-01-28 23:40:31,488] Making new env: Breakout-v0
[2018-01-28 23:40:31,496] Making new env: Breakout-v0
[2018-01-28 23:40:31,503] Making new env: Breakout-v0
[2018-01-28 23:40:31,511] Making new env: Breakout-v0
[2018-01-28 23:40:31,518] Making new env: Breakout-v0
[2018-01-28 23:40:31,526] Making new env: Breakout-v0
[2018-01-28 23:40:31,533] Making new env: Breakout-v0
[2018-01-28 23:40:31,541] Making new env: Breakout-v0
[2018-01-28 23:40:31,548] Making new env: Breakout-v0
[2018-01-28 23:40:31,556] Making new env: Breakout-v0
[2018-01-28 23:40:31,563] Making new env: Breakout-v0
[2018-01-28 23:40:31,570] Making new env: Breakout-v0
[2018-01-28 23:40:31,579] Making new env: Breakout-v0
[2018-01-28 23:40:31,587] Making new env: Breakout-v0
[2018-01-28 23:40:31,594] Making new env: Breakout-v0
[2018-01-28 23:40:31,602] Making new env: Breakout-v0
[2018-01-28 23:40:31,609] Making new env: Breakout-v0
[2018-01-28 23:40:31,617] Making new env: Breakout-v0
[2018-01-28 23:40:31,625] Making new env: Breakout-v0
[2018-01-28 23:40:31,632] Making new env: Breakout-v0
[2018-01-28 23:40:31,640] Making new env: Breakout-v0
[2018-01-28 23:40:31,648] Making new env: Breakout-v0
[2018-01-28 23:40:31,655] Making new env: Breakout-v0
[2018-01-28 23:40:31,663] Making new env: Breakout-v0
[2018-01-28 23:40:31,671] Making new env: Breakout-v0
[2018-01-28 23:40:31,678] Making new env: Breakout-v0
[2018-01-28 23:40:31,687] Making new env: Breakout-v0
[2018-01-28 23:40:31,694] Making new env: Breakout-v0
[2018-01-28 23:40:31,702] Making new env: Breakout-v0
[2018-01-28 23:40:31,710] Making new env: Breakout-v0
[2018-01-28 23:40:31,718] Making new env: Breakout-v0
[2018-01-28 23:40:31,725] Making new env: Breakout-v0
[2018-01-28 23:40:31,733] Making new env: Breakout-v0
[2018-01-28 23:40:31,741] Making new env: Breakout-v0
[2018-01-28 23:40:31,749] Making new env: Breakout-v0
[2018-01-28 23:40:31,757] Making new env: Breakout-v0
[2018-01-28 23:40:31,764] Making new env: Breakout-v0
[2018-01-28 23:40:31,772] Making new env: Breakout-v0
[2018-01-28 23:40:31,780] Making new env: Breakout-v0
[2018-01-28 23:40:31,789] Making new env: Breakout-v0
[2018-01-28 23:40:31,796] Making new env: Breakout-v0
None <type 'NoneType'>
worker host: grpc://localhost:45395
[0128 23:40:31 @train.py:717] [BA3C] Train on gpu 0 and infer on gpu 0,0,0
[0128 23:40:31 @train.py:723] using async version
DUMMY PREDICTOR 0
MultiGPUTrainer __init__ dummy = 0
[0128 23:40:31 @multigpu.py:57] Training a model of 1 tower
[0128 23:40:31 @multigpu.py:67] Building graph for training tower 0..., /cpu:0
===== [p1341] PRINTING BUILD GRAPH STACK AT 1517179231.85 ==============
  File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 727, in <module>
    trainer.train()
  File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 137, in train
    grad_list = self._multi_tower_grads()
  File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 81, in _multi_tower_grads
    self.model.build_graph(model_inputs)
  File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/models/model_desc.py", line 140, in build_graph
    self._build_graph(model_inputs)
  File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 278, in _build_graph
    traceback.print_stack(file=sys.stderr)
Traceback (most recent call last):
  File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module>
    server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
    self._server_def.SerializeToString(), status)
  File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 2 was not defined in job "worker"
12
[0128 23:40:31 @_common.py:61] conv0 input: [None, 84, 84, 16]
Tensor("tower0/conv0/output:0", shape=(?, 80, 80, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[0128 23:40:31 @_common.py:70] conv0 output: [None, 80, 80, 32]
[0128 23:40:31 @_common.py:61] pool0 input: [None, 80, 80, 32]
Tensor("tower0/pool0/MaxPool:0", shape=(?, 40, 40, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[0128 23:40:31 @_common.py:70] pool0 output: [None, 40, 40, 32]
[0128 23:40:31 @_common.py:61] conv1 input: [None, 40, 40, 32]
Tensor("tower0/conv1/output:0", shape=(?, 36, 36, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[0128 23:40:31 @_common.py:70] conv1 output: [None, 36, 36, 32]
[0128 23:40:31 @_common.py:61] pool1 input: [None, 36, 36, 32]
Tensor("tower0/pool1/MaxPool:0", shape=(?, 18, 18, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[0128 23:40:31 @_common.py:70] pool1 output: [None, 18, 18, 32]
[0128 23:40:31 @_common.py:61] conv2 input: [None, 18, 18, 32]
Tensor("tower0/conv2/output:0", shape=(?, 14, 14, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[0128 23:40:31 @_common.py:70] conv2 output: [None, 14, 14, 64]
[0128 23:40:31 @_common.py:61] pool2 input: [None, 14, 14, 64]
Tensor("tower0/pool2/MaxPool:0", shape=(?, 7, 7, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[0128 23:40:31 @_common.py:70] pool2 output: [None, 7, 7, 64]
[0128 23:40:31 @_common.py:61] conv3 input: [None, 7, 7, 64]
Tensor("tower0/conv3/output:0", shape=(?, 5, 5, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[0128 23:40:31 @_common.py:70] conv3 output: [None, 5, 5, 64]
[0128 23:40:31 @_common.py:61] fc1_0 input: [None, 5, 5, 64]
Tensor("tower0/fc1_0/output:0", shape=(?, 1, 1, 256), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[0128 23:40:31 @_common.py:70] fc1_0 output: [None, 1, 1, 256]
[0128 23:40:32 @_common.py:61] fc-pi input: [None, 256]
Tensor("tower0/fc-pi/output:0", shape=(?, 6), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[0128 23:40:32 @_common.py:70] fc-pi output: [None, 6]
[0128 23:40:32 @_common.py:61] fc-v input: [None, 256]
Traceback (most recent call last):
  File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module>
    server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
    self._server_def.SerializeToString(), status)
  File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 3 was not defined in job "worker"
Traceback (most recent call last):
  File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module>
    server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
    self._server_def.SerializeToString(), status)
  File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 2 was not defined in job "worker"
Tensor("tower0/fc-v/output:0", shape=(?, 1), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[0128 23:40:32 @_common.py:70] fc-v output: [None, 1]
Traceback (most recent call last):
  File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module>
    server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
    self._server_def.SerializeToString(), status)
  File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 3 was not defined in job "worker"
MOVING_SUMMARY_VARIABLES
[]
[0128 23:40:33 @modelutils.py:22] Model Parameters:
conv0/W:0: shape=[5, 5, 16, 32], dim=12800
conv1/W:0: shape=[5, 5, 32, 32], dim=25600
conv2/W:0: shape=[5, 5, 32, 64], dim=51200
conv3/W:0: shape=[3, 3, 64, 64], dim=36864
fc1_0/W:0: shape=[5, 5, 64, 256], dim=409600
fc-pi/W:0: shape=[256, 6], dim=1536
fc-pi/b:0: shape=[6], dim=6
fc-v/W:0: shape=[256, 1], dim=256
fc-v/b:0: shape=[1], dim=1
Total param=537863 (2.051785 MB assuming all float32)
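The totals in the parameter summary above can be checked by hand; a quick sketch (plain Python, shapes copied verbatim from the log):

```python
# Verify the "Total param=537863 (2.051785 MB)" line from the model summary.
shapes = [
    [5, 5, 16, 32],   # conv0/W
    [5, 5, 32, 32],   # conv1/W
    [5, 5, 32, 64],   # conv2/W
    [3, 3, 64, 64],   # conv3/W
    [5, 5, 64, 256],  # fc1_0/W
    [256, 6],         # fc-pi/W
    [6],              # fc-pi/b
    [256, 1],         # fc-v/W
    [1],              # fc-v/b
]

def num_elements(shape):
    n = 1
    for d in shape:
        n *= d
    return n

total = sum(num_elements(s) for s in shapes)
mb = total * 4 / 1048576.0  # 4 bytes per parameter, assuming float32
print(total, mb)  # 537863 parameters, ~2.051785 MB, matching the log
```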
Traceback (most recent call last):
  File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module>
    server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
    self._server_def.SerializeToString(), status)
  File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 2 was not defined in job "worker"
Traceback (most recent call last):
  File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module>
    server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
    self._server_def.SerializeToString(), status)
  File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 3 was not defined in job "worker"
2018-01-28 23:40:34.823812: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-28 23:40:34.823850: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-28 23:40:34.823859: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-01-28 23:40:34.823867: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-28 23:40:34.823874: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2018-01-28 23:40:34.834021: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:215] Initialize GrpcChannelCache for job ps -> {0 -> p1340:45394}
2018-01-28 23:40:34.834060: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:215] Initialize GrpcChannelCache for job worker -> {0 -> p1341:45395, 1 -> localhost:45395}
2018-01-28 23:40:34.835825: I tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:316] Started server with target: grpc://localhost:45395
[2018-01-28 23:40:34,837] Making new env: Breakout-v0
Traceback (most recent call last):
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module>
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
self._server_def.SerializeToString(), status)
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 2 was not defined in job "worker"
{'ps': ['p1340:45394'], 'worker': ['p1341:45395', 'p1343:45395']}
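This cluster spec is the root cause of the repeated failures above: it defines only worker tasks 0 and 1, yet the job launches on five nodes, so the processes started with `task_index` 2 and 3 cannot construct their `tf.train.Server` and die with `InvalidArgumentError`. A minimal sketch (not TensorFlow's actual implementation, and `validate_task` is a hypothetical helper) of the consistency check that fails:

```python
# Every (job_name, task_index) a process starts with must exist in the
# cluster spec shared by all processes; otherwise tf.train.Server raises
# InvalidArgumentError, as seen throughout this log.
cluster_spec = {
    "ps": ["p1340:45394"],
    "worker": ["p1341:45395", "p1343:45395"],  # only tasks 0 and 1 defined
}

def validate_task(cluster, job_name, task_index):
    """Return the address for (job_name, task_index), or raise if the
    task is absent from the cluster spec (mirrors TF's error message)."""
    tasks = cluster.get(job_name, [])
    if not (0 <= task_index < len(tasks)):
        raise ValueError(
            'Task %d was not defined in job "%s"' % (task_index, job_name))
    return tasks[task_index]

validate_task(cluster_spec, "worker", 1)   # ok: p1343:45395
try:
    validate_task(cluster_spec, "worker", 3)   # reproduces the crash above
except ValueError as e:
    print(e)
```

The fix is to make the spec and the launch agree: either list one `host:port` entry per worker process actually started, or start only as many workers as the spec defines.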
[worker:1] Starting the TF server
args.mkl == 0
using tensorflow convolution
[2018-01-28 23:40:35,057] Making new env: Breakout-v0
[2018-01-28 23:40:35,062] Making new env: Breakout-v0
[2018-01-28 23:40:35,068] Making new env: Breakout-v0
[2018-01-28 23:40:35,073] Making new env: Breakout-v0
[2018-01-28 23:40:35,079] Making new env: Breakout-v0
[2018-01-28 23:40:35,086] Making new env: Breakout-v0
[2018-01-28 23:40:35,092] Making new env: Breakout-v0
[2018-01-28 23:40:35,098] Making new env: Breakout-v0
[2018-01-28 23:40:35,104] Making new env: Breakout-v0
[2018-01-28 23:40:35,111] Making new env: Breakout-v0
[2018-01-28 23:40:35,117] Making new env: Breakout-v0
[2018-01-28 23:40:35,123] Making new env: Breakout-v0
[2018-01-28 23:40:35,129] Making new env: Breakout-v0
[2018-01-28 23:40:35,135] Making new env: Breakout-v0
[2018-01-28 23:40:35,142] Making new env: Breakout-v0
[2018-01-28 23:40:35,148] Making new env: Breakout-v0
[2018-01-28 23:40:35,155] Making new env: Breakout-v0
[2018-01-28 23:40:35,161] Making new env: Breakout-v0
[2018-01-28 23:40:35,167] Making new env: Breakout-v0
[2018-01-28 23:40:35,173] Making new env: Breakout-v0
[2018-01-28 23:40:35,180] Making new env: Breakout-v0
[2018-01-28 23:40:35,187] Making new env: Breakout-v0
[2018-01-28 23:40:35,194] Making new env: Breakout-v0
[2018-01-28 23:40:35,200] Making new env: Breakout-v0
Traceback (most recent call last):
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module>
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
self._server_def.SerializeToString(), status)
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 3 was not defined in job "worker"
[2018-01-28 23:40:35,207] Making new env: Breakout-v0
[2018-01-28 23:40:35,214] Making new env: Breakout-v0
[2018-01-28 23:40:35,220] Making new env: Breakout-v0
[2018-01-28 23:40:35,227] Making new env: Breakout-v0
[2018-01-28 23:40:35,233] Making new env: Breakout-v0
[2018-01-28 23:40:35,240] Making new env: Breakout-v0
[2018-01-28 23:40:35,246] Making new env: Breakout-v0
[2018-01-28 23:40:35,253] Making new env: Breakout-v0
[2018-01-28 23:40:35,260] Making new env: Breakout-v0
[2018-01-28 23:40:35,267] Making new env: Breakout-v0
[2018-01-28 23:40:35,273] Making new env: Breakout-v0
[2018-01-28 23:40:35,280] Making new env: Breakout-v0
[2018-01-28 23:40:35,287] Making new env: Breakout-v0
[2018-01-28 23:40:35,295] Making new env: Breakout-v0
[2018-01-28 23:40:35,302] Making new env: Breakout-v0
[2018-01-28 23:40:35,309] Making new env: Breakout-v0
[2018-01-28 23:40:35,316] Making new env: Breakout-v0
[2018-01-28 23:40:35,322] Making new env: Breakout-v0
[0128 23:40:35 @multigpu.py:228] Setup callbacks ...
[2018-01-28 23:40:35,329] Making new env: Breakout-v0
Creating Predictorfactor 0
[0128 23:40:35 @base.py:132] Building graph for predictor tower 0...
===== [p1341] PRINTING BUILD GRAPH STACK AT 1517179235.33============== File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 727, in <module>
trainer.train()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 230, in train
callbacks.setup_graph(self) # TODO use weakref instead?
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph
self._setup_graph()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/group.py", line 66, in _setup_graph
cb.setup_graph(self.trainer)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph
self._setup_graph()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 366, in _setup_graph
self.trainer.get_predict_funcs(['state'], ['logitsT', 'pred_value'], self.predictor_threads),
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 331, in get_predict_funcs
return [self.get_predict_func(input_names, output_names, k) for k in range(n)]
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 328, in get_predict_func
return self.predictor_factory.get_predictor(input_names, output_names, tower)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 54, in get_predictor
self._build_predict_tower()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 71, in _build_predict_tower
self.model, self.towers)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/predict/base.py", line 134, in build_multi_tower_prediction_graph
model.build_graph(input_vars)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/models/model_desc.py", line 140, in build_graph
self._build_graph(model_inputs)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 278, in _build_graph
traceback.print_stack(file=sys.stderr)
[2018-01-28 23:40:35,336] Making new env: Breakout-v0
[2018-01-28 23:40:35,343] Making new env: Breakout-v0
12
[2018-01-28 23:40:35,350] Making new env: Breakout-v0
[2018-01-28 23:40:35,357] Making new env: Breakout-v0
[2018-01-28 23:40:35,364] Making new env: Breakout-v0
Tensor("towerp0/conv0/output:0", shape=(?, 80, 80, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[2018-01-28 23:40:35,371] Making new env: Breakout-v0
Tensor("towerp0/pool0/MaxPool:0", shape=(?, 40, 40, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0/conv1/output:0", shape=(?, 36, 36, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[2018-01-28 23:40:35,378] Making new env: Breakout-v0
Tensor("towerp0/pool1/MaxPool:0", shape=(?, 18, 18, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0/conv2/output:0", shape=(?, 14, 14, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[2018-01-28 23:40:35,386] Making new env: Breakout-v0
Tensor("towerp0/pool2/MaxPool:0", shape=(?, 7, 7, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0/conv3/output:0", shape=(?, 5, 5, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[2018-01-28 23:40:35,393] Making new env: Breakout-v0
Tensor("towerp0/fc1_0/output:0", shape=(?, 1, 1, 256), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[2018-01-28 23:40:35,400] Making new env: Breakout-v0
[2018-01-28 23:40:35,408] Making new env: Breakout-v0
Tensor("towerp0/fc-pi/output:0", shape=(?, 6), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[2018-01-28 23:40:35,415] Making new env: Breakout-v0
Tensor("towerp0/fc-v/output:0", shape=(?, 1), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[2018-01-28 23:40:35,422] Making new env: Breakout-v0
[2018-01-28 23:40:35,429] Making new env: Breakout-v0
[2018-01-28 23:40:35,436] Making new env: Breakout-v0
[2018-01-28 23:40:35,442] Making new env: Breakout-v0
[2018-01-28 23:40:35,449] Making new env: Breakout-v0
[2018-01-28 23:40:35,456] Making new env: Breakout-v0
[0128 23:40:35 @base.py:132] Building graph for predictor tower 0...
===== [p1341] PRINTING BUILD GRAPH STACK AT 1517179235.46============== File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 727, in <module>
trainer.train()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 230, in train
callbacks.setup_graph(self) # TODO use weakref instead?
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph
self._setup_graph()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/group.py", line 66, in _setup_graph
cb.setup_graph(self.trainer)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph
self._setup_graph()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 366, in _setup_graph
self.trainer.get_predict_funcs(['state'], ['logitsT', 'pred_value'], self.predictor_threads),
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 331, in get_predict_funcs
return [self.get_predict_func(input_names, output_names, k) for k in range(n)]
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 328, in get_predict_func
return self.predictor_factory.get_predictor(input_names, output_names, tower)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 54, in get_predictor
self._build_predict_tower()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 71, in _build_predict_tower
self.model, self.towers)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/predict/base.py", line 134, in build_multi_tower_prediction_graph
model.build_graph(input_vars)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/models/model_desc.py", line 140, in build_graph
self._build_graph(model_inputs)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 278, in _build_graph
traceback.print_stack(file=sys.stderr)
[2018-01-28 23:40:35,464] Making new env: Breakout-v0
[2018-01-28 23:40:35,471] Making new env: Breakout-v0
12
[2018-01-28 23:40:35,478] Making new env: Breakout-v0
[2018-01-28 23:40:35,485] Making new env: Breakout-v0
[2018-01-28 23:40:35,493] Making new env: Breakout-v0
Tensor("towerp0_1/conv0/output:0", shape=(?, 80, 80, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0_1/pool0/MaxPool:0", shape=(?, 40, 40, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0_1/conv1/output:0", shape=(?, 36, 36, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[2018-01-28 23:40:35,500] Making new env: Breakout-v0
Tensor("towerp0_1/pool1/MaxPool:0", shape=(?, 18, 18, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0_1/conv2/output:0", shape=(?, 14, 14, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[2018-01-28 23:40:35,508] Making new env: Breakout-v0
Tensor("towerp0_1/pool2/MaxPool:0", shape=(?, 7, 7, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0_1/conv3/output:0", shape=(?, 5, 5, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[2018-01-28 23:40:35,515] Making new env: Breakout-v0
Tensor("towerp0_1/fc1_0/output:0", shape=(?, 1, 1, 256), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[2018-01-28 23:40:35,523] Making new env: Breakout-v0
[2018-01-28 23:40:35,530] Making new env: Breakout-v0
Tensor("towerp0_1/fc-pi/output:0", shape=(?, 6), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[2018-01-28 23:40:35,537] Making new env: Breakout-v0
Tensor("towerp0_1/fc-v/output:0", shape=(?, 1), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[2018-01-28 23:40:35,545] Making new env: Breakout-v0
[2018-01-28 23:40:35,552] Making new env: Breakout-v0
[2018-01-28 23:40:35,559] Making new env: Breakout-v0
[2018-01-28 23:40:35,566] Making new env: Breakout-v0
[2018-01-28 23:40:35,574] Making new env: Breakout-v0
[2018-01-28 23:40:35,581] Making new env: Breakout-v0
[0128 23:40:35 @base.py:132] Building graph for predictor tower 0...
===== [p1341] PRINTING BUILD GRAPH STACK AT 1517179235.58============== File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 727, in <module>
trainer.train()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 230, in train
callbacks.setup_graph(self) # TODO use weakref instead?
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph
self._setup_graph()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/group.py", line 66, in _setup_graph
cb.setup_graph(self.trainer)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph
self._setup_graph()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 366, in _setup_graph
self.trainer.get_predict_funcs(['state'], ['logitsT', 'pred_value'], self.predictor_threads),
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 331, in get_predict_funcs
return [self.get_predict_func(input_names, output_names, k) for k in range(n)]
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 328, in get_predict_func
return self.predictor_factory.get_predictor(input_names, output_names, tower)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 54, in get_predictor
self._build_predict_tower()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 71, in _build_predict_tower
self.model, self.towers)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/predict/base.py", line 134, in build_multi_tower_prediction_graph
model.build_graph(input_vars)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/models/model_desc.py", line 140, in build_graph
self._build_graph(model_inputs)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 278, in _build_graph
traceback.print_stack(file=sys.stderr)
[2018-01-28 23:40:35,589] Making new env: Breakout-v0
[2018-01-28 23:40:35,596] Making new env: Breakout-v0
12
[2018-01-28 23:40:35,603] Making new env: Breakout-v0
[2018-01-28 23:40:35,611] Making new env: Breakout-v0
[2018-01-28 23:40:35,619] Making new env: Breakout-v0
Tensor("towerp0_2/conv0/output:0", shape=(?, 80, 80, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0_2/pool0/MaxPool:0", shape=(?, 40, 40, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0_2/conv1/output:0", shape=(?, 36, 36, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[2018-01-28 23:40:35,626] Making new env: Breakout-v0
Tensor("towerp0_2/pool1/MaxPool:0", shape=(?, 18, 18, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[2018-01-28 23:40:35,634] Making new env: Breakout-v0
Tensor("towerp0_2/conv2/output:0", shape=(?, 14, 14, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0_2/pool2/MaxPool:0", shape=(?, 7, 7, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[2018-01-28 23:40:35,641] Making new env: Breakout-v0
Tensor("towerp0_2/conv3/output:0", shape=(?, 5, 5, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0_2/fc1_0/output:0", shape=(?, 1, 1, 256), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[2018-01-28 23:40:35,648] Making new env: Breakout-v0
[2018-01-28 23:40:35,656] Making new env: Breakout-v0
[2018-01-28 23:40:35,663] Making new env: Breakout-v0
Tensor("towerp0_2/fc-pi/output:0", shape=(?, 6), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[2018-01-28 23:40:35,670] Making new env: Breakout-v0
Tensor("towerp0_2/fc-v/output:0", shape=(?, 1), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[2018-01-28 23:40:35,678] Making new env: Breakout-v0
[2018-01-28 23:40:35,685] Making new env: Breakout-v0
[2018-01-28 23:40:35,693] Making new env: Breakout-v0
[2018-01-28 23:40:35,701] Making new env: Breakout-v0
[2018-01-28 23:40:35,708] Making new env: Breakout-v0
[2018-01-28 23:40:35,716] Making new env: Breakout-v0
[0128 23:40:35 @base.py:177] ===============================================================
[0128 23:40:35 @base.py:179] CHIEF!
[0128 23:40:35 @base.py:180] [p1341] Creating the session
[0128 23:40:35 @base.py:181] ===============================================================
[2018-01-28 23:40:35,724] Making new env: Breakout-v0
[2018-01-28 23:40:35,731] Making new env: Breakout-v0
[2018-01-28 23:40:35,739] Making new env: Breakout-v0
[2018-01-28 23:40:35,747] Making new env: Breakout-v0
None <type 'NoneType'>
worker host: grpc://localhost:45395
[0128 23:40:35 @train.py:717] [BA3C] Train on gpu 0 and infer on gpu 0,0,0
[0128 23:40:35 @train.py:723] using async version
DUMMY PREDICTOR 0
MultiGPUTrainer __init__ dummy = 0
[0128 23:40:35 @multigpu.py:57] Training a model of 1 tower
[0128 23:40:35 @multigpu.py:67] Building graph for training tower 0..., /cpu:0
===== [p1343] PRINTING BUILD GRAPH STACK AT 1517179235.78============== File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 727, in <module>
trainer.train()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 137, in train
grad_list = self._multi_tower_grads()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 81, in _multi_tower_grads
self.model.build_graph(model_inputs)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/models/model_desc.py", line 140, in build_graph
self._build_graph(model_inputs)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 278, in _build_graph
traceback.print_stack(file=sys.stderr)
12
[0128 23:40:35 @_common.py:61] conv0 input: [None, 84, 84, 16]
Tensor("tower0/conv0/output:0", shape=(?, 80, 80, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
[0128 23:40:35 @_common.py:70] conv0 output: [None, 80, 80, 32]
[0128 23:40:35 @_common.py:61] pool0 input: [None, 80, 80, 32]
Tensor("tower0/pool0/MaxPool:0", shape=(?, 40, 40, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
[0128 23:40:35 @_common.py:70] pool0 output: [None, 40, 40, 32]
[0128 23:40:35 @_common.py:61] conv1 input: [None, 40, 40, 32]
Tensor("tower0/conv1/output:0", shape=(?, 36, 36, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
[0128 23:40:35 @_common.py:70] conv1 output: [None, 36, 36, 32]
[0128 23:40:35 @_common.py:61] pool1 input: [None, 36, 36, 32]
Tensor("tower0/pool1/MaxPool:0", shape=(?, 18, 18, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
[0128 23:40:35 @_common.py:70] pool1 output: [None, 18, 18, 32]
[0128 23:40:35 @_common.py:61] conv2 input: [None, 18, 18, 32]
Traceback (most recent call last):
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module>
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
self._server_def.SerializeToString(), status)
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 2 was not defined in job "worker"
Tensor("tower0/conv2/output:0", shape=(?, 14, 14, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0)
[0128 23:40:35 @_common.py:70] conv2 output: [None, 14, 14, 64]
[0128 23:40:35 @_common.py:61] pool2 input: [None, 14, 14, 64]
Tensor("tower0/pool2/MaxPool:0", shape=(?, 7, 7, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0)
[0128 23:40:35 @_common.py:70] pool2 output: [None, 7, 7, 64]
[0128 23:40:35 @_common.py:61] conv3 input: [None, 7, 7, 64]
Tensor("tower0/conv3/output:0", shape=(?, 5, 5, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0)
[0128 23:40:35 @_common.py:70] conv3 output: [None, 5, 5, 64]
[0128 23:40:35 @_common.py:61] fc1_0 input: [None, 5, 5, 64]
Tensor("tower0/fc1_0/output:0", shape=(?, 1, 1, 256), dtype=float32, device=/job:worker/task:1/device:CPU:0)
[0128 23:40:35 @_common.py:70] fc1_0 output: [None, 1, 1, 256]
[0128 23:40:35 @_common.py:61] fc-pi input: [None, 256]
Tensor("tower0/fc-pi/output:0", shape=(?, 6), dtype=float32, device=/job:worker/task:1/device:CPU:0)
[0128 23:40:35 @_common.py:70] fc-pi output: [None, 6]
[0128 23:40:35 @_common.py:61] fc-v input: [None, 256]
Tensor("tower0/fc-v/output:0", shape=(?, 1), dtype=float32, device=/job:worker/task:1/device:CPU:0)
[0128 23:40:36 @_common.py:70] fc-v output: [None, 1]
Traceback (most recent call last):
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module>
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
self._server_def.SerializeToString(), status)
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 3 was not defined in job "worker"
2018-01-28 23:40:36.604219: I tensorflow/core/distributed_runtime/master_session.cc:999] Start master session a398c17478e77696 with config:
[0128 23:40:36 @base.py:189] ===============================================================
[0128 23:40:36 @base.py:190] [p1341] Session created
[0128 23:40:36 @base.py:191] ===============================================================
[0128 23:40:36 @base.py:112] [p1341] Initializing graph variables ...
[0128 23:40:36 @base.py:119] [p1341] Starting concurrency...
[0128 23:40:36 @base.py:198] Starting all threads & procs ...
[0128 23:40:36 @base.py:122] [p1341] Setting default session
[0128 23:40:36 @base.py:125] [p1341] Getting global step
[0128 23:40:36 @base.py:127] [p1341] Start training with global_step=0
MOVING_SUMMARY_VARIABLES
[]
Traceback (most recent call last):
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module>
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
self._server_def.SerializeToString(), status)
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 2 was not defined in job "worker"
server main loop
before socket bind... tcp://*:35083
receiving
[0128 23:40:36 @modelutils.py:22] Model Parameters:
conv0/W:0: shape=[5, 5, 16, 32], dim=12800
conv1/W:0: shape=[5, 5, 32, 32], dim=25600
conv2/W:0: shape=[5, 5, 32, 64], dim=51200
conv3/W:0: shape=[3, 3, 64, 64], dim=36864
fc1_0/W:0: shape=[5, 5, 64, 256], dim=409600
fc-pi/W:0: shape=[256, 6], dim=1536
fc-pi/b:0: shape=[6], dim=6
fc-v/W:0: shape=[256, 1], dim=256
fc-v/b:0: shape=[1], dim=1
Total param=537863 (2.051785 MB assuming all float32)
Traceback (most recent call last):
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module>
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
self._server_def.SerializeToString(), status)
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 3 was not defined in job "worker"
Traceback (most recent call last):
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module>
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
self._server_def.SerializeToString(), status)
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 2 was not defined in job "worker"
Traceback (most recent call last):
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 75, in <module>
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
self._server_def.SerializeToString(), status)
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 3 was not defined in job "worker"
[0128 23:40:38 @multigpu.py:228] Setup callbacks ...
Creating Predictorfactor 0
[0128 23:40:38 @base.py:132] Building graph for predictor tower 0...
===== [p1343] PRINTING BUILD GRAPH STACK AT 1517179238.46============== File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 727, in <module>
trainer.train()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 230, in train
callbacks.setup_graph(self) # TODO use weakref instead?
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph
self._setup_graph()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/group.py", line 66, in _setup_graph
cb.setup_graph(self.trainer)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph
self._setup_graph()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 366, in _setup_graph
self.trainer.get_predict_funcs(['state'], ['logitsT', 'pred_value'], self.predictor_threads),
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 331, in get_predict_funcs
return [self.get_predict_func(input_names, output_names, k) for k in range(n)]
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 328, in get_predict_func
return self.predictor_factory.get_predictor(input_names, output_names, tower)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 54, in get_predictor
self._build_predict_tower()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 71, in _build_predict_tower
self.model, self.towers)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/predict/base.py", line 134, in build_multi_tower_prediction_graph
model.build_graph(input_vars)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/models/model_desc.py", line 140, in build_graph
self._build_graph(model_inputs)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 278, in _build_graph
traceback.print_stack(file=sys.stderr)
12
Tensor("towerp0/conv0/output:0", shape=(?, 80, 80, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0/pool0/MaxPool:0", shape=(?, 40, 40, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0/conv1/output:0", shape=(?, 36, 36, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0/pool1/MaxPool:0", shape=(?, 18, 18, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0/conv2/output:0", shape=(?, 14, 14, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0/pool2/MaxPool:0", shape=(?, 7, 7, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0/conv3/output:0", shape=(?, 5, 5, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0/fc1_0/output:0", shape=(?, 1, 1, 256), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0/fc-pi/output:0", shape=(?, 6), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0/fc-v/output:0", shape=(?, 1), dtype=float32, device=/job:worker/task:1/device:CPU:0)
[0128 23:40:38 @base.py:132] Building graph for predictor tower 0...
===== [p1343] PRINTING BUILD GRAPH STACK AT 1517179238.6============== File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 727, in <module>
    (identical build-graph stack to the first dump above)
12
Tensor("towerp0_1/conv0/output:0", shape=(?, 80, 80, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_1/pool0/MaxPool:0", shape=(?, 40, 40, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_1/conv1/output:0", shape=(?, 36, 36, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_1/pool1/MaxPool:0", shape=(?, 18, 18, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_1/conv2/output:0", shape=(?, 14, 14, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_1/pool2/MaxPool:0", shape=(?, 7, 7, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_1/conv3/output:0", shape=(?, 5, 5, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_1/fc1_0/output:0", shape=(?, 1, 1, 256), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_1/fc-pi/output:0", shape=(?, 6), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_1/fc-v/output:0", shape=(?, 1), dtype=float32, device=/job:worker/task:1/device:CPU:0)
[0128 23:40:38 @base.py:132] Building graph for predictor tower 0...
===== [p1343] PRINTING BUILD GRAPH STACK AT 1517179238.73============== File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 727, in <module>
    (identical build-graph stack to the first dump above)
12
Tensor("towerp0_2/conv0/output:0", shape=(?, 80, 80, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_2/pool0/MaxPool:0", shape=(?, 40, 40, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_2/conv1/output:0", shape=(?, 36, 36, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_2/pool1/MaxPool:0", shape=(?, 18, 18, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_2/conv2/output:0", shape=(?, 14, 14, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_2/pool2/MaxPool:0", shape=(?, 7, 7, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_2/conv3/output:0", shape=(?, 5, 5, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_2/fc1_0/output:0", shape=(?, 1, 1, 256), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_2/fc-pi/output:0", shape=(?, 6), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_2/fc-v/output:0", shape=(?, 1), dtype=float32, device=/job:worker/task:1/device:CPU:0)
[0128 23:40:38 @base.py:177] ===============================================================
[0128 23:40:38 @base.py:180] [p1343] Creating the session
[0128 23:40:38 @base.py:181] ===============================================================
[2018-01-28 23:40:38,884] Making new env: Breakout-v0
{'ps': ['p1340:45394'], 'worker': ['p1341:45395', 'p1343:45395']}
[worker:2] Starting the TF server
========= EXCEPTION WHILE STARTING TF SERVER [p1347] =====
(the two lines above repeat 10 times in total)
args.mkl == 0
using tensorflow convolution
[2018-01-28 23:40:39,212] Making new env: Breakout-v0
{'ps': ['p1340:45394'], 'worker': ['p1341:45395', 'p1343:45395']}
[worker:3] Starting the TF server
========= EXCEPTION WHILE STARTING TF SERVER [p1348] =====
(the two lines above repeat 10 times in total)
args.mkl == 0
using tensorflow convolution
2018-01-28 23:40:39.673658: I tensorflow/core/distributed_runtime/master_session.cc:999] Start master session 3cea52acb9284400 with config:
[0128 23:40:39 @base.py:189] ===============================================================
[0128 23:40:39 @base.py:190] [p1343] Session created
[0128 23:40:39 @base.py:191] ===============================================================
[0128 23:40:39 @base.py:112] [p1343] Initializing graph variables ...
[0128 23:40:39 @base.py:119] [p1343] Starting concurrency...
[0128 23:40:39 @base.py:198] Starting all threads & procs ...
[0128 23:40:39 @base.py:122] [p1343] Setting default session
[0128 23:40:39 @base.py:125] [p1343] Getting global step
[0128 23:40:39 @base.py:127] [p1343] Start training with global_step=0
[2018-01-28 23:40:42,460] Making new env: Breakout-v0
("Making new env: Breakout-v0" repeated 160 more times, timestamps 23:40:42,463 through 23:40:42,995)
[0128 23:40:42 @multigpu.py:323] ERR [p1343] step: count(1), step_time 6033.08, mean_step_time 6033.08, it/s 0.17
("Making new env: Breakout-v0" repeated 39 more times, timestamps 23:40:42,999 through 23:40:43,135)
None <type 'NoneType'>
Traceback (most recent call last):
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 711, in <module>
config = get_config(args, is_chief, my_task_index, chief_worker_hostname, len(cluster['worker']))
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 570, in get_config
'worker_host' : server.target,
NameError: global name 'server' is not defined
None <type 'NoneType'>
Traceback (most recent call last):
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 711, in <module>
config = get_config(args, is_chief, my_task_index, chief_worker_hostname, len(cluster['worker']))
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 570, in get_config
'worker_host' : server.target,
NameError: global name 'server' is not defined
DONE
DONE
[0128 23:40:45 @multigpu.py:323] ERR [p1343] step: count(2), step_time 2104.99, mean_step_time 4069.04, it/s 0.25
[0128 23:40:46 @multigpu.py:323] ERR [p1343] step: count(3), step_time 1081.13, mean_step_time 3073.07, it/s 0.33
[0128 23:40:47 @multigpu.py:323] ERR [p1343] step: count(4), step_time 1069.16, mean_step_time 2572.09, it/s 0.39
[0128 23:40:48 @multigpu.py:323] ERR [p1343] step: count(5), step_time 1080.06, mean_step_time 2273.69, it/s 0.44
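The two failures above look connected: the cluster spec printed by the workers (`{'ps': ['p1340:45394'], 'worker': ['p1341:45395', 'p1343:45395']}`) contains only two worker entries, yet tasks 2 (p1347) and 3 (p1348) also try to start a TF server, and each attempt logs `EXCEPTION WHILE STARTING TF SERVER`. If the `tf.train.Server` constructor raises and the surrounding `except` block only logs, `server` is never bound, so the later `server.target` access in `get_config` fails with exactly the `NameError` shown. This is a minimal pure-Python sketch of that suspected pattern, not the actual `train.py` code; the function name and arguments are hypothetical:

```python
def start_tf_server(task_index, num_workers):
    """Sketch of the suspected failure mode: the server constructor raises,
    the except block only logs, and a real server is never assigned.

    In the real code `server` would simply be unbound after the except
    block, so a later `server.target` raises NameError; here we return
    None instead to keep the sketch testable.
    """
    server = None
    try:
        if task_index >= num_workers:
            # Stand-in for tf.train.Server(...) raising when the task
            # index has no matching entry in the cluster spec.
            raise ValueError("task index %d not in cluster spec" % task_index)
        server = object()  # stand-in for a successfully created server
    except Exception:
        print("========= EXCEPTION WHILE STARTING TF SERVER =====")
    return server

# Tasks 0 and 1 appear in the two-entry worker list; tasks 2 and 3 do not.
assert start_tf_server(1, num_workers=2) is not None
assert start_tf_server(2, num_workers=2) is None  # later use of the server fails
```

Under this reading, the fix would be either to include all worker nodes in the cluster spec passed to each task, or to re-raise (or exit on) the server-start exception instead of continuing to `get_config` with an unbound `server`.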