Save AdamStelmaszczyk/20f2ac18e4621c437c1021719c27d368 to your computer and use it in GitHub Desktop.
InvalidArgumentError 2
$ python run_job.py -n 5 -g 60 -c 12 --use_sync --name neptune_job_name clear^C
[prometheus][plghenrykm@login01 src]$ vim OpenAIGym/train.py
[prometheus][plghenrykm@login01 src]$ python run_job.py -n 5 -g 60 -c 12 --use_sync --name neptune_job_name
args.offline: False
('bash command: ', 'srun -A luna -N 5 -n 5 -c 12 -t 6:00:00 distributed_tensorpack_mkl.sh 17351 9236 Breakout-v0 adam 1 "3nodes 12cores" "neptune_job_name_1517240471.8" 0.00015 128 60 0 None 256 100 1 uniform normal False . True 1 False /net/archive/groups/plggluna/intel_2/logs/ 1e-08 0.9 0.999 0 False False False False 120 False')
SLURM_JOB_ID 9495449 ; SLURM_JOB_NAME distributed_tensorpack_mkl.sh ; SLURM_JOB_NODELIST p[1567-1568,1577,1580,1584] ; SLURMD_NODENAME p1584 ; SLURM_JOB_NUM_NODES 5
SLURM_JOB_ID 9495449 ; SLURM_JOB_NAME distributed_tensorpack_mkl.sh ; SLURM_JOB_NODELIST p[1567-1568,1577,1580,1584] ; SLURMD_NODENAME p1580 ; SLURM_JOB_NUM_NODES 5
mkdir: cannot create directory ‘/net/archive/groups/plggluna/adam/experiments/neptune_job_name_1517240471.8’: File exists
SLURM_JOB_ID 9495449 ; SLURM_JOB_NAME distributed_tensorpack_mkl.sh ; SLURM_JOB_NODELIST p[1567-1568,1577,1580,1584] ; SLURMD_NODENAME p1577 ; SLURM_JOB_NUM_NODES 5
SLURM_JOB_ID 9495449 ; SLURM_JOB_NAME distributed_tensorpack_mkl.sh ; SLURM_JOB_NODELIST p[1567-1568,1577,1580,1584] ; SLURMD_NODENAME p1567 ; SLURM_JOB_NUM_NODES 5
SLURM_JOB_ID 9495449 ; SLURM_JOB_NAME distributed_tensorpack_mkl.sh ; SLURM_JOB_NODELIST p[1567-1568,1577,1580,1584] ; SLURMD_NODENAME p1568 ; SLURM_JOB_NUM_NODES 5
mkdir: cannot create directory ‘/net/archive/groups/plggluna/adam/experiments/neptune_job_name_1517240471.8/models/’: File exists
mkdir: cannot create directory ‘/net/archive/groups/plggluna/adam/experiments/neptune_job_name_1517240471.8/storage/’: File exists
plgrid/libs/qt/5.4.1 loaded.
plgrid/libs/qt/5.4.1 loaded.
plgrid/libs/qt/5.4.1 loaded.
plgrid/libs/qt/5.4.1 loaded.
plgrid/libs/mkl/11.3.1 loaded.
plgrid/libs/mkl/11.3.1 loaded.
plgrid/libs/mkl/11.3.1 loaded.
plgrid/libs/mkl/11.3.1 loaded.
plgrid/libs/qt/5.4.1 loaded.
plgrid/libs/mkl/11.3.1 loaded.
plgrid/tools/gcc/4.9.2 loaded.
plgrid/tools/intel/15.0.2 loaded.
plgrid/tools/gcc/4.9.2 loaded.
plgrid/tools/intel/15.0.2 loaded.
plgrid/tools/gcc/4.9.2 loaded.
plgrid/tools/intel/15.0.2 loaded.
plgrid/tools/gcc/4.9.2 loaded.
plgrid/tools/intel/15.0.2 loaded.
plgrid/tools/tcltk/8.5.19-threads loaded.
plgrid/tools/python/2.7.13 loaded.
plgrid/tools/tcltk/8.5.19-threads loaded.
plgrid/tools/python/2.7.13 loaded.
plgrid/tools/tcltk/8.5.19-threads loaded.
plgrid/tools/python/2.7.13 loaded.
plgrid/tools/tcltk/8.5.19-threads loaded.
plgrid/tools/python/2.7.13 loaded.
plgrid/tools/gcc/4.9.2 loaded.
plgrid/tools/intel/15.0.2 loaded.
plgrid/tools/tcltk/8.5.19-threads loaded.
plgrid/tools/python/2.7.13 loaded.
plgrid/libs/mkl/11.3.1 unloaded.
plgrid/libs/mkl/2017.0.0 loaded.
The following have been reloaded with a version change:
1) plgrid/libs/mkl/11.3.1 => plgrid/libs/mkl/2017.0.0
plgrid/libs/mkl/11.3.1 unloaded.
plgrid/libs/mkl/2017.0.0 loaded.
The following have been reloaded with a version change:
1) plgrid/libs/mkl/11.3.1 => plgrid/libs/mkl/2017.0.0
plgrid/libs/mkl/11.3.1 unloaded.
plgrid/libs/mkl/11.3.1 unloaded.
tools/gcc/6.2.0 loaded.
plgrid/libs/mkl/2017.0.0 loaded.
The following have been reloaded with a version change:
1) plgrid/libs/mkl/11.3.1 => plgrid/libs/mkl/2017.0.0
plgrid/libs/mkl/2017.0.0 loaded.
The following have been reloaded with a version change:
1) plgrid/libs/mkl/11.3.1 => plgrid/libs/mkl/2017.0.0
plgrid/libs/mkl/11.3.1 unloaded.
plgrid/libs/mkl/2017.0.0 loaded.
The following have been reloaded with a version change:
1) plgrid/libs/mkl/11.3.1 => plgrid/libs/mkl/2017.0.0
tools/gcc/6.2.0 loaded.
tools/gcc/6.2.0 loaded.
tools/gcc/6.2.0 loaded.
tools/gcc/6.2.0 loaded.
PROGRAM_ARGS: --mkl 0 --dummy 0 --sync 0 --cpu 1 --artificial_slowdown 0 --queue_size 1 --my_sim_master_queue 1 --train_log_path /net/archive/groups/plggluna/adam/experiments/neptune_job_name_1517240471.8/storage//atari_trainlog/ --predict_batch_size 16 --dummy_predictor 0 --do_train 1 --simulator_procs 100 --env Breakout-v0 --nr_towers 1 --nr_predict_towers 3 --steps_per_epoch 1000 --fc_neurons 256 --batch_size 128 --learning_rate 0.00015 --port 17351 --tf_port 9236 --optimizer adam --use_sync_opt 1 --num_grad 60 --early_stopping None --ps 1 --fc_init uniform --conv_init normal --replace_with_conv True --fc_splits 1 --debug_charts False --epsilon 1e-08 --beta1 0.9 --beta2 0.999 --save_every 0 --models_dir /net/archive/groups/plggluna/adam/experiments/neptune_job_name_1517240471.8/models/ --experiment_dir /net/archive/groups/plggluna/adam/experiments/neptune_job_name_1517240471.8 --adam_debug False --eval_node False --record_node False --schedule_hyper False
OFFLINE: False
(identical PROGRAM_ARGS / OFFLINE output repeated verbatim by the remaining four nodes)
2018-01-29 16:41:21.632560: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-29 16:41:21.632599: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-29 16:41:21.632619: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-01-29 16:41:21.632627: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-29 16:41:21.632634: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2018-01-29 16:41:21.643060: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:215] Initialize GrpcChannelCache for job ps -> {0 -> localhost:9236}
2018-01-29 16:41:21.643096: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:215] Initialize GrpcChannelCache for job worker -> {0 -> p1568:9237, 1 -> p1577:9237}
2018-01-29 16:41:21.644721: I tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:316] Started server with target: grpc://localhost:9236
{'ps': ['p1567:9236'], 'worker': ['p1568:9237', 'p1577:9237']}
[ps:0] Starting the TF server
cluster_spec.as_dict(): {'ps': ['p1567:9236'], 'worker': ['p1568:9237', 'p1577:9237']} tf.__version__: 1.2.1
[0129 16:41:21 @train.py:85] [ps:0] joining the server.
2018-01-29 16:41:21.932919: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-29 16:41:21.932971: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-29 16:41:21.932991: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-01-29 16:41:21.932998: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-29 16:41:21.933005: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2018-01-29 16:41:21.943589: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:215] Initialize GrpcChannelCache for job ps -> {0 -> p1567:9236}
2018-01-29 16:41:21.943628: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:215] Initialize GrpcChannelCache for job worker -> {0 -> localhost:9237, 1 -> p1577:9237}
2018-01-29 16:41:21.945305: I tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:316] Started server with target: grpc://localhost:9237
[2018-01-29 16:41:21,946] Making new env: Breakout-v0
{'ps': ['p1567:9236'], 'worker': ['p1568:9237', 'p1577:9237']}
[worker:0] Starting the TF server
cluster_spec.as_dict(): {'ps': ['p1567:9236'], 'worker': ['p1568:9237', 'p1577:9237']} tf.__version__: 1.2.1
args.mkl == 0
using tensorflow convolution
[2018-01-29 16:41:22,056] Making new env: Breakout-v0
[2018-01-29 16:41:22,062] Making new env: Breakout-v0
[2018-01-29 16:41:22,068] Making new env: Breakout-v0
[2018-01-29 16:41:22,074] Making new env: Breakout-v0
[2018-01-29 16:41:22,080] Making new env: Breakout-v0
[2018-01-29 16:41:22,086] Making new env: Breakout-v0
[2018-01-29 16:41:22,092] Making new env: Breakout-v0
[2018-01-29 16:41:22,098] Making new env: Breakout-v0
[2018-01-29 16:41:22,105] Making new env: Breakout-v0
[2018-01-29 16:41:22,111] Making new env: Breakout-v0
[2018-01-29 16:41:22,118] Making new env: Breakout-v0
[2018-01-29 16:41:22,125] Making new env: Breakout-v0
[2018-01-29 16:41:22,132] Making new env: Breakout-v0
[2018-01-29 16:41:22,139] Making new env: Breakout-v0
[2018-01-29 16:41:22,146] Making new env: Breakout-v0
[2018-01-29 16:41:22,153] Making new env: Breakout-v0
[2018-01-29 16:41:22,160] Making new env: Breakout-v0
2018-01-29 16:41:22.164770: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-29 16:41:22.164826: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-29 16:41:22.164848: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-01-29 16:41:22.164855: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-29 16:41:22.164863: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
[2018-01-29 16:41:22,167] Making new env: Breakout-v0
[2018-01-29 16:41:22,174] Making new env: Breakout-v0
2018-01-29 16:41:22.175896: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:215] Initialize GrpcChannelCache for job ps -> {0 -> p1567:9236}
2018-01-29 16:41:22.175933: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:215] Initialize GrpcChannelCache for job worker -> {0 -> p1568:9237, 1 -> localhost:9237}
2018-01-29 16:41:22.177694: I tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:316] Started server with target: grpc://localhost:9237
[2018-01-29 16:41:22,179] Making new env: Breakout-v0
[2018-01-29 16:41:22,181] Making new env: Breakout-v0
[2018-01-29 16:41:22,188] Making new env: Breakout-v0
[2018-01-29 16:41:22,195] Making new env: Breakout-v0
[2018-01-29 16:41:22,202] Making new env: Breakout-v0
{'ps': ['p1567:9236'], 'worker': ['p1568:9237', 'p1577:9237']}
[worker:1] Starting the TF server
cluster_spec.as_dict(): {'ps': ['p1567:9236'], 'worker': ['p1568:9237', 'p1577:9237']} tf.__version__: 1.2.1
args.mkl == 0
[2018-01-29 16:41:22,208] Making new env: Breakout-v0
[2018-01-29 16:41:22,216] Making new env: Breakout-v0
[2018-01-29 16:41:22,223] Making new env: Breakout-v0
[2018-01-29 16:41:22,230] Making new env: Breakout-v0
[2018-01-29 16:41:22,238] Making new env: Breakout-v0
[2018-01-29 16:41:22,245] Making new env: Breakout-v0
[2018-01-29 16:41:22,252] Making new env: Breakout-v0
[2018-01-29 16:41:22,259] Making new env: Breakout-v0
[2018-01-29 16:41:22,267] Making new env: Breakout-v0
[2018-01-29 16:41:22,274] Making new env: Breakout-v0
using tensorflow convolution
[2018-01-29 16:41:22,282] Making new env: Breakout-v0
[2018-01-29 16:41:22,288] Making new env: Breakout-v0
[2018-01-29 16:41:22,290] Making new env: Breakout-v0
[2018-01-29 16:41:22,295] Making new env: Breakout-v0
[2018-01-29 16:41:22,296] Making new env: Breakout-v0
[2018-01-29 16:41:22,302] Making new env: Breakout-v0
[2018-01-29 16:41:22,303] Making new env: Breakout-v0
[2018-01-29 16:41:22,308] Making new env: Breakout-v0
[2018-01-29 16:41:22,310] Making new env: Breakout-v0
[2018-01-29 16:41:22,313] Making new env: Breakout-v0
[2018-01-29 16:41:22,317] Making new env: Breakout-v0
[2018-01-29 16:41:22,320] Making new env: Breakout-v0
[2018-01-29 16:41:22,325] Making new env: Breakout-v0
[2018-01-29 16:41:22,326] Making new env: Breakout-v0
[2018-01-29 16:41:22,332] Making new env: Breakout-v0
[2018-01-29 16:41:22,332] Making new env: Breakout-v0
[2018-01-29 16:41:22,339] Making new env: Breakout-v0
[2018-01-29 16:41:22,340] Making new env: Breakout-v0
[2018-01-29 16:41:22,345] Making new env: Breakout-v0
[2018-01-29 16:41:22,347] Making new env: Breakout-v0
[2018-01-29 16:41:22,352] Making new env: Breakout-v0
[2018-01-29 16:41:22,355] Making new env: Breakout-v0
[2018-01-29 16:41:22,359] Making new env: Breakout-v0
[2018-01-29 16:41:22,362] Making new env: Breakout-v0
[2018-01-29 16:41:22,365] Making new env: Breakout-v0
[2018-01-29 16:41:22,369] Making new env: Breakout-v0
[2018-01-29 16:41:22,372] Making new env: Breakout-v0
[2018-01-29 16:41:22,376] Making new env: Breakout-v0
[2018-01-29 16:41:22,379] Making new env: Breakout-v0
[2018-01-29 16:41:22,384] Making new env: Breakout-v0
[2018-01-29 16:41:22,386] Making new env: Breakout-v0
[2018-01-29 16:41:22,392] Making new env: Breakout-v0
[2018-01-29 16:41:22,393] Making new env: Breakout-v0
[2018-01-29 16:41:22,399] Making new env: Breakout-v0
[2018-01-29 16:41:22,400] Making new env: Breakout-v0
[2018-01-29 16:41:22,406] Making new env: Breakout-v0
[2018-01-29 16:41:22,407] Making new env: Breakout-v0
[2018-01-29 16:41:22,413] Making new env: Breakout-v0
[2018-01-29 16:41:22,414] Making new env: Breakout-v0
[2018-01-29 16:41:22,420] Making new env: Breakout-v0
[2018-01-29 16:41:22,422] Making new env: Breakout-v0
[2018-01-29 16:41:22,427] Making new env: Breakout-v0
[2018-01-29 16:41:22,429] Making new env: Breakout-v0
[2018-01-29 16:41:22,434] Making new env: Breakout-v0
[2018-01-29 16:41:22,437] Making new env: Breakout-v0
[2018-01-29 16:41:22,441] Making new env: Breakout-v0
[2018-01-29 16:41:22,444] Making new env: Breakout-v0
[2018-01-29 16:41:22,448] Making new env: Breakout-v0
[2018-01-29 16:41:22,452] Making new env: Breakout-v0
[2018-01-29 16:41:22,455] Making new env: Breakout-v0
[2018-01-29 16:41:22,460] Making new env: Breakout-v0
[2018-01-29 16:41:22,463] Making new env: Breakout-v0
[2018-01-29 16:41:22,467] Making new env: Breakout-v0
[2018-01-29 16:41:22,470] Making new env: Breakout-v0
[2018-01-29 16:41:22,475] Making new env: Breakout-v0
[2018-01-29 16:41:22,476] Making new env: Breakout-v0
[2018-01-29 16:41:22,482] Making new env: Breakout-v0
[2018-01-29 16:41:22,483] Making new env: Breakout-v0
[2018-01-29 16:41:22,490] Making new env: Breakout-v0
[2018-01-29 16:41:22,490] Making new env: Breakout-v0
[2018-01-29 16:41:22,497] Making new env: Breakout-v0
[2018-01-29 16:41:22,498] Making new env: Breakout-v0
[2018-01-29 16:41:22,505] Making new env: Breakout-v0
[2018-01-29 16:41:22,505] Making new env: Breakout-v0
[2018-01-29 16:41:22,512] Making new env: Breakout-v0
[2018-01-29 16:41:22,514] Making new env: Breakout-v0
[2018-01-29 16:41:22,519] Making new env: Breakout-v0
[2018-01-29 16:41:22,521] Making new env: Breakout-v0
[2018-01-29 16:41:22,526] Making new env: Breakout-v0
[2018-01-29 16:41:22,529] Making new env: Breakout-v0
[2018-01-29 16:41:22,534] Making new env: Breakout-v0
[2018-01-29 16:41:22,537] Making new env: Breakout-v0
[2018-01-29 16:41:22,541] Making new env: Breakout-v0
[2018-01-29 16:41:22,545] Making new env: Breakout-v0
[2018-01-29 16:41:22,549] Making new env: Breakout-v0
[2018-01-29 16:41:22,553] Making new env: Breakout-v0
[2018-01-29 16:41:22,556] Making new env: Breakout-v0
[2018-01-29 16:41:22,561] Making new env: Breakout-v0
[2018-01-29 16:41:22,563] Making new env: Breakout-v0
[2018-01-29 16:41:22,569] Making new env: Breakout-v0
[2018-01-29 16:41:22,570] Making new env: Breakout-v0
[2018-01-29 16:41:22,576] Making new env: Breakout-v0
[2018-01-29 16:41:22,578] Making new env: Breakout-v0
2018-01-29 16:41:22.580249: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-29 16:41:22.580450: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-29 16:41:22.580460: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-01-29 16:41:22.580467: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-29 16:41:22.580475: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
[2018-01-29 16:41:22,585] Making new env: Breakout-v0
[2018-01-29 16:41:22,584] Making new env: Breakout-v0
[2018-01-29 16:41:22,592] Making new env: Breakout-v0
Traceback (most recent call last):
  File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 76, in <module>
    server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
    self._server_def.SerializeToString(), status)
  File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 2 was not defined in job "worker"
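The two tracebacks in this log come from the nodes assigned task_index 2 and 3: the job was launched on five nodes, but the cluster spec printed above defines only one ps task and two worker tasks, so any higher worker index is rejected by `tf.train.Server`. A minimal, TensorFlow-free sketch of that bounds check (`validate_task` is a hypothetical helper for illustration, not a TensorFlow API):

```python
# Cluster spec exactly as printed in the log above.
cluster_spec = {'ps': ['p1567:9236'], 'worker': ['p1568:9237', 'p1577:9237']}

def validate_task(cluster, job_name, task_index):
    """Illustrative check: a server may only start for a task_index
    that the cluster spec actually defines for its job."""
    tasks = cluster.get(job_name, [])
    if task_index >= len(tasks):
        # Mirrors the error in the log:
        # InvalidArgumentError: Task 2 was not defined in job "worker"
        raise ValueError('Task %d was not defined in job "%s"'
                         % (task_index, job_name))
    return tasks[task_index]

print(validate_task(cluster_spec, 'worker', 1))   # fine: p1577:9237
# validate_task(cluster_spec, 'worker', 2)        # raises ValueError
```

Under this reading, the fix is to make the number of srun tasks match the cluster spec: either pass all five hosts into the spec (one ps plus four workers) or request only three nodes.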
[2018-01-29 16:41:22,592] Making new env: Breakout-v0
[2018-01-29 16:41:22,599] Making new env: Breakout-v0
[2018-01-29 16:41:22,600] Making new env: Breakout-v0
[2018-01-29 16:41:22,606] Making new env: Breakout-v0
[2018-01-29 16:41:22,608] Making new env: Breakout-v0
[2018-01-29 16:41:22,614] Making new env: Breakout-v0
[2018-01-29 16:41:22,617] Making new env: Breakout-v0
[2018-01-29 16:41:22,622] Making new env: Breakout-v0
[2018-01-29 16:41:22,625] Making new env: Breakout-v0
[2018-01-29 16:41:22,629] Making new env: Breakout-v0
[2018-01-29 16:41:22,633] Making new env: Breakout-v0
[2018-01-29 16:41:22,637] Making new env: Breakout-v0
[2018-01-29 16:41:22,641] Making new env: Breakout-v0
[2018-01-29 16:41:22,644] Making new env: Breakout-v0
[2018-01-29 16:41:22,649] Making new env: Breakout-v0
[2018-01-29 16:41:22,652] Making new env: Breakout-v0
[2018-01-29 16:41:22,657] Making new env: Breakout-v0
[2018-01-29 16:41:22,660] Making new env: Breakout-v0
[2018-01-29 16:41:22,664] Making new env: Breakout-v0
[2018-01-29 16:41:22,667] Making new env: Breakout-v0
[2018-01-29 16:41:22,673] Making new env: Breakout-v0
[2018-01-29 16:41:22,675] Making new env: Breakout-v0
[2018-01-29 16:41:22,680] Making new env: Breakout-v0
[2018-01-29 16:41:22,682] Making new env: Breakout-v0
2018-01-29 16:41:22.688172: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-29 16:41:22.688407: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-29 16:41:22.688417: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-01-29 16:41:22.688425: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
[2018-01-29 16:41:22,688] Making new env: Breakout-v0
2018-01-29 16:41:22.688432: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
[2018-01-29 16:41:22,690] Making new env: Breakout-v0
[2018-01-29 16:41:22,695] Making new env: Breakout-v0
[2018-01-29 16:41:22,697] Making new env: Breakout-v0
Traceback (most recent call last):
  File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 76, in <module>
    server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
    self._server_def.SerializeToString(), status)
  File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 3 was not defined in job "worker"
[2018-01-29 16:41:22,704] Making new env: Breakout-v0
[2018-01-29 16:41:22,704] Making new env: Breakout-v0
[2018-01-29 16:41:22,712] Making new env: Breakout-v0
[2018-01-29 16:41:22,712] Making new env: Breakout-v0
[2018-01-29 16:41:22,719] Making new env: Breakout-v0
[2018-01-29 16:41:22,721] Making new env: Breakout-v0
[2018-01-29 16:41:22,727] Making new env: Breakout-v0
[2018-01-29 16:41:22,728] Making new env: Breakout-v0
[2018-01-29 16:41:22,734] Making new env: Breakout-v0
[2018-01-29 16:41:22,737] Making new env: Breakout-v0
[2018-01-29 16:41:22,742] Making new env: Breakout-v0
[2018-01-29 16:41:22,745] Making new env: Breakout-v0
[2018-01-29 16:41:22,750] Making new env: Breakout-v0
[2018-01-29 16:41:22,753] Making new env: Breakout-v0
[2018-01-29 16:41:22,758] Making new env: Breakout-v0
[2018-01-29 16:41:22,761] Making new env: Breakout-v0
[2018-01-29 16:41:22,766] Making new env: Breakout-v0
[2018-01-29 16:41:22,770] Making new env: Breakout-v0
[2018-01-29 16:41:22,774] Making new env: Breakout-v0
[2018-01-29 16:41:22,778] Making new env: Breakout-v0
[2018-01-29 16:41:22,782] Making new env: Breakout-v0
[2018-01-29 16:41:22,786] Making new env: Breakout-v0
[2018-01-29 16:41:22,790] Making new env: Breakout-v0
[2018-01-29 16:41:22,794] Making new env: Breakout-v0
[2018-01-29 16:41:22,797] Making new env: Breakout-v0
[2018-01-29 16:41:22,805] Making new env: Breakout-v0
[2018-01-29 16:41:22,813] Making new env: Breakout-v0
[2018-01-29 16:41:22,820] Making new env: Breakout-v0
None <type 'NoneType'>
worker host: grpc://localhost:9237
[2018-01-29 16:41:22,828] Making new env: Breakout-v0
[0129 16:41:22 @train.py:718] [BA3C] Train on gpu 0 and infer on gpu 0,0,0
[0129 16:41:22 @train.py:724] using async version
[2018-01-29 16:41:22,835] Making new env: Breakout-v0
DUMMY PREDICTOR 0
[2018-01-29 16:41:22,843] Making new env: Breakout-v0
MultiGPUTrainer __init__ dummy = 0
[0129 16:41:22 @multigpu.py:57] Training a model of 1 tower
[2018-01-29 16:41:22,851] Making new env: Breakout-v0
[0129 16:41:22 @multigpu.py:67] Building graph for training tower 0..., /cpu:0
===== [p1568] PRINTING BUILD GRAPH STACK AT 1517240482.85 ==============
  File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 728, in <module>
    trainer.train()
  File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 137, in train
    grad_list = self._multi_tower_grads()
  File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 81, in _multi_tower_grads
    self.model.build_graph(model_inputs)
  File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/models/model_desc.py", line 140, in build_graph
    self._build_graph(model_inputs)
  File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 279, in _build_graph
    traceback.print_stack(file=sys.stderr)
[2018-01-29 16:41:22,859] Making new env: Breakout-v0
[2018-01-29 16:41:22,867] Making new env: Breakout-v0
[2018-01-29 16:41:22,875] Making new env: Breakout-v0
12
[2018-01-29 16:41:22,883] Making new env: Breakout-v0
[2018-01-29 16:41:22,890] Making new env: Breakout-v0
[2018-01-29 16:41:22,898] Making new env: Breakout-v0
[0129 16:41:22 @_common.py:61] conv0 input: [None, 84, 84, 16]
[2018-01-29 16:41:22,906] Making new env: Breakout-v0
[2018-01-29 16:41:22,914] Making new env: Breakout-v0
[2018-01-29 16:41:22,922] Making new env: Breakout-v0
Tensor("tower0/conv0/output:0", shape=(?, 80, 80, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[0129 16:41:22 @_common.py:70] conv0 output: [None, 80, 80, 32]
[2018-01-29 16:41:22,930] Making new env: Breakout-v0
[0129 16:41:22 @_common.py:61] pool0 input: [None, 80, 80, 32]
Tensor("tower0/pool0/MaxPool:0", shape=(?, 40, 40, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[0129 16:41:22 @_common.py:70] pool0 output: [None, 40, 40, 32]
[0129 16:41:22 @_common.py:61] conv1 input: [None, 40, 40, 32]
[2018-01-29 16:41:22,938] Making new env: Breakout-v0
[2018-01-29 16:41:22,946] Making new env: Breakout-v0
Tensor("tower0/conv1/output:0", shape=(?, 36, 36, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[0129 16:41:22 @_common.py:70] conv1 output: [None, 36, 36, 32]
[0129 16:41:22 @_common.py:61] pool1 input: [None, 36, 36, 32]
Tensor("tower0/pool1/MaxPool:0", shape=(?, 18, 18, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[0129 16:41:22 @_common.py:70] pool1 output: [None, 18, 18, 32]
[0129 16:41:22 @_common.py:61] conv2 input: [None, 18, 18, 32]
[2018-01-29 16:41:22,953] Making new env: Breakout-v0
[2018-01-29 16:41:22,962] Making new env: Breakout-v0
Tensor("tower0/conv2/output:0", shape=(?, 14, 14, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[0129 16:41:22 @_common.py:70] conv2 output: [None, 14, 14, 64]
[0129 16:41:22 @_common.py:61] pool2 input: [None, 14, 14, 64]
Tensor("tower0/pool2/MaxPool:0", shape=(?, 7, 7, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[0129 16:41:22 @_common.py:70] pool2 output: [None, 7, 7, 64]
[0129 16:41:22 @_common.py:61] conv3 input: [None, 7, 7, 64]
[2018-01-29 16:41:22,971] Making new env: Breakout-v0
[2018-01-29 16:41:22,981] Making new env: Breakout-v0
Tensor("tower0/conv3/output:0", shape=(?, 5, 5, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[0129 16:41:22 @_common.py:70] conv3 output: [None, 5, 5, 64]
[0129 16:41:22 @_common.py:61] fc1_0 input: [None, 5, 5, 64]
[2018-01-29 16:41:22,989] Making new env: Breakout-v0
[2018-01-29 16:41:22,997] Making new env: Breakout-v0
Tensor("tower0/fc1_0/output:0", shape=(?, 1, 1, 256), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[0129 16:41:22 @_common.py:70] fc1_0 output: [None, 1, 1, 256]
[2018-01-29 16:41:23,005] Making new env: Breakout-v0
[0129 16:41:23 @_common.py:61] fc-pi input: [None, 256]
[2018-01-29 16:41:23,013] Making new env: Breakout-v0
[2018-01-29 16:41:23,021] Making new env: Breakout-v0
Tensor("tower0/fc-pi/output:0", shape=(?, 6), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[0129 16:41:23 @_common.py:70] fc-pi output: [None, 6]
[0129 16:41:23 @_common.py:61] fc-v input: [None, 256]
None <type 'NoneType'>
worker host: grpc://localhost:9237
[0129 16:41:23 @train.py:718] [BA3C] Train on gpu 0 and infer on gpu 0,0,0
[0129 16:41:23 @train.py:724] using async version
DUMMY PREDICTOR 0
MultiGPUTrainer __init__ dummy = 0
[0129 16:41:23 @multigpu.py:57] Training a model of 1 tower
[0129 16:41:23 @multigpu.py:67] Building graph for training tower 0..., /cpu:0
===== [p1577] PRINTING BUILD GRAPH STACK AT 1517240483.08==============
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 728, in <module>
trainer.train()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 137, in train
grad_list = self._multi_tower_grads()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 81, in _multi_tower_grads
self.model.build_graph(model_inputs)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/models/model_desc.py", line 140, in build_graph
self._build_graph(model_inputs)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 279, in _build_graph
traceback.print_stack(file=sys.stderr)
12
[0129 16:41:23 @_common.py:61] conv0 input: [None, 84, 84, 16]
Tensor("tower0/conv0/output:0", shape=(?, 80, 80, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
[0129 16:41:23 @_common.py:70] conv0 output: [None, 80, 80, 32]
[0129 16:41:23 @_common.py:61] pool0 input: [None, 80, 80, 32]
Tensor("tower0/pool0/MaxPool:0", shape=(?, 40, 40, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
[0129 16:41:23 @_common.py:70] pool0 output: [None, 40, 40, 32]
[0129 16:41:23 @_common.py:61] conv1 input: [None, 40, 40, 32]
Tensor("tower0/conv1/output:0", shape=(?, 36, 36, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
[0129 16:41:23 @_common.py:70] conv1 output: [None, 36, 36, 32]
[0129 16:41:23 @_common.py:61] pool1 input: [None, 36, 36, 32]
Tensor("tower0/pool1/MaxPool:0", shape=(?, 18, 18, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
[0129 16:41:23 @_common.py:70] pool1 output: [None, 18, 18, 32]
[0129 16:41:23 @_common.py:61] conv2 input: [None, 18, 18, 32]
Tensor("tower0/conv2/output:0", shape=(?, 14, 14, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0)
[0129 16:41:23 @_common.py:70] conv2 output: [None, 14, 14, 64]
[0129 16:41:23 @_common.py:61] pool2 input: [None, 14, 14, 64]
Tensor("tower0/pool2/MaxPool:0", shape=(?, 7, 7, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0)
[0129 16:41:23 @_common.py:70] pool2 output: [None, 7, 7, 64]
[0129 16:41:23 @_common.py:61] conv3 input: [None, 7, 7, 64]
Tensor("tower0/conv3/output:0", shape=(?, 5, 5, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0)
[0129 16:41:23 @_common.py:70] conv3 output: [None, 5, 5, 64]
[0129 16:41:23 @_common.py:61] fc1_0 input: [None, 5, 5, 64]
Tensor("tower0/fc1_0/output:0", shape=(?, 1, 1, 256), dtype=float32, device=/job:worker/task:1/device:CPU:0)
[0129 16:41:23 @_common.py:70] fc1_0 output: [None, 1, 1, 256]
[0129 16:41:23 @_common.py:61] fc-pi input: [None, 256]
Tensor("tower0/fc-pi/output:0", shape=(?, 6), dtype=float32, device=/job:worker/task:1/device:CPU:0)
[0129 16:41:23 @_common.py:70] fc-pi output: [None, 6]
[0129 16:41:23 @_common.py:61] fc-v input: [None, 256]
Traceback (most recent call last):
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 76, in <module>
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
self._server_def.SerializeToString(), status)
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 2 was not defined in job "worker"
Traceback (most recent call last):
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 76, in <module>
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
self._server_def.SerializeToString(), status)
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 3 was not defined in job "worker"
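The repeated `InvalidArgumentError: Task N was not defined in job "worker"` suggests a cluster-spec/task-index mismatch: tasks 0 and 1 build their graphs, while the processes claiming task_index 2 and 3 fail, which points to a ClusterSpec that lists fewer worker addresses than there are launched worker processes. A minimal pure-Python sketch of the consistency rule (hostnames, ports, and the ps/worker layout here are hypothetical, not taken from train.py):

```python
# Every node must receive a cluster spec whose "worker" list has an entry
# for every task_index that any node will pass to tf.train.Server.
# Addresses below are made up for illustration.
cluster_spec = {
    "ps": ["p1568:9236"],
    "worker": ["p1567:9237", "p1568:9237"],  # only tasks 0 and 1 defined
}

def check_task(cluster, job_name, task_index):
    """Mimic the server-side check that raises InvalidArgumentError."""
    if task_index >= len(cluster[job_name]):
        raise ValueError(
            'Task %d was not defined in job "%s"' % (task_index, job_name))

check_task(cluster_spec, "worker", 1)   # tasks 0 and 1 start fine
try:
    check_task(cluster_spec, "worker", 2)  # reproduces the failure above
except ValueError as e:
    print(e)
```

If this is the cause, the fix is to build the same worker list on every node, with one entry per launched worker task.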
Tensor("tower0/fc-v/output:0", shape=(?, 1), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[0129 16:41:23 @_common.py:70] fc-v output: [None, 1]
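The spatial sizes logged for tower0 are self-consistent with VALID-padded convolutions and stride-2 pooling; a quick check (kernel sizes read from the parameter dump below: 5x5 for conv0-2, 3x3 for conv3, and 2x2 max-pools inferred from the halving):

```python
# Recompute the logged spatial sizes: VALID conv shrinks by (kernel - 1),
# each max-pool halves the size. Input is 84x84 per the conv0 input log.
sizes = []
s = 84
for kernel, pool in [(5, True), (5, True), (5, True), (3, False)]:
    s = s - kernel + 1          # VALID convolution
    sizes.append(s)
    if pool:
        s //= 2                 # 2x2 max-pool, stride 2
        sizes.append(s)
print(sizes)  # [80, 40, 36, 18, 14, 7, 5], matching the log
```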
Tensor("tower0/fc-v/output:0", shape=(?, 1), dtype=float32, device=/job:worker/task:1/device:CPU:0)
[0129 16:41:24 @_common.py:70] fc-v output: [None, 1]
Traceback (most recent call last):
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 76, in <module>
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
self._server_def.SerializeToString(), status)
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 2 was not defined in job "worker"
MOVING_SUMMARY_VARIABLES
[]
Traceback (most recent call last):
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 76, in <module>
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
self._server_def.SerializeToString(), status)
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 3 was not defined in job "worker"
[0129 16:41:24 @modelutils.py:22] Model Parameters:
conv0/W:0: shape=[5, 5, 16, 32], dim=12800
conv1/W:0: shape=[5, 5, 32, 32], dim=25600
conv2/W:0: shape=[5, 5, 32, 64], dim=51200
conv3/W:0: shape=[3, 3, 64, 64], dim=36864
fc1_0/W:0: shape=[5, 5, 64, 256], dim=409600
fc-pi/W:0: shape=[256, 6], dim=1536
fc-pi/b:0: shape=[6], dim=6
fc-v/W:0: shape=[256, 1], dim=256
fc-v/b:0: shape=[1], dim=1
Total param=537863 (2.051785 MB assuming all float32)
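The totals in the parameter dump check out; a pure-Python sanity check with the shapes copied from the lines above (MB here taken as 2**20 bytes, as in the log):

```python
# Per-layer parameter counts from the dump, their sum, and the float32
# footprint in MB.
dims = {
    "conv0/W": 5 * 5 * 16 * 32,    # 12800
    "conv1/W": 5 * 5 * 32 * 32,    # 25600
    "conv2/W": 5 * 5 * 32 * 64,    # 51200
    "conv3/W": 3 * 3 * 64 * 64,    # 36864
    "fc1_0/W": 5 * 5 * 64 * 256,   # 409600
    "fc-pi/W": 256 * 6,            # 1536
    "fc-pi/b": 6,
    "fc-v/W": 256 * 1,             # 256
    "fc-v/b": 1,
}
total = sum(dims.values())
print(total)                      # 537863, as logged
print(total * 4 / 2.0**20)        # ~2.051785 MB assuming float32
```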
MOVING_SUMMARY_VARIABLES
[]
[0129 16:41:24 @modelutils.py:22] Model Parameters:
conv0/W:0: shape=[5, 5, 16, 32], dim=12800
conv1/W:0: shape=[5, 5, 32, 32], dim=25600
conv2/W:0: shape=[5, 5, 32, 64], dim=51200
conv3/W:0: shape=[3, 3, 64, 64], dim=36864
fc1_0/W:0: shape=[5, 5, 64, 256], dim=409600
fc-pi/W:0: shape=[256, 6], dim=1536
fc-pi/b:0: shape=[6], dim=6
fc-v/W:0: shape=[256, 1], dim=256
fc-v/b:0: shape=[1], dim=1
Total param=537863 (2.051785 MB assuming all float32)
Traceback (most recent call last):
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 76, in <module>
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
self._server_def.SerializeToString(), status)
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 2 was not defined in job "worker"
Traceback (most recent call last):
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 76, in <module>
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
self._server_def.SerializeToString(), status)
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 3 was not defined in job "worker"
[0129 16:41:26 @multigpu.py:228] Setup callbacks ...
Creating Predictorfactor 0
[0129 16:41:26 @base.py:132] Building graph for predictor tower 0...
===== [p1568] PRINTING BUILD GRAPH STACK AT 1517240486.25==============
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 728, in <module>
trainer.train()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 230, in train
callbacks.setup_graph(self) # TODO use weakref instead?
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph
self._setup_graph()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/group.py", line 66, in _setup_graph
cb.setup_graph(self.trainer)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph
self._setup_graph()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 367, in _setup_graph
self.trainer.get_predict_funcs(['state'], ['logitsT', 'pred_value'], self.predictor_threads),
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 331, in get_predict_funcs
return [self.get_predict_func(input_names, output_names, k) for k in range(n)]
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 328, in get_predict_func
return self.predictor_factory.get_predictor(input_names, output_names, tower)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 54, in get_predictor
self._build_predict_tower()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 71, in _build_predict_tower
self.model, self.towers)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/predict/base.py", line 134, in build_multi_tower_prediction_graph
model.build_graph(input_vars)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/models/model_desc.py", line 140, in build_graph
self._build_graph(model_inputs)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 279, in _build_graph
traceback.print_stack(file=sys.stderr)
12
Tensor("towerp0/conv0/output:0", shape=(?, 80, 80, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0/pool0/MaxPool:0", shape=(?, 40, 40, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0/conv1/output:0", shape=(?, 36, 36, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0/pool1/MaxPool:0", shape=(?, 18, 18, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0/conv2/output:0", shape=(?, 14, 14, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0/pool2/MaxPool:0", shape=(?, 7, 7, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0/conv3/output:0", shape=(?, 5, 5, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0/fc1_0/output:0", shape=(?, 1, 1, 256), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0/fc-pi/output:0", shape=(?, 6), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0/fc-v/output:0", shape=(?, 1), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[0129 16:41:26 @base.py:132] Building graph for predictor tower 0...
===== [p1568] PRINTING BUILD GRAPH STACK AT 1517240486.38==============
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 728, in <module>
trainer.train()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 230, in train
callbacks.setup_graph(self) # TODO use weakref instead?
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph
self._setup_graph()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/group.py", line 66, in _setup_graph
cb.setup_graph(self.trainer)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph
self._setup_graph()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 367, in _setup_graph
self.trainer.get_predict_funcs(['state'], ['logitsT', 'pred_value'], self.predictor_threads),
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 331, in get_predict_funcs
return [self.get_predict_func(input_names, output_names, k) for k in range(n)]
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 328, in get_predict_func
return self.predictor_factory.get_predictor(input_names, output_names, tower)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 54, in get_predictor
self._build_predict_tower()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 71, in _build_predict_tower
self.model, self.towers)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/predict/base.py", line 134, in build_multi_tower_prediction_graph
model.build_graph(input_vars)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/models/model_desc.py", line 140, in build_graph
self._build_graph(model_inputs)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 279, in _build_graph
traceback.print_stack(file=sys.stderr)
12
Tensor("towerp0_1/conv0/output:0", shape=(?, 80, 80, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0_1/pool0/MaxPool:0", shape=(?, 40, 40, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0_1/conv1/output:0", shape=(?, 36, 36, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0_1/pool1/MaxPool:0", shape=(?, 18, 18, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0_1/conv2/output:0", shape=(?, 14, 14, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0_1/pool2/MaxPool:0", shape=(?, 7, 7, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0_1/conv3/output:0", shape=(?, 5, 5, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0_1/fc1_0/output:0", shape=(?, 1, 1, 256), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0_1/fc-pi/output:0", shape=(?, 6), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0_1/fc-v/output:0", shape=(?, 1), dtype=float32, device=/job:worker/task:0/device:CPU:0)
[0129 16:41:26 @base.py:132] Building graph for predictor tower 0...
===== [p1568] PRINTING BUILD GRAPH STACK AT 1517240486.5==============
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 728, in <module>
trainer.train()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 230, in train
callbacks.setup_graph(self) # TODO use weakref instead?
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph
self._setup_graph()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/group.py", line 66, in _setup_graph
cb.setup_graph(self.trainer)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph
self._setup_graph()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 367, in _setup_graph
self.trainer.get_predict_funcs(['state'], ['logitsT', 'pred_value'], self.predictor_threads),
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 331, in get_predict_funcs
return [self.get_predict_func(input_names, output_names, k) for k in range(n)]
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 328, in get_predict_func
return self.predictor_factory.get_predictor(input_names, output_names, tower)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 54, in get_predictor
self._build_predict_tower()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 71, in _build_predict_tower
self.model, self.towers)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/predict/base.py", line 134, in build_multi_tower_prediction_graph
model.build_graph(input_vars)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/models/model_desc.py", line 140, in build_graph
self._build_graph(model_inputs)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 279, in _build_graph
traceback.print_stack(file=sys.stderr)
[0129 16:41:26 @multigpu.py:228] Setup callbacks ...
Creating Predictorfactor 0
[0129 16:41:26 @base.py:132] Building graph for predictor tower 0...
12
===== [p1577] PRINTING BUILD GRAPH STACK AT 1517240486.51==============
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 728, in <module>
trainer.train()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 230, in train
callbacks.setup_graph(self) # TODO use weakref instead?
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph
self._setup_graph()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/group.py", line 66, in _setup_graph
cb.setup_graph(self.trainer)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph
self._setup_graph()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 367, in _setup_graph
self.trainer.get_predict_funcs(['state'], ['logitsT', 'pred_value'], self.predictor_threads),
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 331, in get_predict_funcs
return [self.get_predict_func(input_names, output_names, k) for k in range(n)]
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 328, in get_predict_func
return self.predictor_factory.get_predictor(input_names, output_names, tower)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 54, in get_predictor
self._build_predict_tower()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 71, in _build_predict_tower
self.model, self.towers)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/predict/base.py", line 134, in build_multi_tower_prediction_graph
model.build_graph(input_vars)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/models/model_desc.py", line 140, in build_graph
self._build_graph(model_inputs)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 279, in _build_graph
traceback.print_stack(file=sys.stderr)
Tensor("towerp0_2/conv0/output:0", shape=(?, 80, 80, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
12
Tensor("towerp0_2/pool0/MaxPool:0", shape=(?, 40, 40, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0_2/conv1/output:0", shape=(?, 36, 36, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0_2/pool1/MaxPool:0", shape=(?, 18, 18, 32), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0_2/conv2/output:0", shape=(?, 14, 14, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0_2/pool2/MaxPool:0", shape=(?, 7, 7, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0_2/conv3/output:0", shape=(?, 5, 5, 64), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0/conv0/output:0", shape=(?, 80, 80, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_2/fc1_0/output:0", shape=(?, 1, 1, 256), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0/pool0/MaxPool:0", shape=(?, 40, 40, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0/conv1/output:0", shape=(?, 36, 36, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0/pool1/MaxPool:0", shape=(?, 18, 18, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0/conv2/output:0", shape=(?, 14, 14, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0/pool2/MaxPool:0", shape=(?, 7, 7, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_2/fc-pi/output:0", shape=(?, 6), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0/conv3/output:0", shape=(?, 5, 5, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0/fc1_0/output:0", shape=(?, 1, 1, 256), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_2/fc-v/output:0", shape=(?, 1), dtype=float32, device=/job:worker/task:0/device:CPU:0)
Tensor("towerp0/fc-pi/output:0", shape=(?, 6), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0/fc-v/output:0", shape=(?, 1), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Traceback (most recent call last):
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 76, in <module>
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
self._server_def.SerializeToString(), status)
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 2 was not defined in job "worker"
[0129 16:41:26 @base.py:177] ===============================================================
[0129 16:41:26 @base.py:179] CHIEF!
[0129 16:41:26 @base.py:180] [p1568] Creating the session
[0129 16:41:26 @base.py:181] ===============================================================
[0129 16:41:26 @base.py:132] Building graph for predictor tower 0...
===== [p1577] PRINTING BUILD GRAPH STACK AT 1517240486.65==============
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 728, in <module>
trainer.train()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 230, in train
callbacks.setup_graph(self) # TODO use weakref instead?
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph
self._setup_graph()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/group.py", line 66, in _setup_graph
cb.setup_graph(self.trainer)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph
self._setup_graph()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 367, in _setup_graph
self.trainer.get_predict_funcs(['state'], ['logitsT', 'pred_value'], self.predictor_threads),
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 331, in get_predict_funcs
return [self.get_predict_func(input_names, output_names, k) for k in range(n)]
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 328, in get_predict_func
return self.predictor_factory.get_predictor(input_names, output_names, tower)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 54, in get_predictor
self._build_predict_tower()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 71, in _build_predict_tower
self.model, self.towers)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/predict/base.py", line 134, in build_multi_tower_prediction_graph
model.build_graph(input_vars)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/models/model_desc.py", line 140, in build_graph
self._build_graph(model_inputs)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 279, in _build_graph
traceback.print_stack(file=sys.stderr)
12
Tensor("towerp0_1/conv0/output:0", shape=(?, 80, 80, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_1/pool0/MaxPool:0", shape=(?, 40, 40, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_1/conv1/output:0", shape=(?, 36, 36, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_1/pool1/MaxPool:0", shape=(?, 18, 18, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_1/conv2/output:0", shape=(?, 14, 14, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_1/pool2/MaxPool:0", shape=(?, 7, 7, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_1/conv3/output:0", shape=(?, 5, 5, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_1/fc1_0/output:0", shape=(?, 1, 1, 256), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_1/fc-pi/output:0", shape=(?, 6), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_1/fc-v/output:0", shape=(?, 1), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Traceback (most recent call last):
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 76, in <module>
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
self._server_def.SerializeToString(), status)
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 3 was not defined in job "worker"
[0129 16:41:26 @base.py:132] Building graph for predictor tower 0...
===== [p1577] PRINTING BUILD GRAPH STACK AT 1517240486.77==============
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 728, in <module>
trainer.train()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/multigpu.py", line 230, in train
callbacks.setup_graph(self) # TODO use weakref instead?
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph
self._setup_graph()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/group.py", line 66, in _setup_graph
cb.setup_graph(self.trainer)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/callbacks/base.py", line 40, in setup_graph
self._setup_graph()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 367, in _setup_graph
self.trainer.get_predict_funcs(['state'], ['logitsT', 'pred_value'], self.predictor_threads),
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 331, in get_predict_funcs
return [self.get_predict_func(input_names, output_names, k) for k in range(n)]
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 328, in get_predict_func
return self.predictor_factory.get_predictor(input_names, output_names, tower)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 54, in get_predictor
self._build_predict_tower()
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/train/trainer.py", line 71, in _build_predict_tower
self.model, self.towers)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/predict/base.py", line 134, in build_multi_tower_prediction_graph
model.build_graph(input_vars)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/tensorpack_cpu/tensorpack/models/model_desc.py", line 140, in build_graph
self._build_graph(model_inputs)
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 279, in _build_graph
traceback.print_stack(file=sys.stderr)
12
Tensor("towerp0_2/conv0/output:0", shape=(?, 80, 80, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_2/pool0/MaxPool:0", shape=(?, 40, 40, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_2/conv1/output:0", shape=(?, 36, 36, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_2/pool1/MaxPool:0", shape=(?, 18, 18, 32), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_2/conv2/output:0", shape=(?, 14, 14, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_2/pool2/MaxPool:0", shape=(?, 7, 7, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_2/conv3/output:0", shape=(?, 5, 5, 64), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_2/fc1_0/output:0", shape=(?, 1, 1, 256), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_2/fc-pi/output:0", shape=(?, 6), dtype=float32, device=/job:worker/task:1/device:CPU:0)
Tensor("towerp0_2/fc-v/output:0", shape=(?, 1), dtype=float32, device=/job:worker/task:1/device:CPU:0)
[0129 16:41:26 @base.py:177] ===============================================================
[0129 16:41:26 @base.py:180] [p1577] Creating the session
[0129 16:41:26 @base.py:181] ===============================================================
2018-01-29 16:41:27.514947: I tensorflow/core/distributed_runtime/master_session.cc:999] Start master session bb8c9e5e6f8d6e2a with config:
Traceback (most recent call last):
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 76, in <module>
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
self._server_def.SerializeToString(), status)
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 2 was not defined in job "worker"
[0129 16:41:27 @base.py:189] ===============================================================
[0129 16:41:27 @base.py:190] [p1568] Session created
[0129 16:41:27 @base.py:191] ===============================================================
[0129 16:41:27 @base.py:112] [p1568] Initializing graph variables ...
[0129 16:41:27 @base.py:119] [p1568] Starting concurrency...
[0129 16:41:27 @base.py:198] Starting all threads & procs ...
[0129 16:41:27 @base.py:122] [p1568] Setting default session
[0129 16:41:27 @base.py:125] [p1568] Getting global step
[0129 16:41:27 @base.py:127] [p1568] Start training with global_step=0
Traceback (most recent call last):
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 76, in <module>
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
self._server_def.SerializeToString(), status)
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 3 was not defined in job "worker"
2018-01-29 16:41:27.738938: I tensorflow/core/distributed_runtime/master_session.cc:999] Start master session 403ed8c384cb098e with config:
[0129 16:41:27 @base.py:189] ===============================================================
[0129 16:41:27 @base.py:190] [p1577] Session created
[0129 16:41:27 @base.py:191] ===============================================================
[0129 16:41:27 @base.py:112] [p1577] Initializing graph variables ...
[0129 16:41:27 @base.py:119] [p1577] Starting concurrency...
[0129 16:41:27 @base.py:198] Starting all threads & procs ...
[0129 16:41:27 @base.py:122] [p1577] Setting default session
[0129 16:41:27 @base.py:125] [p1577] Getting global step
[0129 16:41:27 @base.py:127] [p1577] Start training with global_step=0
server main loop
before socket bind... tcp://*:17351
receiving
Traceback (most recent call last):
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 76, in <module>
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
self._server_def.SerializeToString(), status)
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 2 was not defined in job "worker"
Traceback (most recent call last):
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 76, in <module>
server = tf.train.Server(cluster_spec, job_name=my_job_name, task_index=my_task_index)
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/training/server_lib.py", line 145, in __init__
self._server_def.SerializeToString(), status)
File "/net/software/local/python/2.7.9/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/net/archive/groups/plggluna/adam/a3c_virtualenv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
InvalidArgumentError: Task 3 was not defined in job "worker"
[... the Task 2 / Task 3 'was not defined in job "worker"' traceback pair above repeats 2 more times; omitted ...]
[0129 16:41:31 @multigpu.py:323] ERR [p1577] step: count(1), step_time 6416.33, mean_step_time 6416.33, it/s 0.16
[... one more identical Task 2 / Task 3 traceback pair omitted ...]
[2018-01-29 16:41:32,629] Making new env: Breakout-v0
[2018-01-29 16:41:32,747] Making new env: Breakout-v0
{'ps': ['p1567:9236'], 'worker': ['p1568:9237', 'p1577:9237']}
[worker:2] Starting the TF server
cluster_spec.as_dict(): {'ps': ['p1567:9236'], 'worker': ['p1568:9237', 'p1577:9237']} tf.__version__: 1.2.1
========= EXCEPTION WHILE STARTING TF SERVER [p1580] =====
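The cluster spec above lists only two worker tasks (indices 0 and 1), yet nodes p1580 and p1584 come up as worker tasks 2 and 3; `tf.train.Server` validates the task index against the `ClusterSpec`, which is what produces `InvalidArgumentError: Task 2 was not defined in job "worker"`. A minimal plain-Python sketch of that validation (the helper name is illustrative; the real check happens inside TensorFlow's server startup):

```python
# Illustrative re-creation of the task-index check tf.train.Server performs
# against its cluster spec. `validate_task` is a hypothetical helper.
cluster = {'ps': ['p1567:9236'], 'worker': ['p1568:9237', 'p1577:9237']}

def validate_task(cluster, job_name, task_index):
    """Return the address for (job_name, task_index), or raise like TF does."""
    tasks = cluster.get(job_name, [])
    if not 0 <= task_index < len(tasks):
        raise ValueError(
            'Task %d was not defined in job "%s"' % (task_index, job_name))
    return tasks[task_index]

print(validate_task(cluster, 'worker', 1))  # p1577:9237 -- a defined task
try:
    validate_task(cluster, 'worker', 2)     # what tasks 2 and 3 hit
except ValueError as e:
    print(e)                                # Task 2 was not defined in job "worker"
```

With 5 Slurm nodes but only 1 ps + 2 workers in the spec, the two surplus nodes can only fail this check; the spec and the node count have to agree.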
[... the three "[worker:2] ... EXCEPTION WHILE STARTING TF SERVER [p1580]" lines above repeat 9 more times; omitted ...]
args.mkl == 0
using tensorflow convolution
{'ps': ['p1567:9236'], 'worker': ['p1568:9237', 'p1577:9237']}
[worker:3] Starting the TF server
cluster_spec.as_dict(): {'ps': ['p1567:9236'], 'worker': ['p1568:9237', 'p1577:9237']} tf.__version__: 1.2.1
========= EXCEPTION WHILE STARTING TF SERVER [p1584] =====
[... the three "[worker:3] ... EXCEPTION WHILE STARTING TF SERVER [p1584]" lines above repeat 9 more times; omitted ...]
args.mkl == 0
using tensorflow convolution
[0129 16:41:33 @multigpu.py:323] ERR [p1577] step: count(2), step_time 2272.23, mean_step_time 4344.28, it/s 0.23
[2018-01-29 16:41:33,917] Making new env: Breakout-v0
[... "Making new env: Breakout-v0" repeated 197 more times between 16:41:33 and 16:41:34; omitted ...]
None <type 'NoneType'>
Traceback (most recent call last):
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 712, in <module>
config = get_config(args, is_chief, my_task_index, chief_worker_hostname, len(cluster['worker']))
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 571, in get_config
'worker_host' : server.target,
NameError: global name 'server' is not defined
[2018-01-29 16:41:34,612] Making new env: Breakout-v0
[2018-01-29 16:41:34,620] Making new env: Breakout-v0
None <type 'NoneType'>
Traceback (most recent call last):
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 712, in <module>
config = get_config(args, is_chief, my_task_index, chief_worker_hostname, len(cluster['worker']))
File "/net/archive/groups/plggluna/adam/Distributed-BA3C/src/OpenAIGym//train.py", line 571, in get_config
'worker_host' : server.target,
NameError: global name 'server' is not defined
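The `NameError` here is a secondary failure, not a new bug: `tf.train.Server(...)` at train.py line 76 raised before the assignment completed, so the global `server` was never bound, and when `get_config` later reads `server.target` the name does not exist. A minimal sketch of that failure cascade (function name hypothetical):

```python
# Hypothetical stand-in for tf.train.Server raising during construction.
def start_server():
    raise ValueError('Task 2 was not defined in job "worker"')

try:
    server = start_server()   # raises, so `server` is never bound
except ValueError:
    pass                      # error logged elsewhere, execution continues

try:
    server.target             # later use, as in get_config()
except NameError as e:
    print(e)                  # name 'server' is not defined
```

Fixing the cluster-spec mismatch makes this traceback disappear as well, since `server` then gets assigned normally.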
[0129 16:41:34 @multigpu.py:323] ERR [p1577] step: count(3), step_time 1160.77, mean_step_time 3283.11, it/s 0.3
DONE
DONE
[0129 16:41:36 @multigpu.py:323] ERR [p1577] step: count(4), step_time 1182.74, mean_step_time 2758.02, it/s 0.36
[0129 16:41:37 @multigpu.py:323] ERR [p1577] step: count(5), step_time 1161.0, mean_step_time 2438.61, it/s 0.41
[0129 16:41:39 @multigpu.py:323] ERR [p1577] step: count(6), step_time 2177.08, mean_step_time 2395.02, it/s 0.42