AeroXi/screen_log

## screen_log
2019-08-26 21:17:42.954423: I tensorflow/core/common_runtime/bfc_allocator.cc:654] 2 Chunks of size 29364224 totalling 56.01MiB
2019-08-26 21:17:42.954434: I tensorflow/core/common_runtime/bfc_allocator.cc:654] 1 Chunks of size 29425664 totalling 28.06MiB
2019-08-26 21:17:42.954446: I tensorflow/core/common_runtime/bfc_allocator.cc:654] 1 Chunks of size 32751616 totalling 31.23MiB
2019-08-26 21:17:42.954458: I tensorflow/core/common_runtime/bfc_allocator.cc:654] 6 Chunks of size 125018112 totalling 715.36MiB
2019-08-26 21:17:42.954469: I tensorflow/core/common_runtime/bfc_allocator.cc:658] Sum Total of in-use chunks: 10.14GiB
2019-08-26 21:17:42.954485: I tensorflow/core/common_runtime/bfc_allocator.cc:660] Stats:
Limit:                 10895235482
InUse:                 10891294208
MaxInUse:              10891294208
NumAllocs:                    4208
MaxAllocSize:            125018112

2019-08-26 21:17:42.954655: W tensorflow/core/common_runtime/bfc_allocator.cc:275] ****************************************************************************************************
2019-08-26 21:17:42.954734: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at cwise_ops_common.cc:70 : Resource exhausted: OOM when allocating tensor with shape[1024,4096] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
INFO:tensorflow:Error recorded from training_loop: OOM when allocating tensor with shape[1024,4096] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[{{node bert/encoder/layer_15/intermediate/dense/truediv}} = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](bert/encoder/layer_15/intermediate/dense/BiasAdd, ConstantFolding/gradients/bert/encoder/layer_0/intermediate/dense/truediv_grad/RealDiv_recip)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	 [[{{node add_1/_9593}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_6797_add_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.


Caused by op 'bert/encoder/layer_15/intermediate/dense/truediv', defined at:
  File "pretrain_on_vcr.py", line 467, in <module>
    estimator.train(input_fn=train_input_fn, max_steps=FLAGS.num_train_steps)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2394, in train
    saving_listeners=saving_listeners
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 356, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1181, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1211, in _train_model_default
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2186, in _call_model_fn
    features, labels, mode, config)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1169, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2470, in _model_fn
    features, labels, is_export_mode=is_export_mode)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1250, in call_without_tpu
    return self._call_model_fn(features, labels, is_export_mode=is_export_mode)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1524, in _call_model_fn
    estimator_spec = self._model_fn(features=features, **kwargs)
  File "pretrain_on_vcr.py", line 148, in model_fn
    use_one_hot_embeddings=use_one_hot_embeddings)
  File "/data1/cx/r2c/data/get_bert_embeddings/modeling.py", line 216, in __init__
    do_return_all_layers=True)
  File "/data1/cx/r2c/data/get_bert_embeddings/modeling.py", line 879, in transformer_model
    kernel_initializer=create_initializer(initializer_range))
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/layers/core.py", line 184, in dense
    return layer.apply(inputs)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 828, in apply
    return self.__call__(inputs, *args, **kwargs)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 364, in __call__
    outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 769, in __call__
    outputs = self.call(inputs, *args, **kwargs)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/keras/layers/core.py", line 951, in call
    return self.activation(outputs)  # pylint: disable=not-callable
  File "/data1/cx/r2c/data/get_bert_embeddings/modeling.py", line 276, in gelu
    cdf = 0.5 * (1.0 + tf.erf(input_tensor / tf.sqrt(2.0)))
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 862, in binary_op_wrapper
    return func(x, y, name=name)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 970, in _truediv_python3
    return gen_math_ops.real_div(x, y, name=name)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py", line 5989, in real_div
    "RealDiv", x=x, y=y, name=name)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3272, in create_op
    op_def=op_def)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1768, in __init__
    self._traceback = tf_stack.extract_stack()

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[1024,4096] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[{{node bert/encoder/layer_15/intermediate/dense/truediv}} = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](bert/encoder/layer_15/intermediate/dense/BiasAdd, ConstantFolding/gradients/bert/encoder/layer_0/intermediate/dense/truediv_grad/RealDiv_recip)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	 [[{{node add_1/_9593}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_6797_add_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.


INFO:tensorflow:training_loop marked as finished
WARNING:tensorflow:Reraising captured error
Traceback (most recent call last):
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1292, in _do_call
    return fn(*args)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1277, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1367, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[1024,4096] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[{{node bert/encoder/layer_15/intermediate/dense/truediv}} = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](bert/encoder/layer_15/intermediate/dense/BiasAdd, ConstantFolding/gradients/bert/encoder/layer_0/intermediate/dense/truediv_grad/RealDiv_recip)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	 [[{{node add_1/_9593}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_6797_add_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "pretrain_on_vcr.py", line 467, in <module>
    estimator.train(input_fn=train_input_fn, max_steps=FLAGS.num_train_steps)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2400, in train
    rendezvous.raise_errors()
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/error_handling.py", line 128, in raise_errors
    six.reraise(typ, value, traceback)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/six.py", line 693, in reraise
    raise value
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2394, in train
    saving_listeners=saving_listeners
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 356, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1181, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1215, in _train_model_default
    saving_listeners)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1409, in _train_with_estimator_spec
    _, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 671, in run
    run_metadata=run_metadata)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1148, in run
    run_metadata=run_metadata)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1239, in run
    raise six.reraise(*original_exc_info)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/six.py", line 693, in reraise
    raise value
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1224, in run
    return self._sess.run(*args, **kwargs)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1296, in run
    run_metadata=run_metadata)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1076, in run
    return self._sess.run(*args, **kwargs)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 887, in run
    run_metadata_ptr)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1110, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1286, in _do_run
    run_metadata)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1308, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[1024,4096] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[{{node bert/encoder/layer_15/intermediate/dense/truediv}} = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](bert/encoder/layer_15/intermediate/dense/BiasAdd, ConstantFolding/gradients/bert/encoder/layer_0/intermediate/dense/truediv_grad/RealDiv_recip)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	 [[{{node add_1/_9593}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_6797_add_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.


Caused by op 'bert/encoder/layer_15/intermediate/dense/truediv', defined at:
  File "pretrain_on_vcr.py", line 467, in <module>
    estimator.train(input_fn=train_input_fn, max_steps=FLAGS.num_train_steps)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2394, in train
    saving_listeners=saving_listeners
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 356, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1181, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1211, in _train_model_default
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2186, in _call_model_fn
    features, labels, mode, config)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1169, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2470, in _model_fn
    features, labels, is_export_mode=is_export_mode)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1250, in call_without_tpu
    return self._call_model_fn(features, labels, is_export_mode=is_export_mode)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1524, in _call_model_fn
    estimator_spec = self._model_fn(features=features, **kwargs)
  File "pretrain_on_vcr.py", line 148, in model_fn
    use_one_hot_embeddings=use_one_hot_embeddings)
  File "/data1/cx/r2c/data/get_bert_embeddings/modeling.py", line 216, in __init__
    do_return_all_layers=True)
  File "/data1/cx/r2c/data/get_bert_embeddings/modeling.py", line 879, in transformer_model
    kernel_initializer=create_initializer(initializer_range))
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/layers/core.py", line 184, in dense
    return layer.apply(inputs)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 828, in apply
    return self.__call__(inputs, *args, **kwargs)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 364, in __call__
    outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 769, in __call__
    outputs = self.call(inputs, *args, **kwargs)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/keras/layers/core.py", line 951, in call
    return self.activation(outputs)  # pylint: disable=not-callable
  File "/data1/cx/r2c/data/get_bert_embeddings/modeling.py", line 276, in gelu
    cdf = 0.5 * (1.0 + tf.erf(input_tensor / tf.sqrt(2.0)))
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 862, in binary_op_wrapper
    return func(x, y, name=name)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 970, in _truediv_python3
    return gen_math_ops.real_div(x, y, name=name)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py", line 5989, in real_div
    "RealDiv", x=x, y=y, name=name)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3272, in create_op
    op_def=op_def)
  File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1768, in __init__
    self._traceback = tf_stack.extract_stack()

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[1024,4096] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[{{node bert/encoder/layer_15/intermediate/dense/truediv}} = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](bert/encoder/layer_15/intermediate/dense/BiasAdd, ConstantFolding/gradients/bert/encoder/layer_0/intermediate/dense/truediv_grad/RealDiv_recip)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	 [[{{node add_1/_9593}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_6797_add_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
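
A rough check of the numbers in the log above, as a minimal sketch: it compares the size of the tensor that failed to allocate (shape[1024,4096], type float) with the headroom implied by the allocator's `Limit` and `InUse` stats. The variable names are illustrative, and the 4-byte element size assumes `DT_FLOAT` is a 32-bit float. The real allocator also needs a contiguous free chunk, so this estimate is, if anything, optimistic.

```python
# Values copied from the BFC allocator "Stats:" block in the log.
limit_bytes = 10895235482
in_use_bytes = 10891294208

# The allocation that failed: shape [1024, 4096], type float.
# Assumption: DT_FLOAT is a 32-bit float, i.e. 4 bytes per element.
shape = (1024, 4096)
bytes_per_element = 4

requested_bytes = shape[0] * shape[1] * bytes_per_element
free_bytes = limit_bytes - in_use_bytes

mib = 2 ** 20
print(f"requested: {requested_bytes} bytes ({requested_bytes / mib:.2f} MiB)")
print(f"headroom:  {free_bytes} bytes ({free_bytes / mib:.2f} MiB)")
print("fits" if requested_bytes <= free_bytes else "does not fit")
```

Under these assumptions the request is 16 MiB against less than 4 MiB of headroom, which is consistent with the `ResourceExhaustedError` reported for `GPU_0_bfc`.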
	2019-08-26 21:17:42.954423: I tensorflow/core/common_runtime/bfc_allocator.cc:654] 2 Chunks of size 29364224 totalling 56.01MiB
	2019-08-26 21:17:42.954434: I tensorflow/core/common_runtime/bfc_allocator.cc:654] 1 Chunks of size 29425664 totalling 28.06MiB
	2019-08-26 21:17:42.954446: I tensorflow/core/common_runtime/bfc_allocator.cc:654] 1 Chunks of size 32751616 totalling 31.23MiB
	2019-08-26 21:17:42.954458: I tensorflow/core/common_runtime/bfc_allocator.cc:654] 6 Chunks of size 125018112 totalling 715.36MiB
	2019-08-26 21:17:42.954469: I tensorflow/core/common_runtime/bfc_allocator.cc:658] Sum Total of in-use chunks: 10.14GiB
	2019-08-26 21:17:42.954485: I tensorflow/core/common_runtime/bfc_allocator.cc:660] Stats:
	Limit: 10895235482
	InUse: 10891294208
	MaxInUse: 10891294208
	NumAllocs: 4208
	MaxAllocSize: 125018112

	2019-08-26 21:17:42.954655: W tensorflow/core/common_runtime/bfc_allocator.cc:275] ****************************************************************************************************
	2019-08-26 21:17:42.954734: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at cwise_ops_common.cc:70 : Resource exhausted: OOM when allocating tensor with shape[1024,4096] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	INFO:tensorflow:Error recorded from training_loop: OOM when allocating tensor with shape[1024,4096] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	[[{{node bert/encoder/layer_15/intermediate/dense/truediv}} = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](bert/encoder/layer_15/intermediate/dense/BiasAdd, ConstantFolding/gradients/bert/encoder/layer_0/intermediate/dense/truediv_grad/RealDiv_recip)]]
	Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	[[{{node add_1/_9593}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_6797_add_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
	Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.


	Caused by op 'bert/encoder/layer_15/intermediate/dense/truediv', defined at:
	File "pretrain_on_vcr.py", line 467, in <module>
	estimator.train(input_fn=train_input_fn, max_steps=FLAGS.num_train_steps)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2394, in train
	saving_listeners=saving_listeners
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 356, in train
	loss = self._train_model(input_fn, hooks, saving_listeners)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1181, in _train_model
	return self._train_model_default(input_fn, hooks, saving_listeners)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1211, in _train_model_default
	features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2186, in _call_model_fn
	features, labels, mode, config)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1169, in _call_model_fn
	model_fn_results = self._model_fn(features=features, **kwargs)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2470, in _model_fn
	features, labels, is_export_mode=is_export_mode)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1250, in call_without_tpu
	return self._call_model_fn(features, labels, is_export_mode=is_export_mode)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1524, in _call_model_fn
	estimator_spec = self._model_fn(features=features, **kwargs)
	File "pretrain_on_vcr.py", line 148, in model_fn
	use_one_hot_embeddings=use_one_hot_embeddings)
	File "/data1/cx/r2c/data/get_bert_embeddings/modeling.py", line 216, in __init__
	do_return_all_layers=True)
	File "/data1/cx/r2c/data/get_bert_embeddings/modeling.py", line 879, in transformer_model
	kernel_initializer=create_initializer(initializer_range))
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/layers/core.py", line 184, in dense
	return layer.apply(inputs)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 828, in apply
	return self.__call__(inputs, args, *kwargs)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 364, in __call__
	outputs = super(Layer, self).__call__(inputs, args, *kwargs)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 769, in __call__
	outputs = self.call(inputs, args, *kwargs)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/keras/layers/core.py", line 951, in call
	return self.activation(outputs) # pylint: disable=not-callable
	File "/data1/cx/r2c/data/get_bert_embeddings/modeling.py", line 276, in gelu
	cdf = 0.5 * (1.0 + tf.erf(input_tensor / tf.sqrt(2.0)))
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 862, in binary_op_wrapper
	return func(x, y, name=name)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 970, in _truediv_python3
	return gen_math_ops.real_div(x, y, name=name)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py", line 5989, in real_div
	"RealDiv", x=x, y=y, name=name)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
	op_def=op_def)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
	return func(args, *kwargs)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3272, in create_op
	op_def=op_def)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1768, in __init__
	self._traceback = tf_stack.extract_stack()

	ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[1024,4096] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	[[{{node bert/encoder/layer_15/intermediate/dense/truediv}} = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](bert/encoder/layer_15/intermediate/dense/BiasAdd, ConstantFolding/gradients/bert/encoder/layer_0/intermediate/dense/truediv_grad/RealDiv_recip)]]
	Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	[[{{node add_1/_9593}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_6797_add_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
	Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.


	INFO:tensorflow:training_loop marked as finished
	WARNING:tensorflow:Reraising captured error
	Traceback (most recent call last):
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1292, in _do_call
	return fn(*args)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1277, in _run_fn
	options, feed_dict, fetch_list, target_list, run_metadata)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1367, in _call_tf_sessionrun
	run_metadata)
	tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[1024,4096] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	[[{{node bert/encoder/layer_15/intermediate/dense/truediv}} = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](bert/encoder/layer_15/intermediate/dense/BiasAdd, ConstantFolding/gradients/bert/encoder/layer_0/intermediate/dense/truediv_grad/RealDiv_recip)]]
	Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	[[{{node add_1/_9593}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_6797_add_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
	Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.


	During handling of the above exception, another exception occurred:

	Traceback (most recent call last):
	File "pretrain_on_vcr.py", line 467, in <module>
	estimator.train(input_fn=train_input_fn, max_steps=FLAGS.num_train_steps)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2400, in train
	rendezvous.raise_errors()
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/error_handling.py", line 128, in raise_errors
	six.reraise(typ, value, traceback)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/six.py", line 693, in reraise
	raise value
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2394, in train
	saving_listeners=saving_listeners
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 356, in train
	loss = self._train_model(input_fn, hooks, saving_listeners)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1181, in _train_model
	return self._train_model_default(input_fn, hooks, saving_listeners)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1215, in _train_model_default
	saving_listeners)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1409, in _train_with_estimator_spec
	_, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 671, in run
	run_metadata=run_metadata)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1148, in run
	run_metadata=run_metadata)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1239, in run
	raise six.reraise(*original_exc_info)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/six.py", line 693, in reraise
	raise value
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1224, in run
	return self._sess.run(*args, **kwargs)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1296, in run
	run_metadata=run_metadata)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1076, in run
	return self._sess.run(*args, **kwargs)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 887, in run
	run_metadata_ptr)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1110, in _run
	feed_dict_tensor, options, run_metadata)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1286, in _do_run
	run_metadata)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1308, in _do_call
	raise type(e)(node_def, op, message)
	tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[1024,4096] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	[[{{node bert/encoder/layer_15/intermediate/dense/truediv}} = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](bert/encoder/layer_15/intermediate/dense/BiasAdd, ConstantFolding/gradients/bert/encoder/layer_0/intermediate/dense/truediv_grad/RealDiv_recip)]]
	Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	[[{{node add_1/_9593}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_6797_add_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
	Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.


	Caused by op 'bert/encoder/layer_15/intermediate/dense/truediv', defined at:
	File "pretrain_on_vcr.py", line 467, in <module>
	estimator.train(input_fn=train_input_fn, max_steps=FLAGS.num_train_steps)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2394, in train
	saving_listeners=saving_listeners
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 356, in train
	loss = self._train_model(input_fn, hooks, saving_listeners)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1181, in _train_model
	return self._train_model_default(input_fn, hooks, saving_listeners)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1211, in _train_model_default
	features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2186, in _call_model_fn
	features, labels, mode, config)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1169, in _call_model_fn
	model_fn_results = self._model_fn(features=features, **kwargs)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2470, in _model_fn
	features, labels, is_export_mode=is_export_mode)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1250, in call_without_tpu
	return self._call_model_fn(features, labels, is_export_mode=is_export_mode)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1524, in _call_model_fn
	estimator_spec = self._model_fn(features=features, **kwargs)
	File "pretrain_on_vcr.py", line 148, in model_fn
	use_one_hot_embeddings=use_one_hot_embeddings)
	File "/data1/cx/r2c/data/get_bert_embeddings/modeling.py", line 216, in __init__
	do_return_all_layers=True)
	File "/data1/cx/r2c/data/get_bert_embeddings/modeling.py", line 879, in transformer_model
	kernel_initializer=create_initializer(initializer_range))
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/layers/core.py", line 184, in dense
	return layer.apply(inputs)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 828, in apply
	return self.__call__(inputs, *args, **kwargs)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 364, in __call__
	outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 769, in __call__
	outputs = self.call(inputs, *args, **kwargs)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/keras/layers/core.py", line 951, in call
	return self.activation(outputs) # pylint: disable=not-callable
	File "/data1/cx/r2c/data/get_bert_embeddings/modeling.py", line 276, in gelu
	cdf = 0.5 * (1.0 + tf.erf(input_tensor / tf.sqrt(2.0)))
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 862, in binary_op_wrapper
	return func(x, y, name=name)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 970, in _truediv_python3
	return gen_math_ops.real_div(x, y, name=name)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py", line 5989, in real_div
	"RealDiv", x=x, y=y, name=name)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
	op_def=op_def)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
	return func(*args, **kwargs)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3272, in create_op
	op_def=op_def)
	File "/home/yuweijiang/anaconda3/envs/vcr/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1768, in __init__
	self._traceback = tf_stack.extract_stack()

	ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[1024,4096] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	[[{{node bert/encoder/layer_15/intermediate/dense/truediv}} = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](bert/encoder/layer_15/intermediate/dense/BiasAdd, ConstantFolding/gradients/bert/encoder/layer_0/intermediate/dense/truediv_grad/RealDiv_recip)]]
	Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	[[{{node add_1/_9593}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_6797_add_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
	Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
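
The numbers in this log are enough to see why the allocation failed. The following is a rough back-of-the-envelope sketch, not part of the original run: it takes `Limit` and `InUse` from the `bfc_allocator` stats printed earlier in this log and the tensor shape from the OOM message, and assumes `float` means 4-byte `DT_FLOAT`. It ignores fragmentation inside the allocator, so the real headroom may be even smaller than the figure it prints. The helper name `tensor_bytes` is made up for illustration.

```python
# Figures copied from this log: the bfc_allocator "Stats:" block and the OOM message.
LIMIT_BYTES = 10895235482      # "Limit"
IN_USE_BYTES = 10891294208     # "InUse"
TENSOR_SHAPE = (1024, 4096)    # shape[1024,4096] from the ResourceExhaustedError
BYTES_PER_FLOAT32 = 4          # DT_FLOAT is a 32-bit float


def tensor_bytes(shape, itemsize):
    """Return the size in bytes of a dense tensor with the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count * itemsize


requested_bytes = tensor_bytes(TENSOR_SHAPE, BYTES_PER_FLOAT32)
free_bytes = LIMIT_BYTES - IN_USE_BYTES  # headroom, ignoring fragmentation
fits = requested_bytes <= free_bytes

MIB = 2 ** 20
print("requested: %.2f MiB" % (requested_bytes / MIB))
print("free:      %.2f MiB" % (free_bytes / MIB))
print("fits:      %s" % fits)
```

The requested tensor is several times larger than the remaining headroom, so a smaller batch size or sequence length is the usual first thing to try.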