@manish-kumar-garg · Created December 26, 2019 06:47
Returnn - asr_local_attention
RETURNN starting up, version 20191217.234858--git-09b41c6f-dirty, date/time 2019-12-26-06-45-58 (UTC+0000), pid 20215, cwd /home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention, Python /home/ubuntu/tf1.13/bin/python3
RETURNN command line options: ['local-heuristic.argmax.win05.exp3.ctc.config']
Hostname: ip-10-1-21-241
/home/ubuntu/tf1.13/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/ubuntu/tf1.13/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/ubuntu/tf1.13/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:528: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/ubuntu/tf1.13/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:529: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/ubuntu/tf1.13/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:530: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/ubuntu/tf1.13/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:535: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
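(Note: these FutureWarnings come from TensorFlow 1.13's dtypes.py passing the deprecated (type, 1) dtype spec to NumPy 1.17+; they are harmless for this run. If you want them silenced — an assumption about the environment, not something this log confirms — the usual workaround is to pin NumPy below 1.17 in the tf1.13 virtualenv, e.g. pip install "numpy<1.17".)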
TensorFlow: 1.13.1 (b'v1.13.1-0-g6612da8951') (<site-package> in /home/ubuntu/tf1.13/lib/python3.6/site-packages/tensorflow)
Setup TF inter and intra global thread pools, num_threads None, session opts {'log_device_placement': False, 'device_count': {'GPU': 0}}.
2019-12-26 06:45:59.202047: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-12-26 06:45:59.425681: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-26 06:45:59.429258: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-26 06:45:59.430831: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x55e4d8dd5850 executing computations on platform CUDA. Devices:
2019-12-26 06:45:59.430864: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): Tesla K80, Compute Capability 3.7
2019-12-26 06:45:59.430881: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (1): Tesla K80, Compute Capability 3.7
2019-12-26 06:45:59.450542: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2300090000 Hz
2019-12-26 06:45:59.452625: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x55e4d9498d60 executing computations on platform Host. Devices:
2019-12-26 06:45:59.452659: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): <undefined>, <undefined>
2019-12-26 06:45:59.452768: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-12-26 06:45:59.452791: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]
CUDA_VISIBLE_DEVICES is set to '1,2'.
Collecting TensorFlow device list...
2019-12-26 06:45:59.455892: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties:
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:00:18.0
totalMemory: 11.17GiB freeMemory: 446.06MiB
2019-12-26 06:45:59.456040: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 1 with properties:
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:00:19.0
totalMemory: 11.17GiB freeMemory: 446.06MiB
2019-12-26 06:45:59.456482: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0, 1
2019-12-26 06:45:59.459962: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-12-26 06:45:59.459989: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0 1
2019-12-26 06:45:59.460007: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N Y
2019-12-26 06:45:59.460016: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 1: Y N
2019-12-26 06:45:59.460292: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/device:GPU:0 with 221 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:18.0, compute capability: 3.7)
2019-12-26 06:45:59.463035: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/device:GPU:1 with 221 MB memory) -> physical GPU (device: 1, name: Tesla K80, pci bus id: 0000:00:19.0, compute capability: 3.7)
Local devices available to TensorFlow:
1/6: name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 13382603103062640486
2/6: name: "/device:XLA_GPU:0"
device_type: "XLA_GPU"
memory_limit: 17179869184
locality {
}
incarnation: 13903808377023195374
physical_device_desc: "device: XLA_GPU device"
3/6: name: "/device:XLA_GPU:1"
device_type: "XLA_GPU"
memory_limit: 17179869184
locality {
}
incarnation: 10467231846590301634
physical_device_desc: "device: XLA_GPU device"
4/6: name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
memory_limit: 17179869184
locality {
}
incarnation: 2300646191363693310
physical_device_desc: "device: XLA_CPU device"
5/6: name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 231800832
locality {
bus_id: 1
links {
link {
device_id: 1
type: "StreamExecutor"
strength: 1
}
}
}
incarnation: 15532672733655798053
physical_device_desc: "device: 0, name: Tesla K80, pci bus id: 0000:00:18.0, compute capability: 3.7"
6/6: name: "/device:GPU:1"
device_type: "GPU"
memory_limit: 231800832
locality {
bus_id: 1
links {
link {
type: "StreamExecutor"
strength: 1
}
}
}
incarnation: 4180441292150800506
physical_device_desc: "device: 1, name: Tesla K80, pci bus id: 0000:00:19.0, compute capability: 3.7"
Using gpu device 1: Tesla K80
Using gpu device 2: Tesla K80
<LibriSpeechCorpus 'train' epoch=1>, epoch 1. Old mean seq len (transcription) is 183.267376, new is 63.708029, requested max is 75.000000. Old num seqs is 6575, new num seqs is 822.
<LibriSpeechCorpus 'train' epoch=1>, epoch 1. Old num seqs 14063, new num seqs 822.
Train data:
input: 40 x 1
output: {'classes': [10025, 1], 'raw': {'dtype': 'string', 'shape': ()}, 'data': [40, 2]}
LibriSpeechCorpus, sequences: 822, frames: unknown
Dev data:
LibriSpeechCorpus, sequences: 3000, frames: unknown
Learning-rate-control: file newbob.data does not exist yet
Update config key 'max_seq_length' for epoch 1: {'classes': 75} -> {'classes': 60}
Setup tf.Session with options {'log_device_placement': False, 'device_count': {'GPU': 1}} ...
2019-12-26 06:46:05.591177: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0, 1
2019-12-26 06:46:05.591272: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-12-26 06:46:05.591293: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0 1
2019-12-26 06:46:05.591310: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N Y
2019-12-26 06:46:05.591325: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 1: Y N
2019-12-26 06:46:05.591489: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 221 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:18.0, compute capability: 3.7)
layer root/'data' output: Data(name='data', shape=(None, 40), batch_shape_meta=[B,T|'time:var:extern_data:data',F|40])
layer root/'source' output: Data(name='source_output', shape=(None, 40), batch_shape_meta=[B,T|'time:var:extern_data:data',F|40])
layer root/'lstm0_fw' output: Data(name='lstm0_fw_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'time:var:extern_data:data',B,F|1024])
layer root/'lstm0_bw' output: Data(name='lstm0_bw_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'time:var:extern_data:data',B,F|1024])
layer root/'lstm0_pool' output: Data(name='lstm0_pool_output', shape=(None, 2048), batch_shape_meta=[B,T|?,F|2048])
layer root/'lstm5_fw' output: Data(name='lstm5_fw_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1024])
layer root/'lstm5_bw' output: Data(name='lstm5_bw_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1024])
layer root/'encoder' output: Data(name='encoder_output', shape=(None, 2048), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|2048])
layer root/'ctc' output: Data(name='ctc_output', shape=(None, 10026), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|10026])
layer root/'enc_ctx' output: Data(name='enc_ctx_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1024])
layer root/'inv_fertility' output: Data(name='inv_fertility_output', shape=(None, 1), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1])
layer root/'enc_value' output: Data(name='enc_value_output', shape=(None, 1, 2048), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,1,F|2048])
<_SubnetworkRecCell of None>: exception constructing template network (for deps and data shapes)
Most recent construction stack:
<_TemplateLayer(EvalLayer)(:template:eval) 'output/p_t_in' out_type=Data(shape=(), time_dim_axis=None, batch_shape_meta=[B]) (construction stack 'att_weights')>, kwargs:
{'eval': 'tf.squeeze(tf.argmax(source(0), axis=1, output_type=tf.int32), '
'axis=1)',
'name': 'p_t_in',
'network': <TFNetwork 'root/output:rec-subnet' parent_net=<TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>,
'out_type': {'batch_dim_axis': 0, 'dtype': 'float32', 'shape': ()},
'sources': [<_TemplateLayer(SoftmaxOverSpatialLayer)(:prev:softmax_over_spatial) 'output/prev:att_weights' out_type=Data(shape=(1, None), time_dim_axis=2, feature_dim_axis=1, batch_shape_meta=[B,F|1,T|'spatial:0:lstm0_pool']) (construction stack None)>]}
Template network so far:
{'accum_att_weights': <_TemplateLayer(EvalLayer)(:template:eval) 'output/accum_att_weights' out_type=Data(shape=(None, 1), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1]) (construction stack 'weight_feedback')>,
'att': <_TemplateLayer(MergeDimsLayer)(:template:merge_dims) 'output/att' out_type=Data(shape=(2048,), time_dim_axis=None, batch_shape_meta=[B,F|2048]) (construction stack 's')>,
'att0': <_TemplateLayer(GenericAttentionLayer)(:template:generic_attention) 'output/att0' out_type=Data(shape=(1, 2048), time_dim_axis=None, batch_shape_meta=[B,1,F|2048]) (construction stack 'att')>,
'att_weights': <_TemplateLayer(SoftmaxOverSpatialLayer)(:template:softmax_over_spatial) 'output/att_weights' out_type=Data(shape=(1, None), time_dim_axis=2, feature_dim_axis=1, batch_shape_meta=[B,F|1,T|'spatial:0:lstm0_pool']) (construction stack 'att0')>,
'end': <_TemplateLayer(CompareLayer)(:template:compare) 'output/end' out_type=Data(shape=(), dtype='bool', sparse=True, dim=2, time_dim_axis=None, batch_shape_meta=[B]) (construction stack None)>,
'energy': <_TemplateLayer(LinearLayer)(:template:linear) 'output/energy' out_type=Data(shape=(None, 1), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1]) (construction stack 'energy_reinterpreted')>,
'energy_in': <_TemplateLayer(CombineLayer)(:template:combine) 'output/energy_in' out_type=Data(shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1024]) (construction stack 'energy_tanh')>,
'energy_reinterpreted': <_TemplateLayer(ReinterpretDataLayer)(:template:reinterpret_data) 'output/energy_reinterpreted' out_type=Data(shape=(None, 1), batch_shape_meta=[B,T|'spatial:0:lstm0_pool',F|1]) (construction stack 'att_weights')>,
'energy_tanh': <_TemplateLayer(ActivationLayer)(:template:activation) 'output/energy_tanh' out_type=Data(shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1024]) (construction stack 'energy')>,
'output': <_TemplateLayer(ChoiceLayer)(:template:choice) 'output/output' out_type=Data(shape=(), dtype='int32', sparse=True, dim=10025, time_dim_axis=None, batch_shape_meta=[B]) (construction stack None)>,
'output_prob': <_TemplateLayer(SoftmaxLayer)(:template:softmax) 'output/output_prob' out_type=Data(shape=(10025,), time_dim_axis=None, batch_shape_meta=[B,F|10025]) (construction stack None)>,
'p_t_in': <_TemplateLayer(EvalLayer)(:template:eval) 'output/p_t_in' out_type=Data(shape=(), time_dim_axis=None, batch_shape_meta=[B]) (construction stack 'att_weights')>,
'readout': <_TemplateLayer(ReduceOutLayer)(:template:reduce_out) 'output/readout' out_type=Data(shape=(500,), time_dim_axis=None, batch_shape_meta=[B,F|500]) (construction stack 'output_prob')>,
'readout_in': <_TemplateLayer(LinearLayer)(:template:linear) 'output/readout_in' out_type=Data(shape=(1000,), time_dim_axis=None, batch_shape_meta=[B,F|1000]) (construction stack 'readout')>,
's': <_TemplateLayer(RnnCellLayer)(:template:rnn_cell) 'output/s' out_type=Data(shape=(1000,), time_dim_axis=None, batch_shape_meta=[B,F|1000]) (construction stack 'readout_in')>,
's_transformed': <_TemplateLayer(LinearLayer)(:template:linear) 'output/s_transformed' out_type=Data(shape=(1024,), time_dim_axis=None, batch_shape_meta=[B,F|1024]) (construction stack 'energy_in')>,
'target_embed': <_TemplateLayer(LinearLayer)(:template:linear) 'output/target_embed' out_type=Data(shape=(621,), time_dim_axis=None, batch_shape_meta=[B,F|621]) (construction stack 's')>,
'weight_feedback': <_TemplateLayer(LinearLayer)(:template:linear) 'output/weight_feedback' out_type=Data(shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1024]) (construction stack 'energy_in')>}
Collected (unique) exceptions during template construction:
(Note that many of these can be ignored, or are expected.)
EXCEPTION
NetworkConstructionDependencyLoopException: Error: There is a dependency loop on layer 'accum_att_weights'.
Construction stack (most recent first):
accum_att_weights
weight_feedback
energy_in
energy_tanh
energy
energy_reinterpreted
att_weights
att0
att
s
readout_in
readout
output_prob
EXCEPTION
CannotHandleUndefinedSourcesException: 's_transformed': cannot handle undefined sources without defined out_type.
{'activation': None,
'loss': None,
'n_out': 1024,
'network': <TFNetwork 'root/output:rec-subnet' parent_net=<TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>,
'size_target': None,
'sources': [None],
'target': None,
'with_bias': False}
Exception creating layer root/'output' of class RecLayer with opts:
{'_target_layers': {},
'cheating': False,
'max_seq_len': <tf.Tensor 'max_seq_len_encoder:0' shape=() dtype=int32>,
'n_out': <class 'Util.NotSpecified'>,
'name': 'output',
'network': <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>,
'sources': [],
'target': 'classes',
'unit': {'accum_att_weights': {'class': 'eval',
'eval': 'source(0) + source(1) * source(2) * '
'0.5',
'from': ['prev:accum_att_weights',
'att_weights',
'base:inv_fertility'],
'out_type': {'dim': 1, 'shape': (None, 1)}},
'att': {'axes': 'except_batch',
'class': 'merge_dims',
'from': ['att0']},
'att0': {'base': 'base:enc_value',
'class': 'generic_attention',
'weights': 'att_weights'},
'att_weights': {'class': 'softmax_over_spatial',
'from': ['energy_reinterpreted'],
'window_size': 5,
'window_start': 'p_t_in'},
'end': {'class': 'compare', 'from': ['output'], 'value': 0},
'energy': {'activation': None,
'class': 'linear',
'from': ['energy_tanh'],
'n_out': 1,
'with_bias': False},
'energy_in': {'class': 'combine',
'from': ['base:enc_ctx',
'weight_feedback',
's_transformed'],
'kind': 'add',
'n_out': 1024},
'energy_reinterpreted': {'class': 'reinterpret_data',
'enforce_batch_major': True,
'from': 'energy',
'trainable': False},
'energy_tanh': {'activation': 'tanh',
'class': 'activation',
'from': ['energy_in']},
'output': {'beam_size': 12,
'cheating': False,
'class': 'choice',
'from': ['output_prob'],
'initial_output': 0,
'target': 'classes'},
'output_prob': {'class': 'softmax',
'dropout': 0.3,
'from': ['readout'],
'loss': 'ce',
'loss_only_on_non_search': True,
'loss_opts': {'label_smoothing': 0},
'target': 'classes'},
'p_t': {'class': 'eval',
'eval': 'tf.to_float(source(0))',
'from': 'p_t_in'},
'p_t_in': {'class': 'eval',
'eval': 'tf.squeeze(tf.argmax(source(0), axis=1, '
'output_type=tf.int32), axis=1)',
'from': 'prev:att_weights',
'out_type': {'batch_dim_axis': 0,
'dtype': 'float32',
'shape': ()}},
'readout': {'class': 'reduce_out',
'from': ['readout_in'],
'mode': 'max',
'num_pieces': 2},
'readout_in': {'activation': None,
'class': 'linear',
'from': ['s', 'prev:target_embed', 'att'],
'n_out': 1000},
's': {'class': 'rnn_cell',
'from': ['prev:target_embed', 'prev:att'],
'n_out': 1000,
'unit': 'LSTMBlock'},
's_transformed': {'activation': None,
'class': 'linear',
'from': ['s'],
'n_out': 1024,
'with_bias': False},
'target_embed': {'activation': None,
'class': 'linear',
'from': ['output'],
'initial_output': 0,
'n_out': 621,
'with_bias': False},
'weight_feedback': {'activation': None,
'class': 'linear',
'from': ['prev:accum_att_weights'],
'n_out': 1024,
'with_bias': False}}}
Unhandled exception <class 'AssertionError'> in thread <_MainThread(MainThread, started 140613301045056)>, proc 20215.
Thread current, main, <_MainThread(MainThread, started 140613301045056)>:
(Excluded thread.)
That were all threads.
EXCEPTION
Traceback (most recent call last):
File "./returnn/rnn.py", line 654, in <module>
line: main(sys.argv)
locals:
main = <local> <function main at 0x7fe30c132158>
sys = <local> <module 'sys' (built-in)>
sys.argv = <local> ['./returnn/rnn.py', 'local-heuristic.argmax.win05.exp3.ctc.config'], _[0]: {len = 16}
File "./returnn/rnn.py", line 642, in main
line: execute_main_task()
locals:
execute_main_task = <global> <function execute_main_task at 0x7fe30c132048>
File "./returnn/rnn.py", line 451, in execute_main_task
line: engine.init_train_from_config(config, train_data, dev_data, eval_data)
locals:
engine = <global> <TFEngine.Engine object at 0x7fe04dbe8748>
engine.init_train_from_config = <global> <bound method Engine.init_train_from_config of <TFEngine.Engine object at 0x7fe04dbe8748>>
config = <global> <Config.Config object at 0x7fe3147c69e8>
train_data = <global> <LibriSpeechCorpus 'train' epoch=1>
dev_data = <global> <LibriSpeechCorpus 'dev' epoch=1>
eval_data = <global> None
File "/home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFEngine.py", line 891, in init_train_from_config
line: self.init_network_from_config(config)
locals:
self = <local> <TFEngine.Engine object at 0x7fe04dbe8748>
self.init_network_from_config = <local> <bound method Engine.init_network_from_config of <TFEngine.Engine object at 0x7fe04dbe8748>>
config = <local> <Config.Config object at 0x7fe3147c69e8>
File "/home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFEngine.py", line 934, in init_network_from_config
line: self._init_network(net_desc=net_dict, epoch=self.epoch)
locals:
self = <local> <TFEngine.Engine object at 0x7fe04dbe8748>
self._init_network = <local> <bound method Engine._init_network of <TFEngine.Engine object at 0x7fe04dbe8748>>
net_desc = <not found>
net_dict = <local> {'source': {'class': 'eval', 'eval': 'tf.clip_by_value(source(0), -3.0, 3.0)'}, 'lstm0_fw': {'class': 'rec', 'unit': 'nativelstm2', 'n_out': 1024, 'direction': 1, 'from': ['source']}, 'lstm0_bw': {'class': 'rec', 'unit': 'nativelstm2', 'n_out': 1024, 'direction': -1, 'from': ['source']}, 'lstm0_p..., len = 14
epoch = <local> None
self.epoch = <local> 1
File "/home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFEngine.py", line 1081, in _init_network
line: self.network, self.updater = self.create_network(
config=self.config,
rnd_seed=net_random_seed,
train_flag=train_flag, eval_flag=self.use_eval_flag, search_flag=self.use_search_flag,
initial_learning_rate=getattr(self, "initial_learning_rate", None),
net_dict=net_desc)
locals:
self = <local> <TFEngine.Engine object at 0x7fe04dbe8748>
self.network = <local> None
self.updater = <local> None
self.create_network = <local> <bound method Engine.create_network of <class 'TFEngine.Engine'>>
config = <not found>
self.config = <local> <Config.Config object at 0x7fe3147c69e8>
rnd_seed = <not found>
net_random_seed = <local> 1
train_flag = <local> <tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>
eval_flag = <not found>
self.use_eval_flag = <local> True
search_flag = <not found>
self.use_search_flag = <local> False
initial_learning_rate = <not found>
getattr = <builtin> <built-in function getattr>
net_dict = <not found>
net_desc = <local> {'source': {'class': 'eval', 'eval': 'tf.clip_by_value(source(0), -3.0, 3.0)'}, 'lstm0_fw': {'class': 'rec', 'unit': 'nativelstm2', 'n_out': 1024, 'direction': 1, 'from': ['source']}, 'lstm0_bw': {'class': 'rec', 'unit': 'nativelstm2', 'n_out': 1024, 'direction': -1, 'from': ['source']}, 'lstm0_p..., len = 14
File "/home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFEngine.py", line 1113, in create_network
line: network.construct_from_dict(net_dict)
locals:
network = <local> <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
network.construct_from_dict = <local> <bound method TFNetwork.construct_from_dict of <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
net_dict = <local> {'source': {'class': 'eval', 'eval': 'tf.clip_by_value(source(0), -3.0, 3.0)'}, 'lstm0_fw': {'class': 'rec', 'unit': 'nativelstm2', 'n_out': 1024, 'direction': 1, 'from': ['source']}, 'lstm0_bw': {'class': 'rec', 'unit': 'nativelstm2', 'n_out': 1024, 'direction': -1, 'from': ['source']}, 'lstm0_p..., len = 14
File "/home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetwork.py", line 460, in construct_from_dict
line: self.construct_layer(net_dict, name)
locals:
self = <local> <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
self.construct_layer = <local> <bound method TFNetwork.construct_layer of <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
net_dict = <local> {'source': {'class': 'eval', 'eval': 'tf.clip_by_value(source(0), -3.0, 3.0)'}, 'lstm0_fw': {'class': 'rec', 'unit': 'nativelstm2', 'n_out': 1024, 'direction': 1, 'from': ['source']}, 'lstm0_bw': {'class': 'rec', 'unit': 'nativelstm2', 'n_out': 1024, 'direction': -1, 'from': ['source']}, 'lstm0_p..., len = 14
name = <local> 'decision', len = 8
File "/home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetwork.py", line 652, in construct_layer
line: layer_class.transform_config_dict(layer_desc, network=self, get_layer=get_layer)
locals:
layer_class = <local> <class 'TFNetworkRecLayer.DecideLayer'>
layer_class.transform_config_dict = <local> <bound method BaseChoiceLayer.transform_config_dict of <class 'TFNetworkRecLayer.DecideLayer'>>
layer_desc = <local> {'loss': 'edit_distance', 'target': 'classes', 'loss_only_on_non_search': False}
network = <not found>
self = <local> <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
get_layer = <local> <function TFNetwork.construct_layer.<locals>.get_layer at 0x7fe3147289d8>
File "/home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetworkRecLayer.py", line 4089, in transform_config_dict
line: super(BaseChoiceLayer, cls).transform_config_dict(d, network=network, get_layer=get_layer)
locals:
super = <builtin> <class 'super'>
BaseChoiceLayer = <global> <class 'TFNetworkRecLayer.BaseChoiceLayer'>
cls = <local> <class 'TFNetworkRecLayer.DecideLayer'>
transform_config_dict = <not found>
d = <local> {'loss': 'edit_distance', 'target': 'classes', 'loss_only_on_non_search': False}
network = <local> <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
get_layer = <local> <function TFNetwork.construct_layer.<locals>.get_layer at 0x7fe3147289d8>
File "/home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetworkLayer.py", line 448, in transform_config_dict
line: for src_name in src_names
locals:
src_name = <not found>
src_names = <local> ['output'], _[0]: {len = 6}
File "/home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetworkLayer.py", line 449, in <listcomp>
line: d["sources"] = [
get_layer(src_name)
for src_name in src_names
if not src_name == "none"]
locals:
d = <not found>
get_layer = <local> <function TFNetwork.construct_layer.<locals>.get_layer at 0x7fe3147289d8>
src_name = <local> 'output', len = 6
src_names = <not found>
File "/home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetwork.py", line 607, in get_layer
line: return self.construct_layer(net_dict=net_dict, name=src_name) # set get_layer to wrap construct_layer
locals:
self = <local> <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
self.construct_layer = <local> <bound method TFNetwork.construct_layer of <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
net_dict = <local> {'source': {'class': 'eval', 'eval': 'tf.clip_by_value(source(0), -3.0, 3.0)'}, 'lstm0_fw': {'class': 'rec', 'unit': 'nativelstm2', 'n_out': 1024, 'direction': 1, 'from': ['source']}, 'lstm0_bw': {'class': 'rec', 'unit': 'nativelstm2', 'n_out': 1024, 'direction': -1, 'from': ['source']}, 'lstm0_p..., len = 14
name = <not found>
src_name = <local> 'output', len = 6
File "/home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetwork.py", line 655, in construct_layer
line: return add_layer(name=name, layer_class=layer_class, **layer_desc)
locals:
add_layer = <local> <bound method TFNetwork.add_layer of <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
name = <local> 'output', len = 6
layer_class = <local> <class 'TFNetworkRecLayer.RecLayer'>
layer_desc = <local> {'cheating': False, 'unit': {'output': {'class': 'choice', 'target': 'classes', 'beam_size': 12, 'cheating': False, 'from': ['output_prob'], 'initial_output': 0}, 'end': {'class': 'compare', 'from': ['output'], 'value': 0}, 'target_embed': {'class': 'linear', 'activation': None, 'with_bias': Fals..., len = 7
File "/home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetwork.py", line 760, in add_layer
line: layer = self._create_layer(name=name, layer_class=layer_class, **layer_desc)
locals:
layer = <not found>
self = <local> <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
self._create_layer = <local> <bound method TFNetwork._create_layer of <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
name = <local> 'output', len = 6
layer_class = <local> <class 'TFNetworkRecLayer.RecLayer'>
layer_desc = <local> {'cheating': False, 'unit': {'output': {'class': 'choice', 'target': 'classes', 'beam_size': 12, 'cheating': False, 'from': ['output_prob'], 'initial_output': 0}, 'end': {'class': 'compare', 'from': ['output'], 'value': 0}, 'target_embed': {'class': 'linear', 'activation': None, 'with_bias': Fals..., len = 7
File "/home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetwork.py", line 701, in _create_layer
line: layer_desc["output"] = layer_class.get_out_data_from_opts(**layer_desc)
locals:
layer_desc = <local> {'cheating': False, 'unit': {'output': {'class': 'choice', 'target': 'classes', 'beam_size': 12, 'cheating': False, 'from': ['output_prob'], 'initial_output': 0}, 'end': {'class': 'compare', 'from': ['output'], 'value': 0}, 'target_embed': {'class': 'linear', 'activation': None, 'with_bias': Fals..., len = 9
layer_class = <local> <class 'TFNetworkRecLayer.RecLayer'>
layer_class.get_out_data_from_opts = <local> <bound method RecLayer.get_out_data_from_opts of <class 'TFNetworkRecLayer.RecLayer'>>
File "/home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetworkRecLayer.py", line 362, in get_out_data_from_opts
line: subnet = _SubnetworkRecCell(
parent_net=kwargs["network"], net_dict=unit, source_data=source_data, rec_layer_name=kwargs["name"])
locals:
subnet = <not found>
_SubnetworkRecCell = <global> <class 'TFNetworkRecLayer._SubnetworkRecCell'>
parent_net = <not found>
kwargs = <local> {'cheating': False, 'target': 'classes', 'max_seq_len': <tf.Tensor 'max_seq_len_encoder:0' shape=() dtype=int32>, 'n_out': <class 'Util.NotSpecified'>, '_target_layers': {}, 'name': 'output', 'network': <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>}, len = 7
net_dict = <not found>
unit = <local> {'output': {'class': 'choice', 'target': 'classes', 'beam_size': 12, 'cheating': False, 'from': ['output_prob'], 'initial_output': 0}, 'end': {'class': 'compare', 'from': ['output'], 'value': 0}, 'target_embed': {'class': 'linear', 'activation': None, 'with_bias': False, 'from': ['output'], 'n_ou..., len = 19
source_data = <local> None
rec_layer_name = <not found>
File "/home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetworkRecLayer.py", line 917, in __init__
line: self._construct_template()
locals:
self = <local> <_SubnetworkRecCell of None>
self._construct_template = <local> <bound method _SubnetworkRecCell._construct_template of <_SubnetworkRecCell of None>>
File "/home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetworkRecLayer.py", line 1229, in _construct_template
line: direct_get_layer.construct(layer.name)
locals:
direct_get_layer = <local> <RecLayer construct template GetLayer>(safe False, allow_construct_in_call_nrs None, allow_uninitialized_template False, count 1, parents None)
direct_get_layer.construct = <local> <bound method _SubnetworkRecCell._construct_template.<locals>.GetLayer.construct of <RecLayer construct template GetLayer>(safe False, allow_construct_in_call_nrs None, allow_uninitialized_template False, count 1, parents None)>
layer = <local> <_TemplateLayer(EvalLayer)(:template:eval) 'output/p_t_in' out_type=Data(shape=(), time_dim_axis=None, batch_shape_meta=[B]) (construction stack 'att_weights')>
layer.name = <local> 'p_t_in', len = 6
File "/home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetworkRecLayer.py", line 1021, in construct
line: self.__call__(layer_name_)
locals:
self = <local> <RecLayer construct template GetLayer>(safe False, allow_construct_in_call_nrs None, allow_uninitialized_template False, count 1, parents None)
self.__call__ = <local> <bound method _SubnetworkRecCell._construct_template.<locals>.GetLayer.__call__ of <RecLayer construct template GetLayer>(safe False, allow_construct_in_call_nrs None, allow_uninitialized_template False, count 1, parents None)>
layer_name_ = <local> 'p_t_in', len = 6
File "/home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetworkRecLayer.py", line 1188, in __call__
line: self.net.construct_layer(
net_dict=self.net_dict, name=name,
get_layer=default_get_layer, add_layer=default_get_layer.add_templated_layer)
locals:
self = <local> <_SubnetworkRecCell of None>
self.net = <local> <TFNetwork 'root/output:rec-subnet' parent_net=<TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
self.net.construct_layer = <local> <bound method TFNetwork.construct_layer of <TFNetwork 'root/output:rec-subnet' parent_net=<TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
net_dict = <not found>
self.net_dict = <local> {'output': {'class': 'choice', 'target': 'classes', 'beam_size': 12, 'cheating': False, 'from': ['output_prob'], 'initial_output': 0}, 'end': {'class': 'compare', 'from': ['output'], 'value': 0}, 'target_embed': {'class': 'linear', 'activation': None, 'with_bias': False, 'from': ['output'], 'n_ou..., len = 19
name = <local> 'p_t_in', len = 6
get_layer = <not found>
default_get_layer = <local> <RecLayer construct template GetLayer>(safe False, allow_construct_in_call_nrs None, allow_uninitialized_template False, count 0, parents 'p_t_in')
add_layer = <not found>
default_get_layer.add_templated_layer = <local> <bound method _SubnetworkRecCell._construct_template.<locals>.GetLayer.add_templated_layer of <RecLayer construct template GetLayer>(safe False, allow_construct_in_call_nrs None, allow_uninitialized_template False, count 0, parents 'p_t_in')>
File "/home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetwork.py", line 655, in construct_layer
line: return add_layer(name=name, layer_class=layer_class, **layer_desc)
locals:
add_layer = <local> <bound method _SubnetworkRecCell._construct_template.<locals>.GetLayer.add_templated_layer of <RecLayer construct template GetLayer>(safe False, allow_construct_in_call_nrs None, allow_uninitialized_template False, count 0, parents 'p_t_in')>
name = <local> 'p_t_in', len = 6
layer_class = <local> <class 'TFNetworkLayer.EvalLayer'>
layer_desc = <local> {'eval': 'tf.squeeze(tf.argmax(source(0), axis=1, output_type=tf.int32), axis=1)', 'out_type': {'shape': (), 'batch_dim_axis': 0, 'dtype': 'float32'}, 'sources': [<_TemplateLayer(SoftmaxOverSpatialLayer)(:prev:softmax_over_spatial) 'output/prev:att_weights' out_type=Data(shape=(1, None), time_dim...
File "/home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetworkRecLayer.py", line 1041, in add_templated_layer
line: output = layer_class.get_out_data_from_opts(**layer_desc)
locals:
output = <not found>
layer_class = <local> <class 'TFNetworkLayer.EvalLayer'>
layer_class.get_out_data_from_opts = <local> <bound method CombineLayer.get_out_data_from_opts of <class 'TFNetworkLayer.EvalLayer'>>
layer_desc = <local> {'eval': 'tf.squeeze(tf.argmax(source(0), axis=1, output_type=tf.int32), axis=1)', 'out_type': {'shape': (), 'batch_dim_axis': 0, 'dtype': 'float32'}, 'sources': [<_TemplateLayer(SoftmaxOverSpatialLayer)(:prev:softmax_over_spatial) 'output/prev:att_weights' out_type=Data(shape=(1, None), time_dim...
File "/home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetworkLayer.py", line 6069, in get_out_data_from_opts
line: return super(CombineLayer, cls).get_out_data_from_opts(n_out=n_out, out_type=out_type_, sources=sources, **kwargs)
locals:
super = <builtin> <class 'super'>
CombineLayer = <global> <class 'TFNetworkLayer.CombineLayer'>
cls = <local> <class 'TFNetworkLayer.EvalLayer'>
get_out_data_from_opts = <not found>
n_out = <local> <class 'Util.NotSpecified'>
out_type = <local> {'shape': (), 'batch_dim_axis': 0, 'dtype': 'float32'}
out_type_ = <local> {'name': 'p_t_in_output', 'shape': (), 'dtype': 'float32', 'sparse': False, 'dim': 1, 'batch_dim_axis': 0, 'time_dim_axis': 2, 'feature_dim_axis': 1}, len = 8
sources = <local> [<_TemplateLayer(SoftmaxOverSpatialLayer)(:prev:softmax_over_spatial) 'output/prev:att_weights' out_type=Data(shape=(1, None), time_dim_axis=2, feature_dim_axis=1, batch_shape_meta=[B,F|1,T|'spatial:0:lstm0_pool']) (construction stack None)>]
kwargs = <local> {'eval': 'tf.squeeze(tf.argmax(source(0), axis=1, output_type=tf.int32), axis=1)', 'name': 'p_t_in', 'network': <TFNetwork 'root/output:rec-subnet' parent_net=<TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=boo...
File "/home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetworkLayer.py", line 227, in get_out_data_from_opts
line: return cls._base_get_out_data_from_opts(**kwargs)
locals:
cls = <local> <class 'TFNetworkLayer.EvalLayer'>
cls._base_get_out_data_from_opts = <local> <bound method LayerBase._base_get_out_data_from_opts of <class 'TFNetworkLayer.EvalLayer'>>
kwargs = <local> {'n_out': <class 'Util.NotSpecified'>, 'out_type': {'name': 'p_t_in_output', 'shape': (), 'dtype': 'float32', 'sparse': False, 'dim': 1, 'batch_dim_axis': 0, 'time_dim_axis': 2, 'feature_dim_axis': 1}, 'sources': [<_TemplateLayer(SoftmaxOverSpatialLayer)(:prev:softmax_over_spatial) 'output/prev:a..., len = 6
File "/home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetworkLayer.py", line 324, in _base_get_out_data_from_opts
line: output = Data(**out_type)
locals:
output = <not found>
Data = <global> <class 'TFUtil.Data'>
out_type = <local> {'name': 'p_t_in_output', 'shape': (), 'dtype': 'float32', 'sparse': False, 'dim': 1, 'batch_dim_axis': 0, 'time_dim_axis': 2, 'feature_dim_axis': 1, 'beam': None}, len = 9
File "/home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFUtil.py", line 559, in __init__
line: assert 0 <= feature_dim_axis < self.batch_ndim
locals:
feature_dim_axis = <local> 1
self = <local> !AttributeError: 'Data' object has no attribute 'time_dim_axis'
self.batch_ndim = <local> 1
AssertionError
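(How the assertion fires: Data is constructed from the merged out_type shown above as out_type_ — 'shape': (), 'batch_dim_axis': 0, plus 'time_dim_axis': 2 and 'feature_dim_axis': 1 inherited from the source — so batch_ndim is 1 while feature_dim_axis is 1, and 0 <= feature_dim_axis < self.batch_ndim fails. The "!AttributeError" on self is just the locals dump failing to repr a half-initialized Data, not a second error. A minimal standalone sketch of the check, illustrative only and not RETURNN's actual Data class:

    # Illustrative reconstruction of the failing check at TFUtil.py:559;
    # not the real TFUtil.Data implementation.
    def check_axes(shape, batch_dim_axis, feature_dim_axis):
        # batch_ndim = number of axes including the batch axis
        batch_ndim = len(shape) + (1 if batch_dim_axis is not None else 0)
        if feature_dim_axis is not None:
            assert 0 <= feature_dim_axis < batch_ndim

    check_axes(shape=(), batch_dim_axis=0, feature_dim_axis=None)  # passes: scalar per batch entry
    check_axes(shape=(), batch_dim_axis=0, feature_dim_axis=1)     # AssertionError, as in this log
)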