Created
December 21, 2019 20:19
-
-
Save manish-kumar-garg/33f5def6069869ae5671e24ca52be69a to your computer and use it in GitHub Desktop.
Returnn
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
[ip-10-1-21-241:76137] Warning: could not find environment variable "HOROVOD_TIMELINE" | |
[ip-10-1-21-241:76137] Warning: could not find environment variable "DEBUG" | |
Horovod initialized. Hostname ip-10-1-21-241, pid 76142, rank 0 / size 2, local rank 0 / local size 2. | |
Horovod initialized. Hostname ip-10-1-21-241, pid 76143, rank 1 / size 2, local rank 1 / local size 2. | |
RETURNN starting up, version 20191217.234858--git-09b41c6f-dirty, date/time 2019-12-21-20-15-27 (UTC+0000), pid 76142, cwd /home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention, Python /home/ubuntu/anaconda3/envs/tensorflow_p36/bin/python3 | |
RETURNN starting up, version 20191217.234858--git-09b41c6f-dirty, date/time 2019-12-21-20-15-27 (UTC+0000), pid 76143, cwd /home/ubuntu/rwth-i6/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention, Python /home/ubuntu/anaconda3/envs/tensorflow_p36/bin/python3 | |
RETURNN command line options: ['returnn-distributed.config'] | |
Hostname: ip-10-1-21-241 | |
RETURNN command line options: ['returnn-distributed.config'] | |
Hostname: ip-10-1-21-241 | |
TensorFlow: 1.15.0 (v1.15.0-rc3-22-g590d6ee) (<site-package> in /home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow) | |
TensorFlow: 1.15.0 (v1.15.0-rc3-22-g590d6ee) (<site-package> in /home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow) | |
Horovod: Hostname ip-10-1-21-241, pid 76142, using GPU 0. | |
Horovod: Reduce type: grad | |
Horovod: Hostname ip-10-1-21-241, pid 76143, using GPU 1. | |
Setup TF inter and intra global thread pools, num_threads None, session opts {'gpu_options': {'visible_device_list': '0'}, 'log_device_placement': False, 'device_count': {'GPU': 0}}. | |
Setup TF inter and intra global thread pools, num_threads None, session opts {'gpu_options': {'visible_device_list': '1'}, 'log_device_placement': False, 'device_count': {'GPU': 0}}. | |
CUDA_VISIBLE_DEVICES is not set. | |
CUDA_VISIBLE_DEVICES is not set. | |
TF session gpu_options.visible_device_list is set to '1'. | |
Collecting TensorFlow device list... | |
TF session gpu_options.visible_device_list is set to '0'. | |
Collecting TensorFlow device list... | |
Local devices available to TensorFlow: | |
1/4: name: '/job:localhost/replica:0/task:0/device:CPU:0' | |
device_type: 'CPU' | |
memory_limit_bytes: 268435456 | |
physical_device_desc: '' | |
2/4: name: '/job:localhost/replica:0/task:0/device:XLA_CPU:0' | |
device_type: 'XLA_CPU' | |
memory_limit_bytes: 17179869184 | |
physical_device_desc: None | |
3/4: name: '/job:localhost/replica:0/task:0/device:XLA_GPU:0' | |
device_type: 'XLA_GPU' | |
memory_limit_bytes: 17179869184 | |
physical_device_desc: None | |
4/4: name: '/job:localhost/replica:0/task:0/device:GPU:0' | |
device_type: 'GPU' | |
memory_limit_bytes: 11321150669 | |
physical_device_desc: 'device: 0, name: Tesla K80, pci bus id: 0000:00:17.0, compute capability: 3.7' | |
Using gpu device 0: Tesla K80 | |
Local devices available to TensorFlow: | |
1/4: name: '/job:localhost/replica:0/task:0/device:CPU:0' | |
device_type: 'CPU' | |
memory_limit_bytes: 268435456 | |
physical_device_desc: '' | |
2/4: name: '/job:localhost/replica:0/task:0/device:XLA_CPU:0' | |
device_type: 'XLA_CPU' | |
memory_limit_bytes: 17179869184 | |
physical_device_desc: None | |
3/4: name: '/job:localhost/replica:0/task:0/device:XLA_GPU:1' | |
device_type: 'XLA_GPU' | |
memory_limit_bytes: 17179869184 | |
physical_device_desc: None | |
4/4: name: '/job:localhost/replica:0/task:0/device:GPU:0' | |
device_type: 'GPU' | |
memory_limit_bytes: 11321150669 | |
physical_device_desc: 'device: 1, name: Tesla K80, pci bus id: 0000:00:18.0, compute capability: 3.7' | |
Using gpu device 1: Tesla K80 | |
<LibriSpeechCorpus 'train' epoch=1>, epoch 1. Old mean seq len (transcription) is 183.267376, new is 63.708029, requested max is 75.000000. Old num seqs is 6575, new num seqs is 822. | |
<LibriSpeechCorpus 'train' epoch=1>, epoch 1. Old num seqs 14063, new num seqs 822. | |
<LibriSpeechCorpus 'train' epoch=1>, epoch 1. Old mean seq len (transcription) is 183.267376, new is 63.708029, requested max is 75.000000. Old num seqs is 6575, new num seqs is 822. | |
<LibriSpeechCorpus 'train' epoch=1>, epoch 1. Old num seqs 14063, new num seqs 822. | |
<LibriSpeechCorpus 'train' epoch=1>, epoch 1. Old mean seq len (transcription) is 183.267376, new is 63.708029, requested max is 75.000000. Old num seqs is 6575, new num seqs is 822. | |
<LibriSpeechCorpus 'train' epoch=1>, epoch 1. Old num seqs 14063, new num seqs 822. | |
Train data: | |
input: 40 x 1 | |
output: {'classes': [10025, 1], 'raw': {'dtype': 'string', 'shape': ()}, 'data': [40, 2]} | |
LibriSpeechCorpus, sequences: 822, frames: unknown | |
Dev data: | |
LibriSpeechCorpus, sequences: 3000, frames: unknown | |
<LibriSpeechCorpus 'train' epoch=1>, epoch 1. Old mean seq len (transcription) is 183.267376, new is 63.708029, requested max is 75.000000. Old num seqs is 6575, new num seqs is 822. | |
<LibriSpeechCorpus 'train' epoch=1>, epoch 1. Old num seqs 14063, new num seqs 822. | |
Train data: | |
input: 40 x 1 | |
output: {'classes': [10025, 1], 'raw': {'dtype': 'string', 'shape': ()}, 'data': [40, 2]} | |
LibriSpeechCorpus, sequences: 822, frames: unknown | |
Dev data: | |
LibriSpeechCorpus, sequences: 3000, frames: unknown | |
Learning-rate-control: file data/exp-returnn-distributed/train-scores.data does not exist yet | |
Learning-rate-control: file data/exp-returnn-distributed/train-scores.data does not exist yet | |
Update config key 'max_seq_length' for epoch 1: {'classes': 75} -> {'classes': 60} | |
Setup tf.Session with options {'gpu_options': {'visible_device_list': '0'}, 'log_device_placement': False, 'device_count': {'GPU': 1}} ... | |
Update config key 'max_seq_length' for epoch 1: {'classes': 75} -> {'classes': 60} | |
Setup tf.Session with options {'gpu_options': {'visible_device_list': '1'}, 'log_device_placement': False, 'device_count': {'GPU': 1}} ... | |
layer root/'data' output: Data(name='data', shape=(None, 40), batch_shape_meta=[B,T|'time:var:extern_data:data',F|40]) | |
layer root/'source' output: Data(name='source_output', shape=(None, 40), batch_shape_meta=[B,T|'time:var:extern_data:data',F|40]) | |
layer root/'lstm0_fw' output: Data(name='lstm0_fw_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'time:var:extern_data:data',B,F|1024]) | |
layer root/'data' output: Data(name='data', shape=(None, 40), batch_shape_meta=[B,T|'time:var:extern_data:data',F|40]) | |
layer root/'source' output: Data(name='source_output', shape=(None, 40), batch_shape_meta=[B,T|'time:var:extern_data:data',F|40]) | |
layer root/'lstm0_fw' output: Data(name='lstm0_fw_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'time:var:extern_data:data',B,F|1024]) | |
layer root/'lstm0_bw' output: Data(name='lstm0_bw_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'time:var:extern_data:data',B,F|1024]) | |
layer root/'lstm0_bw' output: Data(name='lstm0_bw_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'time:var:extern_data:data',B,F|1024]) | |
layer root/'lstm0_pool' output: Data(name='lstm0_pool_output', shape=(None, 2048), batch_shape_meta=[B,T|?,F|2048]) | |
layer root/'lstm5_fw' output: Data(name='lstm5_fw_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1024]) | |
layer root/'lstm0_pool' output: Data(name='lstm0_pool_output', shape=(None, 2048), batch_shape_meta=[B,T|?,F|2048]) | |
layer root/'lstm5_fw' output: Data(name='lstm5_fw_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1024]) | |
layer root/'lstm5_bw' output: Data(name='lstm5_bw_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1024]) | |
layer root/'lstm5_bw' output: Data(name='lstm5_bw_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1024]) | |
layer root/'encoder' output: Data(name='encoder_output', shape=(None, 2048), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|2048]) | |
layer root/'ctc' output: Data(name='ctc_output', shape=(None, 10026), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|10026]) | |
layer root/'enc_ctx' output: Data(name='enc_ctx_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1024]) | |
layer root/'encoder' output: Data(name='encoder_output', shape=(None, 2048), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|2048]) | |
layer root/'ctc' output: Data(name='ctc_output', shape=(None, 10026), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|10026]) | |
layer root/'inv_fertility' output: Data(name='inv_fertility_output', shape=(None, 1), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1]) | |
layer root/'enc_ctx' output: Data(name='enc_ctx_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1024]) | |
layer root/'enc_value' output: Data(name='enc_value_output', shape=(None, 1, 2048), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,1,F|2048]) | |
layer root/'inv_fertility' output: Data(name='inv_fertility_output', shape=(None, 1), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1]) | |
layer root/'output' output: Data(name='output_output', shape=(None,), dtype='int32', sparse=True, dim=10025, batch_dim_axis=1, batch_shape_meta=[T|'time:var:extern_data:classes',B]) | |
layer root/'enc_value' output: Data(name='enc_value_output', shape=(None, 1, 2048), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,1,F|2048]) | |
layer root/'output' output: Data(name='output_output', shape=(None,), dtype='int32', sparse=True, dim=10025, batch_dim_axis=1, batch_shape_meta=[T|'time:var:extern_data:classes',B]) | |
Rec layer 'output' (search False, train 'globals/train_flag:0') sub net: | |
Input layers moved out of loop: (#: 2) | |
output | |
target_embed | |
Output layers moved out of loop: (#: 3) | |
output_prob | |
readout | |
readout_in | |
Layers in loop: (#: 10) | |
s | |
att | |
att0 | |
att_weights | |
energy | |
energy_tanh | |
energy_in | |
weight_feedback | |
accum_att_weights | |
s_transformed | |
Unused layers: (#: 1) | |
end | |
layer root/output:rec-subnet-input/'output' output: Data(name='output_output', shape=(None,), dtype='int32', sparse=True, dim=10025, batch_shape_meta=[B,T|'time:var:extern_data:classes']) | |
layer root/output:rec-subnet-input/'target_embed' output: Data(name='target_embed_output', shape=(None, 621), batch_shape_meta=[B,T|'time:var:extern_data:classes',F|621]) | |
Rec layer 'output' (search False, train 'globals/train_flag:0') sub net: | |
Input layers moved out of loop: (#: 2) | |
output | |
target_embed | |
Output layers moved out of loop: (#: 3) | |
output_prob | |
readout | |
readout_in | |
Layers in loop: (#: 10) | |
s | |
att | |
att0 | |
att_weights | |
energy | |
energy_tanh | |
energy_in | |
weight_feedback | |
accum_att_weights | |
s_transformed | |
Unused layers: (#: 1) | |
end | |
layer root/output:rec-subnet-input/'output' output: Data(name='output_output', shape=(None,), dtype='int32', sparse=True, dim=10025, batch_shape_meta=[B,T|'time:var:extern_data:classes']) | |
layer root/output:rec-subnet-input/'target_embed' output: Data(name='target_embed_output', shape=(None, 621), batch_shape_meta=[B,T|'time:var:extern_data:classes',F|621]) | |
layer root/output:rec-subnet/'weight_feedback' output: Data(name='weight_feedback_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|?,B,F|1024]) | |
layer root/output:rec-subnet/'prev:target_embed' output: Data(name='target_embed_output', shape=(621,), time_dim_axis=None, batch_shape_meta=[B,F|621]) | |
layer root/output:rec-subnet/'s' output: Data(name='s_output', shape=(1000,), time_dim_axis=None, batch_shape_meta=[B,F|1000]) | |
layer root/output:rec-subnet/'weight_feedback' output: Data(name='weight_feedback_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|?,B,F|1024]) | |
layer root/output:rec-subnet/'s_transformed' output: Data(name='s_transformed_output', shape=(1024,), time_dim_axis=None, batch_shape_meta=[B,F|1024]) | |
layer root/output:rec-subnet/'energy_in' output: Data(name='energy_in_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1024]) | |
layer root/output:rec-subnet/'prev:target_embed' output: Data(name='target_embed_output', shape=(621,), time_dim_axis=None, batch_shape_meta=[B,F|621]) | |
layer root/output:rec-subnet/'energy_tanh' output: Data(name='energy_tanh_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1024]) | |
layer root/output:rec-subnet/'s' output: Data(name='s_output', shape=(1000,), time_dim_axis=None, batch_shape_meta=[B,F|1000]) | |
layer root/output:rec-subnet/'energy' output: Data(name='energy_output', shape=(None, 1), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1]) | |
layer root/output:rec-subnet/'att_weights' output: Data(name='att_weights_output', shape=(1, None), time_dim_axis=2, feature_dim_axis=1, batch_shape_meta=[B,F|1,T|'spatial:0:lstm0_pool']) | |
layer root/output:rec-subnet/'s_transformed' output: Data(name='s_transformed_output', shape=(1024,), time_dim_axis=None, batch_shape_meta=[B,F|1024]) | |
layer root/output:rec-subnet/'energy_in' output: Data(name='energy_in_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1024]) | |
layer root/output:rec-subnet/'energy_tanh' output: Data(name='energy_tanh_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1024]) | |
layer root/output:rec-subnet/'energy' output: Data(name='energy_output', shape=(None, 1), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1]) | |
layer root/output:rec-subnet/'att0' output: Data(name='att0_output', shape=(1, 2048), time_dim_axis=None, batch_shape_meta=[B,1,F|2048]) | |
layer root/output:rec-subnet/'att_weights' output: Data(name='att_weights_output', shape=(1, None), time_dim_axis=2, feature_dim_axis=1, batch_shape_meta=[B,F|1,T|'spatial:0:lstm0_pool']) | |
layer root/output:rec-subnet/'att' output: Data(name='att_output', shape=(2048,), time_dim_axis=None, batch_shape_meta=[B,F|2048]) | |
layer root/output:rec-subnet/'accum_att_weights' output: Data(name='accum_att_weights_output', shape=(None, 1), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1]) | |
layer root/output:rec-subnet/'att0' output: Data(name='att0_output', shape=(1, 2048), time_dim_axis=None, batch_shape_meta=[B,1,F|2048]) | |
layer root/output:rec-subnet-output/'s' output: Data(name='s_output', shape=(None, 1000), batch_dim_axis=1, batch_shape_meta=[T|'time:var:extern_data:classes',B,F|1000]) | |
layer root/output:rec-subnet/'att' output: Data(name='att_output', shape=(2048,), time_dim_axis=None, batch_shape_meta=[B,F|2048]) | |
layer root/output:rec-subnet-output/'prev:target_embed' output: Data(name='target_embed_output', shape=(None, 621), batch_dim_axis=1, batch_shape_meta=[T|'time:var:extern_data:classes',B,F|621]) | |
layer root/output:rec-subnet/'accum_att_weights' output: Data(name='accum_att_weights_output', shape=(None, 1), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1]) | |
layer root/output:rec-subnet-output/'att' output: Data(name='att_output', shape=(None, 2048), batch_dim_axis=1, batch_shape_meta=[T|'time:var:extern_data:classes',B,F|2048]) | |
layer root/output:rec-subnet-output/'readout_in' output: Data(name='readout_in_output', shape=(None, 1000), batch_dim_axis=1, batch_shape_meta=[T|'time:var:extern_data:classes',B,F|1000]) | |
layer root/output:rec-subnet-output/'s' output: Data(name='s_output', shape=(None, 1000), batch_dim_axis=1, batch_shape_meta=[T|'time:var:extern_data:classes',B,F|1000]) | |
layer root/output:rec-subnet-output/'readout' output: Data(name='readout_output', shape=(None, 500), batch_dim_axis=1, batch_shape_meta=[T|'time:var:extern_data:classes',B,F|500]) | |
layer root/output:rec-subnet-output/'prev:target_embed' output: Data(name='target_embed_output', shape=(None, 621), batch_dim_axis=1, batch_shape_meta=[T|'time:var:extern_data:classes',B,F|621]) | |
layer root/output:rec-subnet-output/'output_prob' output: Data(name='output_prob_output', shape=(None, 10025), batch_dim_axis=1, batch_shape_meta=[T|'time:var:extern_data:classes',B,F|10025]) | |
layer root/output:rec-subnet-output/'att' output: Data(name='att_output', shape=(None, 2048), batch_dim_axis=1, batch_shape_meta=[T|'time:var:extern_data:classes',B,F|2048]) | |
layer root/output:rec-subnet-output/'readout_in' output: Data(name='readout_in_output', shape=(None, 1000), batch_dim_axis=1, batch_shape_meta=[T|'time:var:extern_data:classes',B,F|1000]) | |
layer root/output:rec-subnet-output/'readout' output: Data(name='readout_output', shape=(None, 500), batch_dim_axis=1, batch_shape_meta=[T|'time:var:extern_data:classes',B,F|500]) | |
layer root/'decision' output: Data(name='output_output', shape=(None,), dtype='int32', sparse=True, dim=10025, batch_dim_axis=1, batch_shape_meta=[T|'time:var:extern_data:classes',B]) | |
layer root/output:rec-subnet-output/'output_prob' output: Data(name='output_prob_output', shape=(None, 10025), batch_dim_axis=1, batch_shape_meta=[T|'time:var:extern_data:classes',B,F|10025]) | |
layer root/'decision' output: Data(name='output_output', shape=(None,), dtype='int32', sparse=True, dim=10025, batch_dim_axis=1, batch_shape_meta=[T|'time:var:extern_data:classes',B]) | |
Network layer topology: | |
extern data: classes: Data(shape=(None,), dtype='int32', sparse=True, dim=10025, available_for_inference=False, batch_shape_meta=[B,T|'time:var:extern_data:classes']), data: Data(shape=(None, 40), batch_shape_meta=[B,T|'time:var:extern_data:data',F|40]) | |
used data keys: ['classes', 'data'] | |
layers: | |
layer softmax 'ctc' #: 10026 | |
layer source 'data' #: 40 | |
layer decide 'decision' #: 10025 | |
layer linear 'enc_ctx' #: 1024 | |
layer split_dims 'enc_value' #: 2048 | |
layer copy 'encoder' #: 2048 | |
layer linear 'inv_fertility' #: 1 | |
layer rec 'lstm0_bw' #: 1024 | |
layer rec 'lstm0_fw' #: 1024 | |
layer pool 'lstm0_pool' #: 2048 | |
layer rec 'lstm5_bw' #: 1024 | |
layer rec 'lstm5_fw' #: 1024 | |
layer rec 'output' #: 10025 | |
layer eval 'source' #: 40 | |
net params #: 87166092 | |
net trainable params: [<tf.Variable 'ctc/W:0' shape=(2048, 10026) dtype=float32_ref>, <tf.Variable 'ctc/b:0' shape=(10026,) dtype=float32_ref>, <tf.Variable 'enc_ctx/W:0' shape=(2048, 1024) dtype=float32_ref>, <tf.Variable 'enc_ctx/b:0' shape=(1024,) dtype=float32_ref>, <tf.Variable 'inv_fertility/W:0' shape=(2048, 1) dtype=float32_ref>, <tf.Variable 'lstm0_bw/rec/W:0' shape=(40, 4096) dtype=float32_ref>, <tf.Variable 'lstm0_bw/rec/W_re:0' shape=(1024, 4096) dtype=float32_ref>, <tf.Variable 'lstm0_bw/rec/b:0' shape=(4096,) dtype=float32_ref>, <tf.Variable 'lstm0_fw/rec/W:0' shape=(40, 4096) dtype=float32_ref>, <tf.Variable 'lstm0_fw/rec/W_re:0' shape=(1024, 4096) dtype=float32_ref>, <tf.Variable 'lstm0_fw/rec/b:0' shape=(4096,) dtype=float32_ref>, <tf.Variable 'lstm5_bw/rec/W:0' shape=(2048, 4096) dtype=float32_ref>, <tf.Variable 'lstm5_bw/rec/W_re:0' shape=(1024, 4096) dtype=float32_ref>, <tf.Variable 'lstm5_bw/rec/b:0' shape=(4096,) dtype=float32_ref>, <tf.Variable 'lstm5_fw/rec/W:0' shape=(2048, 4096) dtype=float32_ref>, <tf.Variable 'lstm5_fw/rec/W_re:0' shape=(1024, 4096) dtype=float32_ref>, <tf.Variable 'lstm5_fw/rec/b:0' shape=(4096,) dtype=float32_ref>, <tf.Variable 'output/rec/energy/W:0' shape=(1024, 1) dtype=float32_ref>, <tf.Variable 'output/rec/output_prob/W:0' shape=(500, 10025) dtype=float32_ref>, <tf.Variable 'output/rec/output_prob/b:0' shape=(10025,) dtype=float32_ref>, <tf.Variable 'output/rec/readout_in/W:0' shape=(3669, 1000) dtype=float32_ref>, <tf.Variable 'output/rec/readout_in/b:0' shape=(1000,) dtype=float32_ref>, <tf.Variable 'output/rec/s/rec/lstm_cell/bias:0' shape=(4000,) dtype=float32_ref>, <tf.Variable 'output/rec/s/rec/lstm_cell/kernel:0' shape=(3669, 4000) dtype=float32_ref>, <tf.Variable 'output/rec/s_transformed/W:0' shape=(1000, 1024) dtype=float32_ref>, <tf.Variable 'output/rec/target_embed/W:0' shape=(10025, 621) dtype=float32_ref>, <tf.Variable 'output/rec/weight_feedback/W:0' shape=(1, 1024) dtype=float32_ref>] | |
Network layer topology: | |
extern data: classes: Data(shape=(None,), dtype='int32', sparse=True, dim=10025, available_for_inference=False, batch_shape_meta=[B,T|'time:var:extern_data:classes']), data: Data(shape=(None, 40), batch_shape_meta=[B,T|'time:var:extern_data:data',F|40]) | |
used data keys: ['classes', 'data'] | |
layers: | |
layer softmax 'ctc' #: 10026 | |
layer source 'data' #: 40 | |
layer decide 'decision' #: 10025 | |
layer linear 'enc_ctx' #: 1024 | |
layer split_dims 'enc_value' #: 2048 | |
layer copy 'encoder' #: 2048 | |
layer linear 'inv_fertility' #: 1 | |
layer rec 'lstm0_bw' #: 1024 | |
layer rec 'lstm0_fw' #: 1024 | |
layer pool 'lstm0_pool' #: 2048 | |
layer rec 'lstm5_bw' #: 1024 | |
layer rec 'lstm5_fw' #: 1024 | |
layer rec 'output' #: 10025 | |
layer eval 'source' #: 40 | |
net params #: 87166092 | |
net trainable params: [<tf.Variable 'ctc/W:0' shape=(2048, 10026) dtype=float32_ref>, <tf.Variable 'ctc/b:0' shape=(10026,) dtype=float32_ref>, <tf.Variable 'enc_ctx/W:0' shape=(2048, 1024) dtype=float32_ref>, <tf.Variable 'enc_ctx/b:0' shape=(1024,) dtype=float32_ref>, <tf.Variable 'inv_fertility/W:0' shape=(2048, 1) dtype=float32_ref>, <tf.Variable 'lstm0_bw/rec/W:0' shape=(40, 4096) dtype=float32_ref>, <tf.Variable 'lstm0_bw/rec/W_re:0' shape=(1024, 4096) dtype=float32_ref>, <tf.Variable 'lstm0_bw/rec/b:0' shape=(4096,) dtype=float32_ref>, <tf.Variable 'lstm0_fw/rec/W:0' shape=(40, 4096) dtype=float32_ref>, <tf.Variable 'lstm0_fw/rec/W_re:0' shape=(1024, 4096) dtype=float32_ref>, <tf.Variable 'lstm0_fw/rec/b:0' shape=(4096,) dtype=float32_ref>, <tf.Variable 'lstm5_bw/rec/W:0' shape=(2048, 4096) dtype=float32_ref>, <tf.Variable 'lstm5_bw/rec/W_re:0' shape=(1024, 4096) dtype=float32_ref>, <tf.Variable 'lstm5_bw/rec/b:0' shape=(4096,) dtype=float32_ref>, <tf.Variable 'lstm5_fw/rec/W:0' shape=(2048, 4096) dtype=float32_ref>, <tf.Variable 'lstm5_fw/rec/W_re:0' shape=(1024, 4096) dtype=float32_ref>, <tf.Variable 'lstm5_fw/rec/b:0' shape=(4096,) dtype=float32_ref>, <tf.Variable 'output/rec/energy/W:0' shape=(1024, 1) dtype=float32_ref>, <tf.Variable 'output/rec/output_prob/W:0' shape=(500, 10025) dtype=float32_ref>, <tf.Variable 'output/rec/output_prob/b:0' shape=(10025,) dtype=float32_ref>, <tf.Variable 'output/rec/readout_in/W:0' shape=(3669, 1000) dtype=float32_ref>, <tf.Variable 'output/rec/readout_in/b:0' shape=(1000,) dtype=float32_ref>, <tf.Variable 'output/rec/s/rec/lstm_cell/bias:0' shape=(4000,) dtype=float32_ref>, <tf.Variable 'output/rec/s/rec/lstm_cell/kernel:0' shape=(3669, 4000) dtype=float32_ref>, <tf.Variable 'output/rec/s_transformed/W:0' shape=(1000, 1024) dtype=float32_ref>, <tf.Variable 'output/rec/target_embed/W:0' shape=(10025, 621) dtype=float32_ref>, <tf.Variable 'output/rec/weight_feedback/W:0' shape=(1, 1024) dtype=float32_ref>] |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment