compile_tf_graph
Gist manish-kumar-garg/f3efee2071c80400767c7a4b10a43a90, created January 30, 2020 10:58
/home/ubuntu/tf1.13/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/ubuntu/tf1.13/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/ubuntu/tf1.13/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:528: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/ubuntu/tf1.13/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:529: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/ubuntu/tf1.13/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:530: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/ubuntu/tf1.13/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:535: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
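(Editor's note: the FutureWarnings above are harmless. They come from NumPy deprecating the bare-count form `(type, 1)` in structured dtype fields in favor of an explicit shape tuple `(type, (1,))`. A minimal sketch of the deprecated vs. future-proof spelling, independent of this TensorFlow install:)

```python
import numpy as np

# Deprecated shorthand that triggers the FutureWarning on newer NumPy:
#   np.dtype([("qint8", np.int8, 1)])
# Explicit shape tuple, which is what NumPy will assume in the future:
dt_new = np.dtype([("qint8", np.int8, (1,))])
print(dt_new["qint8"].shape)  # the field is a length-1 subarray
```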
Using config file 'returnn.config'.
Returnn compile-tf-graph starting up.
RETURNN starting up, version 20200129.184103--git-a399fac3-dirty, date/time 2020-01-30-10-57-15 (UTC+0000), pid 19583, cwd /home/ubuntu/manish/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention, Python /home/ubuntu/tf1.13/bin/python
Hostname: ip-10-1-16-53
TensorFlow: 1.13.1 (b'v1.13.1-0-g6612da8951') (<site-package> in /home/ubuntu/tf1.13/lib/python3.6/site-packages/tensorflow)
Setup TF inter and intra global thread pools, num_threads None, session opts {'log_device_placement': False, 'device_count': {'GPU': 0}}.
2020-01-30 10:57:15.058438: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-01-30 10:57:15.660183: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-01-30 10:57:15.687918: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-01-30 10:57:15.694795: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-01-30 10:57:15.704406: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-01-30 10:57:15.706023: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x556e6b9f1cb0 executing computations on platform CUDA. Devices:
2020-01-30 10:57:15.706056: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): Tesla V100-SXM2-16GB, Compute Capability 7.0
2020-01-30 10:57:15.706067: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (1): Tesla V100-SXM2-16GB, Compute Capability 7.0
2020-01-30 10:57:15.706074: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (2): Tesla V100-SXM2-16GB, Compute Capability 7.0
2020-01-30 10:57:15.706085: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (3): Tesla V100-SXM2-16GB, Compute Capability 7.0
2020-01-30 10:57:15.728153: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2300080000 Hz
2020-01-30 10:57:15.730335: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x556e6c0b5780 executing computations on platform Host. Devices:
2020-01-30 10:57:15.730364: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): <undefined>, <undefined>
2020-01-30 10:57:15.730446: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-01-30 10:57:15.730465: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]
CUDA_VISIBLE_DEVICES is not set.
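(Editor's note: because CUDA_VISIBLE_DEVICES is unset, TensorFlow enumerates and claims all four V100s even though this is a CPU-side graph-compile run. A hypothetical way to pin the process to a single GPU before invoking the tool:)

```shell
# Hypothetical: expose only GPU 0 to TensorFlow for this process.
export CUDA_VISIBLE_DEVICES=0
echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"
```

The compile invocation itself (as shown by sys.argv in the traceback below) would then follow: `python returnn/tools/compile_tf_graph.py returnn.config --eval 1 --output_file out.meta`.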
Collecting TensorFlow device list...
2020-01-30 10:57:15.733876: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties:
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:00:1b.0
totalMemory: 15.78GiB freeMemory: 15.47GiB
2020-01-30 10:57:15.733937: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 1 with properties:
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:00:1c.0
totalMemory: 15.78GiB freeMemory: 15.47GiB
2020-01-30 10:57:15.733985: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 2 with properties:
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:00:1d.0
totalMemory: 15.78GiB freeMemory: 15.47GiB
2020-01-30 10:57:15.734029: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 3 with properties:
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:00:1e.0
totalMemory: 15.78GiB freeMemory: 15.47GiB
2020-01-30 10:57:15.734082: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0, 1, 2, 3
2020-01-30 10:57:15.739501: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-01-30 10:57:15.739527: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0 1 2 3
2020-01-30 10:57:15.739543: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N Y Y Y
2020-01-30 10:57:15.739554: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 1: Y N Y Y
2020-01-30 10:57:15.739565: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 2: Y Y N Y
2020-01-30 10:57:15.739576: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 3: Y Y Y N
2020-01-30 10:57:15.739723: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/device:GPU:0 with 15049 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:1b.0, compute capability: 7.0)
2020-01-30 10:57:15.740041: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/device:GPU:1 with 15049 MB memory) -> physical GPU (device: 1, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:1c.0, compute capability: 7.0)
2020-01-30 10:57:15.740296: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/device:GPU:2 with 15049 MB memory) -> physical GPU (device: 2, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:1d.0, compute capability: 7.0)
2020-01-30 10:57:15.740592: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/device:GPU:3 with 15049 MB memory) -> physical GPU (device: 3, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:1e.0, compute capability: 7.0)
Local devices available to TensorFlow:
  1/10: name: "/device:CPU:0"
  device_type: "CPU"
  memory_limit: 268435456
  locality {
  }
  incarnation: 12254757501798345357
  2/10: name: "/device:XLA_GPU:0"
  device_type: "XLA_GPU"
  memory_limit: 17179869184
  locality {
  }
  incarnation: 17434536340433975299
  physical_device_desc: "device: XLA_GPU device"
  3/10: name: "/device:XLA_GPU:1"
  device_type: "XLA_GPU"
  memory_limit: 17179869184
  locality {
  }
  incarnation: 6951473131317435871
  physical_device_desc: "device: XLA_GPU device"
  4/10: name: "/device:XLA_GPU:2"
  device_type: "XLA_GPU"
  memory_limit: 17179869184
  locality {
  }
  incarnation: 11901565618654746057
  physical_device_desc: "device: XLA_GPU device"
  5/10: name: "/device:XLA_GPU:3"
  device_type: "XLA_GPU"
  memory_limit: 17179869184
  locality {
  }
  incarnation: 2041843563430914595
  physical_device_desc: "device: XLA_GPU device"
  6/10: name: "/device:XLA_CPU:0"
  device_type: "XLA_CPU"
  memory_limit: 17179869184
  locality {
  }
  incarnation: 17127850179857194621
  physical_device_desc: "device: XLA_CPU device"
  7/10: name: "/device:GPU:0"
  device_type: "GPU"
  memory_limit: 15780652647
  locality {
    bus_id: 1
    links {
      link {
        device_id: 1
        type: "StreamExecutor"
        strength: 1
      }
      link {
        device_id: 2
        type: "StreamExecutor"
        strength: 1
      }
      link {
        device_id: 3
        type: "StreamExecutor"
        strength: 1
      }
    }
  }
  incarnation: 12586835745688706958
  physical_device_desc: "device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:1b.0, compute capability: 7.0"
  8/10: name: "/device:GPU:1"
  device_type: "GPU"
  memory_limit: 15780652647
  locality {
    bus_id: 1
    links {
      link {
        type: "StreamExecutor"
        strength: 1
      }
      link {
        device_id: 2
        type: "StreamExecutor"
        strength: 1
      }
      link {
        device_id: 3
        type: "StreamExecutor"
        strength: 1
      }
    }
  }
  incarnation: 16219785379659910568
  physical_device_desc: "device: 1, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:1c.0, compute capability: 7.0"
  9/10: name: "/device:GPU:2"
  device_type: "GPU"
  memory_limit: 15780652647
  locality {
    bus_id: 1
    links {
      link {
        type: "StreamExecutor"
        strength: 1
      }
      link {
        device_id: 1
        type: "StreamExecutor"
        strength: 1
      }
      link {
        device_id: 3
        type: "StreamExecutor"
        strength: 1
      }
    }
  }
  incarnation: 15498480627625260496
  physical_device_desc: "device: 2, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:1d.0, compute capability: 7.0"
  10/10: name: "/device:GPU:3"
  device_type: "GPU"
  memory_limit: 15780652647
  locality {
    bus_id: 1
    links {
      link {
        type: "StreamExecutor"
        strength: 1
      }
      link {
        device_id: 1
        type: "StreamExecutor"
        strength: 1
      }
      link {
        device_id: 2
        type: "StreamExecutor"
        strength: 1
      }
    }
  }
  incarnation: 4701660963018527730
  physical_device_desc: "device: 3, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:1e.0, compute capability: 7.0"
Using gpu device 0: Tesla V100-SXM2-16GB
Using gpu device 1: Tesla V100-SXM2-16GB
Using gpu device 2: Tesla V100-SXM2-16GB
Using gpu device 3: Tesla V100-SXM2-16GB
Create graph...
Loading network, train flag False, eval flag True, search flag False
WARNING:tensorflow:From /home/ubuntu/tf1.13/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
layer root/'data' output: Data(name='data', shape=(None, 40), batch_shape_meta=[B,T|'time:var:extern_data:data',F|40])
layer root/'source' output: Data(name='source_output', shape=(None, 40), batch_shape_meta=[B,T|'time:var:extern_data:data',F|40])
layer root/'lstm0_fw' output: Data(name='lstm0_fw_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'time:var:extern_data:data',B,F|1024])
layer root/'lstm0_bw' output: Data(name='lstm0_bw_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'time:var:extern_data:data',B,F|1024])
layer root/'lstm0_pool' output: Data(name='lstm0_pool_output', shape=(None, 2048), batch_shape_meta=[B,T|?,F|2048])
layer root/'lstm1_fw' output: Data(name='lstm1_fw_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1024])
layer root/'lstm1_bw' output: Data(name='lstm1_bw_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1024])
layer root/'lstm1_pool' output: Data(name='lstm1_pool_output', shape=(None, 2048), batch_shape_meta=[B,T|?,F|2048])
layer root/'lstm2_fw' output: Data(name='lstm2_fw_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm1_pool',B,F|1024])
layer root/'lstm2_bw' output: Data(name='lstm2_bw_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm1_pool',B,F|1024])
layer root/'lstm2_pool' output: Data(name='lstm2_pool_output', shape=(None, 2048), batch_shape_meta=[B,T|?,F|2048])
layer root/'lstm3_fw' output: Data(name='lstm3_fw_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm2_pool',B,F|1024])
layer root/'lstm3_bw' output: Data(name='lstm3_bw_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm2_pool',B,F|1024])
layer root/'lstm3_pool' output: Data(name='lstm3_pool_output', shape=(None, 2048), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm2_pool',B,F|2048])
layer root/'lstm4_fw' output: Data(name='lstm4_fw_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm2_pool',B,F|1024])
layer root/'lstm4_bw' output: Data(name='lstm4_bw_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm2_pool',B,F|1024])
layer root/'lstm4_pool' output: Data(name='lstm4_pool_output', shape=(None, 2048), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm2_pool',B,F|2048])
layer root/'lstm5_fw' output: Data(name='lstm5_fw_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm2_pool',B,F|1024])
layer root/'lstm5_bw' output: Data(name='lstm5_bw_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm2_pool',B,F|1024])
layer root/'encoder' output: Data(name='encoder_output', shape=(None, 2048), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm2_pool',B,F|2048])
layer root/'ctc' output: Data(name='ctc_output', shape=(None, 10026), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm2_pool',B,F|10026])
layer root/'enc_ctx' output: Data(name='enc_ctx_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm2_pool',B,F|1024])
layer root/'inv_fertility' output: Data(name='inv_fertility_output', shape=(None, 1), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm2_pool',B,F|1])
layer root/'enc_value' output: Data(name='enc_value_output', shape=(None, 1, 2048), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm2_pool',B,1,F|2048])
layer root/'output' output: Data(name='output_output', shape=(None,), dtype='int32', sparse=True, dim=10025, batch_dim_axis=1, batch_shape_meta=[T|?,B])
Rec layer 'output' (search False, train False) sub net:
  Input layers moved out of loop: (#: 1)
    output
  Output layers moved out of loop: (#: 0)
    None
  Layers in loop: (#: 1)
    end
  Unused layers: (#: 14)
    accum_att_weights
    att
    att0
    att_weights
    energy
    energy_in
    energy_tanh
    output_prob
    readout
    readout_in
    s
    s_transformed
    target_embed
    weight_feedback
layer root/output:rec-subnet-input/'output' output: Data(name='output_output', shape=(None,), dtype='int32', sparse=True, dim=10025, batch_shape_meta=[B,T|'time:var:extern_data:classes'])
layer root/output:rec-subnet/'end' output: Data(name='end_output', shape=(None,), dtype='bool', sparse=True, dim=2, batch_shape_meta=[B,T|'time:var:extern_data:classes'])
layer root/output:rec-subnet/'weight_feedback' output: Data(name='weight_feedback_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|?,B,F|1024])
layer root/output:rec-subnet/'s' output: Data(name='s_output', shape=(1000,), time_dim_axis=None, batch_shape_meta=[B,F|1000])
layer root/output:rec-subnet/'s_transformed' output: Data(name='s_transformed_output', shape=(1024,), time_dim_axis=None, batch_shape_meta=[B,F|1024])
layer root/output:rec-subnet/'energy_in' output: Data(name='energy_in_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm2_pool',B,F|1024])
layer root/output:rec-subnet/'energy_tanh' output: Data(name='energy_tanh_output', shape=(None, 1024), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm2_pool',B,F|1024])
layer root/output:rec-subnet/'energy' output: Data(name='energy_output', shape=(None, 1), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm2_pool',B,F|1])
layer root/output:rec-subnet/'att_weights' output: Data(name='att_weights_output', shape=(1, None), time_dim_axis=2, feature_dim_axis=1, batch_shape_meta=[B,F|1,T|'spatial:0:lstm2_pool'])
layer root/output:rec-subnet/'accum_att_weights' output: Data(name='accum_att_weights_output', shape=(None, 1), batch_dim_axis=1, batch_shape_meta=[T|'spatial:0:lstm2_pool',B,F|1])
layer root/output:rec-subnet/'att0' output: Data(name='att0_output', shape=(1, 2048), time_dim_axis=None, batch_shape_meta=[B,1,F|2048])
layer root/output:rec-subnet/'att' output: Data(name='att_output', shape=(2048,), time_dim_axis=None, batch_shape_meta=[B,F|2048])
layer root/output:rec-subnet/'target_embed' output: Data(name='target_embed_output', shape=(None, 621), batch_shape_meta=[B,T|'time:var:extern_data:classes',F|621])
Exception creating layer root/'output' of class RecLayer with opts:
{'_target_layers': {},
 'cheating': False,
 'max_seq_len': <tf.Tensor 'max_seq_len_encoder:0' shape=() dtype=int32>,
 'n_out': <class 'Util.NotSpecified'>,
 'name': 'output',
 'network': <TFNetwork 'root' train=False>,
 'output': Data(name='output_output', shape=(None,), dtype='int32', sparse=True, dim=10025, batch_dim_axis=1, batch_shape_meta=[T|?,B]),
 'sources': [],
 'target': 'classes',
 'unit': {'accum_att_weights': {'class': 'eval',
                                'eval': 'source(0) + source(1) * source(2) * '
                                        '0.5',
                                'from': ['prev:accum_att_weights',
                                         'att_weights',
                                         'base:inv_fertility'],
                                'out_type': {'dim': 1, 'shape': (None, 1)}},
          'att': {'axes': 'except_batch',
                  'class': 'merge_dims',
                  'from': ['att0']},
          'att0': {'base': 'base:enc_value',
                   'class': 'generic_attention',
                   'weights': 'att_weights'},
          'att_weights': {'class': 'softmax_over_spatial', 'from': ['energy']},
          'end': {'class': 'compare', 'from': ['output'], 'value': 0},
          'energy': {'activation': None,
                     'class': 'linear',
                     'from': ['energy_tanh'],
                     'n_out': 1,
                     'with_bias': False},
          'energy_in': {'class': 'combine',
                        'from': ['base:enc_ctx',
                                 'weight_feedback',
                                 's_transformed'],
                        'kind': 'add',
                        'n_out': 1024},
          'energy_tanh': {'activation': 'tanh',
                          'class': 'activation',
                          'from': ['energy_in']},
          'output': {'beam_size': 12,
                     'cheating': False,
                     'class': 'choice',
                     'from': ['output_prob'],
                     'initial_output': 0,
                     'target': 'classes'},
          'output_prob': {'class': 'softmax',
                          'dropout': 0.3,
                          'from': ['readout'],
                          'loss': 'ce',
                          'loss_opts': {'label_smoothing': 0.1},
                          'target': 'classes'},
          'readout': {'class': 'reduce_out',
                      'from': ['readout_in'],
                      'mode': 'max',
                      'num_pieces': 2},
          'readout_in': {'activation': None,
                         'class': 'linear',
                         'from': ['s', 'prev:target_embed', 'att'],
                         'n_out': 1000},
          's': {'class': 'rnn_cell',
                'from': ['prev:target_embed', 'prev:att'],
                'n_out': 1000,
                'unit': 'LSTMBlock'},
          's_transformed': {'activation': None,
                            'class': 'linear',
                            'from': ['s'],
                            'n_out': 1024,
                            'with_bias': False},
          'target_embed': {'activation': None,
                           'class': 'linear',
                           'from': ['output'],
                           'initial_output': 0,
                           'n_out': 621,
                           'with_bias': False},
          'weight_feedback': {'activation': None,
                              'class': 'linear',
                              'from': ['prev:accum_att_weights'],
                              'n_out': 1024,
                              'with_bias': False}}}
Unhandled exception <class 'ValueError'> in thread <_MainThread(MainThread, started 140318093119616)>, proc 19583.
Thread current, main, <_MainThread(MainThread, started 140318093119616)>:
(Excluded thread.)
That were all threads.
EXCEPTION
Traceback (most recent call last):
  File "returnn/tools/compile_tf_graph.py", line 758, in <module>
    line: main(sys.argv)
    locals:
      main = <local> <function main at 0x7f9e3c0a3bf8>
      sys = <local> <module 'sys' (built-in)>
      sys.argv = <local> ['returnn/tools/compile_tf_graph.py', 'returnn.config', '--eval', '1', '--output_file', 'out.meta'], len = 6, _[0]: {len = 33}
  File "returnn/tools/compile_tf_graph.py", line 686, in main
    line: network = create_graph(train_flag=train_flag, eval_flag=eval_flag, search_flag=search_flag, net_dict=net_dict)
    locals:
      network = <not found>
      create_graph = <global> <function create_graph at 0x7f9e3c0a36a8>
      train_flag = <local> False
      eval_flag = <local> True
      search_flag = <local> False
      net_dict = <local> {'source': {'class': 'eval', 'eval': 'tf.clip_by_value(source(0), -3.0, 3.0)'}, 'lstm0_fw': {'class': 'rec', 'unit': 'nativelstm2', 'n_out': 1024, 'direction': 1, 'from': ['source']}, 'lstm0_bw': {'class': 'rec', 'unit': 'nativelstm2', 'n_out': 1024, 'direction': -1, 'from': ['source']}, 'lstm0_p..., len = 25
  File "returnn/tools/compile_tf_graph.py", line 77, in create_graph
    line: network, updater = Engine.create_network(
            config=config, rnd_seed=1,
            train_flag=train_flag, eval_flag=eval_flag, search_flag=search_flag,
            net_dict=net_dict)
    locals:
      network = <not found>
      updater = <not found>
      Engine = <local> <class 'TFEngine.Engine'>
      Engine.create_network = <local> <bound method Engine.create_network of <class 'TFEngine.Engine'>>
      config = <global> <Config.Config object at 0x7f9e3c0880b8>
      rnd_seed = <not found>
      train_flag = <local> False
      eval_flag = <local> True
      search_flag = <local> False
      net_dict = <local> {'source': {'class': 'eval', 'eval': 'tf.clip_by_value(source(0), -3.0, 3.0)'}, 'lstm0_fw': {'class': 'rec', 'unit': 'nativelstm2', 'n_out': 1024, 'direction': 1, 'from': ['source']}, 'lstm0_bw': {'class': 'rec', 'unit': 'nativelstm2', 'n_out': 1024, 'direction': -1, 'from': ['source']}, 'lstm0_p..., len = 25
  File "/home/ubuntu/manish/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFEngine.py", line 1113, in create_network
    line: network.construct_from_dict(net_dict)
    locals:
      network = <local> <TFNetwork 'root' train=False>
      network.construct_from_dict = <local> <bound method TFNetwork.construct_from_dict of <TFNetwork 'root' train=False>>
      net_dict = <local> {'source': {'class': 'eval', 'eval': 'tf.clip_by_value(source(0), -3.0, 3.0)'}, 'lstm0_fw': {'class': 'rec', 'unit': 'nativelstm2', 'n_out': 1024, 'direction': 1, 'from': ['source']}, 'lstm0_bw': {'class': 'rec', 'unit': 'nativelstm2', 'n_out': 1024, 'direction': -1, 'from': ['source']}, 'lstm0_p..., len = 25
  File "/home/ubuntu/manish/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetwork.py", line 460, in construct_from_dict
    line: self.construct_layer(net_dict, name)
    locals:
      self = <local> <TFNetwork 'root' train=False>
      self.construct_layer = <local> <bound method TFNetwork.construct_layer of <TFNetwork 'root' train=False>>
      net_dict = <local> {'source': {'class': 'eval', 'eval': 'tf.clip_by_value(source(0), -3.0, 3.0)'}, 'lstm0_fw': {'class': 'rec', 'unit': 'nativelstm2', 'n_out': 1024, 'direction': 1, 'from': ['source']}, 'lstm0_bw': {'class': 'rec', 'unit': 'nativelstm2', 'n_out': 1024, 'direction': -1, 'from': ['source']}, 'lstm0_p..., len = 25
      name = <local> 'decision', len = 8
  File "/home/ubuntu/manish/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetwork.py", line 652, in construct_layer
    line: layer_class.transform_config_dict(layer_desc, network=self, get_layer=get_layer)
    locals:
      layer_class = <local> <class 'TFNetworkRecLayer.DecideLayer'>
      layer_class.transform_config_dict = <local> <bound method BaseChoiceLayer.transform_config_dict of <class 'TFNetworkRecLayer.DecideLayer'>>
      layer_desc = <local> {'loss': 'edit_distance', 'target': 'classes', 'loss_opts': {}}
      network = <not found>
      self = <local> <TFNetwork 'root' train=False>
      get_layer = <local> <function TFNetwork.construct_layer.<locals>.get_layer at 0x7f9e06474158>
  File "/home/ubuntu/manish/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetworkRecLayer.py", line 4092, in transform_config_dict
    line: super(BaseChoiceLayer, cls).transform_config_dict(d, network=network, get_layer=get_layer)
    locals:
      super = <builtin> <class 'super'>
      BaseChoiceLayer = <global> <class 'TFNetworkRecLayer.BaseChoiceLayer'>
      cls = <local> <class 'TFNetworkRecLayer.DecideLayer'>
      transform_config_dict = <not found>
      d = <local> {'loss': 'edit_distance', 'target': 'classes', 'loss_opts': {}}
      network = <local> <TFNetwork 'root' train=False>
      get_layer = <local> <function TFNetwork.construct_layer.<locals>.get_layer at 0x7f9e06474158>
  File "/home/ubuntu/manish/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetworkLayer.py", line 448, in transform_config_dict
    line: for src_name in src_names
    locals:
      src_name = <not found>
      src_names = <local> ['output'], _[0]: {len = 6}
  File "/home/ubuntu/manish/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetworkLayer.py", line 449, in <listcomp>
    line: d["sources"] = [
            get_layer(src_name)
            for src_name in src_names
            if not src_name == "none"]
    locals:
      d = <not found>
      get_layer = <local> <function TFNetwork.construct_layer.<locals>.get_layer at 0x7f9e06474158>
      src_name = <local> 'output', len = 6
      src_names = <not found>
  File "/home/ubuntu/manish/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetwork.py", line 607, in get_layer
    line: return self.construct_layer(net_dict=net_dict, name=src_name)  # set get_layer to wrap construct_layer
    locals:
      self = <local> <TFNetwork 'root' train=False>
      self.construct_layer = <local> <bound method TFNetwork.construct_layer of <TFNetwork 'root' train=False>>
      net_dict = <local> {'source': {'class': 'eval', 'eval': 'tf.clip_by_value(source(0), -3.0, 3.0)'}, 'lstm0_fw': {'class': 'rec', 'unit': 'nativelstm2', 'n_out': 1024, 'direction': 1, 'from': ['source']}, 'lstm0_bw': {'class': 'rec', 'unit': 'nativelstm2', 'n_out': 1024, 'direction': -1, 'from': ['source']}, 'lstm0_p..., len = 25
      name = <not found>
      src_name = <local> 'output', len = 6
  File "/home/ubuntu/manish/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetwork.py", line 655, in construct_layer
    line: return add_layer(name=name, layer_class=layer_class, **layer_desc)
    locals:
      add_layer = <local> <bound method TFNetwork.add_layer of <TFNetwork 'root' train=False>>
      name = <local> 'output', len = 6
      layer_class = <local> <class 'TFNetworkRecLayer.RecLayer'>
      layer_desc = <local> {'cheating': False, 'unit': {'output': {'class': 'choice', 'target': 'classes', 'beam_size': 12, 'cheating': False, 'from': ['output_prob'], 'initial_output': 0}, 'end': {'class': 'compare', 'from': ['output'], 'value': 0}, 'target_embed': {'class': 'linear', 'activation': None, 'with_bias': Fals..., len = 7
  File "/home/ubuntu/manish/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetwork.py", line 760, in add_layer
    line: layer = self._create_layer(name=name, layer_class=layer_class, **layer_desc)
    locals:
      layer = <not found>
      self = <local> <TFNetwork 'root' train=False>
      self._create_layer = <local> <bound method TFNetwork._create_layer of <TFNetwork 'root' train=False>>
      name = <local> 'output', len = 6
      layer_class = <local> <class 'TFNetworkRecLayer.RecLayer'>
      layer_desc = <local> {'cheating': False, 'unit': {'output': {'class': 'choice', 'target': 'classes', 'beam_size': 12, 'cheating': False, 'from': ['output_prob'], 'initial_output': 0}, 'end': {'class': 'compare', 'from': ['output'], 'value': 0}, 'target_embed': {'class': 'linear', 'activation': None, 'with_bias': Fals..., len = 7
  File "/home/ubuntu/manish/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetwork.py", line 709, in _create_layer
    line: layer = layer_class(**layer_desc)
    locals:
      layer = <not found>
      layer_class = <local> <class 'TFNetworkRecLayer.RecLayer'>
      layer_desc = <local> {'cheating': False, 'unit': {'output': {'class': 'choice', 'target': 'classes', 'beam_size': 12, 'cheating': False, 'from': ['output_prob'], 'initial_output': 0}, 'end': {'class': 'compare', 'from': ['output'], 'value': 0}, 'target_embed': {'class': 'linear', 'activation': None, 'with_bias': Fals..., len = 10
  File "/home/ubuntu/manish/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetworkRecLayer.py", line 210, in __init__
    line: y = self._get_output_subnet_unit(self.cell)
    locals:
      y = <not found>
      self = <local> <RecLayer 'output' out_type=Data(shape=(None,), dtype='int32', sparse=True, dim=10025, batch_dim_axis=1, batch_shape_meta=[T|?,B])>
      self._get_output_subnet_unit = <local> <bound method RecLayer._get_output_subnet_unit of <RecLayer 'output' out_type=Data(shape=(None,), dtype='int32', sparse=True, dim=10025, batch_dim_axis=1, batch_shape_meta=[T|?,B])>>
      self.cell = <local> <_SubnetworkRecCell of <RecLayer 'output' out_type=Data(shape=(None,), dtype='int32', sparse=True, dim=10025, batch_dim_axis=1, batch_shape_meta=[T|?,B])>>
  File "/home/ubuntu/manish/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetworkRecLayer.py", line 834, in _get_output_subnet_unit
    line: output = cell.get_output(rec_layer=self)
    locals:
      output = <not found>
      cell = <local> <_SubnetworkRecCell of <RecLayer 'output' out_type=Data(shape=(None,), dtype='int32', sparse=True, dim=10025, batch_dim_axis=1, batch_shape_meta=[T|?,B])>>
      cell.get_output = <local> <bound method _SubnetworkRecCell.get_output of <_SubnetworkRecCell of <RecLayer 'output' out_type=Data(shape=(None,), dtype='int32', sparse=True, dim=10025, batch_dim_axis=1, batch_shape_meta=[T|?,B])>>>
      rec_layer = <not found>
      self = <local> <RecLayer 'output' out_type=Data(shape=(None,), dtype='int32', sparse=True, dim=10025, batch_dim_axis=1, batch_shape_meta=[T|?,B])>
  File "/home/ubuntu/manish/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetworkRecLayer.py", line 2252, in get_output
    line: final_loop_vars = self._while_loop(
            cond=cond,
            body=body,
            loop_vars=init_loop_vars,
            shape_invariants=shape_invariants)
    locals:
      final_loop_vars = <not found>
      self = <local> <_SubnetworkRecCell of <RecLayer 'output' out_type=Data(shape=(None,), dtype='int32', sparse=True, dim=10025, batch_dim_axis=1, batch_shape_meta=[T|?,B])>>
      self._while_loop = <local> <bound method _SubnetworkRecCell._while_loop of <_SubnetworkRecCell of <RecLayer 'output' out_type=Data(shape=(None,), dtype='int32', sparse=True, dim=10025, batch_dim_axis=1, batch_shape_meta=[T|?,B])>>>
      cond = <local> <function _SubnetworkRecCell.get_output.<locals>.cond at 0x7f9b1eed7950>
      body = <local> <function _SubnetworkRecCell.get_output.<locals>.body at 0x7f9b1eed7ae8>
      loop_vars = <not found>
      init_loop_vars = <local> (<tf.Tensor 'output/rec/initial_i:0' shape=() dtype=int32>, ([<tf.Tensor 'output/rec/accum_att_weights/init_accum_att_weights_zeros:0' shape=(1, ?, 1) dtype=float32>, <tf.Tensor 'output/rec/att/init_att_zeros:0' shape=(?, 2048) dtype=float32>, <tf.Tensor 'output/rec/target_embed/init_target_embed...
      shape_invariants = <local> (TensorShape([]), ([TensorShape([Dimension(None), Dimension(None), Dimension(1)]), TensorShape([Dimension(None), Dimension(2048)]), TensorShape([Dimension(None), Dimension(621)])], [[LSTMStateTuple(c=TensorShape([Dimension(None), Dimension(1000)]), h=TensorShape([Dimension(None), Dimension(1000)]..., _[0]: {len = 0}
  File "/home/ubuntu/manish/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetworkRecLayer.py", line 1627, in _while_loop
    line: return tf.while_loop(
            cond=cond,
            body=body,
            loop_vars=loop_vars,
            shape_invariants=shape_invariants,
            back_prop=self.parent_rec_layer.back_prop)
    locals:
      tf = <global> <module 'tensorflow' from '/home/ubuntu/tf1.13/lib/python3.6/site-packages/tensorflow/__init__.py'>
      tf.while_loop = <global> <function while_loop at 0x7f9da476cae8>
cond = <local> <function _SubnetworkRecCell.get_output.<locals>.cond at 0x7f9b1eed7950> | |
body = <local> <function _SubnetworkRecCell.get_output.<locals>.body at 0x7f9b1eed7ae8> | |
loop_vars = <local> (<tf.Tensor 'output/rec/initial_i:0' shape=() dtype=int32>, ([<tf.Tensor 'output/rec/accum_att_weights/init_accum_att_weights_zeros:0' shape=(1, ?, 1) dtype=float32>, <tf.Tensor 'output/rec/att/init_att_zeros:0' shape=(?, 2048) dtype=float32>, <tf.Tensor 'output/rec/target_embed/init_target_embed... | |
shape_invariants = <local> (TensorShape([]), ([TensorShape([Dimension(None), Dimension(None), Dimension(1)]), TensorShape([Dimension(None), Dimension(2048)]), TensorShape([Dimension(None), Dimension(621)])], [[LSTMStateTuple(c=TensorShape([Dimension(None), Dimension(1000)]), h=TensorShape([Dimension(None), Dimension(1000)]..., _[0]: {len = 0} | |
back_prop = <not found> | |
self = <local> <_SubnetworkRecCell of <RecLayer 'output' out_type=Data(shape=(None,), dtype='int32', sparse=True, dim=10025, batch_dim_axis=1, batch_shape_meta=[T|?,B])>> | |
self.parent_rec_layer = <local> <RecLayer 'output' out_type=Data(shape=(None,), dtype='int32', sparse=True, dim=10025, batch_dim_axis=1, batch_shape_meta=[T|?,B])> | |
self.parent_rec_layer.back_prop = <local> False | |
File "/home/ubuntu/tf1.13/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3556, in while_loop | |
line: result = loop_context.BuildLoop(cond, body, loop_vars, shape_invariants, | |
return_same_structure) | |
locals: | |
result = <not found> | |
loop_context = <local> <tensorflow.python.ops.control_flow_ops.WhileContext object at 0x7f9b1eed8940> | |
loop_context.BuildLoop = <local> <bound method WhileContext.BuildLoop of <tensorflow.python.ops.control_flow_ops.WhileContext object at 0x7f9b1eed8940>> | |
cond = <local> <function _SubnetworkRecCell.get_output.<locals>.cond at 0x7f9b1eed7950> | |
body = <local> <function _SubnetworkRecCell.get_output.<locals>.body at 0x7f9b1eed7ae8> | |
loop_vars = <local> (<tf.Tensor 'output/rec/initial_i:0' shape=() dtype=int32>, ([<tf.Tensor 'output/rec/accum_att_weights/init_accum_att_weights_zeros:0' shape=(1, ?, 1) dtype=float32>, <tf.Tensor 'output/rec/att/init_att_zeros:0' shape=(?, 2048) dtype=float32>, <tf.Tensor 'output/rec/target_embed/init_target_embed... | |
shape_invariants = <local> (TensorShape([]), ([TensorShape([Dimension(None), Dimension(None), Dimension(1)]), TensorShape([Dimension(None), Dimension(2048)]), TensorShape([Dimension(None), Dimension(621)])], [[LSTMStateTuple(c=TensorShape([Dimension(None), Dimension(1000)]), h=TensorShape([Dimension(None), Dimension(1000)]..., _[0]: {len = 0} | |
return_same_structure = <local> False | |
File "/home/ubuntu/tf1.13/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3087, in BuildLoop | |
line: original_body_result, exit_vars = self._BuildLoop( | |
pred, body, original_loop_vars, loop_vars, shape_invariants) | |
locals: | |
original_body_result = <not found> | |
exit_vars = <not found> | |
self = <local> <tensorflow.python.ops.control_flow_ops.WhileContext object at 0x7f9b1eed8940> | |
self._BuildLoop = <local> <bound method WhileContext._BuildLoop of <tensorflow.python.ops.control_flow_ops.WhileContext object at 0x7f9b1eed8940>> | |
pred = <local> <function _SubnetworkRecCell.get_output.<locals>.cond at 0x7f9b1eed7950> | |
body = <local> <function _SubnetworkRecCell.get_output.<locals>.body at 0x7f9b1eed7ae8> | |
original_loop_vars = <local> (<tf.Tensor 'output/rec/initial_i:0' shape=() dtype=int32>, ([<tf.Tensor 'output/rec/accum_att_weights/init_accum_att_weights_zeros:0' shape=(1, ?, 1) dtype=float32>, <tf.Tensor 'output/rec/att/init_att_zeros:0' shape=(?, 2048) dtype=float32>, <tf.Tensor 'output/rec/target_embed/init_target_embed... | |
loop_vars = <local> [<tf.Tensor 'output/rec/initial_i:0' shape=() dtype=int32>, <tf.Tensor 'output/rec/accum_att_weights/init_accum_att_weights_zeros:0' shape=(1, ?, 1) dtype=float32>, <tf.Tensor 'output/rec/att/init_att_zeros:0' shape=(?, 2048) dtype=float32>, <tf.Tensor 'output/rec/target_embed/init_target_embed_c..., len = 8 | |
shape_invariants = <local> (TensorShape([]), ([TensorShape([Dimension(None), Dimension(None), Dimension(1)]), TensorShape([Dimension(None), Dimension(2048)]), TensorShape([Dimension(None), Dimension(621)])], [[LSTMStateTuple(c=TensorShape([Dimension(None), Dimension(1000)]), h=TensorShape([Dimension(None), Dimension(1000)]..., _[0]: {len = 0} | |
File "/home/ubuntu/tf1.13/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3022, in _BuildLoop | |
line: body_result = body(*packed_vars_for_body) | |
locals: | |
body_result = <not found> | |
body = <local> <function _SubnetworkRecCell.get_output.<locals>.body at 0x7f9b1eed7ae8> | |
packed_vars_for_body = <local> (<tf.Tensor 'output/rec/while/Identity:0' shape=() dtype=int32>, ([<tf.Tensor 'output/rec/while/Identity_1:0' shape=(?, ?, 1) dtype=float32>, <tf.Tensor 'output/rec/while/Identity_2:0' shape=(?, 2048) dtype=float32>, <tf.Tensor 'output/rec/while/Identity_3:0' shape=(?, 621) dtype=float32>], [[LST... | |
File "/home/ubuntu/manish/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetworkRecLayer.py", line 2141, in body | |
line: outputs_flat = [ | |
maybe_transform(self.net.layers[k]).output.copy_compatible_to( | |
self.layer_data_templates[k].output).placeholder | |
for k in sorted(self._initial_outputs)] | |
locals: | |
outputs_flat = <not found> | |
maybe_transform = <local> <function _SubnetworkRecCell.get_output.<locals>.body.<locals>.maybe_transform at 0x7f9c3812f1e0> | |
self = <local> <_SubnetworkRecCell of <RecLayer 'output' out_type=Data(shape=(None,), dtype='int32', sparse=True, dim=10025, batch_dim_axis=1, batch_shape_meta=[T|?,B])>> | |
self.net = <local> <TFNetwork 'root/output:rec-subnet' parent_layer=<RecLayer 'output' out_type=Data(shape=(None,), dtype='int32', sparse=True, dim=10025, batch_dim_axis=1, batch_shape_meta=[T|?,B])> train=False> | |
self.net.layers = <local> {'prev:end': <_TemplateLayer(CompareLayer)(:prev:compare) 'output/prev:end' out_type=Data(shape=(), dtype='bool', sparse=True, dim=2, time_dim_axis=None, batch_shape_meta=[B]) (construction stack None)>, ':i': <RecStepInfoLayer 'output/:i' out_type=Data(shape=(), dtype='int32', batch_dim_axis=Non..., len = 18 | |
k = <not found> | |
output = <not found> | |
output.copy_compatible_to = <not found> | |
self.layer_data_templates = <local> {'output': <_TemplateLayer(ChoiceLayer)(:template:choice) 'output/output' out_type=Data(shape=(), dtype='int32', sparse=True, dim=10025, time_dim_axis=None, batch_shape_meta=[B]) (construction stack None)>, 'end': <_TemplateLayer(CompareLayer)(:template:compare) 'output/end' out_type=Data(shape=(..., len = 16 | |
placeholder = <not found> | |
sorted = <builtin> <built-in function sorted> | |
self._initial_outputs = <local> {'accum_att_weights': <tf.Tensor 'output/rec/accum_att_weights/init_accum_att_weights_zeros:0' shape=(1, ?, 1) dtype=float32>, 'att': <tf.Tensor 'output/rec/att/init_att_zeros:0' shape=(?, 2048) dtype=float32>, 'target_embed': <tf.Tensor 'output/rec/target_embed/init_target_embed_const/Cast:0' sh... | |
File "/home/ubuntu/manish/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFNetworkRecLayer.py", line 2141, in <listcomp> | |
line: outputs_flat = [ | |
maybe_transform(self.net.layers[k]).output.copy_compatible_to( | |
self.layer_data_templates[k].output).placeholder | |
for k in sorted(self._initial_outputs)] | |
locals: | |
outputs_flat = <not found> | |
maybe_transform = <local> <function _SubnetworkRecCell.get_output.<locals>.body.<locals>.maybe_transform at 0x7f9c3812f1e0> | |
self = <local> <_SubnetworkRecCell of <RecLayer 'output' out_type=Data(shape=(None,), dtype='int32', sparse=True, dim=10025, batch_dim_axis=1, batch_shape_meta=[T|?,B])>> | |
self.net = <local> <TFNetwork 'root/output:rec-subnet' parent_layer=<RecLayer 'output' out_type=Data(shape=(None,), dtype='int32', sparse=True, dim=10025, batch_dim_axis=1, batch_shape_meta=[T|?,B])> train=False> | |
self.net.layers = <local> {'prev:end': <_TemplateLayer(CompareLayer)(:prev:compare) 'output/prev:end' out_type=Data(shape=(), dtype='bool', sparse=True, dim=2, time_dim_axis=None, batch_shape_meta=[B]) (construction stack None)>, ':i': <RecStepInfoLayer 'output/:i' out_type=Data(shape=(), dtype='int32', batch_dim_axis=Non..., len = 18 | |
k = <local> 'target_embed', len = 12 | |
output = <not found> | |
output.copy_compatible_to = <not found> | |
self.layer_data_templates = <local> {'output': <_TemplateLayer(ChoiceLayer)(:template:choice) 'output/output' out_type=Data(shape=(), dtype='int32', sparse=True, dim=10025, time_dim_axis=None, batch_shape_meta=[B]) (construction stack None)>, 'end': <_TemplateLayer(CompareLayer)(:template:compare) 'output/end' out_type=Data(shape=(..., len = 16 | |
placeholder = <not found> | |
sorted = <builtin> <built-in function sorted> | |
self._initial_outputs = <local> {'accum_att_weights': <tf.Tensor 'output/rec/accum_att_weights/init_accum_att_weights_zeros:0' shape=(1, ?, 1) dtype=float32>, 'att': <tf.Tensor 'output/rec/att/init_att_zeros:0' shape=(?, 2048) dtype=float32>, 'target_embed': <tf.Tensor 'output/rec/target_embed/init_target_embed_const/Cast:0' sh... | |
File "/home/ubuntu/manish/returnn-experiments/2018-asr-attention/librispeech/full-setup-attention/returnn/TFUtil.py", line 1218, in copy_compatible_to | |
line: raise ValueError("copy_compatible_to: self %r already has more dims than target data %r" % (self, data)) | |
locals: | |
ValueError = <builtin> <class 'ValueError'> | |
self = <local> Data(name='target_embed_output', shape=(None, 621), batch_shape_meta=[B,T|'time:var:extern_data:classes',F|621]) | |
data = <local> Data(name='target_embed_output', shape=(621,), time_dim_axis=None, batch_shape_meta=[B,F|621]) | |
ValueError: copy_compatible_to: self Data(name='target_embed_output', shape=(None, 621), batch_shape_meta=[B,T|'time:var:extern_data:classes',F|621]) already has more dims than target data Data(name='target_embed_output', shape=(621,), time_dim_axis=None, batch_shape_meta=[B,F|621]) |
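For context (an explanatory note, not part of the original log): the final `ValueError` is a rank mismatch — the `target_embed` layer produced a tensor that still carries a time axis (`[B,T,621]`), while the per-step template inside the rec loop expects one without it (`[B,621]`). `copy_compatible_to` can only add broadcast axes, never drop existing ones, so it fails. A toy sketch of that rule (hypothetical helper name; shapes taken from the log):

```python
import numpy as np

def copy_compatible_to(src, tgt_ndim):
    """Toy analogue of RETURNN's Data.copy_compatible_to: it may only ADD
    leading broadcast axes to reach the target rank, never drop axes."""
    if src.ndim > tgt_ndim:
        raise ValueError(
            "copy_compatible_to: self already has more dims than target data")
    while src.ndim < tgt_ndim:
        src = src[np.newaxis]  # adding a broadcast axis is fine
    return src

batch, time, feat = 4, 7, 621
produced = np.zeros((batch, time, feat))  # actual output: has a time axis
template = np.zeros((batch, feat))        # per-step template: no time axis

try:
    copy_compatible_to(produced, template.ndim)
except ValueError as e:
    print("fails as in the log:", e)
```

The fix direction this suggests: the subnet's `target_embed` output should be the embedding of a single step (no `T` axis) inside the recurrent loop, e.g. the layer template and the constructed layer must agree on whether the time dimension is present.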