OOM Error in /u/maximilian.kannen/setups/20230406_feat/alias/experiments/switchboard/ctc/feat/train_nn/conformer_bs5k_audio_perturbation_scf_conf-wei-oldspecaug-audio_perturbation_speed0.4_0.8_1.2/log.run.1
--------------------- Slurm Task Prolog ------------------------
Job ID: 2810223
Job name: ReturnnTrainingJob.TH4IPwv1UZf5.run
Host: cn-260
Date: Fri 27 Oct 13:28:14 CEST 2023
User: maximilian.kannen
Slurm account: hlt
Slurm partition: gpu_11gb
Work dir:
------------------
Node usage:
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
2810223_1 gpu_11gb ReturnnT maximili R 0:00 1 cn-260
2810222_1 gpu_11gb ReturnnT maximili R 0:04 1 cn-260
------------------
Show launch script with:
sacct -B -j
------------------
--------------------- Slurm Task Prolog ------------------------
[2023-10-27 13:28:22,572] INFO: Generating grammar tables from /usr/local/lib/python3.8/dist-packages/blib2to3/Grammar.txt
[2023-10-27 13:28:22,585] INFO: Writing grammar tables to /u/maximilian.kannen/.cache/black/22.3.0/Grammar3.8.10.final.0.pickle
[2023-10-27 13:28:22,585] INFO: Writing failed: [Errno 2] No such file or directory: '/u/maximilian.kannen/.cache/black/22.3.0/tmps2iuimfr'
[2023-10-27 13:28:22,585] INFO: Generating grammar tables from /usr/local/lib/python3.8/dist-packages/blib2to3/PatternGrammar.txt
[2023-10-27 13:28:22,587] INFO: Writing grammar tables to /u/maximilian.kannen/.cache/black/22.3.0/PatternGrammar3.8.10.final.0.pickle
[2023-10-27 13:28:22,587] INFO: Writing failed: [Errno 2] No such file or directory: '/u/maximilian.kannen/.cache/black/22.3.0/tmps2sjen20'
[2023-10-27 13:28:23,311] INFO: Start Job: Job<alias/experiments/switchboard/ctc/feat/train_nn/conformer_bs5k_audio_perturbation_scf_conf-wei-oldspecaug-audio_perturbation_speed0.4_0.8_1.2 work/i6_core/returnn/training/ReturnnTrainingJob.TH4IPwv1UZf5> Task: run
[2023-10-27 13:28:23,311] INFO: Inputs:
[2023-10-27 13:28:23,311] INFO: /u/vieting/setups/swb/20230406_feat/dependencies/allophones_blank
[2023-10-27 13:28:23,311] INFO: /u/vieting/setups/swb/20230406_feat/dependencies/state-tying_blank
[2023-10-27 13:28:23,312] INFO: /usr/bin/python3
[2023-10-27 13:28:23,312] INFO: /work/asr4/vieting/programs/rasr/20230707/rasr/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard
[2023-10-27 13:28:23,312] INFO: /u/maximilian.kannen/setups/20230406_feat/work/i6_core/corpus/filter/FilterSegmentsByListJob.Fzh6DWEkIA5y/output/segments.1
[2023-10-27 13:28:23,313] INFO: /u/maximilian.kannen/setups/20230406_feat/work/i6_core/corpus/filter/FilterSegmentsByListJob.SVlbt6fqP4Jn/output/segments.1
[2023-10-27 13:28:23,313] INFO: /u/maximilian.kannen/setups/20230406_feat/work/i6_core/corpus/filter/FilterSegmentsByListJob.nrKcBIdsMBZm/output/segments.1
[2023-10-27 13:28:23,316] INFO: /u/maximilian.kannen/setups/20230406_feat/work/i6_core/datasets/switchboard/CreateSwitchboardBlissCorpusJob.Z1EMi4TdrUS6/output/swb.corpus.xml.gz
[2023-10-27 13:28:23,318] INFO: /u/maximilian.kannen/setups/20230406_feat/work/i6_core/returnn/oggzip/BlissToOggZipJob.lAFM8R9mzLpI/output/out.ogg.zip
[2023-10-27 13:28:23,319] INFO: /u/maximilian.kannen/setups/20230406_feat/work/i6_core/text/processing/TailJob.RiSM6fe2XipO/output/out.gz
[2023-10-27 13:28:23,320] INFO: /u/maximilian.kannen/setups/20230406_feat/work/i6_core/tools/git/CloneGitRepositoryJob.FigHMwYJhhef/output/repository
[2023-10-27 13:28:23,321] INFO: /u/maximilian.kannen/setups/20230406_feat/work/i6_experiments/users/berger/recipe/lexicon/modification/MakeBlankLexiconJob.N8RlHYKzilei/output/lexicon.xml
Uname: uname_result(system='Linux', node='cn-260', release='5.15.0-39-generic', version='#42-Ubuntu SMP Thu Jun 9 23:42:32 UTC 2022', machine='x86_64', processor='x86_64')
Load: (0.24, 0.35, 0.96)
[2023-10-27 13:28:23,323] INFO: ------------------------------------------------------------
[2023-10-27 13:28:23,323] INFO: Starting subtask for arg id: 0 args: []
[2023-10-27 13:28:23,323] INFO: ------------------------------------------------------------
[2023-10-27 13:28:23,337] INFO: Run time: 0:00:00 CPU: 79.60% RSS: 86MB VMS: 305MB
RETURNN starting up, version 1.20231026.144554+git.d62891f6, date/time 2023-10-27-13-28-24 (UTC+0200), pid 4043690, cwd /work/asr3/vieting/hiwis/kannen/sisyphus_work_dirs/swb/i6_core/returnn/training/ReturnnTrainingJob.TH4IPwv1UZf5/work, Python /usr/bin/python3
RETURNN command line options: ['/u/maximilian.kannen/setups/20230406_feat/work/i6_core/returnn/training/ReturnnTrainingJob.TH4IPwv1UZf5/output/returnn.config']
Hostname: cn-260
[2023-10-27 13:28:28,349] INFO: Run time: 0:00:05 CPU: 0.40% RSS: 206MB VMS: 1.32GB
[2023-10-27 13:28:33,361] INFO: Run time: 0:00:10 CPU: 0.20% RSS: 367MB VMS: 1.51GB
TensorFlow: 2.8.0 (unknown) (<not-under-git> in /usr/local/lib/python3.8/dist-packages/tensorflow)
Use num_threads=1 (but min 2) via OMP_NUM_THREADS.
Setup TF inter and intra global thread pools, num_threads 2, session opts {'log_device_placement': False, 'device_count': {'GPU': 0}, 'intra_op_parallelism_threads': 2, 'inter_op_parallelism_threads': 2}.
2023-10-27 13:28:37.693446: I tensorflow/core/platform/cpu_feature_guard.cc:152] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE3 SSE4.1 SSE4.2 AVX
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
CUDA_VISIBLE_DEVICES is set to '1'.
Collecting TensorFlow device list...
[2023-10-27 13:28:38,379] INFO: Run time: 0:00:15 CPU: 0.00% RSS: 440MB VMS: 6.92GB
[2023-10-27 13:28:43,396] INFO: Run time: 0:00:20 CPU: 0.40% RSS: 678MB VMS: 12.16GB
2023-10-27 13:28:45.477199: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /device:GPU:0 with 10245 MB memory: -> device: 0, name: NVIDIA GeForce GTX 1080 Ti, pci bus id: 0000:03:00.0, compute capability: 6.1
Local devices available to TensorFlow:
1/2: name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 5393298771443453831
xla_global_id: -1
2/2: name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 10742726656
locality {
bus_id: 1
links {
}
}
incarnation: 8870728080383846691
physical_device_desc: "device: 0, name: NVIDIA GeForce GTX 1080 Ti, pci bus id: 0000:03:00.0, compute capability: 6.1"
xla_global_id: 416903419
Using gpu device 1: NVIDIA GeForce GTX 1080 Ti
Hostname 'cn-260', GPU 1, GPU-dev-name 'NVIDIA GeForce GTX 1080 Ti', GPU-memory 10.0GB
LOG: connected to ('10.6.100.1', 10321)
LOG: destination: /var/tmp/maximilian.kannen/work/asr4/vieting/setups/swb/work/20230406_feat/i6_core/returnn/oggzip/BlissToOggZipJob.lAFM8R9mzLpI/output/out.ogg.zip
LOG: using existing file
LOG: connected to ('10.6.100.1', 10321)
LOG: destination: /var/tmp/maximilian.kannen/work/asr4/vieting/setups/swb/work/20230406_feat/i6_core/returnn/oggzip/BlissToOggZipJob.lAFM8R9mzLpI/output/out.ogg.zip
LOG: using existing file
LOG: connected to ('10.6.100.1', 10321)
LOG: destination: /var/tmp/maximilian.kannen/work/asr4/vieting/setups/swb/work/20230406_feat/i6_core/returnn/oggzip/BlissToOggZipJob.lAFM8R9mzLpI/output/out.ogg.zip
LOG: using existing file
LOG: connected to ('10.6.100.1', 10321)
LOG: destination: /var/tmp/maximilian.kannen/work/asr4/vieting/setups/swb/work/20230406_feat/i6_core/returnn/oggzip/BlissToOggZipJob.lAFM8R9mzLpI/output/out.ogg.zip
LOG: using existing file
[2023-10-27 13:28:48,415] INFO: Run time: 0:00:25 CPU: 0.40% RSS: 1.24GB VMS: 13.29GB
[2023-10-27 13:28:53,432] INFO: Run time: 0:00:30 CPU: 0.40% RSS: 1.58GB VMS: 13.64GB
Train data:
input: 1 x 1
output: {'raw': {'dtype': 'string', 'shape': ()}, 'orth': [256, 1], 'data': [1, 2]}
MultiProcDataset, sequences: 249229, frames: unknown
Dev data:
OggZipDataset, sequences: 300, frames: unknown
Learning-rate-control: file learning_rates does not exist yet
Setup TF session with options {'log_device_placement': False, 'device_count': {'GPU': 1}} ...
2023-10-27 13:28:57.617039: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10245 MB memory: -> device: 0, name: NVIDIA GeForce GTX 1080 Ti, pci bus id: 0000:03:00.0, compute capability: 6.1
layer /'data': [B,T|'time:var:extern_data:data'[B],F|F'feature:data'(1)] float32
layer /features/'conv_h_filter': ['conv_h_filter:static:0'(128),'conv_h_filter:static:1'(1),F|F'conv_h_filter:static:2'(150)] float32
layer /features/'conv_h': [B,T|'⌈((-63+time:var:extern_data:data)+-64)/5⌉'[B],F|F'conv_h:channel'(150)] float32
layer /features/'conv_h_act': [B,T|'⌈((-63+time:var:extern_data:data)+-64)/5⌉'[B],F|F'conv_h:channel'(150)] float32
layer /features/'conv_h_split': [B,T|'⌈((-63+time:var:extern_data:data)+-64)/5⌉'[B],F'conv_h:channel'(150),F|F'conv_h_split_split_dims1'(1)] float32
DEPRECATION WARNING: Explicitly specify in_spatial_dims when there is more than one spatial dim in the input.
This will be disallowed with behavior_version 8.
layer /features/'conv_l': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/16⌉'[B],F'conv_h:channel'(150),F|F'conv_l:channel'(5)] float32
layer /features/'conv_l_merge': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/16⌉'[B],F|F'conv_h:channel*conv_l:channel'(750)] float32
DEPRECATION WARNING: MergeDimsLayer, only keep_order=True is allowed
This will be disallowed with behavior_version 6.
layer /features/'conv_l_act_no_norm': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/16⌉'[B],F|F'conv_h:channel*conv_l:channel'(750)] float32
layer /features/'conv_l_act': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/16⌉'[B],F|F'conv_h:channel*conv_l:channel'(750)] float32
layer /features/'output': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/16⌉'[B],F|F'conv_h:channel*conv_l:channel'(750)] float32
layer /'features': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/16⌉'[B],F|F'conv_h:channel*conv_l:channel'(750)] float32
layer /'specaug': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/16⌉'[B],F|F'conv_h:channel*conv_l:channel'(750)] float32
WARNING:tensorflow:From /work/asr3/vieting/hiwis/kannen/sisyphus_work_dirs/swb/i6_core/tools/git/CloneGitRepositoryJob.FigHMwYJhhef/output/repository/returnn/tf/network.py:2462: calling Zeros.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
layer /'conv_source': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/16⌉'[B],F'conv_h:channel*conv_l:channel'(750),F|F'conv_source_split_dims1'(1)] float32
layer /'conv_1': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/16⌉'[B],F'conv_h:channel*conv_l:channel'(750),F|F'conv_1:channel'(32)] float32
layer /'conv_1_pool': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/16⌉'[B],'conv_h:channel*conv_l:channel//2'(375),F|F'conv_1:channel'(32)] float32
layer /'conv_2': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/32⌉'[B],'conv_h:channel*conv_l:channel//2'(375),F|F'conv_2:channel'(64)] float32
layer /'conv_3': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],'conv_h:channel*conv_l:channel//2'(375),F|F'conv_3:channel'(64)] float32
layer /'conv_merged': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'(conv_h:channel*conv_l:channel//2)*conv_3:channel'(24000)] float32
layer /'input_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'input_linear:feature-dense'(512)] float32
layer /'input_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'input_linear:feature-dense'(512)] float32
layer /'conformer_1_ffmod_1_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'input_linear:feature-dense'(512)] float32
layer /'conformer_1_ffmod_1_linear_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_1_ffmod_1_linear_swish:feature-dense'(2048)] float32
layer /'conformer_1_ffmod_1_dropout_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_1_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_1_ffmod_1_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_1_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_1_ffmod_1_half_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_1_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_1_conv_mod_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_1_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_1_conv_mod_pointwise_conv_1': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_1_conv_mod_pointwise_conv_1:feature-dense'(1024)] float32
layer /'conformer_1_conv_mod_glu': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'(conformer_1_conv_mod_pointwise_conv_1:feature-dense)//2'(512)] float32
layer /'conformer_1_conv_mod_depthwise_conv': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_1_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_1_conv_mod_bn': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_1_conv_mod_depthwise_conv:channel'(512)] float32
DEPRECATION WARNING: batch_norm masked_time should be specified explicitly
This will be disallowed with behavior_version 12.
WARNING:tensorflow:From /work/asr3/vieting/hiwis/kannen/sisyphus_work_dirs/swb/i6_core/tools/git/CloneGitRepositoryJob.FigHMwYJhhef/output/repository/returnn/tf/util/basic.py:1725: calling Ones.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
[2023-10-27 13:28:58,450] INFO: Run time: 0:00:35 CPU: 0.20% RSS: 1.91GB VMS: 13.97GB
layer /'conformer_1_conv_mod_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_1_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_1_conv_mod_pointwise_conv_2': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_1_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_1_conv_mod_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_1_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_1_conv_mod_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_1_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_1_mhsa_mod_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_1_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_1_mhsa_mod_relpos_encoding': [T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_1_mhsa_mod_relpos_encoding_rel_pos_enc_feat'(64)] float32
layer /'conformer_1_mhsa_mod_self_attention': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_1_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_1_mhsa_mod_att_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_1_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_1_mhsa_mod_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_1_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_1_mhsa_mod_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_1_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_1_ffmod_2_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_1_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_1_ffmod_2_linear_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_1_ffmod_2_linear_swish:feature-dense'(2048)] float32
layer /'conformer_1_ffmod_2_dropout_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_1_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_1_ffmod_2_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_1_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_1_ffmod_2_half_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_1_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_1_output': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_1_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_2_ffmod_1_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_1_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_2_ffmod_1_linear_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_2_ffmod_1_linear_swish:feature-dense'(2048)] float32
layer /'conformer_2_ffmod_1_dropout_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_2_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_2_ffmod_1_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_2_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_2_ffmod_1_half_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_2_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_2_conv_mod_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_2_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_2_conv_mod_pointwise_conv_1': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_2_conv_mod_pointwise_conv_1:feature-dense'(1024)] float32
layer /'conformer_2_conv_mod_glu': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'(conformer_2_conv_mod_pointwise_conv_1:feature-dense)//2'(512)] float32
layer /'conformer_2_conv_mod_depthwise_conv': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_2_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_2_conv_mod_bn': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_2_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_2_conv_mod_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_2_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_2_conv_mod_pointwise_conv_2': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_2_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_2_conv_mod_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_2_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_2_conv_mod_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_2_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_2_mhsa_mod_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_2_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_2_mhsa_mod_relpos_encoding': [T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_2_mhsa_mod_relpos_encoding_rel_pos_enc_feat'(64)] float32
layer /'conformer_2_mhsa_mod_self_attention': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_2_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_2_mhsa_mod_att_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_2_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_2_mhsa_mod_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_2_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_2_mhsa_mod_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_2_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_2_ffmod_2_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_2_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_2_ffmod_2_linear_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_2_ffmod_2_linear_swish:feature-dense'(2048)] float32
layer /'conformer_2_ffmod_2_dropout_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_2_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_2_ffmod_2_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_2_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_2_ffmod_2_half_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_2_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_2_output': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_2_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_3_ffmod_1_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_2_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_3_ffmod_1_linear_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_3_ffmod_1_linear_swish:feature-dense'(2048)] float32
layer /'conformer_3_ffmod_1_dropout_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_3_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_3_ffmod_1_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_3_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_3_ffmod_1_half_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_3_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_3_conv_mod_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_3_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_3_conv_mod_pointwise_conv_1': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_3_conv_mod_pointwise_conv_1:feature-dense'(1024)] float32
layer /'conformer_3_conv_mod_glu': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'(conformer_3_conv_mod_pointwise_conv_1:feature-dense)//2'(512)] float32
layer /'conformer_3_conv_mod_depthwise_conv': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_3_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_3_conv_mod_bn': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_3_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_3_conv_mod_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_3_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_3_conv_mod_pointwise_conv_2': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_3_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_3_conv_mod_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_3_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_3_conv_mod_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_3_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_3_mhsa_mod_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_3_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_3_mhsa_mod_relpos_encoding': [T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_3_mhsa_mod_relpos_encoding_rel_pos_enc_feat'(64)] float32
layer /'conformer_3_mhsa_mod_self_attention': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_3_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_3_mhsa_mod_att_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_3_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_3_mhsa_mod_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_3_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_3_mhsa_mod_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_3_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_3_ffmod_2_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_3_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_3_ffmod_2_linear_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_3_ffmod_2_linear_swish:feature-dense'(2048)] float32
layer /'conformer_3_ffmod_2_dropout_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_3_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_3_ffmod_2_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_3_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_3_ffmod_2_half_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_3_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_3_output': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_3_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_4_ffmod_1_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_3_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_4_ffmod_1_linear_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_4_ffmod_1_linear_swish:feature-dense'(2048)] float32
layer /'conformer_4_ffmod_1_dropout_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_4_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_4_ffmod_1_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_4_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_4_ffmod_1_half_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_4_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_4_conv_mod_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_4_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_4_conv_mod_pointwise_conv_1': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_4_conv_mod_pointwise_conv_1:feature-dense'(1024)] float32
layer /'conformer_4_conv_mod_glu': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'(conformer_4_conv_mod_pointwise_conv_1:feature-dense)//2'(512)] float32
layer /'conformer_4_conv_mod_depthwise_conv': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_4_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_4_conv_mod_bn': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_4_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_4_conv_mod_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_4_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_4_conv_mod_pointwise_conv_2': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_4_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_4_conv_mod_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_4_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_4_conv_mod_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_4_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_4_mhsa_mod_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_4_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_4_mhsa_mod_relpos_encoding': [T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_4_mhsa_mod_relpos_encoding_rel_pos_enc_feat'(64)] float32
layer /'conformer_4_mhsa_mod_self_attention': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_4_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_4_mhsa_mod_att_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_4_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_4_mhsa_mod_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_4_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_4_mhsa_mod_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_4_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_4_ffmod_2_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_4_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_4_ffmod_2_linear_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_4_ffmod_2_linear_swish:feature-dense'(2048)] float32
layer /'conformer_4_ffmod_2_dropout_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_4_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_4_ffmod_2_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_4_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_4_ffmod_2_half_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_4_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_4_output': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_4_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_5_ffmod_1_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_4_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_5_ffmod_1_linear_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_5_ffmod_1_linear_swish:feature-dense'(2048)] float32
layer /'conformer_5_ffmod_1_dropout_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_5_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_5_ffmod_1_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_5_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_5_ffmod_1_half_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_5_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_5_conv_mod_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_5_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_5_conv_mod_pointwise_conv_1': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_5_conv_mod_pointwise_conv_1:feature-dense'(1024)] float32
layer /'conformer_5_conv_mod_glu': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'(conformer_5_conv_mod_pointwise_conv_1:feature-dense)//2'(512)] float32 | |
layer /'conformer_5_conv_mod_depthwise_conv': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_5_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_5_conv_mod_bn': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_5_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_5_conv_mod_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_5_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_5_conv_mod_pointwise_conv_2': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_5_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_5_conv_mod_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_5_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_5_conv_mod_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_5_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_5_mhsa_mod_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_5_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_5_mhsa_mod_relpos_encoding': [T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_5_mhsa_mod_relpos_encoding_rel_pos_enc_feat'(64)] float32 | |
layer /'conformer_5_mhsa_mod_self_attention': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_5_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_5_mhsa_mod_att_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_5_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_5_mhsa_mod_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_5_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_5_mhsa_mod_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_5_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_5_ffmod_2_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_5_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_5_ffmod_2_linear_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_5_ffmod_2_linear_swish:feature-dense'(2048)] float32 | |
layer /'conformer_5_ffmod_2_dropout_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_5_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_5_ffmod_2_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_5_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_5_ffmod_2_half_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_5_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_5_output': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_5_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_6_ffmod_1_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_5_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_6_ffmod_1_linear_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_6_ffmod_1_linear_swish:feature-dense'(2048)] float32 | |
layer /'conformer_6_ffmod_1_dropout_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_6_ffmod_1_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_6_ffmod_1_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_6_ffmod_1_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_6_ffmod_1_half_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_6_ffmod_1_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_6_conv_mod_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_6_ffmod_1_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_6_conv_mod_pointwise_conv_1': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_6_conv_mod_pointwise_conv_1:feature-dense'(1024)] float32 | |
layer /'conformer_6_conv_mod_glu': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'(conformer_6_conv_mod_pointwise_conv_1:feature-dense)//2'(512)] float32 | |
layer /'conformer_6_conv_mod_depthwise_conv': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_6_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_6_conv_mod_bn': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_6_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_6_conv_mod_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_6_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_6_conv_mod_pointwise_conv_2': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_6_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_6_conv_mod_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_6_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_6_conv_mod_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_6_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_6_mhsa_mod_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_6_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_6_mhsa_mod_relpos_encoding': [T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_6_mhsa_mod_relpos_encoding_rel_pos_enc_feat'(64)] float32 | |
layer /'conformer_6_mhsa_mod_self_attention': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_6_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_6_mhsa_mod_att_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_6_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_6_mhsa_mod_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_6_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_6_mhsa_mod_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_6_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_6_ffmod_2_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_6_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_6_ffmod_2_linear_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_6_ffmod_2_linear_swish:feature-dense'(2048)] float32 | |
layer /'conformer_6_ffmod_2_dropout_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_6_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_6_ffmod_2_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_6_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_6_ffmod_2_half_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_6_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_6_output': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_6_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_7_ffmod_1_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_6_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_7_ffmod_1_linear_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_7_ffmod_1_linear_swish:feature-dense'(2048)] float32 | |
layer /'conformer_7_ffmod_1_dropout_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_7_ffmod_1_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_7_ffmod_1_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_7_ffmod_1_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_7_ffmod_1_half_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_7_ffmod_1_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_7_conv_mod_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_7_ffmod_1_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_7_conv_mod_pointwise_conv_1': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_7_conv_mod_pointwise_conv_1:feature-dense'(1024)] float32 | |
layer /'conformer_7_conv_mod_glu': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'(conformer_7_conv_mod_pointwise_conv_1:feature-dense)//2'(512)] float32 | |
layer /'conformer_7_conv_mod_depthwise_conv': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_7_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_7_conv_mod_bn': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_7_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_7_conv_mod_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_7_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_7_conv_mod_pointwise_conv_2': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_7_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_7_conv_mod_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_7_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_7_conv_mod_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_7_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_7_mhsa_mod_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_7_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_7_mhsa_mod_relpos_encoding': [T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_7_mhsa_mod_relpos_encoding_rel_pos_enc_feat'(64)] float32 | |
layer /'conformer_7_mhsa_mod_self_attention': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_7_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_7_mhsa_mod_att_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_7_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_7_mhsa_mod_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_7_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_7_mhsa_mod_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_7_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_7_ffmod_2_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_7_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_7_ffmod_2_linear_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_7_ffmod_2_linear_swish:feature-dense'(2048)] float32 | |
layer /'conformer_7_ffmod_2_dropout_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_7_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_7_ffmod_2_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_7_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_7_ffmod_2_half_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_7_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_7_output': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_7_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_8_ffmod_1_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_7_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_8_ffmod_1_linear_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_8_ffmod_1_linear_swish:feature-dense'(2048)] float32 | |
layer /'conformer_8_ffmod_1_dropout_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_8_ffmod_1_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_8_ffmod_1_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_8_ffmod_1_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_8_ffmod_1_half_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_8_ffmod_1_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_8_conv_mod_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_8_ffmod_1_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_8_conv_mod_pointwise_conv_1': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_8_conv_mod_pointwise_conv_1:feature-dense'(1024)] float32 | |
layer /'conformer_8_conv_mod_glu': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'(conformer_8_conv_mod_pointwise_conv_1:feature-dense)//2'(512)] float32 | |
layer /'conformer_8_conv_mod_depthwise_conv': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_8_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_8_conv_mod_bn': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_8_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_8_conv_mod_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_8_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_8_conv_mod_pointwise_conv_2': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_8_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_8_conv_mod_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_8_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_8_conv_mod_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_8_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_8_mhsa_mod_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_8_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_8_mhsa_mod_relpos_encoding': [T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_8_mhsa_mod_relpos_encoding_rel_pos_enc_feat'(64)] float32 | |
layer /'conformer_8_mhsa_mod_self_attention': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_8_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_8_mhsa_mod_att_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_8_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_8_mhsa_mod_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_8_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_8_mhsa_mod_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_8_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_8_ffmod_2_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_8_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_8_ffmod_2_linear_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_8_ffmod_2_linear_swish:feature-dense'(2048)] float32 | |
layer /'conformer_8_ffmod_2_dropout_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_8_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_8_ffmod_2_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_8_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_8_ffmod_2_half_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_8_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_8_output': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_8_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_9_ffmod_1_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_8_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_9_ffmod_1_linear_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_9_ffmod_1_linear_swish:feature-dense'(2048)] float32 | |
layer /'conformer_9_ffmod_1_dropout_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_9_ffmod_1_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_9_ffmod_1_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_9_ffmod_1_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_9_ffmod_1_half_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_9_ffmod_1_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_9_conv_mod_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_9_ffmod_1_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_9_conv_mod_pointwise_conv_1': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_9_conv_mod_pointwise_conv_1:feature-dense'(1024)] float32 | |
layer /'conformer_9_conv_mod_glu': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'(conformer_9_conv_mod_pointwise_conv_1:feature-dense)//2'(512)] float32 | |
layer /'conformer_9_conv_mod_depthwise_conv': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_9_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_9_conv_mod_bn': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_9_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_9_conv_mod_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_9_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_9_conv_mod_pointwise_conv_2': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_9_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_9_conv_mod_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_9_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_9_conv_mod_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_9_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_9_mhsa_mod_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_9_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_9_mhsa_mod_relpos_encoding': [T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_9_mhsa_mod_relpos_encoding_rel_pos_enc_feat'(64)] float32 | |
layer /'conformer_9_mhsa_mod_self_attention': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_9_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_9_mhsa_mod_att_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_9_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_9_mhsa_mod_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_9_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_9_mhsa_mod_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_9_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_9_ffmod_2_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_9_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_9_ffmod_2_linear_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_9_ffmod_2_linear_swish:feature-dense'(2048)] float32 | |
layer /'conformer_9_ffmod_2_dropout_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_9_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_9_ffmod_2_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_9_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_9_ffmod_2_half_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_9_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_9_output': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_9_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_10_ffmod_1_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_9_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_10_ffmod_1_linear_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_10_ffmod_1_linear_swish:feature-dense'(2048)] float32 | |
layer /'conformer_10_ffmod_1_dropout_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_10_ffmod_1_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_10_ffmod_1_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_10_ffmod_1_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_10_ffmod_1_half_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_10_ffmod_1_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_10_conv_mod_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_10_ffmod_1_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_10_conv_mod_pointwise_conv_1': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_10_conv_mod_pointwise_conv_1:feature-dense'(1024)] float32 | |
layer /'conformer_10_conv_mod_glu': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'(conformer_10_conv_mod_pointwise_conv_1:feature-dense)//2'(512)] float32 | |
layer /'conformer_10_conv_mod_depthwise_conv': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_10_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_10_conv_mod_bn': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_10_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_10_conv_mod_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_10_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_10_conv_mod_pointwise_conv_2': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_10_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_10_conv_mod_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_10_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_10_conv_mod_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_10_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_10_mhsa_mod_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_10_conv_mod_depthwise_conv:channel'(512)] float32 | |
layer /'conformer_10_mhsa_mod_relpos_encoding': [T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_10_mhsa_mod_relpos_encoding_rel_pos_enc_feat'(64)] float32 | |
layer /'conformer_10_mhsa_mod_self_attention': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_10_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_10_mhsa_mod_att_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_10_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_10_mhsa_mod_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_10_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_10_mhsa_mod_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_10_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_10_ffmod_2_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_10_mhsa_mod_self_attention_self_att_feat'(512)] float32 | |
layer /'conformer_10_ffmod_2_linear_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_10_ffmod_2_linear_swish:feature-dense'(2048)] float32 | |
layer /'conformer_10_ffmod_2_dropout_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_10_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_10_ffmod_2_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_10_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_10_ffmod_2_half_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_10_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_10_output': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_10_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_11_ffmod_1_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_10_ffmod_2_dropout_linear:feature-dense'(512)] float32 | |
layer /'conformer_11_ffmod_1_linear_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_11_ffmod_1_linear_swish:feature-dense'(2048)] float32 | |
layer /'conformer_11_ffmod_1_dropout_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_11_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_11_ffmod_1_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_11_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_11_ffmod_1_half_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_11_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_11_conv_mod_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_11_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_11_conv_mod_pointwise_conv_1': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_11_conv_mod_pointwise_conv_1:feature-dense'(1024)] float32
layer /'conformer_11_conv_mod_glu': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'(conformer_11_conv_mod_pointwise_conv_1:feature-dense)//2'(512)] float32
layer /'conformer_11_conv_mod_depthwise_conv': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_11_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_11_conv_mod_bn': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_11_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_11_conv_mod_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_11_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_11_conv_mod_pointwise_conv_2': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_11_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_11_conv_mod_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_11_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_11_conv_mod_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_11_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_11_mhsa_mod_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_11_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_11_mhsa_mod_relpos_encoding': [T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_11_mhsa_mod_relpos_encoding_rel_pos_enc_feat'(64)] float32
layer /'conformer_11_mhsa_mod_self_attention': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_11_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_11_mhsa_mod_att_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_11_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_11_mhsa_mod_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_11_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_11_mhsa_mod_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_11_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_11_ffmod_2_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_11_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_11_ffmod_2_linear_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_11_ffmod_2_linear_swish:feature-dense'(2048)] float32
layer /'conformer_11_ffmod_2_dropout_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_11_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_11_ffmod_2_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_11_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_11_ffmod_2_half_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_11_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_11_output': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_11_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_12_ffmod_1_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_11_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_12_ffmod_1_linear_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_12_ffmod_1_linear_swish:feature-dense'(2048)] float32
layer /'conformer_12_ffmod_1_dropout_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_12_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_12_ffmod_1_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_12_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_12_ffmod_1_half_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_12_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_12_conv_mod_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_12_ffmod_1_dropout_linear:feature-dense'(512)] float32
layer /'conformer_12_conv_mod_pointwise_conv_1': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_12_conv_mod_pointwise_conv_1:feature-dense'(1024)] float32
layer /'conformer_12_conv_mod_glu': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'(conformer_12_conv_mod_pointwise_conv_1:feature-dense)//2'(512)] float32
layer /'conformer_12_conv_mod_depthwise_conv': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_12_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_12_conv_mod_bn': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_12_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_12_conv_mod_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_12_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_12_conv_mod_pointwise_conv_2': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_12_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_12_conv_mod_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_12_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_12_conv_mod_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_12_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_12_mhsa_mod_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_12_conv_mod_depthwise_conv:channel'(512)] float32
layer /'conformer_12_mhsa_mod_relpos_encoding': [T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_12_mhsa_mod_relpos_encoding_rel_pos_enc_feat'(64)] float32
layer /'conformer_12_mhsa_mod_self_attention': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_12_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_12_mhsa_mod_att_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_12_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_12_mhsa_mod_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_12_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_12_mhsa_mod_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_12_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_12_ffmod_2_ln': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_12_mhsa_mod_self_attention_self_att_feat'(512)] float32
layer /'conformer_12_ffmod_2_linear_swish': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_12_ffmod_2_linear_swish:feature-dense'(2048)] float32
layer /'conformer_12_ffmod_2_dropout_linear': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_12_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_12_ffmod_2_dropout': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_12_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_12_ffmod_2_half_res_add': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_12_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'conformer_12_output': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_12_ffmod_2_dropout_linear:feature-dense'(512)] float32
layer /'encoder': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'conformer_12_ffmod_2_dropout_linear:feature-dense'(512)] float32
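The time dim tag repeated in every layer line above encodes the waveform front-end subsampling as nested ceiling divisions. As a sketch only (the offsets -63/-64, -19/-20 and the divisors 5 and 64 are transcribed from the dim tag; mapping them to particular kernel sizes or strides in the config is an assumption), the encoder sequence length for a given number of input samples can be computed like this:

```python
import math

def encoder_time_len(t: int) -> int:
    # Transcription of the dim tag from the log:
    # ceil(((-19 + ceil(((-63 + t) + -64) / 5)) + -20) / 64)
    t = math.ceil((t - 63 - 64) / 5)      # inner stage: offsets -63/-64, division by 5
    return math.ceil((t - 19 - 20) / 64)  # outer stage: offsets -19/-20, division by 64

print(encoder_time_len(16000))
```

This implies roughly 320x total time reduction from raw samples to encoder frames, which is consistent with the feature extraction happening inside the network (extern data has feature dim 1).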
2023-10-27 13:29:04.799913: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10245 MB memory: -> device: 0, name: NVIDIA GeForce GTX 1080 Ti, pci bus id: 0000:03:00.0, compute capability: 6.1
layer /'output': [B,T|'⌈((-19+(⌈((-63+time:var:extern_data:data)+-64)/5⌉))+-20)/64⌉'[B],F|F'output:feature-dense'(88)] float32
WARNING:tensorflow:From /work/asr3/vieting/hiwis/kannen/sisyphus_work_dirs/swb/i6_core/tools/git/CloneGitRepositoryJob.FigHMwYJhhef/output/repository/returnn/tf/sprint.py:54: py_func (from tensorflow.python.ops.script_ops) is deprecated and will be removed in a future version.
Instructions for updating:
tf.py_func is deprecated in TF V2. Instead, there are two
options available in V2.
- tf.py_function takes a python function which manipulates tf eager
tensors instead of numpy arrays. It's easy to convert a tf eager tensor to
an ndarray (just call tensor.numpy()) but having access to eager tensors
means `tf.py_function`s can use accelerators such as GPUs as well as
being differentiable using a gradient tape.
- tf.numpy_function maintains the semantics of the deprecated tf.py_func
(it is not differentiable, and manipulates numpy arrays). It drops the
stateful argument making all functions stateful.
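The deprecation notice above describes two V2 replacements for `tf.py_func`. A minimal illustration of both (not from the log; the function and variable names are mine, and the import is guarded so the snippet degrades gracefully where TensorFlow is not installed):

```python
# Illustration of the two V2 replacements for tf.py_func described above.
try:
    import tensorflow as tf
    HAVE_TF = True
except ImportError:  # run without TensorFlow: skip the demonstration
    HAVE_TF = False

if HAVE_TF:
    def np_double(a):
        return a * 2  # receives a numpy array, like the old tf.py_func

    # tf.numpy_function keeps tf.py_func semantics: numpy in/out, not differentiable
    result_np = int(tf.numpy_function(np_double, [tf.constant(21, tf.int64)], tf.int64))

    # tf.py_function receives eager tensors instead, so it can use accelerators
    # and be differentiated through a gradient tape
    result_eager = int(tf.py_function(lambda x: x * 2, [tf.constant(21, tf.int64)], tf.int64))
else:
    result_np = result_eager = None
```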
Waiting for lock-file: /var/tmp/maximilian.kannen/returnn_tf_cache/ops/FastBaumWelchOp/08a9779b3b/lock_file
OpCodeCompiler call: /usr/local/cuda-11.6/bin/nvcc -shared -O2 -std=c++14 -I /usr/local/lib/python3.8/dist-packages/tensorflow/include -I /usr/local/lib/python3.8/dist-packages/tensorflow/include/external/nsync/public -ccbin /usr/bin/gcc-9 -I /usr/local/cuda-11.6/targets/x86_64-linux/include -I /usr/local/cuda-11.6/include -L /usr/local/cuda-11.6/lib64 -x cu -v -DGOOGLE_CUDA=1 -Xcompiler -fPIC -Xcompiler -v -arch compute_61 -I /usr/local/lib/python3.8/dist-packages/tensorflow/include/third_party/gpus/cuda/include -D_GLIBCXX_USE_CXX11_ABI=1 -DNDEBUG=1 -g /var/tmp/maximilian.kannen/returnn_tf_cache/ops/FastBaumWelchOp/08a9779b3b/FastBaumWelchOp.cc -o /var/tmp/maximilian.kannen/returnn_tf_cache/ops/FastBaumWelchOp/08a9779b3b/FastBaumWelchOp.so -L/usr/local/lib/python3.8/dist-packages/scipy/.libs -l:libopenblasp-r0-34a18dc3.3.7.so -L/usr/local/lib/python3.8/dist-packages/numpy.libs -l:libopenblasp-r0-2d23e62b.3.17.so -L/usr/local/lib/python3.8/dist-packages/tensorflow -l:libtensorflow_framework.so.2
[2023-10-27 13:29:48,594] INFO: Run time: 0:01:25 CPU: 0.20% RSS: 2.47GB VMS: 14.82GB
[2023-10-27 13:29:53,612] INFO: Run time: 0:01:30 CPU: 0.40% RSS: 3.14GB VMS: 15.62GB
[2023-10-27 13:29:58,629] INFO: Run time: 0:01:35 CPU: 0.40% RSS: 2.64GB VMS: 14.97GB
[2023-10-27 13:30:03,645] INFO: Run time: 0:01:40 CPU: 0.40% RSS: 2.29GB VMS: 14.64GB
[2023-10-27 13:30:08,661] INFO: Run time: 0:01:45 CPU: 0.20% RSS: 2.72GB VMS: 15.09GB
[2023-10-27 13:30:13,674] INFO: Run time: 0:01:50 CPU: 0.40% RSS: 2.13GB VMS: 14.46GB
Network layer topology:
extern data: data: Tensor{[B,T|'time:var:extern_data:data'[B],F|F'feature:data'(1)]}, seq_tag: Tensor{[B?], dtype='string'}
used data keys: ['data', 'seq_tag']
layers:
layer batch_norm 'conformer_10_conv_mod_bn' #: 512
layer conv 'conformer_10_conv_mod_depthwise_conv' #: 512
layer copy 'conformer_10_conv_mod_dropout' #: 512
layer gating 'conformer_10_conv_mod_glu' #: 512
layer layer_norm 'conformer_10_conv_mod_ln' #: 512
layer linear 'conformer_10_conv_mod_pointwise_conv_1' #: 1024
layer linear 'conformer_10_conv_mod_pointwise_conv_2' #: 512
layer combine 'conformer_10_conv_mod_res_add' #: 512
layer activation 'conformer_10_conv_mod_swish' #: 512
layer copy 'conformer_10_ffmod_1_dropout' #: 512
layer linear 'conformer_10_ffmod_1_dropout_linear' #: 512
layer eval 'conformer_10_ffmod_1_half_res_add' #: 512
layer linear 'conformer_10_ffmod_1_linear_swish' #: 2048
layer layer_norm 'conformer_10_ffmod_1_ln' #: 512
layer copy 'conformer_10_ffmod_2_dropout' #: 512
layer linear 'conformer_10_ffmod_2_dropout_linear' #: 512
layer eval 'conformer_10_ffmod_2_half_res_add' #: 512
layer linear 'conformer_10_ffmod_2_linear_swish' #: 2048
layer layer_norm 'conformer_10_ffmod_2_ln' #: 512
layer linear 'conformer_10_mhsa_mod_att_linear' #: 512
layer copy 'conformer_10_mhsa_mod_dropout' #: 512
layer layer_norm 'conformer_10_mhsa_mod_ln' #: 512
layer relative_positional_encoding 'conformer_10_mhsa_mod_relpos_encoding' #: 64
layer combine 'conformer_10_mhsa_mod_res_add' #: 512
layer self_attention 'conformer_10_mhsa_mod_self_attention' #: 512
layer layer_norm 'conformer_10_output' #: 512
layer batch_norm 'conformer_11_conv_mod_bn' #: 512
layer conv 'conformer_11_conv_mod_depthwise_conv' #: 512
layer copy 'conformer_11_conv_mod_dropout' #: 512
layer gating 'conformer_11_conv_mod_glu' #: 512
layer layer_norm 'conformer_11_conv_mod_ln' #: 512
layer linear 'conformer_11_conv_mod_pointwise_conv_1' #: 1024
layer linear 'conformer_11_conv_mod_pointwise_conv_2' #: 512
layer combine 'conformer_11_conv_mod_res_add' #: 512
layer activation 'conformer_11_conv_mod_swish' #: 512
layer copy 'conformer_11_ffmod_1_dropout' #: 512
layer linear 'conformer_11_ffmod_1_dropout_linear' #: 512
layer eval 'conformer_11_ffmod_1_half_res_add' #: 512
layer linear 'conformer_11_ffmod_1_linear_swish' #: 2048
layer layer_norm 'conformer_11_ffmod_1_ln' #: 512
layer copy 'conformer_11_ffmod_2_dropout' #: 512
layer linear 'conformer_11_ffmod_2_dropout_linear' #: 512
layer eval 'conformer_11_ffmod_2_half_res_add' #: 512
layer linear 'conformer_11_ffmod_2_linear_swish' #: 2048
layer layer_norm 'conformer_11_ffmod_2_ln' #: 512
layer linear 'conformer_11_mhsa_mod_att_linear' #: 512
layer copy 'conformer_11_mhsa_mod_dropout' #: 512
layer layer_norm 'conformer_11_mhsa_mod_ln' #: 512
layer relative_positional_encoding 'conformer_11_mhsa_mod_relpos_encoding' #: 64
layer combine 'conformer_11_mhsa_mod_res_add' #: 512
layer self_attention 'conformer_11_mhsa_mod_self_attention' #: 512
layer layer_norm 'conformer_11_output' #: 512
layer batch_norm 'conformer_12_conv_mod_bn' #: 512
layer conv 'conformer_12_conv_mod_depthwise_conv' #: 512
layer copy 'conformer_12_conv_mod_dropout' #: 512
layer gating 'conformer_12_conv_mod_glu' #: 512
layer layer_norm 'conformer_12_conv_mod_ln' #: 512
layer linear 'conformer_12_conv_mod_pointwise_conv_1' #: 1024
layer linear 'conformer_12_conv_mod_pointwise_conv_2' #: 512
layer combine 'conformer_12_conv_mod_res_add' #: 512
layer activation 'conformer_12_conv_mod_swish' #: 512
layer copy 'conformer_12_ffmod_1_dropout' #: 512
layer linear 'conformer_12_ffmod_1_dropout_linear' #: 512
layer eval 'conformer_12_ffmod_1_half_res_add' #: 512
layer linear 'conformer_12_ffmod_1_linear_swish' #: 2048
layer layer_norm 'conformer_12_ffmod_1_ln' #: 512
layer copy 'conformer_12_ffmod_2_dropout' #: 512
layer linear 'conformer_12_ffmod_2_dropout_linear' #: 512
layer eval 'conformer_12_ffmod_2_half_res_add' #: 512
layer linear 'conformer_12_ffmod_2_linear_swish' #: 2048
layer layer_norm 'conformer_12_ffmod_2_ln' #: 512
layer linear 'conformer_12_mhsa_mod_att_linear' #: 512
layer copy 'conformer_12_mhsa_mod_dropout' #: 512
layer layer_norm 'conformer_12_mhsa_mod_ln' #: 512
layer relative_positional_encoding 'conformer_12_mhsa_mod_relpos_encoding' #: 64
layer combine 'conformer_12_mhsa_mod_res_add' #: 512
layer self_attention 'conformer_12_mhsa_mod_self_attention' #: 512
layer layer_norm 'conformer_12_output' #: 512
layer batch_norm 'conformer_1_conv_mod_bn' #: 512
layer conv 'conformer_1_conv_mod_depthwise_conv' #: 512
layer copy 'conformer_1_conv_mod_dropout' #: 512
layer gating 'conformer_1_conv_mod_glu' #: 512
layer layer_norm 'conformer_1_conv_mod_ln' #: 512
layer linear 'conformer_1_conv_mod_pointwise_conv_1' #: 1024
layer linear 'conformer_1_conv_mod_pointwise_conv_2' #: 512
layer combine 'conformer_1_conv_mod_res_add' #: 512
layer activation 'conformer_1_conv_mod_swish' #: 512
layer copy 'conformer_1_ffmod_1_dropout' #: 512
layer linear 'conformer_1_ffmod_1_dropout_linear' #: 512
layer eval 'conformer_1_ffmod_1_half_res_add' #: 512
layer linear 'conformer_1_ffmod_1_linear_swish' #: 2048
layer layer_norm 'conformer_1_ffmod_1_ln' #: 512
layer copy 'conformer_1_ffmod_2_dropout' #: 512
layer linear 'conformer_1_ffmod_2_dropout_linear' #: 512
layer eval 'conformer_1_ffmod_2_half_res_add' #: 512
layer linear 'conformer_1_ffmod_2_linear_swish' #: 2048
layer layer_norm 'conformer_1_ffmod_2_ln' #: 512
layer linear 'conformer_1_mhsa_mod_att_linear' #: 512
layer copy 'conformer_1_mhsa_mod_dropout' #: 512
layer layer_norm 'conformer_1_mhsa_mod_ln' #: 512
layer relative_positional_encoding 'conformer_1_mhsa_mod_relpos_encoding' #: 64
layer combine 'conformer_1_mhsa_mod_res_add' #: 512
layer self_attention 'conformer_1_mhsa_mod_self_attention' #: 512
layer layer_norm 'conformer_1_output' #: 512
layer batch_norm 'conformer_2_conv_mod_bn' #: 512
layer conv 'conformer_2_conv_mod_depthwise_conv' #: 512
layer copy 'conformer_2_conv_mod_dropout' #: 512
layer gating 'conformer_2_conv_mod_glu' #: 512
layer layer_norm 'conformer_2_conv_mod_ln' #: 512
layer linear 'conformer_2_conv_mod_pointwise_conv_1' #: 1024
layer linear 'conformer_2_conv_mod_pointwise_conv_2' #: 512
layer combine 'conformer_2_conv_mod_res_add' #: 512
layer activation 'conformer_2_conv_mod_swish' #: 512
layer copy 'conformer_2_ffmod_1_dropout' #: 512
layer linear 'conformer_2_ffmod_1_dropout_linear' #: 512
layer eval 'conformer_2_ffmod_1_half_res_add' #: 512
layer linear 'conformer_2_ffmod_1_linear_swish' #: 2048
layer layer_norm 'conformer_2_ffmod_1_ln' #: 512
layer copy 'conformer_2_ffmod_2_dropout' #: 512
layer linear 'conformer_2_ffmod_2_dropout_linear' #: 512
layer eval 'conformer_2_ffmod_2_half_res_add' #: 512
layer linear 'conformer_2_ffmod_2_linear_swish' #: 2048
layer layer_norm 'conformer_2_ffmod_2_ln' #: 512
layer linear 'conformer_2_mhsa_mod_att_linear' #: 512
layer copy 'conformer_2_mhsa_mod_dropout' #: 512
layer layer_norm 'conformer_2_mhsa_mod_ln' #: 512
layer relative_positional_encoding 'conformer_2_mhsa_mod_relpos_encoding' #: 64
layer combine 'conformer_2_mhsa_mod_res_add' #: 512
layer self_attention 'conformer_2_mhsa_mod_self_attention' #: 512
layer layer_norm 'conformer_2_output' #: 512
layer batch_norm 'conformer_3_conv_mod_bn' #: 512
layer conv 'conformer_3_conv_mod_depthwise_conv' #: 512
layer copy 'conformer_3_conv_mod_dropout' #: 512
layer gating 'conformer_3_conv_mod_glu' #: 512
layer layer_norm 'conformer_3_conv_mod_ln' #: 512
layer linear 'conformer_3_conv_mod_pointwise_conv_1' #: 1024
layer linear 'conformer_3_conv_mod_pointwise_conv_2' #: 512
layer combine 'conformer_3_conv_mod_res_add' #: 512
layer activation 'conformer_3_conv_mod_swish' #: 512
layer copy 'conformer_3_ffmod_1_dropout' #: 512
layer linear 'conformer_3_ffmod_1_dropout_linear' #: 512
layer eval 'conformer_3_ffmod_1_half_res_add' #: 512
layer linear 'conformer_3_ffmod_1_linear_swish' #: 2048
layer layer_norm 'conformer_3_ffmod_1_ln' #: 512
layer copy 'conformer_3_ffmod_2_dropout' #: 512
layer linear 'conformer_3_ffmod_2_dropout_linear' #: 512
layer eval 'conformer_3_ffmod_2_half_res_add' #: 512
layer linear 'conformer_3_ffmod_2_linear_swish' #: 2048
layer layer_norm 'conformer_3_ffmod_2_ln' #: 512
layer linear 'conformer_3_mhsa_mod_att_linear' #: 512
layer copy 'conformer_3_mhsa_mod_dropout' #: 512
layer layer_norm 'conformer_3_mhsa_mod_ln' #: 512
layer relative_positional_encoding 'conformer_3_mhsa_mod_relpos_encoding' #: 64
layer combine 'conformer_3_mhsa_mod_res_add' #: 512
layer self_attention 'conformer_3_mhsa_mod_self_attention' #: 512
layer layer_norm 'conformer_3_output' #: 512
layer batch_norm 'conformer_4_conv_mod_bn' #: 512
layer conv 'conformer_4_conv_mod_depthwise_conv' #: 512
layer copy 'conformer_4_conv_mod_dropout' #: 512
layer gating 'conformer_4_conv_mod_glu' #: 512
layer layer_norm 'conformer_4_conv_mod_ln' #: 512
layer linear 'conformer_4_conv_mod_pointwise_conv_1' #: 1024
layer linear 'conformer_4_conv_mod_pointwise_conv_2' #: 512
layer combine 'conformer_4_conv_mod_res_add' #: 512
layer activation 'conformer_4_conv_mod_swish' #: 512
layer copy 'conformer_4_ffmod_1_dropout' #: 512
layer linear 'conformer_4_ffmod_1_dropout_linear' #: 512
layer eval 'conformer_4_ffmod_1_half_res_add' #: 512
layer linear 'conformer_4_ffmod_1_linear_swish' #: 2048
layer layer_norm 'conformer_4_ffmod_1_ln' #: 512
layer copy 'conformer_4_ffmod_2_dropout' #: 512
layer linear 'conformer_4_ffmod_2_dropout_linear' #: 512
layer eval 'conformer_4_ffmod_2_half_res_add' #: 512
layer linear 'conformer_4_ffmod_2_linear_swish' #: 2048
layer layer_norm 'conformer_4_ffmod_2_ln' #: 512
layer linear 'conformer_4_mhsa_mod_att_linear' #: 512
layer copy 'conformer_4_mhsa_mod_dropout' #: 512
layer layer_norm 'conformer_4_mhsa_mod_ln' #: 512
layer relative_positional_encoding 'conformer_4_mhsa_mod_relpos_encoding' #: 64
layer combine 'conformer_4_mhsa_mod_res_add' #: 512
layer self_attention 'conformer_4_mhsa_mod_self_attention' #: 512
layer layer_norm 'conformer_4_output' #: 512
layer batch_norm 'conformer_5_conv_mod_bn' #: 512
layer conv 'conformer_5_conv_mod_depthwise_conv' #: 512
layer copy 'conformer_5_conv_mod_dropout' #: 512
layer gating 'conformer_5_conv_mod_glu' #: 512
layer layer_norm 'conformer_5_conv_mod_ln' #: 512
layer linear 'conformer_5_conv_mod_pointwise_conv_1' #: 1024
layer linear 'conformer_5_conv_mod_pointwise_conv_2' #: 512
layer combine 'conformer_5_conv_mod_res_add' #: 512
layer activation 'conformer_5_conv_mod_swish' #: 512
layer copy 'conformer_5_ffmod_1_dropout' #: 512
layer linear 'conformer_5_ffmod_1_dropout_linear' #: 512
layer eval 'conformer_5_ffmod_1_half_res_add' #: 512
layer linear 'conformer_5_ffmod_1_linear_swish' #: 2048
layer layer_norm 'conformer_5_ffmod_1_ln' #: 512
layer copy 'conformer_5_ffmod_2_dropout' #: 512
layer linear 'conformer_5_ffmod_2_dropout_linear' #: 512
layer eval 'conformer_5_ffmod_2_half_res_add' #: 512
layer linear 'conformer_5_ffmod_2_linear_swish' #: 2048
layer layer_norm 'conformer_5_ffmod_2_ln' #: 512
layer linear 'conformer_5_mhsa_mod_att_linear' #: 512
layer copy 'conformer_5_mhsa_mod_dropout' #: 512
layer layer_norm 'conformer_5_mhsa_mod_ln' #: 512
layer relative_positional_encoding 'conformer_5_mhsa_mod_relpos_encoding' #: 64
layer combine 'conformer_5_mhsa_mod_res_add' #: 512
layer self_attention 'conformer_5_mhsa_mod_self_attention' #: 512
layer layer_norm 'conformer_5_output' #: 512
layer batch_norm 'conformer_6_conv_mod_bn' #: 512
layer conv 'conformer_6_conv_mod_depthwise_conv' #: 512
layer copy 'conformer_6_conv_mod_dropout' #: 512
layer gating 'conformer_6_conv_mod_glu' #: 512
layer layer_norm 'conformer_6_conv_mod_ln' #: 512
layer linear 'conformer_6_conv_mod_pointwise_conv_1' #: 1024
layer linear 'conformer_6_conv_mod_pointwise_conv_2' #: 512
layer combine 'conformer_6_conv_mod_res_add' #: 512
layer activation 'conformer_6_conv_mod_swish' #: 512
layer copy 'conformer_6_ffmod_1_dropout' #: 512
layer linear 'conformer_6_ffmod_1_dropout_linear' #: 512
layer eval 'conformer_6_ffmod_1_half_res_add' #: 512
layer linear 'conformer_6_ffmod_1_linear_swish' #: 2048
layer layer_norm 'conformer_6_ffmod_1_ln' #: 512
layer copy 'conformer_6_ffmod_2_dropout' #: 512
layer linear 'conformer_6_ffmod_2_dropout_linear' #: 512
layer eval 'conformer_6_ffmod_2_half_res_add' #: 512
layer linear 'conformer_6_ffmod_2_linear_swish' #: 2048
layer layer_norm 'conformer_6_ffmod_2_ln' #: 512
layer linear 'conformer_6_mhsa_mod_att_linear' #: 512
layer copy 'conformer_6_mhsa_mod_dropout' #: 512
layer layer_norm 'conformer_6_mhsa_mod_ln' #: 512
layer relative_positional_encoding 'conformer_6_mhsa_mod_relpos_encoding' #: 64
layer combine 'conformer_6_mhsa_mod_res_add' #: 512
layer self_attention 'conformer_6_mhsa_mod_self_attention' #: 512
layer layer_norm 'conformer_6_output' #: 512
layer batch_norm 'conformer_7_conv_mod_bn' #: 512
layer conv 'conformer_7_conv_mod_depthwise_conv' #: 512
layer copy 'conformer_7_conv_mod_dropout' #: 512
layer gating 'conformer_7_conv_mod_glu' #: 512
layer layer_norm 'conformer_7_conv_mod_ln' #: 512
layer linear 'conformer_7_conv_mod_pointwise_conv_1' #: 1024
layer linear 'conformer_7_conv_mod_pointwise_conv_2' #: 512
layer combine 'conformer_7_conv_mod_res_add' #: 512
layer activation 'conformer_7_conv_mod_swish' #: 512
layer copy 'conformer_7_ffmod_1_dropout' #: 512
layer linear 'conformer_7_ffmod_1_dropout_linear' #: 512
layer eval 'conformer_7_ffmod_1_half_res_add' #: 512
layer linear 'conformer_7_ffmod_1_linear_swish' #: 2048
layer layer_norm 'conformer_7_ffmod_1_ln' #: 512
layer copy 'conformer_7_ffmod_2_dropout' #: 512
layer linear 'conformer_7_ffmod_2_dropout_linear' #: 512
layer eval 'conformer_7_ffmod_2_half_res_add' #: 512
layer linear 'conformer_7_ffmod_2_linear_swish' #: 2048
layer layer_norm 'conformer_7_ffmod_2_ln' #: 512
layer linear 'conformer_7_mhsa_mod_att_linear' #: 512
layer copy 'conformer_7_mhsa_mod_dropout' #: 512
layer layer_norm 'conformer_7_mhsa_mod_ln' #: 512
layer relative_positional_encoding 'conformer_7_mhsa_mod_relpos_encoding' #: 64
layer combine 'conformer_7_mhsa_mod_res_add' #: 512
layer self_attention 'conformer_7_mhsa_mod_self_attention' #: 512
layer layer_norm 'conformer_7_output' #: 512
layer batch_norm 'conformer_8_conv_mod_bn' #: 512
layer conv 'conformer_8_conv_mod_depthwise_conv' #: 512
layer copy 'conformer_8_conv_mod_dropout' #: 512
layer gating 'conformer_8_conv_mod_glu' #: 512
layer layer_norm 'conformer_8_conv_mod_ln' #: 512
layer linear 'conformer_8_conv_mod_pointwise_conv_1' #: 1024
layer linear 'conformer_8_conv_mod_pointwise_conv_2' #: 512
layer combine 'conformer_8_conv_mod_res_add' #: 512
layer activation 'conformer_8_conv_mod_swish' #: 512
layer copy 'conformer_8_ffmod_1_dropout' #: 512
layer linear 'conformer_8_ffmod_1_dropout_linear' #: 512
layer eval 'conformer_8_ffmod_1_half_res_add' #: 512
layer linear 'conformer_8_ffmod_1_linear_swish' #: 2048
layer layer_norm 'conformer_8_ffmod_1_ln' #: 512
layer copy 'conformer_8_ffmod_2_dropout' #: 512
layer linear 'conformer_8_ffmod_2_dropout_linear' #: 512
layer eval 'conformer_8_ffmod_2_half_res_add' #: 512
layer linear 'conformer_8_ffmod_2_linear_swish' #: 2048
layer layer_norm 'conformer_8_ffmod_2_ln' #: 512
layer linear 'conformer_8_mhsa_mod_att_linear' #: 512
layer copy 'conformer_8_mhsa_mod_dropout' #: 512
layer layer_norm 'conformer_8_mhsa_mod_ln' #: 512
layer relative_positional_encoding 'conformer_8_mhsa_mod_relpos_encoding' #: 64
layer combine 'conformer_8_mhsa_mod_res_add' #: 512
layer self_attention 'conformer_8_mhsa_mod_self_attention' #: 512
layer layer_norm 'conformer_8_output' #: 512
layer batch_norm 'conformer_9_conv_mod_bn' #: 512
layer conv 'conformer_9_conv_mod_depthwise_conv' #: 512
layer copy 'conformer_9_conv_mod_dropout' #: 512
layer gating 'conformer_9_conv_mod_glu' #: 512
layer layer_norm 'conformer_9_conv_mod_ln' #: 512
layer linear 'conformer_9_conv_mod_pointwise_conv_1' #: 1024
layer linear 'conformer_9_conv_mod_pointwise_conv_2' #: 512
layer combine 'conformer_9_conv_mod_res_add' #: 512
layer activation 'conformer_9_conv_mod_swish' #: 512
layer copy 'conformer_9_ffmod_1_dropout' #: 512
layer linear 'conformer_9_ffmod_1_dropout_linear' #: 512
layer eval 'conformer_9_ffmod_1_half_res_add' #: 512
layer linear 'conformer_9_ffmod_1_linear_swish' #: 2048
layer layer_norm 'conformer_9_ffmod_1_ln' #: 512
layer copy 'conformer_9_ffmod_2_dropout' #: 512
layer linear 'conformer_9_ffmod_2_dropout_linear' #: 512
layer eval 'conformer_9_ffmod_2_half_res_add' #: 512
layer linear 'conformer_9_ffmod_2_linear_swish' #: 2048
layer layer_norm 'conformer_9_ffmod_2_ln' #: 512
layer linear 'conformer_9_mhsa_mod_att_linear' #: 512
layer copy 'conformer_9_mhsa_mod_dropout' #: 512
layer layer_norm 'conformer_9_mhsa_mod_ln' #: 512
layer relative_positional_encoding 'conformer_9_mhsa_mod_relpos_encoding' #: 64
layer combine 'conformer_9_mhsa_mod_res_add' #: 512
layer self_attention 'conformer_9_mhsa_mod_self_attention' #: 512
layer layer_norm 'conformer_9_output' #: 512
layer conv 'conv_1' #: 32
layer pool 'conv_1_pool' #: 32
layer conv 'conv_2' #: 64
layer conv 'conv_3' #: 64
layer merge_dims 'conv_merged' #: 24000
layer split_dims 'conv_source' #: 1
layer source 'data' #: 1
layer copy 'encoder' #: 512
layer subnetwork 'features' #: 750
layer conv 'features/conv_h' #: 150
layer eval 'features/conv_h_act' #: 150
layer variable 'features/conv_h_filter' #: 150
layer split_dims 'features/conv_h_split' #: 1
layer conv 'features/conv_l' #: 5
layer layer_norm 'features/conv_l_act' #: 750
layer eval 'features/conv_l_act_no_norm' #: 750
layer merge_dims 'features/conv_l_merge' #: 750
layer copy 'features/output' #: 750
layer copy 'input_dropout' #: 512
layer linear 'input_linear' #: 512
layer softmax 'output' #: 88
layer eval 'specaug' #: 750
net params #: 85180092
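The total parameter count can be spot-checked against the variable shapes printed below. As a quick sanity sketch for one feed-forward module (shapes taken from the log: W (512, 2048) + b (2048,), W (2048, 512) + b (512,), LayerNorm scale and bias of 512 each; the helper name is mine):

```python
def ffmod_params(d_model: int = 512, d_ff: int = 2048) -> int:
    linear_swish = d_model * d_ff + d_ff       # linear_swish: W (512, 2048) + b (2048,)
    dropout_linear = d_ff * d_model + d_model  # dropout_linear: W (2048, 512) + b (512,)
    layer_norm = 2 * d_model                   # ln: scale (512,) + bias (512,)
    return linear_swish + dropout_linear + layer_norm

print(ffmod_params())
```

Each Conformer block contains two such modules, so the feed-forward parts alone contribute roughly 4.2M parameters per block across the 12 blocks.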
net trainable params: [<tf.Variable 'conformer_10_conv_mod_bn/batch_norm/conformer_10_conv_mod_bn_conformer_10_conv_mod_bn_output_beta:0' shape=(1, 1, 512) dtype=float32>, <tf.Variable 'conformer_10_conv_mod_bn/batch_norm/conformer_10_conv_mod_bn_conformer_10_conv_mod_bn_output_gamma:0' shape=(1, 1, 512) dtype=float32>, <tf.Variable 'conformer_10_conv_mod_depthwise_conv/W:0' shape=(32, 1, 512) dtype=float32>, <tf.Variable 'conformer_10_conv_mod_depthwise_conv/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_10_conv_mod_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_10_conv_mod_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_10_conv_mod_pointwise_conv_1/W:0' shape=(512, 1024) dtype=float32>, <tf.Variable 'conformer_10_conv_mod_pointwise_conv_1/b:0' shape=(1024,) dtype=float32>, <tf.Variable 'conformer_10_conv_mod_pointwise_conv_2/W:0' shape=(512, 512) dtype=float32>, <tf.Variable 'conformer_10_conv_mod_pointwise_conv_2/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_10_ffmod_1_dropout_linear/W:0' shape=(2048, 512) dtype=float32>, <tf.Variable 'conformer_10_ffmod_1_dropout_linear/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_10_ffmod_1_linear_swish/W:0' shape=(512, 2048) dtype=float32>, <tf.Variable 'conformer_10_ffmod_1_linear_swish/b:0' shape=(2048,) dtype=float32>, <tf.Variable 'conformer_10_ffmod_1_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_10_ffmod_1_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_10_ffmod_2_dropout_linear/W:0' shape=(2048, 512) dtype=float32>, <tf.Variable 'conformer_10_ffmod_2_dropout_linear/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_10_ffmod_2_linear_swish/W:0' shape=(512, 2048) dtype=float32>, <tf.Variable 'conformer_10_ffmod_2_linear_swish/b:0' shape=(2048,) dtype=float32>, <tf.Variable 'conformer_10_ffmod_2_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_10_ffmod_2_ln/scale:0' shape=(512,) dtype=float32>, 
<tf.Variable 'conformer_10_mhsa_mod_att_linear/W:0' shape=(512, 512) dtype=float32>, <tf.Variable 'conformer_10_mhsa_mod_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_10_mhsa_mod_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_10_mhsa_mod_relpos_encoding/encoding_matrix:0' shape=(65, 64) dtype=float32>, <tf.Variable 'conformer_10_mhsa_mod_self_attention/QKV:0' shape=(512, 1536) dtype=float32>, <tf.Variable 'conformer_10_output/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_10_output/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_11_conv_mod_bn/batch_norm/conformer_11_conv_mod_bn_conformer_11_conv_mod_bn_output_beta:0' shape=(1, 1, 512) dtype=float32>, <tf.Variable 'conformer_11_conv_mod_bn/batch_norm/conformer_11_conv_mod_bn_conformer_11_conv_mod_bn_output_gamma:0' shape=(1, 1, 512) dtype=float32>, <tf.Variable 'conformer_11_conv_mod_depthwise_conv/W:0' shape=(32, 1, 512) dtype=float32>, <tf.Variable 'conformer_11_conv_mod_depthwise_conv/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_11_conv_mod_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_11_conv_mod_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_11_conv_mod_pointwise_conv_1/W:0' shape=(512, 1024) dtype=float32>, <tf.Variable 'conformer_11_conv_mod_pointwise_conv_1/b:0' shape=(1024,) dtype=float32>, <tf.Variable 'conformer_11_conv_mod_pointwise_conv_2/W:0' shape=(512, 512) dtype=float32>, <tf.Variable 'conformer_11_conv_mod_pointwise_conv_2/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_11_ffmod_1_dropout_linear/W:0' shape=(2048, 512) dtype=float32>, <tf.Variable 'conformer_11_ffmod_1_dropout_linear/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_11_ffmod_1_linear_swish/W:0' shape=(512, 2048) dtype=float32>, <tf.Variable 'conformer_11_ffmod_1_linear_swish/b:0' shape=(2048,) dtype=float32>, <tf.Variable 'conformer_11_ffmod_1_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 
'conformer_11_ffmod_1_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_11_ffmod_2_dropout_linear/W:0' shape=(2048, 512) dtype=float32>, <tf.Variable 'conformer_11_ffmod_2_dropout_linear/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_11_ffmod_2_linear_swish/W:0' shape=(512, 2048) dtype=float32>, <tf.Variable 'conformer_11_ffmod_2_linear_swish/b:0' shape=(2048,) dtype=float32>, <tf.Variable 'conformer_11_ffmod_2_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_11_ffmod_2_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_11_mhsa_mod_att_linear/W:0' shape=(512, 512) dtype=float32>, <tf.Variable 'conformer_11_mhsa_mod_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_11_mhsa_mod_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_11_mhsa_mod_relpos_encoding/encoding_matrix:0' shape=(65, 64) dtype=float32>, <tf.Variable 'conformer_11_mhsa_mod_self_attention/QKV:0' shape=(512, 1536) dtype=float32>, <tf.Variable 'conformer_11_output/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_11_output/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_12_conv_mod_bn/batch_norm/conformer_12_conv_mod_bn_conformer_12_conv_mod_bn_output_beta:0' shape=(1, 1, 512) dtype=float32>, <tf.Variable 'conformer_12_conv_mod_bn/batch_norm/conformer_12_conv_mod_bn_conformer_12_conv_mod_bn_output_gamma:0' shape=(1, 1, 512) dtype=float32>, <tf.Variable 'conformer_12_conv_mod_depthwise_conv/W:0' shape=(32, 1, 512) dtype=float32>, <tf.Variable 'conformer_12_conv_mod_depthwise_conv/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_12_conv_mod_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_12_conv_mod_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_12_conv_mod_pointwise_conv_1/W:0' shape=(512, 1024) dtype=float32>, <tf.Variable 'conformer_12_conv_mod_pointwise_conv_1/b:0' shape=(1024,) dtype=float32>, <tf.Variable 
'conformer_12_conv_mod_pointwise_conv_2/W:0' shape=(512, 512) dtype=float32>, <tf.Variable 'conformer_12_conv_mod_pointwise_conv_2/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_12_ffmod_1_dropout_linear/W:0' shape=(2048, 512) dtype=float32>, <tf.Variable 'conformer_12_ffmod_1_dropout_linear/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_12_ffmod_1_linear_swish/W:0' shape=(512, 2048) dtype=float32>, <tf.Variable 'conformer_12_ffmod_1_linear_swish/b:0' shape=(2048,) dtype=float32>, <tf.Variable 'conformer_12_ffmod_1_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_12_ffmod_1_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_12_ffmod_2_dropout_linear/W:0' shape=(2048, 512) dtype=float32>, <tf.Variable 'conformer_12_ffmod_2_dropout_linear/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_12_ffmod_2_linear_swish/W:0' shape=(512, 2048) dtype=float32>, <tf.Variable 'conformer_12_ffmod_2_linear_swish/b:0' shape=(2048,) dtype=float32>, <tf.Variable 'conformer_12_ffmod_2_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_12_ffmod_2_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_12_mhsa_mod_att_linear/W:0' shape=(512, 512) dtype=float32>, <tf.Variable 'conformer_12_mhsa_mod_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_12_mhsa_mod_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_12_mhsa_mod_relpos_encoding/encoding_matrix:0' shape=(65, 64) dtype=float32>, <tf.Variable 'conformer_12_mhsa_mod_self_attention/QKV:0' shape=(512, 1536) dtype=float32>, <tf.Variable 'conformer_12_output/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_12_output/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_1_conv_mod_bn/batch_norm/conformer_1_conv_mod_bn_conformer_1_conv_mod_bn_output_beta:0' shape=(1, 1, 512) dtype=float32>, <tf.Variable 'conformer_1_conv_mod_bn/batch_norm/conformer_1_conv_mod_bn_conformer_1_conv_mod_bn_output_gamma:0' shape=(1, 1, 
512) dtype=float32>, <tf.Variable 'conformer_1_conv_mod_depthwise_conv/W:0' shape=(32, 1, 512) dtype=float32>, <tf.Variable 'conformer_1_conv_mod_depthwise_conv/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_1_conv_mod_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_1_conv_mod_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_1_conv_mod_pointwise_conv_1/W:0' shape=(512, 1024) dtype=float32>, <tf.Variable 'conformer_1_conv_mod_pointwise_conv_1/b:0' shape=(1024,) dtype=float32>, <tf.Variable 'conformer_1_conv_mod_pointwise_conv_2/W:0' shape=(512, 512) dtype=float32>, <tf.Variable 'conformer_1_conv_mod_pointwise_conv_2/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_1_ffmod_1_dropout_linear/W:0' shape=(2048, 512) dtype=float32>, <tf.Variable 'conformer_1_ffmod_1_dropout_linear/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_1_ffmod_1_linear_swish/W:0' shape=(512, 2048) dtype=float32>, <tf.Variable 'conformer_1_ffmod_1_linear_swish/b:0' shape=(2048,) dtype=float32>, <tf.Variable 'conformer_1_ffmod_1_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_1_ffmod_1_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_1_ffmod_2_dropout_linear/W:0' shape=(2048, 512) dtype=float32>, <tf.Variable 'conformer_1_ffmod_2_dropout_linear/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_1_ffmod_2_linear_swish/W:0' shape=(512, 2048) dtype=float32>, <tf.Variable 'conformer_1_ffmod_2_linear_swish/b:0' shape=(2048,) dtype=float32>, <tf.Variable 'conformer_1_ffmod_2_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_1_ffmod_2_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_1_mhsa_mod_att_linear/W:0' shape=(512, 512) dtype=float32>, <tf.Variable 'conformer_1_mhsa_mod_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_1_mhsa_mod_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_1_mhsa_mod_relpos_encoding/encoding_matrix:0' shape=(65, 64) 
dtype=float32>, <tf.Variable 'conformer_1_mhsa_mod_self_attention/QKV:0' shape=(512, 1536) dtype=float32>, <tf.Variable 'conformer_1_output/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_1_output/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_2_conv_mod_bn/batch_norm/conformer_2_conv_mod_bn_conformer_2_conv_mod_bn_output_beta:0' shape=(1, 1, 512) dtype=float32>, <tf.Variable 'conformer_2_conv_mod_bn/batch_norm/conformer_2_conv_mod_bn_conformer_2_conv_mod_bn_output_gamma:0' shape=(1, 1, 512) dtype=float32>, <tf.Variable 'conformer_2_conv_mod_depthwise_conv/W:0' shape=(32, 1, 512) dtype=float32>, <tf.Variable 'conformer_2_conv_mod_depthwise_conv/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_2_conv_mod_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_2_conv_mod_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_2_conv_mod_pointwise_conv_1/W:0' shape=(512, 1024) dtype=float32>, <tf.Variable 'conformer_2_conv_mod_pointwise_conv_1/b:0' shape=(1024,) dtype=float32>, <tf.Variable 'conformer_2_conv_mod_pointwise_conv_2/W:0' shape=(512, 512) dtype=float32>, <tf.Variable 'conformer_2_conv_mod_pointwise_conv_2/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_2_ffmod_1_dropout_linear/W:0' shape=(2048, 512) dtype=float32>, <tf.Variable 'conformer_2_ffmod_1_dropout_linear/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_2_ffmod_1_linear_swish/W:0' shape=(512, 2048) dtype=float32>, <tf.Variable 'conformer_2_ffmod_1_linear_swish/b:0' shape=(2048,) dtype=float32>, <tf.Variable 'conformer_2_ffmod_1_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_2_ffmod_1_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_2_ffmod_2_dropout_linear/W:0' shape=(2048, 512) dtype=float32>, <tf.Variable 'conformer_2_ffmod_2_dropout_linear/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_2_ffmod_2_linear_swish/W:0' shape=(512, 2048) dtype=float32>, <tf.Variable 
'conformer_2_ffmod_2_linear_swish/b:0' shape=(2048,) dtype=float32>, <tf.Variable 'conformer_2_ffmod_2_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_2_ffmod_2_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_2_mhsa_mod_att_linear/W:0' shape=(512, 512) dtype=float32>, <tf.Variable 'conformer_2_mhsa_mod_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_2_mhsa_mod_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_2_mhsa_mod_relpos_encoding/encoding_matrix:0' shape=(65, 64) dtype=float32>, <tf.Variable 'conformer_2_mhsa_mod_self_attention/QKV:0' shape=(512, 1536) dtype=float32>, <tf.Variable 'conformer_2_output/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_2_output/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_3_conv_mod_bn/batch_norm/conformer_3_conv_mod_bn_conformer_3_conv_mod_bn_output_beta:0' shape=(1, 1, 512) dtype=float32>, <tf.Variable 'conformer_3_conv_mod_bn/batch_norm/conformer_3_conv_mod_bn_conformer_3_conv_mod_bn_output_gamma:0' shape=(1, 1, 512) dtype=float32>, <tf.Variable 'conformer_3_conv_mod_depthwise_conv/W:0' shape=(32, 1, 512) dtype=float32>, <tf.Variable 'conformer_3_conv_mod_depthwise_conv/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_3_conv_mod_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_3_conv_mod_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_3_conv_mod_pointwise_conv_1/W:0' shape=(512, 1024) dtype=float32>, <tf.Variable 'conformer_3_conv_mod_pointwise_conv_1/b:0' shape=(1024,) dtype=float32>, <tf.Variable 'conformer_3_conv_mod_pointwise_conv_2/W:0' shape=(512, 512) dtype=float32>, <tf.Variable 'conformer_3_conv_mod_pointwise_conv_2/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_3_ffmod_1_dropout_linear/W:0' shape=(2048, 512) dtype=float32>, <tf.Variable 'conformer_3_ffmod_1_dropout_linear/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_3_ffmod_1_linear_swish/W:0' shape=(512, 
2048) dtype=float32>, <tf.Variable 'conformer_3_ffmod_1_linear_swish/b:0' shape=(2048,) dtype=float32>, <tf.Variable 'conformer_3_ffmod_1_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_3_ffmod_1_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_3_ffmod_2_dropout_linear/W:0' shape=(2048, 512) dtype=float32>, <tf.Variable 'conformer_3_ffmod_2_dropout_linear/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_3_ffmod_2_linear_swish/W:0' shape=(512, 2048) dtype=float32>, <tf.Variable 'conformer_3_ffmod_2_linear_swish/b:0' shape=(2048,) dtype=float32>, <tf.Variable 'conformer_3_ffmod_2_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_3_ffmod_2_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_3_mhsa_mod_att_linear/W:0' shape=(512, 512) dtype=float32>, <tf.Variable 'conformer_3_mhsa_mod_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_3_mhsa_mod_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_3_mhsa_mod_relpos_encoding/encoding_matrix:0' shape=(65, 64) dtype=float32>, <tf.Variable 'conformer_3_mhsa_mod_self_attention/QKV:0' shape=(512, 1536) dtype=float32>, <tf.Variable 'conformer_3_output/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_3_output/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_4_conv_mod_bn/batch_norm/conformer_4_conv_mod_bn_conformer_4_conv_mod_bn_output_beta:0' shape=(1, 1, 512) dtype=float32>, <tf.Variable 'conformer_4_conv_mod_bn/batch_norm/conformer_4_conv_mod_bn_conformer_4_conv_mod_bn_output_gamma:0' shape=(1, 1, 512) dtype=float32>, <tf.Variable 'conformer_4_conv_mod_depthwise_conv/W:0' shape=(32, 1, 512) dtype=float32>, <tf.Variable 'conformer_4_conv_mod_depthwise_conv/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_4_conv_mod_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_4_conv_mod_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_4_conv_mod_pointwise_conv_1/W:0' shape=(512, 
1024) dtype=float32>, <tf.Variable 'conformer_4_conv_mod_pointwise_conv_1/b:0' shape=(1024,) dtype=float32>, <tf.Variable 'conformer_4_conv_mod_pointwise_conv_2/W:0' shape=(512, 512) dtype=float32>, <tf.Variable 'conformer_4_conv_mod_pointwise_conv_2/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_4_ffmod_1_dropout_linear/W:0' shape=(2048, 512) dtype=float32>, <tf.Variable 'conformer_4_ffmod_1_dropout_linear/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_4_ffmod_1_linear_swish/W:0' shape=(512, 2048) dtype=float32>, <tf.Variable 'conformer_4_ffmod_1_linear_swish/b:0' shape=(2048,) dtype=float32>, <tf.Variable 'conformer_4_ffmod_1_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_4_ffmod_1_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_4_ffmod_2_dropout_linear/W:0' shape=(2048, 512) dtype=float32>, <tf.Variable 'conformer_4_ffmod_2_dropout_linear/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_4_ffmod_2_linear_swish/W:0' shape=(512, 2048) dtype=float32>, <tf.Variable 'conformer_4_ffmod_2_linear_swish/b:0' shape=(2048,) dtype=float32>, <tf.Variable 'conformer_4_ffmod_2_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_4_ffmod_2_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_4_mhsa_mod_att_linear/W:0' shape=(512, 512) dtype=float32>, <tf.Variable 'conformer_4_mhsa_mod_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_4_mhsa_mod_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_4_mhsa_mod_relpos_encoding/encoding_matrix:0' shape=(65, 64) dtype=float32>, <tf.Variable 'conformer_4_mhsa_mod_self_attention/QKV:0' shape=(512, 1536) dtype=float32>, <tf.Variable 'conformer_4_output/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_4_output/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_5_conv_mod_bn/batch_norm/conformer_5_conv_mod_bn_conformer_5_conv_mod_bn_output_beta:0' shape=(1, 1, 512) dtype=float32>, <tf.Variable 
'conformer_5_conv_mod_bn/batch_norm/conformer_5_conv_mod_bn_conformer_5_conv_mod_bn_output_gamma:0' shape=(1, 1, 512) dtype=float32>, <tf.Variable 'conformer_5_conv_mod_depthwise_conv/W:0' shape=(32, 1, 512) dtype=float32>, <tf.Variable 'conformer_5_conv_mod_depthwise_conv/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_5_conv_mod_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_5_conv_mod_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_5_conv_mod_pointwise_conv_1/W:0' shape=(512, 1024) dtype=float32>, <tf.Variable 'conformer_5_conv_mod_pointwise_conv_1/b:0' shape=(1024,) dtype=float32>, <tf.Variable 'conformer_5_conv_mod_pointwise_conv_2/W:0' shape=(512, 512) dtype=float32>, <tf.Variable 'conformer_5_conv_mod_pointwise_conv_2/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_5_ffmod_1_dropout_linear/W:0' shape=(2048, 512) dtype=float32>, <tf.Variable 'conformer_5_ffmod_1_dropout_linear/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_5_ffmod_1_linear_swish/W:0' shape=(512, 2048) dtype=float32>, <tf.Variable 'conformer_5_ffmod_1_linear_swish/b:0' shape=(2048,) dtype=float32>, <tf.Variable 'conformer_5_ffmod_1_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_5_ffmod_1_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_5_ffmod_2_dropout_linear/W:0' shape=(2048, 512) dtype=float32>, <tf.Variable 'conformer_5_ffmod_2_dropout_linear/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_5_ffmod_2_linear_swish/W:0' shape=(512, 2048) dtype=float32>, <tf.Variable 'conformer_5_ffmod_2_linear_swish/b:0' shape=(2048,) dtype=float32>, <tf.Variable 'conformer_5_ffmod_2_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_5_ffmod_2_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_5_mhsa_mod_att_linear/W:0' shape=(512, 512) dtype=float32>, <tf.Variable 'conformer_5_mhsa_mod_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_5_mhsa_mod_ln/scale:0' 
shape=(512,) dtype=float32>, <tf.Variable 'conformer_5_mhsa_mod_relpos_encoding/encoding_matrix:0' shape=(65, 64) dtype=float32>, <tf.Variable 'conformer_5_mhsa_mod_self_attention/QKV:0' shape=(512, 1536) dtype=float32>, <tf.Variable 'conformer_5_output/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_5_output/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_6_conv_mod_bn/batch_norm/conformer_6_conv_mod_bn_conformer_6_conv_mod_bn_output_beta:0' shape=(1, 1, 512) dtype=float32>, <tf.Variable 'conformer_6_conv_mod_bn/batch_norm/conformer_6_conv_mod_bn_conformer_6_conv_mod_bn_output_gamma:0' shape=(1, 1, 512) dtype=float32>, <tf.Variable 'conformer_6_conv_mod_depthwise_conv/W:0' shape=(32, 1, 512) dtype=float32>, <tf.Variable 'conformer_6_conv_mod_depthwise_conv/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_6_conv_mod_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_6_conv_mod_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_6_conv_mod_pointwise_conv_1/W:0' shape=(512, 1024) dtype=float32>, <tf.Variable 'conformer_6_conv_mod_pointwise_conv_1/b:0' shape=(1024,) dtype=float32>, <tf.Variable 'conformer_6_conv_mod_pointwise_conv_2/W:0' shape=(512, 512) dtype=float32>, <tf.Variable 'conformer_6_conv_mod_pointwise_conv_2/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_6_ffmod_1_dropout_linear/W:0' shape=(2048, 512) dtype=float32>, <tf.Variable 'conformer_6_ffmod_1_dropout_linear/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_6_ffmod_1_linear_swish/W:0' shape=(512, 2048) dtype=float32>, <tf.Variable 'conformer_6_ffmod_1_linear_swish/b:0' shape=(2048,) dtype=float32>, <tf.Variable 'conformer_6_ffmod_1_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_6_ffmod_1_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_6_ffmod_2_dropout_linear/W:0' shape=(2048, 512) dtype=float32>, <tf.Variable 'conformer_6_ffmod_2_dropout_linear/b:0' shape=(512,) dtype=float32>, 
<tf.Variable 'conformer_6_ffmod_2_linear_swish/W:0' shape=(512, 2048) dtype=float32>, <tf.Variable 'conformer_6_ffmod_2_linear_swish/b:0' shape=(2048,) dtype=float32>, <tf.Variable 'conformer_6_ffmod_2_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_6_ffmod_2_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_6_mhsa_mod_att_linear/W:0' shape=(512, 512) dtype=float32>, <tf.Variable 'conformer_6_mhsa_mod_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_6_mhsa_mod_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_6_mhsa_mod_relpos_encoding/encoding_matrix:0' shape=(65, 64) dtype=float32>, <tf.Variable 'conformer_6_mhsa_mod_self_attention/QKV:0' shape=(512, 1536) dtype=float32>, <tf.Variable 'conformer_6_output/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_6_output/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_7_conv_mod_bn/batch_norm/conformer_7_conv_mod_bn_conformer_7_conv_mod_bn_output_beta:0' shape=(1, 1, 512) dtype=float32>, <tf.Variable 'conformer_7_conv_mod_bn/batch_norm/conformer_7_conv_mod_bn_conformer_7_conv_mod_bn_output_gamma:0' shape=(1, 1, 512) dtype=float32>, <tf.Variable 'conformer_7_conv_mod_depthwise_conv/W:0' shape=(32, 1, 512) dtype=float32>, <tf.Variable 'conformer_7_conv_mod_depthwise_conv/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_7_conv_mod_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_7_conv_mod_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_7_conv_mod_pointwise_conv_1/W:0' shape=(512, 1024) dtype=float32>, <tf.Variable 'conformer_7_conv_mod_pointwise_conv_1/b:0' shape=(1024,) dtype=float32>, <tf.Variable 'conformer_7_conv_mod_pointwise_conv_2/W:0' shape=(512, 512) dtype=float32>, <tf.Variable 'conformer_7_conv_mod_pointwise_conv_2/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_7_ffmod_1_dropout_linear/W:0' shape=(2048, 512) dtype=float32>, <tf.Variable 
'conformer_7_ffmod_1_dropout_linear/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_7_ffmod_1_linear_swish/W:0' shape=(512, 2048) dtype=float32>, <tf.Variable 'conformer_7_ffmod_1_linear_swish/b:0' shape=(2048,) dtype=float32>, <tf.Variable 'conformer_7_ffmod_1_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_7_ffmod_1_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_7_ffmod_2_dropout_linear/W:0' shape=(2048, 512) dtype=float32>, <tf.Variable 'conformer_7_ffmod_2_dropout_linear/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_7_ffmod_2_linear_swish/W:0' shape=(512, 2048) dtype=float32>, <tf.Variable 'conformer_7_ffmod_2_linear_swish/b:0' shape=(2048,) dtype=float32>, <tf.Variable 'conformer_7_ffmod_2_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_7_ffmod_2_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_7_mhsa_mod_att_linear/W:0' shape=(512, 512) dtype=float32>, <tf.Variable 'conformer_7_mhsa_mod_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_7_mhsa_mod_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_7_mhsa_mod_relpos_encoding/encoding_matrix:0' shape=(65, 64) dtype=float32>, <tf.Variable 'conformer_7_mhsa_mod_self_attention/QKV:0' shape=(512, 1536) dtype=float32>, <tf.Variable 'conformer_7_output/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_7_output/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_8_conv_mod_bn/batch_norm/conformer_8_conv_mod_bn_conformer_8_conv_mod_bn_output_beta:0' shape=(1, 1, 512) dtype=float32>, <tf.Variable 'conformer_8_conv_mod_bn/batch_norm/conformer_8_conv_mod_bn_conformer_8_conv_mod_bn_output_gamma:0' shape=(1, 1, 512) dtype=float32>, <tf.Variable 'conformer_8_conv_mod_depthwise_conv/W:0' shape=(32, 1, 512) dtype=float32>, <tf.Variable 'conformer_8_conv_mod_depthwise_conv/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_8_conv_mod_ln/bias:0' shape=(512,) dtype=float32>, 
<tf.Variable 'conformer_8_conv_mod_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_8_conv_mod_pointwise_conv_1/W:0' shape=(512, 1024) dtype=float32>, <tf.Variable 'conformer_8_conv_mod_pointwise_conv_1/b:0' shape=(1024,) dtype=float32>, <tf.Variable 'conformer_8_conv_mod_pointwise_conv_2/W:0' shape=(512, 512) dtype=float32>, <tf.Variable 'conformer_8_conv_mod_pointwise_conv_2/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_8_ffmod_1_dropout_linear/W:0' shape=(2048, 512) dtype=float32>, <tf.Variable 'conformer_8_ffmod_1_dropout_linear/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_8_ffmod_1_linear_swish/W:0' shape=(512, 2048) dtype=float32>, <tf.Variable 'conformer_8_ffmod_1_linear_swish/b:0' shape=(2048,) dtype=float32>, <tf.Variable 'conformer_8_ffmod_1_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_8_ffmod_1_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_8_ffmod_2_dropout_linear/W:0' shape=(2048, 512) dtype=float32>, <tf.Variable 'conformer_8_ffmod_2_dropout_linear/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_8_ffmod_2_linear_swish/W:0' shape=(512, 2048) dtype=float32>, <tf.Variable 'conformer_8_ffmod_2_linear_swish/b:0' shape=(2048,) dtype=float32>, <tf.Variable 'conformer_8_ffmod_2_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_8_ffmod_2_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_8_mhsa_mod_att_linear/W:0' shape=(512, 512) dtype=float32>, <tf.Variable 'conformer_8_mhsa_mod_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_8_mhsa_mod_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_8_mhsa_mod_relpos_encoding/encoding_matrix:0' shape=(65, 64) dtype=float32>, <tf.Variable 'conformer_8_mhsa_mod_self_attention/QKV:0' shape=(512, 1536) dtype=float32>, <tf.Variable 'conformer_8_output/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_8_output/scale:0' shape=(512,) dtype=float32>, <tf.Variable 
'conformer_9_conv_mod_bn/batch_norm/conformer_9_conv_mod_bn_conformer_9_conv_mod_bn_output_beta:0' shape=(1, 1, 512) dtype=float32>, <tf.Variable 'conformer_9_conv_mod_bn/batch_norm/conformer_9_conv_mod_bn_conformer_9_conv_mod_bn_output_gamma:0' shape=(1, 1, 512) dtype=float32>, <tf.Variable 'conformer_9_conv_mod_depthwise_conv/W:0' shape=(32, 1, 512) dtype=float32>, <tf.Variable 'conformer_9_conv_mod_depthwise_conv/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_9_conv_mod_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_9_conv_mod_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_9_conv_mod_pointwise_conv_1/W:0' shape=(512, 1024) dtype=float32>, <tf.Variable 'conformer_9_conv_mod_pointwise_conv_1/b:0' shape=(1024,) dtype=float32>, <tf.Variable 'conformer_9_conv_mod_pointwise_conv_2/W:0' shape=(512, 512) dtype=float32>, <tf.Variable 'conformer_9_conv_mod_pointwise_conv_2/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_9_ffmod_1_dropout_linear/W:0' shape=(2048, 512) dtype=float32>, <tf.Variable 'conformer_9_ffmod_1_dropout_linear/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_9_ffmod_1_linear_swish/W:0' shape=(512, 2048) dtype=float32>, <tf.Variable 'conformer_9_ffmod_1_linear_swish/b:0' shape=(2048,) dtype=float32>, <tf.Variable 'conformer_9_ffmod_1_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_9_ffmod_1_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_9_ffmod_2_dropout_linear/W:0' shape=(2048, 512) dtype=float32>, <tf.Variable 'conformer_9_ffmod_2_dropout_linear/b:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_9_ffmod_2_linear_swish/W:0' shape=(512, 2048) dtype=float32>, <tf.Variable 'conformer_9_ffmod_2_linear_swish/b:0' shape=(2048,) dtype=float32>, <tf.Variable 'conformer_9_ffmod_2_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_9_ffmod_2_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_9_mhsa_mod_att_linear/W:0' 
shape=(512, 512) dtype=float32>, <tf.Variable 'conformer_9_mhsa_mod_ln/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_9_mhsa_mod_ln/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_9_mhsa_mod_relpos_encoding/encoding_matrix:0' shape=(65, 64) dtype=float32>, <tf.Variable 'conformer_9_mhsa_mod_self_attention/QKV:0' shape=(512, 1536) dtype=float32>, <tf.Variable 'conformer_9_output/bias:0' shape=(512,) dtype=float32>, <tf.Variable 'conformer_9_output/scale:0' shape=(512,) dtype=float32>, <tf.Variable 'conv_1/W:0' shape=(3, 3, 1, 32) dtype=float32>, <tf.Variable 'conv_1/bias:0' shape=(32,) dtype=float32>, <tf.Variable 'conv_2/W:0' shape=(3, 3, 32, 64) dtype=float32>, <tf.Variable 'conv_2/bias:0' shape=(64,) dtype=float32>, <tf.Variable 'conv_3/W:0' shape=(3, 3, 64, 64) dtype=float32>, <tf.Variable 'conv_3/bias:0' shape=(64,) dtype=float32>, <tf.Variable 'features/conv_h_filter/conv_h_filter:0' shape=(128, 1, 150) dtype=float32>, <tf.Variable 'features/conv_l/W:0' shape=(40, 1, 1, 5) dtype=float32>, <tf.Variable 'features/conv_l_act/bias:0' shape=(750,) dtype=float32>, <tf.Variable 'features/conv_l_act/scale:0' shape=(750,) dtype=float32>, <tf.Variable 'input_linear/W:0' shape=(24000, 512) dtype=float32>, <tf.Variable 'output/W:0' shape=(512, 88) dtype=float32>, <tf.Variable 'output/b:0' shape=(88,) dtype=float32>]
start training at epoch 1
using batch size: {'classes': 5000, 'data': 400000}, max seqs: 128
learning rate control: NewbobMultiEpoch(num_epochs=6, update_interval=1, relative_error_threshold=-0.01, relative_error_grow_threshold=-0.01), epoch data: 1: EpochData(learningRate=1.325e-05, error={}), 2: EpochData(learningRate=1.539861111111111e-05, error={}), 3: EpochData(learningRate=1.754722222222222e-05, error={}), ..., 360: EpochData(learningRate=1.4333333333333375e-05, error={}), 361: EpochData(learningRate=1.2166666666666727e-05, error={}), 362: EpochData(learningRate=1e-05, error={}), error key: None
pretrain: None
[2023-10-27 13:30:18,685] INFO: Run time: 0:01:55 CPU: 0.20% RSS: 2.48GB VMS: 24.91GB
[2023-10-27 13:30:23,701] INFO: Run time: 0:02:00 CPU: 0.20% RSS: 3.74GB VMS: 26.19GB
start epoch 1 with learning rate 1.325e-05 ...
Create optimizer <class 'returnn.tf.updater.NadamOptimizer'> with options {'epsilon': 1e-08, 'learning_rate': <tf.Variable 'learning_rate:0' shape=() dtype=float32>}.
Initialize optimizer (default) with slots ['m', 'v'].
These additional variables were created by the optimizer: [<tf.Variable 'optimize/beta1_power:0' shape=() dtype=float32>, <tf.Variable 'optimize/beta2_power:0' shape=() dtype=float32>].
[2023-10-27 13:30:53,793] INFO: Run time: 0:02:30 CPU: 0.40% RSS: 4.16GB VMS: 26.64GB
/work/asr4/vieting/programs/rasr/20230707/rasr/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard: Relink `/usr/local/lib/tensorflow/libtensorflow_framework.so.2' with `/lib/x86_64-linux-gnu/libz.so.1' for IFUNC symbol `crc32_z'
[2023-10-27 13:31:18,863] INFO: Run time: 0:02:55 CPU: 0.40% RSS: 4.59GB VMS: 27.91GB
configuration error: failed to open file "neural-network-trainer.config" for reading. (No such file or directory)
/work/asr4/vieting/programs/rasr/20230707/rasr/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard: Relink `/usr/local/lib/tensorflow/libtensorflow_framework.so.2' with `/lib/x86_64-linux-gnu/libz.so.1' for IFUNC symbol `crc32_z'
configuration error: failed to open file "neural-network-trainer.config" for reading. (No such file or directory)
[2023-10-27 13:31:23,873] INFO: Run time: 0:03:00 CPU: 0.20% RSS: 5.06GB VMS: 29.38GB
2023-10-27 13:31:25.026365: I tensorflow/stream_executor/cuda/cuda_dnn.cc:379] Loaded cuDNN version 8400
2023-10-27 13:31:31.970893: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
[2023-10-27 13:31:33,911] INFO: Run time: 0:03:10 CPU: 0.60% RSS: 5.76GB VMS: 30.77GB
[2023-10-27 13:36:14,760] INFO: Run time: 0:07:51 CPU: 0.20% RSS: 6.34GB VMS: 31.58GB
[2023-10-27 13:42:20,777] INFO: Run time: 0:13:57 CPU: 0.40% RSS: 7.46GB VMS: 33.22GB
[2023-10-27 13:55:47,999] INFO: Run time: 0:27:24 CPU: 0.20% RSS: 8.22GB VMS: 34.46GB
[2023-10-27 14:04:34,405] INFO: Run time: 0:36:11 CPU: 0.20% RSS: 9.04GB VMS: 35.65GB
[2023-10-27 14:14:00,961] INFO: Run time: 0:45:37 CPU: 0.20% RSS: 10.41GB VMS: 37.55GB
Stats:
mem_usage:GPU:0: Stats(mean=6.9GB, std_dev=412.8MB, min=386.0MB, max=7.3GB, num_seqs=4414, avg_data_len=1)
train epoch 1, finished after 4414 steps, 0:56:57 elapsed (99.0% computing time)
Save model under /u/maximilian.kannen/setups/20230406_feat/work/i6_core/returnn/training/ReturnnTrainingJob.TH4IPwv1UZf5/output/models/epoch.001
[2023-10-27 14:27:28,062] INFO: Run time: 0:59:04 CPU: 0.40% RSS: 11.51GB VMS: 39.05GB
Learning-rate-control: error key 'train_score' from {'train_score': 1.3900552293060982}
epoch 1 score: 1.3900552293060982 error: None elapsed: 0:56:57
Stats:
mem_usage:GPU:0: Stats(mean=7.3GB, std_dev=0.0B, min=7.3GB, max=7.3GB, num_seqs=34, avg_data_len=1)
epoch 1 'dev' eval, finished after 34 steps, 0:02:00 elapsed (15.8% computing time)
Learning-rate-control: error key 'dev_score' from {'dev_score': 1.3790695139892848}
Stats:
mem_usage:GPU:0: Stats(mean=7.3GB, std_dev=0.0B, min=7.3GB, max=7.3GB, num_seqs=32, avg_data_len=1)
epoch 1 'devtrain' eval, finished after 32 steps, 0:02:00 elapsed (9.1% computing time)
Learning-rate-control: error key 'dev_score' from {'devtrain_score': 1.3538210300139815}
dev: score 1.3790695139892848 error None devtrain: score 1.3538210300139815 error None
Only 1 epoch stored so far and keeping last 5 epochs and best 5 epochs, thus not cleaning up any epochs yet.
start epoch 2 with learning rate 1.539861111111111e-05 ... | |
[2023-10-27 14:44:55,781] INFO: Run time: 1:16:32 CPU: 0.20% RSS: 12.99GB VMS: 41.41GB
[2023-10-27 15:09:59,637] INFO: Run time: 1:41:36 CPU: 0.40% RSS: 14.30GB VMS: 43.55GB
Stats:
  mem_usage:GPU:0: Stats(mean=7.5GB, std_dev=82.4MB, min=7.3GB, max=7.6GB, num_seqs=4415, avg_data_len=1)
train epoch 2, finished after 4415 steps, 0:41:30 elapsed (99.7% computing time)
Save model under /u/maximilian.kannen/setups/20230406_feat/work/i6_core/returnn/training/ReturnnTrainingJob.TH4IPwv1UZf5/output/models/epoch.002
epoch 2 score: 1.3392088963905904 error: None elapsed: 0:41:30
Stats:
  mem_usage:GPU:0: Stats(mean=7.6GB, std_dev=0.0B, min=7.6GB, max=7.6GB, num_seqs=34, avg_data_len=1)
epoch 2 'dev' eval, finished after 34 steps, 0:02:56 elapsed (7.6% computing time)
Stats:
  mem_usage:GPU:0: Stats(mean=7.6GB, std_dev=0.0B, min=7.6GB, max=7.6GB, num_seqs=32, avg_data_len=1)
epoch 2 'devtrain' eval, finished after 32 steps, 0:02:56 elapsed (7.2% computing time)
dev: score 1.3864548501561664 error None devtrain: score 1.3501789013885421 error None
Only 2 epochs stored so far and keeping last 5 epochs and best 5 epochs, thus not cleaning up any epochs yet.
start epoch 3 with learning rate 1.754722222222222e-05 ...
[2023-10-27 15:35:28,691] INFO: Run time: 2:07:05 CPU: 0.00% RSS: 15.97GB VMS: 46.40GB
Stats:
  mem_usage:GPU:0: Stats(mean=7.6GB, std_dev=0.0B, min=7.6GB, max=7.6GB, num_seqs=4447, avg_data_len=1)
train epoch 3, finished after 4447 steps, 0:39:49 elapsed (99.7% computing time)
Save model under /u/maximilian.kannen/setups/20230406_feat/work/i6_core/returnn/training/ReturnnTrainingJob.TH4IPwv1UZf5/output/models/epoch.003
epoch 3 score: 1.269489863163703 error: None elapsed: 0:39:49
Stats:
  mem_usage:GPU:0: Stats(mean=7.6GB, std_dev=0.0B, min=7.6GB, max=7.6GB, num_seqs=34, avg_data_len=1)
epoch 3 'dev' eval, finished after 34 steps, 0:03:37 elapsed (7.0% computing time)
Stats:
  mem_usage:GPU:0: Stats(mean=7.6GB, std_dev=0.0B, min=7.6GB, max=7.6GB, num_seqs=33, avg_data_len=1)
epoch 3 'devtrain' eval, finished after 33 steps, 0:03:36 elapsed (6.5% computing time)
dev: score 1.2504820725569967 error None devtrain: score 1.2248598019774664 error None
Only 3 epochs stored so far and keeping last 5 epochs and best 5 epochs, thus not cleaning up any epochs yet.
start epoch 4 with learning rate 1.9695833333333335e-05 ...
[2023-10-27 16:17:25,382] INFO: Run time: 2:49:02 CPU: 0.40% RSS: 17.57GB VMS: 49.29GB
Stats:
  mem_usage:GPU:0: Stats(mean=7.9GB, std_dev=154.7MB, min=7.6GB, max=7.9GB, num_seqs=4469, avg_data_len=1)
train epoch 4, finished after 4469 steps, 0:39:10 elapsed (99.7% computing time)
Save model under /u/maximilian.kannen/setups/20230406_feat/work/i6_core/returnn/training/ReturnnTrainingJob.TH4IPwv1UZf5/output/models/epoch.004
epoch 4 score: 1.2416727297516101 error: None elapsed: 0:39:10
Stats:
  mem_usage:GPU:0: Stats(mean=7.9GB, std_dev=0.0B, min=7.9GB, max=7.9GB, num_seqs=34, avg_data_len=1)
epoch 4 'dev' eval, finished after 34 steps, 0:04:13 elapsed (6.6% computing time)
Stats:
  mem_usage:GPU:0: Stats(mean=7.9GB, std_dev=0.0B, min=7.9GB, max=7.9GB, num_seqs=32, avg_data_len=1)
epoch 4 'devtrain' eval, finished after 32 steps, 0:04:13 elapsed (6.1% computing time)
dev: score 1.2481286898714175 error None devtrain: score 1.222140162311287 error None
Only 4 epochs stored so far and keeping last 5 epochs and best 5 epochs, thus not cleaning up any epochs yet.
start epoch 5 with learning rate 2.1844444444444446e-05 ...
[2023-10-27 17:03:16,883] INFO: Run time: 3:34:53 CPU: 0.20% RSS: 19.59GB VMS: 53.41GB
Stats:
  mem_usage:GPU:0: Stats(mean=8.3GB, std_dev=442.8MB, min=7.9GB, max=8.8GB, num_seqs=4469, avg_data_len=1)
train epoch 5, finished after 4469 steps, 0:39:02 elapsed (99.7% computing time)
Save model under /u/maximilian.kannen/setups/20230406_feat/work/i6_core/returnn/training/ReturnnTrainingJob.TH4IPwv1UZf5/output/models/epoch.005
epoch 5 score: 1.2318148641868123 error: None elapsed: 0:39:02
Stats:
  mem_usage:GPU:0: Stats(mean=8.8GB, std_dev=0.0B, min=8.8GB, max=8.8GB, num_seqs=34, avg_data_len=1)
epoch 5 'dev' eval, finished after 34 steps, 0:04:48 elapsed (6.4% computing time)
Stats:
  mem_usage:GPU:0: Stats(mean=8.8GB, std_dev=0.0B, min=8.8GB, max=8.8GB, num_seqs=32, avg_data_len=1)
epoch 5 'devtrain' eval, finished after 32 steps, 0:04:47 elapsed (6.0% computing time)
dev: score 1.2419551608756836 error None devtrain: score 1.2190447165734084 error None
Only 5 epochs stored so far and keeping last 5 epochs and best 5 epochs, thus not cleaning up any epochs yet.
start epoch 6 with learning rate 2.3993055555555557e-05 ...
[2023-10-27 18:16:34,692] INFO: Run time: 4:48:11 CPU: 0.40% RSS: 21.55GB VMS: 57.05GB
Stats:
  mem_usage:GPU:0: Stats(mean=8.8GB, std_dev=0.0B, min=8.8GB, max=8.8GB, num_seqs=4453, avg_data_len=1)
train epoch 6, finished after 4453 steps, 0:50:13 elapsed (99.7% computing time)
Save model under /u/maximilian.kannen/setups/20230406_feat/work/i6_core/returnn/training/ReturnnTrainingJob.TH4IPwv1UZf5/output/models/epoch.006
epoch 6 score: 1.2329795741144032 error: None elapsed: 0:50:13
Stats:
  mem_usage:GPU:0: Stats(mean=8.8GB, std_dev=0.0B, min=8.8GB, max=8.8GB, num_seqs=35, avg_data_len=1)
epoch 6 'dev' eval, finished after 35 steps, 0:05:24 elapsed (6.3% computing time)
Stats:
  mem_usage:GPU:0: Stats(mean=8.8GB, std_dev=0.0B, min=8.8GB, max=8.8GB, num_seqs=33, avg_data_len=1)
epoch 6 'devtrain' eval, finished after 33 steps, 0:05:23 elapsed (5.8% computing time)
dev: score 1.266788492730479 error None devtrain: score 1.2451312670038714 error None
6 epochs stored so far and keeping all.
start epoch 7 with learning rate 2.6141666666666667e-05 ...
[2023-10-27 19:05:17,538] INFO: Run time: 5:36:54 CPU: 0.40% RSS: 24.01GB VMS: 60.62GB
Stats:
  mem_usage:GPU:0: Stats(mean=8.8GB, std_dev=0.0B, min=8.8GB, max=8.8GB, num_seqs=4425, avg_data_len=1)
train epoch 7, finished after 4425 steps, 0:36:55 elapsed (99.6% computing time)
Save model under /u/maximilian.kannen/setups/20230406_feat/work/i6_core/returnn/training/ReturnnTrainingJob.TH4IPwv1UZf5/output/models/epoch.007
epoch 7 score: 1.2283967884597307 error: None elapsed: 0:36:55
Stats:
  mem_usage:GPU:0: Stats(mean=8.8GB, std_dev=0.0B, min=8.8GB, max=8.8GB, num_seqs=34, avg_data_len=1)
epoch 7 'dev' eval, finished after 34 steps, 0:05:57 elapsed (6.0% computing time)
Stats:
  mem_usage:GPU:0: Stats(mean=8.8GB, std_dev=0.0B, min=8.8GB, max=8.8GB, num_seqs=31, avg_data_len=1)
epoch 7 'devtrain' eval, finished after 31 steps, 0:06:01 elapsed (5.4% computing time)
dev: score 1.3199766814795546 error None devtrain: score 1.2800879307069992 error None
We have stored models for epochs [1, 2, 3, ..., 5, 6, 7] and keep epochs [3, 4, 5, 6, 7].
We will delete the models of epochs [1, 2].
Deleted 671.0MB.
start epoch 8 with learning rate 2.8290277777777778e-05 ...
Stats:
  mem_usage:GPU:0: Stats(mean=8.8GB, std_dev=0.0B, min=8.8GB, max=8.8GB, num_seqs=4437, avg_data_len=1)
train epoch 8, finished after 4437 steps, 0:36:37 elapsed (99.6% computing time)
Save model under /u/maximilian.kannen/setups/20230406_feat/work/i6_core/returnn/training/ReturnnTrainingJob.TH4IPwv1UZf5/output/models/epoch.008
epoch 8 score: 1.2310048479250342 error: None elapsed: 0:36:37
Stats:
  mem_usage:GPU:0: Stats(mean=8.8GB, std_dev=0.0B, min=8.8GB, max=8.8GB, num_seqs=34, avg_data_len=1)
epoch 8 'dev' eval, finished after 34 steps, 0:06:35 elapsed (5.7% computing time)
Stats:
  mem_usage:GPU:0: Stats(mean=8.8GB, std_dev=0.0B, min=8.8GB, max=8.8GB, num_seqs=32, avg_data_len=1)
epoch 8 'devtrain' eval, finished after 32 steps, 0:06:35 elapsed (5.4% computing time)
dev: score 1.254081779478429 error None devtrain: score 1.2274311660958979 error None
6 epochs stored so far and keeping all.
start epoch 9 with learning rate 3.043888888888889e-05 ...
[2023-10-27 20:36:27,147] INFO: Run time: 7:08:03 CPU: 0.20% RSS: 26.42GB VMS: 64.24GB
Stats:
  mem_usage:GPU:0: Stats(mean=8.8GB, std_dev=0.0B, min=8.8GB, max=8.8GB, num_seqs=4474, avg_data_len=1)
train epoch 9, finished after 4474 steps, 0:36:24 elapsed (99.6% computing time)
Save model under /u/maximilian.kannen/setups/20230406_feat/work/i6_core/returnn/training/ReturnnTrainingJob.TH4IPwv1UZf5/output/models/epoch.009
epoch 9 score: 1.2330831734932135 error: None elapsed: 0:36:24
Stats:
  mem_usage:GPU:0: Stats(mean=8.8GB, std_dev=0.0B, min=8.8GB, max=8.8GB, num_seqs=34, avg_data_len=1)
epoch 9 'dev' eval, finished after 34 steps, 0:07:03 elapsed (5.7% computing time)
Stats:
  mem_usage:GPU:0: Stats(mean=8.8GB, std_dev=0.0B, min=8.8GB, max=8.8GB, num_seqs=32, avg_data_len=1)
epoch 9 'devtrain' eval, finished after 32 steps, 0:07:06 elapsed (5.3% computing time)
dev: score 1.2381794194351643 error None devtrain: score 1.210887479672637 error None
7 epochs stored so far and keeping all.
start epoch 10 with learning rate 3.25875e-05 ...
[2023-10-27 21:50:45,589] INFO: Run time: 8:22:22 CPU: 0.40% RSS: 29.27GB VMS: 68.61GB
Stats:
  mem_usage:GPU:0: Stats(mean=8.8GB, std_dev=0.0B, min=8.8GB, max=8.8GB, num_seqs=4468, avg_data_len=1)
train epoch 10, finished after 4468 steps, 0:36:40 elapsed (98.7% computing time)
Save model under /u/maximilian.kannen/setups/20230406_feat/work/i6_core/returnn/training/ReturnnTrainingJob.TH4IPwv1UZf5/output/models/epoch.010
epoch 10 score: 1.233361122818855 error: None elapsed: 0:36:40
[2023-10-27 21:55:32,752] INFO: Run time: 8:27:09 CPU: 0.40% RSS: 55.22GB VMS: 131.22GB
[2023-10-27 21:56:33,157] INFO: Run time: 8:28:09 CPU: 0.00% RSS: 28.85GB VMS: 68.21GB
[2023-10-27 21:56:38,173] INFO: Run time: 8:28:14 CPU: 0.40% RSS: 55.22GB VMS: 131.22GB
[2023-10-27 21:58:49,678] INFO: Run time: 8:30:26 CPU: 0.40% RSS: 28.68GB VMS: 67.96GB
[2023-10-27 21:58:54,693] INFO: Run time: 8:30:31 CPU: 0.20% RSS: 54.88GB VMS: 130.72GB
[2023-10-27 21:59:30,954] INFO: Run time: 8:31:07 CPU: 0.20% RSS: 28.69GB VMS: 67.96GB
[2023-10-27 21:59:35,970] INFO: Run time: 8:31:12 CPU: 0.40% RSS: 54.88GB VMS: 130.72GB
[2023-10-27 21:59:57,332] INFO: Run time: 8:31:34 CPU: 0.40% RSS: 28.69GB VMS: 67.96GB
[2023-10-27 22:00:02,347] INFO: Run time: 8:31:39 CPU: 0.20% RSS: 54.88GB VMS: 130.72GB
[2023-10-27 22:01:17,180] INFO: Run time: 8:32:53 CPU: 0.40% RSS: 28.69GB VMS: 67.96GB
Stats:
  mem_usage:GPU:0: Stats(mean=8.8GB, std_dev=0.0B, min=8.8GB, max=8.8GB, num_seqs=34, avg_data_len=1)
epoch 10 'dev' eval, finished after 34 steps, 0:13:46 elapsed (3.1% computing time)
Stats:
  mem_usage:GPU:0: Stats(mean=8.8GB, std_dev=0.0B, min=8.8GB, max=8.8GB, num_seqs=32, avg_data_len=1)
epoch 10 'devtrain' eval, finished after 32 steps, 0:07:37 elapsed (5.2% computing time)
dev: score 1.253429785756822 error None devtrain: score 1.2296961439859426 error None
8 epochs stored so far and keeping all.
start epoch 11 with learning rate 3.473611111111111e-05 ...
Stats:
  mem_usage:GPU:0: Stats(mean=8.8GB, std_dev=0.0B, min=8.8GB, max=8.8GB, num_seqs=4455, avg_data_len=1)
train epoch 11, finished after 4455 steps, 0:35:51 elapsed (99.6% computing time)
Save model under /u/maximilian.kannen/setups/20230406_feat/work/i6_core/returnn/training/ReturnnTrainingJob.TH4IPwv1UZf5/output/models/epoch.011
[2023-10-27 22:52:34,719] INFO: Run time: 9:24:11 CPU: 1.40% RSS: 1.85GB VMS: 5.22GB
RETURNN SprintControl[pid 4045744] Python module load
RETURNN SprintControl[pid 4045744] init: name='Sprint.PythonControl', sprint_unit='NnTrainer.pythonControl', version_number=5, callback=<built-in method callback of PyCapsule object at 0x7f24f99b0780>, ref=<capsule object "Sprint.PythonControl.Internal" at 0x7f24f99b0780>, config={'c2p_fd': '41', 'p2c_fd': '42', 'minPythonControlVersion': '4'}, kwargs={}
RETURNN SprintControl[pid 4045744] PythonControl create {'c2p_fd': 41, 'p2c_fd': 42, 'name': 'Sprint.PythonControl', 'reference': <capsule object "Sprint.PythonControl.Internal" at 0x7f24f99b0780>, 'config': {'c2p_fd': '41', 'p2c_fd': '42', 'minPythonControlVersion': '4'}, 'sprint_unit': 'NnTrainer.pythonControl', 'version_number': 5, 'min_version_number': 4, 'callback': <built-in method callback of PyCapsule object at 0x7f24f99b0780>}
RETURNN SprintControl[pid 4045744] PythonControl init {'name': 'Sprint.PythonControl', 'reference': <capsule object "Sprint.PythonControl.Internal" at 0x7f24f99b0780>, 'config': {'c2p_fd': '41', 'p2c_fd': '42', 'minPythonControlVersion': '4'}, 'sprint_unit': 'NnTrainer.pythonControl', 'version_number': 5, 'min_version_number': 4, 'callback': <built-in method callback of PyCapsule object at 0x7f24f99b0780>}
RETURNN SprintControl[pid 4045744] init for Sprint.PythonControl {'reference': <capsule object "Sprint.PythonControl.Internal" at 0x7f24f99b0780>, 'config': {'c2p_fd': '41', 'p2c_fd': '42', 'minPythonControlVersion': '4'}}
RETURNN SprintControl[pid 4045744] PythonControl run_control_loop: <built-in method callback of PyCapsule object at 0x7f24f99b0780>, {}
RETURNN SprintControl[pid 4045744] PythonControl run_control_loop control: '<version>RWTH ASR 0.9beta (431c74d54b895a2a4c3689bcd5bf641a878bb925)\n</version>'
Unhandled exception <class 'EOFError'> in thread <_MainThread(MainThread, started 139796788109312)>, proc 4045744.
Thread current, main, <_MainThread(MainThread, started 139796788109312)>:
(Excluded thread.)
That were all threads.
EXCEPTION
Traceback (most recent call last):
  File "/work/asr3/vieting/hiwis/kannen/sisyphus_work_dirs/swb/i6_core/tools/git/CloneGitRepositoryJob.FigHMwYJhhef/output/repository/returnn/sprint/control.py", line 550, in PythonControl.run_control_loop
    line: self.handle_next()
    locals:
      self = <local> <returnn.sprint.control.PythonControl object at 0x7f24f99bf580>
      self.handle_next = <local> <bound method PythonControl.handle_next of <returnn.sprint.control.PythonControl object at 0x7f24f99bf580>>
  File "/work/asr3/vieting/hiwis/kannen/sisyphus_work_dirs/swb/i6_core/tools/git/CloneGitRepositoryJob.FigHMwYJhhef/output/repository/returnn/sprint/control.py", line 518, in PythonControl.handle_next
    line: args = self._read()
    locals:
      args = <not found>
      self = <local> <returnn.sprint.control.PythonControl object at 0x7f24f99bf580>
      self._read = <local> <bound method PythonControl._read of <returnn.sprint.control.PythonControl object at 0x7f24f99bf580>>
  File "/work/asr3/vieting/hiwis/kannen/sisyphus_work_dirs/swb/i6_core/tools/git/CloneGitRepositoryJob.FigHMwYJhhef/output/repository/returnn/sprint/control.py", line 439, in PythonControl._read
    line: return Unpickler(self.pipe_p2c).load()
    locals:
      Unpickler = <global> <class '_pickle.Unpickler'>
      self = <local> <returnn.sprint.control.PythonControl object at 0x7f24f99bf580>
      self.pipe_p2c = <local> <_io.BufferedReader name=42>
      load = <not found>
EOFError: Ran out of input
RETURNN SprintControl[pid 4045761] Python module load
RETURNN SprintControl[pid 4045761] init: name='Sprint.PythonControl', sprint_unit='NnTrainer.pythonControl', version_number=5, callback=<built-in method callback of PyCapsule object at 0x7f3766470780>, ref=<capsule object "Sprint.PythonControl.Internal" at 0x7f3766470780>, config={'c2p_fd': '42', 'p2c_fd': '47', 'minPythonControlVersion': '4'}, kwargs={}
RETURNN SprintControl[pid 4045761] PythonControl create {'c2p_fd': 42, 'p2c_fd': 47, 'name': 'Sprint.PythonControl', 'reference': <capsule object "Sprint.PythonControl.Internal" at 0x7f3766470780>, 'config': {'c2p_fd': '42', 'p2c_fd': '47', 'minPythonControlVersion': '4'}, 'sprint_unit': 'NnTrainer.pythonControl', 'version_number': 5, 'min_version_number': 4, 'callback': <built-in method callback of PyCapsule object at 0x7f3766470780>}
RETURNN SprintControl[pid 4045761] PythonControl init {'name': 'Sprint.PythonControl', 'reference': <capsule object "Sprint.PythonControl.Internal" at 0x7f3766470780>, 'config': {'c2p_fd': '42', 'p2c_fd': '47', 'minPythonControlVersion': '4'}, 'sprint_unit': 'NnTrainer.pythonControl', 'version_number': 5, 'min_version_number': 4, 'callback': <built-in method callback of PyCapsule object at 0x7f3766470780>}
RETURNN SprintControl[pid 4045761] init for Sprint.PythonControl {'reference': <capsule object "Sprint.PythonControl.Internal" at 0x7f3766470780>, 'config': {'c2p_fd': '42', 'p2c_fd': '47', 'minPythonControlVersion': '4'}}
RETURNN SprintControl[pid 4045761] PythonControl run_control_loop: <built-in method callback of PyCapsule object at 0x7f3766470780>, {}
RETURNN SprintControl[pid 4045761] PythonControl run_control_loop control: '<version>RWTH ASR 0.9beta (431c74d54b895a2a4c3689bcd5bf641a878bb925)\n</version>'
Unhandled exception <class 'EOFError'> in thread <_MainThread(MainThread, started 139875920732160)>, proc 4045761.
Thread current, main, <_MainThread(MainThread, started 139875920732160)>:
(Excluded thread.)
That were all threads.
EXCEPTION
Traceback (most recent call last):
  File "/work/asr3/vieting/hiwis/kannen/sisyphus_work_dirs/swb/i6_core/tools/git/CloneGitRepositoryJob.FigHMwYJhhef/output/repository/returnn/sprint/control.py", line 550, in PythonControl.run_control_loop
    line: self.handle_next()
    locals:
      self = <local> <returnn.sprint.control.PythonControl object at 0x7f376647f580>
      self.handle_next = <local> <bound method PythonControl.handle_next of <returnn.sprint.control.PythonControl object at 0x7f376647f580>>
  File "/work/asr3/vieting/hiwis/kannen/sisyphus_work_dirs/swb/i6_core/tools/git/CloneGitRepositoryJob.FigHMwYJhhef/output/repository/returnn/sprint/control.py", line 518, in PythonControl.handle_next
    line: args = self._read()
    locals:
      args = <not found>
      self = <local> <returnn.sprint.control.PythonControl object at 0x7f376647f580>
      self._read = <local> <bound method PythonControl._read of <returnn.sprint.control.PythonControl object at 0x7f376647f580>>
  File "/work/asr3/vieting/hiwis/kannen/sisyphus_work_dirs/swb/i6_core/tools/git/CloneGitRepositoryJob.FigHMwYJhhef/output/repository/returnn/sprint/control.py", line 439, in PythonControl._read
    line: return Unpickler(self.pipe_p2c).load()
    locals:
      Unpickler = <global> <class '_pickle.Unpickler'>
      self = <local> <returnn.sprint.control.PythonControl object at 0x7f376647f580>
      self.pipe_p2c = <local> <_io.BufferedReader name=47>
      load = <not found>
EOFError: Ran out of input
[2023-10-27 22:52:35,413] ERROR: Executed command failed:
[2023-10-27 22:52:35,415] ERROR: Cmd: ['/usr/bin/python3', '/u/maximilian.kannen/setups/20230406_feat/work/i6_core/tools/git/CloneGitRepositoryJob.FigHMwYJhhef/output/repository/rnn.py', '/u/maximilian.kannen/setups/20230406_feat/work/i6_core/returnn/training/ReturnnTrainingJob.TH4IPwv1UZf5/output/returnn.config']
[2023-10-27 22:52:35,416] ERROR: Args: (-9, ['/usr/bin/python3', '/u/maximilian.kannen/setups/20230406_feat/work/i6_core/tools/git/CloneGitRepositoryJob.FigHMwYJhhef/output/repository/rnn.py', '/u/maximilian.kannen/setups/20230406_feat/work/i6_core/returnn/training/ReturnnTrainingJob.TH4IPwv1UZf5/output/returnn.config'])
[2023-10-27 22:52:35,416] ERROR: Return-Code: -9
[2023-10-27 22:52:35,429] INFO: Max resources: Run time: 9:24:12 CPU: 79.6% RSS: 55.22GB VMS: 131.22GB
Process train worker proc 2/2:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/work/asr3/vieting/hiwis/kannen/sisyphus_work_dirs/swb/i6_core/tools/git/CloneGitRepositoryJob.FigHMwYJhhef/output/repository/returnn/datasets/multi_proc.py", line 295, in _worker_proc_loop
    msg, kwargs = parent.recv()
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
    buf = self._recv(4)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 383, in _recv
    raise EOFError
EOFError
<?xml version="1.0" encoding="UTF-8"?>
<sprint>
<critical-error component="neural-network-trainer">
PythonControl(NnTrainer.pythonControl): run_control_loop() failed
Creating stack trace (innermost first):
#1 /work/asr4/vieting/programs/rasr/20230707/rasr/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard(_ZNK4Core9Component13vErrorMessageENS0_9ErrorTypeEPKcP13__va_list_tag+0xa97) [0x55bdd9139ec7]
#2 /work/asr4/vieting/programs/rasr/20230707/rasr/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard(_ZNK4Core9Component14vCriticalErrorEPKcP13__va_list_tag+0x1e) [0x55bdd91364ee]
#3 /work/asr4/vieting/programs/rasr/20230707/rasr/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard(_ZNK2Nn13PythonControl19pythonCriticalErrorEPKcz+0xbe) [0x55bdd934d98e]
#4 /work/asr4/vieting/programs/rasr/20230707/rasr/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard(_ZN2Nn13PythonControl16run_control_loopEv+0xcd) [0x55bdd934e08d]
#5 /work/asr4/vieting/programs/rasr/20230707/rasr/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard(_ZN9NnTrainer13pythonControlEv+0x117) [0x55bdd90cb917]
#6 /work/asr4/vieting/programs/rasr/20230707/rasr/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard(_ZN9NnTrainer4mainERKSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS6_EE+0x2ff) [0x55bdd90a200f]
#7 /work/asr4/vieting/programs/rasr/20230707/rasr/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard(_ZN4Core11Application3runERKSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS7_EE+0x23) [0x55bdd912b2f3]
#8 /work/asr4/vieting/programs/rasr/20230707/rasr/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard(_ZN4Core11Application4mainEiPPc+0x484) [0x55bdd90a37d4]
#9 /work/asr4/vieting/programs/rasr/20230707/rasr/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard(main+0x3d) [0x55bdd90a16cd]
#10 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x7f25087ab083]
#11 /work/asr4/vieting/programs/rasr/20230707/rasr/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard(_start+0x2e) [0x55bdd90cb0de]
</critical-error>
<critical-error component="neural-network-trainer">
Terminating due to previous errors
</critical-error>
<?xml version="1.0" encoding="UTF-8"?>
<sprint>
<critical-error component="neural-network-trainer">
PythonControl(NnTrainer.pythonControl): run_control_loop() failed
Creating stack trace (innermost first):
#1 /work/asr4/vieting/programs/rasr/20230707/rasr/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard(_ZNK4Core9Component13vErrorMessageENS0_9ErrorTypeEPKcP13__va_list_tag+0xa97) [0x55cb50ff7ec7]
#2 /work/asr4/vieting/programs/rasr/20230707/rasr/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard(_ZNK4Core9Component14vCriticalErrorEPKcP13__va_list_tag+0x1e) [0x55cb50ff44ee]
#3 /work/asr4/vieting/programs/rasr/20230707/rasr/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard(_ZNK2Nn13PythonControl19pythonCriticalErrorEPKcz+0xbe) [0x55cb5120b98e]
#4 /work/asr4/vieting/programs/rasr/20230707/rasr/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard(_ZN2Nn13PythonControl16run_control_loopEv+0xcd) [0x55cb5120c08d]
#5 /work/asr4/vieting/programs/rasr/20230707/rasr/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard(_ZN9NnTrainer13pythonControlEv+0x117) [0x55cb50f89917]
#6 /work/asr4/vieting/programs/rasr/20230707/rasr/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard(_ZN9NnTrainer4mainERKSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS6_EE+0x2ff) [0x55cb50f6000f]
#7 /work/asr4/vieting/programs/rasr/20230707/rasr/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard(_ZN4Core11Application3runERKSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS7_EE+0x23) [0x55cb50fe92f3]
#8 /work/asr4/vieting/programs/rasr/20230707/rasr/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard(_ZN4Core11Application4mainEiPPc+0x484) [0x55cb50f617d4]
#9 /work/asr4/vieting/programs/rasr/20230707/rasr/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard(main+0x3d) [0x55cb50f5f6cd]
#10 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x7f377526b083]
#11 /work/asr4/vieting/programs/rasr/20230707/rasr/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard(_start+0x2e) [0x55cb50f890de]
</critical-error>
<critical-error component="neural-network-trainer">
Terminating due to previous errors
</critical-error>
<?xml version="1.0" encoding="UTF-8"?>
<sprint>
<?xml version="1.0" encoding="UTF-8"?>
<sprint>
exiting...
exiting...
Process train worker proc 1/2:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/work/asr3/vieting/hiwis/kannen/sisyphus_work_dirs/swb/i6_core/tools/git/CloneGitRepositoryJob.FigHMwYJhhef/output/repository/returnn/datasets/multi_proc.py", line 295, in _worker_proc_loop
    msg, kwargs = parent.recv()
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
    buf = self._recv(4)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 383, in _recv
    raise EOFError
EOFError
--------------------- Slurm Task Epilog ------------------------
Job ID: 2810223
Time: Fri 27 Oct 22:52:38 CEST 2023
Elapsed Time: 09:24:24
Billing per second for TRES: billing=116,cpu=3,gres/gpu=1,mem=30G,node=1
Show resource usage with e.g.:
sacct -j 2810223 -o Elapsed,TotalCPU,UserCPU,SystemCPU,MaxRSS,ReqTRES%60,MaxDiskRead,MaxDiskWrite
--------------------- Slurm Task Epilog ------------------------
slurmstepd-cn-260: error: Detected 1 oom-kill event(s) in StepId=2810223.batch. Some of your processes may have been killed by the cgroup out-of-memory handler.
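The slurmstepd line above is the actual cause of the failure: peak RSS reached 55.22GB against the 30G TRES request, so the cgroup OOM handler sent SIGKILL (hence return code -9), and the EOFError tracebacks are just the Sprint/worker children losing their pipes afterwards. A minimal sketch of how to confirm this from a saved log; the local filename `log.run.1` is an assumption, and on the cluster itself one would instead compare `sacct` fields such as `MaxRSS` and `ReqMem` for job 2810223 against each other:

```shell
# Hypothetical check: recreate a one-line sample log so the command is
# self-contained, then count slurmstepd oom-kill events in it.
printf '%s\n' 'slurmstepd-cn-260: error: Detected 1 oom-kill event(s) in StepId=2810223.batch.' > log.run.1
grep -c 'oom-kill' log.run.1
```

A non-zero count means the job died to the cgroup OOM killer rather than to a bug in the training code itself.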