@0x6b64 · Created June 18, 2024 10:06
sd3_cpu_goldens_activations
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.3.0+cu121 with CUDA 1201 (you have 2.1.2+cu121)
Python 3.10.14 (you have 3.10.13)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Memory-efficient attention, SwiGLU, sparse and more won't be available.
Set XFORMERS_MORE_DETAILS=1 for more details
/opt/conda/lib/python3.10/site-packages/diffusers/models/transformers/transformer_2d.py:34: FutureWarning: `Transformer2DModelOutput` is deprecated and will be removed in version 1.0.0. Importing `Transformer2DModelOutput` from `diffusers.models.transformer_2d` is deprecated and this will be removed in a future version. Please use `from diffusers.models.modeling_outputs import Transformer2DModelOutput`, instead.
deprecate("Transformer2DModelOutput", "1.0.0", deprecation_message)
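The warning above names the replacement import itself; assuming nothing else about the calling code, the fix is a one-line change:

from diffusers.models.modeling_outputs import Transformer2DModelOutput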
Loading checkpoint shards: 100%|██████████| 2/2 [00:01<00:00, 1.99it/s]
You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
Loading pipeline components...: 100%|██████████| 9/9 [00:03<00:00, 2.41it/s]
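Everything below is a per-module activation trace of the pipeline's two CLIP text encoders: one line per tensor, giving the module class, the positional input or output index, the dtype, and the min/max of the values. The repeated `CLIPEncoderLayer input=2` line with min=-3.3895313892515355e+38 is consistent with a causal attention mask filled with torch.finfo(torch.bfloat16).min. A minimal sketch of how such a trace could be captured with PyTorch forward hooks follows; the checkpoint id, prompt, and the make_hook helper are illustrative assumptions, not the script that produced this gist.

import torch
from diffusers import StableDiffusion3Pipeline

def make_hook(name):
    # Forward hooks receive (module, positional_inputs, output); print one
    # line per tensor in the same "Name input=i / output=i" format as below.
    def hook(module, inputs, output):
        outputs = output if isinstance(output, tuple) else (output,)
        for label, tensors in (("input", inputs), ("output", outputs)):
            for i, t in enumerate(tensors):
                if torch.is_tensor(t):  # non-tensor args (e.g. None) are skipped
                    print(f"{name} {label}={i} dtype={t.dtype} "
                          f"min={t.min().item()} max={t.max().item()}")
    return hook

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",  # assumed checkpoint
    torch_dtype=torch.bfloat16,
)

# Hook every module (leaf and container) of both CLIP text encoders, so the
# trace interleaves Linear/LayerNorm lines with their enclosing
# CLIPAttention/CLIPMLP/CLIPEncoderLayer lines, as in the log below.
for encoder in (pipe.text_encoder, pipe.text_encoder_2):
    for module in encoder.modules():
        module.register_forward_hook(make_hook(type(module).__name__))

pipe("a photo of a cat", num_inference_steps=1)  # assumed prompt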
Embedding input=0 dtype=torch.int64 min=320 max=49407
Embedding output=0 dtype=torch.bfloat16 min=-0.5078125 max=0.65234375
Embedding input=0 dtype=torch.int64 min=0 max=76
Embedding output=0 dtype=torch.bfloat16 min=-0.1181640625 max=0.65234375
CLIPTextEmbeddings output=0 dtype=torch.bfloat16 min=-0.5234375 max=1.3046875
LayerNorm input=0 dtype=torch.bfloat16 min=-0.5234375 max=1.3046875
LayerNorm output=0 dtype=torch.bfloat16 min=-27.25 max=28.5
Linear input=0 dtype=torch.bfloat16 min=-27.25 max=28.5
Linear output=0 dtype=torch.bfloat16 min=-6.90625 max=5.28125
Linear input=0 dtype=torch.bfloat16 min=-27.25 max=28.5
Linear output=0 dtype=torch.bfloat16 min=-8.9375 max=9.375
Linear input=0 dtype=torch.bfloat16 min=-27.25 max=28.5
Linear output=0 dtype=torch.bfloat16 min=-2.625 max=2.234375
Linear input=0 dtype=torch.bfloat16 min=-2.421875 max=1.6015625
Linear output=0 dtype=torch.bfloat16 min=-0.94921875 max=1.0078125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.94921875 max=1.0078125
LayerNorm input=0 dtype=torch.bfloat16 min=-0.9609375 max=1.71875
LayerNorm output=0 dtype=torch.bfloat16 min=-23.25 max=179.0
Linear input=0 dtype=torch.bfloat16 min=-23.25 max=179.0
Linear output=0 dtype=torch.bfloat16 min=-44.75 max=233.0
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-44.75 max=233.0
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=233.0
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=233.0
Linear output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPMLP input=0 dtype=torch.bfloat16 min=-23.25 max=179.0
CLIPMLP output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-0.5234375 max=1.3046875
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-16.75 max=10.25
Linear input=0 dtype=torch.bfloat16 min=-16.75 max=10.25
Linear output=0 dtype=torch.bfloat16 min=-5.25 max=4.84375
Linear input=0 dtype=torch.bfloat16 min=-16.75 max=10.25
Linear output=0 dtype=torch.bfloat16 min=-4.625 max=3.890625
Linear input=0 dtype=torch.bfloat16 min=-16.75 max=10.25
Linear output=0 dtype=torch.bfloat16 min=-2.171875 max=3.53125
Linear input=0 dtype=torch.bfloat16 min=-1.4140625 max=2.8125
Linear output=0 dtype=torch.bfloat16 min=-0.375 max=0.84765625
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.375 max=0.84765625
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-41.0 max=80.5
Linear input=0 dtype=torch.bfloat16 min=-41.0 max=80.5
Linear output=0 dtype=torch.bfloat16 min=-11.75 max=3.8125
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-11.75 max=3.8125
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=3.8125
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=3.8125
Linear output=0 dtype=torch.bfloat16 min=-1.203125 max=1.0234375
CLIPMLP input=0 dtype=torch.bfloat16 min=-41.0 max=80.5
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.203125 max=1.0234375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-18.875 max=16.0
Linear input=0 dtype=torch.bfloat16 min=-18.875 max=16.0
Linear output=0 dtype=torch.bfloat16 min=-5.96875 max=4.875
Linear input=0 dtype=torch.bfloat16 min=-18.875 max=16.0
Linear output=0 dtype=torch.bfloat16 min=-4.46875 max=4.46875
Linear input=0 dtype=torch.bfloat16 min=-18.875 max=16.0
Linear output=0 dtype=torch.bfloat16 min=-2.6875 max=3.640625
Linear input=0 dtype=torch.bfloat16 min=-2.328125 max=2.234375
Linear output=0 dtype=torch.bfloat16 min=-0.400390625 max=0.345703125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.400390625 max=0.345703125
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-25.125 max=80.5
Linear input=0 dtype=torch.bfloat16 min=-25.125 max=80.5
Linear output=0 dtype=torch.bfloat16 min=-6.78125 max=3.90625
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-6.78125 max=3.90625
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=3.90625
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=3.90625
Linear output=0 dtype=torch.bfloat16 min=-0.546875 max=0.50390625
CLIPMLP input=0 dtype=torch.bfloat16 min=-25.125 max=80.5
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.546875 max=0.50390625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-16.875 max=20.375
Linear input=0 dtype=torch.bfloat16 min=-16.875 max=20.375
Linear output=0 dtype=torch.bfloat16 min=-6.40625 max=5.25
Linear input=0 dtype=torch.bfloat16 min=-16.875 max=20.375
Linear output=0 dtype=torch.bfloat16 min=-4.75 max=4.46875
Linear input=0 dtype=torch.bfloat16 min=-16.875 max=20.375
Linear output=0 dtype=torch.bfloat16 min=-2.703125 max=2.78125
Linear input=0 dtype=torch.bfloat16 min=-1.453125 max=1.7734375
Linear output=0 dtype=torch.bfloat16 min=-0.412109375 max=0.462890625
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.412109375 max=0.462890625
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-26.5 max=61.0
Linear input=0 dtype=torch.bfloat16 min=-26.5 max=61.0
Linear output=0 dtype=torch.bfloat16 min=-5.90625 max=4.65625
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-5.90625 max=4.65625
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=4.65625
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=4.65625
Linear output=0 dtype=torch.bfloat16 min=-0.451171875 max=0.462890625
CLIPMLP input=0 dtype=torch.bfloat16 min=-26.5 max=61.0
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.451171875 max=0.462890625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-17.75 max=19.125
Linear input=0 dtype=torch.bfloat16 min=-17.75 max=19.125
Linear output=0 dtype=torch.bfloat16 min=-4.5625 max=4.8125
Linear input=0 dtype=torch.bfloat16 min=-17.75 max=19.125
Linear output=0 dtype=torch.bfloat16 min=-5.0625 max=4.6875
Linear input=0 dtype=torch.bfloat16 min=-17.75 max=19.125
Linear output=0 dtype=torch.bfloat16 min=-3.03125 max=2.65625
Linear input=0 dtype=torch.bfloat16 min=-2.28125 max=2.078125
Linear output=0 dtype=torch.bfloat16 min=-0.380859375 max=0.41796875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.380859375 max=0.41796875
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-25.375 max=49.25
Linear input=0 dtype=torch.bfloat16 min=-25.375 max=49.25
Linear output=0 dtype=torch.bfloat16 min=-6.21875 max=6.3125
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-6.21875 max=6.3125
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=6.3125
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=6.3125
Linear output=0 dtype=torch.bfloat16 min=-0.5 max=0.5625
CLIPMLP input=0 dtype=torch.bfloat16 min=-25.375 max=49.25
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.5 max=0.5625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-18.5 max=20.0
Linear input=0 dtype=torch.bfloat16 min=-18.5 max=20.0
Linear output=0 dtype=torch.bfloat16 min=-4.9375 max=5.78125
Linear input=0 dtype=torch.bfloat16 min=-18.5 max=20.0
Linear output=0 dtype=torch.bfloat16 min=-5.53125 max=4.5
Linear input=0 dtype=torch.bfloat16 min=-18.5 max=20.0
Linear output=0 dtype=torch.bfloat16 min=-2.65625 max=2.484375
Linear input=0 dtype=torch.bfloat16 min=-1.40625 max=1.5234375
Linear output=0 dtype=torch.bfloat16 min=-0.45703125 max=0.71484375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.45703125 max=0.71484375
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=44.25
Linear input=0 dtype=torch.bfloat16 min=-33.0 max=44.25
Linear output=0 dtype=torch.bfloat16 min=-5.875 max=3.40625
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-5.875 max=3.40625
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=3.390625
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=3.390625
Linear output=0 dtype=torch.bfloat16 min=-0.92578125 max=0.494140625
CLIPMLP input=0 dtype=torch.bfloat16 min=-33.0 max=44.25
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.92578125 max=0.494140625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-19.0 max=18.125
Linear input=0 dtype=torch.bfloat16 min=-19.0 max=18.125
Linear output=0 dtype=torch.bfloat16 min=-5.09375 max=6.5
Linear input=0 dtype=torch.bfloat16 min=-19.0 max=18.125
Linear output=0 dtype=torch.bfloat16 min=-5.75 max=4.4375
Linear input=0 dtype=torch.bfloat16 min=-19.0 max=18.125
Linear output=0 dtype=torch.bfloat16 min=-3.21875 max=2.984375
Linear input=0 dtype=torch.bfloat16 min=-2.3125 max=2.921875
Linear output=0 dtype=torch.bfloat16 min=-0.60546875 max=1.0390625
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.60546875 max=1.0390625
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.75 max=39.75
Linear input=0 dtype=torch.bfloat16 min=-34.75 max=39.75
Linear output=0 dtype=torch.bfloat16 min=-6.9375 max=4.28125
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-6.9375 max=4.28125
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=4.28125
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=4.28125
Linear output=0 dtype=torch.bfloat16 min=-0.85546875 max=0.50390625
CLIPMLP input=0 dtype=torch.bfloat16 min=-34.75 max=39.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.85546875 max=0.50390625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-22.0 max=17.5
Linear input=0 dtype=torch.bfloat16 min=-22.0 max=17.5
Linear output=0 dtype=torch.bfloat16 min=-6.375 max=6.21875
Linear input=0 dtype=torch.bfloat16 min=-22.0 max=17.5
Linear output=0 dtype=torch.bfloat16 min=-5.4375 max=4.65625
Linear input=0 dtype=torch.bfloat16 min=-22.0 max=17.5
Linear output=0 dtype=torch.bfloat16 min=-3.609375 max=3.09375
Linear input=0 dtype=torch.bfloat16 min=-1.65625 max=2.96875
Linear output=0 dtype=torch.bfloat16 min=-0.435546875 max=0.7109375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.435546875 max=0.7109375
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.5 max=39.25
Linear input=0 dtype=torch.bfloat16 min=-36.5 max=39.25
Linear output=0 dtype=torch.bfloat16 min=-6.34375 max=6.40625
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-6.34375 max=6.40625
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=6.40625
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=6.40625
Linear output=0 dtype=torch.bfloat16 min=-1.7265625 max=1.0078125
CLIPMLP input=0 dtype=torch.bfloat16 min=-36.5 max=39.25
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.7265625 max=1.0078125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-24.5 max=18.875
Linear input=0 dtype=torch.bfloat16 min=-24.5 max=18.875
Linear output=0 dtype=torch.bfloat16 min=-6.25 max=6.5625
Linear input=0 dtype=torch.bfloat16 min=-24.5 max=18.875
Linear output=0 dtype=torch.bfloat16 min=-4.96875 max=5.84375
Linear input=0 dtype=torch.bfloat16 min=-24.5 max=18.875
Linear output=0 dtype=torch.bfloat16 min=-2.828125 max=3.34375
Linear input=0 dtype=torch.bfloat16 min=-1.171875 max=1.7734375
Linear output=0 dtype=torch.bfloat16 min=-0.4375 max=0.96875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.4375 max=0.96875
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.5 max=43.75
Linear input=0 dtype=torch.bfloat16 min=-34.5 max=43.75
Linear output=0 dtype=torch.bfloat16 min=-6.40625 max=6.75
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-6.40625 max=6.75
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=6.75
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=6.75
Linear output=0 dtype=torch.bfloat16 min=-0.71875 max=0.75
CLIPMLP input=0 dtype=torch.bfloat16 min=-34.5 max=43.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.71875 max=0.75
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-23.25 max=18.5
Linear input=0 dtype=torch.bfloat16 min=-23.25 max=18.5
Linear output=0 dtype=torch.bfloat16 min=-5.96875 max=5.84375
Linear input=0 dtype=torch.bfloat16 min=-23.25 max=18.5
Linear output=0 dtype=torch.bfloat16 min=-4.75 max=4.5625
Linear input=0 dtype=torch.bfloat16 min=-23.25 max=18.5
Linear output=0 dtype=torch.bfloat16 min=-3.90625 max=3.78125
Linear input=0 dtype=torch.bfloat16 min=-3.484375 max=3.046875
Linear output=0 dtype=torch.bfloat16 min=-1.171875 max=2.71875
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.171875 max=2.71875
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-37.75 max=45.0
Linear input=0 dtype=torch.bfloat16 min=-37.75 max=45.0
Linear output=0 dtype=torch.bfloat16 min=-7.4375 max=5.5
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-7.4375 max=5.5
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=5.5
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=5.5
Linear output=0 dtype=torch.bfloat16 min=-1.03125 max=2.21875
CLIPMLP input=0 dtype=torch.bfloat16 min=-37.75 max=45.0
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.03125 max=2.21875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-24.5 max=15.75
Linear input=0 dtype=torch.bfloat16 min=-24.5 max=15.75
Linear output=0 dtype=torch.bfloat16 min=-6.875 max=6.53125
Linear input=0 dtype=torch.bfloat16 min=-24.5 max=15.75
Linear output=0 dtype=torch.bfloat16 min=-5.03125 max=5.59375
Linear input=0 dtype=torch.bfloat16 min=-24.5 max=15.75
Linear output=0 dtype=torch.bfloat16 min=-3.71875 max=4.25
Linear input=0 dtype=torch.bfloat16 min=-2.4375 max=2.34375
Linear output=0 dtype=torch.bfloat16 min=-1.2109375 max=1.375
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.2109375 max=1.375
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-53.0 max=58.0
Linear input=0 dtype=torch.bfloat16 min=-53.0 max=58.0
Linear output=0 dtype=torch.bfloat16 min=-8.0 max=5.65625
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-8.0 max=5.65625
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=5.65625
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=5.65625
Linear output=0 dtype=torch.bfloat16 min=-1.3125 max=1.015625
CLIPMLP input=0 dtype=torch.bfloat16 min=-53.0 max=58.0
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.3125 max=1.015625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-28.375 max=10.3125
Linear input=0 dtype=torch.bfloat16 min=-28.375 max=10.3125
Linear output=0 dtype=torch.bfloat16 min=-6.0625 max=5.78125
Linear input=0 dtype=torch.bfloat16 min=-28.375 max=10.3125
Linear output=0 dtype=torch.bfloat16 min=-6.65625 max=7.09375
Linear input=0 dtype=torch.bfloat16 min=-28.375 max=10.3125
Linear output=0 dtype=torch.bfloat16 min=-4.5 max=4.5
Linear input=0 dtype=torch.bfloat16 min=-2.65625 max=3.359375
Linear output=0 dtype=torch.bfloat16 min=-1.3125 max=1.59375
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.3125 max=1.59375
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-54.0 max=55.0
Linear input=0 dtype=torch.bfloat16 min=-54.0 max=55.0
Linear output=0 dtype=torch.bfloat16 min=-5.3125 max=4.0625
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-5.3125 max=4.0625
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=4.0625
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=4.0625
Linear output=0 dtype=torch.bfloat16 min=-4.84375 max=1.859375
CLIPMLP input=0 dtype=torch.bfloat16 min=-54.0 max=55.0
CLIPMLP output=0 dtype=torch.bfloat16 min=-4.84375 max=1.859375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-28.0 max=33.0
Linear input=0 dtype=torch.bfloat16 min=-3.265625 max=4.0
Linear output=0 dtype=torch.bfloat16 min=-3.265625 max=4.0
CLIPTextModelWithProjection input=0 dtype=torch.int64 min=320 max=49407
Embedding input=0 dtype=torch.int64 min=0 max=49407
Embedding output=0 dtype=torch.bfloat16 min=-0.6875 max=0.1474609375
Embedding input=0 dtype=torch.int64 min=0 max=76
Embedding output=0 dtype=torch.bfloat16 min=-0.6875 max=0.130859375
CLIPTextEmbeddings output=0 dtype=torch.bfloat16 min=-1.375 max=0.1650390625
LayerNorm input=0 dtype=torch.bfloat16 min=-1.375 max=0.1650390625
LayerNorm output=0 dtype=torch.bfloat16 min=-13.5 max=9.9375
Linear input=0 dtype=torch.bfloat16 min=-13.5 max=9.9375
Linear output=0 dtype=torch.bfloat16 min=-9.75 max=9.125
Linear input=0 dtype=torch.bfloat16 min=-13.5 max=9.9375
Linear output=0 dtype=torch.bfloat16 min=-6.6875 max=6.78125
Linear input=0 dtype=torch.bfloat16 min=-13.5 max=9.9375
Linear output=0 dtype=torch.bfloat16 min=-4.375 max=4.4375
Linear input=0 dtype=torch.bfloat16 min=-3.484375 max=4.40625
Linear output=0 dtype=torch.bfloat16 min=-0.6640625 max=0.67578125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.6640625 max=0.67578125
LayerNorm input=0 dtype=torch.bfloat16 min=-1.40625 max=0.6953125
LayerNorm output=0 dtype=torch.bfloat16 min=-33.5 max=16.625
Linear input=0 dtype=torch.bfloat16 min=-33.5 max=16.625
Linear output=0 dtype=torch.bfloat16 min=-10.875 max=9.5
GELUActivation input=0 dtype=torch.bfloat16 min=-10.875 max=9.5
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.5
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.5
Linear output=0 dtype=torch.bfloat16 min=-69.0 max=17.25
CLIPMLP input=0 dtype=torch.bfloat16 min=-33.5 max=16.625
CLIPMLP output=0 dtype=torch.bfloat16 min=-69.0 max=17.25
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-1.375 max=0.1650390625
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.0 max=17.375
LayerNorm input=0 dtype=torch.bfloat16 min=-69.0 max=17.375
LayerNorm output=0 dtype=torch.bfloat16 min=-16.0 max=8.4375
Linear input=0 dtype=torch.bfloat16 min=-16.0 max=8.4375
Linear output=0 dtype=torch.bfloat16 min=-11.4375 max=10.75
Linear input=0 dtype=torch.bfloat16 min=-16.0 max=8.4375
Linear output=0 dtype=torch.bfloat16 min=-6.875 max=7.375
Linear input=0 dtype=torch.bfloat16 min=-16.0 max=8.4375
Linear output=0 dtype=torch.bfloat16 min=-4.9375 max=5.21875
Linear input=0 dtype=torch.bfloat16 min=-2.90625 max=3.0
Linear output=0 dtype=torch.bfloat16 min=-0.345703125 max=0.283203125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.345703125 max=0.283203125
LayerNorm input=0 dtype=torch.bfloat16 min=-69.0 max=17.375
LayerNorm output=0 dtype=torch.bfloat16 min=-9.875 max=12.8125
Linear input=0 dtype=torch.bfloat16 min=-9.875 max=12.8125
Linear output=0 dtype=torch.bfloat16 min=-7.3125 max=5.34375
GELUActivation input=0 dtype=torch.bfloat16 min=-7.3125 max=5.34375
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.34375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.34375
Linear output=0 dtype=torch.bfloat16 min=-1.15625 max=1.421875
CLIPMLP input=0 dtype=torch.bfloat16 min=-9.875 max=12.8125
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.15625 max=1.421875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.0 max=17.375
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-68.5 max=17.625
LayerNorm input=0 dtype=torch.bfloat16 min=-68.5 max=17.625
LayerNorm output=0 dtype=torch.bfloat16 min=-16.75 max=10.875
Linear input=0 dtype=torch.bfloat16 min=-16.75 max=10.875
Linear output=0 dtype=torch.bfloat16 min=-3.859375 max=3.6875
Linear input=0 dtype=torch.bfloat16 min=-16.75 max=10.875
Linear output=0 dtype=torch.bfloat16 min=-3.546875 max=3.859375
Linear input=0 dtype=torch.bfloat16 min=-16.75 max=10.875
Linear output=0 dtype=torch.bfloat16 min=-2.234375 max=2.46875
Linear input=0 dtype=torch.bfloat16 min=-1.8828125 max=1.7734375
Linear output=0 dtype=torch.bfloat16 min=-1.265625 max=1.7265625
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.265625 max=1.7265625
LayerNorm input=0 dtype=torch.bfloat16 min=-68.5 max=18.375
LayerNorm output=0 dtype=torch.bfloat16 min=-5.21875 max=7.09375
Linear input=0 dtype=torch.bfloat16 min=-5.21875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-10.0625 max=4.75
GELUActivation input=0 dtype=torch.bfloat16 min=-10.0625 max=4.75
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.75
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.75
Linear output=0 dtype=torch.bfloat16 min=-1.3671875 max=2.109375
CLIPMLP input=0 dtype=torch.bfloat16 min=-5.21875 max=7.09375
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.3671875 max=2.109375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-68.5 max=17.625
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.5 max=18.0
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=18.0
LayerNorm output=0 dtype=torch.bfloat16 min=-19.375 max=16.75
Linear input=0 dtype=torch.bfloat16 min=-19.375 max=16.75
Linear output=0 dtype=torch.bfloat16 min=-7.0 max=6.03125
Linear input=0 dtype=torch.bfloat16 min=-19.375 max=16.75
Linear output=0 dtype=torch.bfloat16 min=-6.65625 max=7.25
Linear input=0 dtype=torch.bfloat16 min=-19.375 max=16.75
Linear output=0 dtype=torch.bfloat16 min=-4.84375 max=4.375
Linear input=0 dtype=torch.bfloat16 min=-4.78125 max=3.015625
Linear output=0 dtype=torch.bfloat16 min=-0.8515625 max=0.408203125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.8515625 max=0.408203125
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=18.0
LayerNorm output=0 dtype=torch.bfloat16 min=-6.59375 max=6.84375
Linear input=0 dtype=torch.bfloat16 min=-6.59375 max=6.84375
Linear output=0 dtype=torch.bfloat16 min=-10.1875 max=5.65625
GELUActivation input=0 dtype=torch.bfloat16 min=-10.1875 max=5.65625
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.65625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.65625
Linear output=0 dtype=torch.bfloat16 min=-0.490234375 max=1.03125
CLIPMLP input=0 dtype=torch.bfloat16 min=-6.59375 max=6.84375
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.490234375 max=1.03125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.5 max=18.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.5 max=17.5
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.5
LayerNorm output=0 dtype=torch.bfloat16 min=-18.75 max=18.625
Linear input=0 dtype=torch.bfloat16 min=-18.75 max=18.625
Linear output=0 dtype=torch.bfloat16 min=-6.84375 max=9.3125
Linear input=0 dtype=torch.bfloat16 min=-18.75 max=18.625
Linear output=0 dtype=torch.bfloat16 min=-6.6875 max=6.5
Linear input=0 dtype=torch.bfloat16 min=-18.75 max=18.625
Linear output=0 dtype=torch.bfloat16 min=-4.78125 max=5.28125
Linear input=0 dtype=torch.bfloat16 min=-4.75 max=4.53125
Linear output=0 dtype=torch.bfloat16 min=-1.28125 max=0.57421875
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.28125 max=0.57421875
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.5
LayerNorm output=0 dtype=torch.bfloat16 min=-7.9375 max=9.8125
Linear input=0 dtype=torch.bfloat16 min=-7.9375 max=9.8125
Linear output=0 dtype=torch.bfloat16 min=-7.5 max=5.9375
GELUActivation input=0 dtype=torch.bfloat16 min=-7.5 max=5.9375
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.9375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.9375
Linear output=0 dtype=torch.bfloat16 min=-0.5078125 max=1.265625
CLIPMLP input=0 dtype=torch.bfloat16 min=-7.9375 max=9.8125
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.5078125 max=1.265625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.5 max=17.5
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.5 max=17.0
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.0
LayerNorm output=0 dtype=torch.bfloat16 min=-22.25 max=24.0
Linear input=0 dtype=torch.bfloat16 min=-22.25 max=24.0
Linear output=0 dtype=torch.bfloat16 min=-8.375 max=8.25
Linear input=0 dtype=torch.bfloat16 min=-22.25 max=24.0
Linear output=0 dtype=torch.bfloat16 min=-5.1875 max=8.5
Linear input=0 dtype=torch.bfloat16 min=-22.25 max=24.0
Linear output=0 dtype=torch.bfloat16 min=-6.40625 max=6.40625
Linear input=0 dtype=torch.bfloat16 min=-5.875 max=5.90625
Linear output=0 dtype=torch.bfloat16 min=-1.171875 max=0.5625
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.171875 max=0.5625
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.0
LayerNorm output=0 dtype=torch.bfloat16 min=-11.375 max=14.0625
Linear input=0 dtype=torch.bfloat16 min=-11.375 max=14.0625
Linear output=0 dtype=torch.bfloat16 min=-6.46875 max=5.75
GELUActivation input=0 dtype=torch.bfloat16 min=-6.46875 max=5.75
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.75
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.75
Linear output=0 dtype=torch.bfloat16 min=-0.99609375 max=2.078125
CLIPMLP input=0 dtype=torch.bfloat16 min=-11.375 max=14.0625
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.99609375 max=2.078125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.5 max=17.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.5 max=17.125
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.125
LayerNorm output=0 dtype=torch.bfloat16 min=-19.375 max=19.25
Linear input=0 dtype=torch.bfloat16 min=-19.375 max=19.25
Linear output=0 dtype=torch.bfloat16 min=-7.09375 max=7.75
Linear input=0 dtype=torch.bfloat16 min=-19.375 max=19.25
Linear output=0 dtype=torch.bfloat16 min=-7.4375 max=5.125
Linear input=0 dtype=torch.bfloat16 min=-19.375 max=19.25
Linear output=0 dtype=torch.bfloat16 min=-3.84375 max=12.1875
Linear input=0 dtype=torch.bfloat16 min=-3.1875 max=10.8125
Linear output=0 dtype=torch.bfloat16 min=-1.0546875 max=0.9921875
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.0546875 max=0.9921875
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.125
LayerNorm output=0 dtype=torch.bfloat16 min=-9.5 max=18.5
Linear input=0 dtype=torch.bfloat16 min=-9.5 max=18.5
Linear output=0 dtype=torch.bfloat16 min=-7.53125 max=4.34375
GELUActivation input=0 dtype=torch.bfloat16 min=-7.53125 max=4.34375
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.34375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.34375
Linear output=0 dtype=torch.bfloat16 min=-0.96875 max=1.0859375
CLIPMLP input=0 dtype=torch.bfloat16 min=-9.5 max=18.5
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.96875 max=1.0859375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.5 max=17.125
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.5 max=17.25
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.25
LayerNorm output=0 dtype=torch.bfloat16 min=-20.25 max=19.125
Linear input=0 dtype=torch.bfloat16 min=-20.25 max=19.125
Linear output=0 dtype=torch.bfloat16 min=-7.78125 max=7.0625
Linear input=0 dtype=torch.bfloat16 min=-20.25 max=19.125
Linear output=0 dtype=torch.bfloat16 min=-6.0 max=5.96875
Linear input=0 dtype=torch.bfloat16 min=-20.25 max=19.125
Linear output=0 dtype=torch.bfloat16 min=-3.90625 max=3.21875
Linear input=0 dtype=torch.bfloat16 min=-3.34375 max=2.734375
Linear output=0 dtype=torch.bfloat16 min=-0.78515625 max=0.9765625
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.78515625 max=0.9765625
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.25
LayerNorm output=0 dtype=torch.bfloat16 min=-11.3125 max=7.84375
Linear input=0 dtype=torch.bfloat16 min=-11.3125 max=7.84375
Linear output=0 dtype=torch.bfloat16 min=-6.71875 max=5.9375
GELUActivation input=0 dtype=torch.bfloat16 min=-6.71875 max=5.9375
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.9375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.9375
Linear output=0 dtype=torch.bfloat16 min=-1.078125 max=1.3125
CLIPMLP input=0 dtype=torch.bfloat16 min=-11.3125 max=7.84375
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.078125 max=1.3125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.5 max=17.25
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.5 max=17.25
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.25
LayerNorm output=0 dtype=torch.bfloat16 min=-12.5 max=15.8125
Linear input=0 dtype=torch.bfloat16 min=-12.5 max=15.8125
Linear output=0 dtype=torch.bfloat16 min=-7.78125 max=6.53125
Linear input=0 dtype=torch.bfloat16 min=-12.5 max=15.8125
Linear output=0 dtype=torch.bfloat16 min=-3.734375 max=4.71875
Linear input=0 dtype=torch.bfloat16 min=-12.5 max=15.8125
Linear output=0 dtype=torch.bfloat16 min=-2.796875 max=3.5
Linear input=0 dtype=torch.bfloat16 min=-2.640625 max=2.3125
Linear output=0 dtype=torch.bfloat16 min=-1.15625 max=0.75390625
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.15625 max=0.75390625
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.25
LayerNorm output=0 dtype=torch.bfloat16 min=-8.4375 max=6.75
Linear input=0 dtype=torch.bfloat16 min=-8.4375 max=6.75
Linear output=0 dtype=torch.bfloat16 min=-6.96875 max=4.625
GELUActivation input=0 dtype=torch.bfloat16 min=-6.96875 max=4.625
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.625
Linear output=0 dtype=torch.bfloat16 min=-0.62109375 max=1.078125
CLIPMLP input=0 dtype=torch.bfloat16 min=-8.4375 max=6.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.62109375 max=1.078125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.5 max=17.25
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.0 max=17.25
LayerNorm input=0 dtype=torch.bfloat16 min=-69.0 max=17.25
LayerNorm output=0 dtype=torch.bfloat16 min=-10.875 max=13.3125
Linear input=0 dtype=torch.bfloat16 min=-10.875 max=13.3125
Linear output=0 dtype=torch.bfloat16 min=-7.78125 max=6.78125
Linear input=0 dtype=torch.bfloat16 min=-10.875 max=13.3125
Linear output=0 dtype=torch.bfloat16 min=-4.9375 max=3.953125
Linear input=0 dtype=torch.bfloat16 min=-10.875 max=13.3125
Linear output=0 dtype=torch.bfloat16 min=-2.859375 max=2.671875
Linear input=0 dtype=torch.bfloat16 min=-1.9765625 max=2.0
Linear output=0 dtype=torch.bfloat16 min=-0.68359375 max=0.43359375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.68359375 max=0.43359375
LayerNorm input=0 dtype=torch.bfloat16 min=-69.0 max=17.375
LayerNorm output=0 dtype=torch.bfloat16 min=-8.8125 max=8.875
Linear input=0 dtype=torch.bfloat16 min=-8.8125 max=8.875
Linear output=0 dtype=torch.bfloat16 min=-7.125 max=5.0625
GELUActivation input=0 dtype=torch.bfloat16 min=-7.125 max=5.0625
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.0625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.0625
Linear output=0 dtype=torch.bfloat16 min=-1.1640625 max=1.234375
CLIPMLP input=0 dtype=torch.bfloat16 min=-8.8125 max=8.875
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.1640625 max=1.234375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.0 max=17.25
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.0 max=17.375
LayerNorm input=0 dtype=torch.bfloat16 min=-69.0 max=17.375
LayerNorm output=0 dtype=torch.bfloat16 min=-11.0 max=15.5625
Linear input=0 dtype=torch.bfloat16 min=-11.0 max=15.5625
Linear output=0 dtype=torch.bfloat16 min=-7.6875 max=6.96875
Linear input=0 dtype=torch.bfloat16 min=-11.0 max=15.5625
Linear output=0 dtype=torch.bfloat16 min=-4.75 max=4.28125
Linear input=0 dtype=torch.bfloat16 min=-11.0 max=15.5625
Linear output=0 dtype=torch.bfloat16 min=-4.15625 max=3.0625
Linear input=0 dtype=torch.bfloat16 min=-2.0 max=1.9765625
Linear output=0 dtype=torch.bfloat16 min=-0.99609375 max=0.671875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.99609375 max=0.671875
LayerNorm input=0 dtype=torch.bfloat16 min=-69.0 max=17.5
LayerNorm output=0 dtype=torch.bfloat16 min=-9.0625 max=7.4375
Linear input=0 dtype=torch.bfloat16 min=-9.0625 max=7.4375
Linear output=0 dtype=torch.bfloat16 min=-6.8125 max=5.03125
GELUActivation input=0 dtype=torch.bfloat16 min=-6.8125 max=5.03125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.03125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.03125
Linear output=0 dtype=torch.bfloat16 min=-0.8828125 max=0.87109375
CLIPMLP input=0 dtype=torch.bfloat16 min=-9.0625 max=7.4375
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.8828125 max=0.87109375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.0 max=17.375
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-68.5 max=17.5
LayerNorm input=0 dtype=torch.bfloat16 min=-68.5 max=17.5
LayerNorm output=0 dtype=torch.bfloat16 min=-8.375 max=13.375
Linear input=0 dtype=torch.bfloat16 min=-8.375 max=13.375
Linear output=0 dtype=torch.bfloat16 min=-6.78125 max=6.375
Linear input=0 dtype=torch.bfloat16 min=-8.375 max=13.375
Linear output=0 dtype=torch.bfloat16 min=-5.0625 max=6.65625
Linear input=0 dtype=torch.bfloat16 min=-8.375 max=13.375
Linear output=0 dtype=torch.bfloat16 min=-3.5 max=3.203125
Linear input=0 dtype=torch.bfloat16 min=-2.21875 max=1.5234375
Linear output=0 dtype=torch.bfloat16 min=-0.77734375 max=0.8046875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.77734375 max=0.8046875
LayerNorm input=0 dtype=torch.bfloat16 min=-68.5 max=17.5
LayerNorm output=0 dtype=torch.bfloat16 min=-8.25 max=7.59375
Linear input=0 dtype=torch.bfloat16 min=-8.25 max=7.59375
Linear output=0 dtype=torch.bfloat16 min=-7.46875 max=4.53125
GELUActivation input=0 dtype=torch.bfloat16 min=-7.46875 max=4.53125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.53125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.53125
Linear output=0 dtype=torch.bfloat16 min=-0.71484375 max=1.75
CLIPMLP input=0 dtype=torch.bfloat16 min=-8.25 max=7.59375
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.71484375 max=1.75
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-68.5 max=17.5
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-68.0 max=17.5
LayerNorm input=0 dtype=torch.bfloat16 min=-68.0 max=17.5
LayerNorm output=0 dtype=torch.bfloat16 min=-9.0 max=15.1875
Linear input=0 dtype=torch.bfloat16 min=-9.0 max=15.1875
Linear output=0 dtype=torch.bfloat16 min=-6.40625 max=6.8125
Linear input=0 dtype=torch.bfloat16 min=-9.0 max=15.1875
Linear output=0 dtype=torch.bfloat16 min=-4.96875 max=5.6875
Linear input=0 dtype=torch.bfloat16 min=-9.0 max=15.1875
Linear output=0 dtype=torch.bfloat16 min=-2.578125 max=3.234375
Linear input=0 dtype=torch.bfloat16 min=-2.390625 max=2.890625
Linear output=0 dtype=torch.bfloat16 min=-0.73046875 max=0.80078125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.73046875 max=0.80078125
LayerNorm input=0 dtype=torch.bfloat16 min=-68.0 max=17.625
LayerNorm output=0 dtype=torch.bfloat16 min=-9.5 max=8.3125
Linear input=0 dtype=torch.bfloat16 min=-9.5 max=8.3125
Linear output=0 dtype=torch.bfloat16 min=-9.0 max=5.8125
GELUActivation input=0 dtype=torch.bfloat16 min=-9.0 max=5.8125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.8125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.8125
Linear output=0 dtype=torch.bfloat16 min=-0.828125 max=0.91015625
CLIPMLP input=0 dtype=torch.bfloat16 min=-9.5 max=8.3125
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.828125 max=0.91015625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-68.0 max=17.5
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=17.625
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=17.625
LayerNorm output=0 dtype=torch.bfloat16 min=-9.5 max=15.0
Linear input=0 dtype=torch.bfloat16 min=-9.5 max=15.0
Linear output=0 dtype=torch.bfloat16 min=-5.9375 max=6.5
Linear input=0 dtype=torch.bfloat16 min=-9.5 max=15.0
Linear output=0 dtype=torch.bfloat16 min=-6.125 max=6.21875
Linear input=0 dtype=torch.bfloat16 min=-9.5 max=15.0
Linear output=0 dtype=torch.bfloat16 min=-4.21875 max=4.09375
Linear input=0 dtype=torch.bfloat16 min=-1.828125 max=3.53125
Linear output=0 dtype=torch.bfloat16 min=-0.54296875 max=0.5546875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.54296875 max=0.5546875
LayerNorm input=0 dtype=torch.bfloat16 min=-67.0 max=17.625
LayerNorm output=0 dtype=torch.bfloat16 min=-9.4375 max=8.8125
Linear input=0 dtype=torch.bfloat16 min=-9.4375 max=8.8125
Linear output=0 dtype=torch.bfloat16 min=-8.375 max=3.25
GELUActivation input=0 dtype=torch.bfloat16 min=-8.375 max=3.25
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.25
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.25
Linear output=0 dtype=torch.bfloat16 min=-1.2109375 max=1.046875
CLIPMLP input=0 dtype=torch.bfloat16 min=-9.4375 max=8.8125
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.2109375 max=1.046875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=17.625
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=17.625
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=17.625
LayerNorm output=0 dtype=torch.bfloat16 min=-10.375 max=15.5625
Linear input=0 dtype=torch.bfloat16 min=-10.375 max=15.5625
Linear output=0 dtype=torch.bfloat16 min=-6.96875 max=5.53125
Linear input=0 dtype=torch.bfloat16 min=-10.375 max=15.5625
Linear output=0 dtype=torch.bfloat16 min=-6.75 max=6.5
Linear input=0 dtype=torch.bfloat16 min=-10.375 max=15.5625
Linear output=0 dtype=torch.bfloat16 min=-3.640625 max=3.125
Linear input=0 dtype=torch.bfloat16 min=-1.578125 max=1.765625
Linear output=0 dtype=torch.bfloat16 min=-0.6875 max=0.84375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.6875 max=0.84375
LayerNorm input=0 dtype=torch.bfloat16 min=-67.0 max=17.625
LayerNorm output=0 dtype=torch.bfloat16 min=-14.6875 max=11.0625
Linear input=0 dtype=torch.bfloat16 min=-14.6875 max=11.0625
Linear output=0 dtype=torch.bfloat16 min=-6.59375 max=6.75
GELUActivation input=0 dtype=torch.bfloat16 min=-6.59375 max=6.75
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=6.75
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.75
Linear output=0 dtype=torch.bfloat16 min=-4.53125 max=0.8203125
CLIPMLP input=0 dtype=torch.bfloat16 min=-14.6875 max=11.0625
CLIPMLP output=0 dtype=torch.bfloat16 min=-4.53125 max=0.8203125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=17.625
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-68.0 max=17.875
LayerNorm input=0 dtype=torch.bfloat16 min=-68.0 max=17.875
LayerNorm output=0 dtype=torch.bfloat16 min=-9.1875 max=19.125
Linear input=0 dtype=torch.bfloat16 min=-9.1875 max=19.125
Linear output=0 dtype=torch.bfloat16 min=-6.0 max=6.0625
Linear input=0 dtype=torch.bfloat16 min=-9.1875 max=19.125
Linear output=0 dtype=torch.bfloat16 min=-5.84375 max=5.46875
Linear input=0 dtype=torch.bfloat16 min=-9.1875 max=19.125
Linear output=0 dtype=torch.bfloat16 min=-2.671875 max=3.296875
Linear input=0 dtype=torch.bfloat16 min=-1.703125 max=1.9453125
Linear output=0 dtype=torch.bfloat16 min=-0.5234375 max=1.0234375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.5234375 max=1.0234375
LayerNorm input=0 dtype=torch.bfloat16 min=-67.0 max=17.875
LayerNorm output=0 dtype=torch.bfloat16 min=-13.8125 max=9.125
Linear input=0 dtype=torch.bfloat16 min=-13.8125 max=9.125
Linear output=0 dtype=torch.bfloat16 min=-7.75 max=3.578125
GELUActivation input=0 dtype=torch.bfloat16 min=-7.75 max=3.578125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.578125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.578125
Linear output=0 dtype=torch.bfloat16 min=-0.90234375 max=0.84375
CLIPMLP input=0 dtype=torch.bfloat16 min=-13.8125 max=9.125
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.90234375 max=0.84375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-68.0 max=17.875
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=18.0
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=18.0
LayerNorm output=0 dtype=torch.bfloat16 min=-10.0625 max=23.0
Linear input=0 dtype=torch.bfloat16 min=-10.0625 max=23.0
Linear output=0 dtype=torch.bfloat16 min=-5.96875 max=6.6875
Linear input=0 dtype=torch.bfloat16 min=-10.0625 max=23.0
Linear output=0 dtype=torch.bfloat16 min=-6.6875 max=7.375
Linear input=0 dtype=torch.bfloat16 min=-10.0625 max=23.0
Linear output=0 dtype=torch.bfloat16 min=-2.515625 max=3.046875
Linear input=0 dtype=torch.bfloat16 min=-1.7109375 max=1.7265625
Linear output=0 dtype=torch.bfloat16 min=-0.44140625 max=0.5078125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.44140625 max=0.5078125
LayerNorm input=0 dtype=torch.bfloat16 min=-67.0 max=18.0
LayerNorm output=0 dtype=torch.bfloat16 min=-12.8125 max=8.5
Linear input=0 dtype=torch.bfloat16 min=-12.8125 max=8.5
Linear output=0 dtype=torch.bfloat16 min=-7.9375 max=4.9375
GELUActivation input=0 dtype=torch.bfloat16 min=-7.9375 max=4.9375
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.9375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.9375
Linear output=0 dtype=torch.bfloat16 min=-1.03125 max=0.6953125
CLIPMLP input=0 dtype=torch.bfloat16 min=-12.8125 max=8.5
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.03125 max=0.6953125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=18.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=18.125
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=18.125
LayerNorm output=0 dtype=torch.bfloat16 min=-10.6875 max=21.875
Linear input=0 dtype=torch.bfloat16 min=-10.6875 max=21.875
Linear output=0 dtype=torch.bfloat16 min=-5.9375 max=5.59375
Linear input=0 dtype=torch.bfloat16 min=-10.6875 max=21.875
Linear output=0 dtype=torch.bfloat16 min=-8.3125 max=8.5
Linear input=0 dtype=torch.bfloat16 min=-10.6875 max=21.875
Linear output=0 dtype=torch.bfloat16 min=-3.03125 max=2.859375
Linear input=0 dtype=torch.bfloat16 min=-1.2265625 max=1.3984375
Linear output=0 dtype=torch.bfloat16 min=-0.51171875 max=1.328125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.51171875 max=1.328125
LayerNorm input=0 dtype=torch.bfloat16 min=-66.5 max=18.125
LayerNorm output=0 dtype=torch.bfloat16 min=-12.625 max=9.0625
Linear input=0 dtype=torch.bfloat16 min=-12.625 max=9.0625
Linear output=0 dtype=torch.bfloat16 min=-8.1875 max=4.125
GELUActivation input=0 dtype=torch.bfloat16 min=-8.1875 max=4.125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.125
Linear output=0 dtype=torch.bfloat16 min=-1.4375 max=0.6875
CLIPMLP input=0 dtype=torch.bfloat16 min=-12.625 max=9.0625
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.4375 max=0.6875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=18.125
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-68.0 max=18.25
LayerNorm input=0 dtype=torch.bfloat16 min=-68.0 max=18.25
LayerNorm output=0 dtype=torch.bfloat16 min=-11.1875 max=22.625
Linear input=0 dtype=torch.bfloat16 min=-11.1875 max=22.625
Linear output=0 dtype=torch.bfloat16 min=-6.875 max=5.84375
Linear input=0 dtype=torch.bfloat16 min=-11.1875 max=22.625
Linear output=0 dtype=torch.bfloat16 min=-7.3125 max=6.84375
Linear input=0 dtype=torch.bfloat16 min=-11.1875 max=22.625
Linear output=0 dtype=torch.bfloat16 min=-4.25 max=3.28125
Linear input=0 dtype=torch.bfloat16 min=-1.71875 max=2.203125
Linear output=0 dtype=torch.bfloat16 min=-0.78515625 max=1.15625
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.78515625 max=1.15625
LayerNorm input=0 dtype=torch.bfloat16 min=-67.0 max=18.25
LayerNorm output=0 dtype=torch.bfloat16 min=-12.3125 max=8.75
Linear input=0 dtype=torch.bfloat16 min=-12.3125 max=8.75
Linear output=0 dtype=torch.bfloat16 min=-8.875 max=3.3125
GELUActivation input=0 dtype=torch.bfloat16 min=-8.875 max=3.3125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.3125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.3125
Linear output=0 dtype=torch.bfloat16 min=-1.3515625 max=0.5859375
CLIPMLP input=0 dtype=torch.bfloat16 min=-12.3125 max=8.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.3515625 max=0.5859375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-68.0 max=18.25
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=18.375
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=18.375
LayerNorm output=0 dtype=torch.bfloat16 min=-13.3125 max=21.375
Linear input=0 dtype=torch.bfloat16 min=-13.3125 max=21.375
Linear output=0 dtype=torch.bfloat16 min=-6.8125 max=6.5625
Linear input=0 dtype=torch.bfloat16 min=-13.3125 max=21.375
Linear output=0 dtype=torch.bfloat16 min=-7.15625 max=8.1875
Linear input=0 dtype=torch.bfloat16 min=-13.3125 max=21.375
Linear output=0 dtype=torch.bfloat16 min=-2.9375 max=3.046875
Linear input=0 dtype=torch.bfloat16 min=-1.1328125 max=1.5234375
Linear output=0 dtype=torch.bfloat16 min=-0.6640625 max=1.609375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.6640625 max=1.609375
LayerNorm input=0 dtype=torch.bfloat16 min=-66.0 max=18.375
LayerNorm output=0 dtype=torch.bfloat16 min=-13.1875 max=11.1875
Linear input=0 dtype=torch.bfloat16 min=-13.1875 max=11.1875
Linear output=0 dtype=torch.bfloat16 min=-8.6875 max=4.0625
GELUActivation input=0 dtype=torch.bfloat16 min=-8.6875 max=4.0625
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.0625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.0625
Linear output=0 dtype=torch.bfloat16 min=-2.21875 max=0.67578125
CLIPMLP input=0 dtype=torch.bfloat16 min=-13.1875 max=11.1875
CLIPMLP output=0 dtype=torch.bfloat16 min=-2.21875 max=0.67578125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=18.375
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=18.5
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=18.5
LayerNorm output=0 dtype=torch.bfloat16 min=-17.625 max=19.125
Linear input=0 dtype=torch.bfloat16 min=-17.625 max=19.125
Linear output=0 dtype=torch.bfloat16 min=-7.0625 max=8.375
Linear input=0 dtype=torch.bfloat16 min=-17.625 max=19.125
Linear output=0 dtype=torch.bfloat16 min=-9.25 max=7.78125
Linear input=0 dtype=torch.bfloat16 min=-17.625 max=19.125
Linear output=0 dtype=torch.bfloat16 min=-3.140625 max=3.515625
Linear input=0 dtype=torch.bfloat16 min=-1.2578125 max=1.796875
Linear output=0 dtype=torch.bfloat16 min=-0.59375 max=2.4375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.59375 max=2.4375
LayerNorm input=0 dtype=torch.bfloat16 min=-65.0 max=18.625
LayerNorm output=0 dtype=torch.bfloat16 min=-27.0 max=16.625
Linear input=0 dtype=torch.bfloat16 min=-27.0 max=16.625
Linear output=0 dtype=torch.bfloat16 min=-8.4375 max=6.25
GELUActivation input=0 dtype=torch.bfloat16 min=-8.4375 max=6.25
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=6.25
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.25
Linear output=0 dtype=torch.bfloat16 min=-4.59375 max=0.71875
CLIPMLP input=0 dtype=torch.bfloat16 min=-27.0 max=16.625
CLIPMLP output=0 dtype=torch.bfloat16 min=-4.59375 max=0.71875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=18.5
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=18.75
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-12.875 max=25.5
Linear input=0 dtype=torch.bfloat16 min=-12.875 max=25.5
Linear output=0 dtype=torch.bfloat16 min=-7.3125 max=8.125
Linear input=0 dtype=torch.bfloat16 min=-12.875 max=25.5
Linear output=0 dtype=torch.bfloat16 min=-9.875 max=9.0
Linear input=0 dtype=torch.bfloat16 min=-12.875 max=25.5
Linear output=0 dtype=torch.bfloat16 min=-3.40625 max=2.859375
Linear input=0 dtype=torch.bfloat16 min=-0.8671875 max=0.69140625
Linear output=0 dtype=torch.bfloat16 min=-0.373046875 max=2.25
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.373046875 max=2.25
LayerNorm input=0 dtype=torch.bfloat16 min=-65.5 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-19.625 max=18.75
Linear input=0 dtype=torch.bfloat16 min=-19.625 max=18.75
Linear output=0 dtype=torch.bfloat16 min=-7.03125 max=3.25
GELUActivation input=0 dtype=torch.bfloat16 min=-7.03125 max=3.25
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.25
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.25
Linear output=0 dtype=torch.bfloat16 min=-6.03125 max=0.86328125
CLIPMLP input=0 dtype=torch.bfloat16 min=-19.625 max=18.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-6.03125 max=0.86328125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=18.75
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-66.5 max=18.875
LayerNorm input=0 dtype=torch.bfloat16 min=-66.5 max=18.875
LayerNorm output=0 dtype=torch.bfloat16 min=-13.1875 max=26.0
Linear input=0 dtype=torch.bfloat16 min=-13.1875 max=26.0
Linear output=0 dtype=torch.bfloat16 min=-7.46875 max=8.0625
Linear input=0 dtype=torch.bfloat16 min=-13.1875 max=26.0
Linear output=0 dtype=torch.bfloat16 min=-8.75 max=9.25
Linear input=0 dtype=torch.bfloat16 min=-13.1875 max=26.0
Linear output=0 dtype=torch.bfloat16 min=-4.84375 max=4.125
Linear input=0 dtype=torch.bfloat16 min=-4.6875 max=4.0
Linear output=0 dtype=torch.bfloat16 min=-0.68359375 max=2.046875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.68359375 max=2.046875
LayerNorm input=0 dtype=torch.bfloat16 min=-64.5 max=18.875
LayerNorm output=0 dtype=torch.bfloat16 min=-11.9375 max=11.75
Linear input=0 dtype=torch.bfloat16 min=-11.9375 max=11.75
Linear output=0 dtype=torch.bfloat16 min=-11.0625 max=2.03125
GELUActivation input=0 dtype=torch.bfloat16 min=-11.0625 max=2.03125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=1.9921875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=1.9921875
Linear output=0 dtype=torch.bfloat16 min=-2.328125 max=0.765625
CLIPMLP input=0 dtype=torch.bfloat16 min=-11.9375 max=11.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-2.328125 max=0.765625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-66.5 max=18.875
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-66.5 max=19.0
LayerNorm input=0 dtype=torch.bfloat16 min=-66.5 max=19.0
LayerNorm output=0 dtype=torch.bfloat16 min=-11.5625 max=29.375
Linear input=0 dtype=torch.bfloat16 min=-11.5625 max=29.375
Linear output=0 dtype=torch.bfloat16 min=-7.71875 max=7.5625
Linear input=0 dtype=torch.bfloat16 min=-11.5625 max=29.375
Linear output=0 dtype=torch.bfloat16 min=-9.25 max=8.9375
Linear input=0 dtype=torch.bfloat16 min=-11.5625 max=29.375
Linear output=0 dtype=torch.bfloat16 min=-4.15625 max=4.90625
Linear input=0 dtype=torch.bfloat16 min=-1.8359375 max=1.546875
Linear output=0 dtype=torch.bfloat16 min=-0.66015625 max=1.453125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.66015625 max=1.453125
LayerNorm input=0 dtype=torch.bfloat16 min=-65.0 max=19.0
LayerNorm output=0 dtype=torch.bfloat16 min=-19.5 max=17.875
Linear input=0 dtype=torch.bfloat16 min=-19.5 max=17.875
Linear output=0 dtype=torch.bfloat16 min=-6.875 max=2.796875
GELUActivation input=0 dtype=torch.bfloat16 min=-6.875 max=2.796875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.796875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.796875
Linear output=0 dtype=torch.bfloat16 min=-3.28125 max=1.1015625
CLIPMLP input=0 dtype=torch.bfloat16 min=-19.5 max=17.875
CLIPMLP output=0 dtype=torch.bfloat16 min=-3.28125 max=1.1015625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-66.5 max=19.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-65.5 max=18.75
LayerNorm input=0 dtype=torch.bfloat16 min=-65.5 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-14.625 max=24.375
Linear input=0 dtype=torch.bfloat16 min=-14.625 max=24.375
Linear output=0 dtype=torch.bfloat16 min=-7.71875 max=7.625
Linear input=0 dtype=torch.bfloat16 min=-14.625 max=24.375
Linear output=0 dtype=torch.bfloat16 min=-9.875 max=9.4375
Linear input=0 dtype=torch.bfloat16 min=-14.625 max=24.375
Linear output=0 dtype=torch.bfloat16 min=-3.8125 max=3.484375
Linear input=0 dtype=torch.bfloat16 min=-2.703125 max=3.296875
Linear output=0 dtype=torch.bfloat16 min=-0.75 max=1.2265625
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.75 max=1.2265625
LayerNorm input=0 dtype=torch.bfloat16 min=-64.5 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-11.8125 max=12.125
Linear input=0 dtype=torch.bfloat16 min=-11.8125 max=12.125
Linear output=0 dtype=torch.bfloat16 min=-9.625 max=2.140625
GELUActivation input=0 dtype=torch.bfloat16 min=-9.625 max=2.140625
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.109375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.109375
Linear output=0 dtype=torch.bfloat16 min=-1.8828125 max=0.90625
CLIPMLP input=0 dtype=torch.bfloat16 min=-11.8125 max=12.125
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.8828125 max=0.90625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-65.5 max=18.75
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-65.5 max=18.75
LayerNorm input=0 dtype=torch.bfloat16 min=-65.5 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-16.5 max=26.75
Linear input=0 dtype=torch.bfloat16 min=-16.5 max=26.75
Linear output=0 dtype=torch.bfloat16 min=-8.25 max=7.78125
Linear input=0 dtype=torch.bfloat16 min=-16.5 max=26.75
Linear output=0 dtype=torch.bfloat16 min=-9.5625 max=9.6875
Linear input=0 dtype=torch.bfloat16 min=-16.5 max=26.75
Linear output=0 dtype=torch.bfloat16 min=-3.015625 max=2.9375
Linear input=0 dtype=torch.bfloat16 min=-0.91796875 max=1.15625
Linear output=0 dtype=torch.bfloat16 min=-0.64453125 max=1.6328125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.64453125 max=1.6328125
LayerNorm input=0 dtype=torch.bfloat16 min=-64.0 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-11.125 max=12.1875
Linear input=0 dtype=torch.bfloat16 min=-11.125 max=12.1875
Linear output=0 dtype=torch.bfloat16 min=-8.25 max=3.96875
GELUActivation input=0 dtype=torch.bfloat16 min=-8.25 max=3.96875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.96875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.96875
Linear output=0 dtype=torch.bfloat16 min=-1.1875 max=0.796875
CLIPMLP input=0 dtype=torch.bfloat16 min=-11.125 max=12.1875
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.1875 max=0.796875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-65.5 max=18.75
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-64.0 max=18.75
LayerNorm input=0 dtype=torch.bfloat16 min=-64.0 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-12.6875 max=25.75
Linear input=0 dtype=torch.bfloat16 min=-12.6875 max=25.75
Linear output=0 dtype=torch.bfloat16 min=-6.96875 max=7.71875
Linear input=0 dtype=torch.bfloat16 min=-12.6875 max=25.75
Linear output=0 dtype=torch.bfloat16 min=-9.125 max=8.9375
Linear input=0 dtype=torch.bfloat16 min=-12.6875 max=25.75
Linear output=0 dtype=torch.bfloat16 min=-3.890625 max=4.1875
Linear input=0 dtype=torch.bfloat16 min=-3.828125 max=3.15625
Linear output=0 dtype=torch.bfloat16 min=-0.70703125 max=1.0
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.70703125 max=1.0
LayerNorm input=0 dtype=torch.bfloat16 min=-63.75 max=18.875
LayerNorm output=0 dtype=torch.bfloat16 min=-10.125 max=18.75
Linear input=0 dtype=torch.bfloat16 min=-10.125 max=18.75
Linear output=0 dtype=torch.bfloat16 min=-8.125 max=3.015625
GELUActivation input=0 dtype=torch.bfloat16 min=-8.125 max=3.015625
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.015625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.015625
Linear output=0 dtype=torch.bfloat16 min=-2.671875 max=1.109375
CLIPMLP input=0 dtype=torch.bfloat16 min=-10.125 max=18.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-2.671875 max=1.109375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-64.0 max=18.75
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-63.5 max=18.875
LayerNorm input=0 dtype=torch.bfloat16 min=-63.5 max=18.875
LayerNorm output=0 dtype=torch.bfloat16 min=-16.25 max=20.625
Linear input=0 dtype=torch.bfloat16 min=-16.25 max=20.625
Linear output=0 dtype=torch.bfloat16 min=-6.5 max=7.84375
Linear input=0 dtype=torch.bfloat16 min=-16.25 max=20.625
Linear output=0 dtype=torch.bfloat16 min=-8.9375 max=8.5625
Linear input=0 dtype=torch.bfloat16 min=-16.25 max=20.625
Linear output=0 dtype=torch.bfloat16 min=-2.796875 max=3.640625
Linear input=0 dtype=torch.bfloat16 min=-1.953125 max=2.5625
Linear output=0 dtype=torch.bfloat16 min=-1.4375 max=1.7421875
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.4375 max=1.7421875
LayerNorm input=0 dtype=torch.bfloat16 min=-65.0 max=19.125
LayerNorm output=0 dtype=torch.bfloat16 min=-13.5 max=20.625
Linear input=0 dtype=torch.bfloat16 min=-13.5 max=20.625
Linear output=0 dtype=torch.bfloat16 min=-7.03125 max=4.78125
GELUActivation input=0 dtype=torch.bfloat16 min=-7.03125 max=4.78125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.78125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.78125
Linear output=0 dtype=torch.bfloat16 min=-2.6875 max=5.03125
CLIPMLP input=0 dtype=torch.bfloat16 min=-13.5 max=20.625
CLIPMLP output=0 dtype=torch.bfloat16 min=-2.6875 max=5.03125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-63.5 max=18.875
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-60.0 max=19.125
LayerNorm input=0 dtype=torch.bfloat16 min=-60.0 max=19.125
LayerNorm output=0 dtype=torch.bfloat16 min=-13.25 max=25.375
Linear input=0 dtype=torch.bfloat16 min=-13.25 max=25.375
Linear output=0 dtype=torch.bfloat16 min=-6.5 max=6.15625
Linear input=0 dtype=torch.bfloat16 min=-13.25 max=25.375
Linear output=0 dtype=torch.bfloat16 min=-9.125 max=8.0625
Linear input=0 dtype=torch.bfloat16 min=-13.25 max=25.375
Linear output=0 dtype=torch.bfloat16 min=-3.265625 max=3.4375
Linear input=0 dtype=torch.bfloat16 min=-1.78125 max=1.8046875
Linear output=0 dtype=torch.bfloat16 min=-2.640625 max=0.98046875
CLIPAttention output=0 dtype=torch.bfloat16 min=-2.640625 max=0.98046875
LayerNorm input=0 dtype=torch.bfloat16 min=-62.5 max=19.25
LayerNorm output=0 dtype=torch.bfloat16 min=-23.0 max=12.6875
Linear input=0 dtype=torch.bfloat16 min=-23.0 max=12.6875
Linear output=0 dtype=torch.bfloat16 min=-8.625 max=5.90625
GELUActivation input=0 dtype=torch.bfloat16 min=-8.625 max=5.90625
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.90625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.90625
Linear output=0 dtype=torch.bfloat16 min=-1.609375 max=5.53125
CLIPMLP input=0 dtype=torch.bfloat16 min=-23.0 max=12.6875
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.609375 max=5.53125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-60.0 max=19.125
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-59.25 max=19.125
LayerNorm input=0 dtype=torch.bfloat16 min=-59.25 max=19.125
LayerNorm output=0 dtype=torch.bfloat16 min=-12.0 max=24.75
Linear input=0 dtype=torch.bfloat16 min=-12.0 max=24.75
Linear output=0 dtype=torch.bfloat16 min=-6.28125 max=6.65625
Linear input=0 dtype=torch.bfloat16 min=-12.0 max=24.75
Linear output=0 dtype=torch.bfloat16 min=-7.46875 max=7.46875
Linear input=0 dtype=torch.bfloat16 min=-12.0 max=24.75
Linear output=0 dtype=torch.bfloat16 min=-2.8125 max=3.125
Linear input=0 dtype=torch.bfloat16 min=-2.265625 max=2.34375
Linear output=0 dtype=torch.bfloat16 min=-6.1875 max=2.03125
CLIPAttention output=0 dtype=torch.bfloat16 min=-6.1875 max=2.03125
LayerNorm input=0 dtype=torch.bfloat16 min=-65.5 max=19.125
LayerNorm output=0 dtype=torch.bfloat16 min=-25.25 max=17.25
Linear input=0 dtype=torch.bfloat16 min=-25.25 max=17.25
Linear output=0 dtype=torch.bfloat16 min=-8.25 max=3.8125
GELUActivation input=0 dtype=torch.bfloat16 min=-8.25 max=3.8125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.8125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.8125
Linear output=0 dtype=torch.bfloat16 min=-2.40625 max=5.90625
CLIPMLP input=0 dtype=torch.bfloat16 min=-25.25 max=17.25
CLIPMLP output=0 dtype=torch.bfloat16 min=-2.40625 max=5.90625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-59.25 max=19.125
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-59.5 max=18.75
LayerNorm input=0 dtype=torch.bfloat16 min=-59.5 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-15.4375 max=21.25
Linear input=0 dtype=torch.bfloat16 min=-15.4375 max=21.25
Linear output=0 dtype=torch.bfloat16 min=-6.4375 max=6.65625
Linear input=0 dtype=torch.bfloat16 min=-15.4375 max=21.25
Linear output=0 dtype=torch.bfloat16 min=-5.5 max=5.9375
Linear input=0 dtype=torch.bfloat16 min=-15.4375 max=21.25
Linear output=0 dtype=torch.bfloat16 min=-3.578125 max=4.25
Linear input=0 dtype=torch.bfloat16 min=-1.703125 max=1.4296875
Linear output=0 dtype=torch.bfloat16 min=-9.75 max=2.28125
CLIPAttention output=0 dtype=torch.bfloat16 min=-9.75 max=2.28125
LayerNorm input=0 dtype=torch.bfloat16 min=-68.0 max=18.875
LayerNorm output=0 dtype=torch.bfloat16 min=-24.5 max=22.0
Linear input=0 dtype=torch.bfloat16 min=-24.5 max=22.0
Linear output=0 dtype=torch.bfloat16 min=-6.96875 max=4.1875
GELUActivation input=0 dtype=torch.bfloat16 min=-6.96875 max=4.1875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.1875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-3.8125 max=11.1875
CLIPMLP input=0 dtype=torch.bfloat16 min=-24.5 max=22.0
CLIPMLP output=0 dtype=torch.bfloat16 min=-3.8125 max=11.1875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-59.5 max=18.75
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-66.5 max=18.125
LayerNorm input=0 dtype=torch.bfloat16 min=-66.5 max=18.125
LayerNorm output=0 dtype=torch.bfloat16 min=-12.4375 max=31.0
Linear input=0 dtype=torch.bfloat16 min=-12.4375 max=31.0
Linear output=0 dtype=torch.bfloat16 min=-6.5 max=7.0625
Linear input=0 dtype=torch.bfloat16 min=-12.4375 max=31.0
Linear output=0 dtype=torch.bfloat16 min=-6.9375 max=7.15625
Linear input=0 dtype=torch.bfloat16 min=-12.4375 max=31.0
Linear output=0 dtype=torch.bfloat16 min=-4.15625 max=2.84375
Linear input=0 dtype=torch.bfloat16 min=-3.671875 max=2.328125
Linear output=0 dtype=torch.bfloat16 min=-6.09375 max=4.21875
CLIPAttention output=0 dtype=torch.bfloat16 min=-6.09375 max=4.21875
LayerNorm input=0 dtype=torch.bfloat16 min=-72.0 max=18.125
LayerNorm output=0 dtype=torch.bfloat16 min=-15.125 max=18.5
Linear input=0 dtype=torch.bfloat16 min=-15.125 max=18.5
Linear output=0 dtype=torch.bfloat16 min=-6.28125 max=3.609375
GELUActivation input=0 dtype=torch.bfloat16 min=-6.28125 max=3.609375
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.609375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.609375
Linear output=0 dtype=torch.bfloat16 min=-6.75 max=8.6875
CLIPMLP input=0 dtype=torch.bfloat16 min=-15.125 max=18.5
CLIPMLP output=0 dtype=torch.bfloat16 min=-6.75 max=8.6875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-66.5 max=18.125
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=18.625
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=18.625
LayerNorm output=0 dtype=torch.bfloat16 min=-33.25 max=26.25
Linear input=0 dtype=torch.bfloat16 min=-5.0 max=6.53125
Linear output=0 dtype=torch.bfloat16 min=-4.21875 max=4.03125
CLIPTextModelWithProjection input=0 dtype=torch.int64 min=0 max=49407
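
Every line in this trace follows one pattern: ClassName input|output=<arg index> dtype=... min=... max=..., emitted once per module call. A trace in this format can be reproduced with ordinary PyTorch forward hooks; the sketch below is a minimal illustration of that technique under stated assumptions (the helper names _stat and attach_activation_logger are hypothetical, and this is not necessarily the script that produced this gist):

import torch

def _stat(tag, idx, t):
    # Mirror the log format above: "<ClassName> input=0 dtype=torch.bfloat16 min=... max=..."
    print(f"{tag}={idx} dtype={t.dtype} min={t.min().item()} max={t.max().item()}")

def attach_activation_logger(model):
    # Hypothetical helper: hook every module, containers included, which is why
    # e.g. CLIPAttention appears in the trace alongside its inner Linear layers.
    def hook(module, inputs, outputs):
        name = module.__class__.__name__
        outs = outputs if isinstance(outputs, tuple) else (outputs,)
        for i, t in enumerate(inputs):
            if torch.is_tensor(t) and t.numel() > 0:
                _stat(f"{name} input", i, t)
        for i, t in enumerate(outs):
            if torch.is_tensor(t) and t.numel() > 0:
                _stat(f"{name} output", i, t)
    return [m.register_forward_hook(hook) for m in model.modules()]

Printing only tensor arguments would also explain the gaps in the indices (a CLIPEncoderLayer logs input=0 and input=2 but no input=1, consistent with a None argument in between) and why top-level modules whose outputs are dataclasses rather than tensors print no output line at all.
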
Embedding input=0 dtype=torch.int64 min=0 max=21820
Embedding output=0 dtype=torch.bfloat16 min=-102.0 max=231.0
Dropout input=0 dtype=torch.bfloat16 min=-102.0 max=231.0
Dropout output=0 dtype=torch.bfloat16 min=-102.0 max=231.0
T5LayerNorm input=0 dtype=torch.bfloat16 min=-102.0 max=231.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.109375 max=1.203125
Linear input=0 dtype=torch.bfloat16 min=-2.109375 max=1.203125
Linear output=0 dtype=torch.bfloat16 min=-0.52734375 max=0.490234375
Linear input=0 dtype=torch.bfloat16 min=-2.109375 max=1.203125
Linear output=0 dtype=torch.bfloat16 min=-4.9375 max=5.03125
Linear input=0 dtype=torch.bfloat16 min=-2.109375 max=1.203125
Linear output=0 dtype=torch.bfloat16 min=-4.15625 max=5.28125
Embedding input=0 dtype=torch.int64 min=0 max=30
Embedding output=0..76 dtype=torch.bfloat16 min=-47.25 max=11.1875 (all 77 outputs identical)
Linear input=0 dtype=torch.bfloat16 min=-3.375 max=3.09375
Linear output=0 dtype=torch.bfloat16 min=-120.5 max=94.5
T5Attention input=0 dtype=torch.bfloat16 min=-2.109375 max=1.203125
T5Attention output=0 dtype=torch.bfloat16 min=-120.5 max=94.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-120.5 max=94.5
Dropout output=0 dtype=torch.bfloat16 min=-120.5 max=94.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-102.0 max=231.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-135.0 max=233.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-135.0 max=233.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.296875 max=2.296875
Linear input=0 dtype=torch.bfloat16 min=-2.296875 max=2.296875
Linear output=0 dtype=torch.bfloat16 min=-6.0625 max=6.4375
NewGELUActivation input=0 dtype=torch.bfloat16 min=-6.0625 max=6.4375
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=6.4375
Linear input=0 dtype=torch.bfloat16 min=-2.296875 max=2.296875
Linear output=0 dtype=torch.bfloat16 min=-5.625 max=7.65625
Dropout input=0 dtype=torch.bfloat16 min=-25.25 max=25.125
Dropout output=0 dtype=torch.bfloat16 min=-25.25 max=25.125
Linear input=0 dtype=torch.bfloat16 min=-25.25 max=25.125
Linear output=0 dtype=torch.bfloat16 min=-119.5 max=106.5
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.296875 max=2.296875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-119.5 max=106.5
Dropout input=0 dtype=torch.bfloat16 min=-119.5 max=106.5
Dropout output=0 dtype=torch.bfloat16 min=-119.5 max=106.5
T5LayerFF input=0 dtype=torch.bfloat16 min=-135.0 max=233.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-204.0 max=235.0
T5Block input=0 dtype=torch.bfloat16 min=-102.0 max=231.0
T5Block output=0 dtype=torch.bfloat16 min=-204.0 max=235.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-204.0 max=235.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-3.53125 max=5.96875
Linear input=0 dtype=torch.bfloat16 min=-3.53125 max=5.96875
Linear output=0 dtype=torch.bfloat16 min=-1.234375 max=1.109375
Linear input=0 dtype=torch.bfloat16 min=-3.53125 max=5.96875
Linear output=0 dtype=torch.bfloat16 min=-7.375 max=6.46875
Linear input=0 dtype=torch.bfloat16 min=-3.53125 max=5.96875
Linear output=0 dtype=torch.bfloat16 min=-3.5625 max=3.890625
Linear input=0 dtype=torch.bfloat16 min=-3.125 max=2.96875
Linear output=0 dtype=torch.bfloat16 min=-80.5 max=96.0
T5Attention input=0 dtype=torch.bfloat16 min=-3.53125 max=5.96875
T5Attention output=0 dtype=torch.bfloat16 min=-80.5 max=96.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-80.5 max=96.0
Dropout output=0 dtype=torch.bfloat16 min=-80.5 max=96.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-204.0 max=235.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-197.0 max=231.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-197.0 max=231.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-4.78125 max=4.28125
Linear input=0 dtype=torch.bfloat16 min=-4.78125 max=4.28125
Linear output=0 dtype=torch.bfloat16 min=-5.40625 max=9.1875
NewGELUActivation input=0 dtype=torch.bfloat16 min=-5.40625 max=9.1875
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=9.1875
Linear input=0 dtype=torch.bfloat16 min=-4.78125 max=4.28125
Linear output=0 dtype=torch.bfloat16 min=-4.15625 max=5.28125
Dropout input=0 dtype=torch.bfloat16 min=-16.625 max=48.5
Dropout output=0 dtype=torch.bfloat16 min=-16.625 max=48.5
Linear input=0 dtype=torch.bfloat16 min=-16.625 max=48.5
Linear output=0 dtype=torch.bfloat16 min=-324.0 max=302.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-4.78125 max=4.28125
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-324.0 max=302.0
Dropout input=0 dtype=torch.bfloat16 min=-324.0 max=302.0
Dropout output=0 dtype=torch.bfloat16 min=-324.0 max=302.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-197.0 max=231.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-500.0 max=452.0
T5Block input=0 dtype=torch.bfloat16 min=-204.0 max=235.0
T5Block output=0 dtype=torch.bfloat16 min=-500.0 max=452.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-500.0 max=452.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-4.09375 max=5.875
Linear input=0 dtype=torch.bfloat16 min=-4.09375 max=5.875
Linear output=0 dtype=torch.bfloat16 min=-0.6796875 max=0.6796875
Linear input=0 dtype=torch.bfloat16 min=-4.09375 max=5.875
Linear output=0 dtype=torch.bfloat16 min=-7.46875 max=8.75
Linear input=0 dtype=torch.bfloat16 min=-4.09375 max=5.875
Linear output=0 dtype=torch.bfloat16 min=-3.046875 max=2.8125
Linear input=0 dtype=torch.bfloat16 min=-2.359375 max=2.46875
Linear output=0 dtype=torch.bfloat16 min=-156.0 max=158.0
T5Attention input=0 dtype=torch.bfloat16 min=-4.09375 max=5.875
T5Attention output=0 dtype=torch.bfloat16 min=-156.0 max=158.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-156.0 max=158.0
Dropout output=0 dtype=torch.bfloat16 min=-156.0 max=158.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-500.0 max=452.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-656.0 max=608.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-656.0 max=608.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-3.609375 max=5.25
Linear input=0 dtype=torch.bfloat16 min=-3.609375 max=5.25
Linear output=0 dtype=torch.bfloat16 min=-8.8125 max=6.125
NewGELUActivation input=0 dtype=torch.bfloat16 min=-8.8125 max=6.125
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=6.125
Linear input=0 dtype=torch.bfloat16 min=-3.609375 max=5.25
Linear output=0 dtype=torch.bfloat16 min=-6.375 max=5.0
Dropout input=0 dtype=torch.bfloat16 min=-17.125 max=22.75
Dropout output=0 dtype=torch.bfloat16 min=-17.125 max=22.75
Linear input=0 dtype=torch.bfloat16 min=-17.125 max=22.75
Linear output=0 dtype=torch.bfloat16 min=-111.0 max=110.5
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-3.609375 max=5.25
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-111.0 max=110.5
Dropout input=0 dtype=torch.bfloat16 min=-111.0 max=110.5
Dropout output=0 dtype=torch.bfloat16 min=-111.0 max=110.5
T5LayerFF input=0 dtype=torch.bfloat16 min=-656.0 max=608.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-724.0 max=680.0
T5Block input=0 dtype=torch.bfloat16 min=-500.0 max=452.0
T5Block output=0 dtype=torch.bfloat16 min=-724.0 max=680.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-724.0 max=680.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.265625 max=3.609375
Linear input=0 dtype=torch.bfloat16 min=-2.265625 max=3.609375
Linear output=0 dtype=torch.bfloat16 min=-0.6875 max=0.8125
Linear input=0 dtype=torch.bfloat16 min=-2.265625 max=3.609375
Linear output=0 dtype=torch.bfloat16 min=-7.59375 max=6.625
Linear input=0 dtype=torch.bfloat16 min=-2.265625 max=3.609375
Linear output=0 dtype=torch.bfloat16 min=-3.640625 max=3.09375
Linear input=0 dtype=torch.bfloat16 min=-3.125 max=2.78125
Linear output=0 dtype=torch.bfloat16 min=-74.0 max=98.5
T5Attention input=0 dtype=torch.bfloat16 min=-2.265625 max=3.609375
T5Attention output=0 dtype=torch.bfloat16 min=-74.0 max=98.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-74.0 max=98.5
Dropout output=0 dtype=torch.bfloat16 min=-74.0 max=98.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-724.0 max=680.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-760.0 max=736.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-760.0 max=736.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.3125 max=4.75
Linear input=0 dtype=torch.bfloat16 min=-2.3125 max=4.75
Linear output=0 dtype=torch.bfloat16 min=-8.75 max=7.09375
NewGELUActivation input=0 dtype=torch.bfloat16 min=-8.75 max=7.09375
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=7.09375
Linear input=0 dtype=torch.bfloat16 min=-2.3125 max=4.75
Linear output=0 dtype=torch.bfloat16 min=-6.96875 max=5.75
Dropout input=0 dtype=torch.bfloat16 min=-20.25 max=18.5
Dropout output=0 dtype=torch.bfloat16 min=-20.25 max=18.5
Linear input=0 dtype=torch.bfloat16 min=-20.25 max=18.5
Linear output=0 dtype=torch.bfloat16 min=-195.0 max=201.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.3125 max=4.75
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-195.0 max=201.0
Dropout input=0 dtype=torch.bfloat16 min=-195.0 max=201.0
Dropout output=0 dtype=torch.bfloat16 min=-195.0 max=201.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-760.0 max=736.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-828.0 max=804.0
T5Block input=0 dtype=torch.bfloat16 min=-724.0 max=680.0
T5Block output=0 dtype=torch.bfloat16 min=-828.0 max=804.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-828.0 max=804.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.4375 max=2.265625
Linear input=0 dtype=torch.bfloat16 min=-1.4375 max=2.265625
Linear output=0 dtype=torch.bfloat16 min=-0.86328125 max=0.8671875
Linear input=0 dtype=torch.bfloat16 min=-1.4375 max=2.265625
Linear output=0 dtype=torch.bfloat16 min=-6.46875 max=6.78125
Linear input=0 dtype=torch.bfloat16 min=-1.4375 max=2.265625
Linear output=0 dtype=torch.bfloat16 min=-2.984375 max=3.296875
Linear input=0 dtype=torch.bfloat16 min=-2.890625 max=3.203125
Linear output=0 dtype=torch.bfloat16 min=-95.0 max=116.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.4375 max=2.265625
T5Attention output=0 dtype=torch.bfloat16 min=-95.0 max=116.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-95.0 max=116.5
Dropout output=0 dtype=torch.bfloat16 min=-95.0 max=116.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-828.0 max=804.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-848.0 max=824.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-848.0 max=824.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.5078125 max=3.1875
Linear input=0 dtype=torch.bfloat16 min=-1.5078125 max=3.1875
Linear output=0 dtype=torch.bfloat16 min=-11.6875 max=5.96875
NewGELUActivation input=0 dtype=torch.bfloat16 min=-11.6875 max=5.96875
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=5.96875
Linear input=0 dtype=torch.bfloat16 min=-1.5078125 max=3.1875
Linear output=0 dtype=torch.bfloat16 min=-9.125 max=7.15625
Dropout input=0 dtype=torch.bfloat16 min=-54.5 max=29.625
Dropout output=0 dtype=torch.bfloat16 min=-54.5 max=29.625
Linear input=0 dtype=torch.bfloat16 min=-54.5 max=29.625
Linear output=0 dtype=torch.bfloat16 min=-268.0 max=274.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-1.5078125 max=3.1875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-268.0 max=274.0
Dropout input=0 dtype=torch.bfloat16 min=-268.0 max=274.0
Dropout output=0 dtype=torch.bfloat16 min=-268.0 max=274.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-848.0 max=824.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-896.0 max=876.0
T5Block input=0 dtype=torch.bfloat16 min=-828.0 max=804.0
T5Block output=0 dtype=torch.bfloat16 min=-896.0 max=876.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-896.0 max=876.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.71875 max=1.5703125
Linear input=0 dtype=torch.bfloat16 min=-1.71875 max=1.5703125
Linear output=0 dtype=torch.bfloat16 min=-1.203125 max=1.3046875
Linear input=0 dtype=torch.bfloat16 min=-1.71875 max=1.5703125
Linear output=0 dtype=torch.bfloat16 min=-9.0625 max=9.6875
Linear input=0 dtype=torch.bfloat16 min=-1.71875 max=1.5703125
Linear output=0 dtype=torch.bfloat16 min=-3.1875 max=3.140625
Linear input=0 dtype=torch.bfloat16 min=-2.890625 max=3.109375
Linear output=0 dtype=torch.bfloat16 min=-130.0 max=131.0
T5Attention input=0 dtype=torch.bfloat16 min=-1.71875 max=1.5703125
T5Attention output=0 dtype=torch.bfloat16 min=-130.0 max=131.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-130.0 max=131.0
Dropout output=0 dtype=torch.bfloat16 min=-130.0 max=131.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-896.0 max=876.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-956.0 max=944.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-956.0 max=944.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.296875 max=2.546875
Linear input=0 dtype=torch.bfloat16 min=-1.296875 max=2.546875
Linear output=0 dtype=torch.bfloat16 min=-7.5625 max=5.625
NewGELUActivation input=0 dtype=torch.bfloat16 min=-7.5625 max=5.625
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=5.625
Linear input=0 dtype=torch.bfloat16 min=-1.296875 max=2.546875
Linear output=0 dtype=torch.bfloat16 min=-5.09375 max=6.03125
Dropout input=0 dtype=torch.bfloat16 min=-10.1875 max=27.375
Dropout output=0 dtype=torch.bfloat16 min=-10.1875 max=27.375
Linear input=0 dtype=torch.bfloat16 min=-10.1875 max=27.375
Linear output=0 dtype=torch.bfloat16 min=-217.0 max=211.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-1.296875 max=2.546875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-217.0 max=211.0
Dropout input=0 dtype=torch.bfloat16 min=-217.0 max=211.0
Dropout output=0 dtype=torch.bfloat16 min=-217.0 max=211.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-956.0 max=944.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-1096.0 max=1144.0
T5Block input=0 dtype=torch.bfloat16 min=-896.0 max=876.0
T5Block output=0 dtype=torch.bfloat16 min=-1096.0 max=1144.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-1096.0 max=1144.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.015625 max=1.5390625
Linear input=0 dtype=torch.bfloat16 min=-1.015625 max=1.5390625
Linear output=0 dtype=torch.bfloat16 min=-0.8984375 max=0.84765625
Linear input=0 dtype=torch.bfloat16 min=-1.015625 max=1.5390625
Linear output=0 dtype=torch.bfloat16 min=-7.96875 max=7.625
Linear input=0 dtype=torch.bfloat16 min=-1.015625 max=1.5390625
Linear output=0 dtype=torch.bfloat16 min=-2.9375 max=3.46875
Linear input=0 dtype=torch.bfloat16 min=-2.328125 max=2.796875
Linear output=0 dtype=torch.bfloat16 min=-84.0 max=75.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.015625 max=1.5390625
T5Attention output=0 dtype=torch.bfloat16 min=-84.0 max=75.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-84.0 max=75.5
Dropout output=0 dtype=torch.bfloat16 min=-84.0 max=75.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-1096.0 max=1144.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-1152.0 max=1216.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-1152.0 max=1216.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-0.921875 max=2.09375
Linear input=0 dtype=torch.bfloat16 min=-0.921875 max=2.09375
Linear output=0 dtype=torch.bfloat16 min=-6.15625 max=4.5
NewGELUActivation input=0 dtype=torch.bfloat16 min=-6.15625 max=4.5
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=4.5
Linear input=0 dtype=torch.bfloat16 min=-0.921875 max=2.09375
Linear output=0 dtype=torch.bfloat16 min=-4.96875 max=4.28125
Dropout input=0 dtype=torch.bfloat16 min=-14.8125 max=11.625
Dropout output=0 dtype=torch.bfloat16 min=-14.8125 max=11.625
Linear input=0 dtype=torch.bfloat16 min=-14.8125 max=11.625
Linear output=0 dtype=torch.bfloat16 min=-213.0 max=264.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-0.921875 max=2.09375
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-213.0 max=264.0
Dropout input=0 dtype=torch.bfloat16 min=-213.0 max=264.0
Dropout output=0 dtype=torch.bfloat16 min=-213.0 max=264.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-1152.0 max=1216.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-1232.0 max=1288.0
T5Block input=0 dtype=torch.bfloat16 min=-1096.0 max=1144.0
T5Block output=0 dtype=torch.bfloat16 min=-1232.0 max=1288.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-1232.0 max=1288.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-0.96484375 max=1.2890625
Linear input=0 dtype=torch.bfloat16 min=-0.96484375 max=1.2890625
Linear output=0 dtype=torch.bfloat16 min=-0.78125 max=0.875
Linear input=0 dtype=torch.bfloat16 min=-0.96484375 max=1.2890625
Linear output=0 dtype=torch.bfloat16 min=-6.0625 max=5.46875
Linear input=0 dtype=torch.bfloat16 min=-0.96484375 max=1.2890625
Linear output=0 dtype=torch.bfloat16 min=-2.953125 max=2.75
Linear input=0 dtype=torch.bfloat16 min=-2.28125 max=2.46875
Linear output=0 dtype=torch.bfloat16 min=-51.25 max=62.0
T5Attention input=0 dtype=torch.bfloat16 min=-0.96484375 max=1.2890625
T5Attention output=0 dtype=torch.bfloat16 min=-51.25 max=62.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-51.25 max=62.0
Dropout output=0 dtype=torch.bfloat16 min=-51.25 max=62.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-1232.0 max=1288.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-1248.0 max=1320.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-1248.0 max=1320.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-4.1875 max=1.9453125
Linear input=0 dtype=torch.bfloat16 min=-4.1875 max=1.9453125
Linear output=0 dtype=torch.bfloat16 min=-6.4375 max=28.25
NewGELUActivation input=0 dtype=torch.bfloat16 min=-6.4375 max=28.25
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=28.25
Linear input=0 dtype=torch.bfloat16 min=-4.1875 max=1.9453125
Linear output=0 dtype=torch.bfloat16 min=-56.75 max=44.25
Dropout input=0 dtype=torch.bfloat16 min=-1568.0 max=1064.0
Dropout output=0 dtype=torch.bfloat16 min=-1568.0 max=1064.0
Linear input=0 dtype=torch.bfloat16 min=-1568.0 max=1064.0
Linear output=0 dtype=torch.bfloat16 min=-36864.0 max=39168.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-4.1875 max=1.9453125
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-36864.0 max=39168.0
Dropout input=0 dtype=torch.bfloat16 min=-36864.0 max=39168.0
Dropout output=0 dtype=torch.bfloat16 min=-36864.0 max=39168.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-1248.0 max=1320.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-37120.0 max=39680.0
T5Block input=0 dtype=torch.bfloat16 min=-1232.0 max=1288.0
T5Block output=0 dtype=torch.bfloat16 min=-37120.0 max=39680.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-37120.0 max=39680.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.2265625 max=1.8984375
Linear input=0 dtype=torch.bfloat16 min=-1.2265625 max=1.8984375
Linear output=0 dtype=torch.bfloat16 min=-0.6796875 max=0.734375
Linear input=0 dtype=torch.bfloat16 min=-1.2265625 max=1.8984375
Linear output=0 dtype=torch.bfloat16 min=-5.5 max=7.28125
Linear input=0 dtype=torch.bfloat16 min=-1.2265625 max=1.8984375
Linear output=0 dtype=torch.bfloat16 min=-2.625 max=2.59375
Linear input=0 dtype=torch.bfloat16 min=-2.484375 max=1.984375
Linear output=0 dtype=torch.bfloat16 min=-47.75 max=55.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.2265625 max=1.8984375
T5Attention output=0 dtype=torch.bfloat16 min=-47.75 max=55.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-47.75 max=55.5
Dropout output=0 dtype=torch.bfloat16 min=-47.75 max=55.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-37120.0 max=39680.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-37120.0 max=39680.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-37120.0 max=39680.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.5 max=1.90625
Linear input=0 dtype=torch.bfloat16 min=-2.5 max=1.90625
Linear output=0 dtype=torch.bfloat16 min=-4.8125 max=11.4375
NewGELUActivation input=0 dtype=torch.bfloat16 min=-4.8125 max=11.4375
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=11.4375
Linear input=0 dtype=torch.bfloat16 min=-2.5 max=1.90625
Linear output=0 dtype=torch.bfloat16 min=-9.5625 max=18.25
Dropout input=0 dtype=torch.bfloat16 min=-33.5 max=96.0
Dropout output=0 dtype=torch.bfloat16 min=-33.5 max=96.0
Linear input=0 dtype=torch.bfloat16 min=-33.5 max=96.0
Linear output=0 dtype=torch.bfloat16 min=-1832.0 max=2304.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.5 max=1.90625
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-1832.0 max=2304.0
Dropout input=0 dtype=torch.bfloat16 min=-1832.0 max=2304.0
Dropout output=0 dtype=torch.bfloat16 min=-1832.0 max=2304.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-37120.0 max=39680.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-38912.0 max=41984.0
T5Block input=0 dtype=torch.bfloat16 min=-37120.0 max=39680.0
T5Block output=0 dtype=torch.bfloat16 min=-38912.0 max=41984.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-38912.0 max=41984.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.2421875 max=1.875
Linear input=0 dtype=torch.bfloat16 min=-1.2421875 max=1.875
Linear output=0 dtype=torch.bfloat16 min=-0.7421875 max=0.64453125
Linear input=0 dtype=torch.bfloat16 min=-1.2421875 max=1.875
Linear output=0 dtype=torch.bfloat16 min=-8.5 max=7.8125
Linear input=0 dtype=torch.bfloat16 min=-1.2421875 max=1.875
Linear output=0 dtype=torch.bfloat16 min=-3.453125 max=3.78125
Linear input=0 dtype=torch.bfloat16 min=-3.140625 max=3.109375
Linear output=0 dtype=torch.bfloat16 min=-82.0 max=50.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.2421875 max=1.875
T5Attention output=0 dtype=torch.bfloat16 min=-82.0 max=50.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-82.0 max=50.5
Dropout output=0 dtype=torch.bfloat16 min=-82.0 max=50.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-38912.0 max=41984.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-38912.0 max=41984.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-38912.0 max=41984.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-4.0 max=1.859375
Linear input=0 dtype=torch.bfloat16 min=-4.0 max=1.859375
Linear output=0 dtype=torch.bfloat16 min=-5.09375 max=15.375
NewGELUActivation input=0 dtype=torch.bfloat16 min=-5.09375 max=15.375
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=15.375
Linear input=0 dtype=torch.bfloat16 min=-4.0 max=1.859375
Linear output=0 dtype=torch.bfloat16 min=-32.25 max=23.625
Dropout input=0 dtype=torch.bfloat16 min=-93.5 max=84.0
Dropout output=0 dtype=torch.bfloat16 min=-93.5 max=84.0
Linear input=0 dtype=torch.bfloat16 min=-93.5 max=84.0
Linear output=0 dtype=torch.bfloat16 min=-3136.0 max=3696.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-4.0 max=1.859375
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-3136.0 max=3696.0
Dropout input=0 dtype=torch.bfloat16 min=-3136.0 max=3696.0
Dropout output=0 dtype=torch.bfloat16 min=-3136.0 max=3696.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-38912.0 max=41984.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-41984.0 max=45568.0
T5Block input=0 dtype=torch.bfloat16 min=-38912.0 max=41984.0
T5Block output=0 dtype=torch.bfloat16 min=-41984.0 max=45568.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-41984.0 max=45568.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.5390625 max=1.21875
Linear input=0 dtype=torch.bfloat16 min=-1.5390625 max=1.21875
Linear output=0 dtype=torch.bfloat16 min=-0.72265625 max=0.78515625
Linear input=0 dtype=torch.bfloat16 min=-1.5390625 max=1.21875
Linear output=0 dtype=torch.bfloat16 min=-6.375 max=6.53125
Linear input=0 dtype=torch.bfloat16 min=-1.5390625 max=1.21875
Linear output=0 dtype=torch.bfloat16 min=-2.875 max=3.234375
Linear input=0 dtype=torch.bfloat16 min=-2.3125 max=2.0625
Linear output=0 dtype=torch.bfloat16 min=-72.0 max=66.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.5390625 max=1.21875
T5Attention output=0 dtype=torch.bfloat16 min=-72.0 max=66.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-72.0 max=66.5
Dropout output=0 dtype=torch.bfloat16 min=-72.0 max=66.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-41984.0 max=45568.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-41984.0 max=45568.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-41984.0 max=45568.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-8.8125 max=2.15625
Linear input=0 dtype=torch.bfloat16 min=-8.8125 max=2.15625
Linear output=0 dtype=torch.bfloat16 min=-7.65625 max=36.75
NewGELUActivation input=0 dtype=torch.bfloat16 min=-7.65625 max=36.75
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=36.75
Linear input=0 dtype=torch.bfloat16 min=-8.8125 max=2.15625
Linear output=0 dtype=torch.bfloat16 min=-159.0 max=107.0
Dropout input=0 dtype=torch.bfloat16 min=-5216.0 max=3456.0
Dropout output=0 dtype=torch.bfloat16 min=-5216.0 max=3456.0
Linear input=0 dtype=torch.bfloat16 min=-5216.0 max=3456.0
Linear output=0 dtype=torch.bfloat16 min=-136192.0 max=141312.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-8.8125 max=2.15625
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-136192.0 max=141312.0
Dropout input=0 dtype=torch.bfloat16 min=-136192.0 max=141312.0
Dropout output=0 dtype=torch.bfloat16 min=-136192.0 max=141312.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-41984.0 max=45568.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-176128.0 max=186368.0
T5Block input=0 dtype=torch.bfloat16 min=-41984.0 max=45568.0
T5Block output=0 dtype=torch.bfloat16 min=-176128.0 max=186368.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
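
One dtype observation at this point in the trace: the residual stream has grown to roughly +/-1.9e5 (T5Block output=0 above), well past float16's maximum finite value of 65504, though still negligible next to bfloat16's ~3.39e38, the same constant that appears (negated) as the attention-mask fill value in the CLIPEncoderLayer input=2 lines earlier. A quick check, assuming nothing beyond stock torch.finfo:

import torch

print(torch.finfo(torch.float16).max)   # 65504.0 -- these T5 activations would overflow fp16
print(torch.finfo(torch.bfloat16).max)  # 3.3895313892515355e+38, matching -min of the mask lines
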
T5LayerNorm input=0 dtype=torch.bfloat16 min=-176128.0 max=186368.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.3984375 max=1.1328125
Linear input=0 dtype=torch.bfloat16 min=-1.3984375 max=1.1328125
Linear output=0 dtype=torch.bfloat16 min=-0.67578125 max=0.71484375
Linear input=0 dtype=torch.bfloat16 min=-1.3984375 max=1.1328125
Linear output=0 dtype=torch.bfloat16 min=-6.59375 max=6.8125
Linear input=0 dtype=torch.bfloat16 min=-1.3984375 max=1.1328125
Linear output=0 dtype=torch.bfloat16 min=-3.296875 max=3.078125
Linear input=0 dtype=torch.bfloat16 min=-3.203125 max=2.40625
Linear output=0 dtype=torch.bfloat16 min=-60.75 max=91.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.3984375 max=1.1328125
T5Attention output=0 dtype=torch.bfloat16 min=-60.75 max=91.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-60.75 max=91.5
Dropout output=0 dtype=torch.bfloat16 min=-60.75 max=91.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-176128.0 max=186368.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-176128.0 max=186368.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-176128.0 max=186368.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.28125 max=1.5234375
Linear input=0 dtype=torch.bfloat16 min=-2.28125 max=1.5234375
Linear output=0 dtype=torch.bfloat16 min=-5.0625 max=6.59375
NewGELUActivation input=0 dtype=torch.bfloat16 min=-5.0625 max=6.59375
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=6.59375
Linear input=0 dtype=torch.bfloat16 min=-2.28125 max=1.5234375
Linear output=0 dtype=torch.bfloat16 min=-7.75 max=18.625
Dropout input=0 dtype=torch.bfloat16 min=-18.875 max=36.0
Dropout output=0 dtype=torch.bfloat16 min=-18.875 max=36.0
Linear input=0 dtype=torch.bfloat16 min=-18.875 max=36.0
Linear output=0 dtype=torch.bfloat16 min=-728.0 max=984.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.28125 max=1.5234375
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-728.0 max=984.0
Dropout input=0 dtype=torch.bfloat16 min=-728.0 max=984.0
Dropout output=0 dtype=torch.bfloat16 min=-728.0 max=984.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-176128.0 max=186368.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-177152.0 max=187392.0
T5Block input=0 dtype=torch.bfloat16 min=-176128.0 max=186368.0
T5Block output=0 dtype=torch.bfloat16 min=-177152.0 max=187392.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-177152.0 max=187392.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.8125 max=1.46875
Linear input=0 dtype=torch.bfloat16 min=-1.8125 max=1.46875
Linear output=0 dtype=torch.bfloat16 min=-0.6796875 max=0.6796875
Linear input=0 dtype=torch.bfloat16 min=-1.8125 max=1.46875
Linear output=0 dtype=torch.bfloat16 min=-7.625 max=10.0
Linear input=0 dtype=torch.bfloat16 min=-1.8125 max=1.46875
Linear output=0 dtype=torch.bfloat16 min=-2.984375 max=3.3125
Linear input=0 dtype=torch.bfloat16 min=-2.6875 max=2.8125
Linear output=0 dtype=torch.bfloat16 min=-75.5 max=99.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.8125 max=1.46875
T5Attention output=0 dtype=torch.bfloat16 min=-75.5 max=99.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-75.5 max=99.5
Dropout output=0 dtype=torch.bfloat16 min=-75.5 max=99.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-177152.0 max=187392.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-177152.0 max=187392.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-177152.0 max=187392.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.46875 max=1.6328125
Linear input=0 dtype=torch.bfloat16 min=-2.46875 max=1.6328125
Linear output=0 dtype=torch.bfloat16 min=-4.46875 max=7.75
NewGELUActivation input=0 dtype=torch.bfloat16 min=-4.46875 max=7.75
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=7.75
Linear input=0 dtype=torch.bfloat16 min=-2.46875 max=1.6328125
Linear output=0 dtype=torch.bfloat16 min=-49.0 max=53.25
Dropout input=0 dtype=torch.bfloat16 min=-298.0 max=264.0
Dropout output=0 dtype=torch.bfloat16 min=-298.0 max=264.0
Linear input=0 dtype=torch.bfloat16 min=-298.0 max=264.0
Linear output=0 dtype=torch.bfloat16 min=-5408.0 max=7648.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.46875 max=1.6328125
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-5408.0 max=7648.0
Dropout input=0 dtype=torch.bfloat16 min=-5408.0 max=7648.0
Dropout output=0 dtype=torch.bfloat16 min=-5408.0 max=7648.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-177152.0 max=187392.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-182272.0 max=194560.0
T5Block input=0 dtype=torch.bfloat16 min=-177152.0 max=187392.0
T5Block output=0 dtype=torch.bfloat16 min=-182272.0 max=194560.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-182272.0 max=194560.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.8125 max=1.546875
Linear input=0 dtype=torch.bfloat16 min=-1.8125 max=1.546875
Linear output=0 dtype=torch.bfloat16 min=-0.82421875 max=0.84765625
Linear input=0 dtype=torch.bfloat16 min=-1.8125 max=1.546875
Linear output=0 dtype=torch.bfloat16 min=-8.1875 max=7.65625
Linear input=0 dtype=torch.bfloat16 min=-1.8125 max=1.546875
Linear output=0 dtype=torch.bfloat16 min=-4.15625 max=4.09375
Linear input=0 dtype=torch.bfloat16 min=-3.6875 max=2.90625
Linear output=0 dtype=torch.bfloat16 min=-72.0 max=98.0
T5Attention input=0 dtype=torch.bfloat16 min=-1.8125 max=1.546875
T5Attention output=0 dtype=torch.bfloat16 min=-72.0 max=98.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-72.0 max=98.0
Dropout output=0 dtype=torch.bfloat16 min=-72.0 max=98.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-182272.0 max=194560.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-182272.0 max=194560.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-182272.0 max=194560.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.15625 max=1.6953125
Linear input=0 dtype=torch.bfloat16 min=-2.15625 max=1.6953125
Linear output=0 dtype=torch.bfloat16 min=-4.5625 max=7.625
NewGELUActivation input=0 dtype=torch.bfloat16 min=-4.5625 max=7.625
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=7.625
Linear input=0 dtype=torch.bfloat16 min=-2.15625 max=1.6953125
Linear output=0 dtype=torch.bfloat16 min=-12.0 max=13.875
Dropout input=0 dtype=torch.bfloat16 min=-36.25 max=39.5
Dropout output=0 dtype=torch.bfloat16 min=-36.25 max=39.5
Linear input=0 dtype=torch.bfloat16 min=-36.25 max=39.5
Linear output=0 dtype=torch.bfloat16 min=-1384.0 max=1936.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.15625 max=1.6953125
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-1384.0 max=1936.0
Dropout input=0 dtype=torch.bfloat16 min=-1384.0 max=1936.0
Dropout output=0 dtype=torch.bfloat16 min=-1384.0 max=1936.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-182272.0 max=194560.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-183296.0 max=196608.0
T5Block input=0 dtype=torch.bfloat16 min=-182272.0 max=194560.0
T5Block output=0 dtype=torch.bfloat16 min=-183296.0 max=196608.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-183296.0 max=196608.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.71875 max=1.5546875
Linear input=0 dtype=torch.bfloat16 min=-1.71875 max=1.5546875
Linear output=0 dtype=torch.bfloat16 min=-0.66796875 max=0.75390625
Linear input=0 dtype=torch.bfloat16 min=-1.71875 max=1.5546875
Linear output=0 dtype=torch.bfloat16 min=-6.5 max=6.625
Linear input=0 dtype=torch.bfloat16 min=-1.71875 max=1.5546875
Linear output=0 dtype=torch.bfloat16 min=-3.9375 max=4.34375
Linear input=0 dtype=torch.bfloat16 min=-2.640625 max=3.109375
Linear output=0 dtype=torch.bfloat16 min=-79.5 max=89.0
T5Attention input=0 dtype=torch.bfloat16 min=-1.71875 max=1.5546875
T5Attention output=0 dtype=torch.bfloat16 min=-79.5 max=89.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-79.5 max=89.0
Dropout output=0 dtype=torch.bfloat16 min=-79.5 max=89.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-183296.0 max=196608.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-183296.0 max=196608.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-183296.0 max=196608.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.390625 max=1.484375
Linear input=0 dtype=torch.bfloat16 min=-2.390625 max=1.484375
Linear output=0 dtype=torch.bfloat16 min=-5.3125 max=5.125
NewGELUActivation input=0 dtype=torch.bfloat16 min=-5.3125 max=5.125
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=5.125
Linear input=0 dtype=torch.bfloat16 min=-2.390625 max=1.484375
Linear output=0 dtype=torch.bfloat16 min=-15.0625 max=12.9375
Dropout input=0 dtype=torch.bfloat16 min=-55.0 max=32.75
Dropout output=0 dtype=torch.bfloat16 min=-55.0 max=32.75
Linear input=0 dtype=torch.bfloat16 min=-55.0 max=32.75
Linear output=0 dtype=torch.bfloat16 min=-1096.0 max=1464.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.390625 max=1.484375
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-1096.0 max=1464.0
Dropout input=0 dtype=torch.bfloat16 min=-1096.0 max=1464.0
Dropout output=0 dtype=torch.bfloat16 min=-1096.0 max=1464.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-183296.0 max=196608.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-184320.0 max=197632.0
T5Block input=0 dtype=torch.bfloat16 min=-183296.0 max=196608.0
T5Block output=0 dtype=torch.bfloat16 min=-184320.0 max=197632.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-184320.0 max=197632.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.453125 max=3.03125
Linear input=0 dtype=torch.bfloat16 min=-2.453125 max=3.03125
Linear output=0 dtype=torch.bfloat16 min=-0.83203125 max=0.80859375
Linear input=0 dtype=torch.bfloat16 min=-2.453125 max=3.03125
Linear output=0 dtype=torch.bfloat16 min=-8.125 max=7.53125
Linear input=0 dtype=torch.bfloat16 min=-2.453125 max=3.03125
Linear output=0 dtype=torch.bfloat16 min=-4.75 max=4.03125
Linear input=0 dtype=torch.bfloat16 min=-3.625 max=3.296875
Linear output=0 dtype=torch.bfloat16 min=-80.5 max=109.0
T5Attention input=0 dtype=torch.bfloat16 min=-2.453125 max=3.03125
T5Attention output=0 dtype=torch.bfloat16 min=-80.5 max=109.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-80.5 max=109.0
Dropout output=0 dtype=torch.bfloat16 min=-80.5 max=109.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-184320.0 max=197632.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-184320.0 max=197632.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-184320.0 max=197632.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.53125 max=1.5625
Linear input=0 dtype=torch.bfloat16 min=-2.53125 max=1.5625
Linear output=0 dtype=torch.bfloat16 min=-8.3125 max=7.625
NewGELUActivation input=0 dtype=torch.bfloat16 min=-8.3125 max=7.625
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=7.625
Linear input=0 dtype=torch.bfloat16 min=-2.53125 max=1.5625
Linear output=0 dtype=torch.bfloat16 min=-47.75 max=26.375
Dropout input=0 dtype=torch.bfloat16 min=-100.5 max=102.0
Dropout output=0 dtype=torch.bfloat16 min=-100.5 max=102.0
Linear input=0 dtype=torch.bfloat16 min=-100.5 max=102.0
Linear output=0 dtype=torch.bfloat16 min=-2320.0 max=2944.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.53125 max=1.5625
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-2320.0 max=2944.0
Dropout input=0 dtype=torch.bfloat16 min=-2320.0 max=2944.0
Dropout output=0 dtype=torch.bfloat16 min=-2320.0 max=2944.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-184320.0 max=197632.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-186368.0 max=200704.0
T5Block input=0 dtype=torch.bfloat16 min=-184320.0 max=197632.0
T5Block output=0 dtype=torch.bfloat16 min=-186368.0 max=200704.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-186368.0 max=200704.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.515625 max=3.109375
Linear input=0 dtype=torch.bfloat16 min=-2.515625 max=3.109375
Linear output=0 dtype=torch.bfloat16 min=-0.80859375 max=0.83203125
Linear input=0 dtype=torch.bfloat16 min=-2.515625 max=3.109375
Linear output=0 dtype=torch.bfloat16 min=-8.5625 max=7.90625
Linear input=0 dtype=torch.bfloat16 min=-2.515625 max=3.109375
Linear output=0 dtype=torch.bfloat16 min=-6.125 max=5.4375
Linear input=0 dtype=torch.bfloat16 min=-5.1875 max=5.03125
Linear output=0 dtype=torch.bfloat16 min=-178.0 max=177.0
T5Attention input=0 dtype=torch.bfloat16 min=-2.515625 max=3.109375
T5Attention output=0 dtype=torch.bfloat16 min=-178.0 max=177.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-178.0 max=177.0
Dropout output=0 dtype=torch.bfloat16 min=-178.0 max=177.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-186368.0 max=200704.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-186368.0 max=200704.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-186368.0 max=200704.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.265625 max=1.6328125
Linear input=0 dtype=torch.bfloat16 min=-2.265625 max=1.6328125
Linear output=0 dtype=torch.bfloat16 min=-5.84375 max=6.90625
NewGELUActivation input=0 dtype=torch.bfloat16 min=-5.84375 max=6.90625
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=6.90625
Linear input=0 dtype=torch.bfloat16 min=-2.265625 max=1.6328125
Linear output=0 dtype=torch.bfloat16 min=-106.5 max=65.0
Dropout input=0 dtype=torch.bfloat16 min=-300.0 max=223.0
Dropout output=0 dtype=torch.bfloat16 min=-300.0 max=223.0
Linear input=0 dtype=torch.bfloat16 min=-300.0 max=223.0
Linear output=0 dtype=torch.bfloat16 min=-4448.0 max=6336.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.265625 max=1.6328125
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-4448.0 max=6336.0
Dropout input=0 dtype=torch.bfloat16 min=-4448.0 max=6336.0
Dropout output=0 dtype=torch.bfloat16 min=-4448.0 max=6336.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-186368.0 max=200704.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-190464.0 max=206848.0
T5Block input=0 dtype=torch.bfloat16 min=-186368.0 max=200704.0
T5Block output=0 dtype=torch.bfloat16 min=-190464.0 max=206848.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-190464.0 max=206848.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.9375 max=3.40625
Linear input=0 dtype=torch.bfloat16 min=-2.9375 max=3.40625
Linear output=0 dtype=torch.bfloat16 min=-0.7890625 max=0.91796875
Linear input=0 dtype=torch.bfloat16 min=-2.9375 max=3.40625
Linear output=0 dtype=torch.bfloat16 min=-7.8125 max=7.8125
Linear input=0 dtype=torch.bfloat16 min=-2.9375 max=3.40625
Linear output=0 dtype=torch.bfloat16 min=-7.21875 max=6.03125
Linear input=0 dtype=torch.bfloat16 min=-6.78125 max=4.125
Linear output=0 dtype=torch.bfloat16 min=-244.0 max=174.0
T5Attention input=0 dtype=torch.bfloat16 min=-2.9375 max=3.40625
T5Attention output=0 dtype=torch.bfloat16 min=-244.0 max=174.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-244.0 max=174.0
Dropout output=0 dtype=torch.bfloat16 min=-244.0 max=174.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-190464.0 max=206848.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-190464.0 max=206848.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-190464.0 max=206848.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.6328125 max=1.65625
Linear input=0 dtype=torch.bfloat16 min=-1.6328125 max=1.65625
Linear output=0 dtype=torch.bfloat16 min=-6.84375 max=12.75
NewGELUActivation input=0 dtype=torch.bfloat16 min=-6.84375 max=12.75
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=12.75
Linear input=0 dtype=torch.bfloat16 min=-1.6328125 max=1.65625
Linear output=0 dtype=torch.bfloat16 min=-45.0 max=37.75
Dropout input=0 dtype=torch.bfloat16 min=-249.0 max=192.0
Dropout output=0 dtype=torch.bfloat16 min=-249.0 max=192.0
Linear input=0 dtype=torch.bfloat16 min=-249.0 max=192.0
Linear output=0 dtype=torch.bfloat16 min=-7040.0 max=8960.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-1.6328125 max=1.65625
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-7040.0 max=8960.0
Dropout input=0 dtype=torch.bfloat16 min=-7040.0 max=8960.0
Dropout output=0 dtype=torch.bfloat16 min=-7040.0 max=8960.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-190464.0 max=206848.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-195584.0 max=216064.0
T5Block input=0 dtype=torch.bfloat16 min=-190464.0 max=206848.0
T5Block output=0 dtype=torch.bfloat16 min=-195584.0 max=216064.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-195584.0 max=216064.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.546875 max=2.890625
Linear input=0 dtype=torch.bfloat16 min=-2.546875 max=2.890625
Linear output=0 dtype=torch.bfloat16 min=-1.359375 max=1.0078125
Linear input=0 dtype=torch.bfloat16 min=-2.546875 max=2.890625
Linear output=0 dtype=torch.bfloat16 min=-8.5 max=10.625
Linear input=0 dtype=torch.bfloat16 min=-2.546875 max=2.890625
Linear output=0 dtype=torch.bfloat16 min=-6.0625 max=7.53125
Linear input=0 dtype=torch.bfloat16 min=-4.8125 max=5.90625
Linear output=0 dtype=torch.bfloat16 min=-356.0 max=306.0
T5Attention input=0 dtype=torch.bfloat16 min=-2.546875 max=2.890625
T5Attention output=0 dtype=torch.bfloat16 min=-356.0 max=306.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-356.0 max=306.0
Dropout output=0 dtype=torch.bfloat16 min=-356.0 max=306.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-195584.0 max=216064.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-195584.0 max=216064.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-195584.0 max=216064.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.3984375 max=1.890625
Linear input=0 dtype=torch.bfloat16 min=-1.3984375 max=1.890625
Linear output=0 dtype=torch.bfloat16 min=-7.59375 max=10.0
NewGELUActivation input=0 dtype=torch.bfloat16 min=-7.59375 max=10.0
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=10.0
Linear input=0 dtype=torch.bfloat16 min=-1.3984375 max=1.890625
Linear output=0 dtype=torch.bfloat16 min=-38.75 max=39.75
Dropout input=0 dtype=torch.bfloat16 min=-167.0 max=98.0
Dropout output=0 dtype=torch.bfloat16 min=-167.0 max=98.0
Linear input=0 dtype=torch.bfloat16 min=-167.0 max=98.0
Linear output=0 dtype=torch.bfloat16 min=-2336.0 max=3424.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-1.3984375 max=1.890625
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-2336.0 max=3424.0
Dropout input=0 dtype=torch.bfloat16 min=-2336.0 max=3424.0
Dropout output=0 dtype=torch.bfloat16 min=-2336.0 max=3424.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-195584.0 max=216064.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-196608.0 max=219136.0
T5Block input=0 dtype=torch.bfloat16 min=-195584.0 max=216064.0
T5Block output=0 dtype=torch.bfloat16 min=-196608.0 max=219136.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-196608.0 max=219136.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.390625 max=2.609375
Linear input=0 dtype=torch.bfloat16 min=-2.390625 max=2.609375
Linear output=0 dtype=torch.bfloat16 min=-0.85546875 max=0.8671875
Linear input=0 dtype=torch.bfloat16 min=-2.390625 max=2.609375
Linear output=0 dtype=torch.bfloat16 min=-7.28125 max=8.1875
Linear input=0 dtype=torch.bfloat16 min=-2.390625 max=2.609375
Linear output=0 dtype=torch.bfloat16 min=-8.8125 max=8.875
Linear input=0 dtype=torch.bfloat16 min=-8.0625 max=5.65625
Linear output=0 dtype=torch.bfloat16 min=-410.0 max=368.0
T5Attention input=0 dtype=torch.bfloat16 min=-2.390625 max=2.609375
T5Attention output=0 dtype=torch.bfloat16 min=-410.0 max=368.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-410.0 max=368.0
Dropout output=0 dtype=torch.bfloat16 min=-410.0 max=368.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-196608.0 max=219136.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-196608.0 max=219136.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-196608.0 max=219136.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.015625 max=2.3125
Linear input=0 dtype=torch.bfloat16 min=-2.015625 max=2.3125
Linear output=0 dtype=torch.bfloat16 min=-9.5 max=7.8125
NewGELUActivation input=0 dtype=torch.bfloat16 min=-9.5 max=7.8125
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=7.8125
Linear input=0 dtype=torch.bfloat16 min=-2.015625 max=2.3125
Linear output=0 dtype=torch.bfloat16 min=-43.75 max=43.25
Dropout input=0 dtype=torch.bfloat16 min=-118.0 max=84.5
Dropout output=0 dtype=torch.bfloat16 min=-118.0 max=84.5
Linear input=0 dtype=torch.bfloat16 min=-118.0 max=84.5
Linear output=0 dtype=torch.bfloat16 min=-3056.0 max=3136.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.015625 max=2.3125
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-3056.0 max=3136.0
Dropout input=0 dtype=torch.bfloat16 min=-3056.0 max=3136.0
Dropout output=0 dtype=torch.bfloat16 min=-3056.0 max=3136.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-196608.0 max=219136.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-197632.0 max=222208.0
T5Block input=0 dtype=torch.bfloat16 min=-196608.0 max=219136.0
T5Block output=0 dtype=torch.bfloat16 min=-197632.0 max=222208.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-197632.0 max=222208.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.21875 max=2.265625
Linear input=0 dtype=torch.bfloat16 min=-2.21875 max=2.265625
Linear output=0 dtype=torch.bfloat16 min=-0.82421875 max=1.046875
Linear input=0 dtype=torch.bfloat16 min=-2.21875 max=2.265625
Linear output=0 dtype=torch.bfloat16 min=-8.4375 max=7.21875
Linear input=0 dtype=torch.bfloat16 min=-2.21875 max=2.265625
Linear output=0 dtype=torch.bfloat16 min=-11.25 max=10.25
Linear input=0 dtype=torch.bfloat16 min=-8.4375 max=6.59375
Linear output=0 dtype=torch.bfloat16 min=-221.0 max=222.0
T5Attention input=0 dtype=torch.bfloat16 min=-2.21875 max=2.265625
T5Attention output=0 dtype=torch.bfloat16 min=-221.0 max=222.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-221.0 max=222.0
Dropout output=0 dtype=torch.bfloat16 min=-221.0 max=222.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-197632.0 max=222208.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-197632.0 max=222208.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-197632.0 max=222208.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.625 max=2.46875
Linear input=0 dtype=torch.bfloat16 min=-2.625 max=2.46875
Linear output=0 dtype=torch.bfloat16 min=-7.03125 max=20.625
NewGELUActivation input=0 dtype=torch.bfloat16 min=-7.03125 max=20.625
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=20.625
Linear input=0 dtype=torch.bfloat16 min=-2.625 max=2.46875
Linear output=0 dtype=torch.bfloat16 min=-35.75 max=27.625
Dropout input=0 dtype=torch.bfloat16 min=-128.0 max=99.0
Dropout output=0 dtype=torch.bfloat16 min=-128.0 max=99.0
Linear input=0 dtype=torch.bfloat16 min=-128.0 max=99.0
Linear output=0 dtype=torch.bfloat16 min=-2096.0 max=1960.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.625 max=2.46875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-2096.0 max=1960.0
Dropout input=0 dtype=torch.bfloat16 min=-2096.0 max=1960.0
Dropout output=0 dtype=torch.bfloat16 min=-2096.0 max=1960.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-197632.0 max=222208.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-198656.0 max=224256.0
T5Block input=0 dtype=torch.bfloat16 min=-197632.0 max=222208.0
T5Block output=0 dtype=torch.bfloat16 min=-198656.0 max=224256.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-198656.0 max=224256.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-3.59375 max=3.046875
Linear input=0 dtype=torch.bfloat16 min=-3.59375 max=3.046875
Linear output=0 dtype=torch.bfloat16 min=-0.89453125 max=1.2578125
Linear input=0 dtype=torch.bfloat16 min=-3.59375 max=3.046875
Linear output=0 dtype=torch.bfloat16 min=-8.125 max=10.8125
Linear input=0 dtype=torch.bfloat16 min=-3.59375 max=3.046875
Linear output=0 dtype=torch.bfloat16 min=-14.3125 max=11.6875
Linear input=0 dtype=torch.bfloat16 min=-14.3125 max=11.625
Linear output=0 dtype=torch.bfloat16 min=-197.0 max=266.0
T5Attention input=0 dtype=torch.bfloat16 min=-3.59375 max=3.046875
T5Attention output=0 dtype=torch.bfloat16 min=-197.0 max=266.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-197.0 max=266.0
Dropout output=0 dtype=torch.bfloat16 min=-197.0 max=266.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-198656.0 max=224256.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-198656.0 max=224256.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-198656.0 max=224256.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-3.5 max=2.96875
Linear input=0 dtype=torch.bfloat16 min=-3.5 max=2.96875
Linear output=0 dtype=torch.bfloat16 min=-7.0625 max=20.25
NewGELUActivation input=0 dtype=torch.bfloat16 min=-7.0625 max=20.25
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=20.25
Linear input=0 dtype=torch.bfloat16 min=-3.5 max=2.96875
Linear output=0 dtype=torch.bfloat16 min=-42.5 max=95.5
Dropout input=0 dtype=torch.bfloat16 min=-135.0 max=330.0
Dropout output=0 dtype=torch.bfloat16 min=-135.0 max=330.0
Linear input=0 dtype=torch.bfloat16 min=-135.0 max=330.0
Linear output=0 dtype=torch.bfloat16 min=-4128.0 max=3728.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-3.5 max=2.96875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-4128.0 max=3728.0
Dropout input=0 dtype=torch.bfloat16 min=-4128.0 max=3728.0
Dropout output=0 dtype=torch.bfloat16 min=-4128.0 max=3728.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-198656.0 max=224256.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-200704.0 max=228352.0
T5Block input=0 dtype=torch.bfloat16 min=-198656.0 max=224256.0
T5Block output=0 dtype=torch.bfloat16 min=-200704.0 max=228352.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-200704.0 max=228352.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-6.09375 max=4.34375
Linear input=0 dtype=torch.bfloat16 min=-6.09375 max=4.34375
Linear output=0 dtype=torch.bfloat16 min=-1.6875 max=1.078125
Linear input=0 dtype=torch.bfloat16 min=-6.09375 max=4.34375
Linear output=0 dtype=torch.bfloat16 min=-10.0625 max=9.1875
Linear input=0 dtype=torch.bfloat16 min=-6.09375 max=4.34375
Linear output=0 dtype=torch.bfloat16 min=-19.875 max=13.875
Linear input=0 dtype=torch.bfloat16 min=-19.875 max=12.125
Linear output=0 dtype=torch.bfloat16 min=-266.0 max=306.0
T5Attention input=0 dtype=torch.bfloat16 min=-6.09375 max=4.34375
T5Attention output=0 dtype=torch.bfloat16 min=-266.0 max=306.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-266.0 max=306.0
Dropout output=0 dtype=torch.bfloat16 min=-266.0 max=306.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-200704.0 max=228352.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-200704.0 max=228352.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-200704.0 max=228352.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-5.0 max=3.671875
Linear input=0 dtype=torch.bfloat16 min=-5.0 max=3.671875
Linear output=0 dtype=torch.bfloat16 min=-9.5625 max=22.0
NewGELUActivation input=0 dtype=torch.bfloat16 min=-9.5625 max=22.0
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=22.0
Linear input=0 dtype=torch.bfloat16 min=-5.0 max=3.671875
Linear output=0 dtype=torch.bfloat16 min=-111.0 max=200.0
Dropout input=0 dtype=torch.bfloat16 min=-716.0 max=1648.0
Dropout output=0 dtype=torch.bfloat16 min=-716.0 max=1648.0
Linear input=0 dtype=torch.bfloat16 min=-716.0 max=1648.0
Linear output=0 dtype=torch.bfloat16 min=-5760.0 max=5568.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-5.0 max=3.671875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-5760.0 max=5568.0
Dropout input=0 dtype=torch.bfloat16 min=-5760.0 max=5568.0
Dropout output=0 dtype=torch.bfloat16 min=-5760.0 max=5568.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-200704.0 max=228352.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-205824.0 max=233472.0
T5Block input=0 dtype=torch.bfloat16 min=-200704.0 max=228352.0
T5Block output=0 dtype=torch.bfloat16 min=-205824.0 max=233472.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-205824.0 max=233472.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-6.5 max=4.1875
Linear input=0 dtype=torch.bfloat16 min=-6.5 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-1.1640625 max=0.8984375
Linear input=0 dtype=torch.bfloat16 min=-6.5 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-9.9375 max=8.75
Linear input=0 dtype=torch.bfloat16 min=-6.5 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-27.5 max=25.0
Linear input=0 dtype=torch.bfloat16 min=-15.5625 max=18.25
Linear output=0 dtype=torch.bfloat16 min=-402.0 max=342.0
T5Attention input=0 dtype=torch.bfloat16 min=-6.5 max=4.1875
T5Attention output=0 dtype=torch.bfloat16 min=-402.0 max=342.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-402.0 max=342.0
Dropout output=0 dtype=torch.bfloat16 min=-402.0 max=342.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-205824.0 max=233472.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-205824.0 max=233472.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-205824.0 max=233472.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-6.65625 max=9.0
Linear input=0 dtype=torch.bfloat16 min=-6.65625 max=9.0
Linear output=0 dtype=torch.bfloat16 min=-12.0625 max=102.5
NewGELUActivation input=0 dtype=torch.bfloat16 min=-12.0625 max=102.5
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=102.5
Linear input=0 dtype=torch.bfloat16 min=-6.65625 max=9.0
Linear output=0 dtype=torch.bfloat16 min=-98.5 max=138.0
Dropout input=0 dtype=torch.bfloat16 min=-1368.0 max=3760.0
Dropout output=0 dtype=torch.bfloat16 min=-1368.0 max=3760.0
Linear input=0 dtype=torch.bfloat16 min=-1368.0 max=3760.0
Linear output=0 dtype=torch.bfloat16 min=-43520.0 max=44288.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-6.65625 max=9.0
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-43520.0 max=44288.0
Dropout input=0 dtype=torch.bfloat16 min=-43520.0 max=44288.0
Dropout output=0 dtype=torch.bfloat16 min=-43520.0 max=44288.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-205824.0 max=233472.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-209920.0 max=237568.0
T5Block input=0 dtype=torch.bfloat16 min=-205824.0 max=233472.0
T5Block output=0 dtype=torch.bfloat16 min=-209920.0 max=237568.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
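
By this last block the residual stream has grown to roughly +/-2.4e5 (T5Block output=0 min=-209920.0 max=237568.0 above, up from +/-1.8e5 at the start of this section). That range is comfortable in bfloat16 but would overflow float16, which is presumably why these goldens are captured in bf16; a quick check:

import torch
print(torch.finfo(torch.float16).max)             # 65504.0
print(torch.finfo(torch.bfloat16).max)            # 3.3895313892515355e+38
print(torch.tensor(237568.0).to(torch.float16))   # tensor(inf, dtype=torch.float16)
print(torch.tensor(237568.0).to(torch.bfloat16))  # tensor(237568., dtype=torch.bfloat16)
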
T5LayerNorm input=0 dtype=torch.bfloat16 min=-209920.0 max=237568.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-6.5625 max=2.40625
Dropout input=0 dtype=torch.bfloat16 min=-6.5625 max=2.40625
Dropout output=0 dtype=torch.bfloat16 min=-6.5625 max=2.40625
T5EncoderModel input=0 dtype=torch.int64 min=0 max=21820
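
The T5EncoderModel line above closes the T5 trace (input=0 is its int64 token-id tensor). The trace that follows is the CLIP text encoder again, now fed only ids 49406-49407, i.e. <|startoftext|> followed by <|endoftext|>/EOS padding, consistent with encoding an empty (negative) prompt. A hypothetical check of those ids with the HF tokenizer (the checkpoint name is an assumption):

from transformers import CLIPTokenizer
tok = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
print(tok.bos_token_id, tok.eos_token_id)  # 49406 49407
print(tok("")["input_ids"])                # [49406, 49407]
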
Embedding input=0 dtype=torch.int64 min=49406 max=49407
Embedding output=0 dtype=torch.bfloat16 min=-0.5078125 max=0.65234375
Embedding input=0 dtype=torch.int64 min=0 max=76
Embedding output=0 dtype=torch.bfloat16 min=-0.1181640625 max=0.65234375
CLIPTextEmbeddings output=0 dtype=torch.bfloat16 min=-0.5234375 max=1.3046875
LayerNorm input=0 dtype=torch.bfloat16 min=-0.5234375 max=1.3046875
LayerNorm output=0 dtype=torch.bfloat16 min=-11.0 max=28.5
Linear input=0 dtype=torch.bfloat16 min=-11.0 max=28.5
Linear output=0 dtype=torch.bfloat16 min=-6.90625 max=5.25
Linear input=0 dtype=torch.bfloat16 min=-11.0 max=28.5
Linear output=0 dtype=torch.bfloat16 min=-5.15625 max=5.5
Linear input=0 dtype=torch.bfloat16 min=-11.0 max=28.5
Linear output=0 dtype=torch.bfloat16 min=-2.1875 max=1.59375
Linear input=0 dtype=torch.bfloat16 min=-2.1875 max=1.5703125
Linear output=0 dtype=torch.bfloat16 min=-0.99609375 max=1.078125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.99609375 max=1.078125
LayerNorm input=0 dtype=torch.bfloat16 min=-0.9921875 max=1.71875
LayerNorm output=0 dtype=torch.bfloat16 min=-22.875 max=179.0
Linear input=0 dtype=torch.bfloat16 min=-22.875 max=179.0
Linear output=0 dtype=torch.bfloat16 min=-45.0 max=233.0
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-45.0 max=233.0
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=233.0
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=233.0
Linear output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPMLP input=0 dtype=torch.bfloat16 min=-22.875 max=179.0
CLIPMLP output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-0.5234375 max=1.3046875
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-13.3125 max=9.25
Linear input=0 dtype=torch.bfloat16 min=-13.3125 max=9.25
Linear output=0 dtype=torch.bfloat16 min=-4.25 max=4.8125
Linear input=0 dtype=torch.bfloat16 min=-13.3125 max=9.25
Linear output=0 dtype=torch.bfloat16 min=-2.8125 max=2.8125
Linear input=0 dtype=torch.bfloat16 min=-13.3125 max=9.25
Linear output=0 dtype=torch.bfloat16 min=-1.109375 max=1.1640625
Linear input=0 dtype=torch.bfloat16 min=-0.734375 max=0.59375
Linear output=0 dtype=torch.bfloat16 min=-0.39453125 max=0.97265625
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.39453125 max=0.97265625
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-41.0 max=80.5
Linear input=0 dtype=torch.bfloat16 min=-41.0 max=80.5
Linear output=0 dtype=torch.bfloat16 min=-11.875 max=2.28125
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-11.875 max=2.28125
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=2.234375
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=2.234375
Linear output=0 dtype=torch.bfloat16 min=-1.203125 max=1.0234375
CLIPMLP input=0 dtype=torch.bfloat16 min=-41.0 max=80.5
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.203125 max=1.0234375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-13.6875 max=11.8125
Linear input=0 dtype=torch.bfloat16 min=-13.6875 max=11.8125
Linear output=0 dtype=torch.bfloat16 min=-5.59375 max=4.96875
Linear input=0 dtype=torch.bfloat16 min=-13.6875 max=11.8125
Linear output=0 dtype=torch.bfloat16 min=-3.25 max=4.375
Linear input=0 dtype=torch.bfloat16 min=-13.6875 max=11.8125
Linear output=0 dtype=torch.bfloat16 min=-1.7421875 max=1.4609375
Linear input=0 dtype=torch.bfloat16 min=-0.65625 max=0.8046875
Linear output=0 dtype=torch.bfloat16 min=-0.296875 max=0.326171875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.296875 max=0.326171875
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-25.125 max=80.5
Linear input=0 dtype=torch.bfloat16 min=-25.125 max=80.5
Linear output=0 dtype=torch.bfloat16 min=-6.84375 max=3.71875
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-6.84375 max=3.71875
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=3.71875
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=3.71875
Linear output=0 dtype=torch.bfloat16 min=-0.50390625 max=0.4296875
CLIPMLP input=0 dtype=torch.bfloat16 min=-25.125 max=80.5
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.50390625 max=0.4296875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-12.4375 max=13.3125
Linear input=0 dtype=torch.bfloat16 min=-12.4375 max=13.3125
Linear output=0 dtype=torch.bfloat16 min=-4.5625 max=3.765625
Linear input=0 dtype=torch.bfloat16 min=-12.4375 max=13.3125
Linear output=0 dtype=torch.bfloat16 min=-3.375 max=3.078125
Linear input=0 dtype=torch.bfloat16 min=-12.4375 max=13.3125
Linear output=0 dtype=torch.bfloat16 min=-1.9921875 max=2.125
Linear input=0 dtype=torch.bfloat16 min=-0.76171875 max=0.96875
Linear output=0 dtype=torch.bfloat16 min=-0.296875 max=0.2890625
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.296875 max=0.2890625
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-26.5 max=61.0
Linear input=0 dtype=torch.bfloat16 min=-26.5 max=61.0
Linear output=0 dtype=torch.bfloat16 min=-5.5625 max=2.296875
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-5.5625 max=2.296875
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=2.25
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=2.25
Linear output=0 dtype=torch.bfloat16 min=-0.279296875 max=0.337890625
CLIPMLP input=0 dtype=torch.bfloat16 min=-26.5 max=61.0
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.279296875 max=0.337890625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-11.9375 max=11.8125
Linear input=0 dtype=torch.bfloat16 min=-11.9375 max=11.8125
Linear output=0 dtype=torch.bfloat16 min=-4.34375 max=3.65625
Linear input=0 dtype=torch.bfloat16 min=-11.9375 max=11.8125
Linear output=0 dtype=torch.bfloat16 min=-3.984375 max=3.96875
Linear input=0 dtype=torch.bfloat16 min=-11.9375 max=11.8125
Linear output=0 dtype=torch.bfloat16 min=-2.734375 max=1.9765625
Linear input=0 dtype=torch.bfloat16 min=-1.3125 max=0.921875
Linear output=0 dtype=torch.bfloat16 min=-0.333984375 max=0.337890625
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.333984375 max=0.337890625
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-25.375 max=49.25
Linear input=0 dtype=torch.bfloat16 min=-25.375 max=49.25
Linear output=0 dtype=torch.bfloat16 min=-6.40625 max=5.21875
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-6.40625 max=5.21875
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=5.21875
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=5.21875
Linear output=0 dtype=torch.bfloat16 min=-0.5703125 max=0.453125
CLIPMLP input=0 dtype=torch.bfloat16 min=-25.375 max=49.25
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.5703125 max=0.453125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-9.3125 max=8.75
Linear input=0 dtype=torch.bfloat16 min=-9.3125 max=8.75
Linear output=0 dtype=torch.bfloat16 min=-4.40625 max=4.5625
Linear input=0 dtype=torch.bfloat16 min=-9.3125 max=8.75
Linear output=0 dtype=torch.bfloat16 min=-4.28125 max=3.65625
Linear input=0 dtype=torch.bfloat16 min=-9.3125 max=8.75
Linear output=0 dtype=torch.bfloat16 min=-1.78125 max=2.421875
Linear input=0 dtype=torch.bfloat16 min=-1.3203125 max=1.5234375
Linear output=0 dtype=torch.bfloat16 min=-0.396484375 max=0.455078125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.396484375 max=0.455078125
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=44.25
Linear input=0 dtype=torch.bfloat16 min=-33.0 max=44.25
Linear output=0 dtype=torch.bfloat16 min=-5.28125 max=6.5
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-5.28125 max=6.5
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=6.5
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=6.5
Linear output=0 dtype=torch.bfloat16 min=-0.6328125 max=0.45703125
CLIPMLP input=0 dtype=torch.bfloat16 min=-33.0 max=44.25
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.6328125 max=0.45703125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-11.6875 max=8.4375
Linear input=0 dtype=torch.bfloat16 min=-11.6875 max=8.4375
Linear output=0 dtype=torch.bfloat16 min=-5.40625 max=5.875
Linear input=0 dtype=torch.bfloat16 min=-11.6875 max=8.4375
Linear output=0 dtype=torch.bfloat16 min=-3.75 max=3.15625
Linear input=0 dtype=torch.bfloat16 min=-11.6875 max=8.4375
Linear output=0 dtype=torch.bfloat16 min=-2.828125 max=2.265625
Linear input=0 dtype=torch.bfloat16 min=-1.5 max=1.078125
Linear output=0 dtype=torch.bfloat16 min=-0.34375 max=0.51953125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.34375 max=0.51953125
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.75 max=39.75
Linear input=0 dtype=torch.bfloat16 min=-34.75 max=39.75
Linear output=0 dtype=torch.bfloat16 min=-5.21875 max=5.625
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-5.21875 max=5.625
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=5.625
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=5.625
Linear output=0 dtype=torch.bfloat16 min=-0.62890625 max=0.494140625
CLIPMLP input=0 dtype=torch.bfloat16 min=-34.75 max=39.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.62890625 max=0.494140625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-9.625 max=9.0625
Linear input=0 dtype=torch.bfloat16 min=-9.625 max=9.0625
Linear output=0 dtype=torch.bfloat16 min=-5.46875 max=5.8125
Linear input=0 dtype=torch.bfloat16 min=-9.625 max=9.0625
Linear output=0 dtype=torch.bfloat16 min=-3.25 max=3.546875
Linear input=0 dtype=torch.bfloat16 min=-9.625 max=9.0625
Linear output=0 dtype=torch.bfloat16 min=-2.046875 max=2.234375
Linear input=0 dtype=torch.bfloat16 min=-1.4375 max=1.7890625
Linear output=0 dtype=torch.bfloat16 min=-0.423828125 max=0.373046875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.423828125 max=0.373046875
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.5 max=39.25
Linear input=0 dtype=torch.bfloat16 min=-36.5 max=39.25
Linear output=0 dtype=torch.bfloat16 min=-6.59375 max=3.21875
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-6.59375 max=3.21875
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=3.203125
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=3.203125
Linear output=0 dtype=torch.bfloat16 min=-1.078125 max=0.5859375
CLIPMLP input=0 dtype=torch.bfloat16 min=-36.5 max=39.25
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.078125 max=0.5859375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-10.0625 max=7.8125
Linear input=0 dtype=torch.bfloat16 min=-10.0625 max=7.8125
Linear output=0 dtype=torch.bfloat16 min=-5.40625 max=5.4375
Linear input=0 dtype=torch.bfloat16 min=-10.0625 max=7.8125
Linear output=0 dtype=torch.bfloat16 min=-4.28125 max=3.828125
Linear input=0 dtype=torch.bfloat16 min=-10.0625 max=7.8125
Linear output=0 dtype=torch.bfloat16 min=-2.53125 max=2.40625
Linear input=0 dtype=torch.bfloat16 min=-1.3359375 max=1.1640625
Linear output=0 dtype=torch.bfloat16 min=-0.41796875 max=0.427734375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.41796875 max=0.427734375
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.5 max=43.75
Linear input=0 dtype=torch.bfloat16 min=-34.5 max=43.75
Linear output=0 dtype=torch.bfloat16 min=-5.59375 max=4.375
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-5.59375 max=4.375
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=4.375
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=4.375
Linear output=0 dtype=torch.bfloat16 min=-0.70703125 max=0.7734375
CLIPMLP input=0 dtype=torch.bfloat16 min=-34.5 max=43.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.70703125 max=0.7734375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-9.8125 max=8.375
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=8.375
Linear output=0 dtype=torch.bfloat16 min=-5.5625 max=4.6875
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=8.375
Linear output=0 dtype=torch.bfloat16 min=-3.171875 max=3.5625
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=8.375
Linear output=0 dtype=torch.bfloat16 min=-2.59375 max=2.640625
Linear input=0 dtype=torch.bfloat16 min=-1.9453125 max=1.9296875
Linear output=0 dtype=torch.bfloat16 min=-1.5234375 max=2.65625
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.5234375 max=2.65625
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-37.75 max=45.0
Linear input=0 dtype=torch.bfloat16 min=-37.75 max=45.0
Linear output=0 dtype=torch.bfloat16 min=-7.03125 max=3.59375
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-7.03125 max=3.59375
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=3.578125
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=3.578125
Linear output=0 dtype=torch.bfloat16 min=-0.62890625 max=2.78125
CLIPMLP input=0 dtype=torch.bfloat16 min=-37.75 max=45.0
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.62890625 max=2.78125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-14.0 max=10.6875
Linear input=0 dtype=torch.bfloat16 min=-14.0 max=10.6875
Linear output=0 dtype=torch.bfloat16 min=-7.4375 max=7.125
Linear input=0 dtype=torch.bfloat16 min=-14.0 max=10.6875
Linear output=0 dtype=torch.bfloat16 min=-3.734375 max=3.28125
Linear input=0 dtype=torch.bfloat16 min=-14.0 max=10.6875
Linear output=0 dtype=torch.bfloat16 min=-2.5 max=2.515625
Linear input=0 dtype=torch.bfloat16 min=-1.484375 max=1.734375
Linear output=0 dtype=torch.bfloat16 min=-0.96484375 max=1.0390625
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.96484375 max=1.0390625
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-53.0 max=58.0
Linear input=0 dtype=torch.bfloat16 min=-53.0 max=58.0
Linear output=0 dtype=torch.bfloat16 min=-11.3125 max=4.15625
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-11.3125 max=4.15625
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=4.15625
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=4.15625
Linear output=0 dtype=torch.bfloat16 min=-0.83203125 max=0.74609375
CLIPMLP input=0 dtype=torch.bfloat16 min=-53.0 max=58.0
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.83203125 max=0.74609375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-15.0 max=9.9375
Linear input=0 dtype=torch.bfloat16 min=-15.0 max=9.9375
Linear output=0 dtype=torch.bfloat16 min=-6.40625 max=6.6875
Linear input=0 dtype=torch.bfloat16 min=-15.0 max=9.9375
Linear output=0 dtype=torch.bfloat16 min=-3.6875 max=4.1875
Linear input=0 dtype=torch.bfloat16 min=-15.0 max=9.9375
Linear output=0 dtype=torch.bfloat16 min=-3.15625 max=2.796875
Linear input=0 dtype=torch.bfloat16 min=-1.7265625 max=1.890625
Linear output=0 dtype=torch.bfloat16 min=-1.1953125 max=1.59375
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.1953125 max=1.59375
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-54.0 max=55.0
Linear input=0 dtype=torch.bfloat16 min=-54.0 max=55.0
Linear output=0 dtype=torch.bfloat16 min=-7.03125 max=5.53125
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-7.03125 max=5.53125
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=5.53125
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=5.53125
Linear output=0 dtype=torch.bfloat16 min=-4.0625 max=1.5078125
CLIPMLP input=0 dtype=torch.bfloat16 min=-54.0 max=55.0
CLIPMLP output=0 dtype=torch.bfloat16 min=-4.0625 max=1.5078125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-28.0 max=33.0
Linear input=0 dtype=torch.bfloat16 min=-5.34375 max=7.40625
Linear output=0 dtype=torch.bfloat16 min=-5.34375 max=7.40625
CLIPTextModelWithProjection input=0 dtype=torch.int64 min=49406 max=49407
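
Throughout these CLIP traces, input=2 of every CLIPEncoderLayer reports min=-3.3895313892515355e+38 max=0.0: that is exactly the most negative representable bfloat16 value, i.e. the additive causal attention mask. The trace below (GELUActivation in place of QuickGELUActivation, token ids reaching down to pad id 0) appears to be SD3's second, larger CLIP text encoder. Verifying the mask constant:

import torch
print(torch.finfo(torch.bfloat16).min)  # -3.3895313892515355e+38
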
Embedding input=0 dtype=torch.int64 min=0 max=49407
Embedding output=0 dtype=torch.bfloat16 min=-0.6875 max=0.0693359375
Embedding input=0 dtype=torch.int64 min=0 max=76
Embedding output=0 dtype=torch.bfloat16 min=-0.6875 max=0.130859375
CLIPTextEmbeddings output=0 dtype=torch.bfloat16 min=-1.375 max=0.13671875
LayerNorm input=0 dtype=torch.bfloat16 min=-1.375 max=0.13671875
LayerNorm output=0 dtype=torch.bfloat16 min=-12.75 max=9.9375
Linear input=0 dtype=torch.bfloat16 min=-12.75 max=9.9375
Linear output=0 dtype=torch.bfloat16 min=-9.75 max=9.125
Linear input=0 dtype=torch.bfloat16 min=-12.75 max=9.9375
Linear output=0 dtype=torch.bfloat16 min=-6.6875 max=6.3125
Linear input=0 dtype=torch.bfloat16 min=-12.75 max=9.9375
Linear output=0 dtype=torch.bfloat16 min=-3.796875 max=4.28125
Linear input=0 dtype=torch.bfloat16 min=-2.875 max=3.453125
Linear output=0 dtype=torch.bfloat16 min=-0.67578125 max=0.63671875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.67578125 max=0.63671875
LayerNorm input=0 dtype=torch.bfloat16 min=-1.40625 max=0.6328125
LayerNorm output=0 dtype=torch.bfloat16 min=-33.5 max=8.75
Linear input=0 dtype=torch.bfloat16 min=-33.5 max=8.75
Linear output=0 dtype=torch.bfloat16 min=-10.875 max=9.5
GELUActivation input=0 dtype=torch.bfloat16 min=-10.875 max=9.5
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.5
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.5
Linear output=0 dtype=torch.bfloat16 min=-69.0 max=17.25
CLIPMLP input=0 dtype=torch.bfloat16 min=-33.5 max=8.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-69.0 max=17.25
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-1.375 max=0.13671875
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.0 max=17.375
LayerNorm input=0 dtype=torch.bfloat16 min=-69.0 max=17.375
LayerNorm output=0 dtype=torch.bfloat16 min=-14.9375 max=10.875
Linear input=0 dtype=torch.bfloat16 min=-14.9375 max=10.875
Linear output=0 dtype=torch.bfloat16 min=-11.1875 max=10.9375
Linear input=0 dtype=torch.bfloat16 min=-14.9375 max=10.875
Linear output=0 dtype=torch.bfloat16 min=-8.75 max=9.5625
Linear input=0 dtype=torch.bfloat16 min=-14.9375 max=10.875
Linear output=0 dtype=torch.bfloat16 min=-5.625 max=3.8125
Linear input=0 dtype=torch.bfloat16 min=-2.953125 max=2.390625
Linear output=0 dtype=torch.bfloat16 min=-0.3125 max=0.291015625
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.3125 max=0.291015625
LayerNorm input=0 dtype=torch.bfloat16 min=-69.0 max=17.375
LayerNorm output=0 dtype=torch.bfloat16 min=-9.875 max=10.5625
Linear input=0 dtype=torch.bfloat16 min=-9.875 max=10.5625
Linear output=0 dtype=torch.bfloat16 min=-8.3125 max=5.21875
GELUActivation input=0 dtype=torch.bfloat16 min=-8.3125 max=5.21875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.21875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.21875
Linear output=0 dtype=torch.bfloat16 min=-0.6171875 max=0.69921875
CLIPMLP input=0 dtype=torch.bfloat16 min=-9.875 max=10.5625
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.6171875 max=0.69921875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.0 max=17.375
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-68.5 max=17.625
LayerNorm input=0 dtype=torch.bfloat16 min=-68.5 max=17.625
LayerNorm output=0 dtype=torch.bfloat16 min=-15.0625 max=8.25
Linear input=0 dtype=torch.bfloat16 min=-15.0625 max=8.25
Linear output=0 dtype=torch.bfloat16 min=-3.65625 max=3.484375
Linear input=0 dtype=torch.bfloat16 min=-15.0625 max=8.25
Linear output=0 dtype=torch.bfloat16 min=-3.5 max=3.6875
Linear input=0 dtype=torch.bfloat16 min=-15.0625 max=8.25
Linear output=0 dtype=torch.bfloat16 min=-2.390625 max=2.59375
Linear input=0 dtype=torch.bfloat16 min=-1.8828125 max=1.7421875
Linear output=0 dtype=torch.bfloat16 min=-2.21875 max=2.453125
CLIPAttention output=0 dtype=torch.bfloat16 min=-2.21875 max=2.453125
LayerNorm input=0 dtype=torch.bfloat16 min=-68.5 max=18.375
LayerNorm output=0 dtype=torch.bfloat16 min=-5.8125 max=7.15625
Linear input=0 dtype=torch.bfloat16 min=-5.8125 max=7.15625
Linear output=0 dtype=torch.bfloat16 min=-9.9375 max=3.703125
GELUActivation input=0 dtype=torch.bfloat16 min=-9.9375 max=3.703125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.703125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.703125
Linear output=0 dtype=torch.bfloat16 min=-0.8359375 max=0.443359375
CLIPMLP input=0 dtype=torch.bfloat16 min=-5.8125 max=7.15625
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.8359375 max=0.443359375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-68.5 max=17.625
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.5 max=18.0
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=18.0
LayerNorm output=0 dtype=torch.bfloat16 min=-17.125 max=16.25
Linear input=0 dtype=torch.bfloat16 min=-17.125 max=16.25
Linear output=0 dtype=torch.bfloat16 min=-5.375 max=4.9375
Linear input=0 dtype=torch.bfloat16 min=-17.125 max=16.25
Linear output=0 dtype=torch.bfloat16 min=-6.1875 max=6.625
Linear input=0 dtype=torch.bfloat16 min=-17.125 max=16.25
Linear output=0 dtype=torch.bfloat16 min=-4.09375 max=3.625
Linear input=0 dtype=torch.bfloat16 min=-3.515625 max=1.7734375
Linear output=0 dtype=torch.bfloat16 min=-0.7421875 max=0.25
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.7421875 max=0.25
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=18.0
LayerNorm output=0 dtype=torch.bfloat16 min=-5.875 max=7.3125
Linear input=0 dtype=torch.bfloat16 min=-5.875 max=7.3125
Linear output=0 dtype=torch.bfloat16 min=-6.9375 max=5.21875
GELUActivation input=0 dtype=torch.bfloat16 min=-6.9375 max=5.21875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.21875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.21875
Linear output=0 dtype=torch.bfloat16 min=-0.474609375 max=1.0078125
CLIPMLP input=0 dtype=torch.bfloat16 min=-5.875 max=7.3125
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.474609375 max=1.0078125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.5 max=18.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.5 max=17.5
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.5
LayerNorm output=0 dtype=torch.bfloat16 min=-16.625 max=16.625
Linear input=0 dtype=torch.bfloat16 min=-16.625 max=16.625
Linear output=0 dtype=torch.bfloat16 min=-7.65625 max=8.4375
Linear input=0 dtype=torch.bfloat16 min=-16.625 max=16.625
Linear output=0 dtype=torch.bfloat16 min=-6.46875 max=5.4375
Linear input=0 dtype=torch.bfloat16 min=-16.625 max=16.625
Linear output=0 dtype=torch.bfloat16 min=-4.25 max=5.53125
Linear input=0 dtype=torch.bfloat16 min=-2.5 max=1.875
Linear output=0 dtype=torch.bfloat16 min=-1.203125 max=0.27734375
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.203125 max=0.27734375
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.5
LayerNorm output=0 dtype=torch.bfloat16 min=-7.53125 max=10.625
Linear input=0 dtype=torch.bfloat16 min=-7.53125 max=10.625
Linear output=0 dtype=torch.bfloat16 min=-7.6875 max=3.546875
GELUActivation input=0 dtype=torch.bfloat16 min=-7.6875 max=3.546875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.546875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.546875
Linear output=0 dtype=torch.bfloat16 min=-0.4453125 max=0.9140625
CLIPMLP input=0 dtype=torch.bfloat16 min=-7.53125 max=10.625
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.4453125 max=0.9140625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.5 max=17.5
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.5 max=17.0
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.0
LayerNorm output=0 dtype=torch.bfloat16 min=-20.0 max=18.375
Linear input=0 dtype=torch.bfloat16 min=-20.0 max=18.375
Linear output=0 dtype=torch.bfloat16 min=-8.125 max=9.3125
Linear input=0 dtype=torch.bfloat16 min=-20.0 max=18.375
Linear output=0 dtype=torch.bfloat16 min=-5.40625 max=7.5625
Linear input=0 dtype=torch.bfloat16 min=-20.0 max=18.375
Linear output=0 dtype=torch.bfloat16 min=-6.28125 max=6.28125
Linear input=0 dtype=torch.bfloat16 min=-5.75 max=5.75
Linear output=0 dtype=torch.bfloat16 min=-0.86328125 max=0.734375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.86328125 max=0.734375
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.0
LayerNorm output=0 dtype=torch.bfloat16 min=-9.6875 max=10.25
Linear input=0 dtype=torch.bfloat16 min=-9.6875 max=10.25
Linear output=0 dtype=torch.bfloat16 min=-6.78125 max=5.59375
GELUActivation input=0 dtype=torch.bfloat16 min=-6.78125 max=5.59375
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.59375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.59375
Linear output=0 dtype=torch.bfloat16 min=-0.921875 max=1.7734375
CLIPMLP input=0 dtype=torch.bfloat16 min=-9.6875 max=10.25
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.921875 max=1.7734375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.5 max=17.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.5 max=17.125
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.125
LayerNorm output=0 dtype=torch.bfloat16 min=-16.375 max=20.75
Linear input=0 dtype=torch.bfloat16 min=-16.375 max=20.75
Linear output=0 dtype=torch.bfloat16 min=-6.4375 max=6.90625
Linear input=0 dtype=torch.bfloat16 min=-16.375 max=20.75
Linear output=0 dtype=torch.bfloat16 min=-6.4375 max=7.34375
Linear input=0 dtype=torch.bfloat16 min=-16.375 max=20.75
Linear output=0 dtype=torch.bfloat16 min=-4.15625 max=11.5
Linear input=0 dtype=torch.bfloat16 min=-2.828125 max=9.75
Linear output=0 dtype=torch.bfloat16 min=-1.296875 max=0.89453125
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.296875 max=0.89453125
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.125
LayerNorm output=0 dtype=torch.bfloat16 min=-8.9375 max=14.8125
Linear input=0 dtype=torch.bfloat16 min=-8.9375 max=14.8125
Linear output=0 dtype=torch.bfloat16 min=-6.375 max=4.03125
GELUActivation input=0 dtype=torch.bfloat16 min=-6.375 max=4.03125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.03125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.03125
Linear output=0 dtype=torch.bfloat16 min=-0.98828125 max=1.015625
CLIPMLP input=0 dtype=torch.bfloat16 min=-8.9375 max=14.8125
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.98828125 max=1.015625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.5 max=17.125
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.5 max=17.25
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.25
LayerNorm output=0 dtype=torch.bfloat16 min=-16.125 max=21.875
Linear input=0 dtype=torch.bfloat16 min=-16.125 max=21.875
Linear output=0 dtype=torch.bfloat16 min=-6.78125 max=7.5625
Linear input=0 dtype=torch.bfloat16 min=-16.125 max=21.875
Linear output=0 dtype=torch.bfloat16 min=-6.25 max=6.6875
Linear input=0 dtype=torch.bfloat16 min=-16.125 max=21.875
Linear output=0 dtype=torch.bfloat16 min=-4.1875 max=3.203125
Linear input=0 dtype=torch.bfloat16 min=-4.0 max=2.703125
Linear output=0 dtype=torch.bfloat16 min=-0.84765625 max=0.85546875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.84765625 max=0.85546875
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.25
LayerNorm output=0 dtype=torch.bfloat16 min=-11.5625 max=7.125
Linear input=0 dtype=torch.bfloat16 min=-11.5625 max=7.125
Linear output=0 dtype=torch.bfloat16 min=-7.84375 max=5.125
GELUActivation input=0 dtype=torch.bfloat16 min=-7.84375 max=5.125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.125
Linear output=0 dtype=torch.bfloat16 min=-0.79296875 max=1.34375
CLIPMLP input=0 dtype=torch.bfloat16 min=-11.5625 max=7.125
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.79296875 max=1.34375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.5 max=17.25
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.5 max=17.25
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.25
LayerNorm output=0 dtype=torch.bfloat16 min=-9.8125 max=18.5
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=18.5
Linear output=0 dtype=torch.bfloat16 min=-6.34375 max=6.5
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=18.5
Linear output=0 dtype=torch.bfloat16 min=-4.5625 max=5.46875
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=18.5
Linear output=0 dtype=torch.bfloat16 min=-2.8125 max=3.359375
Linear input=0 dtype=torch.bfloat16 min=-2.453125 max=2.734375
Linear output=0 dtype=torch.bfloat16 min=-1.1171875 max=0.80859375
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.1171875 max=0.80859375
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.25
LayerNorm output=0 dtype=torch.bfloat16 min=-5.65625 max=6.21875
Linear input=0 dtype=torch.bfloat16 min=-5.65625 max=6.21875
Linear output=0 dtype=torch.bfloat16 min=-6.90625 max=4.9375
GELUActivation input=0 dtype=torch.bfloat16 min=-6.90625 max=4.9375
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.9375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.9375
Linear output=0 dtype=torch.bfloat16 min=-0.6640625 max=0.875
CLIPMLP input=0 dtype=torch.bfloat16 min=-5.65625 max=6.21875
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.6640625 max=0.875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.5 max=17.25
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.0 max=17.25
LayerNorm input=0 dtype=torch.bfloat16 min=-69.0 max=17.25
LayerNorm output=0 dtype=torch.bfloat16 min=-8.8125 max=18.375
Linear input=0 dtype=torch.bfloat16 min=-8.8125 max=18.375
Linear output=0 dtype=torch.bfloat16 min=-7.46875 max=7.0625
Linear input=0 dtype=torch.bfloat16 min=-8.8125 max=18.375
Linear output=0 dtype=torch.bfloat16 min=-4.6875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-8.8125 max=18.375
Linear output=0 dtype=torch.bfloat16 min=-3.390625 max=3.859375
Linear input=0 dtype=torch.bfloat16 min=-1.5390625 max=1.6640625
Linear output=0 dtype=torch.bfloat16 min=-0.76171875 max=0.54296875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.76171875 max=0.54296875
LayerNorm input=0 dtype=torch.bfloat16 min=-69.0 max=17.375
LayerNorm output=0 dtype=torch.bfloat16 min=-6.1875 max=8.6875
Linear input=0 dtype=torch.bfloat16 min=-6.1875 max=8.6875
Linear output=0 dtype=torch.bfloat16 min=-7.21875 max=3.46875
GELUActivation input=0 dtype=torch.bfloat16 min=-7.21875 max=3.46875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.46875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.46875
Linear output=0 dtype=torch.bfloat16 min=-1.0546875 max=1.078125
CLIPMLP input=0 dtype=torch.bfloat16 min=-6.1875 max=8.6875
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.0546875 max=1.078125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.0 max=17.25
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.0 max=17.375
LayerNorm input=0 dtype=torch.bfloat16 min=-69.0 max=17.375
LayerNorm output=0 dtype=torch.bfloat16 min=-8.0625 max=20.875
Linear input=0 dtype=torch.bfloat16 min=-8.0625 max=20.875
Linear output=0 dtype=torch.bfloat16 min=-7.25 max=6.5625
Linear input=0 dtype=torch.bfloat16 min=-8.0625 max=20.875
Linear output=0 dtype=torch.bfloat16 min=-5.28125 max=4.46875
Linear input=0 dtype=torch.bfloat16 min=-8.0625 max=20.875
Linear output=0 dtype=torch.bfloat16 min=-3.21875 max=2.71875
Linear input=0 dtype=torch.bfloat16 min=-2.703125 max=1.9765625
Linear output=0 dtype=torch.bfloat16 min=-1.109375 max=0.95703125
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.109375 max=0.95703125
LayerNorm input=0 dtype=torch.bfloat16 min=-69.0 max=17.5
LayerNorm output=0 dtype=torch.bfloat16 min=-7.21875 max=6.90625
Linear input=0 dtype=torch.bfloat16 min=-7.21875 max=6.90625
Linear output=0 dtype=torch.bfloat16 min=-7.03125 max=2.75
GELUActivation input=0 dtype=torch.bfloat16 min=-7.03125 max=2.75
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.734375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.734375
Linear output=0 dtype=torch.bfloat16 min=-0.451171875 max=0.69140625
CLIPMLP input=0 dtype=torch.bfloat16 min=-7.21875 max=6.90625
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.451171875 max=0.69140625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.0 max=17.375
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-68.5 max=17.5
LayerNorm input=0 dtype=torch.bfloat16 min=-68.5 max=17.5
LayerNorm output=0 dtype=torch.bfloat16 min=-7.46875 max=17.625
Linear input=0 dtype=torch.bfloat16 min=-7.46875 max=17.625
Linear output=0 dtype=torch.bfloat16 min=-6.40625 max=6.28125
Linear input=0 dtype=torch.bfloat16 min=-7.46875 max=17.625
Linear output=0 dtype=torch.bfloat16 min=-5.34375 max=7.21875
Linear input=0 dtype=torch.bfloat16 min=-7.46875 max=17.625
Linear output=0 dtype=torch.bfloat16 min=-2.59375 max=3.453125
Linear input=0 dtype=torch.bfloat16 min=-1.65625 max=1.734375
Linear output=0 dtype=torch.bfloat16 min=-0.61328125 max=0.54296875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.61328125 max=0.54296875
LayerNorm input=0 dtype=torch.bfloat16 min=-68.5 max=17.5
LayerNorm output=0 dtype=torch.bfloat16 min=-7.1875 max=7.125
Linear input=0 dtype=torch.bfloat16 min=-7.1875 max=7.125
Linear output=0 dtype=torch.bfloat16 min=-7.53125 max=2.015625
GELUActivation input=0 dtype=torch.bfloat16 min=-7.53125 max=2.015625
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=1.96875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=1.96875
Linear output=0 dtype=torch.bfloat16 min=-0.52734375 max=1.4296875
CLIPMLP input=0 dtype=torch.bfloat16 min=-7.1875 max=7.125
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.52734375 max=1.4296875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-68.5 max=17.5
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-68.0 max=17.5
LayerNorm input=0 dtype=torch.bfloat16 min=-68.0 max=17.5
LayerNorm output=0 dtype=torch.bfloat16 min=-9.3125 max=18.625
Linear input=0 dtype=torch.bfloat16 min=-9.3125 max=18.625
Linear output=0 dtype=torch.bfloat16 min=-6.625 max=6.625
Linear input=0 dtype=torch.bfloat16 min=-9.3125 max=18.625
Linear output=0 dtype=torch.bfloat16 min=-4.6875 max=4.96875
Linear input=0 dtype=torch.bfloat16 min=-9.3125 max=18.625
Linear output=0 dtype=torch.bfloat16 min=-2.625 max=2.234375
Linear input=0 dtype=torch.bfloat16 min=-1.65625 max=1.3671875
Linear output=0 dtype=torch.bfloat16 min=-0.71875 max=0.5703125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.71875 max=0.5703125
LayerNorm input=0 dtype=torch.bfloat16 min=-68.0 max=17.625
LayerNorm output=0 dtype=torch.bfloat16 min=-7.46875 max=8.1875
Linear input=0 dtype=torch.bfloat16 min=-7.46875 max=8.1875
Linear output=0 dtype=torch.bfloat16 min=-7.8125 max=2.734375
GELUActivation input=0 dtype=torch.bfloat16 min=-7.8125 max=2.734375
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.71875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.71875
Linear output=0 dtype=torch.bfloat16 min=-0.35546875 max=0.65625
CLIPMLP input=0 dtype=torch.bfloat16 min=-7.46875 max=8.1875
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.35546875 max=0.65625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-68.0 max=17.5
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=17.625
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=17.625
LayerNorm output=0 dtype=torch.bfloat16 min=-9.8125 max=15.8125
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=15.8125
Linear output=0 dtype=torch.bfloat16 min=-5.53125 max=7.5
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=15.8125
Linear output=0 dtype=torch.bfloat16 min=-5.875 max=6.3125
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=15.8125
Linear output=0 dtype=torch.bfloat16 min=-2.734375 max=2.625
Linear input=0 dtype=torch.bfloat16 min=-1.71875 max=1.453125
Linear output=0 dtype=torch.bfloat16 min=-0.470703125 max=0.376953125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.470703125 max=0.376953125
LayerNorm input=0 dtype=torch.bfloat16 min=-67.0 max=17.625
LayerNorm output=0 dtype=torch.bfloat16 min=-8.3125 max=7.375
Linear input=0 dtype=torch.bfloat16 min=-8.3125 max=7.375
Linear output=0 dtype=torch.bfloat16 min=-8.125 max=1.6875
GELUActivation input=0 dtype=torch.bfloat16 min=-8.125 max=1.6875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=1.609375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=1.609375
Linear output=0 dtype=torch.bfloat16 min=-1.046875 max=0.84375
CLIPMLP input=0 dtype=torch.bfloat16 min=-8.3125 max=7.375
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.046875 max=0.84375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=17.625
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=17.625
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=17.625
LayerNorm output=0 dtype=torch.bfloat16 min=-10.8125 max=15.375
Linear input=0 dtype=torch.bfloat16 min=-10.8125 max=15.375
Linear output=0 dtype=torch.bfloat16 min=-6.25 max=6.125
Linear input=0 dtype=torch.bfloat16 min=-10.8125 max=15.375
Linear output=0 dtype=torch.bfloat16 min=-6.65625 max=6.15625
Linear input=0 dtype=torch.bfloat16 min=-10.8125 max=15.375
Linear output=0 dtype=torch.bfloat16 min=-2.703125 max=3.1875
Linear input=0 dtype=torch.bfloat16 min=-1.0546875 max=0.8359375
Linear output=0 dtype=torch.bfloat16 min=-0.6015625 max=0.828125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.6015625 max=0.828125
LayerNorm input=0 dtype=torch.bfloat16 min=-67.0 max=17.625
LayerNorm output=0 dtype=torch.bfloat16 min=-14.6875 max=13.5
Linear input=0 dtype=torch.bfloat16 min=-14.6875 max=13.5
Linear output=0 dtype=torch.bfloat16 min=-6.84375 max=2.3125
GELUActivation input=0 dtype=torch.bfloat16 min=-6.84375 max=2.3125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.28125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.28125
Linear output=0 dtype=torch.bfloat16 min=-2.03125 max=0.55859375
CLIPMLP input=0 dtype=torch.bfloat16 min=-14.6875 max=13.5
CLIPMLP output=0 dtype=torch.bfloat16 min=-2.03125 max=0.55859375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=17.625
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-68.0 max=17.875
LayerNorm input=0 dtype=torch.bfloat16 min=-68.0 max=17.875
LayerNorm output=0 dtype=torch.bfloat16 min=-9.6875 max=19.0
Linear input=0 dtype=torch.bfloat16 min=-9.6875 max=19.0
Linear output=0 dtype=torch.bfloat16 min=-5.34375 max=6.25
Linear input=0 dtype=torch.bfloat16 min=-9.6875 max=19.0
Linear output=0 dtype=torch.bfloat16 min=-6.0625 max=5.8125
Linear input=0 dtype=torch.bfloat16 min=-9.6875 max=19.0
Linear output=0 dtype=torch.bfloat16 min=-2.28125 max=2.03125
Linear input=0 dtype=torch.bfloat16 min=-1.4375 max=1.296875
Linear output=0 dtype=torch.bfloat16 min=-0.419921875 max=0.9609375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.419921875 max=0.9609375
LayerNorm input=0 dtype=torch.bfloat16 min=-67.0 max=17.875
LayerNorm output=0 dtype=torch.bfloat16 min=-13.8125 max=8.75
Linear input=0 dtype=torch.bfloat16 min=-13.8125 max=8.75
Linear output=0 dtype=torch.bfloat16 min=-6.53125 max=6.03125
GELUActivation input=0 dtype=torch.bfloat16 min=-6.53125 max=6.03125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=6.03125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.03125
Linear output=0 dtype=torch.bfloat16 min=-1.2421875 max=1.9921875
CLIPMLP input=0 dtype=torch.bfloat16 min=-13.8125 max=8.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.2421875 max=1.9921875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-68.0 max=17.875
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=18.0
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=18.0
LayerNorm output=0 dtype=torch.bfloat16 min=-11.9375 max=23.125
Linear input=0 dtype=torch.bfloat16 min=-11.9375 max=23.125
Linear output=0 dtype=torch.bfloat16 min=-5.53125 max=6.21875
Linear input=0 dtype=torch.bfloat16 min=-11.9375 max=23.125
Linear output=0 dtype=torch.bfloat16 min=-6.59375 max=7.25
Linear input=0 dtype=torch.bfloat16 min=-11.9375 max=23.125
Linear output=0 dtype=torch.bfloat16 min=-2.28125 max=2.828125
Linear input=0 dtype=torch.bfloat16 min=-0.6875 max=1.0078125
Linear output=0 dtype=torch.bfloat16 min=-0.439453125 max=0.43359375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.439453125 max=0.43359375
LayerNorm input=0 dtype=torch.bfloat16 min=-67.0 max=18.0
LayerNorm output=0 dtype=torch.bfloat16 min=-14.5625 max=8.25
Linear input=0 dtype=torch.bfloat16 min=-14.5625 max=8.25
Linear output=0 dtype=torch.bfloat16 min=-8.25 max=1.25
GELUActivation input=0 dtype=torch.bfloat16 min=-8.25 max=1.25
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=1.1171875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=1.1171875
Linear output=0 dtype=torch.bfloat16 min=-1.0390625 max=0.380859375
CLIPMLP input=0 dtype=torch.bfloat16 min=-14.5625 max=8.25
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.0390625 max=0.380859375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=18.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=18.125
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=18.125
LayerNorm output=0 dtype=torch.bfloat16 min=-11.1875 max=22.125
Linear input=0 dtype=torch.bfloat16 min=-11.1875 max=22.125
Linear output=0 dtype=torch.bfloat16 min=-6.1875 max=4.8125
Linear input=0 dtype=torch.bfloat16 min=-11.1875 max=22.125
Linear output=0 dtype=torch.bfloat16 min=-8.3125 max=8.1875
Linear input=0 dtype=torch.bfloat16 min=-11.1875 max=22.125
Linear output=0 dtype=torch.bfloat16 min=-2.25 max=2.234375
Linear input=0 dtype=torch.bfloat16 min=-0.55859375 max=0.51953125
Linear output=0 dtype=torch.bfloat16 min=-0.375 max=1.1796875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.375 max=1.1796875
LayerNorm input=0 dtype=torch.bfloat16 min=-66.5 max=18.125
LayerNorm output=0 dtype=torch.bfloat16 min=-12.625 max=9.0625
Linear input=0 dtype=torch.bfloat16 min=-12.625 max=9.0625
Linear output=0 dtype=torch.bfloat16 min=-8.3125 max=0.5546875
GELUActivation input=0 dtype=torch.bfloat16 min=-8.3125 max=0.5546875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=0.39453125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=0.39453125
Linear output=0 dtype=torch.bfloat16 min=-1.390625 max=0.66796875
CLIPMLP input=0 dtype=torch.bfloat16 min=-12.625 max=9.0625
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.390625 max=0.66796875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=18.125
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-68.0 max=18.25
LayerNorm input=0 dtype=torch.bfloat16 min=-68.0 max=18.25
LayerNorm output=0 dtype=torch.bfloat16 min=-11.8125 max=23.0
Linear input=0 dtype=torch.bfloat16 min=-11.8125 max=23.0
Linear output=0 dtype=torch.bfloat16 min=-5.625 max=5.25
Linear input=0 dtype=torch.bfloat16 min=-11.8125 max=23.0
Linear output=0 dtype=torch.bfloat16 min=-7.46875 max=7.125
Linear input=0 dtype=torch.bfloat16 min=-11.8125 max=23.0
Linear output=0 dtype=torch.bfloat16 min=-2.515625 max=2.359375
Linear input=0 dtype=torch.bfloat16 min=-1.1015625 max=1.0390625
Linear output=0 dtype=torch.bfloat16 min=-0.6328125 max=1.1328125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.6328125 max=1.1328125
LayerNorm input=0 dtype=torch.bfloat16 min=-67.0 max=18.25
LayerNorm output=0 dtype=torch.bfloat16 min=-12.3125 max=8.75
Linear input=0 dtype=torch.bfloat16 min=-12.3125 max=8.75
Linear output=0 dtype=torch.bfloat16 min=-9.0 max=2.03125
GELUActivation input=0 dtype=torch.bfloat16 min=-9.0 max=2.03125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=1.9921875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=1.9921875
Linear output=0 dtype=torch.bfloat16 min=-1.2421875 max=0.53125
CLIPMLP input=0 dtype=torch.bfloat16 min=-12.3125 max=8.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.2421875 max=0.53125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-68.0 max=18.25
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=18.375
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=18.375
LayerNorm output=0 dtype=torch.bfloat16 min=-14.0 max=22.0
Linear input=0 dtype=torch.bfloat16 min=-14.0 max=22.0
Linear output=0 dtype=torch.bfloat16 min=-7.03125 max=5.84375
Linear input=0 dtype=torch.bfloat16 min=-14.0 max=22.0
Linear output=0 dtype=torch.bfloat16 min=-7.125 max=8.25
Linear input=0 dtype=torch.bfloat16 min=-14.0 max=22.0
Linear output=0 dtype=torch.bfloat16 min=-2.1875 max=2.125
Linear input=0 dtype=torch.bfloat16 min=-0.83203125 max=0.88671875
Linear output=0 dtype=torch.bfloat16 min=-0.486328125 max=1.5859375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.486328125 max=1.5859375
LayerNorm input=0 dtype=torch.bfloat16 min=-66.0 max=18.375
LayerNorm output=0 dtype=torch.bfloat16 min=-13.1875 max=11.1875
Linear input=0 dtype=torch.bfloat16 min=-13.1875 max=11.1875
Linear output=0 dtype=torch.bfloat16 min=-8.5625 max=3.46875
GELUActivation input=0 dtype=torch.bfloat16 min=-8.5625 max=3.46875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.46875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.46875
Linear output=0 dtype=torch.bfloat16 min=-1.8671875 max=0.765625
CLIPMLP input=0 dtype=torch.bfloat16 min=-13.1875 max=11.1875
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.8671875 max=0.765625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=18.375
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=18.5
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=18.5
LayerNorm output=0 dtype=torch.bfloat16 min=-18.625 max=19.875
Linear input=0 dtype=torch.bfloat16 min=-18.625 max=19.875
Linear output=0 dtype=torch.bfloat16 min=-6.3125 max=7.53125
Linear input=0 dtype=torch.bfloat16 min=-18.625 max=19.875
Linear output=0 dtype=torch.bfloat16 min=-9.5 max=8.0
Linear input=0 dtype=torch.bfloat16 min=-18.625 max=19.875
Linear output=0 dtype=torch.bfloat16 min=-2.5625 max=3.703125
Linear input=0 dtype=torch.bfloat16 min=-0.75390625 max=0.9765625
Linear output=0 dtype=torch.bfloat16 min=-0.609375 max=2.4375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.609375 max=2.4375
LayerNorm input=0 dtype=torch.bfloat16 min=-65.0 max=18.625
LayerNorm output=0 dtype=torch.bfloat16 min=-29.5 max=16.625
Linear input=0 dtype=torch.bfloat16 min=-29.5 max=16.625
Linear output=0 dtype=torch.bfloat16 min=-6.1875 max=3.0625
GELUActivation input=0 dtype=torch.bfloat16 min=-6.1875 max=3.0625
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.0625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.0625
Linear output=0 dtype=torch.bfloat16 min=-3.15625 max=0.66796875
CLIPMLP input=0 dtype=torch.bfloat16 min=-29.5 max=16.625
CLIPMLP output=0 dtype=torch.bfloat16 min=-3.15625 max=0.66796875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=18.5
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=18.75
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-13.3125 max=26.625
Linear input=0 dtype=torch.bfloat16 min=-13.3125 max=26.625
Linear output=0 dtype=torch.bfloat16 min=-5.5625 max=6.5625
Linear input=0 dtype=torch.bfloat16 min=-13.3125 max=26.625
Linear output=0 dtype=torch.bfloat16 min=-10.1875 max=9.5
Linear input=0 dtype=torch.bfloat16 min=-13.3125 max=26.625
Linear output=0 dtype=torch.bfloat16 min=-3.03125 max=2.46875
Linear input=0 dtype=torch.bfloat16 min=-0.74609375 max=0.478515625
Linear output=0 dtype=torch.bfloat16 min=-0.33203125 max=2.21875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.33203125 max=2.21875
LayerNorm input=0 dtype=torch.bfloat16 min=-65.5 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-22.125 max=18.75
Linear input=0 dtype=torch.bfloat16 min=-22.125 max=18.75
Linear output=0 dtype=torch.bfloat16 min=-6.84375 max=2.828125
GELUActivation input=0 dtype=torch.bfloat16 min=-6.84375 max=2.828125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.828125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.828125
Linear output=0 dtype=torch.bfloat16 min=-2.453125 max=0.396484375
CLIPMLP input=0 dtype=torch.bfloat16 min=-22.125 max=18.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-2.453125 max=0.396484375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=18.75
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-66.5 max=18.875
LayerNorm input=0 dtype=torch.bfloat16 min=-66.5 max=18.875
LayerNorm output=0 dtype=torch.bfloat16 min=-13.75 max=27.125
Linear input=0 dtype=torch.bfloat16 min=-13.75 max=27.125
Linear output=0 dtype=torch.bfloat16 min=-5.5625 max=5.90625
Linear input=0 dtype=torch.bfloat16 min=-13.75 max=27.125
Linear output=0 dtype=torch.bfloat16 min=-9.125 max=9.375
Linear input=0 dtype=torch.bfloat16 min=-13.75 max=27.125
Linear output=0 dtype=torch.bfloat16 min=-2.703125 max=3.671875
Linear input=0 dtype=torch.bfloat16 min=-1.015625 max=1.03125
Linear output=0 dtype=torch.bfloat16 min=-0.4453125 max=2.109375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.4453125 max=2.109375
LayerNorm input=0 dtype=torch.bfloat16 min=-64.5 max=18.875
LayerNorm output=0 dtype=torch.bfloat16 min=-11.9375 max=11.75
Linear input=0 dtype=torch.bfloat16 min=-11.9375 max=11.75
Linear output=0 dtype=torch.bfloat16 min=-8.875 max=3.453125
GELUActivation input=0 dtype=torch.bfloat16 min=-8.875 max=3.453125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.453125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.453125
Linear output=0 dtype=torch.bfloat16 min=-2.3125 max=0.7890625
CLIPMLP input=0 dtype=torch.bfloat16 min=-11.9375 max=11.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-2.3125 max=0.7890625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-66.5 max=18.875
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-66.5 max=19.0
LayerNorm input=0 dtype=torch.bfloat16 min=-66.5 max=19.0
LayerNorm output=0 dtype=torch.bfloat16 min=-12.0 max=30.375
Linear input=0 dtype=torch.bfloat16 min=-12.0 max=30.375
Linear output=0 dtype=torch.bfloat16 min=-6.46875 max=6.1875
Linear input=0 dtype=torch.bfloat16 min=-12.0 max=30.375
Linear output=0 dtype=torch.bfloat16 min=-9.5 max=9.375
Linear input=0 dtype=torch.bfloat16 min=-12.0 max=30.375
Linear output=0 dtype=torch.bfloat16 min=-3.171875 max=3.109375
Linear input=0 dtype=torch.bfloat16 min=-0.87109375 max=0.6015625
Linear output=0 dtype=torch.bfloat16 min=-0.58203125 max=1.34375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.58203125 max=1.34375
LayerNorm input=0 dtype=torch.bfloat16 min=-65.0 max=19.0
LayerNorm output=0 dtype=torch.bfloat16 min=-22.5 max=17.875
Linear input=0 dtype=torch.bfloat16 min=-22.5 max=17.875
Linear output=0 dtype=torch.bfloat16 min=-6.40625 max=2.109375
GELUActivation input=0 dtype=torch.bfloat16 min=-6.40625 max=2.109375
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.078125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.078125
Linear output=0 dtype=torch.bfloat16 min=-1.21875 max=0.6875
CLIPMLP input=0 dtype=torch.bfloat16 min=-22.5 max=17.875
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.21875 max=0.6875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-66.5 max=19.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-65.5 max=18.75
LayerNorm input=0 dtype=torch.bfloat16 min=-65.5 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-15.3125 max=25.375
Linear input=0 dtype=torch.bfloat16 min=-15.3125 max=25.375
Linear output=0 dtype=torch.bfloat16 min=-7.34375 max=8.0625
Linear input=0 dtype=torch.bfloat16 min=-15.3125 max=25.375
Linear output=0 dtype=torch.bfloat16 min=-9.8125 max=9.6875
Linear input=0 dtype=torch.bfloat16 min=-15.3125 max=25.375
Linear output=0 dtype=torch.bfloat16 min=-5.125 max=2.890625
Linear input=0 dtype=torch.bfloat16 min=-0.83984375 max=0.8515625
Linear output=0 dtype=torch.bfloat16 min=-0.578125 max=1.2578125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.578125 max=1.2578125
LayerNorm input=0 dtype=torch.bfloat16 min=-64.5 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-10.875 max=12.125
Linear input=0 dtype=torch.bfloat16 min=-10.875 max=12.125
Linear output=0 dtype=torch.bfloat16 min=-8.6875 max=4.125
GELUActivation input=0 dtype=torch.bfloat16 min=-8.6875 max=4.125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.125
Linear output=0 dtype=torch.bfloat16 min=-1.765625 max=0.6953125
CLIPMLP input=0 dtype=torch.bfloat16 min=-10.875 max=12.125
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.765625 max=0.6953125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-65.5 max=18.75
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-65.5 max=18.75
LayerNorm input=0 dtype=torch.bfloat16 min=-65.5 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-17.375 max=28.375
Linear input=0 dtype=torch.bfloat16 min=-17.375 max=28.375
Linear output=0 dtype=torch.bfloat16 min=-6.9375 max=6.96875
Linear input=0 dtype=torch.bfloat16 min=-17.375 max=28.375
Linear output=0 dtype=torch.bfloat16 min=-10.125 max=10.5
Linear input=0 dtype=torch.bfloat16 min=-17.375 max=28.375
Linear output=0 dtype=torch.bfloat16 min=-2.703125 max=2.703125
Linear input=0 dtype=torch.bfloat16 min=-1.0 max=0.80859375
Linear output=0 dtype=torch.bfloat16 min=-0.66796875 max=1.71875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.66796875 max=1.71875
LayerNorm input=0 dtype=torch.bfloat16 min=-64.0 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-11.125 max=12.1875
Linear input=0 dtype=torch.bfloat16 min=-11.125 max=12.1875
Linear output=0 dtype=torch.bfloat16 min=-7.625 max=8.3125
GELUActivation input=0 dtype=torch.bfloat16 min=-7.625 max=8.3125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.3125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=8.3125
Linear output=0 dtype=torch.bfloat16 min=-1.5 max=1.9921875
CLIPMLP input=0 dtype=torch.bfloat16 min=-11.125 max=12.1875
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.5 max=1.9921875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-65.5 max=18.75
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-64.0 max=18.75
LayerNorm input=0 dtype=torch.bfloat16 min=-64.0 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-13.375 max=27.125
Linear input=0 dtype=torch.bfloat16 min=-13.375 max=27.125
Linear output=0 dtype=torch.bfloat16 min=-5.625 max=5.5625
Linear input=0 dtype=torch.bfloat16 min=-13.375 max=27.125
Linear output=0 dtype=torch.bfloat16 min=-9.5625 max=9.5625
Linear input=0 dtype=torch.bfloat16 min=-13.375 max=27.125
Linear output=0 dtype=torch.bfloat16 min=-2.390625 max=3.421875
Linear input=0 dtype=torch.bfloat16 min=-1.484375 max=1.0703125
Linear output=0 dtype=torch.bfloat16 min=-0.78515625 max=0.84375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.78515625 max=0.84375
LayerNorm input=0 dtype=torch.bfloat16 min=-63.75 max=18.875
LayerNorm output=0 dtype=torch.bfloat16 min=-10.125 max=18.625
Linear input=0 dtype=torch.bfloat16 min=-10.125 max=18.625
Linear output=0 dtype=torch.bfloat16 min=-6.875 max=2.71875
GELUActivation input=0 dtype=torch.bfloat16 min=-6.875 max=2.71875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.703125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.703125
Linear output=0 dtype=torch.bfloat16 min=-1.234375 max=1.2734375
CLIPMLP input=0 dtype=torch.bfloat16 min=-10.125 max=18.625
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.234375 max=1.2734375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-64.0 max=18.75
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-63.5 max=18.875
LayerNorm input=0 dtype=torch.bfloat16 min=-63.5 max=18.875
LayerNorm output=0 dtype=torch.bfloat16 min=-17.0 max=21.75
Linear input=0 dtype=torch.bfloat16 min=-17.0 max=21.75
Linear output=0 dtype=torch.bfloat16 min=-5.21875 max=6.6875
Linear input=0 dtype=torch.bfloat16 min=-17.0 max=21.75
Linear output=0 dtype=torch.bfloat16 min=-9.1875 max=8.8125
Linear input=0 dtype=torch.bfloat16 min=-17.0 max=21.75
Linear output=0 dtype=torch.bfloat16 min=-3.015625 max=3.25
Linear input=0 dtype=torch.bfloat16 min=-2.234375 max=1.265625
Linear output=0 dtype=torch.bfloat16 min=-1.609375 max=1.6796875
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.609375 max=1.6796875
LayerNorm input=0 dtype=torch.bfloat16 min=-65.0 max=19.125
LayerNorm output=0 dtype=torch.bfloat16 min=-13.0 max=20.625
Linear input=0 dtype=torch.bfloat16 min=-13.0 max=20.625
Linear output=0 dtype=torch.bfloat16 min=-6.1875 max=4.46875
GELUActivation input=0 dtype=torch.bfloat16 min=-6.1875 max=4.46875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.46875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.46875
Linear output=0 dtype=torch.bfloat16 min=-1.5546875 max=5.03125
CLIPMLP input=0 dtype=torch.bfloat16 min=-13.0 max=20.625
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.5546875 max=5.03125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-63.5 max=18.875
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-60.0 max=19.125
LayerNorm input=0 dtype=torch.bfloat16 min=-60.0 max=19.125
LayerNorm output=0 dtype=torch.bfloat16 min=-13.375 max=26.375
Linear input=0 dtype=torch.bfloat16 min=-13.375 max=26.375
Linear output=0 dtype=torch.bfloat16 min=-4.71875 max=4.375
Linear input=0 dtype=torch.bfloat16 min=-13.375 max=26.375
Linear output=0 dtype=torch.bfloat16 min=-9.0 max=8.5
Linear input=0 dtype=torch.bfloat16 min=-13.375 max=26.375
Linear output=0 dtype=torch.bfloat16 min=-2.53125 max=3.6875
Linear input=0 dtype=torch.bfloat16 min=-1.53125 max=1.28125
Linear output=0 dtype=torch.bfloat16 min=-2.828125 max=0.90234375
CLIPAttention output=0 dtype=torch.bfloat16 min=-2.828125 max=0.90234375
LayerNorm input=0 dtype=torch.bfloat16 min=-62.5 max=19.25
LayerNorm output=0 dtype=torch.bfloat16 min=-22.375 max=12.6875
Linear input=0 dtype=torch.bfloat16 min=-22.375 max=12.6875
Linear output=0 dtype=torch.bfloat16 min=-6.59375 max=3.640625
GELUActivation input=0 dtype=torch.bfloat16 min=-6.59375 max=3.640625
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.640625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.640625
Linear output=0 dtype=torch.bfloat16 min=-1.03125 max=7.21875
CLIPMLP input=0 dtype=torch.bfloat16 min=-22.375 max=12.6875
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.03125 max=7.21875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-60.0 max=19.125
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-59.25 max=19.125
LayerNorm input=0 dtype=torch.bfloat16 min=-59.25 max=19.125
LayerNorm output=0 dtype=torch.bfloat16 min=-11.6875 max=25.875
Linear input=0 dtype=torch.bfloat16 min=-11.6875 max=25.875
Linear output=0 dtype=torch.bfloat16 min=-4.34375 max=4.15625
Linear input=0 dtype=torch.bfloat16 min=-11.6875 max=25.875
Linear output=0 dtype=torch.bfloat16 min=-7.34375 max=7.4375
Linear input=0 dtype=torch.bfloat16 min=-11.6875 max=25.875
Linear output=0 dtype=torch.bfloat16 min=-2.46875 max=2.4375
Linear input=0 dtype=torch.bfloat16 min=-1.2265625 max=1.15625
Linear output=0 dtype=torch.bfloat16 min=-6.46875 max=1.1328125
CLIPAttention output=0 dtype=torch.bfloat16 min=-6.46875 max=1.1328125
LayerNorm input=0 dtype=torch.bfloat16 min=-65.5 max=19.125
LayerNorm output=0 dtype=torch.bfloat16 min=-24.75 max=18.125
Linear input=0 dtype=torch.bfloat16 min=-24.75 max=18.125
Linear output=0 dtype=torch.bfloat16 min=-6.1875 max=2.4375
GELUActivation input=0 dtype=torch.bfloat16 min=-6.1875 max=2.4375
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.421875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.421875
Linear output=0 dtype=torch.bfloat16 min=-1.265625 max=5.90625
CLIPMLP input=0 dtype=torch.bfloat16 min=-24.75 max=18.125
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.265625 max=5.90625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-59.25 max=19.125
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-59.5 max=18.75
LayerNorm input=0 dtype=torch.bfloat16 min=-59.5 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-14.9375 max=22.125
Linear input=0 dtype=torch.bfloat16 min=-14.9375 max=22.125
Linear output=0 dtype=torch.bfloat16 min=-5.4375 max=4.53125
Linear input=0 dtype=torch.bfloat16 min=-14.9375 max=22.125
Linear output=0 dtype=torch.bfloat16 min=-5.71875 max=5.875
Linear input=0 dtype=torch.bfloat16 min=-14.9375 max=22.125
Linear output=0 dtype=torch.bfloat16 min=-3.359375 max=3.53125
Linear input=0 dtype=torch.bfloat16 min=-1.125 max=1.0
Linear output=0 dtype=torch.bfloat16 min=-8.875 max=1.671875
CLIPAttention output=0 dtype=torch.bfloat16 min=-8.875 max=1.671875
LayerNorm input=0 dtype=torch.bfloat16 min=-68.0 max=18.875
LayerNorm output=0 dtype=torch.bfloat16 min=-24.25 max=25.25
Linear input=0 dtype=torch.bfloat16 min=-24.25 max=25.25
Linear output=0 dtype=torch.bfloat16 min=-6.96875 max=4.46875
GELUActivation input=0 dtype=torch.bfloat16 min=-6.96875 max=4.46875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.46875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.46875
Linear output=0 dtype=torch.bfloat16 min=-3.28125 max=14.625
CLIPMLP input=0 dtype=torch.bfloat16 min=-24.25 max=25.25
CLIPMLP output=0 dtype=torch.bfloat16 min=-3.28125 max=14.625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-59.5 max=18.75
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-66.5 max=18.125
LayerNorm input=0 dtype=torch.bfloat16 min=-66.5 max=18.125
LayerNorm output=0 dtype=torch.bfloat16 min=-11.6875 max=35.5
Linear input=0 dtype=torch.bfloat16 min=-11.6875 max=35.5
Linear output=0 dtype=torch.bfloat16 min=-6.5 max=5.8125
Linear input=0 dtype=torch.bfloat16 min=-11.6875 max=35.5
Linear output=0 dtype=torch.bfloat16 min=-7.15625 max=8.25
Linear input=0 dtype=torch.bfloat16 min=-11.6875 max=35.5
Linear output=0 dtype=torch.bfloat16 min=-2.625 max=2.171875
Linear input=0 dtype=torch.bfloat16 min=-1.078125 max=1.1015625
Linear output=0 dtype=torch.bfloat16 min=-10.5 max=2.984375
CLIPAttention output=0 dtype=torch.bfloat16 min=-10.5 max=2.984375
LayerNorm input=0 dtype=torch.bfloat16 min=-72.0 max=18.125
LayerNorm output=0 dtype=torch.bfloat16 min=-15.125 max=18.5
Linear input=0 dtype=torch.bfloat16 min=-15.125 max=18.5
Linear output=0 dtype=torch.bfloat16 min=-6.34375 max=3.59375
GELUActivation input=0 dtype=torch.bfloat16 min=-6.34375 max=3.59375
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.59375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.59375
Linear output=0 dtype=torch.bfloat16 min=-12.9375 max=24.875
CLIPMLP input=0 dtype=torch.bfloat16 min=-15.125 max=18.5
CLIPMLP output=0 dtype=torch.bfloat16 min=-12.9375 max=24.875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-66.5 max=18.125
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=30.125
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=30.125
LayerNorm output=0 dtype=torch.bfloat16 min=-33.25 max=26.25
Linear input=0 dtype=torch.bfloat16 min=-7.6875 max=19.875
Linear output=0 dtype=torch.bfloat16 min=-4.40625 max=3.59375
CLIPTextModelWithProjection input=0 dtype=torch.int64 min=0 max=49407
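
(Note: a dump in this format can be produced with ordinary PyTorch forward hooks. The sketch below is a minimal illustrative assumption, not the script that generated this gist; every name in it is made up.)

import torch
import torch.nn as nn

def _log(module, kind, idx, t):
    # Non-tensor slots (e.g. a None attention_mask) are skipped, which is
    # why some input/output indices are absent in the dump above.
    if torch.is_tensor(t) and t.numel() > 0:
        print(f"{module.__class__.__name__} {kind}={idx} "
              f"dtype={t.dtype} min={t.min().item()} max={t.max().item()}")

def _hook(module, inputs, output):
    for i, t in enumerate(inputs):
        _log(module, "input", i, t)
    outs = output if isinstance(output, tuple) else (output,)
    for i, t in enumerate(outs):
        _log(module, "output", i, t)

def attach_stats_hooks(model: nn.Module):
    # One hook per submodule; a parent's hook fires only after its
    # children return, so container lines (CLIPEncoderLayer, T5Block, ...)
    # print after the lines of their internals, matching this dump.
    return [m.register_forward_hook(_hook) for m in model.modules()]

(Usage would be something like attach_stats_hooks(pipe.text_encoder) for each pipeline component before running inference; pipe here is a placeholder, not taken from the gist.)
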
Embedding input=0 dtype=torch.int64 min=0 max=1
Embedding output=0 dtype=torch.bfloat16 min=-102.0 max=231.0
Dropout input=0 dtype=torch.bfloat16 min=-102.0 max=231.0
Dropout output=0 dtype=torch.bfloat16 min=-102.0 max=231.0
T5LayerNorm input=0 dtype=torch.bfloat16 min=-102.0 max=231.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.109375 max=0.75390625
Linear input=0 dtype=torch.bfloat16 min=-2.109375 max=0.75390625
Linear output=0 dtype=torch.bfloat16 min=-0.5 max=0.490234375
Linear input=0 dtype=torch.bfloat16 min=-2.109375 max=0.75390625
Linear output=0 dtype=torch.bfloat16 min=-4.9375 max=4.53125
Linear input=0 dtype=torch.bfloat16 min=-2.109375 max=0.75390625
Linear output=0 dtype=torch.bfloat16 min=-4.09375 max=4.03125
Embedding input=0 dtype=torch.int64 min=0 max=30
Embedding output=0 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=3 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=4 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=5 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=6 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=7 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=8 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=9 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=10 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=11 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=12 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=13 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=14 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=15 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=16 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=17 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=18 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=19 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=20 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=21 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=22 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=23 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=24 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=25 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=26 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=27 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=28 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=29 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=30 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=31 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=32 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=33 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=34 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=35 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=36 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=37 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=38 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=39 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=40 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=41 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=42 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=43 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=44 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=45 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=46 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=47 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=48 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=49 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=50 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=51 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=52 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=53 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=54 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=55 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=56 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=57 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=58 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=59 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=60 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=61 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=62 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=63 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=64 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=65 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=66 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=67 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=68 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=69 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=70 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=71 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=72 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=73 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=74 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=75 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=76 dtype=torch.bfloat16 min=-47.25 max=11.1875
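
(Note: this Embedding, with int64 bucket inputs in [0, 30], is T5's relative_attention_bias table; its statistics reappear verbatim as T5Attention output=2 below, where the position bias is returned. The 77 output=i lines most likely arise because the logging hook iterates the raw output, and iterating a (query_len, key_len, n_heads) bias tensor runs over its first dimension, i.e. the 77 query positions, rather than over a tuple of distinct outputs.)
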
Linear input=0 dtype=torch.bfloat16 min=-2.3125 max=1.4375
Linear output=0 dtype=torch.bfloat16 min=-34.25 max=33.5
T5Attention input=0 dtype=torch.bfloat16 min=-2.109375 max=0.75390625
T5Attention output=0 dtype=torch.bfloat16 min=-34.25 max=33.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-34.25 max=33.5
Dropout output=0 dtype=torch.bfloat16 min=-34.25 max=33.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-102.0 max=231.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-100.5 max=232.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-100.5 max=232.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.328125 max=2.0625
Linear input=0 dtype=torch.bfloat16 min=-2.328125 max=2.0625
Linear output=0 dtype=torch.bfloat16 min=-4.625 max=6.03125
NewGELUActivation input=0 dtype=torch.bfloat16 min=-4.625 max=6.03125
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=6.03125
Linear input=0 dtype=torch.bfloat16 min=-2.328125 max=2.0625
Linear output=0 dtype=torch.bfloat16 min=-5.46875 max=5.78125
Dropout input=0 dtype=torch.bfloat16 min=-33.0 max=11.75
Dropout output=0 dtype=torch.bfloat16 min=-33.0 max=11.75
Linear input=0 dtype=torch.bfloat16 min=-33.0 max=11.75
Linear output=0 dtype=torch.bfloat16 min=-48.25 max=77.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.328125 max=2.0625
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-48.25 max=77.0
Dropout input=0 dtype=torch.bfloat16 min=-48.25 max=77.0
Dropout output=0 dtype=torch.bfloat16 min=-48.25 max=77.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-100.5 max=232.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-118.0 max=237.0
T5Block input=0 dtype=torch.bfloat16 min=-102.0 max=231.0
T5Block output=0 dtype=torch.bfloat16 min=-118.0 max=237.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-118.0 max=237.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-3.328125 max=5.8125
Linear input=0 dtype=torch.bfloat16 min=-3.328125 max=5.8125
Linear output=0 dtype=torch.bfloat16 min=-0.89453125 max=0.69140625
Linear input=0 dtype=torch.bfloat16 min=-3.328125 max=5.8125
Linear output=0 dtype=torch.bfloat16 min=-6.0 max=5.9375
Linear input=0 dtype=torch.bfloat16 min=-3.328125 max=5.8125
Linear output=0 dtype=torch.bfloat16 min=-3.96875 max=4.03125
Linear input=0 dtype=torch.bfloat16 min=-3.71875 max=4.0
Linear output=0 dtype=torch.bfloat16 min=-107.5 max=127.0
T5Attention input=0 dtype=torch.bfloat16 min=-3.328125 max=5.8125
T5Attention output=0 dtype=torch.bfloat16 min=-107.5 max=127.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-107.5 max=127.0
Dropout output=0 dtype=torch.bfloat16 min=-107.5 max=127.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-118.0 max=237.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-122.0 max=233.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-122.0 max=233.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-4.1875 max=1.625
Linear input=0 dtype=torch.bfloat16 min=-4.1875 max=1.625
Linear output=0 dtype=torch.bfloat16 min=-4.375 max=4.4375
NewGELUActivation input=0 dtype=torch.bfloat16 min=-4.375 max=4.4375
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=4.4375
Linear input=0 dtype=torch.bfloat16 min=-4.1875 max=1.625
Linear output=0 dtype=torch.bfloat16 min=-3.578125 max=4.03125
Dropout input=0 dtype=torch.bfloat16 min=-12.8125 max=10.875
Dropout output=0 dtype=torch.bfloat16 min=-12.8125 max=10.875
Linear input=0 dtype=torch.bfloat16 min=-12.8125 max=10.875
Linear output=0 dtype=torch.bfloat16 min=-422.0 max=392.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-4.1875 max=1.625
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-422.0 max=392.0
Dropout input=0 dtype=torch.bfloat16 min=-422.0 max=392.0
Dropout output=0 dtype=torch.bfloat16 min=-422.0 max=392.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-122.0 max=233.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-504.0 max=460.0
T5Block input=0 dtype=torch.bfloat16 min=-118.0 max=237.0
T5Block output=0 dtype=torch.bfloat16 min=-504.0 max=460.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-504.0 max=460.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-4.0625 max=5.875
Linear input=0 dtype=torch.bfloat16 min=-4.0625 max=5.875
Linear output=0 dtype=torch.bfloat16 min=-0.74609375 max=0.73828125
Linear input=0 dtype=torch.bfloat16 min=-4.0625 max=5.875
Linear output=0 dtype=torch.bfloat16 min=-7.84375 max=9.25
Linear input=0 dtype=torch.bfloat16 min=-4.0625 max=5.875
Linear output=0 dtype=torch.bfloat16 min=-3.4375 max=2.65625
Linear input=0 dtype=torch.bfloat16 min=-2.765625 max=2.65625
Linear output=0 dtype=torch.bfloat16 min=-194.0 max=191.0
T5Attention input=0 dtype=torch.bfloat16 min=-4.0625 max=5.875
T5Attention output=0 dtype=torch.bfloat16 min=-194.0 max=191.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-194.0 max=191.0
Dropout output=0 dtype=torch.bfloat16 min=-194.0 max=191.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-504.0 max=460.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-696.0 max=652.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-696.0 max=652.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-3.953125 max=5.78125
Linear input=0 dtype=torch.bfloat16 min=-3.953125 max=5.78125
Linear output=0 dtype=torch.bfloat16 min=-11.0 max=6.15625
NewGELUActivation input=0 dtype=torch.bfloat16 min=-11.0 max=6.15625
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=6.15625
Linear input=0 dtype=torch.bfloat16 min=-3.953125 max=5.78125
Linear output=0 dtype=torch.bfloat16 min=-7.84375 max=5.5
Dropout input=0 dtype=torch.bfloat16 min=-15.5625 max=12.0
Dropout output=0 dtype=torch.bfloat16 min=-15.5625 max=12.0
Linear input=0 dtype=torch.bfloat16 min=-15.5625 max=12.0
Linear output=0 dtype=torch.bfloat16 min=-114.5 max=117.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-3.953125 max=5.78125
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-114.5 max=117.0
Dropout input=0 dtype=torch.bfloat16 min=-114.5 max=117.0
Dropout output=0 dtype=torch.bfloat16 min=-114.5 max=117.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-696.0 max=652.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-776.0 max=732.0
T5Block input=0 dtype=torch.bfloat16 min=-504.0 max=460.0
T5Block output=0 dtype=torch.bfloat16 min=-776.0 max=732.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-776.0 max=732.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.40625 max=3.125
Linear input=0 dtype=torch.bfloat16 min=-2.40625 max=3.125
Linear output=0 dtype=torch.bfloat16 min=-0.75 max=0.95703125
Linear input=0 dtype=torch.bfloat16 min=-2.40625 max=3.125
Linear output=0 dtype=torch.bfloat16 min=-7.84375 max=7.375
Linear input=0 dtype=torch.bfloat16 min=-2.40625 max=3.125
Linear output=0 dtype=torch.bfloat16 min=-3.515625 max=3.46875
Linear input=0 dtype=torch.bfloat16 min=-3.5 max=3.4375
Linear output=0 dtype=torch.bfloat16 min=-87.0 max=95.0
T5Attention input=0 dtype=torch.bfloat16 min=-2.40625 max=3.125
T5Attention output=0 dtype=torch.bfloat16 min=-87.0 max=95.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-87.0 max=95.0
Dropout output=0 dtype=torch.bfloat16 min=-87.0 max=95.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-776.0 max=732.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-812.0 max=792.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-812.0 max=792.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.5 max=5.03125
Linear input=0 dtype=torch.bfloat16 min=-2.5 max=5.03125
Linear output=0 dtype=torch.bfloat16 min=-9.8125 max=7.96875
NewGELUActivation input=0 dtype=torch.bfloat16 min=-9.8125 max=7.96875
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=7.96875
Linear input=0 dtype=torch.bfloat16 min=-2.5 max=5.03125
Linear output=0 dtype=torch.bfloat16 min=-7.90625 max=6.0
Dropout input=0 dtype=torch.bfloat16 min=-23.25 max=24.375
Dropout output=0 dtype=torch.bfloat16 min=-23.25 max=24.375
Linear input=0 dtype=torch.bfloat16 min=-23.25 max=24.375
Linear output=0 dtype=torch.bfloat16 min=-221.0 max=226.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.5 max=5.03125
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-221.0 max=226.0
Dropout input=0 dtype=torch.bfloat16 min=-221.0 max=226.0
Dropout output=0 dtype=torch.bfloat16 min=-221.0 max=226.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-812.0 max=792.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-880.0 max=860.0
T5Block input=0 dtype=torch.bfloat16 min=-776.0 max=732.0
T5Block output=0 dtype=torch.bfloat16 min=-880.0 max=860.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-880.0 max=860.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.5 max=1.5859375
Linear input=0 dtype=torch.bfloat16 min=-1.5 max=1.5859375
Linear output=0 dtype=torch.bfloat16 min=-0.9140625 max=0.94921875
Linear input=0 dtype=torch.bfloat16 min=-1.5 max=1.5859375
Linear output=0 dtype=torch.bfloat16 min=-6.1875 max=6.53125
Linear input=0 dtype=torch.bfloat16 min=-1.5 max=1.5859375
Linear output=0 dtype=torch.bfloat16 min=-3.328125 max=3.953125
Linear input=0 dtype=torch.bfloat16 min=-3.25 max=3.734375
Linear output=0 dtype=torch.bfloat16 min=-97.5 max=110.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.5 max=1.5859375
T5Attention output=0 dtype=torch.bfloat16 min=-97.5 max=110.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-97.5 max=110.5
Dropout output=0 dtype=torch.bfloat16 min=-97.5 max=110.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-880.0 max=860.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-932.0 max=904.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-932.0 max=904.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.34375 max=2.953125
Linear input=0 dtype=torch.bfloat16 min=-1.34375 max=2.953125
Linear output=0 dtype=torch.bfloat16 min=-10.6875 max=4.78125
NewGELUActivation input=0 dtype=torch.bfloat16 min=-10.6875 max=4.78125
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=4.78125
Linear input=0 dtype=torch.bfloat16 min=-1.34375 max=2.953125
Linear output=0 dtype=torch.bfloat16 min=-7.34375 max=7.21875
Dropout input=0 dtype=torch.bfloat16 min=-35.0 max=25.375
Dropout output=0 dtype=torch.bfloat16 min=-35.0 max=25.375
Linear input=0 dtype=torch.bfloat16 min=-35.0 max=25.375
Linear output=0 dtype=torch.bfloat16 min=-193.0 max=198.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-1.34375 max=2.953125
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-193.0 max=198.0
Dropout input=0 dtype=torch.bfloat16 min=-193.0 max=198.0
Dropout output=0 dtype=torch.bfloat16 min=-193.0 max=198.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-932.0 max=904.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-1000.0 max=976.0
T5Block input=0 dtype=torch.bfloat16 min=-880.0 max=860.0
T5Block output=0 dtype=torch.bfloat16 min=-1000.0 max=976.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-1000.0 max=976.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.1953125 max=1.53125
Linear input=0 dtype=torch.bfloat16 min=-1.1953125 max=1.53125
Linear output=0 dtype=torch.bfloat16 min=-1.015625 max=1.109375
Linear input=0 dtype=torch.bfloat16 min=-1.1953125 max=1.53125
Linear output=0 dtype=torch.bfloat16 min=-7.8125 max=8.4375
Linear input=0 dtype=torch.bfloat16 min=-1.1953125 max=1.53125
Linear output=0 dtype=torch.bfloat16 min=-2.921875 max=2.359375
Linear input=0 dtype=torch.bfloat16 min=-2.84375 max=2.0
Linear output=0 dtype=torch.bfloat16 min=-109.5 max=114.0
T5Attention input=0 dtype=torch.bfloat16 min=-1.1953125 max=1.53125
T5Attention output=0 dtype=torch.bfloat16 min=-109.5 max=114.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-109.5 max=114.0
Dropout output=0 dtype=torch.bfloat16 min=-109.5 max=114.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-1000.0 max=976.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-1080.0 max=1072.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-1080.0 max=1072.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-0.99609375 max=2.296875
Linear input=0 dtype=torch.bfloat16 min=-0.99609375 max=2.296875
Linear output=0 dtype=torch.bfloat16 min=-7.21875 max=4.4375
NewGELUActivation input=0 dtype=torch.bfloat16 min=-7.21875 max=4.4375
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=4.4375
Linear input=0 dtype=torch.bfloat16 min=-0.99609375 max=2.296875
Linear output=0 dtype=torch.bfloat16 min=-4.84375 max=5.28125
Dropout input=0 dtype=torch.bfloat16 min=-7.875 max=12.875
Dropout output=0 dtype=torch.bfloat16 min=-7.875 max=12.875
Linear input=0 dtype=torch.bfloat16 min=-7.875 max=12.875
Linear output=0 dtype=torch.bfloat16 min=-93.5 max=92.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-0.99609375 max=2.296875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-93.5 max=92.0
Dropout input=0 dtype=torch.bfloat16 min=-93.5 max=92.0
Dropout output=0 dtype=torch.bfloat16 min=-93.5 max=92.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-1080.0 max=1072.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-1176.0 max=1168.0
T5Block input=0 dtype=torch.bfloat16 min=-1000.0 max=976.0
T5Block output=0 dtype=torch.bfloat16 min=-1176.0 max=1168.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-1176.0 max=1168.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.1171875 max=1.34375
Linear input=0 dtype=torch.bfloat16 min=-1.1171875 max=1.34375
Linear output=0 dtype=torch.bfloat16 min=-0.7265625 max=0.7109375
Linear input=0 dtype=torch.bfloat16 min=-1.1171875 max=1.34375
Linear output=0 dtype=torch.bfloat16 min=-6.5 max=6.75
Linear input=0 dtype=torch.bfloat16 min=-1.1171875 max=1.34375
Linear output=0 dtype=torch.bfloat16 min=-3.0625 max=3.15625
Linear input=0 dtype=torch.bfloat16 min=-2.296875 max=2.390625
Linear output=0 dtype=torch.bfloat16 min=-111.5 max=82.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.1171875 max=1.34375
T5Attention output=0 dtype=torch.bfloat16 min=-111.5 max=82.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-111.5 max=82.5
Dropout output=0 dtype=torch.bfloat16 min=-111.5 max=82.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-1176.0 max=1168.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-1200.0 max=1216.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-1200.0 max=1216.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-0.85546875 max=1.6640625
Linear input=0 dtype=torch.bfloat16 min=-0.85546875 max=1.6640625
Linear output=0 dtype=torch.bfloat16 min=-5.625 max=3.453125
NewGELUActivation input=0 dtype=torch.bfloat16 min=-5.625 max=3.453125
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=3.453125
Linear input=0 dtype=torch.bfloat16 min=-0.85546875 max=1.6640625
Linear output=0 dtype=torch.bfloat16 min=-4.125 max=3.984375
Dropout input=0 dtype=torch.bfloat16 min=-12.1875 max=8.5
Dropout output=0 dtype=torch.bfloat16 min=-12.1875 max=8.5
Linear input=0 dtype=torch.bfloat16 min=-12.1875 max=8.5
Linear output=0 dtype=torch.bfloat16 min=-135.0 max=167.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-0.85546875 max=1.6640625
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-135.0 max=167.0
Dropout input=0 dtype=torch.bfloat16 min=-135.0 max=167.0
Dropout output=0 dtype=torch.bfloat16 min=-135.0 max=167.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-1200.0 max=1216.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-1256.0 max=1280.0
T5Block input=0 dtype=torch.bfloat16 min=-1176.0 max=1168.0
T5Block output=0 dtype=torch.bfloat16 min=-1256.0 max=1280.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-1256.0 max=1280.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.2265625 max=0.953125
Linear input=0 dtype=torch.bfloat16 min=-1.2265625 max=0.953125
Linear output=0 dtype=torch.bfloat16 min=-0.6328125 max=0.78515625
Linear input=0 dtype=torch.bfloat16 min=-1.2265625 max=0.953125
Linear output=0 dtype=torch.bfloat16 min=-4.9375 max=5.3125
Linear input=0 dtype=torch.bfloat16 min=-1.2265625 max=0.953125
Linear output=0 dtype=torch.bfloat16 min=-2.59375 max=2.484375
Linear input=0 dtype=torch.bfloat16 min=-1.9765625 max=2.125
Linear output=0 dtype=torch.bfloat16 min=-77.0 max=48.0
T5Attention input=0 dtype=torch.bfloat16 min=-1.2265625 max=0.953125
T5Attention output=0 dtype=torch.bfloat16 min=-77.0 max=48.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-77.0 max=48.0
Dropout output=0 dtype=torch.bfloat16 min=-77.0 max=48.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-1256.0 max=1280.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-1272.0 max=1312.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-1272.0 max=1312.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.65625 max=1.3125
Linear input=0 dtype=torch.bfloat16 min=-2.65625 max=1.3125
Linear output=0 dtype=torch.bfloat16 min=-6.15625 max=16.125
NewGELUActivation input=0 dtype=torch.bfloat16 min=-6.15625 max=16.125
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=16.125
Linear input=0 dtype=torch.bfloat16 min=-2.65625 max=1.3125
Linear output=0 dtype=torch.bfloat16 min=-35.75 max=28.0
Dropout input=0 dtype=torch.bfloat16 min=-564.0 max=388.0
Dropout output=0 dtype=torch.bfloat16 min=-564.0 max=388.0
Linear input=0 dtype=torch.bfloat16 min=-564.0 max=388.0
Linear output=0 dtype=torch.bfloat16 min=-12928.0 max=13760.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.65625 max=1.3125
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-12928.0 max=13760.0
Dropout input=0 dtype=torch.bfloat16 min=-12928.0 max=13760.0
Dropout output=0 dtype=torch.bfloat16 min=-12928.0 max=13760.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-1272.0 max=1312.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-13184.0 max=14272.0
T5Block input=0 dtype=torch.bfloat16 min=-1256.0 max=1280.0
T5Block output=0 dtype=torch.bfloat16 min=-13184.0 max=14272.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-13184.0 max=14272.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.1953125 max=1.8984375
Linear input=0 dtype=torch.bfloat16 min=-1.1953125 max=1.8984375
Linear output=0 dtype=torch.bfloat16 min=-0.69140625 max=0.67578125
Linear input=0 dtype=torch.bfloat16 min=-1.1953125 max=1.8984375
Linear output=0 dtype=torch.bfloat16 min=-5.40625 max=5.65625
Linear input=0 dtype=torch.bfloat16 min=-1.1953125 max=1.8984375
Linear output=0 dtype=torch.bfloat16 min=-2.125 max=2.421875
Linear input=0 dtype=torch.bfloat16 min=-1.8828125 max=2.28125
Linear output=0 dtype=torch.bfloat16 min=-72.0 max=53.75
T5Attention input=0 dtype=torch.bfloat16 min=-1.1953125 max=1.8984375
T5Attention output=0 dtype=torch.bfloat16 min=-72.0 max=53.75
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-72.0 max=53.75
Dropout output=0 dtype=torch.bfloat16 min=-72.0 max=53.75
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-13184.0 max=14272.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-13184.0 max=14336.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-13184.0 max=14336.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.46875 max=1.8515625
Linear input=0 dtype=torch.bfloat16 min=-2.46875 max=1.8515625
Linear output=0 dtype=torch.bfloat16 min=-3.953125 max=11.1875
NewGELUActivation input=0 dtype=torch.bfloat16 min=-3.953125 max=11.1875
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=11.1875
Linear input=0 dtype=torch.bfloat16 min=-2.46875 max=1.8515625
Linear output=0 dtype=torch.bfloat16 min=-9.375 max=17.875
Dropout input=0 dtype=torch.bfloat16 min=-31.875 max=87.5
Dropout output=0 dtype=torch.bfloat16 min=-31.875 max=87.5
Linear input=0 dtype=torch.bfloat16 min=-31.875 max=87.5
Linear output=0 dtype=torch.bfloat16 min=-1688.0 max=2112.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.46875 max=1.8515625
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-1688.0 max=2112.0
Dropout input=0 dtype=torch.bfloat16 min=-1688.0 max=2112.0
Dropout output=0 dtype=torch.bfloat16 min=-1688.0 max=2112.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-13184.0 max=14336.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-14848.0 max=16384.0
T5Block input=0 dtype=torch.bfloat16 min=-13184.0 max=14272.0
T5Block output=0 dtype=torch.bfloat16 min=-14848.0 max=16384.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-14848.0 max=16384.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.140625 max=1.4140625
Linear input=0 dtype=torch.bfloat16 min=-1.140625 max=1.4140625
Linear output=0 dtype=torch.bfloat16 min=-0.62109375 max=0.5390625
Linear input=0 dtype=torch.bfloat16 min=-1.140625 max=1.4140625
Linear output=0 dtype=torch.bfloat16 min=-7.4375 max=6.375
Linear input=0 dtype=torch.bfloat16 min=-1.140625 max=1.4140625
Linear output=0 dtype=torch.bfloat16 min=-2.859375 max=3.484375
Linear input=0 dtype=torch.bfloat16 min=-2.71875 max=3.25
Linear output=0 dtype=torch.bfloat16 min=-97.5 max=47.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.140625 max=1.4140625
T5Attention output=0 dtype=torch.bfloat16 min=-97.5 max=47.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-97.5 max=47.5
Dropout output=0 dtype=torch.bfloat16 min=-97.5 max=47.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-14848.0 max=16384.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-14848.0 max=16384.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-14848.0 max=16384.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-3.9375 max=1.7421875
Linear input=0 dtype=torch.bfloat16 min=-3.9375 max=1.7421875
Linear output=0 dtype=torch.bfloat16 min=-4.75 max=15.125
NewGELUActivation input=0 dtype=torch.bfloat16 min=-4.75 max=15.125
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=15.125
Linear input=0 dtype=torch.bfloat16 min=-3.9375 max=1.7421875
Linear output=0 dtype=torch.bfloat16 min=-31.75 max=23.125
Dropout input=0 dtype=torch.bfloat16 min=-82.0 max=72.5
Dropout output=0 dtype=torch.bfloat16 min=-82.0 max=72.5
Linear input=0 dtype=torch.bfloat16 min=-82.0 max=72.5
Linear output=0 dtype=torch.bfloat16 min=-2736.0 max=3232.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-3.9375 max=1.7421875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-2736.0 max=3232.0
Dropout input=0 dtype=torch.bfloat16 min=-2736.0 max=3232.0
Dropout output=0 dtype=torch.bfloat16 min=-2736.0 max=3232.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-14848.0 max=16384.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-17536.0 max=19584.0
T5Block input=0 dtype=torch.bfloat16 min=-14848.0 max=16384.0
T5Block output=0 dtype=torch.bfloat16 min=-17536.0 max=19584.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-17536.0 max=19584.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.0078125 max=1.109375
Linear input=0 dtype=torch.bfloat16 min=-1.0078125 max=1.109375
Linear output=0 dtype=torch.bfloat16 min=-0.6328125 max=0.625
Linear input=0 dtype=torch.bfloat16 min=-1.0078125 max=1.109375
Linear output=0 dtype=torch.bfloat16 min=-5.65625 max=6.28125
Linear input=0 dtype=torch.bfloat16 min=-1.0078125 max=1.109375
Linear output=0 dtype=torch.bfloat16 min=-2.3125 max=2.296875
Linear input=0 dtype=torch.bfloat16 min=-1.8828125 max=1.859375
Linear output=0 dtype=torch.bfloat16 min=-76.0 max=52.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.0078125 max=1.109375
T5Attention output=0 dtype=torch.bfloat16 min=-76.0 max=52.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-76.0 max=52.5
Dropout output=0 dtype=torch.bfloat16 min=-76.0 max=52.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-17536.0 max=19584.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-17536.0 max=19584.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-17536.0 max=19584.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-8.875 max=2.046875
Linear input=0 dtype=torch.bfloat16 min=-8.875 max=2.046875
Linear output=0 dtype=torch.bfloat16 min=-7.34375 max=36.0
NewGELUActivation input=0 dtype=torch.bfloat16 min=-7.34375 max=36.0
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=36.0
Linear input=0 dtype=torch.bfloat16 min=-8.875 max=2.046875
Linear output=0 dtype=torch.bfloat16 min=-162.0 max=109.0
Dropout input=0 dtype=torch.bfloat16 min=-5312.0 max=3520.0
Dropout output=0 dtype=torch.bfloat16 min=-5312.0 max=3520.0
Linear input=0 dtype=torch.bfloat16 min=-5312.0 max=3520.0
Linear output=0 dtype=torch.bfloat16 min=-136192.0 max=142336.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-8.875 max=2.046875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-136192.0 max=142336.0
Dropout input=0 dtype=torch.bfloat16 min=-136192.0 max=142336.0
Dropout output=0 dtype=torch.bfloat16 min=-136192.0 max=142336.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-17536.0 max=19584.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-153600.0 max=161792.0
T5Block input=0 dtype=torch.bfloat16 min=-17536.0 max=19584.0
T5Block output=0 dtype=torch.bfloat16 min=-153600.0 max=161792.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
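By this depth the residual stream has grown to roughly ±1.6e5, and every large value in the trace is a multiple of 1024. That is bfloat16 quantization, not a coincidence: bf16 keeps only 8 significant bits, so between 2^17 = 131072 and 2^18 the representable values are spaced exactly 1024 apart. A quick check:

```python
import torch

# bf16 ulp in [2**17, 2**18) is 2**(17-7) = 1024, so large residuals like
# 153600 and 161792 above are all multiples of 1024.
x = torch.tensor([153600.0, 153600.0 + 512.0, 153600.0 + 1024.0])
print(x.to(torch.bfloat16))
# -> tensor([153600., 153600., 154624.], dtype=torch.bfloat16)
```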
T5LayerNorm input=0 dtype=torch.bfloat16 min=-153600.0 max=161792.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-0.921875 max=1.109375
Linear input=0 dtype=torch.bfloat16 min=-0.921875 max=1.109375
Linear output=0 dtype=torch.bfloat16 min=-0.58203125 max=0.55859375
Linear input=0 dtype=torch.bfloat16 min=-0.921875 max=1.109375
Linear output=0 dtype=torch.bfloat16 min=-6.25 max=6.21875
Linear input=0 dtype=torch.bfloat16 min=-0.921875 max=1.109375
Linear output=0 dtype=torch.bfloat16 min=-3.4375 max=2.765625
Linear input=0 dtype=torch.bfloat16 min=-3.3125 max=2.359375
Linear output=0 dtype=torch.bfloat16 min=-66.0 max=92.5
T5Attention input=0 dtype=torch.bfloat16 min=-0.921875 max=1.109375
T5Attention output=0 dtype=torch.bfloat16 min=-66.0 max=92.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-66.0 max=92.5
Dropout output=0 dtype=torch.bfloat16 min=-66.0 max=92.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-153600.0 max=161792.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-153600.0 max=161792.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-153600.0 max=161792.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.1875 max=1.3671875
Linear input=0 dtype=torch.bfloat16 min=-2.1875 max=1.3671875
Linear output=0 dtype=torch.bfloat16 min=-4.71875 max=6.53125
NewGELUActivation input=0 dtype=torch.bfloat16 min=-4.71875 max=6.53125
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=6.53125
Linear input=0 dtype=torch.bfloat16 min=-2.1875 max=1.3671875
Linear output=0 dtype=torch.bfloat16 min=-7.53125 max=18.5
Dropout input=0 dtype=torch.bfloat16 min=-18.5 max=30.875
Dropout output=0 dtype=torch.bfloat16 min=-18.5 max=30.875
Linear input=0 dtype=torch.bfloat16 min=-18.5 max=30.875
Linear output=0 dtype=torch.bfloat16 min=-660.0 max=888.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.1875 max=1.3671875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-660.0 max=888.0
Dropout input=0 dtype=torch.bfloat16 min=-660.0 max=888.0
Dropout output=0 dtype=torch.bfloat16 min=-660.0 max=888.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-153600.0 max=161792.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-154624.0 max=162816.0
T5Block input=0 dtype=torch.bfloat16 min=-153600.0 max=161792.0
T5Block output=0 dtype=torch.bfloat16 min=-154624.0 max=162816.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-154624.0 max=162816.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.1796875 max=1.4453125
Linear input=0 dtype=torch.bfloat16 min=-1.1796875 max=1.4453125
Linear output=0 dtype=torch.bfloat16 min=-0.6640625 max=0.6484375
Linear input=0 dtype=torch.bfloat16 min=-1.1796875 max=1.4453125
Linear output=0 dtype=torch.bfloat16 min=-7.34375 max=8.6875
Linear input=0 dtype=torch.bfloat16 min=-1.1796875 max=1.4453125
Linear output=0 dtype=torch.bfloat16 min=-2.90625 max=3.84375
Linear input=0 dtype=torch.bfloat16 min=-2.59375 max=3.484375
Linear output=0 dtype=torch.bfloat16 min=-99.5 max=93.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.1796875 max=1.4453125
T5Attention output=0 dtype=torch.bfloat16 min=-99.5 max=93.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-99.5 max=93.5
Dropout output=0 dtype=torch.bfloat16 min=-99.5 max=93.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-154624.0 max=162816.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-154624.0 max=162816.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-154624.0 max=162816.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.359375 max=1.5859375
Linear input=0 dtype=torch.bfloat16 min=-2.359375 max=1.5859375
Linear output=0 dtype=torch.bfloat16 min=-4.5 max=7.6875
NewGELUActivation input=0 dtype=torch.bfloat16 min=-4.5 max=7.6875
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=7.6875
Linear input=0 dtype=torch.bfloat16 min=-2.359375 max=1.5859375
Linear output=0 dtype=torch.bfloat16 min=-48.5 max=52.75
Dropout input=0 dtype=torch.bfloat16 min=-278.0 max=244.0
Dropout output=0 dtype=torch.bfloat16 min=-278.0 max=244.0
Linear input=0 dtype=torch.bfloat16 min=-278.0 max=244.0
Linear output=0 dtype=torch.bfloat16 min=-5056.0 max=7168.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.359375 max=1.5859375
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-5056.0 max=7168.0
Dropout input=0 dtype=torch.bfloat16 min=-5056.0 max=7168.0
Dropout output=0 dtype=torch.bfloat16 min=-5056.0 max=7168.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-154624.0 max=162816.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-159744.0 max=169984.0
T5Block input=0 dtype=torch.bfloat16 min=-154624.0 max=162816.0
T5Block output=0 dtype=torch.bfloat16 min=-159744.0 max=169984.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-159744.0 max=169984.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.171875 max=1.5078125
Linear input=0 dtype=torch.bfloat16 min=-1.171875 max=1.5078125
Linear output=0 dtype=torch.bfloat16 min=-0.58984375 max=0.57421875
Linear input=0 dtype=torch.bfloat16 min=-1.171875 max=1.5078125
Linear output=0 dtype=torch.bfloat16 min=-8.125 max=6.53125
Linear input=0 dtype=torch.bfloat16 min=-1.171875 max=1.5078125
Linear output=0 dtype=torch.bfloat16 min=-3.890625 max=3.046875
Linear input=0 dtype=torch.bfloat16 min=-3.59375 max=2.8125
Linear output=0 dtype=torch.bfloat16 min=-82.5 max=93.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.171875 max=1.5078125
T5Attention output=0 dtype=torch.bfloat16 min=-82.5 max=93.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-82.5 max=93.5
Dropout output=0 dtype=torch.bfloat16 min=-82.5 max=93.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-159744.0 max=169984.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-159744.0 max=169984.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-159744.0 max=169984.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.0625 max=1.640625
Linear input=0 dtype=torch.bfloat16 min=-2.0625 max=1.640625
Linear output=0 dtype=torch.bfloat16 min=-4.625 max=7.5
NewGELUActivation input=0 dtype=torch.bfloat16 min=-4.625 max=7.5
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=7.5
Linear input=0 dtype=torch.bfloat16 min=-2.0625 max=1.640625
Linear output=0 dtype=torch.bfloat16 min=-12.125 max=14.0625
Dropout input=0 dtype=torch.bfloat16 min=-33.75 max=37.5
Dropout output=0 dtype=torch.bfloat16 min=-33.75 max=37.5
Linear input=0 dtype=torch.bfloat16 min=-33.75 max=37.5
Linear output=0 dtype=torch.bfloat16 min=-1288.0 max=1816.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.0625 max=1.640625
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-1288.0 max=1816.0
Dropout input=0 dtype=torch.bfloat16 min=-1288.0 max=1816.0
Dropout output=0 dtype=torch.bfloat16 min=-1288.0 max=1816.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-159744.0 max=169984.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-160768.0 max=172032.0
T5Block input=0 dtype=torch.bfloat16 min=-159744.0 max=169984.0
T5Block output=0 dtype=torch.bfloat16 min=-160768.0 max=172032.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-160768.0 max=172032.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.1015625 max=1.515625
Linear input=0 dtype=torch.bfloat16 min=-1.1015625 max=1.515625
Linear output=0 dtype=torch.bfloat16 min=-0.62890625 max=0.6328125
Linear input=0 dtype=torch.bfloat16 min=-1.1015625 max=1.515625
Linear output=0 dtype=torch.bfloat16 min=-6.1875 max=6.1875
Linear input=0 dtype=torch.bfloat16 min=-1.1015625 max=1.515625
Linear output=0 dtype=torch.bfloat16 min=-3.15625 max=3.34375
Linear input=0 dtype=torch.bfloat16 min=-2.734375 max=2.8125
Linear output=0 dtype=torch.bfloat16 min=-72.0 max=82.0
T5Attention input=0 dtype=torch.bfloat16 min=-1.1015625 max=1.515625
T5Attention output=0 dtype=torch.bfloat16 min=-72.0 max=82.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-72.0 max=82.0
Dropout output=0 dtype=torch.bfloat16 min=-72.0 max=82.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-160768.0 max=172032.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-160768.0 max=172032.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-160768.0 max=172032.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.3125 max=1.4296875
Linear input=0 dtype=torch.bfloat16 min=-2.3125 max=1.4296875
Linear output=0 dtype=torch.bfloat16 min=-5.75 max=5.03125
NewGELUActivation input=0 dtype=torch.bfloat16 min=-5.75 max=5.03125
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=5.03125
Linear input=0 dtype=torch.bfloat16 min=-2.3125 max=1.4296875
Linear output=0 dtype=torch.bfloat16 min=-15.1875 max=13.0
Dropout input=0 dtype=torch.bfloat16 min=-53.5 max=31.5
Dropout output=0 dtype=torch.bfloat16 min=-53.5 max=31.5
Linear input=0 dtype=torch.bfloat16 min=-53.5 max=31.5
Linear output=0 dtype=torch.bfloat16 min=-1032.0 max=1384.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.3125 max=1.4296875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-1032.0 max=1384.0
Dropout input=0 dtype=torch.bfloat16 min=-1032.0 max=1384.0
Dropout output=0 dtype=torch.bfloat16 min=-1032.0 max=1384.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-160768.0 max=172032.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-161792.0 max=173056.0
T5Block input=0 dtype=torch.bfloat16 min=-160768.0 max=172032.0
T5Block output=0 dtype=torch.bfloat16 min=-161792.0 max=173056.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-161792.0 max=173056.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.390625 max=2.578125
Linear input=0 dtype=torch.bfloat16 min=-2.390625 max=2.578125
Linear output=0 dtype=torch.bfloat16 min=-0.66796875 max=0.6875
Linear input=0 dtype=torch.bfloat16 min=-2.390625 max=2.578125
Linear output=0 dtype=torch.bfloat16 min=-7.21875 max=7.28125
Linear input=0 dtype=torch.bfloat16 min=-2.390625 max=2.578125
Linear output=0 dtype=torch.bfloat16 min=-4.46875 max=4.375
Linear input=0 dtype=torch.bfloat16 min=-4.03125 max=3.578125
Linear output=0 dtype=torch.bfloat16 min=-87.5 max=97.5
T5Attention input=0 dtype=torch.bfloat16 min=-2.390625 max=2.578125
T5Attention output=0 dtype=torch.bfloat16 min=-87.5 max=97.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-87.5 max=97.5
Dropout output=0 dtype=torch.bfloat16 min=-87.5 max=97.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-161792.0 max=173056.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-161792.0 max=173056.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-161792.0 max=173056.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.4375 max=1.6171875
Linear input=0 dtype=torch.bfloat16 min=-2.4375 max=1.6171875
Linear output=0 dtype=torch.bfloat16 min=-8.25 max=7.5625
NewGELUActivation input=0 dtype=torch.bfloat16 min=-8.25 max=7.5625
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=7.5625
Linear input=0 dtype=torch.bfloat16 min=-2.4375 max=1.6171875
Linear output=0 dtype=torch.bfloat16 min=-48.25 max=26.375
Dropout input=0 dtype=torch.bfloat16 min=-96.5 max=97.0
Dropout output=0 dtype=torch.bfloat16 min=-96.5 max=97.0
Linear input=0 dtype=torch.bfloat16 min=-96.5 max=97.0
Linear output=0 dtype=torch.bfloat16 min=-2192.0 max=2800.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.4375 max=1.6171875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-2192.0 max=2800.0
Dropout input=0 dtype=torch.bfloat16 min=-2192.0 max=2800.0
Dropout output=0 dtype=torch.bfloat16 min=-2192.0 max=2800.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-161792.0 max=173056.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-163840.0 max=176128.0
T5Block input=0 dtype=torch.bfloat16 min=-161792.0 max=173056.0
T5Block output=0 dtype=torch.bfloat16 min=-163840.0 max=176128.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-163840.0 max=176128.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.4375 max=2.578125
Linear input=0 dtype=torch.bfloat16 min=-2.4375 max=2.578125
Linear output=0 dtype=torch.bfloat16 min=-0.7421875 max=0.6875
Linear input=0 dtype=torch.bfloat16 min=-2.4375 max=2.578125
Linear output=0 dtype=torch.bfloat16 min=-9.0625 max=7.96875
Linear input=0 dtype=torch.bfloat16 min=-2.4375 max=2.578125
Linear output=0 dtype=torch.bfloat16 min=-4.90625 max=5.78125
Linear input=0 dtype=torch.bfloat16 min=-4.59375 max=5.5625
Linear output=0 dtype=torch.bfloat16 min=-211.0 max=217.0
T5Attention input=0 dtype=torch.bfloat16 min=-2.4375 max=2.578125
T5Attention output=0 dtype=torch.bfloat16 min=-211.0 max=217.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-211.0 max=217.0
Dropout output=0 dtype=torch.bfloat16 min=-211.0 max=217.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-163840.0 max=176128.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-163840.0 max=176128.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-163840.0 max=176128.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.203125 max=1.6953125
Linear input=0 dtype=torch.bfloat16 min=-2.203125 max=1.6953125
Linear output=0 dtype=torch.bfloat16 min=-6.25 max=6.90625
NewGELUActivation input=0 dtype=torch.bfloat16 min=-6.25 max=6.90625
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=6.90625
Linear input=0 dtype=torch.bfloat16 min=-2.203125 max=1.6953125
Linear output=0 dtype=torch.bfloat16 min=-107.5 max=64.5
Dropout input=0 dtype=torch.bfloat16 min=-272.0 max=201.0
Dropout output=0 dtype=torch.bfloat16 min=-272.0 max=201.0
Linear input=0 dtype=torch.bfloat16 min=-272.0 max=201.0
Linear output=0 dtype=torch.bfloat16 min=-4096.0 max=5824.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.203125 max=1.6953125
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-4096.0 max=5824.0
Dropout input=0 dtype=torch.bfloat16 min=-4096.0 max=5824.0
Dropout output=0 dtype=torch.bfloat16 min=-4096.0 max=5824.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-163840.0 max=176128.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-166912.0 max=182272.0
T5Block input=0 dtype=torch.bfloat16 min=-163840.0 max=176128.0
T5Block output=0 dtype=torch.bfloat16 min=-166912.0 max=182272.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-166912.0 max=182272.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.859375 max=2.953125
Linear input=0 dtype=torch.bfloat16 min=-2.859375 max=2.953125
Linear output=0 dtype=torch.bfloat16 min=-0.78515625 max=0.8046875
Linear input=0 dtype=torch.bfloat16 min=-2.859375 max=2.953125
Linear output=0 dtype=torch.bfloat16 min=-8.0 max=7.6875
Linear input=0 dtype=torch.bfloat16 min=-2.859375 max=2.953125
Linear output=0 dtype=torch.bfloat16 min=-4.9375 max=4.96875
Linear input=0 dtype=torch.bfloat16 min=-4.875 max=4.15625
Linear output=0 dtype=torch.bfloat16 min=-274.0 max=195.0
T5Attention input=0 dtype=torch.bfloat16 min=-2.859375 max=2.953125
T5Attention output=0 dtype=torch.bfloat16 min=-274.0 max=195.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-274.0 max=195.0
Dropout output=0 dtype=torch.bfloat16 min=-274.0 max=195.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-166912.0 max=182272.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-166912.0 max=182272.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-166912.0 max=182272.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.7421875 max=1.59375
Linear input=0 dtype=torch.bfloat16 min=-1.7421875 max=1.59375
Linear output=0 dtype=torch.bfloat16 min=-6.5 max=12.5625
NewGELUActivation input=0 dtype=torch.bfloat16 min=-6.5 max=12.5625
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=12.5625
Linear input=0 dtype=torch.bfloat16 min=-1.7421875 max=1.59375
Linear output=0 dtype=torch.bfloat16 min=-46.5 max=38.5
Dropout input=0 dtype=torch.bfloat16 min=-254.0 max=191.0
Dropout output=0 dtype=torch.bfloat16 min=-254.0 max=191.0
Linear input=0 dtype=torch.bfloat16 min=-254.0 max=191.0
Linear output=0 dtype=torch.bfloat16 min=-7040.0 max=8896.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-1.7421875 max=1.59375
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-7040.0 max=8896.0
Dropout input=0 dtype=torch.bfloat16 min=-7040.0 max=8896.0
Dropout output=0 dtype=torch.bfloat16 min=-7040.0 max=8896.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-166912.0 max=182272.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-172032.0 max=191488.0
T5Block input=0 dtype=torch.bfloat16 min=-166912.0 max=182272.0
T5Block output=0 dtype=torch.bfloat16 min=-172032.0 max=191488.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-172032.0 max=191488.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.46875 max=2.546875
Linear input=0 dtype=torch.bfloat16 min=-2.46875 max=2.546875
Linear output=0 dtype=torch.bfloat16 min=-1.109375 max=0.73828125
Linear input=0 dtype=torch.bfloat16 min=-2.46875 max=2.546875
Linear output=0 dtype=torch.bfloat16 min=-8.75 max=11.4375
Linear input=0 dtype=torch.bfloat16 min=-2.46875 max=2.546875
Linear output=0 dtype=torch.bfloat16 min=-4.96875 max=7.0
Linear input=0 dtype=torch.bfloat16 min=-4.65625 max=6.40625
Linear output=0 dtype=torch.bfloat16 min=-462.0 max=348.0
T5Attention input=0 dtype=torch.bfloat16 min=-2.46875 max=2.546875
T5Attention output=0 dtype=torch.bfloat16 min=-462.0 max=348.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-462.0 max=348.0
Dropout output=0 dtype=torch.bfloat16 min=-462.0 max=348.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-172032.0 max=191488.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-172032.0 max=191488.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-172032.0 max=191488.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.375 max=1.8046875
Linear input=0 dtype=torch.bfloat16 min=-1.375 max=1.8046875
Linear output=0 dtype=torch.bfloat16 min=-7.15625 max=7.96875
NewGELUActivation input=0 dtype=torch.bfloat16 min=-7.15625 max=7.96875
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=7.96875
Linear input=0 dtype=torch.bfloat16 min=-1.375 max=1.8046875
Linear output=0 dtype=torch.bfloat16 min=-39.0 max=34.75
Dropout input=0 dtype=torch.bfloat16 min=-151.0 max=89.5
Dropout output=0 dtype=torch.bfloat16 min=-151.0 max=89.5
Linear input=0 dtype=torch.bfloat16 min=-151.0 max=89.5
Linear output=0 dtype=torch.bfloat16 min=-2528.0 max=3504.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-1.375 max=1.8046875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-2528.0 max=3504.0
Dropout input=0 dtype=torch.bfloat16 min=-2528.0 max=3504.0
Dropout output=0 dtype=torch.bfloat16 min=-2528.0 max=3504.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-172032.0 max=191488.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-174080.0 max=194560.0
T5Block input=0 dtype=torch.bfloat16 min=-172032.0 max=191488.0
T5Block output=0 dtype=torch.bfloat16 min=-174080.0 max=194560.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-174080.0 max=194560.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.328125 max=2.34375
Linear input=0 dtype=torch.bfloat16 min=-2.328125 max=2.34375
Linear output=0 dtype=torch.bfloat16 min=-0.91015625 max=0.7265625
Linear input=0 dtype=torch.bfloat16 min=-2.328125 max=2.34375
Linear output=0 dtype=torch.bfloat16 min=-6.59375 max=8.625
Linear input=0 dtype=torch.bfloat16 min=-2.328125 max=2.34375
Linear output=0 dtype=torch.bfloat16 min=-8.3125 max=7.875
Linear input=0 dtype=torch.bfloat16 min=-8.0625 max=5.65625
Linear output=0 dtype=torch.bfloat16 min=-430.0 max=420.0
T5Attention input=0 dtype=torch.bfloat16 min=-2.328125 max=2.34375
T5Attention output=0 dtype=torch.bfloat16 min=-430.0 max=420.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-430.0 max=420.0
Dropout output=0 dtype=torch.bfloat16 min=-430.0 max=420.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-174080.0 max=194560.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-174080.0 max=194560.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-174080.0 max=194560.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.890625 max=2.21875
Linear input=0 dtype=torch.bfloat16 min=-1.890625 max=2.21875
Linear output=0 dtype=torch.bfloat16 min=-8.75 max=7.8125
NewGELUActivation input=0 dtype=torch.bfloat16 min=-8.75 max=7.8125
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=7.8125
Linear input=0 dtype=torch.bfloat16 min=-1.890625 max=2.21875
Linear output=0 dtype=torch.bfloat16 min=-32.5 max=35.25
Dropout input=0 dtype=torch.bfloat16 min=-109.5 max=161.0
Dropout output=0 dtype=torch.bfloat16 min=-109.5 max=161.0
Linear input=0 dtype=torch.bfloat16 min=-109.5 max=161.0
Linear output=0 dtype=torch.bfloat16 min=-3136.0 max=3184.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-1.890625 max=2.21875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-3136.0 max=3184.0
Dropout input=0 dtype=torch.bfloat16 min=-3136.0 max=3184.0
Dropout output=0 dtype=torch.bfloat16 min=-3136.0 max=3184.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-174080.0 max=194560.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-175104.0 max=197632.0
T5Block input=0 dtype=torch.bfloat16 min=-174080.0 max=194560.0
T5Block output=0 dtype=torch.bfloat16 min=-175104.0 max=197632.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-175104.0 max=197632.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.28125 max=2.234375
Linear input=0 dtype=torch.bfloat16 min=-2.28125 max=2.234375
Linear output=0 dtype=torch.bfloat16 min=-0.82421875 max=1.0078125
Linear input=0 dtype=torch.bfloat16 min=-2.28125 max=2.234375
Linear output=0 dtype=torch.bfloat16 min=-8.375 max=7.25
Linear input=0 dtype=torch.bfloat16 min=-2.28125 max=2.234375
Linear output=0 dtype=torch.bfloat16 min=-12.1875 max=13.75
Linear input=0 dtype=torch.bfloat16 min=-7.625 max=7.03125
Linear output=0 dtype=torch.bfloat16 min=-198.0 max=304.0
T5Attention input=0 dtype=torch.bfloat16 min=-2.28125 max=2.234375
T5Attention output=0 dtype=torch.bfloat16 min=-198.0 max=304.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-198.0 max=304.0
Dropout output=0 dtype=torch.bfloat16 min=-198.0 max=304.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-175104.0 max=197632.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-175104.0 max=197632.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-175104.0 max=197632.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.703125 max=2.359375
Linear input=0 dtype=torch.bfloat16 min=-2.703125 max=2.359375
Linear output=0 dtype=torch.bfloat16 min=-7.21875 max=25.25
NewGELUActivation input=0 dtype=torch.bfloat16 min=-7.21875 max=25.25
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=25.25
Linear input=0 dtype=torch.bfloat16 min=-2.703125 max=2.359375
Linear output=0 dtype=torch.bfloat16 min=-48.5 max=35.5
Dropout input=0 dtype=torch.bfloat16 min=-127.5 max=210.0
Dropout output=0 dtype=torch.bfloat16 min=-127.5 max=210.0
Linear input=0 dtype=torch.bfloat16 min=-127.5 max=210.0
Linear output=0 dtype=torch.bfloat16 min=-2160.0 max=1968.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.703125 max=2.359375
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-2160.0 max=1968.0
Dropout input=0 dtype=torch.bfloat16 min=-2160.0 max=1968.0
Dropout output=0 dtype=torch.bfloat16 min=-2160.0 max=1968.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-175104.0 max=197632.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-176128.0 max=199680.0
T5Block input=0 dtype=torch.bfloat16 min=-175104.0 max=197632.0
T5Block output=0 dtype=torch.bfloat16 min=-176128.0 max=199680.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-176128.0 max=199680.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-3.71875 max=2.921875
Linear input=0 dtype=torch.bfloat16 min=-3.71875 max=2.921875
Linear output=0 dtype=torch.bfloat16 min=-0.90234375 max=0.9921875
Linear input=0 dtype=torch.bfloat16 min=-3.71875 max=2.921875
Linear output=0 dtype=torch.bfloat16 min=-8.625 max=10.5625
Linear input=0 dtype=torch.bfloat16 min=-3.71875 max=2.921875
Linear output=0 dtype=torch.bfloat16 min=-14.875 max=14.0
Linear input=0 dtype=torch.bfloat16 min=-10.9375 max=9.5
Linear output=0 dtype=torch.bfloat16 min=-280.0 max=378.0
T5Attention input=0 dtype=torch.bfloat16 min=-3.71875 max=2.921875
T5Attention output=0 dtype=torch.bfloat16 min=-280.0 max=378.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-280.0 max=378.0
Dropout output=0 dtype=torch.bfloat16 min=-280.0 max=378.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-176128.0 max=199680.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-176128.0 max=199680.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-176128.0 max=199680.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-3.625 max=2.84375
Linear input=0 dtype=torch.bfloat16 min=-3.625 max=2.84375
Linear output=0 dtype=torch.bfloat16 min=-6.9375 max=24.25
NewGELUActivation input=0 dtype=torch.bfloat16 min=-6.9375 max=24.25
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=24.25
Linear input=0 dtype=torch.bfloat16 min=-3.625 max=2.84375
Linear output=0 dtype=torch.bfloat16 min=-47.75 max=83.0
Dropout input=0 dtype=torch.bfloat16 min=-111.5 max=426.0
Dropout output=0 dtype=torch.bfloat16 min=-111.5 max=426.0
Linear input=0 dtype=torch.bfloat16 min=-111.5 max=426.0
Linear output=0 dtype=torch.bfloat16 min=-4192.0 max=3744.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-3.625 max=2.84375
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-4192.0 max=3744.0
Dropout input=0 dtype=torch.bfloat16 min=-4192.0 max=3744.0
Dropout output=0 dtype=torch.bfloat16 min=-4192.0 max=3744.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-176128.0 max=199680.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-178176.0 max=203776.0
T5Block input=0 dtype=torch.bfloat16 min=-176128.0 max=199680.0
T5Block output=0 dtype=torch.bfloat16 min=-178176.0 max=203776.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-178176.0 max=203776.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-6.125 max=4.1875
Linear input=0 dtype=torch.bfloat16 min=-6.125 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-1.8046875 max=1.0078125
Linear input=0 dtype=torch.bfloat16 min=-6.125 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-10.625 max=8.6875
Linear input=0 dtype=torch.bfloat16 min=-6.125 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-20.125 max=17.0
Linear input=0 dtype=torch.bfloat16 min=-20.125 max=13.4375
Linear output=0 dtype=torch.bfloat16 min=-314.0 max=352.0
T5Attention input=0 dtype=torch.bfloat16 min=-6.125 max=4.1875
T5Attention output=0 dtype=torch.bfloat16 min=-314.0 max=352.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-314.0 max=352.0
Dropout output=0 dtype=torch.bfloat16 min=-314.0 max=352.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-178176.0 max=203776.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-178176.0 max=203776.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-178176.0 max=203776.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-5.03125 max=3.515625
Linear input=0 dtype=torch.bfloat16 min=-5.03125 max=3.515625
Linear output=0 dtype=torch.bfloat16 min=-13.0625 max=27.625
NewGELUActivation input=0 dtype=torch.bfloat16 min=-13.0625 max=27.625
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=27.625
Linear input=0 dtype=torch.bfloat16 min=-5.03125 max=3.515625
Linear output=0 dtype=torch.bfloat16 min=-116.5 max=209.0
Dropout input=0 dtype=torch.bfloat16 min=-856.0 max=2064.0
Dropout output=0 dtype=torch.bfloat16 min=-856.0 max=2064.0
Linear input=0 dtype=torch.bfloat16 min=-856.0 max=2064.0
Linear output=0 dtype=torch.bfloat16 min=-5728.0 max=5568.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-5.03125 max=3.515625
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-5728.0 max=5568.0
Dropout input=0 dtype=torch.bfloat16 min=-5728.0 max=5568.0
Dropout output=0 dtype=torch.bfloat16 min=-5728.0 max=5568.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-178176.0 max=203776.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-180224.0 max=208896.0
T5Block input=0 dtype=torch.bfloat16 min=-178176.0 max=203776.0
T5Block output=0 dtype=torch.bfloat16 min=-180224.0 max=208896.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-180224.0 max=208896.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-6.75 max=4.03125
Linear input=0 dtype=torch.bfloat16 min=-6.75 max=4.03125
Linear output=0 dtype=torch.bfloat16 min=-1.1640625 max=0.9140625
Linear input=0 dtype=torch.bfloat16 min=-6.75 max=4.03125
Linear output=0 dtype=torch.bfloat16 min=-11.0 max=8.875
Linear input=0 dtype=torch.bfloat16 min=-6.75 max=4.03125
Linear output=0 dtype=torch.bfloat16 min=-27.5 max=25.375
Linear input=0 dtype=torch.bfloat16 min=-27.5 max=25.25
Linear output=0 dtype=torch.bfloat16 min=-458.0 max=640.0
T5Attention input=0 dtype=torch.bfloat16 min=-6.75 max=4.03125
T5Attention output=0 dtype=torch.bfloat16 min=-458.0 max=640.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-458.0 max=640.0
Dropout output=0 dtype=torch.bfloat16 min=-458.0 max=640.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-180224.0 max=208896.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-180224.0 max=208896.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-180224.0 max=208896.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-7.0 max=9.25
Linear input=0 dtype=torch.bfloat16 min=-7.0 max=9.25
Linear output=0 dtype=torch.bfloat16 min=-16.125 max=123.0
NewGELUActivation input=0 dtype=torch.bfloat16 min=-16.125 max=123.0
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=123.0
Linear input=0 dtype=torch.bfloat16 min=-7.0 max=9.25
Linear output=0 dtype=torch.bfloat16 min=-115.5 max=137.0
Dropout input=0 dtype=torch.bfloat16 min=-2024.0 max=3680.0
Dropout output=0 dtype=torch.bfloat16 min=-2024.0 max=3680.0
Linear input=0 dtype=torch.bfloat16 min=-2024.0 max=3680.0
Linear output=0 dtype=torch.bfloat16 min=-45824.0 max=45312.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-7.0 max=9.25
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-45824.0 max=45312.0
Dropout input=0 dtype=torch.bfloat16 min=-45824.0 max=45312.0
Dropout output=0 dtype=torch.bfloat16 min=-45824.0 max=45312.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-180224.0 max=208896.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-184320.0 max=212992.0
T5Block input=0 dtype=torch.bfloat16 min=-180224.0 max=208896.0
T5Block output=0 dtype=torch.bfloat16 min=-184320.0 max=212992.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-184320.0 max=212992.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-6.46875 max=2.3125
Dropout input=0 dtype=torch.bfloat16 min=-6.46875 max=2.3125
Dropout output=0 dtype=torch.bfloat16 min=-6.46875 max=2.3125
T5EncoderModel input=0 dtype=torch.int64 min=0 max=1
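The per-module lines in this gist are what a simple forward-hook logger produces. The capture script itself is not included here, but a minimal sketch that emits the same format (all names below are hypothetical) looks like:

```python
import torch

def attach_minmax_hooks(model: torch.nn.Module):
    """Print 'ClassName input=i/output=i dtype=... min=... max=...' for every
    tensor flowing through every submodule, matching this trace's format."""
    def describe(name, tag, i, t):
        if torch.is_tensor(t):  # skips None entries, e.g. T5Attention output=1
            print(f"{name} {tag}={i} dtype={t.dtype} "
                  f"min={t.min().item()} max={t.max().item()}")

    def hook(module, inputs, outputs):
        name = module.__class__.__name__
        outs = outputs if isinstance(outputs, tuple) else (outputs,)
        for i, t in enumerate(inputs):
            describe(name, "input", i, t)
        for i, t in enumerate(outs):
            describe(name, "output", i, t)

    return [m.register_forward_hook(hook) for m in model.modules()]
```

Attaching this to each pipeline component (e.g. the T5 encoder and the transformer) before calling the pipeline would yield the interleaved trace above; each returned handle can be detached with `.remove()`.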
0%|          | 0/2 [00:00<?, ?it/s]
tensor([1000., 1000.])
Conv2d input=0 dtype=torch.bfloat16 min=-3.328125 max=3.328125
Conv2d output=0 dtype=torch.bfloat16 min=-8.5 max=5.5625
Conv2d output=1 dtype=torch.bfloat16 min=-8.5 max=5.5625
PatchEmbed input=0 dtype=torch.bfloat16 min=-3.328125 max=3.328125
PatchEmbed output=0 dtype=torch.bfloat16 min=-9.25 max=6.09375
PatchEmbed output=1 dtype=torch.bfloat16 min=-9.25 max=6.09375
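The Conv2d → PatchEmbed pair is the latent patchifier: a strided convolution turns the latent image into patch tokens, which are flattened and given a positional embedding (the small min/max shift from the Conv2d output to the PatchEmbed output is consistent with adding one). A sketch of the idea, with SD3-like dimensions as assumptions rather than values read from this run:

```python
import torch
from torch import nn

class PatchEmbedSketch(nn.Module):
    """Latents -> patch tokens: strided Conv2d (the Conv2d call above),
    flatten to (batch, tokens, dim), add a positional embedding."""
    def __init__(self, in_ch: int = 16, dim: int = 1536, patch: int = 2,
                 max_tokens: int = 4096):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, dim, kernel_size=patch, stride=patch)
        self.pos = nn.Parameter(torch.zeros(1, max_tokens, dim))

    def forward(self, x):
        x = self.proj(x).flatten(2).transpose(1, 2)  # (B, H*W/patch**2, dim)
        return x + self.pos[:, : x.shape[1]]
```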
Timesteps input=0 dtype=torch.float32 min=1000.0 max=1000.0
Timesteps output=0 dtype=torch.float32 min=-0.9999996423721313 max=0.9997203946113586
Timesteps output=1 dtype=torch.float32 min=-0.9999996423721313 max=0.9997203946113586
Linear input=0 dtype=torch.bfloat16 min=-1.0 max=1.0
Linear output=0 dtype=torch.bfloat16 min=-8.75 max=2.234375
Linear output=1 dtype=torch.bfloat16 min=-8.75 max=2.234375
SiLU input=0 dtype=torch.bfloat16 min=-8.75 max=2.234375
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=2.015625
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=2.015625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=2.015625
Linear output=0 dtype=torch.bfloat16 min=-9.0 max=3.9375
Linear output=1 dtype=torch.bfloat16 min=-9.0 max=3.9375
TimestepEmbedding input=0 dtype=torch.bfloat16 min=-1.0 max=1.0
TimestepEmbedding output=0 dtype=torch.bfloat16 min=-9.0 max=3.9375
TimestepEmbedding output=1 dtype=torch.bfloat16 min=-9.0 max=3.9375
Linear input=0 dtype=torch.bfloat16 min=-5.34375 max=7.40625
Linear output=0 dtype=torch.bfloat16 min=-40.5 max=15.8125
Linear output=1 dtype=torch.bfloat16 min=-33.75 max=15.9375
SiLU input=0 dtype=torch.bfloat16 min=-40.5 max=15.9375
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=15.8125
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=15.9375
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=15.9375
Linear output=0 dtype=torch.bfloat16 min=-19.375 max=1.2265625
Linear output=1 dtype=torch.bfloat16 min=-20.125 max=1.359375
PixArtAlphaTextProjection input=0 dtype=torch.bfloat16 min=-5.34375 max=7.40625
PixArtAlphaTextProjection output=0 dtype=torch.bfloat16 min=-19.375 max=1.2265625
PixArtAlphaTextProjection output=1 dtype=torch.bfloat16 min=-20.125 max=1.359375
CombinedTimestepTextProjEmbeddings input=0 dtype=torch.float32 min=1000.0 max=1000.0
CombinedTimestepTextProjEmbeddings input=1 dtype=torch.bfloat16 min=-5.34375 max=7.40625
CombinedTimestepTextProjEmbeddings output=0 dtype=torch.bfloat16 min=-20.5 max=4.71875
CombinedTimestepTextProjEmbeddings output=1 dtype=torch.bfloat16 min=-21.25 max=4.65625
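`Timesteps` is the sinusoidal featurizer: its float32 output sits in [-1, 1] (min=-0.9999996, max=0.9997 above), and the following Linear → SiLU → Linear is the TimestepEmbedding MLP. The combined-embedding min/max are consistent with summing the timestep embedding and the pooled-text projection. A hedged sketch of the sinusoidal part (dim and max_period are assumptions, not values from this run):

```python
import math
import torch

def sinusoidal_timestep_embedding(t: torch.Tensor, dim: int = 256,
                                  max_period: float = 10000.0) -> torch.Tensor:
    """Classic sin/cos timestep features; every value lands in [-1, 1]."""
    half = dim // 2
    freqs = torch.exp(-math.log(max_period)
                      * torch.arange(half, dtype=torch.float32) / half)
    args = t[:, None].float() * freqs[None, :]
    return torch.cat([torch.cos(args), torch.sin(args)], dim=-1)

emb = sinusoidal_timestep_embedding(torch.tensor([1000.0, 1000.0]))
print(emb.dtype, emb.min().item(), emb.max().item())  # float32, close to -1/+1
```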
Linear input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
Linear output=0 dtype=torch.bfloat16 min=-812.0 max=612.0
Linear output=1 dtype=torch.bfloat16 min=-812.0 max=612.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-1.8359375 max=3.71875
Linear output=1 dtype=torch.bfloat16 min=-1.8125 max=3.75
LayerNorm input=0 dtype=torch.bfloat16 min=-9.25 max=6.09375
LayerNorm output=0 dtype=torch.bfloat16 min=-15.4375 max=8.5
LayerNorm output=1 dtype=torch.bfloat16 min=-15.4375 max=8.5
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-9.25 max=6.09375
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-8.25 max=5.0625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-1.09375 max=3.75
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.49609375 max=0.9375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.5625 max=1.5
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-1.734375 max=2.5
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-8.0625 max=7.125
Linear output=1 dtype=torch.bfloat16 min=-8.125 max=7.15625
LayerNorm input=0 dtype=torch.bfloat16 min=-812.0 max=612.0
LayerNorm output=0 dtype=torch.bfloat16 min=-23.0 max=16.0
LayerNorm output=1 dtype=torch.bfloat16 min=-23.0 max=16.0
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-812.0 max=612.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-3.75 max=4.53125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-1.15625 max=1.2421875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-8.125 max=7.15625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.046875 max=0.17578125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-1.890625 max=2.484375
Linear input=0 dtype=torch.bfloat16 min=-8.25 max=5.0625
Linear output=0 dtype=torch.bfloat16 min=-18.0 max=16.625
Linear output=1 dtype=torch.bfloat16 min=-17.5 max=16.25
Linear input=0 dtype=torch.bfloat16 min=-8.25 max=5.0625
Linear output=0 dtype=torch.bfloat16 min=-14.0625 max=13.5625
Linear output=1 dtype=torch.bfloat16 min=-13.5625 max=13.1875
Linear input=0 dtype=torch.bfloat16 min=-8.25 max=5.0625
Linear output=0 dtype=torch.bfloat16 min=-10.125 max=9.5625
Linear output=1 dtype=torch.bfloat16 min=-9.9375 max=9.4375
Linear input=0 dtype=torch.bfloat16 min=-3.75 max=4.53125
Linear output=0 dtype=torch.bfloat16 min=-4.53125 max=4.34375
Linear output=1 dtype=torch.bfloat16 min=-5.46875 max=5.53125
Linear input=0 dtype=torch.bfloat16 min=-3.75 max=4.53125
Linear output=0 dtype=torch.bfloat16 min=-4.65625 max=5.25
Linear output=1 dtype=torch.bfloat16 min=-6.71875 max=6.8125
Linear input=0 dtype=torch.bfloat16 min=-3.75 max=4.53125
Linear output=0 dtype=torch.bfloat16 min=-4.59375 max=4.25
Linear output=1 dtype=torch.bfloat16 min=-4.40625 max=6.90625
Linear input=0 dtype=torch.bfloat16 min=-4.6875 max=4.90625
Linear output=0 dtype=torch.bfloat16 min=-21.25 max=6.375
Linear output=1 dtype=torch.bfloat16 min=-21.0 max=6.125
Dropout input=0 dtype=torch.bfloat16 min=-21.25 max=6.375
Dropout output=0 dtype=torch.bfloat16 min=-21.25 max=6.375
Dropout output=1 dtype=torch.bfloat16 min=-21.0 max=6.125
Linear input=0 dtype=torch.bfloat16 min=-6.1875 max=7.53125
Linear output=0 dtype=torch.bfloat16 min=-11.4375 max=9.3125
Linear output=1 dtype=torch.bfloat16 min=-9.75 max=11.375
Attention output=0 dtype=torch.bfloat16 min=-21.25 max=6.375
Attention output=1 dtype=torch.bfloat16 min=-11.4375 max=11.375
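The six q/k/v Linears above (three on the modulated latent stream, three on the text stream) followed by two output Linears are the signature of joint attention: both token streams are projected, concatenated, attended together, and split back, each stream getting its own output projection. A rough sketch of that flow (names and shapes are illustrative, not the diffusers processor):

```python
import torch
import torch.nn.functional as F

def joint_attention(q_img, k_img, v_img, q_txt, k_txt, v_txt):
    """Attend over image and text tokens jointly, then split per stream.
    All inputs are (batch, heads, seq, head_dim)."""
    q = torch.cat([q_img, q_txt], dim=2)
    k = torch.cat([k_img, k_txt], dim=2)
    v = torch.cat([v_img, v_txt], dim=2)
    out = F.scaled_dot_product_attention(q, k, v)
    return out.split([q_img.shape[2], q_txt.shape[2]], dim=2)
```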
LayerNorm input=0 dtype=torch.bfloat16 min=-75.0 max=10.875
LayerNorm output=0 dtype=torch.bfloat16 min=-37.5 max=5.40625
LayerNorm output=1 dtype=torch.bfloat16 min=-37.5 max=5.4375
Linear input=0 dtype=torch.bfloat16 min=-2.171875 max=2.109375
Linear output=0 dtype=torch.bfloat16 min=-9.4375 max=4.3125
Linear output=1 dtype=torch.bfloat16 min=-9.3125 max=4.3125
GELU input=0 dtype=torch.bfloat16 min=-2.171875 max=2.109375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.3125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.3125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.3125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.3125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.3125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.3125
Linear output=0 dtype=torch.bfloat16 min=-19.625 max=18.125
Linear output=1 dtype=torch.bfloat16 min=-19.875 max=18.25
FeedForward input=0 dtype=torch.bfloat16 min=-2.171875 max=2.109375
FeedForward output=0 dtype=torch.bfloat16 min=-19.625 max=18.125
FeedForward output=1 dtype=torch.bfloat16 min=-19.875 max=18.25
LayerNorm input=0 dtype=torch.bfloat16 min=-816.0 max=612.0
LayerNorm output=0 dtype=torch.bfloat16 min=-23.75 max=23.875
LayerNorm output=1 dtype=torch.bfloat16 min=-24.125 max=23.25
Linear input=0 dtype=torch.bfloat16 min=-9.75 max=9.0
Linear output=0 dtype=torch.bfloat16 min=-15.5625 max=26.125
Linear output=1 dtype=torch.bfloat16 min=-14.375 max=30.125
GELU input=0 dtype=torch.bfloat16 min=-9.75 max=9.0
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=26.125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=30.125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=30.125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=26.125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=30.125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=30.125
Linear output=0 dtype=torch.bfloat16 min=-32.25 max=34.5
Linear output=1 dtype=torch.bfloat16 min=-36.25 max=34.5
FeedForward input=0 dtype=torch.bfloat16 min=-9.75 max=9.0
FeedForward output=0 dtype=torch.bfloat16 min=-32.25 max=34.5
FeedForward output=1 dtype=torch.bfloat16 min=-36.25 max=34.5
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-844.0 max=612.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-55.5 max=22.5
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-2.625 max=3.53125
Linear output=1 dtype=torch.bfloat16 min=-2.625 max=3.5625
LayerNorm input=0 dtype=torch.bfloat16 min=-55.5 max=22.5
LayerNorm output=0 dtype=torch.bfloat16 min=-30.125 max=12.3125
LayerNorm output=1 dtype=torch.bfloat16 min=-30.0 max=12.3125
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-55.5 max=22.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-8.75 max=9.3125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-1.078125 max=2.28125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.5234375 max=1.1953125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.046875 max=2.78125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-0.5078125 max=2.875
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-3.0 max=4.3125
Linear output=1 dtype=torch.bfloat16 min=-3.015625 max=4.28125
LayerNorm input=0 dtype=torch.bfloat16 min=-844.0 max=612.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.5 max=15.875
LayerNorm output=1 dtype=torch.bfloat16 min=-31.25 max=15.875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-844.0 max=612.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-3.734375 max=5.125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.765625 max=2.65625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.40625 max=3.328125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.15625 max=4.3125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-3.015625 max=1.140625
Linear input=0 dtype=torch.bfloat16 min=-8.75 max=9.3125
Linear output=0 dtype=torch.bfloat16 min=-25.0 max=26.0
Linear output=1 dtype=torch.bfloat16 min=-24.625 max=25.625
Linear input=0 dtype=torch.bfloat16 min=-8.75 max=9.3125
Linear output=0 dtype=torch.bfloat16 min=-12.625 max=12.1875
Linear output=1 dtype=torch.bfloat16 min=-12.375 max=11.875
Linear input=0 dtype=torch.bfloat16 min=-8.75 max=9.3125
Linear output=0 dtype=torch.bfloat16 min=-8.3125 max=8.3125
Linear output=1 dtype=torch.bfloat16 min=-8.375 max=8.3125
Linear input=0 dtype=torch.bfloat16 min=-3.734375 max=5.125
Linear output=0 dtype=torch.bfloat16 min=-3.28125 max=4.6875
Linear output=1 dtype=torch.bfloat16 min=-3.90625 max=4.34375
Linear input=0 dtype=torch.bfloat16 min=-3.734375 max=5.125
Linear output=0 dtype=torch.bfloat16 min=-5.3125 max=5.09375
Linear output=1 dtype=torch.bfloat16 min=-5.25 max=5.28125
Linear input=0 dtype=torch.bfloat16 min=-3.734375 max=5.125
Linear output=0 dtype=torch.bfloat16 min=-4.84375 max=4.625
Linear output=1 dtype=torch.bfloat16 min=-3.84375 max=3.703125
Linear input=0 dtype=torch.bfloat16 min=-4.34375 max=4.34375
Linear output=0 dtype=torch.bfloat16 min=-6.71875 max=7.03125
Linear output=1 dtype=torch.bfloat16 min=-6.5 max=7.0
Dropout input=0 dtype=torch.bfloat16 min=-6.71875 max=7.03125
Dropout output=0 dtype=torch.bfloat16 min=-6.71875 max=7.03125
Dropout output=1 dtype=torch.bfloat16 min=-6.5 max=7.0
Linear input=0 dtype=torch.bfloat16 min=-1.859375 max=2.015625
Linear output=0 dtype=torch.bfloat16 min=-6.8125 max=6.1875
Linear output=1 dtype=torch.bfloat16 min=-10.5 max=5.84375
Attention output=0 dtype=torch.bfloat16 min=-6.71875 max=7.03125
Attention output=1 dtype=torch.bfloat16 min=-10.5 max=6.1875
LayerNorm input=0 dtype=torch.bfloat16 min=-58.5 max=22.5
LayerNorm output=0 dtype=torch.bfloat16 min=-30.375 max=12.0
LayerNorm output=1 dtype=torch.bfloat16 min=-30.25 max=11.9375
Linear input=0 dtype=torch.bfloat16 min=-5.375 max=4.34375
Linear output=0 dtype=torch.bfloat16 min=-5.75 max=3.546875
Linear output=1 dtype=torch.bfloat16 min=-5.6875 max=3.546875
GELU input=0 dtype=torch.bfloat16 min=-5.375 max=4.34375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.546875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.546875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.546875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.546875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.546875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.546875
Linear output=0 dtype=torch.bfloat16 min=-19.625 max=16.125
Linear output=1 dtype=torch.bfloat16 min=-19.5 max=16.0
FeedForward input=0 dtype=torch.bfloat16 min=-5.375 max=4.34375
FeedForward output=0 dtype=torch.bfloat16 min=-19.625 max=16.125
FeedForward output=1 dtype=torch.bfloat16 min=-19.5 max=16.0
LayerNorm input=0 dtype=torch.bfloat16 min=-844.0 max=612.0
LayerNorm output=0 dtype=torch.bfloat16 min=-31.25 max=15.875
LayerNorm output=1 dtype=torch.bfloat16 min=-33.5 max=15.9375
Linear input=0 dtype=torch.bfloat16 min=-26.75 max=16.5
Linear output=0 dtype=torch.bfloat16 min=-31.125 max=30.625
Linear output=1 dtype=torch.bfloat16 min=-28.125 max=28.375
GELU input=0 dtype=torch.bfloat16 min=-26.75 max=16.5
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=30.625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=28.375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=30.625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=30.625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=28.375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=30.625
Linear output=0 dtype=torch.bfloat16 min=-221.0 max=498.0
Linear output=1 dtype=torch.bfloat16 min=-208.0 max=500.0
FeedForward input=0 dtype=torch.bfloat16 min=-26.75 max=16.5
FeedForward output=0 dtype=torch.bfloat16 min=-221.0 max=498.0
FeedForward output=1 dtype=torch.bfloat16 min=-208.0 max=500.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-2336.0 max=600.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-53.0 max=22.5
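
Every line in this trace has the shape ClassName input=i|output=i dtype=... min=... max=..., which is what a PyTorch forward hook printing per-tensor statistics would emit; the doubled output=0/output=1 entries suggest statistics are recorded once per tuple element or per forward invocation (e.g. the two classifier-free-guidance passes). A minimal sketch of such a hook for the single-call case; the registration target (pipe.transformer) is an assumption:

    import torch

    def stats_hook(module, inputs, outputs):
        # Print one line per tensor, matching the format of this log.
        name = module.__class__.__name__
        outs = outputs if isinstance(outputs, tuple) else (outputs,)
        for i, t in enumerate(inputs):
            if torch.is_tensor(t):
                print(f"{name} input={i} dtype={t.dtype} min={t.min().item()} max={t.max().item()}")
        for i, t in enumerate(outs):
            if torch.is_tensor(t):
                print(f"{name} output={i} dtype={t.dtype} min={t.min().item()} max={t.max().item()}")

    # Hypothetical registration over the SD3 transformer's modules:
    # for m in pipe.transformer.modules():
    #     m.register_forward_hook(stats_hook)
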
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-3.171875 max=3.1875
Linear output=1 dtype=torch.bfloat16 min=-3.203125 max=3.21875
LayerNorm input=0 dtype=torch.bfloat16 min=-53.0 max=22.5
LayerNorm output=0 dtype=torch.bfloat16 min=-27.375 max=14.8125
LayerNorm output=1 dtype=torch.bfloat16 min=-27.125 max=14.8125
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-53.0 max=22.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-5.9375 max=5.75
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-3.203125 max=1.0390625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.0 max=2.609375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.4375 max=2.84375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-0.625 max=1.3359375
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-6.71875 max=4.3125
Linear output=1 dtype=torch.bfloat16 min=-6.71875 max=4.3125
LayerNorm input=0 dtype=torch.bfloat16 min=-2336.0 max=600.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.75 max=10.8125
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=9.6875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-2336.0 max=600.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-5.40625 max=4.28125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.453125 max=1.859375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-6.71875 max=2.984375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.609375 max=1.9609375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-5.4375 max=4.3125
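
The recurring SiLU -> Linear -> LayerNorm -> AdaLayerNormZero sequence with five logged outputs is consistent with diffusers' AdaLayerNormZero: the conditioning embedding is projected to six chunks (shift/scale/gate for attention and for the MLP), output 0 is the modulated hidden states, and outputs 1-4 are the remaining gate/shift/scale tensors. A minimal sketch of that modulation, with dimension handling simplified:

    import torch
    import torch.nn as nn

    class AdaLNZeroSketch(nn.Module):
        # Sketch of the AdaLayerNormZero pattern traced above.
        def __init__(self, dim):
            super().__init__()
            self.silu = nn.SiLU()
            self.linear = nn.Linear(dim, 6 * dim)
            self.norm = nn.LayerNorm(dim, elementwise_affine=False, eps=1e-6)

        def forward(self, x, emb):
            # One projection of the conditioning embedding, chunked six ways.
            shift_msa, scale_msa, gate_msa, shift_mlp, scale_mlp, gate_mlp = \
                self.linear(self.silu(emb)).chunk(6, dim=-1)
            x = self.norm(x) * (1 + scale_msa[:, None]) + shift_msa[:, None]
            # Five return values: matches outputs 0-4 logged for AdaLayerNormZero.
            return x, gate_msa, shift_mlp, scale_mlp, gate_mlp
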
Linear input=0 dtype=torch.bfloat16 min=-5.9375 max=5.75
Linear output=0 dtype=torch.bfloat16 min=-9.875 max=9.5625
Linear output=1 dtype=torch.bfloat16 min=-10.0 max=9.5
Linear input=0 dtype=torch.bfloat16 min=-5.9375 max=5.75
Linear output=0 dtype=torch.bfloat16 min=-8.4375 max=9.25
Linear output=1 dtype=torch.bfloat16 min=-8.4375 max=9.25
Linear input=0 dtype=torch.bfloat16 min=-5.9375 max=5.75
Linear output=0 dtype=torch.bfloat16 min=-6.71875 max=6.25
Linear output=1 dtype=torch.bfloat16 min=-6.5 max=6.21875
Linear input=0 dtype=torch.bfloat16 min=-5.40625 max=4.28125
Linear output=0 dtype=torch.bfloat16 min=-4.5625 max=4.8125
Linear output=1 dtype=torch.bfloat16 min=-5.15625 max=5.46875
Linear input=0 dtype=torch.bfloat16 min=-5.40625 max=4.28125
Linear output=0 dtype=torch.bfloat16 min=-4.65625 max=5.625
Linear output=1 dtype=torch.bfloat16 min=-5.90625 max=5.53125
Linear input=0 dtype=torch.bfloat16 min=-5.40625 max=4.28125
Linear output=0 dtype=torch.bfloat16 min=-3.515625 max=3.875
Linear output=1 dtype=torch.bfloat16 min=-4.375 max=4.09375
Linear input=0 dtype=torch.bfloat16 min=-2.109375 max=2.265625
Linear output=0 dtype=torch.bfloat16 min=-4.125 max=5.75
Linear output=1 dtype=torch.bfloat16 min=-4.65625 max=6.125
Dropout input=0 dtype=torch.bfloat16 min=-4.65625 max=6.125
Dropout output=0 dtype=torch.bfloat16 min=-4.125 max=5.75
Dropout output=1 dtype=torch.bfloat16 min=-4.65625 max=6.125
Linear input=0 dtype=torch.bfloat16 min=-3.484375 max=3.75
Linear output=0 dtype=torch.bfloat16 min=-6.09375 max=7.125
Linear output=1 dtype=torch.bfloat16 min=-8.3125 max=7.59375
Attention output=0 dtype=torch.bfloat16 min=-4.65625 max=6.125
Attention output=1 dtype=torch.bfloat16 min=-8.3125 max=7.59375
LayerNorm input=0 dtype=torch.bfloat16 min=-58.25 max=22.5
LayerNorm output=0 dtype=torch.bfloat16 min=-28.75 max=13.5625
LayerNorm output=1 dtype=torch.bfloat16 min=-29.0 max=13.5625
Linear input=0 dtype=torch.bfloat16 min=-3.765625 max=2.9375
Linear output=0 dtype=torch.bfloat16 min=-6.125 max=3.75
Linear output=1 dtype=torch.bfloat16 min=-6.25 max=3.6875
GELU input=0 dtype=torch.bfloat16 min=-3.765625 max=2.9375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.75
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.6875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.75
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.75
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.6875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.75
Linear output=0 dtype=torch.bfloat16 min=-6.96875 max=12.5
Linear output=1 dtype=torch.bfloat16 min=-6.84375 max=12.6875
FeedForward input=0 dtype=torch.bfloat16 min=-3.765625 max=2.9375
FeedForward output=0 dtype=torch.bfloat16 min=-6.96875 max=12.5
FeedForward output=1 dtype=torch.bfloat16 min=-6.84375 max=12.6875
LayerNorm input=0 dtype=torch.bfloat16 min=-2336.0 max=604.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.5 max=10.5
LayerNorm output=1 dtype=torch.bfloat16 min=-35.5 max=9.5
Linear input=0 dtype=torch.bfloat16 min=-23.125 max=30.5
Linear output=0 dtype=torch.bfloat16 min=-24.875 max=15.6875
Linear output=1 dtype=torch.bfloat16 min=-22.75 max=15.625
GELU input=0 dtype=torch.bfloat16 min=-23.125 max=30.5
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=15.6875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=15.625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=15.6875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=15.6875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=15.625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=15.6875
Linear output=0 dtype=torch.bfloat16 min=-148.0 max=220.0
Linear output=1 dtype=torch.bfloat16 min=-151.0 max=223.0
FeedForward input=0 dtype=torch.bfloat16 min=-23.125 max=30.5
FeedForward output=0 dtype=torch.bfloat16 min=-148.0 max=220.0
FeedForward output=1 dtype=torch.bfloat16 min=-151.0 max=223.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-3552.0 max=668.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-44.0 max=22.5
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-2.59375 max=3.40625
Linear output=1 dtype=torch.bfloat16 min=-2.625 max=3.375
LayerNorm input=0 dtype=torch.bfloat16 min=-44.0 max=22.5
LayerNorm output=0 dtype=torch.bfloat16 min=-25.875 max=13.4375
LayerNorm output=1 dtype=torch.bfloat16 min=-26.0 max=13.3125
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-44.0 max=22.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-9.4375 max=10.0
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-0.65234375 max=3.0625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.5703125 max=2.359375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.3125 max=1.9609375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-0.6171875 max=3.40625
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-4.90625 max=4.0
Linear output=1 dtype=torch.bfloat16 min=-4.9375 max=3.984375
LayerNorm input=0 dtype=torch.bfloat16 min=-3552.0 max=668.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.0 max=9.75
LayerNorm output=1 dtype=torch.bfloat16 min=-35.0 max=12.375
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-3552.0 max=668.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-6.25 max=9.75
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.65625 max=3.671875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-4.9375 max=2.59375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.390625 max=1.875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-2.46875 max=4.0
Linear input=0 dtype=torch.bfloat16 min=-9.4375 max=10.0
Linear output=0 dtype=torch.bfloat16 min=-32.5 max=24.875
Linear output=1 dtype=torch.bfloat16 min=-32.25 max=24.375
Linear input=0 dtype=torch.bfloat16 min=-9.4375 max=10.0
Linear output=0 dtype=torch.bfloat16 min=-16.875 max=17.5
Linear output=1 dtype=torch.bfloat16 min=-16.25 max=17.25
Linear input=0 dtype=torch.bfloat16 min=-9.4375 max=10.0
Linear output=0 dtype=torch.bfloat16 min=-6.75 max=6.65625
Linear output=1 dtype=torch.bfloat16 min=-6.65625 max=6.625
Linear input=0 dtype=torch.bfloat16 min=-6.25 max=9.75
Linear output=0 dtype=torch.bfloat16 min=-4.4375 max=5.40625
Linear output=1 dtype=torch.bfloat16 min=-4.5 max=4.78125
Linear input=0 dtype=torch.bfloat16 min=-6.25 max=9.75
Linear output=0 dtype=torch.bfloat16 min=-5.46875 max=6.78125
Linear output=1 dtype=torch.bfloat16 min=-5.53125 max=6.78125
Linear input=0 dtype=torch.bfloat16 min=-6.25 max=9.75
Linear output=0 dtype=torch.bfloat16 min=-3.734375 max=4.3125
Linear output=1 dtype=torch.bfloat16 min=-3.359375 max=4.375
Linear input=0 dtype=torch.bfloat16 min=-5.53125 max=6.0
Linear output=0 dtype=torch.bfloat16 min=-10.8125 max=8.25
Linear output=1 dtype=torch.bfloat16 min=-10.5625 max=7.84375
Dropout input=0 dtype=torch.bfloat16 min=-10.8125 max=8.25
Dropout output=0 dtype=torch.bfloat16 min=-10.8125 max=8.25
Dropout output=1 dtype=torch.bfloat16 min=-10.5625 max=7.84375
Linear input=0 dtype=torch.bfloat16 min=-4.84375 max=3.890625
Linear output=0 dtype=torch.bfloat16 min=-11.875 max=12.4375
Linear output=1 dtype=torch.bfloat16 min=-15.375 max=8.8125
Attention output=0 dtype=torch.bfloat16 min=-10.8125 max=8.25
Attention output=1 dtype=torch.bfloat16 min=-15.375 max=12.4375
LayerNorm input=0 dtype=torch.bfloat16 min=-64.0 max=22.75
LayerNorm output=0 dtype=torch.bfloat16 min=-30.25 max=11.8125
LayerNorm output=1 dtype=torch.bfloat16 min=-30.125 max=11.75
Linear input=0 dtype=torch.bfloat16 min=-3.640625 max=2.390625
Linear output=0 dtype=torch.bfloat16 min=-6.53125 max=3.609375
Linear output=1 dtype=torch.bfloat16 min=-6.5 max=4.0
GELU input=0 dtype=torch.bfloat16 min=-3.640625 max=2.390625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.609375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.0
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.0
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.609375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.0
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.0
Linear output=0 dtype=torch.bfloat16 min=-14.25 max=10.5625
Linear output=1 dtype=torch.bfloat16 min=-14.625 max=10.8125
FeedForward input=0 dtype=torch.bfloat16 min=-3.640625 max=2.390625
FeedForward output=0 dtype=torch.bfloat16 min=-14.25 max=10.5625
FeedForward output=1 dtype=torch.bfloat16 min=-14.625 max=10.8125
LayerNorm input=0 dtype=torch.bfloat16 min=-3552.0 max=664.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=9.0
LayerNorm output=1 dtype=torch.bfloat16 min=-35.0 max=11.375
Linear input=0 dtype=torch.bfloat16 min=-17.25 max=19.25
Linear output=0 dtype=torch.bfloat16 min=-11.125 max=8.9375
Linear output=1 dtype=torch.bfloat16 min=-13.625 max=8.9375
GELU input=0 dtype=torch.bfloat16 min=-17.25 max=19.25
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.9375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=8.9375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=8.9375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.9375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=8.9375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=8.9375
Linear output=0 dtype=torch.bfloat16 min=-58.25 max=34.0
Linear output=1 dtype=torch.bfloat16 min=-56.75 max=28.875
FeedForward input=0 dtype=torch.bfloat16 min=-17.25 max=19.25
FeedForward output=0 dtype=torch.bfloat16 min=-58.25 max=34.0
FeedForward output=1 dtype=torch.bfloat16 min=-56.75 max=28.875
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-3776.0 max=660.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-58.0 max=27.375
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-1.625 max=4.0
Linear output=1 dtype=torch.bfloat16 min=-1.640625 max=3.96875
LayerNorm input=0 dtype=torch.bfloat16 min=-58.0 max=27.375
LayerNorm output=0 dtype=torch.bfloat16 min=-28.25 max=16.875
LayerNorm output=1 dtype=torch.bfloat16 min=-28.0 max=16.625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-58.0 max=27.375
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-5.5625 max=5.28125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-0.75 max=3.890625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.640625 max=1.2421875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.203125 max=1.6640625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-0.8203125 max=4.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-6.0 max=8.3125
Linear output=1 dtype=torch.bfloat16 min=-6.0 max=8.3125
LayerNorm input=0 dtype=torch.bfloat16 min=-3776.0 max=660.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.0 max=15.1875
LayerNorm output=1 dtype=torch.bfloat16 min=-35.25 max=17.75
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-3776.0 max=660.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-5.65625 max=5.40625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-3.03125 max=4.28125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.125 max=1.9453125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.171875 max=1.71875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-6.0 max=8.3125
Linear input=0 dtype=torch.bfloat16 min=-5.5625 max=5.28125
Linear output=0 dtype=torch.bfloat16 min=-11.3125 max=12.9375
Linear output=1 dtype=torch.bfloat16 min=-11.375 max=12.5625
Linear input=0 dtype=torch.bfloat16 min=-5.5625 max=5.28125
Linear output=0 dtype=torch.bfloat16 min=-10.75 max=12.0
Linear output=1 dtype=torch.bfloat16 min=-10.6875 max=12.0
Linear input=0 dtype=torch.bfloat16 min=-5.5625 max=5.28125
Linear output=0 dtype=torch.bfloat16 min=-5.25 max=5.5
Linear output=1 dtype=torch.bfloat16 min=-5.21875 max=5.59375
Linear input=0 dtype=torch.bfloat16 min=-5.65625 max=5.40625
Linear output=0 dtype=torch.bfloat16 min=-5.625 max=5.59375
Linear output=1 dtype=torch.bfloat16 min=-6.25 max=5.5625
Linear input=0 dtype=torch.bfloat16 min=-5.65625 max=5.40625
Linear output=0 dtype=torch.bfloat16 min=-5.5 max=4.875
Linear output=1 dtype=torch.bfloat16 min=-7.59375 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-5.65625 max=5.40625
Linear output=0 dtype=torch.bfloat16 min=-4.9375 max=4.90625
Linear output=1 dtype=torch.bfloat16 min=-6.0 max=6.59375
Linear input=0 dtype=torch.bfloat16 min=-3.453125 max=3.390625
Linear output=0 dtype=torch.bfloat16 min=-10.75 max=4.09375
Linear output=1 dtype=torch.bfloat16 min=-11.5 max=4.40625
Dropout input=0 dtype=torch.bfloat16 min=-11.5 max=4.40625
Dropout output=0 dtype=torch.bfloat16 min=-10.75 max=4.09375
Dropout output=1 dtype=torch.bfloat16 min=-11.5 max=4.40625
Linear input=0 dtype=torch.bfloat16 min=-5.96875 max=6.5625
Linear output=0 dtype=torch.bfloat16 min=-13.5 max=12.5625
Linear output=1 dtype=torch.bfloat16 min=-14.0 max=9.125
Attention output=0 dtype=torch.bfloat16 min=-11.5 max=4.40625
Attention output=1 dtype=torch.bfloat16 min=-14.0 max=12.5625
LayerNorm input=0 dtype=torch.bfloat16 min=-78.5 max=28.75
LayerNorm output=0 dtype=torch.bfloat16 min=-32.0 max=13.4375
LayerNorm output=1 dtype=torch.bfloat16 min=-31.875 max=13.625
Linear input=0 dtype=torch.bfloat16 min=-6.375 max=2.453125
Linear output=0 dtype=torch.bfloat16 min=-5.875 max=3.53125
Linear output=1 dtype=torch.bfloat16 min=-6.1875 max=3.5
GELU input=0 dtype=torch.bfloat16 min=-6.375 max=2.453125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.53125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.5
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.53125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.53125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.5
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.53125
Linear output=0 dtype=torch.bfloat16 min=-5.5 max=10.125
Linear output=1 dtype=torch.bfloat16 min=-5.90625 max=9.4375
FeedForward input=0 dtype=torch.bfloat16 min=-6.375 max=2.453125
FeedForward output=0 dtype=torch.bfloat16 min=-5.5 max=10.125
FeedForward output=1 dtype=torch.bfloat16 min=-5.90625 max=9.4375
LayerNorm input=0 dtype=torch.bfloat16 min=-3776.0 max=656.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.25 max=14.75
LayerNorm output=1 dtype=torch.bfloat16 min=-35.25 max=17.75
Linear input=0 dtype=torch.bfloat16 min=-9.125 max=7.53125
Linear output=0 dtype=torch.bfloat16 min=-11.6875 max=10.6875
Linear output=1 dtype=torch.bfloat16 min=-12.75 max=10.0625
GELU input=0 dtype=torch.bfloat16 min=-9.125 max=7.53125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=10.6875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.0625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.6875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=10.6875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.0625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.6875
Linear output=0 dtype=torch.bfloat16 min=-53.75 max=33.25
Linear output=1 dtype=torch.bfloat16 min=-53.0 max=29.0
FeedForward input=0 dtype=torch.bfloat16 min=-9.125 max=7.53125
FeedForward output=0 dtype=torch.bfloat16 min=-53.75 max=33.25
FeedForward output=1 dtype=torch.bfloat16 min=-53.0 max=29.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-4224.0 max=688.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-77.5 max=33.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-4.96875 max=4.125
Linear output=1 dtype=torch.bfloat16 min=-4.9375 max=4.09375
LayerNorm input=0 dtype=torch.bfloat16 min=-77.5 max=33.0
LayerNorm output=0 dtype=torch.bfloat16 min=-31.5 max=18.625
LayerNorm output=1 dtype=torch.bfloat16 min=-31.25 max=18.125
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-77.5 max=33.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-12.3125 max=11.9375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.96875 max=0.78515625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.9375 max=1.3125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.75 max=2.46875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-0.9140625 max=4.125
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-4.3125 max=9.75
Linear output=1 dtype=torch.bfloat16 min=-4.3125 max=9.75
LayerNorm input=0 dtype=torch.bfloat16 min=-4224.0 max=688.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.25 max=23.375
LayerNorm output=1 dtype=torch.bfloat16 min=-35.5 max=26.0
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-4224.0 max=688.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-5.25 max=4.8125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.3125 max=4.8125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.5390625 max=1.765625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.65625 max=1.4296875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-4.09375 max=9.75
Linear input=0 dtype=torch.bfloat16 min=-12.3125 max=11.9375
Linear output=0 dtype=torch.bfloat16 min=-11.5 max=9.9375
Linear output=1 dtype=torch.bfloat16 min=-10.625 max=9.625
Linear input=0 dtype=torch.bfloat16 min=-12.3125 max=11.9375
Linear output=0 dtype=torch.bfloat16 min=-11.3125 max=13.6875
Linear output=1 dtype=torch.bfloat16 min=-11.375 max=12.9375
Linear input=0 dtype=torch.bfloat16 min=-12.3125 max=11.9375
Linear output=0 dtype=torch.bfloat16 min=-6.6875 max=6.53125
Linear output=1 dtype=torch.bfloat16 min=-6.71875 max=6.4375
Linear input=0 dtype=torch.bfloat16 min=-5.25 max=4.8125
Linear output=0 dtype=torch.bfloat16 min=-4.875 max=5.4375
Linear output=1 dtype=torch.bfloat16 min=-4.9375 max=6.03125
Linear input=0 dtype=torch.bfloat16 min=-5.25 max=4.8125
Linear output=0 dtype=torch.bfloat16 min=-5.0625 max=4.34375
Linear output=1 dtype=torch.bfloat16 min=-7.5 max=5.21875
Linear input=0 dtype=torch.bfloat16 min=-5.25 max=4.8125
Linear output=0 dtype=torch.bfloat16 min=-4.96875 max=4.96875
Linear output=1 dtype=torch.bfloat16 min=-5.53125 max=5.4375
Linear input=0 dtype=torch.bfloat16 min=-4.125 max=4.78125
Linear output=0 dtype=torch.bfloat16 min=-5.78125 max=11.5625
Linear output=1 dtype=torch.bfloat16 min=-5.46875 max=15.8125
Dropout input=0 dtype=torch.bfloat16 min=-5.78125 max=15.8125
Dropout output=0 dtype=torch.bfloat16 min=-5.78125 max=11.5625
Dropout output=1 dtype=torch.bfloat16 min=-5.46875 max=15.8125
Linear input=0 dtype=torch.bfloat16 min=-3.921875 max=3.890625
Linear output=0 dtype=torch.bfloat16 min=-13.1875 max=18.375
Linear output=1 dtype=torch.bfloat16 min=-10.5625 max=14.625
Attention output=0 dtype=torch.bfloat16 min=-5.78125 max=15.8125
Attention output=1 dtype=torch.bfloat16 min=-13.1875 max=18.375
LayerNorm input=0 dtype=torch.bfloat16 min=-120.0 max=33.25
LayerNorm output=0 dtype=torch.bfloat16 min=-34.0 max=13.25
LayerNorm output=1 dtype=torch.bfloat16 min=-34.75 max=12.375
Linear input=0 dtype=torch.bfloat16 min=-8.125 max=2.1875
Linear output=0 dtype=torch.bfloat16 min=-10.3125 max=3.984375
Linear output=1 dtype=torch.bfloat16 min=-9.75 max=4.84375
GELU input=0 dtype=torch.bfloat16 min=-8.125 max=2.1875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.984375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.84375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.84375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.984375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.84375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.84375
Linear output=0 dtype=torch.bfloat16 min=-12.375 max=8.125
Linear output=1 dtype=torch.bfloat16 min=-14.5 max=9.75
FeedForward input=0 dtype=torch.bfloat16 min=-8.125 max=2.1875
FeedForward output=0 dtype=torch.bfloat16 min=-12.375 max=8.125
FeedForward output=1 dtype=torch.bfloat16 min=-14.5 max=9.75
LayerNorm input=0 dtype=torch.bfloat16 min=-4224.0 max=688.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.25 max=23.375
LayerNorm output=1 dtype=torch.bfloat16 min=-35.5 max=25.5
Linear input=0 dtype=torch.bfloat16 min=-8.25 max=6.0
Linear output=0 dtype=torch.bfloat16 min=-11.875 max=9.25
Linear output=1 dtype=torch.bfloat16 min=-10.8125 max=8.6875
GELU input=0 dtype=torch.bfloat16 min=-8.25 max=6.0
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.25
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=8.6875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.25
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.25
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=8.6875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.25
Linear output=0 dtype=torch.bfloat16 min=-51.75 max=34.75
Linear output=1 dtype=torch.bfloat16 min=-50.75 max=34.0
FeedForward input=0 dtype=torch.bfloat16 min=-8.25 max=6.0
FeedForward output=0 dtype=torch.bfloat16 min=-51.75 max=34.75
FeedForward output=1 dtype=torch.bfloat16 min=-50.75 max=34.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-4736.0 max=844.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-105.0 max=42.5
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-4.46875 max=4.28125
Linear output=1 dtype=torch.bfloat16 min=-4.46875 max=4.25
LayerNorm input=0 dtype=torch.bfloat16 min=-105.0 max=42.5
LayerNorm output=0 dtype=torch.bfloat16 min=-33.25 max=17.375
LayerNorm output=1 dtype=torch.bfloat16 min=-33.25 max=17.375
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-105.0 max=42.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-7.40625 max=7.375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.46875 max=1.09375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.8515625 max=1.796875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.1875 max=2.3125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-1.3046875 max=4.28125
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-10.6875 max=8.9375
Linear output=1 dtype=torch.bfloat16 min=-10.6875 max=8.9375
LayerNorm input=0 dtype=torch.bfloat16 min=-4736.0 max=844.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.25 max=29.125
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=30.75
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-4736.0 max=844.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-5.46875 max=4.4375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-5.84375 max=5.09375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.2421875 max=1.125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.40625 max=1.453125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-10.6875 max=8.9375
Linear input=0 dtype=torch.bfloat16 min=-7.40625 max=7.375
Linear output=0 dtype=torch.bfloat16 min=-11.1875 max=13.25
Linear output=1 dtype=torch.bfloat16 min=-10.8125 max=13.125
Linear input=0 dtype=torch.bfloat16 min=-7.40625 max=7.375
Linear output=0 dtype=torch.bfloat16 min=-14.375 max=13.5
Linear output=1 dtype=torch.bfloat16 min=-13.4375 max=12.8125
Linear input=0 dtype=torch.bfloat16 min=-7.40625 max=7.375
Linear output=0 dtype=torch.bfloat16 min=-5.21875 max=5.5625
Linear output=1 dtype=torch.bfloat16 min=-5.25 max=5.1875
Linear input=0 dtype=torch.bfloat16 min=-5.46875 max=4.4375
Linear output=0 dtype=torch.bfloat16 min=-5.0 max=6.3125
Linear output=1 dtype=torch.bfloat16 min=-7.125 max=6.125
Linear input=0 dtype=torch.bfloat16 min=-5.46875 max=4.4375
Linear output=0 dtype=torch.bfloat16 min=-6.0625 max=5.53125
Linear output=1 dtype=torch.bfloat16 min=-6.125 max=5.75
Linear input=0 dtype=torch.bfloat16 min=-5.46875 max=4.4375
Linear output=0 dtype=torch.bfloat16 min=-5.09375 max=6.03125
Linear output=1 dtype=torch.bfloat16 min=-6.34375 max=5.71875
Linear input=0 dtype=torch.bfloat16 min=-3.359375 max=3.5
Linear output=0 dtype=torch.bfloat16 min=-3.640625 max=8.5
Linear output=1 dtype=torch.bfloat16 min=-3.921875 max=10.0
Dropout input=0 dtype=torch.bfloat16 min=-3.921875 max=10.0
Dropout output=0 dtype=torch.bfloat16 min=-3.640625 max=8.5
Dropout output=1 dtype=torch.bfloat16 min=-3.921875 max=10.0
Linear input=0 dtype=torch.bfloat16 min=-4.5 max=5.34375
Linear output=0 dtype=torch.bfloat16 min=-13.875 max=10.0
Linear output=1 dtype=torch.bfloat16 min=-11.3125 max=15.8125
Attention output=0 dtype=torch.bfloat16 min=-3.921875 max=10.0
Attention output=1 dtype=torch.bfloat16 min=-13.875 max=15.8125
LayerNorm input=0 dtype=torch.bfloat16 min=-130.0 max=43.5
LayerNorm output=0 dtype=torch.bfloat16 min=-34.25 max=14.0
LayerNorm output=1 dtype=torch.bfloat16 min=-34.75 max=13.625
Linear input=0 dtype=torch.bfloat16 min=-6.875 max=2.328125
Linear output=0 dtype=torch.bfloat16 min=-6.125 max=4.875
Linear output=1 dtype=torch.bfloat16 min=-6.1875 max=3.796875
GELU input=0 dtype=torch.bfloat16 min=-6.875 max=2.328125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.796875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.796875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.875
Linear output=0 dtype=torch.bfloat16 min=-5.53125 max=12.25
Linear output=1 dtype=torch.bfloat16 min=-5.96875 max=12.375
FeedForward input=0 dtype=torch.bfloat16 min=-6.875 max=2.328125
FeedForward output=0 dtype=torch.bfloat16 min=-5.53125 max=12.25
FeedForward output=1 dtype=torch.bfloat16 min=-5.96875 max=12.375
LayerNorm input=0 dtype=torch.bfloat16 min=-4736.0 max=844.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.25 max=27.5
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=29.5
Linear input=0 dtype=torch.bfloat16 min=-6.8125 max=4.9375
Linear output=0 dtype=torch.bfloat16 min=-10.5 max=9.5625
Linear output=1 dtype=torch.bfloat16 min=-10.3125 max=8.5625
GELU input=0 dtype=torch.bfloat16 min=-6.8125 max=4.9375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.5625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=8.5625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.5625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.5625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=8.5625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.5625
Linear output=0 dtype=torch.bfloat16 min=-13.8125 max=40.5
Linear output=1 dtype=torch.bfloat16 min=-14.25 max=42.25
FeedForward input=0 dtype=torch.bfloat16 min=-6.8125 max=4.9375
FeedForward output=0 dtype=torch.bfloat16 min=-13.8125 max=40.5
FeedForward output=1 dtype=torch.bfloat16 min=-14.25 max=42.25
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-4864.0 max=988.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-109.0 max=51.25
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-5.0625 max=5.0625
Linear output=1 dtype=torch.bfloat16 min=-5.03125 max=5.0625
LayerNorm input=0 dtype=torch.bfloat16 min=-109.0 max=51.25
LayerNorm output=0 dtype=torch.bfloat16 min=-32.75 max=20.125
LayerNorm output=1 dtype=torch.bfloat16 min=-33.0 max=20.125
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-109.0 max=51.25
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-14.75 max=14.1875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-1.2109375 max=4.8125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.0625 max=1.671875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.734375 max=1.9609375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-5.0625 max=1.4609375
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-12.1875 max=12.25
Linear output=1 dtype=torch.bfloat16 min=-12.1875 max=12.25
LayerNorm input=0 dtype=torch.bfloat16 min=-4864.0 max=988.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=29.625
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=31.75
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-4864.0 max=988.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-4.96875 max=3.125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-6.8125 max=5.84375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.8984375 max=0.8203125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.546875 max=1.453125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-12.1875 max=12.25
Linear input=0 dtype=torch.bfloat16 min=-14.75 max=14.1875
Linear output=0 dtype=torch.bfloat16 min=-17.125 max=16.375
Linear output=1 dtype=torch.bfloat16 min=-15.625 max=15.4375
Linear input=0 dtype=torch.bfloat16 min=-14.75 max=14.1875
Linear output=0 dtype=torch.bfloat16 min=-21.625 max=20.125
Linear output=1 dtype=torch.bfloat16 min=-19.75 max=18.375
Linear input=0 dtype=torch.bfloat16 min=-14.75 max=14.1875
Linear output=0 dtype=torch.bfloat16 min=-9.9375 max=9.9375
Linear output=1 dtype=torch.bfloat16 min=-8.875 max=10.5
Linear input=0 dtype=torch.bfloat16 min=-4.96875 max=3.125
Linear output=0 dtype=torch.bfloat16 min=-5.875 max=6.5625
Linear output=1 dtype=torch.bfloat16 min=-5.59375 max=6.84375
Linear input=0 dtype=torch.bfloat16 min=-4.96875 max=3.125
Linear output=0 dtype=torch.bfloat16 min=-6.0 max=6.4375
Linear output=1 dtype=torch.bfloat16 min=-6.3125 max=6.65625
Linear input=0 dtype=torch.bfloat16 min=-4.96875 max=3.125
Linear output=0 dtype=torch.bfloat16 min=-5.9375 max=6.34375
Linear output=1 dtype=torch.bfloat16 min=-8.6875 max=6.5625
Linear input=0 dtype=torch.bfloat16 min=-6.09375 max=5.125
Linear output=0 dtype=torch.bfloat16 min=-15.875 max=8.25
Linear output=1 dtype=torch.bfloat16 min=-16.375 max=7.3125
Dropout input=0 dtype=torch.bfloat16 min=-16.375 max=8.25
Dropout output=0 dtype=torch.bfloat16 min=-15.875 max=8.25
Dropout output=1 dtype=torch.bfloat16 min=-16.375 max=7.3125
Linear input=0 dtype=torch.bfloat16 min=-4.78125 max=5.4375
Linear output=0 dtype=torch.bfloat16 min=-16.5 max=12.1875
Linear output=1 dtype=torch.bfloat16 min=-15.625 max=12.3125
Attention output=0 dtype=torch.bfloat16 min=-16.375 max=8.25
Attention output=1 dtype=torch.bfloat16 min=-16.5 max=12.3125
LayerNorm input=0 dtype=torch.bfloat16 min=-152.0 max=52.75
LayerNorm output=0 dtype=torch.bfloat16 min=-35.0 max=14.125
LayerNorm output=1 dtype=torch.bfloat16 min=-35.25 max=14.0
Linear input=0 dtype=torch.bfloat16 min=-10.125 max=1.9609375
Linear output=0 dtype=torch.bfloat16 min=-13.25 max=3.671875
Linear output=1 dtype=torch.bfloat16 min=-13.3125 max=4.5
GELU input=0 dtype=torch.bfloat16 min=-10.125 max=1.9609375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.671875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.5
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.5
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.671875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.5
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.5
Linear output=0 dtype=torch.bfloat16 min=-14.6875 max=9.875
Linear output=1 dtype=torch.bfloat16 min=-13.6875 max=11.75
FeedForward input=0 dtype=torch.bfloat16 min=-10.125 max=1.9609375
FeedForward output=0 dtype=torch.bfloat16 min=-14.6875 max=9.875
FeedForward output=1 dtype=torch.bfloat16 min=-13.6875 max=11.75
LayerNorm input=0 dtype=torch.bfloat16 min=-4896.0 max=968.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.0 max=26.625
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=29.0
Linear input=0 dtype=torch.bfloat16 min=-6.71875 max=3.21875
Linear output=0 dtype=torch.bfloat16 min=-9.8125 max=8.0625
Linear output=1 dtype=torch.bfloat16 min=-10.125 max=8.875
GELU input=0 dtype=torch.bfloat16 min=-6.71875 max=3.21875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.0625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=8.875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=8.875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.0625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=8.875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=8.875
Linear output=0 dtype=torch.bfloat16 min=-64.0 max=52.5
Linear output=1 dtype=torch.bfloat16 min=-71.5 max=60.0
FeedForward input=0 dtype=torch.bfloat16 min=-6.71875 max=3.21875
FeedForward output=0 dtype=torch.bfloat16 min=-64.0 max=52.5
FeedForward output=1 dtype=torch.bfloat16 min=-71.5 max=60.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-5088.0 max=1200.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-132.0 max=64.0
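
The coarse block-output extremes in this region (-5088 and 1200 here, 2912 and 3680 further down) are a bfloat16 rounding effect rather than suspicious round numbers: bfloat16 stores 8 bits of precision, so near 5000 adjacent representable values are 2**(12-7) = 32 apart and everything snaps to a multiple of 32. For example:

    import torch

    print(torch.tensor(5100.0, dtype=torch.bfloat16).item())  # 5088.0
    print(torch.tensor(2915.0, dtype=torch.bfloat16).item())  # 2912.0
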
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-2.296875 max=6.4375
Linear output=1 dtype=torch.bfloat16 min=-2.328125 max=6.40625
LayerNorm input=0 dtype=torch.bfloat16 min=-132.0 max=64.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.25 max=21.25
LayerNorm output=1 dtype=torch.bfloat16 min=-33.5 max=21.875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-132.0 max=64.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-9.5625 max=9.4375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.015625 max=5.5625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.9296875 max=1.71875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.9375 max=1.8984375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-1.9765625 max=6.4375
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-14.0625 max=15.5625
Linear output=1 dtype=torch.bfloat16 min=-14.0 max=15.625
LayerNorm input=0 dtype=torch.bfloat16 min=-5088.0 max=1200.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=28.375
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=29.375
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-5088.0 max=1200.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-5.0625 max=3.21875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-8.25 max=9.125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.3203125 max=0.73828125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.5 max=1.2578125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-14.0625 max=15.625
Linear input=0 dtype=torch.bfloat16 min=-9.5625 max=9.4375
Linear output=0 dtype=torch.bfloat16 min=-16.25 max=15.375
Linear output=1 dtype=torch.bfloat16 min=-16.375 max=15.3125
Linear input=0 dtype=torch.bfloat16 min=-9.5625 max=9.4375
Linear output=0 dtype=torch.bfloat16 min=-15.6875 max=18.125
Linear output=1 dtype=torch.bfloat16 min=-14.875 max=17.625
Linear input=0 dtype=torch.bfloat16 min=-9.5625 max=9.4375
Linear output=0 dtype=torch.bfloat16 min=-15.25 max=15.75
Linear output=1 dtype=torch.bfloat16 min=-15.5625 max=16.25
Linear input=0 dtype=torch.bfloat16 min=-5.0625 max=3.21875
Linear output=0 dtype=torch.bfloat16 min=-5.28125 max=5.75
Linear output=1 dtype=torch.bfloat16 min=-5.25 max=5.1875
Linear input=0 dtype=torch.bfloat16 min=-5.0625 max=3.21875
Linear output=0 dtype=torch.bfloat16 min=-6.75 max=6.03125
Linear output=1 dtype=torch.bfloat16 min=-7.09375 max=7.9375
Linear input=0 dtype=torch.bfloat16 min=-5.0625 max=3.21875
Linear output=0 dtype=torch.bfloat16 min=-6.75 max=4.875
Linear output=1 dtype=torch.bfloat16 min=-7.71875 max=6.84375
Linear input=0 dtype=torch.bfloat16 min=-5.625 max=5.375
Linear output=0 dtype=torch.bfloat16 min=-18.5 max=4.75
Linear output=1 dtype=torch.bfloat16 min=-18.0 max=4.96875
Dropout input=0 dtype=torch.bfloat16 min=-18.5 max=4.96875
Dropout output=0 dtype=torch.bfloat16 min=-18.5 max=4.75
Dropout output=1 dtype=torch.bfloat16 min=-18.0 max=4.96875
Linear input=0 dtype=torch.bfloat16 min=-7.4375 max=5.09375
Linear output=0 dtype=torch.bfloat16 min=-19.5 max=20.0
Linear output=1 dtype=torch.bfloat16 min=-19.75 max=19.25
Attention output=0 dtype=torch.bfloat16 min=-18.5 max=4.96875
Attention output=1 dtype=torch.bfloat16 min=-19.75 max=20.0
LayerNorm input=0 dtype=torch.bfloat16 min=-196.0 max=66.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.0 max=13.4375
LayerNorm output=1 dtype=torch.bfloat16 min=-36.0 max=14.0
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=1.75
Linear output=0 dtype=torch.bfloat16 min=-9.6875 max=2.78125
Linear output=1 dtype=torch.bfloat16 min=-10.75 max=3.796875
GELU input=0 dtype=torch.bfloat16 min=-9.8125 max=1.75
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.78125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.796875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.796875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.78125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.796875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.796875
Linear output=0 dtype=torch.bfloat16 min=-4.71875 max=13.8125
Linear output=1 dtype=torch.bfloat16 min=-5.25 max=13.375
FeedForward input=0 dtype=torch.bfloat16 min=-9.8125 max=1.75
FeedForward output=0 dtype=torch.bfloat16 min=-4.71875 max=13.8125
FeedForward output=1 dtype=torch.bfloat16 min=-5.25 max=13.375
LayerNorm input=0 dtype=torch.bfloat16 min=-5088.0 max=1192.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=27.75
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=28.5
Linear input=0 dtype=torch.bfloat16 min=-7.4375 max=4.3125
Linear output=0 dtype=torch.bfloat16 min=-13.9375 max=11.75
Linear output=1 dtype=torch.bfloat16 min=-13.625 max=11.3125
GELU input=0 dtype=torch.bfloat16 min=-7.4375 max=4.3125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=11.75
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=11.3125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=11.75
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=11.75
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=11.3125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=11.75
Linear output=0 dtype=torch.bfloat16 min=-52.5 max=47.75
Linear output=1 dtype=torch.bfloat16 min=-54.0 max=49.75
FeedForward input=0 dtype=torch.bfloat16 min=-7.4375 max=4.3125
FeedForward output=0 dtype=torch.bfloat16 min=-52.5 max=47.75
FeedForward output=1 dtype=torch.bfloat16 min=-54.0 max=49.75
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-4896.0 max=1832.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-186.0 max=70.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-6.34375 max=6.0
Linear output=1 dtype=torch.bfloat16 min=-6.375 max=6.0
LayerNorm input=0 dtype=torch.bfloat16 min=-186.0 max=70.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.25 max=20.375
LayerNorm output=1 dtype=torch.bfloat16 min=-34.75 max=20.75
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-186.0 max=70.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-6.5 max=6.0
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-6.375 max=1.9609375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.8828125 max=1.6015625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.921875 max=1.3515625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-2.046875 max=6.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-13.4375 max=13.375
Linear output=1 dtype=torch.bfloat16 min=-13.5 max=13.375
LayerNorm input=0 dtype=torch.bfloat16 min=-4896.0 max=1832.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.5 max=28.375
LayerNorm output=1 dtype=torch.bfloat16 min=-35.5 max=29.25
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-4896.0 max=1832.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-5.34375 max=4.0
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-11.1875 max=10.5625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.76171875 max=0.64453125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.171875 max=1.53125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-13.5 max=13.375
Linear input=0 dtype=torch.bfloat16 min=-6.5 max=6.0
Linear output=0 dtype=torch.bfloat16 min=-9.5625 max=7.40625
Linear output=1 dtype=torch.bfloat16 min=-9.5 max=7.375
Linear input=0 dtype=torch.bfloat16 min=-6.5 max=6.0
Linear output=0 dtype=torch.bfloat16 min=-10.75 max=11.4375
Linear output=1 dtype=torch.bfloat16 min=-10.0625 max=11.5
Linear input=0 dtype=torch.bfloat16 min=-6.5 max=6.0
Linear output=0 dtype=torch.bfloat16 min=-5.34375 max=5.78125
Linear output=1 dtype=torch.bfloat16 min=-6.09375 max=5.625
Linear input=0 dtype=torch.bfloat16 min=-5.34375 max=4.0
Linear output=0 dtype=torch.bfloat16 min=-6.21875 max=6.4375
Linear output=1 dtype=torch.bfloat16 min=-6.4375 max=7.21875
Linear input=0 dtype=torch.bfloat16 min=-5.34375 max=4.0
Linear output=0 dtype=torch.bfloat16 min=-6.6875 max=6.34375
Linear output=1 dtype=torch.bfloat16 min=-8.25 max=6.90625
Linear input=0 dtype=torch.bfloat16 min=-5.34375 max=4.0
Linear output=0 dtype=torch.bfloat16 min=-7.15625 max=8.4375
Linear output=1 dtype=torch.bfloat16 min=-10.5625 max=8.875
Linear input=0 dtype=torch.bfloat16 min=-9.375 max=8.25
Linear output=0 dtype=torch.bfloat16 min=-5.59375 max=15.5
Linear output=1 dtype=torch.bfloat16 min=-7.625 max=17.875
Dropout input=0 dtype=torch.bfloat16 min=-7.625 max=17.875
Dropout output=0 dtype=torch.bfloat16 min=-5.59375 max=15.5
Dropout output=1 dtype=torch.bfloat16 min=-7.625 max=17.875
Linear input=0 dtype=torch.bfloat16 min=-6.0 max=6.09375
Linear output=0 dtype=torch.bfloat16 min=-18.75 max=20.0
Linear output=1 dtype=torch.bfloat16 min=-17.75 max=18.0
Attention output=0 dtype=torch.bfloat16 min=-7.625 max=17.875
Attention output=1 dtype=torch.bfloat16 min=-18.75 max=20.0
LayerNorm input=0 dtype=torch.bfloat16 min=-222.0 max=71.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.0 max=13.25
LayerNorm output=1 dtype=torch.bfloat16 min=-36.0 max=13.5625
Linear input=0 dtype=torch.bfloat16 min=-8.3125 max=2.15625
Linear output=0 dtype=torch.bfloat16 min=-4.71875 max=2.984375
Linear output=1 dtype=torch.bfloat16 min=-5.96875 max=3.625
GELU input=0 dtype=torch.bfloat16 min=-8.3125 max=2.15625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.984375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.984375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.625
Linear output=0 dtype=torch.bfloat16 min=-3.90625 max=19.625
Linear output=1 dtype=torch.bfloat16 min=-4.40625 max=18.25
FeedForward input=0 dtype=torch.bfloat16 min=-8.3125 max=2.15625
FeedForward output=0 dtype=torch.bfloat16 min=-3.90625 max=19.625
FeedForward output=1 dtype=torch.bfloat16 min=-4.40625 max=18.25
LayerNorm input=0 dtype=torch.bfloat16 min=-4960.0 max=1848.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=26.75
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=27.875
Linear input=0 dtype=torch.bfloat16 min=-8.75 max=4.5625
Linear output=0 dtype=torch.bfloat16 min=-24.75 max=18.25
Linear output=1 dtype=torch.bfloat16 min=-26.0 max=15.9375
GELU input=0 dtype=torch.bfloat16 min=-8.75 max=4.5625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=18.25
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=15.9375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=18.25
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=18.25
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=15.9375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=18.25
Linear output=0 dtype=torch.bfloat16 min=-84.5 max=79.5
Linear output=1 dtype=torch.bfloat16 min=-83.0 max=78.0
FeedForward input=0 dtype=torch.bfloat16 min=-8.75 max=4.5625
FeedForward output=0 dtype=torch.bfloat16 min=-84.5 max=79.5
FeedForward output=1 dtype=torch.bfloat16 min=-83.0 max=78.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-5024.0 max=2912.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-162.0 max=79.5
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-2.734375 max=5.71875
Linear output=1 dtype=torch.bfloat16 min=-2.71875 max=5.71875
LayerNorm input=0 dtype=torch.bfloat16 min=-162.0 max=79.5
LayerNorm output=0 dtype=torch.bfloat16 min=-31.125 max=21.75
LayerNorm output=1 dtype=torch.bfloat16 min=-33.0 max=22.0
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-162.0 max=79.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-11.3125 max=11.1875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-1.9453125 max=5.71875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.484375 max=1.4765625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.859375 max=1.1640625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-2.734375 max=5.625
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-13.625 max=15.5
Linear output=1 dtype=torch.bfloat16 min=-13.6875 max=15.5
LayerNorm input=0 dtype=torch.bfloat16 min=-5024.0 max=2912.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.25 max=30.0
LayerNorm output=1 dtype=torch.bfloat16 min=-35.25 max=30.75
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-5024.0 max=2912.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-4.5625 max=4.1875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-6.09375 max=11.5625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.8359375 max=1.0859375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.140625 max=1.8515625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-13.6875 max=15.5
Linear input=0 dtype=torch.bfloat16 min=-11.3125 max=11.1875
Linear output=0 dtype=torch.bfloat16 min=-14.5625 max=12.9375
Linear output=1 dtype=torch.bfloat16 min=-13.0 max=12.8125
Linear input=0 dtype=torch.bfloat16 min=-11.3125 max=11.1875
Linear output=0 dtype=torch.bfloat16 min=-20.625 max=19.125
Linear output=1 dtype=torch.bfloat16 min=-18.0 max=18.125
Linear input=0 dtype=torch.bfloat16 min=-11.3125 max=11.1875
Linear output=0 dtype=torch.bfloat16 min=-7.15625 max=6.28125
Linear output=1 dtype=torch.bfloat16 min=-5.8125 max=6.03125
Linear input=0 dtype=torch.bfloat16 min=-4.5625 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-5.9375 max=6.0
Linear output=1 dtype=torch.bfloat16 min=-6.0 max=7.0625
Linear input=0 dtype=torch.bfloat16 min=-4.5625 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-6.8125 max=5.09375
Linear output=1 dtype=torch.bfloat16 min=-7.625 max=7.53125
Linear input=0 dtype=torch.bfloat16 min=-4.5625 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-8.375 max=8.125
Linear output=1 dtype=torch.bfloat16 min=-8.8125 max=7.9375
Linear input=0 dtype=torch.bfloat16 min=-4.125 max=4.875
Linear output=0 dtype=torch.bfloat16 min=-23.0 max=4.21875
Linear output=1 dtype=torch.bfloat16 min=-23.25 max=4.78125
Dropout input=0 dtype=torch.bfloat16 min=-23.25 max=4.78125
Dropout output=0 dtype=torch.bfloat16 min=-23.0 max=4.21875
Dropout output=1 dtype=torch.bfloat16 min=-23.25 max=4.78125
Linear input=0 dtype=torch.bfloat16 min=-7.6875 max=6.90625
Linear output=0 dtype=torch.bfloat16 min=-20.625 max=13.6875
Linear output=1 dtype=torch.bfloat16 min=-24.375 max=12.8125
Attention output=0 dtype=torch.bfloat16 min=-23.25 max=4.78125
Attention output=1 dtype=torch.bfloat16 min=-24.375 max=13.6875
LayerNorm input=0 dtype=torch.bfloat16 min=-224.0 max=82.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.0 max=13.25
LayerNorm output=1 dtype=torch.bfloat16 min=-36.0 max=16.0
Linear input=0 dtype=torch.bfloat16 min=-8.4375 max=3.015625
Linear output=0 dtype=torch.bfloat16 min=-6.40625 max=3.796875
Linear output=1 dtype=torch.bfloat16 min=-7.28125 max=4.5625
GELU input=0 dtype=torch.bfloat16 min=-8.4375 max=3.015625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.796875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.5625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.5625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.796875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.5625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.5625
Linear output=0 dtype=torch.bfloat16 min=-5.71875 max=23.75
Linear output=1 dtype=torch.bfloat16 min=-6.46875 max=25.625
FeedForward input=0 dtype=torch.bfloat16 min=-8.4375 max=3.015625
FeedForward output=0 dtype=torch.bfloat16 min=-5.71875 max=23.75
FeedForward output=1 dtype=torch.bfloat16 min=-6.46875 max=25.625
LayerNorm input=0 dtype=torch.bfloat16 min=-4960.0 max=2912.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.5 max=29.375
LayerNorm output=1 dtype=torch.bfloat16 min=-36.0 max=30.0
Linear input=0 dtype=torch.bfloat16 min=-7.65625 max=4.09375
Linear output=0 dtype=torch.bfloat16 min=-12.9375 max=19.25
Linear output=1 dtype=torch.bfloat16 min=-12.125 max=12.0625
GELU input=0 dtype=torch.bfloat16 min=-7.65625 max=4.09375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=19.25
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=12.0625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=19.25
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=19.25
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=12.0625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=19.25
Linear output=0 dtype=torch.bfloat16 min=-65.5 max=62.25
Linear output=1 dtype=torch.bfloat16 min=-65.5 max=61.5
FeedForward input=0 dtype=torch.bfloat16 min=-7.65625 max=4.09375
FeedForward output=0 dtype=torch.bfloat16 min=-65.5 max=62.25
FeedForward output=1 dtype=torch.bfloat16 min=-65.5 max=61.5
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-4896.0 max=3680.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-153.0 max=87.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-5.5625 max=5.84375
Linear output=1 dtype=torch.bfloat16 min=-5.5625 max=5.84375
LayerNorm input=0 dtype=torch.bfloat16 min=-153.0 max=87.0
LayerNorm output=0 dtype=torch.bfloat16 min=-29.875 max=23.875
LayerNorm output=1 dtype=torch.bfloat16 min=-31.375 max=23.75
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-153.0 max=87.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-9.8125 max=9.875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.5 max=5.84375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.046875 max=1.8046875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.71875 max=0.9375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-5.5625 max=3.015625
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-14.0 max=11.8125
Linear output=1 dtype=torch.bfloat16 min=-14.125 max=11.8125
LayerNorm input=0 dtype=torch.bfloat16 min=-4896.0 max=3680.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.0 max=31.25
LayerNorm output=1 dtype=torch.bfloat16 min=-35.5 max=31.75
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-4896.0 max=3680.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-4.5625 max=5.09375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-10.8125 max=11.8125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.9609375 max=0.875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.453125 max=3.03125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-14.125 max=11.5
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=9.875
Linear output=0 dtype=torch.bfloat16 min=-17.875 max=18.625
Linear output=1 dtype=torch.bfloat16 min=-17.25 max=17.75
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=9.875
Linear output=0 dtype=torch.bfloat16 min=-18.625 max=20.75
Linear output=1 dtype=torch.bfloat16 min=-18.375 max=20.625
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=9.875
Linear output=0 dtype=torch.bfloat16 min=-8.0625 max=8.3125
Linear output=1 dtype=torch.bfloat16 min=-7.65625 max=7.90625
Linear input=0 dtype=torch.bfloat16 min=-4.5625 max=5.09375
Linear output=0 dtype=torch.bfloat16 min=-7.03125 max=7.8125
Linear output=1 dtype=torch.bfloat16 min=-6.3125 max=6.28125
Linear input=0 dtype=torch.bfloat16 min=-4.5625 max=5.09375
Linear output=0 dtype=torch.bfloat16 min=-5.90625 max=6.1875
Linear output=1 dtype=torch.bfloat16 min=-6.5625 max=7.96875
Linear input=0 dtype=torch.bfloat16 min=-4.5625 max=5.09375
Linear output=0 dtype=torch.bfloat16 min=-7.21875 max=7.96875
Linear output=1 dtype=torch.bfloat16 min=-7.5625 max=9.25
Linear input=0 dtype=torch.bfloat16 min=-5.84375 max=6.125
Linear output=0 dtype=torch.bfloat16 min=-23.25 max=4.5625
Linear output=1 dtype=torch.bfloat16 min=-23.5 max=5.25
Dropout input=0 dtype=torch.bfloat16 min=-23.5 max=5.25
Dropout output=0 dtype=torch.bfloat16 min=-23.25 max=4.5625
Dropout output=1 dtype=torch.bfloat16 min=-23.5 max=5.25
Linear input=0 dtype=torch.bfloat16 min=-5.25 max=6.0
Linear output=0 dtype=torch.bfloat16 min=-30.375 max=30.625
Linear output=1 dtype=torch.bfloat16 min=-27.0 max=25.75
Attention output=0 dtype=torch.bfloat16 min=-23.5 max=5.25
Attention output=1 dtype=torch.bfloat16 min=-30.375 max=30.625
LayerNorm input=0 dtype=torch.bfloat16 min=-216.0 max=92.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=14.4375
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=17.375
Linear input=0 dtype=torch.bfloat16 min=-10.75 max=2.421875
Linear output=0 dtype=torch.bfloat16 min=-4.625 max=2.96875
Linear output=1 dtype=torch.bfloat16 min=-5.53125 max=3.53125
GELU input=0 dtype=torch.bfloat16 min=-10.75 max=2.421875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.96875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.53125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.53125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.96875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.53125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.53125
Linear output=0 dtype=torch.bfloat16 min=-29.875 max=7.46875
Linear output=1 dtype=torch.bfloat16 min=-28.5 max=9.125
FeedForward input=0 dtype=torch.bfloat16 min=-10.75 max=2.421875
FeedForward output=0 dtype=torch.bfloat16 min=-29.875 max=7.46875
FeedForward output=1 dtype=torch.bfloat16 min=-28.5 max=9.125
LayerNorm input=0 dtype=torch.bfloat16 min=-4928.0 max=3600.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.0 max=30.75
LayerNorm output=1 dtype=torch.bfloat16 min=-35.25 max=31.5
Linear input=0 dtype=torch.bfloat16 min=-8.3125 max=5.03125
Linear output=0 dtype=torch.bfloat16 min=-18.875 max=13.6875
Linear output=1 dtype=torch.bfloat16 min=-18.0 max=12.125
GELU input=0 dtype=torch.bfloat16 min=-8.3125 max=5.03125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=13.6875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=12.125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=13.6875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=13.6875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=12.125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=13.6875
Linear output=0 dtype=torch.bfloat16 min=-103.0 max=56.75
Linear output=1 dtype=torch.bfloat16 min=-89.5 max=50.75
FeedForward input=0 dtype=torch.bfloat16 min=-8.3125 max=5.03125
FeedForward output=0 dtype=torch.bfloat16 min=-103.0 max=56.75
FeedForward output=1 dtype=torch.bfloat16 min=-89.5 max=50.75
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-5440.0 max=3888.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-156.0 max=99.5
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-4.5625 max=8.6875
Linear output=1 dtype=torch.bfloat16 min=-4.59375 max=8.6875
LayerNorm input=0 dtype=torch.bfloat16 min=-156.0 max=99.5
LayerNorm output=0 dtype=torch.bfloat16 min=-26.375 max=25.625
LayerNorm output=1 dtype=torch.bfloat16 min=-30.75 max=24.875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-156.0 max=99.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-19.25 max=20.125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.59375 max=6.25
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.828125 max=2.515625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.6796875 max=1.2421875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-4.03125 max=8.6875
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-14.25 max=13.375
Linear output=1 dtype=torch.bfloat16 min=-14.25 max=13.375
LayerNorm input=0 dtype=torch.bfloat16 min=-5440.0 max=3888.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.25 max=32.75
LayerNorm output=1 dtype=torch.bfloat16 min=-34.75 max=33.25
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-5440.0 max=3888.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-3.953125 max=5.71875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-10.75 max=11.4375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.09375 max=0.76953125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.203125 max=2.40625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-14.25 max=13.375
Linear input=0 dtype=torch.bfloat16 min=-19.25 max=20.125
Linear output=0 dtype=torch.bfloat16 min=-22.75 max=26.875
Linear output=1 dtype=torch.bfloat16 min=-21.875 max=25.125
Linear input=0 dtype=torch.bfloat16 min=-19.25 max=20.125
Linear output=0 dtype=torch.bfloat16 min=-36.25 max=34.25
Linear output=1 dtype=torch.bfloat16 min=-32.5 max=32.0
Linear input=0 dtype=torch.bfloat16 min=-19.25 max=20.125
Linear output=0 dtype=torch.bfloat16 min=-9.4375 max=8.75
Linear output=1 dtype=torch.bfloat16 min=-9.6875 max=9.0
Linear input=0 dtype=torch.bfloat16 min=-3.953125 max=5.71875
Linear output=0 dtype=torch.bfloat16 min=-7.03125 max=5.875
Linear output=1 dtype=torch.bfloat16 min=-6.59375 max=6.875
Linear input=0 dtype=torch.bfloat16 min=-3.953125 max=5.71875
Linear output=0 dtype=torch.bfloat16 min=-6.625 max=6.375
Linear output=1 dtype=torch.bfloat16 min=-6.84375 max=6.65625
Linear input=0 dtype=torch.bfloat16 min=-3.953125 max=5.71875
Linear output=0 dtype=torch.bfloat16 min=-5.9375 max=6.25
Linear output=1 dtype=torch.bfloat16 min=-6.9375 max=6.34375
Linear input=0 dtype=torch.bfloat16 min=-4.75 max=4.96875
Linear output=0 dtype=torch.bfloat16 min=-39.0 max=7.5
Linear output=1 dtype=torch.bfloat16 min=-36.5 max=7.5625
Dropout input=0 dtype=torch.bfloat16 min=-39.0 max=7.5625
Dropout output=0 dtype=torch.bfloat16 min=-39.0 max=7.5
Dropout output=1 dtype=torch.bfloat16 min=-36.5 max=7.5625
Linear input=0 dtype=torch.bfloat16 min=-6.4375 max=5.875
Linear output=0 dtype=torch.bfloat16 min=-34.75 max=34.75
Linear output=1 dtype=torch.bfloat16 min=-35.25 max=33.25
Attention output=0 dtype=torch.bfloat16 min=-39.0 max=7.5625
Attention output=1 dtype=torch.bfloat16 min=-35.25 max=34.75
LayerNorm input=0 dtype=torch.bfloat16 min=-296.0 max=106.5
LayerNorm output=0 dtype=torch.bfloat16 min=-36.25 max=12.1875
LayerNorm output=1 dtype=torch.bfloat16 min=-36.25 max=15.125
Linear input=0 dtype=torch.bfloat16 min=-6.875 max=2.96875
Linear output=0 dtype=torch.bfloat16 min=-6.40625 max=5.84375
Linear output=1 dtype=torch.bfloat16 min=-5.71875 max=5.21875
GELU input=0 dtype=torch.bfloat16 min=-6.875 max=2.96875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.84375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=5.21875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.84375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.84375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=5.21875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.84375
Linear output=0 dtype=torch.bfloat16 min=-4.53125 max=15.9375
Linear output=1 dtype=torch.bfloat16 min=-5.78125 max=14.75
FeedForward input=0 dtype=torch.bfloat16 min=-6.875 max=2.96875
FeedForward output=0 dtype=torch.bfloat16 min=-4.53125 max=15.9375
FeedForward output=1 dtype=torch.bfloat16 min=-5.78125 max=14.75
LayerNorm input=0 dtype=torch.bfloat16 min=-5440.0 max=3776.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.0 max=32.25
LayerNorm output=1 dtype=torch.bfloat16 min=-34.75 max=32.75
Linear input=0 dtype=torch.bfloat16 min=-9.5 max=3.609375
Linear output=0 dtype=torch.bfloat16 min=-15.4375 max=10.0
Linear output=1 dtype=torch.bfloat16 min=-20.625 max=9.3125
GELU input=0 dtype=torch.bfloat16 min=-9.5 max=3.609375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=10.0
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=9.3125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.0
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=10.0
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=9.3125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.0
Linear output=0 dtype=torch.bfloat16 min=-26.5 max=29.375
Linear output=1 dtype=torch.bfloat16 min=-26.625 max=28.875
FeedForward input=0 dtype=torch.bfloat16 min=-9.5 max=3.609375
FeedForward output=0 dtype=torch.bfloat16 min=-26.5 max=29.375
FeedForward output=1 dtype=torch.bfloat16 min=-26.625 max=28.875
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-5504.0 max=3840.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-260.0 max=110.5
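Note: each `AdaLayerNormZero` entry lists five outputs (`output=0` through `output=4`), always preceded by a SiLU, a Linear, and a LayerNorm group. That matches the adaLN-Zero scheme used by MMDiT-style blocks: the conditioning embedding goes through SiLU + Linear and is chunked into shift/scale/gate terms; output=0 is the modulated hidden state, and the remaining four are the attention gate plus the MLP shift/scale/gate consumed later in the block. A generic sketch of that computation (the class and the six-way chunk are standard adaLN-Zero, not lifted from this log):

```python
import torch
from torch import nn

class AdaLNZero(nn.Module):
    """adaLN-Zero conditioning: normalize, then modulate with shift/scale from
    the embedding, and return gates for the attention and MLP branches."""
    def __init__(self, dim: int):
        super().__init__()
        self.silu = nn.SiLU()
        self.linear = nn.Linear(dim, 6 * dim)
        self.norm = nn.LayerNorm(dim, elementwise_affine=False, eps=1e-6)

    def forward(self, x, emb):
        # The SiLU and Linear lines in the log come from these two calls.
        emb = self.linear(self.silu(emb))
        shift_msa, scale_msa, gate_msa, shift_mlp, scale_mlp, gate_mlp = emb.chunk(6, dim=1)
        # output=0: modulated hidden states; outputs 1..4: the gate and MLP terms.
        x = self.norm(x) * (1 + scale_msa[:, None]) + shift_msa[:, None]
        return x, gate_msa, shift_mlp, scale_mlp, gate_mlp
```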
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-7.09375 max=5.3125
Linear output=1 dtype=torch.bfloat16 min=-7.09375 max=5.3125
LayerNorm input=0 dtype=torch.bfloat16 min=-260.0 max=110.5
LayerNorm output=0 dtype=torch.bfloat16 min=-33.75 max=18.375
LayerNorm output=1 dtype=torch.bfloat16 min=-33.5 max=20.25
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-260.0 max=110.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-8.4375 max=8.3125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.34375 max=5.3125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.5 max=1.8359375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.6640625 max=2.03125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-7.09375 max=3.921875
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-15.1875 max=11.25
Linear output=1 dtype=torch.bfloat16 min=-15.25 max=11.25
LayerNorm input=0 dtype=torch.bfloat16 min=-5504.0 max=3840.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.0 max=33.0
LayerNorm output=1 dtype=torch.bfloat16 min=-34.25 max=33.5
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-5504.0 max=3840.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-8.5 max=6.34375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-11.375 max=9.5
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.125 max=0.5703125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.140625 max=3.34375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-15.25 max=11.25
Linear input=0 dtype=torch.bfloat16 min=-8.4375 max=8.3125
Linear output=0 dtype=torch.bfloat16 min=-6.6875 max=6.3125
Linear output=1 dtype=torch.bfloat16 min=-7.0625 max=6.5
Linear input=0 dtype=torch.bfloat16 min=-8.4375 max=8.3125
Linear output=0 dtype=torch.bfloat16 min=-11.1875 max=9.5625
Linear output=1 dtype=torch.bfloat16 min=-10.9375 max=8.75
Linear input=0 dtype=torch.bfloat16 min=-8.4375 max=8.3125
Linear output=0 dtype=torch.bfloat16 min=-6.5625 max=6.84375
Linear output=1 dtype=torch.bfloat16 min=-6.0625 max=6.59375
Linear input=0 dtype=torch.bfloat16 min=-8.5 max=6.34375
Linear output=0 dtype=torch.bfloat16 min=-11.3125 max=17.0
Linear output=1 dtype=torch.bfloat16 min=-12.5 max=17.375
Linear input=0 dtype=torch.bfloat16 min=-8.5 max=6.34375
Linear output=0 dtype=torch.bfloat16 min=-8.25 max=11.375
Linear output=1 dtype=torch.bfloat16 min=-8.6875 max=11.375
Linear input=0 dtype=torch.bfloat16 min=-8.5 max=6.34375
Linear output=0 dtype=torch.bfloat16 min=-9.3125 max=9.875
Linear output=1 dtype=torch.bfloat16 min=-9.5625 max=10.0625
Linear input=0 dtype=torch.bfloat16 min=-6.71875 max=4.71875
Linear output=0 dtype=torch.bfloat16 min=-28.375 max=5.46875
Linear output=1 dtype=torch.bfloat16 min=-25.625 max=5.78125
Dropout input=0 dtype=torch.bfloat16 min=-28.375 max=5.78125
Dropout output=0 dtype=torch.bfloat16 min=-28.375 max=5.46875
Dropout output=1 dtype=torch.bfloat16 min=-25.625 max=5.78125
Linear input=0 dtype=torch.bfloat16 min=-7.0 max=7.46875
Linear output=0 dtype=torch.bfloat16 min=-41.5 max=35.5
Linear output=1 dtype=torch.bfloat16 min=-47.25 max=37.5
Attention output=0 dtype=torch.bfloat16 min=-28.375 max=5.78125
Attention output=1 dtype=torch.bfloat16 min=-47.25 max=37.5
LayerNorm input=0 dtype=torch.bfloat16 min=-312.0 max=112.5
LayerNorm output=0 dtype=torch.bfloat16 min=-37.0 max=11.9375
LayerNorm output=1 dtype=torch.bfloat16 min=-36.75 max=15.1875
Linear input=0 dtype=torch.bfloat16 min=-2.65625 max=4.34375
Linear output=0 dtype=torch.bfloat16 min=-4.65625 max=4.84375
Linear output=1 dtype=torch.bfloat16 min=-5.1875 max=5.125
GELU input=0 dtype=torch.bfloat16 min=-2.65625 max=4.34375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.84375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=5.125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.84375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=5.125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.125
Linear output=0 dtype=torch.bfloat16 min=-18.25 max=3.515625
Linear output=1 dtype=torch.bfloat16 min=-17.125 max=3.609375
FeedForward input=0 dtype=torch.bfloat16 min=-2.65625 max=4.34375
FeedForward output=0 dtype=torch.bfloat16 min=-18.25 max=3.515625
FeedForward output=1 dtype=torch.bfloat16 min=-17.125 max=3.609375
LayerNorm input=0 dtype=torch.bfloat16 min=-5536.0 max=3776.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.0 max=32.75
LayerNorm output=1 dtype=torch.bfloat16 min=-34.0 max=33.25
Linear input=0 dtype=torch.bfloat16 min=-10.5625 max=4.15625
Linear output=0 dtype=torch.bfloat16 min=-87.0 max=37.75
Linear output=1 dtype=torch.bfloat16 min=-126.0 max=27.125
GELU input=0 dtype=torch.bfloat16 min=-10.5625 max=4.15625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=37.75
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=27.125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=37.75
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=37.75
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=27.125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=37.75
Linear output=0 dtype=torch.bfloat16 min=-54.75 max=59.25
Linear output=1 dtype=torch.bfloat16 min=-55.75 max=54.25
FeedForward input=0 dtype=torch.bfloat16 min=-10.5625 max=4.15625
FeedForward output=0 dtype=torch.bfloat16 min=-54.75 max=59.25
FeedForward output=1 dtype=torch.bfloat16 min=-55.75 max=54.25
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-5440.0 max=4032.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-268.0 max=119.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-6.15625 max=5.65625
Linear output=1 dtype=torch.bfloat16 min=-6.125 max=5.59375
LayerNorm input=0 dtype=torch.bfloat16 min=-268.0 max=119.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.25 max=17.375
LayerNorm output=1 dtype=torch.bfloat16 min=-34.5 max=20.25
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-268.0 max=119.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-8.375 max=7.03125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.25 max=5.65625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.4296875 max=1.375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.9765625 max=2.21875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-6.15625 max=3.53125
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-15.3125 max=10.9375
Linear output=1 dtype=torch.bfloat16 min=-15.375 max=10.9375
LayerNorm input=0 dtype=torch.bfloat16 min=-5440.0 max=4032.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.75 max=33.5
LayerNorm output=1 dtype=torch.bfloat16 min=-33.75 max=34.0
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-5440.0 max=4032.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-7.28125 max=5.25
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-15.375 max=10.9375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.94921875 max=1.515625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.71875 max=9.75
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-11.625 max=9.375
Linear input=0 dtype=torch.bfloat16 min=-8.375 max=7.03125
Linear output=0 dtype=torch.bfloat16 min=-14.75 max=14.875
Linear output=1 dtype=torch.bfloat16 min=-14.3125 max=14.375
Linear input=0 dtype=torch.bfloat16 min=-8.375 max=7.03125
Linear output=0 dtype=torch.bfloat16 min=-27.375 max=28.25
Linear output=1 dtype=torch.bfloat16 min=-27.0 max=26.25
Linear input=0 dtype=torch.bfloat16 min=-8.375 max=7.03125
Linear output=0 dtype=torch.bfloat16 min=-7.59375 max=8.0
Linear output=1 dtype=torch.bfloat16 min=-7.625 max=7.90625
Linear input=0 dtype=torch.bfloat16 min=-7.28125 max=5.25
Linear output=0 dtype=torch.bfloat16 min=-7.46875 max=8.0625
Linear output=1 dtype=torch.bfloat16 min=-7.125 max=7.0
Linear input=0 dtype=torch.bfloat16 min=-7.28125 max=5.25
Linear output=0 dtype=torch.bfloat16 min=-5.875 max=6.6875
Linear output=1 dtype=torch.bfloat16 min=-6.375 max=6.71875
Linear input=0 dtype=torch.bfloat16 min=-7.28125 max=5.25
Linear output=0 dtype=torch.bfloat16 min=-15.8125 max=14.875
Linear output=1 dtype=torch.bfloat16 min=-14.75 max=14.1875
Linear input=0 dtype=torch.bfloat16 min=-5.6875 max=5.375
Linear output=0 dtype=torch.bfloat16 min=-23.25 max=5.90625
Linear output=1 dtype=torch.bfloat16 min=-22.5 max=5.21875
Dropout input=0 dtype=torch.bfloat16 min=-23.25 max=5.90625
Dropout output=0 dtype=torch.bfloat16 min=-23.25 max=5.90625
Dropout output=1 dtype=torch.bfloat16 min=-22.5 max=5.21875
Linear input=0 dtype=torch.bfloat16 min=-14.625 max=13.625
Linear output=0 dtype=torch.bfloat16 min=-72.5 max=93.5
Linear output=1 dtype=torch.bfloat16 min=-69.5 max=86.5
Attention output=0 dtype=torch.bfloat16 min=-23.25 max=5.90625
Attention output=1 dtype=torch.bfloat16 min=-72.5 max=93.5
LayerNorm input=0 dtype=torch.bfloat16 min=-302.0 max=120.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.75 max=12.9375
LayerNorm output=1 dtype=torch.bfloat16 min=-36.75 max=16.375
Linear input=0 dtype=torch.bfloat16 min=-2.5 max=4.59375
Linear output=0 dtype=torch.bfloat16 min=-4.21875 max=2.609375
Linear output=1 dtype=torch.bfloat16 min=-4.21875 max=2.953125
GELU input=0 dtype=torch.bfloat16 min=-2.5 max=4.59375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.59375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=2.953125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.953125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.59375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=2.953125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.953125
Linear output=0 dtype=torch.bfloat16 min=-21.75 max=4.0625
Linear output=1 dtype=torch.bfloat16 min=-22.25 max=4.5625
FeedForward input=0 dtype=torch.bfloat16 min=-2.5 max=4.59375
FeedForward output=0 dtype=torch.bfloat16 min=-21.75 max=4.0625
FeedForward output=1 dtype=torch.bfloat16 min=-22.25 max=4.5625
LayerNorm input=0 dtype=torch.bfloat16 min=-5472.0 max=4096.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.75 max=33.75
LayerNorm output=1 dtype=torch.bfloat16 min=-33.75 max=34.25
Linear input=0 dtype=torch.bfloat16 min=-13.25 max=13.625
Linear output=0 dtype=torch.bfloat16 min=-53.0 max=167.0
Linear output=1 dtype=torch.bfloat16 min=-56.0 max=142.0
GELU input=0 dtype=torch.bfloat16 min=-13.25 max=13.625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=167.0
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=142.0
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=167.0
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=167.0
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=142.0
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=167.0
Linear output=0 dtype=torch.bfloat16 min=-241.0 max=768.0
Linear output=1 dtype=torch.bfloat16 min=-193.0 max=576.0
FeedForward input=0 dtype=torch.bfloat16 min=-13.25 max=13.625
FeedForward output=0 dtype=torch.bfloat16 min=-241.0 max=768.0
FeedForward output=1 dtype=torch.bfloat16 min=-193.0 max=576.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-9152.0 max=6336.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-230.0 max=128.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-5.375 max=5.40625
Linear output=1 dtype=torch.bfloat16 min=-5.375 max=5.40625
LayerNorm input=0 dtype=torch.bfloat16 min=-230.0 max=128.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.0 max=20.125
LayerNorm output=1 dtype=torch.bfloat16 min=-34.0 max=24.0
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-230.0 max=128.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-7.9375 max=7.125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.15625 max=5.40625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.9609375 max=1.25
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.0625 max=1.4296875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-5.375 max=3.265625
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-9.4375 max=13.3125
Linear output=1 dtype=torch.bfloat16 min=-9.4375 max=13.3125
LayerNorm input=0 dtype=torch.bfloat16 min=-9152.0 max=6336.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.25 max=35.25
LayerNorm output=1 dtype=torch.bfloat16 min=-33.75 max=35.5
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-9152.0 max=6336.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-6.21875 max=6.3125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-9.4375 max=13.3125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.51171875 max=0.86328125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.765625 max=4.53125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-2.953125 max=4.71875
Linear input=0 dtype=torch.bfloat16 min=-7.9375 max=7.125
Linear output=0 dtype=torch.bfloat16 min=-7.65625 max=7.375
Linear output=1 dtype=torch.bfloat16 min=-7.78125 max=7.6875
Linear input=0 dtype=torch.bfloat16 min=-7.9375 max=7.125
Linear output=0 dtype=torch.bfloat16 min=-9.5625 max=8.4375
Linear output=1 dtype=torch.bfloat16 min=-10.625 max=10.625
Linear input=0 dtype=torch.bfloat16 min=-7.9375 max=7.125
Linear output=0 dtype=torch.bfloat16 min=-7.5625 max=7.90625
Linear output=1 dtype=torch.bfloat16 min=-7.0625 max=7.46875
Linear input=0 dtype=torch.bfloat16 min=-6.21875 max=6.3125
Linear output=0 dtype=torch.bfloat16 min=-5.375 max=6.28125
Linear output=1 dtype=torch.bfloat16 min=-6.25 max=6.1875
Linear input=0 dtype=torch.bfloat16 min=-6.21875 max=6.3125
Linear output=0 dtype=torch.bfloat16 min=-5.46875 max=5.375
Linear output=1 dtype=torch.bfloat16 min=-9.25 max=6.59375
Linear input=0 dtype=torch.bfloat16 min=-6.21875 max=6.3125
Linear output=0 dtype=torch.bfloat16 min=-11.6875 max=9.0625
Linear output=1 dtype=torch.bfloat16 min=-12.0625 max=9.25
Linear input=0 dtype=torch.bfloat16 min=-4.21875 max=4.0625
Linear output=0 dtype=torch.bfloat16 min=-32.75 max=5.75
Linear output=1 dtype=torch.bfloat16 min=-32.0 max=5.5625
Dropout input=0 dtype=torch.bfloat16 min=-32.75 max=5.75
Dropout output=0 dtype=torch.bfloat16 min=-32.75 max=5.75
Dropout output=1 dtype=torch.bfloat16 min=-32.0 max=5.5625
Linear input=0 dtype=torch.bfloat16 min=-4.5625 max=4.25
Linear output=0 dtype=torch.bfloat16 min=-26.75 max=11.75
Linear output=1 dtype=torch.bfloat16 min=-20.5 max=13.5
Attention output=0 dtype=torch.bfloat16 min=-32.75 max=5.75
Attention output=1 dtype=torch.bfloat16 min=-26.75 max=13.5
LayerNorm input=0 dtype=torch.bfloat16 min=-332.0 max=129.0
LayerNorm output=0 dtype=torch.bfloat16 min=-37.0 max=11.9375
LayerNorm output=1 dtype=torch.bfloat16 min=-37.0 max=17.75
Linear input=0 dtype=torch.bfloat16 min=-9.125 max=2.34375
Linear output=0 dtype=torch.bfloat16 min=-4.0 max=3.6875
Linear output=1 dtype=torch.bfloat16 min=-3.65625 max=3.375
GELU input=0 dtype=torch.bfloat16 min=-9.125 max=2.34375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.6875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.6875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.6875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.6875
Linear output=0 dtype=torch.bfloat16 min=-41.5 max=4.65625
Linear output=1 dtype=torch.bfloat16 min=-41.5 max=4.4375
FeedForward input=0 dtype=torch.bfloat16 min=-9.125 max=2.34375
FeedForward output=0 dtype=torch.bfloat16 min=-41.5 max=4.65625
FeedForward output=1 dtype=torch.bfloat16 min=-41.5 max=4.4375
LayerNorm input=0 dtype=torch.bfloat16 min=-9536.0 max=6304.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.25 max=35.25
LayerNorm output=1 dtype=torch.bfloat16 min=-33.5 max=35.5
Linear input=0 dtype=torch.bfloat16 min=-38.5 max=45.75
Linear output=0 dtype=torch.bfloat16 min=-14.1875 max=11.375
Linear output=1 dtype=torch.bfloat16 min=-20.0 max=11.5
GELU input=0 dtype=torch.bfloat16 min=-38.5 max=45.75
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=11.375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=11.5
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=11.5
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=11.375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=11.5
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=11.5
Linear output=0 dtype=torch.bfloat16 min=-474.0 max=1136.0
Linear output=1 dtype=torch.bfloat16 min=-480.0 max=1144.0
FeedForward input=0 dtype=torch.bfloat16 min=-38.5 max=45.75
FeedForward output=0 dtype=torch.bfloat16 min=-474.0 max=1136.0
FeedForward output=1 dtype=torch.bfloat16 min=-480.0 max=1144.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-11456.0 max=11712.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-216.0 max=137.0
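Note: the residual stream grows steeply through the later blocks; by this point `JointTransformerBlock output=0` spans roughly -11456 to 11712. At that magnitude bfloat16, with its 7 explicit mantissa bits, can only represent multiples of 64 (the spacing in [8192, 16384) is 2^13 / 2^7 = 64), which is why every extreme in these lines lands on such a round number. A quick check:

```python
import torch

# In [8192, 16384) bfloat16 values are spaced 2**13 / 2**7 = 64 apart,
# so nearby floats collapse onto multiples of 64.
x = torch.tensor([11712.0, 11713.0, 11743.0, 11745.0])
print(x.to(torch.bfloat16))
# tensor([11712., 11712., 11712., 11776.], dtype=torch.bfloat16)
```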
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-5.28125 max=5.5625
Linear output=1 dtype=torch.bfloat16 min=-5.28125 max=5.5625
LayerNorm input=0 dtype=torch.bfloat16 min=-216.0 max=137.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=24.125
LayerNorm output=1 dtype=torch.bfloat16 min=-32.75 max=26.625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-216.0 max=137.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-18.625 max=17.375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-5.28125 max=2.109375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.9921875 max=1.1796875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.765625 max=1.96875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-4.15625 max=5.5625
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-14.0 max=16.875
Linear output=1 dtype=torch.bfloat16 min=-13.9375 max=17.0
LayerNorm input=0 dtype=torch.bfloat16 min=-11456.0 max=11712.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=36.75
LayerNorm output=1 dtype=torch.bfloat16 min=-34.5 max=37.0
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-11456.0 max=11712.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-5.53125 max=5.03125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-14.0 max=13.375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.87109375 max=0.5703125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.828125 max=9.4375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-11.8125 max=17.0
Linear input=0 dtype=torch.bfloat16 min=-18.625 max=17.375
Linear output=0 dtype=torch.bfloat16 min=-25.375 max=27.75
Linear output=1 dtype=torch.bfloat16 min=-23.875 max=25.375
Linear input=0 dtype=torch.bfloat16 min=-18.625 max=17.375
Linear output=0 dtype=torch.bfloat16 min=-42.75 max=37.5
Linear output=1 dtype=torch.bfloat16 min=-37.0 max=34.5
Linear input=0 dtype=torch.bfloat16 min=-18.625 max=17.375
Linear output=0 dtype=torch.bfloat16 min=-10.5625 max=9.875
Linear output=1 dtype=torch.bfloat16 min=-9.25 max=8.9375
Linear input=0 dtype=torch.bfloat16 min=-5.53125 max=5.03125
Linear output=0 dtype=torch.bfloat16 min=-5.40625 max=6.09375
Linear output=1 dtype=torch.bfloat16 min=-6.21875 max=6.125
Linear input=0 dtype=torch.bfloat16 min=-5.53125 max=5.03125
Linear output=0 dtype=torch.bfloat16 min=-4.75 max=4.84375
Linear output=1 dtype=torch.bfloat16 min=-6.375 max=5.5625
Linear input=0 dtype=torch.bfloat16 min=-5.53125 max=5.03125
Linear output=0 dtype=torch.bfloat16 min=-5.46875 max=6.65625
Linear output=1 dtype=torch.bfloat16 min=-5.71875 max=7.0
Linear input=0 dtype=torch.bfloat16 min=-5.5625 max=5.25
Linear output=0 dtype=torch.bfloat16 min=-6.90625 max=42.0
Linear output=1 dtype=torch.bfloat16 min=-6.5 max=39.75
Dropout input=0 dtype=torch.bfloat16 min=-6.90625 max=42.0
Dropout output=0 dtype=torch.bfloat16 min=-6.90625 max=42.0
Dropout output=1 dtype=torch.bfloat16 min=-6.5 max=39.75
Linear input=0 dtype=torch.bfloat16 min=-4.59375 max=4.28125
Linear output=0 dtype=torch.bfloat16 min=-31.5 max=63.0
Linear output=1 dtype=torch.bfloat16 min=-30.875 max=63.75
Attention output=0 dtype=torch.bfloat16 min=-6.90625 max=42.0
Attention output=1 dtype=torch.bfloat16 min=-31.5 max=63.75
LayerNorm input=0 dtype=torch.bfloat16 min=-332.0 max=140.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.5 max=13.5625
LayerNorm output=1 dtype=torch.bfloat16 min=-36.5 max=18.25
Linear input=0 dtype=torch.bfloat16 min=-3.34375 max=4.71875
Linear output=0 dtype=torch.bfloat16 min=-4.03125 max=2.875
Linear output=1 dtype=torch.bfloat16 min=-3.671875 max=2.671875
GELU input=0 dtype=torch.bfloat16 min=-3.34375 max=4.71875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=2.65625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=2.65625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.875
Linear output=0 dtype=torch.bfloat16 min=-6.0 max=40.0
Linear output=1 dtype=torch.bfloat16 min=-5.78125 max=39.25
FeedForward input=0 dtype=torch.bfloat16 min=-3.34375 max=4.71875
FeedForward output=0 dtype=torch.bfloat16 min=-6.0 max=40.0
FeedForward output=1 dtype=torch.bfloat16 min=-5.78125 max=39.25
LayerNorm input=0 dtype=torch.bfloat16 min=-11968.0 max=11328.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.0 max=36.0
LayerNorm output=1 dtype=torch.bfloat16 min=-35.25 max=36.25
Linear input=0 dtype=torch.bfloat16 min=-9.5 max=9.4375
Linear output=0 dtype=torch.bfloat16 min=-24.375 max=15.0625
Linear output=1 dtype=torch.bfloat16 min=-22.25 max=14.8125
GELU input=0 dtype=torch.bfloat16 min=-9.5 max=9.4375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=15.0625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=14.8125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=15.0625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=15.0625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=14.8125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=15.0625
Linear output=0 dtype=torch.bfloat16 min=-43.5 max=25.375
Linear output=1 dtype=torch.bfloat16 min=-48.5 max=26.25
FeedForward input=0 dtype=torch.bfloat16 min=-9.5 max=9.4375
FeedForward output=0 dtype=torch.bfloat16 min=-43.5 max=25.375
FeedForward output=1 dtype=torch.bfloat16 min=-48.5 max=26.25
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-12352.0 max=11584.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-183.0 max=156.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-4.59375 max=9.4375
Linear output=1 dtype=torch.bfloat16 min=-4.5625 max=9.4375
LayerNorm input=0 dtype=torch.bfloat16 min=-183.0 max=156.0
LayerNorm output=0 dtype=torch.bfloat16 min=-30.125 max=25.875
LayerNorm output=1 dtype=torch.bfloat16 min=-29.25 max=28.625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-183.0 max=156.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-21.0 max=20.0
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.28125 max=5.53125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.7578125 max=1.015625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.8984375 max=2.15625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-4.59375 max=9.4375
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-9.375 max=9.875
Linear output=1 dtype=torch.bfloat16 min=-9.375 max=9.875
LayerNorm input=0 dtype=torch.bfloat16 min=-12352.0 max=11584.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.5 max=36.25
LayerNorm output=1 dtype=torch.bfloat16 min=-34.25 max=36.25
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-12352.0 max=11584.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-5.65625 max=5.375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-9.375 max=9.875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.625 max=1.0390625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.359375 max=9.25
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-7.5625 max=7.84375
Linear input=0 dtype=torch.bfloat16 min=-21.0 max=20.0
Linear output=0 dtype=torch.bfloat16 min=-27.25 max=24.0
Linear output=1 dtype=torch.bfloat16 min=-27.25 max=23.25
Linear input=0 dtype=torch.bfloat16 min=-21.0 max=20.0
Linear output=0 dtype=torch.bfloat16 min=-56.0 max=73.5
Linear output=1 dtype=torch.bfloat16 min=-53.5 max=71.5
Linear input=0 dtype=torch.bfloat16 min=-21.0 max=20.0
Linear output=0 dtype=torch.bfloat16 min=-9.9375 max=9.9375
Linear output=1 dtype=torch.bfloat16 min=-10.5 max=9.3125
Linear input=0 dtype=torch.bfloat16 min=-5.65625 max=5.375
Linear output=0 dtype=torch.bfloat16 min=-8.25 max=6.375
Linear output=1 dtype=torch.bfloat16 min=-7.9375 max=7.40625
Linear input=0 dtype=torch.bfloat16 min=-5.65625 max=5.375
Linear output=0 dtype=torch.bfloat16 min=-6.34375 max=6.0
Linear output=1 dtype=torch.bfloat16 min=-6.5 max=6.9375
Linear input=0 dtype=torch.bfloat16 min=-5.65625 max=5.375
Linear output=0 dtype=torch.bfloat16 min=-7.59375 max=8.8125
Linear output=1 dtype=torch.bfloat16 min=-7.59375 max=7.09375
Linear input=0 dtype=torch.bfloat16 min=-5.875 max=5.375
Linear output=0 dtype=torch.bfloat16 min=-38.25 max=7.625
Linear output=1 dtype=torch.bfloat16 min=-37.75 max=7.0625
Dropout input=0 dtype=torch.bfloat16 min=-38.25 max=7.625
Dropout output=0 dtype=torch.bfloat16 min=-38.25 max=7.625
Dropout output=1 dtype=torch.bfloat16 min=-37.75 max=7.0625
Linear input=0 dtype=torch.bfloat16 min=-5.71875 max=5.6875
Linear output=0 dtype=torch.bfloat16 min=-33.5 max=10.1875
Linear output=1 dtype=torch.bfloat16 min=-30.125 max=7.375
Attention output=0 dtype=torch.bfloat16 min=-38.25 max=7.625
Attention output=1 dtype=torch.bfloat16 min=-33.5 max=10.1875
LayerNorm input=0 dtype=torch.bfloat16 min=-328.0 max=162.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.25 max=15.9375
LayerNorm output=1 dtype=torch.bfloat16 min=-35.25 max=20.125
Linear input=0 dtype=torch.bfloat16 min=-3.640625 max=4.21875
Linear output=0 dtype=torch.bfloat16 min=-4.75 max=3.40625
Linear output=1 dtype=torch.bfloat16 min=-4.59375 max=3.0
GELU input=0 dtype=torch.bfloat16 min=-3.640625 max=4.21875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.40625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.0
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.40625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.40625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.0
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.40625
Linear output=0 dtype=torch.bfloat16 min=-16.125 max=12.0625
Linear output=1 dtype=torch.bfloat16 min=-19.0 max=12.0
FeedForward input=0 dtype=torch.bfloat16 min=-3.640625 max=4.21875
FeedForward output=0 dtype=torch.bfloat16 min=-16.125 max=12.0625
FeedForward output=1 dtype=torch.bfloat16 min=-19.0 max=12.0
LayerNorm input=0 dtype=torch.bfloat16 min=-12416.0 max=11520.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.5 max=36.25
LayerNorm output=1 dtype=torch.bfloat16 min=-33.5 max=36.25
Linear input=0 dtype=torch.bfloat16 min=-16.25 max=10.8125
Linear output=0 dtype=torch.bfloat16 min=-14.0625 max=10.0625
Linear output=1 dtype=torch.bfloat16 min=-21.375 max=10.9375
GELU input=0 dtype=torch.bfloat16 min=-16.25 max=10.8125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=10.0625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.9375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.9375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=10.0625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.9375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.9375
Linear output=0 dtype=torch.bfloat16 min=-225.0 max=476.0
Linear output=1 dtype=torch.bfloat16 min=-226.0 max=482.0
FeedForward input=0 dtype=torch.bfloat16 min=-16.25 max=10.8125
FeedForward output=0 dtype=torch.bfloat16 min=-225.0 max=476.0
FeedForward output=1 dtype=torch.bfloat16 min=-226.0 max=482.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-11264.0 max=7872.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-280.0 max=179.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-5.34375 max=6.84375
Linear output=1 dtype=torch.bfloat16 min=-5.375 max=6.84375
LayerNorm input=0 dtype=torch.bfloat16 min=-280.0 max=179.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.5 max=21.0
LayerNorm output=1 dtype=torch.bfloat16 min=-33.5 max=25.5
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-280.0 max=179.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-7.9375 max=8.9375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-5.375 max=2.625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.046875 max=1.3203125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.8125 max=1.5859375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-5.3125 max=6.84375
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-9.9375 max=12.75
Linear output=1 dtype=torch.bfloat16 min=-9.9375 max=12.6875
LayerNorm input=0 dtype=torch.bfloat16 min=-11264.0 max=7872.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.0 max=31.75
LayerNorm output=1 dtype=torch.bfloat16 min=-34.75 max=31.25
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-11264.0 max=7872.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-11.8125 max=7.625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-9.9375 max=12.75
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.462890625 max=1.3046875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-0.72265625 max=7.0
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-4.8125 max=4.21875
Linear input=0 dtype=torch.bfloat16 min=-7.9375 max=8.9375
Linear output=0 dtype=torch.bfloat16 min=-15.125 max=17.25
Linear output=1 dtype=torch.bfloat16 min=-14.5 max=15.9375
Linear input=0 dtype=torch.bfloat16 min=-7.9375 max=8.9375
Linear output=0 dtype=torch.bfloat16 min=-26.875 max=21.875
Linear output=1 dtype=torch.bfloat16 min=-25.125 max=21.75
Linear input=0 dtype=torch.bfloat16 min=-7.9375 max=8.9375
Linear output=0 dtype=torch.bfloat16 min=-7.59375 max=8.0625
Linear output=1 dtype=torch.bfloat16 min=-7.0 max=7.375
Linear input=0 dtype=torch.bfloat16 min=-11.8125 max=7.625
Linear output=0 dtype=torch.bfloat16 min=-5.0625 max=5.46875
Linear output=1 dtype=torch.bfloat16 min=-6.8125 max=5.875
Linear input=0 dtype=torch.bfloat16 min=-11.8125 max=7.625
Linear output=0 dtype=torch.bfloat16 min=-5.71875 max=6.3125
Linear output=1 dtype=torch.bfloat16 min=-6.75 max=7.90625
Linear input=0 dtype=torch.bfloat16 min=-11.8125 max=7.625
Linear output=0 dtype=torch.bfloat16 min=-12.75 max=10.5
Linear output=1 dtype=torch.bfloat16 min=-12.875 max=10.6875
Linear input=0 dtype=torch.bfloat16 min=-4.84375 max=4.5
Linear output=0 dtype=torch.bfloat16 min=-6.4375 max=26.25
Linear output=1 dtype=torch.bfloat16 min=-6.9375 max=25.75
Dropout input=0 dtype=torch.bfloat16 min=-6.9375 max=26.25
Dropout output=0 dtype=torch.bfloat16 min=-6.4375 max=26.25
Dropout output=1 dtype=torch.bfloat16 min=-6.9375 max=25.75
Linear input=0 dtype=torch.bfloat16 min=-3.578125 max=4.9375
Linear output=0 dtype=torch.bfloat16 min=-8.125 max=6.0
Linear output=1 dtype=torch.bfloat16 min=-9.375 max=6.34375
Attention output=0 dtype=torch.bfloat16 min=-6.9375 max=26.25
Attention output=1 dtype=torch.bfloat16 min=-9.375 max=6.34375
LayerNorm input=0 dtype=torch.bfloat16 min=-368.0 max=176.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.5 max=14.8125
LayerNorm output=1 dtype=torch.bfloat16 min=-35.5 max=19.125
Linear input=0 dtype=torch.bfloat16 min=-4.53125 max=4.71875
Linear output=0 dtype=torch.bfloat16 min=-4.0 max=3.28125
Linear output=1 dtype=torch.bfloat16 min=-4.09375 max=2.953125
GELU input=0 dtype=torch.bfloat16 min=-4.53125 max=4.71875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.28125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=2.953125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.28125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.28125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=2.953125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.28125
Linear output=0 dtype=torch.bfloat16 min=-7.125 max=24.75
Linear output=1 dtype=torch.bfloat16 min=-8.3125 max=24.125
FeedForward input=0 dtype=torch.bfloat16 min=-4.53125 max=4.71875
FeedForward output=0 dtype=torch.bfloat16 min=-7.125 max=24.75
FeedForward output=1 dtype=torch.bfloat16 min=-8.3125 max=24.125
LayerNorm input=0 dtype=torch.bfloat16 min=-11328.0 max=7840.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.75 max=31.375
LayerNorm output=1 dtype=torch.bfloat16 min=-34.5 max=31.125
Linear input=0 dtype=torch.bfloat16 min=-30.75 max=32.0
Linear output=0 dtype=torch.bfloat16 min=-8.0 max=8.3125
Linear output=1 dtype=torch.bfloat16 min=-8.5625 max=6.75
GELU input=0 dtype=torch.bfloat16 min=-30.75 max=32.0
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.3125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=6.75
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=8.3125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.3125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=6.75
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=8.3125
Linear output=0 dtype=torch.bfloat16 min=-262.0 max=466.0
Linear output=1 dtype=torch.bfloat16 min=-243.0 max=458.0
FeedForward input=0 dtype=torch.bfloat16 min=-30.75 max=32.0
FeedForward output=0 dtype=torch.bfloat16 min=-262.0 max=466.0
FeedForward output=1 dtype=torch.bfloat16 min=-243.0 max=458.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-11392.0 max=9024.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-227.0 max=192.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-5.875 max=8.9375
Linear output=1 dtype=torch.bfloat16 min=-5.90625 max=8.9375
LayerNorm input=0 dtype=torch.bfloat16 min=-227.0 max=192.0
LayerNorm output=0 dtype=torch.bfloat16 min=-30.375 max=22.75
LayerNorm output=1 dtype=torch.bfloat16 min=-30.25 max=27.25
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-227.0 max=192.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-15.5 max=15.3125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.84375 max=5.8125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.484375 max=1.3203125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.78125 max=1.875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-5.90625 max=8.9375
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-15.9375 max=16.5
Linear output=1 dtype=torch.bfloat16 min=-15.9375 max=16.5
LayerNorm input=0 dtype=torch.bfloat16 min=-11392.0 max=9024.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.0 max=32.25
LayerNorm output=1 dtype=torch.bfloat16 min=-34.25 max=31.0
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-11392.0 max=9024.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-5.875 max=7.96875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-15.9375 max=16.5
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.734375 max=1.15625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.0625 max=9.375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-12.125 max=10.0
Linear input=0 dtype=torch.bfloat16 min=-15.5 max=15.3125
Linear output=0 dtype=torch.bfloat16 min=-24.5 max=23.875
Linear output=1 dtype=torch.bfloat16 min=-23.75 max=22.75
Linear input=0 dtype=torch.bfloat16 min=-15.5 max=15.3125
Linear output=0 dtype=torch.bfloat16 min=-62.25 max=66.5
Linear output=1 dtype=torch.bfloat16 min=-55.25 max=59.75
Linear input=0 dtype=torch.bfloat16 min=-15.5 max=15.3125
Linear output=0 dtype=torch.bfloat16 min=-10.75 max=10.3125
Linear output=1 dtype=torch.bfloat16 min=-10.6875 max=9.1875
Linear input=0 dtype=torch.bfloat16 min=-5.875 max=7.96875
Linear output=0 dtype=torch.bfloat16 min=-6.84375 max=6.46875
Linear output=1 dtype=torch.bfloat16 min=-6.5625 max=6.59375
Linear input=0 dtype=torch.bfloat16 min=-5.875 max=7.96875
Linear output=0 dtype=torch.bfloat16 min=-8.4375 max=5.90625
Linear output=1 dtype=torch.bfloat16 min=-8.3125 max=6.9375
Linear input=0 dtype=torch.bfloat16 min=-5.875 max=7.96875
Linear output=0 dtype=torch.bfloat16 min=-6.34375 max=6.1875
Linear output=1 dtype=torch.bfloat16 min=-5.96875 max=6.03125
Linear input=0 dtype=torch.bfloat16 min=-6.71875 max=6.8125
Linear output=0 dtype=torch.bfloat16 min=-29.375 max=7.875
Linear output=1 dtype=torch.bfloat16 min=-26.75 max=8.3125
Dropout input=0 dtype=torch.bfloat16 min=-29.375 max=8.3125
Dropout output=0 dtype=torch.bfloat16 min=-29.375 max=7.875
Dropout output=1 dtype=torch.bfloat16 min=-26.75 max=8.3125
Linear input=0 dtype=torch.bfloat16 min=-3.625 max=3.890625
Linear output=0 dtype=torch.bfloat16 min=-8.9375 max=15.75
Linear output=1 dtype=torch.bfloat16 min=-10.8125 max=14.9375
Attention output=0 dtype=torch.bfloat16 min=-29.375 max=8.3125
Attention output=1 dtype=torch.bfloat16 min=-10.8125 max=15.75
LayerNorm input=0 dtype=torch.bfloat16 min=-364.0 max=190.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.5 max=16.375
LayerNorm output=1 dtype=torch.bfloat16 min=-34.25 max=20.5
Linear input=0 dtype=torch.bfloat16 min=-12.0 max=2.953125
Linear output=0 dtype=torch.bfloat16 min=-4.53125 max=3.25
Linear output=1 dtype=torch.bfloat16 min=-4.5625 max=3.34375
GELU input=0 dtype=torch.bfloat16 min=-12.0 max=2.953125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.25
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.34375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.34375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.25
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.34375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.34375
Linear output=0 dtype=torch.bfloat16 min=-12.6875 max=16.125
Linear output=1 dtype=torch.bfloat16 min=-12.875 max=15.75
FeedForward input=0 dtype=torch.bfloat16 min=-12.0 max=2.953125
FeedForward output=0 dtype=torch.bfloat16 min=-12.6875 max=16.125
FeedForward output=1 dtype=torch.bfloat16 min=-12.875 max=15.75
LayerNorm input=0 dtype=torch.bfloat16 min=-11392.0 max=8960.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.0 max=31.875
LayerNorm output=1 dtype=torch.bfloat16 min=-34.0 max=31.0
Linear input=0 dtype=torch.bfloat16 min=-15.3125 max=16.375
Linear output=0 dtype=torch.bfloat16 min=-11.875 max=9.5
Linear output=1 dtype=torch.bfloat16 min=-13.0625 max=10.3125
GELU input=0 dtype=torch.bfloat16 min=-15.3125 max=16.375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.5
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.3125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.3125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.5
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.3125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.3125
Linear output=0 dtype=torch.bfloat16 min=-183.0 max=150.0
Linear output=1 dtype=torch.bfloat16 min=-178.0 max=146.0
FeedForward input=0 dtype=torch.bfloat16 min=-15.3125 max=16.375
FeedForward output=0 dtype=torch.bfloat16 min=-183.0 max=150.0
FeedForward output=1 dtype=torch.bfloat16 min=-178.0 max=146.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-9216.0 max=7200.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-250.0 max=204.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-7.03125 max=7.0625
Linear output=1 dtype=torch.bfloat16 min=-7.03125 max=7.09375
LayerNorm input=0 dtype=torch.bfloat16 min=-250.0 max=204.0
LayerNorm output=0 dtype=torch.bfloat16 min=-29.5 max=20.375
LayerNorm output=1 dtype=torch.bfloat16 min=-29.25 max=25.75
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-250.0 max=204.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-15.0 max=15.125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-5.03125 max=3.515625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.859375 max=2.671875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.6328125 max=1.71875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-7.03125 max=7.09375
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-11.8125 max=12.125
Linear output=1 dtype=torch.bfloat16 min=-11.8125 max=12.125
LayerNorm input=0 dtype=torch.bfloat16 min=-9216.0 max=7200.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.25 max=28.125
LayerNorm output=1 dtype=torch.bfloat16 min=-34.0 max=28.125
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-9216.0 max=7200.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-9.5625 max=5.96875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-11.8125 max=11.375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.46875 max=1.6953125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-4.65625 max=12.125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-1.6796875 max=4.46875
Linear input=0 dtype=torch.bfloat16 min=-15.0 max=15.125
Linear output=0 dtype=torch.bfloat16 min=-22.375 max=20.25
Linear output=1 dtype=torch.bfloat16 min=-22.625 max=20.875
Linear input=0 dtype=torch.bfloat16 min=-15.0 max=15.125
Linear output=0 dtype=torch.bfloat16 min=-46.75 max=59.5
Linear output=1 dtype=torch.bfloat16 min=-45.5 max=58.75
Linear input=0 dtype=torch.bfloat16 min=-15.0 max=15.125
Linear output=0 dtype=torch.bfloat16 min=-7.84375 max=8.4375
Linear output=1 dtype=torch.bfloat16 min=-8.125 max=7.6875
Linear input=0 dtype=torch.bfloat16 min=-9.5625 max=5.96875
Linear output=0 dtype=torch.bfloat16 min=-8.25 max=7.84375
Linear output=1 dtype=torch.bfloat16 min=-7.96875 max=8.0625
Linear input=0 dtype=torch.bfloat16 min=-9.5625 max=5.96875
Linear output=0 dtype=torch.bfloat16 min=-10.9375 max=12.5
Linear output=1 dtype=torch.bfloat16 min=-11.75 max=12.5625
Linear input=0 dtype=torch.bfloat16 min=-9.5625 max=5.96875
Linear output=0 dtype=torch.bfloat16 min=-11.4375 max=9.75
Linear output=1 dtype=torch.bfloat16 min=-11.5 max=9.8125
Linear input=0 dtype=torch.bfloat16 min=-5.03125 max=5.0625
Linear output=0 dtype=torch.bfloat16 min=-8.9375 max=44.75
Linear output=1 dtype=torch.bfloat16 min=-8.6875 max=43.75
Dropout input=0 dtype=torch.bfloat16 min=-8.9375 max=44.75
Dropout output=0 dtype=torch.bfloat16 min=-8.9375 max=44.75
Dropout output=1 dtype=torch.bfloat16 min=-8.6875 max=43.75
Linear input=0 dtype=torch.bfloat16 min=-4.6875 max=4.5625
Linear output=0 dtype=torch.bfloat16 min=-9.5 max=18.0
Linear output=1 dtype=torch.bfloat16 min=-9.75 max=17.625
Attention output=0 dtype=torch.bfloat16 min=-8.9375 max=44.75
Attention output=1 dtype=torch.bfloat16 min=-9.75 max=18.0
LayerNorm input=0 dtype=torch.bfloat16 min=-444.0 max=205.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.0 max=13.3125
LayerNorm output=1 dtype=torch.bfloat16 min=-33.75 max=17.875
Linear input=0 dtype=torch.bfloat16 min=-5.84375 max=3.484375
Linear output=0 dtype=torch.bfloat16 min=-4.9375 max=3.984375
Linear output=1 dtype=torch.bfloat16 min=-4.65625 max=3.96875
GELU input=0 dtype=torch.bfloat16 min=-5.84375 max=3.484375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.984375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.96875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.984375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.984375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.96875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.984375
Linear output=0 dtype=torch.bfloat16 min=-36.0 max=13.625
Linear output=1 dtype=torch.bfloat16 min=-35.5 max=14.75
FeedForward input=0 dtype=torch.bfloat16 min=-5.84375 max=3.484375
FeedForward output=0 dtype=torch.bfloat16 min=-36.0 max=13.625
FeedForward output=1 dtype=torch.bfloat16 min=-35.5 max=14.75
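Every GELU output in this trace bottoms out at -0.169921875: that is simply gelu's global minimum (~ -0.1700, attained near x ~ -0.75) as rounded to bfloat16, so it appears whenever the pre-activation range crosses that point. The FeedForward trace also shows that, as in diffusers, the GELU module owns the input projection, which is why the GELU hook reports the pre-projection tensor as its input. A minimal sketch with illustrative dimensions:

import torch.nn as nn
import torch.nn.functional as F

class GELUProj(nn.Module):
    # diffusers-style GELU block: projection and activation in one module,
    # so the inner Linear's hook fires first with the same input tensor.
    def __init__(self, dim_in, dim_out, approximate="tanh"):
        super().__init__()
        self.proj = nn.Linear(dim_in, dim_out)
        self.approximate = approximate

    def forward(self, x):
        return F.gelu(self.proj(x), approximate=self.approximate)

def feed_forward(dim=1536, mult=4, dropout=0.0):
    # Linear -> GELU -> Dropout -> Linear, matching the traced order.
    return nn.Sequential(GELUProj(dim, dim * mult),
                         nn.Dropout(dropout),
                         nn.Linear(dim * mult, dim))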
LayerNorm input=0 dtype=torch.bfloat16 min=-9216.0 max=7200.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.25 max=28.125
LayerNorm output=1 dtype=torch.bfloat16 min=-34.0 max=28.125
Linear input=0 dtype=torch.bfloat16 min=-102.5 max=98.0
Linear output=0 dtype=torch.bfloat16 min=-45.5 max=60.5
Linear output=1 dtype=torch.bfloat16 min=-45.0 max=66.0
GELU input=0 dtype=torch.bfloat16 min=-102.5 max=98.0
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=60.5
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=66.0
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=66.0
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=60.5
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=66.0
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=66.0
Linear output=0 dtype=torch.bfloat16 min=-2832.0 max=2864.0
Linear output=1 dtype=torch.bfloat16 min=-3168.0 max=3184.0
FeedForward input=0 dtype=torch.bfloat16 min=-102.5 max=98.0
FeedForward output=0 dtype=torch.bfloat16 min=-2832.0 max=2864.0
FeedForward output=1 dtype=torch.bfloat16 min=-3168.0 max=3184.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-8000.0 max=6976.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-258.0 max=220.0
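A trace of exactly this shape can be reproduced by registering a PyTorch forward hook on every module and printing per-tensor min/max, which is what makes the log usable as a goldens file: it records each module's dynamic range (e.g. the FeedForward outputs pushing past 3e4 later in this trace, still well inside bfloat16's representable range). A minimal sketch, assuming hooks are the mechanism: inputs are indexed by positional argument, tuple returns (AdaLayerNormZero, Attention, JointTransformerBlock) by tuple element, and the paired output=0/output=1 lines on single-tensor modules presumably reflect per-sample stats over the classifier-free-guidance batch of two, which this sketch does not replicate. pipe is assumed to be a loaded StableDiffusion3Pipeline.

import torch

def _stats(name, kind, i, t):
    # Mirror the log format: module class, tensor role/index, dtype, min, max.
    print(f"{name} {kind}={i} dtype={t.dtype} "
          f"min={t.float().min().item()} max={t.float().max().item()}")

def make_hook(name):
    def hook(module, args, output):
        for i, t in enumerate(a for a in args if torch.is_tensor(a)):
            _stats(name, "input", i, t)
        outs = output if isinstance(output, tuple) else (output,)
        for i, t in enumerate(o for o in outs if torch.is_tensor(o)):
            _stats(name, "output", i, t)
    return hook

# Hypothetical wiring against a loaded pipeline:
# for _, m in pipe.transformer.named_modules():
#     m.register_forward_hook(make_hook(type(m).__name__))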
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-10.0 max=9.5625
Linear output=1 dtype=torch.bfloat16 min=-10.0 max=9.5625
LayerNorm input=0 dtype=torch.bfloat16 min=-258.0 max=220.0
LayerNorm output=0 dtype=torch.bfloat16 min=-27.625 max=21.125
LayerNorm output=1 dtype=torch.bfloat16 min=-27.25 max=26.0
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-258.0 max=220.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-14.375 max=16.875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.3125 max=5.5
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.0234375 max=2.953125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.65625 max=1.75
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-10.0 max=9.5625
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-10.5625 max=12.25
Linear output=1 dtype=torch.bfloat16 min=-10.5625 max=12.3125
LayerNorm input=0 dtype=torch.bfloat16 min=-8000.0 max=6976.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.25 max=27.75
LayerNorm output=1 dtype=torch.bfloat16 min=-32.25 max=27.875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-8000.0 max=6976.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-7.5625 max=5.625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-10.5625 max=11.625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.90625 max=2.0
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-8.5 max=12.3125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-1.1484375 max=1.140625
Linear input=0 dtype=torch.bfloat16 min=-14.375 max=16.875
Linear output=0 dtype=torch.bfloat16 min=-21.125 max=21.125
Linear output=1 dtype=torch.bfloat16 min=-20.0 max=20.875
Linear input=0 dtype=torch.bfloat16 min=-14.375 max=16.875
Linear output=0 dtype=torch.bfloat16 min=-51.25 max=38.5
Linear output=1 dtype=torch.bfloat16 min=-49.25 max=38.25
Linear input=0 dtype=torch.bfloat16 min=-14.375 max=16.875
Linear output=0 dtype=torch.bfloat16 min=-10.0 max=9.0
Linear output=1 dtype=torch.bfloat16 min=-9.8125 max=8.125
Linear input=0 dtype=torch.bfloat16 min=-7.5625 max=5.625
Linear output=0 dtype=torch.bfloat16 min=-8.125 max=6.75
Linear output=1 dtype=torch.bfloat16 min=-8.5625 max=6.875
Linear input=0 dtype=torch.bfloat16 min=-7.5625 max=5.625
Linear output=0 dtype=torch.bfloat16 min=-10.75 max=14.8125
Linear output=1 dtype=torch.bfloat16 min=-12.5625 max=15.0625
Linear input=0 dtype=torch.bfloat16 min=-7.5625 max=5.625
Linear output=0 dtype=torch.bfloat16 min=-8.25 max=7.5
Linear output=1 dtype=torch.bfloat16 min=-8.6875 max=7.6875
Linear input=0 dtype=torch.bfloat16 min=-5.25 max=4.71875
Linear output=0 dtype=torch.bfloat16 min=-66.0 max=20.5
Linear output=1 dtype=torch.bfloat16 min=-65.0 max=18.625
Dropout input=0 dtype=torch.bfloat16 min=-66.0 max=20.5
Dropout output=0 dtype=torch.bfloat16 min=-66.0 max=20.5
Dropout output=1 dtype=torch.bfloat16 min=-65.0 max=18.625
Linear input=0 dtype=torch.bfloat16 min=-5.8125 max=4.875
Linear output=0 dtype=torch.bfloat16 min=-21.25 max=27.5
Linear output=1 dtype=torch.bfloat16 min=-21.125 max=29.25
Attention output=0 dtype=torch.bfloat16 min=-66.0 max=20.5
Attention output=1 dtype=torch.bfloat16 min=-21.25 max=29.25
LayerNorm input=0 dtype=torch.bfloat16 min=-552.0 max=228.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=12.0625
LayerNorm output=1 dtype=torch.bfloat16 min=-32.75 max=15.3125
Linear input=0 dtype=torch.bfloat16 min=-4.53125 max=4.0625
Linear output=0 dtype=torch.bfloat16 min=-4.90625 max=3.71875
Linear output=1 dtype=torch.bfloat16 min=-4.9375 max=3.71875
GELU input=0 dtype=torch.bfloat16 min=-4.53125 max=4.0625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.71875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.71875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.71875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.71875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.71875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.71875
Linear output=0 dtype=torch.bfloat16 min=-11.625 max=63.0
Linear output=1 dtype=torch.bfloat16 min=-11.375 max=62.75
FeedForward input=0 dtype=torch.bfloat16 min=-4.53125 max=4.0625
FeedForward output=0 dtype=torch.bfloat16 min=-11.625 max=63.0
FeedForward output=1 dtype=torch.bfloat16 min=-11.375 max=62.75
LayerNorm input=0 dtype=torch.bfloat16 min=-8032.0 max=6976.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.25 max=27.75
LayerNorm output=1 dtype=torch.bfloat16 min=-32.25 max=27.875
Linear input=0 dtype=torch.bfloat16 min=-210.0 max=177.0
Linear output=0 dtype=torch.bfloat16 min=-70.0 max=128.0
Linear output=1 dtype=torch.bfloat16 min=-66.0 max=129.0
GELU input=0 dtype=torch.bfloat16 min=-210.0 max=177.0
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=128.0
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=129.0
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=129.0
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=128.0
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=129.0
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=129.0
Linear output=0 dtype=torch.bfloat16 min=-7680.0 max=5248.0
Linear output=1 dtype=torch.bfloat16 min=-6688.0 max=5024.0
FeedForward input=0 dtype=torch.bfloat16 min=-210.0 max=177.0
FeedForward output=0 dtype=torch.bfloat16 min=-7680.0 max=5248.0
FeedForward output=1 dtype=torch.bfloat16 min=-6688.0 max=5024.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-7808.0 max=6848.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-204.0 max=236.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-10.75 max=10.625
Linear output=1 dtype=torch.bfloat16 min=-10.75 max=10.625
LayerNorm input=0 dtype=torch.bfloat16 min=-204.0 max=236.0
LayerNorm output=0 dtype=torch.bfloat16 min=-19.875 max=19.0
LayerNorm output=1 dtype=torch.bfloat16 min=-19.375 max=22.125
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-204.0 max=236.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-21.875 max=19.0
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-5.3125 max=4.9375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.21875 max=2.703125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.6875 max=1.6328125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-10.75 max=10.625
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-12.5 max=16.5
Linear output=1 dtype=torch.bfloat16 min=-12.5625 max=16.5
LayerNorm input=0 dtype=torch.bfloat16 min=-7808.0 max=6848.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.75 max=32.5
LayerNorm output=1 dtype=torch.bfloat16 min=-32.5 max=32.5
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-7808.0 max=6848.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-9.1875 max=5.0
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-12.5625 max=7.78125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-3.515625 max=4.78125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-9.6875 max=16.5
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-1.5625 max=1.0078125
Linear input=0 dtype=torch.bfloat16 min=-21.875 max=19.0
Linear output=0 dtype=torch.bfloat16 min=-36.5 max=38.0
Linear output=1 dtype=torch.bfloat16 min=-40.75 max=45.75
Linear input=0 dtype=torch.bfloat16 min=-21.875 max=19.0
Linear output=0 dtype=torch.bfloat16 min=-42.75 max=45.5
Linear output=1 dtype=torch.bfloat16 min=-41.25 max=44.5
Linear input=0 dtype=torch.bfloat16 min=-21.875 max=19.0
Linear output=0 dtype=torch.bfloat16 min=-11.625 max=11.1875
Linear output=1 dtype=torch.bfloat16 min=-11.8125 max=11.0
Linear input=0 dtype=torch.bfloat16 min=-9.1875 max=5.0
Linear output=0 dtype=torch.bfloat16 min=-13.375 max=10.25
Linear output=1 dtype=torch.bfloat16 min=-11.75 max=8.5625
Linear input=0 dtype=torch.bfloat16 min=-9.1875 max=5.0
Linear output=0 dtype=torch.bfloat16 min=-17.375 max=15.625
Linear output=1 dtype=torch.bfloat16 min=-17.25 max=15.4375
Linear input=0 dtype=torch.bfloat16 min=-9.1875 max=5.0
Linear output=0 dtype=torch.bfloat16 min=-7.875 max=6.40625
Linear output=1 dtype=torch.bfloat16 min=-8.0625 max=6.5625
Linear input=0 dtype=torch.bfloat16 min=-6.0625 max=6.90625
Linear output=0 dtype=torch.bfloat16 min=-19.125 max=12.6875
Linear output=1 dtype=torch.bfloat16 min=-19.625 max=12.75
Dropout input=0 dtype=torch.bfloat16 min=-19.625 max=12.75
Dropout output=0 dtype=torch.bfloat16 min=-19.125 max=12.6875
Dropout output=1 dtype=torch.bfloat16 min=-19.625 max=12.75
Linear input=0 dtype=torch.bfloat16 min=-3.890625 max=5.90625
Linear output=0 dtype=torch.bfloat16 min=-15.4375 max=23.375
Linear output=1 dtype=torch.bfloat16 min=-15.9375 max=24.125
Attention output=0 dtype=torch.bfloat16 min=-19.625 max=12.75
Attention output=1 dtype=torch.bfloat16 min=-15.9375 max=24.125
LayerNorm input=0 dtype=torch.bfloat16 min=-268.0 max=251.0
LayerNorm output=0 dtype=torch.bfloat16 min=-19.625 max=14.9375
LayerNorm output=1 dtype=torch.bfloat16 min=-19.125 max=18.5
Linear input=0 dtype=torch.bfloat16 min=-5.34375 max=5.75
Linear output=0 dtype=torch.bfloat16 min=-5.0625 max=6.125
Linear output=1 dtype=torch.bfloat16 min=-5.1875 max=6.34375
GELU input=0 dtype=torch.bfloat16 min=-5.34375 max=5.75
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=6.125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=6.34375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.34375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=6.125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=6.34375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.34375
Linear output=0 dtype=torch.bfloat16 min=-42.0 max=15.875
Linear output=1 dtype=torch.bfloat16 min=-41.0 max=16.375
FeedForward input=0 dtype=torch.bfloat16 min=-5.34375 max=5.75
FeedForward output=0 dtype=torch.bfloat16 min=-42.0 max=15.875
FeedForward output=1 dtype=torch.bfloat16 min=-41.0 max=16.375
LayerNorm input=0 dtype=torch.bfloat16 min=-7808.0 max=6848.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.75 max=32.5
LayerNorm output=1 dtype=torch.bfloat16 min=-32.5 max=32.5
Linear input=0 dtype=torch.bfloat16 min=-284.0 max=252.0
Linear output=0 dtype=torch.bfloat16 min=-83.0 max=302.0
Linear output=1 dtype=torch.bfloat16 min=-79.0 max=296.0
GELU input=0 dtype=torch.bfloat16 min=-284.0 max=252.0
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=302.0
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=296.0
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=302.0
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=302.0
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=296.0
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=302.0
Linear output=0 dtype=torch.bfloat16 min=-20480.0 max=33280.0
Linear output=1 dtype=torch.bfloat16 min=-19840.0 max=32768.0
FeedForward input=0 dtype=torch.bfloat16 min=-284.0 max=252.0
FeedForward output=0 dtype=torch.bfloat16 min=-20480.0 max=33280.0
FeedForward output=1 dtype=torch.bfloat16 min=-19840.0 max=32768.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-6528.0 max=1616.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-176.0 max=256.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-10.4375 max=9.625
Linear output=1 dtype=torch.bfloat16 min=-10.375 max=9.625
LayerNorm input=0 dtype=torch.bfloat16 min=-176.0 max=256.0
LayerNorm output=0 dtype=torch.bfloat16 min=-14.125 max=16.625
LayerNorm output=1 dtype=torch.bfloat16 min=-14.25 max=20.625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-176.0 max=256.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-13.375 max=11.6875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-5.375 max=5.6875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.015625 max=1.421875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.4375 max=1.078125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-10.4375 max=9.625
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-2.140625 max=5.03125
Linear output=1 dtype=torch.bfloat16 min=-2.140625 max=5.0625
LayerNorm input=0 dtype=torch.bfloat16 min=-6528.0 max=1616.0
LayerNorm output=0 dtype=torch.bfloat16 min=-30.375 max=12.8125
LayerNorm output=1 dtype=torch.bfloat16 min=-30.25 max=12.8125
AdaLayerNormContinuous input=0 dtype=torch.bfloat16 min=-6528.0 max=1616.0
AdaLayerNormContinuous input=1 dtype=torch.bfloat16 min=-21.25 max=4.71875
AdaLayerNormContinuous output=0 dtype=torch.bfloat16 min=-14.875 max=4.59375
AdaLayerNormContinuous output=1 dtype=torch.bfloat16 min=-15.3125 max=6.9375
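Here the context stream is normalized with AdaLayerNormContinuous instead of AdaLayerNormZero, consistent with diffusers' final joint block (context_pre_only); the same module type appears again further down as the transformer's output norm before the final Linear. It projects the conditioning through SiLU and a Linear into just scale and shift, with no gates. A sketch assuming diffusers' definition, width illustrative:

import torch.nn as nn
import torch.nn.functional as F

class AdaLayerNormContinuousSketch(nn.Module):
    # SiLU -> Linear -> chunk(2) -> scale/shift-modulated LayerNorm.
    def __init__(self, dim=1536):
        super().__init__()
        self.linear = nn.Linear(dim, 2 * dim)
        self.norm = nn.LayerNorm(dim, elementwise_affine=False, eps=1e-6)

    def forward(self, x, cond):
        scale, shift = self.linear(F.silu(cond)).chunk(2, dim=1)
        return self.norm(x) * (1 + scale)[:, None, :] + shift[:, None, :]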
Linear input=0 dtype=torch.bfloat16 min=-13.375 max=11.6875
Linear output=0 dtype=torch.bfloat16 min=-13.6875 max=14.25
Linear output=1 dtype=torch.bfloat16 min=-13.375 max=13.8125
Linear input=0 dtype=torch.bfloat16 min=-13.375 max=11.6875
Linear output=0 dtype=torch.bfloat16 min=-16.875 max=18.0
Linear output=1 dtype=torch.bfloat16 min=-16.125 max=17.625
Linear input=0 dtype=torch.bfloat16 min=-13.375 max=11.6875
Linear output=0 dtype=torch.bfloat16 min=-8.8125 max=9.875
Linear output=1 dtype=torch.bfloat16 min=-9.125 max=9.0625
Linear input=0 dtype=torch.bfloat16 min=-15.3125 max=6.9375
Linear output=0 dtype=torch.bfloat16 min=-1.4609375 max=1.2890625
Linear output=1 dtype=torch.bfloat16 min=-1.40625 max=1.2890625
Linear input=0 dtype=torch.bfloat16 min=-15.3125 max=6.9375
Linear output=0 dtype=torch.bfloat16 min=-8.875 max=7.8125
Linear output=1 dtype=torch.bfloat16 min=-9.25 max=9.375
Linear input=0 dtype=torch.bfloat16 min=-15.3125 max=6.9375
Linear output=0 dtype=torch.bfloat16 min=-6.71875 max=6.9375
Linear output=1 dtype=torch.bfloat16 min=-6.4375 max=6.46875
Linear input=0 dtype=torch.bfloat16 min=-5.0625 max=6.0
Linear output=0 dtype=torch.bfloat16 min=-14.0625 max=35.75
Linear output=1 dtype=torch.bfloat16 min=-16.0 max=38.25
Dropout input=0 dtype=torch.bfloat16 min=-16.0 max=38.25
Dropout output=0 dtype=torch.bfloat16 min=-14.0625 max=35.75
Dropout output=1 dtype=torch.bfloat16 min=-16.0 max=38.25
Attention output=0 dtype=torch.bfloat16 min=-16.0 max=38.25
Attention output=1 dtype=torch.bfloat16 min=-4.1875 max=5.09375
LayerNorm input=0 dtype=torch.bfloat16 min=-324.0 max=292.0
LayerNorm output=0 dtype=torch.bfloat16 min=-19.75 max=14.875
LayerNorm output=1 dtype=torch.bfloat16 min=-19.875 max=18.0
Linear input=0 dtype=torch.bfloat16 min=-4.71875 max=9.25
Linear output=0 dtype=torch.bfloat16 min=-4.1875 max=4.28125
Linear output=1 dtype=torch.bfloat16 min=-4.0 max=4.34375
GELU input=0 dtype=torch.bfloat16 min=-4.71875 max=9.25
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.28125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.34375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.34375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.28125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.34375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.34375
Linear output=0 dtype=torch.bfloat16 min=-22.75 max=18.875
Linear output=1 dtype=torch.bfloat16 min=-22.875 max=18.5
FeedForward input=0 dtype=torch.bfloat16 min=-4.71875 max=9.25
FeedForward output=0 dtype=torch.bfloat16 min=-22.75 max=18.875
FeedForward output=1 dtype=torch.bfloat16 min=-22.875 max=18.5
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-296.0 max=302.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-1.046875 max=5.625
Linear output=1 dtype=torch.bfloat16 min=-1.0546875 max=5.65625
LayerNorm input=0 dtype=torch.bfloat16 min=-296.0 max=302.0
LayerNorm output=0 dtype=torch.bfloat16 min=-18.375 max=15.25
LayerNorm output=1 dtype=torch.bfloat16 min=-17.625 max=18.625
AdaLayerNormContinuous input=0 dtype=torch.bfloat16 min=-296.0 max=302.0
AdaLayerNormContinuous input=1 dtype=torch.bfloat16 min=-21.25 max=4.71875
AdaLayerNormContinuous output=0 dtype=torch.bfloat16 min=-6.21875 max=7.3125
AdaLayerNormContinuous output=1 dtype=torch.bfloat16 min=-6.4375 max=7.375
Linear input=0 dtype=torch.bfloat16 min=-6.4375 max=7.375
Linear output=0 dtype=torch.bfloat16 min=-4.25 max=4.6875
Linear output=1 dtype=torch.bfloat16 min=-4.34375 max=4.9375
SD3Transformer2DModel output=0 dtype=torch.bfloat16 min=-4.34375 max=4.9375
50%|█████ | 1/2 [00:29<00:29, 29.03s/it]
tensor([8.9286, 8.9286])
Conv2d input=0 dtype=torch.bfloat16 min=-5.375 max=4.40625
Conv2d output=0 dtype=torch.bfloat16 min=-19.125 max=7.25
Conv2d output=1 dtype=torch.bfloat16 min=-19.125 max=7.25
PatchEmbed input=0 dtype=torch.bfloat16 min=-5.375 max=4.40625
PatchEmbed output=0 dtype=torch.bfloat16 min=-18.125 max=7.96875
PatchEmbed output=1 dtype=torch.bfloat16 min=-18.125 max=7.96875
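PatchEmbed's output range (-18.125..7.97) differs from its inner Conv2d's (-19.125..7.25) because a positional embedding is added after the strided patchifying convolution is flattened to a token sequence. A sketch along the lines of diffusers' PatchEmbed; channel, patch, and width values are assumptions for SD3-medium:

import torch
import torch.nn as nn

class PatchEmbedSketch(nn.Module):
    def __init__(self, in_ch=16, dim=1536, patch=2, max_patches=4096):
        super().__init__()
        # Conv2d with kernel = stride = patch size patchifies the latent.
        self.proj = nn.Conv2d(in_ch, dim, kernel_size=patch, stride=patch)
        self.pos_embed = nn.Parameter(torch.zeros(1, max_patches, dim))

    def forward(self, latent):
        x = self.proj(latent).flatten(2).transpose(1, 2)  # (B, H*W/p^2, dim)
        return x + self.pos_embed[:, : x.shape[1]]        # shifts the range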
Timesteps input=0 dtype=torch.float32 min=8.928571701049805 max=8.928571701049805
Timesteps output=0 dtype=torch.float32 min=-0.9991970658302307 max=0.9999995231628418
Timesteps output=1 dtype=torch.float32 min=-0.9991970658302307 max=0.9999995231628418
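Timesteps is the classic sinusoidal embedding, which is why its output is bounded in [-1, 1] regardless of the input value, here t = 8.9286... for both entries of the CFG batch (the tensor([8.9286, 8.9286]) printed above). A simplified sketch; diffusers' version adds flip_sin_to_cos and frequency-shift options:

import math
import torch

def sinusoidal_timestep_embedding(t, dim=256, max_period=10000.0):
    # Half sine, half cosine channels at geometrically spaced frequencies;
    # all values lie in [-1, 1], matching the min/max printed above.
    half = dim // 2
    freqs = torch.exp(-math.log(max_period)
                      * torch.arange(half, dtype=torch.float32) / half)
    angles = t.float()[:, None] * freqs[None, :]
    return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)

For example, sinusoidal_timestep_embedding(torch.tensor([8.9286, 8.9286])) reproduces the bounded output seen above for the two batch entries.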
Linear input=0 dtype=torch.bfloat16 min=-1.0 max=1.0
Linear output=0 dtype=torch.bfloat16 min=-8.25 max=2.890625
Linear output=1 dtype=torch.bfloat16 min=-8.25 max=2.890625
SiLU input=0 dtype=torch.bfloat16 min=-8.25 max=2.890625
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=2.734375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=2.734375
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=2.734375
Linear output=0 dtype=torch.bfloat16 min=-10.625 max=6.3125
Linear output=1 dtype=torch.bfloat16 min=-10.625 max=6.3125
TimestepEmbedding input=0 dtype=torch.bfloat16 min=-1.0 max=1.0
TimestepEmbedding output=0 dtype=torch.bfloat16 min=-10.625 max=6.3125
TimestepEmbedding output=1 dtype=torch.bfloat16 min=-10.625 max=6.3125
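TimestepEmbedding itself is just the Linear -> SiLU -> Linear MLP traced above; a sketch with assumed widths (256-dim sinusoidal input projected to the model width):

import torch.nn as nn

class TimestepEmbeddingSketch(nn.Module):
    def __init__(self, in_dim=256, dim=1536):
        super().__init__()
        self.linear_1 = nn.Linear(in_dim, dim)
        self.act = nn.SiLU()
        self.linear_2 = nn.Linear(dim, dim)

    def forward(self, sample):
        return self.linear_2(self.act(self.linear_1(sample)))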
Linear input=0 dtype=torch.bfloat16 min=-5.34375 max=7.40625
Linear output=0 dtype=torch.bfloat16 min=-40.5 max=15.8125
Linear output=1 dtype=torch.bfloat16 min=-33.75 max=15.9375
SiLU input=0 dtype=torch.bfloat16 min=-40.5 max=15.9375
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=15.8125
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=15.9375
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=15.9375
Linear output=0 dtype=torch.bfloat16 min=-19.375 max=1.2265625
Linear output=1 dtype=torch.bfloat16 min=-20.125 max=1.359375
PixArtAlphaTextProjection input=0 dtype=torch.bfloat16 min=-5.34375 max=7.40625
PixArtAlphaTextProjection output=0 dtype=torch.bfloat16 min=-19.375 max=1.2265625
PixArtAlphaTextProjection output=1 dtype=torch.bfloat16 min=-20.125 max=1.359375
CombinedTimestepTextProjEmbeddings input=0 dtype=torch.float32 min=8.928571701049805 max=8.928571701049805
CombinedTimestepTextProjEmbeddings input=1 dtype=torch.bfloat16 min=-5.34375 max=7.40625
CombinedTimestepTextProjEmbeddings output=0 dtype=torch.bfloat16 min=-22.5 max=7.09375
CombinedTimestepTextProjEmbeddings output=1 dtype=torch.bfloat16 min=-23.25 max=7.03125
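CombinedTimestepTextProjEmbeddings takes the raw timestep (input=0, float32) and the pooled text embedding (input=1) and, consistent with diffusers, returns the elementwise sum of the two MLP embeddings traced above: TimestepEmbedding and PixArtAlphaTextProjection, the latter being another Linear -> SiLU -> Linear stack. A sketch in which the module arguments are the previously loaded submodules:

def combined_time_text_embedding(timestep, pooled_text,
                                 time_proj, timestep_embedder, text_embedder):
    # time_proj: sinusoidal Timesteps; timestep_embedder / text_embedder:
    # the two MLPs traced above. The conditioning vector is their sum.
    t_emb = timestep_embedder(time_proj(timestep).to(pooled_text.dtype))
    return t_emb + text_embedder(pooled_text)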
Linear input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
Linear output=0 dtype=torch.bfloat16 min=-812.0 max=612.0
Linear output=1 dtype=torch.bfloat16 min=-812.0 max=612.0
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-3.3125 max=3.953125
Linear output=1 dtype=torch.bfloat16 min=-3.3125 max=3.9375
LayerNorm input=0 dtype=torch.bfloat16 min=-18.125 max=7.96875
LayerNorm output=0 dtype=torch.bfloat16 min=-21.25 max=8.1875
LayerNorm output=1 dtype=torch.bfloat16 min=-21.25 max=8.1875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-18.125 max=7.96875
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-12.0 max=7.40625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-0.73046875 max=2.671875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.5703125 max=0.890625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.421875 max=1.6953125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-2.40625 max=3.203125
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-7.90625 max=7.0
Linear output=1 dtype=torch.bfloat16 min=-7.96875 max=7.0625
LayerNorm input=0 dtype=torch.bfloat16 min=-812.0 max=612.0
LayerNorm output=0 dtype=torch.bfloat16 min=-23.0 max=16.0
LayerNorm output=1 dtype=torch.bfloat16 min=-23.0 max=16.0
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-812.0 max=612.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-3.3125 max=7.625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-0.8046875 max=1.21875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-7.96875 max=7.0625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.25 max=0.5703125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-2.15625 max=2.921875
Linear input=0 dtype=torch.bfloat16 min=-12.0 max=7.40625
Linear output=0 dtype=torch.bfloat16 min=-31.5 max=29.375
Linear output=1 dtype=torch.bfloat16 min=-31.125 max=29.125
Linear input=0 dtype=torch.bfloat16 min=-12.0 max=7.40625
Linear output=0 dtype=torch.bfloat16 min=-23.875 max=20.75
Linear output=1 dtype=torch.bfloat16 min=-23.5 max=20.5
Linear input=0 dtype=torch.bfloat16 min=-12.0 max=7.40625
Linear output=0 dtype=torch.bfloat16 min=-15.875 max=13.0625
Linear output=1 dtype=torch.bfloat16 min=-15.6875 max=12.9375
Linear input=0 dtype=torch.bfloat16 min=-3.3125 max=7.625
Linear output=0 dtype=torch.bfloat16 min=-6.25 max=6.21875
Linear output=1 dtype=torch.bfloat16 min=-6.9375 max=6.71875
Linear input=0 dtype=torch.bfloat16 min=-3.3125 max=7.625
Linear output=0 dtype=torch.bfloat16 min=-4.78125 max=4.75
Linear output=1 dtype=torch.bfloat16 min=-5.09375 max=5.0
Linear input=0 dtype=torch.bfloat16 min=-3.3125 max=7.625
Linear output=0 dtype=torch.bfloat16 min=-3.046875 max=3.171875
Linear output=1 dtype=torch.bfloat16 min=-3.90625 max=4.46875
Linear input=0 dtype=torch.bfloat16 min=-6.1875 max=6.46875
Linear output=0 dtype=torch.bfloat16 min=-21.875 max=11.375
Linear output=1 dtype=torch.bfloat16 min=-21.5 max=11.25
Dropout input=0 dtype=torch.bfloat16 min=-21.875 max=11.375
Dropout output=0 dtype=torch.bfloat16 min=-21.875 max=11.375
Dropout output=1 dtype=torch.bfloat16 min=-21.5 max=11.25
Linear input=0 dtype=torch.bfloat16 min=-10.3125 max=8.1875
Linear output=0 dtype=torch.bfloat16 min=-15.0 max=15.0
Linear output=1 dtype=torch.bfloat16 min=-13.4375 max=13.8125
Attention output=0 dtype=torch.bfloat16 min=-21.875 max=11.375
Attention output=1 dtype=torch.bfloat16 min=-15.0 max=15.0
LayerNorm input=0 dtype=torch.bfloat16 min=-61.75 max=11.5
LayerNorm output=0 dtype=torch.bfloat16 min=-36.5 max=6.875
LayerNorm output=1 dtype=torch.bfloat16 min=-36.5 max=6.875
Linear input=0 dtype=torch.bfloat16 min=-2.90625 max=2.640625
Linear output=0 dtype=torch.bfloat16 min=-9.875 max=4.15625
Linear output=1 dtype=torch.bfloat16 min=-9.6875 max=4.125
GELU input=0 dtype=torch.bfloat16 min=-2.90625 max=2.640625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.15625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.15625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.15625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.15625
Linear output=0 dtype=torch.bfloat16 min=-20.25 max=17.5
Linear output=1 dtype=torch.bfloat16 min=-20.25 max=17.5
FeedForward input=0 dtype=torch.bfloat16 min=-2.90625 max=2.640625
FeedForward output=0 dtype=torch.bfloat16 min=-20.25 max=17.5
FeedForward output=1 dtype=torch.bfloat16 min=-20.25 max=17.5
LayerNorm input=0 dtype=torch.bfloat16 min=-820.0 max=612.0
LayerNorm output=0 dtype=torch.bfloat16 min=-23.75 max=16.5
LayerNorm output=1 dtype=torch.bfloat16 min=-24.125 max=16.0
Linear input=0 dtype=torch.bfloat16 min=-9.0 max=8.5625
Linear output=0 dtype=torch.bfloat16 min=-22.125 max=30.5
Linear output=1 dtype=torch.bfloat16 min=-19.375 max=36.0
GELU input=0 dtype=torch.bfloat16 min=-9.0 max=8.5625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=30.5
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=36.0
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=36.0
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=30.5
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=36.0
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=36.0
Linear output=0 dtype=torch.bfloat16 min=-42.0 max=42.75
Linear output=1 dtype=torch.bfloat16 min=-45.0 max=37.0
FeedForward input=0 dtype=torch.bfloat16 min=-9.0 max=8.5625
FeedForward output=0 dtype=torch.bfloat16 min=-42.0 max=42.75
FeedForward output=1 dtype=torch.bfloat16 min=-45.0 max=37.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-840.0 max=608.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-67.0 max=28.375
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-4.4375 max=3.984375
Linear output=1 dtype=torch.bfloat16 min=-4.46875 max=4.0
LayerNorm input=0 dtype=torch.bfloat16 min=-67.0 max=28.375
LayerNorm output=0 dtype=torch.bfloat16 min=-30.375 max=12.6875
LayerNorm output=1 dtype=torch.bfloat16 min=-30.5 max=12.6875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-67.0 max=28.375
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-12.5 max=11.6875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-0.84765625 max=4.0
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.65234375 max=1.3984375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.953125 max=2.921875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-0.93359375 max=3.625
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-3.109375 max=5.125
Linear output=1 dtype=torch.bfloat16 min=-3.15625 max=5.125
LayerNorm input=0 dtype=torch.bfloat16 min=-840.0 max=608.0
LayerNorm output=0 dtype=torch.bfloat16 min=-31.5 max=15.8125
LayerNorm output=1 dtype=torch.bfloat16 min=-31.875 max=16.75
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-840.0 max=608.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-9.0 max=5.9375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.78125 max=2.59375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.328125 max=3.203125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.421875 max=5.125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-3.15625 max=1.484375
Linear input=0 dtype=torch.bfloat16 min=-12.5 max=11.6875
Linear output=0 dtype=torch.bfloat16 min=-32.5 max=34.25
Linear output=1 dtype=torch.bfloat16 min=-31.875 max=33.5
Linear input=0 dtype=torch.bfloat16 min=-12.5 max=11.6875
Linear output=0 dtype=torch.bfloat16 min=-16.25 max=16.0
Linear output=1 dtype=torch.bfloat16 min=-15.9375 max=15.75
Linear input=0 dtype=torch.bfloat16 min=-12.5 max=11.6875
Linear output=0 dtype=torch.bfloat16 min=-11.0 max=10.5625
Linear output=1 dtype=torch.bfloat16 min=-10.875 max=10.4375
Linear input=0 dtype=torch.bfloat16 min=-9.0 max=5.9375
Linear output=0 dtype=torch.bfloat16 min=-4.65625 max=4.96875
Linear output=1 dtype=torch.bfloat16 min=-4.40625 max=4.6875
Linear input=0 dtype=torch.bfloat16 min=-9.0 max=5.9375
Linear output=0 dtype=torch.bfloat16 min=-4.78125 max=4.6875
Linear output=1 dtype=torch.bfloat16 min=-4.65625 max=4.53125
Linear input=0 dtype=torch.bfloat16 min=-9.0 max=5.9375
Linear output=0 dtype=torch.bfloat16 min=-3.21875 max=3.390625
Linear output=1 dtype=torch.bfloat16 min=-2.234375 max=2.578125
Linear input=0 dtype=torch.bfloat16 min=-4.5625 max=4.625
Linear output=0 dtype=torch.bfloat16 min=-10.0625 max=11.9375
Linear output=1 dtype=torch.bfloat16 min=-10.0625 max=11.75
Dropout input=0 dtype=torch.bfloat16 min=-10.0625 max=11.9375
Dropout output=0 dtype=torch.bfloat16 min=-10.0625 max=11.9375
Dropout output=1 dtype=torch.bfloat16 min=-10.0625 max=11.75
Linear input=0 dtype=torch.bfloat16 min=-4.1875 max=4.71875
Linear output=0 dtype=torch.bfloat16 min=-12.0 max=31.75
Linear output=1 dtype=torch.bfloat16 min=-12.0625 max=32.0
Attention output=0 dtype=torch.bfloat16 min=-10.0625 max=11.9375
Attention output=1 dtype=torch.bfloat16 min=-12.0625 max=32.0
LayerNorm input=0 dtype=torch.bfloat16 min=-91.0 max=28.5
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=10.6875
LayerNorm output=1 dtype=torch.bfloat16 min=-33.25 max=10.6875
Linear input=0 dtype=torch.bfloat16 min=-3.734375 max=4.84375
Linear output=0 dtype=torch.bfloat16 min=-6.28125 max=4.9375
Linear output=1 dtype=torch.bfloat16 min=-6.25 max=4.9375
GELU input=0 dtype=torch.bfloat16 min=-3.734375 max=4.84375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.9375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.9375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.9375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.9375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.9375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.9375
Linear output=0 dtype=torch.bfloat16 min=-13.5625 max=16.375
Linear output=1 dtype=torch.bfloat16 min=-13.6875 max=16.625
FeedForward input=0 dtype=torch.bfloat16 min=-3.734375 max=4.84375
FeedForward output=0 dtype=torch.bfloat16 min=-13.5625 max=16.375
FeedForward output=1 dtype=torch.bfloat16 min=-13.6875 max=16.625
LayerNorm input=0 dtype=torch.bfloat16 min=-840.0 max=608.0
LayerNorm output=0 dtype=torch.bfloat16 min=-27.5 max=15.875
LayerNorm output=1 dtype=torch.bfloat16 min=-28.25 max=16.875
Linear input=0 dtype=torch.bfloat16 min=-21.375 max=9.75
Linear output=0 dtype=torch.bfloat16 min=-29.875 max=26.875
Linear output=1 dtype=torch.bfloat16 min=-27.0 max=25.5
GELU input=0 dtype=torch.bfloat16 min=-21.375 max=9.75
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=26.875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=25.5
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=26.875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=26.875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=25.5
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=26.875
Linear output=0 dtype=torch.bfloat16 min=-218.0 max=512.0
Linear output=1 dtype=torch.bfloat16 min=-218.0 max=510.0
FeedForward input=0 dtype=torch.bfloat16 min=-21.375 max=9.75
FeedForward output=0 dtype=torch.bfloat16 min=-218.0 max=512.0
FeedForward output=1 dtype=torch.bfloat16 min=-218.0 max=510.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-2448.0 max=616.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-72.0 max=28.375
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-5.25 max=3.84375
Linear output=1 dtype=torch.bfloat16 min=-5.28125 max=3.84375
LayerNorm input=0 dtype=torch.bfloat16 min=-72.0 max=28.375
LayerNorm output=0 dtype=torch.bfloat16 min=-30.5 max=12.375
LayerNorm output=1 dtype=torch.bfloat16 min=-30.625 max=12.375
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-72.0 max=28.375
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-6.9375 max=5.25
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-3.53125 max=0.53515625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.03125 max=2.78125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-5.28125 max=3.453125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-0.62109375 max=2.28125
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-6.09375 max=3.328125
Linear output=1 dtype=torch.bfloat16 min=-6.125 max=3.34375
LayerNorm input=0 dtype=torch.bfloat16 min=-2448.0 max=616.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.25 max=10.125
LayerNorm output=1 dtype=torch.bfloat16 min=-36.25 max=14.625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-2448.0 max=616.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-6.40625 max=5.78125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.78125 max=2.25
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-6.125 max=3.109375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.953125 max=2.015625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-4.34375 max=3.34375
Linear input=0 dtype=torch.bfloat16 min=-6.9375 max=5.25
Linear output=0 dtype=torch.bfloat16 min=-17.625 max=17.75
Linear output=1 dtype=torch.bfloat16 min=-17.875 max=17.75
Linear input=0 dtype=torch.bfloat16 min=-6.9375 max=5.25
Linear output=0 dtype=torch.bfloat16 min=-13.5 max=13.1875
Linear output=1 dtype=torch.bfloat16 min=-13.375 max=13.3125
Linear input=0 dtype=torch.bfloat16 min=-6.9375 max=5.25
Linear output=0 dtype=torch.bfloat16 min=-6.0625 max=7.9375
Linear output=1 dtype=torch.bfloat16 min=-6.0625 max=7.9375
Linear input=0 dtype=torch.bfloat16 min=-6.40625 max=5.78125
Linear output=0 dtype=torch.bfloat16 min=-4.3125 max=4.03125
Linear output=1 dtype=torch.bfloat16 min=-4.6875 max=4.84375
Linear input=0 dtype=torch.bfloat16 min=-6.40625 max=5.78125
Linear output=0 dtype=torch.bfloat16 min=-4.28125 max=5.03125
Linear output=1 dtype=torch.bfloat16 min=-4.1875 max=4.65625
Linear input=0 dtype=torch.bfloat16 min=-6.40625 max=5.78125
Linear output=0 dtype=torch.bfloat16 min=-5.4375 max=4.15625
Linear output=1 dtype=torch.bfloat16 min=-4.71875 max=3.59375
Linear input=0 dtype=torch.bfloat16 min=-4.84375 max=5.34375
Linear output=0 dtype=torch.bfloat16 min=-11.375 max=13.4375
Linear output=1 dtype=torch.bfloat16 min=-11.25 max=13.3125
Dropout input=0 dtype=torch.bfloat16 min=-11.375 max=13.4375
Dropout output=0 dtype=torch.bfloat16 min=-11.375 max=13.4375
Dropout output=1 dtype=torch.bfloat16 min=-11.25 max=13.3125
Linear input=0 dtype=torch.bfloat16 min=-3.375 max=3.25
Linear output=0 dtype=torch.bfloat16 min=-10.0625 max=8.125
Linear output=1 dtype=torch.bfloat16 min=-10.1875 max=7.5625
Attention output=0 dtype=torch.bfloat16 min=-11.375 max=13.4375
Attention output=1 dtype=torch.bfloat16 min=-10.1875 max=8.125
LayerNorm input=0 dtype=torch.bfloat16 min=-111.0 max=28.375
LayerNorm output=0 dtype=torch.bfloat16 min=-34.25 max=9.5625
LayerNorm output=1 dtype=torch.bfloat16 min=-34.25 max=9.4375
Linear input=0 dtype=torch.bfloat16 min=-3.171875 max=3.90625
Linear output=0 dtype=torch.bfloat16 min=-5.9375 max=6.1875
Linear output=1 dtype=torch.bfloat16 min=-5.9375 max=6.1875
GELU input=0 dtype=torch.bfloat16 min=-3.171875 max=3.90625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=6.1875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=6.1875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.1875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=6.1875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=6.1875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.1875
Linear output=0 dtype=torch.bfloat16 min=-6.3125 max=25.0
Linear output=1 dtype=torch.bfloat16 min=-6.3125 max=25.5
FeedForward input=0 dtype=torch.bfloat16 min=-3.171875 max=3.90625
FeedForward output=0 dtype=torch.bfloat16 min=-6.3125 max=25.0
FeedForward output=1 dtype=torch.bfloat16 min=-6.3125 max=25.5
LayerNorm input=0 dtype=torch.bfloat16 min=-2464.0 max=608.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.0 max=9.75
LayerNorm output=1 dtype=torch.bfloat16 min=-36.0 max=13.6875
Linear input=0 dtype=torch.bfloat16 min=-21.25 max=25.375
Linear output=0 dtype=torch.bfloat16 min=-24.875 max=17.125
Linear output=1 dtype=torch.bfloat16 min=-22.25 max=17.125
GELU input=0 dtype=torch.bfloat16 min=-21.25 max=25.375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=17.125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=17.125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=17.125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=17.125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=17.125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=17.125
Linear output=0 dtype=torch.bfloat16 min=-173.0 max=251.0
Linear output=1 dtype=torch.bfloat16 min=-173.0 max=251.0
FeedForward input=0 dtype=torch.bfloat16 min=-21.25 max=25.375
FeedForward output=0 dtype=torch.bfloat16 min=-173.0 max=251.0
FeedForward output=1 dtype=torch.bfloat16 min=-173.0 max=251.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-3552.0 max=680.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-73.0 max=28.375
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-4.1875 max=5.59375
Linear output=1 dtype=torch.bfloat16 min=-4.21875 max=5.5625
LayerNorm input=0 dtype=torch.bfloat16 min=-73.0 max=28.375
LayerNorm output=0 dtype=torch.bfloat16 min=-30.625 max=11.875
LayerNorm output=1 dtype=torch.bfloat16 min=-30.5 max=11.875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-73.0 max=28.375
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-12.125 max=13.6875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-0.84375 max=4.84375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.3125 max=2.890625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-4.09375 max=2.34375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-0.8984375 max=4.78125
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-5.03125 max=3.8125
Linear output=1 dtype=torch.bfloat16 min=-5.0625 max=3.828125
LayerNorm input=0 dtype=torch.bfloat16 min=-3552.0 max=680.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.0 max=8.6875
LayerNorm output=1 dtype=torch.bfloat16 min=-36.75 max=12.75
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-3552.0 max=680.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-6.9375 max=7.65625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-3.46875 max=3.609375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-5.0625 max=2.5625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.84375 max=1.96875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-2.96875 max=3.828125
Linear input=0 dtype=torch.bfloat16 min=-12.125 max=13.6875
Linear output=0 dtype=torch.bfloat16 min=-42.0 max=34.75
Linear output=1 dtype=torch.bfloat16 min=-42.25 max=34.5
Linear input=0 dtype=torch.bfloat16 min=-12.125 max=13.6875
Linear output=0 dtype=torch.bfloat16 min=-22.625 max=23.0
Linear output=1 dtype=torch.bfloat16 min=-22.625 max=22.875
Linear input=0 dtype=torch.bfloat16 min=-12.125 max=13.6875
Linear output=0 dtype=torch.bfloat16 min=-10.0625 max=9.625
Linear output=1 dtype=torch.bfloat16 min=-10.0625 max=9.625
Linear input=0 dtype=torch.bfloat16 min=-6.9375 max=7.65625
Linear output=0 dtype=torch.bfloat16 min=-4.90625 max=4.40625
Linear output=1 dtype=torch.bfloat16 min=-4.375 max=4.34375
Linear input=0 dtype=torch.bfloat16 min=-6.9375 max=7.65625
Linear output=0 dtype=torch.bfloat16 min=-4.46875 max=5.34375
Linear output=1 dtype=torch.bfloat16 min=-4.6875 max=5.3125
Linear input=0 dtype=torch.bfloat16 min=-6.9375 max=7.65625
Linear output=0 dtype=torch.bfloat16 min=-3.546875 max=4.53125
Linear output=1 dtype=torch.bfloat16 min=-3.59375 max=4.28125
Linear input=0 dtype=torch.bfloat16 min=-6.15625 max=6.375
Linear output=0 dtype=torch.bfloat16 min=-14.5625 max=12.625
Linear output=1 dtype=torch.bfloat16 min=-15.0 max=12.8125
Dropout input=0 dtype=torch.bfloat16 min=-15.0 max=12.8125
Dropout output=0 dtype=torch.bfloat16 min=-14.5625 max=12.625
Dropout output=1 dtype=torch.bfloat16 min=-15.0 max=12.8125
Linear input=0 dtype=torch.bfloat16 min=-7.1875 max=5.78125
Linear output=0 dtype=torch.bfloat16 min=-21.25 max=12.375
Linear output=1 dtype=torch.bfloat16 min=-22.5 max=12.5625
Attention output=0 dtype=torch.bfloat16 min=-15.0 max=12.8125
Attention output=1 dtype=torch.bfloat16 min=-22.5 max=12.5625
LayerNorm input=0 dtype=torch.bfloat16 min=-116.5 max=28.25
LayerNorm output=0 dtype=torch.bfloat16 min=-34.75 max=8.8125
LayerNorm output=1 dtype=torch.bfloat16 min=-34.75 max=8.8125
Linear input=0 dtype=torch.bfloat16 min=-4.46875 max=3.625
Linear output=0 dtype=torch.bfloat16 min=-6.78125 max=6.84375
Linear output=1 dtype=torch.bfloat16 min=-6.875 max=6.9375
GELU input=0 dtype=torch.bfloat16 min=-4.46875 max=3.625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=6.84375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=6.9375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.9375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=6.84375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=6.9375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.9375
Linear output=0 dtype=torch.bfloat16 min=-8.375 max=10.5625
Linear output=1 dtype=torch.bfloat16 min=-8.6875 max=10.75
FeedForward input=0 dtype=torch.bfloat16 min=-4.46875 max=3.625
FeedForward output=0 dtype=torch.bfloat16 min=-8.375 max=10.5625
FeedForward output=1 dtype=torch.bfloat16 min=-8.6875 max=10.75
LayerNorm input=0 dtype=torch.bfloat16 min=-3552.0 max=672.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.0 max=9.875
LayerNorm output=1 dtype=torch.bfloat16 min=-35.5 max=13.1875
Linear input=0 dtype=torch.bfloat16 min=-15.75 max=15.125
Linear output=0 dtype=torch.bfloat16 min=-12.375 max=9.5
Linear output=1 dtype=torch.bfloat16 min=-11.75 max=9.3125
GELU input=0 dtype=torch.bfloat16 min=-15.75 max=15.125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.5
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=9.3125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.5
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.5
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=9.3125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.5
Linear output=0 dtype=torch.bfloat16 min=-71.0 max=37.75
Linear output=1 dtype=torch.bfloat16 min=-70.0 max=32.75
FeedForward input=0 dtype=torch.bfloat16 min=-15.75 max=15.125
FeedForward output=0 dtype=torch.bfloat16 min=-71.0 max=37.75
FeedForward output=1 dtype=torch.bfloat16 min=-70.0 max=32.75
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-3808.0 max=680.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-103.0 max=28.25
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-3.78125 max=6.65625
Linear output=1 dtype=torch.bfloat16 min=-3.78125 max=6.6875
LayerNorm input=0 dtype=torch.bfloat16 min=-103.0 max=28.25
LayerNorm output=0 dtype=torch.bfloat16 min=-31.0 max=11.6875
LayerNorm output=1 dtype=torch.bfloat16 min=-30.875 max=11.625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-103.0 max=28.25
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-11.6875 max=11.4375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-0.76171875 max=4.0625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.0234375 max=2.34375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.125 max=2.453125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-1.015625 max=6.6875
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-5.875 max=7.625
Linear output=1 dtype=torch.bfloat16 min=-5.90625 max=7.65625
LayerNorm input=0 dtype=torch.bfloat16 min=-3808.0 max=680.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.25 max=10.1875
LayerNorm output=1 dtype=torch.bfloat16 min=-35.25 max=13.5
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-3808.0 max=680.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-3.8125 max=6.0625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.921875 max=5.15625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.0625 max=1.78125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.65625 max=2.21875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-5.90625 max=7.65625
Linear input=0 dtype=torch.bfloat16 min=-11.6875 max=11.4375
Linear output=0 dtype=torch.bfloat16 min=-20.625 max=17.5
Linear output=1 dtype=torch.bfloat16 min=-20.5 max=17.5
Linear input=0 dtype=torch.bfloat16 min=-11.6875 max=11.4375
Linear output=0 dtype=torch.bfloat16 min=-15.375 max=16.875
Linear output=1 dtype=torch.bfloat16 min=-15.5 max=17.0
Linear input=0 dtype=torch.bfloat16 min=-11.6875 max=11.4375
Linear output=0 dtype=torch.bfloat16 min=-9.4375 max=8.875
Linear output=1 dtype=torch.bfloat16 min=-9.375 max=8.875
Linear input=0 dtype=torch.bfloat16 min=-3.8125 max=6.0625
Linear output=0 dtype=torch.bfloat16 min=-4.21875 max=4.5625
Linear output=1 dtype=torch.bfloat16 min=-5.03125 max=4.9375
Linear input=0 dtype=torch.bfloat16 min=-3.8125 max=6.0625
Linear output=0 dtype=torch.bfloat16 min=-5.4375 max=5.28125
Linear output=1 dtype=torch.bfloat16 min=-6.65625 max=5.65625
Linear input=0 dtype=torch.bfloat16 min=-3.8125 max=6.0625
Linear output=0 dtype=torch.bfloat16 min=-4.8125 max=4.84375
Linear output=1 dtype=torch.bfloat16 min=-4.875 max=5.6875
Linear input=0 dtype=torch.bfloat16 min=-6.125 max=7.71875
Linear output=0 dtype=torch.bfloat16 min=-24.5 max=14.5625
Linear output=1 dtype=torch.bfloat16 min=-24.75 max=15.0625
Dropout input=0 dtype=torch.bfloat16 min=-24.75 max=15.0625
Dropout output=0 dtype=torch.bfloat16 min=-24.5 max=14.5625
Dropout output=1 dtype=torch.bfloat16 min=-24.75 max=15.0625
Linear input=0 dtype=torch.bfloat16 min=-3.96875 max=3.765625
Linear output=0 dtype=torch.bfloat16 min=-12.75 max=15.9375
Linear output=1 dtype=torch.bfloat16 min=-9.75 max=10.75
Attention output=0 dtype=torch.bfloat16 min=-24.75 max=15.0625
Attention output=1 dtype=torch.bfloat16 min=-12.75 max=15.9375
LayerNorm input=0 dtype=torch.bfloat16 min=-174.0 max=28.25
LayerNorm output=0 dtype=torch.bfloat16 min=-36.25 max=7.5
LayerNorm output=1 dtype=torch.bfloat16 min=-36.25 max=7.40625
Linear input=0 dtype=torch.bfloat16 min=-3.140625 max=3.34375
Linear output=0 dtype=torch.bfloat16 min=-8.125 max=4.8125
Linear output=1 dtype=torch.bfloat16 min=-8.3125 max=4.9375
GELU input=0 dtype=torch.bfloat16 min=-3.140625 max=3.34375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.8125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.9375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.9375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.8125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.9375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.9375
Linear output=0 dtype=torch.bfloat16 min=-7.3125 max=10.875
Linear output=1 dtype=torch.bfloat16 min=-7.28125 max=10.9375
FeedForward input=0 dtype=torch.bfloat16 min=-3.140625 max=3.34375
FeedForward output=0 dtype=torch.bfloat16 min=-7.3125 max=10.875
FeedForward output=1 dtype=torch.bfloat16 min=-7.28125 max=10.9375
LayerNorm input=0 dtype=torch.bfloat16 min=-3808.0 max=680.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.25 max=9.4375
LayerNorm output=1 dtype=torch.bfloat16 min=-35.25 max=12.75
Linear input=0 dtype=torch.bfloat16 min=-10.3125 max=8.125
Linear output=0 dtype=torch.bfloat16 min=-13.75 max=14.625
Linear output=1 dtype=torch.bfloat16 min=-17.125 max=14.3125
GELU input=0 dtype=torch.bfloat16 min=-10.3125 max=8.125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=14.625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=14.3125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=14.625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=14.625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=14.3125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=14.625
Linear output=0 dtype=torch.bfloat16 min=-71.5 max=38.5
Linear output=1 dtype=torch.bfloat16 min=-70.0 max=35.25
FeedForward input=0 dtype=torch.bfloat16 min=-10.3125 max=8.125
FeedForward output=0 dtype=torch.bfloat16 min=-71.5 max=38.5
FeedForward output=1 dtype=torch.bfloat16 min=-70.0 max=35.25
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-4352.0 max=704.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-144.0 max=28.25
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-4.75 max=6.03125
Linear output=1 dtype=torch.bfloat16 min=-4.71875 max=6.03125
LayerNorm input=0 dtype=torch.bfloat16 min=-144.0 max=28.25
LayerNorm output=0 dtype=torch.bfloat16 min=-33.75 max=11.4375
LayerNorm output=1 dtype=torch.bfloat16 min=-33.75 max=11.4375
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-144.0 max=28.25
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-10.1875 max=10.25
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.75 max=0.76171875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.078125 max=1.7890625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.765625 max=2.3125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-0.97265625 max=6.03125
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-4.4375 max=9.75
Linear output=1 dtype=torch.bfloat16 min=-4.46875 max=9.8125
LayerNorm input=0 dtype=torch.bfloat16 min=-4352.0 max=704.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=8.625
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=12.125
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-4352.0 max=704.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-7.125 max=4.125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.46875 max=6.1875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.6796875 max=1.9140625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.484375 max=1.921875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-3.671875 max=9.8125
Linear input=0 dtype=torch.bfloat16 min=-10.1875 max=10.25
Linear output=0 dtype=torch.bfloat16 min=-10.25 max=10.25
Linear output=1 dtype=torch.bfloat16 min=-10.25 max=10.25
Linear input=0 dtype=torch.bfloat16 min=-10.1875 max=10.25
Linear output=0 dtype=torch.bfloat16 min=-10.6875 max=11.5
Linear output=1 dtype=torch.bfloat16 min=-11.0625 max=11.6875
Linear input=0 dtype=torch.bfloat16 min=-10.1875 max=10.25
Linear output=0 dtype=torch.bfloat16 min=-9.4375 max=9.125
Linear output=1 dtype=torch.bfloat16 min=-9.25 max=9.1875
Linear input=0 dtype=torch.bfloat16 min=-7.125 max=4.125
Linear output=0 dtype=torch.bfloat16 min=-4.75 max=5.1875
Linear output=1 dtype=torch.bfloat16 min=-5.46875 max=4.96875
Linear input=0 dtype=torch.bfloat16 min=-7.125 max=4.125
Linear output=0 dtype=torch.bfloat16 min=-5.09375 max=4.875
Linear output=1 dtype=torch.bfloat16 min=-5.8125 max=4.875
Linear input=0 dtype=torch.bfloat16 min=-7.125 max=4.125
Linear output=0 dtype=torch.bfloat16 min=-4.59375 max=4.75
Linear output=1 dtype=torch.bfloat16 min=-5.03125 max=4.78125
Linear input=0 dtype=torch.bfloat16 min=-6.09375 max=6.8125
Linear output=0 dtype=torch.bfloat16 min=-10.0625 max=28.75
Linear output=1 dtype=torch.bfloat16 min=-10.25 max=29.0
Dropout input=0 dtype=torch.bfloat16 min=-10.25 max=29.0
Dropout output=0 dtype=torch.bfloat16 min=-10.0625 max=28.75
Dropout output=1 dtype=torch.bfloat16 min=-10.25 max=29.0
Linear input=0 dtype=torch.bfloat16 min=-4.59375 max=3.84375
Linear output=0 dtype=torch.bfloat16 min=-11.5 max=16.125
Linear output=1 dtype=torch.bfloat16 min=-9.4375 max=12.75
Attention output=0 dtype=torch.bfloat16 min=-10.25 max=29.0
Attention output=1 dtype=torch.bfloat16 min=-11.5 max=16.125
LayerNorm input=0 dtype=torch.bfloat16 min=-234.0 max=28.25
LayerNorm output=0 dtype=torch.bfloat16 min=-37.5 max=6.125
LayerNorm output=1 dtype=torch.bfloat16 min=-37.25 max=6.09375
Linear input=0 dtype=torch.bfloat16 min=-3.71875 max=2.171875
Linear output=0 dtype=torch.bfloat16 min=-5.40625 max=8.375
Linear output=1 dtype=torch.bfloat16 min=-5.40625 max=8.875
GELU input=0 dtype=torch.bfloat16 min=-3.71875 max=2.171875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=8.875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=8.875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=8.875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=8.875
Linear output=0 dtype=torch.bfloat16 min=-7.40625 max=17.875
Linear output=1 dtype=torch.bfloat16 min=-8.125 max=17.75
FeedForward input=0 dtype=torch.bfloat16 min=-3.71875 max=2.171875
FeedForward output=0 dtype=torch.bfloat16 min=-7.40625 max=17.875
FeedForward output=1 dtype=torch.bfloat16 min=-8.125 max=17.75
LayerNorm input=0 dtype=torch.bfloat16 min=-4352.0 max=716.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=9.1875
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=12.0625
Linear input=0 dtype=torch.bfloat16 min=-8.875 max=6.4375
Linear output=0 dtype=torch.bfloat16 min=-15.875 max=11.0
Linear output=1 dtype=torch.bfloat16 min=-15.625 max=11.0625
GELU input=0 dtype=torch.bfloat16 min=-8.875 max=6.4375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=11.0
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=11.0625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=11.0625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=11.0
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=11.0625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=11.0625
Linear output=0 dtype=torch.bfloat16 min=-52.25 max=35.25
Linear output=1 dtype=torch.bfloat16 min=-51.75 max=36.25
FeedForward input=0 dtype=torch.bfloat16 min=-8.875 max=6.4375
FeedForward output=0 dtype=torch.bfloat16 min=-52.25 max=35.25
FeedForward output=1 dtype=torch.bfloat16 min=-51.75 max=36.25
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-4864.0 max=776.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-144.0 max=28.25
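Aside, since the gist ships only the trace itself: every line follows one schema (module class, input/output index, dtype, tensor min/max), which is exactly what a generic forward hook over the SD3 transformer's submodules would print. The indexing scheme is not fully recoverable from the log: tuple-returning modules such as Attention and AdaLayerNormZero are clearly indexed by tuple position, while the paired output=0/output=1 lines on single-tensor modules like Linear look like separate stats for the two halves of the classifier-free-guidance batch. The sketch below is therefore only a simplified reconstruction under those assumptions; every name in it is illustrative rather than taken from the source, and a toy model stands in for pipe.transformer.

    import torch
    from torch import nn

    def stat_line(name, kind, idx, t):
        # One line per tensor, matching the trace schema:
        # "<ModuleClass> <input|output>=<i> dtype=<dtype> min=<min> max=<max>"
        print(f"{name} {kind}={idx} dtype={t.dtype} min={t.min().item()} max={t.max().item()}")

    def hook(module, args, output):
        name = type(module).__name__
        for i, a in enumerate(args):
            if torch.is_tensor(a):
                stat_line(name, "input", i, a)
        outs = output if isinstance(output, tuple) else (output,)
        for i, o in enumerate(outs):
            if torch.is_tensor(o):
                stat_line(name, "output", i, o)

    # Toy stand-in; the gist's trace would come from hooking every submodule
    # of the SD3 MMDiT instead (for m in pipe.transformer.modules(): ...).
    model = nn.Sequential(nn.Linear(8, 8), nn.SiLU()).to(torch.bfloat16)
    for m in model.modules():
        m.register_forward_hook(hook)
    model(torch.randn(2, 8, dtype=torch.bfloat16))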
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-4.0625 max=5.96875
Linear output=1 dtype=torch.bfloat16 min=-4.09375 max=5.96875
LayerNorm input=0 dtype=torch.bfloat16 min=-144.0 max=28.25
LayerNorm output=0 dtype=torch.bfloat16 min=-34.25 max=9.875
LayerNorm output=1 dtype=torch.bfloat16 min=-34.25 max=9.8125
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-144.0 max=28.25
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-9.125 max=8.5
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.09375 max=1.109375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-3.234375 max=2.203125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.328125 max=3.328125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-1.5390625 max=5.96875
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-11.0 max=8.75
Linear output=1 dtype=torch.bfloat16 min=-11.0625 max=8.8125
LayerNorm input=0 dtype=torch.bfloat16 min=-4864.0 max=776.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.0 max=15.5625
LayerNorm output=1 dtype=torch.bfloat16 min=-36.0 max=16.25
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-4864.0 max=776.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-4.46875 max=6.3125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-7.6875 max=6.25
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.0625 max=1.375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.4375 max=1.921875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-11.0625 max=8.8125
Linear input=0 dtype=torch.bfloat16 min=-9.125 max=8.5
Linear output=0 dtype=torch.bfloat16 min=-13.5 max=14.5
Linear output=1 dtype=torch.bfloat16 min=-13.5 max=14.375
Linear input=0 dtype=torch.bfloat16 min=-9.125 max=8.5
Linear output=0 dtype=torch.bfloat16 min=-16.375 max=15.4375
Linear output=1 dtype=torch.bfloat16 min=-16.375 max=15.25
Linear input=0 dtype=torch.bfloat16 min=-9.125 max=8.5
Linear output=0 dtype=torch.bfloat16 min=-7.25 max=7.78125
Linear output=1 dtype=torch.bfloat16 min=-7.34375 max=8.0
Linear input=0 dtype=torch.bfloat16 min=-4.46875 max=6.3125
Linear output=0 dtype=torch.bfloat16 min=-4.84375 max=6.15625
Linear output=1 dtype=torch.bfloat16 min=-6.125 max=6.15625
Linear input=0 dtype=torch.bfloat16 min=-4.46875 max=6.3125
Linear output=0 dtype=torch.bfloat16 min=-4.84375 max=5.1875
Linear output=1 dtype=torch.bfloat16 min=-5.03125 max=5.375
Linear input=0 dtype=torch.bfloat16 min=-4.46875 max=6.3125
Linear output=0 dtype=torch.bfloat16 min=-5.75 max=6.84375
Linear output=1 dtype=torch.bfloat16 min=-5.84375 max=6.84375
Linear input=0 dtype=torch.bfloat16 min=-5.65625 max=6.21875
Linear output=0 dtype=torch.bfloat16 min=-11.9375 max=25.875
Linear output=1 dtype=torch.bfloat16 min=-11.6875 max=25.875
Dropout input=0 dtype=torch.bfloat16 min=-11.9375 max=25.875
Dropout output=0 dtype=torch.bfloat16 min=-11.9375 max=25.875
Dropout output=1 dtype=torch.bfloat16 min=-11.6875 max=25.875
Linear input=0 dtype=torch.bfloat16 min=-5.375 max=6.34375
Linear output=0 dtype=torch.bfloat16 min=-16.75 max=8.9375
Linear output=1 dtype=torch.bfloat16 min=-15.25 max=8.6875
Attention output=0 dtype=torch.bfloat16 min=-11.9375 max=25.875
Attention output=1 dtype=torch.bfloat16 min=-16.75 max=8.9375
LayerNorm input=0 dtype=torch.bfloat16 min=-223.0 max=28.25
LayerNorm output=0 dtype=torch.bfloat16 min=-37.0 max=5.90625
LayerNorm output=1 dtype=torch.bfloat16 min=-37.0 max=5.875
Linear input=0 dtype=torch.bfloat16 min=-3.25 max=2.421875
Linear output=0 dtype=torch.bfloat16 min=-5.46875 max=7.21875
Linear output=1 dtype=torch.bfloat16 min=-5.59375 max=7.09375
GELU input=0 dtype=torch.bfloat16 min=-3.25 max=2.421875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=7.21875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=7.09375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=7.21875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=7.21875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=7.09375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=7.21875
Linear output=0 dtype=torch.bfloat16 min=-5.84375 max=20.0
Linear output=1 dtype=torch.bfloat16 min=-5.90625 max=19.875
FeedForward input=0 dtype=torch.bfloat16 min=-3.25 max=2.421875
FeedForward output=0 dtype=torch.bfloat16 min=-5.84375 max=20.0
FeedForward output=1 dtype=torch.bfloat16 min=-5.90625 max=19.875
LayerNorm input=0 dtype=torch.bfloat16 min=-4864.0 max=792.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.0 max=16.875
LayerNorm output=1 dtype=torch.bfloat16 min=-36.0 max=17.25
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=5.90625
Linear output=0 dtype=torch.bfloat16 min=-12.8125 max=9.375
Linear output=1 dtype=torch.bfloat16 min=-12.8125 max=11.4375
GELU input=0 dtype=torch.bfloat16 min=-9.8125 max=5.90625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=11.4375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=11.4375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=11.4375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=11.4375
Linear output=0 dtype=torch.bfloat16 min=-21.5 max=36.75
Linear output=1 dtype=torch.bfloat16 min=-24.125 max=39.5
FeedForward input=0 dtype=torch.bfloat16 min=-9.8125 max=5.90625
FeedForward output=0 dtype=torch.bfloat16 min=-21.5 max=36.75
FeedForward output=1 dtype=torch.bfloat16 min=-24.125 max=39.5
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-5056.0 max=912.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-153.0 max=28.25
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-6.84375 max=6.15625
Linear output=1 dtype=torch.bfloat16 min=-6.84375 max=6.15625
LayerNorm input=0 dtype=torch.bfloat16 min=-153.0 max=28.25
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=10.125
LayerNorm output=1 dtype=torch.bfloat16 min=-33.25 max=10.25
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-153.0 max=28.25
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-16.25 max=16.125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-1.2734375 max=4.65625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.0 max=2.53125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.546875 max=2.65625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-6.84375 max=2.03125
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-12.375 max=14.25
Linear output=1 dtype=torch.bfloat16 min=-12.4375 max=14.3125
LayerNorm input=0 dtype=torch.bfloat16 min=-5056.0 max=912.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.0 max=18.25
LayerNorm output=1 dtype=torch.bfloat16 min=-36.0 max=18.75
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-5056.0 max=912.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-4.0 max=4.03125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-9.75 max=7.5625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.96875 max=1.0390625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.328125 max=2.078125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-12.4375 max=14.3125
Linear input=0 dtype=torch.bfloat16 min=-16.25 max=16.125
Linear output=0 dtype=torch.bfloat16 min=-17.125 max=18.0
Linear output=1 dtype=torch.bfloat16 min=-16.75 max=18.0
Linear input=0 dtype=torch.bfloat16 min=-16.25 max=16.125
Linear output=0 dtype=torch.bfloat16 min=-22.5 max=20.25
Linear output=1 dtype=torch.bfloat16 min=-22.25 max=20.125
Linear input=0 dtype=torch.bfloat16 min=-16.25 max=16.125
Linear output=0 dtype=torch.bfloat16 min=-12.8125 max=14.0
Linear output=1 dtype=torch.bfloat16 min=-12.9375 max=13.875
Linear input=0 dtype=torch.bfloat16 min=-4.0 max=4.03125
Linear output=0 dtype=torch.bfloat16 min=-4.9375 max=6.4375
Linear output=1 dtype=torch.bfloat16 min=-5.0 max=6.625
Linear input=0 dtype=torch.bfloat16 min=-4.0 max=4.03125
Linear output=0 dtype=torch.bfloat16 min=-5.5625 max=6.1875
Linear output=1 dtype=torch.bfloat16 min=-5.5625 max=6.4375
Linear input=0 dtype=torch.bfloat16 min=-4.0 max=4.03125
Linear output=0 dtype=torch.bfloat16 min=-6.59375 max=6.4375
Linear output=1 dtype=torch.bfloat16 min=-6.53125 max=6.625
Linear input=0 dtype=torch.bfloat16 min=-7.03125 max=6.90625
Linear output=0 dtype=torch.bfloat16 min=-29.75 max=17.125
Linear output=1 dtype=torch.bfloat16 min=-29.5 max=17.375
Dropout input=0 dtype=torch.bfloat16 min=-29.75 max=17.375
Dropout output=0 dtype=torch.bfloat16 min=-29.75 max=17.125
Dropout output=1 dtype=torch.bfloat16 min=-29.5 max=17.375
Linear input=0 dtype=torch.bfloat16 min=-7.84375 max=9.9375
Linear output=0 dtype=torch.bfloat16 min=-15.25 max=12.25
Linear output=1 dtype=torch.bfloat16 min=-15.75 max=12.625
Attention output=0 dtype=torch.bfloat16 min=-29.75 max=17.375
Attention output=1 dtype=torch.bfloat16 min=-15.75 max=12.625
LayerNorm input=0 dtype=torch.bfloat16 min=-229.0 max=28.25
LayerNorm output=0 dtype=torch.bfloat16 min=-37.25 max=5.21875
LayerNorm output=1 dtype=torch.bfloat16 min=-37.25 max=5.25
Linear input=0 dtype=torch.bfloat16 min=-2.6875 max=2.625
Linear output=0 dtype=torch.bfloat16 min=-9.1875 max=3.15625
Linear output=1 dtype=torch.bfloat16 min=-9.5 max=3.1875
GELU input=0 dtype=torch.bfloat16 min=-2.6875 max=2.625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.15625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.1875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.1875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.15625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.1875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.1875
Linear output=0 dtype=torch.bfloat16 min=-19.125 max=10.5625
Linear output=1 dtype=torch.bfloat16 min=-19.25 max=10.5
FeedForward input=0 dtype=torch.bfloat16 min=-2.6875 max=2.625
FeedForward output=0 dtype=torch.bfloat16 min=-19.125 max=10.5625
FeedForward output=1 dtype=torch.bfloat16 min=-19.25 max=10.5
LayerNorm input=0 dtype=torch.bfloat16 min=-5024.0 max=892.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.0 max=14.5
LayerNorm output=1 dtype=torch.bfloat16 min=-36.0 max=14.9375
Linear input=0 dtype=torch.bfloat16 min=-8.3125 max=3.59375
Linear output=0 dtype=torch.bfloat16 min=-12.9375 max=11.0625
Linear output=1 dtype=torch.bfloat16 min=-11.9375 max=11.0
GELU input=0 dtype=torch.bfloat16 min=-8.3125 max=3.59375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=11.0625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=11.0
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=11.0625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=11.0625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=11.0
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=11.0625
Linear output=0 dtype=torch.bfloat16 min=-61.75 max=50.0
Linear output=1 dtype=torch.bfloat16 min=-67.0 max=55.25
FeedForward input=0 dtype=torch.bfloat16 min=-8.3125 max=3.59375
FeedForward output=0 dtype=torch.bfloat16 min=-61.75 max=50.0
FeedForward output=1 dtype=torch.bfloat16 min=-67.0 max=55.25
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-5216.0 max=1056.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-166.0 max=30.625
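Two recurring patterns around the AdaLayerNormZero entries are worth decoding. The SiLU/Linear/LayerNorm lines immediately before each one are that module's own submodules (diffusers computes linear(silu(conditioning)) plus a parameter-free LayerNorm inside it), and the SiLU input stats are identical for every block (min=-23.25 max=7.09375), consistent with each block modulating off the same pooled timestep-plus-text embedding. The five output=0..4 lines then match AdaLayerNormZero's five return values: modulated hidden states, attention gate, and MLP shift, scale, and gate. A toy check, assuming the diffusers version behind this trace exposes the class as below (the 1536 width just mirrors SD3's hidden size; any size works):

    import torch
    from diffusers.models.normalization import AdaLayerNormZero

    # Five return values correspond to the output=0..4 lines above:
    #   x_modulated, gate_msa, shift_mlp, scale_mlp, gate_mlp
    norm = AdaLayerNormZero(embedding_dim=1536, num_embeddings=None).to(torch.bfloat16)
    x = torch.randn(2, 4, 1536, dtype=torch.bfloat16)  # (batch, seq, width), toy sizes
    emb = torch.randn(2, 1536, dtype=torch.bfloat16)   # raw conditioning; SiLU+Linear run inside
    outs = norm(x, emb=emb)
    print(len(outs))  # 5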
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-3.1875 max=9.8125
Linear output=1 dtype=torch.bfloat16 min=-3.1875 max=9.8125
LayerNorm input=0 dtype=torch.bfloat16 min=-166.0 max=30.625
LayerNorm output=0 dtype=torch.bfloat16 min=-34.0 max=10.375
LayerNorm output=1 dtype=torch.bfloat16 min=-33.75 max=10.6875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-166.0 max=30.625
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-15.0625 max=13.8125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.78125 max=5.65625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.140625 max=2.328125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.296875 max=2.5
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-3.09375 max=9.8125
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-21.0 max=17.375
Linear output=1 dtype=torch.bfloat16 min=-21.0 max=17.5
LayerNorm input=0 dtype=torch.bfloat16 min=-5216.0 max=1056.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=16.5
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=16.75
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-5216.0 max=1056.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-3.859375 max=4.1875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-10.6875 max=10.875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.359375 max=0.90234375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.859375 max=1.6484375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-21.0 max=17.5
Linear input=0 dtype=torch.bfloat16 min=-15.0625 max=13.8125
Linear output=0 dtype=torch.bfloat16 min=-20.5 max=17.875
Linear output=1 dtype=torch.bfloat16 min=-20.375 max=18.0
Linear input=0 dtype=torch.bfloat16 min=-15.0625 max=13.8125
Linear output=0 dtype=torch.bfloat16 min=-18.75 max=22.0
Linear output=1 dtype=torch.bfloat16 min=-18.625 max=21.875
Linear input=0 dtype=torch.bfloat16 min=-15.0625 max=13.8125
Linear output=0 dtype=torch.bfloat16 min=-17.625 max=16.5
Linear output=1 dtype=torch.bfloat16 min=-17.5 max=16.5
Linear input=0 dtype=torch.bfloat16 min=-3.859375 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-5.90625 max=5.71875
Linear output=1 dtype=torch.bfloat16 min=-6.28125 max=5.28125
Linear input=0 dtype=torch.bfloat16 min=-3.859375 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-5.8125 max=4.84375
Linear output=1 dtype=torch.bfloat16 min=-5.8125 max=5.5625
Linear input=0 dtype=torch.bfloat16 min=-3.859375 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-6.0625 max=5.59375
Linear output=1 dtype=torch.bfloat16 min=-6.125 max=5.25
Linear input=0 dtype=torch.bfloat16 min=-5.71875 max=5.625
Linear output=0 dtype=torch.bfloat16 min=-27.875 max=12.25
Linear output=1 dtype=torch.bfloat16 min=-27.875 max=11.9375
Dropout input=0 dtype=torch.bfloat16 min=-27.875 max=12.25
Dropout output=0 dtype=torch.bfloat16 min=-27.875 max=12.25
Dropout output=1 dtype=torch.bfloat16 min=-27.875 max=11.9375
Linear input=0 dtype=torch.bfloat16 min=-5.0625 max=4.8125
Linear output=0 dtype=torch.bfloat16 min=-15.8125 max=25.25
Linear output=1 dtype=torch.bfloat16 min=-22.5 max=21.125
Attention output=0 dtype=torch.bfloat16 min=-27.875 max=12.25
Attention output=1 dtype=torch.bfloat16 min=-22.5 max=25.25
LayerNorm input=0 dtype=torch.bfloat16 min=-264.0 max=40.5
LayerNorm output=0 dtype=torch.bfloat16 min=-37.25 max=6.03125
LayerNorm output=1 dtype=torch.bfloat16 min=-37.25 max=6.25
Linear input=0 dtype=torch.bfloat16 min=-2.875 max=2.328125
Linear output=0 dtype=torch.bfloat16 min=-7.53125 max=3.703125
Linear output=1 dtype=torch.bfloat16 min=-7.65625 max=4.0
GELU input=0 dtype=torch.bfloat16 min=-2.875 max=2.328125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.703125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.0
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.0
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.703125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.0
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.0
Linear output=0 dtype=torch.bfloat16 min=-10.9375 max=15.9375
Linear output=1 dtype=torch.bfloat16 min=-11.0625 max=15.625
FeedForward input=0 dtype=torch.bfloat16 min=-2.875 max=2.328125
FeedForward output=0 dtype=torch.bfloat16 min=-10.9375 max=15.9375
FeedForward output=1 dtype=torch.bfloat16 min=-11.0625 max=15.625
LayerNorm input=0 dtype=torch.bfloat16 min=-5152.0 max=1024.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=16.625
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=16.875
Linear input=0 dtype=torch.bfloat16 min=-9.125 max=3.171875
Linear output=0 dtype=torch.bfloat16 min=-12.875 max=9.9375
Linear output=1 dtype=torch.bfloat16 min=-13.75 max=9.1875
GELU input=0 dtype=torch.bfloat16 min=-9.125 max=3.171875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.9375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=9.1875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.9375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.9375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=9.1875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.9375
Linear output=0 dtype=torch.bfloat16 min=-47.0 max=42.25
Linear output=1 dtype=torch.bfloat16 min=-48.0 max=43.25
FeedForward input=0 dtype=torch.bfloat16 min=-9.125 max=3.171875
FeedForward output=0 dtype=torch.bfloat16 min=-47.0 max=42.25
FeedForward output=1 dtype=torch.bfloat16 min=-48.0 max=43.25
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-4928.0 max=1584.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-218.0 max=43.5
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-5.25 max=7.375
Linear output=1 dtype=torch.bfloat16 min=-5.28125 max=7.40625
LayerNorm input=0 dtype=torch.bfloat16 min=-218.0 max=43.5
LayerNorm output=0 dtype=torch.bfloat16 min=-34.5 max=12.75
LayerNorm output=1 dtype=torch.bfloat16 min=-34.75 max=12.625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-218.0 max=43.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-9.875 max=9.6875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-5.28125 max=1.6953125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.703125 max=2.046875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.1875 max=1.9453125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-2.25 max=7.40625
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-16.625 max=16.625
Linear output=1 dtype=torch.bfloat16 min=-16.75 max=16.625
LayerNorm input=0 dtype=torch.bfloat16 min=-4928.0 max=1584.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.5 max=18.75
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=18.875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-4928.0 max=1584.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-6.46875 max=4.9375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-12.625 max=12.75
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-3.078125 max=0.87890625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.96875 max=2.0
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-16.75 max=16.625
Linear input=0 dtype=torch.bfloat16 min=-9.875 max=9.6875
Linear output=0 dtype=torch.bfloat16 min=-11.9375 max=10.3125
Linear output=1 dtype=torch.bfloat16 min=-11.625 max=10.1875
Linear input=0 dtype=torch.bfloat16 min=-9.875 max=9.6875
Linear output=0 dtype=torch.bfloat16 min=-13.4375 max=14.1875
Linear output=1 dtype=torch.bfloat16 min=-13.125 max=13.6875
Linear input=0 dtype=torch.bfloat16 min=-9.875 max=9.6875
Linear output=0 dtype=torch.bfloat16 min=-8.375 max=7.0625
Linear output=1 dtype=torch.bfloat16 min=-8.5625 max=6.90625
Linear input=0 dtype=torch.bfloat16 min=-6.46875 max=4.9375
Linear output=0 dtype=torch.bfloat16 min=-6.90625 max=6.78125
Linear output=1 dtype=torch.bfloat16 min=-7.53125 max=7.15625
Linear input=0 dtype=torch.bfloat16 min=-6.46875 max=4.9375
Linear output=0 dtype=torch.bfloat16 min=-5.9375 max=5.8125
Linear output=1 dtype=torch.bfloat16 min=-6.0625 max=6.125
Linear input=0 dtype=torch.bfloat16 min=-6.46875 max=4.9375
Linear output=0 dtype=torch.bfloat16 min=-8.1875 max=7.75
Linear output=1 dtype=torch.bfloat16 min=-7.96875 max=7.71875
Linear input=0 dtype=torch.bfloat16 min=-5.84375 max=5.625
Linear output=0 dtype=torch.bfloat16 min=-7.84375 max=26.25
Linear output=1 dtype=torch.bfloat16 min=-7.78125 max=26.0
Dropout input=0 dtype=torch.bfloat16 min=-7.84375 max=26.25
Dropout output=0 dtype=torch.bfloat16 min=-7.84375 max=26.25
Dropout output=1 dtype=torch.bfloat16 min=-7.78125 max=26.0
Linear input=0 dtype=torch.bfloat16 min=-7.25 max=6.375
Linear output=0 dtype=torch.bfloat16 min=-22.75 max=25.5
Linear output=1 dtype=torch.bfloat16 min=-24.375 max=26.125
Attention output=0 dtype=torch.bfloat16 min=-7.84375 max=26.25
Attention output=1 dtype=torch.bfloat16 min=-24.375 max=26.125
LayerNorm input=0 dtype=torch.bfloat16 min=-276.0 max=42.0
LayerNorm output=0 dtype=torch.bfloat16 min=-37.0 max=6.84375
LayerNorm output=1 dtype=torch.bfloat16 min=-37.0 max=7.0
Linear input=0 dtype=torch.bfloat16 min=-3.3125 max=1.40625
Linear output=0 dtype=torch.bfloat16 min=-4.96875 max=3.0
Linear output=1 dtype=torch.bfloat16 min=-4.78125 max=3.0625
GELU input=0 dtype=torch.bfloat16 min=-3.3125 max=1.40625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.0
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.0625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.0625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.0
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.0625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.0625
Linear output=0 dtype=torch.bfloat16 min=-4.59375 max=19.25
Linear output=1 dtype=torch.bfloat16 min=-4.59375 max=19.125
FeedForward input=0 dtype=torch.bfloat16 min=-3.3125 max=1.40625
FeedForward output=0 dtype=torch.bfloat16 min=-4.59375 max=19.25
FeedForward output=1 dtype=torch.bfloat16 min=-4.59375 max=19.125
LayerNorm input=0 dtype=torch.bfloat16 min=-5024.0 max=1600.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=18.25
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=18.375
Linear input=0 dtype=torch.bfloat16 min=-10.0 max=5.1875
Linear output=0 dtype=torch.bfloat16 min=-16.75 max=15.0
Linear output=1 dtype=torch.bfloat16 min=-14.5 max=14.5
GELU input=0 dtype=torch.bfloat16 min=-10.0 max=5.1875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=15.0
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=14.5
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=15.0
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=15.0
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=14.5
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=15.0
Linear output=0 dtype=torch.bfloat16 min=-79.0 max=74.5
Linear output=1 dtype=torch.bfloat16 min=-76.5 max=71.0
FeedForward input=0 dtype=torch.bfloat16 min=-10.0 max=5.1875
FeedForward output=0 dtype=torch.bfloat16 min=-79.0 max=74.5
FeedForward output=1 dtype=torch.bfloat16 min=-76.5 max=71.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-5152.0 max=2880.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-242.0 max=47.75
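One constant repeats through every feed-forward trace in this section: a GELU output min of -0.169921875. In this trace, GELU is diffusers' wrapper module that owns the projection Linear (note its input stats duplicate the Linear input line just above it), and that value is simply the activation's global minimum of about -0.16997, attained near x = -0.75, rounded to bfloat16. Any feed-forward whose projected pre-activation dips below roughly -0.75 will therefore report exactly this min. A standalone check, independent of the gist's harness:

    import torch

    # GELU(x) = x * Phi(x) has a global minimum of about -0.16997 near x = -0.752.
    x = torch.linspace(-2.0, 0.0, 200001, dtype=torch.float64)
    print(torch.nn.functional.gelu(x).min().item())                      # ~ -0.16997
    print(torch.nn.functional.gelu(x, approximate="tanh").min().item())  # ~ -0.1700 as well
    # With 7 mantissa bits, bfloat16 rounds both to the value seen throughout the log:
    print(torch.tensor(-0.16997).to(torch.bfloat16).item())              # -0.169921875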
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-3.40625 max=6.625
Linear output=1 dtype=torch.bfloat16 min=-3.421875 max=6.625
LayerNorm input=0 dtype=torch.bfloat16 min=-242.0 max=47.75
LayerNorm output=0 dtype=torch.bfloat16 min=-35.0 max=13.5625
LayerNorm output=1 dtype=torch.bfloat16 min=-35.25 max=13.6875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-242.0 max=47.75
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-12.0 max=11.625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-1.7578125 max=4.9375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.6328125 max=1.71875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.109375 max=1.265625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-3.421875 max=6.625
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-20.625 max=19.625
Linear output=1 dtype=torch.bfloat16 min=-20.625 max=19.75
LayerNorm input=0 dtype=torch.bfloat16 min=-5152.0 max=2880.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.0 max=21.625
LayerNorm output=1 dtype=torch.bfloat16 min=-35.0 max=21.5
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-5152.0 max=2880.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-5.46875 max=4.5625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-11.0 max=15.25
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.92578125 max=0.69140625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-4.6875 max=2.25
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-20.625 max=19.75
Linear input=0 dtype=torch.bfloat16 min=-12.0 max=11.625
Linear output=0 dtype=torch.bfloat16 min=-15.625 max=13.0
Linear output=1 dtype=torch.bfloat16 min=-15.5625 max=13.0
Linear input=0 dtype=torch.bfloat16 min=-12.0 max=11.625
Linear output=0 dtype=torch.bfloat16 min=-20.375 max=20.25
Linear output=1 dtype=torch.bfloat16 min=-20.25 max=19.75
Linear input=0 dtype=torch.bfloat16 min=-12.0 max=11.625
Linear output=0 dtype=torch.bfloat16 min=-8.1875 max=8.0
Linear output=1 dtype=torch.bfloat16 min=-7.8125 max=8.0
Linear input=0 dtype=torch.bfloat16 min=-5.46875 max=4.5625
Linear output=0 dtype=torch.bfloat16 min=-6.4375 max=7.65625
Linear output=1 dtype=torch.bfloat16 min=-7.09375 max=7.78125
Linear input=0 dtype=torch.bfloat16 min=-5.46875 max=4.5625
Linear output=0 dtype=torch.bfloat16 min=-6.5625 max=5.125
Linear output=1 dtype=torch.bfloat16 min=-7.25 max=5.125
Linear input=0 dtype=torch.bfloat16 min=-5.46875 max=4.5625
Linear output=0 dtype=torch.bfloat16 min=-10.1875 max=8.6875
Linear output=1 dtype=torch.bfloat16 min=-10.8125 max=8.625
Linear input=0 dtype=torch.bfloat16 min=-7.5625 max=5.25
Linear output=0 dtype=torch.bfloat16 min=-29.5 max=6.25
Linear output=1 dtype=torch.bfloat16 min=-29.375 max=6.03125
Dropout input=0 dtype=torch.bfloat16 min=-29.5 max=6.25
Dropout output=0 dtype=torch.bfloat16 min=-29.5 max=6.25
Dropout output=1 dtype=torch.bfloat16 min=-29.375 max=6.03125
Linear input=0 dtype=torch.bfloat16 min=-10.75 max=7.5625
Linear output=0 dtype=torch.bfloat16 min=-29.5 max=12.625
Linear output=1 dtype=torch.bfloat16 min=-31.125 max=13.125
Attention output=0 dtype=torch.bfloat16 min=-29.5 max=6.25
Attention output=1 dtype=torch.bfloat16 min=-31.125 max=13.125
LayerNorm input=0 dtype=torch.bfloat16 min=-328.0 max=43.75
LayerNorm output=0 dtype=torch.bfloat16 min=-37.0 max=7.21875
LayerNorm output=1 dtype=torch.bfloat16 min=-37.0 max=7.3125
Linear input=0 dtype=torch.bfloat16 min=-4.71875 max=1.515625
Linear output=0 dtype=torch.bfloat16 min=-4.84375 max=3.328125
Linear output=1 dtype=torch.bfloat16 min=-4.9375 max=3.28125
GELU input=0 dtype=torch.bfloat16 min=-4.71875 max=1.515625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.328125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.28125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.328125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.328125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.28125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.328125
Linear output=0 dtype=torch.bfloat16 min=-7.875 max=24.75
Linear output=1 dtype=torch.bfloat16 min=-7.90625 max=24.625
FeedForward input=0 dtype=torch.bfloat16 min=-4.71875 max=1.515625
FeedForward output=0 dtype=torch.bfloat16 min=-7.875 max=24.75
FeedForward output=1 dtype=torch.bfloat16 min=-7.90625 max=24.625
LayerNorm input=0 dtype=torch.bfloat16 min=-5120.0 max=2848.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.0 max=21.375
LayerNorm output=1 dtype=torch.bfloat16 min=-35.0 max=21.375
Linear input=0 dtype=torch.bfloat16 min=-9.75 max=7.8125
Linear output=0 dtype=torch.bfloat16 min=-9.375 max=9.3125
Linear output=1 dtype=torch.bfloat16 min=-10.5625 max=10.0
GELU input=0 dtype=torch.bfloat16 min=-9.75 max=7.8125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.3125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.0
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.0
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.3125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.0
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.0
Linear output=0 dtype=torch.bfloat16 min=-73.5 max=83.5
Linear output=1 dtype=torch.bfloat16 min=-73.0 max=81.0
FeedForward input=0 dtype=torch.bfloat16 min=-9.75 max=7.8125
FeedForward output=0 dtype=torch.bfloat16 min=-73.5 max=83.5
FeedForward output=1 dtype=torch.bfloat16 min=-73.0 max=81.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-5120.0 max=3984.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-191.0 max=45.5
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-6.6875 max=4.84375
Linear output=1 dtype=torch.bfloat16 min=-6.71875 max=4.875
LayerNorm input=0 dtype=torch.bfloat16 min=-191.0 max=45.5
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=15.625
LayerNorm output=1 dtype=torch.bfloat16 min=-33.5 max=15.8125
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-191.0 max=45.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-9.8125 max=10.375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.1875 max=4.8125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.40625 max=2.4375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.953125 max=0.71875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-6.71875 max=4.53125
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-17.0 max=16.875
Linear output=1 dtype=torch.bfloat16 min=-17.125 max=17.0
LayerNorm input=0 dtype=torch.bfloat16 min=-5120.0 max=3984.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.25 max=22.875
LayerNorm output=1 dtype=torch.bfloat16 min=-34.25 max=22.625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-5120.0 max=3984.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-11.4375 max=5.78125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-12.875 max=17.0
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.1171875 max=0.8828125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.265625 max=3.09375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-17.125 max=16.75
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=10.375
Linear output=0 dtype=torch.bfloat16 min=-15.5625 max=17.0
Linear output=1 dtype=torch.bfloat16 min=-15.875 max=16.75
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=10.375
Linear output=0 dtype=torch.bfloat16 min=-18.375 max=19.25
Linear output=1 dtype=torch.bfloat16 min=-18.5 max=19.25
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=10.375
Linear output=0 dtype=torch.bfloat16 min=-7.59375 max=8.75
Linear output=1 dtype=torch.bfloat16 min=-7.59375 max=8.8125
Linear input=0 dtype=torch.bfloat16 min=-11.4375 max=5.78125
Linear output=0 dtype=torch.bfloat16 min=-6.9375 max=6.625
Linear output=1 dtype=torch.bfloat16 min=-7.34375 max=7.65625
Linear input=0 dtype=torch.bfloat16 min=-11.4375 max=5.78125
Linear output=0 dtype=torch.bfloat16 min=-6.21875 max=6.4375
Linear output=1 dtype=torch.bfloat16 min=-6.21875 max=6.65625
Linear input=0 dtype=torch.bfloat16 min=-11.4375 max=5.78125
Linear output=0 dtype=torch.bfloat16 min=-8.25 max=8.375
Linear output=1 dtype=torch.bfloat16 min=-8.25 max=9.625
Linear input=0 dtype=torch.bfloat16 min=-5.46875 max=4.46875
Linear output=0 dtype=torch.bfloat16 min=-36.0 max=5.875
Linear output=1 dtype=torch.bfloat16 min=-36.75 max=5.90625
Dropout input=0 dtype=torch.bfloat16 min=-36.75 max=5.90625
Dropout output=0 dtype=torch.bfloat16 min=-36.0 max=5.875
Dropout output=1 dtype=torch.bfloat16 min=-36.75 max=5.90625
Linear input=0 dtype=torch.bfloat16 min=-6.46875 max=9.5
Linear output=0 dtype=torch.bfloat16 min=-45.5 max=39.5
Linear output=1 dtype=torch.bfloat16 min=-47.75 max=42.0
Attention output=0 dtype=torch.bfloat16 min=-36.75 max=5.90625
Attention output=1 dtype=torch.bfloat16 min=-47.75 max=42.0
LayerNorm input=0 dtype=torch.bfloat16 min=-296.0 max=47.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.75 max=7.4375
LayerNorm output=1 dtype=torch.bfloat16 min=-37.0 max=7.53125
Linear input=0 dtype=torch.bfloat16 min=-5.4375 max=1.6015625
Linear output=0 dtype=torch.bfloat16 min=-3.765625 max=3.1875
Linear output=1 dtype=torch.bfloat16 min=-3.71875 max=3.234375
GELU input=0 dtype=torch.bfloat16 min=-5.4375 max=1.6015625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.1875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.234375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.234375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.1875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.234375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.234375
Linear output=0 dtype=torch.bfloat16 min=-24.375 max=11.625
Linear output=1 dtype=torch.bfloat16 min=-24.375 max=11.5
FeedForward input=0 dtype=torch.bfloat16 min=-5.4375 max=1.6015625
FeedForward output=0 dtype=torch.bfloat16 min=-24.375 max=11.625
FeedForward output=1 dtype=torch.bfloat16 min=-24.375 max=11.5
LayerNorm input=0 dtype=torch.bfloat16 min=-5344.0 max=3792.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.5 max=21.0
LayerNorm output=1 dtype=torch.bfloat16 min=-34.5 max=21.125
Linear input=0 dtype=torch.bfloat16 min=-11.5 max=7.625
Linear output=0 dtype=torch.bfloat16 min=-11.25 max=9.5
Linear output=1 dtype=torch.bfloat16 min=-12.5 max=7.6875
GELU input=0 dtype=torch.bfloat16 min=-11.5 max=7.625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.5
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=7.6875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.5
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.5
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=7.6875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.5
Linear output=0 dtype=torch.bfloat16 min=-66.0 max=42.0
Linear output=1 dtype=torch.bfloat16 min=-52.25 max=41.5
FeedForward input=0 dtype=torch.bfloat16 min=-11.5 max=7.625
FeedForward output=0 dtype=torch.bfloat16 min=-66.0 max=42.0
FeedForward output=1 dtype=torch.bfloat16 min=-52.25 max=41.5
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-6048.0 max=4160.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-158.0 max=53.75
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-6.5 max=12.25
Linear output=1 dtype=torch.bfloat16 min=-6.5625 max=12.25
LayerNorm input=0 dtype=torch.bfloat16 min=-158.0 max=53.75
LayerNorm output=0 dtype=torch.bfloat16 min=-30.5 max=16.75
LayerNorm output=1 dtype=torch.bfloat16 min=-30.875 max=16.875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-158.0 max=53.75
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-21.75 max=20.625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-5.4375 max=6.0625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.296875 max=3.34375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.3125 max=1.59375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-6.5625 max=12.25
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-21.25 max=18.875
Linear output=1 dtype=torch.bfloat16 min=-21.25 max=18.875
LayerNorm input=0 dtype=torch.bfloat16 min=-6048.0 max=4160.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=24.375
LayerNorm output=1 dtype=torch.bfloat16 min=-33.0 max=24.125
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-6048.0 max=4160.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-5.28125 max=4.71875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-13.875 max=17.25
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.109375 max=1.1171875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.75 max=3.125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-21.25 max=18.875
Linear input=0 dtype=torch.bfloat16 min=-21.75 max=20.625
Linear output=0 dtype=torch.bfloat16 min=-23.875 max=26.125
Linear output=1 dtype=torch.bfloat16 min=-23.875 max=25.875
Linear input=0 dtype=torch.bfloat16 min=-21.75 max=20.625
Linear output=0 dtype=torch.bfloat16 min=-34.0 max=33.75
Linear output=1 dtype=torch.bfloat16 min=-33.25 max=33.5
Linear input=0 dtype=torch.bfloat16 min=-21.75 max=20.625
Linear output=0 dtype=torch.bfloat16 min=-13.3125 max=10.8125
Linear output=1 dtype=torch.bfloat16 min=-13.3125 max=10.8125
Linear input=0 dtype=torch.bfloat16 min=-5.28125 max=4.71875
Linear output=0 dtype=torch.bfloat16 min=-7.34375 max=6.96875
Linear output=1 dtype=torch.bfloat16 min=-6.96875 max=6.6875
Linear input=0 dtype=torch.bfloat16 min=-5.28125 max=4.71875
Linear output=0 dtype=torch.bfloat16 min=-7.25 max=6.4375
Linear output=1 dtype=torch.bfloat16 min=-7.0625 max=6.40625
Linear input=0 dtype=torch.bfloat16 min=-5.28125 max=4.71875
Linear output=0 dtype=torch.bfloat16 min=-6.96875 max=12.25
Linear output=1 dtype=torch.bfloat16 min=-7.4375 max=13.375
Linear input=0 dtype=torch.bfloat16 min=-7.9375 max=9.1875
Linear output=0 dtype=torch.bfloat16 min=-47.0 max=13.6875
Linear output=1 dtype=torch.bfloat16 min=-46.0 max=13.4375
Dropout input=0 dtype=torch.bfloat16 min=-47.0 max=13.6875
Dropout output=0 dtype=torch.bfloat16 min=-47.0 max=13.6875
Dropout output=1 dtype=torch.bfloat16 min=-46.0 max=13.4375
Linear input=0 dtype=torch.bfloat16 min=-6.6875 max=12.875
Linear output=0 dtype=torch.bfloat16 min=-55.0 max=9.0625
Linear output=1 dtype=torch.bfloat16 min=-55.75 max=9.1875
Attention output=0 dtype=torch.bfloat16 min=-47.0 max=13.6875
Attention output=1 dtype=torch.bfloat16 min=-55.75 max=9.1875
LayerNorm input=0 dtype=torch.bfloat16 min=-388.0 max=63.5
LayerNorm output=0 dtype=torch.bfloat16 min=-37.25 max=6.875
LayerNorm output=1 dtype=torch.bfloat16 min=-37.5 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-2.75 max=1.46875
Linear output=0 dtype=torch.bfloat16 min=-5.15625 max=3.046875
Linear output=1 dtype=torch.bfloat16 min=-5.28125 max=3.078125
GELU input=0 dtype=torch.bfloat16 min=-2.75 max=1.46875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.046875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.078125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.078125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.046875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.078125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.078125
Linear output=0 dtype=torch.bfloat16 min=-11.625 max=12.375
Linear output=1 dtype=torch.bfloat16 min=-11.8125 max=12.5
FeedForward input=0 dtype=torch.bfloat16 min=-2.75 max=1.46875
FeedForward output=0 dtype=torch.bfloat16 min=-11.625 max=12.375
FeedForward output=1 dtype=torch.bfloat16 min=-11.8125 max=12.5
LayerNorm input=0 dtype=torch.bfloat16 min=-6080.0 max=4096.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.25 max=22.875
LayerNorm output=1 dtype=torch.bfloat16 min=-33.25 max=22.625
Linear input=0 dtype=torch.bfloat16 min=-9.625 max=5.25
Linear output=0 dtype=torch.bfloat16 min=-16.5 max=8.1875
Linear output=1 dtype=torch.bfloat16 min=-17.25 max=8.875
GELU input=0 dtype=torch.bfloat16 min=-9.625 max=5.25
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.1875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=8.875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=8.875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.1875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=8.875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=8.875
Linear output=0 dtype=torch.bfloat16 min=-23.375 max=26.5
Linear output=1 dtype=torch.bfloat16 min=-23.625 max=26.0
FeedForward input=0 dtype=torch.bfloat16 min=-9.625 max=5.25
FeedForward output=0 dtype=torch.bfloat16 min=-23.375 max=26.5
FeedForward output=1 dtype=torch.bfloat16 min=-23.625 max=26.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-6368.0 max=4224.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-290.0 max=82.5
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-9.75 max=6.59375
Linear output=1 dtype=torch.bfloat16 min=-9.75 max=6.59375
LayerNorm input=0 dtype=torch.bfloat16 min=-290.0 max=82.5
LayerNorm output=0 dtype=torch.bfloat16 min=-35.25 max=12.8125
LayerNorm output=1 dtype=torch.bfloat16 min=-35.0 max=12.8125
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-290.0 max=82.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-13.875 max=13.0
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.21875 max=4.53125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.375 max=2.734375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.8828125 max=2.734375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-9.75 max=6.59375
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-26.375 max=23.5
Linear output=1 dtype=torch.bfloat16 min=-26.5 max=23.5
LayerNorm input=0 dtype=torch.bfloat16 min=-6368.0 max=4224.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.75 max=23.75
LayerNorm output=1 dtype=torch.bfloat16 min=-32.75 max=23.5
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-6368.0 max=4224.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-8.875 max=4.1875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-15.9375 max=11.4375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.98046875 max=0.9140625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.5 max=4.1875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-26.5 max=23.5
Linear input=0 dtype=torch.bfloat16 min=-13.875 max=13.0
Linear output=0 dtype=torch.bfloat16 min=-9.75 max=7.34375
Linear output=1 dtype=torch.bfloat16 min=-9.75 max=7.375
Linear input=0 dtype=torch.bfloat16 min=-13.875 max=13.0
Linear output=0 dtype=torch.bfloat16 min=-10.375 max=9.625
Linear output=1 dtype=torch.bfloat16 min=-10.375 max=9.75
Linear input=0 dtype=torch.bfloat16 min=-13.875 max=13.0
Linear output=0 dtype=torch.bfloat16 min=-9.8125 max=12.9375
Linear output=1 dtype=torch.bfloat16 min=-9.75 max=12.875
Linear input=0 dtype=torch.bfloat16 min=-8.875 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-7.65625 max=10.625
Linear output=1 dtype=torch.bfloat16 min=-7.75 max=11.625
Linear input=0 dtype=torch.bfloat16 min=-8.875 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-7.53125 max=7.15625
Linear output=1 dtype=torch.bfloat16 min=-7.59375 max=7.1875
Linear input=0 dtype=torch.bfloat16 min=-8.875 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-10.75 max=11.0
Linear output=1 dtype=torch.bfloat16 min=-10.8125 max=11.0
Linear input=0 dtype=torch.bfloat16 min=-7.125 max=9.6875
Linear output=0 dtype=torch.bfloat16 min=-42.5 max=14.25
Linear output=1 dtype=torch.bfloat16 min=-42.5 max=14.125
Dropout input=0 dtype=torch.bfloat16 min=-42.5 max=14.25
Dropout output=0 dtype=torch.bfloat16 min=-42.5 max=14.25
Dropout output=1 dtype=torch.bfloat16 min=-42.5 max=14.125
Linear input=0 dtype=torch.bfloat16 min=-10.625 max=9.6875
Linear output=0 dtype=torch.bfloat16 min=-30.0 max=50.0
Linear output=1 dtype=torch.bfloat16 min=-34.5 max=49.0
Attention output=0 dtype=torch.bfloat16 min=-42.5 max=14.25
Attention output=1 dtype=torch.bfloat16 min=-34.5 max=50.0
LayerNorm input=0 dtype=torch.bfloat16 min=-418.0 max=79.5
LayerNorm output=0 dtype=torch.bfloat16 min=-37.5 max=8.3125
LayerNorm output=1 dtype=torch.bfloat16 min=-37.5 max=8.3125
Linear input=0 dtype=torch.bfloat16 min=-3.140625 max=2.140625
Linear output=0 dtype=torch.bfloat16 min=-4.15625 max=3.328125
Linear output=1 dtype=torch.bfloat16 min=-4.1875 max=3.375
GELU input=0 dtype=torch.bfloat16 min=-3.140625 max=2.140625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.328125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.328125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.375
Linear output=0 dtype=torch.bfloat16 min=-16.75 max=4.0625
Linear output=1 dtype=torch.bfloat16 min=-16.875 max=4.0625
FeedForward input=0 dtype=torch.bfloat16 min=-3.140625 max=2.140625
FeedForward output=0 dtype=torch.bfloat16 min=-16.75 max=4.0625
FeedForward output=1 dtype=torch.bfloat16 min=-16.875 max=4.0625
LayerNorm input=0 dtype=torch.bfloat16 min=-6464.0 max=4128.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=22.625
LayerNorm output=1 dtype=torch.bfloat16 min=-33.0 max=22.375
Linear input=0 dtype=torch.bfloat16 min=-9.4375 max=13.0
Linear output=0 dtype=torch.bfloat16 min=-26.25 max=28.125
Linear output=1 dtype=torch.bfloat16 min=-49.5 max=24.5
GELU input=0 dtype=torch.bfloat16 min=-9.4375 max=13.0
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=28.125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=24.5
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=28.125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=28.125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=24.5
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=28.125
Linear output=0 dtype=torch.bfloat16 min=-50.5 max=52.75
Linear output=1 dtype=torch.bfloat16 min=-50.0 max=50.0
FeedForward input=0 dtype=torch.bfloat16 min=-9.4375 max=13.0
FeedForward output=0 dtype=torch.bfloat16 min=-50.5 max=52.75
FeedForward output=1 dtype=torch.bfloat16 min=-50.0 max=50.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-6624.0 max=4672.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-332.0 max=84.0
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-8.9375 max=7.875
Linear output=1 dtype=torch.bfloat16 min=-9.0 max=7.875
LayerNorm input=0 dtype=torch.bfloat16 min=-332.0 max=84.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.5 max=13.0625
LayerNorm output=1 dtype=torch.bfloat16 min=-35.25 max=12.9375
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-332.0 max=84.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-12.625 max=15.8125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.25 max=4.75
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.84375 max=2.125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.9296875 max=2.453125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-9.0 max=7.875
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-23.625 max=17.5
Linear output=1 dtype=torch.bfloat16 min=-23.75 max=17.625
LayerNorm input=0 dtype=torch.bfloat16 min=-6624.0 max=4672.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.25 max=24.5
LayerNorm output=1 dtype=torch.bfloat16 min=-32.5 max=24.25
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-6624.0 max=4672.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-7.875 max=8.375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-23.75 max=17.625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.0625 max=1.9140625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-6.15625 max=11.375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-15.5 max=16.25
Linear input=0 dtype=torch.bfloat16 min=-12.625 max=15.8125
Linear output=0 dtype=torch.bfloat16 min=-22.5 max=20.0
Linear output=1 dtype=torch.bfloat16 min=-22.5 max=19.5
Linear input=0 dtype=torch.bfloat16 min=-12.625 max=15.8125
Linear output=0 dtype=torch.bfloat16 min=-33.0 max=35.0
Linear output=1 dtype=torch.bfloat16 min=-32.5 max=34.75
Linear input=0 dtype=torch.bfloat16 min=-12.625 max=15.8125
Linear output=0 dtype=torch.bfloat16 min=-12.8125 max=12.25
Linear output=1 dtype=torch.bfloat16 min=-12.625 max=12.0625
Linear input=0 dtype=torch.bfloat16 min=-7.875 max=8.375
Linear output=0 dtype=torch.bfloat16 min=-6.90625 max=7.46875
Linear output=1 dtype=torch.bfloat16 min=-6.875 max=7.4375
Linear input=0 dtype=torch.bfloat16 min=-7.875 max=8.375
Linear output=0 dtype=torch.bfloat16 min=-12.1875 max=16.875
Linear output=1 dtype=torch.bfloat16 min=-12.25 max=16.75
Linear input=0 dtype=torch.bfloat16 min=-7.875 max=8.375
Linear output=0 dtype=torch.bfloat16 min=-14.375 max=13.6875
Linear output=1 dtype=torch.bfloat16 min=-13.6875 max=12.9375
Linear input=0 dtype=torch.bfloat16 min=-8.5625 max=7.96875
Linear output=0 dtype=torch.bfloat16 min=-35.75 max=15.75
Linear output=1 dtype=torch.bfloat16 min=-35.25 max=15.6875
Dropout input=0 dtype=torch.bfloat16 min=-35.75 max=15.75
Dropout output=0 dtype=torch.bfloat16 min=-35.75 max=15.75
Dropout output=1 dtype=torch.bfloat16 min=-35.25 max=15.6875
Linear input=0 dtype=torch.bfloat16 min=-7.8125 max=6.78125
Linear output=0 dtype=torch.bfloat16 min=-56.5 max=59.25
Linear output=1 dtype=torch.bfloat16 min=-56.75 max=59.5
Attention output=0 dtype=torch.bfloat16 min=-35.75 max=15.75
Attention output=1 dtype=torch.bfloat16 min=-56.75 max=59.5
LayerNorm input=0 dtype=torch.bfloat16 min=-416.0 max=85.5
LayerNorm output=0 dtype=torch.bfloat16 min=-37.25 max=8.8125
LayerNorm output=1 dtype=torch.bfloat16 min=-37.25 max=8.875
Linear input=0 dtype=torch.bfloat16 min=-2.9375 max=2.125
Linear output=0 dtype=torch.bfloat16 min=-3.75 max=3.390625
Linear output=1 dtype=torch.bfloat16 min=-3.9375 max=3.296875
GELU input=0 dtype=torch.bfloat16 min=-2.9375 max=2.125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.390625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.296875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.390625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.390625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.296875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.390625
Linear output=0 dtype=torch.bfloat16 min=-18.0 max=5.0
Linear output=1 dtype=torch.bfloat16 min=-18.125 max=5.03125
FeedForward input=0 dtype=torch.bfloat16 min=-2.9375 max=2.125
FeedForward output=0 dtype=torch.bfloat16 min=-18.0 max=5.0
FeedForward output=1 dtype=torch.bfloat16 min=-18.125 max=5.03125
LayerNorm input=0 dtype=torch.bfloat16 min=-6624.0 max=5024.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.25 max=26.0
LayerNorm output=1 dtype=torch.bfloat16 min=-32.25 max=25.75
Linear input=0 dtype=torch.bfloat16 min=-18.375 max=11.625
Linear output=0 dtype=torch.bfloat16 min=-58.75 max=194.0
Linear output=1 dtype=torch.bfloat16 min=-49.25 max=189.0
GELU input=0 dtype=torch.bfloat16 min=-18.375 max=11.625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=194.0
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=189.0
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=194.0
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=194.0
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=189.0
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=194.0
Linear output=0 dtype=torch.bfloat16 min=-392.0 max=1376.0
Linear output=1 dtype=torch.bfloat16 min=-380.0 max=1344.0
FeedForward input=0 dtype=torch.bfloat16 min=-18.375 max=11.625
FeedForward output=0 dtype=torch.bfloat16 min=-392.0 max=1376.0
FeedForward output=1 dtype=torch.bfloat16 min=-380.0 max=1344.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-21632.0 max=9472.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-326.0 max=90.5
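The block above is where the residual stream blows up: JointTransformerBlock output=0 jumps from about -6600/+4700 in the preceding block to min=-21632 max=9472 here, with the second FeedForward alone peaking at 1376. At these magnitudes bfloat16 is coarse, which is why all of the large stats in the trace land on round numbers: with 7 mantissa bits, neighbouring representable values are 128 apart in [16384, 32768) and 64 apart in [8192, 16384). A quick illustration, not taken from the gist:

    import torch

    # Machine epsilon for bfloat16 is 2**-7, so the grid spacing near a value
    # of magnitude ~2**14 (like the -21632 above) is 2**14 * 2**-7 = 128.
    print(torch.finfo(torch.bfloat16).eps)                         # 0.0078125
    print(torch.tensor(21632.0 + 50.0).to(torch.bfloat16).item())  # 21632.0, rounds back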
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-8.125 max=5.59375
Linear output=1 dtype=torch.bfloat16 min=-8.125 max=5.625
LayerNorm input=0 dtype=torch.bfloat16 min=-326.0 max=90.5
LayerNorm output=0 dtype=torch.bfloat16 min=-35.25 max=13.5
LayerNorm output=1 dtype=torch.bfloat16 min=-35.25 max=13.625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-326.0 max=90.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-11.875 max=11.0
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.09375 max=4.6875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.125 max=1.625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.21875 max=2.0
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-8.125 max=5.625
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-17.5 max=21.25
Linear output=1 dtype=torch.bfloat16 min=-17.5 max=21.375
LayerNorm input=0 dtype=torch.bfloat16 min=-21632.0 max=9472.0
LayerNorm output=0 dtype=torch.bfloat16 min=-31.375 max=30.125
LayerNorm output=1 dtype=torch.bfloat16 min=-31.5 max=29.875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-21632.0 max=9472.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-6.0 max=4.15625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-17.5 max=21.375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.078125 max=2.234375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-5.15625 max=7.09375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-4.28125 max=6.8125
Linear input=0 dtype=torch.bfloat16 min=-11.875 max=11.0
Linear output=0 dtype=torch.bfloat16 min=-10.375 max=9.75
Linear output=1 dtype=torch.bfloat16 min=-10.375 max=9.5625
Linear input=0 dtype=torch.bfloat16 min=-11.875 max=11.0
Linear output=0 dtype=torch.bfloat16 min=-12.875 max=12.0
Linear output=1 dtype=torch.bfloat16 min=-12.5625 max=12.1875
Linear input=0 dtype=torch.bfloat16 min=-11.875 max=11.0
Linear output=0 dtype=torch.bfloat16 min=-8.875 max=7.78125
Linear output=1 dtype=torch.bfloat16 min=-8.6875 max=8.1875
Linear input=0 dtype=torch.bfloat16 min=-6.0 max=4.15625
Linear output=0 dtype=torch.bfloat16 min=-8.875 max=9.3125
Linear output=1 dtype=torch.bfloat16 min=-9.0625 max=9.0625
Linear input=0 dtype=torch.bfloat16 min=-6.0 max=4.15625
Linear output=0 dtype=torch.bfloat16 min=-9.375 max=9.75
Linear output=1 dtype=torch.bfloat16 min=-9.375 max=9.6875
Linear input=0 dtype=torch.bfloat16 min=-6.0 max=4.15625
Linear output=0 dtype=torch.bfloat16 min=-11.4375 max=9.375
Linear output=1 dtype=torch.bfloat16 min=-11.4375 max=9.4375
Linear input=0 dtype=torch.bfloat16 min=-7.0 max=6.71875
Linear output=0 dtype=torch.bfloat16 min=-42.25 max=9.1875
Linear output=1 dtype=torch.bfloat16 min=-43.0 max=10.875
Dropout input=0 dtype=torch.bfloat16 min=-43.0 max=10.875
Dropout output=0 dtype=torch.bfloat16 min=-42.25 max=9.1875
Dropout output=1 dtype=torch.bfloat16 min=-43.0 max=10.875
Linear input=0 dtype=torch.bfloat16 min=-6.5625 max=5.5
Linear output=0 dtype=torch.bfloat16 min=-70.5 max=16.125
Linear output=1 dtype=torch.bfloat16 min=-70.5 max=15.75
Attention output=0 dtype=torch.bfloat16 min=-43.0 max=10.875
Attention output=1 dtype=torch.bfloat16 min=-70.5 max=16.125
LayerNorm input=0 dtype=torch.bfloat16 min=-442.0 max=90.5
LayerNorm output=0 dtype=torch.bfloat16 min=-37.25 max=8.5
LayerNorm output=1 dtype=torch.bfloat16 min=-37.25 max=8.5625
Linear input=0 dtype=torch.bfloat16 min=-2.875 max=1.296875
Linear output=0 dtype=torch.bfloat16 min=-3.703125 max=2.5625
Linear output=1 dtype=torch.bfloat16 min=-3.625 max=2.609375
GELU input=0 dtype=torch.bfloat16 min=-2.875 max=1.296875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.546875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=2.59375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.59375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.546875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=2.59375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.59375
Linear output=0 dtype=torch.bfloat16 min=-25.625 max=4.40625
Linear output=1 dtype=torch.bfloat16 min=-25.625 max=4.375
FeedForward input=0 dtype=torch.bfloat16 min=-2.875 max=1.296875
FeedForward output=0 dtype=torch.bfloat16 min=-25.625 max=4.40625
FeedForward output=1 dtype=torch.bfloat16 min=-25.625 max=4.375
LayerNorm input=0 dtype=torch.bfloat16 min=-23168.0 max=9408.0
LayerNorm output=0 dtype=torch.bfloat16 min=-31.375 max=30.0
LayerNorm output=1 dtype=torch.bfloat16 min=-31.5 max=29.75
Linear input=0 dtype=torch.bfloat16 min=-47.75 max=46.75
Linear output=0 dtype=torch.bfloat16 min=-19.25 max=16.5
Linear output=1 dtype=torch.bfloat16 min=-19.625 max=16.625
GELU input=0 dtype=torch.bfloat16 min=-47.75 max=46.75
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=16.5
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=16.625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=16.625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=16.5
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=16.625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=16.625
Linear output=0 dtype=torch.bfloat16 min=-676.0 max=1816.0
Linear output=1 dtype=torch.bfloat16 min=-672.0 max=1832.0
FeedForward input=0 dtype=torch.bfloat16 min=-47.75 max=46.75
FeedForward output=0 dtype=torch.bfloat16 min=-676.0 max=1816.0
FeedForward output=1 dtype=torch.bfloat16 min=-672.0 max=1832.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-27776.0 max=21888.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-324.0 max=97.0
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-6.34375 max=7.5
Linear output=1 dtype=torch.bfloat16 min=-6.34375 max=7.53125
LayerNorm input=0 dtype=torch.bfloat16 min=-324.0 max=97.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.0 max=15.25
LayerNorm output=1 dtype=torch.bfloat16 min=-35.25 max=15.25
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-324.0 max=97.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-11.5 max=10.5
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.09375 max=2.015625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.7734375 max=1.8984375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.125 max=1.859375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-6.34375 max=7.53125
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-23.0 max=28.5
Linear output=1 dtype=torch.bfloat16 min=-23.125 max=28.5
LayerNorm input=0 dtype=torch.bfloat16 min=-27776.0 max=21888.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.25 max=35.0
LayerNorm output=1 dtype=torch.bfloat16 min=-32.25 max=35.0
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-27776.0 max=21888.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-6.0 max=5.75
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-23.125 max=21.25
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.94140625 max=2.140625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-4.0 max=12.125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-21.0 max=28.5
Linear input=0 dtype=torch.bfloat16 min=-11.5 max=10.5
Linear output=0 dtype=torch.bfloat16 min=-17.0 max=19.5
Linear output=1 dtype=torch.bfloat16 min=-16.625 max=19.125
Linear input=0 dtype=torch.bfloat16 min=-11.5 max=10.5
Linear output=0 dtype=torch.bfloat16 min=-30.625 max=24.875
Linear output=1 dtype=torch.bfloat16 min=-30.625 max=24.625
Linear input=0 dtype=torch.bfloat16 min=-11.5 max=10.5
Linear output=0 dtype=torch.bfloat16 min=-10.625 max=11.6875
Linear output=1 dtype=torch.bfloat16 min=-10.5625 max=12.0
Linear input=0 dtype=torch.bfloat16 min=-6.0 max=5.75
Linear output=0 dtype=torch.bfloat16 min=-8.5 max=7.78125
Linear output=1 dtype=torch.bfloat16 min=-8.375 max=7.65625
Linear input=0 dtype=torch.bfloat16 min=-6.0 max=5.75
Linear output=0 dtype=torch.bfloat16 min=-8.25 max=7.3125
Linear output=1 dtype=torch.bfloat16 min=-8.6875 max=7.34375
Linear input=0 dtype=torch.bfloat16 min=-6.0 max=5.75
Linear output=0 dtype=torch.bfloat16 min=-8.5625 max=7.34375
Linear output=1 dtype=torch.bfloat16 min=-8.5 max=7.28125
Linear input=0 dtype=torch.bfloat16 min=-6.375 max=10.0
Linear output=0 dtype=torch.bfloat16 min=-12.1875 max=57.0
Linear output=1 dtype=torch.bfloat16 min=-11.875 max=56.0
Dropout input=0 dtype=torch.bfloat16 min=-12.1875 max=57.0
Dropout output=0 dtype=torch.bfloat16 min=-12.1875 max=57.0
Dropout output=1 dtype=torch.bfloat16 min=-11.875 max=56.0
Linear input=0 dtype=torch.bfloat16 min=-4.6875 max=7.34375
Linear output=0 dtype=torch.bfloat16 min=-56.25 max=53.25
Linear output=1 dtype=torch.bfloat16 min=-54.0 max=40.25
Attention output=0 dtype=torch.bfloat16 min=-12.1875 max=57.0
Attention output=1 dtype=torch.bfloat16 min=-56.25 max=53.25
LayerNorm input=0 dtype=torch.bfloat16 min=-460.0 max=103.5
LayerNorm output=0 dtype=torch.bfloat16 min=-37.0 max=9.5
LayerNorm output=1 dtype=torch.bfloat16 min=-37.0 max=9.75
Linear input=0 dtype=torch.bfloat16 min=-3.03125 max=1.6875
Linear output=0 dtype=torch.bfloat16 min=-4.15625 max=2.40625
Linear output=1 dtype=torch.bfloat16 min=-4.125 max=2.609375
GELU input=0 dtype=torch.bfloat16 min=-3.03125 max=1.6875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.390625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=2.59375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.59375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.390625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=2.59375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.59375
Linear output=0 dtype=torch.bfloat16 min=-3.6875 max=27.125
Linear output=1 dtype=torch.bfloat16 min=-3.671875 max=26.375
FeedForward input=0 dtype=torch.bfloat16 min=-3.03125 max=1.6875
FeedForward output=0 dtype=torch.bfloat16 min=-3.6875 max=27.125
FeedForward output=1 dtype=torch.bfloat16 min=-3.671875 max=26.375
LayerNorm input=0 dtype=torch.bfloat16 min=-28672.0 max=21120.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=34.75
LayerNorm output=1 dtype=torch.bfloat16 min=-33.0 max=34.5
Linear input=0 dtype=torch.bfloat16 min=-17.0 max=18.5
Linear output=0 dtype=torch.bfloat16 min=-15.1875 max=19.625
Linear output=1 dtype=torch.bfloat16 min=-16.25 max=19.75
GELU input=0 dtype=torch.bfloat16 min=-17.0 max=18.5
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=19.625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=19.75
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=19.75
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=19.625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=19.75
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=19.75
Linear output=0 dtype=torch.bfloat16 min=-100.5 max=38.5
Linear output=1 dtype=torch.bfloat16 min=-100.5 max=40.5
FeedForward input=0 dtype=torch.bfloat16 min=-17.0 max=18.5
FeedForward output=0 dtype=torch.bfloat16 min=-100.5 max=38.5
FeedForward output=1 dtype=torch.bfloat16 min=-100.5 max=40.5
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-31488.0 max=21888.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-292.0 max=114.0
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-7.75 max=12.375
Linear output=1 dtype=torch.bfloat16 min=-7.78125 max=12.4375
LayerNorm input=0 dtype=torch.bfloat16 min=-292.0 max=114.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.75 max=15.8125
LayerNorm output=1 dtype=torch.bfloat16 min=-33.75 max=15.875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-292.0 max=114.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-21.5 max=19.75
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.34375 max=5.0625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.046875 max=1.890625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.921875 max=1.8515625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-7.78125 max=12.4375
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-14.25 max=15.6875
Linear output=1 dtype=torch.bfloat16 min=-14.25 max=15.6875
LayerNorm input=0 dtype=torch.bfloat16 min=-31488.0 max=21888.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.5 max=35.0
LayerNorm output=1 dtype=torch.bfloat16 min=-32.5 max=35.0
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-31488.0 max=21888.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-6.5625 max=6.59375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-14.25 max=15.6875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.9296875 max=0.75
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-5.59375 max=13.6875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-11.3125 max=10.5625
Linear input=0 dtype=torch.bfloat16 min=-21.5 max=19.75
Linear output=0 dtype=torch.bfloat16 min=-29.0 max=27.25
Linear output=1 dtype=torch.bfloat16 min=-28.5 max=27.0
Linear input=0 dtype=torch.bfloat16 min=-21.5 max=19.75
Linear output=0 dtype=torch.bfloat16 min=-53.75 max=68.5
Linear output=1 dtype=torch.bfloat16 min=-52.5 max=67.0
Linear input=0 dtype=torch.bfloat16 min=-21.5 max=19.75
Linear output=0 dtype=torch.bfloat16 min=-12.625 max=11.1875
Linear output=1 dtype=torch.bfloat16 min=-12.8125 max=11.125
Linear input=0 dtype=torch.bfloat16 min=-6.5625 max=6.59375
Linear output=0 dtype=torch.bfloat16 min=-8.25 max=8.3125
Linear output=1 dtype=torch.bfloat16 min=-8.1875 max=7.96875
Linear input=0 dtype=torch.bfloat16 min=-6.5625 max=6.59375
Linear output=0 dtype=torch.bfloat16 min=-5.96875 max=6.40625
Linear output=1 dtype=torch.bfloat16 min=-6.21875 max=6.78125
Linear input=0 dtype=torch.bfloat16 min=-6.5625 max=6.59375
Linear output=0 dtype=torch.bfloat16 min=-6.75 max=8.9375
Linear output=1 dtype=torch.bfloat16 min=-7.375 max=8.6875
Linear input=0 dtype=torch.bfloat16 min=-10.4375 max=7.28125
Linear output=0 dtype=torch.bfloat16 min=-40.75 max=15.9375
Linear output=1 dtype=torch.bfloat16 min=-41.0 max=17.625
Dropout input=0 dtype=torch.bfloat16 min=-41.0 max=17.625
Dropout output=0 dtype=torch.bfloat16 min=-40.75 max=15.9375
Dropout output=1 dtype=torch.bfloat16 min=-41.0 max=17.625
Linear input=0 dtype=torch.bfloat16 min=-5.40625 max=5.40625
Linear output=0 dtype=torch.bfloat16 min=-39.25 max=18.625
Linear output=1 dtype=torch.bfloat16 min=-37.0 max=20.625
Attention output=0 dtype=torch.bfloat16 min=-41.0 max=17.625
Attention output=1 dtype=torch.bfloat16 min=-39.25 max=20.625
LayerNorm input=0 dtype=torch.bfloat16 min=-460.0 max=132.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.0 max=11.6875
LayerNorm output=1 dtype=torch.bfloat16 min=-36.0 max=11.6875
Linear input=0 dtype=torch.bfloat16 min=-3.8125 max=2.46875
Linear output=0 dtype=torch.bfloat16 min=-4.125 max=2.859375
Linear output=1 dtype=torch.bfloat16 min=-4.09375 max=2.703125
GELU input=0 dtype=torch.bfloat16 min=-3.8125 max=2.46875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.859375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=2.6875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.859375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.859375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=2.6875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.859375
Linear output=0 dtype=torch.bfloat16 min=-9.8125 max=10.0625
Linear output=1 dtype=torch.bfloat16 min=-9.8125 max=9.875
FeedForward input=0 dtype=torch.bfloat16 min=-3.8125 max=2.46875
FeedForward output=0 dtype=torch.bfloat16 min=-9.8125 max=10.0625
FeedForward output=1 dtype=torch.bfloat16 min=-9.8125 max=9.875
LayerNorm input=0 dtype=torch.bfloat16 min=-31616.0 max=21760.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.25 max=35.0
LayerNorm output=1 dtype=torch.bfloat16 min=-32.5 max=35.0
Linear input=0 dtype=torch.bfloat16 min=-28.0 max=22.375
Linear output=0 dtype=torch.bfloat16 min=-23.625 max=14.9375
Linear output=1 dtype=torch.bfloat16 min=-24.25 max=16.0
GELU input=0 dtype=torch.bfloat16 min=-28.0 max=22.375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=14.9375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=16.0
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=16.0
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=14.9375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=16.0
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=16.0
Linear output=0 dtype=torch.bfloat16 min=-322.0 max=592.0
Linear output=1 dtype=torch.bfloat16 min=-318.0 max=592.0
FeedForward input=0 dtype=torch.bfloat16 min=-28.0 max=22.375
FeedForward output=0 dtype=torch.bfloat16 min=-322.0 max=592.0
FeedForward output=1 dtype=torch.bfloat16 min=-318.0 max=592.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-29824.0 max=15104.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-382.0 max=139.0
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-8.0625 max=8.3125
Linear output=1 dtype=torch.bfloat16 min=-8.125 max=8.375
LayerNorm input=0 dtype=torch.bfloat16 min=-382.0 max=139.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.5 max=14.9375
LayerNorm output=1 dtype=torch.bfloat16 min=-34.25 max=15.25
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-382.0 max=139.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-11.0 max=10.0625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.65625 max=2.390625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.8125 max=2.171875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.953125 max=1.703125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-8.125 max=8.375
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-16.25 max=21.0
Linear output=1 dtype=torch.bfloat16 min=-16.375 max=21.125
LayerNorm input=0 dtype=torch.bfloat16 min=-29824.0 max=15104.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.5 max=31.75
LayerNorm output=1 dtype=torch.bfloat16 min=-34.25 max=31.75
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-29824.0 max=15104.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-10.0 max=9.8125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-16.375 max=21.125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.078125 max=1.9453125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-0.43359375 max=11.0625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-7.8125 max=5.5625
Linear input=0 dtype=torch.bfloat16 min=-11.0 max=10.0625
Linear output=0 dtype=torch.bfloat16 min=-17.875 max=19.375
Linear output=1 dtype=torch.bfloat16 min=-18.0 max=19.125
Linear input=0 dtype=torch.bfloat16 min=-11.0 max=10.0625
Linear output=0 dtype=torch.bfloat16 min=-33.0 max=26.0
Linear output=1 dtype=torch.bfloat16 min=-32.75 max=25.875
Linear input=0 dtype=torch.bfloat16 min=-11.0 max=10.0625
Linear output=0 dtype=torch.bfloat16 min=-10.4375 max=8.6875
Linear output=1 dtype=torch.bfloat16 min=-10.5 max=8.875
Linear input=0 dtype=torch.bfloat16 min=-10.0 max=9.8125
Linear output=0 dtype=torch.bfloat16 min=-6.71875 max=8.25
Linear output=1 dtype=torch.bfloat16 min=-6.65625 max=8.1875
Linear input=0 dtype=torch.bfloat16 min=-10.0 max=9.8125
Linear output=0 dtype=torch.bfloat16 min=-10.4375 max=9.5
Linear output=1 dtype=torch.bfloat16 min=-9.9375 max=9.375
Linear input=0 dtype=torch.bfloat16 min=-10.0 max=9.8125
Linear output=0 dtype=torch.bfloat16 min=-16.625 max=13.375
Linear output=1 dtype=torch.bfloat16 min=-16.5 max=13.25
Linear input=0 dtype=torch.bfloat16 min=-10.1875 max=7.71875
Linear output=0 dtype=torch.bfloat16 min=-17.75 max=25.75
Linear output=1 dtype=torch.bfloat16 min=-17.75 max=25.875
Dropout input=0 dtype=torch.bfloat16 min=-17.75 max=25.875
Dropout output=0 dtype=torch.bfloat16 min=-17.75 max=25.75
Dropout output=1 dtype=torch.bfloat16 min=-17.75 max=25.875
Linear input=0 dtype=torch.bfloat16 min=-8.125 max=5.90625
Linear output=0 dtype=torch.bfloat16 min=-16.75 max=12.0
Linear output=1 dtype=torch.bfloat16 min=-16.125 max=11.9375
Attention output=0 dtype=torch.bfloat16 min=-17.75 max=25.875
Attention output=1 dtype=torch.bfloat16 min=-16.75 max=12.0
LayerNorm input=0 dtype=torch.bfloat16 min=-460.0 max=142.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=12.4375
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=12.5
Linear input=0 dtype=torch.bfloat16 min=-3.6875 max=1.921875
Linear output=0 dtype=torch.bfloat16 min=-4.625 max=3.21875
Linear output=1 dtype=torch.bfloat16 min=-4.625 max=3.28125
GELU input=0 dtype=torch.bfloat16 min=-3.6875 max=1.921875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.21875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.28125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.28125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.21875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.28125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.28125
Linear output=0 dtype=torch.bfloat16 min=-5.03125 max=18.875
Linear output=1 dtype=torch.bfloat16 min=-5.1875 max=18.75
FeedForward input=0 dtype=torch.bfloat16 min=-3.6875 max=1.921875
FeedForward output=0 dtype=torch.bfloat16 min=-5.03125 max=18.875
FeedForward output=1 dtype=torch.bfloat16 min=-5.1875 max=18.75
LayerNorm input=0 dtype=torch.bfloat16 min=-29824.0 max=15040.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.25 max=31.75
LayerNorm output=1 dtype=torch.bfloat16 min=-34.25 max=31.75
Linear input=0 dtype=torch.bfloat16 min=-45.5 max=47.5
Linear output=0 dtype=torch.bfloat16 min=-9.1875 max=10.5625
Linear output=1 dtype=torch.bfloat16 min=-9.1875 max=10.5625
GELU input=0 dtype=torch.bfloat16 min=-45.5 max=47.5
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=10.5625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.5625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.5625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=10.5625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.5625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.5625
Linear output=0 dtype=torch.bfloat16 min=-740.0 max=510.0
Linear output=1 dtype=torch.bfloat16 min=-740.0 max=510.0
FeedForward input=0 dtype=torch.bfloat16 min=-45.5 max=47.5
FeedForward output=0 dtype=torch.bfloat16 min=-740.0 max=510.0
FeedForward output=1 dtype=torch.bfloat16 min=-740.0 max=510.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-29312.0 max=20864.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-332.0 max=152.0
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-9.0 max=11.0
Linear output=1 dtype=torch.bfloat16 min=-9.0 max=11.0625
LayerNorm input=0 dtype=torch.bfloat16 min=-332.0 max=152.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.5 max=17.0
LayerNorm output=1 dtype=torch.bfloat16 min=-32.5 max=17.125
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-332.0 max=152.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-15.9375 max=14.5625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.3125 max=4.34375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.46875 max=1.6796875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.515625 max=1.375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-9.0 max=11.0625
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-23.375 max=23.0
Linear output=1 dtype=torch.bfloat16 min=-23.5 max=23.125
LayerNorm input=0 dtype=torch.bfloat16 min=-29312.0 max=20864.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.75 max=33.5
LayerNorm output=1 dtype=torch.bfloat16 min=-33.75 max=33.5
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-29312.0 max=20864.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-9.8125 max=5.875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-23.5 max=23.125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.6328125 max=0.734375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-4.875 max=11.875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-16.5 max=16.25
Linear input=0 dtype=torch.bfloat16 min=-15.9375 max=14.5625
Linear output=0 dtype=torch.bfloat16 min=-26.5 max=24.625
Linear output=1 dtype=torch.bfloat16 min=-26.375 max=24.125
Linear input=0 dtype=torch.bfloat16 min=-15.9375 max=14.5625
Linear output=0 dtype=torch.bfloat16 min=-62.0 max=71.5
Linear output=1 dtype=torch.bfloat16 min=-62.0 max=71.0
Linear input=0 dtype=torch.bfloat16 min=-15.9375 max=14.5625
Linear output=0 dtype=torch.bfloat16 min=-10.875 max=13.3125
Linear output=1 dtype=torch.bfloat16 min=-10.75 max=12.5
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=5.875
Linear output=0 dtype=torch.bfloat16 min=-8.9375 max=8.0
Linear output=1 dtype=torch.bfloat16 min=-9.0625 max=7.4375
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=5.875
Linear output=0 dtype=torch.bfloat16 min=-9.0625 max=7.96875
Linear output=1 dtype=torch.bfloat16 min=-9.125 max=7.40625
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=5.875
Linear output=0 dtype=torch.bfloat16 min=-6.0 max=5.65625
Linear output=1 dtype=torch.bfloat16 min=-6.0 max=5.625
Linear input=0 dtype=torch.bfloat16 min=-10.875 max=11.375
Linear output=0 dtype=torch.bfloat16 min=-37.75 max=22.125
Linear output=1 dtype=torch.bfloat16 min=-37.5 max=21.625
Dropout input=0 dtype=torch.bfloat16 min=-37.75 max=22.125
Dropout output=0 dtype=torch.bfloat16 min=-37.75 max=22.125
Dropout output=1 dtype=torch.bfloat16 min=-37.5 max=21.625
Linear input=0 dtype=torch.bfloat16 min=-6.625 max=6.96875
Linear output=0 dtype=torch.bfloat16 min=-38.25 max=36.0
Linear output=1 dtype=torch.bfloat16 min=-34.25 max=38.0
Attention output=0 dtype=torch.bfloat16 min=-37.75 max=22.125
Attention output=1 dtype=torch.bfloat16 min=-38.25 max=38.0
LayerNorm input=0 dtype=torch.bfloat16 min=-478.0 max=155.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.5 max=13.1875
LayerNorm output=1 dtype=torch.bfloat16 min=-34.5 max=13.125
Linear input=0 dtype=torch.bfloat16 min=-3.34375 max=2.65625
Linear output=0 dtype=torch.bfloat16 min=-5.0625 max=5.03125
Linear output=1 dtype=torch.bfloat16 min=-4.96875 max=5.09375
GELU input=0 dtype=torch.bfloat16 min=-3.34375 max=2.65625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.03125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=5.09375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.09375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.03125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=5.09375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.09375
Linear output=0 dtype=torch.bfloat16 min=-6.71875 max=12.4375
Linear output=1 dtype=torch.bfloat16 min=-6.71875 max=12.625
FeedForward input=0 dtype=torch.bfloat16 min=-3.34375 max=2.65625
FeedForward output=0 dtype=torch.bfloat16 min=-6.71875 max=12.4375
FeedForward output=1 dtype=torch.bfloat16 min=-6.71875 max=12.625
LayerNorm input=0 dtype=torch.bfloat16 min=-29568.0 max=20608.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.0 max=33.5
LayerNorm output=1 dtype=torch.bfloat16 min=-34.0 max=33.5
Linear input=0 dtype=torch.bfloat16 min=-17.875 max=17.875
Linear output=0 dtype=torch.bfloat16 min=-25.75 max=19.625
Linear output=1 dtype=torch.bfloat16 min=-24.5 max=18.5
GELU input=0 dtype=torch.bfloat16 min=-17.875 max=17.875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=19.625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=18.5
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=19.625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=19.625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=18.5
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=19.625
Linear output=0 dtype=torch.bfloat16 min=-199.0 max=294.0
Linear output=1 dtype=torch.bfloat16 min=-192.0 max=298.0
FeedForward input=0 dtype=torch.bfloat16 min=-17.875 max=17.875
FeedForward output=0 dtype=torch.bfloat16 min=-199.0 max=294.0
FeedForward output=1 dtype=torch.bfloat16 min=-192.0 max=298.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-28544.0 max=15680.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-376.0 max=164.0
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-10.125 max=10.3125
Linear output=1 dtype=torch.bfloat16 min=-10.1875 max=10.375
LayerNorm input=0 dtype=torch.bfloat16 min=-376.0 max=164.0
LayerNorm output=0 dtype=torch.bfloat16 min=-31.75 max=16.375
LayerNorm output=1 dtype=torch.bfloat16 min=-31.75 max=16.375
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-376.0 max=164.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-14.875 max=13.5625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.25 max=2.875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.765625 max=3.140625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.703125 max=3.25
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-10.1875 max=10.375
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-22.0 max=18.75
Linear output=1 dtype=torch.bfloat16 min=-22.0 max=18.875
LayerNorm input=0 dtype=torch.bfloat16 min=-28544.0 max=15680.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.0 max=30.5
LayerNorm output=1 dtype=torch.bfloat16 min=-34.25 max=30.5
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-28544.0 max=15680.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-8.125 max=7.6875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-22.0 max=18.875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.171875 max=1.4296875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.140625 max=16.0
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-3.828125 max=5.4375
Linear input=0 dtype=torch.bfloat16 min=-14.875 max=13.5625
Linear output=0 dtype=torch.bfloat16 min=-28.5 max=25.5
Linear output=1 dtype=torch.bfloat16 min=-28.125 max=25.25
Linear input=0 dtype=torch.bfloat16 min=-14.875 max=13.5625
Linear output=0 dtype=torch.bfloat16 min=-55.75 max=69.5
Linear output=1 dtype=torch.bfloat16 min=-55.0 max=69.0
Linear input=0 dtype=torch.bfloat16 min=-14.875 max=13.5625
Linear output=0 dtype=torch.bfloat16 min=-10.75 max=13.8125
Linear output=1 dtype=torch.bfloat16 min=-10.5625 max=13.6875
Linear input=0 dtype=torch.bfloat16 min=-8.125 max=7.6875
Linear output=0 dtype=torch.bfloat16 min=-9.0625 max=9.1875
Linear output=1 dtype=torch.bfloat16 min=-8.625 max=8.875
Linear input=0 dtype=torch.bfloat16 min=-8.125 max=7.6875
Linear output=0 dtype=torch.bfloat16 min=-11.0 max=13.5625
Linear output=1 dtype=torch.bfloat16 min=-10.8125 max=13.5625
Linear input=0 dtype=torch.bfloat16 min=-8.125 max=7.6875
Linear output=0 dtype=torch.bfloat16 min=-10.375 max=10.375
Linear output=1 dtype=torch.bfloat16 min=-10.1875 max=10.125
Linear input=0 dtype=torch.bfloat16 min=-9.4375 max=9.9375
Linear output=0 dtype=torch.bfloat16 min=-46.75 max=52.0
Linear output=1 dtype=torch.bfloat16 min=-46.75 max=52.0
Dropout input=0 dtype=torch.bfloat16 min=-46.75 max=52.0
Dropout output=0 dtype=torch.bfloat16 min=-46.75 max=52.0
Dropout output=1 dtype=torch.bfloat16 min=-46.75 max=52.0
Linear input=0 dtype=torch.bfloat16 min=-8.6875 max=10.625
Linear output=0 dtype=torch.bfloat16 min=-17.125 max=16.375
Linear output=1 dtype=torch.bfloat16 min=-18.875 max=15.75
Attention output=0 dtype=torch.bfloat16 min=-46.75 max=52.0
Attention output=1 dtype=torch.bfloat16 min=-18.875 max=16.375
LayerNorm input=0 dtype=torch.bfloat16 min=-496.0 max=179.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.5 max=13.5
LayerNorm output=1 dtype=torch.bfloat16 min=-33.5 max=13.5
Linear input=0 dtype=torch.bfloat16 min=-2.65625 max=3.984375
Linear output=0 dtype=torch.bfloat16 min=-5.375 max=2.984375
Linear output=1 dtype=torch.bfloat16 min=-5.25 max=3.03125
GELU input=0 dtype=torch.bfloat16 min=-2.65625 max=3.984375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.984375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.03125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.03125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.984375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.03125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.03125
Linear output=0 dtype=torch.bfloat16 min=-21.5 max=8.1875
Linear output=1 dtype=torch.bfloat16 min=-21.5 max=8.0
FeedForward input=0 dtype=torch.bfloat16 min=-2.65625 max=3.984375
FeedForward output=0 dtype=torch.bfloat16 min=-21.5 max=8.1875
FeedForward output=1 dtype=torch.bfloat16 min=-21.5 max=8.0
LayerNorm input=0 dtype=torch.bfloat16 min=-28544.0 max=15680.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.0 max=30.5
LayerNorm output=1 dtype=torch.bfloat16 min=-34.25 max=30.5
Linear input=0 dtype=torch.bfloat16 min=-66.0 max=61.0
Linear output=0 dtype=torch.bfloat16 min=-26.0 max=23.5
Linear output=1 dtype=torch.bfloat16 min=-25.875 max=24.875
GELU input=0 dtype=torch.bfloat16 min=-66.0 max=61.0
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=23.5
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=24.875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=24.875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=23.5
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=24.875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=24.875
Linear output=0 dtype=torch.bfloat16 min=-988.0 max=1004.0
Linear output=1 dtype=torch.bfloat16 min=-1144.0 max=1160.0
FeedForward input=0 dtype=torch.bfloat16 min=-66.0 max=61.0
FeedForward output=0 dtype=torch.bfloat16 min=-988.0 max=1004.0
FeedForward output=1 dtype=torch.bfloat16 min=-1144.0 max=1160.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-27520.0 max=15936.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-364.0 max=187.0
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-14.6875 max=14.8125
Linear output=1 dtype=torch.bfloat16 min=-14.6875 max=14.875
LayerNorm input=0 dtype=torch.bfloat16 min=-364.0 max=187.0
LayerNorm output=0 dtype=torch.bfloat16 min=-29.625 max=17.5
LayerNorm output=1 dtype=torch.bfloat16 min=-29.5 max=17.625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-364.0 max=187.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-15.3125 max=14.9375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-3.390625 max=4.3125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.2421875 max=3.5625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-4.03125 max=2.765625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-14.6875 max=14.875
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-17.75 max=20.5
Linear output=1 dtype=torch.bfloat16 min=-17.875 max=20.5
LayerNorm input=0 dtype=torch.bfloat16 min=-27520.0 max=15936.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=30.0
LayerNorm output=1 dtype=torch.bfloat16 min=-33.0 max=30.125
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-27520.0 max=15936.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-7.9375 max=5.25
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-17.875 max=20.5
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.625 max=1.8671875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.65625 max=14.8125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-1.609375 max=1.8046875
Linear input=0 dtype=torch.bfloat16 min=-15.3125 max=14.9375
Linear output=0 dtype=torch.bfloat16 min=-22.5 max=21.125
Linear output=1 dtype=torch.bfloat16 min=-22.5 max=21.0
Linear input=0 dtype=torch.bfloat16 min=-15.3125 max=14.9375
Linear output=0 dtype=torch.bfloat16 min=-48.75 max=41.75
Linear output=1 dtype=torch.bfloat16 min=-48.5 max=41.5
Linear input=0 dtype=torch.bfloat16 min=-15.3125 max=14.9375
Linear output=0 dtype=torch.bfloat16 min=-13.75 max=12.9375
Linear output=1 dtype=torch.bfloat16 min=-13.125 max=12.8125
Linear input=0 dtype=torch.bfloat16 min=-7.9375 max=5.25
Linear output=0 dtype=torch.bfloat16 min=-6.875 max=7.28125
Linear output=1 dtype=torch.bfloat16 min=-6.84375 max=7.5625
Linear input=0 dtype=torch.bfloat16 min=-7.9375 max=5.25
Linear output=0 dtype=torch.bfloat16 min=-10.6875 max=15.5625
Linear output=1 dtype=torch.bfloat16 min=-10.625 max=15.625
Linear input=0 dtype=torch.bfloat16 min=-7.9375 max=5.25
Linear output=0 dtype=torch.bfloat16 min=-8.4375 max=7.75
Linear output=1 dtype=torch.bfloat16 min=-8.25 max=7.65625
Linear input=0 dtype=torch.bfloat16 min=-10.9375 max=9.8125
Linear output=0 dtype=torch.bfloat16 min=-49.25 max=26.75
Linear output=1 dtype=torch.bfloat16 min=-51.25 max=26.375
Dropout input=0 dtype=torch.bfloat16 min=-51.25 max=26.75
Dropout output=0 dtype=torch.bfloat16 min=-49.25 max=26.75
Dropout output=1 dtype=torch.bfloat16 min=-51.25 max=26.375
Linear input=0 dtype=torch.bfloat16 min=-10.75 max=10.5625
Linear output=0 dtype=torch.bfloat16 min=-16.875 max=26.75
Linear output=1 dtype=torch.bfloat16 min=-17.625 max=23.5
Attention output=0 dtype=torch.bfloat16 min=-51.25 max=26.75
Attention output=1 dtype=torch.bfloat16 min=-17.625 max=26.75
LayerNorm input=0 dtype=torch.bfloat16 min=-476.0 max=200.0
LayerNorm output=0 dtype=torch.bfloat16 min=-31.125 max=14.3125
LayerNorm output=1 dtype=torch.bfloat16 min=-31.0 max=14.3125
Linear input=0 dtype=torch.bfloat16 min=-2.859375 max=3.0625
Linear output=0 dtype=torch.bfloat16 min=-5.78125 max=4.375
Linear output=1 dtype=torch.bfloat16 min=-5.75 max=4.53125
GELU input=0 dtype=torch.bfloat16 min=-2.859375 max=3.0625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.53125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.53125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.53125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.53125
Linear output=0 dtype=torch.bfloat16 min=-8.8125 max=38.75
Linear output=1 dtype=torch.bfloat16 min=-8.9375 max=38.75
FeedForward input=0 dtype=torch.bfloat16 min=-2.859375 max=3.0625
FeedForward output=0 dtype=torch.bfloat16 min=-8.8125 max=38.75
FeedForward output=1 dtype=torch.bfloat16 min=-8.9375 max=38.75
LayerNorm input=0 dtype=torch.bfloat16 min=-27520.0 max=15936.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=30.125
LayerNorm output=1 dtype=torch.bfloat16 min=-32.75 max=30.125
Linear input=0 dtype=torch.bfloat16 min=-92.5 max=74.0
Linear output=0 dtype=torch.bfloat16 min=-27.125 max=77.5
Linear output=1 dtype=torch.bfloat16 min=-28.25 max=78.5
GELU input=0 dtype=torch.bfloat16 min=-92.5 max=74.0
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=77.5
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=78.5
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=78.5
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=77.5
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=78.5
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=78.5
Linear output=0 dtype=torch.bfloat16 min=-3440.0 max=2256.0
Linear output=1 dtype=torch.bfloat16 min=-3472.0 max=2320.0
FeedForward input=0 dtype=torch.bfloat16 min=-92.5 max=74.0
FeedForward output=0 dtype=torch.bfloat16 min=-3440.0 max=2256.0
FeedForward output=1 dtype=torch.bfloat16 min=-3472.0 max=2320.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-27264.0 max=15808.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-262.0 max=203.0
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-17.625 max=17.375
Linear output=1 dtype=torch.bfloat16 min=-17.75 max=17.375
LayerNorm input=0 dtype=torch.bfloat16 min=-262.0 max=203.0
LayerNorm output=0 dtype=torch.bfloat16 min=-23.375 max=18.125
LayerNorm output=1 dtype=torch.bfloat16 min=-23.125 max=18.5
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-262.0 max=203.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-27.375 max=25.875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.78125 max=5.125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.09375 max=3.546875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-4.78125 max=3.640625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-17.75 max=17.375
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-22.5 max=22.0
Linear output=1 dtype=torch.bfloat16 min=-22.5 max=22.0
LayerNorm input=0 dtype=torch.bfloat16 min=-27264.0 max=15808.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=30.125
LayerNorm output=1 dtype=torch.bfloat16 min=-32.75 max=30.125
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-27264.0 max=15808.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-7.40625 max=8.875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-22.5 max=14.4375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-3.46875 max=4.75
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-8.6875 max=22.0
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-3.203125 max=2.640625
Linear input=0 dtype=torch.bfloat16 min=-27.375 max=25.875
Linear output=0 dtype=torch.bfloat16 min=-29.75 max=32.5
Linear output=1 dtype=torch.bfloat16 min=-29.375 max=32.0
Linear input=0 dtype=torch.bfloat16 min=-27.375 max=25.875
Linear output=0 dtype=torch.bfloat16 min=-52.5 max=60.0
Linear output=1 dtype=torch.bfloat16 min=-52.25 max=59.25
Linear input=0 dtype=torch.bfloat16 min=-27.375 max=25.875
Linear output=0 dtype=torch.bfloat16 min=-19.5 max=15.875
Linear output=1 dtype=torch.bfloat16 min=-18.875 max=15.75
Linear input=0 dtype=torch.bfloat16 min=-7.40625 max=8.875
Linear output=0 dtype=torch.bfloat16 min=-7.90625 max=8.3125
Linear output=1 dtype=torch.bfloat16 min=-7.5625 max=8.1875
Linear input=0 dtype=torch.bfloat16 min=-7.40625 max=8.875
Linear output=0 dtype=torch.bfloat16 min=-15.625 max=15.5
Linear output=1 dtype=torch.bfloat16 min=-15.625 max=15.6875
Linear input=0 dtype=torch.bfloat16 min=-7.40625 max=8.875
Linear output=0 dtype=torch.bfloat16 min=-9.125 max=9.3125
Linear output=1 dtype=torch.bfloat16 min=-9.1875 max=9.125
Linear input=0 dtype=torch.bfloat16 min=-10.0 max=9.75
Linear output=0 dtype=torch.bfloat16 min=-25.875 max=24.875
Linear output=1 dtype=torch.bfloat16 min=-25.625 max=24.75
Dropout input=0 dtype=torch.bfloat16 min=-25.875 max=24.875
Dropout output=0 dtype=torch.bfloat16 min=-25.875 max=24.875
Dropout output=1 dtype=torch.bfloat16 min=-25.625 max=24.75
Linear input=0 dtype=torch.bfloat16 min=-11.375 max=7.9375
Linear output=0 dtype=torch.bfloat16 min=-29.375 max=37.0
Linear output=1 dtype=torch.bfloat16 min=-31.875 max=40.25
Attention output=0 dtype=torch.bfloat16 min=-25.875 max=24.875
Attention output=1 dtype=torch.bfloat16 min=-31.875 max=40.25
LayerNorm input=0 dtype=torch.bfloat16 min=-342.0 max=212.0
LayerNorm output=0 dtype=torch.bfloat16 min=-22.5 max=14.3125
LayerNorm output=1 dtype=torch.bfloat16 min=-22.0 max=14.5
Linear input=0 dtype=torch.bfloat16 min=-3.609375 max=4.03125
Linear output=0 dtype=torch.bfloat16 min=-4.96875 max=4.1875
Linear output=1 dtype=torch.bfloat16 min=-4.90625 max=4.0625
GELU input=0 dtype=torch.bfloat16 min=-3.609375 max=4.03125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.1875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.0625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.1875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.1875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.0625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-27.875 max=15.625
Linear output=1 dtype=torch.bfloat16 min=-28.0 max=15.75
FeedForward input=0 dtype=torch.bfloat16 min=-3.609375 max=4.03125
FeedForward output=0 dtype=torch.bfloat16 min=-27.875 max=15.625
FeedForward output=1 dtype=torch.bfloat16 min=-28.0 max=15.75
LayerNorm input=0 dtype=torch.bfloat16 min=-27264.0 max=15808.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=30.125
LayerNorm output=1 dtype=torch.bfloat16 min=-33.0 max=30.125
Linear input=0 dtype=torch.bfloat16 min=-226.0 max=258.0
Linear output=0 dtype=torch.bfloat16 min=-79.0 max=282.0
Linear output=1 dtype=torch.bfloat16 min=-80.5 max=284.0
GELU input=0 dtype=torch.bfloat16 min=-226.0 max=258.0
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=282.0
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=284.0
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=284.0
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=282.0
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=284.0
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=284.0
Linear output=0 dtype=torch.bfloat16 min=-19456.0 max=30464.0
Linear output=1 dtype=torch.bfloat16 min=-19712.0 max=30592.0
FeedForward input=0 dtype=torch.bfloat16 min=-226.0 max=258.0
FeedForward output=0 dtype=torch.bfloat16 min=-19456.0 max=30464.0
FeedForward output=1 dtype=torch.bfloat16 min=-19712.0 max=30592.0
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-23808.0 max=6976.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-278.0 max=217.0
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-15.3125 max=17.375
Linear output=1 dtype=torch.bfloat16 min=-15.375 max=17.375
LayerNorm input=0 dtype=torch.bfloat16 min=-278.0 max=217.0
LayerNorm output=0 dtype=torch.bfloat16 min=-20.25 max=15.9375
LayerNorm output=1 dtype=torch.bfloat16 min=-19.5 max=16.125
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-278.0 max=217.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-19.0 max=18.125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-7.46875 max=10.3125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.0546875 max=1.2578125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.65625 max=3.984375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-15.375 max=17.375
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-4.15625 max=6.59375
Linear output=1 dtype=torch.bfloat16 min=-4.15625 max=6.5625
LayerNorm input=0 dtype=torch.bfloat16 min=-23808.0 max=6976.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=18.125
LayerNorm output=1 dtype=torch.bfloat16 min=-33.0 max=17.875
AdaLayerNormContinuous input=0 dtype=torch.bfloat16 min=-23808.0 max=6976.0
AdaLayerNormContinuous input=1 dtype=torch.bfloat16 min=-23.25 max=7.09375
AdaLayerNormContinuous output=0 dtype=torch.bfloat16 min=-7.625 max=9.8125
AdaLayerNormContinuous output=1 dtype=torch.bfloat16 min=-7.75 max=9.875
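
AdaLayerNormContinuous appears only here and in the final norm_out below: the last joint block runs with the context stream in pre-only mode, so its context norm takes the conditioning embedding directly (hence the two input= lines) instead of the AdaLayerNormZero path. Approximately, again paraphrasing diffusers:

    import torch

    def ada_layer_norm_continuous(x, cond, linear, norm):
        emb = linear(torch.nn.functional.silu(cond))  # the SiLU/Linear lines above
        scale, shift = emb.chunk(2, dim=1)
        return norm(x) * (1 + scale)[:, None, :] + shift[:, None, :]
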
Linear input=0 dtype=torch.bfloat16 min=-19.0 max=18.125
Linear output=0 dtype=torch.bfloat16 min=-18.875 max=22.625
Linear output=1 dtype=torch.bfloat16 min=-18.5 max=22.125
Linear input=0 dtype=torch.bfloat16 min=-19.0 max=18.125
Linear output=0 dtype=torch.bfloat16 min=-23.375 max=26.0
Linear output=1 dtype=torch.bfloat16 min=-23.125 max=25.75
Linear input=0 dtype=torch.bfloat16 min=-19.0 max=18.125
Linear output=0 dtype=torch.bfloat16 min=-8.125 max=11.3125
Linear output=1 dtype=torch.bfloat16 min=-7.875 max=10.25
Linear input=0 dtype=torch.bfloat16 min=-7.75 max=9.875
Linear output=0 dtype=torch.bfloat16 min=-1.171875 max=1.140625
Linear output=1 dtype=torch.bfloat16 min=-1.1484375 max=1.0703125
Linear input=0 dtype=torch.bfloat16 min=-7.75 max=9.875
Linear output=0 dtype=torch.bfloat16 min=-9.5 max=8.875
Linear output=1 dtype=torch.bfloat16 min=-9.125 max=8.5625
Linear input=0 dtype=torch.bfloat16 min=-7.75 max=9.875
Linear output=0 dtype=torch.bfloat16 min=-5.875 max=7.53125
Linear output=1 dtype=torch.bfloat16 min=-5.90625 max=7.625
Linear input=0 dtype=torch.bfloat16 min=-7.25 max=7.53125
Linear output=0 dtype=torch.bfloat16 min=-18.375 max=29.25
Linear output=1 dtype=torch.bfloat16 min=-18.375 max=29.375
Dropout input=0 dtype=torch.bfloat16 min=-18.375 max=29.375
Dropout output=0 dtype=torch.bfloat16 min=-18.375 max=29.25
Dropout output=1 dtype=torch.bfloat16 min=-18.375 max=29.375
Attention output=0 dtype=torch.bfloat16 min=-18.375 max=29.375
Attention output=1 dtype=torch.bfloat16 min=-4.96875 max=5.53125
LayerNorm input=0 dtype=torch.bfloat16 min=-412.0 max=244.0
LayerNorm output=0 dtype=torch.bfloat16 min=-23.0 max=14.0
LayerNorm output=1 dtype=torch.bfloat16 min=-22.5 max=14.0625
Linear input=0 dtype=torch.bfloat16 min=-6.90625 max=6.1875
Linear output=0 dtype=torch.bfloat16 min=-5.5 max=6.59375
Linear output=1 dtype=torch.bfloat16 min=-5.59375 max=6.71875
GELU input=0 dtype=torch.bfloat16 min=-6.90625 max=6.1875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=6.59375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=6.71875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.71875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=6.59375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=6.71875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.71875
Linear output=0 dtype=torch.bfloat16 min=-23.5 max=38.0
Linear output=1 dtype=torch.bfloat16 min=-23.875 max=39.0
FeedForward input=0 dtype=torch.bfloat16 min=-6.90625 max=6.1875
FeedForward output=0 dtype=torch.bfloat16 min=-23.5 max=38.0
FeedForward output=1 dtype=torch.bfloat16 min=-23.875 max=39.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-520.0 max=308.0
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-1.5 max=6.3125
Linear output=1 dtype=torch.bfloat16 min=-1.5078125 max=6.34375
LayerNorm input=0 dtype=torch.bfloat16 min=-520.0 max=308.0
LayerNorm output=0 dtype=torch.bfloat16 min=-17.75 max=10.5625
LayerNorm output=1 dtype=torch.bfloat16 min=-17.625 max=10.4375
AdaLayerNormContinuous input=0 dtype=torch.bfloat16 min=-520.0 max=308.0
AdaLayerNormContinuous input=1 dtype=torch.bfloat16 min=-23.25 max=7.09375
AdaLayerNormContinuous output=0 dtype=torch.bfloat16 min=-8.0625 max=6.875
AdaLayerNormContinuous output=1 dtype=torch.bfloat16 min=-8.0 max=6.8125
Linear input=0 dtype=torch.bfloat16 min=-8.0625 max=6.875
Linear output=0 dtype=torch.bfloat16 min=-3.984375 max=4.96875
Linear output=1 dtype=torch.bfloat16 min=-3.953125 max=4.96875
SD3Transformer2DModel output=0 dtype=torch.bfloat16 min=-3.984375 max=4.96875
100%|██████████| 2/2 [00:54<00:00, 27.04s/it]
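
The 2/2 progress bar closes the denoising loop; everything below it is the VAE decoder. For context, a driver along these lines would produce this kind of trace. This is a sketch only: the checkpoint id and prompt are assumptions (neither appears in this log), and attach_stat_hooks is the illustrative helper sketched earlier:

    import torch
    from diffusers import StableDiffusion3Pipeline

    pipe = StableDiffusion3Pipeline.from_pretrained(
        "stabilityai/stable-diffusion-3-medium-diffusers",  # assumed checkpoint
        torch_dtype=torch.bfloat16,
    )  # no .to("cuda"): these goldens were captured on CPU
    attach_stat_hooks(pipe.transformer)
    attach_stat_hooks(pipe.vae)
    image = pipe("a prompt", num_inference_steps=2).images[0]  # 2 steps, as above
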
Conv2d input=0 dtype=torch.bfloat16 min=-3.484375 max=2.96875
Conv2d output=0 dtype=torch.bfloat16 min=-2.375 max=2.359375
GroupNorm input=0 dtype=torch.bfloat16 min=-2.375 max=2.359375
GroupNorm output=0 dtype=torch.bfloat16 min=-6.375 max=6.25
SiLU input=0 dtype=torch.bfloat16 min=-6.375 max=6.25
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=6.25
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=6.25
Conv2d output=0 dtype=torch.bfloat16 min=-19.625 max=2.65625
GroupNorm input=0 dtype=torch.bfloat16 min=-19.625 max=2.65625
GroupNorm output=0 dtype=torch.bfloat16 min=-7.84375 max=1.8359375
SiLU input=0 dtype=torch.bfloat16 min=-7.84375 max=1.8359375
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=1.5859375
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=1.5859375
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=1.5859375
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=1.5859375
Conv2d output=0 dtype=torch.bfloat16 min=-1.453125 max=0.99609375
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-2.375 max=2.359375
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-2.84375 max=2.578125
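
Every ResnetBlock2D trace in the decoder repeats the same seven-module pattern: GroupNorm -> SiLU -> Conv2d -> GroupNorm -> SiLU -> Dropout -> Conv2d. The residual add at the end is why the block's output range exceeds the final Conv2d's (here conv2 spans [-1.45, 1.00] but the block spans [-2.84, 2.58]). With no time embedding in the VAE, the diffusers block reduces to roughly:

    import torch

    def resnet_block_2d(x, norm1, conv1, norm2, dropout, conv2):
        h = conv1(torch.nn.functional.silu(norm1(x)))
        h = conv2(dropout(torch.nn.functional.silu(norm2(h))))
        return x + h  # residual add (a 1x1 shortcut conv when channels change)
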
GroupNorm input=0 dtype=torch.bfloat16 min=-2.84375 max=2.578125
GroupNorm output=0 dtype=torch.bfloat16 min=-5.59375 max=5.84375
Linear input=0 dtype=torch.bfloat16 min=-5.59375 max=5.84375
Linear output=0 dtype=torch.bfloat16 min=-7.1875 max=5.6875
Linear input=0 dtype=torch.bfloat16 min=-5.59375 max=5.84375
Linear output=0 dtype=torch.bfloat16 min=-6.125 max=6.40625
Linear input=0 dtype=torch.bfloat16 min=-5.59375 max=5.84375
Linear output=0 dtype=torch.bfloat16 min=-2.953125 max=2.4375
Linear input=0 dtype=torch.bfloat16 min=-2.1875 max=1.8125
Linear output=0 dtype=torch.bfloat16 min=-1.0234375 max=1.0625
Dropout input=0 dtype=torch.bfloat16 min=-1.0234375 max=1.0625
Dropout output=0 dtype=torch.bfloat16 min=-1.0234375 max=1.0625
Attention input=0 dtype=torch.bfloat16 min=-2.84375 max=2.578125
Attention output=0 dtype=torch.bfloat16 min=-3.46875 max=2.8125
GroupNorm input=0 dtype=torch.bfloat16 min=-3.46875 max=2.8125
GroupNorm output=0 dtype=torch.bfloat16 min=-5.46875 max=4.40625
SiLU input=0 dtype=torch.bfloat16 min=-5.46875 max=4.40625
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=4.34375
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.34375
Conv2d output=0 dtype=torch.bfloat16 min=-7.0 max=1.6171875
GroupNorm input=0 dtype=torch.bfloat16 min=-7.0 max=1.6171875
GroupNorm output=0 dtype=torch.bfloat16 min=-5.28125 max=1.6875
SiLU input=0 dtype=torch.bfloat16 min=-5.28125 max=1.6875
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=1.421875
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=1.421875
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=1.421875
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=1.421875
Conv2d output=0 dtype=torch.bfloat16 min=-0.9921875 max=1.0703125
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-3.46875 max=2.8125
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-3.65625 max=3.15625
UNetMidBlock2D input=0 dtype=torch.bfloat16 min=-2.375 max=2.359375
UNetMidBlock2D output=0 dtype=torch.bfloat16 min=-3.65625 max=3.15625
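
UNetMidBlock2D is resnet -> self-attention -> resnet, matching the trace above: the Attention in the middle is plain spatial self-attention over the feature map (GroupNorm, three Linear projections for q/k/v, an output Linear, Dropout), and its skip connection is applied inside the module, which is why the Attention output range exceeds the final projection's. Schematically:

    import torch

    def unet_mid_block_2d(x, resnet0, attn, resnet1):
        h = resnet0(x)
        h = attn(h)  # VAE attention carries its own residual add
        return resnet1(h)
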
GroupNorm input=0 dtype=torch.bfloat16 min=-3.65625 max=3.15625
GroupNorm output=0 dtype=torch.bfloat16 min=-5.40625 max=4.21875
SiLU input=0 dtype=torch.bfloat16 min=-5.40625 max=4.21875
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=4.15625
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.15625
Conv2d output=0 dtype=torch.bfloat16 min=-6.15625 max=1.734375
GroupNorm input=0 dtype=torch.bfloat16 min=-6.15625 max=1.734375
GroupNorm output=0 dtype=torch.bfloat16 min=-4.8125 max=2.34375
SiLU input=0 dtype=torch.bfloat16 min=-4.8125 max=2.34375
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=2.140625
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=2.140625
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=2.140625
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=2.140625
Conv2d output=0 dtype=torch.bfloat16 min=-1.1640625 max=1.5
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-3.65625 max=3.15625
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-3.828125 max=3.234375
GroupNorm input=0 dtype=torch.bfloat16 min=-3.828125 max=3.234375
GroupNorm output=0 dtype=torch.bfloat16 min=-5.15625 max=4.28125
SiLU input=0 dtype=torch.bfloat16 min=-5.15625 max=4.28125
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=4.21875
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.21875
Conv2d output=0 dtype=torch.bfloat16 min=-6.46875 max=2.25
GroupNorm input=0 dtype=torch.bfloat16 min=-6.46875 max=2.25
GroupNorm output=0 dtype=torch.bfloat16 min=-5.1875 max=3.140625
SiLU input=0 dtype=torch.bfloat16 min=-5.1875 max=3.140625
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=3.015625
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=3.015625
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=3.015625
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=3.015625
Conv2d output=0 dtype=torch.bfloat16 min=-1.09375 max=1.09375
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-3.828125 max=3.234375
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-3.78125 max=3.203125
GroupNorm input=0 dtype=torch.bfloat16 min=-3.78125 max=3.203125
GroupNorm output=0 dtype=torch.bfloat16 min=-5.625 max=4.65625
SiLU input=0 dtype=torch.bfloat16 min=-5.625 max=4.65625
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Conv2d output=0 dtype=torch.bfloat16 min=-6.4375 max=2.34375
GroupNorm input=0 dtype=torch.bfloat16 min=-6.4375 max=2.34375
GroupNorm output=0 dtype=torch.bfloat16 min=-5.5 max=2.59375
SiLU input=0 dtype=torch.bfloat16 min=-5.5 max=2.59375
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=2.40625
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=2.40625
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=2.40625
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=2.40625
Conv2d output=0 dtype=torch.bfloat16 min=-0.98828125 max=1.75
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-3.78125 max=3.203125
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-3.546875 max=3.484375
Conv2d input=0 dtype=torch.bfloat16 min=-3.546875 max=3.484375
Conv2d output=0 dtype=torch.bfloat16 min=-4.125 max=2.859375
Upsample2D input=0 dtype=torch.bfloat16 min=-3.546875 max=3.484375
Upsample2D output=0 dtype=torch.bfloat16 min=-4.125 max=2.859375
UpDecoderBlock2D input=0 dtype=torch.bfloat16 min=-3.65625 max=3.15625
UpDecoderBlock2D output=0 dtype=torch.bfloat16 min=-4.125 max=2.859375
GroupNorm input=0 dtype=torch.bfloat16 min=-4.125 max=2.859375
GroupNorm output=0 dtype=torch.bfloat16 min=-8.6875 max=6.1875
SiLU input=0 dtype=torch.bfloat16 min=-8.6875 max=6.1875
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=6.1875
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=6.1875
Conv2d output=0 dtype=torch.bfloat16 min=-4.53125 max=2.125
GroupNorm input=0 dtype=torch.bfloat16 min=-4.53125 max=2.125
GroupNorm output=0 dtype=torch.bfloat16 min=-8.375 max=5.96875
SiLU input=0 dtype=torch.bfloat16 min=-8.375 max=5.96875
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=5.96875
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=5.96875
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=5.96875
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=5.96875
Conv2d output=0 dtype=torch.bfloat16 min=-2.75 max=1.4921875
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-4.125 max=2.859375
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-5.5 max=2.8125
GroupNorm input=0 dtype=torch.bfloat16 min=-5.5 max=2.8125
GroupNorm output=0 dtype=torch.bfloat16 min=-5.0 max=4.9375
SiLU input=0 dtype=torch.bfloat16 min=-5.0 max=4.9375
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=4.90625
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.90625
Conv2d output=0 dtype=torch.bfloat16 min=-3.109375 max=1.515625
GroupNorm input=0 dtype=torch.bfloat16 min=-3.109375 max=1.515625
GroupNorm output=0 dtype=torch.bfloat16 min=-8.75 max=4.375
SiLU input=0 dtype=torch.bfloat16 min=-8.75 max=4.375
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=4.3125
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.3125
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=4.3125
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.3125
Conv2d output=0 dtype=torch.bfloat16 min=-2.1875 max=1.7265625
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-5.5 max=2.8125
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-5.8125 max=3.1875
GroupNorm input=0 dtype=torch.bfloat16 min=-5.8125 max=3.1875
GroupNorm output=0 dtype=torch.bfloat16 min=-5.4375 max=4.875
SiLU input=0 dtype=torch.bfloat16 min=-5.4375 max=4.875
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=4.84375
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.84375
Conv2d output=0 dtype=torch.bfloat16 min=-3.90625 max=2.734375
GroupNorm input=0 dtype=torch.bfloat16 min=-3.90625 max=2.734375
GroupNorm output=0 dtype=torch.bfloat16 min=-7.65625 max=6.21875
SiLU input=0 dtype=torch.bfloat16 min=-7.65625 max=6.21875
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=6.21875
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=6.21875
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=6.21875
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=6.21875
Conv2d output=0 dtype=torch.bfloat16 min=-2.5 max=3.578125
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-5.8125 max=3.1875
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-5.375 max=3.390625
Conv2d input=0 dtype=torch.bfloat16 min=-5.375 max=3.390625
Conv2d output=0 dtype=torch.bfloat16 min=-4.71875 max=3.875
Upsample2D input=0 dtype=torch.bfloat16 min=-5.375 max=3.390625
Upsample2D output=0 dtype=torch.bfloat16 min=-4.71875 max=3.875
UpDecoderBlock2D input=0 dtype=torch.bfloat16 min=-4.125 max=2.859375
UpDecoderBlock2D output=0 dtype=torch.bfloat16 min=-4.71875 max=3.875
GroupNorm input=0 dtype=torch.bfloat16 min=-4.71875 max=3.875
GroupNorm output=0 dtype=torch.bfloat16 min=-6.15625 max=7.21875
SiLU input=0 dtype=torch.bfloat16 min=-6.15625 max=7.21875
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=7.21875
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.21875
Conv2d output=0 dtype=torch.bfloat16 min=-3.71875 max=1.96875
GroupNorm input=0 dtype=torch.bfloat16 min=-3.71875 max=1.96875
GroupNorm output=0 dtype=torch.bfloat16 min=-6.78125 max=5.09375
SiLU input=0 dtype=torch.bfloat16 min=-6.78125 max=5.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=5.0625
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=5.0625
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=5.0625
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=5.0625
Conv2d output=0 dtype=torch.bfloat16 min=-2.96875 max=1.7890625
Conv2d input=0 dtype=torch.bfloat16 min=-4.71875 max=3.875
Conv2d output=0 dtype=torch.bfloat16 min=-3.125 max=2.984375
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-4.71875 max=3.875
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-5.21875 max=3.53125
GroupNorm input=0 dtype=torch.bfloat16 min=-5.21875 max=3.53125
GroupNorm output=0 dtype=torch.bfloat16 min=-7.0 max=3.921875
SiLU input=0 dtype=torch.bfloat16 min=-7.0 max=3.921875
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=3.84375
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=3.84375
Conv2d output=0 dtype=torch.bfloat16 min=-2.046875 max=1.0703125
GroupNorm input=0 dtype=torch.bfloat16 min=-2.046875 max=1.0703125
GroupNorm output=0 dtype=torch.bfloat16 min=-7.46875 max=4.0
SiLU input=0 dtype=torch.bfloat16 min=-7.46875 max=4.0
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=3.921875
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=3.921875
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=3.921875
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=3.921875
Conv2d output=0 dtype=torch.bfloat16 min=-1.8359375 max=1.5078125
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-5.21875 max=3.53125
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-5.5625 max=4.0625
GroupNorm input=0 dtype=torch.bfloat16 min=-5.5625 max=4.0625
GroupNorm output=0 dtype=torch.bfloat16 min=-6.5 max=4.34375
SiLU input=0 dtype=torch.bfloat16 min=-6.5 max=4.34375
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=4.28125
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.28125
Conv2d output=0 dtype=torch.bfloat16 min=-1.65625 max=1.3203125
GroupNorm input=0 dtype=torch.bfloat16 min=-1.65625 max=1.3203125
GroupNorm output=0 dtype=torch.bfloat16 min=-6.84375 max=7.6875
SiLU input=0 dtype=torch.bfloat16 min=-6.84375 max=7.6875
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=7.6875
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.6875
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=7.6875
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.6875
Conv2d output=0 dtype=torch.bfloat16 min=-1.96875 max=2.953125
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-5.5625 max=4.0625
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-5.375 max=4.25
Conv2d input=0 dtype=torch.bfloat16 min=-5.375 max=4.25
Conv2d output=0 dtype=torch.bfloat16 min=-5.6875 max=4.96875
Upsample2D input=0 dtype=torch.bfloat16 min=-5.375 max=4.25
Upsample2D output=0 dtype=torch.bfloat16 min=-5.6875 max=4.96875
UpDecoderBlock2D input=0 dtype=torch.bfloat16 min=-4.71875 max=3.875
UpDecoderBlock2D output=0 dtype=torch.bfloat16 min=-5.6875 max=4.96875
GroupNorm input=0 dtype=torch.bfloat16 min=-5.6875 max=4.96875
GroupNorm output=0 dtype=torch.bfloat16 min=-6.46875 max=5.96875
SiLU input=0 dtype=torch.bfloat16 min=-6.46875 max=5.96875
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=5.96875
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=5.96875
Conv2d output=0 dtype=torch.bfloat16 min=-2.4375 max=1.390625
GroupNorm input=0 dtype=torch.bfloat16 min=-2.4375 max=1.390625
GroupNorm output=0 dtype=torch.bfloat16 min=-6.0 max=3.875
SiLU input=0 dtype=torch.bfloat16 min=-6.0 max=3.875
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=3.796875
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=3.796875
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=3.796875
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=3.796875
Conv2d output=0 dtype=torch.bfloat16 min=-3.25 max=2.1875
Conv2d input=0 dtype=torch.bfloat16 min=-5.6875 max=4.96875
Conv2d output=0 dtype=torch.bfloat16 min=-4.1875 max=3.484375
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-5.6875 max=4.96875
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-6.375 max=3.796875
GroupNorm input=0 dtype=torch.bfloat16 min=-6.375 max=3.796875
GroupNorm output=0 dtype=torch.bfloat16 min=-7.25 max=6.0625
SiLU input=0 dtype=torch.bfloat16 min=-7.25 max=6.0625
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=6.0625
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=6.0625
Conv2d output=0 dtype=torch.bfloat16 min=-1.3671875 max=1.0
GroupNorm input=0 dtype=torch.bfloat16 min=-1.3671875 max=1.0
GroupNorm output=0 dtype=torch.bfloat16 min=-9.625 max=9.6875
SiLU input=0 dtype=torch.bfloat16 min=-9.625 max=9.6875
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=9.6875
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=9.6875
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=9.6875
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=9.6875
Conv2d output=0 dtype=torch.bfloat16 min=-2.921875 max=2.125
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-6.375 max=3.796875
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-7.4375 max=4.5
GroupNorm input=0 dtype=torch.bfloat16 min=-7.4375 max=4.5
GroupNorm output=0 dtype=torch.bfloat16 min=-14.8125 max=9.3125
SiLU input=0 dtype=torch.bfloat16 min=-14.8125 max=9.3125
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=9.3125
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=9.3125
Conv2d output=0 dtype=torch.bfloat16 min=-3.203125 max=1.234375
GroupNorm input=0 dtype=torch.bfloat16 min=-3.203125 max=1.234375
GroupNorm output=0 dtype=torch.bfloat16 min=-21.0 max=10.3125
SiLU input=0 dtype=torch.bfloat16 min=-21.0 max=10.3125
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=10.3125
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=10.3125
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=10.3125
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=10.3125
Conv2d output=0 dtype=torch.bfloat16 min=-6.34375 max=2.671875
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-7.4375 max=4.5
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-8.3125 max=4.90625
UpDecoderBlock2D input=0 dtype=torch.bfloat16 min=-5.6875 max=4.96875
UpDecoderBlock2D output=0 dtype=torch.bfloat16 min=-8.3125 max=4.90625
GroupNorm input=0 dtype=torch.bfloat16 min=-8.3125 max=4.90625
GroupNorm output=0 dtype=torch.bfloat16 min=-9.0625 max=2.765625
SiLU input=0 dtype=torch.bfloat16 min=-9.0625 max=2.765625
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=2.609375
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=2.609375
Conv2d output=0 dtype=torch.bfloat16 min=-1.109375 max=1.328125
Decoder input=0 dtype=torch.bfloat16 min=-3.484375 max=2.96875
Decoder output=0 dtype=torch.bfloat16 min=-1.109375 max=1.328125
average inference time=73.59430265426636
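
Stats in this format can be captured with PyTorch forward hooks registered on every module of the decoder. The gist contains only the output, not the script, so the sketch below is an assumption: the names (log_activation_stats, _describe, pipe) are hypothetical, and the hook simply prints each module's class name plus the dtype and min/max of every tensor input and output. Because a forward hook fires after a module's forward completes, container modules (ResnetBlock2D, UNetMidBlock2D, UpDecoderBlock2D, Decoder) print after their children, which matches the ordering of the trace above.

import torch

def _describe(label, idx, t):
    # Works for both integer (token id) and floating (activation) tensors.
    if isinstance(t, torch.Tensor) and t.numel() > 0:
        print(f"{label}={idx} dtype={t.dtype} min={t.min().item()} max={t.max().item()}")

def log_activation_stats(module, inputs, output):
    # Print min/max stats for every positional input and every output tensor.
    name = module.__class__.__name__
    for i, t in enumerate(inputs):
        _describe(f"{name} input", i, t)
    outputs = output if isinstance(output, (tuple, list)) else (output,)
    for i, t in enumerate(outputs):
        _describe(f"{name} output", i, t)

# Hypothetical usage: register the hook on every submodule of the VAE decoder
# of a loaded StableDiffusion3Pipeline, here assumed to be bound to `pipe`.
# for m in pipe.vae.decoder.modules():
#     m.register_forward_hook(log_activation_stats)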