@0x6b64
Created June 18, 2024 10:09
sd3_cpu_failing_activations.txt
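What follows is the raw console log from loading StableDiffusion3Pipeline in bfloat16 on CPU and tracing per-module activation ranges through its CLIP text encoders. A minimal repro sketch of the load step (the model id and the exact load call are assumptions; the gist does not include the driver script):

import torch
from diffusers import StableDiffusion3Pipeline

# Assumption: one plausible way to end up with a bf16 SD3 pipeline on CPU,
# matching the 9-component "Loading pipeline components" progress bar below.
pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    torch_dtype=torch.bfloat16,
)
pipe.to("cpu")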
/usr/local/lib/python3.10/site-packages/diffusers/models/transformers/transformer_2d.py:34: FutureWarning: `Transformer2DModelOutput` is deprecated and will be removed in version 1.0.0. Importing `Transformer2DModelOutput` from `diffusers.models.transformer_2d` is deprecated and this will be removed in a future version. Please use `from diffusers.models.modeling_outputs import Transformer2DModelOutput`, instead.
deprecate("Transformer2DModelOutput", "1.0.0", deprecation_message)
Loading pipeline components...: 0%| | 0/9 [00:00<?, ?it/s]
You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
Loading checkpoint shards: 100%|██████████| 2/2 [00:00<00:00, 2.25it/s]
Some weights of the model checkpoint were not used when initializing SD3Transformer2DModel:
['pos_embed.pos_embed']
Loading pipeline components...: 100%|██████████| 9/9 [00:03<00:00, 2.85it/s]
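The per-module lines below come from forward hooks that print the dtype and min/max of every tensor argument flowing through the text encoders. A minimal sketch of such a logger (the instrumentation itself is not shown in the gist; names are illustrative, and `pipe` is the pipeline from the sketch above):

import torch

def stats_hook(module, inputs, output):
    # Print "<Module> input=<i>/output=<i> dtype=... min=... max=..." for every
    # tensor positional argument, matching the trace format below. Non-tensor
    # arguments (e.g. a None attention_mask) keep their index but are skipped,
    # which matches CLIPEncoderLayer logging input=0 and input=2 but no input=1.
    name = module.__class__.__name__
    for i, t in enumerate(inputs):
        if torch.is_tensor(t):
            print(f"{name} input={i} dtype={t.dtype} min={t.min().item()} max={t.max().item()}")
    outputs = output if isinstance(output, tuple) else (output,)
    for i, t in enumerate(outputs):
        if torch.is_tensor(t):
            print(f"{name} output={i} dtype={t.dtype} min={t.min().item()} max={t.max().item()}")

# Instrument the two CLIP text encoders traced below.
for encoder in (pipe.text_encoder, pipe.text_encoder_2):
    for m in encoder.modules():
        m.register_forward_hook(stats_hook)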
Embedding input=0 dtype=torch.int64 min=320 max=49407
Embedding output=0 dtype=torch.bfloat16 min=-0.5078125 max=0.65234375
Embedding input=0 dtype=torch.int64 min=0 max=76
Embedding output=0 dtype=torch.bfloat16 min=-0.1181640625 max=0.65234375
CLIPTextEmbeddings output=0 dtype=torch.bfloat16 min=-0.5234375 max=1.3046875
LayerNorm input=0 dtype=torch.bfloat16 min=-0.5234375 max=1.3046875
LayerNorm output=0 dtype=torch.bfloat16 min=-27.25 max=28.5
Linear input=0 dtype=torch.bfloat16 min=-27.25 max=28.5
Linear output=0 dtype=torch.bfloat16 min=-6.90625 max=5.28125
Linear input=0 dtype=torch.bfloat16 min=-27.25 max=28.5
Linear output=0 dtype=torch.bfloat16 min=-8.9375 max=9.375
Linear input=0 dtype=torch.bfloat16 min=-27.25 max=28.5
Linear output=0 dtype=torch.bfloat16 min=-2.625 max=2.234375
Linear input=0 dtype=torch.bfloat16 min=-2.421875 max=1.6015625
Linear output=0 dtype=torch.bfloat16 min=-0.94921875 max=1.0078125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.94921875 max=1.0078125
LayerNorm input=0 dtype=torch.bfloat16 min=-0.9609375 max=1.71875
LayerNorm output=0 dtype=torch.bfloat16 min=-23.25 max=179.0
Linear input=0 dtype=torch.bfloat16 min=-23.25 max=179.0
Linear output=0 dtype=torch.bfloat16 min=-44.75 max=233.0
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-44.75 max=233.0
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=233.0
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=233.0
Linear output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPMLP input=0 dtype=torch.bfloat16 min=-23.25 max=179.0
CLIPMLP output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-0.5234375 max=1.3046875
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
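Aside: on the recurring `input=2` lines, -3.3895313892515355e+38 is exactly the most negative finite bfloat16 value, which identifies that tensor as the additive causal attention mask filled with torch.finfo(torch.bfloat16).min:

import torch
print(torch.finfo(torch.bfloat16).min)  # -3.3895313892515355e+38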
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-16.75 max=10.25
Linear input=0 dtype=torch.bfloat16 min=-16.75 max=10.25
Linear output=0 dtype=torch.bfloat16 min=-5.25 max=4.84375
Linear input=0 dtype=torch.bfloat16 min=-16.75 max=10.25
Linear output=0 dtype=torch.bfloat16 min=-4.625 max=3.890625
Linear input=0 dtype=torch.bfloat16 min=-16.75 max=10.25
Linear output=0 dtype=torch.bfloat16 min=-2.171875 max=3.53125
Linear input=0 dtype=torch.bfloat16 min=-1.4140625 max=2.8125
Linear output=0 dtype=torch.bfloat16 min=-0.375 max=0.84765625
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.375 max=0.84765625
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-41.0 max=80.5
Linear input=0 dtype=torch.bfloat16 min=-41.0 max=80.5
Linear output=0 dtype=torch.bfloat16 min=-11.75 max=3.8125
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-11.75 max=3.8125
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=3.8125
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=3.8125
Linear output=0 dtype=torch.bfloat16 min=-1.203125 max=1.0234375
CLIPMLP input=0 dtype=torch.bfloat16 min=-41.0 max=80.5
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.203125 max=1.0234375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-18.875 max=16.0
Linear input=0 dtype=torch.bfloat16 min=-18.875 max=16.0
Linear output=0 dtype=torch.bfloat16 min=-5.96875 max=4.875
Linear input=0 dtype=torch.bfloat16 min=-18.875 max=16.0
Linear output=0 dtype=torch.bfloat16 min=-4.46875 max=4.46875
Linear input=0 dtype=torch.bfloat16 min=-18.875 max=16.0
Linear output=0 dtype=torch.bfloat16 min=-2.6875 max=3.640625
Linear input=0 dtype=torch.bfloat16 min=-2.328125 max=2.234375
Linear output=0 dtype=torch.bfloat16 min=-0.400390625 max=0.345703125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.400390625 max=0.345703125
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-25.125 max=80.5
Linear input=0 dtype=torch.bfloat16 min=-25.125 max=80.5
Linear output=0 dtype=torch.bfloat16 min=-6.78125 max=3.90625
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-6.78125 max=3.90625
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=3.90625
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=3.90625
Linear output=0 dtype=torch.bfloat16 min=-0.546875 max=0.50390625
CLIPMLP input=0 dtype=torch.bfloat16 min=-25.125 max=80.5
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.546875 max=0.50390625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-16.875 max=20.375
Linear input=0 dtype=torch.bfloat16 min=-16.875 max=20.375
Linear output=0 dtype=torch.bfloat16 min=-6.40625 max=5.25
Linear input=0 dtype=torch.bfloat16 min=-16.875 max=20.375
Linear output=0 dtype=torch.bfloat16 min=-4.75 max=4.46875
Linear input=0 dtype=torch.bfloat16 min=-16.875 max=20.375
Linear output=0 dtype=torch.bfloat16 min=-2.703125 max=2.78125
Linear input=0 dtype=torch.bfloat16 min=-1.453125 max=1.7734375
Linear output=0 dtype=torch.bfloat16 min=-0.412109375 max=0.462890625
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.412109375 max=0.462890625
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-26.5 max=61.0
Linear input=0 dtype=torch.bfloat16 min=-26.5 max=61.0
Linear output=0 dtype=torch.bfloat16 min=-5.90625 max=4.65625
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-5.90625 max=4.65625
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=4.65625
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=4.65625
Linear output=0 dtype=torch.bfloat16 min=-0.451171875 max=0.462890625
CLIPMLP input=0 dtype=torch.bfloat16 min=-26.5 max=61.0
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.451171875 max=0.462890625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-17.75 max=19.125
Linear input=0 dtype=torch.bfloat16 min=-17.75 max=19.125
Linear output=0 dtype=torch.bfloat16 min=-4.5625 max=4.8125
Linear input=0 dtype=torch.bfloat16 min=-17.75 max=19.125
Linear output=0 dtype=torch.bfloat16 min=-5.0625 max=4.6875
Linear input=0 dtype=torch.bfloat16 min=-17.75 max=19.125
Linear output=0 dtype=torch.bfloat16 min=-3.03125 max=2.65625
Linear input=0 dtype=torch.bfloat16 min=-2.28125 max=2.078125
Linear output=0 dtype=torch.bfloat16 min=-0.380859375 max=0.41796875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.380859375 max=0.41796875
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-25.375 max=49.25
Linear input=0 dtype=torch.bfloat16 min=-25.375 max=49.25
Linear output=0 dtype=torch.bfloat16 min=-6.21875 max=6.3125
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-6.21875 max=6.3125
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=6.3125
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=6.3125
Linear output=0 dtype=torch.bfloat16 min=-0.5 max=0.5625
CLIPMLP input=0 dtype=torch.bfloat16 min=-25.375 max=49.25
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.5 max=0.5625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-18.5 max=20.0
Linear input=0 dtype=torch.bfloat16 min=-18.5 max=20.0
Linear output=0 dtype=torch.bfloat16 min=-4.9375 max=5.78125
Linear input=0 dtype=torch.bfloat16 min=-18.5 max=20.0
Linear output=0 dtype=torch.bfloat16 min=-5.53125 max=4.5
Linear input=0 dtype=torch.bfloat16 min=-18.5 max=20.0
Linear output=0 dtype=torch.bfloat16 min=-2.65625 max=2.484375
Linear input=0 dtype=torch.bfloat16 min=-1.40625 max=1.5234375
Linear output=0 dtype=torch.bfloat16 min=-0.45703125 max=0.71484375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.45703125 max=0.71484375
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=44.25
Linear input=0 dtype=torch.bfloat16 min=-33.0 max=44.25
Linear output=0 dtype=torch.bfloat16 min=-5.875 max=3.40625
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-5.875 max=3.40625
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=3.390625
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=3.390625
Linear output=0 dtype=torch.bfloat16 min=-0.92578125 max=0.494140625
CLIPMLP input=0 dtype=torch.bfloat16 min=-33.0 max=44.25
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.92578125 max=0.494140625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-19.0 max=18.125
Linear input=0 dtype=torch.bfloat16 min=-19.0 max=18.125
Linear output=0 dtype=torch.bfloat16 min=-5.09375 max=6.5
Linear input=0 dtype=torch.bfloat16 min=-19.0 max=18.125
Linear output=0 dtype=torch.bfloat16 min=-5.75 max=4.4375
Linear input=0 dtype=torch.bfloat16 min=-19.0 max=18.125
Linear output=0 dtype=torch.bfloat16 min=-3.21875 max=2.984375
Linear input=0 dtype=torch.bfloat16 min=-2.3125 max=2.921875
Linear output=0 dtype=torch.bfloat16 min=-0.60546875 max=1.0390625
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.60546875 max=1.0390625
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.75 max=39.75
Linear input=0 dtype=torch.bfloat16 min=-34.75 max=39.75
Linear output=0 dtype=torch.bfloat16 min=-6.9375 max=4.28125
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-6.9375 max=4.28125
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=4.28125
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=4.28125
Linear output=0 dtype=torch.bfloat16 min=-0.85546875 max=0.50390625
CLIPMLP input=0 dtype=torch.bfloat16 min=-34.75 max=39.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.85546875 max=0.50390625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-22.0 max=17.5
Linear input=0 dtype=torch.bfloat16 min=-22.0 max=17.5
Linear output=0 dtype=torch.bfloat16 min=-6.375 max=6.21875
Linear input=0 dtype=torch.bfloat16 min=-22.0 max=17.5
Linear output=0 dtype=torch.bfloat16 min=-5.4375 max=4.65625
Linear input=0 dtype=torch.bfloat16 min=-22.0 max=17.5
Linear output=0 dtype=torch.bfloat16 min=-3.609375 max=3.09375
Linear input=0 dtype=torch.bfloat16 min=-1.65625 max=2.96875
Linear output=0 dtype=torch.bfloat16 min=-0.435546875 max=0.7109375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.435546875 max=0.7109375
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.5 max=39.25
Linear input=0 dtype=torch.bfloat16 min=-36.5 max=39.25
Linear output=0 dtype=torch.bfloat16 min=-6.34375 max=6.40625
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-6.34375 max=6.40625
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=6.40625
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=6.40625
Linear output=0 dtype=torch.bfloat16 min=-1.7265625 max=1.0078125
CLIPMLP input=0 dtype=torch.bfloat16 min=-36.5 max=39.25
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.7265625 max=1.0078125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-24.5 max=18.875
Linear input=0 dtype=torch.bfloat16 min=-24.5 max=18.875
Linear output=0 dtype=torch.bfloat16 min=-6.25 max=6.5625
Linear input=0 dtype=torch.bfloat16 min=-24.5 max=18.875
Linear output=0 dtype=torch.bfloat16 min=-4.96875 max=5.84375
Linear input=0 dtype=torch.bfloat16 min=-24.5 max=18.875
Linear output=0 dtype=torch.bfloat16 min=-2.828125 max=3.34375
Linear input=0 dtype=torch.bfloat16 min=-1.171875 max=1.7734375
Linear output=0 dtype=torch.bfloat16 min=-0.4375 max=0.96875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.4375 max=0.96875
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.5 max=43.75
Linear input=0 dtype=torch.bfloat16 min=-34.5 max=43.75
Linear output=0 dtype=torch.bfloat16 min=-6.40625 max=6.75
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-6.40625 max=6.75
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=6.75
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=6.75
Linear output=0 dtype=torch.bfloat16 min=-0.71875 max=0.75
CLIPMLP input=0 dtype=torch.bfloat16 min=-34.5 max=43.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.71875 max=0.75
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-23.25 max=18.5
Linear input=0 dtype=torch.bfloat16 min=-23.25 max=18.5
Linear output=0 dtype=torch.bfloat16 min=-5.96875 max=5.84375
Linear input=0 dtype=torch.bfloat16 min=-23.25 max=18.5
Linear output=0 dtype=torch.bfloat16 min=-4.75 max=4.5625
Linear input=0 dtype=torch.bfloat16 min=-23.25 max=18.5
Linear output=0 dtype=torch.bfloat16 min=-3.90625 max=3.78125
Linear input=0 dtype=torch.bfloat16 min=-3.484375 max=3.046875
Linear output=0 dtype=torch.bfloat16 min=-1.171875 max=2.71875
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.171875 max=2.71875
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-37.75 max=45.0
Linear input=0 dtype=torch.bfloat16 min=-37.75 max=45.0
Linear output=0 dtype=torch.bfloat16 min=-7.4375 max=5.5
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-7.4375 max=5.5
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=5.5
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=5.5
Linear output=0 dtype=torch.bfloat16 min=-1.03125 max=2.21875
CLIPMLP input=0 dtype=torch.bfloat16 min=-37.75 max=45.0
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.03125 max=2.21875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-24.5 max=15.75
Linear input=0 dtype=torch.bfloat16 min=-24.5 max=15.75
Linear output=0 dtype=torch.bfloat16 min=-6.875 max=6.53125
Linear input=0 dtype=torch.bfloat16 min=-24.5 max=15.75
Linear output=0 dtype=torch.bfloat16 min=-5.03125 max=5.59375
Linear input=0 dtype=torch.bfloat16 min=-24.5 max=15.75
Linear output=0 dtype=torch.bfloat16 min=-3.71875 max=4.25
Linear input=0 dtype=torch.bfloat16 min=-2.4375 max=2.34375
Linear output=0 dtype=torch.bfloat16 min=-1.2109375 max=1.375
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.2109375 max=1.375
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-53.0 max=58.0
Linear input=0 dtype=torch.bfloat16 min=-53.0 max=58.0
Linear output=0 dtype=torch.bfloat16 min=-8.0 max=5.65625
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-8.0 max=5.65625
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=5.65625
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=5.65625
Linear output=0 dtype=torch.bfloat16 min=-1.3125 max=1.015625
CLIPMLP input=0 dtype=torch.bfloat16 min=-53.0 max=58.0
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.3125 max=1.015625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-28.375 max=10.3125
Linear input=0 dtype=torch.bfloat16 min=-28.375 max=10.3125
Linear output=0 dtype=torch.bfloat16 min=-6.0625 max=5.78125
Linear input=0 dtype=torch.bfloat16 min=-28.375 max=10.3125
Linear output=0 dtype=torch.bfloat16 min=-6.65625 max=7.09375
Linear input=0 dtype=torch.bfloat16 min=-28.375 max=10.3125
Linear output=0 dtype=torch.bfloat16 min=-4.5 max=4.5
Linear input=0 dtype=torch.bfloat16 min=-2.65625 max=3.359375
Linear output=0 dtype=torch.bfloat16 min=-1.3125 max=1.59375
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.3125 max=1.59375
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-54.0 max=55.0
Linear input=0 dtype=torch.bfloat16 min=-54.0 max=55.0
Linear output=0 dtype=torch.bfloat16 min=-5.3125 max=4.0625
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-5.3125 max=4.0625
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=4.0625
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=4.0625
Linear output=0 dtype=torch.bfloat16 min=-4.84375 max=1.859375
CLIPMLP input=0 dtype=torch.bfloat16 min=-54.0 max=55.0
CLIPMLP output=0 dtype=torch.bfloat16 min=-4.84375 max=1.859375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-28.0 max=33.0
Linear input=0 dtype=torch.bfloat16 min=-3.265625 max=4.0
Linear output=0 dtype=torch.bfloat16 min=-3.265625 max=4.0
CLIPTextModelWithProjection input=0 dtype=torch.int64 min=320 max=49407
Embedding input=0 dtype=torch.int64 min=0 max=49407
Embedding output=0 dtype=torch.bfloat16 min=-0.6875 max=0.1474609375
Embedding input=0 dtype=torch.int64 min=0 max=76
Embedding output=0 dtype=torch.bfloat16 min=-0.6875 max=0.130859375
CLIPTextEmbeddings output=0 dtype=torch.bfloat16 min=-1.375 max=0.1650390625
LayerNorm input=0 dtype=torch.bfloat16 min=-1.375 max=0.1650390625
LayerNorm output=0 dtype=torch.bfloat16 min=-13.5 max=9.9375
Linear input=0 dtype=torch.bfloat16 min=-13.5 max=9.9375
Linear output=0 dtype=torch.bfloat16 min=-9.75 max=9.125
Linear input=0 dtype=torch.bfloat16 min=-13.5 max=9.9375
Linear output=0 dtype=torch.bfloat16 min=-6.6875 max=6.78125
Linear input=0 dtype=torch.bfloat16 min=-13.5 max=9.9375
Linear output=0 dtype=torch.bfloat16 min=-4.375 max=4.4375
Linear input=0 dtype=torch.bfloat16 min=-3.484375 max=4.40625
Linear output=0 dtype=torch.bfloat16 min=-0.6640625 max=0.67578125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.6640625 max=0.67578125
LayerNorm input=0 dtype=torch.bfloat16 min=-1.40625 max=0.6953125
LayerNorm output=0 dtype=torch.bfloat16 min=-33.5 max=16.625
Linear input=0 dtype=torch.bfloat16 min=-33.5 max=16.625
Linear output=0 dtype=torch.bfloat16 min=-10.875 max=9.5
GELUActivation input=0 dtype=torch.bfloat16 min=-10.875 max=9.5
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.5
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.5
Linear output=0 dtype=torch.bfloat16 min=-69.0 max=17.25
CLIPMLP input=0 dtype=torch.bfloat16 min=-33.5 max=16.625
CLIPMLP output=0 dtype=torch.bfloat16 min=-69.0 max=17.25
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-1.375 max=0.1650390625
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.0 max=17.375
LayerNorm input=0 dtype=torch.bfloat16 min=-69.0 max=17.375
LayerNorm output=0 dtype=torch.bfloat16 min=-16.0 max=8.4375
Linear input=0 dtype=torch.bfloat16 min=-16.0 max=8.4375
Linear output=0 dtype=torch.bfloat16 min=-11.4375 max=10.75
Linear input=0 dtype=torch.bfloat16 min=-16.0 max=8.4375
Linear output=0 dtype=torch.bfloat16 min=-6.875 max=7.375
Linear input=0 dtype=torch.bfloat16 min=-16.0 max=8.4375
Linear output=0 dtype=torch.bfloat16 min=-4.9375 max=5.21875
Linear input=0 dtype=torch.bfloat16 min=-2.90625 max=3.0
Linear output=0 dtype=torch.bfloat16 min=-0.345703125 max=0.283203125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.345703125 max=0.283203125
LayerNorm input=0 dtype=torch.bfloat16 min=-69.0 max=17.375
LayerNorm output=0 dtype=torch.bfloat16 min=-9.875 max=12.8125
Linear input=0 dtype=torch.bfloat16 min=-9.875 max=12.8125
Linear output=0 dtype=torch.bfloat16 min=-7.3125 max=5.34375
GELUActivation input=0 dtype=torch.bfloat16 min=-7.3125 max=5.34375
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.34375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.34375
Linear output=0 dtype=torch.bfloat16 min=-1.15625 max=1.421875
CLIPMLP input=0 dtype=torch.bfloat16 min=-9.875 max=12.8125
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.15625 max=1.421875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.0 max=17.375
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-68.5 max=17.625
LayerNorm input=0 dtype=torch.bfloat16 min=-68.5 max=17.625
LayerNorm output=0 dtype=torch.bfloat16 min=-16.75 max=10.875
Linear input=0 dtype=torch.bfloat16 min=-16.75 max=10.875
Linear output=0 dtype=torch.bfloat16 min=-3.859375 max=3.6875
Linear input=0 dtype=torch.bfloat16 min=-16.75 max=10.875
Linear output=0 dtype=torch.bfloat16 min=-3.546875 max=3.859375
Linear input=0 dtype=torch.bfloat16 min=-16.75 max=10.875
Linear output=0 dtype=torch.bfloat16 min=-2.234375 max=2.46875
Linear input=0 dtype=torch.bfloat16 min=-1.8828125 max=1.7734375
Linear output=0 dtype=torch.bfloat16 min=-1.265625 max=1.7265625
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.265625 max=1.7265625
LayerNorm input=0 dtype=torch.bfloat16 min=-68.5 max=18.375
LayerNorm output=0 dtype=torch.bfloat16 min=-5.21875 max=7.09375
Linear input=0 dtype=torch.bfloat16 min=-5.21875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-10.0625 max=4.75
GELUActivation input=0 dtype=torch.bfloat16 min=-10.0625 max=4.75
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.75
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.75
Linear output=0 dtype=torch.bfloat16 min=-1.3671875 max=2.109375
CLIPMLP input=0 dtype=torch.bfloat16 min=-5.21875 max=7.09375
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.3671875 max=2.109375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-68.5 max=17.625
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.5 max=18.0
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=18.0
LayerNorm output=0 dtype=torch.bfloat16 min=-19.375 max=16.75
Linear input=0 dtype=torch.bfloat16 min=-19.375 max=16.75
Linear output=0 dtype=torch.bfloat16 min=-7.0 max=6.03125
Linear input=0 dtype=torch.bfloat16 min=-19.375 max=16.75
Linear output=0 dtype=torch.bfloat16 min=-6.65625 max=7.25
Linear input=0 dtype=torch.bfloat16 min=-19.375 max=16.75
Linear output=0 dtype=torch.bfloat16 min=-4.84375 max=4.375
Linear input=0 dtype=torch.bfloat16 min=-4.78125 max=3.015625
Linear output=0 dtype=torch.bfloat16 min=-0.8515625 max=0.408203125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.8515625 max=0.408203125
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=18.0
LayerNorm output=0 dtype=torch.bfloat16 min=-6.59375 max=6.84375
Linear input=0 dtype=torch.bfloat16 min=-6.59375 max=6.84375
Linear output=0 dtype=torch.bfloat16 min=-10.1875 max=5.65625
GELUActivation input=0 dtype=torch.bfloat16 min=-10.1875 max=5.65625
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.65625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.65625
Linear output=0 dtype=torch.bfloat16 min=-0.490234375 max=1.03125
CLIPMLP input=0 dtype=torch.bfloat16 min=-6.59375 max=6.84375
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.490234375 max=1.03125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.5 max=18.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.5 max=17.5
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.5
LayerNorm output=0 dtype=torch.bfloat16 min=-18.75 max=18.625
Linear input=0 dtype=torch.bfloat16 min=-18.75 max=18.625
Linear output=0 dtype=torch.bfloat16 min=-6.84375 max=9.3125
Linear input=0 dtype=torch.bfloat16 min=-18.75 max=18.625
Linear output=0 dtype=torch.bfloat16 min=-6.6875 max=6.5
Linear input=0 dtype=torch.bfloat16 min=-18.75 max=18.625
Linear output=0 dtype=torch.bfloat16 min=-4.78125 max=5.28125
Linear input=0 dtype=torch.bfloat16 min=-4.75 max=4.53125
Linear output=0 dtype=torch.bfloat16 min=-1.28125 max=0.57421875
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.28125 max=0.57421875
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.5
LayerNorm output=0 dtype=torch.bfloat16 min=-7.9375 max=9.8125
Linear input=0 dtype=torch.bfloat16 min=-7.9375 max=9.8125
Linear output=0 dtype=torch.bfloat16 min=-7.5 max=5.9375
GELUActivation input=0 dtype=torch.bfloat16 min=-7.5 max=5.9375
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.9375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.9375
Linear output=0 dtype=torch.bfloat16 min=-0.5078125 max=1.265625
CLIPMLP input=0 dtype=torch.bfloat16 min=-7.9375 max=9.8125
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.5078125 max=1.265625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.5 max=17.5
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.5 max=17.0
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.0
LayerNorm output=0 dtype=torch.bfloat16 min=-22.25 max=24.0
Linear input=0 dtype=torch.bfloat16 min=-22.25 max=24.0
Linear output=0 dtype=torch.bfloat16 min=-8.375 max=8.25
Linear input=0 dtype=torch.bfloat16 min=-22.25 max=24.0
Linear output=0 dtype=torch.bfloat16 min=-5.1875 max=8.5
Linear input=0 dtype=torch.bfloat16 min=-22.25 max=24.0
Linear output=0 dtype=torch.bfloat16 min=-6.40625 max=6.40625
Linear input=0 dtype=torch.bfloat16 min=-5.875 max=5.90625
Linear output=0 dtype=torch.bfloat16 min=-1.171875 max=0.5625
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.171875 max=0.5625
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.0
LayerNorm output=0 dtype=torch.bfloat16 min=-11.375 max=14.0625
Linear input=0 dtype=torch.bfloat16 min=-11.375 max=14.0625
Linear output=0 dtype=torch.bfloat16 min=-6.46875 max=5.75
GELUActivation input=0 dtype=torch.bfloat16 min=-6.46875 max=5.75
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.75
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.75
Linear output=0 dtype=torch.bfloat16 min=-0.99609375 max=2.078125
CLIPMLP input=0 dtype=torch.bfloat16 min=-11.375 max=14.0625
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.99609375 max=2.078125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.5 max=17.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.5 max=17.125
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.125
LayerNorm output=0 dtype=torch.bfloat16 min=-19.375 max=19.25
Linear input=0 dtype=torch.bfloat16 min=-19.375 max=19.25
Linear output=0 dtype=torch.bfloat16 min=-7.09375 max=7.75
Linear input=0 dtype=torch.bfloat16 min=-19.375 max=19.25
Linear output=0 dtype=torch.bfloat16 min=-7.4375 max=5.125
Linear input=0 dtype=torch.bfloat16 min=-19.375 max=19.25
Linear output=0 dtype=torch.bfloat16 min=-3.84375 max=12.1875
Linear input=0 dtype=torch.bfloat16 min=-3.1875 max=10.8125
Linear output=0 dtype=torch.bfloat16 min=-1.0546875 max=0.9921875
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.0546875 max=0.9921875
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.125
LayerNorm output=0 dtype=torch.bfloat16 min=-9.5 max=18.5
Linear input=0 dtype=torch.bfloat16 min=-9.5 max=18.5
Linear output=0 dtype=torch.bfloat16 min=-7.53125 max=4.34375
GELUActivation input=0 dtype=torch.bfloat16 min=-7.53125 max=4.34375
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.34375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.34375
Linear output=0 dtype=torch.bfloat16 min=-0.96875 max=1.0859375
CLIPMLP input=0 dtype=torch.bfloat16 min=-9.5 max=18.5
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.96875 max=1.0859375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.5 max=17.125
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.5 max=17.25
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.25
LayerNorm output=0 dtype=torch.bfloat16 min=-20.25 max=19.125
Linear input=0 dtype=torch.bfloat16 min=-20.25 max=19.125
Linear output=0 dtype=torch.bfloat16 min=-7.78125 max=7.0625
Linear input=0 dtype=torch.bfloat16 min=-20.25 max=19.125
Linear output=0 dtype=torch.bfloat16 min=-6.0 max=5.96875
Linear input=0 dtype=torch.bfloat16 min=-20.25 max=19.125
Linear output=0 dtype=torch.bfloat16 min=-3.90625 max=3.21875
Linear input=0 dtype=torch.bfloat16 min=-3.34375 max=2.734375
Linear output=0 dtype=torch.bfloat16 min=-0.78515625 max=0.9765625
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.78515625 max=0.9765625
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.25
LayerNorm output=0 dtype=torch.bfloat16 min=-11.3125 max=7.84375
Linear input=0 dtype=torch.bfloat16 min=-11.3125 max=7.84375
Linear output=0 dtype=torch.bfloat16 min=-6.71875 max=5.9375
GELUActivation input=0 dtype=torch.bfloat16 min=-6.71875 max=5.9375
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.9375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.9375
Linear output=0 dtype=torch.bfloat16 min=-1.078125 max=1.3125
CLIPMLP input=0 dtype=torch.bfloat16 min=-11.3125 max=7.84375
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.078125 max=1.3125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.5 max=17.25
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.5 max=17.25
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.25
LayerNorm output=0 dtype=torch.bfloat16 min=-12.5 max=15.8125
Linear input=0 dtype=torch.bfloat16 min=-12.5 max=15.8125
Linear output=0 dtype=torch.bfloat16 min=-7.78125 max=6.53125
Linear input=0 dtype=torch.bfloat16 min=-12.5 max=15.8125
Linear output=0 dtype=torch.bfloat16 min=-3.734375 max=4.71875
Linear input=0 dtype=torch.bfloat16 min=-12.5 max=15.8125
Linear output=0 dtype=torch.bfloat16 min=-2.796875 max=3.5
Linear input=0 dtype=torch.bfloat16 min=-2.640625 max=2.3125
Linear output=0 dtype=torch.bfloat16 min=-1.15625 max=0.75390625
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.15625 max=0.75390625
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.25
LayerNorm output=0 dtype=torch.bfloat16 min=-8.4375 max=6.75
Linear input=0 dtype=torch.bfloat16 min=-8.4375 max=6.75
Linear output=0 dtype=torch.bfloat16 min=-6.96875 max=4.625
GELUActivation input=0 dtype=torch.bfloat16 min=-6.96875 max=4.625
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.625
Linear output=0 dtype=torch.bfloat16 min=-0.62109375 max=1.078125
CLIPMLP input=0 dtype=torch.bfloat16 min=-8.4375 max=6.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.62109375 max=1.078125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.5 max=17.25
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.0 max=17.25
LayerNorm input=0 dtype=torch.bfloat16 min=-69.0 max=17.25
LayerNorm output=0 dtype=torch.bfloat16 min=-10.875 max=13.3125
Linear input=0 dtype=torch.bfloat16 min=-10.875 max=13.3125
Linear output=0 dtype=torch.bfloat16 min=-7.78125 max=6.78125
Linear input=0 dtype=torch.bfloat16 min=-10.875 max=13.3125
Linear output=0 dtype=torch.bfloat16 min=-4.9375 max=3.953125
Linear input=0 dtype=torch.bfloat16 min=-10.875 max=13.3125
Linear output=0 dtype=torch.bfloat16 min=-2.859375 max=2.671875
Linear input=0 dtype=torch.bfloat16 min=-1.9765625 max=2.0
Linear output=0 dtype=torch.bfloat16 min=-0.68359375 max=0.43359375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.68359375 max=0.43359375
LayerNorm input=0 dtype=torch.bfloat16 min=-69.0 max=17.375
LayerNorm output=0 dtype=torch.bfloat16 min=-8.8125 max=8.875
Linear input=0 dtype=torch.bfloat16 min=-8.8125 max=8.875
Linear output=0 dtype=torch.bfloat16 min=-7.125 max=5.0625
GELUActivation input=0 dtype=torch.bfloat16 min=-7.125 max=5.0625
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.0625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.0625
Linear output=0 dtype=torch.bfloat16 min=-1.1640625 max=1.234375
CLIPMLP input=0 dtype=torch.bfloat16 min=-8.8125 max=8.875
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.1640625 max=1.234375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.0 max=17.25
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.0 max=17.375
LayerNorm input=0 dtype=torch.bfloat16 min=-69.0 max=17.375
LayerNorm output=0 dtype=torch.bfloat16 min=-11.0 max=15.5625
Linear input=0 dtype=torch.bfloat16 min=-11.0 max=15.5625
Linear output=0 dtype=torch.bfloat16 min=-7.6875 max=6.96875
Linear input=0 dtype=torch.bfloat16 min=-11.0 max=15.5625
Linear output=0 dtype=torch.bfloat16 min=-4.75 max=4.28125
Linear input=0 dtype=torch.bfloat16 min=-11.0 max=15.5625
Linear output=0 dtype=torch.bfloat16 min=-4.15625 max=3.0625
Linear input=0 dtype=torch.bfloat16 min=-2.0 max=1.9765625
Linear output=0 dtype=torch.bfloat16 min=-0.99609375 max=0.671875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.99609375 max=0.671875
LayerNorm input=0 dtype=torch.bfloat16 min=-69.0 max=17.5
LayerNorm output=0 dtype=torch.bfloat16 min=-9.0625 max=7.4375
Linear input=0 dtype=torch.bfloat16 min=-9.0625 max=7.4375
Linear output=0 dtype=torch.bfloat16 min=-6.8125 max=5.03125
GELUActivation input=0 dtype=torch.bfloat16 min=-6.8125 max=5.03125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.03125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.03125
Linear output=0 dtype=torch.bfloat16 min=-0.8828125 max=0.87109375
CLIPMLP input=0 dtype=torch.bfloat16 min=-9.0625 max=7.4375
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.8828125 max=0.87109375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.0 max=17.375
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-68.5 max=17.5
LayerNorm input=0 dtype=torch.bfloat16 min=-68.5 max=17.5
LayerNorm output=0 dtype=torch.bfloat16 min=-8.375 max=13.375
Linear input=0 dtype=torch.bfloat16 min=-8.375 max=13.375
Linear output=0 dtype=torch.bfloat16 min=-6.78125 max=6.375
Linear input=0 dtype=torch.bfloat16 min=-8.375 max=13.375
Linear output=0 dtype=torch.bfloat16 min=-5.0625 max=6.65625
Linear input=0 dtype=torch.bfloat16 min=-8.375 max=13.375
Linear output=0 dtype=torch.bfloat16 min=-3.5 max=3.203125
Linear input=0 dtype=torch.bfloat16 min=-2.21875 max=1.5234375
Linear output=0 dtype=torch.bfloat16 min=-0.77734375 max=0.8046875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.77734375 max=0.8046875
LayerNorm input=0 dtype=torch.bfloat16 min=-68.5 max=17.5
LayerNorm output=0 dtype=torch.bfloat16 min=-8.25 max=7.59375
Linear input=0 dtype=torch.bfloat16 min=-8.25 max=7.59375
Linear output=0 dtype=torch.bfloat16 min=-7.46875 max=4.53125
GELUActivation input=0 dtype=torch.bfloat16 min=-7.46875 max=4.53125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.53125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.53125
Linear output=0 dtype=torch.bfloat16 min=-0.71484375 max=1.75
CLIPMLP input=0 dtype=torch.bfloat16 min=-8.25 max=7.59375
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.71484375 max=1.75
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-68.5 max=17.5
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-68.0 max=17.5
LayerNorm input=0 dtype=torch.bfloat16 min=-68.0 max=17.5
LayerNorm output=0 dtype=torch.bfloat16 min=-9.0 max=15.1875
Linear input=0 dtype=torch.bfloat16 min=-9.0 max=15.1875
Linear output=0 dtype=torch.bfloat16 min=-6.40625 max=6.8125
Linear input=0 dtype=torch.bfloat16 min=-9.0 max=15.1875
Linear output=0 dtype=torch.bfloat16 min=-4.96875 max=5.6875
Linear input=0 dtype=torch.bfloat16 min=-9.0 max=15.1875
Linear output=0 dtype=torch.bfloat16 min=-2.578125 max=3.234375
Linear input=0 dtype=torch.bfloat16 min=-2.390625 max=2.890625
Linear output=0 dtype=torch.bfloat16 min=-0.73046875 max=0.80078125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.73046875 max=0.80078125
LayerNorm input=0 dtype=torch.bfloat16 min=-68.0 max=17.625
LayerNorm output=0 dtype=torch.bfloat16 min=-9.5 max=8.3125
Linear input=0 dtype=torch.bfloat16 min=-9.5 max=8.3125
Linear output=0 dtype=torch.bfloat16 min=-9.0 max=5.8125
GELUActivation input=0 dtype=torch.bfloat16 min=-9.0 max=5.8125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.8125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.8125
Linear output=0 dtype=torch.bfloat16 min=-0.828125 max=0.91015625
CLIPMLP input=0 dtype=torch.bfloat16 min=-9.5 max=8.3125
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.828125 max=0.91015625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-68.0 max=17.5
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=17.625
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=17.625
LayerNorm output=0 dtype=torch.bfloat16 min=-9.5 max=15.0
Linear input=0 dtype=torch.bfloat16 min=-9.5 max=15.0
Linear output=0 dtype=torch.bfloat16 min=-5.9375 max=6.5
Linear input=0 dtype=torch.bfloat16 min=-9.5 max=15.0
Linear output=0 dtype=torch.bfloat16 min=-6.125 max=6.21875
Linear input=0 dtype=torch.bfloat16 min=-9.5 max=15.0
Linear output=0 dtype=torch.bfloat16 min=-4.21875 max=4.09375
Linear input=0 dtype=torch.bfloat16 min=-1.828125 max=3.53125
Linear output=0 dtype=torch.bfloat16 min=-0.54296875 max=0.5546875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.54296875 max=0.5546875
LayerNorm input=0 dtype=torch.bfloat16 min=-67.0 max=17.625
LayerNorm output=0 dtype=torch.bfloat16 min=-9.4375 max=8.8125
Linear input=0 dtype=torch.bfloat16 min=-9.4375 max=8.8125
Linear output=0 dtype=torch.bfloat16 min=-8.375 max=3.25
GELUActivation input=0 dtype=torch.bfloat16 min=-8.375 max=3.25
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.25
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.25
Linear output=0 dtype=torch.bfloat16 min=-1.2109375 max=1.046875
CLIPMLP input=0 dtype=torch.bfloat16 min=-9.4375 max=8.8125
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.2109375 max=1.046875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=17.625
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=17.625
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=17.625
LayerNorm output=0 dtype=torch.bfloat16 min=-10.375 max=15.5625
Linear input=0 dtype=torch.bfloat16 min=-10.375 max=15.5625
Linear output=0 dtype=torch.bfloat16 min=-6.96875 max=5.53125
Linear input=0 dtype=torch.bfloat16 min=-10.375 max=15.5625
Linear output=0 dtype=torch.bfloat16 min=-6.75 max=6.5
Linear input=0 dtype=torch.bfloat16 min=-10.375 max=15.5625
Linear output=0 dtype=torch.bfloat16 min=-3.640625 max=3.125
Linear input=0 dtype=torch.bfloat16 min=-1.578125 max=1.765625
Linear output=0 dtype=torch.bfloat16 min=-0.6875 max=0.84375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.6875 max=0.84375
LayerNorm input=0 dtype=torch.bfloat16 min=-67.0 max=17.625
LayerNorm output=0 dtype=torch.bfloat16 min=-14.6875 max=11.0625
Linear input=0 dtype=torch.bfloat16 min=-14.6875 max=11.0625
Linear output=0 dtype=torch.bfloat16 min=-6.59375 max=6.75
GELUActivation input=0 dtype=torch.bfloat16 min=-6.59375 max=6.75
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=6.75
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.75
Linear output=0 dtype=torch.bfloat16 min=-4.53125 max=0.8203125
CLIPMLP input=0 dtype=torch.bfloat16 min=-14.6875 max=11.0625
CLIPMLP output=0 dtype=torch.bfloat16 min=-4.53125 max=0.8203125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=17.625
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-68.0 max=17.875
LayerNorm input=0 dtype=torch.bfloat16 min=-68.0 max=17.875
LayerNorm output=0 dtype=torch.bfloat16 min=-9.1875 max=19.125
Linear input=0 dtype=torch.bfloat16 min=-9.1875 max=19.125
Linear output=0 dtype=torch.bfloat16 min=-6.0 max=6.0625
Linear input=0 dtype=torch.bfloat16 min=-9.1875 max=19.125
Linear output=0 dtype=torch.bfloat16 min=-5.84375 max=5.46875
Linear input=0 dtype=torch.bfloat16 min=-9.1875 max=19.125
Linear output=0 dtype=torch.bfloat16 min=-2.671875 max=3.296875
Linear input=0 dtype=torch.bfloat16 min=-1.703125 max=1.9453125
Linear output=0 dtype=torch.bfloat16 min=-0.5234375 max=1.0234375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.5234375 max=1.0234375
LayerNorm input=0 dtype=torch.bfloat16 min=-67.0 max=17.875
LayerNorm output=0 dtype=torch.bfloat16 min=-13.8125 max=9.125
Linear input=0 dtype=torch.bfloat16 min=-13.8125 max=9.125
Linear output=0 dtype=torch.bfloat16 min=-7.75 max=3.578125
GELUActivation input=0 dtype=torch.bfloat16 min=-7.75 max=3.578125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.578125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.578125
Linear output=0 dtype=torch.bfloat16 min=-0.90234375 max=0.84375
CLIPMLP input=0 dtype=torch.bfloat16 min=-13.8125 max=9.125
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.90234375 max=0.84375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-68.0 max=17.875
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=18.0
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=18.0
LayerNorm output=0 dtype=torch.bfloat16 min=-10.0625 max=23.0
Linear input=0 dtype=torch.bfloat16 min=-10.0625 max=23.0
Linear output=0 dtype=torch.bfloat16 min=-5.96875 max=6.6875
Linear input=0 dtype=torch.bfloat16 min=-10.0625 max=23.0
Linear output=0 dtype=torch.bfloat16 min=-6.6875 max=7.375
Linear input=0 dtype=torch.bfloat16 min=-10.0625 max=23.0
Linear output=0 dtype=torch.bfloat16 min=-2.515625 max=3.046875
Linear input=0 dtype=torch.bfloat16 min=-1.7109375 max=1.7265625
Linear output=0 dtype=torch.bfloat16 min=-0.44140625 max=0.5078125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.44140625 max=0.5078125
LayerNorm input=0 dtype=torch.bfloat16 min=-67.0 max=18.0
LayerNorm output=0 dtype=torch.bfloat16 min=-12.8125 max=8.5
Linear input=0 dtype=torch.bfloat16 min=-12.8125 max=8.5
Linear output=0 dtype=torch.bfloat16 min=-7.9375 max=4.9375
GELUActivation input=0 dtype=torch.bfloat16 min=-7.9375 max=4.9375
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.9375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.9375
Linear output=0 dtype=torch.bfloat16 min=-1.03125 max=0.6953125
CLIPMLP input=0 dtype=torch.bfloat16 min=-12.8125 max=8.5
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.03125 max=0.6953125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=18.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=18.125
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=18.125
LayerNorm output=0 dtype=torch.bfloat16 min=-10.6875 max=21.875
Linear input=0 dtype=torch.bfloat16 min=-10.6875 max=21.875
Linear output=0 dtype=torch.bfloat16 min=-5.9375 max=5.59375
Linear input=0 dtype=torch.bfloat16 min=-10.6875 max=21.875
Linear output=0 dtype=torch.bfloat16 min=-8.3125 max=8.5
Linear input=0 dtype=torch.bfloat16 min=-10.6875 max=21.875
Linear output=0 dtype=torch.bfloat16 min=-3.03125 max=2.859375
Linear input=0 dtype=torch.bfloat16 min=-1.2265625 max=1.3984375
Linear output=0 dtype=torch.bfloat16 min=-0.51171875 max=1.328125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.51171875 max=1.328125
LayerNorm input=0 dtype=torch.bfloat16 min=-66.5 max=18.125
LayerNorm output=0 dtype=torch.bfloat16 min=-12.625 max=9.0625
Linear input=0 dtype=torch.bfloat16 min=-12.625 max=9.0625
Linear output=0 dtype=torch.bfloat16 min=-8.1875 max=4.125
GELUActivation input=0 dtype=torch.bfloat16 min=-8.1875 max=4.125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.125
Linear output=0 dtype=torch.bfloat16 min=-1.4375 max=0.6875
CLIPMLP input=0 dtype=torch.bfloat16 min=-12.625 max=9.0625
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.4375 max=0.6875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=18.125
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-68.0 max=18.25
LayerNorm input=0 dtype=torch.bfloat16 min=-68.0 max=18.25
LayerNorm output=0 dtype=torch.bfloat16 min=-11.1875 max=22.625
Linear input=0 dtype=torch.bfloat16 min=-11.1875 max=22.625
Linear output=0 dtype=torch.bfloat16 min=-6.875 max=5.84375
Linear input=0 dtype=torch.bfloat16 min=-11.1875 max=22.625
Linear output=0 dtype=torch.bfloat16 min=-7.3125 max=6.84375
Linear input=0 dtype=torch.bfloat16 min=-11.1875 max=22.625
Linear output=0 dtype=torch.bfloat16 min=-4.25 max=3.28125
Linear input=0 dtype=torch.bfloat16 min=-1.71875 max=2.203125
Linear output=0 dtype=torch.bfloat16 min=-0.78515625 max=1.15625
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.78515625 max=1.15625
LayerNorm input=0 dtype=torch.bfloat16 min=-67.0 max=18.25
LayerNorm output=0 dtype=torch.bfloat16 min=-12.3125 max=8.75
Linear input=0 dtype=torch.bfloat16 min=-12.3125 max=8.75
Linear output=0 dtype=torch.bfloat16 min=-8.875 max=3.3125
GELUActivation input=0 dtype=torch.bfloat16 min=-8.875 max=3.3125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.3125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.3125
Linear output=0 dtype=torch.bfloat16 min=-1.3515625 max=0.5859375
CLIPMLP input=0 dtype=torch.bfloat16 min=-12.3125 max=8.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.3515625 max=0.5859375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-68.0 max=18.25
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=18.375
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=18.375
LayerNorm output=0 dtype=torch.bfloat16 min=-13.3125 max=21.375
Linear input=0 dtype=torch.bfloat16 min=-13.3125 max=21.375
Linear output=0 dtype=torch.bfloat16 min=-6.8125 max=6.5625
Linear input=0 dtype=torch.bfloat16 min=-13.3125 max=21.375
Linear output=0 dtype=torch.bfloat16 min=-7.15625 max=8.1875
Linear input=0 dtype=torch.bfloat16 min=-13.3125 max=21.375
Linear output=0 dtype=torch.bfloat16 min=-2.9375 max=3.046875
Linear input=0 dtype=torch.bfloat16 min=-1.1328125 max=1.5234375
Linear output=0 dtype=torch.bfloat16 min=-0.6640625 max=1.609375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.6640625 max=1.609375
LayerNorm input=0 dtype=torch.bfloat16 min=-66.0 max=18.375
LayerNorm output=0 dtype=torch.bfloat16 min=-13.1875 max=11.1875
Linear input=0 dtype=torch.bfloat16 min=-13.1875 max=11.1875
Linear output=0 dtype=torch.bfloat16 min=-8.6875 max=4.0625
GELUActivation input=0 dtype=torch.bfloat16 min=-8.6875 max=4.0625
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.0625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.0625
Linear output=0 dtype=torch.bfloat16 min=-2.21875 max=0.67578125
CLIPMLP input=0 dtype=torch.bfloat16 min=-13.1875 max=11.1875
CLIPMLP output=0 dtype=torch.bfloat16 min=-2.21875 max=0.67578125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=18.375
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=18.5
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=18.5
LayerNorm output=0 dtype=torch.bfloat16 min=-17.625 max=19.125
Linear input=0 dtype=torch.bfloat16 min=-17.625 max=19.125
Linear output=0 dtype=torch.bfloat16 min=-7.0625 max=8.375
Linear input=0 dtype=torch.bfloat16 min=-17.625 max=19.125
Linear output=0 dtype=torch.bfloat16 min=-9.25 max=7.78125
Linear input=0 dtype=torch.bfloat16 min=-17.625 max=19.125
Linear output=0 dtype=torch.bfloat16 min=-3.140625 max=3.515625
Linear input=0 dtype=torch.bfloat16 min=-1.2578125 max=1.796875
Linear output=0 dtype=torch.bfloat16 min=-0.59375 max=2.4375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.59375 max=2.4375
LayerNorm input=0 dtype=torch.bfloat16 min=-65.0 max=18.625
LayerNorm output=0 dtype=torch.bfloat16 min=-27.0 max=16.625
Linear input=0 dtype=torch.bfloat16 min=-27.0 max=16.625
Linear output=0 dtype=torch.bfloat16 min=-8.4375 max=6.25
GELUActivation input=0 dtype=torch.bfloat16 min=-8.4375 max=6.25
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=6.25
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.25
Linear output=0 dtype=torch.bfloat16 min=-4.59375 max=0.71875
CLIPMLP input=0 dtype=torch.bfloat16 min=-27.0 max=16.625
CLIPMLP output=0 dtype=torch.bfloat16 min=-4.59375 max=0.71875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=18.5
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=18.75
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-12.875 max=25.5
Linear input=0 dtype=torch.bfloat16 min=-12.875 max=25.5
Linear output=0 dtype=torch.bfloat16 min=-7.3125 max=8.125
Linear input=0 dtype=torch.bfloat16 min=-12.875 max=25.5
Linear output=0 dtype=torch.bfloat16 min=-9.875 max=9.0
Linear input=0 dtype=torch.bfloat16 min=-12.875 max=25.5
Linear output=0 dtype=torch.bfloat16 min=-3.40625 max=2.859375
Linear input=0 dtype=torch.bfloat16 min=-0.8671875 max=0.69140625
Linear output=0 dtype=torch.bfloat16 min=-0.373046875 max=2.25
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.373046875 max=2.25
LayerNorm input=0 dtype=torch.bfloat16 min=-65.5 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-19.625 max=18.75
Linear input=0 dtype=torch.bfloat16 min=-19.625 max=18.75
Linear output=0 dtype=torch.bfloat16 min=-7.03125 max=3.25
GELUActivation input=0 dtype=torch.bfloat16 min=-7.03125 max=3.25
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.25
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.25
Linear output=0 dtype=torch.bfloat16 min=-6.03125 max=0.86328125
CLIPMLP input=0 dtype=torch.bfloat16 min=-19.625 max=18.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-6.03125 max=0.86328125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=18.75
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-66.5 max=18.875
LayerNorm input=0 dtype=torch.bfloat16 min=-66.5 max=18.875
LayerNorm output=0 dtype=torch.bfloat16 min=-13.1875 max=26.0
Linear input=0 dtype=torch.bfloat16 min=-13.1875 max=26.0
Linear output=0 dtype=torch.bfloat16 min=-7.46875 max=8.0625
Linear input=0 dtype=torch.bfloat16 min=-13.1875 max=26.0
Linear output=0 dtype=torch.bfloat16 min=-8.75 max=9.25
Linear input=0 dtype=torch.bfloat16 min=-13.1875 max=26.0
Linear output=0 dtype=torch.bfloat16 min=-4.84375 max=4.125
Linear input=0 dtype=torch.bfloat16 min=-4.6875 max=4.0
Linear output=0 dtype=torch.bfloat16 min=-0.68359375 max=2.046875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.68359375 max=2.046875
LayerNorm input=0 dtype=torch.bfloat16 min=-64.5 max=18.875
LayerNorm output=0 dtype=torch.bfloat16 min=-11.9375 max=11.75
Linear input=0 dtype=torch.bfloat16 min=-11.9375 max=11.75
Linear output=0 dtype=torch.bfloat16 min=-11.0625 max=2.03125
GELUActivation input=0 dtype=torch.bfloat16 min=-11.0625 max=2.03125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=1.9921875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=1.9921875
Linear output=0 dtype=torch.bfloat16 min=-2.328125 max=0.765625
CLIPMLP input=0 dtype=torch.bfloat16 min=-11.9375 max=11.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-2.328125 max=0.765625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-66.5 max=18.875
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-66.5 max=19.0
LayerNorm input=0 dtype=torch.bfloat16 min=-66.5 max=19.0
LayerNorm output=0 dtype=torch.bfloat16 min=-11.5625 max=29.375
Linear input=0 dtype=torch.bfloat16 min=-11.5625 max=29.375
Linear output=0 dtype=torch.bfloat16 min=-7.71875 max=7.5625
Linear input=0 dtype=torch.bfloat16 min=-11.5625 max=29.375
Linear output=0 dtype=torch.bfloat16 min=-9.25 max=8.9375
Linear input=0 dtype=torch.bfloat16 min=-11.5625 max=29.375
Linear output=0 dtype=torch.bfloat16 min=-4.15625 max=4.90625
Linear input=0 dtype=torch.bfloat16 min=-1.8359375 max=1.546875
Linear output=0 dtype=torch.bfloat16 min=-0.66015625 max=1.453125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.66015625 max=1.453125
LayerNorm input=0 dtype=torch.bfloat16 min=-65.0 max=19.0
LayerNorm output=0 dtype=torch.bfloat16 min=-19.5 max=17.875
Linear input=0 dtype=torch.bfloat16 min=-19.5 max=17.875
Linear output=0 dtype=torch.bfloat16 min=-6.875 max=2.796875
GELUActivation input=0 dtype=torch.bfloat16 min=-6.875 max=2.796875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.796875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.796875
Linear output=0 dtype=torch.bfloat16 min=-3.28125 max=1.1015625
CLIPMLP input=0 dtype=torch.bfloat16 min=-19.5 max=17.875
CLIPMLP output=0 dtype=torch.bfloat16 min=-3.28125 max=1.1015625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-66.5 max=19.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-65.5 max=18.75
LayerNorm input=0 dtype=torch.bfloat16 min=-65.5 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-14.625 max=24.375
Linear input=0 dtype=torch.bfloat16 min=-14.625 max=24.375
Linear output=0 dtype=torch.bfloat16 min=-7.71875 max=7.625
Linear input=0 dtype=torch.bfloat16 min=-14.625 max=24.375
Linear output=0 dtype=torch.bfloat16 min=-9.875 max=9.4375
Linear input=0 dtype=torch.bfloat16 min=-14.625 max=24.375
Linear output=0 dtype=torch.bfloat16 min=-3.8125 max=3.484375
Linear input=0 dtype=torch.bfloat16 min=-2.703125 max=3.296875
Linear output=0 dtype=torch.bfloat16 min=-0.75 max=1.2265625
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.75 max=1.2265625
LayerNorm input=0 dtype=torch.bfloat16 min=-64.5 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-11.8125 max=12.125
Linear input=0 dtype=torch.bfloat16 min=-11.8125 max=12.125
Linear output=0 dtype=torch.bfloat16 min=-9.625 max=2.140625
GELUActivation input=0 dtype=torch.bfloat16 min=-9.625 max=2.140625
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.109375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.109375
Linear output=0 dtype=torch.bfloat16 min=-1.8828125 max=0.90625
CLIPMLP input=0 dtype=torch.bfloat16 min=-11.8125 max=12.125
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.8828125 max=0.90625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-65.5 max=18.75
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-65.5 max=18.75
LayerNorm input=0 dtype=torch.bfloat16 min=-65.5 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-16.5 max=26.75
Linear input=0 dtype=torch.bfloat16 min=-16.5 max=26.75
Linear output=0 dtype=torch.bfloat16 min=-8.25 max=7.78125
Linear input=0 dtype=torch.bfloat16 min=-16.5 max=26.75
Linear output=0 dtype=torch.bfloat16 min=-9.5625 max=9.6875
Linear input=0 dtype=torch.bfloat16 min=-16.5 max=26.75
Linear output=0 dtype=torch.bfloat16 min=-3.015625 max=2.9375
Linear input=0 dtype=torch.bfloat16 min=-0.91796875 max=1.15625
Linear output=0 dtype=torch.bfloat16 min=-0.64453125 max=1.6328125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.64453125 max=1.6328125
LayerNorm input=0 dtype=torch.bfloat16 min=-64.0 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-11.125 max=12.1875
Linear input=0 dtype=torch.bfloat16 min=-11.125 max=12.1875
Linear output=0 dtype=torch.bfloat16 min=-8.25 max=3.96875
GELUActivation input=0 dtype=torch.bfloat16 min=-8.25 max=3.96875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.96875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.96875
Linear output=0 dtype=torch.bfloat16 min=-1.1875 max=0.796875
CLIPMLP input=0 dtype=torch.bfloat16 min=-11.125 max=12.1875
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.1875 max=0.796875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-65.5 max=18.75
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-64.0 max=18.75
LayerNorm input=0 dtype=torch.bfloat16 min=-64.0 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-12.6875 max=25.75
Linear input=0 dtype=torch.bfloat16 min=-12.6875 max=25.75
Linear output=0 dtype=torch.bfloat16 min=-6.96875 max=7.71875
Linear input=0 dtype=torch.bfloat16 min=-12.6875 max=25.75
Linear output=0 dtype=torch.bfloat16 min=-9.125 max=8.9375
Linear input=0 dtype=torch.bfloat16 min=-12.6875 max=25.75
Linear output=0 dtype=torch.bfloat16 min=-3.890625 max=4.1875
Linear input=0 dtype=torch.bfloat16 min=-3.828125 max=3.15625
Linear output=0 dtype=torch.bfloat16 min=-0.70703125 max=1.0
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.70703125 max=1.0
LayerNorm input=0 dtype=torch.bfloat16 min=-63.75 max=18.875
LayerNorm output=0 dtype=torch.bfloat16 min=-10.125 max=18.75
Linear input=0 dtype=torch.bfloat16 min=-10.125 max=18.75
Linear output=0 dtype=torch.bfloat16 min=-8.125 max=3.015625
GELUActivation input=0 dtype=torch.bfloat16 min=-8.125 max=3.015625
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.015625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.015625
Linear output=0 dtype=torch.bfloat16 min=-2.671875 max=1.109375
CLIPMLP input=0 dtype=torch.bfloat16 min=-10.125 max=18.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-2.671875 max=1.109375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-64.0 max=18.75
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-63.5 max=18.875
LayerNorm input=0 dtype=torch.bfloat16 min=-63.5 max=18.875
LayerNorm output=0 dtype=torch.bfloat16 min=-16.25 max=20.625
Linear input=0 dtype=torch.bfloat16 min=-16.25 max=20.625
Linear output=0 dtype=torch.bfloat16 min=-6.5 max=7.84375
Linear input=0 dtype=torch.bfloat16 min=-16.25 max=20.625
Linear output=0 dtype=torch.bfloat16 min=-8.9375 max=8.5625
Linear input=0 dtype=torch.bfloat16 min=-16.25 max=20.625
Linear output=0 dtype=torch.bfloat16 min=-2.796875 max=3.640625
Linear input=0 dtype=torch.bfloat16 min=-1.953125 max=2.5625
Linear output=0 dtype=torch.bfloat16 min=-1.4375 max=1.7421875
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.4375 max=1.7421875
LayerNorm input=0 dtype=torch.bfloat16 min=-65.0 max=19.125
LayerNorm output=0 dtype=torch.bfloat16 min=-13.5 max=20.625
Linear input=0 dtype=torch.bfloat16 min=-13.5 max=20.625
Linear output=0 dtype=torch.bfloat16 min=-7.03125 max=4.78125
GELUActivation input=0 dtype=torch.bfloat16 min=-7.03125 max=4.78125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.78125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.78125
Linear output=0 dtype=torch.bfloat16 min=-2.6875 max=5.03125
CLIPMLP input=0 dtype=torch.bfloat16 min=-13.5 max=20.625
CLIPMLP output=0 dtype=torch.bfloat16 min=-2.6875 max=5.03125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-63.5 max=18.875
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-60.0 max=19.125
LayerNorm input=0 dtype=torch.bfloat16 min=-60.0 max=19.125
LayerNorm output=0 dtype=torch.bfloat16 min=-13.25 max=25.25
Linear input=0 dtype=torch.bfloat16 min=-13.25 max=25.25
Linear output=0 dtype=torch.bfloat16 min=-6.5 max=6.15625
Linear input=0 dtype=torch.bfloat16 min=-13.25 max=25.25
Linear output=0 dtype=torch.bfloat16 min=-9.125 max=8.0625
Linear input=0 dtype=torch.bfloat16 min=-13.25 max=25.25
Linear output=0 dtype=torch.bfloat16 min=-3.265625 max=3.4375
Linear input=0 dtype=torch.bfloat16 min=-1.78125 max=1.8046875
Linear output=0 dtype=torch.bfloat16 min=-2.640625 max=0.98046875
CLIPAttention output=0 dtype=torch.bfloat16 min=-2.640625 max=0.98046875
LayerNorm input=0 dtype=torch.bfloat16 min=-62.5 max=19.25
LayerNorm output=0 dtype=torch.bfloat16 min=-23.0 max=12.6875
Linear input=0 dtype=torch.bfloat16 min=-23.0 max=12.6875
Linear output=0 dtype=torch.bfloat16 min=-8.625 max=5.90625
GELUActivation input=0 dtype=torch.bfloat16 min=-8.625 max=5.90625
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.90625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.90625
Linear output=0 dtype=torch.bfloat16 min=-1.609375 max=5.53125
CLIPMLP input=0 dtype=torch.bfloat16 min=-23.0 max=12.6875
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.609375 max=5.53125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-60.0 max=19.125
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-59.25 max=19.125
LayerNorm input=0 dtype=torch.bfloat16 min=-59.25 max=19.125
LayerNorm output=0 dtype=torch.bfloat16 min=-12.0 max=24.625
Linear input=0 dtype=torch.bfloat16 min=-12.0 max=24.625
Linear output=0 dtype=torch.bfloat16 min=-6.28125 max=6.65625
Linear input=0 dtype=torch.bfloat16 min=-12.0 max=24.625
Linear output=0 dtype=torch.bfloat16 min=-7.46875 max=7.4375
Linear input=0 dtype=torch.bfloat16 min=-12.0 max=24.625
Linear output=0 dtype=torch.bfloat16 min=-2.8125 max=3.125
Linear input=0 dtype=torch.bfloat16 min=-2.265625 max=2.34375
Linear output=0 dtype=torch.bfloat16 min=-6.1875 max=2.03125
CLIPAttention output=0 dtype=torch.bfloat16 min=-6.1875 max=2.03125
LayerNorm input=0 dtype=torch.bfloat16 min=-65.5 max=19.125
LayerNorm output=0 dtype=torch.bfloat16 min=-25.25 max=17.25
Linear input=0 dtype=torch.bfloat16 min=-25.25 max=17.25
Linear output=0 dtype=torch.bfloat16 min=-8.25 max=3.8125
GELUActivation input=0 dtype=torch.bfloat16 min=-8.25 max=3.8125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.8125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.8125
Linear output=0 dtype=torch.bfloat16 min=-2.40625 max=5.90625
CLIPMLP input=0 dtype=torch.bfloat16 min=-25.25 max=17.25
CLIPMLP output=0 dtype=torch.bfloat16 min=-2.40625 max=5.90625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-59.25 max=19.125
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-59.5 max=18.75
LayerNorm input=0 dtype=torch.bfloat16 min=-59.5 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-15.375 max=21.25
Linear input=0 dtype=torch.bfloat16 min=-15.375 max=21.25
Linear output=0 dtype=torch.bfloat16 min=-6.4375 max=6.65625
Linear input=0 dtype=torch.bfloat16 min=-15.375 max=21.25
Linear output=0 dtype=torch.bfloat16 min=-5.5 max=5.9375
Linear input=0 dtype=torch.bfloat16 min=-15.375 max=21.25
Linear output=0 dtype=torch.bfloat16 min=-3.578125 max=4.25
Linear input=0 dtype=torch.bfloat16 min=-1.703125 max=1.4296875
Linear output=0 dtype=torch.bfloat16 min=-9.75 max=2.28125
CLIPAttention output=0 dtype=torch.bfloat16 min=-9.75 max=2.28125
LayerNorm input=0 dtype=torch.bfloat16 min=-68.0 max=18.875
LayerNorm output=0 dtype=torch.bfloat16 min=-24.5 max=22.0
Linear input=0 dtype=torch.bfloat16 min=-24.5 max=22.0
Linear output=0 dtype=torch.bfloat16 min=-6.96875 max=4.1875
GELUActivation input=0 dtype=torch.bfloat16 min=-6.96875 max=4.1875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.1875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-3.8125 max=11.1875
CLIPMLP input=0 dtype=torch.bfloat16 min=-24.5 max=22.0
CLIPMLP output=0 dtype=torch.bfloat16 min=-3.8125 max=11.1875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-59.5 max=18.75
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-66.5 max=18.125
LayerNorm input=0 dtype=torch.bfloat16 min=-66.5 max=18.125
LayerNorm output=0 dtype=torch.bfloat16 min=-12.4375 max=31.25
Linear input=0 dtype=torch.bfloat16 min=-12.4375 max=31.25
Linear output=0 dtype=torch.bfloat16 min=-6.5 max=7.0625
Linear input=0 dtype=torch.bfloat16 min=-12.4375 max=31.25
Linear output=0 dtype=torch.bfloat16 min=-6.90625 max=7.1875
Linear input=0 dtype=torch.bfloat16 min=-12.4375 max=31.25
Linear output=0 dtype=torch.bfloat16 min=-4.15625 max=2.84375
Linear input=0 dtype=torch.bfloat16 min=-3.671875 max=2.328125
Linear output=0 dtype=torch.bfloat16 min=-6.09375 max=4.21875
CLIPAttention output=0 dtype=torch.bfloat16 min=-6.09375 max=4.21875
LayerNorm input=0 dtype=torch.bfloat16 min=-72.0 max=18.125
LayerNorm output=0 dtype=torch.bfloat16 min=-15.125 max=18.5
Linear input=0 dtype=torch.bfloat16 min=-15.125 max=18.5
Linear output=0 dtype=torch.bfloat16 min=-6.28125 max=3.609375
GELUActivation input=0 dtype=torch.bfloat16 min=-6.28125 max=3.609375
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.609375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.609375
Linear output=0 dtype=torch.bfloat16 min=-6.75 max=8.6875
CLIPMLP input=0 dtype=torch.bfloat16 min=-15.125 max=18.5
CLIPMLP output=0 dtype=torch.bfloat16 min=-6.75 max=8.6875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-66.5 max=18.125
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=18.625
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=18.625
LayerNorm output=0 dtype=torch.bfloat16 min=-33.25 max=26.25
Linear input=0 dtype=torch.bfloat16 min=-5.0 max=6.53125
Linear output=0 dtype=torch.bfloat16 min=-4.21875 max=4.03125
CLIPTextModelWithProjection input=0 dtype=torch.int64 min=0 max=49407
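
The CLIP branch of the trace ends at the text projection above; everything below comes from the T5 encoder (text_encoder_3 in the SD3 pipeline), which is where the activation ranges start to grow. Lines of this shape are what a per-module PyTorch forward hook prints. A minimal sketch of such instrumentation follows; the helper names (trace_activations, _stats, _hook) are hypothetical and not from the original script, but the output format matches the lines in this log.

import torch
import torch.nn as nn

def _stats(prefix, idx, t):
    # One "Name input=i ..." / "Name output=i ..." line per tensor,
    # matching the format of this trace.
    if isinstance(t, torch.Tensor) and t.numel() > 0:
        print(f"{prefix}={idx} dtype={t.dtype} min={t.min().item()} max={t.max().item()}")

def _hook(module, inputs, output):
    name = module.__class__.__name__
    for i, t in enumerate(inputs):
        _stats(f"{name} input", i, t)
    outputs = output if isinstance(output, (tuple, list)) else (output,)
    for i, t in enumerate(outputs):
        _stats(f"{name} output", i, t)

def trace_activations(model: nn.Module):
    # Hook every submodule; keep the handles so the hooks can be removed later.
    return [m.register_forward_hook(_hook) for m in model.modules()]

Usage would be along the lines of handles = trace_activations(pipe.text_encoder_3), run one prompt through the pipeline, then h.remove() on each handle.
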
Embedding input=0 dtype=torch.int64 min=0 max=21820
Embedding output=0 dtype=torch.bfloat16 min=-102.0 max=231.0
Dropout input=0 dtype=torch.bfloat16 min=-102.0 max=231.0
Dropout output=0 dtype=torch.bfloat16 min=-102.0 max=231.0
T5LayerNorm input=0 dtype=torch.bfloat16 min=-102.0 max=231.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.109375 max=1.203125
Linear input=0 dtype=torch.bfloat16 min=-2.109375 max=1.203125
Linear output=0 dtype=torch.bfloat16 min=-0.52734375 max=0.490234375
Linear input=0 dtype=torch.bfloat16 min=-2.109375 max=1.203125
Linear output=0 dtype=torch.bfloat16 min=-4.9375 max=5.03125
Linear input=0 dtype=torch.bfloat16 min=-2.109375 max=1.203125
Linear output=0 dtype=torch.bfloat16 min=-4.15625 max=5.28125
Embedding input=0 dtype=torch.int64 min=0 max=30
Embedding output=0..76 dtype=torch.bfloat16 min=-47.25 max=11.1875 (77 outputs, identical min/max)
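
One plausible reading of this embedding, inferred from the trace rather than stated by it: the int64 input spanning 0 to 30 matches T5's relative position buckets (32 bins), so this looks like T5Attention's relative_attention_bias table being evaluated, with one output reported per position of the 77-token sequence; that all 77 slices share the same min/max is consistent with them drawing from the same small table.
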
Linear input=0 dtype=torch.bfloat16 min=-3.375 max=3.09375
Linear output=0 dtype=torch.bfloat16 min=-120.5 max=94.5
T5Attention input=0 dtype=torch.bfloat16 min=-2.109375 max=1.203125
T5Attention output=0 dtype=torch.bfloat16 min=-120.5 max=94.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-120.5 max=94.5
Dropout output=0 dtype=torch.bfloat16 min=-120.5 max=94.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-102.0 max=231.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-135.0 max=233.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-135.0 max=233.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.296875 max=2.296875
Linear input=0 dtype=torch.bfloat16 min=-2.296875 max=2.296875
Linear output=0 dtype=torch.bfloat16 min=-6.0625 max=6.4375
NewGELUActivation input=0 dtype=torch.bfloat16 min=-6.0625 max=6.4375
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=6.4375
Linear input=0 dtype=torch.bfloat16 min=-2.296875 max=2.296875
Linear output=0 dtype=torch.bfloat16 min=-5.625 max=7.65625
Dropout input=0 dtype=torch.bfloat16 min=-25.25 max=25.125
Dropout output=0 dtype=torch.bfloat16 min=-25.25 max=25.125
Linear input=0 dtype=torch.bfloat16 min=-25.25 max=25.125
Linear output=0 dtype=torch.bfloat16 min=-119.5 max=106.5
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.296875 max=2.296875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-119.5 max=106.5
Dropout input=0 dtype=torch.bfloat16 min=-119.5 max=106.5
Dropout output=0 dtype=torch.bfloat16 min=-119.5 max=106.5
T5LayerFF input=0 dtype=torch.bfloat16 min=-135.0 max=233.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-204.0 max=235.0
T5Block input=0 dtype=torch.bfloat16 min=-102.0 max=231.0
T5Block output=0 dtype=torch.bfloat16 min=-204.0 max=235.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-204.0 max=235.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-3.53125 max=5.96875
Linear input=0 dtype=torch.bfloat16 min=-3.53125 max=5.96875
Linear output=0 dtype=torch.bfloat16 min=-1.234375 max=1.109375
Linear input=0 dtype=torch.bfloat16 min=-3.53125 max=5.96875
Linear output=0 dtype=torch.bfloat16 min=-7.375 max=6.46875
Linear input=0 dtype=torch.bfloat16 min=-3.53125 max=5.96875
Linear output=0 dtype=torch.bfloat16 min=-3.5625 max=3.890625
Linear input=0 dtype=torch.bfloat16 min=-3.125 max=2.96875
Linear output=0 dtype=torch.bfloat16 min=-80.5 max=96.0
T5Attention input=0 dtype=torch.bfloat16 min=-3.53125 max=5.96875
T5Attention output=0 dtype=torch.bfloat16 min=-80.5 max=96.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-80.5 max=96.0
Dropout output=0 dtype=torch.bfloat16 min=-80.5 max=96.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-204.0 max=235.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-197.0 max=231.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-197.0 max=231.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-4.78125 max=4.28125
Linear input=0 dtype=torch.bfloat16 min=-4.78125 max=4.28125
Linear output=0 dtype=torch.bfloat16 min=-5.40625 max=9.1875
NewGELUActivation input=0 dtype=torch.bfloat16 min=-5.40625 max=9.1875
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=9.1875
Linear input=0 dtype=torch.bfloat16 min=-4.78125 max=4.28125
Linear output=0 dtype=torch.bfloat16 min=-4.15625 max=5.28125
Dropout input=0 dtype=torch.bfloat16 min=-16.625 max=48.5
Dropout output=0 dtype=torch.bfloat16 min=-16.625 max=48.5
Linear input=0 dtype=torch.bfloat16 min=-16.625 max=48.5
Linear output=0 dtype=torch.bfloat16 min=-324.0 max=302.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-4.78125 max=4.28125
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-324.0 max=302.0
Dropout input=0 dtype=torch.bfloat16 min=-324.0 max=302.0
Dropout output=0 dtype=torch.bfloat16 min=-324.0 max=302.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-197.0 max=231.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-500.0 max=452.0
T5Block input=0 dtype=torch.bfloat16 min=-204.0 max=235.0
T5Block output=0 dtype=torch.bfloat16 min=-500.0 max=452.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-500.0 max=452.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-4.09375 max=5.875
Linear input=0 dtype=torch.bfloat16 min=-4.09375 max=5.875
Linear output=0 dtype=torch.bfloat16 min=-0.6796875 max=0.6796875
Linear input=0 dtype=torch.bfloat16 min=-4.09375 max=5.875
Linear output=0 dtype=torch.bfloat16 min=-7.46875 max=8.75
Linear input=0 dtype=torch.bfloat16 min=-4.09375 max=5.875
Linear output=0 dtype=torch.bfloat16 min=-3.046875 max=2.8125
Linear input=0 dtype=torch.bfloat16 min=-2.359375 max=2.46875
Linear output=0 dtype=torch.bfloat16 min=-156.0 max=158.0
T5Attention input=0 dtype=torch.bfloat16 min=-4.09375 max=5.875
T5Attention output=0 dtype=torch.bfloat16 min=-156.0 max=158.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-156.0 max=158.0
Dropout output=0 dtype=torch.bfloat16 min=-156.0 max=158.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-500.0 max=452.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-656.0 max=608.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-656.0 max=608.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-3.609375 max=5.25
Linear input=0 dtype=torch.bfloat16 min=-3.609375 max=5.25
Linear output=0 dtype=torch.bfloat16 min=-8.8125 max=6.125
NewGELUActivation input=0 dtype=torch.bfloat16 min=-8.8125 max=6.125
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=6.125
Linear input=0 dtype=torch.bfloat16 min=-3.609375 max=5.25
Linear output=0 dtype=torch.bfloat16 min=-6.375 max=5.0
Dropout input=0 dtype=torch.bfloat16 min=-17.125 max=22.75
Dropout output=0 dtype=torch.bfloat16 min=-17.125 max=22.75
Linear input=0 dtype=torch.bfloat16 min=-17.125 max=22.75
Linear output=0 dtype=torch.bfloat16 min=-111.0 max=110.5
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-3.609375 max=5.25
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-111.0 max=110.5
Dropout input=0 dtype=torch.bfloat16 min=-111.0 max=110.5
Dropout output=0 dtype=torch.bfloat16 min=-111.0 max=110.5
T5LayerFF input=0 dtype=torch.bfloat16 min=-656.0 max=608.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-724.0 max=680.0
T5Block input=0 dtype=torch.bfloat16 min=-500.0 max=452.0
T5Block output=0 dtype=torch.bfloat16 min=-724.0 max=680.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-724.0 max=680.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.265625 max=3.609375
Linear input=0 dtype=torch.bfloat16 min=-2.265625 max=3.609375
Linear output=0 dtype=torch.bfloat16 min=-0.6875 max=0.8125
Linear input=0 dtype=torch.bfloat16 min=-2.265625 max=3.609375
Linear output=0 dtype=torch.bfloat16 min=-7.59375 max=6.625
Linear input=0 dtype=torch.bfloat16 min=-2.265625 max=3.609375
Linear output=0 dtype=torch.bfloat16 min=-3.640625 max=3.09375
Linear input=0 dtype=torch.bfloat16 min=-3.125 max=2.78125
Linear output=0 dtype=torch.bfloat16 min=-74.0 max=98.5
T5Attention input=0 dtype=torch.bfloat16 min=-2.265625 max=3.609375
T5Attention output=0 dtype=torch.bfloat16 min=-74.0 max=98.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-74.0 max=98.5
Dropout output=0 dtype=torch.bfloat16 min=-74.0 max=98.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-724.0 max=680.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-760.0 max=736.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-760.0 max=736.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.3125 max=4.75
Linear input=0 dtype=torch.bfloat16 min=-2.3125 max=4.75
Linear output=0 dtype=torch.bfloat16 min=-8.75 max=7.09375
NewGELUActivation input=0 dtype=torch.bfloat16 min=-8.75 max=7.09375
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=7.09375
Linear input=0 dtype=torch.bfloat16 min=-2.3125 max=4.75
Linear output=0 dtype=torch.bfloat16 min=-6.96875 max=5.75
Dropout input=0 dtype=torch.bfloat16 min=-20.25 max=18.5
Dropout output=0 dtype=torch.bfloat16 min=-20.25 max=18.5
Linear input=0 dtype=torch.bfloat16 min=-20.25 max=18.5
Linear output=0 dtype=torch.bfloat16 min=-195.0 max=201.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.3125 max=4.75
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-195.0 max=201.0
Dropout input=0 dtype=torch.bfloat16 min=-195.0 max=201.0
Dropout output=0 dtype=torch.bfloat16 min=-195.0 max=201.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-760.0 max=736.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-828.0 max=804.0
T5Block input=0 dtype=torch.bfloat16 min=-724.0 max=680.0
T5Block output=0 dtype=torch.bfloat16 min=-828.0 max=804.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-828.0 max=804.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.4375 max=2.265625
Linear input=0 dtype=torch.bfloat16 min=-1.4375 max=2.265625
Linear output=0 dtype=torch.bfloat16 min=-0.86328125 max=0.8671875
Linear input=0 dtype=torch.bfloat16 min=-1.4375 max=2.265625
Linear output=0 dtype=torch.bfloat16 min=-6.46875 max=6.78125
Linear input=0 dtype=torch.bfloat16 min=-1.4375 max=2.265625
Linear output=0 dtype=torch.bfloat16 min=-2.984375 max=3.296875
Linear input=0 dtype=torch.bfloat16 min=-2.890625 max=3.203125
Linear output=0 dtype=torch.bfloat16 min=-95.0 max=116.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.4375 max=2.265625
T5Attention output=0 dtype=torch.bfloat16 min=-95.0 max=116.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-95.0 max=116.5
Dropout output=0 dtype=torch.bfloat16 min=-95.0 max=116.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-828.0 max=804.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-848.0 max=824.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-848.0 max=824.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.5078125 max=3.1875
Linear input=0 dtype=torch.bfloat16 min=-1.5078125 max=3.1875
Linear output=0 dtype=torch.bfloat16 min=-11.6875 max=5.96875
NewGELUActivation input=0 dtype=torch.bfloat16 min=-11.6875 max=5.96875
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=5.96875
Linear input=0 dtype=torch.bfloat16 min=-1.5078125 max=3.1875
Linear output=0 dtype=torch.bfloat16 min=-9.125 max=7.15625
Dropout input=0 dtype=torch.bfloat16 min=-54.5 max=29.625
Dropout output=0 dtype=torch.bfloat16 min=-54.5 max=29.625
Linear input=0 dtype=torch.bfloat16 min=-54.5 max=29.625
Linear output=0 dtype=torch.bfloat16 min=-268.0 max=274.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-1.5078125 max=3.1875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-268.0 max=274.0
Dropout input=0 dtype=torch.bfloat16 min=-268.0 max=274.0
Dropout output=0 dtype=torch.bfloat16 min=-268.0 max=274.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-848.0 max=824.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-896.0 max=876.0
T5Block input=0 dtype=torch.bfloat16 min=-828.0 max=804.0
T5Block output=0 dtype=torch.bfloat16 min=-896.0 max=876.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-896.0 max=876.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.71875 max=1.5703125
Linear input=0 dtype=torch.bfloat16 min=-1.71875 max=1.5703125
Linear output=0 dtype=torch.bfloat16 min=-1.203125 max=1.3046875
Linear input=0 dtype=torch.bfloat16 min=-1.71875 max=1.5703125
Linear output=0 dtype=torch.bfloat16 min=-9.0625 max=9.6875
Linear input=0 dtype=torch.bfloat16 min=-1.71875 max=1.5703125
Linear output=0 dtype=torch.bfloat16 min=-3.1875 max=3.140625
Linear input=0 dtype=torch.bfloat16 min=-2.890625 max=3.109375
Linear output=0 dtype=torch.bfloat16 min=-130.0 max=131.0
T5Attention input=0 dtype=torch.bfloat16 min=-1.71875 max=1.5703125
T5Attention output=0 dtype=torch.bfloat16 min=-130.0 max=131.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-130.0 max=131.0
Dropout output=0 dtype=torch.bfloat16 min=-130.0 max=131.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-896.0 max=876.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-956.0 max=944.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-956.0 max=944.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.296875 max=2.546875
Linear input=0 dtype=torch.bfloat16 min=-1.296875 max=2.546875
Linear output=0 dtype=torch.bfloat16 min=-7.5625 max=5.625
NewGELUActivation input=0 dtype=torch.bfloat16 min=-7.5625 max=5.625
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=5.625
Linear input=0 dtype=torch.bfloat16 min=-1.296875 max=2.546875
Linear output=0 dtype=torch.bfloat16 min=-5.09375 max=6.03125
Dropout input=0 dtype=torch.bfloat16 min=-10.1875 max=27.375
Dropout output=0 dtype=torch.bfloat16 min=-10.1875 max=27.375
Linear input=0 dtype=torch.bfloat16 min=-10.1875 max=27.375
Linear output=0 dtype=torch.bfloat16 min=-217.0 max=211.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-1.296875 max=2.546875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-217.0 max=211.0
Dropout input=0 dtype=torch.bfloat16 min=-217.0 max=211.0
Dropout output=0 dtype=torch.bfloat16 min=-217.0 max=211.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-956.0 max=944.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-1096.0 max=1144.0
T5Block input=0 dtype=torch.bfloat16 min=-896.0 max=876.0
T5Block output=0 dtype=torch.bfloat16 min=-1096.0 max=1144.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-1096.0 max=1144.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.015625 max=1.5390625
Linear input=0 dtype=torch.bfloat16 min=-1.015625 max=1.5390625
Linear output=0 dtype=torch.bfloat16 min=-0.8984375 max=0.84765625
Linear input=0 dtype=torch.bfloat16 min=-1.015625 max=1.5390625
Linear output=0 dtype=torch.bfloat16 min=-7.96875 max=7.625
Linear input=0 dtype=torch.bfloat16 min=-1.015625 max=1.5390625
Linear output=0 dtype=torch.bfloat16 min=-2.9375 max=3.46875
Linear input=0 dtype=torch.bfloat16 min=-2.328125 max=2.796875
Linear output=0 dtype=torch.bfloat16 min=-84.0 max=75.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.015625 max=1.5390625
T5Attention output=0 dtype=torch.bfloat16 min=-84.0 max=75.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-84.0 max=75.5
Dropout output=0 dtype=torch.bfloat16 min=-84.0 max=75.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-1096.0 max=1144.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-1152.0 max=1216.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-1152.0 max=1216.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-0.921875 max=2.09375
Linear input=0 dtype=torch.bfloat16 min=-0.921875 max=2.09375
Linear output=0 dtype=torch.bfloat16 min=-6.15625 max=4.5
NewGELUActivation input=0 dtype=torch.bfloat16 min=-6.15625 max=4.5
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=4.5
Linear input=0 dtype=torch.bfloat16 min=-0.921875 max=2.09375
Linear output=0 dtype=torch.bfloat16 min=-4.96875 max=4.28125
Dropout input=0 dtype=torch.bfloat16 min=-14.8125 max=11.625
Dropout output=0 dtype=torch.bfloat16 min=-14.8125 max=11.625
Linear input=0 dtype=torch.bfloat16 min=-14.8125 max=11.625
Linear output=0 dtype=torch.bfloat16 min=-213.0 max=264.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-0.921875 max=2.09375
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-213.0 max=264.0
Dropout input=0 dtype=torch.bfloat16 min=-213.0 max=264.0
Dropout output=0 dtype=torch.bfloat16 min=-213.0 max=264.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-1152.0 max=1216.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-1232.0 max=1288.0
T5Block input=0 dtype=torch.bfloat16 min=-1096.0 max=1144.0
T5Block output=0 dtype=torch.bfloat16 min=-1232.0 max=1288.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-1232.0 max=1288.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-0.96484375 max=1.2890625
Linear input=0 dtype=torch.bfloat16 min=-0.96484375 max=1.2890625
Linear output=0 dtype=torch.bfloat16 min=-0.78125 max=0.875
Linear input=0 dtype=torch.bfloat16 min=-0.96484375 max=1.2890625
Linear output=0 dtype=torch.bfloat16 min=-6.0625 max=5.46875
Linear input=0 dtype=torch.bfloat16 min=-0.96484375 max=1.2890625
Linear output=0 dtype=torch.bfloat16 min=-2.953125 max=2.75
Linear input=0 dtype=torch.bfloat16 min=-2.28125 max=2.46875
Linear output=0 dtype=torch.bfloat16 min=-51.25 max=62.0
T5Attention input=0 dtype=torch.bfloat16 min=-0.96484375 max=1.2890625
T5Attention output=0 dtype=torch.bfloat16 min=-51.25 max=62.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-51.25 max=62.0
Dropout output=0 dtype=torch.bfloat16 min=-51.25 max=62.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-1232.0 max=1288.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-1248.0 max=1320.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-1248.0 max=1320.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-4.1875 max=1.9453125
Linear input=0 dtype=torch.bfloat16 min=-4.1875 max=1.9453125
Linear output=0 dtype=torch.bfloat16 min=-6.4375 max=28.25
NewGELUActivation input=0 dtype=torch.bfloat16 min=-6.4375 max=28.25
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=28.25
Linear input=0 dtype=torch.bfloat16 min=-4.1875 max=1.9453125
Linear output=0 dtype=torch.bfloat16 min=-56.75 max=44.25
Dropout input=0 dtype=torch.bfloat16 min=-1568.0 max=1064.0
Dropout output=0 dtype=torch.bfloat16 min=-1568.0 max=1064.0
Linear input=0 dtype=torch.bfloat16 min=-1568.0 max=1064.0
Linear output=0 dtype=torch.bfloat16 min=-36864.0 max=39168.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-4.1875 max=1.9453125
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-36864.0 max=39168.0
Dropout input=0 dtype=torch.bfloat16 min=-36864.0 max=39168.0
Dropout output=0 dtype=torch.bfloat16 min=-36864.0 max=39168.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-1248.0 max=1320.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-37120.0 max=39680.0
T5Block input=0 dtype=torch.bfloat16 min=-1232.0 max=1288.0
T5Block output=0 dtype=torch.bfloat16 min=-37120.0 max=39680.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
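
This block is where the trace first leaves the O(100) range: T5DenseGatedActDense emits roughly -3.7e4 to 3.9e4, and the residual stream keeps that magnitude from here on. Nothing overflows, since bfloat16 tops out near 3.4e38, but precision gets coarse. bfloat16 carries 8 significand bits (7 explicit), so the spacing between representable values at 39168, which lies in [2^15, 2^16), is 2^(15-7) = 256; a residual contribution smaller than about 128 (half an ulp) can be rounded away entirely when added to values of this size. That loss of small residual updates is one plausible reason a CPU bfloat16 run diverges while higher-precision runs do not.
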
T5LayerNorm input=0 dtype=torch.bfloat16 min=-37120.0 max=39680.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.2265625 max=1.8984375
Linear input=0 dtype=torch.bfloat16 min=-1.2265625 max=1.8984375
Linear output=0 dtype=torch.bfloat16 min=-0.6796875 max=0.734375
Linear input=0 dtype=torch.bfloat16 min=-1.2265625 max=1.8984375
Linear output=0 dtype=torch.bfloat16 min=-5.5 max=7.28125
Linear input=0 dtype=torch.bfloat16 min=-1.2265625 max=1.8984375
Linear output=0 dtype=torch.bfloat16 min=-2.625 max=2.59375
Linear input=0 dtype=torch.bfloat16 min=-2.484375 max=1.984375
Linear output=0 dtype=torch.bfloat16 min=-47.75 max=55.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.2265625 max=1.8984375
T5Attention output=0 dtype=torch.bfloat16 min=-47.75 max=55.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-47.75 max=55.5
Dropout output=0 dtype=torch.bfloat16 min=-47.75 max=55.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-37120.0 max=39680.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-37120.0 max=39680.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-37120.0 max=39680.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.5 max=1.90625
Linear input=0 dtype=torch.bfloat16 min=-2.5 max=1.90625
Linear output=0 dtype=torch.bfloat16 min=-4.8125 max=11.4375
NewGELUActivation input=0 dtype=torch.bfloat16 min=-4.8125 max=11.4375
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=11.4375
Linear input=0 dtype=torch.bfloat16 min=-2.5 max=1.90625
Linear output=0 dtype=torch.bfloat16 min=-9.5625 max=18.25
Dropout input=0 dtype=torch.bfloat16 min=-33.5 max=96.0
Dropout output=0 dtype=torch.bfloat16 min=-33.5 max=96.0
Linear input=0 dtype=torch.bfloat16 min=-33.5 max=96.0
Linear output=0 dtype=torch.bfloat16 min=-1832.0 max=2304.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.5 max=1.90625
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-1832.0 max=2304.0
Dropout input=0 dtype=torch.bfloat16 min=-1832.0 max=2304.0
Dropout output=0 dtype=torch.bfloat16 min=-1832.0 max=2304.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-37120.0 max=39680.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-38912.0 max=41984.0
T5Block input=0 dtype=torch.bfloat16 min=-37120.0 max=39680.0
T5Block output=0 dtype=torch.bfloat16 min=-38912.0 max=41984.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-38912.0 max=41984.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.2421875 max=1.875
Linear input=0 dtype=torch.bfloat16 min=-1.2421875 max=1.875
Linear output=0 dtype=torch.bfloat16 min=-0.7421875 max=0.64453125
Linear input=0 dtype=torch.bfloat16 min=-1.2421875 max=1.875
Linear output=0 dtype=torch.bfloat16 min=-8.5 max=7.8125
Linear input=0 dtype=torch.bfloat16 min=-1.2421875 max=1.875
Linear output=0 dtype=torch.bfloat16 min=-3.453125 max=3.78125
Linear input=0 dtype=torch.bfloat16 min=-3.140625 max=3.109375
Linear output=0 dtype=torch.bfloat16 min=-82.0 max=50.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.2421875 max=1.875
T5Attention output=0 dtype=torch.bfloat16 min=-82.0 max=50.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-82.0 max=50.5
Dropout output=0 dtype=torch.bfloat16 min=-82.0 max=50.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-38912.0 max=41984.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-38912.0 max=41984.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-38912.0 max=41984.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-4.0 max=1.859375
Linear input=0 dtype=torch.bfloat16 min=-4.0 max=1.859375
Linear output=0 dtype=torch.bfloat16 min=-5.09375 max=15.375
NewGELUActivation input=0 dtype=torch.bfloat16 min=-5.09375 max=15.375
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=15.375
Linear input=0 dtype=torch.bfloat16 min=-4.0 max=1.859375
Linear output=0 dtype=torch.bfloat16 min=-32.25 max=23.625
Dropout input=0 dtype=torch.bfloat16 min=-93.5 max=84.0
Dropout output=0 dtype=torch.bfloat16 min=-93.5 max=84.0
Linear input=0 dtype=torch.bfloat16 min=-93.5 max=84.0
Linear output=0 dtype=torch.bfloat16 min=-3136.0 max=3696.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-4.0 max=1.859375
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-3136.0 max=3696.0
Dropout input=0 dtype=torch.bfloat16 min=-3136.0 max=3696.0
Dropout output=0 dtype=torch.bfloat16 min=-3136.0 max=3696.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-38912.0 max=41984.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-41984.0 max=45568.0
T5Block input=0 dtype=torch.bfloat16 min=-38912.0 max=41984.0
T5Block output=0 dtype=torch.bfloat16 min=-41984.0 max=45568.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-41984.0 max=45568.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.5390625 max=1.21875
Linear input=0 dtype=torch.bfloat16 min=-1.5390625 max=1.21875
Linear output=0 dtype=torch.bfloat16 min=-0.72265625 max=0.78515625
Linear input=0 dtype=torch.bfloat16 min=-1.5390625 max=1.21875
Linear output=0 dtype=torch.bfloat16 min=-6.375 max=6.53125
Linear input=0 dtype=torch.bfloat16 min=-1.5390625 max=1.21875
Linear output=0 dtype=torch.bfloat16 min=-2.875 max=3.234375
Linear input=0 dtype=torch.bfloat16 min=-2.3125 max=2.0625
Linear output=0 dtype=torch.bfloat16 min=-72.0 max=66.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.5390625 max=1.21875
T5Attention output=0 dtype=torch.bfloat16 min=-72.0 max=66.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-72.0 max=66.5
Dropout output=0 dtype=torch.bfloat16 min=-72.0 max=66.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-41984.0 max=45568.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-41984.0 max=45568.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-41984.0 max=45568.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-8.8125 max=2.15625
Linear input=0 dtype=torch.bfloat16 min=-8.8125 max=2.15625
Linear output=0 dtype=torch.bfloat16 min=-7.65625 max=36.75
NewGELUActivation input=0 dtype=torch.bfloat16 min=-7.65625 max=36.75
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=36.75
Linear input=0 dtype=torch.bfloat16 min=-8.8125 max=2.15625
Linear output=0 dtype=torch.bfloat16 min=-159.0 max=107.0
Dropout input=0 dtype=torch.bfloat16 min=-5216.0 max=3456.0
Dropout output=0 dtype=torch.bfloat16 min=-5216.0 max=3456.0
Linear input=0 dtype=torch.bfloat16 min=-5216.0 max=3456.0
Linear output=0 dtype=torch.bfloat16 min=-136192.0 max=141312.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-8.8125 max=2.15625
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-136192.0 max=141312.0
Dropout input=0 dtype=torch.bfloat16 min=-136192.0 max=141312.0
Dropout output=0 dtype=torch.bfloat16 min=-136192.0 max=141312.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-41984.0 max=45568.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-176128.0 max=186368.0
T5Block input=0 dtype=torch.bfloat16 min=-41984.0 max=45568.0
T5Block output=0 dtype=torch.bfloat16 min=-176128.0 max=186368.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
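
By this block the residual stream sits near ±1.9e5 and stays there for the rest of the trace. Note that the T5LayerNorm outputs feeding each sublayer remain well scaled (roughly within single digits), because T5's RMS norm divides the huge residual back down; the visible damage is confined to the precision of the residual additions themselves. A possible experiment, not taken from the original script: keep only the T5 encoder in float32 on CPU and leave the rest of the SD3 pipeline in bfloat16. The checkpoint id below is the public SD3 medium repo and is an assumption about what was being run.

import torch
from diffusers import StableDiffusion3Pipeline
from transformers import T5EncoderModel

repo = "stabilityai/stable-diffusion-3-medium-diffusers"  # assumed checkpoint

# Load the T5 encoder alone in float32...
text_encoder_3 = T5EncoderModel.from_pretrained(
    repo, subfolder="text_encoder_3", torch_dtype=torch.float32
)

# ...and hand it to the pipeline, which keeps everything else in bfloat16.
pipe = StableDiffusion3Pipeline.from_pretrained(
    repo, text_encoder_3=text_encoder_3, torch_dtype=torch.bfloat16
)

Whether this restores the prompt embeddings on CPU is something to verify against a float32 reference run.
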
T5LayerNorm input=0 dtype=torch.bfloat16 min=-176128.0 max=186368.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.3984375 max=1.1328125
Linear input=0 dtype=torch.bfloat16 min=-1.3984375 max=1.1328125
Linear output=0 dtype=torch.bfloat16 min=-0.67578125 max=0.71484375
Linear input=0 dtype=torch.bfloat16 min=-1.3984375 max=1.1328125
Linear output=0 dtype=torch.bfloat16 min=-6.59375 max=6.8125
Linear input=0 dtype=torch.bfloat16 min=-1.3984375 max=1.1328125
Linear output=0 dtype=torch.bfloat16 min=-3.296875 max=3.078125
Linear input=0 dtype=torch.bfloat16 min=-3.203125 max=2.40625
Linear output=0 dtype=torch.bfloat16 min=-60.75 max=91.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.3984375 max=1.1328125
T5Attention output=0 dtype=torch.bfloat16 min=-60.75 max=91.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-60.75 max=91.5
Dropout output=0 dtype=torch.bfloat16 min=-60.75 max=91.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-176128.0 max=186368.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-176128.0 max=186368.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-176128.0 max=186368.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.28125 max=1.5234375
Linear input=0 dtype=torch.bfloat16 min=-2.28125 max=1.5234375
Linear output=0 dtype=torch.bfloat16 min=-5.0625 max=6.59375
NewGELUActivation input=0 dtype=torch.bfloat16 min=-5.0625 max=6.59375
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=6.59375
Linear input=0 dtype=torch.bfloat16 min=-2.28125 max=1.5234375
Linear output=0 dtype=torch.bfloat16 min=-7.75 max=18.625
Dropout input=0 dtype=torch.bfloat16 min=-18.875 max=36.0
Dropout output=0 dtype=torch.bfloat16 min=-18.875 max=36.0
Linear input=0 dtype=torch.bfloat16 min=-18.875 max=36.0
Linear output=0 dtype=torch.bfloat16 min=-728.0 max=984.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.28125 max=1.5234375
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-728.0 max=984.0
Dropout input=0 dtype=torch.bfloat16 min=-728.0 max=984.0
Dropout output=0 dtype=torch.bfloat16 min=-728.0 max=984.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-176128.0 max=186368.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-177152.0 max=187392.0
T5Block input=0 dtype=torch.bfloat16 min=-176128.0 max=186368.0
T5Block output=0 dtype=torch.bfloat16 min=-177152.0 max=187392.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-177152.0 max=187392.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.8125 max=1.46875
Linear input=0 dtype=torch.bfloat16 min=-1.8125 max=1.46875
Linear output=0 dtype=torch.bfloat16 min=-0.6796875 max=0.6796875
Linear input=0 dtype=torch.bfloat16 min=-1.8125 max=1.46875
Linear output=0 dtype=torch.bfloat16 min=-7.625 max=10.0
Linear input=0 dtype=torch.bfloat16 min=-1.8125 max=1.46875
Linear output=0 dtype=torch.bfloat16 min=-2.984375 max=3.3125
Linear input=0 dtype=torch.bfloat16 min=-2.6875 max=2.8125
Linear output=0 dtype=torch.bfloat16 min=-75.5 max=99.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.8125 max=1.46875
T5Attention output=0 dtype=torch.bfloat16 min=-75.5 max=99.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-75.5 max=99.5
Dropout output=0 dtype=torch.bfloat16 min=-75.5 max=99.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-177152.0 max=187392.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-177152.0 max=187392.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-177152.0 max=187392.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.46875 max=1.6328125
Linear input=0 dtype=torch.bfloat16 min=-2.46875 max=1.6328125
Linear output=0 dtype=torch.bfloat16 min=-4.46875 max=7.75
NewGELUActivation input=0 dtype=torch.bfloat16 min=-4.46875 max=7.75
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=7.75
Linear input=0 dtype=torch.bfloat16 min=-2.46875 max=1.6328125
Linear output=0 dtype=torch.bfloat16 min=-49.0 max=53.25
Dropout input=0 dtype=torch.bfloat16 min=-298.0 max=264.0
Dropout output=0 dtype=torch.bfloat16 min=-298.0 max=264.0
Linear input=0 dtype=torch.bfloat16 min=-298.0 max=264.0
Linear output=0 dtype=torch.bfloat16 min=-5408.0 max=7648.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.46875 max=1.6328125
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-5408.0 max=7648.0
Dropout input=0 dtype=torch.bfloat16 min=-5408.0 max=7648.0
Dropout output=0 dtype=torch.bfloat16 min=-5408.0 max=7648.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-177152.0 max=187392.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-182272.0 max=194560.0
T5Block input=0 dtype=torch.bfloat16 min=-177152.0 max=187392.0
T5Block output=0 dtype=torch.bfloat16 min=-182272.0 max=194560.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-182272.0 max=194560.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.8125 max=1.546875
Linear input=0 dtype=torch.bfloat16 min=-1.8125 max=1.546875
Linear output=0 dtype=torch.bfloat16 min=-0.82421875 max=0.84765625
Linear input=0 dtype=torch.bfloat16 min=-1.8125 max=1.546875
Linear output=0 dtype=torch.bfloat16 min=-8.1875 max=7.65625
Linear input=0 dtype=torch.bfloat16 min=-1.8125 max=1.546875
Linear output=0 dtype=torch.bfloat16 min=-4.15625 max=4.09375
Linear input=0 dtype=torch.bfloat16 min=-3.6875 max=2.90625
Linear output=0 dtype=torch.bfloat16 min=-72.0 max=98.0
T5Attention input=0 dtype=torch.bfloat16 min=-1.8125 max=1.546875
T5Attention output=0 dtype=torch.bfloat16 min=-72.0 max=98.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-72.0 max=98.0
Dropout output=0 dtype=torch.bfloat16 min=-72.0 max=98.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-182272.0 max=194560.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-182272.0 max=194560.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-182272.0 max=194560.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.15625 max=1.6953125
Linear input=0 dtype=torch.bfloat16 min=-2.15625 max=1.6953125
Linear output=0 dtype=torch.bfloat16 min=-4.5625 max=7.625
NewGELUActivation input=0 dtype=torch.bfloat16 min=-4.5625 max=7.625
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=7.625
Linear input=0 dtype=torch.bfloat16 min=-2.15625 max=1.6953125
Linear output=0 dtype=torch.bfloat16 min=-12.0 max=13.875
Dropout input=0 dtype=torch.bfloat16 min=-36.25 max=39.5
Dropout output=0 dtype=torch.bfloat16 min=-36.25 max=39.5
Linear input=0 dtype=torch.bfloat16 min=-36.25 max=39.5
Linear output=0 dtype=torch.bfloat16 min=-1384.0 max=1936.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.15625 max=1.6953125
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-1384.0 max=1936.0
Dropout input=0 dtype=torch.bfloat16 min=-1384.0 max=1936.0
Dropout output=0 dtype=torch.bfloat16 min=-1384.0 max=1936.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-182272.0 max=194560.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-183296.0 max=196608.0
T5Block input=0 dtype=torch.bfloat16 min=-182272.0 max=194560.0
T5Block output=0 dtype=torch.bfloat16 min=-183296.0 max=196608.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-183296.0 max=196608.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.71875 max=1.5546875
Linear input=0 dtype=torch.bfloat16 min=-1.71875 max=1.5546875
Linear output=0 dtype=torch.bfloat16 min=-0.66796875 max=0.75390625
Linear input=0 dtype=torch.bfloat16 min=-1.71875 max=1.5546875
Linear output=0 dtype=torch.bfloat16 min=-6.5 max=6.625
Linear input=0 dtype=torch.bfloat16 min=-1.71875 max=1.5546875
Linear output=0 dtype=torch.bfloat16 min=-3.9375 max=4.34375
Linear input=0 dtype=torch.bfloat16 min=-2.640625 max=3.109375
Linear output=0 dtype=torch.bfloat16 min=-79.5 max=89.0
T5Attention input=0 dtype=torch.bfloat16 min=-1.71875 max=1.5546875
T5Attention output=0 dtype=torch.bfloat16 min=-79.5 max=89.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-79.5 max=89.0
Dropout output=0 dtype=torch.bfloat16 min=-79.5 max=89.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-183296.0 max=196608.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-183296.0 max=196608.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-183296.0 max=196608.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.390625 max=1.484375
Linear input=0 dtype=torch.bfloat16 min=-2.390625 max=1.484375
Linear output=0 dtype=torch.bfloat16 min=-5.3125 max=5.125
NewGELUActivation input=0 dtype=torch.bfloat16 min=-5.3125 max=5.125
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=5.125
Linear input=0 dtype=torch.bfloat16 min=-2.390625 max=1.484375
Linear output=0 dtype=torch.bfloat16 min=-15.0625 max=12.9375
Dropout input=0 dtype=torch.bfloat16 min=-55.0 max=32.75
Dropout output=0 dtype=torch.bfloat16 min=-55.0 max=32.75
Linear input=0 dtype=torch.bfloat16 min=-55.0 max=32.75
Linear output=0 dtype=torch.bfloat16 min=-1096.0 max=1464.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.390625 max=1.484375
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-1096.0 max=1464.0
Dropout input=0 dtype=torch.bfloat16 min=-1096.0 max=1464.0
Dropout output=0 dtype=torch.bfloat16 min=-1096.0 max=1464.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-183296.0 max=196608.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-184320.0 max=197632.0
T5Block input=0 dtype=torch.bfloat16 min=-183296.0 max=196608.0
T5Block output=0 dtype=torch.bfloat16 min=-184320.0 max=197632.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-184320.0 max=197632.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.453125 max=3.03125
Linear input=0 dtype=torch.bfloat16 min=-2.453125 max=3.03125
Linear output=0 dtype=torch.bfloat16 min=-0.83203125 max=0.80859375
Linear input=0 dtype=torch.bfloat16 min=-2.453125 max=3.03125
Linear output=0 dtype=torch.bfloat16 min=-8.125 max=7.53125
Linear input=0 dtype=torch.bfloat16 min=-2.453125 max=3.03125
Linear output=0 dtype=torch.bfloat16 min=-4.75 max=4.03125
Linear input=0 dtype=torch.bfloat16 min=-3.625 max=3.296875
Linear output=0 dtype=torch.bfloat16 min=-80.5 max=109.0
T5Attention input=0 dtype=torch.bfloat16 min=-2.453125 max=3.03125
T5Attention output=0 dtype=torch.bfloat16 min=-80.5 max=109.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-80.5 max=109.0
Dropout output=0 dtype=torch.bfloat16 min=-80.5 max=109.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-184320.0 max=197632.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-184320.0 max=197632.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-184320.0 max=197632.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.53125 max=1.5625
Linear input=0 dtype=torch.bfloat16 min=-2.53125 max=1.5625
Linear output=0 dtype=torch.bfloat16 min=-8.3125 max=7.625
NewGELUActivation input=0 dtype=torch.bfloat16 min=-8.3125 max=7.625
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=7.625
Linear input=0 dtype=torch.bfloat16 min=-2.53125 max=1.5625
Linear output=0 dtype=torch.bfloat16 min=-47.75 max=26.375
Dropout input=0 dtype=torch.bfloat16 min=-100.5 max=102.0
Dropout output=0 dtype=torch.bfloat16 min=-100.5 max=102.0
Linear input=0 dtype=torch.bfloat16 min=-100.5 max=102.0
Linear output=0 dtype=torch.bfloat16 min=-2320.0 max=2944.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.53125 max=1.5625
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-2320.0 max=2944.0
Dropout input=0 dtype=torch.bfloat16 min=-2320.0 max=2944.0
Dropout output=0 dtype=torch.bfloat16 min=-2320.0 max=2944.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-184320.0 max=197632.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-186368.0 max=200704.0
T5Block input=0 dtype=torch.bfloat16 min=-184320.0 max=197632.0
T5Block output=0 dtype=torch.bfloat16 min=-186368.0 max=200704.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-186368.0 max=200704.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.515625 max=3.109375
Linear input=0 dtype=torch.bfloat16 min=-2.515625 max=3.109375
Linear output=0 dtype=torch.bfloat16 min=-0.80859375 max=0.83203125
Linear input=0 dtype=torch.bfloat16 min=-2.515625 max=3.109375
Linear output=0 dtype=torch.bfloat16 min=-8.5625 max=7.90625
Linear input=0 dtype=torch.bfloat16 min=-2.515625 max=3.109375
Linear output=0 dtype=torch.bfloat16 min=-6.125 max=5.4375
Linear input=0 dtype=torch.bfloat16 min=-5.1875 max=5.0
Linear output=0 dtype=torch.bfloat16 min=-178.0 max=177.0
T5Attention input=0 dtype=torch.bfloat16 min=-2.515625 max=3.109375
T5Attention output=0 dtype=torch.bfloat16 min=-178.0 max=177.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-178.0 max=177.0
Dropout output=0 dtype=torch.bfloat16 min=-178.0 max=177.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-186368.0 max=200704.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-186368.0 max=200704.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-186368.0 max=200704.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.265625 max=1.6328125
Linear input=0 dtype=torch.bfloat16 min=-2.265625 max=1.6328125
Linear output=0 dtype=torch.bfloat16 min=-5.84375 max=6.90625
NewGELUActivation input=0 dtype=torch.bfloat16 min=-5.84375 max=6.90625
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=6.90625
Linear input=0 dtype=torch.bfloat16 min=-2.265625 max=1.6328125
Linear output=0 dtype=torch.bfloat16 min=-106.5 max=65.0
Dropout input=0 dtype=torch.bfloat16 min=-300.0 max=223.0
Dropout output=0 dtype=torch.bfloat16 min=-300.0 max=223.0
Linear input=0 dtype=torch.bfloat16 min=-300.0 max=223.0
Linear output=0 dtype=torch.bfloat16 min=-4448.0 max=6336.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.265625 max=1.6328125
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-4448.0 max=6336.0
Dropout input=0 dtype=torch.bfloat16 min=-4448.0 max=6336.0
Dropout output=0 dtype=torch.bfloat16 min=-4448.0 max=6336.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-186368.0 max=200704.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-190464.0 max=206848.0
T5Block input=0 dtype=torch.bfloat16 min=-186368.0 max=200704.0
T5Block output=0 dtype=torch.bfloat16 min=-190464.0 max=206848.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-190464.0 max=206848.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.9375 max=3.40625
Linear input=0 dtype=torch.bfloat16 min=-2.9375 max=3.40625
Linear output=0 dtype=torch.bfloat16 min=-0.7890625 max=0.91796875
Linear input=0 dtype=torch.bfloat16 min=-2.9375 max=3.40625
Linear output=0 dtype=torch.bfloat16 min=-7.8125 max=7.8125
Linear input=0 dtype=torch.bfloat16 min=-2.9375 max=3.40625
Linear output=0 dtype=torch.bfloat16 min=-7.21875 max=6.03125
Linear input=0 dtype=torch.bfloat16 min=-6.78125 max=4.125
Linear output=0 dtype=torch.bfloat16 min=-243.0 max=174.0
T5Attention input=0 dtype=torch.bfloat16 min=-2.9375 max=3.40625
T5Attention output=0 dtype=torch.bfloat16 min=-243.0 max=174.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-243.0 max=174.0
Dropout output=0 dtype=torch.bfloat16 min=-243.0 max=174.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-190464.0 max=206848.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-190464.0 max=206848.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-190464.0 max=206848.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.6328125 max=1.65625
Linear input=0 dtype=torch.bfloat16 min=-1.6328125 max=1.65625
Linear output=0 dtype=torch.bfloat16 min=-6.84375 max=12.75
NewGELUActivation input=0 dtype=torch.bfloat16 min=-6.84375 max=12.75
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=12.75
Linear input=0 dtype=torch.bfloat16 min=-1.6328125 max=1.65625
Linear output=0 dtype=torch.bfloat16 min=-45.0 max=37.75
Dropout input=0 dtype=torch.bfloat16 min=-249.0 max=192.0
Dropout output=0 dtype=torch.bfloat16 min=-249.0 max=192.0
Linear input=0 dtype=torch.bfloat16 min=-249.0 max=192.0
Linear output=0 dtype=torch.bfloat16 min=-7040.0 max=8960.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-1.6328125 max=1.65625
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-7040.0 max=8960.0
Dropout input=0 dtype=torch.bfloat16 min=-7040.0 max=8960.0
Dropout output=0 dtype=torch.bfloat16 min=-7040.0 max=8960.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-190464.0 max=206848.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-195584.0 max=216064.0
T5Block input=0 dtype=torch.bfloat16 min=-190464.0 max=206848.0
T5Block output=0 dtype=torch.bfloat16 min=-195584.0 max=216064.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-195584.0 max=216064.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.546875 max=2.890625
Linear input=0 dtype=torch.bfloat16 min=-2.546875 max=2.890625
Linear output=0 dtype=torch.bfloat16 min=-1.359375 max=1.0078125
Linear input=0 dtype=torch.bfloat16 min=-2.546875 max=2.890625
Linear output=0 dtype=torch.bfloat16 min=-8.5 max=10.6875
Linear input=0 dtype=torch.bfloat16 min=-2.546875 max=2.890625
Linear output=0 dtype=torch.bfloat16 min=-6.0625 max=7.5
Linear input=0 dtype=torch.bfloat16 min=-4.8125 max=5.90625
Linear output=0 dtype=torch.bfloat16 min=-358.0 max=306.0
T5Attention input=0 dtype=torch.bfloat16 min=-2.546875 max=2.890625
T5Attention output=0 dtype=torch.bfloat16 min=-358.0 max=306.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-358.0 max=306.0
Dropout output=0 dtype=torch.bfloat16 min=-358.0 max=306.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-195584.0 max=216064.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-195584.0 max=216064.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-195584.0 max=216064.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.3984375 max=1.890625
Linear input=0 dtype=torch.bfloat16 min=-1.3984375 max=1.890625
Linear output=0 dtype=torch.bfloat16 min=-7.5625 max=9.9375
NewGELUActivation input=0 dtype=torch.bfloat16 min=-7.5625 max=9.9375
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=9.9375
Linear input=0 dtype=torch.bfloat16 min=-1.3984375 max=1.890625
Linear output=0 dtype=torch.bfloat16 min=-38.75 max=39.75
Dropout input=0 dtype=torch.bfloat16 min=-167.0 max=98.0
Dropout output=0 dtype=torch.bfloat16 min=-167.0 max=98.0
Linear input=0 dtype=torch.bfloat16 min=-167.0 max=98.0
Linear output=0 dtype=torch.bfloat16 min=-2336.0 max=3440.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-1.3984375 max=1.890625
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-2336.0 max=3440.0
Dropout input=0 dtype=torch.bfloat16 min=-2336.0 max=3440.0
Dropout output=0 dtype=torch.bfloat16 min=-2336.0 max=3440.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-195584.0 max=216064.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-196608.0 max=219136.0
T5Block input=0 dtype=torch.bfloat16 min=-195584.0 max=216064.0
T5Block output=0 dtype=torch.bfloat16 min=-196608.0 max=219136.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-196608.0 max=219136.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.390625 max=2.609375
Linear input=0 dtype=torch.bfloat16 min=-2.390625 max=2.609375
Linear output=0 dtype=torch.bfloat16 min=-0.85546875 max=0.8671875
Linear input=0 dtype=torch.bfloat16 min=-2.390625 max=2.609375
Linear output=0 dtype=torch.bfloat16 min=-7.3125 max=8.1875
Linear input=0 dtype=torch.bfloat16 min=-2.390625 max=2.609375
Linear output=0 dtype=torch.bfloat16 min=-8.8125 max=8.875
Linear input=0 dtype=torch.bfloat16 min=-8.0625 max=5.65625
Linear output=0 dtype=torch.bfloat16 min=-410.0 max=368.0
T5Attention input=0 dtype=torch.bfloat16 min=-2.390625 max=2.609375
T5Attention output=0 dtype=torch.bfloat16 min=-410.0 max=368.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-410.0 max=368.0
Dropout output=0 dtype=torch.bfloat16 min=-410.0 max=368.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-196608.0 max=219136.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-196608.0 max=219136.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-196608.0 max=219136.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.015625 max=2.3125
Linear input=0 dtype=torch.bfloat16 min=-2.015625 max=2.3125
Linear output=0 dtype=torch.bfloat16 min=-9.5 max=7.8125
NewGELUActivation input=0 dtype=torch.bfloat16 min=-9.5 max=7.8125
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=7.8125
Linear input=0 dtype=torch.bfloat16 min=-2.015625 max=2.3125
Linear output=0 dtype=torch.bfloat16 min=-43.75 max=43.5
Dropout input=0 dtype=torch.bfloat16 min=-118.0 max=85.5
Dropout output=0 dtype=torch.bfloat16 min=-118.0 max=85.5
Linear input=0 dtype=torch.bfloat16 min=-118.0 max=85.5
Linear output=0 dtype=torch.bfloat16 min=-3056.0 max=3136.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.015625 max=2.3125
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-3056.0 max=3136.0
Dropout input=0 dtype=torch.bfloat16 min=-3056.0 max=3136.0
Dropout output=0 dtype=torch.bfloat16 min=-3056.0 max=3136.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-196608.0 max=219136.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-197632.0 max=222208.0
T5Block input=0 dtype=torch.bfloat16 min=-196608.0 max=219136.0
T5Block output=0 dtype=torch.bfloat16 min=-197632.0 max=222208.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-197632.0 max=222208.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.234375 max=2.265625
Linear input=0 dtype=torch.bfloat16 min=-2.234375 max=2.265625
Linear output=0 dtype=torch.bfloat16 min=-0.82421875 max=1.0390625
Linear input=0 dtype=torch.bfloat16 min=-2.234375 max=2.265625
Linear output=0 dtype=torch.bfloat16 min=-8.4375 max=7.25
Linear input=0 dtype=torch.bfloat16 min=-2.234375 max=2.265625
Linear output=0 dtype=torch.bfloat16 min=-11.25 max=10.25
Linear input=0 dtype=torch.bfloat16 min=-8.375 max=6.625
Linear output=0 dtype=torch.bfloat16 min=-221.0 max=223.0
T5Attention input=0 dtype=torch.bfloat16 min=-2.234375 max=2.265625
T5Attention output=0 dtype=torch.bfloat16 min=-221.0 max=223.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-221.0 max=223.0
Dropout output=0 dtype=torch.bfloat16 min=-221.0 max=223.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-197632.0 max=222208.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-197632.0 max=222208.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-197632.0 max=222208.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.640625 max=2.46875
Linear input=0 dtype=torch.bfloat16 min=-2.640625 max=2.46875
Linear output=0 dtype=torch.bfloat16 min=-7.03125 max=20.625
NewGELUActivation input=0 dtype=torch.bfloat16 min=-7.03125 max=20.625
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=20.625
Linear input=0 dtype=torch.bfloat16 min=-2.640625 max=2.46875
Linear output=0 dtype=torch.bfloat16 min=-36.0 max=27.75
Dropout input=0 dtype=torch.bfloat16 min=-128.0 max=100.0
Dropout output=0 dtype=torch.bfloat16 min=-128.0 max=100.0
Linear input=0 dtype=torch.bfloat16 min=-128.0 max=100.0
Linear output=0 dtype=torch.bfloat16 min=-2096.0 max=1960.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.640625 max=2.46875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-2096.0 max=1960.0
Dropout input=0 dtype=torch.bfloat16 min=-2096.0 max=1960.0
Dropout output=0 dtype=torch.bfloat16 min=-2096.0 max=1960.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-197632.0 max=222208.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-198656.0 max=224256.0
T5Block input=0 dtype=torch.bfloat16 min=-197632.0 max=222208.0
T5Block output=0 dtype=torch.bfloat16 min=-198656.0 max=224256.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-198656.0 max=224256.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-3.59375 max=3.046875
Linear input=0 dtype=torch.bfloat16 min=-3.59375 max=3.046875
Linear output=0 dtype=torch.bfloat16 min=-0.89453125 max=1.2578125
Linear input=0 dtype=torch.bfloat16 min=-3.59375 max=3.046875
Linear output=0 dtype=torch.bfloat16 min=-8.125 max=10.8125
Linear input=0 dtype=torch.bfloat16 min=-3.59375 max=3.046875
Linear output=0 dtype=torch.bfloat16 min=-14.3125 max=11.6875
Linear input=0 dtype=torch.bfloat16 min=-14.3125 max=11.625
Linear output=0 dtype=torch.bfloat16 min=-199.0 max=268.0
T5Attention input=0 dtype=torch.bfloat16 min=-3.59375 max=3.046875
T5Attention output=0 dtype=torch.bfloat16 min=-199.0 max=268.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-199.0 max=268.0
Dropout output=0 dtype=torch.bfloat16 min=-199.0 max=268.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-198656.0 max=224256.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-198656.0 max=224256.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-198656.0 max=224256.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-3.515625 max=2.96875
Linear input=0 dtype=torch.bfloat16 min=-3.515625 max=2.96875
Linear output=0 dtype=torch.bfloat16 min=-7.03125 max=20.375
NewGELUActivation input=0 dtype=torch.bfloat16 min=-7.03125 max=20.375
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=20.375
Linear input=0 dtype=torch.bfloat16 min=-3.515625 max=2.96875
Linear output=0 dtype=torch.bfloat16 min=-42.25 max=95.5
Dropout input=0 dtype=torch.bfloat16 min=-135.0 max=330.0
Dropout output=0 dtype=torch.bfloat16 min=-135.0 max=330.0
Linear input=0 dtype=torch.bfloat16 min=-135.0 max=330.0
Linear output=0 dtype=torch.bfloat16 min=-4128.0 max=3728.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-3.515625 max=2.96875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-4128.0 max=3728.0
Dropout input=0 dtype=torch.bfloat16 min=-4128.0 max=3728.0
Dropout output=0 dtype=torch.bfloat16 min=-4128.0 max=3728.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-198656.0 max=224256.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-200704.0 max=228352.0
T5Block input=0 dtype=torch.bfloat16 min=-198656.0 max=224256.0
T5Block output=0 dtype=torch.bfloat16 min=-200704.0 max=228352.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-200704.0 max=228352.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-6.09375 max=4.34375
Linear input=0 dtype=torch.bfloat16 min=-6.09375 max=4.34375
Linear output=0 dtype=torch.bfloat16 min=-1.6875 max=1.078125
Linear input=0 dtype=torch.bfloat16 min=-6.09375 max=4.34375
Linear output=0 dtype=torch.bfloat16 min=-10.0625 max=9.1875
Linear input=0 dtype=torch.bfloat16 min=-6.09375 max=4.34375
Linear output=0 dtype=torch.bfloat16 min=-19.875 max=13.875
Linear input=0 dtype=torch.bfloat16 min=-19.875 max=12.0625
Linear output=0 dtype=torch.bfloat16 min=-268.0 max=308.0
T5Attention input=0 dtype=torch.bfloat16 min=-6.09375 max=4.34375
T5Attention output=0 dtype=torch.bfloat16 min=-268.0 max=308.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-268.0 max=308.0
Dropout output=0 dtype=torch.bfloat16 min=-268.0 max=308.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-200704.0 max=228352.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-200704.0 max=228352.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-200704.0 max=228352.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-5.03125 max=3.671875
Linear input=0 dtype=torch.bfloat16 min=-5.03125 max=3.671875
Linear output=0 dtype=torch.bfloat16 min=-9.5625 max=22.25
NewGELUActivation input=0 dtype=torch.bfloat16 min=-9.5625 max=22.25
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=22.25
Linear input=0 dtype=torch.bfloat16 min=-5.03125 max=3.671875
Linear output=0 dtype=torch.bfloat16 min=-111.0 max=200.0
Dropout input=0 dtype=torch.bfloat16 min=-720.0 max=1648.0
Dropout output=0 dtype=torch.bfloat16 min=-720.0 max=1648.0
Linear input=0 dtype=torch.bfloat16 min=-720.0 max=1648.0
Linear output=0 dtype=torch.bfloat16 min=-5760.0 max=5568.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-5.03125 max=3.671875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-5760.0 max=5568.0
Dropout input=0 dtype=torch.bfloat16 min=-5760.0 max=5568.0
Dropout output=0 dtype=torch.bfloat16 min=-5760.0 max=5568.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-200704.0 max=228352.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-205824.0 max=233472.0
T5Block input=0 dtype=torch.bfloat16 min=-200704.0 max=228352.0
T5Block output=0 dtype=torch.bfloat16 min=-205824.0 max=233472.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-205824.0 max=233472.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-6.5625 max=4.1875
Linear input=0 dtype=torch.bfloat16 min=-6.5625 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-1.1640625 max=0.8984375
Linear input=0 dtype=torch.bfloat16 min=-6.5625 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-10.0 max=8.8125
Linear input=0 dtype=torch.bfloat16 min=-6.5625 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-27.625 max=25.0
Linear input=0 dtype=torch.bfloat16 min=-15.5 max=18.375
Linear output=0 dtype=torch.bfloat16 min=-404.0 max=342.0
T5Attention input=0 dtype=torch.bfloat16 min=-6.5625 max=4.1875
T5Attention output=0 dtype=torch.bfloat16 min=-404.0 max=342.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-404.0 max=342.0
Dropout output=0 dtype=torch.bfloat16 min=-404.0 max=342.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-205824.0 max=233472.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-205824.0 max=233472.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-205824.0 max=233472.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-6.65625 max=8.9375
Linear input=0 dtype=torch.bfloat16 min=-6.65625 max=8.9375
Linear output=0 dtype=torch.bfloat16 min=-12.125 max=104.5
NewGELUActivation input=0 dtype=torch.bfloat16 min=-12.125 max=104.5
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=104.5
Linear input=0 dtype=torch.bfloat16 min=-6.65625 max=8.9375
Linear output=0 dtype=torch.bfloat16 min=-100.0 max=138.0
Dropout input=0 dtype=torch.bfloat16 min=-1416.0 max=3776.0
Dropout output=0 dtype=torch.bfloat16 min=-1416.0 max=3776.0
Linear input=0 dtype=torch.bfloat16 min=-1416.0 max=3776.0
Linear output=0 dtype=torch.bfloat16 min=-43520.0 max=44288.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-6.65625 max=8.9375
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-43520.0 max=44288.0
Dropout input=0 dtype=torch.bfloat16 min=-43520.0 max=44288.0
Dropout output=0 dtype=torch.bfloat16 min=-43520.0 max=44288.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-205824.0 max=233472.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-209920.0 max=237568.0
T5Block input=0 dtype=torch.bfloat16 min=-205824.0 max=233472.0
T5Block output=0 dtype=torch.bfloat16 min=-209920.0 max=237568.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-209920.0 max=237568.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-6.5625 max=2.40625
Dropout input=0 dtype=torch.bfloat16 min=-6.5625 max=2.40625
Dropout output=0 dtype=torch.bfloat16 min=-6.5625 max=2.40625
T5EncoderModel input=0 dtype=torch.int64 min=0 max=21820
Embedding input=0 dtype=torch.int64 min=49406 max=49407
Embedding output=0 dtype=torch.bfloat16 min=-0.5078125 max=0.65234375
Embedding input=0 dtype=torch.int64 min=0 max=76
Embedding output=0 dtype=torch.bfloat16 min=-0.1181640625 max=0.65234375
CLIPTextEmbeddings output=0 dtype=torch.bfloat16 min=-0.5234375 max=1.3046875
LayerNorm input=0 dtype=torch.bfloat16 min=-0.5234375 max=1.3046875
LayerNorm output=0 dtype=torch.bfloat16 min=-11.0 max=28.5
Linear input=0 dtype=torch.bfloat16 min=-11.0 max=28.5
Linear output=0 dtype=torch.bfloat16 min=-6.90625 max=5.25
Linear input=0 dtype=torch.bfloat16 min=-11.0 max=28.5
Linear output=0 dtype=torch.bfloat16 min=-5.15625 max=5.5
Linear input=0 dtype=torch.bfloat16 min=-11.0 max=28.5
Linear output=0 dtype=torch.bfloat16 min=-2.1875 max=1.59375
Linear input=0 dtype=torch.bfloat16 min=-2.1875 max=1.5703125
Linear output=0 dtype=torch.bfloat16 min=-0.99609375 max=1.078125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.99609375 max=1.078125
LayerNorm input=0 dtype=torch.bfloat16 min=-0.9921875 max=1.71875
LayerNorm output=0 dtype=torch.bfloat16 min=-22.875 max=179.0
Linear input=0 dtype=torch.bfloat16 min=-22.875 max=179.0
Linear output=0 dtype=torch.bfloat16 min=-45.0 max=233.0
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-45.0 max=233.0
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=233.0
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=233.0
Linear output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPMLP input=0 dtype=torch.bfloat16 min=-22.875 max=179.0
CLIPMLP output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-0.5234375 max=1.3046875
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-13.3125 max=9.25
Linear input=0 dtype=torch.bfloat16 min=-13.3125 max=9.25
Linear output=0 dtype=torch.bfloat16 min=-4.25 max=4.8125
Linear input=0 dtype=torch.bfloat16 min=-13.3125 max=9.25
Linear output=0 dtype=torch.bfloat16 min=-2.8125 max=2.8125
Linear input=0 dtype=torch.bfloat16 min=-13.3125 max=9.25
Linear output=0 dtype=torch.bfloat16 min=-1.109375 max=1.1640625
Linear input=0 dtype=torch.bfloat16 min=-0.734375 max=0.59375
Linear output=0 dtype=torch.bfloat16 min=-0.39453125 max=0.97265625
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.39453125 max=0.97265625
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-41.0 max=80.5
Linear input=0 dtype=torch.bfloat16 min=-41.0 max=80.5
Linear output=0 dtype=torch.bfloat16 min=-11.875 max=2.28125
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-11.875 max=2.28125
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=2.234375
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=2.234375
Linear output=0 dtype=torch.bfloat16 min=-1.203125 max=1.0234375
CLIPMLP input=0 dtype=torch.bfloat16 min=-41.0 max=80.5
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.203125 max=1.0234375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-13.6875 max=11.8125
Linear input=0 dtype=torch.bfloat16 min=-13.6875 max=11.8125
Linear output=0 dtype=torch.bfloat16 min=-5.59375 max=4.96875
Linear input=0 dtype=torch.bfloat16 min=-13.6875 max=11.8125
Linear output=0 dtype=torch.bfloat16 min=-3.25 max=4.375
Linear input=0 dtype=torch.bfloat16 min=-13.6875 max=11.8125
Linear output=0 dtype=torch.bfloat16 min=-1.7421875 max=1.4609375
Linear input=0 dtype=torch.bfloat16 min=-0.65625 max=0.8046875
Linear output=0 dtype=torch.bfloat16 min=-0.296875 max=0.326171875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.296875 max=0.326171875
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-25.125 max=80.5
Linear input=0 dtype=torch.bfloat16 min=-25.125 max=80.5
Linear output=0 dtype=torch.bfloat16 min=-6.84375 max=3.71875
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-6.84375 max=3.71875
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=3.71875
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=3.71875
Linear output=0 dtype=torch.bfloat16 min=-0.50390625 max=0.4296875
CLIPMLP input=0 dtype=torch.bfloat16 min=-25.125 max=80.5
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.50390625 max=0.4296875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-12.4375 max=13.3125
Linear input=0 dtype=torch.bfloat16 min=-12.4375 max=13.3125
Linear output=0 dtype=torch.bfloat16 min=-4.5625 max=3.765625
Linear input=0 dtype=torch.bfloat16 min=-12.4375 max=13.3125
Linear output=0 dtype=torch.bfloat16 min=-3.375 max=3.078125
Linear input=0 dtype=torch.bfloat16 min=-12.4375 max=13.3125
Linear output=0 dtype=torch.bfloat16 min=-1.9921875 max=2.125
Linear input=0 dtype=torch.bfloat16 min=-0.76171875 max=0.96875
Linear output=0 dtype=torch.bfloat16 min=-0.296875 max=0.2890625
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.296875 max=0.2890625
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-26.5 max=61.0
Linear input=0 dtype=torch.bfloat16 min=-26.5 max=61.0
Linear output=0 dtype=torch.bfloat16 min=-5.5625 max=2.296875
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-5.5625 max=2.296875
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=2.25
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=2.25
Linear output=0 dtype=torch.bfloat16 min=-0.279296875 max=0.337890625
CLIPMLP input=0 dtype=torch.bfloat16 min=-26.5 max=61.0
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.279296875 max=0.337890625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-11.9375 max=11.8125
Linear input=0 dtype=torch.bfloat16 min=-11.9375 max=11.8125
Linear output=0 dtype=torch.bfloat16 min=-4.34375 max=3.65625
Linear input=0 dtype=torch.bfloat16 min=-11.9375 max=11.8125
Linear output=0 dtype=torch.bfloat16 min=-3.984375 max=3.96875
Linear input=0 dtype=torch.bfloat16 min=-11.9375 max=11.8125
Linear output=0 dtype=torch.bfloat16 min=-2.734375 max=1.9765625
Linear input=0 dtype=torch.bfloat16 min=-1.3125 max=0.921875
Linear output=0 dtype=torch.bfloat16 min=-0.333984375 max=0.337890625
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.333984375 max=0.337890625
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-25.375 max=49.25
Linear input=0 dtype=torch.bfloat16 min=-25.375 max=49.25
Linear output=0 dtype=torch.bfloat16 min=-6.40625 max=5.21875
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-6.40625 max=5.21875
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=5.21875
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=5.21875
Linear output=0 dtype=torch.bfloat16 min=-0.5703125 max=0.453125
CLIPMLP input=0 dtype=torch.bfloat16 min=-25.375 max=49.25
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.5703125 max=0.453125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-9.3125 max=8.75
Linear input=0 dtype=torch.bfloat16 min=-9.3125 max=8.75
Linear output=0 dtype=torch.bfloat16 min=-4.40625 max=4.5625
Linear input=0 dtype=torch.bfloat16 min=-9.3125 max=8.75
Linear output=0 dtype=torch.bfloat16 min=-4.28125 max=3.65625
Linear input=0 dtype=torch.bfloat16 min=-9.3125 max=8.75
Linear output=0 dtype=torch.bfloat16 min=-1.78125 max=2.421875
Linear input=0 dtype=torch.bfloat16 min=-1.3203125 max=1.5234375
Linear output=0 dtype=torch.bfloat16 min=-0.396484375 max=0.455078125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.396484375 max=0.455078125
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=44.25
Linear input=0 dtype=torch.bfloat16 min=-33.0 max=44.25
Linear output=0 dtype=torch.bfloat16 min=-5.28125 max=6.5
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-5.28125 max=6.5
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=6.5
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=6.5
Linear output=0 dtype=torch.bfloat16 min=-0.6328125 max=0.45703125
CLIPMLP input=0 dtype=torch.bfloat16 min=-33.0 max=44.25
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.6328125 max=0.45703125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-11.6875 max=8.4375
Linear input=0 dtype=torch.bfloat16 min=-11.6875 max=8.4375
Linear output=0 dtype=torch.bfloat16 min=-5.40625 max=5.875
Linear input=0 dtype=torch.bfloat16 min=-11.6875 max=8.4375
Linear output=0 dtype=torch.bfloat16 min=-3.75 max=3.15625
Linear input=0 dtype=torch.bfloat16 min=-11.6875 max=8.4375
Linear output=0 dtype=torch.bfloat16 min=-2.828125 max=2.265625
Linear input=0 dtype=torch.bfloat16 min=-1.5 max=1.078125
Linear output=0 dtype=torch.bfloat16 min=-0.34375 max=0.51953125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.34375 max=0.51953125
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.75 max=39.75
Linear input=0 dtype=torch.bfloat16 min=-34.75 max=39.75
Linear output=0 dtype=torch.bfloat16 min=-5.21875 max=5.625
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-5.21875 max=5.625
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=5.625
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=5.625
Linear output=0 dtype=torch.bfloat16 min=-0.62890625 max=0.494140625
CLIPMLP input=0 dtype=torch.bfloat16 min=-34.75 max=39.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.62890625 max=0.494140625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-9.625 max=9.0625
Linear input=0 dtype=torch.bfloat16 min=-9.625 max=9.0625
Linear output=0 dtype=torch.bfloat16 min=-5.46875 max=5.8125
Linear input=0 dtype=torch.bfloat16 min=-9.625 max=9.0625
Linear output=0 dtype=torch.bfloat16 min=-3.25 max=3.546875
Linear input=0 dtype=torch.bfloat16 min=-9.625 max=9.0625
Linear output=0 dtype=torch.bfloat16 min=-2.046875 max=2.234375
Linear input=0 dtype=torch.bfloat16 min=-1.4375 max=1.7890625
Linear output=0 dtype=torch.bfloat16 min=-0.423828125 max=0.373046875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.423828125 max=0.373046875
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.5 max=39.25
Linear input=0 dtype=torch.bfloat16 min=-36.5 max=39.25
Linear output=0 dtype=torch.bfloat16 min=-6.59375 max=3.21875
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-6.59375 max=3.21875
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=3.203125
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=3.203125
Linear output=0 dtype=torch.bfloat16 min=-1.078125 max=0.5859375
CLIPMLP input=0 dtype=torch.bfloat16 min=-36.5 max=39.25
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.078125 max=0.5859375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-10.0625 max=7.8125
Linear input=0 dtype=torch.bfloat16 min=-10.0625 max=7.8125
Linear output=0 dtype=torch.bfloat16 min=-5.40625 max=5.4375
Linear input=0 dtype=torch.bfloat16 min=-10.0625 max=7.8125
Linear output=0 dtype=torch.bfloat16 min=-4.28125 max=3.828125
Linear input=0 dtype=torch.bfloat16 min=-10.0625 max=7.8125
Linear output=0 dtype=torch.bfloat16 min=-2.53125 max=2.40625
Linear input=0 dtype=torch.bfloat16 min=-1.3359375 max=1.1640625
Linear output=0 dtype=torch.bfloat16 min=-0.41796875 max=0.427734375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.41796875 max=0.427734375
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.5 max=43.75
Linear input=0 dtype=torch.bfloat16 min=-34.5 max=43.75
Linear output=0 dtype=torch.bfloat16 min=-5.59375 max=4.375
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-5.59375 max=4.375
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=4.375
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=4.375
Linear output=0 dtype=torch.bfloat16 min=-0.70703125 max=0.7734375
CLIPMLP input=0 dtype=torch.bfloat16 min=-34.5 max=43.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.70703125 max=0.7734375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-9.8125 max=8.375
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=8.375
Linear output=0 dtype=torch.bfloat16 min=-5.5625 max=4.6875
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=8.375
Linear output=0 dtype=torch.bfloat16 min=-3.171875 max=3.5625
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=8.375
Linear output=0 dtype=torch.bfloat16 min=-2.59375 max=2.640625
Linear input=0 dtype=torch.bfloat16 min=-1.9453125 max=1.9296875
Linear output=0 dtype=torch.bfloat16 min=-1.5234375 max=2.65625
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.5234375 max=2.65625
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-37.75 max=45.0
Linear input=0 dtype=torch.bfloat16 min=-37.75 max=45.0
Linear output=0 dtype=torch.bfloat16 min=-7.03125 max=3.59375
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-7.03125 max=3.59375
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=3.578125
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=3.578125
Linear output=0 dtype=torch.bfloat16 min=-0.62890625 max=2.78125
CLIPMLP input=0 dtype=torch.bfloat16 min=-37.75 max=45.0
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.62890625 max=2.78125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-14.0 max=10.6875
Linear input=0 dtype=torch.bfloat16 min=-14.0 max=10.6875
Linear output=0 dtype=torch.bfloat16 min=-7.4375 max=7.125
Linear input=0 dtype=torch.bfloat16 min=-14.0 max=10.6875
Linear output=0 dtype=torch.bfloat16 min=-3.734375 max=3.28125
Linear input=0 dtype=torch.bfloat16 min=-14.0 max=10.6875
Linear output=0 dtype=torch.bfloat16 min=-2.5 max=2.515625
Linear input=0 dtype=torch.bfloat16 min=-1.484375 max=1.734375
Linear output=0 dtype=torch.bfloat16 min=-0.96484375 max=1.0390625
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.96484375 max=1.0390625
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-53.0 max=58.0
Linear input=0 dtype=torch.bfloat16 min=-53.0 max=58.0
Linear output=0 dtype=torch.bfloat16 min=-11.3125 max=4.15625
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-11.3125 max=4.15625
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=4.15625
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=4.15625
Linear output=0 dtype=torch.bfloat16 min=-0.83203125 max=0.74609375
CLIPMLP input=0 dtype=torch.bfloat16 min=-53.0 max=58.0
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.83203125 max=0.74609375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-15.0 max=9.9375
Linear input=0 dtype=torch.bfloat16 min=-15.0 max=9.9375
Linear output=0 dtype=torch.bfloat16 min=-6.40625 max=6.6875
Linear input=0 dtype=torch.bfloat16 min=-15.0 max=9.9375
Linear output=0 dtype=torch.bfloat16 min=-3.6875 max=4.1875
Linear input=0 dtype=torch.bfloat16 min=-15.0 max=9.9375
Linear output=0 dtype=torch.bfloat16 min=-3.15625 max=2.796875
Linear input=0 dtype=torch.bfloat16 min=-1.7265625 max=1.890625
Linear output=0 dtype=torch.bfloat16 min=-1.1953125 max=1.59375
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.1953125 max=1.59375
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-54.0 max=55.0
Linear input=0 dtype=torch.bfloat16 min=-54.0 max=55.0
Linear output=0 dtype=torch.bfloat16 min=-7.03125 max=5.53125
QuickGELUActivation input=0 dtype=torch.bfloat16 min=-7.03125 max=5.53125
QuickGELUActivation output=0 dtype=torch.bfloat16 min=-0.1640625 max=5.53125
Linear input=0 dtype=torch.bfloat16 min=-0.1640625 max=5.53125
Linear output=0 dtype=torch.bfloat16 min=-4.0625 max=1.5078125
CLIPMLP input=0 dtype=torch.bfloat16 min=-54.0 max=55.0
CLIPMLP output=0 dtype=torch.bfloat16 min=-4.0625 max=1.5078125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-28.0 max=33.0
Linear input=0 dtype=torch.bfloat16 min=-5.34375 max=7.40625
Linear output=0 dtype=torch.bfloat16 min=-5.34375 max=7.40625
CLIPTextModelWithProjection input=0 dtype=torch.int64 min=49406 max=49407
Embedding input=0 dtype=torch.int64 min=0 max=49407
Embedding output=0 dtype=torch.bfloat16 min=-0.6875 max=0.0693359375
Embedding input=0 dtype=torch.int64 min=0 max=76
Embedding output=0 dtype=torch.bfloat16 min=-0.6875 max=0.130859375
CLIPTextEmbeddings output=0 dtype=torch.bfloat16 min=-1.375 max=0.13671875
LayerNorm input=0 dtype=torch.bfloat16 min=-1.375 max=0.13671875
LayerNorm output=0 dtype=torch.bfloat16 min=-12.75 max=9.9375
Linear input=0 dtype=torch.bfloat16 min=-12.75 max=9.9375
Linear output=0 dtype=torch.bfloat16 min=-9.75 max=9.125
Linear input=0 dtype=torch.bfloat16 min=-12.75 max=9.9375
Linear output=0 dtype=torch.bfloat16 min=-6.6875 max=6.3125
Linear input=0 dtype=torch.bfloat16 min=-12.75 max=9.9375
Linear output=0 dtype=torch.bfloat16 min=-3.796875 max=4.28125
Linear input=0 dtype=torch.bfloat16 min=-2.875 max=3.453125
Linear output=0 dtype=torch.bfloat16 min=-0.67578125 max=0.63671875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.67578125 max=0.63671875
LayerNorm input=0 dtype=torch.bfloat16 min=-1.40625 max=0.6328125
LayerNorm output=0 dtype=torch.bfloat16 min=-33.5 max=8.75
Linear input=0 dtype=torch.bfloat16 min=-33.5 max=8.75
Linear output=0 dtype=torch.bfloat16 min=-10.875 max=9.5
GELUActivation input=0 dtype=torch.bfloat16 min=-10.875 max=9.5
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.5
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.5
Linear output=0 dtype=torch.bfloat16 min=-69.0 max=17.25
CLIPMLP input=0 dtype=torch.bfloat16 min=-33.5 max=8.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-69.0 max=17.25
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-1.375 max=0.13671875
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.0 max=17.375
LayerNorm input=0 dtype=torch.bfloat16 min=-69.0 max=17.375
LayerNorm output=0 dtype=torch.bfloat16 min=-14.9375 max=10.875
Linear input=0 dtype=torch.bfloat16 min=-14.9375 max=10.875
Linear output=0 dtype=torch.bfloat16 min=-11.1875 max=10.9375
Linear input=0 dtype=torch.bfloat16 min=-14.9375 max=10.875
Linear output=0 dtype=torch.bfloat16 min=-8.75 max=9.5625
Linear input=0 dtype=torch.bfloat16 min=-14.9375 max=10.875
Linear output=0 dtype=torch.bfloat16 min=-5.625 max=3.8125
Linear input=0 dtype=torch.bfloat16 min=-2.953125 max=2.390625
Linear output=0 dtype=torch.bfloat16 min=-0.3125 max=0.291015625
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.3125 max=0.291015625
LayerNorm input=0 dtype=torch.bfloat16 min=-69.0 max=17.375
LayerNorm output=0 dtype=torch.bfloat16 min=-9.875 max=10.5625
Linear input=0 dtype=torch.bfloat16 min=-9.875 max=10.5625
Linear output=0 dtype=torch.bfloat16 min=-8.3125 max=5.21875
GELUActivation input=0 dtype=torch.bfloat16 min=-8.3125 max=5.21875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.21875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.21875
Linear output=0 dtype=torch.bfloat16 min=-0.6171875 max=0.69921875
CLIPMLP input=0 dtype=torch.bfloat16 min=-9.875 max=10.5625
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.6171875 max=0.69921875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.0 max=17.375
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-68.5 max=17.625
LayerNorm input=0 dtype=torch.bfloat16 min=-68.5 max=17.625
LayerNorm output=0 dtype=torch.bfloat16 min=-15.0625 max=8.25
Linear input=0 dtype=torch.bfloat16 min=-15.0625 max=8.25
Linear output=0 dtype=torch.bfloat16 min=-3.65625 max=3.484375
Linear input=0 dtype=torch.bfloat16 min=-15.0625 max=8.25
Linear output=0 dtype=torch.bfloat16 min=-3.5 max=3.6875
Linear input=0 dtype=torch.bfloat16 min=-15.0625 max=8.25
Linear output=0 dtype=torch.bfloat16 min=-2.390625 max=2.59375
Linear input=0 dtype=torch.bfloat16 min=-1.8828125 max=1.7421875
Linear output=0 dtype=torch.bfloat16 min=-2.21875 max=2.453125
CLIPAttention output=0 dtype=torch.bfloat16 min=-2.21875 max=2.453125
LayerNorm input=0 dtype=torch.bfloat16 min=-68.5 max=18.375
LayerNorm output=0 dtype=torch.bfloat16 min=-5.8125 max=7.15625
Linear input=0 dtype=torch.bfloat16 min=-5.8125 max=7.15625
Linear output=0 dtype=torch.bfloat16 min=-9.9375 max=3.703125
GELUActivation input=0 dtype=torch.bfloat16 min=-9.9375 max=3.703125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.703125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.703125
Linear output=0 dtype=torch.bfloat16 min=-0.8359375 max=0.443359375
CLIPMLP input=0 dtype=torch.bfloat16 min=-5.8125 max=7.15625
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.8359375 max=0.443359375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-68.5 max=17.625
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.5 max=18.0
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=18.0
LayerNorm output=0 dtype=torch.bfloat16 min=-17.125 max=16.25
Linear input=0 dtype=torch.bfloat16 min=-17.125 max=16.25
Linear output=0 dtype=torch.bfloat16 min=-5.375 max=4.9375
Linear input=0 dtype=torch.bfloat16 min=-17.125 max=16.25
Linear output=0 dtype=torch.bfloat16 min=-6.1875 max=6.625
Linear input=0 dtype=torch.bfloat16 min=-17.125 max=16.25
Linear output=0 dtype=torch.bfloat16 min=-4.09375 max=3.625
Linear input=0 dtype=torch.bfloat16 min=-3.515625 max=1.7734375
Linear output=0 dtype=torch.bfloat16 min=-0.7421875 max=0.25
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.7421875 max=0.25
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=18.0
LayerNorm output=0 dtype=torch.bfloat16 min=-5.875 max=7.3125
Linear input=0 dtype=torch.bfloat16 min=-5.875 max=7.3125
Linear output=0 dtype=torch.bfloat16 min=-6.9375 max=5.21875
GELUActivation input=0 dtype=torch.bfloat16 min=-6.9375 max=5.21875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.21875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.21875
Linear output=0 dtype=torch.bfloat16 min=-0.474609375 max=1.0078125
CLIPMLP input=0 dtype=torch.bfloat16 min=-5.875 max=7.3125
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.474609375 max=1.0078125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.5 max=18.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.5 max=17.5
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.5
LayerNorm output=0 dtype=torch.bfloat16 min=-16.625 max=16.625
Linear input=0 dtype=torch.bfloat16 min=-16.625 max=16.625
Linear output=0 dtype=torch.bfloat16 min=-7.65625 max=8.4375
Linear input=0 dtype=torch.bfloat16 min=-16.625 max=16.625
Linear output=0 dtype=torch.bfloat16 min=-6.46875 max=5.4375
Linear input=0 dtype=torch.bfloat16 min=-16.625 max=16.625
Linear output=0 dtype=torch.bfloat16 min=-4.25 max=5.53125
Linear input=0 dtype=torch.bfloat16 min=-2.5 max=1.875
Linear output=0 dtype=torch.bfloat16 min=-1.203125 max=0.27734375
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.203125 max=0.27734375
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.5
LayerNorm output=0 dtype=torch.bfloat16 min=-7.53125 max=10.625
Linear input=0 dtype=torch.bfloat16 min=-7.53125 max=10.625
Linear output=0 dtype=torch.bfloat16 min=-7.6875 max=3.546875
GELUActivation input=0 dtype=torch.bfloat16 min=-7.6875 max=3.546875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.546875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.546875
Linear output=0 dtype=torch.bfloat16 min=-0.4453125 max=0.9140625
CLIPMLP input=0 dtype=torch.bfloat16 min=-7.53125 max=10.625
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.4453125 max=0.9140625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.5 max=17.5
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.5 max=17.0
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.0
LayerNorm output=0 dtype=torch.bfloat16 min=-20.0 max=18.375
Linear input=0 dtype=torch.bfloat16 min=-20.0 max=18.375
Linear output=0 dtype=torch.bfloat16 min=-8.125 max=9.3125
Linear input=0 dtype=torch.bfloat16 min=-20.0 max=18.375
Linear output=0 dtype=torch.bfloat16 min=-5.40625 max=7.5625
Linear input=0 dtype=torch.bfloat16 min=-20.0 max=18.375
Linear output=0 dtype=torch.bfloat16 min=-6.28125 max=6.28125
Linear input=0 dtype=torch.bfloat16 min=-5.75 max=5.75
Linear output=0 dtype=torch.bfloat16 min=-0.86328125 max=0.734375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.86328125 max=0.734375
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.0
LayerNorm output=0 dtype=torch.bfloat16 min=-9.6875 max=10.25
Linear input=0 dtype=torch.bfloat16 min=-9.6875 max=10.25
Linear output=0 dtype=torch.bfloat16 min=-6.78125 max=5.59375
GELUActivation input=0 dtype=torch.bfloat16 min=-6.78125 max=5.59375
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.59375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.59375
Linear output=0 dtype=torch.bfloat16 min=-0.921875 max=1.7734375
CLIPMLP input=0 dtype=torch.bfloat16 min=-9.6875 max=10.25
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.921875 max=1.7734375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.5 max=17.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.5 max=17.125
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.125
LayerNorm output=0 dtype=torch.bfloat16 min=-16.375 max=20.75
Linear input=0 dtype=torch.bfloat16 min=-16.375 max=20.75
Linear output=0 dtype=torch.bfloat16 min=-6.4375 max=6.90625
Linear input=0 dtype=torch.bfloat16 min=-16.375 max=20.75
Linear output=0 dtype=torch.bfloat16 min=-6.4375 max=7.34375
Linear input=0 dtype=torch.bfloat16 min=-16.375 max=20.75
Linear output=0 dtype=torch.bfloat16 min=-4.15625 max=11.5
Linear input=0 dtype=torch.bfloat16 min=-2.828125 max=9.75
Linear output=0 dtype=torch.bfloat16 min=-1.296875 max=0.89453125
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.296875 max=0.89453125
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.125
LayerNorm output=0 dtype=torch.bfloat16 min=-8.9375 max=14.8125
Linear input=0 dtype=torch.bfloat16 min=-8.9375 max=14.8125
Linear output=0 dtype=torch.bfloat16 min=-6.375 max=4.03125
GELUActivation input=0 dtype=torch.bfloat16 min=-6.375 max=4.03125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.03125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.03125
Linear output=0 dtype=torch.bfloat16 min=-0.98828125 max=1.015625
CLIPMLP input=0 dtype=torch.bfloat16 min=-8.9375 max=14.8125
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.98828125 max=1.015625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.5 max=17.125
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.5 max=17.25
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.25
LayerNorm output=0 dtype=torch.bfloat16 min=-16.125 max=21.875
Linear input=0 dtype=torch.bfloat16 min=-16.125 max=21.875
Linear output=0 dtype=torch.bfloat16 min=-6.78125 max=7.5625
Linear input=0 dtype=torch.bfloat16 min=-16.125 max=21.875
Linear output=0 dtype=torch.bfloat16 min=-6.25 max=6.6875
Linear input=0 dtype=torch.bfloat16 min=-16.125 max=21.875
Linear output=0 dtype=torch.bfloat16 min=-4.1875 max=3.203125
Linear input=0 dtype=torch.bfloat16 min=-4.0 max=2.703125
Linear output=0 dtype=torch.bfloat16 min=-0.84765625 max=0.85546875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.84765625 max=0.85546875
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.25
LayerNorm output=0 dtype=torch.bfloat16 min=-11.5625 max=7.125
Linear input=0 dtype=torch.bfloat16 min=-11.5625 max=7.125
Linear output=0 dtype=torch.bfloat16 min=-7.84375 max=5.125
GELUActivation input=0 dtype=torch.bfloat16 min=-7.84375 max=5.125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.125
Linear output=0 dtype=torch.bfloat16 min=-0.79296875 max=1.34375
CLIPMLP input=0 dtype=torch.bfloat16 min=-11.5625 max=7.125
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.79296875 max=1.34375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.5 max=17.25
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.5 max=17.25
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.25
LayerNorm output=0 dtype=torch.bfloat16 min=-9.8125 max=18.5
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=18.5
Linear output=0 dtype=torch.bfloat16 min=-6.34375 max=6.5
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=18.5
Linear output=0 dtype=torch.bfloat16 min=-4.5625 max=5.46875
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=18.5
Linear output=0 dtype=torch.bfloat16 min=-2.8125 max=3.359375
Linear input=0 dtype=torch.bfloat16 min=-2.453125 max=2.734375
Linear output=0 dtype=torch.bfloat16 min=-1.1171875 max=0.80859375
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.1171875 max=0.80859375
LayerNorm input=0 dtype=torch.bfloat16 min=-69.5 max=17.25
LayerNorm output=0 dtype=torch.bfloat16 min=-5.65625 max=6.21875
Linear input=0 dtype=torch.bfloat16 min=-5.65625 max=6.21875
Linear output=0 dtype=torch.bfloat16 min=-6.90625 max=4.9375
GELUActivation input=0 dtype=torch.bfloat16 min=-6.90625 max=4.9375
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.9375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.9375
Linear output=0 dtype=torch.bfloat16 min=-0.6640625 max=0.875
CLIPMLP input=0 dtype=torch.bfloat16 min=-5.65625 max=6.21875
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.6640625 max=0.875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.5 max=17.25
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.0 max=17.25
LayerNorm input=0 dtype=torch.bfloat16 min=-69.0 max=17.25
LayerNorm output=0 dtype=torch.bfloat16 min=-8.8125 max=18.375
Linear input=0 dtype=torch.bfloat16 min=-8.8125 max=18.375
Linear output=0 dtype=torch.bfloat16 min=-7.46875 max=7.0625
Linear input=0 dtype=torch.bfloat16 min=-8.8125 max=18.375
Linear output=0 dtype=torch.bfloat16 min=-4.6875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-8.8125 max=18.375
Linear output=0 dtype=torch.bfloat16 min=-3.390625 max=3.859375
Linear input=0 dtype=torch.bfloat16 min=-1.5390625 max=1.6640625
Linear output=0 dtype=torch.bfloat16 min=-0.76171875 max=0.54296875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.76171875 max=0.54296875
LayerNorm input=0 dtype=torch.bfloat16 min=-69.0 max=17.375
LayerNorm output=0 dtype=torch.bfloat16 min=-6.1875 max=8.6875
Linear input=0 dtype=torch.bfloat16 min=-6.1875 max=8.6875
Linear output=0 dtype=torch.bfloat16 min=-7.21875 max=3.46875
GELUActivation input=0 dtype=torch.bfloat16 min=-7.21875 max=3.46875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.46875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.46875
Linear output=0 dtype=torch.bfloat16 min=-1.0546875 max=1.078125
CLIPMLP input=0 dtype=torch.bfloat16 min=-6.1875 max=8.6875
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.0546875 max=1.078125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.0 max=17.25
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-69.0 max=17.375
LayerNorm input=0 dtype=torch.bfloat16 min=-69.0 max=17.375
LayerNorm output=0 dtype=torch.bfloat16 min=-8.0625 max=20.875
Linear input=0 dtype=torch.bfloat16 min=-8.0625 max=20.875
Linear output=0 dtype=torch.bfloat16 min=-7.25 max=6.5625
Linear input=0 dtype=torch.bfloat16 min=-8.0625 max=20.875
Linear output=0 dtype=torch.bfloat16 min=-5.28125 max=4.46875
Linear input=0 dtype=torch.bfloat16 min=-8.0625 max=20.875
Linear output=0 dtype=torch.bfloat16 min=-3.21875 max=2.71875
Linear input=0 dtype=torch.bfloat16 min=-2.703125 max=1.9765625
Linear output=0 dtype=torch.bfloat16 min=-1.109375 max=0.95703125
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.109375 max=0.95703125
LayerNorm input=0 dtype=torch.bfloat16 min=-69.0 max=17.5
LayerNorm output=0 dtype=torch.bfloat16 min=-7.21875 max=6.90625
Linear input=0 dtype=torch.bfloat16 min=-7.21875 max=6.90625
Linear output=0 dtype=torch.bfloat16 min=-7.03125 max=2.75
GELUActivation input=0 dtype=torch.bfloat16 min=-7.03125 max=2.75
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.734375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.734375
Linear output=0 dtype=torch.bfloat16 min=-0.451171875 max=0.69140625
CLIPMLP input=0 dtype=torch.bfloat16 min=-7.21875 max=6.90625
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.451171875 max=0.69140625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-69.0 max=17.375
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-68.5 max=17.5
LayerNorm input=0 dtype=torch.bfloat16 min=-68.5 max=17.5
LayerNorm output=0 dtype=torch.bfloat16 min=-7.46875 max=17.625
Linear input=0 dtype=torch.bfloat16 min=-7.46875 max=17.625
Linear output=0 dtype=torch.bfloat16 min=-6.40625 max=6.28125
Linear input=0 dtype=torch.bfloat16 min=-7.46875 max=17.625
Linear output=0 dtype=torch.bfloat16 min=-5.34375 max=7.21875
Linear input=0 dtype=torch.bfloat16 min=-7.46875 max=17.625
Linear output=0 dtype=torch.bfloat16 min=-2.59375 max=3.453125
Linear input=0 dtype=torch.bfloat16 min=-1.65625 max=1.734375
Linear output=0 dtype=torch.bfloat16 min=-0.61328125 max=0.54296875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.61328125 max=0.54296875
LayerNorm input=0 dtype=torch.bfloat16 min=-68.5 max=17.5
LayerNorm output=0 dtype=torch.bfloat16 min=-7.1875 max=7.125
Linear input=0 dtype=torch.bfloat16 min=-7.1875 max=7.125
Linear output=0 dtype=torch.bfloat16 min=-7.53125 max=2.015625
GELUActivation input=0 dtype=torch.bfloat16 min=-7.53125 max=2.015625
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=1.96875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=1.96875
Linear output=0 dtype=torch.bfloat16 min=-0.52734375 max=1.4296875
CLIPMLP input=0 dtype=torch.bfloat16 min=-7.1875 max=7.125
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.52734375 max=1.4296875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-68.5 max=17.5
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-68.0 max=17.5
LayerNorm input=0 dtype=torch.bfloat16 min=-68.0 max=17.5
LayerNorm output=0 dtype=torch.bfloat16 min=-9.3125 max=18.625
Linear input=0 dtype=torch.bfloat16 min=-9.3125 max=18.625
Linear output=0 dtype=torch.bfloat16 min=-6.625 max=6.625
Linear input=0 dtype=torch.bfloat16 min=-9.3125 max=18.625
Linear output=0 dtype=torch.bfloat16 min=-4.6875 max=4.96875
Linear input=0 dtype=torch.bfloat16 min=-9.3125 max=18.625
Linear output=0 dtype=torch.bfloat16 min=-2.625 max=2.234375
Linear input=0 dtype=torch.bfloat16 min=-1.65625 max=1.3671875
Linear output=0 dtype=torch.bfloat16 min=-0.71875 max=0.5703125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.71875 max=0.5703125
LayerNorm input=0 dtype=torch.bfloat16 min=-68.0 max=17.625
LayerNorm output=0 dtype=torch.bfloat16 min=-7.46875 max=8.1875
Linear input=0 dtype=torch.bfloat16 min=-7.46875 max=8.1875
Linear output=0 dtype=torch.bfloat16 min=-7.8125 max=2.734375
GELUActivation input=0 dtype=torch.bfloat16 min=-7.8125 max=2.734375
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.71875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.71875
Linear output=0 dtype=torch.bfloat16 min=-0.35546875 max=0.65625
CLIPMLP input=0 dtype=torch.bfloat16 min=-7.46875 max=8.1875
CLIPMLP output=0 dtype=torch.bfloat16 min=-0.35546875 max=0.65625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-68.0 max=17.5
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=17.625
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=17.625
LayerNorm output=0 dtype=torch.bfloat16 min=-9.8125 max=15.8125
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=15.8125
Linear output=0 dtype=torch.bfloat16 min=-5.53125 max=7.5
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=15.8125
Linear output=0 dtype=torch.bfloat16 min=-5.875 max=6.3125
Linear input=0 dtype=torch.bfloat16 min=-9.8125 max=15.8125
Linear output=0 dtype=torch.bfloat16 min=-2.734375 max=2.625
Linear input=0 dtype=torch.bfloat16 min=-1.71875 max=1.453125
Linear output=0 dtype=torch.bfloat16 min=-0.470703125 max=0.376953125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.470703125 max=0.376953125
LayerNorm input=0 dtype=torch.bfloat16 min=-67.0 max=17.625
LayerNorm output=0 dtype=torch.bfloat16 min=-8.3125 max=7.375
Linear input=0 dtype=torch.bfloat16 min=-8.3125 max=7.375
Linear output=0 dtype=torch.bfloat16 min=-8.125 max=1.6875
GELUActivation input=0 dtype=torch.bfloat16 min=-8.125 max=1.6875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=1.609375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=1.609375
Linear output=0 dtype=torch.bfloat16 min=-1.046875 max=0.84375
CLIPMLP input=0 dtype=torch.bfloat16 min=-8.3125 max=7.375
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.046875 max=0.84375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=17.625
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=17.625
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=17.625
LayerNorm output=0 dtype=torch.bfloat16 min=-10.8125 max=15.375
Linear input=0 dtype=torch.bfloat16 min=-10.8125 max=15.375
Linear output=0 dtype=torch.bfloat16 min=-6.25 max=6.125
Linear input=0 dtype=torch.bfloat16 min=-10.8125 max=15.375
Linear output=0 dtype=torch.bfloat16 min=-6.65625 max=6.15625
Linear input=0 dtype=torch.bfloat16 min=-10.8125 max=15.375
Linear output=0 dtype=torch.bfloat16 min=-2.703125 max=3.1875
Linear input=0 dtype=torch.bfloat16 min=-1.0546875 max=0.8359375
Linear output=0 dtype=torch.bfloat16 min=-0.6015625 max=0.828125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.6015625 max=0.828125
LayerNorm input=0 dtype=torch.bfloat16 min=-67.0 max=17.625
LayerNorm output=0 dtype=torch.bfloat16 min=-14.6875 max=13.5
Linear input=0 dtype=torch.bfloat16 min=-14.6875 max=13.5
Linear output=0 dtype=torch.bfloat16 min=-6.84375 max=2.3125
GELUActivation input=0 dtype=torch.bfloat16 min=-6.84375 max=2.3125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.28125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.28125
Linear output=0 dtype=torch.bfloat16 min=-2.03125 max=0.55859375
CLIPMLP input=0 dtype=torch.bfloat16 min=-14.6875 max=13.5
CLIPMLP output=0 dtype=torch.bfloat16 min=-2.03125 max=0.55859375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=17.625
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-68.0 max=17.875
LayerNorm input=0 dtype=torch.bfloat16 min=-68.0 max=17.875
LayerNorm output=0 dtype=torch.bfloat16 min=-9.6875 max=19.0
Linear input=0 dtype=torch.bfloat16 min=-9.6875 max=19.0
Linear output=0 dtype=torch.bfloat16 min=-5.34375 max=6.25
Linear input=0 dtype=torch.bfloat16 min=-9.6875 max=19.0
Linear output=0 dtype=torch.bfloat16 min=-6.0625 max=5.8125
Linear input=0 dtype=torch.bfloat16 min=-9.6875 max=19.0
Linear output=0 dtype=torch.bfloat16 min=-2.28125 max=2.03125
Linear input=0 dtype=torch.bfloat16 min=-1.4375 max=1.296875
Linear output=0 dtype=torch.bfloat16 min=-0.419921875 max=0.9609375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.419921875 max=0.9609375
LayerNorm input=0 dtype=torch.bfloat16 min=-67.0 max=17.875
LayerNorm output=0 dtype=torch.bfloat16 min=-13.8125 max=8.75
Linear input=0 dtype=torch.bfloat16 min=-13.8125 max=8.75
Linear output=0 dtype=torch.bfloat16 min=-6.53125 max=6.03125
GELUActivation input=0 dtype=torch.bfloat16 min=-6.53125 max=6.03125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=6.03125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.03125
Linear output=0 dtype=torch.bfloat16 min=-1.2421875 max=1.9921875
CLIPMLP input=0 dtype=torch.bfloat16 min=-13.8125 max=8.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.2421875 max=1.9921875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-68.0 max=17.875
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=18.0
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=18.0
LayerNorm output=0 dtype=torch.bfloat16 min=-11.9375 max=23.125
Linear input=0 dtype=torch.bfloat16 min=-11.9375 max=23.125
Linear output=0 dtype=torch.bfloat16 min=-5.53125 max=6.21875
Linear input=0 dtype=torch.bfloat16 min=-11.9375 max=23.125
Linear output=0 dtype=torch.bfloat16 min=-6.59375 max=7.25
Linear input=0 dtype=torch.bfloat16 min=-11.9375 max=23.125
Linear output=0 dtype=torch.bfloat16 min=-2.28125 max=2.828125
Linear input=0 dtype=torch.bfloat16 min=-0.6875 max=1.0078125
Linear output=0 dtype=torch.bfloat16 min=-0.439453125 max=0.43359375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.439453125 max=0.43359375
LayerNorm input=0 dtype=torch.bfloat16 min=-67.0 max=18.0
LayerNorm output=0 dtype=torch.bfloat16 min=-14.5625 max=8.25
Linear input=0 dtype=torch.bfloat16 min=-14.5625 max=8.25
Linear output=0 dtype=torch.bfloat16 min=-8.25 max=1.25
GELUActivation input=0 dtype=torch.bfloat16 min=-8.25 max=1.25
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=1.1171875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=1.1171875
Linear output=0 dtype=torch.bfloat16 min=-1.0390625 max=0.380859375
CLIPMLP input=0 dtype=torch.bfloat16 min=-14.5625 max=8.25
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.0390625 max=0.380859375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=18.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=18.125
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=18.125
LayerNorm output=0 dtype=torch.bfloat16 min=-11.1875 max=22.125
Linear input=0 dtype=torch.bfloat16 min=-11.1875 max=22.125
Linear output=0 dtype=torch.bfloat16 min=-6.1875 max=4.8125
Linear input=0 dtype=torch.bfloat16 min=-11.1875 max=22.125
Linear output=0 dtype=torch.bfloat16 min=-8.3125 max=8.1875
Linear input=0 dtype=torch.bfloat16 min=-11.1875 max=22.125
Linear output=0 dtype=torch.bfloat16 min=-2.25 max=2.234375
Linear input=0 dtype=torch.bfloat16 min=-0.55859375 max=0.51953125
Linear output=0 dtype=torch.bfloat16 min=-0.375 max=1.1796875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.375 max=1.1796875
LayerNorm input=0 dtype=torch.bfloat16 min=-66.5 max=18.125
LayerNorm output=0 dtype=torch.bfloat16 min=-12.625 max=9.0625
Linear input=0 dtype=torch.bfloat16 min=-12.625 max=9.0625
Linear output=0 dtype=torch.bfloat16 min=-8.3125 max=0.5546875
GELUActivation input=0 dtype=torch.bfloat16 min=-8.3125 max=0.5546875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=0.39453125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=0.39453125
Linear output=0 dtype=torch.bfloat16 min=-1.390625 max=0.66796875
CLIPMLP input=0 dtype=torch.bfloat16 min=-12.625 max=9.0625
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.390625 max=0.66796875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=18.125
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-68.0 max=18.25
LayerNorm input=0 dtype=torch.bfloat16 min=-68.0 max=18.25
LayerNorm output=0 dtype=torch.bfloat16 min=-11.8125 max=23.0
Linear input=0 dtype=torch.bfloat16 min=-11.8125 max=23.0
Linear output=0 dtype=torch.bfloat16 min=-5.625 max=5.25
Linear input=0 dtype=torch.bfloat16 min=-11.8125 max=23.0
Linear output=0 dtype=torch.bfloat16 min=-7.46875 max=7.125
Linear input=0 dtype=torch.bfloat16 min=-11.8125 max=23.0
Linear output=0 dtype=torch.bfloat16 min=-2.515625 max=2.359375
Linear input=0 dtype=torch.bfloat16 min=-1.1015625 max=1.0390625
Linear output=0 dtype=torch.bfloat16 min=-0.6328125 max=1.1328125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.6328125 max=1.1328125
LayerNorm input=0 dtype=torch.bfloat16 min=-67.0 max=18.25
LayerNorm output=0 dtype=torch.bfloat16 min=-12.3125 max=8.75
Linear input=0 dtype=torch.bfloat16 min=-12.3125 max=8.75
Linear output=0 dtype=torch.bfloat16 min=-9.0 max=2.03125
GELUActivation input=0 dtype=torch.bfloat16 min=-9.0 max=2.03125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=1.9921875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=1.9921875
Linear output=0 dtype=torch.bfloat16 min=-1.2421875 max=0.53125
CLIPMLP input=0 dtype=torch.bfloat16 min=-12.3125 max=8.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.2421875 max=0.53125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-68.0 max=18.25
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=18.375
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=18.375
LayerNorm output=0 dtype=torch.bfloat16 min=-14.0 max=22.0
Linear input=0 dtype=torch.bfloat16 min=-14.0 max=22.0
Linear output=0 dtype=torch.bfloat16 min=-7.03125 max=5.84375
Linear input=0 dtype=torch.bfloat16 min=-14.0 max=22.0
Linear output=0 dtype=torch.bfloat16 min=-7.125 max=8.25
Linear input=0 dtype=torch.bfloat16 min=-14.0 max=22.0
Linear output=0 dtype=torch.bfloat16 min=-2.1875 max=2.125
Linear input=0 dtype=torch.bfloat16 min=-0.83203125 max=0.88671875
Linear output=0 dtype=torch.bfloat16 min=-0.486328125 max=1.5859375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.486328125 max=1.5859375
LayerNorm input=0 dtype=torch.bfloat16 min=-66.0 max=18.375
LayerNorm output=0 dtype=torch.bfloat16 min=-13.1875 max=11.1875
Linear input=0 dtype=torch.bfloat16 min=-13.1875 max=11.1875
Linear output=0 dtype=torch.bfloat16 min=-8.5625 max=3.46875
GELUActivation input=0 dtype=torch.bfloat16 min=-8.5625 max=3.46875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.46875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.46875
Linear output=0 dtype=torch.bfloat16 min=-1.8671875 max=0.765625
CLIPMLP input=0 dtype=torch.bfloat16 min=-13.1875 max=11.1875
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.8671875 max=0.765625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=18.375
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=18.5
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=18.5
LayerNorm output=0 dtype=torch.bfloat16 min=-18.625 max=19.875
Linear input=0 dtype=torch.bfloat16 min=-18.625 max=19.875
Linear output=0 dtype=torch.bfloat16 min=-6.3125 max=7.53125
Linear input=0 dtype=torch.bfloat16 min=-18.625 max=19.875
Linear output=0 dtype=torch.bfloat16 min=-9.5 max=8.0
Linear input=0 dtype=torch.bfloat16 min=-18.625 max=19.875
Linear output=0 dtype=torch.bfloat16 min=-2.5625 max=3.703125
Linear input=0 dtype=torch.bfloat16 min=-0.75390625 max=0.9765625
Linear output=0 dtype=torch.bfloat16 min=-0.609375 max=2.4375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.609375 max=2.4375
LayerNorm input=0 dtype=torch.bfloat16 min=-65.0 max=18.625
LayerNorm output=0 dtype=torch.bfloat16 min=-29.5 max=16.625
Linear input=0 dtype=torch.bfloat16 min=-29.5 max=16.625
Linear output=0 dtype=torch.bfloat16 min=-6.1875 max=3.0625
GELUActivation input=0 dtype=torch.bfloat16 min=-6.1875 max=3.0625
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.0625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.0625
Linear output=0 dtype=torch.bfloat16 min=-3.15625 max=0.66796875
CLIPMLP input=0 dtype=torch.bfloat16 min=-29.5 max=16.625
CLIPMLP output=0 dtype=torch.bfloat16 min=-3.15625 max=0.66796875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=18.5
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=18.75
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-13.3125 max=26.625
Linear input=0 dtype=torch.bfloat16 min=-13.3125 max=26.625
Linear output=0 dtype=torch.bfloat16 min=-5.5625 max=6.5625
Linear input=0 dtype=torch.bfloat16 min=-13.3125 max=26.625
Linear output=0 dtype=torch.bfloat16 min=-10.1875 max=9.5
Linear input=0 dtype=torch.bfloat16 min=-13.3125 max=26.625
Linear output=0 dtype=torch.bfloat16 min=-3.03125 max=2.46875
Linear input=0 dtype=torch.bfloat16 min=-0.74609375 max=0.478515625
Linear output=0 dtype=torch.bfloat16 min=-0.33203125 max=2.21875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.33203125 max=2.21875
LayerNorm input=0 dtype=torch.bfloat16 min=-65.5 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-22.125 max=18.75
Linear input=0 dtype=torch.bfloat16 min=-22.125 max=18.75
Linear output=0 dtype=torch.bfloat16 min=-6.84375 max=2.828125
GELUActivation input=0 dtype=torch.bfloat16 min=-6.84375 max=2.828125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.828125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.828125
Linear output=0 dtype=torch.bfloat16 min=-2.453125 max=0.396484375
CLIPMLP input=0 dtype=torch.bfloat16 min=-22.125 max=18.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-2.453125 max=0.396484375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-67.5 max=18.75
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-66.5 max=18.875
LayerNorm input=0 dtype=torch.bfloat16 min=-66.5 max=18.875
LayerNorm output=0 dtype=torch.bfloat16 min=-13.75 max=27.125
Linear input=0 dtype=torch.bfloat16 min=-13.75 max=27.125
Linear output=0 dtype=torch.bfloat16 min=-5.5625 max=5.90625
Linear input=0 dtype=torch.bfloat16 min=-13.75 max=27.125
Linear output=0 dtype=torch.bfloat16 min=-9.125 max=9.375
Linear input=0 dtype=torch.bfloat16 min=-13.75 max=27.125
Linear output=0 dtype=torch.bfloat16 min=-2.703125 max=3.671875
Linear input=0 dtype=torch.bfloat16 min=-1.015625 max=1.03125
Linear output=0 dtype=torch.bfloat16 min=-0.4453125 max=2.109375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.4453125 max=2.109375
LayerNorm input=0 dtype=torch.bfloat16 min=-64.5 max=18.875
LayerNorm output=0 dtype=torch.bfloat16 min=-11.9375 max=11.75
Linear input=0 dtype=torch.bfloat16 min=-11.9375 max=11.75
Linear output=0 dtype=torch.bfloat16 min=-8.875 max=3.453125
GELUActivation input=0 dtype=torch.bfloat16 min=-8.875 max=3.453125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.453125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.453125
Linear output=0 dtype=torch.bfloat16 min=-2.3125 max=0.7890625
CLIPMLP input=0 dtype=torch.bfloat16 min=-11.9375 max=11.75
CLIPMLP output=0 dtype=torch.bfloat16 min=-2.3125 max=0.7890625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-66.5 max=18.875
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-66.5 max=19.0
LayerNorm input=0 dtype=torch.bfloat16 min=-66.5 max=19.0
LayerNorm output=0 dtype=torch.bfloat16 min=-12.0 max=30.375
Linear input=0 dtype=torch.bfloat16 min=-12.0 max=30.375
Linear output=0 dtype=torch.bfloat16 min=-6.46875 max=6.1875
Linear input=0 dtype=torch.bfloat16 min=-12.0 max=30.375
Linear output=0 dtype=torch.bfloat16 min=-9.5 max=9.375
Linear input=0 dtype=torch.bfloat16 min=-12.0 max=30.375
Linear output=0 dtype=torch.bfloat16 min=-3.171875 max=3.109375
Linear input=0 dtype=torch.bfloat16 min=-0.87109375 max=0.6015625
Linear output=0 dtype=torch.bfloat16 min=-0.58203125 max=1.34375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.58203125 max=1.34375
LayerNorm input=0 dtype=torch.bfloat16 min=-65.0 max=19.0
LayerNorm output=0 dtype=torch.bfloat16 min=-22.5 max=17.875
Linear input=0 dtype=torch.bfloat16 min=-22.5 max=17.875
Linear output=0 dtype=torch.bfloat16 min=-6.40625 max=2.109375
GELUActivation input=0 dtype=torch.bfloat16 min=-6.40625 max=2.109375
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.078125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.078125
Linear output=0 dtype=torch.bfloat16 min=-1.21875 max=0.6875
CLIPMLP input=0 dtype=torch.bfloat16 min=-22.5 max=17.875
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.21875 max=0.6875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-66.5 max=19.0
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-65.5 max=18.75
LayerNorm input=0 dtype=torch.bfloat16 min=-65.5 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-15.3125 max=25.375
Linear input=0 dtype=torch.bfloat16 min=-15.3125 max=25.375
Linear output=0 dtype=torch.bfloat16 min=-7.34375 max=8.0625
Linear input=0 dtype=torch.bfloat16 min=-15.3125 max=25.375
Linear output=0 dtype=torch.bfloat16 min=-9.8125 max=9.6875
Linear input=0 dtype=torch.bfloat16 min=-15.3125 max=25.375
Linear output=0 dtype=torch.bfloat16 min=-5.125 max=2.890625
Linear input=0 dtype=torch.bfloat16 min=-0.83984375 max=0.8515625
Linear output=0 dtype=torch.bfloat16 min=-0.578125 max=1.2578125
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.578125 max=1.2578125
LayerNorm input=0 dtype=torch.bfloat16 min=-64.5 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-10.875 max=12.125
Linear input=0 dtype=torch.bfloat16 min=-10.875 max=12.125
Linear output=0 dtype=torch.bfloat16 min=-8.6875 max=4.125
GELUActivation input=0 dtype=torch.bfloat16 min=-8.6875 max=4.125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.125
Linear output=0 dtype=torch.bfloat16 min=-1.765625 max=0.6953125
CLIPMLP input=0 dtype=torch.bfloat16 min=-10.875 max=12.125
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.765625 max=0.6953125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-65.5 max=18.75
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-65.5 max=18.75
LayerNorm input=0 dtype=torch.bfloat16 min=-65.5 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-17.375 max=28.375
Linear input=0 dtype=torch.bfloat16 min=-17.375 max=28.375
Linear output=0 dtype=torch.bfloat16 min=-6.9375 max=6.96875
Linear input=0 dtype=torch.bfloat16 min=-17.375 max=28.375
Linear output=0 dtype=torch.bfloat16 min=-10.125 max=10.5
Linear input=0 dtype=torch.bfloat16 min=-17.375 max=28.375
Linear output=0 dtype=torch.bfloat16 min=-2.703125 max=2.703125
Linear input=0 dtype=torch.bfloat16 min=-1.0 max=0.80859375
Linear output=0 dtype=torch.bfloat16 min=-0.66796875 max=1.71875
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.66796875 max=1.71875
LayerNorm input=0 dtype=torch.bfloat16 min=-64.0 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-11.125 max=12.1875
Linear input=0 dtype=torch.bfloat16 min=-11.125 max=12.1875
Linear output=0 dtype=torch.bfloat16 min=-7.625 max=8.3125
GELUActivation input=0 dtype=torch.bfloat16 min=-7.625 max=8.3125
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.3125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=8.3125
Linear output=0 dtype=torch.bfloat16 min=-1.5 max=1.9921875
CLIPMLP input=0 dtype=torch.bfloat16 min=-11.125 max=12.1875
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.5 max=1.9921875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-65.5 max=18.75
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-64.0 max=18.75
LayerNorm input=0 dtype=torch.bfloat16 min=-64.0 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-13.375 max=27.125
Linear input=0 dtype=torch.bfloat16 min=-13.375 max=27.125
Linear output=0 dtype=torch.bfloat16 min=-5.625 max=5.5625
Linear input=0 dtype=torch.bfloat16 min=-13.375 max=27.125
Linear output=0 dtype=torch.bfloat16 min=-9.5625 max=9.5625
Linear input=0 dtype=torch.bfloat16 min=-13.375 max=27.125
Linear output=0 dtype=torch.bfloat16 min=-2.390625 max=3.421875
Linear input=0 dtype=torch.bfloat16 min=-1.484375 max=1.0703125
Linear output=0 dtype=torch.bfloat16 min=-0.78515625 max=0.84375
CLIPAttention output=0 dtype=torch.bfloat16 min=-0.78515625 max=0.84375
LayerNorm input=0 dtype=torch.bfloat16 min=-63.75 max=18.875
LayerNorm output=0 dtype=torch.bfloat16 min=-10.125 max=18.625
Linear input=0 dtype=torch.bfloat16 min=-10.125 max=18.625
Linear output=0 dtype=torch.bfloat16 min=-6.875 max=2.71875
GELUActivation input=0 dtype=torch.bfloat16 min=-6.875 max=2.71875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.703125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.703125
Linear output=0 dtype=torch.bfloat16 min=-1.234375 max=1.2734375
CLIPMLP input=0 dtype=torch.bfloat16 min=-10.125 max=18.625
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.234375 max=1.2734375
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-64.0 max=18.75
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-63.5 max=18.875
LayerNorm input=0 dtype=torch.bfloat16 min=-63.5 max=18.875
LayerNorm output=0 dtype=torch.bfloat16 min=-17.0 max=21.75
Linear input=0 dtype=torch.bfloat16 min=-17.0 max=21.75
Linear output=0 dtype=torch.bfloat16 min=-5.21875 max=6.6875
Linear input=0 dtype=torch.bfloat16 min=-17.0 max=21.75
Linear output=0 dtype=torch.bfloat16 min=-9.1875 max=8.8125
Linear input=0 dtype=torch.bfloat16 min=-17.0 max=21.75
Linear output=0 dtype=torch.bfloat16 min=-3.015625 max=3.25
Linear input=0 dtype=torch.bfloat16 min=-2.234375 max=1.265625
Linear output=0 dtype=torch.bfloat16 min=-1.609375 max=1.6796875
CLIPAttention output=0 dtype=torch.bfloat16 min=-1.609375 max=1.6796875
LayerNorm input=0 dtype=torch.bfloat16 min=-65.0 max=19.125
LayerNorm output=0 dtype=torch.bfloat16 min=-13.0 max=20.625
Linear input=0 dtype=torch.bfloat16 min=-13.0 max=20.625
Linear output=0 dtype=torch.bfloat16 min=-6.1875 max=4.46875
GELUActivation input=0 dtype=torch.bfloat16 min=-6.1875 max=4.46875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.46875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.46875
Linear output=0 dtype=torch.bfloat16 min=-1.5546875 max=5.03125
CLIPMLP input=0 dtype=torch.bfloat16 min=-13.0 max=20.625
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.5546875 max=5.03125
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-63.5 max=18.875
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-60.0 max=19.125
LayerNorm input=0 dtype=torch.bfloat16 min=-60.0 max=19.125
LayerNorm output=0 dtype=torch.bfloat16 min=-13.375 max=26.375
Linear input=0 dtype=torch.bfloat16 min=-13.375 max=26.375
Linear output=0 dtype=torch.bfloat16 min=-4.71875 max=4.375
Linear input=0 dtype=torch.bfloat16 min=-13.375 max=26.375
Linear output=0 dtype=torch.bfloat16 min=-9.0 max=8.5
Linear input=0 dtype=torch.bfloat16 min=-13.375 max=26.375
Linear output=0 dtype=torch.bfloat16 min=-2.53125 max=3.6875
Linear input=0 dtype=torch.bfloat16 min=-1.53125 max=1.28125
Linear output=0 dtype=torch.bfloat16 min=-2.828125 max=0.90234375
CLIPAttention output=0 dtype=torch.bfloat16 min=-2.828125 max=0.90234375
LayerNorm input=0 dtype=torch.bfloat16 min=-62.5 max=19.25
LayerNorm output=0 dtype=torch.bfloat16 min=-22.375 max=12.6875
Linear input=0 dtype=torch.bfloat16 min=-22.375 max=12.6875
Linear output=0 dtype=torch.bfloat16 min=-6.59375 max=3.640625
GELUActivation input=0 dtype=torch.bfloat16 min=-6.59375 max=3.640625
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.640625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.640625
Linear output=0 dtype=torch.bfloat16 min=-1.03125 max=7.21875
CLIPMLP input=0 dtype=torch.bfloat16 min=-22.375 max=12.6875
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.03125 max=7.21875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-60.0 max=19.125
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-59.25 max=19.125
LayerNorm input=0 dtype=torch.bfloat16 min=-59.25 max=19.125
LayerNorm output=0 dtype=torch.bfloat16 min=-11.6875 max=25.875
Linear input=0 dtype=torch.bfloat16 min=-11.6875 max=25.875
Linear output=0 dtype=torch.bfloat16 min=-4.34375 max=4.15625
Linear input=0 dtype=torch.bfloat16 min=-11.6875 max=25.875
Linear output=0 dtype=torch.bfloat16 min=-7.34375 max=7.4375
Linear input=0 dtype=torch.bfloat16 min=-11.6875 max=25.875
Linear output=0 dtype=torch.bfloat16 min=-2.46875 max=2.4375
Linear input=0 dtype=torch.bfloat16 min=-1.2265625 max=1.15625
Linear output=0 dtype=torch.bfloat16 min=-6.46875 max=1.1328125
CLIPAttention output=0 dtype=torch.bfloat16 min=-6.46875 max=1.1328125
LayerNorm input=0 dtype=torch.bfloat16 min=-65.5 max=19.125
LayerNorm output=0 dtype=torch.bfloat16 min=-24.75 max=18.125
Linear input=0 dtype=torch.bfloat16 min=-24.75 max=18.125
Linear output=0 dtype=torch.bfloat16 min=-6.1875 max=2.4375
GELUActivation input=0 dtype=torch.bfloat16 min=-6.1875 max=2.4375
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=2.421875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=2.421875
Linear output=0 dtype=torch.bfloat16 min=-1.265625 max=5.90625
CLIPMLP input=0 dtype=torch.bfloat16 min=-24.75 max=18.125
CLIPMLP output=0 dtype=torch.bfloat16 min=-1.265625 max=5.90625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-59.25 max=19.125
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-59.5 max=18.75
LayerNorm input=0 dtype=torch.bfloat16 min=-59.5 max=18.75
LayerNorm output=0 dtype=torch.bfloat16 min=-14.9375 max=22.125
Linear input=0 dtype=torch.bfloat16 min=-14.9375 max=22.125
Linear output=0 dtype=torch.bfloat16 min=-5.4375 max=4.53125
Linear input=0 dtype=torch.bfloat16 min=-14.9375 max=22.125
Linear output=0 dtype=torch.bfloat16 min=-5.71875 max=5.875
Linear input=0 dtype=torch.bfloat16 min=-14.9375 max=22.125
Linear output=0 dtype=torch.bfloat16 min=-3.359375 max=3.53125
Linear input=0 dtype=torch.bfloat16 min=-1.125 max=1.0
Linear output=0 dtype=torch.bfloat16 min=-8.875 max=1.671875
CLIPAttention output=0 dtype=torch.bfloat16 min=-8.875 max=1.671875
LayerNorm input=0 dtype=torch.bfloat16 min=-68.0 max=18.875
LayerNorm output=0 dtype=torch.bfloat16 min=-24.25 max=25.25
Linear input=0 dtype=torch.bfloat16 min=-24.25 max=25.25
Linear output=0 dtype=torch.bfloat16 min=-6.96875 max=4.46875
GELUActivation input=0 dtype=torch.bfloat16 min=-6.96875 max=4.46875
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.46875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.46875
Linear output=0 dtype=torch.bfloat16 min=-3.28125 max=14.625
CLIPMLP input=0 dtype=torch.bfloat16 min=-24.25 max=25.25
CLIPMLP output=0 dtype=torch.bfloat16 min=-3.28125 max=14.625
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-59.5 max=18.75
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-66.5 max=18.125
LayerNorm input=0 dtype=torch.bfloat16 min=-66.5 max=18.125
LayerNorm output=0 dtype=torch.bfloat16 min=-11.6875 max=35.5
Linear input=0 dtype=torch.bfloat16 min=-11.6875 max=35.5
Linear output=0 dtype=torch.bfloat16 min=-6.5 max=5.8125
Linear input=0 dtype=torch.bfloat16 min=-11.6875 max=35.5
Linear output=0 dtype=torch.bfloat16 min=-7.15625 max=8.25
Linear input=0 dtype=torch.bfloat16 min=-11.6875 max=35.5
Linear output=0 dtype=torch.bfloat16 min=-2.625 max=2.171875
Linear input=0 dtype=torch.bfloat16 min=-1.078125 max=1.1015625
Linear output=0 dtype=torch.bfloat16 min=-10.5 max=2.984375
CLIPAttention output=0 dtype=torch.bfloat16 min=-10.5 max=2.984375
LayerNorm input=0 dtype=torch.bfloat16 min=-72.0 max=18.125
LayerNorm output=0 dtype=torch.bfloat16 min=-15.125 max=18.5
Linear input=0 dtype=torch.bfloat16 min=-15.125 max=18.5
Linear output=0 dtype=torch.bfloat16 min=-6.34375 max=3.59375
GELUActivation input=0 dtype=torch.bfloat16 min=-6.34375 max=3.59375
GELUActivation output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.59375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.59375
Linear output=0 dtype=torch.bfloat16 min=-12.9375 max=24.875
CLIPMLP input=0 dtype=torch.bfloat16 min=-15.125 max=18.5
CLIPMLP output=0 dtype=torch.bfloat16 min=-12.9375 max=24.875
CLIPEncoderLayer input=0 dtype=torch.bfloat16 min=-66.5 max=18.125
CLIPEncoderLayer input=2 dtype=torch.bfloat16 min=-3.3895313892515355e+38 max=0.0
CLIPEncoderLayer output=0 dtype=torch.bfloat16 min=-67.5 max=30.125
LayerNorm input=0 dtype=torch.bfloat16 min=-67.5 max=30.125
LayerNorm output=0 dtype=torch.bfloat16 min=-33.25 max=26.25
Linear input=0 dtype=torch.bfloat16 min=-7.6875 max=19.875
Linear output=0 dtype=torch.bfloat16 min=-4.40625 max=3.59375
CLIPTextModelWithProjection input=0 dtype=torch.int64 min=0 max=49407
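
[Note: the per-module lines in this trace have the shape produced by PyTorch forward hooks that print per-tensor dtype/min/max. The sketch below is an assumption about how such a trace could be generated, not the actual script behind this dump; the checkpoint id and the helper names (make_hook, log) are illustrative. The pipeline attributes used are the standard diffusers StableDiffusion3Pipeline text-encoder components.]

import torch
from diffusers import StableDiffusion3Pipeline

def make_hook(name):
    # Emits one "<ModuleClass> input=i ..." / "<ModuleClass> output=i ..." line
    # per tensor, skipping non-tensor arguments -- which would explain gaps in
    # the indices above (e.g. "input=1" absent when that argument is None).
    def hook(module, inputs, outputs):
        def log(values, label):
            if isinstance(values, torch.Tensor):
                values = (values,)
            for i, t in enumerate(values):
                if isinstance(t, torch.Tensor):
                    print(f"{name} {label}={i} dtype={t.dtype} "
                          f"min={t.min().item()} max={t.max().item()}")
        log(inputs, "input")
        log(outputs, "output")
    return hook

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",  # assumed checkpoint
    torch_dtype=torch.bfloat16,
)
# Instrument every submodule of the three text encoders (CLIP-L, CLIP-G, T5),
# matching the module classes seen in this trace.
for encoder in (pipe.text_encoder, pipe.text_encoder_2, pipe.text_encoder_3):
    for mod in encoder.modules():
        mod.register_forward_hook(make_hook(type(mod).__name__))

[End of note; the trace continues with the T5 encoder below.]
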
Embedding input=0 dtype=torch.int64 min=0 max=1
Embedding output=0 dtype=torch.bfloat16 min=-102.0 max=231.0
Dropout input=0 dtype=torch.bfloat16 min=-102.0 max=231.0
Dropout output=0 dtype=torch.bfloat16 min=-102.0 max=231.0
T5LayerNorm input=0 dtype=torch.bfloat16 min=-102.0 max=231.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.109375 max=0.75390625
Linear input=0 dtype=torch.bfloat16 min=-2.109375 max=0.75390625
Linear output=0 dtype=torch.bfloat16 min=-0.5 max=0.490234375
Linear input=0 dtype=torch.bfloat16 min=-2.109375 max=0.75390625
Linear output=0 dtype=torch.bfloat16 min=-4.9375 max=4.53125
Linear input=0 dtype=torch.bfloat16 min=-2.109375 max=0.75390625
Linear output=0 dtype=torch.bfloat16 min=-4.09375 max=4.03125
Embedding input=0 dtype=torch.int64 min=0 max=30
Embedding output=0 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=3 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=4 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=5 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=6 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=7 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=8 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=9 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=10 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=11 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=12 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=13 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=14 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=15 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=16 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=17 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=18 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=19 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=20 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=21 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=22 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=23 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=24 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=25 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=26 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=27 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=28 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=29 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=30 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=31 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=32 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=33 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=34 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=35 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=36 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=37 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=38 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=39 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=40 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=41 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=42 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=43 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=44 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=45 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=46 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=47 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=48 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=49 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=50 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=51 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=52 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=53 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=54 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=55 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=56 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=57 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=58 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=59 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=60 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=61 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=62 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=63 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=64 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=65 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=66 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=67 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=68 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=69 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=70 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=71 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=72 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=73 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=74 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=75 dtype=torch.bfloat16 min=-47.25 max=11.1875
Embedding output=76 dtype=torch.bfloat16 min=-47.25 max=11.1875
Linear input=0 dtype=torch.bfloat16 min=-2.3125 max=1.4375
Linear output=0 dtype=torch.bfloat16 min=-34.25 max=33.5
T5Attention input=0 dtype=torch.bfloat16 min=-2.109375 max=0.75390625
T5Attention output=0 dtype=torch.bfloat16 min=-34.25 max=33.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-34.25 max=33.5
Dropout output=0 dtype=torch.bfloat16 min=-34.25 max=33.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-102.0 max=231.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-100.5 max=232.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-100.5 max=232.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.328125 max=2.0625
Linear input=0 dtype=torch.bfloat16 min=-2.328125 max=2.0625
Linear output=0 dtype=torch.bfloat16 min=-4.625 max=6.03125
NewGELUActivation input=0 dtype=torch.bfloat16 min=-4.625 max=6.03125
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=6.03125
Linear input=0 dtype=torch.bfloat16 min=-2.328125 max=2.0625
Linear output=0 dtype=torch.bfloat16 min=-5.46875 max=5.78125
Dropout input=0 dtype=torch.bfloat16 min=-33.0 max=11.75
Dropout output=0 dtype=torch.bfloat16 min=-33.0 max=11.75
Linear input=0 dtype=torch.bfloat16 min=-33.0 max=11.75
Linear output=0 dtype=torch.bfloat16 min=-48.25 max=77.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.328125 max=2.0625
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-48.25 max=77.0
Dropout input=0 dtype=torch.bfloat16 min=-48.25 max=77.0
Dropout output=0 dtype=torch.bfloat16 min=-48.25 max=77.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-100.5 max=232.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-118.0 max=237.0
T5Block input=0 dtype=torch.bfloat16 min=-102.0 max=231.0
T5Block output=0 dtype=torch.bfloat16 min=-118.0 max=237.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-118.0 max=237.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-3.328125 max=5.8125
Linear input=0 dtype=torch.bfloat16 min=-3.328125 max=5.8125
Linear output=0 dtype=torch.bfloat16 min=-0.89453125 max=0.69140625
Linear input=0 dtype=torch.bfloat16 min=-3.328125 max=5.8125
Linear output=0 dtype=torch.bfloat16 min=-6.0 max=5.9375
Linear input=0 dtype=torch.bfloat16 min=-3.328125 max=5.8125
Linear output=0 dtype=torch.bfloat16 min=-3.96875 max=4.03125
Linear input=0 dtype=torch.bfloat16 min=-3.71875 max=4.0
Linear output=0 dtype=torch.bfloat16 min=-107.5 max=127.0
T5Attention input=0 dtype=torch.bfloat16 min=-3.328125 max=5.8125
T5Attention output=0 dtype=torch.bfloat16 min=-107.5 max=127.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-107.5 max=127.0
Dropout output=0 dtype=torch.bfloat16 min=-107.5 max=127.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-118.0 max=237.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-122.0 max=233.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-122.0 max=233.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-4.1875 max=1.625
Linear input=0 dtype=torch.bfloat16 min=-4.1875 max=1.625
Linear output=0 dtype=torch.bfloat16 min=-4.375 max=4.4375
NewGELUActivation input=0 dtype=torch.bfloat16 min=-4.375 max=4.4375
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=4.4375
Linear input=0 dtype=torch.bfloat16 min=-4.1875 max=1.625
Linear output=0 dtype=torch.bfloat16 min=-3.578125 max=4.03125
Dropout input=0 dtype=torch.bfloat16 min=-12.8125 max=10.875
Dropout output=0 dtype=torch.bfloat16 min=-12.8125 max=10.875
Linear input=0 dtype=torch.bfloat16 min=-12.8125 max=10.875
Linear output=0 dtype=torch.bfloat16 min=-422.0 max=392.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-4.1875 max=1.625
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-422.0 max=392.0
Dropout input=0 dtype=torch.bfloat16 min=-422.0 max=392.0
Dropout output=0 dtype=torch.bfloat16 min=-422.0 max=392.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-122.0 max=233.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-504.0 max=460.0
T5Block input=0 dtype=torch.bfloat16 min=-118.0 max=237.0
T5Block output=0 dtype=torch.bfloat16 min=-504.0 max=460.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-504.0 max=460.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-4.0625 max=5.875
Linear input=0 dtype=torch.bfloat16 min=-4.0625 max=5.875
Linear output=0 dtype=torch.bfloat16 min=-0.74609375 max=0.73828125
Linear input=0 dtype=torch.bfloat16 min=-4.0625 max=5.875
Linear output=0 dtype=torch.bfloat16 min=-7.84375 max=9.25
Linear input=0 dtype=torch.bfloat16 min=-4.0625 max=5.875
Linear output=0 dtype=torch.bfloat16 min=-3.4375 max=2.65625
Linear input=0 dtype=torch.bfloat16 min=-2.765625 max=2.65625
Linear output=0 dtype=torch.bfloat16 min=-194.0 max=191.0
T5Attention input=0 dtype=torch.bfloat16 min=-4.0625 max=5.875
T5Attention output=0 dtype=torch.bfloat16 min=-194.0 max=191.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-194.0 max=191.0
Dropout output=0 dtype=torch.bfloat16 min=-194.0 max=191.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-504.0 max=460.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-696.0 max=652.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-696.0 max=652.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-3.953125 max=5.78125
Linear input=0 dtype=torch.bfloat16 min=-3.953125 max=5.78125
Linear output=0 dtype=torch.bfloat16 min=-11.0 max=6.15625
NewGELUActivation input=0 dtype=torch.bfloat16 min=-11.0 max=6.15625
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=6.15625
Linear input=0 dtype=torch.bfloat16 min=-3.953125 max=5.78125
Linear output=0 dtype=torch.bfloat16 min=-7.84375 max=5.5
Dropout input=0 dtype=torch.bfloat16 min=-15.5625 max=12.0
Dropout output=0 dtype=torch.bfloat16 min=-15.5625 max=12.0
Linear input=0 dtype=torch.bfloat16 min=-15.5625 max=12.0
Linear output=0 dtype=torch.bfloat16 min=-114.5 max=117.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-3.953125 max=5.78125
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-114.5 max=117.0
Dropout input=0 dtype=torch.bfloat16 min=-114.5 max=117.0
Dropout output=0 dtype=torch.bfloat16 min=-114.5 max=117.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-696.0 max=652.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-776.0 max=732.0
T5Block input=0 dtype=torch.bfloat16 min=-504.0 max=460.0
T5Block output=0 dtype=torch.bfloat16 min=-776.0 max=732.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-776.0 max=732.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.40625 max=3.125
Linear input=0 dtype=torch.bfloat16 min=-2.40625 max=3.125
Linear output=0 dtype=torch.bfloat16 min=-0.75 max=0.95703125
Linear input=0 dtype=torch.bfloat16 min=-2.40625 max=3.125
Linear output=0 dtype=torch.bfloat16 min=-7.84375 max=7.375
Linear input=0 dtype=torch.bfloat16 min=-2.40625 max=3.125
Linear output=0 dtype=torch.bfloat16 min=-3.515625 max=3.46875
Linear input=0 dtype=torch.bfloat16 min=-3.5 max=3.4375
Linear output=0 dtype=torch.bfloat16 min=-87.0 max=95.0
T5Attention input=0 dtype=torch.bfloat16 min=-2.40625 max=3.125
T5Attention output=0 dtype=torch.bfloat16 min=-87.0 max=95.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-87.0 max=95.0
Dropout output=0 dtype=torch.bfloat16 min=-87.0 max=95.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-776.0 max=732.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-812.0 max=792.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-812.0 max=792.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.5 max=5.03125
Linear input=0 dtype=torch.bfloat16 min=-2.5 max=5.03125
Linear output=0 dtype=torch.bfloat16 min=-9.8125 max=7.96875
NewGELUActivation input=0 dtype=torch.bfloat16 min=-9.8125 max=7.96875
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=7.96875
Linear input=0 dtype=torch.bfloat16 min=-2.5 max=5.03125
Linear output=0 dtype=torch.bfloat16 min=-7.90625 max=6.0
Dropout input=0 dtype=torch.bfloat16 min=-23.25 max=24.375
Dropout output=0 dtype=torch.bfloat16 min=-23.25 max=24.375
Linear input=0 dtype=torch.bfloat16 min=-23.25 max=24.375
Linear output=0 dtype=torch.bfloat16 min=-221.0 max=226.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.5 max=5.03125
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-221.0 max=226.0
Dropout input=0 dtype=torch.bfloat16 min=-221.0 max=226.0
Dropout output=0 dtype=torch.bfloat16 min=-221.0 max=226.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-812.0 max=792.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-880.0 max=860.0
T5Block input=0 dtype=torch.bfloat16 min=-776.0 max=732.0
T5Block output=0 dtype=torch.bfloat16 min=-880.0 max=860.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-880.0 max=860.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.5 max=1.5859375
Linear input=0 dtype=torch.bfloat16 min=-1.5 max=1.5859375
Linear output=0 dtype=torch.bfloat16 min=-0.9140625 max=0.94921875
Linear input=0 dtype=torch.bfloat16 min=-1.5 max=1.5859375
Linear output=0 dtype=torch.bfloat16 min=-6.1875 max=6.53125
Linear input=0 dtype=torch.bfloat16 min=-1.5 max=1.5859375
Linear output=0 dtype=torch.bfloat16 min=-3.328125 max=3.953125
Linear input=0 dtype=torch.bfloat16 min=-3.25 max=3.734375
Linear output=0 dtype=torch.bfloat16 min=-97.5 max=110.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.5 max=1.5859375
T5Attention output=0 dtype=torch.bfloat16 min=-97.5 max=110.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-97.5 max=110.5
Dropout output=0 dtype=torch.bfloat16 min=-97.5 max=110.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-880.0 max=860.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-932.0 max=904.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-932.0 max=904.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.34375 max=2.953125
Linear input=0 dtype=torch.bfloat16 min=-1.34375 max=2.953125
Linear output=0 dtype=torch.bfloat16 min=-10.6875 max=4.78125
NewGELUActivation input=0 dtype=torch.bfloat16 min=-10.6875 max=4.78125
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=4.78125
Linear input=0 dtype=torch.bfloat16 min=-1.34375 max=2.953125
Linear output=0 dtype=torch.bfloat16 min=-7.34375 max=7.21875
Dropout input=0 dtype=torch.bfloat16 min=-35.0 max=25.375
Dropout output=0 dtype=torch.bfloat16 min=-35.0 max=25.375
Linear input=0 dtype=torch.bfloat16 min=-35.0 max=25.375
Linear output=0 dtype=torch.bfloat16 min=-193.0 max=198.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-1.34375 max=2.953125
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-193.0 max=198.0
Dropout input=0 dtype=torch.bfloat16 min=-193.0 max=198.0
Dropout output=0 dtype=torch.bfloat16 min=-193.0 max=198.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-932.0 max=904.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-1000.0 max=976.0
T5Block input=0 dtype=torch.bfloat16 min=-880.0 max=860.0
T5Block output=0 dtype=torch.bfloat16 min=-1000.0 max=976.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-1000.0 max=976.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.1953125 max=1.53125
Linear input=0 dtype=torch.bfloat16 min=-1.1953125 max=1.53125
Linear output=0 dtype=torch.bfloat16 min=-1.015625 max=1.109375
Linear input=0 dtype=torch.bfloat16 min=-1.1953125 max=1.53125
Linear output=0 dtype=torch.bfloat16 min=-7.8125 max=8.4375
Linear input=0 dtype=torch.bfloat16 min=-1.1953125 max=1.53125
Linear output=0 dtype=torch.bfloat16 min=-2.921875 max=2.359375
Linear input=0 dtype=torch.bfloat16 min=-2.84375 max=2.0
Linear output=0 dtype=torch.bfloat16 min=-109.5 max=114.0
T5Attention input=0 dtype=torch.bfloat16 min=-1.1953125 max=1.53125
T5Attention output=0 dtype=torch.bfloat16 min=-109.5 max=114.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-109.5 max=114.0
Dropout output=0 dtype=torch.bfloat16 min=-109.5 max=114.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-1000.0 max=976.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-1080.0 max=1072.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-1080.0 max=1072.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-0.99609375 max=2.296875
Linear input=0 dtype=torch.bfloat16 min=-0.99609375 max=2.296875
Linear output=0 dtype=torch.bfloat16 min=-7.21875 max=4.4375
NewGELUActivation input=0 dtype=torch.bfloat16 min=-7.21875 max=4.4375
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=4.4375
Linear input=0 dtype=torch.bfloat16 min=-0.99609375 max=2.296875
Linear output=0 dtype=torch.bfloat16 min=-4.84375 max=5.28125
Dropout input=0 dtype=torch.bfloat16 min=-7.875 max=12.875
Dropout output=0 dtype=torch.bfloat16 min=-7.875 max=12.875
Linear input=0 dtype=torch.bfloat16 min=-7.875 max=12.875
Linear output=0 dtype=torch.bfloat16 min=-93.5 max=92.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-0.99609375 max=2.296875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-93.5 max=92.0
Dropout input=0 dtype=torch.bfloat16 min=-93.5 max=92.0
Dropout output=0 dtype=torch.bfloat16 min=-93.5 max=92.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-1080.0 max=1072.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-1176.0 max=1168.0
T5Block input=0 dtype=torch.bfloat16 min=-1000.0 max=976.0
T5Block output=0 dtype=torch.bfloat16 min=-1176.0 max=1168.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-1176.0 max=1168.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.1171875 max=1.34375
Linear input=0 dtype=torch.bfloat16 min=-1.1171875 max=1.34375
Linear output=0 dtype=torch.bfloat16 min=-0.7265625 max=0.7109375
Linear input=0 dtype=torch.bfloat16 min=-1.1171875 max=1.34375
Linear output=0 dtype=torch.bfloat16 min=-6.5 max=6.75
Linear input=0 dtype=torch.bfloat16 min=-1.1171875 max=1.34375
Linear output=0 dtype=torch.bfloat16 min=-3.0625 max=3.15625
Linear input=0 dtype=torch.bfloat16 min=-2.296875 max=2.390625
Linear output=0 dtype=torch.bfloat16 min=-111.5 max=82.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.1171875 max=1.34375
T5Attention output=0 dtype=torch.bfloat16 min=-111.5 max=82.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-111.5 max=82.5
Dropout output=0 dtype=torch.bfloat16 min=-111.5 max=82.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-1176.0 max=1168.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-1200.0 max=1216.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-1200.0 max=1216.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-0.85546875 max=1.6640625
Linear input=0 dtype=torch.bfloat16 min=-0.85546875 max=1.6640625
Linear output=0 dtype=torch.bfloat16 min=-5.625 max=3.453125
NewGELUActivation input=0 dtype=torch.bfloat16 min=-5.625 max=3.453125
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=3.453125
Linear input=0 dtype=torch.bfloat16 min=-0.85546875 max=1.6640625
Linear output=0 dtype=torch.bfloat16 min=-4.125 max=3.984375
Dropout input=0 dtype=torch.bfloat16 min=-12.1875 max=8.5
Dropout output=0 dtype=torch.bfloat16 min=-12.1875 max=8.5
Linear input=0 dtype=torch.bfloat16 min=-12.1875 max=8.5
Linear output=0 dtype=torch.bfloat16 min=-135.0 max=167.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-0.85546875 max=1.6640625
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-135.0 max=167.0
Dropout input=0 dtype=torch.bfloat16 min=-135.0 max=167.0
Dropout output=0 dtype=torch.bfloat16 min=-135.0 max=167.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-1200.0 max=1216.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-1256.0 max=1280.0
T5Block input=0 dtype=torch.bfloat16 min=-1176.0 max=1168.0
T5Block output=0 dtype=torch.bfloat16 min=-1256.0 max=1280.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-1256.0 max=1280.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.2265625 max=0.953125
Linear input=0 dtype=torch.bfloat16 min=-1.2265625 max=0.953125
Linear output=0 dtype=torch.bfloat16 min=-0.6328125 max=0.78515625
Linear input=0 dtype=torch.bfloat16 min=-1.2265625 max=0.953125
Linear output=0 dtype=torch.bfloat16 min=-4.9375 max=5.3125
Linear input=0 dtype=torch.bfloat16 min=-1.2265625 max=0.953125
Linear output=0 dtype=torch.bfloat16 min=-2.59375 max=2.484375
Linear input=0 dtype=torch.bfloat16 min=-1.9765625 max=2.125
Linear output=0 dtype=torch.bfloat16 min=-77.0 max=48.0
T5Attention input=0 dtype=torch.bfloat16 min=-1.2265625 max=0.953125
T5Attention output=0 dtype=torch.bfloat16 min=-77.0 max=48.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-77.0 max=48.0
Dropout output=0 dtype=torch.bfloat16 min=-77.0 max=48.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-1256.0 max=1280.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-1272.0 max=1312.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-1272.0 max=1312.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.65625 max=1.3125
Linear input=0 dtype=torch.bfloat16 min=-2.65625 max=1.3125
Linear output=0 dtype=torch.bfloat16 min=-6.15625 max=16.125
NewGELUActivation input=0 dtype=torch.bfloat16 min=-6.15625 max=16.125
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=16.125
Linear input=0 dtype=torch.bfloat16 min=-2.65625 max=1.3125
Linear output=0 dtype=torch.bfloat16 min=-35.75 max=28.0
Dropout input=0 dtype=torch.bfloat16 min=-564.0 max=388.0
Dropout output=0 dtype=torch.bfloat16 min=-564.0 max=388.0
Linear input=0 dtype=torch.bfloat16 min=-564.0 max=388.0
Linear output=0 dtype=torch.bfloat16 min=-12928.0 max=13760.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.65625 max=1.3125
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-12928.0 max=13760.0
Dropout input=0 dtype=torch.bfloat16 min=-12928.0 max=13760.0
Dropout output=0 dtype=torch.bfloat16 min=-12928.0 max=13760.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-1272.0 max=1312.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-13184.0 max=14272.0
T5Block input=0 dtype=torch.bfloat16 min=-1256.0 max=1280.0
T5Block output=0 dtype=torch.bfloat16 min=-13184.0 max=14272.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-13184.0 max=14272.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.1953125 max=1.8984375
Linear input=0 dtype=torch.bfloat16 min=-1.1953125 max=1.8984375
Linear output=0 dtype=torch.bfloat16 min=-0.69140625 max=0.67578125
Linear input=0 dtype=torch.bfloat16 min=-1.1953125 max=1.8984375
Linear output=0 dtype=torch.bfloat16 min=-5.40625 max=5.65625
Linear input=0 dtype=torch.bfloat16 min=-1.1953125 max=1.8984375
Linear output=0 dtype=torch.bfloat16 min=-2.125 max=2.421875
Linear input=0 dtype=torch.bfloat16 min=-1.8828125 max=2.28125
Linear output=0 dtype=torch.bfloat16 min=-72.0 max=53.75
T5Attention input=0 dtype=torch.bfloat16 min=-1.1953125 max=1.8984375
T5Attention output=0 dtype=torch.bfloat16 min=-72.0 max=53.75
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-72.0 max=53.75
Dropout output=0 dtype=torch.bfloat16 min=-72.0 max=53.75
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-13184.0 max=14272.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-13184.0 max=14336.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-13184.0 max=14336.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.46875 max=1.8515625
Linear input=0 dtype=torch.bfloat16 min=-2.46875 max=1.8515625
Linear output=0 dtype=torch.bfloat16 min=-3.953125 max=11.1875
NewGELUActivation input=0 dtype=torch.bfloat16 min=-3.953125 max=11.1875
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=11.1875
Linear input=0 dtype=torch.bfloat16 min=-2.46875 max=1.8515625
Linear output=0 dtype=torch.bfloat16 min=-9.375 max=17.875
Dropout input=0 dtype=torch.bfloat16 min=-31.875 max=87.5
Dropout output=0 dtype=torch.bfloat16 min=-31.875 max=87.5
Linear input=0 dtype=torch.bfloat16 min=-31.875 max=87.5
Linear output=0 dtype=torch.bfloat16 min=-1688.0 max=2112.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.46875 max=1.8515625
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-1688.0 max=2112.0
Dropout input=0 dtype=torch.bfloat16 min=-1688.0 max=2112.0
Dropout output=0 dtype=torch.bfloat16 min=-1688.0 max=2112.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-13184.0 max=14336.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-14848.0 max=16384.0
T5Block input=0 dtype=torch.bfloat16 min=-13184.0 max=14272.0
T5Block output=0 dtype=torch.bfloat16 min=-14848.0 max=16384.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-14848.0 max=16384.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.140625 max=1.4140625
Linear input=0 dtype=torch.bfloat16 min=-1.140625 max=1.4140625
Linear output=0 dtype=torch.bfloat16 min=-0.62109375 max=0.5390625
Linear input=0 dtype=torch.bfloat16 min=-1.140625 max=1.4140625
Linear output=0 dtype=torch.bfloat16 min=-7.4375 max=6.375
Linear input=0 dtype=torch.bfloat16 min=-1.140625 max=1.4140625
Linear output=0 dtype=torch.bfloat16 min=-2.859375 max=3.484375
Linear input=0 dtype=torch.bfloat16 min=-2.71875 max=3.234375
Linear output=0 dtype=torch.bfloat16 min=-97.5 max=47.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.140625 max=1.4140625
T5Attention output=0 dtype=torch.bfloat16 min=-97.5 max=47.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-97.5 max=47.5
Dropout output=0 dtype=torch.bfloat16 min=-97.5 max=47.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-14848.0 max=16384.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-14848.0 max=16384.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-14848.0 max=16384.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-3.9375 max=1.7421875
Linear input=0 dtype=torch.bfloat16 min=-3.9375 max=1.7421875
Linear output=0 dtype=torch.bfloat16 min=-4.75 max=15.125
NewGELUActivation input=0 dtype=torch.bfloat16 min=-4.75 max=15.125
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=15.125
Linear input=0 dtype=torch.bfloat16 min=-3.9375 max=1.7421875
Linear output=0 dtype=torch.bfloat16 min=-31.75 max=23.125
Dropout input=0 dtype=torch.bfloat16 min=-82.0 max=72.5
Dropout output=0 dtype=torch.bfloat16 min=-82.0 max=72.5
Linear input=0 dtype=torch.bfloat16 min=-82.0 max=72.5
Linear output=0 dtype=torch.bfloat16 min=-2736.0 max=3232.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-3.9375 max=1.7421875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-2736.0 max=3232.0
Dropout input=0 dtype=torch.bfloat16 min=-2736.0 max=3232.0
Dropout output=0 dtype=torch.bfloat16 min=-2736.0 max=3232.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-14848.0 max=16384.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-17536.0 max=19584.0
T5Block input=0 dtype=torch.bfloat16 min=-14848.0 max=16384.0
T5Block output=0 dtype=torch.bfloat16 min=-17536.0 max=19584.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-17536.0 max=19584.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.0078125 max=1.109375
Linear input=0 dtype=torch.bfloat16 min=-1.0078125 max=1.109375
Linear output=0 dtype=torch.bfloat16 min=-0.6328125 max=0.625
Linear input=0 dtype=torch.bfloat16 min=-1.0078125 max=1.109375
Linear output=0 dtype=torch.bfloat16 min=-5.65625 max=6.28125
Linear input=0 dtype=torch.bfloat16 min=-1.0078125 max=1.109375
Linear output=0 dtype=torch.bfloat16 min=-2.3125 max=2.296875
Linear input=0 dtype=torch.bfloat16 min=-1.8828125 max=1.859375
Linear output=0 dtype=torch.bfloat16 min=-76.0 max=52.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.0078125 max=1.109375
T5Attention output=0 dtype=torch.bfloat16 min=-76.0 max=52.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-76.0 max=52.5
Dropout output=0 dtype=torch.bfloat16 min=-76.0 max=52.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-17536.0 max=19584.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-17536.0 max=19584.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-17536.0 max=19584.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-8.875 max=2.046875
Linear input=0 dtype=torch.bfloat16 min=-8.875 max=2.046875
Linear output=0 dtype=torch.bfloat16 min=-7.34375 max=36.0
NewGELUActivation input=0 dtype=torch.bfloat16 min=-7.34375 max=36.0
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=36.0
Linear input=0 dtype=torch.bfloat16 min=-8.875 max=2.046875
Linear output=0 dtype=torch.bfloat16 min=-162.0 max=109.0
Dropout input=0 dtype=torch.bfloat16 min=-5312.0 max=3520.0
Dropout output=0 dtype=torch.bfloat16 min=-5312.0 max=3520.0
Linear input=0 dtype=torch.bfloat16 min=-5312.0 max=3520.0
Linear output=0 dtype=torch.bfloat16 min=-136192.0 max=142336.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-8.875 max=2.046875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-136192.0 max=142336.0
Dropout input=0 dtype=torch.bfloat16 min=-136192.0 max=142336.0
Dropout output=0 dtype=torch.bfloat16 min=-136192.0 max=142336.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-17536.0 max=19584.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-153600.0 max=161792.0
T5Block input=0 dtype=torch.bfloat16 min=-17536.0 max=19584.0
T5Block output=0 dtype=torch.bfloat16 min=-153600.0 max=161792.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-153600.0 max=161792.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-0.921875 max=1.109375
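
Note: the pair of lines just above shows a residual stream of roughly -1.5e5/+1.6e5 being normalized down to about -0.9/+1.1. T5LayerNorm is an RMS norm (no mean subtraction, no bias) with the variance computed in float32, which is what keeps the per-block inputs to attention and the FF well-scaled even as the residual magnitudes keep growing. A sketch of that computation (the eps value is assumed):

import torch

def t5_rmsnorm(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-6):
    # RMS-only normalization: divide by root-mean-square (in float32),
    # scale by the learned weight, cast back to the input dtype.
    variance = x.float().pow(2).mean(-1, keepdim=True)
    return (weight * (x.float() * torch.rsqrt(variance + eps))).to(x.dtype)
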
Linear input=0 dtype=torch.bfloat16 min=-0.921875 max=1.109375
Linear output=0 dtype=torch.bfloat16 min=-0.58203125 max=0.55859375
Linear input=0 dtype=torch.bfloat16 min=-0.921875 max=1.109375
Linear output=0 dtype=torch.bfloat16 min=-6.25 max=6.21875
Linear input=0 dtype=torch.bfloat16 min=-0.921875 max=1.109375
Linear output=0 dtype=torch.bfloat16 min=-3.4375 max=2.765625
Linear input=0 dtype=torch.bfloat16 min=-3.3125 max=2.359375
Linear output=0 dtype=torch.bfloat16 min=-66.0 max=92.5
T5Attention input=0 dtype=torch.bfloat16 min=-0.921875 max=1.109375
T5Attention output=0 dtype=torch.bfloat16 min=-66.0 max=92.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-66.0 max=92.5
Dropout output=0 dtype=torch.bfloat16 min=-66.0 max=92.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-153600.0 max=161792.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-153600.0 max=161792.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-153600.0 max=161792.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.1875 max=1.3671875
Linear input=0 dtype=torch.bfloat16 min=-2.1875 max=1.3671875
Linear output=0 dtype=torch.bfloat16 min=-4.71875 max=6.53125
NewGELUActivation input=0 dtype=torch.bfloat16 min=-4.71875 max=6.53125
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=6.53125
Linear input=0 dtype=torch.bfloat16 min=-2.1875 max=1.3671875
Linear output=0 dtype=torch.bfloat16 min=-7.53125 max=18.5
Dropout input=0 dtype=torch.bfloat16 min=-18.5 max=30.875
Dropout output=0 dtype=torch.bfloat16 min=-18.5 max=30.875
Linear input=0 dtype=torch.bfloat16 min=-18.5 max=30.875
Linear output=0 dtype=torch.bfloat16 min=-660.0 max=888.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.1875 max=1.3671875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-660.0 max=888.0
Dropout input=0 dtype=torch.bfloat16 min=-660.0 max=888.0
Dropout output=0 dtype=torch.bfloat16 min=-660.0 max=888.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-153600.0 max=161792.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-154624.0 max=162816.0
T5Block input=0 dtype=torch.bfloat16 min=-153600.0 max=161792.0
T5Block output=0 dtype=torch.bfloat16 min=-154624.0 max=162816.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-154624.0 max=162816.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.1796875 max=1.4453125
Linear input=0 dtype=torch.bfloat16 min=-1.1796875 max=1.4453125
Linear output=0 dtype=torch.bfloat16 min=-0.6640625 max=0.6484375
Linear input=0 dtype=torch.bfloat16 min=-1.1796875 max=1.4453125
Linear output=0 dtype=torch.bfloat16 min=-7.34375 max=8.6875
Linear input=0 dtype=torch.bfloat16 min=-1.1796875 max=1.4453125
Linear output=0 dtype=torch.bfloat16 min=-2.90625 max=3.84375
Linear input=0 dtype=torch.bfloat16 min=-2.59375 max=3.484375
Linear output=0 dtype=torch.bfloat16 min=-99.5 max=93.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.1796875 max=1.4453125
T5Attention output=0 dtype=torch.bfloat16 min=-99.5 max=93.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-99.5 max=93.5
Dropout output=0 dtype=torch.bfloat16 min=-99.5 max=93.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-154624.0 max=162816.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-154624.0 max=162816.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-154624.0 max=162816.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.359375 max=1.5859375
Linear input=0 dtype=torch.bfloat16 min=-2.359375 max=1.5859375
Linear output=0 dtype=torch.bfloat16 min=-4.5 max=7.6875
NewGELUActivation input=0 dtype=torch.bfloat16 min=-4.5 max=7.6875
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=7.6875
Linear input=0 dtype=torch.bfloat16 min=-2.359375 max=1.5859375
Linear output=0 dtype=torch.bfloat16 min=-48.5 max=52.75
Dropout input=0 dtype=torch.bfloat16 min=-278.0 max=244.0
Dropout output=0 dtype=torch.bfloat16 min=-278.0 max=244.0
Linear input=0 dtype=torch.bfloat16 min=-278.0 max=244.0
Linear output=0 dtype=torch.bfloat16 min=-5056.0 max=7168.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.359375 max=1.5859375
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-5056.0 max=7168.0
Dropout input=0 dtype=torch.bfloat16 min=-5056.0 max=7168.0
Dropout output=0 dtype=torch.bfloat16 min=-5056.0 max=7168.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-154624.0 max=162816.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-159744.0 max=169984.0
T5Block input=0 dtype=torch.bfloat16 min=-154624.0 max=162816.0
T5Block output=0 dtype=torch.bfloat16 min=-159744.0 max=169984.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-159744.0 max=169984.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.171875 max=1.5078125
Linear input=0 dtype=torch.bfloat16 min=-1.171875 max=1.5078125
Linear output=0 dtype=torch.bfloat16 min=-0.58984375 max=0.57421875
Linear input=0 dtype=torch.bfloat16 min=-1.171875 max=1.5078125
Linear output=0 dtype=torch.bfloat16 min=-8.125 max=6.5625
Linear input=0 dtype=torch.bfloat16 min=-1.171875 max=1.5078125
Linear output=0 dtype=torch.bfloat16 min=-3.890625 max=3.0625
Linear input=0 dtype=torch.bfloat16 min=-3.59375 max=2.796875
Linear output=0 dtype=torch.bfloat16 min=-82.5 max=93.5
T5Attention input=0 dtype=torch.bfloat16 min=-1.171875 max=1.5078125
T5Attention output=0 dtype=torch.bfloat16 min=-82.5 max=93.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-82.5 max=93.5
Dropout output=0 dtype=torch.bfloat16 min=-82.5 max=93.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-159744.0 max=169984.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-159744.0 max=169984.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-159744.0 max=169984.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.0625 max=1.640625
Linear input=0 dtype=torch.bfloat16 min=-2.0625 max=1.640625
Linear output=0 dtype=torch.bfloat16 min=-4.625 max=7.5
NewGELUActivation input=0 dtype=torch.bfloat16 min=-4.625 max=7.5
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=7.5
Linear input=0 dtype=torch.bfloat16 min=-2.0625 max=1.640625
Linear output=0 dtype=torch.bfloat16 min=-12.125 max=14.0625
Dropout input=0 dtype=torch.bfloat16 min=-33.75 max=37.5
Dropout output=0 dtype=torch.bfloat16 min=-33.75 max=37.5
Linear input=0 dtype=torch.bfloat16 min=-33.75 max=37.5
Linear output=0 dtype=torch.bfloat16 min=-1288.0 max=1816.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.0625 max=1.640625
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-1288.0 max=1816.0
Dropout input=0 dtype=torch.bfloat16 min=-1288.0 max=1816.0
Dropout output=0 dtype=torch.bfloat16 min=-1288.0 max=1816.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-159744.0 max=169984.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-160768.0 max=172032.0
T5Block input=0 dtype=torch.bfloat16 min=-159744.0 max=169984.0
T5Block output=0 dtype=torch.bfloat16 min=-160768.0 max=172032.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-160768.0 max=172032.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.1015625 max=1.515625
Linear input=0 dtype=torch.bfloat16 min=-1.1015625 max=1.515625
Linear output=0 dtype=torch.bfloat16 min=-0.62890625 max=0.6328125
Linear input=0 dtype=torch.bfloat16 min=-1.1015625 max=1.515625
Linear output=0 dtype=torch.bfloat16 min=-6.21875 max=6.1875
Linear input=0 dtype=torch.bfloat16 min=-1.1015625 max=1.515625
Linear output=0 dtype=torch.bfloat16 min=-3.171875 max=3.359375
Linear input=0 dtype=torch.bfloat16 min=-2.75 max=2.8125
Linear output=0 dtype=torch.bfloat16 min=-72.0 max=82.0
T5Attention input=0 dtype=torch.bfloat16 min=-1.1015625 max=1.515625
T5Attention output=0 dtype=torch.bfloat16 min=-72.0 max=82.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-72.0 max=82.0
Dropout output=0 dtype=torch.bfloat16 min=-72.0 max=82.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-160768.0 max=172032.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-160768.0 max=172032.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-160768.0 max=172032.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.3125 max=1.4296875
Linear input=0 dtype=torch.bfloat16 min=-2.3125 max=1.4296875
Linear output=0 dtype=torch.bfloat16 min=-5.78125 max=5.03125
NewGELUActivation input=0 dtype=torch.bfloat16 min=-5.78125 max=5.03125
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=5.03125
Linear input=0 dtype=torch.bfloat16 min=-2.3125 max=1.4296875
Linear output=0 dtype=torch.bfloat16 min=-15.1875 max=13.0
Dropout input=0 dtype=torch.bfloat16 min=-53.5 max=31.5
Dropout output=0 dtype=torch.bfloat16 min=-53.5 max=31.5
Linear input=0 dtype=torch.bfloat16 min=-53.5 max=31.5
Linear output=0 dtype=torch.bfloat16 min=-1032.0 max=1384.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.3125 max=1.4296875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-1032.0 max=1384.0
Dropout input=0 dtype=torch.bfloat16 min=-1032.0 max=1384.0
Dropout output=0 dtype=torch.bfloat16 min=-1032.0 max=1384.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-160768.0 max=172032.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-161792.0 max=173056.0
T5Block input=0 dtype=torch.bfloat16 min=-160768.0 max=172032.0
T5Block output=0 dtype=torch.bfloat16 min=-161792.0 max=173056.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-161792.0 max=173056.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.390625 max=2.59375
Linear input=0 dtype=torch.bfloat16 min=-2.390625 max=2.59375
Linear output=0 dtype=torch.bfloat16 min=-0.671875 max=0.6875
Linear input=0 dtype=torch.bfloat16 min=-2.390625 max=2.59375
Linear output=0 dtype=torch.bfloat16 min=-7.21875 max=7.28125
Linear input=0 dtype=torch.bfloat16 min=-2.390625 max=2.59375
Linear output=0 dtype=torch.bfloat16 min=-4.46875 max=4.375
Linear input=0 dtype=torch.bfloat16 min=-4.03125 max=3.578125
Linear output=0 dtype=torch.bfloat16 min=-87.5 max=97.5
T5Attention input=0 dtype=torch.bfloat16 min=-2.390625 max=2.59375
T5Attention output=0 dtype=torch.bfloat16 min=-87.5 max=97.5
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-87.5 max=97.5
Dropout output=0 dtype=torch.bfloat16 min=-87.5 max=97.5
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-161792.0 max=173056.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-161792.0 max=173056.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-161792.0 max=173056.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.4375 max=1.6171875
Linear input=0 dtype=torch.bfloat16 min=-2.4375 max=1.6171875
Linear output=0 dtype=torch.bfloat16 min=-8.25 max=7.5625
NewGELUActivation input=0 dtype=torch.bfloat16 min=-8.25 max=7.5625
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=7.5625
Linear input=0 dtype=torch.bfloat16 min=-2.4375 max=1.6171875
Linear output=0 dtype=torch.bfloat16 min=-48.25 max=26.375
Dropout input=0 dtype=torch.bfloat16 min=-96.5 max=97.0
Dropout output=0 dtype=torch.bfloat16 min=-96.5 max=97.0
Linear input=0 dtype=torch.bfloat16 min=-96.5 max=97.0
Linear output=0 dtype=torch.bfloat16 min=-2192.0 max=2800.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.4375 max=1.6171875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-2192.0 max=2800.0
Dropout input=0 dtype=torch.bfloat16 min=-2192.0 max=2800.0
Dropout output=0 dtype=torch.bfloat16 min=-2192.0 max=2800.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-161792.0 max=173056.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-163840.0 max=176128.0
T5Block input=0 dtype=torch.bfloat16 min=-161792.0 max=173056.0
T5Block output=0 dtype=torch.bfloat16 min=-163840.0 max=176128.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-163840.0 max=176128.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.4375 max=2.59375
Linear input=0 dtype=torch.bfloat16 min=-2.4375 max=2.59375
Linear output=0 dtype=torch.bfloat16 min=-0.7421875 max=0.6875
Linear input=0 dtype=torch.bfloat16 min=-2.4375 max=2.59375
Linear output=0 dtype=torch.bfloat16 min=-9.0625 max=7.96875
Linear input=0 dtype=torch.bfloat16 min=-2.4375 max=2.59375
Linear output=0 dtype=torch.bfloat16 min=-4.90625 max=5.8125
Linear input=0 dtype=torch.bfloat16 min=-4.59375 max=5.59375
Linear output=0 dtype=torch.bfloat16 min=-211.0 max=218.0
T5Attention input=0 dtype=torch.bfloat16 min=-2.4375 max=2.59375
T5Attention output=0 dtype=torch.bfloat16 min=-211.0 max=218.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-211.0 max=218.0
Dropout output=0 dtype=torch.bfloat16 min=-211.0 max=218.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-163840.0 max=176128.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-163840.0 max=176128.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-163840.0 max=176128.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.203125 max=1.6953125
Linear input=0 dtype=torch.bfloat16 min=-2.203125 max=1.6953125
Linear output=0 dtype=torch.bfloat16 min=-6.25 max=6.90625
NewGELUActivation input=0 dtype=torch.bfloat16 min=-6.25 max=6.90625
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=6.90625
Linear input=0 dtype=torch.bfloat16 min=-2.203125 max=1.6953125
Linear output=0 dtype=torch.bfloat16 min=-107.5 max=64.5
Dropout input=0 dtype=torch.bfloat16 min=-272.0 max=201.0
Dropout output=0 dtype=torch.bfloat16 min=-272.0 max=201.0
Linear input=0 dtype=torch.bfloat16 min=-272.0 max=201.0
Linear output=0 dtype=torch.bfloat16 min=-4096.0 max=5824.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.203125 max=1.6953125
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-4096.0 max=5824.0
Dropout input=0 dtype=torch.bfloat16 min=-4096.0 max=5824.0
Dropout output=0 dtype=torch.bfloat16 min=-4096.0 max=5824.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-163840.0 max=176128.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-166912.0 max=182272.0
T5Block input=0 dtype=torch.bfloat16 min=-163840.0 max=176128.0
T5Block output=0 dtype=torch.bfloat16 min=-166912.0 max=182272.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-166912.0 max=182272.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.859375 max=2.953125
Linear input=0 dtype=torch.bfloat16 min=-2.859375 max=2.953125
Linear output=0 dtype=torch.bfloat16 min=-0.78515625 max=0.8046875
Linear input=0 dtype=torch.bfloat16 min=-2.859375 max=2.953125
Linear output=0 dtype=torch.bfloat16 min=-8.0 max=7.71875
Linear input=0 dtype=torch.bfloat16 min=-2.859375 max=2.953125
Linear output=0 dtype=torch.bfloat16 min=-4.9375 max=4.96875
Linear input=0 dtype=torch.bfloat16 min=-4.875 max=4.15625
Linear output=0 dtype=torch.bfloat16 min=-274.0 max=195.0
T5Attention input=0 dtype=torch.bfloat16 min=-2.859375 max=2.953125
T5Attention output=0 dtype=torch.bfloat16 min=-274.0 max=195.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-274.0 max=195.0
Dropout output=0 dtype=torch.bfloat16 min=-274.0 max=195.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-166912.0 max=182272.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-166912.0 max=182272.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-166912.0 max=182272.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.7421875 max=1.59375
Linear input=0 dtype=torch.bfloat16 min=-1.7421875 max=1.59375
Linear output=0 dtype=torch.bfloat16 min=-6.46875 max=12.5625
NewGELUActivation input=0 dtype=torch.bfloat16 min=-6.46875 max=12.5625
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=12.5625
Linear input=0 dtype=torch.bfloat16 min=-1.7421875 max=1.59375
Linear output=0 dtype=torch.bfloat16 min=-46.5 max=38.5
Dropout input=0 dtype=torch.bfloat16 min=-254.0 max=191.0
Dropout output=0 dtype=torch.bfloat16 min=-254.0 max=191.0
Linear input=0 dtype=torch.bfloat16 min=-254.0 max=191.0
Linear output=0 dtype=torch.bfloat16 min=-7040.0 max=8896.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-1.7421875 max=1.59375
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-7040.0 max=8896.0
Dropout input=0 dtype=torch.bfloat16 min=-7040.0 max=8896.0
Dropout output=0 dtype=torch.bfloat16 min=-7040.0 max=8896.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-166912.0 max=182272.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-172032.0 max=191488.0
T5Block input=0 dtype=torch.bfloat16 min=-166912.0 max=182272.0
T5Block output=0 dtype=torch.bfloat16 min=-172032.0 max=191488.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-172032.0 max=191488.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.46875 max=2.546875
Linear input=0 dtype=torch.bfloat16 min=-2.46875 max=2.546875
Linear output=0 dtype=torch.bfloat16 min=-1.1171875 max=0.734375
Linear input=0 dtype=torch.bfloat16 min=-2.46875 max=2.546875
Linear output=0 dtype=torch.bfloat16 min=-8.75 max=11.4375
Linear input=0 dtype=torch.bfloat16 min=-2.46875 max=2.546875
Linear output=0 dtype=torch.bfloat16 min=-4.96875 max=7.0
Linear input=0 dtype=torch.bfloat16 min=-4.65625 max=6.40625
Linear output=0 dtype=torch.bfloat16 min=-462.0 max=348.0
T5Attention input=0 dtype=torch.bfloat16 min=-2.46875 max=2.546875
T5Attention output=0 dtype=torch.bfloat16 min=-462.0 max=348.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-462.0 max=348.0
Dropout output=0 dtype=torch.bfloat16 min=-462.0 max=348.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-172032.0 max=191488.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-172032.0 max=191488.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-172032.0 max=191488.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.375 max=1.8046875
Linear input=0 dtype=torch.bfloat16 min=-1.375 max=1.8046875
Linear output=0 dtype=torch.bfloat16 min=-7.1875 max=8.0
NewGELUActivation input=0 dtype=torch.bfloat16 min=-7.1875 max=8.0
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=8.0
Linear input=0 dtype=torch.bfloat16 min=-1.375 max=1.8046875
Linear output=0 dtype=torch.bfloat16 min=-39.0 max=34.75
Dropout input=0 dtype=torch.bfloat16 min=-151.0 max=89.5
Dropout output=0 dtype=torch.bfloat16 min=-151.0 max=89.5
Linear input=0 dtype=torch.bfloat16 min=-151.0 max=89.5
Linear output=0 dtype=torch.bfloat16 min=-2528.0 max=3504.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-1.375 max=1.8046875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-2528.0 max=3504.0
Dropout input=0 dtype=torch.bfloat16 min=-2528.0 max=3504.0
Dropout output=0 dtype=torch.bfloat16 min=-2528.0 max=3504.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-172032.0 max=191488.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-174080.0 max=194560.0
T5Block input=0 dtype=torch.bfloat16 min=-172032.0 max=191488.0
T5Block output=0 dtype=torch.bfloat16 min=-174080.0 max=194560.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-174080.0 max=194560.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.328125 max=2.34375
Linear input=0 dtype=torch.bfloat16 min=-2.328125 max=2.34375
Linear output=0 dtype=torch.bfloat16 min=-0.91015625 max=0.7265625
Linear input=0 dtype=torch.bfloat16 min=-2.328125 max=2.34375
Linear output=0 dtype=torch.bfloat16 min=-6.59375 max=8.625
Linear input=0 dtype=torch.bfloat16 min=-2.328125 max=2.34375
Linear output=0 dtype=torch.bfloat16 min=-8.3125 max=7.96875
Linear input=0 dtype=torch.bfloat16 min=-8.0625 max=5.65625
Linear output=0 dtype=torch.bfloat16 min=-430.0 max=420.0
T5Attention input=0 dtype=torch.bfloat16 min=-2.328125 max=2.34375
T5Attention output=0 dtype=torch.bfloat16 min=-430.0 max=420.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-430.0 max=420.0
Dropout output=0 dtype=torch.bfloat16 min=-430.0 max=420.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-174080.0 max=194560.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-174080.0 max=194560.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-174080.0 max=194560.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-1.890625 max=2.21875
Linear input=0 dtype=torch.bfloat16 min=-1.890625 max=2.21875
Linear output=0 dtype=torch.bfloat16 min=-8.75 max=7.8125
NewGELUActivation input=0 dtype=torch.bfloat16 min=-8.75 max=7.8125
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=7.8125
Linear input=0 dtype=torch.bfloat16 min=-1.890625 max=2.21875
Linear output=0 dtype=torch.bfloat16 min=-33.5 max=36.0
Dropout input=0 dtype=torch.bfloat16 min=-109.5 max=161.0
Dropout output=0 dtype=torch.bfloat16 min=-109.5 max=161.0
Linear input=0 dtype=torch.bfloat16 min=-109.5 max=161.0
Linear output=0 dtype=torch.bfloat16 min=-3136.0 max=3184.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-1.890625 max=2.21875
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-3136.0 max=3184.0
Dropout input=0 dtype=torch.bfloat16 min=-3136.0 max=3184.0
Dropout output=0 dtype=torch.bfloat16 min=-3136.0 max=3184.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-174080.0 max=194560.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-175104.0 max=197632.0
T5Block input=0 dtype=torch.bfloat16 min=-174080.0 max=194560.0
T5Block output=0 dtype=torch.bfloat16 min=-175104.0 max=197632.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-175104.0 max=197632.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.28125 max=2.234375
Linear input=0 dtype=torch.bfloat16 min=-2.28125 max=2.234375
Linear output=0 dtype=torch.bfloat16 min=-0.828125 max=1.0078125
Linear input=0 dtype=torch.bfloat16 min=-2.28125 max=2.234375
Linear output=0 dtype=torch.bfloat16 min=-8.375 max=7.25
Linear input=0 dtype=torch.bfloat16 min=-2.28125 max=2.234375
Linear output=0 dtype=torch.bfloat16 min=-12.125 max=13.6875
Linear input=0 dtype=torch.bfloat16 min=-7.59375 max=7.125
Linear output=0 dtype=torch.bfloat16 min=-198.0 max=304.0
T5Attention input=0 dtype=torch.bfloat16 min=-2.28125 max=2.234375
T5Attention output=0 dtype=torch.bfloat16 min=-198.0 max=304.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-198.0 max=304.0
Dropout output=0 dtype=torch.bfloat16 min=-198.0 max=304.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-175104.0 max=197632.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-175104.0 max=197632.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-175104.0 max=197632.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-2.703125 max=2.359375
Linear input=0 dtype=torch.bfloat16 min=-2.703125 max=2.359375
Linear output=0 dtype=torch.bfloat16 min=-7.1875 max=25.125
NewGELUActivation input=0 dtype=torch.bfloat16 min=-7.1875 max=25.125
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=25.125
Linear input=0 dtype=torch.bfloat16 min=-2.703125 max=2.359375
Linear output=0 dtype=torch.bfloat16 min=-48.25 max=35.0
Dropout input=0 dtype=torch.bfloat16 min=-127.5 max=203.0
Dropout output=0 dtype=torch.bfloat16 min=-127.5 max=203.0
Linear input=0 dtype=torch.bfloat16 min=-127.5 max=203.0
Linear output=0 dtype=torch.bfloat16 min=-2160.0 max=1968.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-2.703125 max=2.359375
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-2160.0 max=1968.0
Dropout input=0 dtype=torch.bfloat16 min=-2160.0 max=1968.0
Dropout output=0 dtype=torch.bfloat16 min=-2160.0 max=1968.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-175104.0 max=197632.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-176128.0 max=199680.0
T5Block input=0 dtype=torch.bfloat16 min=-175104.0 max=197632.0
T5Block output=0 dtype=torch.bfloat16 min=-176128.0 max=199680.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-176128.0 max=199680.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-3.71875 max=2.921875
Linear input=0 dtype=torch.bfloat16 min=-3.71875 max=2.921875
Linear output=0 dtype=torch.bfloat16 min=-0.90234375 max=0.984375
Linear input=0 dtype=torch.bfloat16 min=-3.71875 max=2.921875
Linear output=0 dtype=torch.bfloat16 min=-8.6875 max=10.5
Linear input=0 dtype=torch.bfloat16 min=-3.71875 max=2.921875
Linear output=0 dtype=torch.bfloat16 min=-14.8125 max=14.0
Linear input=0 dtype=torch.bfloat16 min=-11.0 max=9.5625
Linear output=0 dtype=torch.bfloat16 min=-280.0 max=380.0
T5Attention input=0 dtype=torch.bfloat16 min=-3.71875 max=2.921875
T5Attention output=0 dtype=torch.bfloat16 min=-280.0 max=380.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-280.0 max=380.0
Dropout output=0 dtype=torch.bfloat16 min=-280.0 max=380.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-176128.0 max=199680.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-176128.0 max=199680.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-176128.0 max=199680.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-3.625 max=2.84375
Linear input=0 dtype=torch.bfloat16 min=-3.625 max=2.84375
Linear output=0 dtype=torch.bfloat16 min=-6.9375 max=24.25
NewGELUActivation input=0 dtype=torch.bfloat16 min=-6.9375 max=24.25
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=24.25
Linear input=0 dtype=torch.bfloat16 min=-3.625 max=2.84375
Linear output=0 dtype=torch.bfloat16 min=-47.75 max=83.5
Dropout input=0 dtype=torch.bfloat16 min=-108.5 max=434.0
Dropout output=0 dtype=torch.bfloat16 min=-108.5 max=434.0
Linear input=0 dtype=torch.bfloat16 min=-108.5 max=434.0
Linear output=0 dtype=torch.bfloat16 min=-4192.0 max=3744.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-3.625 max=2.84375
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-4192.0 max=3744.0
Dropout input=0 dtype=torch.bfloat16 min=-4192.0 max=3744.0
Dropout output=0 dtype=torch.bfloat16 min=-4192.0 max=3744.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-176128.0 max=199680.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-178176.0 max=203776.0
T5Block input=0 dtype=torch.bfloat16 min=-176128.0 max=199680.0
T5Block output=0 dtype=torch.bfloat16 min=-178176.0 max=203776.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-178176.0 max=203776.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-6.125 max=4.1875
Linear input=0 dtype=torch.bfloat16 min=-6.125 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-1.8203125 max=1.015625
Linear input=0 dtype=torch.bfloat16 min=-6.125 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-10.6875 max=8.6875
Linear input=0 dtype=torch.bfloat16 min=-6.125 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-20.25 max=17.0
Linear input=0 dtype=torch.bfloat16 min=-20.25 max=13.5
Linear output=0 dtype=torch.bfloat16 min=-316.0 max=354.0
T5Attention input=0 dtype=torch.bfloat16 min=-6.125 max=4.1875
T5Attention output=0 dtype=torch.bfloat16 min=-316.0 max=354.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-316.0 max=354.0
Dropout output=0 dtype=torch.bfloat16 min=-316.0 max=354.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-178176.0 max=203776.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-178176.0 max=203776.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-178176.0 max=203776.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-5.03125 max=3.515625
Linear input=0 dtype=torch.bfloat16 min=-5.03125 max=3.515625
Linear output=0 dtype=torch.bfloat16 min=-13.125 max=27.625
NewGELUActivation input=0 dtype=torch.bfloat16 min=-13.125 max=27.625
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=27.625
Linear input=0 dtype=torch.bfloat16 min=-5.03125 max=3.515625
Linear output=0 dtype=torch.bfloat16 min=-117.5 max=211.0
Dropout input=0 dtype=torch.bfloat16 min=-900.0 max=2176.0
Dropout output=0 dtype=torch.bfloat16 min=-900.0 max=2176.0
Linear input=0 dtype=torch.bfloat16 min=-900.0 max=2176.0
Linear output=0 dtype=torch.bfloat16 min=-5728.0 max=5632.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-5.03125 max=3.515625
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-5728.0 max=5632.0
Dropout input=0 dtype=torch.bfloat16 min=-5728.0 max=5632.0
Dropout output=0 dtype=torch.bfloat16 min=-5728.0 max=5632.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-178176.0 max=203776.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-180224.0 max=208896.0
T5Block input=0 dtype=torch.bfloat16 min=-178176.0 max=203776.0
T5Block output=0 dtype=torch.bfloat16 min=-180224.0 max=208896.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-180224.0 max=208896.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-6.75 max=4.03125
Linear input=0 dtype=torch.bfloat16 min=-6.75 max=4.03125
Linear output=0 dtype=torch.bfloat16 min=-1.1640625 max=0.91796875
Linear input=0 dtype=torch.bfloat16 min=-6.75 max=4.03125
Linear output=0 dtype=torch.bfloat16 min=-11.0625 max=8.875
Linear input=0 dtype=torch.bfloat16 min=-6.75 max=4.03125
Linear output=0 dtype=torch.bfloat16 min=-27.625 max=25.5
Linear input=0 dtype=torch.bfloat16 min=-27.625 max=25.25
Linear output=0 dtype=torch.bfloat16 min=-458.0 max=644.0
T5Attention input=0 dtype=torch.bfloat16 min=-6.75 max=4.03125
T5Attention output=0 dtype=torch.bfloat16 min=-458.0 max=644.0
T5Attention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-458.0 max=644.0
Dropout output=0 dtype=torch.bfloat16 min=-458.0 max=644.0
T5LayerSelfAttention input=0 dtype=torch.bfloat16 min=-180224.0 max=208896.0
T5LayerSelfAttention output=0 dtype=torch.bfloat16 min=-180224.0 max=208896.0
T5LayerSelfAttention output=2 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-180224.0 max=208896.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-7.0625 max=9.375
Linear input=0 dtype=torch.bfloat16 min=-7.0625 max=9.375
Linear output=0 dtype=torch.bfloat16 min=-16.125 max=123.5
NewGELUActivation input=0 dtype=torch.bfloat16 min=-16.125 max=123.5
NewGELUActivation output=0 dtype=torch.bfloat16 min=-0.1708984375 max=123.5
Linear input=0 dtype=torch.bfloat16 min=-7.0625 max=9.375
Linear output=0 dtype=torch.bfloat16 min=-116.0 max=138.0
Dropout input=0 dtype=torch.bfloat16 min=-2048.0 max=3776.0
Dropout output=0 dtype=torch.bfloat16 min=-2048.0 max=3776.0
Linear input=0 dtype=torch.bfloat16 min=-2048.0 max=3776.0
Linear output=0 dtype=torch.bfloat16 min=-46336.0 max=45824.0
T5DenseGatedActDense input=0 dtype=torch.bfloat16 min=-7.0625 max=9.375
T5DenseGatedActDense output=0 dtype=torch.bfloat16 min=-46336.0 max=45824.0
Dropout input=0 dtype=torch.bfloat16 min=-46336.0 max=45824.0
Dropout output=0 dtype=torch.bfloat16 min=-46336.0 max=45824.0
T5LayerFF input=0 dtype=torch.bfloat16 min=-180224.0 max=208896.0
T5LayerFF output=0 dtype=torch.bfloat16 min=-184320.0 max=212992.0
T5Block input=0 dtype=torch.bfloat16 min=-180224.0 max=208896.0
T5Block output=0 dtype=torch.bfloat16 min=-184320.0 max=212992.0
T5Block output=1 dtype=torch.bfloat16 min=-47.25 max=11.1875
T5LayerNorm input=0 dtype=torch.bfloat16 min=-184320.0 max=212992.0
T5LayerNorm output=0 dtype=torch.bfloat16 min=-6.5 max=2.3125
Dropout input=0 dtype=torch.bfloat16 min=-6.5 max=2.3125
Dropout output=0 dtype=torch.bfloat16 min=-6.5 max=2.3125
T5EncoderModel input=0 dtype=torch.int64 min=0 max=1
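
Note: the T5 encoder trace ends here. Across the blocks the residual stream (T5Block output=0) grows monotonically to about -184320/+212992. That is still nowhere near the bfloat16 finite limit (~3.4e38), but the spacing between adjacent representable bfloat16 values at that magnitude is 1024, so small residual contributions get rounded away. Lines in this format are what a per-module forward hook produces; a minimal sketch of such a logger follows (assumption: the actual script behind this gist is not shown, so the registration details are illustrative). Skipping non-tensor entries is also why e.g. T5Attention reports output=0 and output=2 but no output=1.

import torch

def log_activations(module, inputs, outputs):
    # Print one line per tensor, matching the format of this trace:
    # "<ClassName> input=<i> dtype=<dtype> min=<min> max=<max>"
    outs = outputs if isinstance(outputs, tuple) else (outputs,)
    for tag, tensors in (("input", inputs), ("output", outs)):
        for i, t in enumerate(tensors):
            if torch.is_tensor(t):
                print(f"{module.__class__.__name__} {tag}={i} "
                      f"dtype={t.dtype} min={t.min().item()} max={t.max().item()}")

# Illustrative registration on every submodule of each pipeline component:
# for component in (pipe.text_encoder, pipe.text_encoder_2,
#                   pipe.text_encoder_3, pipe.transformer, pipe.vae):
#     for m in component.modules():
#         m.register_forward_hook(log_activations)
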
0%|          | 0/2 [00:00<?, ?it/s]
Conv2d input=0 dtype=torch.bfloat16 min=-3.328125 max=3.328125
Conv2d output=0 dtype=torch.bfloat16 min=-8.5 max=5.5625
Conv2d output=1 dtype=torch.bfloat16 min=-8.5 max=5.5625
PatchEmbed input=0 dtype=torch.bfloat16 min=-3.328125 max=3.328125
PatchEmbed output=0 dtype=torch.bfloat16 min=-8.1875 max=6.34375
PatchEmbed output=1 dtype=torch.bfloat16 min=-8.1875 max=6.34375
Timesteps input=0 dtype=torch.bfloat16 min=1000.0 max=1000.0
Timesteps output=0 dtype=torch.float32 min=-0.9999996423721313 max=0.9997203946113586
Timesteps output=1 dtype=torch.float32 min=-0.9999996423721313 max=0.9997203946113586
Linear input=0 dtype=torch.bfloat16 min=-1.0 max=1.0
Linear output=0 dtype=torch.bfloat16 min=-8.75 max=2.234375
Linear output=1 dtype=torch.bfloat16 min=-8.75 max=2.234375
SiLU input=0 dtype=torch.bfloat16 min=-8.75 max=2.234375
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=2.015625
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=2.015625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=2.015625
Linear output=0 dtype=torch.bfloat16 min=-9.0 max=3.9375
Linear output=1 dtype=torch.bfloat16 min=-9.0 max=3.9375
TimestepEmbedding input=0 dtype=torch.bfloat16 min=-1.0 max=1.0
TimestepEmbedding output=0 dtype=torch.bfloat16 min=-9.0 max=3.9375
TimestepEmbedding output=1 dtype=torch.bfloat16 min=-9.0 max=3.9375
Linear input=0 dtype=torch.bfloat16 min=-5.34375 max=7.40625
Linear output=0 dtype=torch.bfloat16 min=-40.5 max=15.8125
Linear output=1 dtype=torch.bfloat16 min=-33.75 max=15.9375
SiLU input=0 dtype=torch.bfloat16 min=-40.5 max=15.9375
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=15.8125
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=15.9375
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=15.9375
Linear output=0 dtype=torch.bfloat16 min=-19.375 max=1.2265625
Linear output=1 dtype=torch.bfloat16 min=-20.125 max=1.359375
PixArtAlphaTextProjection input=0 dtype=torch.bfloat16 min=-5.34375 max=7.40625
PixArtAlphaTextProjection output=0 dtype=torch.bfloat16 min=-19.375 max=1.2265625
PixArtAlphaTextProjection output=1 dtype=torch.bfloat16 min=-20.125 max=1.359375
CombinedTimestepTextProjEmbeddings input=0 dtype=torch.bfloat16 min=1000.0 max=1000.0
CombinedTimestepTextProjEmbeddings input=1 dtype=torch.bfloat16 min=-5.34375 max=7.40625
CombinedTimestepTextProjEmbeddings output=0 dtype=torch.bfloat16 min=-20.5 max=4.71875
CombinedTimestepTextProjEmbeddings output=1 dtype=torch.bfloat16 min=-21.25 max=4.65625
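
Note: in the trace above, Timesteps maps the scalar t=1000 to a float32 sinusoidal vector, which is why its output is bounded inside [-1, 1] regardless of the bfloat16 tensors elsewhere; TimestepEmbedding (Linear-SiLU-Linear) and PixArtAlphaTextProjection then project the sinusoid and the pooled text embedding to the conditioning width, and CombinedTimestepTextProjEmbeddings sums the two (consistent with the min/max ranges above). A sketch of the sinusoidal part (dim and max_period are assumed values, not read from the model config):

import math
import torch

def sinusoidal_timestep_embedding(t: torch.Tensor, dim: int = 256,
                                  max_period: float = 10000.0) -> torch.Tensor:
    # cos/sin at geometrically spaced frequencies; every entry is in [-1, 1].
    half = dim // 2
    freqs = torch.exp(-math.log(max_period)
                      * torch.arange(half, dtype=torch.float32) / half)
    args = t.float()[:, None] * freqs[None, :]
    return torch.cat([torch.cos(args), torch.sin(args)], dim=-1)

emb = sinusoidal_timestep_embedding(torch.tensor([1000.0]))
print(emb.min().item(), emb.max().item())  # stays within [-1, 1]
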
Linear input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
Linear output=0 dtype=torch.bfloat16 min=-812.0 max=612.0
Linear output=1 dtype=torch.bfloat16 min=-812.0 max=612.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-1.8359375 max=3.71875
Linear output=1 dtype=torch.bfloat16 min=-1.8125 max=3.75
LayerNorm input=0 dtype=torch.bfloat16 min=-8.1875 max=6.34375
LayerNorm output=0 dtype=torch.bfloat16 min=-12.3125 max=8.25
LayerNorm output=1 dtype=torch.bfloat16 min=-12.3125 max=8.25
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-8.1875 max=6.34375
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-6.375 max=5.5
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-1.09375 max=3.75
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.49609375 max=0.9375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.5625 max=1.5
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-1.734375 max=2.5
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-8.0625 max=7.125
Linear output=1 dtype=torch.bfloat16 min=-8.125 max=7.15625
LayerNorm input=0 dtype=torch.bfloat16 min=-812.0 max=612.0
LayerNorm output=0 dtype=torch.bfloat16 min=-23.0 max=16.0
LayerNorm output=1 dtype=torch.bfloat16 min=-23.0 max=16.0
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-812.0 max=612.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-3.75 max=4.53125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-1.15625 max=1.2421875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-8.125 max=7.15625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.046875 max=0.17578125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-1.890625 max=2.484375
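
Note: the five AdaLayerNormZero outputs above match the adaLN-Zero scheme used in MM-DiT-style blocks: the conditioning embedding goes through SiLU and a Linear into six chunks (shift/scale/gate for attention and for the MLP); the module returns the modulated hidden states plus four of those chunks for use later in the block. A sketch of that modulation (dimension handling is illustrative, not the library source):

import torch
import torch.nn as nn

class AdaLayerNormZeroSketch(nn.Module):
    # adaLN-Zero: temb -> SiLU -> Linear -> 6 chunks; modulate the normed input.
    def __init__(self, dim: int):
        super().__init__()
        self.silu = nn.SiLU()
        self.linear = nn.Linear(dim, 6 * dim)
        self.norm = nn.LayerNorm(dim, elementwise_affine=False, eps=1e-6)

    def forward(self, x, temb):
        shift_msa, scale_msa, gate_msa, shift_mlp, scale_mlp, gate_mlp = (
            self.linear(self.silu(temb)).chunk(6, dim=-1))
        x = self.norm(x) * (1 + scale_msa[:, None]) + shift_msa[:, None]
        return x, gate_msa, shift_mlp, scale_mlp, gate_mlp
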
Linear input=0 dtype=torch.bfloat16 min=-6.375 max=5.5
Linear output=0 dtype=torch.bfloat16 min=-10.0625 max=9.25
Linear output=1 dtype=torch.bfloat16 min=-9.875 max=9.1875
Linear input=0 dtype=torch.bfloat16 min=-6.375 max=5.5
Linear output=0 dtype=torch.bfloat16 min=-7.8125 max=6.65625
Linear output=1 dtype=torch.bfloat16 min=-7.53125 max=6.65625
Linear input=0 dtype=torch.bfloat16 min=-6.375 max=5.5
Linear output=0 dtype=torch.bfloat16 min=-7.96875 max=8.6875
Linear output=1 dtype=torch.bfloat16 min=-8.0625 max=8.75
Linear input=0 dtype=torch.bfloat16 min=-3.75 max=4.53125
Linear output=0 dtype=torch.bfloat16 min=-4.53125 max=4.34375
Linear output=1 dtype=torch.bfloat16 min=-5.46875 max=5.53125
Linear input=0 dtype=torch.bfloat16 min=-3.75 max=4.53125
Linear output=0 dtype=torch.bfloat16 min=-4.625 max=5.25
Linear output=1 dtype=torch.bfloat16 min=-6.71875 max=6.8125
Linear input=0 dtype=torch.bfloat16 min=-3.75 max=4.53125
Linear output=0 dtype=torch.bfloat16 min=-4.59375 max=4.25
Linear output=1 dtype=torch.bfloat16 min=-4.40625 max=6.90625
Linear input=0 dtype=torch.bfloat16 min=-3.0 max=2.75
Linear output=0 dtype=torch.bfloat16 min=-11.375 max=5.34375
Linear output=1 dtype=torch.bfloat16 min=-11.125 max=5.3125
Dropout input=0 dtype=torch.bfloat16 min=-11.375 max=5.34375
Dropout output=0 dtype=torch.bfloat16 min=-11.375 max=5.34375
Dropout output=1 dtype=torch.bfloat16 min=-11.125 max=5.3125
Linear input=0 dtype=torch.bfloat16 min=-4.40625 max=6.75
Linear output=0 dtype=torch.bfloat16 min=-5.6875 max=7.46875
Linear output=1 dtype=torch.bfloat16 min=-8.8125 max=10.5625
Attention output=0 dtype=torch.bfloat16 min=-11.375 max=5.34375
Attention output=1 dtype=torch.bfloat16 min=-8.8125 max=10.5625
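
Note: each JointTransformerBlock runs a single attention over both token streams: the three Linears fed from the image-stream norm and the three fed from the text-stream norm appear to be separate q/k/v projections, the two sequences are concatenated, attended jointly, and split back, which would explain why Attention reports two outputs. A sketch of that pattern (head count and shapes are assumptions for illustration):

import torch
import torch.nn.functional as F

def joint_attention(img_qkv, txt_qkv, heads: int = 24):
    # img_qkv / txt_qkv: tuples of (q, k, v), each of shape (B, L, heads*head_dim).
    q = torch.cat([img_qkv[0], txt_qkv[0]], dim=1)  # concatenate along sequence
    k = torch.cat([img_qkv[1], txt_qkv[1]], dim=1)
    v = torch.cat([img_qkv[2], txt_qkv[2]], dim=1)

    def split_heads(x):  # (B, L, H*D) -> (B, H, L, D)
        b, l, _ = x.shape
        return x.view(b, l, heads, -1).transpose(1, 2)

    out = F.scaled_dot_product_attention(split_heads(q), split_heads(k), split_heads(v))
    out = out.transpose(1, 2).reshape(q.shape)       # back to (B, L, H*D)
    n_img = img_qkv[0].shape[1]
    return out[:, :n_img], out[:, n_img:]            # image tokens, text tokens
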
LayerNorm input=0 dtype=torch.bfloat16 min=-44.25 max=9.125
LayerNorm output=0 dtype=torch.bfloat16 min=-33.75 max=7.125
LayerNorm output=1 dtype=torch.bfloat16 min=-33.5 max=7.125
Linear input=0 dtype=torch.bfloat16 min=-3.125 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-9.6875 max=7.375
Linear output=1 dtype=torch.bfloat16 min=-9.75 max=7.4375
GELU input=0 dtype=torch.bfloat16 min=-3.125 max=4.1875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=7.375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=7.4375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=7.4375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=7.375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=7.4375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=7.4375
Linear output=0 dtype=torch.bfloat16 min=-23.75 max=25.25
Linear output=1 dtype=torch.bfloat16 min=-23.875 max=25.375
FeedForward input=0 dtype=torch.bfloat16 min=-3.125 max=4.1875
FeedForward output=0 dtype=torch.bfloat16 min=-23.75 max=25.25
FeedForward output=1 dtype=torch.bfloat16 min=-23.875 max=25.375
LayerNorm input=0 dtype=torch.bfloat16 min=-812.0 max=612.0
LayerNorm output=0 dtype=torch.bfloat16 min=-23.25 max=16.125
LayerNorm output=1 dtype=torch.bfloat16 min=-23.5 max=17.25
Linear input=0 dtype=torch.bfloat16 min=-9.75 max=9.0
Linear output=0 dtype=torch.bfloat16 min=-15.8125 max=25.75
Linear output=1 dtype=torch.bfloat16 min=-14.375 max=30.25
GELU input=0 dtype=torch.bfloat16 min=-9.75 max=9.0
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=25.75
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=30.25
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=30.25
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=25.75
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=30.25
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=30.25
Linear output=0 dtype=torch.bfloat16 min=-29.875 max=34.25
Linear output=1 dtype=torch.bfloat16 min=-37.25 max=33.75
FeedForward input=0 dtype=torch.bfloat16 min=-9.75 max=9.0
FeedForward output=0 dtype=torch.bfloat16 min=-29.875 max=34.25
FeedForward output=1 dtype=torch.bfloat16 min=-37.25 max=33.75
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-8.1875 max=6.34375
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-812.0 max=612.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-21.25 max=4.71875
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-840.0 max=612.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-41.5 max=25.25
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-2.625 max=3.53125
Linear output=1 dtype=torch.bfloat16 min=-2.625 max=3.5625
LayerNorm input=0 dtype=torch.bfloat16 min=-41.5 max=25.25
LayerNorm output=0 dtype=torch.bfloat16 min=-23.75 max=13.5625
LayerNorm output=1 dtype=torch.bfloat16 min=-23.5 max=13.5625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-41.5 max=25.25
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-14.1875 max=10.3125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-1.078125 max=2.28125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.5234375 max=1.1953125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.046875 max=2.78125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-0.5078125 max=2.875
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-3.0 max=4.3125
Linear output=1 dtype=torch.bfloat16 min=-3.015625 max=4.28125
LayerNorm input=0 dtype=torch.bfloat16 min=-840.0 max=612.0
LayerNorm output=0 dtype=torch.bfloat16 min=-31.0 max=15.875
LayerNorm output=1 dtype=torch.bfloat16 min=-32.75 max=15.875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-840.0 max=612.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-3.734375 max=5.4375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.765625 max=2.65625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.40625 max=3.328125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.15625 max=4.3125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-3.015625 max=1.140625
Linear input=0 dtype=torch.bfloat16 min=-14.1875 max=10.3125
Linear output=0 dtype=torch.bfloat16 min=-23.5 max=21.875
Linear output=1 dtype=torch.bfloat16 min=-23.75 max=21.625
Linear input=0 dtype=torch.bfloat16 min=-14.1875 max=10.3125
Linear output=0 dtype=torch.bfloat16 min=-11.625 max=13.125
Linear output=1 dtype=torch.bfloat16 min=-11.5625 max=13.125
Linear input=0 dtype=torch.bfloat16 min=-14.1875 max=10.3125
Linear output=0 dtype=torch.bfloat16 min=-12.3125 max=14.0625
Linear output=1 dtype=torch.bfloat16 min=-12.3125 max=14.0
Linear input=0 dtype=torch.bfloat16 min=-3.734375 max=5.4375
Linear output=0 dtype=torch.bfloat16 min=-3.453125 max=4.71875
Linear output=1 dtype=torch.bfloat16 min=-4.0 max=4.34375
Linear input=0 dtype=torch.bfloat16 min=-3.734375 max=5.4375
Linear output=0 dtype=torch.bfloat16 min=-5.3125 max=5.25
Linear output=1 dtype=torch.bfloat16 min=-5.3125 max=5.28125
Linear input=0 dtype=torch.bfloat16 min=-3.734375 max=5.4375
Linear output=0 dtype=torch.bfloat16 min=-5.09375 max=4.875
Linear output=1 dtype=torch.bfloat16 min=-3.953125 max=3.796875
Linear input=0 dtype=torch.bfloat16 min=-5.28125 max=7.75
Linear output=0 dtype=torch.bfloat16 min=-14.5 max=13.3125
Linear output=1 dtype=torch.bfloat16 min=-14.4375 max=13.1875
Dropout input=0 dtype=torch.bfloat16 min=-14.5 max=13.3125
Dropout output=0 dtype=torch.bfloat16 min=-14.5 max=13.3125
Dropout output=1 dtype=torch.bfloat16 min=-14.4375 max=13.1875
Linear input=0 dtype=torch.bfloat16 min=-3.59375 max=5.875
Linear output=0 dtype=torch.bfloat16 min=-19.125 max=15.0625
Linear output=1 dtype=torch.bfloat16 min=-19.25 max=15.3125
Attention output=0 dtype=torch.bfloat16 min=-14.5 max=13.3125
Attention output=1 dtype=torch.bfloat16 min=-19.25 max=15.3125
LayerNorm input=0 dtype=torch.bfloat16 min=-62.75 max=25.375
LayerNorm output=0 dtype=torch.bfloat16 min=-29.0 max=12.5625
LayerNorm output=1 dtype=torch.bfloat16 min=-28.875 max=12.5625
Linear input=0 dtype=torch.bfloat16 min=-7.5 max=7.84375
Linear output=0 dtype=torch.bfloat16 min=-12.625 max=10.3125
Linear output=1 dtype=torch.bfloat16 min=-12.4375 max=10.5625
GELU input=0 dtype=torch.bfloat16 min=-7.5 max=7.84375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=10.3125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.5625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.5625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=10.3125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.5625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.5625
Linear output=0 dtype=torch.bfloat16 min=-47.25 max=64.5
Linear output=1 dtype=torch.bfloat16 min=-47.5 max=64.5
FeedForward input=0 dtype=torch.bfloat16 min=-7.5 max=7.84375
FeedForward output=0 dtype=torch.bfloat16 min=-47.25 max=64.5
FeedForward output=1 dtype=torch.bfloat16 min=-47.5 max=64.5
LayerNorm input=0 dtype=torch.bfloat16 min=-840.0 max=612.0
LayerNorm output=0 dtype=torch.bfloat16 min=-26.5 max=15.875
LayerNorm output=1 dtype=torch.bfloat16 min=-27.75 max=15.875
Linear input=0 dtype=torch.bfloat16 min=-21.5 max=12.6875
Linear output=0 dtype=torch.bfloat16 min=-27.75 max=27.5
Linear output=1 dtype=torch.bfloat16 min=-25.125 max=25.75
GELU input=0 dtype=torch.bfloat16 min=-21.5 max=12.6875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=27.5
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=25.75
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=27.5
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=27.5
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=25.75
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=27.5
Linear output=0 dtype=torch.bfloat16 min=-206.0 max=498.0
Linear output=1 dtype=torch.bfloat16 min=-208.0 max=500.0
FeedForward input=0 dtype=torch.bfloat16 min=-21.5 max=12.6875
FeedForward output=0 dtype=torch.bfloat16 min=-206.0 max=498.0
FeedForward output=1 dtype=torch.bfloat16 min=-208.0 max=500.0
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-41.5 max=25.25
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-840.0 max=612.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-21.25 max=4.71875
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-2336.0 max=600.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-81.0 max=27.875
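
A complete JointTransformerBlock trace closes just above. These records come from per-module min/max instrumentation, but the hook script itself is not included in this dump, so the following is only a minimal sketch of one way to reproduce records of this shape. It assumes forward hooks registered on every submodule, and it assumes the output=0/output=1 pairs printed for single-tensor modules (Linear, LayerNorm, SiLU, GELU, Dropout) are the two classifier-free-guidance batch entries, while tuple-returning modules (Attention, AdaLayerNormZero, JointTransformerBlock) are logged per tuple element. All names below are illustrative, not the original script.

import torch

def _log(name, kind, idx, t):
    # One record per tensor, in the shape seen throughout this trace:
    # "<Module> input=0 dtype=torch.bfloat16 min=... max=..."
    print(f"{name} {kind}={idx} dtype={t.dtype} "
          f"min={t.min().item()} max={t.max().item()}")

def stats_hook(module, args, output):
    name = type(module).__name__
    for i, a in enumerate(args):
        if torch.is_tensor(a):
            _log(name, "input", i, a)
    if torch.is_tensor(output):
        # Assumption: a plain tensor output is split along the batch dim,
        # so a single Linear call prints output=0 and output=1 with slightly
        # different ranges (unconditional vs. conditional CFG branch).
        outs = output.chunk(output.shape[0], dim=0) if output.ndim > 1 else (output,)
    else:
        outs = tuple(t for t in output if torch.is_tensor(t))
    for i, t in enumerate(outs):
        _log(name, "output", i, t)

# Hypothetical wiring against the pipeline under test:
# for sub in pipe.transformer.modules():
#     sub.register_forward_hook(stats_hook)

Under this scheme a batch-of-one call would print only output=0, and hooks fire innermost-first (a submodule's forward returns before its parent's), which matches the ordering of the rows here.
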
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-3.171875 max=3.1875
Linear output=1 dtype=torch.bfloat16 min=-3.203125 max=3.21875
LayerNorm input=0 dtype=torch.bfloat16 min=-81.0 max=27.875
LayerNorm output=0 dtype=torch.bfloat16 min=-31.375 max=12.5
LayerNorm output=1 dtype=torch.bfloat16 min=-31.25 max=12.4375
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-81.0 max=27.875
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-6.4375 max=6.90625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-3.203125 max=1.0390625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.0 max=2.609375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.4375 max=2.84375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-0.625 max=1.3359375
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-6.71875 max=4.3125
Linear output=1 dtype=torch.bfloat16 min=-6.71875 max=4.3125
LayerNorm input=0 dtype=torch.bfloat16 min=-2336.0 max=600.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.5 max=10.1875
LayerNorm output=1 dtype=torch.bfloat16 min=-33.5 max=8.8125
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-2336.0 max=600.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-4.375 max=4.0625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.453125 max=1.859375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-6.71875 max=2.984375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.609375 max=1.9609375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-5.4375 max=4.3125
Linear input=0 dtype=torch.bfloat16 min=-6.4375 max=6.90625
Linear output=0 dtype=torch.bfloat16 min=-7.96875 max=8.1875
Linear output=1 dtype=torch.bfloat16 min=-7.9375 max=8.1875
Linear input=0 dtype=torch.bfloat16 min=-6.4375 max=6.90625
Linear output=0 dtype=torch.bfloat16 min=-6.875 max=7.28125
Linear output=1 dtype=torch.bfloat16 min=-7.0 max=7.15625
Linear input=0 dtype=torch.bfloat16 min=-6.4375 max=6.90625
Linear output=0 dtype=torch.bfloat16 min=-7.625 max=8.0625
Linear output=1 dtype=torch.bfloat16 min=-7.6875 max=8.0
Linear input=0 dtype=torch.bfloat16 min=-4.375 max=4.0625
Linear output=0 dtype=torch.bfloat16 min=-4.375 max=4.21875
Linear output=1 dtype=torch.bfloat16 min=-5.03125 max=5.21875
Linear input=0 dtype=torch.bfloat16 min=-4.375 max=4.0625
Linear output=0 dtype=torch.bfloat16 min=-4.1875 max=5.625
Linear output=1 dtype=torch.bfloat16 min=-5.84375 max=5.53125
Linear input=0 dtype=torch.bfloat16 min=-4.375 max=4.0625
Linear output=0 dtype=torch.bfloat16 min=-3.453125 max=3.625
Linear output=1 dtype=torch.bfloat16 min=-4.59375 max=4.375
Linear input=0 dtype=torch.bfloat16 min=-5.6875 max=4.59375
Linear output=0 dtype=torch.bfloat16 min=-8.9375 max=8.3125
Linear output=1 dtype=torch.bfloat16 min=-9.375 max=8.25
Dropout input=0 dtype=torch.bfloat16 min=-9.375 max=8.3125
Dropout output=0 dtype=torch.bfloat16 min=-8.9375 max=8.3125
Dropout output=1 dtype=torch.bfloat16 min=-9.375 max=8.25
Linear input=0 dtype=torch.bfloat16 min=-5.3125 max=4.34375
Linear output=0 dtype=torch.bfloat16 min=-12.5 max=10.3125
Linear output=1 dtype=torch.bfloat16 min=-12.25 max=10.25
Attention output=0 dtype=torch.bfloat16 min=-9.375 max=8.3125
Attention output=1 dtype=torch.bfloat16 min=-12.5 max=10.3125
LayerNorm input=0 dtype=torch.bfloat16 min=-92.0 max=27.875
LayerNorm output=0 dtype=torch.bfloat16 min=-32.25 max=11.875
LayerNorm output=1 dtype=torch.bfloat16 min=-32.5 max=11.8125
Linear input=0 dtype=torch.bfloat16 min=-7.59375 max=6.78125
Linear output=0 dtype=torch.bfloat16 min=-11.6875 max=10.125
Linear output=1 dtype=torch.bfloat16 min=-11.6875 max=10.1875
GELU input=0 dtype=torch.bfloat16 min=-7.59375 max=6.78125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=10.125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.1875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.1875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=10.125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.1875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.1875
Linear output=0 dtype=torch.bfloat16 min=-41.5 max=31.25
Linear output=1 dtype=torch.bfloat16 min=-40.5 max=32.0
FeedForward input=0 dtype=torch.bfloat16 min=-7.59375 max=6.78125
FeedForward output=0 dtype=torch.bfloat16 min=-41.5 max=31.25
FeedForward output=1 dtype=torch.bfloat16 min=-40.5 max=32.0
LayerNorm input=0 dtype=torch.bfloat16 min=-2352.0 max=592.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.75 max=10.0
LayerNorm output=1 dtype=torch.bfloat16 min=-33.5 max=8.5625
Linear input=0 dtype=torch.bfloat16 min=-21.0 max=28.75
Linear output=0 dtype=torch.bfloat16 min=-23.875 max=15.25
Linear output=1 dtype=torch.bfloat16 min=-21.75 max=15.5625
GELU input=0 dtype=torch.bfloat16 min=-21.0 max=28.75
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=15.25
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=15.5625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=15.5625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=15.25
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=15.5625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=15.5625
Linear output=0 dtype=torch.bfloat16 min=-147.0 max=219.0
Linear output=1 dtype=torch.bfloat16 min=-150.0 max=222.0
FeedForward input=0 dtype=torch.bfloat16 min=-21.0 max=28.75
FeedForward output=0 dtype=torch.bfloat16 min=-147.0 max=219.0
FeedForward output=1 dtype=torch.bfloat16 min=-150.0 max=222.0
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-81.0 max=27.875
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-2336.0 max=600.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-21.25 max=4.71875
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-3552.0 max=656.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-56.5 max=27.75
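
The SiLU rows with input=0 min=-21.25 max=4.71875 repeat verbatim at the head of every block: each JointTransformerBlock's two AdaLayerNormZero layers re-apply SiLU to the same shared conditioning embedding (the tensor arriving as the block's input=2), so only the per-block Linear projections that follow them produce different ranges.
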
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-2.59375 max=3.40625
Linear output=1 dtype=torch.bfloat16 min=-2.625 max=3.375
LayerNorm input=0 dtype=torch.bfloat16 min=-56.5 max=27.75
LayerNorm output=0 dtype=torch.bfloat16 min=-25.0 max=11.875
LayerNorm output=1 dtype=torch.bfloat16 min=-24.625 max=11.9375
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-56.5 max=27.75
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-10.5 max=9.5
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-0.65234375 max=3.0625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.5703125 max=2.359375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.3125 max=1.9609375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-0.6171875 max=3.40625
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-4.90625 max=4.0
Linear output=1 dtype=torch.bfloat16 min=-4.9375 max=3.984375
LayerNorm input=0 dtype=torch.bfloat16 min=-3552.0 max=656.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.0 max=9.0
LayerNorm output=1 dtype=torch.bfloat16 min=-35.0 max=8.625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-3552.0 max=656.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-6.0625 max=9.0
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.65625 max=3.671875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-4.9375 max=2.59375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.390625 max=1.875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-2.46875 max=4.0
Linear input=0 dtype=torch.bfloat16 min=-10.5 max=9.5
Linear output=0 dtype=torch.bfloat16 min=-25.25 max=19.0
Linear output=1 dtype=torch.bfloat16 min=-24.875 max=18.875
Linear input=0 dtype=torch.bfloat16 min=-10.5 max=9.5
Linear output=0 dtype=torch.bfloat16 min=-11.25 max=11.375
Linear output=1 dtype=torch.bfloat16 min=-11.125 max=11.5
Linear input=0 dtype=torch.bfloat16 min=-10.5 max=9.5
Linear output=0 dtype=torch.bfloat16 min=-11.1875 max=8.125
Linear output=1 dtype=torch.bfloat16 min=-11.1875 max=8.25
Linear input=0 dtype=torch.bfloat16 min=-6.0625 max=9.0
Linear output=0 dtype=torch.bfloat16 min=-4.21875 max=4.0
Linear output=1 dtype=torch.bfloat16 min=-4.125 max=4.0625
Linear input=0 dtype=torch.bfloat16 min=-6.0625 max=9.0
Linear output=0 dtype=torch.bfloat16 min=-5.46875 max=6.71875
Linear output=1 dtype=torch.bfloat16 min=-5.53125 max=6.71875
Linear input=0 dtype=torch.bfloat16 min=-6.0625 max=9.0
Linear output=0 dtype=torch.bfloat16 min=-3.9375 max=4.28125
Linear output=1 dtype=torch.bfloat16 min=-3.359375 max=3.953125
Linear input=0 dtype=torch.bfloat16 min=-5.375 max=5.8125
Linear output=0 dtype=torch.bfloat16 min=-14.5 max=10.0
Linear output=1 dtype=torch.bfloat16 min=-14.4375 max=10.1875
Dropout input=0 dtype=torch.bfloat16 min=-14.5 max=10.1875
Dropout output=0 dtype=torch.bfloat16 min=-14.5 max=10.0
Dropout output=1 dtype=torch.bfloat16 min=-14.4375 max=10.1875
Linear input=0 dtype=torch.bfloat16 min=-7.03125 max=5.0625
Linear output=0 dtype=torch.bfloat16 min=-13.75 max=13.4375
Linear output=1 dtype=torch.bfloat16 min=-14.0625 max=13.8125
Attention output=0 dtype=torch.bfloat16 min=-14.5 max=10.1875
Attention output=1 dtype=torch.bfloat16 min=-14.0625 max=13.8125
LayerNorm input=0 dtype=torch.bfloat16 min=-95.0 max=27.5
LayerNorm output=0 dtype=torch.bfloat16 min=-32.0 max=10.375
LayerNorm output=1 dtype=torch.bfloat16 min=-31.75 max=10.375
Linear input=0 dtype=torch.bfloat16 min=-5.15625 max=4.28125
Linear output=0 dtype=torch.bfloat16 min=-9.5 max=7.125
Linear output=1 dtype=torch.bfloat16 min=-9.125 max=7.0625
GELU input=0 dtype=torch.bfloat16 min=-5.15625 max=4.28125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=7.125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=7.0625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=7.125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=7.125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=7.0625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=7.125
Linear output=0 dtype=torch.bfloat16 min=-12.0 max=18.625
Linear output=1 dtype=torch.bfloat16 min=-12.5625 max=20.25
FeedForward input=0 dtype=torch.bfloat16 min=-5.15625 max=4.28125
FeedForward output=0 dtype=torch.bfloat16 min=-12.0 max=18.625
FeedForward output=1 dtype=torch.bfloat16 min=-12.5625 max=20.25
LayerNorm input=0 dtype=torch.bfloat16 min=-3568.0 max=644.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.25 max=8.5
LayerNorm output=1 dtype=torch.bfloat16 min=-35.0 max=9.5
Linear input=0 dtype=torch.bfloat16 min=-17.0 max=18.0
Linear output=0 dtype=torch.bfloat16 min=-9.625 max=8.875
Linear output=1 dtype=torch.bfloat16 min=-11.3125 max=8.875
GELU input=0 dtype=torch.bfloat16 min=-17.0 max=18.0
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=8.875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=8.875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=8.875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=8.875
Linear output=0 dtype=torch.bfloat16 min=-58.5 max=32.25
Linear output=1 dtype=torch.bfloat16 min=-57.5 max=26.75
FeedForward input=0 dtype=torch.bfloat16 min=-17.0 max=18.0
FeedForward output=0 dtype=torch.bfloat16 min=-58.5 max=32.25
FeedForward output=1 dtype=torch.bfloat16 min=-57.5 max=26.75
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-56.5 max=27.75
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-3552.0 max=656.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-21.25 max=4.71875
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-3808.0 max=640.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-86.5 max=27.5
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-1.625 max=4.0
Linear output=1 dtype=torch.bfloat16 min=-1.640625 max=3.96875
LayerNorm input=0 dtype=torch.bfloat16 min=-86.5 max=27.5
LayerNorm output=0 dtype=torch.bfloat16 min=-30.875 max=10.0
LayerNorm output=1 dtype=torch.bfloat16 min=-31.0 max=9.8125
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-86.5 max=27.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-4.0625 max=3.328125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-0.75 max=3.890625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.640625 max=1.2421875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.203125 max=1.6640625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-0.8203125 max=4.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-6.0 max=8.3125
Linear output=1 dtype=torch.bfloat16 min=-6.0 max=8.3125
LayerNorm input=0 dtype=torch.bfloat16 min=-3808.0 max=640.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=7.15625
LayerNorm output=1 dtype=torch.bfloat16 min=-35.25 max=9.9375
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-3808.0 max=640.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-4.71875 max=4.9375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-3.03125 max=4.28125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.125 max=1.9453125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.171875 max=1.71875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-6.0 max=8.3125
Linear input=0 dtype=torch.bfloat16 min=-4.0625 max=3.328125
Linear output=0 dtype=torch.bfloat16 min=-9.625 max=8.25
Linear output=1 dtype=torch.bfloat16 min=-9.5 max=8.0625
Linear input=0 dtype=torch.bfloat16 min=-4.0625 max=3.328125
Linear output=0 dtype=torch.bfloat16 min=-6.21875 max=7.09375
Linear output=1 dtype=torch.bfloat16 min=-5.96875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-4.0625 max=3.328125
Linear output=0 dtype=torch.bfloat16 min=-5.71875 max=5.6875
Linear output=1 dtype=torch.bfloat16 min=-5.59375 max=5.59375
Linear input=0 dtype=torch.bfloat16 min=-4.71875 max=4.9375
Linear output=0 dtype=torch.bfloat16 min=-4.96875 max=5.40625
Linear output=1 dtype=torch.bfloat16 min=-5.125 max=5.1875
Linear input=0 dtype=torch.bfloat16 min=-4.71875 max=4.9375
Linear output=0 dtype=torch.bfloat16 min=-5.53125 max=4.875
Linear output=1 dtype=torch.bfloat16 min=-6.96875 max=6.15625
Linear input=0 dtype=torch.bfloat16 min=-4.71875 max=4.9375
Linear output=0 dtype=torch.bfloat16 min=-4.625 max=4.125
Linear output=1 dtype=torch.bfloat16 min=-5.875 max=6.625
Linear input=0 dtype=torch.bfloat16 min=-3.75 max=3.546875
Linear output=0 dtype=torch.bfloat16 min=-6.875 max=6.375
Linear output=1 dtype=torch.bfloat16 min=-7.125 max=6.3125
Dropout input=0 dtype=torch.bfloat16 min=-7.125 max=6.375
Dropout output=0 dtype=torch.bfloat16 min=-6.875 max=6.375
Dropout output=1 dtype=torch.bfloat16 min=-7.125 max=6.3125
Linear input=0 dtype=torch.bfloat16 min=-5.8125 max=6.46875
Linear output=0 dtype=torch.bfloat16 min=-9.875 max=9.875
Linear output=1 dtype=torch.bfloat16 min=-11.3125 max=8.0
Attention output=0 dtype=torch.bfloat16 min=-7.125 max=6.375
Attention output=1 dtype=torch.bfloat16 min=-11.3125 max=9.875
LayerNorm input=0 dtype=torch.bfloat16 min=-106.5 max=27.5
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=9.125
LayerNorm output=1 dtype=torch.bfloat16 min=-33.25 max=8.9375
Linear input=0 dtype=torch.bfloat16 min=-7.53125 max=2.703125
Linear output=0 dtype=torch.bfloat16 min=-9.125 max=7.15625
Linear output=1 dtype=torch.bfloat16 min=-8.8125 max=7.28125
GELU input=0 dtype=torch.bfloat16 min=-7.53125 max=2.703125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=7.15625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=7.28125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=7.28125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=7.15625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=7.28125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=7.28125
Linear output=0 dtype=torch.bfloat16 min=-12.1875 max=10.625
Linear output=1 dtype=torch.bfloat16 min=-11.8125 max=11.125
FeedForward input=0 dtype=torch.bfloat16 min=-7.53125 max=2.703125
FeedForward output=0 dtype=torch.bfloat16 min=-12.1875 max=10.625
FeedForward output=1 dtype=torch.bfloat16 min=-11.8125 max=11.125
LayerNorm input=0 dtype=torch.bfloat16 min=-3856.0 max=624.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=5.96875
LayerNorm output=1 dtype=torch.bfloat16 min=-35.5 max=8.4375
Linear input=0 dtype=torch.bfloat16 min=-8.9375 max=6.75
Linear output=0 dtype=torch.bfloat16 min=-11.5 max=10.0
Linear output=1 dtype=torch.bfloat16 min=-11.25 max=10.0
GELU input=0 dtype=torch.bfloat16 min=-8.9375 max=6.75
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=10.0
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.0
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.0
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=10.0
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.0
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.0
Linear output=0 dtype=torch.bfloat16 min=-52.0 max=30.25
Linear output=1 dtype=torch.bfloat16 min=-51.75 max=24.375
FeedForward input=0 dtype=torch.bfloat16 min=-8.9375 max=6.75
FeedForward output=0 dtype=torch.bfloat16 min=-52.0 max=30.25
FeedForward output=1 dtype=torch.bfloat16 min=-51.75 max=24.375
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-86.5 max=27.5
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-3808.0 max=640.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-21.25 max=4.71875
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-4288.0 max=660.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-86.0 max=27.5
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-4.96875 max=4.125
Linear output=1 dtype=torch.bfloat16 min=-4.9375 max=4.09375
LayerNorm input=0 dtype=torch.bfloat16 min=-86.0 max=27.5
LayerNorm output=0 dtype=torch.bfloat16 min=-31.0 max=9.75
LayerNorm output=1 dtype=torch.bfloat16 min=-30.875 max=9.6875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-86.0 max=27.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-10.875 max=10.75
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.96875 max=0.78515625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.9375 max=1.3125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.75 max=2.46875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-0.9140625 max=4.125
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-4.3125 max=9.75
Linear output=1 dtype=torch.bfloat16 min=-4.3125 max=9.75
LayerNorm input=0 dtype=torch.bfloat16 min=-4288.0 max=660.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=12.6875
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=12.6875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-4288.0 max=660.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-4.59375 max=4.53125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.3125 max=4.8125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.5390625 max=1.765625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.65625 max=1.4296875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-4.09375 max=9.75
Linear input=0 dtype=torch.bfloat16 min=-10.875 max=10.75
Linear output=0 dtype=torch.bfloat16 min=-6.90625 max=6.46875
Linear output=1 dtype=torch.bfloat16 min=-6.90625 max=6.21875
Linear input=0 dtype=torch.bfloat16 min=-10.875 max=10.75
Linear output=0 dtype=torch.bfloat16 min=-6.9375 max=8.0
Linear output=1 dtype=torch.bfloat16 min=-6.75 max=7.875
Linear input=0 dtype=torch.bfloat16 min=-10.875 max=10.75
Linear output=0 dtype=torch.bfloat16 min=-5.71875 max=7.46875
Linear output=1 dtype=torch.bfloat16 min=-5.625 max=7.59375
Linear input=0 dtype=torch.bfloat16 min=-4.59375 max=4.53125
Linear output=0 dtype=torch.bfloat16 min=-4.5625 max=4.84375
Linear output=1 dtype=torch.bfloat16 min=-4.75 max=4.75
Linear input=0 dtype=torch.bfloat16 min=-4.59375 max=4.53125
Linear output=0 dtype=torch.bfloat16 min=-5.28125 max=4.34375
Linear output=1 dtype=torch.bfloat16 min=-7.53125 max=4.8125
Linear input=0 dtype=torch.bfloat16 min=-4.59375 max=4.53125
Linear output=0 dtype=torch.bfloat16 min=-4.0 max=4.09375
Linear output=1 dtype=torch.bfloat16 min=-5.15625 max=4.9375
Linear input=0 dtype=torch.bfloat16 min=-3.6875 max=5.59375
Linear output=0 dtype=torch.bfloat16 min=-7.0625 max=6.96875
Linear output=1 dtype=torch.bfloat16 min=-7.09375 max=6.90625
Dropout input=0 dtype=torch.bfloat16 min=-7.09375 max=6.96875
Dropout output=0 dtype=torch.bfloat16 min=-7.0625 max=6.96875
Dropout output=1 dtype=torch.bfloat16 min=-7.09375 max=6.90625
Linear input=0 dtype=torch.bfloat16 min=-3.46875 max=4.09375
Linear output=0 dtype=torch.bfloat16 min=-9.625 max=15.1875
Linear output=1 dtype=torch.bfloat16 min=-7.59375 max=10.625
Attention output=0 dtype=torch.bfloat16 min=-7.09375 max=6.96875
Attention output=1 dtype=torch.bfloat16 min=-9.625 max=15.1875
LayerNorm input=0 dtype=torch.bfloat16 min=-112.5 max=27.5
LayerNorm output=0 dtype=torch.bfloat16 min=-33.75 max=8.4375
LayerNorm output=1 dtype=torch.bfloat16 min=-33.75 max=8.4375
Linear input=0 dtype=torch.bfloat16 min=-8.0 max=2.578125
Linear output=0 dtype=torch.bfloat16 min=-9.6875 max=5.65625
Linear output=1 dtype=torch.bfloat16 min=-9.1875 max=5.59375
GELU input=0 dtype=torch.bfloat16 min=-8.0 max=2.578125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.65625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=5.59375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.65625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.65625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=5.59375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.65625
Linear output=0 dtype=torch.bfloat16 min=-10.1875 max=9.9375
Linear output=1 dtype=torch.bfloat16 min=-10.125 max=9.75
FeedForward input=0 dtype=torch.bfloat16 min=-8.0 max=2.578125
FeedForward output=0 dtype=torch.bfloat16 min=-10.1875 max=9.9375
FeedForward output=1 dtype=torch.bfloat16 min=-10.125 max=9.75
LayerNorm input=0 dtype=torch.bfloat16 min=-4288.0 max=664.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=12.0
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=12.25
Linear input=0 dtype=torch.bfloat16 min=-7.6875 max=4.84375
Linear output=0 dtype=torch.bfloat16 min=-11.75 max=7.84375
Linear output=1 dtype=torch.bfloat16 min=-9.5625 max=8.375
GELU input=0 dtype=torch.bfloat16 min=-7.6875 max=4.84375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=7.84375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=8.375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=8.375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=7.84375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=8.375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=8.375
Linear output=0 dtype=torch.bfloat16 min=-50.75 max=36.25
Linear output=1 dtype=torch.bfloat16 min=-50.75 max=36.25
FeedForward input=0 dtype=torch.bfloat16 min=-7.6875 max=4.84375
FeedForward output=0 dtype=torch.bfloat16 min=-50.75 max=36.25
FeedForward output=1 dtype=torch.bfloat16 min=-50.75 max=36.25
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-86.0 max=27.5
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-4288.0 max=660.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-21.25 max=4.71875
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-4768.0 max=820.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-100.5 max=27.5
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-4.46875 max=4.28125
Linear output=1 dtype=torch.bfloat16 min=-4.46875 max=4.25
LayerNorm input=0 dtype=torch.bfloat16 min=-100.5 max=27.5
LayerNorm output=0 dtype=torch.bfloat16 min=-32.0 max=9.4375
LayerNorm output=1 dtype=torch.bfloat16 min=-31.875 max=9.5
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-100.5 max=27.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-7.0625 max=6.5625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.46875 max=1.09375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.8515625 max=1.796875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.1875 max=2.3125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-1.3046875 max=4.28125
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-10.6875 max=8.9375
Linear output=1 dtype=torch.bfloat16 min=-10.6875 max=8.9375
LayerNorm input=0 dtype=torch.bfloat16 min=-4768.0 max=820.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.0 max=19.875
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=20.0
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-4768.0 max=820.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-4.875 max=4.15625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-5.84375 max=5.09375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.2421875 max=1.125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.40625 max=1.453125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-10.6875 max=8.9375
Linear input=0 dtype=torch.bfloat16 min=-7.0625 max=6.5625
Linear output=0 dtype=torch.bfloat16 min=-9.375 max=9.9375
Linear output=1 dtype=torch.bfloat16 min=-9.125 max=9.625
Linear input=0 dtype=torch.bfloat16 min=-7.0625 max=6.5625
Linear output=0 dtype=torch.bfloat16 min=-10.8125 max=9.8125
Linear output=1 dtype=torch.bfloat16 min=-10.5 max=9.5
Linear input=0 dtype=torch.bfloat16 min=-7.0625 max=6.5625
Linear output=0 dtype=torch.bfloat16 min=-6.625 max=5.9375
Linear output=1 dtype=torch.bfloat16 min=-6.34375 max=5.84375
Linear input=0 dtype=torch.bfloat16 min=-4.875 max=4.15625
Linear output=0 dtype=torch.bfloat16 min=-4.625 max=6.03125
Linear output=1 dtype=torch.bfloat16 min=-6.0 max=5.9375
Linear input=0 dtype=torch.bfloat16 min=-4.875 max=4.15625
Linear output=0 dtype=torch.bfloat16 min=-5.78125 max=5.4375
Linear output=1 dtype=torch.bfloat16 min=-5.8125 max=6.25
Linear input=0 dtype=torch.bfloat16 min=-4.875 max=4.15625
Linear output=0 dtype=torch.bfloat16 min=-4.6875 max=5.375
Linear output=1 dtype=torch.bfloat16 min=-6.15625 max=5.625
Linear input=0 dtype=torch.bfloat16 min=-4.9375 max=3.921875
Linear output=0 dtype=torch.bfloat16 min=-7.21875 max=7.5
Linear output=1 dtype=torch.bfloat16 min=-7.28125 max=7.25
Dropout input=0 dtype=torch.bfloat16 min=-7.28125 max=7.5
Dropout output=0 dtype=torch.bfloat16 min=-7.21875 max=7.5
Dropout output=1 dtype=torch.bfloat16 min=-7.28125 max=7.25
Linear input=0 dtype=torch.bfloat16 min=-5.34375 max=5.0625
Linear output=0 dtype=torch.bfloat16 min=-11.25 max=9.125
Linear output=1 dtype=torch.bfloat16 min=-10.4375 max=9.1875
Attention output=0 dtype=torch.bfloat16 min=-7.28125 max=7.5
Attention output=1 dtype=torch.bfloat16 min=-11.25 max=9.1875
LayerNorm input=0 dtype=torch.bfloat16 min=-119.0 max=27.5
LayerNorm output=0 dtype=torch.bfloat16 min=-33.75 max=8.3125
LayerNorm output=1 dtype=torch.bfloat16 min=-33.5 max=8.375
Linear input=0 dtype=torch.bfloat16 min=-6.75 max=2.78125
Linear output=0 dtype=torch.bfloat16 min=-7.78125 max=6.875
Linear output=1 dtype=torch.bfloat16 min=-7.84375 max=6.84375
GELU input=0 dtype=torch.bfloat16 min=-6.75 max=2.78125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=6.875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=6.84375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=6.875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=6.84375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.875
Linear output=0 dtype=torch.bfloat16 min=-7.3125 max=8.25
Linear output=1 dtype=torch.bfloat16 min=-7.53125 max=8.3125
FeedForward input=0 dtype=torch.bfloat16 min=-6.75 max=2.78125
FeedForward output=0 dtype=torch.bfloat16 min=-7.3125 max=8.25
FeedForward output=1 dtype=torch.bfloat16 min=-7.53125 max=8.3125
LayerNorm input=0 dtype=torch.bfloat16 min=-4768.0 max=824.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.0 max=20.0
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=20.125
Linear input=0 dtype=torch.bfloat16 min=-6.8125 max=3.953125
Linear output=0 dtype=torch.bfloat16 min=-10.625 max=9.4375
Linear output=1 dtype=torch.bfloat16 min=-11.0 max=7.96875
GELU input=0 dtype=torch.bfloat16 min=-6.8125 max=3.953125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.4375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=7.96875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.4375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.4375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=7.96875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.4375
Linear output=0 dtype=torch.bfloat16 min=-17.25 max=41.75
Linear output=1 dtype=torch.bfloat16 min=-17.875 max=43.25
FeedForward input=0 dtype=torch.bfloat16 min=-6.8125 max=3.953125
FeedForward output=0 dtype=torch.bfloat16 min=-17.25 max=41.75
FeedForward output=1 dtype=torch.bfloat16 min=-17.875 max=43.25
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-100.5 max=27.5
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-4768.0 max=820.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-21.25 max=4.71875
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-4896.0 max=968.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-131.0 max=27.375
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-5.0625 max=5.0625
Linear output=1 dtype=torch.bfloat16 min=-5.03125 max=5.0625
LayerNorm input=0 dtype=torch.bfloat16 min=-131.0 max=27.375
LayerNorm output=0 dtype=torch.bfloat16 min=-34.0 max=9.0
LayerNorm output=1 dtype=torch.bfloat16 min=-33.75 max=8.9375
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-131.0 max=27.375
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-15.6875 max=15.125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-1.2109375 max=4.8125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.0625 max=1.671875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.734375 max=1.9609375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-5.0625 max=1.4609375
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-12.1875 max=12.25
Linear output=1 dtype=torch.bfloat16 min=-12.1875 max=12.25
LayerNorm input=0 dtype=torch.bfloat16 min=-4896.0 max=968.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=23.5
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=23.875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-4896.0 max=968.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-4.875 max=3.5625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-6.8125 max=5.84375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.8984375 max=0.8203125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.546875 max=1.453125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-12.1875 max=12.25
Linear input=0 dtype=torch.bfloat16 min=-15.6875 max=15.125
Linear output=0 dtype=torch.bfloat16 min=-13.5625 max=14.0
Linear output=1 dtype=torch.bfloat16 min=-13.5 max=13.6875
Linear input=0 dtype=torch.bfloat16 min=-15.6875 max=15.125
Linear output=0 dtype=torch.bfloat16 min=-14.25 max=15.125
Linear output=1 dtype=torch.bfloat16 min=-14.1875 max=15.0
Linear input=0 dtype=torch.bfloat16 min=-15.6875 max=15.125
Linear output=0 dtype=torch.bfloat16 min=-10.4375 max=11.5625
Linear output=1 dtype=torch.bfloat16 min=-10.375 max=11.375
Linear input=0 dtype=torch.bfloat16 min=-4.875 max=3.5625
Linear output=0 dtype=torch.bfloat16 min=-5.3125 max=6.46875
Linear output=1 dtype=torch.bfloat16 min=-6.375 max=6.65625
Linear input=0 dtype=torch.bfloat16 min=-4.875 max=3.5625
Linear output=0 dtype=torch.bfloat16 min=-6.09375 max=6.59375
Linear output=1 dtype=torch.bfloat16 min=-6.15625 max=6.75
Linear input=0 dtype=torch.bfloat16 min=-4.875 max=3.5625
Linear output=0 dtype=torch.bfloat16 min=-5.09375 max=6.1875
Linear output=1 dtype=torch.bfloat16 min=-7.6875 max=6.65625
Linear input=0 dtype=torch.bfloat16 min=-6.09375 max=7.03125
Linear output=0 dtype=torch.bfloat16 min=-15.375 max=9.25
Linear output=1 dtype=torch.bfloat16 min=-14.625 max=9.0
Dropout input=0 dtype=torch.bfloat16 min=-15.375 max=9.25
Dropout output=0 dtype=torch.bfloat16 min=-15.375 max=9.25
Dropout output=1 dtype=torch.bfloat16 min=-14.625 max=9.0
Linear input=0 dtype=torch.bfloat16 min=-6.90625 max=6.4375
Linear output=0 dtype=torch.bfloat16 min=-14.8125 max=9.5
Linear output=1 dtype=torch.bfloat16 min=-14.0 max=12.375
Attention output=0 dtype=torch.bfloat16 min=-15.375 max=9.25
Attention output=1 dtype=torch.bfloat16 min=-14.8125 max=12.375
LayerNorm input=0 dtype=torch.bfloat16 min=-176.0 max=27.5
LayerNorm output=0 dtype=torch.bfloat16 min=-36.0 max=6.3125
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=6.4375
Linear input=0 dtype=torch.bfloat16 min=-10.375 max=2.0625
Linear output=0 dtype=torch.bfloat16 min=-15.5625 max=4.5
Linear output=1 dtype=torch.bfloat16 min=-16.625 max=4.4375
GELU input=0 dtype=torch.bfloat16 min=-10.375 max=2.0625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.5
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.4375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.5
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.5
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.4375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.5
Linear output=0 dtype=torch.bfloat16 min=-20.75 max=11.8125
Linear output=1 dtype=torch.bfloat16 min=-20.375 max=12.25
FeedForward input=0 dtype=torch.bfloat16 min=-10.375 max=2.0625
FeedForward output=0 dtype=torch.bfloat16 min=-20.75 max=11.8125
FeedForward output=1 dtype=torch.bfloat16 min=-20.375 max=12.25
LayerNorm input=0 dtype=torch.bfloat16 min=-4864.0 max=940.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=20.625
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=20.875
Linear input=0 dtype=torch.bfloat16 min=-6.75 max=2.984375
Linear output=0 dtype=torch.bfloat16 min=-12.1875 max=8.4375
Linear output=1 dtype=torch.bfloat16 min=-12.625 max=9.3125
GELU input=0 dtype=torch.bfloat16 min=-6.75 max=2.984375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.4375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=9.3125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.3125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.4375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=9.3125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.3125
Linear output=0 dtype=torch.bfloat16 min=-69.5 max=58.0
Linear output=1 dtype=torch.bfloat16 min=-76.5 max=65.0
FeedForward input=0 dtype=torch.bfloat16 min=-6.75 max=2.984375
FeedForward output=0 dtype=torch.bfloat16 min=-69.5 max=58.0
FeedForward output=1 dtype=torch.bfloat16 min=-76.5 max=65.0
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-131.0 max=27.375
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-4896.0 max=968.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-21.25 max=4.71875
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-5024.0 max=1248.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-109.0 max=34.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-2.296875 max=6.4375
Linear output=1 dtype=torch.bfloat16 min=-2.328125 max=6.40625
LayerNorm input=0 dtype=torch.bfloat16 min=-109.0 max=34.0
LayerNorm output=0 dtype=torch.bfloat16 min=-31.625 max=11.8125
LayerNorm output=1 dtype=torch.bfloat16 min=-31.0 max=11.9375
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-109.0 max=34.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-12.75 max=10.4375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.015625 max=5.5625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.9296875 max=1.71875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.9375 max=1.8984375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-1.9765625 max=6.4375
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-14.0625 max=15.5625
Linear output=1 dtype=torch.bfloat16 min=-14.0 max=15.625
LayerNorm input=0 dtype=torch.bfloat16 min=-5024.0 max=1248.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=22.125
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=22.5
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-5024.0 max=1248.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-5.25 max=3.984375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-8.25 max=9.125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.3203125 max=0.73828125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.5 max=1.2578125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-14.0625 max=15.625
Linear input=0 dtype=torch.bfloat16 min=-12.75 max=10.4375
Linear output=0 dtype=torch.bfloat16 min=-18.375 max=12.6875
Linear output=1 dtype=torch.bfloat16 min=-18.375 max=12.75
Linear input=0 dtype=torch.bfloat16 min=-12.75 max=10.4375
Linear output=0 dtype=torch.bfloat16 min=-13.9375 max=16.125
Linear output=1 dtype=torch.bfloat16 min=-13.9375 max=16.375
Linear input=0 dtype=torch.bfloat16 min=-12.75 max=10.4375
Linear output=0 dtype=torch.bfloat16 min=-15.0 max=16.5
Linear output=1 dtype=torch.bfloat16 min=-15.0625 max=17.0
Linear input=0 dtype=torch.bfloat16 min=-5.25 max=3.984375
Linear output=0 dtype=torch.bfloat16 min=-5.40625 max=4.96875
Linear output=1 dtype=torch.bfloat16 min=-6.03125 max=5.125
Linear input=0 dtype=torch.bfloat16 min=-5.25 max=3.984375
Linear output=0 dtype=torch.bfloat16 min=-6.28125 max=8.625
Linear output=1 dtype=torch.bfloat16 min=-6.3125 max=8.6875
Linear input=0 dtype=torch.bfloat16 min=-5.25 max=3.984375
Linear output=0 dtype=torch.bfloat16 min=-5.40625 max=5.03125
Linear output=1 dtype=torch.bfloat16 min=-7.21875 max=6.28125
Linear input=0 dtype=torch.bfloat16 min=-7.8125 max=7.8125
Linear output=0 dtype=torch.bfloat16 min=-17.0 max=6.75
Linear output=1 dtype=torch.bfloat16 min=-16.25 max=6.875
Dropout input=0 dtype=torch.bfloat16 min=-17.0 max=6.875
Dropout output=0 dtype=torch.bfloat16 min=-17.0 max=6.75
Dropout output=1 dtype=torch.bfloat16 min=-16.25 max=6.875
Linear input=0 dtype=torch.bfloat16 min=-6.5625 max=7.9375
Linear output=0 dtype=torch.bfloat16 min=-19.0 max=11.125
Linear output=1 dtype=torch.bfloat16 min=-21.125 max=11.3125
Attention output=0 dtype=torch.bfloat16 min=-17.0 max=6.875
Attention output=1 dtype=torch.bfloat16 min=-21.125 max=11.3125
LayerNorm input=0 dtype=torch.bfloat16 min=-190.0 max=46.75
LayerNorm output=0 dtype=torch.bfloat16 min=-35.0 max=10.375
LayerNorm output=1 dtype=torch.bfloat16 min=-34.25 max=10.5
Linear input=0 dtype=torch.bfloat16 min=-9.5 max=2.53125
Linear output=0 dtype=torch.bfloat16 min=-11.3125 max=5.71875
Linear output=1 dtype=torch.bfloat16 min=-11.875 max=5.5625
GELU input=0 dtype=torch.bfloat16 min=-9.5 max=2.53125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.71875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=5.5625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.71875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.71875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=5.5625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.71875
Linear output=0 dtype=torch.bfloat16 min=-6.6875 max=6.875
Linear output=1 dtype=torch.bfloat16 min=-6.75 max=6.625
FeedForward input=0 dtype=torch.bfloat16 min=-9.5 max=2.53125
FeedForward output=0 dtype=torch.bfloat16 min=-6.6875 max=6.875
FeedForward output=1 dtype=torch.bfloat16 min=-6.75 max=6.625
LayerNorm input=0 dtype=torch.bfloat16 min=-5024.0 max=1240.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=21.5
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=21.75
Linear input=0 dtype=torch.bfloat16 min=-7.46875 max=3.71875
Linear output=0 dtype=torch.bfloat16 min=-10.5 max=11.5
Linear output=1 dtype=torch.bfloat16 min=-11.875 max=11.3125
GELU input=0 dtype=torch.bfloat16 min=-7.46875 max=3.71875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=11.5
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=11.3125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=11.5
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=11.5
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=11.3125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=11.5
Linear output=0 dtype=torch.bfloat16 min=-54.0 max=48.75
Linear output=1 dtype=torch.bfloat16 min=-57.25 max=52.25
FeedForward input=0 dtype=torch.bfloat16 min=-7.46875 max=3.71875
FeedForward output=0 dtype=torch.bfloat16 min=-54.0 max=48.75
FeedForward output=1 dtype=torch.bfloat16 min=-57.25 max=52.25
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-109.0 max=34.0
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-5024.0 max=1248.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-21.25 max=4.71875
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-4832.0 max=1920.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-198.0 max=45.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-6.34375 max=6.0
Linear output=1 dtype=torch.bfloat16 min=-6.375 max=6.0
LayerNorm input=0 dtype=torch.bfloat16 min=-198.0 max=45.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.0 max=10.8125
LayerNorm output=1 dtype=torch.bfloat16 min=-35.0 max=10.625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-198.0 max=45.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-6.96875 max=6.9375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-6.375 max=1.9609375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.8828125 max=1.6015625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.921875 max=1.3515625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-2.046875 max=6.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-13.4375 max=13.375
Linear output=1 dtype=torch.bfloat16 min=-13.5 max=13.375
LayerNorm input=0 dtype=torch.bfloat16 min=-4832.0 max=1920.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.5 max=24.0
LayerNorm output=1 dtype=torch.bfloat16 min=-35.5 max=24.25
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-4832.0 max=1920.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-5.90625 max=5.46875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-11.1875 max=10.5625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.76171875 max=0.64453125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.171875 max=1.53125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-13.5 max=13.375
Linear input=0 dtype=torch.bfloat16 min=-6.96875 max=6.9375
Linear output=0 dtype=torch.bfloat16 min=-7.46875 max=8.375
Linear output=1 dtype=torch.bfloat16 min=-7.34375 max=8.1875
Linear input=0 dtype=torch.bfloat16 min=-6.96875 max=6.9375
Linear output=0 dtype=torch.bfloat16 min=-8.875 max=8.8125
Linear output=1 dtype=torch.bfloat16 min=-8.9375 max=8.6875
Linear input=0 dtype=torch.bfloat16 min=-6.96875 max=6.9375
Linear output=0 dtype=torch.bfloat16 min=-7.21875 max=7.0
Linear output=1 dtype=torch.bfloat16 min=-7.09375 max=6.8125
Linear input=0 dtype=torch.bfloat16 min=-5.90625 max=5.46875
Linear output=0 dtype=torch.bfloat16 min=-6.4375 max=8.625
Linear output=1 dtype=torch.bfloat16 min=-6.75 max=8.6875
Linear input=0 dtype=torch.bfloat16 min=-5.90625 max=5.46875
Linear output=0 dtype=torch.bfloat16 min=-6.8125 max=6.4375
Linear output=1 dtype=torch.bfloat16 min=-7.75 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-5.90625 max=5.46875
Linear output=0 dtype=torch.bfloat16 min=-6.34375 max=8.5
Linear output=1 dtype=torch.bfloat16 min=-9.9375 max=8.9375
Linear input=0 dtype=torch.bfloat16 min=-5.09375 max=4.65625
Linear output=0 dtype=torch.bfloat16 min=-6.625 max=12.75
Linear output=1 dtype=torch.bfloat16 min=-6.84375 max=12.25
Dropout input=0 dtype=torch.bfloat16 min=-6.84375 max=12.75
Dropout output=0 dtype=torch.bfloat16 min=-6.625 max=12.75
Dropout output=1 dtype=torch.bfloat16 min=-6.84375 max=12.25
Linear input=0 dtype=torch.bfloat16 min=-4.75 max=7.53125
Linear output=0 dtype=torch.bfloat16 min=-13.75 max=19.125
Linear output=1 dtype=torch.bfloat16 min=-15.25 max=20.0
Attention output=0 dtype=torch.bfloat16 min=-6.84375 max=12.75
Attention output=1 dtype=torch.bfloat16 min=-15.25 max=20.0
LayerNorm input=0 dtype=torch.bfloat16 min=-272.0 max=50.5
LayerNorm output=0 dtype=torch.bfloat16 min=-36.5 max=8.5625
LayerNorm output=1 dtype=torch.bfloat16 min=-36.5 max=8.5625
Linear input=0 dtype=torch.bfloat16 min=-8.4375 max=2.234375
Linear output=0 dtype=torch.bfloat16 min=-7.1875 max=3.34375
Linear output=1 dtype=torch.bfloat16 min=-7.09375 max=3.953125
GELU input=0 dtype=torch.bfloat16 min=-8.4375 max=2.234375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.34375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.953125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.953125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.34375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.953125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.953125
Linear output=0 dtype=torch.bfloat16 min=-5.875 max=28.5
Linear output=1 dtype=torch.bfloat16 min=-5.84375 max=28.125
FeedForward input=0 dtype=torch.bfloat16 min=-8.4375 max=2.234375
FeedForward output=0 dtype=torch.bfloat16 min=-5.875 max=28.5
FeedForward output=1 dtype=torch.bfloat16 min=-5.84375 max=28.125
LayerNorm input=0 dtype=torch.bfloat16 min=-4928.0 max=1984.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.5 max=24.75
LayerNorm output=1 dtype=torch.bfloat16 min=-35.5 max=24.75
Linear input=0 dtype=torch.bfloat16 min=-8.75 max=4.125
Linear output=0 dtype=torch.bfloat16 min=-23.625 max=15.75
Linear output=1 dtype=torch.bfloat16 min=-24.125 max=16.0
GELU input=0 dtype=torch.bfloat16 min=-8.75 max=4.125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=15.75
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=16.0
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=16.0
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=15.75
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=16.0
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=16.0
Linear output=0 dtype=torch.bfloat16 min=-83.5 max=77.5
Linear output=1 dtype=torch.bfloat16 min=-84.5 max=79.0
FeedForward input=0 dtype=torch.bfloat16 min=-8.75 max=4.125
FeedForward output=0 dtype=torch.bfloat16 min=-83.5 max=77.5
FeedForward output=1 dtype=torch.bfloat16 min=-84.5 max=79.0
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-198.0 max=45.0
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-4832.0 max=1920.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-21.25 max=4.71875
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-4960.0 max=3056.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-192.0 max=68.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-2.734375 max=5.71875
Linear output=1 dtype=torch.bfloat16 min=-2.71875 max=5.71875
LayerNorm input=0 dtype=torch.bfloat16 min=-192.0 max=68.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.75 max=19.625
LayerNorm output=1 dtype=torch.bfloat16 min=-34.25 max=19.375
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-192.0 max=68.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-13.9375 max=13.875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-1.9453125 max=5.71875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.484375 max=1.4765625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.859375 max=1.1640625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-2.734375 max=5.625
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-13.625 max=15.5
Linear output=1 dtype=torch.bfloat16 min=-13.6875 max=15.5
LayerNorm input=0 dtype=torch.bfloat16 min=-4960.0 max=3056.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.0 max=28.125
LayerNorm output=1 dtype=torch.bfloat16 min=-35.0 max=28.375
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-4960.0 max=3056.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-5.125 max=4.34375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-6.09375 max=11.5625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.8359375 max=1.0859375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.140625 max=1.8515625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-13.6875 max=15.5
Linear input=0 dtype=torch.bfloat16 min=-13.9375 max=13.875
Linear output=0 dtype=torch.bfloat16 min=-15.8125 max=11.0625
Linear output=1 dtype=torch.bfloat16 min=-15.4375 max=10.9375
Linear input=0 dtype=torch.bfloat16 min=-13.9375 max=13.875
Linear output=0 dtype=torch.bfloat16 min=-20.125 max=17.625
Linear output=1 dtype=torch.bfloat16 min=-19.75 max=17.5
Linear input=0 dtype=torch.bfloat16 min=-13.9375 max=13.875
Linear output=0 dtype=torch.bfloat16 min=-8.1875 max=7.6875
Linear output=1 dtype=torch.bfloat16 min=-7.78125 max=7.34375
Linear input=0 dtype=torch.bfloat16 min=-5.125 max=4.34375
Linear output=0 dtype=torch.bfloat16 min=-5.625 max=5.9375
Linear output=1 dtype=torch.bfloat16 min=-5.96875 max=5.9375
Linear input=0 dtype=torch.bfloat16 min=-5.125 max=4.34375
Linear output=0 dtype=torch.bfloat16 min=-6.75 max=5.3125
Linear output=1 dtype=torch.bfloat16 min=-6.875 max=6.8125
Linear input=0 dtype=torch.bfloat16 min=-5.125 max=4.34375
Linear output=0 dtype=torch.bfloat16 min=-8.0 max=7.1875
Linear output=1 dtype=torch.bfloat16 min=-8.5625 max=7.375
Linear input=0 dtype=torch.bfloat16 min=-4.90625 max=6.3125
Linear output=0 dtype=torch.bfloat16 min=-19.75 max=8.6875
Linear output=1 dtype=torch.bfloat16 min=-19.25 max=8.4375
Dropout input=0 dtype=torch.bfloat16 min=-19.75 max=8.6875
Dropout output=0 dtype=torch.bfloat16 min=-19.75 max=8.6875
Dropout output=1 dtype=torch.bfloat16 min=-19.25 max=8.4375
Linear input=0 dtype=torch.bfloat16 min=-7.59375 max=6.625
Linear output=0 dtype=torch.bfloat16 min=-23.75 max=11.375
Linear output=1 dtype=torch.bfloat16 min=-25.0 max=11.5625
Attention output=0 dtype=torch.bfloat16 min=-19.75 max=8.6875
Attention output=1 dtype=torch.bfloat16 min=-25.0 max=11.5625
LayerNorm input=0 dtype=torch.bfloat16 min=-262.0 max=62.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.0 max=12.375
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=12.25
Linear input=0 dtype=torch.bfloat16 min=-8.375 max=3.3125
Linear output=0 dtype=torch.bfloat16 min=-6.09375 max=4.28125
Linear output=1 dtype=torch.bfloat16 min=-5.90625 max=4.09375
GELU input=0 dtype=torch.bfloat16 min=-8.375 max=3.3125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.28125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.09375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.28125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.28125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.09375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.28125
Linear output=0 dtype=torch.bfloat16 min=-12.9375 max=15.9375
Linear output=1 dtype=torch.bfloat16 min=-9.8125 max=19.375
FeedForward input=0 dtype=torch.bfloat16 min=-8.375 max=3.3125
FeedForward output=0 dtype=torch.bfloat16 min=-12.9375 max=15.9375
FeedForward output=1 dtype=torch.bfloat16 min=-9.8125 max=19.375
LayerNorm input=0 dtype=torch.bfloat16 min=-4896.0 max=3024.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.0 max=28.25
LayerNorm output=1 dtype=torch.bfloat16 min=-35.0 max=28.5
Linear input=0 dtype=torch.bfloat16 min=-7.75 max=4.375
Linear output=0 dtype=torch.bfloat16 min=-15.5625 max=8.625
Linear output=1 dtype=torch.bfloat16 min=-16.75 max=9.0625
GELU input=0 dtype=torch.bfloat16 min=-7.75 max=4.375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=9.0625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.0625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=9.0625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.0625
Linear output=0 dtype=torch.bfloat16 min=-65.0 max=64.5
Linear output=1 dtype=torch.bfloat16 min=-66.5 max=66.5
FeedForward input=0 dtype=torch.bfloat16 min=-7.75 max=4.375
FeedForward output=0 dtype=torch.bfloat16 min=-65.0 max=64.5
FeedForward output=1 dtype=torch.bfloat16 min=-66.5 max=66.5
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-192.0 max=68.0
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-4960.0 max=3056.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-21.25 max=4.71875
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-4832.0 max=3808.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-182.0 max=70.5
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-5.5625 max=5.84375
Linear output=1 dtype=torch.bfloat16 min=-5.5625 max=5.84375
LayerNorm input=0 dtype=torch.bfloat16 min=-182.0 max=70.5
LayerNorm output=0 dtype=torch.bfloat16 min=-32.5 max=14.375
LayerNorm output=1 dtype=torch.bfloat16 min=-32.25 max=14.625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-182.0 max=70.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-9.9375 max=10.25
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.5 max=5.84375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.046875 max=1.8046875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.71875 max=0.9375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-5.5625 max=3.015625
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-14.0 max=11.8125
Linear output=1 dtype=torch.bfloat16 min=-14.125 max=11.8125
LayerNorm input=0 dtype=torch.bfloat16 min=-4832.0 max=3808.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.75 max=29.375
LayerNorm output=1 dtype=torch.bfloat16 min=-34.75 max=29.75
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-4832.0 max=3808.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-5.3125 max=4.5
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-10.8125 max=11.8125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.9609375 max=0.875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.453125 max=3.03125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-14.125 max=11.5
Linear input=0 dtype=torch.bfloat16 min=-9.9375 max=10.25
Linear output=0 dtype=torch.bfloat16 min=-11.0625 max=11.0
Linear output=1 dtype=torch.bfloat16 min=-11.0625 max=11.25
Linear input=0 dtype=torch.bfloat16 min=-9.9375 max=10.25
Linear output=0 dtype=torch.bfloat16 min=-13.375 max=12.1875
Linear output=1 dtype=torch.bfloat16 min=-13.6875 max=12.5625
Linear input=0 dtype=torch.bfloat16 min=-9.9375 max=10.25
Linear output=0 dtype=torch.bfloat16 min=-7.03125 max=8.625
Linear output=1 dtype=torch.bfloat16 min=-6.71875 max=8.5
Linear input=0 dtype=torch.bfloat16 min=-5.3125 max=4.5
Linear output=0 dtype=torch.bfloat16 min=-6.5625 max=6.0625
Linear output=1 dtype=torch.bfloat16 min=-6.71875 max=6.15625
Linear input=0 dtype=torch.bfloat16 min=-5.3125 max=4.5
Linear output=0 dtype=torch.bfloat16 min=-5.65625 max=6.21875
Linear output=1 dtype=torch.bfloat16 min=-7.0625 max=6.3125
Linear input=0 dtype=torch.bfloat16 min=-5.3125 max=4.5
Linear output=0 dtype=torch.bfloat16 min=-7.03125 max=8.0
Linear output=1 dtype=torch.bfloat16 min=-7.15625 max=7.96875
Linear input=0 dtype=torch.bfloat16 min=-4.0 max=4.875
Linear output=0 dtype=torch.bfloat16 min=-17.625 max=8.5
Linear output=1 dtype=torch.bfloat16 min=-19.625 max=7.65625
Dropout input=0 dtype=torch.bfloat16 min=-19.625 max=8.5
Dropout output=0 dtype=torch.bfloat16 min=-17.625 max=8.5
Dropout output=1 dtype=torch.bfloat16 min=-19.625 max=7.65625
Linear input=0 dtype=torch.bfloat16 min=-5.03125 max=4.46875
Linear output=0 dtype=torch.bfloat16 min=-28.75 max=20.625
Linear output=1 dtype=torch.bfloat16 min=-31.5 max=22.0
Attention output=0 dtype=torch.bfloat16 min=-19.625 max=8.5
Attention output=1 dtype=torch.bfloat16 min=-31.5 max=22.0
LayerNorm input=0 dtype=torch.bfloat16 min=-246.0 max=78.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.0 max=11.1875
LayerNorm output=1 dtype=torch.bfloat16 min=-34.5 max=11.3125
Linear input=0 dtype=torch.bfloat16 min=-10.1875 max=2.65625
Linear output=0 dtype=torch.bfloat16 min=-7.15625 max=4.34375
Linear output=1 dtype=torch.bfloat16 min=-6.875 max=4.46875
GELU input=0 dtype=torch.bfloat16 min=-10.1875 max=2.65625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.34375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.46875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.46875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.34375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.46875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.46875
Linear output=0 dtype=torch.bfloat16 min=-41.75 max=8.1875
Linear output=1 dtype=torch.bfloat16 min=-41.25 max=8.3125
FeedForward input=0 dtype=torch.bfloat16 min=-10.1875 max=2.65625
FeedForward output=0 dtype=torch.bfloat16 min=-41.75 max=8.1875
FeedForward output=1 dtype=torch.bfloat16 min=-41.25 max=8.3125
LayerNorm input=0 dtype=torch.bfloat16 min=-4768.0 max=3712.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.75 max=29.75
LayerNorm output=1 dtype=torch.bfloat16 min=-34.75 max=29.875
Linear input=0 dtype=torch.bfloat16 min=-8.5625 max=5.34375
Linear output=0 dtype=torch.bfloat16 min=-18.125 max=9.375
Linear output=1 dtype=torch.bfloat16 min=-17.875 max=8.875
GELU input=0 dtype=torch.bfloat16 min=-8.5625 max=5.34375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=8.875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=8.875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.375
Linear output=0 dtype=torch.bfloat16 min=-49.25 max=39.0
Linear output=1 dtype=torch.bfloat16 min=-51.25 max=37.75
FeedForward input=0 dtype=torch.bfloat16 min=-8.5625 max=5.34375
FeedForward output=0 dtype=torch.bfloat16 min=-49.25 max=39.0
FeedForward output=1 dtype=torch.bfloat16 min=-51.25 max=37.75
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-182.0 max=70.5
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-4832.0 max=3808.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-21.25 max=4.71875
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-5312.0 max=4016.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-89.0 max=87.5
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-4.5625 max=8.6875
Linear output=1 dtype=torch.bfloat16 min=-4.59375 max=8.6875
LayerNorm input=0 dtype=torch.bfloat16 min=-89.0 max=87.5
LayerNorm output=0 dtype=torch.bfloat16 min=-21.5 max=20.75
LayerNorm output=1 dtype=torch.bfloat16 min=-22.5 max=20.25
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-89.0 max=87.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-18.0 max=18.75
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.59375 max=6.25
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.828125 max=2.515625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.6796875 max=1.2421875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-4.03125 max=8.6875
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-14.25 max=13.375
Linear output=1 dtype=torch.bfloat16 min=-14.25 max=13.375
LayerNorm input=0 dtype=torch.bfloat16 min=-5312.0 max=4016.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.75 max=31.125
LayerNorm output=1 dtype=torch.bfloat16 min=-33.75 max=31.5
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-5312.0 max=4016.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-4.0625 max=4.78125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-10.75 max=11.4375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.09375 max=0.76953125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.203125 max=2.40625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-14.25 max=13.375
Linear input=0 dtype=torch.bfloat16 min=-18.0 max=18.75
Linear output=0 dtype=torch.bfloat16 min=-17.25 max=19.25
Linear output=1 dtype=torch.bfloat16 min=-17.125 max=19.25
Linear input=0 dtype=torch.bfloat16 min=-18.0 max=18.75
Linear output=0 dtype=torch.bfloat16 min=-28.625 max=27.375
Linear output=1 dtype=torch.bfloat16 min=-28.5 max=27.875
Linear input=0 dtype=torch.bfloat16 min=-18.0 max=18.75
Linear output=0 dtype=torch.bfloat16 min=-11.25 max=12.0
Linear output=1 dtype=torch.bfloat16 min=-10.875 max=11.8125
Linear input=0 dtype=torch.bfloat16 min=-4.0625 max=4.78125
Linear output=0 dtype=torch.bfloat16 min=-7.375 max=6.875
Linear output=1 dtype=torch.bfloat16 min=-7.84375 max=7.125
Linear input=0 dtype=torch.bfloat16 min=-4.0625 max=4.78125
Linear output=0 dtype=torch.bfloat16 min=-6.90625 max=6.9375
Linear output=1 dtype=torch.bfloat16 min=-7.0 max=6.6875
Linear input=0 dtype=torch.bfloat16 min=-4.0625 max=4.78125
Linear output=0 dtype=torch.bfloat16 min=-6.40625 max=7.34375
Linear output=1 dtype=torch.bfloat16 min=-6.09375 max=7.5625
Linear input=0 dtype=torch.bfloat16 min=-8.0 max=6.84375
Linear output=0 dtype=torch.bfloat16 min=-36.75 max=12.0625
Linear output=1 dtype=torch.bfloat16 min=-35.0 max=12.1875
Dropout input=0 dtype=torch.bfloat16 min=-36.75 max=12.1875
Dropout output=0 dtype=torch.bfloat16 min=-36.75 max=12.0625
Dropout output=1 dtype=torch.bfloat16 min=-35.0 max=12.1875
Linear input=0 dtype=torch.bfloat16 min=-5.8125 max=6.5
Linear output=0 dtype=torch.bfloat16 min=-23.5 max=29.75
Linear output=1 dtype=torch.bfloat16 min=-22.125 max=27.875
Attention output=0 dtype=torch.bfloat16 min=-36.75 max=12.1875
Attention output=1 dtype=torch.bfloat16 min=-23.5 max=29.75
LayerNorm input=0 dtype=torch.bfloat16 min=-235.0 max=100.5
LayerNorm output=0 dtype=torch.bfloat16 min=-31.375 max=14.1875
LayerNorm output=1 dtype=torch.bfloat16 min=-30.75 max=14.25
Linear input=0 dtype=torch.bfloat16 min=-5.5625 max=3.0625
Linear output=0 dtype=torch.bfloat16 min=-9.125 max=13.25
Linear output=1 dtype=torch.bfloat16 min=-10.8125 max=13.5
GELU input=0 dtype=torch.bfloat16 min=-5.5625 max=3.0625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=13.25
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=13.5
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=13.5
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=13.25
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=13.5
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=13.5
Linear output=0 dtype=torch.bfloat16 min=-10.125 max=9.0
Linear output=1 dtype=torch.bfloat16 min=-10.75 max=8.75
FeedForward input=0 dtype=torch.bfloat16 min=-5.5625 max=3.0625
FeedForward output=0 dtype=torch.bfloat16 min=-10.125 max=9.0
FeedForward output=1 dtype=torch.bfloat16 min=-10.75 max=8.75
LayerNorm input=0 dtype=torch.bfloat16 min=-5216.0 max=3904.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.25 max=30.75
LayerNorm output=1 dtype=torch.bfloat16 min=-34.0 max=31.125
Linear input=0 dtype=torch.bfloat16 min=-9.875 max=4.34375
Linear output=0 dtype=torch.bfloat16 min=-14.0 max=9.25
Linear output=1 dtype=torch.bfloat16 min=-19.125 max=11.1875
GELU input=0 dtype=torch.bfloat16 min=-9.875 max=4.34375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.25
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=11.1875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.25
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=11.1875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=11.1875
Linear output=0 dtype=torch.bfloat16 min=-22.75 max=31.75
Linear output=1 dtype=torch.bfloat16 min=-24.0 max=31.625
FeedForward input=0 dtype=torch.bfloat16 min=-9.875 max=4.34375
FeedForward output=0 dtype=torch.bfloat16 min=-22.75 max=31.75
FeedForward output=1 dtype=torch.bfloat16 min=-24.0 max=31.625
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-89.0 max=87.5
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-5312.0 max=4016.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-21.25 max=4.71875
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-5312.0 max=3968.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-202.0 max=113.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-7.09375 max=5.3125
Linear output=1 dtype=torch.bfloat16 min=-7.09375 max=5.3125
LayerNorm input=0 dtype=torch.bfloat16 min=-202.0 max=113.0
LayerNorm output=0 dtype=torch.bfloat16 min=-29.5 max=16.625
LayerNorm output=1 dtype=torch.bfloat16 min=-29.75 max=16.125
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-202.0 max=113.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-14.9375 max=12.5625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.34375 max=5.3125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.5 max=1.8359375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.6640625 max=2.03125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-7.09375 max=3.921875
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-15.1875 max=11.25
Linear output=1 dtype=torch.bfloat16 min=-15.25 max=11.25
LayerNorm input=0 dtype=torch.bfloat16 min=-5312.0 max=3968.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.0 max=30.875
LayerNorm output=1 dtype=torch.bfloat16 min=-33.5 max=31.375
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-5312.0 max=3968.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-8.8125 max=7.46875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-11.375 max=9.5
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.125 max=0.5703125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.140625 max=3.34375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-15.25 max=11.25
Linear input=0 dtype=torch.bfloat16 min=-14.9375 max=12.5625
Linear output=0 dtype=torch.bfloat16 min=-12.25 max=10.3125
Linear output=1 dtype=torch.bfloat16 min=-12.5 max=10.375
Linear input=0 dtype=torch.bfloat16 min=-14.9375 max=12.5625
Linear output=0 dtype=torch.bfloat16 min=-13.5 max=16.25
Linear output=1 dtype=torch.bfloat16 min=-13.4375 max=15.9375
Linear input=0 dtype=torch.bfloat16 min=-14.9375 max=12.5625
Linear output=0 dtype=torch.bfloat16 min=-9.6875 max=8.5625
Linear output=1 dtype=torch.bfloat16 min=-9.625 max=8.125
Linear input=0 dtype=torch.bfloat16 min=-8.8125 max=7.46875
Linear output=0 dtype=torch.bfloat16 min=-10.0 max=15.75
Linear output=1 dtype=torch.bfloat16 min=-8.0625 max=12.4375
Linear input=0 dtype=torch.bfloat16 min=-8.8125 max=7.46875
Linear output=0 dtype=torch.bfloat16 min=-8.375 max=9.5
Linear output=1 dtype=torch.bfloat16 min=-8.375 max=9.0625
Linear input=0 dtype=torch.bfloat16 min=-8.8125 max=7.46875
Linear output=0 dtype=torch.bfloat16 min=-9.6875 max=10.3125
Linear output=1 dtype=torch.bfloat16 min=-9.8125 max=10.4375
Linear input=0 dtype=torch.bfloat16 min=-6.78125 max=6.125
Linear output=0 dtype=torch.bfloat16 min=-27.875 max=8.375
Linear output=1 dtype=torch.bfloat16 min=-25.125 max=8.1875
Dropout input=0 dtype=torch.bfloat16 min=-27.875 max=8.375
Dropout output=0 dtype=torch.bfloat16 min=-27.875 max=8.375
Dropout output=1 dtype=torch.bfloat16 min=-25.125 max=8.1875
Linear input=0 dtype=torch.bfloat16 min=-7.28125 max=7.125
Linear output=0 dtype=torch.bfloat16 min=-35.0 max=12.1875
Linear output=1 dtype=torch.bfloat16 min=-32.5 max=20.5
Attention output=0 dtype=torch.bfloat16 min=-27.875 max=8.375
Attention output=1 dtype=torch.bfloat16 min=-35.0 max=20.5
LayerNorm input=0 dtype=torch.bfloat16 min=-314.0 max=106.5
LayerNorm output=0 dtype=torch.bfloat16 min=-35.25 max=11.875
LayerNorm output=1 dtype=torch.bfloat16 min=-35.0 max=11.875
Linear input=0 dtype=torch.bfloat16 min=-2.625 max=2.421875
Linear output=0 dtype=torch.bfloat16 min=-10.375 max=12.8125
Linear output=1 dtype=torch.bfloat16 min=-11.875 max=12.4375
GELU input=0 dtype=torch.bfloat16 min=-2.625 max=2.421875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=12.8125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=12.4375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=12.8125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=12.8125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=12.4375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=12.8125
Linear output=0 dtype=torch.bfloat16 min=-14.5 max=6.09375
Linear output=1 dtype=torch.bfloat16 min=-14.0625 max=6.375
FeedForward input=0 dtype=torch.bfloat16 min=-2.625 max=2.421875
FeedForward output=0 dtype=torch.bfloat16 min=-14.5 max=6.09375
FeedForward output=1 dtype=torch.bfloat16 min=-14.0625 max=6.375
LayerNorm input=0 dtype=torch.bfloat16 min=-5312.0 max=3856.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.75 max=29.625
LayerNorm output=1 dtype=torch.bfloat16 min=-34.0 max=30.0
Linear input=0 dtype=torch.bfloat16 min=-10.9375 max=5.625
Linear output=0 dtype=torch.bfloat16 min=-83.5 max=28.375
Linear output=1 dtype=torch.bfloat16 min=-88.5 max=24.625
GELU input=0 dtype=torch.bfloat16 min=-10.9375 max=5.625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=28.375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=24.625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=28.375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=28.375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=24.625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=28.375
Linear output=0 dtype=torch.bfloat16 min=-54.25 max=62.25
Linear output=1 dtype=torch.bfloat16 min=-57.5 max=62.75
FeedForward input=0 dtype=torch.bfloat16 min=-10.9375 max=5.625
FeedForward output=0 dtype=torch.bfloat16 min=-54.25 max=62.25
FeedForward output=1 dtype=torch.bfloat16 min=-57.5 max=62.75
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-202.0 max=113.0
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-5312.0 max=3968.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-21.25 max=4.71875
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-5280.0 max=4160.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-246.0 max=111.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-6.15625 max=5.65625
Linear output=1 dtype=torch.bfloat16 min=-6.125 max=5.59375
LayerNorm input=0 dtype=torch.bfloat16 min=-246.0 max=111.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.5 max=14.5625
LayerNorm output=1 dtype=torch.bfloat16 min=-33.0 max=14.375
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-246.0 max=111.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-11.9375 max=11.5625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.25 max=5.65625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.4296875 max=1.375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.9765625 max=2.21875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-6.15625 max=3.53125
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-15.3125 max=10.9375
Linear output=1 dtype=torch.bfloat16 min=-15.375 max=10.9375
LayerNorm input=0 dtype=torch.bfloat16 min=-5280.0 max=4160.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.5 max=30.125
LayerNorm output=1 dtype=torch.bfloat16 min=-33.5 max=30.5
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-5280.0 max=4160.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-7.5 max=5.34375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-15.375 max=10.9375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.94921875 max=1.515625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.71875 max=9.75
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-11.625 max=9.375
Linear input=0 dtype=torch.bfloat16 min=-11.9375 max=11.5625
Linear output=0 dtype=torch.bfloat16 min=-11.4375 max=12.8125
Linear output=1 dtype=torch.bfloat16 min=-11.5 max=12.4375
Linear input=0 dtype=torch.bfloat16 min=-11.9375 max=11.5625
Linear output=0 dtype=torch.bfloat16 min=-15.1875 max=17.875
Linear output=1 dtype=torch.bfloat16 min=-15.25 max=17.125
Linear input=0 dtype=torch.bfloat16 min=-11.9375 max=11.5625
Linear output=0 dtype=torch.bfloat16 min=-9.1875 max=9.6875
Linear output=1 dtype=torch.bfloat16 min=-8.9375 max=9.5625
Linear input=0 dtype=torch.bfloat16 min=-7.5 max=5.34375
Linear output=0 dtype=torch.bfloat16 min=-5.65625 max=7.28125
Linear output=1 dtype=torch.bfloat16 min=-6.96875 max=6.8125
Linear input=0 dtype=torch.bfloat16 min=-7.5 max=5.34375
Linear output=0 dtype=torch.bfloat16 min=-6.28125 max=6.875
Linear output=1 dtype=torch.bfloat16 min=-6.84375 max=6.90625
Linear input=0 dtype=torch.bfloat16 min=-7.5 max=5.34375
Linear output=0 dtype=torch.bfloat16 min=-14.75 max=12.5
Linear output=1 dtype=torch.bfloat16 min=-12.5 max=11.8125
Linear input=0 dtype=torch.bfloat16 min=-6.4375 max=6.59375
Linear output=0 dtype=torch.bfloat16 min=-21.25 max=6.53125
Linear output=1 dtype=torch.bfloat16 min=-21.0 max=6.34375
Dropout input=0 dtype=torch.bfloat16 min=-21.25 max=6.53125
Dropout output=0 dtype=torch.bfloat16 min=-21.25 max=6.53125
Dropout output=1 dtype=torch.bfloat16 min=-21.0 max=6.34375
Linear input=0 dtype=torch.bfloat16 min=-13.0 max=11.0
Linear output=0 dtype=torch.bfloat16 min=-67.0 max=81.5
Linear output=1 dtype=torch.bfloat16 min=-51.25 max=61.75
Attention output=0 dtype=torch.bfloat16 min=-21.25 max=6.53125
Attention output=1 dtype=torch.bfloat16 min=-67.0 max=81.5
LayerNorm input=0 dtype=torch.bfloat16 min=-340.0 max=99.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=11.4375
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=11.1875
Linear input=0 dtype=torch.bfloat16 min=-2.609375 max=2.84375
Linear output=0 dtype=torch.bfloat16 min=-4.5625 max=3.203125
Linear output=1 dtype=torch.bfloat16 min=-4.46875 max=3.0
GELU input=0 dtype=torch.bfloat16 min=-2.609375 max=2.84375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.203125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.0
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.203125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.203125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.0
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.203125
Linear output=0 dtype=torch.bfloat16 min=-22.5 max=7.40625
Linear output=1 dtype=torch.bfloat16 min=-22.0 max=7.03125
FeedForward input=0 dtype=torch.bfloat16 min=-2.609375 max=2.84375
FeedForward output=0 dtype=torch.bfloat16 min=-22.5 max=7.40625
FeedForward output=1 dtype=torch.bfloat16 min=-22.0 max=7.03125
LayerNorm input=0 dtype=torch.bfloat16 min=-5184.0 max=4192.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.25 max=30.625
LayerNorm output=1 dtype=torch.bfloat16 min=-33.5 max=30.875
Linear input=0 dtype=torch.bfloat16 min=-13.5 max=12.5
Linear output=0 dtype=torch.bfloat16 min=-62.0 max=105.0
Linear output=1 dtype=torch.bfloat16 min=-62.75 max=125.5
GELU input=0 dtype=torch.bfloat16 min=-13.5 max=12.5
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=105.0
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=125.5
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=125.5
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=105.0
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=125.5
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=125.5
Linear output=0 dtype=torch.bfloat16 min=-223.0 max=500.0
Linear output=1 dtype=torch.bfloat16 min=-223.0 max=612.0
FeedForward input=0 dtype=torch.bfloat16 min=-13.5 max=12.5
FeedForward output=0 dtype=torch.bfloat16 min=-223.0 max=500.0
FeedForward output=1 dtype=torch.bfloat16 min=-223.0 max=612.0
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-246.0 max=111.0
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-5280.0 max=4160.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-21.25 max=4.71875
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-7616.0 max=6784.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-224.0 max=102.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-5.375 max=5.40625
Linear output=1 dtype=torch.bfloat16 min=-5.375 max=5.40625
LayerNorm input=0 dtype=torch.bfloat16 min=-224.0 max=102.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=15.5
LayerNorm output=1 dtype=torch.bfloat16 min=-33.0 max=15.0625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-224.0 max=102.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-12.8125 max=12.75
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.15625 max=5.40625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.9609375 max=1.25
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.0625 max=1.4296875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-5.375 max=3.265625
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-9.4375 max=13.3125
Linear output=1 dtype=torch.bfloat16 min=-9.4375 max=13.3125
LayerNorm input=0 dtype=torch.bfloat16 min=-7616.0 max=6784.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=34.25
LayerNorm output=1 dtype=torch.bfloat16 min=-33.0 max=34.25
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-7616.0 max=6784.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-6.53125 max=5.90625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-9.4375 max=13.3125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.51171875 max=0.86328125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.765625 max=4.53125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-2.953125 max=4.71875
Linear input=0 dtype=torch.bfloat16 min=-12.8125 max=12.75
Linear output=0 dtype=torch.bfloat16 min=-10.875 max=9.625
Linear output=1 dtype=torch.bfloat16 min=-10.9375 max=9.625
Linear input=0 dtype=torch.bfloat16 min=-12.8125 max=12.75
Linear output=0 dtype=torch.bfloat16 min=-14.6875 max=14.8125
Linear output=1 dtype=torch.bfloat16 min=-15.0 max=14.9375
Linear input=0 dtype=torch.bfloat16 min=-12.8125 max=12.75
Linear output=0 dtype=torch.bfloat16 min=-7.03125 max=7.84375
Linear output=1 dtype=torch.bfloat16 min=-6.53125 max=6.75
Linear input=0 dtype=torch.bfloat16 min=-6.53125 max=5.90625
Linear output=0 dtype=torch.bfloat16 min=-6.90625 max=7.34375
Linear output=1 dtype=torch.bfloat16 min=-6.84375 max=7.25
Linear input=0 dtype=torch.bfloat16 min=-6.53125 max=5.90625
Linear output=0 dtype=torch.bfloat16 min=-5.5625 max=5.5625
Linear output=1 dtype=torch.bfloat16 min=-6.59375 max=6.0625
Linear input=0 dtype=torch.bfloat16 min=-6.53125 max=5.90625
Linear output=0 dtype=torch.bfloat16 min=-11.5 max=9.0
Linear output=1 dtype=torch.bfloat16 min=-11.25 max=9.1875
Linear input=0 dtype=torch.bfloat16 min=-3.625 max=4.3125
Linear output=0 dtype=torch.bfloat16 min=-22.5 max=6.09375
Linear output=1 dtype=torch.bfloat16 min=-21.25 max=6.40625
Dropout input=0 dtype=torch.bfloat16 min=-22.5 max=6.40625
Dropout output=0 dtype=torch.bfloat16 min=-22.5 max=6.09375
Dropout output=1 dtype=torch.bfloat16 min=-21.25 max=6.40625
Linear input=0 dtype=torch.bfloat16 min=-4.53125 max=5.96875
Linear output=0 dtype=torch.bfloat16 min=-8.9375 max=8.75
Linear output=1 dtype=torch.bfloat16 min=-22.375 max=9.8125
Attention output=0 dtype=torch.bfloat16 min=-22.5 max=6.40625
Attention output=1 dtype=torch.bfloat16 min=-22.375 max=9.8125
LayerNorm input=0 dtype=torch.bfloat16 min=-328.0 max=100.5
LayerNorm output=0 dtype=torch.bfloat16 min=-36.0 max=12.375
LayerNorm output=1 dtype=torch.bfloat16 min=-36.0 max=12.1875
Linear input=0 dtype=torch.bfloat16 min=-8.875 max=2.421875
Linear output=0 dtype=torch.bfloat16 min=-4.1875 max=3.125
Linear output=1 dtype=torch.bfloat16 min=-4.09375 max=3.0625
GELU input=0 dtype=torch.bfloat16 min=-8.875 max=2.421875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.0625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.0625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.125
Linear output=0 dtype=torch.bfloat16 min=-36.5 max=4.65625
Linear output=1 dtype=torch.bfloat16 min=-36.5 max=4.5
FeedForward input=0 dtype=torch.bfloat16 min=-8.875 max=2.421875
FeedForward output=0 dtype=torch.bfloat16 min=-36.5 max=4.65625
FeedForward output=1 dtype=torch.bfloat16 min=-36.5 max=4.5
LayerNorm input=0 dtype=torch.bfloat16 min=-7904.0 max=6720.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=34.0
LayerNorm output=1 dtype=torch.bfloat16 min=-33.0 max=34.25
Linear input=0 dtype=torch.bfloat16 min=-38.25 max=44.25
Linear output=0 dtype=torch.bfloat16 min=-21.0 max=12.0
Linear output=1 dtype=torch.bfloat16 min=-16.875 max=12.0625
GELU input=0 dtype=torch.bfloat16 min=-38.25 max=44.25
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=12.0
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=12.0625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=12.0625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=12.0
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=12.0625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=12.0625
Linear output=0 dtype=torch.bfloat16 min=-484.0 max=1216.0
Linear output=1 dtype=torch.bfloat16 min=-490.0 max=1240.0
FeedForward input=0 dtype=torch.bfloat16 min=-38.25 max=44.25
FeedForward output=0 dtype=torch.bfloat16 min=-484.0 max=1216.0
FeedForward output=1 dtype=torch.bfloat16 min=-490.0 max=1240.0
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-224.0 max=102.0
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-7616.0 max=6784.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-21.25 max=4.71875
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-9728.0 max=12544.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-172.0 max=100.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-5.28125 max=5.5625
Linear output=1 dtype=torch.bfloat16 min=-5.28125 max=5.5625
LayerNorm input=0 dtype=torch.bfloat16 min=-172.0 max=100.0
LayerNorm output=0 dtype=torch.bfloat16 min=-29.625 max=17.625
LayerNorm output=1 dtype=torch.bfloat16 min=-29.0 max=16.875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-172.0 max=100.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-14.0 max=13.5625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-5.28125 max=2.109375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.9921875 max=1.1796875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.765625 max=1.96875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-4.15625 max=5.5625
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-14.0 max=16.875
Linear output=1 dtype=torch.bfloat16 min=-13.9375 max=17.0
LayerNorm input=0 dtype=torch.bfloat16 min=-9728.0 max=12544.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.75 max=36.75
LayerNorm output=1 dtype=torch.bfloat16 min=-34.0 max=36.75
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-9728.0 max=12544.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-4.5 max=4.96875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-14.0 max=13.375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.87109375 max=0.5703125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.828125 max=9.4375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-11.8125 max=17.0
Linear input=0 dtype=torch.bfloat16 min=-14.0 max=13.5625
Linear output=0 dtype=torch.bfloat16 min=-18.125 max=21.75
Linear output=1 dtype=torch.bfloat16 min=-17.75 max=21.25
Linear input=0 dtype=torch.bfloat16 min=-14.0 max=13.5625
Linear output=0 dtype=torch.bfloat16 min=-27.375 max=23.0
Linear output=1 dtype=torch.bfloat16 min=-26.875 max=23.125
Linear input=0 dtype=torch.bfloat16 min=-14.0 max=13.5625
Linear output=0 dtype=torch.bfloat16 min=-9.8125 max=9.4375
Linear output=1 dtype=torch.bfloat16 min=-9.125 max=9.1875
Linear input=0 dtype=torch.bfloat16 min=-4.5 max=4.96875
Linear output=0 dtype=torch.bfloat16 min=-6.96875 max=6.3125
Linear output=1 dtype=torch.bfloat16 min=-7.15625 max=6.5625
Linear input=0 dtype=torch.bfloat16 min=-4.5 max=4.96875
Linear output=0 dtype=torch.bfloat16 min=-6.125 max=5.59375
Linear output=1 dtype=torch.bfloat16 min=-5.5 max=5.3125
Linear input=0 dtype=torch.bfloat16 min=-4.5 max=4.96875
Linear output=0 dtype=torch.bfloat16 min=-6.625 max=6.1875
Linear output=1 dtype=torch.bfloat16 min=-7.09375 max=6.375
Linear input=0 dtype=torch.bfloat16 min=-5.78125 max=5.46875
Linear output=0 dtype=torch.bfloat16 min=-9.0625 max=49.25
Linear output=1 dtype=torch.bfloat16 min=-9.0625 max=50.0
Dropout input=0 dtype=torch.bfloat16 min=-9.0625 max=50.0
Dropout output=0 dtype=torch.bfloat16 min=-9.0625 max=49.25
Dropout output=1 dtype=torch.bfloat16 min=-9.0625 max=50.0
Linear input=0 dtype=torch.bfloat16 min=-5.46875 max=5.0
Linear output=0 dtype=torch.bfloat16 min=-21.875 max=37.25
Linear output=1 dtype=torch.bfloat16 min=-22.0 max=36.75
Attention output=0 dtype=torch.bfloat16 min=-9.0625 max=50.0
Attention output=1 dtype=torch.bfloat16 min=-22.0 max=37.25
LayerNorm input=0 dtype=torch.bfloat16 min=-396.0 max=103.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.0 max=10.4375
LayerNorm output=1 dtype=torch.bfloat16 min=-36.0 max=9.875
Linear input=0 dtype=torch.bfloat16 min=-3.453125 max=2.390625
Linear output=0 dtype=torch.bfloat16 min=-3.953125 max=3.265625
Linear output=1 dtype=torch.bfloat16 min=-3.890625 max=3.25
GELU input=0 dtype=torch.bfloat16 min=-3.453125 max=2.390625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.265625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.25
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.265625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.265625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.25
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.265625
Linear output=0 dtype=torch.bfloat16 min=-5.84375 max=44.75
Linear output=1 dtype=torch.bfloat16 min=-6.0625 max=46.0
FeedForward input=0 dtype=torch.bfloat16 min=-3.453125 max=2.390625
FeedForward output=0 dtype=torch.bfloat16 min=-5.84375 max=44.75
FeedForward output=1 dtype=torch.bfloat16 min=-6.0625 max=46.0
LayerNorm input=0 dtype=torch.bfloat16 min=-10240.0 max=12160.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.5 max=36.25
LayerNorm output=1 dtype=torch.bfloat16 min=-34.25 max=36.25
Linear input=0 dtype=torch.bfloat16 min=-9.5 max=6.40625
Linear output=0 dtype=torch.bfloat16 min=-18.125 max=13.8125
Linear output=1 dtype=torch.bfloat16 min=-18.125 max=14.5
GELU input=0 dtype=torch.bfloat16 min=-9.5 max=6.40625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=13.8125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=14.5
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=14.5
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=13.8125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=14.5
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=14.5
Linear output=0 dtype=torch.bfloat16 min=-44.0 max=40.25
Linear output=1 dtype=torch.bfloat16 min=-46.0 max=41.75
FeedForward input=0 dtype=torch.bfloat16 min=-9.5 max=6.40625
FeedForward output=0 dtype=torch.bfloat16 min=-44.0 max=40.25
FeedForward output=1 dtype=torch.bfloat16 min=-46.0 max=41.75
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-172.0 max=100.0
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-9728.0 max=12544.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-21.25 max=4.71875
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-10560.0 max=12416.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-160.0 max=104.5
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-4.59375 max=9.4375
Linear output=1 dtype=torch.bfloat16 min=-4.5625 max=9.4375
LayerNorm input=0 dtype=torch.bfloat16 min=-160.0 max=104.5
LayerNorm output=0 dtype=torch.bfloat16 min=-27.25 max=17.5
LayerNorm output=1 dtype=torch.bfloat16 min=-27.125 max=16.875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-160.0 max=104.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-18.0 max=17.875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.28125 max=5.53125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.7578125 max=1.015625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.8984375 max=2.15625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-4.59375 max=9.4375
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-9.375 max=9.875
Linear output=1 dtype=torch.bfloat16 min=-9.375 max=9.875
LayerNorm input=0 dtype=torch.bfloat16 min=-10560.0 max=12416.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=36.5
LayerNorm output=1 dtype=torch.bfloat16 min=-34.0 max=36.5
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-10560.0 max=12416.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-4.5625 max=4.40625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-9.375 max=9.875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.625 max=1.0390625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.359375 max=9.25
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-7.5625 max=7.84375
Linear input=0 dtype=torch.bfloat16 min=-18.0 max=17.875
Linear output=0 dtype=torch.bfloat16 min=-20.0 max=16.375
Linear output=1 dtype=torch.bfloat16 min=-19.625 max=15.4375
Linear input=0 dtype=torch.bfloat16 min=-18.0 max=17.875
Linear output=0 dtype=torch.bfloat16 min=-40.5 max=47.0
Linear output=1 dtype=torch.bfloat16 min=-40.0 max=43.75
Linear input=0 dtype=torch.bfloat16 min=-18.0 max=17.875
Linear output=0 dtype=torch.bfloat16 min=-12.0 max=11.0
Linear output=1 dtype=torch.bfloat16 min=-11.9375 max=10.375
Linear input=0 dtype=torch.bfloat16 min=-4.5625 max=4.40625
Linear output=0 dtype=torch.bfloat16 min=-7.21875 max=7.09375
Linear output=1 dtype=torch.bfloat16 min=-7.09375 max=7.21875
Linear input=0 dtype=torch.bfloat16 min=-4.5625 max=4.40625
Linear output=0 dtype=torch.bfloat16 min=-5.375 max=7.4375
Linear output=1 dtype=torch.bfloat16 min=-5.25 max=6.96875
Linear input=0 dtype=torch.bfloat16 min=-4.5625 max=4.40625
Linear output=0 dtype=torch.bfloat16 min=-7.53125 max=7.46875
Linear output=1 dtype=torch.bfloat16 min=-7.90625 max=7.0
Linear input=0 dtype=torch.bfloat16 min=-7.5 max=6.59375
Linear output=0 dtype=torch.bfloat16 min=-33.0 max=13.1875
Linear output=1 dtype=torch.bfloat16 min=-34.0 max=12.875
Dropout input=0 dtype=torch.bfloat16 min=-34.0 max=13.1875
Dropout output=0 dtype=torch.bfloat16 min=-33.0 max=13.1875
Dropout output=1 dtype=torch.bfloat16 min=-34.0 max=12.875
Linear input=0 dtype=torch.bfloat16 min=-5.6875 max=4.9375
Linear output=0 dtype=torch.bfloat16 min=-32.75 max=9.75
Linear output=1 dtype=torch.bfloat16 min=-31.625 max=9.9375
Attention output=0 dtype=torch.bfloat16 min=-34.0 max=13.1875
Attention output=1 dtype=torch.bfloat16 min=-32.75 max=9.9375
LayerNorm input=0 dtype=torch.bfloat16 min=-326.0 max=104.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=11.125
LayerNorm output=1 dtype=torch.bfloat16 min=-32.75 max=10.8125
Linear input=0 dtype=torch.bfloat16 min=-3.734375 max=4.0625
Linear output=0 dtype=torch.bfloat16 min=-5.59375 max=4.71875
Linear output=1 dtype=torch.bfloat16 min=-5.375 max=4.8125
GELU input=0 dtype=torch.bfloat16 min=-3.734375 max=4.0625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.71875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.8125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.8125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.71875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.8125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.8125
Linear output=0 dtype=torch.bfloat16 min=-9.75 max=10.0625
Linear output=1 dtype=torch.bfloat16 min=-9.25 max=10.25
FeedForward input=0 dtype=torch.bfloat16 min=-3.734375 max=4.0625
FeedForward output=0 dtype=torch.bfloat16 min=-9.75 max=10.0625
FeedForward output=1 dtype=torch.bfloat16 min=-9.25 max=10.25
LayerNorm input=0 dtype=torch.bfloat16 min=-10688.0 max=12288.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.5 max=36.25
LayerNorm output=1 dtype=torch.bfloat16 min=-33.5 max=36.25
Linear input=0 dtype=torch.bfloat16 min=-16.25 max=11.3125
Linear output=0 dtype=torch.bfloat16 min=-13.0625 max=10.75
Linear output=1 dtype=torch.bfloat16 min=-13.0625 max=10.75
GELU input=0 dtype=torch.bfloat16 min=-16.25 max=11.3125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=10.75
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.75
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.75
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=10.75
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.75
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.75
Linear output=0 dtype=torch.bfloat16 min=-245.0 max=516.0
Linear output=1 dtype=torch.bfloat16 min=-243.0 max=524.0
FeedForward input=0 dtype=torch.bfloat16 min=-16.25 max=11.3125
FeedForward output=0 dtype=torch.bfloat16 min=-245.0 max=516.0
FeedForward output=1 dtype=torch.bfloat16 min=-243.0 max=524.0
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-160.0 max=104.5
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-10560.0 max=12416.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-21.25 max=4.71875
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-9408.0 max=8320.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-266.0 max=101.5
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-5.34375 max=6.84375
Linear output=1 dtype=torch.bfloat16 min=-5.375 max=6.84375
LayerNorm input=0 dtype=torch.bfloat16 min=-266.0 max=101.5
LayerNorm output=0 dtype=torch.bfloat16 min=-31.5 max=12.25
LayerNorm output=1 dtype=torch.bfloat16 min=-31.0 max=12.0
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-266.0 max=101.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-14.25 max=12.0625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-5.375 max=2.625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.046875 max=1.3203125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.8125 max=1.5859375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-5.3125 max=6.84375
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-9.9375 max=12.75
Linear output=1 dtype=torch.bfloat16 min=-9.9375 max=12.6875
LayerNorm input=0 dtype=torch.bfloat16 min=-9408.0 max=8320.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.75 max=31.75
LayerNorm output=1 dtype=torch.bfloat16 min=-34.25 max=32.0
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-9408.0 max=8320.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-11.9375 max=6.96875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-9.9375 max=12.75
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.462890625 max=1.3046875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-0.72265625 max=7.0
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-4.8125 max=4.21875
Linear input=0 dtype=torch.bfloat16 min=-14.25 max=12.0625
Linear output=0 dtype=torch.bfloat16 min=-14.5 max=15.75
Linear output=1 dtype=torch.bfloat16 min=-14.1875 max=15.4375
Linear input=0 dtype=torch.bfloat16 min=-14.25 max=12.0625
Linear output=0 dtype=torch.bfloat16 min=-18.25 max=16.5
Linear output=1 dtype=torch.bfloat16 min=-18.25 max=17.0
Linear input=0 dtype=torch.bfloat16 min=-14.25 max=12.0625
Linear output=0 dtype=torch.bfloat16 min=-9.6875 max=9.6875
Linear output=1 dtype=torch.bfloat16 min=-8.6875 max=9.875
Linear input=0 dtype=torch.bfloat16 min=-11.9375 max=6.96875
Linear output=0 dtype=torch.bfloat16 min=-5.9375 max=7.21875
Linear output=1 dtype=torch.bfloat16 min=-5.90625 max=7.1875
Linear input=0 dtype=torch.bfloat16 min=-11.9375 max=6.96875
Linear output=0 dtype=torch.bfloat16 min=-5.75 max=6.15625
Linear output=1 dtype=torch.bfloat16 min=-5.84375 max=6.09375
Linear input=0 dtype=torch.bfloat16 min=-11.9375 max=6.96875
Linear output=0 dtype=torch.bfloat16 min=-13.0625 max=10.75
Linear output=1 dtype=torch.bfloat16 min=-13.125 max=10.6875
Linear input=0 dtype=torch.bfloat16 min=-6.0 max=5.96875
Linear output=0 dtype=torch.bfloat16 min=-16.625 max=34.0
Linear output=1 dtype=torch.bfloat16 min=-16.375 max=35.5
Dropout input=0 dtype=torch.bfloat16 min=-16.625 max=35.5
Dropout output=0 dtype=torch.bfloat16 min=-16.625 max=34.0
Dropout output=1 dtype=torch.bfloat16 min=-16.375 max=35.5
Linear input=0 dtype=torch.bfloat16 min=-4.78125 max=4.8125
Linear output=0 dtype=torch.bfloat16 min=-8.4375 max=11.0625
Linear output=1 dtype=torch.bfloat16 min=-8.8125 max=9.75
Attention output=0 dtype=torch.bfloat16 min=-16.625 max=35.5
Attention output=1 dtype=torch.bfloat16 min=-8.8125 max=11.0625
LayerNorm input=0 dtype=torch.bfloat16 min=-432.0 max=106.5
LayerNorm output=0 dtype=torch.bfloat16 min=-35.25 max=9.25
LayerNorm output=1 dtype=torch.bfloat16 min=-35.0 max=9.4375
Linear input=0 dtype=torch.bfloat16 min=-4.5 max=3.421875
Linear output=0 dtype=torch.bfloat16 min=-4.4375 max=4.03125
Linear output=1 dtype=torch.bfloat16 min=-4.0625 max=4.1875
GELU input=0 dtype=torch.bfloat16 min=-4.5 max=3.421875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.03125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.1875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.1875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.03125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.1875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-6.59375 max=31.0
Linear output=1 dtype=torch.bfloat16 min=-6.78125 max=32.0
FeedForward input=0 dtype=torch.bfloat16 min=-4.5 max=3.421875
FeedForward output=0 dtype=torch.bfloat16 min=-6.59375 max=31.0
FeedForward output=1 dtype=torch.bfloat16 min=-6.78125 max=32.0
LayerNorm input=0 dtype=torch.bfloat16 min=-9472.0 max=8256.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.75 max=31.375
LayerNorm output=1 dtype=torch.bfloat16 min=-34.0 max=31.75
Linear input=0 dtype=torch.bfloat16 min=-30.375 max=32.25
Linear output=0 dtype=torch.bfloat16 min=-6.875 max=8.5
Linear output=1 dtype=torch.bfloat16 min=-6.8125 max=6.5625
GELU input=0 dtype=torch.bfloat16 min=-30.375 max=32.25
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.5
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=6.5625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=8.5
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.5
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=6.5625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=8.5
Linear output=0 dtype=torch.bfloat16 min=-332.0 max=396.0
Linear output=1 dtype=torch.bfloat16 min=-300.0 max=284.0
FeedForward input=0 dtype=torch.bfloat16 min=-30.375 max=32.25
FeedForward output=0 dtype=torch.bfloat16 min=-332.0 max=396.0
FeedForward output=1 dtype=torch.bfloat16 min=-300.0 max=284.0
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-266.0 max=101.5
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-9408.0 max=8320.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-21.25 max=4.71875
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-9536.0 max=9728.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-230.0 max=106.5
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-5.875 max=8.9375
Linear output=1 dtype=torch.bfloat16 min=-5.90625 max=8.9375
LayerNorm input=0 dtype=torch.bfloat16 min=-230.0 max=106.5
LayerNorm output=0 dtype=torch.bfloat16 min=-29.5 max=13.5625
LayerNorm output=1 dtype=torch.bfloat16 min=-28.875 max=13.375
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-230.0 max=106.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-13.3125 max=13.125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.84375 max=5.8125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.484375 max=1.3203125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.78125 max=1.875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-5.90625 max=8.9375
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-15.9375 max=16.5
Linear output=1 dtype=torch.bfloat16 min=-15.9375 max=16.5
LayerNorm input=0 dtype=torch.bfloat16 min=-9536.0 max=9728.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.75 max=33.25
LayerNorm output=1 dtype=torch.bfloat16 min=-33.25 max=31.625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-9536.0 max=9728.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-4.96875 max=5.375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-15.9375 max=16.5
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.734375 max=1.15625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.0625 max=9.375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-12.125 max=10.0
Linear input=0 dtype=torch.bfloat16 min=-13.3125 max=13.125
Linear output=0 dtype=torch.bfloat16 min=-18.625 max=21.75
Linear output=1 dtype=torch.bfloat16 min=-19.0 max=20.75
Linear input=0 dtype=torch.bfloat16 min=-13.3125 max=13.125
Linear output=0 dtype=torch.bfloat16 min=-44.75 max=42.5
Linear output=1 dtype=torch.bfloat16 min=-44.25 max=42.75
Linear input=0 dtype=torch.bfloat16 min=-13.3125 max=13.125
Linear output=0 dtype=torch.bfloat16 min=-11.3125 max=13.375
Linear output=1 dtype=torch.bfloat16 min=-10.5 max=13.75
Linear input=0 dtype=torch.bfloat16 min=-4.96875 max=5.375
Linear output=0 dtype=torch.bfloat16 min=-6.1875 max=6.03125
Linear output=1 dtype=torch.bfloat16 min=-6.8125 max=6.375
Linear input=0 dtype=torch.bfloat16 min=-4.96875 max=5.375
Linear output=0 dtype=torch.bfloat16 min=-7.03125 max=5.59375
Linear output=1 dtype=torch.bfloat16 min=-7.21875 max=5.59375
Linear input=0 dtype=torch.bfloat16 min=-4.96875 max=5.375
Linear output=0 dtype=torch.bfloat16 min=-6.21875 max=5.5625
Linear output=1 dtype=torch.bfloat16 min=-6.28125 max=5.71875
Linear input=0 dtype=torch.bfloat16 min=-6.09375 max=5.78125
Linear output=0 dtype=torch.bfloat16 min=-27.375 max=10.5625
Linear output=1 dtype=torch.bfloat16 min=-30.625 max=10.5
Dropout input=0 dtype=torch.bfloat16 min=-30.625 max=10.5625
Dropout output=0 dtype=torch.bfloat16 min=-27.375 max=10.5625
Dropout output=1 dtype=torch.bfloat16 min=-30.625 max=10.5
Linear input=0 dtype=torch.bfloat16 min=-4.3125 max=6.0
Linear output=0 dtype=torch.bfloat16 min=-20.125 max=14.75
Linear output=1 dtype=torch.bfloat16 min=-19.625 max=18.0
Attention output=0 dtype=torch.bfloat16 min=-30.625 max=10.5625
Attention output=1 dtype=torch.bfloat16 min=-20.125 max=18.0
LayerNorm input=0 dtype=torch.bfloat16 min=-380.0 max=125.5
LayerNorm output=0 dtype=torch.bfloat16 min=-32.75 max=11.0625
LayerNorm output=1 dtype=torch.bfloat16 min=-32.5 max=11.25
Linear input=0 dtype=torch.bfloat16 min=-11.375 max=3.78125
Linear output=0 dtype=torch.bfloat16 min=-4.96875 max=4.5625
Linear output=1 dtype=torch.bfloat16 min=-4.6875 max=4.4375
GELU input=0 dtype=torch.bfloat16 min=-11.375 max=3.78125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.5625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.4375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.5625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.5625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.4375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.5625
Linear output=0 dtype=torch.bfloat16 min=-15.625 max=18.625
Linear output=1 dtype=torch.bfloat16 min=-15.4375 max=18.875
FeedForward input=0 dtype=torch.bfloat16 min=-11.375 max=3.78125
FeedForward output=0 dtype=torch.bfloat16 min=-15.625 max=18.625
FeedForward output=1 dtype=torch.bfloat16 min=-15.4375 max=18.875
LayerNorm input=0 dtype=torch.bfloat16 min=-9664.0 max=9600.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=32.75
LayerNorm output=1 dtype=torch.bfloat16 min=-33.25 max=31.375
Linear input=0 dtype=torch.bfloat16 min=-14.8125 max=13.625
Linear output=0 dtype=torch.bfloat16 min=-16.0 max=9.125
Linear output=1 dtype=torch.bfloat16 min=-15.0 max=12.1875
GELU input=0 dtype=torch.bfloat16 min=-14.8125 max=13.625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=12.1875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=12.1875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=12.1875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=12.1875
Linear output=0 dtype=torch.bfloat16 min=-164.0 max=170.0
Linear output=1 dtype=torch.bfloat16 min=-187.0 max=167.0
FeedForward input=0 dtype=torch.bfloat16 min=-14.8125 max=13.625
FeedForward output=0 dtype=torch.bfloat16 min=-164.0 max=170.0
FeedForward output=1 dtype=torch.bfloat16 min=-187.0 max=167.0
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-230.0 max=106.5
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-9536.0 max=9728.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-21.25 max=4.71875
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-7392.0 max=7584.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-244.0 max=119.5
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-7.03125 max=7.0625
Linear output=1 dtype=torch.bfloat16 min=-7.03125 max=7.09375
LayerNorm input=0 dtype=torch.bfloat16 min=-244.0 max=119.5
LayerNorm output=0 dtype=torch.bfloat16 min=-27.375 max=13.375
LayerNorm output=1 dtype=torch.bfloat16 min=-27.0 max=13.5625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-244.0 max=119.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-14.625 max=14.0
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-5.03125 max=3.515625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.859375 max=2.671875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.6328125 max=1.71875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-7.03125 max=7.09375
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-11.8125 max=12.125
Linear output=1 dtype=torch.bfloat16 min=-11.8125 max=12.125
LayerNorm input=0 dtype=torch.bfloat16 min=-7392.0 max=7584.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.25 max=28.625
LayerNorm output=1 dtype=torch.bfloat16 min=-33.5 max=28.375
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-7392.0 max=7584.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-9.625 max=5.3125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-11.8125 max=11.375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.46875 max=1.6953125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-4.65625 max=12.125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-1.6796875 max=4.46875
Linear input=0 dtype=torch.bfloat16 min=-14.625 max=14.0
Linear output=0 dtype=torch.bfloat16 min=-19.875 max=17.75
Linear output=1 dtype=torch.bfloat16 min=-19.875 max=16.875
Linear input=0 dtype=torch.bfloat16 min=-14.625 max=14.0
Linear output=0 dtype=torch.bfloat16 min=-33.5 max=44.25
Linear output=1 dtype=torch.bfloat16 min=-33.5 max=43.75
Linear input=0 dtype=torch.bfloat16 min=-14.625 max=14.0
Linear output=0 dtype=torch.bfloat16 min=-12.3125 max=11.75
Linear output=1 dtype=torch.bfloat16 min=-11.5625 max=10.625
Linear input=0 dtype=torch.bfloat16 min=-9.625 max=5.3125
Linear output=0 dtype=torch.bfloat16 min=-7.875 max=7.0625
Linear output=1 dtype=torch.bfloat16 min=-7.59375 max=6.6875
Linear input=0 dtype=torch.bfloat16 min=-9.625 max=5.3125
Linear output=0 dtype=torch.bfloat16 min=-10.875 max=12.4375
Linear output=1 dtype=torch.bfloat16 min=-11.0 max=12.5
Linear input=0 dtype=torch.bfloat16 min=-9.625 max=5.3125
Linear output=0 dtype=torch.bfloat16 min=-9.625 max=10.5625
Linear output=1 dtype=torch.bfloat16 min=-9.625 max=10.1875
Linear input=0 dtype=torch.bfloat16 min=-6.96875 max=8.0625
Linear output=0 dtype=torch.bfloat16 min=-16.25 max=42.5
Linear output=1 dtype=torch.bfloat16 min=-14.8125 max=42.0
Dropout input=0 dtype=torch.bfloat16 min=-16.25 max=42.5
Dropout output=0 dtype=torch.bfloat16 min=-16.25 max=42.5
Dropout output=1 dtype=torch.bfloat16 min=-14.8125 max=42.0
Linear input=0 dtype=torch.bfloat16 min=-7.625 max=8.25
Linear output=0 dtype=torch.bfloat16 min=-21.125 max=18.875
Linear output=1 dtype=torch.bfloat16 min=-15.5 max=18.5
Attention output=0 dtype=torch.bfloat16 min=-16.25 max=42.5
Attention output=1 dtype=torch.bfloat16 min=-21.125 max=18.875
LayerNorm input=0 dtype=torch.bfloat16 min=-436.0 max=158.0
LayerNorm output=0 dtype=torch.bfloat16 min=-31.375 max=12.625
LayerNorm output=1 dtype=torch.bfloat16 min=-31.5 max=12.3125
Linear input=0 dtype=torch.bfloat16 min=-6.1875 max=3.4375
Linear output=0 dtype=torch.bfloat16 min=-4.96875 max=4.25
Linear output=1 dtype=torch.bfloat16 min=-5.0625 max=4.15625
GELU input=0 dtype=torch.bfloat16 min=-6.1875 max=3.4375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.25
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.15625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.25
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.25
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.15625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.25
Linear output=0 dtype=torch.bfloat16 min=-40.5 max=11.0
Linear output=1 dtype=torch.bfloat16 min=-41.5 max=11.3125
FeedForward input=0 dtype=torch.bfloat16 min=-6.1875 max=3.4375
FeedForward output=0 dtype=torch.bfloat16 min=-40.5 max=11.0
FeedForward output=1 dtype=torch.bfloat16 min=-41.5 max=11.3125
LayerNorm input=0 dtype=torch.bfloat16 min=-7360.0 max=7584.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.25 max=28.5
LayerNorm output=1 dtype=torch.bfloat16 min=-33.25 max=28.375
Linear input=0 dtype=torch.bfloat16 min=-103.5 max=93.5
Linear output=0 dtype=torch.bfloat16 min=-42.25 max=85.5
Linear output=1 dtype=torch.bfloat16 min=-41.75 max=70.5
GELU input=0 dtype=torch.bfloat16 min=-103.5 max=93.5
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=85.5
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=70.5
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=85.5
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=85.5
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=70.5
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=85.5
Linear output=0 dtype=torch.bfloat16 min=-4192.0 max=4192.0
Linear output=1 dtype=torch.bfloat16 min=-3376.0 max=3376.0
FeedForward input=0 dtype=torch.bfloat16 min=-103.5 max=93.5
FeedForward output=0 dtype=torch.bfloat16 min=-4192.0 max=4192.0
FeedForward output=1 dtype=torch.bfloat16 min=-3376.0 max=3376.0
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-244.0 max=119.5
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-7392.0 max=7584.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-21.25 max=4.71875
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-6208.0 max=7392.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-211.0 max=162.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-10.0 max=9.5625
Linear output=1 dtype=torch.bfloat16 min=-10.0 max=9.5625
LayerNorm input=0 dtype=torch.bfloat16 min=-211.0 max=162.0
LayerNorm output=0 dtype=torch.bfloat16 min=-21.5 max=15.8125
LayerNorm output=1 dtype=torch.bfloat16 min=-21.0 max=15.6875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-211.0 max=162.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-15.75 max=13.75
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.3125 max=5.5
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.0234375 max=2.953125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.65625 max=1.75
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-10.0 max=9.5625
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-10.5625 max=12.25
Linear output=1 dtype=torch.bfloat16 min=-10.5625 max=12.3125
LayerNorm input=0 dtype=torch.bfloat16 min=-6208.0 max=7392.0
LayerNorm output=0 dtype=torch.bfloat16 min=-31.375 max=28.25
LayerNorm output=1 dtype=torch.bfloat16 min=-31.625 max=28.25
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-6208.0 max=7392.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-7.65625 max=5.25
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-10.5625 max=11.625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.90625 max=2.0
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-8.5 max=12.3125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-1.1484375 max=1.140625
Linear input=0 dtype=torch.bfloat16 min=-15.75 max=13.75
Linear output=0 dtype=torch.bfloat16 min=-16.375 max=16.5
Linear output=1 dtype=torch.bfloat16 min=-16.125 max=16.0
Linear input=0 dtype=torch.bfloat16 min=-15.75 max=13.75
Linear output=0 dtype=torch.bfloat16 min=-41.75 max=26.875
Linear output=1 dtype=torch.bfloat16 min=-41.0 max=27.625
Linear input=0 dtype=torch.bfloat16 min=-15.75 max=13.75
Linear output=0 dtype=torch.bfloat16 min=-11.625 max=11.1875
Linear output=1 dtype=torch.bfloat16 min=-11.4375 max=10.6875
Linear input=0 dtype=torch.bfloat16 min=-7.65625 max=5.25
Linear output=0 dtype=torch.bfloat16 min=-7.25 max=7.46875
Linear output=1 dtype=torch.bfloat16 min=-8.6875 max=7.09375
Linear input=0 dtype=torch.bfloat16 min=-7.65625 max=5.25
Linear output=0 dtype=torch.bfloat16 min=-10.4375 max=14.1875
Linear output=1 dtype=torch.bfloat16 min=-10.5625 max=14.0625
Linear input=0 dtype=torch.bfloat16 min=-7.65625 max=5.25
Linear output=0 dtype=torch.bfloat16 min=-6.9375 max=7.75
Linear output=1 dtype=torch.bfloat16 min=-7.65625 max=7.46875
Linear input=0 dtype=torch.bfloat16 min=-6.78125 max=7.21875
Linear output=0 dtype=torch.bfloat16 min=-45.25 max=16.375
Linear output=1 dtype=torch.bfloat16 min=-50.5 max=15.75
Dropout input=0 dtype=torch.bfloat16 min=-50.5 max=16.375
Dropout output=0 dtype=torch.bfloat16 min=-45.25 max=16.375
Dropout output=1 dtype=torch.bfloat16 min=-50.5 max=15.75
Linear input=0 dtype=torch.bfloat16 min=-6.28125 max=5.84375
Linear output=0 dtype=torch.bfloat16 min=-15.3125 max=29.125
Linear output=1 dtype=torch.bfloat16 min=-16.75 max=26.75
Attention output=0 dtype=torch.bfloat16 min=-50.5 max=16.375
Attention output=1 dtype=torch.bfloat16 min=-16.75 max=29.125
LayerNorm input=0 dtype=torch.bfloat16 min=-416.0 max=168.0
LayerNorm output=0 dtype=torch.bfloat16 min=-25.0 max=11.125
LayerNorm output=1 dtype=torch.bfloat16 min=-26.125 max=10.75
Linear input=0 dtype=torch.bfloat16 min=-5.6875 max=3.21875
Linear output=0 dtype=torch.bfloat16 min=-6.53125 max=6.875
Linear output=1 dtype=torch.bfloat16 min=-6.0 max=6.0625
GELU input=0 dtype=torch.bfloat16 min=-5.6875 max=3.21875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=6.875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=6.0625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=6.875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=6.0625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.875
Linear output=0 dtype=torch.bfloat16 min=-15.1875 max=76.0
Linear output=1 dtype=torch.bfloat16 min=-14.625 max=77.0
FeedForward input=0 dtype=torch.bfloat16 min=-5.6875 max=3.21875
FeedForward output=0 dtype=torch.bfloat16 min=-15.1875 max=76.0
FeedForward output=1 dtype=torch.bfloat16 min=-14.625 max=77.0
LayerNorm input=0 dtype=torch.bfloat16 min=-6240.0 max=7392.0
LayerNorm output=0 dtype=torch.bfloat16 min=-31.375 max=28.25
LayerNorm output=1 dtype=torch.bfloat16 min=-31.625 max=28.125
Linear input=0 dtype=torch.bfloat16 min=-212.0 max=165.0
Linear output=0 dtype=torch.bfloat16 min=-58.0 max=127.5
Linear output=1 dtype=torch.bfloat16 min=-66.5 max=128.0
GELU input=0 dtype=torch.bfloat16 min=-212.0 max=165.0
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=127.5
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=128.0
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=128.0
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=127.5
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=128.0
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=128.0
Linear output=0 dtype=torch.bfloat16 min=-7840.0 max=5408.0
Linear output=1 dtype=torch.bfloat16 min=-6688.0 max=5088.0
FeedForward input=0 dtype=torch.bfloat16 min=-212.0 max=165.0
FeedForward output=0 dtype=torch.bfloat16 min=-7840.0 max=5408.0
FeedForward output=1 dtype=torch.bfloat16 min=-6688.0 max=5088.0
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-211.0 max=162.0
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-6208.0 max=7392.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-21.25 max=4.71875
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-6016.0 max=7264.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-195.0 max=166.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-10.75 max=10.625
Linear output=1 dtype=torch.bfloat16 min=-10.75 max=10.625
LayerNorm input=0 dtype=torch.bfloat16 min=-195.0 max=166.0
LayerNorm output=0 dtype=torch.bfloat16 min=-12.875 max=12.5
LayerNorm output=1 dtype=torch.bfloat16 min=-14.625 max=12.5
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-195.0 max=166.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-18.75 max=14.625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-5.3125 max=4.9375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.21875 max=2.703125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.6875 max=1.6328125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-10.75 max=10.625
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-12.5 max=16.5
Linear output=1 dtype=torch.bfloat16 min=-12.5625 max=16.5
LayerNorm input=0 dtype=torch.bfloat16 min=-6016.0 max=7264.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.25 max=32.75
LayerNorm output=1 dtype=torch.bfloat16 min=-32.5 max=32.5
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-6016.0 max=7264.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-9.0625 max=5.28125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-12.5625 max=7.78125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-3.515625 max=4.78125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-9.6875 max=16.5
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-1.5625 max=1.0078125
Linear input=0 dtype=torch.bfloat16 min=-18.75 max=14.625
Linear output=0 dtype=torch.bfloat16 min=-28.25 max=30.625
Linear output=1 dtype=torch.bfloat16 min=-28.75 max=30.625
Linear input=0 dtype=torch.bfloat16 min=-18.75 max=14.625
Linear output=0 dtype=torch.bfloat16 min=-31.0 max=33.25
Linear output=1 dtype=torch.bfloat16 min=-31.875 max=34.25
Linear input=0 dtype=torch.bfloat16 min=-18.75 max=14.625
Linear output=0 dtype=torch.bfloat16 min=-12.0625 max=13.8125
Linear output=1 dtype=torch.bfloat16 min=-13.125 max=13.875
Linear input=0 dtype=torch.bfloat16 min=-9.0625 max=5.28125
Linear output=0 dtype=torch.bfloat16 min=-11.375 max=8.25
Linear output=1 dtype=torch.bfloat16 min=-11.1875 max=8.125
Linear input=0 dtype=torch.bfloat16 min=-9.0625 max=5.28125
Linear output=0 dtype=torch.bfloat16 min=-15.9375 max=15.4375
Linear output=1 dtype=torch.bfloat16 min=-15.9375 max=15.6875
Linear input=0 dtype=torch.bfloat16 min=-9.0625 max=5.28125
Linear output=0 dtype=torch.bfloat16 min=-8.1875 max=7.25
Linear output=1 dtype=torch.bfloat16 min=-8.4375 max=7.65625
Linear input=0 dtype=torch.bfloat16 min=-8.125 max=8.3125
Linear output=0 dtype=torch.bfloat16 min=-17.25 max=22.625
Linear output=1 dtype=torch.bfloat16 min=-15.75 max=20.0
Dropout input=0 dtype=torch.bfloat16 min=-17.25 max=22.625
Dropout output=0 dtype=torch.bfloat16 min=-17.25 max=22.625
Dropout output=1 dtype=torch.bfloat16 min=-15.75 max=20.0
Linear input=0 dtype=torch.bfloat16 min=-6.46875 max=6.75
Linear output=0 dtype=torch.bfloat16 min=-24.25 max=24.75
Linear output=1 dtype=torch.bfloat16 min=-19.75 max=21.75
Attention output=0 dtype=torch.bfloat16 min=-17.25 max=22.625
Attention output=1 dtype=torch.bfloat16 min=-24.25 max=24.75
LayerNorm input=0 dtype=torch.bfloat16 min=-189.0 max=165.0
LayerNorm output=0 dtype=torch.bfloat16 min=-10.625 max=10.5
LayerNorm output=1 dtype=torch.bfloat16 min=-12.125 max=10.5
Linear input=0 dtype=torch.bfloat16 min=-5.46875 max=5.0625
Linear output=0 dtype=torch.bfloat16 min=-7.40625 max=5.71875
Linear output=1 dtype=torch.bfloat16 min=-6.78125 max=5.3125
GELU input=0 dtype=torch.bfloat16 min=-5.46875 max=5.0625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.71875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=5.3125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.71875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.71875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=5.3125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.71875
Linear output=0 dtype=torch.bfloat16 min=-51.75 max=15.3125
Linear output=1 dtype=torch.bfloat16 min=-51.5 max=14.8125
FeedForward input=0 dtype=torch.bfloat16 min=-5.46875 max=5.0625
FeedForward output=0 dtype=torch.bfloat16 min=-51.75 max=15.3125
FeedForward output=1 dtype=torch.bfloat16 min=-51.5 max=14.8125
LayerNorm input=0 dtype=torch.bfloat16 min=-6016.0 max=7264.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.25 max=32.5
LayerNorm output=1 dtype=torch.bfloat16 min=-32.25 max=32.5
Linear input=0 dtype=torch.bfloat16 min=-284.0 max=250.0
Linear output=0 dtype=torch.bfloat16 min=-69.5 max=286.0
Linear output=1 dtype=torch.bfloat16 min=-79.5 max=298.0
GELU input=0 dtype=torch.bfloat16 min=-284.0 max=250.0
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=286.0
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=298.0
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=298.0
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=286.0
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=298.0
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=298.0
Linear output=0 dtype=torch.bfloat16 min=-19712.0 max=31616.0
Linear output=1 dtype=torch.bfloat16 min=-20352.0 max=33280.0
FeedForward input=0 dtype=torch.bfloat16 min=-284.0 max=250.0
FeedForward output=0 dtype=torch.bfloat16 min=-19712.0 max=31616.0
FeedForward output=1 dtype=torch.bfloat16 min=-20352.0 max=33280.0
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-195.0 max=166.0
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-6016.0 max=7264.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-21.25 max=4.71875
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-5568.0 max=1728.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-190.0 max=292.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-10.4375 max=9.625
Linear output=1 dtype=torch.bfloat16 min=-10.375 max=9.625
LayerNorm input=0 dtype=torch.bfloat16 min=-190.0 max=292.0
LayerNorm output=0 dtype=torch.bfloat16 min=-10.5625 max=16.125
LayerNorm output=1 dtype=torch.bfloat16 min=-12.125 max=15.75
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-190.0 max=292.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-11.375 max=9.25
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-5.375 max=5.6875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.015625 max=1.421875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.4375 max=1.078125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-10.4375 max=9.625
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-2.140625 max=5.03125
Linear output=1 dtype=torch.bfloat16 min=-2.140625 max=5.0625
LayerNorm input=0 dtype=torch.bfloat16 min=-5568.0 max=1728.0
LayerNorm output=0 dtype=torch.bfloat16 min=-29.625 max=12.375
LayerNorm output=1 dtype=torch.bfloat16 min=-29.875 max=12.25
AdaLayerNormContinuous input=0 dtype=torch.bfloat16 min=-5568.0 max=1728.0
AdaLayerNormContinuous input=1 dtype=torch.bfloat16 min=-21.25 max=4.71875
AdaLayerNormContinuous output=0 dtype=torch.bfloat16 min=-14.875 max=5.5625
AdaLayerNormContinuous output=1 dtype=torch.bfloat16 min=-15.125 max=4.8125
Linear input=0 dtype=torch.bfloat16 min=-11.375 max=9.25
Linear output=0 dtype=torch.bfloat16 min=-11.0 max=15.0625
Linear output=1 dtype=torch.bfloat16 min=-11.9375 max=15.125
Linear input=0 dtype=torch.bfloat16 min=-11.375 max=9.25
Linear output=0 dtype=torch.bfloat16 min=-12.8125 max=11.5
Linear output=1 dtype=torch.bfloat16 min=-13.0 max=12.5625
Linear input=0 dtype=torch.bfloat16 min=-11.375 max=9.25
Linear output=0 dtype=torch.bfloat16 min=-9.1875 max=9.375
Linear output=1 dtype=torch.bfloat16 min=-9.75 max=9.4375
Linear input=0 dtype=torch.bfloat16 min=-15.125 max=5.5625
Linear output=0 dtype=torch.bfloat16 min=-1.3203125 max=1.34375
Linear output=1 dtype=torch.bfloat16 min=-1.3984375 max=1.265625
Linear input=0 dtype=torch.bfloat16 min=-15.125 max=5.5625
Linear output=0 dtype=torch.bfloat16 min=-8.75 max=7.59375
Linear output=1 dtype=torch.bfloat16 min=-9.125 max=8.125
Linear input=0 dtype=torch.bfloat16 min=-15.125 max=5.5625
Linear output=0 dtype=torch.bfloat16 min=-6.84375 max=5.8125
Linear output=1 dtype=torch.bfloat16 min=-6.5625 max=6.9375
Linear input=0 dtype=torch.bfloat16 min=-5.53125 max=5.71875
Linear output=0 dtype=torch.bfloat16 min=-16.0 max=20.125
Linear output=1 dtype=torch.bfloat16 min=-14.625 max=16.625
Dropout input=0 dtype=torch.bfloat16 min=-16.0 max=20.125
Dropout output=0 dtype=torch.bfloat16 min=-16.0 max=20.125
Dropout output=1 dtype=torch.bfloat16 min=-14.625 max=16.625
Attention output=0 dtype=torch.bfloat16 min=-16.0 max=20.125
Attention output=1 dtype=torch.bfloat16 min=-4.6875 max=4.8125
LayerNorm input=0 dtype=torch.bfloat16 min=-174.0 max=292.0
LayerNorm output=0 dtype=torch.bfloat16 min=-9.0625 max=15.6875
LayerNorm output=1 dtype=torch.bfloat16 min=-10.375 max=15.4375
Linear input=0 dtype=torch.bfloat16 min=-4.34375 max=3.546875
Linear output=0 dtype=torch.bfloat16 min=-6.1875 max=4.90625
Linear output=1 dtype=torch.bfloat16 min=-5.28125 max=5.21875
GELU input=0 dtype=torch.bfloat16 min=-4.34375 max=3.546875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.90625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=5.21875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.21875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.90625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=5.21875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.21875
Linear output=0 dtype=torch.bfloat16 min=-28.0 max=37.5
Linear output=1 dtype=torch.bfloat16 min=-28.625 max=39.25
FeedForward input=0 dtype=torch.bfloat16 min=-4.34375 max=3.546875
FeedForward output=0 dtype=torch.bfloat16 min=-28.0 max=37.5
FeedForward output=1 dtype=torch.bfloat16 min=-28.625 max=39.25
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-190.0 max=292.0
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-5568.0 max=1728.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-21.25 max=4.71875
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-172.0 max=214.0
SiLU input=0 dtype=torch.bfloat16 min=-21.25 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=4.6875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Linear output=0 dtype=torch.bfloat16 min=-1.046875 max=5.625
Linear output=1 dtype=torch.bfloat16 min=-1.0546875 max=5.65625
LayerNorm input=0 dtype=torch.bfloat16 min=-172.0 max=214.0
LayerNorm output=0 dtype=torch.bfloat16 min=-9.1875 max=12.4375
LayerNorm output=1 dtype=torch.bfloat16 min=-10.625 max=12.0
AdaLayerNormContinuous input=0 dtype=torch.bfloat16 min=-172.0 max=214.0
AdaLayerNormContinuous input=1 dtype=torch.bfloat16 min=-21.25 max=4.71875
AdaLayerNormContinuous output=0 dtype=torch.bfloat16 min=-7.90625 max=8.5625
AdaLayerNormContinuous output=1 dtype=torch.bfloat16 min=-7.8125 max=8.5625
Linear input=0 dtype=torch.bfloat16 min=-7.90625 max=8.5625
Linear output=0 dtype=torch.bfloat16 min=-6.34375 max=6.25
Linear output=1 dtype=torch.bfloat16 min=-6.40625 max=6.1875
SD3Transformer2DModel output=0 dtype=torch.bfloat16 min=-6.40625 max=6.25
Conv2d input=0 dtype=torch.bfloat16 min=-5.03125 max=4.9375
Conv2d output=0 dtype=torch.bfloat16 min=-19.125 max=7.9375
Conv2d output=1 dtype=torch.bfloat16 min=-19.125 max=7.9375
PatchEmbed input=0 dtype=torch.bfloat16 min=-5.03125 max=4.9375
PatchEmbed output=0 dtype=torch.bfloat16 min=-20.125 max=8.9375
PatchEmbed output=1 dtype=torch.bfloat16 min=-20.125 max=8.9375
Timesteps input=0 dtype=torch.bfloat16 min=8.9375 max=8.9375
Timesteps output=0 dtype=torch.float32 min=-0.9993733763694763 max=0.9999995231628418
Timesteps output=1 dtype=torch.float32 min=-0.9993733763694763 max=0.9999995231628418
Linear input=0 dtype=torch.bfloat16 min=-1.0 max=1.0
Linear output=0 dtype=torch.bfloat16 min=-8.25 max=2.875
Linear output=1 dtype=torch.bfloat16 min=-8.25 max=2.875
SiLU input=0 dtype=torch.bfloat16 min=-8.25 max=2.875
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=2.71875
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=2.71875
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=2.71875
Linear output=0 dtype=torch.bfloat16 min=-10.625 max=6.3125
Linear output=1 dtype=torch.bfloat16 min=-10.625 max=6.3125
TimestepEmbedding input=0 dtype=torch.bfloat16 min=-1.0 max=1.0
TimestepEmbedding output=0 dtype=torch.bfloat16 min=-10.625 max=6.3125
TimestepEmbedding output=1 dtype=torch.bfloat16 min=-10.625 max=6.3125
Linear input=0 dtype=torch.bfloat16 min=-5.34375 max=7.40625
Linear output=0 dtype=torch.bfloat16 min=-40.5 max=15.8125
Linear output=1 dtype=torch.bfloat16 min=-33.75 max=15.9375
SiLU input=0 dtype=torch.bfloat16 min=-40.5 max=15.9375
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=15.8125
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=15.9375
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=15.9375
Linear output=0 dtype=torch.bfloat16 min=-19.375 max=1.2265625
Linear output=1 dtype=torch.bfloat16 min=-20.125 max=1.359375
PixArtAlphaTextProjection input=0 dtype=torch.bfloat16 min=-5.34375 max=7.40625
PixArtAlphaTextProjection output=0 dtype=torch.bfloat16 min=-19.375 max=1.2265625
PixArtAlphaTextProjection output=1 dtype=torch.bfloat16 min=-20.125 max=1.359375
CombinedTimestepTextProjEmbeddings input=0 dtype=torch.bfloat16 min=8.9375 max=8.9375
CombinedTimestepTextProjEmbeddings input=1 dtype=torch.bfloat16 min=-5.34375 max=7.40625
CombinedTimestepTextProjEmbeddings output=0 dtype=torch.bfloat16 min=-22.5 max=7.09375
CombinedTimestepTextProjEmbeddings output=1 dtype=torch.bfloat16 min=-23.25 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-808.0 max=856.0
Linear output=0 dtype=torch.bfloat16 min=-812.0 max=612.0
Linear output=1 dtype=torch.bfloat16 min=-812.0 max=612.0
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-3.3125 max=3.953125
Linear output=1 dtype=torch.bfloat16 min=-3.3125 max=3.9375
LayerNorm input=0 dtype=torch.bfloat16 min=-20.125 max=8.9375
LayerNorm output=0 dtype=torch.bfloat16 min=-19.75 max=8.6875
LayerNorm output=1 dtype=torch.bfloat16 min=-19.75 max=8.6875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-20.125 max=8.9375
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-9.25 max=7.28125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-0.73046875 max=2.671875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.5703125 max=0.890625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.421875 max=1.6953125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-2.40625 max=3.203125
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-7.90625 max=7.0
Linear output=1 dtype=torch.bfloat16 min=-8.0 max=7.0625
LayerNorm input=0 dtype=torch.bfloat16 min=-812.0 max=612.0
LayerNorm output=0 dtype=torch.bfloat16 min=-23.0 max=16.0
LayerNorm output=1 dtype=torch.bfloat16 min=-23.0 max=16.0
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-812.0 max=612.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-3.3125 max=7.59375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-0.8046875 max=1.21875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-8.0 max=7.0625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.25 max=0.5703125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-2.15625 max=2.921875
Linear input=0 dtype=torch.bfloat16 min=-9.25 max=7.28125
Linear output=0 dtype=torch.bfloat16 min=-17.0 max=15.75
Linear output=1 dtype=torch.bfloat16 min=-17.0 max=15.5625
Linear input=0 dtype=torch.bfloat16 min=-9.25 max=7.28125
Linear output=0 dtype=torch.bfloat16 min=-13.25 max=10.9375
Linear output=1 dtype=torch.bfloat16 min=-13.0625 max=10.75
Linear input=0 dtype=torch.bfloat16 min=-9.25 max=7.28125
Linear output=0 dtype=torch.bfloat16 min=-10.0 max=9.6875
Linear output=1 dtype=torch.bfloat16 min=-9.9375 max=9.625
Linear input=0 dtype=torch.bfloat16 min=-3.3125 max=7.59375
Linear output=0 dtype=torch.bfloat16 min=-6.25 max=6.21875
Linear output=1 dtype=torch.bfloat16 min=-6.90625 max=6.6875
Linear input=0 dtype=torch.bfloat16 min=-3.3125 max=7.59375
Linear output=0 dtype=torch.bfloat16 min=-4.78125 max=4.75
Linear output=1 dtype=torch.bfloat16 min=-5.09375 max=5.0
Linear input=0 dtype=torch.bfloat16 min=-3.3125 max=7.59375
Linear output=0 dtype=torch.bfloat16 min=-3.046875 max=3.171875
Linear output=1 dtype=torch.bfloat16 min=-3.90625 max=4.46875
Linear input=0 dtype=torch.bfloat16 min=-5.4375 max=5.375
Linear output=0 dtype=torch.bfloat16 min=-18.25 max=7.4375
Linear output=1 dtype=torch.bfloat16 min=-18.0 max=7.34375
Dropout input=0 dtype=torch.bfloat16 min=-18.25 max=7.4375
Dropout output=0 dtype=torch.bfloat16 min=-18.25 max=7.4375
Dropout output=1 dtype=torch.bfloat16 min=-18.0 max=7.34375
Linear input=0 dtype=torch.bfloat16 min=-7.6875 max=5.625
Linear output=0 dtype=torch.bfloat16 min=-7.46875 max=11.4375
Linear output=1 dtype=torch.bfloat16 min=-9.3125 max=11.625
Attention output=0 dtype=torch.bfloat16 min=-18.25 max=7.4375
Attention output=1 dtype=torch.bfloat16 min=-9.3125 max=11.625
LayerNorm input=0 dtype=torch.bfloat16 min=-54.25 max=13.375
LayerNorm output=0 dtype=torch.bfloat16 min=-34.5 max=7.75
LayerNorm output=1 dtype=torch.bfloat16 min=-34.5 max=7.75
Linear input=0 dtype=torch.bfloat16 min=-2.625 max=2.84375
Linear output=0 dtype=torch.bfloat16 min=-8.1875 max=7.46875
Linear output=1 dtype=torch.bfloat16 min=-8.125 max=7.4375
GELU input=0 dtype=torch.bfloat16 min=-2.625 max=2.84375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=7.46875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=7.4375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=7.46875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=7.46875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=7.4375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=7.46875
Linear output=0 dtype=torch.bfloat16 min=-30.0 max=40.0
Linear output=1 dtype=torch.bfloat16 min=-29.875 max=40.0
FeedForward input=0 dtype=torch.bfloat16 min=-2.625 max=2.84375
FeedForward output=0 dtype=torch.bfloat16 min=-30.0 max=40.0
FeedForward output=1 dtype=torch.bfloat16 min=-29.875 max=40.0
LayerNorm input=0 dtype=torch.bfloat16 min=-816.0 max=612.0
LayerNorm output=0 dtype=torch.bfloat16 min=-23.25 max=16.0
LayerNorm output=1 dtype=torch.bfloat16 min=-23.375 max=16.0
Linear input=0 dtype=torch.bfloat16 min=-9.0 max=8.625
Linear output=0 dtype=torch.bfloat16 min=-22.25 max=30.5
Linear output=1 dtype=torch.bfloat16 min=-19.375 max=36.0
GELU input=0 dtype=torch.bfloat16 min=-9.0 max=8.625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=30.5
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=36.0
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=36.0
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=30.5
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=36.0
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=36.0
Linear output=0 dtype=torch.bfloat16 min=-43.5 max=54.5
Linear output=1 dtype=torch.bfloat16 min=-45.75 max=51.0
FeedForward input=0 dtype=torch.bfloat16 min=-9.0 max=8.625
FeedForward output=0 dtype=torch.bfloat16 min=-43.5 max=54.5
FeedForward output=1 dtype=torch.bfloat16 min=-45.75 max=51.0
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-20.125 max=8.9375
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-812.0 max=612.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-23.25 max=7.09375
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-836.0 max=608.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-57.25 max=42.75
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-4.46875 max=3.984375
Linear output=1 dtype=torch.bfloat16 min=-4.5 max=3.984375
LayerNorm input=0 dtype=torch.bfloat16 min=-57.25 max=42.75
LayerNorm output=0 dtype=torch.bfloat16 min=-24.875 max=15.4375
LayerNorm output=1 dtype=torch.bfloat16 min=-24.875 max=15.4375
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-57.25 max=42.75
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-17.25 max=10.5
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-0.84765625 max=3.984375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.65625 max=1.3984375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.9375 max=2.921875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-0.93359375 max=3.625
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-3.109375 max=5.125
Linear output=1 dtype=torch.bfloat16 min=-3.15625 max=5.125
LayerNorm input=0 dtype=torch.bfloat16 min=-836.0 max=608.0
LayerNorm output=0 dtype=torch.bfloat16 min=-31.625 max=15.875
LayerNorm output=1 dtype=torch.bfloat16 min=-33.75 max=15.875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-836.0 max=608.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-9.0 max=6.28125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.78125 max=2.59375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.328125 max=3.203125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.421875 max=5.125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-3.15625 max=1.484375
Linear input=0 dtype=torch.bfloat16 min=-17.25 max=10.5
Linear output=0 dtype=torch.bfloat16 min=-25.875 max=24.875
Linear output=1 dtype=torch.bfloat16 min=-25.625 max=24.75
Linear input=0 dtype=torch.bfloat16 min=-17.25 max=10.5
Linear output=0 dtype=torch.bfloat16 min=-12.9375 max=14.3125
Linear output=1 dtype=torch.bfloat16 min=-12.75 max=14.0625
Linear input=0 dtype=torch.bfloat16 min=-17.25 max=10.5
Linear output=0 dtype=torch.bfloat16 min=-13.5625 max=13.8125
Linear output=1 dtype=torch.bfloat16 min=-13.4375 max=13.875
Linear input=0 dtype=torch.bfloat16 min=-9.0 max=6.28125
Linear output=0 dtype=torch.bfloat16 min=-4.625 max=4.9375
Linear output=1 dtype=torch.bfloat16 min=-4.40625 max=4.6875
Linear input=0 dtype=torch.bfloat16 min=-9.0 max=6.28125
Linear output=0 dtype=torch.bfloat16 min=-4.8125 max=4.6875
Linear output=1 dtype=torch.bfloat16 min=-4.65625 max=4.5625
Linear input=0 dtype=torch.bfloat16 min=-9.0 max=6.28125
Linear output=0 dtype=torch.bfloat16 min=-3.28125 max=3.421875
Linear output=1 dtype=torch.bfloat16 min=-2.234375 max=2.578125
Linear input=0 dtype=torch.bfloat16 min=-6.5625 max=6.53125
Linear output=0 dtype=torch.bfloat16 min=-24.75 max=13.5625
Linear output=1 dtype=torch.bfloat16 min=-24.0 max=13.5
Dropout input=0 dtype=torch.bfloat16 min=-24.75 max=13.5625
Dropout output=0 dtype=torch.bfloat16 min=-24.75 max=13.5625
Dropout output=1 dtype=torch.bfloat16 min=-24.0 max=13.5
Linear input=0 dtype=torch.bfloat16 min=-5.53125 max=4.21875
Linear output=0 dtype=torch.bfloat16 min=-14.9375 max=12.0
Linear output=1 dtype=torch.bfloat16 min=-16.25 max=12.125
Attention output=0 dtype=torch.bfloat16 min=-24.75 max=13.5625
Attention output=1 dtype=torch.bfloat16 min=-16.25 max=12.125
LayerNorm input=0 dtype=torch.bfloat16 min=-142.0 max=41.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.5 max=11.3125
LayerNorm output=1 dtype=torch.bfloat16 min=-35.25 max=11.5
Linear input=0 dtype=torch.bfloat16 min=-4.84375 max=3.515625
Linear output=0 dtype=torch.bfloat16 min=-11.125 max=6.53125
Linear output=1 dtype=torch.bfloat16 min=-11.125 max=6.46875
GELU input=0 dtype=torch.bfloat16 min=-4.84375 max=3.515625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=6.53125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=6.46875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.53125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=6.53125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=6.46875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.53125
Linear output=0 dtype=torch.bfloat16 min=-41.75 max=45.5
Linear output=1 dtype=torch.bfloat16 min=-42.5 max=46.0
FeedForward input=0 dtype=torch.bfloat16 min=-4.84375 max=3.515625
FeedForward output=0 dtype=torch.bfloat16 min=-41.75 max=45.5
FeedForward output=1 dtype=torch.bfloat16 min=-42.5 max=46.0
LayerNorm input=0 dtype=torch.bfloat16 min=-836.0 max=608.0
LayerNorm output=0 dtype=torch.bfloat16 min=-26.5 max=15.8125
LayerNorm output=1 dtype=torch.bfloat16 min=-26.875 max=15.875
Linear input=0 dtype=torch.bfloat16 min=-19.625 max=11.3125
Linear output=0 dtype=torch.bfloat16 min=-29.25 max=26.125
Linear output=1 dtype=torch.bfloat16 min=-25.0 max=24.125
GELU input=0 dtype=torch.bfloat16 min=-19.625 max=11.3125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=26.125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=24.125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=26.125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=26.125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=24.125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=26.125
Linear output=0 dtype=torch.bfloat16 min=-219.0 max=510.0
Linear output=1 dtype=torch.bfloat16 min=-218.0 max=508.0
FeedForward input=0 dtype=torch.bfloat16 min=-19.625 max=11.3125
FeedForward output=0 dtype=torch.bfloat16 min=-219.0 max=510.0
FeedForward output=1 dtype=torch.bfloat16 min=-218.0 max=508.0
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-57.25 max=42.75
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-836.0 max=608.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-23.25 max=7.09375
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-2432.0 max=628.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-117.0 max=46.5
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-5.21875 max=3.84375
Linear output=1 dtype=torch.bfloat16 min=-5.28125 max=3.84375
LayerNorm input=0 dtype=torch.bfloat16 min=-117.0 max=46.5
LayerNorm output=0 dtype=torch.bfloat16 min=-33.5 max=12.75
LayerNorm output=1 dtype=torch.bfloat16 min=-33.5 max=13.0
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-117.0 max=46.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-6.9375 max=5.09375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-3.53125 max=0.53515625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.03125 max=2.78125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-5.28125 max=3.4375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-0.62109375 max=2.28125
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-6.09375 max=3.328125
Linear output=1 dtype=torch.bfloat16 min=-6.15625 max=3.34375
LayerNorm input=0 dtype=torch.bfloat16 min=-2432.0 max=628.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.0 max=10.25
LayerNorm output=1 dtype=torch.bfloat16 min=-36.25 max=11.0
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-2432.0 max=628.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-6.625 max=5.84375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.78125 max=2.25
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-6.15625 max=3.109375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.953125 max=2.015625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-4.34375 max=3.34375
Linear input=0 dtype=torch.bfloat16 min=-6.9375 max=5.09375
Linear output=0 dtype=torch.bfloat16 min=-12.125 max=10.0625
Linear output=1 dtype=torch.bfloat16 min=-12.1875 max=10.0625
Linear input=0 dtype=torch.bfloat16 min=-6.9375 max=5.09375
Linear output=0 dtype=torch.bfloat16 min=-10.9375 max=9.875
Linear output=1 dtype=torch.bfloat16 min=-11.0 max=9.8125
Linear input=0 dtype=torch.bfloat16 min=-6.9375 max=5.09375
Linear output=0 dtype=torch.bfloat16 min=-9.0 max=10.125
Linear output=1 dtype=torch.bfloat16 min=-9.0 max=10.0
Linear input=0 dtype=torch.bfloat16 min=-6.625 max=5.84375
Linear output=0 dtype=torch.bfloat16 min=-3.703125 max=4.15625
Linear output=1 dtype=torch.bfloat16 min=-4.5625 max=4.90625
Linear input=0 dtype=torch.bfloat16 min=-6.625 max=5.84375
Linear output=0 dtype=torch.bfloat16 min=-4.375 max=5.09375
Linear output=1 dtype=torch.bfloat16 min=-4.625 max=4.6875
Linear input=0 dtype=torch.bfloat16 min=-6.625 max=5.84375
Linear output=0 dtype=torch.bfloat16 min=-5.5625 max=4.125
Linear output=1 dtype=torch.bfloat16 min=-4.78125 max=3.53125
Linear input=0 dtype=torch.bfloat16 min=-6.875 max=8.375
Linear output=0 dtype=torch.bfloat16 min=-15.4375 max=17.75
Linear output=1 dtype=torch.bfloat16 min=-15.625 max=18.25
Dropout input=0 dtype=torch.bfloat16 min=-15.625 max=18.25
Dropout output=0 dtype=torch.bfloat16 min=-15.4375 max=17.75
Dropout output=1 dtype=torch.bfloat16 min=-15.625 max=18.25
Linear input=0 dtype=torch.bfloat16 min=-5.5 max=5.96875
Linear output=0 dtype=torch.bfloat16 min=-15.3125 max=13.6875
Linear output=1 dtype=torch.bfloat16 min=-15.5 max=14.125
Attention output=0 dtype=torch.bfloat16 min=-15.625 max=18.25
Attention output=1 dtype=torch.bfloat16 min=-15.5 max=14.125
LayerNorm input=0 dtype=torch.bfloat16 min=-161.0 max=45.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=10.75
LayerNorm output=1 dtype=torch.bfloat16 min=-36.0 max=10.8125
Linear input=0 dtype=torch.bfloat16 min=-5.28125 max=4.5625
Linear output=0 dtype=torch.bfloat16 min=-8.0 max=10.8125
Linear output=1 dtype=torch.bfloat16 min=-8.0 max=10.9375
GELU input=0 dtype=torch.bfloat16 min=-5.28125 max=4.5625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=10.8125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.9375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.9375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=10.8125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.9375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.9375
Linear output=0 dtype=torch.bfloat16 min=-26.625 max=51.75
Linear output=1 dtype=torch.bfloat16 min=-26.375 max=51.75
FeedForward input=0 dtype=torch.bfloat16 min=-5.28125 max=4.5625
FeedForward output=0 dtype=torch.bfloat16 min=-26.625 max=51.75
FeedForward output=1 dtype=torch.bfloat16 min=-26.375 max=51.75
LayerNorm input=0 dtype=torch.bfloat16 min=-2448.0 max=616.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.25 max=9.625
LayerNorm output=1 dtype=torch.bfloat16 min=-35.5 max=10.75
Linear input=0 dtype=torch.bfloat16 min=-21.375 max=24.375
Linear output=0 dtype=torch.bfloat16 min=-23.375 max=17.125
Linear output=1 dtype=torch.bfloat16 min=-21.25 max=17.25
GELU input=0 dtype=torch.bfloat16 min=-21.375 max=24.375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=17.125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=17.25
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=17.25
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=17.125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=17.25
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=17.25
Linear output=0 dtype=torch.bfloat16 min=-174.0 max=253.0
Linear output=1 dtype=torch.bfloat16 min=-174.0 max=253.0
FeedForward input=0 dtype=torch.bfloat16 min=-21.375 max=24.375
FeedForward output=0 dtype=torch.bfloat16 min=-174.0 max=253.0
FeedForward output=1 dtype=torch.bfloat16 min=-174.0 max=253.0
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-117.0 max=46.5
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-2432.0 max=628.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-23.25 max=7.09375
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-3552.0 max=672.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-80.0 max=44.5
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-4.1875 max=5.59375
Linear output=1 dtype=torch.bfloat16 min=-4.21875 max=5.5625
LayerNorm input=0 dtype=torch.bfloat16 min=-80.0 max=44.5
LayerNorm output=0 dtype=torch.bfloat16 min=-29.125 max=14.75
LayerNorm output=1 dtype=torch.bfloat16 min=-29.0 max=14.625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-80.0 max=44.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-15.3125 max=11.375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-0.84375 max=4.84375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.3125 max=2.890625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-4.09375 max=2.34375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-0.8984375 max=4.78125
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-5.03125 max=3.8125
Linear output=1 dtype=torch.bfloat16 min=-5.0625 max=3.828125
LayerNorm input=0 dtype=torch.bfloat16 min=-3552.0 max=672.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.25 max=7.46875
LayerNorm output=1 dtype=torch.bfloat16 min=-36.5 max=10.25
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-3552.0 max=672.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-7.21875 max=7.53125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-3.46875 max=3.609375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-5.0625 max=2.5625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.84375 max=1.96875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-2.984375 max=3.828125
Linear input=0 dtype=torch.bfloat16 min=-15.3125 max=11.375
Linear output=0 dtype=torch.bfloat16 min=-35.25 max=28.375
Linear output=1 dtype=torch.bfloat16 min=-35.25 max=28.25
Linear input=0 dtype=torch.bfloat16 min=-15.3125 max=11.375
Linear output=0 dtype=torch.bfloat16 min=-17.625 max=16.5
Linear output=1 dtype=torch.bfloat16 min=-17.5 max=16.375
Linear input=0 dtype=torch.bfloat16 min=-15.3125 max=11.375
Linear output=0 dtype=torch.bfloat16 min=-12.375 max=13.5
Linear output=1 dtype=torch.bfloat16 min=-12.375 max=13.375
Linear input=0 dtype=torch.bfloat16 min=-7.21875 max=7.53125
Linear output=0 dtype=torch.bfloat16 min=-4.90625 max=4.375
Linear output=1 dtype=torch.bfloat16 min=-4.34375 max=4.0
Linear input=0 dtype=torch.bfloat16 min=-7.21875 max=7.53125
Linear output=0 dtype=torch.bfloat16 min=-4.5 max=5.3125
Linear output=1 dtype=torch.bfloat16 min=-4.5625 max=5.3125
Linear input=0 dtype=torch.bfloat16 min=-7.21875 max=7.53125
Linear output=0 dtype=torch.bfloat16 min=-3.8125 max=4.78125
Linear output=1 dtype=torch.bfloat16 min=-3.359375 max=4.5
Linear input=0 dtype=torch.bfloat16 min=-8.8125 max=10.1875
Linear output=0 dtype=torch.bfloat16 min=-37.5 max=14.0
Linear output=1 dtype=torch.bfloat16 min=-38.25 max=13.4375
Dropout input=0 dtype=torch.bfloat16 min=-38.25 max=14.0
Dropout output=0 dtype=torch.bfloat16 min=-37.5 max=14.0
Dropout output=1 dtype=torch.bfloat16 min=-38.25 max=13.4375
Linear input=0 dtype=torch.bfloat16 min=-6.90625 max=7.71875
Linear output=0 dtype=torch.bfloat16 min=-31.0 max=23.625
Linear output=1 dtype=torch.bfloat16 min=-32.25 max=24.0
Attention output=0 dtype=torch.bfloat16 min=-38.25 max=14.0
Attention output=1 dtype=torch.bfloat16 min=-32.25 max=24.0
LayerNorm input=0 dtype=torch.bfloat16 min=-227.0 max=44.5
LayerNorm output=0 dtype=torch.bfloat16 min=-36.75 max=8.5625
LayerNorm output=1 dtype=torch.bfloat16 min=-37.0 max=8.5
Linear input=0 dtype=torch.bfloat16 min=-5.8125 max=4.34375
Linear output=0 dtype=torch.bfloat16 min=-6.96875 max=6.6875
Linear output=1 dtype=torch.bfloat16 min=-6.875 max=6.5625
GELU input=0 dtype=torch.bfloat16 min=-5.8125 max=4.34375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=6.6875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=6.5625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.6875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=6.6875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=6.5625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.6875
Linear output=0 dtype=torch.bfloat16 min=-8.6875 max=17.125
Linear output=1 dtype=torch.bfloat16 min=-8.5625 max=16.625
FeedForward input=0 dtype=torch.bfloat16 min=-5.8125 max=4.34375
FeedForward output=0 dtype=torch.bfloat16 min=-8.6875 max=17.125
FeedForward output=1 dtype=torch.bfloat16 min=-8.5625 max=16.625
LayerNorm input=0 dtype=torch.bfloat16 min=-3584.0 max=676.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.25 max=9.625
LayerNorm output=1 dtype=torch.bfloat16 min=-35.25 max=12.0625
Linear input=0 dtype=torch.bfloat16 min=-15.125 max=15.375
Linear output=0 dtype=torch.bfloat16 min=-12.5625 max=9.875
Linear output=1 dtype=torch.bfloat16 min=-11.625 max=9.5625
GELU input=0 dtype=torch.bfloat16 min=-15.125 max=15.375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=9.5625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=9.5625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.875
Linear output=0 dtype=torch.bfloat16 min=-70.0 max=39.0
Linear output=1 dtype=torch.bfloat16 min=-69.5 max=33.0
FeedForward input=0 dtype=torch.bfloat16 min=-15.125 max=15.375
FeedForward output=0 dtype=torch.bfloat16 min=-70.0 max=39.0
FeedForward output=1 dtype=torch.bfloat16 min=-69.5 max=33.0
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-80.0 max=44.5
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-3552.0 max=672.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-23.25 max=7.09375
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-3840.0 max=684.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-164.0 max=44.5
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-3.78125 max=6.65625
Linear output=1 dtype=torch.bfloat16 min=-3.78125 max=6.6875
LayerNorm input=0 dtype=torch.bfloat16 min=-164.0 max=44.5
LayerNorm output=0 dtype=torch.bfloat16 min=-34.5 max=9.375
LayerNorm output=1 dtype=torch.bfloat16 min=-34.5 max=9.25
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-164.0 max=44.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-7.5 max=6.5625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-0.76171875 max=4.0625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.0234375 max=2.34375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.125 max=2.453125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-1.015625 max=6.6875
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-5.875 max=7.625
Linear output=1 dtype=torch.bfloat16 min=-5.90625 max=7.65625
LayerNorm input=0 dtype=torch.bfloat16 min=-3840.0 max=684.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.5 max=9.0
LayerNorm output=1 dtype=torch.bfloat16 min=-35.5 max=11.8125
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-3840.0 max=684.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-4.34375 max=6.125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.921875 max=5.15625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.0625 max=1.78125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.671875 max=2.21875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-5.90625 max=7.65625
Linear input=0 dtype=torch.bfloat16 min=-7.5 max=6.5625
Linear output=0 dtype=torch.bfloat16 min=-13.0 max=11.0
Linear output=1 dtype=torch.bfloat16 min=-12.6875 max=10.9375
Linear input=0 dtype=torch.bfloat16 min=-7.5 max=6.5625
Linear output=0 dtype=torch.bfloat16 min=-10.375 max=10.6875
Linear output=1 dtype=torch.bfloat16 min=-10.25 max=10.375
Linear input=0 dtype=torch.bfloat16 min=-7.5 max=6.5625
Linear output=0 dtype=torch.bfloat16 min=-12.0 max=10.0625
Linear output=1 dtype=torch.bfloat16 min=-11.875 max=9.9375
Linear input=0 dtype=torch.bfloat16 min=-4.34375 max=6.125
Linear output=0 dtype=torch.bfloat16 min=-4.15625 max=4.53125
Linear output=1 dtype=torch.bfloat16 min=-4.71875 max=4.34375
Linear input=0 dtype=torch.bfloat16 min=-4.34375 max=6.125
Linear output=0 dtype=torch.bfloat16 min=-5.40625 max=5.25
Linear output=1 dtype=torch.bfloat16 min=-5.8125 max=5.21875
Linear input=0 dtype=torch.bfloat16 min=-4.34375 max=6.125
Linear output=0 dtype=torch.bfloat16 min=-4.375 max=4.75
Linear output=1 dtype=torch.bfloat16 min=-4.34375 max=5.09375
Linear input=0 dtype=torch.bfloat16 min=-7.75 max=6.96875
Linear output=0 dtype=torch.bfloat16 min=-18.25 max=11.375
Linear output=1 dtype=torch.bfloat16 min=-17.5 max=11.0
Dropout input=0 dtype=torch.bfloat16 min=-18.25 max=11.375
Dropout output=0 dtype=torch.bfloat16 min=-18.25 max=11.375
Dropout output=1 dtype=torch.bfloat16 min=-17.5 max=11.0
Linear input=0 dtype=torch.bfloat16 min=-4.96875 max=4.40625
Linear output=0 dtype=torch.bfloat16 min=-11.25 max=11.625
Linear output=1 dtype=torch.bfloat16 min=-9.875 max=9.8125
Attention output=0 dtype=torch.bfloat16 min=-18.25 max=11.375
Attention output=1 dtype=torch.bfloat16 min=-11.25 max=11.625
LayerNorm input=0 dtype=torch.bfloat16 min=-217.0 max=44.25
LayerNorm output=0 dtype=torch.bfloat16 min=-36.25 max=7.59375
LayerNorm output=1 dtype=torch.bfloat16 min=-36.25 max=7.59375
Linear input=0 dtype=torch.bfloat16 min=-5.96875 max=3.78125
Linear output=0 dtype=torch.bfloat16 min=-9.125 max=5.34375
Linear output=1 dtype=torch.bfloat16 min=-9.125 max=5.34375
GELU input=0 dtype=torch.bfloat16 min=-5.96875 max=3.78125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.34375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=5.34375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.34375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.34375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=5.34375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.34375
Linear output=0 dtype=torch.bfloat16 min=-9.125 max=11.375
Linear output=1 dtype=torch.bfloat16 min=-9.0625 max=11.375
FeedForward input=0 dtype=torch.bfloat16 min=-5.96875 max=3.78125
FeedForward output=0 dtype=torch.bfloat16 min=-9.125 max=11.375
FeedForward output=1 dtype=torch.bfloat16 min=-9.0625 max=11.375
LayerNorm input=0 dtype=torch.bfloat16 min=-3872.0 max=684.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.5 max=8.125
LayerNorm output=1 dtype=torch.bfloat16 min=-35.25 max=10.6875
Linear input=0 dtype=torch.bfloat16 min=-10.25 max=6.8125
Linear output=0 dtype=torch.bfloat16 min=-13.5625 max=13.5625
Linear output=1 dtype=torch.bfloat16 min=-13.0625 max=13.375
GELU input=0 dtype=torch.bfloat16 min=-10.25 max=6.8125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=13.5625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=13.375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=13.5625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=13.5625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=13.375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=13.5625
Linear output=0 dtype=torch.bfloat16 min=-69.0 max=38.25
Linear output=1 dtype=torch.bfloat16 min=-66.5 max=34.75
FeedForward input=0 dtype=torch.bfloat16 min=-10.25 max=6.8125
FeedForward output=0 dtype=torch.bfloat16 min=-69.0 max=38.25
FeedForward output=1 dtype=torch.bfloat16 min=-66.5 max=34.75
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-164.0 max=44.5
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-3840.0 max=684.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-23.25 max=7.09375
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-4384.0 max=704.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-192.0 max=45.75
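For reference, per-module lines in this shape can be produced with plain PyTorch forward hooks. The sketch below is illustrative only (attach_minmax_hooks is a hypothetical helper, and it does not reproduce this logger's per-sample output=0/output=1 split for single-output modules such as Linear):

import torch

def _report(name, label, values):
    # One line per floating-point tensor: module class, role, dtype, min, max.
    tensors = values if isinstance(values, (tuple, list)) else (values,)
    for i, t in enumerate(tensors):
        if torch.is_tensor(t) and t.is_floating_point():
            print(f"{name} {label}={i} dtype={t.dtype} "
                  f"min={t.min().item()} max={t.max().item()}")

def attach_minmax_hooks(model):
    # Hook every submodule so each forward call logs the extrema of its
    # positional inputs and outputs, mirroring the lines in this trace.
    def hook(mod, inputs, output):
        _report(mod.__class__.__name__, "input", inputs)
        _report(mod.__class__.__name__, "output", output)
    return [m.register_forward_hook(hook) for m in model.modules()]

Attaching this to pipe.transformer before the denoising loop would emit one input/output group per module call; calling remove() on each returned handle detaches it afterwards.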
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-4.75 max=6.03125
Linear output=1 dtype=torch.bfloat16 min=-4.75 max=6.03125
LayerNorm input=0 dtype=torch.bfloat16 min=-192.0 max=45.75
LayerNorm output=0 dtype=torch.bfloat16 min=-33.75 max=8.8125
LayerNorm output=1 dtype=torch.bfloat16 min=-33.75 max=8.75
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-192.0 max=45.75
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-8.8125 max=5.96875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.75 max=0.76171875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.0625 max=1.7890625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.765625 max=2.3125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-0.97265625 max=6.03125
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-4.46875 max=9.75
Linear output=1 dtype=torch.bfloat16 min=-4.46875 max=9.8125
LayerNorm input=0 dtype=torch.bfloat16 min=-4384.0 max=704.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=8.0
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=9.625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-4384.0 max=704.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-7.0625 max=3.78125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.46875 max=6.1875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.6796875 max=1.9140625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.484375 max=1.921875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-3.671875 max=9.8125
Linear input=0 dtype=torch.bfloat16 min=-8.8125 max=5.96875
Linear output=0 dtype=torch.bfloat16 min=-8.0 max=8.875
Linear output=1 dtype=torch.bfloat16 min=-7.84375 max=8.8125
Linear input=0 dtype=torch.bfloat16 min=-8.8125 max=5.96875
Linear output=0 dtype=torch.bfloat16 min=-8.625 max=8.125
Linear output=1 dtype=torch.bfloat16 min=-8.4375 max=8.125
Linear input=0 dtype=torch.bfloat16 min=-8.8125 max=5.96875
Linear output=0 dtype=torch.bfloat16 min=-8.625 max=9.25
Linear output=1 dtype=torch.bfloat16 min=-8.4375 max=9.25
Linear input=0 dtype=torch.bfloat16 min=-7.0625 max=3.78125
Linear output=0 dtype=torch.bfloat16 min=-3.796875 max=4.15625
Linear output=1 dtype=torch.bfloat16 min=-4.1875 max=4.0625
Linear input=0 dtype=torch.bfloat16 min=-7.0625 max=3.78125
Linear output=0 dtype=torch.bfloat16 min=-5.0625 max=4.8125
Linear output=1 dtype=torch.bfloat16 min=-5.0625 max=4.8125
Linear input=0 dtype=torch.bfloat16 min=-7.0625 max=3.78125
Linear output=0 dtype=torch.bfloat16 min=-4.5625 max=4.6875
Linear output=1 dtype=torch.bfloat16 min=-4.96875 max=4.71875
Linear input=0 dtype=torch.bfloat16 min=-7.21875 max=5.96875
Linear output=0 dtype=torch.bfloat16 min=-11.625 max=16.5
Linear output=1 dtype=torch.bfloat16 min=-11.375 max=16.0
Dropout input=0 dtype=torch.bfloat16 min=-11.625 max=16.5
Dropout output=0 dtype=torch.bfloat16 min=-11.625 max=16.5
Dropout output=1 dtype=torch.bfloat16 min=-11.375 max=16.0
Linear input=0 dtype=torch.bfloat16 min=-3.734375 max=6.28125
Linear output=0 dtype=torch.bfloat16 min=-9.3125 max=12.9375
Linear output=1 dtype=torch.bfloat16 min=-8.6875 max=10.0
Attention output=0 dtype=torch.bfloat16 min=-11.625 max=16.5
Attention output=1 dtype=torch.bfloat16 min=-9.3125 max=12.9375
LayerNorm input=0 dtype=torch.bfloat16 min=-247.0 max=45.75
LayerNorm output=0 dtype=torch.bfloat16 min=-36.5 max=6.96875
LayerNorm output=1 dtype=torch.bfloat16 min=-36.25 max=6.96875
Linear input=0 dtype=torch.bfloat16 min=-4.21875 max=2.78125
Linear output=0 dtype=torch.bfloat16 min=-9.875 max=7.375
Linear output=1 dtype=torch.bfloat16 min=-9.875 max=7.4375
GELU input=0 dtype=torch.bfloat16 min=-4.21875 max=2.78125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=7.375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=7.4375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=7.4375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=7.375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=7.4375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=7.4375
Linear output=0 dtype=torch.bfloat16 min=-7.9375 max=19.625
Linear output=1 dtype=torch.bfloat16 min=-7.8125 max=19.625
FeedForward input=0 dtype=torch.bfloat16 min=-4.21875 max=2.78125
FeedForward output=0 dtype=torch.bfloat16 min=-7.9375 max=19.625
FeedForward output=1 dtype=torch.bfloat16 min=-7.8125 max=19.625
LayerNorm input=0 dtype=torch.bfloat16 min=-4384.0 max=704.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=7.65625
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=8.25
Linear input=0 dtype=torch.bfloat16 min=-6.53125 max=5.40625
Linear output=0 dtype=torch.bfloat16 min=-12.25 max=8.75
Linear output=1 dtype=torch.bfloat16 min=-13.25 max=13.1875
GELU input=0 dtype=torch.bfloat16 min=-6.53125 max=5.40625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.75
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=13.1875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=13.1875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.75
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=13.1875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=13.1875
Linear output=0 dtype=torch.bfloat16 min=-52.5 max=36.5
Linear output=1 dtype=torch.bfloat16 min=-51.5 max=36.75
FeedForward input=0 dtype=torch.bfloat16 min=-6.53125 max=5.40625
FeedForward output=0 dtype=torch.bfloat16 min=-52.5 max=36.5
FeedForward output=1 dtype=torch.bfloat16 min=-51.5 max=36.75
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-192.0 max=45.75
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-4384.0 max=704.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-23.25 max=7.09375
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-4896.0 max=744.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-194.0 max=46.25
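Each AdaLayerNormZero above logs one input and five outputs, and is preceded by its own SiLU -> Linear pair plus a LayerNorm. That is the adaLN-Zero pattern: a SiLU+Linear head projects the pooled conditioning embedding into six chunks, the first two of which shift/scale the affine-free LayerNorm output, while the remaining gate/shift/scale tensors are returned for later use in the block. A minimal sketch of that forward pass, following the structure diffusers uses (names and shapes illustrative):

import torch
import torch.nn as nn

class AdaLNZeroSketch(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.silu = nn.SiLU()
        self.linear = nn.Linear(dim, 6 * dim)
        self.norm = nn.LayerNorm(dim, elementwise_affine=False, eps=1e-6)

    def forward(self, x, emb):
        # Six modulation chunks from the conditioning embedding.
        shift_msa, scale_msa, gate_msa, shift_mlp, scale_mlp, gate_mlp = \
            self.linear(self.silu(emb)).chunk(6, dim=1)
        # Modulate the normalized stream with the first shift/scale pair.
        x = self.norm(x) * (1 + scale_msa[:, None]) + shift_msa[:, None]
        return x, gate_msa, shift_mlp, scale_mlp, gate_mlp

In the trace, output=0 is therefore the modulated norm of the stream, and output=1 through output=4 are the gate/shift/scale tensors consumed later in the block.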
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-4.0625 max=5.96875
Linear output=1 dtype=torch.bfloat16 min=-4.09375 max=5.96875
LayerNorm input=0 dtype=torch.bfloat16 min=-194.0 max=46.25
LayerNorm output=0 dtype=torch.bfloat16 min=-33.25 max=9.25
LayerNorm output=1 dtype=torch.bfloat16 min=-33.25 max=9.25
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-194.0 max=46.25
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-11.875 max=6.5
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.09375 max=1.109375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-3.234375 max=2.203125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.328125 max=3.328125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-1.5390625 max=5.96875
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-11.0 max=8.75
Linear output=1 dtype=torch.bfloat16 min=-11.0625 max=8.8125
LayerNorm input=0 dtype=torch.bfloat16 min=-4896.0 max=744.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.0 max=6.78125
LayerNorm output=1 dtype=torch.bfloat16 min=-36.0 max=8.5
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-4896.0 max=744.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-4.0 max=6.28125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-7.71875 max=6.25
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.0625 max=1.375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.453125 max=1.9296875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-11.0625 max=8.8125
Linear input=0 dtype=torch.bfloat16 min=-11.875 max=6.5
Linear output=0 dtype=torch.bfloat16 min=-10.4375 max=11.5
Linear output=1 dtype=torch.bfloat16 min=-10.25 max=11.5
Linear input=0 dtype=torch.bfloat16 min=-11.875 max=6.5
Linear output=0 dtype=torch.bfloat16 min=-13.4375 max=12.3125
Linear output=1 dtype=torch.bfloat16 min=-13.25 max=12.0625
Linear input=0 dtype=torch.bfloat16 min=-11.875 max=6.5
Linear output=0 dtype=torch.bfloat16 min=-7.625 max=9.3125
Linear output=1 dtype=torch.bfloat16 min=-7.59375 max=9.125
Linear input=0 dtype=torch.bfloat16 min=-4.0 max=6.28125
Linear output=0 dtype=torch.bfloat16 min=-4.65625 max=6.03125
Linear output=1 dtype=torch.bfloat16 min=-4.71875 max=5.96875
Linear input=0 dtype=torch.bfloat16 min=-4.0 max=6.28125
Linear output=0 dtype=torch.bfloat16 min=-4.75 max=4.96875
Linear output=1 dtype=torch.bfloat16 min=-4.90625 max=5.125
Linear input=0 dtype=torch.bfloat16 min=-4.0 max=6.28125
Linear output=0 dtype=torch.bfloat16 min=-5.75 max=6.84375
Linear output=1 dtype=torch.bfloat16 min=-5.78125 max=6.84375
Linear input=0 dtype=torch.bfloat16 min=-5.25 max=6.375
Linear output=0 dtype=torch.bfloat16 min=-10.125 max=22.875
Linear output=1 dtype=torch.bfloat16 min=-9.6875 max=23.125
Dropout input=0 dtype=torch.bfloat16 min=-10.125 max=23.125
Dropout output=0 dtype=torch.bfloat16 min=-10.125 max=22.875
Dropout output=1 dtype=torch.bfloat16 min=-9.6875 max=23.125
Linear input=0 dtype=torch.bfloat16 min=-3.609375 max=4.75
Linear output=0 dtype=torch.bfloat16 min=-15.125 max=9.4375
Linear output=1 dtype=torch.bfloat16 min=-13.875 max=9.25
Attention output=0 dtype=torch.bfloat16 min=-10.125 max=23.125
Attention output=1 dtype=torch.bfloat16 min=-15.125 max=9.4375
LayerNorm input=0 dtype=torch.bfloat16 min=-260.0 max=46.25
LayerNorm output=0 dtype=torch.bfloat16 min=-36.5 max=6.8125
LayerNorm output=1 dtype=torch.bfloat16 min=-36.5 max=6.75
Linear input=0 dtype=torch.bfloat16 min=-4.65625 max=2.5
Linear output=0 dtype=torch.bfloat16 min=-7.21875 max=4.96875
Linear output=1 dtype=torch.bfloat16 min=-7.21875 max=5.0625
GELU input=0 dtype=torch.bfloat16 min=-4.65625 max=2.5
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.96875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=5.0625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.0625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.96875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=5.0625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.0625
Linear output=0 dtype=torch.bfloat16 min=-7.40625 max=17.875
Linear output=1 dtype=torch.bfloat16 min=-7.34375 max=17.75
FeedForward input=0 dtype=torch.bfloat16 min=-4.65625 max=2.5
FeedForward output=0 dtype=torch.bfloat16 min=-7.40625 max=17.875
FeedForward output=1 dtype=torch.bfloat16 min=-7.34375 max=17.75
LayerNorm input=0 dtype=torch.bfloat16 min=-4896.0 max=740.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.0 max=7.0
LayerNorm output=1 dtype=torch.bfloat16 min=-36.0 max=7.65625
Linear input=0 dtype=torch.bfloat16 min=-9.75 max=3.78125
Linear output=0 dtype=torch.bfloat16 min=-13.125 max=9.125
Linear output=1 dtype=torch.bfloat16 min=-13.4375 max=9.0
GELU input=0 dtype=torch.bfloat16 min=-9.75 max=3.78125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=9.0
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=9.0
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=9.125
Linear output=0 dtype=torch.bfloat16 min=-21.25 max=38.25
Linear output=1 dtype=torch.bfloat16 min=-23.625 max=39.75
FeedForward input=0 dtype=torch.bfloat16 min=-9.75 max=3.78125
FeedForward output=0 dtype=torch.bfloat16 min=-21.25 max=38.25
FeedForward output=1 dtype=torch.bfloat16 min=-23.625 max=39.75
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-194.0 max=46.25
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-4896.0 max=744.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-23.25 max=7.09375
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-5088.0 max=860.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-190.0 max=46.5
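Inside each block, the run of six Linear lines after the two AdaLayerNormZero groups (three sharing one stream's extrema, three sharing the other's) corresponds to the per-stream query/key/value projections of the joint attention; the Linear + Dropout pair and the final Linear that follow are the two streams' output projections, which surface as Attention output=0 and output=1. A rough sketch of that joint-attention flow (the proj dict of linears is assumed, and multi-head reshaping is omitted):

import torch
import torch.nn.functional as F

def joint_attention(x_a, x_b, proj):
    # Project both streams to q/k/v and concatenate along the sequence axis
    # so the two token streams attend to each other jointly.
    q = torch.cat([proj["q"](x_a), proj["add_q"](x_b)], dim=1)
    k = torch.cat([proj["k"](x_a), proj["add_k"](x_b)], dim=1)
    v = torch.cat([proj["v"](x_a), proj["add_v"](x_b)], dim=1)
    h = F.scaled_dot_product_attention(q, k, v)
    # Split back into the two streams and apply the output projections.
    h_a, h_b = h.split([x_a.shape[1], x_b.shape[1]], dim=1)
    return proj["to_out"](h_a), proj["to_add_out"](h_b)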
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-6.84375 max=6.15625
Linear output=1 dtype=torch.bfloat16 min=-6.84375 max=6.1875
LayerNorm input=0 dtype=torch.bfloat16 min=-190.0 max=46.5
LayerNorm output=0 dtype=torch.bfloat16 min=-32.5 max=9.3125
LayerNorm output=1 dtype=torch.bfloat16 min=-32.75 max=9.1875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-190.0 max=46.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-17.625 max=12.4375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-1.2734375 max=4.65625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.0 max=2.53125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.546875 max=2.65625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-6.84375 max=2.03125
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-12.375 max=14.25
Linear output=1 dtype=torch.bfloat16 min=-12.4375 max=14.3125
LayerNorm input=0 dtype=torch.bfloat16 min=-5088.0 max=860.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.0 max=9.4375
LayerNorm output=1 dtype=torch.bfloat16 min=-36.0 max=9.75
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-5088.0 max=860.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-3.65625 max=3.671875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-9.75 max=7.5625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.96875 max=1.0390625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.328125 max=2.078125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-12.4375 max=14.3125
Linear input=0 dtype=torch.bfloat16 min=-17.625 max=12.4375
Linear output=0 dtype=torch.bfloat16 min=-14.6875 max=14.1875
Linear output=1 dtype=torch.bfloat16 min=-14.5625 max=14.125
Linear input=0 dtype=torch.bfloat16 min=-17.625 max=12.4375
Linear output=0 dtype=torch.bfloat16 min=-15.5625 max=15.8125
Linear output=1 dtype=torch.bfloat16 min=-15.25 max=15.625
Linear input=0 dtype=torch.bfloat16 min=-17.625 max=12.4375
Linear output=0 dtype=torch.bfloat16 min=-15.625 max=13.75
Linear output=1 dtype=torch.bfloat16 min=-15.375 max=13.625
Linear input=0 dtype=torch.bfloat16 min=-3.65625 max=3.671875
Linear output=0 dtype=torch.bfloat16 min=-4.6875 max=6.28125
Linear output=1 dtype=torch.bfloat16 min=-4.71875 max=6.4375
Linear input=0 dtype=torch.bfloat16 min=-3.65625 max=3.671875
Linear output=0 dtype=torch.bfloat16 min=-5.53125 max=5.8125
Linear output=1 dtype=torch.bfloat16 min=-5.5625 max=6.0625
Linear input=0 dtype=torch.bfloat16 min=-3.65625 max=3.671875
Linear output=0 dtype=torch.bfloat16 min=-4.71875 max=6.1875
Linear output=1 dtype=torch.bfloat16 min=-4.9375 max=6.3125
Linear input=0 dtype=torch.bfloat16 min=-10.1875 max=8.125
Linear output=0 dtype=torch.bfloat16 min=-25.25 max=12.4375
Linear output=1 dtype=torch.bfloat16 min=-24.75 max=12.1875
Dropout input=0 dtype=torch.bfloat16 min=-25.25 max=12.4375
Dropout output=0 dtype=torch.bfloat16 min=-25.25 max=12.4375
Dropout output=1 dtype=torch.bfloat16 min=-24.75 max=12.1875
Linear input=0 dtype=torch.bfloat16 min=-7.6875 max=6.125
Linear output=0 dtype=torch.bfloat16 min=-13.6875 max=9.5625
Linear output=1 dtype=torch.bfloat16 min=-13.9375 max=10.25
Attention output=0 dtype=torch.bfloat16 min=-25.25 max=12.4375
Attention output=1 dtype=torch.bfloat16 min=-13.9375 max=10.25
LayerNorm input=0 dtype=torch.bfloat16 min=-260.0 max=46.25
LayerNorm output=0 dtype=torch.bfloat16 min=-36.5 max=6.84375
LayerNorm output=1 dtype=torch.bfloat16 min=-36.5 max=6.71875
Linear input=0 dtype=torch.bfloat16 min=-2.921875 max=2.65625
Linear output=0 dtype=torch.bfloat16 min=-10.9375 max=4.9375
Linear output=1 dtype=torch.bfloat16 min=-10.875 max=5.03125
GELU input=0 dtype=torch.bfloat16 min=-2.921875 max=2.65625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.9375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=5.03125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.03125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.9375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=5.03125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.03125
Linear output=0 dtype=torch.bfloat16 min=-16.625 max=9.6875
Linear output=1 dtype=torch.bfloat16 min=-16.625 max=9.75
FeedForward input=0 dtype=torch.bfloat16 min=-2.921875 max=2.65625
FeedForward output=0 dtype=torch.bfloat16 min=-16.625 max=9.6875
FeedForward output=1 dtype=torch.bfloat16 min=-16.625 max=9.75
LayerNorm input=0 dtype=torch.bfloat16 min=-5056.0 max=804.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.0 max=5.78125
LayerNorm output=1 dtype=torch.bfloat16 min=-36.0 max=6.5
Linear input=0 dtype=torch.bfloat16 min=-8.25 max=5.53125
Linear output=0 dtype=torch.bfloat16 min=-12.125 max=10.0625
Linear output=1 dtype=torch.bfloat16 min=-12.5 max=11.3125
GELU input=0 dtype=torch.bfloat16 min=-8.25 max=5.53125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=10.0625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=11.3125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=11.3125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=10.0625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=11.3125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=11.3125
Linear output=0 dtype=torch.bfloat16 min=-63.5 max=49.75
Linear output=1 dtype=torch.bfloat16 min=-69.0 max=55.5
FeedForward input=0 dtype=torch.bfloat16 min=-8.25 max=5.53125
FeedForward output=0 dtype=torch.bfloat16 min=-63.5 max=49.75
FeedForward output=1 dtype=torch.bfloat16 min=-69.0 max=55.5
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-190.0 max=46.5
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-5088.0 max=860.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-23.25 max=7.09375
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-5248.0 max=976.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-188.0 max=46.5
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-3.1875 max=9.8125
Linear output=1 dtype=torch.bfloat16 min=-3.203125 max=9.8125
LayerNorm input=0 dtype=torch.bfloat16 min=-188.0 max=46.5
LayerNorm output=0 dtype=torch.bfloat16 min=-32.75 max=9.9375
LayerNorm output=1 dtype=torch.bfloat16 min=-33.0 max=9.9375
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-188.0 max=46.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-14.25 max=10.25
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.78125 max=5.65625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.140625 max=2.328125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.3125 max=2.5
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-3.09375 max=9.8125
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-21.0 max=17.375
Linear output=1 dtype=torch.bfloat16 min=-21.0 max=17.5
LayerNorm input=0 dtype=torch.bfloat16 min=-5248.0 max=976.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=15.0
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=15.375
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-5248.0 max=976.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-3.65625 max=3.09375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-10.6875 max=10.875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.359375 max=0.90234375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.859375 max=1.6484375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-21.0 max=17.5
Linear input=0 dtype=torch.bfloat16 min=-14.25 max=10.25
Linear output=0 dtype=torch.bfloat16 min=-17.375 max=13.25
Linear output=1 dtype=torch.bfloat16 min=-17.25 max=13.0625
Linear input=0 dtype=torch.bfloat16 min=-14.25 max=10.25
Linear output=0 dtype=torch.bfloat16 min=-15.4375 max=15.625
Linear output=1 dtype=torch.bfloat16 min=-15.125 max=15.4375
Linear input=0 dtype=torch.bfloat16 min=-14.25 max=10.25
Linear output=0 dtype=torch.bfloat16 min=-12.0 max=15.4375
Linear output=1 dtype=torch.bfloat16 min=-11.6875 max=15.0625
Linear input=0 dtype=torch.bfloat16 min=-3.65625 max=3.09375
Linear output=0 dtype=torch.bfloat16 min=-5.84375 max=5.28125
Linear output=1 dtype=torch.bfloat16 min=-5.875 max=5.4375
Linear input=0 dtype=torch.bfloat16 min=-3.65625 max=3.09375
Linear output=0 dtype=torch.bfloat16 min=-5.625 max=4.6875
Linear output=1 dtype=torch.bfloat16 min=-5.96875 max=4.96875
Linear input=0 dtype=torch.bfloat16 min=-3.65625 max=3.09375
Linear output=0 dtype=torch.bfloat16 min=-4.8125 max=4.1875
Linear output=1 dtype=torch.bfloat16 min=-5.3125 max=5.03125
Linear input=0 dtype=torch.bfloat16 min=-6.8125 max=6.875
Linear output=0 dtype=torch.bfloat16 min=-23.0 max=11.1875
Linear output=1 dtype=torch.bfloat16 min=-22.875 max=11.1875
Dropout input=0 dtype=torch.bfloat16 min=-23.0 max=11.1875
Dropout output=0 dtype=torch.bfloat16 min=-23.0 max=11.1875
Dropout output=1 dtype=torch.bfloat16 min=-22.875 max=11.1875
Linear input=0 dtype=torch.bfloat16 min=-5.78125 max=7.65625
Linear output=0 dtype=torch.bfloat16 min=-18.375 max=15.5
Linear output=1 dtype=torch.bfloat16 min=-18.375 max=14.9375
Attention output=0 dtype=torch.bfloat16 min=-23.0 max=11.1875
Attention output=1 dtype=torch.bfloat16 min=-18.375 max=15.5
LayerNorm input=0 dtype=torch.bfloat16 min=-290.0 max=54.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.0 max=8.0
LayerNorm output=1 dtype=torch.bfloat16 min=-36.0 max=8.0
Linear input=0 dtype=torch.bfloat16 min=-4.0625 max=2.59375
Linear output=0 dtype=torch.bfloat16 min=-8.1875 max=14.75
Linear output=1 dtype=torch.bfloat16 min=-8.375 max=16.75
GELU input=0 dtype=torch.bfloat16 min=-4.0625 max=2.59375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=14.75
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=16.75
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=16.75
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=14.75
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=16.75
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=16.75
Linear output=0 dtype=torch.bfloat16 min=-10.75 max=14.8125
Linear output=1 dtype=torch.bfloat16 min=-12.75 max=14.75
FeedForward input=0 dtype=torch.bfloat16 min=-4.0625 max=2.59375
FeedForward output=0 dtype=torch.bfloat16 min=-10.75 max=14.8125
FeedForward output=1 dtype=torch.bfloat16 min=-12.75 max=14.75
LayerNorm input=0 dtype=torch.bfloat16 min=-5120.0 max=936.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=15.3125
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=15.625
Linear input=0 dtype=torch.bfloat16 min=-9.25 max=4.3125
Linear output=0 dtype=torch.bfloat16 min=-10.6875 max=11.5
Linear output=1 dtype=torch.bfloat16 min=-10.1875 max=11.5
GELU input=0 dtype=torch.bfloat16 min=-9.25 max=4.3125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=11.5
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=11.5
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=11.5
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=11.5
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=11.5
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=11.5
Linear output=0 dtype=torch.bfloat16 min=-51.25 max=45.25
Linear output=1 dtype=torch.bfloat16 min=-51.0 max=45.5
FeedForward input=0 dtype=torch.bfloat16 min=-9.25 max=4.3125
FeedForward output=0 dtype=torch.bfloat16 min=-51.25 max=45.25
FeedForward output=1 dtype=torch.bfloat16 min=-51.0 max=45.5
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-188.0 max=46.5
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-5248.0 max=976.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-23.25 max=7.09375
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-4896.0 max=1552.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-404.0 max=58.5
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-5.25 max=7.375
Linear output=1 dtype=torch.bfloat16 min=-5.3125 max=7.40625
LayerNorm input=0 dtype=torch.bfloat16 min=-404.0 max=58.5
LayerNorm output=0 dtype=torch.bfloat16 min=-36.0 max=13.625
LayerNorm output=1 dtype=torch.bfloat16 min=-36.25 max=13.625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-404.0 max=58.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-10.5625 max=12.0
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-5.3125 max=1.6953125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.6953125 max=2.046875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.1875 max=1.953125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-2.25 max=7.40625
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-16.625 max=16.625
Linear output=1 dtype=torch.bfloat16 min=-16.75 max=16.625
LayerNorm input=0 dtype=torch.bfloat16 min=-4896.0 max=1552.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.5 max=17.625
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=17.625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-4896.0 max=1552.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-6.4375 max=4.1875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-12.625 max=12.75
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-3.078125 max=0.87890625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.96875 max=2.0
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-16.75 max=16.625
Linear input=0 dtype=torch.bfloat16 min=-10.5625 max=12.0
Linear output=0 dtype=torch.bfloat16 min=-9.0 max=9.25
Linear output=1 dtype=torch.bfloat16 min=-8.8125 max=9.1875
Linear input=0 dtype=torch.bfloat16 min=-10.5625 max=12.0
Linear output=0 dtype=torch.bfloat16 min=-13.3125 max=13.0
Linear output=1 dtype=torch.bfloat16 min=-13.1875 max=12.9375
Linear input=0 dtype=torch.bfloat16 min=-10.5625 max=12.0
Linear output=0 dtype=torch.bfloat16 min=-8.6875 max=9.125
Linear output=1 dtype=torch.bfloat16 min=-8.625 max=9.1875
Linear input=0 dtype=torch.bfloat16 min=-6.4375 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-5.53125 max=7.375
Linear output=1 dtype=torch.bfloat16 min=-5.40625 max=6.78125
Linear input=0 dtype=torch.bfloat16 min=-6.4375 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-5.84375 max=6.0625
Linear output=1 dtype=torch.bfloat16 min=-5.96875 max=6.09375
Linear input=0 dtype=torch.bfloat16 min=-6.4375 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-6.96875 max=7.90625
Linear output=1 dtype=torch.bfloat16 min=-6.9375 max=7.9375
Linear input=0 dtype=torch.bfloat16 min=-5.875 max=5.875
Linear output=0 dtype=torch.bfloat16 min=-9.625 max=26.25
Linear output=1 dtype=torch.bfloat16 min=-9.625 max=26.125
Dropout input=0 dtype=torch.bfloat16 min=-9.625 max=26.25
Dropout output=0 dtype=torch.bfloat16 min=-9.625 max=26.25
Dropout output=1 dtype=torch.bfloat16 min=-9.625 max=26.125
Linear input=0 dtype=torch.bfloat16 min=-5.71875 max=6.28125
Linear output=0 dtype=torch.bfloat16 min=-16.625 max=16.875
Linear output=1 dtype=torch.bfloat16 min=-21.0 max=24.25
Attention output=0 dtype=torch.bfloat16 min=-9.625 max=26.25
Attention output=1 dtype=torch.bfloat16 min=-21.0 max=24.25
LayerNorm input=0 dtype=torch.bfloat16 min=-490.0 max=54.75
LayerNorm output=0 dtype=torch.bfloat16 min=-37.0 max=8.125
LayerNorm output=1 dtype=torch.bfloat16 min=-37.25 max=8.125
Linear input=0 dtype=torch.bfloat16 min=-3.625 max=2.84375
Linear output=0 dtype=torch.bfloat16 min=-8.75 max=7.4375
Linear output=1 dtype=torch.bfloat16 min=-8.6875 max=8.1875
GELU input=0 dtype=torch.bfloat16 min=-3.625 max=2.84375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=7.4375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=8.1875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=8.1875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=7.4375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=8.1875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=8.1875
Linear output=0 dtype=torch.bfloat16 min=-7.59375 max=22.875
Linear output=1 dtype=torch.bfloat16 min=-11.6875 max=22.875
FeedForward input=0 dtype=torch.bfloat16 min=-3.625 max=2.84375
FeedForward output=0 dtype=torch.bfloat16 min=-7.59375 max=22.875
FeedForward output=1 dtype=torch.bfloat16 min=-11.6875 max=22.875
LayerNorm input=0 dtype=torch.bfloat16 min=-4992.0 max=1568.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=17.625
LayerNorm output=1 dtype=torch.bfloat16 min=-35.75 max=17.625
Linear input=0 dtype=torch.bfloat16 min=-9.75 max=5.1875
Linear output=0 dtype=torch.bfloat16 min=-24.875 max=14.75
Linear output=1 dtype=torch.bfloat16 min=-23.875 max=14.5
GELU input=0 dtype=torch.bfloat16 min=-9.75 max=5.1875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=14.75
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=14.5
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=14.75
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=14.75
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=14.5
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=14.75
Linear output=0 dtype=torch.bfloat16 min=-78.5 max=74.5
Linear output=1 dtype=torch.bfloat16 min=-77.0 max=72.5
FeedForward input=0 dtype=torch.bfloat16 min=-9.75 max=5.1875
FeedForward output=0 dtype=torch.bfloat16 min=-78.5 max=74.5
FeedForward output=1 dtype=torch.bfloat16 min=-77.0 max=72.5
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-404.0 max=58.5
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-4896.0 max=1552.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-23.25 max=7.09375
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-5120.0 max=2848.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-576.0 max=68.5
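One pattern worth pulling out of this wall of numbers: the residual stream logged as input=1 of each JointTransformerBlock keeps widening, from -3840.0/684.0 near the top of this trace to -6624.0/4576.0 further down. A few lines of Python extract that progression from the log itself (the filename is assumed, and if the trace covers several denoising steps each step contributes its own run of blocks):

import re

# Pull the extrema of the stream logged as input=1 for every
# JointTransformerBlock line to see the per-block growth at a glance.
pat = re.compile(r"JointTransformerBlock input=1 .* min=(\S+) max=(\S+)")

with open("sd3_cpu_failing_activations.txt") as f:  # filename assumed
    extrema = [m.groups() for m in map(pat.search, f) if m]

for i, (lo, hi) in enumerate(extrema):
    print(f"block {i:2d}: min={lo} max={hi}")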
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-3.421875 max=6.625
Linear output=1 dtype=torch.bfloat16 min=-3.421875 max=6.65625
LayerNorm input=0 dtype=torch.bfloat16 min=-576.0 max=68.5
LayerNorm output=0 dtype=torch.bfloat16 min=-37.25 max=16.75
LayerNorm output=1 dtype=torch.bfloat16 min=-37.5 max=16.75
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-576.0 max=68.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-13.1875 max=13.125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-1.7578125 max=4.96875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.6328125 max=1.71875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.109375 max=1.265625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-3.421875 max=6.65625
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-20.625 max=19.625
Linear output=1 dtype=torch.bfloat16 min=-20.625 max=19.75
LayerNorm input=0 dtype=torch.bfloat16 min=-5120.0 max=2848.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.0 max=21.125
LayerNorm output=1 dtype=torch.bfloat16 min=-35.0 max=21.0
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-5120.0 max=2848.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-5.46875 max=4.5625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-11.0 max=15.25
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.92578125 max=0.69140625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-4.6875 max=2.25
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-20.625 max=19.75
Linear input=0 dtype=torch.bfloat16 min=-13.1875 max=13.125
Linear output=0 dtype=torch.bfloat16 min=-14.8125 max=9.75
Linear output=1 dtype=torch.bfloat16 min=-14.75 max=9.75
Linear input=0 dtype=torch.bfloat16 min=-13.1875 max=13.125
Linear output=0 dtype=torch.bfloat16 min=-17.625 max=16.0
Linear output=1 dtype=torch.bfloat16 min=-17.25 max=16.0
Linear input=0 dtype=torch.bfloat16 min=-13.1875 max=13.125
Linear output=0 dtype=torch.bfloat16 min=-8.8125 max=8.75
Linear output=1 dtype=torch.bfloat16 min=-8.6875 max=8.6875
Linear input=0 dtype=torch.bfloat16 min=-5.46875 max=4.5625
Linear output=0 dtype=torch.bfloat16 min=-7.09375 max=6.09375
Linear output=1 dtype=torch.bfloat16 min=-7.28125 max=6.34375
Linear input=0 dtype=torch.bfloat16 min=-5.46875 max=4.5625
Linear output=0 dtype=torch.bfloat16 min=-6.625 max=5.34375
Linear output=1 dtype=torch.bfloat16 min=-6.65625 max=5.40625
Linear input=0 dtype=torch.bfloat16 min=-5.46875 max=4.5625
Linear output=0 dtype=torch.bfloat16 min=-7.9375 max=9.875
Linear output=1 dtype=torch.bfloat16 min=-8.625 max=10.125
Linear input=0 dtype=torch.bfloat16 min=-6.71875 max=6.1875
Linear output=0 dtype=torch.bfloat16 min=-32.25 max=7.34375
Linear output=1 dtype=torch.bfloat16 min=-32.5 max=7.25
Dropout input=0 dtype=torch.bfloat16 min=-32.5 max=7.34375
Dropout output=0 dtype=torch.bfloat16 min=-32.25 max=7.34375
Dropout output=1 dtype=torch.bfloat16 min=-32.5 max=7.25
Linear input=0 dtype=torch.bfloat16 min=-8.125 max=9.9375
Linear output=0 dtype=torch.bfloat16 min=-34.5 max=11.6875
Linear output=1 dtype=torch.bfloat16 min=-35.0 max=12.625
Attention output=0 dtype=torch.bfloat16 min=-32.5 max=7.34375
Attention output=1 dtype=torch.bfloat16 min=-35.0 max=12.625
LayerNorm input=0 dtype=torch.bfloat16 min=-688.0 max=66.5
LayerNorm output=0 dtype=torch.bfloat16 min=-38.0 max=10.1875
LayerNorm output=1 dtype=torch.bfloat16 min=-38.0 max=10.125
Linear input=0 dtype=torch.bfloat16 min=-4.90625 max=3.046875
Linear output=0 dtype=torch.bfloat16 min=-6.625 max=4.46875
Linear output=1 dtype=torch.bfloat16 min=-6.53125 max=4.3125
GELU input=0 dtype=torch.bfloat16 min=-4.90625 max=3.046875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.46875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.3125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.46875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.46875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.3125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.46875
Linear output=0 dtype=torch.bfloat16 min=-6.5 max=26.25
Linear output=1 dtype=torch.bfloat16 min=-6.4375 max=26.625
FeedForward input=0 dtype=torch.bfloat16 min=-4.90625 max=3.046875
FeedForward output=0 dtype=torch.bfloat16 min=-6.5 max=26.25
FeedForward output=1 dtype=torch.bfloat16 min=-6.4375 max=26.625
LayerNorm input=0 dtype=torch.bfloat16 min=-5120.0 max=2800.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.0 max=20.875
LayerNorm output=1 dtype=torch.bfloat16 min=-35.0 max=20.75
Linear input=0 dtype=torch.bfloat16 min=-9.75 max=7.6875
Linear output=0 dtype=torch.bfloat16 min=-9.625 max=8.5625
Linear output=1 dtype=torch.bfloat16 min=-9.4375 max=7.875
GELU input=0 dtype=torch.bfloat16 min=-9.75 max=7.6875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.5625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=7.875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=8.5625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.5625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=7.875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=8.5625
Linear output=0 dtype=torch.bfloat16 min=-73.0 max=84.5
Linear output=1 dtype=torch.bfloat16 min=-72.0 max=82.5
FeedForward input=0 dtype=torch.bfloat16 min=-9.75 max=7.6875
FeedForward output=0 dtype=torch.bfloat16 min=-73.0 max=84.5
FeedForward output=1 dtype=torch.bfloat16 min=-72.0 max=82.5
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-576.0 max=68.5
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-5120.0 max=2848.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-23.25 max=7.09375
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-5120.0 max=3920.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-524.0 max=71.5
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-6.6875 max=4.84375
Linear output=1 dtype=torch.bfloat16 min=-6.71875 max=4.875
LayerNorm input=0 dtype=torch.bfloat16 min=-524.0 max=71.5
LayerNorm output=0 dtype=torch.bfloat16 min=-37.0 max=16.375
LayerNorm output=1 dtype=torch.bfloat16 min=-37.25 max=16.375
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-524.0 max=71.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-11.25 max=12.6875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.1875 max=4.84375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.40625 max=2.4375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.953125 max=0.71875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-6.71875 max=4.53125
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-17.0 max=16.875
Linear output=1 dtype=torch.bfloat16 min=-17.125 max=17.0
LayerNorm input=0 dtype=torch.bfloat16 min=-5120.0 max=3920.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.25 max=22.5
LayerNorm output=1 dtype=torch.bfloat16 min=-34.25 max=22.625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-5120.0 max=3920.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-11.3125 max=5.40625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-12.875 max=17.0
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.1171875 max=0.8828125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.265625 max=3.09375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-17.125 max=16.75
Linear input=0 dtype=torch.bfloat16 min=-11.25 max=12.6875
Linear output=0 dtype=torch.bfloat16 min=-12.375 max=11.5625
Linear output=1 dtype=torch.bfloat16 min=-12.1875 max=11.625
Linear input=0 dtype=torch.bfloat16 min=-11.25 max=12.6875
Linear output=0 dtype=torch.bfloat16 min=-14.625 max=13.5625
Linear output=1 dtype=torch.bfloat16 min=-14.625 max=13.625
Linear input=0 dtype=torch.bfloat16 min=-11.25 max=12.6875
Linear output=0 dtype=torch.bfloat16 min=-8.9375 max=9.5625
Linear output=1 dtype=torch.bfloat16 min=-9.125 max=9.4375
Linear input=0 dtype=torch.bfloat16 min=-11.3125 max=5.40625
Linear output=0 dtype=torch.bfloat16 min=-6.6875 max=6.0625
Linear output=1 dtype=torch.bfloat16 min=-6.46875 max=6.3125
Linear input=0 dtype=torch.bfloat16 min=-11.3125 max=5.40625
Linear output=0 dtype=torch.bfloat16 min=-6.25 max=6.59375
Linear output=1 dtype=torch.bfloat16 min=-6.25 max=6.5625
Linear input=0 dtype=torch.bfloat16 min=-11.3125 max=5.40625
Linear output=0 dtype=torch.bfloat16 min=-8.25 max=7.9375
Linear output=1 dtype=torch.bfloat16 min=-8.25 max=7.9375
Linear input=0 dtype=torch.bfloat16 min=-5.5 max=5.375
Linear output=0 dtype=torch.bfloat16 min=-35.75 max=8.375
Linear output=1 dtype=torch.bfloat16 min=-36.25 max=8.125
Dropout input=0 dtype=torch.bfloat16 min=-36.25 max=8.375
Dropout output=0 dtype=torch.bfloat16 min=-35.75 max=8.375
Dropout output=1 dtype=torch.bfloat16 min=-36.25 max=8.125
Linear input=0 dtype=torch.bfloat16 min=-6.0 max=5.5625
Linear output=0 dtype=torch.bfloat16 min=-41.25 max=34.25
Linear output=1 dtype=torch.bfloat16 min=-43.0 max=34.0
Attention output=0 dtype=torch.bfloat16 min=-36.25 max=8.375
Attention output=1 dtype=torch.bfloat16 min=-43.0 max=34.25
LayerNorm input=0 dtype=torch.bfloat16 min=-616.0 max=71.5
LayerNorm output=0 dtype=torch.bfloat16 min=-37.5 max=10.0
LayerNorm output=1 dtype=torch.bfloat16 min=-37.75 max=9.875
Linear input=0 dtype=torch.bfloat16 min=-5.59375 max=1.8984375
Linear output=0 dtype=torch.bfloat16 min=-7.25 max=3.703125
Linear output=1 dtype=torch.bfloat16 min=-7.21875 max=3.625
GELU input=0 dtype=torch.bfloat16 min=-5.59375 max=1.8984375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.703125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.703125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.703125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.703125
Linear output=0 dtype=torch.bfloat16 min=-30.0 max=10.3125
Linear output=1 dtype=torch.bfloat16 min=-30.125 max=10.25
FeedForward input=0 dtype=torch.bfloat16 min=-5.59375 max=1.8984375
FeedForward output=0 dtype=torch.bfloat16 min=-30.0 max=10.3125
FeedForward output=1 dtype=torch.bfloat16 min=-30.125 max=10.25
LayerNorm input=0 dtype=torch.bfloat16 min=-5280.0 max=3648.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.5 max=20.375
LayerNorm output=1 dtype=torch.bfloat16 min=-34.5 max=20.375
Linear input=0 dtype=torch.bfloat16 min=-11.4375 max=8.5625
Linear output=0 dtype=torch.bfloat16 min=-15.5 max=9.8125
Linear output=1 dtype=torch.bfloat16 min=-15.9375 max=10.8125
GELU input=0 dtype=torch.bfloat16 min=-11.4375 max=8.5625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.8125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.8125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.8125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=9.8125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.8125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.8125
Linear output=0 dtype=torch.bfloat16 min=-66.5 max=40.75
Linear output=1 dtype=torch.bfloat16 min=-51.75 max=40.75
FeedForward input=0 dtype=torch.bfloat16 min=-11.4375 max=8.5625
FeedForward output=0 dtype=torch.bfloat16 min=-66.5 max=40.75
FeedForward output=1 dtype=torch.bfloat16 min=-51.75 max=40.75
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-524.0 max=71.5
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-5120.0 max=3920.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-23.25 max=7.09375
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-5984.0 max=4048.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-414.0 max=84.0
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-6.5 max=12.25
Linear output=1 dtype=torch.bfloat16 min=-6.5625 max=12.3125
LayerNorm input=0 dtype=torch.bfloat16 min=-414.0 max=84.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.75 max=17.875
LayerNorm output=1 dtype=torch.bfloat16 min=-36.25 max=17.5
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-414.0 max=84.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-19.25 max=15.625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-5.4375 max=6.0625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.296875 max=3.34375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.3125 max=1.59375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-6.5625 max=12.3125
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-21.25 max=18.875
Linear output=1 dtype=torch.bfloat16 min=-21.25 max=18.875
LayerNorm input=0 dtype=torch.bfloat16 min=-5984.0 max=4048.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=24.0
LayerNorm output=1 dtype=torch.bfloat16 min=-33.25 max=24.0
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-5984.0 max=4048.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-5.28125 max=4.5625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-13.875 max=17.25
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.1015625 max=1.1171875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.75 max=3.125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-21.25 max=18.875
Linear input=0 dtype=torch.bfloat16 min=-19.25 max=15.625
Linear output=0 dtype=torch.bfloat16 min=-14.75 max=18.0
Linear output=1 dtype=torch.bfloat16 min=-14.625 max=17.875
Linear input=0 dtype=torch.bfloat16 min=-19.25 max=15.625
Linear output=0 dtype=torch.bfloat16 min=-23.25 max=23.0
Linear output=1 dtype=torch.bfloat16 min=-23.125 max=22.75
Linear input=0 dtype=torch.bfloat16 min=-19.25 max=15.625
Linear output=0 dtype=torch.bfloat16 min=-11.5 max=12.875
Linear output=1 dtype=torch.bfloat16 min=-11.125 max=12.625
Linear input=0 dtype=torch.bfloat16 min=-5.28125 max=4.5625
Linear output=0 dtype=torch.bfloat16 min=-6.25 max=6.625
Linear output=1 dtype=torch.bfloat16 min=-6.25 max=6.90625
Linear input=0 dtype=torch.bfloat16 min=-5.28125 max=4.5625
Linear output=0 dtype=torch.bfloat16 min=-6.75 max=6.15625
Linear output=1 dtype=torch.bfloat16 min=-6.8125 max=6.21875
Linear input=0 dtype=torch.bfloat16 min=-5.28125 max=4.5625
Linear output=0 dtype=torch.bfloat16 min=-5.8125 max=8.3125
Linear output=1 dtype=torch.bfloat16 min=-6.9375 max=9.375
Linear input=0 dtype=torch.bfloat16 min=-8.625 max=9.1875
Linear output=0 dtype=torch.bfloat16 min=-49.75 max=12.0625
Linear output=1 dtype=torch.bfloat16 min=-49.5 max=11.8125
Dropout input=0 dtype=torch.bfloat16 min=-49.75 max=12.0625
Dropout output=0 dtype=torch.bfloat16 min=-49.75 max=12.0625
Dropout output=1 dtype=torch.bfloat16 min=-49.5 max=11.8125
Linear input=0 dtype=torch.bfloat16 min=-6.5 max=8.9375
Linear output=0 dtype=torch.bfloat16 min=-54.25 max=8.875
Linear output=1 dtype=torch.bfloat16 min=-51.75 max=9.5
Attention output=0 dtype=torch.bfloat16 min=-49.75 max=12.0625
Attention output=1 dtype=torch.bfloat16 min=-54.25 max=9.5
LayerNorm input=0 dtype=torch.bfloat16 min=-648.0 max=90.0
LayerNorm output=0 dtype=torch.bfloat16 min=-37.25 max=8.6875
LayerNorm output=1 dtype=torch.bfloat16 min=-37.5 max=8.6875
Linear input=0 dtype=torch.bfloat16 min=-3.09375 max=2.21875
Linear output=0 dtype=torch.bfloat16 min=-5.4375 max=3.9375
Linear output=1 dtype=torch.bfloat16 min=-5.375 max=3.890625
GELU input=0 dtype=torch.bfloat16 min=-3.09375 max=2.21875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.9375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.890625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.9375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.9375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.890625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.9375
Linear output=0 dtype=torch.bfloat16 min=-11.6875 max=14.375
Linear output=1 dtype=torch.bfloat16 min=-11.75 max=15.25
FeedForward input=0 dtype=torch.bfloat16 min=-3.09375 max=2.21875
FeedForward output=0 dtype=torch.bfloat16 min=-11.6875 max=14.375
FeedForward output=1 dtype=torch.bfloat16 min=-11.75 max=15.25
LayerNorm input=0 dtype=torch.bfloat16 min=-6080.0 max=4000.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.25 max=21.5
LayerNorm output=1 dtype=torch.bfloat16 min=-33.25 max=21.5
Linear input=0 dtype=torch.bfloat16 min=-9.5625 max=5.25
Linear output=0 dtype=torch.bfloat16 min=-18.125 max=11.6875
Linear output=1 dtype=torch.bfloat16 min=-19.125 max=10.5625
GELU input=0 dtype=torch.bfloat16 min=-9.5625 max=5.25
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=11.6875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.5625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=11.6875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=11.6875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.5625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=11.6875
Linear output=0 dtype=torch.bfloat16 min=-22.125 max=42.5
Linear output=1 dtype=torch.bfloat16 min=-22.375 max=38.0
FeedForward input=0 dtype=torch.bfloat16 min=-9.5625 max=5.25
FeedForward output=0 dtype=torch.bfloat16 min=-22.125 max=42.5
FeedForward output=1 dtype=torch.bfloat16 min=-22.375 max=38.0
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-414.0 max=84.0
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-5984.0 max=4048.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-23.25 max=7.09375
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-6336.0 max=4128.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-460.0 max=102.0
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-9.75 max=6.59375
Linear output=1 dtype=torch.bfloat16 min=-9.75 max=6.59375
LayerNorm input=0 dtype=torch.bfloat16 min=-460.0 max=102.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.5 max=13.3125
LayerNorm output=1 dtype=torch.bfloat16 min=-36.0 max=13.375
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-460.0 max=102.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-16.875 max=14.4375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.21875 max=4.53125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.375 max=2.734375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.8828125 max=2.734375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-9.75 max=6.59375
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-26.375 max=23.5
Linear output=1 dtype=torch.bfloat16 min=-26.5 max=23.5
LayerNorm input=0 dtype=torch.bfloat16 min=-6336.0 max=4128.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.75 max=22.0
LayerNorm output=1 dtype=torch.bfloat16 min=-32.75 max=22.0
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-6336.0 max=4128.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-8.75 max=4.125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-15.9375 max=11.4375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.98046875 max=0.9140625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.5 max=4.1875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-26.5 max=23.5
Linear input=0 dtype=torch.bfloat16 min=-16.875 max=14.4375
Linear output=0 dtype=torch.bfloat16 min=-11.1875 max=9.75
Linear output=1 dtype=torch.bfloat16 min=-11.3125 max=9.6875
Linear input=0 dtype=torch.bfloat16 min=-16.875 max=14.4375
Linear output=0 dtype=torch.bfloat16 min=-13.75 max=14.25
Linear output=1 dtype=torch.bfloat16 min=-13.625 max=14.25
Linear input=0 dtype=torch.bfloat16 min=-16.875 max=14.4375
Linear output=0 dtype=torch.bfloat16 min=-12.5625 max=12.625
Linear output=1 dtype=torch.bfloat16 min=-12.0625 max=12.3125
Linear input=0 dtype=torch.bfloat16 min=-8.75 max=4.125
Linear output=0 dtype=torch.bfloat16 min=-8.6875 max=12.4375
Linear output=1 dtype=torch.bfloat16 min=-8.3125 max=9.9375
Linear input=0 dtype=torch.bfloat16 min=-8.75 max=4.125
Linear output=0 dtype=torch.bfloat16 min=-7.65625 max=8.25
Linear output=1 dtype=torch.bfloat16 min=-7.5625 max=6.4375
Linear input=0 dtype=torch.bfloat16 min=-8.75 max=4.125
Linear output=0 dtype=torch.bfloat16 min=-10.3125 max=11.0625
Linear output=1 dtype=torch.bfloat16 min=-10.3125 max=11.0625
Linear input=0 dtype=torch.bfloat16 min=-10.75 max=11.0
Linear output=0 dtype=torch.bfloat16 min=-40.5 max=15.125
Linear output=1 dtype=torch.bfloat16 min=-40.0 max=14.25
Dropout input=0 dtype=torch.bfloat16 min=-40.5 max=15.125
Dropout output=0 dtype=torch.bfloat16 min=-40.5 max=15.125
Dropout output=1 dtype=torch.bfloat16 min=-40.0 max=14.25
Linear input=0 dtype=torch.bfloat16 min=-8.25 max=8.3125
Linear output=0 dtype=torch.bfloat16 min=-41.25 max=36.5
Linear output=1 dtype=torch.bfloat16 min=-39.25 max=38.0
Attention output=0 dtype=torch.bfloat16 min=-40.5 max=15.125
Attention output=1 dtype=torch.bfloat16 min=-41.25 max=38.0
LayerNorm input=0 dtype=torch.bfloat16 min=-588.0 max=96.0
LayerNorm output=0 dtype=torch.bfloat16 min=-37.0 max=8.4375
LayerNorm output=1 dtype=torch.bfloat16 min=-37.25 max=8.5
Linear input=0 dtype=torch.bfloat16 min=-3.078125 max=2.515625
Linear output=0 dtype=torch.bfloat16 min=-5.78125 max=3.671875
Linear output=1 dtype=torch.bfloat16 min=-5.78125 max=3.625
GELU input=0 dtype=torch.bfloat16 min=-3.078125 max=2.515625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.671875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.671875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.671875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.671875
Linear output=0 dtype=torch.bfloat16 min=-15.5625 max=5.4375
Linear output=1 dtype=torch.bfloat16 min=-17.0 max=5.34375
FeedForward input=0 dtype=torch.bfloat16 min=-3.078125 max=2.515625
FeedForward output=0 dtype=torch.bfloat16 min=-15.5625 max=5.4375
FeedForward output=1 dtype=torch.bfloat16 min=-17.0 max=5.34375
LayerNorm input=0 dtype=torch.bfloat16 min=-6400.0 max=3952.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.0 max=20.5
LayerNorm output=1 dtype=torch.bfloat16 min=-33.0 max=20.5
Linear input=0 dtype=torch.bfloat16 min=-9.4375 max=12.75
Linear output=0 dtype=torch.bfloat16 min=-36.5 max=46.75
Linear output=1 dtype=torch.bfloat16 min=-24.375 max=34.25
GELU input=0 dtype=torch.bfloat16 min=-9.4375 max=12.75
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=46.75
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=34.25
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=46.75
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=46.75
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=34.25
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=46.75
Linear output=0 dtype=torch.bfloat16 min=-51.75 max=64.5
Linear output=1 dtype=torch.bfloat16 min=-51.5 max=61.75
FeedForward input=0 dtype=torch.bfloat16 min=-9.4375 max=12.75
FeedForward output=0 dtype=torch.bfloat16 min=-51.75 max=64.5
FeedForward output=1 dtype=torch.bfloat16 min=-51.5 max=61.75
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-460.0 max=102.0
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-6336.0 max=4128.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-23.25 max=7.09375
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-6624.0 max=4576.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-432.0 max=99.0
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-8.9375 max=7.875
Linear output=1 dtype=torch.bfloat16 min=-9.0 max=7.875
LayerNorm input=0 dtype=torch.bfloat16 min=-432.0 max=99.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.0 max=11.5
LayerNorm output=1 dtype=torch.bfloat16 min=-35.5 max=11.5625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-432.0 max=99.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-15.0 max=12.0
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.25 max=4.75
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.84375 max=2.125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.9296875 max=2.453125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-9.0 max=7.875
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-23.625 max=17.5
Linear output=1 dtype=torch.bfloat16 min=-23.75 max=17.625
LayerNorm input=0 dtype=torch.bfloat16 min=-6624.0 max=4576.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.5 max=21.625
LayerNorm output=1 dtype=torch.bfloat16 min=-32.5 max=21.75
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-6624.0 max=4576.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-7.75 max=8.1875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-23.75 max=17.625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.0625 max=1.9140625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-6.15625 max=11.4375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-15.5 max=16.25
Linear input=0 dtype=torch.bfloat16 min=-15.0 max=12.0
Linear output=0 dtype=torch.bfloat16 min=-16.75 max=15.6875
Linear output=1 dtype=torch.bfloat16 min=-16.875 max=15.6875
Linear input=0 dtype=torch.bfloat16 min=-15.0 max=12.0
Linear output=0 dtype=torch.bfloat16 min=-23.125 max=26.125
Linear output=1 dtype=torch.bfloat16 min=-23.125 max=26.125
Linear input=0 dtype=torch.bfloat16 min=-15.0 max=12.0
Linear output=0 dtype=torch.bfloat16 min=-18.0 max=16.0
Linear output=1 dtype=torch.bfloat16 min=-17.75 max=15.875
Linear input=0 dtype=torch.bfloat16 min=-7.75 max=8.1875
Linear output=0 dtype=torch.bfloat16 min=-6.1875 max=9.3125
Linear output=1 dtype=torch.bfloat16 min=-6.34375 max=8.6875
Linear input=0 dtype=torch.bfloat16 min=-7.75 max=8.1875
Linear output=0 dtype=torch.bfloat16 min=-11.9375 max=16.375
Linear output=1 dtype=torch.bfloat16 min=-11.9375 max=16.375
Linear input=0 dtype=torch.bfloat16 min=-7.75 max=8.1875
Linear output=0 dtype=torch.bfloat16 min=-17.625 max=16.125
Linear output=1 dtype=torch.bfloat16 min=-15.625 max=15.1875
Linear input=0 dtype=torch.bfloat16 min=-11.8125 max=11.375
Linear output=0 dtype=torch.bfloat16 min=-40.25 max=21.25
Linear output=1 dtype=torch.bfloat16 min=-39.75 max=21.25
Dropout input=0 dtype=torch.bfloat16 min=-40.25 max=21.25
Dropout output=0 dtype=torch.bfloat16 min=-40.25 max=21.25
Dropout output=1 dtype=torch.bfloat16 min=-39.75 max=21.25
Linear input=0 dtype=torch.bfloat16 min=-16.75 max=15.3125
Linear output=0 dtype=torch.bfloat16 min=-64.5 max=62.75
Linear output=1 dtype=torch.bfloat16 min=-39.5 max=48.75
Attention output=0 dtype=torch.bfloat16 min=-40.25 max=21.25
Attention output=1 dtype=torch.bfloat16 min=-64.5 max=62.75
LayerNorm input=0 dtype=torch.bfloat16 min=-536.0 max=101.5
LayerNorm output=0 dtype=torch.bfloat16 min=-37.0 max=7.96875
LayerNorm output=1 dtype=torch.bfloat16 min=-37.0 max=8.0
Linear input=0 dtype=torch.bfloat16 min=-2.453125 max=2.328125
Linear output=0 dtype=torch.bfloat16 min=-6.46875 max=4.84375
Linear output=1 dtype=torch.bfloat16 min=-6.5 max=4.625
GELU input=0 dtype=torch.bfloat16 min=-2.453125 max=2.328125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.84375
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.84375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.84375
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.84375
Linear output=0 dtype=torch.bfloat16 min=-18.625 max=5.40625
Linear output=1 dtype=torch.bfloat16 min=-18.625 max=5.4375
FeedForward input=0 dtype=torch.bfloat16 min=-2.453125 max=2.328125
FeedForward output=0 dtype=torch.bfloat16 min=-18.625 max=5.40625
FeedForward output=1 dtype=torch.bfloat16 min=-18.625 max=5.4375
LayerNorm input=0 dtype=torch.bfloat16 min=-6752.0 max=4800.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.25 max=22.625
LayerNorm output=1 dtype=torch.bfloat16 min=-32.25 max=22.5
Linear input=0 dtype=torch.bfloat16 min=-17.125 max=12.75
Linear output=0 dtype=torch.bfloat16 min=-156.0 max=153.0
Linear output=1 dtype=torch.bfloat16 min=-53.25 max=180.0
GELU input=0 dtype=torch.bfloat16 min=-17.125 max=12.75
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=153.0
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=180.0
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=180.0
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=153.0
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=180.0
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=180.0
Linear output=0 dtype=torch.bfloat16 min=-442.0 max=1080.0
Linear output=1 dtype=torch.bfloat16 min=-432.0 max=1192.0
FeedForward input=0 dtype=torch.bfloat16 min=-17.125 max=12.75
FeedForward output=0 dtype=torch.bfloat16 min=-442.0 max=1080.0
FeedForward output=1 dtype=torch.bfloat16 min=-432.0 max=1192.0
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-432.0 max=99.0
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-6624.0 max=4576.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-23.25 max=7.09375
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-19072.0 max=9344.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-460.0 max=101.0
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-8.125 max=5.59375
Linear output=1 dtype=torch.bfloat16 min=-8.125 max=5.625
LayerNorm input=0 dtype=torch.bfloat16 min=-460.0 max=101.0
LayerNorm output=0 dtype=torch.bfloat16 min=-35.0 max=10.5
LayerNorm output=1 dtype=torch.bfloat16 min=-35.0 max=10.625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-460.0 max=101.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-13.3125 max=12.0
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.09375 max=4.6875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.125 max=1.625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.21875 max=2.015625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-8.125 max=5.625
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-17.5 max=21.375
Linear output=1 dtype=torch.bfloat16 min=-17.5 max=21.375
LayerNorm input=0 dtype=torch.bfloat16 min=-19072.0 max=9344.0
LayerNorm output=0 dtype=torch.bfloat16 min=-31.25 max=29.5
LayerNorm output=1 dtype=torch.bfloat16 min=-31.5 max=29.125
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-19072.0 max=9344.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-5.4375 max=4.375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-17.5 max=21.375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.078125 max=2.234375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-5.15625 max=7.09375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-4.28125 max=6.8125
Linear input=0 dtype=torch.bfloat16 min=-13.3125 max=12.0
Linear output=0 dtype=torch.bfloat16 min=-11.375 max=9.5625
Linear output=1 dtype=torch.bfloat16 min=-11.4375 max=9.5625
Linear input=0 dtype=torch.bfloat16 min=-13.3125 max=12.0
Linear output=0 dtype=torch.bfloat16 min=-14.0 max=15.8125
Linear output=1 dtype=torch.bfloat16 min=-13.8125 max=15.625
Linear input=0 dtype=torch.bfloat16 min=-13.3125 max=12.0
Linear output=0 dtype=torch.bfloat16 min=-15.5 max=15.1875
Linear output=1 dtype=torch.bfloat16 min=-15.375 max=15.125
Linear input=0 dtype=torch.bfloat16 min=-5.4375 max=4.375
Linear output=0 dtype=torch.bfloat16 min=-8.1875 max=7.1875
Linear output=1 dtype=torch.bfloat16 min=-8.5 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-5.4375 max=4.375
Linear output=0 dtype=torch.bfloat16 min=-7.875 max=7.09375
Linear output=1 dtype=torch.bfloat16 min=-7.96875 max=7.28125
Linear input=0 dtype=torch.bfloat16 min=-5.4375 max=4.375
Linear output=0 dtype=torch.bfloat16 min=-10.8125 max=9.125
Linear output=1 dtype=torch.bfloat16 min=-11.125 max=9.375
Linear input=0 dtype=torch.bfloat16 min=-12.0625 max=12.125
Linear output=0 dtype=torch.bfloat16 min=-29.75 max=14.0625
Linear output=1 dtype=torch.bfloat16 min=-30.5 max=14.1875
Dropout input=0 dtype=torch.bfloat16 min=-30.5 max=14.1875
Dropout output=0 dtype=torch.bfloat16 min=-29.75 max=14.0625
Dropout output=1 dtype=torch.bfloat16 min=-30.5 max=14.1875
Linear input=0 dtype=torch.bfloat16 min=-8.875 max=8.625
Linear output=0 dtype=torch.bfloat16 min=-53.0 max=17.5
Linear output=1 dtype=torch.bfloat16 min=-68.5 max=17.375
Attention output=0 dtype=torch.bfloat16 min=-30.5 max=14.1875
Attention output=1 dtype=torch.bfloat16 min=-68.5 max=17.5
LayerNorm input=0 dtype=torch.bfloat16 min=-504.0 max=99.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.5 max=8.125
LayerNorm output=1 dtype=torch.bfloat16 min=-36.5 max=8.125
Linear input=0 dtype=torch.bfloat16 min=-2.78125 max=2.59375
Linear output=0 dtype=torch.bfloat16 min=-5.375 max=3.5625
Linear output=1 dtype=torch.bfloat16 min=-5.40625 max=3.609375
GELU input=0 dtype=torch.bfloat16 min=-2.78125 max=2.59375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.5625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.609375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.609375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.5625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.609375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.609375
Linear output=0 dtype=torch.bfloat16 min=-26.375 max=5.9375
Linear output=1 dtype=torch.bfloat16 min=-26.125 max=6.09375
FeedForward input=0 dtype=torch.bfloat16 min=-2.78125 max=2.59375
FeedForward output=0 dtype=torch.bfloat16 min=-26.375 max=5.9375
FeedForward output=1 dtype=torch.bfloat16 min=-26.125 max=6.09375
LayerNorm input=0 dtype=torch.bfloat16 min=-20480.0 max=9216.0
LayerNorm output=0 dtype=torch.bfloat16 min=-31.25 max=29.5
LayerNorm output=1 dtype=torch.bfloat16 min=-31.375 max=29.125
Linear input=0 dtype=torch.bfloat16 min=-47.0 max=45.75
Linear output=0 dtype=torch.bfloat16 min=-23.625 max=16.125
Linear output=1 dtype=torch.bfloat16 min=-19.875 max=16.25
GELU input=0 dtype=torch.bfloat16 min=-47.0 max=45.75
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=16.125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=16.25
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=16.25
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=16.125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=16.25
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=16.25
Linear output=0 dtype=torch.bfloat16 min=-608.0 max=1864.0
Linear output=1 dtype=torch.bfloat16 min=-616.0 max=1872.0
FeedForward input=0 dtype=torch.bfloat16 min=-47.0 max=45.75
FeedForward output=0 dtype=torch.bfloat16 min=-608.0 max=1864.0
FeedForward output=1 dtype=torch.bfloat16 min=-616.0 max=1872.0
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-460.0 max=101.0
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-19072.0 max=9344.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-23.25 max=7.09375
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-24704.0 max=22016.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-364.0 max=100.5
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-6.34375 max=7.5
Linear output=1 dtype=torch.bfloat16 min=-6.34375 max=7.5625
LayerNorm input=0 dtype=torch.bfloat16 min=-364.0 max=100.5
LayerNorm output=0 dtype=torch.bfloat16 min=-33.5 max=11.75
LayerNorm output=1 dtype=torch.bfloat16 min=-33.75 max=11.75
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-364.0 max=100.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-13.9375 max=11.625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.09375 max=2.015625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.7734375 max=1.90625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.125 max=1.859375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-6.34375 max=7.5625
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-23.0 max=28.5
Linear output=1 dtype=torch.bfloat16 min=-23.125 max=28.5
LayerNorm input=0 dtype=torch.bfloat16 min=-24704.0 max=22016.0
LayerNorm output=0 dtype=torch.bfloat16 min=-31.75 max=34.5
LayerNorm output=1 dtype=torch.bfloat16 min=-32.0 max=34.5
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-24704.0 max=22016.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-5.46875 max=5.6875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-23.125 max=21.25
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.94140625 max=2.125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-4.0 max=12.125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-21.0 max=28.5
Linear input=0 dtype=torch.bfloat16 min=-13.9375 max=11.625
Linear output=0 dtype=torch.bfloat16 min=-16.625 max=17.25
Linear output=1 dtype=torch.bfloat16 min=-16.375 max=17.25
Linear input=0 dtype=torch.bfloat16 min=-13.9375 max=11.625
Linear output=0 dtype=torch.bfloat16 min=-20.125 max=18.0
Linear output=1 dtype=torch.bfloat16 min=-20.0 max=18.0
Linear input=0 dtype=torch.bfloat16 min=-13.9375 max=11.625
Linear output=0 dtype=torch.bfloat16 min=-15.9375 max=16.75
Linear output=1 dtype=torch.bfloat16 min=-15.8125 max=16.875
Linear input=0 dtype=torch.bfloat16 min=-5.46875 max=5.6875
Linear output=0 dtype=torch.bfloat16 min=-8.0625 max=8.5625
Linear output=1 dtype=torch.bfloat16 min=-7.53125 max=8.125
Linear input=0 dtype=torch.bfloat16 min=-5.46875 max=5.6875
Linear output=0 dtype=torch.bfloat16 min=-6.75 max=7.03125
Linear output=1 dtype=torch.bfloat16 min=-6.8125 max=8.1875
Linear input=0 dtype=torch.bfloat16 min=-5.46875 max=5.6875
Linear output=0 dtype=torch.bfloat16 min=-7.78125 max=8.5625
Linear output=1 dtype=torch.bfloat16 min=-7.84375 max=8.3125
Linear input=0 dtype=torch.bfloat16 min=-13.875 max=16.375
Linear output=0 dtype=torch.bfloat16 min=-19.125 max=57.25
Linear output=1 dtype=torch.bfloat16 min=-18.875 max=56.25
Dropout input=0 dtype=torch.bfloat16 min=-19.125 max=57.25
Dropout output=0 dtype=torch.bfloat16 min=-19.125 max=57.25
Dropout output=1 dtype=torch.bfloat16 min=-18.875 max=56.25
Linear input=0 dtype=torch.bfloat16 min=-7.90625 max=8.5625
Linear output=0 dtype=torch.bfloat16 min=-34.5 max=50.0
Linear output=1 dtype=torch.bfloat16 min=-37.25 max=48.5
Attention output=0 dtype=torch.bfloat16 min=-19.125 max=57.25
Attention output=1 dtype=torch.bfloat16 min=-37.25 max=50.0
LayerNorm input=0 dtype=torch.bfloat16 min=-510.0 max=100.0
LayerNorm output=0 dtype=torch.bfloat16 min=-36.75 max=7.75
LayerNorm output=1 dtype=torch.bfloat16 min=-36.75 max=7.71875
Linear input=0 dtype=torch.bfloat16 min=-2.859375 max=2.765625
Linear output=0 dtype=torch.bfloat16 min=-6.09375 max=4.78125
Linear output=1 dtype=torch.bfloat16 min=-6.0 max=4.84375
GELU input=0 dtype=torch.bfloat16 min=-2.859375 max=2.765625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.78125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.84375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.84375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.78125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.84375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.84375
Linear output=0 dtype=torch.bfloat16 min=-5.875 max=26.75
Linear output=1 dtype=torch.bfloat16 min=-5.625 max=26.875
FeedForward input=0 dtype=torch.bfloat16 min=-2.859375 max=2.765625
FeedForward output=0 dtype=torch.bfloat16 min=-5.875 max=26.75
FeedForward output=1 dtype=torch.bfloat16 min=-5.625 max=26.875
LayerNorm input=0 dtype=torch.bfloat16 min=-25600.0 max=21376.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.5 max=34.25
LayerNorm output=1 dtype=torch.bfloat16 min=-32.25 max=34.25
Linear input=0 dtype=torch.bfloat16 min=-16.875 max=18.25
Linear output=0 dtype=torch.bfloat16 min=-31.625 max=15.6875
Linear output=1 dtype=torch.bfloat16 min=-30.875 max=16.125
GELU input=0 dtype=torch.bfloat16 min=-16.875 max=18.25
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=15.6875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=16.125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=16.125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=15.6875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=16.125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=16.125
Linear output=0 dtype=torch.bfloat16 min=-111.0 max=84.0
Linear output=1 dtype=torch.bfloat16 min=-120.5 max=91.5
FeedForward input=0 dtype=torch.bfloat16 min=-16.875 max=18.25
FeedForward output=0 dtype=torch.bfloat16 min=-111.0 max=84.0
FeedForward output=1 dtype=torch.bfloat16 min=-120.5 max=91.5
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-364.0 max=100.5
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-24704.0 max=22016.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-23.25 max=7.09375
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-28288.0 max=22272.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-352.0 max=100.0
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-7.75 max=12.375
Linear output=1 dtype=torch.bfloat16 min=-7.78125 max=12.4375
LayerNorm input=0 dtype=torch.bfloat16 min=-352.0 max=100.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.75 max=10.6875
LayerNorm output=1 dtype=torch.bfloat16 min=-33.75 max=10.6875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-352.0 max=100.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-19.625 max=13.8125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.34375 max=5.09375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.046875 max=1.890625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.921875 max=1.8515625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-7.78125 max=12.4375
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-14.25 max=15.6875
Linear output=1 dtype=torch.bfloat16 min=-14.25 max=15.6875
LayerNorm input=0 dtype=torch.bfloat16 min=-28288.0 max=22272.0
LayerNorm output=0 dtype=torch.bfloat16 min=-31.375 max=34.75
LayerNorm output=1 dtype=torch.bfloat16 min=-31.875 max=34.75
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-28288.0 max=22272.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-6.0 max=5.03125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-14.25 max=15.6875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-0.9296875 max=0.75
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-5.59375 max=13.6875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-11.3125 max=10.5625
Linear input=0 dtype=torch.bfloat16 min=-19.625 max=13.8125
Linear output=0 dtype=torch.bfloat16 min=-22.875 max=18.125
Linear output=1 dtype=torch.bfloat16 min=-22.625 max=17.875
Linear input=0 dtype=torch.bfloat16 min=-19.625 max=13.8125
Linear output=0 dtype=torch.bfloat16 min=-40.0 max=43.5
Linear output=1 dtype=torch.bfloat16 min=-39.5 max=43.0
Linear input=0 dtype=torch.bfloat16 min=-19.625 max=13.8125
Linear output=0 dtype=torch.bfloat16 min=-18.375 max=19.375
Linear output=1 dtype=torch.bfloat16 min=-18.5 max=19.375
Linear input=0 dtype=torch.bfloat16 min=-6.0 max=5.03125
Linear output=0 dtype=torch.bfloat16 min=-10.0 max=9.6875
Linear output=1 dtype=torch.bfloat16 min=-10.5625 max=9.125
Linear input=0 dtype=torch.bfloat16 min=-6.0 max=5.03125
Linear output=0 dtype=torch.bfloat16 min=-5.90625 max=5.15625
Linear output=1 dtype=torch.bfloat16 min=-5.6875 max=5.46875
Linear input=0 dtype=torch.bfloat16 min=-6.0 max=5.03125
Linear output=0 dtype=torch.bfloat16 min=-7.28125 max=8.4375
Linear output=1 dtype=torch.bfloat16 min=-7.96875 max=9.625
Linear input=0 dtype=torch.bfloat16 min=-12.125 max=11.375
Linear output=0 dtype=torch.bfloat16 min=-61.0 max=22.125
Linear output=1 dtype=torch.bfloat16 min=-60.25 max=21.625
Dropout input=0 dtype=torch.bfloat16 min=-61.0 max=22.125
Dropout output=0 dtype=torch.bfloat16 min=-61.0 max=22.125
Dropout output=1 dtype=torch.bfloat16 min=-60.25 max=21.625
Linear input=0 dtype=torch.bfloat16 min=-6.0625 max=8.0625
Linear output=0 dtype=torch.bfloat16 min=-47.5 max=12.0
Linear output=1 dtype=torch.bfloat16 min=-44.5 max=12.0
Attention output=0 dtype=torch.bfloat16 min=-61.0 max=22.125
Attention output=1 dtype=torch.bfloat16 min=-47.5 max=12.0
LayerNorm input=0 dtype=torch.bfloat16 min=-600.0 max=104.5
LayerNorm output=0 dtype=torch.bfloat16 min=-36.25 max=6.46875
LayerNorm output=1 dtype=torch.bfloat16 min=-36.25 max=6.5625
Linear input=0 dtype=torch.bfloat16 min=-3.375 max=2.703125
Linear output=0 dtype=torch.bfloat16 min=-5.0625 max=4.1875
Linear output=1 dtype=torch.bfloat16 min=-5.0 max=4.125
GELU input=0 dtype=torch.bfloat16 min=-3.375 max=2.703125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.1875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.1875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=4.1875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=4.125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=4.1875
Linear output=0 dtype=torch.bfloat16 min=-7.0625 max=13.6875
Linear output=1 dtype=torch.bfloat16 min=-7.0 max=13.875
FeedForward input=0 dtype=torch.bfloat16 min=-3.375 max=2.703125
FeedForward output=0 dtype=torch.bfloat16 min=-7.0625 max=13.6875
FeedForward output=1 dtype=torch.bfloat16 min=-7.0 max=13.875
LayerNorm input=0 dtype=torch.bfloat16 min=-28416.0 max=22016.0
LayerNorm output=0 dtype=torch.bfloat16 min=-31.125 max=35.0
LayerNorm output=1 dtype=torch.bfloat16 min=-31.5 max=34.75
Linear input=0 dtype=torch.bfloat16 min=-27.625 max=13.375
Linear output=0 dtype=torch.bfloat16 min=-17.5 max=15.25
Linear output=1 dtype=torch.bfloat16 min=-17.25 max=16.875
GELU input=0 dtype=torch.bfloat16 min=-27.625 max=13.375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=15.25
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=16.875
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=16.875
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=15.25
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=16.875
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=16.875
Linear output=0 dtype=torch.bfloat16 min=-376.0 max=636.0
Linear output=1 dtype=torch.bfloat16 min=-362.0 max=636.0
FeedForward input=0 dtype=torch.bfloat16 min=-27.625 max=13.375
FeedForward output=0 dtype=torch.bfloat16 min=-376.0 max=636.0
FeedForward output=1 dtype=torch.bfloat16 min=-362.0 max=636.0
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-352.0 max=100.0
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-28288.0 max=22272.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-23.25 max=7.09375
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-26624.0 max=14848.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-508.0 max=105.5
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-8.0625 max=8.3125
Linear output=1 dtype=torch.bfloat16 min=-8.125 max=8.375
LayerNorm input=0 dtype=torch.bfloat16 min=-508.0 max=105.5
LayerNorm output=0 dtype=torch.bfloat16 min=-34.5 max=7.90625
LayerNorm output=1 dtype=torch.bfloat16 min=-34.5 max=8.0625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-508.0 max=105.5
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-11.9375 max=10.5
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.65625 max=2.390625
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.8203125 max=2.171875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.953125 max=1.703125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-8.125 max=8.375
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-16.25 max=21.0
Linear output=1 dtype=torch.bfloat16 min=-16.375 max=21.125
LayerNorm input=0 dtype=torch.bfloat16 min=-26624.0 max=14848.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.25 max=31.625
LayerNorm output=1 dtype=torch.bfloat16 min=-34.5 max=31.625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-26624.0 max=14848.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-10.0625 max=9.375
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-16.375 max=21.125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.078125 max=1.9453125
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-0.43359375 max=11.0625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-7.84375 max=5.5625
Linear input=0 dtype=torch.bfloat16 min=-11.9375 max=10.5
Linear output=0 dtype=torch.bfloat16 min=-13.5 max=19.125
Linear output=1 dtype=torch.bfloat16 min=-13.625 max=19.125
Linear input=0 dtype=torch.bfloat16 min=-11.9375 max=10.5
Linear output=0 dtype=torch.bfloat16 min=-19.75 max=15.3125
Linear output=1 dtype=torch.bfloat16 min=-19.375 max=15.125
Linear input=0 dtype=torch.bfloat16 min=-11.9375 max=10.5
Linear output=0 dtype=torch.bfloat16 min=-10.6875 max=10.875
Linear output=1 dtype=torch.bfloat16 min=-10.6875 max=10.8125
Linear input=0 dtype=torch.bfloat16 min=-10.0625 max=9.375
Linear output=0 dtype=torch.bfloat16 min=-6.78125 max=8.625
Linear output=1 dtype=torch.bfloat16 min=-6.65625 max=7.875
Linear input=0 dtype=torch.bfloat16 min=-10.0625 max=9.375
Linear output=0 dtype=torch.bfloat16 min=-6.8125 max=6.9375
Linear output=1 dtype=torch.bfloat16 min=-6.78125 max=6.9375
Linear input=0 dtype=torch.bfloat16 min=-10.0625 max=9.375
Linear output=0 dtype=torch.bfloat16 min=-16.5 max=13.3125
Linear output=1 dtype=torch.bfloat16 min=-16.625 max=13.3125
Linear input=0 dtype=torch.bfloat16 min=-8.75 max=9.4375
Linear output=0 dtype=torch.bfloat16 min=-14.1875 max=25.375
Linear output=1 dtype=torch.bfloat16 min=-13.75 max=25.375
Dropout input=0 dtype=torch.bfloat16 min=-14.1875 max=25.375
Dropout output=0 dtype=torch.bfloat16 min=-14.1875 max=25.375
Dropout output=1 dtype=torch.bfloat16 min=-13.75 max=25.375
Linear input=0 dtype=torch.bfloat16 min=-6.65625 max=7.03125
Linear output=0 dtype=torch.bfloat16 min=-22.875 max=15.1875
Linear output=1 dtype=torch.bfloat16 min=-22.5 max=15.9375
Attention output=0 dtype=torch.bfloat16 min=-14.1875 max=25.375
Attention output=1 dtype=torch.bfloat16 min=-22.875 max=15.9375
LayerNorm input=0 dtype=torch.bfloat16 min=-580.0 max=105.5
LayerNorm output=0 dtype=torch.bfloat16 min=-35.5 max=6.625
LayerNorm output=1 dtype=torch.bfloat16 min=-35.5 max=6.65625
Linear input=0 dtype=torch.bfloat16 min=-5.03125 max=3.375
Linear output=0 dtype=torch.bfloat16 min=-5.125 max=3.828125
Linear output=1 dtype=torch.bfloat16 min=-5.03125 max=3.8125
GELU input=0 dtype=torch.bfloat16 min=-5.03125 max=3.375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.828125
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.8125
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.828125
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=3.828125
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=3.8125
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=3.828125
Linear output=0 dtype=torch.bfloat16 min=-15.0 max=23.25
Linear output=1 dtype=torch.bfloat16 min=-15.0 max=23.125
FeedForward input=0 dtype=torch.bfloat16 min=-5.03125 max=3.375
FeedForward output=0 dtype=torch.bfloat16 min=-15.0 max=23.25
FeedForward output=1 dtype=torch.bfloat16 min=-15.0 max=23.125
LayerNorm input=0 dtype=torch.bfloat16 min=-26624.0 max=14784.0
LayerNorm output=0 dtype=torch.bfloat16 min=-34.0 max=31.5
LayerNorm output=1 dtype=torch.bfloat16 min=-34.5 max=31.625
Linear input=0 dtype=torch.bfloat16 min=-45.0 max=47.5
Linear output=0 dtype=torch.bfloat16 min=-8.9375 max=10.875
Linear output=1 dtype=torch.bfloat16 min=-8.875 max=10.9375
GELU input=0 dtype=torch.bfloat16 min=-45.0 max=47.5
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=10.875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.9375
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.9375
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=10.875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=10.9375
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=10.9375
Linear output=0 dtype=torch.bfloat16 min=-812.0 max=560.0
Linear output=1 dtype=torch.bfloat16 min=-812.0 max=564.0
FeedForward input=0 dtype=torch.bfloat16 min=-45.0 max=47.5
FeedForward output=0 dtype=torch.bfloat16 min=-812.0 max=560.0
FeedForward output=1 dtype=torch.bfloat16 min=-812.0 max=564.0
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-508.0 max=105.5
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-26624.0 max=14848.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-23.25 max=7.09375
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-26112.0 max=21120.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-424.0 max=104.0
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-9.0 max=11.0
Linear output=1 dtype=torch.bfloat16 min=-9.0 max=11.0625
LayerNorm input=0 dtype=torch.bfloat16 min=-424.0 max=104.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.5 max=8.5
LayerNorm output=1 dtype=torch.bfloat16 min=-32.5 max=8.5
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-424.0 max=104.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-16.75 max=11.5625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-2.3125 max=4.34375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.46875 max=1.6796875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.515625 max=1.375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-9.0 max=11.0625
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-23.375 max=23.0
Linear output=1 dtype=torch.bfloat16 min=-23.5 max=23.125
LayerNorm input=0 dtype=torch.bfloat16 min=-26112.0 max=21120.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.75 max=33.5
LayerNorm output=1 dtype=torch.bfloat16 min=-33.75 max=33.5
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-26112.0 max=21120.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-8.0 max=5.5625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-23.5 max=23.125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.6328125 max=0.734375
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-4.875 max=11.875
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-16.5 max=16.25
Linear input=0 dtype=torch.bfloat16 min=-16.75 max=11.5625
Linear output=0 dtype=torch.bfloat16 min=-20.875 max=19.625
Linear output=1 dtype=torch.bfloat16 min=-20.75 max=19.625
Linear input=0 dtype=torch.bfloat16 min=-16.75 max=11.5625
Linear output=0 dtype=torch.bfloat16 min=-41.25 max=45.0
Linear output=1 dtype=torch.bfloat16 min=-41.25 max=45.0
Linear input=0 dtype=torch.bfloat16 min=-16.75 max=11.5625
Linear output=0 dtype=torch.bfloat16 min=-23.375 max=25.25
Linear output=1 dtype=torch.bfloat16 min=-23.25 max=25.375
Linear input=0 dtype=torch.bfloat16 min=-8.0 max=5.5625
Linear output=0 dtype=torch.bfloat16 min=-7.75 max=6.34375
Linear output=1 dtype=torch.bfloat16 min=-7.96875 max=6.53125
Linear input=0 dtype=torch.bfloat16 min=-8.0 max=5.5625
Linear output=0 dtype=torch.bfloat16 min=-8.125 max=6.09375
Linear output=1 dtype=torch.bfloat16 min=-8.0 max=6.15625
Linear input=0 dtype=torch.bfloat16 min=-8.0 max=5.5625
Linear output=0 dtype=torch.bfloat16 min=-6.0 max=5.1875
Linear output=1 dtype=torch.bfloat16 min=-6.21875 max=5.40625
Linear input=0 dtype=torch.bfloat16 min=-21.375 max=23.625
Linear output=0 dtype=torch.bfloat16 min=-44.75 max=32.75
Linear output=1 dtype=torch.bfloat16 min=-44.25 max=31.875
Dropout input=0 dtype=torch.bfloat16 min=-44.75 max=32.75
Dropout output=0 dtype=torch.bfloat16 min=-44.75 max=32.75
Dropout output=1 dtype=torch.bfloat16 min=-44.25 max=31.875
Linear input=0 dtype=torch.bfloat16 min=-11.375 max=10.8125
Linear output=0 dtype=torch.bfloat16 min=-54.75 max=61.0
Linear output=1 dtype=torch.bfloat16 min=-54.25 max=57.75
Attention output=0 dtype=torch.bfloat16 min=-44.75 max=32.75
Attention output=1 dtype=torch.bfloat16 min=-54.75 max=61.0
LayerNorm input=0 dtype=torch.bfloat16 min=-596.0 max=139.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.75 max=6.15625
LayerNorm output=1 dtype=torch.bfloat16 min=-33.5 max=6.28125
Linear input=0 dtype=torch.bfloat16 min=-4.59375 max=2.984375
Linear output=0 dtype=torch.bfloat16 min=-8.625 max=7.0
Linear output=1 dtype=torch.bfloat16 min=-8.6875 max=7.0
GELU input=0 dtype=torch.bfloat16 min=-4.59375 max=2.984375
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=7.0
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=7.0
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=7.0
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=7.0
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=7.0
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=7.0
Linear output=0 dtype=torch.bfloat16 min=-10.125 max=13.875
Linear output=1 dtype=torch.bfloat16 min=-10.125 max=13.75
FeedForward input=0 dtype=torch.bfloat16 min=-4.59375 max=2.984375
FeedForward output=0 dtype=torch.bfloat16 min=-10.125 max=13.875
FeedForward output=1 dtype=torch.bfloat16 min=-10.125 max=13.75
LayerNorm input=0 dtype=torch.bfloat16 min=-26496.0 max=20480.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.5 max=33.0
LayerNorm output=1 dtype=torch.bfloat16 min=-33.25 max=33.0
Linear input=0 dtype=torch.bfloat16 min=-24.125 max=17.875
Linear output=0 dtype=torch.bfloat16 min=-28.125 max=19.875
Linear output=1 dtype=torch.bfloat16 min=-28.25 max=21.75
GELU input=0 dtype=torch.bfloat16 min=-24.125 max=17.875
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=19.875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=21.75
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=21.75
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=19.875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=21.75
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=21.75
Linear output=0 dtype=torch.bfloat16 min=-284.0 max=528.0
Linear output=1 dtype=torch.bfloat16 min=-284.0 max=528.0
FeedForward input=0 dtype=torch.bfloat16 min=-24.125 max=17.875
FeedForward output=0 dtype=torch.bfloat16 min=-284.0 max=528.0
FeedForward output=1 dtype=torch.bfloat16 min=-284.0 max=528.0
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-424.0 max=104.0
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-26112.0 max=21120.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-23.25 max=7.09375
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-26240.0 max=15232.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-516.0 max=117.0
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-10.125 max=10.3125
Linear output=1 dtype=torch.bfloat16 min=-10.1875 max=10.375
LayerNorm input=0 dtype=torch.bfloat16 min=-516.0 max=117.0
LayerNorm output=0 dtype=torch.bfloat16 min=-31.0 max=7.375
LayerNorm output=1 dtype=torch.bfloat16 min=-30.875 max=7.34375
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-516.0 max=117.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-15.25 max=10.75
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.25 max=2.875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.7578125 max=3.140625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-2.703125 max=3.25
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-10.1875 max=10.375
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-22.0 max=18.75
Linear output=1 dtype=torch.bfloat16 min=-22.0 max=18.875
LayerNorm input=0 dtype=torch.bfloat16 min=-26240.0 max=15232.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.5 max=29.625
LayerNorm output=1 dtype=torch.bfloat16 min=-34.0 max=29.75
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-26240.0 max=15232.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-7.78125 max=7.28125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-22.0 max=18.875
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.171875 max=1.4296875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.140625 max=16.0
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-3.828125 max=5.4375
Linear input=0 dtype=torch.bfloat16 min=-15.25 max=10.75
Linear output=0 dtype=torch.bfloat16 min=-21.125 max=20.125
Linear output=1 dtype=torch.bfloat16 min=-21.125 max=20.25
Linear input=0 dtype=torch.bfloat16 min=-15.25 max=10.75
Linear output=0 dtype=torch.bfloat16 min=-33.75 max=43.0
Linear output=1 dtype=torch.bfloat16 min=-33.75 max=42.5
Linear input=0 dtype=torch.bfloat16 min=-15.25 max=10.75
Linear output=0 dtype=torch.bfloat16 min=-17.25 max=14.5625
Linear output=1 dtype=torch.bfloat16 min=-17.5 max=14.6875
Linear input=0 dtype=torch.bfloat16 min=-7.78125 max=7.28125
Linear output=0 dtype=torch.bfloat16 min=-8.375 max=8.4375
Linear output=1 dtype=torch.bfloat16 min=-7.5625 max=9.3125
Linear input=0 dtype=torch.bfloat16 min=-7.78125 max=7.28125
Linear output=0 dtype=torch.bfloat16 min=-11.8125 max=13.625
Linear output=1 dtype=torch.bfloat16 min=-12.5 max=13.6875
Linear input=0 dtype=torch.bfloat16 min=-7.78125 max=7.28125
Linear output=0 dtype=torch.bfloat16 min=-11.0 max=10.8125
Linear output=1 dtype=torch.bfloat16 min=-10.125 max=11.0
Linear input=0 dtype=torch.bfloat16 min=-17.0 max=14.125
Linear output=0 dtype=torch.bfloat16 min=-31.375 max=31.75
Linear output=1 dtype=torch.bfloat16 min=-30.75 max=31.375
Dropout input=0 dtype=torch.bfloat16 min=-31.375 max=31.75
Dropout output=0 dtype=torch.bfloat16 min=-31.375 max=31.75
Dropout output=1 dtype=torch.bfloat16 min=-30.75 max=31.375
Linear input=0 dtype=torch.bfloat16 min=-10.625 max=10.5625
Linear output=0 dtype=torch.bfloat16 min=-38.75 max=52.25
Linear output=1 dtype=torch.bfloat16 min=-41.25 max=55.25
Attention output=0 dtype=torch.bfloat16 min=-31.375 max=31.75
Attention output=1 dtype=torch.bfloat16 min=-41.25 max=55.25
LayerNorm input=0 dtype=torch.bfloat16 min=-612.0 max=122.0
LayerNorm output=0 dtype=torch.bfloat16 min=-31.625 max=7.0
LayerNorm output=1 dtype=torch.bfloat16 min=-31.5 max=7.0625
Linear input=0 dtype=torch.bfloat16 min=-5.375 max=3.265625
Linear output=0 dtype=torch.bfloat16 min=-5.46875 max=5.0
Linear output=1 dtype=torch.bfloat16 min=-5.5625 max=5.0625
GELU input=0 dtype=torch.bfloat16 min=-5.375 max=3.265625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.0
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=5.0625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.0625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.0
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=5.0625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=5.0625
Linear output=0 dtype=torch.bfloat16 min=-22.125 max=15.3125
Linear output=1 dtype=torch.bfloat16 min=-22.0 max=15.3125
FeedForward input=0 dtype=torch.bfloat16 min=-5.375 max=3.265625
FeedForward output=0 dtype=torch.bfloat16 min=-22.125 max=15.3125
FeedForward output=1 dtype=torch.bfloat16 min=-22.0 max=15.3125
LayerNorm input=0 dtype=torch.bfloat16 min=-26240.0 max=15168.0
LayerNorm output=0 dtype=torch.bfloat16 min=-33.5 max=29.5
LayerNorm output=1 dtype=torch.bfloat16 min=-33.75 max=29.625
Linear input=0 dtype=torch.bfloat16 min=-63.75 max=60.5
Linear output=0 dtype=torch.bfloat16 min=-22.125 max=38.0
Linear output=1 dtype=torch.bfloat16 min=-22.5 max=33.5
GELU input=0 dtype=torch.bfloat16 min=-63.75 max=60.5
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=38.0
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=33.5
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=38.0
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=38.0
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=33.5
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=38.0
Linear output=0 dtype=torch.bfloat16 min=-1872.0 max=1872.0
Linear output=1 dtype=torch.bfloat16 min=-1400.0 max=1416.0
FeedForward input=0 dtype=torch.bfloat16 min=-63.75 max=60.5
FeedForward output=0 dtype=torch.bfloat16 min=-1872.0 max=1872.0
FeedForward output=1 dtype=torch.bfloat16 min=-1400.0 max=1416.0
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-516.0 max=117.0
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-26240.0 max=15232.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-23.25 max=7.09375
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-25216.0 max=15552.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-498.0 max=127.0
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-14.6875 max=14.8125
Linear output=1 dtype=torch.bfloat16 min=-14.6875 max=14.875
LayerNorm input=0 dtype=torch.bfloat16 min=-498.0 max=127.0
LayerNorm output=0 dtype=torch.bfloat16 min=-27.25 max=8.4375
LayerNorm output=1 dtype=torch.bfloat16 min=-27.0 max=8.5625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-498.0 max=127.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-14.125 max=10.625
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-3.390625 max=4.3125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.2421875 max=3.5625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-4.03125 max=2.75
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-14.6875 max=14.875
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-17.75 max=20.5
Linear output=1 dtype=torch.bfloat16 min=-17.875 max=20.5
LayerNorm input=0 dtype=torch.bfloat16 min=-25216.0 max=15552.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.75 max=29.0
LayerNorm output=1 dtype=torch.bfloat16 min=-32.75 max=29.125
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-25216.0 max=15552.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-6.71875 max=5.1875
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-17.875 max=20.5
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-2.640625 max=1.8671875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-3.65625 max=14.8125
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-1.609375 max=1.8046875
Linear input=0 dtype=torch.bfloat16 min=-14.125 max=10.625
Linear output=0 dtype=torch.bfloat16 min=-14.375 max=15.25
Linear output=1 dtype=torch.bfloat16 min=-14.4375 max=15.1875
Linear input=0 dtype=torch.bfloat16 min=-14.125 max=10.625
Linear output=0 dtype=torch.bfloat16 min=-38.25 max=26.25
Linear output=1 dtype=torch.bfloat16 min=-37.75 max=26.75
Linear input=0 dtype=torch.bfloat16 min=-14.125 max=10.625
Linear output=0 dtype=torch.bfloat16 min=-20.375 max=18.25
Linear output=1 dtype=torch.bfloat16 min=-20.0 max=18.375
Linear input=0 dtype=torch.bfloat16 min=-6.71875 max=5.1875
Linear output=0 dtype=torch.bfloat16 min=-7.28125 max=6.8125
Linear output=1 dtype=torch.bfloat16 min=-7.4375 max=6.53125
Linear input=0 dtype=torch.bfloat16 min=-6.71875 max=5.1875
Linear output=0 dtype=torch.bfloat16 min=-11.0 max=15.0
Linear output=1 dtype=torch.bfloat16 min=-10.4375 max=15.1875
Linear input=0 dtype=torch.bfloat16 min=-6.71875 max=5.1875
Linear output=0 dtype=torch.bfloat16 min=-9.3125 max=10.4375
Linear output=1 dtype=torch.bfloat16 min=-10.5625 max=10.5
Linear input=0 dtype=torch.bfloat16 min=-19.75 max=16.375
Linear output=0 dtype=torch.bfloat16 min=-39.5 max=29.5
Linear output=1 dtype=torch.bfloat16 min=-39.5 max=29.625
Dropout input=0 dtype=torch.bfloat16 min=-39.5 max=29.625
Dropout output=0 dtype=torch.bfloat16 min=-39.5 max=29.5
Dropout output=1 dtype=torch.bfloat16 min=-39.5 max=29.625
Linear input=0 dtype=torch.bfloat16 min=-9.625 max=10.375
Linear output=0 dtype=torch.bfloat16 min=-17.375 max=16.75
Linear output=1 dtype=torch.bfloat16 min=-19.75 max=18.0
Attention output=0 dtype=torch.bfloat16 min=-39.5 max=29.625
Attention output=1 dtype=torch.bfloat16 min=-19.75 max=18.0
LayerNorm input=0 dtype=torch.bfloat16 min=-568.0 max=156.0
LayerNorm output=0 dtype=torch.bfloat16 min=-28.5 max=7.03125
LayerNorm output=1 dtype=torch.bfloat16 min=-28.375 max=7.09375
Linear input=0 dtype=torch.bfloat16 min=-6.125 max=4.625
Linear output=0 dtype=torch.bfloat16 min=-5.59375 max=8.5
Linear output=1 dtype=torch.bfloat16 min=-5.59375 max=8.5
GELU input=0 dtype=torch.bfloat16 min=-6.125 max=4.625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.5
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=8.5
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=8.5
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=8.5
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=8.5
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=8.5
Linear output=0 dtype=torch.bfloat16 min=-25.5 max=68.0
Linear output=1 dtype=torch.bfloat16 min=-25.625 max=68.5
FeedForward input=0 dtype=torch.bfloat16 min=-6.125 max=4.625
FeedForward output=0 dtype=torch.bfloat16 min=-25.5 max=68.0
FeedForward output=1 dtype=torch.bfloat16 min=-25.625 max=68.5
LayerNorm input=0 dtype=torch.bfloat16 min=-25216.0 max=15552.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.75 max=29.0
LayerNorm output=1 dtype=torch.bfloat16 min=-32.75 max=29.125
Linear input=0 dtype=torch.bfloat16 min=-90.5 max=73.5
Linear output=0 dtype=torch.bfloat16 min=-25.375 max=81.5
Linear output=1 dtype=torch.bfloat16 min=-27.25 max=83.5
GELU input=0 dtype=torch.bfloat16 min=-90.5 max=73.5
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=81.5
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=83.5
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=83.5
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=81.5
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=83.5
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=83.5
Linear output=0 dtype=torch.bfloat16 min=-4048.0 max=2304.0
Linear output=1 dtype=torch.bfloat16 min=-4128.0 max=2368.0
FeedForward input=0 dtype=torch.bfloat16 min=-90.5 max=73.5
FeedForward output=0 dtype=torch.bfloat16 min=-4048.0 max=2304.0
FeedForward output=1 dtype=torch.bfloat16 min=-4128.0 max=2368.0
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-498.0 max=127.0
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-25216.0 max=15552.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-23.25 max=7.09375
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-24960.0 max=15424.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-222.0 max=170.0
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-17.625 max=17.375
Linear output=1 dtype=torch.bfloat16 min=-17.75 max=17.375
LayerNorm input=0 dtype=torch.bfloat16 min=-222.0 max=170.0
LayerNorm output=0 dtype=torch.bfloat16 min=-15.1875 max=8.0
LayerNorm output=1 dtype=torch.bfloat16 min=-14.625 max=8.0625
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-222.0 max=170.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-17.625 max=14.8125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-4.78125 max=5.125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.09375 max=3.546875
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-4.78125 max=3.640625
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-17.75 max=17.375
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-22.5 max=22.0
Linear output=1 dtype=torch.bfloat16 min=-22.5 max=22.0
LayerNorm input=0 dtype=torch.bfloat16 min=-24960.0 max=15424.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.5 max=29.0
LayerNorm output=1 dtype=torch.bfloat16 min=-32.75 max=29.0
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-24960.0 max=15424.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-6.59375 max=8.3125
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-22.5 max=14.4375
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-3.46875 max=4.75
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-8.6875 max=22.0
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-3.203125 max=2.640625
Linear input=0 dtype=torch.bfloat16 min=-17.625 max=14.8125
Linear output=0 dtype=torch.bfloat16 min=-22.625 max=24.25
Linear output=1 dtype=torch.bfloat16 min=-22.5 max=24.0
Linear input=0 dtype=torch.bfloat16 min=-17.625 max=14.8125
Linear output=0 dtype=torch.bfloat16 min=-32.75 max=37.75
Linear output=1 dtype=torch.bfloat16 min=-32.25 max=37.75
Linear input=0 dtype=torch.bfloat16 min=-17.625 max=14.8125
Linear output=0 dtype=torch.bfloat16 min=-24.875 max=25.875
Linear output=1 dtype=torch.bfloat16 min=-25.125 max=26.0
Linear input=0 dtype=torch.bfloat16 min=-6.59375 max=8.3125
Linear output=0 dtype=torch.bfloat16 min=-7.125 max=6.4375
Linear output=1 dtype=torch.bfloat16 min=-7.125 max=6.59375
Linear input=0 dtype=torch.bfloat16 min=-6.59375 max=8.3125
Linear output=0 dtype=torch.bfloat16 min=-16.625 max=16.25
Linear output=1 dtype=torch.bfloat16 min=-16.375 max=16.375
Linear input=0 dtype=torch.bfloat16 min=-6.59375 max=8.3125
Linear output=0 dtype=torch.bfloat16 min=-11.3125 max=9.75
Linear output=1 dtype=torch.bfloat16 min=-10.9375 max=9.9375
Linear input=0 dtype=torch.bfloat16 min=-13.4375 max=13.375
Linear output=0 dtype=torch.bfloat16 min=-30.25 max=31.375
Linear output=1 dtype=torch.bfloat16 min=-30.5 max=31.75
Dropout input=0 dtype=torch.bfloat16 min=-30.5 max=31.75
Dropout output=0 dtype=torch.bfloat16 min=-30.25 max=31.375
Dropout output=1 dtype=torch.bfloat16 min=-30.5 max=31.75
Linear input=0 dtype=torch.bfloat16 min=-6.25 max=7.5625
Linear output=0 dtype=torch.bfloat16 min=-11.375 max=23.0
Linear output=1 dtype=torch.bfloat16 min=-12.125 max=22.0
Attention output=0 dtype=torch.bfloat16 min=-30.5 max=31.75
Attention output=1 dtype=torch.bfloat16 min=-12.125 max=23.0
LayerNorm input=0 dtype=torch.bfloat16 min=-232.0 max=172.0
LayerNorm output=0 dtype=torch.bfloat16 min=-12.5 max=7.21875
LayerNorm output=1 dtype=torch.bfloat16 min=-11.875 max=7.0625
Linear input=0 dtype=torch.bfloat16 min=-5.125 max=4.15625
Linear output=0 dtype=torch.bfloat16 min=-6.34375 max=6.5625
Linear output=1 dtype=torch.bfloat16 min=-6.3125 max=6.5625
GELU input=0 dtype=torch.bfloat16 min=-5.125 max=4.15625
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=6.5625
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=6.5625
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.5625
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=6.5625
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=6.5625
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.5625
Linear output=0 dtype=torch.bfloat16 min=-24.625 max=13.75
Linear output=1 dtype=torch.bfloat16 min=-24.75 max=13.75
FeedForward input=0 dtype=torch.bfloat16 min=-5.125 max=4.15625
FeedForward output=0 dtype=torch.bfloat16 min=-24.625 max=13.75
FeedForward output=1 dtype=torch.bfloat16 min=-24.75 max=13.75
LayerNorm input=0 dtype=torch.bfloat16 min=-24960.0 max=15424.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.5 max=28.875
LayerNorm output=1 dtype=torch.bfloat16 min=-32.75 max=29.0
Linear input=0 dtype=torch.bfloat16 min=-220.0 max=256.0
Linear output=0 dtype=torch.bfloat16 min=-77.0 max=280.0
Linear output=1 dtype=torch.bfloat16 min=-80.5 max=282.0
GELU input=0 dtype=torch.bfloat16 min=-220.0 max=256.0
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=280.0
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=282.0
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=282.0
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=280.0
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=282.0
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=282.0
Linear output=0 dtype=torch.bfloat16 min=-19456.0 max=30592.0
Linear output=1 dtype=torch.bfloat16 min=-19584.0 max=30336.0
FeedForward input=0 dtype=torch.bfloat16 min=-220.0 max=256.0
FeedForward output=0 dtype=torch.bfloat16 min=-19456.0 max=30592.0
FeedForward output=1 dtype=torch.bfloat16 min=-19584.0 max=30336.0
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-222.0 max=170.0
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-24960.0 max=15424.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-23.25 max=7.09375
JointTransformerBlock output=0 dtype=torch.bfloat16 min=-21376.0 max=7072.0
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-175.0 max=164.0
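(Annotation, not part of the captured trace: these residual-stream magnitudes are near the edge of what bfloat16 resolves; with only 8 significand bits, the gap between adjacent representable values is already 128 around magnitude 30000, and every extreme logged above sits exactly on that coarse grid. A quick self-contained check, for illustration only:)

import torch

# bfloat16 keeps 8 significand bits, so the ULP (gap between adjacent
# representable values) grows quickly with magnitude.
for v in (30592.0, -24960.0, -21376.0, -19456.0, 15424.0, 7072.0):
    t = torch.tensor(v, dtype=torch.bfloat16)
    ulp = 2.0 ** (torch.floor(torch.log2(t.abs().float())).item() - 7)
    print(f"{v:>9} exactly_representable={t.item() == v} ulp={ulp:.0f}")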
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-15.3125 max=17.375
Linear output=1 dtype=torch.bfloat16 min=-15.375 max=17.25
LayerNorm input=0 dtype=torch.bfloat16 min=-175.0 max=164.0
LayerNorm output=0 dtype=torch.bfloat16 min=-8.9375 max=6.59375
LayerNorm output=1 dtype=torch.bfloat16 min=-8.3125 max=6.71875
AdaLayerNormZero input=0 dtype=torch.bfloat16 min=-175.0 max=164.0
AdaLayerNormZero output=0 dtype=torch.bfloat16 min=-12.6875 max=10.75
AdaLayerNormZero output=1 dtype=torch.bfloat16 min=-7.46875 max=10.3125
AdaLayerNormZero output=2 dtype=torch.bfloat16 min=-1.0546875 max=1.265625
AdaLayerNormZero output=3 dtype=torch.bfloat16 min=-1.65625 max=3.984375
AdaLayerNormZero output=4 dtype=torch.bfloat16 min=-15.375 max=17.375
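(Annotation: the SiLU -> Linear -> LayerNorm lines above and the five outputs that follow are diffusers' AdaLayerNormZero; the sketch below reconstructs its forward pass from memory, so treat names and shapes as approximate rather than the exact library code:)

import torch
from torch import nn

class AdaLNZeroSketch(nn.Module):
    # The timestep embedding is pushed through SiLU + Linear to produce six
    # modulation vectors; shift/scale for attention are consumed immediately,
    # and the remaining four are handed back to the transformer block.
    def __init__(self, dim):
        super().__init__()
        self.silu = nn.SiLU()
        self.linear = nn.Linear(dim, 6 * dim)
        self.norm = nn.LayerNorm(dim, elementwise_affine=False, eps=1e-6)

    def forward(self, x, temb):
        shift_msa, scale_msa, gate_msa, shift_mlp, scale_mlp, gate_mlp = (
            self.linear(self.silu(temb)).chunk(6, dim=1)
        )
        x = self.norm(x) * (1 + scale_msa[:, None]) + shift_msa[:, None]
        # Five return values = the five "AdaLayerNormZero output=..." lines.
        return x, gate_msa, shift_mlp, scale_mlp, gate_mlp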
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-4.15625 max=6.59375
Linear output=1 dtype=torch.bfloat16 min=-4.15625 max=6.5625
LayerNorm input=0 dtype=torch.bfloat16 min=-21376.0 max=7072.0
LayerNorm output=0 dtype=torch.bfloat16 min=-32.25 max=17.5
LayerNorm output=1 dtype=torch.bfloat16 min=-32.25 max=17.25
AdaLayerNormContinuous input=0 dtype=torch.bfloat16 min=-21376.0 max=7072.0
AdaLayerNormContinuous input=1 dtype=torch.bfloat16 min=-23.25 max=7.09375
AdaLayerNormContinuous output=0 dtype=torch.bfloat16 min=-5.75 max=7.8125
AdaLayerNormContinuous output=1 dtype=torch.bfloat16 min=-5.71875 max=8.375
Linear input=0 dtype=torch.bfloat16 min=-12.6875 max=10.75
Linear output=0 dtype=torch.bfloat16 min=-14.1875 max=20.0
Linear output=1 dtype=torch.bfloat16 min=-14.125 max=20.0
Linear input=0 dtype=torch.bfloat16 min=-12.6875 max=10.75
Linear output=0 dtype=torch.bfloat16 min=-15.0 max=14.1875
Linear output=1 dtype=torch.bfloat16 min=-14.6875 max=14.0
Linear input=0 dtype=torch.bfloat16 min=-12.6875 max=10.75
Linear output=0 dtype=torch.bfloat16 min=-10.125 max=11.0
Linear output=1 dtype=torch.bfloat16 min=-10.125 max=11.125
Linear input=0 dtype=torch.bfloat16 min=-5.75 max=8.375
Linear output=0 dtype=torch.bfloat16 min=-1.2109375 max=1.0625
Linear output=1 dtype=torch.bfloat16 min=-1.1328125 max=1.0
Linear input=0 dtype=torch.bfloat16 min=-5.75 max=8.375
Linear output=0 dtype=torch.bfloat16 min=-9.625 max=8.5
Linear output=1 dtype=torch.bfloat16 min=-9.625 max=8.25
Linear input=0 dtype=torch.bfloat16 min=-5.75 max=8.375
Linear output=0 dtype=torch.bfloat16 min=-7.09375 max=7.34375
Linear output=1 dtype=torch.bfloat16 min=-7.125 max=7.4375
Linear input=0 dtype=torch.bfloat16 min=-8.25 max=8.125
Linear output=0 dtype=torch.bfloat16 min=-25.5 max=36.5
Linear output=1 dtype=torch.bfloat16 min=-25.375 max=36.5
Dropout input=0 dtype=torch.bfloat16 min=-25.5 max=36.5
Dropout output=0 dtype=torch.bfloat16 min=-25.5 max=36.5
Dropout output=1 dtype=torch.bfloat16 min=-25.375 max=36.5
Attention output=0 dtype=torch.bfloat16 min=-25.5 max=36.5
Attention output=1 dtype=torch.bfloat16 min=-6.1875 max=5.34375
LayerNorm input=0 dtype=torch.bfloat16 min=-286.0 max=175.0
LayerNorm output=0 dtype=torch.bfloat16 min=-12.5625 max=7.28125
LayerNorm output=1 dtype=torch.bfloat16 min=-12.0625 max=7.375
Linear input=0 dtype=torch.bfloat16 min=-7.6875 max=4.03125
Linear output=0 dtype=torch.bfloat16 min=-5.09375 max=5.96875
Linear output=1 dtype=torch.bfloat16 min=-5.03125 max=6.0
GELU input=0 dtype=torch.bfloat16 min=-7.6875 max=4.03125
GELU output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.96875
GELU output=1 dtype=torch.bfloat16 min=-0.169921875 max=6.0
Dropout input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.0
Dropout output=0 dtype=torch.bfloat16 min=-0.169921875 max=5.96875
Dropout output=1 dtype=torch.bfloat16 min=-0.169921875 max=6.0
Linear input=0 dtype=torch.bfloat16 min=-0.169921875 max=6.0
Linear output=0 dtype=torch.bfloat16 min=-29.125 max=57.75
Linear output=1 dtype=torch.bfloat16 min=-29.125 max=58.25
FeedForward input=0 dtype=torch.bfloat16 min=-7.6875 max=4.03125
FeedForward output=0 dtype=torch.bfloat16 min=-29.125 max=57.75
FeedForward output=1 dtype=torch.bfloat16 min=-29.125 max=58.25
JointTransformerBlock input=0 dtype=torch.bfloat16 min=-175.0 max=164.0
JointTransformerBlock input=1 dtype=torch.bfloat16 min=-21376.0 max=7072.0
JointTransformerBlock input=2 dtype=torch.bfloat16 min=-23.25 max=7.09375
JointTransformerBlock output=1 dtype=torch.bfloat16 min=-488.0 max=252.0
SiLU input=0 dtype=torch.bfloat16 min=-23.25 max=7.09375
SiLU output=0 dtype=torch.bfloat16 min=-0.27734375 max=7.09375
SiLU output=1 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Linear input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.09375
Linear output=0 dtype=torch.bfloat16 min=-1.5 max=6.3125
Linear output=1 dtype=torch.bfloat16 min=-1.5078125 max=6.34375
LayerNorm input=0 dtype=torch.bfloat16 min=-488.0 max=252.0
LayerNorm output=0 dtype=torch.bfloat16 min=-14.5625 max=7.4375
LayerNorm output=1 dtype=torch.bfloat16 min=-14.1875 max=7.4375
AdaLayerNormContinuous input=0 dtype=torch.bfloat16 min=-488.0 max=252.0
AdaLayerNormContinuous input=1 dtype=torch.bfloat16 min=-23.25 max=7.09375
AdaLayerNormContinuous output=0 dtype=torch.bfloat16 min=-5.0 max=3.703125
AdaLayerNormContinuous output=1 dtype=torch.bfloat16 min=-5.0625 max=3.625
Linear input=0 dtype=torch.bfloat16 min=-5.0625 max=3.703125
Linear output=0 dtype=torch.bfloat16 min=-2.84375 max=3.640625
Linear output=1 dtype=torch.bfloat16 min=-2.875 max=3.59375
SD3Transformer2DModel output=0 dtype=torch.bfloat16 min=-2.875 max=3.640625
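(Annotation: per-module lines of this shape can be captured with ordinary PyTorch forward hooks; the helper below is a minimal sketch for reproduction, not the exact script behind this trace:)

import torch

def _log(tag, i, t):
    if torch.is_tensor(t):
        print(f"{tag}={i} dtype={t.dtype} min={t.min().item()} max={t.max().item()}")

def attach_minmax_hooks(model):
    # Hook every submodule, containers included — which is why both e.g.
    # Linear and the enclosing JointTransformerBlock appear in the log.
    def hook(module, inputs, outputs):
        name = module.__class__.__name__
        for i, t in enumerate(inputs):
            _log(f"{name} input", i, t)
        outs = outputs if isinstance(outputs, tuple) else (outputs,)
        for i, t in enumerate(outs):
            _log(f"{name} output", i, t)
    for m in model.modules():
        m.register_forward_hook(hook)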
0%| | 0/2 [00:52<?, ?it/s]
Conv2d input=0 dtype=torch.bfloat16 min=-3.21875 max=3.28125
Conv2d output=0 dtype=torch.bfloat16 min=-2.59375 max=2.28125
GroupNorm input=0 dtype=torch.bfloat16 min=-2.59375 max=2.28125
GroupNorm output=0 dtype=torch.bfloat16 min=-5.1875 max=5.15625
SiLU input=0 dtype=torch.bfloat16 min=-5.1875 max=5.15625
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=5.125
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=5.125
Conv2d output=0 dtype=torch.bfloat16 min=-14.3125 max=2.859375
GroupNorm input=0 dtype=torch.bfloat16 min=-14.3125 max=2.859375
GroupNorm output=0 dtype=torch.bfloat16 min=-5.78125 max=1.859375
SiLU input=0 dtype=torch.bfloat16 min=-5.78125 max=1.859375
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=1.609375
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=1.609375
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=1.609375
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=1.609375
Conv2d output=0 dtype=torch.bfloat16 min=-1.421875 max=1.046875
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-2.59375 max=2.28125
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-3.09375 max=2.515625
GroupNorm input=0 dtype=torch.bfloat16 min=-3.09375 max=2.515625
GroupNorm output=0 dtype=torch.bfloat16 min=-5.34375 max=4.40625
Linear input=0 dtype=torch.bfloat16 min=-5.34375 max=4.40625
Linear output=0 dtype=torch.bfloat16 min=-6.34375 max=5.5
Linear input=0 dtype=torch.bfloat16 min=-5.34375 max=4.40625
Linear output=0 dtype=torch.bfloat16 min=-5.25 max=5.6875
Linear input=0 dtype=torch.bfloat16 min=-5.34375 max=4.40625
Linear output=0 dtype=torch.bfloat16 min=-2.484375 max=2.953125
Linear input=0 dtype=torch.bfloat16 min=-1.234375 max=1.4609375
Linear output=0 dtype=torch.bfloat16 min=-0.8828125 max=0.78515625
Dropout input=0 dtype=torch.bfloat16 min=-0.8828125 max=0.78515625
Dropout output=0 dtype=torch.bfloat16 min=-0.8828125 max=0.78515625
Attention input=0 dtype=torch.bfloat16 min=-3.09375 max=2.515625
Attention output=0 dtype=torch.bfloat16 min=-3.84375 max=2.40625
GroupNorm input=0 dtype=torch.bfloat16 min=-3.84375 max=2.40625
GroupNorm output=0 dtype=torch.bfloat16 min=-5.34375 max=3.796875
SiLU input=0 dtype=torch.bfloat16 min=-5.34375 max=3.796875
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=3.71875
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=3.71875
Conv2d output=0 dtype=torch.bfloat16 min=-7.15625 max=1.671875
GroupNorm input=0 dtype=torch.bfloat16 min=-7.15625 max=1.671875
GroupNorm output=0 dtype=torch.bfloat16 min=-5.5625 max=2.109375
SiLU input=0 dtype=torch.bfloat16 min=-5.5625 max=2.109375
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=1.8828125
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=1.8828125
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=1.8828125
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=1.8828125
Conv2d output=0 dtype=torch.bfloat16 min=-1.3359375 max=1.203125
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-3.84375 max=2.40625
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-4.0 max=2.484375
UNetMidBlock2D input=0 dtype=torch.bfloat16 min=-2.59375 max=2.28125
UNetMidBlock2D output=0 dtype=torch.bfloat16 min=-4.0 max=2.484375
GroupNorm input=0 dtype=torch.bfloat16 min=-4.0 max=2.484375
GroupNorm output=0 dtype=torch.bfloat16 min=-5.34375 max=4.1875
SiLU input=0 dtype=torch.bfloat16 min=-5.34375 max=4.1875
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=4.125
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.125
Conv2d output=0 dtype=torch.bfloat16 min=-6.46875 max=1.6171875
GroupNorm input=0 dtype=torch.bfloat16 min=-6.46875 max=1.6171875
GroupNorm output=0 dtype=torch.bfloat16 min=-5.625 max=2.15625
SiLU input=0 dtype=torch.bfloat16 min=-5.625 max=2.15625
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=1.9296875
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=1.9296875
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=1.9296875
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=1.9296875
Conv2d output=0 dtype=torch.bfloat16 min=-1.359375 max=1.40625
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-4.0 max=2.484375
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-4.15625 max=2.625
GroupNorm input=0 dtype=torch.bfloat16 min=-4.15625 max=2.625
GroupNorm output=0 dtype=torch.bfloat16 min=-5.46875 max=5.03125
SiLU input=0 dtype=torch.bfloat16 min=-5.46875 max=5.03125
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=5.0
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=5.0
Conv2d output=0 dtype=torch.bfloat16 min=-7.5625 max=1.734375
GroupNorm input=0 dtype=torch.bfloat16 min=-7.5625 max=1.734375
GroupNorm output=0 dtype=torch.bfloat16 min=-5.75 max=2.984375
SiLU input=0 dtype=torch.bfloat16 min=-5.75 max=2.984375
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=2.84375
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=2.84375
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=2.84375
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=2.84375
Conv2d output=0 dtype=torch.bfloat16 min=-1.015625 max=1.1953125
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-4.15625 max=2.625
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-3.796875 max=2.734375
GroupNorm input=0 dtype=torch.bfloat16 min=-3.796875 max=2.734375
GroupNorm output=0 dtype=torch.bfloat16 min=-6.34375 max=5.46875
SiLU input=0 dtype=torch.bfloat16 min=-6.34375 max=5.46875
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=5.4375
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=5.4375
Conv2d output=0 dtype=torch.bfloat16 min=-7.53125 max=1.6015625
GroupNorm input=0 dtype=torch.bfloat16 min=-7.53125 max=1.6015625
GroupNorm output=0 dtype=torch.bfloat16 min=-5.65625 max=1.921875
SiLU input=0 dtype=torch.bfloat16 min=-5.65625 max=1.921875
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=1.6796875
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=1.6796875
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=1.6796875
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=1.6796875
Conv2d output=0 dtype=torch.bfloat16 min=-1.015625 max=1.7109375
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-3.796875 max=2.734375
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-3.28125 max=2.8125
Conv2d input=0 dtype=torch.bfloat16 min=-3.28125 max=2.8125
Conv2d output=0 dtype=torch.bfloat16 min=-3.875 max=2.859375
Upsample2D input=0 dtype=torch.bfloat16 min=-3.28125 max=2.8125
Upsample2D output=0 dtype=torch.bfloat16 min=-3.875 max=2.859375
UpDecoderBlock2D input=0 dtype=torch.bfloat16 min=-4.0 max=2.484375
UpDecoderBlock2D output=0 dtype=torch.bfloat16 min=-3.875 max=2.859375
GroupNorm input=0 dtype=torch.bfloat16 min=-3.875 max=2.859375
GroupNorm output=0 dtype=torch.bfloat16 min=-13.125 max=13.3125
SiLU input=0 dtype=torch.bfloat16 min=-13.125 max=13.3125
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=13.3125
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=13.3125
Conv2d output=0 dtype=torch.bfloat16 min=-6.34375 max=2.3125
GroupNorm input=0 dtype=torch.bfloat16 min=-6.34375 max=2.3125
GroupNorm output=0 dtype=torch.bfloat16 min=-10.75 max=4.71875
SiLU input=0 dtype=torch.bfloat16 min=-10.75 max=4.71875
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=4.6875
Conv2d output=0 dtype=torch.bfloat16 min=-2.859375 max=1.3984375
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-3.875 max=2.859375
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-5.5 max=3.4375
GroupNorm input=0 dtype=torch.bfloat16 min=-5.5 max=3.4375
GroupNorm output=0 dtype=torch.bfloat16 min=-7.34375 max=7.03125
SiLU input=0 dtype=torch.bfloat16 min=-7.34375 max=7.03125
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.03125
Conv2d output=0 dtype=torch.bfloat16 min=-3.921875 max=2.65625
GroupNorm input=0 dtype=torch.bfloat16 min=-3.921875 max=2.65625
GroupNorm output=0 dtype=torch.bfloat16 min=-10.0 max=7.53125
SiLU input=0 dtype=torch.bfloat16 min=-10.0 max=7.53125
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=7.53125
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.53125
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=7.53125
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.53125
Conv2d output=0 dtype=torch.bfloat16 min=-2.90625 max=2.03125
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-5.5 max=3.4375
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-5.8125 max=4.0
GroupNorm input=0 dtype=torch.bfloat16 min=-5.8125 max=4.0
GroupNorm output=0 dtype=torch.bfloat16 min=-8.375 max=5.65625
SiLU input=0 dtype=torch.bfloat16 min=-8.375 max=5.65625
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=5.625
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=5.625
Conv2d output=0 dtype=torch.bfloat16 min=-3.578125 max=2.921875
GroupNorm input=0 dtype=torch.bfloat16 min=-3.578125 max=2.921875
GroupNorm output=0 dtype=torch.bfloat16 min=-9.1875 max=9.5625
SiLU input=0 dtype=torch.bfloat16 min=-9.1875 max=9.5625
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=9.5625
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=9.5625
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=9.5625
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=9.5625
Conv2d output=0 dtype=torch.bfloat16 min=-2.15625 max=3.21875
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-5.8125 max=4.0
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-5.34375 max=3.859375
Conv2d input=0 dtype=torch.bfloat16 min=-5.34375 max=3.859375
Conv2d output=0 dtype=torch.bfloat16 min=-6.03125 max=5.0
Upsample2D input=0 dtype=torch.bfloat16 min=-5.34375 max=3.859375
Upsample2D output=0 dtype=torch.bfloat16 min=-6.03125 max=5.0
UpDecoderBlock2D input=0 dtype=torch.bfloat16 min=-3.875 max=2.859375
UpDecoderBlock2D output=0 dtype=torch.bfloat16 min=-6.03125 max=5.0
GroupNorm input=0 dtype=torch.bfloat16 min=-6.03125 max=5.0
GroupNorm output=0 dtype=torch.bfloat16 min=-9.375 max=8.8125
SiLU input=0 dtype=torch.bfloat16 min=-9.375 max=8.8125
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=8.8125
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=8.8125
Conv2d output=0 dtype=torch.bfloat16 min=-5.90625 max=3.46875
GroupNorm input=0 dtype=torch.bfloat16 min=-5.90625 max=3.46875
GroupNorm output=0 dtype=torch.bfloat16 min=-9.5 max=6.34375
SiLU input=0 dtype=torch.bfloat16 min=-9.5 max=6.34375
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=6.34375
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=6.34375
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=6.34375
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=6.34375
Conv2d output=0 dtype=torch.bfloat16 min=-3.046875 max=2.53125
Conv2d input=0 dtype=torch.bfloat16 min=-6.03125 max=5.0
Conv2d output=0 dtype=torch.bfloat16 min=-4.15625 max=3.46875
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-6.03125 max=5.0
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-4.84375 max=4.59375
GroupNorm input=0 dtype=torch.bfloat16 min=-4.84375 max=4.59375
GroupNorm output=0 dtype=torch.bfloat16 min=-9.1875 max=6.46875
SiLU input=0 dtype=torch.bfloat16 min=-9.1875 max=6.46875
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=6.46875
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=6.46875
Conv2d output=0 dtype=torch.bfloat16 min=-3.265625 max=2.53125
GroupNorm input=0 dtype=torch.bfloat16 min=-3.265625 max=2.53125
GroupNorm output=0 dtype=torch.bfloat16 min=-9.9375 max=8.125
SiLU input=0 dtype=torch.bfloat16 min=-9.9375 max=8.125
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=8.125
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=8.125
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=8.125
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=8.125
Conv2d output=0 dtype=torch.bfloat16 min=-2.890625 max=1.828125
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-4.84375 max=4.59375
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-6.03125 max=5.375
GroupNorm input=0 dtype=torch.bfloat16 min=-6.03125 max=5.375
GroupNorm output=0 dtype=torch.bfloat16 min=-12.9375 max=6.8125
SiLU input=0 dtype=torch.bfloat16 min=-12.9375 max=6.8125
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=6.8125
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=6.8125
Conv2d output=0 dtype=torch.bfloat16 min=-2.65625 max=2.828125
GroupNorm input=0 dtype=torch.bfloat16 min=-2.65625 max=2.828125
GroupNorm output=0 dtype=torch.bfloat16 min=-7.53125 max=9.8125
SiLU input=0 dtype=torch.bfloat16 min=-7.53125 max=9.8125
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=9.8125
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=9.8125
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=9.8125
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=9.8125
Conv2d output=0 dtype=torch.bfloat16 min=-2.59375 max=2.796875
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-6.03125 max=5.375
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-7.3125 max=5.75
Conv2d input=0 dtype=torch.bfloat16 min=-7.3125 max=5.75
Conv2d output=0 dtype=torch.bfloat16 min=-6.125 max=7.5
Upsample2D input=0 dtype=torch.bfloat16 min=-7.3125 max=5.75
Upsample2D output=0 dtype=torch.bfloat16 min=-6.125 max=7.5
UpDecoderBlock2D input=0 dtype=torch.bfloat16 min=-6.03125 max=5.0
UpDecoderBlock2D output=0 dtype=torch.bfloat16 min=-6.125 max=7.5
GroupNorm input=0 dtype=torch.bfloat16 min=-6.125 max=7.5
GroupNorm output=0 dtype=torch.bfloat16 min=-9.3125 max=8.5625
SiLU input=0 dtype=torch.bfloat16 min=-9.3125 max=8.5625
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=8.5625
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=8.5625
Conv2d output=0 dtype=torch.bfloat16 min=-3.515625 max=3.4375
GroupNorm input=0 dtype=torch.bfloat16 min=-3.515625 max=3.4375
GroupNorm output=0 dtype=torch.bfloat16 min=-11.0625 max=7.625
SiLU input=0 dtype=torch.bfloat16 min=-11.0625 max=7.625
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=7.625
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.625
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=7.625
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.625
Conv2d output=0 dtype=torch.bfloat16 min=-3.90625 max=2.46875
Conv2d input=0 dtype=torch.bfloat16 min=-6.125 max=7.5
Conv2d output=0 dtype=torch.bfloat16 min=-4.46875 max=4.5
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-6.125 max=7.5
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-6.6875 max=5.875
GroupNorm input=0 dtype=torch.bfloat16 min=-6.6875 max=5.875
GroupNorm output=0 dtype=torch.bfloat16 min=-9.1875 max=3.703125
SiLU input=0 dtype=torch.bfloat16 min=-9.1875 max=3.703125
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=3.609375
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=3.609375
Conv2d output=0 dtype=torch.bfloat16 min=-1.671875 max=1.46875
GroupNorm input=0 dtype=torch.bfloat16 min=-1.671875 max=1.46875
GroupNorm output=0 dtype=torch.bfloat16 min=-7.25 max=8.125
SiLU input=0 dtype=torch.bfloat16 min=-7.25 max=8.125
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=8.125
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=8.125
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=8.125
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=8.125
Conv2d output=0 dtype=torch.bfloat16 min=-3.359375 max=2.578125
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-6.6875 max=5.875
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-9.25 max=5.65625
GroupNorm input=0 dtype=torch.bfloat16 min=-9.25 max=5.65625
GroupNorm output=0 dtype=torch.bfloat16 min=-9.5625 max=3.859375
SiLU input=0 dtype=torch.bfloat16 min=-9.5625 max=3.859375
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=3.78125
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=3.78125
Conv2d output=0 dtype=torch.bfloat16 min=-1.5546875 max=1.4765625
GroupNorm input=0 dtype=torch.bfloat16 min=-1.5546875 max=1.4765625
GroupNorm output=0 dtype=torch.bfloat16 min=-12.0625 max=7.3125
SiLU input=0 dtype=torch.bfloat16 min=-12.0625 max=7.3125
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=7.3125
Dropout input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.3125
Dropout output=0 dtype=torch.bfloat16 min=-0.279296875 max=7.3125
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=7.3125
Conv2d output=0 dtype=torch.bfloat16 min=-5.96875 max=3.5
ResnetBlock2D input=0 dtype=torch.bfloat16 min=-9.25 max=5.65625
ResnetBlock2D output=0 dtype=torch.bfloat16 min=-11.125 max=6.0625
UpDecoderBlock2D input=0 dtype=torch.bfloat16 min=-6.125 max=7.5
UpDecoderBlock2D output=0 dtype=torch.bfloat16 min=-11.125 max=6.0625
GroupNorm input=0 dtype=torch.bfloat16 min=-11.125 max=6.0625
GroupNorm output=0 dtype=torch.bfloat16 min=-5.21875 max=3.90625
SiLU input=0 dtype=torch.bfloat16 min=-5.21875 max=3.90625
SiLU output=0 dtype=torch.bfloat16 min=-0.279296875 max=3.828125
Conv2d input=0 dtype=torch.bfloat16 min=-0.279296875 max=3.828125
Conv2d output=0 dtype=torch.bfloat16 min=-1.796875 max=1.6171875
Decoder input=0 dtype=torch.bfloat16 min=-3.21875 max=3.28125
Decoder output=0 dtype=torch.bfloat16 min=-1.796875 max=1.6171875
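(Annotation: everything from the first Conv2d after the progress bar down to this Decoder line is the SD3 VAE decoding the denoised latents; a rough standalone equivalent, with the model id, latent shape, and the scale/shift denormalization assumed from the released SD3 VAE config:)

import torch
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    subfolder="vae",
    torch_dtype=torch.bfloat16,
)
# SD3 uses 16 latent channels; undo the pipeline's latent normalization
# before decoding (random values here are placeholders, not real latents).
latents = torch.randn(1, 16, 128, 128, dtype=torch.bfloat16)
latents = latents / vae.config.scaling_factor + vae.config.shift_factor
with torch.no_grad():
    image = vae.decode(latents).sample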
average inference time=66.66518592834473
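(Annotation: a minimal sketch of the kind of run that yields a trace of this shape; the model id and prompt are assumptions, and num_inference_steps=2 matches the 0/2 progress bar above:)

import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    torch_dtype=torch.bfloat16,
)
pipe.to("cpu")  # force CPU execution in bfloat16

image = pipe("a photo of a cat", num_inference_steps=2).images[0]
image.save("sd3_cpu_bf16.png")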