Full Benchmark Summary

Regressed Latencies 🚩

| Benchmark Name | Average Latency (ms) | Median Latency (ms) | Latency Standard Deviation (ms) |
| --- | --- | --- | --- |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] | 25.184 (vs. 21.994, 14.50%↑) | 25.384 | 0.886 |
| BertForMaskedLMTF(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 451.777 (vs. 395.203, 14.32%↑) | 452.648 | 6.183 |
| MobileNetV3Small\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ moto-edge-x30[gpu] | 5.025 (vs. 4.686, 7.25%↑) | 5.090 | 0.412 |
| PersonDetect\_int8(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ pixel-6-pro[gpu] | 2.693 (vs. 2.547, 5.74%↑) | 2.693 | 0.064 |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] | 6.736 (vs. 6.399, 5.27%↑) | 6.448 | 0.933 |
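The value in parentheses next to each average latency appears to be the baseline latency followed by the relative change against it. As a minimal sketch under that assumption, the figures can be reproduced as below. The function name `relative_change` is hypothetical and not part of any IREE tooling.

```python
def relative_change(current_ms: float, baseline_ms: float) -> str:
    """Format the change of current vs. baseline the way the report shows it.

    Assumption: the percentage is (current - baseline) / baseline,
    rounded to two decimals, with an arrow for the direction.
    """
    if baseline_ms <= 0:
        raise ValueError("baseline latency must be positive")
    pct = (current_ms - baseline_ms) / baseline_ms * 100.0
    arrow = "↑" if pct >= 0 else "↓"
    return f"{abs(pct):.2f}%{arrow}"


# Values copied from rows of this report.
print(relative_change(25.184, 21.994))      # 14.50%↑
print(relative_change(1018.159, 1239.801))  # 17.88%↓
```

Both printed values match the corresponding rows in the report, which supports this reading of the column.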

Improved Latencies 🎉

| Benchmark Name | Average Latency (ms) | Median Latency (ms) | Latency Standard Deviation (ms) |
| --- | --- | --- | --- |
| MobileNetV3Small\_fp32(tflite) [vmvx-generic-vmvx-vmvx][experimental-flags] local\_task(vmvx\_module)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 1018.159 (vs. 1239.801, 17.88%↓) | 1014.282 | 21.102 |
| MobileNetV2\_fp32(tflite) [vmvx-generic-vmvx-vmvx][experimental-flags] local\_task(vmvx\_module)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 5003.644 (vs. 5851.031, 14.48%↓) | 5047.260 | 145.482 |
| PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] | 273.181 (vs. 300.340, 9.04%↓) | 272.799 | 1.888 |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] | 11.075 (vs. 11.958, 7.38%↓) | 10.925 | 0.595 |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 7.706 (vs. 8.212, 6.16%↓) | 7.732 | 0.081 |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 6.122 (vs. 6.504, 5.87%↓) | 6.120 | 0.192 |
| MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] | 324.744 (vs. 344.717, 5.79%↓) | 326.664 | 6.939 |
| MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] | 42.425 (vs. 44.738, 5.17%↓) | 42.660 | 0.817 |
| PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 15.185 (vs. 16.005, 5.12%↓) | 15.196 | 0.407 |

Similar Latencies

| Benchmark Name | Average Latency (ms) | Median Latency (ms) | Latency Standard Deviation (ms) |
| --- | --- | --- | --- |
| MiniLML12H384Uncased(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 92.571 (vs. 86.043, 7.59%↑) | 92.609 | 0.630 |
| MobileSSD\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 9.464 (vs. 8.957, 5.66%↑) | 9.642 | 0.306 |
| PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 31.912 (vs. 33.591, 5.00%↓) | 31.965 | 0.232 |
| MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 46.202 (vs. 48.609, 4.95%↓) | 46.332 | 0.360 |
| MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d,dotprod] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 121.025 (vs. 127.048, 4.74%↓) | 121.889 | 2.990 |
| DeepLabV3\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 7.871 (vs. 8.261, 4.73%↓) | 7.702 | 0.363 |
| MobileBertSquad\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 99.708 (vs. 104.634, 4.71%↓) | 99.677 | 0.149 |
| MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 21.152 (vs. 22.192, 4.69%↓) | 21.217 | 0.184 |
| MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 369.209 (vs. 387.217, 4.65%↓) | 369.195 | 4.225 |
| MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 181.664 (vs. 190.350, 4.56%↓) | 182.477 | 6.842 |
| DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 20.445 (vs. 21.347, 4.22%↓) | 20.727 | 0.718 |
| MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] | 845.193 (vs. 811.117, 4.20%↑) | 843.972 | 6.116 |
| MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d,dotprod] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 257.664 (vs. 268.956, 4.20%↓) | 258.443 | 1.396 |
| MobileSSD\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 7.164 (vs. 6.876, 4.19%↑) | 6.935 | 0.361 |
| DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] | 352.665 (vs. 367.970, 4.16%↓) | 352.643 | 0.141 |
| MobileNetV2\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ moto-edge-x30[gpu] | 11.432 (vs. 11.908, 4.00%↓) | 12.126 | 1.416 |
| MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] | 643.116 (vs. 618.438, 3.99%↑) | 618.341 | 79.335 |
| MobileNetV1\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 8.161 (vs. 7.863, 3.79%↑) | 8.099 | 0.464 |
| MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 211.354 (vs. 219.621, 3.76%↓) | 209.957 | 8.336 |
| MobileBertSquad\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 50.896 (vs. 52.854, 3.71%↓) | 50.934 | 0.127 |
| EfficientNet\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 8.008 (vs. 7.725, 3.66%↑) | 7.791 | 0.433 |
| DeepLabV3\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 7.025 (vs. 6.778, 3.64%↑) | 6.770 | 0.357 |
| EfficientNetV2STF(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 55.780 (vs. 53.840, 3.60%↑) | 54.890 | 1.799 |
| MobileNetV2\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding] vulkan(none)[full-inference,default-flags] with zeros @ moto-edge-x30[gpu] | 10.097 (vs. 10.464, 3.51%↓) | 10.886 | 1.174 |
| MiniLML12H384Uncased(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 21.818 (vs. 21.080, 3.50%↑) | 21.728 | 0.305 |
| MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 26.833 (vs. 27.791, 3.45%↓) | 26.978 | 0.761 |
| DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 18.809 (vs. 19.476, 3.43%↓) | 18.846 | 0.451 |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 4.438 (vs. 4.582, 3.15%↓) | 4.418 | 0.086 |
| MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] | 56.971 (vs. 58.813, 3.13%↓) | 57.279 | 0.840 |
| MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] | 479.950 (vs. 495.461, 3.13%↓) | 480.845 | 5.151 |
| MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 57.311 (vs. 59.150, 3.11%↓) | 57.589 | 1.004 |
| MobileNetV3Small\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding] vulkan(none)[full-inference,default-flags] with zeros @ moto-edge-x30[gpu] | 4.892 (vs. 5.039, 2.91%↓) | 5.063 | 0.457 |
| MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 10.233 (vs. 10.530, 2.82%↓) | 10.247 | 0.242 |
| MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 24.227 (vs. 24.925, 2.80%↓) | 24.263 | 0.359 |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 6.540 (vs. 6.728, 2.80%↓) | 6.536 | 0.038 |
| DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 39.426 (vs. 40.559, 2.79%↓) | 39.485 | 0.353 |
| MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] | 195.186 (vs. 189.918, 2.77%↑) | 195.515 | 1.395 |
| MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 241.963 (vs. 248.705, 2.71%↓) | 240.892 | 4.186 |
| PoseNet\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 6.263 (vs. 6.100, 2.67%↑) | 6.305 | 0.148 |
| PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] | 29.770 (vs. 29.011, 2.62%↑) | 28.050 | 5.884 |
| PersonDetect\_int8(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 2.680 (vs. 2.612, 2.59%↑) | 2.707 | 0.107 |
| DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] | 37.488 (vs. 38.476, 2.57%↓) | 37.604 | 0.599 |
| DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] | 25.101 (vs. 25.761, 2.56%↓) | 25.080 | 0.093 |
| DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 37.749 (vs. 38.737, 2.55%↓) | 38.011 | 0.600 |
| MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 21.416 (vs. 21.942, 2.40%↓) | 21.479 | 0.277 |
| PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] | 1295.181 (vs. 1326.161, 2.34%↓) | 1295.767 | 1.715 |
| PersonDetect\_int8(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 1.818 (vs. 1.777, 2.30%↑) | 1.751 | 0.115 |
| MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] | 18.594 (vs. 19.025, 2.27%↓) | 18.638 | 0.310 |
| PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] | 81.217 (vs. 79.460, 2.21%↑) | 81.214 | 1.004 |
| DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] | 36.481 (vs. 37.300, 2.19%↓) | 36.410 | 0.143 |
| MobileNetV2\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 13.162 (vs. 12.886, 2.15%↑) | 13.164 | 0.096 |
| MobileNetV3Small\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ moto-edge-x30[gpu] | 3.197 (vs. 3.130, 2.14%↑) | 3.232 | 0.094 |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] | 72.687 (vs. 74.267, 2.13%↓) | 74.583 | 4.202 |
| BertLargeTF(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 1516.830 (vs. 1547.914, 2.01%↓) | 1514.660 | 18.387 |
| MiniLML12H384Uncased(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 22.567 (vs. 22.139, 1.93%↑) | 22.555 | 0.127 |
| PoseNet\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ pixel-6-pro[gpu] | 14.569 (vs. 14.856, 1.93%↓) | 14.643 | 0.330 |
| MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 485.873 (vs. 495.388, 1.92%↓) | 488.209 | 7.413 |
| MobileSSD\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 35.009 (vs. 34.375, 1.84%↑) | 34.952 | 0.221 |
| MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] | 535.609 (vs. 525.960, 1.83%↑) | 542.823 | 25.968 |
| MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] | 3963.248 (vs. 3893.661, 1.79%↑) | 3950.517 | 49.093 |
| EfficientNet\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 7.539 (vs. 7.409, 1.75%↑) | 7.430 | 0.225 |
| MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] | 34.006 (vs. 34.613, 1.75%↓) | 33.898 | 0.411 |
| MobileSSD\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding] vulkan(none)[full-inference,default-flags] with zeros @ moto-edge-x30[gpu] | 21.837 (vs. 21.468, 1.72%↑) | 21.918 | 0.645 |
| Resnet50TFBatch1(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 30.970 (vs. 30.452, 1.70%↑) | 30.960 | 0.074 |
| MobileNetV3Small\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 3.878 (vs. 3.815, 1.66%↑) | 3.876 | 0.019 |
| DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] | 77.822 (vs. 79.106, 1.62%↓) | 77.796 | 0.139 |
| MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d,dotprod] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] | 233.766 (vs. 237.543, 1.59%↓) | 232.952 | 1.405 |
| MobileBertSquad\_int8(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 76.849 (vs. 75.648, 1.59%↑) | 76.870 | 0.629 |
| MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] | 21.161 (vs. 21.492, 1.54%↓) | 21.153 | 0.168 |
| MobileNetV3Small\_fp32(tflite) [vmvx-generic-vmvx-vmvx][experimental-flags] local\_task(vmvx\_module)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] | 1116.491 (vs. 1099.564, 1.54%↑) | 1118.032 | 23.983 |
| MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] | 3949.077 (vs. 3889.325, 1.54%↑) | 3940.776 | 40.363 |
| PoseNet\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ moto-edge-x30[gpu] | 31.927 (vs. 32.424, 1.53%↓) | 32.475 | 1.832 |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] | 6.912 (vs. 7.018, 1.52%↓) | 6.937 | 0.103 |
| MobileNetV2\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 3.487 (vs. 3.539, 1.47%↓) | 3.473 | 0.052 |
| MobileBertSquad\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 104.746 (vs. 103.238, 1.46%↑) | 104.716 | 0.428 |
| BertLargeTF(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 1523.207 (vs. 1545.175, 1.42%↓) | 1521.239 | 21.811 |
| MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] | 37.247 (vs. 36.740, 1.38%↑) | 37.264 | 0.063 |
| EfficientNetV2SPT(linalg) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 401.812 (vs. 396.535, 1.33%↑) | 401.737 | 0.877 |
| EfficientNet\_int8(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 14.205 (vs. 14.021, 1.31%↑) | 14.212 | 0.083 |
| DeepLabV3\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ moto-edge-x30[gpu] | 43.251 (vs. 42.697, 1.30%↑) | 43.824 | 1.500 |
| MobileBertSquad\_fp16(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 51.545 (vs. 50.888, 1.29%↑) | 50.088 | 3.067 |
| PoseNet\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 23.287 (vs. 22.990, 1.29%↑) | 23.288 | 0.079 |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] | 5.393 (vs. 5.464, 1.29%↓) | 5.392 | 0.007 |
| MobileNetV3Small\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ pixel-6-pro[gpu] | 4.496 (vs. 4.553, 1.26%↓) | 4.501 | 0.032 |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] | 90.779 (vs. 89.670, 1.24%↑) | 90.565 | 0.418 |
| MobileBertSquad\_fp16(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][default-flags,demote-f32-to-f16] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 86.409 (vs. 87.475, 1.22%↓) | 87.329 | 2.979 |
| MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] | 440.692 (vs. 435.386, 1.22%↑) | 440.050 | 2.681 |
| MobileNetV2\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 8.997 (vs. 8.889, 1.21%↑) | 9.015 | 0.087 |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] | 10.876 (vs. 10.745, 1.21%↑) | 10.881 | 0.019 |
| PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] | 1534.385 (vs. 1552.991, 1.20%↓) | 1534.311 | 2.237 |
| MobileBertSquad\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 202.100 (vs. 199.764, 1.17%↑) | 202.070 | 0.490 |
| PoseNet\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 16.576 (vs. 16.769, 1.15%↓) | 16.560 | 0.093 |
| EfficientNet\_int8(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 14.994 (vs. 14.826, 1.13%↑) | 14.996 | 0.038 |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] | 74.301 (vs. 75.141, 1.12%↓) | 74.442 | 0.280 |
| MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] | 598.536 (vs. 591.991, 1.11%↑) | 597.308 | 5.246 |
| MiniLML12H384Uncased(stablehlo) [cuda-sm\_80-linux\_gnu-cuda][default-flags] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] | 1.504 (vs. 1.488, 1.08%↑) | 1.504 | 0.003 |
| PoseNet\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding] vulkan(none)[full-inference,default-flags] with zeros @ moto-edge-x30[gpu] | 29.367 (vs. 29.070, 1.02%↑) | 30.244 | 1.771 |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] | 67.568 (vs. 68.255, 1.01%↓) | 67.581 | 0.080 |
| MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] | 217.580 (vs. 219.739, 0.98%↓) | 220.785 | 6.936 |
| EfficientNetV2STF(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 57.637 (vs. 57.082, 0.97%↑) | 56.558 | 1.698 |
| MobileNetV2\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 4.674 (vs. 4.719, 0.96%↓) | 4.670 | 0.032 |
| MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] | 155.919 (vs. 157.409, 0.95%↓) | 155.899 | 0.569 |
| MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] | 198.955 (vs. 200.857, 0.95%↓) | 199.079 | 0.398 |
| EfficientNetV2STF(stablehlo) [cuda-sm\_80-linux\_gnu-cuda][default-flags] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] | 7.701 (vs. 7.774, 0.94%↓) | 7.701 | 0.002 |
| EfficientNet\_int8(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ pixel-6-pro[gpu] | 10.576 (vs. 10.677, 0.94%↓) | 10.579 | 0.039 |
| DeepLabV3\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 27.828 (vs. 27.571, 0.93%↑) | 27.759 | 0.184 |
| ClipTextSeqLen64PT(linalg) [nvidia-pascal-vulkan\_linux-vulkan\_spirv][experimental-flags,simt] vulkan(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] | 10.439 (vs. 10.537, 0.93%↓) | 10.442 | 0.025 |
| MobileNetV3Small\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 3.765 (vs. 3.731, 0.92%↑) | 3.767 | 0.014 |
| MobileNetV2\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 24.243 (vs. 24.029, 0.89%↑) | 24.259 | 0.065 |
| MobileBertSquad\_int8(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 76.558 (vs. 75.892, 0.88%↑) | 76.677 | 0.687 |
| MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] | 15.118 (vs. 14.989, 0.86%↑) | 15.205 | 0.402 |
| EfficientNet\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 34.276 (vs. 33.996, 0.82%↑) | 34.260 | 0.115 |
| MobileSSD\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 32.368 (vs. 32.637, 0.82%↓) | 32.248 | 0.378 |
| MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] | 1325.434 (vs. 1314.737, 0.81%↑) | 1325.026 | 4.558 |
| MobileBertSquad\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 104.909 (vs. 104.073, 0.80%↑) | 104.774 | 0.496 |
| MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] | 253.383 (vs. 251.394, 0.79%↑) | 255.551 | 8.519 |
| EfficientNetV2SPT(linalg) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 84.937 (vs. 84.281, 0.78%↑) | 84.449 | 1.579 |
| Resnet50TFBatch1(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 187.646 (vs. 186.269, 0.74%↑) | 188.181 | 1.454 |
| MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] | 1301.214 (vs. 1291.755, 0.73%↑) | 1309.448 | 34.216 |
| DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] | 355.666 (vs. 353.090, 0.73%↑) | 355.262 | 2.072 |
| PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] | 20.257 (vs. 20.112, 0.72%↑) | 20.446 | 0.771 |
| MobileBertSquad\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ moto-edge-x30[gpu] | 283.146 (vs. 285.132, 0.70%↓) | 290.620 | 14.424 |
| MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] | 194.620 (vs. 195.963, 0.69%↓) | 194.563 | 1.382 |
| DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] | 404.490 (vs. 401.768, 0.68%↑) | 405.183 | 1.990 |
| DeepLabV3\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding] vulkan(none)[full-inference,default-flags] with zeros @ moto-edge-x30[gpu] | 42.273 (vs. 42.554, 0.66%↓) | 42.550 | 0.713 |
| PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] | 305.186 (vs. 307.213, 0.66%↓) | 304.836 | 1.474 |
| DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] | 66.748 (vs. 66.313, 0.66%↑) | 66.747 | 0.077 |
| ClipTextSeqLen64PT(linalg) [nvidia-ampere-vulkan\_linux-vulkan\_spirv][experimental-flags,tensorcore] vulkan(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] | 10.578 (vs. 10.511, 0.64%↑) | 10.524 | 0.117 |
| MobileBertSquad\_fp16(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 200.803 (vs. 199.550, 0.63%↑) | 200.771 | 1.325 |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] | 22.661 (vs. 22.522, 0.62%↑) | 22.649 | 0.149 |
| MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] | 200.956 (vs. 202.188, 0.61%↓) | 201.013 | 0.724 |
| MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] | 560.970 (vs. 557.593, 0.61%↑) | 559.976 | 8.185 |
DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] 103.518 (vs. 104.141, 0.60%↓) 103.574 0.541
MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] 12.433 (vs. 12.506, 0.59%↓) 12.080 1.112
MobileSSD\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ pixel-6-pro[gpu] 30.101 (vs. 30.279, 0.58%↓) 30.126 0.121
MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 31.439 (vs. 31.619, 0.57%↓) 31.460 0.039
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] 2858.489 (vs. 2873.794, 0.53%↓) 2859.981 5.268
MobileNetV2\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] 7.367 (vs. 7.406, 0.53%↓) 7.347 0.067
MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 7.104 (vs. 7.068, 0.51%↑) 7.094 0.035
BertForMaskedLMTF(stablehlo) [cuda-sm\_80-linux\_gnu-cuda][default-flags] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 7.090 (vs. 7.054, 0.51%↑) 7.071 0.062
MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] 4162.536 (vs. 4142.274, 0.49%↑) 4159.897 11.288
MobileBertSquad\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] 99.895 (vs. 100.383, 0.49%↓) 99.852 0.201
MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] 67.292 (vs. 67.606, 0.46%↓) 67.277 0.199
MobileNetV3Small\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] 7.906 (vs. 7.942, 0.46%↓) 7.936 0.128
PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] 307.955 (vs. 306.563, 0.45%↑) 307.263 2.042
DeepLabV3\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] 14.095 (vs. 14.159, 0.45%↓) 14.111 0.054
MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] 591.422 (vs. 593.960, 0.43%↓) 589.816 4.865
DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 63.907 (vs. 63.647, 0.41%↑) 63.902 0.046
MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] 3410.980 (vs. 3397.179, 0.41%↑) 3417.904 26.490
MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] 59.833 (vs. 60.077, 0.40%↓) 59.829 0.410
DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] 113.571 (vs. 113.113, 0.40%↑) 113.397 0.489
MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] 59.261 (vs. 59.498, 0.40%↓) 59.324 0.271
DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] 76.904 (vs. 76.608, 0.39%↑) 76.898 0.085
MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] 869.160 (vs. 865.890, 0.38%↑) 868.792 1.222
MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] 36.434 (vs. 36.298, 0.37%↑) 36.431 0.061
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d,dotprod] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 151.093 (vs. 150.537, 0.37%↑) 151.098 0.211
MobileSSD\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ moto-edge-x30[gpu] 25.399 (vs. 25.492, 0.36%↓) 25.720 0.877
MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 13.467 (vs. 13.515, 0.36%↓) 13.463 0.018
matmul\_3456x1024x2048\_f16t\_tile\_config\_default(linalg) [cuda-sm\_80-linux\_gnu-cuda][ukernel,matmul] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 0.064 (vs. 0.065, 0.35%↓) 0.064 0.000
MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] 82.470 (vs. 82.189, 0.34%↑) 82.469 0.088
MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] 13.084 (vs. 13.040, 0.33%↑) 13.086 0.011
DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 27.064 (vs. 26.976, 0.32%↑) 27.168 0.773
BertLargeTF(stablehlo) [cuda-sm\_80-linux\_gnu-cuda][default-flags] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 10.501 (vs. 10.468, 0.31%↑) 10.500 0.003
EfficientNetB7PT(linalg) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] 618.217 (vs. 616.281, 0.31%↑) 617.077 3.752
MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 274.728 (vs. 275.588, 0.31%↓) 273.318 3.526
PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] 181.695 (vs. 182.253, 0.31%↓) 181.725 0.391
MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] 413.460 (vs. 414.722, 0.30%↓) 413.502 1.065
PersonDetect\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] 0.798 (vs. 0.800, 0.30%↓) 0.797 0.010
PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 48.855 (vs. 48.709, 0.30%↑) 48.838 0.055
PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] 457.452 (vs. 458.817, 0.30%↓) 457.435 1.275
DeepLabV3\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] 12.321 (vs. 12.286, 0.28%↑) 12.326 0.045
matmul\_2560x2560x2560\_f32t\_tile\_config\_default(linalg) [cuda-sm\_80-linux\_gnu-cuda][ukernel,matmul] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 0.333 (vs. 0.332, 0.28%↑) 0.333 0.000
ClipTextSeqLen64PT(linalg) [cuda-sm\_80-linux\_gnu-cuda][default-flags] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 8.592 (vs. 8.615, 0.27%↓) 8.591 0.003
DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] 402.880 (vs. 403.934, 0.26%↓) 402.864 0.865
MobileBertSquad\_fp16(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,demote-f32-to-f16] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] 86.971 (vs. 87.197, 0.26%↓) 87.192 0.906
MobileNetV2\_fp32(tflite) [vmvx-generic-vmvx-vmvx][experimental-flags] local\_task(vmvx\_module)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 5777.743 (vs. 5792.709, 0.26%↓) 5818.778 129.053
MobileNetV2\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] 7.469 (vs. 7.487, 0.24%↓) 7.387 0.219
MobileBertSquad\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ pixel-6-pro[gpu] 125.756 (vs. 126.043, 0.23%↓) 125.745 0.439
PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] 400.706 (vs. 399.808, 0.22%↑) 400.149 2.498
MobileNetV2\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] 10.268 (vs. 10.291, 0.22%↓) 10.266 0.023
MobileNetV2\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ pixel-6-pro[gpu] 8.922 (vs. 8.903, 0.22%↑) 8.922 0.157
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d,dotprod] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] 1614.306 (vs. 1610.799, 0.22%↑) 1615.763 8.171
MobileBertSquad\_int8(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ pixel-6-pro[gpu] 87.884 (vs. 88.076, 0.22%↓) 87.299 1.029
PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 193.356 (vs. 193.774, 0.22%↓) 193.363 0.097
matmul\_3456x1024x2048\_f32t\_tile\_config\_default(linalg) [cuda-sm\_80-linux\_gnu-cuda][ukernel,matmul] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 0.135 (vs. 0.135, 0.21%↑) 0.135 0.000
DeepLabV3\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ pixel-6-pro[gpu] 11.724 (vs. 11.748, 0.21%↓) 11.705 0.079
DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] 398.325 (vs. 399.134, 0.20%↓) 398.736 1.086
MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] 670.337 (vs. 669.001, 0.20%↑) 670.371 1.379
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] 829.437 (vs. 827.791, 0.20%↑) 829.735 2.019
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] 2875.817 (vs. 2870.530, 0.18%↑) 2875.000 3.519
PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] 181.625 (vs. 181.292, 0.18%↑) 181.675 0.468
MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 580.878 (vs. 579.821, 0.18%↑) 580.672 0.447
EfficientNetB7PT(linalg) [cuda-sm\_80-linux\_gnu-cuda][default-flags] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 86.813 (vs. 86.971, 0.18%↓) 86.492 0.988
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] 2765.973 (vs. 2770.920, 0.18%↓) 2765.760 4.490
Unet2dPT(linalg) [cuda-sm\_80-linux\_gnu-cuda][default-flags] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 316.039 (vs. 315.515, 0.17%↑) 315.909 0.301
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] 2770.065 (vs. 2774.584, 0.16%↓) 2781.651 22.903
Unet2dPT(linalg) [nvidia-ampere-vulkan\_linux-vulkan\_spirv][experimental-flags,tensorcore] vulkan(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 114.226 (vs. 114.411, 0.16%↓) 114.205 0.121
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d,dotprod] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] 464.123 (vs. 463.431, 0.15%↑) 464.076 0.299
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 358.282 (vs. 357.750, 0.15%↑) 358.231 0.259
MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] 192.661 (vs. 192.944, 0.15%↓) 192.638 0.233
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 1080.253 (vs. 1078.712, 0.14%↑) 1080.313 0.677
MobileBertSquad\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] 516.392 (vs. 515.662, 0.14%↑) 516.144 1.309
PoseNet\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ moto-edge-x30[gpu] 26.025 (vs. 26.056, 0.12%↓) 26.055 0.119
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d,dotprod] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] 1652.900 (vs. 1651.003, 0.11%↑) 1653.101 3.081
MobileBertSquad\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] 52.126 (vs. 52.068, 0.11%↑) 51.406 2.162
PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] 1326.330 (vs. 1324.897, 0.11%↑) 1327.532 6.277
PoseNet\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] 5.595 (vs. 5.601, 0.10%↓) 5.576 0.080
MobileNetV3Small\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] 7.239 (vs. 7.246, 0.09%↓) 7.249 0.028
PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] 56.514 (vs. 56.463, 0.09%↑) 56.491 0.097
MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 660.736 (vs. 660.157, 0.09%↑) 660.973 0.653
MobileNetV3Small\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] 4.330 (vs. 4.334, 0.09%↓) 4.327 0.015
MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 85.901 (vs. 85.827, 0.09%↑) 85.901 0.073
MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] 75.989 (vs. 75.926, 0.08%↑) 75.955 0.237
MobileSSD\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ moto-edge-x30[gpu] 19.130 (vs. 19.146, 0.08%↓) 19.101 0.172
MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 11.517 (vs. 11.508, 0.08%↑) 11.517 0.005
MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 31.752 (vs. 31.776, 0.08%↓) 31.734 0.059
MobileNetV2\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ moto-edge-x30[gpu] 7.594 (vs. 7.599, 0.07%↓) 7.631 0.092
DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] 350.498 (vs. 350.732, 0.07%↓) 351.746 2.357
PersonDetect\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ c2-standard-16[cpu] 0.716 (vs. 0.716, 0.06%↑) 0.716 0.001
PoseNet\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] 15.923 (vs. 15.913, 0.06%↑) 15.901 0.084
PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] 1554.655 (vs. 1555.576, 0.06%↓) 1554.141 9.009
matmul\_128x256x8192\_f32t\_tile\_config\_default(linalg) [cuda-sm\_80-linux\_gnu-cuda][ukernel,matmul,splitk] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 0.044 (vs. 0.044, 0.06%↓) 0.044 0.000
MobileBertSquad\_fp16(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,repeated-kernel,demote-f32-to-f16] vulkan(none)[full-inference,experimental-flags] with zeros @ pixel-6-pro[gpu] 80.257 (vs. 80.211, 0.06%↑) 80.310 0.437
MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] 152.844 (vs. 152.918, 0.05%↓) 153.075 1.209
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] 618.025 (vs. 617.728, 0.05%↑) 618.051 1.548
PersonDetect\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] 0.792 (vs. 0.792, 0.04%↓) 0.792 0.002
MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] 101.668 (vs. 101.624, 0.04%↑) 101.686 0.196
EfficientNetV2STF(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] 263.756 (vs. 263.870, 0.04%↓) 263.834 0.655
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d,dotprod] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 396.427 (vs. 396.282, 0.04%↑) 396.456 0.265
MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] 3293.578 (vs. 3292.454, 0.03%↑) 3291.755 5.451
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] 1265.254 (vs. 1265.628, 0.03%↓) 1265.077 0.789
matmul\_128x256x8192\_f16t\_tile\_config\_default(linalg) [cuda-sm\_80-linux\_gnu-cuda][ukernel,matmul,splitk] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 0.024 (vs. 0.024, 0.03%↑) 0.024 0.000
PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] 67.005 (vs. 67.025, 0.03%↓) 66.180 1.758
MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] 4133.634 (vs. 4132.615, 0.02%↑) 4132.097 7.551
MobileNetV1\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] 27.683 (vs. 27.690, 0.02%↓) 27.694 0.114
MobileBertSquad\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding] vulkan(none)[full-inference,default-flags] with zeros @ moto-edge-x30[gpu] 282.226 (vs. 282.170, 0.02%↑) 290.631 14.273
MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 31.199 (vs. 31.195, 0.01%↑) 31.420 0.800
Unet2dPT(linalg) [nvidia-pascal-vulkan\_linux-vulkan\_spirv][experimental-flags,simt] vulkan(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 144.911 (vs. 144.926, 0.01%↓) 144.906 0.078
matmul\_2560x2560x2560\_f16t\_tile\_config\_default(linalg) [cuda-sm\_80-linux\_gnu-cuda][ukernel,matmul] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 0.147 (vs. 0.147, 0.01%↓) 0.147 0.000
MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 72.542 (vs. 72.538, 0.01%↑) 72.538 0.084
MobileSSD\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] 29.841 (vs. 29.842, 0.00%↓) 29.806 0.413
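
Each latency cell above has the form `current (vs. baseline, change%↑/↓)`. The report does not show how the percentage is derived, so the following is only a minimal sketch that assumes it is the relative change against the baseline value. The helper names `percent_change` and `format_change` are hypothetical and are not part of any benchmark tooling.

```python
def percent_change(current: float, baseline: float) -> float:
    """Signed relative change of current vs. baseline, in percent (assumed formula)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (current - baseline) / baseline * 100.0


def format_change(current: float, baseline: float) -> str:
    """Render a cell in the 'X (vs. Y, Z%↑)' style used in the tables."""
    change = percent_change(current, baseline)
    # Up arrow means the metric grew (slower), down arrow means it shrank (faster).
    arrow = "↑" if change > 0 else "↓" if change < 0 else ""
    return f"{current:.3f} (vs. {baseline:.3f}, {abs(change):.2f}%{arrow})"


# Values taken from two rows of the latency table above.
print(format_change(104.909, 104.073))  # 104.909 (vs. 104.073, 0.80%↑)
print(format_change(283.146, 285.132))  # 283.146 (vs. 285.132, 0.70%↓)
```

Rows such as `0.135 (vs. 0.135, 0.21%↑)` suggest that the report computes the percentage from unrounded measurements, so recomputing it from the displayed three-decimal values can differ in the last digit for very small latencies.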

All Compilation Metrics

Benchmark Name Compilation Time (ms) Total Dispatch Size (bytes) Total Artifact Size (bytes)
PersonDetect_int8(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 27881 (vs. 28207, 1.16%↓) 156472 (vs. 156472, 0.00%) 416325 (vs. 416325, 0.00%)
MobileNetV3Small_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 39660 (vs. 32271, 22.90%↑) 208680 (vs. 208680, 0.00%) 10430021 (vs. 10430021, 0.00%)
DeepLabV3_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 33001 (vs. 26810, 23.09%↑) 141096 (vs. 141096, 0.00%) 2922117 (vs. 2922117, 0.00%)
EfficientNet_int8(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 70639 (vs. 55671, 26.89%↑) 503880 (vs. 503880, 0.00%) 5538629 (vs. 5538629, 0.00%)
MobileNetV1_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 18100 (vs. 18873, 4.10%↓) 88320 (vs. 88320, 0.00%) 17004677 (vs. 17004677, 0.00%)
MobileNetV2_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 31198 (vs. 29304, 6.46%↑) 142248 (vs. 142248, 0.00%) 14125509 (vs. 14125509, 0.00%)
MobileNetV2_int8(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 70759 (vs. 64488, 9.72%↑) 476664 (vs. 476664, 0.00%) 4129413 (vs. 4129413, 0.00%)
MobileSSD_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 50268 (vs. 43427, 15.75%↑) 247112 (vs. 247112, 0.00%) 18187077 (vs. 18187077, 0.00%)
PoseNet_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 16936 (vs. 14918, 13.53%↑) 83672 (vs. 83672, 0.00%) 5136709 (vs. 5136709, 0.00%)
MobileBertSquad_fp16(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 58220 (vs. 58778, 0.95%↓) 79072 (vs. 79072, 0.00%) 99936645 (vs. 99936645, 0.00%)
MobileBertSquad_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 64112 (vs. 57066, 12.35%↑) 82336 (vs. 82336, 0.00%) 98480901 (vs. 98480901, 0.00%)
MobileBertSquad_int8(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 339751 (vs. 320920, 5.87%↑) 5987712 (vs. 5987712, 0.00%) 31196805 (vs. 31196805, 0.00%)
EfficientNetV2STF(stablehlo) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 105954 (vs. 93057, 13.86%↑) 235000 (vs. 235000, 0.00%) 164123217 (vs. 164123217, 0.00%)
MiniLML12H384Uncased(stablehlo) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 34584 (vs. 31552, 9.61%↑) 66032 (vs. 66032, 0.00%) 133766228 (vs. 133766228, 0.00%)
Resnet50TFBatch1(stablehlo) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 51942 (vs. 46675, 11.28%↑) 135896 (vs. 135896, 0.00%) 137643325 (vs. 137643325, 0.00%)
EfficientNetV2SPT(linalg) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 92162 (vs. 80634, 14.30%↑) 397536 (vs. 397536, 0.00%) 86909381 (vs. 86909381, 0.00%)
BertForMaskedLMTF(stablehlo) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 38327 (vs. 33596, 14.08%↑) 60256 (vs. 60256, 0.00%) 438459409 (vs. 438459409, 0.00%)
BertLargeTF(stablehlo) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 36066 (vs. 38285, 5.80%↓) 58384 (vs. 58384, 0.00%) 1336032523 (vs. 1336032523, 0.00%)
EfficientNetB7PT(linalg) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 147116 (vs. 114234, 28.78%↑) 581120 (vs. 581120, 0.00%) 267319877 (vs. 267319877, 0.00%)
PersonDetect_int8(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 26960 (vs. 25726, 4.80%↑) 158456 (vs. 158456, 0.00%) 415429 (vs. 415429, 0.00%)
MobileNetV3Small_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 32885 (vs. 35861, 8.30%↓) 207032 (vs. 207032, 0.00%) 10426757 (vs. 10426757, 0.00%)
DeepLabV3_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 27088 (vs. 20435, 32.56%↑) 146440 (vs. 146440, 0.00%) 2922501 (vs. 2922501, 0.00%)
EfficientNet_int8(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 57688 (vs. 46725, 23.46%↑) 516056 (vs. 516056, 0.00%) 5546821 (vs. 5546821, 0.00%)
MobileNetV2_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 24650 (vs. 24133, 2.14%↑) 149192 (vs. 149192, 0.00%) 14127749 (vs. 14127749, 0.00%)
MobileNetV2_int8(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 80386 (vs. 69961, 14.90%↑) 476664 (vs. 476664, 0.00%) 4129413 (vs. 4129413, 0.00%)
MobileSSD_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 41948 (vs. 36367, 15.35%↑) 254792 (vs. 254792, 0.00%) 18189381 (vs. 18189381, 0.00%)
PoseNet_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 15616 (vs. 11681, 33.69%↑) 86216 (vs. 86216, 0.00%) 5134789 (vs. 5134789, 0.00%)
MobileBertSquad_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 58044 (vs. 51766, 12.13%↑) 82336 (vs. 82336, 0.00%) 98480901 (vs. 98480901, 0.00%)
MobileBertSquad_int8(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 362517 (vs. 344379, 5.27%↑) 5987712 (vs. 5987712, 0.00%) 31196805 (vs. 31196805, 0.00%)
EfficientNetV2STF(stablehlo) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 105759 (vs. 93783, 12.77%↑) 229688 (vs. 229688, 0.00%) 164113937 (vs. 164113937, 0.00%)
MiniLML12H384Uncased(stablehlo) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 32258 (vs. 28469, 13.31%↑) 66032 (vs. 66032, 0.00%) 133766228 (vs. 133766228, 0.00%)
BertLargeTF(stablehlo) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 37197 (vs. 35298, 5.38%↑) 58384 (vs. 58384, 0.00%) 1336032523 (vs. 1336032523, 0.00%)
EfficientNetV2STF(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 109634 (vs. 95380, 14.94%↑) 944068 (vs. 944068, 0.00%) 164907877 (vs. 164907877, 0.00%)
MiniLML12H384Uncased(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 33833 (vs. 28017, 20.76%↑) 159016 (vs. 159016, 0.00%) 133877563 (vs. 133877563, 0.00%)
BertForMaskedLMTF(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 29617 (vs. 32079, 7.67%↓) 204014 (vs. 204014, 0.00%) 438620817 (vs. 438620817, 0.00%)
BertLargeTF(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 37800 (vs. 32529, 16.20%↑) 142970 (vs. 142970, 0.00%) 1336122859 (vs. 1336122859, 0.00%)
ClipTextSeqLen64PT(linalg) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 77305 (vs. 68182, 13.38%↑) 88394 (vs. 88394, 0.00%) 492381747 (vs. 492381747, 0.00%)
Unet2dPT(linalg) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 259759 (vs. 230104, 12.89%↑) 1746196 (vs. 1746196, 0.00%) 3440116838 (vs. 3440116838, 0.00%)
EfficientNetB7PT(linalg) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 85464 (vs. 81430, 4.95%↑) 2701600 (vs. 2701600, 0.00%) 269500177 (vs. 269500177, 0.00%)
matmul_3456x1024x2048_f16t_tile_config_default(linalg) [cuda-sm_80-linux_gnu-cuda][ukernel,matmul,compile-stats] 1135 (vs. 965, 17.62%↑) 34574 (vs. 34574, 0.00%) 44977 (vs. 44977, 0.00%)
matmul_3456x1024x2048_f32t_tile_config_default(linalg) [cuda-sm_80-linux_gnu-cuda][ukernel,matmul,compile-stats] 1392 (vs. 1412, 1.42%↓) 41782 (vs. 41782, 0.00%) 52185 (vs. 52185, 0.00%)
matmul_2560x2560x2560_f16t_tile_config_default(linalg) [cuda-sm_80-linux_gnu-cuda][ukernel,matmul,compile-stats] 886 (vs. 762, 16.27%↑) 33642 (vs. 33642, 0.00%) 43981 (vs. 43981, 0.00%)
matmul_2560x2560x2560_f32t_tile_config_default(linalg) [cuda-sm_80-linux_gnu-cuda][ukernel,matmul,compile-stats] 1182 (vs. 1452, 18.60%↓) 40506 (vs. 40506, 0.00%) 50845 (vs. 50845, 0.00%)
matmul_128x256x8192_f16t_tile_config_default(linalg) [cuda-sm_80-linux_gnu-cuda][ukernel,matmul,splitk,compile-stats] 490 (vs. 477, 2.73%↑) 9464 (vs. 9464, 0.00%) 26635 (vs. 26635, 0.00%)
matmul_128x256x8192_f32t_tile_config_default(linalg) [cuda-sm_80-linux_gnu-cuda][ukernel,matmul,splitk,compile-stats] 519 (vs. 480, 8.12%↑) 10252 (vs. 10252, 0.00%) 27411 (vs. 27411, 0.00%)
Resnet50TFBatch1(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 40651 (vs. 42931, 5.31%↓) 394080 (vs. 394080, 0.00%) 137932747 (vs. 137932747, 0.00%)
Resnet50TFBatch8(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 40606 (vs. 37058, 9.57%↑) 382630 (vs. 382630, 0.00%) 103165825 (vs. 103165825, 0.00%)
Resnet50TFBatch64(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 47687 (vs. 40307, 18.31%↑) 433070 (vs. 433070, 0.00%) 103216193 (vs. 103216193, 0.00%)
Resnet50TFBatch128(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 42386 (vs. 42140, 0.58%↑) 433710 (vs. 433710, 0.00%) 103216833 (vs. 103216833, 0.00%)
Resnet50TFBatch256(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 47539 (vs. 39840, 19.32%↑) 433722 (vs. 433722, 0.00%) 103216897 (vs. 103216897, 0.00%)
Resnet50TFBatch2048(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 45251 (vs. 40986, 10.41%↑) 434226 (vs. 434226, 0.00%) 103217345 (vs. 103217345, 0.00%)
BertLargeTFBatch1(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 78236 (vs. 69365, 12.79%↑) 114680 (vs. 114680, 0.00%) 1337218466 (vs. 1337218466, 0.00%)
BertLargeTFBatch16(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 79948 (vs. 77957, 2.55%↑) 438030 (vs. 438030, 0.00%) 1337545102 (vs. 1337545102, 0.00%)
BertLargeTFBatch24(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 83437 (vs. 75408, 10.65%↑) 438374 (vs. 438374, 0.00%) 1337545422 (vs. 1337545422, 0.00%)
BertLargeTFBatch32(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 82847 (vs. 80644, 2.73%↑) 438466 (vs. 438466, 0.00%) 1337545486 (vs. 1337545486, 0.00%)
BertLargeTFBatch48(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 80427 (vs. 79365, 1.34%↑) 438466 (vs. 438466, 0.00%) 1337545550 (vs. 1337545550, 0.00%)
BertLargeTFBatch64(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 84143 (vs. 87500, 3.84%↓) 438514 (vs. 438514, 0.00%) 1337545614 (vs. 1337545614, 0.00%)
BertLargeTFBatch512(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 80791 (vs. 71044, 13.72%↑) 438738 (vs. 438738, 0.00%) 1337545870 (vs. 1337545870, 0.00%)
BertLargeTFBatch1024(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 82092 (vs. 79001, 3.91%↑) 438782 (vs. 438782, 0.00%) 1337545806 (vs. 1337545806, 0.00%)
BertLargeTFBatch1280(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 79268 (vs. 78803, 0.59%↑) 438782 (vs. 438782, 0.00%) 1337545870 (vs. 1337545870, 0.00%)
T5LargeTFBatch1(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 128480 (vs. 105090, 22.26%↑) 119084 (vs. 119084, 0.00%) 2954683426 (vs. 2954683426, 0.00%)
T5LargeTFBatch16(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 141390 (vs. 134588, 5.05%↑) 275464 (vs. 275464, 0.00%) 2953798014 (vs. 2953798014, 0.00%)
T5LargeTFBatch24(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 139696 (vs. 118589, 17.80%↑) 275548 (vs. 275548, 0.00%) 2953798206 (vs. 2953798206, 0.00%)
T5LargeTFBatch32(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 116938 (vs. 108247, 8.03%↑) 275548 (vs. 275548, 0.00%) 2953798142 (vs. 2953798142, 0.00%)
T5LargeTFBatch48(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 142039 (vs. 132404, 7.28%↑) 275548 (vs. 275548, 0.00%) 2953798142 (vs. 2953798142, 0.00%)
T5LargeTFBatch64(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 137879 (vs. 135504, 1.75%↑) 275604 (vs. 275604, 0.00%) 2953798270 (vs. 2953798270, 0.00%)
T5LargeTFBatch512(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 141197 (vs. 132162, 6.84%↑) 274268 (vs. 274268, 0.00%) 2953796542 (vs. 2953796542, 0.00%)
BertLargePTBatch1(linalg) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 228709 (vs. 203929, 12.15%↑) 113038 (vs. 113038, 0.00%) 1336569944 (vs. 1336569944, 0.00%)
BertLargePTBatch16(linalg) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 233316 (vs. 220407, 5.86%↑) 490282 (vs. 490282, 0.00%) 1336958672 (vs. 1336958672, 0.00%)
BertLargePTBatch24(linalg) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 232424 (vs. 224274, 3.63%↑) 490682 (vs. 490682, 0.00%) 1336959120 (vs. 1336959120, 0.00%)
BertLargePTBatch32(linalg) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 215078 (vs. 208219, 3.29%↑) 490774 (vs. 490774, 0.00%) 1336959120 (vs. 1336959120, 0.00%)
BertLargePTBatch48(linalg) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 190492 (vs. 183752, 3.67%↑) 490774 (vs. 490774, 0.00%) 1336959184 (vs. 1336959184, 0.00%)
BertLargePTBatch64(linalg) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 217164 (vs. 185998, 16.76%↑) 490802 (vs. 490802, 0.00%) 1336959248 (vs. 1336959248, 0.00%)
BertLargePTBatch512(linalg) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 233351 (vs. 206785, 12.85%↑) 490994 (vs. 490994, 0.00%) 1336959440 (vs. 1336959440, 0.00%)
BertLargePTBatch1024(linalg) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 233466 (vs. 206618, 12.99%↑) 491038 (vs. 491038, 0.00%) 1336959376 (vs. 1336959376, 0.00%)
BertLargePTBatch1280(linalg) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 234324 (vs. 218472, 7.26%↑) 491042 (vs. 491042, 0.00%) 1336959504 (vs. 1336959504, 0.00%)
Resnet50PTBatch1(linalg) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 34116 (vs. 25236, 35.19%↑) 616424 (vs. 616424, 0.00%) 103088167 (vs. 103088167, 0.00%)
Resnet50PTBatch8(linalg) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 30464 (vs. 22834, 33.42%↑) 299342 (vs. 299342, 0.00%) 102771221 (vs. 102771221, 0.00%)
Resnet50PTBatch64(linalg) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 26901 (vs. 22803, 17.97%↑) 353858 (vs. 353858, 0.00%) 102825749 (vs. 102825749, 0.00%)
Resnet50PTBatch128(linalg) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 31745 (vs. 23419, 35.55%↑) 354338 (vs. 354338, 0.00%) 102826133 (vs. 102826133, 0.00%)
Resnet50PTBatch256(linalg) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 23268 (vs. 25316, 8.09%↓) 354370 (vs. 354370, 0.00%) 102826197 (vs. 102826197, 0.00%)
Resnet50PTBatch2048(linalg) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 30252 (vs. 23824, 26.98%↑) 354862 (vs. 354862, 0.00%) 102826709 (vs. 102826709, 0.00%)
DeepLabV3_fp32(tflite) [riscv_64-generic-linux_gnu-llvm_cpu][default-flags,compile-stats] 20679 (vs. 15399, 34.29%↑) 48152 (vs. 48152, 0.00%) 2829191 (vs. 2829191, 0.00%)
MobileBertSquad_fp32(tflite) [riscv_64-generic-linux_gnu-llvm_cpu][default-flags,compile-stats] 52285 (vs. 46373, 12.75%↑) 38456 (vs. 38456, 0.00%) 98436999 (vs. 98436999, 0.00%)
MobileNetV1_fp32(tflite) [riscv_64-generic-linux_gnu-llvm_cpu][default-flags,compile-stats] 16330 (vs. 14277, 14.38%↑) 53064 (vs. 53064, 0.00%) 16969287 (vs. 16969287, 0.00%)
MobileBertSquad_int8(tflite) [riscv_64-generic-linux_gnu-llvm_cpu][default-flags,compile-stats] 254337 (vs. 238002, 6.86%↑) 2170080 (vs. 2170080, 0.00%) 27379207 (vs. 27379207, 0.00%)
PersonDetect_int8(tflite) [riscv_64-generic-linux_gnu-llvm_cpu][default-flags,compile-stats] 22831 (vs. 20979, 8.83%↑) 72232 (vs. 72232, 0.00%) 332103 (vs. 332103, 0.00%)
EfficientNet_int8(tflite) [riscv_64-generic-linux_gnu-llvm_cpu][default-flags,compile-stats] 45190 (vs. 42687, 5.86%↑) 170088 (vs. 170088, 0.00%) 5204871 (vs. 5204871, 0.00%)
MobileNetV2_int8(tflite) [riscv_64-generic-linux_gnu-llvm_cpu][default-flags,compile-stats] 55910 (vs. 49771, 12.33%↑) 184976 (vs. 184976, 0.00%) 3837703 (vs. 3837703, 0.00%)
EfficientNet_int8(tflite) [riscv_32-generic-linux_gnu-llvm_cpu][default-flags,compile-stats] 37503 (vs. 42379, 11.51%↓) 300652 (vs. 300652, 0.00%) 5335431 (vs. 5335431, 0.00%)
MobileBertSquad_int8(tflite) [riscv_32-generic-linux_gnu-llvm_cpu][default-flags,compile-stats] 306986 (vs. 302684, 1.42%↑) 2297668 (vs. 2297668, 0.00%) 27506759 (vs. 27506759, 0.00%)
PersonDetect_int8(tflite) [riscv_32-generic-linux_gnu-llvm_cpu][default-flags,compile-stats] 24165 (vs. 22135, 9.17%↑) 174140 (vs. 174140, 0.00%) 433991 (vs. 433991, 0.00%)
MobileNetV2_int8(tflite) [riscv_32-generic-linux_gnu-llvm_cpu][default-flags,compile-stats] 64157 (vs. 55877, 14.82%↑) 254340 (vs. 254340, 0.00%) 3907079 (vs. 3907079, 0.00%)
DeepLabV3_fp32(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][default-flags,compile-stats] 44236 (vs. 33515, 31.99%↑) 96440 (vs. 96440, 0.00%) 2877445 (vs. 2877445, 0.00%)
MobileSSD_fp32(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][default-flags,compile-stats] 95722 (vs. 79041, 21.10%↑) 284360 (vs. 284360, 0.00%) 18224325 (vs. 18224325, 0.00%)
PoseNet_fp32(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][default-flags,compile-stats] 20310 (vs. 16521, 22.93%↑) 33816 (vs. 33816, 0.00%) 5086853 (vs. 5086853, 0.00%)
MobileBertSquad_fp32(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][default-flags,compile-stats] 70123 (vs. 62031, 13.05%↑) 113576 (vs. 113576, 0.00%) 98512133 (vs. 98512133, 0.00%)
MobileNetV2_fp32(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][default-flags,compile-stats] 37784 (vs. 37656, 0.34%↑) 104216 (vs. 104216, 0.00%) 14087493 (vs. 14087493, 0.00%)
MobileNetV3Small_fp32(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][default-flags,compile-stats] 43838 (vs. 42012, 4.35%↑) 94808 (vs. 94808, 0.00%) 10316165 (vs. 10316165, 0.00%)
MobileBertSquad_int8(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][default-flags,compile-stats] 282364 (vs. 268199, 5.28%↑) 2064888 (vs. 2064888, 0.00%) 27273989 (vs. 27273989, 0.00%)
DeepLabV3_fp32(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][experimental-flags,mmt4d,compile-stats] 39865 (vs. 36890, 8.06%↑) 95504 (vs. 95504, 0.00%) 2892229 (vs. 2892229, 0.00%)
MobileSSD_fp32(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][experimental-flags,mmt4d,compile-stats] 65519 (vs. 54196, 20.89%↑) 133568 (vs. 133568, 0.00%) 18092805 (vs. 18092805, 0.00%)
PoseNet_fp32(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][experimental-flags,mmt4d,compile-stats] 21144 (vs. 21792, 2.97%↓) 43920 (vs. 43920, 0.00%) 5102789 (vs. 5102789, 0.00%)
MobileBertSquad_fp32(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][experimental-flags,mmt4d,compile-stats] 74158 (vs. 62696, 18.28%↑) 29248 (vs. 29248, 0.00%) 98591621 (vs. 98591621, 0.00%)
MobileNetV2_fp32(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][experimental-flags,mmt4d,compile-stats] 36433 (vs. 36838, 1.10%↓) 92528 (vs. 92528, 0.00%) 14089221 (vs. 14089221, 0.00%)
MobileNetV3Small_fp32(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][experimental-flags,mmt4d,compile-stats] 53654 (vs. 47428, 13.13%↑) 127296 (vs. 127296, 0.00%) 10365893 (vs. 10365893, 0.00%)
MobileBertSquad_int8(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][experimental-flags,mmt4d,dotprod,compile-stats] 270015 (vs. 248538, 8.64%↑) 1615424 (vs. 1615424, 0.00%) 27009797 (vs. 27009797, 0.00%)
DeepLabV3_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 14029 (vs. 15624, 10.21%↓) 389704 (vs. 389704, 0.00%) 3189487 (vs. 3189487, 0.00%)
MobileSSD_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 27437 (vs. 24420, 12.35%↑) 543662 (vs. 543662, 0.00%) 18517723 (vs. 18517723, 0.00%)
PoseNet_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 10574 (vs. 6874, 53.83%↑) 137332 (vs. 137332, 0.00%) 5201323 (vs. 5201323, 0.00%)
MobileBertSquad_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 47689 (vs. 42914, 11.13%↑) 224190 (vs. 224190, 0.00%) 98633946 (vs. 98633946, 0.00%)
MobileNetV2_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 14997 (vs. 12283, 22.10%↑) 330534 (vs. 330534, 0.00%) 14332098 (vs. 14332098, 0.00%)
MobileNetV3Small_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 26993 (vs. 20792, 29.82%↑) 395100 (vs. 395100, 0.00%) 10648489 (vs. 10648489, 0.00%)
DeepLabV3_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,compile-stats] 14835 (vs. 16554, 10.38%↓) 411244 (vs. 411244, 0.00%) 3202952 (vs. 3202952, 0.00%)
MobileSSD_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,compile-stats] 24438 (vs. 17935, 36.26%↑) 603028 (vs. 603028, 0.00%) 18566530 (vs. 18566530, 0.00%)
PoseNet_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,compile-stats] 9228 (vs. 8577, 7.59%↑) 158108 (vs. 158108, 0.00%) 5215325 (vs. 5215325, 0.00%)
MobileBertSquad_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,compile-stats] 48148 (vs. 49690, 3.10%↓) 224190 (vs. 224190, 0.00%) 98633946 (vs. 98633946, 0.00%)
MobileNetV2_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,compile-stats] 21744 (vs. 16002, 35.88%↑) 373464 (vs. 373464, 0.00%) 14366036 (vs. 14366036, 0.00%)
MobileNetV3Small_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,compile-stats] 21510 (vs. 20140, 6.80%↑) 410130 (vs. 410130, 0.00%) 10659246 (vs. 10659246, 0.00%)
MobileSSD_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,repeated-kernel,compile-stats] 27370 (vs. 24152, 13.32%↑) 603028 (vs. 603028, 0.00%) 18649730 (vs. 18649730, 0.00%)
PoseNet_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,repeated-kernel,compile-stats] 6864 (vs. 7574, 9.37%↓) 158108 (vs. 158108, 0.00%) 5243933 (vs. 5243933, 0.00%)
MobileNetV2_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,repeated-kernel,compile-stats] 17471 (vs. 17062, 2.40%↑) 373464 (vs. 373464, 0.00%) 14412308 (vs. 14412308, 0.00%)
MobileNetV3Small_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,repeated-kernel,compile-stats] 24714 (vs. 26293, 6.01%↓) 410130 (vs. 410130, 0.00%) 10738222 (vs. 10738222, 0.00%)
DeepLabV3_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 19367 (vs. 16774, 15.46%↑) 178828 (vs. 178828, 0.00%) 2978735 (vs. 2978735, 0.00%)
MobileSSD_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 25533 (vs. 26363, 3.15%↓) 356186 (vs. 356186, 0.00%) 18330395 (vs. 18330395, 0.00%)
PoseNet_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 8433 (vs. 9713, 13.18%↓) 95836 (vs. 95836, 0.00%) 5159851 (vs. 5159851, 0.00%)
MobileBertSquad_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 46228 (vs. 44290, 4.38%↑) 127750 (vs. 127750, 0.00%) 98537498 (vs. 98537498, 0.00%)
MobileNetV2_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 17860 (vs. 16353, 9.22%↑) 196098 (vs. 196098, 0.00%) 14197826 (vs. 14197826, 0.00%)
MobileNetV3Small_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 21686 (vs. 25019, 13.32%↓) 259492 (vs. 259492, 0.00%) 10513001 (vs. 10513001, 0.00%)
MobileBertSquad_fp16(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][default-flags,demote-f32-to-f16,compile-stats] 80507 (vs. 79453, 1.33%↑) 2977872 (vs. 2977872, 0.00%) 52959520 (vs. 52959520, 0.00%)
MobileBertSquad_int8(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 200104 (vs. 182125, 9.87%↑) 7470618 (vs. 7470618, 0.00%) 32920384 (vs. 32920384, 0.00%)
EfficientNet_int8(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 31524 (vs. 28508, 10.58%↑) 523406 (vs. 523406, 0.00%) 5600293 (vs. 5600293, 0.00%)
PersonDetect_int8(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 19768 (vs. 17791, 11.11%↑) 326524 (vs. 326524, 0.00%) 612979 (vs. 612979, 0.00%)
DeepLabV3_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,compile-stats] 12690 (vs. 13273, 4.39%↓) 201944 (vs. 201944, 0.00%) 2993736 (vs. 2993736, 0.00%)
MobileSSD_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,compile-stats] 23113 (vs. 21561, 7.20%↑) 377916 (vs. 377916, 0.00%) 18341634 (vs. 18341634, 0.00%)
PoseNet_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,compile-stats] 6717 (vs. 7898, 14.95%↓) 117288 (vs. 117288, 0.00%) 5174557 (vs. 5174557, 0.00%)
MobileBertSquad_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,compile-stats] 48017 (vs. 42902, 11.92%↑) 127750 (vs. 127750, 0.00%) 98537498 (vs. 98537498, 0.00%)
MobileNetV2_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,compile-stats] 12001 (vs. 11745, 2.18%↑) 213196 (vs. 213196, 0.00%) 14205972 (vs. 14205972, 0.00%)
MobileNetV3Small_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,compile-stats] 24888 (vs. 16420, 51.57%↑) 282506 (vs. 282506, 0.00%) 10531758 (vs. 10531758, 0.00%)
MobileBertSquad_fp16(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,demote-f32-to-f16,compile-stats] 101475 (vs. 85847, 18.20%↑) 2977872 (vs. 2977872, 0.00%) 52959520 (vs. 52959520, 0.00%)
MobileBertSquad_int8(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,compile-stats] 214211 (vs. 217699, 1.60%↓) 7470618 (vs. 7470618, 0.00%) 32920384 (vs. 32920384, 0.00%)
EfficientNet_int8(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,compile-stats] 33249 (vs. 27779, 19.69%↑) 548328 (vs. 548328, 0.00%) 5616147 (vs. 5616147, 0.00%)
PersonDetect_int8(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,compile-stats] 19170 (vs. 14762, 29.86%↑) 340138 (vs. 340138, 0.00%) 620254 (vs. 620254, 0.00%)
DeepLabV3_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,repeated-kernel,compile-stats] 18945 (vs. 14755, 28.40%↑) 201944 (vs. 201944, 0.00%) 3097864 (vs. 3097864, 0.00%)
MobileSSD_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,repeated-kernel,compile-stats] 34097 (vs. 28744, 18.62%↑) 377916 (vs. 377916, 0.00%) 18513474 (vs. 18513474, 0.00%)
PoseNet_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,repeated-kernel,compile-stats] 8198 (vs. 9139, 10.30%↓) 117288 (vs. 117288, 0.00%) 5233629 (vs. 5233629, 0.00%)
MobileBertSquad_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,repeated-kernel,compile-stats] 75390 (vs. 60823, 23.95%↑) 127750 (vs. 127750, 0.00%) 99801306 (vs. 99801306, 0.00%)
MobileNetV2_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,repeated-kernel,compile-stats] 12748 (vs. 11437, 11.46%↑) 213196 (vs. 213196, 0.00%) 14301460 (vs. 14301460, 0.00%)
MobileNetV3Small_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,repeated-kernel,compile-stats] 23863 (vs. 22137, 7.80%↑) 282506 (vs. 282506, 0.00%) 10694958 (vs. 10694958, 0.00%)
MobileBertSquad_fp16(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,repeated-kernel,demote-f32-to-f16,compile-stats] 179972 (vs. 177816, 1.21%↑) 2977872 (vs. 2977872, 0.00%) 54223328 (vs. 54223328, 0.00%)
MobileBertSquad_int8(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,repeated-kernel,compile-stats] 300719 (vs. 272213, 10.47%↑) 7470618 (vs. 7470618, 0.00%) 35441088 (vs. 35441088, 0.00%)
EfficientNet_int8(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,repeated-kernel,compile-stats] 34598 (vs. 39470, 12.34%↓) 548328 (vs. 548328, 0.00%) 5824467 (vs. 5824467, 0.00%)
PersonDetect_int8(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,repeated-kernel,compile-stats] 21380 (vs. 20940, 2.10%↑) 340138 (vs. 340138, 0.00%) 748766 (vs. 748766, 0.00%)
ClipTextSeqLen64PT(linalg) [nvidia-ampere-vulkan_linux-vulkan_spirv][experimental-flags,tensorcore,compile-stats] 83012 (vs. 74299, 11.73%↑) 202696 (vs. 202696, 0.00%) 492522977 (vs. 492522977, 0.00%)
Unet2dPT(linalg) [nvidia-ampere-vulkan_linux-vulkan_spirv][experimental-flags,tensorcore,compile-stats] 269571 (vs. 251674, 7.11%↑) 3787496 (vs. 3787496, 0.00%) 3442195216 (vs. 3442195216, 0.00%)
ClipTextSeqLen64PT(linalg) [nvidia-pascal-vulkan_linux-vulkan_spirv][experimental-flags,simt,compile-stats] 80262 (vs. 74744, 7.38%↑) 202696 (vs. 202696, 0.00%) 492522977 (vs. 492522977, 0.00%)
Unet2dPT(linalg) [nvidia-pascal-vulkan_linux-vulkan_spirv][experimental-flags,simt,compile-stats] 264384 (vs. 250021, 5.74%↑) 3778208 (vs. 3778208, 0.00%) 3442185936 (vs. 3442185936, 0.00%)
MobileNetV2_fp32(tflite) [vmvx-generic-vmvx-vmvx][experimental-flags,compile-stats] 16631 (vs. 16720, 0.53%↓) 95861 (vs. 95861, 0.00%) 14079102 (vs. 14079102, 0.00%)
MobileNetV3Small_fp32(tflite) [vmvx-generic-vmvx-vmvx][experimental-flags,compile-stats] 30811 (vs. 26241, 17.42%↑) 164533 (vs. 164533, 0.00%) 10385854 (vs. 10385854, 0.00%)
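
Each cell in the rows above has the form `current (vs. base, pct%↑/↓)`. The following is a minimal sketch for recomputing those percentages when spot-checking a row. It assumes the delta is `(current - base) / base`, which is consistent with the rows shown here but is not confirmed by the report itself. The names `CELL`, `relative_change`, and `check_cell` are hypothetical helpers, not part of any benchmark tooling.

```python
import re

# Matches cells such as "92162 (vs. 80634, 14.30%↑)" or "397536 (vs. 397536, 0.00%)".
CELL = re.compile(
    r"(\d+(?:\.\d+)?) \(vs\. (\d+(?:\.\d+)?), (\d+(?:\.\d+)?)%([↑↓]?)\)"
)


def relative_change(current: float, base: float) -> float:
    """Signed percentage change of `current` relative to `base` (assumed formula)."""
    return (current - base) / base * 100.0


def check_cell(cell: str, tolerance: float = 0.01) -> bool:
    """Return True if the reported percentage agrees with the recomputed one."""
    m = CELL.fullmatch(cell)
    if m is None:
        raise ValueError(f"unrecognized cell format: {cell!r}")
    current, base, reported = float(m[1]), float(m[2]), float(m[3])
    # A down arrow marks a decrease; an up arrow or no arrow is non-negative.
    sign = -1.0 if m[4] == "↓" else 1.0
    return abs(relative_change(current, base) - sign * reported) <= tolerance


if __name__ == "__main__":
    for cell in ("92162 (vs. 80634, 14.30%↑)", "36066 (vs. 38285, 5.80%↓)"):
        print(cell, "consistent" if check_cell(cell) else "mismatch")
```

The tolerance allows for the report rounding percentages to two decimal places.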