Full Benchmark Summary

Regressed Latencies 🚩

| Benchmark Name | Average Latency (ms) | Median Latency (ms) | Latency Standard Deviation (ms) |
| --- | --- | --- | --- |
| MobileNetV3Small\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 9.140 (vs. 7.381, 23.83%↑) | 9.193 | 0.285 |
| MobileBertSquad\_int8(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 100.675 (vs. 87.382, 15.21%↑) | 101.622 | 2.615 |
| MobileNetV3Small\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,max-concurrency] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 8.672 (vs. 7.929, 9.37%↑) | 8.632 | 0.133 |
| MobileNetV2\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,max-concurrency] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 9.546 (vs. 8.766, 8.90%↑) | 9.546 | 0.073 |
| MobileBertSquad\_fp16(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][default-flags,demote-f32-to-f16] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 83.014 (vs. 76.381, 8.68%↑) | 83.023 | 0.430 |
| EfficientNet\_int8(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 15.357 (vs. 14.165, 8.41%↑) | 15.285 | 0.210 |
| EfficientNet\_int8(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,max-concurrency] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 15.834 (vs. 14.801, 6.98%↑) | 15.839 | 0.061 |
| MobileBertSquad\_fp16(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,max-concurrency,demote-f32-to-f16] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 93.109 (vs. 87.522, 6.38%↑) | 93.177 | 3.079 |
| MobileNetV2\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 10.922 (vs. 10.350, 5.52%↑) | 10.940 | 0.061 |
| DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] | 38.953 (vs. 37.050, 5.14%↑) | 36.267 | 8.488 |
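
The parenthesized figure in the Average Latency column reads as a relative change against the baseline average. The sketch below reproduces that arithmetic for rows in this report. It is an illustration only: `percent_change` and `format_change` are hypothetical helper names, and the formula `(current - baseline) / baseline` is an assumption that happens to match the published numbers, not a description of how the benchmark tooling is implemented.

```python
def percent_change(current_ms: float, baseline_ms: float) -> float:
    """Relative change of current vs. baseline in percent (positive means slower)."""
    return (current_ms - baseline_ms) / baseline_ms * 100.0


def format_change(current_ms: float, baseline_ms: float) -> str:
    """Render a cell in the same shape as the Average Latency column."""
    pct = percent_change(current_ms, baseline_ms)
    if pct > 0:
        arrow = "↑"  # latency went up (regression)
    elif pct < 0:
        arrow = "↓"  # latency went down (improvement)
    else:
        arrow = ""
    return f"{current_ms:.3f} (vs. {baseline_ms:.3f}, {abs(pct):.2f}%{arrow})"


# First regressed row: MobileNetV3Small_fp32 on pixel-6-pro[gpu]
print(format_change(9.140, 7.381))
# An improved row: DeepLabV3_fp32 on c2-standard-16[cpu], 1-thread
print(format_change(33.894, 36.030))
```

Under this assumption, the two calls give `9.140 (vs. 7.381, 23.83%↑)` and `33.894 (vs. 36.030, 5.93%↓)`, which agree with the corresponding table cells.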

Improved Latencies 🎉

| Benchmark Name | Average Latency (ms) | Median Latency (ms) | Latency Standard Deviation (ms) |
| --- | --- | --- | --- |
| MobileNetV2\_fp32(tflite) [vmvx-generic-vmvx-vmvx][experimental-flags] local\_task(vmvx\_module)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 4698.424 (vs. 5017.274, 6.36%↓) | 4736.794 | 130.781 |
| DeepLabV3\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 33.894 (vs. 36.030, 5.93%↓) | 33.925 | 0.194 |
| MobileNetV3Small\_fp32(tflite) [vmvx-generic-vmvx-vmvx][experimental-flags] local\_task(vmvx\_module)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 952.656 (vs. 1012.464, 5.91%↓) | 960.969 | 37.459 |
| DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] | 40.073 (vs. 42.302, 5.27%↓) | 40.323 | 0.658 |
| MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 209.978 (vs. 221.476, 5.19%↓) | 210.856 | 7.072 |
| MobileNetV1\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 26.925 (vs. 28.058, 4.04%↓) | 26.982 | 0.231 |

Similar Latencies

| Benchmark Name | Average Latency (ms) | Median Latency (ms) | Latency Standard Deviation (ms) |
| --- | --- | --- | --- |
| MobileNetV3Small\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 3.599 (vs. 4.537, 20.67%↓) | 3.600 | 0.027 |
| MobileNetV3Small\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 3.780 (vs. 4.571, 17.32%↓) | 3.779 | 0.030 |
| MobileNetV1\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 7.472 (vs. 8.259, 9.52%↓) | 7.256 | 0.438 |
| MobileNetV2\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 13.051 (vs. 14.259, 8.47%↓) | 13.024 | 0.144 |
| EfficientNet\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 7.682 (vs. 8.232, 6.68%↓) | 7.679 | 0.049 |
| DeepLabV3\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 8.098 (vs. 8.597, 5.81%↓) | 8.081 | 0.051 |
| MobileSSD\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 7.435 (vs. 7.825, 4.98%↓) | 7.369 | 0.163 |
| MobileBertSquad\_int8(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,max-concurrency] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 81.016 (vs. 77.198, 4.95%↑) | 80.754 | 0.957 |
| PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 37.129 (vs. 39.025, 4.86%↓) | 37.162 | 0.176 |
| PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 17.032 (vs. 17.861, 4.64%↓) | 17.186 | 0.408 |
| MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 10.260 (vs. 10.737, 4.44%↓) | 10.262 | 0.124 |
| DeepLabV3\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,max-concurrency] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 12.753 (vs. 12.280, 3.85%↑) | 12.774 | 0.093 |
| PersonDetect\_int8(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,max-concurrency,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ pixel-6-pro[gpu] | 2.720 (vs. 2.621, 3.77%↑) | 2.685 | 0.093 |
| PoseNet\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,max-concurrency,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ pixel-6-pro[gpu] | 15.019 (vs. 14.498, 3.59%↑) | 15.030 | 0.085 |
| MobileNetV2\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 4.658 (vs. 4.829, 3.54%↓) | 4.611 | 0.145 |
| MiniLML12H384Uncased(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 20.811 (vs. 21.561, 3.48%↓) | 20.546 | 0.831 |
| MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 11.997 (vs. 12.426, 3.46%↓) | 12.082 | 0.325 |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 7.909 (vs. 8.190, 3.43%↓) | 7.933 | 0.071 |
| MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] | 57.544 (vs. 59.558, 3.38%↓) | 57.801 | 0.785 |
| MobileNetV3Small\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 4.207 (vs. 4.070, 3.36%↑) | 4.194 | 0.025 |
| MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 27.871 (vs. 28.835, 3.34%↓) | 27.964 | 0.524 |
| MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 45.875 (vs. 47.410, 3.24%↓) | 45.990 | 0.270 |
| PoseNet\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 16.846 (vs. 16.324, 3.20%↑) | 16.839 | 0.103 |
| DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 21.245 (vs. 21.945, 3.19%↓) | 21.583 | 0.749 |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 6.098 (vs. 6.298, 3.17%↓) | 6.118 | 0.139 |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 6.737 (vs. 6.955, 3.14%↓) | 6.738 | 0.031 |
| PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 66.259 (vs. 64.295, 3.05%↑) | 66.043 | 1.687 |
| MobileBertSquad\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 100.236 (vs. 97.297, 3.02%↑) | 100.187 | 0.358 |
| PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] | 282.410 (vs. 291.197, 3.02%↓) | 282.100 | 1.933 |
| EfficientNetV2STF(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 58.661 (vs. 56.954, 3.00%↑) | 59.078 | 1.048 |
| BertLargeTF(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 1635.239 (vs. 1685.465, 2.98%↓) | 1640.565 | 17.354 |
| MobileBertSquad\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 50.274 (vs. 51.757, 2.86%↓) | 50.246 | 0.113 |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 4.445 (vs. 4.573, 2.80%↓) | 4.434 | 0.065 |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] | 7.113 (vs. 7.317, 2.78%↓) | 7.145 | 0.099 |
| MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d,dotprod] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 273.645 (vs. 281.473, 2.78%↓) | 273.727 | 1.697 |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] | 69.482 (vs. 71.449, 2.75%↓) | 69.495 | 0.088 |
| DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 18.731 (vs. 19.260, 2.74%↓) | 18.863 | 0.373 |
| PoseNet\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,max-concurrency] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 16.324 (vs. 15.895, 2.70%↑) | 16.296 | 0.141 |
| DeepLabV3\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 14.537 (vs. 14.159, 2.67%↑) | 14.525 | 0.127 |
| MobileSSD\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,max-concurrency] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 30.653 (vs. 29.868, 2.63%↑) | 30.645 | 0.159 |
| DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 37.435 (vs. 38.437, 2.61%↓) | 37.376 | 0.502 |
| MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] | 223.751 (vs. 218.086, 2.60%↑) | 223.901 | 1.075 |
| MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 21.274 (vs. 21.825, 2.52%↓) | 21.336 | 0.204 |
| MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 376.116 (vs. 385.780, 2.51%↓) | 375.890 | 2.570 |
| MobileBertSquad\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,max-concurrency] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 106.191 (vs. 103.617, 2.48%↑) | 106.003 | 0.580 |
| DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] | 25.118 (vs. 25.750, 2.45%↓) | 25.129 | 0.038 |
| MobileNetV3Small\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ moto-edge-x30[gpu] | 4.928 (vs. 4.811, 2.45%↑) | 5.061 | 0.394 |
| MobileNetV2\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 7.168 (vs. 7.345, 2.41%↓) | 7.146 | 0.059 |
| MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] | 3392.504 (vs. 3471.553, 2.28%↓) | 3392.824 | 10.812 |
| MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 21.753 (vs. 22.248, 2.22%↓) | 21.719 | 0.199 |
| DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 42.024 (vs. 42.975, 2.21%↓) | 42.133 | 0.461 |
| MobileNetV2\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 23.751 (vs. 24.273, 2.15%↓) | 23.743 | 0.086 |
| MiniLML12H384Uncased(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 85.192 (vs. 87.051, 2.13%↓) | 85.159 | 0.282 |
| matmul\_128x256x8192\_f16t\_tile\_config\_default(linalg) [cuda-sm\_80-linux\_gnu-cuda][ukernel,matmul,splitk] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] | 0.025 (vs. 0.024, 2.12%↑) | 0.025 | 0.000 |
| EfficientNetB7PT(linalg) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 622.387 (vs. 609.574, 2.10%↑) | 620.809 | 16.601 |
| PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] | 78.117 (vs. 76.515, 2.09%↑) | 78.222 | 0.746 |
| PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] | 33.527 (vs. 34.241, 2.08%↓) | 33.582 | 0.483 |
| MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] | 545.189 (vs. 556.718, 2.07%↓) | 548.353 | 15.906 |
| PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] | 401.875 (vs. 393.794, 2.05%↑) | 401.133 | 2.696 |
| MobileBertSquad\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 108.256 (vs. 106.090, 2.04%↑) | 105.454 | 4.976 |
| MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] | 21.430 (vs. 21.873, 2.02%↓) | 21.431 | 0.233 |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] | 6.582 (vs. 6.717, 2.01%↓) | 6.586 | 0.031 |
| MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 24.697 (vs. 25.189, 1.95%↓) | 24.751 | 0.399 |
| EfficientNetV2STF(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 54.798 (vs. 55.879, 1.94%↓) | 54.011 | 1.739 |
| MobileBertSquad\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ moto-edge-x30[gpu] | 279.481 (vs. 284.948, 1.92%↓) | 288.664 | 16.012 |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] | 79.889 (vs. 78.430, 1.86%↑) | 80.001 | 0.333 |
| PersonDetect\_int8(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,max-concurrency] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 1.792 (vs. 1.760, 1.83%↑) | 1.795 | 0.023 |
| BertLargeTF(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 1623.547 (vs. 1653.636, 1.82%↓) | 1624.712 | 21.620 |
| EfficientNet\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 34.245 (vs. 34.858, 1.76%↓) | 34.236 | 0.146 |
| PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] | 460.446 (vs. 452.670, 1.72%↑) | 456.025 | 14.405 |
| MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d,dotprod] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] | 251.745 (vs. 256.114, 1.71%↓) | 251.801 | 0.111 |
| MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] | 42.088 (vs. 42.780, 1.62%↓) | 42.193 | 0.648 |
| MobileBertSquad\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding] vulkan(none)[full-inference,default-flags] with zeros @ moto-edge-x30[gpu] | 278.985 (vs. 283.497, 1.59%↓) | 289.032 | 15.931 |
| EfficientNet\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 7.449 (vs. 7.333, 1.58%↑) | 7.344 | 0.202 |
| MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 503.319 (vs. 511.394, 1.58%↓) | 502.577 | 1.392 |
| MobileNetV3Small\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding] vulkan(none)[full-inference,default-flags] with zeros @ moto-edge-x30[gpu] | 4.716 (vs. 4.643, 1.58%↑) | 5.022 | 0.539 |
| MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] | 18.993 (vs. 19.296, 1.57%↓) | 19.072 | 0.329 |
| PersonDetect\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 0.786 (vs. 0.798, 1.57%↓) | 0.786 | 0.002 |
| MobileBertSquad\_int8(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,max-concurrency,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ pixel-6-pro[gpu] | 86.955 (vs. 88.314, 1.54%↓) | 86.770 | 0.601 |
| EfficientNet\_int8(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,max-concurrency,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ pixel-6-pro[gpu] | 10.764 (vs. 10.603, 1.51%↑) | 10.753 | 0.051 |
| MobileNetV2\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 3.555 (vs. 3.502, 1.50%↑) | 3.547 | 0.154 |
| MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] | 34.669 (vs. 35.175, 1.44%↓) | 34.557 | 0.265 |
| PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] | 1560.935 (vs. 1540.722, 1.31%↑) | 1556.715 | 13.164 |
| MobileBertSquad\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 198.495 (vs. 201.054, 1.27%↓) | 198.522 | 0.779 |
| PoseNet\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 6.142 (vs. 6.071, 1.18%↑) | 6.111 | 0.098 |
| DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] | 76.221 (vs. 77.126, 1.17%↓) | 76.233 | 0.068 |
| PoseNet\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 23.291 (vs. 23.559, 1.14%↓) | 23.226 | 0.165 |
| MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] | 199.744 (vs. 197.513, 1.13%↑) | 199.869 | 0.929 |
| PersonDetect\_int8(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 2.609 (vs. 2.581, 1.09%↑) | 2.618 | 0.045 |
| matmul\_128x256x8192\_f32t\_tile\_config\_default(linalg) [cuda-sm\_80-linux\_gnu-cuda][ukernel,matmul,splitk] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] | 0.045 (vs. 0.044, 1.08%↑) | 0.045 | 0.000 |
| MobileSSD\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 34.933 (vs. 35.314, 1.08%↓) | 34.956 | 0.246 |
| MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] | 3952.313 (vs. 3994.654, 1.06%↓) | 3951.777 | 4.277 |
| MobileBertSquad\_fp16(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 48.101 (vs. 48.614, 1.06%↓) | 47.989 | 0.361 |
| PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] | 1524.579 (vs. 1509.531, 1.00%↑) | 1524.503 | 1.938 |
| MobileNetV2\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 7.302 (vs. 7.375, 0.99%↓) | 7.171 | 0.247 |
| MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] | 4077.373 (vs. 4118.018, 0.99%↓) | 4077.034 | 3.116 |
| MobileNetV2\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,max-concurrency,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ pixel-6-pro[gpu] | 8.918 (vs. 8.831, 0.98%↑) | 8.927 | 0.126 |
| MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] | 451.552 (vs. 447.184, 0.98%↑) | 451.839 | 5.817 |
| DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] | 103.639 (vs. 104.602, 0.92%↓) | 104.231 | 1.456 |
| MobileSSD\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ moto-edge-x30[gpu] | 24.385 (vs. 24.612, 0.92%↓) | 24.875 | 0.794 |
| MobileNetV3Small\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,max-concurrency,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ pixel-6-pro[gpu] | 4.551 (vs. 4.510, 0.91%↑) | 4.553 | 0.027 |
| MobileBertSquad\_fp16(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,max-concurrency,repeated-kernel,demote-f32-to-f16] vulkan(none)[full-inference,experimental-flags] with zeros @ pixel-6-pro[gpu] | 81.294 (vs. 80.566, 0.90%↑) | 81.207 | 1.006 |
| MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] | 199.142 (vs. 197.365, 0.90%↑) | 199.351 | 1.069 |
| EfficientNetV2SPT(linalg) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 43.284 (vs. 43.675, 0.90%↓) | 43.112 | 0.596 |
| EfficientNetV2STF(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 265.097 (vs. 267.461, 0.88%↓) | 264.908 | 1.127 |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] | 10.900 (vs. 10.809, 0.84%↑) | 10.897 | 0.017 |
| DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] | 352.774 (vs. 355.666, 0.81%↓) | 352.485 | 1.518 |
| MobileNetV2\_fp32(tflite) [vmvx-generic-vmvx-vmvx][experimental-flags] local\_task(vmvx\_module)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] | 5462.454 (vs. 5418.966, 0.80%↑) | 5500.550 | 130.655 |
| MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] | 351.210 (vs. 354.046, 0.80%↓) | 330.757 | 70.210 |
| PoseNet\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ moto-edge-x30[gpu] | 31.724 (vs. 31.475, 0.79%↑) | 32.699 | 1.501 |
| MobileNetV3Small\_fp32(tflite) [vmvx-generic-vmvx-vmvx][experimental-flags] local\_task(vmvx\_module)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] | 1049.374 (vs. 1057.432, 0.76%↓) | 1049.040 | 20.619 |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] | 5.482 (vs. 5.524, 0.75%↓) | 5.481 | 0.018 |
| MobileSSD\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] | 9.502 (vs. 9.573, 0.74%↓) | 9.494 | 0.075 |
| DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] | 392.001 (vs. 389.195, 0.72%↑) | 392.068 | 0.584 |
| BertLargefp16PTBatch1(linalg) [cuda-sm\_80-linux\_gnu-cuda][default-flags] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] | 9.660 (vs. 9.729, 0.71%↓) | 9.655 | 0.010 |
| MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 59.133 (vs. 59.551, 0.70%↓) | 59.466 | 0.802 |
| MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] | 517.637 (vs. 521.248, 0.69%↓) | 501.341 | 48.126 |
| MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] | 15.245 (vs. 15.140, 0.69%↑) | 15.200 | 0.370 |
| PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] | 1329.944 (vs. 1320.929, 0.68%↑) | 1328.677 | 3.941 |
| DeepLabV3\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ moto-edge-x30[gpu] | 41.948 (vs. 42.235, 0.68%↓) | 43.055 | 1.691 |
| DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] | 407.216 (vs. 404.482, 0.68%↑) | 406.904 | 2.243 |
| MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d,dotprod] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] | 1657.426 (vs. 1668.700, 0.68%↓) | 1659.082 | 6.737 |
| MobileSSD\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding] vulkan(none)[full-inference,default-flags] with zeros @ moto-edge-x30[gpu] | 21.329 (vs. 21.471, 0.66%↓) | 21.764 | 0.726 |
| EfficientNetV2Sfp16PT(linalg) [cuda-sm\_80-linux\_gnu-cuda][default-flags] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] | 11.610 (vs. 11.535, 0.65%↑) | 11.589 | 0.067 |
| MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] | 250.948 (vs. 252.580, 0.65%↓) | 250.517 | 3.104 |
| MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] | 68.511 (vs. 68.945, 0.63%↓) | 68.435 | 0.404 |
| MobileSSD\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ pixel-6-pro[gpu] | 35.238 (vs. 35.020, 0.62%↑) | 35.171 | 0.294 |
| PoseNet\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding] vulkan(none)[full-inference,default-flags] with zeros @ moto-edge-x30[gpu] | 28.784 (vs. 28.961, 0.61%↓) | 29.761 | 1.601 |
| MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] | 2791.665 (vs. 2774.679, 0.61%↑) | 2793.331 | 8.911 |
matmul\_2564x2564x2564\_f32t\_f32t\_f32t\_tile\_config\_default(linalg) [cuda-sm\_80-linux\_gnu-cuda][ukernel,matmul] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 0.896 (vs. 0.902, 0.60%↓) 0.896 0.000
MobileBertSquad\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,max-concurrency,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ pixel-6-pro[gpu] 126.229 (vs. 125.508, 0.57%↑) 126.535 0.771
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] 2887.469 (vs. 2904.099, 0.57%↓) 2887.340 2.848
MobileBertSquad\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] 105.551 (vs. 106.155, 0.57%↓) 105.367 0.638
MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] 60.123 (vs. 60.456, 0.55%↓) 60.115 0.286
MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] 103.054 (vs. 102.491, 0.55%↑) 103.094 0.187
PoseNet\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] 5.614 (vs. 5.644, 0.54%↓) 5.613 0.014
matmul\_3456x1024x2048\_f16t\_tile\_config\_default(linalg) [cuda-sm\_80-linux\_gnu-cuda][ukernel,matmul] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 0.060 (vs. 0.059, 0.53%↑) 0.060 0.000
MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 13.670 (vs. 13.742, 0.52%↓) 13.648 0.055
MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] 588.028 (vs. 585.024, 0.51%↑) 588.367 3.815
MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 86.629 (vs. 86.192, 0.51%↑) 86.629 0.086
MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] 182.963 (vs. 183.855, 0.49%↓) 184.448 4.687
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] 2800.992 (vs. 2814.617, 0.48%↓) 2799.040 6.482
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] 846.158 (vs. 850.225, 0.48%↓) 846.175 1.836
MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] 1311.539 (vs. 1317.731, 0.47%↓) 1310.745 4.598
MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 257.187 (vs. 256.017, 0.46%↑) 258.965 7.033
MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] 161.181 (vs. 160.464, 0.45%↑) 160.944 0.543
MiniLML12H384Uncased(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] 20.817 (vs. 20.909, 0.44%↓) 20.529 0.589
MobileSSD\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,max-concurrency,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ pixel-6-pro[gpu] 30.060 (vs. 30.189, 0.42%↓) 30.102 0.148
MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] 202.471 (vs. 201.655, 0.40%↑) 202.200 1.207
MobileNetV3Small\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ moto-edge-x30[gpu] 3.147 (vs. 3.160, 0.40%↓) 3.160 0.064
MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] 3973.502 (vs. 3989.363, 0.40%↓) 3972.848 6.428
DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 62.201 (vs. 62.447, 0.39%↓) 62.203 0.032
MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 277.593 (vs. 278.672, 0.39%↓) 274.794 4.819
MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] 93.251 (vs. 92.891, 0.39%↑) 93.443 0.515
MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] 11.818 (vs. 11.863, 0.38%↓) 11.813 0.027
MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] 197.221 (vs. 196.494, 0.37%↑) 197.114 0.440
MobileSSD\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ moto-edge-x30[gpu] 19.075 (vs. 19.145, 0.37%↓) 19.182 0.289
MiniLML12H384Uncased(stablehlo) [cuda-sm\_80-linux\_gnu-cuda][default-flags] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 1.477 (vs. 1.471, 0.36%↑) 1.476 0.002
MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] 77.131 (vs. 77.400, 0.35%↓) 77.890 2.133
BertLargeTF(stablehlo) [cuda-sm\_80-linux\_gnu-cuda][default-flags] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 10.508 (vs. 10.473, 0.34%↑) 10.507 0.005
DeepLabV3\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding] vulkan(none)[full-inference,default-flags] with zeros @ moto-edge-x30[gpu] 41.899 (vs. 42.033, 0.32%↓) 41.892 0.247
PoseNet\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ moto-edge-x30[gpu] 26.026 (vs. 26.108, 0.31%↓) 26.115 0.167
BertForMaskedLMTF(stablehlo) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] 410.980 (vs. 409.700, 0.31%↑) 409.493 5.002
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d,dotprod] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 159.119 (vs. 158.629, 0.31%↑) 159.128 0.164
PersonDetect\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ c2-standard-16[cpu] 0.710 (vs. 0.712, 0.30%↓) 0.710 0.001
DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] 81.661 (vs. 81.423, 0.29%↑) 81.685 0.100
DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 70.431 (vs. 70.635, 0.29%↓) 70.404 0.078
DeepLabV3\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] 7.396 (vs. 7.375, 0.29%↑) 7.328 0.140
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] 2887.514 (vs. 2895.853, 0.29%↓) 2888.529 3.630
MobileBertSquad\_fp32(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[8-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] 50.889 (vs. 51.034, 0.29%↓) 50.529 0.921
DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] 422.313 (vs. 421.159, 0.27%↑) 422.408 1.611
DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] 114.416 (vs. 114.111, 0.27%↑) 114.242 0.670
MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] 555.158 (vs. 553.680, 0.27%↑) 558.224 7.923
MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] 4130.804 (vs. 4141.770, 0.26%↓) 4126.269 13.298
MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] 1304.568 (vs. 1307.909, 0.26%↓) 1308.027 11.818
DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] 406.049 (vs. 405.076, 0.24%↑) 405.805 0.763
EfficientNetB7PT(linalg) [cuda-sm\_80-linux\_gnu-cuda][default-flags] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 79.660 (vs. 79.470, 0.24%↑) 79.656 0.014
MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] 3286.243 (vs. 3294.067, 0.24%↓) 3286.968 2.655
MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] 23.149 (vs. 23.095, 0.23%↑) 23.192 0.199
MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] 677.657 (vs. 676.136, 0.22%↑) 677.468 0.644
PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] 181.690 (vs. 181.289, 0.22%↑) 181.830 0.420
MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] 77.599 (vs. 77.430, 0.22%↑) 77.600 0.147
MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 31.781 (vs. 31.850, 0.22%↓) 31.785 0.053
ClipTextSeqLen64PT(linalg) [cuda-sm\_80-linux\_gnu-cuda][default-flags] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 8.643 (vs. 8.661, 0.21%↓) 8.621 0.070
MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 31.282 (vs. 31.217, 0.21%↑) 31.424 0.755
matmul\_2562x2561x2561\_f32t\_f32t\_f32t\_tile\_config\_default(linalg) [cuda-sm\_80-linux\_gnu-cuda][ukernel,matmul] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 1.312 (vs. 1.309, 0.21%↑) 1.309 0.007
MobileNetV2\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding] vulkan(none)[full-inference,default-flags] with zeros @ moto-edge-x30[gpu] 10.319 (vs. 10.340, 0.21%↓) 10.713 0.912
MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] 411.271 (vs. 412.089, 0.20%↓) 411.333 1.158
EfficientNetV2STF(stablehlo) [cuda-sm\_80-linux\_gnu-cuda][default-flags] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 7.505 (vs. 7.520, 0.19%↓) 7.488 0.053
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[big-core] 632.607 (vs. 633.790, 0.19%↓) 632.451 0.802
MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] 162.009 (vs. 162.311, 0.19%↓) 162.035 0.810
PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 22.187 (vs. 22.149, 0.18%↑) 22.413 0.753
MobileNetV2\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][default-flags] vulkan(none)[full-inference,default-flags] with zeros @ moto-edge-x30[gpu] 11.320 (vs. 11.301, 0.17%↑) 12.055 1.156
matmul\_3456x1024x2048\_f32t\_tile\_config\_default(linalg) [cuda-sm\_80-linux\_gnu-cuda][ukernel,matmul] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 0.126 (vs. 0.126, 0.17%↑) 0.126 0.000
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] 874.279 (vs. 875.769, 0.17%↓) 872.610 5.923
EfficientNetV2SPT(linalg) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] 205.955 (vs. 206.305, 0.17%↓) 205.591 1.200
MobileBertSquad\_fp16(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] 193.677 (vs. 193.381, 0.15%↑) 193.984 0.950
MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] 203.974 (vs. 204.286, 0.15%↓) 204.018 0.892
MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] 13.334 (vs. 13.314, 0.15%↑) 13.333 0.006
MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] 590.014 (vs. 589.141, 0.15%↑) 588.791 3.470
DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 27.913 (vs. 27.954, 0.15%↓) 27.905 0.689
PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] 1324.021 (vs. 1325.975, 0.15%↓) 1324.474 2.054
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] 632.918 (vs. 633.842, 0.15%↓) 632.644 0.966
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d,dotprod] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 414.045 (vs. 414.640, 0.14%↓) 413.982 0.271
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d,dotprod] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] 138.840 (vs. 139.027, 0.13%↓) 139.367 4.038
matmul\_2562x2564x2562\_f32t\_f32t\_f32t\_tile\_config\_default(linalg) [cuda-sm\_80-linux\_gnu-cuda][ukernel,matmul] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 1.076 (vs. 1.074, 0.13%↑) 1.074 0.005
MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 32.761 (vs. 32.721, 0.12%↑) 32.755 0.021
MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] 24.882 (vs. 24.911, 0.12%↓) 24.889 0.242
PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 193.495 (vs. 193.722, 0.12%↓) 193.427 0.162
matmul\_2560x2560x2560\_f16t\_tile\_config\_default(linalg) [cuda-sm\_80-linux\_gnu-cuda][ukernel,matmul] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 0.137 (vs. 0.137, 0.12%↓) 0.137 0.000
matmul\_123x2561x2561\_f32t\_f32t\_f32t\_tile\_config\_default(linalg) [cuda-sm\_80-linux\_gnu-cuda][ukernel,matmul] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 0.188 (vs. 0.188, 0.11%↑) 0.188 0.000
MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] 879.451 (vs. 880.404, 0.11%↓) 879.458 0.771
PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[big-core] 182.255 (vs. 182.086, 0.09%↑) 182.376 0.377
DeepLabV3\_fp32(tflite) [arm-valhall-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,max-concurrency,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ pixel-6-pro[gpu] 11.768 (vs. 11.757, 0.09%↑) 11.752 0.112
MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] 37.446 (vs. 37.412, 0.09%↑) 37.442 0.039
MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 667.652 (vs. 668.247, 0.09%↓) 667.604 0.648
MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 11.717 (vs. 11.727, 0.09%↓) 11.714 0.005
MobileNetV3Small\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 7.169 (vs. 7.164, 0.08%↑) 7.162 0.020
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d,dotprod] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] 1673.362 (vs. 1672.133, 0.07%↑) 1672.920 3.992
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d,dotprod] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] 482.297 (vs. 482.647, 0.07%↓) 482.348 0.497
Unet2dPT(linalg) [cuda-sm\_80-linux\_gnu-cuda][default-flags] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 316.293 (vs. 316.081, 0.07%↑) 316.250 0.318
PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[little-core] 349.896 (vs. 350.129, 0.07%↓) 349.865 0.681
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 1089.907 (vs. 1089.210, 0.06%↑) 1089.739 0.889
MobileBertSquad\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 584.228 (vs. 583.879, 0.06%↑) 584.181 0.478
PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] 64.871 (vs. 64.832, 0.06%↑) 64.851 0.063
MobileBertSquad\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] 505.383 (vs. 505.683, 0.06%↓) 505.537 1.287
matmul\_2560x2560x2560\_f32t\_tile\_config\_default(linalg) [cuda-sm\_80-linux\_gnu-cuda][ukernel,matmul] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 0.277 (vs. 0.277, 0.05%↑) 0.277 0.000
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] 1274.921 (vs. 1275.583, 0.05%↓) 1275.118 1.393
MobileBertSquad\_int8(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 363.652 (vs. 363.464, 0.05%↑) 363.747 0.288
DeepLabV3\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-6-pro[little-core] 352.986 (vs. 352.834, 0.04%↑) 352.436 2.342
BertForMaskedLMTF(stablehlo) [cuda-sm\_80-linux\_gnu-cuda][default-flags] cuda(none)[full-inference,default-flags] with zeros @ a2-highgpu-1g[gpu] 6.721 (vs. 6.718, 0.04%↑) 6.696 0.075
MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] 37.526 (vs. 37.512, 0.04%↑) 37.510 0.060
PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 56.263 (vs. 56.247, 0.03%↑) 56.194 0.126
PoseNet\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-6-pro[little-core] 347.698 (vs. 347.624, 0.02%↑) 347.754 1.119
MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_sync(embedded\_elf)[full-inference,default-flags] with zeros @ pixel-4[big-core] 82.670 (vs. 82.656, 0.02%↑) 82.636 0.099
MobileSSD\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][experimental-flags,mmt4d] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with zeros @ pixel-4[big-core] 72.585 (vs. 72.575, 0.01%↑) 72.525 0.164
PersonDetect\_int8(tflite) [x86\_64-cascadelake-linux\_gnu-llvm\_cpu][experimental-flags,fuse-padding] local\_task(embedded\_elf)[1-thread,full-inference,default-flags] with zeros @ c2-standard-16[cpu] 0.782 (vs. 0.782, 0.01%↓) 0.778 0.007
MobileNetV2\_fp32(tflite) [qualcomm-adreno-vulkan\_android31-vulkan\_spirv][experimental-flags,fuse-padding,repeated-kernel] vulkan(none)[full-inference,experimental-flags] with zeros @ moto-edge-x30[gpu] 7.617 (vs. 7.618, 0.01%↓) 7.648 0.101
MobileNetV2\_fp32(tflite) [armv8.2-a-generic-linux\_android29-llvm\_cpu][default-flags] local\_task(embedded\_elf)[4-thread,full-inference,system-scheduling] with zeros @ pixel-4[little-core] 60.302 (vs. 60.302, 0.00%↓) 60.293 0.202
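
Each latency cell above has the form `current (vs. baseline, delta%↑/↓)`. The report does not state how the delta is derived. The sketch below assumes it is the relative change against the baseline, `(current - baseline) / baseline`, which agrees with the rows shown here. `parse_cell` and `relative_change_pct` are hypothetical helper names, and the cell layout in the regular expression is inferred from the rows, not from a documented format.

```python
import re

# Assumed cell layout: "current (vs. baseline, pct%↑)" or "...↓)".
# The arrow is optional because unchanged metrics are printed as "0.00%".
CELL_RE = re.compile(
    r"(?P<current>\d+(?:\.\d+)?) \(vs\. (?P<baseline>\d+(?:\.\d+)?), "
    r"(?P<pct>\d+(?:\.\d+)?)%(?P<arrow>[↑↓]?)\)"
)


def parse_cell(cell):
    """Return (current, baseline, signed reported delta in percent)."""
    match = CELL_RE.fullmatch(cell.strip())
    if match is None:
        raise ValueError(f"unrecognized cell: {cell!r}")
    # A down arrow means the metric decreased, so the delta is negative.
    sign = -1.0 if match["arrow"] == "↓" else 1.0
    return (
        float(match["current"]),
        float(match["baseline"]),
        sign * float(match["pct"]),
    )


def relative_change_pct(current, baseline):
    """Relative change of current against baseline, in percent (assumed formula)."""
    return (current - baseline) / baseline * 100.0


# Values taken from the MobileSSD_fp32 row on c2-standard-16 above.
current, baseline, reported = parse_cell("9.502 (vs. 9.573, 0.74%↓)")
recomputed = relative_change_pct(current, baseline)
print(f"reported {reported:+.2f}%, recomputed {recomputed:+.2f}%")
```

The printed latencies are rounded to three decimals, so a recomputed delta can differ from the reported one in the last digit. For example, 5.482 vs. 5.524 recomputes to about 0.76% against a reported 0.75%.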

All Compilation Metrics

Benchmark Name Compilation Time (ms) Total Dispatch Size (bytes) Total Artifact Size (bytes) Stream IR Dispatch Count (# of cmd.dispatch ops)
PersonDetect_int8(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 33280 (vs. 32931, 1.06%↑) 155016 (vs. 155016, 0.00%) 414853 (vs. 414853, 0.00%) 87 (vs. 87, 0.00%)
MobileNetV3Small_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 38402 (vs. 39384, 2.49%↓) 220888 (vs. 220888, 0.00%) 10442245 (vs. 10442245, 0.00%) 98 (vs. 98, 0.00%)
DeepLabV3_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 33291 (vs. 28917, 15.13%↑) 144024 (vs. 144024, 0.00%) 2925061 (vs. 2925061, 0.00%) 78 (vs. 78, 0.00%)
EfficientNet_int8(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 64509 (vs. 53445, 20.70%↑) 513848 (vs. 513848, 0.00%) 5548613 (vs. 5548613, 0.00%) 137 (vs. 137, 0.00%)
MobileNetV1_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 18562 (vs. 18291, 1.48%↑) 89176 (vs. 89176, 0.00%) 17005573 (vs. 17005573, 0.00%) 57 (vs. 57, 0.00%)
MobileNetV2_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 22401 (vs. 22594, 0.85%↓) 145016 (vs. 145016, 0.00%) 14128261 (vs. 14128261, 0.00%) 73 (vs. 73, 0.00%)
MobileNetV2_int8(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 56225 (vs. 53059, 5.97%↑) 494152 (vs. 494152, 0.00%) 4146949 (vs. 4146949, 0.00%) 196 (vs. 196, 0.00%)
MobileSSD_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 34641 (vs. 34430, 0.61%↑) 248968 (vs. 248968, 0.00%) 18190917 (vs. 18190917, 0.00%) 121 (vs. 121, 0.00%)
PoseNet_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 15045 (vs. 14249, 5.59%↑) 85176 (vs. 85176, 0.00%) 5138181 (vs. 5138181, 0.00%) 48 (vs. 48, 0.00%)
MobileBertSquad_fp16(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 47896 (vs. 45118, 6.16%↑) 81704 (vs. 81704, 0.00%) 99926853 (vs. 99926853, 0.00%) 728 (vs. 728, 0.00%)
MobileBertSquad_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 58086 (vs. 56171, 3.41%↑) 97480 (vs. 97480, 0.00%) 98496901 (vs. 98496901, 0.00%) 728 (vs. 728, 0.00%)
MobileBertSquad_int8(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 261381 (vs. 246291, 6.13%↑) 6367608 (vs. 6367608, 0.00%) 31592261 (vs. 31592261, 0.00%) 1452 (vs. 1452, 0.00%)
EfficientNetV2STF(stablehlo) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 107794 (vs. 98738, 9.17%↑) 230488 (vs. 230488, 0.00%) 164118609 (vs. 164118609, 0.00%) 275 (vs. 275, 0.00%)
MiniLML12H384Uncased(stablehlo) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 35641 (vs. 31402, 13.50%↑) 75752 (vs. 75752, 0.00%) 133774740 (vs. 133774740, 0.00%) 211 (vs. 211, 0.00%)
EfficientNetV2SPT(linalg) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 99869 (vs. 93019, 7.36%↑) 351336 (vs. 351336, 0.00%) 86863237 (vs. 86863237, 0.00%) 303 (vs. 303, 0.00%)
BertForMaskedLMTF(stablehlo) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 40584 (vs. 39536, 2.65%↑) 65560 (vs. 65560, 0.00%) 438464657 (vs. 438464657, 0.00%) 213 (vs. 213, 0.00%)
BertLargeTF(stablehlo) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 41525 (vs. 36291, 14.42%↑) 63656 (vs. 63656, 0.00%) 1336037707 (vs. 1336037707, 0.00%) 413 (vs. 413, 0.00%)
EfficientNetB7PT(linalg) [x86_64-cascadelake-linux_gnu-llvm_cpu][default-flags,compile-stats] 130972 (vs. 117529, 11.44%↑) 552120 (vs. 552120, 0.00%) 267290949 (vs. 267290949, 0.00%) 496 (vs. 496, 0.00%)
PersonDetect_int8(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 30493 (vs. 28711, 6.21%↑) 160440 (vs. 160440, 0.00%) 417413 (vs. 417413, 0.00%) 74 (vs. 74, 0.00%)
MobileNetV3Small_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 28647 (vs. 40609, 29.46%↓) 220584 (vs. 220584, 0.00%) 10440325 (vs. 10440325, 0.00%) 94 (vs. 94, 0.00%)
DeepLabV3_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 33559 (vs. 28392, 18.20%↑) 149656 (vs. 149656, 0.00%) 2925765 (vs. 2925765, 0.00%) 60 (vs. 60, 0.00%)
EfficientNet_int8(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 65828 (vs. 52485, 25.42%↑) 528984 (vs. 528984, 0.00%) 5559749 (vs. 5559749, 0.00%) 120 (vs. 120, 0.00%)
MobileNetV2_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 26052 (vs. 26428, 1.42%↓) 156104 (vs. 156104, 0.00%) 14134661 (vs. 14134661, 0.00%) 55 (vs. 55, 0.00%)
MobileNetV2_int8(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 61453 (vs. 59886, 2.62%↑) 494152 (vs. 494152, 0.00%) 4146949 (vs. 4146949, 0.00%) 196 (vs. 196, 0.00%)
MobileSSD_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 36220 (vs. 33380, 8.51%↑) 260952 (vs. 260952, 0.00%) 18197701 (vs. 18197701, 0.00%) 99 (vs. 99, 0.00%)
PoseNet_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 12718 (vs. 11966, 6.28%↑) 88200 (vs. 88200, 0.00%) 5136773 (vs. 5136773, 0.00%) 34 (vs. 34, 0.00%)
MobileBertSquad_fp32(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 40819 (vs. 40440, 0.94%↑) 97480 (vs. 97480, 0.00%) 98496901 (vs. 98496901, 0.00%) 728 (vs. 728, 0.00%)
MobileBertSquad_int8(tflite) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 260139 (vs. 240078, 8.36%↑) 6367608 (vs. 6367608, 0.00%) 31592261 (vs. 31592261, 0.00%) 1452 (vs. 1452, 0.00%)
EfficientNetV2STF(stablehlo) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 105923 (vs. 94616, 11.95%↑) 234264 (vs. 234264, 0.00%) 164118161 (vs. 164118161, 0.00%) 264 (vs. 264, 0.00%)
MiniLML12H384Uncased(stablehlo) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 34804 (vs. 32225, 8.00%↑) 75752 (vs. 75752, 0.00%) 133774740 (vs. 133774740, 0.00%) 211 (vs. 211, 0.00%)
BertLargeTF(stablehlo) [x86_64-cascadelake-linux_gnu-llvm_cpu][experimental-flags,fuse-padding,compile-stats] 28268 (vs. 26407, 7.05%↑) 63656 (vs. 63656, 0.00%) 1336037707 (vs. 1336037707, 0.00%) 413 (vs. 413, 0.00%)
EfficientNetV2STF(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 109141 (vs. 100782, 8.29%↑) 941532 (vs. 941532, 0.00%) 164905189 (vs. 164905189, 0.00%) 275 (vs. 275, 0.00%)
MiniLML12H384Uncased(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 31927 (vs. 31385, 1.73%↑) 158160 (vs. 158160, 0.00%) 133875451 (vs. 133875451, 0.00%) 211 (vs. 211, 0.00%)
BertForMaskedLMTF(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 40185 (vs. 33822, 18.81%↑) 203130 (vs. 203130, 0.00%) 438619857 (vs. 438619857, 0.00%) 213 (vs. 213, 0.00%)
BertLargeTF(stablehlo) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 34067 (vs. 35161, 3.11%↓) 138154 (vs. 138154, 0.00%) 1336117931 (vs. 1336117931, 0.00%) 413 (vs. 413, 0.00%)
ClipTextSeqLen64PT(linalg) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 85559 (vs. 77414, 10.52%↑) 87318 (vs. 87318, 0.00%) 492376883 (vs. 492376883, 0.00%) 197 (vs. 197, 0.00%)
Unet2dPT(linalg) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 215113 (vs. 197224, 9.07%↑) 1738374 (vs. 1738374, 0.00%) 3440111738 (vs. 3440111738, 0.00%) 1045 (vs. 1045, 0.00%)
EfficientNetB7PT(linalg) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 96049 (vs. 85952, 11.75%↑) 2520844 (vs. 2520844, 0.00%) 269319505 (vs. 269319505, 0.00%) 496 (vs. 496, 0.00%)
BertLargefp16PTBatch1(linalg) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 199157 (vs. 170025, 17.13%↑) 115018 (vs. 115018, 0.00%) 668415468 (vs. 668415468, 0.00%) 582 (vs. 582, 0.00%)
EfficientNetV2Sfp16PT(linalg) [cuda-sm_80-linux_gnu-cuda][default-flags,compile-stats] 56931 (vs. 55086, 3.35%↑) 1323808 (vs. 1323808, 0.00%) 44638398 (vs. 44638398, 0.00%) 304 (vs. 304, 0.00%)
matmul_3456x1024x2048_f16t_tile_config_default(linalg) [cuda-sm_80-linux_gnu-cuda][ukernel,matmul,compile-stats] 1180 (vs. 1102, 7.08%↑) 29518 (vs. 29518, 0.00%) 39921 (vs. 39921, 0.00%) 1 (vs. 1, 0.00%)
matmul_3456x1024x2048_f32t_tile_config_default(linalg) [cuda-sm_80-linux_gnu-cuda][ukernel,matmul,compile-stats] 1655 (vs. 1682, 1.61%↓) 42254 (vs. 42254, 0.00%) 52657 (vs. 52657, 0.00%) 1 (vs. 1, 0.00%)
matmul_2560x2560x2560_f16t_tile_config_default(linalg) [cuda-sm_80-linux_gnu-cuda][ukernel,matmul,compile-stats] 1052 (vs. 1170, 10.09%↓) 28582 (vs. 28582, 0.00%) 38921 (vs. 38921, 0.00%) 1 (vs. 1, 0.00%)
matmul_2560x2560x2560_f32t_tile_config_default(linalg) [cuda-sm_80-linux_gnu-cuda][ukernel,matmul,compile-stats] 1610 (vs. 1445, 11.42%↑) 40978 (vs. 40978, 0.00%) 51317 (vs. 51317, 0.00%) 1 (vs. 1, 0.00%)
matmul_2564x2564x2564_f32t_f32t_f32t_tile_config_default(linalg) [cuda-sm_80-linux_gnu-cuda][ukernel,matmul,compile-stats] 1748 (vs. 1692, 3.31%↑) 83854 (vs. 83854, 0.00%) 94193 (vs. 94193, 0.00%) 1 (vs. 1, 0.00%)
matmul_2562x2564x2562_f32t_f32t_f32t_tile_config_default(linalg) [cuda-sm_80-linux_gnu-cuda][ukernel,matmul,compile-stats] 1958 (vs. 1656, 18.24%↑) 86342 (vs. 86342, 0.00%) 96681 (vs. 96681, 0.00%) 1 (vs. 1, 0.00%)
matmul_2562x2561x2561_f32t_f32t_f32t_tile_config_default(linalg) [cuda-sm_80-linux_gnu-cuda][ukernel,matmul,compile-stats] 2051 (vs. 2014, 1.84%↑) 87734 (vs. 87734, 0.00%) 98073 (vs. 98073, 0.00%) 1 (vs. 1, 0.00%)
matmul_123x2561x2561_f32t_f32t_f32t_tile_config_default(linalg) [cuda-sm_80-linux_gnu-cuda][ukernel,matmul,compile-stats] 1626 (vs. 1475, 10.24%↑) 52466 (vs. 52466, 0.00%) 62804 (vs. 62804, 0.00%) 1 (vs. 1, 0.00%)
matmul_128x256x8192_f16t_tile_config_default(linalg) [cuda-sm_80-linux_gnu-cuda][ukernel,matmul,splitk,compile-stats] 638 (vs. 543, 17.50%↑) 9348 (vs. 9348, 0.00%) 26519 (vs. 26519, 0.00%) 2 (vs. 2, 0.00%)
matmul_128x256x8192_f32t_tile_config_default(linalg) [cuda-sm_80-linux_gnu-cuda][ukernel,matmul,splitk,compile-stats] 565 (vs. 542, 4.24%↑) 10252 (vs. 10252, 0.00%) 27411 (vs. 27411, 0.00%) 2 (vs. 2, 0.00%)
DeepLabV3_fp32(tflite) [riscv_64-generic-linux_gnu-llvm_cpu][default-flags,compile-stats] 24358 (vs. 21558, 12.99%↑) 41608 (vs. 41608, 0.00%) 2822599 (vs. 2822599, 0.00%) 78 (vs. 78, 0.00%)
MobileBertSquad_fp32(tflite) [riscv_64-generic-linux_gnu-llvm_cpu][default-flags,compile-stats] 41436 (vs. 38878, 6.58%↑) 33688 (vs. 33688, 0.00%) 98433159 (vs. 98433159, 0.00%) 728 (vs. 728, 0.00%)
MobileNetV1_fp32(tflite) [riscv_64-generic-linux_gnu-llvm_cpu][default-flags,compile-stats] 16116 (vs. 15666, 2.87%↑) 46960 (vs. 46960, 0.00%) 16963207 (vs. 16963207, 0.00%) 57 (vs. 57, 0.00%)
MobileBertSquad_int8(tflite) [riscv_64-generic-linux_gnu-llvm_cpu][default-flags,compile-stats] 239514 (vs. 218706, 9.51%↑) 2192912 (vs. 2192912, 0.00%) 27417607 (vs. 27417607, 0.00%) 1452 (vs. 1452, 0.00%)
PersonDetect_int8(tflite) [riscv_64-generic-linux_gnu-llvm_cpu][default-flags,compile-stats] 30511 (vs. 25354, 20.34%↑) 73496 (vs. 73496, 0.00%) 333383 (vs. 333383, 0.00%) 87 (vs. 87, 0.00%)
EfficientNet_int8(tflite) [riscv_64-generic-linux_gnu-llvm_cpu][default-flags,compile-stats] 42134 (vs. 37815, 11.42%↑) 144312 (vs. 144312, 0.00%) 5179079 (vs. 5179079, 0.00%) 137 (vs. 137, 0.00%)
MobileNetV2_int8(tflite) [riscv_64-generic-linux_gnu-llvm_cpu][default-flags,compile-stats] 53000 (vs. 48153, 10.07%↑) 171936 (vs. 171936, 0.00%) 3824775 (vs. 3824775, 0.00%) 196 (vs. 196, 0.00%)
EfficientNet_int8(tflite) [riscv_32-generic-linux_gnu-llvm_cpu][default-flags,compile-stats] 53539 (vs. 47219, 13.38%↑) 279336 (vs. 279336, 0.00%) 5314119 (vs. 5314119, 0.00%) 137 (vs. 137, 0.00%)
MobileBertSquad_int8(tflite) [riscv_32-generic-linux_gnu-llvm_cpu][default-flags,compile-stats] 275352 (vs. 251840, 9.34%↑) 2342048 (vs. 2342048, 0.00%) 27566791 (vs. 27566791, 0.00%) 1452 (vs. 1452, 0.00%)
PersonDetect_int8(tflite) [riscv_32-generic-linux_gnu-llvm_cpu][default-flags,compile-stats] 32661 (vs. 28696, 13.82%↑) 174264 (vs. 174264, 0.00%) 434119 (vs. 434119, 0.00%) 87 (vs. 87, 0.00%)
MobileNetV2_int8(tflite) [riscv_32-generic-linux_gnu-llvm_cpu][default-flags,compile-stats] 61570 (vs. 54934, 12.08%↑) 246164 (vs. 246164, 0.00%) 3899015 (vs. 3899015, 0.00%) 196 (vs. 196, 0.00%)
DeepLabV3_fp32(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][default-flags,compile-stats] 44903 (vs. 44135, 1.74%↑) 99096 (vs. 99096, 0.00%) 2880133 (vs. 2880133, 0.00%) 78 (vs. 78, 0.00%)
MobileSSD_fp32(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][default-flags,compile-stats] 89371 (vs. 81759, 9.31%↑) 290728 (vs. 290728, 0.00%) 18232709 (vs. 18232709, 0.00%) 121 (vs. 121, 0.00%)
PoseNet_fp32(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][default-flags,compile-stats] 18826 (vs. 18699, 0.68%↑) 34232 (vs. 34232, 0.00%) 5087237 (vs. 5087237, 0.00%) 48 (vs. 48, 0.00%)
MobileBertSquad_fp32(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][default-flags,compile-stats] 67845 (vs. 65279, 3.93%↑) 113160 (vs. 113160, 0.00%) 98512581 (vs. 98512581, 0.00%) 728 (vs. 728, 0.00%)
MobileNetV2_fp32(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][default-flags,compile-stats] 40580 (vs. 39033, 3.96%↑) 106984 (vs. 106984, 0.00%) 14090245 (vs. 14090245, 0.00%) 73 (vs. 73, 0.00%)
MobileNetV3Small_fp32(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][default-flags,compile-stats] 42286 (vs. 42165, 0.29%↑) 90152 (vs. 90152, 0.00%) 10311493 (vs. 10311493, 0.00%) 98 (vs. 98, 0.00%)
MobileBertSquad_int8(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][default-flags,compile-stats] 235482 (vs. 216344, 8.85%↑) 2033800 (vs. 2033800, 0.00%) 27258437 (vs. 27258437, 0.00%) 1452 (vs. 1452, 0.00%)
DeepLabV3_fp32(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][experimental-flags,mmt4d,compile-stats] 43680 (vs. 41877, 4.31%↑) 97120 (vs. 97120, 0.00%) 2894021 (vs. 2894021, 0.00%) 200 (vs. 200, 0.00%)
MobileSSD_fp32(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][experimental-flags,mmt4d,compile-stats] 55426 (vs. 54118, 2.42%↑) 136352 (vs. 136352, 0.00%) 18098501 (vs. 18098501, 0.00%) 282 (vs. 282, 0.00%)
PoseNet_fp32(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][experimental-flags,mmt4d,compile-stats] 21731 (vs. 19340, 12.36%↑) 42560 (vs. 42560, 0.00%) 5102021 (vs. 5102021, 0.00%) 102 (vs. 102, 0.00%)
MobileBertSquad_fp32(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][experimental-flags,mmt4d,compile-stats] 75549 (vs. 68825, 9.77%↑) 28992 (vs. 28992, 0.00%) 98608837 (vs. 98608837, 0.00%) 2197 (vs. 2197, 0.00%)
MobileNetV2_fp32(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][experimental-flags,mmt4d,compile-stats] 38959 (vs. 34911, 11.60%↑) 95632 (vs. 95632, 0.00%) 14092293 (vs. 14092293, 0.00%) 178 (vs. 178, 0.00%)
MobileNetV3Small_fp32(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][experimental-flags,mmt4d,compile-stats] 51950 (vs. 48067, 8.08%↑) 124336 (vs. 124336, 0.00%) 10362949 (vs. 10362949, 0.00%) 237 (vs. 237, 0.00%)
MobileBertSquad_int8(tflite) [armv8.2-a-generic-linux_android29-llvm_cpu][experimental-flags,mmt4d,dotprod,compile-stats] 209714 (vs. 193715, 8.26%↑) 1559264 (vs. 1559264, 0.00%) 26977925 (vs. 26977925, 0.00%) 2940 (vs. 2940, 0.00%)
DeepLabV3_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 21891 (vs. 19486, 12.34%↑) 389512 (vs. 389512, 0.00%) 3189295 (vs. 3189295, 0.00%) 78 (vs. 78, 0.00%)
MobileSSD_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 27047 (vs. 25069, 7.89%↑) 542766 (vs. 542766, 0.00%) 18518683 (vs. 18518683, 0.00%) 121 (vs. 121, 0.00%)
PoseNet_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 8466 (vs. 7592, 11.51%↑) 137220 (vs. 137220, 0.00%) 5201131 (vs. 5201131, 0.00%) 48 (vs. 48, 0.00%)
MobileBertSquad_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 56706 (vs. 48885, 16.00%↑) 230322 (vs. 230322, 0.00%) 98640794 (vs. 98640794, 0.00%) 728 (vs. 728, 0.00%)
MobileNetV2_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 16976 (vs. 14459, 17.41%↑) 330334 (vs. 330334, 0.00%) 14331842 (vs. 14331842, 0.00%) 73 (vs. 73, 0.00%)
MobileNetV3Small_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 24513 (vs. 24372, 0.58%↑) 394772 (vs. 394772, 0.00%) 10648169 (vs. 10648169, 0.00%) 98 (vs. 98, 0.00%)
DeepLabV3_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,compile-stats] 19676 (vs. 19566, 0.56%↑) 411084 (vs. 411084, 0.00%) 3202760 (vs. 3202760, 0.00%) 60 (vs. 60, 0.00%)
MobileSSD_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,compile-stats] 24267 (vs. 22826, 6.31%↑) 602624 (vs. 602624, 0.00%) 18568194 (vs. 18568194, 0.00%) 99 (vs. 99, 0.00%)
PoseNet_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,compile-stats] 8571 (vs. 7375, 16.22%↑) 158020 (vs. 158020, 0.00%) 5215197 (vs. 5215197, 0.00%) 34 (vs. 34, 0.00%)
MobileBertSquad_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,compile-stats] 54052 (vs. 52603, 2.75%↑) 230322 (vs. 230322, 0.00%) 98640794 (vs. 98640794, 0.00%) 728 (vs. 728, 0.00%)
MobileNetV2_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,compile-stats] 20730 (vs. 13814, 50.07%↑) 373308 (vs. 373308, 0.00%) 14365844 (vs. 14365844, 0.00%) 55 (vs. 55, 0.00%)
MobileNetV3Small_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,compile-stats] 22512 (vs. 23658, 4.84%↓) 409830 (vs. 409830, 0.00%) 10659054 (vs. 10659054, 0.00%) 94 (vs. 94, 0.00%)
MobileSSD_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,repeated-kernel,compile-stats] 27723 (vs. 26213, 5.76%↑) 602624 (vs. 602624, 0.00%) 18651394 (vs. 18651394, 0.00%) 99 (vs. 99, 0.00%)
PoseNet_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,repeated-kernel,compile-stats] 12430 (vs. 7736, 60.68%↑) 158020 (vs. 158020, 0.00%) 5243741 (vs. 5243741, 0.00%) 34 (vs. 34, 0.00%)
MobileNetV2_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,repeated-kernel,compile-stats] 18801 (vs. 20335, 7.54%↓) 373308 (vs. 373308, 0.00%) 14412116 (vs. 14412116, 0.00%) 55 (vs. 55, 0.00%)
MobileNetV3Small_fp32(tflite) [qualcomm-adreno-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,repeated-kernel,compile-stats] 24094 (vs. 28199, 14.56%↓) 409830 (vs. 409830, 0.00%) 10738030 (vs. 10738030, 0.00%) 94 (vs. 94, 0.00%)
DeepLabV3_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 17285 (vs. 19149, 9.73%↓) 178636 (vs. 178636, 0.00%) 2978543 (vs. 2978543, 0.00%) 78 (vs. 78, 0.00%)
MobileSSD_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 24259 (vs. 22770, 6.54%↑) 355370 (vs. 355370, 0.00%) 18331483 (vs. 18331483, 0.00%) 121 (vs. 121, 0.00%)
PoseNet_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 13116 (vs. 8435, 55.49%↑) 95724 (vs. 95724, 0.00%) 5159659 (vs. 5159659, 0.00%) 48 (vs. 48, 0.00%)
MobileBertSquad_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 52998 (vs. 52804, 0.37%↑) 131626 (vs. 131626, 0.00%) 98542170 (vs. 98542170, 0.00%) 728 (vs. 728, 0.00%)
MobileNetV2_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 13728 (vs. 14059, 2.35%↓) 195898 (vs. 195898, 0.00%) 14197634 (vs. 14197634, 0.00%) 73 (vs. 73, 0.00%)
MobileNetV3Small_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 23946 (vs. 23044, 3.91%↑) 259164 (vs. 259164, 0.00%) 10512745 (vs. 10512745, 0.00%) 98 (vs. 98, 0.00%)
MobileBertSquad_fp16(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][default-flags,demote-f32-to-f16,compile-stats] 84008 (vs. 78028, 7.66%↑) 2973676 (vs. 2973676, 0.00%) 52944160 (vs. 52944160, 0.00%) 728 (vs. 728, 0.00%)
MobileBertSquad_int8(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 157644 (vs. 142236, 10.83%↑) 7478274 (vs. 7478274, 0.00%) 32942720 (vs. 32942720, 0.00%) 1452 (vs. 1452, 0.00%)
EfficientNet_int8(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 32369 (vs. 30529, 6.03%↑) 522966 (vs. 522966, 0.00%) 5599845 (vs. 5599845, 0.00%) 137 (vs. 137, 0.00%)
PersonDetect_int8(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][default-flags,compile-stats] 24581 (vs. 22227, 10.59%↑) 326240 (vs. 326240, 0.00%) 612595 (vs. 612595, 0.00%) 87 (vs. 87, 0.00%)
DeepLabV3_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,max-concurrency,compile-stats] 16674 (vs. 16080, 3.69%↑) 201784 (vs. 201784, 0.00%) 2993608 (vs. 2993608, 0.00%) 60 (vs. 60, 0.00%)
MobileSSD_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,max-concurrency,compile-stats] 22602 (vs. 23272, 2.88%↓) 377620 (vs. 377620, 0.00%) 18341250 (vs. 18341250, 0.00%) 99 (vs. 99, 0.00%)
PoseNet_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,max-concurrency,compile-stats] 10337 (vs. 10890, 5.08%↓) 117200 (vs. 117200, 0.00%) 5174365 (vs. 5174365, 0.00%) 34 (vs. 34, 0.00%)
MobileBertSquad_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,max-concurrency,compile-stats] 56189 (vs. 51422, 9.27%↑) 128986 (vs. 128986, 0.00%) 98538714 (vs. 98538714, 0.00%) 728 (vs. 728, 0.00%)
MobileNetV2_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,max-concurrency,compile-stats] 15177 (vs. 17625, 13.89%↓) 213040 (vs. 213040, 0.00%) 14205844 (vs. 14205844, 0.00%) 55 (vs. 55, 0.00%)
MobileNetV3Small_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,max-concurrency,compile-stats] 21099 (vs. 24023, 12.17%↓) 282206 (vs. 282206, 0.00%) 10531566 (vs. 10531566, 0.00%) 94 (vs. 94, 0.00%)
MobileBertSquad_fp16(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,max-concurrency,demote-f32-to-f16,compile-stats] 95786 (vs. 90542, 5.79%↑) 2977704 (vs. 2977704, 0.00%) 52960096 (vs. 52960096, 0.00%) 728 (vs. 728, 0.00%)
MobileBertSquad_int8(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,max-concurrency,compile-stats] 182252 (vs. 167044, 9.10%↑) 7478338 (vs. 7478338, 0.00%) 32927168 (vs. 32927168, 0.00%) 1452 (vs. 1452, 0.00%)
EfficientNet_int8(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,max-concurrency,compile-stats] 33239 (vs. 26834, 23.87%↑) 547940 (vs. 547940, 0.00%) 5615699 (vs. 5615699, 0.00%) 120 (vs. 120, 0.00%)
PersonDetect_int8(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,max-concurrency,compile-stats] 21229 (vs. 21118, 0.53%↑) 339890 (vs. 339890, 0.00%) 619998 (vs. 619998, 0.00%) 74 (vs. 74, 0.00%)
DeepLabV3_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,max-concurrency,repeated-kernel,compile-stats] 17791 (vs. 20197, 11.91%↓) 201784 (vs. 201784, 0.00%) 3097736 (vs. 3097736, 0.00%) 60 (vs. 60, 0.00%)
MobileSSD_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,max-concurrency,repeated-kernel,compile-stats] 32690 (vs. 29140, 12.18%↑) 377620 (vs. 377620, 0.00%) 18513090 (vs. 18513090, 0.00%) 99 (vs. 99, 0.00%)
PoseNet_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,max-concurrency,repeated-kernel,compile-stats] 9853 (vs. 8251, 19.42%↑) 117200 (vs. 117200, 0.00%) 5233437 (vs. 5233437, 0.00%) 34 (vs. 34, 0.00%)
MobileBertSquad_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,max-concurrency,repeated-kernel,compile-stats] 90881 (vs. 78934, 15.14%↑) 128986 (vs. 128986, 0.00%) 99802522 (vs. 99802522, 0.00%) 728 (vs. 728, 0.00%)
MobileNetV2_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,max-concurrency,repeated-kernel,compile-stats] 19208 (vs. 19521, 1.60%↓) 213040 (vs. 213040, 0.00%) 14301332 (vs. 14301332, 0.00%) 55 (vs. 55, 0.00%)
MobileNetV3Small_fp32(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,max-concurrency,repeated-kernel,compile-stats] 26494 (vs. 28682, 7.63%↓) 282206 (vs. 282206, 0.00%) 10694766 (vs. 10694766, 0.00%) 94 (vs. 94, 0.00%)
MobileBertSquad_fp16(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,max-concurrency,repeated-kernel,demote-f32-to-f16,compile-stats] 158424 (vs. 150872, 5.01%↑) 2977704 (vs. 2977704, 0.00%) 54223904 (vs. 54223904, 0.00%) 728 (vs. 728, 0.00%)
MobileBertSquad_int8(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,max-concurrency,repeated-kernel,compile-stats] 326743 (vs. 297321, 9.90%↑) 7478338 (vs. 7478338, 0.00%) 35447872 (vs. 35447872, 0.00%) 1452 (vs. 1452, 0.00%)
EfficientNet_int8(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,max-concurrency,repeated-kernel,compile-stats] 39209 (vs. 36084, 8.66%↑) 547940 (vs. 547940, 0.00%) 5824019 (vs. 5824019, 0.00%) 120 (vs. 120, 0.00%)
PersonDetect_int8(tflite) [arm-valhall-vulkan_android31-vulkan_spirv][experimental-flags,fuse-padding,max-concurrency,repeated-kernel,compile-stats] 25537 (vs. 25495, 0.16%↑) 339890 (vs. 339890, 0.00%) 748510 (vs. 748510, 0.00%) 74 (vs. 74, 0.00%)
ClipTextSeqLen64PT(linalg) [nvidia-ampere-vulkan_linux-vulkan_spirv][experimental-flags,tensorcore,compile-stats] 89103 (vs. 80865, 10.19%↑) 201624 (vs. 201624, 0.00%) 492521249 (vs. 492521249, 0.00%) 317 (vs. 317, 0.00%)
Unet2dPT(linalg) [nvidia-ampere-vulkan_linux-vulkan_spirv][experimental-flags,tensorcore,compile-stats] 213373 (vs. 196537, 8.57%↑) 3739948 (vs. 3739948, 0.00%) 3442153507 (vs. 3442153507, 0.00%) 1205 (vs. 1205, 0.00%)
ClipTextSeqLen64PT(linalg) [nvidia-pascal-vulkan_linux-vulkan_spirv][experimental-flags,simt,compile-stats] 92042 (vs. 83395, 10.37%↑) 201624 (vs. 201624, 0.00%) 492521249 (vs. 492521249, 0.00%) 317 (vs. 317, 0.00%)
Unet2dPT(linalg) [nvidia-pascal-vulkan_linux-vulkan_spirv][experimental-flags,simt,compile-stats] 195496 (vs. 182623, 7.05%↑) 3735296 (vs. 3735296, 0.00%) 3442148835 (vs. 3442148835, 0.00%) 1205 (vs. 1205, 0.00%)
MobileNetV2_fp32(tflite) [vmvx-generic-vmvx-vmvx][experimental-flags,compile-stats] 17212 (vs. 19030, 9.55%↓) 95861 (vs. 95861, 0.00%) 14079102 (vs. 14079102, 0.00%) 73 (vs. 73, 0.00%)
MobileNetV3Small_fp32(tflite) [vmvx-generic-vmvx-vmvx][experimental-flags,compile-stats] 26414 (vs. 25338, 4.25%↑) 164533 (vs. 164533, 0.00%) 10385854 (vs. 10385854, 0.00%) 98 (vs. 98, 0.00%)