Full Benchmark Summary

Regressed Benchmarks 🚩

Benchmark Name Average Latency (ms) Median Latency (ms) Latency Standard Deviation (ms)
MobileBertSquad [fp32] (TensorFlow) big-core,full-inference with IREE-Dylib-Sync @ Pixel-4 (CPU-ARMv8.2-A) 898 (vs. 729, 23.18%↑) 897 3
MobileNetV3Small [fp32,imagenet] (TensorFlow) kernel-execution with IREE-Vulkan @ SM-G980F (GPU-Mali-G77) 23 (vs. 20, 15.00%↑) 23 0
MobileNetV2 [fp32,imagenet] (TensorFlow) 3-thread,big-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) 241 (vs. 228, 5.70%↑) 242 28

Improved Benchmarks 🎉

Benchmark Name Average Latency (ms) Median Latency (ms) Latency Standard Deviation (ms)
MobileNetV2 [fp32,imagenet] (TensorFlow) kernel-execution with IREE-Vulkan @ SM-G980F (GPU-Mali-G77) 15 (vs. 17, 11.76%↓) 14 1
MobileNetV3Small [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) 393 (vs. 433, 9.24%↓) 389 28
MobileNetV2 [fp32,imagenet] (TensorFlow) kernel-execution with IREE-Vulkan @ Pixel-4 (GPU-Adreno-640) 66 (vs. 70, 5.71%↓) 65 1
MobileNetV3Small [fp32,imagenet] (TensorFlow) kernel-execution with IREE-Vulkan @ Pixel-4 (GPU-Adreno-640) 33 (vs. 35, 5.71%↓) 33 1

Similar Benchmarks

Benchmark Name Average Latency (ms) Median Latency (ms) Latency Standard Deviation (ms)
MobileNetV2 [fp32,imagenet] (TensorFlow) full-inference with IREE-Vulkan @ Pixel-4 (GPU-Adreno-640) 71 (vs. 74, 4.05%↓) 71 0
MobileBertSquad [fp32] (TensorFlow) 1-thread,little-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) 5616 (vs. 5848, 3.97%↓) 5890 392
MobileNetV3Small [fp32,imagenet] (TensorFlow) full-inference with IREE-Vulkan @ Pixel-4 (GPU-Adreno-640) 38 (vs. 39, 2.56%↓) 38 0
MobileNetV3Small [fp32,imagenet] (TensorFlow) full-inference with IREE-Vulkan @ SM-G980F (GPU-Mali-G77) 77 (vs. 79, 2.53%↓) 81 10
MobileNetV2 [fp32,imagenet] (TensorFlow) full-inference with IREE-Vulkan @ SM-G980F (GPU-Mali-G77) 81 (vs. 83, 2.41%↓) 82 1
MobileNetV2 [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) 1003 (vs. 983, 2.03%↑) 1033 47
MobileNetV3Small [fp32,imagenet] (TensorFlow) 1-thread,big-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) 51 (vs. 50, 2.00%↑) 51 0
MobileBertSquad [fp32] (TensorFlow) full-inference with IREE-Vulkan @ SM-G980F (GPU-Mali-G77) 212 (vs. 216, 1.85%↓) 211 3
MobileNetV3Small [fp32,imagenet] (TensorFlow) big-core,full-inference with IREE-Dylib-Sync @ Pixel-4 (CPU-ARMv8.2-A) 61 (vs. 60, 1.67%↑) 61 0
MobileNetV3Small [fp32,imagenet] (TensorFlow) 1-thread,little-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) 377 (vs. 371, 1.62%↑) 377 1
MobileNetV2 [fp32,imagenet] (TensorFlow) big-core,full-inference with IREE-Dylib-Sync @ SM-G980F (CPU-ARMv8.2-A) 142 (vs. 144, 1.39%↓) 141 6
MobileNetV2 [fp32,imagenet] (TensorFlow) 1-thread,big-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) 172 (vs. 170, 1.18%↑) 171 1
MobileNetV2 [fp32,imagenet] (TensorFlow) 3-thread,big-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) 193 (vs. 191, 1.05%↑) 195 7
MobileNetV3Small [fp32,imagenet] (TensorFlow) little-core,full-inference with IREE-Dylib-Sync @ Pixel-4 (CPU-ARMv8.2-A) 387 (vs. 391, 1.02%↓) 389 7
MobileBertSquad [fp32] (TensorFlow) 3-thread,little-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) 2072 (vs. 2090, 0.86%↓) 2099 91
MobileBertSquad [fp32] (TensorFlow) little-core,full-inference with IREE-Dylib-Sync @ SM-G980F (CPU-ARMv8.2-A) 5872 (vs. 5822, 0.86%↑) 5916 100
MobileNetV3Small [fp32,imagenet] (TensorFlow) 3-thread,big-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) 122 (vs. 121, 0.83%↑) 124 8
MobileNetV3Small [fp32,imagenet] (TensorFlow) little-core,full-inference with IREE-Dylib-Sync @ SM-G980F (CPU-ARMv8.2-A) 376 (vs. 373, 0.80%↑) 376 5
MobileNetV2 [fp32,imagenet] (TensorFlow) 1-thread,little-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) 1262 (vs. 1252, 0.80%↑) 1266 10
MobileNetV2 [fp32,imagenet] (TensorFlow) 1-thread,little-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) 1345 (vs. 1335, 0.75%↑) 1346 7
MobileNetV3Small [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-VMVX @ SM-G980F (CPU-ARMv8.2-A) 16824 (vs. 16944, 0.71%↓) 16820 19
MobileNetV2 [fp32,imagenet] (TensorFlow) little-core,full-inference with IREE-Dylib-Sync @ Pixel-4 (CPU-ARMv8.2-A) 1344 (vs. 1335, 0.67%↑) 1345 7
MobileNetV3Small [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-VMVX @ Pixel-4 (CPU-ARMv8.2-A) 18976 (vs. 19100, 0.65%↓) 18975 11
MobileNetV3Small [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) 316 (vs. 314, 0.64%↑) 318 6
MobileNetV2 [fp32,imagenet] (TensorFlow) little-core,full-inference with IREE-Dylib-Sync @ SM-G980F (CPU-ARMv8.2-A) 1265 (vs. 1257, 0.64%↑) 1266 4
MobileNetV2 [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-VMVX @ SM-G980F (CPU-ARMv8.2-A) 63008 (vs. 63381, 0.59%↓) 63022 52
MobileBertSquad [fp32] (TensorFlow) 3-thread,big-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) 343 (vs. 345, 0.58%↓) 343 1
MobileNetV2 [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-VMVX @ Pixel-4 (CPU-ARMv8.2-A) 71175 (vs. 71540, 0.51%↓) 71177 55
MobileBertSquad [fp32] (TensorFlow) 1-thread,big-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) 793 (vs. 789, 0.51%↑) 790 11
MobileNetV2 [fp32,imagenet] (TensorFlow) big-core,full-inference with IREE-Dylib-Sync @ Pixel-4 (CPU-ARMv8.2-A) 211 (vs. 210, 0.48%↑) 210 2
MobileBertSquad [fp32] (TensorFlow) 3-thread,big-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) 512 (vs. 510, 0.39%↑) 513 37
MobileBertSquad [fp32] (TensorFlow) 3-thread,little-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) 2064 (vs. 2071, 0.34%↓) 2061 14
MobileBertSquad [fp32] (TensorFlow) 1-thread,little-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) 5618 (vs. 5634, 0.28%↓) 5619 5
MobileBertSquad [fp32] (TensorFlow) 1-thread,big-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) 729 (vs. 731, 0.27%↓) 729 2
MobileBertSquad [fp32] (TensorFlow) full-inference with IREE-Vulkan @ Pixel-4 (GPU-Adreno-640) 875 (vs. 877, 0.23%↓) 870 15
MobileBertSquad [fp32] (TensorFlow) little-core,full-inference with IREE-Dylib-Sync @ Pixel-4 (CPU-ARMv8.2-A) 5637 (vs. 5632, 0.09%↑) 5637 9
MobileNetV2 [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) 873 (vs. 873, 0.00%) 876 11
MobileNetV2 [fp32,imagenet] (TensorFlow) 1-thread,big-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) 140 (vs. 140, 0.00%) 140 1
MobileBertSquad [fp16] (TensorFlow) kernel-execution with IREE-Vulkan @ SM-G980F (GPU-Mali-G77) 156 (vs. 156, 0.00%) 156 1
MobileNetV3Small [fp32,imagenet] (TensorFlow) 3-thread,big-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) 80 (vs. 80, 0.00%) 80 3
MobileNetV3Small [fp32,imagenet] (TensorFlow) 1-thread,big-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) 46 (vs. 46, 0.00%) 46 0
MobileNetV3Small [fp32,imagenet] (TensorFlow) big-core,full-inference with IREE-Dylib-Sync @ SM-G980F (CPU-ARMv8.2-A) 46 (vs. 46, 0.00%) 46 0
MobileBertSquad [fp32] (TensorFlow) big-core,full-inference with IREE-Dylib-Sync @ SM-G980F (CPU-ARMv8.2-A) 782 (vs. 782, 0.00%) 781 13
MobileNetV3Small [fp32,imagenet] (TensorFlow) 1-thread,little-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) 390 (vs. 390, 0.00%) 389 3
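
The "(vs. base, x%↑/↓)" figures in the tables above are relative changes in average latency against a baseline run. The sketch below reproduces that arithmetic and the three-way grouping. The 5% cutoff is an assumption inferred from the rows shown here (5.70% is listed as regressed, 4.05% as similar); the report does not state its threshold, and the function names are hypothetical, not part of the IREE tooling.

```python
def percent_change(current_ms: float, base_ms: float) -> float:
    """Relative change of the current average latency vs. the baseline, in percent."""
    return (current_ms - base_ms) / base_ms * 100.0


def classify(current_ms: float, base_ms: float, threshold_pct: float = 5.0) -> str:
    """Group a benchmark by its latency change.

    threshold_pct is an assumed cutoff (5%), inferred from the tables,
    not a documented value.
    """
    change = percent_change(current_ms, base_ms)
    if change > threshold_pct:
        return "regressed"   # latency went up by more than the threshold
    if change < -threshold_pct:
        return "improved"    # latency went down by more than the threshold
    return "similar"


# Rows taken from the tables above: (current average, baseline average) in ms.
rows = {
    "MobileBertSquad Dylib-Sync big-core @ Pixel-4": (898, 729),
    "MobileNetV2 Vulkan kernel-execution @ SM-G980F": (15, 17),
    "MobileNetV2 Vulkan full-inference @ Pixel-4": (71, 74),
}

for name, (current, base) in rows.items():
    print(f"{name}: {percent_change(current, base):+.2f}% -> {classify(current, base)}")
```

Under this assumed cutoff, the three sample rows land in the same groups the report assigns them.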