## Full Benchmark Summary

### Regressed Benchmarks 🚩

| Benchmark Name | Average Latency (ms) | Median Latency (ms) | Latency Standard Deviation (ms) |
| --- | --- | --- | --- |
| MobileBertSquad [fp32] (TensorFlow) big-core,full-inference with IREE-Dylib-Sync @ Pixel-4 (CPU-ARMv8.2-A) | 894 (vs. 726, 23.14%↑) | 893 | 3 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 356 (vs. 318, 11.95%↑) | 352 | 29 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 998 (vs. 946, 5.50%↑) | 995 | 50 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 3-thread,big-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 118 (vs. 112, 5.36%↑) | 121 | 8 |

### Improved Benchmarks 🎉

| Benchmark Name | Average Latency (ms) | Median Latency (ms) | Latency Standard Deviation (ms) |
| --- | --- | --- | --- |
| MobileNetV2 [fp32,imagenet] (TensorFlow) kernel-execution with IREE-Vulkan @ SM-G980F (GPU-Mali-G77) | 14 (vs. 18, 22.22%↓) | 14 | 0 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) full-inference with IREE-Vulkan @ SM-G980F (GPU-Mali-G77) | 77 (vs. 87, 11.49%↓) | 83 | 10 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) kernel-execution with IREE-Vulkan @ Pixel-4 (GPU-Adreno-640) | 65 (vs. 70, 7.14%↓) | 65 | 1 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) kernel-execution with IREE-Vulkan @ Pixel-4 (GPU-Adreno-640) | 33 (vs. 35, 5.71%↓) | 33 | 1 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) full-inference with IREE-Vulkan @ Pixel-4 (GPU-Adreno-640) | 70 (vs. 74, 5.41%↓) | 70 | 0 |

### Similar Benchmarks

| Benchmark Name | Average Latency (ms) | Median Latency (ms) | Latency Standard Deviation (ms) |
| --- | --- | --- | --- |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) kernel-execution with IREE-Vulkan @ SM-G980F (GPU-Mali-G77) | 19 (vs. 20, 5.00%↓) | 19 | 0 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 3-thread,big-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 242 (vs. 249, 2.81%↓) | 236 | 26 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) full-inference with IREE-Vulkan @ Pixel-4 (GPU-Adreno-640) | 37 (vs. 38, 2.63%↓) | 37 | 0 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) full-inference with IREE-Vulkan @ SM-G980F (GPU-Mali-G77) | 79 (vs. 81, 2.47%↓) | 80 | 4 |
| MobileBertSquad [fp32] (TensorFlow) full-inference with IREE-Vulkan @ SM-G980F (GPU-Mali-G77) | 210 (vs. 215, 2.33%↓) | 210 | 2 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) big-core,full-inference with IREE-Dylib-Sync @ SM-G980F (CPU-ARMv8.2-A) | 45 (vs. 46, 2.17%↓) | 45 | 0 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 1-thread,big-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 51 (vs. 50, 2.00%↑) | 51 | 0 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) big-core,full-inference with IREE-Dylib-Sync @ Pixel-4 (CPU-ARMv8.2-A) | 61 (vs. 60, 1.67%↑) | 61 | 0 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 1-thread,big-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 138 (vs. 140, 1.43%↓) | 138 | 1 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 295 (vs. 291, 1.37%↑) | 293 | 5 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 883 (vs. 872, 1.26%↑) | 883 | 4 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) little-core,full-inference with IREE-Dylib-Sync @ SM-G980F (CPU-ARMv8.2-A) | 1268 (vs. 1253, 1.20%↑) | 1266 | 5 |
| MobileBertSquad [fp32] (TensorFlow) 3-thread,big-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 331 (vs. 328, 0.91%↑) | 329 | 7 |
| MobileBertSquad [fp32] (TensorFlow) 1-thread,big-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 783 (vs. 790, 0.89%↓) | 783 | 2 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 1-thread,little-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 1265 (vs. 1254, 0.88%↑) | 1265 | 3 |
| MobileBertSquad [fp32] (TensorFlow) 3-thread,little-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 2409 (vs. 2430, 0.86%↓) | 2419 | 79 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-VMVX @ Pixel-4 (CPU-ARMv8.2-A) | 16151 (vs. 16290, 0.85%↓) | 16152 | 9 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-VMVX @ SM-G980F (CPU-ARMv8.2-A) | 14368 (vs. 14487, 0.82%↓) | 14376 | 33 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) little-core,full-inference with IREE-Dylib-Sync @ SM-G980F (CPU-ARMv8.2-A) | 374 (vs. 371, 0.81%↑) | 375 | 2 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 1-thread,little-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 375 (vs. 372, 0.81%↑) | 376 | 3 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 1-thread,little-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 389 (vs. 386, 0.78%↑) | 390 | 2 |
| MobileBertSquad [fp32] (TensorFlow) big-core,full-inference with IREE-Dylib-Sync @ SM-G980F (CPU-ARMv8.2-A) | 769 (vs. 775, 0.77%↓) | 768 | 8 |
| MobileBertSquad [fp32] (TensorFlow) little-core,full-inference with IREE-Dylib-Sync @ Pixel-4 (CPU-ARMv8.2-A) | 5629 (vs. 5586, 0.77%↑) | 5630 | 10 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-VMVX @ SM-G980F (CPU-ARMv8.2-A) | 62306 (vs. 62696, 0.62%↓) | 62300 | 35 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-VMVX @ Pixel-4 (CPU-ARMv8.2-A) | 70153 (vs. 70591, 0.62%↓) | 70163 | 44 |
| MobileBertSquad [fp32] (TensorFlow) 1-thread,little-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 5833 (vs. 5869, 0.61%↓) | 5920 | 175 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 1-thread,big-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 171 (vs. 170, 0.59%↑) | 171 | 0 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) little-core,full-inference with IREE-Dylib-Sync @ Pixel-4 (CPU-ARMv8.2-A) | 389 (vs. 387, 0.52%↑) | 390 | 2 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) big-core,full-inference with IREE-Dylib-Sync @ Pixel-4 (CPU-ARMv8.2-A) | 213 (vs. 212, 0.47%↑) | 213 | 2 |
| MobileBertSquad [fp32] (TensorFlow) full-inference with IREE-Vulkan @ Pixel-4 (GPU-Adreno-640) | 873 (vs. 870, 0.34%↑) | 867 | 17 |
| MobileBertSquad [fp32] (TensorFlow) little-core,full-inference with IREE-Dylib-Sync @ SM-G980F (CPU-ARMv8.2-A) | 5920 (vs. 5933, 0.22%↓) | 5917 | 10 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 1-thread,little-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 1328 (vs. 1326, 0.15%↑) | 1332 | 8 |
| MobileBertSquad [fp32] (TensorFlow) 1-thread,little-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 5559 (vs. 5553, 0.11%↑) | 5557 | 9 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) little-core,full-inference with IREE-Dylib-Sync @ Pixel-4 (CPU-ARMv8.2-A) | 1337 (vs. 1336, 0.07%↑) | 1338 | 4 |
| MobileBertSquad [fp32] (TensorFlow) 3-thread,little-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 1996 (vs. 1997, 0.05%↓) | 1997 | 4 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 3-thread,big-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 198 (vs. 198, 0.00%) | 198 | 5 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) big-core,full-inference with IREE-Dylib-Sync @ SM-G980F (CPU-ARMv8.2-A) | 140 (vs. 140, 0.00%) | 138 | 4 |
| MobileBertSquad [fp16] (TensorFlow) kernel-execution with IREE-Vulkan @ SM-G980F (GPU-Mali-G77) | 156 (vs. 156, 0.00%) | 155 | 1 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 3-thread,big-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 78 (vs. 78, 0.00%) | 77 | 3 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 1-thread,big-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 45 (vs. 45, 0.00%) | 45 | 0 |
| MobileBertSquad [fp32] (TensorFlow) 3-thread,big-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 437 (vs. 437, 0.00%) | 435 | 26 |
| MobileBertSquad [fp32] (TensorFlow) 1-thread,big-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 728 (vs. 728, 0.00%) | 728 | 2 |
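
The "Average Latency" column reports each run's average against its baseline as a relative change. The sketch below is a minimal illustration of that arithmetic and of the three buckets used above; the 5% cut-off and the function names are assumptions inferred from the tables, not the benchmarking bot's actual implementation.

```python
# Minimal sketch (assumed, not the bot's code) of the delta and bucketing logic.

def relative_change(current_ms: float, baseline_ms: float) -> float:
    """Percentage change of the current average latency vs. the baseline."""
    return (current_ms - baseline_ms) / baseline_ms * 100.0


def categorize(current_ms: float, baseline_ms: float, threshold_pct: float = 5.0) -> str:
    """Bucket a benchmark as regressed, improved, or similar.

    The 5% threshold is an assumption consistent with the tables above
    (e.g. a 5.36% increase is listed as regressed, a 5.00% decrease as similar).
    """
    change = relative_change(current_ms, baseline_ms)
    if change > threshold_pct:
        return "regressed"
    if change < -threshold_pct:
        return "improved"
    return "similar"


if __name__ == "__main__":
    # First row of the regressed table: 894 ms vs. a 726 ms baseline.
    print(f"{relative_change(894, 726):.2f}%")  # 23.14%
    print(categorize(894, 726))                 # regressed
```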