Full Benchmark Summary

Improved Benchmarks 🎉

| Benchmark Name | Average Latency (ms) | Median Latency (ms) | Latency Standard Deviation (ms) |
| --- | --- | --- | --- |
| MobileNetV2 [fp32,imagenet] (TensorFlow) kernel-execution with IREE-Vulkan @ SM-G980F (GPU-Mali-G77) | 15 (vs. 17, 11.76%↓) | 14 | 1 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 3-thread,big-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 141 (vs. 156, 9.62%↓) | 152 | 24 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 3-thread,big-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 268 (vs. 290, 7.59%↓) | 272 | 28 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) kernel-execution with IREE-Vulkan @ Pixel-4 (GPU-Adreno-640) | 65 (vs. 70, 7.14%↓) | 65 | 1 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) kernel-execution with IREE-Vulkan @ Pixel-4 (GPU-Adreno-640) | 33 (vs. 35, 5.71%↓) | 33 | 1 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) full-inference with IREE-Vulkan @ Pixel-4 (GPU-Adreno-640) | 70 (vs. 74, 5.41%↓) | 70 | 0 |
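The percentage in cells like "15 (vs. 17, 11.76%↓)" is consistent with the change in average latency relative to the previous run's average. A minimal sketch (the `percent_change` helper is illustrative, not part of the benchmark tooling):

```python
def percent_change(new_ms: float, old_ms: float) -> float:
    """Percent change of the new average latency relative to the old one.

    A cell like "15 (vs. 17, 11.76%)" corresponds to percent_change(15, 17).
    """
    return abs(new_ms - old_ms) / old_ms * 100.0

# Reproduce two rows from the table above:
print(f"{percent_change(15, 17):.2f}%")    # 11.76%
print(f"{percent_change(141, 156):.2f}%")  # 9.62%
```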

Similar Benchmarks

| Benchmark Name | Average Latency (ms) | Median Latency (ms) | Latency Standard Deviation (ms) |
| --- | --- | --- | --- |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) kernel-execution with IREE-Vulkan @ SM-G980F (GPU-Mali-G77) | 19 (vs. 20, 5.00%↓) | 19 | 0 |
| MobileBertSquad [fp32] (TensorFlow) 3-thread,big-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 503 (vs. 522, 3.64%↓) | 493 | 35 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 1-thread,little-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 357 (vs. 370, 3.51%↓) | 376 | 30 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 975 (vs. 1010, 3.47%↓) | 950 | 63 |
| MobileBertSquad [fp32] (TensorFlow) full-inference with IREE-Vulkan @ SM-G980F (GPU-Mali-G77) | 211 (vs. 217, 2.76%↓) | 211 | 2 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 446 (vs. 434, 2.76%↑) | 422 | 64 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) full-inference with IREE-Vulkan @ Pixel-4 (GPU-Adreno-640) | 37 (vs. 38, 2.63%↓) | 37 | 0 |
| MobileBertSquad [fp32] (TensorFlow) 3-thread,little-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 2139 (vs. 2088, 2.44%↑) | 2170 | 71 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) full-inference with IREE-Vulkan @ SM-G980F (GPU-Mali-G77) | 81 (vs. 82, 1.22%↓) | 82 | 1 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 3-thread,big-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 196 (vs. 194, 1.03%↑) | 196 | 6 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) little-core,full-inference with IREE-Dylib-Sync @ SM-G980F (CPU-ARMv8.2-A) | 1263 (vs. 1251, 0.96%↑) | 1264 | 9 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 316 (vs. 313, 0.96%↑) | 317 | 6 |
| MobileBertSquad [fp32] (TensorFlow) 1-thread,little-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 5844 (vs. 5900, 0.95%↓) | 5906 | 141 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 878 (vs. 870, 0.92%↑) | 878 | 7 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) little-core,full-inference with IREE-Dylib-Sync @ SM-G980F (CPU-ARMv8.2-A) | 375 (vs. 372, 0.81%↑) | 376 | 4 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-VMVX @ SM-G980F (CPU-ARMv8.2-A) | 16829 (vs. 16961, 0.78%↓) | 16822 | 25 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) little-core,full-inference with IREE-Dylib-Sync @ Pixel-4 (CPU-ARMv8.2-A) | 1328 (vs. 1318, 0.76%↑) | 1337 | 22 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-VMVX @ Pixel-4 (CPU-ARMv8.2-A) | 18970 (vs. 19114, 0.75%↓) | 18970 | 26 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 1-thread,big-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 140 (vs. 141, 0.71%↓) | 140 | 1 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) big-core,full-inference with IREE-Dylib-Sync @ SM-G980F (CPU-ARMv8.2-A) | 142 (vs. 141, 0.71%↑) | 141 | 6 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-VMVX @ Pixel-4 (CPU-ARMv8.2-A) | 71067 (vs. 71516, 0.63%↓) | 71059 | 54 |
| MobileBertSquad [fp32] (TensorFlow) 3-thread,big-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 342 (vs. 344, 0.58%↓) | 342 | 1 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-VMVX @ SM-G980F (CPU-ARMv8.2-A) | 63032 (vs. 63392, 0.57%↓) | 63035 | 38 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 1-thread,little-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 382 (vs. 380, 0.53%↑) | 383 | 8 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) little-core,full-inference with IREE-Dylib-Sync @ Pixel-4 (CPU-ARMv8.2-A) | 382 (vs. 384, 0.52%↓) | 378 | 8 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 1-thread,little-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 1258 (vs. 1252, 0.48%↑) | 1265 | 15 |
| MobileBertSquad [fp32] (TensorFlow) big-core,full-inference with IREE-Dylib-Sync @ SM-G980F (CPU-ARMv8.2-A) | 783 (vs. 786, 0.38%↓) | 785 | 9 |
| MobileBertSquad [fp32] (TensorFlow) 1-thread,big-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 788 (vs. 791, 0.38%↓) | 788 | 3 |
| MobileBertSquad [fp32] (TensorFlow) 1-thread,big-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 725 (vs. 727, 0.28%↓) | 725 | 1 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 1-thread,little-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 1323 (vs. 1320, 0.23%↑) | 1333 | 20 |
| MobileBertSquad [fp32] (TensorFlow) little-core,full-inference with IREE-Dylib-Sync @ Pixel-4 (CPU-ARMv8.2-A) | 5533 (vs. 5524, 0.16%↑) | 5534 | 47 |
| MobileBertSquad [fp32] (TensorFlow) big-core,full-inference with IREE-Dylib-Sync @ Pixel-4 (CPU-ARMv8.2-A) | 894 (vs. 895, 0.11%↓) | 893 | 3 |
| MobileBertSquad [fp32] (TensorFlow) little-core,full-inference with IREE-Dylib-Sync @ SM-G980F (CPU-ARMv8.2-A) | 5851 (vs. 5857, 0.10%↓) | 5886 | 122 |
| MobileBertSquad [fp32] (TensorFlow) 3-thread,little-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 2025 (vs. 2026, 0.05%↓) | 2034 | 33 |
| MobileBertSquad [fp32] (TensorFlow) 1-thread,little-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 5491 (vs. 5490, 0.02%↑) | 5480 | 44 |
| MobileBertSquad [fp16] (TensorFlow) kernel-execution with IREE-Vulkan @ SM-G980F (GPU-Mali-G77) | 155 (vs. 155, 0.00%) | 155 | 1 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) full-inference with IREE-Vulkan @ SM-G980F (GPU-Mali-G77) | 80 (vs. 80, 0.00%) | 81 | 2 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 3-thread,big-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 79 (vs. 79, 0.00%) | 79 | 4 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 1-thread,big-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 46 (vs. 46, 0.00%) | 46 | 0 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) big-core,full-inference with IREE-Dylib-Sync @ SM-G980F (CPU-ARMv8.2-A) | 46 (vs. 46, 0.00%) | 46 | 0 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 1-thread,big-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 170 (vs. 170, 0.00%) | 170 | 0 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) big-core,full-inference with IREE-Dylib-Sync @ Pixel-4 (CPU-ARMv8.2-A) | 211 (vs. 211, 0.00%) | 210 | 2 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 1-thread,big-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 51 (vs. 51, 0.00%) | 51 | 0 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) big-core,full-inference with IREE-Dylib-Sync @ Pixel-4 (CPU-ARMv8.2-A) | 60 (vs. 60, 0.00%) | 61 | 0 |
| MobileBertSquad [fp32] (TensorFlow) full-inference with IREE-Vulkan @ Pixel-4 (GPU-Adreno-640) | 871 (vs. 871, 0.00%) | 867 | 14 |