Full Benchmark Summary

Regressed Benchmarks 🚩

| Benchmark Name | Average Latency (ms) | Median Latency (ms) | Latency Standard Deviation (ms) |
| --- | --- | --- | --- |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 3-thread,big-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 134 (vs. 122, 9.84%↑) | 132 | 15 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) kernel-execution with IREE-Vulkan @ SM-G980F (GPU-Mali-G77) | 18 (vs. 17, 5.88%↑) | 18 | 0 |
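The percentage in the Average Latency column is the relative change of the current average against the baseline average. A minimal sketch of that arithmetic (the function name and rounding are illustrative, not the bot's actual code):

```python
def percent_change(current_ms: float, baseline_ms: float) -> float:
    """Relative change of the current average latency vs. the baseline, in percent.

    Positive values mean the benchmark got slower (a potential regression);
    negative values mean it got faster.
    """
    return (current_ms - baseline_ms) / baseline_ms * 100.0

# First regressed row above: 134 ms average vs. a 122 ms baseline.
change = percent_change(134, 122)
print(f"{abs(round(change, 2))}%{'↑' if change > 0 else '↓'}")  # → 9.84%↑
```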

Improved Benchmarks 🎉

| Benchmark Name | Average Latency (ms) | Median Latency (ms) | Latency Standard Deviation (ms) |
| --- | --- | --- | --- |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 1-thread,little-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 1130 (vs. 1255, 9.96%↓) | 1092 | 70 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 787 (vs. 870, 9.54%↓) | 757 | 54 |

Similar Benchmarks

| Benchmark Name | Average Latency (ms) | Median Latency (ms) | Latency Standard Deviation (ms) |
| --- | --- | --- | --- |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 3-thread,big-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 227 (vs. 237, 4.22%↓) | 227 | 13 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 3-thread,big-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 79 (vs. 77, 2.60%↑) | 78 | 3 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) full-inference with IREE-Vulkan @ SM-G980F (GPU-Mali-G77) | 81 (vs. 79, 2.53%↑) | 81 | 2 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) full-inference with IREE-Vulkan @ SM-G980F (GPU-Mali-G77) | 84 (vs. 82, 2.44%↑) | 85 | 5 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 1-thread,big-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 46 (vs. 45, 2.22%↑) | 46 | 0 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 3-thread,big-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 195 (vs. 191, 2.09%↑) | 195 | 5 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 431 (vs. 440, 2.05%↓) | 420 | 51 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) big-core,full-inference with IREE-Dylib-Sync @ Pixel-4 (CPU-ARMv8.2-A) | 60 (vs. 61, 1.64%↓) | 61 | 0 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) little-core,full-inference with IREE-Dylib-Sync @ SM-G980F (CPU-ARMv8.2-A) | 374 (vs. 368, 1.63%↑) | 374 | 5 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 1-thread,big-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 141 (vs. 139, 1.44%↑) | 142 | 2 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 1-thread,little-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 372 (vs. 375, 0.80%↓) | 373 | 3 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) big-core,full-inference with IREE-Dylib-Sync @ SM-G980F (CPU-ARMv8.2-A) | 141 (vs. 140, 0.71%↑) | 142 | 2 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 1-thread,little-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 388 (vs. 386, 0.52%↑) | 388 | 2 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 966 (vs. 962, 0.42%↑) | 959 | 30 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 1-thread,little-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 1328 (vs. 1325, 0.23%↑) | 1328 | 4 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-VMVX @ SM-G980F (CPU-ARMv8.2-A) | 63313 (vs. 63408, 0.15%↓) | 63413 | 216 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-VMVX @ SM-G980F (CPU-ARMv8.2-A) | 16938 (vs. 16952, 0.08%↓) | 16943 | 20 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-VMVX @ Pixel-4 (CPU-ARMv8.2-A) | 19119 (vs. 19134, 0.08%↓) | 19127 | 27 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) little-core,full-inference with IREE-Dylib-Sync @ Pixel-4 (CPU-ARMv8.2-A) | 1330 (vs. 1329, 0.08%↑) | 1328 | 4 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-VMVX @ Pixel-4 (CPU-ARMv8.2-A) | 71581 (vs. 71565, 0.02%↑) | 71555 | 69 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) little-core,full-inference with IREE-Dylib-Sync @ SM-G980F (CPU-ARMv8.2-A) | 1252 (vs. 1252, 0.00%) | 1251 | 5 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) kernel-execution with IREE-Vulkan @ SM-G980F (GPU-Mali-G77) | 20 (vs. 20, 0.00%) | 20 | 0 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 3-thread,little-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 315 (vs. 315, 0.00%) | 315 | 6 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) big-core,full-inference with IREE-Dylib-Sync @ SM-G980F (CPU-ARMv8.2-A) | 46 (vs. 46, 0.00%) | 46 | 0 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) 1-thread,big-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 170 (vs. 170, 0.00%) | 170 | 0 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) big-core,full-inference with IREE-Dylib-Sync @ Pixel-4 (CPU-ARMv8.2-A) | 211 (vs. 211, 0.00%) | 210 | 2 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) full-inference with IREE-Vulkan @ Pixel-4 (GPU-Adreno-640) | 74 (vs. 74, 0.00%) | 74 | 0 |
| MobileNetV2 [fp32,imagenet] (TensorFlow) kernel-execution with IREE-Vulkan @ Pixel-4 (GPU-Adreno-640) | 70 (vs. 70, 0.00%) | 70 | 1 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) 1-thread,big-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 51 (vs. 51, 0.00%) | 51 | 0 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) little-core,full-inference with IREE-Dylib-Sync @ Pixel-4 (CPU-ARMv8.2-A) | 388 (vs. 388, 0.00%) | 388 | 2 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) full-inference with IREE-Vulkan @ Pixel-4 (GPU-Adreno-640) | 38 (vs. 38, 0.00%) | 38 | 0 |
| MobileNetV3Small [fp32,imagenet] (TensorFlow) kernel-execution with IREE-Vulkan @ Pixel-4 (GPU-Adreno-640) | 35 (vs. 35, 0.00%) | 34 | 1 |
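The three sections above bucket each benchmark by the magnitude of its latency change. A 5% cutoff reproduces the grouping shown here (the smallest change listed as regressed is 5.88% and the largest listed as similar is 4.22%), but that threshold is an inference, not the bot's documented value. A sketch under that assumption:

```python
def classify(current_ms: float, baseline_ms: float, threshold_pct: float = 5.0) -> str:
    """Bucket a benchmark as regressed, improved, or similar.

    The 5.0 default threshold is an assumption inferred from the tables
    in this summary, not the bot's documented cutoff.
    """
    change = (current_ms - baseline_ms) / baseline_ms * 100.0
    if change >= threshold_pct:
        return "regressed"   # got slower by at least the threshold
    if change <= -threshold_pct:
        return "improved"    # got faster by at least the threshold
    return "similar"

print(classify(134, 122))    # first regressed row (9.84% slower)
print(classify(1130, 1255))  # first improved row (9.96% faster)
print(classify(227, 237))    # first similar row (4.22% faster)
```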

Similar Benchmarks

| Benchmark Name | Average Latency (ms) | Median Latency (ms) | Latency Standard Deviation (ms) |
| --- | --- | --- | --- |
| MobileBertSquad [fp16] (TensorFlow) kernel-execution with IREE-Vulkan @ SM-G980F (GPU-Mali-G77) | 157 | 156 | 1 |
| MobileBertSquad [fp32] (TensorFlow) 3-thread,little-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 2091 | 2132 | 91 |
| MobileBertSquad [fp32] (TensorFlow) 1-thread,little-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 5419 | 5453 | 497 |
| MobileBertSquad [fp32] (TensorFlow) full-inference with IREE-Vulkan @ SM-G980F (GPU-Mali-G77) | 215 | 215 | 2 |
| MobileBertSquad [fp32] (TensorFlow) 3-thread,big-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 543 | 547 | 42 |
| MobileBertSquad [fp32] (TensorFlow) 1-thread,big-core,full-inference with IREE-Dylib @ SM-G980F (CPU-ARMv8.2-A) | 786 | 789 | 9 |
| MobileBertSquad [fp32] (TensorFlow) little-core,full-inference with IREE-Dylib-Sync @ SM-G980F (CPU-ARMv8.2-A) | 5827 | 5867 | 118 |
| MobileBertSquad [fp32] (TensorFlow) big-core,full-inference with IREE-Dylib-Sync @ SM-G980F (CPU-ARMv8.2-A) | 790 | 785 | 19 |
| MobileBertSquad [fp32] (TensorFlow) 3-thread,little-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 2065 | 2062 | 16 |
| MobileBertSquad [fp32] (TensorFlow) 1-thread,little-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 5611 | 5610 | 6 |
| MobileBertSquad [fp32] (TensorFlow) 3-thread,big-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 345 | 345 | 1 |
| MobileBertSquad [fp32] (TensorFlow) 1-thread,big-core,full-inference with IREE-Dylib @ Pixel-4 (CPU-ARMv8.2-A) | 729 | 728 | 4 |
| MobileBertSquad [fp32] (TensorFlow) little-core,full-inference with IREE-Dylib-Sync @ Pixel-4 (CPU-ARMv8.2-A) | 5640 | 5641 | 7 |
| MobileBertSquad [fp32] (TensorFlow) big-core,full-inference with IREE-Dylib-Sync @ Pixel-4 (CPU-ARMv8.2-A) | 900 | 899 | 2 |
| MobileBertSquad [fp32] (TensorFlow) full-inference with IREE-Vulkan @ Pixel-4 (GPU-Adreno-640) | 866 | 860 | 19 |