Lots of projects need to test Android apps, and use GitHub Actions infrastructure to do so.
This document shows current timings for a sample workload, to inform which emulators are a good match for testing.
- We use the AnkiDroid androidTests
  - they take long enough to execute that test time and emulator cold-start time are balanced, so neither dominates
  - they are open source, feel free to inspect
- We use the macOS runners, specifically macos-11 (though we reference it as macos-latest at the moment)
  - the macOS runner family is the only family that enables virtual machine hardware acceleration, a hard requirement
- Build time of the app under test is excluded from the timings, to focus on emulator performance only
- We iterate a few times for each emulator style, as there can be quite a bit of variance
There are a wide variety of emulators available from Google's Android project. We will focus on:
- API 25 minimum: 21 would be better, but I have problems with 21-24, and I want to keep matrix job expansion within the 256-job limit
- API 32 maximum: this is the current maximum Android API available on stable channel
- Arches x86 and x86_64: they have different performance characteristics, and different per-system-image failure modes
- Target default and google_apis: they have different performance characteristics, and some workloads require Google APIs
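To make the matrix size concrete, here is a minimal sketch that enumerates the combinations listed above. The `buildMatrix` helper and its `iterations` parameter are hypothetical names for illustration (the real runs iterate "a few times", the exact count is not stated here); the 256 figure is the GitHub Actions cap on jobs generated by one matrix.

```javascript
// Sketch only: enumerate the emulator matrix described above.
// buildMatrix and the iteration count are illustrative assumptions.
const MATRIX_JOB_LIMIT = 256; // GitHub Actions cap on jobs per matrix

const arches = ["x86", "x86_64"];
const targets = ["default", "google_apis"];

function buildMatrix(minApi, maxApi, iterations) {
  const jobs = [];
  for (let api = minApi; api <= maxApi; api++) {
    for (const arch of arches) {
      for (const target of targets) {
        for (let iteration = 1; iteration <= iterations; iteration++) {
          jobs.push({ api, arch, target, iteration });
        }
      }
    }
  }
  return jobs;
}

// API 25-32 gives 8 levels x 2 arches x 2 targets = 32 combinations.
const combos = buildMatrix(25, 32, 1).length;
// How many repeat iterations fit before the matrix exceeds the cap.
const maxIterations = Math.floor(MATRIX_JOB_LIMIT / combos);
console.log(`${combos} combinations, up to ${maxIterations} iterations each`);
```

Starting at API 21 instead would give 48 combinations, which leaves room for fewer iterations per combination under the same cap.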
No attempt has been made to test the matrix across different RAM or disk sizes, but I would welcome further work from anyone interested in doing it. I imagine a smaller disk, within reason, would speed up emulator creation, and more RAM would speed up emulator execution up to a point.
Performance hypothesis:
- the x86 arch, on a default (non google_apis) target, at an API level somewhere in the high 20s (perhaps API28), will be the fastest emulator.
- older APIs will have complete failures for various bitrot-related reasons that are low-value to diagnose and are best ignored
I have attempted to separate performance into its major components:
- "create+cold boot time" - time taken to install/create/start emulator. Emulator likely still performing background first boot tasks
- "test execution time" - time taken to execute tests. May be affected by background first-boot tasks if any, running concurrently
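As a rough sketch of how the two components above can be split, assume each job yields three timestamps: when emulator setup began, when the tests began, and when the tests ended. The field names and the example timestamps below are made up for illustration and are not taken from the real logs or from `analyze_emulator_performance.js`.

```javascript
// Sketch only: split one job's wall-clock time into the two components.
// Field names (emulatorStart, testStart, testEnd) are hypothetical.
function splitTimings(job) {
  const emulatorStart = Date.parse(job.emulatorStart);
  const testStart = Date.parse(job.testStart);
  const testEnd = Date.parse(job.testEnd);
  return {
    // install/create/start emulator, up to the moment tests begin
    createAndBootSeconds: (testStart - emulatorStart) / 1000,
    // test execution only; may overlap background first-boot tasks
    testSeconds: (testEnd - testStart) / 1000,
  };
}

// Placeholder timestamps, not measured results.
const exampleTimings = splitTimings({
  emulatorStart: "2022-01-01T10:00:00Z",
  testStart: "2022-01-01T10:05:00Z",
  testEnd: "2022-01-01T10:12:30Z",
});
console.log(exampleTimings);
```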
API29 appears stable and fast (a great combo) and is my choice for single API runs.
It looks like several APIs are flaky. API32 is useful to test the newest version but is incredibly slow. Lower APIs are also flaky, unfortunately, so it's tough to test the low end of a compatibility bracket (API21 etc). Testing any of the other APIs should be done in a retry loop, because those APIs suffer frequent startup failures.
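The retry loop can be as simple as the sketch below. It is a generic wrapper, not what the AnkiDroid workflow actually runs: `withRetries` and `simulatedEmulatorStart` are hypothetical names, and the simulated start stands in for whatever step launches the emulator and runs the tests.

```javascript
// Sketch only: retry a flaky step a bounded number of times.
function withRetries(attemptFn, maxAttempts) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // Return on the first success, reporting how many tries it took.
      return { result: attemptFn(attempt), attempts: attempt };
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw lastError; // every attempt failed
}

// Stand-in for a flaky emulator start: fails on the first two attempts.
function simulatedEmulatorStart(attempt) {
  if (attempt < 3) {
    throw new Error(`emulator failed to start (attempt ${attempt})`);
  }
  return "booted";
}

const outcome = withRetries(simulatedEmulatorStart, 5);
console.log(`${outcome.result} after ${outcome.attempts} attempts`);
```

Keeping the attempt count bounded matters here: an API level that never boots should fail the job rather than burn runner minutes indefinitely.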
- Access the GitHub REST API to fetch the test run
fetch_workflow_jobs.sh <workflow run id>
- Parse out the major component times from the logs and format them as CSV for analysis
node analyze_emulator_performance.js emulator_perf_results.json
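For readers who want to adapt the analysis step, the CSV formatting can be sketched as below. This is not the code in `analyze_emulator_performance.js`; the column names and the sample row are assumptions for illustration, and the numbers are placeholders rather than measured results.

```javascript
// Sketch only: format per-job timing records as CSV.
// Column names are hypothetical, not the real script's output.
const CSV_COLUMNS = ["api", "arch", "target", "createAndBootSeconds", "testSeconds"];

function toCsv(rows) {
  const lines = [CSV_COLUMNS.join(",")]; // header row
  for (const row of rows) {
    lines.push(CSV_COLUMNS.map((column) => String(row[column])).join(","));
  }
  return lines.join("\n");
}

// Placeholder record, not a measured result.
const sampleCsv = toCsv([
  { api: 29, arch: "x86", target: "default", createAndBootSeconds: 300, testSeconds: 450 },
]);
console.log(sampleCsv);
```

The values here contain no commas or quotes, so no escaping is done; real log-derived fields may need it.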
- Test different RAM sizes. Hypothesis: More RAM may improve first boot + test velocity, until virtual runner memory is fully utilized
- Test different disk sizes. Hypothesis: Smaller disk improves install velocity, until it is too small to contain the app+system
- Test more targets. Hypothesis: play store images will be really slow, but there are also watch images people may be interested in
- Examine AVD snapshot size. Hypothesis: RAM size or some other factor may affect snapshot size, which affects caching
Results sorted by test time (increasing), filtered for API/Arch/Target combos with zero failures.