@danielrosehill
Last active November 27, 2025 13:02
Whisper ACFT Model Size Reference for FUTO Voice Input (And My Performance Notes) - Whisper CPP

Whisper GGML Model Sizes for FUTO Voice Input

Reference for choosing Whisper model sizes when using custom fine-tuned models with FUTO Voice Input on Android.

Model Size Comparison (GGML f16)

Model      Parameters   Approx. GGML size (f16)
Tiny       39M          ~75 MB
Base       74M          ~142 MB
Small      244M         ~465 MB
Medium     769M         ~1.5 GB
Large-v3   1.5B         ~3 GB
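
The sizes above track a simple rule of thumb: an f16 file stores roughly 2 bytes per parameter. The sketch below reproduces the table under that assumption. It ignores file headers, vocabulary data, and any tensors kept at a different precision, so treat the output as an estimate only. The function and dictionary names are illustrative, and 1550M is the commonly cited parameter count that the table rounds to 1.5B.

```python
# Rough f16 size estimate: about 2 bytes per parameter.
# Assumption: every weight is stored as f16 and file overhead is negligible.
PARAMS_MILLIONS = {
    "Tiny": 39,
    "Base": 74,
    "Small": 244,
    "Medium": 769,
    "Large-v3": 1550,  # shown as 1.5B in the table
}

BYTES_PER_F16 = 2


def estimate_f16_size_mib(params_millions: float) -> float:
    """Approximate size in MiB of a model stored entirely as f16."""
    total_bytes = params_millions * 1_000_000 * BYTES_PER_F16
    return total_bytes / (1024 * 1024)


if __name__ == "__main__":
    for name, millions in PARAMS_MILLIONS.items():
        size = estimate_f16_size_mib(millions)
        print(f"{name:<9} {millions:>5}M  ~{size:,.0f} MiB")
```

The same arithmetic is a quick sanity check on a downloaded fine-tune: a file far smaller than the estimate has probably been quantized rather than stored as f16.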

Note:

FUTO displays the stock models by parameter count (39, 74, 244) rather than by name. These correspond to Tiny, Base, and Small respectively.

One Datapoint

A single data point, to give a sense of what kind of on-device transcription is viable on a given class of hardware.

My current Android phone is a OnePlus Nord 3 5G:

  • Operating System: OxygenOS 13.1 based on Android™ 13
  • CPU: MediaTek Dimensity 9000
  • GPU: Arm® Mali G710 MC10
  • RAM: 8GB/16GB LPDDR5X
  • Storage: 128GB/256GB UFS 3.1
  • Available configurations: 8GB+128GB / 16GB+256GB
  • Vibration: Haptic motor

On this handset:

  • Tiny (39M) is very fast but insufficiently accurate
  • Base (74M) provides the best speed/accuracy trade-off
  • Small (244M) works but is not really fast enough for standard use

Note: I have Medium and Large available as fine-tunes (they are not provided as stock models), but not as ACFT fine-tunes, so I can't assess whether they would work. Given that Small already pushes the performance envelope, there is no rational reason to expect that they would.

Recommendations for Mobile (My Hardware)

Based on my testing with FUTO Voice Input:

  • Tiny/Base: Practical for real-time mobile use, with responsive inference
  • Small: Borderline; may be sluggish on older phones but workable on newer devices
  • Medium/Large: Untested on my device (I have no ACFT fine-tunes of them), but almost certainly too heavy for comfortable mobile inference

Note

These recommendations are based on my specific hardware. Your mileage may vary:

  • Newer/flagship phones may handle Medium reasonably well
  • Devices with dedicated NPUs or better thermal management could support larger models
  • Battery life impact increases significantly with larger models

Test on your own device to find the right balance between accuracy and responsiveness.
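
One way to put a number on "responsiveness" when testing is the real-time factor: the time the model takes to transcribe a clip divided by the clip's length. The sketch below is a minimal helper for that arithmetic. It assumes you time the dictation yourself, for example with a stopwatch, because this is not something FUTO Voice Input reports. The function name is illustrative.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration.

    Values below 1.0 mean transcription finishes faster than the
    speech took to record; lower is more responsive.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    if processing_seconds < 0:
        raise ValueError("processing_seconds must not be negative")
    return processing_seconds / audio_seconds
```

For instance, a 10-second dictation that takes 4 seconds to transcribe has a real-time factor of 0.4. Comparing this figure across model sizes on the same phrase gives a like-for-like view of the speed side of the trade-off.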


Generated by Claude Code. Please validate this information for your specific use case.
