Approximate accuracy comparison between locally hosted Whisper models of varying sizes (tiny, small, medium, large), based on their typical Word Error Rates.

Q&A: How Much Accuracy Do You Lose When Using Smaller Whisper Models?

This is a summary of a prompt in which the user asked:

"If I run Whisper locally and choose between tiny, small, medium, and large models, how much accuracy difference should I expect between them?"

The goal was to understand how much less accurate smaller Whisper models are compared to the larger ones.


Accuracy Comparison (Approximate)

| Model Size | Relative Accuracy | Estimated Word Error Rate (WER) | Notes |
|------------|-------------------|---------------------------------|-------|
| Large | 100% (baseline) | ~4–5% (lowest) | Most accurate, but computationally heavy |
| Medium | ~99% | ~5–6% | Nearly as good as large, much faster |
| Small | ~97–98% | ~6–8% | Good speed/accuracy balance for most cases |
| Tiny | ~90–95% | ~10–15%+ | Fastest, but significantly less accurate |
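To make the WER column concrete: WER is the word-level edit distance (substitutions + deletions + insertions) between a reference transcript and the model's output, divided by the number of words in the reference. Below is a minimal, self-contained sketch of that calculation in Python; the function name and example sentences are illustrative, not part of any Whisper tooling.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Standard dynamic-programming edit distance, computed over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
    return d[len(ref)][len(hyp)] / len(ref)

# One substituted word in a ten-word reference -> 10% WER,
# roughly the ballpark the table gives for the tiny model.
ref = "the quick brown fox jumps over the lazy sleeping dog"
hyp = "the quick brown fox jumps over the lazy sleeping cat"
print(f"{wer(ref, hyp):.0%}")  # 10%
```

So a model with ~5% WER transcribes roughly 95 out of every 100 words correctly, which is why the relative-accuracy and WER columns in the table mirror each other.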

Summary of Accuracy Differences

  • Medium vs. Large: ~0.5–1% less accurate
  • Small vs. Medium: ~1–2% less accurate
  • Tiny vs. Small: ~5–8% less accurate

These figures are approximations and vary depending on:

  • Language
  • Audio clarity
  • Domain vocabulary
  • Noise levels in recordings

This gist was generated with the help of OpenAI based on the user's prompt and publicly available model benchmarks (as of May 2025).
