This is a comparison between whisper.cpp and faster-whisper. The faster-whisper README already includes some benchmarks, but I wanted to test it myself. For whisper.cpp, I just ran it manually. For faster-whisper, I wrote this small script.
- https://github.com/guillaumekln/faster-whisper/
- https://github.com/ggerganov/whisper.cpp
- quantization with faster-whisper: https://opennmt.net/CTranslate2/quantization.html
- quantization with whisper.cpp: https://github.com/ggerganov/whisper.cpp#quantization
- whisper.cpp command: `./main -bs 5 -p 2 -f steve2.wav -m models/ggml-small.en.bin`
- This uses 8 CPU threads in total on my 12-core machine.
- `-bs 2` actually performs better: about 10s faster than `-bs 5`.
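
To make the beam-size comparison above repeatable, here is a minimal timing sketch. It is not the script used for these notes. The helper names (`whisper_cpp_cmd`, `timed`, `run_whisper_cpp`, `run_faster_whisper`, `main`) are hypothetical, and the faster-whisper settings (`small.en`, `compute_type="int8"`, `cpu_threads=8`) are assumptions chosen to roughly mirror the whisper.cpp run, not measured configurations.

```python
import subprocess
import time


def whisper_cpp_cmd(beam_size, processors=2, audio="steve2.wav",
                    model="models/ggml-small.en.bin"):
    """Build the whisper.cpp command from the notes for a given beam size."""
    return ["./main", "-bs", str(beam_size), "-p", str(processors),
            "-f", audio, "-m", model]


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def run_whisper_cpp(beam_size):
    """Run whisper.cpp once; assumes ./main exists in the working directory."""
    return subprocess.run(whisper_cpp_cmd(beam_size), check=True,
                          capture_output=True, text=True)


def run_faster_whisper(beam_size, audio="steve2.wav", cpu_threads=8):
    """Transcribe with faster-whisper; model size and int8 are assumptions."""
    # Imported lazily so the rest of the sketch works without the package.
    from faster_whisper import WhisperModel

    model = WhisperModel("small.en", device="cpu", compute_type="int8",
                         cpu_threads=cpu_threads)
    segments, _info = model.transcribe(audio, beam_size=beam_size)
    # transcribe() returns a lazy generator; consuming it does the actual work.
    return " ".join(segment.text for segment in segments)


def main():
    """Time both tools at the two beam sizes mentioned in the notes."""
    for bs in (2, 5):
        _, seconds = timed(run_whisper_cpp, bs)
        print(f"whisper.cpp -bs {bs}: {seconds:.1f}s")
        _, seconds = timed(run_faster_whisper, bs)
        print(f"faster-whisper beam_size={bs}: {seconds:.1f}s")
```

Calling `main()` from the whisper.cpp checkout would run each configuration once. Both timings include model loading, so for a fairer comparison it may be worth repeating each run a few times and comparing the best or median result.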