@jooray
Last active February 21, 2023 22:12
Use whisper speech to text on an audio or video file regardless of codec, autodetect language
#!/bin/bash
# Usage: whisper-file FILE [LANGUAGE]
# If LANGUAGE is empty, it is set to "auto"
# General settings (paths) for whisper.cpp
# Note - this uses whisper.cpp, not official whisper. Get it at
# https://github.com/ggerganov/whisper.cpp
WHISPER_MODEL=/Users/test/whisper.cpp/models/ggml-large.bin
WHISPER_BIN=/Users/test/whisper.cpp/main
# Default to language autodetection when no second argument is given
RECOGNITION_LANGUAGE="${2:-auto}"
WAV_RECODED_FILENAME="${1}-whisper.wav"
WAV_VTT_FILENAME="${WAV_RECODED_FILENAME}.vtt"
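# Refuse to overwrite an existing transcript
if [ -e "${WAV_VTT_FILENAME}" ]
then
  echo "${WAV_VTT_FILENAME} already exists"
  exit 1
fi
# Recode to 16 kHz mono 16-bit PCM WAV unless a recoded file is already present
if [ -e "${WAV_RECODED_FILENAME}" ]
then
  echo "${WAV_RECODED_FILENAME} already exists, using it."
else
  # Stop here if the conversion fails, so whisper is not run on a missing or partial file
  ffmpeg -i "${1}" -vn -ar 16000 -ac 1 -c:a pcm_s16le "${WAV_RECODED_FILENAME}" || exit 1
fi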
# Transcribe to .txt and .vtt; remove the temporary WAV only if whisper succeeds
"${WHISPER_BIN}" -m "${WHISPER_MODEL}" -f "${WAV_RECODED_FILENAME}" -otxt -ovtt -pp -l "${RECOGNITION_LANGUAGE}" && rm -f "${WAV_RECODED_FILENAME}"
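
The script handles one file per call and exits with an error when the .vtt transcript already exists. Below is a minimal sketch of a batch wrapper around it. It assumes the script above is saved on the PATH as whisper-file (the name used in its usage comment). The function names vtt_name_for and transcribe_all and the WHISPER_LANG environment variable are hypothetical, introduced only for this illustration.

```shell
#!/bin/bash
# Hypothetical batch wrapper; assumes the script above is on PATH as "whisper-file".

# Print the transcript name the script above produces for an input file,
# mirroring its WAV_RECODED_FILENAME / WAV_VTT_FILENAME logic.
vtt_name_for() {
  printf '%s\n' "${1}-whisper.wav.vtt"
}

# Transcribe each file given as an argument, skipping finished ones.
transcribe_all() {
  # WHISPER_LANG is a hypothetical variable; default is autodetection.
  lang="${WHISPER_LANG:-auto}"
  for f in "$@"; do
    if [ -e "$(vtt_name_for "$f")" ]; then
      echo "skipping ${f}: transcript already exists"
      continue
    fi
    # Report a failure but keep going with the remaining files.
    whisper-file "$f" "$lang" || echo "failed: ${f}" >&2
  done
}

transcribe_all "$@"
```

Saved as, say, whisper-batch (also a hypothetical name), it could be run as `WHISPER_LANG=en whisper-batch *.mp4`. Checking for the transcript first lets the run be repeated after an interruption without tripping the "already exists" exit in the script above.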