@firexcy
Created January 28, 2023 12:04
Transcribe recordings with the C++ port of OpenAI Whisper
# Clone the project
git clone https://github.com/ggerganov/whisper.cpp && cd whisper.cpp
# Download at least one model: "tiny", "base", "small", "medium", "large", etc.
# "medium" is usually sufficient for recordings in Chinese.
bash ./models/download-ggml-model.sh medium
# Compile the program
make
# Convert the input audio to a 16 kHz, mono, 16-bit PCM WAV file, since the
# current version runs only on 16-bit WAV files. Requires ffmpeg.
ffmpeg -i input.mp3 -ar 16000 -ac 1 -c:a pcm_s16le test.wav
# Transcribe with the "medium" model (-m; substitute another model if desired)
# in Chinese (-l; for the full list of language codes, see
# https://github.com/openai/whisper/blob/main/whisper/tokenizer.py#L10).
./main -l zh -m models/ggml-medium.bin -f test.wav
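
The convert-then-transcribe steps above can be wrapped in a few bash functions to process several recordings in one go. This is only a rough sketch: the function names and the MODEL and LANG_CODE environment variables are made up for illustration and are not part of whisper.cpp. It assumes it is run from the whisper.cpp directory, after `make`, with ffmpeg on the PATH.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of whisper.cpp.

# Derive the WAV file name for an input recording,
# e.g. recordings/interview.mp3 becomes interview.wav.
wav_name() {
    local base
    base=$(basename "$1")
    printf '%s.wav\n' "${base%.*}"
}

# Convert one recording to 16 kHz mono 16-bit PCM WAV, then transcribe it.
# MODEL and LANG_CODE are assumed names; they default to the values used above.
transcribe_one() {
    local input=$1
    local wav
    wav="${TMPDIR:-/tmp}/$(wav_name "$input")"
    ffmpeg -y -loglevel error -i "$input" -ar 16000 -ac 1 -c:a pcm_s16le "$wav" &&
        ./main -l "${LANG_CODE:-zh}" -m "${MODEL:-models/ggml-medium.bin}" -f "$wav"
}

# Transcribe every file given as an argument; report failures and continue.
transcribe_all() {
    local f
    for f in "$@"; do
        transcribe_one "$f" || echo "failed: $f" >&2
    done
}
```

After sourcing the functions, something like `transcribe_all recordings/*.mp3` would convert and transcribe each file in turn. The intermediate WAV files are written to the temporary directory so the originals stay untouched.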