@reitzig
Last active March 13, 2024 02:54
Run whisper.cpp as Container
# Build stage: compile whisper.cpp and download the requested model.
FROM debian:11 AS build
RUN apt-get update \
    && apt-get install -y libsdl2-dev alsa-utils g++ make wget
RUN mkdir /whisper && \
    wget -q https://github.com/ggerganov/whisper.cpp/tarball/master -O - | \
    tar -xz -C /whisper --strip-components 1
WORKDIR /whisper
# Model name, passed via --build-arg model=...
ARG model
RUN bash ./models/download-ggml-model.sh "${model}"
RUN make main stream

# Runtime stage: the binaries, the model, and the SDL2/ALSA packages.
FROM debian:11 AS whisper
RUN apt-get update \
    && apt-get install -y libsdl2-dev alsa-utils \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /root
# ARG has to be declared again in every stage that uses it.
ARG model
RUN mkdir /root/models
COPY --from=build "/whisper/models/ggml-${model}.bin" "/root/models/ggml-${model}.bin"
COPY --from=build /whisper/main /usr/local/bin/whisper
COPY --from=build /whisper/stream /usr/local/bin/stream
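#!/usr/bin/env bash
set -eu

MODEL="base"
# Not named LANG: that would override the locale variable for ffmpeg and docker.
WHISPER_LANG="de"

audio_file="${1}"
wav_file="${audio_file%.*}.wav"
media_dir="$(realpath "$(dirname "${audio_file}")")"
script_dir="$(realpath "$(dirname "${0}")")"

# Build the image for the selected model (cheap when the build cache is warm).
docker build -t "whisper-${MODEL}" --build-arg "model=${MODEL}" "${script_dir}"

# Convert the input to 16 kHz WAV for whisper.cpp.
ffmpeg -i "${audio_file}" -ar 16000 "${wav_file}"

# Run the image built above; the second "whisper" is the binary inside it.
docker run --rm -it \
    -v "${media_dir}":/media \
    "whisper-${MODEL}" \
    whisper \
    --model "/root/models/ggml-${MODEL}.bin" \
    --language "${WHISPER_LANG}" \
    -t 2 \
    --output-txt -nt \
    -f "/media/$(basename "${wav_file}")"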

geimist commented Mar 31, 2023

Nice 👍
I would not call docker build on every run, but check beforehand whether the image already exists.

if ! docker images | grep -q "whisper-${MODEL}"; then
    docker build -t "whisper-${MODEL}" --build-arg "model=${MODEL}" "${script_dir}"
fi
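
A possible variant of this check, as a sketch rather than anything from the gist: grepping the output of `docker images` matches substrings, so `whisper-base` would also match an image tagged `whisper-base-en`. `docker image inspect` exits non-zero when no image with exactly that name exists, which makes it usable as a condition. The helper name `ensure_image` and its argument order are made up for illustration.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper (not part of the gist): build the image only if no
# image with exactly this tag exists locally.
# $1: image tag, $2: model name, $3: directory containing the Dockerfile
ensure_image() {
  local tag="${1}" model="${2}" context="${3}"
  # `docker image inspect` exits non-zero if the image is unknown.
  if ! docker image inspect "${tag}" > /dev/null 2>&1; then
    docker build -t "${tag}" --build-arg "model=${model}" "${context}"
  fi
}

# Usage, with the variables from the script above:
# ensure_image "whisper-${MODEL}" "${MODEL}" "${script_dir}"
```

The trade-off is that a check like this skips the build even when the Dockerfile has changed, so the image then has to be removed or rebuilt by hand.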

reitzig (Author) commented Mar 31, 2023

Thanks to build caches, there's basically no overhead in running it every time, and you never have to remember to rebuild by hand after changing something. YMMV, though.
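
To see how much the cached build actually costs on a given machine, something like the following could be used. It is only a sketch: `timed_build` is a made-up helper, and it measures whole seconds with `date +%s`, so a fully cached build will typically show up as 0s or 1s.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper (not part of the gist): run `docker build` quietly and
# print how many whole seconds it took.
# $1: image tag, $2: model name, $3: directory containing the Dockerfile
timed_build() {
  local tag="${1}" model="${2}" context="${3}"
  local start end
  start="$(date +%s)"
  docker build -q -t "${tag}" --build-arg "model=${model}" "${context}" > /dev/null
  end="$(date +%s)"
  echo "build of ${tag} took $(( end - start ))s"
}

# Usage: the second call should hit the cache if nothing changed in between.
# timed_build "whisper-base" "base" .
# timed_build "whisper-base" "base" .
```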
