@reasonableperson
Last active April 18, 2024 01:44
generate running transcript for web streams
#!/bin/bash
# whisper-stream.sh
#
# Take a URL supported by yt-dlp, write 30-second segments to the current
# directory, each named by the Unix timestamp at which it was opened, and
# transcribe each segment using Whisper.
#
# example: TZ=Australia/Canberra ./whisper-stream.sh "https://..."
#
# The time displayed is the time when ffmpeg first opens the segment for
# writing (not when the 30 seconds are up), so adding the offset printed by
# Whisper gives the approximate time when your computer received the
# broadcast words. Set the TZ environment variable to the timezone where the
# video was recorded to estimate when the words were spoken; that estimate
# does not account for broadcast delay.
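#
# A minimal sketch of the arithmetic described above; it is not called by the
# pipeline below. segment_time is a hypothetical helper, not part of yt-dlp,
# ffmpeg or Whisper. It assumes GNU date (as the rest of this script does) and
# offsets in the MM:SS.mmm form that Whisper prints for clips under an hour.
#
# example: TZ=Australia/Canberra segment_time 1700000000 00:07.000
segment_time() {
  local start min sec
  start=$1          # segment filename without .mp4 (a Unix timestamp)
  min=${2%%:*}      # minutes field of the offset
  sec=${2##*:}      # seconds field, possibly followed by .mmm
  sec=${sec%%.*}    # drop the milliseconds
  # Strip one leading zero so that 08 and 09 are not parsed as invalid octal.
  date -d "@$(( start + ${min#0} * 60 + ${sec#0} ))"
}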
yt-dlp "$1" -o - 2>/dev/null |
ffmpeg -v verbose -i - -f segment -segment_time 30 -strftime 1 %s.mp4 2>&1 |
grep -Po --line-buffered "Opening '\K\d+" |
xargs -I _ bash -c 'echo; date -d @_; inotifywait -qqe CLOSE _.mp4; whisper --model medium.en _.mp4'