@rakslice
Created January 26, 2020 13:10
convert vtt transcript to plain text
#!/bin/bash
set -e
# Convert a VTT transcript on stdin to plain text on stdout.
# For use with e.g. youtube-dl --write-auto-sub --skip-download "$URL"
# Skip the first 3 lines, strip inline <c> tags and word timestamps, drop cue timing and blank lines, collapse repeats.
tail -n +4 \
  | sed -e 's/<c>//g' -e 's/<\/c>//g' -e 's/<[0-9][0-9]:[0-9][0-9]:[0-9][0-9]\.[0-9][0-9][0-9]>//g' \
        -e '/[0-9][0-9]:[0-9][0-9]:[0-9][0-9]\.[0-9][0-9][0-9] --> [0-9][0-9]:[0-9][0-9]:[0-9][0-9]\.[0-9][0-9][0-9] align.*/d' -e '/^[[:space:]]*$/d' \
  | uniq
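
To see what the filter does, here is a minimal sketch that wraps the same pipeline in a function and feeds it a small sample. The function name `vtt_to_text` and the sample cues are made up for illustration. The input layout (three header lines, `align` on each timing line, repeated caption lines) is an assumption about what youtube-dl auto-subtitles typically look like, not something the script itself guarantees.

```shell
#!/bin/bash
# Hypothetical wrapper around the gist's pipeline: VTT on stdin, plain text on stdout.
vtt_to_text() {
  tail -n +4 \
    | sed -e 's/<c>//g' -e 's/<\/c>//g' \
          -e 's/<[0-9][0-9]:[0-9][0-9]:[0-9][0-9]\.[0-9][0-9][0-9]>//g' \
          -e '/[0-9][0-9]:[0-9][0-9]:[0-9][0-9]\.[0-9][0-9][0-9] --> [0-9][0-9]:[0-9][0-9]:[0-9][0-9]\.[0-9][0-9][0-9] align.*/d' \
          -e '/^[[:space:]]*$/d' \
    | uniq
}

# Made-up sample in the assumed auto-subtitle layout. Each cue repeats the
# previous caption line, which is why the final uniq is needed.
vtt_to_text <<'EOF'
WEBVTT
Kind: captions
Language: en

00:00:00.000 --> 00:00:02.000 align:start position:0%
hello<00:00:00.500><c> world</c>

00:00:02.000 --> 00:00:04.000 align:start position:0%
hello world
this<00:00:02.500><c> is</c><00:00:03.000><c> a</c><00:00:03.500><c> test</c>
EOF
# prints:
# hello world
# this is a test
```

Note that `uniq` only collapses adjacent duplicates, so a caption that legitimately recurs later in the transcript is kept.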