@myabiku
Last active April 19, 2018 03:44
mkdir tvsenado
cd tvsenado
# download YouTube's automatically generated subtitles [not perfect, but I still plan to find the official closed captions]
youtube-dl "https://www.youtube.com/playlist?list=PLLLnytnGoqiYTJ7yDXiq8bMtU8tKUmop6" -v --skip-download --sub-lang pt --write-auto-sub
# replace spaces with _ in the .vtt file names (renamed in place; quoting keeps names with spaces intact)
for file in *.vtt; do new="${file// /_}"; [ "$file" = "$new" ] || mv -- "$file" "$new"; done
mkdir -p srt
# convert the subtitles from vtt to srt
for i in *.vtt; do ffmpeg -i "$i" "srt/${i%.vtt}.srt"; done
cd srt
# show a ranking of the most frequently used words across all files
cat * | sed -e 's/[^[:alpha:]]/ /g' | tr '\n' " " | tr -s " " | tr " " '\n'| tr 'A-Z' 'a-z' | sort | uniq -c | sort -nr | nl | less
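
The ranking pipeline above can also be wrapped in a small function so it can be tried on any text before running it on the whole playlist. This is only a sketch: `word_ranking` is a hypothetical name that is not part of the original gist, and it assumes a UTF-8 locale so that `sed` treats accented Portuguese letters as `[:alpha:]`. Note that `tr 'A-Z' 'a-z'` lowercases ASCII letters only, so words starting with an accented capital may be counted separately.

```shell
#!/usr/bin/env bash
# word_ranking (hypothetical helper): read text on stdin and print
# "count word" lines, most frequent word first.
word_ranking() {
  sed -e 's/[^[:alpha:]]/ /g' |  # digits, punctuation and timestamps become spaces
    tr -s ' ' '\n' |             # one word per line, repeated separators squeezed
    tr 'A-Z' 'a-z' |             # lowercase (ASCII letters only)
    grep -v '^$' |               # drop any empty line left at the start
    sort | uniq -c | sort -nr    # count identical words, highest count first
}
```

Used from inside the srt directory, it gives the same numbered ranking as the one-liner: `cat *.srt | word_ranking | nl | less`.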