Created October 15, 2022 17:52
Transcribe a YouTube video with Whisper and Artificial Intelligence


You need Python 3.9 installed, plus the Whisper and PyTube dependencies:

pip install git+
pip install pytube

You also need ffmpeg installed. How to install it depends on your operating system:

# Ubuntu
sudo apt update && sudo apt install ffmpeg
# Arch Linux
sudo pacman -S ffmpeg
# macOS with Homebrew (
brew install ffmpeg
# Windows with Chocolatey (
choco install ffmpeg
# Windows with Scoop (
scoop install ffmpeg
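
Since the script needs ffmpeg to be installed, it can help to confirm that the binary is reachable before starting a long transcription. A minimal sketch using only the standard library; the helper name `ffmpeg_available` is hypothetical and not part of the original script:

```python
import shutil


def ffmpeg_available(binary: str = "ffmpeg") -> bool:
    """Return True if the given executable can be found on PATH."""
    return shutil.which(binary) is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found")
    else:
        print("ffmpeg not found on PATH; install it with one of the commands above")
```

Calling this check at the top of the script would let it stop early with a clear message instead of failing once transcription starts.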

How to use the command line

You need to pass the URL of the YouTube video you want to transcribe:

python3 -h

python3 --video ""

# you can also choose the AI model that Whisper will use
# the larger the model, the longer the first download takes
python3 --video "" --model "large"

import argparse
import logging
import sys

import pytube
import whisper

parser = argparse.ArgumentParser(description='Transcribe a YouTube video using Whisper')
parser.add_argument("--video", help="Pass the YouTube url to transcribe")
parser.add_argument("--model", help="Indicate the Whisper model to download", default="small")
args = parser.parse_args()

# Show INFO-level progress messages with a timestamp
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(message)s",
)

# Stop early if no video URL was given
if not args.video:
    logging.error("Please pass a YouTube url to transcribe")
    sys.exit(1)

logging.info("Downloading Whisper model")
model = whisper.load_model(args.model)

logging.info("Downloading the video from YouTube...")
youtubeVideo = pytube.YouTube(args.video)

logging.info("Get only the audio from the video")
audio = youtubeVideo.streams.get_audio_only()
audio.download(filename='tmp.mp4')

logging.info("Transcribe the audio")
result = model.transcribe('tmp.mp4')
# Print the transcribed text
print(result["text"])
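
The dictionary returned by `model.transcribe` also carries a `segments` list whose entries include `start` and `end` (in seconds) and `text`. As a sketch, assuming that shape, the hypothetical helpers `format_timestamp` and `format_segments` below (not part of the original script) turn the result into timestamped lines:

```python
def format_timestamp(seconds: float) -> str:
    """Format a number of seconds as HH:MM:SS (fractions are truncated)."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def format_segments(result: dict) -> list[str]:
    """Build one '[start -> end] text' line per transcription segment."""
    lines = []
    for segment in result.get("segments", []):
        start = format_timestamp(segment["start"])
        end = format_timestamp(segment["end"])
        lines.append(f"[{start} -> {end}] {segment['text'].strip()}")
    return lines


if __name__ == "__main__":
    # Hand-written sample data standing in for a real transcription result
    sample = {
        "segments": [
            {"start": 0.0, "end": 4.2, "text": " Hello and welcome."},
            {"start": 4.2, "end": 9.8, "text": " Today we transcribe a video."},
        ]
    }
    print("\n".join(format_segments(sample)))
```

In the script, `format_segments(result)` could be printed instead of, or alongside, the plain text.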

MLFDev01 commented May 1, 2023

KeyError                                  Traceback (most recent call last)
<ipython-input-2-62cec1f2ee37> in <cell line: 37>()
     36 logging.info("Get only the audio from the video")
---> 37 audio = youtubeVideo.streams.get_audio_only()

2 frames
/usr/local/lib/python3.10/dist-packages/pytube/ in streams(self)
    294         """
    295         self.check_availability()
--> 296         return StreamQuery(self.fmt_streams)
    298     @property

/usr/local/lib/python3.10/dist-packages/pytube/ in fmt_streams(self)
    174         self._fmt_streams = []
--> 176         stream_manifest = extract.apply_descrambler(self.streaming_data)
    178         # If the cached js doesn't work, try fetching a new js file

/usr/local/lib/python3.10/dist-packages/pytube/ in streaming_data(self)
    159         else:
    160             self.bypass_age_gate()
--> 161             return self.vid_info['streamingData']
    163     @property

KeyError: 'streamingData'
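
The traceback shows the `KeyError` surfacing from the `youtubeVideo.streams` access inside pytube, before any audio is downloaded. The following does not fix the underlying cause, but as a sketch the script could catch the error and stop with a readable message instead of a stack trace. `get_audio_stream` is a hypothetical helper, not part of pytube or the original script:

```python
import logging
from typing import Any, Optional


def get_audio_stream(video: Any) -> Optional[Any]:
    """Return the audio-only stream, or None if the stream data cannot be read.

    `video` is expected to behave like a pytube.YouTube object, i.e. to expose
    `.streams.get_audio_only()`.
    """
    try:
        return video.streams.get_audio_only()
    except KeyError as error:
        # The traceback above fails with KeyError: 'streamingData'
        logging.error("Could not read stream data from YouTube (missing key %s)", error)
        return None
```

In the script this would replace the direct `youtubeVideo.streams.get_audio_only()` call, followed by a check for `None` before downloading.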