@markshust
Last active December 17, 2023 09:49
Python script that uses Whisper to create SRT files for every MP4 in the current directory that doesn't already have one.
import os
import whisper
from whisper.utils import get_writer

# Get the current directory path
directory = os.getcwd()

# Loop through all the files in the directory
for file in sorted(os.listdir(directory)):

    # Check if the file has the mp4 extension
    if file.endswith('.mp4'):

        # Get the name of the file with the extension
        name = os.path.splitext(file)[0] + '.mp4'
        srt_file = directory + '/' + name + '.srt'

        # Check if there is a related srt file
        if not os.path.isfile(srt_file):

            # Create srt for all mp4 files that need one
            print('Processing ' + file + '...')
            model = whisper.load_model('large')
            result = model.transcribe(file, fp16=False)
            srt_writer = get_writer('srt', './')
            srt_writer(result, file)
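
The "does this video already have subtitles?" check can also be sketched with `pathlib`. This is only a minimal sketch, not the author's method: the helper name `videos_missing_srt` is hypothetical, and it assumes a subtitle file may be named either `clip.srt` or `clip.mp4.srt`, since the script checks for the latter and the name the writer actually produces is not confirmed here.

```python
from pathlib import Path
from typing import List


def videos_missing_srt(directory: Path) -> List[Path]:
    """Return the .mp4 files in `directory` that have no subtitle file next to them."""
    missing = []
    for video in sorted(directory.glob("*.mp4")):
        if not video.is_file():
            continue
        # Assumption: treat both "clip.srt" and "clip.mp4.srt" as existing subtitles.
        candidates = (
            video.with_suffix(".srt"),
            video.with_name(video.name + ".srt"),
        )
        if not any(candidate.is_file() for candidate in candidates):
            missing.append(video)
    return missing


if __name__ == "__main__":
    for video in videos_missing_srt(Path.cwd()):
        print(f"Needs subtitles: {video.name}")
```

Collecting the list first makes it easy to see what would be processed before any transcription starts.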

sebington commented Dec 17, 2023

Hi, I tried to run your script and got an "out of memory" error. I think this happens because the line "model = whisper.load_model('large')" is inside the loop, so the program loads the model again for every file it processes. I fixed it by moving that line outside the loop, as follows:

import os
import whisper
from whisper.utils import get_writer

model = whisper.load_model('large')

# Get the current directory path
directory = os.getcwd()

# Loop through all the files in the directory
for file in sorted(os.listdir(directory)):
  
  # Check if the file has the mp4 extension
  if file.endswith('.mp4'):
    
    # Get the name of the file with the extension
    name = os.path.splitext(file)[0] + '.mp4'
    srt_file = directory + '/' + name + '.srt'
    
    # Check if there is a related srt file
    if not os.path.isfile(srt_file):
      
      # Create srt for all mp4 files that need one
      print('Processing ' + file + '...')
      result = model.transcribe(file, fp16=False)
      srt_writer = get_writer('srt', './')
      srt_writer(result, file)
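
A possible variation on the same idea, as a sketch only: load the model at most once, and only when there is at least one file to process. The function `transcribe_all` and its `load_model` / `write_srt` parameters are hypothetical names introduced for illustration; the Whisper calls themselves are passed in as callables so the loading logic stays separate from the library.

```python
from pathlib import Path
from typing import Any, Callable, Iterable, List


def transcribe_all(
    videos: Iterable[Path],
    load_model: Callable[[], Any],
    write_srt: Callable[[Any, Path], None],
) -> List[Path]:
    """Load the model at most once, then reuse it for every video."""
    model = None
    done = []
    for video in videos:
        if model is None:
            # Expensive step: runs only when the first video is reached.
            model = load_model()
        write_srt(model, video)
        done.append(video)
    return done


# Possible wiring with the calls used in the script above (not executed here):
#   import whisper
#   from whisper.utils import get_writer
#
#   def write_srt(model, video):
#       result = model.transcribe(str(video), fp16=False)
#       get_writer('srt', './')(result, str(video))
#
#   transcribe_all(videos, lambda: whisper.load_model('large'), write_srt)
```

With this shape, a directory where every video already has subtitles never loads the model at all.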
