@pavelbinar
Forked from bmaeser/subtitle-extract.txt
Last active December 24, 2023 12:10
Extract subtitles from .mkv files on Mac OS X
@victornpb

Extract subtitles from MKV files in all subdirectories

extract_subtitles.py

from os import walk
import subprocess
import re
from os import path

tool_path = "/Applications/MKVToolNix-51.0.0.app/Contents/MacOS/"
dir = "./"

def find_files(dir, ext):
    file_list = []
    for (dirpath, dirnames, filenames) in walk(dir):
        for filename in filenames:
            if filename.endswith(ext) and not filename.startswith('._'):
                file_list.append(path.join(dirpath, filename))
    file_list.sort()
    return file_list

for file in find_files(dir, ".mkv"):
    # Strip only the trailing extension; str.replace would also remove ".mkv" elsewhere in the path
    basename = path.splitext(file)[0]

    if path.exists(basename+".srt") or path.exists(basename+".ssa"):
        print("Subtitles already exist, skipping:", basename)
        continue

    # List the tracks in the file
    result = subprocess.run([tool_path + "mkvmerge", "-i", file], stdout=subprocess.PIPE, check=True)
    # Decode the bytes output so the regexes run on real text rather than a bytes repr
    track_info = result.stdout.decode("utf-8", errors="replace")

    # SubRip .srt (only the first matching track is used)
    srt_track = re.search(r'Track ID (\d+): subtitles \(SubRip/SRT\)', track_info)
    if srt_track:
        srt_track = "{}:{}.{}".format(srt_track.group(1), basename, "srt")

    # SubStation Alpha .ssa (only the first matching track is used)
    ssa_track = re.search(r'Track ID (\d+): subtitles \(SubStationAlpha\)', track_info)
    if ssa_track:
        ssa_track = "{}:{}.{}".format(ssa_track.group(1), basename, "ssa")

    if not srt_track and not ssa_track:
        print('No SRT or SSA track found!', file, track_info)
        continue


    # Extract whichever subtitle tracks were found (filter drops the None entries)
    subprocess.run(list(filter(None, [tool_path+"mkvextract", "tracks", file, srt_track, ssa_track])), check=True)

print("Finished!")

Then run it in the terminal:

python3 extract_subtitles.py
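
The script above picks up only the first SRT and the first SSA track in each file. A minimal sketch of how every subtitle track could be collected instead is shown here. The names `parse_subtitle_tracks`, `build_extract_specs`, `EXTENSIONS` and `SAMPLE_OUTPUT` are hypothetical, the sample text is hand-written to resemble `mkvmerge -i` output, and the `basename.<track id>.<ext>` naming scheme is an assumption, not part of the original gist.

```python
import re

# Extensions for the two codecs the script above already handles.
EXTENSIONS = {"SubRip/SRT": "srt", "SubStationAlpha": "ssa"}

# Hand-written sample in the shape of `mkvmerge -i` output (illustrative only).
SAMPLE_OUTPUT = """\
File 'movie.mkv': container: Matroska
Track ID 0: video (AVC/H.264/MPEG-4p10)
Track ID 1: audio (AAC)
Track ID 2: subtitles (SubRip/SRT)
Track ID 3: subtitles (SubRip/SRT)
Track ID 4: subtitles (HDMV PGS)
"""

# One match per subtitle line: group 1 is the track ID, group 2 the codec label.
SUBTITLE_LINE = re.compile(r"^Track ID (\d+): subtitles \((.+)\)\s*$", re.MULTILINE)


def parse_subtitle_tracks(identify_output):
    """Return a list of (track_id, codec) for every subtitle track listed."""
    return [(int(track_id), codec)
            for track_id, codec in SUBTITLE_LINE.findall(identify_output)]


def build_extract_specs(tracks, basename):
    """Build "id:output_file" arguments for mkvextract, skipping unknown codecs."""
    specs = []
    for track_id, codec in tracks:
        ext = EXTENSIONS.get(codec)
        if ext is None:
            continue  # e.g. image-based tracks, which this sketch does not handle
        # Include the track ID in the file name so several tracks do not collide.
        specs.append("{}:{}.{}.{}".format(track_id, basename, track_id, ext))
    return specs


print(build_extract_specs(parse_subtitle_tracks(SAMPLE_OUTPUT), "movie"))
```

The resulting list could be appended to the `mkvextract` command in place of `srt_track` and `ssa_track`. Note that the "already exists" check in the script looks for `basename.srt`, so it would need to be adapted to whatever naming scheme is used.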

@victorboykocom

Exactly the same issue here: I get an SRT file of over 10 MB containing binary data.
SAD!
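
The thread does not say what causes this. As a rough way to at least detect it, an extracted file can be checked for whether it looks like SubRip text at all. This is only a sketch: `looks_like_srt_bytes` and `looks_like_srt_file` are hypothetical helper names, and the heuristic (no NUL bytes, at least one SRT timing line near the start) is an assumption about what a healthy UTF-8 `.srt` file contains.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
SRT_TIMING = re.compile(rb"\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def looks_like_srt_bytes(sample):
    """Heuristic: True if the bytes contain no NULs and at least one timing line."""
    if b"\x00" in sample:
        # NUL bytes suggest binary data (this also rejects UTF-16 text).
        return False
    return SRT_TIMING.search(sample) is not None


def looks_like_srt_file(file_path, sample_size=4096):
    """Apply the heuristic to the first sample_size bytes of a file."""
    with open(file_path, "rb") as handle:
        return looks_like_srt_bytes(handle.read(sample_size))
```

Calling `looks_like_srt_file("movie.srt")` after extraction would flag a file like the one described above, so it can be deleted or reported instead of being kept as a broken subtitle file.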

@victorboykocom

There's a Mac application called Subtitle Extractor in the App Store that does this

Thank you for this @larryy ! I needed an SRT to translate subtitles into another language and needed precise time codes, this app does it!

@kwccoin

kwccoin commented Apr 15, 2021

For SRT it works fine, but for SUP subtitles, such as Japanese and Chinese tracks, it seems OCR is needed.
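
One way to act on this is to separate text-based tracks from image-based ones before extracting, so that image tracks are not written out under a `.srt` name. The sketch below assumes the codec labels `HDMV PGS` and `VobSub` are how `mkvmerge -i` reports image-based subtitles; `classify_codec`, `TEXT_CODECS` and `IMAGE_CODECS` are hypothetical names. It does no OCR itself.

```python
# Codec labels as they are assumed to appear in `mkvmerge -i` output.
TEXT_CODECS = {"SubRip/SRT", "SubStationAlpha"}   # the two the script handles
IMAGE_CODECS = {"HDMV PGS", "VobSub"}             # assumption: bitmap subtitles


def classify_codec(codec):
    """Return "text", "image" or "unknown" for a subtitle codec label."""
    if codec in TEXT_CODECS:
        return "text"
    if codec in IMAGE_CODECS:
        return "image"
    return "unknown"


for label in ("SubRip/SRT", "HDMV PGS"):
    print(label, "->", classify_codec(label))
```

Tracks classified as "image" would still need a separate OCR step to become SRT text.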

@salishrodinger

Thank you so much, it worked !!

@alsciende

Thanks a lot!

@phamhphuc

Very simple, and it works really well. Thank you very much.
