Instructions available (moved) at REMOTE ORIGIN website: Extract Subtitles From mkv
Hi, I keep getting this message: Error: The file 'Episode_1.mkv' could not be opened for reading: open file error.
I did exactly as suggested. It extracted a file. The file ended in .srt because that's the filename I specified, but it's a 25MB binary file, not a plain text .srt file. I suspect it's a .sub file format. Here are the first couple of lines:
PG�9ÅÆ��������Ä�8���Ä��������˛�mPG�9ÅÆ������
���˛�m�á�îPG�9ÅÆ������B����ÄÄÄ��ÄÄ@��ÄÄ`��ÄÄø��ÄĘ��ÄÄ���ÄÄü��ÄÄ �ÄÄ0
Apparently mkvextract cannot convert between subtitle formats or codecs; it just gives you whatever is in the Matroska file.
Going to try the ffmpeg suggestion. Usually ffmpeg is good about honoring the file extension you provide, if I remember right.
And just in case there are any doubts, here are the actual lines from the command-line session, except I've changed the filenames:
$ mkvmerge -i video.mkv
File 'video.mkv': container: Matroska
Track ID 0: video (MPEG-H/HEVC/h.265)
Track ID 1: audio (AC-3/E-AC-3)
Track ID 2: subtitles (HDMV PGS)
Chapters: 14 entries
$ mkvextract tracks video.mkv 2:subtitles.srt
Extracting track 2 with the CodecID 'S_HDMV/PGS' to the file 'subtitles.srt'. Container format: SUP
Progress: 100%
The lines of binary gibberish above are from subtitles.srt.
UPDATE: Nope, ffmpeg can't do it either. It notices the .srt extension and provides this helpful error message:
Subtitle encoding currently only possible from text to text or bitmap to bitmap
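The session above suggests that the extracted file keeps the track's native format, whatever extension is requested. The following is a minimal sketch, assuming the `mkvmerge -i` line format shown above, of picking the extension from the codec name instead. The `CODEC_INFO` table and the `subtitle_tracks` function are illustrative names, and the table covers only the codec names that appear in this thread.

```python
import re

# Codec names as printed by `mkvmerge -i` in this thread, mapped to
# (file extension, is_text). Illustrative only, not an exhaustive list.
CODEC_INFO = {
    "SubRip/SRT": ("srt", True),
    "SubStationAlpha": ("ssa", True),
    "HDMV PGS": ("sup", False),  # bitmap; mkvextract reported "Container format: SUP"
}

TRACK_RE = re.compile(r"^Track ID (\d+): subtitles \((.+)\)\s*$", re.MULTILINE)


def subtitle_tracks(mkvmerge_output):
    """Return (track_id, codec, extension, is_text) for each subtitle track."""
    tracks = []
    for match in TRACK_RE.finditer(mkvmerge_output):
        track_id, codec = int(match.group(1)), match.group(2)
        # Unknown codecs get (None, None) so the caller can decide what to do.
        extension, is_text = CODEC_INFO.get(codec, (None, None))
        tracks.append((track_id, codec, extension, is_text))
    return tracks


sample = """File 'video.mkv': container: Matroska
Track ID 0: video (MPEG-H/HEVC/h.265)
Track ID 1: audio (AC-3/E-AC-3)
Track ID 2: subtitles (HDMV PGS)
Chapters: 14 entries
"""

for track_id, codec, extension, is_text in subtitle_tracks(sample):
    if is_text:
        print(f"Track {track_id}: text subtitles, extract as .{extension}")
    elif is_text is False:
        print(f"Track {track_id}: bitmap subtitles ({codec}), extract as .{extension}; text needs OCR")
    else:
        print(f"Track {track_id}: unrecognized codec {codec!r}")
```

Checking the codec before extracting at least avoids ending up with a binary file that carries a misleading .srt name.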
Me too. I get the same "could not be opened for reading: open file error" message, and the file is not open or in use by anything else.
Hi Guys,
This is not a support channel.
This is just a drop-in "snippet".
If you need help, please move to a more appropriate place where you can actually get some help.
Try https://stackoverflow.com/
Quoting the report above ("it's a 25MB binary file, not a plain text .srt file", extracted from the HDMV PGS track): did you ever solve this?
I use zsh on macOS Catalina and I'm getting:
zsh: command not found: mkvmerge
My solution: I had the MKVToolNix GUI (v47) installed from the DMG, so I called the binaries inside the app bundle directly:
/Applications/MKVToolNix-47.0.0.app/Contents/MacOS/mkvmerge -i myFile.mkv
and
/Applications/MKVToolNix-47.0.0.app/Contents/MacOS/mkvextract tracks myFile.mkv 3:myFile1.srt 4:myFile2.srt
Worked.
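The workaround above can be sketched in Python: look for the tool on `$PATH` first, then fall back to the app bundle. This is only a sketch; the `find_tool` helper and the glob pattern are assumptions modeled on the DMG install path quoted above, and other installs may put the binaries elsewhere.

```python
import glob
import os
import shutil


def find_tool(name, app_glob="/Applications/MKVToolNix-*.app/Contents/MacOS"):
    """Return the path to an MKVToolNix binary, or None if it cannot be found.

    Checks $PATH first, then any directory matching app_glob (assumed to be
    the DMG install location reported in this thread).
    """
    on_path = shutil.which(name)
    if on_path:
        return on_path
    # Reverse sort tries the lexically later bundle name first, which is
    # often, but not always, the newer version.
    for directory in sorted(glob.glob(app_glob), reverse=True):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


mkvmerge = find_tool("mkvmerge")
print(mkvmerge or "mkvmerge not found; install MKVToolNix or add it to $PATH")
```

The returned path can then be used in place of the bare command name.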
https://stackoverflow.com/ is a better place for this type of support request or discussion.
I have the same issue as @larryy and @antonioreyna. Any updates or solves?
I was dealing with the same problem, and what worked for me was simplifying the name of the .mkv file by removing the brackets it contained. After that, everything worked well.
Works on an M1 Mac mini. You just need to fire up Terminal in Rosetta 2 mode before installing brew. Don't forget to set a folder location for the extracted SRT files; otherwise they are saved in your main user directory. Thanks for sharing.
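One way to control where the extracted files land is to put the target folder into each `TRACK_ID:filename` argument. The sketch below only builds the command and does not run it. The `build_extract_command` helper and the `.trackN` naming scheme are illustrative assumptions; the `mkvextract tracks FILE ID:OUTPUT` form is the one used earlier in this thread.

```python
from pathlib import Path


def build_extract_command(mkv_file, tracks, out_dir, tool="mkvextract"):
    """Build an mkvextract command that writes each track into out_dir.

    tracks maps track ID -> file extension, e.g. {3: "srt"}.
    """
    mkv_path = Path(mkv_file)
    out_path = Path(out_dir)
    command = [tool, "tracks", str(mkv_path)]
    for track_id, extension in sorted(tracks.items()):
        # Hypothetical naming scheme: <video name>.track<ID>.<extension>
        target = out_path / f"{mkv_path.stem}.track{track_id}.{extension}"
        command.append(f"{track_id}:{target}")
    return command


command = build_extract_command("myFile.mkv", {3: "srt", 4: "srt"}, "subs")
print(command)
```

Passing the resulting list to `subprocess.run(command, check=True)` would perform the extraction; creating the output folder first is the safer choice.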
If the tool is installed and in your $PATH it will run. The name of the file is irrelevant to the .srt subtitle problem. Unfortunately, there are multiple issues being discussed, not all having to do with subtitles.
That said, trying to extract text subtitles, like .srt, from an .mkv file that has bitmap subtitles won't work, whether from the command line or using a GUI utility like MKVToolNix. MKVToolNix will extract bitmap subtitles to a .mks file, and I suspect that's what this tool is doing as well. Either this tool or ffmpeg would have to implement OCR to convert bitmap subtitles to text subtitles. Sadly, neither one does, but that's kind of understandable, since open source OCR tools are not very good without a language model of some kind.
There's a Mac application called Subtitle Extractor in the App Store that does this, but it has no language model, so it makes silly mistakes like replacing "silly" with "sil/y", "I'm" with "I 'm", "with" with "With", "won't" with "won 't", and on and on. It gets enough right that you could probably fix the result by hand, but that would be tedious because there are a lot of errors. Better than nothing, I guess.
Consider it a complex feature request I guess, to implement OCR with a language model to convert bitmap subtitles to text subtitles.
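Some of the OCR mistakes listed above follow a mechanical pattern and can be patched without a language model. The sketch below is a partial, rule-based cleanup and only rejoins contractions split by a stray space (the "I 'm" and "won 't" cases). It does nothing for errors like "sil/y" or "With", and the `rejoin_contractions` name and the suffix list are assumptions for illustration.

```python
import re

# A word character, a space, an apostrophe, then a common contraction suffix.
SPLIT_CONTRACTION = re.compile(r"(\w) '(m|s|t|d|re|ve|ll)\b")


def rejoin_contractions(line):
    """Remove the stray space that OCR inserted before a contraction suffix."""
    return SPLIT_CONTRACTION.sub(r"\1'\2", line)


print(rejoin_contractions("I 'm sure it won 't work"))
```

A rule like this can also misfire on text that legitimately has a space before an opening quote, so the output still deserves a manual read-through.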
Extract subtitles from MKV files in all subdirectories
extract_subtitles.py
from os import path, walk
import re
import subprocess

# Folder that contains the MKVToolNix binaries; adjust to your installed version.
tool_path = "/Applications/MKVToolNix-51.0.0.app/Contents/MacOS/"
# Root directory to scan; all subdirectories are searched as well.
root_dir = "./"


def find_files(root, ext):
    """Return a sorted list of files under root whose names end with ext."""
    file_list = []
    for dirpath, dirnames, filenames in walk(root):
        for filename in filenames:
            # Skip the "._" companion files that macOS creates.
            if filename.endswith(ext) and not filename.startswith('._'):
                file_list.append(path.join(dirpath, filename))
    file_list.sort()
    return file_list


for file in find_files(root_dir, ".mkv"):
    # Strip only the trailing extension, not every ".mkv" in the path.
    basename = path.splitext(file)[0]
    if path.exists(basename + ".srt") or path.exists(basename + ".ssa"):
        print("Already exists, skipping...", basename)
        continue

    # Find subtitle tracks
    result = subprocess.run([tool_path + "mkvmerge", "-i", file], stdout=subprocess.PIPE, check=True)
    output = result.stdout.decode("utf-8", errors="replace")

    # SubRip .srt
    srt_track = re.search(r'Track ID (\d+): subtitles \(SubRip/SRT\)', output)
    if srt_track:
        srt_track = "{}:{}.{}".format(srt_track.group(1), basename, "srt")

    # SubStation Alpha .ssa
    ssa_track = re.search(r'Track ID (\d+): subtitles \(SubStationAlpha\)', output)
    if ssa_track:
        ssa_track = "{}:{}.{}".format(ssa_track.group(1), basename, "ssa")

    if not srt_track and not ssa_track:
        print('No SRT or SSA track found!', file, output)
        continue

    # Extract the tracks that were found; filter() drops the missing (None) ones.
    subprocess.run(list(filter(None, [tool_path + "mkvextract", "tracks", file, srt_track, ssa_track])), check=True)

print("Finished!")
Then run it in the terminal like this:
python3 extract_subtitles.py
Exactly the same issue here: I got an .srt file of over 10MB containing binary data.
SAD!
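A quick check can tell whether an extracted ".srt" is really text before you open it. This is a heuristic sketch, not a format validator: it assumes text subtitles are UTF-8 (a legacy-encoded .srt would be misreported as binary) and treats a leading "PG", as seen in the binary dump quoted earlier in this thread, as a sign of PGS bitmap data. The `looks_like_text_subtitles` name is made up for illustration.

```python
import codecs
import os
import tempfile


def looks_like_text_subtitles(file_path, sample_size=4096):
    """Heuristic: True if the start of the file looks like UTF-8 text."""
    with open(file_path, "rb") as handle:
        head = handle.read(sample_size)
    # The binary dump quoted in this thread starts with "PG"; NUL bytes are
    # another strong hint that the file is not plain text.
    if head.startswith(b"PG") or b"\x00" in head:
        return False
    try:
        # final=False tolerates a multi-byte character cut off at the sample edge.
        codecs.getincrementaldecoder("utf-8")().decode(head, final=False)
    except UnicodeDecodeError:
        return False
    return True


# Small demonstration with a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.srt")
    with open(demo, "w", encoding="utf-8") as handle:
        handle.write("1\n00:00:01,000 --> 00:00:02,000\nHello\n")
    print(looks_like_text_subtitles(demo))
```

If the check fails, the track is most likely a bitmap format and renaming the file will not help; it needs OCR, as discussed above.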
There's a Mac application called Subtitle Extractor in the App Store that does this
Thank you for this, @larryy! I needed an SRT to translate subtitles into another language and needed precise time codes, and this app does it!
For SRT it is OK, but for SUP tracks (e.g. Japanese and Chinese subtitles), it seems OCR is needed.
Thank you so much, it worked !!
Thanks a lot!
Very simple, and it works really well. Thank you very much.
Thanks a lot! That was really helpful!