#!/usr/bin/env python3


def parse(file):
    ''' Generates encoded characters from file byte data.

    Encoding is suitable for embedded [Graphics] in SubStation Alpha files.
    See here for encoding specification and other details:
    http://www.tcax.org/docs/ass-specs.htm

    Bytes are split into groups of 6 bits, then 33 is added to each group
    (which ensures all encoded bytes are printable ascii characters, and not
    lower-case).

    Most bytes are handled in groups of 3, since 3*8 = 24 bits = 4 groups of 6.
    If the file length is not a multiple of 3 bytes the remaining one or two
    bytes are left-shifted, with zero-bits added on the right to get an even
    multiple of 6 bits, which can then be split and encoded normally.
    '''
    encoded_bits = 6
    while (data := file.read(3)):
        if (offset := (len(data) % 3)) != 0:
            if offset == 1:
                # 1 remainder byte: 8 bits + 4 zero-bits = 12 bits (2 groups)
                joined = data[0] << 4
                encoded_characters = 2
            else:
                # 2 remainder bytes: 16 bits + 2 zero-bits = 18 bits (3 groups)
                joined = data[0] << 10 | data[1] << 2
                encoded_characters = 3
        else:
            # 3 bytes (normal): 24 bits (4 groups)
            joined = data[0] << 16 | data[1] << 8 | data[2]
            encoded_characters = 4

        # yield one encoded character at a time, most significant group first
        yield from (chr(((joined >> split) & 0b11_1111) + 33)
                    for split in range((encoded_characters - 1) * encoded_bits,
                                       -1, -encoded_bits))


def to_lines(parser):
    ''' Yields encoded characters in 80-character lines. '''
    for index, value in enumerate(parser):
        if index and index % 80 == 0:
            yield '\n'
        yield value
    yield '\n'


if __name__ == '__main__':
    from pathlib import Path
    from argparse import ArgumentParser

    parser = ArgumentParser(
        description='Advanced SubStation embedded file encoder')
    parser.add_argument('input_filename', type=Path)
    args = parser.parse_args()

    valid_file_types = ('.bmp', '.jpg', '.gif', '.ico', '.wmf', '.ttf')
    assert (file_type := args.input_filename.suffix) in valid_file_types, \
        f'Unsupported {file_type = } - must be one of {valid_file_types}'

    print('[Graphics]')
    print('filename:', args.input_filename.name)
    with open(args.input_filename, 'rb') as file:
        print(''.join(to_lines(parse(file))))
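Since the encoded output can't easily be tested in an editor, one way to sanity-check it is to decode it again and compare against the original bytes. The following is a minimal sketch of such a decoder, assuming the encoding described in the `parse` docstring (6-bit groups, 33 added, remainder bytes padded with zero-bits on the right). The `decode` function is a hypothetical helper, not part of the script above.

```python
def decode(text):
    ''' Decodes SSA-style embedded-file text back into bytes (sketch). '''
    # Undo the +33 offset and ignore the line breaks added for wrapping.
    values = [ord(char) - 33 for char in text if not char.isspace()]
    decoded = bytearray()
    for start in range(0, len(values), 4):
        group = values[start:start + 4]
        if len(group) == 1:
            raise ValueError('a single trailing character cannot be decoded')
        joined = 0
        for value in group:
            joined = joined << 6 | value
        # 4 characters -> 3 bytes, 3 -> 2 bytes, 2 -> 1 byte
        byte_count = len(group) - 1
        # drop the zero-bits that padded the remainder to a multiple of 6
        joined >>= len(group) * 6 - byte_count * 8
        decoded += joined.to_bytes(byte_count, 'big')
    return bytes(decoded)


print(decode('1W&U'))  # b'Cat'
```

A round trip through the encoder and this decoder should reproduce the input file for lengths of 3n, 3n+1 and 3n+2 bytes, which exercises all three branches.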
Drawing tools seem to work, including on VLC. Straight lines are easy/intuitive, but bezier curves are hard to create without a GUI/some automation.
Not sure whether it's best to make a converter tailored to logos and other low-palette images, or to use existing online img->svg converters and write an svg parser. A custom converter would need code that detects a colour palette (kmeans on a perceptual colour space?) and the major shapes/contours in an image, then outputs a valid set of commands (moves/lines/beziers) to draw them. With either approach, the user could specify a nominal output size, and then specify a scale in the \p<scale> command to use the drawing in smaller videos.
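As a rough illustration of the output side of such a converter, here is a minimal sketch that turns a list of polygon points into an ASS drawing string using only move (`m`) and line (`l`) commands. The function name `polygon_to_drawing` is hypothetical, and the sketch assumes the usual reading of the ASS spec, where `\p<scale>` divides the coordinates by 2^(scale - 1).

```python
def polygon_to_drawing(points, scale=1):
    ''' Returns an ASS drawing override block for a closed polygon (sketch).

    points: sequence of (x, y) pairs at the nominal output size.
    scale: value for \\p<scale>; assumed to shrink coordinates by
        a factor of 2**(scale - 1).
    '''
    (x0, y0), *rest = points
    commands = [f'm {x0} {y0}']  # move to the first point
    if rest:
        # straight lines through the remaining points
        commands.append('l ' + ' '.join(f'{x} {y}' for x, y in rest))
    return f'{{\\p{scale}}}' + ' '.join(commands) + '{\\p0}'


square = [(0, 0), (100, 0), (100, 100), (0, 100)]
print(polygon_to_drawing(square, scale=2))
# {\p2}m 0 0 l 100 0 100 100 0 100{\p0}
```

The returned string would go in the text field of a Dialogue event. Bezier segments (`b`) are left out here, since generating good control points is the hard part mentioned above.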
Installed SubStation Alpha but couldn't get it to work with embedded files, or Pictures more generally. Also confirmed that HandBrake doesn't seem capable of handling them or burning them in. At least the embedding code was fun to write.
Seems like vector-based drawings are the way to go, since they actually work across the main modern software that's available.
Requires
Python >= 3.8
Details/Caveats
Outputs the same encoded result as Aegisub, so I assume it's implemented correctly. Unfortunately Aegisub doesn't support the SSA Picture event, so it can't be tested there. Going by the Aegisub Attachment Manager docs, it would seem that either SubStation Alpha or mkvmerge is required for handling embedded files.
I'm unsure whether mkvmerge can handle Picture events, so will need to try that. I'm also not sure whether newer editors like SubtitleEdit can handle Picture events. Assuming I formatted the line correctly, I believe VLC doesn't, and it apparently doesn't render embedded fonts either, so it likely can't handle embedded images even if it does support Picture events (although I did also try the file-path method, which didn't work either).
It would likely be sufficient to burn in the subtitles (image(s) included) with HandBrake, but it also didn't seem to support the Picture events I tried.
Alternatives
If I can't get SubStation Alpha or mkvmerge working then perhaps images will need to be included using the vector-graphics Drawings commands/style override codes specified in the ASS format. I suppose it's also technically possible to automatically convert a normal image into a vectorised format with individual squares/rectangles representing the pixels, but that's likely ill-advised for performance and memory reasons.
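To give a feel for why the pixel-rectangle approach scales badly, here is a minimal sketch that converts a small grid of palette indices into one drawing per colour, merging horizontal runs of equal pixels into single rectangles. The function `grid_to_drawings` and its inputs are hypothetical, the colour is set with the `\1c&HBBGGRR&` override, and positioning (e.g. `\pos`, `\an7`) is left out.

```python
def grid_to_drawings(grid, palette):
    ''' Returns one ASS drawing string per palette colour (sketch).

    grid: rows of palette indices, one index per pixel.
    palette: list of (r, g, b) tuples with 0-255 components.
    '''
    shapes = {}
    for y, row in enumerate(grid):
        x = 0
        while x < len(row):
            start, index = x, row[x]
            # extend the run while the colour stays the same
            while x < len(row) and row[x] == index:
                x += 1
            # one rectangle covering pixels start..x-1 on row y
            shapes.setdefault(index, []).append(
                f'm {start} {y} l {x} {y} {x} {y + 1} {start} {y + 1}')
    drawings = []
    for index, rectangles in sorted(shapes.items()):
        r, g, b = palette[index]
        # ASS colours are written in blue-green-red order
        drawings.append(f'{{\\1c&H{b:02X}{g:02X}{r:02X}&\\p1}}'
                        + ' '.join(rectangles) + '{\\p0}')
    return drawings


grid = [[0, 0, 1],
        [1, 1, 1]]
for drawing in grid_to_drawings(grid, [(255, 0, 0), (0, 0, 255)]):
    print(drawing)
# {\1c&H0000FF&\p1}m 0 0 l 2 0 2 1 0 1{\p0}
# {\1c&HFF0000&\p1}m 2 0 l 3 0 3 1 2 1 m 0 1 l 3 1 3 2 0 2{\p0}
```

Even with run merging, a photo-like image has few long runs, so the command text grows roughly with the pixel count, which is the performance and memory concern noted above.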