@msinkec
Last active March 1, 2024 11:16
Embedding a Secret Message in a PDF through LaTeX

This gist outlines a method for embedding a secret message in a LaTeX document by varying the width of the spaces between words in the rendered PDF. The technique is a form of steganography: the message is hidden in plain sight and is unlikely to be noticed by a casual reader.

Generating the LaTeX File with a Secret Message

The first step is a Python script that takes a cover text (the content of the LaTeX file) and a secret message. The script converts the secret message into a binary string and places one bit after each word: a 0 becomes a narrower space (0.3em in the script below) and a 1 becomes a wider space (0.6em). The difference is subtle enough that readers typically do not notice it, but it is enough to encode binary data.
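As a small worked example of that mapping, the sketch below shows the bits and spacings for a single character and how many word gaps a message needs. The helper names `to_bits` and `spacings_for` are illustrative and not part of the original script; the spacing values are the ones used in this gist.

```python
# Spacing values taken from the LaTeX macros defined in this gist.
SPACING = {'0': '0.3em', '1': '0.6em'}

def to_bits(message):
    # 8 bits per UTF-8 byte, most significant bit first.
    return ''.join(format(byte, '08b') for byte in message.encode('utf-8'))

def spacings_for(message):
    # One spacing per bit, in the order the gaps appear in the text.
    return [SPACING[bit] for bit in to_bits(message)]

print(to_bits('A'))                        # 01000001
print(spacings_for('A'))
print(len(to_bits('Very secret text.')))   # 136
```

The 17-character example message therefore needs 136 bits, so the cover text has to supply at least that many word gaps.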

Below is a simplified version of the Python script that generates a LaTeX file embedding a secret message:

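import sys

content = '''Your document content goes here...'''

# Each word is followed by one space, and each space can carry one bit.
content_words = content.split()

secret_str = 'Very secret text.'

def string_to_binary_no_spaces(input_string):
    # Encode to UTF-8 first so that every byte yields exactly 8 bits,
    # including for non-ASCII characters.
    return ''.join(format(byte, '08b') for byte in input_string.encode('utf-8'))

secret_bits = string_to_binary_no_spaces(secret_str)

# Warn on stderr (so the redirected LaTeX output stays clean) if the
# cover text has too few words to hold the whole message.
if len(secret_bits) > len(content_words):
    print('Warning: content is too short; the secret will be truncated.',
          file=sys.stderr)

print('''
\\documentclass{article}
\\usepackage{lipsum} % Package to generate dummy text

% Define commands for altered spacing
\\newcommand{\\zerospace}{\\hspace{0.3em}} % Standard space for '0'
\\newcommand{\\onespace}{\\hspace{0.6em}}  % Larger space for '1'

\\begin{document}

\\section{Steg Example}
''')

for i, word in enumerate(content_words):
    print(word, end='')
    if i < len(secret_bits):
        # TeX ignores the space after a control word, so only the
        # \hspace from the macro separates this word from the next.
        macro = '\\zerospace' if secret_bits[i] == '0' else '\\onespace'
        print(macro, end=' ')
    else:
        print(' ', end='')
print()
print('\\end{document}')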

We can save the output to a file using the following command:

python3 script.py > out.tex
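Before compiling, it can be worth checking that the generated file round-trips. The sketch below reads the spacing macros back out of the LaTeX source and rebuilds the message. The function name `decode_from_tex` is illustrative, and it assumes the macro names `\zerospace` and `\onespace` used above and a UTF-8 encoded message.

```python
import re

def decode_from_tex(tex_source):
    # Only scan the body: the preamble also mentions both macros
    # in their \newcommand definitions.
    body = tex_source.split('\\begin{document}', 1)[-1]
    tokens = re.findall(r'\\(zerospace|onespace)(?![A-Za-z])', body)
    bits = ''.join('0' if token == 'zerospace' else '1' for token in tokens)
    # Drop any trailing bits that do not fill a whole byte.
    usable = len(bits) - len(bits) % 8
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode('utf-8', errors='replace')

sample = r'''\begin{document}
a\zerospace b\onespace c\zerospace d\zerospace e\zerospace f\zerospace g\zerospace h\onespace i
\end{document}'''
print(decode_from_tex(sample))  # A
```

For the real file, the same function could be called on the contents of out.tex, for example `decode_from_tex(open('out.tex').read())`.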

Compiling to PDF

Once the LaTeX file is generated, the next step is to compile it into a PDF document. This can be done with the pdflatex command-line tool, which ships with most LaTeX distributions, such as TeX Live or MiKTeX.

To compile the document, run the following command:

pdflatex out.tex

This will generate our final out.pdf file.

Decoding the Secret Message

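To retrieve the secret message from the PDF, reverse the process: measure the gap between each pair of consecutive words, classify each gap as narrow (0) or wide (1), group the resulting bits into bytes, and decode the bytes as text. Automating this requires the position of each word on the page, which might come from PDF text extraction or, for a scanned or rasterized document, from some form of optical character recognition (OCR).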
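The classification and decoding steps can be sketched as follows, assuming the gap widths have already been measured in points by some other tool (that extraction step is not shown here). The function names and the 4.5pt threshold are assumptions: the threshold is the midpoint between 0.3em and 0.6em for a 10pt font, where 1em is about 10pt. Gaps that fall on a line break cannot be measured this way and would need separate handling.

```python
def gaps_to_bits(gap_widths_pt, threshold_pt=4.5):
    # Wider than the threshold means '1', otherwise '0'.
    return ''.join('1' if width > threshold_pt else '0'
                   for width in gap_widths_pt)

def bits_to_text(bits):
    # Ignore trailing bits that do not fill a whole byte.
    usable = len(bits) - len(bits) % 8
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode('utf-8', errors='replace')

# Hypothetical measurements for the eight gaps that encode 'A' (01000001).
gaps = [3.0, 6.1, 2.9, 3.0, 3.1, 3.0, 2.9, 6.0]
print(bits_to_text(gaps_to_bits(gaps)))  # A
```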

A Challenge

I prepared a PDF file with an embedded secret using the method above. See if you can decipher it:

https://ordinals.gorillapool.io/content/d0b910e854faa7e391188043a080fd47b4f051608e6c4ba8e2955452cae18f7c_0?fuzzy=false
