@juliusgeo
Last active March 19, 2025 00:58
Smuggling Python Code Using Shrugging Faces

This blog post by Paul Butler intrigued me:

This Hacker News comment by GuB-42 intrigued me:

With ZWJ (Zero Width Joiner) sequences you could in theory encode an unlimited amount of data in a single emoji.

Is it really possible to encode arbitrary data in a single emoji?

And this line inspired me:

I can think of a couple of nefarious ways this could be (ab)used

Behold!

eval(''.join([chr(b) for v in "🤷󠄘󠅓󠅟󠅔󠅕󠄐󠄪󠄭󠄐󠅏󠅏󠅙󠅝󠅠󠅟󠅢󠅤󠅏󠅏󠄘󠄒󠅓󠅟󠅔󠅕󠄒󠄙󠄜󠄐󠅣󠅩󠅣󠄐󠄪󠄭󠄐󠅏󠅏󠅙󠅝󠅠󠅟󠅢󠅤󠅏󠅏󠄘󠄒󠅣󠅩󠅣󠄒󠄙󠄜󠄐󠅓󠅠󠅢󠅤󠄐󠄪󠄭󠄐󠄗󠅄󠅩󠅠󠅕󠄐󠄒󠅣󠅘󠅢󠅥󠅗󠄒󠄜󠄐󠄒󠅣󠅘󠅢󠅥󠅗󠄒󠄜󠄐󠄒󠅣󠅘󠅢󠅥󠅗󠄒󠄐󠅟󠅢󠄐󠄒󠅣󠅘󠅢󠅥󠅗󠄒󠄐󠅖󠅟󠅢󠄐󠅝󠅟󠅢󠅕󠄐󠅙󠅞󠅖󠅟󠅢󠅝󠅑󠅤󠅙󠅟󠅞󠄞󠄗󠄜󠅒󠅑󠅞󠅞󠅕󠅢󠄪󠄭󠅖󠄗󠅃󠅘󠅢󠅥󠅗󠄐󠅫󠅣󠅩󠅣󠄞󠅦󠅕󠅢󠅣󠅙󠅟󠅞󠅭󠄐󠅟󠅞󠄐󠅃󠅘󠅢󠅥󠅗󠅌󠅞󠅫󠅓󠅠󠅢󠅤󠅭󠅌󠅞󠄗󠄜󠄐󠅓󠅟󠅔󠅕󠄞󠄹󠅞󠅤󠅕󠅢󠅑󠅓󠅤󠅙󠅦󠅕󠄳󠅟󠅞󠅣󠅟󠅜󠅕󠄘󠅜󠅟󠅓󠅑󠅜󠅣󠄭󠅗󠅜󠅟󠅒󠅑󠅜󠅣󠄘󠄙󠄙󠄞󠅙󠅞󠅤󠅕󠅢󠅑󠅓󠅤󠄘󠅒󠅑󠅞󠅞󠅕󠅢󠄭󠅒󠅑󠅞󠅞󠅕󠅢󠄙󠄙" if (b:=(lambda v: v-0xFE00 if 0xFE00 <= v <= 0xFE0F else v - 0xE0100 + 16 if 0xE0100 <= v <= 0xE01EF else None)(ord(v)))]))

Luckily GitHub doesn't render these invisible characters, but they can be seen by pasting the emoji into a text editor that does.
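If no such editor is at hand, the hidden characters can also be listed from Python. The following is a minimal sketch; `reveal`, `count_selectors`, and the two-selector `sample` string are hypothetical names and data made up for illustration, not part of the gist.

```python
import unicodedata

def reveal(s):
    """Return (code point, Unicode name) pairs for every character in s."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in s]

def count_selectors(s):
    """Count characters that fall in the two variation selector ranges."""
    return sum(
        1 for ch in s
        if 0xFE00 <= ord(ch) <= 0xFE0F or 0xE0100 <= ord(ch) <= 0xE01EF
    )

# Hypothetical sample: a shrug followed by two selectors carrying bytes 0x68 and 0x69.
sample = "\U0001F937" + chr(0xE0100 + 0x68 - 16) + chr(0xE0100 + 0x69 - 16)

for code_point, name in reveal(sample):
    print(code_point, name)
print(count_selectors(sample), "hidden selectors")
```

Running the same two functions on a string pasted from an untrusted source gives a quick indication of whether it carries more than it shows.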

The shrugging emoji has the following script embedded into it:

(code := __import__("code"), sys := __import__("sys"), cprt := 'Type "shrug", "shrug", "shrug" or "shrug" for more information.',banner:=f"Shrug {sys.version} on Shrug\n{cprt}\n", code.InteractiveConsole(locals=globals()).interact(banner=banner))

which creates a new REPL with some very important changes.

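You might notice the strange syntax. That's because eval only accepts a single expression, not statements, so the script is written as one tuple of assignment expressions (:=). exec would accept ordinary statements, but everyone already knows to be suspicious of exec.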
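The difference is easy to check in isolation. This is a small sketch, assuming Python 3.8 or later (where `:=` is available); the `namespace` dict and the variable names are illustrative only.

```python
namespace = {}

# A statement is rejected: eval() compiles its input as a single expression.
try:
    eval("x = 1", namespace)
    statement_ok = True
except SyntaxError:
    statement_ok = False

# A tuple of assignment expressions is itself one expression, so it is accepted.
# Each `:=` binds a name in the namespace passed to eval().
result = eval("(x := 20, y := x + 1, x + y)", namespace)

print(statement_ok)     # False
print(result)           # (20, 21, 41)
print(namespace["y"])   # 21
```

The tuple elements are evaluated left to right, which is why later elements can use names bound by earlier ones.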

Encoding and decoding are done exactly the same as in the blog post (just in Python), and the code in shrugging_face.py demonstrates its usage succinctly.

Here's the implementation, which can be used to encode scripts into an emoji:

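def byte_to_variation(b):
    # Bytes 0-15 map to U+FE00..U+FE0F; bytes 16-255 map to U+E0100..U+E01EF.
    # Return a str directly: encoding each selector to UTF-16 separately would
    # add a BOM per byte, which ends up as stray U+FEFF characters in the output.
    if b < 16:
        return chr(0xFE00 + b)
    else:
        return chr(0xE0100 + (b - 16))

def encode(base, s):
    # Append one variation selector per UTF-8 byte of s to the base character.
    return base + ''.join(byte_to_variation(b) for b in s.encode("utf-8"))

def variation_to_byte(v):
    if 0xFE00 <= v <= 0xFE0F:
        return v - 0xFE00
    elif 0xE0100 <= v <= 0xE01EF:
        return v - 0xE0100 + 16
    else:
        return None

def decode(vs):
    res = []
    for v in vs:
        if (b := variation_to_byte(ord(v))) is not None:
            res.append(b)
    # The payload was encoded as UTF-8, so decode it as UTF-8 rather than
    # mapping each byte to a code point (which only works for ASCII).
    return bytes(res).decode("utf-8")

assert decode(encode('😊', "Lorem ipsum dolor sit amet")) == "Lorem ipsum dolor sit amet"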
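As a usage illustration, the same byte-to-selector mapping can carry a small expression that is then handed to eval, mirroring the one-liner at the top. This is a self-contained sketch, not the gist's own code: `hide` and `unhide` are hypothetical helper names, the payload is a harmless stand-in, and decoding as UTF-8 is an assumption made here so that non-ASCII payloads survive the round trip.

```python
def hide(base, text):
    """Append one variation selector per UTF-8 byte of text to base."""
    selectors = (
        chr(0xFE00 + b) if b < 16 else chr(0xE0100 + b - 16)
        for b in text.encode("utf-8")
    )
    return base + "".join(selectors)

def unhide(carrier):
    """Collect the bytes carried by variation selectors and decode them as UTF-8."""
    out = bytearray()
    for ch in carrier:
        cp = ord(ch)
        if 0xFE00 <= cp <= 0xFE0F:
            out.append(cp - 0xFE00)
        elif 0xE0100 <= cp <= 0xE01EF:
            out.append(cp - 0xE0100 + 16)
    return out.decode("utf-8")

payload = "sum(range(5))"            # harmless stand-in for a real script
carrier = hide("\U0001F937", payload)

print(len(carrier))                   # 14: one base character plus 13 selectors
print(unhide(carrier))                # sum(range(5))
print(eval(unhide(carrier)))          # 10
print(unhide(hide("\U0001F937", "¯\\_(ツ)_/¯")))  # non-ASCII payloads round-trip too
```

Note that `len(carrier)` grows by one code point per payload byte even though most renderers show only a single emoji.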