This blog post by Paul Butler intrigued me:
This Hacker News comment by GuB-42 intrigued me:
With ZWJ (Zero Width Joiner) sequences you could in theory encode an unlimited amount of data in a single emoji.
Is it really possible to encode arbitrary data in a single emoji?
And this line inspired me:
I can think of a couple of nefarious ways this could be (ab)used
Behold!
eval(''.join([chr(b) for v in "🤷󠄘󠅓󠅟󠅔󠅕󠄐󠄪󠄭󠄐󠅏󠅏󠅙󠅝󠅠󠅟󠅢󠅤󠅏󠅏󠄘󠄒󠅓󠅟󠅔󠅕󠄒󠄙󠄜󠄐󠅣󠅩󠅣󠄐󠄪󠄭󠄐󠅏󠅏󠅙󠅝󠅠󠅟󠅢󠅤󠅏󠅏󠄘󠄒󠅣󠅩󠅣󠄒󠄙󠄜󠄐󠅓󠅠󠅢󠅤󠄐󠄪󠄭󠄐󠄗󠅄󠅩󠅠󠅕󠄐󠄒󠅣󠅘󠅢󠅥󠅗󠄒󠄜󠄐󠄒󠅣󠅘󠅢󠅥󠅗󠄒󠄜󠄐󠄒󠅣󠅘󠅢󠅥󠅗󠄒󠄐󠅟󠅢󠄐󠄒󠅣󠅘󠅢󠅥󠅗󠄒󠄐󠅖󠅟󠅢󠄐󠅝󠅟󠅢󠅕󠄐󠅙󠅞󠅖󠅟󠅢󠅝󠅑󠅤󠅙󠅟󠅞󠄞󠄗󠄜󠅒󠅑󠅞󠅞󠅕󠅢󠄪󠄭󠅖󠄗󠅃󠅘󠅢󠅥󠅗󠄐󠅫󠅣󠅩󠅣󠄞󠅦󠅕󠅢󠅣󠅙󠅟󠅞󠅭󠄐󠅟󠅞󠄐󠅃󠅘󠅢󠅥󠅗󠅌󠅞󠅫󠅓󠅠󠅢󠅤󠅭󠅌󠅞󠄗󠄜󠄐󠅓󠅟󠅔󠅕󠄞󠄹󠅞󠅤󠅕󠅢󠅑󠅓󠅤󠅙󠅦󠅕󠄳󠅟󠅞󠅣󠅟󠅜󠅕󠄘󠅜󠅟󠅓󠅑󠅜󠅣󠄭󠅗󠅜󠅟󠅒󠅑󠅜󠅣󠄘󠄙󠄙󠄞󠅙󠅞󠅤󠅕󠅢󠅑󠅓󠅤󠄘󠅒󠅑󠅞󠅞󠅕󠅢󠄭󠅒󠅑󠅞󠅞󠅕󠅢󠄙󠄙" if (b:=(lambda v: v-0xFE00 if 0xFE00 <= v <= 0xFE0F else v - 0xE0100 + 16 if 0xE0100 <= v <= 0xE01EF else None)(ord(v)))]))
Luckily GitHub doesn't render these invisible characters, but they can be seen by pasting the emoji into a text editor that does.
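If no such editor is at hand, the hidden characters can also be listed from Python. This is a minimal sketch; reveal is a hypothetical helper name that is not part of this repository, and the sample string is made up for illustration.

```python
def reveal(text):
    """Return every code point in text as a 'U+XXXX' label, including invisible ones."""
    return [f"U+{ord(ch):04X}" for ch in text]


# A shrug emoji followed by two variation selectors
# (they stand for the bytes 0x00 and 0x10 in the scheme used here).
sample = "\U0001F937\uFE00\U000E0100"
print(len(sample))     # 3 code points, although only one glyph is drawn
print(reveal(sample))  # ['U+1F937', 'U+FE00', 'U+E0100']
```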
The shrugging emoji has the following script embedded into it:
(code := __import__("code"), sys := __import__("sys"), cprt := 'Type "shrug", "shrug", "shrug" or "shrug" for more information.',banner:=f"Shrug {sys.version} on Shrug\n{cprt}\n", code.InteractiveConsole(locals=globals()).interact(banner=banner))
which creates a new REPL with some very important changes.
You might notice the strange syntax: the whole script is a single tuple of assignment expressions. That's because eval only accepts expressions, not statements, and everyone already knows to be suspicious of exec code.
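The difference can be checked directly. The sketch below assumes Python 3.8 or later (for the := operator) and uses a throwaway namespace dict so nothing leaks into the real globals; the variable names are illustrative only.

```python
namespace = {}

# An assignment statement is rejected, because eval only compiles expressions.
try:
    eval("x = 1", namespace)
    statement_ok = True
except SyntaxError:
    statement_ok = False

# A parenthesized tuple of assignment expressions is one expression, so eval accepts it.
result = eval("(x := 1, y := x + 1, x + y)", namespace)

print(statement_ok)  # False
print(result)        # (1, 2, 3)
```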
Encoding and decoding are done exactly the same as in the blog post (just in Python), and the code in shrugging_face.py demonstrates its usage succinctly.
Here's the implementation, which can be used to encode scripts into an emoji:
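def byte_to_variation(b):
    # Bytes 0-15 map to U+FE00..U+FE0F, bytes 16-255 to U+E0100..U+E01EF.
    if b < 16:
        return chr(0xFE00 + b)
    else:
        return chr(0xE0100 + (b - 16))

def encode(base, s):
    # Append one variation selector per UTF-8 byte of s to the base character.
    # Working with str directly keeps UTF-16 byte order marks out of the result.
    res = base
    for i in s.encode("utf-8"):
        res += byte_to_variation(i)
    return res
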
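def variation_to_byte(v):
    # Inverse of byte_to_variation; returns None for any other code point.
    if 0xFE00 <= v <= 0xFE0F:
        return v - 0xFE00
    elif 0xE0100 <= v <= 0xE01EF:
        return v - 0xE0100 + 16
    else:
        return None

def decode(vs):
    res = []
    for v in vs:
        if (b := variation_to_byte(ord(v))) is not None:
            res.append(b)
    # The payload was encoded as UTF-8, so decode the collected bytes the same way.
    return bytes(res).decode("utf-8")
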
assert decode(encode('😊', "Lorum ipsum dolor sit amet")) == "Lorum ipsum dolor sit amet"
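
To tie the pieces together, here is a self-contained round trip that hides a harmless expression in an emoji and evaluates it. It is only a sketch of the same scheme: hide and unhide are hypothetical names chosen so the block runs on its own, and the payload "6 * 7" is made up for illustration.

```python
def hide(base, source):
    """Append one variation selector per UTF-8 byte of source to base."""
    selectors = []
    for b in source.encode("utf-8"):
        # Bytes 0-15 use U+FE00..U+FE0F, bytes 16-255 use U+E0100..U+E01EF.
        selectors.append(chr(0xFE00 + b) if b < 16 else chr(0xE0100 + b - 16))
    return base + "".join(selectors)


def unhide(text):
    """Recover the hidden UTF-8 string, skipping anything that is not a variation selector."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if 0xFE00 <= cp <= 0xFE0F:
            out.append(cp - 0xFE00)
        elif 0xE0100 <= cp <= 0xE01EF:
            out.append(cp - 0xE0100 + 16)
    return out.decode("utf-8")


carrier = hide("\U0001F937", "6 * 7")
print(len(carrier))           # 6: the emoji plus five selectors
print(eval(unhide(carrier)))  # 42
```

One caveat with this approach: a base emoji that already contains a variation selector of its own (for example U+FE0F) would have that selector read back as a payload byte, so a base without one is the safer choice.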