@15leesan
Created May 3, 2023 09:59
Source for the final demo of my talk (https://www.youtube.com/watch?v=t863QfAOmlY).
"""custom codec to screw with people"""
import codecs
### Codec APIs
replacement = r"""
import subprocess
from time import sleep
TEXT = "The End."
delay = {
    3: 0.6,
    7: 1
}
for i in range(len(TEXT) + 1):
    subprocess.run(["clear"])
    print("\n" * 10)
    subprocess.run(["figlet", "-f", "big", TEXT[:i]])
    sleep(delay.get(i, 0.3))
input("")
subprocess.run(["clear"])
"""
def encode(input, errors="strict"):
    return codecs.utf_8_encode(input, errors)


def decode(input, errors="strict"):
    # Decode as UTF-8, then discard the result: a chunk containing "3"
    # becomes the payload above, and every other chunk becomes empty.
    text, consumed = codecs.utf_8_decode(input, errors)
    if "3" in text:
        return replacement, consumed
    return "", consumed


class Codec(codecs.Codec):
    # Note: these are plain Python functions, so wrap them in staticmethod
    # to stop the class from binding them as methods on instances.
    encode = staticmethod(encode)
    decode = staticmethod(decode)


class IncrementalEncoder(codecs.IncrementalEncoder):
    def encode(self, input, final=False):
        return encode(input, self.errors)[0]


class IncrementalDecoder(codecs.IncrementalDecoder):
    def decode(self, input, final=False):
        return decode(input, self.errors)[0]


class StreamWriter(Codec, codecs.StreamWriter):
    pass


class StreamReader(Codec, codecs.StreamReader):
    pass


def getregentry():
    # Called by the encodings package when this module is looked up by name.
    return codecs.CodecInfo(
        name='false_encoding',
        encode=Codec.encode,
        decode=Codec.decode,
        incrementalencoder=IncrementalEncoder,
        incrementaldecoder=IncrementalDecoder,
        streamreader=StreamReader,
        streamwriter=StreamWriter,
    )
# "The Worst Python Techniques You Might Get Away With"
# rate my coding: 1/10 usefulness, 10/10 style
# UWCS Lightning Talks - Andrew Lees, 2023
The Zen of Python:
.,:clooooolc;,.
;ooloooooooooooo:. Beautiful is better than ugly.
;oc loooooooooooo. Explicit is better than implicit.
:oo,.,ooooooooooooo. Simple is better than complex.
,:::::::::ooooooooo. Complex is better than complicated.
.,;::::::::::::::ooooooooo. .... Flat is better than nested.
.loooooooooooooooooooooooooo. ....... Sparse is better than dense.
.oooooooooooooooooooooooooooo. ........ Readability counts.
:ooooooooooooooooooooooooooo; ......... Special cases aren't special enough to break the rules.
ooooooooooool;,,,,,,,,,,,,.. ........... Although practicality beats purity.
looooooooo, ........................... Errors should never pass silently.
;oooooooo. ............................ Unless explicitly silenced.
oooooooo ............................. In the face of ambiguity, refuse the temptation to guess.
.coooooo ............................ There should be one-- and preferably only one --obvious way to do it.
..... .......... Although that way may not be obvious at first unless you're Dutch.
................... Now is better than never.
.............. .. Although never is often better than *right* now.
............. .. If the implementation is hard to explain, it's a bad idea.
.............. If the implementation is easy to explain, it may be a good idea.
....... Namespaces are one honking great idea -- let's do more of those!
@15leesan
Author

15leesan commented May 3, 2023

When saved as /usr/lib64/python3.11/encodings/1.py (i.e. within the Python installation's directory of encodings), Python will treat the second line (containing ...coding: 1/10...) as specifying the source encoding to be located in 1.py.
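
For readers unfamiliar with the lookup step: as far as I understand it, the standard `encodings` package resolves a codec name by importing the submodule of that name and calling its `getregentry()`. A minimal sketch using the stock `utf_8` module (not the gist's `1.py`, which would have to be installed first) shows the same hook:

```python
import codecs
import importlib.util
import encodings.utf_8

# Every codec module in the encodings package exposes getregentry(),
# which returns a codecs.CodecInfo describing the codec.
info = encodings.utf_8.getregentry()

# codecs.lookup() goes through the encodings package to reach that module.
looked_up = codecs.lookup("utf_8")

# The module is found by name inside the encodings directory; a file
# called 1.py in the same directory would be found as "encodings.1".
spec = importlib.util.find_spec("encodings.utf_8")

print(info.name, looked_up.name, spec.origin)
```

The point is only that the file name in the encodings directory is what the encoding name gets matched against.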

The custom encoding replaces any chunk containing the character '3' with our actual program, and ignores any other chunk.

Note that the character '3' only shows up in one place in the file - the year, as written in the signature.
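
To make the replacement rule concrete, here is a small sketch of the same `decode` logic run over the first lines of the demo file. The line-by-line chunking and the short `REPLACEMENT` payload are assumptions for illustration; how the interpreter actually splits the file into chunks is not shown in the gist.

```python
import codecs

# Hypothetical stand-in for the real payload string in 1.py.
REPLACEMENT = 'print("The End.")\n'


def decode(data, errors="strict"):
    # Same rule as the gist: a chunk containing "3" becomes the payload,
    # any other chunk is dropped.
    text, consumed = codecs.utf_8_decode(data, errors)
    if "3" in text:
        return REPLACEMENT, consumed
    return "", consumed


# Assumed chunking: one chunk per line of the demo file.
chunks = [
    b'# "The Worst Python Techniques You Might Get Away With"\n',
    b"# rate my coding: 1/10 usefulness, 10/10 style\n",
    b"# UWCS Lightning Talks - Andrew Lees, 2023\n",
    b"The Zen of Python:\n",
]

decoded = "".join(decode(chunk)[0] for chunk in chunks)
print(decoded)
```

Only the chunk with the year contains a "3", so the decoded source is the payload and nothing else.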

@thejoeejoee

wow, well done! 👏

@whamer100

> When saved as /usr/lib64/python3.11/encodings/1.py (i.e. within the Python installation's directory of encodings), Python will treat the second line (containing ...coding: 1/10...) as specifying the source encoding to be located in 1.py.
>
> The custom encoding replaces any chunk containing the character '3' with our actual program, and ignores any other chunk.
>
> Note that the character '3' only shows up in one place in the file - the year, as written in the signature.

oh my god that's so good, I was wondering how this executed code from a comment. that's actually genius LOL

@alexpyattaev

One legit use case for this is DSLs. You can actually use your very own DSL in pure Python by writing a custom encoding module.
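
A rough sketch of that idea, under stated assumptions: the codec name `toydsl`, the `fn` keyword, and the helpers `toy_decode` and `toy_search` are all made up for illustration. It registers the codec at runtime with `codecs.register` instead of installing a module into the encodings directory.

```python
import codecs
import re


def toy_decode(data, errors="strict"):
    # Decode as UTF-8, then rewrite the hypothetical DSL keyword "fn"
    # into Python's "def".
    text, consumed = codecs.utf_8_decode(bytes(data), errors, True)
    return re.sub(r"\bfn\b", "def", text), consumed


def toy_search(name):
    # Search function for codecs.register(): answer only for our name.
    if name != "toydsl":
        return None
    return codecs.CodecInfo(
        name="toydsl",
        encode=codecs.utf_8_encode,
        decode=toy_decode,
    )


codecs.register(toy_search)

# Decode DSL source into ordinary Python and run it.
source = codecs.decode(b"fn add(a, b):\n    return a + b\n", "toydsl")
namespace = {}
exec(source, namespace)
print(namespace["add"](2, 3))
```

For a source file to pick this up through a coding declaration, the codec would have to be discoverable before that file is compiled, which is what placing the module in the encodings directory achieves in the gist.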

@NeatNit

NeatNit commented Jun 28, 2023

> [...] Python will treat the second line (containing ...coding: 1/10...) as specifying the source encoding to be located in 1.py.

Took me ages to find out what the hell was going on here (I am not well versed in the field of torturing programming languages). Finally found this: https://docs.python.org/3/tutorial/interpreter.html#source-code-encoding

According to this doc, this is how to specify the format:

  1. You must add a comment line in exactly this format: # -*- coding: encoding -*-
  2. This line must be the first line of the file, unless the first line is a unix shebang, in which case the encoding must be the second line.

This clearly fails both counts: it's not in that format, and it's the second line while the first is not a shebang. Therefore the CPython interpreter must have a looser requirement.

Digging further, I found this more authoritative reference: https://docs.python.org/3/reference/lexical_analysis.html#encoding-declarations, which states (emphasis mine):

> If a comment in the first or second line of the Python script matches the regular expression coding[=:]\s*([-\w.]+), this comment is processed as an encoding declaration; the first group of this expression names the encoding of the source code file. The encoding declaration must appear on a line of its own. If it is the second line, the first line must also be a comment-only line. The recommended forms of an encoding expression are
>
> # -*- coding: <encoding-name> -*-
>
> which is recognized also by GNU Emacs, and
>
> # vim:fileencoding=<encoding-name>
>
> which is recognized by Bram Moolenaar’s VIM.

So I guess that first thing I found, which was a tutorial page, deliberately gave a stricter rule to not encourage people to do the awful things you've done.
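
The quoted regular expression can be tried directly against the demo's header lines. This is only a sketch with Python's `re` module and a hypothetical helper `declared_encoding`; the interpreter's own matching may differ in detail.

```python
import re

# The pattern exactly as quoted from the language reference.
CODING_RE = re.compile(r"coding[=:]\s*([-\w.]+)")


def declared_encoding(line):
    # Return the encoding name the pattern captures, or None if no match.
    match = CODING_RE.search(line)
    return match.group(1) if match else None


demo_line = "# rate my coding: 1/10 usefulness, 10/10 style"
emacs_line = "# -*- coding: utf-8 -*-"
vim_line = "# vim:fileencoding=latin-1"
other_line = "# UWCS Lightning Talks - Andrew Lees, 2023"

for line in (demo_line, emacs_line, vim_line, other_line):
    print(repr(declared_encoding(line)), "<-", line)
```

The capture group stops at the `/` in `1/10`, since `/` is not in `[-\w.]`, which is why the demo's second line names an encoding called `1`.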
