Writeup by @auscompgeek.
Disclaimer: I did not participate in CySCA 2017. I simply saw this challenge (and a number of people struggling over it) and found it interesting. As such, I do not have the full description of the challenge.
Competitors were provided with a file, tree.py. I have attached it for posterity.
Before attempting to solve this challenge, note that the shebang line tells us this is a Python 2 program.
The provided Python program contains a dump of a Python AST of another Python program.
Thankfully, there's a PyPI package called astor that can take an AST and dump corresponding Python source code.
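To make the idea of turning a dumped AST back into source concrete, here is a minimal sketch. It is not what I ran against the challenge: it assumes Python 3.9 or newer and uses the standard library's ast.unparse in place of astor, and the tiny `double` function is a made-up stand-in for the real program. A Python 2 dump like the challenge's will not necessarily load on Python 3.

```python
import ast

# A small stand-in program (hypothetical); the challenge's real AST is far larger.
original_source = "def double(x):\n    return x * 2\n"

# ast.dump() produces a text dump made of node constructor calls, similar in
# shape to the Module(...) expression embedded in the challenge file.
dumped = ast.dump(ast.parse(original_source))

# Evaluating the dump with the ast module's names in scope rebuilds the tree.
# Only do this with dumps you trust, since eval() can run arbitrary code.
tree = eval(dumped, vars(ast).copy())

# The rebuilt nodes carry no line numbers, so fill them in before unparsing.
ast.fix_missing_locations(tree)
recovered_source = ast.unparse(tree)
print(recovered_source)

# Sanity check: the recovered source should behave like the original.
namespace = {}
exec(recovered_source, namespace)
print(namespace["double"](21))
```

The recovered text may differ from the original in whitespace or parenthesisation, which is why the check runs the code rather than comparing strings.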
How did I know about astor in the first place though? Let's take a tangent for a second.
I have an interest in languages that run on top of other language implementations (e.g. Scala runs on the Java VM). I know of a couple of other languages that run on CPython: dg (a Haskell-inspired language which targets CPython bytecode) and hy (a Lisp that compiles to Python). hy happens to generate a Python AST as part of compiling, and uses astor to allow for dumping the generated Python source.
Back to the challenge: here is the script I used to feed the dump to astor.
#!/usr/bin/python2.7
from ast import *
import astor
tree = Module(...) # entire AST text dump goes here
print astor.to_source(tree)
Running this with astor 0.5, however, results in an error:
Traceback (most recent call last):
  File "/home/seedbox/ctf/trees.py", line 59, in <module>
    print(astor.to_source(m))
  File "/usr/lib/python2.7/site-packages/astor/codegen.py", line 40, in to_source
    generator.visit(node)
  File "/usr/lib/python2.7/site-packages/astor/misc.py", line 161, in visit
    return visitor(node)
  File "/usr/lib/python2.7/site-packages/astor/codegen.py", line 493, in visit_Module
    self.visit(stmt)
  File "/usr/lib/python2.7/site-packages/astor/misc.py", line 161, in visit
    return visitor(node)
  File "/usr/lib/python2.7/site-packages/astor/codegen.py", line 189, in visit_FunctionDef
    self.body(node.body)
  File "/usr/lib/python2.7/site-packages/astor/codegen.py", line 99, in body
    self.visit(stmt)
  File "/usr/lib/python2.7/site-packages/astor/misc.py", line 161, in visit
    return visitor(node)
  File "/usr/lib/python2.7/site-packages/astor/codegen.py", line 159, in visit_Assign
    self.visit(node.value)
  File "/usr/lib/python2.7/site-packages/astor/misc.py", line 161, in visit
    return visitor(node)
  File "/usr/lib/python2.7/site-packages/astor/codegen.py", line 48, in newfunc
    func(self, node)
  File "/usr/lib/python2.7/site-packages/astor/codegen.py", line 455, in visit_Lambda
    self.write(': ', node.body)
  File "/usr/lib/python2.7/site-packages/astor/codegen.py", line 73, in write
    self.visit(item)
  File "/usr/lib/python2.7/site-packages/astor/misc.py", line 161, in visit
    return visitor(node)
  File "/usr/lib/python2.7/site-packages/astor/codegen.py", line 363, in visit_Call
    self.write(write_comma, arg)
  File "/usr/lib/python2.7/site-packages/astor/codegen.py", line 73, in write
    self.visit(item)
  File "/usr/lib/python2.7/site-packages/astor/misc.py", line 161, in visit
    return visitor(node)
  File "/usr/lib/python2.7/site-packages/astor/codegen.py", line 462, in visit
    self.write(left, node.elt)
  File "/usr/lib/python2.7/site-packages/astor/codegen.py", line 73, in write
    self.visit(item)
  File "/usr/lib/python2.7/site-packages/astor/misc.py", line 161, in visit
    return visitor(node)
  File "/usr/lib/python2.7/site-packages/astor/codegen.py", line 366, in visit_Call
    self.conditional_write(write_comma, '*', node.starargs)
AttributeError: 'Call' object has no attribute 'starargs'
Interestingly enough, ast.Call objects have starargs and kwargs fields in Python < 3.5. If they are not specified when the node is constructed, however, the attributes simply don't exist on the object, and the compiler doesn't care.
Luckily, the git version of astor does not have this bug. (I initially patched the PyPI version of astor myself, but upon looking into this, I discovered that CPython 3.5 removed these fields in implementing PEP 448, so astor had to be fixed for this change.)
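The general pattern for avoiding this kind of AttributeError is to read optional fields defensively. The following is a minimal sketch of that pattern, not astor's actual fix. It assumes Python 3.5 or newer, where Call nodes never carry starargs at all, and `optional_field` is a hypothetical helper name of my own.

```python
import ast


def optional_field(node, name):
    """Return node.<name>, or None when the attribute was never set.

    Hypothetical helper: a code generator could use this instead of
    accessing node.starargs or node.kwargs directly.
    """
    return getattr(node, name, None)


# A hand-built call node for f(), constructed without starargs or kwargs.
call = ast.Call(func=ast.Name(id="f", ctx=ast.Load()), args=[], keywords=[])

# Direct attribute access would raise AttributeError here, as in the traceback.
print(hasattr(call, "starargs"))

# The defensive read falls back to None instead of raising.
print(optional_field(call, "starargs"))
```

A generator written this way treats a missing field and an explicit None identically, which matches how the compiler handles them.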
Installing the git version of astor and running our script results in the following code:
def main(value):
    convert = lambda *nums: ''.join(chr(x) for x in nums)
    lib = convert(104, 97, 115, 104, 108, 105, 98)
    attr = convert(109, 100, 53)
    method = convert(100, 105, 103, 101, 115, 116)
    if getattr(getattr(__import__(lib), attr)(value), method)()[::-1
        ] != 'CN\x9f\x1e\xa0\x0e{\x8a\x86\xc4\x8f\xf7\xe6\xf5d\x1d':
        raise ValueError('Wrong value!')
Yay, more obfuscated Python code!
The lib, attr, and method names are simply given as ASCII numbers. Simple to decode, but they've given us the code to do that, so let's just reuse it.
>>> convert = lambda *nums: ''.join(chr(x) for x in nums)
>>> convert(104, 97, 115, 104, 108, 105, 98)
'hashlib'
>>> convert(109, 100, 53)
'md5'
>>> convert(100, 105, 103, 101, 115, 116)
'digest'
Hence, the if condition is basically:
hashlib.md5(value).digest()[::-1] != 'CN\x9f\x1e\xa0\x0e{\x8a\x86\xc4\x8f\xf7\xe6\xf5d\x1d'
The digest() method returns the hash as a bytestring. Notice that this bytestring is reversed before the comparison, so we need to reverse the constant back to get the real digest. This code is therefore looking for a value whose MD5 hash is:
>>> 'CN\x9f\x1e\xa0\x0e{\x8a\x86\xc4\x8f\xf7\xe6\xf5d\x1d'[::-1].encode('hex')
'1d64f5e6f78fc4868a7b0ea01e9f4e43'
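For anyone following along on Python 3, where str.encode('hex') no longer exists, here is a rough equivalent. It is a sketch under my own naming: `REVERSED_DIGEST`, `TARGET_HEX` and `is_correct` are not from the challenge, and the candidate value is a placeholder rather than the answer.

```python
import hashlib

# The reversed digest bytes, copied from the recovered source as a bytes literal.
REVERSED_DIGEST = b'CN\x9f\x1e\xa0\x0e{\x8a\x86\xc4\x8f\xf7\xe6\xf5d\x1d'

# Undo the reversal and hex-encode; bytes.hex() stands in for .encode('hex').
TARGET_HEX = REVERSED_DIGEST[::-1].hex()


def is_correct(candidate: bytes) -> bool:
    """Return True if candidate hashes to the target MD5 digest."""
    return hashlib.md5(candidate).hexdigest() == TARGET_HEX


print(TARGET_HEX)
print(is_correct(b"not the flag"))  # placeholder candidate
```

Having the check as a plain function makes it easy to try candidate values in a loop, for example from a wordlist.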