@vishvananda
Last active September 9, 2020 16:11
Using OpenSSL from Python with python-cffi

Introduction

A few weeks ago I stumbled across a thread on Hacker News that referenced the Matasano Crypto Challenge. I find myself unable to resist this type of problem, so I decided to make an attempt. It teaches you to find vulnerabilities in crypto systems by starting with simple attacks and building up to more complex ones. Early on in the project it has you start breaking encryption that uses the AES cipher in ECB mode. It specifically asks you not to implement the cipher yourself but to use a known-correct implementation like OpenSSL.

I tend to solve programming challenges in Python, because the coding goes much more quickly. I checked the pyOpenSSL docs (which I have used before) to determine the call for encryption in ECB mode. Unfortunately, pyOpenSSL only exposes a small subset of the OpenSSL API, mostly relating to certificates, so it was a non-starter.

Scripting to the Rescue

It wasn't the first time I had to deal with a lack of features in pyOpenSSL, and, hey, Python is a scripting language, right? A few minutes later I had a version that implemented the calls by shelling out to the openssl binary:

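import binascii
import os
import subprocess
import tempfile

def _encrypt_decrypt_ecb_shell(data, key, decrypt):
    # mkstemp creates the files atomically, unlike the race-prone mktemp
    infd, infile = tempfile.mkstemp()
    outfd, outfile = tempfile.mkstemp()
    os.close(outfd)
    try:
        with os.fdopen(infd, "wb") as f:
            f.write(data)
        cmd = ['openssl', 'enc']
        if decrypt:
            cmd.append('-d')
        cmd += ['-aes-128-ecb',
                '-nopad',
                '-in', infile,
                '-out', outfile,
                # hex-encode the key; this form works on Python 2 and 3
                '-K', binascii.hexlify(key).decode("ascii")]
        # check_call raises if openssl exits with a non-zero status
        subprocess.check_call(cmd)
        # read in binary mode so the output bytes come back unmodified
        with open(outfile, "rb") as f:
            return f.read()
    finally:
        os.remove(infile)
        os.remove(outfile)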

def encrypt_ecb_shell(data, key):
    return _encrypt_decrypt_ecb_shell(data, key, False)

def decrypt_ecb_shell(data, key):
    return _encrypt_decrypt_ecb_shell(data, key, True)

This code is simple and got me through the first few exercises, but somewhere in the second set I needed to predictively decrypt a string many thousands of times. The overhead of shelling out meant it took multiple minutes to break the encryption.
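To get a feel for where that overhead comes from, the per-call cost of starting a process can be measured on its own. The following is a rough sketch, not part of the original script: it spawns a no-op child Python interpreter as a stand-in for `openssl enc` (an assumption made so the example runs without OpenSSL installed), and the helper names `time_calls`, `spawn_noop`, and `inprocess_noop` are made up for illustration. The absolute numbers will differ from the openssl case.

```python
import subprocess
import sys
import timeit

def time_calls(fn, count):
    """Return the average wall-clock seconds per call of fn over count runs."""
    start = timeit.default_timer()
    for _ in range(count):
        fn()
    return (timeit.default_timer() - start) / count

def spawn_noop():
    # Start a child process that does nothing, standing in for `openssl enc`.
    subprocess.check_call([sys.executable, "-c", "pass"])

def inprocess_noop():
    # Stand-in for a direct library call made inside the current process.
    return bytes(16)

spawn_cost = time_calls(spawn_noop, 5)
inprocess_cost = time_calls(inprocess_noop, 1000)
print("spawn: %.6fs per call, in-process: %.9fs per call"
      % (spawn_cost, inprocess_cost))
```

Multiplying the per-spawn figure by many thousands of calls shows how the shell version ends up taking minutes.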

A Better Option

It is likely that one of the other libraries like pycrypto supports AES in ECB mode, but I like to treat roadblocks as an opportunity to experiment with unfamiliar technologies.

In one of my previous PyPy investigations I had come across a library called cffi, which purports to be an easier way to interface with C code than using ctypes. You can even directly embed C in your Python! I suspected that a few lines of code would give me a version that directly calls the EVP encryption code in the OpenSSL C library.

Including research and experimentation, the implementation took a couple of hours and resulted in a couple dozen lines of code. Read on for an explanation.

Explanation

First we import the module and construct the cffi object:

import cffi

_FFI = cffi.FFI()

Next we define the functions we want to be able to call from Python. We could directly declare functions contained in libraries, but the EVP code requires multiple calls. To avoid round-trips between C and Python we are going to encapsulate them in our own functions:

_FFI.cdef("""
int encrypt_ecb(unsigned char * input, unsigned char * output,
                unsigned char * key, int len);

int decrypt_ecb(unsigned char * input, unsigned char * output,
                unsigned char * key, int len);
""")

Then we use verify() to embed the C code for our encapsulation functions. The embedded source includes the header where the EVP functions are declared:

_C = _FFI.verify("""
#include <openssl/evp.h>

int encrypt_ecb(unsigned char * input, unsigned char * output,
                unsigned char * key, int len)
{
  int outlen = 0, finallen = 0, ok;
  /* Heap-allocate the context: EVP_CIPHER_CTX is opaque in OpenSSL 1.1.0+ */
  EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
  if(!ctx) return 0;
  ok = EVP_EncryptInit_ex(ctx, EVP_aes_128_ecb(), NULL, key, NULL)
    && EVP_CIPHER_CTX_set_padding(ctx, 0)
    && EVP_EncryptUpdate(ctx, output, &outlen, input, len)
    && EVP_EncryptFinal_ex(ctx, output + outlen, &finallen);
  EVP_CIPHER_CTX_free(ctx);  /* released on success and on failure */
  return ok ? outlen + finallen : 0;
}

int decrypt_ecb(unsigned char * input, unsigned char * output,
                unsigned char * key, int len)
{
  int outlen = 0, finallen = 0, ok;
  EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
  if(!ctx) return 0;
  ok = EVP_DecryptInit_ex(ctx, EVP_aes_128_ecb(), NULL, key, NULL)
    && EVP_CIPHER_CTX_set_padding(ctx, 0)
    && EVP_DecryptUpdate(ctx, output, &outlen, input, len)
    && EVP_DecryptFinal_ex(ctx, output + outlen, &finallen);
  EVP_CIPHER_CTX_free(ctx);  /* released on success and on failure */
  return ok ? outlen + finallen : 0;
}
""", libraries=["crypto"], extra_compile_args=['-Wno-deprecated-declarations'])

Note that we specify the library "crypto" above. This ensures that the linker will link our code with libcrypto, OpenSSL's crypto library. The deprecated-declarations warning suppression is passed to the compiler because Apple has decided to mark all of the OpenSSL EVP routines as deprecated so people will use higher-level encryption functions. The code works fine without the extra_compile_args, but will emit a bunch of warnings on OS X the first time you run it.

The last step is simply calling these functions via the object we created with verify(). Note that we create a buffer for the resulting data and copy it into a Python byte string using the buffer method. The string method is not suitable here: it stops at the first null byte, and ciphertext can contain null bytes.

def encrypt_ecb_cffi(data, key):
    datalen = len(data)
    out = _FFI.new("unsigned char[]", datalen)
    num = _C.encrypt_ecb(data, out, key, datalen)
    # buffer() exposes exactly num bytes; slicing copies them out
    return _FFI.buffer(out, num)[:]

def decrypt_ecb_cffi(data, key):
    datalen = len(data)
    out = _FFI.new("unsigned char[]", datalen)
    num = _C.decrypt_ecb(data, out, key, datalen)
    # buffer() exposes exactly num bytes; slicing copies them out
    return _FFI.buffer(out, num)[:]

Note that we have to explicitly pass the length of the data into the function because C strings are null terminated and the data to encrypt may contain null bytes. It isn't strictly necessary to return the length from the encryption functions because aes_128_ecb with padding disabled will always produce a blob the same length as the input data (which must itself be a multiple of the 16-byte block size), but I included it to show how easy it is to get the return value of the function back into Python.
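The null-byte caveat can be seen without OpenSSL or cffi at all. As a small illustration, the sketch below uses the standard library's ctypes (a substitution for cffi, chosen only so the example has no dependencies) and a made-up five-byte value standing in for cipher output. Reading the buffer as a C string loses everything after the first null, while reading it with an explicit length keeps it all.

```python
import ctypes

# Five bytes with a null in the middle, standing in for raw cipher output.
raw = b"ab\x00cd"
buf = ctypes.create_string_buffer(raw, len(raw))

as_c_string = buf.value                        # stops at the first null byte
with_length = ctypes.string_at(buf, len(raw))  # copies exactly len(raw) bytes

print(repr(as_c_string))
print(repr(with_length))
```

The same reasoning applies on both sides of the call: the C function needs the length passed in, and the Python side needs a length-based read to get the bytes back out.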

How Does it Work?

If you run code using cffi, it will create a __pycache__ directory. Inside this directory you will find some autogenerated C code that includes the code that you have written and some translation code to munge Python types into C types and vice versa. You will also find a custom-built library module (.dll or .so) that starts with _cffi_. When you run the Python code for the first time it compiles this module, and future runs just call into it.
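The compile-once behaviour can be pictured with a toy cache. This is only a simplified illustration of the idea, not cffi's actual code: it assumes the built artifact is keyed on a hash of the source text, and `cached_build`, `fake_compiler`, and the `_cffi_sketch_` file name are all hypothetical names invented for the example.

```python
import hashlib
import os
import tempfile

def cached_build(source, cache_dir, build):
    """Build `source` once; later calls reuse the artifact keyed by its hash."""
    digest = hashlib.sha1(source.encode("utf-8")).hexdigest()
    # Hypothetical file name; a real extension module would be a .so or .dll.
    path = os.path.join(cache_dir, "_cffi_sketch_%s.bin" % digest)
    if not os.path.exists(path):
        if not os.path.isdir(cache_dir):
            os.makedirs(cache_dir)
        with open(path, "wb") as f:
            f.write(build(source))
    return path

builds = []

def fake_compiler(source):
    # Hypothetical stand-in for invoking the C compiler.
    builds.append(source)
    return b"compiled:" + source.encode("utf-8")

cache_dir = os.path.join(tempfile.mkdtemp(), "__pycache__")
c_source = "int f(void) { return 1; }"
first = cached_build(c_source, cache_dir, fake_compiler)
second = cached_build(c_source, cache_dir, fake_compiler)
print(first == second, len(builds))
```

The second call finds the file already present and skips the build step, which is why only the first run of the script pays the compilation cost.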

Results

So how much faster is it to encrypt and decrypt via the library than by shelling out?

$ python openssl-cffi.py
Verifying encryption...
VERIFIED
Profiling shell version...
SHELL: 2.01278114319
Profiling cffi version...
CFFI: 0.000946998596191

Using cffi is roughly 2125x faster than shelling out.

In other words, enough to make the change worth it. The example code can be found on my GitHub.
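The speedup figure follows directly from the two timings printed above. As a quick check, using only those two numbers (the variable names are mine):

```python
# Timings copied from the profiling output, in seconds.
shell_seconds = 2.01278114319
cffi_seconds = 0.000946998596191

speedup = shell_seconds / cffi_seconds
print("cffi is about %dx faster" % round(speedup))
```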
