@hamsolodev
Created July 27, 2010 22:28
import struct

from Crypto.Cipher import AES


def pkcspad(s, block_size=AES.block_size):
    """PKCS#5/#7 padding - return string the right length for a block cipher.

    # the output of pkcspad is 8 bit so use an md5 digest here in the
    # doctests to check output.
    >>> from hashlib import md5

    Padding works like this:

    >>> out1 = pkcspad('mark')
    >>> md5(out1).hexdigest()
    'd8133e2c50049c19f1198969ed7559f1'
    >>> out2 = pkcspad('1')
    >>> md5(out2).hexdigest()
    '82576de2a3f29251c5cc8c768b50a440'
    >>> out3 = pkcspad('hi', block_size=8)
    >>> md5(out3).hexdigest()
    'a5fecdb69c6a4871d8df86d904b16e21'

    Which results in output of the right length for block cipher use:

    >>> len(out1)
    16
    >>> len(out2)
    16
    >>> len(out3)
    8

    Leaves input alone if it was already of suitable length.

    >>> out4 = pkcspad('1234567890123456', AES.block_size)
    >>> out4
    '1234567890123456'
    >>> len(out4)
    16

    It won't let you pad an empty string.

    >>> pkcspad('')
    Traceback (most recent call last):
        ...
    AssertionError: can't pad empty input
    >>> pkcspad(None)
    Traceback (most recent call last):
        ...
    AssertionError: can't pad empty input
    """
    assert bool(s), "can't pad empty input"
    # Note: standard PKCS#7 would append a full block of padding here; this
    # function deliberately returns already-aligned input unchanged.
    if len(s) % block_size == 0:
        return s
    # Number of bytes needed to reach the next block boundary.
    padlen = block_size - (len(s) % block_size)
    # Each padding byte holds the pad length itself.
    padded = s + padlen * struct.pack("@B", padlen)
    assert len(padded) % block_size == 0, "padded output isn't multiple of block_size"
    return padded


def pkcsunpad(s, block_size=AES.block_size):
    """PKCS#5/#7 unpadding - if the input looks padded, strip the padding.

    If the last byte (x) of input is less than block_size then it might be
    a padding byte added by `pkcspad`. If that's really the case then the
    last x bytes of input will be the same. If they are then we've
    clearly encountered some padding, so strip it and return the result. If
    we're not sure, just return the data unchanged.

    >>> out1 = pkcspad('mark')
    >>> pkcsunpad(out1)
    'mark'
    >>> out2 = pkcspad('1')
    >>> pkcsunpad(out2)
    '1'
    >>> out3 = pkcspad('hi', block_size=8)
    >>> pkcsunpad(out3)
    'hi'

    If we encounter something that doesn't look like padding then return it
    unchanged.

    >>> pkcsunpad('1234567890123456')
    '1234567890123456'
    """
    # Slice rather than index so padbyte stays a one-character string.
    padbyte = s[-1:]
    padint = struct.unpack('@B', padbyte)[0]
    # Treat the tail as padding only if the last padint bytes all equal padbyte.
    if padint < block_size and s[-padint:].count(padbyte) == padint:
        return s[:-padint]
    return s
@hamsolodev
Author

I don't think this is safe to use when the input is already a multiple of the block size (so `pkcspad` adds no padding) and its last byte might have a value of 1, because `pkcsunpad` will treat that byte as padding and strip it. Still, it's better than other padding methods.
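
One way around that ambiguity is the standard PKCS#7 rule of always adding padding, so that an aligned input gains a whole extra block and the last byte of padded data is always a pad length. Below is a minimal sketch of that variant, not the code in this gist. It assumes Python 3 `bytes` input and hard-codes a 16-byte block size instead of importing `AES.block_size`. The names `pad_always`, `unpad_strict` and `BLOCK_SIZE` are made up for illustration.

```python
BLOCK_SIZE = 16  # assumption: AES block size, hard-coded to avoid the Crypto import


def pad_always(data, block_size=BLOCK_SIZE):
    """Append 1..block_size padding bytes, each equal to the pad length."""
    padlen = block_size - (len(data) % block_size)
    return data + bytes([padlen]) * padlen


def unpad_strict(data, block_size=BLOCK_SIZE):
    """Strip padding added by pad_always; raise ValueError if it is malformed."""
    if not data or len(data) % block_size != 0:
        raise ValueError("input length is not a positive multiple of block_size")
    padlen = data[-1]
    if not 1 <= padlen <= block_size or data[-padlen:] != bytes([padlen]) * padlen:
        raise ValueError("invalid padding")
    return data[:-padlen]


# A 16-byte message ending in 0x01: the case described in the comment above.
tricky = b"123456789012345\x01"
padded = pad_always(tricky)
assert len(padded) == 32               # a full extra block was appended
assert unpad_strict(padded) == tricky  # the trailing 0x01 survives the round trip
```

The cost is up to one extra block per message. In exchange, unpadding never has to guess whether the tail is padding, and malformed input raises an error instead of passing through unchanged.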
