@perrygeo
Last active October 25, 2023 16:20
Avoiding TypeError: Incorrect padding with Python's base64 encoding

Avoiding padding errors with Python's base64 encoding

>>> import base64
>>> data = '{"u": "test"}'
>>> code = base64.b64encode(data)
>>> code
'eyJ1IjogInRlc3QifQ=='

Note the trailing ==, which pads the encoded length to a multiple of 4. This decodes properly:

>>> len(code)
20
>>> base64.b64decode(code)
'{"u": "test"}'
>>> base64.b64decode(code) == data
True

Without the == padding (many things, e.g. access tokens, are encoded this way), decoding fails:

>>> base64.b64decode(code[0:18]) == data
...
TypeError: Incorrect padding 
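
The session above is Python 2. As a rough sketch of the same steps on Python 3, assuming a reasonably recent interpreter: b64encode needs bytes rather than str, and the missing padding surfaces as binascii.Error (a subclass of ValueError) instead of TypeError.

```python
import base64
import binascii

data = b'{"u": "test"}'            # bytes, not str, on Python 3
code = base64.b64encode(data)       # same 20 characters as above, as bytes

# Dropping the two trailing '=' leaves 18 characters, which fails to decode.
try:
    base64.b64decode(code[0:18])
    error = None
except binascii.Error as exc:
    error = exc

print(repr(error))
```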

However, you can add back the padding

>>> base64.b64decode(code + "==") == data
True

Or add an arbitrary amount of padding (it will ignore extraneous padding)

>>> base64.b64decode(code + "========") == data
True

This last property of Python's base64 decoding means that appending three = characters, as below, avoids the padding TypeError for any otherwise valid base64 string, whether or not it was already padded, and always produces the same result.

>>> base64.b64decode(code + "===") == data
True

It's a clumsy but effective way to deal with strings from different implementations of base64 encoders.
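
The trick can be wrapped in a small helper. This is a minimal sketch for Python 3 (where inputs are bytes); decode_lenient is a hypothetical name, not part of the base64 module, and the loop only checks a handful of short payloads.

```python
import base64

def decode_lenient(code: bytes) -> bytes:
    """Decode base64 whether or not the trailing '=' padding is present."""
    # At most two '=' can be missing; appending three, as in the gist,
    # is harmless because b64decode ignores the surplus.
    return base64.b64decode(code + b"===")

# Round-trip payloads of several lengths, with and without padding.
for n in range(10):
    payload = bytes(range(n))
    padded = base64.b64encode(payload)
    stripped = padded.rstrip(b"=")
    assert decode_lenient(padded) == payload
    assert decode_lenient(stripped) == payload
```

Input whose unpadded length is one more than a multiple of 4 is not valid base64, so it still raises an error no matter how much padding is appended.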

@criptych

Or add an arbitrary amount of padding (it will ignore extraneous padding)

Unfortunately, b32decode doesn't do the same. 😞
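
A small sketch of that difference on Python 3, with one possible workaround: strip whatever padding is present and pad back to a multiple of 8. The helper name b32decode_exact is hypothetical, not part of the base64 module.

```python
import base64
import binascii

encoded = base64.b32encode(b"test")       # padded to a multiple of 8 characters

# Surplus padding that b64decode would ignore is rejected by b32decode.
try:
    base64.b32decode(encoded + b"=" * 8)
    surplus_tolerated = True
except binascii.Error:
    surplus_tolerated = False

def b32decode_exact(code: bytes) -> bytes:
    """Strip any existing padding, then pad back to a multiple of 8."""
    stripped = code.rstrip(b"=")
    return base64.b32decode(stripped + b"=" * (-len(stripped) % 8))

print(surplus_tolerated, b32decode_exact(encoded.rstrip(b"=")))
```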

@koorukuroo

Thank you so much!

@cgthayer

Thanks :)

@aakashkaji

very helpful solution

@Bitcoin3ra

Really helpful. Thank you. I did base64.b64encode and then again base64.b64decode and this solved the issue (python 3.5.2).

@Akendo

Akendo commented Aug 16, 2019

I think you shouldn't add an arbitrary amount of padding characters. This might work in this specific case. However, not every base64 implementation will honour this. For example, the Linux base64 tool returns an error message when too many padding characters are found:

echo "SGVsbG8gd29ybGQK==" |base64 -d
Hello world
base64: invalid input

The underlying implementation assumes that each chunk holds a fixed 24 bits of information (three bytes, encoded as four characters). When the encoded data does not fill the last chunk, padding becomes necessary. Therefore you should add only as many padding characters as are needed to complete the chunk.
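
The chunk arithmetic can be checked directly. In this sketch for Python 3, padding_needed is a hypothetical helper that returns how many = characters complete the final four-character group, and the loop compares it with what the encoder actually emits for a few input sizes.

```python
import base64

def padding_needed(unpadded_len: int) -> int:
    """'=' characters required to complete the final 4-character group."""
    return -unpadded_len % 4

for n_bytes in range(1, 7):
    encoded = base64.b64encode(b"x" * n_bytes).decode("ascii")
    unpadded = encoded.rstrip("=")
    # The padding the encoder emitted matches the computed amount.
    assert len(encoded) - len(unpadded) == padding_needed(len(unpadded))
    print(n_bytes, encoded, padding_needed(len(unpadded)))
```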

ghost commented Oct 7, 2019

That is genuinely a horrible but beautiful thing. It saves so much extra padding code!

@PhatDang

Thank you very much!

@geronsv

geronsv commented Nov 14, 2019

Very helpful!

@nimatt

nimatt commented Apr 14, 2020

Didn't feel quite right to pad more than needed, so I ended up doing this (Python 3, where / is true division):

import base64
import math
code = code.ljust(math.ceil(len(code) / 4) * 4, '=')
data = base64.b64decode(code)

@monkut

monkut commented Jul 30, 2020

A similar way to add the necessary padding:

code_with_padding = f"{code}{'=' * (len(code) % 4)}"

@mohitgupta3

You saved my bacon buddy. Thank you.

@bobobox

bobobox commented Oct 5, 2020

A similar way to add the necessary padding:

code_with_padding = f"{code}{'=' * (len(code) % 4)}"

That is not quite right. Modulo alone isn't going to give you the number of padding characters you need. This works:

code_with_padding = f"{code}{'=' * ((4 - len(code) % 4) % 4)}"
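
To see where the two expressions diverge, here is a short comparison sketch for Python 3. The function names pad_modulo and pad_exact are made up for illustration; they wrap the two one-liners quoted above.

```python
import base64

def pad_modulo(code: str) -> str:
    # The first suggestion: pads by len % 4.
    return f"{code}{'=' * (len(code) % 4)}"

def pad_exact(code: str) -> str:
    # The corrected version: pads up to the next multiple of 4.
    return f"{code}{'=' * ((4 - len(code) % 4) % 4)}"

for payload in (b"a", b"ab", b"abc"):
    reference = base64.b64encode(payload).decode("ascii")
    stripped = reference.rstrip("=")
    print(stripped, pad_modulo(stripped), pad_exact(stripped), reference)
```

The two agree when the stripped length leaves a remainder of 0 or 2, and differ at remainder 3, where the modulo version appends three = instead of one. Python's b64decode tolerates the surplus, which is probably why the difference is easy to miss, but a stricter decoder may reject it.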

@ybhan

ybhan commented May 14, 2021

Thank you!

@HolmonAlp

I don't understand this theme

@sarxhaUFT

I don't know what black magic this is, but it works. Thanks

@orichcasperkevin

Works like a charm
