@cameronmaske
Last active February 5, 2024 19:16
base64 that actually encodes URL safe (no '=' nonsense)
"""
base64's `urlsafe_b64encode` uses '=' as padding.
'=' is not URL safe when used in URL parameters.
The functions below work around this by stripping the padding and adding it back in.
See:
https://docs.python.org/2/library/base64.html
https://mail.python.org/pipermail/python-bugs-list/2007-February/037195.html
"""
import base64


def base64_encode(string):
    """
    Removes any `=` used as padding from the encoded string.
    """
    encoded = base64.urlsafe_b64encode(string)
    # Python 2 str; on Python 3 the result is bytes, so use b"=" here.
    return encoded.rstrip("=")


def base64_decode(string):
    """
    Adds back in the required padding before decoding.
    """
    # -len(string) % 4 is 0 when the length is already a multiple of 4.
    padding = -len(string) % 4
    string = string + ("=" * padding)
    return base64.urlsafe_b64decode(string)
>>> test = "helloworld"
>>> base64_encode(test)
'aGVsbG93b3JsZA'
>>> e = base64_encode(test)
>>> base64_decode(e)
'helloworld'
>>> test = "Hello World"
>>> encoded = base64_encode(test)
>>> print encoded
SGVsbG8gV29ybGQ
>>> decoded = base64_decode(encoded)
>>> decoded
'Hello World'
>>> decoded == test
True
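
To see why the stripped padding can always be recovered, it helps to tabulate the lengths involved. The following is a minimal Python 3 sketch, not part of the original gist; `unpadded_length` is a helper name made up for this illustration. It checks that the number of `=` characters removed depends only on the stripped length, and equals `-n % 4`.

```python
import base64


def unpadded_length(n_bytes):
    """Length of the URL-safe encoding of n_bytes bytes once '=' is stripped."""
    return len(base64.urlsafe_b64encode(b"x" * n_bytes).rstrip(b"="))


for n_bytes in range(7):
    full = base64.urlsafe_b64encode(b"x" * n_bytes)
    stripped = full.rstrip(b"=")
    removed = len(full) - len(stripped)
    # The number of '=' to add back is determined by the stripped length alone.
    assert removed == -len(stripped) % 4
    # Columns: input bytes, stripped length, stripped length mod 4, '=' removed.
    print(n_bytes, len(stripped), len(stripped) % 4, removed)
```

The stripped length modulo 4 is only ever 0, 2, or 3, which is why a length that is 1 more than a multiple of 4 can never be valid input.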
@jsrpy

jsrpy commented Dec 5, 2018

In Python 3, base64.urlsafe_b64encode returns a bytes object, so rstrip() needs to be passed a bytes-like object as well.

encoded.rstrip(b"=") works for me.
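
A minimal sketch of what a Python 3 port along these lines might look like. The type hints and the str-or-bytes handling in `base64_decode` are assumptions added for illustration and are not in the gist.

```python
import base64


def base64_encode(data: bytes) -> bytes:
    """URL-safe base64 without the trailing '=' padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=")


def base64_decode(data) -> bytes:
    """Accept str or bytes, restore the padding, then decode."""
    if isinstance(data, str):
        data = data.encode("ascii")
    # -len(data) % 4 is the number of '=' characters that were stripped.
    return base64.urlsafe_b64decode(data + b"=" * (-len(data) % 4))


print(base64_encode(b"helloworld"))     # b'aGVsbG93b3JsZA'
print(base64_decode("aGVsbG93b3JsZA"))  # b'helloworld'
```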

@ppittavi

Thanks a lot!
This suggestion saved me!

@zred

zred commented Sep 20, 2021

Padding cannot be recovered if encoded strings are concatenated:

>>> decode_base64(encode_base64(b'a') + encode_base64(b'aa'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in b64d
  File "/usr/lib/python3.8/base64.py", line 133, in urlsafe_b64decode
    return b64decode(s)
  File "/usr/lib/python3.8/base64.py", line 87, in b64decode
    return binascii.a2b_base64(s)
binascii.Error: Invalid base64-encoded string: number of data characters (5) cannot be 1 more than a multiple of 4
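
One possible way around this, sketched under the assumption that the caller controls how values are joined, is to separate the unpadded values with a character outside the URL-safe base64 alphabet, such as `.`, and decode each piece on its own. The helper names below (`encode_token`, `decode_token`, `join_tokens`, `split_tokens`) are hypothetical.

```python
import base64


def encode_token(data: bytes) -> str:
    """URL-safe base64 with the '=' padding stripped."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def decode_token(token: str) -> bytes:
    """Restore the padding for a single token, then decode it."""
    return base64.urlsafe_b64decode(token + "=" * (-len(token) % 4))


def join_tokens(parts):
    # "." never appears in the URL-safe base64 alphabet, so it can act as a separator.
    return ".".join(encode_token(p) for p in parts)


def split_tokens(joined: str):
    # Each piece is padded and decoded separately, so no padding is ambiguous.
    return [decode_token(t) for t in joined.split(".")]


joined = join_tokens([b"a", b"aa"])
print(joined)                # YQ.YWE
print(split_tokens(joined))  # [b'a', b'aa']
```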

@BrandonDavidDee

Thank you for this. I was implementing it in a project and tweaked it for my setup; this is what I have now:

# modified for 3.8.12
import base64
import random
import string


def make_random_string() -> str:
    return "".join(random.choice(string.ascii_lowercase) for x in range(20))


def base64_encode(value: str) -> str:
    encoded = base64.urlsafe_b64encode(str.encode(value))
    result = encoded.rstrip(b"=")
    return result.decode()


def base64_decode(value: str) -> str:
    padding = 4 - (len(value) % 4)
    value = value + ("=" * padding)
    result = base64.urlsafe_b64decode(value)
    return result.decode()


def encode_decode_test() -> dict:
    original_str = make_random_string()
    encoded_str = base64_encode(original_str)
    decoded_str = base64_decode(encoded_str)
    assert original_str == decoded_str
    return {
        "original_str": original_str,
        "encoded_str": encoded_str,
        "decoded_str": decoded_str,
    }


result = encode_decode_test()
print(result)

@hyp3ri0n-ng

hyp3ri0n-ng commented Oct 17, 2022

While this is a nice workaround, I don't believe it's the "correct" way to do it anymore. Padding isn't the only thing that can break base64 in a URL: the standard alphabet uses 0-9, a-z, A-Z, +, and /. + is a special URL character (it means a space), and / is of course another one, so you'd have to take care of those yourself as well via string replacement. I don't recommend this route, since the relevant standards may change at some point. Better to let the libraries handle all of this for you. There are three things that might help and be safer:

(1) Instead of stripping the padding you can use the following functions:

base64.urlsafe_b64encode(s)
Encode [bytes-like object](https://docs.python.org/3/glossary.html#term-bytes-like-object) s using the URL- and filesystem-safe alphabet, which substitutes - instead of + and _ instead of / in the standard Base64 alphabet, and return the encoded [bytes](https://docs.python.org/3/library/stdtypes.html#bytes). The result can still contain =.

base64.urlsafe_b64decode(s)
Decode [bytes-like object](https://docs.python.org/3/glossary.html#term-bytes-like-object) or ASCII string s using the URL- and filesystem-safe alphabet, which substitutes - instead of + and _ instead of / in the standard Base64 alphabet, and return the decoded [bytes](https://docs.python.org/3/library/stdtypes.html#bytes).

(2) These functions aren't really necessary. The correct way to do this would be to not build your URL from raw base64-encoded values and string concatenation. Instead, do something like:

requests.get("http://example.com", params={"base64_encoded_param": base_64_param})

Requests, along with any other decent URL library, will then percent-encode these for the URL for you. No need to keep track of what needs to be percent-encoded.

(3) Use base58 encoding. The allowed charset is A-Z, a-z, and the digits 1-9: Base58 excludes zero, uppercase 'O', uppercase 'I', and lowercase 'l'. In other words, no padding and no funny characters, just letters and digits. It requires an additional library, as I don't think there's a standard library module yet, but I usually find it well worth it to deal only with ASCII alphanumerics and no special characters.
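
The three options above can be compared side by side with a short sketch. It uses only the standard library: `urllib.parse.urlencode` stands in for what an HTTP client would do with `params`, and the base58 functions are a hand-rolled illustration assuming the Bitcoin-style alphabet. `B58_ALPHABET`, `b58encode`, and `b58decode` are names made up here, not a library API; in practice a maintained third-party package would be the safer choice.

```python
import base64
from urllib.parse import urlencode

raw = b"\xfb\xff"

# (1) Standard vs URL-safe alphabet: '+' and '/' become '-' and '_',
#     but the '=' padding is still present.
print(base64.b64encode(raw))          # b'+/8='
print(base64.urlsafe_b64encode(raw))  # b'-_8='

# (2) Let the URL library percent-encode the value, padding included.
query = urlencode({"base64_encoded_param": base64.b64encode(raw).decode("ascii")})
print(query)  # base64_encoded_param=%2B%2F8%3D

# (3) Base58, assuming the Bitcoin-style alphabet (no 0, O, I, or l).
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    n = int.from_bytes(data, "big")
    out = ""
    while n:
        n, rem = divmod(n, 58)
        out = B58_ALPHABET[rem] + out
    # Each leading zero byte is represented by one leading '1'.
    zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * zeros + out


def b58decode(text: str) -> bytes:
    n = 0
    for ch in text:
        n = n * 58 + B58_ALPHABET.index(ch)
    zeros = len(text) - len(text.lstrip("1"))
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return b"\x00" * zeros + body


token = b58encode(b"helloworld")
print(token.isalnum(), b58decode(token) == b"helloworld")  # True True
```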

@AveragePythonEnjoyer29

AveragePythonEnjoyer29 commented Mar 22, 2023

Here are 2 one-liners for encoding and decoding:

(lambda string: urlsafe_b64encode(string).strip(b"="))(b"this will be converted into base64!")
(lambda string: urlsafe_b64decode((string+(b"="*(4-(len(string)%4))))))(b"dGhpcyB3aWxsIGJlIGNvbnZlcnRlZCBpbnRvIGJhc2U2NCE")

Examples:

Encoding:

>>> (lambda string: urlsafe_b64encode(string).strip(b"="))(b"this will be converted into base64!")
b'dGhpcyB3aWxsIGJlIGNvbnZlcnRlZCBpbnRvIGJhc2U2NCE'

Decoding:

>>> (lambda string: urlsafe_b64decode((string+(b"="*(4-(len(string)%4))))))(b'dGhpcyB3aWxsIGJlIGNvbnZlcnRlZCBpbnRvIGJhc2U2NCE')
b'this will be converted into base64!'

Note

This requires base64's urlsafe_b64encode and urlsafe_b64decode to be imported like this:

from base64 import urlsafe_b64encode, urlsafe_b64decode

This is easy to change: just update the function calls inside each lambda (right after `lambda string:`).
