@barrysteyn
Last active September 29, 2023 14:35
OpenSSL Base64 En/Decode: Portable and binary safe.

OpenSSL Base64 Encoding: Binary Safe and Portable

Herewith is an example of encoding to and decoding from base64 using OpenSSL's C library. The code presented here is both binary safe and portable (i.e. it should work on any POSIX-compliant system, e.g. FreeBSD and Linux).

License

The MIT License (MIT)

Copyright (c) 2013 Barry Steyn

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

//Decodes Base64 (Base64Decode.c)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <openssl/bio.h>
#include <openssl/evp.h>
#include <stdint.h>

size_t calcDecodeLength(const char* b64input) { //Calculates the length of a decoded string
    size_t len = strlen(b64input),
           padding = 0,
           decoded;

    if (len >= 2 && b64input[len-1] == '=' && b64input[len-2] == '=') //last two chars are =
        padding = 2;
    else if (len >= 1 && b64input[len-1] == '=') //last char is =
        padding = 1;

    decoded = (len*3)/4;
    return (decoded > padding) ? decoded - padding : 0; //never underflow on degenerate input
}

int Base64Decode(char* b64message, unsigned char** buffer, size_t* length) { //Decodes a base64 encoded string
    BIO *bio, *b64;
    size_t decodeLen = calcDecodeLength(b64message);
    int readLen;

    *buffer = (unsigned char*)malloc(decodeLen + 1); //one extra byte for the terminator
    if (*buffer == NULL)
        return (-1);

    bio = BIO_new_mem_buf(b64message, -1);
    b64 = BIO_new(BIO_f_base64());
    bio = BIO_push(b64, bio);

    BIO_set_flags(bio, BIO_FLAGS_BASE64_NO_NL); //Do not use newlines to flush buffer
    readLen = BIO_read(bio, *buffer, (int)decodeLen); //never read more than the buffer can hold
    BIO_free_all(bio);

    if (readLen < 0 || (size_t)readLen != decodeLen) { //length should equal decodeLen, else something went wrong
        free(*buffer);
        *buffer = NULL;
        return (-1);
    }

    (*buffer)[readLen] = '\0'; //BIO_read does not NUL-terminate
    *length = (size_t)readLen;
    return (0); //success
}
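//Encodes Base64 (Base64Encode.c)
#include <stdlib.h>
#include <string.h>
#include <openssl/bio.h>
#include <openssl/evp.h>
#include <openssl/buffer.h>
#include <stdint.h>

int Base64Encode(const unsigned char* buffer, size_t length, char** b64text) { //Encodes a binary safe base 64 string
    BIO *bio, *b64;
    BUF_MEM *bufferPtr;

    b64 = BIO_new(BIO_f_base64());
    bio = BIO_new(BIO_s_mem());
    bio = BIO_push(b64, bio);

    BIO_set_flags(bio, BIO_FLAGS_BASE64_NO_NL); //Ignore newlines - write everything in one line
    BIO_write(bio, buffer, (int)length);
    BIO_flush(bio);
    BIO_get_mem_ptr(bio, &bufferPtr);

    //Copy the encoded bytes out and NUL-terminate them while the BIO chain still owns bufferPtr
    *b64text = (char*)malloc(bufferPtr->length + 1);
    if (*b64text == NULL) {
        BIO_free_all(bio);
        return (-1);
    }
    memcpy(*b64text, bufferPtr->data, bufferPtr->length);
    (*b64text)[bufferPtr->length] = '\0';

    BIO_free_all(bio); //frees the chain and its BUF_MEM; the caller frees *b64text with free()
    return (0); //success
}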
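//Main.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int Base64Encode(const unsigned char* buffer, size_t length, char** b64text);
int Base64Decode(char* b64message, unsigned char** buffer, size_t* length);

int main() {
    //Encode To Base64
    char* base64EncodeOutput;
    const char* text = "Hello World";
    if (Base64Encode((const unsigned char*)text, strlen(text), &base64EncodeOutput) != 0)
        return (1);
    printf("Output (base64): %s\n", base64EncodeOutput);
    free(base64EncodeOutput); //the caller owns the encoded string

    //Decode From Base64
    unsigned char* base64DecodeOutput;
    size_t test;
    if (Base64Decode("SGVsbG8gV29ybGQ=", &base64DecodeOutput, &test) != 0)
        return (1);
    printf("Output: %s %zu\n", base64DecodeOutput, test);
    free(base64DecodeOutput); //the caller owns the decoded buffer
    return (0);
}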
all:
	gcc -o base64 Main.c Base64Encode.c Base64Decode.c -lcrypto -lm -w
@minkovich

Could you add a license? Without a license this code is copyrighted. People can read the code, but they have no legal right to use it. May I suggest WTFPL, MIT, or BSD. Wikipedia has the templates.

@mandrakos

I'll second that request. I'd like to use bits of this code as a template for something similar but am hesitant without an explicitly stated license. I also second the list of suggested licenses. Thanks for sharing & best regards.

@sashank

sashank commented Jun 18, 2014

What is the license under which this is released? LGPL? BSD?

@barrysteyn
Author

I am using this gist as a code block for a blog post. I don't want the licensing information in the blog.

@barrysteyn
Author

Okay, I just updated the Gist with MIT licensing. This may be too late for some of you, but people in the future can use it from now on...

@vink007

vink007 commented Apr 24, 2015

Is the caller required to free 'base64EncodeOutput' and 'base64DecodeOutput'?

@jeroen

jeroen commented May 5, 2015

Some potential improvements:

  • The uint8_t type is not ANSI C; better to use unsigned char.
  • I think you're missing #include <string.h>, which is required for strlen().

@barrysteyn
Author

@jeroenooms - thanks, I have added your suggestions.

@kvelakur

The BIO_read function you use in Base64Decode.c does not null-terminate the buffer, so printing the decoded data as a string might fail. Maybe modify your code to *buffer = (unsigned char*)malloc(decodeLen+1); and, before returning, (*buffer)[decodeLen] = '\0'?

@barrysteyn
Author

Good point @kvelakur

@inspire365

Why not give some examples of how to free the related memory? For instance, how should (*bufferPtr).data be freed? I would guess with free(). Thanks!

@kvelakur

kvelakur commented Jul 2, 2015

@barrysteyn I think there is a similar mistake in the Base64Encode method. First, the returned string is not null-terminated. Second, after calling BIO_free_all there is no guarantee that (*bufferPtr).data will still contain the encoded string (I think), so it's a good idea to copy the encoded string to a new memory location. Instead of *b64text=(*bufferPtr).data; this might be a better way to go:

*b64text = (char*) malloc((bufferPtr->length + 1) * sizeof(char));
memcpy(*b64text, bufferPtr->data, bufferPtr->length);
*b64text[bufferPtr->length] = '\0';

Also the BIO_free_all(bio); line needs to be after these.

@bawejakunal

@kvelakur, thanks a lot for your corrections. Just one fix to your addition: *b64text[bufferPtr->length] = '\0'; should be updated to (*b64text)[bufferPtr->length] = '\0'; otherwise the \0 is often written to the wrong location.
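
The precedence point can be seen in isolation. Below is a minimal sketch in plain C with no OpenSSL calls; copy_terminated is a hypothetical helper name, not part of the gist. It copies n bytes into a fresh buffer and writes the terminator through (*out)[n].

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy n bytes of src into a new heap buffer stored in
 * *out and NUL-terminate it. Returns 0 on success, -1 if allocation fails.
 * The caller owns *out and releases it with free(). */
int copy_terminated(char **out, const char *src, size_t n) {
    *out = (char *)malloc(n + 1);
    if (*out == NULL)
        return -1;
    memcpy(*out, src, n);
    /* Postfix [] binds tighter than unary *, so *out[n] would mean *(out[n]),
     * which indexes the char** itself. The parentheses index the buffer. */
    (*out)[n] = '\0';
    return 0;
}
```

The same pattern would apply to bufferPtr->data and bufferPtr->length in the encoder, as long as the copy happens before the BIO chain is freed.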

@greendev5

About the calcDecodeLength function: someone may call it with b64input="123412341234==". This is not properly encoded base64, but we can still obtain 9 bytes from the string, because == carries no payload. Your function, however, returns (14*3)/4 - 2 = 8, where 14 is len and 2 is the padding, so the allocated buffer may be too small. I think it is better to count payload characters from the beginning of b64input rather than from the end, as the Base64decode_len function does here: http://www.opensource.apple.com/source/QuickTimeStreamingServer/QuickTimeStreamingServer-452/CommonUtilitiesLib/base64.c
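
Counting from the beginning can be sketched as follows. This is an illustration in the same spirit, not a copy of the linked Base64decode_len; the name b64_decoded_len is hypothetical, and the sketch assumes the standard base64 alphabet with no embedded newlines.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of bytes encoded by the leading run of base64
 * alphabet characters in b64input. Counting stops at the first character
 * outside the alphabet (including '='), so the tail is never inspected. */
size_t b64_decoded_len(const char *b64input) {
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t n = 0;
    size_t bytes;

    /* Test for '\0' first: strchr would otherwise match the terminator. */
    while (b64input[n] != '\0' && strchr(alphabet, b64input[n]) != NULL)
        n++;

    bytes = (n / 4) * 3;            /* each full group of 4 chars is 3 bytes */
    switch (n % 4) {
    case 2: bytes += 1; break;      /* 2 leftover chars carry 1 byte */
    case 3: bytes += 2; break;      /* 3 leftover chars carry 2 bytes */
    default: break;                 /* 0: nothing extra; 1: no complete byte */
    }
    return bytes;
}
```

For "123412341234==" the twelve payload characters give 9, and for "SGVsbG8gV29ybGQ=" the fifteen payload characters give 11. Because the loop stops at the terminator, a one-character or empty input does not read before the start of the string.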

@hvge

hvge commented Feb 20, 2016

About that calcDecodeLength function again: it has a critical bug in the way it looks for the equals-sign terminators. It's a classic example of a buffer over-read. Just think about a string that is shorter than 2 characters. You should not access memory which doesn't belong to you.

Btw, I'm surprised how many people all over the internet are trying to decode B64 strings using that crappy interface provided by OpenSSL. I guess that if you're using OpenSSL, then your job must be somehow related to security. That's why I'm even more surprised, because the Base64 filter doesn't handle bad input very well. Yes, it doesn't crash (hopefully), but it reports incorrect B64 strings as valid sequences, with a result that is simply shorter. It stops reading at the first incorrect character and that's all. Normally that might be acceptable, but in the field of security, that's always suspicious. On top of that, all those "BIO" routines are slow as hell.

So, don't be lazy: write your own B64 encoder and decoder, or grab some other code. It's not so difficult, trust me :)
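
Short of writing a full decoder, a strict pre-check before handing the string to any decoder addresses the concern about malformed input being silently accepted. The sketch below is plain C; b64_is_well_formed is a hypothetical name, and it assumes single-line, padded input in the standard alphabet. It does not check that the unused trailing bits of the last character are zero.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical pre-check: returns 1 if s looks like strictly well-formed,
 * padded, single-line base64 (standard alphabet), and 0 otherwise.
 * Empty input is rejected here as a design choice. */
int b64_is_well_formed(const char *s) {
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t len = strlen(s);
    size_t pad = 0;
    size_t i;

    if (len == 0 || len % 4 != 0)
        return 0;

    /* len is at least 4 here, so reading the last two characters is safe. */
    if (s[len - 1] == '=')
        pad = (s[len - 2] == '=') ? 2 : 1;

    /* Everything before the padding must come from the alphabet, which also
     * rejects '=' appearing anywhere but the final one or two positions. */
    for (i = 0; i < len - pad; i++) {
        if (strchr(alphabet, s[i]) == NULL)
            return 0;
    }
    return 1;
}
```

A caller could refuse to decode when this returns 0, instead of accepting a shorter result.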

@petrdvorak

@hvge: Care to post a link, dude? ;-) #trolling #trolls

@zewt

zewt commented Feb 28, 2016

("Herewith"? Ouch.)

@andrew-stevenson-sociomantic

Lines 32 and 33 of Decode.c - should you not be using the b64 handle?

@andrew-stevenson-sociomantic

I think the comments about the length calculation are correct - you need to round up if the numerator is not evenly divisible. As you are doing integer division I think ((len*3)+3)/4 - padding; should work (untested).

@weivincewang

The buf_mem_st is not copied, so using it after the BIO has been freed is potential memory corruption.

@yeshog

yeshog commented Sep 16, 2016

Good effort but:

  1. There is no need to calculate the length (you get it for free with the BIO routines).
  2. weivincewang's comment looks correct.

Here is an alternative:

#include <stdio.h>
#include <string.h>
#include <openssl/bio.h>
#include <openssl/evp.h>
#include <stdint.h>

#define OP_ENCODE 0
#define OP_DECODE 1
/* Returns the number of bytes written to out (not NUL-terminated), or <= 0 on failure. */
int b64_op(const unsigned char* in, int in_len,
              char *out, int out_len, int op)
{
    int ret = 0;
    BIO *b64 = BIO_new(BIO_f_base64());
    BIO *bio = BIO_new(BIO_s_mem());
    BIO_set_flags(b64, BIO_FLAGS_BASE64_NO_NL);
    BIO_push(b64, bio);
    if (op == OP_ENCODE)
    {
        ret = BIO_write(b64, in, in_len);
        BIO_flush(b64);
        if (ret > 0)
        {
            ret = BIO_read(bio, out, out_len);
        }

    } else
    {
        ret = BIO_write(bio, in, in_len);
        BIO_flush(bio);
        if (ret > 0)
        {
            ret = BIO_read(b64, out, out_len);
        }
    }
    BIO_free_all(b64); /* frees the whole chain; BIO_free(b64) alone would leak bio */
    return ret;
}

int main(void)
{
    char enc_data[] = "grrr shebangit!";
    char out[256];
    char orig[256];

    /* Pass sizeof - 1 so one byte stays free for the terminator added below. */
    int enc_out_len =
        b64_op((const unsigned char*)enc_data, sizeof(enc_data)-1,
                  out, sizeof(out)-1, OP_ENCODE);
    if (enc_out_len <= 0)
        return 1;
    out[enc_out_len] = '\0';

    printf("Enc data [%s] len [%d]\n",
           out, enc_out_len);

    int dec_out_len =
            b64_op((const unsigned char*)out, enc_out_len,
                  orig, sizeof(orig)-1, OP_DECODE);
    if (dec_out_len <= 0)
        return 1;
    orig[dec_out_len] = '\0';

    printf("Dec data [%s] len [%d]\n",
           orig, dec_out_len);

    return 0;
}

@supperfox

In Base64Encode.c, if you set your close flag like: BIO_set_close(bio, BIO_NOCLOSE); Then BIO_free_all(bio) won't free bufferPtr. I understand it's on purpose to be able to use bufferPtr afterwards, but when to free then? I didn't find it.

@tushar2708

@barrysteyn
In the statement

int Base64Decode(char* b64message, unsigned char** buffer, size_t* length)

You are declaring length as size_t *, but the statement

*length = BIO_read(bio, *buffer, strlen(b64message));

tries to map an int on this address. That will most probably cause stack corruption on most systems (especially 64-bit Linux).

size_t* length can be converted to int * length without hurting anyone.
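
As far as the C conversion rules go, *length = BIO_read(...) converts the int result to size_t by value, so the assignment writes a full size_t. The more likely hazard is that a negative return value (BIO_read is documented to return 0 or a negative number when nothing was read) wraps to a huge unsigned length. A minimal sketch of a guarded conversion in plain C follows; store_length is a hypothetical helper name.

```c
#include <stddef.h>

/* Hypothetical helper: store an int byte count, such as the return value of
 * a read call, into a size_t. A negative count is treated as failure rather
 * than converted, because (size_t)-1 is the largest size_t value.
 * Returns 0 on success and -1 on failure, with *length set to 0. */
int store_length(int rc, size_t *length) {
    if (rc < 0) {
        *length = 0;
        return -1;
    }
    *length = (size_t)rc;
    return 0;
}
```

Keeping the raw result in an int, checking it, and only then assigning to *length leaves the size_t* interface intact without changing the signature.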

@williamcroberts

This code is broken, as it assumes that the BIO routines provide null-terminated buffers, which is incorrect. What the code above should do is track the length. The BUF_MEM length field also has this information.
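
Tracking the length rather than relying on a terminator can be sketched in plain C. The helper name format_decoded is hypothetical; it formats exactly len bytes using the %.*s precision, which does not require the data to be NUL-terminated.

```c
#include <stdio.h>

/* Hypothetical helper: write a description of len bytes of data into dst.
 * The precision given to %.*s bounds how many bytes are read from data,
 * so data does not need a terminator. Returns what snprintf returns. */
int format_decoded(char *dst, size_t dst_size,
                   const unsigned char *data, size_t len) {
    return snprintf(dst, dst_size, "Output: %.*s (%zu bytes)",
                    (int)len, (const char *)data, len);
}
```

Note that %.*s still stops at an embedded zero byte, so for truly binary output something like fwrite(data, 1, len, stdout) would be the safer choice.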

@1Hyena

1Hyena commented May 17, 2018

@yeshog

int b64_op(const unsigned char* in, int in_len,
              char *out, int out_len, int op)
{
    int ret = 0;
    BIO *b64 = BIO_new(BIO_f_base64());
    BIO *bio = BIO_new(BIO_s_mem());
    BIO_set_flags(b64, BIO_FLAGS_BASE64_NO_NL);
    BIO_push(b64, bio);
    if (op == 0)
    {
        ret = BIO_write(b64, in, in_len);
        BIO_flush(b64);
        if (ret > 0)
        {
            ret = BIO_read(bio, out, out_len);
        }

    } else
    {
        ret = BIO_write(bio, in, in_len);
        BIO_flush(bio);
        if (ret)
        {
            ret = BIO_read(b64, out, out_len);
        }
    }
    BIO_free(b64); // MEMORY LEAK HERE? 
    return ret;
}

b64 gets freed but not bio: BIO_free frees a single BIO, whereas BIO_free_all(b64) would free the whole chain.

to be honest, it's kind of embarrassing to have to have a lib as popular as openssl designed in a way that it is literally begging for the developers to produce memory leaks. it's as if it was intentional (hinttidiy hint hint NSA)

@avalon1610

@kvelakur

Second, after calling BIO_free_all there is no guarantee that (*bufferPtr).data will still contain the encoded string (I think)

BIO_get_mem_ptr(bio, &bufferPtr);
BIO_set_close(bio, BIO_NOCLOSE);
BIO_free_all(bio);
*b64text=(*bufferPtr).data;

return (0); //success

I think BIO_set_close(bio, BIO_NOCLOSE) here means that OpenSSL will not free the memory under bufferPtr, even after BIO_free_all(bio), so (*bufferPtr).data is safe to use. But the user must free it manually, or it will cause a memory leak.

@jige003

jige003 commented Apr 25, 2019

@avalon1610

A BUF_MEM struct should be freed with BUF_MEM_free.

@addagreem

addagreem commented Sep 28, 2023

Base64Decode cuts 2 characters off on this data:
eyJzdWIiOiIwMHVibHVvazVsVXloWVd3STVkNyIsIm5hbWUiOiJPbGVuYSBZYXJ1dGEiLCJ2ZXIiOjEsImlzcyI6Imh0dHBzOi8vZGV2LTY5Nzk4NzYyLm9rdGEuY29tIiwiYXVkIjoiMG9hYjF1dDkyeTRKY201MXo1ZDciLCJpYXQiOjE2OTU5MDMzNDgsImV4cCI6MTY5NTkwNjk0OCwianRpIjoiSUQudmJjeHJxOEljQW1SNkprb0E4OEdsaDE4THBvYUZTaURHZDJIT1prTUhnUSIsImFtciI6WyJwd2QiXSwiaWRwIjoiMDBvODVzdHcwY1R3bzNGYzk1ZDciLCJwcmVmZXJyZWRfdXNlcm5hbWUiOiJvbGVuYS55YXJ1dGFAYXZpZC5jb20iLCJhdXRoX3RpbWUiOjE2OTU5MDIxNjEsImF0X2hhc2giOiJfMHJZV0tzRUhwV3lRc1o4enVuMmtRIn0
The issue is in the OpenSSL library, most likely. I use OpenSSL 3.0.2 15 Mar 2022 (Library: OpenSSL 3.0.2 15 Mar 2022)

UPD:
It turned out that the encoded string must have '=' padding at the end, which I did not receive from the external service.
Maybe it is worth adding some kind of normalization of the b64message buffer to fix missing padding.
I would add something similar to this at the beginning of Base64Decode:

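    // normalize padding on a heap copy, since b64message may be a string literal
    // (needs <stdlib.h> and <string.h>; 3 missing characters means the input is not valid base64)
    size_t b64msg_length = strlen(b64message);
    size_t missing_paddings = (4 - (b64msg_length % 4)) % 4;
    char *padded = (char*)malloc(b64msg_length + missing_paddings + 1);
    if (padded == NULL)
        return (-1);
    memcpy(padded, b64message, b64msg_length);
    memset(padded + b64msg_length, '=', missing_paddings);
    padded[b64msg_length + missing_paddings] = '\0';
    b64message = padded; // remember to free(padded) before every return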
