Last active — forked from 140bytes/LICENSE.txt

base64 decoder

LICENSE.txt
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
Version 2, December 2004
 
Copyright* (C) 2011 Alex Kloss <alexthkloss@web.de>
 
Everyone is permitted to copy and distribute verbatim or modified
copies of this license document, and changing it is allowed as long
as the name is changed.
 
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
 
0. You just DO WHAT THE FUCK YOU WANT TO.
 
* As far as something complicated as Copyright applies to such an
simple code...
README.md
Markdown

base64 decoder

A tweet-sized base64 decoder inspired by this base64 encoder: https://gist.github.com/999166

It will fail on older IEs (Version 7 or older) unless the input String is converted to an Array using .split('').
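
A minimal usage sketch, assuming the golfed function from this gist is assigned to a variable. The names `decode` and `map` are illustrative and not part of the gist. Passing the input through `.split('')` should give the same result in current engines and is the workaround described above for old IE:

```javascript
// The 139-byte function from this gist, bound to an illustrative name.
var decode = function(d,b,c,u,r,q,x){for(r=q=x='';c=d[x++];~c&&(u=q%4?u*64+c:c,q++%4)?r+=String.fromCharCode(255&u>>(-2*q&6)):0)c=b.indexOf(c);return r};

// Standard base64 alphabet, passed as the second argument.
var map = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

var fromString = decode("Zm9vYmFy", map);           // string input
var fromArray  = decode("Zm9vYmFy".split(''), map); // array input, for IE7 and older
```

Both calls should yield "foobar"; the array form only matters where bracket access on strings is unavailable.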

Thanks to @jed for this great idea of 140byt.es and thanks to him and Kambfhase for help; thanks to @nikola for a hint saving 2 brackets! Thanks to LeverOne for his support fixing the last bugs regarding RFC2045 conformity and golfing the last bytes.

annotated.js
JavaScript
function(
d, // base64 data (in IE7 or older, use .split('') to get this working)
b, // replacement map ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/")
c, // buffer for the current character, then for its index in the map
u, // bit storage
r, // result
q, // bit counter
x // char counter
){
for (
// initialize result and counters
r = q = x = '';
// get next character
c = d[x++];
// character found in table? start or extend the bit storage with its table index;
~c && (u = q%4 ? u*64+c : c,
// and if not first of each 4 characters, convert the first 8bits to one ascii character
q++ % 4) ? r += String.fromCharCode(255&u>>(-2*q&6)) : 0
)
// try to find character in table (0-63, not found => -1)
c = b.indexOf(c);
// return result
return r
}
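
For readers who find the golfed loop hard to follow, here is a longer sketch of what appears to be the same algorithm. The names `decodeBase64` and `ALPHABET` are illustrative and not part of the gist, and the explicit index loop replaces the original trick of stopping on a falsy `d[x++]`:

```javascript
var ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Readable sketch of the golfed decoder; not the author's code.
function decodeBase64(data, alphabet) {
  var result = '';
  var bits = 0;   // accumulated 6-bit groups of the current quartet
  var count = 0;  // number of valid characters seen so far
  for (var i = 0; i < data.length; i++) {
    var index = alphabet.indexOf(data.charAt(i));
    if (index < 0) continue; // padding, whitespace, anything not in the table
    // The first character of each group of four restarts the accumulator.
    bits = count % 4 ? bits * 64 + index : index;
    if (count++ % 4) {
      // count is already incremented here, so the shift is 4, 2, then 0.
      result += String.fromCharCode(255 & (bits >> (-2 * count & 6)));
    }
  }
  return result;
}

var decoded = decodeBase64("MTQwYnl0ZX\n MgcnVsZXMh", ALPHABET);
```

Skipping every character that is not in the table is what lets padding and embedded line breaks pass through, as in the last test case of test.html.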
package.json
JSON
{
"name": "Base64Decoder",
 
"description": "Fully working Base64 decoder in 139bytes",
 
"keywords": [
"base64",
"decode",
"padding",
"rfc2045"
]
}
test.html
HTML
<!DOCTYPE html>
<title>Base64 decoder</title>
<script type="text/javascript">
 
(function(){
 
var f = function(d,b,c,u,r,q,x){for(r=q=x='';c=d[x++];~c&&(u=q%4?u*64+c:c,q++%4)?r+=String.fromCharCode(255&u>>(-2*q&6)):0)c=b.indexOf(c);return r}
, test = {
"Zg==" : "f"
,"Zm8=" : "fo"
,"Zm9v" : "foo"
,"Zm9vYg==" : "foob"
,"Zm9vYmE=" : "fooba"
,"Zm9vYmFy" : "foobar"
,"MTQwYnl0ZX\n MgcnVsZXMh" : "140bytes rules!"
}
, error = 0
, i;
 
for( i in test ) {
var r = f(i, "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/");
if( r != test[i] ) {
error++;
document.writeln( 'Expected &quot;'+test[i]+'&quot; for &quot;'+i+'&quot; but got &quot;'+ r + "&quot;<br>" );
}
}
 
if( error > 0) {
document.writeln( "<br>"+error+ " tests failed!<br>" );
} else {
document.writeln( "<br>Everything is fine!<br>" );
}
 
})();
 
 
</script>
</body>
</html>

cool, thanks for kicking this off! currently doesn't work due to the leading comma on line 9, and when fixed fails all tests. will keep watching this space, though, hopefully folks can help golf enough room for String.fromCharCode.

Currently, there are 4bytes left, while String.fromCharCode is 19 characters long - so we would need to get another 15 characters. Not very likely, but worth a try anyway.

you should probably git clone git://gist.github.com/1020396.git gist-1020396 instead of using the web interface.

Should be fixed now. Thanks for the tip.

cool, nice work!

Unless anyone comes up with a shorter version than String.fromCharCode to convert numbers to characters, the following stuff could probably be optimized.

  • for-loop-body - maybe one can save some characters by omitting the body
  • bitshift calculation "((8-q%4*2)&7)" -> should work like ([0,6,4,2])[q%4]

And already found one for the bitshift calculation, saving 1 character: ((4-q)<<1&7)!
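
A quick sketch to check that the shift expressions above agree with the lookup table. The function names are illustrative, and the range of `q` tested is an arbitrary choice:

```javascript
// Shift amounts for q % 4 = 0, 1, 2, 3, as given in the list above.
var table = [0, 6, 4, 2];

function shiftOriginal(q) { return (8 - q % 4 * 2) & 7; }
function shiftShorter(q)  { return (4 - q) << 1 & 7; } // << binds tighter than &

// Collect every q for which one of the expressions disagrees with the table.
var mismatches = [];
for (var q = 0; q < 16; q++) {
  if (shiftOriginal(q) !== table[q % 4] || shiftShorter(q) !== table[q % 4]) {
    mismatches.push(q);
  }
}
```

The shorter form needs no `% 4` because masking with 7 already wraps negative values of `8 - 2*q` back into the 0 to 7 range.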

also, .search is one character shorter than .indexOf.

function(d,b,s,c,u,q,r){for(r=u=q='';~(c=b.indexOf(d[q++]||'='));q%4||(u=0))(u=u<<6+c)>>8&&(r+=s(255&u>>(8-q%4*2&7)));return r}

removed needless parens + moved q%4||(u=0).

/edit: sorry, doesnt work.

@Kambfhase: The parens are not needless and the q%4 needs to be checked every iteration.
@jed: thanks, noted!

/edit
OT: how can I push back my edited gist?

function(d,b,s,c,u,q,r){for(r=u=q='';~(c=b.indexOf(d[q++]||'='));q%4||(u=0))(u=(u<<6)+c)>>8&&(r+=s(255&(u>>(8-q%4*2)%8)));return r}

@atk: I was fooled by uglify.js which removed the parens. Plus I screwed up the testing. This should work now, though.

i'm pretty sure

q%4||(u=0)

can be changed into

u*=!(q%4)

or even

u*=q!=4

Thanks, guys! I just remembered that the s argument can be spared in case String.fromCharCode was present, so we only needed 13 instead of 15 characters. Since you've saved some more, we could really come close!

u*=q!=4 would only work on the first iteration, alas.

only 3 bytes left:

function(d,b,c,u,q,r){for(r=u=q='';~(c=b.search(d[q++]||'='));u*=q!=4)(u=(u<<6)+c)>>8&&(r+=String.fromCharCode(255&(u>>(8-q%4*2)%8)));return r}

@jed: try "MTQwYnl0ZXMgcnVsZXMh" ("140bytes rules!") - it will not work with your version, as q can exceed 4.

Anyway: 130bytes - 2bytes for saved s-argument + 19bytes for "String.fromCharCode" = 147bytes - only 7 bytes to go!

(u=(u<<6)+c) can be reduced to (u=u*64+c).
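
A small sketch of why the parens around the shift mattered earlier in this thread and why the multiplication is a safe replacement. The sample values are the table indexes of 'Z' and 'm'; the variable names are illustrative:

```javascript
var u = 25, c = 38; // table indexes of 'Z' and 'm'

var unparenthesized = u << 6 + c; // parsed as u << (6 + c), not what was intended
var shifted = (u << 6) + c;       // the parenthesized form
var multiplied = u * 64 + c;      // shorter, and needs no parens at all
```

Since `*` binds tighter than `+`, the multiplied form drops both parens and still equals the shifted one for the small non-negative values used here.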

Just found another way to save 1 character: instead of "=", we can use -1! 6bytes and still counting :-)

@Kambfhase: 2 bytes less! Yay! 4bytes and counting!

@atk, can you post your latest version?

is already updated

take 2:

q%4||(u=0)

into

u*=q%4>0

same length as jed's but more obvious: u=q%4&&u. 1 byte left.
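
The reset variants from this thread can be compared side by side. This is a sketch with illustrative function names; `u` is an arbitrary non-zero sample and `q` runs over three groups of four:

```javascript
function resetOr(u, q)  { q % 4 || (u = 0); return u; } // original form
function resetMul(u, q) { u *= q % 4 > 0;  return u; } // "take 2"
function resetAnd(u, q) { u = q % 4 && u;  return u; } // same length, more obvious
function resetQ4(u, q)  { u *= q != 4;     return u; } // earlier suggestion

var agree = true;   // do the first three always match?
var q4Differs = []; // values of q where the q != 4 variant goes wrong
for (var q = 1; q <= 12; q++) {
  var expected = resetOr(1234, q);
  if (resetMul(1234, q) !== expected || resetAnd(1234, q) !== expected) agree = false;
  if (resetQ4(1234, q) !== expected) q4Differs.push(q);
}
```

In this check the `q != 4` variant resets correctly at the end of the first group only, which matches the objection raised above.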

Updated. 2 bytes, to be precise - unless I have overlooked something.

140 bytes:
function(d,b,c,u,q,r){for(r=u=q='';~(c=b.search(d[q++]||b));u=q%4&&u)(u=u*64+c)>>8&&(r+=String.fromCharCode(255&(u>>(8-q%4*2)%8)));return r}

not sure why this works, though
/edit: the `9+/` matches multiple nines, but not the plus. this is awesome!

this will fail on some padding situations, so I found a better solution (coercion again):

function(d,b,c,u,q,r){for(r=u=q='';~(c=b.search(d[q++]+''));u=q%4&&u)(u=u*64+c)>>8&&(r+=String.fromCharCode(255&(u>>(8-q%4*2)%8)));return r}

Thanks for your help - you guys are really awesome. If you happen to be near Karlsruhe (Germany), we totally should meet there swapping JS stories!

hm, cant we just omit the +''? I mean, adding an empty string to an empty string is still an empty string.

Nope - adding '' to undefined returns "undefined" in String format, which will not be found inside the replace map.

yeah, apparently search doesn't coerce undefined into a string.

though i can't say i really grok how search works here.

also, i seem to recall IE not being able to use bracket notation to access characters, only .charAt.

also, i'd use if where logical AND doesn't save you any bytes:

function(d,b,c,u,q,r){for(r=u=q='';~(c=b.search(d[q++]+''));u=q%4&&u)if((u=u*64+c)>>8)r+=String.fromCharCode(255&(u>>(8-q%4*2)%8));return r}

search coerces the parameter into a regexp. if the parameter is undefined that will result in the regexp /undefined/ which will fail.
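
A sketch of how `search` treats the arguments discussed in this thread, as observed in current engines (behaviour in the browsers of the time is an assumption). Variable names are illustrative:

```javascript
var b = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

var plain = b.search("A");              // ordinary character: position in the map
var missing = b.search(undefined);      // undefined becomes an empty pattern
var coerced = b.search(undefined + ""); // pattern /undefined/ is not in the map
var self = b.search(b);                 // "9+/" means one or more nines, then a slash

// "+" alone is not a valid pattern, which is one reason to prefer indexOf.
var plusThrows = false;
try { b.search("+"); } catch (e) { plusThrows = e instanceof SyntaxError; }
var plusIndex = b.indexOf("+");
```

The empty pattern matches at index 0, so a bare `undefined` at the end of the input would be read as an 'A'; appending `''` turns it into a pattern that cannot be found and ends the loop.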

Maybe I'm missing the obvious here, but why are curly braces needed for the loop? Looks like the body is a one-liner.

the braces are only in the annotated version.

personally, i think only comments and whitespace should be added to annotations.

Thanks again. Updated. The curly brackets were a remnant of an earlier version that I forgot to edit out on changing the result.

I agree. Any work done by a minifier other than whitespace and comments takes away from the spirit of this site :). Maybe that should be part of the rules.

you still have extra parens in the penultimate statement, so ya know.

Jed, about the non-savings on logical if, that's only when the condition needs to be wrapped in parens anyway due to having precedence issues?

right. assignment is always going to need parens.

good job everyone! I still wonder if there is a short option for String.fromCharCode.

@Kambfhase: _sebastienp was definitely trying his best to find it for LZW, let him know if you come up with anything.

I have already written an LZW in JS: http://tinyjs.sourceforge.net/tiny-lzw-en.html (which is definitely easier than Huffman: http://tinyjs.sourceforge.net/tiny-huffman-en.html).

@Kambfhase: Alas, eval+\x+toString(16) is still longer... and I'm out of ideas on how to do this otherwise.

The only thought currently coming up is to use another character map, in this case 0-255... but this means outsourcing the problem, too.

@jed : thanks.
@Kambfhase : yes, please let me know. The best alternative I came up with is actually unescape("%"+x.toString(8)) but, yep, still longer than String.fromCharCode(x) :(
@atk : my attempts to LZW compression/decompression can be found here > http://jsfiddle.net/sebastienp/p7kDe/

@sebastien-p : I've already seen them - quite nice; I would like to suggest you "outsource" String.fromCharCode like I did in the beginning and gist it here so the folks here @github can try to shave off byte by byte :-)

thx @nikola for the final hint regarding operator precedence :-)

I'd rather you guys joined the discussion on the gist than have me join this forum, to avoid things getting too fragmented over the net.

Your points seem mostly valid, though a 135bytes small base64 decoder need not be RFC-compatible to begin with. If I find the time, I will try to achieve further refinement for this function anyway.

RFC 1925, § 3

I compiled a decoder that fixes both problems, but is 152 bytes long:

function(d,b,c,u,r,q,x){for(r=q=x='';(u=q%4?u:1)&&(c=d[x++]);~(c=b.search(c))&&(u=u*64+c,q++%4)&&(r+=String.fromCharCode(255&u>>(8-q%4*2)%8)));return r}

If you want to golf away enough bytes, feel free to do so.

Right. In this case replace "search" with "indexOf". Dang, 1 byte more. So be it. 153 bytes, so 13 bytes to go to have a full working tweet-sized decoder.

Currently, after a tiny bit of hacking, I got function(d,b,c,u,r,q,x){for(r=q=x='';(u=q%4?u:1,c=d[x++]);~(c=b.indexOf(c))&&(u=u*64+c,q++%4)&&(r+=String.fromCharCode(255&u>>(8-q%4*2)%8)));return r}, which is still 150 bytes. 3 down, 10 to go :-)

function(d,b,c,u,r,q,x){for(r=q=x='';c=d[x++];~(c=b.indexOf(c))&&(u=(q%4?u:1)*64+c,q++%4)&&(r+=String.fromCharCode(255&u>>(8-2*q&6))));return r} - now there are only 4 bytes left!

And another 2 bytes gone by rethinking the byte array filling:

function(d,b,c,u,r,q,x){for(r=q=x='';c=d[x++];~(c=b.indexOf(c))&&(u=q%4?u*64+c:c,q++%4)&&(r+=String.fromCharCode(255&u>>(8-2*q&6))));return r} - only 2 bytes left!

If we omit the ability to allow for \x0 bytes, we could reduce another byte...

function(d,b,c,u,r,q,x){for(r=q=x='';c=d[x++];~(c=b.indexOf(c))&&(u=q++%4?u*64+c:c)!=c&&(r+=String.fromCharCode(255&u>>(8-2*q&6))));return r}

though I don't like this solution, since I had a hard time adding the ability.
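
A sketch of the trade-off, using the two versions quoted just above bound to illustrative names (`full`, `shorter` and `map` are not part of the gist). "AAAA" encodes three zero bytes:

```javascript
var map = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Version that keeps \x0 bytes.
var full = function(d,b,c,u,r,q,x){for(r=q=x='';c=d[x++];~(c=b.indexOf(c))&&(u=q%4?u*64+c:c,q++%4)&&(r+=String.fromCharCode(255&u>>(8-2*q&6))));return r};

// One byte shorter, but it skips output whenever the accumulator equals c.
var shorter = function(d,b,c,u,r,q,x){for(r=q=x='';c=d[x++];~(c=b.indexOf(c))&&(u=q++%4?u*64+c:c)!=c&&(r+=String.fromCharCode(255&u>>(8-2*q&6))));return r};

var fullZeros = full("AAAA", map);       // three \x0 characters
var shorterZeros = shorter("AAAA", map); // empty string
```

Both versions agree on ordinary text such as "Zm9v"; they only diverge when the bits collected so far in a group are all zero.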

After some additional conversation with @LeverOne, we are now down to 139 bytes:

function(d,b,c,u,r,q,x){for(r=q=x='';c=d[x++];~c&&(u=q%4?u*64+c:c,q++%4)?r+=String.fromCharCode(255&u>>(-2*q&6)):0)c=b.indexOf(c);return r}

Thanks for your help, @LeverOne!
