Forked from 140bytes/LICENSE.txt

UTF8 encoder

LICENSE.txt
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
Version 2, December 2004
 
Copyright (C) 2011 YOUR_NAME_HERE <YOUR_URL_HERE>
 
Everyone is permitted to copy and distribute verbatim or modified
copies of this license document, and changing it is allowed as long
as the name is changed.
 
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
 
0. You just DO WHAT THE FUCK YOU WANT TO.
README.md
Markdown

UTF8 encoder

A simple UTF8 encoder.
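
A minimal usage sketch, assuming the function is bound to a name (`utf8Encode` and the `toHex` helper are illustrative names, not part of the gist). The encoder returns a string in which each character stands for one UTF-8 byte, so the hex dump makes the result easier to read:

```javascript
// The golfed encoder from annotated.js, bound to an illustrative name.
var utf8Encode = function(a,b,c,d,e){for(c=e='';d=a.charCodeAt(c++);)e+=d<128?b(d):(d<2048?b(d>>6|192):b(d>>12|224,d>>6&63|128))+b(d&63|128);return e};

// Hypothetical helper: print each character of the byte string as two hex digits.
function toHex(s) {
  return s.split('').map(function (ch) {
    return ('0' + ch.charCodeAt(0).toString(16)).slice(-2);
  }).join(' ');
}

// ASCII passes through unchanged.
var ascii = utf8Encode('Normal text', String.fromCharCode);

// U+00E9 needs two bytes, U+20AC needs three.
var accented = toHex(utf8Encode('\u00e9', String.fromCharCode)); // "c3 a9"
var euro = toHex(utf8Encode('\u20ac', String.fromCharCode));     // "e2 82 ac"

console.log(ascii, '|', accented, '|', euro);
```

Passing `String.fromCharCode` as the second argument keeps the long name out of the function body, which is what lets the encoder fit in 140 bytes.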

annotated.js
JavaScript
function(
  a, // the text to encode
  b, // String.fromCharCode
  c, // placeholder: index into the text
  d, // placeholder: current UTF-16 code unit
  e  // placeholder: the encoded output
){
  for (c=e=''; d=a.charCodeAt(c++); ) // read the next code unit; the loop ends at the end of the text (NaN) or at U+0000
    e += d < 128 ?                    // U+0000-U+007F
      b(d) :                          // 0xxxxxxx
      (d < 2048 ?                     // U+0080-U+07FF
        b(d >> 6 | 192) :             // 110xxxxx
        b(d >> 12 | 224, d >> 6 & 63 | 128) // U+0800-U+FFFF: 1110xxxx 10xxxxxx
      ) + b(d & 63 | 128);            // 10xxxxxx (final continuation byte)
  return e;
}
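
The annotated function can be checked against a reference encoder. The sketch below assumes the runtime provides `TextEncoder` (modern browsers and recent Node.js do); `byteValues` and `referenceBytes` are illustrative helpers. It also shows two limits that follow from reading one UTF-16 code unit at a time: characters outside the Basic Multilingual Plane are encoded as two 3-byte sequences (one per surrogate) rather than one 4-byte sequence, and a U+0000 character ends the loop early.

```javascript
var utf8Encode = function(a,b,c,d,e){for(c=e='';d=a.charCodeAt(c++);)e+=d<128?b(d):(d<2048?b(d>>6|192):b(d>>12|224,d>>6&63|128))+b(d&63|128);return e};

// Hypothetical helper: turn the encoder's byte string into an array of numbers.
function byteValues(s) {
  var out = [];
  for (var i = 0; i < s.length; i++) out.push(s.charCodeAt(i));
  return out;
}

// Assumption: TextEncoder is available as a global in this runtime.
function referenceBytes(s) {
  return Array.prototype.slice.call(new TextEncoder().encode(s));
}

// Text within U+0001-U+FFFF (no surrogates): both encoders agree.
var bmp = 'na\u00efve \u20ac';
var bmpAgrees =
  byteValues(utf8Encode(bmp, String.fromCharCode)).join() === referenceBytes(bmp).join();

// U+1F600 is stored as the surrogate pair D83D DE00.
var astral = '\ud83d\ude00';
var oursLen = utf8Encode(astral, String.fromCharCode).length; // 6: each surrogate becomes 3 bytes
var refLen = referenceBytes(astral).length;                   // 4: standard UTF-8

// A NUL code unit is falsy, so the loop condition stops there.
var truncated = utf8Encode('a\u0000b', String.fromCharCode);  // "a"

console.log(bmpAgrees, oursLen, refLen, JSON.stringify(truncated));
```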
package.json
JSON
{
"name": "utf8encoder",
 
"description": "A simple UTF8 encoder.",
 
"keywords": [
"utf8",
"utf-8",
"encode",
"encoder",
"unicode"
]
}
test.html
HTML
<!DOCTYPE html>
<title>UTF8 encode</title>
<div>Expected value: <b>Normal text</b></div>
<div>Actual value: <b id="ret"></b></div>
<script>
 
var myFunction = function(a,b,c,d,e){for(c=0,e="";d=a.charCodeAt(c++);)e+=d<128?b(d):(d<2048?b(d>>6|192):b(d>>12|224)+b(d>>6&63|128))+b(d&63|128);return e};
 
document.getElementById( "ret" ).innerHTML = myFunction('Normal text', String.fromCharCode);
</script>
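
The test page only exercises ASCII input, and its one-liner differs slightly from annotated.js (`c=0,e=""` and two separate `b(...)` calls for the three-byte case). A hedged sketch of a small table-driven check that runs both variants over a few extra inputs; the names `annotated`, `fromTestPage`, `cases` and `failures` are illustrative:

```javascript
// Variant from annotated.js (one two-argument call to b for the 3-byte case).
var annotated = function(a,b,c,d,e){for(c=e='';d=a.charCodeAt(c++);)e+=d<128?b(d):(d<2048?b(d>>6|192):b(d>>12|224,d>>6&63|128))+b(d&63|128);return e};

// Variant from test.html (two one-argument calls to b).
var fromTestPage = function(a,b,c,d,e){for(c=0,e="";d=a.charCodeAt(c++);)e+=d<128?b(d):(d<2048?b(d>>6|192):b(d>>12|224)+b(d>>6&63|128))+b(d&63|128);return e};

// [input, expected byte string]; each \xNN escape is one UTF-8 byte.
var cases = [
  ['Normal text', 'Normal text'],
  ['\u00e9', '\xc3\xa9'],      // two-byte sequence
  ['\u20ac', '\xe2\x82\xac'],  // three-byte sequence
  ['', '']                     // empty input
];

var failures = [];
cases.forEach(function (pair) {
  var input = pair[0], expected = pair[1];
  if (annotated(input, String.fromCharCode) !== expected) failures.push(['annotated', input]);
  if (fromTestPage(input, String.fromCharCode) !== expected) failures.push(['test.html', input]);
});

console.log(failures.length === 0 ? 'all cases pass' : failures);
```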

This is really amazing! It's a pity that charCodeAt and String.fromCharCode are such byte hogs.

Yes, especially String.fromCharCode. I'm wondering whether we can save 20 bytes so that String.fromCharCode can go inside the function...

  • c=0,e="" => c=e=""
  • perhaps this is ripe for some sort of eval? d>>6 could appear 3 times.

also, can you exploit the fact that String.fromCharCode(a,null) === String.fromCharCode(a) ?
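
A short sketch of why the `c=e=""` suggestion is safe, using illustrative variable names: postfix `++` converts the empty string to the number 0 before incrementing, so `c` behaves as an index from the first iteration on while `e` stays a string.

```javascript
var c, e;
c = e = '';          // one assignment initialises both placeholders

var first = c++;     // '' is converted to 0; the expression yields 0 and c becomes 1

var firstType = typeof first;  // "number"
var indexAfter = c;            // 1
var output = e;                // still the empty string, ready for +=

// charCodeAt applies the same conversion to its argument, so '' also means index 0.
var sameAsZero = 'abc'.charCodeAt('') === 'abc'.charCodeAt(0);

console.log(first, firstType, indexAfter, JSON.stringify(output), sameAsZero);
```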

Thanks for your tips, @jed! I think eval is a good idea, but it doesn't seem to save any bytes with d>>6.
Anyway, that fact is really awesome. I'm still thinking about how to exploit it...
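
A quick check of the `String.fromCharCode(a,null)` idea before building on it. As far as the ECMAScript coercion rules go, `null` is converted to the number 0, so the extra argument appends a U+0000 character instead of being ignored. The two results can look the same when rendered, but they are not strictly equal. Variable names here are illustrative:

```javascript
var plain = String.fromCharCode(65);           // "A"
var withNull = String.fromCharCode(65, null);  // "A" followed by U+0000

var strictlyEqual = withNull === plain;        // false
var lengths = [plain.length, withNull.length]; // [1, 2]
var trailingCode = withNull.charCodeAt(1);     // 0

console.log(strictlyEqual, lengths, trailingCode);
```

So any byte-saving trick based on passing an empty second argument would add a stray NUL to the output unless it is stripped afterwards.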
