@mikesherov
Forked from bytespider/LICENSE.txt
Created June 4, 2011 19:09
140byt.es -- convert string to array of UTF-8 bytes
// http://en.wikipedia.org/wiki/UTF-8
function stringToByteArray(a,b,c,d,e,f,g,h,j){
  for(
    b = [
      e =
      f =
      d = 0
    ] // initialise variables: b = byte array, e = write index, f = bytes left, d = read index
    ;
    c = a.charCodeAt(d++) // get the next UTF-16 code unit; NaN (end of string) or 0 (NUL) ends the loop
    ;
  ){
    g = 128; // 128 is the base for a lot of numbers below
    c < g // under 128 is the ASCII range, 1 byte
    ?
      b[e] = c // add to byte array
    :
      c < 2048 // under 2048 is 2 bytes
      ?
        f = 1 // 1 byte left to process
      :
        c < 65536 // under 65536 is 3 bytes; charCodeAt never returns more than 65535
        ?
          f = 2 // 2 bytes left to process
        :
          c < 2<<20 && (f = 3); // 3 bytes left to process; the assignment needs its parentheses after &&
    for( // process the remaining bytes indicated by `f`
      j = e++, // remember the lead byte's slot and move on to the next slot in the byte array
      h = f // remember how many continuation bytes follow
      ;
      f-- > 0 // decrement and check whether the previous value was greater than 0
      ;
    )
      g += (2<<(5-f)), // grow the lead-byte prefix: it ends at 192, 224 or 240
      b[e++] = 128 + (c >> f*6 & 63), // shift f * 6 bits, mask 6 bits and add 128
      b[j] = g + (c >> 6*h) // set the first byte
  }
  return b // return the byte array
}
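The least obvious step in the annotated version is how `g` turns into the lead-byte prefix. The sketch below isolates that accumulation. `leadBytePrefix` is a hypothetical helper written for illustration, not part of the gist; it assumes the same starting value (`g = 128`) and the same `f-- > 0` countdown as the inner loop.

```javascript
// Hypothetical helper (not in the gist): replays how the inner loop builds `g`.
// `continuationBytes` plays the role of `f` before the inner loop starts.
function leadBytePrefix(continuationBytes) {
  var g = 128, f = continuationBytes;
  while (f-- > 0) {
    g += 2 << (5 - f); // adds 16, then 32, then 64, depending on how many bytes remain
  }
  return g;
}

// One, two and three continuation bytes give the UTF-8 lead-byte prefixes
// 110xxxxx, 1110xxxx and 11110xxx.
var prefixes = [leadBytePrefix(1), leadBytePrefix(2), leadBytePrefix(3)];
// prefixes is [192, 224, 240]
```

The prefix is only final after the last iteration, which is why the golfed loop rewrites `b[j]` on every pass instead of setting it once.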
function(a,b,c,d,e,f,g,h,j){for(b=[e=f=d=0];c=a.charCodeAt(d++);){g=128;c<g?b[e]=c:c<2048?f=1:c<65536?f=2:c<2<<20&&(f=3);for(j=e++,h=f;f-->0;)g+=(2<<(5-f)),b[e++]=128+(c>>f*6&63),b[j]=g+(c>>6*h)}return b}
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
Version 2, December 2004
Copyright (C) 2011 YOUR_NAME_HERE <YOUR_URL_HERE>
Everyone is permitted to copy and distribute verbatim or modified
copies of this license document, and changing it is allowed as long
as the name is changed.
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. You just DO WHAT THE FUCK YOU WANT TO.
{
  "name": "stringToByteArray",
  "description": "Convert a string of characters to an array of UTF-8 bytes",
  "keywords": [
    "cryptography",
    "utf8"
  ]
}
function stringToByteArray(str) {
  var bytes = [], code, i;
  for (i = 0; i < str.length; i++) {
    code = str.charCodeAt(i); // UTF-16 code unit, always below 65536
    if (code < 128) {
      // 1 byte: ASCII range
      bytes.push(code);
    } else if (code < 2048) {
      // 2 bytes
      bytes.push(192+(code>>6), 128+(code&63));
    } else if (code < 65536) {
      // 3 bytes; each half of a surrogate pair also lands here
      bytes.push(224+(code>>12), 128+((code>>6)&63), 128+(code&63));
    } else if (code < 2097152) {
      // 4 bytes; not reachable with charCodeAt
      bytes.push(240+(code>>18), 128+((code>>12)&63), 128+((code>>6)&63), 128+(code&63));
    }
  }
  return bytes;
}
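Because `charCodeAt` returns UTF-16 code units, a character outside the Basic Multilingual Plane (such as 👍) reaches the function above as two surrogates and comes out as two 3-byte sequences rather than one 4-byte UTF-8 sequence. The following is a minimal sketch of a code-point-aware variant, assuming an ES2015 environment where `String.prototype.codePointAt` is available. The name `stringToUtf8Bytes` and the choice to replace lone surrogates with U+FFFD are assumptions made for illustration, not something the gist specifies.

```javascript
// Hypothetical variant (not part of the gist): encodes whole code points.
function stringToUtf8Bytes(str) {
  var bytes = [], code, i;
  for (i = 0; i < str.length; i++) {
    code = str.codePointAt(i);   // joins a valid surrogate pair into one code point
    if (code > 0xFFFF) i++;      // the pair used two UTF-16 code units, skip the second
    if (code >= 0xD800 && code <= 0xDFFF) code = 0xFFFD; // assumption: lone surrogate becomes U+FFFD
    if (code < 128) {
      bytes.push(code);
    } else if (code < 2048) {
      bytes.push(192 + (code >> 6), 128 + (code & 63));
    } else if (code < 65536) {
      bytes.push(224 + (code >> 12), 128 + ((code >> 6) & 63), 128 + (code & 63));
    } else {
      bytes.push(240 + (code >> 18), 128 + ((code >> 12) & 63),
                 128 + ((code >> 6) & 63), 128 + (code & 63));
    }
  }
  return bytes;
}
```

With this sketch, `stringToUtf8Bytes("\uD83D\uDC4D")` (the 👍 character) returns `[240, 159, 145, 141]`, while strings made only of BMP characters produce the same bytes as the `charCodeAt` version.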
<!DOCTYPE html>
<title>stringToByteArray</title>
<div><b id="ret"></b> Before golfing value</div>
<div><b id="ret2"></b> After golfing value</div>
<script>
  // Run the readable version and the golfed version on the same input
  // so the two results can be compared side by side.
  function stringToByteArrayOld(str) {
    var bytes = [], code, i;
    for (i = 0; i < str.length; i++) {
      code = str.charCodeAt(i);
      if (code < 128) {
        bytes.push(code);
      } else if (code < 2048) {
        bytes.push(192+(code>>6), 128+(code&63));
      } else if (code < 65536) {
        bytes.push(224+(code>>12), 128+((code>>6)&63), 128+(code&63));
      } else if (code < 2097152) {
        bytes.push(240+(code>>18), 128+((code>>12)&63), 128+((code>>6)&63), 128+(code&63));
      }
    }
    return bytes;
  }
  var stringToByteArray = function(a,b,c,d,e,f,g,h,j){for(b=[e=f=d=0];c=a.charCodeAt(d++);){g=128;c<g?b[e]=c:c<2048?f=1:c<65536?f=2:c<2<<20&&(f=3);for(j=e++,h=f;f-->0;)g+=(2<<(5-f)),b[e++]=128+(c>>f*6&63),b[j]=g+(c>>6*h)}return b};
  document.getElementById( "ret" ).innerHTML = stringToByteArrayOld("hello☺䭢it works👍");
  document.getElementById( "ret2" ).innerHTML = stringToByteArray("hello☺䭢it works👍");
</script>
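The demo compares the two versions on a single string, which leaves most of the input range untested. As a rough sketch of a broader check, the harness below feeds every non-zero UTF-16 code unit to two implementations and reports the first one on which they disagree. All names (`reference`, `golfed`, `golfedG4`, `firstMismatch`) are hypothetical. `reference` is the readable version with its unreachable four-byte branch omitted, `golfed` assumes a two-byte limit of 2048 and a parenthesised `(f=3)`, and `golfedG4` keeps the `c<g*4` comparison for contrast. Code unit 0 is skipped because the golfed loop stops at a NUL character.

```javascript
// Readable reference (four-byte branch omitted: charCodeAt never reaches it).
function reference(str) {
  var bytes = [], code, i;
  for (i = 0; i < str.length; i++) {
    code = str.charCodeAt(i);
    if (code < 128) bytes.push(code);
    else if (code < 2048) bytes.push(192 + (code >> 6), 128 + (code & 63));
    else bytes.push(224 + (code >> 12), 128 + ((code >> 6) & 63), 128 + (code & 63));
  }
  return bytes;
}

// Golfed version with a two-byte limit of 2048.
var golfed = function(a,b,c,d,e,f,g,h,j){for(b=[e=f=d=0];c=a.charCodeAt(d++);){g=128;c<g?b[e]=c:c<2048?f=1:c<65536?f=2:c<2<<20&&(f=3);for(j=e++,h=f;f-->0;)g+=(2<<(5-f)),b[e++]=128+(c>>f*6&63),b[j]=g+(c>>6*h)}return b};

// Same code with `c<g*4` as the two-byte limit (g*4 is 512).
var golfedG4 = function(a,b,c,d,e,f,g,h,j){for(b=[e=f=d=0];c=a.charCodeAt(d++);){g=128;c<g?b[e]=c:c<g*4?f=1:c<65536?f=2:c<2<<20&&(f=3);for(j=e++,h=f;f-->0;)g+=(2<<(5-f)),b[e++]=128+(c>>f*6&63),b[j]=g+(c>>6*h)}return b};

// Returns the first code unit on which the two functions disagree, or -1.
function firstMismatch(f1, f2) {
  for (var unit = 1; unit < 65536; unit++) {
    var s = String.fromCharCode(unit);
    if (f1(s).join() !== f2(s).join()) return unit;
  }
  return -1;
}

var agree = firstMismatch(reference, golfed);      // -1: no disagreement found
var disagree = firstMismatch(reference, golfedG4); // 512: encoded as 3 bytes instead of 2
```

A single-character sweep like this does not cover interactions between neighbouring characters, so it complements the string-based demo rather than replacing it.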
@bytespider

FYI, the demo doesn't seem to work: the before and after values seem to differ.

@mikesherov

That's weird. I modified your demo to compare the function's output before and after golfing, and it seems to work every time I run it.
