@tudisco
Created February 20, 2010 03:22
JavaScript UTF-8 encode and decode
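//+ Jonas Raoni Soares Silva
//@ http://jsfromhell.com/geral/utf-8 [v1.0]
// Works on "byte strings": each character of the encoded string holds one UTF-8 byte.
// Only characters up to U+00FF round-trip; larger code units are not handled.
var UTF8 = {
    // Replace each non-ASCII character (code 128 and above) with a two-byte sequence:
    // lead byte 0xC0 | high bits, continuation byte 0x80 | low six bits.
    encode: function(s){
        for(var c, i = -1, l = (s = s.split("")).length, o = String.fromCharCode; ++i < l;
            s[i] = (c = s[i].charCodeAt(0)) >= 128 ? o(0xc0 | (c >>> 6)) + o(0x80 | (c & 0x3f)) : s[i]
        );
        return s.join("");
    },
    // Merge each lead/continuation pair back into one character. Any other character
    // with bit 7 set becomes U+0080, and the character after it is dropped.
    decode: function(s){
        for(var a, b, i = -1, l = (s = s.split("")).length, o = String.fromCharCode, c = "charCodeAt"; ++i < l;
            ((a = s[i][c](0)) & 0x80) &&
            (s[i] = (a & 0xfc) == 0xc0 && i + 1 < l && ((b = s[i + 1][c](0)) & 0xc0) == 0x80 ?
            o(((a & 0x03) << 6) + (b & 0x3f)) : o(128), s[++i] = "")
        );
        return s.join("");
    }
};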
@mathiasbynens

This solution doesn’t work correctly for astral symbols, e.g. UTF8.encode('\uD834\uDF06').

'\uD834\uDF06' is U+1D306.

The result should be F0 9D 8C 86.

// …but with the current implementation: 
UTF8.encode('\uD834\uDF06') == '\xF0\x9D\x8C\x86' // false
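
One way to get the expected bytes is to combine each surrogate pair into a single code point before encoding, then emit one to four bytes depending on its size. The following is only a minimal sketch and not the gist author's code: `utf8EncodeFull` is a hypothetical name, and replacing unpaired surrogates with U+FFFD is an assumption. It keeps the gist's convention of returning a string in which each character holds one byte.

```javascript
// Hypothetical helper (not part of the gist): encode a JavaScript string as a
// "byte string" in which each character holds one UTF-8 byte (0x00-0xFF).
function utf8EncodeFull(s) {
    var out = "", o = String.fromCharCode;
    for (var i = 0; i < s.length; i++) {
        var c = s.charCodeAt(i);
        // Combine a high surrogate and the following low surrogate into one code point.
        if (c >= 0xd800 && c <= 0xdbff && i + 1 < s.length) {
            var d = s.charCodeAt(i + 1);
            if (d >= 0xdc00 && d <= 0xdfff) {
                c = 0x10000 + ((c - 0xd800) << 10) + (d - 0xdc00);
                i++; // the low surrogate has been consumed
            }
        }
        // Assumption: any surrogate still unpaired here is replaced with U+FFFD.
        if (c >= 0xd800 && c <= 0xdfff) c = 0xfffd;
        if (c < 0x80) {
            out += o(c);                                   // 1 byte: ASCII
        } else if (c < 0x800) {
            out += o(0xc0 | (c >> 6),                      // 2 bytes
                     0x80 | (c & 0x3f));
        } else if (c < 0x10000) {
            out += o(0xe0 | (c >> 12),                     // 3 bytes
                     0x80 | ((c >> 6) & 0x3f),
                     0x80 | (c & 0x3f));
        } else {
            out += o(0xf0 | (c >> 18),                     // 4 bytes: astral symbols
                     0x80 | ((c >> 12) & 0x3f),
                     0x80 | ((c >> 6) & 0x3f),
                     0x80 | (c & 0x3f));
        }
    }
    return out;
}
```

With this sketch, `utf8EncodeFull('\uD834\uDF06') == '\xF0\x9D\x8C\x86'` evaluates to `true`.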
