@boushley
Last active May 17, 2019 09:11
A JavaScript UTF-8 decoding function for ArrayBuffers. Credit for most of the heavy lifting goes to "bob" http://ciaranj.blogspot.com/2007/11/utf8-characters-encoding-in-javascript.html
function decodeUtf8(arrayBuffer) {
    var result = "";
    var i = 0;
    var c = 0;
    var c2 = 0;
    var c3 = 0;
    var data = new Uint8Array(arrayBuffer);

    // If we have a BOM, skip it
    if (data.length >= 3 && data[0] === 0xef && data[1] === 0xbb && data[2] === 0xbf) {
        i = 3;
    }

    while (i < data.length) {
        c = data[i];
        if (c < 128) {
            // Single byte (ASCII)
            result += String.fromCharCode(c);
            i++;
        } else if (c > 191 && c < 224) {
            // Two-byte sequence: 110xxxxx 10xxxxxx
            if (i + 1 >= data.length) {
                throw new Error("UTF-8 Decode failed. Two byte character was truncated.");
            }
            c2 = data[i + 1];
            result += String.fromCharCode(((c & 31) << 6) | (c2 & 63));
            i += 2;
        } else {
            // Every other lead byte is treated as a three-byte sequence
            // (1110xxxx 10xxxxxx 10xxxxxx); four-byte sequences are not handled.
            if (i + 2 >= data.length) {
                throw new Error("UTF-8 Decode failed. Multi byte character was truncated.");
            }
            c2 = data[i + 1];
            c3 = data[i + 2];
            result += String.fromCharCode(((c & 15) << 12) | ((c2 & 63) << 6) | (c3 & 63));
            i += 3;
        }
    }
    return result;
}
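The bit arithmetic in the two-byte and three-byte branches can be checked by hand on a couple of known characters. The sketch below is only an illustration: `decodeTwoBytes` and `decodeThreeBytes` are hypothetical helper names that do not appear in the gist, and the cross-check assumes an environment that provides the built-in `TextDecoder` (modern browsers and recent Node.js versions).

```javascript
// Hypothetical helpers (not part of the gist) that isolate the bit arithmetic.

// Two-byte sequence 110xxxxx 10yyyyyy: keep 5 bits of the lead byte, 6 of the next.
function decodeTwoBytes(b1, b2) {
    return String.fromCharCode(((b1 & 31) << 6) | (b2 & 63));
}

// Three-byte sequence 1110xxxx 10yyyyyy 10zzzzzz: keep 4 + 6 + 6 bits.
function decodeThreeBytes(b1, b2, b3) {
    return String.fromCharCode(((b1 & 15) << 12) | ((b2 & 63) << 6) | (b3 & 63));
}

// U+00E9 is encoded as C3 A9: (3 << 6) | 41 = 233 = 0xE9
var eAcute = decodeTwoBytes(0xc3, 0xa9);

// U+20AC is encoded as E2 82 AC: (2 << 12) | (2 << 6) | 44 = 0x20AC
var euro = decodeThreeBytes(0xe2, 0x82, 0xac);

// Cross-check against the platform decoder, assuming TextDecoder is available.
var builtIn = new TextDecoder("utf-8").decode(
    new Uint8Array([0xc3, 0xa9, 0xe2, 0x82, 0xac])
);
```

Where `TextDecoder` is available, it may be the simpler choice, since it also covers four-byte sequences, which the hand-written masks above do not.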
@ChristianUlbrich

"var c1 = 0;" should be "var c3 = 0;": c1 is never used, and c3 is otherwise never declared, which causes an error in strict mode.
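A minimal sketch of the strict-mode behavior described in this comment: assigning to a name that was never declared throws a `ReferenceError` in strict mode, whereas sloppy mode silently creates a global. `assignUndeclared` and `neverDeclared` are hypothetical names used only for illustration.

```javascript
// Hypothetical helper (not part of the gist): assigns to an undeclared name.
function assignUndeclared() {
    "use strict";
    try {
        neverDeclared = 1; // no var/let/const exists anywhere for this name
        return "assignment succeeded";
    } catch (e) {
        return e.name;
    }
}

var outcome = assignUndeclared(); // "ReferenceError"
```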

@loon3

loon3 commented Sep 17, 2015

Just what I needed! Thanks!

FYI: this works great with the file.getBuffer() function in the WebTorrent library.

@pascaldekloe