@mathiasbynens
Created September 26, 2011 19:50
Escape all characters in a string using both Unicode and hexadecimal escape sequences
// Ever needed to escape '\n' as '\\n'? This function does that for any character,
// using hex and/or Unicode escape sequences (whichever are shortest).
// Demo: http://mothereff.in/js-escapes
function unicodeEscape(str) {
  return str.replace(/[\s\S]/g, function(character) {
    var escape = character.charCodeAt().toString(16),
        longhand = escape.length > 2;
    return '\\' + (longhand ? 'u' : 'x') + ('0000' + escape).slice(longhand ? -4 : -2);
  });
}
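A few sample calls may make the "whichever are shortest" rule concrete. This is only an illustrative sketch: the function is copied from above so the snippet runs on its own, and the sample inputs are arbitrary choices, not taken from the gist.

```javascript
// Copy of the gist's function so this snippet is self-contained.
function unicodeEscape(str) {
  return str.replace(/[\s\S]/g, function(character) {
    var escape = character.charCodeAt(0).toString(16),
        longhand = escape.length > 2;
    return '\\' + (longhand ? 'u' : 'x') + ('0000' + escape).slice(longhand ? -4 : -2);
  });
}

// Code units up to 0xFF fit in a two-digit \xHH escape.
console.log(unicodeEscape('\n'));            // \x0a
console.log(unicodeEscape('a\u00A9'));       // \x61\xa9

// Anything above 0xFF needs the four-digit \uHHHH form.
console.log(unicodeEscape('\u2102'));        // \u2102

// The regex has no `u` flag, so an astral character (here U+1D538)
// is matched and escaped as two separate surrogate code units.
console.log(unicodeEscape('\uD835\uDD38'));  // \ud835\udd38
```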

josephrocca commented Jun 18, 2020

@mervick @rafaelvanat If I use that function like this:

escapeUnicode("abc𝔸𝔹ℂ")

Then I get:

abc𝔸𝔹\u2102

The following function fixes this. It splits the string in a Unicode-safe way (using [...str], which iterates by code point), leaves ASCII characters untouched, and splits each non-ASCII code point into its UTF-16 code units, emitting a \uXXXX escape for each code unit (rather than just grabbing the first char code of each character):

function escapeUnicode(str) {
  return [...str].map(c => /^[\x00-\x7F]$/.test(c) ? c : c.split("").map(a => "\\u" + a.charCodeAt().toString(16).padStart(4, "0")).join("")).join("");
}

This gives the correct result:

abc\ud835\udd38\ud835\udd39\u2102

This seems to work fine in all my tests so far, but if I find any bugs I'll add fixes in this gist. Performance doesn't matter for my use-case, so I haven't benchmarked or optimised it at all.
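To illustrate the difference between the two kinds of splitting used above, here is a small sketch. The sample string is arbitrary, and `toCodePointEscape` is a hypothetical helper added only for comparison; it is not part of the function above and it emits the ES2015 `\u{...}` form, which older engines do not parse.

```javascript
const s = 'a\u{1D538}'; // "a" followed by U+1D538, which lies outside the BMP

console.log(s.length);           // 3 (UTF-16 code units)
console.log([...s].length);      // 2 (code points)
console.log(s.split('').length); // 3 (split('') cuts between code units)

// The second code point consists of two surrogate code units.
const units = [...s][1].split('').map(u => u.charCodeAt(0).toString(16));
console.log(units); // [ 'd835', 'dd38' ]

// Hypothetical alternative: one escape per code point instead of per code unit.
const toCodePointEscape = c => '\\u{' + c.codePointAt(0).toString(16) + '}';
console.log(toCodePointEscape([...s][1])); // \u{1d538}
```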

@mathiasbynens
Author

Check out jsesc which solves this problem in a more robust manner.


josephrocca commented Jun 19, 2020

@mathiasbynens It looks great! I did try to use it, but I'm not up to date with all the browserify/bundling stuff, and I just need a vanilla JS script (e.g. no use of Buffer) that I can include via a module import. I wasn't able to work out how to do that with jsesc, though I admit I only poked around for a few minutes before deciding to write the function above. Also, out of pure curiosity, I'd be interested in cases where the above function fails; I couldn't find any failing cases in my tests.
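One candidate edge case, offered as a sketch rather than a definitive answer: because ASCII is passed through unchanged, a literal backslash is not escaped, so two different inputs can produce the same output. Whether that counts as a failure depends on whether the result has to be unambiguous, for example when it is pasted into a string literal. `escapeBackslashFirst` is a hypothetical wrapper shown only to illustrate one possible workaround.

```javascript
// Copy of the function from the comment above so this snippet runs on its own.
function escapeUnicode(str) {
  return [...str].map(c => /^[\x00-\x7F]$/.test(c) ? c : c.split("").map(a => "\\u" + a.charCodeAt().toString(16).padStart(4, "0")).join("")).join("");
}

const literal = '\\u2102'; // six ASCII characters: backslash, u, 2, 1, 0, 2
const actual = '\u2102';   // the single character U+2102

// Both inputs map to the same six-character output.
console.log(escapeUnicode(literal) === escapeUnicode(actual)); // true

// Hypothetical workaround: double existing backslashes before escaping.
const escapeBackslashFirst = str => escapeUnicode(str.replace(/\\/g, '\\\\'));
console.log(escapeBackslashFirst(literal)); // \\u2102
console.log(escapeBackslashFirst(actual));  // \u2102
```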
