@mathiasbynens
Created September 26, 2011 19:50
Escape all characters in a string using both Unicode and hexadecimal escape sequences
// Ever needed to escape '\n' as '\\n'? This function does that for any character,
// using hex and/or Unicode escape sequences (whichever are shortest).
// Demo: http://mothereff.in/js-escapes
function unicodeEscape(str) {
  return str.replace(/[\s\S]/g, function(character) {
    var escape = character.charCodeAt().toString(16),
        longhand = escape.length > 2;
    return '\\' + (longhand ? 'u' : 'x') + ('0000' + escape).slice(longhand ? -4 : -2);
  });
}
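
For reference, here is roughly what the function above returns for a few inputs. This is only a usage sketch: the function body is repeated from the gist so the snippet runs on its own, and the sample strings are arbitrary choices, not taken from the original.

```javascript
// Same function as in the gist, repeated so the snippet is self-contained.
function unicodeEscape(str) {
  return str.replace(/[\s\S]/g, function (character) {
    var escape = character.charCodeAt(0).toString(16),
        longhand = escape.length > 2;
    return '\\' + (longhand ? 'u' : 'x') + ('0000' + escape).slice(longhand ? -4 : -2);
  });
}

// Code units up to 0xFF get the short \xNN form.
console.log(unicodeEscape('\n'));      // \x0a
console.log(unicodeEscape('a\u00a9')); // \x61\xa9

// Anything longer than two hex digits gets the \uNNNN form.
console.log(unicodeEscape('\u2102'));  // \u2102

// The regex has no `u` flag, so an astral character is escaped as its two surrogates.
console.log(unicodeEscape('\ud835\udd38')); // \ud835\udd38
```

Note that every character is escaped, including plain ASCII letters, which is what the gist title promises.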
@brandonros

I'm throwing this up here hoping to help somebody else down the road.

I had to restore partial keys from a Redis dump, and this function almost helped. Here is what I came up with.

Make sure to create the Redis client like this:

var client = redis.createClient(global['redis_port'], global['redis_host'], { return_buffers: true });

var fs = require('fs');

var redis = require('../lib/redis.js');

function e(buf) {
    var res = '';

    for (var i = 0; i < buf.length; ++i) {
        var c = buf[i].toString(16);
        if (c.length == 1) {
            c = '0' + c;
        }

        res += '\\x' + c;
    }

    return res;
}

function generate_dump() {
    var keys = fs.readFileSync('keys.txt').toString().split('\n');

    return keys.reduce(function (prev, key) {
        return prev.then(function () {
            return redis.dump(key)
                .then(function (res) {
                    if (!res) {
                        console.log('missing key', key);

                        return;
                    }

                    fs.appendFileSync('dump.txt', 'RESTORE ' + key + ' 0 "' + e(res) + '"\n');
                });
        });
    }, Promise.resolve());
}

redis.init()
.then(function () {
    return generate_dump();
})
.then(function () {
    console.log('done');
})
.catch(function (err) {
    console.log(err['stack']);
});
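
The byte-escaping helper above can be checked on its own, without a Redis connection. The following is a minimal sketch assuming Node.js (for `Buffer`); `escapeBuffer` is a hypothetical standalone rename of `e`, and the sample bytes are made up rather than real `DUMP` output.

```javascript
// Hypothetical standalone version of the `e` helper above:
// hex-escapes every byte of a Node.js Buffer as \xNN.
function escapeBuffer(buf) {
  var res = '';
  for (var i = 0; i < buf.length; ++i) {
    // padStart keeps single-digit values such as 0x0a two characters wide.
    res += '\\x' + buf[i].toString(16).padStart(2, '0');
  }
  return res;
}

// Made-up sample bytes, not real DUMP output.
var sample = Buffer.from([0x00, 0x0a, 0x41, 0xff]);
console.log(escapeBuffer(sample)); // \x00\x0a\x41\xff
```

Unlike the string-based function in the gist, this works on raw bytes, which is why the client has to return buffers instead of decoded strings.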

@adamvleggett

If the goal is to do this with minimal code size, the following works well and minifies to ~100 bytes:

function escapeUnicode(str) {
    return str.replace(/[^\0-~]/g, function(ch) {
        return "\\u" + ("000" + ch.charCodeAt().toString(16)).slice(-4);
    });
}
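
A quick check of the compact version, as a sketch with arbitrary sample strings. Because the regex has no `u` flag, it matches one UTF-16 code unit at a time, so astral characters come out as two escapes. The range `\0-~` also leaves DEL (U+007F) outside the "safe" set, so it gets escaped too.

```javascript
// Same function as above, repeated so the snippet is self-contained.
function escapeUnicode(str) {
  return str.replace(/[^\0-~]/g, function (ch) {
    return "\\u" + ("000" + ch.charCodeAt(0).toString(16)).slice(-4);
  });
}

console.log(escapeUnicode("caf\u00e9"));       // caf\u00e9
console.log(escapeUnicode("abc\ud835\udd38")); // abc\ud835\udd38
console.log(escapeUnicode("\x7f"));            // \u007f
```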

@F1LT3R

F1LT3R commented Dec 15, 2016

Fantastic! Thanks for this @mathiasbynens!

@mervick

mervick commented Nov 13, 2018

Escape only non-ASCII characters (U+00A0 and above):

function escapeUnicode(str) {
  return str.replace(/[\u00A0-\uffff]/gu, function (c) {
    return "\\u" + ("000" + c.charCodeAt().toString(16)).slice(-4)
  });
}

I use this to convert the UTF-8 content of JS files to Latin-1.

@rafaelvanat

Very interesting work guys, thanks for sharing.
@mervick, your version was especially useful for my use case. Are there any restrictions on using it? Thanks!

@mervick

mervick commented Dec 19, 2019

@rafaelvanat I have used that in my project for more than a year, and so far there have been no problems.

@josephrocca

josephrocca commented Jun 18, 2020

@mervick @rafaelvanat If I use that function like this:

escapeUnicode("abc𝔸𝔹ℂ")

Then I get:

abc𝔸𝔹\u2102

The following function fixes this by matching all non-ASCII characters after splitting the string in a "unicode-safe" way (using [...str]). It then splits each non-ASCII character into its UTF-16 code units and emits an escape for each one (rather than just grabbing the first char code of each character):

function escapeUnicode(str) {
  return [...str].map(c => /^[\x00-\x7F]$/.test(c)
    ? c
    : c.split("").map(a => "\\u" + a.charCodeAt().toString(16).padStart(4, "0")).join("")
  ).join("");
}

This gives the correct result:

abc\ud835\udd38\ud835\udd39\u2102

This seems to work fine in all my tests so far, but if I find any bugs I'll add fixes in this gist. Performance doesn't matter for my use-case, so I haven't benchmarked or optimised it at all.
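
The difference comes down to the `u` flag. With it, a character class matches whole code points, so 𝔸 (U+1D538) is treated as a single code point above U+FFFF and falls outside the `\u00A0-\uffff` range. The sketch below shows that behaviour next to one possible alternative, which is not from anyone in this thread: drop the `u` flag so the regex walks UTF-16 code units. `escapeNonAscii` is a hypothetical name, and the `JSON.parse` round trip is only a sanity check that works for inputs without quotes, backslashes, or control characters.

```javascript
// With the `u` flag, 𝔸 is seen as the single code point U+1D538,
// which lies outside U+00A0..U+FFFF, so it is left unescaped.
const withUFlag = "abc𝔸𝔹ℂ".replace(/[\u00A0-\uffff]/gu, c =>
  "\\u" + c.charCodeAt(0).toString(16).padStart(4, "0"));

// Hypothetical alternative: without the `u` flag the regex matches
// UTF-16 code units, so each surrogate half is escaped separately.
function escapeNonAscii(str) {
  return str.replace(/[^\x00-\x7F]/g, c =>
    "\\u" + c.charCodeAt(0).toString(16).padStart(4, "0"));
}

const escaped = escapeNonAscii("abc𝔸𝔹ℂ");

console.log(withUFlag); // abc𝔸𝔹\u2102
console.log(escaped);   // abc\ud835\udd38\ud835\udd39\u2102

// Sanity check: the escaped text decodes back to the original string.
console.log(JSON.parse('"' + escaped + '"') === "abc𝔸𝔹ℂ"); // true
```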

@mathiasbynens
Author

Check out jsesc, which solves this problem in a more robust manner.

@josephrocca

josephrocca commented Jun 19, 2020

@mathiasbynens It looks great! I did try to use it, but unfortunately I'm not up to date with all the browserify/bundling stuff. I just need a vanilla JS script (e.g. no use of Buffer) to include in a module import, and I wasn't able to work out how to do that with jsesc (though I admit I only poked around for a few minutes before deciding to write the function above). Also, out of pure curiosity, I'd be interested in cases where the above function fails. I couldn't find any failing cases in my tests.
