Skip to content

Instantly share code, notes, and snippets.

@allenwb
Created February 19, 2012 01:21
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save allenwb/1861530 to your computer and use it in GitHub Desktop.
Save allenwb/1861530 to your computer and use it in GitHub Desktop.
JavaScript utf-16 JSON round-tripping experiment

This is a response to Git 1850768 For some reason the following wouldn't post as a comment

@mranney @piscisaureus

I did some experiments and I don't see any round-tripping issues showing up, at least in FF:

var z= "\ud83d\ude38";   //u+1f638
console.log("z.length: " + z.length);  //expect 2
console.log("z:  >"+z+"<");
console.log("JSON.stringify(z): >"+JSON.stringify(z)+"<");
console.log("JSON.parse(JSON.stringify(z))===z: "+(JSON.parse(JSON.stringify(z))==z));
var qz='"'+z+'"';
console.log("evals(qz)===z: "+(eval(qz)===z));
console.log("eval(qz) == JSON.parse(qz): "+(eval(qz)===JSON.parse(qz)));
console.log("JSON.stringify(JSON.parse(qz))===qz: "+(JSON.stringify(JSON.parse(qz))===qz))
[17:47:27.489] z.length: 2
[17:47:27.497] z:  >😸< 
[17:47:27.504] JSON.stringify(z): >"😸"< 
[17:47:27.511] JSON.parse(JSON.stringify(z))===z: true 
[17:47:27.518] evals(qz)===z: true 
[17:47:27.525] eval(qz) == JSON.parse(qz): true 
[17:47:27.532] JSON.stringify(JSON.parse(qz))===qz: true 

--

This example seems to round trip perfectly. The only thing that needs to be kept in mind is that the JS parser (processing an externally provided source stream) UTF-16 encodes supplemental characters that it encounters in string literals and that both eval and JSON.parse (both of which always get their input from JS strings) simply pass through any surrogate pairs that they encounter in string literals.

This isn't uniform handling of BMP and supplemental characters as you need to be aware that UTF-16 encoding is used. But if the major worry is JSON round tripping, that seems to work.

Do you have examples where JSON round-tripping doesn't work?

@piscisaureus
Copy link

@allenwb I have to agree, that works in node too.

> var z = "\ud83d\ude38";
undefined
> z
'��'
> z.length
2
> JSON.stringify(z)
'"��"'
> var z2 = JSON.parse(JSON.stringify(z))
undefined
> z2
'"��"'
> z === z2
true

Funny, btw, that pasting \ud83d in a github comment makes github respond with 422 Unprocessable Entity

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment