@DerekNonGeneric
Created May 5, 2023 02:33

Here are a few niche cases where the normalizeString() function could have an observable effect, because it re-tokenizes the string (splits it and joins it back together):

  1. Non-breaking space and joiner characters: These characters either look like ordinary spaces or are invisible, and they tell the renderer not to break a line between the characters they connect. Splitting and re-joining a string that contains them can replace them with an ordinary space, which loses the non-breaking behavior. For example:

let str = 'hello' + '\u00A0' + 'world'; // non-breaking space (U+00A0) between 'hello' and 'world'
let result = normalizeString(str);
// If normalizeString() splits on /\s+/ and rejoins with ' ', the U+00A0 becomes an ordinary space (U+0020)
  2. Zero-width characters: Certain Unicode characters, such as the zero-width space (U+200B) and the zero-width joiner (U+200D), occupy a position in the string and count toward its length, but render no visible glyph. Re-tokenizing the string could drop them, move them, or split the string at their position, depending on which characters the tokenizer treats as delimiters. For example:

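let str = 'hel' + '\u200B' + 'lo'; // zero-width space (U+200B) between 'hel' and 'lo'
let result = normalizeString(str);
// Whether the U+200B survives depends on the delimiter: JavaScript's \s does not match it,
// but a tokenizer that treats it as a separator would remove it or split the word here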
  3. String "reset": Highly niche, but splitting and re-joining a string could be used deliberately to "reset" its spacing (e.g. collapsing stray double spacing in Thai or Lao text). Again an edge case, but one where rebuilding the string, with no other intended change, has an effect.
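
The first two cases can be sketched as follows. The actual normalizeString() is not shown here, so the whitespace split/join below is an assumption made purely for illustration, and codePoints() is a hypothetical helper for making invisible characters visible. Under this assumed implementation the non-breaking space is replaced while the zero-width space passes through untouched; a tokenizer with different delimiters may behave differently.

```javascript
// Hypothetical implementation: the real normalizeString() is not shown,
// so this whitespace split/join is an assumption for illustration only.
function normalizeString(str) {
  return str.split(/\s+/).join(' ');
}

// Hypothetical helper: list each code point as U+XXXX so that
// invisible characters show up when comparing strings.
function codePoints(str) {
  return Array.from(str, (ch) =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
  );
}

const nbsp = 'hello\u00A0world';
const zwsp = 'hel\u200Blo';

// \s matches U+00A0, so the non-breaking space becomes an ordinary space.
console.log(normalizeString(nbsp) === nbsp);       // false
console.log(codePoints(normalizeString(nbsp))[5]); // U+0020

// \s does not match U+200B, so the zero-width space is left in place.
console.log(normalizeString(zwsp) === zwsp);       // true
```

Comparing code points rather than printed strings matters here, because the input and output look identical on screen in both cases.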

In summary, while normalizeString() offers little for most ordinary string processing, splitting and re-joining a string can still change it in subtle ways when it contains Unicode characters, such as non-breaking spaces or zero-width characters, whose meaning depends on exactly where they sit and on not being treated as ordinary whitespace.
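
One way to tell whether a given string falls into these edge cases is to scan it for the characters mentioned above before re-tokenizing it. The sketch below is only an illustration: findSensitiveCharacters() and the SENSITIVE table are hypothetical names, and the character list is deliberately short rather than exhaustive.

```javascript
// Hypothetical lookup table of characters that a re-tokenizing step
// might replace, drop, or split on. Illustrative, not exhaustive.
const SENSITIVE = new Map([
  ['\u00A0', 'NO-BREAK SPACE'],
  ['\u200B', 'ZERO WIDTH SPACE'],
  ['\u200C', 'ZERO WIDTH NON-JOINER'],
  ['\u200D', 'ZERO WIDTH JOINER'],
]);

// Hypothetical helper: returns the index and name of every listed
// character found in str. All listed characters are single UTF-16
// code units, so indexing by code unit is sufficient here.
function findSensitiveCharacters(str) {
  const hits = [];
  for (let i = 0; i < str.length; i++) {
    const name = SENSITIVE.get(str[i]);
    if (name !== undefined) hits.push({ index: i, name });
  }
  return hits;
}

// One hit at index 3 (ZERO WIDTH SPACE).
console.log(findSensitiveCharacters('hel\u200Blo'));

// No hits: an ordinary space is not in the table.
console.log(findSensitiveCharacters('hello world').length); // 0
```

An empty result suggests the string contains none of the listed characters; it does not guarantee that re-tokenizing leaves the string unchanged.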
