Last active
September 7, 2021 07:06
/*
# Problem
In Swift, it can be cumbersome to work with Unicode characters that are
non-printing, confusable, or have difficulty rendering in the editor.
For example, to generate the "Family: Woman, Girl" emoji:
*/

// Option 1: Unicode Scalar Value Escapes
"\u{1F469}\u{200D}\u{1F467}"

// Option 2: Commented Declaration + Interpolation
let zwj: Character = "\u{200D}" // ZERO WIDTH JOINER
"👩\(zwj)👧"

/*
# Proposed Solution
Add \N{name} escape sequence for named Unicode characters.
*/
"\N{WOMAN}\N{ZERO WIDTH JOINER}\N{GIRL}"

// Consider the 24 Unicode characters
// comprising the Punctuation, Dash [Pd] category,
// such as:
/*
U+002D  HYPHEN-MINUS         -
U+2010  HYPHEN               ‐
U+2011  NON-BREAKING HYPHEN  ‑
U+2012  FIGURE DASH          ‒
U+2013  EN DASH              –
U+2014  EM DASH              —
U+2015  HORIZONTAL BAR       ―
U+2E3A  TWO-EM DASH          ⸺
U+2E3B  THREE-EM DASH        ⸻
*/

// Which of these would you rather find in a code base?
"‒? \u{2012}? or \N{FIGURE DASH}?"

// The \N{} escape sequence is obscure,
// but supported in Python and a few other languages.
// Most notably, though, it's the output you get
// when you call the method `applyingTransform(_:reverse:)`
// with the `.toUnicodeName` transform:
import Foundation
"🍩".applyingTransform(.toUnicodeName, reverse: false) // \N{DOUGHNUT}
"\\N{DOUGHNUT}".applyingTransform(.toUnicodeName, reverse: true) // 🍩
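
// A rough sketch of how names can be resolved at run time today,
// without the proposed escape sequence.
// Assumptions: `string(fromUnicodeNames:)` is a hypothetical helper,
// not an existing API. It only wraps the reverse `.toUnicodeName`
// transform shown above and returns nil if the transform cannot be applied.
import Foundation

func string(fromUnicodeNames names: String...) -> String? {
    // Build "\N{A}\N{B}..." and let Foundation convert it back to characters.
    let escaped = names.map { "\\N{\($0)}" }.joined()
    return escaped.applyingTransform(.toUnicodeName, reverse: true)
}

let family = string(fromUnicodeNames: "WOMAN", "ZERO WIDTH JOINER", "GIRL")

// Going the other way, the standard library exposes the name of each
// scalar through `Unicode.Scalar.Properties.name`:
let doughnutNames = "🍩".unicodeScalars.compactMap { $0.properties.name }
// ["DOUGHNUT"]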
Note: as for the naming of the constants (Unicode names are uppercase, with spaces and hyphens, which are invalid in constant names and would need some normalization), there is some precedent for this in SE-0211, which did exactly this kind of naming normalization for Unicode properties 😉
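
To make the normalization concrete, here is a minimal sketch of one way it could work. It assumes a lowerCamelCase target and simply splits on spaces and hyphens; `identifier(fromUnicodeName:)` is a hypothetical helper, not something defined by SE-0211 or the standard library, and it does not handle names that would start with a digit.

```swift
/// Hypothetical helper: turns a Unicode character name such as
/// "ZERO WIDTH JOINER" into a lowerCamelCase identifier.
func identifier(fromUnicodeName name: String) -> String {
    // Split on spaces and hyphens, which are not valid in Swift identifiers.
    let words = name
        .split(whereSeparator: { $0 == " " || $0 == "-" })
        .map { $0.lowercased() }
    guard let first = words.first else { return "" }
    // Keep the first word lowercase; capitalize the initial of the others.
    let rest = words.dropFirst().map { word in
        word.prefix(1).uppercased() + String(word.dropFirst())
    }
    return ([first] + rest).joined()
}

print(identifier(fromUnicodeName: "ZERO WIDTH JOINER"))   // zeroWidthJoiner
print(identifier(fromUnicodeName: "NON-BREAKING HYPHEN")) // nonBreakingHyphen
```

Splitting on hyphens as well as spaces loses the distinction between names that differ only in that separator, so a real scheme would need a rule for such collisions.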