Last active November 6, 2016 18:50
The macOS filesystem uses decomposed UTF-8 characters; they look the same as their precomposed equivalents but have a different byte representation.
iconv -f utf8-mac -t utf8 |
If you need to compare characters from the file system with UTF-8 characters stored in your script, you may be surprised that they look the same but do not compare as equal. For instance, the file system gives `č` as `\x63\xCC\x8C`, but the standard UTF-8 encoding of `č` is `\xC4\x8D`.

In Perl, no `use utf8;` is required (in fact, if you did use it, the comparison would not work). See https://metacpan.org/pod/Encode::UTF8Mac for more info.
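To see the difference for yourself, here is a minimal shell sketch. It builds both byte sequences for `č` by hand and hex-dumps them. It then attempts the `iconv` conversion from the gist, but only if the local `iconv` knows `utf8-mac`, which as far as I know is the case on macOS but often not elsewhere. The variable names are made up for illustration.

```shell
#!/bin/sh
# Build both forms of "č" from explicit octal byte escapes (portable printf).
decomposed=$(printf '\143\314\214')   # "c" + combining caron: 63 CC 8C
precomposed=$(printf '\304\215')      # single code point U+010D: C4 8D

# Hex dumps make the difference visible even though both render as "č".
decomposed_hex=$(printf '%s' "$decomposed" | od -An -tx1 | tr -d ' \n')
precomposed_hex=$(printf '%s' "$precomposed" | od -An -tx1 | tr -d ' \n')
echo "decomposed:  $decomposed_hex"
echo "precomposed: $precomposed_hex"

# A plain string comparison sees different bytes.
if [ "$decomposed" = "$precomposed" ]; then
  same_raw=yes
else
  same_raw=no
fi
echo "equal without conversion: $same_raw"

# Assumption: iconv supports the utf8-mac encoding (true on macOS as far as
# I know; many other iconv builds reject it, so failure is tolerated here).
if converted=$(printf '%s' "$decomposed" | iconv -f utf8-mac -t utf8 2>/dev/null); then
  if [ "$converted" = "$precomposed" ]; then
    same_converted=yes
  else
    same_converted=no
  fi
  echo "equal after iconv: $same_converted"
else
  echo "iconv here does not support utf8-mac, conversion skipped"
fi
```

On a Mac the last line should report that the strings are equal after conversion. On other systems the script only shows the byte difference.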