@brablc
Last active November 6, 2016 18:50
The macOS file system uses decomposed UTF-8 characters; they look the same as precomposed ones but have a different byte representation.
iconv -f utf8-mac -t utf8

brablc commented Nov 6, 2016

If you need to compare characters from the file system with UTF-8 characters stored in your script, you may be surprised that they look the same but do not compare as equal. For instance, the file system returns č as \x63\xCC\x8C (the letter c followed by a combining caron), whereas the standard precomposed UTF-8 encoding of č is \xC4\x8D.
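The mismatch can be reproduced outside the file system. The following is a minimal Python sketch, not part of the original gist: it uses standard Unicode NFC normalization from the `unicodedata` module as an approximation of what `iconv -f utf8-mac -t utf8` does, and the helper name `same_text` is hypothetical. The macOS decomposition is a variant of NFD rather than exactly NFD, so treat this as an illustration for common characters such as č.

```python
import unicodedata

# Bytes as returned by the macOS file system: "c" followed by U+030C (combining caron).
decomposed = b"\x63\xcc\x8c".decode("utf-8")

# Bytes as usually stored in a script: the single precomposed code point U+010D.
precomposed = b"\xc4\x8d".decode("utf-8")


def same_text(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(decomposed == precomposed)            # False: different code point sequences
print(same_text(decomposed, precomposed))   # True: equal once both are composed
print(unicodedata.normalize("NFC", decomposed).encode("utf-8"))  # b'\xc4\x8d'
```

Normalizing both sides, instead of only the file name, keeps the comparison correct regardless of which form the script's own literal happens to be in.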

In Perl, use utf8; is not required (in fact, if you did use it, the comparison would not work).

See https://metacpan.org/pod/Encode::UTF8Mac for more info.
