Unicode filenames on Mac are insane. Mac OS X (HFS+) stores filenames in NFD while nearly everything else uses NFC. This fixes that.

convmv manpage

Install convmv if you don't have it

sudo apt-get install convmv

Convert all filenames in a directory (recursively) from NFD to NFC:

convmv -r -f utf8 -t utf8 --nfc --notest .

Convert all filenames in a directory (recursively) from NFC to NFD:

convmv -r -f utf8 -t utf8 --nfd --notest .
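
For anyone wondering what the two forms actually look like, here is a small sketch. It is not part of convmv; it only uses Python's standard unicodedata module, and the sample name café.txt and the helper describe are made up for illustration.

```python
import unicodedata

# Illustrative sample names (not from the gist): same visible text, two encodings.
composed = "caf\u00e9.txt"      # 'é' as a single code point U+00E9 (NFC)
decomposed = "cafe\u0301.txt"   # 'e' followed by combining acute U+0301 (NFD)


def describe(name: str) -> str:
    """Return the code points of a name as 'U+XXXX' tokens."""
    return " ".join(f"U+{ord(ch):04X}" for ch in name)


# The names render identically but compare as different strings.
print(composed == decomposed)                                 # False
print(describe(composed))
print(describe(decomposed))

# Normalizing maps one form onto the other.
print(unicodedata.normalize("NFC", decomposed) == composed)   # True
print(unicodedata.normalize("NFD", composed) == decomposed)   # True

# The UTF-8 byte lengths differ too, which is why a byte-exact lookup fails.
print(len(composed.encode("utf-8")), len(decomposed.encode("utf-8")))  # 9 10
```

A filesystem that stores names byte-for-byte treats these as two different files, which is the mismatch the convmv commands above clean up.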

@djordn

commented Apr 6, 2016

Hey man.
Your post helped me a lot!
I was going crazy with some accent files.
Thanks.

@marteiro

commented Jul 7, 2017

Dude, seriously... Saved my day.....

@LPhat

commented Oct 5, 2017

Thanks so much! Running the --nfc command helped Apache serve up files with special characters on a CentOS machine after the files had been put on an OSX machine.

@diamondsw

commented Sep 4, 2018

Also very happy to have this, although I really wish I could isolate the exact chunk of code that reads in an NFC/NFD filename, converts it, and renames the file on disk. The code handles so many cases (and handles them well, I might add), that it's hard to strip it back to its basics if, for example, you want to add a single very specific case to a script.
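
A minimal sketch of just that read-normalize-rename step, in Python rather than convmv's Perl. This is not convmv's code: the function name normalize_names and its parameters are hypothetical, and it assumes a filesystem that stores names verbatim (typical on Linux). It defaults to a dry run, much as convmv does without --notest.

```python
import os
import unicodedata


def normalize_names(root: str, form: str = "NFC", dry_run: bool = True) -> list:
    """Rename entries under `root` whose names are not already in `form`.

    Returns a list of (old_path, new_path) pairs. With dry_run=True the
    pairs are only reported and nothing on disk is touched.
    """
    changes = []
    # Walk bottom-up so a directory's contents are handled before the
    # directory itself is renamed from its parent.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            target = unicodedata.normalize(form, name)
            if target == name:
                continue  # already in the requested form
            old = os.path.join(dirpath, name)
            new = os.path.join(dirpath, target)
            if os.path.lexists(new):
                # Never overwrite: both forms exist side by side, or the
                # filesystem treats the two names as the same entry.
                continue
            changes.append((old, new))
            if not dry_run:
                os.rename(old, new)
    return changes
```

Calling normalize_names(".") lists what would change; pass dry_run=False to actually rename. It deliberately skips everything else convmv handles (non-UTF-8 names, symlink targets, case-insensitive filesystems).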

As a side note, the issue isn't technically the OS; it's the underlying filesystem. Linux can be just as boneheaded, just in the opposite direction. Try to write Unicode to an HFS+ volume under Linux, and it merrily ignores the fact that HFS+ uses NFD and writes NFC into the filesystem. Then when you try to use that data on a Mac you get a stream of "File Not Found" errors. Only solution is to either use Linux again to delete the data, or reformat the disk.

(Sadly these decisions were set in stone when HFS+ was designed over 20 years ago and Unicode was still a relatively new thing; the NFC vs NFD thing hadn't been settled yet. Not a clue what APFS does now.)

@LazyRen

commented Oct 11, 2018

Thanks a lot. 'convmv' will save me a lot of time from now on.

@watersb

commented Jan 27, 2019

NOTE: Apple's new file system, APFS, apparently preserves Unicode normalization: if it gets a filename specified with decomposed Unicode (NFD), it won't change it, but if APFS writes new files, it will use the NFC (composed char) form.

You might not need convmv with APFS.
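
Since a volume can end up holding names in either form, it may help to look before converting anything. The following is a rough sketch using only Python's standard library; classify and survey are hypothetical helper names, and the result reflects whatever names the OS reports for the directory.

```python
import os
import unicodedata
from collections import Counter


def classify(name: str) -> str:
    """Label one filename by the normalization form(s) it already satisfies."""
    is_nfc = unicodedata.normalize("NFC", name) == name
    is_nfd = unicodedata.normalize("NFD", name) == name
    if is_nfc and is_nfd:
        return "both"     # e.g. plain ASCII; nothing to convert
    if is_nfc:
        return "NFC"
    if is_nfd:
        return "NFD"
    return "neither"      # composed and decomposed characters mixed in one name


def survey(root: str) -> Counter:
    """Count file and directory names under `root` by their classification."""
    counts = Counter()
    for _dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            counts[classify(name)] += 1
    return counts
```

If survey(".") reports only "both" and "NFC", there is nothing for an --nfc run to do in that tree.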

https://medium.com/@yorkxin/apfs-docker-unicode-6e9893c9385d

(I used to run ZFS storage arrays on my Mac Pro, and had a script that would set NFD on ZFS volume setup. Note that this was when I was using ZFS as direct-attached storage; for a while it seemed that ZFS was to be the next-gen macOS file system of choice. That blew up when Sun was acquired by Oracle, and Sun was not able to separate intellectual-property claims in order to ensure its ability to license the ZFS codebase. So now we have APFS, and macOS seems to have used the decade-long delay to implement NFC in its VFS layer. YMMV. WWJD. WTF.)

@SHawnHardy

commented Feb 24, 2019

Saved my day. Thanks a lot.

@DanielSmedegaardBuus

commented Feb 25, 2019

Remember, if you send files to a non-Mac with rsync from a Mac, you can use the argument --iconv=utf-8-mac,utf-8 to ensure the files are sent with the proper NFC names to the target; and vice-versa, when fetching from a non-Mac to a Mac via rsync, you can use --iconv=utf-8,utf-8-mac.

Unfortunately, at least for the Ubuntu version of rsync, this argument may not be supported. Really weird. But it is for the native Mac version of rsync, as well as the Homebrew version.
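
One way to check a given machine before relying on --iconv: rsync --version lists the features it was compiled with, and builds without iconv support show "no iconv" in that list. The sketch below assumes that output convention (as seen in rsync 3.x); the helper names are hypothetical.

```python
import re
import subprocess


def supports_iconv(version_text: str) -> bool:
    """Guess from `rsync --version` output whether --iconv is available.

    Assumption: absent features are listed with a "no " prefix, and very old
    builds that predate the option do not mention iconv at all.
    """
    if re.search(r"\bno iconv\b", version_text):
        return False
    return re.search(r"\biconv\b", version_text) is not None


def local_rsync_supports_iconv() -> bool:
    """Run the local rsync and inspect its capability list."""
    try:
        result = subprocess.run(
            ["rsync", "--version"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        return False  # rsync is not installed or not on PATH
    return supports_iconv(result.stdout)
```

Both ends of a transfer need a capable rsync, so the same check would have to be run on the remote side as well.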

@hwdbk

commented May 31, 2019

Also note that on macOS, the iconv command can be used to convert between NFC and NFD:
iconv -f UTF-8 -t UTF-8-MAC (NFC to NFD; swap the two arguments for the other direction)
but many UNIX/Linux implementations of iconv that I've come across do not support the UTF-8-MAC encoding...
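
Where iconv lacks UTF-8-MAC, a small filter can stand in for most purposes. This is a sketch, not a drop-in replacement: it applies standard NFC/NFD via Python's unicodedata, and Apple's UTF-8-MAC variant is not identical to standard NFD for every character. The function name convert_stream is made up for illustration.

```python
import sys
import unicodedata


def convert_stream(src, dst, form: str = "NFC") -> None:
    """Copy text from `src` to `dst` line by line, normalized to `form`.

    `src` and `dst` are text-mode file objects; `form` is "NFC" or "NFD".
    """
    for line in src:
        dst.write(unicodedata.normalize(form, line))


# Example use as a shell filter (uncomment to read stdin and write stdout):
# convert_stream(sys.stdin, sys.stdout, "NFC")
```

With the last line uncommented and the file saved as, say, nfc.py (a hypothetical name), something like `ls | python3 nfc.py` would print the listing with composed characters, assuming the terminal locale is UTF-8.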
