Unicode on Mac is insane. Mac OS X uses NFD while everything else uses NFC. This fixes that.
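
For readers unfamiliar with the two forms, here is a minimal Python sketch of the difference, using only the standard-library `unicodedata` module. The sample character is an illustration, not something from the gist: the same visible "é" is one code point in NFC and two in NFD, so the two filenames compare unequal byte for byte.

```python
import unicodedata

# The same visible character "é" in both normalization forms.
composed = "\u00e9"                                  # NFC: one precomposed code point
decomposed = unicodedata.normalize("NFD", composed)  # NFD: base letter + combining accent

print([hex(ord(c)) for c in composed])
print([hex(ord(c)) for c in decomposed])

# The strings look identical on screen but are different code point sequences,
# so a byte-exact filesystem treats them as two different filenames.
print(composed == decomposed)
print(composed.encode("utf-8"), decomposed.encode("utf-8"))
```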

convmv manpage

Install convmv if you don't have it (Debian/Ubuntu shown here):

sudo apt-get install convmv

Convert all files in a directory from NFD to NFC (without --notest, convmv only prints what it would rename, so drop that flag first for a dry run):

convmv -r -f utf8 -t utf8 --nfc --notest .

Convert all files in a directory from NFC to NFD:

convmv -r -f utf8 -t utf8 --nfd --notest .


djordn commented Apr 6, 2016

Hey man.
Your post helped me a lot!
I was going crazy with some accent files.


marteiro commented Jul 7, 2017

Dude, seriously... Saved my day.....


LPhat commented Oct 5, 2017

Thanks so much! Running the --nfc command helped Apache serve up files with special characters on a CentOS machine after the files had been put on an OSX machine.


diamondsw commented Sep 4, 2018

Also very happy to have this, although I really wish I could isolate the exact chunk of code that reads in an NFC/NFD filename, converts it, and renames the file on disk. The code handles so many cases (and handles them well, I might add), that it's hard to strip it back to its basics if, for example, you want to add a single very specific case to a script.

As a side note, the issue isn't technically the OS; it's the underlying filesystem. Linux can be just as boneheaded, just in the opposite direction. Try to write Unicode to an HFS+ volume under Linux, and it merrily ignores the fact that HFS+ uses NFD and writes NFC into the filesystem. Then when you try to use that data on a Mac you get a stream of "File Not Found" errors. Only solution is to either use Linux again to delete the data, or reformat the disk.

(Sadly these decisions were set in stone when HFS+ was designed over 20 years ago and Unicode was still a relatively new thing; the NFC vs NFD thing hadn't been settled yet. Not a clue what APFS does now.)
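
The "read a name, convert it, rename on disk" step asked about above can be sketched in a few lines of Python. This is not convmv's code and handles none of its special cases; the function name `rename_to_form` and its parameters are made up for illustration. It assumes a byte-exact filesystem (typical on Linux). On a normalization-insensitive volume the existence check would see the source file itself and skip everything.

```python
import os
import unicodedata


def rename_to_form(root, form="NFC", dry_run=True):
    """Rename entries under root whose names are not in the given form.

    Returns a list of (old_path, new_path) pairs. Hypothetical helper,
    not part of convmv.
    """
    changes = []
    # Walk bottom-up so children are renamed before their parent directory,
    # which keeps the paths yielded by os.walk valid.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            target = unicodedata.normalize(form, name)
            if target == name:
                continue  # already in the requested form
            src = os.path.join(dirpath, name)
            dst = os.path.join(dirpath, target)
            if os.path.lexists(dst):
                continue  # never overwrite an existing entry
            changes.append((src, dst))
            if not dry_run:
                os.rename(src, dst)
    return changes
```

Calling it with the default `dry_run=True` only reports the planned renames, much like convmv without --notest; pass `dry_run=False` to actually rename.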


LazyRen commented Oct 11, 2018

Thanks a lot. 'convmv' will save me a lot of time from now on.


watersb commented Jan 27, 2019

NOTE: Apple's new file system, APFS, apparently preserves Unicode normalization: if it gets a filename specified with decomposed Unicode (NFD), it won't change it, but if APFS writes new files, it will use the NFC (composed char) form.

You might not need convmv with APFS.

(I used to run ZFS storage arrays on my Mac Pro, and had a script that would set NFD on ZFS volume setup. Note that this was when I was using ZFS as direct-attached storage; for a while it seemed that ZFS was to be the next-gen macOS file system of choice. That blew up when Sun was acquired by Oracle, and Sun was not able to separate intellectual-property claims in order to ensure its ability to license the ZFS codebase. So now we have APFS, and macOS seems to have used the decade-long delay to implement NFC in its VFS layer. YMMV. WWJD. WTF.)
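
Since behaviour evidently varies by filesystem, one way to find out what a given volume actually hands back is to classify the names it returns. The following is a rough Python sketch, assuming Python 3.8+ for `unicodedata.is_normalized`; the helper names `classify` and `tally` are hypothetical.

```python
import os
import unicodedata
from collections import Counter


def classify(name):
    """Label a filename by the normalization form(s) it already satisfies."""
    nfc = unicodedata.is_normalized("NFC", name)
    nfd = unicodedata.is_normalized("NFD", name)
    if nfc and nfd:
        return "both"   # e.g. plain ASCII, identical in either form
    if nfc:
        return "NFC"
    if nfd:
        return "NFD"
    return "mixed"      # contains composed and decomposed sequences


def tally(directory):
    """Count the filenames in one directory by normalization form."""
    return Counter(classify(name) for name in os.listdir(directory))
```

A directory that reports only "both" and "NFC" has nothing for the --nfc run to rename.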


SHawnHardy commented Feb 24, 2019

Saved my day. Thanks a lot.


DanielSmedegaardBuus commented Feb 25, 2019

Remember, if you send files to a non-Mac with rsync from a Mac, you can use the argument --iconv=utf-8-mac,utf-8 to ensure the files are sent with the proper NFC names to the target; and vice-versa, when fetching from a non-Mac to a Mac via rsync, you can use --iconv=utf-8,utf-8-mac.

Unfortunately, at least for the Ubuntu version of rsync, this argument may not be supported. Really weird. But it is for the native Mac version of rsync, as well as the Homebrew version.


hwdbk commented May 31, 2019

Also note that on macOS, the iconv command can be used to convert between NFC and NFD:
iconv -f UTF-8 -t UTF-8-MAC (or vice versa, of course)
Many UNIX/Linux systems I've come across have the iconv command but do not support the UTF-8-MAC encoding...


mackyle commented Oct 24, 2019

The UTF-8-MAC support was added to Cupertino’s version of iconv -- that’s why it’s not available on other systems.

They have also, apparently, removed their documentation of the HFS+ file name encodings. But, thanks to the wayback machine, you can see it here:

File Systems and Unicode Support

It states:

Mac OS Extended (HFS+) uses canonically decomposed Unicode 3.2 [...]
characters in the ranges U2000-U2FFF, UF900-UFA6A, and U2F800-U2FA1D are not decomposed

And that last little bit is how UTF-8-MAC differs from Unicode 3.2’s NFD.
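
To make that difference concrete, here is a rough Python approximation of "NFD except for the excluded ranges". It is a sketch built only from the quoted text, not Apple's implementation. The function name `hfs_style_nfd` is made up, the ranges are copied from the quote above, and the sketch ignores canonical reordering across an excluded character. Python's `unicodedata.ucd_3_2_0` object provides the Unicode 3.2 tables.

```python
import unicodedata
from itertools import groupby

# Unicode 3.2 database, matching the version named in the quoted documentation.
ucd32 = unicodedata.ucd_3_2_0

# Ranges the quoted text says are NOT decomposed (inclusive bounds).
EXCLUDED_RANGES = ((0x2000, 0x2FFF), (0xF900, 0xFA6A), (0x2F800, 0x2FA1D))


def is_excluded(ch):
    """True if the character falls in one of the non-decomposed ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in EXCLUDED_RANGES)


def hfs_style_nfd(text):
    """Approximate the HFS+ variant: NFD everywhere except excluded ranges."""
    parts = []
    # Split the text into alternating runs of excluded / ordinary characters.
    for excluded, group in groupby(text, key=is_excluded):
        chunk = "".join(group)
        parts.append(chunk if excluded else ucd32.normalize("NFD", chunk))
    return "".join(parts)
```

For example, plain NFD turns U+FA10 into U+585A, while this variant leaves U+FA10 alone and still decomposes "é" into "e" plus a combining accent.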
