Unicode on Mac is insane. Mac OS X uses NFD while everything else uses NFC. This fixes that.
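
To make the NFC/NFD difference concrete, here is a small Python sketch (Python is used only for illustration; it is not part of the original gist). It shows that the same visible "é" can be stored either as one code point or as two, which is why two file names that look identical may not compare equal.

```python
import unicodedata

composed = "\u00e9"      # "é" as a single code point (NFC form)
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT (NFD form)

# The two strings render identically but are different code point sequences.
print(composed == decomposed)                                # False
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
print(unicodedata.normalize("NFD", composed) == decomposed)  # True

# Encoded as UTF-8 the byte lengths differ as well.
print(len(composed.encode("utf-8")), len(decomposed.encode("utf-8")))  # 2 3
```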

convmv manpage

Install convmv if you don't have it

sudo apt-get install convmv

Convert all file names in a directory (recursively) from NFD to NFC:

convmv -r -f utf8 -t utf8 --nfc --notest .

Convert all file names in a directory (recursively) from NFC to NFD:

convmv -r -f utf8 -t utf8 --nfd --notest .
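
convmv only prints what it would do unless --notest is given, so dropping that flag gives a preview. As an independent cross-check, the following Python sketch lists the entries whose names are not already NFC. The function name find_non_nfc_names is made up for this example, and the sketch assumes a filesystem that returns names exactly as stored.

```python
import os
import unicodedata

def find_non_nfc_names(root):
    """Return paths under root whose final component is not in NFC form."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            # A name is NFC exactly when normalizing it changes nothing.
            if unicodedata.normalize("NFC", name) != name:
                hits.append(os.path.join(dirpath, name))
    return hits
```

Calling find_non_nfc_names(".") returns the paths that the --nfc run above would be expected to touch; nothing on disk is modified.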

@djordn djordn commented Apr 6, 2016

Hey man.
Your post helped me a lot!
I was going crazy with some accent files.
Thanks.

@marteiro marteiro commented Jul 7, 2017

Dude, seriously... Saved my day.....

@LPhat LPhat commented Oct 5, 2017

Thanks so much! Running the --nfc command helped Apache serve up files with special characters on a CentOS machine after the files had been put on an OSX machine.

@diamondsw diamondsw commented Sep 4, 2018

Also very happy to have this, although I really wish I could isolate the exact chunk of code that reads in an NFC/NFD filename, converts it, and renames the file on disk. The code handles so many cases (and handles them well, I might add), that it's hard to strip it back to its basics if, for example, you want to add a single very specific case to a script.

As a side note, the issue isn't technically the OS; it's the underlying filesystem. Linux can be just as boneheaded, just in the opposite direction. Try to write Unicode to an HFS+ volume under Linux, and it merrily ignores the fact that HFS+ uses NFD and writes NFC into the filesystem. Then when you try to use that data on a Mac you get a stream of "File Not Found" errors. Only solution is to either use Linux again to delete the data, or reformat the disk.

(Sadly these decisions were set in stone when HFS+ was designed over 20 years ago and Unicode was still a relatively new thing; the NFC vs NFD thing hadn't been settled yet. Not a clue what APFS does now.)
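
For readers who, like the comment above, want only the core "read a name, normalize it, rename on disk" step, here is a minimal Python sketch. It is not extracted from convmv; rename_tree and its parameters are hypothetical names, and it assumes a filesystem that stores names byte-for-byte (ext4, for instance). On a normalization-insensitive volume the existence check below will see the file itself and skip it.

```python
import os
import unicodedata

def rename_tree(root, form="NFC", dry_run=True):
    """Rename every entry under root whose name is not in the given form.

    Returns a list of (old_path, new_path) pairs. Nothing is renamed
    unless dry_run is False.
    """
    changes = []
    # topdown=False visits children before their parent directory, so a
    # directory is renamed only after everything inside it has been handled.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            new_name = unicodedata.normalize(form, name)
            if new_name == name:
                continue
            old_path = os.path.join(dirpath, name)
            new_path = os.path.join(dirpath, new_name)
            if os.path.lexists(new_path):
                # Both forms already exist side by side: skip, never overwrite.
                continue
            changes.append((old_path, new_path))
            if not dry_run:
                os.rename(old_path, new_path)
    return changes
```

Running rename_tree(".") first and inspecting the returned pairs mirrors convmv's test mode; rename_tree(".", dry_run=False) performs the renames.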

@LazyRen LazyRen commented Oct 11, 2018

Thanks a lot. 'convmv' will save me a lot of time from now on.

@watersb watersb commented Jan 27, 2019

NOTE: Apple's new file system, APFS, apparently preserves Unicode normalization: if it gets a filename specified with decomposed Unicode (NFD), it won't change it, but if APFS writes new files, it will use the NFC (composed char) form.

You might not need convmv with APFS.

https://medium.com/@yorkxin/apfs-docker-unicode-6e9893c9385d

(I used to run ZFS storage arrays on my Mac Pro, and had a script that would set NFD on ZFS volume setup. Note that this was when I was using ZFS as direct-attached storage; for a while it seemed that ZFS was to be the next-gen macOS file system of choice. That blew up when Sun was acquired by Oracle, and Sun was not able to separate intellectual-property claims in order to ensure its ability to license the ZFS codebase. So now we have APFS, and macOS seems to have used the decade-long delay to implement NFC in its VFS layer. YMMV. WWJD. WTF.)

@SHawnHardy SHawnHardy commented Feb 24, 2019

Saved my day. Thanks a lot.

@DanielSmedegaardBuus DanielSmedegaardBuus commented Feb 25, 2019

Remember, if you send files to a non-Mac with rsync from a Mac, you can use the argument --iconv=utf-8-mac,utf-8 to ensure the files are sent with the proper NFC names to the target; and vice-versa, when fetching from a non-Mac to a Mac via rsync, you can use --iconv=utf-8,utf-8-mac.

Unfortunately, at least for the Ubuntu version of rsync, this argument may not be supported. Really weird. But it is for the native Mac version of rsync, as well as the Homebrew version.

@hwdbk hwdbk commented May 31, 2019

Also note that on macOS, the iconv command can be used to convert between NFD and NFC:
iconv -f UTF-8 -t UTF-8-MAC (or vice versa of course)
but many UNIX/Linux systems that I've come across have the iconv command without support for the UTF-8-MAC encoding...

@mackyle mackyle commented Oct 24, 2019

The UTF-8-MAC support was added to Cupertino’s version of iconv -- that’s why it’s not available on other systems.

They have also, apparently, removed their documentation of the HFS+ file name encodings. But, thanks to the wayback machine, you can see it here:

File Systems and Unicode Support

It states:

Mac OS Extended (HFS+) uses canonically decomposed Unicode 3.2 [...]
characters in the ranges U2000-U2FFF, UF900-UFA6A, and U2F800-U2FA1D are not decomposed

And that last little bit is how UTF-8-MAC differs from Unicode 3.2’s NFD.
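
As a rough illustration of that difference, the Python sketch below decomposes text with Python's Unicode 3.2 tables (unicodedata.ucd_3_2_0) but leaves characters in the quoted ranges untouched. The ranges are copied from the passage above; the function names are invented here, the treatment of combining marks at run boundaries is simplified, and none of this has been checked against Apple's actual implementation.

```python
import unicodedata
from itertools import groupby

# Ranges quoted above as "not decomposed" on HFS+ (inclusive bounds).
EXCLUDED_RANGES = ((0x2000, 0x2FFF), (0xF900, 0xFA6A), (0x2F800, 0x2FA1D))

def is_excluded(ch):
    """True if the character falls inside one of the excluded ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in EXCLUDED_RANGES)

def mac_style_decompose(text):
    """Approximate UTF-8-MAC: NFD everywhere except the excluded ranges."""
    parts = []
    # Split the text into runs of excluded and non-excluded characters.
    for excluded, run in groupby(text, key=is_excluded):
        chunk = "".join(run)
        if excluded:
            parts.append(chunk)  # kept exactly as written
        else:
            parts.append(unicodedata.ucd_3_2_0.normalize("NFD", chunk))
    return "".join(parts)
```

For example, U+212B ANGSTROM SIGN lies in the first excluded range, so mac_style_decompose leaves it alone, whereas plain NFD turns it into "A" plus a combining ring.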

@hwdbk hwdbk commented May 19, 2020

I've created a repository with a pair of bijective scripts that convert to and from NFD and do not rely on iconv:
https://github.com/hwdbk/synology-scripts/tree/master/mac-nfd-conversion
The scripts run on Mac OS X and other Unixes (they use only bash and sed). I use them on a Synology NAS, hence the names syn2mac and mac2syn, but what's in a name?
The repository also contains the script that generates these scripts, if you want to play with it.

@jeiksegovia jeiksegovia commented Jun 4, 2020

Very simple and nice

@rico rico commented Oct 27, 2020

... another day saved - thanks so much!

@fguern fguern commented Dec 20, 2020

Hello everyone.

I am on a Mac and I can't make the script work.

After a cd to the directory whose names I want to convert, I paste the script path and press Enter, but nothing happens.
Can you help me make it work?
Thank you very much in advance.

Best

@hwdbk hwdbk commented Dec 21, 2020

Hi fguern, if you're referring to mac2syn or syn2mac, these scripts read from a file or stdin and write to stdout.
So, suppose you have a file with NFD UTF text called my_nfd_utf.txt (for instance); you type
mac2syn my_nfd_utf.txt
or
mac2syn < my_nfd_utf.txt
or
some-other-program-producing-the-text | mac2syn
Make sure the script is executable (chmod 750 mac2syn).

If you have a string that needs translating, the syntax is
echo $(mac2syn <<< "string_or_variable_with_nfd_utf_text")

Cheers, Henk
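
The filter behaviour described above (text in, text out) can be sketched in a few lines of Python. This is not the mac2syn script, which is written in bash and sed; nfd_to_nfc_stream is a hypothetical name, and it applies full NFC rather than whatever mapping table mac2syn uses. The point it illustrates is that such a filter converts text it is given, not the names of files on disk.

```python
import sys
import unicodedata

def nfd_to_nfc_stream(src, dst):
    """Copy text from src to dst line by line, converting each line to NFC."""
    for line in src:
        dst.write(unicodedata.normalize("NFC", line))

# As a command-line filter one would call:
#     nfd_to_nfc_stream(sys.stdin, sys.stdout)
```

To rename a file with such a filter, the file's name (not its contents) has to be fed through it and the result handed to a rename command.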

@fguern fguern commented Dec 23, 2020

Hello Henk,

Thanks for your help. It's still not clear to me.

On a Mac, does this mean typing that in the Terminal?
My goal, however, is to convert an entire folder, with its subfolders and files, to the Synology-compatible Unicode form.

I tried this in the Terminal:
Francoiss-MacBook-Air:~ francois$ mac2syn /Users/francois/Documents/01.\ Documents/2008_03_26\ -\ A\ voir\ à\ paris.rtf
-bash: mac2syn: command not found

I guess it didn't work :D.

And the mac2syn script is readable and writable by everyone.

Cheers, François

@hwdbk hwdbk commented Dec 23, 2020

@fguern fguern commented Dec 23, 2020

Hey Henk,

I tried your command after installing Homebrew and convmv (see: http://macappstore.org/convmv/), because I understood that the script is based on these two packages, right?

When I launch the script with your command, even after adding 'sightseeing paris utf.rtf', it doesn't do anything: the file is still not synchronized with my Synology.

francois@Francoiss-MacBook-Air ~ % cd /Users/francois/Downloads/synology-scripts-master/mac-nfd-conversion
francois@Francoiss-MacBook-Air mac-nfd-conversion % ./mac2syn /Users/francois/Documents/01.\ Documents/2008_03_26\ -\ A\ voir\ à\ paris.rtf 'sightseeing paris utf.rtf'
{\rtf1\ansi\ansicpg1252\cocoartf1265
\cocoascreenfonts1{\fonttbl\f0\fswiss\fcharset0 Helvetica;}
{\colortbl;\red255\green255\blue255;}
\paperw11900\paperh16840\margl1440\margr1440\vieww9000\viewh8400\viewkind0
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural

\f0\fs24 \cf0 * Caf'e9 Branly + mus'e9e
m'e9tro alma marceau
rer, pont de l'alma
\

  • Caf'e9 des deux moulin _ Am'e9lie Poulain
    15 rue Lepic _ Montmartre
    -> fait
    \
  • Mus'e9e Gr'e9vin, 10 boulevard Montmartre
    m'e9tro gd boulevard}cat: sightseeing paris utf.rtf: No such file or directory

Once it works with a file, I'll try your command on a folder.

Thank you Henk!!
+

@hwdbk hwdbk commented Dec 24, 2020

@fguern fguern commented Dec 24, 2020

I did a test, and only the file name is the problem: if I create a copy and rename the file to "2008_03_26 - A voir a paris copy.rtf" instead of "2008_03_26 - A voir à paris.rtf", the file is synced with the Synology.

My goal is to rename all the files in a folder from NFD to NFC and, in the future, to be sure all files with accented names sync with the Synology.
And I thought the mac2syn script was meant to convert the file name from NFD to NFC.

Was I wrong and did I miss something?

Thanks a lot for your help and time Henk.
Cheers, François

@hwdbk hwdbk commented Dec 24, 2020

@hwdbk hwdbk commented Dec 24, 2020

@fguern fguern commented Dec 26, 2020

Hello Henk,

I wish you a merry Christmas !

Thanks for the command. I tried it. At first the command ran, but no file names were changed and no sync with the Synology happened. Then it displayed a dquote> prompt.
Terminal stuff is definitely not for me.

To summarize :

  • The mac2syn script is in the Mac Downloads folder: /Users/francois/Downloads/synology-scripts-master/mac-nfd-conversion/mac2syn
  • The folder whose contents (including all subfolders) should go from NFD to NFC is /Users/francois/Documents/01.\ Documents/25.\ Test/
  • I created a "01. Test" subfolder in the "25. Test" folder to simulate subfolders (I'll need this later, as I will apply the command to all of 01.\ Documents).
  • The file "2008_03_26 - A voir à paris.rtf" is in the "01. Test" subfolder.
  • I cd to the mac2syn folder (cd /Users/francois/Downloads/synology-scripts-master/mac-nfd-conversion/mac2syn)
  • I enter your command without the */rtf to deal with all files : for f in /Users/francois/Documents/01.\ Documents/25.\ Test/ ; do mv -v -n ”$f" "$(dirname "$f")/$(./mac2syn <<< "$(basename "$f")")" ; done

Here is the result :
francois@Francoiss-MacBook-Air mac-nfd-conversion % for f in /Users/francois/Documents/01.\ Documents/25.\ Test/ ; do mv -v -n ”$f" "$(dirname "$f")/$(./mac2syn <<< "$(basename "$f")")" ; done
for dquote> for f in /Users/francois/Documents/01.\ Documents/25.\ Test/ ; do mv -v -n ”$f" "$(dirname "$f")/$(./mac2syn <<< "$(basename "$f")")" ; done
mv: rename ”/Users/francois/Documents/01. Documents/25. Test/ /Users/francois/Documents/01. to /Users/francois/Documents/01. Documents/25. Test/01.: No such file or directory
mv: rename Documents/25. to /Users/francois/Documents/01. Documents/25. Test/25.: No such file or directory
mv: rename Test ; done
for f in /Users/francois/Documents/01.\ Documents/25.\ Test/ ; do mv -v -n ”/Users/francois/Documents/01. Documents/25. Test/ to /Users/francois/Documents/01. Documents/25. Test/25. Test/: No such file or directory

Are mac2syn and a rename command the right tools for what I need?

Thank you Henk,
François

@hwdbk hwdbk commented Dec 26, 2020

@fguern fguern commented Dec 26, 2020

Got it.
I changed the double quotes, and this time I get an invalid argument:
francois@Francoiss-MacBook-Air mac-nfd-conversion % for f in /Users/francois/Documents/01.\ Documents/25.\ Test/ ; do mv -v -n "$f" "$(dirname "$f")/$(./mac2syn <<< "$(basename "$f")")" ; done
mv: rename /Users/francois/Documents/01. Documents/25. Test/ to /Users/francois/Documents/01. Documents/25. Test/25. Test/: Invalid argument

The command seems to repeat the last folder. Is it the Dirname+basename command?
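
A likely reading of that error, sketched in Python rather than shell: the loop is handed the directory itself (the path ends in a slash and contains no wildcard), so splitting it into parent and last component and joining them again yields the very same directory. The path below is modelled on the one in the command above, and pathlib stands in for the shell's dirname and basename, which also ignore a trailing slash.

```python
from pathlib import PurePosixPath

# Path modelled on the one used in the loop above.
target = PurePosixPath("/Users/francois/Documents/01. Documents/25. Test/")

# A trailing slash is ignored, so the last component is the directory itself.
print(target.name)     # 25. Test
print(target.parent)   # /Users/francois/Documents/01. Documents

# Recombining parent and name gives back the same directory, so a move
# from "target" to "parent/name" asks to move the directory onto itself.
print(target.parent / target.name == target)   # True
```

Looping over the directory's contents instead, for example with a wildcard after the final slash, gives each file its own dirname and basename.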

However, I found another command that copies an entire folder and converts the names from NFD, without a script: rsync -a --iconv=utf-8-mac,utf-8 /Users/francois/Documents/01.\ Documents/25.\ Test/ /Users/francois/Documents/01.\ Documents/26.\ Test\ 2/
This one works. Even if it duplicates the files, I think it's an acceptable workaround. What do you think?

Thank you,
François

@hwdbk hwdbk commented Dec 27, 2020

@fguern fguern commented Dec 27, 2020

Hello Henk.

At this stage I see three solutions:

/////1 - Your script

-> This time, it's "not overwritten":
francois@Francoiss-MacBook-Air mac-nfd-conversion % for f in /Users/francois/Documents/01.\ Documents/25.\ Test/*.rtf ; do mv -v -n "$f" "$(dirname "$f")/$(./mac2syn <<< "$(basename "$f")")" ; done
/Users/francois/Documents/01. Documents/25. Test/2008_03_26 - A voir à paris.rtf not overwritten
Is it possible to rename an entire folder plus its subfolders with your script?

//// 2 - James CONVMV command
-> I also tried the command above "convmv -r -f utf8 -t utf8 --nfc --notest" and I got a "wrong/unknown encoding" :
francois@Francoiss-MacBook-Air 25. Test % convmv -r -f enc -t enc utf8 --nfc --notest
wrong/unknown "from" encoding!

///// 3 - Rsync local copy with NFC
-> No apparent problem, except that it copies 10 GB
rsync -a --iconv=utf-8-mac,utf-8 /Users/francois/Documents/01.\ Documents/02.\ Administratif /Users/francois/Documents/01.\ Documents/02.\ Administratif\ nfc

What's your expert advice? Is it worth trying to make the script or the convmv command work?
Thanks

@hwdbk hwdbk commented Dec 28, 2020

@hwdbk hwdbk commented Dec 28, 2020

@jsvini jsvini commented Mar 18, 2021

God bless you! 🙌

@jcarnat jcarnat commented Apr 14, 2021

Great. Thanks a lot!
