@dogancelik
Last active December 29, 2023 02:38
ANSI / UTF-8 (with or without BOM) conversion #Windows

Using Uni2Me

  • It's free but discontinued.

Using UTFCast

  • Proprietary software
  • Allows conversion from ANSI to UTF-8 with or without BOM
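
For reference, the three formats this page keeps mentioning ("ANSI", UTF-8 without BOM, UTF-8 with BOM) differ only in the bytes written to disk. The sketch below is not how UTFCast works internally; it assumes "ANSI" means code page 1252, and the sample string is made up for illustration.

```python
import codecs

# Hypothetical sample text; "é" is a single byte in CP1252 but two bytes in UTF-8.
text = "café"

ansi_bytes = text.encode("cp1252")         # Windows "ANSI" (assumed to be CP1252)
utf8_bytes = text.encode("utf-8")          # UTF-8 without BOM
utf8_bom_bytes = text.encode("utf-8-sig")  # UTF-8 with BOM

print(ansi_bytes)      # b'caf\xe9'
print(utf8_bytes)      # b'caf\xc3\xa9'
print(utf8_bom_bytes)  # b'\xef\xbb\xbfcaf\xc3\xa9'

# The BOM is just the three bytes EF BB BF placed in front of the UTF-8 data.
assert utf8_bom_bytes == codecs.BOM_UTF8 + utf8_bytes
```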

Using Notepad++

Using Python Script Plugin

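from glob import glob
from Npp import notepad

# Raw string: backslashes stay literal, so paths containing sequences like \t or \n are not mangled
globPath = r"C:\MyFiles\*.txt"

for file in glob(globPath):
  notepad.open(file)
  # The menu and item names must match the Notepad++ interface language
  notepad.runMenuCommand("Encoding", "Convert to UTF-8-BOM")
  notepad.save()
  notepad.close()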

Using Macros

  1. Start Macro recording
  2. Select Encoding > Convert to UTF-8-BOM
  3. Select all text and copy it (this works around a bug: otherwise the macro replaces the file contents with whatever is on the clipboard)
  4. Save the file and close it

Using Bash

Add BOM to an already encoded UTF-8 file

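# Write the BOM followed by the original contents to a new file (redirecting only the BOM into the file would overwrite its contents)
{ printf '\xEF\xBB\xBF'; cat utf8-no-bom.txt; } > utf8-with-bom.txt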
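
The same idea as a minimal Python sketch, for cases where the BOM should be added in place and only when it is missing. The function name add_bom is hypothetical, and the sketch assumes the file is small enough to read into memory.

```python
import codecs
from pathlib import Path

def add_bom(path):
    """Prepend the UTF-8 BOM to the file at `path` unless it already starts with one.

    Returns True if the file was changed.
    """
    data = Path(path).read_bytes()
    if data.startswith(codecs.BOM_UTF8):
        return False  # already has a BOM; leave the file untouched
    Path(path).write_bytes(codecs.BOM_UTF8 + data)
    return True
```

Calling add_bom("utf8-no-bom.txt") a second time is a no-op, so running it repeatedly does not stack several BOMs at the start of the file.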

Batch conversion using find and iconv

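# Find all .txt files below the current directory and convert them from CP1252 (the Western European "ANSI" code page) to UTF-8.
# Each file goes through a temporary file, because redirecting onto the input file would truncate it before iconv reads it.
find . -name '*.txt' -exec sh -c 'iconv -f CP1252 -t UTF-8 "$1" > "$1.utf8" && mv "$1.utf8" "$1"' _ {} \;

# List all Windows character sets known to iconv
iconv -l | grep -i windows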

Batch conversion using a for loop and iconv

for i in *.txt; do
  iconv -f WINDOWS-1252 -t UTF-8 "$i" -o "$i.utf8" &&
  mv "$i.utf8" "$i"
done

Replace in …; do with in "$@"; do to turn this into a reusable Bash script (e.g. convert.sh myfile.txt myfile2.txt).
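
Where iconv is not available, the loop above can be approximated in Python. This is a sketch under assumptions, not a drop-in replacement: the function name convert_file and the script name convert.py are hypothetical, the source encoding defaults to CP1252, and each file is read fully into memory. Like the shell version, it writes to a temporary file first and only then replaces the original.

```python
import os
import sys
import tempfile

def convert_file(path, src_encoding="cp1252", dst_encoding="utf-8"):
    """Re-encode `path` in place, going through a temporary file.

    The original is replaced only after the converted copy has been written in full.
    """
    # newline="" keeps line endings (e.g. CRLF) exactly as they are in the source.
    with open(path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()
    # Create the temporary file next to the original so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(os.path.abspath(path)))
    try:
        with os.fdopen(fd, "w", encoding=dst_encoding, newline="") as dst:
            dst.write(text)
        os.replace(tmp_path, path)
    except BaseException:
        os.remove(tmp_path)
        raise

if __name__ == "__main__":
    # Usage (hypothetical script name): python convert.py myfile.txt myfile2.txt
    for name in sys.argv[1:]:
        convert_file(name)
```

Passing dst_encoding="utf-8-sig" writes UTF-8 with a BOM instead. Note that the replaced file gets the permissions of the temporary file, so this sketch does not preserve the original file mode.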

Using Batch

Add BOM to all text files using nkf

for %a in (*.txt) do nkf32 -W8 -w8 --overwrite "%a"

Download binary for Windows / Source code

Note: Replace -w8 with -w80 to write UTF-8 without a BOM (i.e. to remove it).
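
If nkf is not at hand, removing the BOM can also be sketched in a few lines of Python. The function name strip_bom is hypothetical, and the sketch assumes the file fits in memory; it works on raw bytes, so it does not check that the rest of the file is valid UTF-8.

```python
import codecs
from pathlib import Path

def strip_bom(path):
    """Remove a leading UTF-8 BOM from the file at `path`, if present.

    Returns True if the file was changed.
    """
    data = Path(path).read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        return False  # no BOM found; leave the file untouched
    Path(path).write_bytes(data[len(codecs.BOM_UTF8):])
    return True
```

An alternative is to decode with Python's "utf-8-sig" codec, which accepts input with or without a BOM, and re-encode as "utf-8"; that route also validates the contents.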

Batch conversion using for and iconv

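rem Convert via a temporary file; redirecting onto the input file would truncate it before iconv reads it
for %a in (*.txt) do (iconv -f CP1252 -t UTF-8 "%a" > "%a.utf8" && move /y "%a.utf8" "%a")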

Trivial methods

@dogancelik
Author

@bulli-03 Your question is more about the for command than about the conversion itself. Run for /? for more examples of how the loop works.

rem Example: you are in 'C:\Files\' and you run this command:
mkdir new
for %a in (*.csv) do iconv -f CP1252 -t UTF-8 "%~a" > "new\%~nxa"
rem C:\Files\Test.csv ➡ C:\Files\new\Test.csv

@Pooja5757

iconv -f CP1252 -t UTF-8 "%~a" > "new\%~nxa"

Thank you, this worked so well. But when I give the same filename after conversion (I don't want another folder or another file), the file ends up empty/corrupted. Any idea how I can keep the same file after conversion (i.e. overwrite the original)?

@Basti-Fantasti

Basti-Fantasti commented Dec 14, 2021

Thanks for sharing the method to change the file encoding using Python in NP++ 👍

I had to make some adjustments to the script to get it to work.
First of all I had to change the globPath to be set like this:

globPath = "C:\\mydir\\*.txt"

And the line notepad.runMenuCommand needs to be adjusted to the NP++ language in use.
So I had to change it on my German NP++ setup from:

notepad.runMenuCommand("Encoding", "Convert to UTF-8")

to

notepad.runMenuCommand("Kodierung", "Konvertiere zu UTF-8")

@c-sanchez

Now there is a much simpler, free and open source tool, thanks to @tomwillow :)
SmartCharsetConverter
