@smccutchen
Created January 4, 2017 04:59
Convert multiple files to UTF-8 encoding with Notepad++
# 2016-2017 Soverance Studios.
# Scott McCutchen
# This script walks every file and folder under a given directory and uses Notepad++ to convert matching files to UTF-8 without a byte order mark (BOM)
#
# This script must be run with the PythonScript plugin from within Notepad++; the plugin is available through the Notepad++ Plugin Manager
#
# You must have Python 2.7 installed
#
# Additionally, this script must be saved in and run from the PythonScript scripts folder of the Notepad++ user directory, which defaults to:
# .. USER DIRECTORY\AppData\Roaming\Notepad++\plugins\Config\PythonScript\scripts
# Note that selecting "New Script" from within the PythonScript plugin automatically defaults to this save location
import os
from Npp import notepad

filePathSrc = "U:\\UnrealEngine\\Ethereal\\Source\\Ethereal"  # Path to the folder with files to convert

for root, dirs, files in os.walk(filePathSrc):
    for fn in files:
        # Specify the file types to convert; each extension includes the leading dot
        if fn.endswith(('.h', '.cpp')):
            notepad.open(os.path.join(root, fn))
            notepad.runMenuCommand("Encoding", "Encode in ANSI")
            notepad.runMenuCommand("Encoding", "Convert to UTF-8")
            notepad.save()
            notepad.close()
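
For comparison, the same kind of conversion can be sketched in plain Python, outside Notepad++. This is only an illustration and not the gist author's method: the helper name `convert_to_utf8` is hypothetical, and the default source encoding `cp1252` is an assumption standing in for the Windows "ANSI" code page, which differs between systems.

```python
import codecs
import io


def convert_to_utf8(path, source_encoding='cp1252'):
    """Rewrite the file at `path` in place as UTF-8 without a BOM.

    `source_encoding` is an assumption about the legacy encoding;
    adjust it to the code page the files were actually saved in.
    """
    with io.open(path, 'rb') as f:
        raw = f.read()

    if raw.startswith(codecs.BOM_UTF8):
        # UTF-8 with a BOM: drop the BOM and keep the rest.
        text = raw[len(codecs.BOM_UTF8):].decode('utf-8')
    else:
        try:
            # Already valid UTF-8 (this includes pure ASCII): keep as-is.
            text = raw.decode('utf-8')
        except UnicodeDecodeError:
            # Fall back to the assumed legacy code page. Bytes that are
            # undefined in that code page raise UnicodeDecodeError here.
            text = raw.decode(source_encoding)

    with io.open(path, 'wb') as f:
        f.write(text.encode('utf-8'))
```

The function overwrites the file, so it is worth trying on a copy of the folder first.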
@LulaSvob
LulaSvob commented Sep 20, 2018

Here is an improved version that prints each action to the console and also handles non-Latin characters (Cyrillic, Japanese, Chinese, etc.) in the path.

# -*- coding: utf-8 -*-
import os
from Npp import notepad, console

filePathSrc = "D:\\Файлове\\" # Path to the folder with files to convert

# Decode the UTF-8 byte string to Unicode so os.chdir can resolve the non-Latin path (Python 2)
filePathSrc = filePathSrc.decode('utf-8')
os.chdir(filePathSrc)
for root, dirs, files in os.walk(".", topdown=False):
    for fn in files:
        if fn.endswith('.txt'):
            notepad.open(os.path.join(root, fn))
            notepad.runMenuCommand("Encoding", "Convert to UTF-8")
            notepad.save()
            console.write('File ' + fn + ' saved. Closing ... \n')
            notepad.close()

@Joppest
Joppest commented Nov 15, 2019

Thanks! Had ~65k small text files where every 1000 or so there was a 0x90 character messing up my processing!

@TurtleShroom
Does this script work on folder trees, with sub-folders?

@SUDALV92
Does this script work on folder trees, with sub-folders?

No, I'm also searching for a solution with subfolders.

@LulaSvob
Does this script work on folder trees, with sub-folders?

No, I'm also searching for a solution with subfolders.

Did you test it? os.walk traverses the directory tree recursively, visiting every subfolder under the starting path, so the script should work with subfolders.
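
For anyone who wants to check this without touching real files, here is a small sketch that builds a throwaway folder tree and lists what `os.walk` visits. The helper name `walk_txt_files` and the folder and file names are made up for illustration, and the snippet runs in plain Python rather than inside Notepad++.

```python
import os
import shutil
import tempfile


def walk_txt_files(top):
    """Return the .txt files found under `top`, as paths relative to it."""
    found = []
    # Same loop shape as the scripts above; topdown=False only changes
    # the visiting order, not which folders are visited.
    for root, dirs, files in os.walk(top, topdown=False):
        for fn in files:
            if fn.endswith('.txt'):
                found.append(os.path.relpath(os.path.join(root, fn), top))
    return sorted(found)


top = tempfile.mkdtemp()
try:
    # Build top/a.txt, top/sub/b.txt and top/sub/deeper/c.txt.
    nested = os.path.join(top, 'sub', 'deeper')
    os.makedirs(nested)
    for path in (os.path.join(top, 'a.txt'),
                 os.path.join(top, 'sub', 'b.txt'),
                 os.path.join(nested, 'c.txt')):
        open(path, 'w').close()

    visited = walk_txt_files(top)
    print(visited)
finally:
    # Remove the temporary tree whether or not the walk succeeded.
    shutil.rmtree(top)
```

All three files should appear in the printed list, including the one two levels down.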
