Bash script for recursive file conversion windows-1251 --> utf-8
#!/bin/bash
# Recursive file conversion windows-1251 --> utf-8
# Place this file in the root of your site, add execute permission and run
# Converts *.php, *.html, *.css, *.js files.
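# To add a file type by extension, e.g. *.cgi, add '-o -name "*.cgi"' inside the parentheses of the find command
# The parentheses make -type f apply to every pattern, not only the last one
find ./ \( -name "*.php" -o -name "*.html" -o -name "*.css" -o -name "*.js" \) -type f |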
while read file
do
echo " $file"
mv $file $file.icv
iconv -f WINDOWS-1251 -t UTF-8 $file.icv > $file
rm -f $file.icv
done

Batname commented Apr 15, 2014

Thanks

FernandoBasso commented May 7, 2015

Great script. Works perfectly, even on Cygwin (which I have to use at work). Thanks a lot.

vkdimitrov commented Aug 4, 2015

Saved my day.

ranold commented Apr 7, 2016

thank you!

nymo commented Jan 20, 2017

Thanks! Really good script.

obojdi commented Feb 17, 2017

Confirmed working on cygwin, many thanks @akost!

anonymous2ch commented Feb 27, 2017

That script is bad, since iconv doesn't detect whether a file is already UTF-8. It will ruin your files if you run it on a directory with files in mixed encodings. Running iconv more than once on the same file is guaranteed to mangle it too.

What you should actually use for this operation is enca, since it detects the input encoding and converts accordingly.

After installing enca, just run this one-liner and your files will be UTF-8 in no time:
find ./ \( -name "*.php" -o -name "*.html" -o -name "*.css" -o -name "*.js" \) -type f | while IFS= read -r file; do enca -x UTF-8 "$file"; done
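
If enca is not available, one rough way to address the already-UTF-8 concern is to ask iconv itself whether a file decodes as UTF-8 before converting it. The sketch below is an illustration under stated assumptions, not part of the gist: `is_utf8` and `convert_if_needed` are hypothetical helper names, and the check assumes that real windows-1251 text containing Cyrillic letters will not also happen to be valid UTF-8.

```bash
#!/bin/bash
# Hypothetical helper (not from the gist): succeeds if the file decodes as valid UTF-8.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Hypothetical wrapper: convert one file from windows-1251 unless it already is UTF-8.
convert_if_needed() {
    local file=$1
    local tmp="$file.icv"
    if is_utf8 "$file"; then
        echo "skip (already UTF-8): $file"
        return 0
    fi
    # Convert into a temporary file so a failed run never truncates the original.
    if iconv -f WINDOWS-1251 -t UTF-8 "$file" > "$tmp"; then
        mv "$tmp" "$file"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Calling `convert_if_needed index.php` twice in a row should convert the file the first time and skip it the second time. Pure ASCII files are skipped as well, which is harmless because converting them would not change a single byte.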

loadinger commented Jun 15, 2017

thanks @anonymous2ch

finalchild commented Jul 25, 2017

worked like a charm
Thank you so much!!!!

ayzakh commented Aug 29, 2017

Thanks!

shuravban commented Dec 21, 2017

That script is bad, since iconv doesn't detect whether a file is already UTF-8.

Yes. I too often see something like
Какое унижение для противника!
It's UTF-8 text that was converted to UTF-8 a second time on the assumption that it was cp1251.
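
The double conversion described here can be reproduced in a few lines. This is a small sketch, assuming the script itself is saved as UTF-8; the sample word and the variable names are illustrative only.

```bash
#!/bin/bash
# Simulate the mistake: feed UTF-8 bytes to iconv while claiming they are windows-1251.
original='Привет'
garbled=$(printf '%s' "$original" | iconv -f WINDOWS-1251 -t UTF-8)
echo "$garbled"

# Undo one round of the mistake by converting in the opposite direction.
restored=$(printf '%s' "$garbled" | iconv -f UTF-8 -t WINDOWS-1251)
echo "$restored"
```

The second conversion gives back the original bytes, so a file that was converted once too often can usually be repaired the same way. The repair is not guaranteed for every text: byte 0x98 is undefined in windows-1251, so a UTF-8 file containing it (for example in the capital letter "И") may make iconv stop with an error during the mistaken conversion.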

mitya12342 commented Jan 2, 2018

@anonymous2ch That helped :)

1nt3g3r commented Jan 20, 2018

There is one issue: when file names contain spaces, the script doesn't work. Here is a corrected version of the script:

find ./ -name ".txt" -o -name ".html" -o -name ".css" -o -name ".js" -type f |
while read file
do
echo " $file"
mv "$file" "$file".icv
iconv -f WINDOWS-1251 -t UTF-8 "$file".icv > "$file"
rm -f "$file".icv
done

pasha-pivo commented Apr 26, 2018

@1nt3g3r, your script won't work. You missed the * in the file name patterns. To make it work, the first line should look like this:

find ./ -name "*.txt" -o -name "*.html" -o -name "*.css" -o -name "*.js" -type f |

However, your variant works much better than the original poster's. It even works with unprintable characters in the file names. Thanks!
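
For file names that contain backslashes, leading or trailing whitespace, or even newlines, quoting alone may not be enough, because a plain `read` still interprets some of those characters. A possible hardening, sketched below, is to pass NUL-delimited names from find to the loop. `convert_tree` is a hypothetical function name, and the sketch still assumes that every matched file is windows-1251, so it does not address the mixed-encoding concern raised earlier in the thread.

```bash
#!/bin/bash
# convert_tree is a hypothetical name; the directory argument defaults to ".".
convert_tree() {
    local dir=${1:-.} file
    # Parentheses make -type f apply to every pattern; -print0 separates names with NUL.
    find "$dir" \( -name "*.php" -o -name "*.html" -o -name "*.css" -o -name "*.js" \) -type f -print0 |
    while IFS= read -r -d '' file
    do
        echo " $file"
        # Convert into a temporary file and replace the original only on success.
        if iconv -f WINDOWS-1251 -t UTF-8 "$file" > "$file.icv"; then
            mv "$file.icv" "$file"
        else
            rm -f "$file.icv"
            echo " failed, left unchanged: $file" >&2
        fi
    done
}
```

Running `convert_tree /path/to/site` would then process the whole tree. `read -d ''` is a bash feature, so this version needs bash rather than a plain POSIX sh.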
