Bash script for recursive file conversion windows-1251 --> utf-8
#!/bin/bash
# Recursive file conversion windows-1251 --> utf-8
# Place this file in the root of your site, add execute permission and run
# Converts *.php, *.html, *.css, *.js files.
# To add file type by extension, e.g. *.cgi, add '-o -name "*.cgi"' to the find command
find ./ -name "*.php" -o -name "*.html" -o -name "*.css" -o -name "*.js" -type f |
while read file
do
echo " $file"
mv $file $file.icv
iconv -f WINDOWS-1251 -t UTF-8 $file.icv > $file
rm -f $file.icv
done
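
One property of the loop above is worth spelling out: the .icv copy is removed even when iconv fails, so a file that cannot be decoded may be left truncated with no backup. A minimal sketch of a more cautious per-file step follows. The helper name convert_file is hypothetical and not part of the original script; it writes to a temporary file and replaces the original only when iconv exits successfully.

```bash
#!/bin/bash
# convert_file is a hypothetical helper name, not part of the original script.
# It converts one file from WINDOWS-1251 to UTF-8 and replaces the original
# only if iconv exits successfully; on failure the original is left untouched.
convert_file() {
    local file=$1
    local tmp="$file.icv"
    if iconv -f WINDOWS-1251 -t UTF-8 "$file" > "$tmp"; then
        mv -- "$tmp" "$file"
    else
        rm -f -- "$tmp"
        echo "skipped (conversion failed): $file" >&2
        return 1
    fi
}
```

Inside the while loop this would be called as convert_file "$file" in place of the mv, iconv and rm lines.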

Batname Apr 15, 2014

Thanks

FernandoBasso May 7, 2015

Great script. Works perfectly, even on cygwin (which I have to use at work). Thanks a lot.

vkdimitrov Aug 4, 2015

Saved my day.

ranold Apr 7, 2016

thank you!

nymo Jan 20, 2017

Thanks! Really good script.

obojdi Feb 17, 2017

Confirmed working on cygwin, many thanks @akost!

anonymous2ch Feb 27, 2017

That script is bad, since iconv doesn't detect whether a file is already UTF-8, so it will ruin your files if run on a directory with files in mixed encodings. Running iconv more than once is guaranteed to screw up your files too.

What you actually should use for this operation is enca, since it will correctly detect input encoding and act accordingly.

After installing enca, just run this one-liner and your files will be UTF-8 in no time:
find ./ -name "*.php" -o -name "*.html" -o -name "*.css" -o -name "*.js" -type f | while read file; do enca -x UTF-8 "$file"; done
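
Where enca is not available, the "already UTF-8" problem can be approximated with iconv alone. The sketch below asks iconv to decode the file as UTF-8 first and converts only when that fails. The helper names is_utf8 and convert_if_needed are hypothetical. This is a heuristic, not real detection: pure-ASCII files count as valid UTF-8 (converting them would change nothing anyway), and a short WINDOWS-1251 file could in rare cases also happen to be valid UTF-8.

```bash
#!/bin/bash
# is_utf8 is a hypothetical helper: it succeeds when the whole file decodes
# as UTF-8, which iconv reports through its exit status.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# convert_if_needed (also hypothetical) converts a file from WINDOWS-1251
# only when it is not already valid UTF-8, so a second run is harmless.
convert_if_needed() {
    local file=$1
    if is_utf8 "$file"; then
        echo "already UTF-8, skipped: $file"
        return 0
    fi
    if iconv -f WINDOWS-1251 -t UTF-8 "$file" > "$file.icv"; then
        mv -- "$file.icv" "$file"
    else
        rm -f -- "$file.icv"
        return 1
    fi
}
```

With this check in place, running the conversion twice over the same tree leaves already converted files alone.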

loadinger Jun 15, 2017

thanks @anonymous2ch

finalchild Jul 25, 2017

worked like a charm
Thank you so much!!!!

ayzakh Aug 29, 2017

Thanks!

shuravban Dec 21, 2017

That script is bad, since iconv doesn't detect whether a file is already UTF-8.

Yes. All too often I see something like
Какое унижение для противника!
That is UTF-8 text that was converted to UTF-8 a second time on the assumption that it was cp1251.
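
The effect is easy to reproduce. The sketch below, which assumes the script file itself is saved as UTF-8 and uses an arbitrary sample word, feeds UTF-8 bytes to iconv as if they were WINDOWS-1251 and then reverses the mistake. The reversal works only when every byte survived the first pass; bytes that have no WINDOWS-1251 assignment may make iconv stop with an error instead.

```bash
#!/bin/bash
# Decode UTF-8 bytes as if they were WINDOWS-1251: this is what happens when
# the conversion is run on a file that is already UTF-8.
garbled=$(printf 'Привет' | iconv -f WINDOWS-1251 -t UTF-8)
echo "$garbled"

# Undo the mistake by encoding back to WINDOWS-1251, which reproduces the
# original UTF-8 byte sequence.
restored=$(printf '%s' "$garbled" | iconv -f UTF-8 -t WINDOWS-1251)
echo "$restored"
```

The first echo prints the garbled form, twice as many characters as the input, and the second prints the original word again.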

mitya12342 Jan 2, 2018

@anonymous2ch That helped :)

1nt3g3r Jan 20, 2018

There is a catch: when file names contain spaces, the script doesn't work. Here is a corrected version of the script:

find ./ -name ".txt" -o -name ".html" -o -name ".css" -o -name ".js" -type f |
while read file
do
echo " $file"
mv "$file" "$file".icv
iconv -f WINDOWS-1251 -t UTF-8 "$file".icv > "$file"
rm -f "$file".icv
done
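
Quoting takes care of spaces inside names. Names with leading or trailing whitespace, backslashes, or newlines can still trip a plain read. A sketch of a stricter loop follows, assuming a find that supports -print0 (GNU and BSD find do) and bash for read -d ''. The wrapper name convert_tree is hypothetical, and the patterns are written with the * wildcard.

```bash
#!/bin/bash
# convert_tree is a hypothetical wrapper around the conversion loop.
# find -print0 and read -d '' pass each path NUL-terminated, so spaces,
# backslashes and even newlines in names arrive intact.
convert_tree() {
    local root=${1:-.}
    find "$root" -type f \( -name "*.txt" -o -name "*.html" -o -name "*.css" -o -name "*.js" \) -print0 |
    while IFS= read -r -d '' file
    do
        echo " $file"
        mv -- "$file" "$file.icv"
        iconv -f WINDOWS-1251 -t UTF-8 "$file.icv" > "$file"
        rm -f -- "$file.icv"
    done
}
```

Calling convert_tree . from the site root mirrors the original usage; IFS= keeps surrounding whitespace and -r keeps backslashes literal.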

pasha-pivo Apr 26, 2018

@1nt3g3r, your script won't work: you left out the * in the file name patterns. To make it work, the first line should look like this:

find ./ -name "*.txt" -o -name "*.html" -o -name "*.css" -o -name "*.js" -type f |

However, your variant works much better than the original poster's. It works even with unprintable characters in file names. Thanks!
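
A related detail in that find line: the implicit "and" binds tighter than -o, so the trailing -type f applies only to the last -name test. A directory whose name ends in one of the earlier extensions would therefore be passed to mv and iconv as well. The sketch below shows the difference on a throwaway tree; the directory and file names are made up for the demonstration.

```bash
#!/bin/bash
# Build a throwaway tree: one directory whose name ends in .php and one file.
demo=$(mktemp -d)
mkdir "$demo/cache.php"
touch "$demo/app.js"

# Ungrouped: -type f binds only to the last -name, so the directory matches.
ungrouped=$(find "$demo" -name "*.php" -o -name "*.js" -type f)

# Grouped: -type f applies to every pattern inside the parentheses.
grouped=$(find "$demo" -type f \( -name "*.php" -o -name "*.js" \))

echo "ungrouped:"; echo "$ungrouped"
echo "grouped:"; echo "$grouped"
rm -rf "$demo"
```

The ungrouped listing includes the cache.php directory; the grouped one lists only app.js.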
