#!/bin/bash
# Recursive file conversion windows-1251 --> utf-8
# Place this file in the root of your site, add execute permission and run
# Converts *.php, *.html, *.css, *.js files.
# To add file type by extension, e.g. *.cgi, add '-o -name "*.cgi"' to the find command
find ./ -name "*.php" -o -name "*.html" -o -name "*.css" -o -name "*.js" -type f |
while read file
do
echo " $file"
mv $file $file.icv
iconv -f WINDOWS-1251 -t UTF-8 $file.icv > $file
rm -f $file.icv
done
That script is bad, since iconv doesn't detect whether a file is already UTF-8.
Yes. Too often I see something like this:
Какое унижение для противника!
It's UTF-8 text that was converted to UTF-8 a second time on the assumption that it was CP1251.
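To see how that kind of garbage comes about, here is a small sketch. The function names `double_encode` and `undo_double_encode` are made up for illustration; only `iconv` itself comes from the script above. It assumes an `iconv` that knows the `WINDOWS-1251` charset name.

```shell
# Reproduce the damage: feed bytes that are ALREADY UTF-8 to the same
# conversion the script runs, so each UTF-8 byte is read as a CP1251 letter.
double_encode() {
    iconv -f WINDOWS-1251 -t UTF-8
}

# Reverse it: encoding the garbled text back to CP1251 restores the original
# UTF-8 byte sequence, provided nothing was lost on the way in.
undo_double_encode() {
    iconv -f UTF-8 -t WINDOWS-1251
}

printf '%s\n' 'Какое' | double_encode                       # prints РљР°РєРѕРµ
printf '%s\n' 'Какое' | double_encode | undo_double_encode  # prints Какое
```

The reverse step is only a best-effort repair: byte 0x98 is not assigned in CP1251, so text containing letters such as "И" may make the first conversion fail outright rather than produce reversible garbage.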
# Group the -name tests so that -type f applies to every pattern
find ./ -type f \( -name "*.php" -o -name "*.html" -o -name "*.css" -o -name "*.js" -o -name "*.txt" \) |
while IFS= read -r file
do
    # Skip files that file(1) already reports as UTF-8
    if ! file -bi "$file" | grep -q 'utf-8'
    then
        echo " $file"
        mv "$file" "$file".icv
        iconv -f WINDOWS-1251 -t UTF-8 "$file".icv > "$file"
        rm -f "$file".icv
    fi
done
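A more defensive take on the same idea might look like the sketch below. It is not the author's script: the function name `convert_tree` is hypothetical, and instead of asking `file` it treats "decodes cleanly as UTF-8" as the skip test, which is an assumption that fits sites where the only two encodings in play are CP1251 and UTF-8. Passing the names through `find -exec ... {} +` rather than a pipe keeps filenames with spaces or newlines intact.

```shell
# Convert CP1251 files under a directory to UTF-8, skipping anything
# that is already valid UTF-8 (pure ASCII counts as valid and is skipped too).
convert_tree() {
    find "${1:-.}" -type f \
        \( -name '*.php' -o -name '*.html' -o -name '*.css' -o -name '*.js' -o -name '*.txt' \) \
        -exec sh -c '
            for file do
                # iconv exits non-zero if the input is not valid UTF-8
                if iconv -f UTF-8 -t UTF-8 "$file" >/dev/null 2>&1; then
                    continue
                fi
                tmp=$(mktemp) || exit 1
                # Only overwrite the original once the conversion has succeeded
                if iconv -f WINDOWS-1251 -t UTF-8 "$file" >"$tmp"; then
                    echo " $file"
                    cat "$tmp" >"$file"   # keeps the original permissions
                fi
                rm -f "$tmp"
            done
        ' sh {} +
}
```

Run it as `convert_tree .` from the site root. Because already-converted files are skipped, a second run should leave everything untouched.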
For lots of Russian filenames with spaces and so on, plus automatic codepage detection, this works best for me (macOS):
find ./ -name "*.sql" -type f | while read file; do enca -L russian -x UTF-8 "$file"; done;
Just a quick note: that requires enca to be installed (brew install enca) and might fail if, say, a CP-1251 file was incorrectly saved as UTF-8.
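One way to make that dependency explicit is to check for the tool before touching any files. This is only a sketch: `require_tool` and `convert_sql_with_enca` are hypothetical names, and the enca options are the ones from the one-liner above.

```shell
# Fail early with a readable message if a required command is not on PATH.
require_tool() {
    command -v "$1" >/dev/null 2>&1 || {
        echo "missing required tool: $1" >&2
        return 1
    }
}

# Same conversion as the one-liner, but guarded, and with the filenames
# handed to enca directly by find so spaces in names are not a problem.
convert_sql_with_enca() {
    require_tool enca || return 1
    find "${1:-.}" -type f -name '*.sql' -exec enca -L russian -x UTF-8 {} +
}
```

The guard does nothing about the second caveat: a file whose contents were already mangled by a wrong conversion still has to be repaired by hand.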
@1nt3g3r, your script won't work. You missed * in the filename templates. To make it work, the first line should look like this:
However, your variant works much better than the TS's. It works even with unprintable characters in the filenames. Thanks!