@AidasK
Last active December 28, 2015 08:15
Convert files to UTF-8 without a BOM and with UNIX (LF) line endings. Usage: `utf8unixnobom ./* -e=iso-8859-1`
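#!/bin/bash
# Look for an optional -e=ENCODING argument that forces the source encoding.
force=""
for arg in "$@"
do
    if [[ $arg =~ ^-e= ]]; then
        force=${arg:3}
        echo "FORCE $force"
    fi
done

for file in "$@"
do
    # Skip directories and anything else that is not a regular file
    # (this also skips the -e=... option itself).
    if [ ! -f "$file" ]; then
        continue
    fi
    # Charset as reported by file(1), e.g. "utf-8" or "iso-8859-1".
    enc=$(file -bi "$file" | awk '{ print $2 }' | sed -e 's/charset=//')
    echo "$file $enc"
    # NOBOM: strip a UTF-8 byte order mark from the first line.
    # Relies on the awk implementation understanding \x escapes (gawk does).
    awk '{if(NR==1)sub(/^\xef\xbb\xbf/,"");print}' "$file" > "$file.tmp"
    if [ $? -eq 0 ]
    then
        mv -f "$file.tmp" "$file"
    else
        rm -f "$file.tmp"
    fi
    if [ "$enc" != "utf-8" ] && [ -z "$force" ]
    then
        # Not UTF-8 and no encoding forced: preview the first two lines
        # under several candidate encodings, then stop so the script can
        # be rerun with -e=ENCODING.
        echo "SUGGESTIONS"
        firstLines=$(head -2 "$file")
        for TRYENC in "iso-8859-1" "iso-8859-2" "ISO-8859-8" "windows-1252" "windows-1251" "windows-1250" "cp1252"
        do
            echo "$TRYENC"
            echo "$firstLines" | iconv -f "$TRYENC" -t utf-8//TRANSLIT
            echo ""
        done
        exit
    fi
    # The detected charset is only trusted when it is utf-8;
    # otherwise the forced encoding is used as the source.
    if [ "$enc" != "utf-8" ]
    then
        enc=$force
    fi
    echo "$enc to utf-8"
    # UTF8: convert via a temporary file and replace the original on success.
    iconv -f "$enc" -t utf-8//TRANSLIT "$file" > "$file.tmp"
    if [ $? -eq 0 ]
    then
        mv -f "$file.tmp" "$file"
    else
        rm -f "$file.tmp"
    fi
    # UNIX LF
    dos2unix "$file" 2> /dev/null
done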
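
The three steps in the script (convert to UTF-8, drop the BOM, normalize line endings) can also be sketched as a single stream filter, which avoids temporary files and does not depend on `awk` escape support or on `dos2unix` being installed. This is only a minimal sketch under assumptions: `to_utf8_unix` is a hypothetical helper name that is not part of the gist, the caller is assumed to already know the source encoding, and the `//TRANSLIT` suffix is left out for simplicity.

```shell
# Hypothetical helper (not part of the gist): read stdin in the given
# encoding, write UTF-8 without a BOM and with LF line endings to stdout.
to_utf8_unix() {
    from="${1:-utf-8}"              # source encoding; assumed known by the caller
    bom=$(printf '\357\273\277')    # the BOM bytes EF BB BF, written in octal
    cr=$(printf '\r')               # a literal carriage return
    # 1. convert to UTF-8, 2. strip a BOM from line 1, 3. drop a trailing CR per line
    iconv -f "$from" -t utf-8 |
        sed -e "1s/^${bom}//" -e "s/${cr}\$//"
}

# Example: convert a Latin-1 file with CRLF line endings
# to_utf8_unix iso-8859-1 < input.txt > output.txt
```

Writing to a separate output file, as in the commented example, keeps the original untouched until the result has been checked; the gist itself replaces files in place.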