@exetico
Last active January 29, 2020 19:26
Bash script that adds an index column to a CSV file and encodes it as UTF-8. You can also specify your own separator. Created for easy preparation of a file before importing it into Firefly III.
#!/bin/bash
echo ""
if [ -z "${1}" ]; then
echo "csv-add-index-and-encode: Error!"
echo "Please define the input file as the first parameter, and optionally define the separator (like ;) as the second one, like:"
echo "./csv-fix.sh NemKonto_PosterinterTest3.csv \";\""
echo ""
exit 1
else
echo "> csv-add-index-and-encode: Let's start"
fi
in=$1
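# Put the "final-" prefix on the file name only, so inputs given with a directory path also work
out="$(dirname "$1")/final-$(basename "$1")"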
encoding_wanted="UTF-8"
current_encoding=$(uchardet "$1")
seperator=$2
if [ -z "${2}" ]; then
seperator=";"
echo "> Separator not defined, using the default one, which is ';'"
else
echo "> Using '$2' as separator"
fi
echo "> Using the following separator: '$seperator'"
# OFS (awk's output field separator) is empty here; set it to e.g. a space to add one after the index value.
awk -F'\t' -v OFS='' -v sep="$seperator" '
NR == 1 {print "\"ID\""sep, $0; next}
{print (NR-1)sep, $0}
' "$in" > "$out.tmp"
echo "> Saved tmp-file as: $out.tmp"
echo "> Index column added!"
echo "> Using iconv to change character-encoding to $encoding_wanted. Current encoding: $current_encoding."
iconv --from-code="$current_encoding" --to-code="$encoding_wanted" "$out.tmp" > "$out"
current_encoding=$(uchardet "$out")
echo "> Encoding done. uchardet now reports the following encoding: $current_encoding"
echo "> Removing $out.tmp ..."
rm "$out.tmp"
echo "> Result saved as: $out"
echo ""
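
The index step can be tried on its own, without uchardet or iconv. The following is a minimal sketch, assuming a hypothetical `add_index` function (not part of the script above) that wraps the same awk logic and reads CSV text from stdin; the sample rows are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): prepend an index
# column to CSV text read from stdin, using the given separator (default ';').
add_index() {
    local sep="${1:-;}"
    awk -v OFS='' -v sep="$sep" '
        NR == 1 { print "\"ID\"", sep, $0; next }   # header row gets the "ID" label
        { print NR - 1, sep, $0 }                   # data rows are numbered from 1
    '
}

# Made-up sample data, for illustration only.
printf 'Date;Amount\n2020-01-01;100\n2020-01-02;-25\n' | add_index ';'
# Prints:
# "ID";Date;Amount
# 1;2020-01-01;100
# 2;2020-01-02;-25
```

Because `OFS` is empty, the separator passed in `sep` is the only thing placed between the new index value and the original line, so the rest of each row is left untouched.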