@mmcfarland
Last active August 29, 2015 14:05
Checks for duplicate values in one column of an Excel file. Requires LibreOffice.
# usage:
# arg1: xls file to check
# arg2: position of the column to check for dups (1-based, as used by cut -f)
#
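# file name without directory or extension; libreoffice writes <name>.csv into --outdir,
# so strip any leading path and only the last .ext (names may contain other dots)
base_name=$(basename "$1")
base_name="${base_name%.*}"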
# convert to csv
libreoffice --headless --convert-to csv "$1" --outdir .
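# cut out the column specified and check for dups
# (cut splits on every comma, so quoted fields that contain commas will shift columns)
cut -d, -f"${2}" "${base_name}.csv" | sort | uniq -d > "${base_name}.dups"
# read the file on stdin so wc prints only the count, not the file name
echo "$(wc -l < "${base_name}.dups") duplicated column values"
rm "${base_name}.csv"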
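
The duplicate check itself can be tried without LibreOffice. Below is a minimal sketch of the same `cut | sort | uniq -d` pipeline run on a small hand-written CSV. The `dup_values` function name and the sample data are made up for illustration and are not part of the original script.

```shell
# Hypothetical helper (not in the original script): print each value that
# appears more than once in column $2 of the comma-separated file $1.
dup_values() {
  cut -d, -f"$2" "$1" | sort | uniq -d
}

# Made-up sample data standing in for the converted spreadsheet.
sample=$(mktemp)
printf '%s\n' 'id,name' '1,alice' '2,bob' '3,alice' '4,carol' '5,bob' > "$sample"

# Column 2 holds the names; alice and bob each occur twice.
dup_values "$sample" 2
# prints:
# alice
# bob

rm "$sample"
```

`uniq -d` only compares adjacent lines, which is why the values are sorted first. Each duplicated value is printed once regardless of how many times it repeats, so the count reported by the script is the number of distinct duplicated values, not the number of extra rows.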