@cougrimes
Last active September 5, 2018 17:51
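CSV Splitter. Use as ./csv-splitter.sh [your_csv_filename] [lines_per_output_file]. If the second argument is omitted, the script picks a line count meant to keep each output file just under 10MB.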
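#!/bin/bash
# Usage: ./csv-splitter.sh your_file.csv [lines_per_output_file]
# Note: stat -c%s is the GNU form; BSD/macOS stat uses -f%z instead.
FILENAME=$1
FILESIZE=$(stat -c%s "$FILENAME")
FILELINES=$(wc -l < "$FILENAME")
# Number of pieces needed at ~9.5MB each, rounded up and never zero
# (a zero here would cause a division by zero for files under 9.5MB).
SPLITRATIO=$(( (FILESIZE + 9499999) / 9500000 ))
[ "$SPLITRATIO" -lt 1 ] && SPLITRATIO=1
# Rows per output file: the second argument if given, else derived from size.
# NLINES is used because LINES is a variable Bash itself maintains.
if [ -z "$2" ]; then
  NLINES=$(( (FILELINES + SPLITRATIO - 1) / SPLITRATIO ))
else
  NLINES=$2
fi
[ "$NLINES" -lt 1 ] && NLINES=1
# Strip a trailing .csv, if present, to build the output names.
TRIM="${FILENAME%.csv}"
HDR=$(head -n 1 "$FILENAME")
# Split only the data rows so the first piece does not get the header twice.
tail -n +2 "$FILENAME" | split -l "$NLINES" - "${TRIM}_part_"
n=1
for f in "${TRIM}_part_"*
do
  [ -e "$f" ] || continue   # nothing was split (no data rows)
  printf '%s\n' "$HDR" > "${TRIM}_${n}.csv"
  cat "$f" >> "${TRIM}_${n}.csv"
  rm "$f"
  ((n++))
done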
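To see the header handling in isolation, here is a minimal sketch that splits a throwaway file two data rows at a time. The function name split_with_header, the _chunk_ prefix, and the sample rows are made up for this illustration; it is the same idea in miniature, not a replacement for the script.

```shell
#!/bin/bash
# Hypothetical helper: split FILE into pieces of ROWS data rows each,
# writing the original header line at the top of every piece.
split_with_header() {
  local src=$1 rows=$2
  local base=${src%.csv}
  local header n=1 f
  header=$(head -n 1 "$src")
  # Split only the data rows (everything after line 1).
  tail -n +2 "$src" | split -l "$rows" - "${base}_chunk_"
  for f in "${base}_chunk_"*; do
    [ -e "$f" ] || continue   # no data rows, nothing to write
    { printf '%s\n' "$header"; cat "$f"; } > "${base}_${n}.csv"
    rm "$f"
    n=$((n + 1))
  done
}

# Demo: a header plus five data rows, two rows per piece.
demo_dir=$(mktemp -d)
printf '%s\n' 'id,name' '1,a' '2,b' '3,c' '4,d' '5,e' > "$demo_dir/sample.csv"
split_with_header "$demo_dir/sample.csv" 2
cat "$demo_dir/sample_3.csv"
```

The demo leaves sample_1.csv, sample_2.csv, and sample_3.csv in the temporary directory; the last one holds the header and the single remaining row.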
@cougrimes
Author

Update 2018-09-05: the Bash script now tries to split a CSV file into smaller CSV files of just under 10MB each when a line count is not passed as the second argument.
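The sizing rule can be sketched as a small function. The 9,500,000-byte budget is the figure the script uses; the function name lines_per_chunk and the rounding-up are assumptions of this sketch rather than a description of the author's exact arithmetic.

```shell
#!/bin/bash
# Hypothetical helper: how many lines to put in each piece so that a file of
# FILESIZE bytes and FILELINES lines ends up in pieces of roughly BUDGET bytes.
lines_per_chunk() {
  local filesize=$1 filelines=$2 budget=${3:-9500000}
  # Ceiling division: number of pieces needed, never fewer than one.
  local chunks=$(( (filesize + budget - 1) / budget ))
  if [ "$chunks" -lt 1 ]; then chunks=1; fi
  # Ceiling division again so that no lines are left over.
  local per=$(( (filelines + chunks - 1) / chunks ))
  if [ "$per" -lt 1 ]; then per=1; fi
  echo "$per"
}

# A 25,000,000-byte file with 100,000 lines needs 3 pieces.
lines_per_chunk 25000000 100000   # prints 33334
```

This assumes lines are of roughly equal length. A file with a few very long rows can still produce a piece over the budget, which is why the target sits below 10MB rather than at it.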
