@steezeburger
Last active November 10, 2022 09:55
Bash script for splitting a large CSV file into 100-line parts while keeping the header in each part.
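#!/bin/bash
FILENAME=file-to-split.csv
HDR=$(head -n 1 "${FILENAME}")   # save the header row
split -l 100 "${FILENAME}" xyz   # chunks are written as xyzaa, xyzab, ...
n=1
for f in xyz*
do
if [[ ${n} -ne 1 ]]; then   # the first chunk already starts with the header
echo "${HDR}" > "part-${n}-${FILENAME}.csv"
fi
cat "${f}" >> "part-${n}-${FILENAME}.csv"
rm "${f}"
((n++))
done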
@madurapa

Thanks.

A couple of improvements could be made, though.

  1. The first part's 100 lines include the header, so it always holds one fewer data row (n-1) than the other parts.
  2. Adding the .csv extension on lines 9 and 11 doubles it in the output file names, because FILENAME already ends in .csv.
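
Both points could be addressed along these lines. This is only a sketch, not the gist author's version: the function name split_csv, its two arguments, and the chunk- temporary prefix are made up for illustration, and it assumes the input name ends in .csv and that mktemp -d is available.

```bash
#!/bin/bash
# Sketch: split a CSV into parts of N data rows, repeating the header in each.
# split_csv, its arguments, and the "chunk-" prefix are illustrative names.
split_csv() {
  local file=$1
  local rows=${2:-100}
  local base header tmp f
  local n=1
  base=$(basename "$file" .csv)   # strip the extension so it is not doubled
  header=$(head -n 1 "$file")
  tmp=$(mktemp -d) || return 1    # keep temporary chunks out of the working directory
  # Skip the header before splitting so every part gets the full number of data rows.
  tail -n +2 "$file" | split -l "$rows" - "$tmp/chunk-"
  for f in "$tmp"/chunk-*; do
    [[ -e $f ]] || break          # no data rows: nothing to write
    printf '%s\n' "$header" > "part-${n}-${base}.csv"
    cat "$f" >> "part-${n}-${base}.csv"
    ((n++))
  done
  rm -rf "$tmp"
}
```

Calling split_csv file-to-split.csv 100 would then write part-1-file-to-split.csv, part-2-file-to-split.csv, and so on into the current directory, each starting with the header followed by up to 100 data rows.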
