@yorzi
Created April 27, 2013 09:19
A script to split a big CSV file into several smaller files, each starting with the original header row.
#!/bin/bash
# check if an input filename was passed as a command
# line argument:
if [ "$#" -ne 1 ]; then
  echo "Please specify the name of a file to split!" >&2
  exit 1
fi
# create a directory to store the output:
mkdir -p output
# create a temporary file containing the header without
# the content:
head -n 1 "$1" > header.csv
# create a temporary file containing the content without
# the header:
tail -n +2 "$1" > content.csv
# split the content file into multiple files of 40000 lines each:
split -l 40000 content.csv output/data_
# loop through the new split files, adding the header
# and a '.csv' extension:
for f in output/*; do cat header.csv "$f" > "$f.csv"; rm "$f"; done
# remove the temporary files:
rm header.csv
rm content.csv
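
The same steps can also be wrapped in a function that streams the content straight into `split`, so no temporary `header.csv` or `content.csv` is needed. This is only a sketch under assumptions: the function name `split_csv` and its `lines_per_file` and `out_dir` parameters are hypothetical and not part of the original script, and it assumes the header is exactly one line with no embedded newlines.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original gist):
#   split_csv <input.csv> [lines_per_file] [out_dir]
split_csv() {
  local input="$1"
  local lines_per_file="${2:-40000}"   # default matches the script above
  local out_dir="${3:-output}"
  local header f

  mkdir -p "$out_dir" || return 1

  # Keep the header line in a variable instead of a temporary file.
  header="$(head -n 1 "$input")" || return 1

  # Stream everything after the header into split; "-" means read stdin.
  tail -n +2 "$input" | split -l "$lines_per_file" - "$out_dir/data_"

  # Prepend the header to each chunk and add the .csv extension.
  for f in "$out_dir"/data_*; do
    [ -e "$f" ] || continue              # no chunks were produced
    case "$f" in *.csv) continue ;; esac # skip files that are already finished
    { printf '%s\n' "$header"; cat "$f"; } > "$f.csv"
    rm "$f"
  done
}
```

For example, `split_csv big.csv 10000 parts` (with `big.csv` standing in for any input file) would write `parts/data_aa.csv`, `parts/data_ab.csv`, and so on, each holding the header plus up to 10000 data rows.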