Last active
March 10, 2017 18:06
Chunk a csv into many files
#!/bin/bash
FILENAME=cpdb_spending.csv
HDR=$(head -n 1 "$FILENAME")   # Pick up CSV header line to apply to each file
split -l 200000 "$FILENAME" xyz   # Split the file into chunks of 200000 lines each
n=1
for f in xyz*   # Go through all newly created chunks
do
  if [ "$n" -gt 1 ]   # The first chunk already starts with the original header
  then
    echo "$HDR" > "Part${n}.csv"   # Write out header to new file called "Part(n)"
  fi
  cat "$f" >> "Part${n}.csv"   # Add in the lines from the "split" command
  zip -r "Part${n}.zip" "Part${n}.csv"
  rm "$f"   # Remove temporary file
  rm "Part${n}.csv"
  ((n++))   # Increment name of output part
done
# Found on this quora post and adapted https://www.quora.com/How-can-I-parse-a-CSV-string-with-Javascript
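
One possible variation on the script above, as a rough sketch rather than a replacement: split only the data rows, so every chunk gets the header the same way and no special case is needed for the first file. The function name `chunk_csv`, its arguments, and the `chunk_` temporary prefix are hypothetical and not part of the original gist. The zip step is left out to keep the sketch short. Like the original, it splits on raw lines, so it assumes no quoted field contains an embedded newline.

```shell
#!/bin/bash
# Hypothetical helper (not in the original gist):
#   chunk_csv FILE ROWS [PREFIX]
# Writes PREFIX1.csv, PREFIX2.csv, ... each holding the header line
# followed by at most ROWS data rows.
chunk_csv() {
  local file=$1 rows=$2 prefix=${3:-Part}
  local hdr tmpdir f n=1
  hdr=$(head -n 1 "$file")            # CSV header line
  tmpdir=$(mktemp -d)                 # Scratch directory for raw chunks
  # Split everything after the header, reading from stdin ("-")
  tail -n +2 "$file" | split -l "$rows" - "$tmpdir/chunk_"
  for f in "$tmpdir"/chunk_*
  do
    [ -e "$f" ] || continue           # No data rows: the glob matched nothing
    printf '%s\n' "$hdr" > "${prefix}${n}.csv"   # Header first
    cat "$f" >> "${prefix}${n}.csv"              # Then this chunk's rows
    n=$((n + 1))
  done
  rm -r "$tmpdir"                     # Remove temporary chunks
}
```

With the file name from the script above, a call might look like `chunk_csv cpdb_spending.csv 200000`, which would leave `Part1.csv`, `Part2.csv`, and so on in the current directory, ready to be zipped as before.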