@mossprescott
Created September 12, 2017 19:37
Download, gunzip, and concatenate a directory of (CSV) files from S3
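#!/bin/bash
# Usage: script REMOTE_DIR LOCAL_FILE
#   REMOTE_DIR  S3 location holding part*.gz files, e.g. s3://bucket/prefix
#   LOCAL_FILE  output file; must not already exist
set -euo pipefail

REMOTE_DIR=${1%/}   # drop one trailing slash so paths don't contain "//"
LOCAL_FILE=$2

# Refuse to append to an existing file.
if [ -e "$LOCAL_FILE" ]; then
  echo "File exists: $LOCAL_FILE" >&2
  exit 1
fi

# Keep only the object names matching part*gz from the listing.
for f in $(aws s3 ls "$REMOTE_DIR/" | sed -n 's/.* \(part.*gz\)/\1/p'); do
  path="$REMOTE_DIR/$f"
  tmp=$(mktemp /tmp/XXXXXXXX)
  aws s3 cp "$path" "$tmp.gz"
  ls -l "$tmp.gz"
  gunzip -f "$tmp.gz"   # -f overwrites the empty file created by mktemp
  ls -l "$tmp"
  cat "$tmp" >> "$LOCAL_FILE"
  rm "$tmp"
done

ls -l "$LOCAL_FILE"
wc -l "$LOCAL_FILE"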
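
The sed filter in the loop can be tried offline, without touching S3. The sketch below wraps it in a hypothetical helper, `list_parts`, and feeds it a made-up listing in the date, time, size, name layout that `aws s3 ls` prints. The sample lines are illustrative only, not real output.

```shell
# Hypothetical helper (not part of the script above): read listing text on
# stdin and print only the object names that look like part*gz.
list_parts() {
  sed -n 's/.* \(part.*gz\)/\1/p'
}

# Made-up sample listing; a real one would come from `aws s3 ls "$REMOTE_DIR/"`.
sample='2017-09-12 19:30:01          0 _SUCCESS
2017-09-12 19:30:02       1234 part-00000.csv.gz
2017-09-12 19:30:03       5678 part-00001.csv.gz'

# Prints the two part file names; the _SUCCESS line is dropped.
printf '%s\n' "$sample" | list_parts
```

Because the pattern requires a space before `part`, it only matches names that begin with `part` in the listing, and because the loop splits on whitespace, object names containing spaces would not survive it.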