@rezamt
Last active May 14, 2019 07:00
AWS S3 multipart uploader (using jq and the aws s3api command-line API)
#!/bin/bash
BUCKET_NAME="soevm"
RAW_FILE_BASE="win2k-soe"
RAW_FILE="win2k-soe.raw"
UPLOAD_PAC="upload.json"
PARTS=7
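# Example: a 6.5 GB Windows 2016 ISO splits into 7 parts of up to 1 GiB each.
# {1..$PARTS} is not expanded by bash (brace expansion runs before variable expansion), so use a C-style loop.
for ((i = 1; i <= PARTS; i++)); do dd if="$RAW_FILE" of="$RAW_FILE_BASE-$i.raw" bs=1024k skip=$(( (i - 1) * 1024 )) count=1024; done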
# Start the multipart upload and keep its UploadId for the later calls
UploadId=$(aws s3api create-multipart-upload --bucket "$BUCKET_NAME" --key "$RAW_FILE" | jq --raw-output .UploadId)
echo "Upload ID: $UploadId"
cat >$UPLOAD_PAC<<EOF
{
"Parts": [
]
}
EOF
# Run through all slices and upload them to AWS
for ((i = 1; i <= PARTS; i++)); do
TIMEFORMAT='Upload completed in %R seconds'
time {
echo "Uploading File part #$i"
ETag=$(aws s3api upload-part --bucket "$BUCKET_NAME" --key "$RAW_FILE" --upload-id "$UploadId" --part-number "$i" --body "$RAW_FILE_BASE-$i.raw" | jq --raw-output .ETag)
echo "ETag: $ETag for UploadNumber $i"
# Append this part's ETag and number to the Parts list
jq --arg ETag "$ETag" --argjson PartNumber "$i" '.Parts += [{"ETag": $ETag, "PartNumber": $PartNumber}]' "$UPLOAD_PAC" > "$UPLOAD_PAC.bak" && mv "$UPLOAD_PAC.bak" "$UPLOAD_PAC"
}
done
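# Ask S3 to assemble the uploaded parts into the final object
aws s3api complete-multipart-upload --bucket "$BUCKET_NAME" --key "$RAW_FILE" --upload-id "$UploadId" --multipart-upload "file://$UPLOAD_PAC"
# Remove the local slices and the parts manifest
rm -f "$RAW_FILE_BASE"-*.raw
rm -f "$UPLOAD_PAC"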
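
The script hard-codes PARTS=7 for a 6.5 GB image. As a minimal sketch, the part count could instead be derived from the file size by ceiling division. The function name `parts_needed` is hypothetical and not part of the original gist; the 1 GiB chunk size in the usage comment matches the `bs=1024k count=1024` used by `dd` above.

```shell
#!/bin/bash
# parts_needed SIZE_BYTES CHUNK_BYTES  (hypothetical helper, not in the original gist)
# Prints how many chunks of CHUNK_BYTES are needed to cover SIZE_BYTES.
parts_needed() {
  local size=$1 chunk=$2
  # Ceiling division: round up so a trailing partial chunk still gets its own part
  echo $(( (size + chunk - 1) / chunk ))
}

# Usage with 1 GiB chunks:
#   PARTS=$(parts_needed "$(wc -c < "$RAW_FILE")" $((1024 * 1024 * 1024)))
```

The last part may be shorter than the others; `dd` simply stops at the end of the input file.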
rezamt commented May 14, 2019

Important:

  • Running in Parallel

Challenge:

  • Bandwidth and throughput: when the parts are uploaded in parallel, will the overall throughput be as good as a single upload?
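
One way the parallel idea could look is sketched below. This is not the author's method: `upload_one` and `upload_parallel` are hypothetical helper names, and the simple batch-and-wait scheduling is an assumption chosen for brevity. It relies on the variables the script already sets (`BUCKET_NAME`, `RAW_FILE`, `RAW_FILE_BASE`, `UploadId`). Parts may be uploaded in any order, but the Parts list passed to `complete-multipart-upload` must be in ascending part-number order, so the results are printed in that order. Whether this beats a single upload depends on the available bandwidth and is not measured here.

```shell
#!/bin/bash
# Sketch only: upload_one and upload_parallel are hypothetical helpers.

# Upload one part and print its ETag (the same call the sequential loop makes).
upload_one() {
  local part=$1
  aws s3api upload-part --bucket "$BUCKET_NAME" --key "$RAW_FILE" \
    --upload-id "$UploadId" --part-number "$part" \
    --body "$RAW_FILE_BASE-$part.raw" | jq --raw-output .ETag
}

# upload_parallel PARTS MAX_JOBS
# Runs upload_one for parts 1..PARTS in batches of MAX_JOBS background jobs,
# then prints one "<part-number> <ETag>" line per part in ascending order.
# Returns non-zero if any part produced no ETag.
upload_parallel() {
  local parts=$1 max_jobs=$2 tmpdir i etag status=0
  tmpdir=$(mktemp -d) || return 1
  for ((i = 1; i <= parts; i++)); do
    upload_one "$i" > "$tmpdir/$i.etag" &
    # After every MAX_JOBS launches, wait for the whole batch to finish
    if (( i % max_jobs == 0 )); then wait; fi
  done
  wait  # wait for the final, possibly partial, batch
  for ((i = 1; i <= parts; i++)); do
    etag=$(cat "$tmpdir/$i.etag")
    [ -n "$etag" ] || status=1  # an empty ETag means that part failed
    printf '%s %s\n' "$i" "$etag"
  done
  rm -rf "$tmpdir"
  return "$status"
}
```

Each output line can then be fed to the same `jq` command the sequential loop uses to append to the Parts list, for example `upload_parallel "$PARTS" 3 | while read -r n etag; do ...; done`.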