@dramaticlly
Created June 9, 2022 21:01
Recover Data from S3 Delete Markers Using Batch Deletion
#!/bin/bash
set -e
# S3 list object version command reference
# https://docs.aws.amazon.com/cli/latest/reference/s3api/list-object-versions.html
# S3 batch deletion CLI command reference
# https://docs.aws.amazon.com/cli/latest/reference/s3api/delete-objects.html
DAYOFINTEREST=20220506
BUCKET=aiml-prod-data-transfer-legacy
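# PATH_TO_YOUR_PREFIX is a placeholder; set it to your own key prefix before running
PREFIX="$PATH_TO_YOUR_PREFIX/$DAYOFINTEREST"
# 1. List the keys whose latest version is a delete marker and export them to an input JSON file
# 2. Slice the input JSON by batch size (S3 accepts up to 1000 keys per delete-objects request) and generate one output JSON per batch
# 3. Call delete-objects with each sliced JSON file in a while loop; removing a delete marker by its VersionId makes the previous version current again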
aws s3api list-object-versions \
--bucket "$BUCKET" \
--prefix "$PREFIX" \
--output json \
--query 'DeleteMarkers[?IsLatest==`true`].[Key, VersionId]' \
> "${DAYOFINTEREST}input.json"
TOTALLENGTH=$(jq 'length' "${DAYOFINTEREST}input.json")
BATCH_SIZE=1000
# jq slices are zero-based, so start at 0; starting at 1 would skip the first delete marker
lower=0
while (( lower < TOTALLENGTH )); do
  upper=$((lower + BATCH_SIZE))
  echo "$lower:$upper"
  # Take the slice [lower, upper) and reshape each [Key, VersionId] pair into the delete-objects payload
  jq --argjson lower "$lower" --argjson upper "$upper" \
    '{Objects: [.[$lower:$upper][] | {Key: .[0], VersionId: .[1]}]}' \
    "${DAYOFINTEREST}input.json" > "${DAYOFINTEREST}output_${lower}_${upper}.json"
  echo "removing delete markers in batch from ${lower} to ${upper} for day=${DAYOFINTEREST}"
  aws s3api delete-objects --bucket "$BUCKET" --delete "file://${DAYOFINTEREST}output_${lower}_${upper}.json"
  lower=$upper
done
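
The batch slicing in step 2 can be checked without touching S3. Below is a minimal sketch, assuming a hypothetical helper named `batch_ranges` that is not part of the original script. It prints the zero-based `lower:upper` bounds the loop would pass to jq, with the last upper bound clamped to the total for readability (jq itself tolerates an upper bound past the end of the array).

```shell
#!/bin/bash
# Hypothetical helper (an illustration, not part of the original script):
# print zero-based "lower:upper" slice bounds for a list of <total> entries
# split into batches of at most <batch_size> entries.
batch_ranges() {
  local total=$1 batch_size=$2
  local lower=0 upper
  while (( lower < total )); do
    upper=$(( lower + batch_size ))
    # Clamp the final batch so the printed range ends at the total
    if (( upper > total )); then
      upper=$total
    fi
    echo "${lower}:${upper}"
    lower=$upper
  done
}

# Example: 2500 delete markers with the S3 limit of 1000 keys per request
batch_ranges 2500 1000
# prints:
# 0:1000
# 1000:2000
# 2000:2500
```

Running this against the value of `TOTALLENGTH` before the real loop gives a quick sanity check that every entry, including index 0, falls into exactly one batch.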