@goblain
Last active November 10, 2015 09:44
S3 cleanup based on timeslots
#!/bin/bash
##
## Gradual cleanup of S3 stored backups
## by Radek 'Goblin' Pieczonka <goblin@pentex.pl>
##
## Configure with environment variables
## - S3BUCKET
## - S3PATH
## Files need to be named YYYYMMDDHHmm.tgz
## TODO: support for moving to glacier for long term backup storage
##
## Requires the AWS CLI and GNU date (for --date)
## Run the script like this: S3BUCKET=<bucket> S3PATH=<path_in_bucket> AWS_ACCESS_KEY_ID=<aws_id> AWS_SECRET_ACCESS_KEY=<aws_key> ./cleanup.sh
##
# Abort early if the required variables are unset or empty
: "${S3BUCKET:?S3BUCKET must be set}"
: "${S3PATH:?S3PATH must be set}"
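# List the bucket path and keep only names matching YYYYMMDDHHmm.tgz
# (awk splits on runs of whitespace, which copes with the padded size column)
for FILE in $(aws s3 ls "s3://${S3BUCKET}/${S3PATH}/" | awk '{print $4}' | grep -E '^[0-9]{12}\.tgz$')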
do
# Calculate days since the epoch (in UTC) for the file and for the start of the current week, then the difference
FILE_EPOCHDAY=$(( $(date -u --date "${FILE:0:8}" +%s)/24/3600 ))
WEEKSTART_EPOCHDAY=$(( $(date -u +%s)/24/3600 - $(date -u +%u) ))
DIFF=$(( WEEKSTART_EPOCHDAY - FILE_EPOCHDAY ))
echo -n "${FILE} : ${FILE_EPOCHDAY} ${WEEKSTART_EPOCHDAY} ${DIFF} : "
# Define modulo LEVEL for the next step based on how old the file is in days
LEVEL=0
# Reset per file so the flag from an earlier, older file does not carry over
GLACIER=false
[ "${DIFF}" -gt 7 ] && LEVEL=1
[ "${DIFF}" -gt 32 ] && LEVEL=2
[ "${DIFF}" -gt 64 ] && LEVEL=3
[ "${DIFF}" -gt 128 ] && LEVEL=4
[ "${DIFF}" -gt 366 ] && LEVEL=6 && GLACIER=true
# Decide whether to keep or remove the file: keep it if its epoch day is divisible by 2^LEVEL.
# The modulo is taken on the day number since the epoch so the result is the same on every run,
# which gradually lowers the backup density as files get older.
if [ $(( ${FILE_EPOCHDAY}%(2**${LEVEL}) )) -eq 0 ]
then
echo "keep"
[ "${GLACIER}" == "true" ] && echo "and move to glacier (TODO)"
else
echo "remove"
aws s3 rm "s3://${S3BUCKET}/${S3PATH}/${FILE}"
fi
done
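
To check the thinning rule without touching S3, the keep/remove decision can be pulled out on its own. This is a minimal sketch, not part of the original script. The `decide` function name and the sample day numbers are made up for illustration. It uses the same thresholds as the loop above and writes 2^LEVEL as `1 << level`, which gives the same value.

```shell
# Hypothetical helper (not in the original script):
#   $1 = day number since the epoch for the backup file
#   $2 = age of the file in days (DIFF in the script above)
# Prints "keep" or "remove". The body runs in a subshell, so the
# variables it sets do not leak into the caller.
decide() (
  epochday=$1
  diff=$2
  level=0
  [ "${diff}" -gt 7 ]   && level=1
  [ "${diff}" -gt 32 ]  && level=2
  [ "${diff}" -gt 64 ]  && level=3
  [ "${diff}" -gt 128 ] && level=4
  [ "${diff}" -gt 366 ] && level=6
  # Keep only files whose epoch day is a multiple of 2^level
  if [ $(( epochday % (1 << level) )) -eq 0 ]; then
    echo keep
  else
    echo remove
  fi
)

# Sample calls with made-up day numbers
decide 16384 3     # younger than a week: every file is kept
decide 16385 10    # level 1: odd epoch day, so removed
decide 16400 100   # level 3: 16400 is a multiple of 8, so kept
decide 16400 400   # level 6: 16400 is not a multiple of 64, so removed
```

Running a range of day numbers through `decide` for a fixed age shows how many backups survive at each level: one in two after a week, one in four after 32 days, and so on.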