Created November 4, 2021 15:57
Bulk Delete S3 Objects by LastModified Date
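import boto3
from datetime import datetime, timezone

s3 = boto3.client("s3")
bucket = "INSERT_BUCKET_NAME"
paginator = s3.get_paginator("list_objects_v2")
pages = paginator.paginate(Bucket=bucket)
# LastModified is timezone-aware (UTC), so the cutoff must be aware too.
date_check = datetime(2021, 11, 1, tzinfo=timezone.utc)
keys_to_delete = []
for page in pages:
    # "Contents" is absent when a page holds no objects.
    for obj in page.get("Contents", []):
        # Selects objects modified AFTER the cutoff; use < to select older ones.
        if obj["LastModified"] > date_check:
            keys_to_delete.append({"Key": obj["Key"]})
# delete_objects accepts at most 1000 keys per request, so send in batches.
for i in range(0, len(keys_to_delete), 1000):
    batch = keys_to_delete[i:i + 1000]
    s3.delete_objects(Bucket=bucket, Delete={"Objects": batch})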
That would be one way to do it (assuming you're trying to clear out old objects in an S3 bucket). There may be an easier way, though: an S3 lifecycle rule can expire objects for you. See https://aws.amazon.com/premiumsupport/knowledge-center/s3-empty-bucket-lifecycle-rule/
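For reference, here is a rough sketch of what an expiration lifecycle rule could look like when set through boto3 rather than the console. The rule ID, the 7-day value, and the helper names `expiration_rule` and `apply_rule` are made up for illustration. Note that `put_bucket_lifecycle_configuration` replaces any lifecycle configuration already on the bucket, and that expiration counts days from object creation, so check this against your own setup before using it.

```python
def expiration_rule(days, prefix=""):
    """Build a lifecycle configuration that expires objects after `days` days."""
    return {
        "Rules": [
            {
                "ID": f"expire-after-{days}-days",  # hypothetical rule name
                "Filter": {"Prefix": prefix},       # empty prefix = every object
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


def apply_rule(bucket, days):
    """Attach the rule to a bucket (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the builder above runs without boto3

    s3 = boto3.client("s3")
    # Overwrites the bucket's existing lifecycle configuration, if any.
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=expiration_rule(days)
    )


config = expiration_rule(7)
print(config["Rules"][0]["Expiration"])  # {'Days': 7}
```

Calling `apply_rule("INSERT_BUCKET_NAME", 7)` would then let S3 do the cleanup on its own schedule, with no Lambda needed.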
To do the comparison, you'd compute a cutoff relative to the current date and check each object's LastModified against it.
Maybe something like this:
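from datetime import datetime, timedelta, timezone
# Use an aware UTC datetime so it compares cleanly with LastModified.
now = datetime.now(timezone.utc)
date_check = now - timedelta(days=7)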
That should set the date_check to be 7 days prior to the current date. This hasn't been tested but should be a good start to solving that.
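A slightly fuller sketch of the 7-day cutoff, written as a small helper so it can be tried without touching a bucket. The helper name `keys_older_than` and the sample data are assumptions for illustration; the sample dicts only mimic the shape of the entries in `page["Contents"]`. The subtraction uses `timedelta`, and the cutoff is a timezone-aware UTC datetime so it can be compared with `LastModified`.

```python
from datetime import datetime, timedelta, timezone


def keys_older_than(objects, days, now=None):
    """Return delete_objects-style entries for objects modified over `days` days ago."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    return [{"Key": o["Key"]} for o in objects if o["LastModified"] < cutoff]


# Stand-in data shaped like the entries of page["Contents"].
now = datetime(2021, 11, 8, tzinfo=timezone.utc)
sample = [
    {"Key": "old.log", "LastModified": datetime(2021, 10, 25, tzinfo=timezone.utc)},
    {"Key": "new.log", "LastModified": datetime(2021, 11, 7, tzinfo=timezone.utc)},
]
print(keys_older_than(sample, days=7, now=now))  # [{'Key': 'old.log'}]
```

In the real script you would pass each page's `Contents` list to the helper and send the result to `delete_objects`.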
Hi,
I want to compare the LastModified date with the current date. I want to run this function every day, and if the LastModified date is <, > or = the current date, it should delete the object. Also, I am based in Sydney, so the LastModified date is shown in AEST, whereas Lambda checks the date in UTC.
Please help if possible; I don't have much coding knowledge and I am new to the DevOps space.
Thank You.
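On the AEST versus UTC point: boto3 returns `LastModified` as a timezone-aware datetime, and comparisons between aware datetimes are made on the actual instant, so a greater-than or less-than check gives the same answer in either zone. The zone only matters when you ask "is this the same calendar day?". Below is a minimal sketch of that check. It assumes a fixed UTC+10 offset as a stand-in for Sydney standard time, and `same_local_day` is a made-up helper name. On Python 3.9+, `zoneinfo.ZoneInfo("Australia/Sydney")` could be passed as `tz` instead to account for daylight saving.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC+10 offset stands in for Sydney standard time (AEST).
AEST = timezone(timedelta(hours=10), "AEST")


def same_local_day(last_modified, now, tz=AEST):
    """True if both aware datetimes fall on the same calendar day in `tz`."""
    return last_modified.astimezone(tz).date() == now.astimezone(tz).date()


# 2021-11-03 20:30 UTC is already 2021-11-04 06:30 in AEST.
last_modified = datetime(2021, 11, 3, 20, 30, tzinfo=timezone.utc)
now = datetime(2021, 11, 4, 1, 0, tzinfo=timezone.utc)  # 11:00 AEST on Nov 4

print(same_local_day(last_modified, now))                # True in AEST
print(same_local_day(last_modified, now, timezone.utc))  # False in UTC
```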