@schuettc
Created November 4, 2021 15:57
Bulk Delete S3 Objects by LastModified Date
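import boto3
from datetime import datetime, timezone

s3 = boto3.client("s3")
bucket = "INSERT_BUCKET_NAME"
paginator = s3.get_paginator("list_objects_v2")
pages = paginator.paginate(Bucket=bucket)
# LastModified is timezone-aware (UTC), so the cutoff must be too.
date_check = datetime(2021, 11, 1, tzinfo=timezone.utc)
keys_to_delete = []
for page in pages:
    # Pages with no matching objects have no "Contents" key.
    for obj in page.get("Contents", []):
        # Selects objects modified AFTER the cutoff; use < to select older ones.
        if obj["LastModified"] > date_check:
            keys_to_delete.append({"Key": obj["Key"]})
# delete_objects accepts at most 1,000 keys per request, so send batches.
for i in range(0, len(keys_to_delete), 1000):
    batch = keys_to_delete[i : i + 1000]
    s3.delete_objects(Bucket=bucket, Delete={"Objects": batch})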
@Yusufdoc91

Hi,

I want to compare the LastModified date with the current date. I'd like to run this function every day and delete an object if its LastModified date is <, > or = the current date. Also, I am based in Sydney, so the LastModified date is shown in AEST, whereas Lambda checks the date in UTC.

Please help if possible. I don't have much coding knowledge and I am new to the DevOps space.
Thank You.

@schuettc
Author

That would be one way to do it (assuming you're trying to clear out old objects in an S3 bucket). There may be an easier way to do it though: https://aws.amazon.com/premiumsupport/knowledge-center/s3-empty-bucket-lifecycle-rule/
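
For reference, an expiration lifecycle rule of the kind the linked article describes can also be set from boto3. This is a minimal sketch that has not been run against a real bucket: the helper names build_expiration_rule and apply_expiration_rule, the rule ID, and the 7-day window are assumptions made for illustration, while put_bucket_lifecycle_configuration is the boto3 S3 client call.

```python
def build_expiration_rule(days, prefix=""):
    """Return a lifecycle configuration that expires objects `days` after creation."""
    return {
        "Rules": [
            {
                "ID": f"expire-after-{days}-days",  # hypothetical rule name
                "Filter": {"Prefix": prefix},  # an empty prefix matches every object
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


def apply_expiration_rule(bucket, days):
    # Imported here so build_expiration_rule can be inspected without AWS access.
    import boto3

    s3 = boto3.client("s3")
    # Note: this call replaces any lifecycle configuration already on the bucket.
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=build_expiration_rule(days),
    )
```

With a rule like this in place, S3 expires the objects itself, so no daily Lambda run is needed.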

To do the comparison, you'd compute a cutoff by subtracting a delta from the current date, then check each object's LastModified against that cutoff.

Maybe something like this:

from datetime import datetime, timedelta, timezone
today = datetime.now(timezone.utc)  # timezone-aware, like S3's LastModified
date_check = today - timedelta(days=7)

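That should set date_check to 7 days before the current date. This hasn't been tested, but it should be a good start. On the timezone question: boto3 returns LastModified as a timezone-aware UTC datetime, so comparing it against a UTC cutoff works regardless of the local time shown elsewhere.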
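
The cutoff idea can be tried without touching S3. The sketch below is illustrative only: older_than is a hypothetical helper, the sample objects stand in for the "Contents" entries that list_objects_v2 returns, and the fixed +10:00 offset is an assumption used to show that a timestamp written in local time compares the same as its UTC equivalent. It selects objects older than the cutoff (<), which is the "clear out old objects" case; the original snippet uses > instead.

```python
from datetime import datetime, timedelta, timezone


def older_than(objects, days, now=None):
    """Return delete_objects-style key dicts for objects modified more than `days` ago."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    return [{"Key": o["Key"]} for o in objects if o["LastModified"] < cutoff]


# Fixed UTC+10 offset, for illustration only (not a full Sydney timezone).
plus_ten = timezone(timedelta(hours=10))
now = datetime(2021, 11, 10, tzinfo=timezone.utc)
objects = [
    {"Key": "old.txt", "LastModified": datetime(2021, 11, 1, tzinfo=timezone.utc)},
    {"Key": "new.txt", "LastModified": datetime(2021, 11, 9, tzinfo=timezone.utc)},
    # Same instant as 2021-11-01 00:00 UTC, written with a +10:00 offset.
    {"Key": "old-local.txt", "LastModified": datetime(2021, 11, 1, 10, tzinfo=plus_ten)},
]

to_delete = older_than(objects, days=7, now=now)
print(to_delete)  # [{'Key': 'old.txt'}, {'Key': 'old-local.txt'}]
```

Because both sides of the comparison are timezone-aware, Python compares the underlying instants, so no manual AEST-to-UTC conversion is needed.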
