@justinnaldzin
Created October 17, 2018 20:09
Listing objects and keys in an S3 bucket
import boto3


def get_matching_s3_objects(bucket, prefix='', suffix=''):
    """
    Generate the objects in an S3 bucket.

    :param bucket: Name of the S3 bucket.
    :param prefix: Only fetch objects whose key starts with
        this prefix (optional). A string or a tuple of strings.
    :param suffix: Only fetch objects whose key ends with
        this suffix (optional). A string or a tuple of strings.
    """
    s3 = boto3.client('s3')
    kwargs = {'Bucket': bucket}

    # If the prefix is a single string (not a tuple of strings), we can
    # do the filtering directly in the S3 API. A tuple of prefixes is
    # filtered client-side by the startswith() check below.
    if isinstance(prefix, str):
        kwargs['Prefix'] = prefix

    while True:
        # The S3 API response is a large blob of metadata.
        # 'Contents' contains information about the listed objects.
        resp = s3.list_objects_v2(**kwargs)

        try:
            contents = resp['Contents']
        except KeyError:
            # No 'Contents' key means no objects were listed.
            return

        for obj in contents:
            key = obj['Key']
            if key.startswith(prefix) and key.endswith(suffix):
                yield obj

        # The S3 API is paginated, returning up to 1000 keys at a time.
        # Pass the continuation token into the next request, until we
        # reach the final page (when this field is missing).
        try:
            kwargs['ContinuationToken'] = resp['NextContinuationToken']
        except KeyError:
            break


def get_matching_s3_keys(bucket, prefix='', suffix=''):
    """
    Generate the object keys in an S3 bucket.

    :param bucket: Name of the S3 bucket.
    :param prefix: Only fetch keys that start with this prefix (optional).
    :param suffix: Only fetch keys that end with this suffix (optional).
    """
    for obj in get_matching_s3_objects(bucket, prefix, suffix):
        yield obj['Key']
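
The pagination and filtering logic above can be tried out without AWS credentials. What follows is a minimal sketch, not part of the original gist: `FakeS3Client` is a hypothetical in-memory stand-in that imitates only the `Contents` and `NextContinuationToken` fields of a `list_objects_v2` response, and `iter_matching_keys` is an illustrative variant of the same loop that accepts the client as an argument. The bucket name and keys are made up.

```python
class FakeS3Client:
    """Hypothetical stand-in for boto3.client('s3'); not a boto3 class."""

    def __init__(self, keys, page_size=2):
        self._keys = sorted(keys)
        self._page_size = page_size

    def list_objects_v2(self, Bucket, Prefix='', ContinuationToken=None):
        # Imitate server-side prefix filtering and paging.
        matching = [k for k in self._keys if k.startswith(Prefix)]
        start = int(ContinuationToken) if ContinuationToken else 0
        page = matching[start:start + self._page_size]
        resp = {}
        if page:
            resp['Contents'] = [{'Key': k} for k in page]
        if start + self._page_size < len(matching):
            # Assumption: the token is just the next offset as a string.
            resp['NextContinuationToken'] = str(start + self._page_size)
        return resp


def iter_matching_keys(client, bucket, prefix='', suffix=''):
    """Same loop as the gist, with the client passed in for testing."""
    kwargs = {'Bucket': bucket}
    if isinstance(prefix, str):
        kwargs['Prefix'] = prefix
    while True:
        resp = client.list_objects_v2(**kwargs)
        for obj in resp.get('Contents', []):
            key = obj['Key']
            # str.startswith/endswith accept a string or a tuple of strings.
            if key.startswith(prefix) and key.endswith(suffix):
                yield key
        try:
            kwargs['ContinuationToken'] = resp['NextContinuationToken']
        except KeyError:
            break


client = FakeS3Client(
    ['logs/a.csv', 'logs/b.txt', 'logs/c.csv', 'data/d.csv', 'logs/e.csv']
)
csv_keys = list(
    iter_matching_keys(client, 'example-bucket', prefix='logs/', suffix='.csv')
)
multi_prefix_keys = list(
    iter_matching_keys(
        client, 'example-bucket', prefix=('logs/', 'data/'), suffix='.csv'
    )
)
print(csv_keys)
print(multi_prefix_keys)
```

Passing the client in as a parameter is a design choice made here for testability; the functions in the gist create their own client with `boto3.client('s3')` on each call.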