@ceteri
Created December 13, 2019 18:48
S3 downloads in Python

First, install the AWS CLI and the AWS SDK for Python (boto3):

pip install awscli
pip install boto3

Next, run aws configure on your laptop or server to add your AWS credentials.

For details, see https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-welcome.html
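As a quick sanity check after `aws configure`, the sketch below looks for a profile in the shared credentials file, which the CLI writes to `~/.aws/credentials` by default. The helper name `has_aws_profile` is hypothetical, not part of boto3 or the AWS CLI. A `False` result does not necessarily mean boto3 will fail, since credentials can also come from environment variables or an IAM role.

```python
import configparser
from pathlib import Path


def has_aws_profile(credentials_path=None, profile="default"):
    """Return True if `profile` in the credentials file defines both key fields.

    Hypothetical helper for illustration; assumes the INI layout that
    `aws configure` writes to ~/.aws/credentials.
    """
    if credentials_path:
        path = Path(credentials_path)
    else:
        path = Path.home() / ".aws" / "credentials"

    parser = configparser.ConfigParser()
    parser.read(path)  # a missing file simply yields no sections

    if not parser.has_section(profile):
        return False

    section = parser[profile]
    return "aws_access_key_id" in section and "aws_secret_access_key" in section
```

Calling `has_aws_profile()` with no arguments checks the `default` profile in the default location.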

Then you should be able to run this script to list the files in the bucket and download one of them:

python download_s3.py
#!/usr/bin/env python
# encoding: utf-8

import boto3

# initialize access to the bucket
bucket_name = "richcontext"
bucket = boto3.resource("s3").Bucket(bucket_name)

# list the keys for files within our pseudo-directory
prefix = "corpus_docs/"

for obj in bucket.objects.filter(Prefix=prefix):
    print(obj.key)

# example of how to download one file
key = "corpus_docs/pdfs/f6b1e82dd866c73e2533.pdf"
local_file = "f6b1e82dd866c73e2533.pdf"
bucket.download_file(key, local_file)
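The script above downloads a single file. To fetch everything under a prefix, one possible approach is to combine the same two calls, `bucket.objects.filter(Prefix=...)` and `bucket.download_file(...)`, in a loop. This is a minimal sketch, not part of the original script: the function name `download_prefix` and the `dest_dir` parameter are assumptions, and it performs no retries or error handling.

```python
from pathlib import Path


def download_prefix(bucket, prefix, dest_dir):
    """Download every object under `prefix` into `dest_dir`, mirroring key paths.

    `bucket` is expected to behave like a boto3 Bucket resource, e.g.
    boto3.resource("s3").Bucket("richcontext"). Hypothetical helper.
    """
    dest = Path(dest_dir)
    saved = []

    for obj in bucket.objects.filter(Prefix=prefix):
        # S3 has no real directories; keys ending in "/" are placeholder objects
        if obj.key.endswith("/"):
            continue

        # recreate the key's pseudo-directory structure locally
        local_path = dest / obj.key
        local_path.parent.mkdir(parents=True, exist_ok=True)

        bucket.download_file(obj.key, str(local_path))
        saved.append(local_path)

    return saved
```

With the `bucket` and `prefix` variables from the script, `download_prefix(bucket, prefix, "downloads")` would write the files beneath a local `downloads/corpus_docs/` directory and return their paths.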