@aabadie
Created June 17, 2016 13:13
Dump arbitrary objects to Amazon S3 cloud storage using Joblib
"""Example of usage of Joblib with Amazon S3."""
import s3io
import joblib
import numpy as np

big_obj = [np.ones((500, 500)), np.random.random((1000, 1000))]

# Customize the following values with yours
bucket = "my-bucket"
key = "my_pickle.pkl"
compress = ('gzip', 3)
credentials = dict(
    aws_access_key_id="<Public Key>",
    aws_secret_access_key="Private Key",
)

# Dump in an S3 file is easy with Joblib
with s3io.open('s3://{0}/{1}'.format(bucket, key), mode='w',
               **credentials) as s3_file:
    joblib.dump(big_obj, s3_file, compress=compress)

with s3io.open('s3://{0}/{1}'.format(bucket, key), mode='r',
               **credentials) as s3_file:
    obj_reloaded = joblib.load(s3_file)

print("Correctly reloaded? {0}".format(all(np.allclose(x, y)
                                           for x, y in zip(big_obj,
                                                           obj_reloaded))))
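
The effect of the compress argument can be tried locally without an S3 account. The sketch below is not part of the gist: it swaps the s3io file object for an in-memory io.BytesIO buffer, and the roundtrip helper and the smaller small_obj are hypothetical names introduced for illustration. It assumes joblib treats any binary file object the same way, and compares no compression (compress=0), an integer level (which uses zlib), and the explicit ('gzip', 3) tuple from the script.

```python
import io

import joblib
import numpy as np


def roundtrip(obj, compress):
    """Dump obj to an in-memory buffer, reload it, and return (copy, size)."""
    buffer = io.BytesIO()
    joblib.dump(obj, buffer, compress=compress)
    size = buffer.tell()  # number of bytes written to the buffer
    buffer.seek(0)        # rewind before reading the pickle back
    return joblib.load(buffer), size


# Smaller arrays than in the gist, to keep the experiment quick.
small_obj = [np.ones((50, 50)), np.random.random((100, 100))]

sizes = {}
for label, compress in [("none", 0), ("zlib-1", 1), ("gzip-3", ("gzip", 3))]:
    reloaded, sizes[label] = roundtrip(small_obj, compress)
    # Every setting should give back arrays equal to the originals.
    assert all(np.allclose(x, y) for x, y in zip(small_obj, reloaded))

print(sizes)
```

joblib.load detects the compression format from the stream itself, so the reading side does not need to know which compress value was used when dumping.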
@aabadie

aabadie commented Jun 17, 2016

Use Joblib to dump arbitrary objects on Amazon S3

Some external packages are required:

  1. awscli: provides command-line tools to configure and interact with
    your Amazon S3 account
    pip install awscli
  2. boto: provides the API to
    connect to and interact with Amazon S3 in Python
    pip install boto
  3. s3io: a Python package
    providing a file object API on top of boto
    pip install s3io

Configure your access to Amazon S3

Configure your S3 account by running aws configure and entering your
AWS Access Key ID and AWS Secret Access Key when prompted.

You are done.
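
As an alternative to hard-coding the keys in the script, the values stored by aws configure could be read back and passed to s3io.open as **credentials. This is a minimal sketch, not part of the gist: load_aws_credentials is a hypothetical helper, and it assumes the keys live in the INI-style file ~/.aws/credentials under a [default] profile, which is where the AWS CLI writes them by default.

```python
import configparser
import os


def load_aws_credentials(path=None, profile="default"):
    """Return the access keys of one profile as a dict (hypothetical helper)."""
    if path is None:
        # Assumed default location written by `aws configure`.
        path = os.path.join(os.path.expanduser("~"), ".aws", "credentials")
    parser = configparser.ConfigParser()
    if not parser.read(path):  # read() returns the list of files it parsed
        raise FileNotFoundError("No credentials file at {0}".format(path))
    section = parser[profile]  # raises KeyError if the profile is missing
    return dict(
        aws_access_key_id=section["aws_access_key_id"],
        aws_secret_access_key=section["aws_secret_access_key"],
    )
```

With this in place, credentials = load_aws_credentials() would replace the credentials = dict(...) assignment in the script.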

@premsaha24

What if I don't need to compress?

@marksweissma

Nice gist! not sure if the intent here is to be stateless, but if it is, doesn't s3io write to disk locally, transfer, and then clean up?

@wcheek

wcheek commented Sep 14, 2021

Thank you for this! Using compress=1 gives similar results without needing to explicitly specify gzip.
