How to use boto3 with Google Cloud Storage and Python to emulate S3 access.

from boto3.session import Session
from botocore.client import Config
from botocore.handlers import set_list_objects_encoding_type_url
import boto3

# Interoperability (HMAC) credentials for Google Cloud Storage
ACCESS_KEY = "xx"
SECRET_KEY = "yy"

# Dump requests and responses to help debug signing issues
boto3.set_stream_logger('')

session = Session(aws_access_key_id=ACCESS_KEY,
                  aws_secret_access_key=SECRET_KEY,
                  region_name="US-CENTRAL1")

# Google Storage rejects the "encoding-type=url" query string that boto
# appends to ListObjects calls, so unregister the handler that adds it
session.events.unregister('before-parameter-build.s3.ListObjects',
                          set_list_objects_encoding_type_url)

s3 = session.resource('s3', endpoint_url='https://storage.googleapis.com',
                      config=Config(signature_version='s3v4'))

bucket = s3.Bucket('yourbucket')
for f in bucket.objects.all():
    print(f.key)

@gleicon commented Dec 18, 2017

Google Cloud Storage implements the latest S3 protocol. Most of the errors I hit while trying to make it work looked like the following:

botocore.exceptions.ClientError: An error occurred (InvalidArgument) when calling the ListObjects operation: Invalid argument.

Which meant (in this order): I had forgotten to set the region, mistyped Google's region name, had not set the proper signature version, and boto was appending "encoding-type=url" to the query string, which Google Storage won't accept. That last one was tricky to unregister.

Setting the boto logger helped track things down, since it let me read the header dump (sometimes it would complain about an amz-sha256 header). Thanks to all the GitHub issues and pieces scattered around, I managed to make it work. I put it all together in this gist so other people can get past this and get work done.
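
In case someone else needs to debug similar signing problems, here is a minimal sketch of how to narrow the logging to botocore so the request/response header dump is readable (the logger name and level are just standard Python logging knobs, nothing GCS-specific):

    import logging
    import boto3

    # Dump every request and response that botocore builds, including the
    # signed headers, so you can see exactly what Google Storage rejects
    boto3.set_stream_logger('botocore', logging.DEBUG)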

@ruurtjan commented Dec 30, 2017

Thanks for sharing! This also works from AWS Lambda functions, where boto3 is pre-installed :)

It took me a while to figure out which access key and secret to use, but here's what you need to do to get them: https://cloud.google.com/storage/docs/migrating#keys
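
For anyone trying this from Lambda, here is a rough sketch of a handler built on the gist above (the bucket name and the GCS_ACCESS_KEY / GCS_SECRET_KEY environment variables are placeholders; the values come from the interoperability keys described in the link):

    import os
    from boto3.session import Session
    from botocore.client import Config
    from botocore.handlers import set_list_objects_encoding_type_url

    def lambda_handler(event, context):
        # HMAC credentials from the GCS interoperability settings,
        # passed in through Lambda environment variables
        session = Session(aws_access_key_id=os.environ['GCS_ACCESS_KEY'],
                          aws_secret_access_key=os.environ['GCS_SECRET_KEY'],
                          region_name='US-CENTRAL1')
        # Same workaround as the gist: GCS rejects the encoding-type=url param
        session.events.unregister('before-parameter-build.s3.ListObjects',
                                  set_list_objects_encoding_type_url)
        s3 = session.resource('s3', endpoint_url='https://storage.googleapis.com',
                              config=Config(signature_version='s3v4'))
        # List the keys in a bucket, same as the gist above
        return [obj.key for obj in s3.Bucket('yourbucket').objects.all()]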

@FISMAL commented Jan 17, 2018

@gleicon have you got a working version of this? I'm getting Invalid argument all the time.

@FISMAL commented Jan 17, 2018

Never mind, it works now.

@gleicon commented Feb 19, 2018

Sorry folks, GitHub doesn't make it easy to track gist comments. This is working for me to move an AWS project to GCP. I should have mentioned that you need to create the storage instance in "interoperability mode" to get the AWS-like credentials. I had to scour the docs and the boto code to figure that out, so I'm glad it helped.

@pgillet commented Feb 22, 2018

Thank you, that helped a lot.
As for generating pre-signed URLs, just replace the 'AWSAccessKeyId' query param in the generated URL with 'GoogleAccessId' to make it work.
Neither boto3 nor botocore mentions the literal 'GoogleAccessId' anywhere in their code, so you have to replace it by hand, as follows:

url = s3.meta.client.generate_presigned_url(
    ClientMethod='get_object',
    Params={
        'Bucket': 'yourbucket',
        'Key': 'object.txt'
    }
)

url = url.replace('AWSAccessKeyId', 'GoogleAccessId')
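
To sanity-check that the rewritten URL is accepted, you can fetch it directly (urllib is used here only for illustration, and the bucket and object key are the placeholders from the snippet above):

    from urllib.request import urlopen

    # Google Storage should now honor the signature with GoogleAccessId
    with urlopen(url) as resp:
        print(resp.status, resp.read()[:100])
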
@mcint commented Jan 15, 2019

Thanks for a minimum viable solution to the GCP interop issue. (Linking because I didn't understand your solution until I read the issue thread.)
