@koepnick
Last active January 15, 2021 22:26
Connecting to a remote S3 endpoint
"""
This is useful for applications such as Google's Colab which
have simple methods of connecting to each user's invdivdiual
cloud storage, but the speed and latency are terrible.
With this, you can use a much snappier service such as Wasabi
to host large files for machine learning consumption
Regular users: `pip install --user s3fs`
Virtualenv users: `pip install s3fs`
Conda users: `conda install -c conda-forge s3fs`
Jupyter/Colab users: `%pip install s3fs`
Please, for the sake of your future self; never ever use
sudo to install a package. It will always end in tears.
Note: Tested on Linux only
"""
import s3fs

# Replace key/secret with your own credentials; endpoint_url points
# at the S3-compatible service (here: Wasabi, us-west-1 region).
fs = s3fs.S3FileSystem(
    anon=False,
    key='SOMENAME',
    secret='S0M3V3rYS3kr3tK3y',
    client_kwargs={
        'endpoint_url': 'https://s3.us-west-1.wasabisys.com'
    }
)

print(fs.listdir('/'))

# Fetch a remote object to local disk.
fs.get_file('/remote/path/data.tgz', '/var/tmp/data.tgz')
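Rather than hardcoding credentials in a notebook (which tends to end up in version control), you can pull them from environment variables. A minimal sketch, assuming hypothetical variable names `WASABI_KEY`, `WASABI_SECRET`, and `WASABI_ENDPOINT` (pick whatever names suit your setup):

```python
import os

def s3_credentials_from_env():
    """Build S3FileSystem keyword arguments from the environment.

    Assumes the (hypothetical) variables WASABI_KEY and WASABI_SECRET
    are set; WASABI_ENDPOINT falls back to the Wasabi us-west-1 URL.
    """
    return {
        'anon': False,
        'key': os.environ['WASABI_KEY'],
        'secret': os.environ['WASABI_SECRET'],
        'client_kwargs': {
            'endpoint_url': os.environ.get(
                'WASABI_ENDPOINT', 'https://s3.us-west-1.wasabisys.com'
            )
        },
    }

# Usage: fs = s3fs.S3FileSystem(**s3_credentials_from_env())
```

This keeps the notebook shareable, and a missing variable fails loudly with a `KeyError` instead of silently connecting anonymously.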
"""
If you're on Jupyter and don't feel like having to comment the
pip commands out every time you want to reconnect to a kernel,
replace the normal import command with this stanza:
try:
import s3fs
except ModuleNotFoundError:
%pip install s3fs
try:
import s3fs
except:
print("Well, that didn't work")
except e:
print("You've got bigger proglems than pip\n", e)
"""