@dpwrussell
Created August 12, 2015 17:19
Extremely rudimentary s3cache code
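import errno
import hashlib
import os
import sys

import boto3
from botocore.exceptions import ClientError


def mkdir_p(path):
    """Create path and any missing parents; do nothing if it already exists."""
    try:
        os.makedirs(path)
    except OSError as e:
        if e.errno == errno.EEXIST and os.path.isdir(path):
            pass
        else:
            raise


def hash_file(path):
    """Return the hex SHA1 digest of the file at path."""
    sha1 = hashlib.sha1()
    with open(path, 'rb') as f:
        # Read in chunks so large files are not loaded into memory at once
        for chunk in iter(lambda: f.read(1024 * 1024), b''):
            sha1.update(chunk)
    return sha1.hexdigest()


cache_dir = '/tmp/s3cache/'
s3 = boto3.resource('s3')
s3_client = boto3.client('s3')
bucket = s3.Bucket('dpwr')


def cache_file(path, fullpath):
    """Download the S3 object with key path to the local file fullpath."""
    mkdir_p(os.path.dirname(fullpath))
    s3_client.download_file('dpwr', path, fullpath)


def get_file(path):
    """Return the local cache path for S3 key path, downloading if needed."""
    fullpath = os.path.join(cache_dir, path)
    obj = bucket.Object(path)
    try:
        # HEAD request: confirms the object exists and loads its metadata
        # without fetching the body (the original called obj.get() here)
        obj.load()
    except ClientError as e:
        print('File %s not in S3 (%s)' % (path, e))
        sys.exit(1)
    if not os.path.isfile(fullpath):
        cache_file(path, fullpath)
    else:
        # Check that the local cache matches S3
        sha1_s3 = obj.metadata['sha1']
        sha1_cache = hash_file(fullpath)
        if sha1_s3 != sha1_cache:
            cache_file(path, fullpath)
    return fullpath


print(get_file('s3test/hs.tif'))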
@dpwrussell
Author

This relies on each S3 object having a user metadata entry named 'sha1' that contains the SHA1 hash of the file's contents.
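
For illustration, here is a minimal sketch of how an object could be uploaded with that metadata in place. The helper names `sha1_of_file` and `upload_with_sha1` are hypothetical and not part of the gist; the sketch assumes `client` is a boto3 S3 client and relies on the `Metadata` key of `upload_file`'s `ExtraArgs`.

```python
import hashlib


def sha1_of_file(filename, chunk_size=1024 * 1024):
    """Return the hex SHA1 digest of a local file, read in chunks."""
    sha1 = hashlib.sha1()
    with open(filename, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            sha1.update(chunk)
    return sha1.hexdigest()


def upload_with_sha1(client, bucket_name, key, filename):
    """Upload filename to S3 and store its SHA1 as user metadata 'sha1'.

    `client` is assumed to be a boto3 S3 client, e.g. boto3.client('s3').
    """
    digest = sha1_of_file(filename)
    client.upload_file(
        filename, bucket_name, key,
        ExtraArgs={'Metadata': {'sha1': digest}},
    )
    return digest
```

Something like `upload_with_sha1(boto3.client('s3'), 'dpwr', 's3test/hs.tif', 'hs.tif')` should then produce an object that `get_file` can validate against the local cache.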

@dpwrussell
Author

@johnbachman This is the code that I was referring to yesterday. It is only test code and not fit for any kind of production use, but it shows how you could create a basic caching mechanism. I used the SHA1 to ensure that updates to S3 are reflected in the cache. It is extremely thread-unsafe: nothing stops concurrent callers from downloading to the same path at once, or from reading a partially written file.
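
One common way to reduce the partial-file problem (not something the gist does) is to download into a temporary file in the same directory and rename it into place once the download has finished. The sketch below assumes Python 3 and a POSIX-style filesystem where `os.replace` is atomic; `atomic_download` and its `download` callable are hypothetical names introduced for illustration.

```python
import os
import tempfile


def atomic_download(download, fullpath):
    """Fetch into a temp file, then rename it over fullpath.

    `download` is a hypothetical callable that takes a destination path
    and writes the object there. Readers see either the old complete
    file or the new complete file, never a partial one.
    """
    directory = os.path.dirname(fullpath) or '.'
    os.makedirs(directory, exist_ok=True)
    # The temp file lives in the target directory so the rename stays
    # on one filesystem
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.part')
    os.close(fd)
    try:
        download(tmp_path)
        os.replace(tmp_path, fullpath)
    except BaseException:
        # Do not leave a stray temp file behind on failure
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return fullpath
```

In the gist's terms, `cache_file` could call `atomic_download(lambda tmp: s3_client.download_file('dpwr', path, tmp), fullpath)`. This only addresses partially written files; two callers can still download the same object at the same time, which would need a lock to avoid.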

@johnbachman

Thanks for this!