@oddskool
Created September 10, 2013 13:33
Sums the size of subdirectories in an S3 bucket, broken down by storage class
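from __future__ import print_function  # lets the script run on Python 2 and 3

import sys
from collections import defaultdict

import boto

# The bucket name is taken from the first command-line argument.
s3 = boto.connect_s3()
bucket = s3.lookup(sys.argv[1])  # returns None if the bucket cannot be found

# Running byte totals keyed by "<storage class initial>::<two-level prefix>"
total_bytes = defaultdict(int)


def process(key):
    # Group by the first two path components, e.g. "logs/2013"
    prefix = "/".join(key.name.split('/')[:2])
    total_bytes[key.storage_class[0] + '::' + prefix] += key.size


for n_objects, key in enumerate(bucket):
    process(key)
    # Every 300 objects, print the groups larger than 1 GiB, largest first
    if not n_objects % 300:
        data = sorted([(k, v) for k, v in total_bytes.items() if v > 1024**3],
                      key=lambda x: x[1],
                      reverse=True)
        print("+" * 80)
        for k, v in data:
            print("%s : %.2f GB" % (k, v / (1024.0**3)))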
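The grouping logic can be tried without an S3 connection. The following is a minimal sketch, not part of the original gist: `group_key`, `summarize`, and the `sample` object list are hypothetical names and made-up data introduced here for illustration, and plain `(name, storage_class, size)` tuples stand in for boto key objects.

```python
from collections import defaultdict

GIB = 1024 ** 3


def group_key(name, storage_class, depth=2):
    # Same key shape as the script: storage class initial + "::" + first
    # `depth` path components of the object name.
    prefix = "/".join(name.split("/")[:depth])
    return storage_class[0] + "::" + prefix


def summarize(objects, min_bytes=GIB):
    # Accumulate sizes per group, keep groups above the threshold,
    # and sort them from largest to smallest.
    totals = defaultdict(int)
    for name, storage_class, size in objects:
        totals[group_key(name, storage_class)] += size
    rows = [(k, v) for k, v in totals.items() if v > min_bytes]
    return sorted(rows, key=lambda kv: kv[1], reverse=True)


# Made-up objects standing in for the keys of a real bucket.
sample = [
    ("logs/2013/09/a.gz", "STANDARD", 2 * GIB),
    ("logs/2013/09/b.gz", "STANDARD", 1 * GIB),
    ("logs/2012/01/c.gz", "GLACIER", 5 * GIB),
    ("tmp/x", "STANDARD", 10),
]

for key, size in summarize(sample):
    print("%s : %.2f GB" % (key, size / float(GIB)))
```

With this sample data, the two `logs/2013` objects are merged into one `S::logs/2013` group, and `tmp/x` is dropped because it is below the 1 GiB threshold. Because only the first letter of the storage class is used, classes that share an initial (for example `STANDARD` and `STANDARD_IA`) would be merged into the same group.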