@hrwgc
Last active June 19, 2023 15:32
Get the total size of all objects within an S3 prefix using aws-cli (mimics the behavior of `s3cmd du`).
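#!/bin/bash
# Usage: s3du s3://bucket/prefix
function s3du(){
  # Field 3 of s3://bucket/prefix, split on "/", is the bucket name.
  bucket=$(cut -d/ -f3 <<< "$1")
  # Fields 4..NF rejoined with "/" form the key prefix (empty for a bare bucket).
  prefix=$(awk -F/ '{for (i=4; i<NF; i++) printf "%s/", $i; if (NF >= 4) print $NF}' <<< "$1")
  # Sum the object sizes in bytes and count the objects under the prefix.
  aws s3api list-objects --bucket "$bucket" --prefix="$prefix" --output json --query '[sum(Contents[].Size), length(Contents[])]' | jq '{size: .[0], num_objects: .[1]}'
}
s3du "$1"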
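
The bucket/prefix split is the fiddly part of the script. As a minimal sketch, the same split can be done with shell parameter expansion alone. `parse_s3_uri`, `S3_BUCKET`, and `S3_PREFIX` are hypothetical names used for illustration and are not part of the gist.

```shell
# Hypothetical helper (not part of the gist): split an S3 URI without cut/awk.
parse_s3_uri() {
  local path=${1#s3://}        # drop the scheme: bucket/some/prefix
  S3_BUCKET=${path%%/*}        # everything before the first "/"
  case $path in
    */*) S3_PREFIX=${path#*/} ;;   # everything after the first "/"
    *)   S3_PREFIX="" ;;           # bare bucket, no prefix
  esac
}

parse_s3_uri "s3://example-bucket/logs/2015/"
echo "bucket=$S3_BUCKET prefix=$S3_PREFIX"
# prints: bucket=example-bucket prefix=logs/2015/
```

A bare `s3://example-bucket` yields an empty prefix, so the listing covers the whole bucket.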
@Sam-Martin

FYI as of the 28th of July 2015 you can get this information via CloudWatch.

 aws cloudwatch get-metric-statistics --namespace AWS/S3 --start-time 2015-07-15T10:00:00 --end-time 2015-07-31T01:00:00 --period 86400 --statistics Average --region eu-west-1 --metric-name BucketSizeBytes --dimensions Name=BucketName,Value=toukakoukan.com Name=StorageType,Value=StandardStorage

Important: You must specify both StorageType and BucketName in the dimensions argument; otherwise you will get no results.
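
A minimal sketch of a wrapper around the command above, so the dates do not have to be edited by hand. It assumes GNU `date` (for `-d`) and a three-day window, which is a guess that should be wide enough because S3 reports this metric once per day. `bucket_size_bytes` is a hypothetical name, and the `--query` expression simply picks the most recent datapoint.

```shell
# Hypothetical wrapper (not from the thread): latest BucketSizeBytes for a bucket.
# Usage: bucket_size_bytes <bucket> <region> [storage-type]
bucket_size_bytes() {
  local bucket=$1 region=$2 storage_type=${3:-StandardStorage}
  local start end
  # Assumption: GNU date; a 3-day window to catch the daily datapoint.
  start=$(date -u -d '3 days ago' +%Y-%m-%dT%H:%M:%S)
  end=$(date -u +%Y-%m-%dT%H:%M:%S)
  # Both dimensions are passed, as the note above requires.
  aws cloudwatch get-metric-statistics \
    --namespace AWS/S3 \
    --metric-name BucketSizeBytes \
    --start-time "$start" --end-time "$end" \
    --period 86400 --statistics Average \
    --region "$region" \
    --dimensions "Name=BucketName,Value=$bucket" "Name=StorageType,Value=$storage_type" \
    --query 'sort_by(Datapoints, &Timestamp)[-1].Average' \
    --output text
}
```

For example, `bucket_size_bytes example-bucket eu-west-1` prints the most recent average size in bytes, or `None` if CloudWatch returns no datapoints for the window.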

@bitless

bitless commented Dec 22, 2015

The CloudWatch command does not appear to support prefixes below the bucket level.

@joech4n

joech4n commented Jan 29, 2016

If you want to sum by top-level prefixes within a bucket, you can also try something like this
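
As an illustration only (this is not joech4n's snippet), a minimal sketch of one way to total sizes per top-level prefix: list every key with its size as tab-separated text, then group on the first path segment with `awk`. `sum_by_top_prefix` and `s3du_by_prefix` are hypothetical names, and keys containing tab characters are assumed not to occur.

```shell
# Hypothetical helpers (not from the thread).
# stdin: "<key>\t<size>" lines; stdout: "<top-level prefix>\t<total bytes>".
sum_by_top_prefix() {
  awk -F'\t' '
    NF == 2 {
      n = index($1, "/")
      # Keys without a "/" sit at the bucket root.
      p = n ? substr($1, 1, n) : "(root)"
      total[p] += $2
    }
    END { for (p in total) printf "%s\t%.0f\n", p, total[p] }
  ' | sort
}

# Usage: s3du_by_prefix <bucket>
s3du_by_prefix() {
  aws s3api list-objects-v2 --bucket "$1" --output text \
    --query 'Contents[].[Key, Size]' | sum_by_top_prefix
}
```

Lines that do not have exactly two fields, such as the `None` printed for an empty result, are ignored.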

@sasikanumuri

I tried running the script as is, but it did not work. Can you help me with this? Thanks.

@clintval

clintval commented Dec 21, 2017

My not-so-terse adaptation for printing human-readable sizes:

function aws3du(){
  # Split s3://bucket/prefix into the bucket name and the key prefix.
  bucket=$(cut -d/ -f3 <<< "$1")
  prefix=$(awk -F/ '{for (i=4; i<NF; i++) printf "%s/", $i; if (NF >= 4) print $NF}' <<< "$1")
  aws s3api \
    list-objects \
    --bucket "$bucket" \
    --prefix="$prefix" \
    --output text \
    --query '[sum(Contents[].Size), length(Contents[])]' \
    | while read -r size num_objects; do
      # Convert the byte count to SI units (e.g. 328K) before building the JSON.
      jq '{size: .[0], num_objects: .[1]}' <<< "[\"$(numfmt --to=si "${size}")\",${num_objects}]"
    done
}

Usage:

❯ aws3du ${s3path}
{
  "size": "328K",
  "num_objects": 1
}

This would be simpler if something like jqlang/jq#147 were implemented.
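
One caveat with `--output text`: the AWS CLI applies `--query` to each page of results separately, so a prefix holding more than one page of objects (1,000 keys per page by default) produces one `size count` line per page rather than a single total. A minimal sketch of adding those lines up before formatting follows. `sum_pages` and `aws3du_total` are hypothetical names, and the bucket and prefix are passed as separate arguments to keep the example short.

```shell
# Hypothetical helpers (not from the thread).
# stdin: one "<bytes> <count>" line per result page; stdout: a single total line.
sum_pages() {
  awk '$1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ { size += $1; count += $2 }
       END { printf "%.0f %d\n", size, count }'
}

# Usage: aws3du_total <bucket> <prefix>
aws3du_total() {
  aws s3api list-objects --bucket "$1" --prefix="$2" --output text \
    --query '[sum(Contents[].Size), length(Contents[])]' \
    | sum_pages \
    | while read -r size count; do
        printf '%s in %s objects\n' "$(numfmt --to=si "$size")" "$count"
      done
}

# Two simulated pages of output:
printf '1000 2\n500 1\n' | sum_pages
# prints: 1500 3
```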
