Delete all versions of all files in s3 versioned bucket using AWS CLI and jq.
#!/bin/bash
bucket=$1
set -e
echo "Removing all versions from $bucket"
versions=`aws s3api list-object-versions --bucket $bucket |jq '.Versions'`
markers=`aws s3api list-object-versions --bucket $bucket |jq '.DeleteMarkers'`
let count=`echo $versions |jq 'length'`-1
if [ $count -gt -1 ]; then
    echo "removing files"
    for i in $(seq 0 $count); do
        key=`echo $versions | jq .[$i].Key |sed -e 's/\"//g'`
        versionId=`echo $versions | jq .[$i].VersionId |sed -e 's/\"//g'`
        cmd="aws s3api delete-object --bucket $bucket --key $key --version-id $versionId"
        echo $cmd
        $cmd
    done
fi
let count=`echo $markers |jq 'length'`-1
if [ $count -gt -1 ]; then
    echo "removing delete markers"
    for i in $(seq 0 $count); do
        key=`echo $markers | jq .[$i].Key |sed -e 's/\"//g'`
        versionId=`echo $markers | jq .[$i].VersionId |sed -e 's/\"//g'`
        cmd="aws s3api delete-object --bucket $bucket --key $key --version-id $versionId"
        echo $cmd
        $cmd
    done
fi
@geekifier

commented Jul 12, 2017

Thanks for sharing! It really helped me out with a specific migration use case. It worked great!

@upendrasoft

commented Jul 26, 2017

Thanks. I have updated the script to improve performance (for buckets with a large number of versions and objects). I hope this is useful for others too.

#!/bin/bash

bucket=$1

set -e

echo "Removing all versions from $bucket"

versions=`aws s3api list-object-versions --bucket $bucket |jq '.Versions'`
markers=`aws s3api list-object-versions --bucket $bucket |jq '.DeleteMarkers'`

echo "removing files"
for version in $(echo "${versions}" | jq -r '.[] | @base64'); do 
    version=$(echo ${version} | base64 --decode)

    key=`echo $version | jq -r .Key`
    versionId=`echo $version | jq -r .VersionId `
    cmd="aws s3api delete-object --bucket $bucket --key $key --version-id $versionId"
    echo $cmd
    $cmd
done

echo "removing delete markers"
for marker in $(echo "${markers}" | jq -r '.[] | @base64'); do 
    marker=$(echo ${marker} | base64 --decode)

    key=`echo $marker | jq -r .Key`
    versionId=`echo $marker | jq -r .VersionId `
    cmd="aws s3api delete-object --bucket $bucket --key $key --version-id $versionId"
    echo $cmd
    $cmd
done

@cbrinker

commented Sep 5, 2017

The script was generally well behaved; however, keys with spaces caused some trouble. Thanks for sharing!

@jeroenbaas

commented Oct 2, 2017

This can be done much more efficiently by making use of the --query parameter:
aws s3api list-object-versions --bucket $bucket --prefix $somePrefixToFilterByIfYouNeedTo --query "[Versions,DeleteMarkers][].{Key: Key, VersionId: VersionId}"
after which you can just loop over the results in one go.
I was looking for something like this to clean up a slow bucket with millions of deleted keys, in the hope of speeding the bucket up, but the code above would sit for days fetching JSON with a huge amount of overhead (twice!).

Example of the entire code (including a prefix in case you only want to clear a subset of a bucket), using --output text to loop over the results in text mode (even less overhead):

#!/bin/bash

bucket=$1
prefix=$2
set -e

echo "Removing all versions from $bucket, prefix $prefix"

# --output text separates the columns with tabs, so split on tabs only;
# this keeps spaces inside keys intact without a placeholder.
while IFS=$'\t' read -r key versionId
do
    echo "key: ${key} versionId: ${versionId}"
    # Pass the key as one quoted argument instead of building a string for eval,
    # so spaces and shell metacharacters in the key are left alone.
    aws s3api delete-object --bucket "$bucket" --key "$key" --version-id "$versionId"
done < <(aws s3api list-object-versions --bucket "$bucket" --prefix "$prefix" --query "[Versions,DeleteMarkers][].{Key: Key, VersionId: VersionId}" --output text)
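
Each delete-object call above is still one request per version. A further option is to batch the keys: the S3 DeleteObjects operation (`aws s3api delete-objects`) accepts up to 1000 keys per request. Below is a minimal Python sketch of the batching step only, assuming the JSON printed by `aws s3api list-object-versions` is available as a dict. The helper names `flatten_versions` and `delete_payloads` and the sample data are made up for illustration.

```python
import json


def flatten_versions(listing):
    """Merge Versions and DeleteMarkers into one list of Key/VersionId pairs."""
    entries = []
    for section in ("Versions", "DeleteMarkers"):
        # Either section can be missing (or null) when it has no entries.
        for item in listing.get(section) or []:
            entries.append({"Key": item["Key"], "VersionId": item["VersionId"]})
    return entries


def delete_payloads(entries, batch_size=1000):
    """Yield request bodies in the shape accepted by `delete-objects --delete`."""
    for start in range(0, len(entries), batch_size):
        yield {"Objects": entries[start:start + batch_size], "Quiet": True}


# Hypothetical sample standing in for real list-object-versions output.
sample = json.loads("""
{
  "Versions": [
    {"Key": "logs/a b.txt", "VersionId": "v1", "IsLatest": true},
    {"Key": "logs/c.txt", "VersionId": "v2", "IsLatest": true}
  ],
  "DeleteMarkers": [
    {"Key": "logs/old.txt", "VersionId": "v3", "IsLatest": false}
  ]
}
""")

# A tiny batch size is used here only to show the splitting.
payloads = list(delete_payloads(flatten_versions(sample), batch_size=2))
for payload in payloads:
    print(json.dumps(payload))
```

Each printed payload could be saved to a file and passed along the lines of `aws s3api delete-objects --bucket "$bucket" --delete file://batch.json`.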
@jnawk

commented Oct 26, 2017

The AWS CLI requires Python anyway, and there's a much better way to do this using Python:

import boto3
session = boto3.session()
s3 = session.resource(service_name='s3')
bucket = s3.Bucket('your_bucket_name')
bucket.object_versions.delete()
# bucket.delete()
@nicdoye

commented Oct 30, 2017

@jnawk Nice! Minor typo: it should be boto3.Session() not boto3.session().

@mattbryson

commented Nov 9, 2017

@jnawk That's awesome! Saved me so much time. Two quick questions...

  1. object_versions.delete() doesn't appear to remove zero byte keys, any idea how to get round that?
  2. is there a way to enable verbose output on boto3? .delete() can take a long time on large buckets, would be good to see some sort of progress...

UPDATE: it's not zero-byte keys, it's Mac OS "Icon?" files. When they are uploaded to S3, a newline gets appended to the file name, which breaks all the S3 tooling, even the console. I have raised this with AWS.
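
On the progress question, one possible approach is to walk the versions yourself and delete them in batches with `delete_objects`, printing a running count after each batch. This is only a sketch: it assumes `bucket` behaves like a boto3 `s3.Bucket` resource, and `delete_with_progress` and its parameters are illustrative names, not part of boto3.

```python
from itertools import islice


def delete_with_progress(bucket, prefix="", batch_size=1000, report=print):
    """Delete every object version and delete marker, reporting after each batch.

    `bucket` is expected to look like a boto3 s3.Bucket resource.
    """
    collection = bucket.object_versions
    versions = iter(collection.filter(Prefix=prefix) if prefix else collection.all())
    deleted = 0
    while True:
        # DeleteObjects accepts at most 1000 keys per request.
        batch = [
            {"Key": v.object_key, "VersionId": v.id}
            for v in islice(versions, batch_size)
        ]
        if not batch:
            break
        # Per-key failures come back in the response's "Errors" list,
        # which this sketch does not inspect.
        bucket.delete_objects(Delete={"Objects": batch, "Quiet": True})
        deleted += len(batch)
        report(f"deleted {deleted} versions so far")
    return deleted


def main(bucket_name):
    # boto3 is imported here so the helper above can be used without it installed.
    import boto3

    bucket = boto3.Session().resource("s3").Bucket(bucket_name)
    delete_with_progress(bucket)
```

Calling `main("your_bucket_name")` would run it against a real bucket. For low-level request logging, `boto3.set_stream_logger('')` turns on debug output, although it is very noisy.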

@joelthompson

commented Dec 12, 2017

There's a small bug when you have only a single object and/or delete marker. Basically, with this:

let count=`echo $versions |jq 'length'`-1

let returns a non-zero exit status when its expression evaluates to 0, so if count is 0, bash counts that as an error, and because you do a set -e above, this causes the script to fail out.

@JohnVonNeumann

commented Apr 16, 2018

You are a champion amongst men. Cheers.

@arichiardi

commented May 9, 2018

Thanks for this script, I tweaked the original one in order to urldecode UTF-8 keys coming from the bucket:

#!/bin/bash

bucket=$1

set -e

echo "Removing all versions from $bucket"

function urldecode {
    # Python 3: unquote_plus lives in urllib.parse; quote "$1" so keys with spaces survive.
    python3 -c "import sys, urllib.parse as ul; print(ul.unquote_plus(sys.argv[1]))" "$1"
}

versions=`aws s3api list-object-versions --encoding-type url --bucket $bucket | jq '.Versions'`
markers=`aws s3api list-object-versions --encoding-type url --bucket $bucket | jq '.DeleteMarkers'`

echo "removing files"
for version in $(echo "${versions}" | jq -r '.[] | @base64'); do
    version=$(echo ${version} | base64 --decode)

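    key=`echo "$version" | jq -r .Key`
    versionId=`echo "$version" | jq -r .VersionId`
    decodedKey=$(urldecode "$key")
    # Quote the arguments so decoded keys containing spaces are passed intact.
    echo "aws s3api delete-object --bucket $bucket --key $decodedKey --version-id $versionId"
    aws s3api delete-object --bucket "$bucket" --key "$decodedKey" --version-id "$versionId"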
done

echo "removing delete markers"
for marker in $(echo "${markers}" | jq -r '.[] | @base64'); do
    marker=$(echo ${marker} | base64 --decode)

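    key=`echo "$marker" | jq -r .Key`
    versionId=`echo "$marker" | jq -r .VersionId`
    decodedKey=$(urldecode "$key")
    # Quote the arguments so decoded keys containing spaces are passed intact.
    echo "aws s3api delete-object --bucket $bucket --key $decodedKey --version-id $versionId"
    aws s3api delete-object --bucket "$bucket" --key "$decodedKey" --version-id "$versionId"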
done
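
For reference, the decoding step on its own might look like this in Python 3, where `unquote_plus` lives in `urllib.parse`. The sample key is invented and `decode_key` is just an illustrative name.

```python
from urllib.parse import unquote_plus


def decode_key(encoded_key):
    """Undo the URL encoding of a key listed with --encoding-type url."""
    # unquote_plus turns "+" into a space and "%XX" sequences into UTF-8 text.
    return unquote_plus(encoded_key)


print(decode_key("reports/2018/r%C3%A9sum%C3%A9+final.txt"))
# prints: reports/2018/résumé final.txt
```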
@tokozedg

commented May 15, 2018

bucket.object_versions.filter(
        Prefix='folder'
).delete()

This worked for me very well.

@ip1981

commented Sep 24, 2018

Guys, there is set -x for this:

cmd="aws s3api delete-object --bucket $bucket --key $key --version-id $versionId"
echo $cmd
@ip1981

commented Sep 24, 2018

And you can use Jq to build up command lines:

... | jq -r '.Versions[] | "aws s3api delete-object --bucket capitalmatch-backups --key \"\(.Key)\" --version-id \"\(.VersionId)\""'
@felipekiko

commented Dec 2, 2018

Works for me, even with some encoded filenames among the versioned files. Thanks!!

@wknapik
