Delete all versions of all files in s3 versioned bucket using AWS CLI and jq.
#!/bin/bash
bucket=$1
set -e
echo "Removing all versions from $bucket"
versions=`aws s3api list-object-versions --bucket $bucket |jq '.Versions'`
markers=`aws s3api list-object-versions --bucket $bucket |jq '.DeleteMarkers'`
let count=`echo $versions |jq 'length'`-1
if [ $count -gt -1 ]; then
    echo "removing files"
    for i in $(seq 0 $count); do
        key=`echo $versions | jq .[$i].Key |sed -e 's/\"//g'`
        versionId=`echo $versions | jq .[$i].VersionId |sed -e 's/\"//g'`
        cmd="aws s3api delete-object --bucket $bucket --key $key --version-id $versionId"
        echo $cmd
        $cmd
    done
fi
let count=`echo $markers |jq 'length'`-1
if [ $count -gt -1 ]; then
    echo "removing delete markers"
    for i in $(seq 0 $count); do
        key=`echo $markers | jq .[$i].Key |sed -e 's/\"//g'`
        versionId=`echo $markers | jq .[$i].VersionId |sed -e 's/\"//g'`
        cmd="aws s3api delete-object --bucket $bucket --key $key --version-id $versionId"
        echo $cmd
        $cmd
    done
fi

@geekifier geekifier commented Jul 12, 2017

Thanks for sharing! It really helped me out with a specific migration use case. It worked great!

@upendrasoft upendrasoft commented Jul 26, 2017

Thanks. I have updated the script to improve performance for buckets with many objects and versions. I hope this is useful for others too.

#!/bin/bash

bucket=$1

set -e

echo "Removing all versions from $bucket"

versions=`aws s3api list-object-versions --bucket $bucket |jq '.Versions'`
markers=`aws s3api list-object-versions --bucket $bucket |jq '.DeleteMarkers'`

echo "removing files"
for version in $(echo "${versions}" | jq -r '.[] | @base64'); do 
    version=$(echo ${version} | base64 --decode)

    key=`echo $version | jq -r .Key`
    versionId=`echo $version | jq -r .VersionId `
    cmd="aws s3api delete-object --bucket $bucket --key $key --version-id $versionId"
    echo $cmd
    $cmd
done

echo "removing delete markers"
for marker in $(echo "${markers}" | jq -r '.[] | @base64'); do 
    marker=$(echo ${marker} | base64 --decode)

    key=`echo $marker | jq -r .Key`
    versionId=`echo $marker | jq -r .VersionId `
    cmd="aws s3api delete-object --bucket $bucket --key $key --version-id $versionId"
    echo $cmd
    $cmd
done

@cbrinker cbrinker commented Sep 5, 2017

The script was generally well behaved; however, keys with spaces caused some trouble. Thanks for sharing!

@jeroenbaas jeroenbaas commented Oct 2, 2017

This can be done much more efficiently with the --query parameter:
aws s3api list-object-versions --bucket $bucket --prefix $somePrefixToFilterByIfYouNeedTo --query "[Versions,DeleteMarkers][].{Key: Key, VersionId: VersionId}"
after which you can loop over the results in one go.
I was looking for something like this to clean up a slow bucket with millions of deleted keys, hoping that clearing them would speed the bucket up, but the code above would sit for days fetching JSON with a huge amount of overhead (twice!).

Here is an example of the entire script, including a prefix in case you only want to clear a subset of a bucket. It uses --output text to loop over the results in text mode (even less overhead).

#!/bin/bash

bucket=$1
prefix=$2
set -e

echo "Removing all versions from $bucket, prefix $prefix"

OIFS="$IFS" ; IFS=$'\n' ; oset="$-" ; set -f
while IFS="$OIFS" read -a line 
do 
    key=`echo ${line[0]} | sed 's#SPACEREPLACE# #g'` # restore the spaces that were temporarily replaced with SPACEREPLACE (needed because read -a above splits on every space)
    versionId=${line[1]}
    echo "key: ${key} versionId: ${versionId}"
    # use doublequotes (escaped) around the key to allow for spaces in the key.
    cmd="/usr/bin/aws s3api delete-object --bucket $bucket --key \"$key\" --version-id $versionId"
    echo $cmd
    eval $cmd
done < <(aws s3api list-object-versions --bucket $bucket --prefix $prefix --query "[Versions,DeleteMarkers][].{Key: Key, VersionId: VersionId}" --output text | sed 's# #SPACEREPLACE#g' )
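
For comparison, the same single-pass listing can be sketched in Python with a boto3 paginator. This is only a minimal sketch: it assumes a boto3 S3 client is passed in, and the function name iter_object_versions is made up for illustration.

```python
def iter_object_versions(client, bucket, prefix=""):
    """Yield {"Key", "VersionId"} for every version and delete marker.

    `client` is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    paginator = client.get_paginator("list_object_versions")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Either list may be missing from a page, so default to empty lists.
        for entry in page.get("Versions", []) + page.get("DeleteMarkers", []):
            yield {"Key": entry["Key"], "VersionId": entry["VersionId"]}
```

Because keys stay ordinary Python strings here, spaces in a key should not need the SPACEREPLACE workaround.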

@jnawk jnawk commented Oct 26, 2017

AWS CLI requires python, and there's a much much better way to do this using python:

import boto3
session = boto3.session()
s3 = session.resource(service_name='s3')
bucket = s3.Bucket('your_bucket_name')
bucket.object_versions.delete()
# bucket.delete()

@nicdoye nicdoye commented Oct 30, 2017

@jnawk Nice! Minor typo: it should be boto3.Session() not boto3.session().

@mattbryson mattbryson commented Nov 9, 2017

@jnawk That's awesome! Saved me so much time. Two quick questions...

  1. object_versions.delete() doesn't appear to remove zero-byte keys; any idea how to get around that?
  2. Is there a way to enable verbose output in boto3? .delete() can take a long time on large buckets, so it would be good to see some sort of progress...

UPDATE: it's not zero-byte keys, it's macOS "Icon?" files. When they are uploaded to S3, a newline gets appended to the file name, which breaks all the S3 tooling, even the console. I have raised this with AWS.
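
On the progress question, one possible approach is to skip the one-shot .delete() and issue the batched deletes yourself, reporting after each request. The following is a rough sketch under stated assumptions, not boto3's own mechanism: `client` is assumed to be a boto3 S3 client, and the function name delete_with_progress and its `report` hook are hypothetical.

```python
def delete_with_progress(client, bucket, prefix="", report=print):
    """Delete all versions and delete markers, reporting after each batch.

    `client` is assumed to be a boto3 S3 client; `report` is any callable
    that accepts a string. Returns the number of entries submitted.
    """
    total = 0
    paginator = client.get_paginator("list_object_versions")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        entries = page.get("Versions", []) + page.get("DeleteMarkers", [])
        objects = [{"Key": e["Key"], "VersionId": e["VersionId"]} for e in entries]
        # delete_objects accepts at most 1000 keys per request.
        for start in range(0, len(objects), 1000):
            batch = objects[start:start + 1000]
            response = client.delete_objects(
                Bucket=bucket, Delete={"Objects": batch, "Quiet": True}
            )
            # Per-key failures are returned in "Errors", not raised.
            for error in response.get("Errors", []):
                report(f"failed: {error.get('Key')} ({error.get('Code')})")
            total += len(batch)
            report(f"submitted {total} deletions so far")
    return total
```

Something like delete_with_progress(boto3.client("s3"), "your_bucket_name") would then print a running count; keys that fail to delete show up in the "failed" lines.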

@joelthompson joelthompson commented Dec 12, 2017

There's a small bug when you have only a single object and/or delete marker. It comes from this line:

let count=`echo $versions |jq 'length'`-1

When count ends up as 0, let returns a non-zero exit status (it does so whenever its last expression evaluates to 0), and because of the set -e above, the script exits.

@JohnVonNeumann JohnVonNeumann commented Apr 16, 2018

You are a champion amongst men. Cheers.

@arichiardi arichiardi commented May 9, 2018

Thanks for this script, I tweaked the original one in order to urldecode UTF-8 keys coming from the bucket:

#!/bin/bash

bucket=$1

set -e

echo "Removing all versions from $bucket"

function urldecode {
    # Python 3: unquote_plus lives in urllib.parse.
    python3 -c "import sys, urllib.parse as ul; print(ul.unquote_plus(sys.argv[1]))" "$1"
}

versions=`aws s3api list-object-versions --encoding-type url --bucket $bucket | jq '.Versions'`
markers=`aws s3api list-object-versions --encoding-type url --bucket $bucket | jq '.DeleteMarkers'`

echo "removing files"
for version in $(echo "${versions}" | jq -r '.[] | @base64'); do
    version=$(echo ${version} | base64 --decode)

    key=`echo $version | jq -r .Key`
    versionId=`echo $version | jq -r .VersionId`
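    decodedKey=$(urldecode "$key")
    # Run the command directly so that a key containing spaces stays a single argument.
    echo "aws s3api delete-object --bucket $bucket --key $decodedKey --version-id $versionId"
    aws s3api delete-object --bucket "$bucket" --key "$decodedKey" --version-id "$versionId"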
done

echo "removing delete markers"
for marker in $(echo "${markers}" | jq -r '.[] | @base64'); do
    marker=$(echo ${marker} | base64 --decode)

    key=`echo $marker | jq -r .Key`
    versionId=`echo $marker | jq -r .VersionId`
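    decodedKey=$(urldecode "$key")
    # Run the command directly so that a key containing spaces stays a single argument.
    echo "aws s3api delete-object --bucket $bucket --key $decodedKey --version-id $versionId"
    aws s3api delete-object --bucket "$bucket" --key "$decodedKey" --version-id "$versionId"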
done
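
To see what the urldecode step does to a key, here is a small standalone Python 3 sketch. The sample key and the helper name decode_key are made up for illustration.

```python
from urllib.parse import unquote_plus


def decode_key(encoded_key):
    """Decode a key from a listing made with --encoding-type url."""
    # unquote_plus turns "+" into a space and %XX escapes into UTF-8 text.
    return unquote_plus(encoded_key)


print(decode_key("reports/caf%C3%A9+menu.txt"))  # reports/café menu.txt
```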

@tokozedg tokozedg commented May 15, 2018

bucket.object_versions.filter(
        Prefix='folder'
).delete()

This worked for me very well.

@ip1981 ip1981 commented Sep 24, 2018

Guys, set -x already prints each command before running it, so there is no need for this:

cmd="aws s3api delete-object --bucket $bucket --key $key --version-id $versionId"
echo $cmd

@ip1981 ip1981 commented Sep 24, 2018

And you can use Jq to build up command lines:

... | jq -r '.Versions[] | "aws s3api delete-object --bucket capitalmatch-backups --key \"\(.Key)\" --version-id \"\(.VersionId)\""'
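
If you would rather build the command lines in Python, shlex.join (Python 3.8+) takes care of the shell quoting, which matters for keys with spaces. A minimal sketch; the bucket, key and version ID below are placeholders and delete_command is a made-up helper name.

```python
import shlex


def delete_command(bucket, key, version_id):
    """Build one shell-safe `aws s3api delete-object` command line."""
    return shlex.join([
        "aws", "s3api", "delete-object",
        "--bucket", bucket,
        "--key", key,
        "--version-id", version_id,
    ])


print(delete_command("example-bucket", "my file.txt", "abc123"))
# aws s3api delete-object --bucket example-bucket --key 'my file.txt' --version-id abc123
```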

@felipekiko felipekiko commented Dec 2, 2018

Works for me, including versions whose filenames need encoding. Thanks!!

@kaosinc kaosinc commented Feb 16, 2020

Thank you SO MUCH!

@nashjain nashjain commented Apr 8, 2020

There is actually a much simpler and faster approach:

bucket=$1
fileToDelete=$2
deleteBefore=$3
fileName='aws_delete.json'
rm -f "$fileName"
versionsToDelete=`aws s3api list-object-versions --bucket "$bucket" --prefix "$fileToDelete" --query "Versions[?(LastModified<'$deleteBefore')].{Key: Key, VersionId: VersionId}"`
cat << EOF > $fileName
{"Objects":$versionsToDelete, "Quiet":true}
EOF
aws s3api delete-objects --bucket "$bucket" --delete file://$fileName

Note that s3api delete-objects can handle up to 1000 records per request.

Want to do more advanced stuff? Check out my gist.
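
The same filter-then-batch idea can be sketched in Python. This assumes the version entries carry timezone-aware LastModified datetimes, as boto3 returns them; build_delete_request is a hypothetical helper, not part of any library.

```python
from datetime import datetime, timezone


def build_delete_request(versions, delete_before, limit=1000):
    """Return a delete-objects payload for versions older than `delete_before`.

    `versions` is assumed to be a list of dicts with "Key", "VersionId" and a
    timezone-aware "LastModified" datetime.
    """
    old = [
        {"Key": v["Key"], "VersionId": v["VersionId"]}
        for v in versions
        if v["LastModified"] < delete_before
    ]
    # One delete-objects request accepts at most 1000 entries.
    return {"Objects": old[:limit], "Quiet": True}


cutoff = datetime(2020, 1, 1, tzinfo=timezone.utc)
sample = [
    {"Key": "a.txt", "VersionId": "v1", "LastModified": datetime(2019, 6, 1, tzinfo=timezone.utc)},
    {"Key": "a.txt", "VersionId": "v2", "LastModified": datetime(2020, 6, 1, tzinfo=timezone.utc)},
]
print(build_delete_request(sample, cutoff))
# {'Objects': [{'Key': 'a.txt', 'VersionId': 'v1'}], 'Quiet': True}
```

The returned dict has the shape that the Delete argument of delete_objects expects.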

@RahulAdepu92 RahulAdepu92 commented Jul 27, 2020

Note: unless "s3:DeleteObjectVersion" is included in the policy attached to your IAM role, deleting versions won't work.
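
As a rough illustration, a policy document along these lines could be generated as follows. The bucket name is a placeholder, purge_policy is a made-up helper, and your setup may well need additional actions.

```python
import json


def purge_policy(bucket):
    """Return an IAM policy document allowing versions to be listed and deleted."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Listing versions is a bucket-level action.
                "Effect": "Allow",
                "Action": ["s3:ListBucketVersions"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {   # Deleting a specific version is an object-level action.
                "Effect": "Allow",
                "Action": ["s3:DeleteObjectVersion"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


print(json.dumps(purge_policy("example-bucket"), indent=2))
```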

@marcuspaget marcuspaget commented Jul 28, 2020

Thanks @nashjain ... here is my version of yours :)

(echo -n '{"Objects":';aws s3api list-object-versions --bucket "$bucket" --prefix "$prefix" --max-items 1000 --query "Versions[?(LastModified<'2020-07-21')].{Key: Key, VersionId: VersionId}" | sed 's#]$#] , "Quiet":true}#') > _TMP_DELETE && aws s3api delete-objects --bucket "$bucket" --delete file://_TMP_DELETE

To do 1000 at a time.

@marcuspaget marcuspaget commented Jul 29, 2020

Found I could put that in a loop and get through about 3 iterations (3k objects) a minute. So I produced this script, which lists 10k objects, then uses jq to slice off 1k at a time and delete them, looping 4k times. Now up to around 4.5k objects a minute.

bucket=_BUCKET_NAME_
prefix=_PREFIX_

cnt=0
FN=/tmp/_TMP_DELETE
rm $FN 2> /dev/null

while [ $cnt -lt 4000 ]
do
	aws s3api list-object-versions --bucket "$bucket" --prefix "$prefix" --max-items 10000 --query "Versions[?(LastModified<'2019-07-21')].{Key: Key, VersionId: VersionId}" > $FN
	rm $FN.upload 2> /dev/null
	s=0
	while [ $s -lt 10000 ]
	do
		# jq's .[s:e] excludes index e, so each batch is s..e-1 and the next one starts at e
		((e=s+1000))
		#echo taking $s to $e
		(echo -n '{"Objects":';jq ".[$s:$e]" < $FN 2>&1 | sed 's#]$#] , "Quiet":true}#') > $FN.upload
		aws s3api delete-objects --bucket "$bucket" --delete file://$FN.upload && rm $FN.upload
		((s=e))
		#echo s is $s and e is $e
		echo -n "."
	done

((cnt++))
((tot=cnt*10))
echo on run $cnt total deleted ${tot}k objects

done

@marcuspaget marcuspaget commented Jul 29, 2020

Okay ... faster still (~10k/min) - just dump everything into a file, then:

bucket=_BUCKET_
prefix=_PREFIX_
SRCFN=_DUMP_FILE_
FN=/tmp/_TMP_DELETE

aws s3api list-object-versions --bucket "$bucket" --prefix "$prefix" --query "Versions[?(LastModified<'2019-07-21')].{Key: Key, VersionId: VersionId}" > $SRCFN

rm $FN 2> /dev/null
s=0
c=`grep -c VersionId $SRCFN`

while [ $s -lt $c ]
do
	# jq's .[s:e] excludes index e, so each batch is s..e-1 and the next one starts at e
	((e=s+1000))
	echo taking $s to $((e-1))
	(echo -n '{"Objects":';jq ".[$s:$e]" < $SRCFN 2>&1 | sed 's#]$#] , "Quiet":true}#') > $FN
	aws s3api delete-objects --bucket "$bucket" --delete file://$FN && rm $FN
	((s=e))
	sleep 1
	#echo s is $s and e is $e
	#echo -n "."
done
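
One thing to watch when slicing in batches: jq's .[s:e] excludes index e, just like a Python slice, so consecutive batches should share a boundary rather than skip one. A tiny Python sketch of that bookkeeping (the helper name batches is made up):

```python
def batches(items, size=1000):
    """Split `items` into consecutive slices of at most `size` entries."""
    # items[s:e] excludes index e, so the next slice starts at e itself.
    return [items[s:s + size] for s in range(0, len(items), size)]


chunks = batches(list(range(2500)))
print([len(c) for c in chunks])  # [1000, 1000, 500]
```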

@git-hemant git-hemant commented Aug 31, 2020

Yet another minor update to fix the issue when the key (file name) contains spaces:

#!/bin/bash

bucket=$1

set -e

echo "Removing all versions from $bucket"

versions=`aws s3api list-object-versions --bucket $bucket |jq '.Versions'`
markers=`aws s3api list-object-versions --bucket $bucket |jq '.DeleteMarkers'`
let count=`echo $versions |jq 'length'`-1

if [ $count -gt -1 ]; then
    echo "removing files"

    for i in $(seq 0 $count); do
        key=`echo $versions | jq .[$i].Key |sed -e 's/\"//g'`
        versionId=`echo $versions | jq .[$i].VersionId |sed -e 's/\"//g'`
        cmd="aws s3api delete-object --bucket $bucket --key \"$key\" --version-id $versionId"
        echo $cmd
        eval $cmd
    done

fi

let count=`echo $markers |jq 'length'`-1

if [ $count -gt -1 ]; then
    echo "removing delete markers"

    for i in $(seq 0 $count); do
        key=`echo $markers | jq .[$i].Key |sed -e 's/\"//g'`
        versionId=`echo $markers | jq .[$i].VersionId |sed -e 's/\"//g'`
        cmd="aws s3api delete-object --bucket $bucket --key \"$key\" --version-id $versionId"
        echo $cmd
        eval $cmd
    done

fi

@morufajibike morufajibike commented Dec 22, 2020

AWS CLI requires python, and there's a much much better way to do this using python:

import boto3
session = boto3.session()
s3 = session.resource(service_name='s3')
bucket = s3.Bucket('your_bucket_name')
bucket.object_versions.delete()
# bucket.delete()

If you want to use a named profile, this could be:

import boto3
session = boto3.session.Session(profile_name='your_profile_name')
s3 = session.resource(service_name='s3')
bucket = s3.Bucket('your_bucket_name')

## uncomment the line below to delete your bucket objects versions; BE CAREFUL!!!
# bucket.object_versions.delete()

## uncomment the line below to delete your bucket; BE CAREFUL!!!
# bucket.delete()