Delete all archives in an AWS Vault

AWS Glacier: Delete vault

Follow these steps to remove all archives from an AWS Glacier vault. Glacier only lets you delete a vault once it is empty, so after this is finished you will be able to delete the vault itself through the browser console.

Step 1 / Retrieve inventory

This creates an inventory-retrieval job, which collects the list of archives stored in the vault.

$ aws glacier initiate-job --job-parameters '{"Type": "inventory-retrieval"}' --account-id YOUR_ACCOUNT_ID --region YOUR_REGION --vault-name YOUR_VAULT_NAME 

This can take hours or even days, depending on the size of the vault. Use the following command to check if it is ready:

$ aws glacier list-jobs --account-id YOUR_ACCOUNT_ID --region YOUR_REGION --vault-name YOUR_VAULT_NAME

The job is ready once its entry in the output shows "Completed": true. Copy the JobId (including the quotes) for the next step.
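
Rather than rerunning list-jobs by hand, the check can be scripted. The following is a minimal sketch, not part of the original instructions: wait_for_job is a hypothetical helper name, the 15-minute default interval is an arbitrary choice, and it assumes AWS_ACCOUNT_ID, AWS_REGION and AWS_VAULT_NAME are already exported as described in step 3. It relies on aws glacier describe-job, which reports a Completed field for a single job.

```shell
#!/usr/bin/env bash

# Hypothetical helper: poll one Glacier job until it reports Completed = true.
# Usage: wait_for_job JOB_ID [SECONDS_BETWEEN_CHECKS]
wait_for_job() {
  local job_id="$1"
  local interval="${2:-900}"   # assumption: 15 minutes between checks
  local completed

  while true; do
    # --job-id= (with the equals sign) keeps an ID that starts with a dash
    # from being parsed as an option.
    completed=$(aws glacier describe-job \
      --account-id "${AWS_ACCOUNT_ID}" \
      --region "${AWS_REGION}" \
      --vault-name "${AWS_VAULT_NAME}" \
      --job-id="${job_id}" \
      --query 'Completed' --output text) || return 1

    # Text output renders the boolean as True/False; accept either case.
    if [[ "${completed}" == [Tt]rue ]]; then
      echo "Job ${job_id} is complete"
      return 0
    fi

    echo "Job not ready yet, checking again in ${interval}s"
    sleep "${interval}"
  done
}
```

After sourcing the file, calling wait_for_job YOUR_JOB_ID blocks until the inventory is ready, so it is best run in a session that can stay open for a long time.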

Step 2 / Get the archive IDs

The following command writes the inventory to ./output.json, a file listing all archive IDs, which step 3 requires.

$ aws glacier get-job-output --account-id YOUR_ACCOUNT_ID --region YOUR_REGION --vault-name YOUR_VAULT_NAME --job-id YOUR_JOB_ID ./output.json
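
Before deleting anything, it can help to confirm that the inventory actually contains archives. A minimal sketch, assuming jq is installed (step 3 needs it anyway) and that the inventory has the usual top-level ArchiveList array; count_archives is a hypothetical helper name:

```shell
#!/usr/bin/env bash

# Hypothetical helper: print how many archives the inventory file lists.
# Usage: count_archives [INVENTORY_FILE]   (defaults to ./output.json)
count_archives() {
  local inventory="${1:-./output.json}"
  jq '.ArchiveList | length' "${inventory}"
}
```

If this prints 0, the vault was already empty at the time of the inventory and step 3 has nothing to delete.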

Step 3 / Delete archives

Set the following parameters through environment variables:

export AWS_ACCOUNT_ID=YOUR_ACCOUNT_ID
export AWS_REGION=YOUR_REGION
export AWS_VAULT_NAME=YOUR_VAULT_NAME

Create a file with the following content and run it (the script requires jq):

#!/bin/bash

file='./output.json'

if [[ -z ${AWS_ACCOUNT_ID} ]] || [[ -z ${AWS_REGION} ]] || [[ -z ${AWS_VAULT_NAME} ]]; then
	echo "Please set the following environment variables: "
	echo "AWS_ACCOUNT_ID"
	echo "AWS_REGION"
	echo "AWS_VAULT_NAME"
	exit 1
fi

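# -r prints the raw IDs without the surrounding JSON quotes
archive_ids=$(jq -r '.ArchiveList[].ArchiveId' < "$file")

for archive_id in ${archive_ids}; do
    echo "Deleting Archive: ${archive_id}"
    # --archive-id= (with the equals sign) keeps IDs that start with a dash from being parsed as options
    aws glacier delete-archive --archive-id="${archive_id}" --vault-name "${AWS_VAULT_NAME}" --account-id "${AWS_ACCOUNT_ID}" --region "${AWS_REGION}"
done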

echo "Finished deleting archives"
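
Once the archives are gone, the vault can also be removed from the command line instead of the browser console. The sketch below is an optional addition, not part of the original steps: delete_vault_when_empty is a hypothetical helper name, and it assumes the same three environment variables are set. Note that describe-vault reports NumberOfArchives as of the last inventory, so the count may lag behind the deletions until Glacier has refreshed the inventory.

```shell
#!/usr/bin/env bash

# Hypothetical helper: delete the vault only if Glacier reports it as empty.
delete_vault_when_empty() {
  local count

  # NumberOfArchives reflects the most recent vault inventory.
  count=$(aws glacier describe-vault \
    --account-id "${AWS_ACCOUNT_ID}" \
    --region "${AWS_REGION}" \
    --vault-name "${AWS_VAULT_NAME}" \
    --query 'NumberOfArchives' --output text) || return 1

  if [[ "${count}" != "0" ]]; then
    echo "Vault still reports ${count} archives; try again after the next inventory"
    return 1
  fi

  aws glacier delete-vault \
    --account-id "${AWS_ACCOUNT_ID}" \
    --region "${AWS_REGION}" \
    --vault-name "${AWS_VAULT_NAME}" || return 1

  echo "Deleted vault ${AWS_VAULT_NAME}"
}
```

If the function reports remaining archives right after the delete script finishes, waiting and calling it again later is usually all that is needed.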

Acknowledgement

This tutorial is based on this one: https://gist.github.com/Remiii/507f500b5c4e801e4ddc

@joel1di1

commented Feb 5, 2018

if [[ -z ${AWS_ACCOUNT_ID} ]] || [[ -z ${AWS_ACCOUNT_ID} ]] || [[ -z ${AWS_ACCOUNT_ID} ]]; then

Should be :
if [[ -z ${AWS_ACCOUNT_ID} ]] || [[ -z ${AWS_REGION} ]] || [[ -z ${AWS_VAULT_NAME} ]]; then

@jfreeman

commented Mar 29, 2018

I'm getting an error from this line in the script: archive_ids=$(jq .ArchiveList[].ArchiveId < $file)

The error is 'jq: command not found'. Any thoughts on what I need to change in the script to fix this? I really need to delete some Glacier vaults ASAP. Thanks!

Joshua

@RafaelRuales

commented May 30, 2018

@jfreeman, follow these steps to download jq: https://stedolan.github.io/jq/download/

@cwilper

commented May 1, 2019

Thanks for sharing this, @veuncent

Here's a tweaked version of the script that processes in a stream (lower memory requirement for huge vaults), gives counts and timestamps, incorporates @joel1di1's fix, and uses AWS_PROFILE, if defined.

#!/usr/bin/env bash

file='./output.json'
id_file='./output-archive-ids.txt'

if [[ -z ${AWS_ACCOUNT_ID} ]] || [[ -z ${AWS_REGION} ]] || [[ -z ${AWS_VAULT_NAME} ]]; then
        echo "Please set the following environment variables: "
        echo "AWS_ACCOUNT_ID"
        echo "AWS_REGION"
        echo "AWS_VAULT_NAME"
        exit 1
fi

echo "Started at $(date)"

echo -n "Getting archive ids from $file..."
# -s also regenerates the id file when an earlier failed run left it empty
if [[ ! -s "$id_file" ]]; then
  jq -r --stream ". | { (.[0][2]): .[1]} | select(.ArchiveId) | .ArchiveId" "$file" > "$id_file" 2> /dev/null
fi
total=$(wc -l $id_file | awk '{print $1}')
echo "got $total"

num=0
while read -r archive_id; do
  num=$((num+1))
  echo "Deleting archive $num/$total at $(date)"
  if [[ $AWS_PROFILE ]]; then
    aws --profile $AWS_PROFILE glacier delete-archive --archive-id=${archive_id} --vault-name ${AWS_VAULT_NAME} --account-id ${AWS_ACCOUNT_ID} --region ${AWS_REGION}
  else
    aws glacier delete-archive --archive-id=${archive_id} --vault-name ${AWS_VAULT_NAME} --account-id ${AWS_ACCOUNT_ID} --region ${AWS_REGION}
  fi
done < "$id_file"

echo "Finished at $(date)"
echo "Deleted archive ids are in $id_file"

I'd recommend naming it delete-archives.sh and running it in the background on a machine that's going to be on a network for a long time, e.g.:

chmod 755 delete-archives.sh
nohup ./delete-archives.sh > delete-archives.log 2>&1 &
tail -f delete-archives.log

@veuncent

Owner Author

commented May 1, 2019

Nice, thanks for sharing, @cwilper!

@veuncent

Owner Author

commented May 31, 2019

if [[ -z ${AWS_ACCOUNT_ID} ]] || [[ -z ${AWS_ACCOUNT_ID} ]] || [[ -z ${AWS_ACCOUNT_ID} ]]; then

Should be :
if [[ -z ${AWS_ACCOUNT_ID} ]] || [[ -z ${AWS_REGION} ]] || [[ -z ${AWS_VAULT_NAME} ]]; then

Thanks, @joel1di1! I updated the script. (Sorry for the late reply, I didn't get a notification when you posted this.)

@m-leishman

commented Jun 8, 2019

Hi @cwilper
I'm not getting anything back from the output.json file. There are archive IDs in there, but the script prints "Getting archive ids from ./output.json...got 0". Any thoughts?
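
One way to narrow down a "got 0" result is to run the extraction step on its own, writing to a fresh id file and leaving jq's error output visible instead of sending it to /dev/null. The sketch below is a guess at a useful debugging step, not a confirmed diagnosis: extract_archive_ids is a hypothetical helper name, and the stream filter is a stricter alternative to the one in the script above that only selects ArchiveList[n].ArchiveId entries, so it should not produce errors on other keys. It assumes jq 1.5 or later, which introduced --stream.

```shell
#!/usr/bin/env bash

# Hypothetical helper: rebuild the archive id file from scratch and print
# how many IDs were found. Errors from jq are left visible on purpose.
# Usage: extract_archive_ids [INVENTORY_FILE] [ID_FILE]
extract_archive_ids() {
  local inventory="${1:-./output.json}"
  local id_file="${2:-./output-archive-ids.txt}"

  # Remove any id file left behind by an earlier run.
  rm -f "${id_file}"

  # Each stream event is [path, value]; keep only ArchiveList[n].ArchiveId leaves.
  jq -r --stream \
    'select(length == 2 and .[0][0] == "ArchiveList" and .[0][2] == "ArchiveId") | .[1]' \
    "${inventory}" > "${id_file}" || return 1

  # Print the number of IDs written.
  awk 'END { print NR }' "${id_file}"
}
```

If this prints the expected count, the delete script can be rerun and will pick up the regenerated id file; if jq reports an error instead, that message is the next thing to look at.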
