@k3karthic
Last active July 12, 2022 18:56
Truncate all keys in a DynamoDB table
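#!/bin/bash
# Usage: pass the table name as the first argument.
# Assumes the table's only key attribute is a string named "id".
TABLE_NAME="$1"
ID_LIST="$(mktemp)"
# Get id list: --query picks the id values, --output text prints them
# tab-separated, and tr puts one id on each line
aws dynamodb scan --table-name "$TABLE_NAME" --query 'Items[].id.S' --output text \
    | tr '\t' '\n' > "$ID_LIST"
# Delete from id list, substituting each id for the {} placeholder
xargs -I{} aws dynamodb delete-item --table-name "$TABLE_NAME" --key '{ "id": { "S": "{}" }}' < "$ID_LIST"
# Remove id list
rm "$ID_LIST"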
@toshke

toshke commented Sep 28, 2017

@k3karthic I've made a small improvement so that you can specify the primary key rather than relying on "ID": https://gist.github.com/toshke/d972b56c6273639ace5f62361e1ffac1
It requires jq to be installed, though.

@grozmaistor

grozmaistor commented Aug 29, 2019

Hello,
do you think this will work on a table with 100 million items? And how much time would it need to erase them one by one?

@k3karthic
Author

> Hello,
> do you think this will work on a table with 100 million items? And how much time would it need to erase them one by one?

This script first downloads all the keys and then deletes the items one by one. It can be made faster by using the parallel option (-P) of the xargs command, but the initial scan would still be sequential.
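As a rough sketch of the xargs idea, the delete step could be wrapped like this. The function name and the default of 8 parallel jobs are made up for illustration, and it still assumes a string key named "id" with ids that contain no quotes or backslashes.

```bash
#!/bin/bash
# Sketch only: delete the ids listed one per line in a file, running
# several delete-item calls at the same time.
# Assumption: the table's only key attribute is a string named "id".
delete_ids_in_parallel() {
    local table_name="$1" id_file="$2" jobs="${3:-8}"
    # -P runs up to $jobs aws processes concurrently;
    # -I{} substitutes each input line for the {} placeholder in the key.
    xargs -P "$jobs" -I{} aws dynamodb delete-item \
        --table-name "$table_name" \
        --key '{ "id": { "S": "{}" }}' < "$id_file"
}
```

Calling `delete_ids_in_parallel my-table /path/to/id-list 8` would then replace the sequential delete step. Each item is still one API call, so this only hides latency; it does not reduce the number of requests.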

For a large number of items, it would be better to write a program that uses a DynamoDB parallel scan and BatchWriteItem, which can delete up to 25 keys in one API call.
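A minimal sketch of that approach, staying in bash with the AWS CLI and jq rather than a dedicated program. All function names are hypothetical, the key is assumed to be a single attribute named "id", and the default of 4 segments is arbitrary. It does not retry any UnprocessedItems that batch-write-item reports, and it holds each segment's keys in memory, so a real tool for 100 million items would need more care.

```bash
#!/bin/bash
# Sketch only: parallel scan plus BatchWriteItem deletes.
# Assumptions: jq is installed; the table's only key attribute is "id".

# Read one key object per line on stdin and print batch-write-item
# payloads, one compact JSON document per line, each holding at most
# 25 delete requests (the BatchWriteItem limit).
build_delete_batches() {
    local table_name="$1"
    jq -c -n --arg table "$table_name" '
        [inputs] as $keys
        | range(0; $keys | length; 25) as $i
        | { ($table): [ $keys[$i:$i + 25][] | { DeleteRequest: { Key: . } } ] }'
}

# Scan a single segment, fetching only the key attribute, and delete
# everything that segment returns.
truncate_segment() {
    local table_name="$1" segment="$2" total_segments="$3"
    aws dynamodb scan --table-name "$table_name" --output json \
        --projection-expression "#k" \
        --expression-attribute-names '{"#k": "id"}' \
        --segment "$segment" --total-segments "$total_segments" \
      | jq -c '.Items[]' \
      | build_delete_batches "$table_name" \
      | while IFS= read -r batch; do
            aws dynamodb batch-write-item --request-items "$batch"
        done
}

# Start one background worker per segment and wait for all of them.
truncate_table_parallel() {
    local table_name="$1" total_segments="${2:-4}" segment
    for ((segment = 0; segment < total_segments; segment++)); do
        truncate_segment "$table_name" "$segment" "$total_segments" &
    done
    wait
}
```

For example, `truncate_table_parallel my-table 8` would scan eight segments concurrently. Both the scans and the deletes consume table capacity, so the segment count has to be balanced against throttling.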

If downtime is not a problem, it should be possible to drop the table and recreate it later, but I have never tried this on such a large table before.
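For reference, a drop-and-recreate could look roughly like the sketch below. The function name is made up, and the schema (a single string partition key, on-demand billing) is an assumption that would have to be replaced with the real table definition. Indexes, streams, TTL settings, and tags are not carried over here.

```bash
#!/bin/bash
# Sketch only: empty a table by deleting and recreating it.
# Assumption: one string partition key (default name "id") and
# on-demand billing; adjust create-table to match the real schema.
recreate_table() {
    local table_name="$1" key_name="${2:-id}"
    aws dynamodb delete-table --table-name "$table_name" &&
    # Block until the old table is really gone before reusing the name.
    aws dynamodb wait table-not-exists --table-name "$table_name" &&
    aws dynamodb create-table --table-name "$table_name" \
        --attribute-definitions "AttributeName=$key_name,AttributeType=S" \
        --key-schema "AttributeName=$key_name,KeyType=HASH" \
        --billing-mode PAY_PER_REQUEST &&
    # Block until the new table is ready to accept requests.
    aws dynamodb wait table-exists --table-name "$table_name"
}
```

Saving the output of `aws dynamodb describe-table` before running something like `recreate_table my-table id` keeps a record of the original settings to restore.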

@jenslauterbach

Just FYI: I just released a simple CLI program that does exactly that: it uses segments and parallel scans to be as fast as possible.

It is very early in development, but feedback is always appreciated.

http://github.com/jenslauterbach/ddbt

@michaelrios

> @k3karthic I've made a small improvement so that you can specify the primary key rather than relying on "ID": https://gist.github.com/toshke/d972b56c6273639ace5f62361e1ffac1
> It requires jq to be installed, though.

I built on yours so that it can delete only the rows with a given partition key prefix and sort key:
https://gist.github.com/michaelrios/05dbf08efeb2efab86f12013bcb1129f

@dalazx

dalazx commented Sep 30, 2021

aws dynamodb scan --table-name "$TABLE_NAME" --output json | jq -c '.Items[]' | tr '\n' '\0' | \
    xargs -0 -I{} aws dynamodb delete-item --table-name "$TABLE_NAME" --key '{}'

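delete-item only accepts the key attributes, so the scan has to return just those. See --select SPECIFIC_ATTRIBUTES and --projection-expression in aws dynamodb scan help to tune the key.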
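One way to tune the key, sketched under assumptions: look up the key attribute names with describe-table and project only those in the scan. The function name and the `#k0`, `#k1` placeholders are invented for this example, jq is required, and key attribute names containing whitespace or quotes are not handled.

```bash
#!/bin/bash
# Sketch only: print one compact key object per line for every item.
# Assumptions: jq is installed; key attribute names contain no
# whitespace or quotes.
scan_key_objects() {
    local table_name="$1" names projection="" name_map="" i=0 name
    # Ask DynamoDB which attributes make up the primary key.
    names="$(aws dynamodb describe-table --table-name "$table_name" \
        --query 'Table.KeySchema[].AttributeName' --output text)"
    # Build "#k0,#k1" plus the matching placeholder map, so that key
    # names that happen to be reserved words still work.
    for name in $names; do
        projection="${projection:+$projection,}#k$i"
        name_map="${name_map:+$name_map,}\"#k$i\":\"$name\""
        i=$((i + 1))
    done
    aws dynamodb scan --table-name "$table_name" --output json \
        --projection-expression "$projection" \
        --expression-attribute-names "{$name_map}" \
      | jq -c '.Items[]'
}
```

The output of `scan_key_objects "$TABLE_NAME"` can be fed into the same `tr` and `xargs` pipeline as above in place of the plain scan.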
