@rproenca · Last active May 1, 2023 14:51
Delete multiple items in a DynamoDB table
const AWS = require('aws-sdk');
const {parallelScan} = require('@shelf/dynamodb-parallel-scan');

const TABLE_NAME = '[table-name]';
const PRIMARY_PARTITION_KEY = '[partition-key]';

// Scan the whole table in parallel, projecting only the partition key
// (the only attribute needed to build the delete requests).
async function fetchAll() {
  const CONCURRENCY = 250;
  const alias = `#${PRIMARY_PARTITION_KEY}`;
  const name = PRIMARY_PARTITION_KEY;
  const scanParams = {
    TableName: TABLE_NAME,
    ProjectionExpression: alias,
    ExpressionAttributeNames: {[alias]: name},
  };
  return parallelScan(scanParams, {concurrency: CONCURRENCY});
}

// Wrap each item's key in the DeleteRequest shape expected by BatchWriteItem.
function prepareRequestParams(items) {
  return items.map((item) => ({
    DeleteRequest: {
      Key: {
        [PRIMARY_PARTITION_KEY]: item[PRIMARY_PARTITION_KEY],
      },
    },
  }));
}

// BatchWriteItem accepts at most 25 requests per call, so split the
// delete requests into chunks of that size.
function sliceInChunks(arr) {
  const CHUNK_SIZE = 25; // DynamoDB BatchWriteItem limit
  const chunks = [];
  for (let i = 0; i < arr.length; i += CHUNK_SIZE) {
    chunks.push(arr.slice(i, i + CHUNK_SIZE));
  }
  return chunks;
}

// Issue one BatchWriteItem call per chunk, all in parallel.
async function deleteItems(chunks) {
  const documentClient = new AWS.DynamoDB.DocumentClient();
  const promises = chunks.map((chunk) => {
    const params = {RequestItems: {[TABLE_NAME]: chunk}};
    return documentClient.batchWrite(params).promise();
  });
  return Promise.all(promises);
}

(async () => {
  const items = await fetchAll();
  const params = prepareRequestParams(items);
  const chunks = sliceInChunks(params);
  const res = await deleteItems(chunks);
  console.log(JSON.stringify(res));
})();
@rproenca (Author)
Uses the @shelf/dynamodb-parallel-scan module to retrieve items in parallel.
Ref: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Scan.html#Scan.ParallelScan

Uses BatchWriteItem to delete items in chunks of 25, the maximum number of requests allowed per call.
Ref: https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_BatchWriteItem.html

@isaacSennerholt commented Feb 21, 2021

@rproenca, thank you for the example. I have a question, though, about the internal retry functionality of the AWS JS SDK. Say one of the batchWrite requests in the example above fails: would Promise.all reject on the initial request failure, or only once one of the batchWrite requests reached the internal (or configured) retry limit? I assume with Promise.all it's the latter; otherwise I'd expect a Promise.allSettled with some retry logic. What's your experience with this?
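For reference, a minimal sketch of the Promise.allSettled pattern the question alludes to (deleteItemsAllSettled is a hypothetical variant of the gist's deleteItems, reusing TABLE_NAME from above): each chunk settles independently, so one rejected batchWrite does not short-circuit the others, and the failed chunks can be collected for a manual retry.

// Hypothetical variant of deleteItems using Promise.allSettled (Node 12.9+).
async function deleteItemsAllSettled(chunks) {
  const documentClient = new AWS.DynamoDB.DocumentClient();
  const results = await Promise.allSettled(
    chunks.map((chunk) =>
      documentClient.batchWrite({RequestItems: {[TABLE_NAME]: chunk}}).promise()
    )
  );
  // Collect the chunks whose batchWrite ultimately rejected (i.e. the SDK's
  // internal retries were exhausted) so the caller can re-submit them.
  const failedChunks = results
    .map((result, i) => (result.status === 'rejected' ? chunks[i] : null))
    .filter((chunk) => chunk !== null);
  return {results, failedChunks};
}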

@brian-moore
@isaacSennerholt Just to add some other thoughts here for future readers. You can have the SDK do retries natively, like the following:

const dynamodb = new AWS.DynamoDB({
  maxRetries: 5,
  retryDelayOptions: {
    base: 300
  }
});

const documentClient = new AWS.DynamoDB.DocumentClient({ service: dynamodb });

However, you still need to handle partial failures: the SDK call can succeed but still return items in UnprocessedItems, and you need to set up a retry for those leftovers.
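A minimal sketch of that partial-failure handling (batchWriteWithRetry is a hypothetical helper, not part of the gist or the SDK): re-submit whatever comes back in UnprocessedItems, backing off between attempts, until the response is empty.

// Hypothetical helper: re-submit UnprocessedItems until none remain.
// `documentClient` is an AWS.DynamoDB.DocumentClient instance.
async function batchWriteWithRetry(documentClient, requestItems, maxAttempts = 5) {
  let params = {RequestItems: requestItems};
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    const res = await documentClient.batchWrite(params).promise();
    const unprocessed = res.UnprocessedItems || {};
    if (Object.keys(unprocessed).length === 0) {
      return; // every request was applied
    }
    // Simple exponential backoff before retrying only the leftover requests.
    await new Promise((resolve) => setTimeout(resolve, 100 * 2 ** attempt));
    params = {RequestItems: unprocessed};
  }
  throw new Error('UnprocessedItems remained after max retry attempts');
}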
