@helephant
Last active November 23, 2022 10:30
Do step functions wait for an available lambda if they hit a concurrency limit or do they fail?

We have a process that spawns lots of Lambdas, and we were worried it might hit a concurrency limit. We wanted to know what a step function would do in that situation. The AWS documentation says that poll-based services have a built-in retry mechanism when Lambdas are throttled, so we were hoping the subsequent steps would simply wait until there was enough capacity.

So I built a step function with three steps, each backed by a Lambda that sleeps for 30 seconds, and set the reserved concurrency to 1 for the Lambda behind the middle step. Then I kicked off three executions of the step function at the same time.

The result was that the second instance of the step function failed on the throttled Lambda with the error "Lambda.TooManyRequestsException".

Step functions do have retry options, but it is useful to know that, on their own, they don't wait for a Lambda slot to become available.

exports.handler = async (event) => {
  // evil code just for debugging purposes
  function sleep(time) {
    return new Promise((resolve) => setTimeout(resolve, time));
  }

  // Usage!
  await sleep(30000);

  const response = {
    statusCode: 200,
    body: JSON.stringify("<number> function finished"),
  };
  return response;
};
{
  "StartAt": "FirstStepFunction",
  "States": {
    "FirstStepFunction": {
      "Type": "Task",
      "Resource": "<arn to lambda 1>",
      "Next": "SecondStepFunction"
    },
    "SecondStepFunction": {
      "Type": "Task",
      "Resource": "<arn to lambda 2>",
      "Next": "ThirdStepFunction"
    },
    "ThirdStepFunction": {
      "Type": "Task",
      "Resource": "<arn to lambda 3>",
      "End": true
    }
  }
}
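The retry options could be attached to the throttled middle state roughly as sketched here. This is a minimal sketch, not something from the original test: `withThrottleRetry` is a hypothetical helper, and the `IntervalSeconds`, `MaxAttempts`, and `BackoffRate` values are illustrative assumptions that would need tuning for a real workload. It only builds the state object; the printed JSON would replace the "SecondStepFunction" entry in the definition.

```javascript
// Hypothetical helper: returns a copy of a Task state with a Retry rule that
// fires when the Lambda invocation is throttled.
// The numeric values are assumptions chosen for illustration only.
function withThrottleRetry(taskState) {
  return {
    ...taskState,
    Retry: [
      {
        ErrorEquals: ["Lambda.TooManyRequestsException"],
        IntervalSeconds: 2, // wait before the first retry
        MaxAttempts: 5,     // number of retries before the state fails
        BackoffRate: 2.0,   // multiplier applied to the wait on each retry
      },
    ],
  };
}

// The middle state from the definition, with the ARN placeholder left as-is.
const secondStep = withThrottleRetry({
  Type: "Task",
  Resource: "<arn to lambda 2>",
  Next: "ThirdStepFunction",
});

console.log(JSON.stringify(secondStep, null, 2));
```

With a rule like this, a throttled execution would retry the step after a delay rather than failing immediately, although it would still fail once the attempts run out.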
@mhvelplund

As everyone here knows by now: no, they don't.
Step functions let you configure a retry scheme with increasing backoff, triggered by Lambda throttling.
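To get a feel for how long such a backoff scheme would keep retrying, the waits can be worked out by hand. The sketch below assumes each wait is `IntervalSeconds * BackoffRate^(retry - 1)` and ignores any jitter or delay cap. `retryDelays` is a hypothetical helper; its defaults mirror the documented Step Functions defaults (1 second, 3 attempts, rate 2.0).

```javascript
// Hypothetical helper: lists the wait (in seconds) before each retry,
// assuming plain exponential backoff with no jitter and no delay cap.
function retryDelays({ IntervalSeconds = 1, MaxAttempts = 3, BackoffRate = 2.0 } = {}) {
  const delays = [];
  for (let retry = 0; retry < MaxAttempts; retry++) {
    delays.push(IntervalSeconds * Math.pow(BackoffRate, retry));
  }
  return delays;
}

// Illustrative values, not taken from the original experiment.
const delays = retryDelays({ IntervalSeconds: 2, MaxAttempts: 5, BackoffRate: 2.0 });
const totalWait = delays.reduce((sum, d) => sum + d, 0);

console.log(delays);    // [ 2, 4, 8, 16, 32 ]
console.log(totalWait); // 62
```

In the test above, each Lambda sleeps for 30 seconds, so the total wait has to cover however long earlier executions hold the single reserved slot.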