Written by Robert Koch
I was scrolling on LinkedIn the other day when I came across a sponsored post with a clickbait headline. While it seemed like useful advice for setting appropriate retry limits, it didn't address the main issue: recursion is bad, especially in the cloud.
This got me thinking about Lambda best practices and how functions in the cloud should be used. I think that, due to a general lack of education and a "vibe coding" mentality, people have built excessively complicated Lambda workflows to solve a myriad of problems. A common example I've seen is Lambda functions that invoke themselves in a recursive mess.
The problem isn’t just theoretical. A quick scan of LinkedIn and X reveals engineers actively promoting recursive patterns in cloud workflows.
I'm not sure why recursion has emerged as a valid cloud computing paradigm. I think it has a lot to do with recursion not being properly taught, or with engineers never having had its dangers explained to them.
You absolutely should not write a lambda function that calls itself.
Cloud costs make recursive Lambdas a terrible idea. They're slow, inefficient, and expensive. Lambdas are a tool that should be used when you have a short compute process or asynchronous job (such as uploading files or writing to a database) and the frequency of the job does not warrant a full-time compute resource like a container or instance.
So why shouldn't Lambda functions call other Lambda functions? In my opinion, there are two main reasons.
Recursive Lambdas are also, in my opinion, an indication of poorly written code and badly defined requirements.
Nearly all recursive functions can be transformed into a non-recursive form, so it's unlikely that what you're trying to do is a fundamentally recursive problem. That said, there are truly recursive operations; however, if your code cannot escape using recursion, you should be running it in one Lambda. This Computerphile video explains one example of a function that is not primitive recursive.
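As a rough illustration of keeping a genuinely recursive computation inside one invocation, here is a minimal sketch using the Ackermann function, a classic example of a function that is recursive but not primitive recursive. The event fields `m` and `n` are assumptions made for this example. The recursion is replaced by an explicit stack, so nothing here invokes another Lambda.

```python
def ackermann(m, n):
    """Compute the Ackermann function with an explicit stack instead of recursion."""
    stack = [m]  # pending 'm' arguments still waiting for their 'n'
    while stack:
        m = stack.pop()
        if m == 0:
            # A(0, n) = n + 1
            n += 1
        elif n == 0:
            # A(m, 0) = A(m - 1, 1)
            stack.append(m - 1)
            n = 1
        else:
            # A(m, n) = A(m - 1, A(m, n - 1)): evaluate the inner call first
            stack.append(m - 1)
            stack.append(m)
            n -= 1
    return n


def lambda_handler(event, context):
    # Hypothetical event shape: {"m": 2, "n": 3}
    m = event.get('m', 0)
    n = event.get('n', 0)
    return {'result': ackermann(m, n)}
```

The values grow extremely quickly, so even in a single invocation the inputs need to stay small to finish within Lambda's timeout.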
If you need retry logic or some type of loop, you should use a Step Functions state machine or an SQS queue to handle the state, as these systems are designed to handle edge cases much better than your code is.
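To make the retry case concrete, here is a minimal sketch of a Step Functions definition (Amazon States Language) built as a Python dictionary. The state name `ProcessJob`, the helper name, and the function ARN are placeholders invented for this example. The retry count and backoff live in the state machine, not in the function's own code.

```python
import json


def build_state_machine_definition(function_arn):
    """Return an Amazon States Language definition with a bounded retry policy."""
    definition = {
        "Comment": "Run one Lambda task with bounded retries",
        "StartAt": "ProcessJob",
        "States": {
            "ProcessJob": {
                "Type": "Task",
                "Resource": function_arn,
                "Retry": [
                    {
                        # Retry failed task executions with exponential backoff
                        "ErrorEquals": ["States.TaskFailed"],
                        "IntervalSeconds": 2,
                        "MaxAttempts": 3,
                        "BackoffRate": 2.0,
                    }
                ],
                "End": True,
            }
        },
    }
    return json.dumps(definition, indent=2)


# Placeholder ARN, purely for illustration
print(build_state_machine_definition(
    "arn:aws:lambda:REGION:ACCOUNT_ID:function:process-job"
))
```

Because `MaxAttempts` is part of the definition, the number of retries is capped no matter what the function itself does.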
Years ago, when I worked at AWS, one of the new grads had to create a project for their onboarding. As part of the project, they created a recursive Lambda function that quickly spiralled out of control. If I remember correctly, this was before recursion detection, but still when Lambda had a limit of 1,000 invocations at any given time. This limit was the only thing that stopped the Lambdas from using the entire region's compute resources.
The good news for this Cloud Architect was that, since this was running in an internal account, the actual costs were zero. But if this had been an external customer account, there was nothing in place at the time to prevent this runaway cost scenario.
What people might find really interesting here is how auxiliary services such as KMS, CloudTrail, and CloudWatch account for a significant share of the cost alongside Lambda itself. This is because these services are enabled by default in a somewhat noisy configuration: when a Lambda function runs, it logs activity to CloudWatch, its API calls are logged in CloudTrail, and KMS is used if any encryption keys are required. CloudWatch is notorious for cost overruns because most of the time it's free or almost free, but once you pass the free tier limit the costs quickly skyrocket.
This little case study is why I will never recommend using a recursive Lambda in any context. The dangers are too great, and there are better alternatives that can be included in your design. So next time Copilot generates a Lambda for you, make sure that it doesn't call itself.
#serverless tip #1: HANDLING LONG PROCESSES .. As you' all know, AWS lambda functions come with a constraint of 15-minute processing limit. If you want it to handle more heavy-duty work, it's achievable through recursion. Here's my approach. #CodeNewbie #100DaysOfCode #nodejs
"""
Here's a common example: a factorial function where each multiplication is
handled in a new Lambda call, resulting in N invocations for a single result.
"""
import boto3
import json
import os

lambda_client = boto3.client('lambda')

def lambda_handler(event, context):
    n = event.get('n', 1)
    accumulator = event.get('accumulator', 1)

    if n <= 1:
        return {'result': accumulator}

    # Prepare next payload for recursive call
    next_event = {
        'n': n - 1,
        'accumulator': accumulator * n
    }

    # Invoke this Lambda function recursively
    response = lambda_client.invoke(
        FunctionName=os.environ['AWS_LAMBDA_FUNCTION_NAME'],
        InvocationType='RequestResponse',
        Payload=json.dumps(next_event)
    )

    # Read and return the result from the recursive call
    result_payload = json.load(response['Payload'])
    return result_payload
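For contrast, here is a minimal sketch of the same factorial calculation done as a plain loop inside a single invocation. It assumes the same `n` field in the event as the example above and needs no Lambda client at all.

```python
def lambda_handler(event, context):
    # Same event shape as the recursive example: {"n": <int>}
    n = event.get('n', 1)

    # Multiply inside one invocation instead of invoking the function N times
    accumulator = 1
    for factor in range(2, n + 1):
        accumulator *= factor

    return {'result': accumulator}
```

One invocation replaces N synchronous ones, so there is no chain of functions sitting idle (and being billed) while they wait on each other.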