☁️ AWS Lambda: Concurrency, Throttling, and Versions Explained

Understanding Lambda Concurrency Types, Throttling, and Versioning with Examples

Sai Manasa
6 min read · Jul 30, 2024

Hello World… In the ever-evolving landscape of serverless computing, AWS Lambda offers powerful tools for building and managing scalable applications. Understanding key concepts such as concurrency, throttling, and versioning is crucial for optimizing Lambda function performance and ensuring smooth deployments.

Let’s get started…

Overview:

  • Concurrency
  • Throttling
  • Example: Concurrency and Throttling
  • Types of Concurrency Controls
  • Versions and Aliases

Concurrency:

  • Concurrency is the number of requests a Lambda function is handling at the same moment. Each in-flight request is served by its own instance (execution environment) of the function.
  • Here, a request is an event that triggers the Lambda function.
  • Default: 1,000 concurrent executions per AWS account per Region, shared by all functions in that Region.
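
A common rule of thumb for estimating how much concurrency a function needs is to multiply the request rate by the average time each request takes. The sketch below applies that estimate; the traffic figures and the name `estimate_concurrency` are made up for illustration.

```python
import math


def estimate_concurrency(requests_per_second: float, avg_duration_seconds: float) -> int:
    """Estimate in-flight requests: arrival rate times average time per request."""
    return math.ceil(requests_per_second * avg_duration_seconds)


ACCOUNT_LIMIT = 1000  # default per-Region quota mentioned above

# Hypothetical traffic: 200 requests per second, each taking 0.5 s on average.
needed = estimate_concurrency(200, 0.5)
print(needed, needed <= ACCOUNT_LIMIT)  # 100 True
```

With these numbers the function needs roughly 100 concurrent executions, well inside the default limit.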

Throttling:

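  • Throttling occurs when the Lambda service rejects an invocation because no concurrency is available for it.
  • The caller receives a TooManyRequestsException (HTTP 429), typically with the message “Rate Exceeded”.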
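
When an invocation is throttled, a common response on the caller's side is to retry after a short, growing delay. The sketch below shows that pattern in plain Python. `ThrottledError`, `invoke_with_backoff`, and `flaky_invoke` are hypothetical names used for illustration and are not part of any AWS SDK.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for the throttling error a caller would receive."""


def invoke_with_backoff(invoke, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call invoke(); on throttling, wait with exponential backoff plus jitter, then retry."""
    for attempt in range(max_attempts):
        try:
            return invoke()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the throttle to the caller
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay))


# Demo: a fake invocation that is throttled twice and then succeeds.
calls = {"count": 0}


def flaky_invoke():
    calls["count"] += 1
    if calls["count"] <= 2:
        raise ThrottledError("Rate Exceeded")
    return "ok"


result = invoke_with_backoff(flaky_invoke, sleep=lambda seconds: None)
print(result, calls["count"])  # ok 3
```

The `sleep` parameter is injectable only so the demo runs instantly; a real caller would leave the default.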

Example: Concurrency and Throttling

  • Uploading an image to S3: You have an S3 bucket configured to trigger a Lambda function whenever an image is uploaded. The Lambda function, ImageResize, processes the uploaded images.
  • Initial Setup: By default, an account has a concurrency limit of 1,000 concurrent executions for Lambda functions.
  • High Volume of Image Uploads: Suppose 1,200 images are uploaded to the S3 bucket in a short period. Each image triggers the ImageResize Lambda function to process it.
  • Concurrent Executions: If no other Lambda functions are using the concurrency pool, the ImageResize function can scale up to handle these requests. It will start executing multiple instances to process the images. With a concurrency limit of 1,000, AWS Lambda will handle up to 1,000 image processing requests simultaneously.
  • Throttling: While 1,000 executions are in flight, the remaining 200 invocations are throttled because the concurrency limit of 1,000 has been reached. Because S3 invokes Lambda asynchronously, these events are not dropped right away: Lambda keeps them in its internal queue and retries them as existing executions complete and free up concurrency slots.
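
The arithmetic of this example can be sketched with a small model. It is a simplification that only counts free slots and ignores how quickly Lambda can start new instances; `admit_requests` is a hypothetical helper, not an AWS API.

```python
def admit_requests(incoming: int, in_flight: int, limit: int = 1000):
    """Return (accepted, throttled) for a burst arriving while in_flight requests are running."""
    free_slots = max(limit - in_flight, 0)
    accepted = min(incoming, free_slots)
    return accepted, incoming - accepted


# 1,200 uploads arrive while nothing else is running.
accepted, throttled = admit_requests(incoming=1200, in_flight=0)
print(accepted, throttled)  # 1000 200

# Later, 300 executions have finished, so 700 are still in flight.
# The 200 retried events now fit into the free slots.
retry_accepted, retry_throttled = admit_requests(incoming=200, in_flight=700)
print(retry_accepted, retry_throttled)  # 200 0
```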

Types of Concurrency controls:

  1. Unreserved Concurrency: It refers to the default pool of concurrent executions available to Lambda functions that do not have any reserved concurrency settings applied.
    This pool is shared across all functions in your AWS account. By default, all Lambda functions share the unreserved concurrency pool.
    If you have multiple Lambda functions, they compete for this pool of concurrency.
    The account-level limit can be raised by requesting a quota increase through Service Quotas or an AWS Support ticket.
    Best for general use where you do not have specific requirements for handling high traffic or need guaranteed capacity.
    Example: Suppose your AWS account has a total concurrency limit of 1,000. If you have three Lambda functions:
    - FunctionA can use up to 1,000 concurrent executions if it’s the only function running.
    - If FunctionA is using 600 concurrent executions, the remaining 400 are available for FunctionB and FunctionC.
    - If FunctionB and FunctionC receive requests, they will draw from the remaining 400 unreserved concurrency slots. If their combined requests exceed 400, some of these requests will be throttled.
    Pros:
    - Cost-Effective: No additional cost for setting up.
    - Flexible: Automatically scales based on demand, using available concurrency from the shared pool.
    Cons:
    - No Guarantees: No guaranteed capacity; if other functions consume a lot of concurrency, your function might be throttled.
  2. Reserved Concurrency: It allows you to allocate a specific number of concurrent executions exclusively for a particular Lambda function. This ensures that the function can always scale up to the reserved limit, regardless of how other functions are utilizing the concurrency pool. The reserved value is also the function’s ceiling: the function cannot scale beyond it, even when the unreserved pool has spare capacity.
    It prevents other functions from using these reserved slots, protecting critical functions from being throttled.
    Best for applications that require guaranteed capacity to ensure critical functions can always scale up and perform reliably.
    Example: Let’s say you have the same total concurrency limit of 1,000 and you allocate:
    - 200 concurrent executions as reserved for FunctionA.
    - This means FunctionA can use up to 200 concurrent executions at any time, and never more than 200.
    - FunctionB and FunctionC share the remaining 800 concurrent executions, whether or not FunctionA is actually using its 200.
    - If FunctionA is processing images and has a reserved concurrency of 200, it will always have access to these 200 slots. Even if other functions exhaust the unreserved pool, FunctionA will not be affected, and it will continue processing up to 200 requests simultaneously.
    Pros:
    - Guaranteed Capacity: Ensures that the function always has access to a specific number of concurrent executions.
    - Prevents Starvation: Protects critical functions from being throttled by other functions using up concurrency.
    Cons:
    - Limited Flexibility: The reserved concurrency is not available to other functions, which shrinks the unreserved pool for the rest of the account. The function itself is throttled once it reaches its reserved value.
  3. Provisioned Concurrency: It keeps a specified number of Lambda function instances pre-warmed and ready to handle requests immediately.
    This reduces cold start latency by ensuring that a certain number of instances are always initialized and ready.
    It is configured on a published version or an alias, not on $LATEST.
    It helps functions with predictable or high-traffic patterns by minimizing the delay caused by cold starts.
    Best for applications that require low latency and consistent performance, especially where cold start times can impact user experience.
    Example: Let’s say FunctionA is used for real-time image processing and you expect high traffic. You configure provisioned concurrency with a value of 50.
    - AWS Lambda will maintain 50 pre-warmed instances of FunctionA at all times.
    - When an image is uploaded, one of these 50 pre-warmed instances handles the request immediately, reducing latency compared to starting a new instance from scratch. Requests beyond those 50 are served by on-demand instances, which may incur a cold start.
    Pros:
    - Low Latency: Reduces cold start time by keeping a specified number of instances pre-warmed.
    - Predictable Performance: Ensures consistent performance even under variable load.
    Cons:
    - Higher Cost: Involves additional cost as you are paying for the pre-warmed instances.
    - Resource Usage: Consumes part of your concurrency pool regardless of actual traffic.
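
The unreserved and reserved examples above come down to simple pool arithmetic, which the following sketch models. It is a simplified illustration only: `unreserved_pool` and `available_to` are hypothetical helpers, and provisioned concurrency is not modeled.

```python
ACCOUNT_LIMIT = 1000  # total concurrency used in the examples above


def unreserved_pool(account_limit: int, reserved: dict) -> int:
    """Concurrency left in the shared pool after subtracting every reservation."""
    return account_limit - sum(reserved.values())


def available_to(function_name, account_limit, reserved, unreserved_in_use=0):
    """Upper bound on a function's concurrency at this moment."""
    if function_name in reserved:
        # A reservation is both a guarantee and a cap.
        return reserved[function_name]
    free = unreserved_pool(account_limit, reserved) - unreserved_in_use
    return max(free, 0)


# Unreserved example: FunctionA already uses 600 slots of the shared pool.
print(available_to("FunctionB", ACCOUNT_LIMIT, {}, unreserved_in_use=600))  # 400

# Reserved example: 200 slots are set aside for FunctionA.
reserved = {"FunctionA": 200}
print(available_to("FunctionA", ACCOUNT_LIMIT, reserved))  # 200
print(available_to("FunctionB", ACCOUNT_LIMIT, reserved))  # 800

# Even if the unreserved pool is exhausted, FunctionA keeps its 200.
print(available_to("FunctionB", ACCOUNT_LIMIT, reserved, unreserved_in_use=800))  # 0
print(available_to("FunctionA", ACCOUNT_LIMIT, reserved, unreserved_in_use=800))  # 200
```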

Versions:

  • Versioning allows you to manage different versions of your Lambda function code and configuration.
  • This is useful for deploying changes, testing, and rolling back if needed.
  • Each version represents a specific snapshot of your function’s code and configuration at a given time.
  • Working:
    Initial Setup: When you create a Lambda function, its only version is $LATEST, the unpublished, editable copy that always reflects the most recent changes made to the function.
    Creating Versions: You can publish a version of your Lambda function by creating a new, immutable snapshot of the code and configuration. Each version is identified by a unique number (e.g., 1, 2, 3, etc.). Once published, a version cannot be modified. If you need to make changes, you create a new version.
    Aliases: Aliases are pointers to specific versions of your Lambda function. They allow you to manage different environments (e.g., dev, test, prod) and make deployments more manageable. For example, you can create an alias named PROD that points to version 3 of your function and another alias named STAGING that points to version 2.
  • Managing Versions and Aliases:
    Creating a Version: To publish a new version, use the AWS Management Console, CLI, or SDK. The new version will be immutable.
    Creating an Alias: You can create an alias that points to a specific version. Callers invoke the alias ARN, which stays the same while the version behind it changes, so requests can be routed to a different version without touching the calling code.
    Updating an Alias: You can update an alias to point to a new version when you deploy new changes.
  • Example: You have a Lambda function that processes user data, and you want to deploy a new feature without disrupting the current production environment.
    Initial Version: You start with version 1 of your Lambda function which is used in the production environment.
    Develop New Features: You develop a new feature and test it in the $LATEST version of your function.
    Publish New Version: Once you are satisfied with the new feature, you publish a new version of the function (e.g., version 2).
    Create Aliases: Make sure an alias named PROD points to version 1, and create another alias named TEST that points to version 2 so the new feature can be tested in isolation.
    Deploy to Production: After thorough testing, you update the PROD alias to point to version 2, making the new feature live for all users.
    Roll Back if Needed: If any issues arise, you can quickly roll back by updating the PROD alias to point back to version 1.
  • Benefits: Deployment Management, Quick Rollbacks, Testing.
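
The publish, promote, and roll back steps above could be scripted roughly as follows. This is a minimal sketch, assuming `lambda_client` is a boto3 Lambda client (for example from `boto3.client("lambda")`) and that the PROD alias already exists. The helper names `publish_and_promote` and `roll_back` are hypothetical.

```python
def publish_and_promote(lambda_client, function_name: str, alias: str = "PROD"):
    """Publish $LATEST as a new version and point the alias at it.

    Returns (previous_version, new_version) so the caller can roll back later.
    """
    # Remember which version the alias points to before the deployment.
    previous = lambda_client.get_alias(
        FunctionName=function_name, Name=alias
    )["FunctionVersion"]

    # Snapshot the current code and configuration as an immutable version.
    new = lambda_client.publish_version(FunctionName=function_name)["Version"]

    # Move the alias; callers using the alias ARN now reach the new version.
    lambda_client.update_alias(
        FunctionName=function_name, Name=alias, FunctionVersion=new
    )
    return previous, new


def roll_back(lambda_client, function_name: str, alias: str, version: str):
    """Point the alias back at an earlier version."""
    lambda_client.update_alias(
        FunctionName=function_name, Name=alias, FunctionVersion=version
    )
```

Passing the client in as a parameter keeps the helpers easy to exercise against a stub before running them against a real account.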

Let’s Connect:

Feel free to reach out, share your thoughts, or ask any questions. I’m excited to engage with you and learn from each other as we navigate this exciting field!

LinkedIn: Sai Manasa

GitHub: Sai Manasa

Happy Throttling😄

Happy Learning💻
