

# REL 7. How do you design your workload to adapt to changes in demand?


A scalable workload provides elasticity to add or remove resources automatically so that they closely match the current demand at any given point in time.

**Topics**
+ [

# REL07-BP01 Use automation when obtaining or scaling resources
](rel_adapt_to_changes_autoscale_adapt.md)
+ [

# REL07-BP02 Obtain resources upon detection of impairment to a workload
](rel_adapt_to_changes_reactive_adapt_auto.md)
+ [

# REL07-BP03 Obtain resources upon detection that more resources are needed for a workload
](rel_adapt_to_changes_proactive_adapt_auto.md)
+ [

# REL07-BP04 Load test your workload
](rel_adapt_to_changes_load_tested_adapt.md)

# REL07-BP01 Use automation when obtaining or scaling resources
REL07-BP01 Use automation when obtaining or scaling resources

 When replacing impaired resources or scaling your workload, automate the process by using managed AWS services, such as Amazon S3 and AWS Auto Scaling. You can also use third-party tools and AWS SDKs to automate scaling. 

 Managed AWS services include Amazon S3, Amazon CloudFront, AWS Auto Scaling, AWS Lambda, Amazon DynamoDB, AWS Fargate, and Amazon Route 53. 

 AWS Auto Scaling lets you detect and replace impaired instances. It also lets you build scaling plans for resources including [Amazon EC2](https://aws.amazon.com/ec2/) instances and Spot Fleets, [Amazon ECS](https://aws.amazon.com/ecs/) tasks, [Amazon DynamoDB](https://aws.amazon.com/dynamodb/) tables and indexes, and [Amazon Aurora](https://aws.amazon.com/aurora/) Replicas. 

 When scaling EC2 instances, ensure that you use multiple Availability Zones (preferably at least three) and add or remove capacity to maintain balance across these Availability Zones. ECS tasks or Kubernetes pods (when using Amazon Elastic Kubernetes Service) should also be distributed across multiple Availability Zones. 

 When using AWS Lambda, instances scale automatically. Every time an event notification is received for your function, AWS Lambda quickly locates free capacity within its compute fleet and runs your code up to the allocated concurrency. You need to ensure that the necessary concurrency is configured on the specific Lambda, and in your Service Quotas. 

 Amazon S3 automatically scales to handle high request rates. For example, your application can achieve at least 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second per prefix in a bucket. There are no limits to the number of prefixes in a bucket. You can increase your read or write performance by parallelizing reads. For example, if you create 10 prefixes in an Amazon S3 bucket to parallelize reads, you could scale your read performance to 55,000 read requests per second. 

 Configure and use Amazon CloudFront or a trusted content delivery network (CDN). A CDN can provide faster end-user response times and can serve requests for content from cache, therefore reducing the need to scale your workload. 

 **Common anti-patterns:** 
+  Implementing Auto Scaling groups for automated healing, but not implementing elasticity. 
+  Using automatic scaling to respond to large increases in traffic. 
+  Deploying highly stateful applications, eliminating the option of elasticity. 

 **Benefits of establishing this best practice:** Automation removes the potential for manual error in deploying and decommissioning resources. Automation removes the risk of cost overruns and denial of service due to slow response on needs for deployment or decommissioning. 

 **Level of risk exposed if this best practice is not established:** High 

## Implementation guidance
Implementation guidance
+  Configure and use AWS Auto Scaling. This monitors your applications and automatically adjusts capacity to maintain steady, predictable performance at the lowest possible cost. Using AWS Auto Scaling, you can setup application scaling for multiple resources across multiple services. 
  +  [What is AWS Auto Scaling?](https://docs.aws.amazon.com/autoscaling/plans/userguide/what-is-aws-auto-scaling.html) 
    +  Configure Auto Scaling on your Amazon EC2 instances and Spot Fleets, Amazon ECS tasks, Amazon DynamoDB tables and indexes, Amazon Aurora Replicas, and AWS Marketplace appliances as applicable. 
      +  [Managing throughput capacity automatically with DynamoDB Auto Scaling](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/AutoScaling.html) 
        +  Use service API operations to specify the alarms, scaling policies, warm up times, and cool down times. 
+  Use Elastic Load Balancing. Load balancers can distribute load by path or by network connectivity. 
  +  [What is Elastic Load Balancing?](https://docs.aws.amazon.com/elasticloadbalancing/latest/userguide/what-is-load-balancing.html) 
    +  Application Load Balancers can distribute load by path. 
      +  [What is an Application Load Balancer?](https://docs.aws.amazon.com/elasticloadbalancing/latest/application/introduction.html) 
        +  Configure an Application Load Balancer to distribute traffic to different workloads based on the path under the domain name. 
        +  Application Load Balancers can be used to distribute loads in a manner that integrates with AWS Auto Scaling to manage demand. 
          +  [Using a load balancer with an Auto Scaling group](https://docs.aws.amazon.com/autoscaling/ec2/userguide/autoscaling-load-balancer.html) 
    +  Network Load Balancers can distribute load by connection. 
      +  [What is a Network Load Balancer?](https://docs.aws.amazon.com/elasticloadbalancing/latest/network/introduction.html) 
        +  Configure a Network Load Balancer to distribute traffic to different workloads using TCP, or to have a constant set of IP addresses for your workload. 
        +  Network Load Balancers can be used to distribute loads in a manner that integrates with AWS Auto Scaling to manage demand. 
+  Use a highly available DNS provider. DNS names allow your users to enter names instead of IP addresses to access your workloads and distributes this information to a defined scope, usually globally for users of the workload. 
  +  Use Amazon Route 53 or a trusted DNS provider. 
    +  [What is Amazon Route 53?](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/Welcome.html) 
  +  Use Route 53 to manage your CloudFront distributions and load balancers. 
    +  Determine the domains and subdomains you are going to manage. 
    +  Create appropriate record sets using ALIAS or CNAME records. 
      +  [Working with records](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/rrsets-working-with.html) 
+  Use the AWS global network to optimize the path from your users to your applications. AWS Global Accelerator continually monitors the health of your application endpoints and redirects traffic to healthy endpoints in less than 30 seconds. 
  +  AWS Global Accelerator is a service that improves the availability and performance of your applications with local or global users. It provides static IP addresses that act as a fixed entry point to your application endpoints in a single or multiple AWS Regions, such as your Application Load Balancers, Network Load Balancers or Amazon EC2 instances. 
    +  [What Is AWS Global Accelerator?](https://docs.aws.amazon.com/global-accelerator/latest/dg/what-is-global-accelerator.html) 
+  Configure and use Amazon CloudFront or a trusted content delivery network (CDN). A content delivery network can provide faster end-user response times and can serve requests for content that may cause unnecessary scaling of your workloads. 
  +  [What is Amazon CloudFront?](https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Introduction.html) 
    +  Configure Amazon CloudFront distributions for your workloads, or use a third-party CDN. 
      +  You can limit access to your workloads so that they are only accessible from CloudFront by using the IP ranges for CloudFront in your endpoint security groups or access policies. 

## Resources
Resources

 **Related documents:** 
+  [APN Partner: partners that can help you create automated compute solutions](https://aws.amazon.com/partners/find/results/?facets=%27Product%20:%20Compute%27) 
+  [AWS Auto Scaling: How Scaling Plans Work](https://docs.aws.amazon.com/autoscaling/plans/userguide/how-it-works.html) 
+  [AWS Marketplace: products that can be used with auto scaling](https://aws.amazon.com/marketplace/search/results?searchTerms=Auto+Scaling) 
+  [Managing Throughput Capacity Automatically with DynamoDB Auto Scaling](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/AutoScaling.html) 
+  [Using a load balancer with an Auto Scaling group](https://docs.aws.amazon.com/autoscaling/ec2/userguide/autoscaling-load-balancer.html) 
+  [What Is AWS Global Accelerator?](https://docs.aws.amazon.com/global-accelerator/latest/dg/what-is-global-accelerator.html) 
+  [What Is Amazon EC2 Auto Scaling?](https://docs.aws.amazon.com/autoscaling/ec2/userguide/what-is-amazon-ec2-auto-scaling.html) 
+  [What is AWS Auto Scaling?](https://docs.aws.amazon.com/autoscaling/plans/userguide/what-is-aws-auto-scaling.html) 
+  [What is Amazon CloudFront?](https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Introduction.html?ref=wellarchitected) 
+  [What is Amazon Route 53?](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/Welcome.html) 
+  [What is Elastic Load Balancing?](https://docs.aws.amazon.com/elasticloadbalancing/latest/userguide/what-is-load-balancing.html) 
+  [What is a Network Load Balancer?](https://docs.aws.amazon.com/elasticloadbalancing/latest/network/introduction.html) 
+  [What is an Application Load Balancer?](https://docs.aws.amazon.com/elasticloadbalancing/latest/application/introduction.html) 
+  [Working with records](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/rrsets-working-with.html) 

# REL07-BP02 Obtain resources upon detection of impairment to a workload
REL07-BP02 Obtain resources upon detection of impairment to a workload

 Scale resources reactively when necessary if availability is impacted, to restore workload availability. 

 You first must configure health checks and the criteria on these checks to indicate when availability is impacted by lack of resources. Then, either notify the appropriate personnel to manually scale the resource, or start automation to automatically scale it. 

 Scale can be manually adjusted for your workload (for example, changing the number of EC2 instances in an Auto Scaling group, or modifying throughput of a DynamoDB table through the AWS Management Console or AWS CLI). However, automation should be used whenever possible (refer to **Use automation when obtaining or scaling resources**). 

 **Desired outcome:** Scaling activities (either automatically or manually) are initiated to restore availability upon detection of a failure or degraded customer experience. 

 **Level of risk exposed if this best practice is not established:** Medium 

## Implementation guidance
Implementation guidance

 Implement observability and monitoring across all components in your workload, to monitor customer experience and detect failure. Define the procedures, manual or automated, that scale the required resources. o For more information, see [REL11-BP01 Monitor all components of the workload to detect failures](https://docs.aws.amazon.com/wellarchitected/latest/reliability-pillar/rel_withstand_component_failures_monitoring_health.html). 

### Implementation steps
Implementation steps
+  Define the procedures, manual or automated, that scale the required resources. 
  +  Scaling procedures depend on how the different components within your workload are designed. 
  +  Scaling procedures also vary depending on the underlying technology utilized. 
    +  Components using AWS Auto Scaling can use scaling plans to configure a set of instructions for scaling your resources. If you work with AWS CloudFormation or add tags to AWS resources, you can set up scaling plans for different sets of resources per application. Auto Scaling provides recommendations for scaling strategies customized to each resource. After you create your scaling plan, Auto Scaling combines dynamic scaling and predictive scaling methods together to support your scaling strategy. For more detail, see [How scaling plans work](https://docs.aws.amazon.com/autoscaling/plans/userguide/how-it-works.html). 
    +  Amazon EC2 Auto Scaling verifies that you have the correct number of Amazon EC2 instances available to handle the load for your application. You create collections of EC2 instances, called Auto Scaling groups. You can specify the minimum and maximum number of instances in each Auto Scaling group, and Amazon EC2 Auto Scaling ensures that your group never goes below or above these limits. For more detail, see [What is Amazon EC2 Auto Scaling?](https://docs.aws.amazon.com/autoscaling/ec2/userguide/what-is-amazon-ec2-auto-scaling.html) 
    +  Amazon DynamoDB auto scaling uses the Application Auto Scaling service to dynamically adjust provisioned throughput capacity on your behalf, in response to actual traffic patterns. This allows a table or a global secondary index to increase its provisioned read and write capacity to handle sudden increases in traffic, without throttling. For more detail, see [Managing throughput capacity automatically with DynamoDB auto scaling](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/AutoScaling.html). 

## Resources
Resources

 **Related best practices:** 
+ [ REL07-BP01 Use automation when obtaining or scaling resources ](https://docs.aws.amazon.com/wellarchitected/latest/reliability-pillar/rel_adapt_to_changes_autoscale_adapt.html)
+  [REL11-BP01 Monitor all components of the workload to detect failures](https://docs.aws.amazon.com/wellarchitected/latest/reliability-pillar/rel_withstand_component_failures_monitoring_health.html) 

 **Related documents:** 
+  [AWS Auto Scaling: How Scaling Plans Work](https://docs.aws.amazon.com/autoscaling/plans/userguide/how-it-works.html) 
+  [Managing Throughput Capacity Automatically with DynamoDB Auto Scaling](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/AutoScaling.html) 
+  [What Is Amazon EC2 Auto Scaling?](https://docs.aws.amazon.com/autoscaling/ec2/userguide/what-is-amazon-ec2-auto-scaling.html) 

# REL07-BP03 Obtain resources upon detection that more resources are needed for a workload
REL07-BP03 Obtain resources upon detection that more resources are needed for a workload

 Scale resources proactively to meet demand and avoid availability impact. 

 Many AWS services automatically scale to meet demand. If using Amazon EC2 instances or Amazon ECS clusters, you can configure automatic scaling of these to occur based on usage metrics that correspond to demand for your workload. For Amazon EC2, average CPU utilization, load balancer request count, or network bandwidth can be used to scale out (or scale in) EC2 instances. For Amazon ECS, average CPU utilization, load balancer request count, and memory utilization can be used to scale out (or scale in) ECS tasks. Using Target Auto Scaling on AWS, the autoscaler acts like a household thermostat, adding or removing resources to maintain the target value (for example, 70% CPU utilization) that you specify. 

 Amazon EC2 Auto Scaling can also do [Predictive Auto Scaling](https://docs.aws.amazon.com/autoscaling/ec2/userguide/ec2-auto-scaling-predictive-scaling.html), which uses machine learning to analyze each resource's historical workload and regularly forecasts the future load. 

 Little’s Law helps calculate how many instances of compute (EC2 instances, concurrent Lambda functions, etc.) that you need. 

 *L* = *λW* 

 L = number of instances (or mean concurrency in the system) 

 λ = mean rate at which requests arrive (req/sec) 

 W = mean time that each request spends in the system (sec) 

 For example, at 100 rps, if each request takes 0.5 seconds to process, you will need 50 instances to keep up with demand. 

 **Level of risk exposed if this best practice is not established:** Medium 

## Implementation guidance
Implementation guidance
+  Obtain resources upon detection that more resources are needed for a workload. Scale resources proactively to meet demand and avoid availability impact. 
  +  Calculate how many compute resources you will need (compute concurrency) to handle a given request rate. 
    +  [Telling Stories About Little's Law](https://brooker.co.za/blog/2018/06/20/littles-law.html) 
  +  When you have a historical pattern for usage, set up scheduled scaling for Amazon EC2 auto scaling. 
    +  [Scheduled Scaling for Amazon EC2 Auto Scaling](https://docs.aws.amazon.com/autoscaling/ec2/userguide/schedule_time.html) 
  +  Use AWS predictive scaling. 
    +  [Predictive scaling for Amazon EC2 Auto Scaling](https://docs.aws.amazon.com/autoscaling/ec2/userguide/ec2-auto-scaling-predictive-scaling.html) 

## Resources
Resources

 **Related documents:** 
+  [AWS Marketplace: products that can be used with auto scaling](https://aws.amazon.com/marketplace/search/results?searchTerms=Auto+Scaling) 
+  [Managing Throughput Capacity Automatically with DynamoDB Auto Scaling](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/AutoScaling.html) 
+  [Predictive Scaling for EC2, Powered by Machine Learning](https://aws.amazon.com/blogs/aws/new-predictive-scaling-for-ec2-powered-by-machine-learning/) 
+  [Scheduled Scaling for Amazon EC2 Auto Scaling](https://docs.aws.amazon.com/autoscaling/ec2/userguide/schedule_time.html) 
+  [Telling Stories About Little's Law](https://brooker.co.za/blog/2018/06/20/littles-law.html) 
+  [What Is Amazon EC2 Auto Scaling?](https://docs.aws.amazon.com/autoscaling/ec2/userguide/what-is-amazon-ec2-auto-scaling.html) 

# REL07-BP04 Load test your workload
REL07-BP04 Load test your workload

 Adopt a load testing methodology to measure if scaling activity meets workload requirements. 

 It’s important to perform sustained load testing. Load tests should discover the breaking point and test the performance of your workload. AWS makes it easy to set up temporary testing environments that model the scale of your production workload. In the cloud, you can create a production-scale test environment on demand, complete your testing, and then decommission the resources. Because you only pay for the test environment when it's running, you can simulate your live environment for a fraction of the cost of testing on premises. 

 Load testing in production should also be considered as part of game days where the production system is stressed, during hours of lower customer usage, with all personnel on hand to interpret results and address any problems that arise. 

 **Common anti-patterns:** 
+  Performing load testing on deployments that are not the same configuration as your production. 
+  Performing load testing only on individual pieces of your workload, and not on the entire workload. 
+  Performing load testing with a subset of requests and not a representative set of actual requests. 
+  Performing load testing to a small safety factor above expected load. 

 **Benefits of establishing this best practice:** You know what components in your architecture fail under load and be able to identify what metrics to watch to indicate that you are approaching that load in time to address the problem, preventing the impact of that failure. 

 **Level of risk exposed if this best practice is not established:** Medium 

## Implementation guidance
Implementation guidance
+  Perform load testing to identify which aspect of your workload indicates that you must add or remove capacity. Load testing should have representative traffic similar to what you receive in production. Increase the load while watching the metrics you have instrumented to determine which metric indicates when you must add or remove resources. 
  +  [Distributed Load Testing on AWS: simulate thousands of connected users](https://aws.amazon.com/solutions/distributed-load-testing-on-aws/) 
    +  Identify the mix of requests. You may have varied mixes of requests, so you should look at various time frames when identifying the mix of traffic. 
    +  Implement a load driver. You can use custom code, open source, or commercial software to implement a load driver. 
    +  Load test initially using small capacity. You see some immediate effects by driving load onto a lesser capacity, possibly as small as one instance or container. 
    +  Load test against larger capacity. The effects will be different on a distributed load, so you must test against as close to a product environment as possible. 

## Resources
Resources

 **Related documents:** 
+  [Distributed Load Testing on AWS: simulate thousands of connected users](https://aws.amazon.com/solutions/distributed-load-testing-on-aws/) 
+  [Load testing applications](https://docs.aws.amazon.com/prescriptive-guidance/latest/load-testing/welcome.html) 

 **Related videos:** 
+  [AWS Summit ANZ 2023: Accelerate with confidence through AWS Distributed Load Testing](https://www.youtube.com/watch?v=4J6lVqa6Yh8) 