

# COST 9. How do you manage demand, and supply resources?


For a workload that has balanced spend and performance, ensure that everything you pay for is used and avoid significantly underutilizing instances. A skewed utilization metric in either direction has an adverse impact on your organization, in either operational costs (degraded performance due to over-utilization), or wasted AWS expenditures (due to over-provisioning).

**Topics**
+ [COST09-BP01 Perform an analysis on the workload demand](cost_manage_demand_resources_cost_analysis.md)
+ [COST09-BP02 Implement a buffer or throttle to manage demand](cost_manage_demand_resources_buffer_throttle.md)
+ [COST09-BP03 Supply resources dynamically](cost_manage_demand_resources_dynamic.md)

# COST09-BP01 Perform an analysis on the workload demand

Analyze the demand of the workload over time. Verify that the analysis covers seasonal trends and accurately represents operating conditions over the full workload lifetime. Analysis effort should reflect the potential benefit; for example, time spent should be proportional to the workload cost.

 **Level of risk exposed if this best practice is not established:** High 

## Implementation guidance

Know the requirements of the workload. Your organization's requirements should indicate the workload response times for requests. The response time can be used to determine whether demand should be managed, or whether the supply of resources should change to meet the demand.

The analysis should include the predictability and repeatability of the demand, the rate of change in demand, and the amount of change in demand. Ensure that the analysis is performed over a long enough period to incorporate any seasonal variance, such as end-of-month processing or holiday peaks.

Ensure that the analysis effort reflects the potential benefits of implementing scaling. Look at the expected total cost of the component, and any increases or decreases in usage and cost over the workload lifetime.

You can use [AWS Cost Explorer](https://aws.amazon.com/aws-cost-management/aws-cost-explorer/) or [Amazon QuickSight](https://aws.amazon.com/quicksight/) with the AWS Cost and Usage Report (CUR) or your application logs to perform a visual analysis of workload demand.

**Implementation steps**
+ **Analyze existing workload data:** Analyze data from the existing workload, previous versions of the workload, or predicted usage patterns. Use log files and monitoring data to gain insight into how customers use the workload. Typical metrics are the actual demand in requests per second, the times when the rate of demand changes or sits at different levels, and the rate of change of demand. Ensure you analyze a full cycle of the workload, collecting data for any seasonal changes such as end-of-month or end-of-year events. The effort spent on the analysis should reflect the workload characteristics: place the greatest effort on high-value workloads that have the largest changes in demand, and the least effort on low-value workloads that have minimal changes in demand. Common measures of value are risk, brand awareness, revenue, or workload cost.
+ **Forecast outside influence:** Meet with team members from across the organization who can influence or change the demand on the workload. Common teams are sales, marketing, and business development. Work with them to understand the cycles they operate within, and whether there are any events that would change the demand on the workload. Forecast the workload demand with this data.
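The metrics listed in the steps above can be sketched with a short analysis script. This is a minimal illustration, assuming a hypothetical list of per-minute request counts already extracted from your logs or monitoring data; the `summarize_demand` helper is not part of any AWS tooling.

```python
from statistics import mean

def summarize_demand(requests_per_minute):
    """Summarize workload demand from a series of per-minute request counts.

    `requests_per_minute` is a hypothetical export from your log analysis;
    in practice this data would come from CloudWatch metrics or log files.
    """
    peak = max(requests_per_minute)
    average = mean(requests_per_minute)
    # Largest minute-to-minute change: a rough proxy for the rate of change
    # in demand, which drives how quickly scaling must react.
    max_delta = max(
        abs(b - a) for a, b in zip(requests_per_minute, requests_per_minute[1:])
    )
    return {
        "peak_rps": peak / 60,
        "average_rps": average / 60,
        "peak_to_average": peak / average,
        "max_delta_per_minute": max_delta,
    }

# Example: a quiet baseline with a short end-of-month spike.
demand = [600] * 10 + [3000, 3600, 3000] + [600] * 10
print(summarize_demand(demand))
```

A high peak-to-average ratio suggests dynamic or scheduled scaling will pay off; a flat profile suggests the analysis effort is better spent elsewhere.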

## Resources

 **Related documents:** 
+  [AWS Auto Scaling](https://aws.amazon.com/autoscaling/) 
+  [AWS Instance Scheduler](https://aws.amazon.com/answers/infrastructure-management/instance-scheduler/) 
+  [Getting started with Amazon SQS](https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-getting-started.html) 
+ [AWS Cost Explorer](https://aws.amazon.com/aws-cost-management/aws-cost-explorer/)
+ [Amazon QuickSight](https://aws.amazon.com/quicksight/)

# COST09-BP02 Implement a buffer or throttle to manage demand

 Buffering and throttling modify the demand on your workload, smoothing out any peaks. Implement throttling when your clients perform retries. Implement buffering to store the request and defer processing until a later time. Verify that your throttles and buffers are designed so clients receive a response in the required time. 

 **Level of risk exposed if this best practice is not established:** Low 

## Implementation guidance

**Throttling:** If the source of the demand has retry capability, you can implement throttling. Throttling signals to the source that a request cannot be serviced at the current time and that it should try again later. The source then waits for a period of time before retrying the request. Implementing throttling has the advantage of limiting the maximum amount of resources and costs of the workload. In AWS, you can use [Amazon API Gateway](https://aws.amazon.com/api-gateway/) to implement throttling. Refer to the [Well-Architected Reliability pillar whitepaper](https://docs.aws.amazon.com/wellarchitected/latest/reliability-pillar/welcome.html) for more details on implementing throttling.
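To illustrate the behavior a throttle provides, here is a minimal client-side token-bucket sketch in Python. The `rate` and `burst` values are made up, and this is only an illustration of the pattern, not the server-side enforcement that API Gateway performs for you.

```python
import time

class TokenBucket:
    """Minimal token-bucket throttle: admits up to `rate` requests per
    second, with a reserve of `burst` tokens for short spikes."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)
        self.clock = clock
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill tokens accrued since the last request, capped at burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True   # serve the request
        return False      # throttled: the client should retry later

# A fake clock makes the example deterministic: 20 simultaneous requests
# against a bucket with rate 10/s and burst 5 admit exactly 5.
now = [0.0]
bucket = TokenBucket(rate=10, burst=5, clock=lambda: now[0])
admitted = [bucket.allow() for _ in range(20)]
print(admitted.count(True))  # → 5
now[0] = 1.0                 # one second later the bucket has refilled
print(bucket.allow())        # → True
```

The rejected requests are the ones that must carry retry logic, ideally with backoff, which is why throttling only suits sources that can retry.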

**Buffer-based:** Similar to throttling, a buffer defers request processing, allowing applications that run at different rates to communicate effectively. A buffer-based approach uses a queue to accept messages (units of work) from producers. Consumers read and process the messages at a rate that meets their business requirements. Producers no longer have to deal with issues such as data durability and backpressure (where producers slow down because their consumer is running slowly).

In AWS, you can choose from multiple services to implement a buffering approach. [Amazon Simple Queue Service (Amazon SQS)](https://aws.amazon.com/sqs/) is a managed service that provides queues that allow a single consumer to read individual messages. [Amazon Kinesis](https://aws.amazon.com/kinesis/) provides a stream that allows many consumers to read the same messages.

When architecting with a buffer-based approach, ensure that you architect your workload to service the request in the required time, and that you are able to handle duplicate requests for work.
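The buffer-based pattern, including duplicate handling, can be sketched with Python's standard `queue` module standing in for Amazon SQS. This is a simplified single-process illustration; the duplicated message and the `processed_ids` idempotency check are hypothetical stand-ins for handling the at-least-once delivery semantics that real queues exhibit.

```python
import queue

def produce(buffer, messages):
    """Producer: enqueue (message_id, payload) units of work."""
    for msg in messages:
        buffer.put(msg)

def consume(buffer, processed_ids):
    """Consumer: drain the buffer at its own pace, skipping duplicates.

    Queues such as Amazon SQS deliver at least once, so the same message
    can appear twice; tracking processed IDs keeps handling idempotent.
    """
    results = []
    while not buffer.empty():
        msg_id, payload = buffer.get()
        if msg_id in processed_ids:
            continue  # duplicate delivery: this work was already done
        processed_ids.add(msg_id)
        results.append(payload.upper())  # stand-in for real processing
    return results

buffer = queue.Queue()
# "m2" is enqueued twice, simulating an at-least-once redelivery.
produce(buffer, [("m1", "order placed"), ("m2", "payment"), ("m2", "payment")])
print(consume(buffer, processed_ids=set()))  # → ['ORDER PLACED', 'PAYMENT']
```

The producer never blocks on the consumer's speed, which is the decoupling the buffer provides; the response-time requirement then constrains how deep the queue is allowed to grow.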

**Implementation steps**
+ **Analyze the client requirements:** Analyze the client requests to determine whether they are capable of performing retries. For clients that cannot perform retries, buffers must be implemented. Analyze the overall demand, rate of change, and required response time to determine the size of throttle or buffer required.
+ **Implement a buffer or throttle:** Implement a buffer or throttle in the workload. A queue such as Amazon Simple Queue Service (Amazon SQS) can provide a buffer for your workload components. Amazon API Gateway can provide throttling for your workload components.

## Resources

 **Related documents:** 
+  [AWS Auto Scaling](https://aws.amazon.com/autoscaling/) 
+  [AWS Instance Scheduler](https://aws.amazon.com/answers/infrastructure-management/instance-scheduler/) 
+  [Amazon API Gateway](https://aws.amazon.com/api-gateway/) 
+  [Amazon Simple Queue Service](https://aws.amazon.com/sqs/) 
+  [Getting started with Amazon SQS](https://aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-getting-started.html) 
+  [Amazon Kinesis](https://aws.amazon.com/kinesis/) 

# COST09-BP03 Supply resources dynamically

Resources are provisioned in a planned manner. This can be demand-based, such as through automatic scaling, or time-based, where demand is predictable and resources are provided based on time. These methods result in the least amount of over- or under-provisioning.

 **Level of risk exposed if this best practice is not established:** Low 

## Implementation guidance

You can use [AWS Auto Scaling](https://aws.amazon.com/autoscaling/), or incorporate scaling into your code with the [AWS APIs or SDKs](https://aws.amazon.com/developer/tools/). This reduces your overall workload costs by removing the operational overhead of manually making changes to your environment, and the changes can be applied much faster. It also ensures that workload resourcing best matches demand at any given time.

**Demand-based supply:** Leverage the elasticity of the cloud to supply resources that meet changing demand. Take advantage of APIs or service features to programmatically vary the amount of cloud resources in your architecture. This allows you to scale components of your architecture automatically, increasing the number of resources during demand spikes to maintain performance and decreasing capacity when demand subsides to reduce costs.

[AWS Auto Scaling](https://aws.amazon.com/autoscaling/) helps you adjust your capacity to maintain steady, predictable performance at the lowest possible cost. It is a fully managed and free service that integrates with Amazon Elastic Compute Cloud (Amazon EC2) instances and Spot Fleets, Amazon Elastic Container Service (Amazon ECS), Amazon DynamoDB, and Amazon Aurora.

AWS Auto Scaling provides automatic resource discovery to help find resources in your workload that can be configured. It has built-in scaling strategies that optimize performance, costs, or a balance between the two, and it provides predictive scaling to assist with regularly occurring spikes.

AWS Auto Scaling can implement manual, scheduled, or demand-based scaling. You can also use metrics and alarms from [Amazon CloudWatch](https://aws.amazon.com/cloudwatch/) to trigger scaling events for your workload. Typical metrics include standard Amazon EC2 metrics, such as CPU utilization and network throughput, and [Elastic Load Balancing (ELB)](https://aws.amazon.com/elasticloadbalancing/) observed request or response latency. When possible, use a metric that is indicative of customer experience, which is typically a custom metric that might originate from application code within your workload.

When architecting with a demand-based approach, keep in mind two key considerations. First, understand how quickly you must provision new resources. Second, understand that the margin between supply and demand will shift. You must be ready to cope with the rate of change in demand and also be ready for resource failures.
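To make the demand-based idea concrete, here is a sketch of the proportional arithmetic behind target-tracking-style scaling. The 50% CPU target, the capacity bounds, and the `desired_capacity` helper are illustrative assumptions, not the AWS Auto Scaling implementation itself.

```python
import math

def desired_capacity(current_capacity, metric_value, target_value,
                     min_capacity=1, max_capacity=20):
    """Target-tracking arithmetic: scale capacity in proportion to how far
    the observed metric is from its target, clamped to configured bounds."""
    desired = math.ceil(current_capacity * metric_value / target_value)
    return max(min_capacity, min(max_capacity, desired))

# 4 instances at 80% CPU against a 50% target -> scale out to 7.
print(desired_capacity(4, 80, 50))  # → 7
# 4 instances at 20% CPU -> scale in to 2.
print(desired_capacity(4, 20, 50))  # → 2
```

Rounding up biases the result toward spare capacity, and the clamp to `min_capacity`/`max_capacity` reflects the bounds you would set to cope with resource failures and runaway demand.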

[ELB](https://aws.amazon.com/elasticloadbalancing/) helps you to scale by distributing demand across multiple resources. As you implement more resources, you add them to the load balancer to take on the demand. Elastic Load Balancing has support for Amazon EC2 Instances, containers, IP addresses, and AWS Lambda functions.

**Time-based supply:** A time-based approach aligns resource capacity to demand that is predictable or well-defined by time. This approach is typically not dependent upon utilization levels of the resources. A time-based approach ensures that resources are available at the specific time they are required, and can be provided without any delays due to start-up procedures and system or consistency checks. Using a time-based approach, you can provide additional resources or increase capacity during busy periods.

You can use scheduled Auto Scaling to implement a time-based approach. Workloads can be scheduled to scale out or in at defined times (for example, the start of business hours), ensuring that resources are available when users or demand arrive.
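A time-based supply can be sketched as a simple function of the hour of day. The business hours (08:00 to 18:00) and capacity figures below are made-up examples; in practice you would encode the equivalent schedule as scheduled actions in AWS Auto Scaling rather than in application code.

```python
def scheduled_capacity(hour, business_start=8, business_end=18,
                       peak_capacity=10, off_peak_capacity=2):
    """Return the planned capacity for a given hour of day under a
    time-based schedule: full capacity in business hours, minimal after."""
    if business_start <= hour < business_end:
        return peak_capacity
    return off_peak_capacity

# Capacity plan for a full day: scale out at 08:00, scale in at 18:00.
schedule = {hour: scheduled_capacity(hour) for hour in range(24)}
print(schedule[9], schedule[22])  # → 10 2
```

Note that the schedule ignores actual utilization entirely, which is exactly the trade-off of the time-based approach: it only works while the usage pattern stays consistent.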

You can also leverage the [AWS APIs and SDKs](https://aws.amazon.com/developer/tools/) and [AWS CloudFormation](https://aws.amazon.com/cloudformation/) to automatically provision and decommission entire environments as you need them. This approach is well suited for development or test environments that run only in defined business hours or periods of time.

You can use APIs to scale the size of resources within an environment (vertical scaling). For example, you could scale up a production workload by changing the instance size or class. This can be achieved by stopping and starting the instance and selecting the different instance size or class. This technique can also be applied to other resources, such as Amazon Elastic Block Store (Amazon EBS) Elastic Volumes, which can be modified to increase size, adjust performance (IOPS) or change the volume type while in use.

When architecting with a time-based approach, keep in mind two key considerations. First, how consistent is the usage pattern? Second, what is the impact if the pattern changes? You can increase the accuracy of predictions by monitoring your workloads and by using business intelligence. If you see significant changes in the usage pattern, adjust the times to ensure that coverage is provided.

**Implementation steps**
+ **Configure time-based scheduling:** For predictable changes in demand, time-based scaling can provide the correct number of resources in a timely manner. It is also useful when resource creation and configuration are not fast enough to respond to changes in demand. Using the workload analysis, configure scheduled scaling with AWS Auto Scaling.
+ **Configure Auto Scaling:** To configure scaling based on active workload metrics, use AWS Auto Scaling. Use the analysis to configure Auto Scaling to trigger at the correct resource levels, and verify that the workload scales in the required time.

## Resources

 **Related documents:** 
+  [AWS Auto Scaling](https://aws.amazon.com/autoscaling/) 
+  [AWS Instance Scheduler](https://aws.amazon.com/answers/infrastructure-management/instance-scheduler/) 
+  [Getting Started with Amazon EC2 Auto Scaling](https://docs.aws.amazon.com/autoscaling/ec2/userguide/GettingStartedTutorial.html) 
+  [Getting started with Amazon SQS](https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-getting-started.html) 
+  [Scheduled Scaling for Amazon EC2 Auto Scaling](https://docs.aws.amazon.com/autoscaling/ec2/userguide/schedule_time.html) 