Blind spot scenario - AWS Prescriptive Guidance

Blind spot scenario

This section describes a hypothetical scenario in which an engineering team makes an IT decision that creates a blind spot, with unintended consequences that can harm the organization. The scenario assumes that an Amazon Elastic Block Store (Amazon EBS) volume is created in the AWS Cloud by using AWS CloudFormation as part of an infrastructure as code (IaC) approach.

In this example, assume that an engineering team writes and deploys a fully tested program that responds automatically when the used space on the volume exceeds 80% of its capacity (an arbitrary threshold). The program responds to the event by calling APIs to increase the volume size. The engineering team expects the volume to keep growing over time without causing a production outage. The team solves the storage problem, but another subtle and often neglected problem emerges: drift.
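The resize decision described above can be sketched as a small pure function. This is a minimal sketch, not the team's actual program: the 80% threshold comes from the scenario, while the function name and the 20% growth step are assumptions made for illustration.

```python
USAGE_THRESHOLD_PERCENT = 80  # threshold from the scenario (arbitrary)
GROWTH_PERCENT = 20           # assumption: grow by 20% per event; not from the guide


def next_volume_size_gib(current_size_gib: int, used_gib: float) -> int:
    """Return the size the automation would request.

    If usage is at or below the threshold, the current size is returned
    unchanged. Otherwise the size grows by GROWTH_PERCENT, rounded up.
    """
    if current_size_gib <= 0:
        raise ValueError("current_size_gib must be positive")
    # Compare in percent units to avoid floating-point division.
    if used_gib * 100 <= current_size_gib * USAGE_THRESHOLD_PERCENT:
        return current_size_gib
    # Integer ceiling division, so the step is always at least 1 GiB.
    step = (current_size_gib * GROWTH_PERCENT + 99) // 100
    return current_size_gib + step
```

The function only decides on a target size. How that size is applied (directly through an API or through CloudFormation) is what determines whether drift is introduced.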

Note

This is only one example; other scenarios are possible. For example, you could replace the EBS volume with a security group for which the engineering team creates an ingress rule that allows traffic from a dynamic IP address.

Unintentional consequences

Drift occurs when you change the properties of a provisioned CloudFormation resource outside of the IaC system. In this scenario, engineers use an API to increase the EBS volume size. The drift introduced by using the API instead of CloudFormation can lead to unintended problems, and the issue is magnified when the same stack is later updated as part of a DevOps process.

Because the volume size changed outside of the code repository, the CloudFormation template doesn't reflect the change. Subsequent deployments of the stack can therefore fail until the new volume size is incorporated into the template. More importantly, if engineers are unaware that the volume resource was modified, their templates won't include the necessary updates.
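One way to surface this kind of unnoticed change is CloudFormation drift detection. The following is a minimal sketch, assuming `cfn` is a boto3 CloudFormation client (for example, `boto3.client("cloudformation")`) and that the resource type in question supports drift detection. The function name and polling defaults are illustrative, and pagination of the results is omitted for brevity.

```python
import time


def find_drifted_resources(cfn, stack_name, poll_seconds=2.0, max_polls=30):
    """Run drift detection and return (logical ID, property differences) pairs.

    `cfn` is assumed to be a boto3 CloudFormation client. It is passed in
    so that the sketch can also be exercised with a stub.
    """
    detection_id = cfn.detect_stack_drift(StackName=stack_name)[
        "StackDriftDetectionId"
    ]

    # Drift detection is asynchronous, so poll until it finishes.
    for _ in range(max_polls):
        status = cfn.describe_stack_drift_detection_status(
            StackDriftDetectionId=detection_id
        )
        if status["DetectionStatus"] != "DETECTION_IN_PROGRESS":
            break
        time.sleep(poll_seconds)
    else:
        raise TimeoutError("drift detection did not finish in time")

    if status["DetectionStatus"] == "DETECTION_FAILED":
        raise RuntimeError(
            status.get("DetectionStatusReason", "drift detection failed")
        )

    # Only the first page of results is read in this sketch.
    drifts = cfn.describe_stack_resource_drifts(
        StackName=stack_name,
        StackResourceDriftStatusFilters=["MODIFIED", "DELETED"],
    )["StackResourceDrifts"]
    return [
        (d["LogicalResourceId"], d.get("PropertyDifferences", [])) for d in drifts
    ]
```

Each property difference reported by CloudFormation includes the expected (template) value and the actual value, which is the information engineers need to bring the template back in line.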

Volumes are particularly vulnerable to drift. After you increase the size of an EBS volume, you can't decrease it. As a result, a stack update that specifies the original, smaller size not only fails but can also fail to roll back, because the smaller size can't be applied to the volume.
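Because a volume can't shrink, a simple guard before deployment is to compare the size in the template with the size of the live volume. The following is a minimal sketch, assuming `ec2` is a boto3 EC2 client (for example, `boto3.client("ec2")`); the function name is hypothetical.

```python
def template_size_is_deployable(ec2, volume_id, template_size_gib):
    """Return True if the template size is not smaller than the live volume.

    `ec2` is assumed to be a boto3 EC2 client. A template size below the
    actual size would require shrinking the volume, which EBS doesn't allow.
    """
    volumes = ec2.describe_volumes(VolumeIds=[volume_id])["Volumes"]
    actual_size_gib = volumes[0]["Size"]  # size in GiB, as reported by EC2
    return template_size_gib >= actual_size_gib
```

A check like this could run in a pipeline before the stack update, so that a stale template is caught before the deployment fails.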

Problematic approach

The following diagram shows a problematic scenario. As part of the workflow, the user follows an IaC approach and provisions an EBS volume by using a CloudFormation template. The production monitoring team uses an EDP automation approach and creates an Amazon CloudWatch Events rule. This rule is configured to invoke an AWS Lambda function when the EBS volume reaches a specified threshold. The Lambda function then calls the Amazon EC2 API to increase the EBS volume size.

Problematic scenario with an EBS volume
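The problematic Lambda function in this scenario might look like the following. This is a minimal sketch of the anti-pattern, not code from the guide: the event fields `volume_id` and `new_size_gib` are hypothetical, and `ec2` is assumed to be a boto3 EC2 client.

```python
def resize_volume_directly(ec2, volume_id, new_size_gib):
    """Anti-pattern: resize the volume through the EC2 API.

    CloudFormation isn't involved, so the stack template keeps the old
    size and the stack drifts.
    """
    response = ec2.modify_volume(VolumeId=volume_id, Size=new_size_gib)
    return response["VolumeModification"]["TargetSize"]


def lambda_handler(event, context):
    # boto3 is imported here because it is provided by the Lambda
    # Python runtime; the helper above has no import-time dependency.
    import boto3

    ec2 = boto3.client("ec2")
    # Hypothetical event shape; a real rule would deliver its own format.
    return resize_volume_directly(ec2, event["volume_id"], event["new_size_gib"])
```

The call succeeds and the storage problem is solved, which is why the resulting drift is easy to overlook.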

The following diagram shows a recommended approach that aligns with the best practices covered in the Best practices section of this guide. This approach involves greater complexity and more AWS services, including Amazon Simple Notification Service (Amazon SNS) for sending notifications, AWS CodeCommit for source control, AWS CodePipeline for automated code deployment, and AWS Step Functions for serverless workflow orchestration. In this approach, CodePipeline adds the EBS volume size to Parameter Store (a capability of AWS Systems Manager) initially, and then Step Functions makes any subsequent updates.

Parameter Store retains previous parameter values, which can be used to assess changes. Parameter Store also integrates with multiple services, such as CloudFormation and Lambda. The Lambda function in this scenario doesn't interact directly with the EBS volume to increase its size. Instead, the Lambda function calls the CloudFormation API to update the stack, and CloudFormation applies the new size to the volume. As a result, CloudFormation stacks are safeguarded from drift. This approach relies on CloudFormation as the single source for performing updates.

Recommended scenario with EBS volume
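The two updates in the recommended approach can be sketched as follows. This is a minimal sketch under stated assumptions, not the guide's implementation: `ssm` and `cfn` are assumed to be boto3 Systems Manager and CloudFormation clients, the stack template is assumed to expose a hypothetical `VolumeSize` parameter, and the function names are illustrative. The Step Functions orchestration, notifications, and pipeline are omitted.

```python
def record_volume_size(ssm, parameter_name, new_size_gib):
    """Store the new size in Parameter Store (the step that the workflow performs)."""
    ssm.put_parameter(
        Name=parameter_name,
        Value=str(new_size_gib),
        Type="String",
        Overwrite=True,  # creates a new parameter version
    )


def request_stack_resize(cfn, stack_name, new_size_gib, other_parameter_keys=()):
    """Ask CloudFormation to apply the new size, so the stack doesn't drift.

    Assumes that the template has a `VolumeSize` parameter (hypothetical).
    All other stack parameters keep their previous values.
    """
    parameters = [
        {"ParameterKey": "VolumeSize", "ParameterValue": str(new_size_gib)}
    ]
    parameters += [
        {"ParameterKey": key, "UsePreviousValue": True}
        for key in other_parameter_keys
    ]
    response = cfn.update_stack(
        StackName=stack_name,
        UsePreviousTemplate=True,
        Parameters=parameters,
    )
    return response["StackId"]
```

Because the size change flows through `update_stack`, the stack's recorded configuration and the live volume stay in agreement, which is the property that the direct API call in the problematic scenario loses.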