Secure sensitive data in CloudWatch Logs by using Amazon Macie - AWS Prescriptive Guidance

Anisha Salunkhe, Omar Franco, and David Guardiola, Amazon Web Services

Summary

This pattern shows you how to use Amazon Macie to automatically detect sensitive data in an Amazon CloudWatch Logs log group by implementing a comprehensive security monitoring workflow. The solution uses Amazon Data Firehose to stream CloudWatch Logs entries to Amazon Simple Storage Service (Amazon S3). Macie periodically scans this bucket for personally identifiable information (PII), financial data, and other sensitive content. The infrastructure is deployed through an AWS CloudFormation template that provisions all necessary AWS services and configurations.

CloudWatch Logs often contains application data that can inadvertently include sensitive user information. This can create compliance and security risks. Traditional log monitoring approaches lack automated sensitive data detection, which makes it difficult to identify and respond to potential data exposures in real time.

This pattern helps security teams and compliance officers maintain data confidentiality by providing automated detection and alerting for sensitive data in logging systems. This solution enables proactive incident response through Amazon Simple Notification Service (Amazon SNS) notifications, and it automatically isolates sensitive data to a secure Amazon S3 bucket. You can customize the detection patterns and integrate the workflow with your existing security operations processes.

Prerequisites and limitations

Prerequisites

  • An active AWS account

  • Permissions to create a CloudFormation stack

  • A CloudWatch Logs log group that you want to monitor

  • An active email address to receive notifications from Amazon SNS

  • Access to AWS CloudShell

  • (Optional) Access to the AWS Command Line Interface (AWS CLI), installed and configured

Limitations

  • Macie is subject to service quotas. For more information, see Quotas for Macie in the Macie documentation.

Architecture

Target architecture

The following diagram shows the workflow for using Macie to examine CloudWatch Logs log entries for sensitive data.

 

The workflow shows the following steps:

  1. The CloudWatch Logs log group generates the logs, which are subject to the subscription filter.

  2. The subscription filter forwards the logs to Amazon Data Firehose.

  3. The logs are encrypted with an AWS Key Management Service (AWS KMS) key when they pass through the Amazon Data Firehose delivery stream.

  4. The delivery stream delivers the logs to the exported logs bucket in Amazon S3.

  5. At 4 AM each day, Amazon EventBridge initiates an AWS Lambda function that starts a Macie scan for sensitive data in the exported logs bucket.

  6. If Macie identifies sensitive data in the bucket, a Lambda function removes the log from the exported logs bucket and encrypts it with an AWS KMS key.

  7. The Lambda function isolates the logs that contain sensitive data in the data isolation bucket.

  8. When sensitive data is identified, a notification is published to an Amazon SNS topic.

  9. Amazon SNS sends an email notification to an email address that you configure with information about the logs that contain sensitive data.
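
The daily schedule in step 5 can be expressed as an EventBridge cron expression. The following is a minimal sketch, not the template's actual definition. The function names daily_cron and put_daily_rule and any rule name you pass in are hypothetical. It assumes the schedule is evaluated in UTC, which is how EventBridge rules evaluate cron expressions; this pattern does not state the time zone of the 4 AM run.

```shell
# Sketch only: helper names are hypothetical and the schedule is an assumption,
# not a value read from the pattern's CloudFormation template.

# Print an EventBridge cron expression that fires once a day at the given hour (UTC).
# EventBridge cron fields: minutes hours day-of-month month day-of-week year.
daily_cron() {
  case "$1" in
    [0-9]|1[0-9]|2[0-3]) ;;   # accept 0-23 without leading zeros
    *) echo "hour must be an integer from 0 to 23" >&2; return 1 ;;
  esac
  printf 'cron(0 %s * * ? *)\n' "$1"
}

# Create or update a scheduled rule with that expression.
# Arguments: rule name (hypothetical), hour, AWS Region.
put_daily_rule() {
  local rule_name="$1" hour="$2" region="$3" schedule
  schedule="$(daily_cron "$hour")" || return 1
  aws events put-rule \
    --name "$rule_name" \
    --schedule-expression "$schedule" \
    --state ENABLED \
    --region "$region"
}
```

For example, daily_cron 4 prints cron(0 4 * * ? *). A rule does nothing until it has a target, such as the Lambda function that starts the Macie scan. In this pattern, the CloudFormation template manages both the rule and its target.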

Deployed resources

The CloudFormation template deploys the following resources in your target AWS account and AWS Region:

  • Two Amazon S3 buckets:

    • An exported logs bucket for storing the CloudWatch Logs data

    • A data isolation bucket to store the sensitive information

  • An Amazon EventBridge rule that responds to Macie findings

  • AWS Lambda functions that initiate events and export logs to Amazon S3 buckets

  • An Amazon SNS topic and subscription

  • An Amazon Data Firehose stream

  • A Macie session

  • A Macie custom data identifier

  • A CloudWatch Logs subscription filter

  • AWS KMS keys to encrypt the logs stored in the buckets

  • The necessary AWS Identity and Access Management (IAM) roles and policies for the solution
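
A Macie custom data identifier is built around a regular expression. The following sketch shows one way to try out an expression before you change the template. The EMP- format and the helper names are hypothetical; the identifier that this solution deploys might use a different expression. The Macie call assumes that Macie is already enabled in the Region.

```shell
# Hypothetical format: "EMP-" followed by six digits. The solution's real
# custom data identifier may use a different expression.
EMPLOYEE_ID_REGEX='EMP-[0-9]{6}'

# Quick local check with grep before involving Macie.
# Returns 0 if the sample text contains a match.
matches_locally() {
  local regex="$1" sample="$2"
  printf '%s\n' "$sample" | grep -Eq -- "$regex"
}

# Ask Macie to evaluate the same expression against sample text.
# Prints the number of matches that Macie reports.
test_with_macie() {
  local regex="$1" sample="$2" region="$3"
  aws macie2 test-custom-data-identifier \
    --regex "$regex" \
    --sample-text "$sample" \
    --region "$region" \
    --query 'matchCount' \
    --output text
}
```

For example, matches_locally "$EMPLOYEE_ID_REGEX" 'login employee=EMP-123456' succeeds, and test_with_macie with the same arguments plus a Region should report at least one match. Macie supports a subset of regular expression syntax, so confirm an expression with Macie even if it passes the local check.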

Tools

AWS services

  • AWS CloudFormation helps you set up AWS resources, provision them quickly and consistently, and manage them throughout their lifecycle across AWS accounts and AWS Regions.

  • Amazon CloudWatch Logs helps you centralize the logs from all your systems, applications, and AWS services so you can monitor them and archive them securely.

  • Amazon Data Firehose helps you deliver real-time streaming data to other AWS services, custom HTTP endpoints, and HTTP endpoints owned by supported third-party service providers.

  • Amazon EventBridge is a serverless event bus service that helps you connect your applications with real-time data from a variety of sources, such as AWS Lambda functions, HTTP invocation endpoints using API destinations, or event buses in other AWS accounts.

  • AWS Key Management Service (AWS KMS) helps you create and control cryptographic keys to help protect your data.

  • AWS Lambda is a compute service that helps you run code without needing to provision or manage servers. It runs your code only when needed and scales automatically, so you pay only for the compute time that you use.

  • Amazon Macie helps you discover sensitive data, provides visibility into data security risks, and enables automated protection against those risks.

  • Amazon Simple Notification Service (Amazon SNS) helps you coordinate and manage the exchange of messages between publishers and clients, including web servers and email addresses.

  • Amazon Simple Storage Service (Amazon S3) is a cloud-based object storage service that helps you store, protect, and retrieve any amount of data.

Code repository

The code for this pattern is available in the GitHub sample-macie-for-securing-cloudwatch-logs repository.

Best practices

Follow the CloudFormation best practices in the CloudFormation documentation.

Epics

Task | Description | Skills required

Clone the code repository.

Enter the following command to clone the repository to your local workstation:

git clone https://github.com/aws-samples/sample-macie-for-securing-cloudwatch-logs
App developer

(Optional) Edit the CloudFormation template.

  1. Open the main.yaml file.

  2. Customize the template by doing any of the following:

    • You can rename resources.

    • In the parameter section, you can modify the default values.

    • You can change the subscription filter pattern. For more information, see Log group-level subscription filters in the CloudWatch Logs documentation.

  3. Save and close the main.yaml file.

App developer
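
Before you change the subscription filter pattern in the template, it can help to preview which existing log events a candidate pattern matches. The following sketch assumes that the pattern you test uses the same filter pattern syntax that CloudWatch Logs applies to subscription filters. The helper name preview_filter_pattern is hypothetical.

```shell
# Sketch only: prints up to 20 messages from the log group that match a
# candidate filter pattern. Nothing is changed in the account.
# Arguments: log group name, filter pattern, AWS Region.
preview_filter_pattern() {
  local log_group="$1" pattern="$2" region="$3"
  aws logs filter-log-events \
    --log-group-name "$log_group" \
    --filter-pattern "$pattern" \
    --max-items 20 \
    --region "$region" \
    --query 'events[].message' \
    --output text
}

# Example with placeholders:
#   preview_filter_pattern "<path for log group>" '"ERROR"' "<region>"
```

An empty pattern matches every event, so a pattern that returns far fewer messages than you expect is worth checking before you redeploy the stack.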

Option 1 – Deploy using script with command-line parameters.

Enter the following command to deploy the solution by using command line parameters, where the value for enable-macie is true only if Amazon Macie is not already enabled:

./scripts/test-macie-solution.sh --deploy-stack \
  --stack-name <stack name> \
  --email <email address> \
  --enable-macie <true or false> \
  --region <region> \
  --resource-name <prefix for all resources> \
  --bucket-name <bucket name>
General AWS

Option 2 – Deploy using script with environment variables.

  1. Enter the following commands to define the environment variables, where the value for ENABLE_MACIE is true only if Amazon Macie is not already enabled:

    export STACK_NAME=<stack name>
    export SNS_EMAIL=<email address>
    export ENABLE_MACIE=<true or false>
    export REGION=<region>
    export RESOURCE_NAME=<prefix for all resources>
    export BUCKET_NAME=<bucket name>
  2. Enter the following command to validate the parameters before deployment:

    ./scripts/test-macie-solution.sh \
      --validate-params \
      --email <email address> \
      --region <region>
  3. Enter the following command to deploy the solution:

    ./scripts/test-macie-solution.sh --deploy-stack
General AWS

Option 3 – Deploy using the AWS CLI.

Enter the following command to deploy the solution by using the AWS CLI, where the value for EnableMacie is true only if Amazon Macie is not already enabled:

aws cloudformation create-stack \
  --region us-east-1 \
  --stack-name macie-for-securing-cloudwatch-logs \
  --template-body file://app/main.yml \
  --capabilities CAPABILITY_IAM \
  --parameters \
    ParameterKey=ResourceName,ParameterValue=<prefix for all resources> \
    ParameterKey=BucketName,ParameterValue=<bucket name> \
    ParameterKey=LogGroupName,ParameterValue=<path for log group> \
    ParameterKey=SNSTopicEndpointEmail,ParameterValue=<email address> \
    ParameterKey=EnableMacie,ParameterValue=<true or false>

Option 4 – Deploy through the AWS Management Console.

  1. Open the AWS CloudFormation console.

  2. On the navigation bar at the top of the screen, choose the AWS Region to create the stack in.

  3. On the Stacks page, choose Create stack at top right, and then choose With new resources (standard).

  4. On the Create stack page, for Prerequisite - Prepare template, choose Choose an existing template.

  5. Under Specify template, choose Upload a template file, and then upload the main.yaml template from your cloned repository.

  6. Choose Next.

  7. On the Specify stack details page, in the Stack name box, enter a stack name.

  8. In the Parameters section, specify values for the following template parameters.

    • ResourceName: Prefix for all resources

    • BucketName: Unique name for the Amazon S3 bucket

    • LogGroupName: Log group name for CloudWatch Logs

    • SNSTopicEndpointEmail: Email address for notifications

    • EnableMacie: Set to true if Macie is not already enabled

    • (Optional) Region: The AWS Region where you want to deploy the stack.

    • (Optional) TemplatePath: Path to CloudFormation template

  9. Choose Next.

  10. For Capabilities, choose I acknowledge that this template may create IAM resources to specify that you want to use IAM resources in the template.

  11. Choose Next.

  12. On the Review and create page, review the details of your stack.

  13. Choose Submit to launch your stack.

General AWS

Monitor the deployment status and confirm deployment.

  1. Enter the following command to monitor the deployment status:

    ./scripts/test-macie-solution.sh \
      --deployment-status \
      --stack-name <stack name>

    Note: You can also monitor the progress and status of the stack creation on the Events tab for your new stack. For more information, see Monitor stack progress.

  2. When the status changes to CREATE_COMPLETE, review the stack outputs for the resource information.

General AWS
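
If you deployed with the AWS CLI instead of the script, you can wait for CREATE_COMPLETE and then print the stack outputs directly. This is a minimal sketch. The helper name wait_and_show_outputs is hypothetical, and the output keys that appear depend on the template.

```shell
# Sketch only: waits for stack creation to finish, then lists the outputs.
# Arguments: stack name, AWS Region.
wait_and_show_outputs() {
  local stack_name="$1" region="$2"

  # Blocks until the stack reaches CREATE_COMPLETE. Returns non-zero if
  # creation fails, rolls back, or the wait times out.
  aws cloudformation wait stack-create-complete \
    --stack-name "$stack_name" \
    --region "$region" || return 1

  # Print each output key and value as a table.
  aws cloudformation describe-stacks \
    --stack-name "$stack_name" \
    --region "$region" \
    --query 'Stacks[0].Outputs[].[OutputKey,OutputValue]' \
    --output table
}

# Example with placeholders:
#   wait_and_show_outputs "<stack name>" "<region>"
```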

Confirm the Amazon SNS subscription.

Follow the instructions in Confirm your Amazon SNS subscription in the Amazon SNS documentation to confirm your Amazon SNS subscription.

App developer
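
To check from the command line whether the email subscription was confirmed, you can list the subscriptions for the topic. Amazon SNS reports PendingConfirmation in place of a subscription ARN until the recipient confirms. This is a sketch; the helper names are hypothetical, and you need the topic ARN, for example from the stack resources in the CloudFormation console.

```shell
# Sketch only: prints "endpoint<TAB>subscription ARN" for each subscription.
# Arguments: SNS topic ARN, AWS Region.
subscription_status() {
  local topic_arn="$1" region="$2"
  aws sns list-subscriptions-by-topic \
    --topic-arn "$topic_arn" \
    --region "$region" \
    --query 'Subscriptions[].[Endpoint,SubscriptionArn]' \
    --output text
}

# Reads the lines printed by subscription_status and prints only the
# endpoints that have not confirmed yet.
pending_endpoints() {
  awk -F '\t' '$2 == "PendingConfirmation" { print $1 }'
}

# Example with placeholders:
#   subscription_status "<topic ARN>" "<region>" | pending_endpoints
```

If the pipeline prints nothing, every subscription on the topic is confirmed.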
Task | Description | Skills required

Option 1 – Test with automated reporting.

If you used the default stack name, enter the following command to test the solution:

./scripts/test-macie-solution.sh \
  --full-test

If you used a custom stack name, enter the following command to test the solution:

./scripts/test-macie-solution.sh \
  --full-test \
  --stack-name <stack name>

If you used a custom stack name and custom parameters, enter the following command to test the solution:

./scripts/test-macie-solution.sh --full-test \
  --stack-name <stack name> \
  --region <region> \
  --log-group <log group path>
General AWS

Option 2 – Test with targeted validation.

  1. Enter the following command to generate test data with sensitive information:

    ./scripts/test-macie-solution.sh \
      --generate-test-data \
      --stack-name <stack name>

    This command does the following:

    • Creates CloudWatch Logs log entries that contain realistic sensitive data patterns, including employee IDs, patent IDs, credit card numbers, social security numbers, and email addresses

    • Generates both sensitive and non-sensitive log entries for comprehensive testing

    • Provides detailed logging of the test data generation process

  2. Enter the following command to verify the data pipeline flow:

    ./scripts/test-macie-solution.sh \
      --verify-pipeline \
      --stack-name <stack name>

    This command does the following:

    • Confirms that the CloudWatch Logs entries are streamed to Amazon Data Firehose

    • Validates that the log data was delivered to the Amazon S3 bucket with proper encryption

    • Checks that the Amazon S3 object storage has the correct prefix structure

    • Verifies the encryption status of the stored objects

    • Monitors data flow timing and provides wait periods for processing

  3. Enter the following command to initiate the Macie classification job:

    ./scripts/test-macie-solution.sh \
      --trigger-macie-job \
      --stack-name <stack name>

    This command does the following:

    • Manually triggers the Macie classification job through a Lambda function

    • Monitors the job execution status and provides feedback

    • Validates Macie service availability before execution

    • Handles cases where Macie is not enabled in the account

    • Provides detailed job execution results

  4. Enter the following command to validate the alerting and data isolation:

    ./scripts/test-macie-solution.sh \
      --verify-alerts \
      --stack-name <stack name>

    This command does the following:

    • Confirms that EventBridge rules are properly configured and active

    • Validates the Amazon SNS topic configuration and subscription status

    • Checks the data isolation bucket setup and permissions

    • Monitors for sensitive data movement to the isolation bucket

    • Verifies complete alerting workflow functionality

General AWS
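
If you want to send a single hand-written test entry instead of using the script's generated data, you can write it to the monitored log group with the AWS CLI. The following is a minimal sketch. The helper name put_test_event and the stream name you pass in are hypothetical, and the sketch makes no claim about whether a particular message is detected by Macie. Use synthetic values only.

```shell
# Sketch only: writes one log event to a log stream in the monitored log group.
# Arguments: log group name, log stream name, message text, AWS Region.
put_test_event() {
  local log_group="$1" stream="$2" message="$3" region="$4"
  local now_ms escaped events

  # CloudWatch Logs expects the timestamp in milliseconds since the epoch.
  now_ms="$(( $(date +%s) * 1000 ))"

  # Escape backslashes and double quotes so the message is valid inside JSON.
  escaped="$(printf '%s' "$message" | sed 's/\\/\\\\/g; s/"/\\"/g')"
  events="$(printf '[{"timestamp": %s, "message": "%s"}]' "$now_ms" "$escaped")"

  # Create the stream; ignore the error if it already exists.
  aws logs create-log-stream \
    --log-group-name "$log_group" \
    --log-stream-name "$stream" \
    --region "$region" 2>/dev/null || true

  aws logs put-log-events \
    --log-group-name "$log_group" \
    --log-stream-name "$stream" \
    --region "$region" \
    --log-events "$events"
}

# Example with placeholders:
#   put_test_event "<path for log group>" "manual-test" "synthetic test entry" "<region>"
```

After the entry is delivered to the exported logs bucket, you can start the classification job as described in step 3 of Option 2.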
Task | Description | Skills required

Option 1 – Perform automated cleanup.

If you used the default stack name, enter the following command to delete the stack:

./scripts/cleanup-macie-solution.sh \
  --full-cleanup

If you used a custom stack name, enter the following command to delete the stack:

./scripts/cleanup-macie-solution.sh \
  --full-cleanup \
  --stack-name <stack name>

If you used a custom stack name and custom parameters, enter the following command to delete the stack:

./scripts/cleanup-macie-solution.sh \
  --full-cleanup \
  --stack-name <stack name> \
  --region <region> \
  --disable-macie <true or false>
General AWS

Option 2 – Perform step-by-step cleanup.

  1. Enter the following command to stop active processes:

    ./scripts/cleanup-macie-solution.sh \
      --stop-processes \
      --stack-name <stack name>

    This command does the following:

    • Turns off EventBridge rules to prevent new job executions

    • Stops any currently running Macie classification jobs

    • Cancels pending Macie job executions

    • Clears any pending Amazon SNS messages in the queue

    • Provides status updates for each stopped process

  2. Enter the following command to empty the Amazon S3 buckets:

    ./scripts/cleanup-macie-solution.sh \
      --empty-buckets \
      --stack-name <stack name>

    This command does the following:

    • Removes all objects from the bucket for CloudWatch Logs

    • Removes all objects from the data isolation Amazon S3 bucket

    • Deletes any incomplete multipart uploads

    • Handles versioned objects if Amazon S3 versioning is enabled

    • Provides object count and deletion progress updates

  3. Enter the following command to delete the CloudFormation stack:

    ./scripts/cleanup-macie-solution.sh \
      --delete-stack \
      --stack-name <stack name>

    This command does the following:

    • Initiates the CloudFormation stack deletion process

    • Monitors the deletion progress with real-time status updates

    • Handles deletion failures with detailed error reporting

    • Waits for complete stack removal before proceeding

    • Provides stack event history for troubleshooting

  4. Enter the following command to clean up the Macie resources:

    ./scripts/cleanup-macie-solution.sh \
      --cleanup-macie \
      --stack-name <stack name>

    This command does the following:

    • Removes custom data identifiers created by the solution

    • Cleans up any remaining Macie job artifacts and findings

    • Disables the Macie session if it was enabled by the stack

    • Handles cases where Macie resources are shared with other applications

    • Provides the detailed cleanup status for each Macie component

General AWS

Verify cleanup.

  1. Enter the following command to verify that the stack was deleted:

    aws cloudformation describe-stacks \
      --stack-name <stack name> \
      --region <region>
  2. Enter the following command to verify that the Amazon S3 buckets were deleted:

    aws s3 ls | grep macie
  3. Enter the following command to verify that the Macie custom data identifiers were removed:

    aws macie2 list-custom-data-identifiers \
      --region <region>
  4. Enter the following command to check for any remaining resources:

    ./scripts/cleanup-macie-solution.sh \
      --verify-cleanup \
      --stack-name <stack name>
General AWS
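
When a stack has been deleted, describe-stacks with the stack name returns an error that says the stack does not exist, rather than an empty result. The following sketch turns that behavior into a simple check. The helper name stack_is_gone is hypothetical, and the check relies on the wording of the AWS CLI error message.

```shell
# Sketch only: succeeds (returns 0) when the stack no longer exists.
# Returns 1 if the stack is still present and 2 on any other error,
# such as missing permissions.
# Arguments: stack name, AWS Region.
stack_is_gone() {
  local stack_name="$1" region="$2" err

  # Capture only the error stream; discard the normal JSON output.
  if err="$(aws cloudformation describe-stacks \
      --stack-name "$stack_name" \
      --region "$region" 2>&1 >/dev/null)"; then
    return 1
  fi

  case "$err" in
    *"does not exist"*) return 0 ;;
    *) echo "$err" >&2; return 2 ;;
  esac
}

# Example with placeholders:
#   stack_is_gone "<stack name>" "<region>" && echo "Stack deleted"
```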

Troubleshooting

Issue | Solution

CloudFormation stack status shows CREATE_FAILED.

The CloudFormation template is configured to publish logs to CloudWatch Logs. You can view the logs in the AWS Management Console so that you don't have to connect to your Amazon EC2 instance. For more information, see View CloudFormation logs in the console (AWS blog post).

CloudFormation delete-stack command fails.

Some resources must be empty before they can be deleted. For example, you must delete all objects in an Amazon S3 bucket or remove all instances in an Amazon EC2 security group before you can delete the bucket or security group. For more information, see Delete stack fails in the CloudFormation documentation.

Error when parsing a parameter.

When you use the AWS CLI or the CloudFormation console to pass in a value, enclose the value in quotation marks.
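
When a stack shows CREATE_FAILED or a deletion fails, the stack events usually name the resource that failed and the reason. The following sketch lists only the failed events. The helper name show_failed_events is hypothetical.

```shell
# Sketch only: prints the stack events whose status contains "FAILED",
# with the resource name and the reason that CloudFormation recorded.
# Arguments: stack name, AWS Region.
show_failed_events() {
  local stack_name="$1" region="$2"
  aws cloudformation describe-stack-events \
    --stack-name "$stack_name" \
    --region "$region" \
    --query "StackEvents[?contains(ResourceStatus, 'FAILED')].[Timestamp,LogicalResourceId,ResourceStatus,ResourceStatusReason]" \
    --output table
}

# Example with placeholders:
#   show_failed_events "<stack name>" "<region>"
```

The earliest failed event is typically the root cause; later failures are often resources that were cancelled as a result.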

Related resources