Combining alarms
With CloudWatch, you can combine several alarms into one composite alarm to create a summarized, aggregated health indicator over a whole application or group of resources. Composite alarms are alarms that determine their state by monitoring the states of other alarms. You define rules to combine the status of those monitored alarms using Boolean logic.
You can use composite alarms to reduce alarm noise by taking actions only at an aggregated level. For example, you can create a composite alarm that sends a notification to your web server team if any alarm related to your web server triggers. When any of those alarms goes into the ALARM state, the composite alarm itself goes into the ALARM state and sends a notification to your team. If other alarms related to your web server also go into the ALARM state, your team is not overloaded with new notifications, because the composite alarm has already notified them about the existing situation.
You can also use composite alarms to create complex alarming conditions and take actions only when several different conditions are met. For example, you can create a composite alarm that combines a CPU alarm and a memory alarm, and notifies your team only if both the CPU alarm and the memory alarm have triggered.
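As a minimal sketch of the CPU-and-memory example, the following Python builds the parameters for a composite alarm that is in ALARM only when every underlying alarm is in ALARM. The alarm names, the SNS topic ARN, and the helper name `build_composite_alarm_request` are hypothetical and chosen for illustration; the rule uses the ALARM function and the AND operator described under Rule expressions.

```python
def build_composite_alarm_request(name, child_alarms, topic_arn):
    """Return keyword arguments for a PutCompositeAlarm request.

    The composite alarm goes into ALARM only when every child alarm is in ALARM.
    """
    # ALARM("name") is TRUE while the named alarm is in the ALARM state.
    rule = " AND ".join(f'ALARM("{child}")' for child in child_alarms)
    return {
        "AlarmName": name,
        "AlarmRule": rule,
        "AlarmActions": [topic_arn],  # notify only at the composite level
        "ActionsEnabled": True,
    }


# Hypothetical alarm names and SNS topic ARN, for illustration only.
request = build_composite_alarm_request(
    "WebServerCpuAndMemory",
    ["WebServerCpuHigh", "WebServerMemoryHigh"],
    "arn:aws:sns:us-east-1:111122223333:web-server-team",
)
print(request["AlarmRule"])
```

Assuming boto3 is installed and credentials are configured, the resulting dictionary could be passed to the CloudWatch client as `boto3.client("cloudwatch").put_composite_alarm(**request)`.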
Using composite alarms
When you use composite alarms, you have two options:
- Configure the actions you want to take only at the composite alarm level, and create the underlying monitored alarms without actions.
- Configure a different set of actions at the composite alarm level. For example, the composite alarm actions could engage a different team in case of a widespread issue.
Composite alarms can take only the following actions:
- Notify Amazon SNS topics
- Invoke Lambda functions
- Create OpsItems in Systems Manager Ops Center
- Create incidents in Systems Manager Incident Manager
Note
All the underlying alarms in your composite alarm must be in the same account and the same Region as your composite alarm. However, if you set up a composite alarm in a CloudWatch cross-account observability monitoring account, the underlying alarms can watch metrics in different source accounts and in the monitoring account itself. For more information, see CloudWatch cross-account observability.
A single composite alarm can monitor up to 100 underlying alarms, and up to 150 composite alarms can monitor a single underlying alarm.
Rule expressions
All composite alarms contain rule expressions. A rule expression tells the composite alarm which other alarms to monitor and how to derive its own state from theirs. Rule expressions can refer to metric alarms and composite alarms. When you reference an alarm in a rule expression, you wrap the reference in a function that tests whether the alarm is in one of the following three states:
- ALARM
  ALARM("alarm-name or alarm-ARN") is TRUE if the named alarm is in ALARM state.
- OK
  OK("alarm-name or alarm-ARN") is TRUE if the named alarm is in OK state.
- INSUFFICIENT_DATA
  INSUFFICIENT_DATA("alarm-name or alarm-ARN") is TRUE if the named alarm is in INSUFFICIENT_DATA state.
Note
TRUE always evaluates to TRUE, and FALSE always evaluates to FALSE.
Alarm references
You can reference an alarm by its name or its ARN, with or without quotation marks (") around the reference.
- Without quotation marks, the alarm name or ARN must not contain spaces, parentheses, or commas.
- Within quotation marks, any double quote (") that is part of the alarm name or ARN must be escaped with a backslash (\) so that the reference is interpreted correctly.
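The quoting rules above can be sketched as a small Python helper. The function name `alarm_reference` is hypothetical, and the helper only covers the cases stated here: it quotes a reference when it contains a space, a parenthesis, a comma, or a double quote, and escapes embedded double quotes with a backslash. How other characters (such as a literal backslash) are treated is not covered by this sketch.

```python
import re

# Characters that force a reference to be quoted. The double quote is
# included as an assumption, so that a name containing one is always escaped.
_NEEDS_QUOTES = re.compile(r'[ (),"]')


def alarm_reference(name_or_arn):
    """Return an alarm name or ARN in a form usable inside a rule expression."""
    if not _NEEDS_QUOTES.search(name_or_arn):
        return name_or_arn  # safe to use without quotation marks
    escaped = name_or_arn.replace('"', '\\"')  # " becomes \"
    return f'"{escaped}"'


# Hypothetical alarm names, for illustration only.
print(alarm_reference("CPUUtilizationTooHigh"))
print(alarm_reference("Web server (prod), CPU"))
print(alarm_reference('Disk "sda" full'))
```

Always quoting every reference would also be valid; the helper leaves simple names bare only to keep generated rules readable.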
Syntax
The expression that combines several alarms into one composite alarm is built from Boolean operators and functions. The following table describes the operators and functions available in rule expressions:
| Operator/Function | Description |
|---|---|
| AND | Logical AND operator. Returns TRUE when all specified conditions are TRUE. |
| OR | Logical OR operator. Returns TRUE when at least one of the specified conditions is TRUE. |
| NOT | Logical NOT operator. Returns TRUE when the specified condition is FALSE. |
| AT_LEAST | Function that returns TRUE when a minimum number or percentage of specified alarms are in the required state. Format: AT_LEAST(M, STATE_CONDITION, (alarm1, alarm2, ...alarmN)) where M can be an absolute number or percentage (for example, 50%), and STATE_CONDITION can be ALARM, OK, INSUFFICIENT_DATA, NOT ALARM, NOT OK, or NOT INSUFFICIENT_DATA. |
You can use parentheses to group conditions and control the order of evaluation in complex expressions.
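To make the AT_LEAST semantics concrete, the following Python function models the function over a list of already-known alarm states. This is an illustrative model, not how CloudWatch evaluates rules internally. The name `at_least` and its argument shapes are hypothetical, and the handling of percentages (matching / total >= P / 100, with no rounding) is an assumption.

```python
def at_least(m, state_condition, alarm_states):
    """Model AT_LEAST(M, STATE_CONDITION, (alarm1, ...)) over known states.

    m               -- an int (absolute count) or a string such as "50%"
    state_condition -- "ALARM", "OK", "INSUFFICIENT_DATA", or "NOT <state>"
    alarm_states    -- the current state of each listed alarm
    """
    negate = state_condition.startswith("NOT ")
    target = state_condition[4:] if negate else state_condition
    # Count alarms that satisfy the (possibly negated) state condition.
    matching = sum((state == target) != negate for state in alarm_states)

    if isinstance(m, str) and m.endswith("%"):
        # Assumption: "at least P%" means matching / total >= P / 100.
        return matching * 100 >= float(m[:-1]) * len(alarm_states)
    return matching >= m


states = ["ALARM", "OK", "ALARM", "INSUFFICIENT_DATA"]
print(at_least(2, "ALARM", states))     # two alarms are in ALARM
print(at_least("50%", "OK", states))    # only one of four is OK
print(at_least(3, "NOT OK", states))    # three alarms are not OK
```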
Example expressions
The request parameter AlarmRule supports the logical operators AND, OR, and NOT, as well as the AT_LEAST function, so you can combine multiple functions into a single expression. The following example expressions show how you can configure the underlying alarms in your composite alarm:
- ALARM(CPUUtilizationTooHigh) AND ALARM(DiskReadOpsTooHigh)
  The expression specifies that the composite alarm goes into ALARM only if CPUUtilizationTooHigh and DiskReadOpsTooHigh are in ALARM.
- AT_LEAST(2, ALARM, (WebServer1CPU, WebServer2CPU, WebServer3CPU, WebServer4CPU))
  The expression specifies that the composite alarm goes into ALARM when at least 2 out of the 4 web server CPU alarms are in ALARM state. This allows you to trigger alerts based on a threshold of affected resources rather than requiring all or just one to be in alarm state.
- AT_LEAST(50%, OK, (DatabaseConnection1, DatabaseConnection2, DatabaseConnection3, DatabaseConnection4))
  The expression specifies that the composite alarm goes into ALARM when at least 50% of the database connection alarms are in OK state. Using percentages allows the rule to adapt dynamically as you add or remove monitored alarms.
- ALARM(CPUUtilizationTooHigh) AND NOT ALARM(DeploymentInProgress)
  The expression specifies that the composite alarm goes into ALARM if CPUUtilizationTooHigh is in ALARM and DeploymentInProgress is not in ALARM. This is an example of a composite alarm that reduces alarm noise during a deployment window.
- AT_LEAST(2, ALARM, (AZ1Health, AZ2Health, AZ3Health)) AND NOT ALARM(MaintenanceWindow)
  The expression specifies that the composite alarm goes into ALARM when at least 2 out of 3 Availability Zone health alarms are in ALARM state and the maintenance window alarm is not in ALARM. This combines the AT_LEAST function with other logical operators for more complex monitoring scenarios.
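When the list of monitored alarms is generated by a script, AT_LEAST expressions like the ones above can be assembled from a list of names. The following is a minimal sketch; the helper name `at_least_rule` is hypothetical, and it assumes the alarm names need no quotation marks (no spaces, parentheses, or commas).

```python
def at_least_rule(m, state_condition, alarm_names):
    """Build an AT_LEAST(...) rule expression from a list of alarm names.

    Assumes the names can be used unquoted in a rule expression.
    """
    names = ", ".join(alarm_names)
    return f"AT_LEAST({m}, {state_condition}, ({names}))"


# Reproduce two of the example expressions from generated name lists.
web_cpu_alarms = [f"WebServer{i}CPU" for i in range(1, 5)]
web_rule = at_least_rule(2, "ALARM", web_cpu_alarms)
print(web_rule)

az_rule = at_least_rule(2, "ALARM", ["AZ1Health", "AZ2Health", "AZ3Health"])
combined_rule = az_rule + " AND NOT ALARM(MaintenanceWindow)"
print(combined_rule)
```

The resulting string is what would be supplied as the AlarmRule request parameter.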