stats
Use stats
to create visualizations of your log data such as
bar charts, line charts, and stacked area charts. This helps you more
efficiently identify patterns in your log data. CloudWatch Logs Insights generates
visualizations for queries that use the stats
function and one
or more aggregation functions.
For example, the following query in a Route 53 log group returns visualizations showing the distribution of Route 53 records per hour, by query type.
stats count(*) by queryType, bin(1h)
All such queries can produce bar charts. If your query uses the
bin()
function to group the data by one field over time,
you can also see line charts and stacked area charts.
The following time units and abbreviations are supported with the
bin
function. For all units and abbreviations that include
more than one character, adding s to pluralize is supported. So both
hr
and hrs
work to specify hours.
-
millisecond
ms
msec
-
second
s
sec
-
minute
m
min
-
hour
h
hr
-
day
d
-
week
w
-
month
mo
mon
-
quarter
q
qtr
-
year
y
yr
Topics
Visualize time series data
Time series visualizations work for queries with the following characteristics:
-
The query contains one or more aggregation functions. For more information, see Aggregation Functions in the Stats Command.
-
The query uses the
bin()
function to group the data by one field.
These queries can produce line charts, stacked area charts, bar charts, and pie charts.
Examples
For a complete tutorial, see Tutorial: Run a query that produces a time series visualization.
Here are more example queries that work for time series visualization.
The following query generates a visualization of the average values of
the myfield1
field, with a data point created every five
minutes. Each data point is the aggregation of the averages of the
myfield1
values from the logs from the previous five
minutes.
stats avg(myfield1) by bin(5m)
The following query generates a visualization of three values based on
different fields, with a data point created every five minutes. The
visualization is generated because the query contains aggregate
functions and uses bin()
as the grouping field.
stats avg(myfield1), min(myfield2), max(myfield3) by bin(5m)
Line chart and stacked area chart restrictions
Queries that aggregate log entry information but don't use the
bin()
function can generate bar charts. However, the
queries cannot generate line charts or stacked area charts. For more
information about these types of queries, see Visualize log data
grouped by fields.
Visualize log data grouped by fields
You can produce bar charts for queries that use the stats
function and one or more aggregation functions. For more information,
see Aggregation Functions in the Stats Command.
To see the visualization, run your query. Then choose the Visualization tab, select the arrow next to Line, and choose Bar. Visualizations are limited to up to 100 bars in the bar chart.
Examples
For a complete tutorial, see Tutorial: Run a query that produces a visualization grouped by log fields. The following paragraphs include more example queries for visualization by fields.
The following VPC flow log query finds the average number of bytes transferred per session for each destination address.
stats avg(bytes) by dstAddr
You can also produce a chart that includes more than one bar for each resulting value. For example, the following VPC flow log query finds the average and maximum number of bytes transferred per session for each destination address.
stats avg(bytes), max(bytes) by dstAddr
The following query finds the number of Amazon Route 53 query logs for each query type.
stats count(*) by queryType
Use multiple stats commands in a single query
You can use as many as two stats
commands in a single
query. This enables you to perform an additional aggregation on the
output of the first aggregation.
Example: Query with two stats
commands
For example, the following query first find the total traffic volume in 5-minute bins, then calculates the highest, lowest, and average traffic volume among those 5-minute bins.
FIELDS strlen(@message) AS message_length | STATS sum(message_length)/1024/1024 as logs_mb BY bin(5m) | STATS max(logs_mb) AS peak_ingest_mb, min(logs_mb) AS min_ingest_mb, avg(logs_mb) AS avg_ingest_mb
Example: Combine multiple stats commands with
other functions such as filter
, fields
,
bin
You can combine two stats
commands with other commands
such as filter
and fields
in a single query.
For example, the following query finds the number of distinct IP
addresses in sessions and finds the number of sessions by client
platform, filters those IP addresses, and then finally finds the average
of session requests per client platform.
STATS count_distinct(client_ip) AS session_ips, count(*) AS requests BY session_id, client_platform | FILTER session_ips > 1 | STATS count(*) AS multiple_ip_sessions, sum(requests) / count(*) AS avg_session_requests BY client_platform
You can use bin
and dateceil
functions in
queries with multiple stats
commands. For example, the
following query first combines messages into 5-minute blocks, then
aggregates those 5-minute blocks into 10-minute blocks and calculates
the highest, lowest, and average traffic volumes within each 10-minute
block.
FIELDS strlen(@message) AS message_length | STATS sum(message_length) / 1024 / 1024 AS logs_mb BY BIN(5m) as @t | STATS max(logs_mb) AS peak_ingest_mb, min(logs_mb) AS min_ingest_mb, avg(logs_mb) AS avg_ingest_mb BY dateceil(@t, 10m)
Notes and limitations
A query can have a maximum of two stats
commands. This
quota can't be changed.
If you use a sort
or limit
command, it must
appear after the second stats
command. If it is before the
second stats
command, the query is not valid.
When a query has two stats
commands, the partial results
from the query do not begin displaying until the first
stats
aggregation is complete.
In the second stats
command in a single query, you can
refer only to fields that are defined in the first stats
command. For example, the following query is not valid because the
@message
field won't be available after the first
stats
aggregation.
FIELDS @message | STATS SUM(Fault) by Operation # You can only reference `SUM(Fault)` or Operation at this point | STATS MAX(strlen(@message)) AS MaxMessageSize # Invalid reference to @message
Any fields that you reference after the first stats
command must be defined in that first stats
command.
STATS sum(x) as sum_x by y, z | STATS max(sum_x) as max_x by z # You can only reference `max(sum_x)`, max_x or z at this point
Important
The bin
function always implicitly uses the
@timestamp
field. This means that you can't use
bin
in the second stats
command
without using the first stats
command to propagate the
timestamp
field. For example, the following query
is not valid.
FIELDS strlen(@message) AS message_length | STATS sum(message_length) AS ingested_bytes BY @logStream | STATS avg(ingested_bytes) BY bin(5m) # Invalid reference to @timestamp field
Instead, define the @timestamp
field in the first
stats
command, and then you can use it with
dateceil
in the second stats
command
as in the following example.
FIELDS strlen(@message) AS message_length | STATS sum(message_length) AS ingested_bytes, max(@timestamp) as @t BY @logStream | STATS avg(ingested_bytes) BY dateceil(@t, 5m)
Functions to use with stats
CloudWatch Logs Insights supports both stats aggregation functions and stats non-aggregation functions.
Use statsaggregation functions in the stats
command and
as arguments for other functions.
Function | Result type | Description |
---|---|---|
|
number |
The average of the values in the specified field. |
|
number |
Counts the log events. |
|
number |
Returns the number of unique values for the field.
If the field has very high cardinality (contains
many unique values), the value returned by
|
|
LogFieldValue |
The maximum of the values for this log field in the queried logs. |
|
LogFieldValue |
The minimum of the values for this log field in the queried logs. |
|
LogFieldValue |
A percentile indicates the relative standing of a
value in a dataset. For example,
|
|
number |
The standard deviation of the values in the specified field. |
|
number |
The sum of the values in the specified field. |
Stats non-aggregation functions
Use non-aggregation functions in the stats
command and
as arguments for other functions.
Function | Result type | Description |
---|---|---|
|
LogField |
Returns the value of |
|
LogField |
Returns the value of |
|
LogField |
Returns the value of |
|
LogField |
Returns the value of |