Sample queries - Amazon CloudWatch Logs

Sample queries

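This section contains a list of commonly used query commands that you can run in the CloudWatch console. For information about how to run a query command, see Tutorial: Run and modify a sample query in the Amazon CloudWatch Logs User Guide.

For more information about query syntax, see CloudWatch Logs Insights query syntax.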

General queries

Find the 25 most recently added log events.

fields @timestamp, @message | sort @timestamp desc | limit 25

Get a list of the number of exceptions per hour.

filter @message like /Exception/ | stats count(*) as exceptionCount by bin(1h) | sort exceptionCount desc
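To make the semantics of this query concrete, the following Python sketch mimics it on a handful of in-memory events. The function name `exceptions_per_hour`, the `(timestamp_ms, message)` tuple layout, and the sample data are hypothetical and not part of CloudWatch Logs. The sketch assumes that `bin(1h)` truncates each timestamp to the start of its UTC hour.

```python
from collections import Counter
from datetime import datetime, timezone


def exceptions_per_hour(events):
    """Count events whose message contains 'Exception', grouped by UTC hour."""
    counts = Counter()
    for timestamp_ms, message in events:
        if "Exception" not in message:
            continue  # mirrors: filter @message like /Exception/
        ts = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
        # mirrors: bin(1h), assumed here to truncate to the start of the hour
        hour_bin = ts.replace(minute=0, second=0, microsecond=0)
        counts[hour_bin] += 1
    # mirrors: sort exceptionCount desc
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample data: (epoch milliseconds, message)
sample_events = [
    (1_700_000_000_000, "NullPointerException at line 10"),
    (1_700_000_600_000, "request ok"),
    (1_700_001_000_000, "TimeoutException"),
    (1_700_004_000_000, "IOException"),
]

for hour_bin, exception_count in exceptions_per_hour(sample_events):
    print(hour_bin.isoformat(), exception_count)
```

Because the final sort is on the count rather than on the time bin, the busiest hours appear first and the output is not in chronological order.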

Get a list of log events that aren't exceptions.

fields @message | filter @message not like /Exception/

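Get the most recent log event for each unique value of the server field. Because dedup keeps the first result in sort order for each value, the events are sorted newest first.

fields @timestamp, server, severity, message | sort @timestamp desc | dedup server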
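The sort direction matters in this query because dedup keeps only the first result in sort order for each unique value. The Python sketch below illustrates that behavior on sample data. The `dedup` helper and the event dictionaries are hypothetical stand-ins, not a CloudWatch API.

```python
def dedup(events, key_fields):
    """Keep the first event seen for each unique combination of key_fields."""
    seen = set()
    kept = []
    for event in events:
        key = tuple(event[field] for field in key_fields)
        if key not in seen:
            seen.add(key)
            kept.append(event)
    return kept


# Hypothetical sample events
events = [
    {"timestamp": 1, "server": "a", "severity": "INFO"},
    {"timestamp": 2, "server": "b", "severity": "ERROR"},
    {"timestamp": 3, "server": "a", "severity": "ERROR"},
]

# Newest first, then dedup: the most recent event per server survives.
newest_first = sorted(events, key=lambda e: e["timestamp"], reverse=True)
latest_per_server = dedup(newest_first, ["server"])

# Oldest first, then dedup: the earliest event per server survives instead.
oldest_first = sorted(events, key=lambda e: e["timestamp"])
earliest_per_server = dedup(oldest_first, ["server"])
```

Passing `["server", "severity"]` as the key fields corresponds to deduplicating on the combination of both fields.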

Get the most recent log event for each unique value of the server field, for each severity type.

fields @timestamp, server, severity, message | sort @timestamp desc | dedup server, severity

Queries for Lambda logs

Determine the amount of overprovisioned memory.

filter @type = "REPORT" | stats max(@memorySize / 1000 / 1000) as provisionedMemoryMB, min(@maxMemoryUsed / 1000 / 1000) as smallestMemoryRequestMB, avg(@maxMemoryUsed / 1000 / 1000) as avgMemoryUsedMB, max(@maxMemoryUsed / 1000 / 1000) as maxMemoryUsedMB, provisionedMemoryMB - maxMemoryUsedMB as overProvisionedMB

Create a latency report.

filter @type = "REPORT" | stats avg(@duration), max(@duration), min(@duration) by bin(5m)

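Search for slow function invocations, and eliminate duplicate requests that can arise from retries or client-side code. In this query, @duration is in milliseconds.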

fields @timestamp, @requestId, @message, @logStream | filter @type = "REPORT" and @duration > 1000 | sort @timestamp desc | dedup @requestId | limit 20

Queries for Amazon VPC flow logs

Find the top 15 packet transfers across hosts.

stats sum(packets) as packetsTransferred by srcAddr, dstAddr | sort packetsTransferred desc | limit 15

Find the top 15 byte transfers for hosts on a given subnet.

filter isIpv4InSubnet(srcAddr, "192.0.2.0/24") | stats sum(bytes) as bytesTransferred by dstAddr | sort bytesTransferred desc | limit 15
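As a rough illustration of the subnet test in this query, the sketch below reproduces a CIDR membership check with Python's standard `ipaddress` module. The helper name `is_ipv4_in_subnet` is hypothetical, and the assumption is that `isIpv4InSubnet` performs an ordinary CIDR containment check on IPv4 addresses.

```python
import ipaddress


def is_ipv4_in_subnet(address, cidr):
    """Return True if the IPv4 address falls inside the given CIDR block."""
    network = ipaddress.IPv4Network(cidr)
    return ipaddress.IPv4Address(address) in network


# 192.0.2.0/24 covers 192.0.2.0 through 192.0.2.255
inside = is_ipv4_in_subnet("192.0.2.17", "192.0.2.0/24")
outside = is_ipv4_in_subnet("198.51.100.5", "192.0.2.0/24")
print(inside, outside)
```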

Find the IP addresses that use UDP as a data transfer protocol.

filter protocol=17 | stats count(*) by srcAddr

Find the IP addresses where flow records were skipped during the capture window.

filter logStatus="SKIPDATA" | stats count(*) by bin(1h) as t | sort t

Find a single record for each connection, to help troubleshoot network connectivity issues.

fields @timestamp, srcAddr, dstAddr, srcPort, dstPort, protocol, bytes | filter logStream = 'vpc-flow-logs' and interfaceId = 'eni-0123456789abcdef0' | sort @timestamp desc | dedup srcAddr, dstAddr, srcPort, dstPort, protocol | limit 20

Queries for Route 53 logs

Find the distribution of records per hour by query type.

stats count(*) by queryType, bin(1h)

Find the 10 DNS resolvers with the highest number of requests.

stats count(*) as numRequests by resolverIp | sort numRequests desc | limit 10

Find the number of records, by domain and subdomain, where the server failed to complete the DNS request.

filter responseCode="SERVFAIL" | stats count(*) by queryName

Queries for CloudTrail logs

Find the number of log entries for each service, event type, and AWS Region.

stats count(*) by eventSource, eventName, awsRegion

Find the Amazon EC2 hosts that were started or stopped in a given AWS Region.

filter (eventName="StartInstances" or eventName="StopInstances") and awsRegion="us-east-2"

Find the AWS Regions, user names, and ARNs of newly created IAM users.

filter eventName="CreateUser" | fields awsRegion, requestParameters.userName, responseElements.user.arn

Find the number of records where an exception occurred while invoking the UpdateTrail API.

filter eventName="UpdateTrail" and ispresent(errorCode) | stats count(*) by errorCode, errorMessage

Find log entries where TLS 1.0 or 1.1 was used.

filter tlsDetails.tlsVersion in [ "TLSv1", "TLSv1.1" ] | stats count(*) as numOutdatedTlsCalls by userIdentity.accountId, recipientAccountId, eventSource, eventName, awsRegion, tlsDetails.tlsVersion, tlsDetails.cipherSuite, userAgent | sort eventSource, eventName, awsRegion, tlsDetails.tlsVersion

Find the number of calls per service that used TLS 1.0 or 1.1.

filter tlsDetails.tlsVersion in [ "TLSv1", "TLSv1.1" ] | stats count(*) as numOutdatedTlsCalls by eventSource | sort numOutdatedTlsCalls desc

Queries for Amazon API Gateway

Find the last 10 4XX errors.

fields @timestamp, status, ip, path, httpMethod | filter status>=400 and status<=499 | sort @timestamp desc | limit 10

Identify the 10 longest-running Amazon API Gateway requests in your Amazon API Gateway access log group.

fields @timestamp, status, ip, path, httpMethod, responseLatency | sort responseLatency desc | limit 10

Return the list of the most popular API paths in your Amazon API Gateway access log group.

stats count(*) as requestCount by path | sort requestCount desc | limit 10

Create an integration latency report for your Amazon API Gateway access log group.

filter status=200 | stats avg(integrationLatency), max(integrationLatency), min(integrationLatency) by bin(1m)

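Queries for NAT gateway

If you notice higher than normal costs in your AWS bill, you can use CloudWatch Logs Insights to find the top contributors. For more information about the following query commands, see How can I find the top contributors to traffic through the NAT gateway in my VPC? on the AWS premium support page.

Note

In the following query commands, replace "x.x.x.x" with the private IP address of your NAT gateway, and replace "y.y" with the first two octets of your VPC CIDR range.

Find the instances that send the most traffic through your NAT gateway.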

filter (dstAddr like 'x.x.x.x' and srcAddr like 'y.y.') | stats sum(bytes) as bytesTransferred by srcAddr, dstAddr | sort bytesTransferred desc | limit 10

Determine the traffic that goes to and from the instances through your NAT gateway.

filter (dstAddr like 'x.x.x.x' and srcAddr like 'y.y.') or (srcAddr like 'x.x.x.x' and dstAddr like 'y.y.') | stats sum(bytes) as bytesTransferred by srcAddr, dstAddr | sort bytesTransferred desc | limit 10

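Determine the internet destinations that the instances in your VPC communicate with most often for uploads and downloads.

For uploads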

filter (srcAddr like 'x.x.x.x' and dstAddr not like 'y.y.') | stats sum(bytes) as bytesTransferred by srcAddr, dstAddr | sort bytesTransferred desc | limit 10

For downloads

filter (dstAddr like 'x.x.x.x' and srcAddr not like 'y.y.') | stats sum(bytes) as bytesTransferred by srcAddr, dstAddr | sort bytesTransferred desc | limit 10

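Queries for Apache server logs

You can use CloudWatch Logs Insights to query Apache server logs. For more information about the following queries, see Simplifying Apache server logs with CloudWatch Logs Insights on the AWS Cloud Operations & Migrations Blog.

Find the most relevant fields, so that you can review your access logs and check for traffic in the /admin path of your application.

fields @timestamp, remoteIP, request, status, filename | sort @timestamp desc | filter filename="/var/www/html/admin" | limit 20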

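Find the number of unique GET requests that accessed your main page with status code "200" (success). Uniqueness is measured by distinct remoteIP values.

fields @timestamp, remoteIP, method, status | filter status="200" and referrer="http://34.250.27.141/" and method="GET" | stats count_distinct(remoteIP) as UniqueVisits | limit 10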

Find how often your Apache service restarted by listing the 20 most recent restart events.

fields @timestamp, function, process, message | filter message like "resuming normal operations" | sort @timestamp desc | limit 20

Queries for Amazon EventBridge

Get the number of EventBridge events grouped by event detail type.

fields @timestamp, @message | stats count(*) as numberOfEvents by `detail-type` | sort numberOfEvents desc

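Examples of the parse command

Use a glob expression to extract the fields @user, @method, and @latency from the log field @message, and return the average latency for each unique combination of @method and @user.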

parse @message "user=*, method:*, latency := *" as @user, @method, @latency | stats avg(@latency) by @method, @user
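The glob form can be read as "each `*` captures whatever sits between the surrounding literal text". The Python sketch below approximates that reading by translating the glob into a regular expression and then averaging the latency per (method, user) pair. The helper `glob_parse`, the sample messages, and the choice to let the trailing `*` run to the end of the message are assumptions made for illustration and are not documented CloudWatch behavior.

```python
import re
from collections import defaultdict

PATTERN = "user=*, method:*, latency := *"
NAMES = ["@user", "@method", "@latency"]


def glob_parse(message, pattern, names):
    """Extract one field per '*' wildcard in pattern, in order."""
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    # Each wildcard becomes a lazy capture group; '$' lets the last one reach
    # the end of the message (an assumption about trailing wildcards).
    regex = "(.*?)".join(literal_parts) + "$"
    match = re.search(regex, message)
    if match is None:
        return None
    return dict(zip(names, match.groups()))


# Hypothetical sample messages
messages = [
    "user=alice, method:GET, latency := 120",
    "user=alice, method:GET, latency := 80",
    "user=bob, method:PUT, latency := 300",
]

latencies = defaultdict(list)
for message in messages:
    fields = glob_parse(message, PATTERN, NAMES)
    if fields is not None:
        key = (fields["@method"], fields["@user"])
        latencies[key].append(float(fields["@latency"]))

# mirrors: stats avg(@latency) by @method, @user
averages = {key: sum(vals) / len(vals) for key, vals in latencies.items()}
```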

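Use a regular expression to extract the fields user2, method2, and latency2 from the log field @message, and return the average latency for each unique combination of method2 and user2. The last capture group is greedy so that it captures the latency value instead of an empty string.

parse @message /user=(?<user2>.*?), method:(?<method2>.*?), latency := (?<latency2>.*)/ | stats avg(latency2) by method2, user2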
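The final capture group deserves care: under search (partial-match) semantics, a lazy `.*?` at the very end of a pattern is satisfied by the empty string. The sketch below demonstrates this with Python's `re` module. Python spells named groups `(?P<name>...)` rather than `(?<name>...)`, and whether the Logs Insights regex engine applies the same matching semantics is an assumption, so treat this as an illustration of general regex behavior.

```python
import re

# Trailing lazy group: matches as little as possible, which is nothing.
LAZY = re.compile(
    r"user=(?P<user2>.*?), method:(?P<method2>.*?), latency := (?P<latency2>.*?)"
)
# Trailing greedy group: consumes the rest of the line.
GREEDY = re.compile(
    r"user=(?P<user2>.*?), method:(?P<method2>.*?), latency := (?P<latency2>.*)"
)

message = "user=alice, method:GET, latency := 120"  # hypothetical sample

lazy_fields = LAZY.search(message).groupdict()
greedy_fields = GREEDY.search(message).groupdict()

print(repr(lazy_fields["latency2"]))    # empty string under re.search
print(repr(greedy_fields["latency2"]))  # the latency value
```

The earlier groups can stay lazy because the literal text that follows each of them forces the match to extend up to the next delimiter.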

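Extract the fields loggingTime, loggingType, and loggingMessage, filter down to log events that contain the string ERROR or INFO, and then display only the loggingMessage and loggingType fields, with events that contain the string ERROR flagged as isError.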

FIELDS @message | PARSE @message "* [*] *" as loggingTime, loggingType, loggingMessage | FILTER loggingType IN ["ERROR", "INFO"] | DISPLAY loggingMessage, loggingType = "ERROR" as isError