Amazon EKS and Kubernetes Container Insights metrics

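The following table lists the metrics and dimensions that Container Insights collects for Amazon EKS and Kubernetes. These metrics are in the `ContainerInsights` namespace. For more information, see Metrics.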

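If you do not see any Container Insights metrics in your console, make sure that you have completed the setup of Container Insights. Metrics do not appear until Container Insights has been set up completely. For more information, see Setting up Container Insights.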

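Metric name	Dimensions	Description
`cluster_failed_node_count`	`ClusterName`	The number of failed worker nodes in the cluster. A node is considered failed if it is suffering from any node condition. For more information, see Conditions in the Kubernetes documentation.
`cluster_node_count`	`ClusterName`	The total number of worker nodes in the cluster.
`namespace_number_of_running_pods`	`Namespace`, `ClusterName`; `ClusterName`	The number of pods running per namespace in the resource that is specified by the dimensions that you are using.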
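`node_cpu_limit`	`ClusterName`	The maximum number of CPU units that can be assigned to a single node in this cluster.
`node_cpu_reserved_capacity`	`NodeName`, `ClusterName`, `InstanceId`; `ClusterName`	The percentage of CPU units that are reserved for node components, such as kubelet, kube-proxy, and Docker. Formula: `node_cpu_request / node_cpu_limit`. Note: `node_cpu_request` is not reported directly as a metric, but is a field in performance log events. For more information, see Relevant fields in performance log events for Amazon EKS and Kubernetes.
`node_cpu_usage_total`	`ClusterName`	The number of CPU units being used on the nodes in the cluster.
`node_cpu_utilization`	`NodeName`, `ClusterName`, `InstanceId`; `ClusterName`	The total percentage of CPU units being used on the nodes in the cluster. Formula: `node_cpu_usage_total / node_cpu_limit`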
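`node_gpu_limit`	`ClusterName`; `ClusterName`, `InstanceId`, `NodeName`	The total number of GPUs available on the node.
`node_gpu_usage_total`	`ClusterName`; `ClusterName`, `InstanceId`, `NodeName`	The number of GPUs being used by the running pods on the node.
`node_gpu_reserved_capacity`	`ClusterName`; `ClusterName`, `InstanceId`, `NodeName`	The percentage of GPUs currently reserved on the node. Formula: `node_gpu_request / node_gpu_limit`. Note: `node_gpu_request` is not reported directly as a metric, but is a field in performance log events. For more information, see Relevant fields in performance log events for Amazon EKS and Kubernetes.
`node_filesystem_utilization`	`NodeName`, `ClusterName`, `InstanceId`; `ClusterName`	The total percentage of file system capacity being used on nodes in the cluster. Formula: `node_filesystem_usage / node_filesystem_capacity`. Note: `node_filesystem_usage` and `node_filesystem_capacity` are not reported directly as metrics, but are fields in performance log events. For more information, see Relevant fields in performance log events for Amazon EKS and Kubernetes.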
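`node_memory_limit`	`ClusterName`	The maximum amount of memory, in bytes, that can be assigned to a single node in this cluster.
`node_memory_reserved_capacity`	`NodeName`, `ClusterName`, `InstanceId`; `ClusterName`	The percentage of memory currently being used on the nodes in the cluster. Formula: `node_memory_request / node_memory_limit`. Note: `node_memory_request` is not reported directly as a metric, but is a field in performance log events. For more information, see Relevant fields in performance log events for Amazon EKS and Kubernetes.
`node_memory_utilization`	`NodeName`, `ClusterName`, `InstanceId`; `ClusterName`	The percentage of memory currently being used by the node or nodes. It is the node memory usage divided by the node memory limit, expressed as a percentage. Formula: `node_memory_working_set / node_memory_limit`
`node_memory_working_set`	`ClusterName`	The amount of memory, in bytes, being used in the working set of the nodes in the cluster.
`node_network_total_bytes`	`NodeName`, `ClusterName`, `InstanceId`; `ClusterName`	The total number of bytes per second transmitted and received over the network per node in the cluster. Formula: `node_network_rx_bytes + node_network_tx_bytes`. Note: `node_network_rx_bytes` and `node_network_tx_bytes` are not reported directly as metrics, but are fields in performance log events. For more information, see Relevant fields in performance log events for Amazon EKS and Kubernetes.
`node_number_of_running_containers`	`NodeName`, `ClusterName`, `InstanceId`; `ClusterName`	The number of running containers per node in the cluster.
`node_number_of_running_pods`	`NodeName`, `ClusterName`, `InstanceId`; `ClusterName`	The number of running pods per node in the cluster.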
`pod_cpu_reserved_capacity`	`PodName`, `Namespace`, `ClusterName`; `ClusterName`	The CPU capacity that is reserved per pod in the cluster. Formula: `pod_cpu_request / node_cpu_limit`. Note: `pod_cpu_request` is not reported directly as a metric, but is a field in performance log events. For more information, see Relevant fields in performance log events for Amazon EKS and Kubernetes.
`pod_cpu_utilization`	`PodName`, `Namespace`, `ClusterName`; `Namespace`, `ClusterName`; `Service`, `Namespace`, `ClusterName`; `ClusterName`	The percentage of CPU units being used by pods. Formula: `pod_cpu_usage_total / node_cpu_limit`
`pod_cpu_utilization_over_pod_limit`	`PodName`, `Namespace`, `ClusterName`; `Namespace`, `ClusterName`; `Service`, `Namespace`, `ClusterName`; `ClusterName`	The percentage of CPU units being used by pods, relative to the pod limit. Formula: `pod_cpu_usage_total / pod_cpu_limit`
`pod_gpu_request`	`ClusterName`; `ClusterName`, `Namespace`, `PodName`; `ClusterName`, `FullPodName`, `Namespace`, `PodName`	The GPU requests for the pod. This value must always equal `pod_gpu_limit`.
`pod_gpu_limit`	`ClusterName`; `ClusterName`, `Namespace`, `PodName`; `ClusterName`, `FullPodName`, `Namespace`, `PodName`	The maximum number of GPUs that can be assigned to the pod in a node.
`pod_gpu_usage_total`	`ClusterName`; `ClusterName`, `Namespace`, `PodName`; `ClusterName`, `FullPodName`, `Namespace`, `PodName`	The number of GPUs allocated on the pod.
`pod_gpu_reserved_capacity`	`ClusterName`; `ClusterName`, `Namespace`, `PodName`; `ClusterName`, `FullPodName`, `Namespace`, `PodName`	The percentage of GPUs currently reserved for the pod. Formula: `pod_gpu_request / node_gpu_reserved_capacity`
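`pod_memory_reserved_capacity`	`PodName`, `Namespace`, `ClusterName`; `ClusterName`	The percentage of memory that is reserved for pods. Formula: `pod_memory_request / node_memory_limit`. Note: `pod_memory_request` is not reported directly as a metric, but is a field in performance log events. For more information, see Relevant fields in performance log events for Amazon EKS and Kubernetes.
`pod_memory_utilization`	`PodName`, `Namespace`, `ClusterName`; `Namespace`, `ClusterName`; `Service`, `Namespace`, `ClusterName`; `ClusterName`	The percentage of memory currently being used by the pod or pods. Formula: `pod_memory_working_set / node_memory_limit`
`pod_memory_utilization_over_pod_limit`	`PodName`, `Namespace`, `ClusterName`; `Namespace`, `ClusterName`; `Service`, `Namespace`, `ClusterName`; `ClusterName`	The percentage of memory being used by pods, relative to the pod limit. If any container in the pod does not have a memory limit defined, this metric does not appear. Formula: `pod_memory_working_set / pod_memory_limit`
`pod_network_rx_bytes`	`PodName`, `Namespace`, `ClusterName`; `Namespace`, `ClusterName`; `Service`, `Namespace`, `ClusterName`; `ClusterName`	The number of bytes per second being received over the network by the pod. Formula: `sum(pod_interface_network_rx_bytes)`. Note: `pod_interface_network_rx_bytes` is not reported directly as a metric, but is a field in performance log events. For more information, see Relevant fields in performance log events for Amazon EKS and Kubernetes.
`pod_network_tx_bytes`	`PodName`, `Namespace`, `ClusterName`; `Namespace`, `ClusterName`; `Service`, `Namespace`, `ClusterName`; `ClusterName`	The number of bytes per second being transmitted over the network by the pod. Formula: `sum(pod_interface_network_tx_bytes)`. Note: `pod_interface_network_tx_bytes` is not reported directly as a metric, but is a field in performance log events. For more information, see Relevant fields in performance log events for Amazon EKS and Kubernetes.
`pod_number_of_container_restarts`	`PodName`, `Namespace`, `ClusterName`	The total number of container restarts in a pod.
`service_number_of_running_pods`	`Service`, `Namespace`, `ClusterName`; `ClusterName`	The number of pods running the service or services in the cluster.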
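
The formulas in the table can be reproduced from the fields of a performance log event. The following is a minimal sketch, not the Container Insights implementation. It assumes that the event has already been parsed into a dictionary keyed by the field names used in the formulas, and that a "percentage" is the ratio multiplied by 100. The helper names `ratio_percent` and `derive_node_metrics`, and the sample values, are hypothetical.

```python
from typing import Mapping, Optional


def ratio_percent(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator as a percentage, or None if undefined."""
    if denominator == 0:
        return None
    # Assumption: the reported percentage is the plain ratio scaled by 100.
    return 100.0 * numerator / denominator


def derive_node_metrics(event: Mapping[str, float]) -> dict:
    """Apply the node-level formulas from the table to one parsed event."""
    return {
        # node_cpu_usage_total / node_cpu_limit
        "node_cpu_utilization": ratio_percent(
            event["node_cpu_usage_total"], event["node_cpu_limit"]
        ),
        # node_memory_working_set / node_memory_limit
        "node_memory_utilization": ratio_percent(
            event["node_memory_working_set"], event["node_memory_limit"]
        ),
        # node_network_rx_bytes + node_network_tx_bytes
        "node_network_total_bytes": (
            event["node_network_rx_bytes"] + event["node_network_tx_bytes"]
        ),
    }


# Hypothetical sample values, chosen only to make the arithmetic easy to follow.
sample_event = {
    "node_cpu_usage_total": 500.0,
    "node_cpu_limit": 2000.0,
    "node_memory_working_set": 1_000_000.0,
    "node_memory_limit": 4_000_000.0,
    "node_network_rx_bytes": 100.0,
    "node_network_tx_bytes": 50.0,
}

derived = derive_node_metrics(sample_event)
```

Returning `None` for a zero denominator is a design choice in this sketch. It keeps a missing limit from being mistaken for 0 percent utilization.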

Kueue metrics

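Beginning with version `v2.4.0-eksbuild.1` of the CloudWatch Observability EKS add-on, Container Insights for Amazon EKS supports collecting Kueue metrics from Amazon EKS clusters. For more information about the add-on, see Install the CloudWatch agent with the Amazon CloudWatch Observability EKS add-on or the Helm chart.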

For information about enabling the metrics, see Enable Kueue metrics.

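The following table lists the Kueue metrics that are collected. These metrics are published to the `ContainerInsights/Prometheus` namespace in CloudWatch. Some of these metrics use the following dimensions: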

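`ClusterQueue` is the name of the ClusterQueue.
The possible values of `Status` are `active` and `inadmissible`.
The possible values of `Reason` are `Preempted`, `PodsReadyTimeout`, `ClusterQueueStopped`, `AdmissionCheck`, and `InactiveWorkload`.
`Flavor` is the referenced flavor.
`Resource` refers to cluster compute resources, such as `cpu`, `gpu`, and `memory`.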

Metric name	Dimensions	Description
`kueue_pending_workloads`	`ClusterName`, `ClusterQueue`, `Status`; `ClusterName`, `ClusterQueue`; `ClusterName`, `Status`; `ClusterName`	The number of pending workloads.
`kueue_evicted_workloads_total`	`ClusterName`, `ClusterQueue`, `Reason`; `ClusterName`, `ClusterQueue`; `ClusterName`, `Reason`; `ClusterName`	The total number of evicted workloads.
`kueue_admitted_active_workloads`	`ClusterName`, `ClusterQueue`; `ClusterName`	The number of admitted workloads that are active (not suspended and not finished).
`kueue_cluster_queue_resource_usage`	`ClusterName`, `ClusterQueue`, `Resource`, `Flavor`; `ClusterName`, `ClusterQueue`, `Resource`; `ClusterName`, `ClusterQueue`, `Flavor`; `ClusterName`, `ClusterQueue`; `ClusterName`	Reports the total resource usage of the ClusterQueue.
`kueue_cluster_queue_nominal_quota`	`ClusterName`, `ClusterQueue`, `Resource`, `Flavor`; `ClusterName`, `ClusterQueue`, `Resource`; `ClusterName`, `ClusterQueue`, `Flavor`; `ClusterName`, `ClusterQueue`; `ClusterName`	Reports the resource quota of the ClusterQueue.
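
To read one of these metrics back, a query has to name the `ContainerInsights/Prometheus` namespace and supply one of the dimension combinations listed for that metric. The following sketch builds a single query entry in the shape used by the CloudWatch `GetMetricData` API (`MetricDataQueries`). The helper `build_kueue_query`, the cluster name `my-cluster`, the queue name `team-a`, the 300-second period, and the `Average` statistic are all illustrative assumptions, not values from this page.

```python
from typing import Dict

# Namespace stated above for Kueue metrics.
KUEUE_NAMESPACE = "ContainerInsights/Prometheus"


def build_kueue_query(
    query_id: str,
    metric_name: str,
    dimensions: Dict[str, str],
    period_seconds: int = 300,   # assumption: 5-minute granularity
    stat: str = "Average",       # assumption: any valid CloudWatch statistic
) -> dict:
    """Build one MetricDataQueries entry for a Kueue metric."""
    return {
        "Id": query_id,  # CloudWatch requires this to start with a lowercase letter
        "MetricStat": {
            "Metric": {
                "Namespace": KUEUE_NAMESPACE,
                "MetricName": metric_name,
                # The dimension names should match one combination from the table.
                "Dimensions": [
                    {"Name": name, "Value": value}
                    for name, value in dimensions.items()
                ],
            },
            "Period": period_seconds,
            "Stat": stat,
        },
        "ReturnData": True,
    }


# Hypothetical cluster and queue names.
query = build_kueue_query(
    "pending",
    "kueue_pending_workloads",
    {"ClusterName": "my-cluster", "ClusterQueue": "team-a", "Status": "active"},
)
```

A dictionary like `query` could then be passed in the `MetricDataQueries` list of a `get_metric_data` call on a boto3 CloudWatch client, together with `StartTime` and `EndTime`. That call is omitted here so that the sketch runs without AWS credentials.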
