

This page is a machine translation of the English version. If there is any ambiguity or inconsistency, the English version prevails.

# Auditing and logging
<a name="auditing-and-logging"></a>

**Tip**  
 [Explore](https://aws-experience.com/emea/smb/events/series/get-hands-on-with-amazon-eks?trk=4a9b4147-2490-4c63-bc9f-f8a84b122c8c&sc_channel=el) best practices through Amazon EKS workshops.

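Collecting and analyzing audit logs is useful for several reasons. Logs help with root cause analysis and attribution, that is, ascribing a change to a particular user. Once enough logs have been collected, they can also be used to detect anomalous behavior. On EKS, audit logs are sent to Amazon CloudWatch Logs. The audit policy for EKS is as follows: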

```
apiVersion: audit.k8s.io/v1beta1
kind: Policy
rules:
  # Log full request and response for changes to aws-auth ConfigMap in kube-system namespace
  - level: RequestResponse
    namespaces: ["kube-system"]
    verbs: ["update", "patch", "delete"]
    resources:
      - group: "" # core
        resources: ["configmaps"]
        resourceNames: ["aws-auth"]
    omitStages:
      - "RequestReceived"
  # Do not log watch operations performed by kube-proxy on endpoints and services
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
      - group: "" # core
        resources: ["endpoints", "services", "services/status"]
  # Do not log get operations performed by kubelet on nodes and their statuses
  - level: None
    users: ["kubelet"] # legacy kubelet identity
    verbs: ["get"]
    resources:
      - group: "" # core
        resources: ["nodes", "nodes/status"]
  # Do not log get operations performed by the system:nodes group on nodes and their statuses
  - level: None
    userGroups: ["system:nodes"]
    verbs: ["get"]
    resources:
      - group: "" # core
        resources: ["nodes", "nodes/status"]
  # Do not log get and update operations performed by controller manager, scheduler, and endpoint-controller on endpoints in kube-system namespace
  - level: None
    users:
      - system:kube-controller-manager
      - system:kube-scheduler
      - system:serviceaccount:kube-system:endpoint-controller
    verbs: ["get", "update"]
    namespaces: ["kube-system"]
    resources:
      - group: "" # core
        resources: ["endpoints"]
  # Do not log get operations performed by apiserver on namespaces and their statuses/finalizations
  - level: None
    users: ["system:apiserver"]
    verbs: ["get"]
    resources:
      - group: "" # core
        resources: ["namespaces", "namespaces/status", "namespaces/finalize"]
  # Do not log get and list operations performed by controller manager on metrics.k8s.io resources
  - level: None
    users:
      - system:kube-controller-manager
    verbs: ["get", "list"]
    resources:
      - group: "metrics.k8s.io"
  # Do not log access to health, version, and swagger non-resource URLs
  - level: None
    nonResourceURLs:
      - /healthz*
      - /version
      - /swagger*
  # Do not log events resources
  - level: None
    resources:
      - group: "" # core
        resources: ["events"]
  # Log request for updates/patches to nodes and pods statuses by kubelet and node problem detector
  - level: Request
    users: ["kubelet", "system:node-problem-detector", "system:serviceaccount:kube-system:node-problem-detector"]
    verbs: ["update", "patch"]
    resources:
      - group: "" # core
        resources: ["nodes/status", "pods/status"]
    omitStages:
      - "RequestReceived"
  # Log request for updates/patches to nodes and pods statuses by system:nodes group
  - level: Request
    userGroups: ["system:nodes"]
    verbs: ["update", "patch"]
    resources:
      - group: "" # core
        resources: ["nodes/status", "pods/status"]
    omitStages:
      - "RequestReceived"
  # Log delete collection requests by namespace-controller in kube-system namespace
  - level: Request
    users: ["system:serviceaccount:kube-system:namespace-controller"]
    verbs: ["deletecollection"]
    omitStages:
      - "RequestReceived"
  # Log metadata for secrets, configmaps, and tokenreviews to protect sensitive data
  - level: Metadata
    resources:
      - group: "" # core
        resources: ["secrets", "configmaps"]
      - group: authentication.k8s.io
        resources: ["tokenreviews"]
    omitStages:
      - "RequestReceived"
  # Log requests for serviceaccounts/token resources
  - level: Request
    resources:
      - group: "" # core
        resources: ["serviceaccounts/token"]
  # Log get, list, and watch requests for various resource groups
  - level: Request
    verbs: ["get", "list", "watch"]
    resources:
      - group: "" # core
      - group: "admissionregistration.k8s.io"
      - group: "apiextensions.k8s.io"
      - group: "apiregistration.k8s.io"
      - group: "apps"
      - group: "authentication.k8s.io"
      - group: "authorization.k8s.io"
      - group: "autoscaling"
      - group: "batch"
      - group: "certificates.k8s.io"
      - group: "extensions"
      - group: "metrics.k8s.io"
      - group: "networking.k8s.io"
      - group: "policy"
      - group: "rbac.authorization.k8s.io"
      - group: "scheduling.k8s.io"
      - group: "settings.k8s.io"
      - group: "storage.k8s.io"
    omitStages:
      - "RequestReceived"
  # Default logging level for known APIs to log request and response
  - level: RequestResponse
    resources:
      - group: "" # core
      - group: "admissionregistration.k8s.io"
      - group: "apiextensions.k8s.io"
      - group: "apiregistration.k8s.io"
      - group: "apps"
      - group: "authentication.k8s.io"
      - group: "authorization.k8s.io"
      - group: "autoscaling"
      - group: "batch"
      - group: "certificates.k8s.io"
      - group: "extensions"
      - group: "metrics.k8s.io"
      - group: "networking.k8s.io"
      - group: "policy"
      - group: "rbac.authorization.k8s.io"
      - group: "scheduling.k8s.io"
      - group: "settings.k8s.io"
      - group: "storage.k8s.io"
    omitStages:
      - "RequestReceived"
  # Default logging level for all other requests to log metadata only
  - level: Metadata
    omitStages:
      - "RequestReceived"
```
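Kubernetes evaluates audit rules in order, and the first rule that matches a request decides its audit level. The following Python sketch illustrates that first-match behavior with a hand-picked subset of the rules above. It is a simplified model, not the API server's implementation: the `RULES` list and the `audit_level` helper are hypothetical, and matching on `users`, `userGroups`, `nonResourceURLs`, and `omitStages` is left out.

```python
# A subset of the EKS audit policy, kept in the same relative order.
RULES = [
    {"level": "RequestResponse", "namespaces": ["kube-system"],
     "verbs": ["update", "patch", "delete"],
     "resources": [{"group": "", "resources": ["configmaps"],
                    "resourceNames": ["aws-auth"]}]},
    {"level": "None", "resources": [{"group": "", "resources": ["events"]}]},
    {"level": "Metadata",
     "resources": [{"group": "", "resources": ["secrets", "configmaps"]}]},
    {"level": "Request", "verbs": ["get", "list", "watch"],
     "resources": [{"group": ""}]},
    {"level": "RequestResponse", "resources": [{"group": ""}]},
    {"level": "Metadata"},  # catch-all for everything else
]


def _resource_matches(spec, request):
    """Check one entry of a rule's `resources` list against a request."""
    if spec.get("group", "") != request.get("group", ""):
        return False
    if spec.get("resources") and request["resource"] not in spec["resources"]:
        return False
    if spec.get("resourceNames") and request.get("name") not in spec["resourceNames"]:
        return False
    return True


def audit_level(request, rules=RULES):
    """Return the level of the first rule that matches the request."""
    for rule in rules:
        if "verbs" in rule and request["verb"] not in rule["verbs"]:
            continue
        if "namespaces" in rule and request.get("namespace") not in rule["namespaces"]:
            continue
        if "resources" in rule and not any(
            _resource_matches(spec, request) for spec in rule["resources"]
        ):
            continue
        return rule["level"]
    return "None"  # no rule matched, so nothing is logged


aws_auth_patch = {"verb": "patch", "namespace": "kube-system",
                  "resource": "configmaps", "name": "aws-auth"}
secret_read = {"verb": "get", "namespace": "default",
               "resource": "secrets", "name": "db-password"}
print(audit_level(aws_auth_patch))  # full request and response are logged
print(audit_level(secret_read))     # metadata only, the payload is not logged
```

Rule order matters here: the `Metadata` rule for Secrets and ConfigMaps sits before the broader `Request` and `RequestResponse` rules, so secret payloads never reach the log.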

## Recommendations
<a name="_recommendations"></a>

### Enable audit logs
<a name="_enable_audit_logs"></a>

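The audit logs are part of the Kubernetes control plane logs that EKS manages. For instructions on enabling and disabling the control plane logs, which include the logs for the Kubernetes API server, the controller manager, and the scheduler, along with the audit log, see https://docs.aws.amazon.com/eks/latest/userguide/control-plane-logs.html#enabling-control-plane-log-export.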
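Audit logging can also be switched on programmatically. The sketch below assumes the boto3 EKS client and its `update_cluster_config` call; the helper names and the `my-cluster` placeholder are illustrative only. The client is passed in, so the payload can be inspected without AWS credentials.

```python
def audit_logging_config(enabled=True):
    """Build the `logging` argument that turns the audit log type on or off."""
    return {"clusterLogging": [{"types": ["audit"], "enabled": enabled}]}


def enable_audit_logs(eks_client, cluster_name):
    """Request audit log export for a cluster and return the update id."""
    response = eks_client.update_cluster_config(
        name=cluster_name,
        logging=audit_logging_config(enabled=True),
    )
    return response["update"]["id"]


# Usage (needs boto3 and AWS credentials; "my-cluster" is a placeholder):
#   import boto3
#   enable_audit_logs(boto3.client("eks"), "my-cluster")
```

The update is asynchronous, so the returned id identifies an update that is still in progress rather than a completed change.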

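**Note**  
When you enable control plane logging, you incur [costs](https://aws.amazon.com/cloudwatch/pricing/) for storing the logs in CloudWatch. This raises a broader question about the ongoing cost of security. Ultimately, you have to weigh those costs against the cost of a security breach, for example financial loss or damage to your reputation. You may find that you can adequately secure your environment by implementing only some of the recommendations in this guide.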

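**Warning**  
The maximum size of a CloudWatch Logs entry is [1MB](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/cloudwatch_limits_cwl.html), whereas the maximum Kubernetes API request size is 1.5MiB. Log entries larger than 1MB are either truncated or include only the request metadata.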

### Utilize audit metadata
<a name="_utilize_audit_metadata"></a>

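Kubernetes audit logs include two annotations: `authorization.k8s.io/decision`, which indicates whether a request was authorized, and `authorization.k8s.io/reason`, which gives the reason for the decision. Use these attributes to determine why a particular API call was allowed.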
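As an illustration, the sketch below pulls both annotations out of a single audit event. The sample event is hypothetical and heavily trimmed, and `authorization_summary` is a helper name chosen for this example.

```python
import json

# Hypothetical, trimmed audit event; real events carry many more fields.
SAMPLE_EVENT = json.dumps({
    "kind": "Event",
    "apiVersion": "audit.k8s.io/v1",
    "verb": "list",
    "user": {"username": "system:serviceaccount:default:demo"},
    "objectRef": {"resource": "pods", "namespace": "default"},
    "responseStatus": {"code": 200},
    "annotations": {
        "authorization.k8s.io/decision": "allow",
        "authorization.k8s.io/reason": "RBAC: allowed by a RoleBinding (example text)",
    },
})


def authorization_summary(raw_event):
    """Summarize who did what, and why the authorizer allowed or denied it."""
    event = json.loads(raw_event)
    annotations = event.get("annotations", {})
    return {
        "user": event.get("user", {}).get("username"),
        "verb": event.get("verb"),
        "resource": event.get("objectRef", {}).get("resource"),
        "decision": annotations.get("authorization.k8s.io/decision"),
        "reason": annotations.get("authorization.k8s.io/reason"),
    }


summary = authorization_summary(SAMPLE_EVENT)
print(summary["decision"], "-", summary["reason"])
```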

### Create alarms for suspicious events
<a name="_create_alarms_for_suspicious_events"></a>

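Create an alarm that automatically alerts you when there is an increase in 403 Forbidden and 401 Unauthorized responses, and then use attributes such as `host`, `sourceIPs`, and `k8s_user.username` to find out where those requests are coming from.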
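One possible way to build such an alarm is a CloudWatch Logs metric filter that counts 401 and 403 responses, plus a CloudWatch alarm on the resulting metric. The sketch below only builds the request parameters. The filter name, metric name, namespace, and threshold are assumptions made for this example, and a static threshold is a simple stand-in for detecting an "increase". The log group name follows the `/aws/eks/<cluster-name>/cluster` convention used for EKS control plane logs.

```python
def metric_filter_params(cluster_name):
    """Parameters for logs.put_metric_filter: count 401/403 audit responses."""
    return {
        "logGroupName": f"/aws/eks/{cluster_name}/cluster",
        "filterName": "eks-audit-401-403",  # hypothetical name
        "filterPattern": (
            "{ ($.responseStatus.code = 401) || ($.responseStatus.code = 403) }"
        ),
        "metricTransformations": [{
            "metricName": "DeniedApiRequests",  # hypothetical metric
            "metricNamespace": "EKS/Audit",     # hypothetical namespace
            "metricValue": "1",
            "defaultValue": 0,
        }],
    }


def alarm_params(threshold=50, period_seconds=300):
    """Parameters for cloudwatch.put_metric_alarm on the metric above."""
    return {
        "AlarmName": "eks-audit-401-403-spike",  # hypothetical name
        "Namespace": "EKS/Audit",
        "MetricName": "DeniedApiRequests",
        "Statistic": "Sum",
        "Period": period_seconds,
        "EvaluationPeriods": 1,
        "Threshold": threshold,  # assumed value, tune to your baseline
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
    }


# Usage (needs boto3 and AWS credentials; "my-cluster" is a placeholder):
#   import boto3
#   boto3.client("logs").put_metric_filter(**metric_filter_params("my-cluster"))
#   boto3.client("cloudwatch").put_metric_alarm(**alarm_params())
```

No alarm actions are set in this sketch; in practice you would add a notification target so that someone is told when the alarm fires.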

### Analyze logs with Log Insights
<a name="_analyze_logs_with_log_insights"></a>

Use CloudWatch Log Insights to monitor changes to RBAC objects such as Roles, RoleBindings, ClusterRoles, and ClusterRoleBindings. A few sample queries follow:

Lists updates to the `aws-auth` ConfigMap:

```
fields @timestamp, @message
| filter @logStream like "kube-apiserver-audit"
| filter verb in ["update", "patch"]
| filter objectRef.resource = "configmaps" and objectRef.name = "aws-auth" and objectRef.namespace = "kube-system"
| sort @timestamp desc
```

Lists newly created or changed validating webhooks:

```
fields @timestamp, @message
| filter @logStream like "kube-apiserver-audit"
| filter verb in ["create", "update", "patch"] and responseStatus.code in [200, 201]
| filter objectRef.resource = "validatingwebhookconfigurations"
| sort @timestamp desc
```

Lists create, update, and delete operations on Roles:

```
fields @timestamp, @message
| filter objectRef.resource="roles" and verb in ["create", "update", "patch", "delete"]
| sort @timestamp desc
| limit 100
```

Lists create, update, and delete operations on RoleBindings:

```
fields @timestamp, @message
| filter objectRef.resource="rolebindings" and verb in ["create", "update", "patch", "delete"]
| sort @timestamp desc
| limit 100
```

Lists create, update, and delete operations on ClusterRoles:

```
fields @timestamp, @message
| filter objectRef.resource="clusterroles" and verb in ["create", "update", "patch", "delete"]
| sort @timestamp desc
| limit 100
```

Lists create, update, and delete operations on ClusterRoleBindings:

```
fields @timestamp, @message
| filter objectRef.resource="clusterrolebindings" and verb in ["create", "update", "patch", "delete"]
| sort @timestamp desc
| limit 100
```

Plots unauthorized read operations against Secrets:

```
fields @timestamp, @message
| filter objectRef.resource="secrets" and verb in ["get", "watch", "list"] and responseStatus.code="401"
| stats count() by bin(1m)
```

Lists failed anonymous requests:

```
fields @timestamp, @message, sourceIPs.0
| filter user.username="system:anonymous" and responseStatus.code in ["401", "403"]
| sort @timestamp desc
| limit 100
```
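Queries like these can also be run on a schedule rather than from the console. The sketch below assumes the boto3 CloudWatch Logs client and its `start_query` and `get_query_results` calls; `run_insights_query` is a hypothetical helper, and the polling interval and retry count are arbitrary choices.

```python
import time

AWS_AUTH_QUERY = """fields @timestamp, @message
| filter @logStream like "kube-apiserver-audit"
| filter verb in ["update", "patch"]
| filter objectRef.resource = "configmaps" and objectRef.name = "aws-auth" and objectRef.namespace = "kube-system"
| sort @timestamp desc"""


def run_insights_query(logs_client, log_group, query, start, end,
                       poll_seconds=1.0, max_polls=60):
    """Start a Logs Insights query and poll until it finishes.

    `start` and `end` are epoch seconds. Returns one dict per result row.
    """
    query_id = logs_client.start_query(
        logGroupName=log_group,
        startTime=start,
        endTime=end,
        queryString=query,
    )["queryId"]
    for _ in range(max_polls):
        response = logs_client.get_query_results(queryId=query_id)
        status = response["status"]
        if status == "Complete":
            # Each row is a list of {"field": ..., "value": ...} cells.
            return [{cell["field"]: cell["value"] for cell in row}
                    for row in response["results"]]
        if status in ("Failed", "Cancelled", "Timeout"):
            raise RuntimeError(f"query {query_id} ended with status {status}")
        time.sleep(poll_seconds)
    raise TimeoutError(f"query {query_id} did not complete in time")


# Usage (needs boto3 and AWS credentials; "my-cluster" is a placeholder):
#   import boto3, time
#   now = int(time.time())
#   rows = run_insights_query(boto3.client("logs"), "/aws/eks/my-cluster/cluster",
#                             AWS_AUTH_QUERY, now - 3600, now)
```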

### Audit your CloudTrail logs
<a name="_audit_your_cloudtrail_logs"></a>

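AWS API calls made by pods that use IAM Roles for Service Accounts (IRSA) are automatically logged to CloudTrail along with the name of the service account. If the name of a service account that was not explicitly authorized to call an API appears in the log, it may indicate that the IAM role's trust policy is misconfigured. Generally speaking, CloudTrail is a great way to ascribe AWS API calls to specific IAM principals.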
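A review of this kind can be partly automated. The sketch below assumes that the service account identity appears in the `userIdentity.userName` field of a CloudTrail record in the form `system:serviceaccount:<namespace>:<name>`; verify this against your own records before relying on it. The helper names, the sample records, and the allowlist are all hypothetical.

```python
PREFIX = "system:serviceaccount:"


def service_account_from_record(record):
    """Return (namespace, name) if the caller is a service account, else None."""
    user_name = record.get("userIdentity", {}).get("userName", "")
    if not user_name.startswith(PREFIX):
        return None
    parts = user_name[len(PREFIX):].split(":")
    if len(parts) != 2:
        return None
    return parts[0], parts[1]


def unexpected_callers(records, allowed):
    """Return service accounts seen in the records that are not in `allowed`."""
    seen = set()
    for record in records:
        account = service_account_from_record(record)
        if account is not None and account not in allowed:
            seen.add(account)
    return sorted(seen)


# Hypothetical, trimmed CloudTrail records.
records = [
    {"eventName": "AssumeRoleWithWebIdentity",
     "userIdentity": {"userName": "system:serviceaccount:prod:s3-reader"}},
    {"eventName": "AssumeRoleWithWebIdentity",
     "userIdentity": {"userName": "system:serviceaccount:dev:scratch"}},
    {"eventName": "DescribeInstances",
     "userIdentity": {"type": "IAMUser", "userName": "alice"}},
]
allowed = {("prod", "s3-reader")}
print(unexpected_callers(records, allowed))
```

Any service account this reports is a prompt to inspect the trust policy of the role it assumed.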

### Use CloudTrail Insights to unearth suspicious activity
<a name="_use_cloudtrail_insights_to_unearth_suspicious_activity"></a>

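CloudTrail Insights automatically analyzes write management events from CloudTrail trails and alerts you to unusual activity. This can help you identify when the call volume on write APIs in your AWS account increases, including calls from pods that use IRSA to assume an IAM role. For more information, see [Announcing CloudTrail Insights: Identify and Respond to Unusual API Activity](https://aws.amazon.com/blogs/aws/announcing-cloudtrail-insights-identify-and-respond-to-unusual-api-activity/).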

### Additional resources
<a name="_additional_resources"></a>

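As the volume of logs increases, parsing and filtering them with Log Insights or another log analysis tool may become ineffective. As an alternative, consider running [Sysdig Falco](https://github.com/falcosecurity/falco) and [ekscloudwatch](https://github.com/sysdiglabs/ekscloudwatch). Falco analyzes audit logs and flags anomalies or abuse over an extended period of time. The ekscloudwatch project forwards audit log events from CloudWatch to Falco for analysis. Falco provides a set of [default audit rules](https://github.com/falcosecurity/plugins/blob/master/plugins/k8saudit/rules/k8s_audit_rules.yaml) along with the ability to add your own.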

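Yet another option is to store the audit logs in S3 and use the SageMaker [Random Cut Forest](https://docs.aws.amazon.com/sagemaker/latest/dg/randomcutforest.html) algorithm to detect anomalous behavior that warrants further investigation.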

## Tools and resources
<a name="_tools_and_resources"></a>

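The following commercial and open source projects can be used to assess your cluster's alignment with established best practices:
+  [Amazon EKS Security Immersion Workshop - Detective Controls](https://catalog.workshops.aws/eks-security-immersionday/en-US/5-detective-controls) 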
+  [kubeaudit](https://github.com/Shopify/kubeaudit) 
+  [kube-scan](https://github.com/octarinesec/kube-scan) Assigns a risk score to the workloads running in your cluster in accordance with the Kubernetes Common Configuration Scoring System framework
+  [kubesec.io](https://kubesec.io/) 
+  [Polaris](https://github.com/FairwindsOps/polaris) 
+  [Starboard](https://github.com/aquasecurity/starboard) 
+  [Snyk](https://support.snyk.io/hc/en-us/articles/360003916138-Kubernetes-integration-overview) 
+  [Kubescape](https://github.com/kubescape/kubescape) Kubescape is an open source Kubernetes security tool that scans clusters, YAML files, and Helm charts. It detects misconfigurations according to multiple frameworks (including [NSA-CISA](https://www.armosec.io/blog/kubernetes-hardening-guidance-summary-by-armo/?utm_source=github&utm_medium=repository) and [MITRE ATT&CK®](https://www.microsoft.com/security/blog/2021/03/23/secure-containerized-environments-with-updated-threat-matrix-for-kubernetes/)).