


# Use the AWS FIS aws:ecs:task actions
<a name="ecs-task-actions"></a>

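 You can use the **aws:ecs:task** actions to inject faults into your Amazon ECS tasks. Both the Amazon EC2 and Fargate capacity types are supported.

 These actions use [AWS Systems Manager (SSM) documents](actions-ssm-agent.html#fis-ssm-docs) to inject faults. To use `aws:ecs:task` actions, you need to add a container with an SSM Agent to your Amazon Elastic Container Service (Amazon ECS) task definition. The container runs an [AWS FIS defined script](#ecs-task-reference) that registers the Amazon ECS task as a managed instance in the SSM service. The script also retrieves task metadata and adds it to the managed instance as tags. This setup allows AWS FIS to resolve the target task. This paragraph corresponds to **Setup** in the diagram below.

 When you run an AWS FIS experiment that targets `aws:ecs:task`, AWS FIS uses the resource tag `ECS_TASK_ARN` to map the target Amazon ECS tasks that you specify in the experiment template to a set of SSM managed instances. The tag value is the ARN of the associated Amazon ECS task in which the SSM document should be executed. This paragraph corresponds to **Fault Injection** in the diagram below.

 The following diagram illustrates the setup and the fault injection for a task with one existing container.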

![Diagram showing the fault injection setup for an Amazon ECS task with an SSM Agent container](http://docs.aws.amazon.com/zh_cn/fis/latest/userguide/images/ecs-actions.png)
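
The tag-based mapping described above can be inspected by hand. The following is a minimal sketch, assuming the AWS CLI is installed and your credentials allow `ssm:DescribeInstanceInformation`. The function names are hypothetical and are not part of AWS FIS. It builds a filter on the `ECS_TASK_ARN` tag and lists the managed instances registered for a given task.

```shell
#!/usr/bin/env bash

# Build the filter that selects managed instances tagged with a given task ARN.
# The tag key ECS_TASK_ARN is the one added by the activation script.
ssm_task_filter() {
  printf 'Key=tag:ECS_TASK_ARN,Values=%s' "$1"
}

# List the managed instance IDs registered for a task.
# Arguments: task ARN, Region. Requires AWS credentials; not called here.
find_managed_instances_for_task() {
  aws ssm describe-instance-information \
    --filters "$(ssm_task_filter "$1")" \
    --region "$2" \
    --query 'InstanceInformationList[].InstanceId' \
    --output text
}
```

Calling `find_managed_instances_for_task <task-arn> <region>` should print one managed instance ID per registered task if the sidecar container started correctly. An empty result suggests that the registration did not happen.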


## Actions
<a name="supported-ecs-task-actions"></a>
+ [aws:ecs:task-cpu-stress](fis-actions-reference.md#task-cpu-stress)
+ [aws:ecs:task-io-stress](fis-actions-reference.md#task-io-stress)
+ [aws:ecs:task-kill-process](fis-actions-reference.md#task-kill-process)
+ [aws:ecs:task-network-blackhole-port](fis-actions-reference.md#task-network-blackhole-port)
+ [aws:ecs:task-network-latency](fis-actions-reference.md#task-network-latency)
+ [aws:ecs:task-network-packet-loss](fis-actions-reference.md#task-network-packet-loss)

## Limitations
<a name="ecs-task-limitations"></a>
+ The following actions can't run in parallel:
  + aws:ecs:task-network-blackhole-port
  + aws:ecs:task-network-latency
  + aws:ecs:task-network-packet-loss
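+ If you have enabled Amazon ECS Exec, you must disable it before you can use these actions.
+ The SSM document execution might have the status Cancelled even if the experiment has the state Completed. When Amazon ECS actions run, the duration that you provide is used both as the action duration in the experiment and as the duration of the Amazon EC2 Systems Manager (SSM) document. After the action is initiated, it takes some time for the SSM document to start running. As a result, when the specified action duration is reached, the SSM document might still need a few seconds to finish. At that point the action stops and the SSM document execution is cancelled. The fault injection itself was successful.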

## Requirements
<a name="ecs-task-requirements"></a>
+ Add the following permissions to the AWS FIS [experiment role](getting-started-iam-service-role.md):
  + `ecs:DescribeTasks`
  + `ssm:SendCommand`
  + `ssm:ListCommands`
  + `ssm:CancelCommand`
+ Add the following permissions to the Amazon ECS [task IAM role](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-iam-roles.html):
  + `ssm:CreateActivation`
  + `ssm:AddTagsToResource`
  + `iam:PassRole`

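  Note that you can specify the ARN of the managed instance role as the resource for `iam:PassRole`.
+ Create an Amazon ECS [task execution IAM role](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_execution_IAM_role.html) and add the [AmazonECSTaskExecutionRolePolicy](https://docs.aws.amazon.com/aws-managed-policy/latest/reference/AmazonECSTaskExecutionRolePolicy.html) managed policy.
+ In the task definition, set the environment variable `MANAGED_INSTANCE_ROLE_NAME` to the name of the [managed instance role](https://docs.aws.amazon.com/systems-manager/latest/userguide/hybrid-multicloud-service-role.html). This role is attached to the tasks that are registered as managed instances in SSM.
+ Add the following permissions to the managed instance role: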
  + `ssm:DeleteActivation`
  + `ssm:DeregisterManagedInstance`
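+ Add the [AmazonSSMManagedInstanceCore](https://docs.aws.amazon.com/aws-managed-policy/latest/reference/AmazonSSMManagedInstanceCore.html) managed policy to the managed instance role.
+ Add an SSM Agent container to the Amazon ECS task definition. The command script registers the Amazon ECS task as a managed instance.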

  ```
  {
      "name": "amazon-ssm-agent",
      "image": "public.ecr.aws/amazon-ssm-agent/amazon-ssm-agent:latest",
      "cpu": 0,
      "links": [],
      "portMappings": [],
      "essential": false,
      "entryPoint": [],
      "command": [
          "/bin/bash",
          "-c",
          "set -e; dnf upgrade -y; dnf install jq procps awscli -y; term_handler() { echo \"Deleting SSM activation $ACTIVATION_ID\"; if ! aws ssm delete-activation --activation-id $ACTIVATION_ID --region $ECS_TASK_REGION; then echo \"SSM activation $ACTIVATION_ID failed to be deleted\" 1>&2; fi; MANAGED_INSTANCE_ID=$(jq -e -r .ManagedInstanceID /var/lib/amazon/ssm/registration); echo \"Deregistering SSM Managed Instance $MANAGED_INSTANCE_ID\"; if ! aws ssm deregister-managed-instance --instance-id $MANAGED_INSTANCE_ID --region $ECS_TASK_REGION; then echo \"SSM Managed Instance $MANAGED_INSTANCE_ID failed to be deregistered\" 1>&2; fi; kill -SIGTERM $SSM_AGENT_PID; }; trap term_handler SIGTERM SIGINT; if [[ -z $MANAGED_INSTANCE_ROLE_NAME ]]; then echo \"Environment variable MANAGED_INSTANCE_ROLE_NAME not set, exiting\" 1>&2; exit 1; fi; if ! ps ax | grep amazon-ssm-agent | grep -v grep > /dev/null; then if [[ -n $ECS_CONTAINER_METADATA_URI_V4 ]] ; then echo \"Found ECS Container Metadata, running activation with metadata\"; TASK_METADATA=$(curl \"${ECS_CONTAINER_METADATA_URI_V4}/task\"); ECS_TASK_AVAILABILITY_ZONE=$(echo $TASK_METADATA | jq -e -r '.AvailabilityZone'); ECS_TASK_ARN=$(echo $TASK_METADATA | jq -e -r '.TaskARN'); ECS_TASK_REGION=$(echo $ECS_TASK_AVAILABILITY_ZONE | sed 's/.$//'); ECS_TASK_AVAILABILITY_ZONE_REGEX='^(af|ap|ca|cn|eu|me|sa|us|us-gov)-(central|north|(north(east|west))|south|south(east|west)|east|west)-[0-9]{1}[a-z]{1}$'; if ! [[ $ECS_TASK_AVAILABILITY_ZONE =~ $ECS_TASK_AVAILABILITY_ZONE_REGEX ]]; then echo \"Error extracting Availability Zone from ECS Container Metadata, exiting\" 1>&2; exit 1; fi; ECS_TASK_ARN_REGEX='^arn:(aws|aws-cn|aws-us-gov):ecs:[a-z0-9-]+:[0-9]{12}:task/[a-zA-Z0-9_-]+/[a-zA-Z0-9]+$'; if ! [[ $ECS_TASK_ARN =~ $ECS_TASK_ARN_REGEX ]]; then echo \"Error extracting Task ARN from ECS Container Metadata, exiting\" 1>&2; exit 1; fi; CREATE_ACTIVATION_OUTPUT=$(aws ssm create-activation --iam-role $MANAGED_INSTANCE_ROLE_NAME --tags Key=ECS_TASK_AVAILABILITY_ZONE,Value=$ECS_TASK_AVAILABILITY_ZONE Key=ECS_TASK_ARN,Value=$ECS_TASK_ARN Key=FAULT_INJECTION_SIDECAR,Value=true --region $ECS_TASK_REGION); ACTIVATION_CODE=$(echo $CREATE_ACTIVATION_OUTPUT | jq -e -r .ActivationCode); ACTIVATION_ID=$(echo $CREATE_ACTIVATION_OUTPUT | jq -e -r .ActivationId); if ! amazon-ssm-agent -register -code $ACTIVATION_CODE -id $ACTIVATION_ID -region $ECS_TASK_REGION; then echo \"Failed to register with AWS Systems Manager (SSM), exiting\" 1>&2; exit 1; fi; amazon-ssm-agent & SSM_AGENT_PID=$!; wait $SSM_AGENT_PID; else echo \"ECS Container Metadata not found, exiting\" 1>&2; exit 1; fi; else echo \"SSM agent is already running, exiting\" 1>&2; exit 1; fi"
      ],
      "environment": [
          {
              "name": "MANAGED_INSTANCE_ROLE_NAME",
              "value": "{{SSMManagedInstanceRole}}"
          }
      ],
      "environmentFiles": [],
      "mountPoints": [],
      "volumesFrom": [],
      "secrets": [],
      "dnsServers": [],
      "dnsSearchDomains": [],
      "extraHosts": [],
      "dockerSecurityOptions": [],
      "dockerLabels": {},
      "ulimits": [],
      "logConfiguration": {},
      "systemControls": []
  }
  ```

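  For a more readable version of the script, see [Reference version of the script](#ecs-task-reference).
+ Enable the Amazon ECS fault injection APIs by setting the `enableFaultInjection` field in the Amazon ECS task definition: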

  ```
  "enableFaultInjection": true,
  ```
+ When you use the `aws:ecs:task-network-blackhole-port`, `aws:ecs:task-network-latency`, or `aws:ecs:task-network-packet-loss` actions on Fargate tasks, the action must have the `useEcsFaultInjectionEndpoints` parameter set to `true`.
+ When you use the `aws:ecs:task-kill-process`, `aws:ecs:task-network-blackhole-port`, `aws:ecs:task-network-latency`, or `aws:ecs:task-network-packet-loss` actions, the Amazon ECS task definition must have `pidMode` set to `task`.
+ When you use the `aws:ecs:task-network-blackhole-port`, `aws:ecs:task-network-latency`, or `aws:ecs:task-network-packet-loss` actions on tasks with the EC2 launch type, the [network mode in the task definition](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-networking.html) must be set to `awsvpc` or `host`.
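
The task definition requirements above can be checked before you register the task definition. The following is a minimal sketch, assuming `jq` is installed and that the task definition is stored in a local JSON file. The function name `check_task_definition` is hypothetical. It applies the strictest combination of the requirements (all network actions on the EC2 launch type), so it might report a problem for a field that your chosen action does not need.

```shell
#!/usr/bin/env bash

# Hypothetical preflight check for a task definition JSON file (requires jq).
# Prints one message per unmet requirement and returns non-zero if any is unmet.
check_task_definition() {
  file=$1
  rc=0
  if ! jq -e '.enableFaultInjection == true' "$file" > /dev/null; then
    echo "enableFaultInjection must be set to true" 1>&2
    rc=1
  fi
  if ! jq -e '.pidMode == "task"' "$file" > /dev/null; then
    echo "pidMode must be task for the kill-process and network actions" 1>&2
    rc=1
  fi
  if ! jq -e '.networkMode == "awsvpc" or .networkMode == "host"' "$file" > /dev/null; then
    echo "networkMode must be awsvpc or host for the network actions on EC2" 1>&2
    rc=1
  fi
  if ! jq -e 'any(.containerDefinitions[]?.environment[]?; .name == "MANAGED_INSTANCE_ROLE_NAME")' "$file" > /dev/null; then
    echo "no container sets MANAGED_INSTANCE_ROLE_NAME" 1>&2
    rc=1
  fi
  return $rc
}
```

Running `check_task_definition task-definition.json` prints nothing and returns zero when all four checks pass. The check does not verify IAM permissions or the `useEcsFaultInjectionEndpoints` action parameter, which belongs to the experiment template rather than the task definition.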

## Reference version of the script
<a name="ecs-task-reference"></a>

The following is a more readable version of the script in the Requirements section, for your reference.

```
#!/usr/bin/env bash

# This is the activation script used to register ECS tasks as Managed Instances in SSM
# The script retrieves information from the ECS task metadata endpoint to add three tags to the Managed Instance
#  - ECS_TASK_AVAILABILITY_ZONE: To allow customers to target Managed Instances / Tasks in a specific Availability Zone
#  - ECS_TASK_ARN: To allow customers to target Managed Instances / Tasks by using the Task ARN
#  - FAULT_INJECTION_SIDECAR: To make it clear that the tasks were registered as managed instance for fault injection purposes. Value is always 'true'.
# The script will leave the SSM Agent running in the background
# When the container running this script receives a SIGTERM or SIGINT signal, it will do the following cleanup:
#  - Delete SSM activation
#  - Deregister SSM managed instance

set -e # stop execution immediately when a command exits with a non-zero status

dnf upgrade -y
dnf install jq procps awscli -y

term_handler() {
  echo "Deleting SSM activation $ACTIVATION_ID"
  if ! aws ssm delete-activation --activation-id $ACTIVATION_ID --region $ECS_TASK_REGION; then
    echo "SSM activation $ACTIVATION_ID failed to be deleted" 1>&2
  fi

  MANAGED_INSTANCE_ID=$(jq -e -r .ManagedInstanceID /var/lib/amazon/ssm/registration)
  echo "Deregistering SSM Managed Instance $MANAGED_INSTANCE_ID"
  if ! aws ssm deregister-managed-instance --instance-id $MANAGED_INSTANCE_ID --region $ECS_TASK_REGION; then
    echo "SSM Managed Instance $MANAGED_INSTANCE_ID failed to be deregistered" 1>&2
  fi

  kill -SIGTERM $SSM_AGENT_PID
}
trap term_handler SIGTERM SIGINT

# check if the required IAM role is provided
if [[ -z $MANAGED_INSTANCE_ROLE_NAME ]] ; then
  echo "Environment variable MANAGED_INSTANCE_ROLE_NAME not set, exiting" 1>&2
  exit 1
fi

# check if the agent is already running (it will be if ECS Exec is enabled)
if ! ps ax | grep amazon-ssm-agent | grep -v grep > /dev/null; then

  # check if ECS Container Metadata is available
  if [[ -n $ECS_CONTAINER_METADATA_URI_V4 ]] ; then

    # Retrieve info from ECS task metadata endpoint
    echo "Found ECS Container Metadata, running activation with metadata"
    TASK_METADATA=$(curl "${ECS_CONTAINER_METADATA_URI_V4}/task")
    ECS_TASK_AVAILABILITY_ZONE=$(echo $TASK_METADATA | jq -e -r '.AvailabilityZone')
    ECS_TASK_ARN=$(echo $TASK_METADATA | jq -e -r '.TaskARN')
    ECS_TASK_REGION=$(echo $ECS_TASK_AVAILABILITY_ZONE | sed 's/.$//')

    # validate ECS_TASK_AVAILABILITY_ZONE
    ECS_TASK_AVAILABILITY_ZONE_REGEX='^(af|ap|ca|cn|eu|me|sa|us|us-gov)-(central|north|(north(east|west))|south|south(east|west)|east|west)-[0-9]{1}[a-z]{1}$'
    if ! [[ $ECS_TASK_AVAILABILITY_ZONE =~ $ECS_TASK_AVAILABILITY_ZONE_REGEX ]] ; then
      echo "Error extracting Availability Zone from ECS Container Metadata, exiting" 1>&2
      exit 1
    fi

    # validate ECS_TASK_ARN
    ECS_TASK_ARN_REGEX='^arn:(aws|aws-cn|aws-us-gov):ecs:[a-z0-9-]+:[0-9]{12}:task/[a-zA-Z0-9_-]+/[a-zA-Z0-9]+$'
    if ! [[ $ECS_TASK_ARN =~ $ECS_TASK_ARN_REGEX ]] ; then
      echo "Error extracting Task ARN from ECS Container Metadata, exiting" 1>&2
      exit 1
    fi

    # Create activation tagging with Availability Zone and Task ARN
    CREATE_ACTIVATION_OUTPUT=$(aws ssm create-activation \
      --iam-role $MANAGED_INSTANCE_ROLE_NAME \
      --tags Key=ECS_TASK_AVAILABILITY_ZONE,Value=$ECS_TASK_AVAILABILITY_ZONE Key=ECS_TASK_ARN,Value=$ECS_TASK_ARN Key=FAULT_INJECTION_SIDECAR,Value=true \
      --region $ECS_TASK_REGION)

    ACTIVATION_CODE=$(echo $CREATE_ACTIVATION_OUTPUT | jq -e -r .ActivationCode)
    ACTIVATION_ID=$(echo $CREATE_ACTIVATION_OUTPUT | jq -e -r .ActivationId)

    # Register with AWS Systems Manager (SSM)
    if ! amazon-ssm-agent -register -code $ACTIVATION_CODE -id $ACTIVATION_ID -region $ECS_TASK_REGION; then
      echo "Failed to register with AWS Systems Manager (SSM), exiting" 1>&2
      exit 1
    fi

    # the agent needs to run in the background, otherwise the trapped signal
    # won't execute the attached function until this process finishes
    amazon-ssm-agent &
    SSM_AGENT_PID=$!

    # need to keep the script alive, otherwise the container will terminate
    wait $SSM_AGENT_PID

  else
    echo "ECS Container Metadata not found, exiting" 1>&2
    exit 1
  fi

else
  echo "SSM agent is already running, exiting" 1>&2
  exit 1
fi
```
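
The script derives the Region from the Availability Zone by dropping the trailing zone letter, and only after the zone name passes a pattern check. The following is a minimal sketch of that step in isolation, assuming only `grep` and `sed` are available. The function name `region_from_az` is hypothetical, and `grep -E` stands in for the `[[ =~ ]]` test used in the script.

```shell
#!/usr/bin/env bash

# Same pattern the activation script uses for Availability Zone names.
AZ_REGEX='^(af|ap|ca|cn|eu|me|sa|us|us-gov)-(central|north|(north(east|west))|south|south(east|west)|east|west)-[0-9]{1}[a-z]{1}$'

# Print the Region for an Availability Zone name.
# Returns non-zero if the name does not match the pattern.
region_from_az() {
  az=$1
  if ! printf '%s\n' "$az" | grep -Eq "$AZ_REGEX"; then
    echo "Unexpected Availability Zone format: $az" 1>&2
    return 1
  fi
  # Drop the last character (the zone letter), as the script does.
  printf '%s\n' "$az" | sed 's/.$//'
}
```

For example, `region_from_az us-east-1a` prints `us-east-1`. Because the pattern lists a fixed set of Region prefixes, a zone name with a prefix outside that set is rejected, and the script exits in that case.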

## Example experiment template
<a name="example-ecs-task-experiment-template"></a>

The following is an example experiment template for the [aws:ecs:task-cpu-stress](fis-actions-reference.md#task-cpu-stress) action.

```
{
    "description": "Run CPU stress on the target ECS tasks",
    "targets": {
        "myTasks": {
            "resourceType": "aws:ecs:task",
            "resourceArns": [
                "arn:aws:ecs:{{us-east-1}}:{{111122223333}}:task/{{my-cluster}}/{{09821742c0e24250b187dfed8EXAMPLE}}"
            ],
            "selectionMode": "{{ALL}}"
        }
    },
    "actions": {
        "EcsTask-cpu-stress": {
            "actionId": "aws:ecs:task-cpu-stress",
            "parameters": {
                "duration": "{{PT1M}}"
            },
            "targets": {
                "Tasks": "myTasks"
            }
        }
    },
    "stopConditions": [
        {
            "source": "none"
        }
    ],
    "roleArn": "arn:aws:iam::{{111122223333}}:role/{{fis-experiment-role}}",
    "tags": {}
}
```
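
A template like the one above can be submitted with the AWS CLI. The following is a minimal sketch, assuming the AWS CLI is installed, the template is saved to a local file with its placeholder values replaced, and your credentials allow the AWS FIS calls. The function names are hypothetical. The first helper only assembles a task ARN in the shape used by the `resourceArns` entry.

```shell
#!/usr/bin/env bash

# Assemble a task ARN for the standard aws partition.
# Arguments: Region, account ID, cluster name, task ID.
ecs_task_arn() {
  printf 'arn:aws:ecs:%s:%s:task/%s/%s' "$1" "$2" "$3" "$4"
}

# Create an experiment template from a local JSON file and print its ID.
# Requires AWS credentials; not called here.
create_template() {
  aws fis create-experiment-template \
    --cli-input-json "file://$1" \
    --query 'experimentTemplate.id' \
    --output text
}

# Start an experiment from an experiment template ID.
# Requires AWS credentials; not called here.
start_experiment() {
  aws fis start-experiment --experiment-template-id "$1"
}
```

A typical sequence would be `id=$(create_template template.json)` followed by `start_experiment "$id"`. Review the stop conditions before you start an experiment against a production task.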