


# Using the Amazon Bedrock Data Automation CLI
<a name="bda-cli-guide"></a>

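Amazon Bedrock Data Automation (BDA) provides a streamlined CLI workflow for processing your data. For all modalities, the workflow consists of three main steps: creating a project, creating blueprints for custom output, and processing documents. This guide walks you through the key CLI commands for working with BDA.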

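## Create your first Data Automation project
<a name="create-data-automation-project-cli"></a>

To get started with BDA, first create a project using the `create-data-automation-project` command.

Consider this sample passport that we will process:

![\[Sample passport image.\]](http://docs.aws.amazon.com/zh_cn/bedrock/latest/userguide/images/bda/passport2.png)


When you create a project, you must define configuration settings for the type of file you intend to process. The following command is a minimal working example that creates an image processing project: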

```
aws bedrock-data-automation create-data-automation-project \
    --project-name "ImageProcessingProject" \
    --standard-output-configuration '{
        "image": {
            "extraction": {
                "category": {
                    "state": "ENABLED",
                    "types": ["TEXT_DETECTION"]
                },
                "boundingBox": {
                    "state": "ENABLED"
                }
            },
            "generativeField": {
                "state": "ENABLED"
            }
        }
    }'
```

The command validates the input configuration and then creates a new project with a unique ARN. The response includes the project ARN and stage:

```
{
    "projectArn": "Amazon Resource Name (ARN)",
    "projectStage": "DEVELOPMENT",
    "status": "IN_PROGRESS"
}
```

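If you create a project without specifying parameters, the default settings apply. For example, when processing images, image summarization and text detection are enabled by default.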
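When a project is created from a script rather than typed by hand, it can help to assemble the `--standard-output-configuration` JSON programmatically. The following Python sketch builds the same image configuration as the command above and returns the argument list for the AWS CLI. The helper names `image_standard_output` and `create_project_args` are illustrative, and the `DISABLED` state value is an assumption that does not appear in this guide.

```python
import json


def image_standard_output(category_types, bounding_box=True, generative=True):
    """Build the image section of a standard output configuration.

    Mirrors the JSON structure used in the CLI example above.
    """
    def state(enabled):
        # "DISABLED" is assumed to be the counterpart of "ENABLED".
        return "ENABLED" if enabled else "DISABLED"

    return {
        "image": {
            "extraction": {
                "category": {"state": "ENABLED", "types": list(category_types)},
                "boundingBox": {"state": state(bounding_box)},
            },
            "generativeField": {"state": state(generative)},
        }
    }


def create_project_args(project_name, standard_output):
    """Return the argv list for create-data-automation-project."""
    return [
        "aws", "bedrock-data-automation", "create-data-automation-project",
        "--project-name", project_name,
        "--standard-output-configuration", json.dumps(standard_output),
    ]


config = image_standard_output(["TEXT_DETECTION"])
args = create_project_args("ImageProcessingProject", config)
print(" ".join(args[:3]))
```

If the AWS CLI is installed and configured, the list can be passed to `subprocess.run(args, check=True)`; building an argument list avoids shell quoting problems with the embedded JSON.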

## Complete parameter reference
<a name="create-project-parameters"></a>

The following table shows all available parameters for the `create-data-automation-project` command.


**Parameters for create-data-automation-project**  

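| Parameter | Required | Default | Description | 
| --- | --- | --- | --- | 
| --project-name | Yes | N/A | Name of the Data Automation project | 
| --project-type | No |  | The project type determines which runtime processing API the project can be used with. ASYNC projects can only be used with the `invoke-data-automation-async` API, and SYNC projects can only be used with the `invoke-data-automation` API. | 
| --project-stage | No | LIVE | Stage of the project (DEVELOPMENT or LIVE) | 
| --standard-output-configuration | Yes | N/A | JSON configuration for standard output processing | 
| --custom-output-configuration | No | N/A | JSON configuration for custom output processing | 
| --encryption-configuration | No | N/A | Encryption settings for the project | 
| --client-token | No | Auto-generated | Unique identifier for request idempotency | 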

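## Create a blueprint
<a name="create-blueprint-cli"></a>

After you create a project, you can use the `create-blueprint` command to create a blueprint that defines the structure of your data processing.

The following is a minimal working example that creates a blueprint tailored to processing passports: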

```
aws bedrock-data-automation create-blueprint \
    --blueprint-name "passport-blueprint" \
    --type "IMAGE" \
    --blueprint-stage "DEVELOPMENT" \
    --schema '{
        "class": "Passport",
        "description": "Blueprint for processing passport images",
        "properties": {
            "passport_number": {
                "type": "string",
                "inferenceType": "explicit",
                "instruction": "The passport identification number"
            },
            "full_name": {
                "type": "string",
                "inferenceType": "explicit",
                "instruction": "The full name of the passport holder"
            }
        }
    }'
```

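The command creates a new blueprint with the specified schema. You can then use this blueprint when processing documents to extract structured data according to the schema you defined.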
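Because every field in the passport schema follows the same pattern, the schema can also be generated from a simple mapping of field names to instructions. The Python sketch below shows one way to do that; `blueprint_schema` is a hypothetical helper, and it only emits explicit string fields like those in the example above.

```python
import json


def blueprint_schema(class_name, description, fields):
    """Build a blueprint schema from a {field_name: instruction} mapping.

    Every field is emitted as an explicit string field, matching the
    passport example; other field types would need an extra parameter.
    """
    properties = {
        name: {
            "type": "string",
            "inferenceType": "explicit",
            "instruction": instruction,
        }
        for name, instruction in fields.items()
    }
    return {
        "class": class_name,
        "description": description,
        "properties": properties,
    }


schema = blueprint_schema(
    "Passport",
    "Blueprint for processing passport images",
    {
        "passport_number": "The passport identification number",
        "full_name": "The full name of the passport holder",
    },
)
schema_json = json.dumps(schema, indent=4)
print(schema_json)
```

The resulting `schema_json` string is what would be passed as the `--schema` value.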

## Using blueprints
<a name="using-blueprint-cli"></a>

### Add a blueprint to a project
<a name="adding-blueprint-to-project"></a>

To add a blueprint to a project, use the `update-data-automation-project` command:

```
aws bedrock-data-automation update-data-automation-project \
    --project-arn "Amazon Resource Name (ARN)" \
    --standard-output-configuration '{
        "image": {
            "extraction": {
                "category": {
                    "state": "ENABLED",
                    "types": ["TEXT_DETECTION"]
                },
                "boundingBox": {
                    "state": "ENABLED"
                }
            },
            "generativeField": {
                "state": "ENABLED",
                "types": ["IMAGE_SUMMARY"]
            }
        }
    }' \
    --custom-output-configuration '{
        "blueprints": [
            {
                "blueprintArn": "Amazon Resource Name (ARN)",
                "blueprintVersion": "1",
                "blueprintStage": "LIVE"
            }
        ]
    }'
```
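When a project references several blueprints, the `--custom-output-configuration` value can be generated from a list instead of being written by hand. The following Python sketch assumes the JSON shape shown in the command above; the helper name and the example ARNs are placeholders for illustration only.

```python
import json

ALLOWED_STAGES = {"DEVELOPMENT", "LIVE"}


def custom_output_configuration(blueprints):
    """Build a custom output configuration from (arn, version, stage) tuples."""
    entries = []
    for arn, version, stage in blueprints:
        if stage not in ALLOWED_STAGES:
            raise ValueError(f"unsupported blueprint stage: {stage}")
        entries.append(
            {
                "blueprintArn": arn,
                "blueprintVersion": str(version),
                "blueprintStage": stage,
            }
        )
    return {"blueprints": entries}


# Placeholder ARNs; replace them with the ARNs of your own blueprints.
custom_config = custom_output_configuration(
    [
        ("arn:aws:bedrock:us-east-1:111122223333:blueprint/example-passport", 1, "LIVE"),
        ("arn:aws:bedrock:us-east-1:111122223333:blueprint/example-visa", 2, "DEVELOPMENT"),
    ]
)
print(json.dumps(custom_config, indent=4))
```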

### Verify blueprint integration
<a name="verifying-blueprint-integration"></a>

You can verify the blueprint integration using the `get-data-automation-project` command:

```
aws bedrock-data-automation get-data-automation-project \
    --project-arn "Amazon Resource Name (ARN)"
```

### Manage multiple blueprints
<a name="managing-multiple-blueprints"></a>

Use the `list-blueprints` command to view all of your blueprints:

```
aws bedrock-data-automation list-blueprints
```

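## Process documents asynchronously
<a name="invoke-data-automation-cli"></a>

Before you process documents with BDA, you must first upload them to an S3 bucket. After you set up a project, process documents using the `invoke-data-automation-async` command: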

```
aws bedrock-data-automation-runtime invoke-data-automation-async \
    --input-configuration '{
        "s3Uri": "s3://my-bda-documents/invoices/invoice-123.pdf"
    }' \
    --output-configuration '{
        "s3Uri": "s3://my-bda-documents/output/"
    }' \
    --data-automation-configuration '{
        "dataAutomationProjectArn": "Amazon Resource Name (ARN)",
        "stage": "LIVE"
    }' \
    --data-automation-profile-arn "Amazon Resource Name (ARN)"
```

The command returns an invocation ARN that you can use to check the processing status:

```
{
    "invocationArn": "Amazon Resource Name (ARN)"
}
```

## Check processing status
<a name="get-data-automation-status-cli"></a>

To check the status of your processing job, use the `get-data-automation-status` command:

```
aws bedrock-data-automation-runtime get-data-automation-status \
    --invocation-arn "Amazon Resource Name (ARN)"
```

The command returns the current status of the processing job:

```
{
    "status": "COMPLETED",
    "creationTime": "2025-07-09T12:34:56.789Z",
    "lastModifiedTime": "2025-07-09T12:45:12.345Z",
    "outputLocation": "s3://my-bda-documents/output/efgh5678/"
}
```

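Possible status values include:
+ `IN_PROGRESS`: The processing job is currently running.
+ `COMPLETED`: The processing job completed successfully.
+ `FAILED`: The processing job failed. Check the response for error details.
+ `STOPPED`: The processing job was stopped manually.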
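Since the job runs asynchronously, a script typically polls the status until it reaches one of the final values listed above. The Python sketch below shows such a loop. It takes the status lookup as a callable so that it makes no assumptions about how the CLI or an SDK is invoked; `wait_for_completion` and its parameters are illustrative names, and the demonstration uses canned responses rather than real calls.

```python
import time

# Final states, taken from the status list above.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "STOPPED"}


def wait_for_completion(get_status, interval_seconds=5.0, max_attempts=60,
                        sleep=time.sleep):
    """Poll `get_status` until it reports a terminal status.

    `get_status` is any zero-argument callable that returns the parsed
    JSON response of get-data-automation-status as a dict.
    """
    for attempt in range(max_attempts):
        response = get_status()
        if response["status"] in TERMINAL_STATUSES:
            return response
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"job still running after {max_attempts} status checks")


# Demonstration with canned responses instead of real CLI calls.
canned = iter([
    {"status": "IN_PROGRESS"},
    {"status": "IN_PROGRESS"},
    {"status": "COMPLETED", "outputLocation": "s3://my-bda-documents/output/efgh5678/"},
])
result = wait_for_completion(lambda: next(canned), sleep=lambda seconds: None)
print(result["status"])
```

In real use, `get_status` could wrap a `subprocess.run` call to `aws bedrock-data-automation-runtime get-data-automation-status` and parse its standard output with `json.loads`.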

## Retrieve results
<a name="retrieve-results-cli"></a>

After processing completes, you can list the output files in your S3 bucket:

```
aws s3 ls s3://my-bda-documents/output/efgh5678/
```

To download the results to your local machine:

```
aws s3 cp s3://my-bda-documents/output/efgh5678/ ~/Downloads/bda-results/ --recursive
```

The output contains structured data based on your project configuration and any blueprints you applied.
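The `outputLocation` value returned by the status command is an S3 URI, so a script usually needs to split it into bucket and prefix or feed it to `aws s3 cp`. The following Python sketch does both using only the standard library; `split_s3_uri` and `download_args` are hypothetical helper names.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split an s3:// URI into (bucket, key_prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not a valid S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def download_args(output_location, local_dir):
    """Return the argv list for a recursive `aws s3 cp` of the output prefix."""
    split_s3_uri(output_location)  # validates the URI before building the command
    return ["aws", "s3", "cp", output_location, local_dir, "--recursive"]


bucket, prefix = split_s3_uri("s3://my-bda-documents/output/efgh5678/")
copy_command = download_args("s3://my-bda-documents/output/efgh5678/", "bda-results/")
print(bucket, prefix)
```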

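## Process documents synchronously
<a name="process-docs-sync"></a>

The synchronous API accepts input either from an S3 bucket or as image bytes, so you only need to upload a file to an S3 bucket when you reference it by S3 URI. The command returns structured data based on your project configuration and any blueprints you applied: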

```
aws bedrock-data-automation-runtime invoke-data-automation \
    --input-configuration '{
        "s3Uri": "s3://my-bda-documents/invoices/invoice-123.pdf"
    }' \
    --data-automation-configuration '{
        "dataAutomationProjectArn": "Amazon Resource Name (ARN)",
        "stage": "LIVE"
    }' \
    --data-automation-profile-arn "Amazon Resource Name (ARN)"
```

## Process images synchronously
<a name="process-images-sync"></a>

The command returns structured data based on your project configuration and any blueprints you applied:

```
aws bedrock-data-automation-runtime invoke-data-automation \
    --input-configuration '{
        "s3Uri": "s3://my-bda-documents/invoices/advertisement_latest.jpeg"
    }' \
    --data-automation-configuration '{
        "dataAutomationProjectArn": "Amazon Resource Name (ARN)",
        "stage": "LIVE"
    }' \
    --data-automation-profile-arn "Amazon Resource Name (ARN)"
```

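# Blueprint operations CLI
<a name="bda-blueprint-operations"></a>

This guide covers the blueprint operations available through the AWS Command Line Interface (CLI) for Amazon Bedrock Data Automation (BDA).

## Create a blueprint
<a name="create-blueprints-cli"></a>

A blueprint defines the structure and properties of the data to extract from document, image, audio, or video files. Use the `create-blueprint` command to define a new blueprint.

The following command creates a new blueprint tailored to extracting data from passport images.

**Syntax**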

```
aws bedrock-data-automation create-blueprint \
      --blueprint-name "passport-blueprint" \
      --type "IMAGE" \
      --blueprint-stage "DEVELOPMENT" \
      --schema '{
        "class": "Passport",
        "description": "Blueprint for processing passport images",
        "properties": {
          "passport_number": {
            "type": "string",
            "inferenceType": "explicit",
            "instruction": "The passport identification number"
          },
          "full_name": {
            "type": "string",
            "inferenceType": "explicit",
            "instruction": "The full name of the passport holder"
          },
          "expiration_date": {
            "type": "string",
            "inferenceType": "explicit",
            "instruction": "The passport expiration date"
          }
        }
      }'
```

## Complete parameter reference
<a name="create-blueprint-parameters"></a>

The following table shows all available parameters for the `create-blueprint` command.


**Parameters for create-blueprint**  

| Parameter | Required | Default | Description | 
| --- | --- | --- | --- | 
| --blueprint-name | Yes | N/A | Name of the blueprint | 
| --type | Yes | N/A | Content type (IMAGE, DOCUMENT, AUDIO, VIDEO) | 
| --blueprint-stage | No | LIVE | Stage of the blueprint (DEVELOPMENT or LIVE) | 
| --schema | Yes | N/A | JSON schema that defines the blueprint structure | 
| --client-token | No | Auto-generated | Unique identifier for request idempotency | 

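## View blueprint configuration
<a name="view-blueprint-cli"></a>

**List all blueprints**

Use the `list-blueprints` command to retrieve a list of all blueprints associated with your account.

**Syntax**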

```
aws bedrock-data-automation list-blueprints
```

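**View blueprint details**

To view detailed information about a specific blueprint, including its schema and configuration, use the `get-blueprint` command.

**Syntax**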

```
aws bedrock-data-automation get-blueprint \
      --blueprint-arn "Amazon Resource Name (ARN)"
```

**Inspect a specific version**

When working with versioned blueprints, use the `get-blueprint` command with the `--blueprint-version` option to view a specific version.

**Syntax**

```
aws bedrock-data-automation get-blueprint \
      --blueprint-arn "Amazon Resource Name (ARN)" \
      --blueprint-version "version-number"
```

**Inspect a specific stage**

To view a blueprint in the DEVELOPMENT or LIVE stage, use:

```
aws bedrock-data-automation get-blueprint \
      --blueprint-arn "Amazon Resource Name (ARN)" \
      --blueprint-stage "LIVE"
```

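## Edit blueprint specifications
<a name="edit-blueprint-cli"></a>

**Update blueprint settings**

To modify the schema or properties of an existing blueprint, use the `update-blueprint` command.

**Syntax**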

```
aws bedrock-data-automation update-blueprint \
      --blueprint-arn "Amazon Resource Name (ARN)" \
      --schema '{
        "class": "Passport",
        "description": "Updated blueprint for processing passport images",
        "properties": {
          "passport_number": {
            "type": "string",
            "inferenceType": "explicit",
            "instruction": "The passport identification number"
          },
          "full_name": {
            "type": "string",
            "inferenceType": "explicit",
            "instruction": "The full name of the passport holder"
          },
          "expiration_date": {
            "type": "string",
            "inferenceType": "explicit",
            "instruction": "The passport expiration date"
          }
        }
      }'
```

**Note:** When you update a blueprint, you must provide the complete schema, including the fields that you are not changing.
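Because the complete schema has to be resent on every update, a common pattern is to start from the current schema, apply the change to a copy, and submit the result. The Python sketch below illustrates that pattern with a hypothetical `with_field` helper; the `nationality` field is invented for the example and is not part of the passport blueprint in this guide.

```python
import copy
import json


def with_field(schema, name, instruction, field_type="string"):
    """Return a copy of `schema` with one field added or replaced.

    The original schema is left untouched, and the result still contains
    every existing field, so it can be sent as the complete --schema value.
    """
    updated = copy.deepcopy(schema)
    updated.setdefault("properties", {})[name] = {
        "type": field_type,
        "inferenceType": "explicit",
        "instruction": instruction,
    }
    return updated


current = {
    "class": "Passport",
    "description": "Blueprint for processing passport images",
    "properties": {
        "passport_number": {
            "type": "string",
            "inferenceType": "explicit",
            "instruction": "The passport identification number",
        }
    },
}
# "nationality" is a hypothetical field used only for illustration.
updated = with_field(current, "nationality", "The nationality of the passport holder")
print(json.dumps(updated, indent=2))
```

The current schema could be obtained from the output of `get-blueprint` before the change is applied.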

**Promote to LIVE**

To move a blueprint from the DEVELOPMENT stage to the LIVE stage for production use, run the `update-blueprint` command with the `--blueprint-stage` option.

**Syntax**

```
aws bedrock-data-automation update-blueprint \
      --blueprint-arn "Amazon Resource Name (ARN)" \
      --blueprint-stage "LIVE"
```

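**Blueprint versioning**

Before making significant changes, use the `create-blueprint-version` command to create a new version of the blueprint and preserve its current state.

**Syntax**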

```
aws bedrock-data-automation create-blueprint-version \
      --blueprint-arn "Amazon Resource Name (ARN)"
```

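## Manage blueprint tags
<a name="tag-management-cli"></a>

Tags help you organize and categorize blueprints, which simplifies management.

**Add tags**

Apply metadata to a blueprint by adding tags.

**Syntax**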

```
aws bedrock-data-automation tag-resource \
      --resource-arn "Amazon Resource Name (ARN)" \
      --tags '{"Department":"Finance","Project":"PassportProcessing"}'
```

**Remove tags**

Use the `untag-resource` command to remove specific tags from a blueprint.

**Syntax**

```
aws bedrock-data-automation untag-resource \
      --resource-arn "Amazon Resource Name (ARN)" \
      --tag-keys '["Department","Project"]'
```

**View tags**

Use the `list-tags-for-resource` command to list all tags associated with your blueprint.

**Syntax**

```
aws bedrock-data-automation list-tags-for-resource \
      --resource-arn "Amazon Resource Name (ARN)"
```

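## Delete a blueprint
<a name="delete-blueprint-cli"></a>

**Delete an entire blueprint**

Use the `delete-blueprint` command to permanently delete a blueprint and all of its versions.

**Syntax**

```
aws bedrock-data-automation delete-blueprint \
      --blueprint-arn "Amazon Resource Name (ARN)"
```

**Note:** This command permanently deletes the blueprint, and it cannot be recovered.

**Important:** You cannot delete a blueprint that is currently used by any project. Before you delete a blueprint, make sure that it is not referenced in the custom output configuration of any project.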

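## Blueprint optimization
<a name="blueprint-optimization-cli"></a>

### Invoke blueprint optimization
<a name="invoking-blueprint-optimization"></a>

Start an asynchronous blueprint optimization job to improve the instructions for each blueprint field and the accuracy of the results.

**Syntax**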

```
aws bedrock-data-automation invoke-blueprint-optimization-async \
    --blueprint blueprintArn="arn:aws:bedrock:<region>:<account_id>:blueprint/<blueprint_id>",stage="DEVELOPMENT" \
    --samples '[
        {
            "assetS3Object": {
                "s3Uri": "s3://my-optimization-bucket/samples/document1.pdf"
            },
            "groundTruthS3Object": {
                "s3Uri": "s3://my-optimization-bucket/ground-truth/document1-expected.json"
            }
        }
    ]' \
    --output-configuration s3Object='{s3Uri="s3://my-optimization-bucket/results/optimization-output"}' \
    --data-automation-profile-arn "Amazon Resource Name (ARN):data-automation-profile/default"
```

### Check blueprint optimization status
<a name="checking-blueprint-optimization-status"></a>

Monitor the progress and results of a blueprint optimization job.

**Syntax**

```
aws bedrock-data-automation get-blueprint-optimization-status \
    --invocation-arn "arn:aws:bedrock:<region>:<account_id>:blueprint-optimization-invocation/opt-12345abcdef"
```

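Use this command to track the status of the optimization job. The response includes the current status (Created, InProgress, ServiceError, Success, or ClientError) and the output configuration details.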
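A script that monitors optimization jobs generally needs to know whether a given status means the job is still running, has succeeded, or has failed. The Python sketch below groups the five status values named above into those three outcomes. The grouping is an interpretation of the status names, not documented behavior.

```python
# Grouping of the status values named above; the grouping itself is an
# assumption based on the names.
IN_FLIGHT = {"Created", "InProgress"}
FAILURES = {"ServiceError", "ClientError"}


def describe_optimization_status(status):
    """Map an optimization job status to 'running', 'succeeded', or 'failed'."""
    if status in IN_FLIGHT:
        return "running"
    if status == "Success":
        return "succeeded"
    if status in FAILURES:
        return "failed"
    raise ValueError(f"unrecognized status: {status!r}")


for name in ("Created", "InProgress", "Success", "ServiceError", "ClientError"):
    print(name, "->", describe_optimization_status(name))
```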

### Copy blueprint stages
<a name="copying-blueprint-stages"></a>

Copy a blueprint from one stage to another.

**Syntax**

```
aws bedrock-data-automation copy-blueprint-stage \
    --blueprint-arn "arn:aws:bedrock:<region>:<account_id>:blueprint/<blueprint_id>" \
    --source-stage "DEVELOPMENT" \
    --target-stage "LIVE"
```

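**Note:** This command copies the entire blueprint configuration from the source stage to the target stage and overwrites any existing configuration in the target stage.

**Important:** Make sure that the blueprint has been thoroughly tested in the source stage before you copy it to the production (LIVE) stage. This operation cannot be easily undone.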

# Processing through the CLI
<a name="bda-document-processing-cli"></a>

Before you process documents with BDA, you must first upload them to an S3 bucket:

**Syntax**

```
aws s3 cp <source> <target> [--options]
```

Example:

```
aws s3 cp /local/path/document.pdf s3://my-bda-bucket/input/document.pdf
```

------
#### [ Async ]

**Basic processing command structure**

Use the `invoke-data-automation-async` command to process a file:

```
aws bedrock-data-automation-runtime invoke-data-automation-async \
        --input-configuration '{
            "s3Uri": "s3://amzn-s3-demo-bucket/sample-images/sample-image.jpg"
        }' \
        --output-configuration '{
            "s3Uri": "s3://amzn-s3-demo-bucket/output/"
        }' \
        --data-automation-configuration '{
            "dataAutomationProjectArn": "Amazon Resource Name (ARN)",
            "stage": "LIVE"
        }' \
        --data-automation-profile-arn "Amazon Resource Name (ARN)"
```

**Advanced processing command structure**

**Video processing with time segments**

For video files, you can specify the time segment to process:

```
aws bedrock-data-automation-runtime invoke-data-automation-async \
        --input-configuration '{
            "s3Uri": "s3://my-bucket/video.mp4",
            "assetProcessingConfiguration": {
                "video": {
                    "segmentConfiguration": {
                        "timestampSegment": {
                            "startTimeMillis": 0,
                            "endTimeMillis": 300000
                        }
                    }
                }
            }
        }' \
        --output-configuration '{
            "s3Uri": "s3://my-bucket/output/"
        }' \
        --data-automation-configuration '{
            "dataAutomationProjectArn": "Amazon Resource Name (ARN)",
            "stage": "LIVE"
        }' \
        --data-automation-profile-arn "Amazon Resource Name (ARN)"
```
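The segment boundaries are expressed in milliseconds, which is easy to get wrong when a range is known in seconds or minutes. The following Python sketch converts a range in seconds into the `assetProcessingConfiguration` structure used in the command above; `timestamp_segment` is a hypothetical helper name.

```python
import json


def timestamp_segment(start_seconds, end_seconds):
    """Build the video asset processing configuration for a range in seconds."""
    if start_seconds < 0 or end_seconds <= start_seconds:
        raise ValueError("expected 0 <= start_seconds < end_seconds")
    return {
        "video": {
            "segmentConfiguration": {
                "timestampSegment": {
                    # The command above expects millisecond values.
                    "startTimeMillis": int(round(start_seconds * 1000)),
                    "endTimeMillis": int(round(end_seconds * 1000)),
                }
            }
        }
    }


# First five minutes of the video, as in the command above.
input_configuration = {
    "s3Uri": "s3://my-bucket/video.mp4",
    "assetProcessingConfiguration": timestamp_segment(0, 5 * 60),
}
print(json.dumps(input_configuration, indent=4))
```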

**Using custom blueprints**

You can specify custom blueprints directly in the command:

```
aws bedrock-data-automation-runtime invoke-data-automation-async \
        --input-configuration '{
            "s3Uri": "s3://my-bucket/document.pdf"
        }' \
        --output-configuration '{
            "s3Uri": "s3://my-bucket/output/"
        }' \
        --blueprints '[
            {
                "blueprintArn": "Amazon Resource Name (ARN)",
                "version": "1",
                "stage": "LIVE"
            }
        ]' \
        --data-automation-profile-arn "Amazon Resource Name (ARN)"
```

**Adding encryption configuration**

For enhanced security, you can add an encryption configuration:

```
aws bedrock-data-automation-runtime invoke-data-automation-async \
        --input-configuration '{
            "s3Uri": "s3://my-bucket/document.pdf"
        }' \
        --output-configuration '{
            "s3Uri": "s3://my-bucket/output/"
        }' \
        --data-automation-configuration '{
            "dataAutomationProjectArn": "Amazon Resource Name (ARN)",
            "stage": "LIVE"
        }' \
        --encryption-configuration '{
            "kmsKeyId": "Amazon Resource Name (ARN)",
            "kmsEncryptionContext": {
                "Department": "Finance",
                "Project": "DocumentProcessing"
            }
        }' \
        --data-automation-profile-arn "Amazon Resource Name (ARN)"
```

**Event notifications**

Enable EventBridge notifications for processing completion:

```
aws bedrock-data-automation-runtime invoke-data-automation-async \
        --input-configuration '{
            "s3Uri": "s3://my-bucket/document.pdf"
        }' \
        --output-configuration '{
            "s3Uri": "s3://my-bucket/output/"
        }' \
        --data-automation-configuration '{
            "dataAutomationProjectArn": "Amazon Resource Name (ARN)",
            "stage": "LIVE"
        }' \
        --notification-configuration '{
            "eventBridgeConfiguration": {
                "eventBridgeEnabled": true
            }
        }' \
        --data-automation-profile-arn "Amazon Resource Name (ARN)"
```

**Checking processing status**

Use the `get-data-automation-status` command to check the status of your processing job:

```
aws bedrock-data-automation-runtime get-data-automation-status \
        --invocation-arn "Amazon Resource Name (ARN)"
```

The response includes the current status:

```
{
    "status": "COMPLETED",
    "creationTime": "2025-07-24T12:34:56.789Z",
    "lastModifiedTime": "2025-07-24T12:45:12.345Z",
    "outputLocation": "s3://my-bucket/output/abcd1234/"
}
```

**Retrieving processing results**

**Locating output files in S3**

List the output files in your S3 bucket:

```
aws s3 ls s3://amzn-s3-demo-bucket/output/
```

Download the results to your local machine:

```
aws s3 cp s3://amzn-s3-demo-bucket/output/ ~/Downloads/bda-results/ --recursive
```

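**Understanding the output structure**

The output typically includes:
+ `standard-output.json`: Contains the standard extraction results
+ `custom-output.json`: Contains the results from custom blueprints
+ `metadata.json`: Contains processing metadata and confidence scores

**Common response fields**

Standard output typically includes:
+ `extractedData`: The main extracted information
+ `confidence`: Confidence scores for each extracted field
+ `metadata`: Processing information, including timestamps and model details
+ `boundingBoxes`: Location information for detected elements (if enabled)

**Error handling and troubleshooting**

Common error scenarios and solutions:
+ **Invalid S3 URI**: Make sure that your S3 bucket exists and that you have the appropriate permissions
+ **Missing data-automation-profile-arn**: This parameter is required for all processing requests
+ **Project not found**: Make sure that your project ARN is correct and that the project exists
+ **Unsupported file format**: Check that BDA supports your file format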
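Two of the errors listed above, a malformed S3 URI and a missing profile ARN, can be caught locally before any request is sent. The Python sketch below shows such a pre-flight check. It only inspects the strings and cannot tell whether the bucket exists or whether you have permission to read it. The function name and the example ARN are placeholders.

```python
def preflight_errors(input_s3_uri, profile_arn):
    """Return a list of problems that can be detected before calling the CLI."""
    errors = []
    prefix = "s3://"
    bucket = ""
    if input_s3_uri.startswith(prefix):
        # Everything between "s3://" and the next "/" is the bucket name.
        bucket = input_s3_uri[len(prefix):].split("/", 1)[0]
    if not bucket:
        errors.append(f"invalid S3 URI: {input_s3_uri!r}")
    if not profile_arn:
        errors.append("missing --data-automation-profile-arn")
    return errors


# Placeholder ARN used only for illustration.
example_profile = "arn:aws:bedrock:us-east-1:111122223333:data-automation-profile/example"
print(preflight_errors("s3://my-bucket/document.pdf", example_profile))
print(preflight_errors("my-bucket/document.pdf", ""))
```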

**Adding tags to processing jobs**

You can add tags to help organize and track processing jobs:

```
aws bedrock-data-automation-runtime invoke-data-automation-async \
        --input-configuration '{
            "s3Uri": "s3://my-bucket/document.pdf"
        }' \
        --output-configuration '{
            "s3Uri": "s3://my-bucket/output/"
        }' \
        --data-automation-configuration '{
            "dataAutomationProjectArn": "Amazon Resource Name (ARN)",
            "stage": "LIVE"
        }' \
        --tags '[
            {
                "key": "Department",
                "value": "Finance"
            },
            {
                "key": "Project",
                "value": "InvoiceProcessing"
            }
        ]' \
        --data-automation-profile-arn "Amazon Resource Name (ARN)"
```
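The `--tags` value in this command is a list of key/value objects, whereas the `tag-resource` command for blueprints takes a plain JSON map. If tags are kept as a dictionary in a script, a small conversion keeps both forms in sync. The Python sketch below assumes the list shape shown in the command above; `to_tag_list` is a hypothetical helper.

```python
import json


def to_tag_list(tags):
    """Convert {"Department": "Finance"} into [{"key": ..., "value": ...}, ...]."""
    return [{"key": key, "value": value} for key, value in tags.items()]


tags = {"Department": "Finance", "Project": "InvoiceProcessing"}
tag_list = to_tag_list(tags)
print(json.dumps(tag_list, indent=4))
```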

------
#### [ Sync ]

**Basic processing command structure**

Use the `invoke-data-automation` command to process a file:

```
aws bedrock-data-automation-runtime invoke-data-automation \
        --input-configuration '{
            "s3Uri": "s3://amzn-s3-demo-bucket/sample-images/sample-image.jpg"
        }' \
        --data-automation-configuration '{
            "dataAutomationProjectArn": "Amazon Resource Name (ARN)",
            "stage": "LIVE"
        }' \
        --data-automation-profile-arn "Amazon Resource Name (ARN)" \
        --region "aws-region"
```

**Advanced processing command structure**

Output to an S3 bucket

```
aws bedrock-data-automation-runtime invoke-data-automation \
        --input-configuration '{
            "s3Uri": "s3://amzn-s3-demo-bucket/sample-images/sample-image.jpg"
        }' \
        --output-configuration '{"s3Uri": "s3://amzn-s3-demo-bucket/output/" }' \
        --data-automation-configuration '{
            "dataAutomationProjectArn": "Amazon Resource Name (ARN)",
            "stage": "LIVE"
        }' \
        --data-automation-profile-arn "Amazon Resource Name (ARN)" \
        --region "aws-region"   # document only
```

Using byte input

```
aws bedrock-data-automation-runtime invoke-data-automation \
        --input-configuration '{
            "bytes": "<base64-encoded bytes>"
        }' \
        --output-configuration '{"s3Uri": "s3://amzn-s3-demo-bucket/output/" }' \
        --data-automation-configuration '{
            "dataAutomationProjectArn": "Amazon Resource Name (ARN)",
            "stage": "LIVE"
        }' \
        --data-automation-profile-arn "Amazon Resource Name (ARN)" \
        --region "aws-region"
```

**Note**  
**Bytes**  
Base64-encoded document bytes. The maximum size of a document provided as blob bytes is 50 MB. The type must be a Base64-encoded binary data object.
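Preparing the byte input involves two steps that are easy to script: checking the size limit and Base64-encoding the content. The Python sketch below does both. It assumes that the 50 MB limit means 50 × 1024 × 1024 bytes and that it applies to the raw content before encoding; neither detail is stated in this guide. The helper name is illustrative.

```python
import base64
import json

# Assumption: 50 MB is interpreted as 50 * 1024 * 1024 bytes of raw content.
MAX_BLOB_BYTES = 50 * 1024 * 1024


def bytes_input_configuration(data):
    """Build an input configuration that carries the content as Base64 text."""
    if len(data) > MAX_BLOB_BYTES:
        raise ValueError(
            f"content is {len(data)} bytes, which exceeds the {MAX_BLOB_BYTES}-byte limit"
        )
    return {"bytes": base64.b64encode(data).decode("ascii")}


# Small in-memory payload for demonstration; real use would read a file
# opened in binary mode.
input_config = bytes_input_configuration(b"hello")
print(json.dumps(input_config))
```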

**Using custom blueprints (images only)**

```
aws bedrock-data-automation-runtime invoke-data-automation \
        --input-configuration '{
            "s3Uri": "s3://amzn-s3-demo-bucket/sample-images/sample-image.jpg"
        }' \
        --blueprints '[{"blueprintArn": "Amazon Resource Name (ARN)", "version": "1", "stage": "LIVE" } ]' \
        --data-automation-profile-arn "Amazon Resource Name (ARN)" \
        --region "aws-region"
```

------

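# Processing use cases
<a name="bda-document-processing-examples"></a>

With Amazon Bedrock Data Automation, you can process documents, images, audio, and video through the command line interface (CLI). For each modality, the workflow consists of creating a project, invoking the analysis, and retrieving the results.

Choose the tab for the modality that you want to process, and then follow the steps: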

------
#### [ Documents ]

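**Extracting data from a W2**

![\[Sample W2 form with standard fields, showing the layout and the data fields to extract.\]](http://docs.aws.amazon.com/zh_cn/bedrock/latest/userguide/images/bda/W2.png)


When processing a W2 form, an example schema looks like this: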

```
{
  "class": "W2TaxForm",
  "description": "Simple schema for extracting key information from W2 tax forms",
  "properties": {
    "employerName": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "The employer's company name"
    },
    "employeeSSN": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "The employee's Social Security Number (SSN)"
    },
    "employeeName": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "The employee's full name"
    },
    "wagesAndTips": {
      "type": "number",
      "inferenceType": "explicit",
      "instruction": "Wages, tips, other compensation (Box 1)"
    },
    "federalIncomeTaxWithheld": {
      "type": "number",
      "inferenceType": "explicit",
      "instruction": "Federal income tax withheld (Box 2)"
    },
    "taxYear": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "The tax year for this W2 form"
    }
  }
}
```

The command to invoke processing of the W2 looks like this:

```
aws bedrock-data-automation-runtime invoke-data-automation-async \
  --input-configuration '{
    "s3Uri": "s3://w2-processing-bucket-301678011486/input/W2.png"
  }' \
  --output-configuration '{
    "s3Uri": "s3://w2-processing-bucket-301678011486/output/"
  }' \
  --data-automation-configuration '{
    "dataAutomationProjectArn": "Amazon Resource Name (ARN)",
    "stage": "LIVE"
  }' \
  --data-automation-profile-arn "Amazon Resource Name (ARN):data-automation-profile/default"
```

An example of the expected output:

```
{
  "documentType": "W2TaxForm",
  "extractedData": {
    "employerName": "The Big Company",
    "employeeSSN": "123-45-6789",
    "employeeName": "Jane Doe",
    "wagesAndTips": 48500.00,
    "federalIncomeTaxWithheld": 6835.00,
    "taxYear": "2014"
  },
  "confidence": {
    "employerName": 0.99,
    "employeeSSN": 0.97,
    "employeeName": 0.99,
    "wagesAndTips": 0.98,
    "federalIncomeTaxWithheld": 0.97,
    "taxYear": 0.99
  },
  "metadata": {
    "processingTimestamp": "2025-07-23T23:15:30Z",
    "documentId": "w2-12345",
    "modelId": "amazon.titan-document-v1",
    "pageCount": 1
  }
}
```
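Confidence scores like these are often used to decide which fields need human review. The Python sketch below flags the fields whose confidence falls below a threshold, using the values from the sample output above. The 0.98 threshold and the helper name are arbitrary choices for illustration, and the code assumes the output has the `extractedData` and `confidence` shape shown in this example.

```python
def low_confidence_fields(result, threshold=0.98):
    """Return the names of extracted fields whose confidence is below `threshold`."""
    confidence = result.get("confidence", {})
    return sorted(
        name
        for name in result.get("extractedData", {})
        # A field without a reported score is treated as low confidence.
        if confidence.get(name, 0.0) < threshold
    )


# Values copied from the sample W2 output above.
sample = {
    "extractedData": {
        "employerName": "The Big Company",
        "employeeSSN": "123-45-6789",
        "employeeName": "Jane Doe",
        "wagesAndTips": 48500.00,
        "federalIncomeTaxWithheld": 6835.00,
        "taxYear": "2014",
    },
    "confidence": {
        "employerName": 0.99,
        "employeeSSN": 0.97,
        "employeeName": 0.99,
        "wagesAndTips": 0.98,
        "federalIncomeTaxWithheld": 0.97,
        "taxYear": 0.99,
    },
}
needs_review = low_confidence_fields(sample)
print(needs_review)
```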

------
#### [ Images ]

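**Sample travel advertisement**

![\[Sample image showing how users can extract information from an advertisement.\]](http://docs.aws.amazon.com/zh_cn/bedrock/latest/userguide/images/bda/TravelAdvertisement.jpg)


An example schema for a travel advertisement looks like this: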

```
{
  "class": "TravelAdvertisement",
  "description": "Schema for extracting information from travel advertisement images",
  "properties": {
    "destination": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "The name of the travel destination being advertised"
    },
    "tagline": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "The main promotional text or tagline in the advertisement"
    },
    "landscapeType": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "The type of landscape shown (e.g., mountains, beach, forest, etc.)"
    },
    "waterFeatures": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "Description of any water features visible in the image (ocean, lake, river, etc.)"
    },
    "dominantColors": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "The dominant colors present in the image"
    },
    "advertisementType": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "The type of travel advertisement (e.g., destination promotion, tour package, etc.)"
    }
  }
}
```

The command to invoke processing of the travel advertisement looks like this:

```
aws bedrock-data-automation-runtime invoke-data-automation-async \
  --input-configuration '{
    "s3Uri": "s3://travel-ads-bucket-301678011486/input/TravelAdvertisement.jpg"
  }' \
  --output-configuration '{
    "s3Uri": "s3://travel-ads-bucket-301678011486/output/"
  }' \
  --data-automation-configuration '{
    "dataAutomationProjectArn": "Amazon Resource Name (ARN)",
    "stage": "LIVE"
  }' \
  --data-automation-profile-arn "Amazon Resource Name (ARN):data-automation-profile/default"
```

An example of the expected output:

```
{
  "documentType": "TravelAdvertisement",
  "extractedData": {
    "destination": "Kauai",
    "tagline": "Travel to KAUAI",
    "landscapeType": "Coastal mountains with steep cliffs and valleys",
    "waterFeatures": "Turquoise ocean with white surf along the coastline",
    "dominantColors": "Green, blue, turquoise, brown, white",
    "advertisementType": "Destination promotion"
  },
  "confidence": {
    "destination": 0.98,
    "tagline": 0.99,
    "landscapeType": 0.95,
    "waterFeatures": 0.97,
    "dominantColors": 0.96,
    "advertisementType": 0.92
  },
  "metadata": {
    "processingTimestamp": "2025-07-23T23:45:30Z",
    "documentId": "travel-ad-12345",
    "modelId": "amazon.titan-image-v1",
    "imageWidth": 1920,
    "imageHeight": 1080
  }
}
```

------
#### [ Audio ]

**Transcribing a phone call**

An example schema for a phone call looks like this:

```
{
  "class": "AudioRecording",
  "description": "Schema for extracting information from AWS customer call recordings",
  "properties": {
    "callType": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "The type of call (e.g., technical support, account management, consultation)"
    },
    "participants": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "The number and roles of participants in the call"
    },
    "mainTopics": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "The main topics or AWS services discussed during the call"
    },
    "customerIssues": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "Any customer issues or pain points mentioned during the call"
    },
    "actionItems": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "Action items or next steps agreed upon during the call"
    },
    "callDuration": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "The duration of the call"
    },
    "callSummary": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "A brief summary of the entire call"
    }
  }
}
```

The command to invoke processing of the phone call looks like this:

```
aws bedrock-data-automation-runtime invoke-data-automation-async \
  --input-configuration '{
    "s3Uri": "s3://audio-analysis-bucket-301678011486/input/AWS_TCA-Call-Recording-2.wav"
  }' \
  --output-configuration '{
    "s3Uri": "s3://audio-analysis-bucket-301678011486/output/"
  }' \
  --data-automation-configuration '{
    "dataAutomationProjectArn": "Amazon Resource Name (ARN)",
    "stage": "LIVE"
  }' \
  --data-automation-profile-arn "Amazon Resource Name (ARN):data-automation-profile/default"
```

An example of the expected output:

```
{
  "documentType": "AudioRecording",
  "extractedData": {
    "callType": "Technical consultation",
    "participants": "3 participants: AWS Solutions Architect, AWS Technical Account Manager, and Customer IT Director",
    "mainTopics": "AWS Bedrock implementation, data processing pipelines, model fine-tuning, and cost optimization",
    "customerIssues": "Integration challenges with existing ML infrastructure, concerns about latency for real-time processing, questions about data security compliance",
    "actionItems": [
      "AWS team to provide documentation on Bedrock data processing best practices",
      "Customer to share their current ML architecture diagrams",
      "Schedule follow-up meeting to review implementation plan",
      "AWS to provide cost estimation for proposed solution"
    ],
    "callDuration": "45 minutes and 23 seconds",
    "callSummary": "Technical consultation call between AWS team and customer regarding implementation of AWS Bedrock for their machine learning workloads. Discussion covered integration approaches, performance optimization, security considerations, and next steps for implementation planning."
  },
  "confidence": {
    "callType": 0.94,
    "participants": 0.89,
    "mainTopics": 0.92,
    "customerIssues": 0.87,
    "actionItems": 0.85,
    "callDuration": 0.99,
    "callSummary": 0.93
  },
  "metadata": {
    "processingTimestamp": "2025-07-24T00:30:45Z",
    "documentId": "audio-12345",
    "modelId": "amazon.titan-audio-v1",
    "audioDuration": "00:45:23",
    "audioFormat": "WAV",
    "sampleRate": "44.1 kHz"
  },
  "transcript": {
    "segments": [
      {
        "startTime": "00:00:03",
        "endTime": "00:00:10",
        "speaker": "Speaker 1",
        "text": "Hello everyone, thank you for joining today's call about implementing AWS Bedrock for your machine learning workloads."
      },
      {
        "startTime": "00:00:12",
        "endTime": "00:00:20",
        "speaker": "Speaker 2",
        "text": "Thanks for having us. We're really interested in understanding how Bedrock can help us streamline our document processing pipeline."
      },
      {
        "startTime": "00:00:22",
        "endTime": "00:00:35",
        "speaker": "Speaker 3",
        "text": "Yes, and specifically we'd like to discuss integration with our existing systems and any potential latency concerns for real-time processing requirements."
      }
      // Additional transcript segments would continue here
    ]
  }
}
```

------
#### [ Video ]

**Processing a video**

An example schema for a video looks like this:

```
{
  "class": "VideoContent",
  "description": "Schema for extracting information from video content",
  "properties": {
    "title": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "The title or name of the video content"
    },
    "contentType": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "The type of content (e.g., tutorial, competition, documentary, advertisement)"
    },
    "mainSubject": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "The main subject or focus of the video"
    },
    "keyPersons": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "Key people appearing in the video (hosts, participants, etc.)"
    },
    "keyScenes": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "Description of important scenes or segments in the video"
    },
    "audioElements": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "Description of notable audio elements (music, narration, dialogue)"
    },
    "summary": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "A brief summary of the video content"
    }
  }
}
```

The command to invoke processing of the video looks like this:

```
aws bedrock-data-automation-runtime invoke-data-automation-async \
  --input-configuration '{
    "s3Uri": "s3://video-analysis-bucket-301678011486/input/MakingTheCut.mp4",
    "assetProcessingConfiguration": {
      "video": {
        "segmentConfiguration": {
          "timestampSegment": {
            "startTimeMillis": 0,
            "endTimeMillis": 300000
          }
        }
      }
    }
  }' \
  --output-configuration '{
    "s3Uri": "s3://video-analysis-bucket-301678011486/output/"
  }' \
  --data-automation-configuration '{
    "dataAutomationProjectArn": "Amazon Resource Name (ARN)",
    "stage": "LIVE"
  }' \
  --data-automation-profile-arn "Amazon Resource Name (ARN):data-automation-profile/default"
```

An example of the expected output:

```
{
  "documentType": "VideoContent",
  "extractedData": {
    "title": "Making the Cut",
    "contentType": "Fashion design competition",
    "mainSubject": "Fashion designers competing to create the best clothing designs",
    "keyPersons": "Heidi Klum, Tim Gunn, and various fashion designer contestants",
    "keyScenes": [
      "Introduction of the competition and contestants",
      "Design challenge announcement",
      "Designers working in their studios",
      "Runway presentation of designs",
      "Judges' critique and elimination decision"
    ],
    "audioElements": "Background music, host narration, contestant interviews, and design feedback discussions",
    "summary": "An episode of 'Making the Cut' fashion competition where designers compete in a challenge to create innovative designs. The episode includes the challenge announcement, design process, runway presentation, and judging."
  },
  "confidence": {
    "title": 0.99,
    "contentType": 0.95,
    "mainSubject": 0.92,
    "keyPersons": 0.88,
    "keyScenes": 0.90,
    "audioElements": 0.87,
    "summary": 0.94
  },
  "metadata": {
    "processingTimestamp": "2025-07-24T00:15:30Z",
    "documentId": "video-12345",
    "modelId": "amazon.titan-video-v1",
    "videoDuration": "00:45:23",
    "analyzedSegment": "00:00:00 - 00:05:00",
    "resolution": "1920x1080"
  },
  "transcript": {
    "segments": [
      {
        "startTime": "00:00:05",
        "endTime": "00:00:12",
        "speaker": "Heidi Klum",
        "text": "Welcome to Making the Cut, where we're searching for the next great global fashion brand."
      },
      {
        "startTime": "00:00:15",
        "endTime": "00:00:25",
        "speaker": "Tim Gunn",
        "text": "Designers, for your first challenge, you'll need to create a look that represents your brand and can be sold worldwide."
      }
      // Additional transcript segments would continue here
    ]
  }
}
```

------