


# Create a custom prompt dataset for a model evaluation job that uses human workers
<a name="model-evaluation-prompt-datasets-custom-human"></a>

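To create a model evaluation job that uses human workers, you must specify a custom prompt dataset. These prompts are then used during inference with the models you select to evaluate.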

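If you want to evaluate non-Amazon Bedrock models using responses that you have already generated, include those responses in the prompt dataset as described in [Perform an evaluation job using your own inference response data](#model-evaluation-prompt-datasets-custom-human-byoir). When you provide your own inference response data, Amazon Bedrock skips the model invocation step and performs the evaluation job with the data you provide.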

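Custom prompt datasets must be stored in Amazon S3, use the JSON Lines format, and have the `.jsonl` file extension. Each line must be a valid JSON object. Your dataset can contain up to 1000 prompts per automatic evaluation job.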
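
The format rules above can be checked locally before you upload a file to Amazon S3. The following is a minimal sketch, not an official tool: the function name `find_format_problems` and the constant `MAX_PROMPTS` are hypothetical, and treating a blank line as an error is an assumption based on a strict reading of "each line must be a valid JSON object".

```python
import json
from typing import Iterable, List

MAX_PROMPTS = 1000  # dataset limit stated in the text above


def find_format_problems(lines: Iterable[str]) -> List[str]:
    """Return a list of format problems; an empty list means none were found.

    Hypothetical helper. Assumes one record per line and flags blank lines,
    because a blank line is not a valid JSON object.
    """
    problems = []
    count = 0
    for number, line in enumerate(lines, start=1):
        count += 1
        try:
            record = json.loads(line)
        except json.JSONDecodeError as err:
            problems.append(f"line {number}: not valid JSON ({err.msg})")
            continue
        if not isinstance(record, dict):
            problems.append(f"line {number}: valid JSON but not an object")
    if count > MAX_PROMPTS:
        problems.append(f"{count} prompts exceeds the limit of {MAX_PROMPTS}")
    return problems


# Example use with a local file (the file name is a placeholder):
# with open("dataset.jsonl", encoding="utf-8") as f:
#     print(find_format_problems(f))
```

Because the function accepts any iterable of strings, it works with an open file handle as well as with a list of lines in memory.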

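For human evaluation jobs created using the console, you must configure Cross Origin Resource Sharing (CORS) on the S3 output bucket. This is required so that prompts and inference results can be displayed to human annotators in the annotation portal. To learn more about the required CORS permissions, see [Required Cross Origin Resource Sharing (CORS) permissions on S3 buckets](model-evaluation-security-cors.md).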

## Perform an evaluation job where Amazon Bedrock invokes a model for you
<a name="model-evaluation-prompt-datasets-custom-human-invoke"></a>

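To run an evaluation job where Amazon Bedrock invokes the models for you, provide a prompt dataset containing the following key-value pairs:
+ `prompt` – The prompt you want the models to respond to.
+ `referenceResponse` – (Optional) A ground truth response that your workers can reference during the evaluation.
+ `category` – (Optional) A key that you can use to filter results when reviewing them in the model evaluation report card.

Workers can see the values you specify for `prompt` and `referenceResponse` in their UI.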

The following is an example custom dataset that contains 6 inputs and uses the JSON Lines format.

```
{"prompt":"{{Provide the prompt you want the model to use during inference}}","category":"{{(Optional) Specify an optional category}}","referenceResponse":"{{(Optional) Specify a ground truth response}}."}
{"prompt":"{{Provide the prompt you want the model to use during inference}}","category":"{{(Optional) Specify an optional category}}","referenceResponse":"{{(Optional) Specify a ground truth response}}."}
{"prompt":"{{Provide the prompt you want the model to use during inference}}","category":"{{(Optional) Specify an optional category}}","referenceResponse":"{{(Optional) Specify a ground truth response}}."}
{"prompt":"{{Provide the prompt you want the model to use during inference}}","category":"{{(Optional) Specify an optional category}}","referenceResponse":"{{(Optional) Specify a ground truth response}}."}
{"prompt":"{{Provide the prompt you want the model to use during inference}}","category":"{{(Optional) Specify an optional category}}","referenceResponse":"{{(Optional) Specify a ground truth response}}."}
{"prompt":"{{Provide the prompt you want the model to use during inference}}","category":"{{(Optional) Specify an optional category}}","referenceResponse":"{{(Optional) Specify a ground truth response}}."}
```

For clarity, the following example shows a single entry expanded. In your actual prompt dataset, each line must be a valid JSON object.

```
{
  "prompt": "What is high intensity interval training?",
  "category": "Fitness",
  "referenceResponse": "High-Intensity Interval Training (HIIT) is a cardiovascular exercise approach that involves short, intense bursts of exercise followed by brief recovery or rest periods."
}
```
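
An entry like the expanded one above has to be collapsed to a single line before it goes into the `.jsonl` file. The following sketch shows one way to do that with Python's standard `json` module. The function name `make_dataset_line` and its parameter names are hypothetical; only the keys `prompt`, `category`, and `referenceResponse` come from this page.

```python
import json
from typing import Optional


def make_dataset_line(
    prompt: str,
    reference_response: Optional[str] = None,
    category: Optional[str] = None,
) -> str:
    """Serialize one prompt record as a single JSON Lines entry.

    Hypothetical helper: optional keys are omitted when no value is given.
    """
    record = {"prompt": prompt}
    if category is not None:
        record["category"] = category
    if reference_response is not None:
        record["referenceResponse"] = reference_response
    # Without an indent argument, json.dumps writes the object on one line
    # and escapes newline characters that occur inside string values.
    return json.dumps(record, ensure_ascii=False)


line = make_dataset_line(
    "What is high intensity interval training?",
    category="Fitness",
)
print(line)
```

Writing one such string per line, each followed by a newline character, produces a file in the layout shown in the 6-input example.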

## Perform an evaluation job using your own inference response data
<a name="model-evaluation-prompt-datasets-custom-human-byoir"></a>

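To run an evaluation job using responses that you have already generated, provide a prompt dataset containing the following key-value pairs:
+ `prompt` – The prompt your models used to generate the responses.
+ `referenceResponse` – (Optional) A ground truth response that your workers can reference during the evaluation.
+ `category` – (Optional) A key that you can use to filter results when reviewing them in the model evaluation report card.
+ `modelResponses` – The responses from your own inference that you want to evaluate. You can provide either one or two entries in the `modelResponses` list, each with the following properties.
  + `response` – A string containing the response from your model inference.
  + `modelIdentifier` – A string identifying the model that generated the response.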

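Every line in your prompt dataset must contain the same number of responses (either one or two). In addition, you must specify the same model identifier or identifiers in each line, and you cannot use more than 2 unique values for `modelIdentifier` in a single dataset.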
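
These consistency rules are easy to break when a dataset is assembled from several sources, so a local check can help. The following is a minimal sketch under stated assumptions: the function name `check_model_responses` is hypothetical, each input line is assumed to already be a valid JSON object, and the order of entries within `modelResponses` is assumed not to matter.

```python
import json
from typing import Iterable, List


def check_model_responses(lines: Iterable[str]) -> List[str]:
    """Return a list of rule violations; an empty list means none were found.

    Hypothetical helper. Every line is compared with the first well-formed
    line, which enforces an equal response count and identical identifiers
    (and therefore at most 2 unique identifiers in the dataset).
    """
    problems = []
    expected_ids = None  # sorted identifiers from the first well-formed line
    for number, line in enumerate(lines, start=1):
        responses = json.loads(line).get("modelResponses")
        if not isinstance(responses, list) or len(responses) not in (1, 2):
            problems.append(
                f"line {number}: 'modelResponses' must be a list of one or two entries"
            )
            continue
        if not all(
            isinstance(r, dict)
            and isinstance(r.get("response"), str)
            and isinstance(r.get("modelIdentifier"), str)
            for r in responses
        ):
            problems.append(
                f"line {number}: each entry needs string 'response' and 'modelIdentifier'"
            )
            continue
        ids = sorted(r["modelIdentifier"] for r in responses)
        if expected_ids is None:
            expected_ids = ids
        elif ids != expected_ids:
            problems.append(
                f"line {number}: identifiers {ids} differ from {expected_ids}"
            )
    return problems
```

Comparing against the first line keeps the check to a single pass. A line that lists the same two identifiers in a different order is accepted, which is an assumption rather than documented behavior.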

The following is an example custom dataset that contains 6 inputs in the JSON Lines format.

```
{"prompt":{{"The prompt you used to generate the model responses"}},"referenceResponse":{{"(Optional) a ground truth response"}},"category":{{"(Optional) a category for the prompt"}},"modelResponses":[{"response":{{"The response your first model generated"}},"modelIdentifier":{{"A string identifying your first model"}}},{"response":{{"The response your second model generated"}},"modelIdentifier":{{"A string identifying your second model"}}}]}
{"prompt":{{"The prompt you used to generate the model responses"}},"referenceResponse":{{"(Optional) a ground truth response"}},"category":{{"(Optional) a category for the prompt"}},"modelResponses":[{"response":{{"The response your first model generated"}},"modelIdentifier":{{"A string identifying your first model"}}},{"response":{{"The response your second model generated"}},"modelIdentifier":{{"A string identifying your second model"}}}]}
{"prompt":{{"The prompt you used to generate the model responses"}},"referenceResponse":{{"(Optional) a ground truth response"}},"category":{{"(Optional) a category for the prompt"}},"modelResponses":[{"response":{{"The response your first model generated"}},"modelIdentifier":{{"A string identifying your first model"}}},{"response":{{"The response your second model generated"}},"modelIdentifier":{{"A string identifying your second model"}}}]}
{"prompt":{{"The prompt you used to generate the model responses"}},"referenceResponse":{{"(Optional) a ground truth response"}},"category":{{"(Optional) a category for the prompt"}},"modelResponses":[{"response":{{"The response your first model generated"}},"modelIdentifier":{{"A string identifying your first model"}}},{"response":{{"The response your second model generated"}},"modelIdentifier":{{"A string identifying your second model"}}}]}
{"prompt":{{"The prompt you used to generate the model responses"}},"referenceResponse":{{"(Optional) a ground truth response"}},"category":{{"(Optional) a category for the prompt"}},"modelResponses":[{"response":{{"The response your first model generated"}},"modelIdentifier":{{"A string identifying your first model"}}},{"response":{{"The response your second model generated"}},"modelIdentifier":{{"A string identifying your second model"}}}]}
{"prompt":{{"The prompt you used to generate the model responses"}},"referenceResponse":{{"(Optional) a ground truth response"}},"category":{{"(Optional) a category for the prompt"}},"modelResponses":[{"response":{{"The response your first model generated"}},"modelIdentifier":{{"A string identifying your first model"}}},{"response":{{"The response your second model generated"}},"modelIdentifier":{{"A string identifying your second model"}}}]}
```

For clarity, the following example shows a single entry in a prompt dataset expanded.

```
{
    "prompt": "What is high intensity interval training?",
    "referenceResponse": "High-Intensity Interval Training (HIIT) is a cardiovascular exercise approach that involves short, intense bursts of exercise followed by brief recovery or rest periods.",
    "category": "Fitness",
    "modelResponses": [
        {
            "response": "High intensity interval training (HIIT) is a workout strategy that alternates between short bursts of intense, maximum-effort exercise and brief recovery periods, designed to maximize calorie burn and improve cardiovascular fitness.",
            "modelIdentifier": "Model1"
        },
        {
            "response": "High-intensity interval training (HIIT) is a cardiovascular exercise strategy that alternates short bursts of intense, anaerobic exercise with less intense recovery periods, designed to maximize calorie burn, improve fitness, and boost metabolic rate.",
            "modelIdentifier": "Model2"
        }
    ]
}
```