


# Creating a prompt dataset for RAG evaluations in Amazon Bedrock
<a name="knowledge-base-evaluation-prompt"></a>

To evaluate the retrieval and generation of an Amazon Bedrock knowledge base or your own retrieval-augmented generation (RAG) system, you must provide a prompt dataset. When you provide response data from your own RAG system, Amazon Bedrock skips the knowledge base invocation step and performs the evaluation job directly on your data.

The prompt dataset must be stored in Amazon S3, use the JSON Lines format, and have a `.jsonl` file extension. Each line must be a valid JSON object. The dataset can contain a maximum of 1,000 prompts per evaluation job. For retrieve-and-generate evaluation jobs, each conversation can have a maximum of 5 turns. For retrieve-only evaluations, you can specify only one turn.
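As an illustrative sketch of these limits, the following hypothetical Python helper checks a candidate `.jsonl` dataset line by line before upload. The function name and constants are assumptions for this example, not part of any Bedrock API:

```python
import json

MAX_PROMPTS = 1000  # maximum prompts per evaluation job
MAX_TURNS_RAG = 5   # maximum turns for retrieve-and-generate; retrieve-only allows 1

def validate_dataset(lines, max_turns=MAX_TURNS_RAG):
    """Validate JSON Lines records against the documented dataset limits.

    `lines` is an iterable of raw text lines. Returns the parsed records,
    or raises ValueError describing the first violation found.
    """
    records = []
    for i, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)  # each line must be a valid JSON object
        except json.JSONDecodeError as e:
            raise ValueError(f"line {i}: not valid JSON ({e})")
        turns = record.get("conversationTurns", [])
        if not 1 <= len(turns) <= max_turns:
            raise ValueError(f"line {i}: {len(turns)} turns (limit {max_turns})")
        records.append(record)
    if len(records) > MAX_PROMPTS:
        raise ValueError(f"{len(records)} prompts exceed the {MAX_PROMPTS} limit")
    return records
```

Running the helper over a dataset file (for example, `validate_dataset(open("dataset.jsonl"))`) surfaces formatting problems before the evaluation job is created.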

For jobs that you create with the console, you must update the cross-origin resource sharing (CORS) configuration on your S3 bucket. To learn more about the required CORS permissions, see [Required cross-origin resource sharing (CORS) permissions on S3 buckets](model-evaluation-security-cors.md).
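One way to apply a CORS configuration programmatically is through the S3 `PutBucketCors` API via boto3. The rule values below are a sketch based on typical evaluation-job requirements; treat them as assumptions and confirm the authoritative rule set on the linked CORS permissions page:

```python
def bedrock_eval_cors_rules():
    """Example CORS rules for console-created evaluation jobs.

    Illustrative values only; verify against the CORS permissions page.
    """
    return [
        {
            "AllowedHeaders": ["*"],
            "AllowedMethods": ["GET", "PUT", "POST", "DELETE"],
            "AllowedOrigins": ["*"],
            "ExposeHeaders": ["Access-Control-Allow-Origin"],
        }
    ]

def apply_cors(bucket_name):
    """Apply the rules to the dataset bucket (requires AWS credentials)."""
    import boto3  # imported here so the rules can be inspected without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_cors(
        Bucket=bucket_name,
        CORSConfiguration={"CORSRules": bedrock_eval_cors_rules()},
    )
```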

See the following topics for details about the key-value pairs you must provide, depending on the type of evaluation job you choose.

**Topics**
+ [Creating a prompt dataset for retrieve-only RAG evaluation jobs](knowledge-base-evaluation-prompt-retrieve.md)
+ [Creating a prompt dataset for retrieve-and-generate RAG evaluation jobs](knowledge-base-evaluation-prompt-retrieve-generate.md)

# Creating a prompt dataset for retrieve-only RAG evaluation jobs
<a name="knowledge-base-evaluation-prompt-retrieve"></a>

A retrieve-only evaluation job requires a prompt dataset in JSON Lines format. The dataset can contain a maximum of 1,000 prompts.

## Preparing a dataset for a retrieve-only evaluation job where Amazon Bedrock invokes your knowledge base
<a name="knowledge-base-evaluation-prompt-retrieve-invoke"></a>

To create a retrieve-only evaluation job where Amazon Bedrock invokes your knowledge base, your prompt dataset must contain the following key-value pairs:
+ `referenceResponses` – This parent key is used to specify the ground-truth responses that you expect your end-to-end RAG system to return. This parameter does not represent the target passages or chunks that you expect to be retrieved from your knowledge base. Specify the ground truth in the `text` key. `referenceResponses` is required if you choose the **Context coverage** metric in your evaluation job.
+ `prompt` – This parent key is used to specify the prompt (user query) that you want the RAG system to respond to.

The following example shows a custom dataset that contains six inputs and uses the JSON Lines format.

```
{"conversationTurns":[{"prompt":{"content":[{"text":"Provide the prompt you want to use during inference"}]},"referenceResponses":[{"content":[{"text":"Specify a ground-truth response"}]}]}]}
{"conversationTurns":[{"prompt":{"content":[{"text":"Provide the prompt you want to use during inference"}]},"referenceResponses":[{"content":[{"text":"Specify a ground-truth response"}]}]}]}
{"conversationTurns":[{"prompt":{"content":[{"text":"Provide the prompt you want to use during inference"}]},"referenceResponses":[{"content":[{"text":"Specify a ground-truth response"}]}]}]}
{"conversationTurns":[{"prompt":{"content":[{"text":"Provide the prompt you want to use during inference"}]},"referenceResponses":[{"content":[{"text":"Specify a ground-truth response"}]}]}]}
{"conversationTurns":[{"prompt":{"content":[{"text":"Provide the prompt you want to use during inference"}]},"referenceResponses":[{"content":[{"text":"Specify a ground-truth response"}]}]}]}
{"conversationTurns":[{"prompt":{"content":[{"text":"Provide the prompt you want to use during inference"}]},"referenceResponses":[{"content":[{"text":"Specify a ground-truth response"}]}]}]}
```

The following prompt is expanded for clarity. In your actual prompt dataset, each line (prompt) must be a valid JSON object.

```
{
    "conversationTurns": [
        {
            "prompt": {
                "content": [
                    {
                        "text": "What is the recommended service interval for your product?"
                    }
                ]
            },
            "referenceResponses": [
                {
                    "content": [
                        {
                            "text": "The recommended service interval for our product is two years."
                        }
                    ]
                }
            ]
        }
    ]
}
```
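Records in this shape can also be generated programmatically. The following sketch builds the schema shown above from plain (prompt, ground truth) pairs; the helper names are illustrative, not part of any Bedrock API:

```python
import json

def make_retrieve_record(prompt_text, reference_text):
    """Build one retrieve-only dataset record in the schema shown above."""
    return {
        "conversationTurns": [
            {
                "prompt": {"content": [{"text": prompt_text}]},
                "referenceResponses": [{"content": [{"text": reference_text}]}],
            }
        ]
    }

def write_retrieve_dataset(pairs, path):
    """Write (prompt, ground truth) pairs as one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for prompt_text, reference_text in pairs:
            f.write(json.dumps(make_retrieve_record(prompt_text, reference_text)) + "\n")
```

The resulting `.jsonl` file can then be uploaded to your S3 bucket for the evaluation job.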

## Preparing a dataset for a retrieve-only evaluation job using your own inference response data
<a name="knowledge-base-evaluation-prompt-retrieve-byoir"></a>

To create a retrieve-only evaluation job where you provide your own inference response data, your prompt dataset must contain:
+ `prompt` – This parent key is used to specify the prompt (user query) that was used to generate the inference response data.
+ `referenceResponses` – This parent key is used to specify the ground-truth responses that you expect your end-to-end RAG system to return. This parameter does not represent the target passages or chunks that you expect to be retrieved from your knowledge base. Specify the ground truth in the `text` key. `referenceResponses` is required if you choose the **Context coverage** metric in your evaluation job.
+ `referenceContexts` (optional) – This optional parent key is used to specify the ground-truth passages that you expect to be retrieved from your RAG source. Include this key only if you need it for your own custom evaluation metrics. The built-in metrics that Amazon Bedrock provides don't use this property.
+ `knowledgeBaseIdentifier` – A customer-defined string that identifies the RAG source used to generate the retrieval results.
+ `retrievedResults` – A JSON object that contains the list of retrieval results. For each result, you can provide an optional `name` and optional `metadata`, specified as key-value pairs.

The following example shows a custom dataset that contains six inputs and uses the JSON Lines format.

```
{"conversationTurns":[{"prompt":{"content":[{"text":"The prompt you used to generate your response"}]},"referenceResponses":[{"content":[{"text":"A ground-truth response"}]}],"referenceContexts":[{"content":[{"text":"A ground truth for a received passage"}]}],"output":{"knowledgeBaseIdentifier":"A string identifying your RAG source","retrievedResults":{"retrievalResults":[{"name":"(Optional) a name for your reference context","content":{"text":"The output from your RAG inference"},"metadata":{"(Optional) a key for your metadata":"(Optional) a metadata value"}}]}}}]}
{"conversationTurns":[{"prompt":{"content":[{"text":"The prompt you used to generate your response"}]},"referenceResponses":[{"content":[{"text":"A ground-truth response"}]}],"referenceContexts":[{"content":[{"text":"A ground truth for a received passage"}]}],"output":{"knowledgeBaseIdentifier":"A string identifying your RAG source","retrievedResults":{"retrievalResults":[{"name":"(Optional) a name for your reference context","content":{"text":"The output from your RAG inference"},"metadata":{"(Optional) a key for your metadata":"(Optional) a metadata value"}}]}}}]}
{"conversationTurns":[{"prompt":{"content":[{"text":"The prompt you used to generate your response"}]},"referenceResponses":[{"content":[{"text":"A ground-truth response"}]}],"referenceContexts":[{"content":[{"text":"A ground truth for a received passage"}]}],"output":{"knowledgeBaseIdentifier":"A string identifying your RAG source","retrievedResults":{"retrievalResults":[{"name":"(Optional) a name for your reference context","content":{"text":"The output from your RAG inference"},"metadata":{"(Optional) a key for your metadata":"(Optional) a metadata value"}}]}}}]}
{"conversationTurns":[{"prompt":{"content":[{"text":"The prompt you used to generate your response"}]},"referenceResponses":[{"content":[{"text":"A ground-truth response"}]}],"referenceContexts":[{"content":[{"text":"A ground truth for a received passage"}]}],"output":{"knowledgeBaseIdentifier":"A string identifying your RAG source","retrievedResults":{"retrievalResults":[{"name":"(Optional) a name for your reference context","content":{"text":"The output from your RAG inference"},"metadata":{"(Optional) a key for your metadata":"(Optional) a metadata value"}}]}}}]}
{"conversationTurns":[{"prompt":{"content":[{"text":"The prompt you used to generate your response"}]},"referenceResponses":[{"content":[{"text":"A ground-truth response"}]}],"referenceContexts":[{"content":[{"text":"A ground truth for a received passage"}]}],"output":{"knowledgeBaseIdentifier":"A string identifying your RAG source","retrievedResults":{"retrievalResults":[{"name":"(Optional) a name for your reference context","content":{"text":"The output from your RAG inference"},"metadata":{"(Optional) a key for your metadata":"(Optional) a metadata value"}}]}}}]}
{"conversationTurns":[{"prompt":{"content":[{"text":"The prompt you used to generate your response"}]},"referenceResponses":[{"content":[{"text":"A ground-truth response"}]}],"referenceContexts":[{"content":[{"text":"A ground truth for a received passage"}]}],"output":{"knowledgeBaseIdentifier":"A string identifying your RAG source","retrievedResults":{"retrievalResults":[{"name":"(Optional) a name for your reference context","content":{"text":"The output from your RAG inference"},"metadata":{"(Optional) a key for your metadata":"(Optional) a metadata value"}}]}}}]}
```

The following prompt is expanded for clarity. In your actual prompt dataset, each line (prompt) must be a valid JSON object.

```
{
  "conversationTurns": [
    {
      "prompt": {
        "content": [
          {
            "text": "What is the recommended service interval for your product?"
          }
        ]
      },
      "referenceResponses": [
        {
          "content": [
            {
              "text": "The recommended service interval for our product is two years."
            }
          ]
        }
      ],
      "referenceContexts": [
        {
          "content": [
            {
              "text": "A ground truth for a received passage"
            }
          ]
        }
      ],
      "output": {
        "knowledgeBaseIdentifier": "RAG source 1",
        "retrievedResults": {
          "retrievalResults": [
            {
              "name": "(Optional) a name for your retrieval",
              "content": {
                "text": "The recommended service interval for our product is two years."
              },
              "metadata": {
                "(Optional) a key for your metadata": "(Optional) a value for your metadata"
              }
            }
          ]
        }
      }
    }
  ]
}
```
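A record carrying your own retrieval results can likewise be assembled in code. The helper below follows the schema shown above (`output`, `knowledgeBaseIdentifier`, `retrievedResults`); the function itself is an illustrative sketch, not a Bedrock API:

```python
def make_retrieve_byoir_record(prompt_text, reference_text, kb_id, retrieved_texts):
    """Build one retrieve-only record that carries your own retrieval results.

    `kb_id` is the customer-defined string identifying your RAG source;
    `retrieved_texts` is the list of retrieved passages from your system.
    """
    return {
        "conversationTurns": [
            {
                "prompt": {"content": [{"text": prompt_text}]},
                "referenceResponses": [{"content": [{"text": reference_text}]}],
                "output": {
                    "knowledgeBaseIdentifier": kb_id,
                    "retrievedResults": {
                        "retrievalResults": [
                            {"content": {"text": t}} for t in retrieved_texts
                        ]
                    },
                },
            }
        ]
    }
```

The optional `name` and `metadata` fields can be added to each entry in `retrievalResults` as needed.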

# Creating a prompt dataset for retrieve-and-generate RAG evaluation jobs
<a name="knowledge-base-evaluation-prompt-retrieve-generate"></a>

A retrieve-and-generate evaluation job requires a prompt dataset in JSON Lines format. The dataset can contain a maximum of 1,000 prompts.

## Preparing a dataset for a retrieve-and-generate evaluation job where Amazon Bedrock invokes your knowledge base
<a name="knowledge-base-evaluation-prompt-retrieve-generate-invoke"></a>

To create a retrieve-and-generate evaluation job where Amazon Bedrock invokes your knowledge base, your prompt dataset must contain the following key-value pairs:
+ `referenceResponses` – This parent key is used to specify the ground-truth responses that you expect [RetrieveAndGenerate](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html) to return. Specify the ground truth in the `text` key. `referenceResponses` is required if you choose the **Context coverage** metric in your evaluation job.
+ `prompt` – This parent key is used to specify the prompt (user query) that you want the model to respond to when the evaluation job runs.

The following example shows a custom dataset that contains six inputs and uses the JSON Lines format.

```
{"conversationTurns":[{"prompt":{"content":[{"text":"Provide the prompt you want to use during inference"}]},"referenceResponses":[{"content":[{"text":"Specify a ground-truth response"}]}]}]}
{"conversationTurns":[{"prompt":{"content":[{"text":"Provide the prompt you want to use during inference"}]},"referenceResponses":[{"content":[{"text":"Specify a ground-truth response"}]}]}]}
{"conversationTurns":[{"prompt":{"content":[{"text":"Provide the prompt you want to use during inference"}]},"referenceResponses":[{"content":[{"text":"Specify a ground-truth response"}]}]}]}
{"conversationTurns":[{"prompt":{"content":[{"text":"Provide the prompt you want to use during inference"}]},"referenceResponses":[{"content":[{"text":"Specify a ground-truth response"}]}]}]}
{"conversationTurns":[{"prompt":{"content":[{"text":"Provide the prompt you want to use during inference"}]},"referenceResponses":[{"content":[{"text":"Specify a ground-truth response"}]}]}]}
{"conversationTurns":[{"prompt":{"content":[{"text":"Provide the prompt you want to use during inference"}]},"referenceResponses":[{"content":[{"text":"Specify a ground-truth response"}]}]}]}
```

The following prompt is expanded for clarity. In your actual prompt dataset, each line (prompt) must be a valid JSON object.

```
{
    "conversationTurns": [
        {
            "prompt": {
                "content": [
                    {
                        "text": "What is the recommended service interval for your product?"
                    }
                ]
            },
            "referenceResponses": [
                {
                    "content": [
                        {
                            "text": "The recommended service interval for our product is two years."
                        }
                    ]
                }
            ]
        }
    ]
}
```

## Preparing a dataset for a retrieve-and-generate evaluation job using your own inference response data
<a name="knowledge-base-evaluation-prompt-retrieve-generate-byoir"></a>

To create a retrieve-and-generate evaluation job where you provide your own inference response data, your prompt dataset is a list of conversation turns that contains the following for each turn. Each job can evaluate only one RAG source.
+ `prompt` – The prompt you provided to your model to generate the results.
+ `referenceResponses` – This parent key is used to specify the ground-truth responses for the final output that you expect the LLM to generate after it ingests the retrieval results and the input query.
+ `referenceContexts` (optional) – This optional parent key is used to specify the ground-truth passages that you expect to be retrieved from your RAG source. Include this key only if you need it for your own custom evaluation metrics. The built-in metrics that Amazon Bedrock provides don't use this property.
+ `output` – The output from your RAG source, including the following:
  + `text` – The final output from the LLM in your RAG system.
  + `retrievedPassages` – This parent key is used to specify the content retrieved by your RAG source.

Your `output` data must also include the string `knowledgeBaseIdentifier`, which defines the RAG source you used to generate the inference responses. You can also include an optional `modelIdentifier` string that identifies the LLM you used. For `retrievalResults` and `retrievedReferences`, you can provide optional names and metadata.

The following example shows a custom dataset that contains six inputs and uses the JSON Lines format.

```
{"conversationTurns":[{"prompt":{"content":[{"text":"Provide the prompt you used to generate the response"}]},"referenceResponses":[{"content":[{"text":"A ground truth for the final response generated by the LLM"}]}],"referenceContexts":[{"content":[{"text":"A ground truth for a received passage"}]}],"output":{"text":"The output of the LLM","modelIdentifier":"(Optional) a string identifying your model","knowledgeBaseIdentifier":"A string identifying your RAG source","retrievedPassages":{"retrievalResults":[{"name":"(Optional) a name for your retrieval","content":{"text":"The retrieved content"},"metadata":{"(Optional) a key for your metadata":"(Optional) a value for your metadata"}}]}}}]}
{"conversationTurns":[{"prompt":{"content":[{"text":"Provide the prompt you used to generate the response"}]},"referenceResponses":[{"content":[{"text":"A ground truth for the final response generated by the LLM"}]}],"referenceContexts":[{"content":[{"text":"A ground truth for a received passage"}]}],"output":{"text":"The output of the LLM","modelIdentifier":"(Optional) a string identifying your model","knowledgeBaseIdentifier":"A string identifying your RAG source","retrievedPassages":{"retrievalResults":[{"name":"(Optional) a name for your retrieval","content":{"text":"The retrieved content"},"metadata":{"(Optional) a key for your metadata":"(Optional) a value for your metadata"}}]}}}]}
{"conversationTurns":[{"prompt":{"content":[{"text":"Provide the prompt you used to generate the response"}]},"referenceResponses":[{"content":[{"text":"A ground truth for the final response generated by the LLM"}]}],"referenceContexts":[{"content":[{"text":"A ground truth for a received passage"}]}],"output":{"text":"The output of the LLM","modelIdentifier":"(Optional) a string identifying your model","knowledgeBaseIdentifier":"A string identifying your RAG source","retrievedPassages":{"retrievalResults":[{"name":"(Optional) a name for your retrieval","content":{"text":"The retrieved content"},"metadata":{"(Optional) a key for your metadata":"(Optional) a value for your metadata"}}]}}}]}
{"conversationTurns":[{"prompt":{"content":[{"text":"Provide the prompt you used to generate the response"}]},"referenceResponses":[{"content":[{"text":"A ground truth for the final response generated by the LLM"}]}],"referenceContexts":[{"content":[{"text":"A ground truth for a received passage"}]}],"output":{"text":"The output of the LLM","modelIdentifier":"(Optional) a string identifying your model","knowledgeBaseIdentifier":"A string identifying your RAG source","retrievedPassages":{"retrievalResults":[{"name":"(Optional) a name for your retrieval","content":{"text":"The retrieved content"},"metadata":{"(Optional) a key for your metadata":"(Optional) a value for your metadata"}}]}}}]}
{"conversationTurns":[{"prompt":{"content":[{"text":"Provide the prompt you used to generate the response"}]},"referenceResponses":[{"content":[{"text":"A ground truth for the final response generated by the LLM"}]}],"referenceContexts":[{"content":[{"text":"A ground truth for a received passage"}]}],"output":{"text":"The output of the LLM","modelIdentifier":"(Optional) a string identifying your model","knowledgeBaseIdentifier":"A string identifying your RAG source","retrievedPassages":{"retrievalResults":[{"name":"(Optional) a name for your retrieval","content":{"text":"The retrieved content"},"metadata":{"(Optional) a key for your metadata":"(Optional) a value for your metadata"}}]}}}]}
{"conversationTurns":[{"prompt":{"content":[{"text":"Provide the prompt you used to generate the response"}]},"referenceResponses":[{"content":[{"text":"A ground truth for the final response generated by the LLM"}]}],"referenceContexts":[{"content":[{"text":"A ground truth for a received passage"}]}],"output":{"text":"The output of the LLM","modelIdentifier":"(Optional) a string identifying your model","knowledgeBaseIdentifier":"A string identifying your RAG source","retrievedPassages":{"retrievalResults":[{"name":"(Optional) a name for your retrieval","content":{"text":"The retrieved content"},"metadata":{"(Optional) a key for your metadata":"(Optional) a value for your metadata"}}]}}}]}
```

The expanded prompt dataset format is shown below for clarity. In your actual prompt dataset, each line (prompt) must be a valid JSON object.

```
{
    "conversationTurns": [
        {
            "prompt": {
                "content": [
                    {
                        "text": "Provide the prompt you used to generate the responses"
                    }
                ]
            },
            "referenceResponses": [
                {
                    "content": [
                        {
                            "text": "A ground truth for the final response generated by the LLM"
                        }
                    ]
                }
            ],
            "referenceContexts": [
                {
                    "content": [
                        {
                            "text": "A ground truth for a received passage"
                        }
                    ]
                }
            ],
            "output": {
                "text": "The output of the LLM",
                "modelIdentifier": "(Optional) a string identifying your model",
                "knowledgeBaseIdentifier": "A string identifying your RAG source",
                "retrievedPassages": {
                    "retrievalResults": [
                        {
                            "name": "(Optional) a name for your retrieval",
                            "content": {
                                "text": "The retrieved content"
                            },
                            "metadata": {
                                "(Optional) a key for your metadata": "(Optional) a value for your metadata"
                            }
                        }
                    ]
                }
            }
        }
    ]
}
```