Prepare your training datasets for distillation

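Before you start a model customization job, you need to prepare at least a training dataset. To prepare input datasets for your custom model, create .jsonl files in which each line is a JSON object that corresponds to one record. The files that you create must conform to the format for model distillation and for the model that you choose. The records in the files must also meet the size requirements.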
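As a minimal sketch of the "one JSON object per line" layout described above, the following Python helpers write records to a .jsonl file and read them back, rejecting any line that is not a JSON object. The function names `write_jsonl` and `read_jsonl` are hypothetical and are not part of any AWS SDK. The sketch checks only the line-per-record structure; it does not check the model-specific format or the size requirements.

```python
import json
from pathlib import Path


def write_jsonl(records, path):
    """Write each record as a single JSON object on its own line."""
    with Path(path).open("w", encoding="utf-8") as f:
        for record in records:
            # json.dumps without indent escapes newlines inside strings,
            # so every record stays on exactly one physical line.
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Parse the file back, failing on any line that is not a JSON object."""
    records = []
    with Path(path).open(encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate blank lines, such as a trailing newline
            record = json.loads(line)
            if not isinstance(record, dict):
                raise ValueError(f"line {line_number} is not a JSON object")
            records.append(record)
    return records
```

Reading the file back with `read_jsonl` before you upload it is a cheap way to catch records that were accidentally pretty-printed across several lines.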

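Provide the input data as prompts. Amazon Bedrock uses the input data to generate responses from the teacher model, and then uses the generated responses to fine-tune the student model. For more information about the inputs that Amazon Bedrock uses, and about how to choose the option that works best for your use case, see How Amazon Bedrock Model Distillation works. There are several options for preparing your input dataset.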

Note

Amazon Nova models have different requirements for distillation. For more information, see Distilling Amazon Nova models.

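Supported modalities for distillation

The models listed in Supported models and Regions for Amazon Bedrock Model Distillation support only the text-to-text modality.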

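Optimize your input prompts for synthetic data generation

During model distillation, Amazon Bedrock generates a synthetic dataset that it uses to fine-tune your student model for your specific use case. For more information, see How Amazon Bedrock Model Distillation works.

You can optimize the synthetic data generation process by formatting your input prompts for the use case that you want. For example, if the use case for your distilled model is retrieval augmented generation (RAG), you would format your prompts differently than if you want the model to focus on agent use cases.

The following examples show how you can format your input prompts for RAG or agent use cases.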

RAG prompt example


{
  "schemaVersion": "bedrock-conversation-2024",
  "system": [
    {
      "text": "You are a financial analyst charged with answering questions about 10K and 10Q SEC filings. Given the context below, answer the following question."
    }
  ],
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "text": "<context>\nDocument 1: Multiple legal actions have been filed against us as a result of the October 29, 2018 accident of Lion Air Flight 610 and the March 10, 2019 accident of Ethiopian Airlines Flight 302.\n</context>\n\n<question>Has Boeing reported any materially important ongoing legal battles from FY2022?</question>"
        }
      ]
    }
  ]
}
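The RAG example above wraps the retrieved passages in `<context>` tags and the user question in `<question>` tags. A minimal sketch of how such records could be assembled programmatically follows. The helper name `build_rag_record` and its parameters are hypothetical; only the field names and the `schemaVersion` value are taken from the example above.

```python
def build_rag_record(system_text, documents, question):
    """Assemble one prompt record in the layout of the RAG example."""
    # Number the retrieved passages as "Document 1: ...", "Document 2: ...".
    context = "\n".join(
        f"Document {i}: {doc}" for i, doc in enumerate(documents, start=1)
    )
    # Keep context and question in separate tagged sections of the user turn.
    user_text = (
        f"<context>\n{context}\n</context>\n\n<question>{question}</question>"
    )
    return {
        "schemaVersion": "bedrock-conversation-2024",
        "system": [{"text": system_text}],
        "messages": [{"role": "user", "content": [{"text": user_text}]}],
    }
```

Building the user text in one place keeps the tag layout identical across every record in the dataset, which is the point of formatting prompts for a specific use case.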

Agent prompt example


{
    "schemaVersion": "bedrock-conversation-2024",
    "system": [
        {
            "text": "You are an expert in composing functions. You are given a question and a set of possible functions. Based on the question, you will need to make one or more function/tool calls to achieve the purpose.\nHere is a list of functions in JSON format that you can invoke.\n[\n  {\n    \"name\": \"lookup_weather\",\n    \"description\": \"Lookup weather to a specific location\",\n    \"parameters\": {\n      \"type\": \"dict\",\n      \"required\": [\"location\"],\n      \"properties\": {\n        \"location\": {\"type\": \"string\"},\n        \"date\": {\"type\": \"string\"}\n      }\n    }\n  }\n]"
        }
    ],
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "text": "What's the weather tomorrow?"
                }
            ]
        },
        {
            "role": "assistant",
            "content": [
               {
                   "text": "[lookup_weather(location=\"san francisco\", date=\"tomorrow\")]"
               }
            ]
        }
    ]
}
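The agent example above embeds a list of function definitions inside the system text. Escaping that nested JSON by hand is error-prone, so one option is to let a JSON library do it. The following is a minimal sketch under that assumption. The helper name `build_agent_record` and its parameters are hypothetical; the field names, the `schemaVersion` value, and the optional trailing assistant turn mirror the example above.

```python
import json


def build_agent_record(instructions, tools, user_text, expected_call=None):
    """Assemble one agent-style prompt record with the tool list in the system text."""
    # Serialize the tool definitions first; they become plain text in the prompt.
    tool_block = json.dumps(tools, indent=2)
    system_text = (
        f"{instructions}\n"
        "Here is a list of functions in JSON format that you can invoke.\n"
        f"{tool_block}"
    )
    messages = [{"role": "user", "content": [{"text": user_text}]}]
    if expected_call is not None:
        # Optional assistant turn, as in the example above.
        messages.append({"role": "assistant", "content": [{"text": expected_call}]})
    return {
        "schemaVersion": "bedrock-conversation-2024",
        "system": [{"text": system_text}],
        "messages": messages,
    }
```

Passing the returned dictionary to `json.dumps` produces a single line in which the quotes and newlines of the embedded function list are escaped correctly.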
