準備 Amazon Nova 模型的資料

當您使用強化微調來微調 Amazon Nova 模型時，您可以自備提示或使用現有的 Amazon Bedrock API 調用日誌作為訓練資料。

訓練資料需求和來源

您可以透過下列其中一個選項提供訓練資料：

注意

我們僅支援 OpenAI 聊天完成格式。

收集提示並以 .jsonl 檔案格式存放這些提示。您可以上傳 JSONL 格式的自訂資料集，或從 Amazon S3 選取現有的資料集。JSONL 中的每筆記錄都必須在下列結構中使用 OpenAI 聊天完成格式：

messages：在此欄位中，包含使用者、系統或助理角色，其中包含提供給模型的輸入提示。
reference_answer：在此欄位中，它應該包含預期輸出或評估條件，您的獎勵函數會用來對模型的回應進行評分。它不限於結構化輸出，它可以包含任何格式，協助您的獎勵函數評估品質。
【選用】您可以新增分級器 Lambda 用於分級的欄位。

使用要求：

JSONL 格式，其中包含 OpenAI 聊天完成格式的提示（每行一個提示）
訓練資料集至少 100 筆記錄
Amazon Bedrock 會自動驗證訓練資料集格式

當您建立強化微調任務時，可以讓 Amazon Bedrock 使用 S3 儲存貯體中的現有調用日誌做為訓練資料。對於 Amazon Bedrock，調用日誌是模型調用的詳細記錄。

您可以從 Amazon S3 使用客戶端儲存的調用/轉換 API 調用日誌進行訓練。

使用要求：

必須為您的 Amazon Bedrock 用量啟用 API 記錄
日誌必須是支援的格式 (Amazon Bedrock Invoke/Converse API)
至少 100 個提示範例

若要使用調用日誌進行強化微調，請在上設定模型調用日誌、使用其中一個模型調用操作，並確定您已將 Amazon S3 儲存貯體設定為日誌的目的地。如需設定調用日誌的詳細資訊，請參閱使用 CloudWatch Logs 和 Amazon S3 監控模型調用。

您必須先提供 Amazon Bedrock 許可，以從 S3 儲存貯體存取 S3 儲存貯體的日誌，才能啟動強化微調任務做為輸入。如需詳細資訊，請參閱模型自訂存取和安全性。

您可以選擇使用其中一個模型調用操作，將請求中繼資料新增至調用日誌中的提示-回應對，然後使用這個內容來篩選日誌。Amazon Bedrock 可以使用篩選的日誌來微調模型。

將請求中繼資料新增至調用日誌中的提示和回應

透過調用日誌，您可以識別您希望 Amazon Bedrock 使用連接到調用日誌的請求中繼資料進行強化微調的訓練資料集提示。

以下是來自 InvokeModel呼叫的調用日誌範例requestMetadata：


{
"schemaType": "ModelInvocationLog",
    "schemaVersion": "1.0",
    "timestamp": "2024-09-11T17:16:18Z",
    "accountId": "XXXXX",
    "identity": {
"arn": "arn:aws:sts::XXXXXXX:assumed-role/Admin/XXXXXX"
    },
    "region": "us-east-1",
    "requestId": "7ee514f3-9eff-467c-af75-8fd564ce58e5",
    "operation": "InvokeModel",
    "modelId": "meta.llama3-1-405b-instruct-v1:0",
    "input": {
"inputContentType": "application/json",
        "inputBodyJson": {
"prompt": "story of two dogs",
            "max_tokens_to_sample": 300
        },
        "inputTokenCount": 13
    },
    "output": {
"outputContentType": "application/json",
        "outputBodyJson": {
"type": "completion",
            "completion": " Here is a story about two dogs:\n\nRex and Charlie were best friends who lived next door to each other. Every morning, their owners would let them out into the backyard to play. Rex was a golden retriever with long, shaggy fur. He loved to run and chase sticks. Charlie was a corgi with short legs and a big smile. He was happy just sniffing the grass and rolling around. \n\nThe two dogs spent every day playing together. Rex would grab a stick and bring it over to Charlie, wagging his tail excitedly. Charlie would take the other end of the stick in his mouth and they'd have a friendly game of tug-of-war. After tiring themselves out, they'd flop down in the warm sunshine for a nap. \n\nAt lunchtime, their owners would call them in for food. Rex would gobble up his kibble hungrily while Charlie ate his at a more leisurely pace. After lunch, it was right back outside for more playtime. The afternoon sunbeams would light up their fur as they chased each other around the yard. \n\nWhen it started getting late, their owners called them in for the night. Rex and Charlie would head inside, tired but happy after another fun day of play. After slurping up fresh water from their bowls, they'd curl up on their beds, Rex's fluffy golden tail tucked over his nose and little",
            "stop_reason": "max_tokens",
            "stop": null
        },
        "outputTokenCount": 300
    },
    "requestMetadata": {
"project": "CustomerService",
        "intent": "ComplaintResolution",
        "priority": "High"
    }
}

當您啟動強化微調任務時，您可以將調用日誌指定為輸入資料來源。您可以使用 API AWS CLI或 SDK，透過 Amazon Bedrock 主控台啟動強化微調任務。

提供請求中繼資料的需求

請求中繼資料必須符合下列需求：

以 JSON key:value 格式提供。
金鑰和值對必須是最多 256 個字元的字串。
提供最多 16 個鍵/值對。

使用請求中繼資料篩選器

使用具有請求中繼資料的調用日誌後，您可以根據請求中繼資料套用篩選條件，以選擇性地選擇要包含哪些提示來微調模型。例如，您可能只想包含具有 "project": "CustomerService"和 "priority": "High"請求中繼資料的。

若要使用多個請求中繼資料篩選日誌，請使用單一布林運算子AND或 OR。您無法合併這些運算子。對於單一請求中繼資料篩選，請使用 Equals或 Not Equals運算子。

有效訓練資料的特性

有效的 RFT 訓練資料需要三個關鍵特性：

清晰度和一致性 – 使用明確、不明確且格式一致的提示。避免矛盾的標籤、模棱兩可的指示，或誤導訓練的衝突參考答案。
多樣性 - 包括各種輸入格式、邊緣案例和難度層級，反映不同使用者類型和案例的生產使用模式。
高效獎勵函數 – 設計快速執行（秒，而非分鐘） AWS Lambda、平行化並傳回一致分數的函數，以進行符合成本效益的訓練。

其他屬性

RFT 資料格式支援核心結構描述需求之外的自訂欄位 (messages 和 reference_answer)。此彈性可讓您新增獎勵函數進行適當評估所需的任何其他資料。

注意

您不需要在配方中設定此項目。資料格式本質上支援其他欄位。只要將它們包含在訓練資料 JSON 中，它們就會在 metadata 欄位中傳遞給您的獎勵函數。

常見的其他屬性

task_id – 用於追蹤的唯一識別符
difficulty_level – 問題複雜性指標
domain – 主題區域或類別
expected_reasoning_steps – 解決方案中的步驟數量

這些額外的欄位會在評估期間傳遞給您的獎勵函數，以根據您的特定使用案例量身打造複雜的評分邏輯。

具有其他屬性的範例

您的瀏覽器已停用或無法使用 Javascript。

您必須啟用 Javascript，才能使用 AWS 文件。請參閱您的瀏覽器說明頁以取得說明。

文件慣用形式

存取和安全性

設定獎勵函數