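Option 2: Use invocation logs for data preparation
When you create a reinforcement fine-tuning job, you can have Amazon Bedrock use existing invocation logs from your S3 bucket as training data. In Amazon Bedrock, an invocation log is a detailed record of a model invocation.
To use invocation logs for reinforcement fine-tuning, turn on model invocation logging, call the model with one of the model invocation operations, and make sure that an Amazon S3 bucket is set as the destination for the logs. For more information about setting up invocation logs, see Monitor model invocation using CloudWatch Logs and Amazon S3.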
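The logging setup described above could be scripted roughly as follows. This is a minimal sketch, assuming the boto3 Bedrock control-plane client and its put_model_invocation_logging_configuration call; the bucket name, key prefix, and the helper names build_logging_config and enable_invocation_logging are hypothetical and used only for illustration.

```python
def build_logging_config(bucket_name: str, key_prefix: str = "") -> dict:
    """Build a loggingConfig payload that delivers text logs to an S3 bucket."""
    if not bucket_name:
        raise ValueError("bucket_name must be a non-empty string")
    s3_config = {"bucketName": bucket_name}
    if key_prefix:
        s3_config["keyPrefix"] = key_prefix
    return {
        "s3Config": s3_config,
        # Prompts and responses are text, so text delivery must be on.
        "textDataDeliveryEnabled": True,
    }


def enable_invocation_logging(bucket_name: str, key_prefix: str = "",
                              region: str = "us-east-1") -> None:
    """Turn on model invocation logging with S3 as the destination."""
    # Imported lazily so build_logging_config works without boto3 installed.
    import boto3

    client = boto3.client("bedrock", region_name=region)
    client.put_model_invocation_logging_configuration(
        loggingConfig=build_logging_config(bucket_name, key_prefix)
    )
```

Calling enable_invocation_logging("my-log-bucket", "bedrock-logs/") would need AWS credentials and a bucket policy that lets Amazon Bedrock write to the bucket; both names are placeholders.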
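Before you start a reinforcement fine-tuning job that takes invocation logs from an S3 bucket as input, you must give Amazon Bedrock permission to access the logs in that bucket. For more information, see Invocation log access permissions.
Optionally, you can add request metadata to the prompt-response pairs in the invocation logs by using one of the model invocation operations, and then use that metadata later to filter the logs. Amazon Bedrock can use the filtered logs to fine-tune the model.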
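Add request metadata to prompts and responses in your invocation logs for reinforcement fine-tuning
With invocation logs, you can use the request metadata attached to each log to identify the training dataset prompts that you want Amazon Bedrock to use for reinforcement fine-tuning.
The following is an example of an invocation log from an InvokeModel call with requestMetadata: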
{
  "schemaType": "ModelInvocationLog",
  "schemaVersion": "1.0",
  "timestamp": "2024-09-11T17:16:18Z",
  "accountId": "XXXXX",
  "identity": {
    "arn": "arn:aws:sts::XXXXXXX:assumed-role/Admin/XXXXXX"
  },
  "region": "us-east-1",
  "requestId": "7ee514f3-9eff-467c-af75-8fd564ce58e5",
  "operation": "InvokeModel",
  "modelId": "meta.llama3-1-405b-instruct-v1:0",
  "input": {
    "inputContentType": "application/json",
    "inputBodyJson": {
      "prompt": "story of two dogs",
      "max_tokens_to_sample": 300
    },
    "inputTokenCount": 13
  },
  "output": {
    "outputContentType": "application/json",
    "outputBodyJson": {
      "type": "completion",
      "completion": " Here is a story about two dogs:\n\nRex and Charlie were best friends who lived next door to each other. Every morning, their owners would let them out into the backyard to play. Rex was a golden retriever with long, shaggy fur. He loved to run and chase sticks. Charlie was a corgi with short legs and a big smile. He was happy just sniffing the grass and rolling around. \n\nThe two dogs spent every day playing together. Rex would grab a stick and bring it over to Charlie, wagging his tail excitedly. Charlie would take the other end of the stick in his mouth and they'd have a friendly game of tug-of-war. After tiring themselves out, they'd flop down in the warm sunshine for a nap. \n\nAt lunchtime, their owners would call them in for food. Rex would gobble up his kibble hungrily while Charlie ate his at a more leisurely pace. After lunch, it was right back outside for more playtime. The afternoon sunbeams would light up their fur as they chased each other around the yard. \n\nWhen it started getting late, their owners called them in for the night. Rex and Charlie would head inside, tired but happy after another fun day of play. After slurping up fresh water from their bowls, they'd curl up on their beds, Rex's fluffy golden tail tucked over his nose and little",
      "stop_reason": "max_tokens",
      "stop": null
    },
    "outputTokenCount": 300
  },
  "requestMetadata": {
    "project": "CustomerService",
    "intent": "ComplaintResolution",
    "priority": "High"
  }
}
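To make the structure of such a record concrete, here is a small sketch that reads one log record and pulls out the prompt, the completion, and the request metadata. The helper name extract_example is hypothetical, the sample line is an abbreviated copy of the log above, and the field names inside inputBodyJson and outputBodyJson are assumed to match this example; they vary by model and operation.

```python
import json


def extract_example(record: dict) -> dict:
    """Pull the prompt, completion, and request metadata out of one log record."""
    body_in = record.get("input", {}).get("inputBodyJson", {})
    body_out = record.get("output", {}).get("outputBodyJson", {})
    return {
        "operation": record.get("operation"),
        "prompt": body_in.get("prompt"),
        "completion": body_out.get("completion"),
        # Logs written without request metadata simply lack this key.
        "request_metadata": record.get("requestMetadata", {}),
    }


# Abbreviated copy of the log record shown above (completion text shortened).
sample_line = (
    '{"operation": "InvokeModel", '
    '"input": {"inputBodyJson": {"prompt": "story of two dogs"}}, '
    '"output": {"outputBodyJson": {"completion": " Here is a story about two dogs:"}}, '
    '"requestMetadata": {"project": "CustomerService", '
    '"intent": "ComplaintResolution", "priority": "High"}}'
)

example = extract_example(json.loads(sample_line))
print(example["prompt"], example["request_metadata"])
```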
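When you start a reinforcement fine-tuning job, you can specify invocation logs as the input data source. You can start a reinforcement fine-tuning job through the Amazon Bedrock console, the API, the AWS CLI, or an SDK.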
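Requirements for providing request metadata
Request metadata must meet the following requirements:
- Provide it in JSON key:value format.
- Each key and value must be a string of at most 256 characters.
- Provide a maximum of 16 key-value pairs.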
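The requirements above can be checked locally before a request is sent. The following is a minimal sketch, not an official validator; the function name validate_request_metadata is hypothetical, and it assumes the 256-character limit applies to each key and each value separately.

```python
MAX_PAIRS = 16     # maximum number of key-value pairs
MAX_LENGTH = 256   # assumed to apply to each key and each value individually


def validate_request_metadata(metadata) -> list:
    """Return a list of problems found; an empty list means the metadata passes."""
    if not isinstance(metadata, dict):
        return ["request metadata must be a JSON object of key:value pairs"]
    errors = []
    if len(metadata) > MAX_PAIRS:
        errors.append(f"too many pairs: {len(metadata)} > {MAX_PAIRS}")
    for key, value in metadata.items():
        if not isinstance(key, str) or not isinstance(value, str):
            errors.append(f"{key!r}: key and value must both be strings")
            continue
        if len(key) > MAX_LENGTH or len(value) > MAX_LENGTH:
            errors.append(f"{key[:20]!r}: key or value exceeds {MAX_LENGTH} characters")
    return errors


print(validate_request_metadata({"project": "CustomerService", "priority": "High"}))
```

Returning a list of messages instead of raising on the first problem lets a caller report every violation at once.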
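Use request metadata filters
Once invocation logs with request metadata are available, you can apply filters based on the request metadata to choose which prompts to include when fine-tuning the model. For example, you might want to include only prompts whose request metadata contains "project": "CustomerService" and "priority": "High".
To filter the logs on multiple request metadata entries, use a single Boolean operator, either AND or OR. You cannot combine these operators. To filter on a single request metadata entry, use the Equals or Not Equals operator.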
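The filter rules above can be illustrated with a local sketch that decides whether one log's request metadata passes a filter. This only mimics the semantics for illustration and is not the service's implementation; the function name matches and the tuple-based filter format are hypothetical, and treating a missing key as "not equal" is an assumption.

```python
def matches(metadata: dict, conditions: list, combine: str = "AND") -> bool:
    """Check request metadata against Equals / Not Equals conditions.

    conditions is a list of (key, operator, expected_value) tuples, joined by
    exactly one Boolean operator, since AND and OR cannot be mixed.
    """
    if combine not in ("AND", "OR"):
        raise ValueError("combine must be 'AND' or 'OR'")
    if not conditions:
        raise ValueError("at least one condition is required")
    results = []
    for key, operator, expected in conditions:
        actual = metadata.get(key)  # None when the key is absent
        if operator == "Equals":
            results.append(actual == expected)
        elif operator == "Not Equals":
            results.append(actual != expected)  # assumption: a missing key counts as not equal
        else:
            raise ValueError(f"unsupported operator: {operator}")
    return all(results) if combine == "AND" else any(results)


log_metadata = {"project": "CustomerService", "intent": "ComplaintResolution", "priority": "High"}
keep = matches(
    log_metadata,
    [("project", "Equals", "CustomerService"), ("priority", "Equals", "High")],
    combine="AND",
)
print(keep)
```

Passing a single condition corresponds to the single-entry case, where only Equals or Not Equals applies and the combine argument has no effect.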