OpenAI models - Amazon Bedrock


OpenAI models

OpenAI offers the following open-weight models:

  • gpt-oss-20b — A smaller model optimized for lower latency and local or specialized use cases.

  • gpt-oss-120b — A larger model optimized for production and general-purpose or high-reasoning use cases.

The following table summarizes information about the models:

Information                   gpt-oss-20b               gpt-oss-120b
Release date                  August 5, 2025            August 5, 2025
Model ID                      openai.gpt-oss-20b-1:0    openai.gpt-oss-120b-1:0
Product ID                    N/A                       N/A
Supported input modalities    Text                      Text
Supported output modalities   Text                      Text
Context window                128,000                   128,000

These OpenAI models support the features described in the sections that follow.

OpenAI request body

For information about the parameters in the request body and their descriptions, see Create chat completion in the OpenAI documentation.

Use the request body fields in the following ways (a sketch of the Converse mapping follows this list):

  • In an InvokeModel or OpenAI Chat Completions request, include the fields in the request body.

  • In a Converse request, do the following:

    • Map messages as follows:

      • For each message whose role is developer, add the content to a SystemContentBlock in the system array.

      • For each message whose role is user or assistant, add the content to a ContentBlock in the content field, and specify the role in the role field of a message in the messages array.

    • Map the values of the following fields to the corresponding fields in the inferenceConfig object:

      OpenAI field              Converse field
      max_completion_tokens     maxTokens
      stop                      stopSequences
      temperature               temperature
      top_p                     topP
    • Include any other fields in the additionalModelRequestFields object.
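
Because the mapping above is mechanical, it can be expressed as a small helper. The following is a minimal sketch (to_converse_request is a hypothetical function, not part of any SDK); it also treats messages with the system role like developer messages, since the native examples later in this section use the system role:

# Hypothetical helper: convert an OpenAI Create chat completion request body
# into keyword arguments for the Converse API, following the mapping above.
def to_converse_request(chat_request):
    inference_fields = {
        "max_completion_tokens": "maxTokens",
        "stop": "stopSequences",
        "temperature": "temperature",
        "top_p": "topP",
    }
    handled = {"model", "messages", "stream", *inference_fields}

    system, messages = [], []
    for message in chat_request["messages"]:
        if message["role"] in ("developer", "system"):
            # developer messages become SystemContentBlocks in the system array
            system.append({"text": message["content"]})
        else:
            # user and assistant messages become ContentBlocks in messages
            messages.append({
                "role": message["role"],
                "content": [{"text": message["content"]}],
            })

    kwargs = {"messages": messages, "system": system}

    # Map the four inference fields into inferenceConfig
    inference_config = {
        target: chat_request[source]
        for source, target in inference_fields.items()
        if source in chat_request
    }
    if inference_config:
        kwargs["inferenceConfig"] = inference_config

    # Any other fields go into additionalModelRequestFields
    additional = {k: v for k, v in chat_request.items() if k not in handled}
    if additional:
        kwargs["additionalModelRequestFields"] = additional
    return kwargs

You could then call client.converse(modelId=model_id, **to_converse_request(body)) with an Amazon Bedrock Runtime client.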

Considerations when constructing the request body
  • These OpenAI models support only text input and text output.

  • The value in the model field must match the one in the header. You can omit this field to let it automatically be filled with the same value as the header.

  • The value in the stream field must match the API operation that you use (see the sketch after this list). You can omit this field to let it automatically be filled with the correct value.

    • If you use InvokeModel, the stream value must be false.

    • If you use InvokeModelWithResponseStream, the stream value must be true.
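
As a minimal sketch of how the stream value pairs with the two operations (assuming default AWS credentials, and assuming the streamed chunks follow the OpenAI chat completion chunk format):

import boto3
import json

client = boto3.client("bedrock-runtime")
model_id = "openai.gpt-oss-20b-1:0"
body = {"messages": [{"role": "user", "content": "Hello!"}]}

# InvokeModel: stream must be false (or omitted)
response = client.invoke_model(
    modelId=model_id,
    body=json.dumps({**body, "stream": False}),
)
print(json.loads(response["body"].read())["choices"][0]["message"]["content"])

# InvokeModelWithResponseStream: stream must be true (or omitted)
response = client.invoke_model_with_response_stream(
    modelId=model_id,
    body=json.dumps({**body, "stream": True}),
)
for event in response["body"]:
    chunk = json.loads(event["chunk"]["bytes"])
    # Print each streamed content delta as it arrives
    for choice in chunk.get("choices", []):
        print(choice.get("delta", {}).get("content", "") or "", end="")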

OpenAI response body

The response body for the OpenAI models conforms to the chat completion object returned by OpenAI. For more information about the response fields, see The chat completion object in the OpenAI documentation.

Note

If you use InvokeModel, the model reasoning (surrounded by <reasoning> tags) precedes the text content of the response.
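
The following is a minimal sketch of separating the two parts, assuming the reasoning is wrapped in a single leading pair of <reasoning> tags (the example response text is made up for illustration):

import re

def split_reasoning(text):
    # Split "<reasoning>...</reasoning>answer" into (reasoning, answer)
    match = re.match(r"\s*<reasoning>(.*?)</reasoning>\s*(.*)", text, re.DOTALL)
    if match:
        return match.group(1).strip(), match.group(2).strip()
    return None, text  # no reasoning block present

reasoning, answer = split_reasoning(
    "<reasoning>The user greeted me, so I should greet back.</reasoning>Hello!"
)
print(reasoning)  # The user greeted me, so I should greet back.
print(answer)     # Hello!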

Usage examples for the OpenAI models

This section provides some examples of how to use the OpenAI models.

Before trying these examples, check that you've fulfilled the prerequisites.

Expand the section for the example that you want to see:

To see examples of using the OpenAI Create chat completion API, choose the tab for your preferred method, and then follow the steps:

OpenAI SDK (Python)

The following Python script calls the Create chat completion API with the OpenAI Python SDK:

from openai import OpenAI

client = OpenAI(
    base_url="https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1",
    api_key="$AWS_BEARER_TOKEN_BEDROCK"
)

completion = client.chat.completions.create(
    model="openai.gpt-oss-20b-1:0",
    messages=[
        {
            "role": "developer",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "Hello!"
        }
    ]
)

print(completion.choices[0].message)
HTTP request using curl

You can run the following command in a terminal to call the Create chat completion API with curl:

curl -X POST https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK" \
  -d '{
    "model": "openai.gpt-oss-20b-1:0",
    "messages": [
      {
        "role": "developer",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'

To see an example of using InvokeModel, choose the tab for your preferred method, and then follow the steps:

Python
import boto3
import json

# Initialize the Bedrock Runtime client
client = boto3.client('bedrock-runtime')

# Model ID
model_id = 'openai.gpt-oss-20b-1:0'

# Create the request body
native_request = {
    "model": model_id,  # You can omit this field
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "assistant",
            "content": "Hello! How can I help you today?"
        },
        {
            "role": "user",
            "content": "What is the weather like today?"
        }
    ],
    "max_completion_tokens": 150,
    "temperature": 0.7,
    "top_p": 0.9,
    "stream": False  # You can omit this field
}

# Make the InvokeModel request
response = client.invoke_model(
    modelId=model_id,
    body=json.dumps(native_request)
)

# Parse and print the message for each choice in the chat completion
response_body = json.loads(response['body'].read().decode('utf-8'))
for choice in response_body['choices']:
    print(choice['message']['content'])

When you use the unified Converse API, you need to map the OpenAI Create chat completion fields to the corresponding fields in the Converse request body.

For example, compare the following chat completion request body with its corresponding Converse request body:

Create chat completion request body
{
    "model": "openai.gpt-oss-20b-1:0",
    "messages": [
        {
            "role": "developer",
            "content": "You are a helpful assistant."
        },
        {
            "role": "assistant",
            "content": "Hello! How can I help you today?"
        },
        {
            "role": "user",
            "content": "What is the weather like today?"
        }
    ],
    "max_completion_tokens": 150,
    "temperature": 0.7
}
Converse request body
{
    "messages": [
        {
            "role": "assistant",
            "content": [
                {
                    "text": "Hello! How can I help you today?"
                }
            ]
        },
        {
            "role": "user",
            "content": [
                {
                    "text": "What is the weather like today?"
                }
            ]
        }
    ],
    "system": [
        {
            "text": "You are a helpful assistant."
        }
    ],
    "inferenceConfig": {
        "maxTokens": 150,
        "temperature": 0.7
    }
}

To see an example of using Converse, choose the tab for your preferred method, and then follow the steps:

Python
# Use the Conversation API to send a text message to an OpenAI model.
import boto3
from botocore.exceptions import ClientError

# Initialize the Bedrock Runtime client
client = boto3.client("bedrock-runtime")

# Set the model ID
model_id = "openai.gpt-oss-20b-1:0"

# Set up messages and system message
messages = [
    {
        "role": "assistant",
        "content": [
            {
                "text": "Hello! How can I help you today?"
            }
        ]
    },
    {
        "role": "user",
        "content": [
            {
                "text": "What is the weather like today?"
            }
        ]
    }
]

system = [
    {
        "text": "You are a helpful assistant."
    }
]

try:
    # Send the message to the model, using a basic inference configuration.
    response = client.converse(
        modelId=model_id,
        messages=messages,
        system=system,
        inferenceConfig={
            "maxTokens": 150,
            "temperature": 0.7,
            "topP": 0.9
        },
    )

    # Extract and print the response text.
    for content_block in response["output"]["message"]["content"]:
        print(content_block)

except (ClientError, Exception) as e:
    print(f"ERROR: Can't invoke '{model_id}'. Reason: {e}")
    exit(1)

Apply a guardrail when running model invocation by specifying the guardrail ID, the version, and whether to enable the guardrail trace in the header of the model invocation request.

To see an example of using a guardrail in model invocation, choose the tab for your preferred method, and then follow the steps:

Python
import boto3
from botocore.exceptions import ClientError
import json

# Initiate the Amazon Bedrock Runtime client
bedrock_runtime = boto3.client("bedrock-runtime")

# Model ID
model_id = "openai.gpt-oss-20b-1:0"

# Replace with actual values from your guardrail
guardrail_id = "GR12345"
guardrail_version = "DRAFT"

# Create the request body
native_request = {
    "model": model_id,  # You can omit this field
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "assistant",
            "content": "Hello! How can I help you today?"
        },
        {
            "role": "user",
            "content": "What is the weather like today?"
        }
    ],
    "max_completion_tokens": 150,
    "temperature": 0.7,
    "top_p": 0.9,
    "stream": False  # You can omit this field
}

try:
    response = bedrock_runtime.invoke_model(
        modelId=model_id,
        body=json.dumps(native_request),
        guardrailIdentifier=guardrail_id,
        guardrailVersion=guardrail_version,
        trace='ENABLED',
    )

    response_body = json.loads(response.get('body').read())
    print("Received response from InvokeModel API (Request Id: {})".format(response['ResponseMetadata']['RequestId']))
    print(json.dumps(response_body, indent=2))

except ClientError as err:
    print("RequestId = " + err.response['ResponseMetadata']['RequestId'])
    raise err

To see an example of using guardrails with OpenAI chat completions, choose the tab for your preferred method, and then follow the steps:

OpenAI SDK (Python)
import openai
from openai import OpenAIError

# Endpoint for Amazon Bedrock Runtime
bedrock_endpoint = "https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1"

# Model ID
model_id = "openai.gpt-oss-20b-1:0"

# Replace with actual values
bedrock_api_key = "$AWS_BEARER_TOKEN_BEDROCK"
guardrail_id = "GR12345"
guardrail_version = "DRAFT"

client = openai.OpenAI(
    api_key=bedrock_api_key,
    base_url=bedrock_endpoint,
)

try:
    response = client.chat.completions.create(
        model=model_id,
        # Specify guardrail information in the header
        extra_headers={
            "X-Amzn-Bedrock-GuardrailIdentifier": guardrail_id,
            "X-Amzn-Bedrock-GuardrailVersion": guardrail_version,
            "X-Amzn-Bedrock-Trace": "ENABLED",
        },
        # Additional guardrail information can be specified in the body
        extra_body={
            "amazon-bedrock-guardrailConfig": {
                "tagSuffix": "xyz"  # Used for input tagging
            }
        },
        messages=[
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "assistant",
                "content": "Hello! How can I help you today?"
            },
            {
                "role": "user",
                "content": "What is the weather like today?"
            }
        ]
    )
    request_id = response._request_id
    print(f"Request ID: {request_id}")
    print(response)

except OpenAIError as e:
    print(f"An error occurred: {e}")
    if hasattr(e, 'response') and e.response is not None:
        request_id = e.response.headers.get("x-request-id")
        print(f"Request ID: {request_id}")
OpenAI SDK (Java)
import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.core.JsonValue;
import com.openai.core.http.HttpResponseFor;
import com.openai.models.chat.completions.ChatCompletion;
import com.openai.models.chat.completions.ChatCompletionCreateParams;

import java.util.Map;

// Endpoint for Amazon Bedrock Runtime
String bedrockEndpoint = "https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1";

// Model ID
String modelId = "openai.gpt-oss-20b-1:0";

// Replace with actual values
String bedrockApiKey = "$AWS_BEARER_TOKEN_BEDROCK";
String guardrailId = "GR12345";
String guardrailVersion = "DRAFT";

OpenAIClient client = OpenAIOkHttpClient.builder()
        .apiKey(bedrockApiKey)
        .baseUrl(bedrockEndpoint)
        .build();

ChatCompletionCreateParams request = ChatCompletionCreateParams.builder()
        .addUserMessage("What is the temperature in Seattle?")
        .model(modelId)
        // Specify additional headers for the guardrail
        .putAdditionalHeader("X-Amzn-Bedrock-GuardrailIdentifier", guardrailId)
        .putAdditionalHeader("X-Amzn-Bedrock-GuardrailVersion", guardrailVersion)
        // Specify additional body parameters for the guardrail
        .putAdditionalBodyProperty(
                "amazon-bedrock-guardrailConfig",
                JsonValue.from(Map.of("tagSuffix", JsonValue.of("xyz")))  // Allows input tagging
        )
        .build();

HttpResponseFor<ChatCompletion> rawChatCompletionResponse =
        client.chat().completions().withRawResponse().create(request);
final ChatCompletion chatCompletion = rawChatCompletionResponse.parse();
System.out.println(chatCompletion);

Batch inference lets you run model inference asynchronously with multiple prompts. To run batch inference with an OpenAI model, do the following:

  1. Create a JSONL file and populate it with at least the minimum number of JSON objects, each separated by a newline. Each modelInput object must conform to the format of the OpenAI Create chat completion request body. The following shows an example of the first two lines of a JSONL file containing OpenAI request bodies.

    { "recordId": "RECORD1", "modelInput": { "messages": [ { "role": "system", "content": "You are a helpful assistant." }, { "role": "user", "content": "Can you generate a question with a factual answer?" } ], "max_completion_tokens": 1000 } }
    { "recordId": "RECORD2", "modelInput": { "messages": [ { "role": "system", "content": "You are a helpful assistant." }, { "role": "user", "content": "What is the weather like today?" } ], "max_completion_tokens": 1000 } }
    ...
    Note

    The model field is optional because, if you omit it, the batch inference service inserts it for you based on the header.

    Check that your JSONL file conforms to the batch inference quotas as outlined in Format and upload your batch inference data.

  2. Upload the file to an Amazon S3 bucket.

  3. Send a CreateModelInvocationJob request with an Amazon Bedrock control plane endpoint, specifying the S3 bucket from the previous step in the inputDataConfig field and the OpenAI model in the modelId field, as sketched below.
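
    The following is a minimal sketch of that request with boto3 (the job name, role ARN, and bucket URIs are placeholders that you must replace):

    import boto3

    # Control plane client (bedrock), not the runtime client (bedrock-runtime)
    bedrock = boto3.client("bedrock")

    response = bedrock.create_model_invocation_job(
        jobName="my-openai-batch-job",  # placeholder
        roleArn="arn:aws:iam::123456789012:role/MyBatchInferenceRole",  # placeholder
        modelId="openai.gpt-oss-20b-1:0",
        inputDataConfig={
            "s3InputDataConfig": {
                "s3Uri": "s3://amzn-s3-demo-bucket/input/batch_input.jsonl"  # placeholder
            }
        },
        outputDataConfig={
            "s3OutputDataConfig": {
                "s3Uri": "s3://amzn-s3-demo-bucket/output/"  # placeholder
            }
        },
    )
    print(response["jobArn"])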

For an end-to-end code example, see Code example for batch inference. Substitute the proper configurations for the OpenAI models.