OpenAI models
OpenAI offers the following open-weight models:
- gpt-oss-20b – A smaller model optimized for lower latency and local or specialized use cases.
- gpt-oss-120b – A larger model optimized for production and general-purpose or high-reasoning use cases.
The following table summarizes information about the models:
| Information | gpt-oss-20b | gpt-oss-120b |
|---|---|---|
| Release date | August 5, 2025 | August 5, 2025 |
| Model ID | openai.gpt-oss-20b-1:0 | openai.gpt-oss-120b-1:0 |
| Product ID | N/A | N/A |
| Input modalities supported | Text | Text |
| Output modalities supported | Text | Text |
| Context window | 128,000 tokens | 128,000 tokens |
The OpenAI models support the following features:

- Model invocation with the InvokeModel and Converse API operations (and their streaming counterparts), and with the OpenAI Chat Completions API.
- Guardrails application through the use of headers in the model invocation operations.
OpenAI request body
For information about the parameters in the request body and their descriptions, see Create chat completion in the OpenAI documentation.
Use the request body fields in the following ways:
- In an InvokeModel or OpenAI Chat Completions request, include the fields in the request body.
- In a Converse request, do the following:
  - Map the messages as follows:
    - For each message whose role is developer, add the content to a SystemContentBlock in the system array.
    - For each message whose role is user or assistant, add the content to a ContentBlock in the content field and specify the role in the role field of a Message in the messages array.
  - Map the values for the following fields to the corresponding fields in the inferenceConfig object:

    | OpenAI field | Converse field |
    |---|---|
    | max_completion_tokens | maxTokens |
    | stop | stopSequences |
    | temperature | temperature |
    | top_p | topP |

  - Include any other fields in the additionalModelRequestFields object. (See the sketch following this list.)
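To make the mapping concrete, the following is a minimal sketch that converts an OpenAI-style request dictionary into keyword arguments for a Converse call. The helper name is illustrative, not part of any SDK.

```python
# Sketch: convert an OpenAI-style chat completion request body into
# keyword arguments for the Converse API. The helper name is illustrative.

INFERENCE_CONFIG_MAP = {
    "max_completion_tokens": "maxTokens",
    "stop": "stopSequences",
    "temperature": "temperature",
    "top_p": "topP",
}

def to_converse_kwargs(openai_request: dict) -> dict:
    system, messages = [], []
    for message in openai_request["messages"]:
        if message["role"] == "developer":
            # developer messages become SystemContentBlocks in the system array
            system.append({"text": message["content"]})
        else:
            # user and assistant messages become Messages with ContentBlocks
            messages.append({
                "role": message["role"],
                "content": [{"text": message["content"]}],
            })

    inference_config = {}
    for openai_field, converse_field in INFERENCE_CONFIG_MAP.items():
        if openai_field in openai_request:
            value = openai_request[openai_field]
            # Converse expects stopSequences as a list; OpenAI allows a string.
            if openai_field == "stop" and isinstance(value, str):
                value = [value]
            inference_config[converse_field] = value

    # Everything else goes into additionalModelRequestFields.
    handled = set(INFERENCE_CONFIG_MAP) | {"messages", "model", "stream"}
    additional = {k: v for k, v in openai_request.items() if k not in handled}

    kwargs = {"messages": messages}
    if system:
        kwargs["system"] = system
    if inference_config:
        kwargs["inferenceConfig"] = inference_config
    if additional:
        kwargs["additionalModelRequestFields"] = additional
    return kwargs
```

A Converse call would then look like client.converse(modelId="openai.gpt-oss-20b-1:0", **to_converse_kwargs(request_body)).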
Considerations when constructing the request body

- The OpenAI models support only text input and text output.
- The value in the model field must match the one in the header. You can omit this field to let it be automatically populated with the same value as the header.
- The value in the stream field must match the API operation that you use. You can omit this field to let it be automatically populated with the correct value.
  - If you use InvokeModel, the stream value must be false.
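As an illustration of the stream consideration, the following sketch uses the streaming counterpart of InvokeModel, which is assumed to be InvokeModelWithResponseStream, so stream is set to true (omitting the field would also work). The Region and model ID are illustrative.

```python
import json

import boto3

# Illustrative Region and model ID; use ones that you have access to.
client = boto3.client("bedrock-runtime", region_name="us-west-2")

response = client.invoke_model_with_response_stream(
    modelId="openai.gpt-oss-20b-1:0",
    body=json.dumps({
        "messages": [{"role": "user", "content": "Say hello."}],
        "max_completion_tokens": 100,
        "stream": True,  # must match the operation; omit to auto-populate
    }),
)

for event in response["body"]:
    # Assumption: each chunk follows the OpenAI chat completion chunk format.
    print(json.loads(event["chunk"]["bytes"]))
```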
OpenAI response body
The response body for OpenAI models conforms to the chat completion object returned by OpenAI. For more information about the response fields, see The chat completion object in the OpenAI documentation.
Note
If you use InvokeModel, the model reasoning, surrounded by <reasoning> tags, precedes the text content of the response.
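If you need the reasoning and the answer separately, the following is a minimal sketch, assuming the reasoning appears as a literal <reasoning>...</reasoning> block at the start of the text.

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Split an InvokeModel text response into (reasoning, answer).

    Assumes the reasoning, if present, is a literal <reasoning>...</reasoning>
    block at the start of the text.
    """
    match = re.match(r"\s*<reasoning>(.*?)</reasoning>\s*", text, re.DOTALL)
    if match:
        return match.group(1).strip(), text[match.end():]
    return "", text  # no reasoning block found

reasoning, answer = split_reasoning(
    "<reasoning>The user wants a greeting.</reasoning>Hello! How can I help?"
)
print(answer)  # "Hello! How can I help?"
```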
Example usage of OpenAI models
This section provides some examples of how to use the OpenAI models.
Before trying out these examples, check that you've fulfilled the prerequisites:
- Authentication – You can authenticate with either your AWS credentials or with an Amazon Bedrock API key.

  Set up your AWS credentials or generate an Amazon Bedrock API key to authenticate your request.

  To learn about setting up your AWS credentials, see Programmatic access with AWS security credentials. To learn about Amazon Bedrock API keys and how to generate them, see Generate Amazon Bedrock API keys to easily authenticate to the Amazon Bedrock API.

  Note

  If you use the OpenAI Chat Completions API, you can only authenticate with an Amazon Bedrock API key.

- Endpoint – Find the endpoint that corresponds to the AWS Region to use in Amazon Bedrock Runtime endpoints and quotas. If you use an AWS SDK, you might only need to specify the Region code, not the whole endpoint, when you set up the client. You must use an endpoint associated with a Region supported by the model used in the example.

- Model access – Request access to an OpenAI model. For more information, see Add or remove access to Amazon Bedrock foundation models.

- (If the example uses an SDK) Install the SDK – After installation, set up default credentials and a default AWS Region. If you don't set up default credentials or a Region, you have to specify them explicitly in the relevant code examples. For more information about standardized credential providers, see AWS SDKs and Tools standardized credential providers.

  Note

  If you use the OpenAI SDK, you can only authenticate with an Amazon Bedrock API key and you must explicitly set the Amazon Bedrock endpoint, as in the sketch after this list.
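The following is a minimal sketch of that setup; the endpoint form and the environment variable name are assumptions for illustration.

```python
import os

from openai import OpenAI

# Both values below are illustrative: substitute your AWS Region in the
# endpoint, and read your Amazon Bedrock API key from wherever you store it.
client = OpenAI(
    base_url="https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1",
    api_key=os.environ["BEDROCK_API_KEY"],
)
```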
The following examples show how to use the OpenAI Create chat completion API.
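Here is a minimal sketch using InvokeModel with the AWS SDK for Python (Boto3); the Region and model ID are illustrative. With the OpenAI SDK, the client configured in the earlier sketch can send the same messages through client.chat.completions.create.

```python
import json

import boto3

# Illustrative Region and model ID; use ones that you have access to.
client = boto3.client("bedrock-runtime", region_name="us-west-2")

request_body = {
    "messages": [
        {"role": "developer", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Can you generate a question with a factual answer?"},
    ],
    "max_completion_tokens": 1000,
    # "model" and "stream" are omitted; Amazon Bedrock populates them.
}

response = client.invoke_model(
    modelId="openai.gpt-oss-20b-1:0",
    body=json.dumps(request_body),
)

result = json.loads(response["body"].read())
print(result["choices"][0]["message"]["content"])
```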
When you use the unified Converse API, you need to map the OpenAI Create chat completion fields to the corresponding fields in the Converse request body.

For example, compare the following chat completion request body to its corresponding Converse request body:
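The values below are illustrative; reasoning_effort stands in for any field outside the standard mapping, which therefore lands in additionalModelRequestFields.

```python
# OpenAI Create chat completion request body (illustrative values).
chat_completion_request = {
    "messages": [
        {"role": "developer", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "max_completion_tokens": 500,
    "temperature": 0.5,
    "top_p": 0.9,
    "reasoning_effort": "low",  # example of an "other" field
}

# Corresponding Converse request body, per the mapping described earlier.
converse_request = {
    "system": [{"text": "You are a helpful assistant."}],
    "messages": [
        {"role": "user", "content": [{"text": "Hello!"}]},
    ],
    "inferenceConfig": {
        "maxTokens": 500,
        "temperature": 0.5,
        "topP": 0.9,
    },
    "additionalModelRequestFields": {"reasoning_effort": "low"},
}
```

You would pass the entries of the second dictionary, along with the modelId, as arguments to Converse.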
Apply a guardrail during model invocation by specifying the guardrail ID, the guardrail version, and whether to enable the guardrail trace in the headers of the request.
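For example, here is a minimal sketch with the AWS SDK for Python (Boto3); the guardrail ID and version, Region, and model ID are placeholders, and Boto3 sends these parameters as the corresponding request headers.

```python
import json

import boto3

client = boto3.client("bedrock-runtime", region_name="us-west-2")

response = client.invoke_model(
    modelId="openai.gpt-oss-20b-1:0",
    guardrailIdentifier="your-guardrail-id",  # placeholder
    guardrailVersion="1",                     # placeholder
    trace="ENABLED",                          # enables the guardrail trace
    body=json.dumps({
        "messages": [{"role": "user", "content": "Hello!"}],
        "max_completion_tokens": 100,
    }),
)
print(json.loads(response["body"].read()))
```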
You can also apply guardrails when you use the OpenAI chat completions API, as in the following sketch.
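In this minimal sketch with the OpenAI SDK, the endpoint form, the environment variable name, and the guardrail header names are assumptions; the header names mirror the ones InvokeModel uses.

```python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1",  # assumed form
    api_key=os.environ["BEDROCK_API_KEY"],  # illustrative variable name
)

completion = client.chat.completions.create(
    model="openai.gpt-oss-20b-1:0",  # illustrative model ID
    messages=[{"role": "user", "content": "Hello!"}],
    max_completion_tokens=100,
    extra_headers={
        # Assumed header names, mirroring the InvokeModel guardrail headers.
        "X-Amzn-Bedrock-GuardrailIdentifier": "your-guardrail-id",
        "X-Amzn-Bedrock-GuardrailVersion": "1",
        "X-Amzn-Bedrock-Trace": "ENABLED",
    },
)
print(completion.choices[0].message.content)
```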
Batch inference lets you run model inference asynchronously with multiple prompts. To run batch inference with an OpenAI model, you do the following:
- Create a JSONL file and populate it with at least the minimum number of JSON objects, each separated by a new line. Each modelInput object must conform to the format of the OpenAI Create chat completion request body. The following shows an example of the first two lines of a JSONL file containing request bodies for OpenAI:

  ```
  { "recordId": "RECORD1", "modelInput": { "messages": [ { "role": "system", "content": "You are a helpful assistant." }, { "role": "user", "content": "Can you generate a question with a factual answer?" } ], "max_completion_tokens": 1000 } }
  { "recordId": "RECORD2", "modelInput": { "messages": [ { "role": "system", "content": "You are a helpful assistant." }, { "role": "user", "content": "What is the weather like today?" } ], "max_completion_tokens": 1000 } }
  ...
  ```

  Note

  The model field is optional because the batch inference service will insert it for you based on the header if you omit it.

  Check that your JSONL file conforms to the batch inference quotas as outlined in Format and upload your batch inference data.
- Upload the file to an Amazon S3 bucket.
- Send a CreateModelInvocationJob request to an Amazon Bedrock control plane endpoint, with the S3 bucket from the previous step specified in the inputDataConfig field and the OpenAI model specified in the modelId field, as in the sketch after this list.
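The following is a minimal sketch of this step with the AWS SDK for Python (Boto3); the job name, role ARN, bucket URIs, Region, and model ID are placeholders.

```python
import boto3

# The Amazon Bedrock control plane client (not bedrock-runtime).
bedrock = boto3.client("bedrock", region_name="us-west-2")

response = bedrock.create_model_invocation_job(
    jobName="openai-batch-job",                                  # placeholder
    roleArn="arn:aws:iam::123456789012:role/BedrockBatchRole",   # placeholder
    modelId="openai.gpt-oss-20b-1:0",
    inputDataConfig={
        "s3InputDataConfig": {"s3Uri": "s3://amzn-s3-demo-bucket/input/records.jsonl"}
    },
    outputDataConfig={
        "s3OutputDataConfig": {"s3Uri": "s3://amzn-s3-demo-bucket/output/"}
    },
)
print(response["jobArn"])
```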
For an end-to-end code example, see Code example for batch inference. Replace the configurations in that example with values appropriate for the OpenAI models.