Inference using Responses API
Amazon Bedrock provides the OpenAI Responses API via the bedrock-mantle endpoint,
powered by Mantle, a distributed inference engine for large-scale machine learning model
serving. This endpoint allows you to use familiar OpenAI SDKs and tools with Amazon Bedrock models,
enabling you to migrate existing applications with minimal code changes—simply update your
base URL and API key.
Important
When using the OpenAI SDK with Amazon Bedrock, you must point it to the Amazon Bedrock endpoint, not the OpenAI endpoint. Set the following environment variables:
OPENAI_BASE_URL="https://bedrock-mantle.<your-region>.api.aws/v1"
OPENAI_API_KEY="<your Bedrock API key>"
Do not use your OpenAI API key or the OpenAI base URL (https://api.openai.com/v1). Those connect to OpenAI directly, not to Amazon Bedrock. To create an Amazon Bedrock API key, see API keys.
Key benefits include:
- Asynchronous inference – Support for long-running inference workloads through the Responses API
- Stateful conversation management – Automatically rebuild context without manually passing conversation history with each request
- Simplified tool use – Streamlined integration for agentic workflows
- Flexible response modes – Support for both streaming and non-streaming responses
- Easy migration – Compatible with existing OpenAI SDK codebases
Supported Regions and Endpoints
The bedrock-mantle endpoint is available in the following AWS Regions:
| Region Name | Region | Endpoint |
|---|---|---|
| US East (Ohio) | us-east-2 | bedrock-mantle.us-east-2.api.aws |
| US East (N. Virginia) | us-east-1 | bedrock-mantle.us-east-1.api.aws |
| US West (Oregon) | us-west-2 | bedrock-mantle.us-west-2.api.aws |
| Asia Pacific (Jakarta) | ap-southeast-3 | bedrock-mantle.ap-southeast-3.api.aws |
| Asia Pacific (Mumbai) | ap-south-1 | bedrock-mantle.ap-south-1.api.aws |
| Asia Pacific (Sydney) | ap-southeast-2 | bedrock-mantle.ap-southeast-2.api.aws |
| Asia Pacific (Tokyo) | ap-northeast-1 | bedrock-mantle.ap-northeast-1.api.aws |
| Europe (Frankfurt) | eu-central-1 | bedrock-mantle.eu-central-1.api.aws |
| Europe (Ireland) | eu-west-1 | bedrock-mantle.eu-west-1.api.aws |
| Europe (London) | eu-west-2 | bedrock-mantle.eu-west-2.api.aws |
| Europe (Milan) | eu-south-1 | bedrock-mantle.eu-south-1.api.aws |
| Europe (Stockholm) | eu-north-1 | bedrock-mantle.eu-north-1.api.aws |
| South America (São Paulo) | sa-east-1 | bedrock-mantle.sa-east-1.api.aws |
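The endpoint names in the table follow a single pattern, so the base URL can be derived from the Region code. The following is a minimal sketch: the helper name `mantle_base_url` and the `SUPPORTED_REGIONS` set are illustrative (not part of any AWS SDK), and the Region list is copied from the table above, so it may go stale if AWS adds Regions.

```python
# Region codes copied from the table above (illustrative; may become outdated).
SUPPORTED_REGIONS = {
    "us-east-2", "us-east-1", "us-west-2",
    "ap-southeast-3", "ap-south-1", "ap-southeast-2", "ap-northeast-1",
    "eu-central-1", "eu-west-1", "eu-west-2", "eu-south-1", "eu-north-1",
    "sa-east-1",
}


def mantle_base_url(region: str) -> str:
    """Return the OpenAI-compatible base URL for a listed Region.

    Hypothetical helper: raises ValueError for Regions not in the table.
    """
    if region not in SUPPORTED_REGIONS:
        raise ValueError(f"bedrock-mantle is not listed for Region {region!r}")
    return f"https://bedrock-mantle.{region}.api.aws/v1"
```

The returned string is the value you would assign to OPENAI_BASE_URL.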
Prerequisites
Before using the OpenAI-compatible APIs, ensure you have the following:
- Authentication – You can authenticate using:
  - Amazon Bedrock API key (required for OpenAI SDK)
  - AWS credentials (supported for HTTP requests)
- OpenAI SDK (optional) – Install the OpenAI Python SDK if using SDK-based requests.
- Environment variables – Set the following environment variables:
  - OPENAI_API_KEY – Set to your Amazon Bedrock API key
  - OPENAI_BASE_URL – Set to the Amazon Bedrock endpoint for your region (for example, https://bedrock-mantle.us-east-1.api.aws/v1)
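A common mistake is leaving OPENAI_BASE_URL unset or pointed at OpenAI. The following sketch checks the two environment variables before any request is made. The function name `check_openai_env` and the host-pattern check are assumptions made for illustration; only the variable names and the endpoint format come from this page.

```python
import os
from urllib.parse import urlparse


def check_openai_env(env=None):
    """Return a list of problems found in the OpenAI-style environment variables.

    Hypothetical helper; pass a dict to check something other than os.environ.
    """
    env = os.environ if env is None else env
    problems = []

    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")

    base_url = env.get("OPENAI_BASE_URL", "")
    host = urlparse(base_url).hostname or ""
    if not base_url:
        problems.append("OPENAI_BASE_URL is not set")
    elif host == "api.openai.com":
        problems.append("OPENAI_BASE_URL points to OpenAI, not Amazon Bedrock")
    elif not (host.startswith("bedrock-mantle.") and host.endswith(".api.aws")):
        # Assumed pattern, based on the endpoint names listed on this page.
        problems.append(f"unexpected host in OPENAI_BASE_URL: {host}")

    return problems
```

An empty list means both variables look consistent with the bedrock-mantle endpoint format; it does not confirm that the key itself is valid.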
Models API
The Models API allows you to discover available models in Amazon Bedrock powered by Mantle. Use this API to
retrieve a list of models you can use with the Responses API.
For complete API details, see the OpenAI Models documentation.
List available models
To list available models, call the Models API on the bedrock-mantle endpoint, using either the OpenAI SDK or a direct HTTP request.
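As a minimal sketch of the HTTP approach, the following uses only the Python standard library. It assumes the endpoint follows the OpenAI Models API shape (a GET on `/models` under the base URL, a Bearer token in the Authorization header, and a JSON body with a `data` array of objects that each have an `id`). The function names are illustrative.

```python
import json
import urllib.request


def build_list_models_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the model list."""
    url = base_url.rstrip("/") + "/models"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def list_model_ids(base_url: str, api_key: str) -> list:
    """Send the request and return the model IDs.

    Assumes the OpenAI-style response body: {"data": [{"id": ...}, ...]}.
    """
    request = build_list_models_request(base_url, api_key)
    with urllib.request.urlopen(request) as resp:
        body = json.load(resp)
    return [model["id"] for model in body.get("data", [])]
```

With the environment variables from the prerequisites set, `list_model_ids(os.environ["OPENAI_BASE_URL"], os.environ["OPENAI_API_KEY"])` would return the IDs visible to your key.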
Responses API
The Responses API provides stateful conversation management with support for
streaming, background processing, and multi-turn interactions. For complete API details,
see the OpenAI Responses documentation.
Note
Not all models support the Responses API. To see which models support the Responses API, see API compatibility.
How the Responses API stores conversation state
The Responses API can use stored state to enable multi-turn conversations and let
you reference previous turns through the previous_response_id
parameter. Storage is enabled by default but can be disabled per request through the
store parameter. Stored responses are scoped by Project. A response
from one Project cannot be used as the previous response or read in a second
Project. For more information about Projects, see Projects (OpenAI-compatible).
- When store is true (the default), Amazon Bedrock retains the response, including the input and output, for 30 days in the source region of the request. During this window you can chain follow-up requests by passing previous_response_id and retrieve the response with GET /v1/responses/{id}. After 30 days, the response is automatically deleted and is no longer retrievable.
- When store is false, Amazon Bedrock does not retain any data from the request or response. The previous_response_id parameter cannot be used to continue the conversation.
The default value is true to match the OpenAI Responses API
specification. Customers who do not want Amazon Bedrock to retain conversation data should
explicitly set store to false on every request. Stored
data is kept in the source region of the request, encrypted at rest, and scoped to
the calling AWS account's Project resource. The data is stored solely to service
your requests and is not used or retained for any other purpose.
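The storage rules above translate into a few differences in the request body. The following sketch builds the JSON payload for a first turn, a chained follow-up turn, and a turn that opts out of storage, plus the URL used to retrieve a stored response. The helper names, the model ID `example-model-id`, and the response ID `resp_123` are placeholders for illustration; the field names (`model`, `input`, `store`, `previous_response_id`) follow the OpenAI Responses API.

```python
def build_turn(model, user_input, previous_response_id=None, store=True):
    """Return the request body for one conversation turn (hypothetical helper)."""
    payload = {"model": model, "input": user_input, "store": store}
    if previous_response_id is not None:
        # Chains this turn onto a response that was stored earlier.
        payload["previous_response_id"] = previous_response_id
    return payload


def retrieve_url(base_url, response_id):
    """Return the URL for GET /v1/responses/{id}; base_url already ends in /v1."""
    return f"{base_url.rstrip('/')}/responses/{response_id}"


# First turn: stored by default, so its ID can be referenced later.
first = build_turn("example-model-id", "Summarize the report.")

# Follow-up turn: references the ID returned by the first response.
follow_up = build_turn(
    "example-model-id", "Now shorten it.", previous_response_id="resp_123"
)

# Opt-out turn: nothing is retained, so this response cannot be chained from.
private = build_turn("example-model-id", "Do not keep this.", store=False)
```

Setting `store` explicitly on every payload, as this helper does, makes the retention choice visible in the request rather than relying on the default.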
Basic request
To create a response, send a request to the Responses API, using either the OpenAI SDK or a direct HTTP request.
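The following is a minimal sketch of the HTTP approach using only the Python standard library. It assumes the OpenAI Responses API shape: a POST to `/responses` under the base URL, a Bearer token, and a response body whose `output` array contains `message` items with `output_text` content parts. The function names are illustrative, and the model ID you pass must be one that supports the Responses API.

```python
import json
import urllib.request


def build_create_response_request(base_url, api_key, model, user_input):
    """Build (but do not send) a POST request that creates a response."""
    body = json.dumps({"model": model, "input": user_input}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/responses",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_output_text(response: dict) -> str:
    """Concatenate the text parts of a response body.

    Assumes the OpenAI-style layout:
    output[] -> {"type": "message", "content": [{"type": "output_text", "text": ...}]}
    """
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip non-message items such as tool calls
        for content in item.get("content", []):
            if content.get("type") == "output_text":
                parts.append(content.get("text", ""))
    return "".join(parts)
```

To send the request, pass it to `urllib.request.urlopen`, decode the body with `json.load`, and hand the resulting dictionary to `extract_output_text`.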
Stream responses
To receive response events incrementally, enable streaming on the request, using either the OpenAI SDK or a direct HTTP request.
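When streaming over HTTP, the events arrive as server-sent events (SSE). The following sketch extracts the text fragments from a stream of already-decoded lines. It assumes the event format of the OpenAI Responses API, where each `data:` line carries a JSON object and text arrives in events of type `response.output_text.delta`; whether every model on the bedrock-mantle endpoint emits exactly these events is an assumption. The function name is illustrative, and skipping a `[DONE]` sentinel is a defensive choice rather than documented behavior.

```python
import json


def iter_text_deltas(lines):
    """Yield text fragments from an iterable of decoded SSE lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip "event:" lines, comments, and blank separators
        data = line[len("data:"):].strip()
        if not data or data == "[DONE]":
            continue  # defensive: ignore empty payloads and end sentinels
        event = json.loads(data)
        if event.get("type") == "response.output_text.delta":
            yield event.get("delta", "")


# Hand-written sample lines for illustration, not captured service output.
sample_stream = [
    "event: response.created",
    'data: {"type": "response.created"}',
    "",
    "event: response.output_text.delta",
    'data: {"type": "response.output_text.delta", "delta": "Hel"}',
    "",
    "event: response.output_text.delta",
    'data: {"type": "response.output_text.delta", "delta": "lo"}',
    "",
    'data: {"type": "response.completed"}',
]
text = "".join(iter_text_deltas(sample_stream))
```

In a real call, you would set `"stream": true` in the request body and feed the function the decoded lines of the HTTP response as they arrive.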