API compatibility - Amazon Bedrock

Amazon Bedrock supports four families of runtime APIs, each designed for different integration patterns and use cases.

Invoke family: InvokeModel handles synchronous, single-response calls. InvokeModelWithResponseStream returns responses as a real-time stream. InvokeModelWithBidirectionalStream enables full-duplex streaming for interactive applications. AsyncInvoke submits long-running requests asynchronously, storing output to Amazon S3.
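As a concrete sketch, InvokeModel can be called through boto3 roughly as follows. The model ID and JSON body shape here are illustrative only: each provider defines its own native request format, so the payload below (a Jamba-style chat body) is an assumption, not a universal schema.

```python
import json

def build_invoke_kwargs(model_id: str, payload: dict) -> dict:
    """Package a provider-native JSON payload as InvokeModel kwargs."""
    return {
        "modelId": model_id,
        "body": json.dumps(payload),
        "contentType": "application/json",
        "accept": "application/json",
    }

if __name__ == "__main__":
    import boto3  # deferred: only needed for the live call

    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    # Payload shape is provider-specific; this Jamba-style body is illustrative.
    kwargs = build_invoke_kwargs(
        "ai21.jamba-1-5-mini-v1:0",
        {"messages": [{"role": "user", "content": "Hello!"}], "max_tokens": 256},
    )
    response = client.invoke_model(**kwargs)
    print(json.loads(response["body"].read()))
```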

Converse family: Converse provides a unified, model-agnostic interface for synchronous multi-turn conversations. ConverseStream delivers the same experience with streaming output.
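Because Converse is model-agnostic, the same request shape works across providers that support it; only the model ID changes. A minimal boto3 sketch (the model ID and inference settings are illustrative):

```python
def build_converse_request(model_id: str, user_text: str) -> dict:
    """Build kwargs for the bedrock-runtime Converse operation.

    The message list is model-agnostic, so this same shape can be sent
    to any Converse-capable model.
    """
    return {
        "modelId": model_id,
        "messages": [
            {"role": "user", "content": [{"text": user_text}]},
        ],
        "inferenceConfig": {"maxTokens": 256, "temperature": 0.5},
    }

if __name__ == "__main__":
    import boto3  # deferred: only needed for the live call

    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    # Model ID is illustrative; substitute any Converse-capable model.
    req = build_converse_request("us.amazon.nova-lite-v1:0", "Hello!")
    resp = client.converse(**req)
    print(resp["output"]["message"]["content"][0]["text"])
```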

OpenAI-compatible family: ChatCompletions implements the OpenAI Chat Completions interface, enabling existing OpenAI-based integrations to run on Bedrock with minimal changes. Responses API implements the OpenAI Responses interface, supporting stateful, agentic interactions with built-in tool use and conversation history management.
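In practice this means an existing OpenAI SDK client can be repointed at Bedrock by changing its base URL and credentials. The sketch below assumes the documented endpoint form (`https://bedrock-runtime.<region>.amazonaws.com/openai/v1`) and an illustrative model ID; verify both against the current Bedrock documentation for your region.

```python
def bedrock_openai_base_url(region: str) -> str:
    """Bedrock's OpenAI-compatible endpoint (assumed URL scheme)."""
    return f"https://bedrock-runtime.{region}.amazonaws.com/openai/v1"

if __name__ == "__main__":
    # Requires the `openai` package and a Bedrock API key.
    from openai import OpenAI

    client = OpenAI(
        base_url=bedrock_openai_base_url("us-west-2"),
        api_key="BEDROCK_API_KEY",  # placeholder: substitute a real Bedrock API key
    )
    resp = client.chat.completions.create(
        model="openai.gpt-oss-20b-1:0",  # illustrative model ID
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(resp.choices[0].message.content)
```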

Messages family: Messages implements the Anthropic Messages interface on the bedrock-mantle endpoint, enabling existing Anthropic SDK-based integrations to run on Bedrock with minimal changes.
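For Anthropic SDK users, the `anthropic` Python package ships an `AnthropicBedrock` client that signs requests with AWS credentials instead of an Anthropic API key, so only the client construction and model ID change. A hedged sketch (the model ID is illustrative):

```python
def build_messages_request(model_id: str, user_text: str,
                           max_tokens: int = 256) -> dict:
    """Anthropic Messages-style request shape (illustrative)."""
    return {
        "model": model_id,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": user_text}],
    }

if __name__ == "__main__":
    # Requires the `anthropic` package; AnthropicBedrock uses standard
    # AWS credential resolution rather than an Anthropic API key.
    from anthropic import AnthropicBedrock

    client = AnthropicBedrock(aws_region="us-east-1")
    # Model ID is illustrative; use the Bedrock ID of an Anthropic model.
    req = build_messages_request(
        "global.anthropic.claude-sonnet-4-5-20250929-v1:0", "Hello!"
    )
    msg = client.messages.create(**req)
    print(msg.content[0].text)
```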

The following tables list, by provider, which of these APIs each model supports.

AI21

| Model name | Invoke | Converse | Chat Completions | Responses | Messages |
| --- | --- | --- | --- | --- | --- |
| Jamba 1.5 Large* | Yes | Yes | No | No | No |
| Jamba 1.5 Mini* | Yes | Yes | No | No | No |

Amazon

Anthropic

Cohere

| Model name | Invoke | Converse | Chat Completions | Responses | Messages |
| --- | --- | --- | --- | --- | --- |
| Command R* | Yes | Yes | No | No | No |
| Command R+* | Yes | Yes | No | No | No |
| Embed English | Yes | No | No | No | No |
| Embed Multilingual | Yes | No | No | No | No |
| Embed v4 | Yes | No | No | No | No |
| Rerank 3.5 | Yes | No | No | No | No |

DeepSeek

| Model name | Invoke | Converse | Chat Completions | Responses | Messages |
| --- | --- | --- | --- | --- | --- |
| DeepSeek V3.2* | Yes | Yes | Yes | No | No |
| DeepSeek-R1* | Yes | Yes | No | No | No |
| DeepSeek-V3.1* | Yes | Yes | Yes | No | No |

Google

| Model name | Invoke | Converse | Chat Completions | Responses | Messages |
| --- | --- | --- | --- | --- | --- |
| Gemma 3 12B IT* | Yes | Yes | Yes | No | No |
| Gemma 3 27B PT* | Yes | Yes | Yes | No | No |
| Gemma 3 4B IT* | Yes | Yes | Yes | No | No |

Meta

MiniMax

| Model name | Invoke | Converse | Chat Completions | Responses | Messages |
| --- | --- | --- | --- | --- | --- |
| MiniMax M2* | Yes | Yes | Yes | No | No |
| MiniMax M2.1* | Yes | Yes | Yes | No | No |
| MiniMax M2.5* | Yes | Yes | Yes | No | No |

Mistral

Moonshot

| Model name | Invoke | Converse | Chat Completions | Responses | Messages |
| --- | --- | --- | --- | --- | --- |
| Kimi K2 Thinking* | Yes | Yes | No | No | No |
| Kimi K2.5* | Yes | Yes | Yes | No | No |

NVIDIA

| Model name | Invoke | Converse | Chat Completions | Responses | Messages |
| --- | --- | --- | --- | --- | --- |
| NVIDIA Nemotron Nano 9B v2* | Yes | Yes | Yes | No | No |
| NVIDIA Nemotron Nano 12B v2 VL BF16* | Yes | Yes | Yes | No | No |
| Nemotron Nano 3 30B* | Yes | Yes | Yes | No | No |
| NVIDIA Nemotron 3 Super 120B* | Yes | Yes | Yes | No | No |

OpenAI

| Model name | Invoke | Converse | Chat Completions | Responses | Messages |
| --- | --- | --- | --- | --- | --- |
| GPT OSS Safeguard 120B* | Yes | Yes | Yes | Yes | No |
| GPT OSS Safeguard 20B* | Yes | Yes | Yes | Yes | No |
| gpt-oss-120b* | Yes | Yes | Yes | Yes | No |
| gpt-oss-20b* | Yes | Yes | Yes | Yes | No |

Qwen

Stability

TwelveLabs

| Model name | Invoke | Converse | Chat Completions | Responses | Messages |
| --- | --- | --- | --- | --- | --- |
| Marengo Embed 3.0 | Yes | No | No | No | No |
| Marengo Embed 2.7 | No | No | No | No | No |
| Pegasus v1.2 | Yes | No | No | No | No |

Writer

| Model name | Invoke | Converse | Chat Completions | Responses | Messages |
| --- | --- | --- | --- | --- | --- |
| Palmyra Vision 7B | Yes | Yes | Yes | No | No |
| Palmyra X4* | Yes | Yes | No | No | No |
| Palmyra X5* | Yes | Yes | No | No | No |

Z.AI

| Model name | Invoke | Converse | Chat Completions | Responses | Messages |
| --- | --- | --- | --- | --- | --- |
| GLM 4.7* | Yes | Yes | Yes | No | No |
| GLM 4.7 Flash* | Yes | Yes | Yes | No | No |
| GLM 5* | Yes | Yes | Yes | No | No |

Note

* Streaming Support: Models marked with an asterisk (*) also support InvokeModelWithResponseStream, which returns responses as a real-time stream.
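A streaming response arrives as an event stream whose `chunk` events carry provider-specific JSON. The helper below decodes those chunks; the model ID and request body in the demo are illustrative (a Jamba-style body), since each model keeps its native format under InvokeModelWithResponseStream.

```python
import json

def decode_stream_chunks(events):
    """Yield the provider-specific JSON payload carried in each 'chunk'
    event of an InvokeModelWithResponseStream response body."""
    for event in events:
        chunk = event.get("chunk")
        if chunk is not None:
            yield json.loads(chunk["bytes"])

if __name__ == "__main__":
    import boto3  # deferred: only needed for the live call

    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    # Model ID and body are illustrative; any asterisk-marked model above
    # accepts InvokeModelWithResponseStream with its native body format.
    response = client.invoke_model_with_response_stream(
        modelId="ai21.jamba-1-5-mini-v1:0",
        body=json.dumps(
            {"messages": [{"role": "user", "content": "Hello!"}],
             "max_tokens": 256}
        ),
    )
    for payload in decode_stream_chunks(response["body"]):
        print(payload)
```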

Models supporting StartAsyncInvoke

StartAsyncInvoke is an Amazon Bedrock Runtime API that allows callers to submit a model invocation request and immediately receive back an invocationArn without waiting for the model to finish processing. The job runs in the background, and the output is written to a caller-specified S3 bucket once complete. Callers can then poll job status using the companion GetAsyncInvoke and ListAsyncInvokes APIs. The pattern is purpose-built for workloads involving large or latency-insensitive inputs, particularly video, audio, and bulk embedding generation, where holding an open synchronous connection would be impractical.

The following models support StartAsyncInvoke:

  • TwelveLabs Marengo Embed 2.7 (twelvelabs.marengo-embed-2-7-v1:0) — required for video and audio input; InvokeModel only handles text and image

  • TwelveLabs Marengo Embed 3.0 (twelvelabs.marengo-embed-3-0-v1:0) — same pattern; async required for video/audio at scale

  • Amazon Nova Reel (amazon.nova-reel-v1:0 and v1:1) — video generation is exclusively async; output lands in S3

  • Amazon Nova Multimodal Embeddings (amazon.nova-2-multimodal-embeddings-v1:0) — async is required for video inputs larger than 25MB base64-encoded; sync is available for text, image, and document inputs
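The submit-then-poll pattern described above can be sketched with boto3 as follows. The model ID, the `modelInput` field names, and the S3 URI are illustrative assumptions; each model defines its own input schema, so check the model's documentation for the real field names.

```python
import time

def build_async_invoke_request(model_id: str, model_input: dict,
                               s3_uri: str) -> dict:
    """Build kwargs for StartAsyncInvoke: the model input plus the S3
    location where Bedrock writes the job's output."""
    return {
        "modelId": model_id,
        "modelInput": model_input,
        "outputDataConfig": {"s3OutputDataConfig": {"s3Uri": s3_uri}},
    }

if __name__ == "__main__":
    import boto3  # deferred: only needed for the live call

    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    # Model ID, input fields, and bucket are illustrative placeholders.
    req = build_async_invoke_request(
        "twelvelabs.marengo-embed-2-7-v1:0",
        {"inputType": "text", "inputText": "hello"},
        "s3://my-bucket/bedrock-output/",
    )
    arn = client.start_async_invoke(**req)["invocationArn"]

    # Poll with GetAsyncInvoke until the job leaves InProgress.
    while True:
        job = client.get_async_invoke(invocationArn=arn)
        if job["status"] != "InProgress":
            break
        time.sleep(10)
    print(job["status"])
```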

InvokeModelWithBidirectionalStream

InvokeModelWithBidirectionalStream is an Amazon Bedrock Runtime API that establishes a persistent, full-duplex channel between the caller and the model, allowing audio data to flow in both directions simultaneously and continuously. Unlike the standard InvokeModel or even InvokeModelWithResponseStream APIs, which follow a request-then-response pattern, this API keeps the connection open for the duration of a session so that the model can process incoming audio as it arrives and stream generated speech back in near real-time, without waiting for a complete utterance to finish. The interaction is structured around three phases: session initialization (where the client sends configuration events to set up the stream), audio streaming (where captured audio is encoded and sent as a continuous event stream), and response streaming (where the model simultaneously returns text transcriptions of user speech and synthesized audio output). InvokeModelWithBidirectionalStream cannot be used with Amazon Bedrock API keys and requires standard AWS credential-based authentication, reflecting its more complex session lifecycle compared to other Bedrock Runtime operations.

The following models support this API:

  • Amazon Nova Sonic family: Both amazon.nova-sonic-v1:0 and amazon.nova-2-sonic-v1:0 use it as their sole invocation path, since the speech-to-speech architecture fundamentally requires a live bidirectional channel that neither InvokeModel nor Converse can provide.