API compatibility - Amazon Bedrock

Amazon Bedrock supports four families of runtime APIs, each designed for different integration patterns and use cases.

Invoke family: InvokeModel handles synchronous, single-response calls. InvokeModelWithResponseStream returns responses as a real-time stream. InvokeModelWithBidirectionalStream enables full-duplex streaming for interactive applications. AsyncInvoke submits long-running requests asynchronously, storing output to Amazon S3.
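As a concrete sketch, InvokeModel can be called through boto3 roughly as follows. The model ID and JSON body shape here are illustrative only: each provider defines its own native request format, so the payload below (a Jamba-style chat body) is an assumption, not a universal schema.

```python
import json

def build_invoke_kwargs(model_id: str, payload: dict) -> dict:
    """Package a provider-native JSON payload as InvokeModel kwargs."""
    return {
        "modelId": model_id,
        "body": json.dumps(payload),
        "contentType": "application/json",
        "accept": "application/json",
    }

if __name__ == "__main__":
    import boto3  # deferred: only needed for the live call

    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    # Payload shape is provider-specific; this Jamba-style body is illustrative.
    kwargs = build_invoke_kwargs(
        "ai21.jamba-1-5-mini-v1:0",
        {"messages": [{"role": "user", "content": "Hello!"}], "max_tokens": 256},
    )
    response = client.invoke_model(**kwargs)
    print(json.loads(response["body"].read()))
```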

Converse family: Converse provides a unified, model-agnostic interface for synchronous multi-turn conversations. ConverseStream delivers the same experience with streaming output.
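Because Converse is model-agnostic, the same request shape works across providers that support it; only the model ID changes. A minimal boto3 sketch (the model ID and inference settings are illustrative):

```python
def build_converse_request(model_id: str, user_text: str) -> dict:
    """Build kwargs for the bedrock-runtime Converse operation.

    The message list is model-agnostic, so this same shape can be sent
    to any Converse-capable model.
    """
    return {
        "modelId": model_id,
        "messages": [
            {"role": "user", "content": [{"text": user_text}]},
        ],
        "inferenceConfig": {"maxTokens": 256, "temperature": 0.5},
    }

if __name__ == "__main__":
    import boto3  # deferred: only needed for the live call

    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    # Model ID is illustrative; substitute any Converse-capable model.
    req = build_converse_request("us.amazon.nova-lite-v1:0", "Hello!")
    resp = client.converse(**req)
    print(resp["output"]["message"]["content"][0]["text"])
```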

OpenAI-compatible family: ChatCompletions implements the OpenAI Chat Completions interface, enabling existing OpenAI-based integrations to run on Bedrock with minimal changes. Responses API implements the OpenAI Responses interface, supporting stateful, agentic interactions with built-in tool use and conversation history management.
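In practice this means an existing OpenAI SDK client can be repointed at Bedrock by changing its base URL and credentials. The sketch below assumes the documented endpoint form (`https://bedrock-runtime.<region>.amazonaws.com/openai/v1`) and an illustrative model ID; verify both against the current Bedrock documentation for your region.

```python
def bedrock_openai_base_url(region: str) -> str:
    """Bedrock's OpenAI-compatible endpoint (assumed URL scheme)."""
    return f"https://bedrock-runtime.{region}.amazonaws.com/openai/v1"

if __name__ == "__main__":
    # Requires the `openai` package and a Bedrock API key.
    from openai import OpenAI

    client = OpenAI(
        base_url=bedrock_openai_base_url("us-west-2"),
        api_key="BEDROCK_API_KEY",  # placeholder: substitute a real Bedrock API key
    )
    resp = client.chat.completions.create(
        model="openai.gpt-oss-20b-1:0",  # illustrative model ID
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(resp.choices[0].message.content)
```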

Messages family: Messages implements the Anthropic Messages interface on the bedrock-mantle endpoint, enabling existing Anthropic SDK-based integrations to run on Bedrock with minimal changes.
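For Anthropic SDK users, the `anthropic` Python package ships an `AnthropicBedrock` client that signs requests with AWS credentials instead of an Anthropic API key, so only the client construction and model ID change. A hedged sketch (the model ID is illustrative):

```python
def build_messages_request(model_id: str, user_text: str,
                           max_tokens: int = 256) -> dict:
    """Anthropic Messages-style request shape (illustrative)."""
    return {
        "model": model_id,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": user_text}],
    }

if __name__ == "__main__":
    # Requires the `anthropic` package; AnthropicBedrock uses standard
    # AWS credential resolution rather than an Anthropic API key.
    from anthropic import AnthropicBedrock

    client = AnthropicBedrock(aws_region="us-east-1")
    # Model ID is illustrative; use the Bedrock ID of an Anthropic model.
    req = build_messages_request(
        "global.anthropic.claude-sonnet-4-5-20250929-v1:0", "Hello!"
    )
    msg = client.messages.create(**req)
    print(msg.content[0].text)
```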

The following tables list, by provider, which of these APIs each model supports.

AI21

| Model name | Invoke | Converse | Chat Completions | Responses | Messages |
| --- | --- | --- | --- | --- | --- |
| Jamba 1.5 Large* | Yes | Yes | No | No | No |
| Jamba 1.5 Mini* | Yes | Yes | No | No | No |

Amazon

Anthropic

Cohere

| Model name | Invoke | Converse | Chat Completions | Responses | Messages |
| --- | --- | --- | --- | --- | --- |
| Command R* | Yes | Yes | No | No | No |
| Command R+* | Yes | Yes | No | No | No |
| Embed English | Yes | No | No | No | No |
| Embed Multilingual | Yes | No | No | No | No |
| Embed v4 | Yes | No | No | No | No |
| Rerank 3.5 | Yes | No | No | No | No |

DeepSeek

| Model name | Invoke | Converse | Chat Completions | Responses | Messages |
| --- | --- | --- | --- | --- | --- |
| DeepSeek V3.2* | Yes | Yes | Yes | No | No |
| DeepSeek-R1* | Yes | Yes | No | No | No |
| DeepSeek-V3.1* | Yes | Yes | Yes | No | No |

Google

| Model name | Invoke | Converse | Chat Completions | Responses | Messages |
| --- | --- | --- | --- | --- | --- |
| Gemma 3 12B IT* | Yes | Yes | Yes | No | No |
| Gemma 3 27B PT* | Yes | Yes | Yes | No | No |
| Gemma 3 4B IT* | Yes | Yes | Yes | No | No |

Meta

MiniMax

| Model name | Invoke | Converse | Chat Completions | Responses | Messages |
| --- | --- | --- | --- | --- | --- |
| MiniMax M2* | Yes | Yes | Yes | No | No |
| MiniMax M2.1* | Yes | Yes | Yes | No | No |
| MiniMax M2.5* | Yes | Yes | Yes | No | No |

Mistral

Moonshot

| Model name | Invoke | Converse | Chat Completions | Responses | Messages |
| --- | --- | --- | --- | --- | --- |
| Kimi K2 Thinking* | Yes | Yes | No | No | No |
| Kimi K2.5* | Yes | Yes | Yes | No | No |

NVIDIA

| Model name | Invoke | Converse | Chat Completions | Responses | Messages |
| --- | --- | --- | --- | --- | --- |
| NVIDIA Nemotron Nano 9B v2* | Yes | Yes | Yes | No | No |
| NVIDIA Nemotron Nano 12B v2 VL BF16* | Yes | Yes | Yes | No | No |
| Nemotron Nano 3 30B* | Yes | Yes | Yes | No | No |
| NVIDIA Nemotron 3 Super 120B* | Yes | Yes | Yes | No | No |

OpenAI

| Model name | Invoke | Converse | Chat Completions | Responses | Messages |
| --- | --- | --- | --- | --- | --- |
| GPT OSS Safeguard 120B* | Yes | Yes | Yes | Yes | No |
| GPT OSS Safeguard 20B* | Yes | Yes | Yes | Yes | No |
| gpt-oss-120b* | Yes | Yes | Yes | Yes | No |
| gpt-oss-20b* | Yes | Yes | Yes | Yes | No |

Qwen

Stability

TwelveLabs

| Model name | Invoke | Converse | Chat Completions | Responses | Messages |
| --- | --- | --- | --- | --- | --- |
| Marengo Embed 3.0 | Yes | No | No | No | No |
| Marengo Embed 2.7 | No | No | No | No | No |
| Pegasus v1.2 | Yes | No | No | No | No |

Writer

| Model name | Invoke | Converse | Chat Completions | Responses | Messages |
| --- | --- | --- | --- | --- | --- |
| Palmyra Vision 7B | Yes | Yes | Yes | No | No |
| Palmyra X4* | Yes | Yes | No | No | No |
| Palmyra X5* | Yes | Yes | No | No | No |

Z.AI

| Model name | Invoke | Converse | Chat Completions | Responses | Messages |
| --- | --- | --- | --- | --- | --- |
| GLM 4.7* | Yes | Yes | Yes | No | No |
| GLM 4.7 Flash* | Yes | Yes | Yes | No | No |
| GLM 5* | Yes | Yes | Yes | No | No |

Note

* Streaming Support: Models marked with an asterisk (*) also support InvokeModelWithResponseStream, which returns responses as a real-time stream.
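A streaming response arrives as an event stream whose `chunk` events carry provider-specific JSON. The helper below decodes those chunks; the model ID and request body in the demo are illustrative (a Jamba-style body), since each model keeps its native format under InvokeModelWithResponseStream.

```python
import json

def decode_stream_chunks(events):
    """Yield the provider-specific JSON payload carried in each 'chunk'
    event of an InvokeModelWithResponseStream response body."""
    for event in events:
        chunk = event.get("chunk")
        if chunk is not None:
            yield json.loads(chunk["bytes"])

if __name__ == "__main__":
    import boto3  # deferred: only needed for the live call

    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    # Model ID and body are illustrative; any asterisk-marked model above
    # accepts InvokeModelWithResponseStream with its native body format.
    response = client.invoke_model_with_response_stream(
        modelId="ai21.jamba-1-5-mini-v1:0",
        body=json.dumps(
            {"messages": [{"role": "user", "content": "Hello!"}],
             "max_tokens": 256}
        ),
    )
    for payload in decode_stream_chunks(response["body"]):
        print(payload)
```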

Models supporting StartAsyncInvoke

StartAsyncInvoke is an Amazon Bedrock Runtime API that allows callers to submit a model invocation request and immediately receive back an invocationArn without waiting for the model to finish processing. The job runs in the background, and the output is written to a caller-specified S3 bucket once complete. Callers can then poll job status using the companion GetAsyncInvoke and ListAsyncInvokes APIs. The pattern is purpose-built for workloads involving large or latency-insensitive inputs, particularly video, audio, and bulk embedding generation, where holding an open synchronous connection would be impractical.

The following models support StartAsyncInvoke:

  • TwelveLabs Marengo Embed 2.7 (twelvelabs.marengo-embed-2-7-v1:0) — required for video and audio input; InvokeModel only handles text and image

  • TwelveLabs Marengo Embed 3.0 (twelvelabs.marengo-embed-3-0-v1:0) — same pattern; async required for video/audio at scale

  • Amazon Nova Reel (amazon.nova-reel-v1:0 and v1:1) — video generation is exclusively async; output lands in S3

  • Amazon Nova Multimodal Embeddings (amazon.nova-2-multimodal-embeddings-v1:0) — async is required for video inputs larger than 25MB base64-encoded; sync is available for text, image, and document inputs
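The submit-then-poll pattern described above can be sketched with boto3 as follows. The model ID, the `modelInput` field names, and the S3 URI are illustrative assumptions; each model defines its own input schema, so check the model's documentation for the real field names.

```python
import time

def build_async_invoke_request(model_id: str, model_input: dict,
                               s3_uri: str) -> dict:
    """Build kwargs for StartAsyncInvoke: the model input plus the S3
    location where Bedrock writes the job's output."""
    return {
        "modelId": model_id,
        "modelInput": model_input,
        "outputDataConfig": {"s3OutputDataConfig": {"s3Uri": s3_uri}},
    }

if __name__ == "__main__":
    import boto3  # deferred: only needed for the live call

    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    # Model ID, input fields, and bucket are illustrative placeholders.
    req = build_async_invoke_request(
        "twelvelabs.marengo-embed-2-7-v1:0",
        {"inputType": "text", "inputText": "hello"},
        "s3://my-bucket/bedrock-output/",
    )
    arn = client.start_async_invoke(**req)["invocationArn"]

    # Poll with GetAsyncInvoke until the job leaves InProgress.
    while True:
        job = client.get_async_invoke(invocationArn=arn)
        if job["status"] != "InProgress":
            break
        time.sleep(10)
    print(job["status"])
```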

InvokeModelWithBidirectionalStream

InvokeModelWithBidirectionalStream is an Amazon Bedrock Runtime API that establishes a persistent, full-duplex channel between the caller and the model, allowing audio data to flow in both directions simultaneously and continuously. Unlike the standard InvokeModel or even InvokeModelWithResponseStream APIs, which follow a request-then-response pattern, this API keeps the connection open for the duration of a session so that the model can process incoming audio as it arrives and stream generated speech back in near real-time, without waiting for a complete utterance to finish. The interaction is structured around three phases: session initialization (where the client sends configuration events to set up the stream), audio streaming (where captured audio is encoded and sent as a continuous event stream), and response streaming (where the model simultaneously returns text transcriptions of user speech and synthesized audio output). InvokeModelWithBidirectionalStream cannot be used with Amazon Bedrock API keys and requires standard AWS credential-based authentication, reflecting its more complex session lifecycle compared to other Bedrock Runtime operations.

The following models support this API:

  • Amazon Nova Sonic family: Both amazon.nova-sonic-v1:0 and amazon.nova-2-sonic-v1:0 use it as their sole invocation path, since the speech-to-speech architecture fundamentally requires a live bidirectional channel that neither InvokeModel nor Converse can provide.