TwelveLabs models

This section describes the request parameters and response fields for TwelveLabs models, and includes code examples that show how to call them. TwelveLabs Pegasus 1.2 supports the InvokeModel and InvokeModelWithResponseStream (streaming) operations. TwelveLabs Marengo Embed 2.7 supports the StartAsyncInvoke operation. To use a model in an inference operation, you need its model ID. To get the model ID, see Supported foundation models in Amazon Bedrock.

TwelveLabs provides multimodal AI models that specialize in video understanding and analysis. Its models support video search, analysis, and content generation by combining computer vision and natural language processing. Amazon Bedrock offers two TwelveLabs models: TwelveLabs Pegasus 1.2, which provides video understanding and analysis, and TwelveLabs Marengo Embed 2.7, which generates embeddings for video, text, audio, and image content. With these models, developers can build applications that process, analyze, and derive insights from video data at scale.

TwelveLabs Pegasus 1.2

A multimodal model that provides comprehensive video understanding and analysis capabilities, including content recognition, scene detection, and contextual understanding. The model can analyze video content and generate textual descriptions, insights, and answers to questions about the video.
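As an illustration of how a Pegasus call might look, the following Python sketch builds a request body and sends it through the InvokeModel operation. The `invoke_model` call is the standard boto3 `bedrock-runtime` client method. The request field names (`inputPrompt`, `mediaSource`, `s3Location`, `temperature`) and the helper names are assumptions made for this sketch, so check them against the Pegasus 1.2 parameter reference before relying on them.

```python
import json

# Model ID taken from the table in this section.
PEGASUS_MODEL_ID = "twelvelabs.pegasus-1-2-v1:0"


def build_pegasus_request(prompt, video_s3_uri, bucket_owner, temperature=0.0):
    """Build a request body for Pegasus.

    The field names below are assumptions for illustration only;
    verify them against the Pegasus 1.2 request parameter reference.
    """
    return {
        "inputPrompt": prompt,
        "mediaSource": {
            "s3Location": {"uri": video_s3_uri, "bucketOwner": bucket_owner}
        },
        "temperature": temperature,
    }


def invoke_pegasus(client, body, model_id=PEGASUS_MODEL_ID):
    """Send the body with InvokeModel and return the parsed JSON response.

    `client` is expected to be a boto3 "bedrock-runtime" client.
    """
    response = client.invoke_model(
        modelId=model_id,
        body=json.dumps(body),
        contentType="application/json",
        accept="application/json",
    )
    # The response body is a stream of JSON bytes.
    return json.loads(response["body"].read())
```

To run this against Amazon Bedrock, create the client with `boto3.client("bedrock-runtime")` and pass it as `client`. The S3 URI and bucket owner account ID are placeholders you supply. Taking the client as a parameter keeps the request-building logic easy to test without AWS credentials.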

TwelveLabs Marengo Embed 2.7

A multimodal embedding model that generates high-quality vector representations of video, text, audio, and image content for similarity search, clustering, and other machine learning tasks. The model supports multiple input modalities and provides specialized embeddings optimized for different use cases.
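Because Marengo Embed 2.7 runs through StartAsyncInvoke, a call submits a job and the embeddings are written to Amazon S3 rather than returned inline. The following Python sketch shows one way this might look. `start_async_invoke` and its `modelId`, `modelInput`, and `outputDataConfig` arguments are part of the boto3 `bedrock-runtime` client. The `modelInput` field names (`inputType`, `inputText`, `mediaSource`) and the helper names are assumptions for illustration, so confirm them against the Marengo Embed 2.7 parameter reference.

```python
# Model ID taken from the table in this section.
MARENGO_MODEL_ID = "twelvelabs.marengo-embed-2-7-v1:0"


def build_marengo_text_input(text):
    """Build a modelInput for a text embedding (field names are assumptions)."""
    return {"inputType": "text", "inputText": text}


def build_marengo_video_input(video_s3_uri, bucket_owner):
    """Build a modelInput for a video embedding (field names are assumptions)."""
    return {
        "inputType": "video",
        "mediaSource": {
            "s3Location": {"uri": video_s3_uri, "bucketOwner": bucket_owner}
        },
    }


def start_marengo_job(client, model_input, output_s3_uri, model_id=MARENGO_MODEL_ID):
    """Submit an asynchronous embedding job and return its invocation ARN.

    `client` is expected to be a boto3 "bedrock-runtime" client.
    Results are written under `output_s3_uri` when the job completes.
    """
    response = client.start_async_invoke(
        modelId=model_id,
        modelInput=model_input,
        outputDataConfig={"s3OutputDataConfig": {"s3Uri": output_s3_uri}},
    )
    return response["invocationArn"]
```

The returned ARN can be passed to the client's `get_async_invoke` method to check the job status before reading the output from S3.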

The following table lists the TwelveLabs models available in Amazon Bedrock.

TwelveLabs models
Model name	Model ID	Input modality	Output modality	Description
TwelveLabs Pegasus 1.2	twelvelabs.pegasus-1-2-v1:0	Video	Text	A multimodal model that provides comprehensive video understanding and analysis capabilities, including content recognition, scene detection, and contextual understanding.
TwelveLabs Marengo Embed 2.7	twelvelabs.marengo-embed-2-7-v1:0	Video, Text, Audio, Image	Embeddings	A multimodal embedding model that generates high-quality vector representations of video, text, audio, and image content for similarity search, clustering, and other machine learning tasks.

