
TwelveLabs Marengo Embed 2.7

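The TwelveLabs Marengo Embed 2.7 model generates embeddings from video, text, audio, or image inputs. These embeddings can be used for similarity search, clustering, and other machine learning tasks.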

Provider: TwelveLabs
Model ID: twelvelabs.marengo-embed-2-7-v1:0

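The TwelveLabs Marengo Embed 2.7 model supports the Amazon Bedrock Runtime operations in the following table.

- For more information about use cases for the different API methods, see Learn about use cases for different model inference methods.
- For more information about model types, see How inference works in Amazon Bedrock.
- To find the model ID and the AWS Regions that support TwelveLabs Marengo Embed 2.7, search for the model in the table at Supported foundation models in Amazon Bedrock.
- For a full list of inference profile IDs, see Supported Regions and models for inference profiles. The inference profile ID is based on the AWS Region.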

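API operation	Supported model types	Input modalities	Output modalities
InvokeModel	Inference profiles	Text, Image	Embedding
StartAsyncInvoke	Base models	Video, Audio, Image, Text	Embedding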

Note

Use InvokeModel to generate embeddings for search queries. Use StartAsyncInvoke to generate embeddings for assets at scale.

The following quotas apply to the input:

Input modality	Maximum size
Text	77 tokens
Image	5 MB
Video (S3)	2 GB
Audio (S3)	2 GB

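Note

If you define audio or video inline by using Base64 encoding, make sure that the request body payload does not exceed the Amazon Bedrock 25 MB model invocation quota.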
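The inline payload limit can be checked before a request is sent. The following is a minimal sketch, not an official utility: the helper names are hypothetical, and it assumes that the quota is measured on the JSON-serialized model input and that 25 MB means 25 * 1024 * 1024 bytes.

```python
import base64
import json

# Assumption: 25 MB is interpreted as 25 * 1024 * 1024 bytes.
MAX_PAYLOAD_BYTES = 25 * 1024 * 1024


def build_inline_input(input_type: str, media_bytes: bytes) -> dict:
    """Build a model input that carries the media as a Base64 string."""
    encoded = base64.b64encode(media_bytes).decode("ascii")
    return {"inputType": input_type, "mediaSource": {"base64String": encoded}}


def payload_size(model_input: dict) -> int:
    """Return the size in bytes of the JSON-serialized model input."""
    return len(json.dumps(model_input).encode("utf-8"))


def fits_inline(model_input: dict, limit: int = MAX_PAYLOAD_BYTES) -> bool:
    """Return True if the serialized input stays within the limit."""
    return payload_size(model_input) <= limit
```

Base64 encoding makes the data roughly one third larger than the raw file, so a file well below 25 MB on disk can still exceed the quota once encoded. Media that does not fit can be referenced from S3 instead.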

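TwelveLabs Marengo Embed 2.7 request parameters

When you make a request, the field in which the model-specific input is specified depends on the API operation:

InvokeModel: In the request body.
StartAsyncInvoke: In the modelInput field of the request body.

The format of the model input depends on the input modality: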

Text


{
    "inputType": "text",
    "inputText": "string",
    "textTruncate": "string"
}

Inline image


{
     "inputType": "image",
     "mediaSource": {
          "base64String": "base64-encoded string"
     }
}

S3 image


{
    "inputType": "image",
    "mediaSource": {
        "s3Location": {
            "uri": "string",
            "bucketOwner": "string"
        }
    }
}

Inline video


{
    "inputType": "video",
    "mediaSource": {
        "base64String": "base64-encoded string"
    },
    "startSec": double,
    "lengthSec": double,
    "useFixedLengthSec": double,
    "embeddingOption": ["string"]
}

S3 video


{
    "inputType": "video",
    "mediaSource": {
        "s3Location": {
           "uri": "string",
           "bucketOwner": "string"
        }
    },
    "startSec": double,
    "lengthSec": double,
    "useFixedLengthSec": double,
    "minClipSec": int,
    "embeddingOption": ["string"]
}

Inline audio


{
    "inputType": "audio", 
    "mediaSource": { 
        "base64String": "base64-encoded string"
    },
    "startSec": double,
    "lengthSec": double,
    "useFixedLengthSec": double
}

S3 audio


{
    "inputType": "audio",
    "mediaSource": {
        "s3Location": {
           "uri": "string",
           "bucketOwner": "string"
        }
    },
    "startSec": double,
    "lengthSec": double,
    "useFixedLengthSec": double
}

The following sections describe the input parameters:

inputType

The modality of the embedding.

Type: String
Required: Yes
Valid values: video | text | audio | image

inputText

The text to embed.

Type: String
Required: Yes (for compatible input types)
Compatible input types: Text

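textTruncate

Specifies how the platform truncates text.

Type: String
Required: No
Valid values:
- end: Truncates the end of the text.
- none: Returns an error if the text exceeds the limit.
Default value: end
Compatible input types: Text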

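mediaSource

Contains information about the media source.

Type: Object
Required: Yes (for compatible input types)
Compatible input types: Image, Video, Audio

The format of the mediaSource object in the request body depends on whether the media is defined as a Base64-encoded string or as an S3 location.

Base64-encoded string
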

{
    "mediaSource": {
        "base64String": "base64-encoded string"
    }
}

base64String: The Base64-encoded string for the media.

S3 location: Specify the S3 URI and the bucket owner.
```
{
    "s3Location": {
        "uri": "string",
        "bucketOwner": "string"
    }
}
```
- uri: The S3 URI containing the media.
- bucketOwner: The AWS account ID of the S3 bucket owner.
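The two mediaSource forms can be produced by one small helper. This is an illustrative sketch with a hypothetical function name. It assumes that exactly one of the two forms is supplied per request and that bucketOwner may be omitted; neither point is stated on this page. The 12-digit check reflects the standard format of AWS account IDs.

```python
import re
from typing import Optional


def make_media_source(*, base64_string: Optional[str] = None,
                      s3_uri: Optional[str] = None,
                      bucket_owner: Optional[str] = None) -> dict:
    """Return a mediaSource object for inline (Base64) or S3-hosted media."""
    # Assumption: a request carries either inline data or an S3 location.
    if (base64_string is None) == (s3_uri is None):
        raise ValueError("Provide exactly one of base64_string or s3_uri")
    if base64_string is not None:
        return {"base64String": base64_string}
    if not s3_uri.startswith("s3://"):
        raise ValueError("s3_uri must start with 's3://'")
    location = {"uri": s3_uri}
    if bucket_owner is not None:
        # AWS account IDs are 12-digit numbers.
        if not re.fullmatch(r"\d{12}", bucket_owner):
            raise ValueError("bucket_owner must be a 12-digit AWS account ID")
        location["bucketOwner"] = bucket_owner
    return {"s3Location": location}
```

For example, `make_media_source(s3_uri="s3://amzn-s3-demo-bucket/my-video.mp4", bucket_owner="123456789012")` returns the S3 form shown above, and a URI without the `s3://` scheme is rejected before any request is made.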

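embeddingOption

Specifies which types of embeddings to retrieve.

Type: List
Required: No
Valid values for list members:
- visual-text: Visual embeddings optimized for text search.
- visual-image: Visual embeddings optimized for image search.
- audio: Embeddings of the audio in the video.
Default value: ["visual-text", "visual-image", "audio"]
Compatible input types: Video, Audio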

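startSec

The time point in the clip, in seconds, at which processing should begin.

Type: Double
Required: No
Minimum value: 0
Default value: 0
Compatible input types: Video, Audio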

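lengthSec

The time in seconds, counted from the startSec time point, after which processing should stop.

Type: Double
Required: No
Valid values: 0 to the duration of the media
Default value: The duration of the media
Compatible input types: Video, Audio

For example:

startSec: 5
lengthSec: 20
Result: Because lengthSec is counted from startSec, the clip is processed from 0:05 to 0:25.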

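useFixedLengthSec

The duration of each segment for which the model should generate an embedding.

Type: Double
Required: No
Valid values: 2-10
Default value: Depends on the media type:
- Video: Divided dynamically by shot boundary detection.
- Audio: Divided evenly into segments as close to 10 seconds as possible. For example:
  - A 50-second clip is divided into 5 segments of 10 seconds.
  - A 16-second clip is divided into 2 segments of 8 seconds.
Compatible input types: Video, Audio
Note: Must be greater than or equal to minClipSec.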
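The default audio segmentation can be illustrated with a short calculation. The exact algorithm is not specified here; the sketch below assumes the smallest number of equal segments whose length does not exceed 10 seconds, which reproduces both examples above. The function name is hypothetical.

```python
import math
from typing import Tuple


def even_segments(duration_sec: float, target_sec: float = 10.0) -> Tuple[int, float]:
    """Return (segment count, segment length) for an even split.

    Assumption: the split uses the fewest equal segments that are no
    longer than target_sec. The service may use a different rule.
    """
    if duration_sec <= 0:
        raise ValueError("duration_sec must be positive")
    count = math.ceil(duration_sec / target_sec)
    return count, duration_sec / count


print(even_segments(50))  # (5, 10.0)
print(even_segments(16))  # (2, 8.0)
```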

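minClipSec

Sets the minimum length of each segment, in seconds.

Type: Integer
Required: No
Valid values: 1-5
Default value: 4
Compatible input types: Video
Note: Must be less than or equal to useFixedLengthSec.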
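The constraints on the segmentation parameters can be checked on the client before a request is submitted. The following sketch encodes only the ranges and relationships listed in this section. The function name is hypothetical, and the service may apply additional validation that is not covered here.

```python
def validate_segment_params(model_input: dict) -> list:
    """Return a list of problems; an empty list means none were found."""
    problems = []
    input_type = model_input.get("inputType")
    start = model_input.get("startSec", 0)
    fixed = model_input.get("useFixedLengthSec")
    min_clip = model_input.get("minClipSec")

    if start < 0:
        problems.append("startSec must be at least 0")
    if fixed is not None and not 2 <= fixed <= 10:
        problems.append("useFixedLengthSec must be between 2 and 10")
    if min_clip is not None:
        if input_type != "video":
            problems.append("minClipSec is compatible only with video input")
        if not 1 <= min_clip <= 5:
            problems.append("minClipSec must be between 1 and 5")
    # Only checked when both values are set explicitly.
    if fixed is not None and min_clip is not None and min_clip > fixed:
        problems.append("minClipSec must not exceed useFixedLengthSec")
    return problems
```

An input such as `{"inputType": "video", "startSec": 0, "useFixedLengthSec": 5, "minClipSec": 4}` yields an empty list, while an audio input that sets minClipSec is reported as incompatible.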

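TwelveLabs Marengo Embed 2.7 response

The location of the output embeddings and associated metadata depends on the invocation method:

InvokeModel: In the response body.
StartAsyncInvoke: In the S3 bucket defined in s3OutputDataConfig, after the asynchronous invocation job completes.

If there are multiple embedding vectors, the output is a list of objects, each containing a vector and its associated metadata.

The format of the output embedding vector is as follows: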


{
    "embedding": [double],
    "embeddingOption": "visual-text" | "visual-image" | "audio",
    "startSec": double,
    "endSec": double
}

The following sections describe the response parameters:

embedding

The embedding vector representation of the input.

Type: List of doubles

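embeddingOption

The type of embedding.

Type: String
Possible values:
- visual-text: Visual embeddings optimized for text search.
- visual-image: Visual embeddings optimized for image search.
- audio: Embeddings of the audio in the video.
Compatible input types: Video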

startSec

The start offset of the clip.

Type: Double
Compatible input types: Video, Audio

endSec

The end offset of the clip, in seconds.

Type: Double
Compatible input types: Video, Audio
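Because the output is a list of vectors with metadata, a typical similarity search compares a query embedding against each segment embedding. The following is a minimal sketch in pure Python that assumes the segment objects have already been loaded into dictionaries with the embedding, embeddingOption, and startSec fields described above. The function names are hypothetical, and cosine similarity is one common choice of metric, not one prescribed by this page.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_segments(query_embedding, segments, embedding_option="visual-text"):
    """Sort segments of one embedding type by similarity to the query."""
    candidates = [s for s in segments
                  if s.get("embeddingOption") == embedding_option]
    return sorted(
        candidates,
        key=lambda s: cosine_similarity(query_embedding, s["embedding"]),
        reverse=True,
    )
```

In practice the query embedding would come from a text or image request, and the segments from the output of a video or audio request. Filtering by embeddingOption keeps the comparison within one embedding type.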

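TwelveLabs Marengo Embed 2.7 code examples

This section shows how to use the TwelveLabs Marengo Embed 2.7 model with different input types using Python.

Note

Currently, InvokeModel supports only text and image input.

Put your code together in the following steps:

1. Define model-specific input

Define the model-specific input depending on your input type: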

Text


# Create the model-specific input
model_id = "twelvelabs.marengo-embed-2-7-v1:0"
# Replace the us prefix depending on your region
inference_profile_id = "us.twelvelabs.marengo-embed-2-7-v1:0"
                            
model_input = {
  "inputType": "text",
  "inputText": "man walking a dog"
}

Inline image


# Create the model-specific input
model_id = "twelvelabs.marengo-embed-2-7-v1:0"
# Replace the us prefix depending on your region
inference_profile_id = "us.twelvelabs.marengo-embed-2-7-v1:0"

model_input = {
   "inputType": "image",
   "mediaSource": {
      "base64String": "example-base64-image"
   }
}

S3 image


# Create the model-specific input
model_id = "twelvelabs.marengo-embed-2-7-v1:0"
# Replace the us prefix depending on your region
inference_profile_id = "us.twelvelabs.marengo-embed-2-7-v1:0"

model_input = {
     "inputType": "image",
     "mediaSource": {
          "s3Location": {
               "uri": "s3://amzn-s3-demo-bucket/my_image.png",
               "bucketOwner": "123456789012"
          }
     }
}

Inline video


# Create the model-specific input
model_id = "twelvelabs.marengo-embed-2-7-v1:0"
# Replace the us prefix depending on your region
inference_profile_id = "us.twelvelabs.marengo-embed-2-7-v1:0"

model_input = {
    "inputType": "video",
    "mediaSource": {
        "base64String": "base_64_encoded_string_of_video"
    },
    "startSec": 0,
    "lengthSec": 13,
    "useFixedLengthSec": 5,
    "embeddingOption": [
        "visual-text", 
        "audio"
    ]
}

S3 video


# Create the model-specific input
model_id = "twelvelabs.marengo-embed-2-7-v1:0"
# Replace the us prefix depending on your region
inference_profile_id = "us.twelvelabs.marengo-embed-2-7-v1:0"

model_input = {
    "inputType": "video",
    "mediaSource": {
        "s3Location": {
            "uri": "s3://amzn-s3-demo-bucket/my-video.mp4",
            "bucketOwner": "123456789012"
        }
    },
    "startSec": 0,
    "lengthSec": 13,
    "useFixedLengthSec": 5,
    "embeddingOption": [
        "visual-text", 
        "audio"
    ]
}

Inline audio


# Create the model-specific input
model_id = "twelvelabs.marengo-embed-2-7-v1:0"
# Replace the us prefix depending on your region
inference_profile_id = "us.twelvelabs.marengo-embed-2-7-v1:0"

model_input = {
    "inputType": "audio", 
    "mediaSource": { 
        "base64String": "base_64_encoded_string_of_audio"
    },
    "startSec": 0,
    "lengthSec": 13,
    "useFixedLengthSec": 10
}

S3 audio


# Create the model-specific input
model_id = "twelvelabs.marengo-embed-2-7-v1:0"
# Replace the us prefix depending on your region
inference_profile_id = "us.twelvelabs.marengo-embed-2-7-v1:0"

model_input = {
    "inputType": "audio",
    "mediaSource": {  
        "s3Location": { 
            "uri": "s3://amzn-s3-demo-bucket/my-audio.wav", 
            "bucketOwner": "123456789012" 
        }
    },
    "startSec": 0,
    "lengthSec": 13,
    "useFixedLengthSec": 10
}

2. Run model invocation using the model input

Then, add the code snippet that corresponds to your chosen model invocation method.
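The two invocation methods can be sketched with the boto3 bedrock-runtime client as follows. This is a hedged example rather than a definitive implementation: the helper function names are hypothetical, the output S3 URI is a placeholder that you supply, and the shape of the parsed InvokeModel response body is not assumed beyond being JSON. The builders are kept separate from the network calls so that the request structure can be inspected without AWS credentials.

```python
import json


def invoke_model_kwargs(model_input: dict, inference_profile_id: str) -> dict:
    """Arguments for InvokeModel: the model input is the JSON request body."""
    return {"modelId": inference_profile_id, "body": json.dumps(model_input)}


def start_async_invoke_kwargs(model_input: dict, model_id: str,
                              output_s3_uri: str) -> dict:
    """Arguments for StartAsyncInvoke: the model input goes in modelInput."""
    return {
        "modelId": model_id,
        "modelInput": model_input,
        "outputDataConfig": {"s3OutputDataConfig": {"s3Uri": output_s3_uri}},
    }


def run_invoke_model(model_input: dict, inference_profile_id: str) -> dict:
    """Call InvokeModel and return the parsed response body."""
    import boto3  # imported here so the builders work without boto3 installed
    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(
        **invoke_model_kwargs(model_input, inference_profile_id))
    return json.loads(response["body"].read())


def run_start_async_invoke(model_input: dict, model_id: str,
                           output_s3_uri: str) -> str:
    """Start an asynchronous job and return its invocation ARN."""
    import boto3
    client = boto3.client("bedrock-runtime")
    response = client.start_async_invoke(
        **start_async_invoke_kwargs(model_input, model_id, output_s3_uri))
    return response["invocationArn"]
```

Following the operations table, the InvokeModel path uses the inference profile ID and the StartAsyncInvoke path uses the base model ID. For an asynchronous job, the embeddings are written to the S3 location given in s3OutputDataConfig after the job completes.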
