Using the Invoke API

Note

This documentation applies to Amazon Nova Version 1. For information about using the Invoke API with Amazon Nova 2, see Using the Invoke API.

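Another way to invoke the Amazon Nova understanding models (Amazon Nova Micro, Lite, Pro, and Premier) is through the Invoke API. The Invoke API for Amazon Nova models is designed to be consistent with the Converse API, so the same unified structure is available to users of the Invoke API (except for document understanding, which is specific to the Converse API). The components discussed previously are used while a consistent schema is maintained across model providers. The Invoke API supports the following model features: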

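InvokeModel: basic multi-turn conversations with buffered (as opposed to streamed) responses
InvokeModelWithResponseStream: multi-turn conversations with a streamed response, for more incremental generation and a more interactive feel
System prompts: system instructions such as personas or response guidelines
Vision: image and video inputs
Tool use: function calling to select among various external tools
Streaming tool use: tool use combined with real-time generation streaming
Guardrails: prevent inappropriate or harmful content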
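
To make the first item in the list concrete, here is a minimal sketch of a buffered (non-streaming) multi-turn request with a system prompt. The helper names `build_request_body`, `invoke_buffered`, `extract_text`, and `main` are hypothetical and exist only for this illustration. The response layout read by `extract_text` (`output.message.content[].text`) is an assumption about the Nova response format and should be checked against the response schema before you rely on it.

```python
import json


def build_request_body(messages, system_text=None, max_tokens=500):
    """Build a messages-v1 request body (hypothetical helper)."""
    body = {
        "schemaVersion": "messages-v1",
        "messages": messages,
        "inferenceConfig": {"maxTokens": max_tokens},
    }
    if system_text:
        # System prompts are passed as a list of text blocks.
        body["system"] = [{"text": system_text}]
    return body


def invoke_buffered(client, model_id, body):
    """Call invoke_model and return the parsed JSON response."""
    response = client.invoke_model(modelId=model_id, body=json.dumps(body))
    return json.loads(response["body"].read())


def extract_text(result):
    """Join the text blocks of the reply (assumed response layout)."""
    blocks = result["output"]["message"]["content"]
    return "".join(block.get("text", "") for block in blocks)


# A multi-turn conversation: earlier turns are replayed in order.
conversation = [
    {"role": "user", "content": [{"text": "Suggest a topic for a short story."}]},
    {"role": "assistant", "content": [{"text": "A camping trip."}]},
    {"role": "user", "content": [{"text": "Write two sentences about it."}]},
]
request_body = build_request_body(
    conversation, system_text="Act as a creative writing assistant."
)


def main():
    # Requires boto3, AWS credentials, and model access in the chosen Region.
    import boto3

    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    result = invoke_buffered(client, "us.amazon.nova-lite-v1:0", request_body)
    print(extract_text(result))
```

Calling `main()` sends the request. Everything above it only builds data, so the request body can be inspected or logged without contacting the service.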

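Important

The timeout period for inference calls to Amazon Nova is 60 minutes. By default, AWS SDK clients time out after 1 minute. We recommend that you increase the read timeout of your AWS SDK client to at least 60 minutes. For example, in the AWS Python botocore SDK, set the read_timeout field in botocore.config to at least 3600.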


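import boto3
from botocore.config import Config

# Raise the client timeouts so long-running inference calls are not cut off.
client = boto3.client(
    "bedrock-runtime",
    region_name="us-east-1",
    config=Config(
        connect_timeout=3600,  # 60 minutes
        read_timeout=3600,     # 60 minutes
        retries={'max_attempts': 1}
    )
)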

The following example shows how to use the streaming Invoke API with boto3, the AWS SDK for Python, and Amazon Nova Lite:


# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0
import boto3
import json
from datetime import datetime

# Create a Bedrock Runtime client in the AWS Region of your choice.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

LITE_MODEL_ID = "us.amazon.nova-lite-v1:0"

# Define your system prompt(s).
system_list = [
    {
        "text": "Act as a creative writing assistant. When the user provides you with a topic, write a short story about that topic."
    }
]

# Define one or more messages using the "user" and "assistant" roles.
message_list = [{"role": "user", "content": [{"text": "A camping trip"}]}]

# Configure the inference parameters.
inf_params = {"maxTokens": 500, "topP": 0.9, "topK": 20, "temperature": 0.7}

request_body = {
    "schemaVersion": "messages-v1",
    "messages": message_list,
    "system": system_list,
    "inferenceConfig": inf_params,
}

start_time = datetime.now()

# Invoke the model with the response stream
response = client.invoke_model_with_response_stream(
    modelId=LITE_MODEL_ID, body=json.dumps(request_body)
)

request_id = response.get("ResponseMetadata").get("RequestId")
print(f"Request ID: {request_id}")
print("Awaiting first token...")

chunk_count = 0
time_to_first_token = None

# Process the response stream
stream = response.get("body")
if stream:
    for event in stream:
        chunk = event.get("chunk")
        if chunk:
            # Print the response chunk
            chunk_json = json.loads(chunk.get("bytes").decode())
            # Pretty print JSON
            # print(json.dumps(chunk_json, indent=2, ensure_ascii=False))
            content_block_delta = chunk_json.get("contentBlockDelta")
            if content_block_delta:
                if time_to_first_token is None:
                    time_to_first_token = datetime.now() - start_time
                    print(f"Time to first token: {time_to_first_token}")

                chunk_count += 1
                current_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S:%f")
                # print(f"{current_time} - ", end="")
                print(content_block_delta.get("delta").get("text"), end="")
    # Start on a new line, because the streamed text is printed without line endings.
    print(f"\nTotal chunks: {chunk_count}")
else:
    print("No response stream received.")

For more details about the Invoke API operations, including request and response syntax, see InvokeModelWithResponseStream in the Amazon Bedrock API documentation.
