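# Capacity, limits, and cost optimization
<a name="capacity-limits-cost-optimization"></a>

Amazon Bedrock offers flexible capacity options to match your workload requirements and budget. Understanding the differences between the on-demand tiers (Flex, Priority, and Standard), the Reserved tier, batch processing, and cross-Region inference helps you optimize both performance and cost.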

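# Service tiers for optimizing performance and cost
<a name="service-tiers-inference"></a>

Amazon Bedrock offers four service tiers for model inference: Reserved, Priority, Standard, and Flex. With service tiers, you can optimize for availability, cost, and performance.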

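## Reserved tier
<a name="w2aac26b5b5"></a>

The Reserved tier lets you reserve prioritized compute capacity for mission-critical applications that cannot tolerate any downtime. You can allocate different input and output tokens-per-minute capacities to match the exact requirements of your workload and to control costs. When your application needs more tokens-per-minute capacity than you have reserved, the service automatically overflows to the Standard tier to keep operations uninterrupted. The Reserved tier targets 99.5% uptime for model responses. You can reserve capacity for a term of 1 month or 3 months. You pay a fixed price per 1K tokens-per-minute and are billed monthly.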
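The overflow behavior can be pictured with a little arithmetic. The sketch below is only an illustration, not the service's actual accounting: the function name and the idea of treating a single minute's demand as one number are assumptions. Because input and output capacity are reserved separately, the same calculation would be applied once per direction.

```python
from typing import Tuple


def split_reserved_and_overflow(demand_tpm: int, reserved_tpm: int) -> Tuple[int, int]:
    """Split one minute of token demand into (served from reservation, overflow to Standard).

    Hypothetical helper for illustration; it is not part of any AWS SDK.
    """
    if demand_tpm < 0 or reserved_tpm < 0:
        raise ValueError("token rates must be non-negative")
    # Demand up to the reserved capacity is served by the reservation.
    served_from_reservation = min(demand_tpm, reserved_tpm)
    # Anything beyond the reservation overflows to the Standard tier.
    overflow_to_standard = demand_tpm - served_from_reservation
    return served_from_reservation, overflow_to_standard


# A minute that needs 120K tokens against a 100K tokens-per-minute reservation.
print(split_reserved_and_overflow(120_000, 100_000))
```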

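To get access to the Reserved tier, contact your AWS account team.

**Note**  
Billing continues until you delete the Reserved tier reservation with the help of your AWS account manager.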

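## Priority tier
<a name="w2aac26b5b7"></a>

The Priority tier delivers faster response times than standard on-demand pricing, at a price premium. It is best suited to mission-critical applications with customer-facing business workflows that do not need a 24x7 capacity reservation. The Priority tier requires no prior reservation. To use request-level prioritization, set the optional `service_tier` parameter to `priority`. Priority tier requests take precedence over Standard and Flex tier requests.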

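## Standard tier
<a name="w2aac26b5b9"></a>

The Standard tier provides consistent performance for everyday AI tasks such as content generation, text analysis, and routine document processing. By default, all inference requests are routed to the Standard tier when the `service_tier` parameter is absent. You can also set the optional `service_tier` parameter to `default` to send inference requests to the Standard tier.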

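## Flex tier
<a name="w2aac26b5c11"></a>

For workloads that can tolerate longer processing times, the Flex tier offers cost-effective processing at a pricing discount. This helps you optimize cost for workloads such as model evaluations, content summarization, and agentic workflows. To send inference requests to the Flex tier and receive the pricing discount, set the optional `service_tier` parameter to `flex`.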

## Using service tiers
<a name="w2aac26b5c13"></a>

To use service tiers, set the optional `service_tier` parameter to `reserved`, `priority`, `default`, or `flex` when you call the Amazon Bedrock runtime API.

```
"service_tier" : "reserved | priority | default | flex"
```
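As a rough illustration of setting the parameter, the sketch below adds `service_tier` to a request body and rejects values outside the four listed above. The helper name and the `messages` payload are hypothetical placeholders, and exactly where the field belongs in a real request depends on the API and SDK you use, so treat this as an assumption to check against the API reference.

```python
import json
from typing import Optional

# The four values accepted by the optional service_tier parameter.
ALLOWED_SERVICE_TIERS = {"reserved", "priority", "default", "flex"}


def with_service_tier(request_body: dict, tier: Optional[str] = None) -> dict:
    """Return a copy of request_body with the optional service_tier field set.

    Hypothetical helper for illustration; it is not part of any AWS SDK.
    """
    body = dict(request_body)
    if tier is None:
        # Omitting the parameter routes the request to the Standard tier.
        body.pop("service_tier", None)
        return body
    if tier not in ALLOWED_SERVICE_TIERS:
        raise ValueError(f"unsupported service tier: {tier!r}")
    body["service_tier"] = tier
    return body


# Placeholder payload; the real body shape depends on the model and API.
payload = {"messages": [{"role": "user", "content": "Summarize this report."}]}
print(json.dumps(with_service_tier(payload, "flex"), indent=2))
```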

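On-demand quota for a model is shared across the Priority, Standard (`default`), and Flex service tiers. Your Reserved tier capacity reservation is separate from your on-demand quota. The service tier configuration of a served request is visible in the API response and in AWS CloudTrail events. You can also view service tier metrics in Amazon CloudWatch under ModelId, ServiceTier, and ResolvedServiceTier, where ResolvedServiceTier shows the tier that actually served your request.

For more information about pricing, see the [pricing page](https://aws.amazon.com/bedrock/pricing/).

Models and Regions supported by the Reserved tier: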


| Provider | Model | Model ID | Regions | 
| --- | --- | --- | --- |
| Anthropic | Claude Sonnet 4.6 | global.anthropic.claude-sonnet-4-6 us.anthropic.claude-sonnet-4-6 eu.anthropic.claude-sonnet-4-6 | ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-southeast-1 ap-southeast-2 ap-south-1 ap-southeast-3 ap-south-2 ap-southeast-4 ca-central-1 eu-west-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-1 us-west-2 me-south-1 ap-southeast-7 af-south-1 me-central-1 ap-southeast-5 mx-central-1 il-central-1 ap-east-2 ca-west-1 | 
| Anthropic | Claude Opus 4.6 | global.anthropic.claude-opus-4-6-v1 us.anthropic.claude-opus-4-6-v1 eu.anthropic.claude-opus-4-6-v1 | af-south-1 ap-east-2 ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 ca-central-1 ca-west-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 il-central-1 me-central-1 me-south-1 mx-central-1 sa-east-1 us-east-1 us-east-2 us-west-1 us-west-2 | 
| Anthropic | Claude Sonnet 4.5 | global.anthropic.claude-sonnet-4-5-20250929-v1:0 us.anthropic.claude-sonnet-4-5-20250929-v1:0 eu.anthropic.claude-sonnet-4-5-20250929-v1:0 us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0 | ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-southeast-1 ap-southeast-2 ap-south-1 ap-southeast-3 ap-south-2 ap-southeast-4 ca-central-1 eu-west-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-1 us-west-2 us-gov-west-1 | 
| Anthropic | Claude Opus 4.5 | global.anthropic.claude-opus-4-5-20251101-v1:0 us.anthropic.claude-opus-4-5-20251101-v1:0 eu.anthropic.claude-opus-4-5-20251101-v1:0 | ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-southeast-1 ap-southeast-2 ap-south-1 ap-southeast-3 ap-south-2 ap-southeast-4 ca-central-1 eu-west-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-1 us-west-2 | 
| Anthropic | Claude Haiku 4.5 | global.anthropic.claude-haiku-4-5-20251001-v1:0 us.anthropic.claude-haiku-4-5-20251001-v1:0 eu.anthropic.claude-haiku-4-5-20251001-v1:0 | ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-southeast-1 ap-southeast-2 ap-south-1 ap-southeast-3 ap-south-2 ap-southeast-4 ca-central-1 eu-west-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-1 us-west-2 | 

**Note**  
The Reserved tier does not support the 1M context length for Sonnet 4.5.

Models and Regions supported by the Priority and Flex tiers:


| Provider | Model | Model ID | Regions | 
| --- | --- | --- | --- |
| OpenAI | gpt-oss-120b | openai.gpt-oss-120b-1:0 | us-east-1 us-east-2 us-west-2 ap-northeast-1 ap-south-1 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1 | 
| OpenAI | gpt-oss-20b | openai.gpt-oss-20b-1:0 | us-east-1 us-east-2 us-west-2 ap-northeast-1 ap-south-1 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1 | 
| OpenAI | GPT OSS Safeguard 20B | openai.gpt-oss-safeguard-20b | ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2 | 
| OpenAI | GPT OSS Safeguard 120B | openai.gpt-oss-safeguard-120b | ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2 | 
| Qwen | Qwen3 235B A22B 2507 | qwen.qwen3-235b-a22b-2507-v1:0 | us-east-2 us-west-2 ap-northeast-1 ap-south-1 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-2 | 
| Qwen | Qwen3 Coder 480B A35B Instruct | qwen.qwen3-coder-480b-a35b-v1:0 | us-east-2 us-west-2 ap-northeast-1 ap-south-1 ap-southeast-3 eu-north-1 eu-west-2 | 
| Qwen | Qwen3-Coder-30B-A3B-Instruct | qwen.qwen3-coder-30b-a3b-v1:0 | us-east-1 us-east-2 us-west-2 ap-northeast-1 ap-south-1 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1 | 
| Qwen | Qwen3 32B (dense) | qwen.qwen3-32b-v1:0 | us-east-1 us-east-2 us-west-2 ap-northeast-1 ap-south-1 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1 | 
| Qwen | Qwen3 Next 80B A3B | qwen.qwen3-next-80b-a3b | ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2 | 
| Qwen | Qwen3 VL 235B A22B | qwen.qwen3-vl-235b-a22b | ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2 | 
| DeepSeek | DeepSeek-V3.1 | deepseek.v3-v1:0 | us-east-2 us-west-2 ap-northeast-1 ap-south-1 ap-southeast-3 eu-north-1 eu-west-2 | 
| Amazon | Nova Premier | amazon.nova-premier-v1:0 | us-east-1\* us-east-2\* us-west-2\* | 
| Amazon | Nova Pro | amazon.nova-pro-v1:0 | us-east-1 us-east-2\* us-west-1\* us-west-2\* ap-east-2\* ap-northeast-1\* ap-northeast-2\* ap-south-1\* ap-southeast-1\* ap-southeast-2 ap-southeast-3 ap-southeast-4\* ap-southeast-5\* ap-southeast-7\* eu-central-1\* eu-north-1\* eu-south-1\* eu-south-2\* eu-west-1\* eu-west-2 eu-west-3\* il-central-1\* me-central-1 | 
| Amazon | Nova 2 Lite | amazon.nova-2-lite-v1:0 | ap-east-2 ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 ca-central-1 ca-west-1 eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 il-central-1 me-central-1 us-east-1 us-east-2 us-west-1 us-west-2 | 
| Amazon | Nova 2 Pro Preview | amazon.nova-2-pro-preview-20251202-v1:0 | ap-east-2 ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 ca-central-1 ca-west-1 eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 il-central-1 me-central-1 us-east-1 us-east-2 us-west-1 us-west-2 | 
| Amazon | Nova Lite 2 Omni | amazon.nova-2-lite-omni-v1 | ap-east-2 ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 ca-central-1 ca-west-1 eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 il-central-1 me-central-1 us-east-1 us-east-2 us-west-1 us-west-2 | 
| Google | Gemma 3 4B | google.gemma-3-4b-it | ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2 | 
| Google | Gemma 3 12B | google.gemma-3-12b-it | ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2 | 
| Google | Gemma 3 27B | google.gemma-3-27b-it | ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2 | 
| Minimax AI | Minimax M2 | minimax.minimax-m2 | ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2 | 
| Mistral | Magistral Small 1.2 | mistral.magistral-small-2509 | ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2 | 
| Mistral | Voxtral Mini 1.0 | mistral.voxtral-mini-3b-2507 | ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2 | 
| Mistral | Voxtral Small 1.0 | mistral.voxtral-small-24b-2507 | ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2 | 
| Mistral | Ministral 3B 3.0 | mistral.ministral-3-3b-instruct | ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2 | 
| Mistral | Ministral 8B 3.0 | mistral.ministral-3-8b-instruct | ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2 | 
| Mistral | Ministral 14B 3.0 | mistral.ministral-3-14b-instruct | ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2 | 
| Mistral | Mistral Large 3 | mistral.mistral-large-3-675b-instruct | ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2 | 
| Kimi AI | Kimi K2 Thinking | moonshot.kimi-k2-thinking | ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2 | 
| Nvidia | NVIDIA Nemotron Nano 2 | nvidia.nemotron-nano-9b-v2 | ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2 | 
| Nvidia | NVIDIA Nemotron Nano 2 VL | nvidia.nemotron-nano-12b-v2 | ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2 | 

\* Model inference may be served from multiple Regions.

To control access to service tiers, see [Control access to service tiers](security_iam_id-based-policy-examples-agent.md#security_iam_id-based-policy-examples-service-tiers).

## Capacity options
<a name="capacity-options"></a>


| Capacity type | Use case | Key characteristics | 
| --- | --- | --- | 
| On-demand: Flex | Occasional, low-volume workloads |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/zh_tw/bedrock/latest/userguide/capacity-limits-cost-optimization.html)  | 
| On-demand: Standard | Regular production workloads |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/zh_tw/bedrock/latest/userguide/capacity-limits-cost-optimization.html)  | 
| On-demand: Priority | High-priority, latency-sensitive applications |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/zh_tw/bedrock/latest/userguide/capacity-limits-cost-optimization.html)  | 
| Reserved tier | Consistent, high-volume workloads |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/zh_tw/bedrock/latest/userguide/capacity-limits-cost-optimization.html)  | 
| Batch | Large-scale, non-time-sensitive processing |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/zh_tw/bedrock/latest/userguide/capacity-limits-cost-optimization.html)  | 
| Cross-Region inference | High availability, traffic bursts |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/zh_tw/bedrock/latest/userguide/capacity-limits-cost-optimization.html)  | 

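## Limits and quotas
<a name="limits-quotas"></a>

### On-demand limits (by tier)
<a name="on-demand-limits"></a>


| Tier | RPM range | TPM range | Throttling risk | 
| --- | --- | --- | --- | 
| Flex | 10-100 | 5K-50K | High | 
| Standard | 100-500 | 50K-150K | Medium | 
| Priority | 500-1000+ | 150K-300K+ | Low | 

+ Burst capacity: available on all tiers for short spikes
+ Soft limits: request increases through Service Quotas
+ Model-specific: actual limits vary by foundation model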
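One way to read the ranges in the table is as a quick lookup for the lowest tier whose upper bound covers an expected load. The sketch below is an illustration only: the bounds are copied from the table, the Priority row is treated as capped at its listed figures, the function name is hypothetical, and real limits vary by foundation model.

```python
from typing import Optional

# Upper ends of the RPM and TPM ranges from the table, lowest tier first.
# These are indicative ranges, not guaranteed quotas.
TIER_UPPER_BOUNDS = [
    ("Flex", 100, 50_000),
    ("Standard", 500, 150_000),
    ("Priority", 1_000, 300_000),
]


def lowest_tier_covering(rpm: int, tpm: int) -> Optional[str]:
    """Return the first tier whose listed range covers both rates, or None."""
    for name, max_rpm, max_tpm in TIER_UPPER_BOUNDS:
        if rpm <= max_rpm and tpm <= max_tpm:
            return name
    # Above every listed range: consider a quota increase or the Reserved tier.
    return None


print(lowest_tier_covering(rpm=50, tpm=120_000))
```

A load that fits Flex on request count but not on token volume is pushed up a tier, which is why both rates are checked together.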

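### Reserved tier limits
<a name="reserved-tier-limits"></a>
+ Minimum commitment: 1 model unit
+ Maximum units: account- and Region-specific
+ Input/output token limits: based on the units purchased
+ No RPM throttling within purchased capacity

### Batch processing limits
<a name="batch-processing-limits"></a>
+ Job size: up to 10,000 records per batch
+ File size: 200 MB maximum input file
+ Processing time: 24-hour completion window
+ Concurrent jobs: Region-specific quota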
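A pre-flight check against the record and file-size limits above could look like the following sketch. The function name is hypothetical, and reading 200 MB as 200 × 1024 × 1024 bytes is an assumption; the concurrent-job quota is Region-specific and is not checked here.

```python
from typing import List

MAX_RECORDS_PER_BATCH = 10_000
# Assumption: "200 MB" is interpreted as 200 * 1024 * 1024 bytes.
MAX_INPUT_FILE_BYTES = 200 * 1024 * 1024


def batch_limit_violations(record_count: int, file_size_bytes: int) -> List[str]:
    """Return human-readable limit violations; an empty list means the input fits."""
    problems = []
    if record_count > MAX_RECORDS_PER_BATCH:
        problems.append(
            f"{record_count} records exceeds the {MAX_RECORDS_PER_BATCH}-record limit"
        )
    if file_size_bytes > MAX_INPUT_FILE_BYTES:
        problems.append(
            f"{file_size_bytes} bytes exceeds the {MAX_INPUT_FILE_BYTES}-byte limit"
        )
    return problems


print(batch_limit_violations(record_count=12_000, file_size_bytes=50 * 1024 * 1024))
```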

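### Cross-Region inference
<a name="cross-region-inference-limits"></a>
+ Inherits the on-demand tier limits of each Region
+ No additional quota overhead
+ Automatic routing (no manual limit management)

## Cost optimization
<a name="cost-optimization"></a>

### Decision framework
<a name="decision-framework"></a>


| Scenario | Recommended option | Why | 
| --- | --- | --- | 
| Development/testing | Flex | Lowest cost, suitable for non-production | 
| Standard production | Standard | Best balance of cost and performance | 
| Critical user-facing applications | Priority | Reliability and performance over cost | 
| Steady high-volume load | Reserved tier | 30-50% savings with a commitment | 
| Bulk data processing | Batch | 50% discount, non-urgent workloads | 
| Mission-critical uptime | Cross-Region inference | Availability > cost | 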

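### Optimization strategies
<a name="optimization-strategies"></a>

**Choose the right on-demand tier**
+ Start with Standard for most workloads
+ Downgrade to Flex for development/test environments
+ Upgrade to Priority only when throttling affects users
+ Monitor CloudWatch throttling metrics to inform decisions

**Move to the Reserved tier**
+ When consistent load exceeds 40% of on-demand cost
+ Calculate break-even: (monthly on-demand cost) vs. (reserved commitment)
+ Start with a 1-month commitment
+ The Reserved tier can be combined with any on-demand tier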
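The break-even check in the list above can be sketched as a small calculation. This is one possible reading rather than an official formula: it assumes "consistent load" means the share of monthly on-demand spend attributable to steady baseline traffic, and it compares the reserved commitment against the on-demand cost of that same baseline. The function and parameter names are hypothetical.

```python
def should_consider_reserved(
    total_on_demand_cost: float,
    baseline_on_demand_cost: float,
    reserved_commitment: float,
    threshold: float = 0.40,
) -> bool:
    """Return True when the baseline share exceeds the threshold and reserving is cheaper.

    All amounts are monthly and in the same currency.
    """
    if total_on_demand_cost <= 0:
        return False
    # Share of on-demand spend that comes from steady, predictable load.
    baseline_share = baseline_on_demand_cost / total_on_demand_cost
    # Break-even: the commitment must cost less than serving the baseline on demand.
    return baseline_share > threshold and reserved_commitment < baseline_on_demand_cost


# 5,000 of a 10,000 monthly bill is steady load; a 4,000 commitment would cover it.
print(should_consider_reserved(10_000, 5_000, 4_000))
```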

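**Use batch for**
+ Training data generation
+ Content moderation backlogs
+ Report generation
+ Data enrichment pipelines

**Combine approaches**
+ Reserved tier for baseline traffic
+ Standard on-demand for moderate bursts
+ Priority on-demand for critical peak periods
+ Batch for offline processing
+ Cross-Region for failover only

**Cost monitoring**
+ Compare tier costs: Flex < Standard < Priority
+ Track tokens per request (optimize prompts)
+ Use CloudWatch metrics for utilization and throttling
+ Set billing alarms for unexpected spikes
+ Review Reserved tier utilization monthly
+ Evaluate tier upgrades only when throttling occurs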

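# Process multiple prompts with batch inference
<a name="batch-inference"></a>

With batch inference, you can submit multiple prompts and generate responses asynchronously. You can format the input data using either the `InvokeModel` or the `Converse` API format. Batch inference helps you process a large number of requests efficiently by sending a single request and generating the responses in an Amazon S3 bucket. After you define the model inputs in files that you create, you upload the files to an S3 bucket. You then submit a batch inference request and specify the S3 bucket. When the job completes, you can retrieve the output files from S3. You can use batch inference to improve the performance of model inference on large datasets.

**Note**  
Batch inference is not supported for provisioned models.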
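To make the "define the model inputs in files" step concrete, the sketch below writes prompts as JSON Lines, one record per line. The `recordId` and `modelInput` field names and the Converse-style message shape are assumptions about the input layout; confirm them against the data-formatting topic before use. The helper name and the record ID scheme are hypothetical.

```python
import json
from typing import List


def to_batch_jsonl(prompts: List[str]) -> str:
    """Serialize prompts as JSON Lines, one batch record per line."""
    lines = []
    for index, prompt in enumerate(prompts, start=1):
        record = {
            # Assumed layout: a unique record ID plus the per-request model input.
            "recordId": f"REC{index:08d}",
            "modelInput": {
                "messages": [{"role": "user", "content": [{"text": prompt}]}]
            },
        }
        lines.append(json.dumps(record, ensure_ascii=False))
    # Each record ends with a newline so that files can be concatenated safely.
    return "".join(line + "\n" for line in lines)


print(to_batch_jsonl(["Summarize document A.", "Summarize document B."]), end="")
```

The resulting text would be saved as a `.jsonl` file and uploaded to the input S3 bucket before the batch job is submitted.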

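For general information about batch inference, see the following resources:
+ To see pricing for batch inference, see [Amazon Bedrock pricing](https://aws.amazon.com/bedrock/pricing/).
+ To see quotas for batch inference, see [Amazon Bedrock endpoints and quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html) in the AWS General Reference.
+ To receive notifications when a batch inference job completes or changes state instead of polling, see [Monitor Amazon Bedrock job state changes using Amazon EventBridge](monitoring-eventbridge.md).

**Topics**
+ [Supported Regions and models for batch inference](batch-inference-supported.md)
+ [Prerequisites for batch inference](batch-inference-prereq.md)
+ [Create a batch inference job](batch-inference-create.md)
+ [Monitor batch inference jobs](batch-inference-monitor.md)
+ [Stop a batch inference job](batch-inference-stop.md)
+ [View the results of a batch inference job](batch-inference-results.md)
+ [Code example for batch inference](batch-inference-example.md)
+ [Submit a batch of prompts with the OpenAI Batch API](inference-openai-batch.md)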

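# Supported Regions and models for batch inference
<a name="batch-inference-supported"></a>

The following list provides links to general information about Region and model support in Amazon Bedrock:
+ For a list of Region codes and endpoints supported in Amazon Bedrock, see [Amazon Bedrock endpoints and quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#bedrock_region).
+ For a list of Amazon Bedrock model IDs to use when calling Amazon Bedrock API operations, see [Supported foundation models in Amazon Bedrock](models-supported.md).
+ For a list of Amazon Bedrock inference profile IDs to use when calling Amazon Bedrock API operations, see [Supported cross-Region inference profiles](inference-profiles-support.md#inference-profiles-support-system).

Batch inference can be used with different types of models. The following list describes support for the different types of Amazon Bedrock models:
+ **Single-Region model support** – Lists the AWS Regions that support sending inference requests to a foundation model in one Region. For a full table of the models available in Amazon Bedrock, see [Supported foundation models in Amazon Bedrock](models-supported.md).
+ **Cross-Region inference profile support** – Lists the Regions that support using a cross-Region inference profile, which supports sending inference requests to a foundation model in multiple AWS Regions within a geographic area. An inference profile has a prefix before the model ID that indicates its geographic area (for example, `us.`, `apac`). For more information about the inference profiles available in Amazon Bedrock, see [Supported Regions and models for inference profiles](inference-profiles-support.md).
+ **Custom model support** – Lists the Regions that support sending inference requests to a custom model. For more information about model customization, see [Customize your model to improve its performance for your use case](custom-models.md).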
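The prefix convention for inference profiles can be illustrated with a small parser. This is a sketch under assumptions: the set of prefixes is limited to those that appear in this document (`us`, `eu`, `apac`, `global`, `us-gov`) and may be incomplete, and the function name is hypothetical.

```python
from typing import Optional, Tuple

# Geographic prefixes seen in this document; the real set may be larger.
KNOWN_GEO_PREFIXES = ("us", "eu", "apac", "global", "us-gov")


def split_inference_profile_id(identifier: str) -> Tuple[Optional[str], str]:
    """Return (geographic prefix or None, model ID without the prefix)."""
    prefix, separator, remainder = identifier.partition(".")
    if separator and prefix in KNOWN_GEO_PREFIXES:
        return prefix, remainder
    # No recognized prefix: treat the identifier as a plain single-Region model ID.
    return None, identifier


print(split_inference_profile_id("us.anthropic.claude-sonnet-4-6"))
print(split_inference_profile_id("amazon.nova-pro-v1:0"))
```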

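The following table summarizes support for batch inference:


| Provider | Model | Model ID | Single-Region model support | Cross-Region inference profile support | Custom model support | 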
| --- | --- | --- | --- | --- | --- | 
| Amazon | Amazon Nova Multimodal Embeddings | amazon.nova-2-multimodal-embeddings-v1:0 |  us-east-1  |  | N/A | 
| Amazon | Nova 2 Lite | amazon.nova-2-lite-v1:0 | N/A |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 ca-central-1 ca-west-1 eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 il-central-1 me-central-1 us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| Amazon | Nova Lite | amazon.nova-lite-v1:0 |  me-central-1 us-east-1 us-gov-west-1  |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 ca-central-1 ca-west-1 eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3 il-central-1 me-central-1 us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| Amazon | Nova Micro | amazon.nova-micro-v1:0 |  us-east-1 us-gov-west-1  |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-5 ap-southeast-7 eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3 il-central-1 me-central-1 us-east-1 us-east-2 us-west-2  | N/A | 
| Amazon | Nova Premier | amazon.nova-premier-v1:0 | N/A |  us-east-1 us-east-2 us-west-2  | N/A | 
| Amazon | Nova Pro | amazon.nova-pro-v1:0 |  ap-southeast-3 me-central-1 us-east-1 us-gov-west-1  |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3 il-central-1 me-central-1 us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| Amazon | Titan Multimodal Embeddings G1 | amazon.titan-embed-image-v1 |  ap-south-1 ap-southeast-2 ca-central-1 eu-central-1 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-west-2  |  |  us-east-1 us-west-2  | 
| Amazon | Titan Text Embeddings V2 | amazon.titan-embed-text-v2:0 |  ap-northeast-1 ap-northeast-2 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-2 sa-east-1 us-east-1 us-west-2  |  | N/A | 
| Anthropic | Claude 3 Haiku | anthropic.claude-3-haiku-20240307-v1:0 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 ca-central-1 eu-central-1 eu-central-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-west-2  | N/A | N/A | 
| Anthropic | Claude 3 Opus | anthropic.claude-3-opus-20240229-v1:0 |  us-west-2  |  us-east-1  | N/A | 
| Anthropic | Claude 3 Sonnet | anthropic.claude-3-sonnet-20240229-v1:0 |  ap-northeast-2 ap-south-1 ap-southeast-2 ca-central-1 eu-central-1 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-west-2  |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 eu-central-1 eu-west-1 eu-west-3 us-east-1 us-west-2  | N/A | 
| Anthropic | Claude 3.5 Haiku | anthropic.claude-3-5-haiku-20241022-v1:0 |  us-west-2  |  us-east-1  | N/A | 
| Anthropic | Claude 3.5 Sonnet | anthropic.claude-3-5-sonnet-20240620-v1:0 |  ap-northeast-1 ap-northeast-2 ap-southeast-1 eu-central-1 us-east-1 us-east-2 us-west-2  |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 eu-central-1 eu-west-1 eu-west-3 us-east-1 us-west-2  | N/A | 
| Anthropic | Claude 3.5 Sonnet v2 | anthropic.claude-3-5-sonnet-20241022-v2:0 |  us-west-2  |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 us-east-1 us-east-2 us-west-2  | N/A | 
| Anthropic | Claude 3.7 Sonnet | anthropic.claude-3-7-sonnet-20250219-v1:0 | N/A |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 eu-central-1 eu-north-1 eu-west-1 eu-west-3 us-east-1 us-east-2 us-west-2  | N/A | 
| Anthropic | Claude Haiku 4.5 | anthropic.claude-haiku-4-5-20251001-v1:0 | N/A |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| Anthropic | Claude Opus 4.5 | anthropic.claude-opus-4-5-20251101-v1:0 | N/A |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| Anthropic | Claude Opus 4.6 | anthropic.claude-opus-4-6-v1 | N/A |  af-south-1 ap-east-2 ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 ca-central-1 ca-west-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 il-central-1 me-central-1 me-south-1 mx-central-1 sa-east-1 us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| Anthropic | Claude Sonnet 4 | anthropic.claude-sonnet-4-20250514-v1:0 | N/A |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3 il-central-1 me-central-1 us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| Anthropic | Claude Sonnet 4.5 | anthropic.claude-sonnet-4-5-20250929-v1:0 | N/A |  af-south-1 ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ca-central-1 ca-west-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 me-south-1 mx-central-1 sa-east-1 us-east-1 us-east-2 us-gov-east-1 us-gov-west-1 us-west-1 us-west-2  | N/A | 
| Anthropic | Claude Sonnet 4.6 | anthropic.claude-sonnet-4-6 |  eu-west-2  |  af-south-1 ap-east-2 ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 ca-central-1 ca-west-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 il-central-1 me-central-1 me-south-1 mx-central-1 sa-east-1 us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| DeepSeek | DeepSeek V3 | deepseek.v3.2 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-north-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| DeepSeek | DeepSeek-V3.1 | deepseek.v3-v1:0 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-north-1 eu-west-2 us-east-2 us-west-2  |  | N/A | 
| Google | Gemma 3 12B IT | google.gemma-3-12b-it |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Google | Gemma 3 27B PT | google.gemma-3-27b-it |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Google | Gemma 3 4B IT | google.gemma-3-4b-it |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Meta | Llama 3.1 405B Instruct | meta.llama3-1-405b-instruct-v1:0 |  us-west-2  |  | N/A | 
| Meta | Llama 3.1 70B Instruct | meta.llama3-1-70b-instruct-v1:0 |  us-west-2  |  us-east-1 us-west-2  | N/A | 
| Meta | Llama 3.1 8B Instruct | meta.llama3-1-8b-instruct-v1:0 |  us-west-2  |  us-east-1 us-west-2  | N/A | 
| Meta | Llama 3.2 11B Instruct | meta.llama3-2-11b-instruct-v1:0 |  |  us-east-1 us-west-2  | N/A | 
| Meta | Llama 3.2 1B Instruct | meta.llama3-2-1b-instruct-v1:0 |  |  eu-central-1 eu-west-1 eu-west-3 us-east-1 us-west-2  | N/A | 
| Meta | Llama 3.2 3B Instruct | meta.llama3-2-3b-instruct-v1:0 |  |  eu-central-1 eu-west-1 eu-west-3 us-east-1 us-west-2  | N/A | 
| Meta | Llama 3.2 90B Instruct | meta.llama3-2-90b-instruct-v1:0 |  |  us-east-1 us-west-2  | N/A | 
| Meta | Llama 3.3 70B Instruct | meta.llama3-3-70b-instruct-v1:0 |  us-east-2  |  us-east-1 us-east-2 us-west-2  | N/A | 
| Meta | Llama 4 Maverick 17B Instruct | meta.llama4-maverick-17b-instruct-v1:0 |  |  us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| Meta | Llama 4 Scout 17B Instruct | meta.llama4-scout-17b-instruct-v1:0 |  |  us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| MiniMax | MiniMax M2 | minimax.minimax-m2 |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| MiniMax | MiniMax M2.1 | minimax.minimax-m2.1 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Mistral AI | Devstral 2 123B | mistral.devstral-2-123b |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Mistral AI | Magistral Small 2509 | mistral.magistral-small-2509 |  ap-northeast-1 ap-south-1 ap-southeast-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Mistral AI | Ministral 14B 3.0 | mistral.ministral-3-14b-instruct |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Mistral AI | Ministral 3 8B | mistral.ministral-3-8b-instruct |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Mistral AI | Ministral 3B | mistral.ministral-3-3b-instruct |  ap-northeast-1 ap-south-1 ap-southeast-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Mistral AI | Mistral Large (24.07) | mistral.mistral-large-2407-v1:0 |  us-west-2  | N/A | N/A | 
| Mistral AI | Mistral Large 3 | mistral.mistral-large-3-675b-instruct |  ap-northeast-1 ap-south-1 ap-southeast-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Mistral AI | Mistral Small (24.02) | mistral.mistral-small-2402-v1:0 |  us-east-1  | N/A | N/A | 
| Mistral AI | Voxtral Mini 3B 2507 | mistral.voxtral-mini-3b-2507 |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Mistral AI | Voxtral Small 24B 2507 | mistral.voxtral-small-24b-2507 |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Moonshot AI | Kimi K2 Thinking | moonshot.kimi-k2-thinking |  ap-northeast-1 ap-south-1 ap-southeast-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Moonshot AI | Kimi K2.5 | moonshotai.kimi-k2.5 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-north-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| NVIDIA | NVIDIA Nemotron Nano 12B v2 VL BF16 | nvidia.nemotron-nano-12b-v2 |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| NVIDIA | NVIDIA Nemotron Nano 9B v2 | nvidia.nemotron-nano-9b-v2 |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| NVIDIA | Nemotron Nano 3 30B | nvidia.nemotron-nano-3-30b |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| OpenAI | GPT OSS Safeguard 120B | openai.gpt-oss-safeguard-120b |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| OpenAI | GPT OSS Safeguard 20B | openai.gpt-oss-safeguard-20b |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| OpenAI | gpt-oss-120b | openai.gpt-oss-120b-1:0 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-gov-west-1 us-west-2  | N/A | N/A | 
| OpenAI | gpt-oss-20b | openai.gpt-oss-20b-1:0 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-gov-west-1 us-west-2  | N/A | N/A | 
| Qwen | Qwen3 235B A22B 2507 | qwen.qwen3-235b-a22b-2507-v1:0 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-2 us-east-2 us-west-2  | N/A | N/A | 
| Qwen | Qwen3 32B (dense) | qwen.qwen3-32b-v1:0 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Qwen | Qwen3 Coder 480B A35B Instruct | qwen.qwen3-coder-480b-a35b-v1:0 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-north-1 eu-west-2 us-east-2 us-west-2  | N/A | N/A | 
| Qwen | Qwen3 Coder Next | qwen.qwen3-coder-next |  ap-southeast-2 eu-west-2 us-east-1  | N/A | N/A | 
| Qwen | Qwen3 Next 80B A3B | qwen.qwen3-next-80b-a3b |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Qwen | Qwen3 VL 235B A22B | qwen.qwen3-vl-235b-a22b |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Qwen | Qwen3-Coder-30B-A3B-Instruct | qwen.qwen3-coder-30b-a3b-v1:0 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Z.AI | GLM 4.7 | zai.glm-4.7 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-north-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Z.AI | GLM 4.7 Flash | zai.glm-4.7-flash |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 

# 批次推論的先決條件
<a name="batch-inference-prereq"></a>

若要執行批次推論，您必須滿足下列先決條件：

1. 準備資料集並上傳至 Amazon S3 儲存貯體。

1. 為您的輸出資料建立 S3 儲存貯體。

1. 設定相關 IAM 身分的批次推論相關許可。

1. (選用) 設定 VPC 以在執行批次推論時保護 S3 中的資料。如果您不需要使用 VPC，可以略過此步驟。

若要了解如何滿足這些先決條件，請導覽至下列主題：

**Topics**
+ [格式化並上傳您的批次推論資料](batch-inference-data.md)
+ [批次推論的必要許可](batch-inference-permissions.md)
+ [使用 VPC 保護批次推論任務](batch-vpc.md)

# 格式化並上傳您的批次推論資料
<a name="batch-inference-data"></a>

您必須將批次推論資料新增至要在提交模型調用任務時選擇或指定的 S3 位置。S3 位置必須包含下列項目：
+ 至少一個定義模型輸入的 JSONL 檔案。JSONL 包含 JSON 物件的資料列。您的 JSONL 檔案必須以副檔名 .jsonl 結尾，且格式如下：

  ```
  { "recordId" : "alphanumeric string", "modelInput" : {JSON body} }
  ...
  ```

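  Each line contains a JSON object with a `recordId` field and a `modelInput` field. The format of the `modelInput` JSON object depends on the model invocation type that you choose when you [create the batch inference job](batch-inference-create.md). If you use the `InvokeModel` type (the default), the format must match the `body` field that you use for the model in an `InvokeModel` request (see [Inference request parameters and response fields for foundation models](model-parameters.md)). If you use the `Converse` type, the format must match the request body of the [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html) API.
**Note**  
If you omit the `recordId` field, Amazon Bedrock adds it in the output.
The order of records in the output JSONL file is not guaranteed to match the order of records in the input JSONL file.
You specify the model to use when you create the [batch inference job](batch-inference-create.md).
+ (If your input content includes an Amazon S3 location) Some models let you define the content of the input as an S3 location. See [Example video input for Amazon Nova](#batch-inference-data-ex-s3).
**Warning**  
When you use S3 URIs in your prompts, all resources must be in the same S3 bucket and folder. The `InputDataConfig` parameter must specify the folder path that contains all linked resources (such as videos or images), not just an individual `.jsonl` file. S3 paths are case-sensitive, so make sure that your URIs match the exact folder structure.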
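
As a minimal sketch of the input format described above, the following Python snippet writes one JSON object per line to a `.jsonl` file. The helper name `write_batch_records`, the record ID scheme, and the prompts are hypothetical. The `modelInput` body follows the Anthropic Messages format used elsewhere in this guide and would need to match whichever model you actually select.

```python
import json
import os
import tempfile


def write_batch_records(path, prompts, max_tokens=1024):
    """Write one JSON object per line, each with recordId and modelInput."""
    if not path.endswith(".jsonl"):
        raise ValueError("Batch input files must use the .jsonl extension")
    with open(path, "w", encoding="utf-8") as f:
        for index, prompt in enumerate(prompts, start=1):
            record = {
                # Hypothetical ID scheme; any alphanumeric string works.
                "recordId": f"CALL{index:07d}",
                "modelInput": {
                    "anthropic_version": "bedrock-2023-05-31",
                    "max_tokens": max_tokens,
                    "messages": [
                        {"role": "user",
                         "content": [{"type": "text", "text": prompt}]}
                    ],
                },
            }
            # json.dumps emits a single line, which is what JSONL requires.
            f.write(json.dumps(record) + "\n")
    return path


# Example: write two records to a temporary file.
demo_path = os.path.join(tempfile.mkdtemp(), "abc.jsonl")
write_batch_records(demo_path, ["Summarize transcript A", "Summarize transcript B"])
```

Generating the file programmatically keeps every record on its own line, which is easy to get wrong when editing JSONL by hand.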

確保您的輸入符合批次推論配額。您可以在 [Amazon Bedrock 服務配額](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#limits_bedrock)中搜尋下列配額：
+ **每個批次推論任務的記錄數下限** – 任務中所有 JSONL 檔案的記錄 (JSON 物件) 數目下限。
+ **每個批次推論任務的每個輸入檔案記錄數** – 任務中單一 JSONL 檔案中的記錄 (JSON 物件) 數目上限。
+ **每個批次推論任務的記錄數** – 任務中所有 JSONL 檔案的記錄 (JSON 物件) 數目上限。
+ **批次推論輸入檔案大小** – 任務中單一檔案的大小上限。
+ **批次推論任務大小** – 所有輸入檔案的最大累積大小。
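
One way to catch quota problems before submitting a job is to count records and check the file size locally. The sketch below assumes a hypothetical helper `check_input_file`. The limit values passed to it are placeholders for illustration only, not the real quotas, which you would look up for your model and Region.

```python
import json
import os
import tempfile


def check_input_file(path, min_records, max_records_per_file, max_file_bytes):
    """Return a list of problems found when comparing a JSONL file to quota values."""
    problems = []
    size = os.path.getsize(path)
    if size > max_file_bytes:
        problems.append(f"file is {size} bytes; limit is {max_file_bytes}")
    count = 0
    with open(path, encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                json.loads(line)
            except json.JSONDecodeError:
                problems.append(f"line {line_number} is not valid JSON")
                continue
            count += 1
    if count < min_records:
        problems.append(f"{count} records; minimum is {min_records}")
    if count > max_records_per_file:
        problems.append(f"{count} records; per-file maximum is {max_records_per_file}")
    return problems


# Build a small sample file with three records.
sample_path = os.path.join(tempfile.mkdtemp(), "input.jsonl")
with open(sample_path, "w", encoding="utf-8") as f:
    for i in range(3):
        f.write(json.dumps({"recordId": f"R{i}", "modelInput": {}}) + "\n")

# Placeholder limits, not real quota values.
issues = check_input_file(sample_path, min_records=5,
                          max_records_per_file=10, max_file_bytes=1_000_000)
print(issues)
```

The check covers a single file. The job-level quotas (total records and cumulative size) would need the same counts summed across all input files.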

若要進一步了解如何設定批次推論輸入，請參閱下列範例：

## Anthropic Claude 3 Haiku 的範例文字輸入
<a name="batch-inference-data-ex-text"></a>

如果您計劃使用 Anthropic Claude 3 Haiku 模型的[訊息 API](model-parameters-anthropic-claude-messages.md) 格式來執行批次推論，您可以提供 JSONL 檔案，在其中加入包含下列 JSON 物件的一行：

```
{
    "recordId": "CALL0000001", 
    "modelInput": {
        "anthropic_version": "bedrock-2023-05-31", 
        "max_tokens": 1024,
        "messages": [ 
            { 
                "role": "user", 
                "content": [
                    {
                        "type": "text", 
                        "text": "Summarize the following call transcript: ..." 
                    } 
                ]
            }
        ]
    }
}
```

## Amazon Nova 的範例影片輸入
<a name="batch-inference-data-ex-s3"></a>

如果您計劃使用 Amazon Nova Lite 或 Amazon Nova Pro 模型在影片輸入上執行批次推論，您可以選擇在 JSONL 檔案中以位元組或 S3 位置來定義影片。例如，您可能有一個路徑為 `s3://batch-inference-input-bucket` 且包含下列檔案的 S3 儲存貯體：

```
s3://batch-inference-input-bucket/
├── videos/
│   ├── video1.mp4
│   ├── video2.mp4
│   ├── ...
│   └── video50.mp4
└── input.jsonl
```

來自 `input.jsonl` 檔案的範例記錄如下：

```
{
    "recordId": "RECORD01",
    "modelInput": {
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "text": "You are an expert in recipe videos. Describe this video in less than 200 words following these guidelines: ..."
                    },
                    {
                        "video": {
                            "format": "mp4",
                            "source": {
                                "s3Location": {
                                    "uri": "s3://batch-inference-input-bucket/videos/video1.mp4",
                                    "bucketOwner": "111122223333"
                                }
                            }
                        }
                    }
                ]
            }
        ]
    }
}
```

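When you create the batch inference job, you must specify the folder path `s3://batch-inference-input-bucket` in the `InputDataConfig` parameter. Batch inference processes the `input.jsonl` file at this location, along with any referenced resources (such as the video files in the `videos` subfolder).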
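
Because every referenced resource must sit under the input folder, it can help to verify records before uploading them. The following sketch walks a record, collects every string that starts with `s3://`, and reports the ones that fall outside the folder. The function names are hypothetical, and the comparison is deliberately case-sensitive to mirror S3 path behavior.

```python
from urllib.parse import urlparse


def find_s3_uris(node):
    """Recursively yield every string value that looks like an S3 URI."""
    if isinstance(node, dict):
        for value in node.values():
            yield from find_s3_uris(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_s3_uris(item)
    elif isinstance(node, str) and node.startswith("s3://"):
        yield node


def uris_outside_folder(record, input_folder_uri):
    """Return the S3 URIs in a record that are not under the input folder."""
    folder = urlparse(input_folder_uri)
    prefix = folder.path.strip("/")
    prefix = prefix + "/" if prefix else ""
    outside = []
    for uri in find_s3_uris(record):
        parsed = urlparse(uri)
        key = parsed.path.lstrip("/")
        # Bucket and key prefix are compared exactly (case-sensitive).
        if parsed.netloc != folder.netloc or not key.startswith(prefix):
            outside.append(uri)
    return outside


record = {
    "recordId": "RECORD01",
    "modelInput": {"messages": [{"role": "user", "content": [
        {"video": {"format": "mp4", "source": {"s3Location": {
            "uri": "s3://batch-inference-input-bucket/videos/video1.mp4"}}}}
    ]}]},
}
print(uris_outside_folder(record, "s3://batch-inference-input-bucket"))
```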

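The following resources provide more information about submitting video inputs for batch inference:
+ To learn how to validate Amazon S3 URIs in an input request, see the [Amazon S3 URL Parsing blog](https://aws.amazon.com/blogs/devops/s3-uri-parsing-is-now-available-in-aws-sdk-for-java-2-x/).
+ For more information about how to set up invocation records for video understanding with Nova, see [Amazon Nova vision prompting guidelines](https://docs.aws.amazon.com/nova/latest/userguide/prompting-vision-prompting.html).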

## Converse 輸入範例
<a name="batch-inference-data-ex-converse"></a>

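If you set the model invocation type to `Converse` when you create the batch inference job, the `modelInput` field must use the Converse API request format. The following example shows a JSONL record for a Converse batch inference job: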

```
{
    "recordId": "CALL0000001",
    "modelInput": {
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "text": "Summarize the following call transcript: ..."
                    }
                ]
            }
        ],
        "inferenceConfig": {
            "maxTokens": 1024
        }
    }
}
```

如需 Converse 請求內文中支援的完整欄位清單，請參閱 API 參考中的 [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html)。

下列主題說明如何設定身分的 S3 存取和批次推論許可，以執行批次推論。

# 批次推論的必要許可
<a name="batch-inference-permissions"></a>

若要執行批次推論，您必須設定下列 IAM 身分的許可：
+ 將建立和管理批次推論任務的 IAM 身分。
+ Amazon Bedrock 擔任代表您執行動作的批次推論[服務角色](security-iam-sr.md)。

若要了解如何設定每個身分的許可，請導覽至下列主題：

**Topics**
+ [IAM 身分提交和管理批次推論任務所需的許可](#batch-inference-permissions-user)
+ [服務角色執行批次推論所需的許可](#batch-inference-permissions-service)

## IAM 身分提交和管理批次推論任務所需的許可
<a name="batch-inference-permissions-user"></a>

若要讓 IAM 身分使用此功能，您必須以必要的許可進行設定。若要這麼做，請執行下列其中一項：
+ 若要允許身分執行所有 Amazon Bedrock 動作，請將 [AmazonBedrockFullAccess](security-iam-awsmanpol.md#security-iam-awsmanpol-AmazonBedrockFullAccess) 政策連接至身分。如果您這樣做，您可以略過此主題。此選項較不安全。
+ 作為安全最佳實務，您應該只將必要的動作授予身分。本主題介紹使用此功能所需的許可。

若要將許可限制為僅用於批次推論的動作，請將下列身分型政策連接至 IAM 身分：

------
#### [ JSON ]

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "BatchInference",
            "Effect": "Allow",
            "Action": [  
                "bedrock:ListFoundationModels",
                "bedrock:GetFoundationModel",
                "bedrock:ListInferenceProfiles",
                "bedrock:GetInferenceProfile",
                "bedrock:ListCustomModels",
                "bedrock:GetCustomModel",
                "bedrock:TagResource", 
                "bedrock:UntagResource", 
                "bedrock:ListTagsForResource",
                "bedrock:CreateModelInvocationJob",
                "bedrock:GetModelInvocationJob",
                "bedrock:ListModelInvocationJobs",
                "bedrock:StopModelInvocationJob"
            ],
            "Resource": "*"
        }
    ]   
}
```

------

若要進一步限制許可，您可以忽略動作，也可以指定要篩選許可的資源和條件索引鍵。如需動作、資源和條件索引鍵的詳細資訊，請參閱*服務授權參考*中的下列主題：
+ [Amazon Bedrock 定義的動作](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html#amazonbedrock-actions-as-permissions) – 了解動作、您可以在 `Resource` 欄位中限制其範圍的資源類型，以及您可以在 `Condition` 欄位中篩選許可的條件索引鍵。
+ [Amazon Bedrock 定義的資源類型](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html#amazonbedrock-resources-for-iam-policies) – 了解 Amazon Bedrock 中的資源類型。
+ [Amazon Bedrock 的條件索引鍵](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html#amazonbedrock-policy-keys) – 了解 Amazon Bedrock 中的條件索引鍵。

下列政策範例縮小了批次推論的許可範圍，只允許帳戶 ID 為 `123456789012` 的使用者使用 Anthropic Claude 3 Haiku 模型在 `us-west-2` 區域中建立批次推論任務：

------
#### [ JSON ]

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "CreateBatchInferenceJob",
            "Effect": "Allow",
            "Action": [
                "bedrock:CreateModelInvocationJob"
            ],
            "Resource": [
                "arn:aws:bedrock:us-west-2::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
                "arn:aws:bedrock:us-west-2:123456789012:model-invocation-job/*"
            ]
        }
    ]
}
```

------
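
If you need the same scoped policy for several Regions, accounts, or models, generating it can be less error-prone than editing ARNs by hand. This is a sketch only: the function name `scoped_batch_policy` is hypothetical, and the output simply reproduces the policy shown above with its values parameterized.

```python
import json


def scoped_batch_policy(region, account_id, model_id):
    """Build an identity-based policy that only allows creating batch jobs."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "CreateBatchInferenceJob",
                "Effect": "Allow",
                "Action": ["bedrock:CreateModelInvocationJob"],
                "Resource": [
                    # Foundation model ARNs have an empty account ID field.
                    f"arn:aws:bedrock:{region}::foundation-model/{model_id}",
                    f"arn:aws:bedrock:{region}:{account_id}:model-invocation-job/*",
                ],
            }
        ],
    }


policy = scoped_batch_policy(
    "us-west-2", "123456789012", "anthropic.claude-3-haiku-20240307-v1:0"
)
print(json.dumps(policy, indent=4))
```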

## 服務角色執行批次推論所需的許可
<a name="batch-inference-permissions-service"></a>

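Batch inference is carried out by a [service role](security-iam-sr.md) that performs actions on your behalf. You can create a service role in the following ways:
+ Use the AWS Management Console to let Amazon Bedrock automatically create a service role with the necessary permissions for you. You can select this option when you create a batch inference job.
+ Use AWS Identity and Access Management to create a custom service role for Amazon Bedrock and attach the necessary permissions. Specify this role when you submit the batch inference job. For more information about creating a custom service role for batch inference, see [Create a custom service role for batch inference](batch-iam-sr.md). For general information about creating service roles, see [Create a role to delegate permissions to an AWS service](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-service.html) in the IAM User Guide.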

**重要**  
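If the S3 bucket to which you [uploaded your data for batch inference](batch-inference-data.md) is in a different AWS account, you must configure an S3 bucket policy that allows the service role to access the data. You must configure this policy manually even if you use the console to create the service role automatically. To learn how to configure S3 bucket policies for Amazon Bedrock resources, see [Attach a bucket policy to an Amazon S3 bucket to allow access from another account](s3-bucket-access.md#s3-bucket-access-cross-account).
Foundation models in Amazon Bedrock are AWS-managed resources and can't be used with IAM policy conditions that require customer ownership. These models are owned and operated by AWS and can't be owned by individual customers. Any IAM policy condition that checks for customer-owned resources (such as conditions that use resource tags, organization IDs, or other ownership attributes) fails when applied to a foundation model, which can block legitimate access to these services.  
For example, if your policy includes the following `aws:ResourceOrgID` condition:  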

  ```
  {
    "Condition": {
      "StringEqualsIgnoreCase": {
        "aws:ResourceOrgID": ["o-xxxxxxxx"]
      }
    }
  }
  ```
your batch inference job fails with an `AccessDeniedException`. Remove the `aws:ResourceOrgID` condition, or create a separate policy statement for foundation models.
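
To find statements that might trigger this failure, you could scan a policy document for the `aws:ResourceOrgID` condition key. The sketch below is an illustration rather than a complete policy analyzer. The function name is hypothetical, and it treats the key name as case-insensitive, which is how IAM handles condition key names.

```python
def statements_with_org_condition(policy):
    """Return the Sid (or index) of each statement that uses aws:ResourceOrgID."""
    flagged = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]  # a single statement may be given as an object
    for index, statement in enumerate(statements):
        condition = statement.get("Condition", {})
        for operator_block in condition.values():
            # Each operator (for example StringEquals) maps condition keys to values.
            if any(key.lower() == "aws:resourceorgid" for key in operator_block):
                flagged.append(statement.get("Sid", index))
                break
    return flagged


example_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "InvokeModels", "Effect": "Allow", "Action": "bedrock:InvokeModel",
         "Resource": "*",
         "Condition": {"StringEqualsIgnoreCase": {"aws:ResourceOrgID": ["o-xxxxxxxx"]}}},
        {"Sid": "ReadInput", "Effect": "Allow", "Action": "s3:GetObject",
         "Resource": "*"},
    ],
}
print(statements_with_org_condition(example_policy))
```

A flagged statement is only a candidate for review. Whether it actually blocks a job depends on whether it applies to foundation model resources.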

# 使用 VPC 保護批次推論任務
<a name="batch-vpc"></a>

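When you run a batch inference job, the job accesses your Amazon S3 bucket to download the input data and write the output data. To control access to your data, we recommend that you use a virtual private cloud (VPC) with [Amazon VPC](https://docs.aws.amazon.com/vpc/latest/userguide/what-is-amazon-vpc.html). You can further protect your data by configuring the VPC so that your data isn't available over the internet and by creating a VPC interface endpoint with [AWS PrivateLink](https://docs.aws.amazon.com/vpc/latest/privatelink/what-is-privatelink.html) to establish a private connection to your data. For more information about how Amazon VPC and AWS PrivateLink integrate with Amazon Bedrock, see [Protect your data using Amazon VPC and AWS PrivateLink](usingVPC.md).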

執行下列步驟，為批次推論任務的輸入提示和輸出模型回應設定和使用 VPC。

**Topics**
+ [設定 VPC 以在批次推論期間保護您的資料](#batch-vpc-setup)
+ [將 VPC 許可連接至批次推論角色](#batch-vpc-role)
+ [提交批次推論任務時新增 VPC 組態](#batch-vpc-config)

## 設定 VPC 以在批次推論期間保護您的資料
<a name="batch-vpc-setup"></a>

若要設定 VPC，請遵循[設定 VPC](usingVPC.md#create-vpc) 中的步驟。您可以進一步保護 VPC，方法是設定 S3 VPC 端點，並使用資源型 IAM 政策，依照 [(範例) 使用 VPC 存取限制對 Amazon S3 資料的資料存取權](vpc-s3.md) 中的步驟，限制對包含批次推論資料的 S3 儲存貯體的存取。

## 將 VPC 許可連接至批次推論角色
<a name="batch-vpc-role"></a>

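After you finish setting up your VPC, attach the following permissions to your [batch inference service role](batch-iam-sr.md) to allow it to access the VPC. Modify this policy to allow access to only the VPC resources that your job needs. Replace *subnet-id* and *security-group-id* with the values from your VPC.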

------
#### [ JSON ]

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "1",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeNetworkInterfaces",
                "ec2:DescribeVpcs",
                "ec2:DescribeDhcpOptions",
                "ec2:DescribeSubnets",
                "ec2:DescribeSecurityGroups"
            ],
            "Resource": [
                "*"
            ]
        },
        {
            "Sid": "2",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateNetworkInterface"
            ],
            "Resource": [
                "arn:aws:ec2:us-east-1:123456789012:network-interface/*",
                "arn:aws:ec2:us-east-1:123456789012:subnet/${{subnet-id}}",
                "arn:aws:ec2:us-east-1:123456789012:security-group/${{security-group-id}}"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:RequestTag/BedrockManaged": [
                        "true"
                    ]
                },
                "ArnEquals": {
                    "aws:RequestTag/BedrockModelInvocationJobArn": [
                        "arn:aws:bedrock:us-east-1:123456789012:model-invocation-job/*"
                    ]
                }
            }
        },
        {
            "Sid": "3",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateNetworkInterfacePermission",
                "ec2:DeleteNetworkInterface",
                "ec2:DeleteNetworkInterfacePermission"
            ],
            "Resource": [
                "*"
            ],
            "Condition": {
                "StringEquals": {
                    "ec2:Subnet": [
                        "arn:aws:ec2:us-east-1:123456789012:subnet/${{subnet-id}}"
                    ]
                },
                "ArnEquals": {
                    "ec2:ResourceTag/BedrockModelInvocationJobArn": [
                        "arn:aws:bedrock:us-east-1:123456789012:model-invocation-job/*"
                    ]
                }
            }
        },
        {
            "Sid": "4",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateTags"
            ],
            "Resource": "arn:aws:ec2:us-east-1:123456789012:network-interface/*",
            "Condition": {
                "StringEquals": {
                    "ec2:CreateAction": [
                        "CreateNetworkInterface"
                    ]
                },
                "ForAllValues:StringEquals": {
                    "aws:TagKeys": [
                        "BedrockManaged",
                        "BedrockModelInvocationJobArn"
                    ]
                }
            }
        }
    ]
}
```

------

## 提交批次推論任務時新增 VPC 組態
<a name="batch-vpc-config"></a>

依照前幾節所述設定 VPC 以及必要的角色和許可之後，您可以建立使用此 VPC 的批次推論任務。

**注意**  
目前，建立批次推論任務時，您只能透過 API 使用 VPC。

當您指定 VPC 子網路和安全群組時，Amazon Bedrock 會在其中一個子網路內建立與安全群組相關聯的*彈性網路介面*(ENI)。ENI 允許 Amazon Bedrock 工作連線至 VPC 中的資源。如需 ENI 的相關資訊，請參閱 *Amazon VPC 使用者指南*中的[彈性網路介面](https://docs.aws.amazon.com/vpc/latest/userguide/VPC_ElasticNetworkInterfaces.html)。Amazon Bedrock 使用 `BedrockManaged` 和 `BedrockModelInvocationJobArn` 標籤標記它建立的 ENI。

建議您在每個可用區域中至少提供一個子網路。

您可以使用安全群組建立規則，以控制 Amazon Bedrock 對 VPC 資源的存取。

When you submit a [CreateModelInvocationJob](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateModelInvocationJob.html) request, you can include `vpcConfig` as a request parameter to specify the VPC subnets and security groups to use, as in the following example.

```
"vpcConfig": { 
    "securityGroupIds": [
        "sg-0123456789abcdef0"
    ],
    "subnetIds": [
        "subnet-0123456789abcdef0",
        "subnet-0123456789abcdef1",
        "subnet-0123456789abcdef2"
    ]
}
```

# 建立批次推論任務
<a name="batch-inference-create"></a>

使用執行模型推論的檔案設定 Amazon S3 儲存貯體之後，您可以建立批次推論任務。開始之前，請檢查您是否根據[格式化並上傳您的批次推論資料](batch-inference-data.md)中所述的指示設定檔案。

**注意**  
若要使用 VPC 提交批次推論任務，您必須使用 API。選取 API 索引標籤，了解如何包含 VPC 組態。

若要了解如何建立批次推論任務，請選擇您偏好方法的索引標籤，然後遵循下列步驟：

------
#### [ Console ]

**建立批次推論任務**

1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

1. 從左側導覽窗格中，選取**批次推論**。

1. 在**批次推論任務**區段中，選擇**建立任務**。

1. 在**任務詳細資訊**區段中，為批次推論任務提供**任務名稱**，然後選擇**選取模型**，以選取要用於批次推論任務的模型。

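1. In the **Model invocation type** section, choose the API format for your input data. Choose **InvokeModel** if your input data uses the model-specific request format, or choose **Converse** if your input data uses the Converse API format. The default is **InvokeModel**.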

1. 在**輸入資料**區段中，選擇**瀏覽 S3**，並為批次推論任務選取 S3 位置。批次推論會在該 S3 位置處理所有 JSONL 和隨附的內容檔案，無論該位置是 S3 資料夾還是單一 JSONL 檔案。
**注意**  
如果輸入資料位於與提交任務之帳戶不同的 S3 儲存貯體中，則必須使用 API 來提交批次推論任務。若要了解如何執行此操作，請選取上方的 API 標籤。

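1. In the **Output data** section, choose **Browse S3** and select an S3 location to store the output files from the batch inference job. By default, the output data is encrypted with an AWS managed key. To choose a custom KMS key, select **Customize encryption settings (advanced)** and choose a key. For more information about encryption of Amazon Bedrock resources and setting up a custom KMS key, see [Data encryption](data-encryption.md).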
**注意**  
如果您打算將輸出資料寫入與提交任務之帳戶不同的 S3 儲存貯體，則必須使用 API 來提交批次推論任務。若要了解如何執行此操作，請選取上方的 API 標籤。

1. 在**服務存取**區段中，選取下列其中一個選項：
   + **使用現有服務角色** — 從下拉式清單中選取服務角色。如需有關使用適當許可權設定自訂角色的詳細資訊，請參閱 [批次推論的必要許可](batch-inference-permissions.md)。
   + **建立並使用新的服務角色** — 輸入服務角色的名稱。

1. (選用) 若要將標籤與批次推論任務建立關聯，請展開**標籤**區段，並為每個標籤新增索引鍵和選用值。如需詳細資訊，請參閱[標記 Amazon Bedrock 資源](tagging.md)。

1. 選擇 **Create batch inference job (建立批次推論任務)**。

------
#### [ API ]

若要建立批次推論任務，請使用 [Amazon Bedrock 控制平面端點](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp)傳送 [CreateModelInvocationJob](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateModelInvocationJob.html) 請求。

下列是必要欄位：


| 欄位 | 使用案例 | 
| --- | --- | 
| jobName | 指定任務的名稱。 | 
| roleArn | 指定具有建立和管理任務許可的服務角色的 Amazon Resource Name (ARN)。如需詳細資訊，請參閱[建立批次推論的自訂服務角色](batch-iam-sr.md)。 | 
| modelId | 指定要在推論中使用之模型的 ID 或 ARN。 | 
| inputDataConfig | 指定包含輸入資料的 S3 位置。批次推論會在該 S3 位置處理所有 JSONL 和隨附的內容檔案，無論該位置是 S3 資料夾還是單一 JSONL 檔案。如需詳細資訊，請參閱[格式化並上傳您的批次推論資料](batch-inference-data.md)。 | 
| outputDataConfig | 指定要寫入模型回應的 S3 位置。 | 

以下是選填欄位：


| 欄位 | 使用案例 | 
| --- | --- | 
| modelInvocationType | 指定輸入資料的 API 格式。設定為 Converse 以使用 Converse API 格式，或 InvokeModel（預設） 以使用模型特定的請求格式。如需 Converse 請求格式的詳細資訊，請參閱 [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html)。 | 
| timeoutDurationInHours | 指定任務逾時時間 (以小時為單位)。 | 
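| tags | Specifies any tags to associate with the job. For more information, see [Tagging Amazon Bedrock resources](tagging.md). | 
| vpcConfig | Specifies the VPC configuration to use to protect your data during the job. For more information, see [Protect batch inference jobs using a VPC](batch-vpc.md). | 
| clientRequestToken | Ensures that the API request completes only once. For more information, see [Ensuring idempotency](https://docs.aws.amazon.com/ec2/latest/devguide/ec2-api-idempotency.html). | 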

回應會傳回 `jobArn`，您可將其用於執行其他與批次推論相關的 API 呼叫。
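
As an illustration of the optional fields, the sketch below adds them to an existing request dictionary and reuses the same `clientRequestToken` on a retry, so that a repeated submission can be recognized as a duplicate. The helper name is hypothetical. The tag shape (a list of objects with `key` and `value`) is an assumption to verify against the API reference, and `modelInvocationType` is taken from the field table above.

```python
import uuid


def with_optional_fields(request, timeout_hours=None, tags=None,
                         invocation_type=None, client_request_token=None):
    """Return a copy of a CreateModelInvocationJob request with optional fields set."""
    updated = dict(request)
    if timeout_hours is not None:
        updated["timeoutDurationInHours"] = timeout_hours
    if tags:
        # Assumed tag shape: a list of {"key": ..., "value": ...} objects.
        updated["tags"] = [{"key": k, "value": v} for k, v in tags.items()]
    if invocation_type is not None:
        updated["modelInvocationType"] = invocation_type
    # Generate a token once; pass the same token again when retrying.
    updated["clientRequestToken"] = client_request_token or str(uuid.uuid4())
    return updated


base = {"jobName": "my-batch-job"}
first_try = with_optional_fields(base, timeout_hours=48, tags={"team": "support"},
                                 invocation_type="Converse")
retry = with_optional_fields(base, timeout_hours=48, tags={"team": "support"},
                             invocation_type="Converse",
                             client_request_token=first_try["clientRequestToken"])
```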

------

# 監控批次推論任務
<a name="batch-inference-monitor"></a>

除了您為批次推論任務設定的組態之外，您也可以查看其狀態來監控其進度。如需任務可能狀態的詳細資訊，請參閱 [ModelInvocationJobSummary](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ModelInvocationJobSummary.html) 中的 `status` 欄位。

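You can also track a job's progress by comparing the total number of records with the number of records processed. You can find these numbers in the `manifest.json.out` file in the Amazon S3 bucket that contains the output files. For more information, see [View the results of a batch inference job](batch-inference-results.md). To learn how to download an S3 object, see [Downloading objects](https://docs.aws.amazon.com/AmazonS3/latest/userguide/download-objects.html).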

**提示**  
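Instead of polling for job status, you can use Amazon EventBridge to receive automatic notifications when a batch inference job completes or changes state. For more information, see [Monitor Amazon Bedrock job state changes using Amazon EventBridge](monitoring-eventbridge.md).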

若要了解如何檢視批次推論任務的詳細資訊，請選擇您偏好方法的索引標籤，然後遵循下列步驟：

------
#### [ Console ]

**檢視批次推論任務的相關資訊**

1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

1. 從左側導覽窗格中，選取**批次推論**。

1. 在**批次推論任務**區段中，選擇任務。

1. 在任務詳細資訊頁面上，您可以檢視任務組態的相關資訊，並透過檢視其**狀態**來監控其進度。

------
#### [ API ]

若要取得批次推論任務的相關資訊，請使用 [Amazon Bedrock 控制平面端點](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp)傳送 [GetModelInvocationJob](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GetModelInvocationJob.html) 請求，並在 `jobIdentifier` 欄位中提供任務的 ID 或 ARN。
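
If you do poll instead of using event notifications, a loop like the following sketch is one option. It takes a `get_status` callable so that it can be exercised without AWS. The function name, poll interval, and the set of terminal statuses are assumptions based on my reading of the `status` field in the API reference, so confirm the status values before relying on them.

```python
import time

# Assumed terminal statuses; verify against the API reference.
TERMINAL_STATUSES = {"Completed", "PartiallyCompleted", "Failed", "Stopped", "Expired"}


def wait_for_job(get_status, poll_seconds=60, max_polls=1440, sleep=time.sleep):
    """Call get_status() until it returns a terminal status or polls run out."""
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"Job did not finish after {max_polls} polls")


# Demonstration with a fake status sequence and no real waiting.
fake_statuses = iter(["Submitted", "InProgress", "InProgress", "Completed"])
final_status = wait_for_job(lambda: next(fake_statuses),
                            poll_seconds=0, sleep=lambda seconds: None)
print(final_status)
```

With a boto3 `bedrock` client, `get_status` could be `lambda: bedrock.get_model_invocation_job(jobIdentifier=jobArn)["status"]`.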

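To list information about multiple batch inference jobs, send a [ListModelInvocationJobs](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ListModelInvocationJobs.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp).

The responses from `GetModelInvocationJob` and `ListModelInvocationJobs` include a `modelInvocationType` field that indicates whether the job uses the `InvokeModel` or `Converse` API format.

You can specify the following optional parameters in a `ListModelInvocationJobs` request: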

| 欄位 | 簡短描述 | 
| --- | --- | 
| maxResults | 回應傳回的結果數目上限。 | 
| nextToken | 如果結果多於您在 maxResults 欄位中指定的數字，則回應會傳回 nextToken 值。若要查看下一批結果，請在另一個請求中傳送 nextToken 值。 | 
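
The `nextToken` handling described in the table can be sketched as a loop that keeps requesting pages until no token is returned. The helper takes a `list_page` callable so that it can run without AWS. The function names are hypothetical, and the response key `invocationJobSummaries` is an assumption to check against the API reference.

```python
def list_all_jobs(list_page, max_results=10):
    """Collect job summaries from every page by following nextToken."""
    jobs = []
    kwargs = {"maxResults": max_results}
    while True:
        page = list_page(**kwargs)
        # Assumed response key; verify against the API reference.
        jobs.extend(page.get("invocationJobSummaries", []))
        token = page.get("nextToken")
        if not token:
            return jobs
        kwargs["nextToken"] = token


# Fake two-page response used only for demonstration.
def fake_list_page(maxResults, nextToken=None):
    if nextToken is None:
        return {"invocationJobSummaries": [{"jobName": "job-1"}, {"jobName": "job-2"}],
                "nextToken": "page-2"}
    return {"invocationJobSummaries": [{"jobName": "job-3"}]}


all_jobs = list_all_jobs(fake_list_page, max_results=2)
print([job["jobName"] for job in all_jobs])
```

With a boto3 `bedrock` client, `list_page` could be `bedrock.list_model_invocation_jobs`.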

若要列出任務的所有標籤，請使用 [Amazon Bedrock 控制平面端點](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp)傳送 [ListTagsForResource](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ListTagsForResource.html) 請求，並包含任務的 Amazon Resource Name (ARN)。

------

# 停止批次推論任務
<a name="batch-inference-stop"></a>

若要了解如何停止進行中的批次推論任務，請選擇您偏好方法的索引標籤，然後遵循下列步驟：

------
#### [ Console ]

**停止批次推論任務**

1. 使用具有 Amazon Bedrock 主控台使用許可的 IAM 身分登入AWS 管理主控台。接著，開啟位於 [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock) 的 Amazon Bedrock 主控台。

1. 從左側導覽窗格中，選取**批次推論**。

1. 選取任務以前往任務詳細資訊頁面，或選取任務旁的選項按鈕。

1. 選擇**停止任務**。

1. 檢閱訊息，然後選擇**停止任務**以確認。
**注意**  
系統會向您收取已處理字符的費用。

------
#### [ API ]

若要停止批次推論任務，請使用 [Amazon Bedrock 控制平面端點](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp)傳送 [StopModelInvocationJob](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_StopModelInvocationJob.html) 請求，並在 `jobIdentifier` 欄位中提供任務的 ID 或 ARN。

如果順利停止任務，您會收到 HTTP 200 回應。

------

# 檢視批次推論任務的結果
<a name="batch-inference-results"></a>

在批次推論任務 `Completed` 之後，您可以從您在建立任務期間所指定 Amazon S3 儲存貯體的檔案中擷取批次推論任務的結果。若要了解如何下載 S3 物件，請參閱[下載物件](https://docs.aws.amazon.com/AmazonS3/latest/userguide/download-objects.html)。S3 儲存貯體包含下列檔案：

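1. Amazon Bedrock generates one output JSONL file for each input JSONL file. The output files contain the model output for each input in the following format. An `error` object replaces the `modelOutput` field in any line where an error occurred during inference. The format of the `modelOutput` JSON object depends on the model invocation type. For `InvokeModel` jobs, the format matches the `body` field in the `InvokeModel` response (see [Inference request parameters and response fields for foundation models](model-parameters.md)). For `Converse` jobs, the format matches the response body of the [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html) API.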

   ```
   { "recordId" : "string", "modelInput": {JSON body}, "modelOutput": {JSON body} }
   ```

   下列範例展示可能的輸出檔案。

   ```
   { "recordId" : "3223593EFGH", "modelInput" : {"inputText": "Roses are red, violets are"}, "modelOutput" : {"inputTextTokenCount": 8, "results": [{"tokenCount": 3, "outputText": "blue\n", "completionReason": "FINISH"}]}}
   { "recordId" : "1223213ABCD", "modelInput" : {"inputText": "Hello world"}, "error" : {"errorCode" : 400, "errorMessage" : "bad request" }}
   ```

1. 包含批次推論任務摘要的 `manifest.json.out` 檔案。

   ```
   {
       "totalRecordCount" : number, 
       "processedRecordCount" : number,
       "successRecordCount": number,
       "errorRecordCount": number,
       "inputTokenCount": number,
       "outputTokenCount" : number
   }
   ```

   這些欄位如下所述：
   + totalRecordCount – 提交至批次推論任務的記錄總數。
   + processedRecordCount – 批次推論任務中處理的記錄數。
   + successRecordCount – 批次推論任務成功處理的記錄數。
   + errorRecordCount – 批次推論任務中導致錯誤的記錄數。
   + inputTokenCount – 提交至批次推論任務的輸入字符總數。
   + outputTokenCount – 批次推論任務產生的輸出字符總數。
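
After you download the output files, a short script can separate successful records from failed ones and summarize the manifest. The sketch below uses only the field names documented above. The function names and the sample numbers are made up for illustration.

```python
import json


def split_output_records(lines):
    """Separate output JSONL lines into successes and failures."""
    successes, failures = [], []
    for line in lines:
        if not line.strip():
            continue
        record = json.loads(line)
        # A failed record carries "error" in place of "modelOutput".
        (failures if "error" in record else successes).append(record)
    return successes, failures


def summarize_manifest(manifest_text):
    """Compute progress and error rate from manifest.json.out content."""
    manifest = json.loads(manifest_text)
    total = manifest["totalRecordCount"]
    processed = manifest["processedRecordCount"]
    return {
        "percent_processed": 100.0 * processed / total if total else 0.0,
        "error_rate": manifest["errorRecordCount"] / processed if processed else 0.0,
    }


output_lines = [
    '{"recordId": "3223593EFGH", "modelInput": {"inputText": "Roses are red"}, '
    '"modelOutput": {"results": [{"outputText": "blue"}]}}',
    '{"recordId": "1223213ABCD", "modelInput": {"inputText": "Hello world"}, '
    '"error": {"errorCode": 400, "errorMessage": "bad request"}}',
]
ok, failed = split_output_records(output_lines)

sample_manifest = json.dumps({
    "totalRecordCount": 200, "processedRecordCount": 150,
    "successRecordCount": 135, "errorRecordCount": 15,
    "inputTokenCount": 12000, "outputTokenCount": 4500,
})
summary = summarize_manifest(sample_manifest)
print(len(ok), len(failed), summary)
```

Failed records keep their `recordId`, so they can be matched back to the input and resubmitted in a later job.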

# 批次推論的程式碼範例
<a name="batch-inference-example"></a>

本章中的程式碼範例說明如何建立批次推論任務、檢視相關資訊，以及停止任務。此範例使用 `InvokeModel` API 格式。如需使用 `Converse` API 格式的詳細資訊，請參閱 [格式化並上傳您的批次推論資料](batch-inference-data.md)。

選取語言以查看其程式碼範例：

------
#### [ Python ]

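Create a JSONL file named *abc.jsonl* that contains one JSON object per record and at least the minimum number of records (see the **Minimum number of records per batch inference job for {Model}** quota in [Quotas for Amazon Bedrock](quotas.md)). In this example, you use the Anthropic Claude 3 Haiku model. The following example shows the first input JSON in the file: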

```
{
    "recordId": "CALL0000001", 
    "modelInput": {
        "anthropic_version": "bedrock-2023-05-31", 
        "max_tokens": 1024,
        "messages": [ 
            { 
                "role": "user", 
                "content": [
                    {
                        "type": "text", 
                        "text": "Summarize the following call transcript: ..." 
                    } 
                ]
            }
        ]
    }
}
... 
# Add records until you hit the minimum
```

建立名為 *amzn-s3-demo-bucket-input* 的 S3 儲存貯體，並將檔案上傳至其中。然後建立名為 *amzn-s3-demo-bucket-output* 的 S3 儲存貯體，以寫入您的輸出檔案。執行下列程式碼片段以提交任務，並從回應中取得 *jobArn*：

```
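import boto3

# Batch inference jobs are managed through the Amazon Bedrock control plane client.
bedrock = boto3.client(service_name="bedrock")

# S3 location of the input JSONL file.
input_data_config = {
    "s3InputDataConfig": {
        "s3Uri": "s3://amzn-s3-demo-bucket-input/abc.jsonl"
    }
}

# S3 location where the output files are written.
output_data_config = {
    "s3OutputDataConfig": {
        "s3Uri": "s3://amzn-s3-demo-bucket-output/"
    }
}

# Submit the job. Replace the role ARN with your batch inference service role.
response = bedrock.create_model_invocation_job(
    roleArn="arn:aws:iam::123456789012:role/MyBatchInferenceRole",
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    jobName="my-batch-job",
    inputDataConfig=input_data_config,
    outputDataConfig=output_data_config,
)

# Keep the job ARN for later status checks and for stopping the job.
jobArn = response.get("jobArn")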
```

傳回任務的 `status`。

```
bedrock.get_model_invocation_job(jobIdentifier=jobArn)['status']
```

列出*失敗*的批次推論任務。

```
bedrock.list_model_invocation_jobs(
    maxResults=10,
    statusEquals="Failed",
    sortOrder="Descending"
)
```

停止您開始的任務。

```
bedrock.stop_model_invocation_job(jobIdentifier=jobArn)
```

------

# 使用 OpenAI 批次 API 提交一批提示
<a name="inference-openai-batch"></a>

您可以使用 [OpenAI 建立批次 API](https://platform.openai.com/docs/api-reference/batch) 搭配 Amazon Bedrock OpenAI 模型來執行批次推論任務。

您可以透過下列方式呼叫 OpenAI 建立批次 API：
+ 使用 Amazon Bedrock 執行時期端點提出 HTTP 請求。
+ 搭配 Amazon Bedrock 執行時期端點使用 OpenAI SDK 請求。

選取主題以進一步了解：

**Topics**
+ [OpenAI 批次 API 支援的模型和區域](#inference-openai-batch-supported)
+ [使用 OpenAI 批次 API 的必要條件](#inference-openai-batch-prereq)
+ [建立 OpenAI 批次任務](#inference-openai-batch-create)
+ [擷取 OpenAI 批次任務](#inference-openai-batch-retrieve)
+ [列出 OpenAI 批次任務](#inference-openai-batch-list)
+ [取消 OpenAI 批次任務](#inference-openai-batch-cancel)

## OpenAI 批次 API 支援的模型和區域
<a name="inference-openai-batch-supported"></a>

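You can use the OpenAI Create batch API with all OpenAI models that are supported in Amazon Bedrock, in the AWS Regions that support those models. For more information about supported models and Regions, see [Supported foundation models in Amazon Bedrock](models-supported.md).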

## 使用 OpenAI 批次 API 的必要條件
<a name="inference-openai-batch-prereq"></a>

若要查看使用 OpenAI 批次 API 操作的必要條件，請選擇您偏好方法的索引標籤，然後遵循下列步驟：

------
#### [ OpenAI SDK ]
+ **身分驗證** – OpenAI SDK 僅支援使用 Amazon Bedrock API 金鑰進行身分驗證。產生 Amazon Bedrock API 金鑰來驗證您的請求。若要了解 Amazon Bedrock API 金鑰以及如何產生金鑰，請參閱建置章節中的 API 金鑰一節。
+ **Endpoint** – Find the endpoint that corresponds to the AWS Region that you want to use in [Amazon Bedrock Runtime endpoints and quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-rt). If you use an AWS SDK, you might only need to specify the Region code, not the whole endpoint, when you set up the client.
+ **模型存取** – 請求存取支援此功能的 Amazon Bedrock 模型。如需詳細資訊，請參閱[使用 SDK 和 CLI 管理模型存取](model-access.md#model-access-modify)。
+ **安裝 OpenAI SDK** – 如需詳細資訊，請參閱 OpenAI 文件中的[程式庫](https://platform.openai.com/docs/libraries)。
+ **上傳至 S3 的批次 JSONL 檔案** – 請遵循 OpenAI 文件中[準備批次檔案](https://platform.openai.com/docs/guides/batch#1-prepare-your-batch-file)的步驟，以正確的格式準備批次檔案。然後，將檔案上傳至 Amazon S3 儲存貯體。
+ **IAM 許可** – 請確定您有下列具有適當許可的 IAM 身分：
  + 您用來進行驗證的 IAM 身分可以執行批次推論相關的 API 操作。如需詳細資訊，請參閱[IAM 身分提交和管理批次推論任務所需的許可](batch-inference-permissions.md)。
  + 您使用的批次推論服務角色可以擔任您的身分、調用您使用的 OpenAI 模型，以及存取 S3 中的批次 JSONL 檔案。如需詳細資訊，請參閱[服務角色](security-iam-sr.md)。

------
#### [ HTTP request ]
+ **身分驗證** – 您可以使用您的 AWS 登入資料或 Amazon Bedrock API 金鑰進行身分驗證。

  設定您的 AWS 登入資料或產生 Amazon Bedrock API 金鑰來驗證您的請求。
  + 若要了解如何設定您的 AWS 登入資料，請參閱[使用 AWS 安全登入資料進行程式設計存取](https://docs.aws.amazon.com/IAM/latest/UserGuide/security-creds-programmatic-access.html)。
  + 若要了解 Amazon Bedrock API 金鑰以及如何產生金鑰，請參閱建置章節中的 API 金鑰一節。
+ **Endpoint** – Find the endpoint that corresponds to the AWS Region that you want to use in [Amazon Bedrock Runtime endpoints and quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-rt). If you use an AWS SDK, you might only need to specify the Region code, not the whole endpoint, when you set up the client.
+ **模型存取** – 請求存取支援此功能的 Amazon Bedrock 模型。如需詳細資訊，請參閱[使用 SDK 和 CLI 管理模型存取](model-access.md#model-access-modify)。
+ **上傳至 S3 的批次 JSONL 檔案** – 請遵循 OpenAI 文件中[準備批次檔案](https://platform.openai.com/docs/guides/batch#1-prepare-your-batch-file)的步驟，以正確的格式準備批次檔案。然後，將檔案上傳至 Amazon S3 儲存貯體。
+ **IAM 許可** – 請確定您有下列具有適當許可的 IAM 身分：
  + 您用來進行驗證的 IAM 身分可以執行批次推論相關的 API 操作。如需詳細資訊，請參閱[IAM 身分提交和管理批次推論任務所需的許可](batch-inference-permissions.md)。
  + 您使用的批次推論服務角色可以擔任您的身分、調用您使用的 OpenAI 模型，以及存取 S3 中的批次 JSONL 檔案。如需詳細資訊，請參閱[服務角色](security-iam-sr.md)。

------

## 建立 OpenAI 批次任務
<a name="inference-openai-batch-create"></a>

如需 OpenAI 建立批次 API 的詳細資訊，請在 OpenAI 文件中參閱下列資源：
+ [建立批次](https://platform.openai.com/docs/api-reference/batch/create) – 詳細說明請求和回應。
+ [請求輸出物件](https://platform.openai.com/docs/api-reference/batch/request-output) – 詳細說明從批次任務產生的輸出欄位。解譯 S3 儲存貯體中的結果時，請參閱本文件。

**形成請求**  
形成批次推論請求時，請注意下列 Amazon Bedrock 特定欄位和值：

**請求標頭**
+ X-Amzn-Bedrock-RoleArn (必要) – 批次推論服務角色的 Amazon Resource Name (ARN)。如需詳細資訊，請參閱[建立批次推論的自訂服務角色](batch-iam-sr.md)
+ X-Amzn-Bedrock-ModelId (必要) – 要在推論中使用的基礎模型 ID。如需詳細資訊，請參閱[Amazon Bedrock 中支援的基礎模型](models-supported.md)。
+ X-Amzn-Bedrock-OutputEncryptionKeyId (選用) – 您要用來加密輸出 S3 檔案的 KMS 金鑰 ID。如需詳細資訊，請參閱[使用 AWS KMS (SSE-KMS) 指定伺服器端加密](https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-kms-encryption.html)。
+ X-Amzn-Bedrock-Tags (選用) – 金鑰和值的字典，指出要連接至輸出的標籤。如需詳細資訊，請參閱[標記 Amazon Bedrock 資源](tagging.md)。

**請求內文參數：**
+ endpoint – Must be `/v1/chat/completions`.
+ input\_file\_id – Specify the S3 URI of your batch JSONL file.
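
For reference, each line of the batch JSONL file that `input_file_id` points to might be generated as in the following sketch. The per-line shape (`custom_id`, `method`, `url`, `body`) follows the OpenAI batch file format as I understand it. The helper name, the request IDs, and the inclusion of a `model` field in the body are assumptions to verify against the OpenAI and Amazon Bedrock documentation.

```python
import json


def openai_batch_line(custom_id, model_id, user_text):
    """Build one line of an OpenAI-style batch input file."""
    return json.dumps({
        "custom_id": custom_id,          # your own identifier for matching results
        "method": "POST",
        "url": "/v1/chat/completions",   # must match the endpoint of the batch job
        "body": {
            "model": model_id,
            "messages": [{"role": "user", "content": user_text}],
        },
    })


lines = [
    openai_batch_line(f"request-{i}", "openai.gpt-oss-20b-1:0", text)
    for i, text in enumerate(["Hello", "Summarize this call: ..."], start=1)
]
batch_file_text = "\n".join(lines) + "\n"
print(batch_file_text)
```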

**尋找產生的結果**  
建立回應包含批次 ID。批次推論任務的結果和錯誤記錄會寫入包含輸入檔案的 S3 資料夾。結果將位於與批次 ID 相同名稱的資料夾中，如下列資料夾結構所示：

```
---- {batch_input_folder}
        |---- {batch_input}.jsonl
        |---- {batch_id}
	           |---- {batch_input}.jsonl.out
	           |---- {batch_input}.jsonl.err
```
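
Given the folder structure above, the expected result locations can be derived from the input file URI and the batch ID. This is a minimal sketch with a hypothetical function name. It assumes the `.out` and `.err` names follow the pattern shown in the structure exactly.

```python
def batch_result_uris(input_file_uri, batch_id):
    """Derive the expected output and error file URIs for a batch job."""
    # Split "s3://bucket/folder/file.jsonl" into its folder and file name.
    folder, _, file_name = input_file_uri.rpartition("/")
    result_folder = f"{folder}/{batch_id}"
    return {
        "output": f"{result_folder}/{file_name}.out",
        "error": f"{result_folder}/{file_name}.err",
    }


uris = batch_result_uris("s3://amzn-s3-demo-bucket/openai-input.jsonl", "batch_abc123")
print(uris["output"])
print(uris["error"])
```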

若要查看搭配不同方法使用 OpenAI 建立批次 API 的範例，請選擇您偏好方法的索引標籤，然後遵循下列步驟：

------
#### [ OpenAI SDK (Python) ]

若要使用 OpenAI SDK 建立批次任務，請執行下列操作：

1. 匯入 OpenAI SDK 並使用下列欄位設定用戶端：
   + `base_url` – Append `/openai/v1` to the Amazon Bedrock Runtime endpoint, in the following format:

     ```
     https://${bedrock-runtime-endpoint}/openai/v1
     ```
   + `api_key` – 指定 Amazon Bedrock API 金鑰。
   + `default_headers` – 如果您需要包含任何標頭，可以將它們作為索引鍵/值對包含在此物件中。您也可以在進行特定 API 呼叫時，在 `extra_headers` 中指定標頭。

1. 搭配用戶端使用 [batches.create()](https://platform.openai.com/docs/api-reference/batch/create) 方法。

在執行下列範例之前，請取代下列欄位中的預留位置：
+ api\_key – Replace *\$AWS\_BEARER\_TOKEN\_BEDROCK* with your actual API key.
+ X-Amzn-Bedrock-RoleArn – Replace *arn:aws:iam::123456789012:role/BatchServiceRole* with the actual batch inference service role that you set up.
+ input\_file\_id – Replace *s3://amzn-s3-demo-bucket/openai-input.jsonl* with the actual S3 URI to which you uploaded your batch JSONL file.

此範例會在 `us-west-2` 中呼叫 OpenAI 建立批次任務 API，並包含一段中繼資料。

```
from openai import OpenAI

client = OpenAI(
    base_url="https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1", 
    api_key="$AWS_BEARER_TOKEN_BEDROCK", # Replace with actual API key
    default_headers={
        "X-Amzn-Bedrock-RoleArn": "arn:aws:iam::123456789012:role/BatchServiceRole" # Replace with actual service role ARN
    }
)

job = client.batches.create(
    input_file_id="s3://amzn-s3-demo-bucket/openai-input.jsonl", # Replace with actual S3 URI
    endpoint="/v1/chat/completions",
    completion_window="24h",
    metadata={
        "description": "test input"
    },
    extra_headers={
        "X-Amzn-Bedrock-ModelId": "openai.gpt-oss-20b-1:0",
    }
)
print(job)
```

------
#### [ HTTP request ]

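To create a batch job with a direct HTTP request, do the following:

1. Use the POST method and specify the URL by appending `/openai/v1/batches` to the Amazon Bedrock Runtime endpoint, in the following format: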

   ```
   https://${bedrock-runtime-endpoint}/openai/v1/batches
   ```

1. 在 `Authorization`標頭中指定您的 AWS 登入資料或 Amazon Bedrock API 金鑰。

在執行下列範例之前，請先取代下列欄位中的預留位置：
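+ Authorization – Replace *\$AWS\_BEARER\_TOKEN\_BEDROCK* with your actual API key.
+ X-Amzn-Bedrock-RoleArn – Replace *arn:aws:iam::123456789012:role/BatchServiceRole* with the actual batch inference service role that you set up.
+ input\_file\_id – Replace *s3://amzn-s3-demo-bucket/openai-input.jsonl* with the actual S3 URI to which you uploaded your batch JSONL file.

This example calls the OpenAI Create batch API in `us-west-2` and includes one piece of metadata: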

```
curl -X POST 'https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1/batches' \
    -H 'Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK' \
    -H 'Content-Type: application/json' \
    -H 'X-Amzn-Bedrock-ModelId: openai.gpt-oss-20b-1:0' \
    -H 'X-Amzn-Bedrock-RoleArn: arn:aws:iam::123456789012:role/BatchServiceRole' \
    -d '{
    "input_file_id": "s3://amzn-s3-demo-bucket/openai-input.jsonl",
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h",
    "metadata": {"description": "test input"}
}'
```

------

## 擷取 OpenAI 批次任務
<a name="inference-openai-batch-retrieve"></a>

如需 OpenAI 擷取批次 API 請求和回應的詳細資訊，請參閱[擷取批次](https://platform.openai.com/docs/api-reference/batch/retrieve)。

當您提出請求時，可以指定要取得資訊的批次任務 ID。回應會傳回批次任務的相關資訊，包括您可以在 S3 儲存貯體中查詢的輸出和錯誤檔案名稱。

若要查看搭配不同方法使用 OpenAI 擷取批次 API 的範例，請選擇您偏好方法的索引標籤，然後遵循下列步驟：

------
#### [ OpenAI SDK (Python) ]

To retrieve a batch job with the OpenAI SDK, do the following:

1. 匯入 OpenAI SDK 並使用下列欄位設定用戶端：
   + `base_url` – 將 Amazon Bedrock 執行時期端點字首設為 `/openai/v1`，格式如下：

     ```
     https://${bedrock-runtime-endpoint}/openai/v1
     ```
   + `api_key` – 指定 Amazon Bedrock API 金鑰。
   + `default_headers` – 如果您需要包含任何標頭，可以將它們作為索引鍵/值對包含在此物件中。您也可以在進行特定 API 呼叫時，在 `extra_headers` 中指定標頭。

1. Use the [batches.retrieve()](https://platform.openai.com/docs/api-reference/batch/retrieve) method with the client and specify the ID of the batch for which to retrieve information.

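Before running the following example, replace the placeholders in the following fields:
+ api_key – Replace *$AWS_BEARER_TOKEN_BEDROCK* with your actual API key.
+ batch_id – Replace *batch_abc123* with the actual ID of the batch job.

This example calls the OpenAI Retrieve batch job API in `us-west-2` on a batch job whose ID is *batch_abc123*.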

```
from openai import OpenAI

client = OpenAI(
    base_url="https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1", 
    api_key="$AWS_BEARER_TOKEN_BEDROCK" # Replace with actual API key
)

job = client.batches.retrieve(batch_id="batch_abc123") # Replace with actual ID

print(job)
```

------
#### [ HTTP request ]

若要使用直接 HTTP 請求擷取批次任務，請執行下列操作：

1. 使用 GET 方法，並將 Amazon Bedrock 執行時期端點字首設定為 `/openai/v1/batches/${batch_id}` 來指定 URL，格式如下：

   ```
   https://${bedrock-runtime-endpoint}/openai/v1/batches/batch_abc123
   ```

1. 在 `Authorization`標頭中指定您的 AWS 登入資料或 Amazon Bedrock API 金鑰。

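Before running the following example, replace the placeholders in the following fields:
+ Authorization – Replace *$AWS_BEARER_TOKEN_BEDROCK* with your actual API key.
+ batch_abc123 – In the path, replace this value with the actual ID of the batch job.

The following example calls the OpenAI Retrieve batch job API in `us-west-2` on a batch job whose ID is *batch_abc123*.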

```
curl -X GET 'https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1/batches/batch_abc123' \
    -H 'Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK'
```

------
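
A batch job runs asynchronously, so a caller usually retrieves it repeatedly until it stops changing. The following is a minimal polling sketch rather than a definitive implementation. The status names come from the OpenAI Batch API and are assumed, not confirmed here, to be the ones Amazon Bedrock returns. `wait_for_batch`, `is_terminal`, and `TERMINAL_STATUSES` are hypothetical helper names introduced for illustration.

```python
import time
from typing import Any, Callable

# Assumption: terminal status names as documented for the OpenAI Batch API.
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}


def is_terminal(status: str) -> bool:
    """Return True when the batch job will make no further progress."""
    return status in TERMINAL_STATUSES


def wait_for_batch(
    retrieve: Callable[[str], Any],
    batch_id: str,
    interval_seconds: float = 30.0,
    max_attempts: int = 120,
) -> Any:
    """Call retrieve(batch_id) until the job reports a terminal status."""
    for attempt in range(max_attempts):
        job = retrieve(batch_id)
        if is_terminal(job.status):
            return job
        # Sleep between checks, but not after the final attempt.
        if attempt < max_attempts - 1:
            time.sleep(interval_seconds)
    raise TimeoutError(f"Batch {batch_id} did not finish after {max_attempts} checks")
```

With the client from the SDK example, the call would look like `wait_for_batch(client.batches.retrieve, "batch_abc123")`. Passing the retrieve function in as a parameter keeps the loop easy to test without network access.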

## 列出 OpenAI 批次任務
<a name="inference-openai-batch-list"></a>

如需 OpenAI 列出批次 API 請求和回應的詳細資訊，請參閱[列出批次](https://platform.openai.com/docs/api-reference/batch/list)。回應會傳回批次任務的相關資訊陣列。

當您提出請求時，可以包含查詢參數來篩選結果。回應會傳回批次任務的相關資訊，包括您可以在 S3 儲存貯體中查詢的輸出和錯誤檔案名稱。

若要查看搭配不同方法使用 OpenAI 列出批次 API 的範例，請選擇您偏好方法的索引標籤，然後遵循下列步驟：

------
#### [ OpenAI SDK (Python) ]

若要使用 OpenAI SDK 列出批次任務，請執行下列操作：

1. 匯入 OpenAI SDK 並使用下列欄位設定用戶端：
   + `base_url` – 將 Amazon Bedrock 執行時期端點字首設為 `/openai/v1`，格式如下：

     ```
     https://${bedrock-runtime-endpoint}/openai/v1
     ```
   + `api_key` – 指定 Amazon Bedrock API 金鑰。
   + `default_headers` – 如果您需要包含任何標頭，可以將它們作為索引鍵/值對包含在此物件中。您也可以在進行特定 API 呼叫時，在 `extra_headers` 中指定標頭。

1. 搭配用戶端使用 [batches.list()](https://platform.openai.com/docs/api-reference/batch/list) 方法。您可以包含任一選用參數。

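Before running the following example, replace the placeholders in the following fields:
+ api_key – Replace *$AWS_BEARER_TOKEN_BEDROCK* with your actual API key.

This example calls the OpenAI List batch jobs API in `us-west-2` and specifies a limit of 2 results to return.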

```
from openai import OpenAI

client = OpenAI(
    base_url="https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1", 
    api_key="$AWS_BEARER_TOKEN_BEDROCK" # Replace with actual API key
)

job = client.batches.list(limit=2)

print(job)
```

------
#### [ HTTP request ]

若要使用直接 HTTP 請求列出批次任務，請執行下列操作：

1. 使用 GET 方法，並將 Amazon Bedrock 執行時期端點字首設定為 `/openai/v1/batches` 來指定 URL，格式如下：

   ```
   https://${bedrock-runtime-endpoint}/openai/v1/batches
   ```

   您可以包含任一選用查詢參數。

1. 在 `Authorization`標頭中指定您的 AWS 登入資料或 Amazon Bedrock API 金鑰。

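Before running the following example, replace the placeholders in the following fields:
+ Authorization – Replace *$AWS_BEARER_TOKEN_BEDROCK* with your actual API key.

The following example calls the OpenAI List batches API in `us-west-2` and specifies a limit of 2 results to return.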

```
curl -X GET 'https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1/batches?limit=2' \
    -H 'Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK'
```

------

## 取消 OpenAI 批次任務
<a name="inference-openai-batch-cancel"></a>

如需 OpenAI 取消批次 API 請求和回應的詳細資訊，請參閱[取消批次](https://platform.openai.com/docs/api-reference/batch/cancel)。回應會傳回已取消批次任務的相關資訊。

當您提出請求時，可以指定要取消的批次任務 ID。

若要查看搭配不同方法使用 OpenAI 取消批次 API 的範例，請選擇您偏好方法的索引標籤，然後遵循下列步驟：

------
#### [ OpenAI SDK (Python) ]

若要使用 OpenAI SDK 取消批次任務，請執行下列操作：

1. 匯入 OpenAI SDK 並使用下列欄位設定用戶端：
   + `base_url` – 將 Amazon Bedrock 執行時期端點字首設為 `/openai/v1`，格式如下：

     ```
     https://${bedrock-runtime-endpoint}/openai/v1
     ```
   + `api_key` – 指定 Amazon Bedrock API 金鑰。
   + `default_headers` – 如果您需要包含任何標頭，可以將它們作為索引鍵/值對包含在此物件中。您也可以在進行特定 API 呼叫時，在 `extra_headers` 中指定標頭。

1. Use the [batches.cancel()](https://platform.openai.com/docs/api-reference/batch/cancel) method with the client and specify the ID of the batch to cancel.

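Before running the following example, replace the placeholders in the following fields:
+ api_key – Replace *$AWS_BEARER_TOKEN_BEDROCK* with your actual API key.
+ batch_id – Replace *batch_abc123* with the actual ID of the batch job.

This example calls the OpenAI Cancel batch job API in `us-west-2` on a batch job whose ID is *batch_abc123*.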

```
from openai import OpenAI

client = OpenAI(
    base_url="https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1", 
    api_key="$AWS_BEARER_TOKEN_BEDROCK" # Replace with actual API key
)

job = client.batches.cancel(batch_id="batch_abc123") # Replace with actual ID

print(job)
```

------
#### [ HTTP request ]

若要使用直接 HTTP 請求取消批次任務，請執行下列操作：

1. 使用 POST 方法，並將 Amazon Bedrock 執行時期端點字首設定為 `/openai/v1/batches/${batch_id}/cancel` 來指定 URL，格式如下：

   ```
   https://${bedrock-runtime-endpoint}/openai/v1/batches/batch_abc123/cancel
   ```

1. 在 `Authorization`標頭中指定您的 AWS 登入資料或 Amazon Bedrock API 金鑰。

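Before running the following example, replace the placeholders in the following fields:
+ Authorization – Replace *$AWS_BEARER_TOKEN_BEDROCK* with your actual API key.
+ batch_abc123 – In the path, replace this value with the actual ID of the batch job.

The following example calls the OpenAI Cancel batch job API in `us-west-2` on a batch job whose ID is *batch_abc123*.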

```
curl -X POST 'https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1/batches/batch_abc123/cancel' \
    -H 'Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK'
```

------

# 透過跨區域推論增加輸送量
<a name="cross-region-inference"></a>

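With cross-Region inference, you can choose either a cross-Region inference profile tied to a specific geography (such as US or EU) or a global inference profile. When you choose an inference profile tied to a specific geography, Amazon Bedrock automatically selects the optimal commercial AWS Region within that geography to process your inference request. With a global inference profile, Amazon Bedrock automatically selects the optimal commercial AWS Region to process the request, which optimizes available resources and increases model throughput.

Both types of cross-Region inference work through [inference profiles](inference-profiles.md), which define a foundation model (FM) and the AWS Regions to which requests can be routed. When you run model inference in on-demand mode, your requests might be restricted by service quotas or during peak usage times. Cross-Region inference lets you seamlessly manage unplanned traffic bursts by using compute across different AWS Regions.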

您也可以購買[佈建輸送量](prov-throughput.md)來增加模型的輸送量。推論設定檔目前不支援佈建輸送量。

若要查看您可以使用推論設定檔來執行跨區域推論的區域和模型，請參閱[推論設定檔支援的區域和模型](inference-profiles-support.md)。

**Topics**
+ [在地理和全域跨區域推論之間進行選擇](#cross-region-inference-comparison)
+ [一般考量事項](#cross-region-inference-general-considerations)
+ [地理跨區域推論](geographic-cross-region-inference.md)
+ [全域跨區域推論](global-cross-region-inference.md)

## 在地理和全域跨區域推論之間進行選擇
<a name="cross-region-inference-comparison"></a>

Amazon Bedrock 提供兩種類型的跨區域推論設定檔，每個設定檔都針對不同的使用案例和合規要求而設計：


| 功能 | 地理跨區域推論 | 全域跨區域推論 | 建議 | 
| --- | --- | --- | --- | 
| 資料落地 | 在地理邊界 （美國、歐洲、亞太區等） 內 | 全球任何支援 AWS 的商業區域 | 選擇地理以滿足合規要求 | 
| 輸送量 | 高於單一區域 | 最高可用 | 選擇全域以獲得最佳效能 | 
| Cost | 標準定價 | 節省約 10% | 選擇全域進行成本最佳化 | 
| SCP 要求 | 允許設定檔中的所有目的地區域 | 允許 "aws:RequestedRegion": "unspecified" | 根據您的組織政策來設定 | 
| 最適合 | 具有資料落地法規的組織 | 組織優先考慮成本和效能 | 評估您的合規和效能需求 | 

當您有資料落地要求且需要確保資料處理保持在特定地理邊界內時，請選擇地理跨區域推論。當您想要在不受地理限制的情況下達到最大輸送量和節省成本時，請選擇全域跨區域推論。

## 一般考量事項
<a name="cross-region-inference-general-considerations"></a>

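Note the following information about cross-Region inference:
+ There is no additional routing cost for using cross-Region inference. The price is calculated based on the Region from which you call the inference profile. For information about pricing, see [Amazon Bedrock pricing](https://aws.amazon.com/bedrock/pricing/).
+ Cross-Region inference can route requests to AWS Regions that are not manually enabled in your AWS account. Cross-Region inference works without manually enabling those Regions.
+ All data transmitted during cross-Region operations stays on the AWS network and does not traverse the public internet. Data is encrypted in transit between AWS Regions.
+ All cross-Region inference requests are logged in CloudTrail in the source Region. Look for the `additionalEventData.inferenceRegion` field to identify where a request was processed.
+ AWS services powered by Amazon Bedrock might also use cross-Region inference (CRIS). For more information, see the service-specific documentation.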
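
To make the CloudTrail point concrete, the following sketch reads `additionalEventData.inferenceRegion` from a single event record. The sample record is hypothetical and heavily trimmed, and `inference_region` is a helper name introduced here for illustration. Real CloudTrail events carry many more fields.

```python
import json
from typing import Optional


def inference_region(event_json: str) -> Optional[str]:
    """Return the Region that processed the request, or None if the field is absent."""
    record = json.loads(event_json)
    return (record.get("additionalEventData") or {}).get("inferenceRegion")


# Hypothetical, trimmed CloudTrail record used only for illustration.
sample_event = json.dumps({
    "eventSource": "bedrock.amazonaws.com",
    "awsRegion": "us-east-1",  # source Region where the event is logged
    "additionalEventData": {"inferenceRegion": "us-west-2"},
})

print(inference_region(sample_event))  # us-west-2
```

Returning `None` when the field is missing lets the same helper run over events that were not routed through an inference profile.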

# 地理跨區域推論
<a name="geographic-cross-region-inference"></a>

地理跨區域推論會將資料處理保持在指定的地理邊界 （美國、歐洲、亞太區等） 內，同時提供比單一區域推論更高的輸送量。此選項非常適合具有資料落地要求和合規法規的組織。

## 地理跨區域推論考量
<a name="geographic-cris-considerations"></a>

請注意下列有關地理跨區域推論的資訊：
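+ Cross-Region inference requests made with an inference profile tied to a geography (such as US, EU, and APAC) stay within the AWS Regions that are part of the geography where the data originally resides. For example, a request made within the US stays within AWS Regions in the US. Although data remains stored only in the source Region, your input prompts and output results might move outside of the source Region during cross-Region inference. All data is transmitted encrypted across Amazon's secure network.
+ To see the default quotas for cross-Region throughput when using inference profiles tied to a geography (such as US, EU, and APAC), refer to the **Cross-Region model inference requests per minute for ${Model}** and **Cross-Region model inference tokens per minute for ${Model}** values in [Amazon Bedrock service quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#limits_bedrock) in the *AWS General Reference*.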

## 地理跨區域推論的 IAM 政策需求
<a name="geographic-cris-iam-setup"></a>

若要允許 IAM 使用者或角色叫用地理跨區域推論設定檔，您需要允許存取下列資源：

1. 地理特定的跨區域推論設定檔 （這些設定檔具有地理字首，例如 `us`、`eu`、`apac`)

1. 來源區域中的基礎模型

1. 地理設定檔中列出的所有目的地區域中的基礎模型

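The following example policy grants the permissions required to use the Claude Sonnet 4.5 foundation model with a US geographic cross-Region inference profile, where the source Region is `us-east-1` and the destination Regions are `us-east-1`, `us-east-2`, and `us-west-2`: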

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "GrantGeoCrisInferenceProfileAccess",
            "Effect": "Allow",
            "Action": "bedrock:InvokeModel",
            "Resource": [
                "arn:aws:bedrock:us-east-1:<ACCOUNT_ID>:inference-profile/us.anthropic.claude-sonnet-4-5-20250929-v1:0"
            ]
        },
        {
            "Sid": "GrantGeoCrisModelAccess",
            "Effect": "Allow",
            "Action": "bedrock:InvokeModel",
            "Resource": [
                "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-sonnet-4-5-20250929-v1:0",
                "arn:aws:bedrock:us-east-2::foundation-model/anthropic.claude-sonnet-4-5-20250929-v1:0",
                "arn:aws:bedrock:us-west-2::foundation-model/anthropic.claude-sonnet-4-5-20250929-v1:0"
            ],
            "Condition": {
                "StringEquals": {
                    "bedrock:InferenceProfileArn": "arn:aws:bedrock:us-east-1:<ACCOUNT_ID>:inference-profile/us.anthropic.claude-sonnet-4-5-20250929-v1:0"
                }
            }
        }
    ]
}
```

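The first statement grants `bedrock:InvokeModel` API access to the geographic cross-Region inference profile for requests that originate from the requesting Region. The second statement grants `bedrock:InvokeModel` API access to the foundation model in the requesting Region and in all destination Regions listed in the inference profile.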
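
If you maintain this policy for several models or Regions, generating it can be less error-prone than editing JSON by hand. The following is a minimal sketch under that assumption. `geo_cris_policy` is a hypothetical helper, not an AWS API, and it reproduces the two statements of the example policy from its arguments.

```python
import json
from typing import Dict, List


def geo_cris_policy(
    account_id: str,
    source_region: str,
    destination_regions: List[str],
    profile_id: str,
    model_id: str,
) -> Dict:
    """Build the two-statement IAM policy for a geographic inference profile."""
    profile_arn = (
        f"arn:aws:bedrock:{source_region}:{account_id}:inference-profile/{profile_id}"
    )
    # Cover the source Region plus every destination Region, without duplicates.
    regions = sorted(set(destination_regions) | {source_region})
    model_arns = [
        f"arn:aws:bedrock:{region}::foundation-model/{model_id}" for region in regions
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "GrantGeoCrisInferenceProfileAccess",
                "Effect": "Allow",
                "Action": "bedrock:InvokeModel",
                "Resource": [profile_arn],
            },
            {
                "Sid": "GrantGeoCrisModelAccess",
                "Effect": "Allow",
                "Action": "bedrock:InvokeModel",
                "Resource": model_arns,
                "Condition": {
                    "StringEquals": {"bedrock:InferenceProfileArn": profile_arn}
                },
            },
        ],
    }


policy = geo_cris_policy(
    account_id="123456789012",  # placeholder account ID
    source_region="us-east-1",
    destination_regions=["us-east-1", "us-east-2", "us-west-2"],
    profile_id="us.anthropic.claude-sonnet-4-5-20250929-v1:0",
    model_id="anthropic.claude-sonnet-4-5-20250929-v1:0",
)
print(json.dumps(policy, indent=4))
```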

## 地理跨區域推論的服務控制政策需求
<a name="geographic-cris-scp-setup"></a>

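Many organizations implement Regional access controls through service control policies (SCPs) in AWS Organizations for security and compliance. If your organization's security policy uses SCPs to block unused Regions, you must make sure that your Region-specific SCP conditions allow access to all destination Regions listed in the geographic cross-Region inference profile for your source Region.

For geographic cross-Region inference, you need to understand the relationship between the source Region (where the API call is made) and the destination Regions (where requests can be routed). Check the inference profile documentation to identify all destination Regions for your source Region, and then make sure that your SCPs allow access to all of them.

For example, if you call from us-east-1 (the source Region) with the US Anthropic Claude Sonnet 4.5 geographic profile, requests can be routed to us-east-1, us-east-2, and us-west-2 (the destination Regions). If an SCP restricts access to us-east-1 only, cross-Region inference fails when it attempts to route to us-east-2 or us-west-2. Therefore, you need to allow all three destination Regions in your SCP, regardless of which Region you call from.

When you configure SCPs for Region exclusion, remember that blocking any destination Region in the inference profile prevents cross-Region inference from working properly, even if your source Region remains accessible. For the SCP requirements for global cross-Region inference, see [Service control policy requirements for global cross-Region inference](global-cross-region-inference.md#global-cris-scp-setup).

For stronger security, consider using the `bedrock:InferenceProfileArn` condition to restrict access to specific inference profiles. This lets you grant access to the required Regions while limiting which inference profiles can be used.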
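
A quick way to check an SCP allow list against a profile is to compute which destination Regions it leaves out. The sketch below assumes you already have both lists in hand. `blocked_destinations` is a hypothetical helper name, and the Region lists mirror the US example above.

```python
from typing import Iterable, List


def blocked_destinations(
    allowed_regions: Iterable[str], destination_regions: Iterable[str]
) -> List[str]:
    """Return destination Regions that the SCP allow list does not cover."""
    return sorted(set(destination_regions) - set(allowed_regions))


destinations = ["us-east-1", "us-east-2", "us-west-2"]

# An SCP that allows only the source Region leaves two destinations blocked.
print(blocked_destinations(["us-east-1"], destinations))
# ['us-east-2', 'us-west-2']

# Allowing every destination Region leaves nothing blocked.
print(blocked_destinations(destinations, destinations))
# []
```

An empty result means the allow list covers every destination Region in the profile. Any Region in the result is one that can cause routed requests to fail.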

## 使用地理跨區域推論
<a name="geographic-cris-usage"></a>

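To use geographic cross-Region inference, include an [inference profile](inference-profiles.md) when you run model inference in the following ways:
+ **On-demand model inference**: Specify the ID of the inference profile as the `modelId` when you send an [InvokeModel](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html), [InvokeModelWithResponseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModelWithResponseStream.html), [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html), or [ConverseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ConverseStream.html) request. An inference profile defines one or more Regions to which it can route inference requests that originate from your source Region. Cross-Region inference increases throughput and performance by dynamically routing model invocation requests across the Regions defined in the inference profile. Routing takes user traffic, demand, and resource utilization into account. For more information, see [Submit prompts and generate responses with model inference](inference.md).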
+ **批次推論**：在傳送 [CreateModelInvocationJob](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateModelInvocationJob.html) 請求時，將推論設定檔的 ID 指定為 `modelId`，以批次推論非同步方式提交請求。使用推論設定檔可讓您跨多個 AWS 區域 使用運算，並加快批次任務的處理時間。任務完成後，您可以從來源區域中的 Amazon S3 儲存貯體擷取輸出檔案。
+ **Agents**: Specify the ID of the inference profile in the `foundationModel` field of a [CreateAgent](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_CreateAgent.html) request. For more information, see [Create and configure an agent manually](agents-create.md).
+ **知識庫回應產生**：您可以在查詢知識庫之後產生回應時使用跨區域推論。如需詳細資訊，請參閱[使用查詢和回應測試您的知識庫](knowledge-base-test.md)。
+ **模型評估** – 您可以提交推論設定檔作為模型，以在提交模型評估任務時進行評估。如需詳細資訊，請參閱[評估 Amazon Bedrock 資源的效能](evaluation.md)。
+ **提示管理**：您可以在為提示管理中建立的提示產生回應時使用跨區域推論。如需詳細資訊，請參閱[在 Amazon Bedrock 中使用提示管理來建構和存放可重複使用的提示](prompt-management.md)
+ **提示流程**：在為提示產生回應時，您可以在提示流程中的提示節點中定義內嵌時，使用跨區域推論。如需詳細資訊，請參閱[使用 Amazon Bedrock 流程建置端對端生成式 AI 工作流程](flows.md)。

若要了解如何使用推論設定檔，跨區域傳送模型調用請求，請參閱[在模型調用中使用推論設定檔](inference-profiles-use.md)。

若要進一步了解跨區域推論，請參閱 [Amazon Bedrock 中的跨區域推論入門](https://aws.amazon.com/blogs/machine-learning/getting-started-with-cross-region-inference-in-amazon-bedrock/)。

如需全域跨區域推論的詳細資訊，包括 IAM 設定和服務配額管理，請參閱 [全域跨區域推論](global-cross-region-inference.md)。

# 全域跨區域推論
<a name="global-cross-region-inference"></a>

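Global cross-Region inference extends cross-Region inference beyond geographic boundaries by routing inference requests to supported commercial AWS Regions worldwide, which optimizes available resources and enables higher model throughput.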

## 全域跨區域推論的優點
<a name="global-cris-benefits"></a>

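Global cross-Region inference for Anthropic's Claude Sonnet 4.5 offers several advantages over traditional geographic cross-Region inference profiles:
+ **Enhanced throughput during peak demand** – Global cross-Region inference improves resilience during periods of peak demand by automatically routing requests to AWS Regions with available capacity. This dynamic routing happens seamlessly, without additional configuration or intervention from developers. Unlike traditional approaches that might require complex client-side load balancing between AWS Regions, global cross-Region inference handles traffic spikes automatically. This is especially important for business-critical applications where downtime or degraded performance can have significant financial or reputational impact.
+ **Cost efficiency** – Global cross-Region inference for Anthropic's Claude Sonnet 4.5 offers approximately 10% savings on input and output token pricing compared with geographic cross-Region inference. The price is calculated based on the AWS Region from which the request is made (the source AWS Region). This means that organizations can benefit from improved resilience at a lower cost. This pricing model makes global cross-Region inference a cost-effective option for organizations that want to optimize their generative AI deployments. By improving resource utilization and enabling higher throughput without additional cost, it helps organizations maximize the value of their investment in Amazon Bedrock.
+ **Streamlined monitoring** – When you use global cross-Region inference, CloudWatch and CloudTrail continue to record log entries in your source AWS Region, which simplifies observability and management. Although your requests are processed in AWS Regions worldwide, you keep a centralized view of your application's performance and usage patterns through familiar AWS monitoring tools.
+ **On-demand quota flexibility** – With global cross-Region inference, your workloads are no longer limited by the capacity of an individual Region. Instead of being restricted to the capacity available in a specific AWS Region, your requests can be routed dynamically across the AWS global infrastructure. This gives you access to a much larger pool of resources, which makes it less complicated to handle high-volume workloads and sudden traffic spikes.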

## 全域跨區域推論考量事項
<a name="global-cris-considerations"></a>

請注意下列有關全域跨區域推論的資訊：
+ 全域跨區域推論設定檔的輸送量高於與特定地理位置繫結的推論設定檔。與特定地理位置繫結的推論設定檔可提供比單一區域推論更高的輸送量。
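+ To see the default quotas for cross-Region throughput when using global inference profiles, refer to the **Global cross-Region model inference requests per minute for ${Model}** and **Global cross-Region model inference tokens per minute for ${Model}** values in [Amazon Bedrock service quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#limits_bedrock) in the *AWS General Reference*.

  You can request, view, and manage quotas for global cross-Region inference profiles from the [Service Quotas console](https://console.aws.amazon.com/servicequotas/home/services/bedrock/quotas) or by using AWS CLI commands in the **source** Region.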

## 全域跨區域推論的 IAM 政策需求
<a name="global-cris-iam-setup"></a>

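To enable global cross-Region inference for your users, you must apply a three-part IAM policy to the role. The following example IAM policy provides granular control. In the example policy, replace `<REQUESTING REGION>` with the AWS Region in which you operate.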

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "GrantGlobalCrisInferenceProfileRegionAccess",
            "Effect": "Allow",
            "Action": "bedrock:InvokeModel",
            "Resource": [
                "arn:aws:bedrock:<REQUESTING REGION>:<ACCOUNT>:inference-profile/global.<MODEL NAME>"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:RequestedRegion": "<REQUESTING REGION>"
                }
            }
        },
        {
            "Sid": "GrantGlobalCrisInferenceProfileInRegionModelAccess",
            "Effect": "Allow",
            "Action": "bedrock:InvokeModel",
            "Resource": [
                "arn:aws:bedrock:<REQUESTING REGION>::foundation-model/<MODEL NAME>"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:RequestedRegion": "<REQUESTING REGION>",
                    "bedrock:InferenceProfileArn": "arn:aws:bedrock:<REQUESTING REGION>:<ACCOUNT>:inference-profile/global.<MODEL NAME>"
                }
            }
        },
        {
            "Sid": "GrantGlobalCrisInferenceProfileGlobalModelAccess",
            "Effect": "Allow",
            "Action": "bedrock:InvokeModel",
            "Resource": [
                "arn:aws:bedrock:::foundation-model/<MODEL NAME>"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:RequestedRegion": "unspecified",
                    "bedrock:InferenceProfileArn": "arn:aws:bedrock:<REQUESTING REGION>:<ACCOUNT>:inference-profile/global.<MODEL NAME>"
                }
            }
        }
    ]
}
```

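The first part of the policy grants access to the Regional inference profile in your requesting AWS Region. The second part provides access to the Regional FM resource. The third part grants access to the global FM resource, which enables the cross-Region routing capability.

When you implement these policies, make sure that all three resource Amazon Resource Names (ARNs) are included in your IAM statements:
+ The Regional inference profile ARN follows the pattern `arn:aws:bedrock:REGION:ACCOUNT:inference-profile/global.MODEL-NAME`. It grants access to the global inference profile in the source AWS Region.
+ The Regional FM uses `arn:aws:bedrock:REGION::foundation-model/MODEL-NAME`. It grants access to the FM in the source AWS Region.
+ The global FM requires `arn:aws:bedrock:::foundation-model/MODEL-NAME`. It grants access to the FM across the different global AWS Regions.

The global FM ARN specifies no AWS Region or account, which is intentional and required for the cross-Region functionality.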
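
The three ARN patterns differ only in which fields are left empty, which is easy to get wrong by hand. As an illustration, the sketch below fills in the patterns from the list above. `global_cris_arns` is a hypothetical helper, and the account ID is a placeholder.

```python
from typing import Dict


def global_cris_arns(region: str, account_id: str, model_name: str) -> Dict[str, str]:
    """Build the three ARNs that a global cross-Region inference policy must cover."""
    return {
        # Global inference profile, addressed in the source Region and account.
        "inference_profile": (
            f"arn:aws:bedrock:{region}:{account_id}:inference-profile/global.{model_name}"
        ),
        # Foundation model in the source Region (the account field is empty).
        "regional_model": f"arn:aws:bedrock:{region}::foundation-model/{model_name}",
        # Global foundation model (both Region and account fields are empty).
        "global_model": f"arn:aws:bedrock:::foundation-model/{model_name}",
    }


arns = global_cris_arns(
    region="us-east-1",
    account_id="123456789012",  # placeholder account ID
    model_name="anthropic.claude-sonnet-4-5-20250929-v1:0",
)
for name, arn in arns.items():
    print(f"{name}: {arn}")
```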

### 停用全域跨區域推論
<a name="global-cris-iam-disable"></a>

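You can choose between two main approaches to deny access to global CRIS for specific IAM roles, each with different use cases and implications:
+ **Remove an IAM policy** – The first approach removes one or more of the three required IAM policies from the user's permissions. Because global CRIS requires all three policies to function, removing one results in denied access.
+ **Implement a deny policy** – The second approach implements an explicit deny policy that specifically targets global CRIS inference profiles. This approach clearly documents your security intent and ensures that the explicit deny takes precedence even if someone later accidentally adds the required allow policies. The deny policy should use a `StringEquals` condition that matches the pattern `"aws:RequestedRegion": "unspecified"`. This pattern specifically targets inference profiles with the `global` prefix.

When you implement a deny policy, it is important to understand that global CRIS changes the behavior of the `aws:RequestedRegion` field. Traditional Region-based deny policies that use a `StringEquals` condition with a specific AWS Region name, such as `"aws:RequestedRegion": "us-west-2"`, do not work as expected with global CRIS, because the service sets this field to `global` rather than to the actual destination AWS Region. However, as mentioned earlier, `"aws:RequestedRegion": "unspecified"` results in the deny effect.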
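
As a rough illustration of the second approach, the sketch below assembles an explicit deny policy around the `StringEquals` condition described above. This is an assumption-laden example, not a policy taken from this guide. The statement ID is hypothetical, and the `bedrock:InvokeModel*` action scope and `*` resource are illustrative choices that you should adapt to your own requirements.

```python
import json

deny_global_cris = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyGlobalCrisRouting",  # hypothetical statement ID
            "Effect": "Deny",
            "Action": "bedrock:InvokeModel*",  # assumed action scope
            "Resource": "*",
            "Condition": {
                # Matches requests that global cross-Region inference routes.
                "StringEquals": {"aws:RequestedRegion": "unspecified"}
            },
        }
    ],
}

print(json.dumps(deny_global_cris, indent=4))
```

Because an explicit deny overrides any allow, attaching a policy like this to a role blocks global routing even when the three allow statements are present.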

## 全域跨區域推論的服務控制政策需求
<a name="global-cris-scp-setup"></a>

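For global cross-Region inference, if your organization's security policy uses SCPs to block unused Regions, you must update your Region-specific SCP conditions to allow access with `"aws:RequestedRegion": "unspecified"`. This condition is specific to Amazon Bedrock global cross-Region inference and ensures that requests can be routed to all supported commercial AWS Regions.

The following example SCP blocks all AWS API calls outside the approved Regions while allowing Amazon Bedrock global cross-Region inference calls, which use `"unspecified"` as the Region for global routing: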

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyAllOutsideApprovedRegions",
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            "Condition": {
                "StringNotEquals": {
                    "aws:RequestedRegion": [
                        "us-east-1",
                        "us-east-2",
                        "us-west-2",
                        "unspecified"
                    ]
                }
            }
        }
    ]
}
```

### 停用全域跨區域推論
<a name="global-cris-disable"></a>

具有資料駐留或合規要求的組織應評估全域跨區域推論是否符合其合規架構，因為在其他支援 AWS 的商業區域中可能會處理請求。若要明確停用全域跨區域推論，請實作下列 SCP 政策：

```
{
    "Effect": "Deny",
    "Action": "bedrock:*",
    "Resource": "*",
    "Condition": {
        "StringEquals": {
            "aws:RequestedRegion": "unspecified"
        },
        "ArnLike": {
            "bedrock:InferenceProfileArn": "arn:aws:bedrock:*:*:inference-profile/global.*"
        }
    }
}
```

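This SCP explicitly denies global cross-Region inference because the `"aws:RequestedRegion"` value is `"unspecified"` and the `"ArnLike"` condition targets inference profiles whose ARN has the `global` prefix.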

### AWS Control Tower 實作
<a name="control-tower-scp"></a>

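Manually editing SCPs that are managed by AWS Control Tower is strongly discouraged because it can cause drift. Instead, use the mechanisms that Control Tower provides to manage these exceptions. The core principle is to either extend the existing Region deny control or enable the Regions and then apply a custom conditional blocking policy.

For detailed step-by-step guidance on implementing cross-Region inference with Control Tower, see the blog post [Enable Amazon Bedrock cross-Region inference in multi-account environments](https://aws.amazon.com/blogs/machine-learning/enable-amazon-bedrock-cross-region-inference-in-multi-account-environments/). It covers extending existing Region deny SCPs, enabling denied Regions with custom SCPs, and using Customizations for AWS Control Tower (CfCT) to deploy custom SCPs as infrastructure as code.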

## 全域跨區域推論的請求限制增加
<a name="global-cris-quotas"></a>

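With a global CRIS inference profile, you can use global CRIS from more than 20 supported source AWS Regions. Because this is a global limit, requests to view, manage, or increase quotas for global cross-Region inference profiles must be made through the Service Quotas console or the AWS Command Line Interface (AWS CLI) in the source AWS Region of the request.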

完成下列步驟以請求提高限制：

1. 登入您 AWS 帳戶中的 Service Quotas 主控台。

1. 在導覽窗格中，選擇 **AWS services** (AWS 服務)。

1. 從服務清單中，尋找並選擇 **Amazon Bedrock**。

1. 在 Amazon Bedrock 的配額清單中，使用搜尋篩選條件來尋找特定的全域 CRIS 配額。例如：
   + Anthropic Claude Sonnet 4.5 V1 的每分鐘全域跨區域模型推論字符

1. 選取您要增加的配額。

1. 選擇**在帳戶層級請求增加**。

1. 輸入所需的新配額值。

1. 選擇**請求**以提交您的請求。

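When you calculate the quota increase that you need, remember to account for the burndown rate, which is the rate at which input and output tokens are converted into token quota usage for the throttling system. The following models have a **5x burndown rate for output tokens (1 output token consumes 5 tokens from your quota)**:
+ Anthropic Claude Opus 4
+ Anthropic Claude Sonnet 4.5
+ Anthropic Claude Sonnet 4
+ Anthropic Claude 3.7 Sonnet

For all other models, the burndown rate is **1:1** (1 output token consumes 1 token from your quota). For input tokens, the token-to-quota ratio is 1:1. The total number of tokens per request is calculated as follows: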

`Input token count + Cache write input tokens + (Output token count x Burndown rate)`
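
The formula is easy to apply as a small function when you size a quota request. The sketch below is a worked example with made-up token counts. `quota_tokens` is a hypothetical helper name.

```python
def quota_tokens(
    input_tokens: int,
    cache_write_input_tokens: int,
    output_tokens: int,
    burndown_rate: int = 1,
) -> int:
    """Tokens deducted from the quota for one request, per the formula above."""
    return input_tokens + cache_write_input_tokens + output_tokens * burndown_rate


# Illustrative request: 1,000 input, 200 cache-write, and 500 output tokens.
print(quota_tokens(1000, 200, 500, burndown_rate=5))  # 3700 with a 5x burndown rate
print(quota_tokens(1000, 200, 500))                   # 1700 with the 1:1 rate
```

For the same request, a model with the 5x output burndown rate consumes more than twice the quota, so output-heavy workloads are the ones to size most carefully.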

## 使用全域跨區域推論
<a name="global-cris-usage"></a>

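To use global cross-Region inference with Anthropic's Claude Sonnet 4.5, developers must complete the following key steps:
+ **Use the global inference profile ID** – When you make API calls to Amazon Bedrock, specify the global Anthropic Claude Sonnet 4.5 inference profile ID (`global.anthropic.claude-sonnet-4-5-20250929-v1:0`) instead of an AWS Region-specific model ID.
+ **Configure IAM permissions** – Grant the appropriate IAM permissions to access the inference profile and the FMs in potential destination AWS Regions.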

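Global cross-Region inference is supported for the following:
+ On-demand model inference
+ Batch inference
+ Agents
+ Model evaluation
+ Prompt management
+ Prompt flows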

**注意**  
隨需模型推論、批次推論、代理程式、模型評估、提示管理和提示流程支援全域推論設定檔。

## 實作全域跨區域推論
<a name="global-cris-implementation"></a>

使用 Anthropic 的 Claude Sonnet 4.5 實作全域跨區域推論非常簡單，只需要對現有應用程式程式碼進行一些變更。以下是如何在 Python 中更新程式碼的範例：

```
import boto3

# Create the runtime client in the source Region; Amazon Bedrock selects the destination Region.
bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')

# The "global." prefix selects the global cross-Region inference profile.
model_id = "global.anthropic.claude-sonnet-4-5-20250929-v1:0"

response = bedrock.converse(
    messages=[{"role": "user", "content": [{"text": "Explain cloud computing in 2 sentences."}]}],
    modelId=model_id,
)

# Print the generated text and the token usage reported for the request.
print("Response:", response['output']['message']['content'][0]['text'])
print("Token usage:", response['usage'])
print("Total tokens:", response['usage']['totalTokens'])

# 使用推論設定檔設定模型調用資源
<a name="inference-profiles"></a>

*推論設定檔*是 Amazon Bedrock 中的資源，可定義模型以及一或多個區域，而推論設定檔可將模型調用請求路由到這些區域。您可以針對下列任務使用推論設定檔：
+ **追蹤用量指標** – 設定 CloudWatch 日誌，並使用應用程式推論設定檔提交模型調用請求，以收集模型調用的用量指標。您可以在檢視推論設定檔的相關資訊時檢查這些指標，並使用它們來通知您的決策。如需有關如何設定 CloudWatch 日誌的詳細資訊，請參閱 [使用 CloudWatch Logs 和 Amazon S3 監控模型調用](model-invocation-logging.md)。
+ **使用標籤來監控成本** – 將標籤連接至應用程式推論設定檔，以便在提交隨需模型調用請求時追蹤成本。如需如何使用標籤進行成本分配的詳細資訊，請參閱《 AWS Billing 使用者指南》中的[使用成本分配標籤組織和追蹤 AWS 成本](https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/cost-alloc-tags.html)。
+ **跨區域推論** – 使用包含多個 AWS 區域的推論設定檔來提高輸送量。推論設定檔會將模型調用請求分散到這些區域，以提高輸送量和效能。如需跨區域推論的詳細資訊，請參閱 [透過跨區域推論增加輸送量](cross-region-inference.md)。

Amazon Bedrock 提供下列類型的推論設定檔：
+ **跨區域 (系統定義) 推論設定檔** – 在 Amazon Bedrock 中預先定義的推論設定檔，並包含多個可路由模型請求的區域。
+ **應用程式推論設定檔** – 使用者為追蹤成本和模型用量而建立的推論設定檔。您可以建立推論設定檔，將模型調用請求路由到一個區域或多個區域：
  + 若要建立推論設定檔來追蹤某個區域中模型的成本和用量，請在您要推論設定檔路由請求的區域中指定基礎模型。
  + 若要建立追蹤跨多個區域之模型的成本和用量的推論設定檔，請指定跨區域 (系統定義) 推論設定檔，以定義您需要推論設定檔路由請求的模型和區域。
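
The two cases above differ only in the ARN that the application inference profile copies from. The following sketch builds the request parameters for either case. It assumes the `CreateInferenceProfile` parameter names `inferenceProfileName`, `modelSource.copyFrom`, and `tags`, so verify them against the API reference before use. `application_profile_request`, the profile name, and the tag values are hypothetical.

```python
from typing import Dict


def application_profile_request(
    name: str, source_arn: str, tags: Dict[str, str]
) -> Dict:
    """Build keyword arguments for creating an application inference profile.

    source_arn is a foundation model ARN (single Region) or a cross-Region
    (system-defined) inference profile ARN (multiple Regions).
    """
    return {
        "inferenceProfileName": name,
        "modelSource": {"copyFrom": source_arn},
        "tags": [{"key": key, "value": value} for key, value in sorted(tags.items())],
    }


# Multi-Region case: copy from a system-defined cross-Region inference profile.
request = application_profile_request(
    name="team-a-claude",  # hypothetical profile name
    source_arn=(
        "arn:aws:bedrock:us-east-1:123456789012:"
        "inference-profile/us.anthropic.claude-sonnet-4-5-20250929-v1:0"
    ),
    tags={"team": "a", "project": "demo"},  # hypothetical cost allocation tags
)
print(request)
```

Under those assumptions, you would pass the result to the control plane client, for example `boto3.client("bedrock").create_inference_profile(**request)`.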

您可以使用推論設定檔搭配下列功能，將請求路由到多個區域，並追蹤使用這些功能發出的調用請求的用量和成本：
+ 模型推論 – 在 Amazon Bedrock 主控台的遊樂場中選擇推論設定檔，或在呼叫 [InvokeModel](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html)、[InvokeModelWithResponseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModelWithResponseStream.html)、[Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html) 和 [ConverseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ConverseStream.html) 操作時指定推論設定檔的 ARN，以在執行模型調用時使用推論設定檔。如需詳細資訊，請參閱[提交提示並使用模型推論產生回應](inference.md)。
+ 知識庫向量內嵌和回應產生 – 在查詢知識庫或剖析資料來源中的非文字資訊之後產生回應時，使用推論設定檔。如需詳細資訊，請參閱[使用查詢和回應測試您的知識庫](knowledge-base-test.md)及[剖析資料來源的選項](kb-advanced-parsing.md)。
+ 模型評估 – 您可以提交推論設定檔作為模型，以在提交模型評估任務時進行評估。如需詳細資訊，請參閱[評估 Amazon Bedrock 資源的效能](evaluation.md)。
+ 提示管理 – 您可以在為提示管理中建立的提示產生回應時，使用推論設定檔。如需詳細資訊，請參閱[在 Amazon Bedrock 中使用提示管理來建構和存放可重複使用的提示](prompt-management.md)
+ 流程 – 您可以在為在流程中的提示節點中定義內嵌的提示產生回應時，使用推論設定檔。如需詳細資訊，請參閱[使用 Amazon Bedrock 流程建置端對端生成式 AI 工作流程](flows.md)。

使用推論設定檔的價格是根據您呼叫推論設定檔之區域中模型的價格來計算。如需定價的資訊，請參閱 [Amazon Bedrock 定價](https://aws.amazon.com/bedrock/pricing/)。

如需跨區域推論設定檔可提供之輸送量的詳細資訊，請參閱 [透過跨區域推論增加輸送量](cross-region-inference.md)。

**Topics**
+ [推論設定檔支援的區域和模型](inference-profiles-support.md)
+ [推論設定檔的必要條件](inference-profiles-prereq.md)
+ [建立應用程式推論設定檔](inference-profiles-create.md)
+ [修改應用程式推論設定檔的標籤](inference-profiles-modify.md)
+ [檢視推論設定檔的相關資訊](inference-profiles-view.md)
+ [在模型調用中使用推論設定檔](inference-profiles-use.md)
+ [刪除應用程式推論設定檔](inference-profiles-delete.md)

# 推論設定檔支援的區域和模型
<a name="inference-profiles-support"></a>

如需 Amazon Bedrock 中支援的區域代碼和端點清單，請參閱《[Amazon Bedrock 端點和配額](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#bedrock_region)》。本主題說明您可以使用的預先定義推論設定檔，以及支援應用程式推論設定檔的區域和模型。

**Topics**
+ [支援的跨區域推論設定檔](#inference-profiles-support-system)
+ [應用程式推論設定檔支援的區域和模型](#inference-profiles-support-user)

## 支援的跨區域推論設定檔
<a name="inference-profiles-support-system"></a>

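You can use cross-Region (system-defined) inference profiles for [cross-Region inference](cross-region-inference.md). Cross-Region inference lets you seamlessly manage unplanned traffic bursts by using compute across different AWS Regions. With cross-Region inference, you can distribute traffic across multiple AWS Regions.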

跨區域 (系統定義) 推論設定檔是以其支援的模型命名，並由其支援的區域定義。若要了解跨區域推論設定檔如何處理您的請求，請檢閱下列定義：
+ **來源區域** – 您從中發出指定推論設定檔之 API 請求的區域。
+ **目的地區域** – Amazon Bedrock 服務可對其路由來自您來源區域之請求的區域。

當您在 Amazon Bedrock 中調用跨區域推論設定檔時，您的請求會源自來源區域，並且會自動路由到該設定檔中定義的其中一個目的地區域，以最佳化效能。全域跨區域推論設定檔的目的地區域包含所有商業區域。

**注意**  
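Destination Regions in a cross-Region inference profile can include *opt-in Regions*, which are Regions that you must explicitly enable at the AWS account or organization level. To learn more, see [Enable or disable AWS Regions in your account](https://docs.aws.amazon.com/accounts/latest/reference/manage-acct-regions.html). When you use a cross-Region inference profile, your inference requests can be routed to any destination Region in the profile, even if you have not opted in to such Regions in your account.

Service control policies (SCPs) and AWS Identity and Access Management (IAM) policies work together to control where cross-Region inference is allowed. You can use SCPs to control which Regions Amazon Bedrock can use for inference, and IAM policies to define which users or roles have permission to run inference. If any destination Region in a cross-Region inference profile is blocked in your SCPs, the request fails even if other Regions remain allowed. To ensure that cross-Region inference operates effectively, update your SCPs and IAM policies to allow all required Amazon Bedrock inference actions (for example, `bedrock:InvokeModel*` or `bedrock:CreateModelInvocationJob`) in all destination Regions included in your chosen inference profile. To learn more, see [Enabling Amazon Bedrock cross-Region inference in multi-account environments](https://aws.amazon.com/blogs/machine-learning/enable-amazon-bedrock-cross-region-inference-in-multi-account-environments/).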

**注意**  
有些推論設定檔會根據您呼叫它的來源區域，路由到不同的目的地區域。例如，如果您從美國東部 (俄亥俄) 呼叫 `us.anthropic.claude-3-haiku-20240307-v1:0`，其可以將請求路由到 `us-east-1`、`us-east-2` 或 `us-west-2`，但如果您從美國西部 (奧勒岡) 呼叫，則只能將請求路由到 `us-east-1` 和 `us-west-2`。

若要檢查推論設定檔的來源和目的地區域，您可以執行下列其中一項操作：
+ 在[支援的跨區域推論設定檔清單](#inference-profiles-support)中展開對應的區段。
+ 從來源區域傳送具有 [Amazon Bedrock 控制平面端點](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp)的 [GetInferenceProfile](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GetInferenceProfile.html) 請求，並在 `inferenceProfileIdentifier` 欄位中指定推論設定檔的 Amazon Resource Name (ARN) 或 ID。回應中的 `models` 欄位會映射至模型 ARN 清單，您可以在其中識別每個目的地區域。
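
For the second option, the destination Regions can be read from the Region field of each model ARN in the response. The sketch below works on a hypothetical, trimmed response that contains only the `models` field described above. `destination_regions` is a helper name introduced for illustration.

```python
from typing import Dict, List


def destination_regions(response: Dict) -> List[str]:
    """Extract the distinct destination Regions from the model ARNs in a response."""
    regions: List[str] = []
    for model in response.get("models", []):
        # ARN layout: arn:partition:service:region:account:resource
        region = model["modelArn"].split(":")[3]
        if region and region not in regions:
            regions.append(region)
    return regions


# Hypothetical, trimmed response used only for illustration.
sample_response = {
    "models": [
        {"modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0"},
        {"modelArn": "arn:aws:bedrock:us-east-2::foundation-model/anthropic.claude-3-haiku-20240307-v1:0"},
        {"modelArn": "arn:aws:bedrock:us-west-2::foundation-model/anthropic.claude-3-haiku-20240307-v1:0"},
    ]
}

print(destination_regions(sample_response))
# ['us-east-1', 'us-east-2', 'us-west-2']
```

The helper skips an empty Region field, so a Region-less ARN such as the global foundation model pattern does not add a blank entry to the list.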

**注意**  
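The global cross-Region inference profile for a specific model can change over time as AWS adds more commercial Regions where your requests can be processed. However, if an inference profile is tied to a geography (such as US, EU, or APAC), its list of destination Regions never changes. AWS might create new inference profiles that incorporate new Regions. You can update your systems to use these inference profiles by changing the IDs in your setup to the new ones.  
Global cross-Region inference profiles are currently supported only for the Anthropic Claude Sonnet 4 model from the following source Regions: US West (Oregon), US East (N. Virginia), US East (Ohio), Europe (Ireland), and Asia Pacific (Tokyo). The destination Regions for the global inference profile include all commercial AWS Regions.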

展開下列其中一個區段，以查看跨區域推論設定檔、可呼叫推論設定檔的來源區域，以及可路由請求的目的地區域的相關資訊。

### GLOBAL Amazon Nova 2 Lite
<a name="cross-region-ip-global.amazon.nova-2-lite-v1:0"></a>

若要呼叫 GLOBAL Amazon Nova 2 Lite 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
global.amazon.nova-2-lite-v1:0
```

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| ap-east-2 |  Commercial AWS Regions ap-east-2  | 
| ap-northeast-1 |  Commercial AWS Regions ap-northeast-1  | 
| ap-northeast-2 |  Commercial AWS Regions ap-northeast-2  | 
| ap-south-1 |  Commercial AWS Regions ap-south-1  | 
| ap-southeast-1 |  Commercial AWS Regions ap-southeast-1  | 
| ap-southeast-2 |  Commercial AWS Regions ap-southeast-2  | 
| ap-southeast-3 |  Commercial AWS Regions ap-southeast-3  | 
| ap-southeast-4 |  Commercial AWS Regions ap-southeast-4  | 
| ap-southeast-5 |  Commercial AWS Regions ap-southeast-5  | 
| ap-southeast-7 |  Commercial AWS Regions ap-southeast-7  | 
| ca-central-1 |  Commercial AWS Regions ca-central-1  | 
| ca-west-1 |  Commercial AWS Regions ca-west-1  | 
| eu-central-1 |  Commercial AWS Regions eu-central-1  | 
| eu-north-1 |  Commercial AWS Regions eu-north-1  | 
| eu-south-1 |  Commercial AWS Regions eu-south-1  | 
| eu-south-2 |  Commercial AWS Regions eu-south-2  | 
| eu-west-1 |  Commercial AWS Regions eu-west-1  | 
| eu-west-2 |  Commercial AWS Regions eu-west-2  | 
| eu-west-3 |  Commercial AWS Regions eu-west-3  | 
| il-central-1 |  Commercial AWS Regions il-central-1  | 
| me-central-1 |  Commercial AWS Regions me-central-1  | 
| us-east-1 |  Commercial AWS Regions us-east-1  | 
| us-east-2 |  Commercial AWS Regions us-east-2  | 
| us-west-1 |  Commercial AWS Regions us-west-1  | 
| us-west-2 |  Commercial AWS Regions us-west-2  | 

### GLOBAL Anthropic Claude Opus 4.5
<a name="cross-region-ip-global.anthropic.claude-opus-4-5-20251101-v1:0"></a>

若要呼叫 GLOBAL Anthropic Claude Opus 4.5 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
global.anthropic.claude-opus-4-5-20251101-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-claude.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| af-south-1 |  Commercial AWS Regions af-south-1  | 
| ap-east-2 |  Commercial AWS Regions ap-east-2  | 
| ap-northeast-1 |  Commercial AWS Regions ap-northeast-1  | 
| ap-northeast-2 |  Commercial AWS Regions ap-northeast-2  | 
| ap-northeast-3 |  Commercial AWS Regions ap-northeast-3  | 
| ap-south-1 |  Commercial AWS Regions ap-south-1  | 
| ap-south-2 |  Commercial AWS Regions ap-south-2  | 
| ap-southeast-1 |  Commercial AWS Regions ap-southeast-1  | 
| ap-southeast-2 |  Commercial AWS Regions ap-southeast-2  | 
| ap-southeast-3 |  Commercial AWS Regions ap-southeast-3  | 
| ap-southeast-4 |  Commercial AWS Regions ap-southeast-4  | 
| ap-southeast-5 |  Commercial AWS Regions ap-southeast-5  | 
| ap-southeast-7 |  Commercial AWS Regions ap-southeast-7  | 
| ca-central-1 |  Commercial AWS Regions ca-central-1  | 
| ca-west-1 |  Commercial AWS Regions ca-west-1  | 
| eu-central-1 |  Commercial AWS Regions eu-central-1  | 
| eu-central-2 |  Commercial AWS Regions eu-central-2  | 
| eu-north-1 |  Commercial AWS Regions eu-north-1  | 
| eu-south-1 |  Commercial AWS Regions eu-south-1  | 
| eu-south-2 |  Commercial AWS Regions eu-south-2  | 
| eu-west-1 |  Commercial AWS Regions eu-west-1  | 
| eu-west-2 |  Commercial AWS Regions eu-west-2  | 
| eu-west-3 |  Commercial AWS Regions eu-west-3  | 
| il-central-1 |  Commercial AWS Regions il-central-1  | 
| me-central-1 |  Commercial AWS Regions me-central-1  | 
| me-south-1 |  Commercial AWS Regions me-south-1  | 
| mx-central-1 |  Commercial AWS Regions mx-central-1  | 
| sa-east-1 |  Commercial AWS Regions sa-east-1  | 
| us-east-1 |  Commercial AWS Regions us-east-1  | 
| us-east-2 |  Commercial AWS Regions us-east-2  | 
| us-west-1 |  Commercial AWS Regions us-west-1  | 
| us-west-2 |  Commercial AWS Regions us-west-2  | 

### GLOBAL TwelveLabs Pegasus 1.2 版
<a name="cross-region-ip-global.twelvelabs.pegasus-1-2-v1:0"></a>

若要呼叫 GLOBAL TwelveLabs Pegasus 1.2 版推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
global.twelvelabs.pegasus-1-2-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-pegasus.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| af-south-1 |  Commercial AWS Regions af-south-1  | 
| ap-east-2 |  Commercial AWS Regions ap-east-2  | 
| ap-northeast-1 |  Commercial AWS Regions ap-northeast-1  | 
| ap-northeast-2 |  Commercial AWS Regions ap-northeast-2  | 
| ap-northeast-3 |  Commercial AWS Regions ap-northeast-3  | 
| ap-south-1 |  Commercial AWS Regions ap-south-1  | 
| ap-south-2 |  Commercial AWS Regions ap-south-2  | 
| ap-southeast-1 |  Commercial AWS Regions ap-southeast-1  | 
| ap-southeast-2 |  Commercial AWS Regions ap-southeast-2  | 
| ap-southeast-3 |  Commercial AWS Regions ap-southeast-3  | 
| ap-southeast-4 |  Commercial AWS Regions ap-southeast-4  | 
| ap-southeast-5 |  Commercial AWS Regions ap-southeast-5  | 
| ap-southeast-7 |  Commercial AWS Regions ap-southeast-7  | 
| ca-central-1 |  Commercial AWS Regions ca-central-1  | 
| ca-west-1 |  Commercial AWS Regions ca-west-1  | 
| eu-central-1 |  Commercial AWS Regions eu-central-1  | 
| eu-central-2 |  Commercial AWS Regions eu-central-2  | 
| eu-north-1 |  Commercial AWS Regions eu-north-1  | 
| eu-south-1 |  Commercial AWS Regions eu-south-1  | 
| eu-south-2 |  Commercial AWS Regions eu-south-2  | 
| eu-west-1 |  Commercial AWS Regions eu-west-1  | 
| eu-west-2 |  Commercial AWS Regions eu-west-2  | 
| eu-west-3 |  Commercial AWS Regions eu-west-3  | 
| il-central-1 |  Commercial AWS Regions il-central-1  | 
| me-central-1 |  Commercial AWS Regions me-central-1  | 
| me-south-1 |  Commercial AWS Regions me-south-1  | 
| mx-central-1 |  Commercial AWS Regions mx-central-1  | 
| sa-east-1 |  Commercial AWS Regions sa-east-1  | 
| us-east-1 |  Commercial AWS Regions us-east-1  | 
| us-east-2 |  Commercial AWS Regions us-east-2  | 
| us-west-1 |  Commercial AWS Regions us-west-1  | 
| us-west-2 |  Commercial AWS Regions us-west-2  | 

### Global Anthropic Claude Haiku 4.5
<a name="cross-region-ip-global.anthropic.claude-haiku-4-5-20251001-v1:0"></a>

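To call the Global Anthropic Claude Haiku 4.5 inference profile, specify the following inference profile ID in one of the source Regions:

```
global.anthropic.claude-haiku-4-5-20251001-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 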
| --- | --- | 
| af-south-1 |  Commercial AWS Regions af-south-1  | 
| ap-east-2 |  Commercial AWS Regions ap-east-2  | 
| ap-northeast-1 |  Commercial AWS Regions ap-northeast-1  | 
| ap-northeast-2 |  Commercial AWS Regions ap-northeast-2  | 
| ap-northeast-3 |  Commercial AWS Regions ap-northeast-3  | 
| ap-south-1 |  Commercial AWS Regions ap-south-1  | 
| ap-south-2 |  Commercial AWS Regions ap-south-2  | 
| ap-southeast-1 |  Commercial AWS Regions ap-southeast-1  | 
| ap-southeast-2 |  Commercial AWS Regions ap-southeast-2  | 
| ap-southeast-3 |  Commercial AWS Regions ap-southeast-3  | 
| ap-southeast-4 |  Commercial AWS Regions ap-southeast-4  | 
| ap-southeast-5 |  Commercial AWS Regions ap-southeast-5  | 
| ap-southeast-7 |  Commercial AWS Regions ap-southeast-7  | 
| ca-central-1 |  Commercial AWS Regions ca-central-1  | 
| ca-west-1 |  Commercial AWS Regions ca-west-1  | 
| eu-central-1 |  Commercial AWS Regions eu-central-1  | 
| eu-central-2 |  Commercial AWS Regions eu-central-2  | 
| eu-north-1 |  Commercial AWS Regions eu-north-1  | 
| eu-south-1 |  Commercial AWS Regions eu-south-1  | 
| eu-south-2 |  Commercial AWS Regions eu-south-2  | 
| eu-west-1 |  Commercial AWS Regions eu-west-1  | 
| eu-west-2 |  Commercial AWS Regions eu-west-2  | 
| eu-west-3 |  Commercial AWS Regions eu-west-3  | 
| il-central-1 |  Commercial AWS Regions il-central-1  | 
| me-central-1 |  Commercial AWS Regions me-central-1  | 
| me-south-1 |  Commercial AWS Regions me-south-1  | 
| mx-central-1 |  Commercial AWS Regions mx-central-1  | 
| sa-east-1 |  Commercial AWS Regions sa-east-1  | 
| us-east-1 |  Commercial AWS Regions us-east-1  | 
| us-east-2 |  Commercial AWS Regions us-east-2  | 
| us-west-1 |  Commercial AWS Regions us-west-1  | 
| us-west-2 |  Commercial AWS Regions us-west-2  | 
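
As a minimal sketch of how the profile ID above might be used, the following Python builds the keyword arguments for a Converse request. It assumes the boto3 `bedrock-runtime` client and its `converse` operation. The helper names `build_converse_request` and `send`, the prompt text, and the choice of `us-east-1` as the source Region are illustrative assumptions, not part of this guide.

```python
GLOBAL_HAIKU_4_5 = "global.anthropic.claude-haiku-4-5-20251001-v1:0"


def build_converse_request(profile_id: str, prompt: str) -> dict:
    """Build keyword arguments for a Converse call (hypothetical helper)."""
    return {
        # The inference profile ID is passed where a model ID would go.
        "modelId": profile_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
    }


def send(request: dict, source_region: str = "us-east-1") -> str:
    """Send the request from a source Region; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3 installed

    client = boto3.client("bedrock-runtime", region_name=source_region)
    response = client.converse(**request)
    return response["output"]["message"]["content"][0]["text"]


request = build_converse_request(GLOBAL_HAIKU_4_5, "Hello")
```

Calling `send(request)` would issue the request from the chosen source Region. The builder is kept separate so the request shape can be inspected without network access.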

### Global Anthropic Claude Opus 4.6
<a name="cross-region-ip-global.anthropic.claude-opus-4-6-v1"></a>

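To call the Global Anthropic Claude Opus 4.6 inference profile, specify the following inference profile ID in one of the source Regions:

```
global.anthropic.claude-opus-4-6-v1
```

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 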
| --- | --- | 
| af-south-1 |  Commercial AWS Regions af-south-1  | 
| ap-east-2 |  Commercial AWS Regions ap-east-2  | 
| ap-northeast-1 |  Commercial AWS Regions ap-northeast-1  | 
| ap-northeast-2 |  Commercial AWS Regions ap-northeast-2  | 
| ap-northeast-3 |  Commercial AWS Regions ap-northeast-3  | 
| ap-south-1 |  Commercial AWS Regions ap-south-1  | 
| ap-south-2 |  Commercial AWS Regions ap-south-2  | 
| ap-southeast-1 |  Commercial AWS Regions ap-southeast-1  | 
| ap-southeast-2 |  Commercial AWS Regions ap-southeast-2  | 
| ap-southeast-3 |  Commercial AWS Regions ap-southeast-3  | 
| ap-southeast-4 |  Commercial AWS Regions ap-southeast-4  | 
| ap-southeast-5 |  Commercial AWS Regions ap-southeast-5  | 
| ap-southeast-7 |  Commercial AWS Regions ap-southeast-7  | 
| ca-central-1 |  Commercial AWS Regions ca-central-1  | 
| ca-west-1 |  Commercial AWS Regions ca-west-1  | 
| eu-central-1 |  Commercial AWS Regions eu-central-1  | 
| eu-central-2 |  Commercial AWS Regions eu-central-2  | 
| eu-north-1 |  Commercial AWS Regions eu-north-1  | 
| eu-south-1 |  Commercial AWS Regions eu-south-1  | 
| eu-south-2 |  Commercial AWS Regions eu-south-2  | 
| eu-west-1 |  Commercial AWS Regions eu-west-1  | 
| eu-west-2 |  Commercial AWS Regions eu-west-2  | 
| eu-west-3 |  Commercial AWS Regions eu-west-3  | 
| il-central-1 |  Commercial AWS Regions il-central-1  | 
| me-central-1 |  Commercial AWS Regions me-central-1  | 
| me-south-1 |  Commercial AWS Regions me-south-1  | 
| mx-central-1 |  Commercial AWS Regions mx-central-1  | 
| sa-east-1 |  Commercial AWS Regions sa-east-1  | 
| us-east-1 |  Commercial AWS Regions us-east-1  | 
| us-east-2 |  Commercial AWS Regions us-east-2  | 
| us-west-1 |  Commercial AWS Regions us-west-1  | 
| us-west-2 |  Commercial AWS Regions us-west-2  | 

### Global Anthropic Claude Sonnet 4.6
<a name="cross-region-ip-global.anthropic.claude-sonnet-4-6"></a>

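To call the Global Anthropic Claude Sonnet 4.6 inference profile, specify the following inference profile ID in one of the source Regions:

```
global.anthropic.claude-sonnet-4-6
```

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 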
| --- | --- | 
| af-south-1 |  Commercial AWS Regions af-south-1  | 
| ap-east-2 |  Commercial AWS Regions ap-east-2  | 
| ap-northeast-1 |  Commercial AWS Regions ap-northeast-1  | 
| ap-northeast-2 |  Commercial AWS Regions ap-northeast-2  | 
| ap-northeast-3 |  Commercial AWS Regions ap-northeast-3  | 
| ap-south-1 |  Commercial AWS Regions ap-south-1  | 
| ap-south-2 |  Commercial AWS Regions ap-south-2  | 
| ap-southeast-1 |  Commercial AWS Regions ap-southeast-1  | 
| ap-southeast-2 |  Commercial AWS Regions ap-southeast-2  | 
| ap-southeast-3 |  Commercial AWS Regions ap-southeast-3  | 
| ap-southeast-4 |  Commercial AWS Regions ap-southeast-4  | 
| ap-southeast-5 |  Commercial AWS Regions ap-southeast-5  | 
| ap-southeast-7 |  Commercial AWS Regions ap-southeast-7  | 
| ca-central-1 |  Commercial AWS Regions ca-central-1  | 
| ca-west-1 |  Commercial AWS Regions ca-west-1  | 
| eu-central-1 |  Commercial AWS Regions eu-central-1  | 
| eu-central-2 |  Commercial AWS Regions eu-central-2  | 
| eu-north-1 |  Commercial AWS Regions eu-north-1  | 
| eu-south-1 |  Commercial AWS Regions eu-south-1  | 
| eu-south-2 |  Commercial AWS Regions eu-south-2  | 
| eu-west-1 |  Commercial AWS Regions eu-west-1  | 
| eu-west-2 |  Commercial AWS Regions eu-west-2  | 
| eu-west-3 |  Commercial AWS Regions eu-west-3  | 
| il-central-1 |  Commercial AWS Regions il-central-1  | 
| me-central-1 |  Commercial AWS Regions me-central-1  | 
| me-south-1 |  Commercial AWS Regions me-south-1  | 
| mx-central-1 |  Commercial AWS Regions mx-central-1  | 
| sa-east-1 |  Commercial AWS Regions sa-east-1  | 
| us-east-1 |  Commercial AWS Regions us-east-1  | 
| us-east-2 |  Commercial AWS Regions us-east-2  | 
| us-west-1 |  Commercial AWS Regions us-west-1  | 
| us-west-2 |  Commercial AWS Regions us-west-2  | 

### Global Claude Sonnet 4
<a name="cross-region-ip-global.anthropic.claude-sonnet-4-20250514-v1:0"></a>

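To call the Global Claude Sonnet 4 inference profile, specify the following inference profile ID in one of the source Regions:

```
global.anthropic.claude-sonnet-4-20250514-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 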
| --- | --- | 
| ap-northeast-1 |  Commercial AWS Regions ap-northeast-1  | 
| eu-west-1 |  Commercial AWS Regions eu-west-1  | 
| us-east-1 |  Commercial AWS Regions us-east-1  | 
| us-east-2 |  Commercial AWS Regions us-east-2  | 
| us-west-2 |  Commercial AWS Regions us-west-2  | 
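
Because this profile lists only five source Regions, it can help to check a Region before issuing a call. The following is a small sketch under that assumption. The set is copied from the table above, and the function name `can_call_from` is hypothetical.

```python
# Source Regions copied from the Global Claude Sonnet 4 table above.
SONNET_4_GLOBAL_SOURCES = frozenset(
    {"ap-northeast-1", "eu-west-1", "us-east-1", "us-east-2", "us-west-2"}
)


def can_call_from(region: str, sources: frozenset = SONNET_4_GLOBAL_SOURCES) -> bool:
    """Return True if the profile can be called from the given Region."""
    return region in sources
```

For example, `can_call_from("us-west-1")` is false, because `us-west-1` does not appear as a source Region in the table.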

### Global Claude Sonnet 4.5
<a name="cross-region-ip-global.anthropic.claude-sonnet-4-5-20250929-v1:0"></a>

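To call the Global Claude Sonnet 4.5 inference profile, specify the following inference profile ID in one of the source Regions:

```
global.anthropic.claude-sonnet-4-5-20250929-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 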
| --- | --- | 
| af-south-1 |  Commercial AWS Regions af-south-1  | 
| ap-east-2 |  Commercial AWS Regions ap-east-2  | 
| ap-northeast-1 |  Commercial AWS Regions ap-northeast-1  | 
| ap-northeast-2 |  Commercial AWS Regions ap-northeast-2  | 
| ap-northeast-3 |  Commercial AWS Regions ap-northeast-3  | 
| ap-south-1 |  Commercial AWS Regions ap-south-1  | 
| ap-south-2 |  Commercial AWS Regions ap-south-2  | 
| ap-southeast-1 |  Commercial AWS Regions ap-southeast-1  | 
| ap-southeast-2 |  Commercial AWS Regions ap-southeast-2  | 
| ap-southeast-3 |  Commercial AWS Regions ap-southeast-3  | 
| ap-southeast-4 |  Commercial AWS Regions ap-southeast-4  | 
| ap-southeast-5 |  Commercial AWS Regions ap-southeast-5  | 
| ap-southeast-7 |  Commercial AWS Regions ap-southeast-7  | 
| ca-central-1 |  Commercial AWS Regions ca-central-1  | 
| ca-west-1 |  Commercial AWS Regions ca-west-1  | 
| eu-central-1 |  Commercial AWS Regions eu-central-1  | 
| eu-central-2 |  Commercial AWS Regions eu-central-2  | 
| eu-north-1 |  Commercial AWS Regions eu-north-1  | 
| eu-south-1 |  Commercial AWS Regions eu-south-1  | 
| eu-south-2 |  Commercial AWS Regions eu-south-2  | 
| eu-west-1 |  Commercial AWS Regions eu-west-1  | 
| eu-west-2 |  Commercial AWS Regions eu-west-2  | 
| eu-west-3 |  Commercial AWS Regions eu-west-3  | 
| il-central-1 |  Commercial AWS Regions il-central-1  | 
| me-central-1 |  Commercial AWS Regions me-central-1  | 
| me-south-1 |  Commercial AWS Regions me-south-1  | 
| mx-central-1 |  Commercial AWS Regions mx-central-1  | 
| sa-east-1 |  Commercial AWS Regions sa-east-1  | 
| us-east-1 |  Commercial AWS Regions us-east-1  | 
| us-east-2 |  Commercial AWS Regions us-east-2  | 
| us-west-1 |  Commercial AWS Regions us-west-1  | 
| us-west-2 |  Commercial AWS Regions us-west-2  | 

### Global Cohere Embed v4
<a name="cross-region-ip-global.cohere.embed-v4:0"></a>

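To call the Global Cohere Embed v4 inference profile, specify the following inference profile ID in one of the source Regions:

```
global.cohere.embed-v4:0
```

For more information about inference parameters for this model, see [Link](model-parameters-embed.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 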
| --- | --- | 
| ap-northeast-1 |  Commercial AWS Regions ap-northeast-1  | 
| ap-northeast-2 |  Commercial AWS Regions ap-northeast-2  | 
| ap-northeast-3 |  Commercial AWS Regions ap-northeast-3  | 
| ap-south-1 |  Commercial AWS Regions ap-south-1  | 
| ap-south-2 |  Commercial AWS Regions ap-south-2  | 
| ap-southeast-1 |  Commercial AWS Regions ap-southeast-1  | 
| ap-southeast-2 |  Commercial AWS Regions ap-southeast-2  | 
| ap-southeast-3 |  Commercial AWS Regions ap-southeast-3  | 
| ap-southeast-4 |  Commercial AWS Regions ap-southeast-4  | 
| ca-central-1 |  Commercial AWS Regions ca-central-1  | 
| eu-central-1 |  Commercial AWS Regions eu-central-1  | 
| eu-central-2 |  Commercial AWS Regions eu-central-2  | 
| eu-north-1 |  Commercial AWS Regions eu-north-1  | 
| eu-south-1 |  Commercial AWS Regions eu-south-1  | 
| eu-south-2 |  Commercial AWS Regions eu-south-2  | 
| eu-west-1 |  Commercial AWS Regions eu-west-1  | 
| eu-west-2 |  Commercial AWS Regions eu-west-2  | 
| eu-west-3 |  Commercial AWS Regions eu-west-3  | 
| sa-east-1 |  Commercial AWS Regions sa-east-1  | 
| us-east-1 |  Commercial AWS Regions us-east-1  | 
| us-east-2 |  Commercial AWS Regions us-east-2  | 
| us-west-1 |  Commercial AWS Regions us-west-1  | 
| us-west-2 |  Commercial AWS Regions us-west-2  | 

### US Amazon Nova 2 Lite
<a name="cross-region-ip-us.amazon.nova-2-lite-v1:0"></a>

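To call the US Amazon Nova 2 Lite inference profile, specify the following inference profile ID in one of the source Regions:

```
us.amazon.nova-2-lite-v1:0
```

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 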
| --- | --- | 
| ca-central-1 |  ca-central-1 us-east-1 us-east-2 us-west-2  | 
| ca-west-1 |  ca-west-1 us-east-1 us-east-2 us-west-2  | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 
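
The routing table above can be represented as a plain mapping from source Region to destination Regions. The following sketch assumes such a mapping, with values copied from the table. The names `NOVA_2_LITE_US_ROUTES` and `destinations_for` are illustrative only.

```python
# Source Region -> destination Regions, copied from the table above.
NOVA_2_LITE_US_ROUTES = {
    "ca-central-1": ["ca-central-1", "us-east-1", "us-east-2", "us-west-2"],
    "ca-west-1": ["ca-west-1", "us-east-1", "us-east-2", "us-west-2"],
    "us-east-1": ["us-east-1", "us-east-2", "us-west-2"],
    "us-east-2": ["us-east-1", "us-east-2", "us-west-2"],
    "us-west-1": ["us-east-1", "us-east-2", "us-west-1", "us-west-2"],
    "us-west-2": ["us-east-1", "us-east-2", "us-west-2"],
}


def destinations_for(source_region: str, routes: dict = NOVA_2_LITE_US_ROUTES) -> list:
    """Return the Regions a request from source_region may be routed to."""
    if source_region not in routes:
        raise ValueError(f"{source_region} is not a source Region for this profile")
    return list(routes[source_region])  # copy so callers cannot mutate the table
```

A lookup such as `destinations_for("ca-west-1")` returns the four Regions listed in that row, which can be useful when reviewing where request data may be processed.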

### US Anthropic Claude 3 Haiku
<a name="cross-region-ip-us.anthropic.claude-3-haiku-20240307-v1:0"></a>

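To call the US Anthropic Claude 3 Haiku inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-3-haiku-20240307-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 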
| --- | --- | 
| us-east-1 |  us-east-1 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-west-2  | 
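
The profile IDs on this page appear to follow a `geography.provider.model` pattern, such as `us.anthropic.claude-3-haiku-20240307-v1:0`. That pattern is inferred from the IDs listed here and is an assumption, not a documented contract. Under that assumption, a hypothetical parser might look like this:

```python
from typing import NamedTuple


class ProfileId(NamedTuple):
    geography: str  # for example "us" or "global"
    provider: str   # for example "anthropic"
    model: str      # remainder of the ID, including any version suffix


def parse_profile_id(profile_id: str) -> ProfileId:
    """Split an inference profile ID at its first two dots (assumed layout)."""
    parts = profile_id.split(".", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"unexpected inference profile ID: {profile_id!r}")
    return ProfileId(*parts)
```

Splitting on only the first two dots keeps the model portion whole, even if it were to contain further dots.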

### US Anthropic Claude 3 Opus
<a name="cross-region-ip-us.anthropic.claude-3-opus-20240229-v1:0"></a>

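To call the US Anthropic Claude 3 Opus inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-3-opus-20240229-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 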
| --- | --- | 
| us-east-1 |  us-east-1 us-west-2  | 
| us-west-2 |  us-east-1 us-west-2  | 

### US Anthropic Claude 3 Sonnet
<a name="cross-region-ip-us.anthropic.claude-3-sonnet-20240229-v1:0"></a>

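To call the US Anthropic Claude 3 Sonnet inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-3-sonnet-20240229-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 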
| --- | --- | 
| us-east-1 |  us-east-1 us-west-2  | 
| us-west-2 |  us-east-1 us-west-2  | 

### US Anthropic Claude 3.5 Haiku
<a name="cross-region-ip-us.anthropic.claude-3-5-haiku-20241022-v1:0"></a>

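To call the US Anthropic Claude 3.5 Haiku inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-3-5-haiku-20241022-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 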
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Anthropic Claude 3.5 Sonnet
<a name="cross-region-ip-us.anthropic.claude-3-5-sonnet-20240620-v1:0"></a>

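To call the US Anthropic Claude 3.5 Sonnet inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-3-5-sonnet-20240620-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 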
| --- | --- | 
| us-east-1 |  us-east-1 us-west-2  | 
| us-east-2 |  us-east-1 us-west-2  | 
| us-west-2 |  us-east-1 us-west-2  | 

### US Anthropic Claude 3.5 Sonnet v2
<a name="cross-region-ip-us.anthropic.claude-3-5-sonnet-20241022-v2:0"></a>

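To call the US Anthropic Claude 3.5 Sonnet v2 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-3-5-sonnet-20241022-v2:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 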
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Anthropic Claude 3.7 Sonnet
<a name="cross-region-ip-us.anthropic.claude-3-7-sonnet-20250219-v1:0"></a>

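To call the US Anthropic Claude 3.7 Sonnet inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-3-7-sonnet-20250219-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 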
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Anthropic Claude Haiku 4.5
<a name="cross-region-ip-us.anthropic.claude-haiku-4-5-20251001-v1:0"></a>

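To call the US Anthropic Claude Haiku 4.5 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-haiku-4-5-20251001-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 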
| --- | --- | 
| ca-central-1 |  ca-central-1 us-east-1 us-east-2 us-west-2  | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Anthropic Claude Opus 4.5
<a name="cross-region-ip-us.anthropic.claude-opus-4-5-20251101-v1:0"></a>

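To call the US Anthropic Claude Opus 4.5 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-opus-4-5-20251101-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 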
| --- | --- | 
| ca-central-1 |  ca-central-1 us-east-1 us-east-2 us-west-2  | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Anthropic Claude Opus 4.6
<a name="cross-region-ip-us.anthropic.claude-opus-4-6-v1"></a>

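To call the US Anthropic Claude Opus 4.6 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-opus-4-6-v1
```

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 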
| --- | --- | 
| ca-central-1 |  ca-central-1 us-east-1 us-east-2 us-west-2  | 
| ca-west-1 |  ca-west-1 us-east-1 us-east-2 us-west-2  | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Anthropic Claude Sonnet 4.5
<a name="cross-region-ip-us.anthropic.claude-sonnet-4-5-20250929-v1:0"></a>

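To call the US Anthropic Claude Sonnet 4.5 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-sonnet-4-5-20250929-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 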
| --- | --- | 
| ca-central-1 |  ca-central-1 us-east-1 us-east-2 us-west-2  | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Anthropic Claude Sonnet 4.6
<a name="cross-region-ip-us.anthropic.claude-sonnet-4-6"></a>

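To call the US Anthropic Claude Sonnet 4.6 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-sonnet-4-6
```

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 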
| --- | --- | 
| ca-central-1 |  ca-central-1 us-east-1 us-east-2 us-west-2  | 
| ca-west-1 |  ca-west-1 us-east-1 us-east-2 us-west-2  | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Claude Opus 4
<a name="cross-region-ip-us.anthropic.claude-opus-4-20250514-v1:0"></a>

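To call the US Claude Opus 4 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-opus-4-20250514-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 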
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Claude Opus 4.1
<a name="cross-region-ip-us.anthropic.claude-opus-4-1-20250805-v1:0"></a>

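To call the US Claude Opus 4.1 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-opus-4-1-20250805-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 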
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Claude Sonnet 4
<a name="cross-region-ip-us.anthropic.claude-sonnet-4-20250514-v1:0"></a>

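To call the US Claude Sonnet 4 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-sonnet-4-20250514-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 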
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Cohere Embed v4
<a name="cross-region-ip-us.cohere.embed-v4:0"></a>

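To call the US Cohere Embed v4 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.cohere.embed-v4:0
```

For more information about inference parameters for this model, see [Link](model-parameters-embed.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 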
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US DeepSeek-R1
<a name="cross-region-ip-us.deepseek.r1-v1:0"></a>

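To call the US DeepSeek-R1 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.deepseek.r1-v1:0
```

For more information about inference parameters for this model, see [Link](https://www.deepseek.com/).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 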
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Llama 4 Maverick 17B Instruct
<a name="cross-region-ip-us.meta.llama4-maverick-17b-instruct-v1:0"></a>

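To call the US Llama 4 Maverick 17B Instruct inference profile, specify the following inference profile ID in one of the source Regions:

```
us.meta.llama4-maverick-17b-instruct-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-meta.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 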
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Llama 4 Scout 17B Instruct
<a name="cross-region-ip-us.meta.llama4-scout-17b-instruct-v1:0"></a>

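To call the US Llama 4 Scout 17B Instruct inference profile, specify the following inference profile ID in one of the source Regions:

```
us.meta.llama4-scout-17b-instruct-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-meta.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 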
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Meta Llama 3.1 70B Instruct
<a name="cross-region-ip-us.meta.llama3-1-70b-instruct-v1:0"></a>

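To call the US Meta Llama 3.1 70B Instruct inference profile, specify the following inference profile ID in one of the source Regions:

```
us.meta.llama3-1-70b-instruct-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-meta.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 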
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Meta Llama 3.1 8B Instruct
<a name="cross-region-ip-us.meta.llama3-1-8b-instruct-v1:0"></a>

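To call the US Meta Llama 3.1 8B Instruct inference profile, specify the following inference profile ID in one of the source Regions:

```
us.meta.llama3-1-8b-instruct-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-meta.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 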
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Meta Llama 3.1 Instruct 405B
<a name="cross-region-ip-us.meta.llama3-1-405b-instruct-v1:0"></a>

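To call the US Meta Llama 3.1 Instruct 405B inference profile, specify the following inference profile ID in one of the source Regions:

```
us.meta.llama3-1-405b-instruct-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-meta.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 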
| --- | --- | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 

### US Meta Llama 3.2 11B Instruct
<a name="cross-region-ip-us.meta.llama3-2-11b-instruct-v1:0"></a>

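To call the US Meta Llama 3.2 11B Instruct inference profile, specify the following inference profile ID in one of the source Regions:

```
us.meta.llama3-2-11b-instruct-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-meta.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 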
| --- | --- | 
| us-east-1 |  us-east-1 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-west-2  | 

### US Meta Llama 3.2 1B Instruct
<a name="cross-region-ip-us.meta.llama3-2-1b-instruct-v1:0"></a>

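To call the US Meta Llama 3.2 1B Instruct inference profile, specify the following inference profile ID in one of the source Regions:

```
us.meta.llama3-2-1b-instruct-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-meta.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 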
| --- | --- | 
| us-east-1 |  us-east-1 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-west-2  | 

### US Meta Llama 3.2 3B Instruct
<a name="cross-region-ip-us.meta.llama3-2-3b-instruct-v1:0"></a>

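To call the US Meta Llama 3.2 3B Instruct inference profile, specify the following inference profile ID in one of the source Regions:

```
us.meta.llama3-2-3b-instruct-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-meta.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 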
| --- | --- | 
| us-east-1 |  us-east-1 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-west-2  | 

### US Meta Llama 3.2 90B Instruct
<a name="cross-region-ip-us.meta.llama3-2-90b-instruct-v1:0"></a>

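To call the US Meta Llama 3.2 90B Instruct inference profile, specify the following inference profile ID in one of the source Regions:

```
us.meta.llama3-2-90b-instruct-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-meta.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 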
| --- | --- | 
| us-east-1 |  us-east-1 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-west-2  | 

### US Meta Llama 3.3 70B Instruct
<a name="cross-region-ip-us.meta.llama3-3-70b-instruct-v1:0"></a>

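To call the US Meta Llama 3.3 70B Instruct inference profile, specify the following inference profile ID in one of the source Regions:

```
us.meta.llama3-3-70b-instruct-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-meta.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 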
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Mistral Pixtral Large 25.02
<a name="cross-region-ip-us.mistral.pixtral-large-2502-v1:0"></a>

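To call the US Mistral Pixtral Large 25.02 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.mistral.pixtral-large-2502-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-mistral.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 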
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Nova Lite
<a name="cross-region-ip-us.amazon.nova-lite-v1:0"></a>

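To call the US Nova Lite inference profile, specify the following inference profile ID in one of the source Regions:

```
us.amazon.nova-lite-v1:0
```

For more information about inference parameters for this model, see [Link](https://docs.aws.amazon.com/nova/latest/userguide/getting-started-schema.html).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 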
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Nova Micro
<a name="cross-region-ip-us.amazon.nova-micro-v1:0"></a>

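To call the US Nova Micro inference profile, specify the following inference profile ID in one of the source Regions:

```
us.amazon.nova-micro-v1:0
```

For more information about inference parameters for this model, see [Link](https://docs.aws.amazon.com/nova/latest/userguide/getting-started-schema.html).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 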
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Nova Premier
<a name="cross-region-ip-us.amazon.nova-premier-v1:0"></a>

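To call the US Nova Premier inference profile, specify the following inference profile ID in one of the source Regions:

```
us.amazon.nova-premier-v1:0
```

For more information about inference parameters for this model, see [Link](https://docs.aws.amazon.com/nova/latest/userguide/getting-started-schema.html).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 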
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Nova Pro
<a name="cross-region-ip-us.amazon.nova-pro-v1:0"></a>

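To call the US Nova Pro inference profile, specify the following inference profile ID in one of the source Regions:

```
us.amazon.nova-pro-v1:0
```

For more information about inference parameters for this model, see [Link](https://docs.aws.amazon.com/nova/latest/userguide/getting-started-schema.html).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 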
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Pegasus v1.2
<a name="cross-region-ip-us.twelvelabs.pegasus-1-2-v1:0"></a>

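To call the US Pegasus v1.2 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.twelvelabs.pegasus-1-2-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-pegasus.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 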
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 
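
The routing tables on this page share one row layout: a source Region cell followed by a space-separated list of destination Regions. As a rough sketch, a row could be parsed as follows. The function name `parse_route_row` is hypothetical, and the parser assumes exactly two cells per row.

```python
def parse_route_row(row: str) -> tuple:
    """Parse '| source | dest1 dest2 ... |' into (source, [dest1, dest2, ...])."""
    # Drop the outer pipes, then split the remaining text into cells.
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    if len(cells) != 2 or not cells[0]:
        raise ValueError(f"expected two cells in row: {row!r}")
    source, destinations = cells[0], cells[1].split()
    return source, destinations
```

Applied to the first row of the table above, this yields `us-east-1` as the source and the four listed Regions as destinations.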

### US Stable Image Conservative Upscale
<a name="cross-region-ip-us.stability.stable-conservative-upscale-v1:0"></a>

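To call the US Stable Image Conservative Upscale inference profile, specify the following inference profile ID in one of the source Regions:

```
us.stability.stable-conservative-upscale-v1:0
```

For more information about inference parameters for this model, see [Link](stable-image-services.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 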
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Stable Image Control Sketch
<a name="cross-region-ip-us.stability.stable-image-control-sketch-v1:0"></a>

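To call the US Stable Image Control Sketch inference profile, specify the following inference profile ID in one of the source Regions:

```
us.stability.stable-image-control-sketch-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-stability-diffusion.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 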
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Stable Image Control Structure
<a name="cross-region-ip-us.stability.stable-image-control-structure-v1:0"></a>

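To call the US Stable Image Control Structure inference profile, specify the following inference profile ID in one of the source Regions:

```
us.stability.stable-image-control-structure-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-stability-diffusion.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 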
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Stable Image Creative Upscale
<a name="cross-region-ip-us.stability.stable-creative-upscale-v1:0"></a>

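To call the US Stable Image Creative Upscale inference profile, specify the following inference profile ID in one of the source Regions:

```
us.stability.stable-creative-upscale-v1:0
```

For more information about inference parameters for this model, see [Link](stable-image-services.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 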
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Stable Image Erase Object
<a name="cross-region-ip-us.stability.stable-image-erase-object-v1:0"></a>

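To call the US Stable Image Erase Object inference profile, specify the following inference profile ID in one of the source Regions:

```
us.stability.stable-image-erase-object-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-stability-diffusion.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 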
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Stable Image Fast Upscale
<a name="cross-region-ip-us.stability.stable-fast-upscale-v1:0"></a>

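To call the US Stable Image Fast Upscale inference profile, specify the following inference profile ID in one of the source Regions:

```
us.stability.stable-fast-upscale-v1:0
```

For more information about inference parameters for this model, see [Link](stable-image-services.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 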
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Stable Image Inpaint
<a name="cross-region-ip-us.stability.stable-image-inpaint-v1:0"></a>

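To call the US Stable Image Inpaint inference profile, specify the following inference profile ID in one of the source Regions:

```
us.stability.stable-image-inpaint-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-stability-diffusion.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 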
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Stable Image Outpaint
<a name="cross-region-ip-us.stability.stable-outpaint-v1:0"></a>

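To call the US Stable Image Outpaint inference profile, specify the following inference profile ID in one of the source Regions:

```
us.stability.stable-outpaint-v1:0
```

For more information about inference parameters for this model, see [Link](stable-image-services.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 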
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Stable Image Remove Background
<a name="cross-region-ip-us.stability.stable-image-remove-background-v1:0"></a>

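To call the US Stable Image Remove Background inference profile, specify the following inference profile ID in one of the source Regions:

```
us.stability.stable-image-remove-background-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-stability-diffusion.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 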
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Stable Image Search and Recolor
<a name="cross-region-ip-us.stability.stable-image-search-recolor-v1:0"></a>

若要呼叫 US Stable Image Search and Recolor 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
us.stability.stable-image-search-recolor-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-stability-diffusion.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Stable Image Search and Replace
<a name="cross-region-ip-us.stability.stable-image-search-replace-v1:0"></a>

若要呼叫 US Stable Image Search and Replace 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
us.stability.stable-image-search-replace-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-stability-diffusion.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Stable Image Style Guide
<a name="cross-region-ip-us.stability.stable-image-style-guide-v1:0"></a>

若要呼叫 US Stable Image Style Guide 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
us.stability.stable-image-style-guide-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-stability-diffusion.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Stable Image Style Transfer
<a name="cross-region-ip-us.stability.stable-style-transfer-v1:0"></a>

若要呼叫 US Stable Image Style Transfer 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
us.stability.stable-style-transfer-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-stability-diffusion.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US TwelveLabs Marengo Embed 3.0
<a name="cross-region-ip-us.twelvelabs.marengo-embed-3-0-v1:0"></a>

若要呼叫 US TwelveLabs Marengo Embed 3.0 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
us.twelvelabs.marengo-embed-3-0-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-marengo.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 

### US TwelveLabs Marengo Embed v2.7
<a name="cross-region-ip-us.twelvelabs.marengo-embed-2-7-v1:0"></a>

若要呼叫 US TwelveLabs Marengo Embed v2.7 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
us.twelvelabs.marengo-embed-2-7-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-marengo.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 

### US Writer Palmyra X4
<a name="cross-region-ip-us.writer.palmyra-x4-v1:0"></a>

若要呼叫 US Writer Palmyra X4 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
us.writer.palmyra-x4-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-writer-palmyra.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Writer Palmyra X5
<a name="cross-region-ip-us.writer.palmyra-x5-v1:0"></a>

若要呼叫 US Writer Palmyra X5 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
us.writer.palmyra-x5-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-writer-palmyra.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US-GOV Claude 3 Haiku
<a name="cross-region-ip-us-gov.anthropic.claude-3-haiku-20240307-v1:0"></a>

若要呼叫 US-GOV Claude 3 Haiku 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
us-gov.anthropic.claude-3-haiku-20240307-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-claude.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| us-gov-east-1 |  us-gov-east-1 us-gov-west-1  | 

### US-GOV Claude 3.5 Sonnet
<a name="cross-region-ip-us-gov.anthropic.claude-3-5-sonnet-20240620-v1:0"></a>

若要呼叫 US-GOV Claude 3.5 Sonnet 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
us-gov.anthropic.claude-3-5-sonnet-20240620-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-claude.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| us-gov-east-1 |  us-gov-east-1 us-gov-west-1  | 

### US-GOV Claude 3.7 Sonnet
<a name="cross-region-ip-us-gov.anthropic.claude-3-7-sonnet-20250219-v1:0"></a>

若要呼叫 US-GOV Claude 3.7 Sonnet 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
us-gov.anthropic.claude-3-7-sonnet-20250219-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-claude.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| us-gov-east-1 |  us-gov-east-1 us-gov-west-1  | 

### US-GOV Claude Sonnet 4.5
<a name="cross-region-ip-us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0"></a>

若要呼叫 US-GOV Claude Sonnet 4.5 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-claude.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| us-gov-east-1 |  us-gov-west-1  | 
| us-gov-west-1 |  us-gov-west-1  | 
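
依上表所列，來源區域不一定會出現在自己的目的地區域清單中。下列 Python 草稿示範如何找出這類來源區域，在評估資料落地需求時可能有幫助。這只是假設性的範例：常數與函式名稱都是為了說明而虛構，資料則直接取自上表。

```python
# Routing data copied from the table above.
US_GOV_SONNET_4_5_ROUTING = {
    "us-gov-east-1": ["us-gov-west-1"],
    "us-gov-west-1": ["us-gov-west-1"],
}


def sources_routed_elsewhere(routing):
    """Return source regions that are not listed among their own destinations."""
    return sorted(
        source
        for source, destinations in routing.items()
        if source not in destinations
    )


print(sources_routed_elsewhere(US_GOV_SONNET_4_5_ROUTING))
```

同一個函式也適用於本頁其他表格，只要以相同的字典格式提供來源區域與目的地區域即可。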

### APAC Anthropic Claude 3 Haiku
<a name="cross-region-ip-apac.anthropic.claude-3-haiku-20240307-v1:0"></a>

若要呼叫 APAC Anthropic Claude 3 Haiku 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
apac.anthropic.claude-3-haiku-20240307-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-claude.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-northeast-2 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-south-1 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-1 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-2 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 

### APAC Anthropic Claude 3 Sonnet
<a name="cross-region-ip-apac.anthropic.claude-3-sonnet-20240229-v1:0"></a>

若要呼叫 APAC Anthropic Claude 3 Sonnet 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
apac.anthropic.claude-3-sonnet-20240229-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-claude.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-northeast-2 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-south-1 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-1 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-2 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 

### APAC Anthropic Claude 3.5 Sonnet
<a name="cross-region-ip-apac.anthropic.claude-3-5-sonnet-20240620-v1:0"></a>

若要呼叫 APAC Anthropic Claude 3.5 Sonnet 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
apac.anthropic.claude-3-5-sonnet-20240620-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-claude.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-northeast-2 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-south-1 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-1 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-2 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 

### APAC Anthropic Claude 3.5 Sonnet v2
<a name="cross-region-ip-apac.anthropic.claude-3-5-sonnet-20241022-v2:0"></a>

若要呼叫 APAC Anthropic Claude 3.5 Sonnet v2 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
apac.anthropic.claude-3-5-sonnet-20241022-v2:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-claude.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-northeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-northeast-3 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-south-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-south-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 

### APAC Anthropic Claude 3.7 Sonnet
<a name="cross-region-ip-apac.anthropic.claude-3-7-sonnet-20250219-v1:0"></a>

若要呼叫 APAC Anthropic Claude 3.7 Sonnet 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
apac.anthropic.claude-3-7-sonnet-20250219-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-claude.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2  | 
| ap-northeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2  | 
| ap-northeast-3 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2  | 
| ap-south-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2  | 
| ap-south-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2  | 

### APAC Claude Sonnet 4
<a name="cross-region-ip-apac.anthropic.claude-sonnet-4-20250514-v1:0"></a>

若要呼叫 APAC Claude Sonnet 4 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
apac.anthropic.claude-sonnet-4-20250514-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-claude.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| ap-east-2 |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4  | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-4  | 
| ap-northeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-4  | 
| ap-northeast-3 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-4  | 
| ap-south-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-4  | 
| ap-south-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-4  | 
| ap-southeast-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-4  | 
| ap-southeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-4  | 
| ap-southeast-3 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4  | 
| ap-southeast-4 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-4  | 
| ap-southeast-5 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5  | 
| ap-southeast-7 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-7  | 
| me-central-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 me-central-1  | 

### APAC Nova Lite
<a name="cross-region-ip-apac.amazon.nova-lite-v1:0"></a>

若要呼叫 APAC Nova Lite 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
apac.amazon.nova-lite-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](https://docs.aws.amazon.com/nova/latest/userguide/getting-started-schema.html)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| ap-east-2 |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4  | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-northeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-south-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-3 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4  | 
| ap-southeast-4 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-4  | 
| ap-southeast-5 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5  | 
| ap-southeast-7 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-7  | 
| me-central-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 me-central-1  | 

### APAC Nova Micro
<a name="cross-region-ip-apac.amazon.nova-micro-v1:0"></a>

若要呼叫 APAC Nova Micro 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
apac.amazon.nova-micro-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](https://docs.aws.amazon.com/nova/latest/userguide/getting-started-schema.html)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| ap-east-2 |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4  | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-northeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-south-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-3 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4  | 
| ap-southeast-5 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5  | 
| ap-southeast-7 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-7  | 
| me-central-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 me-central-1  | 

### APAC Nova Pro
<a name="cross-region-ip-apac.amazon.nova-pro-v1:0"></a>

若要呼叫 APAC Nova Pro 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
apac.amazon.nova-pro-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](https://docs.aws.amazon.com/nova/latest/userguide/getting-started-schema.html)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| ap-east-2 |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4  | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-northeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-south-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-3 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4  | 
| ap-southeast-4 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-4  | 
| ap-southeast-5 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5  | 
| ap-southeast-7 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-7  | 
| me-central-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 me-central-1  | 

### APAC Pegasus v1.2
<a name="cross-region-ip-apac.twelvelabs.pegasus-1-2-v1:0"></a>

若要呼叫 APAC Pegasus v1.2 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
apac.twelvelabs.pegasus-1-2-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-pegasus.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| ap-northeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-4  | 

### APAC TwelveLabs Marengo Embed v2.7
<a name="cross-region-ip-apac.twelvelabs.marengo-embed-2-7-v1:0"></a>

若要呼叫 APAC TwelveLabs Marengo Embed v2.7 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
apac.twelvelabs.marengo-embed-2-7-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-marengo.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| ap-northeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4  | 

### AU Anthropic Claude Sonnet 4.5
<a name="cross-region-ip-au.anthropic.claude-sonnet-4-5-20250929-v1:0"></a>

若要呼叫 AU Anthropic Claude Sonnet 4.5 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
au.anthropic.claude-sonnet-4-5-20250929-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-claude.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| ap-southeast-2 |  ap-southeast-2 ap-southeast-4  | 
| ap-southeast-4 |  ap-southeast-2 ap-southeast-4  | 

### AU Anthropic Claude Haiku 4.5
<a name="cross-region-ip-au.anthropic.claude-haiku-4-5-20251001-v1:0"></a>

若要呼叫 AU Anthropic Claude Haiku 4.5 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
au.anthropic.claude-haiku-4-5-20251001-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-claude.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| ap-southeast-2 |  ap-southeast-2 ap-southeast-4  | 
| ap-southeast-4 |  ap-southeast-2 ap-southeast-4  | 

### AU Anthropic Claude Opus 4.6
<a name="cross-region-ip-au.anthropic.claude-opus-4-6-v1"></a>

若要呼叫 AU Anthropic Claude Opus 4.6 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
au.anthropic.claude-opus-4-6-v1
```

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| ap-southeast-2 |  ap-southeast-2 ap-southeast-4  | 
| ap-southeast-4 |  ap-southeast-2 ap-southeast-4  | 

### AU Anthropic Claude Sonnet 4.6
<a name="cross-region-ip-au.anthropic.claude-sonnet-4-6"></a>

若要呼叫 AU Anthropic Claude Sonnet 4.6 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
au.anthropic.claude-sonnet-4-6
```

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| ap-southeast-2 |  ap-southeast-2 ap-southeast-4  | 
| ap-southeast-4 |  ap-southeast-2 ap-southeast-4  | 
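
本頁列出的推論設定檔 ID 都以地理前綴開頭，例如 `us`、`us-gov`、`apac`、`au`、`ca` 和 `eu`。下列 Python 草稿示範如何將 ID 拆成前綴與其餘部分。這只是假設性的範例：函式名稱 `split_profile_id` 是虛構的，前綴清單僅依本文件出現的 ID 整理，且假設第一個句點之前的區段就是地理前綴。

```python
# Prefixes observed in this document; the set is an assumption, not an official list.
KNOWN_GEO_PREFIXES = {"us", "us-gov", "apac", "au", "ca", "eu", "global"}


def split_profile_id(profile_id):
    """Split an inference profile ID into (geo_prefix, remainder)."""
    # Everything before the first dot is treated as the geography prefix.
    prefix, separator, remainder = profile_id.partition(".")
    if not separator or not remainder or prefix not in KNOWN_GEO_PREFIXES:
        raise ValueError(f"Unrecognized inference profile ID: {profile_id!r}")
    return prefix, remainder


print(split_profile_id("au.anthropic.claude-sonnet-4-6"))
print(split_profile_id("us-gov.anthropic.claude-3-haiku-20240307-v1:0"))
```

若 ID 不含句點或前綴不在清單中，函式會引發 `ValueError`，以免誤將一般字串當成推論設定檔 ID。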

### CA Nova Lite
<a name="cross-region-ip-ca.amazon.nova-lite-v1:0"></a>

若要呼叫 CA Nova Lite 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
ca.amazon.nova-lite-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](https://docs.aws.amazon.com/nova/latest/userguide/getting-started-schema.html)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| ca-central-1 |  ca-central-1 ca-west-1  | 
| ca-west-1 |  ca-central-1 ca-west-1  | 
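
下列 Python 草稿示範在送出請求前，先確認用戶端所在區域是上表列出的來源區域，再以推論設定檔 ID 作為 `modelId` 組成請求參數。這只是假設性的範例：常數與函式名稱 `build_converse_request` 都是虛構的，並假設您透過 boto3 的 `bedrock-runtime` 用戶端呼叫 `converse`，實際呼叫部分僅以註解表示。

```python
# Values copied from the CA Nova Lite section above.
CA_NOVA_LITE_PROFILE_ID = "ca.amazon.nova-lite-v1:0"
CA_NOVA_LITE_SOURCE_REGIONS = {"ca-central-1", "ca-west-1"}


def build_converse_request(region, prompt):
    """Build keyword arguments for a Converse call from a valid source region."""
    if region not in CA_NOVA_LITE_SOURCE_REGIONS:
        raise ValueError(f"{region} is not a source region for this profile")
    return {
        # The inference profile ID is passed where a model ID would go.
        "modelId": CA_NOVA_LITE_PROFILE_ID,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
    }


request = build_converse_request("ca-central-1", "Hello")
print(request["modelId"])

# Hypothetical usage (needs boto3 and AWS credentials; not executed here):
#   import boto3
#   client = boto3.client("bedrock-runtime", region_name="ca-central-1")
#   response = client.converse(**request)
```

在建立請求前先檢查區域，可以在本機就發現設定錯誤，而不必等到服務端回傳錯誤。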

### EU Amazon Nova 2 Lite
<a name="cross-region-ip-eu.amazon.nova-2-lite-v1:0"></a>

若要呼叫 EU Amazon Nova 2 Lite 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
eu.amazon.nova-2-lite-v1:0
```

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 

### EU Anthropic Claude 3 Haiku
<a name="cross-region-ip-eu.anthropic.claude-3-haiku-20240307-v1:0"></a>

若要呼叫 EU Anthropic Claude 3 Haiku 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
eu.anthropic.claude-3-haiku-20240307-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-claude.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-west-1 eu-west-3  | 

### EU Anthropic Claude 3 Sonnet
<a name="cross-region-ip-eu.anthropic.claude-3-sonnet-20240229-v1:0"></a>

若要呼叫 EU Anthropic Claude 3 Sonnet 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
eu.anthropic.claude-3-sonnet-20240229-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-claude.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-west-1 eu-west-3  | 

### EU Anthropic Claude 3.5 Sonnet
<a name="cross-region-ip-eu.anthropic.claude-3-5-sonnet-20240620-v1:0"></a>

若要呼叫 EU Anthropic Claude 3.5 Sonnet 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
eu.anthropic.claude-3-5-sonnet-20240620-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-claude.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-west-1 eu-west-3  | 

### EU Anthropic Claude 3.7 Sonnet
<a name="cross-region-ip-eu.anthropic.claude-3-7-sonnet-20250219-v1:0"></a>

若要呼叫 EU Anthropic Claude 3.7 Sonnet 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
eu.anthropic.claude-3-7-sonnet-20250219-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-claude.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 

### EU Anthropic Claude Haiku 4.5
<a name="cross-region-ip-eu.anthropic.claude-haiku-4-5-20251001-v1:0"></a>

若要呼叫 EU Anthropic Claude Haiku 4.5 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
eu.anthropic.claude-haiku-4-5-20251001-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-claude.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-central-2 |  eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 

### EU Anthropic Claude Opus 4.5
<a name="cross-region-ip-eu.anthropic.claude-opus-4-5-20251101-v1:0"></a>

若要呼叫 EU Anthropic Claude Opus 4.5 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
eu.anthropic.claude-opus-4-5-20251101-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-claude.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-central-2 |  eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 

### EU Anthropic Claude Opus 4.6
<a name="cross-region-ip-eu.anthropic.claude-opus-4-6-v1"></a>

若要呼叫 EU Anthropic Claude Opus 4.6 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
eu.anthropic.claude-opus-4-6-v1
```

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-central-2 |  eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 

### EU Anthropic Claude Sonnet 4.5
<a name="cross-region-ip-eu.anthropic.claude-sonnet-4-5-20250929-v1:0"></a>

若要呼叫 EU Anthropic Claude Sonnet 4.5 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
eu.anthropic.claude-sonnet-4-5-20250929-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-claude.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-central-2 |  eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 

### EU Anthropic Claude Sonnet 4.6
<a name="cross-region-ip-eu.anthropic.claude-sonnet-4-6"></a>

若要呼叫 EU Anthropic Claude Sonnet 4.6 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
eu.anthropic.claude-sonnet-4-6
```

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-central-2 |  eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 

### EU Claude Sonnet 4
<a name="cross-region-ip-eu.anthropic.claude-sonnet-4-20250514-v1:0"></a>

若要呼叫 EU Claude Sonnet 4 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
eu.anthropic.claude-sonnet-4-20250514-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-claude.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| il-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3 il-central-1  | 

### EU Cohere Embed v4
<a name="cross-region-ip-eu.cohere.embed-v4:0"></a>

若要呼叫 EU Cohere Embed v4 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
eu.cohere.embed-v4:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-embed.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 

### EU Meta Llama 3.2 1B Instruct
<a name="cross-region-ip-eu.meta.llama3-2-1b-instruct-v1:0"></a>

若要呼叫 EU Meta Llama 3.2 1B Instruct 推論設定檔，請在其中一個來源區域中指定下列推論設定檔 ID：

```
eu.meta.llama3-2-1b-instruct-v1:0
```

如需此模型推論參數的詳細資訊，請參閱[連結](model-parameters-meta.md)。

下表顯示您可以從中呼叫推論設定檔的來源區域，以及可路由請求的目的地區域：


| 來源區域 | 目的地區域 | 
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-west-1 eu-west-3  | 

### EU Meta Llama 3.2 3B Instruct
<a name="cross-region-ip-eu.meta.llama3-2-3b-instruct-v1:0"></a>

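To call the EU Meta Llama 3.2 3B Instruct inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.meta.llama3-2-3b-instruct-v1:0
```

For more information about inference parameters for this model, see the [inference parameters reference](model-parameters-meta.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions | 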
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-west-1 eu-west-3  | 

### EU Mistral Pixtral Large 25.02
<a name="cross-region-ip-eu.mistral.pixtral-large-2502-v1:0"></a>

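To call the EU Mistral Pixtral Large 25.02 inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.mistral.pixtral-large-2502-v1:0
```

For more information about inference parameters for this model, see the [inference parameters reference](model-parameters-mistral.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions | 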
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 

### EU Nova Lite
<a name="cross-region-ip-eu.amazon.nova-lite-v1:0"></a>

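To call the EU Nova Lite inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.amazon.nova-lite-v1:0
```

For more information about inference parameters for this model, see the [request schema reference](https://docs.aws.amazon.com/nova/latest/userguide/getting-started-schema.html).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions | 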
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-south-1 |  eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-3  | 
| eu-south-2 |  eu-central-1 eu-north-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| il-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-3 il-central-1  | 

### EU Nova Micro
<a name="cross-region-ip-eu.amazon.nova-micro-v1:0"></a>

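To call the EU Nova Micro inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.amazon.nova-micro-v1:0
```

For more information about inference parameters for this model, see the [request schema reference](https://docs.aws.amazon.com/nova/latest/userguide/getting-started-schema.html).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions | 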
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-south-1 |  eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-3  | 
| eu-south-2 |  eu-central-1 eu-north-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| il-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-3 il-central-1  | 

### EU Nova Pro
<a name="cross-region-ip-eu.amazon.nova-pro-v1:0"></a>

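To call the EU Nova Pro inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.amazon.nova-pro-v1:0
```

For more information about inference parameters for this model, see the [request schema reference](https://docs.aws.amazon.com/nova/latest/userguide/getting-started-schema.html).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions | 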
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-south-1 |  eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-3  | 
| eu-south-2 |  eu-central-1 eu-north-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| il-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-3 il-central-1  | 

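### EU TwelveLabs Marengo Embed 3.0
<a name="cross-region-ip-eu.twelvelabs.marengo-embed-3-0-v1:0"></a>

To call the EU TwelveLabs Marengo Embed 3.0 inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.twelvelabs.marengo-embed-3-0-v1:0
```

For more information about inference parameters for this model, see the [inference parameters reference](model-parameters-marengo.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions | 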
| --- | --- | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 

### EU TwelveLabs Marengo Embed v2.7
<a name="cross-region-ip-eu.twelvelabs.marengo-embed-2-7-v1:0"></a>

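To call the EU TwelveLabs Marengo Embed v2.7 inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.twelvelabs.marengo-embed-2-7-v1:0
```

For more information about inference parameters for this model, see the [inference parameters reference](model-parameters-marengo.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions | 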
| --- | --- | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 

### EU TwelveLabs Pegasus v1.2
<a name="cross-region-ip-eu.twelvelabs.pegasus-1-2-v1:0"></a>

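To call the EU TwelveLabs Pegasus v1.2 inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.twelvelabs.pegasus-1-2-v1:0
```

For more information about inference parameters for this model, see the [inference parameters reference](model-parameters-pegasus.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions | 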
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-central-2 |  eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 

### JP Amazon Nova 2 Lite
<a name="cross-region-ip-jp.amazon.nova-2-lite-v1:0"></a>

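To call the JP Amazon Nova 2 Lite inference profile, specify the following inference profile ID in one of the source Regions:

```
jp.amazon.nova-2-lite-v1:0
```

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions | 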
| --- | --- | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-3  | 

### JP Anthropic Claude Haiku 4.5
<a name="cross-region-ip-jp.anthropic.claude-haiku-4-5-20251001-v1:0"></a>

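To call the JP Anthropic Claude Haiku 4.5 inference profile, specify the following inference profile ID in one of the source Regions:

```
jp.anthropic.claude-haiku-4-5-20251001-v1:0
```

For more information about inference parameters for this model, see the [inference parameters reference](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions | 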
| --- | --- | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-3  | 
| ap-northeast-3 |  ap-northeast-1 ap-northeast-3  | 

### JP Anthropic Claude Sonnet 4.5
<a name="cross-region-ip-jp.anthropic.claude-sonnet-4-5-20250929-v1:0"></a>

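To call the JP Anthropic Claude Sonnet 4.5 inference profile, specify the following inference profile ID in one of the source Regions:

```
jp.anthropic.claude-sonnet-4-5-20250929-v1:0
```

For more information about inference parameters for this model, see the [inference parameters reference](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions | 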
| --- | --- | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-3  | 
| ap-northeast-3 |  ap-northeast-1 ap-northeast-3  | 

### JP Anthropic Claude Sonnet 4.6
<a name="cross-region-ip-jp.anthropic.claude-sonnet-4-6"></a>

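To call the JP Anthropic Claude Sonnet 4.6 inference profile, specify the following inference profile ID in one of the source Regions:

```
jp.anthropic.claude-sonnet-4-6
```

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions | 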
| --- | --- | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-3  | 
| ap-northeast-3 |  ap-northeast-1 ap-northeast-3  | 

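## Supported Regions and models for application inference profiles
<a name="inference-profiles-support-user"></a>

You can create application inference profiles for all models in the following AWS Regions: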
+ ap-northeast-1
+ ap-northeast-2
+ ap-south-1
+ ap-southeast-1
+ ap-southeast-2
+ ca-central-1
+ eu-central-1
+ eu-west-1
+ eu-west-2
+ eu-west-3
+ sa-east-1
+ us-east-1
+ us-east-2
+ us-gov-east-1
+ us-west-2

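You can create application inference profiles from all models and inference profiles that Amazon Bedrock supports. For more information about the models that Amazon Bedrock supports, see [Supported foundation models in Amazon Bedrock](models-supported.md).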

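# Prerequisites for inference profiles
<a name="inference-profiles-prereq"></a>

Before you can use an inference profile, check that you have fulfilled the following prerequisites:
+ Your role has access to the inference profile API actions. If your role has the [AmazonBedrockFullAccess](security-iam-awsmanpol.md#security-iam-awsmanpol-AmazonBedrockFullAccess) AWS managed policy attached, you can skip this step. Otherwise, do the following:

  1. Follow the steps at [Creating IAM policies](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_create.html) and create the following policy, which allows a role to perform inference profile-related actions and to run model inference using all foundation models and inference profiles.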

------
#### [ JSON ]

****  

     ```
     {
         "Version": "2012-10-17",
         "Statement": [
             {
                 "Effect": "Allow",
                 "Action": [
                     "bedrock:InvokeModel*",
                     "bedrock:CreateInferenceProfile"
                 ],
                 "Resource": [
                     "arn:aws:bedrock:*::foundation-model/*",
                     "arn:aws:bedrock:*:*:inference-profile/*",
                     "arn:aws:bedrock:*:*:application-inference-profile/*"
                 ]
             },
             {
                 "Effect": "Allow",
                 "Action": [
                     "bedrock:GetInferenceProfile",
                     "bedrock:ListInferenceProfiles",
                     "bedrock:DeleteInferenceProfile",
                     "bedrock:TagResource",
                     "bedrock:UntagResource",
                     "bedrock:ListTagsForResource"
                 ],
                 "Resource": [
                     "arn:aws:bedrock:*:*:inference-profile/*",
                     "arn:aws:bedrock:*:*:application-inference-profile/*"
                 ]
             }
         ]
     }
     ```

------

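     (Optional) You can restrict the role's access in the following ways:
     + To restrict the API actions that the role can perform, modify the list in the `Action` field to contain only the [API operations](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html#amazonbedrock-actions-as-permissions) that you want to allow access to.
     + To restrict the role's access to specific inference profiles, modify the `Resource` list to contain only the [inference profiles](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html#amazonbedrock-resources-for-iam-policies) and foundation models that you want to allow access to. System-defined inference profiles begin with `inference-profile`, and application inference profiles begin with `application-inference-profile`.
**Important**  
When you specify an inference profile in the `Resource` field of the first statement, you must also specify the foundation model in each Region associated with it.
     + To restrict user access so that they can invoke a foundation model only through an inference profile, add a `Condition` field and use the `bedrock:InferenceProfileArn` [condition key](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html#amazonbedrock-policy-keys). Specify the inference profile that you want to filter access on. This condition can be included in a statement that is scoped to the `foundation-model` resources.
     + For example, you can attach the following policy to a role to allow it to invoke the Anthropic Claude 3 Haiku model only through the US Anthropic Claude 3 Haiku inference profile in the account *111122223333* in us-west-2: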

------
#### [ JSON ]

****  

       ```
       {
           "Version": "2012-10-17",
           "Statement": [
               {
                   "Effect": "Allow",
                   "Action": [
                       "bedrock:InvokeModel*"
                   ],
                   "Resource": [
                       "arn:aws:bedrock:us-west-2:111122223333:inference-profile/us.anthropic.claude-3-haiku-20240307-v1:0"
                   ]
               },
               {
                   "Effect": "Allow",
                   "Action": [
                       "bedrock:InvokeModel*"
                   ],
                   "Resource": [
                       "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
                       "arn:aws:bedrock:us-west-2::foundation-model/anthropic.claude-3-haiku-20240307-v1:0"
                   ],
                   "Condition": {
                       "StringLike": {
                           "bedrock:InferenceProfileArn": "arn:aws:bedrock:us-west-2:111122223333:inference-profile/us.anthropic.claude-3-haiku-20240307-v1:0"
                       }
                   }
               }
           ]
       }
       ```

------
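     + For example, you can attach the following policy to a role to allow it to invoke the Anthropic Claude Sonnet 4 model only through the global Claude Sonnet 4 inference profile in the account 111122223333 in us-east-2 (US East (Ohio)):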

------
#### [ JSON ]

****  

       ```
       {
           "Version": "2012-10-17",
           "Statement": [
               {
                   "Effect": "Allow",
                   "Action": [
                       "bedrock:InvokeModel*"
                   ],
                   "Resource": [
                       "arn:aws:bedrock:us-east-2:111122223333:inference-profile/global.anthropic.claude-sonnet-4-20250514-v1:0"
                   ]
               },
               {
                   "Effect": "Allow",
                   "Action": [
                       "bedrock:InvokeModel*"
                   ],
                   "Resource": [
                       "arn:aws:bedrock:us-east-2::foundation-model/anthropic.claude-sonnet-4-20250514-v1:0",
                       "arn:aws:bedrock:::foundation-model/anthropic.claude-sonnet-4-20250514-v1:0"
                   ],
                   "Condition": {
                       "StringLike": {
                           "bedrock:InferenceProfileArn": "arn:aws:bedrock:us-east-2:111122223333:inference-profile/global.anthropic.claude-sonnet-4-20250514-v1:0"
                       }
                   }
               }
           ]
       }
       ```

------
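     + You can also restrict use of the global Claude Sonnet 4 inference profile by adding an explicit deny with a `StringEquals` condition that checks whether the request context key `aws:RequestedRegion` equals `unspecified`. When the `StringEquals` condition matches, the deny overrides any allow and blocks global routing of the inference request.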

       ```
       {
           "Effect": "Deny",
           "Action": [
               "bedrock:InvokeModel*"
           ],
           "Resource": "*",
           "Condition": {
               "StringEquals": {
                   "aws:RequestedRegion": "unspecified"
               }
           }
       }
       ```

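  1. Follow the steps at [Adding and removing IAM identity permissions](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html) to attach the policy to a role and grant the role permissions to view and use all inference profiles.
+ You have requested access to the model defined in the inference profile that you want to use, in the Region from which you want to call the inference profile.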
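
The restricted policies shown above follow one pattern: one statement allows the inference profile itself, and a second statement allows the underlying foundation models only when the `bedrock:InferenceProfileArn` condition matches that profile. A minimal sketch of generating such a policy document is shown here. The function name `build_profile_only_policy` is hypothetical, and the ARNs are the sample values from the first example policy.

```python
import json


def build_profile_only_policy(profile_arn, foundation_model_arns):
    """Return an IAM policy document (as a dict) that allows invoking the
    given foundation models only through the given inference profile."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Statement 1: allow calls that target the inference profile.
                "Effect": "Allow",
                "Action": ["bedrock:InvokeModel*"],
                "Resource": [profile_arn],
            },
            {
                # Statement 2: allow the foundation models in each Region,
                # but only when the request arrives through the profile.
                "Effect": "Allow",
                "Action": ["bedrock:InvokeModel*"],
                "Resource": list(foundation_model_arns),
                "Condition": {
                    "StringLike": {"bedrock:InferenceProfileArn": profile_arn}
                },
            },
        ],
    }


policy = build_profile_only_policy(
    "arn:aws:bedrock:us-west-2:111122223333:inference-profile/us.anthropic.claude-3-haiku-20240307-v1:0",
    [
        "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
        "arn:aws:bedrock:us-west-2::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
    ],
)
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM policy editor. Remember to list a foundation model ARN for every Region that the profile routes to.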

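# Create an application inference profile
<a name="inference-profiles-create"></a>

You can create an application inference profile with one or more Regions to track usage and costs when invoking a model.
+ To create an application inference profile for one Region, specify a foundation model. Usage and costs for requests made to that Region with that model are tracked.
+ To create an application inference profile for multiple Regions, specify a cross-Region (system-defined) inference profile. The inference profile routes requests to the Regions defined in the cross-Region (system-defined) inference profile that you choose. Usage and costs for requests made to the Regions in the inference profile are tracked.

Currently, you can create an inference profile only by using the Amazon Bedrock API.

To create an inference profile, send a [CreateInferenceProfile](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateInferenceProfile.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp).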

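The following fields are required:


| Field | Use case | 
| --- | --- | 
| inferenceProfileName | To specify a name for the inference profile. | 
| modelSource | To specify the foundation model or cross-Region (system-defined) inference profile that defines the model and Regions for which you want to track costs and usage. | 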

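The following fields are optional:


| Field | Use case | 
| --- | --- | 
| description | To provide a description for the inference profile. | 
| tags | To attach tags to the inference profile. For more information, see [Tagging Amazon Bedrock resources](tagging.md) and [Organizing and tracking costs using AWS cost allocation tags](https://docs.aws.amazon.com//awsaccountbilling/latest/aboutv2/cost-alloc-tags.html). | 
| clientRequestToken | To ensure that the API request completes only once. For more information, see [Ensuring idempotency](https://docs.aws.amazon.com/ec2/latest/devguide/ec2-api-idempotency.html). | 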

The response returns an `inferenceProfileArn` that you can use in other inference profile-related actions, in model invocation, and with Amazon Bedrock resources.
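
As an illustration of the request described above, the following sketch wraps `CreateInferenceProfile` in a small helper. It assumes a client that behaves like `boto3.client("bedrock")`. The helper name `create_application_profile` and the sample values in the usage comment are hypothetical.

```python
def create_application_profile(client, name, model_source_arn,
                               description=None, tags=None):
    """Send CreateInferenceProfile and return the new profile's ARN.

    `client` is expected to behave like boto3.client("bedrock").
    `tags` is an optional dict of tag keys to tag values.
    """
    request = {
        "inferenceProfileName": name,
        # copyFrom takes a foundation model ARN (one Region) or a
        # cross-Region (system-defined) inference profile ARN (many Regions).
        "modelSource": {"copyFrom": model_source_arn},
    }
    if description:
        request["description"] = description
    if tags:
        request["tags"] = [{"key": k, "value": v} for k, v in tags.items()]
    response = client.create_inference_profile(**request)
    return response["inferenceProfileArn"]


# Example usage (requires boto3 and AWS credentials; values are placeholders):
#   import boto3
#   bedrock = boto3.client("bedrock", region_name="us-west-2")
#   arn = create_application_profile(
#       bedrock,
#       "team-a-haiku",
#       "arn:aws:bedrock:us-west-2:111122223333:inference-profile/us.anthropic.claude-3-haiku-20240307-v1:0",
#       tags={"team": "a"},
#   )
```

Passing the client in, rather than creating it inside the helper, keeps the sketch easy to exercise with a stub during testing.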

# Modify the tags for an application inference profile
<a name="inference-profiles-modify"></a>

After you create an application inference profile, you can still manage its tags through the Amazon Bedrock API. Send a [TagResource](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_TagResource.html) or [UntagResource](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_UntagResource.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp), and specify the ARN of the application inference profile in the `resourceARN` field. For more information about tagging, see [Tagging Amazon Bedrock resources](tagging.md).

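# View information about an inference profile
<a name="inference-profiles-view"></a>

You can view information about a cross-Region inference profile or an application inference profile that you've created. To learn how to view information about an inference profile, choose the tab for your preferred method, and then follow the steps: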

------
#### [ Console ]

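**To view information about a cross-Region (system-defined) inference profile**

1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

1. Select **Cross-region inference** from the left navigation pane. Then, in the **Cross-region inference** section, choose an inference profile.

1. View the details of the inference profile in the **Inference profile details** section and the Regions that it encompasses in the **Models** section.

**Note**  
You can't view application inference profiles in the Amazon Bedrock console.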

------
#### [ API ]

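To get information about an inference profile, send a [GetInferenceProfile](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GetInferenceProfile.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp) and specify the Amazon Resource Name (ARN) or ID of the inference profile in the `inferenceProfileIdentifier` field.

To list information about the inference profiles that you can use, send a [ListInferenceProfiles](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ListInferenceProfiles.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp). You can specify the following optional parameters:


| Field | Short description | 
| --- | --- | 
| maxResults | The maximum number of results to return in a response. | 
| nextToken | If there are more results than the number that you specified in the maxResults field, the response returns a nextToken value. To see the next batch of results, send the nextToken value in another request. | 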

------
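
The `maxResults` and `nextToken` parameters described in the API tab form a standard pagination loop. The following sketch shows one way to walk every page of `ListInferenceProfiles`, assuming a client that behaves like `boto3.client("bedrock")`. The generator name `iter_inference_profiles` is hypothetical.

```python
def iter_inference_profiles(client, page_size=50):
    """Yield every inference profile summary, following nextToken.

    `client` is expected to behave like boto3.client("bedrock").
    """
    kwargs = {"maxResults": page_size}
    while True:
        page = client.list_inference_profiles(**kwargs)
        # Each page carries a list of summaries for this batch.
        yield from page.get("inferenceProfileSummaries", [])
        token = page.get("nextToken")
        if not token:
            break  # no more pages
        kwargs["nextToken"] = token


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   for summary in iter_inference_profiles(boto3.client("bedrock")):
#       print(summary["inferenceProfileId"], summary["status"])
```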

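# Use an inference profile in model invocation
<a name="inference-profiles-use"></a>

You can use a cross-Region inference profile in place of a foundation model to route requests to multiple Regions. To track costs and usage for a model, you can use an application inference profile in one or more Regions. To learn how to use an inference profile when running model inference, choose the tab for your preferred method, and then follow the steps: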

------
#### [ Console ]

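To use an inference profile with a feature that supports it, do the following:

1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

1. Navigate to the page for the feature that you want to use an inference profile for. For example, select **Chat / Text playground** from the left navigation pane.

1. Choose **Select model** and then choose the model. For example, choose **Amazon** and then **Nova Premier**.

1. Under **Inference**, select **Inference profiles** from the dropdown menu.

1. Select the inference profile to use (for example, **US Nova Premier**) and then choose **Apply**.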

------
#### [ API ]

You can use an inference profile when running inference from any Region that is included in it, with the following API operations:
+ [InvokeModel](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html) or [InvokeModelWithResponseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModelWithResponseStream.html) – To use an inference profile in model invocation, follow the steps at [Submit a single prompt with InvokeModel](inference-invoke.md) and specify the Amazon Resource Name (ARN) of the inference profile in the `modelId` field. For an example, see [Use an inference profile in model invocation](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html#API_runtime_InvokeModel_Example_5).
+ [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html) or [ConverseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ConverseStream.html) – To use an inference profile in model invocation with the Converse API, follow the steps at [Carry out a conversation with the Converse API operations](conversation-inference.md) and specify the ARN of the inference profile in the `modelId` field. For an example, see [Use an inference profile in a conversation](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html#API_runtime_Converse_Example_5).
+ [RetrieveAndGenerate](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html) – To use an inference profile when generating responses from the results of querying a knowledge base, follow the steps in the API tab in [Test your knowledge base with queries and responses](knowledge-base-test.md) and specify the ARN of the inference profile in the `modelArn` field. For more information, see [Use an inference profile to generate a response](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html#API_agent-runtime_RetrieveAndGenerate_Example_3).
+ [CreateEvaluationJob](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateEvaluationJob.html) – To submit an inference profile for model evaluation, follow the steps in the API tab in [Starting an automatic model evaluation job in Amazon Bedrock](model-evaluation-jobs-management-create.md) and specify the ARN of the inference profile in the `modelIdentifier` field.
+ [CreatePrompt](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_CreatePrompt.html) – To use an inference profile when generating a response for a prompt that you create in Prompt management, follow the steps in the API tab in [Create a prompt using Prompt management](prompt-management-create.md) and specify the ARN of the inference profile in the `modelId` field.
+ [CreateFlow](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_CreateFlow.html) – To use an inference profile when generating a response for an inline prompt that you define within a prompt node in a flow, follow the steps in the API tab in [Create and design a flow in Amazon Bedrock](flows-create.md). When you define the [prompt node](flows-nodes.md#flows-nodes-prompt), specify the ARN of the inference profile in the `modelId` field.
+ [CreateDataSource](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_CreateDataSource.html) – To use an inference profile when parsing non-textual information in a data source, follow the steps in the API section in [Parsing options for your data source](kb-advanced-parsing.md) and specify the ARN of the inference profile in the `modelArn` field.

**Note**  
If you're using a cross-Region (system-defined) inference profile, you can use either the ARN or the ID of the inference profile.

------
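
To make the `modelId` substitution concrete, the following sketch sends a single-turn Converse request through an inference profile. It assumes a client that behaves like `boto3.client("bedrock-runtime")`. The helper name `ask_with_profile` is hypothetical, and the profile ID in the usage comment is one of the system-defined IDs listed earlier in this guide.

```python
def ask_with_profile(client, profile_id_or_arn, prompt):
    """Send one user message through an inference profile and return the
    text of the first content block in the model's reply.

    `client` is expected to behave like boto3.client("bedrock-runtime").
    """
    response = client.converse(
        # The inference profile takes the place of a foundation model ID.
        modelId=profile_id_or_arn,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    return response["output"]["message"]["content"][0]["text"]


# Example usage (requires boto3, AWS credentials, and model access):
#   import boto3
#   runtime = boto3.client("bedrock-runtime", region_name="eu-west-1")
#   print(ask_with_profile(runtime, "eu.amazon.nova-lite-v1:0", "Hello"))
```

For an application inference profile, pass its ARN instead of an ID.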

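# Delete an application inference profile
<a name="inference-profiles-delete"></a>

If you no longer need an application inference profile, you can delete it. You can delete inference profiles only through the Amazon Bedrock API.

To delete an inference profile, send a [DeleteInferenceProfile](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_DeleteInferenceProfile.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp) and specify the Amazon Resource Name (ARN) or ID of the inference profile to delete in the `inferenceProfileIdentifier` field.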
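
A minimal sketch of the delete call follows, assuming a client that behaves like `boto3.client("bedrock")`. Because this section covers application inference profiles only, the hypothetical helper `delete_application_profile` refuses ARNs that don't contain the `application-inference-profile` resource type. That guard is an illustrative choice, not a requirement of the API.

```python
def delete_application_profile(client, profile_arn):
    """Delete an application inference profile identified by its ARN.

    `client` is expected to behave like boto3.client("bedrock").
    Raises ValueError for ARNs of any other resource type.
    """
    if ":application-inference-profile/" not in profile_arn:
        raise ValueError(
            f"not an application inference profile ARN: {profile_arn}"
        )
    client.delete_inference_profile(inferenceProfileIdentifier=profile_arn)


# Example usage (requires boto3 and AWS credentials; the ARN is a placeholder):
#   import boto3
#   delete_application_profile(
#       boto3.client("bedrock"),
#       "arn:aws:bedrock:us-west-2:111122223333:application-inference-profile/example",
#   )
```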

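# Increase model invocation capacity with Provisioned Throughput in Amazon Bedrock
<a name="prov-throughput"></a>

**Throughput** refers to the number and rate of inputs and outputs that a model processes and returns. You can purchase **Provisioned Throughput** to provision a higher level of throughput for a model at a fixed cost. If you customized a model, you must purchase Provisioned Throughput to be able to use it.

You're billed hourly for a Provisioned Throughput that you purchase. For details about pricing, see [Amazon Bedrock pricing](https://aws.amazon.com/bedrock/pricing). The price per hour depends on the following factors:

1. The model that you choose (for custom models, pricing is the same as for the base model that it was customized from).

1. The number of model units (MUs) that you specify for the Provisioned Throughput. An MU delivers a specific throughput level for the specified model. The throughput level of an MU specifies the following:
   + The number of input tokens that an MU can process across all requests within a span of one minute.
   + The number of output tokens that an MU can generate across all requests within a span of one minute.
**Note**  
For more information about what an MU specifies, pricing per MU, and requesting limit increases, contact your AWS account manager.

1. The duration of time that you commit to keeping the Provisioned Throughput. The longer the commitment duration, the greater the discount on the hourly price. You can choose among the following levels of commitment:
   + No commitment – You can delete the Provisioned Throughput at any time.
   + 1 month – You can't delete the Provisioned Throughput until the one-month commitment term is over.
   + 6 months – You can't delete the Provisioned Throughput until the six-month commitment term is over.
**Note**  
Billing continues until you delete the Provisioned Throughput.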
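
The three pricing factors combine into simple arithmetic. The following sketch assumes that the charge equals the hourly price per MU, multiplied by the number of MUs and by the hours that the Provisioned Throughput exists. The prices are made-up placeholders, not actual Amazon Bedrock prices; check the pricing page or your AWS account manager for real figures.

```python
def provisioned_throughput_cost(price_per_mu_hour, model_units, hours):
    """Estimated charge, assuming: price per MU-hour x MUs x hours kept."""
    if model_units < 1 or hours < 0:
        raise ValueError("model_units must be >= 1 and hours must be >= 0")
    return price_per_mu_hour * model_units * hours


HOURS_PER_30_DAYS = 30 * 24  # 720 hours

# Hypothetical prices in USD per MU per hour (placeholders only).
no_commitment = provisioned_throughput_cost(40.0, 2, HOURS_PER_30_DAYS)
one_month = provisioned_throughput_cost(35.0, 2, HOURS_PER_30_DAYS)

print(no_commitment)              # 57600.0
print(one_month)                  # 50400.0
print(no_commitment - one_month)  # 7200.0 saved by committing
```

The comparison shows the trade-off described in the list: a commitment lowers the hourly price, but the Provisioned Throughput can't be deleted until the term ends.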

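The following steps outline the process of setting up and using Provisioned Throughput.

1. Determine the number of MUs that you want to purchase for a Provisioned Throughput and the amount of time for which you want to commit to using it.

1. Purchase Provisioned Throughput for a base or custom model.

1. After the provisioned model is created, you can use it to [run model inference](inference.md).

**Topics**
+ [Supported Regions and models for Provisioned Throughput](prov-thru-supported.md)
+ [Prerequisites for Provisioned Throughput](prov-thru-prereq.md)
+ [Purchase a Provisioned Throughput for an Amazon Bedrock model](prov-thru-purchase.md)
+ [View information about a Provisioned Throughput](prov-thru-info.md)
+ [Modify a Provisioned Throughput](prov-thru-edit.md)
+ [Use a Provisioned Throughput with an Amazon Bedrock resource](prov-thru-use.md)
+ [Delete a Provisioned Throughput or cancel automatic renewal](prov-thru-delete.md)
+ [Code examples for Provisioned Throughput](prov-thru-code-examples.md)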

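# Supported Regions and models for Provisioned Throughput
<a name="prov-thru-supported"></a>

If you purchase Provisioned Throughput through the Amazon Bedrock API, you must specify a contextual variant of an Amazon Bedrock foundation model (FM) for the model ID.

**Note**  
Provisioned Throughput in AWS GovCloud (US-West) is supported only for custom models with a no-commitment purchase. When you purchase Provisioned Throughput for a custom model, use the ID of that model.

The following table shows the models for which you can purchase Provisioned Throughput, the model ID to use when purchasing Provisioned Throughput, and the AWS Regions in which you can purchase Provisioned Throughput for the model.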


| Provider | Model | Model ID | Single-Region model support | 
| --- | --- | --- | --- | 
| Amazon | Nova 2 Lite | amazon.nova-2-lite-v1:0:256k |  us-east-1  | 
| Amazon | Nova Canvas | amazon.nova-canvas-v1:0 |  us-east-1  | 
| Amazon | Nova Lite | amazon.nova-lite-v1:0:24k |  us-east-1  | 
| Amazon | Nova Lite | amazon.nova-lite-v1:0:300k |  us-east-1  | 
| Amazon | Nova Micro | amazon.nova-micro-v1:0:128k |  us-east-1  | 
| Amazon | Nova Micro | amazon.nova-micro-v1:0:24k |  us-east-1  | 
| Amazon | Nova Pro | amazon.nova-pro-v1:0:24k |  us-east-1  | 
| Amazon | Nova Pro | amazon.nova-pro-v1:0:300k |  us-east-1  | 
| Amazon | Titan Embeddings G1 - Text | amazon.titan-embed-text-v1:2:8k |  us-east-1 us-west-2  | 
| Amazon | Titan Image Generator G1 v2 | amazon.titan-image-generator-v2:0 |  us-east-1 us-west-2  | 
| Amazon | Titan Multimodal Embeddings G1 | amazon.titan-embed-image-v1:0 |  ap-south-1 ap-southeast-2 ca-central-1 eu-central-1 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-west-2  | 
| Anthropic | Claude | anthropic.claude-v2:0:100k |  us-east-1 us-west-2  | 
| Anthropic | Claude | anthropic.claude-v2:0:18k |  us-east-1 us-west-2  | 
| Anthropic | Claude | anthropic.claude-v2:1:18k |  eu-central-1 us-east-1 us-west-2  | 
| Anthropic | Claude | anthropic.claude-v2:1:200k |  eu-central-1 us-east-1 us-west-2  | 
| Anthropic | Claude 3 Haiku | anthropic.claude-3-haiku-20240307-v1:0:200k |  ap-southeast-2 eu-west-3 us-east-1 us-west-2  | 
| Anthropic | Claude 3 Haiku | anthropic.claude-3-haiku-20240307-v1:0:48k |  ap-south-1 ap-southeast-2 eu-west-1 eu-west-3 us-east-1 us-west-2  | 
| Anthropic | Claude 3 Sonnet | anthropic.claude-3-sonnet-20240229-v1:0:200k |  ap-southeast-2 eu-west-1 eu-west-3 us-east-1 us-west-2  | 
| Anthropic | Claude 3 Sonnet | anthropic.claude-3-sonnet-20240229-v1:0:28k |  ap-south-1 ap-southeast-2 eu-west-1 eu-west-3 us-east-1 us-west-2  | 
| Anthropic | Claude 3.5 Sonnet | anthropic.claude-3-5-sonnet-20240620-v1:0:18k |  us-west-2  | 
| Anthropic | Claude 3.5 Sonnet | anthropic.claude-3-5-sonnet-20240620-v1:0:200k |  us-west-2  | 
| Anthropic | Claude 3.5 Sonnet | anthropic.claude-3-5-sonnet-20240620-v1:0:51k |  us-west-2  | 
| Anthropic | Claude 3.5 Sonnet v2 | anthropic.claude-3-5-sonnet-20241022-v2:0:18k |  us-west-2  | 
| Anthropic | Claude 3.5 Sonnet v2 | anthropic.claude-3-5-sonnet-20241022-v2:0:200k |  us-west-2  | 
| Anthropic | Claude 3.5 Sonnet v2 | anthropic.claude-3-5-sonnet-20241022-v2:0:51k |  us-west-2  | 
| Anthropic | Claude Instant | anthropic.claude-instant-v1:2:100k |  us-east-1 us-west-2  | 
| Cohere | Embed English | cohere.embed-english-v3:0:512 |  ca-central-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-west-2  | 
| Cohere | Embed Multilingual | cohere.embed-multilingual-v3:0:512 |  ca-central-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-west-2  | 
| Meta | Llama 3.1 70B Instruct | meta.llama3-1-70b-instruct-v1:0:128k |  us-west-2  | 
| Meta | Llama 3.1 8B Instruct | meta.llama3-1-8b-instruct-v1:0:128k |  us-west-2  | 
| Meta | Llama 3.2 11B Instruct | meta.llama3-2-11b-instruct-v1:0:128k |  us-west-2  | 
| Meta | Llama 3.2 1B Instruct | meta.llama3-2-1b-instruct-v1:0:128k |  us-west-2  | 
| Meta | Llama 3.2 3B Instruct | meta.llama3-2-3b-instruct-v1:0:128k |  us-west-2  | 
| Meta | Llama 3.2 90B Instruct | meta.llama3-2-90b-instruct-v1:0:128k |  us-west-2  | 

**Note**  
The following models don't support no-commitment purchases for the base model:  
Titan Image Generator G1 V1  
Titan Image Generator G1 V2

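# Prerequisites for Provisioned Throughput
<a name="prov-thru-prereq"></a>

Before you can purchase and manage Provisioned Throughput, you must fulfill the following prerequisites:

1. [Request access to the model](model-access.md) that you want to purchase Provisioned Throughput for. After access is granted, you can purchase Provisioned Throughput for the base model and for any models customized from it.

1. Ensure that your IAM role has access to the Provisioned Throughput API actions. If your role has the [AmazonBedrockFullAccess](security-iam-awsmanpol.md#security-iam-awsmanpol-AmazonBedrockFullAccess) AWS managed policy attached, you can skip this step. Otherwise, do the following:

   1. Follow the steps at [Creating IAM policies](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_create.html) and create the following policy, which allows a role to create a Provisioned Throughput for all foundation and custom models.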

------
#### [ JSON ]

****  

      ```
      {
          "Version": "2012-10-17",
          "Statement": [
              {
                  "Sid": "PermissionsForProvisionedThroughput",
                  "Effect": "Allow",
                  "Action": [
                      "bedrock:GetFoundationModel",
                      "bedrock:ListFoundationModels",
                      "bedrock:GetCustomModel",
                      "bedrock:ListCustomModels",
                      "bedrock:InvokeModel",
                      "bedrock:InvokeModelWithResponseStream",
                      "bedrock:ListTagsForResource",
                      "bedrock:UntagResource",
                      "bedrock:TagResource",
                      "bedrock:CreateProvisionedModelThroughput",
                      "bedrock:GetProvisionedModelThroughput",
                      "bedrock:ListProvisionedModelThroughputs",
                      "bedrock:UpdateProvisionedModelThroughput",
                      "bedrock:DeleteProvisionedModelThroughput"
                  ],
                  "Resource": "*"
              }
          ]
      }
      ```

------
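**Note**  
If you use Provisioned Throughput with cross-Region inference, you might need additional permissions. To learn more, see [Increase throughput with cross-Region inference](cross-region-inference.md).

      (Optional) You can restrict the role's access in the following ways:
      + To restrict the API actions that the role can perform, modify the list in the `Action` field to contain only the [API operations](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html#amazonbedrock-actions-as-permissions) that you want to allow access to.
      + After you create a provisioned model, you can restrict the role's ability to make API requests with it by modifying the `Resource` list to contain only the [provisioned models](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html#amazonbedrock-resources-for-iam-policies) that you want to allow access to. For an example, see [Allow users to invoke a provisioned model](security_iam_id-based-policy-examples.md#security_iam_id-based-policy-examples-perform-actions-pt).
      + To restrict the role's ability to create provisioned models from specific foundation or custom models, modify the `Resource` list to contain only the [foundation and custom models](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html#amazonbedrock-resources-for-iam-policies) that you want to allow access to.

   1. Follow the steps at [Adding and removing IAM identity permissions](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html) to attach the policy to a role and grant the role permissions.

1. If you're purchasing Provisioned Throughput for a custom model that is encrypted with a customer managed AWS KMS key, your IAM role must have permissions to decrypt the key. You can use the template at [Understand how to create a customer managed key and how to attach a key policy to it](encryption-custom-job.md#encryption-key-policy). For minimal permissions, you can use only the *Permissions for custom model users* policy statement.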

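# Purchase a Provisioned Throughput for an Amazon Bedrock model
<a name="prov-thru-purchase"></a>

Amazon Bedrock offers two types of Provisioned Throughput: by tokens and by model units. See the instructions below for the type of Provisioned Throughput that you want to purchase.

To learn more about the differences between the two types of Provisioned Throughput, see [Increase model invocation capacity with Provisioned Throughput in Amazon Bedrock](prov-throughput.md).

## Provisioned Throughput by model units
<a name="prov-thru-purchase-MUs"></a>

When you purchase a Provisioned Throughput by model units for a model, you specify its level of commitment and the number of model units (MUs) to allot. For MU quotas, see [Amazon Bedrock endpoints and quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html) in the AWS General Reference. Before you can purchase Provisioned Throughput (with or without commitment), you must first visit the [AWS Support Center](https://console.aws.amazon.com/support/home#/case/create?issueType=service-limit-increase) to request MUs for your account to distribute among Provisioned Throughputs. After your request is granted, you can purchase a Provisioned Throughput.

**Note**  
After you purchase a Provisioned Throughput, if it's associated with a custom model, you can change the model by specifying one of the following:  
The base model that the custom model was customized from  
Another custom model that was customized from the same base model as the custom model  
You can change the associated model only for a Provisioned Throughput that is associated with a custom model.

To learn how to purchase Provisioned Throughput for a model, choose the tab for your preferred method, and then follow the steps: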

------
#### [ Console ]

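1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

1. Select **Provisioned Throughput** from the left navigation pane.

1. In the **Provisioned Throughput** section, choose **Purchase Provisioned Throughput**.

1. For the **Provisioned Throughput details** section, do the following:

   1. In the **Provisioned Throughput name** field, enter a name for the Provisioned Throughput.

   1. Under **Select model**, select a base model provider or a custom model category. Then select the model for which to provision throughput.
**Note**  
To see the base models for which you can purchase Provisioned Throughput without commitment, see the supported models documentation.  
In AWS GovCloud (US) Regions, you can purchase Provisioned Throughput only for custom models with no commitment.

   1. (Optional) To associate tags with your Provisioned Throughput, expand the **Tags** section and choose **Add new tag**. For more information, see [Tagging Amazon Bedrock resources](tagging.md).

1. For **Provisioning mode**, select **By model units**.

1. For the **Commitment term & model units** section, do the following:

   1. In the **Select commitment term** section, select the amount of time for which you want to commit to using the Provisioned Throughput.

   1. In the **Model units** field, enter the desired number of model units (MUs). If you are provisioning a model with commitment, you must first visit the [AWS Support Center](https://console.aws.amazon.com/support/home#/case/create?issueType=service-limit-increase) to request an increase in the number of MUs that you can purchase.

1. Choose **Purchase Provisioned Throughput**.

1. Review the note that appears and select the checkbox to acknowledge the commitment term and price. Then choose **Confirm purchase**.

1. The console displays the **Provisioned Throughput** overview page. The **Status** of the Provisioned Throughput in the Provisioned Throughput table becomes **Creating**. When the Provisioned Throughput is finished being created, the **Status** becomes **In service**. If the creation fails, the **Status** becomes **Failed**.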

------
#### [ API ]

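To purchase a Provisioned Throughput, send a [CreateProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateProvisionedModelThroughput.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp).

To learn more about the request body and the parameters required to create a Provisioned Throughput by model units, see [CreateProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateProvisionedModelThroughput.html) in the *Amazon Bedrock API Reference*.

**Note**  
To see the base models for which you can purchase Provisioned Throughput without commitment, see the supported models documentation.  
In AWS GovCloud (US) Regions, you can purchase Provisioned Throughput only for custom models with no commitment.

The response returns a `provisionedModelArn` that you can use as the `modelId` in [model inference](inference.md). To check when the Provisioned Throughput is ready for use, send a [GetProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GetProvisionedModelThroughput.html) request and check that the status is `InService`. If the creation fails, its status will be `Failed`, and the [GetProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GetProvisionedModelThroughput.html) response will contain a `failureMessage`.

[See code examples](prov-thru-code-examples.md)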

------
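
The purchase-then-poll flow described in the API tab can be sketched as follows, assuming a client that behaves like `boto3.client("bedrock")`. The helper name `purchase_and_wait`, the polling interval, and the polling limit are illustrative assumptions. Note that a real call starts billing, so treat this as a sketch to adapt rather than something to run as-is.

```python
import time


def purchase_and_wait(client, name, model_id, model_units, commitment=None,
                      poll_seconds=60, max_polls=60, sleep=time.sleep):
    """Create a Provisioned Throughput and wait until it is InService.

    `client` is expected to behave like boto3.client("bedrock").
    `commitment` is "OneMonth", "SixMonths", or None for no commitment.
    Returns the provisioned model ARN.
    """
    request = {
        "provisionedModelName": name,
        "modelId": model_id,
        "modelUnits": model_units,
    }
    if commitment:
        request["commitmentDuration"] = commitment
    arn = client.create_provisioned_model_throughput(**request)[
        "provisionedModelArn"
    ]

    for _ in range(max_polls):
        info = client.get_provisioned_model_throughput(provisionedModelId=arn)
        if info["status"] == "InService":
            return arn
        if info["status"] == "Failed":
            raise RuntimeError(info.get("failureMessage", "creation failed"))
        sleep(poll_seconds)  # still Creating; wait before asking again
    raise TimeoutError(f"{arn} was not InService after {max_polls} polls")


# Example usage (requires boto3 and AWS credentials; this incurs charges):
#   import boto3
#   arn = purchase_and_wait(boto3.client("bedrock", region_name="us-east-1"),
#                           "my-pt", "amazon.nova-lite-v1:0:24k", 1)
```

The returned ARN can then be passed as the `modelId` in inference requests.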

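# View information about a Provisioned Throughput
<a name="prov-thru-info"></a>

To learn how to view information about a Provisioned Throughput that you've purchased, choose the tab for your preferred method, and then follow the steps: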

------
#### [ Console ]

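**To view information about a Provisioned Throughput**

1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

1. Select **Provisioned Throughput** from the left navigation pane.

1. From the **Provisioned Throughput** section, select a Provisioned Throughput.

1. View the details of the Provisioned Throughput in the **Provisioned Throughput overview** section and the tags associated with it in the **Tags** section.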

------
#### [ API ]

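To retrieve information about a specific Provisioned Throughput, send a [GetProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GetProvisionedModelThroughput.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp). Specify either the name of the Provisioned Throughput or its ARN as the `provisionedModelId`.

To list information about all the Provisioned Throughputs in an account, send a [ListProvisionedModelThroughputs](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ListProvisionedModelThroughputs.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp). To control the number of results that are returned, you can specify the following optional parameters:


| Field | Short description | 
| --- | --- | 
| maxResults | The maximum number of results to return in a response. | 
| nextToken | If there are more results than the number that you specified in the maxResults field, the response returns a nextToken value. To see the next batch of results, send the nextToken value in another request. | 

For other optional parameters that you can specify to sort and filter the results, see [ListProvisionedModelThroughputs](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ListProvisionedModelThroughputs.html).

To list all the tags for a Provisioned Throughput, send a [ListTagsForResource](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ListTagsForResource.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp) and include the Amazon Resource Name (ARN) of the Provisioned Throughput.

[See code examples](prov-thru-code-examples.md)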

------

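# Modify a Provisioned Throughput
<a name="prov-thru-edit"></a>

The aspects of a Provisioned Throughput that you can edit after purchase depend on the provisioning mode. For a Provisioned Throughput by model units, you can edit only the name and tags and, if the associated model is a custom model, the model itself.

With a Provisioned Throughput by tokens, you have more options, including modifying the number of input and output tokens per minute.

See the following sections to learn how to edit the type of Provisioned Throughput that you want to modify.

## Modify a Provisioned Throughput by model units
<a name="prov-thru-edit-MUs"></a>

You can edit the name or tags of an existing Provisioned Throughput.

The following restrictions apply to changing the model that a Provisioned Throughput is associated with:
+ You can't change the model for a Provisioned Throughput that is associated with a base model.
+ If the Provisioned Throughput is associated with a custom model, you can change the association to the base model that it was customized from or to another custom model that was derived from the same base model.

While a Provisioned Throughput is updating, you can run inference with it without disrupting the ongoing traffic from your end customers. If you changed the model that the Provisioned Throughput is associated with, you might receive output from the old model until the update is fully deployed.

To learn how to edit a Provisioned Throughput, choose the tab for your preferred method, and then follow the steps: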

------
#### [ Console ]

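1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

1. From the left navigation pane, select **Provisioned Throughput**.

1. In the **Provisioned Throughput** section, select a Provisioned Throughput.

1. Choose **Edit**. You can edit the following fields:
   + **Provisioned Throughput name** – Change the name of the Provisioned Throughput.
   + **Select model** – If the Provisioned Throughput is associated with a custom model, you can change the associated model.

1. You can edit the tags associated with the Provisioned Throughput in the **Tags** section. For more information, see [Tagging Amazon Bedrock resources](tagging.md).

1. To save your changes, choose **Save edits**.

1. The console displays the **Provisioned Throughput** overview page. The **Status** of the Provisioned Throughput in the Provisioned Throughput table becomes **Updating**. When the Provisioned Throughput finishes updating, the **Status** becomes **In service**. If the update fails, the **Status** becomes **Failed**.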

------
#### [ API ]

To edit a Provisioned Throughput, send an [UpdateProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_UpdateProvisionedModelThroughput.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp).

To learn more about the request body and the parameters that you need to provide, see [UpdateProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_UpdateProvisionedModelThroughput.html) in the *Amazon Bedrock API Reference*.

If the action is successful, the response returns an HTTP 200 status response. To check when the Provisioned Throughput is ready for use, send a [GetProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GetProvisionedModelThroughput.html) request and check that the status is `InService`. You can't update or delete a Provisioned Throughput while its status is `Updating`. If the update fails, its status will be `Failed`, and the [GetProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GetProvisionedModelThroughput.html) response will contain a `failureMessage`.

To add tags to a Provisioned Throughput, send a [TagResource](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_TagResource.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp) and include the Amazon Resource Name (ARN) of the Provisioned Throughput. The request body contains a `tags` field, which is an object containing a key-value pair that you specify for each tag.

To remove tags from a Provisioned Throughput, send an [UntagResource](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_UntagResource.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp) and include the Amazon Resource Name (ARN) of the Provisioned Throughput. The `tagKeys` request parameter is a list containing the keys for the tags that you want to remove.

[See code examples](prov-thru-code-examples.md)
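
Because a Provisioned Throughput can't be updated or deleted while it is `Updating`, it can help to poll `GetProvisionedModelThroughput` until the status settles. The sketch below illustrates one way to do that. The helper name `wait_until_in_service`, the polling interval, and the attempt limit are assumptions chosen for illustration, and the client is expected to come from `boto3.client("bedrock")`.

```python
import time


def wait_until_in_service(bedrock_client, provisioned_model_id,
                          poll_seconds=30, max_attempts=40, sleep=time.sleep):
    """Poll GetProvisionedModelThroughput until the status is InService."""
    for _ in range(max_attempts):
        info = bedrock_client.get_provisioned_model_throughput(
            provisionedModelId=provisioned_model_id
        )
        status = info["status"]
        if status == "InService":
            return info
        if status == "Failed":
            # failureMessage is included in the response when an update fails
            raise RuntimeError(info.get("failureMessage", "update failed"))
        # Still Creating or Updating: wait before checking again
        sleep(poll_seconds)
    raise TimeoutError(
        f"{provisioned_model_id} was not InService after {max_attempts} checks"
    )
```

The `sleep` parameter exists only so that the wait can be replaced in tests; in normal use the default `time.sleep` applies.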

------

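# Use a Provisioned Throughput with an Amazon Bedrock resource
<a name="prov-thru-use"></a>

After you purchase a Provisioned Throughput, you can use it with the following features:
+ **Model inference** – You can test the Provisioned Throughput in an Amazon Bedrock console playground. When you're ready to deploy the Provisioned Throughput, set up your application to invoke the provisioned model. Choose the tab for your preferred method, and then follow the steps: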

------
#### [ Console ]

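**To use a Provisioned Throughput in the Amazon Bedrock console playground**

  1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

  1. From the left navigation pane, select **Chat**, **Text**, or **Image** under **Playgrounds**, depending on your use case.

  1. Choose **Select model**.

  1. In the **1. Category** column, select a provider or custom model category. Then, in the **2. Model** column, select the model that your Provisioned Throughput is associated with.

  1. In the **3. Throughput** column, select your Provisioned Throughput.

  1. Choose **Apply**.

  To learn how to use the Amazon Bedrock playgrounds, see [Generate responses in the console using playgrounds](playgrounds.md).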

------
#### [ API ]

  To run inference using a Provisioned Throughput, send an [InvokeModel](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html), [InvokeModelWithResponseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModelWithResponseStream.html), [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html), or [ConverseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ConverseStream.html) request with an [Amazon Bedrock runtime endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-rt). Specify the provisioned model ARN as the `modelId` parameter. To see the request body requirements for different models, see [Inference request parameters and response fields for foundation models](model-parameters.md).

  [See code examples](prov-thru-code-examples.md)

------
+ **Associate a Provisioned Throughput with an agent alias** – You can associate a Provisioned Throughput when you [create](agents-deploy.md) or [update](agents-alias-edit.md) an agent alias. In the Amazon Bedrock console, you choose the Provisioned Throughput when you set up or edit the alias. In the Amazon Bedrock API, you specify the `provisionedThroughput` in the `routingConfiguration` when you send a [CreateAgentAlias](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_CreateAgentAlias.html) or [UpdateAgentAlias](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_UpdateAgentAlias.html) request.
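
For the model inference option, the only change from an on-demand call is that the provisioned model ARN is passed as `modelId`. The following sketch shows this with the Converse API. The helper name `ask_provisioned_model` and the `max_tokens` default are illustrative assumptions, and the client is expected to come from `boto3.client("bedrock-runtime")`.

```python
def ask_provisioned_model(runtime_client, provisioned_model_arn, prompt,
                          max_tokens=256):
    """Send one user message to a provisioned model through Converse."""
    response = runtime_client.converse(
        # The provisioned model ARN takes the place of a base model ID
        modelId=provisioned_model_arn,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"maxTokens": max_tokens},
    )
    parts = response["output"]["message"]["content"]
    # Join the text blocks and ignore any non-text content blocks
    return "".join(part.get("text", "") for part in parts)
```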

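# Delete a Provisioned Throughput or cancel auto-renewal
<a name="prov-thru-delete"></a>

Your Provisioned Throughput automatically renews at the end of each commitment term, maintaining your current input and output token configuration.

If you no longer want to keep a Provisioned Throughput, you can delete it or, for a Provisioned Throughput by tokens, cancel auto-renewal to prevent it from renewing when the current term ends.

## Delete a Provisioned Throughput
<a name="prov-thru-delete-del"></a>

When you delete a Provisioned Throughput, you can no longer invoke the model at the throughput level that you purchased. If you delete a Provisioned Throughput that is associated with a custom model, the custom model isn't deleted. To learn how to delete a custom model, see [Delete a custom model](model-customization-delete.md).

**Note**  
You can't delete a Provisioned Throughput by model units that has a commitment until the commitment term is complete.

To learn how to delete a Provisioned Throughput, choose the tab for your preferred method, and then follow the steps: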

------
#### [ Console ]

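1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

1. From the left navigation pane, select **Provisioned Throughput**.

1. In the **Provisioned Throughput** section, select a Provisioned Throughput.

1. From the **Actions** dropdown menu, choose **Delete**.

1. The console displays a modal form to warn you that deletion is permanent. Choose **Confirm** to proceed.

1. The Provisioned Throughput is deleted immediately.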

------
#### [ API ]

To delete a Provisioned Throughput, send a [DeleteProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_DeleteProvisionedModelThroughput.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp). Specify the name of the Provisioned Throughput or its ARN as the `provisionedModelId`. If deletion is successful, the response returns an HTTP 200 status code.

[See code examples](prov-thru-code-examples.md)

------

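## Cancel auto-renewal for a Provisioned Throughput
<a name="prov-thru-delete-cancel-auto-renew"></a>

For a Provisioned Throughput by tokens, you can cancel auto-renewal at any time before the commitment term ends to prevent the Provisioned Throughput from renewing automatically.

If you cancel auto-renewal, your Provisioned Throughput remains in service until the end of your commitment term. You still pay the full provisioning charge for the current term, whether or not you run inference.

After you cancel auto-renewal for a Provisioned Throughput, you can't make any further modifications to it for the remainder of the commitment term.

**Note**  
After you cancel auto-renewal, you can't re-enable it. If you need Provisioned Throughput after the current term expires, you must purchase a new Provisioned Throughput.

To learn how to cancel auto-renewal for a Provisioned Throughput by tokens, choose the tab for your preferred method, and then follow the steps: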

------
#### [ Console ]

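1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

1. From the left navigation pane, select **Provisioned Throughput**.

1. In the **Provisioned Throughput** section, select a Provisioned Throughput.

1. From the **Actions** dropdown menu, choose **Cancel auto-renewal**.

1. The console displays a modal form to warn you that this action can't be undone. Choose **Confirm** to proceed.

1. The Provisioned Throughput remains active until the current commitment term ends, after which it is deleted automatically.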

------
#### [ API ]

To cancel auto-renewal for a Provisioned Throughput, send an [UpdateProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_UpdateProvisionedModelThroughput.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp) and set the `disableAutoRenew` parameter to `true`. The Provisioned Throughput remains active until the current commitment term ends.

[See code examples](prov-thru-code-examples.md)

------

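# Code examples for Provisioned Throughput
<a name="prov-thru-code-examples"></a>

The following code examples demonstrate how to create a Provisioned Throughput, and how to manage and invoke it, using the AWS CLI and the Python SDK. You can create a Provisioned Throughput from a foundation model or from a model that you've already customized. Before you begin, carry out the following prerequisites:

**Prerequisites**

The following examples use the Amazon Nova Lite model, whose model ID is `amazon.nova-lite-v1:0:24k`. If you haven't already, request access to Amazon Nova Lite by following the steps in [Manage model access using SDK and CLI](model-access.md#model-access-modify).

If you want to purchase Provisioned Throughput for a different foundation model or for a custom model, you must do the following:

1. Find the model's ID (for foundation models), name (for custom models), or ARN (for either) by doing one of the following:
   + If you're purchasing a Provisioned Throughput for a foundation model, find the ID or Amazon Resource Name (ARN) of a model that supports provisioning in one of the following ways:
     + Look up the value in the table.
     + Send a [ListFoundationModels](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ListFoundationModels.html) request and specify the `byInferenceType` value as `PROVISIONED` to see a list of models that support provisioning. Find the value in the `modelId` or `modelArn` field.
   + If you're purchasing a Provisioned Throughput for a custom model, find the name or Amazon Resource Name (ARN) of the model that you customized in one of the following ways:
     + In the Amazon Bedrock console, choose **Custom models** from the left navigation pane. Find the name of your customized model in the **Models** list, or select it and find the **Model ARN** in the **Model details**.
     + Send a [ListCustomModels](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ListCustomModels.html) request and find the `modelName` or `modelArn` value of your custom model in the response.

1. Modify the `body` of the [InvokeModel](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html) request in the examples below to match the body format of your model, which you can find in [Inference request parameters and response fields for foundation models](model-parameters.md).

Choose the tab for your preferred method, and then follow the steps: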

------
#### [ AWS CLI ]

1. Send a [CreateProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateProvisionedModelThroughput.html) request to create a no-commitment Provisioned Throughput called *MyPT* by running the following command in a terminal:

   ```
   aws bedrock create-provisioned-model-throughput \
      --model-units 1 \
      --provisioned-model-name MyPT \
      --model-id amazon.nova-lite-v1:0:24k
   ```

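1. The response returns a `provisioned-model-arn`. Allow some time for the creation to complete. To check its status, send a [GetProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GetProvisionedModelThroughput.html) request and provide the name or ARN of the provisioned model as the `provisioned-model-id` by running the following command: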

   ```
   aws bedrock get-provisioned-model-throughput \
       --provisioned-model-id ${provisioned-model-arn}
   ```

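1. Run inference with your provisioned model by sending an [InvokeModel](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html) request. Provide the ARN of the provisioned model that was returned in the `CreateProvisionedModelThroughput` response as the `model-id`. The output is written to a file named *output.txt* in your current folder.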

   ```
   aws bedrock-runtime invoke-model \
       --model-id ${provisioned-model-arn} \
       --body '{
                   "messages": [{
                       "role": "user",
                       "content": [{
                           "text": "Hello"
                       }]
                   }],
                   "inferenceConfig": {
                       "temperature":0.7
                   }
               }' \
       --cli-binary-format raw-in-base64-out \
       output.txt
   ```

1. Delete the Provisioned Throughput by sending a [DeleteProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_DeleteProvisionedModelThroughput.html) request with the following command. You'll no longer be charged for the Provisioned Throughput.

   ```
   aws bedrock delete-provisioned-model-throughput \
       --provisioned-model-id MyPT
   ```

------
#### [ Python (Boto) ]

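The following code snippets walk you through creating a Provisioned Throughput, getting information about it, and invoking it.

1. To create a no-commitment Provisioned Throughput called *MyPT* and assign the ARN of the Provisioned Throughput to a variable called *provisioned_model_arn*, send the following [CreateProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateProvisionedModelThroughput.html) request: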

   ```
   import boto3 
   
   provisioned_model_name = 'MyPT'
   
   bedrock = boto3.client(service_name='bedrock')
   response = bedrock.create_provisioned_model_throughput(
       modelUnits=1,
       provisionedModelName=provisioned_model_name, 
       modelId='amazon.nova-lite-v1:0:24k' 
   )
                           
   provisioned_model_arn = response['provisionedModelArn']
   ```

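1. Allow some time for the creation to complete. You can check its status with the following code snippet. You can provide either the name of the Provisioned Throughput or the ARN returned from the [CreateProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateProvisionedModelThroughput.html) response as the `provisionedModelId`.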

   ```
   bedrock.get_provisioned_model_throughput(provisionedModelId=provisioned_model_name)
   ```

1. Run inference with your provisioned model by using the following code and providing the ARN of the provisioned model as the `modelId`.

   ```
   import json
   import logging
   import boto3
   
   from botocore.exceptions import ClientError
   
   
   class ImageError(Exception):
       "Custom exception for errors returned by the model"
   
       def __init__(self, message):
           self.message = message
   
   
   logger = logging.getLogger(__name__)
   logging.basicConfig(level=logging.INFO)
   
   
   def generate_text(model_id, body):
       """
       Generate text using your provisioned custom model.
       Args:
           model_id (str): The model ID to use.
           body (str) : The request body to use.
       Returns:
           response (json): The response from the model.
       """
   
       logger.info(
           "Generating text with your provisioned custom model %s", model_id)
   
       brt = boto3.client(service_name='bedrock-runtime')
   
       accept = "application/json"
       content_type = "application/json"
   
       response = brt.invoke_model(
           body=body, modelId=model_id, accept=accept, contentType=content_type
       )
       response_body = json.loads(response.get("body").read())
   
       finish_reason = response_body.get("error")
   
       if finish_reason is not None:
           raise ImageError(f"Text generation error. Error is {finish_reason}")
   
       logger.info(
           "Successfully generated text with provisioned custom model %s", model_id)
   
       return response_body
   
   
   def main():
       """
       Entrypoint for example.
       """
       try:
           logging.basicConfig(level=logging.INFO,
                               format="%(levelname)s: %(message)s")
   
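           # ARN returned by CreateProvisionedModelThroughput in the first step
           model_id = provisioned_model_arn
   
           # Amazon Nova request format, matching the AWS CLI example
           body = json.dumps({
               "messages": [{
                   "role": "user",
                   "content": [{"text": "What is AWS?"}]
               }]
           })
   
           response_body = generate_text(model_id, body)
           usage = response_body.get("usage", {})
           print(f"Input token count: {usage.get('inputTokens')}")
           print(f"Output token count: {usage.get('outputTokens')}")
   
           for part in response_body["output"]["message"]["content"]:
               print(f"Output text: {part.get('text')}")
           print(f"Stop reason: {response_body.get('stopReason')}")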
   
       except ClientError as err:
           message = err.response["Error"]["Message"]
           logger.error("A client error occurred: %s", message)
           print("A client error occurred: " +
                 format(message))
       except ImageError as err:
           logger.error(err.message)
           print(err.message)
   
       else:
           print(
               f"Finished generating text with your provisioned custom model {model_id}.")
   
   
   if __name__ == "__main__":
       main()
   ```

1. Delete the Provisioned Throughput with the following code snippet. You'll no longer be charged for the Provisioned Throughput.

   ```
   bedrock.delete_provisioned_model_throughput(provisionedModelId=provisioned_model_name)
   ```

------

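# Quotas for Amazon Bedrock
<a name="quotas"></a>

Your AWS account has default quotas, formerly referred to as limits, for Amazon Bedrock. To view service quotas for Amazon Bedrock, do one of the following:
+ Follow the steps at [Viewing service quotas](https://docs.aws.amazon.com/servicequotas/latest/userguide/gs-request-quota.html) and select **Amazon Bedrock** as the service.
+ Refer to [Amazon Bedrock service quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#limits_bedrock) in the AWS General Reference.

Model inference in Amazon Bedrock is governed by quotas on token usage. Some models consume tokens at a higher rate. For more information about these rates and how to optimize your token usage, see [How tokens are counted in Amazon Bedrock](quotas-token-burndown.md).

To maintain the performance of the service and to ensure appropriate usage of Amazon Bedrock, the default quotas assigned to an account might be updated depending on regional factors, payment history, fraudulent usage, and/or approval of a [quota increase request](quotas-increase.md).

**Topics**
+ [How tokens are counted in Amazon Bedrock](quotas-token-burndown.md)
+ [Monitor your token usage by counting tokens before running inference](count-tokens.md)
+ [Request an increase for Amazon Bedrock quotas](quotas-increase.md)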

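# How tokens are counted in Amazon Bedrock
<a name="quotas-token-burndown"></a>

When you run model inference, there are quotas on the number of tokens that can be processed, depending on which Amazon Bedrock model you use. Review the following terminology related to token quotas:

| Term | Definition | 
| --- | --- | 
| InputTokenCount | The CloudWatch Amazon Bedrock runtime metric that represents the number of tokens in a request provided as input to the model. | 
| OutputTokenCount | The CloudWatch Amazon Bedrock runtime metric that represents the number of tokens generated by the model in response to a request. | 
| CacheReadInputTokens | The CloudWatch Amazon Bedrock runtime metric that represents the number of input tokens that were successfully retrieved from a cache instead of being reprocessed by the model. This value is 0 if you don't use [prompt caching](prompt-caching.md). | 
| CacheWriteInputTokens | The CloudWatch Amazon Bedrock runtime metric that represents the number of input tokens that were successfully written into the cache. This value is 0 if you don't use [prompt caching](prompt-caching.md). | 
| Tokens per minute (TPM) | A quota set by AWS at the model level on the number of tokens (including both input and output) that you can use in one minute. | 
| Tokens per day (TPD) | A quota set by AWS at the model level on the number of tokens (including both input and output) that you can use in one day. By default, this value is TPM x 24 x 60. However, new AWS accounts have reduced quotas. | 
| Requests per minute (RPM) | A quota set by AWS at the model level on the number of requests that you can send in one minute. | 
| max_tokens | A parameter that you provide in your request to set the maximum number of output tokens that the model can generate. | 
| Burndown rate | The rate at which input and output tokens are converted into token quota usage for the throttling system. | 

The burndown rate for Anthropic Claude models version 3.7 and later is **5x for output tokens** (1 output token consumes 5 tokens from your quotas).

For all other models, the burndown rate is **1:1** (1 output token consumes 1 token from your quota).

**Topics**
+ [Understanding token quota management](#quotas-token-burndown-management)
+ [Understanding the impact of the max_tokens parameter](#quotas-token-burndown-max-tokens)
+ [Optimizing the max_tokens parameter](#quotas-token-burndown-max-tokens-optimize)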

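## Understanding token quota management
<a name="quotas-token-burndown-management"></a>

When you make a request, tokens are deducted from your TPM and TPD quotas. Calculations occur at the following stages:
+ **At the start of the request** – Assuming that you haven't exceeded your RPM quota, the following sum is deducted from your quotas. If you exceed a quota, the request is throttled.

  ```
  Total input tokens + max_tokens
  ```
+ **During processing** – The quota consumed by the request is periodically adjusted to account for the actual number of output tokens generated.
+ **At the end of the request** – The total number of tokens consumed by the request is calculated as follows, and any unused tokens are replenished to your quota:

  ```
  InputTokenCount + CacheWriteInputTokens + (OutputTokenCount x burndown rate)
  ```

  If you don't use [prompt caching](prompt-caching.md), `CacheWriteInputTokens` is 0. `CacheReadInputTokens` don't contribute to this calculation.

**Note**  
You're billed only for your actual token usage.  
For example, suppose that you use Anthropic Claude Sonnet 4 and send a request containing 1,000 input tokens, and the model generates a response of 100 tokens:  
**1,500 tokens** (1,000 + 100 x 5) are depleted from your TPM and TPD quotas.  
You're billed for only **1,100 tokens**.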

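## Understanding the impact of the max_tokens parameter
<a name="quotas-token-burndown-max-tokens"></a>

The `max_tokens` value is deducted from your quota at the beginning of each request. If you're hitting your TPM quota earlier than expected, try reducing `max_tokens` to better approximate the size of your completions.

The following scenarios show how quota deductions work for completed requests with a model that has a 5x burndown rate for output tokens:

### Scenario 1: High max_tokens value
<a name="quotas-token-burndown-max-tokens-too-high"></a>

Assume the following parameters:
+ **InputTokenCount:** 3,000
+ **CacheReadInputTokens:** 4,000
+ **CacheWriteInputTokens:** 1,000
+ **OutputTokenCount:** 1,000
+ **max_tokens:** 32,000

The following quota deductions take place:
+ **Initial deduction when the request is made:** 40,000 (= 3,000 + 4,000 + 1,000 + 32,000)
+ **Final adjusted deduction after the response is generated:** 9,000 (= 3,000 + 1,000 + 1,000 x 5)

In this scenario, fewer concurrent requests can be made because the `max_tokens` parameter is set too high. The initial deduction reserves far more quota than the request ends up using, so the TPM quota capacity is reached quickly, which reduces request concurrency, throughput, and quota utilization.

### Scenario 2: Optimized max_tokens value
<a name="quotas-token-burndown-max-tokens-optimized"></a>

Assume the following parameters:
+ **InputTokenCount:** 3,000
+ **CacheReadInputTokens:** 4,000
+ **CacheWriteInputTokens:** 1,000
+ **OutputTokenCount:** 1,000
+ **max_tokens:** 1,250

The following quota deductions take place:
+ **Initial deduction when the request is made:** 9,250 (= 3,000 + 4,000 + 1,000 + 1,250)
+ **Final adjusted deduction after the response is generated:** 9,000 (= 3,000 + 1,000 + 1,000 x 5)

In this scenario, the `max_tokens` parameter is optimized because the initial deduction is only slightly higher than the final adjusted deduction. This helps to increase request concurrency, throughput, and quota utilization.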
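
The arithmetic in the two scenarios can be reproduced with a few lines of code, which may be useful for estimating how a given `max_tokens` setting affects quota usage. This is only a sketch of the formulas stated in this section: the function names are illustrative, and the default burndown rate of 5 applies only to models that have a 5x output burndown rate.

```python
def initial_deduction(input_tokens, cache_read_tokens, cache_write_tokens,
                      max_tokens):
    """Quota reserved at the start of a request: total input + max_tokens."""
    return input_tokens + cache_read_tokens + cache_write_tokens + max_tokens


def final_deduction(input_tokens, cache_write_tokens, output_tokens,
                    burndown_rate=5):
    """Quota consumed once the response is complete.

    Cache reads are not counted; output tokens are scaled by the burndown rate.
    """
    return input_tokens + cache_write_tokens + output_tokens * burndown_rate


# Scenario 1: max_tokens far above the actual completion size
print(initial_deduction(3000, 4000, 1000, 32000))  # 40000
# Scenario 2: max_tokens close to the actual completion size
print(initial_deduction(3000, 4000, 1000, 1250))   # 9250
# Both scenarios settle at the same final deduction
print(final_deduction(3000, 1000, 1000))           # 9000
```

The gap between the initial and final values (31,000 tokens in scenario 1, 250 in scenario 2) is the quota that is held back from other concurrent requests until the request finishes.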

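## Optimizing the max_tokens parameter
<a name="quotas-token-burndown-max-tokens-optimize"></a>

By optimizing the `max_tokens` parameter, you can make efficient use of your allocated quota capacity. To help inform your decision about this parameter, you can use Amazon CloudWatch, which automatically collects metrics from AWS services, including token usage data in Amazon Bedrock.

Tokens are recorded in the `InputTokenCount` and `OutputTokenCount` runtime metrics (for more metrics, see [Amazon Bedrock runtime metrics](monitoring.md#runtime-cloudwatch-metrics)).

To use CloudWatch monitoring to inform your decision on the `max_tokens` parameter, do the following in the AWS Management Console:

1. Sign in to the Amazon CloudWatch console at [https://console.aws.amazon.com/cloudwatch](https://console.aws.amazon.com/cloudwatch).

1. From the left navigation pane, select **Dashboards**.

1. Select the **Automatic dashboards** tab.

1. Select **Bedrock**.

1. In the **Token Counts by Model** dashboard, select the expand icon.

1. Select a time duration and range parameters for the metrics to account for peak usage.

1. From the dropdown menu labeled **Sum**, you can choose different metrics to observe your token usage. Examine these metrics to guide your decision on setting your `max_tokens` value.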
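
The same metrics can be read programmatically. The sketch below queries the `OutputTokenCount` metric through the CloudWatch `GetMetricStatistics` API and derives a candidate `max_tokens` value from the observed peak. The helper names, the seven-day window, and the 25 percent headroom are assumptions made for illustration, not recommended values, and the client is expected to come from `boto3.client("cloudwatch")`.

```python
import math
from datetime import datetime, timedelta, timezone


def fetch_peak_output_tokens(cloudwatch_client, model_id, days=7):
    """Return the hourly maximum OutputTokenCount values for one model."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(days=days)
    stats = cloudwatch_client.get_metric_statistics(
        Namespace="AWS/Bedrock",
        MetricName="OutputTokenCount",
        Dimensions=[{"Name": "ModelId", "Value": model_id}],
        StartTime=start,
        EndTime=end,
        Period=3600,  # one datapoint per hour
        Statistics=["Maximum"],
    )
    return [point["Maximum"] for point in stats["Datapoints"]]


def suggest_max_tokens(peaks, headroom=1.25):
    """Largest observed completion size plus a safety margin (assumed 25%)."""
    if not peaks:
        raise ValueError("no datapoints to base a suggestion on")
    return math.ceil(max(peaks) * headroom)
```

A value derived this way is only a starting point. Completions that reach `max_tokens` are cut off, so it is worth checking that the chosen margin does not truncate legitimate long responses.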

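# Monitor your token usage by counting tokens before running inference
<a name="count-tokens"></a>

When you run model inference, the number of tokens that you send in the input contributes to the cost of the request and counts toward the quota of tokens that you can use per minute and per day. The [CountTokens](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_CountTokens.html) API helps you estimate token usage before you send a request to a foundation model by returning the token count that would be used if the same input were sent to the model in an inference request.

**Note**  
Using the [CountTokens](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_CountTokens.html) API doesn't incur charges.

Token counting is model-specific because different models use different tokenization strategies. The token count returned by this operation matches the token count that would be charged if the same input were sent to the model to run inference.

You can use the `CountTokens` API to do the following:
+ Estimate costs before sending inference requests.
+ Optimize prompts to fit within token limits.
+ Plan for token usage in your applications.

**Topics**
+ [Supported models and Regions for token counting](#count-tokens-supported)
+ [Count tokens in a request](#count-tokens-use)
+ [Try out examples](#count-tokens-example)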

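## Supported models and Regions for token counting
<a name="count-tokens-supported"></a>

The following table shows foundation model support for token counting:


| Provider | Model | Model ID | Single-region model support | 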
| --- | --- | --- | --- | 
| Anthropic | Claude 3.5 Haiku | anthropic.claude-3-5-haiku-20241022-v1:0 |  us-west-2  | 
| Anthropic | Claude 3.5 Sonnet | anthropic.claude-3-5-sonnet-20240620-v1:0 |  ap-northeast-1 ap-southeast-1 eu-central-1 eu-central-2 us-east-1 us-west-2  | 
| Anthropic | Claude 3.5 Sonnet v2 | anthropic.claude-3-5-sonnet-20241022-v2:0 |  ap-southeast-2 us-west-2  | 
| Anthropic | Claude 3.7 Sonnet | anthropic.claude-3-7-sonnet-20250219-v1:0 |  eu-west-2  | 
| Anthropic | Claude Opus 4 | anthropic.claude-opus-4-20250514-v1:0 |  | 
| Anthropic | Claude Sonnet 4 | anthropic.claude-sonnet-4-20250514-v1:0 |  | 

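## Count tokens in a request
<a name="count-tokens-use"></a>

To count the number of input tokens in an inference request, send a [CountTokens](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_CountTokens.html) request with an [Amazon Bedrock runtime endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-rt), specify the model in the header, and specify the input to count tokens for in the `body` field. The value of the `body` field depends on whether you're counting input tokens for an [InvokeModel](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html) or a [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html) request:
+ For an `InvokeModel` request, the `body` is a string representing a JSON object whose format depends on the model that you specify.
+ For a `Converse` request, the `body` is a JSON object specifying the `messages` and `system` prompts included in the conversation.

## Try out examples
<a name="count-tokens-example"></a>

The examples in this section let you count tokens for an `InvokeModel` and a `Converse` request with Anthropic Claude 3.5 Haiku.

**Prerequisites**
+ You've downloaded the AWS SDK for Python (Boto3), and your configuration is set up so that your credentials and default AWS Region are automatically recognized.
+ Your IAM identity has permissions for the following actions (for more information, see [Actions, resources, and condition keys for Amazon Bedrock](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html)):
  + bedrock:CountTokens: Allows the usage of `CountTokens`.
  + bedrock:InvokeModel: Allows the usage of `InvokeModel` and `Converse`. It should be scoped, at minimum, to *arn:${Partition}:bedrock:${Region}::foundation-model/anthropic.claude-3-5-haiku-20241022-v1:0*.

To try out counting tokens for an [InvokeModel](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html) request, run the following Python code: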

```
import boto3
import json

bedrock_runtime = boto3.client("bedrock-runtime")

input_to_count = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 500,
    "messages": [
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ]
})

response = bedrock_runtime.count_tokens(
    modelId="anthropic.claude-3-5-haiku-20241022-v1:0",
    input={
        "invokeModel": {
            "body": input_to_count
        }
    }
)

print(response["inputTokens"])
```

To try out counting tokens for a [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html) request, run the following Python code:

```
import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

input_to_count = {
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "text": "What is the capital of France?"
                }
            ]
        },
        {
            "role": "assistant",
            "content": [
                {
                    "text": "The capital of France is Paris."
                }
            ]
        },
        {
            "role": "user",
            "content": [
                {
                    "text": "What is its population?"
                }
            ]
        }
    ],
    "system": [
        {
            "text": "You're an expert in geography."
        }
    ]
}

response = bedrock_runtime.count_tokens(
    modelId="anthropic.claude-3-5-haiku-20241022-v1:0",
    input={
        "converse": input_to_count
    }
)

print(response["inputTokens"])
```
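
One way to put the returned count to use is as a pre-flight check that decides whether a conversation still fits a chosen input budget before the real inference request is sent. The following sketch assumes a hypothetical helper name, `fits_input_budget`, and a caller-chosen `max_input_tokens` limit; neither is part of the API. The client is expected to come from `boto3.client("bedrock-runtime")`.

```python
def fits_input_budget(runtime_client, model_id, messages, system,
                      max_input_tokens):
    """Count tokens for a Converse-style input and compare with a budget.

    Returns a (fits, counted) tuple so that the caller can log the count.
    """
    response = runtime_client.count_tokens(
        modelId=model_id,
        input={"converse": {"messages": messages, "system": system}},
    )
    counted = response["inputTokens"]
    return counted <= max_input_tokens, counted
```

If the check fails, an application might trim older turns from `messages` and count again, rather than sending a request that would be rejected or would consume more quota than intended.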

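# Request an increase for Amazon Bedrock quotas
<a name="quotas-increase"></a>

The steps for requesting a quota increase for your account depend on the value of the **Adjustable** column in the quota tables at [Amazon Bedrock service quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#limits_bedrock):
+ If a quota is marked as **Yes**, you can adjust it by following the steps at [Requesting a quota increase](https://docs.aws.amazon.com/servicequotas/latest/userguide/request-quota-increase.html) in the Service Quotas User Guide.
+ For any model, you can request an increase for the following quotas together:
  + Cross-region InvokeModel tokens per minute for *${model}*
  + Cross-region InvokeModel requests per minute for *${model}*
  + On-demand InvokeModel tokens per minute for *${model}*
  + On-demand InvokeModel requests per minute for *${model}*
  + Model invocation max tokens per day for *${model}*

  To request an increase for any combination of these quotas, follow the steps at [Requesting a quota increase](https://docs.aws.amazon.com/servicequotas/latest/userguide/request-quota-increase.html) in the Service Quotas User Guide to request an increase for the **Cross-region InvokeModel tokens per minute for *${model}*** quota. After you do so, the support team will contact you and offer you the option to also increase the other four quotas.

**Note**  
Due to overwhelming demand, priority is given to customers who generate traffic that consumes their existing quota allocation. Your request might be denied if you don't meet this condition.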
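
Before filing a request, it can be useful to look up the quota code and current value of the quota that you want to raise. The sketch below pages through the Service Quotas `ListServiceQuotas` API and filters by name. The helper name `find_bedrock_quotas` is hypothetical, the service code `bedrock` is assumed here, and the client is expected to come from `boto3.client("service-quotas")`.

```python
def find_bedrock_quotas(quotas_client, name_fragment):
    """Return Amazon Bedrock quotas whose name contains name_fragment."""
    matches = []
    kwargs = {"ServiceCode": "bedrock"}  # assumed service code
    fragment = name_fragment.lower()
    while True:
        page = quotas_client.list_service_quotas(**kwargs)
        for quota in page.get("Quotas", []):
            if fragment in quota["QuotaName"].lower():
                matches.append({
                    "QuotaCode": quota["QuotaCode"],
                    "QuotaName": quota["QuotaName"],
                    "Value": quota.get("Value"),
                    "Adjustable": quota.get("Adjustable"),
                })
        token = page.get("NextToken")
        if not token:
            return matches
        # Continue with the next page of quotas
        kwargs["NextToken"] = token
```

The `QuotaCode` of a match is what the Service Quotas `RequestServiceQuotaIncrease` API expects, together with the service code and the desired value.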

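# Prompt caching for faster model inference
<a name="prompt-caching"></a>

Prompt caching is an optional feature that you can use with supported models on Amazon Bedrock to reduce inference response latency and input token costs. By adding portions of your context to a cache, the model can leverage the cache to skip recomputation of inputs, which allows Bedrock to share in the compute savings and lower your response latencies.

Prompt caching can help when you have workloads with long and repeated contexts that are frequently reused for multiple queries. For example, if you have a chatbot where users can upload documents and ask questions about them, it can be time consuming for the model to process the document every time the user provides input. With prompt caching, you can cache the document so that future queries containing the document don't need to reprocess it.

When you use prompt caching, you're charged at a reduced rate for tokens read from the cache. Depending on the model, tokens written to the cache might be charged at a rate that is higher than that of uncached input tokens. Any tokens that are not read from or written to the cache are charged at the standard input token rate for that model. For more information, see the [Amazon Bedrock pricing page](https://aws.amazon.com/bedrock/pricing/).

## How it works
<a name="prompt-caching-overview"></a>

If you opt to use prompt caching, Amazon Bedrock creates a cache composed of *cache checkpoints*. These are markers that define the contiguous subsection of your prompt that you want to cache (often referred to as a prompt prefix). These prompt prefixes should be static between requests. Alterations to the prompt prefix in subsequent requests result in cache misses.

Cache checkpoints have a minimum and maximum number of tokens, dependent on the specific model that you're using. You can create a cache checkpoint only if your total prompt prefix meets the minimum number of tokens. For example, the Anthropic Claude 3.7 Sonnet model requires at least 1,024 tokens per cache checkpoint. That means that your first cache checkpoint can be defined after 1,024 tokens and your second cache checkpoint can be defined after 2,048 tokens. If you try to add a cache checkpoint before meeting the minimum number of tokens, your inference still succeeds, but your prefix isn't cached. The cache has a Time To Live (TTL), which resets with each successful cache hit. During this period, the context in the cache is preserved. If no cache hits occur within the TTL window, your cache expires. Most models support a 5-minute TTL, while Claude Opus 4.5, Claude Haiku 4.5, and Claude Sonnet 4.5 also support an extended 1-hour TTL option.

You can use prompt caching anytime you get model inference in Amazon Bedrock for supported models. Prompt caching is supported by the following Amazon Bedrock features:

**Converse and ConverseStream APIs**  
You can carry on a conversation with a model where you specify cache checkpoints in your prompts.

**InvokeModel and InvokeModelWithResponseStream APIs**  
You can submit single prompt requests in which you enable prompt caching and specify your cache checkpoints.

**Prompt caching with cross-Region inference**  
Prompt caching can be used in conjunction with cross-Region inference. Cross-Region inference automatically selects the optimal AWS Region within your geography to serve your inference request, thereby maximizing available resources and model availability. At times of high demand, these optimizations may lead to increased cache writes.

**Amazon Bedrock Prompt management**  
When you [create](prompt-management-create.md) or [modify](prompt-management-modify.md) a prompt, you can choose to enable prompt caching. Depending on the model, you can cache system prompts, system instructions, and messages (user and assistant). You can also choose to disable prompt caching.

The APIs provide you with the most flexibility and granular control over the prompt cache. You can set an individual cache checkpoint within your prompts. You can add to the cache by creating more cache checkpoints, up to the maximum number of cache checkpoints allowed for the specific model. For more information, see [Supported models, Regions, and limits](#prompt-caching-models).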

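## Supported models, Regions, and limits
<a name="prompt-caching-models"></a>

The following table lists the supported models along with their token minimums, maximum number of cache checkpoints, and fields that allow cache checkpoints.


| Model name | Model ID | Release type | Minimum number of tokens per cache checkpoint | Maximum number of cache checkpoints per request | Supported TTL | Fields that accept prompt cache checkpoints | 
| --- | --- | --- | --- | --- | --- | --- | 
| Claude Opus 4.5 | anthropic.claude-opus-4-5-20251101-v1:0 | Generally Available | 4,096 | 4 | 5 minutes, 1 hour | `system`, `messages`, and `tools` | 
| Claude Opus 4.1 | anthropic.claude-opus-4-1-20250805-v1:0 | Generally Available | 1,024 | 4 | 5 minutes | `system`, `messages`, and `tools` | 
| Claude Opus 4 | anthropic.claude-opus-4-20250514-v1:0 | Generally Available | 1,024 | 4 | 5 minutes | `system`, `messages`, and `tools` | 
| Claude Sonnet 4.5 | anthropic.claude-sonnet-4-5-20250929-v1:0 | Generally Available | 1,024 | 4 | 5 minutes, 1 hour | `system`, `messages`, and `tools` | 
| Claude Haiku 4.5 | anthropic.claude-haiku-4-5-20251001-v1:0 | Generally Available | 4,096 | 4 | 5 minutes, 1 hour | `system`, `messages`, and `tools` | 
| Claude Sonnet 4 | anthropic.claude-sonnet-4-20250514-v1:0 | Generally Available | 1,024 | 4 | 5 minutes | `system`, `messages`, and `tools` | 
| Claude 3.7 Sonnet | anthropic.claude-3-7-sonnet-20250219-v1:0 | Generally Available | 1,024 | 4 | 5 minutes | `system`, `messages`, and `tools` | 
| Claude 3.5 Haiku | anthropic.claude-3-5-haiku-20241022-v1:0 | Generally Available | 2,048 | 4 | 5 minutes | `system`, `messages`, and `tools` | 
| Claude 3.5 Sonnet v2 | anthropic.claude-3-5-sonnet-20241022-v2:0 | Preview | 1,024 | 4 | 5 minutes | `system`, `messages`, and `tools` | 
| Amazon Nova Micro | amazon.nova-micro-v1:0 | Generally Available | 1K<sup>1</sup> | 4 | 5 minutes | `system` and `messages` | 
| Amazon Nova Lite | amazon.nova-lite-v1:0 | Generally Available | 1K<sup>1</sup> | 4 | 5 minutes | `system` and `messages`<sup>2</sup> | 
| Amazon Nova Pro | amazon.nova-pro-v1:0 | Generally Available | 1K<sup>1</sup> | 4 | 5 minutes | `system` and `messages`<sup>2</sup> | 
| Amazon Nova Premier | amazon.nova-premier-v1:0 | Generally Available | 1K<sup>1</sup> | 4 | 5 minutes | `system` and `messages`<sup>2</sup> | 
| Amazon Nova 2 Lite | amazon.nova-2-lite-v1:0 | Generally Available | 1K<sup>1</sup> | 4 | 5 minutes | `system` and `messages`<sup>2</sup> | 

<sup>1</sup> The Amazon Nova models support a maximum of 20K tokens for prompt caching.

<sup>2</sup> Prompt caching is primarily for text prompts.

To use the 1-hour TTL option with supported models (Claude Opus 4.5, Claude Haiku 4.5, and Claude Sonnet 4.5), specify the `ttl` field in your cache checkpoint. In the Converse API, add `"ttl": "1h"` to the `cachePoint` object. In the InvokeModel API for Claude models, add `"ttl": "1h"` to the `cache_control` object. If no `ttl` value is provided, the default 5-minute caching behavior applies. The 1-hour TTL is useful for long-running sessions or batch processing scenarios where you want to maintain the cache over extended periods.

Amazon Nova offers automatic prompt caching for all text prompts, including `User` and `System` messages. This mechanism can provide latency benefits when prompts begin with repetitive parts, even without explicit configuration. However, to unlock cost savings and ensure more consistent performance benefits, we recommend opting in to **Explicit Prompt Caching**.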

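## Simplified cache management for Claude models
<a name="prompt-caching-simplified"></a>

For Claude models, Amazon Bedrock offers a simplified approach to cache management that reduces the complexity of manually placing cache checkpoints. Instead of specifying exact cache checkpoint locations, you can use automatic cache management with a single breakpoint at the end of your static content.

When you enable simplified cache management, the system automatically checks for cache hits at previous content block boundaries, looking back up to approximately 20 content blocks from your specified breakpoint. This allows the model to find the longest matching prefix from your cache without requiring you to predict the optimal checkpoint locations. To use this feature, place a single cache checkpoint at the end of your static content, before any dynamic or variable content. The system automatically finds the best cache match.

For more granular control, you can still use multiple cache checkpoints (up to 4 for Claude models) to specify exact cache boundaries. You should use multiple cache checkpoints if you're caching sections that change at different frequencies or if you want more control over exactly what gets cached.

**Important**  
The automatic prefix checking only looks back approximately 20 content blocks from your cache checkpoint. If your static content extends beyond this range, consider using multiple cache checkpoints or restructuring your prompt to place the most frequently reused content within this range.

## How to use prompt caching effectively
<a name="prompt-caching-effective-use"></a>

If you have prompts that are used at a regular cadence (that is, system prompts that are used more frequently than every 5 minutes), continue to use the 5-minute cache, because it continues to be refreshed at no additional charge.

The 1-hour cache is best used in the following scenarios:
+ When you have prompts that are likely to be used less frequently than every 5 minutes, but more frequently than every hour. For example, when an agentic side-agent takes longer than 5 minutes to run, or when you store a long chat conversation with a user and generally expect that the user might not respond within the next 5 minutes.
+ When latency is important and your follow-up prompts might be sent more than 5 minutes apart.
+ When you want to improve your rate limit utilization, because cache hits aren't deducted against your rate limit.

You can use both 1-hour and 5-minute cache controls in the same request, but with an important constraint: cache entries with a longer TTL must appear before those with a shorter TTL (that is, a 1-hour cache entry must appear before any 5-minute cache entries).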
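
The ordering constraint can be checked on the client side before a request is sent. The sketch below assumes that you have already collected the `ttl` values of your cache checkpoints in the order in which they appear in the request; the helper name `ttl_order_is_valid` is hypothetical, and a missing `ttl` is treated as the default 5 minutes.

```python
# TTL values described in this section, expressed in seconds
TTL_SECONDS = {"5m": 300, "1h": 3600}


def ttl_order_is_valid(ttls):
    """Check that longer-TTL checkpoints come before shorter-TTL ones.

    ttls lists the ttl value of each cache checkpoint in request order;
    None stands for a checkpoint without a ttl field (default 5 minutes).
    """
    seconds = [TTL_SECONDS[ttl or "5m"] for ttl in ttls]
    # Every checkpoint must have a TTL at least as long as the next one
    return all(a >= b for a, b in zip(seconds, seconds[1:]))


print(ttl_order_is_valid(["1h", "5m"]))  # True
print(ttl_order_is_valid(["5m", "1h"]))  # False
```

An unknown `ttl` string raises a `KeyError`, which makes typos visible early instead of letting them pass silently.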

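## Getting started
<a name="prompt-caching-get-started"></a>

The following sections give you a brief overview of how to use the prompt caching feature with each method of interacting with models through Amazon Bedrock.

### Converse API
<a name="prompt-caching-converse"></a>

The [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html) API provides advanced and flexible options for implementing prompt caching in multi-turn conversations. For more information about the prompt requirements for each model, see the preceding section [Supported models, Regions, and limits](#prompt-caching-models).

**Example request**

The following examples show a cache checkpoint set in the `messages`, `system`, or `tools` fields of a request to the Converse API. You can place checkpoints in any of these locations for a given request. For example, if you send a request to the Claude 3.5 Sonnet v2 model, you could place two cache checkpoints in `messages`, one cache checkpoint in `system`, and one in `tools`. For more detailed information and examples of structuring and sending Converse API requests, see [Carry out a conversation with the Converse API operations](conversation-inference.md).

Specify the desired `ttl` value as shown below. When no `ttl` value is specified, the default behavior of a 5-minute cache applies.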

```
"cachePoint" : {
    "type": "default",
    "ttl" : "5m | 1h"
}
```

------
#### [ messages checkpoints ]

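In this example, the first `image` field provides an image to the model, and the second `text` field asks the model to analyze the image. As long as the number of tokens preceding the `cachePoint` in the `content` object meets the minimum token count for the model, a cache checkpoint is created.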

```
...
"messages": [
   {
        "role": "user",
        "content": [
            {
                "image": {
                    "bytes": "asfb14tscve..."
                }
            },
            {
                "text": "What's in this image?"
            },
            {
                "cachePoint": {
                    "type": "default"
                }
            }
      ]
  }
]
...
```

------
#### [ system checkpoints ]

In this example, you provide your system prompt in the `text` field. In addition, you can add a `cachePoint` field to cache the system prompt.

```
...
  "system": [ 
    {
        "text": "You are an app that creates play lists for a radio station that plays rock and pop music. Only return song names and the artist. "
    },
    {
        "cachePoint": {
            "type": "default"
        }
    }
  ],
...
```

------
#### [ tools checkpoints ]

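In this example, you provide your tool definition in the `toolSpec` field. (Alternatively, you can call a tool that you've previously defined. For more information, see [Use a tool to complete an Amazon Bedrock model response](tool-use.md).) Afterward, you can add a `cachePoint` field to cache the tool.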

```
...
toolConfig={
    "tools": [
        {
            "toolSpec": {
                "name": "top_song",
                "description": "Get the most popular song played on a radio station.",
                "inputSchema": {
                    "json": {
                        "type": "object",
                        "properties": {
                            "sign": {
                                "type": "string",
                                "description": "The call sign for the radio station for which you want the most popular song. Example calls signs are WZPZ and WKRP."
                            }
                        },
                        "required": [
                            "sign"
                        ]
                    }
                }
            }
        },
        {
                "cachePoint": {
                    "type": "default"
                }
        }
    ]
}
...
```

------

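The model response from the Converse API includes three new fields that are specific to prompt caching. The `CacheReadInputTokens` and `CacheWriteInputTokens` values tell you how many tokens were read from the cache and how many tokens were written to the cache because of your previous request. The `CacheDetails` values tell you the TTL that applies to the tokens written to the cache. These are the values that Amazon Bedrock bills you for: tokens read from the cache are charged at a reduced rate, while tokens written to the cache might, depending on the model, be charged at a rate higher than that of uncached input tokens.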
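
To see whether caching is paying off, you can summarize these values for each response. The following sketch makes two assumptions that you should verify against the API reference: that the counts appear under the `usage` object of the Converse response with the camelCase keys `cacheReadInputTokens` and `cacheWriteInputTokens`, and that `inputTokens` counts only the input that was neither read from nor written to the cache. The helper name `summarize_cache_usage` is hypothetical.

```python
def summarize_cache_usage(converse_response):
    """Summarize how much of the input was served from the prompt cache."""
    usage = converse_response.get("usage", {})
    # Key names are assumptions; missing keys are treated as zero
    read = usage.get("cacheReadInputTokens", 0)
    written = usage.get("cacheWriteInputTokens", 0)
    uncached = usage.get("inputTokens", 0)
    total = read + written + uncached
    return {
        "cache_read": read,
        "cache_write": written,
        "uncached_input": uncached,
        # Share of all input tokens that came from the cache
        "hit_ratio": read / total if total else 0.0,
    }
```

A hit ratio that stays near zero across repeated requests usually suggests that the prompt prefix is changing between calls or that it falls below the model's minimum token count for a checkpoint.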

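### InvokeModel API
<a name="prompt-caching-invoke"></a>

Prompt caching is enabled by default when you call the [InvokeModel](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html) API. You can set cache checkpoints at any point in your request body, similar to the previous example for the Converse API.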

------
#### [ Anthropic Claude ]

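The following example shows how to structure the body of your InvokeModel request for the Anthropic Claude 3.5 Sonnet v2 model. Note that the exact format and fields of the body for InvokeModel requests can vary depending on the model that you choose. To see the format and content of the request and response bodies for different models, see [Inference request parameters and response fields for foundation models](model-parameters.md).

Specify the desired `ttl` value as shown below. When no `ttl` value is specified, the default behavior of a 5-minute cache applies.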

```
"cache_control" : {
    "type": "ephemeral",
    "ttl" : "5m | 1h"
}
```

```
body={
        "anthropic_version": "bedrock-2023-05-31",
        "system":"Reply concisely",
        "messages": [
            {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Describe the best way to learn programming."
                },
                {
                    "type": "text",
                    "text": "Add additional context here for the prompt that meets the minimum token requirement for your chosen model.",
                    "cache_control": {
                        "type": "ephemeral"
                    }
                }
            ]
            }
        ],
        "max_tokens": 2048,
        "temperature": 0.5,
        "top_p": 0.8,
        "stop_sequences": [
            "stop"
        ],
        "top_k": 250
}
```

------
#### [ Amazon Nova ]

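The following example shows how to structure the body of your InvokeModel request for the Amazon Nova model. Note that the exact format and fields of the body for InvokeModel requests can vary depending on the model that you choose. To see the format and content of the request and response bodies for different models, see [Inference request parameters and response fields for foundation models](model-parameters.md).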

```
{
    "system": [{
        "text": "Reply Concisely"
    }],
    "messages": [{
        "role": "user",
        "content": [{
            "text": "Describe the best way to learn programming"
        },
        {
            "text": "Add additional context here for the prompt that meets the minimum token requirement for your chosen model.",
            "cachePoint": {
                "type": "default"
            }
        }]
    }],
    "inferenceConfig": {
        "maxTokens": 300,
        "topP": 0.1,
        "topK": 20,
        "temperature": 0.3
    }
}
```

------

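For more information about sending an InvokeModel request, see [Submit a single prompt with InvokeModel](inference-invoke.md).

### Playground
<a name="prompt-caching-playground"></a>

In a chat playground in the Amazon Bedrock console, you can turn on the prompt caching option, and Amazon Bedrock automatically creates cache checkpoints for you.

Follow the instructions in [Generate responses in the console using playgrounds](playgrounds.md) to get started with prompting in an Amazon Bedrock playground. For supported models, prompt caching is automatically turned on in the playground. However, if it isn't, do the following to turn on prompt caching:

1. In the left side panel, open the **Configurations** menu.

1. Turn on the **Prompt caching** toggle.

1. Run your prompts.

After your combined input and model responses reach the minimum number of tokens required for a checkpoint (which varies by model), Amazon Bedrock automatically creates the first cache checkpoint for you. As you continue chatting, each subsequent time the minimum number of tokens is reached creates a new checkpoint, up to the maximum number of checkpoints allowed for the model. You can view your cache checkpoints at any time by choosing **View cache checkpoints** next to the **Prompt caching** toggle, as shown in the following screenshot.

![\[UI toggle for prompt caching in an Amazon Bedrock text playground.\]](http://docs.aws.amazon.com/zh_tw/bedrock/latest/userguide/images/prompt-caching/bedrock-prompt-caching-ui-toggle.png)


You can view how many tokens are being read from and written to the cache as a result of each interaction with the model by viewing the **Caching metrics** pop-up (![\[The metrics icon shown in model responses when prompt caching is enabled.\]](http://docs.aws.amazon.com/zh_tw/bedrock/latest/userguide/images/prompt-caching/bedrock-prompt-caching-metrics-icon.png)) in the playground responses.

![\[Caching metrics box that shows the number of tokens read from and written to the cache.\]](http://docs.aws.amazon.com/zh_tw/bedrock/latest/userguide/images/prompt-caching/bedrock-prompt-caching-metrics.png)


If you turn off the prompt caching toggle while in the middle of a conversation, you can continue chatting with the model.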