Tingkat layanan untuk mengoptimalkan kinerja dan biaya - Amazon Bedrock

Terjemahan disediakan oleh mesin penerjemah. Jika konten terjemahan yang diberikan bertentangan dengan versi bahasa Inggris aslinya, utamakan versi bahasa Inggris.

Tingkat layanan untuk mengoptimalkan kinerja dan biaya

Amazon Bedrock menawarkan empat tingkatan layanan untuk inferensi model: Reserved, Priority, Standard, dan Flex. Dengan tingkatan layanan, Anda dapat mengoptimalkan ketersediaan, biaya, dan kinerja.

Tingkat Cadangan

Tingkat Cadangan menyediakan kemampuan untuk mencadangkan kapasitas komputasi yang diprioritaskan untuk aplikasi penting misi Anda yang tidak dapat mentolerir waktu henti apa pun. Anda memiliki fleksibilitas untuk mengalokasikan tokens-per-minute kapasitas input dan output yang berbeda agar sesuai dengan persyaratan yang tepat dari beban kerja dan biaya kontrol Anda. Ketika aplikasi Anda membutuhkan tokens-per-minute kapasitas lebih dari yang Anda pesan, layanan secara otomatis meluap ke tingkat Standar, memastikan operasi tidak terganggu. Tingkat Cadangan menargetkan waktu aktif 99,5% untuk respons model. Pelanggan dapat memesan kapasitas untuk durasi 1 bulan atau 3 bulan. Pelanggan membayar harga tetap per 1K tokens-per-minute dan ditagih setiap bulan.

Untuk mendapatkan akses ke tingkat Cadangan, silakan hubungi tim akun AWS Anda.

Tingkat Prioritas

Tingkat Prioritas memberikan waktu respons tercepat untuk harga premium dibandingkan harga sesuai permintaan standar. Ini paling cocok untuk aplikasi penting misi dengan alur kerja bisnis yang dihadapi pelanggan yang tidak menjamin reservasi kapasitas 24X7. Tingkat prioritas tidak memerlukan reservasi sebelumnya. Anda cukup mengatur parameter opsional “service_tier” ke “priority” untuk memanfaatkan prioritas tingkat permintaan. Permintaan tingkat prioritas diprioritaskan di atas permintaan tingkat Standar dan Flex.

Tingkat Standar

Tingkat Standar memberikan kinerja yang konsisten untuk tugas AI sehari-hari seperti pembuatan konten, analisis teks, dan pemrosesan dokumen rutin. Secara default semua permintaan inferensi dirutekan ke tingkat Standar ketika parameter “service_tier” hilang. Anda juga dapat menyetel parameter opsional “service_tier” ke “default” agar permintaan inferensi Anda disajikan dengan tingkat Standar.

Tingkat Fleksibel

Untuk beban kerja yang dapat menangani waktu pemrosesan lebih lama, tingkat Flex menawarkan pemrosesan hemat biaya untuk diskon harga. Ini membantu Anda mengoptimalkan biaya untuk beban kerja seperti evaluasi model, ringkasan konten, dan alur kerja agen. Anda dapat menyetel parameter opsional “service_tier” ke “flex” agar permintaan inferensi Anda dilayani dengan tingkat Flex dan memanfaatkan diskon harga.

Menggunakan kemampuan tingkat layanan

Untuk mengakses kemampuan tingkat layanan, Anda dapat menyetel parameter opsional “service_tier” ke “reserved”, “priority”, “default”, atau “flex” saat memanggil API runtime Amazon Bedrock.

"service_tier" : "reserved | priority | default | flex"

Kuota sesuai permintaan untuk model dibagikan di seluruh tingkatan layanan “prioritas”, “default”, dan “fleksibel”. Reservasi kapasitas tingkat “reservasi” Anda terpisah dari kuota sesuai permintaan Anda. Konfigurasi tingkat layanan untuk permintaan yang ditayangkan dapat dilihat dalam respons API dan CloudTrail Acara AWS. Anda juga dapat melihat metrik tingkat layanan di Metrik Amazon CloudWatch di bawah ModelId,, dan ServiceTier ResolvedServiceTier, di mana ResolvedServiceTier menampilkan tingkat aktual yang melayani permintaan Anda.

Untuk informasi lebih lanjut tentang harga, kunjungi halaman harga.

Model dan wilayah yang didukung oleh tingkat layanan Cadangan:

Penyedia Model Model IDs Daerah
Antropik Claude Soneta 4.5

global.anthropic.claude-sonnet-4-5-20250929-v 1:0

kami.anthropic.claude-sonnet-4-5-20250929-v 1:0

ap-northeast-1
ap-northeast-2
ap-northeast-3
ap-southeast-1
ap-southeast-2
ap-south-1
ap-southeast-3
ap-south-2
ap-southeast-4
ca-central-1
Eropa-Barat-1
Eropa-Tengah-1
Eropa-Tengah-2
Eropa-utara-1
Eropa-Selatan-1
Eropa-Selatan-2
Eropa-Barat-2
Eropa-Barat-3
sa-east-1
us-east-1
us-east-2
us-west-1
us-west-2
catatan

Panjang konteks 1M untuk Soneta 4.5 tidak didukung oleh tingkat Cadangan.

Model dan wilayah yang didukung oleh tingkat layanan Priority dan Flex:

Penyedia Model ID Model Daerah
OpenAI gpt-oss-120b openai.gpt-oss-120b- 1:0 us-east-1
us-east-2
us-west-2
ap-northeast-1
ap-south-1
ap-southeast-3
eu-central-1
eu-north-1
eu-south-1
eu-west-1
eu-west-2
sa-east-1
OpenAI gpt-oss-20b openai.gpt-oss-20b- 1:0 us-east-1
us-east-2
us-west-2
ap-northeast-1
ap-south-1
ap-southeast-3
eu-central-1
eu-north-1
eu-south-1
eu-west-1
eu-west-2
sa-east-1
OpenAI GPT OSS Safeguard 20B terbuka. gpt-oss-safeguard-20b ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
OpenAI GPT OSS Safeguard 120B terbuka. gpt-oss-safeguard-120b ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Qwen Qwen3 235B A22B 2507 qwen.qwen3-235b-a22b-2507-v 1:0 us-east-2
us-west-2
ap-northeast-1
ap-south-1
ap-southeast-3
eu-central-1
eu-north-1
eu-south-1
eu-west-2
Qwen Qwen3 Coder 480B A35B Instruksi qwen.qwen3-coder-480b-a35b-v 1:0 us-east-2
us-west-2
ap-northeast-1
ap-south-1
ap-southeast-3
eu-north-1
eu-west-2
Qwen Qwen3-Coder-30B-A3B-Instruksi qwen.qwen3-coder-30b-a3b-v 1:0 us-east-1
us-east-2
us-west-2
ap-northeast-1
ap-south-1
ap-southeast-3
eu-central-1
eu-north-1
eu-south-1
eu-west-1
eu-west-2
sa-east-1
Qwen Qwen3 32B (padat) qwen.qwen3-32b-v 1:0 us-east-1
us-east-2
us-west-2
ap-northeast-1
ap-south-1
ap-southeast-3
eu-central-1
eu-north-1
eu-south-1
eu-west-1
eu-west-2
sa-east-1
Qwen Qwen3 Berikutnya 80B A3B qwen.qwen3-next-80b-a3b ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Qwen Qwen3 VL 235B A22B qwen.qwen3-vl-235b-a22b ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
DeepSeek DeepSeek-V3.1 deepseek.v3-v 1:0 us-east-2
us-west-2
ap-northeast-1
ap-south-1
ap-southeast-3
eu-north-1
eu-west-2
Amazon Nova Premier Amazon. nova-premier-v1:0 kami-timur-1*
kami-timur-2*
kami-barat-2*
Amazon Nova Pro Amazon. nova-pro-v1:0 us-east-1
kami-timur-2*
kami-barat-1*
kami-barat-2*
ap-timur-2*
ap-timur laut-1*
ap-timur laut-2*
ap-selatan-1*
ap-tenggara 1*
ap-southeast-2
ap-southeast-3
ap-tenggara 4*
ap-tenggara 5*
ap-tenggara 7*
eu-sentral-1*
eu-utara-1*
eu-selatan-1*
eu-selatan-2*
eu-barat-1*
eu-west-2
eu-barat-3*
Il-sentral-1*
me-central-1
Amazon Nova 2 Lite amazon.nova-2-lite-v 1:0 ap-timur-2
ap-northeast-1
ap-northeast-2
ap-south-1
ap-southeast-1
ap-southeast-2
ap-southeast-3
ap-southeast-4
ap-southeast-5
ap-tenggara 7
ca-central-1
ca-west-1
eu-central-1
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
il-central-1
me-central-1
us-east-1
us-east-2
us-west-1
us-west-2
Amazon Pratinjau Nova 2 Pro amazon.nova-2-pro-preview-20251202-v 1:0 ap-timur-2
ap-northeast-1
ap-northeast-2
ap-south-1
ap-southeast-1
ap-southeast-2
ap-southeast-3
ap-southeast-4
ap-southeast-5
ap-tenggara 7
ca-central-1
ca-west-1
eu-central-1
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
il-central-1
me-central-1
us-east-1
us-east-2
us-west-1
us-west-2
Amazon Nova Lite 2 Omni amazon.nova-2- 1 lite-omni-v ap-timur-2
ap-northeast-1
ap-northeast-2
ap-south-1
ap-southeast-1
ap-southeast-2
ap-southeast-3
ap-southeast-4
ap-southeast-5
ap-tenggara 7
ca-central-1
ca-west-1
eu-central-1
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
il-central-1
me-central-1
us-east-1
us-east-2
us-west-1
us-west-2
Google Gemma 3 4B google.gemma-3-4b-it ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Google Gemma 3 12B google.gemma-3-12b ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Google Gemma 3 27B google.gemma-3-27b-it ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Minimax AI Minimax M2 minimax.minimax-m2 ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Mistral Magistral Kecil 1.2 mistral.magistral-kecil-2509 ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Mistral Voxtral Mini 1.0 mistral.voxtral-mini-3b-2507 ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Mistral Voxtral Kecil 1.0 mistral.voxtral-kecil-24b-2507 ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Mistral Ministro 3B 3.0 mistral.ministral-3-3b-instruktur ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Mistral Ministro 8B 3.0 mistral.ministral-3-8b-instruktur ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Mistral Ministro 14B 3.0 mistral.ministral-3-14b-instruktur ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Mistral Mistral Besar 3 mistral.mistral-besar-3-675b-instruktur ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Kimi AI Kimi K2 Berpikir moonshot.kimi-k2-berpikir ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Nvidia NVIDIA Nemotron Nano 2 nvidia.nemotron-nano-9b-v2 ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Nvidia NVIDIA Nemotron Nano 2 VL nvidia.nemotron-nano-12b-v2 ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2

*Inferensi model dapat disajikan menggunakan beberapa wilayah.

Untuk mengontrol akses ke tingkatan layanan, lihat Kontrol akses ke tingkatan layanan