
Understanding your Amazon Bedrock Cost and Usage Report data

AWS Data Exports Cost and Usage Reports (CUR 2.0) is the AWS-recommended way to receive detailed cost and usage data. CUR 2.0 provides line-item detail for every Amazon Bedrock inference request. Each request generates a separate line item for each token type, and each line item has its own usage type and unit price. This page explains how to read Amazon Bedrock entries in CUR and reconcile them with your actual spend. For more information about CUR 2.0, see the AWS Data Exports documentation.

Understanding Amazon Bedrock pricing in CUR

Amazon Bedrock pricing in CUR is determined by three factors: the token type, the service tier, and whether the request was routed through cross-region inference. Understanding each of these is essential for accurate cost reconciliation.

Token types

Amazon Bedrock charges are broken down by four token types. Each has a different unit price.

Token type	CUR usage type pattern	Description
Input tokens	`-input-tokens` or `-mantle-input-tokens-*`	Tokens sent in the request prompt
Output tokens	`-output-tokens` or `-mantle-output-tokens-*`	Tokens generated in the response
Cache read tokens	`*-cache-read-input-token-count`	Tokens read from prompt cache (significantly cheaper than input)
Cache write tokens	`*-cache-write-input-token-count`	Tokens written to prompt cache (more expensive than input)

Important

All four token types must be accounted for when reconciling usage to spend. If you only sum input and output tokens, your totals will not match your bill. This is the most common source of reconciliation gaps, particularly for workloads that use prompt caching heavily.
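As a rough illustration of the point above, the following Python sketch classifies each line item by token type using the substring patterns from the table, then totals cost per type. It assumes rows are dictionaries keyed by the CUR columns `line_item_usage_type` and `line_item_unblended_cost`. The helper names (`token_type`, `cost_by_token_type`), the sample costs, and the cache-write usage type (constructed from the `*-cache-write-input-token-count` pattern) are illustrative assumptions, not values from a real bill.

```python
from collections import defaultdict

# Substring patterns from the token type table above.
# Cache patterns are listed first so they are checked before plain input/output.
TOKEN_PATTERNS = [
    ("cache_read", "-cache-read-input-token-count"),
    ("cache_write", "-cache-write-input-token-count"),
    ("input", "-input-tokens"),
    ("output", "-output-tokens"),
]


def token_type(usage_type: str) -> str:
    """Return the token type for a usage type string, or 'other' if none matches."""
    for name, pattern in TOKEN_PATTERNS:
        if pattern in usage_type:
            return name
    return "other"


def cost_by_token_type(rows):
    """Sum line_item_unblended_cost per token type."""
    totals = defaultdict(float)
    for row in rows:
        kind = token_type(row["line_item_usage_type"])
        totals[kind] += float(row["line_item_unblended_cost"])
    return dict(totals)


# Hypothetical sample rows; the costs are made up for illustration.
ROWS = [
    {"line_item_usage_type": "USE1-Claude4.6Sonnet-input-tokens",
     "line_item_unblended_cost": "1.50"},
    {"line_item_usage_type": "USE1-Claude4.6Sonnet-output-tokens-cross-region-global",
     "line_item_unblended_cost": "2.25"},
    {"line_item_usage_type": "USE1-Claude4.6Sonnet-cache-read-input-token-count",
     "line_item_unblended_cost": "0.25"},
    {"line_item_usage_type": "USE1-Claude4.6Sonnet-cache-write-input-token-count",
     "line_item_unblended_cost": "0.75"},
]

totals = cost_by_token_type(ROWS)
input_output_only = totals["input"] + totals["output"]
all_token_types = sum(totals.values())
```

With these sample rows, summing only input and output gives 3.75, while summing all four token types gives 4.75. The difference is the cache read and cache write cost that an input/output-only analysis would miss.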

Service tiers

Amazon Bedrock supports different service tiers that affect pricing and availability. The service tier for a request is reflected in the CUR usage type. When reconciling costs, ensure you apply the correct unit price for the service tier associated with each line item.

For more information on service tiers and how they affect inference pricing, see Service tiers for optimizing performance and cost.

CUR usage type examples

The line_item_usage_type field identifies the model, token type, service tier, and whether the request used cross-region inference. The format varies by endpoint:

{region}-{model}-{token-type} for bedrock-runtime standard tier requests
{region}-{model}-{token-type}-{tier} for bedrock-runtime priority or flex tier requests
{region}-{model}-mantle-{token-type}-standard for bedrock-mantle requests
{region}-{model}-{token-type}-cross-region-global for cross-region requests

Usage type	Model	Service Tier	Token type	Routing
`USE1-openai.gpt-oss-120b-mantle-input-tokens-standard`	OpenAI gpt-oss-120b	Standard	Input	In-region
`USE1-gpt-oss-120b-output-tokens-priority`	OpenAI gpt-oss-120b	Priority	Output	In-region
`USE1-Nova2.0Lite-input-tokens-flex`	Amazon Nova 2 Lite	Flex	Input	In-region
`USE1-Claude4.6Sonnet-input-tokens`	Claude Sonnet 4.6	Standard	Input	In-region
`USE1-Claude4.6Sonnet-cache-read-input-token-count`	Claude Sonnet 4.6	Standard	Cache read	In-region
`USE1-Claude4.6Sonnet-output-tokens-cross-region-global`	Claude Sonnet 4.6	Standard	Output	Cross-region
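The formats above can be split back into their parts with a single regular expression. The following Python sketch is one way to do that, under the assumption that the four formats listed are the only ones you need to handle. The `UsageType` tuple and `parse_usage_type` function are hypothetical names, and the treatment of `cross-region-global` as Standard tier follows the example table rather than a documented rule.

```python
import re
from typing import NamedTuple


class UsageType(NamedTuple):
    region: str
    model: str
    token_type: str
    tier: str
    routing: str
    mantle: bool  # True when the usage type contains the -mantle- segment


_TOKEN_NAMES = {
    "cache-read-input-token-count": "cache_read",
    "cache-write-input-token-count": "cache_write",
    "input-tokens": "input",
    "output-tokens": "output",
}

# Model names may contain hyphens (for example gpt-oss-120b), so the model is
# matched lazily up to the first token-type marker.
_PATTERN = re.compile(
    r"^(?P<region>[^-]+)-(?P<model>.+?)(?P<mantle>-mantle)?-"
    r"(?P<token>cache-read-input-token-count|cache-write-input-token-count"
    r"|input-tokens|output-tokens)"
    r"(?:-(?P<suffix>.+))?$"
)


def parse_usage_type(usage_type: str) -> UsageType:
    match = _PATTERN.match(usage_type)
    if match is None:
        raise ValueError(f"unrecognized usage type: {usage_type!r}")
    suffix = match.group("suffix")
    if suffix is None or suffix == "standard":
        tier, routing = "standard", "in-region"
    elif suffix in ("priority", "flex"):
        tier, routing = suffix, "in-region"
    elif suffix == "cross-region-global":
        tier, routing = "standard", "cross-region"
    else:
        # A suffix not covered by the formats listed above.
        tier, routing = "unknown", "unknown"
    return UsageType(
        region=match.group("region"),
        model=match.group("model"),
        token_type=_TOKEN_NAMES[match.group("token")],
        tier=tier,
        routing=routing,
        mantle=match.group("mantle") is not None,
    )


parsed = parse_usage_type("USE1-Claude4.6Sonnet-output-tokens-cross-region-global")
```

Unrecognized suffixes are reported as `unknown` instead of being guessed, so new usage type variants surface during analysis and are not silently priced at the wrong rate.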

Cost allocation tags in CUR

Cost allocation tags from IAM principals, Projects, and application inference profiles appear as columns in CUR. Each column is named with the prefix resourceTags/ or iamPrincipal/ followed by the tag key. For example, a tag with key Team appears as resourceTags/Team.

Attribution method	Tag source
IAM principal tags	Tags from the IAM user or role making the request
Session tags	Tags passed during role assumption or federation
Project tags	Tags assigned to an Amazon Bedrock Project
Application inference profile tags	Tags assigned to an application inference profile

Tags must be activated as cost allocation tags in the AWS Billing console before they appear in CUR. For more information, see Activating cost allocation tags.
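Once a tag is activated and populated, spend can be grouped by its column. The following Python sketch assumes a CSV export whose header includes a `resourceTags/Team` column, as in the example above; exports configured differently may expose tags under another column layout and would need a different lookup. The `cost_by_tag` function, the `(untagged)` bucket, and the sample rows and costs are illustrative assumptions.

```python
import csv
import io
from collections import defaultdict


def cost_by_tag(csv_text: str, tag_column: str) -> dict:
    """Sum line_item_unblended_cost per value of the given tag column."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Rows with no value for the tag are grouped under "(untagged)".
        key = row.get(tag_column) or "(untagged)"
        totals[key] += float(row["line_item_unblended_cost"])
    return dict(totals)


# Hypothetical export contents; usage types follow the formats on this page,
# and the costs are made up for illustration.
SAMPLE = """line_item_usage_type,line_item_unblended_cost,resourceTags/Team
USE1-Claude4.6Sonnet-input-tokens,1.50,search
USE1-Claude4.6Sonnet-output-tokens,2.25,search
USE1-Nova2.0Lite-input-tokens-flex,0.25,
"""

team_costs = cost_by_tag(SAMPLE, "resourceTags/Team")
```

Keeping an explicit `(untagged)` bucket makes it easy to see how much spend has no attribution, for example usage incurred before the tag was activated.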

Key things to know when reading your CUR data

To get the most out of your CUR data and avoid confusion when analyzing costs, keep the following in mind.

Account for all token types. Amazon Bedrock charges separately for input, output, cache read, and cache write tokens. Each has a different unit price. If you only look at input and output tokens, your analysis will undercount costs, especially for workloads that use prompt caching heavily.
Apply the correct rate for each routing type. In-region and cross-region inference have different unit prices. If your workloads use both, make sure you use the matching rate for each when analyzing costs.
Activate tags before expecting them in CUR. Cost allocation tags must be activated in the AWS Billing console before they appear in CUR or Cost Explorer. After activation, allow up to 24 hours for tags to begin populating.
Use CUR 2.0 for IAM principal attribution. IAM principal identity and tag data require CUR 2.0 (AWS Data Exports). If you are using the legacy CUR format, IAM principal fields are not available. Per-token cost breakdowns are available in both CUR formats. For detailed setup instructions, see Using IAM principal for cost allocation.
