View a markdown version of this page

Understanding your Amazon Bedrock Cost and Usage Report data - Amazon Bedrock

Understanding your Amazon Bedrock Cost and Usage Report data

AWS Data Exports Cost and Usage Reports (CUR 2.0) is the AWS recommended way to receive your detailed cost and usage data. CUR 2.0 provides line-item detail for every Amazon Bedrock inference request. Each request generates separate line items for each token type, with distinct usage types and unit prices. This page explains how to read Amazon Bedrock entries in CUR and reconcile them to your actual spend. For more information about AWS CUR 2.0, see AWS Data Exports documentation.

Understanding Amazon Bedrock pricing in CUR

Amazon Bedrock pricing in CUR is determined by three factors: the token type, the service tier, and whether the request was routed through cross-region inference. Understanding each of these is essential for accurate cost reconciliation.

Token types

Amazon Bedrock charges are broken down by four token types. Each has a different unit price.

Token type CUR usage type pattern Description
Input tokens *-input-tokens or *-mantle-input-tokens-* Tokens sent in the request prompt
Output tokens *-output-tokens or *-mantle-output-tokens-* Tokens generated in the response
Cache read tokens *-cache-read-input-token-count Tokens read from prompt cache (significantly cheaper than input)
Cache write tokens *-cache-write-input-token-count Tokens written to prompt cache (more expensive than input)
Important

All four token types must be accounted for when reconciling usage to spend. If you only sum input and output tokens, your totals will not match your bill. This is the most common source of reconciliation gaps, particularly for workloads that use prompt caching heavily.

Service tiers

Amazon Bedrock supports different service tiers that affect pricing and availability. The service tier for a request is reflected in the CUR usage type. When reconciling costs, ensure you apply the correct unit price for the service tier associated with each line item.

For more information on service tiers and how they affect inference pricing, see Service tiers for optimizing performance and cost.

CUR usage type examples

The line_item_usage_type field identifies the model, token type, service tier, and whether the request used cross-region inference. The format varies by endpoint:

  • {region}-{model}-{token-type} for bedrock-runtime standard tier requests

  • {region}-{model}-{token-type}-{tier} for bedrock-runtime priority or flex tier requests

  • {region}-{model}-mantle-{token-type}-standard for bedrock-mantle requests

  • {region}-{model}-{token-type}-cross-region-global for cross-region requests

Usage type Model Service Tier Token type Routing
USE1-openai.gpt-oss-120b-mantle-input-tokens-standard OpenAI gpt-oss-120b Standard Input In-region
USE1-gpt-oss-120b-output-tokens-priority OpenAI gpt-oss-120b Priority Output In-region
USE1-Nova2.0Lite-input-tokens-flex Amazon Nova 2 Lite Flex Input In-region
USE1-Claude4.6Sonnet-input-tokens Claude Sonnet 4.6 Standard Input In-region
USE1-Claude4.6Sonnet-cache-read-input-token-count Claude Sonnet 4.6 Standard Cache read In-region
USE1-Claude4.6Sonnet-output-tokens-cross-region-global Claude Sonnet 4.6 Standard Output Cross-region

Cost allocation tags in CUR

Cost allocation tags from IAM principals, Projects, and application inference profiles appear as columns in CUR with the prefix resourceTags/{key} and iamPrincipal/{key}. For example, a tag with key Team appears as resourceTags/Team.

Attribution method How tags appear in CUR
IAM principal tags Tags from the IAM user or role making the request
Session tags Tags passed during role assumption or federation
Project tags Tags assigned to a Amazon Bedrock Project
Application inference profile tags Tags assigned to an application inference profile

Tags must be activated as cost allocation tags in the AWS Billing console before they appear in CUR. For more information, see Activating cost allocation tags.

Key things to know when reading your CUR data

To get the most out of your CUR data and avoid confusion when analyzing costs, keep the following in mind.

  • Account for all token types. Amazon Bedrock charges separately for input, output, cache read, and cache write tokens. Each has a different unit price. If you only look at input and output tokens, your analysis will undercount costs, especially for workloads that use prompt caching heavily.

  • Apply the correct rate for each routing type. In-region and cross-region inference have different unit prices. If your workloads use both, make sure you use the matching rate for each when analyzing costs.

  • Activate tags before expecting them in CUR. Cost allocation tags must be activated in the AWS Billing console before they appear in CUR or Cost Explorer. After activation, allow up to 24 hours for tags to begin populating.

  • Use CUR 2.0 for IAM principal attribution. IAM principal identity and tag data requires CUR 2.0 (AWS Data Exports). If you are using the legacy CUR format, IAM principal fields will not be available. Per-token cost breakdowns are available in both CUR formats. For detailed setup instructions, see Using IAM principal for cost allocation.