

# Plan your deployment

This section describes the [cost](cost.md), [security](security-1.md), [Region](#supported-aws-regions), and [quota](quotas.md) considerations for planning your deployment.

**Important**  
This solution uses Amazon Bedrock as the primary service for accessing generative AI models. You must request access to models before they are available for use within the solution. For details, refer to [Model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html) in the *Amazon Bedrock User Guide*.

## Supported AWS Regions


**Important**  
This solution optionally uses the Amazon Bedrock and Amazon Kendra services, which are not currently available in all AWS Regions. You must launch this solution in an AWS Region where these services are available. For the most current availability of AWS services by Region, see the [AWS Regional Services List](https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/).

Generative AI Application Builder on AWS is supported in the following AWS Regions:


| Region name |  | 
| --- | --- | 
|  US East (Ohio)  |  Canada (Central)  | 
|  US East (N. Virginia)  |  Europe (Frankfurt)  | 
|  US West (Northern California)  |  Europe (Ireland)  | 
|  US West (Oregon)  |  Europe (London)  | 
|  Asia Pacific (Mumbai)  |  Europe (Milan)  | 
|  Asia Pacific (Seoul)  |  Europe (Paris)  | 
|  Asia Pacific (Singapore)  |  Europe (Stockholm)  | 
|  Asia Pacific (Sydney)  |  Middle East (Bahrain)  | 
|  Asia Pacific (Tokyo)  |  South America (São Paulo)  | 

**Note**  
If you use a foundation model accessed outside of AWS in your deployments, check with the model provider which Regions their APIs are available in. If their APIs are only available in certain Regions, you might experience instability in the form of high latency or even timeouts. It’s also important to check with your organization’s legal and compliance teams to evaluate the considerations of data crossing regional boundaries.

# Cost


With this AWS Solution, you pay only for the resources you use, and there are no minimum fees or setup charges. You pay for the Deployment dashboard used to launch generative AI use cases and for any use cases that you deploy. The cost of a deployed use case depends on its configuration. Example configurations:

1. A simple Deployment dashboard, which costs approximately \$20 USD per month.

1. A simple production-ready chatbot use case deployed with default settings running in US East (N. Virginia), powered by Amazon Bedrock without access to documents, which also costs around \$200 USD per month.

1. A scaled system in an Amazon VPC use case that supports 8,000 queries per day over tens of thousands of documents, which costs around \$1,500 USD per month. The cost of the use case will vary depending on the configuration, such as Text use cases with different model providers, with or without Retrieval Augmented Generation (RAG) enabled, and so on.


| Workload description | Estimated cost (USD/month) | 
| --- | --- | 
|   [Sample cost for Deployment dashboard](#sample-deployment-dashboard-cost)   |  \$20/month  | 
|   [Sample costs for a text-based proof of concept](#sample-costs-for-a-text-based-proof-of-concept)  (Includes Deployment dashboard and 1 Text use case, ~100 interactions per day)  |  \$40/month  | 
|   [Sample costs for a highly scalable generative AI query engine](#sample-costs-for-a-highly-scalable-generative-ai-query-engine)  (Includes Deployment dashboard, 1 Text use case, and an Amazon Kendra Index for RAG up to 100K documents with ~8K queries per day, with [VPC enabled](#incremental-cost-of-enabling-amazon-vpc-for-a-use-case))   |  \$1,500/month  | 
|   [Sample costs for an agent-based proof of concept](#sample-costs-for-an-agent-based-proof-of-concept)  (Includes Deployment dashboard, 1 Bedrock Agent use case with Amazon Bedrock Knowledge Bases and Amazon Bedrock Guardrails enabled, ~100 interactions per day)  |  \$840/month  | 
|   [Sample costs for MCP Server](#sample-costs-for-mcp-server)  (Includes Deployment dashboard, 1 MCP Server use case with Gateway method for Lambda integration, ~100 tool invocations per day)  |  \$22/month  | 
|   [Sample costs for Agent Builder](#sample-costs-for-agent-builder)  (Includes Deployment dashboard, 1 Agent Builder use case with MCP integration and long-term memory enabled, ~100 interactions per day)  |  \$55/month  | 
|   [Sample costs for Workflow Builder](#sample-costs-for-workflow-builder)  (Includes Deployment dashboard, 1 Workflow with 3 Agent Builder agents, ~100 interactions per day)  |  \$109/month  | 

**Important**  
These examples are only intended to help you estimate the costs for your specific workloads. The use of different LLMs, configurations, or AWS services can change your costs (for example, serverless/on-demand billing vs. provisioned/time-billed). To manage costs, we recommend [creating a budget](https://docs.aws.amazon.com/cost-management/latest/userguide/budgets-create.html) through [AWS Cost Explorer](https://aws.amazon.com/aws-cost-management/aws-cost-explorer/). Prices are subject to change. For full details, refer to the pricing webpage for each AWS service used in this solution.

## Sample costs for running the Deployment dashboard


The following table provides the cost breakdown for a Deployment dashboard with default parameters and 100 active users in the US East (N. Virginia) Region for one month, which costs about \$20/month.


| AWS service | Dimensions | Cost [USD] | 
| --- | --- | --- | 
|  API Gateway, DynamoDB, CloudFront, Amazon S3, Lambda, Systems Manager Parameter Store  |  5,000 REST API calls of 512 KB each per month without caching enabled  |  \$1.97  | 
|  Amazon Cognito  |  100 active users per month with advanced security features enabled and no users signing in through SAML or OIDC federation  |  \$5.55  | 
|  AWS WAF  |  10,000 web requests across 1 web ACL and 7 defined rules without any rule groups  |  \$12.60  | 
|  Total Deployment dashboard cost  |  |   **\$20.12**   | 

## Sample costs for a text-based proof of concept


A Deployment dashboard can have many use cases deployed at a given time. The following table shows the cost breakdown of a use case deployed without RAG for 1 business user performing 100 queries per day with the LLM. Queries are sent as a text message on the WebSocket and the response is streamed back as tokens, assuming that streaming is enabled. Using the Amazon Bedrock Nova Pro model, the cost of running this use case is about \$20/month.


| AWS service | Dimensions | Cost [USD] | 
| --- | --- | --- | 
|  API Gateway (WebSocket), CloudFront, Lambda, Amazon S3, AWS Systems Manager Parameter Store  |  100 chat interactions per day. Average message size 32 KB per message and 5 minutes per connection.  |  \$0.61  | 
|  CloudWatch  |  1.5 GB CloudWatch logs with verbose mode on for experimentation  |  \$7.23  | 
|  Amazon DynamoDB  |  Conversation history table, 1 GB storage; LLM configuration table, 1 GB storage  |  \$3.05  | 
|   **Subtotal of the use case costs (not including LLMs)**   |  |   **\$10.89**   | 
|  Amazon Bedrock (Nova Pro)  |  Assumptions for 100 interactions per day: monthly cost for 190K input tokens per day = \$0.152 × 30; monthly cost for 16K output tokens per day = \$0.0512 × 30  |  \$6.10  | 
|   **Total application cost with Amazon Bedrock (Nova Pro)**   |   **\$10.89 (use case cost) + \$6.10 (Amazon Bedrock cost)**   |   **\$17.00**   | 

**Note**  
The costs of inference calls made to services outside the AWS network are not included in these estimates. Refer to the pricing guide of your LLM provider if you’re not using an AWS model provider.  
Pricing guides for AWS services can be found at: [Amazon Bedrock pricing](https://aws.amazon.com/bedrock/pricing/) and [Amazon SageMaker AI pricing](https://aws.amazon.com/sagemaker/pricing/).

## Sample costs for a highly scalable generative AI query engine


The following table provides the cost breakdown of a RAG-enabled use case with Amazon Bedrock’s Nova Pro model as the LLM. When a Bedrock Knowledge Base is added, this use case costs about \$1,300/month.


| AWS service | Dimensions | Cost [USD] | 
| --- | --- | --- | 
|  API Gateway (WebSocket)  |  8,000 chat interactions per day. Average message size 32 KB per message and 5 minutes per connection.  |  \$38.89  | 
|  CloudFront  |  240,000 requests per month with 100 GB data transferred out to the internet and 1 GB data transferred out to the origin  |  \$8.76  | 
|  Amazon Bedrock (Nova Pro)  |  Assumptions: input tokens = promptTemplate (400) + context (400) + chatHistory (1,080) + query input tokens (20) = 1,900; output tokens = 160 (average). With 8,000 transactions a day: daily input tokens cost (1,900 x 8,000 = 15,200,000 tokens x 0.0008/1000 price per token); daily output tokens cost (160 x 8,000 = 1,280,000 tokens x 0.0032/1000 price per token); monthly cost ((\$12.16 + \$4.10) x 30)  |  \$487.80  | 
|  CloudWatch  |  24 metrics using 5 GB data ingested for logs and 1 dashboard  |  \$9.72  | 
|  DynamoDB  |  DynamoDB table to keep track of conversation history with each record up to 1 KB data, 8,000 reads and writes per day  |  \$11.70  | 
|  Lambda  |  Container size - 128 MB, 512 MB ephemeral storage, 2 Lambda functions used for authorization; container size - 256 MB, 512 MB ephemeral storage, 5 requests per second with 20 seconds average compute time  |  \$20.89  | 
|   **Total use case cost**   |  |   **\$577.76/month + knowledge base cost (see below)**   | 

**Note**  
The costs of API calls made to any services outside of the AWS network are not included in these estimates. See the pricing guide of your LLM provider if not using Amazon Bedrock.
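
The Amazon Bedrock row in the table above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not part of the solution: the token counts and per-1,000-token prices are the ones listed in that row, while the names `PromptBudget` and `monthly_llm_cost` are hypothetical. It rounds each daily figure to cents before multiplying by 30 days, which is how the row arrives at its monthly value.

```python
from dataclasses import dataclass

# Prices in USD per 1,000 tokens, as listed in the Nova Pro row of the table.
INPUT_PRICE_PER_1K = 0.0008
OUTPUT_PRICE_PER_1K = 0.0032


@dataclass
class PromptBudget:
    """Token counts for one request, using the table's assumptions as defaults."""
    prompt_template: int = 400
    context: int = 400
    chat_history: int = 1080
    query: int = 20
    output: int = 160

    @property
    def input_tokens(self) -> int:
        # Every part of the prompt is billed as input tokens.
        return self.prompt_template + self.context + self.chat_history + self.query


def monthly_llm_cost(budget: PromptBudget, transactions_per_day: int, days: int = 30) -> float:
    """Estimate the monthly model cost, rounding daily costs to cents first."""
    daily_input = round(budget.input_tokens * transactions_per_day * INPUT_PRICE_PER_1K / 1000, 2)
    daily_output = round(budget.output * transactions_per_day * OUTPUT_PRICE_PER_1K / 1000, 2)
    return round((daily_input + daily_output) * days, 2)


if __name__ == "__main__":
    budget = PromptBudget()
    print(budget.input_tokens)             # input tokens per request
    print(monthly_llm_cost(budget, 8000))  # monthly USD at 8,000 transactions a day
```

Because chat history dominates the input budget in this example, trimming the retained history is the most direct way to lower the estimate.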

## Costs for adding a knowledge base


Knowledge base costs will vary based on the type of knowledge base used, and (in the case of Bedrock) the backing vector store used by the knowledge base. Provisioning and managing the knowledge bases is outside of the scope of the solution.

 **Amazon Bedrock Knowledge Bases** 

The solution does not manage or provision any resources related to Amazon Bedrock Knowledge Bases. Amazon Bedrock does not charge for the knowledge base feature itself; however, you are charged for the embedding model that your use case invokes on each query. Additionally, the backing vector store for your knowledge base (for example, an index in [Amazon OpenSearch Service](https://aws.amazon.com/opensearch-service), or a database inside Amazon Relational Database Service) has an associated cost that cannot be provided or calculated here.

For the above highly scalable generative AI query engine scenario, the costs incurred by this service for calling the Amazon Bedrock embeddings model are as follows:


| AWS service | Dimensions | Cost [USD] | 
| --- | --- | --- | 
|  Amazon Bedrock (Amazon Titan Text Embeddings V2)  |  8,000 queries a day with 1,900 input tokens per query = 15,200,000 tokens = \$0.30 USD per day. Daily cost x 30 days = \$9.00 USD monthly cost  |  \$9.00  | 
|  Amazon OpenSearch Service (Serverless) Sample Usage  |  Basic serverless configuration with 4 x OpenSearch Compute Unit (OCU) (billable minimum) = \$23.04 USD per day. Daily cost x 30 days = \$691.20 USD. This provides a rough estimate, as some workloads will require more OCUs, while customers with existing provisioned OpenSearch resources will incur less cost here.   |  \$691.20  | 
|   **Total additional cost**   |  |  \$700.20  | 
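
To adapt this estimate to a different query volume or OCU count, the two rows can be expressed as small functions. This is a rough sketch under stated assumptions: the embedding price of 0.00002 USD per 1,000 tokens and the OCU rate of 0.24 USD per OCU-hour are not quoted as rates in this table; they are inferred from its daily figures, so verify them against the current pricing pages. The function names are hypothetical.

```python
# Assumed rates (inferred from the table's daily figures, not official quotes).
EMBEDDING_PRICE_PER_1K_TOKENS = 0.00002  # USD, Titan Text Embeddings V2
OCU_PRICE_PER_HOUR = 0.24                # USD per OpenSearch Compute Unit


def embedding_monthly_cost(queries_per_day: int, tokens_per_query: int, days: int = 30) -> float:
    """Monthly embedding cost, rounding the daily cost to cents as the table does."""
    daily = round(queries_per_day * tokens_per_query * EMBEDDING_PRICE_PER_1K_TOKENS / 1000, 2)
    return round(daily * days, 2)


def opensearch_serverless_monthly_cost(ocus: int = 4, days: int = 30) -> float:
    """Monthly cost of keeping `ocus` OpenSearch Compute Units running around the clock."""
    return round(ocus * OCU_PRICE_PER_HOUR * 24 * days, 2)


if __name__ == "__main__":
    embeddings = embedding_monthly_cost(8000, 1900)
    vector_store = opensearch_serverless_monthly_cost(4)
    print(embeddings, vector_store, round(embeddings + vector_store, 2))
```

In this scenario the vector store, not the embedding calls, accounts for nearly all of the additional cost.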

 **Amazon Kendra** 

The solution can provision a Kendra index for you, or you can bring your own. The cost for running a configuration suited to the above highly scalable generative AI query engine is as follows:


| AWS service | Dimensions | Cost [USD] | 
| --- | --- | --- | 
|  Amazon Kendra  |  0-8,000 queries a day and up to 100,000 documents with Amazon Kendra Enterprise Edition with 0-50 data sources  |  \$1,008.00  | 

**Note**  
You can share the Amazon Kendra index between use cases, but this can drive up the number of queries per index. If the query volume exceeds what Amazon Kendra Enterprise Edition includes, additional charges apply.

## Incremental cost of enabling Amazon VPC for a use case


The following table provides the cost breakdown of enabling Amazon VPC for a use case deployed in two AZs.


| AWS service | Dimensions | Cost [USD] | 
| --- | --- | --- | 
|  Amazon NAT Gateway  |  Assumption: 2 AZ deployment, with a NAT Gateway in each AZ. 730 hours and 100 GB of data processed through NAT Gateway per month  |  \$74.70  | 
|  AWS PrivateLink (VPC Endpoints)  |  Assumptions: 2 AZ deployment, with 1 private subnet in each AZ and each VPC endpoint with 2 elastic network interfaces (ENIs). 6 VPC endpoints, 2 ENIs per VPC endpoint, 730 hours with 1,024 GB data processed in a month  |  \$97.84  | 
|  Public IPv4 address  |  Assumption: 2 AZ deployment, 1 public subnet in each AZ with a NAT Gateway in each public subnet. Each NAT Gateway configured with 1 active public IPv4 address. 2 active public IPv4 addresses x 730 hours in a month x \$0.005 hourly charge = \$7.30 USD  |  \$7.30  | 
|  Additional cost (for Amazon VPC)  |  |   **\$179.84**   | 
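
The public IPv4 row scales linearly with the number of Availability Zones, because each AZ adds one NAT Gateway with one public address. A minimal sketch of that calculation follows; it assumes the hourly charge of 0.005 USD per address and the 730-hour month used in the table, and the function name is hypothetical.

```python
HOURS_PER_MONTH = 730
PUBLIC_IPV4_HOURLY_RATE = 0.005  # USD per active public IPv4 address, from the table


def public_ipv4_monthly_cost(az_count: int, addresses_per_nat_gateway: int = 1) -> float:
    """Monthly public IPv4 charge for one NAT Gateway per Availability Zone."""
    addresses = az_count * addresses_per_nat_gateway
    return round(addresses * HOURS_PER_MONTH * PUBLIC_IPV4_HOURLY_RATE, 2)


if __name__ == "__main__":
    print(public_ipv4_monthly_cost(2))  # the 2-AZ deployment described in the table
    print(public_ipv4_monthly_cost(3))  # a hypothetical 3-AZ deployment
```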

## Cost implications when using Provisioned Throughput


Provisioned throughput costs will vary based on the type of model you’ve provisioned and your commitment period as well as Model Units selected for the commitment period. There is an additional cost associated with using Provisioned Throughput.

For more information and the most up-to-date pricing, you can refer to [Bedrock Pricing](https://aws.amazon.com/bedrock/pricing/).

## Cost for using cross-region inference


There is no additional cost for routing or data transfer for using [cross-region inference](https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference.html). You pay the same price per token for models as in your source or primary Region.

## Sample costs for an agent-based proof of concept


When you use Amazon Bedrock Agents, you’re charged based on the components comprising the agent, such as the backing model and knowledge base (if RAG is enabled), along with additional capabilities that you add. The following table shows the cost breakdown of a Bedrock Agent use case configured with an on-demand Claude 3.5 Sonnet model, Amazon Bedrock Knowledge Bases, and Amazon Bedrock Guardrails.

Similar to the [cost for adding Amazon Bedrock Knowledge Bases](#cost-of-adding-a-knowledge-base), this solution doesn’t manage or provision resources related to Amazon Bedrock Agents. Using Amazon Bedrock Knowledge Bases doesn’t incur a charge by itself, but you do incur cost for:
+ Using the embedding model for each query that is sent to it
+ The backing vector store for your knowledge base (for example, an index in Amazon OpenSearch Service, or a database inside Amazon RDS)

The following table assumes 100 interactions per day with 1,900 input tokens and 160 output tokens per query.

**Note**  
For this sample Bedrock Agent use case, if there were an action group configured to use an external API, those costs would be additional. They are outside the scope of the calculations in this table.


| AWS service | Dimensions | Cost [USD] | 
| --- | --- | --- | 
|  API Gateway (WebSocket), CloudFront, Lambda, Amazon S3, Systems Manager Parameter Store  |  100 chat interactions per day, average message size 32 KB per message, 5 minutes per connection  |  \$0.61  | 
|  CloudWatch  |  1.5 GB CloudWatch Logs with verbose mode on for experimentation  |  \$7.23  | 
|  DynamoDB  |  LLM configuration table for 1 KB record size and 1 GB storage  |  \$0.25  | 
|   **Subtotal of costs (not including LLMs)**   |  |   **\$8.09**   | 
|  Anthropic Claude 3.5 Sonnet  |  Daily cost for 190K input tokens per day (0.003/1,000 tokens) = \$0.57; daily cost × 30 days = \$17.10. Daily cost for 16K output tokens per day (0.015/1,000 tokens) = \$0.24; daily cost × 30 days = \$7.20  |  \$24.30  | 
|  Amazon Bedrock (Amazon Titan Text Embeddings V2) for Amazon Bedrock Knowledge Bases  |  Daily cost for 190K input tokens per day (0.00002/1000 tokens) = \$0.004. Daily cost × 30 days = \$0.12  |  \$0.12  | 
|  Amazon OpenSearch Service (Serverless) sample usage  |  Basic serverless configuration with 4 × OpenSearch Compute Unit (OCU) (billable minimum) = \$23.04 per day. Daily cost × 30 days = \$691.20  |  \$691.20  | 
|  Amazon Bedrock Guardrails  |  190K tokens is roughly equivalent to 760K (190,000 × 4) characters and 3,800 text units (760K characters / 200). Consider a guardrail configured with content filters, a personally identifiable information (PII) filter, a sensitive information filter (regular expression), and word filters. Daily content filter cost (0.75/1000 text units) + PII filter cost (\$0.1/1000 text units) + sensitive information filter (regex) + word filters = \$2.85 + \$0.38 + \$0 + \$0. Monthly cost = daily cost × 30 days = \$96.90  |  \$96.90  | 
|   **Total application cost for an agent backed by Anthropic Claude 3.5 Sonnet**   |   *\$8.09 (use case cost) + \$812.52 (other agent configurations)*   |   **\$820.61**   | 

**Note**  
Refer to the pricing guide of your LLM provider if you’re not using an AWS model provider. Pricing guides for AWS services can be found at: [Amazon Bedrock pricing](https://aws.amazon.com/bedrock/pricing/) and [Amazon SageMaker AI pricing](https://aws.amazon.com/sagemaker/pricing/).
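
The Guardrails row converts tokens to characters and then to billable text units before applying the filter prices. The sketch below walks through that conversion. It reuses the table's own assumptions (4 characters per token, 200 characters per text unit, 0.75 USD and 0.10 USD per 1,000 text units for the content and PII filters, and no charge for the regex and word filters); these are the document's estimation figures, not authoritative pricing, and the function names are hypothetical.

```python
# Conversion factors and prices as assumed in the Guardrails row of the table.
CHARS_PER_TOKEN = 4
CHARS_PER_TEXT_UNIT = 200
CONTENT_FILTER_PER_1K_UNITS = 0.75  # USD
PII_FILTER_PER_1K_UNITS = 0.10      # USD


def text_units(tokens: int) -> float:
    """Convert a token count into billable text units."""
    return tokens * CHARS_PER_TOKEN / CHARS_PER_TEXT_UNIT


def guardrails_monthly_cost(tokens_per_day: int, days: int = 30) -> float:
    """Monthly cost of the content and PII filters for a daily token volume."""
    daily_units = text_units(tokens_per_day)
    daily_cost = daily_units * (CONTENT_FILTER_PER_1K_UNITS + PII_FILTER_PER_1K_UNITS) / 1000
    return round(daily_cost * days, 2)


if __name__ == "__main__":
    print(text_units(190_000))               # text units per day
    print(guardrails_monthly_cost(190_000))  # monthly USD
```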

## Sample costs for MCP Server


MCP Server use cases enable deployment and management of Model Context Protocol servers on Amazon Bedrock AgentCore. The following table shows the cost breakdown of an MCP Server use case using the Gateway method to wrap existing Lambda functions.

The solution manages the AgentCore Gateway deployment and configuration. You’re charged for:
+ Infrastructure costs (API Gateway, Lambda, DynamoDB, CloudWatch, S3)
+ AgentCore Gateway consumption (per tool invocation)
+ Lambda function execution costs (for Gateway method with Lambda targets)
+ External API costs (for Gateway method with API or MCP Server targets, if applicable)


| Item | Calculations | Cost | 
| --- | --- | --- | 
|  Amazon API Gateway (REST API)  |  100 tool invocations per day × 30 days = 3,000 requests per month  |  \$0.05  | 
|  AWS Lambda (orchestration)  |  100 invocations per day × 30 days × 1 second average × 512 MB (0.5 GB) = 1,500 GB-seconds per month  |  \$0.05  | 
|  Amazon DynamoDB  |  3,000 read/write requests per month + 1 GB storage  |  \$0.15  | 
|  Amazon CloudWatch  |  Standard monitoring and logging for 3,000 invocations  |  \$1.00  | 
|  Amazon S3  |  Configuration storage and logs (minimal usage)  |  \$0.25  | 
|  Amazon Bedrock AgentCore Gateway  |  3,000 tool invocations per month  |  \$0.05  | 
|  Target Lambda Function  |  100 invocations per day × 30 days × 0.5 seconds × 128 MB (0.125 GB) = 187.5 GB-seconds per month  |  \$0.25  | 
|   **Total monthly cost**   |   *\$1.75 (infrastructure) + \$0.05 (AgentCore Gateway)*   |  \$1.80  | 

**Note**  
Costs vary based on deployment method (Gateway vs Runtime), target types, and usage patterns. Runtime method deployments incur AgentCore Runtime charges instead of Gateway charges. External API costs and custom container hosting costs are additional.

## Sample costs for Agent Builder


Agent Builder enables you to create and deploy custom agents on Amazon Bedrock AgentCore. The following table shows the cost breakdown of an Agent Builder use case configured with Claude 3.5 Sonnet, MCP server integration, and long-term memory enabled.

The solution manages the AgentCore Runtime deployment and configuration. You’re charged for:
+ Infrastructure costs (API Gateway, Lambda, DynamoDB, CloudWatch, S3)
+ AgentCore Runtime consumption (CPU and memory hours based on actual agent execution time)
+ Foundation model inference (input and output tokens)
+ AgentCore Memory (short-term events and long-term storage/retrieval)

The following table assumes 100 interactions per day with 1,900 input tokens and 160 output tokens per query, with an average agent execution time of 5 seconds per interaction.


| AWS service | Dimensions | Cost [USD] | 
| --- | --- | --- | 
|  API Gateway (WebSocket), CloudFront, Lambda, Amazon S3, Systems Manager Parameter Store  |  100 chat interactions per day, average message size 32 KB per message, 5 minutes per connection  |  \$0.61  | 
|  CloudWatch  |  1.5 GB CloudWatch Logs with verbose mode on for experimentation  |  \$7.23  | 
|  DynamoDB  |  LLM configuration table for 1 KB record size and 1 GB storage  |  \$0.25  | 
|   **Subtotal of infrastructure costs**   |  |   **\$8.09**   | 
|  Amazon Bedrock AgentCore Runtime  |  CPU: 1 vCPU × 5 seconds × 100 interactions = 500 vCPU-seconds/day ≈ 0.140 vCPU-hours/day; daily cost: 0.140 × \$0.0895 = \$0.013; monthly cost: \$0.013 × 30 = \$0.38. Memory: 512 MB (0.5 GB) × 5 seconds × 100 interactions = 250 GB-seconds/day = 0.069 GB-hours/day; daily cost: 0.069 × \$0.00945 = \$0.0007; monthly cost: \$0.0007 × 30 = \$0.02  |  \$0.40  | 
|  Anthropic Claude 3.5 Sonnet  |  Daily cost for 190K input tokens per day (0.003/1,000 tokens) = \$0.57; daily cost × 30 days = \$17.10. Daily cost for 16K output tokens per day (0.015/1,000 tokens) = \$0.24; daily cost × 30 days = \$7.20  |  \$24.30  | 
|  Amazon Bedrock AgentCore Memory  |  Short-term memory: 100 new events/day × \$0.25/1,000 events = \$0.025/day; monthly cost: \$0.025 × 30 = \$0.75. Long-term memory storage (built-in strategy): 100 records × \$0.75/1,000 records/month = \$0.075/month. Long-term memory retrieval: 100 retrievals/day × \$0.50/1,000 retrievals = \$0.05/day; monthly cost: \$0.05 × 30 = \$1.50  |  \$2.33  | 
|   **Total application cost for Agent Builder with Claude 3.5 Sonnet**   |   *\$8.09 (infrastructure) + \$0.40 (AgentCore Runtime) + \$24.30 (model) + \$2.33 (memory)*   |   **\$35.12**   | 

**Note**  
AgentCore Runtime pricing is consumption-based. Actual costs depend on:
+ Agent execution time (CPU and memory usage during active processing)
+ Number of interactions and their complexity
+ MCP tool usage (additional CPU/memory for tool execution)
+ Memory configuration (short-term vs. long-term memory enabled)

For detailed AgentCore pricing, refer to [Amazon Bedrock pricing](https://aws.amazon.com/bedrock/agentcore/pricing/).

**Note**  
If using MCP servers that invoke external APIs or services, those costs are additional and outside the scope of this calculation. Similarly, if using AgentCore Browser or Code Interpreter tools, consumption-based charges apply at \$0.0895 per vCPU-hour and \$0.00945 per GB-hour.
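
Because AgentCore Runtime is billed on active CPU and memory time, the estimate can be recomputed for other execution times or interaction volumes. The following sketch assumes consumption rates of 0.0895 USD per vCPU-hour and 0.00945 USD per GB-hour, with 1 vCPU and 0.5 GB per agent as in the Agent Builder example; the function name is hypothetical. It does not round intermediate values, so its result can differ by a cent or so from a hand calculation that rounds each step.

```python
VCPU_HOUR_RATE = 0.0895  # USD per vCPU-hour
GB_HOUR_RATE = 0.00945   # USD per GB-hour


def agentcore_runtime_monthly_cost(
    interactions_per_day: int,
    seconds_per_interaction: float,
    vcpus: float = 1.0,
    memory_gb: float = 0.5,
    days: int = 30,
) -> float:
    """Estimate the monthly AgentCore Runtime cost from active execution time."""
    active_hours_per_day = interactions_per_day * seconds_per_interaction / 3600
    cpu_cost = active_hours_per_day * vcpus * VCPU_HOUR_RATE * days
    memory_cost = active_hours_per_day * memory_gb * GB_HOUR_RATE * days
    return cpu_cost + memory_cost


if __name__ == "__main__":
    # 100 interactions a day at 5 seconds each, as in the example above.
    print(round(agentcore_runtime_monthly_cost(100, 5), 2))
```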

## Sample costs for Workflow Builder


Workflow Builder creates a supervisor agent that orchestrates multiple Agent Builder agents. The following table shows the cost breakdown for a workflow with 1 supervisor agent and 3 specialized Agent Builder agents, all configured with Claude 3.5 Sonnet and long-term memory enabled.

Assumptions: 100 interactions per day, average 2 agent delegations per interaction, 5 seconds execution time per agent.


| AWS service | Dimensions | Cost [USD] | 
| --- | --- | --- | 
|  API Gateway (WebSocket), CloudFront, Lambda, Amazon S3, Systems Manager Parameter Store  |  100 chat interactions per day, average message size 32 KB per message, 5 minutes per connection  |  \$0.61  | 
|  CloudWatch  |  1.5 GB CloudWatch Logs with verbose mode on for experimentation  |  \$7.23  | 
|  DynamoDB  |  LLM configuration table for 1 KB record size and 1 GB storage  |  \$0.25  | 
|   **Subtotal of infrastructure costs**   |  |   **\$8.09**   | 
|  Amazon Bedrock AgentCore Runtime (Supervisor Agent)  |  CPU: 1 vCPU × 5 seconds × 100 interactions = 0.140 vCPU-hours/day × 30 = \$0.38. Memory: 0.5 GB × 5 seconds × 100 interactions = 0.069 GB-hours/day × 30 = \$0.02  |  \$0.40  | 
|  Amazon Bedrock AgentCore Runtime (3 Specialized Agents)  |  Average 2 delegations per interaction = 200 agent executions/day. CPU: 1 vCPU × 5 seconds × 200 = 0.278 vCPU-hours/day × 30 = \$0.75. Memory: 0.5 GB × 5 seconds × 200 = 0.139 GB-hours/day × 30 = \$0.04  |  \$0.79  | 
|  Anthropic Claude 3.5 Sonnet (Supervisor Agent)  |  Input: 190K tokens/day × \$0.003/1K = \$0.57/day × 30 = \$17.10. Output: 16K tokens/day × \$0.015/1K = \$0.24/day × 30 = \$7.20  |  \$24.30  | 
|  Anthropic Claude 3.5 Sonnet (Specialized Agents)  |  Average 2 delegations per interaction. Input: 380K tokens/day × \$0.003/1K = \$1.14/day × 30 = \$34.20. Output: 32K tokens/day × \$0.015/1K = \$0.48/day × 30 = \$14.40  |  \$48.60  | 
|  Amazon Bedrock AgentCore Memory (Supervisor Agent)  |  Short-term: 100 events/day × \$0.25/1K × 30 = \$0.75. Long-term storage: 100 records × \$0.75/1K = \$0.08. Long-term retrieval: 100 retrievals/day × \$0.50/1K × 30 = \$1.50  |  \$2.33  | 
|  Amazon Bedrock AgentCore Memory (Specialized Agents)  |  Short-term: 200 events/day × \$0.25/1K × 30 = \$1.50. Long-term storage: 200 records × \$0.75/1K = \$0.15. Long-term retrieval: 200 retrievals/day × \$0.50/1K × 30 = \$3.00  |  \$4.65  | 
|   **Total application cost for Workflow Builder with 3 agents**   |   *\$8.09 (infrastructure) + \$1.19 (AgentCore Runtime) + \$72.90 (models) + \$6.98 (memory)*   |   **\$89.16**   | 

**Note**  
Higher delegation rates increase token consumption proportionally.  
For detailed AgentCore pricing, refer to [Amazon Bedrock pricing](https://aws.amazon.com/bedrock/pricing/).
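
To see how the delegation rate drives model cost, the supervisor and specialized-agent token charges can be modeled together. This is an illustrative sketch only: it assumes every agent execution consumes 1,900 input tokens and 160 output tokens at 0.003 USD and 0.015 USD per 1,000 tokens, as in the example above, and the function names are hypothetical.

```python
INPUT_PRICE_PER_1K = 0.003   # USD per 1,000 input tokens
OUTPUT_PRICE_PER_1K = 0.015  # USD per 1,000 output tokens


def model_monthly_cost(
    executions_per_day: float,
    input_tokens: int = 1900,
    output_tokens: int = 160,
    days: int = 30,
) -> float:
    """Monthly token cost for a number of agent executions per day."""
    daily_input = executions_per_day * input_tokens * INPUT_PRICE_PER_1K / 1000
    daily_output = executions_per_day * output_tokens * OUTPUT_PRICE_PER_1K / 1000
    return (daily_input + daily_output) * days


def workflow_model_cost(interactions_per_day: int, delegations_per_interaction: float) -> float:
    """Supervisor cost plus the cost of every delegated agent execution."""
    supervisor = model_monthly_cost(interactions_per_day)
    specialized = model_monthly_cost(interactions_per_day * delegations_per_interaction)
    return round(supervisor + specialized, 2)


if __name__ == "__main__":
    for delegations in (1, 2, 3):
        print(delegations, workflow_model_cost(100, delegations))
```

Each additional delegation per interaction adds the equivalent of one more supervisor-sized model bill.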

# Security


When you build systems on AWS infrastructure, security responsibilities are shared between you and AWS. This [shared responsibility model](https://aws.amazon.com/compliance/shared-responsibility-model/) reduces your operational burden because AWS operates, manages, and controls the components including the host operating system, virtualization layer, and physical security of the facilities in which the services operate. For more information about AWS security, visit [AWS Cloud Security](https://aws.amazon.com/security/).

## Using foundation models on Amazon Bedrock


Amazon Bedrock hosts a collection of models from Amazon Nova models to other leading foundation models (FMs). When using Amazon Bedrock, all models are hosted within the AWS infrastructure. This means that when using Amazon Bedrock as the LLM provider, all of your inference requests will remain within the AWS network and network traffic will not leave your Region.

**Note**  
All foundation models (FMs) available through Amazon Bedrock are hosted directly on AWS infrastructure managed and owned by AWS. Model providers do not have access to customer data such as prompts and continuations, or Amazon Bedrock service logs. For additional information about Amazon Bedrock’s security posture, refer to [Data protection in Amazon Bedrock](https://docs.aws.amazon.com/bedrock/) in the *Amazon Bedrock User Guide*.

## IAM roles


IAM roles allow customers to assign granular access policies and permissions to services and users on the AWS Cloud. This solution creates IAM roles that grant the solution’s Lambda functions access to create Regional resources.

## CloudWatch Logs


You can enable verbose mode while deploying a use case using the Deployment Dashboard model selection page, under Additional Settings. Verbose mode enables detailed CloudWatch logs which can be helpful for debugging and prompt experimentation.

**Note**  
When verbose mode is enabled, retrieved documents from the knowledge base (if RAG is enabled) and prompts will also be logged, which may contain sensitive information.

# VPC


The solution provides two options for Amazon VPC configuration:

1. Let the solution build an Amazon VPC for you.

1. Manage and bring your own Amazon VPC for use within the solution.

## Let the solution build an Amazon VPC for you


If you select the option to let the solution build an Amazon VPC, it deploys as a 2-AZ architecture by default with the CIDR range 10.10.0.0/20, with 1 public subnet and 1 private subnet in each AZ. You have the option to use [Amazon VPC IP Address Manager (IPAM)](https://docs.aws.amazon.com/vpc/latest/ipam/what-it-is-ipam.html). The solution creates a NAT Gateway in each of the public subnets, and configures Lambda functions to create their [ENIs](https://docs.aws.amazon.com/Lambda/latest/dg/foundation-networking.html) in the private subnets. Additionally, this configuration creates route tables and their entries, security groups and their rules, network ACLs, and VPC endpoints (gateway and interface endpoints).
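
When planning address space around the default range, it can help to see how a /20 divides across two Availability Zones. The sketch below uses Python's standard `ipaddress` module to split 10.10.0.0/20 into four equal blocks. The even split and the `plan_subnets` helper are assumptions for illustration; the subnet sizes that the solution actually allocates are not stated here and may differ.

```python
import ipaddress
from typing import Dict


def plan_subnets(vpc_cidr: str, az_count: int = 2) -> Dict[str, str]:
    """Split a VPC range into one public and one private block per AZ (illustrative)."""
    network = ipaddress.ip_network(vpc_cidr)
    needed = az_count * 2
    # Smallest number of extra prefix bits that yields at least `needed` blocks.
    extra_bits = (needed - 1).bit_length()
    blocks = list(network.subnets(prefixlen_diff=extra_bits))
    plan = {}
    for az in range(az_count):
        plan[f"public-az{az + 1}"] = str(blocks[2 * az])
        plan[f"private-az{az + 1}"] = str(blocks[2 * az + 1])
    return plan


if __name__ == "__main__":
    vpc = ipaddress.ip_network("10.10.0.0/20")
    print(vpc.num_addresses)  # total addresses in the default range
    for name, cidr in plan_subnets("10.10.0.0/20").items():
        print(name, cidr)
```

A check like this is also useful for confirming that the default range does not overlap with networks you plan to peer with.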

## Managing your own Amazon VPC


When deploying the solution with an Amazon VPC, you have the option to use an existing Amazon VPC in your AWS account and Region. We recommend that you make your VPC available in at least two Availability Zones to ensure high availability. Your VPC must also have the following VPC endpoints, with their associated IAM policies, for your VPC and route table configurations.

### For a Deployment dashboard Amazon VPC


1.  [Gateway endpoint for DynamoDB](https://docs.aws.amazon.com/vpc/latest/privatelink/vpc-endpoints-ddb.html).

1.  [Gateway endpoint for S3](https://docs.aws.amazon.com/vpc/latest/privatelink/vpc-endpoints-s3.html).

1.  [Interface endpoint for CloudWatch](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CloudWatch-logs-and-interface-VPC.html).

1.  [Interface endpoint for AWS CloudFormation](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-vpce-bucketnames.html).

### For a use case Amazon VPC


1.  [Gateway endpoint for DynamoDB](https://docs.aws.amazon.com/vpc/latest/privatelink/vpc-endpoints-ddb.html).

1.  [Gateway endpoint for S3](https://docs.aws.amazon.com/vpc/latest/privatelink/vpc-endpoints-s3.html).

1.  [Interface endpoint for CloudWatch](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CloudWatch-logs-and-interface-VPC.html).

1.  [Interface endpoint for Systems Manager Parameter Store](https://docs.aws.amazon.com/systems-manager/latest/userguide/setup-create-vpc.html).
**Note**  
The solution only requires `com.amazonaws.region.ssm`.

1.  [Interface endpoint for Amazon Bedrock (bedrock-runtime, agent-runtime, bedrock-agent-runtime)](https://docs.aws.amazon.com/bedrock/latest/userguide/vpc-interface-endpoints.html).

1. Optional: If the deployment will use Amazon Kendra as a knowledge base, then an [interface endpoint for Amazon Kendra](https://docs.aws.amazon.com/kendra/latest/dg/vpc-interface-endpoints.html) is needed.

1. Optional: if the deployment will use any LLM under Amazon Bedrock, then an [interface endpoint for Amazon Bedrock](https://docs.aws.amazon.com/bedrock/latest/userguide/vpc-interface-endpoints.html) is needed.
**Note**  
The solution only requires `com.amazonaws.region.bedrock-runtime`.

1. Optional: If the deployment will use Amazon SageMaker AI for the LLM, then an [interface endpoint for Amazon SageMaker AI](https://docs.aws.amazon.com/sagemaker/latest/dg/interface-vpc-endpoint.html) is needed.

**Note**  
The solution will not delete or modify the VPC configuration when using the **Bring your own VPC deployment** option. However, it will delete any VPCs that are created by the solution in the **Create a VPC for me** option. For this reason, you must be careful when sharing a solution-managed VPC across stacks/deployments.  
For example, deployment A uses the **Create a VPC for me** option. Deployment B uses **Bring my own VPC** with the VPC created by deployment A. If deployment A is deleted before deployment B, then deployment B will no longer work because the VPC has been deleted. Also, because deployment B is still using the ENIs created by its Lambda functions, deleting deployment A might produce errors and leave residual resources behind.
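
Before deploying a use case into your own VPC, you can compare the endpoint service names already present in the VPC against the list above. The following is a minimal, offline sketch: `missing_endpoints` is a hypothetical helper, and apart from `ssm` and `bedrock-runtime` (named in the notes above) the service-name suffixes are assumptions based on the usual `com.amazonaws.<region>.<service>` pattern, including `logs` for the CloudWatch endpoint. Confirm the exact names in the linked endpoint documentation.

```python
from typing import Iterable, List

# Endpoints every use case VPC needs, per the list above (suffixes partly assumed).
BASE_SERVICES = ("dynamodb", "s3", "logs", "ssm")


def missing_endpoints(
    region: str,
    existing_service_names: Iterable[str],
    uses_bedrock: bool = False,
    uses_kendra: bool = False,
    uses_sagemaker: bool = False,
) -> List[str]:
    """Return the expected endpoint service names that are not present in the VPC."""
    wanted = list(BASE_SERVICES)
    if uses_bedrock:
        wanted.append("bedrock-runtime")
    if uses_kendra:
        wanted.append("kendra")
    if uses_sagemaker:
        wanted.append("sagemaker.runtime")  # assumed suffix for SageMaker AI inference
    present = set(existing_service_names)
    expected = [f"com.amazonaws.{region}.{service}" for service in wanted]
    return [name for name in expected if name not in present]


if __name__ == "__main__":
    existing = ["com.amazonaws.us-east-1.s3", "com.amazonaws.us-east-1.dynamodb"]
    for name in missing_endpoints("us-east-1", existing, uses_bedrock=True):
        print("missing:", name)
```

The list of existing service names would typically come from your own inventory of the VPC's endpoints; how you collect it is outside the scope of this sketch.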

# Amazon CloudFront


This solution deploys a web console [hosted](https://docs.aws.amazon.com/AmazonS3/latest/dev/WebsiteHosting.html) in an Amazon S3 bucket. To help reduce latency and improve security, this solution includes a CloudFront distribution with an origin access identity, which is a CloudFront user that provides public access to the solution’s website bucket contents. For more information, see [Restricting Access to Amazon S3 Content by Using an Origin Access Identity](https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/private-content-restricting-access-to-s3.html) in the *Amazon CloudFront Developer Guide*.

**Note**  
CloudFront has an account-level soft quota limit of 20 response header policies. This solution creates custom response header policies for security purposes. If you have more than 20 deployments of the Generative AI Application Builder on AWS or its use cases, new deployments may fail due to hitting the quota limit.

To resolve this issue, you can request a quota increase for the **Response Header Policies** quota in the AWS Service Quotas console by following these steps:

1. Open the AWS Service Quotas console.

1. In the navigation pane, select **AWS services**.

1. Search for and select **Amazon CloudFront**.

1. Scroll to the **Response Header Policies** quota and choose **Request quota increase**.

1. Follow the prompts to request an increase in the quota limit for your AWS account.

By increasing the **Response Header Policies** quota, you can ensure that new deployments of the Generative AI Application Builder on AWS or its use cases do not fail due to the quota limit.

# Quotas


Service quotas, also referred to as limits, are the maximum number of service resources or operations for your AWS account.

## Quotas for AWS services in this solution


Make sure you have sufficient quota for each of the [services implemented in this solution](architecture-details.md#aws-services-in-this-solution). For more information, refer to [AWS service quotas](https://docs.aws.amazon.com/general/latest/gr/aws_service_limits.html).

Use the following links to go to the page for that service. To view the service quotas for all AWS services in the documentation without switching pages, view the information in the [Service endpoints and quotas](https://docs.aws.amazon.com/general/latest/gr/aws-general.pdf#aws-service-information) page in the PDF instead.

## Amazon Bedrock AgentCore quotas


For Agent Builder deployments, be aware of the following Amazon [Bedrock AgentCore service quotas](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/bedrock-agentcore-limits.html):


| Quota | US East (N. Virginia) | Other Regions | 
| --- | --- | --- | 
|  Active Session workloads per account  |  1,000  |  500  | 
|  Total agents per account  |  1,000  |  1,000  | 
|  Versions per account  |  1,000  |  1,000  | 