GENREL05-BP03 Verify that agent capabilities are available across all regions of availability
Agents require supporting infrastructure to service requests from foundation models. To use agents across a region of availability, that supporting infrastructure must be available throughout the region.
Desired outcome: When implemented, this best practice improves the reliability of your generative AI workload by verifying that agents have access to the appropriate supporting infrastructure such as APIs or functions, so they may service a wider region of availability.
Benefits of establishing this best practice: Scale horizontally to increase aggregate workload availability - Data replication across a region of availability horizontally scales data access infrastructure, enabling foundation models to consistently service inference requests across a region of availability.
Level of risk exposed if this best practice is not established: Medium
Implementation guidance
Amazon Bedrock Agents can be made available across Regions, as long as the models and supporting infrastructure exist in the desired Regions. Agents make API calls on behalf of a user, so once deployed to a new Region, an agent must have access to the same or a Regionally equivalent API. Consider deploying your APIs across multiple Regions behind a CloudFront distribution with latency-based routing. When possible, use Amazon Route 53 with latency-based routing to direct traffic within your VPC (and on the Amazon backbone) rather than sending private traffic over the public internet to reach an internal service. If your agent is not making calls to a foundation model using a cross-Region inference profile, be sure to configure model access in all required Regions.
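One way to check the last point is to compare the models an agent depends on against what each Region offers. The following is a minimal sketch, not a definitive implementation. It assumes boto3 is installed and credentials are configured when the default client factory is used. The function names, model IDs, and Region list are hypothetical. The check only confirms that a model is listed in a Region. It does not confirm that model access has been granted to your account.

```python
from typing import Callable, Dict, Iterable, List


def default_client_factory(region: str):
    """Create a Bedrock control-plane client for one Region.

    boto3 is imported lazily so the comparison logic below can be
    exercised with a stub client when boto3 is not installed.
    """
    import boto3

    return boto3.client("bedrock", region_name=region)


def models_missing_by_region(
    model_ids: Iterable[str],
    regions: Iterable[str],
    client_factory: Callable[[str], object] = default_client_factory,
) -> Dict[str, List[str]]:
    """Return {region: [model IDs not listed in that Region]}.

    Regions that list every required model are omitted from the result.
    """
    required = set(model_ids)
    missing: Dict[str, List[str]] = {}
    for region in regions:
        client = client_factory(region)
        # ListFoundationModels returns a "modelSummaries" list whose
        # entries each carry a "modelId".
        summaries = client.list_foundation_models()["modelSummaries"]
        offered = {summary["modelId"] for summary in summaries}
        gaps = sorted(required - offered)
        if gaps:
            missing[region] = gaps
    return missing
```

Calling `models_missing_by_region(["<model-id>"], ["us-east-1", "eu-west-1"])` with real credentials would return an empty dictionary when every Region lists the model. A non-empty result names the Regions that still need attention before an agent is deployed there.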
When using agents in your generative AI architecture, make the supporting infrastructure, such as APIs and functions, available across all Regions where your agents are deployed. This involves replicating the necessary components and configuring appropriate routing mechanisms to maintain consistent agent functionality regardless of user location.
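To illustrate what a latency-based routing mechanism decides, the sketch below picks the healthy Region with the lowest measured latency. It is only an illustration of the selection logic. In practice a managed service such as Amazon Route 53 makes this decision at the DNS layer. The function name, the latency figures, and the health flags are all hypothetical.

```python
from typing import Dict, Optional


def pick_region(
    latency_ms: Dict[str, float], healthy: Dict[str, bool]
) -> Optional[str]:
    """Return the healthy Region with the lowest latency, or None.

    A Region missing from `healthy` is treated as unhealthy, so an
    unverified Region never receives traffic.
    """
    candidates = {
        region: ms
        for region, ms in latency_ms.items()
        if healthy.get(region, False)
    }
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


# Hypothetical measurements taken from one caller's location.
measured = {"us-east-1": 40.0, "eu-west-1": 15.0}
status = {"us-east-1": True, "eu-west-1": True}
chosen = pick_region(measured, status)
```

Treating unknown Regions as unhealthy is a deliberate choice here: a Region whose supporting APIs have not been verified should not serve agent requests.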
Implementation steps
- Deploy supporting agent infrastructure (APIs, functions) in primary and secondary Regions.
- Implement latency-based routing or similar mechanisms to distribute agent requests.
- Verify that agents can access the required resources in all Regions.
- Monitor agent performance and resource utilization across Regions.
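The verification step above can be sketched as a small check that probes every required resource in every Region and reports the gaps. This is a minimal sketch under stated assumptions: `verify_regions`, the resource names, and the inventory are hypothetical, and the probe is injected so it can be replaced with whatever check fits your workload, such as an HTTP health request to a Regional API endpoint.

```python
from typing import Callable, Dict, Iterable, List


def verify_regions(
    regions: Iterable[str],
    resources: Iterable[str],
    probe: Callable[[str, str], bool],
) -> Dict[str, List[str]]:
    """Return {region: [resources whose probe failed]}.

    `probe(region, resource)` should return True when the resource is
    reachable from that Region. Regions with no failures are omitted.
    """
    resource_list = list(resources)
    failures: Dict[str, List[str]] = {}
    for region in regions:
        failed = [name for name in resource_list if not probe(region, name)]
        if failed:
            failures[region] = failed
    return failures


# Hypothetical inventory standing in for real health checks.
inventory = {
    "us-east-1": {"orders-api", "lookup-function"},
    "eu-west-1": {"orders-api"},
}


def inventory_probe(region: str, resource: str) -> bool:
    """Report a resource as reachable only if the inventory lists it."""
    return resource in inventory.get(region, set())


gaps = verify_regions(
    ["us-east-1", "eu-west-1"],
    ["orders-api", "lookup-function"],
    inventory_probe,
)
```

Running a check like this on a schedule, and publishing the result as a metric, is one possible way to connect the verification and monitoring steps.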