# Guidance for Multi-Tenant Knowledge Base Management for scalable RAG Applications on AWS

## Overview

This Guidance demonstrates how enterprises can address the challenge of managing fragmented knowledge bases across multiple tenant environments by implementing a centralized synchronization system that automatically distributes content updates to all connected RAG applications. The system uses authenticated web interfaces and secure APIs to handle content ingestion from various sources, including Amazon S3 and on-premises systems connected through AWS Direct Connect. AWS Step Functions orchestrates the processing pipeline, while Amazon Kinesis streams distribute updates to tenant-specific queues so that each tenant receives relevant content without affecting others. You can reduce operational overhead by up to 60% while ensuring consistent, real-time knowledge synchronization across all your RAG applications and tenant environments.

## Benefits

### Scale RAG applications with tenant isolation

Deliver knowledge base content to each tenant through dedicated queues and isolated pipelines, so you can onboard new tenants without disrupting existing workloads or compromising data boundaries.


### Accelerate AI deployment with serverless orchestration

Automate your end-to-end data ingestion and distribution pipeline using serverless compute and workflow orchestration, so your teams spend less time on infrastructure and more time building AI-powered applications.


### Strengthen compliance with built-in audit trails

Capture data snapshots at every processing stage so you maintain a complete audit trail, support governance requirements, and roll back to any prior state when needed.


## How it works

The architecture diagram below shows the key components of this Guidance and how they interact. The numbered steps that follow walk through the architecture in order.

[Download the architecture diagram](https://d1.awsstatic.com/onedam/marketing-channels/website/aws/en_US/solutions/approved/documents/architecture-diagrams/multi-tenant-knowledge-base-management-for-scalable-rag-applications-on-aws.pdf)

![Architecture diagram](/images/solutions/multi-tenant-knowledge-base-management-for-scalable-rag-applications-on-aws/images/multi-tenant-knowledge-base-management-for-scalable-rag-applications-on-aws-1.png)

1. **Step 1**: Authenticated tenants access the solution through the Knowledge Base Management Web Application, which serves as the primary interface for both submitting content and querying knowledge base data. All requests from the web application are routed through Amazon API Gateway, which acts as the secure entry point for all API interactions. Amazon Cognito handles identity management, authenticating and authorizing users before any request is processed. Together, API Gateway and Cognito ensure that only verified tenants can perform read and write operations, such as configuring data sources and triggering ingestion workflows, through a consistent and secure API layer.
1. **Step 2**: Amazon S3 is the primary content ingestion provider for unstructured content such as HTML, PDF, and image files. AWS Direct Connect enables secure connectivity to on-premises data providers.
1. **Step 3**: AWS Lambda provides the serverless compute for AWS Step Functions tasks. AWS Step Functions orchestrates the end-to-end data pipeline, coordinating extract, transform, and distribution steps. Amazon S3 stores snapshots of data between intermediate processing steps. This caching strategy serves multiple purposes: audit trail maintenance, governance compliance, performance optimization through reduced reprocessing, and rollback capability. Amazon DynamoDB is used to persist workflow state and tenant configuration data.
1. **Step 4**: Processed content updates from the ingestion workflows are streamed into Amazon Kinesis, which serves as the entry point for the distribution layer. A dedicated AWS Lambda function acts as a Kinesis consumer, continuously reading records from the stream. For each content update, this Lambda function looks up the tenant configuration stored in Amazon DynamoDB to determine which tenants have subscribed to the content and identifies the corresponding tenant-specific Amazon SQS queues. The Lambda function then publishes the update to each of those queues. Each tenant has its own isolated SQS queue, which decouples the publishing process across tenants. This isolation ensures that a failure in delivering content to Tenant A's account does not affect Tenant B, providing fault tolerance and independent retry behavior per tenant. Downstream AWS Lambda functions subscribed to each tenant's SQS queue then consume the messages and route the content to the appropriate tenant AWS account for final delivery.
1. **Step 5**: Amazon S3 stores distributed content as an interim data source for ingestion into target knowledge base systems such as Amazon Bedrock Knowledge Bases and Amazon Quick Index (part of Amazon Quick Suite), an AI-powered workspace that creates a unified knowledge foundation by consolidating documents, files, and application data. Amazon Bedrock Knowledge Bases is often used with Amazon Bedrock AgentCore, a comprehensive set of enterprise-grade services for deploying and operating AI agents at scale, and can be implemented using frameworks like Strands Agents SDK for agentic use cases.

[Read usage guidelines](/solutions/guidance-disclaimers/)
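
The snapshot behavior described in Step 3 can be illustrated with a minimal Python sketch. It is not the Guidance's implementation: a plain dictionary stands in for the Amazon S3 snapshot bucket, ordinary functions stand in for the Lambda tasks that Step Functions would invoke, and the key layout, step names, and `run_pipeline` helper are all hypothetical. It shows only the idea that persisting each stage's output lets a rerun skip completed stages and lets an operator roll back by discarding later snapshots.

```python
import json
from typing import Callable, Dict, List, Tuple

# A pipeline step is a (name, function) pair; both are illustrative assumptions.
Step = Tuple[str, Callable[[dict], dict]]


def snapshot_key(run_id: str, index: int, name: str) -> str:
    """Build a hypothetical object key for the snapshot of one stage."""
    return f"snapshots/{run_id}/{index:02d}-{name}.json"


def run_pipeline(
    run_id: str, steps: List[Step], payload: dict, store: Dict[str, str]
) -> Tuple[dict, List[str]]:
    """Run steps in order and snapshot each output.

    `store` stands in for an S3 bucket. If a snapshot already exists for a
    stage, it is reused instead of re-running that stage. Returns the final
    payload and the names of the stages that actually executed.
    """
    executed: List[str] = []
    for index, (name, fn) in enumerate(steps):
        key = snapshot_key(run_id, index, name)
        if key in store:
            # Reuse the stored snapshot (reduced reprocessing).
            payload = json.loads(store[key])
        else:
            payload = fn(payload)
            # Persist the stage output (audit trail and rollback point).
            store[key] = json.dumps(payload, sort_keys=True)
            executed.append(name)
    return payload, executed


# Hypothetical extract and transform stages.
EXAMPLE_STEPS: List[Step] = [
    ("extract", lambda p: {**p, "text": p["raw"].strip()}),
    ("transform", lambda p: {**p, "text": p["text"].lower()}),
]

if __name__ == "__main__":
    bucket: Dict[str, str] = {}
    result, ran = run_pipeline("run-1", EXAMPLE_STEPS, {"raw": "  Hello RAG  "}, bucket)
    print(result["text"], ran)
```

Note that this sketch reuses a snapshot purely by key, so it assumes the input for a given `run_id` does not change between runs. A real pipeline would likely also need to version or hash inputs.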

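The per-tenant fan-out in Step 4 can be sketched in the same spirit. The example below assumes the standard shape of a Kinesis event delivered to Lambda (base64-encoded `data` under each record's `kinesis` key). Everything else is hypothetical: the `TENANT_CONFIG` table stands in for the DynamoDB tenant configuration, the `topic` and `content_id` fields and the queue URLs are invented for illustration, and the `send` callable is injected so the logic can run without AWS credentials. In a deployed function, `send` could wrap the SQS `send_message` call, and the handler would take the usual `(event, context)` arguments.

```python
import base64
import json
from typing import Callable, Dict, List

# Hypothetical tenant configuration; the Guidance stores this in Amazon DynamoDB.
TENANT_CONFIG: Dict[str, dict] = {
    "tenant-a": {
        "topics": {"hr", "legal"},
        "queue_url": "https://example.invalid/queues/tenant-a",
    },
    "tenant-b": {
        "topics": {"legal"},
        "queue_url": "https://example.invalid/queues/tenant-b",
    },
}

# send(queue_url, message_body); raises an exception on delivery failure.
SendFn = Callable[[str, str], None]


def decode_record(record: dict) -> dict:
    """Decode one Kinesis record as delivered to Lambda (base64-encoded JSON)."""
    return json.loads(base64.b64decode(record["kinesis"]["data"]))


def route_update(update: dict, tenant_config: Dict[str, dict], send: SendFn) -> List[str]:
    """Publish an update to every subscribed tenant's queue.

    A failure for one tenant is recorded and does not stop delivery to the
    others. Returns the IDs of tenants whose send failed.
    """
    failed: List[str] = []
    for tenant_id, cfg in tenant_config.items():
        if update["topic"] not in cfg["topics"]:
            continue  # Tenant has not subscribed to this content.
        try:
            send(cfg["queue_url"], json.dumps(update))
        except Exception:
            failed.append(tenant_id)
    return failed


def handler(
    event: dict, send: SendFn, tenant_config: Dict[str, dict] = TENANT_CONFIG
) -> Dict[str, List[str]]:
    """Process a batch of Kinesis records; return failed tenants per content ID."""
    failures: Dict[str, List[str]] = {}
    for record in event["Records"]:
        update = decode_record(record)
        failed = route_update(update, tenant_config, send)
        if failed:
            failures[update["content_id"]] = failed
    return failures
```

Catching the exception per tenant inside the loop is what gives each tenant independent failure handling in this sketch. How failed deliveries are retried (for example, through a dead-letter queue) is left open here because the source does not specify it.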
