本文為英文版的機器翻譯版本，如內容有任何歧義或不一致之處，概以英文版為準。

# AWS 使用 Terraform 和 Amazon Bedrock 在 上部署 RAG 使用案例
<a name="deploy-rag-use-case-on-aws"></a>

*Martin Maritsch、Nicolas Jacob Baer、Olivier Brique、Julian Ferdinand Grueber、Alice Morano 和 Nicola D Orazio、Amazon Web Services*

## 總結
<a name="deploy-rag-use-case-on-aws-summary"></a>

AWS 提供各種選項來建置支援[擷取增強生成 (RAG) 的](https://aws.amazon.com/what-is/retrieval-augmented-generation/)生成式 AI 使用案例。此模式為您提供以 LangChain 和 Amazon Aurora PostgreSQL 相容作為向量存放區之 RAG 型應用程式的解決方案。您可以使用 Terraform 直接將此解決方案部署到 ， AWS 帳戶 並實作下列簡單的 RAG 使用案例：

1. 使用者手動將檔案上傳至 Amazon Simple Storage Service (Amazon S3) 儲存貯體，例如 Microsoft Excel 檔案或 PDF 文件。（如需支援檔案類型的詳細資訊，請參閱[非結構化](https://docs.unstructured.io/open-source/core-functionality/partitioning)文件。)

1. 檔案的內容會解壓縮並內嵌至以無伺服器 Aurora PostgreSQL 相容為基礎的知識資料庫中，該資料庫支援近乎即時地將文件擷取至向量存放區。此方法可讓 RAG 模型存取和擷取低延遲之使用案例的相關資訊。

1. 當使用者與文字產生模型互動時，它會透過從先前上傳的檔案擷取相關內容擴增來增強互動。

模式使用 [Amazon Titan Text Embeddings v2](https://docs.aws.amazon.com/bedrock/latest/userguide/titan-embedding-models.html) 做為內嵌模型，而 [Anthropic Claude 3 Sonnet](https://aws.amazon.com/bedrock/claude/) 做為文字產生模型，兩者皆可在 Amazon Bedrock 上使用。

## 先決條件和限制
<a name="deploy-rag-use-case-on-aws-prereqs"></a>

**先決條件**
+ 作用中 AWS 帳戶。
+ AWS Command Line Interface (AWS CLI) 已安裝並使用 設定 AWS 帳戶。如需安裝說明，請參閱 AWS CLI 文件[中的安裝或更新至最新版本的 AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html) 。若要檢閱您的 AWS 登入資料和對帳戶的存取，請參閱 AWS CLI 文件中的[組態和登入資料檔案設定](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html)。
+ 在 Amazon Bedrock 主控台中為所需的大型語言模型 (LLMs) 啟用的模型存取 AWS 帳戶。此模式需要下列 LLMs：
  + `amazon.titan-embed-text-v2:0`
  + `anthropic.claude-3-sonnet-20240229-v1:0`

**限制**
+ 此範例架構不包含使用向量資料庫進行程式設計問題回答的界面。如果您的使用案例需要 API，請考慮使用執行擷取和問答任務的 AWS Lambda 函數新增 [Amazon API Gateway](https://docs.aws.amazon.com/apigateway/latest/developerguide)。 
+ 此範例架構不包含已部署基礎設施的監控功能。如果您的使用案例需要監控，請考慮新增[AWS 監控服務](https://docs.aws.amazon.com/prescriptive-guidance/latest/implementing-logging-monitoring-cloudwatch/welcome.html)。
+ 如果您在短時間內將大量文件上傳至 Amazon S3 儲存貯體，Lambda 函數可能會遇到速率限制。作為解決方案，您可以將 Lambda 函數與 Amazon Simple Queue Service (Amazon SQS) 佇列分離，您可以在其中控制 Lambda 調用速率。
+ 有些 AWS 服務 完全無法使用 AWS 區域。如需區域可用性，請參閱[AWS 服務 依區域](https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/)。如需特定端點，請參閱[服務端點和配額](https://docs.aws.amazon.com/general/latest/gr/aws-service-information.html)，然後選擇服務的連結。

**產品版本**
+ [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) 第 2 版或更新版本
+ [Docker](https://docs.docker.com/get-started/) 26.0.0 版或更新版本
+ [Poetry](https://pypi.org/project/poetry/) 1.7.1 版或更新版本
+ [Python](https://www.python.org/downloads/) 3.10 版或更新版本
+ [Terraform](https://developer.hashicorp.com/terraform/install) 1.8.4 版或更新版本

## Architecture
<a name="deploy-rag-use-case-on-aws-architecture"></a>

下圖顯示此模式的工作流程和架構元件。

![\[在 Amazon Bedrock 上使用 Aurora PostgreSQL 和 LLMs建立 RAG 型應用程式的工作流程。\]](http://docs.aws.amazon.com/zh_tw/prescriptive-guidance/latest/patterns/images/pattern-img/8f184945-7f17-4760-8806-6d0eaeef372a/images/3771b7a0-05bd-4eb3-ad5b-199e22f86184.png)


此圖表說明下列項目：

1. 在 Amazon S3 儲存貯體 中建立物件時`bedrock-rag-template-<account_id>`，[Amazon S3 通知](https://docs.aws.amazon.com/AmazonS3/latest/userguide/EventNotifications.html)會叫用 Lambda 函數 `data-ingestion-processor`。

1. Lambda 函數`data-ingestion-processor`是以存放在 Amazon Elastic Container Registry (Amazon ECR) 儲存庫 中的 Docker 映像為基礎`bedrock-rag-template`。

   函數使用 [LangChain S3FileLoader](https://python.langchain.com/v0.1/docs/integrations/document_loaders/aws_s3_file/) 將檔案讀取為 [LangChain 文件](https://api.python.langchain.com/en/v0.0.339/schema/langchain.schema.document.Document.html)。然後，[LangChain RecursiveCharacterTextSplitter](https://python.langchain.com/v0.1/docs/modules/data_connection/document_transformers/recursive_text_splitter/) 會指定每個文件區塊，而 `CHUNK_SIZE`和 `CHUNK_OVERLAP`取決於 Amazon Titan Text Embedding V2 內嵌模型的最大字符大小。接著，Lambda 函數會叫用 Amazon Bedrock 上的內嵌模型，將區塊內嵌到數值向量表示法中。最後，這些向量會存放在 Aurora PostgreSQL 資料庫中。若要存取資料庫，Lambda 函數會先從中擷取使用者名稱和密碼 AWS Secrets Manager。

1. 在 Amazon SageMaker AI [筆記本執行個體](https://docs.aws.amazon.com/sagemaker/latest/dg/nbi.html) 上`aws-sample-bedrock-rag-template`，使用者可以撰寫問題提示。此程式碼會在 Amazon Bedrock 上叫用 Claude 3，並將知識庫資訊新增至提示的內容。因此，Claude 3 會使用文件中的資訊提供回應。

此模式的聯網和安全性方法如下：
+ Lambda 函數`data-ingestion-processor`位於虛擬私有雲端 (VPC) 內的私有子網路中。由於 Lambda 函數的安全群組，因此不允許將流量傳送至公有網際網路。因此，流向 Amazon S3 和 Amazon Bedrock 的流量只會透過 VPC 端點路由。因此，流量不會周遊公有網際網路，這可減少延遲，並在聯網層級增加額外的安全層。
+ 適用時，所有資源和資料都會使用別名 的 AWS Key Management Service (AWS KMS) 金鑰進行加密`aws-sample/bedrock-rag-template`。

**自動化和擴展**

此模式使用 Terraform 將基礎設施從程式碼儲存庫部署到 AWS 帳戶。

## 工具
<a name="deploy-rag-use-case-on-aws-tools"></a>

**AWS 服務**
+ [Amazon Aurora PostgreSQL 相容版本](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/Aurora.AuroraPostgreSQL.html)是完全受管的 ACID 相容關聯式資料庫引擎，可協助您設定、操作和擴展 PostgreSQL 部署。在此模式中，Aurora PostgreSQL 相容會使用 pgvector 外掛程式做為向量資料庫。
+ [Amazon Bedrock](https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html) 是一項全受管服務，可讓您透過統一 API 使用來自領導 AI 新創公司的高效能基礎模型 (FMs) 和 Amazon。
+ [AWS Command Line Interface (AWS CLI)](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-welcome.html) 是一種開放原始碼工具，可協助您 AWS 服務 透過命令列 shell 中的命令與 互動。
+ [Amazon Elastic Container Registry (Amazon ECR)](https://docs.aws.amazon.com/AmazonECR/latest/userguide/what-is-ecr.html) 是一種受管容器映像登錄服務，安全、可擴展且可靠。在此模式中，Amazon ECR 會託管 `data-ingestion-processor` Lambda 函數的 Docker 映像。
+ [AWS Identity and Access Management (IAM)](https://docs.aws.amazon.com/IAM/latest/UserGuide/introduction.html) 透過控制已驗證並獲授權使用的人員，協助您安全地管理對 AWS 資源的存取。
+ [AWS Key Management Service (AWS KMS)](https://docs.aws.amazon.com/kms/latest/developerguide/overview.html) 可協助您建立和控制密碼編譯金鑰，以協助保護您的資料。
+ [AWS Lambda](https://docs.aws.amazon.com/lambda/latest/dg/welcome.html) 是一項運算服務，可協助您執行程式碼，無需佈建或管理伺服器。它只會在需要時執行程式碼並自動擴展，因此您只需支付使用的運算時間。在此模式中，Lambda 會將資料擷取至向量存放區。
+ [Amazon SageMaker AI](https://docs.aws.amazon.com/sagemaker/?id=docs_gateway) 是一種受管機器學習 (ML) 服務，可協助您建置和訓練 ML 模型，然後將模型部署到生產就緒的託管環境中。
+ [AWS Secrets Manager](https://docs.aws.amazon.com/secretsmanager/latest/userguide/intro.html) 可協助您將程式碼中的硬式編碼憑證 (包括密碼) 取代為 Secrets Manager 的 API 呼叫，以便透過程式設計方法來擷取機密。
+ [Amazon Simple Storage Service (Amazon S3)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html) 是一種雲端型物件儲存服務，可協助您儲存、保護和擷取任何數量的資料。
+ [Amazon Virtual Private Cloud (Amazon VPC)](https://docs.aws.amazon.com/vpc/latest/userguide/what-is-amazon-vpc.html) 可協助您在已定義的虛擬網路中啟動 AWS 資源。此虛擬網路與您在自己的資料中心中操作的傳統網路相似，且具備使用 AWS可擴展基礎設施的優勢。VPC 包含子網路和路由表，以控制流量流程。

**其他工具**
+ [Docker](https://docs.docker.com/manuals/) 是一組平台即服務 (PaaS) 產品，可在作業系統層級使用虛擬化在容器中交付軟體。
+ [HashiCorp Terraform](https://www.terraform.io/docs) 是一種基礎設施即程式碼 (IaC) 工具，可協助您使用程式碼來佈建和管理雲端基礎設施和資源。
+ [Poetry](https://pypi.org/project/poetry/) 是一種在 Python 中管理相依性和封裝的工具。
+ [Python](https://www.python.org/) 是一種一般用途的電腦程式設計語言。

**程式碼儲存庫**

此模式的程式碼可在 GitHub [terraform-rag-template-using-amazon-bedrock](https://github.com/aws-samples/terraform-rag-template-using-amazon-bedrock) 儲存庫中使用。

## 最佳實務
<a name="deploy-rag-use-case-on-aws-best-practices"></a>
+ 雖然此程式碼範例可以部署到任何 AWS 區域，但我們建議您使用美國東部 （維吉尼亞北部） – `us-east-1`或美國西部 （加利佛尼亞北部） – `us-west-1`。此建議是根據此模式發佈時 Amazon Bedrock 中基礎模型和內嵌模型的可用性。如需 中 up-to-date清單 AWS 區域，請參閱 Amazon Bedrock 文件中的 [模型支援 AWS 區域](https://docs.aws.amazon.com/bedrock/latest/userguide/models-regions.html)。如需將此程式碼範例部署到其他區域的詳細資訊，請參閱[其他資訊](#deploy-rag-use-case-on-aws-additional)。
+ 此模式僅提供proof-of-concept(PoC) 或試行示範。如果您想要將程式碼帶入生產環境，請務必使用下列最佳實務：
  + 啟用 Amazon S3 的伺服器存取記錄。
  + 設定 Lambda 函數的[監控和提醒](https://docs.aws.amazon.com/lambda/latest/dg/lambda-monitoring.html)。
  + 如果您的使用案例需要 API，請考慮使用執行擷取和問答任務的 Lambda 函數新增 Amazon API Gateway。
+ 遵循最低權限原則，並授予執行任務所需的最低許可。如需詳細資訊，請參閱 IAM 文件中的[授予最低權限](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies.html#grant-least-priv)和[安全最佳實務](https://docs.aws.amazon.com/IAM/latest/UserGuide/IAMBestPracticesAndUseCases.html)。

## 史詩
<a name="deploy-rag-use-case-on-aws-epics"></a>

### 在 中部署解決方案 AWS 帳戶
<a name="deploy-the-solution-in-an-aws-account"></a>


| 任務 | Description | 所需的技能 | 
| --- | --- | --- | 
| 複製儲存庫。 | 若要複製此模式隨附的 GitHub 儲存庫，請使用下列命令：<pre>git clone https://github.com/aws-samples/terraform-rag-template-using-amazon-bedrock</pre> | AWS DevOps | 
| 設定變數。 | 若要設定此模式的參數，請執行下列動作：[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/zh_tw/prescriptive-guidance/latest/patterns/deploy-rag-use-case-on-aws.html) | AWS DevOps | 
| 部署解決方案。 | 若要部署解決方案，請執行下列動作：[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/zh_tw/prescriptive-guidance/latest/patterns/deploy-rag-use-case-on-aws.html)基礎設施部署會在 VPC 內佈建 SageMaker AI 執行個體，並具有存取 Aurora PostgreSQL 資料庫的許可。 | AWS DevOps | 

### 測試解決方案
<a name="test-the-solution"></a>


| 任務 | Description | 所需的技能 | 
| --- | --- | --- | 
| 執行示範。 | 先前的基礎設施部署成功後，請使用下列步驟在 Jupyter 筆記本中執行示範：[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/zh_tw/prescriptive-guidance/latest/patterns/deploy-rag-use-case-on-aws.html)Jupyter 筆記本會引導您完成下列程序：[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/zh_tw/prescriptive-guidance/latest/patterns/deploy-rag-use-case-on-aws.html) | 一般 AWS | 

### 清除基礎設施
<a name="clean-up-infrastucture"></a>


| 任務 | Description | 所需的技能 | 
| --- | --- | --- | 
| 清除基礎設施。 | 若要移除您不再需要的所有資源，請使用下列命令：<pre>terraform destroy -var-file=commons.tfvars</pre> | AWS DevOps | 

## 相關資源
<a name="deploy-rag-use-case-on-aws-resources"></a>

**AWS resources**
+ [使用 Python 建置 Lambda 函數](https://docs.aws.amazon.com/lambda/latest/dg/lambda-python.html)
+ [基礎模型的推論參數](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html)
+ [存取 Amazon Bedrock 基礎模型](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html)
+ [向量資料庫在生成式 AI 應用程式中的角色 ](https://aws.amazon.com/blogs/database/the-role-of-vector-datastores-in-generative-ai-applications/)(AWS 資料庫部落格）
+ [使用 Amazon Aurora PostgreSQL](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/Aurora.AuroraPostgreSQL.html)

**其他資源**
+ [pgvector 文件](https://github.com/pgvector/pgvector)

## 其他資訊
<a name="deploy-rag-use-case-on-aws-additional"></a>

**實作向量資料庫**

此模式使用 Aurora PostgreSQL 相容來實作 RAG 的向量資料庫。作為 Aurora PostgreSQL 的替代方案， 為 RAG AWS 提供其他功能和服務，例如 Amazon Bedrock 知識庫和 Amazon OpenSearch Service。您可以選擇最符合您特定需求的解決方案：
+ [Amazon OpenSearch Service](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/what-is.html) 提供分散式搜尋和分析引擎，您可以用來存放和查詢大量資料。
+ [Amazon Bedrock 知識庫](https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html)旨在建置和部署知識庫作為額外的抽象概念，以簡化 RAG 擷取和擷取程序。Amazon Bedrock 知識庫可以同時使用 Aurora PostgreSQL 和 Amazon OpenSearch Service。

**部署到其他 AWS 區域**

如[架構](#deploy-rag-use-case-on-aws-architecture)中所述，我們建議您使用美國東部 （維吉尼亞北部） – `us-east-1`或美國西部 `us-west-1` （加利佛尼亞北部） – 部署此程式碼範例。不過，有兩種可能的方法來將此程式碼範例部署到 `us-east-1`和 以外的區域`us-west-1`。您可以在 `commons.tfvars` 檔案中設定部署區域。對於跨區域基礎模型存取，請考慮下列選項：
+ **周遊公有網際網路** – 如果流量可以周遊公有網際網路，請將網際網路閘道新增至 VPC。然後，調整指派給 Lambda 函數`data-ingestion-processor`和 SageMaker AI 筆記本執行個體的安全群組，以允許輸出流量到公有網際網路。
+ **不周遊公有網際網路** – 若要將此範例部署到 `us-east-1`或 以外的任何區域`us-west-1`，請執行下列動作：

1. 在 `us-east-1`或 `us-west-1`區域中，建立額外的 VPC，包括 的 VPC 端點`bedrock-runtime`。

1. 使用 [VPC 對等互連或傳輸閘道至應用程式 VPC 來建立對等互連](https://docs.aws.amazon.com/vpc/latest/peering/what-is-vpc-peering.html)。 [https://docs.aws.amazon.com/vpc/latest/tgw/tgw-peering.html](https://docs.aws.amazon.com/vpc/latest/tgw/tgw-peering.html)

1. 在 `bedrock-runtime` `us-east-1`或 之外的任何 Lambda 函數中設定 boto3 用戶端時`us-west-1`，請將 `bedrock-runtime` `us-east-1`或 us-west-1 中 VPC 端點的私有 DNS 名稱傳遞`endpoint_url`給 boto3 用戶端。