

# Use project data as a data source
<a name="data-source-project"></a>

You can configure an Amazon Bedrock knowledge base to use data sources that are already configured for your project.

**Topics**
+ [Project data sources](#data-source-project-data-sources)
+ [Create a knowledge base with a project data source](#data-source-project-procedure)

## Project data sources
<a name="data-source-project-data-sources"></a>

You can include the following data sources from your project:

### Amazon S3 bucket
<a name="data-source-project-s3"></a>

[Amazon S3](https://docs.aws.amazon.com/s3/) is an object storage service that stores data as objects within buckets. You can use files in your project's bucket as a data source for a knowledge base.
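
When you point a knowledge base at a file or folder in the bucket, you identify it with an Amazon S3 URI of the form `s3://bucket-name/key-or-prefix`. The following minimal sketch shows how such a URI splits into a bucket name and an object key or folder prefix. The helper name `split_s3_uri` and the bucket and file names are hypothetical and are used only for illustration.

```python
def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split an S3 URI into (bucket, key_or_prefix).

    The key is empty when the URI names only the bucket.
    """
    scheme = "s3://"
    if not uri.startswith(scheme):
        raise ValueError(f"Not an S3 URI: {uri!r}")
    # Everything up to the first slash is the bucket; the rest is the key or prefix.
    bucket, _, key = uri[len(scheme):].partition("/")
    if not bucket:
        raise ValueError(f"S3 URI has no bucket name: {uri!r}")
    return bucket, key


# Hypothetical bucket and object names, for illustration only.
bucket, key = split_s3_uri("s3://example-project-bucket/docs/handbook.pdf")
print(f"bucket={bucket} key={key}")
```

A URI that ends in a folder prefix, such as `s3://example-project-bucket/docs/`, splits the same way, with `docs/` returned as the prefix.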

### Amazon Redshift
<a name="data-source-project-redshift"></a>

[Amazon Redshift](https://docs.aws.amazon.com/redshift/) is a fully managed data warehouse service. Amazon Redshift Serverless automatically provisions and scales data warehouse capacity to deliver high performance for demanding and unpredictable workloads without the need to manage infrastructure.

You can include all data tables from an Amazon Redshift database or select up to 50 data tables from the available schemas. After selecting the tables, you can select the columns that you want to include. You can also preview data from the database, based on the selected columns.
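
If you plan the table selection before you open the console, a small check can catch a list that exceeds the 50-table limit. The following sketch is an illustration only and is not part of any AWS API. The function name `check_table_selection` and the table names are hypothetical, and `None` is an assumed way to represent "include all tables".

```python
from typing import Optional

MAX_SELECTED_TABLES = 50  # Limit on individually selected tables, as stated above.


def check_table_selection(tables: Optional[list[str]]) -> str:
    """Describe a planned selection, or raise if it exceeds the limit.

    Pass None to represent "include all tables" (an assumption of this sketch).
    """
    if tables is None:
        return "all tables"
    unique_tables = sorted(set(tables))  # Ignore accidental duplicates.
    if len(unique_tables) > MAX_SELECTED_TABLES:
        raise ValueError(
            f"{len(unique_tables)} tables selected; the limit is {MAX_SELECTED_TABLES}"
        )
    return f"{len(unique_tables)} selected tables"


# Hypothetical schema-qualified table names, for illustration only.
print(check_table_selection(["sales.orders", "sales.customers"]))
print(check_table_selection(None))
```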

### Lakehouse architecture
<a name="data-source-project-lakehouse"></a>

The [lakehouse architecture](https://docs.aws.amazon.com/sagemaker-lakehouse-architecture/latest/userguide/what-is-smlh.html) unifies your data across Amazon S3 data lakes and Amazon Redshift data warehouses.

## Create a knowledge base with a project data source
<a name="data-source-project-procedure"></a>

The following procedure shows how to create a knowledge base with an Amazon S3 bucket, an Amazon Redshift data warehouse, or the lakehouse architecture.

**To create a knowledge base with a project data source**

1. Navigate to the Amazon SageMaker Unified Studio landing page by using the URL from your administrator.

1. Access Amazon SageMaker Unified Studio using your IAM or single sign-on (SSO) credentials. For more information, see [Access Amazon SageMaker Unified Studio](getting-started-access-the-portal.md).

1. Choose the **Build** menu at the top of the page.

1. In the **MACHINE LEARNING & GENERATIVE AI** section, choose **My apps**.

1. In the **Select or create a new project to continue** dialog box, select the project that you want to use.

1. In the left pane, choose **Asset gallery**.

1. Choose **My components**.

1. In the **Components** section, choose **Create component** and then **Knowledge Base**. The **Create Knowledge Base** pane is shown.

1. For **Name**, enter a name for the Knowledge Base.

1. For **Description**, enter a description for the Knowledge Base.

1. For **Select data source type**, select **Project data sources**.

1. In **Select data source**, select an existing data source (**S3**, **Redshift**, or **Lakehouse**). Alternatively, choose to add a new connection.
   + **S3** – Do the following: 

     1. For **S3 URI**, enter the Amazon S3 Uniform Resource Identifier (URI) of the file or folder that you want to use. Alternatively, choose **Browse** to browse the bucket and choose a file or folder.

     1. Choose **Save** to save your changes.
   + **Redshift (Lakehouse)** – Do the following:

     1. For **Select a database**, select the database that you want to use.

     1. Choose **Update data tables and columns** to choose the tables and columns that you want to use. To preview the data from your selections, choose **Data**.

     1. Choose **Save** to save your changes.
   + **Lakehouse** – Do the following:

     1. For **Select catalog**, select the catalog that you want to use.

     1. For **Select a database**, select the database that you want to use.

     1. Choose **Update data tables and columns** to choose the tables and columns that you want to use. To preview the data from your selections, choose **Data**.

     1. Choose **Save** to save your changes.
   + (Optional) For Amazon Redshift and lakehouse architecture data sources, you can make the following configuration changes:
     + **Maximum query time** – Limit the time that a query can take by setting a maximum query time, in seconds.
     + **Descriptions** – Add descriptions and annotations to the names of tables and columns to improve the accuracy of responses from a chat agent app.
     + **Curated queries** – Use curated queries that help guide the agent to create better responses. A curated query is an example question along with the matching SQL query for the question.

1. Choose **Create** to create the Knowledge Base.

1. Use the Knowledge Base in an app by doing one of the following:
   + If your app is a chat agent app, follow the steps in [Add an Amazon Bedrock Knowledge Base component to a chat agent app](add-kb-component-chat-app.md).
   + If your app is a flow app, follow the steps in [Add a Knowledge Base component to a flow app](add-kb-component-prompt-flow-app.md).
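
The curated queries option in the procedure pairs an example question with the SQL query that answers it. The following sketch shows one way to draft such pairs before you enter them in the console. It is an illustration only and does not call any AWS API. The `CuratedQuery` class, the `sales.orders` table, and its columns are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CuratedQuery:
    """An example question together with the SQL query that answers it."""

    question: str
    sql: str


# Hypothetical table and column names, for illustration only.
curated_queries = [
    CuratedQuery(
        question="How many orders were placed in 2024?",
        sql=(
            "SELECT COUNT(*) FROM sales.orders "
            "WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01';"
        ),
    ),
    CuratedQuery(
        question="Which five customers spent the most?",
        sql=(
            "SELECT customer_id, SUM(total) AS spend FROM sales.orders "
            "GROUP BY customer_id ORDER BY spend DESC LIMIT 5;"
        ),
    ),
]

for query in curated_queries:
    print(f"Q: {query.question}\nSQL: {query.sql}\n")
```

Drafting the pairs this way makes it easier to review that each question matches its query before you add it to the data source configuration.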