

# Creating a configured table in AWS Clean Rooms
<a name="create-configured-table"></a>

A *configured table* is a reference to an existing table in a data source. It contains an analysis rule that determines how the data can be queried in AWS Clean Rooms. Configured tables can be associated to one or more collaborations.

For information about how to create a configured table using the AWS SDKs, see the [AWS Clean Rooms API Reference](https://docs.aws.amazon.com/clean-rooms/latest/apireference/Welcome.html).
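As a minimal sketch of the SDK path, the console flow described in this section corresponds to a single `CreateConfiguredTable` call with boto3 (the `cleanrooms` client). The table, database, and column names below are placeholders.

```python
# Sketch: create a configured table from an AWS Glue table with boto3.
# All names and column lists below are placeholders.
def build_request(name, database, table, columns):
    """Assemble the CreateConfiguredTable payload for an AWS Glue table."""
    return {
        "name": name,
        "tableReference": {"glue": {"databaseName": database, "tableName": table}},
        "allowedColumns": columns,          # columns that can be queried in collaborations
        "analysisMethod": "DIRECT_QUERY",   # or "DIRECT_JOB" for PySpark jobs
    }

if __name__ == "__main__":
    import boto3  # requires AWS credentials and permissions when run
    client = boto3.client("cleanrooms")
    response = client.create_configured_table(
        **build_request("my-configured-table", "my_glue_db", "my_table",
                        ["hashed_email", "segment"]))
    print(response["configuredTable"]["id"])
```

After this call, the configured table still needs an analysis rule and a collaboration association, as described later in this section.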

**Topics**
+ [Creating a configured table – Amazon S3 data source](create-config-table-s3.md)
+ [Creating a configured table – Amazon Athena data source](create-config-table-athena.md)
+ [Creating a configured table – Snowflake data source](create-config-table-snowflake.md)

# Creating a configured table – Amazon S3 data source
<a name="create-config-table-s3"></a>

In this procedure, the [member](glossary.md#glossary-member) does the following tasks: 
+  Configures an existing AWS Glue table for use in AWS Clean Rooms. (This step can be done before or after joining a collaboration, unless using Cryptographic Computing for Clean Rooms.)
**Note**  
AWS Clean Rooms supports AWS Glue tables. For more information about getting your data into AWS Glue, see [Step 3: Upload your data table to Amazon S3](prepare-data-S3.md#upload-to-s3). 
+ Names the [configured table](glossary.md#glossary-configured-table) and chooses which columns to use in the collaboration.

The following procedure assumes that:
+ The collaboration member has already [uploaded their data tables to Amazon S3](prepare-data-S3.md#upload-to-s3) and [created an AWS Glue table](prepare-data-S3.md#create-glue-crawler).
**Note**  
The **Results destination in Amazon S3** can't be within the same S3 bucket as any data source.
+ (Optional) For [encrypted](glossary.md#glossary-encryption) data tables only, the collaboration member has already [prepared encrypted data tables](prepare-encrypted-data.md) using the C3R encryption client.

You can use the statistic generation provided by AWS Glue to compute column-level statistics for AWS Glue Data Catalog tables. After AWS Glue generates statistics for tables in the Data Catalog, Amazon Redshift Spectrum automatically uses those statistics to optimize the query plan. For more information about computing column-level statistics using AWS Glue, see [Optimizing query performance using column statistics](https://docs.aws.amazon.com/glue/latest/dg/column-statistics.html) in the *AWS Glue User Guide*. For more information about AWS Glue, see the *[AWS Glue Developer Guide](https://docs.aws.amazon.com/glue/latest/dg/what-is-glue.html)*.
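The statistics generation described above can also be started programmatically with the AWS Glue `StartColumnStatisticsTaskRun` API. A hedged sketch follows; the database, table, and role ARN are placeholders.

```python
# Sketch: start column-statistics generation for a Data Catalog table.
# Database, table, and role values are placeholders.
def build_stats_request(database, table, role_arn, sample_pct=100.0):
    """Assemble the StartColumnStatisticsTaskRun parameters."""
    return {
        "DatabaseName": database,
        "TableName": table,
        "Role": role_arn,          # IAM role that AWS Glue assumes to read the table
        "SampleSize": sample_pct,  # percentage of rows to sample
    }

if __name__ == "__main__":
    import boto3  # requires AWS credentials and permissions when run
    glue = boto3.client("glue")
    run = glue.start_column_statistics_task_run(
        **build_stats_request("my_glue_db", "my_table",
                              "arn:aws:iam::111122223333:role/GlueStatsRole"))
    print(run["ColumnStatisticsTaskRunId"])
```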

**To create a configured table – Amazon S3 data source**

1. Sign in to the AWS Management Console and open the AWS Clean Rooms console at [https://console.aws.amazon.com/cleanrooms](https://console.aws.amazon.com/cleanrooms/home).

1. In the left navigation pane, choose **Tables**.

1. In the upper right corner, choose **Configure new table**.

1. For **Data source**, under **AWS data sources**, choose **Amazon S3**. 

1. Under **Amazon S3 table**: 

   1. Select the **Region** where the S3 table is hosted.

      By default, the current Region (for example, US East (N. Virginia) us-east-1) is selected. 
**Warning**  
When your Amazon S3 data source is in a different Region than your processing location, data processing may occur temporarily outside the source Region. Before proceeding, verify that cross-Region data movement complies with your data sovereignty requirements, regulatory compliance policies, and data governance standards. 

      For more information about Regions, see [Regions and Endpoints](https://docs.aws.amazon.com/general/latest/gr/rande.html) in the *AWS General Reference*. 

   1. Choose the **Database** from the dropdown list.

   1. Choose the **Table** that you want to configure from the dropdown list.
**Note**  
To verify that this is the correct table, do either of the following:  
Choose **View in AWS Glue**.
Turn on **View schema from AWS Glue** to view the schema.
**Important**  
For AWS Glue tables where the data is in CSV format, the column names and order in the Glue schema must exactly match the CSV data. If they don't align, the allowed columns list for the configured table might not be enforced properly.
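One way to catch the mismatch described above early is to compare the CSV header row against the Glue schema before configuring the table. A small sketch follows; the Glue lookup under `__main__` uses placeholder names.

```python
# Sketch: check that a CSV header matches the Glue schema, column for column.
import csv
import io

def headers_match(csv_text, glue_columns):
    """Return True when the CSV header equals the Glue column list, in order."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return [h.strip().lower() for h in header] == [c.lower() for c in glue_columns]

if __name__ == "__main__":
    import boto3  # requires AWS credentials and permissions when run
    glue = boto3.client("glue")
    cols = [c["Name"] for c in glue.get_table(
        DatabaseName="my_glue_db", Name="my_table"
    )["Table"]["StorageDescriptor"]["Columns"]]
    print(headers_match("user_id,segment,spend\n", cols))
```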

1. For **Columns and analysis methods allowed in collaborations**, 

   1. For **Which columns do you want to allow in collaborations?**
      + Choose **All columns** to allow all columns to be queried in the collaboration.
      + Choose **Custom list** to allow one or more columns from the **Specify allowed columns** dropdown list to be queried in the collaboration.

   1. For **Allowed analysis methods**,

      1. Choose **Direct query** to allow SQL queries to be run directly on this table.

      1. Choose **Direct job** to allow PySpark jobs to be run directly on this table.  
**Example**  

   For example, if you want to allow collaboration members to run both direct SQL queries and PySpark jobs on all columns, then choose **All columns**, **Direct query**, and **Direct job**.

1. For **Configured table details**, 

   1. Enter a **Name** for the configured table.

      You can use the default name or rename this table.

   1. Enter a **Description** of the table. 

      The description helps you differentiate this table from other configured tables with similar names.

1. If you want to enable **Tags** for the configured table resource, choose **Add new tag** and then enter the **Key** and **Value** pair. 

1. Choose **Configure new table**. 

Now that you have created a configured table, you are ready to: 
+ [Add an analysis rule to the configured table](add-analysis-rule.md)
+ [Associate the configured table to a collaboration](associate-configured-table.md)

# Creating a configured table – Amazon Athena data source
<a name="create-config-table-athena"></a>

The Amazon Athena data source option allows you to query data that is stored in Amazon S3, cataloged in the AWS Glue Data Catalog or in federated catalogs, and access controlled through AWS Lake Formation. Both tables and AWS Glue Data Catalog views are supported. You can use Lake Formation resource links to share tables and views across AWS accounts and AWS Regions with the AWS Clean Rooms member account that associates them with an AWS Clean Rooms collaboration. 

**Note**  
Only Amazon S3-based datasets can be queried via the Athena data source integration.

In this procedure, the [member](glossary.md#glossary-member) does the following tasks: 
+ Configures an existing table or view in the AWS Glue Data Catalog for use in AWS Clean Rooms.
+ Names the [configured table](glossary.md#glossary-configured-table) and chooses which columns to use in the collaboration.

The following procedure assumes that:
+ The collaboration member has already created the AWS Glue Data Catalog database and table or AWS Glue Data Catalog view. 
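For the SDK path, the same configuration can be sketched with boto3 by pointing `tableReference` at Athena instead of AWS Glue. The workgroup, database, table, and column names below are placeholders.

```python
# Sketch: create a configured table from an Amazon Athena table with boto3.
# Workgroup, database, table, and column names are placeholders.
def build_athena_request(name, workgroup, database, table, columns):
    """Assemble the CreateConfiguredTable payload for an Amazon Athena table."""
    return {
        "name": name,
        "tableReference": {"athena": {
            "workGroup": workgroup,
            "databaseName": database,
            "tableName": table,
        }},
        "allowedColumns": columns,
        "analysisMethod": "DIRECT_QUERY",
    }

if __name__ == "__main__":
    import boto3  # requires AWS credentials and permissions when run
    client = boto3.client("cleanrooms")
    response = client.create_configured_table(**build_athena_request(
        "my-athena-table", "primary", "my_db", "my_view", ["user_id", "segment"]))
    print(response["configuredTable"]["id"])
```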

**To create a configured table – Athena data source**

1. Sign in to the AWS Management Console and open the AWS Clean Rooms console at [https://console.aws.amazon.com/cleanrooms](https://console.aws.amazon.com/cleanrooms/home).

1. In the left navigation pane, choose **Tables**.

1. In the upper right corner, choose **Configure new table**.

1. For **Data source**, under **AWS data sources**, choose **Amazon Athena**. 

1. Under **Amazon Athena table**: 

   1. Select the **Region** where the Amazon Athena table is hosted.

      By default, the current Region (for example, US East (N. Virginia) us-east-1) is selected. 
**Warning**  
When your Amazon Athena data source is in a different Region than your processing location, data processing may occur temporarily outside the source Region. Before proceeding, verify that cross-Region data movement complies with your data sovereignty requirements, regulatory compliance policies, and data governance standards. 

      For more information about Regions, see [Regions and Endpoints](https://docs.aws.amazon.com/general/latest/gr/rande.html) in the *AWS General Reference*. 

   1. Choose the **Catalog** from the dropdown list.

      By default, **AWS Glue Data Catalog** is selected.
      + **AWS Glue Data Catalog** – The default catalog for tables in AWS Glue.
      + **Federated catalog** – Available if you've configured AWS Glue Catalog Federation to connect to remote Apache Iceberg REST catalogs. For more information, see [Catalog federation](https://docs.aws.amazon.com/lake-formation/latest/dg/catalog-federation.html) in the *AWS Lake Formation Developer Guide*.

   1. Choose the **Database** from the dropdown list.

   1. Choose the **Table** that you want to configure from the dropdown list.
**Note**  
To verify that this is the correct table, do either of the following:  
Choose **View in AWS Glue** or **View in AWS Lake Formation** (depending on your catalog type).
Turn on **View schema from AWS Glue** to view the schema.

1. For **Amazon Athena configurations**,

   1. Choose a **Workgroup** from the dropdown list.

   1. For **S3 output location**, choose a recommended action, based on one of the following scenarios.    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/clean-rooms/latest/userguide/create-config-table-athena.html)

1. For **Columns allowed in collaborations**, choose an option based on your goal.     
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/clean-rooms/latest/userguide/create-config-table-athena.html)

1. For **Configured table details**, 

   1. Enter a **Name** for the configured table.

      You can use the default name or rename this table.

   1. Enter a **Description** of the table. 

      The description helps you differentiate this table from other configured tables with similar names.

   1. If you want to enable **Tags** for the configured table resource, choose **Add new tag** and then enter the **Key** and **Value** pair. 

1. Choose **Configure new table**. 

Now that you have created a configured table, you are ready to: 
+ [Add an analysis rule to the configured table](add-analysis-rule.md)
+ [Associate the configured table to a collaboration](associate-configured-table.md)

# Creating a configured table – Snowflake data source
<a name="create-config-table-snowflake"></a>

In this procedure, the [member](glossary.md#glossary-member) does the following tasks: 
+ Configures an existing Snowflake table for use in AWS Clean Rooms. (This step can be done before or after joining a collaboration, unless using Cryptographic Computing for Clean Rooms.)
+ Names the [configured table](glossary.md#glossary-configured-table) and chooses which columns to use in the collaboration.

The following procedure assumes that:
+ The collaboration member has already uploaded their data tables to Snowflake.
+ (Optional) For [encrypted](glossary.md#glossary-encryption) data tables only, the collaboration member has already [prepared encrypted data tables](prepare-encrypted-data.md) using the C3R encryption client.

**To create a configured table – Snowflake data source**

1. Sign in to the AWS Management Console and open the AWS Clean Rooms console at [https://console.aws.amazon.com/cleanrooms](https://console.aws.amazon.com/cleanrooms/home).

1. In the left navigation pane, choose **Tables**.

1. In the upper right corner, choose **Configure new table**.

1. For **Data source**, under **Third-party clouds and data sources**, choose **Snowflake**. 

1. Specify the **Snowflake credentials** by using an existing secret ARN or by storing a new secret for this table.

------
#### [ Use existing secret ARN ]

   1. If you have a secret ARN, enter it in the **Secret ARN** field. 

      You can look up your secret ARN by choosing **Go to AWS Secrets Manager**.

   1. If you have an existing secret from another table, choose **Import secret ARN from existing table**. 

**Note**  
The secret ARN can be cross-account. 

------
#### [ Store a new secret for this table ]

   1. Enter the following Snowflake credentials:
      + **Snowflake username**
      + **Snowflake warehouse**
      + **Snowflake role**
      + **Snowflake Privacy Enhanced Mail (PEM) private key** 

   1. For encryption, do one of the following:
      + To use the AWS managed key (default), leave the **Customize encryption settings** checkbox cleared. 
      + To use a custom AWS KMS key:
        + Select the **Customize encryption settings** checkbox.
        + For **KMS key**, enter the key ARN or choose one from the list.

   1. Enter a **Secret name** to help you find your credentials later.

------
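The "store a new secret" step above can also be sketched with the AWS Secrets Manager `CreateSecret` API so the resulting secret ARN can be reused for other configured tables. The JSON field names in the secret body below are illustrative placeholders, not a documented Clean Rooms contract.

```python
# Sketch: store Snowflake credentials in AWS Secrets Manager.
# The JSON field names are illustrative, not a documented contract.
import json

def build_secret_request(name, user, warehouse, role, pem_key, kms_key_id=None):
    """Assemble a Secrets Manager CreateSecret request for Snowflake credentials."""
    req = {
        "Name": name,
        "SecretString": json.dumps({
            "snowflakeUser": user,            # illustrative field name
            "snowflakeWarehouse": warehouse,  # illustrative field name
            "snowflakeRole": role,            # illustrative field name
            "snowflakePrivateKey": pem_key,   # PEM-encoded private key
        }),
    }
    if kms_key_id:  # custom KMS key; otherwise the AWS managed key is used
        req["KmsKeyId"] = kms_key_id
    return req

if __name__ == "__main__":
    import boto3  # requires AWS credentials and permissions when run
    sm = boto3.client("secretsmanager")
    arn = sm.create_secret(**build_secret_request(
        "cleanrooms/snowflake", "ANALYTICS_USER", "ANALYTICS_WH", "ANALYST",
        "-----BEGIN PRIVATE KEY-----\n(placeholder)\n-----END PRIVATE KEY-----"))["ARN"]
    print(arn)
```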

1. For **Snowflake table and schema details**, enter the details manually or automatically import the details.

------
#### [ Enter the details manually ]

   1. Enter the **Snowflake account identifier**.

      For more information, see [Account identifiers](https://docs.snowflake.com/en/user-guide/admin-account-identifier#finding-the-organization-and-account-name-for-an-account) in the Snowflake documentation. 

      Your account identifier must be in the format used for Snowflake drivers: replace the period (.) with a hyphen (-) so that the identifier is formatted as **<orgname>-<account\_name>** (for example, `myorg.myaccount` becomes `myorg-myaccount`).

   1. Enter the **Snowflake database**.

      For more information, see [Snowflake database](https://docs.snowflake.com/en/sql-reference/snowflake-db) in the Snowflake documentation.

   1. Enter the **Snowflake schema name**.

   1. Enter the **Snowflake table name**.

      For more information, see [Understanding Snowflake Table Structures](https://docs.snowflake.com/en/user-guide/tables-micro-partitions) in the Snowflake documentation. 

   1. For the **Schema**, enter the **Column name** and choose the **Data type** from the dropdown list. 

   1. Choose **Add column** to add more columns.
      +  If you choose an **Object data type**, specify the **Object schema**.   
**Example object schema**  

        ```
        name STRING,
        location OBJECT(
            x INT, 
            y INT, 
            metadata OBJECT(uuid STRING)
        ),
        history ARRAY(TEXT)
        ```
      + If you choose an **Array data type**, specify the **Array schema**.  
**Example array schema**  

        ```
        OBJECT(x INT, y INT)
        ```
      + If you choose a **Map data type**, specify the **Map schema**.  
**Example map schema**  

        ```
        STRING, OBJECT(x INT, y INT)
        ```

------
#### [ Automatically import the details ]

   1. Export your COLUMNS view from Snowflake as a CSV file.

      For more information about the Snowflake COLUMNS view, see [COLUMNS view](https://docs.snowflake.com/en/sql-reference/info-schema/columns) in the Snowflake documentation.

   1. Choose **Import from file** to import the CSV file and specify any additional information. 

      The database name, schema name, table name, column names, and data types are automatically imported.
      +  If you choose an **Object data type**, specify the **Object schema**. 
      + If you choose an **Array data type**, specify the **Array schema**.
      + If you choose a **Map data type**, specify the **Map schema**.

   1. Enter the **Snowflake account identifier**.

      For more information, see [Account identifiers](https://docs.snowflake.com/en/user-guide/admin-account-identifier#finding-the-organization-and-account-name-for-an-account) in the Snowflake documentation. 

**Note**  
 Only S3 tables cataloged in AWS Glue can be used to retrieve the table schema automatically.

------
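The table and schema details entered above map onto the Snowflake `tableReference` shape in the boto3 `CreateConfiguredTable` call. A hedged sketch follows; the secret ARN, account identifier, and table details are placeholders.

```python
# Sketch: create a configured table from a Snowflake table with boto3.
# Secret ARN, account identifier, and table details are placeholders.
def build_snowflake_request(name, secret_arn, account_id, database, schema,
                            table, columns):
    """Assemble the CreateConfiguredTable payload for a Snowflake table."""
    return {
        "name": name,
        "tableReference": {"snowflake": {
            "secretArn": secret_arn,
            "accountIdentifier": account_id,   # <orgname>-<account_name> format
            "databaseName": database,
            "schemaName": schema,
            "tableName": table,
            "tableSchema": {"v1": [
                {"columnName": c, "columnType": t} for c, t in columns
            ]},
        }},
        "allowedColumns": [c for c, _ in columns],
        "analysisMethod": "DIRECT_QUERY",
    }

if __name__ == "__main__":
    import boto3  # requires AWS credentials and permissions when run
    client = boto3.client("cleanrooms")
    response = client.create_configured_table(**build_snowflake_request(
        "my-snowflake-table",
        "arn:aws:secretsmanager:us-east-1:111122223333:secret:cleanrooms/snowflake",
        "myorg-myaccount", "MY_DB", "PUBLIC", "CUSTOMERS",
        [("USER_ID", "VARCHAR"), ("SPEND", "NUMBER")]))
    print(response["configuredTable"]["id"])
```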

1. For **Columns allowed in collaborations**, choose an option based on your goal.     
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/clean-rooms/latest/userguide/create-config-table-snowflake.html)

1. For **Configured table details**, 

   1. Enter a **Name** for the configured table.

      You can use the default name or rename this table.

   1. Enter a **Description** of the table. 

      The description helps you differentiate this table from other configured tables with similar names.

   1. If you want to enable **Tags** for the configured table resource, choose **Add new tag** and then enter the **Key** and **Value** pair. 

1. Choose **Configure new table**. 

Now that you have created a configured table, you are ready to: 
+ [Add an analysis rule to the configured table](add-analysis-rule.md)
+ [Associate the configured table to a collaboration](associate-configured-table.md)