

# Analysis rules in AWS Clean Rooms
<a name="analysis-rules"></a>

As part of enabling a table to use in AWS Clean Rooms for collaboration analysis, the collaboration member must configure an *analysis rule*.

An analysis rule is a privacy-enhancing control that each data owner sets up on a configured table. An analysis rule determines how the configured table can be analyzed.

The analysis rule is an account-level control on the configured table (an account-level resource) and is enforced in any collaboration where the configured table is associated. If there is no analysis rule configured, the configured table can be associated to collaborations but it can’t be queried. Queries can only reference configured tables with the same analysis rule type. 

To configure an analysis rule, you first select a type of analysis and then specify the analysis rule. For both steps, you should consider the use case you want to enable and how you want to protect your underlying data. 

AWS Clean Rooms enforces the more restrictive controls across all configured tables referenced in a query. 

The following examples illustrate the restrictive controls.

**Example Restrictive control: Output constraint**  
+ Collaborator A has an output constraint on the identiﬁer column of 100. 
+ Collaborator B has an output constraint on the identiﬁer column of 150. 

  An aggregation query that references both conﬁgured tables requires at least 150 distinct values of identiﬁer within an output row for it to be displayed in the query output. The query output doesn't indicate that results are removed because of the output constraint. 

**Example Restrictive control: Analysis template not approved**  
+ Collaborator A has allowed an analysis template with a query that references configured tables from Collaborator A and Collaborator B in their custom analysis rule. 
+ Collaborator B hasn't allowed the analysis template. 

  Because Collaborator B hasn't allowed the analysis template, the member who can query can’t run that analysis template. 

## Analysis rule types
<a name="summary-table"></a>

There are three types of analysis rules: [aggregation](analysis-rules-aggregation.md), [list](analysis-rules-list.md) and [custom](analysis-rules-custom.md). The following tables compare the analysis rule types. Each type has a separate section that describes specifying the analysis rule.

**Note**  
There is an analysis rule type called the ID mapping table analysis rule. However, this analysis rule is managed by AWS Clean Rooms and can’t be modified. For more information, see [ID mapping table analysis rule](analysis-rules-id-mapping-table.md).

The following sections describe supported use cases and controls for each analysis rule type.

### Supported use cases
<a name="supported-use-cases"></a>

The following tables show a comparison summary of the supported use cases for each analysis rule type.


| Use case | [Aggregation](analysis-rules-aggregation.md) | [List](analysis-rules-list.md) | [Custom](analysis-rules-custom.md) | 
| --- | --- | --- | --- | 
| Supported analyses  | Queries that aggregate statistics using COUNT, SUM, and AVG functions along optional dimensions  | Queries that output row-level lists of the overlap between multiple tables  | Any custom analysis as long as the analysis template or the analysis creator have been reviewed and allowed  | 
| Common use cases  | Segment analysis, measurement, attribution  | Enrichment, segment building  | First-touch attribution, incremental analyses, audience discovery  | 
| SQL constructs  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/clean-rooms/latest/userguide/analysis-rules.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/clean-rooms/latest/userguide/analysis-rules.html)  | Majority of SQL functions and SQL constructs available with the SELECT command | 
| Subqueries and common table expressions (CTEs)  | No | No | Yes | 
| Analysis templates  | No | No | Yes | 

### Supported controls
<a name="supported-controls"></a>

The following tables show a comparison summary of how each analysis rule type protects your underlying data.


| Control | [Aggregation](analysis-rules-aggregation.md) | [List](analysis-rules-list.md) | [Custom](analysis-rules-custom.md) | 
| --- | --- | --- | --- | 
| Control mechanism  | Control how data in the table can be used in a query*(For example, allow COUNT and SUM of column hashed\$1email.)* | Control how data in the table can be used in a query*(For example, allow use of column hashed\$1email only for joining.)* | Control what queries are allowed to run on the table*(For example, allow only queries defined in analysis templates "Custom query 1".)* | 
| Built-in privacy enhancing techniques  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/clean-rooms/latest/userguide/analysis-rules.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/clean-rooms/latest/userguide/analysis-rules.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/clean-rooms/latest/userguide/analysis-rules.html)  | 
| Review query before it can be run  | No | No | Yes, using analysis templates | 

For more information about the analysis rules that are available in AWS Clean Rooms, see the following topics. 
+ [Aggregation analysis rule](analysis-rules-aggregation.md)
+ [List analysis rule](analysis-rules-list.md)
+ [Custom analysis rule in AWS Clean Rooms](analysis-rules-custom.md)

# Aggregation analysis rule
<a name="analysis-rules-aggregation"></a>

In AWS Clean Rooms, an *aggregation analysis rule* generates aggregate statistics using COUNT, SUM, and/or AVG functions along optional dimensions. When the aggregation analysis rule is added to a configured table, it enables the member who can query to run queries on the configured table.

The aggregation analysis rule supports uses cases such as campaign planning, media reach, frequency measurement, and attribution. 

The supported query structure and syntax are defined in [Aggregation query structure and syntax](#agg-query-structure-syntax).

The parameters of the analysis rule, defined in [Aggregation analysis rule - query controls](#agg-query-controls), include query controls and query results controls. Its query controls include the ability to require that a configured table is joined to at least one configured table owned by the member who can query, either directly or transitively. This requirement allows you to ensure that the query is run on the intersection (INNER JOIN) of your table and theirs.

## Aggregation query structure and syntax
<a name="agg-query-structure-syntax"></a>

Queries on tables that have an aggregation analysis rule must adhere to the following syntax.

```
--select_aggregate_function_expression
SELECT 
aggregation_function(column_name) [[AS] column_alias ] [, ...]

 --select_grouping_column_expression                        
  [, {column_name|scalar_function(arguments)} [[AS] column_alias ]][, ...]   

--table_expression
FROM table_name [[AS] table_alias ]
  [[INNER] JOIN table_name [[AS] table_alias] ON join_condition] [...]

--where_expression
[WHERE where_condition]          

--group_by_expression                          
[GROUP BY {column_name|scalar_function(arguments)}, ...]]                  

--having_expression
[HAVING having_condition]                               

--order_by_expression    
[ORDER BY {column_name|scalar_function(arguments)} [{ASC|DESC}]] [,...]]
```

The following table explains each expression listed in the preceding syntax.


| Expression | Definition | Examples | 
| --- | --- | --- | 
| select\$1aggregate\$1function\$1expression |  A comma-separated list containing the following expressions: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/clean-rooms/latest/userguide/analysis-rules-aggregation.html)  There must be at least one `select_aggregation_function_expression` in the `select_aggregate_expression`.    |  `SELECT SUM(PRICE), user_segment`  | 
| select\$1aggregation\$1function\$1expression |  One or more supported aggregation functions applied to one or more columns. Only columns are allowed as arguments of aggregation functions.  There must be at least one `select_aggregation_function_expression` in the `select_aggregate_expression`.    |  `AVG(PRICE)` `COUNT(DISTINCT user_id)`  | 
| select\$1grouping\$1column\$1expression |  An expression that can contain any expression using the following: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/clean-rooms/latest/userguide/analysis-rules-aggregation.html)  `select_aggregate_expression` can alias columns with or without the `AS` parameter. For more information, see the [AWS Clean Rooms SQL Reference](https://docs.aws.amazon.com/clean-rooms/latest/sql-reference/sql-reference.html).   |  `TRUNC(timestampColumn)`  `UPPER(campaignName)`   | 
| table\$1expression |  A table, or join of tables, connecting join conditional expressions with `join_condition`. `join_condition` returns a Boolean.  The `table_expression` supports: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/clean-rooms/latest/userguide/analysis-rules-aggregation.html)  |  <pre>FROM consumer_table <br />INNER JOIN provider_table<br />ON<br />consumer_table.identifier1 = provider_table.identifier1<br />AND<br />consumer_table.identifier2 = provider_table.identifier2</pre>  | 
| where\$1expression |  A conditional expression that returns a Boolean. It may be comprised of the following: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/clean-rooms/latest/userguide/analysis-rules-aggregation.html) Supported comparison conditions are (`=, >, <, <=, >=, <>, !=, NOT, IN, NOT IN, LIKE, IS NULL, IS NOT NULL`).  Supported logical operators are (`AND, OR`). The `where_expression` is optional.  |  `WHERE where_condition` `WHERE price > 100`  `WHERE TRUNC(timestampColumn) = '1/1/2022'`  `WHERE timestampColumn = timestampColumn2 - 14`   | 
| group\$1by\$1expression |  A comma-separated list of expressions that match the requirements for the `select_grouping_column_expression`.   |  `GROUP BY TRUNC(timestampColumn), UPPER(campaignName), segment`  | 
| having\$1expression |  A conditional expression that returns an Boolean. They have a supported aggregation function applied to a single column (for example, `SUM(price)`) and are compared to a numerical literal. Supported conditions are (`=, >, <, <=, >=, <>, !=`).  Supported logical operators are (`AND, OR`). The `having_expression` is optional.  |  `HAVING SUM(SALES) > 500`  | 
| order\$1by\$1expression |  A comma-separated list of expressions that is compatible with the same requirements defined in `select_aggregate_expression` defined earlier.  The `order_by_expression` is optional.  `order_by_expression` permits `ASC` and `DESC` parameters. For more information, see ASC DESC parameters in the [AWS Clean Rooms SQL Reference](https://docs.aws.amazon.com/clean-rooms/latest/sql-reference/sql-reference.html).   |  `ORDER BY SUM(SALES), UPPER(campaignName)`  | 

For aggregation query structure and syntax, be aware of the following:
+ SQL commands other than SELECT are not supported.
+ Sub-queries and common table expressions (for example, WITH) are not supported.
+ Operators that combine multiple queries (for example, UNION) are not supported. 
+ TOP, LIMIT, and OFFSET parameters are not supported.

## Aggregation analysis rule - query controls
<a name="agg-query-controls"></a>

With aggregation query controls, you can control how the columns in your table are used to query the table. For example, you can control which column is used for joining, which column can be counted, or which column can be used in WHERE statements.

The following sections explain each control.

**Topics**
+ [Aggregation controls](#agg-functions)
+ [Join controls](#join-controls)
+ [Dimension controls](#dimension-controls)
+ [Scalar functions](#scalar-functions)

### Aggregation controls
<a name="agg-functions"></a>

By using *aggregation controls*, you can define which aggregation functions to allow, and what columns they must to be applied to. Aggregation functions can be used in the SELECT, HAVING, and ORDER BY expressions.


| Control | Definition | Usage | 
| --- | --- | --- | 
| aggregateColumns | Columns of configured table columns you allow for use within aggregation functions. |  `aggregateColumns` can be used inside an aggregation function in the SELECT, HAVING, and ORDER BY expressions. Some `aggregateColumns` can also be categorized as a `joinColumn` (defined later). Given `aggregateColumn` can't also be categorized as a `dimensionColumn` (defined later).  | 
| function | The COUNT, SUM, and AVG functions you allow for use on top of the aggregateColumns. |  `function` can be applied to an `aggregateColumns` that is associated to it.   | 

### Join controls
<a name="join-controls"></a>

A `JOIN` clause is used to combine rows from two or more tables, based on a related column between them.

You can use *Join controls* to control how your table can be joined to other tables in the `table_expression`. AWS Clean Rooms only supports INNER JOIN. INNER JOIN statements can only use columns that have been explicitly categorized as a `joinColumn` in your analysis rule, subject to the controls that you define. 

The INNER JOIN must operate on a `joinColumn` from your configured table and a `joinColumn` from another configured table in the collaboration. You decide which columns from your table can be used as `joinColumn`.

Each match condition within the ON clause is required to use the equality comparison condition (`=`) between two columns. 

Multiple match conditions within an ON clauses can be: 
+ Combined using the `AND` logical operator
+ Separated using the `OR` logical operator

**Note**  
All JOIN match conditions must match one row from each side of the JOIN. All conditionals connected by an `OR` or an `AND` logical operator must adhere to this requirement as well.

The following is an example of a query with an `AND` logical operator.

```
SELECT some_col, other_col 
FROM table1 
    JOIN table2 
    ON table1.id = table2.id AND table1.name = table2.name
```

The following is an example of a query with an `OR` logical operator.

```
SELECT some_col, other_col 
FROM table1 
    JOIN table2 
    ON table1.id = table2.id OR table1.name = table2.name
```


| Control | Definition | Usage | 
| --- | --- | --- | 
| joinColumns | The columns (if any) that you want to allow the member who can query to use in the INNER JOIN statement. |  A specific `joinColumn` can also be categorized as a `aggregateColumn` (see [Aggregation controls](#agg-functions)). The same column can't be used both as `joinColumn` and `dimensionColumns` (see later). Unless it has also been categorized as an `aggregateColumn`, a `joinColumn` can't be used in any other parts of the query other than the INNER JOIN.  | 
| joinRequired | Control whether you require an INNER JOIN with a configured table from the member who can query.  |  If you enable this parameter, an INNER JOIN is required. If you don't enable this parameter, an INNER JOIN is optional. Assuming you enable this parameter, the member who can query is required to include a table they own in the INNER JOIN. They must JOIN your table with theirs, either directly or transitively (that is, join their table to another table, which itself is joined to your table).  | 

Following is an example of transitivity.

```
ON 
my_table.identifer = third_party_table.identifier
....
ON
third_party_table.identifier = member_who_can_query_table.id
```

**Note**  
The member who can query can also use the `joinRequired` parameter. In that case, the query must join their table with at least one other table. 

### Dimension controls
<a name="dimension-controls"></a>

*Dimension controls* control the column along which the aggregation columns can be filtered, grouped, or aggregated.


| Control | Definition | Usage | 
| --- | --- | --- | 
| dimensionColumns |  The columns (if any) that you allow the member who can query to use in SELECT, WHERE, GROUP BY, and ORDER BY.  |  A `dimensionColumn` can be used in SELECT (`select_grouping_column_expression`), WHERE, GROUP BY, and ORDER BY. The same column can't be both a `dimensionColumn`, a `joinColumn`, and/or an `aggregateColumn`.  | 

### Scalar functions
<a name="scalar-functions"></a>

*Scalar functions* control which scalar functions can be used on dimension columns.


| Control | Definition | Usage | 
| --- | --- | --- | 
| scalarFunctions |  The scalar functions that can be used on `dimensionColumns` in the query.  |  Specifies the scalar functions (if any) that you allow (for example, CAST) to be applied on `dimensionColumns`.  Scalar functions can't be used on top of other functions or within other functions. Arguments of scalar functions can be columns, string literals, or numeric literals.  | 

The following scalar functions are supported:
+ Math functions – ABS, CEILING, FLOOR, LOG, LN, ROUND, SQRT
+ Data type formatting functions – CAST, CONVERT, TO\$1CHAR, TO\$1DATE, TO\$1NUMBER, TO\$1TIMESTAMP
+ String functions – LOWER, UPPER, TRIM, RTRIM, SUBSTRING
  + For RTRIM, custom character sets to trim aren't allowed. 
+ Conditional expressions – COALESCE
+ Date functions – EXTRACT, GETDATE, CURRENT\$1DATE, DATEADD
+ Other functions – TRUNC

For more details, see the [AWS Clean Rooms SQL Reference](https://docs.aws.amazon.com/clean-rooms/latest/sql-reference/sql-reference.html).

## Aggregation analysis rule - query results controls
<a name="agg-query-results-controls"></a>

With the aggregation query results controls, you can control which results are returned by specifying one or more conditions that each output row must meet for it to be returned. AWS Clean Rooms supports aggregation constraints in the form of `COUNT (DISTINCT column) >= X`. This form requires that each row aggregates at least X distinct values of a choice from your configured table (for example, a minimum number of distinct `user_id` values). This minimum threshold is automatically enforced, even if the submitted query itself does not use the specified column. They are enforced collectively across each configured table in the query from the configured tables from each member in the collaboration. 

Each configured table must have at least one aggregation constraint in their analysis rule. Configured table owners can add multiple `columnName` and associated `minimum` and they are enforced collectively. 

### Aggregation constraints
<a name="agg-constraints"></a>

*Aggregation constraints* control which rows in the query results are returned. To be returned, a row must meet the specified minimum number of distinct values in each column specified in the aggregation constraint. This requirement applies even if the column isn't explicitly mentioned in the query or in other parts of the analysis rule.


| Control | Definition | Usage | 
| --- | --- | --- | 
| columnName |  The `aggregateColumn` that is used in the condition that each output row must meet.  |  Can be any column in the configured table.  | 
| minimum |  The minimum number of distinct values for the associated `aggregateColumn` that the output row must have (for example, COUNT DISTINCT) for it to be returned in the query results.   |  The `minimum` must be at least value of 2.  | 

## Aggregation analysis rule structure
<a name="agg-analysis-rule-template"></a>

The following example shows a predefined structure for an aggregation analysis rule. 

In the following example, *`MyTable`* refers to your data table. You can replace each *user input placeholder* with your own information. 

```
{
  "aggregateColumns": [
    {
      "columnNames": [MyTable column names], "function": [Allowed Agg Functions]
    },
  ],
  "joinRequired": ["QUERY_RUNNER"],  
  "joinColumns": [MyTable column names],
  "dimensionColumns": [MyTable column names],
  "scalarFunctions": [Allowed Scalar functions],
  "outputConstraints": [
    {
      "columnName": [MyTable column names], "minimum": [Numeric value] 
    },
  ]
}
```

## Aggregation analysis rule - example
<a name="agg-analysis-rule-example"></a>

The following example demonstrates how two companies can collaborate in AWS Clean Rooms using aggregation analysis.

Company A has customer and sales data. Company A is interested in understanding product return activity. Company B is one of Company A's retailers and has returns data. Company B also has segment attributes on customers that are useful to Company A (for example, purchased related products, uses customer service from the retailer). Company B doesn't want to provide row-level customer return data and attribute information. Company B only wants to enable a set of queries for Company A to obtain aggregate statistics on overlapping customers at a minimum aggregation threshold. 

Company A and Company B decide to collaborate so that Company A can understand product return activity and deliver better products at Company B and other channels. 

To create the collaboration and run an aggregation analysis, the companies do the following: 

1. Company A creates a collaboration and creates a membership. The collaboration has Company B as another member in the collaboration. Company A enables query logging in the collaboration, and it enables query logging in their account. 

1. Company B creates a membership in the collaboration. It enables query logging in its account. 

1. Company A creates a sales configured table.

1. Company A adds the following aggregation analysis rule to the sales configured table.

   ```
   {
     "aggregateColumns": [
       {
         "columnNames": [
           "identifier"
         ],
         "function": "COUNT_DISTINCT"
       },
       {
         "columnNames": [
           "purchases"
         ],
         "function": "AVG"
       },
       {
         "columnNames": [
           "purchases"
         ],
         "function": "SUM"
       }
     ],
     "joinColumns": [
       "hashedemail"
     ],
     "dimensionColumns": [
       "demoseg",
       "purchasedate",
       "productline"
     ],
     "scalarFunctions": [
       "CAST",
       "COALESCE",
       "TRUNC"
     ],
     "outputConstraints": [
       {
         "columnName": "hashedemail",
         "minimum": 2,
         "type": "COUNT_DISTINCT"
       },
     ]
   }
   ```

   `aggregateColumns` – Company A wants to count the number of unique customers in the overlap between sales data and returns data. Company A also wants to sum the number of `purchases` made to compare to number of `returns`.

   `joinColumns` – Company A wants to use `identifier` to match customers from sales data to customers from returns data. This will help Company A match returns to the right purchases. It also helps Company A segment overlapping customers.

   `dimensionColumns` – Company A uses `dimensionColumns` to filter by the specific product, compare purchases and returns over a certain period of time, ensure the return date is after the product date, and help segment overlapping customers. 

   `scalarFunctions` – Company A selects `CAST` scalar function to help update data type formats if needed based on the configured table Company A associates to the collaboration. It also adds scalar functions to help formatting columns if needed. 

   `outputConstraints` – Company A sets minimum output constraints. It doesn't need to constrain the results because the analyst is allowed to see row-level data from their sales table 
**Note**  
Company A doesn't include `joinRequired` in the analysis rule. It provides flexibility for their analyst to query the sales table alone.

1. Company B creates a returns configured table.

1. Company B adds the following aggregation analysis rule to the returns configured table.

   ```
   {
     "aggregateColumns": [
       {
         "columnNames": [
           "identifier"
         ],
         "function": "COUNT_DISTINCT"
       },
       {
         "columnNames": [
           "returns"
         ],
         "function": "AVG"
       },
       {
         "columnNames": [
           "returns"
         ],
         "function": "SUM"
       }
     ],
     "joinColumns": [
       "hashedemail"
     ],
     "joinRequired": [
       "QUERY_RUNNER"
     ],
     "dimensionColumns": [
       "state",
       "popularpurchases",
       "customerserviceuser",
       "productline",
       "returndate"
     ],
     "scalarFunctions": [
       "CAST",
       "LOWER",
       "UPPER",
       "TRUNC"
     ],
     "outputConstraints": [
       {
         "columnName": "hashedemail",
         "minimum": 100,
         "type": "COUNT_DISTINCT"
       },
       {
         "columnName": "producttype",
         "minimum": 2,
         "type": "COUNT_DISTINCT"
       }
     ]
   }
   ```

   `aggregateColumns` – Company B enables Company A to sum `returns` to compare to the number of purchases. They have at least one aggregate column because they are enabling an aggregate query. 

   `joinColumns` – Company B enables Company A to join on `identifier` to match customers from return data to customers from sales data. `identifier` data is particularly sensitive and having it as a `joinColumn` ensures that the data will never be outputted in a query. 

   `joinRequired` – Company B requires queries on the return data to be overlapped with the sales data. They don’t want to enable Company A to query all individuals in their dataset. They also agreed on that restriction in their collaboration agreement. 

   `dimensionColumns` – Company B enables Company A to filter and group by `state`, `popularpurchases`, and `customerserviceuser` which are unique attributes that could help make the analysis for Company A. Company B enables Company A to use `returndate` to filter output on `returndate` that occurs after `purchasedate`. With this filtering, the output is more accurate for evaluating the impact of the product change. 

   `scalarFunctions` – Company B enables the following: 
   + TRUNC for dates
   + LOWER and UPPER in case the `producttype` is entered in a different format in their data
   + CAST if Company A needs to convert data types in sales to be the same as data types in returns

   Company A doesn't enable other scalar functions because they don’t believe they are required for queries.

   `outputConstraints` – Company B sets minimum output constraints on `hashedemail` to help reduce the ability to re-identify customers. It also adds minimum output constraint on `producttype` to reduce the ability to re-identify specific products that were returned. Certain product types could be more dominant based on dimensions of the output (for example, `state`). Their output constraints will always be enforced regardless of output constraints added by Company A to their data. 

1. Company A creates a sales table association to collaboration.

1. Company B creates a returns table association to collaboration.

1. Company A runs queries, such as the following example, to better understand the quantity of returns in Company B as compared to total purchases by location in 2022.

   ```
   SELECT
     companyB.state,
     SUM(companyB.returns),
     COUNT(DISTINCT companyA.hashedemail)
   FROM
     sales companyA
     INNER JOIN returns companyB ON companyA.identifier = companyB.identifier
   WHERE
     companyA.purchasedate BETWEEN '2022-01-01' AND '2022-12-31' AND
     TRUNC(companyB.returndate) > companyA.purchasedate
   GROUP BY
     companyB.state;
   ```

1. Company A and Company B review query logs. Company B verifies that the query aligns with what was agreed upon in the collaboration agreement. 

## Troubleshooting aggregation analysis rule issues
<a name="troubleshooting-agg-analysis-rule"></a>

Use the information here to help you diagnose and fix common issues when you work with aggregation analysis rules. 

**Topics**
+ [My query didn’t return any results](#query-no-results)

### My query didn’t return any results
<a name="query-no-results"></a>

This can happen when there are no matching results or when the matching results don’t meet one or more minimum aggregation thresholds. 

For more information about minimum aggregation thresholds, see [Aggregation analysis rule - example](#agg-analysis-rule-example).

# List analysis rule
<a name="analysis-rules-list"></a>

In AWS Clean Rooms, a *list analysis rule* outputs row-level lists of the overlap between the configured table that it's added to and the configured tables of the member who can query. The member who can query runs queries that include a list analysis rule.

The list analysis rule type supports uses cases such as enrichment and audience building. 

For more information about the predefined query structure and syntax for this analysis rule, see [List analysis rule predefined structure](#intersection-list-params-template).

The parameters of the list analysis rule, defined in [List analysis rule - query controls](#parameters-list-query-controls), have query controls. Its query controls include the ability to select the columns that can be listed in the output. The query is required to have at least one join with a configured table from the member who can query, either directly or transitively.

There are no query results controls like there are for the [Aggregation analysis rule](analysis-rules-aggregation.md). 

List queries can only use mathematical operators. They can't use other functions (such as aggregation or scalar).

**Topics**
+ [List query structure and syntax](#list-query-controls)
+ [List analysis rule - query controls](#parameters-list-query-controls)
+ [List analysis rule predefined structure](#intersection-list-params-template)
+ [List analysis rule - example](#list-example)

## List query structure and syntax
<a name="list-query-controls"></a>

Queries on tables that have a list analysis rule must adhere to the following syntax. 

```
--select_list_expression
SELECT DISTINCT column_name [[AS] column_alias ] [, ...] 

--table_expression
FROM table_name [[AS] table_alias ]
  [[INNER] JOIN table_name [[AS] table_alias] ON join_condition] [...]

--where_expression
[WHERE where_condition]          

--limit_expression
[LIMIT number]
```

The following table explains each expression listed in the preceding syntax. 


| Expression | Definition | Examples | 
| --- | --- | --- | 
| select\$1list\$1expression |  A comma-separated list containing at least one table column name. A `DISTINCT` parameter is required.   The `select_list_expression` can alias columns with or without the `AS` parameter.  For more information, see the [AWS Clean Rooms SQL Reference](https://docs.aws.amazon.com/clean-rooms/latest/sql-reference/sql-reference.html).   |  `SELECT DISTINCT segment`  | 
| table\$1expression |  A table, or join of tables, with `join_condition` to connect it to `join_condition`.  `join_condition` returns a Boolean.  The `table_expression` supports: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/clean-rooms/latest/userguide/analysis-rules-list.html)  |  <pre>FROM consumer_table <br />INNER JOIN provider_table<br />ON<br />consumer_table.identifier1 = provider_table.identifier1<br />AND<br />consumer_table.identifier2 = provider_table.identifier2</pre>  | 
| where\$1expression | A conditional expression that returns a Boolean. It can be comprised of the following:[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/clean-rooms/latest/userguide/analysis-rules-list.html)Supported comparison conditions are (`=, >, <, <=, >=, <>, !=, NOT, IN, NOT IN, LIKE, IS NULL, IS NOT NULL`). Supported logical operators are (`AND, OR`).The `where_expression` is optional. |  `WHERE state + '_' + city = 'NY_NYC'` `WHERE timestampColumn = timestampColumn2 - 14`   | 
| limit\$1expression |  This expression must take a positive integer. The `limit_expression` is optional.   |  `LIMIT 100`  | 

For list query structure and syntax, be aware of the following:
+ SQL commands other than SELECT are not supported.
+ Subqueries and common table expressions (for example, WITH) are not supported
+ HAVING, GROUP BY, and ORDER BY clauses are not supported
+ OFFSET parameter is not supported

## List analysis rule - query controls
<a name="parameters-list-query-controls"></a>

With list query controls, you can control how the columns in your table are used to query the table. For example, you can control which column is used for joining, or which column can be used in SELECT statement and WHERE clause.

The following sections explain each control.

**Topics**
+ [Join controls](#list-controls-join-controls)
+ [List controls](#list-controls)

### Join controls
<a name="list-controls-join-controls"></a>

With *Join controls*, you can control how your table can be joined to other tables in the **table\$1expression**. AWS Clean Rooms only supports INNER JOIN. In the list analysis rule, at least one INNER JOIN is required and the member who can query is required to include a table they own in the INNER JOIN. This means they must join your table with theirs, either directly or transitively.

Following is an example of transitivity.

```
ON 
my_table.identifer = third_party_table.identifier 
.... 
ON 
third_party_table.identifier = member_who_can_query_table.id
```

INNER JOIN statements can only use columns that have been explicitly categorized as a `joinColumn` in your analysis rule. 

The INNER JOIN must operate on a `joinColumn` from your configured table and a `joinColumn` from another configured table in the collaboration. You decide which columns from your table can be used as `joinColumn`. 

Each match condition within the ON clause is required to use the equality comparison condition (`=`) between two columns. 

Multiple match conditions within an ON clause can be:
+ Combined using the `AND` logical operator
+ Separated using the `OR` logical operator

**Note**  
All JOIN match conditions must match one row from each side of the JOIN. All conditionals connected by an `OR` or an `AND` logical operator must adhere to this requirement as well.

The following is an example of a query with an `AND` logical operator.

```
SELECT some_col, other_col 
FROM table1 
    JOIN table2 
    ON table1.id = table2.id AND table1.name = table2.name
```

The following is an example of a query with an `OR` logical operator.

```
SELECT some_col, other_col 
FROM table1 
    JOIN table2 
    ON table1.id = table2.id OR table1.name = table2.name
```


| Control | Definition | Usage | 
| --- | --- | --- | 
| joinColumns | The columns that you want to allow the member who can query to use in the INNER JOIN statement. |  The same column can't be categorized as both a `joinColumn` and `listColumn` (see [List controls](#list-controls)). `joinColumn` can't be used in any other parts of the query other than INNER JOIN.  | 

### List controls
<a name="list-controls"></a>

*List controls* control the columns that can be listed in the query output (that is, used in the SELECT statement) or used to filter results (that is, used in the WHERE statement).


| Control | Definition | Usage | 
| --- | --- | --- | 
| listColumns | The columns that you allow the member who can query to use in the SELECT and WHERE | A listColumn can be used in SELECT and WHERE.The same column can't be used as both a `listColumn` and `joinColumn`. | 

## List analysis rule predefined structure
<a name="intersection-list-params-template"></a>

The following example includes a predefined structure that shows how you complete a list analysis rule. 

In the following example, *`MyTable`* refers to your data table. You can replace each *user input placeholder* with your own information. 

```
{
  "joinColumns": [MyTable column name(s)],
  "listColumns": [MyTable column name(s)],
}
```

## List analysis rule - example
<a name="list-example"></a>

The following example demonstrates how two companies can collaborate in AWS Clean Rooms using list analysis.

Company A has customer relationship management (CRM) data. Company A wants to obtain additional segment data on its customers to learn more about their customers and potentially use attributes as input into other analyses. Company B has segment data comprised of unique segment attributes that they created based on their first party data. Company B wants to provide the unique segment attributes to Company A only on customers that are overlapping between their data and Company A data. 

The companies decide to collaborate so that Company A can enrich the overlapping data. Company A is the member who can query, and Company B is the contributor.

To create a collaboration and run list analysis in collaboration, the companies do the following: 

1. Company A creates a collaboration and creates a membership. The collaboration has Company B as another member on the collaboration. Company A enables query logging in the collaboration, and it enables query logging in its account. 

1. Company B creates a membership in the collaboration. It enables query logging in its account. 

1. Company A creates a CRM configured table

1. Company A adds the analysis rule to the customer configured table, as shown in the following example.

   ```
   {
     "joinColumns": [
       "identifier1",
       "identifier2"
     ],
     "listColumns": [
       "internalid",
       "segment1",
       "segment2",
       "customercategory"
     ]
   }
   ```

   `joinColumns` – Company A wants to use `hashedemail` and/or `thirdpartyid` (obtained from an identity vendor) to match customers from CRM data to customers from segment data. This will help ensure Company A matches enriched data for the right customers. They have two joinColumns to potentially improve the match rate of the analysis. 

   `listColumns` – Company A uses `listColumns` to obtain enriched columns beside an `internalid` they use within their own systems. They add `segment1`, `segment2`, and `customercategory` to potentially limit the enrichment to specific segments by using them in filters. 

1. Company B creates a segment configured table.

1. Company B adds the analysis rule to the segment configured table. 

   ```
   {
     "joinColumns": [
       "identifier2"
     ],
     "listColumns": [
       "segment3",
       "segment4"
     ]
   }
   ```

   `joinColumns` – Company B enables Company A to join on `identifier2` to match customers from segment data to CRM data. Company A and Company B worked with the identity vendor to obtain `identifier2` which would match for this collaboration. They didn't add other `joinColumns` because they believed `identifier2` provides the highest and most accurate match rate and other identifiers aren't required for the queries. 

   `listColumns` – Company B enables Company A to enrich their data with `segment3` and `segment4` attributes which are unique attributes they have created, collected and aligned on (with customer A) to be a part of data enrichment. They want Company A to obtain these segments for the overlap at a row-level because this is a data enrichment collaboration. 

1. Company A creates a CRM table association to the collaboration.

1. Company B creates a segment table association to the collaboration.

1. Company A runs queries, such as the following one to enrich overlapping customer data. 

   ```
   SELECT companyA.internalid, companyB.segment3, companyB.segment4
   INNER JOIN returns companyB
    ON companyA.identifier2 = companyB.identifier2
   WHERE companyA.customercategory > 'xxx'
   ```

1. Company A and Company B review query logs. Company B verifies that the query aligns with what was agreed upon in the collaboration agreement.

# Custom analysis rule in AWS Clean Rooms
<a name="analysis-rules-custom"></a>

In AWS Clean Rooms, a *custom analysis rule* is a new type of analysis rule that allows custom queries to be run on the configured table. Custom SQL queries are still restricted to having only the SELECT command but can use more SQL constructs than [aggregation](analysis-rules-aggregation.md#agg-query-controls) and [list](analysis-rules-list.md#list-query-controls) queries (for example, window functions, OUTER JOIN, CTEs, or subqueries; see the [AWS Clean Rooms SQL Reference](https://docs.aws.amazon.com/clean-rooms/latest/sql-reference/sql-reference.html) for a complete list). Custom SQL queries don’t have to follow a query structure like [aggregation](analysis-rules-aggregation.md#agg-query-structure-syntax) and [list](analysis-rules-list.md#list-query-controls) queries. 

The custom analysis rule supports more advanced use cases than those that can be supported by the aggregation and list analysis rule such as custom attribution analysis, benchmarking, incrementality analysis, and audience discovery. This is in addition to a superset of the use cases supported by aggregation and list analysis rule. 

The custom analysis rule also supports differential privacy. Differential privacy is a mathematically-rigorous framework for data privacy protection. For more information, see [AWS Clean Rooms Differential Privacy](differential-privacy.md). When you create an analysis template, AWS Clean Rooms Differential Privacy checks the template to determine whether it is compatible with the general-purpose query structure for AWS Clean Rooms Differential Privacy. This validation ensures that you don't create an analysis template that isn't allowed with a differential privacy protected table.

To configure the custom analysis rule, data owners can choose to allow specific custom queries, stored in [analysis templates](create-analysis-template.md), to run on their configured tables. Data owners review analysis templates before adding them to the allowed analysis control in the custom analysis rule. Analysis templates are available and visible only in the collaboration in which they are created (even if the table is associated to other collaborations) and can be run only by the member who can query in that collaboration.

Alternatively, members can choose to allow other members (query providers) to create queries without review. Members add query providers’ accounts the allowed query providers control in the custom analysis rule. If the query provider is the member who can query, they could run any query directly on the configured table. Query providers could also create queries by [creating analysis templates](create-analysis-template.md). Any queries that have been created by the query providers are automatically allowed to run on the table in all collaborations in which the AWS account is present and the table is associated.

Data owners can only allow analysis templates or accounts to create queries, not both. If the data owner leaves it empty, the member who can query can't run queries on the configured table.

**Disallowed output columns query constraint and CACHE TABLE**  
The [disallowed output columns constraint](https://docs.aws.amazon.com/clean-rooms/latest/userguide/disallowed-columns.html) in the custom analysis rule is enforced on cached tables. A cached table cannot reference a disallowed output column in its SELECT clause. To use a column with a disallowed output column constraint in a subsequent part of your query, convert the cached table to a common table expression (CTE).

**Topics**
+ [Custom analysis rule predefined structure](#custom-predefined-structure)
+ [Custom analysis rule example](#custom-example)
+ [Custom analysis rule with differential privacy](#custom-diff-privacy)

## Custom analysis rule predefined structure
<a name="custom-predefined-structure"></a>

The following example includes a predefined structure that shows you how to complete a custom analysis rule with differential privacy turned on. The `userIdentifier` value is the column that uniquely identifies your users, such as *user\$1id*. When you have two or more tables with differential privacy turned on in a collaboration, AWS Clean Rooms requires you to configure the same column as the user identifier column in both of the analysis rules to maintain a consistent definition of the users across tables. 

```
{
  "allowedAnalyses": ["ANY_QUERY"] | string[],
  "allowedAnalysisProviders": [],
  "differentialPrivacy": {
    "columns": [
      {
        "name": "userIdentifier"
      }
    ]
  }
}
```

You can either: 
+ Add analysis template ARNs to allowed analyses control. In this case, the `allowedAnalysisProviders` control is not included.

  ```
  {
    allowedAnalyses: string[]
  }
  ```
+ Add member AWS account IDs to the `allowedAnalysisProviders` control. In this case, you add `ANY_QUERY` to the `allowedAnalyses` control. 

  ```
  {
    allowedAnalyses: ["ANY_QUERY"],
    allowedAnalysisProviders: string[]
  }
  ```

## Custom analysis rule example
<a name="custom-example"></a>

The following example demonstrates how two companies can collaborate in AWS Clean Rooms using the custom analysis rule.

Company A has customer and sales data. Company A is interested in understanding the sales incrementality of an advertising campaign on Company B site. Company B has viewership data and segment attributes that are useful to Company (for example, the device they used when viewing the advertising). 

Company A has a specific incrementality query they want to run in the collaboration. 

To create a collaboration and run a custom analysis in collaboration, the companies do the following: 

1. Company A creates a collaboration and creates a membership. The collaboration has Company B as another member on the collaboration. Company A enables query logging in the collaboration, and it enables query logging in its account. 

1. Company B creates a membership in the collaboration. It enables query logging in its account. 

1. Company A creates a CRM configured table

1. Company A adds empty custom analysis rule to sales configured table.

1. Company A associates sales configured table to the collaboration.

1. Company B creates viewership configured table.

1. Company B adds an empty custom analysis rule to the viewership configured table.

1. Company B associates viewership configured table to the collaboration.

1. Company A views the sales table and viewership table associated to the collaboration and creates analysis template, adding the incrementality query and parameter for campaign month.

   ```
   {
       "analysisParameters": [
       {
           "defaultValue": ""
           "type": "DATE"
           "name": "campaign_month"
       }
       ],
       "description": "Monthly incrementality query using sales and viewership data"
       "format": "SQL"
       "name": "Incrementality analysis"
       "source": 
           "WITH labeleddata AS
           (
           SELECT hashedemail, deviceid, purchases, unitprice, purchasedate,
           CASE
               WHEN testvalue IN ('value1', 'value2', 'value3') THEN 0
               ELSE 1
           END AS testgroup
           FROM viewershipdata
           )
           SELECT labeleddata.purchases, provider.impressions
           FROM labeleddata 
           INNER JOIN salesdata
             ON labeleddata.hashedemail = provider.hashedemail
           WHERE MONTH(labeleddata.purchasedate) > :campaignmonth
           AND testgroup = :group
          "
   }
   ```

1. Company A adds their account (for example, 444455556666) to the allowed analysis provider control in the custom analysis rule. They use the allowed analysis provider control because they want to allow any queries they create to run on their sales configured table.

   ```
   {
     "allowedAnalyses": [
       "ANY_QUERY"
     ],
     "allowedAnalysisProviders": [
       "444455556666"
     ]
   }
   ```

1. Company B sees the created analysis template in the collaboration and reviews its contents including the query string and parameter.

1. Company B determines that the analysis template achieves the incrementality use case and meets their privacy requirements for how their viewership configured table can be queried.

1. Company B adds the analysis template ARN to the allowed analysis control in the custom analysis rule of the viewership table. They use the allowed analysis control because they only want to allow the incrementality query to run on their viewership configured table.

   ```
   {
     "allowedAnalyses": [
       "arn:aws:cleanrooms:us-east-1:111122223333:membership/41327cc4-bbf0-43f1-b70c-a160dddceb08/analysistemplate/1ff1bf9d-781c-418d-a6ac-2b80c09d6292"
     ]
   }
   ```

1. Company A runs the analysis template and uses the parameter value `05-01-2023`.

## Custom analysis rule with differential privacy
<a name="custom-diff-privacy"></a>

In AWS Clean Rooms, the custom analysis rule supports differential privacy. Differential privacy is a mathematically-rigorous framework for data privacy protection that helps you protect your data against re-identification attempts.

Differential privacy supports aggregate analysis such as ad campaign planning, post-ad-campaign measurement, benchmarking in a financial institution consortium, and A/B testing for healthcare research.

The supported query structure and syntax are defined in [Query structure and syntax](#dp-query-structure-syntax).

### Custom analysis rule with differential privacy example
<a name="custom-diff-privacy-example"></a>

**Note**  
AWS Clean Rooms Differential Privacy is only available for collaborations where the data is stored in Amazon S3.

Consider the [custom analysis rule example](#custom-example) presented in the previous section. This example demonstrates how you can use differential privacy to protect your data against re-identification attempts while allowing your partner to learn business-critical insights from your data. Assume that Company B, who has the viewership data, wants to protect their data using differential privacy. To complete the differential privacy setup, Company B completes the following steps:

1. Company B turns on differential privacy while adding custom analysis rule to the viewership configured table. Company B selects `viewershipdata.hashedemail` as the user identifier column.

1. Company B [adds a differential privacy policy](configure-differential-privacy.md) in the collaboration to make their viewership data table available for querying. Company B selects the default policy to quickly complete the setup.

Company A, who wants to understand the sales incrementality of an advertising campaign on Company B's site, runs the analysis template. Because the query is compatible with the general-purpose [query structure](#dp-query-structure-syntax) of AWS Clean Rooms Differential Privacy, the query runs successfully. 

### Query structure and syntax
<a name="dp-query-structure-syntax"></a>

Queries containing at least one table that have the differential privacy turned on must adhere to the following syntax.

```
query_statement:
    [cte, ...] final_select

 cte:
    WITH sub_query AS (
       inner_select
       [ UNION | INTERSECT | UNION_ALL | EXCEPT/MINUS ]
       [ inner_select ]
    )
   
 inner_select:
     SELECT [user_id_column, ] expression [, ...] 
     FROM table_reference [, ...] 
     [ WHERE condition ]
     [ GROUP BY user_id_column[, expression] [, ...] ] 
     [ HAVING condition ] 

 final_select:
     SELECT [expression, ...] | COUNT | COUNT_DISTINCT | SUM | AVG | STDDEV
     FROM table_reference [, ...]
     [ WHERE condition ]
     [ GROUP BY expression [, ...] ] 
     [ HAVING COUNT | COUNT_DISTINCT | SUM | AVG | STDDEV | condition ]
     [ ORDER BY column_list ASC | DESC ] 
     [ OFFSET literal ]
     [ LIMIT literal ]

 expression:
    column_name [, ...] | expression AS alias | aggregation_functions | window_functions_on_user_id | scalar_function | CASE | column_name math_expression [, expression]  
    
 window_functions_on_user_id:
    function () OVER (PARTITION BY user_id_column, [column_name] [ORDER BY column_list ASC|DESC])
```

**Note**  
For differential privacy query structure and syntax, be aware of the following:   
Sub-queries are not supported. 
Common Table Expressions (CTEs) should emit the user identifier column if a table or CTE involve data protected by differential privacy. Filters, groupings, and aggregations should be done at the user level.
Final\$1select allows COUNT DISTINCT, COUNT, SUM, AVG, and STDDEV aggregate functions.

For more details about which SQL keywords are supported for differential privacy, see [SQL capabilities of AWS Clean Rooms Differential Privacy](dp-sql-capabilities.md).

# ID mapping table analysis rule
<a name="analysis-rules-id-mapping-table"></a>

In AWS Clean Rooms, an *ID mapping table analysis rule* isn't a standalone analysis rule. This type of analysis rule is managed by AWS Clean Rooms and is used to join disparate identity data to facilitate querying. It's automatically added to ID mapping tables and can't be edited. It inherits the behaviors of the other analysis rules in the collaboration – as long as those analysis rules are homogeneous.

The ID mapping table analysis rule enforces security on an ID mapping table. It restricts a collaboration member from directly selecting or inspecting the non-overlap population between the two members' datasets using the ID mapping table. The ID mapping table analysis rule is used to protect the sensitive data in the ID mapping table when used in queries with other analysis rules implicitly.

With the ID mapping table analysis rule, AWS Clean Rooms enforces an overlap on both sides of the ID mapping table in expanded SQL. This enables you to do the following tasks: 
+ Use the overlap of the ID mapping table in JOIN statements.

  AWS Clean Rooms allows an INNER, LEFT, or RIGHT join on the ID mapping table if it respects the overlap. To protect sensitive mapping information, the ID mapping table must always be on the "inner" side of any JOIN operation. For example, the following JOIN operations are valid:
  + table LEFT JOIN id\$1mapping\$1table
  + id\$1mapping\$1table RIGHT JOIN table
  + table INNER JOIN id\$1mapping\$1table

  The following JOIN operations are invalid:
  + id\$1mapping\$1table LEFT JOIN table
  + table RIGHT JOIN id\$1mapping\$1table

  This prevents exposure of mapping records that don't have corresponding matches in your dataset. Allowing such operations could potentially reveal sensitive information about other collaboration members' data mappings.
+ Use the mapping table columns in JOIN statements. 

  You can’t use the mapping table columns in the following statements: SELECT, WHERE, HAVING, GROUP BY, or ORDER BY (unless protections are modified on the source ID namespace association or the target ID namespace association).
+ In expanded SQL, AWS Clean Rooms also supports OUTER JOIN, implicit JOIN, and CROSS JOIN. These joins can't satisfy overlap requirements. Instead, AWS Clean Rooms uses `requireOverlap` to specify which columns have to be joined on.

The supported query structure and syntax are deﬁned in [ID mapping table query structure and syntax](#id-mapping-table-query-controls).

The parameters of the analysis rule, deﬁned in [ID mapping table analysis rule query controls](#parameters-id-mapping-query-controls), include query controls and query results controls. Its query controls include the ability to require overlap of the ID mapping table in JOIN statements (that is, `requireOverlap`).

**Topics**
+ [ID mapping table query structure and syntax](#id-mapping-table-query-controls)
+ [ID mapping table analysis rule query controls](#parameters-id-mapping-query-controls)
+ [ID mapping table analysis rule predefined structure](#id-mapping-table-predefined-structure)
+ [ID mapping table analysis rule – example](#id-mapping-table-example)

## ID mapping table query structure and syntax
<a name="id-mapping-table-query-controls"></a>

Queries on tables that have an ID mapping table analysis rule must adhere to the following syntax.

```
--select_list_expression
SELECT 
provider.data_col, consumer.data_col 

--table_expression
FROM provider

JOIN idMappingTable idmt ON provider.id = idmt.sourceId

JOIN consumer ON consumer.id = idmt.targetId
```

### Collaboration tables
<a name="collab-table-structure"></a>

The following tables represent configured tables that exist in an AWS Clean Rooms collaboration. The **id** column in both the **cr\$1drivers\$1license** and **cr\$1insurance** tables represents a column to match with the ID mapping table.

**cr\$1drivers\$1license**


|  |  |  | 
| --- |--- |--- |
| id | driver\$1name | state\$1of\$1registration | 
| 1 | Eduard | TX | 
| 2 | Dana | MA | 
| 3 | Gweneth | IL | 

**cr\$1insurance**


|  |  |  | 
| --- |--- |--- |
| id | policyholder\$1email | policy\$1number | 
| a | eduardo@internal.company.com | 17f9d04e-f5be-4426-bdc4-250ed59c6529 | 
| b | gwen@internal.company.com | 3f0092db-2316-48a8-8d44-09cf8f6e6c64 | 
| c | rosa@internal.company.com | d7692e84-3d3c-47b8-b46d-a0d5345f0601 | 

### ID mapping table
<a name="id-mapping-table-structure"></a>

The following table represents an existing ID mapping table that matches on the **cr\$1drivers\$1license** and **cr\$1insurance** tables. Not all entries will have IDs for both collaboration tables.


|  |  | 
| --- |--- |
| cr\$1drivers\$1license\$1id | cr\$1insurance\$1id | 
| 1 | a | 
| 2 | null | 
| 3 | b | 
| null | c | 

The ID mapping table analysis rule only allows for queries to run on the set of overlapping data, which would look as follows:


|  |  |  |  |  |  | 
| --- |--- |--- |--- |--- |--- |
| cr\$1drivers\$1license\$1id | cr\$1insurance\$1id | driver\$1name | state\$1of\$1registration | policyholder\$1email | policy\$1number | 
| 1 | a | Eduard | TX | eduardo@internal.company.com | 17f9d04e-f5be-4426-bdc4-250ed59c6529 | 
| 3 | b | Gweneth | IL | gwen@internal.company.com | 3f0092db-2316-48a8-8d44-09cf8f6e6c64 | 

### Example queries
<a name="id-mapping-table-example-queries"></a>

The following examples show valid locations for the ID mapping table joins:

```
-- Single ID mapping table
SELECT
    [ select_items ]FROM
    cr_drivers_license cr_dl
    [ INNER | LEFT ] JOIN cr_identity_mapping_table idmt ON idmt.cr_drivers_license_id = cr_dl.id
    [ INNER | RIGHT ] JOIN cr_insurance cr_in            ON idmt.cr_insurance_id       = cr_in.id
;
-- Single ID mapping table (Subquery)
SELECT
    [ select_items ]FROM (
    SELECT
        [ select_items ]
    FROM
        cr_drivers_license cr_dl
        [ INNER | LEFT ] JOIN cr_identity_mapping_table idmt ON idmt.cr_drivers_license_id = cr_dl.id
        [ INNER | RIGHT ] JOIN cr_insurance cr_in            ON idmt.cr_insurance_id       = cr_in.id
)
;
-- Single ID mapping table (CTE)
WITH
    matched_ids AS (
        SELECT
            [ select_items ]
        FROM
            cr_drivers_license cr_dl
            [ INNER | LEFT ] JOIN cr_identity_mapping_table idmt ON idmt.cr_drivers_license_id = cr_dl.id
            [ INNER | RIGHT ] JOIN cr_insurance cr_in            ON idmt.cr_insurance_id       = cr_in.id
    )SELECT
    [ select_items ]FROM
    matched_ids
;
```

### Considerations
<a name="id-mapping-table-considerations"></a>

For ID mapping table query structure and syntax, be aware of the following:
+ You can’t edit it.
+ It's applied to the ID mapping table by default.
+ It uses a source and target ID namespace association inside the collaboration. 
+ The ID mapping table is configured by default to provide default protections for the column that comes from the ID namepsace. You can modify this configuration so that the column that comes from the ID namespace (either `sourceID` or `targetID`) can be allowed anywhere in the query. For more information, see [ID namespaces in AWS Clean Rooms](working-with-id-namespaces.md).
+ The ID mapping table analysis rule inherits the SQL restrictions of the other analysis rules in the collaboration.

## ID mapping table analysis rule query controls
<a name="parameters-id-mapping-query-controls"></a>

With ID mapping table query controls, AWS Clean Rooms controls how the columns in your table are used to query the table. For example, it controls which columns are used for joining, and which columns require overlap. The ID mapping table analysis rule also includes functionality that enables you to allow the `sourceID`, the `targetID`, or both, to be projected without requiring a JOIN. 

The following table explains each control.


| Control | Definition | Usage | 
| --- | --- | --- | 
| joinColumns | The columns that the member who can query can use in the INNER JOIN statement. | You can't use joinColumns in any other parts of the query other than INNER JOIN.For more information, see [Join controls](analysis-rules-aggregation.md#join-controls). | 
| dimensionColumns  | The columns (if any) that the member who can query can use in SELECT and GROUP BY statements.  |  A `dimensionColumn` can be used in SELECT and GROUP BY. A `dimensionColumn` can appear as `joinKeys`.  You can only use `dimensionColumns` in the JOIN clause if you specify it in brackets.  | 
| queryContraints:RequireOverlap |  The columns in the ID mapping table that must be joined on so the query can run.  |  These columns must be used to JOIN the ID Mapping table and a collaboration table.  | 

## ID mapping table analysis rule predefined structure
<a name="id-mapping-table-predefined-structure"></a>

The predeﬁned structure for an ID mapping table analysis rule comes with default protections that are applied to the `sourceID` and `targetID`. This means that the column with protections applied must be used in queries.

You can configure the ID mapping table analysis rule in the following ways:
+ Both `sourceID` and `targetID` protected

  In this configuration, the `sourceID` and `targetID` can't both be projected. The `sourceID` and `targetID` must be used in a JOIN when the ID mapping table is referenced.
+ Only `targetID` protected

  In this configuration, the `targetID` can't be projected. The `targetID` must be used in a JOIN when the ID mapping table is referenced. The `sourceID` can be used in query.
+ Only `sourceID` protected

  In this configuration, the `sourceID` can't be projected. The `sourceID` must be used in a JOIN when ID mapping table is referenced. The `targetID` can be used in query.
+ Neither `sourceID` or `targetID` protected

  In this configuration, the ID mapping table isn't subject to any specific enforcement that can be used in query.

The following example shows a predeﬁned structure for an ID mapping table analysis rule with the default protections applied to the `sourceID` and `targetID`. In this example, the ID mapping table analysis rule only allows an INNER JOIN on both the `sourceID` column and the `targetID` column. 

```
{
  "joinColumns": [
    "source_id",
    "target_id"
  ],
  "queryConstraints": [
    {
      "requireOverlap": {
        "columns": [
          "source_id",
          "target_id"
        ]
      }
    }
  ],
  "dimensionColumns": [] // columns that can be used in SELECT and JOIN
}
```

The following example shows a predeﬁned structure for an ID mapping table analysis rule with protections applied to the `targetID`. In this example, the ID mapping table analysis rule only allows an INNER JOIN on the `sourceID` column. 

```
{
  "joinColumns": [
    "source_id",
    "target_id"
  ],
  "queryConstraints": [
    {
      "requireOverlap": {
        "columns": [
          "target_id"
        ]
      }
    }
  ],
  "dimensionColumns": [
    "source_id"
  ]
}
```

The following example shows a predeﬁned structure for an ID mapping table analysis rule with protections applied to the `sourceID`. In this example, the ID mapping table analysis rule only allows an INNER JOIN on the `targetID` column. 

```
{
  "joinColumns": [
    "source_id",
    "target_id"
  ],
  "queryConstraints": [
    {
      "requireOverlap": {
        "columns": [
          "source_id"
        ]
      }
    }
  ],
  "dimensionColumns": [
    "target_id"
  ]
}
```

The following example shows a predeﬁned structure for an ID mapping table analysis rule without protections applied to the `sourceID` or `targetID`. In this example, the ID mapping table analysis rule allows an INNER JOIN on both the `sourceID` column and the `targetID` column. 

```
{
  "joinColumns": [
    "source_id",
    "target_id"
  ],
  "queryConstraints": [
    {
      "requireOverlap": {
        "columns": []
      }
    }
  ],
  "dimensionColumns": [
    "source_id",
    "target_id"
  ]
}
```

## ID mapping table analysis rule – example
<a name="id-mapping-table-example"></a>

Rather than writing a long waterfall statement that references Personally Identifiable Information (PII), for example, companies can use the ID mapping table analysis rule to use multi-party LiveRamp transcoding. The following example demonstrates how you can collaborate in AWS Clean Rooms using the ID mapping table analysis rule.

Company A is an advertiser who has customer and sales data, which will be used as a source. Company A also performs transcoding on behalf of the parties in the collaboration, and brings the LiveRamp credentials.

Company B is a publisher who has event data, which will be used as a target.

**Note**  
Either Company A or Company B can provide LiveRamp transcoding credentials and perform the transcoding.

To create a collaboration that enables the ID mapping table analysis in collaboration, the companies do the following:

1. Company A creates a collaboration and creates a membership. It adds Company B, who also creates a membership in the collaboration.

1. Company A either associates an existing ID namespace source or creates a new one in AWS Entity Resolution using the AWS Clean Rooms console.

   Company A creates a configured table with their sales data and a column keyed to the `sourceId` in the ID mapping table.

   The ID namespace source provides data to transcode.

1. Company B either associates an existing ID namespace target or creates a new one in AWS Entity Resolution using the AWS Clean Rooms console.

   Company B creates a configured table with their event data and a column keyed to the `targetId` in the ID mapping table.

   The ID namespace target doesn't provide data to transcode, only metadata around the LiveRamp configuration.

1. Company A discovers the two ID namespaces associated to the collaboration and creates and populates an ID mapping table.

1. Company A runs a query across the two datasets by joining on the ID mapping table.

   ```
   --- this would be valid for Custom or List
   SELECT provider.data_col, consumer.data_col
   FROM provider
     JOIN idMappingTable-123123123123-myMappingWFName idmt 
        ON provider.id = idmt.sourceId
     JOIN consumer 
        ON consumer.id = idmt.targetId
   ```