

# DQDL rule type reference
<a name="dqdl-rule-types"></a>

This section provides a reference for each rule type that AWS Glue Data Quality supports.

**Note**  
DQDL doesn't currently support nested or list-type column data.
Bracketed values in the table below are replaced with the information provided in the rule arguments.
Rules typically require an additional argument for an expression.
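
As a rough illustration of how a rule type, its arguments, and an expression combine into a DQDL rule, the following Python sketch assembles rule strings and wraps them in a `Rules = [ ... ]` block. The helper names `dqdl_rule` and `dqdl_ruleset` are hypothetical and are not part of any AWS SDK, and the column names and thresholds are made-up examples.

```python
def dqdl_rule(rule_type, *columns, expression=None):
    """Build one DQDL rule string (hypothetical helper, not an AWS API).

    rule_type  -- a rule type name from the reference table, e.g. "Completeness"
    columns    -- zero or more column names, emitted as quoted arguments
    expression -- optional expression such as "> 0.95"
    """
    parts = [rule_type]
    parts.extend(f'"{column}"' for column in columns)
    if expression:
        parts.append(expression)
    return " ".join(parts)


def dqdl_ruleset(rules):
    """Wrap rule strings in a DQDL `Rules = [ ... ]` block."""
    body = ",\n".join(f"    {rule}" for rule in rules)
    return f"Rules = [\n{body}\n]"


# Example column names and thresholds are assumptions for illustration only.
ruleset = dqdl_ruleset([
    dqdl_rule("RowCount", expression="> 100"),
    dqdl_rule("IsComplete", "order_id"),
    dqdl_rule("Completeness", "email", expression="> 0.95"),
])
print(ruleset)
```

Here `IsComplete` takes only a column name, while `RowCount` and `Completeness` carry the additional expression that the note describes.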


| Rule type | Description | Arguments | Reported metrics | Supported as rule? | Supported as analyzer? | Returns row-level results? | Dynamic rule support? | Generates observations? | Supports where clause syntax? | 
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | 
| AggregateMatch | Checks if two datasets match by comparing summary metrics like total sales amount. Useful for financial institutions to compare if all data is ingested from source systems. | One or more aggregations | When first and second aggregation column names match:<br />`Column.[Column].AggregateMatch`<br />When first and second aggregation column names differ:<br />`Column.[Column1,Column2].AggregateMatch` | Yes | No | No | No | No | No | 
| AllStatistics | Standalone analyzer to gather multiple metrics for the provided column in a dataset. | A single column name | For columns of all types:<br />`Dataset.*.RowCount`<br />`Column.[Column].Completeness`<br />`Column.[Column].Uniqueness`<br />Additional metrics for string-valued columns:<br />`ColumnLength metrics`<br />Additional metrics for numeric-valued columns:<br />`ColumnValues metrics` | No | Yes | No | No | No | No | 
| ColumnCorrelation | Checks how well two columns are correlated. | Exactly two column names | Multicolumn.[Column1,Column2].ColumnCorrelation | Yes | Yes | No | Yes | No | Yes | 
| ColumnCount | Checks if any columns are dropped. | None | Dataset.\*.ColumnCount | Yes | Yes | No | Yes | Yes | No | 
| ColumnDataType | Checks if a column is compliant with a datatype. | Exactly one column name | Column.[Column].ColumnDataType.Compliance | Yes | No | No | Yes, in row-level threshold expression | No | Yes | 
| ColumnExists | Checks if columns exist in a dataset. This allows customers building self service data platforms to ensure certain columns are made available. | Exactly one column name | N/A | Yes | No | No | No | No | No | 
| ColumnLength | Checks if length of data is consistent. | Exactly one column name | `Column.[Column].MaximumLength`<br />`Column.[Column].MinimumLength`<br />Additional metric when row-level threshold provided:<br />`Column.[Column].ColumnValues.Compliance` | Yes | Yes | Yes, when row-level threshold provided | No | Yes. Only generates observations by analyzing Minimum and Maximum length | Yes | 
| ColumnNamesMatchPattern | Checks if column names match defined patterns. Useful for governance teams to enforce column name consistency.  | A regex for column names | Dataset.\*.ColumnNamesPatternMatchRatio | Yes | No | No | No | No | No | 
| ColumnValues | Checks if data is consistent per defined values. This rule supports regular expressions. | Exactly one column name | `Column.[Column].Maximum`<br />`Column.[Column].Minimum`<br />Additional metric when row-level threshold provided:<br />`Column.[Column].ColumnValues.Compliance` | Yes | Yes | Yes, when row-level threshold provided | No | Yes. Only generates observations by analyzing Minimum and Maximum values | Yes | 
| Completeness | Checks for any blank or NULLs in data. | Exactly one column name | `Column.[Column].Completeness` | Yes | Yes | Yes | Yes | Yes | Yes | 
| CustomSql | Customers can implement almost any type of data quality check in SQL. | A SQL statement<br />(Optional) A row-level threshold | `Dataset.*.CustomSQL`<br />Additional metric when row-level threshold provided:<br />`Dataset.*.CustomSQL.Compliance` | Yes | No | Yes, when row-level threshold provided | Yes | No | No | 
| DataFreshness | Checks if data is fresh. | Exactly one column name | Column.[Column].DataFreshness.Compliance | Yes | No | Yes | No | No | Yes | 
| DatasetMatch | Compares two datasets and identifies if they are in sync. | Name of a reference dataset<br />A column mapping<br />(Optional) Columns to check for matches | Dataset.[ReferenceDatasetAlias].DatasetMatch | Yes | No | Yes | Yes | No | No | 
| DistinctValuesCount | Checks for duplicate values. | Exactly one column name | Column.[Column].DistinctValuesCount | Yes | Yes | Yes | Yes | Yes | Yes | 
| DetectAnomalies | Checks for anomalies in another rule type's reported metrics. | A rule type | Metric(s) reported by the rule type argument | Yes | No | No | No | No | No | 
| Entropy | Checks for entropy of the data. | Exactly one column name | Column.[Column].Entropy | Yes | Yes | No | Yes | No | Yes | 
| IsComplete | Checks if 100% of the data is complete. | Exactly one column name | Column.[Column].Completeness | Yes | No | Yes | No | No | Yes | 
| IsPrimaryKey | Checks if a column is a primary key (not NULL and unique). | Exactly one column name | For single column:<br />`Column.[Column].Uniqueness`<br />For multiple columns:<br />`Multicolumn.[CommaDelimitedColumns].Uniqueness` | Yes | No | Yes | No | No | Yes | 
| IsUnique | Checks if 100% of the data is unique. | Exactly one column name | Column.[Column].Uniqueness | Yes | No | Yes | No | No | Yes | 
| Mean | Checks if the mean matches the set threshold. | Exactly one column name | Column.[Column].Mean | Yes | Yes | Yes | Yes | No | Yes | 
| ReferentialIntegrity | Checks if two datasets have referential integrity. | One or more column names from dataset<br />One or more column names from reference dataset | Column.[ReferenceDatasetAlias].ReferentialIntegrity | Yes | No | Yes | Yes | No | No | 
| RowCount | Checks if record counts match a threshold. | None | Dataset.\*.RowCount | Yes | Yes | No | Yes | Yes | Yes | 
| RowCountMatch | Checks if record counts between two datasets match. | Reference dataset alias | Dataset.[ReferenceDatasetAlias].RowCountMatch | Yes | No | No | Yes | No | No | 
| StandardDeviation | Checks if standard deviation matches the threshold. | Exactly one column name | Column.[Column].StandardDeviation | Yes | Yes | Yes | Yes | No | Yes | 
| SchemaMatch | Checks if the schemas of two datasets match. | Reference dataset alias | Dataset.[ReferenceDatasetAlias].SchemaMatch | Yes | No | No | Yes | No | No | 
| Sum | Checks if sum matches a set threshold. | Exactly one column name | Column.[Column].Sum | Yes | Yes | No | Yes | No | Yes | 
| Uniqueness | Checks if uniqueness of dataset matches threshold. | Exactly one column name | Column.[Column].Uniqueness | Yes | Yes | Yes | Yes | No | Yes | 
| UniqueValueRatio | Checks if the unique value ratio matches threshold. | Exactly one column name | Column.[Column].UniqueValueRatio | Yes | Yes | Yes | Yes | No | Yes | 
| FileFreshness | Checks if files in Amazon S3 are fresh. | File or Folder path and a threshold. | `Dataset.*.FileFreshness.Compliance`<br />`Dataset.*.FileCount` | Yes | No | No | No | No | No | 
| FileMatch | Checks if the contents of a file match a checksum or another file. This rule uses checksums to validate whether two files are the same. | Source file or folder path and target file or folder path. | No statistics are generated. | Yes | No | No | No | No | No | 
| FileSize | Checks if the size of a file matches with a specified condition. | File or folder path and threshold. | `Dataset.*.FileSize.Compliance`<br />`Dataset.*.FileCount`<br />`Dataset.*.MaximumFileSize`<br />`Dataset.*.MinimumFileSize` | Yes | No | No | No | No | No | 
| FileUniqueness | Checks if files are unique using checksums. | File or folder path and threshold. | `Dataset.*.FileUniquenessRatio`<br />`Dataset.*.FileCount` | Yes | No | No | No | No | No | 
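
The bracketed placeholders in the reported metrics can be read as simple templates. The following Python sketch shows one way to expand them, assuming each placeholder is substituted literally with the matching rule argument and that comma-separated placeholders stay comma-separated. The function `expand_metric` and the example column names are hypothetical and only illustrate the naming pattern.

```python
import re


def expand_metric(template, **values):
    """Replace each [Placeholder] in a metric template with a supplied value.

    Hypothetical helper for illustration; it is not part of AWS Glue.
    A placeholder such as [Column1,Column2] is expanded name by name and
    rejoined with commas. Templates without brackets are returned unchanged.
    """
    def substitute(match):
        names = [name.strip() for name in match.group(1).split(",")]
        return ",".join(values[name] for name in names)

    return re.sub(r"\[([^\]]+)\]", substitute, template)


# Column names below are assumptions chosen for the example.
print(expand_metric("Column.[Column].Completeness", Column="email"))
print(expand_metric(
    "Multicolumn.[Column1,Column2].ColumnCorrelation",
    Column1="height",
    Column2="weight",
))
print(expand_metric("Dataset.*.RowCount"))
```

Dataset-level metrics such as `Dataset.*.RowCount` contain no placeholder, so they pass through unchanged.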

**Topics**
+ [AggregateMatch](dqdl-rule-types-AggregateMatch.md)
+ [ColumnCorrelation](dqdl-rule-types-ColumnCorrelation.md)
+ [ColumnCount](dqdl-rule-types-ColumnCount.md)
+ [ColumnDataType](dqdl-rule-types-ColumnDataType.md)
+ [ColumnExists](dqdl-rule-types-ColumnExists.md)
+ [ColumnLength](dqdl-rule-types-ColumnLength.md)
+ [ColumnNamesMatchPattern](dqdl-rule-types-ColumnNamesMatchPattern.md)
+ [ColumnValues](dqdl-rule-types-ColumnValues.md)
+ [Completeness](dqdl-rule-types-Completeness.md)
+ [CustomSQL](dqdl-rule-types-CustomSql.md)
+ [DataFreshness](dqdl-rule-types-DataFreshness.md)
+ [DatasetMatch](dqdl-rule-types-DatasetMatch.md)
+ [DistinctValuesCount](dqdl-rule-types-DistinctValuesCount.md)
+ [Entropy](dqdl-rule-types-Entropy.md)
+ [IsComplete](dqdl-rule-types-IsComplete.md)
+ [IsPrimaryKey](dqdl-rule-types-IsPrimaryKey.md)
+ [IsUnique](dqdl-rule-types-IsUnique.md)
+ [Mean](dqdl-rule-types-Mean.md)
+ [ReferentialIntegrity](dqdl-rule-types-ReferentialIntegrity.md)
+ [RowCount](dqdl-rule-types-RowCount.md)
+ [RowCountMatch](dqdl-rule-types-RowCountMatch.md)
+ [StandardDeviation](dqdl-rule-types-StandardDeviation.md)
+ [Sum](dqdl-rule-types-Sum.md)
+ [SchemaMatch](dqdl-rule-types-SchemaMatch.md)
+ [Uniqueness](dqdl-rule-types-Uniqueness.md)
+ [UniqueValueRatio](dqdl-rule-types-UniqueValueRatio.md)
+ [DetectAnomalies](dqdl-rule-types-DetectAnomalies.md)
+ [FileFreshness](dqdl-rule-types-FileFreshness.md)
+ [FileMatch](dqdl-rule-types-FileMatch.md)
+ [FileUniqueness](dqdl-rule-types-FileUniqueness.md)
+ [FileSize](dqdl-rule-types-FileSize.md)