

# Enabling catalog-level automatic statistics generation


You can enable automatic column statistics generation for all new Apache Iceberg tables and for tables in non-OTF (open table format) formats such as Parquet, JSON, CSV, XML, ORC, and ION in the Data Catalog. After a table is created, you can still update its column statistics settings manually.

 To enable catalog-level statistics generation in the Data Catalog settings, the IAM role you use must have the `glue:UpdateCatalog` permission or the AWS Lake Formation `ALTER CATALOG` permission on the root catalog. You can use the `GetCatalog` API to verify the catalog properties. 
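
As a rough illustration of the verification step, the following shell sketch wraps `aws glue get-catalog` in a small helper. The function name `show_column_stats_settings` is hypothetical, and the `--query` path assumes the response nests custom properties under `Catalog.CatalogProperties.CustomProperties`; adjust it if your output is shaped differently.

```shell
# Hypothetical helper: print the custom properties stored on a catalog.
# $1 = catalog ID (assumed here to be the AWS account ID for the root catalog)
show_column_stats_settings() {
    aws glue get-catalog \
        --catalog-id "$1" \
        --query 'Catalog.CatalogProperties.CustomProperties' \
        --output json
}

# Example call (requires AWS credentials that are allowed to call GetCatalog):
# show_column_stats_settings 123456789012
```

If catalog-level statistics generation has been configured, the output should include the `ColumnStatistics.RoleArn` and `ColumnStatistics.Enabled` keys.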

------
#### [ AWS Management Console ]

**To enable automatic column statistics generation at the catalog level**

1. Open the Lake Formation console at [https://console.aws.amazon.com/lakeformation/](https://console.aws.amazon.com/lakeformation/).

1. On the left navigation bar, choose **Catalogs**.

1. On the **Catalog summary** page, choose **Edit** under **Optimization configuration**.   
![\[The screenshot shows the options available to generate column stats.\]](http://docs.aws.amazon.com/glue/latest/dg/images/edit-column-stats-auto.png)

1. On the **Table optimization configuration** page, choose the **Enable automatic statistics generation for the tables of the catalog** option.  
![\[The screenshot shows the options available to generate column stats.\]](http://docs.aws.amazon.com/glue/latest/dg/images/edit-optimization-option.jpg)

1. Choose an existing IAM role or create a new one that has the necessary permissions to run the column statistics task.

1. Choose **Submit**.

------
#### [ AWS CLI ]

You can also enable catalog-level statistics collection through the AWS CLI. To configure catalog-level statistics collection, run the following command:

```
aws glue update-catalog --cli-input-json '{
    "CatalogId": "123456789012",
    "CatalogInput": {
        "Description": "Updating root catalog with role arn",
        "CatalogProperties": {
            "CustomProperties": {
                "ColumnStatistics.RoleArn": "arn:aws:iam::123456789012:role/service-role/AWSGlueServiceRole",
                "ColumnStatistics.Enabled": "true"
            }
        }
    }
}'
```

 The preceding command calls the AWS Glue `UpdateCatalog` operation, which takes a `CatalogProperties` structure with the following key-value pairs for catalog-level statistics generation: 
+ `ColumnStatistics.RoleArn` – ARN of the IAM role used for all tasks triggered by catalog-level statistics generation
+ `ColumnStatistics.Enabled` – Boolean that indicates whether the catalog-level setting is enabled or disabled
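
Hand-editing the role ARN inside a quoted JSON string is error-prone, so one option is to assemble the payload from variables. The sketch below is only an illustration: `build_catalog_input` is a hypothetical helper, not an AWS CLI feature, and the key names `Description`, `CatalogProperties`, and `CustomProperties` are assumed to follow the `UpdateCatalog` request shape. Check them against the AWS CLI reference before relying on them.

```shell
# Hypothetical helper: print a CatalogInput JSON document for catalog-level statistics.
# $1 = AWS account ID, $2 = IAM role name (may include a path), $3 = "true" or "false" (default "true")
build_catalog_input() {
    account_id="$1"
    role_name="$2"
    enabled="${3:-true}"
    # Assemble the ARN from its parts so no stray quotes end up inside the JSON string.
    role_arn="arn:aws:iam::${account_id}:role/${role_name}"
    printf '{"Description": "Updating root catalog with role arn", "CatalogProperties": {"CustomProperties": {"ColumnStatistics.RoleArn": "%s", "ColumnStatistics.Enabled": "%s"}}}\n' \
        "$role_arn" "$enabled"
}

# Example call (requires AWS credentials with the permissions described earlier):
# aws glue update-catalog \
#     --catalog-id 123456789012 \
#     --catalog-input "$(build_catalog_input 123456789012 service-role/AWSGlueServiceRole)"
```

Passing `false` as the third argument produces a payload that sets `ColumnStatistics.Enabled` to `"false"`, which should turn the catalog-level setting off under the same assumptions.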

------