

# Query Apache Iceberg tables
<a name="querying-iceberg"></a>

You can use Athena to perform read, time travel, write, and DDL queries on Apache Iceberg tables.

[Apache Iceberg](https://iceberg.apache.org/) is an open table format for very large analytic datasets. Iceberg manages large collections of files as tables, and it supports modern analytical data lake operations such as record-level insert, update, delete, and time travel queries. The Iceberg specification allows seamless table evolution such as schema and partition evolution and is designed for optimized usage on Amazon S3. Iceberg also helps guarantee data correctness under concurrent write scenarios.

For more information about Apache Iceberg, see [https://iceberg.apache.org/](https://iceberg.apache.org/).

## Considerations and limitations
<a name="querying-iceberg-considerations-and-limitations"></a>

Athena support for Iceberg tables has the following considerations and limitations:
+ **Iceberg version support** – Athena supports Apache Iceberg version 1.4.2. 
+ **Tables registered with Lake Formation** – Athena does not currently support DDL operations on Iceberg tables that are registered with Lake Formation. 
+ **Queries against information schema** – When querying the information schema of Iceberg tables, Athena uses S3 metadata as the source of truth for column metadata. This means that column information is derived from the underlying S3 files rather than from the catalog metadata. This behavior differs from other table formats where catalog metadata might be the primary source for column information.
+ **Tables with AWS Glue catalog only** – Only Iceberg tables created against the AWS Glue catalog based on specifications defined by the [open source glue catalog implementation](https://iceberg.apache.org/docs/latest/aws/#glue-catalog) are supported from Athena.
+ **Table locking support by AWS Glue only** – Unlike the open source Glue catalog implementation, which supports plug-in custom locking, Athena supports AWS Glue optimistic locking only. Using Athena to modify an Iceberg table that uses any other lock implementation can cause data loss and broken transactions.
+ **Supported file formats** – Athena engine version 3 supports the following Iceberg file formats.
  + Parquet
  + ORC
  + Avro
+ **Iceberg restricted metadata** – Lake Formation does not evaluate the Iceberg metadata tables. As a result, the Iceberg metadata tables are restricted if any Lake Formation row or cell filters are present on the base table, or if you do not have permission to view all columns in the base table. In these cases, queries against the `$partitions`, `$files`, `$manifests`, and `$snapshots` Iceberg metadata tables fail with an `AccessDeniedException` error. The `$path` metadata column is subject to the same Lake Formation restrictions and causes a query that selects it to fail. All other metadata tables can be queried regardless of Lake Formation filters. For more information, see [Metadata tables](https://trino.io/docs/current/connector/iceberg.html#metadata-tables).
+ **Iceberg v2 tables** – Athena only creates and operates on Iceberg v2 tables. For the difference between v1 and v2 tables, see [Format version changes](https://iceberg.apache.org/spec/#appendix-e-format-version-changes) in the Apache Iceberg documentation.
+ **Display of time types without time zone** – The time and timestamp without time zone types are displayed in UTC. If the time zone is unspecified in a filter expression on a time column, UTC is used.
+ **Timestamp related data precision** – Although Iceberg supports microsecond precision for the timestamp data type, Athena supports only millisecond precision for timestamps in both reads and writes. For data in time-related columns that is rewritten during manual compaction operations, Athena retains only millisecond precision.
+ **Unsupported operations** – The following Athena operations are not supported for Iceberg tables. 
  + [ALTER TABLE SET LOCATION](alter-table-set-location.md)
+ **Views** – Use `CREATE VIEW` to create Athena views as described in [Work with views](views.md). If you are interested in using the [Iceberg view specification](https://github.com/apache/iceberg/blob/master/format/view-spec.md) to create views, contact [athena-feedback@amazon.com](mailto:athena-feedback@amazon.com). 
+ **TTF management commands not supported in AWS Lake Formation** – Although you can use Lake Formation to manage read access permissions for transactional table formats (TTFs) like Apache Iceberg, Apache Hudi, and Linux Foundation Delta Lake, you cannot use Lake Formation to manage permissions for operations like `VACUUM`, `MERGE`, `UPDATE`, or `OPTIMIZE` with these table formats. For more information about Lake Formation integration with Athena, see [Using AWS Lake Formation with Amazon Athena](https://docs.aws.amazon.com/lake-formation/latest/dg/athena-lf.html) in the *AWS Lake Formation Developer Guide*.
+ **Partitioning by nested fields** – Partitioning by nested fields is not supported. Attempting to do so produces the message NOT_SUPPORTED: Partitioning by nested field is unsupported: *column_name*.*nested_field_name*.
+ **Skipping Amazon S3 Glacier objects not supported** – If objects in the Apache Iceberg table are in an Amazon S3 Glacier storage class, setting the `read_restored_glacier_objects` table property to `false` has no effect.

  For example, suppose you issue the following command:

  ```
  ALTER TABLE table_name SET TBLPROPERTIES ('read_restored_glacier_objects' = 'false')
  ```

  For Iceberg and Delta Lake tables, the command produces the error Unsupported table property key: read_restored_glacier_objects. For Hudi tables, the `ALTER TABLE` command does not produce an error, but Amazon S3 Glacier objects are still not skipped. Running `SELECT` queries after the `ALTER TABLE` command continues to return all objects.

If you would like Athena to support a particular feature, send feedback to [athena-feedback@amazon.com](mailto:athena-feedback@amazon.com).

**Topics**
+ [Considerations and limitations](#querying-iceberg-considerations-and-limitations)
+ [Create Iceberg tables](querying-iceberg-creating-tables.md)
+ [Query Iceberg table data](querying-iceberg-table-data.md)
+ [Perform time travel and version travel queries](querying-iceberg-time-travel-and-version-travel-queries.md)
+ [Update Iceberg table data](querying-iceberg-updating-iceberg-table-data.md)
+ [Manage Iceberg tables](querying-iceberg-managing-tables.md)
+ [Evolve Iceberg table schema](querying-iceberg-evolving-table-schema.md)
+ [Perform other DDL operations on Iceberg tables](querying-iceberg-additional-operations.md)
+ [Optimize Iceberg tables](querying-iceberg-data-optimization.md)
+ [Query AWS Glue Data Catalog materialized views](querying-iceberg-gdc-mv.md)
+ [Supported data types for Iceberg tables in Athena](querying-iceberg-supported-data-types.md)
+ [Additional resources](querying-iceberg-additional-resources.md)

# Create Iceberg tables
<a name="querying-iceberg-creating-tables"></a>

To create an Iceberg table for use in Athena, you can use a `CREATE TABLE` statement as documented on this page, or you can use an AWS Glue crawler.

## Use a CREATE TABLE statement
<a name="querying-iceberg-creating-tables-query-editor"></a>

Athena creates Iceberg v2 tables. For the difference between v1 and v2 tables, see [Format version changes](https://iceberg.apache.org/spec/#appendix-e-format-version-changes) in the Apache Iceberg documentation.

Athena `CREATE TABLE` creates an Iceberg table with no data. You can query a table from external systems such as Apache Spark directly if the table uses the [Iceberg open source glue catalog](https://iceberg.apache.org/docs/latest/aws/#glue-catalog). You do not have to create an external table.

**Warning**  
Running `CREATE EXTERNAL TABLE` results in the error message External keyword not supported for table type ICEBERG. 

To create an Iceberg table from Athena, set the `'table_type'` table property to `'ICEBERG'` in the `TBLPROPERTIES` clause, as in the following syntax summary.

```
CREATE TABLE
  [db_name.]table_name (col_name data_type [COMMENT col_comment] [, ...] )
  [PARTITIONED BY (col_name | transform, ... )]
  LOCATION 's3://amzn-s3-demo-bucket/your-folder/'
  TBLPROPERTIES ( 'table_type' ='ICEBERG' [, property_name=property_value] )
```

For information about the data types that you can query in Iceberg tables, see [Supported data types for Iceberg tables in Athena](querying-iceberg-supported-data-types.md).

### Use partitions
<a name="querying-iceberg-partitioning"></a>

To create Iceberg tables with partitions, use `PARTITIONED BY` syntax. Columns used for partitioning must be specified in the column declarations first. Within the `PARTITIONED BY` clause, the column type must not be included. You can also define [partition transforms](https://iceberg.apache.org/spec/#partition-transforms) in `CREATE TABLE` syntax. To specify multiple columns for partitioning, separate the columns with the comma (`,`) character, as in the following example.

```
CREATE TABLE iceberg_table (id bigint, data string, category string)
  PARTITIONED BY (category, bucket(16, id))
  LOCATION 's3://amzn-s3-demo-bucket/your-folder/'
  TBLPROPERTIES ( 'table_type' = 'ICEBERG' )
```

The following table shows the available partition transform functions.


****  

| Function | Description | Supported types | 
| --- | --- | --- | 
| year(ts) | Partition by year | date, timestamp | 
| month(ts) | Partition by month | date, timestamp | 
| day(ts)  | Partition by day | date, timestamp | 
| hour(ts) | Partition by hour | timestamp | 
| bucket(N, col) | Partition by hashed value mod N buckets. This is the same concept as hash bucketing for Hive tables. | int, long, decimal, date, timestamp, string, binary  | 
| truncate(L, col) | Partition by value truncated to L | int, long, decimal, string | 

Athena supports Iceberg's hidden partitioning. For more information, see [Iceberg's hidden partitioning](https://iceberg.apache.org/docs/latest/partitioning/#icebergs-hidden-partitioning) in the Apache Iceberg documentation.
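
As a sketch that combines the transforms from the table above (the table and column names here are illustrative), the following statement partitions by day of a timestamp and by a truncated string prefix:

```
CREATE TABLE iceberg_events (event_time timestamp, level string, message string)
  PARTITIONED BY (day(event_time), truncate(10, level))
  LOCATION 's3://amzn-s3-demo-bucket/events-folder/'
  TBLPROPERTIES ( 'table_type' = 'ICEBERG' )
```

Because Iceberg partitioning is hidden, queries that filter on `event_time` can prune partitions without referencing a separate partition column.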

### Specify table properties
<a name="querying-iceberg-table-properties"></a>

This section describes table properties that you can specify as key-value pairs in the `TBLPROPERTIES` clause of the `CREATE TABLE` statement. Athena allows only a predefined list of key-value pairs in the table properties for creating or altering Iceberg tables. The following tables show the table properties that you can specify. For more information about the compaction options, see [Optimize Iceberg tables](querying-iceberg-data-optimization.md) in this documentation. If you would like Athena to support a specific open source table configuration property, send feedback to [athena-feedback@amazon.com](mailto:athena-feedback@amazon.com). 

***format***


****  

|  |  | 
| --- |--- |
| Description | File data format | 
| Allowed property values | Supported file format and compression combinations vary by Athena engine version. For more information, see [Use Iceberg table compression](compression-support-iceberg.md). | 
| Default value | parquet | 

***write_compression***


****  

|  |  | 
| --- |--- |
| Description | File compression codec | 
| Allowed property values | Supported file format and compression combinations vary by Athena engine version. For more information, see [Use Iceberg table compression](compression-support-iceberg.md). | 
| Default value |  Default write compression varies by Athena engine version. For more information, see [Use Iceberg table compression](compression-support-iceberg.md).  | 

***optimize_rewrite_data_file_threshold***


****  

|  |  | 
| --- |--- |
| Description | Data optimization specific configuration. If there are fewer data files that require optimization than the given threshold, the files are not rewritten. This allows the accumulation of more data files to produce files closer to the target size and skip unnecessary computation for cost saving. | 
| Allowed property values | A positive number. Must be less than 50. | 
| Default value | 5 | 

***optimize_rewrite_delete_file_threshold***


****  

|  |  | 
| --- |--- |
| Description | Data optimization specific configuration. If there are fewer delete files associated with a data file than the threshold, the data file is not rewritten. This allows the accumulation of more delete files for each data file for cost saving. | 
| Allowed property values | A positive number. Must be less than 50. | 
| Default value | 2 | 

***vacuum_min_snapshots_to_keep***


****  

|  |  | 
| --- |--- |
| Description |  Minimum number of snapshots to retain on a table's main branch. This value takes precedence over the `vacuum_max_snapshot_age_seconds` property. If the minimum remaining snapshots are older than the age specified by `vacuum_max_snapshot_age_seconds`, the snapshots are kept, and the value of `vacuum_max_snapshot_age_seconds` is ignored.  | 
| Allowed property values | A positive number. | 
| Default value | 1 | 

***vacuum_max_snapshot_age_seconds***


****  

|  |  | 
| --- |--- |
| Description | Maximum age of the snapshots to retain on the main branch. This value is ignored if the minimum number of snapshots specified by `vacuum_min_snapshots_to_keep` are older than this age. This table behavior property corresponds to the `history.expire.max-snapshot-age-ms` property in the Apache Iceberg configuration. | 
| Allowed property values | A positive number. | 
| Default value | 432000 seconds (5 days) | 

***vacuum_max_metadata_files_to_keep***


****  

|  |  | 
| --- |--- |
| Description | The maximum number of previous metadata files to retain on the table's main branch. | 
| Allowed property values | A positive number. | 
| Default value | 100 | 

***write_data_path_enabled***


****  

|  |  | 
| --- |--- |
| Description | When set to true, the Iceberg table is created with the `write.data.path` property instead of the deprecated `write.object-storage.path` property. Use this option to ensure compatibility with Iceberg 1.9.0 and later, which no longer supports the deprecated property. | 
| Allowed property values | true, false | 
| Default value | false | 

### Example CREATE TABLE statement
<a name="querying-iceberg-example-create-table-statement"></a>

The following example creates an Iceberg table that has three columns.

```
CREATE TABLE iceberg_table (
  id int,
  data string,
  category string) 
PARTITIONED BY (category, bucket(16,id)) 
LOCATION 's3://amzn-s3-demo-bucket/iceberg-folder' 
TBLPROPERTIES (
  'table_type'='ICEBERG',
  'format'='parquet',
  'write_compression'='snappy',
  'optimize_rewrite_delete_file_threshold'='10'
)
```

## Use CREATE TABLE AS SELECT (CTAS)
<a name="querying-iceberg-creating-tables-ctas"></a>

For information about creating an Iceberg table using the `CREATE TABLE AS` statement, see [CREATE TABLE AS](create-table-as.md), with particular attention to the [CTAS table properties](create-table-as.md#ctas-table-properties) section.
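
As a sketch (the table names here are illustrative; see the linked pages for the authoritative property list), a CTAS statement that produces an Iceberg table sets `table_type` and `is_external` in the `WITH` clause:

```
CREATE TABLE iceberg_ctas_table WITH (
  table_type = 'ICEBERG',
  is_external = false,
  location = 's3://amzn-s3-demo-bucket/ctas-folder/'
) AS SELECT * FROM source_table
```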

## Use an AWS Glue crawler
<a name="querying-iceberg-creating-tables-crawler"></a>

You can use an AWS Glue crawler to automatically register your Iceberg tables into the AWS Glue Data Catalog. If you want to migrate from another Iceberg catalog, you can create and schedule an AWS Glue crawler and provide the Amazon S3 paths where the Iceberg tables are located. You can specify the maximum depth of the Amazon S3 paths that the AWS Glue crawler can traverse. After you schedule an AWS Glue crawler, the crawler extracts schema information and updates the AWS Glue Data Catalog with the schema changes every time it runs. The AWS Glue crawler supports schema merging across snapshots and updates the latest metadata file location in the AWS Glue Data Catalog. For more information, see [Data Catalog and crawlers in AWS Glue](https://docs.aws.amazon.com/glue/latest/dg/catalog-and-crawler.html). 

# Query Iceberg table data
<a name="querying-iceberg-table-data"></a>

To query an Iceberg dataset, use a standard `SELECT` statement like the following. Queries follow the Apache Iceberg [format v2 spec](https://iceberg.apache.org/spec/#format-versioning) and perform merge-on-read of both position and equality deletes.

```
SELECT * FROM [db_name.]table_name [WHERE predicate]
```

To optimize query times, all predicates are pushed down to where the data lives.

For information about time travel and version travel queries, see [Perform time travel and version travel queries](querying-iceberg-time-travel-and-version-travel-queries.md).

## Create and query views with Iceberg tables
<a name="querying-iceberg-views"></a>

To create and query Athena views on Iceberg tables, use `CREATE VIEW` as described in [Work with views](views.md).

Example:

```
CREATE VIEW view1 AS SELECT * FROM iceberg_table
```

```
SELECT * FROM view1 
```

If you are interested in using the [Iceberg view specification](https://github.com/apache/iceberg/blob/master/format/view-spec.md) to create views, contact [athena-feedback@amazon.com](mailto:athena-feedback@amazon.com). 

## Query Iceberg table metadata
<a name="querying-iceberg-table-metadata"></a>

In a `SELECT` query, you can use the following properties after *table_name* to query Iceberg table metadata:
+ **$files** – Shows a table's current data files.
+ **$manifests** – Shows a table's current file manifests.
+ **$history** – Shows a table's history.
+ **$partitions** – Shows a table's current partitions.
+ **$snapshots** – Shows a table's snapshots.
+ **$refs** – Shows a table's references.

### Examples
<a name="querying-iceberg-table-metadata-syntax"></a>

The following statement lists the files for an Iceberg table.

```
SELECT * FROM "dbname"."tablename$files"
```

The following statement lists the manifests for an Iceberg table.

```
SELECT * FROM "dbname"."tablename$manifests" 
```

The following statement shows the history for an Iceberg table.

```
SELECT * FROM "dbname"."tablename$history"
```

The following example shows the partitions for an Iceberg table.

```
SELECT * FROM "dbname"."tablename$partitions" 
```

The following example lists the snapshots for an Iceberg table.

```
SELECT * FROM "dbname"."tablename$snapshots" 
```

The following example shows the references for an Iceberg table.

```
SELECT * FROM "dbname"."tablename$refs" 
```

## Use Lake Formation fine-grained access control
<a name="querying-iceberg-working-with-lf-fgac"></a>

Athena engine version 3 supports Lake Formation fine-grained access control with Iceberg tables, including column level and row level security access control. This access control works with time travel queries and with tables that have performed schema evolution. For more information, see [Lake Formation fine-grained access control and Athena workgroups](lf-athena-limitations.md#lf-athena-limitations-fine-grained-access-control).

If you created your Iceberg table outside of Athena, use [Apache Iceberg SDK](https://iceberg.apache.org/releases/) version 0.13.0 or higher so that your Iceberg table column information is populated in the AWS Glue Data Catalog. If your Iceberg table does not contain column information in AWS Glue, you can use the Athena [ALTER TABLE SET TBLPROPERTIES](querying-iceberg-alter-table-set-properties.md) statement or the latest Iceberg SDK to fix the table and update the column information in AWS Glue. 

# Perform time travel and version travel queries
<a name="querying-iceberg-time-travel-and-version-travel-queries"></a>

Each Apache Iceberg table maintains a versioned manifest of the Amazon S3 objects that it contains. Previous versions of the manifest can be used for time travel and version travel queries.

Time travel queries in Athena query Amazon S3 for historical data from a consistent snapshot as of a specified date and time. Version travel queries in Athena query Amazon S3 for historical data as of a specified snapshot ID.

## Time travel queries
<a name="querying-iceberg-time-travel-queries"></a>

To run a time travel query, use `FOR TIMESTAMP AS OF timestamp` after the table name in the `SELECT` statement, as in the following example.

```
SELECT * FROM iceberg_table FOR TIMESTAMP AS OF timestamp
```

The system time to travel to is specified either as a timestamp or as a timestamp with a time zone. If no time zone is specified, Athena treats the value as a timestamp in UTC.

The following example time travel queries select CloudTrail data for the specified date and time.

```
SELECT * FROM iceberg_table FOR TIMESTAMP AS OF TIMESTAMP '2020-01-01 10:00:00 UTC'
```

```
SELECT * FROM iceberg_table FOR TIMESTAMP AS OF (current_timestamp - interval '1' day)
```
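
Because the value can also be a timestamp with a time zone, you can name a zone explicitly rather than relying on the UTC default, as in the following illustrative example:

```
SELECT * FROM iceberg_table FOR TIMESTAMP AS OF TIMESTAMP '2020-01-01 10:00:00 America/Los_Angeles'
```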

## Version travel queries
<a name="querying-iceberg-version-travel-queries"></a>

To execute a version travel query (that is, view a consistent snapshot as of a specified version), use `FOR VERSION AS OF version` after the table name in the `SELECT` statement, as in the following example.

```
SELECT * FROM [db_name.]table_name FOR VERSION AS OF version         
```

The *version* parameter is the `bigint` snapshot ID associated with an Iceberg table version.

The following example version travel query selects data for the specified version.

```
SELECT * FROM iceberg_table FOR VERSION AS OF 949530903748831860
```

**Note**  
The `FOR SYSTEM_TIME AS OF` and `FOR SYSTEM_VERSION AS OF` clauses in Athena engine version 2 have been replaced by the `FOR TIMESTAMP AS OF` and `FOR VERSION AS OF` clauses in Athena engine version 3.

### Retrieve the snapshot ID
<a name="querying-iceberg-table-snapshot-id"></a>

You can use the Java [SnapshotUtil](https://iceberg.apache.org/javadoc/1.6.0/org/apache/iceberg/util/SnapshotUtil.html) class provided by Iceberg to retrieve the Iceberg snapshot ID, as in the following example.

```
import org.apache.iceberg.Table;
import org.apache.iceberg.aws.glue.GlueCatalog;
import org.apache.iceberg.catalog.Catalog;
import org.apache.iceberg.catalog.TableIdentifier;
import org.apache.iceberg.util.SnapshotUtil;

import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

Catalog catalog = new GlueCatalog();

Map<String, String> properties = new HashMap<String, String>();
properties.put("warehouse", "s3://amzn-s3-demo-bucket/my-folder");
catalog.initialize("my_catalog", properties);

Date date = new SimpleDateFormat("yyyy/MM/dd HH:mm:ss").parse("2022/01/01 00:00:00");
long millis = date.getTime();

TableIdentifier name = TableIdentifier.of("db", "table");
Table table = catalog.loadTable(name);
long oldestSnapshotIdAfter2022 = SnapshotUtil.oldestAncestorAfter(table, millis);
```
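
Alternatively, you can retrieve snapshot IDs from Athena itself by querying the `$snapshots` metadata table described in [Query Iceberg table metadata](querying-iceberg-table-data.md#querying-iceberg-table-metadata). The following sketch assumes the `snapshot_id` and `committed_at` columns exposed by that metadata table:

```
SELECT snapshot_id, committed_at FROM "dbname"."tablename$snapshots" ORDER BY committed_at DESC
```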

## Combine time and version travel
<a name="querying-iceberg-combining-time-and-version-travel"></a>

You can use time travel and version travel syntax in the same query to specify different timing and versioning conditions, as in the following example.

```
SELECT table1.*, table2.* FROM 
  [db_name.]table_name FOR TIMESTAMP AS OF (current_timestamp - interval '1' day) AS table1 
  FULL JOIN 
  [db_name.]table_name FOR VERSION AS OF 5487432386996890161 AS table2 
  ON table1.ts = table2.ts 
  WHERE (table1.id IS NULL OR table2.id IS NULL)
```

# Update Iceberg table data
<a name="querying-iceberg-updating-iceberg-table-data"></a>

You can manage Iceberg table data directly on Athena by using `INSERT`, `UPDATE`, and `DELETE` queries. Each data management transaction produces a new snapshot, which can be queried using time travel. The `UPDATE` and `DELETE` statements follow the Iceberg format v2 row-level [position delete](https://iceberg.apache.org/spec/#position-delete-files) specification and enforce snapshot isolation.

**Note**  
Athena SQL does not currently support the copy-on-write approach. The `UPDATE`, `MERGE INTO`, and `DELETE FROM` operations always use the merge-on-read approach with positional deletes, regardless of the table properties you specify. If you have set table properties such as `write.update.mode`, `write.merge.mode`, or `write.delete.mode` to use copy-on-write, your queries do not fail; Athena ignores these properties and continues to use merge-on-read.

Use the following commands to perform data management operations on Iceberg tables.

**Topics**
+ [INSERT INTO](querying-iceberg-insert-into.md)
+ [DELETE](querying-iceberg-delete.md)
+ [UPDATE](querying-iceberg-update.md)
+ [MERGE INTO](querying-iceberg-merge-into.md)

# INSERT INTO
<a name="querying-iceberg-insert-into"></a>

Inserts data into an Iceberg table. Athena Iceberg `INSERT INTO` is charged by the amount of data scanned, the same as `INSERT INTO` queries for external Hive tables. To insert data into an Iceberg table, use the following syntax, where *query* can be either `VALUES (val1, val2, ...)` or `SELECT (col1, col2, …) FROM [db_name.]table_name WHERE predicate`. For SQL syntax and semantic details, see [INSERT INTO](insert-into.md).

```
INSERT INTO [db_name.]table_name [(col1, col2, …)] query
```

The following examples insert values into the table `iceberg_table`.

```
INSERT INTO iceberg_table VALUES (1,'a','c1')
```

```
INSERT INTO iceberg_table (col1, col2, ...) VALUES (val1, val2, ...)
```

```
INSERT INTO iceberg_table SELECT * FROM another_table
```

# DELETE
<a name="querying-iceberg-delete"></a>

Athena Iceberg `DELETE` writes Iceberg position delete files to a table. This is known as a merge-on-read delete. In contrast to a copy-on-write delete, a merge-on-read delete is more efficient because it does not rewrite file data. When Athena reads Iceberg data, it merges the Iceberg position delete files with data files to produce the latest view of a table. To remove these position delete files, you can run the [REWRITE DATA compaction action](querying-iceberg-data-optimization.md#querying-iceberg-data-optimization-rewrite-data-action). `DELETE` operations are charged by the amount of data scanned. For syntax, see [DELETE](delete-statement.md).

The following example deletes rows from `iceberg_table` that have `c3` as the value for `category`.

```
DELETE FROM iceberg_table WHERE category='c3'
```

# UPDATE
<a name="querying-iceberg-update"></a>

Athena Iceberg `UPDATE` writes Iceberg position delete files and newly updated rows as data files in the same transaction. `UPDATE` can be imagined as a combination of `INSERT INTO` and `DELETE`. `UPDATE` operations are charged by the amount of data scanned. For syntax, see [UPDATE](update-statement.md).

The following example updates the specified values in the table `iceberg_table`.

```
UPDATE iceberg_table SET category='c2' WHERE category='c1'
```

# MERGE INTO
<a name="querying-iceberg-merge-into"></a>

Conditionally updates, deletes, or inserts rows into an Iceberg table. A single statement can combine update, delete, and insert actions. For syntax, see [MERGE INTO](merge-into-statement.md).

**Note**  
`MERGE INTO` is transactional and is supported only for Apache Iceberg tables in Athena engine version 3.

The following example deletes all customers from table `t` that are in the source table `s`.

```
MERGE INTO accounts t USING monthly_accounts_update s
ON t.customer = s.customer
WHEN MATCHED
THEN DELETE
```

The following example updates target table `t` with customer information from source table `s`. For customer rows in table `t` that have matching customer rows in table `s`, the example increments the purchases in table t. If table `t` has no match for a customer row in table `s`, the example inserts the customer row from table `s` into table `t`.

```
MERGE INTO accounts t USING monthly_accounts_update s
    ON (t.customer = s.customer)
    WHEN MATCHED
        THEN UPDATE SET purchases = s.purchases + t.purchases
    WHEN NOT MATCHED
        THEN INSERT (customer, purchases, address)
              VALUES(s.customer, s.purchases, s.address)
```

The following example conditionally updates target table `t` with information from the source table `s`. The example deletes any matching target row for which the source address is Centreville. For all other matching rows, the example adds the source purchases and sets the target address to the source address. If there is no match in the target table, the example inserts the row from the source table.

```
MERGE INTO accounts t USING monthly_accounts_update s
    ON (t.customer = s.customer)
    WHEN MATCHED AND s.address = 'Centreville'
        THEN DELETE
    WHEN MATCHED
        THEN UPDATE
            SET purchases = s.purchases + t.purchases, address = s.address
    WHEN NOT MATCHED
        THEN INSERT (customer, purchases, address)
              VALUES(s.customer, s.purchases, s.address)
```

# Manage Iceberg tables
<a name="querying-iceberg-managing-tables"></a>

Athena supports the following table DDL operations for Iceberg tables.

**Topics**
+ [ALTER TABLE RENAME](querying-iceberg-alter-table-rename.md)
+ [ALTER TABLE SET TBLPROPERTIES](querying-iceberg-alter-table-set-properties.md)
+ [ALTER TABLE UNSET TBLPROPERTIES](querying-iceberg-alter-table-unset-properties.md)
+ [DESCRIBE](querying-iceberg-describe-table.md)
+ [DROP TABLE](querying-iceberg-drop-table.md)
+ [SHOW CREATE TABLE](querying-iceberg-show-create-table.md)
+ [SHOW TBLPROPERTIES](querying-iceberg-show-table-properties.md)

# ALTER TABLE RENAME
<a name="querying-iceberg-alter-table-rename"></a>

Renames a table.

Because the table metadata of an Iceberg table is stored in Amazon S3, you can update the database and table name of an Iceberg managed table without affecting underlying table information.

## Synopsis
<a name="querying-iceberg-alter-table-rename-synopsis"></a>

```
ALTER TABLE [db_name.]table_name RENAME TO [new_db_name.]new_table_name
```

## Example
<a name="querying-iceberg-alter-table-rename-example"></a>

```
ALTER TABLE my_db.my_table RENAME TO my_db2.my_table2
```

# ALTER TABLE SET TBLPROPERTIES
<a name="querying-iceberg-alter-table-set-properties"></a>

Adds properties to an Iceberg table and sets their assigned values.

In accordance with [Iceberg specifications](https://iceberg.apache.org/spec/#table-metadata-fields), table properties are stored in the Iceberg table metadata file rather than in AWS Glue. Athena does not accept custom table properties. Refer to the [Specify table properties](querying-iceberg-creating-tables.md#querying-iceberg-table-properties) section for allowed key-value pairs. You can also use `ALTER TABLE SET TBLPROPERTIES` and `ALTER TABLE UNSET TBLPROPERTIES` to set or remove the `write.data.path` and `write.object-storage.path` Iceberg table properties. If you would like Athena to support a specific open source table configuration property, send feedback to [athena-feedback@amazon.com](mailto:athena-feedback@amazon.com).

## Synopsis
<a name="querying-iceberg-alter-table-set-properties-synopsis"></a>

```
ALTER TABLE [db_name.]table_name SET TBLPROPERTIES ('property_name' = 'property_value' [ , ... ])
```

## Example
<a name="querying-iceberg-alter-table-set-properties-example"></a>

```
ALTER TABLE iceberg_table SET TBLPROPERTIES (
  'format'='parquet',
  'write_compression'='snappy',
  'optimize_rewrite_delete_file_threshold'='10'
)
```

The following example sets the `write.data.path` property on an existing Iceberg table.

```
ALTER TABLE iceberg_table SET TBLPROPERTIES (
  'write.data.path'='s3://amzn-s3-demo-bucket/your-folder/data'
)
```

# ALTER TABLE UNSET TBLPROPERTIES
<a name="querying-iceberg-alter-table-unset-properties"></a>

Drops existing properties from an Iceberg table.

## Synopsis
<a name="querying-iceberg-alter-table-unset-properties-synopsis"></a>

```
ALTER TABLE [db_name.]table_name UNSET TBLPROPERTIES ('property_name' [ , ... ])
```

## Example
<a name="querying-iceberg-alter-table-unset-properties-example"></a>

```
ALTER TABLE iceberg_table UNSET TBLPROPERTIES ('write_compression')
```

The following example removes the `write.data.path` property from an Iceberg table.

```
ALTER TABLE iceberg_table UNSET TBLPROPERTIES ('write.data.path')
```

# DESCRIBE
<a name="querying-iceberg-describe-table"></a>

Describes table information.

## Synopsis
<a name="querying-iceberg-describe-table-synopsis"></a>

```
DESCRIBE [FORMATTED] [db_name.]table_name
```

When the `FORMATTED` option is specified, the output displays additional information such as table location and properties.

## Example
<a name="querying-iceberg-describe-table-example"></a>

```
DESCRIBE iceberg_table
```
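The following example uses the `FORMATTED` option to include additional details, such as the table location and properties, in the output.

```
DESCRIBE FORMATTED iceberg_table
```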

# DROP TABLE
<a name="querying-iceberg-drop-table"></a>

Drops an Iceberg table.

**Warning**  
Because Iceberg tables are considered managed tables in Athena, dropping an Iceberg table also removes all the data in the table.

## Synopsis
<a name="querying-iceberg-drop-table-synopsis"></a>

```
DROP TABLE [IF EXISTS] [db_name.]table_name
```

## Example
<a name="querying-iceberg-drop-table-example"></a>

```
DROP TABLE iceberg_table
```

# SHOW CREATE TABLE
<a name="querying-iceberg-show-create-table"></a>

Displays a `CREATE TABLE` DDL statement that can be used to recreate the Iceberg table in Athena. If Athena cannot reproduce the table structure (for example, because custom table properties are specified in the table), an UNSUPPORTED error is thrown.

## Synopsis
<a name="querying-iceberg-show-create-table-synopsis"></a>

```
SHOW CREATE TABLE [db_name.]table_name
```

## Example
<a name="querying-iceberg-show-create-table-example"></a>

```
SHOW CREATE TABLE iceberg_table
```

# SHOW TBLPROPERTIES
<a name="querying-iceberg-show-table-properties"></a>

Shows one or more table properties of an Iceberg table. Only Athena-supported table properties are shown.

## Synopsis
<a name="querying-iceberg-show-table-properties-synopsis"></a>

```
SHOW TBLPROPERTIES [db_name.]table_name [('property_name')]
```

## Example
<a name="querying-iceberg-show-table-properties-example"></a>

```
SHOW TBLPROPERTIES iceberg_table
```

# Evolve Iceberg table schema
<a name="querying-iceberg-evolving-table-schema"></a>

Iceberg schema updates are metadata-only changes. No data files are changed when you perform a schema update. 

The Iceberg format supports the following schema evolution changes:
+ **Add** – Adds a new column to a table or to a nested `struct`.
+ **Drop** – Removes an existing column from a table or nested `struct`.
+ **Rename** – Renames an existing column or field in a nested `struct`.
+ **Reorder** – Changes the order of columns.
+  **Type promotion** – Widens the type of a column, `struct` field, `map` key, `map` value, or `list` element. Currently, the following cases are supported for Iceberg tables: 
  + integer to big integer
  + float to double
  + increasing the precision of a decimal type
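For example, a type promotion such as widening an `int` column to `bigint` is a metadata-only change that you can apply with `ALTER TABLE CHANGE COLUMN`. The table and column names here are illustrative; note that the old and new column names are both specified even when the name does not change.

```
ALTER TABLE iceberg_table CHANGE COLUMN userid userid bigint
```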

You can use the DDL statements in this section to modify Iceberg table schema.

**Topics**
+ [ALTER TABLE ADD COLUMNS](querying-iceberg-alter-table-add-columns.md)
+ [ALTER TABLE DROP COLUMN](querying-iceberg-alter-table-drop-column.md)
+ [ALTER TABLE CHANGE COLUMN](querying-iceberg-alter-table-change-column.md)
+ [SHOW COLUMNS](querying-iceberg-show-columns.md)

# ALTER TABLE ADD COLUMNS
<a name="querying-iceberg-alter-table-add-columns"></a>

Adds one or more columns to an existing Iceberg table.

## Synopsis
<a name="querying-iceberg-alter-table-add-columns-synopsis"></a>

```
ALTER TABLE [db_name.]table_name ADD COLUMNS (col_name data_type [,...])
```

## Examples
<a name="querying-iceberg-alter-table-add-columns-example"></a>

The following example adds a `comment` column of type `string` to an Iceberg table.

```
ALTER TABLE iceberg_table ADD COLUMNS (comment string)
```

The following example adds a `point` column of type `struct` to an Iceberg table.

```
ALTER TABLE iceberg_table 
ADD COLUMNS (point struct<x: double, y: double>)
```

The following example adds a `points` column that is an array of structs to an Iceberg table.

```
ALTER TABLE iceberg_table 
ADD COLUMNS (points array<struct<x: double, y: double>>)
```

# ALTER TABLE DROP COLUMN
<a name="querying-iceberg-alter-table-drop-column"></a>

Drops a column from an existing Iceberg table.

## Synopsis
<a name="querying-iceberg-alter-table-drop-column-synopsis"></a>

```
ALTER TABLE [db_name.]table_name DROP COLUMN col_name
```

## Example
<a name="querying-iceberg-alter-table-drop-column-example"></a>

```
ALTER TABLE iceberg_table DROP COLUMN userid
```

# ALTER TABLE CHANGE COLUMN
<a name="querying-iceberg-alter-table-change-column"></a>

Changes the name, type, order, or comment of a column in an Iceberg table.

**Note**  
`ALTER TABLE REPLACE COLUMNS` is not supported for Iceberg tables because it removes all existing columns before adding new ones. Use `CHANGE COLUMN` instead for schema evolution. 

## Synopsis
<a name="querying-iceberg-alter-table-change-column-synopsis"></a>

```
ALTER TABLE [db_name.]table_name
  CHANGE [COLUMN] col_old_name col_new_name column_type 
  [COMMENT col_comment] [FIRST|AFTER column_name]
```

## Example
<a name="querying-iceberg-alter-table-change-column-example"></a>

```
ALTER TABLE iceberg_table CHANGE comment blog_comment string AFTER id
```

# SHOW COLUMNS
<a name="querying-iceberg-show-columns"></a>

Shows the columns in a table.

## Synopsis
<a name="querying-iceberg-show-columns-synopsis"></a>

```
SHOW COLUMNS (FROM|IN) [db_name.]table_name
```

## Example
<a name="querying-iceberg-show-columns-example"></a>

```
SHOW COLUMNS FROM iceberg_table
```

# Perform other DDL operations on Iceberg tables
<a name="querying-iceberg-additional-operations"></a>

In addition to the schema evolution operations described in [Evolve Iceberg table schema](querying-iceberg-evolving-table-schema.md), you can also perform the following DDL operations on Apache Iceberg tables in Athena.

## Database level operations
<a name="querying-iceberg-additional-operations-database-level-operations"></a>

When you use [DROP DATABASE](drop-database.md) with the `CASCADE` option, any Iceberg table data is also removed. The following DDL operations have no effect on Iceberg tables.
+ [CREATE DATABASE](create-database.md)
+ [ALTER DATABASE SET DBPROPERTIES](alter-database-set-dbproperties.md)
+ [SHOW DATABASES](show-databases.md)
+ [SHOW TABLES](show-tables.md)
+ [SHOW VIEWS](show-views.md)

## Partition related operations
<a name="querying-iceberg-additional-operations-partition-related-operations"></a>

Because Iceberg tables use [hidden partitioning](https://iceberg.apache.org/docs/latest/partitioning/#icebergs-hidden-partitioning), you do not have to work with physical partitions directly. As a result, Iceberg tables in Athena do not support the following partition-related DDL operations:
+ [SHOW PARTITIONS](show-partitions.md)
+ [ALTER TABLE ADD PARTITION](alter-table-add-partition.md)
+ [ALTER TABLE DROP PARTITION](alter-table-drop-partition.md)
+ [ALTER TABLE RENAME PARTITION](alter-table-rename-partition.md)

If you would like to see Iceberg [partition evolution](https://iceberg.apache.org/docs/latest/evolution/#partition-evolution) in Athena, send feedback to [athena-feedback@amazon.com](mailto:athena-feedback@amazon.com).
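Although partition DDL operations are not supported, you define hidden partitioning when you create an Iceberg table by using a partition transform in the `PARTITIONED BY` clause. The table name, column names, and bucket location in the following sketch are illustrative.

```
CREATE TABLE events_iceberg (
  event_id string,
  event_ts timestamp
)
PARTITIONED BY (day(event_ts))
LOCATION 's3://amzn-s3-demo-bucket/events/'
TBLPROPERTIES ('table_type'='ICEBERG')
```

Because the partitioning is derived from the `day()` transform at write time, queries that filter on `event_ts` prune partitions automatically without any partition maintenance DDL.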

## Unload Iceberg tables
<a name="querying-iceberg-additional-operations-unload-iceberg-table"></a>

Iceberg tables can be unloaded to files in a folder on Amazon S3. For information, see [UNLOAD](unload.md).

## MSCK REPAIR
<a name="querying-iceberg-additional-operations-msck-repair"></a>

Because Iceberg tables keep track of table layout information, running [MSCK REPAIR TABLE](msck-repair-table.md) as one does with Hive tables is not necessary and is not supported.

# Optimize Iceberg tables
<a name="querying-iceberg-data-optimization"></a>

Athena provides several optimization features to improve query performance on Apache Iceberg tables. As data accumulates, queries can become less efficient due to increased file processing overhead and the computational cost of applying row-level deletes stored in Iceberg delete files. To address these challenges, Athena supports manual compaction and vacuum operations to optimize table structure. Athena also works with Iceberg statistics to enable cost-based query optimization and Parquet column indexing for precise data pruning during query execution. These features work together to reduce query execution time, minimize data scanning, and lower costs. This topic describes how to use these optimization capabilities to maintain high-performance queries on your Iceberg tables.

## OPTIMIZE
<a name="querying-iceberg-data-optimization-rewrite-data-action"></a>

The `OPTIMIZE table REWRITE DATA` compaction action rewrites data files into a more optimized layout based on their size and number of associated delete files. For syntax and table property details, see [OPTIMIZE](optimize-statement.md).

### Example
<a name="querying-iceberg-data-optimization-example"></a>

The following example merges delete files into data files and produces files near the targeted file size where the value of `category` is `c1`.

```
OPTIMIZE iceberg_table REWRITE DATA USING BIN_PACK
  WHERE category = 'c1'
```
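To compact all files in the table rather than a filtered subset, omit the `WHERE` clause.

```
OPTIMIZE iceberg_table REWRITE DATA USING BIN_PACK
```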

## VACUUM
<a name="querying-iceberg-vacuum"></a>

`VACUUM` performs [snapshot expiration](https://iceberg.apache.org/docs/latest/spark-procedures/#expire_snapshots) and [orphan file removal](https://iceberg.apache.org/docs/latest/spark-procedures/#remove_orphan_files). These actions reduce metadata size and remove files not in the current table state that are also older than the retention period specified for the table. For syntax details, see [VACUUM](vacuum-statement.md).

### Example
<a name="querying-iceberg-vacuum-example"></a>

The following example uses a table property to configure the table `iceberg_table` to retain the last three days of data, then uses `VACUUM` to expire the old snapshots and remove the orphan files from the table.

```
ALTER TABLE iceberg_table SET TBLPROPERTIES (
  'vacuum_max_snapshot_age_seconds'='259200'
)

VACUUM iceberg_table
```

## Use Iceberg table statistics
<a name="querying-iceberg-data-optimization-statistics"></a>

Athena's cost-based optimizer uses Iceberg statistics to produce optimal query plans. When statistics have been generated for your Iceberg tables, Athena automatically uses this information to make intelligent decisions about join ordering, filters, and aggregation behavior, often improving query performance and reducing costs.

Iceberg statistics are turned on by default when you use S3 Tables. For other Iceberg tables, Athena uses the table property `use_iceberg_statistics` to determine whether to leverage statistics for cost-based optimization. To get started, see [Optimizing query performance using column statistics](https://docs.aws.amazon.com/glue/latest/dg/column-statistics.html) in the *AWS Glue User Guide* or use the [Athena console](https://docs.aws.amazon.com/athena/latest/ug/cost-based-optimizer.html) to generate on-demand statistics on your Iceberg tables.
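As a sketch, you might set the `use_iceberg_statistics` property with `ALTER TABLE SET TBLPROPERTIES`. The property name comes from this page, but the `'true'` value shown here is an assumption; check the table properties reference for the accepted values.

```
ALTER TABLE iceberg_table SET TBLPROPERTIES (
  'use_iceberg_statistics'='true'
)
```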

## Use Parquet column indexing
<a name="querying-iceberg-data-optimization-parquet-column-indexing"></a>

Parquet column indexing makes it possible for Athena to perform more precise data pruning during query execution by leveraging page-level min/max statistics in addition to row group-level statistics. This allows Athena to skip unnecessary pages within row groups, significantly reducing the amount of data scanned and improving query performance. It works best for queries with selective filter predicates on sorted columns, improving both execution time and data scan efficiency while reducing the amount of data Athena needs to read from Amazon S3.

Athena uses Parquet column indexes by default with S3 Tables if column indexes are present in the underlying Parquet files. For other Iceberg tables, Athena uses the `use_iceberg_parquet_column_index` property to determine whether to utilize the column indexes in the Parquet file. Set this table property using the AWS Glue console or `UpdateTable` API.
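Because `UpdateTable` requires a complete `TableInput`, a common pattern is to read the table definition with `GetTable`, copy the writable fields, set the property, and write the definition back. The following Python sketch uses boto3; the database and table names are placeholders, and the set of read-only fields to strip is illustrative and may not be exhaustive.

```python
# Read-only fields returned by GetTable that must not be echoed back in the
# TableInput for UpdateTable. Illustrative; may not be exhaustive.
READ_ONLY_FIELDS = {
    "DatabaseName", "CreateTime", "UpdateTime", "CreatedBy",
    "IsRegisteredWithLakeFormation", "CatalogId", "VersionId",
}

def to_table_input(table, property_name, property_value):
    """Copy the writable fields of a GetTable result and set one table parameter."""
    table_input = {k: v for k, v in table.items() if k not in READ_ONLY_FIELDS}
    parameters = dict(table_input.get("Parameters", {}))
    parameters[property_name] = property_value
    table_input["Parameters"] = parameters
    return table_input

def set_parquet_column_index(database, table_name, enabled=True):
    """Enable or disable Parquet column index usage on an Iceberg table."""
    import boto3  # imported here so the helper above stays dependency-free
    glue = boto3.client("glue")
    table = glue.get_table(DatabaseName=database, Name=table_name)["Table"]
    glue.update_table(
        DatabaseName=database,
        TableInput=to_table_input(
            table, "use_iceberg_parquet_column_index",
            "true" if enabled else "false",
        ),
    )

# Example call (hypothetical names):
# set_parquet_column_index("my_database", "iceberg_table")
```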

# Query AWS Glue Data Catalog materialized views
<a name="querying-iceberg-gdc-mv"></a>

Athena allows you to query AWS Glue Data Catalog materialized views. Glue Data Catalog materialized views store pre-computed results of SQL queries as Apache Iceberg tables.

When you create Glue Data Catalog materialized views using Apache Spark in Amazon EMR or AWS Glue, the view definitions and metadata are stored in the AWS Glue Data Catalog. The pre-computed results are stored as Apache Iceberg tables in Amazon S3. You can query these materialized views from Athena using standard SQL `SELECT` statements, just as you would query regular Iceberg tables.

## Prerequisites
<a name="querying-iceberg-gdc-mv-prerequisites"></a>

Before you query materialized views in Athena, ensure the following:
+ The materialized view exists in the AWS Glue Data Catalog and was created using Apache Spark (Amazon EMR release 7.12.0 or later, or AWS Glue version 5.1 or later)
+ You have the following AWS Lake Formation permissions:
  + `SELECT` permission on the materialized view
  + `DESCRIBE` permission on the materialized view
  + Access to the underlying Amazon S3 location where the materialized view data is stored
+ The materialized view's underlying data is stored in Amazon S3 Table buckets or Amazon S3 general purpose buckets
+ You have access to the AWS Glue Data Catalog database containing the materialized view
+ For materialized views stored in Amazon S3 Tables buckets, ensure your IAM role has the necessary permissions to access the S3 Tables catalog.

## Considerations and limitations
<a name="querying-iceberg-gdc-mv-considerations"></a>
+ Athena does not support the following operations on materialized views: `ALTER`, `CREATE MATERIALIZED VIEW`, `REFRESH MATERIALIZED VIEW`, `DROP`, `INSERT`, `UPDATE`, `MERGE`, `DELETE`, `OPTIMIZE`, `VACUUM`. To create materialized views, use Apache Spark in Amazon EMR or AWS Glue. Refresh operations must be performed through the AWS Glue Data Catalog API or Apache Spark. Modify materialized views using Apache Spark.

## Querying materialized views
<a name="querying-iceberg-gdc-mv-operations"></a>

Athena treats materialized views as standard Iceberg tables for read operations, allowing you to access the pre-computed data without requiring special syntax or configuration changes.

To query a materialized view in Athena, use standard `SELECT` statements:

```
SELECT * FROM my_database.sales_summary_mv;
```

You can apply filters, aggregations, and joins just as you would with regular tables:

```
SELECT
  region,
  SUM(total_sales) as sales_total
FROM my_database.sales_summary_mv
WHERE year = 2025
GROUP BY region
ORDER BY sales_total DESC;
```

## Supported operations
<a name="querying-iceberg-gdc-mv-supported"></a>

Athena supports the following operations on materialized views:
+ `SELECT` queries – Read data from materialized views using standard SQL `SELECT` statements
+ `DESCRIBE` – View the schema and metadata of materialized views
+ `SHOW TABLES` – List materialized views along with other tables in a database
+ `JOIN` operations – Join materialized views with other tables or views
+ Filtering and aggregation – Apply `WHERE` clauses, `GROUP BY`, and aggregate functions

# Supported data types for Iceberg tables in Athena
<a name="querying-iceberg-supported-data-types"></a>

Athena can query Iceberg tables that contain the following data types:

```
binary
boolean
date
decimal
double
float
int
list
long
map
string
struct
timestamp without time zone
```

For more information about Iceberg table types, see the [schemas page for Iceberg](https://iceberg.apache.org/docs/latest/schemas/) in the Apache documentation.

The following table shows the relationship between Athena data types and Iceberg table data types.


For the full mapping, see [the AWS documentation website](http://docs.aws.amazon.com/athena/latest/ug/querying-iceberg-supported-data-types.html).

For more information about data types in Athena, see [Data types in Amazon Athena](data-types.md).
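In Athena DDL, you declare columns with Athena type names, which map to the Iceberg types listed above (for example, Athena `bigint` maps to Iceberg `long`, and Athena `array` maps to Iceberg `list`). The following sketch creates an Iceberg table that exercises several of these types; the table name, column names, and bucket location are illustrative.

```
CREATE TABLE typed_iceberg (
  id bigint,
  price decimal(10,2),
  created timestamp,
  tags array<string>,
  attrs map<string, string>,
  location struct<x: double, y: double>
)
LOCATION 's3://amzn-s3-demo-bucket/typed_iceberg/'
TBLPROPERTIES ('table_type'='ICEBERG')
```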

# Additional resources
<a name="querying-iceberg-additional-resources"></a>

The following article is in the AWS Prescriptive Guidance documentation.
+ [Working with Apache Iceberg tables by using Amazon Athena SQL](https://docs.aws.amazon.com/prescriptive-guidance/latest/apache-iceberg-on-aws/iceberg-athena.html)

For in-depth articles on using Athena with Apache Iceberg tables, see the following posts in the *AWS Big Data Blog*.
+ [Implement a serverless CDC process with Apache Iceberg using Amazon DynamoDB and Amazon Athena](https://aws.amazon.com/blogs/big-data/implement-a-serverless-cdc-process-with-apache-iceberg-using-amazon-dynamodb-and-amazon-athena/)
+ [Accelerate data science feature engineering on transactional data lakes using Amazon Athena with Apache Iceberg](https://aws.amazon.com/blogs/big-data/accelerate-data-science-feature-engineering-on-transactional-data-lakes-using-amazon-athena-with-apache-iceberg/)
+ [Build an Apache Iceberg data lake using Amazon Athena, Amazon EMR, and AWS Glue](https://aws.amazon.com/blogs/big-data/build-an-apache-iceberg-data-lake-using-amazon-athena-amazon-emr-and-aws-glue/)
+ [Perform upserts in a data lake using Amazon Athena and Apache Iceberg](https://aws.amazon.com/blogs/big-data/perform-upserts-in-a-data-lake-using-amazon-athena-and-apache-iceberg/)
+ [Build a transactional data lake using Apache Iceberg, AWS Glue, and cross-account data shares using AWS Lake Formation and Amazon Athena](https://aws.amazon.com/blogs/big-data/build-a-transactional-data-lake-using-apache-iceberg-aws-glue-and-cross-account-data-shares-using-aws-lake-formation-and-amazon-athena/)
+ [Use Apache Iceberg in a data lake to support incremental data processing](https://aws.amazon.com/blogs/big-data/use-apache-iceberg-in-a-data-lake-to-support-incremental-data-processing/)
+ [Build a real-time GDPR-aligned Apache Iceberg data lake](https://aws.amazon.com/blogs/big-data/build-a-real-time-gdpr-aligned-apache-iceberg-data-lake/)
+ [Automate replication of relational sources into a transactional data lake with Apache Iceberg and AWS Glue](https://aws.amazon.com/blogs/big-data/automate-replication-of-relational-sources-into-a-transactional-data-lake-with-apache-iceberg-and-aws-glue/)
+ [Interact with Apache Iceberg tables using Amazon Athena and cross account fine-grained permissions using AWS Lake Formation](https://aws.amazon.com/blogs/big-data/interact-with-apache-iceberg-tables-using-amazon-athena-and-cross-account-fine-grained-permissions-using-aws-lake-formation/)
+ [Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena](https://aws.amazon.com/blogs/big-data/build-a-serverless-transactional-data-lake-with-apache-iceberg-amazon-emr-serverless-and-amazon-athena/)