/AWS1/CL_GLUICEBERGCOMPACTCONF
The configuration for an Iceberg compaction optimizer. This configuration defines parameters for optimizing the layout of data files in Iceberg tables.
CONSTRUCTOR
IMPORTING
Optional arguments:
iv_strategy TYPE /AWS1/GLUCOMPACTIONSTRATEGY
The strategy to use for compaction. Valid values are:
binpack: Combines small files into larger files, typically targeting sizes over 100MB, while applying any pending deletes. This is the recommended compaction strategy for most use cases.
sort: Organizes data based on specified columns which are sorted hierarchically during compaction, improving query performance for filtered operations. This strategy is recommended when your queries frequently filter on specific columns. To use this strategy, you must first define a sort order in your Iceberg table properties using the sort_order table property.
z-order: Optimizes data organization by blending multiple attributes into a single scalar value that can be used for sorting, allowing efficient querying across multiple dimensions. This strategy is recommended when you need to query data across multiple dimensions simultaneously. To use this strategy, you must first define a sort order in your Iceberg table properties using the sort_order table property.
If an input is not provided, the default value 'binpack' is used.
iv_mininputfiles TYPE /AWS1/GLUNULLABLEINTEGER
The minimum number of data files that must be present in a partition before compaction rewrites any of them. This parameter controls when compaction is triggered, preventing unnecessary compaction operations on partitions with few files. If an input is not provided, the default value 100 is used.
iv_deletefilethreshold TYPE /AWS1/GLUNULLABLEINTEGER
The minimum number of deletes that must be present in a data file to make it eligible for compaction. This parameter focuses compaction on files that contain a significant number of delete operations, which can improve query performance by removing deleted records. If an input is not provided, the default value 1 is used.
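As a minimal sketch of how the constructor arguments above fit together, the following ABAP fragment builds a configuration with every optional argument set explicitly. The variable name lo_compact_conf is illustrative, and the literal values simply restate the documented defaults. How the object is then passed to an AWS Glue operation is not covered on this page and is not shown.

```abap
" Build an Iceberg compaction configuration; all three arguments are optional.
DATA(lo_compact_conf) = NEW /aws1/cl_gluicebergcompactconf(
  iv_strategy            = 'binpack'  " documented default strategy
  iv_mininputfiles       = 100        " documented default
  iv_deletefilethreshold = 1          " documented default
).
```

Omitting an argument leaves the corresponding field unset, in which case the documented default applies.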
Queryable Attributes
strategy
The strategy to use for compaction. Valid values are:
binpack: Combines small files into larger files, typically targeting sizes over 100MB, while applying any pending deletes. This is the recommended compaction strategy for most use cases.
sort: Organizes data based on specified columns which are sorted hierarchically during compaction, improving query performance for filtered operations. This strategy is recommended when your queries frequently filter on specific columns. To use this strategy, you must first define a sort order in your Iceberg table properties using the sort_order table property.
z-order: Optimizes data organization by blending multiple attributes into a single scalar value that can be used for sorting, allowing efficient querying across multiple dimensions. This strategy is recommended when you need to query data across multiple dimensions simultaneously. To use this strategy, you must first define a sort order in your Iceberg table properties using the sort_order table property.
If an input is not provided, the default value 'binpack' is used.
Accessible with the following methods
| Method | Description |
|---|---|
| GET_STRATEGY() | Getter for STRATEGY, with configurable default |
| ASK_STRATEGY() | Getter for STRATEGY with exceptions if field has no value |
| HAS_STRATEGY() | Determine if STRATEGY has a value |
minInputFiles
The minimum number of data files that must be present in a partition before compaction rewrites any of them. This parameter controls when compaction is triggered, preventing unnecessary compaction operations on partitions with few files. If an input is not provided, the default value 100 is used.
Accessible with the following methods
| Method | Description |
|---|---|
| GET_MININPUTFILES() | Getter for MININPUTFILES, with configurable default |
| ASK_MININPUTFILES() | Getter for MININPUTFILES with exceptions if field has no value |
| HAS_MININPUTFILES() | Determine if MININPUTFILES has a value |
deleteFileThreshold
The minimum number of deletes that must be present in a data file to make it eligible for compaction. This parameter focuses compaction on files that contain a significant number of delete operations, which can improve query performance by removing deleted records. If an input is not provided, the default value 1 is used.
Accessible with the following methods
| Method | Description |
|---|---|
| GET_DELETEFILETHRESHOLD() | Getter for DELETEFILETHRESHOLD, with configurable default |
| ASK_DELETEFILETHRESHOLD() | Getter for DELETEFILETHRESHOLD with exceptions if field has no value |
| HAS_DELETEFILETHRESHOLD() | Determine if DELETEFILETHRESHOLD has a value |
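The three accessor styles can be compared in a short sketch. It assumes a configuration object built with only iv_strategy set, that GET_ methods take their fallback through a parameter named iv_value_if_missing, and that ASK_ methods raise /AWS1/CX_RT_VALUE_MISSING. Both of those names follow the SDK's general conventions rather than anything stated on this page, so check them against the class signature before relying on them.

```abap
" Only the strategy is set; the two integer fields stay unset.
DATA(lo_conf) = NEW /aws1/cl_gluicebergcompactconf( iv_strategy = 'sort' ).

" HAS_ reports whether the field was set; no exception is involved.
IF lo_conf->has_mininputfiles( ) = abap_false.
  " GET_ returns a caller-supplied fallback for an unset field
  " (parameter name is an assumption).
  DATA(lv_min_files) = lo_conf->get_mininputfiles( iv_value_if_missing = 100 ).
ENDIF.

" ASK_ raises an exception for an unset field
" (exception class is an assumption).
TRY.
    DATA(lv_threshold) = lo_conf->ask_deletefilethreshold( ).
  CATCH /aws1/cx_rt_value_missing.
    " The field was never set; handle the missing value here.
ENDTRY.
```

HAS_ suits cases where an unset field is an expected state, while ASK_ suits cases where a missing value should be treated as an error.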