

# Amazon EMR 6.x release versions
<a name="emr-release-6x"></a>

This section contains application versions, release notes, component versions, and configuration classifications available in each Amazon EMR 6.x release version.

When you launch a cluster, you can choose from multiple releases of Amazon EMR. This allows you to test and use application versions that fit your compatibility requirements. You specify the release with a *release label* of the form `emr-x.x.x`, for example, `emr-6.15.0`.
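
The release label format can be checked and compared in code. The following is a minimal sketch, not an official Amazon EMR utility: the helper name `parse_release_label` and the sample labels are illustrative assumptions. It parses a label of the form `emr-x.x.x` into numeric parts so that releases sort by version rather than alphabetically.

```python
import re
from typing import NamedTuple


class ReleaseLabel(NamedTuple):
    """Numeric parts of a release label such as emr-6.15.0."""
    major: int
    minor: int
    patch: int


# Matches only the documented form: emr-x.x.x
_LABEL_RE = re.compile(r"emr-(\d+)\.(\d+)\.(\d+)")


def parse_release_label(label: str) -> ReleaseLabel:
    """Hypothetical helper: split a release label into integers."""
    match = _LABEL_RE.fullmatch(label)
    if match is None:
        raise ValueError(f"not a release label of the form emr-x.x.x: {label!r}")
    return ReleaseLabel(*(int(part) for part in match.groups()))


labels = ["emr-6.9.1", "emr-6.15.0", "emr-6.10.0"]

# Compare on the parsed integers; a plain string comparison would
# rank emr-6.9.1 above emr-6.15.0.
newest = max(labels, key=parse_release_label)
print(newest)
```

Comparing the parsed tuples keeps `emr-6.15.0` ahead of `emr-6.9.1`, which a plain string sort gets wrong.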

New Amazon EMR releases are made available in different Regions over a period of several days, beginning with the first Region on the initial release date. The latest release version may not be available in your Region during this period.

For a comprehensive table of application versions in every Amazon EMR 6.x release, see [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md).

**Topics**
+ [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md)
+ [Amazon EMR release 6.15.0](emr-6150-release.md)
+ [Amazon EMR release 6.14.0](emr-6140-release.md)
+ [Amazon EMR release 6.13.0](emr-6130-release.md)
+ [Amazon EMR release 6.12.0](emr-6120-release.md)
+ [Amazon EMR release 6.11.1](emr-6111-release.md)
+ [Amazon EMR release 6.11.0](emr-6110-release.md)
+ [Amazon EMR release 6.10.1](emr-6101-release.md)
+ [Amazon EMR release 6.10.0](emr-6100-release.md)
+ [Amazon EMR release 6.9.1](emr-691-release.md)
+ [Amazon EMR release 6.9.0](emr-690-release.md)
+ [Amazon EMR release 6.8.1](emr-681-release.md)
+ [Amazon EMR release 6.8.0](emr-680-release.md)
+ [Amazon EMR release 6.7.0](emr-670-release.md)
+ [Amazon EMR release 6.6.0](emr-660-release.md)
+ [Amazon EMR release 6.5.0](emr-650-release.md)
+ [Amazon EMR release 6.4.0](emr-640-release.md)
+ [Amazon EMR release 6.3.1](emr-631-release.md)
+ [Amazon EMR release 6.3.0](emr-630-release.md)
+ [Amazon EMR release 6.2.1](emr-621-release.md)
+ [Amazon EMR release 6.2.0](emr-620-release.md)
+ [Amazon EMR release 6.1.1](emr-611-release.md)
+ [Amazon EMR release 6.1.0](emr-610-release.md)
+ [Amazon EMR release 6.0.1](emr-601-release.md)
+ [Amazon EMR release 6.0.0](emr-600-release.md)

# Application versions in Amazon EMR 6.x releases
<a name="emr-release-app-versions-6.x"></a>

The following table lists the application versions that are available in each Amazon EMR 6.x release.


**Application version information**  

|  | emr-6.15.0 | emr-6.14.0 | emr-6.13.0 | emr-6.12.0 | emr-6.11.1 | emr-6.11.0 | emr-6.10.1 | emr-6.10.0 | emr-6.9.1 | emr-6.9.0 | emr-6.8.1 | emr-6.8.0 | emr-6.7.0 | emr-6.6.0 | emr-6.5.0 | emr-6.4.0 | emr-6.3.1 | emr-6.3.0 | emr-6.2.1 | emr-6.2.0 | emr-6.1.1 | emr-6.1.0 | emr-6.0.1 | emr-6.0.0 | 
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | 
| AWS SDK for Java | 2.20.160-amzn-0, 1.12.569 | 1.12.543 | 1.12.513 | 1.12.490 | 1.12.446 | 1.12.446 | 1.12.397 | 1.12.397 | 1.12.170 | 1.12.170 | 1.12.170 | 1.12.170 | 1.12.170 | 1.12.170 | 1.12.31 | 1.12.31 | 1.11.977 | 1.11.977 | 1.11.880 | 1.11.880 | 1.11.828 | 1.11.828 | 1.11.711 | 1.11.711 | 
| Python | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 
| Scala | 2.12.15 | 2.12.15 | 2.12.15 | 2.12.15 | 2.12.15 | 2.12.15 | 2.12.15 | 2.12.15 | 2.12.15 | 2.12.15 | 2.12.15 | 2.12.15 | 2.12.15 | 2.12.10 | 2.12.10 | 2.12.10 | 2.12.10 | 2.12.10 | 2.12.10 | 2.12.10 | 2.12.10 | 2.12.10 | 2.12.10 | 2.11.12 | 
| AmazonCloudWatchAgent |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  | 
| Delta | 2.4.0 | 2.4.0 | 2.4.0 | 2.4.0 | 2.2.0 | 2.2.0 | 2.2.0 | 2.2.0 | 2.1.0 | 2.1.0 |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  | 
| Flink | 1.17.1-amzn-1 | 1.17.1-amzn-0 | 1.17.0 | 1.17.0 | 1.16.0 | 1.16.0 | 1.16.0 | 1.16.0 | 1.15.2 | 1.15.2 | 1.15.1 | 1.15.1 | 1.14.2 | 1.14.2 | 1.14.0 | 1.13.1 | 1.12.1 | 1.12.1 | 1.11.2 | 1.11.2 | 1.11.0 | 1.11.0 |  -  |  -  | 
| Ganglia | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 
| HBase | 2.4.17-amzn-3 | 2.4.17-amzn-2 | 2.4.17-amzn-1 | 2.4.17-amzn-0 | 2.4.15-amzn-1.1 | 2.4.15-amzn-1 | 2.4.15-amzn-0.1 | 2.4.15-amzn-0 | 2.4.13-amzn-0.1 | 2.4.13-amzn-0 | 2.4.12-amzn-0.1 | 2.4.12-amzn-0 | 2.4.4-amzn-3 | 2.4.4-amzn-2 | 2.4.4-amzn-1 | 2.4.4-amzn-0 | 2.2.6-amzn-1 | 2.2.6-amzn-1 | 2.2.6-amzn-0 | 2.2.6-amzn-0 | 2.2.5 | 2.2.5 | 2.2.3 | 2.2.3 | 
| HCatalog | 3.1.3-amzn-8 | 3.1.3-amzn-7 | 3.1.3-amzn-6 | 3.1.3-amzn-5 | 3.1.3-amzn-4.1 | 3.1.3-amzn-4 | 3.1.3-amzn-3.1 | 3.1.3-amzn-3 | 3.1.3-amzn-2.1 | 3.1.3-amzn-2 | 3.1.3-amzn-1.1 | 3.1.3-amzn-1 | 3.1.3-amzn-0 | 3.1.2-amzn-7 | 3.1.2-amzn-6 | 3.1.2-amzn-5 | 3.1.2-amzn-4 | 3.1.2-amzn-4 | 3.1.2-amzn-3 | 3.1.2-amzn-3 | 3.1.2-amzn-2 | 3.1.2-amzn-2 | 3.1.2-amzn-0 | 3.1.2-amzn-0 | 
| Hadoop | 3.3.6-amzn-1 | 3.3.3-amzn-6 | 3.3.3-amzn-5 | 3.3.3-amzn-4 | 3.3.3-amzn-3.1 | 3.3.3-amzn-3 | 3.3.3-amzn-2.1 | 3.3.3-amzn-2 | 3.3.3-amzn-1.1 | 3.3.3-amzn-1 | 3.2.1-amzn-8.1 | 3.2.1-amzn-8 | 3.2.1-amzn-7 | 3.2.1-amzn-6 | 3.2.1-amzn-5 | 3.2.1-amzn-4 | 3.2.1-amzn-3.1 | 3.2.1-amzn-3 | 3.2.1-amzn-2.1 | 3.2.1-amzn-2 | 3.2.1-amzn-1.1 | 3.2.1-amzn-1 | 3.2.1-amzn-0.1 | 3.2.1-amzn-0 | 
| Hive | 3.1.3-amzn-8 | 3.1.3-amzn-7 | 3.1.3-amzn-6 | 3.1.3-amzn-5 | 3.1.3-amzn-4.1 | 3.1.3-amzn-4 | 3.1.3-amzn-3.1 | 3.1.3-amzn-3 | 3.1.3-amzn-2.1 | 3.1.3-amzn-2 | 3.1.3-amzn-1.1 | 3.1.3-amzn-1 | 3.1.3-amzn-0 | 3.1.2-amzn-7 | 3.1.2-amzn-6 | 3.1.2-amzn-5 | 3.1.2-amzn-4 | 3.1.2-amzn-4 | 3.1.2-amzn-3 | 3.1.2-amzn-3 | 3.1.2-amzn-2 | 3.1.2-amzn-2 | 3.1.2-amzn-0 | 3.1.2-amzn-0 | 
| Hudi | 0.14.0-amzn-0 | 0.13.1-amzn-2 | 0.13.1-amzn-1 | 0.13.1-amzn-0 | 0.13.0-amzn-0 | 0.13.0-amzn-0 | 0.12.2-amzn-0 | 0.12.2-amzn-0 | 0.12.1-amzn-0 | 0.12.1-amzn-0 | 0.11.1-amzn-0 | 0.11.1-amzn-0 | 0.11.0-amzn-0 | 0.10.1-amzn-0 | 0.9.0-amzn-1 | 0.8.0-amzn-0 | 0.7.0-amzn-0 | 0.7.0-amzn-0 | 0.6.0-amzn-1 | 0.6.0-amzn-1 | 0.5.2-incubating-amzn-2 | 0.5.2-incubating-amzn-2 | 0.5.0-incubating-amzn-1 | 0.5.0-incubating-amzn-1 | 
| Hue | 4.11.0 | 4.11.0 | 4.11.0 | 4.11.0 | 4.11.0 | 4.11.0 | 4.10.0 | 4.10.0 | 4.10.0 | 4.10.0 | 4.10.0 | 4.10.0 | 4.10.0 | 4.10.0 | 4.9.0 | 4.9.0 | 4.9.0 | 4.9.0 | 4.8.0 | 4.8.0 | 4.7.1 | 4.7.1 | 4.4.0 | 4.4.0 | 
| Iceberg | 1.4.0-amzn-0 | 1.3.1-amzn-0 | 1.3.0-amzn-1 | 1.3.0-amzn-0 | 1.2.0-amzn-0 | 1.2.0-amzn-0 | 1.1.0-amzn-0 | 1.1.0-amzn-0 | 0.14.1-amzn-0 | 0.14.1-amzn-0 | 0.14.0-amzn-0 | 0.14.0-amzn-0 | 0.13.1-amzn-0 | 0.13.1 | 0.12.0 |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  | 
| JupyterEnterpriseGateway | 2.6.0 | 2.6.0 | 2.6.0 | 2.6.0 | 2.6.0 | 2.6.0 | 2.6.0 | 2.6.0 | 2.6.0 | 2.6.0 | 2.1.0 | 2.1.0 | 2.1.0 | 2.1.0 | 2.1.0 | 2.1.0 | 2.1.0 | 2.1.0 | 2.1.0 | 2.1.0 |  -  |  -  |  -  |  -  | 
| JupyterHub | 1.5.0 | 1.5.0 | 1.5.0 | 1.4.1 | 1.4.1 | 1.4.1 | 1.5.0 | 1.5.0 | 1.4.1 | 1.4.1 | 1.4.1 | 1.4.1 | 1.4.1 | 1.4.1 | 1.4.1 | 1.4.1 | 1.2.2 | 1.2.2 | 1.1.0 | 1.1.0 | 1.1.0 | 1.1.0 | 1.0.0 | 1.0.0 | 
| Livy | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.0-incubating | 0.7.0-incubating | 0.7.0-incubating | 0.7.0-incubating | 0.7.0-incubating | 0.7.0-incubating | 0.6.0-incubating | 0.6.0-incubating | 
| MXNet | 1.9.1 | 1.9.1 | 1.9.1 | 1.9.1 | 1.9.1 | 1.9.1 | 1.9.1 | 1.9.1 | 1.9.1 | 1.9.1 | 1.9.1 | 1.9.1 | 1.8.0 | 1.8.0 | 1.8.0 | 1.8.0 | 1.7.0 | 1.7.0 | 1.7.0 | 1.7.0 | 1.6.0 | 1.6.0 | 1.5.1 | 1.5.1 | 
| Mahout |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  | 
| Oozie | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.0 | 5.2.0 | 5.2.0 | 5.2.0 | 5.1.0 | 5.1.0 | 
| Phoenix | 5.1.3 | 5.1.3 | 5.1.3 | 5.1.3 | 5.1.2 | 5.1.2 | 5.1.2 | 5.1.2 | 5.1.2 | 5.1.2 | 5.1.2 | 5.1.2 | 5.1.2 | 5.1.2 | 5.1.2 | 5.1.2 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 
| Pig | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 |  -  |  -  | 
| Presto | 0.283-amzn-0 | 0.281-amzn-2 | 0.281-amzn-1 | 0.281-amzn-0 | 0.279-amzn-0 | 0.279-amzn-0 | 0.278.1-amzn-0 | 0.278.1-amzn-0 | 0.276-amzn-0 | 0.276-amzn-0 | 0.273.3-amzn-0 | 0.273.3-amzn-0 | 0.272-amzn-0 | 0.267-amzn-0 | 0.261-amzn-0 | 0.254.1-amzn-0 | 0.245.1-amzn-0 | 0.245.1-amzn-0 | 0.238.3-amzn-1 | 0.238.3-amzn-1 | 0.232 | 0.232 | 0.230 | 0.230 | 
| Spark | 3.4.1-amzn-2 | 3.4.1-amzn-1 | 3.4.1-amzn-0 | 3.4.0-amzn-0 | 3.3.2-amzn-0.1 | 3.3.2-amzn-0 | 3.3.1-amzn-0.1 | 3.3.1-amzn-0 | 3.3.0-amzn-1.1 | 3.3.0-amzn-1 | 3.3.0-amzn-0.1 | 3.3.0-amzn-0 | 3.2.1-amzn-0 | 3.2.0-amzn-0 | 3.1.2-amzn-1 | 3.1.2-amzn-0 | 3.1.1-amzn-0.1 | 3.1.1-amzn-0 | 3.0.1-amzn-0.1 | 3.0.1-amzn-0 | 3.0.0-amzn-0.1 | 3.0.0-amzn-0 | 2.4.4 | 2.4.4 | 
| Sqoop | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 |  -  |  -  | 
| TensorFlow | 2.11.0 | 2.11.0 | 2.11.0 | 2.11.0 | 2.11.0 | 2.11.0 | 2.11.0 | 2.11.0 | 2.10.0 | 2.10.0 | 2.9.1 | 2.9.1 | 2.4.1 | 2.4.1 | 2.4.1 | 2.4.1 | 2.4.1 | 2.4.1 | 2.3.1 | 2.3.1 | 2.1.0 | 2.1.0 | 1.14.0 | 1.14.0 | 
| Tez | 0.10.2-amzn-6 | 0.10.2-amzn-5 | 0.10.2-amzn-4 | 0.10.2-amzn-3 | 0.10.2-amzn-2.1 | 0.10.2-amzn-2 | 0.10.2-amzn-1.1 | 0.10.2-amzn-1 | 0.10.2-amzn-0.1 | 0.10.2-amzn-0 | 0.9.2 | 0.9.2 | 0.9.2 | 0.9.2 | 0.9.2 | 0.9.2 | 0.9.2 | 0.9.2 | 0.9.2 | 0.9.2 | 0.9.2 | 0.9.2 | 0.9.2 | 0.9.2 | 
| Trino (PrestoSQL) | 426-amzn-0 | 422-amzn-0 | 414-amzn-1 | 414-amzn-0 | 410-amzn-0 | 410-amzn-0 | 403-amzn-0 | 403-amzn-0 | 398-amzn-0 | 398-amzn-0 | 388-amzn-0 | 388-amzn-0 | 378-amzn-0 | 367-amzn-0 | 360 | 359 | 350 | 350 | 343 | 343 | 338 | 338 |  -  |  -  | 
| Zeppelin | 0.10.1 | 0.10.1 | 0.10.1 | 0.10.1 | 0.10.1 | 0.10.1 | 0.10.1 | 0.10.1 | 0.10.1 | 0.10.1 | 0.10.1 | 0.10.1 | 0.10.0 | 0.10.0 | 0.10.0 | 0.9.0 | 0.9.0 | 0.9.0 | 0.9.0-preview1 | 0.9.0-preview1 | 0.9.0-preview1 | 0.9.0-preview1 | 0.9.0-SNAPSHOT | 0.9.0-SNAPSHOT | 
| ZooKeeper | 3.5.10 | 3.5.10 | 3.5.10 | 3.5.10 | 3.5.10 | 3.5.10 | 3.5.10 | 3.5.10 | 3.5.10 | 3.5.10 | 3.5.10 | 3.5.10 | 3.5.7 | 3.5.7 | 3.5.7 | 3.5.7 | 3.4.14 | 3.4.14 | 3.4.14 | 3.4.14 | 3.4.14 | 3.4.14 | 3.4.14 | 3.4.14 | 

# Amazon EMR release 6.15.0
<a name="emr-6150-release"></a>

## 6.15.0 application versions
<a name="emr-6150-app-versions"></a>

This release includes the following applications: [Delta](https://delta.io/), [Flink](https://flink.apache.org/), [Ganglia](http://ganglia.info), [HBase](http://hbase.apache.org/), [HCatalog](https://cwiki.apache.org/confluence/display/Hive/HCatalog), [Hadoop](http://hadoop.apache.org/docs/current/), [Hive](http://hive.apache.org/), [Hudi](https://hudi.apache.org), [Hue](http://gethue.com/), [Iceberg](https://iceberg.apache.org/), [JupyterEnterpriseGateway](https://jupyter-enterprise-gateway.readthedocs.io/en/latest/), [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/#), [Livy](https://livy.incubator.apache.org/), [MXNet](https://mxnet.incubator.apache.org/), [Oozie](http://oozie.apache.org/), [Phoenix](https://phoenix.apache.org/), [Pig](http://pig.apache.org/), [Presto](https://prestodb.io/), [Spark](https://spark.apache.org/docs/latest/), [Sqoop](http://sqoop.apache.org/), [TensorFlow](https://www.tensorflow.org/), [Tez](https://tez.apache.org/), [Trino](https://trino.io/), [Zeppelin](https://zeppelin.incubator.apache.org/), and [ZooKeeper](https://zookeeper.apache.org).

The table below lists the application versions available in this release of Amazon EMR and the application versions in the preceding three Amazon EMR releases (when applicable).

For a comprehensive history of application versions for each release of Amazon EMR, see the following topics:
+ [Application versions in Amazon EMR 7.x releases](emr-release-app-versions-7.x.md)
+ [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md)
+ [Application versions in Amazon EMR 5.x releases](emr-release-app-versions-5.x.md)
+ [Application versions in Amazon EMR 4.x releases](emr-release-app-versions-4.x.md)


**Application version information**  

|  | emr-6.15.0 | emr-6.14.0 | emr-6.13.0 | emr-6.12.0 | 
| --- | --- | --- | --- | --- | 
| AWS SDK for Java | 2.20.160-amzn-0, 1.12.569 | 1.12.543 | 1.12.513 | 1.12.490 | 
| Python | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 
| Scala | 2.12.15 | 2.12.15 | 2.12.15 | 2.12.15 | 
| AmazonCloudWatchAgent |  -  |  -  |  -  |  -  | 
| Delta | 2.4.0 | 2.4.0 | 2.4.0 | 2.4.0 | 
| Flink | 1.17.1-amzn-1 | 1.17.1-amzn-0 | 1.17.0 | 1.17.0 | 
| Ganglia | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 
| HBase | 2.4.17-amzn-3 | 2.4.17-amzn-2 | 2.4.17-amzn-1 | 2.4.17-amzn-0 | 
| HCatalog | 3.1.3-amzn-8 | 3.1.3-amzn-7 | 3.1.3-amzn-6 | 3.1.3-amzn-5 | 
| Hadoop | 3.3.6-amzn-1 | 3.3.3-amzn-6 | 3.3.3-amzn-5 | 3.3.3-amzn-4 | 
| Hive | 3.1.3-amzn-8 | 3.1.3-amzn-7 | 3.1.3-amzn-6 | 3.1.3-amzn-5 | 
| Hudi | 0.14.0-amzn-0 | 0.13.1-amzn-2 | 0.13.1-amzn-1 | 0.13.1-amzn-0 | 
| Hue | 4.11.0 | 4.11.0 | 4.11.0 | 4.11.0 | 
| Iceberg | 1.4.0-amzn-0 | 1.3.1-amzn-0 | 1.3.0-amzn-1 | 1.3.0-amzn-0 | 
| JupyterEnterpriseGateway | 2.6.0 | 2.6.0 | 2.6.0 | 2.6.0 | 
| JupyterHub | 1.5.0 | 1.5.0 | 1.5.0 | 1.4.1 | 
| Livy | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 
| MXNet | 1.9.1 | 1.9.1 | 1.9.1 | 1.9.1 | 
| Mahout |  -  |  -  |  -  |  -  | 
| Oozie | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 
| Phoenix | 5.1.3 | 5.1.3 | 5.1.3 | 5.1.3 | 
| Pig | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 
| Presto | 0.283-amzn-0 | 0.281-amzn-2 | 0.281-amzn-1 | 0.281-amzn-0 | 
| Spark | 3.4.1-amzn-2 | 3.4.1-amzn-1 | 3.4.1-amzn-0 | 3.4.0-amzn-0 | 
| Sqoop | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 
| TensorFlow | 2.11.0 | 2.11.0 | 2.11.0 | 2.11.0 | 
| Tez | 0.10.2-amzn-6 | 0.10.2-amzn-5 | 0.10.2-amzn-4 | 0.10.2-amzn-3 | 
| Trino (PrestoSQL) | 426-amzn-0 | 422-amzn-0 | 414-amzn-1 | 414-amzn-0 | 
| Zeppelin | 0.10.1 | 0.10.1 | 0.10.1 | 0.10.1 | 
| ZooKeeper | 3.5.10 | 3.5.10 | 3.5.10 | 3.5.10 | 

## 6.15.0 release notes
<a name="emr-6150-relnotes"></a>

The following release notes include information for Amazon EMR release 6.15.0. Changes are relative to 6.14.0. For information on the release timeline, see the [6.15.0 change log](#6150-changelog).

**New features**
+ **Application upgrades** – Amazon EMR 6.15.0 application upgrades include Apache Hadoop 3.3.6, Apache Hudi 0.14.0-amzn-0, Iceberg 1.4.0-amzn-0, and Trino 426.
+ **[Faster launches for EMR clusters that run on EC2](https://aws.amazon.com/about-aws/whats-new/2023/11/amazon-emr-ec2-clusters-5-minutes-less/)** – It's now up to 35% faster to launch an Amazon EMR on EC2 cluster. With this improvement, most customers can launch their clusters in 5 minutes or less.
+ **[CodeWhisperer for EMR Studio](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-studio-codewhisperer.html)** – You can now use Amazon CodeWhisperer with Amazon EMR Studio to get real-time recommendations as you write code in JupyterLab. CodeWhisperer can complete your comments, finish single lines of code, make line-by-line recommendations, and generate fully-formed functions.
+ **[Faster job restart times with Flink](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/flink-restart.html)** – With Amazon EMR 6.15.0 and higher, several new mechanisms are available for Apache Flink to improve the job restart time during task recovery or scaling operations. This optimizes the speed of recovery and restart of execution graphs to improve job stability.
+ **[Table-level and fine-grained access control for open-table formats](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-lf-enable.html)** – With Amazon EMR 6.15.0 and higher, when you run Spark jobs on Amazon EMR on EC2 clusters that access data in the AWS Glue Data Catalog, you can use AWS Lake Formation to apply table, row, column, and cell level permissions on Hudi, Iceberg, or Delta Lake based tables.
+ **Hadoop upgrade** – Amazon EMR 6.15.0 includes an upgrade of Apache Hadoop to version 3.3.6. Hadoop 3.3.6, released by Apache in June 2023, was the latest version at the time of the Amazon EMR 6.15 deployment. Prior releases of Amazon EMR (6.9.0 to 6.14.x) used Hadoop 3.3.3.

  The upgrade includes hundreds of improvements and fixes, as well as new features such as reconfigurable datanode parameters, a `DFSAdmin` option to initiate bulk reconfiguration operations on all live datanodes, and a vectored API that lets seek-heavy readers specify multiple ranges to read. Hadoop 3.3.6 also adds support for HDFS APIs and semantics for its write-ahead log (WAL), so that HBase can run on other storage system implementations. For more information, see the changelogs for versions [3.3.4](https://hadoop.apache.org/docs/r3.3.4/hadoop-project-dist/hadoop-common/release/3.3.4/CHANGELOG.3.3.4.html), [3.3.5](https://hadoop.apache.org/docs/r3.3.5/hadoop-project-dist/hadoop-common/release/3.3.5/CHANGELOG.3.3.5.html), and [3.3.6](https://hadoop.apache.org/docs/r3.3.6/hadoop-project-dist/hadoop-common/release/3.3.6/CHANGELOG.3.3.6.html) in the *Apache Hadoop documentation*.
+ **Support for AWS SDK for Java, version 2** – Amazon EMR 6.15.0 applications can use AWS SDK for Java versions [1.12.569](https://github.com/aws/aws-sdk-java/tree/1.12.569) or [2.20.160](https://github.com/aws/aws-sdk-java-v2/tree/2.20.160) if the application supports v2. The AWS SDK for Java 2.x is a major rewrite of the version 1.x code base. It’s built on top of Java 8+ and adds several frequently requested features. These include support for non-blocking I/O, and the ability to plug in a different HTTP implementation at runtime. For more information, including a **Migration Guide from SDK for Java v1 to v2**, see the [AWS SDK for Java, version 2](https://docs.aws.amazon.com/sdk-for-java) guide.

**Known issues**
+ An on-cluster instance-state script that monitors health of the instance can consume excessive CPU and memory resources when there are a large number of threads and/or open file handles on the node.

**Changes, enhancements, and resolved issues**
+ Starting with Spark 3.3.1 (supported in Amazon EMR releases 6.10 and higher), all executors on a decommissioning host are set to a new `ExecutorState` called *DECOMMISSIONING*. YARN can't allocate tasks to executors that are being decommissioned, so it requests new executors, if needed, for the tasks being executed. As a result, if you disable Spark dynamic resource allocation (DRA) while you use EMR Managed Scaling, EMR Auto Scaling, or any custom scaling mechanism on Amazon EMR on EC2 clusters, YARN might request the maximum permissible number of executors for each job. To avoid this issue, leave the `spark.dynamicAllocation.enabled` property set to `TRUE` (the default) when you use this combination of features. You can also set the `spark.dynamicAllocation.minExecutors` and `spark.dynamicAllocation.maxExecutors` properties for your Spark jobs to restrict the number of executors allocated during a job’s execution.
+ To improve your high-availability EMR clusters, this release enables connectivity to Amazon EMR daemons on local host that use IPv6 endpoints.
+ This release enables TLS 1.2 for communication with ZooKeeper provisioned on all the primary nodes of your high-availability cluster.
+ This release improves the management of ZooKeeper transaction log files that are maintained on primary nodes to minimize scenarios where the log files grow out of bounds and interrupt cluster operations.
+ This release makes intra-node communication more resilient for high-availability EMR clusters. This improvement reduces the chance of bootstrap action failures or cluster start failures.
+ Tez in Amazon EMR 6.15.0 introduces configurations that you can specify to asynchronously open the input splits in a Tez grouped split. This results in faster performance of read queries when there are a large number of input splits in a single Tez grouped split. For more information, see [Tez asynchronous split opening](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/tez-configure.html#tez-configure-async).
+ When you launch a cluster with *the latest patch release* of Amazon EMR 5.36 or higher, 6.6 or higher, or 7.0 or higher, Amazon EMR uses the latest Amazon Linux 2023 or Amazon Linux 2 release for the default Amazon EMR AMI. For more information, see [Using the default Amazon Linux AMI for Amazon EMR](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-default-ami.html).    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-6150-release.html)
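
The Spark dynamic allocation guidance in the list above can be expressed as a configuration object. The following is a minimal sketch that assumes you apply the settings through the `spark-defaults` configuration classification. The function name and the executor bounds (2 and 20) are illustrative assumptions, not recommended values.

```python
import json


def spark_defaults_classification(min_executors: int, max_executors: int) -> list:
    """Hypothetical helper: build a spark-defaults classification that keeps
    dynamic allocation enabled and bounds the executor count per job."""
    if min_executors < 0 or max_executors < min_executors:
        raise ValueError("expected 0 <= min_executors <= max_executors")
    return [
        {
            "Classification": "spark-defaults",
            "Properties": {
                # Leave dynamic allocation on when you use cluster scaling.
                "spark.dynamicAllocation.enabled": "true",
                # Illustrative bounds; choose values that fit your workload.
                "spark.dynamicAllocation.minExecutors": str(min_executors),
                "spark.dynamicAllocation.maxExecutors": str(max_executors),
            },
        }
    ]


print(json.dumps(spark_defaults_classification(2, 20), indent=2))
```

The printed JSON follows the classification-plus-properties shape that Amazon EMR accepts for application configurations. Property values are strings, so the integers are converted before they are emitted.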

## 6.15.0 default Java versions
<a name="emr-6150-jdk"></a>

Amazon EMR releases 6.12.0 and higher support all applications with Amazon Corretto 8 by default, except for Trino. For Trino, Amazon EMR supports Amazon Corretto 17 by default starting with Amazon EMR release 6.9.0. Amazon EMR also supports some applications with Amazon Corretto 11 and 17. Those applications are listed in the following table. If you want to change the default JVM on your cluster, follow the instructions in [Configure applications to use a specific Java Virtual Machine](configuring-java8.md) for each application that runs on the cluster. You can only use one Java runtime version for a cluster. Amazon EMR doesn't support running different nodes or applications on different runtime versions on the same cluster.

While Amazon EMR supports both Amazon Corretto 11 and 17 on Apache Spark, Apache Hadoop, and Apache Hive, performance might regress for some workloads when you use these versions of Corretto. We recommend that you test your workloads before you change defaults.

The following table shows the default Java versions for applications in Amazon EMR 6.15.0:


| Application | Java / Amazon Corretto version (default is bold) | 
| --- | --- | 
| Delta | 17, 11, **8** | 
| Flink | 11, **8** | 
| Ganglia | **8** | 
| HBase | 11, **8** | 
| HCatalog | 17, 11, **8** | 
| Hadoop | 17, 11, **8** | 
| Hive | 17, 11, **8** | 
| Hudi | 17, 11, **8** | 
| Iceberg | 17, 11, **8** | 
| Livy | 17, 11, **8** | 
| Oozie | 17, 11, **8** | 
| Phoenix | **8** | 
| PrestoDB | **8** | 
| Spark | 17, 11, **8** | 
| Spark RAPIDS | 17, 11, **8** | 
| Sqoop | **8** | 
| Tez | 17, 11, **8** | 
| Trino | **17** | 
| Zeppelin | **8** | 
| Pig | **8** | 
| ZooKeeper | **8** | 

## 6.15.0 component versions
<a name="emr-6150-components"></a>

The components that Amazon EMR installs with this release are listed below. Some are installed as part of big-data application packages. Others are unique to Amazon EMR and installed for system processes and features. These typically start with `emr` or `aws`. Big-data application packages in the most recent Amazon EMR release are usually the latest version found in the community. We make community releases available in Amazon EMR as quickly as possible.

Some components in Amazon EMR differ from community versions. These components have a version label in the form `CommunityVersion-amzn-EmrVersion`. The `EmrVersion` starts at 0. For example, if an open source community component named `myapp-component` with version 2.2 has been modified three times for inclusion in different Amazon EMR releases, its release version is listed as `2.2-amzn-2`.
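
The version label convention can be split apart in code. The following is a minimal sketch, not an Amazon EMR tool: the `ComponentVersion` type and the `split_component_version` helper are illustrative assumptions. It separates the community version from the Amazon EMR revision and treats labels without an `-amzn-` marker as unmodified community versions.

```python
from typing import NamedTuple, Optional


class ComponentVersion(NamedTuple):
    community_version: str
    # None when the label carries no -amzn- marker.
    emr_version: Optional[str]


def split_component_version(label: str) -> ComponentVersion:
    """Hypothetical helper: split CommunityVersion-amzn-EmrVersion."""
    # rpartition returns ('', '', label) when the marker is absent.
    community, marker, emr = label.rpartition("-amzn-")
    if not marker:
        return ComponentVersion(label, None)
    return ComponentVersion(community, emr)


for label in ["2.2-amzn-2", "3.3.6-amzn-1", "5.2.1"]:
    print(label, "->", split_component_version(label))
```

The Amazon EMR revision stays a string because some labels in the tables use a dotted suffix, such as `3.1.3-amzn-4.1`.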


| Component | Version | Description | 
| --- | --- | --- | 
| aws-sagemaker-spark-sdk | 1.4.2 | Amazon SageMaker Spark SDK | 
| delta | 2.4.0 | Delta Lake is an open table format for huge analytic datasets | 
| delta-standalone-connectors | 0.6.0 | Delta Connectors provide different runtimes to integrate Delta Lake with engines like Flink, Hive and Presto. | 
| emr-ddb | 5.2.0 | Amazon DynamoDB connector for Hadoop ecosystem applications. | 
| emr-goodies | 3.8.0 | Extra convenience libraries for the Hadoop ecosystem. | 
| emr-kinesis | 3.12.0 | Amazon Kinesis connector for Hadoop ecosystem applications. | 
| emr-notebook-env | 1.7.0 | Conda env for emr notebook which includes jupyter enterprise gateway | 
| emr-s3-dist-cp | 2.29.0 | Distributed copy application optimized for Amazon S3. | 
| emr-s3-select | 2.8.0 | EMR S3Select Connector | 
| emr-wal-cli | 1.2.0 | Cli used for emrwal list/deletion. | 
| emrfs | 2.60.0 | Amazon S3 connector for Hadoop ecosystem applications. | 
| flink-client | 1.17.1-amzn-1 | Apache Flink command line client scripts and applications. | 
| flink-jobmanager-config | 1.17.1-amzn-1 | Managing resources on EMR nodes for Apache Flink JobManager. | 
| ganglia-monitor | 3.7.2 | Embedded Ganglia agent for Hadoop ecosystem applications along with the Ganglia monitoring agent. | 
| ganglia-metadata-collector | 3.7.2 | Ganglia metadata collector for aggregating metrics from Ganglia monitoring agents. | 
| ganglia-web | 3.7.1 | Web application for viewing metrics collected by the Ganglia metadata collector. | 
| hadoop-client | 3.3.6-amzn-1 | Hadoop command-line clients such as 'hdfs', 'hadoop', or 'yarn'. | 
| hadoop-hdfs-datanode | 3.3.6-amzn-1 | HDFS node-level service for storing blocks. | 
| hadoop-hdfs-library | 3.3.6-amzn-1 | HDFS command-line client and library | 
| hadoop-hdfs-namenode | 3.3.6-amzn-1 | HDFS service for tracking file names and block locations. | 
| hadoop-hdfs-journalnode | 3.3.6-amzn-1 | HDFS service for managing the Hadoop filesystem journal on HA clusters. | 
| hadoop-httpfs-server | 3.3.6-amzn-1 | HTTP endpoint for HDFS operations. | 
| hadoop-kms-server | 3.3.6-amzn-1 | Cryptographic key management server based on Hadoop's KeyProvider API. | 
| hadoop-mapred | 3.3.6-amzn-1 | MapReduce execution engine libraries for running a MapReduce application. | 
| hadoop-yarn-nodemanager | 3.3.6-amzn-1 | YARN service for managing containers on an individual node. | 
| hadoop-yarn-resourcemanager | 3.3.6-amzn-1 | YARN service for allocating and managing cluster resources and distributed applications. | 
| hadoop-yarn-timeline-server | 3.3.6-amzn-1 | Service for retrieving current and historical information for YARN applications. | 
| hbase-hmaster | 2.4.17-amzn-3 | Service for an HBase cluster responsible for coordination of Regions and execution of administrative commands. | 
| hbase-region-server | 2.4.17-amzn-3 | Service for serving one or more HBase regions. | 
| hbase-client | 2.4.17-amzn-3 | HBase command-line client. | 
| hbase-rest-server | 2.4.17-amzn-3 | Service providing a RESTful HTTP endpoint for HBase. | 
| hbase-thrift-server | 2.4.17-amzn-3 | Service providing a Thrift endpoint to HBase. | 
| hbase-operator-tools | 2.4.17-amzn-3 | Repair tool for Apache HBase clusters. | 
| hcatalog-client | 3.1.3-amzn-8 | The 'hcat' command line client for manipulating hcatalog-server. | 
| hcatalog-server | 3.1.3-amzn-8 | Service providing HCatalog, a table and storage management layer for distributed applications. | 
| hcatalog-webhcat-server | 3.1.3-amzn-8 | HTTP endpoint providing a REST interface to HCatalog. | 
| hive-client | 3.1.3-amzn-8 | Hive command line client. | 
| hive-hbase | 3.1.3-amzn-8 | Hive-hbase client. | 
| hive-metastore-server | 3.1.3-amzn-8 | Service for accessing the Hive metastore, a semantic repository storing metadata for SQL on Hadoop operations. | 
| hive-server2 | 3.1.3-amzn-8 | Service for accepting Hive queries as web requests. | 
| hudi | 0.14.0-amzn-0 | Incremental processing framework to power data pipeline at low latency and high efficiency. | 
| hudi-presto | 0.14.0-amzn-0 | Bundle library for running Presto with Hudi. | 
| hudi-trino | 0.14.0-amzn-0 | Bundle library for running Trino with Hudi. | 
| hudi-spark | 0.14.0-amzn-0 | Bundle library for running Spark with Hudi. | 
| hue-server | 4.11.0 | Web application for analyzing data using Hadoop ecosystem applications | 
| iceberg | 1.4.0-amzn-0 | Apache Iceberg is an open table format for huge analytic datasets | 
| jupyterhub | 1.5.0 | Multi-user server for Jupyter notebooks | 
| livy-server | 0.7.1-incubating | REST interface for interacting with Apache Spark | 
| nginx | 1.12.1 | nginx [engine x] is an HTTP and reverse proxy server | 
| mxnet | 1.9.1 | A flexible, scalable, and efficient library for deep learning. | 
| mariadb-server | 5.5.68+ | MariaDB database server. | 
| nvidia-cuda | 11.8.0 | Nvidia drivers and Cuda toolkit | 
| oozie-client | 5.2.1 | Oozie command-line client. | 
| oozie-server | 5.2.1 | Service for accepting Oozie workflow requests. | 
| opencv | 4.7.0 | Open Source Computer Vision Library. | 
| phoenix-library | 5.1.3 | The phoenix libraries for server and client | 
| phoenix-connectors | 5.1.3 | Apache Phoenix-Connectors for Spark-3 | 
| phoenix-query-server | 6.0.0 | A light weight server providing JDBC access as well as Protocol Buffers and JSON format access to the Avatica API  | 
| presto-coordinator | 0.283-amzn-0 | Service for accepting queries and managing query execution among presto-workers. | 
| presto-worker | 0.283-amzn-0 | Service for executing pieces of a query. | 
| presto-client | 0.283-amzn-0 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| trino-coordinator | 426-amzn-0 | Service for accepting queries and managing query execution among trino-workers. | 
| trino-worker | 426-amzn-0 | Service for executing pieces of a query. | 
| trino-client | 426-amzn-0 | Trino command-line client which is installed on an HA cluster's stand-by masters where Trino server is not started. | 
| pig-client | 0.17.0 | Pig command-line client. | 
| r | 4.0.2 | The R Project for Statistical Computing | 
| ranger-kms-server | 2.0.0 | Apache Ranger Key Management System | 
| spark-client | 3.4.1-amzn-2 | Spark command-line clients. | 
| spark-history-server | 3.4.1-amzn-2 | Web UI for viewing logged events for the lifetime of a completed Spark application. | 
| spark-on-yarn | 3.4.1-amzn-2 | In-memory execution engine for YARN. | 
| spark-yarn-slave | 3.4.1-amzn-2 | Apache Spark libraries needed by YARN slaves. | 
| spark-rapids | 23.08.1-amzn-0 | Nvidia Spark RAPIDS plugin that accelerates Apache Spark with GPUs. | 
| sqoop-client | 1.4.7 | Apache Sqoop command-line client. | 
| tensorflow | 2.11.0 | TensorFlow open source software library for high performance numerical computation. | 
| tez-on-yarn | 0.10.2-amzn-6 | The tez YARN application and libraries. | 
| tez-on-worker | 0.10.2-amzn-6 | The tez YARN application and libraries for worker nodes. | 
| webserver | 2.4.41+ | Apache HTTP server. | 
| zeppelin-server | 0.10.1 | Web-based notebook that enables interactive data analytics. | 
| zookeeper-server | 3.5.10 | Centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. | 
| zookeeper-client | 3.5.10 | ZooKeeper command line client. | 

## 6.15.0 configuration classifications
<a name="emr-6150-class"></a>

Configuration classifications allow you to customize applications. These often correspond to a configuration XML file for the application, such as `hive-site.xml`. For more information, see [Configure applications](emr-configure-apps.md).

Reconfiguration actions occur when you specify a configuration for instance groups in a running cluster. Amazon EMR only initiates reconfiguration actions for the classifications that you modify. For more information, see [Reconfigure an instance group in a running cluster](emr-configure-apps-running-cluster.md).
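
As a rough illustration of the idea above, a configuration object pairs a classification name with the properties to set. The following minimal sketch builds such a list in Python and prints it as JSON. The helper name `make_configuration` is hypothetical, and the property names and values are placeholders chosen for illustration, not recommendations.

```python
import json


def make_configuration(classification, properties):
    """Return one configuration object for the given classification (hypothetical helper)."""
    return {
        "Classification": classification,
        # Property values are written as strings, matching a string-to-string map.
        "Properties": {key: str(value) for key, value in properties.items()},
    }


# Placeholder settings, for illustration only.
configurations = [
    make_configuration("hive-site", {"hive.exec.parallel": "true"}),
    make_configuration("yarn-site", {"yarn.log-aggregation.retain-seconds": 86400}),
]

print(json.dumps(configurations, indent=2))
```

You would typically supply JSON of this shape when you create a cluster or reconfigure an instance group. See the linked topics for the supported ways to pass it.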


**emr-6.15.0 classifications**  

| Classifications | Description | Reconfiguration Actions | 
| --- | --- | --- | 
| capacity-scheduler | Change values in Hadoop's capacity-scheduler.xml file. | Restarts the ResourceManager service. | 
| container-executor | Change values in Hadoop YARN's container-executor.cfg file. | Not available. | 
| container-log4j | Change values in Hadoop YARN's container-log4j.properties file. | Not available. | 
| core-site | Change values in Hadoop's core-site.xml file. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Ranger KMS, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| docker-conf | Change Docker-related settings. | Not available. | 
| emrfs-site | Change EMRFS settings. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts HBaseRegionserver, HBaseMaster, HBaseThrift, HBaseRest, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| flink-conf | Change flink-conf.yaml settings. | Restarts Flink history server. | 
| flink-log4j | Change Flink log4j.properties settings. | Restarts Flink history server. | 
| flink-log4j-session | Change Flink log4j-session.properties settings for Kubernetes/Yarn session. | Restarts Flink history server. | 
| flink-log4j-cli | Change Flink log4j-cli.properties settings. | Restarts Flink history server. | 
| hadoop-env | Change values in the Hadoop environment for all Hadoop components. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts PhoenixQueryserver, HiveServer2, Hive MetaStore, and MapReduce-HistoryServer. | 
| hadoop-log4j | Change values in Hadoop's log4j.properties file. | Restarts the Hadoop HDFS services SecondaryNamenode, Datanode, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| hadoop-ssl-server | Change Hadoop SSL server configuration. | Not available. | 
| hadoop-ssl-client | Change Hadoop SSL client configuration. | Not available. | 
| hbase | Amazon EMR-curated settings for Apache HBase. | Custom EMR specific property. Sets emrfs-site and hbase-site configs. See those for their associated restarts. | 
| hbase-env | Change values in HBase's environment. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-log4j | Change values in HBase's hbase-log4j.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-metrics | Change values in HBase's hadoop-metrics2-hbase.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-policy | Change values in HBase's hbase-policy.xml file. | Not available. | 
| hbase-site | Change values in HBase's hbase-site.xml file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. Additionally restarts Phoenix QueryServer. | 
| hdfs-encryption-zones | Configure HDFS encryption zones. | This classification should not be reconfigured. | 
| hdfs-env | Change values in the HDFS environment. | Restarts Hadoop HDFS services Namenode, Datanode, and ZKFC. | 
| hdfs-site | Change values in HDFS's hdfs-site.xml. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Additionally restarts Hadoop Httpfs. | 
| hcatalog-env | Change values in HCatalog's environment. | Restarts Hive HCatalog Server. | 
| hcatalog-server-jndi | Change values in HCatalog's jndi.properties. | Restarts Hive HCatalog Server. | 
| hcatalog-server-proto-hive-site | Change values in HCatalog's proto-hive-site.xml. | Restarts Hive HCatalog Server. | 
| hcatalog-webhcat-env | Change values in HCatalog WebHCat's environment. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-log4j2 | Change values in HCatalog WebHCat's log4j2.properties. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-site | Change values in HCatalog WebHCat's webhcat-site.xml file. | Restarts Hive WebHCat server. | 
| hive | Amazon EMR-curated settings for Apache Hive. | Sets configurations to launch Hive LLAP service. | 
| hive-beeline-log4j2 | Change values in Hive's beeline-log4j2.properties file. | Not available. | 
| hive-parquet-logging | Change values in Hive's parquet-logging.properties file. | Not available. | 
| hive-env | Change values in the Hive environment. | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. | 
| hive-exec-log4j2 | Change values in Hive's hive-exec-log4j2.properties file. | Not available. | 
| hive-llap-daemon-log4j2 | Change values in Hive's llap-daemon-log4j2.properties file. | Not available. | 
| hive-log4j2 | Change values in Hive's hive-log4j2.properties file. | Not available. | 
| hive-site | Change values in Hive's hive-site.xml file. | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. Also restarts Oozie and Zeppelin. | 
| hiveserver2-site | Change values in Hive Server2's hiveserver2-site.xml file. | Not available. | 
| hue-ini | Change values in Hue's ini file. | Restarts Hue. Also activates Hue config override CLI commands to pick up new configurations. | 
| httpfs-env | Change values in the HTTPFS environment. | Restarts Hadoop Httpfs service. | 
| httpfs-site | Change values in Hadoop's httpfs-site.xml file. | Restarts Hadoop Httpfs service. | 
| hadoop-kms-acls | Change values in Hadoop's kms-acls.xml file. | Not available. | 
| hadoop-kms-env | Change values in the Hadoop KMS environment. | Restarts Hadoop-KMS service. | 
| hadoop-kms-java-home | Change Hadoop's KMS Java home. | Not available. | 
| hadoop-kms-log4j | Change values in Hadoop's kms-log4j.properties file. | Not available. | 
| hadoop-kms-site | Change values in Hadoop's kms-site.xml file. | Restarts Hadoop-KMS and Ranger-KMS service. | 
| hudi-env | Change values in the Hudi environment. | Not available. | 
| hudi-defaults | Change values in Hudi's hudi-defaults.conf file. | Not available. | 
| iceberg-defaults | Change values in Iceberg's iceberg-defaults.conf file. | Not available. | 
| delta-defaults | Change values in Delta's delta-defaults.conf file. | Not available. | 
| jupyter-notebook-conf | Change values in Jupyter Notebook's jupyter_notebook_config.py file. | Not available. | 
| jupyter-hub-conf | Change values in JupyterHub's jupyterhub_config.py file. | Not available. | 
| jupyter-s3-conf | Configure Jupyter Notebook S3 persistence. | Not available. | 
| jupyter-sparkmagic-conf | Change values in Sparkmagic's config.json file. | Not available. | 
| livy-conf | Change values in Livy's livy.conf file. | Restarts Livy Server. | 
| livy-env | Change values in the Livy environment. | Restarts Livy Server. | 
| livy-log4j2 | Change Livy log4j2.properties settings. | Restarts Livy Server. | 
| mapred-env | Change values in the MapReduce application's environment. | Restarts Hadoop MapReduce-HistoryServer. | 
| mapred-site | Change values in the MapReduce application's mapred-site.xml file. | Restarts Hadoop MapReduce-HistoryServer. | 
| oozie-env | Change values in Oozie's environment. | Restarts Oozie. | 
| oozie-log4j | Change values in Oozie's oozie-log4j.properties file. | Restarts Oozie. | 
| oozie-site | Change values in Oozie's oozie-site.xml file. | Restarts Oozie. | 
| phoenix-hbase-metrics | Change values in Phoenix's hadoop-metrics2-hbase.properties file. | Not available. | 
| phoenix-hbase-site | Change values in Phoenix's hbase-site.xml file. | Not available. | 
| phoenix-log4j | Change values in Phoenix's log4j.properties file. | Restarts Phoenix-QueryServer. | 
| phoenix-metrics | Change values in Phoenix's hadoop-metrics2-phoenix.properties file. | Not available. | 
| pig-env | Change values in the Pig environment. | Not available. | 
| pig-properties | Change values in Pig's pig.properties file. | Restarts Oozie. | 
| pig-log4j | Change values in Pig's log4j.properties file. | Not available. | 
| presto-log | Change values in Presto's log.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-config | Change values in Presto's config.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-password-authenticator | Change values in Presto's password-authenticator.properties file. | Not available. | 
| presto-env | Change values in Presto's presto-env.sh file. | Restarts Presto-Server (for PrestoDB) | 
| presto-node | Change values in Presto's node.properties file. | Not available. | 
| presto-connector-blackhole | Change values in Presto's blackhole.properties file. | Not available. | 
| presto-connector-cassandra | Change values in Presto's cassandra.properties file. | Not available. | 
| presto-connector-hive | Change values in Presto's hive.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-jmx | Change values in Presto's jmx.properties file. | Not available. | 
| presto-connector-kafka | Change values in Presto's kafka.properties file. | Not available. | 
| presto-connector-lakeformation | Change values in Presto's lakeformation.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-localfile | Change values in Presto's localfile.properties file. | Not available. | 
| presto-connector-memory | Change values in Presto's memory.properties file. | Not available. | 
| presto-connector-mongodb | Change values in Presto's mongodb.properties file. | Not available. | 
| presto-connector-mysql | Change values in Presto's mysql.properties file. | Not available. | 
| presto-connector-postgresql | Change values in Presto's postgresql.properties file. | Not available. | 
| presto-connector-raptor | Change values in Presto's raptor.properties file. | Not available. | 
| presto-connector-redis | Change values in Presto's redis.properties file. | Not available. | 
| presto-connector-redshift | Change values in Presto's redshift.properties file. | Not available. | 
| presto-connector-tpch | Change values in Presto's tpch.properties file. | Not available. | 
| presto-connector-tpcds | Change values in Presto's tpcds.properties file. | Not available. | 
| trino-log | Change values in Trino's log.properties file. | Restarts Trino-Server (for Trino) | 
| trino-config | Change values in Trino's config.properties file. | Restarts Trino-Server (for Trino) | 
| trino-password-authenticator | Change values in Trino's password-authenticator.properties file. | Restarts Trino-Server (for Trino) | 
| trino-env | Change values in Trino's trino-env.sh file. | Restarts Trino-Server (for Trino) | 
| trino-node | Change values in Trino's node.properties file. | Not available. | 
| trino-connector-blackhole | Change values in Trino's blackhole.properties file. | Not available. | 
| trino-connector-cassandra | Change values in Trino's cassandra.properties file. | Not available. | 
| trino-connector-delta | Change values in Trino's delta.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-hive | Change values in Trino's hive.properties file. | Restarts Trino-Server (for Trino) | 
| trino-exchange-manager | Change values in Trino's exchange-manager.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-iceberg | Change values in Trino's iceberg.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-hudi | Change values in Trino's hudi.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-jmx | Change values in Trino's jmx.properties file. | Not available. | 
| trino-connector-kafka | Change values in Trino's kafka.properties file. | Not available. | 
| trino-connector-localfile | Change values in Trino's localfile.properties file. | Not available. | 
| trino-connector-memory | Change values in Trino's memory.properties file. | Not available. | 
| trino-connector-mongodb | Change values in Trino's mongodb.properties file. | Not available. | 
| trino-connector-mysql | Change values in Trino's mysql.properties file. | Not available. | 
| trino-connector-postgresql | Change values in Trino's postgresql.properties file. | Not available. | 
| trino-connector-raptor | Change values in Trino's raptor.properties file. | Not available. | 
| trino-connector-redis | Change values in Trino's redis.properties file. | Not available. | 
| trino-connector-redshift | Change values in Trino's redshift.properties file. | Not available. | 
| trino-connector-tpch | Change values in Trino's tpch.properties file. | Not available. | 
| trino-connector-tpcds | Change values in Trino's tpcds.properties file. | Not available. | 
| ranger-kms-dbks-site | Change values in dbks-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-site | Change values in ranger-kms-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-env | Change values in the Ranger KMS environment. | Restarts Ranger KMS Server. | 
| ranger-kms-logback | Change values in kms-logback.xml file of Ranger KMS. | Not available. | 
| ranger-kms-db-ca | Change values for CA file on S3 for MySQL SSL connection with Ranger KMS. | Not available. | 
| spark | Amazon EMR-curated settings for Apache Spark. | This property modifies spark-defaults. See actions there. | 
| spark-defaults | Change values in Spark's spark-defaults.conf file. | Restarts Spark history server and Spark thrift server. | 
| spark-env | Change values in the Spark environment. | Restarts Spark history server and Spark thrift server. | 
| spark-hive-site | Change values in Spark's hive-site.xml file. | Not available. | 
| spark-log4j2 | Change values in Spark's log4j2.properties file. | Restarts Spark history server and Spark thrift server. | 
| spark-metrics | Change values in Spark's metrics.properties file. | Restarts Spark history server and Spark thrift server. | 
| sqoop-env | Change values in Sqoop's environment. | Not available. | 
| sqoop-oraoop-site | Change values in Sqoop OraOop's oraoop-site.xml file. | Not available. | 
| sqoop-site | Change values in Sqoop's sqoop-site.xml file. | Not available. | 
| tez-site | Change values in Tez's tez-site.xml file. | Restarts Oozie and HiveServer2. | 
| yarn-env | Change values in the YARN environment. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts MapReduce-HistoryServer. | 
| yarn-site | Change values in YARN's yarn-site.xml file. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Livy Server and MapReduce-HistoryServer. | 
| zeppelin-env | Change values in the Zeppelin environment. | Restarts Zeppelin. | 
| zeppelin-site | Change configuration settings in zeppelin-site.xml. | Restarts Zeppelin. | 
| zookeeper-config | Change values in ZooKeeper's zoo.cfg file. | Restarts Zookeeper server. | 
| zookeeper-log4j | Change values in ZooKeeper's log4j.properties file. | Restarts Zookeeper server. | 
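
Because Amazon EMR initiates reconfiguration actions only for the classifications that you modify, you can estimate the combined effect of a change by taking the union of the listed restarts. The following sketch uses a small excerpt of the table above. The `RESTARTS` mapping and the `services_to_restart` function are hypothetical names, and treating "Not available." as an empty set is a modeling assumption made for this example.

```python
# Excerpt of the classification table: classification -> services restarted.
# An empty set stands in for "Not available." (an assumption of this sketch).
RESTARTS = {
    "capacity-scheduler": {"ResourceManager"},
    "mapred-site": {"MapReduce-HistoryServer"},
    "yarn-env": {
        "ResourceManager",
        "NodeManager",
        "ProxyServer",
        "TimelineServer",
        "MapReduce-HistoryServer",
    },
    "livy-conf": {"Livy Server"},
    "hive-log4j2": set(),
}


def services_to_restart(modified_classifications):
    """Return the sorted union of restarts for the modified classifications only."""
    restarted = set()
    for name in modified_classifications:
        restarted |= RESTARTS[name]
    return sorted(restarted)


print(services_to_restart(["capacity-scheduler", "mapred-site"]))
# → ['MapReduce-HistoryServer', 'ResourceManager']
```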

## 6.15.0 change log
<a name="6150-changelog"></a>


**Change log for 6.15.0 release and release notes**  

| Date | Event | Description | 
| --- | --- | --- | 
| 2025-09-03 | Docs revision | Amazon EMR 6.15.0 release notes added known issue | 
| 2023-11-17 | Docs publication | Amazon EMR 6.15.0 release notes first published | 
| 2023-11-17 | Deployment complete | Amazon EMR 6.15.0 fully deployed to all [supported Regions](https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/) | 
| 2023-11-13 | Initial release | Amazon EMR 6.15.0 first deployed to initial commercial Regions | 

# Amazon EMR release 6.14.0
<a name="emr-6140-release"></a>

## 6.14.0 application versions
<a name="emr-6140-app-versions"></a>

This release includes the following applications: [Delta](https://delta.io/), [Flink](https://flink.apache.org/), [Ganglia](http://ganglia.info), [HBase](http://hbase.apache.org/), [HCatalog](https://cwiki.apache.org/confluence/display/Hive/HCatalog), [Hadoop](http://hadoop.apache.org/docs/current/), [Hive](http://hive.apache.org/), [Hudi](https://hudi.apache.org), [Hue](http://gethue.com/), [Iceberg](https://iceberg.apache.org/), [JupyterEnterpriseGateway](https://jupyter-enterprise-gateway.readthedocs.io/en/latest/), [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/#), [Livy](https://livy.incubator.apache.org/), [MXNet](https://mxnet.incubator.apache.org/), [Oozie](http://oozie.apache.org/), [Phoenix](https://phoenix.apache.org/), [Pig](http://pig.apache.org/), [Presto](https://prestodb.io/), [Spark](https://spark.apache.org/docs/latest/), [Sqoop](http://sqoop.apache.org/), [TensorFlow](https://www.tensorflow.org/), [Tez](https://tez.apache.org/), [Trino](https://trino.io/), [Zeppelin](https://zeppelin.incubator.apache.org/), and [ZooKeeper](https://zookeeper.apache.org).

The table below lists the application versions available in this release of Amazon EMR and the application versions in the preceding three Amazon EMR releases (when applicable).

For a comprehensive history of application versions for each release of Amazon EMR, see the following topics:
+ [Application versions in Amazon EMR 7.x releases](emr-release-app-versions-7.x.md)
+ [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md)
+ [Application versions in Amazon EMR 5.x releases](emr-release-app-versions-5.x.md)
+ [Application versions in Amazon EMR 4.x releases](emr-release-app-versions-4.x.md)


**Application version information**  

|  | emr-6.14.0 | emr-6.13.0 | emr-6.12.0 | emr-6.11.1 | 
| --- | --- | --- | --- | --- | 
| AWS SDK for Java | 1.12.543 | 1.12.513 | 1.12.490 | 1.12.446 | 
| Python | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 
| Scala | 2.12.15 | 2.12.15 | 2.12.15 | 2.12.15 | 
| AmazonCloudWatchAgent |  -  |  -  |  -  |  -  | 
| Delta | 2.4.0 | 2.4.0 | 2.4.0 | 2.2.0 | 
| Flink | 1.17.1-amzn-0 | 1.17.0 | 1.17.0 | 1.16.0 | 
| Ganglia | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 
| HBase | 2.4.17-amzn-2 | 2.4.17-amzn-1 | 2.4.17-amzn-0 | 2.4.15-amzn-1.1 | 
| HCatalog | 3.1.3-amzn-7 | 3.1.3-amzn-6 | 3.1.3-amzn-5 | 3.1.3-amzn-4.1 | 
| Hadoop | 3.3.3-amzn-6 | 3.3.3-amzn-5 | 3.3.3-amzn-4 | 3.3.3-amzn-3.1 | 
| Hive | 3.1.3-amzn-7 | 3.1.3-amzn-6 | 3.1.3-amzn-5 | 3.1.3-amzn-4.1 | 
| Hudi | 0.13.1-amzn-2 | 0.13.1-amzn-1 | 0.13.1-amzn-0 | 0.13.0-amzn-0 | 
| Hue | 4.11.0 | 4.11.0 | 4.11.0 | 4.11.0 | 
| Iceberg | 1.3.1-amzn-0 | 1.3.0-amzn-1 | 1.3.0-amzn-0 | 1.2.0-amzn-0 | 
| JupyterEnterpriseGateway | 2.6.0 | 2.6.0 | 2.6.0 | 2.6.0 | 
| JupyterHub | 1.5.0 | 1.5.0 | 1.4.1 | 1.4.1 | 
| Livy | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 
| MXNet | 1.9.1 | 1.9.1 | 1.9.1 | 1.9.1 | 
| Mahout |  -  |  -  |  -  |  -  | 
| Oozie | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 
| Phoenix | 5.1.3 | 5.1.3 | 5.1.3 | 5.1.2 | 
| Pig | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 
| Presto | 0.281-amzn-2 | 0.281-amzn-1 | 0.281-amzn-0 | 0.279-amzn-0 | 
| Spark | 3.4.1-amzn-1 | 3.4.1-amzn-0 | 3.4.0-amzn-0 | 3.3.2-amzn-0.1 | 
| Sqoop | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 
| TensorFlow | 2.11.0 | 2.11.0 | 2.11.0 | 2.11.0 | 
| Tez | 0.10.2-amzn-5 | 0.10.2-amzn-4 | 0.10.2-amzn-3 | 0.10.2-amzn-2.1 | 
| Trino (PrestoSQL) | 422-amzn-0 | 414-amzn-1 | 414-amzn-0 | 410-amzn-0 | 
| Zeppelin | 0.10.1 | 0.10.1 | 0.10.1 | 0.10.1 | 
| ZooKeeper | 3.5.10 | 3.5.10 | 3.5.10 | 3.5.10 | 
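
When you plan an upgrade, it can help to list only the applications whose versions differ between two releases. The following sketch does this for a few rows copied from the table above. The `VERSIONS` mapping is a partial excerpt and `changed_applications` is a hypothetical helper, so extend the data before relying on the result.

```python
# Partial excerpt of the application version table (not every row is included).
VERSIONS = {
    "emr-6.14.0": {
        "Flink": "1.17.1-amzn-0",
        "Hue": "4.11.0",
        "Iceberg": "1.3.1-amzn-0",
        "Trino (PrestoSQL)": "422-amzn-0",
    },
    "emr-6.13.0": {
        "Flink": "1.17.0",
        "Hue": "4.11.0",
        "Iceberg": "1.3.0-amzn-1",
        "Trino (PrestoSQL)": "414-amzn-1",
    },
}


def changed_applications(old_release, new_release):
    """Map each application whose version differs to an (old, new) pair."""
    old, new = VERSIONS[old_release], VERSIONS[new_release]
    return {
        app: (old.get(app), new[app])
        for app in sorted(new)
        if old.get(app) != new[app]
    }


for app, (before, after) in changed_applications("emr-6.13.0", "emr-6.14.0").items():
    print(f"{app}: {before} -> {after}")
```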

## 6.14.0 release notes
<a name="emr-6140-relnotes"></a>

The following release notes include information for Amazon EMR release 6.14.0. Changes are relative to 6.13.0. For information on the release timeline, see the [6.14.0 change log](#6140-changelog).

**New features**
+ Amazon EMR 6.14.0 supports Apache Spark 3.4.1, Apache Spark RAPIDS 23.06.0-amzn-2, Flink 1.17.1, Iceberg 1.3.1, and Trino 422.
+ [Amazon EMR managed scaling](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-managed-scaling.html) is now available in the `ap-southeast-3` Asia Pacific (Jakarta) Region for clusters that you create with Amazon EMR 6.14.0 and higher.

**Known issues**
+ An on-cluster instance-state script that monitors health of the instance can consume excessive CPU and memory resources when there are a large number of threads and/or open file handles on the node.

**Changes, enhancements, and resolved issues**
+ Starting with Spark 3.3.1 (supported in Amazon EMR releases 6.10 and higher), all executors on a decommissioning host are set to a new `ExecutorState` called *DECOMMISSIONING*. YARN can't allocate tasks to executors that are being decommissioned, so it requests new executors, if needed, for the tasks being executed. As a result, if you disable Spark dynamic resource allocation (DRA) while using EMR Managed Scaling, EMR Auto Scaling, or any custom scaling mechanism on EMR-EC2 clusters, YARN might request the maximum permissible number of executors for each job. To avoid this issue, leave the `spark.dynamicAllocation.enabled` property set to `true` (the default) when you use this combination of features. You can also set the `spark.dynamicAllocation.minExecutors` and `spark.dynamicAllocation.maxExecutors` properties for your Spark jobs to restrict the number of executors allocated during a job’s execution.
+ The 6.14.0 release optimizes log management with Amazon EMR running on Amazon EC2. As a result, you might see a slight reduction in storage costs for your cluster logs.
+ The 6.14.0 release improves the scaling workflow to account for different core instances that have a substantial variation in size for their Amazon EBS volumes. This improvement applies to core nodes only; scale-down operations for task nodes aren’t affected.
+ The 6.14.0 release improves the way that Amazon EMR interacts with open-source applications such as Apache Hadoop YARN ResourceManager and HDFS NameNode. This improvement reduces the risk of operational delays with cluster scaling, and mitigates startup failures that occur due to connectivity issues with the open-source applications.
+ The 6.14.0 release optimizes application installation at cluster launch. This improves the cluster startup times for certain combinations of Amazon EMR applications.
+ The 6.14.0 release fixes an issue where cluster scale-down operations might stall when a cluster that's running in a VPC with a custom domain encounters a core or task node restart.
+ When you launch a cluster with *the latest patch release* of Amazon EMR 5.36 or higher, 6.6 or higher, or 7.0 or higher, Amazon EMR uses the latest Amazon Linux 2023 or Amazon Linux 2 release for the default Amazon EMR AMI. For more information, see [Using the default Amazon Linux AMI for Amazon EMR](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-default-ami.html).    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-6140-release.html)
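
The dynamic allocation guidance in the first item of the list above can be sketched as a `spark-defaults` configuration. The function name `spark_dra_configuration` is hypothetical, and the executor bounds of 2 and 20 are placeholder values that you would tune for your own jobs.

```python
import json


def spark_dra_configuration(min_executors, max_executors):
    """Build a spark-defaults configuration that keeps dynamic allocation enabled."""
    if min_executors > max_executors:
        raise ValueError("min_executors must not exceed max_executors")
    return {
        "Classification": "spark-defaults",
        "Properties": {
            # Keep dynamic allocation on when cluster scaling is in use.
            "spark.dynamicAllocation.enabled": "true",
            # Bound the number of executors a job can hold.
            "spark.dynamicAllocation.minExecutors": str(min_executors),
            "spark.dynamicAllocation.maxExecutors": str(max_executors),
        },
    }


# Placeholder bounds, for illustration only.
print(json.dumps([spark_dra_configuration(2, 20)], indent=2))
```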

## 6.14.0 default Java versions
<a name="emr-6140-jdk"></a>

Amazon EMR releases 6.12.0 and higher support all applications with Amazon Corretto 8 by default, except for Trino. For Trino, Amazon EMR supports Amazon Corretto 17 by default starting with Amazon EMR release 6.9.0. Amazon EMR also supports some applications with Amazon Corretto 11 and 17. Those applications are listed in the following table. If you want to change the default JVM on your cluster, follow the instructions in [Configure applications to use a specific Java Virtual Machine](configuring-java8.md) for each application that runs on the cluster. You can only use one Java runtime version for a cluster. Amazon EMR doesn't support running different nodes or applications on different runtime versions on the same cluster.

While Amazon EMR supports both Amazon Corretto 11 and 17 on Apache Spark, Apache Hadoop, and Apache Hive, performance might regress for some workloads when you use these versions of Corretto. We recommend that you test your workloads before you change defaults.

The following table shows the default Java versions for applications in Amazon EMR 6.14.0:


| Application | Java / Amazon Corretto version (default is bold) | 
| --- | --- | 
| Delta | 17, 11, **8** | 
| Flink | 11, **8** | 
| Ganglia | **8** | 
| HBase | 11, **8** | 
| HCatalog | 17, 11, **8** | 
| Hadoop | 17, 11, **8** | 
| Hive | 17, 11, **8** | 
| Hudi | 17, 11, **8** | 
| Iceberg | 17, 11, **8** | 
| Livy | 17, 11, **8** | 
| Oozie | 17, 11, **8** | 
| Phoenix | **8** | 
| PrestoDB | **8** | 
| Spark | 17, 11, **8** | 
| Spark RAPIDS | 17, 11, **8** | 
| Sqoop | **8** | 
| Tez | 17, 11, **8** | 
| Trino | **17** | 
| Zeppelin | **8** | 
| Pig | **8** | 
| Zookeeper | **8** | 

## 6.14.0 component versions
<a name="emr-6140-components"></a>

The components that Amazon EMR installs with this release are listed below. Some are installed as part of big-data application packages. Others are unique to Amazon EMR and installed for system processes and features. These typically start with `emr` or `aws`. Big-data application packages in the most recent Amazon EMR release are usually the latest version found in the community. We make community releases available in Amazon EMR as quickly as possible.

Some components in Amazon EMR differ from community versions. These components have a version label in the form `CommunityVersion-amzn-EmrVersion`. The `EmrVersion` starts at 0. For example, if an open-source community component named `myapp-component` with version 2.2 has been modified three times for inclusion in different Amazon EMR releases, its release version is listed as `2.2-amzn-2`.
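
To make the label format concrete, the following sketch splits a version label into its community and Amazon EMR parts. The function name `parse_component_version` is hypothetical. Accepting dotted `EmrVersion` values such as `1.1` is an assumption based on labels that appear in the version tables, not on a documented grammar.

```python
import re

# CommunityVersion-amzn-EmrVersion, for example "2.2-amzn-2".
_LABEL = re.compile(r"^(?P<community>.+?)-amzn-(?P<emr>\d+(?:\.\d+)*)$")


def parse_component_version(label):
    """Return (community_version, emr_version); emr_version is None for unmodified components."""
    match = _LABEL.match(label)
    if match is None:
        return label, None
    return match.group("community"), match.group("emr")


print(parse_component_version("2.2-amzn-2"))        # → ('2.2', '2')
print(parse_component_version("0.7.1-incubating"))  # → ('0.7.1-incubating', None)
```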


| Component | Version | Description | 
| --- | --- | --- | 
| aws-sagemaker-spark-sdk | 1.4.2 | Amazon SageMaker Spark SDK | 
| delta | 2.4.0 | Delta Lake is an open table format for huge analytic datasets | 
| delta-standalone-connectors | 0.6.0 | Delta Connectors provide different runtimes to integrate Delta Lake with engines like Flink, Hive and Presto. | 
| emr-ddb | 5.1.0 | Amazon DynamoDB connector for Hadoop ecosystem applications. | 
| emr-goodies | 3.7.0 | Extra convenience libraries for the Hadoop ecosystem. | 
| emr-kinesis | 3.11.0 | Amazon Kinesis connector for Hadoop ecosystem applications. | 
| emr-notebook-env | 1.7.0 | Conda env for EMR notebook which includes Jupyter Enterprise Gateway | 
| emr-s3-dist-cp | 2.28.0 | Distributed copy application optimized for Amazon S3. | 
| emr-s3-select | 2.7.0 | EMR S3Select Connector | 
| emr-wal-cli | 1.1.0 | CLI used for emrwal list/deletion. | 
| emrfs | 2.59.0 | Amazon S3 connector for Hadoop ecosystem applications. | 
| flink-client | 1.17.1-amzn-0 | Apache Flink command line client scripts and applications. | 
| flink-jobmanager-config | 1.17.1-amzn-0 | Managing resources on EMR nodes for Apache Flink JobManager. | 
| ganglia-monitor | 3.7.2 | Embedded Ganglia agent for Hadoop ecosystem applications along with the Ganglia monitoring agent. | 
| ganglia-metadata-collector | 3.7.2 | Ganglia metadata collector for aggregating metrics from Ganglia monitoring agents. | 
| ganglia-web | 3.7.1 | Web application for viewing metrics collected by the Ganglia metadata collector. | 
| hadoop-client | 3.3.3-amzn-6 | Hadoop command-line clients such as 'hdfs', 'hadoop', or 'yarn'. | 
| hadoop-hdfs-datanode | 3.3.3-amzn-6 | HDFS node-level service for storing blocks. | 
| hadoop-hdfs-library | 3.3.3-amzn-6 | HDFS command-line client and library | 
| hadoop-hdfs-namenode | 3.3.3-amzn-6 | HDFS service for tracking file names and block locations. | 
| hadoop-hdfs-journalnode | 3.3.3-amzn-6 | HDFS service for managing the Hadoop filesystem journal on HA clusters. | 
| hadoop-httpfs-server | 3.3.3-amzn-6 | HTTP endpoint for HDFS operations. | 
| hadoop-kms-server | 3.3.3-amzn-6 | Cryptographic key management server based on Hadoop's KeyProvider API. | 
| hadoop-mapred | 3.3.3-amzn-6 | MapReduce execution engine libraries for running a MapReduce application. | 
| hadoop-yarn-nodemanager | 3.3.3-amzn-6 | YARN service for managing containers on an individual node. | 
| hadoop-yarn-resourcemanager | 3.3.3-amzn-6 | YARN service for allocating and managing cluster resources and distributed applications. | 
| hadoop-yarn-timeline-server | 3.3.3-amzn-6 | Service for retrieving current and historical information for YARN applications. | 
| hbase-hmaster | 2.4.17-amzn-2 | Service for an HBase cluster responsible for coordination of Regions and execution of administrative commands. | 
| hbase-region-server | 2.4.17-amzn-2 | Service for serving one or more HBase regions. | 
| hbase-client | 2.4.17-amzn-2 | HBase command-line client. | 
| hbase-rest-server | 2.4.17-amzn-2 | Service providing a RESTful HTTP endpoint for HBase. | 
| hbase-thrift-server | 2.4.17-amzn-2 | Service providing a Thrift endpoint to HBase. | 
| hbase-operator-tools | 2.4.17-amzn-2 | Repair tool for Apache HBase clusters. | 
| hcatalog-client | 3.1.3-amzn-7 | The 'hcat' command line client for manipulating hcatalog-server. | 
| hcatalog-server | 3.1.3-amzn-7 | Service providing HCatalog, a table and storage management layer for distributed applications. | 
| hcatalog-webhcat-server | 3.1.3-amzn-7 | HTTP endpoint providing a REST interface to HCatalog. | 
| hive-client | 3.1.3-amzn-7 | Hive command line client. | 
| hive-hbase | 3.1.3-amzn-7 | Hive-hbase client. | 
| hive-metastore-server | 3.1.3-amzn-7 | Service for accessing the Hive metastore, a semantic repository storing metadata for SQL on Hadoop operations. | 
| hive-server2 | 3.1.3-amzn-7 | Service for accepting Hive queries as web requests. | 
| hudi | 0.13.1-amzn-2 | Incremental processing framework to power data pipeline at low latency and high efficiency. | 
| hudi-presto | 0.13.1-amzn-2 | Bundle library for running Presto with Hudi. | 
| hudi-trino | 0.13.1-amzn-2 | Bundle library for running Trino with Hudi. | 
| hudi-spark | 0.13.1-amzn-2 | Bundle library for running Spark with Hudi. | 
| hue-server | 4.11.0 | Web application for analyzing data using Hadoop ecosystem applications | 
| iceberg | 1.3.1-amzn-0 | Apache Iceberg is an open table format for huge analytic datasets | 
| jupyterhub | 1.5.0 | Multi-user server for Jupyter notebooks | 
| livy-server | 0.7.1-incubating | REST interface for interacting with Apache Spark | 
| nginx | 1.12.1 | nginx [engine x] is an HTTP and reverse proxy server | 
| mxnet | 1.9.1 | A flexible, scalable, and efficient library for deep learning. | 
| mariadb-server | 5.5.68+ | MariaDB database server. | 
| nvidia-cuda | 11.8.0 | Nvidia drivers and Cuda toolkit | 
| oozie-client | 5.2.1 | Oozie command-line client. | 
| oozie-server | 5.2.1 | Service for accepting Oozie workflow requests. | 
| opencv | 4.7.0 | Open Source Computer Vision Library. | 
| phoenix-library | 5.1.3 | The phoenix libraries for server and client | 
| phoenix-connectors | 5.1.3 | Apache Phoenix-Connectors for Spark-3 | 
| phoenix-query-server | 6.0.0 | A lightweight server providing JDBC access as well as Protocol Buffers and JSON format access to the Avatica API | 
| presto-coordinator | 0.281-amzn-2 | Service for accepting queries and managing query execution among presto-workers. | 
| presto-worker | 0.281-amzn-2 | Service for executing pieces of a query. | 
| presto-client | 0.281-amzn-2 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| trino-coordinator | 422-amzn-0 | Service for accepting queries and managing query execution among trino-workers. | 
| trino-worker | 422-amzn-0 | Service for executing pieces of a query. | 
| trino-client | 422-amzn-0 | Trino command-line client which is installed on an HA cluster's stand-by masters where Trino server is not started. | 
| pig-client | 0.17.0 | Pig command-line client. | 
| r | 4.0.2 | The R Project for Statistical Computing | 
| ranger-kms-server | 2.0.0 | Apache Ranger Key Management System | 
| spark-client | 3.4.1-amzn-1 | Spark command-line clients. | 
| spark-history-server | 3.4.1-amzn-1 | Web UI for viewing logged events for the lifetime of a completed Spark application. | 
| spark-on-yarn | 3.4.1-amzn-1 | In-memory execution engine for YARN. | 
| spark-yarn-slave | 3.4.1-amzn-1 | Apache Spark libraries needed by YARN slaves. | 
| spark-rapids | 23.06.0-amzn-2 | Nvidia Spark RAPIDS plugin that accelerates Apache Spark with GPUs. | 
| sqoop-client | 1.4.7 | Apache Sqoop command-line client. | 
| tensorflow | 2.11.0 | TensorFlow open source software library for high performance numerical computation. | 
| tez-on-yarn | 0.10.2-amzn-5 | The tez YARN application and libraries. | 
| tez-on-worker | 0.10.2-amzn-5 | The tez YARN application and libraries for worker nodes. | 
| webserver | 2.4.41+ | Apache HTTP server. | 
| zeppelin-server | 0.10.1 | Web-based notebook that enables interactive data analytics. | 
| zookeeper-server | 3.5.10 | Centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. | 
| zookeeper-client | 3.5.10 | ZooKeeper command line client. | 

## 6.14.0 configuration classifications
<a name="emr-6140-class"></a>

Configuration classifications allow you to customize applications. These often correspond to a configuration XML file for the application, such as `hive-site.xml`. For more information, see [Configure applications](emr-configure-apps.md).

Reconfiguration actions occur when you specify a configuration for instance groups in a running cluster. Amazon EMR only initiates reconfiguration actions for the classifications that you modify. For more information, see [Reconfigure an instance group in a running cluster](emr-configure-apps-running-cluster.md).


**emr-6.14.0 classifications**  

| Classifications | Description | Reconfiguration Actions | 
| --- | --- | --- | 
| capacity-scheduler | Change values in Hadoop's capacity-scheduler.xml file. | Restarts the ResourceManager service. | 
| container-executor | Change values in Hadoop YARN's container-executor.cfg file. | Not available. | 
| container-log4j | Change values in Hadoop YARN's container-log4j.properties file. | Not available. | 
| core-site | Change values in Hadoop's core-site.xml file. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Ranger KMS, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| docker-conf | Change Docker-related settings. | Not available. | 
| emrfs-site | Change EMRFS settings. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts HBaseRegionserver, HBaseMaster, HBaseThrift, HBaseRest, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| flink-conf | Change flink-conf.yaml settings. | Restarts Flink history server. | 
| flink-log4j | Change Flink log4j.properties settings. | Restarts Flink history server. | 
| flink-log4j-session | Change Flink log4j-session.properties settings for Kubernetes/Yarn session. | Restarts Flink history server. | 
| flink-log4j-cli | Change Flink log4j-cli.properties settings. | Restarts Flink history server. | 
| hadoop-env | Change values in the Hadoop environment for all Hadoop components. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts PhoenixQueryserver, HiveServer2, Hive MetaStore, and MapReduce-HistoryServer. | 
| hadoop-log4j | Change values in Hadoop's log4j.properties file. | Restarts the Hadoop HDFS services SecondaryNamenode, Datanode, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| hadoop-ssl-server | Change the Hadoop SSL server configuration. | Not available. | 
| hadoop-ssl-client | Change the Hadoop SSL client configuration. | Not available. | 
| hbase | Amazon EMR-curated settings for Apache HBase. | Custom EMR specific property. Sets emrfs-site and hbase-site configs. See those for their associated restarts. | 
| hbase-env | Change values in HBase's environment. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-log4j | Change values in HBase's hbase-log4j.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-metrics | Change values in HBase's hadoop-metrics2-hbase.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-policy | Change values in HBase's hbase-policy.xml file. | Not available. | 
| hbase-site | Change values in HBase's hbase-site.xml file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. Additionally restarts Phoenix QueryServer. | 
| hdfs-encryption-zones | Configure HDFS encryption zones. | This classification should not be reconfigured. | 
| hdfs-env | Change values in the HDFS environment. | Restarts Hadoop HDFS services Namenode, Datanode, and ZKFC. | 
| hdfs-site | Change values in HDFS's hdfs-site.xml. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Additionally restarts Hadoop Httpfs. | 
| hcatalog-env | Change values in HCatalog's environment. | Restarts Hive HCatalog Server. | 
| hcatalog-server-jndi | Change values in HCatalog's jndi.properties. | Restarts Hive HCatalog Server. | 
| hcatalog-server-proto-hive-site | Change values in HCatalog's proto-hive-site.xml. | Restarts Hive HCatalog Server. | 
| hcatalog-webhcat-env | Change values in HCatalog WebHCat's environment. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-log4j2 | Change values in HCatalog WebHCat's log4j2.properties. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-site | Change values in HCatalog WebHCat's webhcat-site.xml file. | Restarts Hive WebHCat server. | 
| hive | Amazon EMR-curated settings for Apache Hive. | Sets configurations to launch Hive LLAP service. | 
| hive-beeline-log4j2 | Change values in Hive's beeline-log4j2.properties file. | Not available. | 
| hive-parquet-logging | Change values in Hive's parquet-logging.properties file. | Not available. | 
| hive-env | Change values in the Hive environment. | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. | 
| hive-exec-log4j2 | Change values in Hive's hive-exec-log4j2.properties file. | Not available. | 
| hive-llap-daemon-log4j2 | Change values in Hive's llap-daemon-log4j2.properties file. | Not available. | 
| hive-log4j2 | Change values in Hive's hive-log4j2.properties file. | Not available. | 
| hive-site | Change values in Hive's hive-site.xml file | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. Also restarts Oozie and Zeppelin. | 
| hiveserver2-site | Change values in Hive Server2's hiveserver2-site.xml file | Not available. | 
| hue-ini | Change values in Hue's ini file | Restarts Hue. Also activates Hue config override CLI commands to pick up new configurations. | 
| httpfs-env | Change values in the HTTPFS environment. | Restarts Hadoop Httpfs service. | 
| httpfs-site | Change values in Hadoop's httpfs-site.xml file. | Restarts Hadoop Httpfs service. | 
| hadoop-kms-acls | Change values in Hadoop's kms-acls.xml file. | Not available. | 
| hadoop-kms-env | Change values in the Hadoop KMS environment. | Restarts Hadoop-KMS service. | 
| hadoop-kms-java-home | Change Hadoop's KMS java home | Not available. | 
| hadoop-kms-log4j | Change values in Hadoop's kms-log4j.properties file. | Not available. | 
| hadoop-kms-site | Change values in Hadoop's kms-site.xml file. | Restarts Hadoop-KMS and Ranger-KMS service. | 
| hudi-env | Change values in the Hudi environment. | Not available. | 
| hudi-defaults | Change values in Hudi's hudi-defaults.conf file. | Not available. | 
| iceberg-defaults | Change values in Iceberg's iceberg-defaults.conf file. | Not available. | 
| delta-defaults | Change values in Delta's delta-defaults.conf file. | Not available. | 
| jupyter-notebook-conf | Change values in Jupyter Notebook's jupyter_notebook_config.py file. | Not available. | 
| jupyter-hub-conf | Change values in JupyterHub's jupyterhub_config.py file. | Not available. | 
| jupyter-s3-conf | Configure Jupyter Notebook S3 persistence. | Not available. | 
| jupyter-sparkmagic-conf | Change values in Sparkmagic's config.json file. | Not available. | 
| livy-conf | Change values in Livy's livy.conf file. | Restarts Livy Server. | 
| livy-env | Change values in the Livy environment. | Restarts Livy Server. | 
| livy-log4j2 | Change Livy log4j2.properties settings. | Restarts Livy Server. | 
| mapred-env | Change values in the MapReduce application's environment. | Restarts Hadoop MapReduce-HistoryServer. | 
| mapred-site | Change values in the MapReduce application's mapred-site.xml file. | Restarts Hadoop MapReduce-HistoryServer. | 
| oozie-env | Change values in Oozie's environment. | Restarts Oozie. | 
| oozie-log4j | Change values in Oozie's oozie-log4j.properties file. | Restarts Oozie. | 
| oozie-site | Change values in Oozie's oozie-site.xml file. | Restarts Oozie. | 
| phoenix-hbase-metrics | Change values in Phoenix's hadoop-metrics2-hbase.properties file. | Not available. | 
| phoenix-hbase-site | Change values in Phoenix's hbase-site.xml file. | Not available. | 
| phoenix-log4j | Change values in Phoenix's log4j.properties file. | Restarts Phoenix-QueryServer. | 
| phoenix-metrics | Change values in Phoenix's hadoop-metrics2-phoenix.properties file. | Not available. | 
| pig-env | Change values in the Pig environment. | Not available. | 
| pig-properties | Change values in Pig's pig.properties file. | Restarts Oozie. | 
| pig-log4j | Change values in Pig's log4j.properties file. | Not available. | 
| presto-log | Change values in Presto's log.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-config | Change values in Presto's config.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-password-authenticator | Change values in Presto's password-authenticator.properties file. | Not available. | 
| presto-env | Change values in Presto's presto-env.sh file. | Restarts Presto-Server (for PrestoDB) | 
| presto-node | Change values in Presto's node.properties file. | Not available. | 
| presto-connector-blackhole | Change values in Presto's blackhole.properties file. | Not available. | 
| presto-connector-cassandra | Change values in Presto's cassandra.properties file. | Not available. | 
| presto-connector-hive | Change values in Presto's hive.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-jmx | Change values in Presto's jmx.properties file. | Not available. | 
| presto-connector-kafka | Change values in Presto's kafka.properties file. | Not available. | 
| presto-connector-lakeformation | Change values in Presto's lakeformation.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-localfile | Change values in Presto's localfile.properties file. | Not available. | 
| presto-connector-memory | Change values in Presto's memory.properties file. | Not available. | 
| presto-connector-mongodb | Change values in Presto's mongodb.properties file. | Not available. | 
| presto-connector-mysql | Change values in Presto's mysql.properties file. | Not available. | 
| presto-connector-postgresql | Change values in Presto's postgresql.properties file. | Not available. | 
| presto-connector-raptor | Change values in Presto's raptor.properties file. | Not available. | 
| presto-connector-redis | Change values in Presto's redis.properties file. | Not available. | 
| presto-connector-redshift | Change values in Presto's redshift.properties file. | Not available. | 
| presto-connector-tpch | Change values in Presto's tpch.properties file. | Not available. | 
| presto-connector-tpcds | Change values in Presto's tpcds.properties file. | Not available. | 
| trino-log | Change values in Trino's log.properties file. | Restarts Trino-Server (for Trino) | 
| trino-config | Change values in Trino's config.properties file. | Restarts Trino-Server (for Trino) | 
| trino-password-authenticator | Change values in Trino's password-authenticator.properties file. | Restarts Trino-Server (for Trino) | 
| trino-env | Change values in Trino's trino-env.sh file. | Restarts Trino-Server (for Trino) | 
| trino-node | Change values in Trino's node.properties file. | Not available. | 
| trino-connector-blackhole | Change values in Trino's blackhole.properties file. | Not available. | 
| trino-connector-cassandra | Change values in Trino's cassandra.properties file. | Not available. | 
| trino-connector-delta | Change values in Trino's delta.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-hive | Change values in Trino's hive.properties file. | Restarts Trino-Server (for Trino) | 
| trino-exchange-manager | Change values in Trino's exchange-manager.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-iceberg | Change values in Trino's iceberg.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-hudi | Change values in Trino's hudi.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-jmx | Change values in Trino's jmx.properties file. | Not available. | 
| trino-connector-kafka | Change values in Trino's kafka.properties file. | Not available. | 
| trino-connector-localfile | Change values in Trino's localfile.properties file. | Not available. | 
| trino-connector-memory | Change values in Trino's memory.properties file. | Not available. | 
| trino-connector-mongodb | Change values in Trino's mongodb.properties file. | Not available. | 
| trino-connector-mysql | Change values in Trino's mysql.properties file. | Not available. | 
| trino-connector-postgresql | Change values in Trino's postgresql.properties file. | Not available. | 
| trino-connector-raptor | Change values in Trino's raptor.properties file. | Not available. | 
| trino-connector-redis | Change values in Trino's redis.properties file. | Not available. | 
| trino-connector-redshift | Change values in Trino's redshift.properties file. | Not available. | 
| trino-connector-tpch | Change values in Trino's tpch.properties file. | Not available. | 
| trino-connector-tpcds | Change values in Trino's tpcds.properties file. | Not available. | 
| ranger-kms-dbks-site | Change values in dbks-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-site | Change values in ranger-kms-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-env | Change values in the Ranger KMS environment. | Restarts Ranger KMS Server. | 
| ranger-kms-logback | Change values in kms-logback.xml file of Ranger KMS. | Not available. | 
| ranger-kms-db-ca | Change values for CA file on S3 for MySQL SSL connection with Ranger KMS. | Not available. | 
| spark | Amazon EMR-curated settings for Apache Spark. | This property modifies spark-defaults. See actions there. | 
| spark-defaults | Change values in Spark's spark-defaults.conf file. | Restarts Spark history server and Spark thrift server. | 
| spark-env | Change values in the Spark environment. | Restarts Spark history server and Spark thrift server. | 
| spark-hive-site | Change values in Spark's hive-site.xml file | Not available. | 
| spark-log4j2 | Change values in Spark's log4j2.properties file. | Restarts Spark history server and Spark thrift server. | 
| spark-metrics | Change values in Spark's metrics.properties file. | Restarts Spark history server and Spark thrift server. | 
| sqoop-env | Change values in Sqoop's environment. | Not available. | 
| sqoop-oraoop-site | Change values in Sqoop OraOop's oraoop-site.xml file. | Not available. | 
| sqoop-site | Change values in Sqoop's sqoop-site.xml file. | Not available. | 
| tez-site | Change values in Tez's tez-site.xml file. | Restarts Oozie and HiveServer2. | 
| yarn-env | Change values in the YARN environment. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts MapReduce-HistoryServer. | 
| yarn-site | Change values in YARN's yarn-site.xml file. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Livy Server and MapReduce-HistoryServer. | 
| zeppelin-env | Change values in the Zeppelin environment. | Restarts Zeppelin. | 
| zeppelin-site | Change configuration settings in zeppelin-site.xml. | Restarts Zeppelin. | 
| zookeeper-config | Change values in ZooKeeper's zoo.cfg file. | Restarts Zookeeper server. | 
| zookeeper-log4j | Change values in ZooKeeper's log4j.properties file. | Restarts Zookeeper server. | 
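
As a rough illustration of how the classifications in the table above are supplied, the following Python sketch builds a list of configuration objects and serializes it to JSON. The helper name `make_classification` and the property values are hypothetical and chosen only to show the structure; substitute the classifications and properties that apply to your cluster.

```python
import json


def make_classification(name, properties, nested=None):
    """Return one configuration object for a single classification.

    `name` is a classification from the table above (for example,
    "hive-site"). `properties` maps setting names to string values.
    """
    return {
        "Classification": name,
        "Properties": dict(properties),
        "Configurations": list(nested or []),
    }


# Hypothetical property values, used only to illustrate the shape.
configurations = [
    make_classification("hive-site", {"hive.exec.parallel": "true"}),
    make_classification(
        "yarn-site", {"yarn.log-aggregation.retain-seconds": "86400"}
    ),
]

configurations_json = json.dumps(configurations, indent=2)
print(configurations_json)
```

You could save the resulting JSON to a file and supply it when you create a cluster or reconfigure an instance group. According to the table, changing `hive-site` or `yarn-site` on a running cluster triggers the restarts listed in the Reconfiguration Actions column for those classifications.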

## 6.14.0 change log
<a name="6140-changelog"></a>


**Change log for 6.14.0 release and release notes**  

| Date | Event | Description | 
| --- | --- | --- | 
| 2025-09-03 | Docs revision | Known issue added to the Amazon EMR 6.14.0 release notes | 
| 2023-11-02 | Deployment complete | Amazon EMR 6.14.0 fully deployed to all [supported Regions](https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/) | 
| 2023-10-10 | Docs publication | Amazon EMR 6.14.0 release notes first published | 
| 2023-10-04 | Initial release | Amazon EMR 6.14.0 first deployed to initial commercial Regions | 

# Amazon EMR release 6.13.0
<a name="emr-6130-release"></a>

## 6.13.0 application versions
<a name="emr-6130-app-versions"></a>

This release includes the following applications: [Delta](https://delta.io/), [Flink](https://flink.apache.org/), [Ganglia](http://ganglia.info), [HBase](http://hbase.apache.org/), [HCatalog](https://cwiki.apache.org/confluence/display/Hive/HCatalog), [Hadoop](http://hadoop.apache.org/docs/current/), [Hive](http://hive.apache.org/), [Hudi](https://hudi.apache.org), [Hue](http://gethue.com/), [Iceberg](https://iceberg.apache.org/), [JupyterEnterpriseGateway](https://jupyter-enterprise-gateway.readthedocs.io/en/latest/), [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/#), [Livy](https://livy.incubator.apache.org/), [MXNet](https://mxnet.incubator.apache.org/), [Oozie](http://oozie.apache.org/), [Phoenix](https://phoenix.apache.org/), [Pig](http://pig.apache.org/), [Presto](https://prestodb.io/), [Spark](https://spark.apache.org/docs/latest/), [Sqoop](http://sqoop.apache.org/), [TensorFlow](https://www.tensorflow.org/), [Tez](https://tez.apache.org/), [Trino](https://trino.io/), [Zeppelin](https://zeppelin.incubator.apache.org/), and [ZooKeeper](https://zookeeper.apache.org).

The table below lists the application versions available in this release of Amazon EMR and the application versions in the preceding three Amazon EMR releases (when applicable).

For a comprehensive history of application versions for each release of Amazon EMR, see the following topics:
+ [Application versions in Amazon EMR 7.x releases](emr-release-app-versions-7.x.md)
+ [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md)
+ [Application versions in Amazon EMR 5.x releases](emr-release-app-versions-5.x.md)
+ [Application versions in Amazon EMR 4.x releases](emr-release-app-versions-4.x.md)


**Application version information**  

|  | emr-6.13.0 | emr-6.12.0 | emr-6.11.1 | emr-6.11.0 | 
| --- | --- | --- | --- | --- | 
| AWS SDK for Java | 1.12.513 | 1.12.490 | 1.12.446 | 1.12.446 | 
| Python | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 
| Scala | 2.12.15 | 2.12.15 | 2.12.15 | 2.12.15 | 
| AmazonCloudWatchAgent |  -  |  -  |  -  |  -  | 
| Delta | 2.4.0 | 2.4.0 | 2.2.0 | 2.2.0 | 
| Flink | 1.17.0 | 1.17.0 | 1.16.0 | 1.16.0 | 
| Ganglia | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 
| HBase | 2.4.17-amzn-1 | 2.4.17-amzn-0 | 2.4.15-amzn-1.1 | 2.4.15-amzn-1 | 
| HCatalog | 3.1.3-amzn-6 | 3.1.3-amzn-5 | 3.1.3-amzn-4.1 | 3.1.3-amzn-4 | 
| Hadoop | 3.3.3-amzn-5 | 3.3.3-amzn-4 | 3.3.3-amzn-3.1 | 3.3.3-amzn-3 | 
| Hive | 3.1.3-amzn-6 | 3.1.3-amzn-5 | 3.1.3-amzn-4.1 | 3.1.3-amzn-4 | 
| Hudi | 0.13.1-amzn-1 | 0.13.1-amzn-0 | 0.13.0-amzn-0 | 0.13.0-amzn-0 | 
| Hue | 4.11.0 | 4.11.0 | 4.11.0 | 4.11.0 | 
| Iceberg | 1.3.0-amzn-1 | 1.3.0-amzn-0 | 1.2.0-amzn-0 | 1.2.0-amzn-0 | 
| JupyterEnterpriseGateway | 2.6.0 | 2.6.0 | 2.6.0 | 2.6.0 | 
| JupyterHub | 1.5.0 | 1.4.1 | 1.4.1 | 1.4.1 | 
| Livy | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 
| MXNet | 1.9.1 | 1.9.1 | 1.9.1 | 1.9.1 | 
| Mahout |  -  |  -  |  -  |  -  | 
| Oozie | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 
| Phoenix | 5.1.3 | 5.1.3 | 5.1.2 | 5.1.2 | 
| Pig | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 
| Presto | 0.281-amzn-1 | 0.281-amzn-0 | 0.279-amzn-0 | 0.279-amzn-0 | 
| Spark | 3.4.1-amzn-0 | 3.4.0-amzn-0 | 3.3.2-amzn-0.1 | 3.3.2-amzn-0 | 
| Sqoop | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 
| TensorFlow | 2.11.0 | 2.11.0 | 2.11.0 | 2.11.0 | 
| Tez | 0.10.2-amzn-4 | 0.10.2-amzn-3 | 0.10.2-amzn-2.1 | 0.10.2-amzn-2 | 
| Trino (PrestoSQL) | 414-amzn-1 | 414-amzn-0 | 410-amzn-0 | 410-amzn-0 | 
| Zeppelin | 0.10.1 | 0.10.1 | 0.10.1 | 0.10.1 | 
| ZooKeeper | 3.5.10 | 3.5.10 | 3.5.10 | 3.5.10 | 

## 6.13.0 release notes
<a name="emr-6130-relnotes"></a>

The following release notes include information for Amazon EMR release 6.13.0. Changes are relative to 6.12.0. For information on the release timeline, see the [6.13.0 change log](#6130-changelog).

**New features**
+ Amazon EMR 6.13.0 supports Apache Spark 3.4.1, Apache Spark RAPIDS 23.06.0-amzn-1, CUDA Toolkit 11.8.0, and JupyterHub 1.5.0.

**Known issues**
+ An on-cluster instance-state script that monitors the health of the instance can consume excessive CPU and memory resources when there are a large number of threads or open file handles on the node.

**Changes, enhancements, and resolved issues**
+ Starting with Spark 3.3.1 (supported in Amazon EMR releases 6.10 and higher), all executors on a decommissioning host are set to a new `ExecutorState` called *DECOMMISSIONING*. YARN can't allocate tasks to executors that are being decommissioned, so it requests new executors, if needed, for the tasks that are running. As a result, if you disable Spark dynamic resource allocation (DRA) while you use EMR Managed Scaling, EMR Auto Scaling, or any custom scaling mechanism on EMR-EC2 clusters, YARN might request the maximum permissible number of executors for each job. To avoid this issue, leave the `spark.dynamicAllocation.enabled` property set to `TRUE` (the default) when you use this combination of features. You can also set values for the `spark.dynamicAllocation.minExecutors` and `spark.dynamicAllocation.maxExecutors` properties for your Spark jobs to restrict the number of executors allocated during a job's execution.
+ The 6.13.0 release improves the Amazon EMR log management daemon to ensure that all logs are uploaded at a regular cadence to Amazon S3 when a cluster termination command is issued. This facilitates faster cluster terminations.
+ The 6.13.0 release enhances Amazon EMR log management capabilities to ensure consistent and timely upload of all log files to Amazon S3. This especially benefits long-running EMR clusters.
+ When you launch a cluster with *the latest patch release* of Amazon EMR 5.36 or higher, 6.6 or higher, or 7.0 or higher, Amazon EMR uses the latest Amazon Linux 2023 or Amazon Linux 2 release for the default Amazon EMR AMI. For more information, see [Using the default Amazon Linux AMI for Amazon EMR](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-default-ami.html).    
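
The Spark dynamic allocation guidance in the list above can be expressed as a `spark-defaults` configuration. The following Python sketch is one possible way to build it; the function name and the executor limits of 2 and 20 are illustrative assumptions, not recommended values.

```python
import json


def spark_dynamic_allocation_config(min_executors, max_executors):
    """Build a spark-defaults classification that keeps dynamic
    allocation enabled and bounds the executor count.

    The bounds are caller-supplied; this sketch only checks that they
    are consistent with each other.
    """
    if min_executors < 0 or max_executors < min_executors:
        raise ValueError("expected 0 <= min_executors <= max_executors")
    return {
        "Classification": "spark-defaults",
        "Properties": {
            "spark.dynamicAllocation.enabled": "true",
            "spark.dynamicAllocation.minExecutors": str(min_executors),
            "spark.dynamicAllocation.maxExecutors": str(max_executors),
        },
    }


# Illustrative limits only; choose values that suit your workload.
config = spark_dynamic_allocation_config(2, 20)
print(json.dumps([config], indent=2))
```

Keeping `spark.dynamicAllocation.enabled` set to `true` while bounding the executor count follows the release note's advice for clusters that use a scaling mechanism.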

## 6.13.0 default Java versions
<a name="emr-6130-jdk"></a>

Amazon EMR releases 6.12.0 and higher support all applications with Amazon Corretto 8 by default, except for Trino. For Trino, Amazon EMR supports Amazon Corretto 17 by default starting with Amazon EMR release 6.9.0. Amazon EMR also supports some applications with Amazon Corretto 11 and 17. Those applications are listed in the following table. If you want to change the default JVM on your cluster, follow the instructions in [Configure applications to use a specific Java Virtual Machine](configuring-java8.md) for each application that runs on the cluster. You can only use one Java runtime version for a cluster. Amazon EMR doesn't support running different nodes or applications on different runtime versions on the same cluster.

While Amazon EMR supports both Amazon Corretto 11 and 17 on Apache Spark, Apache Hadoop, and Apache Hive, performance might regress for some workloads when you use these versions of Corretto. We recommend that you test your workloads before you change defaults.

The following table shows the default Java versions for applications in Amazon EMR 6.13.0:


| Application | Java / Amazon Corretto version (default is bold) | 
| --- | --- | 
| Delta | 17, 11, **8** | 
| Flink | 11, **8** | 
| Ganglia | **8** | 
| HBase | 11, **8** | 
| HCatalog | 17, 11, **8** | 
| Hadoop | 17, 11, **8** | 
| Hive | 17, 11, **8** | 
| Hudi | 17, 11, **8** | 
| Iceberg | 17, 11, **8** | 
| Livy | 17, 11, **8** | 
| Oozie | 17, 11, **8** | 
| Phoenix | **8** | 
| PrestoDB | **8** | 
| Spark | 17, 11, **8** | 
| Spark RAPIDS | 17, 11, **8** | 
| Sqoop | **8** | 
| Tez | 17, 11, **8** | 
| Trino | **17** | 
| Zeppelin | **8** | 
| Pig | **8** | 
| Zookeeper | **8** | 
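
If you plan to change the default JVM, it can help to check which Java versions the table above lists for every application you intend to install. The following Python sketch is only a lookup over that table and does not describe how Amazon EMR itself selects a JVM. The dictionary and function names are hypothetical.

```python
# Java versions listed per application in the table above (emr-6.13.0).
SUPPORTED_JAVA = {
    "Delta": {17, 11, 8},
    "Flink": {11, 8},
    "Ganglia": {8},
    "HBase": {11, 8},
    "HCatalog": {17, 11, 8},
    "Hadoop": {17, 11, 8},
    "Hive": {17, 11, 8},
    "Hudi": {17, 11, 8},
    "Iceberg": {17, 11, 8},
    "Livy": {17, 11, 8},
    "Oozie": {17, 11, 8},
    "Phoenix": {8},
    "PrestoDB": {8},
    "Spark": {17, 11, 8},
    "Spark RAPIDS": {17, 11, 8},
    "Sqoop": {8},
    "Tez": {17, 11, 8},
    "Trino": {17},
    "Zeppelin": {8},
    "Pig": {8},
    "Zookeeper": {8},
}


def common_java_versions(applications):
    """Return the Java versions listed for every given application,
    sorted from newest to oldest. Returns an empty list when the
    applications share no listed version or when none are given."""
    versions = None
    for app in applications:
        supported = SUPPORTED_JAVA[app]
        versions = set(supported) if versions is None else versions & supported
    return sorted(versions or set(), reverse=True)


print(common_java_versions(["Spark", "Hive", "Flink"]))
print(common_java_versions(["Spark", "Trino"]))
```

An empty result means the table lists no single version for all of the selected applications, so test carefully before you override the defaults for that combination.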

## 6.13.0 component versions
<a name="emr-6130-components"></a>

The components that Amazon EMR installs with this release are listed below. Some are installed as part of big-data application packages. Others are unique to Amazon EMR and installed for system processes and features. These typically start with `emr` or `aws`. Big-data application packages in the most recent Amazon EMR release are usually the latest version found in the community. We make community releases available in Amazon EMR as quickly as possible.

Some components in Amazon EMR differ from community versions. These components have a version label in the form `CommunityVersion-amzn-EmrVersion`. The `EmrVersion` starts at 0. For example, if an open source community component named `myapp-component` with version 2.2 has been modified three times for inclusion in different Amazon EMR releases, its release version is listed as `2.2-amzn-2`.
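
To make the label format concrete, the following Python sketch splits a version label into its community version and its `EmrVersion` part. The function name and the regular expression are assumptions based only on the `CommunityVersion-amzn-EmrVersion` form described above. Labels without an `-amzn-` suffix are returned unchanged with no `EmrVersion`.

```python
import re

# Matches labels such as "2.2-amzn-2" or "2.4.15-amzn-1.1".
_AMZN_LABEL = re.compile(r"^(?P<community>.+)-amzn-(?P<emr>\d+(?:\.\d+)*)$")


def parse_component_version(label):
    """Split a component version label into (community_version, emr_version).

    emr_version is None when the label has no "-amzn-" suffix, which
    this sketch treats as an unmodified community version.
    """
    match = _AMZN_LABEL.match(label)
    if match is None:
        return label, None
    return match.group("community"), match.group("emr")


for label in ["2.2-amzn-2", "3.3.3-amzn-5", "2.4.15-amzn-1.1", "1.17.0"]:
    print(label, "->", parse_component_version(label))
```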


| Component | Version | Description | 
| --- | --- | --- | 
| aws-sagemaker-spark-sdk | 1.4.2 | Amazon SageMaker Spark SDK | 
| delta | 2.4.0 | Delta Lake is an open table format for huge analytic datasets | 
| delta-standalone-connectors | 0.6.0 | Delta Connectors provide different runtimes to integrate Delta Lake with engines like Flink, Hive and Presto. | 
| emr-ddb | 5.1.0 | Amazon DynamoDB connector for Hadoop ecosystem applications. | 
| emr-goodies | 3.6.0 | Extra convenience libraries for the Hadoop ecosystem. | 
| emr-kinesis | 3.10.0 | Amazon Kinesis connector for Hadoop ecosystem applications. | 
| emr-notebook-env | 1.7.0 | Conda environment for EMR notebooks, which includes Jupyter Enterprise Gateway | 
| emr-s3-dist-cp | 2.27.0 | Distributed copy application optimized for Amazon S3. | 
| emr-s3-select | 2.6.0 | EMR S3Select Connector | 
| emr-wal-cli | 1.1.0 | CLI used for emrwal list/deletion. | 
| emrfs | 2.58.0 | Amazon S3 connector for Hadoop ecosystem applications. | 
| flink-client | 1.17.0 | Apache Flink command line client scripts and applications. | 
| flink-jobmanager-config | 1.17.0 | Managing resources on EMR nodes for Apache Flink JobManager. | 
| ganglia-monitor | 3.7.2 | Embedded Ganglia agent for Hadoop ecosystem applications along with the Ganglia monitoring agent. | 
| ganglia-metadata-collector | 3.7.2 | Ganglia metadata collector for aggregating metrics from Ganglia monitoring agents. | 
| ganglia-web | 3.7.1 | Web application for viewing metrics collected by the Ganglia metadata collector. | 
| hadoop-client | 3.3.3-amzn-5 | Hadoop command-line clients such as 'hdfs', 'hadoop', or 'yarn'. | 
| hadoop-hdfs-datanode | 3.3.3-amzn-5 | HDFS node-level service for storing blocks. | 
| hadoop-hdfs-library | 3.3.3-amzn-5 | HDFS command-line client and library | 
| hadoop-hdfs-namenode | 3.3.3-amzn-5 | HDFS service for tracking file names and block locations. | 
| hadoop-hdfs-journalnode | 3.3.3-amzn-5 | HDFS service for managing the Hadoop filesystem journal on HA clusters. | 
| hadoop-httpfs-server | 3.3.3-amzn-5 | HTTP endpoint for HDFS operations. | 
| hadoop-kms-server | 3.3.3-amzn-5 | Cryptographic key management server based on Hadoop's KeyProvider API. | 
| hadoop-mapred | 3.3.3-amzn-5 | MapReduce execution engine libraries for running a MapReduce application. | 
| hadoop-yarn-nodemanager | 3.3.3-amzn-5 | YARN service for managing containers on an individual node. | 
| hadoop-yarn-resourcemanager | 3.3.3-amzn-5 | YARN service for allocating and managing cluster resources and distributed applications. | 
| hadoop-yarn-timeline-server | 3.3.3-amzn-5 | Service for retrieving current and historical information for YARN applications. | 
| hbase-hmaster | 2.4.17-amzn-1 | Service for an HBase cluster responsible for coordination of Regions and execution of administrative commands. | 
| hbase-region-server | 2.4.17-amzn-1 | Service for serving one or more HBase regions. | 
| hbase-client | 2.4.17-amzn-1 | HBase command-line client. | 
| hbase-rest-server | 2.4.17-amzn-1 | Service providing a RESTful HTTP endpoint for HBase. | 
| hbase-thrift-server | 2.4.17-amzn-1 | Service providing a Thrift endpoint to HBase. | 
| hbase-operator-tools | 2.4.17-amzn-1 | Repair tool for Apache HBase clusters. | 
| hcatalog-client | 3.1.3-amzn-6 | The 'hcat' command line client for manipulating hcatalog-server. | 
| hcatalog-server | 3.1.3-amzn-6 | Service providing HCatalog, a table and storage management layer for distributed applications. | 
| hcatalog-webhcat-server | 3.1.3-amzn-6 | HTTP endpoint providing a REST interface to HCatalog. | 
| hive-client | 3.1.3-amzn-6 | Hive command line client. | 
| hive-hbase | 3.1.3-amzn-6 | Hive-hbase client. | 
| hive-metastore-server | 3.1.3-amzn-6 | Service for accessing the Hive metastore, a semantic repository storing metadata for SQL on Hadoop operations. | 
| hive-server2 | 3.1.3-amzn-6 | Service for accepting Hive queries as web requests. | 
| hudi | 0.13.1-amzn-1 | Incremental processing framework to power data pipeline at low latency and high efficiency. | 
| hudi-presto | 0.13.1-amzn-1 | Bundle library for running Presto with Hudi. | 
| hudi-trino | 0.13.1-amzn-1 | Bundle library for running Trino with Hudi. | 
| hudi-spark | 0.13.1-amzn-1 | Bundle library for running Spark with Hudi. | 
| hue-server | 4.11.0 | Web application for analyzing data using Hadoop ecosystem applications | 
| iceberg | 1.3.0-amzn-1 | Apache Iceberg is an open table format for huge analytic datasets | 
| jupyterhub | 1.5.0 | Multi-user server for Jupyter notebooks | 
| livy-server | 0.7.1-incubating | REST interface for interacting with Apache Spark | 
| nginx | 1.12.1 | nginx [engine x] is an HTTP and reverse proxy server | 
| mxnet | 1.9.1 | A flexible, scalable, and efficient library for deep learning. | 
| mariadb-server | 5.5.68+ | MariaDB database server. | 
| nvidia-cuda | 11.8.0 | Nvidia drivers and Cuda toolkit | 
| oozie-client | 5.2.1 | Oozie command-line client. | 
| oozie-server | 5.2.1 | Service for accepting Oozie workflow requests. | 
| opencv | 4.7.0 | Open Source Computer Vision Library. | 
| phoenix-library | 5.1.3 | The phoenix libraries for server and client | 
| phoenix-connectors | 5.1.3 | Apache Phoenix-Connectors for Spark-3 | 
| phoenix-query-server | 6.0.0 | A lightweight server providing JDBC access as well as Protocol Buffers and JSON format access to the Avatica API. | 
| presto-coordinator | 0.281-amzn-1 | Service for accepting queries and managing query execution among presto-workers. | 
| presto-worker | 0.281-amzn-1 | Service for executing pieces of a query. | 
| presto-client | 0.281-amzn-1 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| trino-coordinator | 414-amzn-1 | Service for accepting queries and managing query execution among trino-workers. | 
| trino-worker | 414-amzn-1 | Service for executing pieces of a query. | 
| trino-client | 414-amzn-1 | Trino command-line client which is installed on an HA cluster's stand-by masters where Trino server is not started. | 
| pig-client | 0.17.0 | Pig command-line client. | 
| r | 4.0.2 | The R Project for Statistical Computing | 
| ranger-kms-server | 2.0.0 | Apache Ranger Key Management System | 
| spark-client | 3.4.1-amzn-0 | Spark command-line clients. | 
| spark-history-server | 3.4.1-amzn-0 | Web UI for viewing logged events for the lifetime of a completed Spark application. | 
| spark-on-yarn | 3.4.1-amzn-0 | In-memory execution engine for YARN. | 
| spark-yarn-slave | 3.4.1-amzn-0 | Apache Spark libraries needed by YARN slaves. | 
| spark-rapids | 23.06.0-amzn-1 | Nvidia Spark RAPIDS plugin that accelerates Apache Spark with GPUs. | 
| sqoop-client | 1.4.7 | Apache Sqoop command-line client. | 
| tensorflow | 2.11.0 | TensorFlow open source software library for high performance numerical computation. | 
| tez-on-yarn | 0.10.2-amzn-4 | The tez YARN application and libraries. | 
| tez-on-worker | 0.10.2-amzn-4 | The tez YARN application and libraries for worker nodes. | 
| webserver | 2.4.41+ | Apache HTTP server. | 
| zeppelin-server | 0.10.1 | Web-based notebook that enables interactive data analytics. | 
| zookeeper-server | 3.5.10 | Centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. | 
| zookeeper-client | 3.5.10 | ZooKeeper command line client. | 

## 6.13.0 configuration classifications
<a name="emr-6130-class"></a>

Configuration classifications allow you to customize applications. These often correspond to a configuration XML file for the application, such as `hive-site.xml`. For more information, see [Configure applications](emr-configure-apps.md).

Reconfiguration actions occur when you specify a configuration for instance groups in a running cluster. Amazon EMR only initiates reconfiguration actions for the classifications that you modify. For more information, see [Reconfigure an instance group in a running cluster](emr-configure-apps-running-cluster.md).
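
As a minimal sketch of what a configuration object looks like, the following Python snippet builds a small list of classifications and prints it as JSON. The `Classification`, `Properties`, and `Configurations` keys follow the Amazon EMR configuration object format. The helper name `make_classification` and the property values are illustrative assumptions, not recommendations.

```python
import json


def make_classification(name, properties, nested=None):
    """Build one configuration object (hypothetical helper for illustration)."""
    return {
        "Classification": name,
        "Properties": dict(properties),
        "Configurations": list(nested or []),
    }


# Example values are assumptions chosen only to show the shape of the data.
configurations = [
    make_classification("hive-site", {"hive.exec.parallel": "true"}),
    make_classification("yarn-site", {"yarn.log-aggregation-enable": "true"}),
]

print(json.dumps(configurations, indent=2))
```

If you supplied a list like this for an instance group in a running cluster, the table below suggests which services would restart: modifying `hive-site` and `yarn-site` would each trigger the reconfiguration actions listed for those classifications.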


**emr-6.13.0 classifications**  

| Classifications | Description | Reconfiguration Actions | 
| --- | --- | --- | 
| capacity-scheduler | Change values in Hadoop's capacity-scheduler.xml file. | Restarts the ResourceManager service. | 
| container-executor | Change values in Hadoop YARN's container-executor.cfg file. | Not available. | 
| container-log4j | Change values in Hadoop YARN's container-log4j.properties file. | Not available. | 
| core-site | Change values in Hadoop's core-site.xml file. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Ranger KMS, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| docker-conf | Change docker related settings. | Not available. | 
| emrfs-site | Change EMRFS settings. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts HBaseRegionserver, HBaseMaster, HBaseThrift, HBaseRest, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| flink-conf | Change flink-conf.yaml settings. | Restarts Flink history server. | 
| flink-log4j | Change Flink log4j.properties settings. | Restarts Flink history server. | 
| flink-log4j-session | Change Flink log4j-session.properties settings for Kubernetes/Yarn session. | Restarts Flink history server. | 
| flink-log4j-cli | Change Flink log4j-cli.properties settings. | Restarts Flink history server. | 
| hadoop-env | Change values in the Hadoop environment for all Hadoop components. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts PhoenixQueryserver, HiveServer2, Hive MetaStore, and MapReduce-HistoryServer. | 
| hadoop-log4j | Change values in Hadoop's log4j.properties file. | Restarts the Hadoop HDFS services SecondaryNamenode, Datanode, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| hadoop-ssl-server | Change Hadoop SSL server configuration. | Not available. | 
| hadoop-ssl-client | Change Hadoop SSL client configuration. | Not available. | 
| hbase | Amazon EMR-curated settings for Apache HBase. | Custom EMR specific property. Sets emrfs-site and hbase-site configs. See those for their associated restarts. | 
| hbase-env | Change values in HBase's environment. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-log4j | Change values in HBase's hbase-log4j.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-metrics | Change values in HBase's hadoop-metrics2-hbase.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-policy | Change values in HBase's hbase-policy.xml file. | Not available. | 
| hbase-site | Change values in HBase's hbase-site.xml file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. Additionally restarts Phoenix QueryServer. | 
| hdfs-encryption-zones | Configure HDFS encryption zones. | This classification should not be reconfigured. | 
| hdfs-env | Change values in the HDFS environment. | Restarts Hadoop HDFS services Namenode, Datanode, and ZKFC. | 
| hdfs-site | Change values in HDFS's hdfs-site.xml. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Additionally restarts Hadoop Httpfs. | 
| hcatalog-env | Change values in HCatalog's environment. | Restarts Hive HCatalog Server. | 
| hcatalog-server-jndi | Change values in HCatalog's jndi.properties. | Restarts Hive HCatalog Server. | 
| hcatalog-server-proto-hive-site | Change values in HCatalog's proto-hive-site.xml. | Restarts Hive HCatalog Server. | 
| hcatalog-webhcat-env | Change values in HCatalog WebHCat's environment. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-log4j2 | Change values in HCatalog WebHCat's log4j2.properties. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-site | Change values in HCatalog WebHCat's webhcat-site.xml file. | Restarts Hive WebHCat server. | 
| hive | Amazon EMR-curated settings for Apache Hive. | Sets configurations to launch Hive LLAP service. | 
| hive-beeline-log4j2 | Change values in Hive's beeline-log4j2.properties file. | Not available. | 
| hive-parquet-logging | Change values in Hive's parquet-logging.properties file. | Not available. | 
| hive-env | Change values in the Hive environment. | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. | 
| hive-exec-log4j2 | Change values in Hive's hive-exec-log4j2.properties file. | Not available. | 
| hive-llap-daemon-log4j2 | Change values in Hive's llap-daemon-log4j2.properties file. | Not available. | 
| hive-log4j2 | Change values in Hive's hive-log4j2.properties file. | Not available. | 
| hive-site | Change values in Hive's hive-site.xml file | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. Also restarts Oozie and Zeppelin. | 
| hiveserver2-site | Change values in Hive Server2's hiveserver2-site.xml file | Not available. | 
| hue-ini | Change values in Hue's ini file | Restarts Hue. Also activates Hue config override CLI commands to pick up new configurations. | 
| httpfs-env | Change values in the HTTPFS environment. | Restarts Hadoop Httpfs service. | 
| httpfs-site | Change values in Hadoop's httpfs-site.xml file. | Restarts Hadoop Httpfs service. | 
| hadoop-kms-acls | Change values in Hadoop's kms-acls.xml file. | Not available. | 
| hadoop-kms-env | Change values in the Hadoop KMS environment. | Restarts Hadoop-KMS service. | 
| hadoop-kms-java-home | Change Hadoop's KMS java home | Not available. | 
| hadoop-kms-log4j | Change values in Hadoop's kms-log4j.properties file. | Not available. | 
| hadoop-kms-site | Change values in Hadoop's kms-site.xml file. | Restarts Hadoop-KMS and Ranger-KMS service. | 
| hudi-env | Change values in the Hudi environment. | Not available. | 
| hudi-defaults | Change values in Hudi's hudi-defaults.conf file. | Not available. | 
| iceberg-defaults | Change values in Iceberg's iceberg-defaults.conf file. | Not available. | 
| delta-defaults | Change values in Delta's delta-defaults.conf file. | Not available. | 
| jupyter-notebook-conf | Change values in Jupyter Notebook's jupyter_notebook_config.py file. | Not available. | 
| jupyter-hub-conf | Change values in JupyterHub's jupyterhub_config.py file. | Not available. | 
| jupyter-s3-conf | Configure Jupyter Notebook S3 persistence. | Not available. | 
| jupyter-sparkmagic-conf | Change values in Sparkmagic's config.json file. | Not available. | 
| livy-conf | Change values in Livy's livy.conf file. | Restarts Livy Server. | 
| livy-env | Change values in the Livy environment. | Restarts Livy Server. | 
| livy-log4j2 | Change Livy log4j2.properties settings. | Restarts Livy Server. | 
| mapred-env | Change values in the MapReduce application's environment. | Restarts Hadoop MapReduce-HistoryServer. | 
| mapred-site | Change values in the MapReduce application's mapred-site.xml file. | Restarts Hadoop MapReduce-HistoryServer. | 
| oozie-env | Change values in Oozie's environment. | Restarts Oozie. | 
| oozie-log4j | Change values in Oozie's oozie-log4j.properties file. | Restarts Oozie. | 
| oozie-site | Change values in Oozie's oozie-site.xml file. | Restarts Oozie. | 
| phoenix-hbase-metrics | Change values in Phoenix's hadoop-metrics2-hbase.properties file. | Not available. | 
| phoenix-hbase-site | Change values in Phoenix's hbase-site.xml file. | Not available. | 
| phoenix-log4j | Change values in Phoenix's log4j.properties file. | Restarts Phoenix-QueryServer. | 
| phoenix-metrics | Change values in Phoenix's hadoop-metrics2-phoenix.properties file. | Not available. | 
| pig-env | Change values in the Pig environment. | Not available. | 
| pig-properties | Change values in Pig's pig.properties file. | Restarts Oozie. | 
| pig-log4j | Change values in Pig's log4j.properties file. | Not available. | 
| presto-log | Change values in Presto's log.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-config | Change values in Presto's config.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-password-authenticator | Change values in Presto's password-authenticator.properties file. | Not available. | 
| presto-env | Change values in Presto's presto-env.sh file. | Restarts Presto-Server (for PrestoDB) | 
| presto-node | Change values in Presto's node.properties file. | Not available. | 
| presto-connector-blackhole | Change values in Presto's blackhole.properties file. | Not available. | 
| presto-connector-cassandra | Change values in Presto's cassandra.properties file. | Not available. | 
| presto-connector-hive | Change values in Presto's hive.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-jmx | Change values in Presto's jmx.properties file. | Not available. | 
| presto-connector-kafka | Change values in Presto's kafka.properties file. | Not available. | 
| presto-connector-lakeformation | Change values in Presto's lakeformation.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-localfile | Change values in Presto's localfile.properties file. | Not available. | 
| presto-connector-memory | Change values in Presto's memory.properties file. | Not available. | 
| presto-connector-mongodb | Change values in Presto's mongodb.properties file. | Not available. | 
| presto-connector-mysql | Change values in Presto's mysql.properties file. | Not available. | 
| presto-connector-postgresql | Change values in Presto's postgresql.properties file. | Not available. | 
| presto-connector-raptor | Change values in Presto's raptor.properties file. | Not available. | 
| presto-connector-redis | Change values in Presto's redis.properties file. | Not available. | 
| presto-connector-redshift | Change values in Presto's redshift.properties file. | Not available. | 
| presto-connector-tpch | Change values in Presto's tpch.properties file. | Not available. | 
| presto-connector-tpcds | Change values in Presto's tpcds.properties file. | Not available. | 
| trino-log | Change values in Trino's log.properties file. | Restarts Trino-Server (for Trino) | 
| trino-config | Change values in Trino's config.properties file. | Restarts Trino-Server (for Trino) | 
| trino-password-authenticator | Change values in Trino's password-authenticator.properties file. | Restarts Trino-Server (for Trino) | 
| trino-env | Change values in Trino's trino-env.sh file. | Restarts Trino-Server (for Trino) | 
| trino-node | Change values in Trino's node.properties file. | Not available. | 
| trino-connector-blackhole | Change values in Trino's blackhole.properties file. | Not available. | 
| trino-connector-cassandra | Change values in Trino's cassandra.properties file. | Not available. | 
| trino-connector-delta | Change values in Trino's delta.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-hive | Change values in Trino's hive.properties file. | Restarts Trino-Server (for Trino) | 
| trino-exchange-manager | Change values in Trino's exchange-manager.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-iceberg | Change values in Trino's iceberg.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-hudi | Change values in Trino's hudi.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-jmx | Change values in Trino's jmx.properties file. | Not available. | 
| trino-connector-kafka | Change values in Trino's kafka.properties file. | Not available. | 
| trino-connector-localfile | Change values in Trino's localfile.properties file. | Not available. | 
| trino-connector-memory | Change values in Trino's memory.properties file. | Not available. | 
| trino-connector-mongodb | Change values in Trino's mongodb.properties file. | Not available. | 
| trino-connector-mysql | Change values in Trino's mysql.properties file. | Not available. | 
| trino-connector-postgresql | Change values in Trino's postgresql.properties file. | Not available. | 
| trino-connector-raptor | Change values in Trino's raptor.properties file. | Not available. | 
| trino-connector-redis | Change values in Trino's redis.properties file. | Not available. | 
| trino-connector-redshift | Change values in Trino's redshift.properties file. | Not available. | 
| trino-connector-tpch | Change values in Trino's tpch.properties file. | Not available. | 
| trino-connector-tpcds | Change values in Trino's tpcds.properties file. | Not available. | 
| ranger-kms-dbks-site | Change values in dbks-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-site | Change values in ranger-kms-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-env | Change values in the Ranger KMS environment. | Restarts Ranger KMS Server. | 
| ranger-kms-logback | Change values in kms-logback.xml file of Ranger KMS. | Not available. | 
| ranger-kms-db-ca | Change values for CA file on S3 for MySQL SSL connection with Ranger KMS. | Not available. | 
| spark | Amazon EMR-curated settings for Apache Spark. | This property modifies spark-defaults. See actions there. | 
| spark-defaults | Change values in Spark's spark-defaults.conf file. | Restarts Spark history server and Spark thrift server. | 
| spark-env | Change values in the Spark environment. | Restarts Spark history server and Spark thrift server. | 
| spark-hive-site | Change values in Spark's hive-site.xml file | Not available. | 
| spark-log4j2 | Change values in Spark's log4j2.properties file. | Restarts Spark history server and Spark thrift server. | 
| spark-metrics | Change values in Spark's metrics.properties file. | Restarts Spark history server and Spark thrift server. | 
| sqoop-env | Change values in Sqoop's environment. | Not available. | 
| sqoop-oraoop-site | Change values in Sqoop OraOop's oraoop-site.xml file. | Not available. | 
| sqoop-site | Change values in Sqoop's sqoop-site.xml file. | Not available. | 
| tez-site | Change values in Tez's tez-site.xml file. | Restarts Oozie and HiveServer2. | 
| yarn-env | Change values in the YARN environment. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts MapReduce-HistoryServer. | 
| yarn-site | Change values in YARN's yarn-site.xml file. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Livy Server and MapReduce-HistoryServer. | 
| zeppelin-env | Change values in the Zeppelin environment. | Restarts Zeppelin. | 
| zeppelin-site | Change configuration settings in zeppelin-site.xml. | Restarts Zeppelin. | 
| zookeeper-config | Change values in ZooKeeper's zoo.cfg file. | Restarts Zookeeper server. | 
| zookeeper-log4j | Change values in ZooKeeper's log4j.properties file. | Restarts Zookeeper server. | 

## 6.13.0 change log
<a name="6130-changelog"></a>


**Change log for 6.13.0 release and release notes**  

| Date | Event | Description | 
| --- | --- | --- | 
| 2025-09-03 | Docs revision | Amazon EMR 6.13.0 release notes added known issue | 
| 2023-09-23 | Deployment complete | Amazon EMR 6.13.0 fully deployed to all [supported Regions](https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/) | 
| 2023-09-12 | Docs publication | Amazon EMR 6.13.0 release notes first published | 
| 2023-09-01 | Initial release | Amazon EMR 6.13.0 first deployed to initial commercial Regions | 

# Amazon EMR release 6.12.0
<a name="emr-6120-release"></a>

## 6.12.0 application versions
<a name="emr-6120-app-versions"></a>

This release includes the following applications: [Delta](https://delta.io/), [Flink](https://flink.apache.org/), [Ganglia](http://ganglia.info), [HBase](http://hbase.apache.org/), [HCatalog](https://cwiki.apache.org/confluence/display/Hive/HCatalog), [Hadoop](http://hadoop.apache.org/docs/current/), [Hive](http://hive.apache.org/), [Hudi](https://hudi.apache.org), [Hue](http://gethue.com/), [Iceberg](https://iceberg.apache.org/), [JupyterEnterpriseGateway](https://jupyter-enterprise-gateway.readthedocs.io/en/latest/), [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/#), [Livy](https://livy.incubator.apache.org/), [MXNet](https://mxnet.incubator.apache.org/), [Oozie](http://oozie.apache.org/), [Phoenix](https://phoenix.apache.org/), [Pig](http://pig.apache.org/), [Presto](https://prestodb.io/), [Spark](https://spark.apache.org/docs/latest/), [Sqoop](http://sqoop.apache.org/), [TensorFlow](https://www.tensorflow.org/), [Tez](https://tez.apache.org/), [Trino](https://trino.io/), [Zeppelin](https://zeppelin.incubator.apache.org/), and [ZooKeeper](https://zookeeper.apache.org).

The table below lists the application versions available in this release of Amazon EMR and the application versions in the preceding three Amazon EMR releases (when applicable).

For a comprehensive history of application versions for each release of Amazon EMR, see the following topics:
+ [Application versions in Amazon EMR 7.x releases](emr-release-app-versions-7.x.md)
+ [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md)
+ [Application versions in Amazon EMR 5.x releases](emr-release-app-versions-5.x.md)
+ [Application versions in Amazon EMR 4.x releases](emr-release-app-versions-4.x.md)


**Application version information**  

|  | emr-6.12.0 | emr-6.11.1 | emr-6.11.0 | emr-6.10.1 | 
| --- | --- | --- | --- | --- | 
| AWS SDK for Java | 1.12.490 | 1.12.446 | 1.12.446 | 1.12.397 | 
| Python | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 
| Scala | 2.12.15 | 2.12.15 | 2.12.15 | 2.12.15 | 
| AmazonCloudWatchAgent |  -  |  -  |  -  |  -  | 
| Delta | 2.4.0 | 2.2.0 | 2.2.0 | 2.2.0 | 
| Flink | 1.17.0 | 1.16.0 | 1.16.0 | 1.16.0 | 
| Ganglia | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 
| HBase | 2.4.17-amzn-0 | 2.4.15-amzn-1.1 | 2.4.15-amzn-1 | 2.4.15-amzn-0.1 | 
| HCatalog | 3.1.3-amzn-5 | 3.1.3-amzn-4.1 | 3.1.3-amzn-4 | 3.1.3-amzn-3.1 | 
| Hadoop | 3.3.3-amzn-4 | 3.3.3-amzn-3.1 | 3.3.3-amzn-3 | 3.3.3-amzn-2.1 | 
| Hive | 3.1.3-amzn-5 | 3.1.3-amzn-4.1 | 3.1.3-amzn-4 | 3.1.3-amzn-3.1 | 
| Hudi | 0.13.1-amzn-0 | 0.13.0-amzn-0 | 0.13.0-amzn-0 | 0.12.2-amzn-0 | 
| Hue | 4.11.0 | 4.11.0 | 4.11.0 | 4.10.0 | 
| Iceberg | 1.3.0-amzn-0 | 1.2.0-amzn-0 | 1.2.0-amzn-0 | 1.1.0-amzn-0 | 
| JupyterEnterpriseGateway | 2.6.0 | 2.6.0 | 2.6.0 | 2.6.0 | 
| JupyterHub | 1.4.1 | 1.4.1 | 1.4.1 | 1.5.0 | 
| Livy | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 
| MXNet | 1.9.1 | 1.9.1 | 1.9.1 | 1.9.1 | 
| Mahout |  -  |  -  |  -  |  -  | 
| Oozie | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 
| Phoenix | 5.1.3 | 5.1.2 | 5.1.2 | 5.1.2 | 
| Pig | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 
| Presto | 0.281-amzn-0 | 0.279-amzn-0 | 0.279-amzn-0 | 0.278.1-amzn-0 | 
| Spark | 3.4.0-amzn-0 | 3.3.2-amzn-0.1 | 3.3.2-amzn-0 | 3.3.1-amzn-0.1 | 
| Sqoop | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 
| TensorFlow | 2.11.0 | 2.11.0 | 2.11.0 | 2.11.0 | 
| Tez | 0.10.2-amzn-3 | 0.10.2-amzn-2.1 | 0.10.2-amzn-2 | 0.10.2-amzn-1.1 | 
| Trino (PrestoSQL) | 414-amzn-0 | 410-amzn-0 | 410-amzn-0 | 403-amzn-0 | 
| Zeppelin | 0.10.1 | 0.10.1 | 0.10.1 | 0.10.1 | 
| ZooKeeper | 3.5.10 | 3.5.10 | 3.5.10 | 3.5.10 | 

## 6.12.0 release notes
<a name="emr-6120-relnotes"></a>

The following release notes include information for Amazon EMR release 6.12.0. Changes are relative to 6.11.0. For information on the release timeline, see the [6.12.0 change log](#6120-changelog).

**New features**
+ Amazon EMR 6.12.0 supports Apache Spark 3.4.0, Apache Spark RAPIDS 23.06.0-amzn-0, CUDA 11.8.0, Apache Hudi 0.13.1-amzn-0, Apache Iceberg 1.3.0-amzn-0, Trino 414, and PrestoDB 0.281.
+ Amazon EMR releases 6.12.0 and higher support LDAP integration with Apache Livy, Apache Hive through HiveServer2 (HS2), Trino, Presto, and Hue. You can also install Apache Spark and Apache Hadoop on an EMR cluster that uses 6.12.0 or higher and configure them to use LDAP. For more information, see [Use Active Directory or LDAP servers for authentication with Amazon EMR](https://docs.aws.amazon.com/emr/latest/ManagementGuide/ldap.html).

**Known issues**
+ An on-cluster instance-state script that monitors health of the instance can consume excessive CPU and memory resources when there are a large number of threads and/or open file handles on the node.

**Changes, enhancements, and resolved issues**
+ Starting with Spark 3.3.1 (supported in Amazon EMR releases 6.10 and higher), all executors on a decommissioning host are set to a new `ExecutorState` called *DECOMMISSIONING*. YARN can't use executors that are being decommissioned to allocate tasks, so it requests new executors, if needed, for the tasks being run. As a result, if you disable Spark dynamic resource allocation (DRA) while you use EMR Managed Scaling, EMR Auto Scaling, or any custom scaling mechanism on EMR-EC2 clusters, YARN might request the maximum permissible number of executors for each job. To avoid this issue, leave the `spark.dynamicAllocation.enabled` property set to `true` (the default) when you use this combination of features. You can also set values for the `spark.dynamicAllocation.maxExecutors` and `spark.dynamicAllocation.minExecutors` properties for your Spark jobs to restrict the number of executors allocated during a job's execution.
+ Amazon EMR releases 6.12.0 and higher provide Java 11 runtime support for Flink. For more information, see [Configure Flink to run with Java 11](flink-configure.md#flink-configure-java11).
+ The 6.12.0 release adds a new retry mechanism to the cluster scaling workflow for EMR clusters that run Presto or Trino. This improvement reduces the risk that cluster resizing will indefinitely stall due to a single failed resize operation. It also improves cluster utilization, because your cluster scales up and down faster.
+ The 6.12.0 release fixes an issue where cluster scale-down operations might stall when a core node that is undergoing graceful decommissioning turns unhealthy for any reason before it fully decommissions.
+ The 6.12.0 release improves cluster scale-down logic so that your cluster doesn't attempt a scale-down of core nodes below the HDFS replication factor setting for the cluster. This aligns with your data redundancy requirements, and reduces the chance that a scaling operation might stall.
+ The 6.12.0 release enhances the performance and efficiency of the health monitoring service for Amazon EMR by increasing the speed at which it logs state changes for instances. This improvement reduces the chance of degraded performance for cluster nodes that are running multiple custom client tools or third-party applications.
+ The 6.12.0 release improves the performance of the on-cluster log management daemon for Amazon EMR. As a result, there is less chance for degraded performance with EMR clusters that run steps with high concurrency.
+ With Amazon EMR release 6.12.0, the log management daemon has been upgraded to identify all logs that are in active use with open file handles on the local instance storage, and the associated processes. This upgrade ensures that Amazon EMR properly deletes the files and reclaims storage space after the logs are archived to Amazon S3.
+ The 6.12.0 release includes a log-management daemon enhancement that deletes empty, unused steps directories in the local cluster file system. An excessively large number of empty directories can degrade the performance of Amazon EMR daemons and result in disk over-utilization.
+ The 6.12.0 release enables log rotation for YARN Timeline Server logs. This minimizes disk over-utilization scenarios, especially for long-running clusters.
+ The default root volume size has increased to 15 GB in Amazon EMR 6.10.0 and higher. Earlier releases have a default root volume size of 10 GB.
+ When you launch a cluster with *the latest patch release* of Amazon EMR 5.36 or higher, 6.6 or higher, or 7.0 or higher, Amazon EMR uses the latest Amazon Linux 2023 or Amazon Linux 2 release for the default Amazon EMR AMI. For more information, see [Using the default Amazon Linux AMI for Amazon EMR](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-default-ami.html).    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-6120-release.html)
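
The dynamic allocation guidance in the list above can be sketched as a `spark-defaults` configuration object. This is only an illustration: the executor bounds are placeholder assumptions, and `check_executor_bounds` is a hypothetical helper that checks that the minimum does not exceed the maximum.

```python
import json

# Executor bounds below are placeholder assumptions; choose values for your workload.
spark_defaults = {
    "Classification": "spark-defaults",
    "Properties": {
        "spark.dynamicAllocation.enabled": "true",
        "spark.dynamicAllocation.minExecutors": "1",
        "spark.dynamicAllocation.maxExecutors": "50",
    },
}


def check_executor_bounds(config):
    """Return True if the min/max executor settings are consistent."""
    props = config["Properties"]
    low = int(props["spark.dynamicAllocation.minExecutors"])
    high = int(props["spark.dynamicAllocation.maxExecutors"])
    return 0 <= low <= high


if not check_executor_bounds(spark_defaults):
    raise ValueError("minExecutors must not exceed maxExecutors")

print(json.dumps([spark_defaults], indent=2))
```

Keeping `spark.dynamicAllocation.enabled` at `true` and bounding the executor count is one way to limit how many executors a job can request while nodes are being decommissioned.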

## 6.12.0 default Java versions
<a name="emr-6120-jdk"></a>

Amazon EMR releases 6.12.0 and higher support all applications with Amazon Corretto 8 by default, except for Trino. For Trino, Amazon EMR supports Amazon Corretto 17 by default starting with Amazon EMR release 6.9.0. Amazon EMR also supports some applications with Amazon Corretto 11 and 17. Those applications are listed in the following table. If you want to change the default JVM on your cluster, follow the instructions in [Configure applications to use a specific Java Virtual Machine](configuring-java8.md) for each application that runs on the cluster. You can only use one Java runtime version for a cluster. Amazon EMR doesn't support running different nodes or applications on different runtime versions on the same cluster.

While Amazon EMR supports both Amazon Corretto 11 and 17 on Apache Spark, Apache Hadoop, and Apache Hive, performance might regress for some workloads when you use these versions of Corretto. We recommend that you test your workloads before you change defaults.

The following table shows the default Java versions for applications in Amazon EMR 6.12.0:


| Application | Java / Amazon Corretto version (default is bold) | 
| --- | --- | 
| Delta | 17, 11, **8** | 
| Flink | 11, **8** | 
| Ganglia | **8** | 
| HBase | 11, **8** | 
| HCatalog | 17, 11, **8** | 
| Hadoop | 17, 11, **8** | 
| Hive | 17, 11, **8** | 
| Hudi | 17, 11, **8** | 
| Iceberg | 17, 11, **8** | 
| Livy | 17, 11, **8** | 
| Oozie | 17, 11, **8** | 
| Phoenix | **8** | 
| PrestoDB | **8** | 
| Spark | 17, 11, **8** | 
| Spark RAPIDS | 17, 11, **8** | 
| Sqoop | **8** | 
| Tez | 17, 11, **8** | 
| Trino | **17** | 
| Zeppelin | **8** | 
| Pig | **8** | 
| Zookeeper | **8** | 

## 6.12.0 component versions
<a name="emr-6120-components"></a>

The components that Amazon EMR installs with this release are listed below. Some are installed as part of big-data application packages. Others are unique to Amazon EMR and installed for system processes and features. These typically start with `emr` or `aws`. Big-data application packages in the most recent Amazon EMR release are usually the latest version found in the community. We make community releases available in Amazon EMR as quickly as possible.

Some components in Amazon EMR differ from community versions. These components have a version label in the form `CommunityVersion-amzn-EmrVersion`. The `EmrVersion` starts at 0. For example, if an open source community component named `myapp-component` with version 2.2 has been modified three times for inclusion in different Amazon EMR releases, its release version is listed as `2.2-amzn-2`.
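The label format described above can be split mechanically. The following is a minimal sketch, not part of any Amazon EMR tooling: the `ComponentVersion` type and `parse_component_version` function are hypothetical names introduced for illustration. The Amazon EMR revision is kept as a string because some labels in the application version tables carry a dotted revision, such as `2.4.15-amzn-1.1`.

```python
import re
from typing import NamedTuple, Optional


class ComponentVersion(NamedTuple):
    """Hypothetical container for a parsed component version label."""
    community: str               # upstream community version, e.g. "3.3.3"
    emr_revision: Optional[str]  # Amazon EMR revision, or None if unmodified


# CommunityVersion-amzn-EmrVersion, where EmrVersion starts at 0.
_LABEL = re.compile(r"^(?P<community>.+?)-amzn-(?P<emr>\d+(?:\.\d+)*)$")


def parse_component_version(label: str) -> ComponentVersion:
    """Split a version label into its community and Amazon EMR parts."""
    match = _LABEL.match(label)
    if match is None:
        # No -amzn- suffix: the component matches the community version.
        return ComponentVersion(label, None)
    return ComponentVersion(match["community"], match["emr"])


print(parse_component_version("3.3.3-amzn-4"))
print(parse_component_version("1.17.0"))
```

A label without the `-amzn-` suffix, such as `1.17.0` or `0.7.1-incubating`, is returned unchanged with no Amazon EMR revision.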


| Component | Version | Description | 
| --- | --- | --- | 
| aws-sagemaker-spark-sdk | 1.4.2 | Amazon SageMaker Spark SDK | 
| delta | 2.4.0 | Delta Lake is an open table format for huge analytic datasets | 
| delta-standalone-connectors | 0.6.0 | Delta Connectors provide different runtimes to integrate Delta Lake with engines like Flink, Hive and Presto. | 
| emr-ddb | 5.1.0 | Amazon DynamoDB connector for Hadoop ecosystem applications. | 
| emr-goodies | 3.5.0 | Extra convenience libraries for the Hadoop ecosystem. | 
| emr-kinesis | 3.9.0 | Amazon Kinesis connector for Hadoop ecosystem applications. | 
| emr-notebook-env | 1.7.0 | Conda environment for EMR Notebooks that includes Jupyter Enterprise Gateway | 
| emr-s3-dist-cp | 2.26.0 | Distributed copy application optimized for Amazon S3. | 
| emr-s3-select | 2.5.0 | EMR S3Select Connector | 
| emr-wal-cli | 1.1.0 | CLI used for emrwal list/deletion. | 
| emrfs | 2.57.0 | Amazon S3 connector for Hadoop ecosystem applications. | 
| flink-client | 1.17.0 | Apache Flink command line client scripts and applications. | 
| flink-jobmanager-config | 1.17.0 | Managing resources on EMR nodes for Apache Flink JobManager. | 
| ganglia-monitor | 3.7.2 | Embedded Ganglia agent for Hadoop ecosystem applications along with the Ganglia monitoring agent. | 
| ganglia-metadata-collector | 3.7.2 | Ganglia metadata collector for aggregating metrics from Ganglia monitoring agents. | 
| ganglia-web | 3.7.1 | Web application for viewing metrics collected by the Ganglia metadata collector. | 
| hadoop-client | 3.3.3-amzn-4 | Hadoop command-line clients such as 'hdfs', 'hadoop', or 'yarn'. | 
| hadoop-hdfs-datanode | 3.3.3-amzn-4 | HDFS node-level service for storing blocks. | 
| hadoop-hdfs-library | 3.3.3-amzn-4 | HDFS command-line client and library | 
| hadoop-hdfs-namenode | 3.3.3-amzn-4 | HDFS service for tracking file names and block locations. | 
| hadoop-hdfs-journalnode | 3.3.3-amzn-4 | HDFS service for managing the Hadoop filesystem journal on HA clusters. | 
| hadoop-httpfs-server | 3.3.3-amzn-4 | HTTP endpoint for HDFS operations. | 
| hadoop-kms-server | 3.3.3-amzn-4 | Cryptographic key management server based on Hadoop's KeyProvider API. | 
| hadoop-mapred | 3.3.3-amzn-4 | MapReduce execution engine libraries for running a MapReduce application. | 
| hadoop-yarn-nodemanager | 3.3.3-amzn-4 | YARN service for managing containers on an individual node. | 
| hadoop-yarn-resourcemanager | 3.3.3-amzn-4 | YARN service for allocating and managing cluster resources and distributed applications. | 
| hadoop-yarn-timeline-server | 3.3.3-amzn-4 | Service for retrieving current and historical information for YARN applications. | 
| hbase-hmaster | 2.4.17-amzn-0 | Service for an HBase cluster responsible for coordination of Regions and execution of administrative commands. | 
| hbase-region-server | 2.4.17-amzn-0 | Service for serving one or more HBase regions. | 
| hbase-client | 2.4.17-amzn-0 | HBase command-line client. | 
| hbase-rest-server | 2.4.17-amzn-0 | Service providing a RESTful HTTP endpoint for HBase. | 
| hbase-thrift-server | 2.4.17-amzn-0 | Service providing a Thrift endpoint to HBase. | 
| hbase-operator-tools | 2.4.17-amzn-0 | Repair tool for Apache HBase clusters. | 
| hcatalog-client | 3.1.3-amzn-5 | The 'hcat' command line client for manipulating hcatalog-server. | 
| hcatalog-server | 3.1.3-amzn-5 | Service providing HCatalog, a table and storage management layer for distributed applications. | 
| hcatalog-webhcat-server | 3.1.3-amzn-5 | HTTP endpoint providing a REST interface to HCatalog. | 
| hive-client | 3.1.3-amzn-5 | Hive command line client. | 
| hive-hbase | 3.1.3-amzn-5 | Hive-hbase client. | 
| hive-metastore-server | 3.1.3-amzn-5 | Service for accessing the Hive metastore, a semantic repository storing metadata for SQL on Hadoop operations. | 
| hive-server2 | 3.1.3-amzn-5 | Service for accepting Hive queries as web requests. | 
| hudi | 0.13.1-amzn-0 | Incremental processing framework to power data pipeline at low latency and high efficiency. | 
| hudi-presto | 0.13.1-amzn-0 | Bundle library for running Presto with Hudi. | 
| hudi-trino | 0.13.1-amzn-0 | Bundle library for running Trino with Hudi. | 
| hudi-spark | 0.13.1-amzn-0 | Bundle library for running Spark with Hudi. | 
| hue-server | 4.11.0 | Web application for analyzing data using Hadoop ecosystem applications | 
| iceberg | 1.3.0-amzn-0 | Apache Iceberg is an open table format for huge analytic datasets | 
| jupyterhub | 1.4.1 | Multi-user server for Jupyter notebooks | 
| livy-server | 0.7.1-incubating | REST interface for interacting with Apache Spark | 
| nginx | 1.12.1 | nginx [engine x] is an HTTP and reverse proxy server | 
| mxnet | 1.9.1 | A flexible, scalable, and efficient library for deep learning. | 
| mariadb-server | 5.5.68+ | MariaDB database server. | 
| nvidia-cuda | 11.8.0 | Nvidia drivers and Cuda toolkit | 
| oozie-client | 5.2.1 | Oozie command-line client. | 
| oozie-server | 5.2.1 | Service for accepting Oozie workflow requests. | 
| opencv | 4.7.0 | Open Source Computer Vision Library. | 
| phoenix-library | 5.1.3 | The phoenix libraries for server and client | 
| phoenix-connectors | 5.1.3 | Apache Phoenix-Connectors for Spark-3 | 
| phoenix-query-server | 6.0.0 | A lightweight server providing JDBC access as well as Protocol Buffers and JSON format access to the Avatica API | 
| presto-coordinator | 0.281-amzn-0 | Service for accepting queries and managing query execution among presto-workers. | 
| presto-worker | 0.281-amzn-0 | Service for executing pieces of a query. | 
| presto-client | 0.281-amzn-0 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| trino-coordinator | 414-amzn-0 | Service for accepting queries and managing query execution among trino-workers. | 
| trino-worker | 414-amzn-0 | Service for executing pieces of a query. | 
| trino-client | 414-amzn-0 | Trino command-line client which is installed on an HA cluster's stand-by masters where Trino server is not started. | 
| pig-client | 0.17.0 | Pig command-line client. | 
| r | 4.0.2 | The R Project for Statistical Computing | 
| ranger-kms-server | 2.0.0 | Apache Ranger Key Management System | 
| spark-client | 3.4.0-amzn-0 | Spark command-line clients. | 
| spark-history-server | 3.4.0-amzn-0 | Web UI for viewing logged events for the lifetime of a completed Spark application. | 
| spark-on-yarn | 3.4.0-amzn-0 | In-memory execution engine for YARN. | 
| spark-yarn-slave | 3.4.0-amzn-0 | Apache Spark libraries needed by YARN slaves. | 
| spark-rapids | 23.06.0-amzn-0 | Nvidia Spark RAPIDS plugin that accelerates Apache Spark with GPUs. | 
| sqoop-client | 1.4.7 | Apache Sqoop command-line client. | 
| tensorflow | 2.11.0 | TensorFlow open source software library for high performance numerical computation. | 
| tez-on-yarn | 0.10.2-amzn-3 | The tez YARN application and libraries. | 
| tez-on-worker | 0.10.2-amzn-3 | The tez YARN application and libraries for worker nodes. | 
| webserver | 2.4.41+ | Apache HTTP server. | 
| zeppelin-server | 0.10.1 | Web-based notebook that enables interactive data analytics. | 
| zookeeper-server | 3.5.10 | Centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. | 
| zookeeper-client | 3.5.10 | ZooKeeper command line client. | 

## 6.12.0 configuration classifications
<a name="emr-6120-class"></a>

Configuration classifications allow you to customize applications. These often correspond to a configuration XML file for the application, such as `hive-site.xml`. For more information, see [Configure applications](emr-configure-apps.md).

Reconfiguration actions occur when you specify a configuration for instance groups in a running cluster. Amazon EMR only initiates reconfiguration actions for the classifications that you modify. For more information, see [Reconfigure an instance group in a running cluster](emr-configure-apps-running-cluster.md).
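As an illustration, a configuration is supplied as a list of objects, each with a `Classification` name and a `Properties` map of string values. The following minimal sketch builds that structure for the `spark-defaults` classification and prints it as JSON. The `build_configurations` helper is a hypothetical name, and the executor limits are placeholder values rather than recommendations.

```python
import json


def build_configurations(classification_properties):
    """Convert {classification: {property: value}} into a list of
    Classification/Properties objects. Values are stored as strings."""
    return [
        {
            "Classification": name,
            "Properties": {key: str(value) for key, value in props.items()},
        }
        for name, props in classification_properties.items()
    ]


configurations = build_configurations({
    "spark-defaults": {
        "spark.dynamicAllocation.enabled": "true",
        # Placeholder limits for illustration only.
        "spark.dynamicAllocation.minExecutors": 1,
        "spark.dynamicAllocation.maxExecutors": 20,
    },
})

print(json.dumps(configurations, indent=2))
```

According to the classifications table for this release, modifying `spark-defaults` on a running cluster restarts the Spark history server and the Spark thrift server, so it is worth checking the reconfiguration action for a classification before you change it.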


**emr-6.12.0 classifications**  

| Classifications | Description | Reconfiguration Actions | 
| --- | --- | --- | 
| capacity-scheduler | Change values in Hadoop's capacity-scheduler.xml file. | Restarts the ResourceManager service. | 
| container-executor | Change values in Hadoop YARN's container-executor.cfg file. | Not available. | 
| container-log4j | Change values in Hadoop YARN's container-log4j.properties file. | Not available. | 
| core-site | Change values in Hadoop's core-site.xml file. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Ranger KMS, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| docker-conf | Change docker related settings. | Not available. | 
| emrfs-site | Change EMRFS settings. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts HBaseRegionserver, HBaseMaster, HBaseThrift, HBaseRest, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| flink-conf | Change flink-conf.yaml settings. | Restarts Flink history server. | 
| flink-log4j | Change Flink log4j.properties settings. | Restarts Flink history server. | 
| flink-log4j-session | Change Flink log4j-session.properties settings for Kubernetes/Yarn session. | Restarts Flink history server. | 
| flink-log4j-cli | Change Flink log4j-cli.properties settings. | Restarts Flink history server. | 
| hadoop-env | Change values in the Hadoop environment for all Hadoop components. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts PhoenixQueryserver, HiveServer2, Hive MetaStore, and MapReduce-HistoryServer. | 
| hadoop-log4j | Change values in Hadoop's log4j.properties file. | Restarts the Hadoop HDFS services SecondaryNamenode, Datanode, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| hadoop-ssl-server | Change hadoop ssl server configuration | Not available. | 
| hadoop-ssl-client | Change hadoop ssl client configuration | Not available. | 
| hbase | Amazon EMR-curated settings for Apache HBase. | Custom EMR specific property. Sets emrfs-site and hbase-site configs. See those for their associated restarts. | 
| hbase-env | Change values in HBase's environment. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-log4j | Change values in HBase's hbase-log4j.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-metrics | Change values in HBase's hadoop-metrics2-hbase.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-policy | Change values in HBase's hbase-policy.xml file. | Not available. | 
| hbase-site | Change values in HBase's hbase-site.xml file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. Additionally restarts Phoenix QueryServer. | 
| hdfs-encryption-zones | Configure HDFS encryption zones. | This classification should not be reconfigured. | 
| hdfs-env | Change values in the HDFS environment. | Restarts Hadoop HDFS services Namenode, Datanode, and ZKFC. | 
| hdfs-site | Change values in HDFS's hdfs-site.xml. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Additionally restarts Hadoop Httpfs. | 
| hcatalog-env | Change values in HCatalog's environment. | Restarts Hive HCatalog Server. | 
| hcatalog-server-jndi | Change values in HCatalog's jndi.properties. | Restarts Hive HCatalog Server. | 
| hcatalog-server-proto-hive-site | Change values in HCatalog's proto-hive-site.xml. | Restarts Hive HCatalog Server. | 
| hcatalog-webhcat-env | Change values in HCatalog WebHCat's environment. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-log4j2 | Change values in HCatalog WebHCat's log4j2.properties. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-site | Change values in HCatalog WebHCat's webhcat-site.xml file. | Restarts Hive WebHCat server. | 
| hive | Amazon EMR-curated settings for Apache Hive. | Sets configurations to launch Hive LLAP service. | 
| hive-beeline-log4j2 | Change values in Hive's beeline-log4j2.properties file. | Not available. | 
| hive-parquet-logging | Change values in Hive's parquet-logging.properties file. | Not available. | 
| hive-env | Change values in the Hive environment. | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. | 
| hive-exec-log4j2 | Change values in Hive's hive-exec-log4j2.properties file. | Not available. | 
| hive-llap-daemon-log4j2 | Change values in Hive's llap-daemon-log4j2.properties file. | Not available. | 
| hive-log4j2 | Change values in Hive's hive-log4j2.properties file. | Not available. | 
| hive-site | Change values in Hive's hive-site.xml file | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. Also restarts Oozie and Zeppelin. | 
| hiveserver2-site | Change values in Hive Server2's hiveserver2-site.xml file | Not available. | 
| hue-ini | Change values in Hue's ini file | Restarts Hue. Also activates Hue config override CLI commands to pick up new configurations. | 
| httpfs-env | Change values in the HTTPFS environment. | Restarts Hadoop Httpfs service. | 
| httpfs-site | Change values in Hadoop's httpfs-site.xml file. | Restarts Hadoop Httpfs service. | 
| hadoop-kms-acls | Change values in Hadoop's kms-acls.xml file. | Not available. | 
| hadoop-kms-env | Change values in the Hadoop KMS environment. | Restarts Hadoop-KMS service. | 
| hadoop-kms-java-home | Change Hadoop's KMS java home | Not available. | 
| hadoop-kms-log4j | Change values in Hadoop's kms-log4j.properties file. | Not available. | 
| hadoop-kms-site | Change values in Hadoop's kms-site.xml file. | Restarts Hadoop-KMS and Ranger-KMS service. | 
| hudi-env | Change values in the Hudi environment. | Not available. | 
| hudi-defaults | Change values in Hudi's hudi-defaults.conf file. | Not available. | 
| iceberg-defaults | Change values in Iceberg's iceberg-defaults.conf file. | Not available. | 
| delta-defaults | Change values in Delta's delta-defaults.conf file. | Not available. | 
| jupyter-notebook-conf | Change values in Jupyter Notebook's jupyter_notebook_config.py file. | Not available. | 
| jupyter-hub-conf | Change values in JupyterHub's jupyterhub_config.py file. | Not available. | 
| jupyter-s3-conf | Configure Jupyter Notebook S3 persistence. | Not available. | 
| jupyter-sparkmagic-conf | Change values in Sparkmagic's config.json file. | Not available. | 
| livy-conf | Change values in Livy's livy.conf file. | Restarts Livy Server. | 
| livy-env | Change values in the Livy environment. | Restarts Livy Server. | 
| livy-log4j2 | Change Livy log4j2.properties settings. | Restarts Livy Server. | 
| mapred-env | Change values in the MapReduce application's environment. | Restarts Hadoop MapReduce-HistoryServer. | 
| mapred-site | Change values in the MapReduce application's mapred-site.xml file. | Restarts Hadoop MapReduce-HistoryServer. | 
| oozie-env | Change values in Oozie's environment. | Restarts Oozie. | 
| oozie-log4j | Change values in Oozie's oozie-log4j.properties file. | Restarts Oozie. | 
| oozie-site | Change values in Oozie's oozie-site.xml file. | Restarts Oozie. | 
| phoenix-hbase-metrics | Change values in Phoenix's hadoop-metrics2-hbase.properties file. | Not available. | 
| phoenix-hbase-site | Change values in Phoenix's hbase-site.xml file. | Not available. | 
| phoenix-log4j | Change values in Phoenix's log4j.properties file. | Restarts Phoenix-QueryServer. | 
| phoenix-metrics | Change values in Phoenix's hadoop-metrics2-phoenix.properties file. | Not available. | 
| pig-env | Change values in the Pig environment. | Not available. | 
| pig-properties | Change values in Pig's pig.properties file. | Restarts Oozie. | 
| pig-log4j | Change values in Pig's log4j.properties file. | Not available. | 
| presto-log | Change values in Presto's log.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-config | Change values in Presto's config.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-password-authenticator | Change values in Presto's password-authenticator.properties file. | Not available. | 
| presto-env | Change values in Presto's presto-env.sh file. | Restarts Presto-Server (for PrestoDB) | 
| presto-node | Change values in Presto's node.properties file. | Not available. | 
| presto-connector-blackhole | Change values in Presto's blackhole.properties file. | Not available. | 
| presto-connector-cassandra | Change values in Presto's cassandra.properties file. | Not available. | 
| presto-connector-hive | Change values in Presto's hive.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-jmx | Change values in Presto's jmx.properties file. | Not available. | 
| presto-connector-kafka | Change values in Presto's kafka.properties file. | Not available. | 
| presto-connector-lakeformation | Change values in Presto's lakeformation.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-localfile | Change values in Presto's localfile.properties file. | Not available. | 
| presto-connector-memory | Change values in Presto's memory.properties file. | Not available. | 
| presto-connector-mongodb | Change values in Presto's mongodb.properties file. | Not available. | 
| presto-connector-mysql | Change values in Presto's mysql.properties file. | Not available. | 
| presto-connector-postgresql | Change values in Presto's postgresql.properties file. | Not available. | 
| presto-connector-raptor | Change values in Presto's raptor.properties file. | Not available. | 
| presto-connector-redis | Change values in Presto's redis.properties file. | Not available. | 
| presto-connector-redshift | Change values in Presto's redshift.properties file. | Not available. | 
| presto-connector-tpch | Change values in Presto's tpch.properties file. | Not available. | 
| presto-connector-tpcds | Change values in Presto's tpcds.properties file. | Not available. | 
| trino-log | Change values in Trino's log.properties file. | Restarts Trino-Server (for Trino) | 
| trino-config | Change values in Trino's config.properties file. | Restarts Trino-Server (for Trino) | 
| trino-password-authenticator | Change values in Trino's password-authenticator.properties file. | Restarts Trino-Server (for Trino) | 
| trino-env | Change values in Trino's trino-env.sh file. | Restarts Trino-Server (for Trino) | 
| trino-node | Change values in Trino's node.properties file. | Not available. | 
| trino-connector-blackhole | Change values in Trino's blackhole.properties file. | Not available. | 
| trino-connector-cassandra | Change values in Trino's cassandra.properties file. | Not available. | 
| trino-connector-delta | Change values in Trino's delta.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-hive | Change values in Trino's hive.properties file. | Restarts Trino-Server (for Trino) | 
| trino-exchange-manager | Change values in Trino's exchange-manager.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-iceberg | Change values in Trino's iceberg.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-hudi | Change values in Trino's hudi.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-jmx | Change values in Trino's jmx.properties file. | Not available. | 
| trino-connector-kafka | Change values in Trino's kafka.properties file. | Not available. | 
| trino-connector-localfile | Change values in Trino's localfile.properties file. | Not available. | 
| trino-connector-memory | Change values in Trino's memory.properties file. | Not available. | 
| trino-connector-mongodb | Change values in Trino's mongodb.properties file. | Not available. | 
| trino-connector-mysql | Change values in Trino's mysql.properties file. | Not available. | 
| trino-connector-postgresql | Change values in Trino's postgresql.properties file. | Not available. | 
| trino-connector-raptor | Change values in Trino's raptor.properties file. | Not available. | 
| trino-connector-redis | Change values in Trino's redis.properties file. | Not available. | 
| trino-connector-redshift | Change values in Trino's redshift.properties file. | Not available. | 
| trino-connector-tpch | Change values in Trino's tpch.properties file. | Not available. | 
| trino-connector-tpcds | Change values in Trino's tpcds.properties file. | Not available. | 
| ranger-kms-dbks-site | Change values in dbks-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-site | Change values in ranger-kms-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-env | Change values in the Ranger KMS environment. | Restarts Ranger KMS Server. | 
| ranger-kms-logback | Change values in kms-logback.xml file of Ranger KMS. | Not available. | 
| ranger-kms-db-ca | Change values for CA file on S3 for MySQL SSL connection with Ranger KMS. | Not available. | 
| spark | Amazon EMR-curated settings for Apache Spark. | This property modifies spark-defaults. See actions there. | 
| spark-defaults | Change values in Spark's spark-defaults.conf file. | Restarts Spark history server and Spark thrift server. | 
| spark-env | Change values in the Spark environment. | Restarts Spark history server and Spark thrift server. | 
| spark-hive-site | Change values in Spark's hive-site.xml file | Not available. | 
| spark-log4j2 | Change values in Spark's log4j2.properties file. | Restarts Spark history server and Spark thrift server. | 
| spark-metrics | Change values in Spark's metrics.properties file. | Restarts Spark history server and Spark thrift server. | 
| sqoop-env | Change values in Sqoop's environment. | Not available. | 
| sqoop-oraoop-site | Change values in Sqoop OraOop's oraoop-site.xml file. | Not available. | 
| sqoop-site | Change values in Sqoop's sqoop-site.xml file. | Not available. | 
| tez-site | Change values in Tez's tez-site.xml file. | Restarts Oozie and HiveServer2. | 
| yarn-env | Change values in the YARN environment. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts MapReduce-HistoryServer. | 
| yarn-site | Change values in YARN's yarn-site.xml file. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Livy Server and MapReduce-HistoryServer. | 
| zeppelin-env | Change values in the Zeppelin environment. | Restarts Zeppelin. | 
| zeppelin-site | Change configuration settings in zeppelin-site.xml. | Restarts Zeppelin. | 
| zookeeper-config | Change values in ZooKeeper's zoo.cfg file. | Restarts Zookeeper server. | 
| zookeeper-log4j | Change values in ZooKeeper's log4j.properties file. | Restarts Zookeeper server. | 

## 6.12.0 change log
<a name="6120-changelog"></a>


**Change log for 6.12.0 release and release notes**  

| Date | Event | Description | 
| --- | --- | --- | 
| 2025-09-03 | Docs revision | Amazon EMR 6.12.0 release notes added known issue | 
| 2023-07-27 | Update documentation | Update the Java options for 6.12 and add Oozie tutorial to update JVM | 
| 2023-07-21 | Deployment complete | Amazon EMR 6.12.0 fully deployed to all [supported Regions](https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/) | 
| 2023-07-21 | Docs publication | Amazon EMR 6.12.0 release notes first published | 
| 2023-07-12 | Initial release | Amazon EMR 6.12.0 first deployed to initial commercial Regions | 

# Amazon EMR release 6.11.1
<a name="emr-6111-release"></a>

## 6.11.1 application versions
<a name="emr-6111-app-versions"></a>

This release includes the following applications: [Delta](https://delta.io/), [Flink](https://flink.apache.org/), [Ganglia](http://ganglia.info), [HBase](http://hbase.apache.org/), [HCatalog](https://cwiki.apache.org/confluence/display/Hive/HCatalog), [Hadoop](http://hadoop.apache.org/docs/current/), [Hive](http://hive.apache.org/), [Hudi](https://hudi.apache.org), [Hue](http://gethue.com/), [Iceberg](https://iceberg.apache.org/), [JupyterEnterpriseGateway](https://jupyter-enterprise-gateway.readthedocs.io/en/latest/), [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/#), [Livy](https://livy.incubator.apache.org/), [MXNet](https://mxnet.incubator.apache.org/), [Oozie](http://oozie.apache.org/), [Phoenix](https://phoenix.apache.org/), [Pig](http://pig.apache.org/), [Presto](https://prestodb.io/), [Spark](https://spark.apache.org/docs/latest/), [Sqoop](http://sqoop.apache.org/), [TensorFlow](https://www.tensorflow.org/), [Tez](https://tez.apache.org/), [Trino](https://trino.io/), [Zeppelin](https://zeppelin.incubator.apache.org/), and [ZooKeeper](https://zookeeper.apache.org).

The table below lists the application versions available in this release of Amazon EMR and the application versions in the preceding three Amazon EMR releases (when applicable).

For a comprehensive history of application versions for each release of Amazon EMR, see the following topics:
+ [Application versions in Amazon EMR 7.x releases](emr-release-app-versions-7.x.md)
+ [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md)
+ [Application versions in Amazon EMR 5.x releases](emr-release-app-versions-5.x.md)
+ [Application versions in Amazon EMR 4.x releases](emr-release-app-versions-4.x.md)


**Application version information**  

|  | emr-6.11.1 | emr-6.11.0 | emr-6.10.1 | emr-6.10.0 | 
| --- | --- | --- | --- | --- | 
| AWS SDK for Java | 1.12.446 | 1.12.446 | 1.12.397 | 1.12.397 | 
| Python | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 
| Scala | 2.12.15 | 2.12.15 | 2.12.15 | 2.12.15 | 
| AmazonCloudWatchAgent |  -  |  -  |  -  |  -  | 
| Delta | 2.2.0 | 2.2.0 | 2.2.0 | 2.2.0 | 
| Flink | 1.16.0 | 1.16.0 | 1.16.0 | 1.16.0 | 
| Ganglia | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 
| HBase | 2.4.15-amzn-1.1 | 2.4.15-amzn-1 | 2.4.15-amzn-0.1 | 2.4.15-amzn-0 | 
| HCatalog | 3.1.3-amzn-4.1 | 3.1.3-amzn-4 | 3.1.3-amzn-3.1 | 3.1.3-amzn-3 | 
| Hadoop | 3.3.3-amzn-3.1 | 3.3.3-amzn-3 | 3.3.3-amzn-2.1 | 3.3.3-amzn-2 | 
| Hive | 3.1.3-amzn-4.1 | 3.1.3-amzn-4 | 3.1.3-amzn-3.1 | 3.1.3-amzn-3 | 
| Hudi | 0.13.0-amzn-0 | 0.13.0-amzn-0 | 0.12.2-amzn-0 | 0.12.2-amzn-0 | 
| Hue | 4.11.0 | 4.11.0 | 4.10.0 | 4.10.0 | 
| Iceberg | 1.2.0-amzn-0 | 1.2.0-amzn-0 | 1.1.0-amzn-0 | 1.1.0-amzn-0 | 
| JupyterEnterpriseGateway | 2.6.0 | 2.6.0 | 2.6.0 | 2.6.0 | 
| JupyterHub | 1.4.1 | 1.4.1 | 1.5.0 | 1.5.0 | 
| Livy | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 
| MXNet | 1.9.1 | 1.9.1 | 1.9.1 | 1.9.1 | 
| Mahout |  -  |  -  |  -  |  -  | 
| Oozie | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 
| Phoenix | 5.1.2 | 5.1.2 | 5.1.2 | 5.1.2 | 
| Pig | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 
| Presto | 0.279-amzn-0 | 0.279-amzn-0 | 0.278.1-amzn-0 | 0.278.1-amzn-0 | 
| Spark | 3.3.2-amzn-0.1 | 3.3.2-amzn-0 | 3.3.1-amzn-0.1 | 3.3.1-amzn-0 | 
| Sqoop | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 
| TensorFlow | 2.11.0 | 2.11.0 | 2.11.0 | 2.11.0 | 
| Tez | 0.10.2-amzn-2.1 | 0.10.2-amzn-2 | 0.10.2-amzn-1.1 | 0.10.2-amzn-1 | 
| Trino (PrestoSQL) | 410-amzn-0 | 410-amzn-0 | 403-amzn-0 | 403-amzn-0 | 
| Zeppelin | 0.10.1 | 0.10.1 | 0.10.1 | 0.10.1 | 
| ZooKeeper | 3.5.10 | 3.5.10 | 3.5.10 | 3.5.10 | 

## 6.11.1 release notes
<a name="emr-6111-relnotes"></a>

The following release notes include information for Amazon EMR release 6.11.1. Changes are relative to 6.11.0. For information on the release timeline, see the [6.11.1 change log](#6111-changelog).

**Changes, enhancements, and resolved issues**
+ Starting with Spark 3.3.1 (supported in Amazon EMR releases 6.10 and higher), all executors on a decommissioning host are set to a new `ExecutorState` called *DECOMMISSIONING*. YARN can't allocate tasks to executors that are being decommissioned, so it requests new executors, if needed, for the tasks being executed. As a result, if you disable Spark DRA while using EMR Managed Scaling, EMR Auto Scaling, or any custom scaling mechanism on EMR-EC2 clusters, YARN may request the maximum permissible number of executors for each job. To avoid this issue, leave the `spark.dynamicAllocation.enabled` property set to `true` (the default) when you use this combination of features. You can also set the `spark.dynamicAllocation.minExecutors` and `spark.dynamicAllocation.maxExecutors` properties for your Spark jobs to restrict the number of executors allocated during the job's execution.
+ Due to lock contention, a node can enter into a deadlock if it's added or removed at the same time that it attempts to decommission. As a result, the Hadoop Resource Manager (YARN) becomes unresponsive, and affects all the incoming and currently-running containers.
+ This release includes a change that allows high-availability clusters to recover from failed state after restart.
+ This release includes security fixes for Hue and HBase.
+ This release fixes an issue where clusters that are running workloads on Spark with Amazon EMR might silently receive incorrect results with `contains`, `startsWith`, `endsWith`, and `like`. This issue occurs when you use the expressions on partitioned fields that have metadata in the Amazon EMR Hive3 Metastore Server (HMS).
+ This release fixes an issue with throttling on the Glue side when there are no user-defined functions (UDF).
+ This release fixes an issue where the node log aggregation service deletes container logs before log pusher can push them to S3 during YARN decommissioning.
+ This release fixes an issue with FairShare Scheduler metrics when Node Label is enabled for Hadoop.
+ This release fixes an issue that impacted Spark performance when you set a default `true` value for the `spark.yarn.heterogeneousExecutors.enabled` config in `spark-defaults.conf`.
+ This release fixes an issue with Reduce Task failing to read shuffle data. The issue caused Hive query failures with a corrupted memory error.
+ This release adds a new retry mechanism to the cluster scaling workflow for EMR clusters that run Presto or Trino. This improvement reduces the risk that cluster resizing will indefinitely stall due to a single failed resize operation. It also improves cluster utilization, because your cluster scales up and down faster.
+ This release improves cluster scale-down logic so that your cluster doesn't attempt a scale-down of core nodes below the HDFS replication factor setting for the cluster. This aligns with your data redundancy requirements, and reduces the chance that a scaling operation might stall.
+ The log management daemon has been upgraded to identify all logs that are in active use with open file handles on the local instance storage, and the associated processes. This upgrade ensures that Amazon EMR properly deletes the files and reclaims storage space after the logs are archived to Amazon S3.
+ This release includes a log-management daemon enhancement that deletes empty, unused steps directories in the local cluster file system. An excessively large number of empty directories can degrade the performance of Amazon EMR daemons and result in disk over-utilization.
+ When you launch a cluster with *the latest patch release* of Amazon EMR 5.36 or higher, 6.6 or higher, or 7.0 or higher, Amazon EMR uses the latest Amazon Linux 2023 or Amazon Linux 2 release for the default Amazon EMR AMI. For more information, see [Using the default Amazon Linux AMI for Amazon EMR](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-default-ami.html).    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-6111-release.html)
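
As a minimal sketch of the dynamic allocation guidance in the first release note above, the following Python builds a `spark-defaults` configuration classification that keeps `spark.dynamicAllocation.enabled` on and bounds the executor count. The helper name `spark_dra_classification` and the bounds of 2 and 20 are illustrative assumptions, not recommended values.

```python
import json


def spark_dra_classification(min_executors, max_executors):
    """Build a spark-defaults classification that keeps DRA enabled.

    The function name and the bounds passed in are hypothetical; choose
    limits that fit your own workload.
    """
    if min_executors < 0 or max_executors < min_executors:
        raise ValueError("expected 0 <= min_executors <= max_executors")
    return {
        "Classification": "spark-defaults",
        "Properties": {
            # Property values are written as strings.
            "spark.dynamicAllocation.enabled": "true",
            "spark.dynamicAllocation.minExecutors": str(min_executors),
            "spark.dynamicAllocation.maxExecutors": str(max_executors),
        },
    }


if __name__ == "__main__":
    # Print a configurations list that could be saved to a JSON file.
    print(json.dumps([spark_dra_classification(2, 20)], indent=2))
```

The sketch only produces the configuration document. How you supply it at cluster launch is described in [Configure applications](emr-configure-apps.md).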

## 6.11.1 component versions
<a name="emr-6111-components"></a>

The components that Amazon EMR installs with this release are listed below. Some are installed as part of big-data application packages. Others are unique to Amazon EMR and installed for system processes and features. These typically start with `emr` or `aws`. Big-data application packages in the most recent Amazon EMR release are usually the latest version found in the community. We make community releases available in Amazon EMR as quickly as possible.

Some components in Amazon EMR differ from community versions. These components have a version label in the form `CommunityVersion-amzn-EmrVersion`. The `EmrVersion` starts at 0. For example, if an open source community component named `myapp-component` with version 2.2 has been modified three times for inclusion in different Amazon EMR releases, its release version is listed as `2.2-amzn-2`.
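
The label format can be split programmatically. The sketch below is one possible approach, not an official parser: the function name `split_version_label` is hypothetical, and dotted suffixes such as the `3.1` in `3.3.3-amzn-3.1` are returned as opaque strings because this page doesn't define what they mean.

```python
import re

# Matches CommunityVersion-amzn-EmrVersion, for example 2.2-amzn-2
# or 3.3.3-amzn-3.1 (the EmrVersion may contain dots).
_AMZN_LABEL = re.compile(r"^(?P<community>.+)-amzn-(?P<emr>\d+(?:\.\d+)*)$")


def split_version_label(label):
    """Return (community_version, emr_version).

    emr_version is None when the label has no -amzn- suffix, which is
    the case for components that match the community release.
    """
    match = _AMZN_LABEL.match(label)
    if match is None:
        return label, None
    return match.group("community"), match.group("emr")


if __name__ == "__main__":
    for label in ("2.2-amzn-2", "3.3.3-amzn-3.1", "0.7.1-incubating"):
        print(label, "->", split_version_label(label))
```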


| Component | Version | Description | 
| --- | --- | --- | 
| aws-sagemaker-spark-sdk | 1.4.2 | Amazon SageMaker Spark SDK | 
| delta | 2.2.0 | Delta Lake is an open table format for huge analytic datasets | 
| delta-standalone-connectors | 0.6.0 | Delta Connectors provide different runtimes to integrate Delta Lake with engines like Flink, Hive and Presto. | 
| emr-ddb | 5.1.0 | Amazon DynamoDB connector for Hadoop ecosystem applications. | 
| emr-goodies | 3.4.0 | Extra convenience libraries for the Hadoop ecosystem. | 
| emr-kinesis | 3.8.0 | Amazon Kinesis connector for Hadoop ecosystem applications. | 
| emr-notebook-env | 1.7.0 | Conda environment for EMR notebook that includes Jupyter Enterprise Gateway | 
| emr-s3-dist-cp | 2.25.0 | Distributed copy application optimized for Amazon S3. | 
| emr-s3-select | 2.4.0 | EMR S3Select Connector | 
| emr-wal-cli | 1.1.0 | CLI used for emrwal list/deletion. | 
| emrfs | 2.56.0 | Amazon S3 connector for Hadoop ecosystem applications. | 
| flink-client | 1.16.0 | Apache Flink command line client scripts and applications. | 
| flink-jobmanager-config | 1.16.0 | Managing resources on EMR nodes for Apache Flink JobManager. | 
| ganglia-monitor | 3.7.2 | Embedded Ganglia agent for Hadoop ecosystem applications along with the Ganglia monitoring agent. | 
| ganglia-metadata-collector | 3.7.2 | Ganglia metadata collector for aggregating metrics from Ganglia monitoring agents. | 
| ganglia-web | 3.7.1 | Web application for viewing metrics collected by the Ganglia metadata collector. | 
| hadoop-client | 3.3.3-amzn-3.1 | Hadoop command-line clients such as 'hdfs', 'hadoop', or 'yarn'. | 
| hadoop-hdfs-datanode | 3.3.3-amzn-3.1 | HDFS node-level service for storing blocks. | 
| hadoop-hdfs-library | 3.3.3-amzn-3.1 | HDFS command-line client and library | 
| hadoop-hdfs-namenode | 3.3.3-amzn-3.1 | HDFS service for tracking file names and block locations. | 
| hadoop-hdfs-journalnode | 3.3.3-amzn-3.1 | HDFS service for managing the Hadoop filesystem journal on HA clusters. | 
| hadoop-httpfs-server | 3.3.3-amzn-3.1 | HTTP endpoint for HDFS operations. | 
| hadoop-kms-server | 3.3.3-amzn-3.1 | Cryptographic key management server based on Hadoop's KeyProvider API. | 
| hadoop-mapred | 3.3.3-amzn-3.1 | MapReduce execution engine libraries for running a MapReduce application. | 
| hadoop-yarn-nodemanager | 3.3.3-amzn-3.1 | YARN service for managing containers on an individual node. | 
| hadoop-yarn-resourcemanager | 3.3.3-amzn-3.1 | YARN service for allocating and managing cluster resources and distributed applications. | 
| hadoop-yarn-timeline-server | 3.3.3-amzn-3.1 | Service for retrieving current and historical information for YARN applications. | 
| hbase-hmaster | 2.4.15-amzn-1.1 | Service for an HBase cluster responsible for coordination of Regions and execution of administrative commands. | 
| hbase-region-server | 2.4.15-amzn-1.1 | Service for serving one or more HBase regions. | 
| hbase-client | 2.4.15-amzn-1.1 | HBase command-line client. | 
| hbase-rest-server | 2.4.15-amzn-1.1 | Service providing a RESTful HTTP endpoint for HBase. | 
| hbase-thrift-server | 2.4.15-amzn-1.1 | Service providing a Thrift endpoint to HBase. | 
| hbase-operator-tools | 2.4.15-amzn-1.1 | Repair tool for Apache HBase clusters. | 
| hcatalog-client | 3.1.3-amzn-4.1 | The 'hcat' command line client for manipulating hcatalog-server. | 
| hcatalog-server | 3.1.3-amzn-4.1 | Service providing HCatalog, a table and storage management layer for distributed applications. | 
| hcatalog-webhcat-server | 3.1.3-amzn-4.1 | HTTP endpoint providing a REST interface to HCatalog. | 
| hive-client | 3.1.3-amzn-4.1 | Hive command line client. | 
| hive-hbase | 3.1.3-amzn-4.1 | Hive-hbase client. | 
| hive-metastore-server | 3.1.3-amzn-4.1 | Service for accessing the Hive metastore, a semantic repository storing metadata for SQL on Hadoop operations. | 
| hive-server2 | 3.1.3-amzn-4.1 | Service for accepting Hive queries as web requests. | 
| hudi | 0.13.0-amzn-0 | Incremental processing framework to power data pipeline at low latency and high efficiency. | 
| hudi-presto | 0.13.0-amzn-0 | Bundle library for running Presto with Hudi. | 
| hudi-trino | 0.13.0-amzn-0 | Bundle library for running Trino with Hudi. | 
| hudi-spark | 0.13.0-amzn-0 | Bundle library for running Spark with Hudi. | 
| hue-server | 4.11.0 | Web application for analyzing data using Hadoop ecosystem applications | 
| iceberg | 1.2.0-amzn-0 | Apache Iceberg is an open table format for huge analytic datasets | 
| jupyterhub | 1.4.1 | Multi-user server for Jupyter notebooks | 
| livy-server | 0.7.1-incubating | REST interface for interacting with Apache Spark | 
| nginx | 1.12.1 | nginx [engine x] is an HTTP and reverse proxy server | 
| mxnet | 1.9.1 | A flexible, scalable, and efficient library for deep learning. | 
| mariadb-server | 5.5.68+ | MariaDB database server. | 
| nvidia-cuda | 11.8.0 | Nvidia drivers and Cuda toolkit | 
| oozie-client | 5.2.1 | Oozie command-line client. | 
| oozie-server | 5.2.1 | Service for accepting Oozie workflow requests. | 
| opencv | 4.5.0 | Open Source Computer Vision Library. | 
| phoenix-library | 5.1.2 | The phoenix libraries for server and client | 
| phoenix-connectors | 5.1.2 | Apache Phoenix-Connectors for Spark-3 | 
| phoenix-query-server | 6.0.0 | A light weight server providing JDBC access as well as Protocol Buffers and JSON format access to the Avatica API  | 
| presto-coordinator | 0.279-amzn-0 | Service for accepting queries and managing query execution among presto-workers. | 
| presto-worker | 0.279-amzn-0 | Service for executing pieces of a query. | 
| presto-client | 0.279-amzn-0 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| trino-coordinator | 410-amzn-0 | Service for accepting queries and managing query execution among trino-workers. | 
| trino-worker | 410-amzn-0 | Service for executing pieces of a query. | 
| trino-client | 410-amzn-0 | Trino command-line client which is installed on an HA cluster's stand-by masters where Trino server is not started. | 
| pig-client | 0.17.0 | Pig command-line client. | 
| r | 4.0.2 | The R Project for Statistical Computing | 
| ranger-kms-server | 2.0.0 | Apache Ranger Key Management System | 
| spark-client | 3.3.2-amzn-0.1 | Spark command-line clients. | 
| spark-history-server | 3.3.2-amzn-0.1 | Web UI for viewing logged events for the lifetime of a completed Spark application. | 
| spark-on-yarn | 3.3.2-amzn-0.1 | In-memory execution engine for YARN. | 
| spark-yarn-slave | 3.3.2-amzn-0.1 | Apache Spark libraries needed by YARN slaves. | 
| spark-rapids | 23.02.0-amzn-0 | Nvidia Spark RAPIDS plugin that accelerates Apache Spark with GPUs. | 
| sqoop-client | 1.4.7 | Apache Sqoop command-line client. | 
| tensorflow | 2.11.0 | TensorFlow open source software library for high performance numerical computation. | 
| tez-on-yarn | 0.10.2-amzn-2.1 | The tez YARN application and libraries. | 
| tez-on-worker | 0.10.2-amzn-2.1 | The tez YARN application and libraries for worker nodes. | 
| webserver | 2.4.41+ | Apache HTTP server. | 
| zeppelin-server | 0.10.1 | Web-based notebook that enables interactive data analytics. | 
| zookeeper-server | 3.5.10 | Centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. | 
| zookeeper-client | 3.5.10 | ZooKeeper command line client. | 

## 6.11.1 configuration classifications
<a name="emr-6111-class"></a>

Configuration classifications allow you to customize applications. These often correspond to a configuration XML file for the application, such as `hive-site.xml`. For more information, see [Configure applications](emr-configure-apps.md).

Reconfiguration actions occur when you specify a configuration for instance groups in a running cluster. Amazon EMR only initiates reconfiguration actions for the classifications that you modify. For more information, see [Reconfigure an instance group in a running cluster](emr-configure-apps-running-cluster.md).
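
To illustrate what specifying a configuration for an instance group can look like, here is a minimal sketch that builds one reconfiguration entry in Python. The helper name `reconfiguration_request` and the instance group ID `ig-EXAMPLE` are hypothetical, and the dictionary shape is an assumption about what the EMR instance-group modification request accepts, so verify it against the linked topic before you rely on it.

```python
import json


def reconfiguration_request(instance_group_id, classification, properties):
    """Build one instance-group entry that carries a single classification.

    Only the classification named here is modified, so only its
    reconfiguration actions would be expected to run.
    """
    return {
        "InstanceGroupId": instance_group_id,
        "Configurations": [
            {
                "Classification": classification,
                # Property values are written as strings.
                "Properties": {key: str(value) for key, value in properties.items()},
            }
        ],
    }


if __name__ == "__main__":
    request = reconfiguration_request(
        "ig-EXAMPLE",  # placeholder ID, not a real instance group
        "yarn-site",
        {"yarn.log-aggregation.retain-seconds": 86400},
    )
    print(json.dumps([request], indent=2))
```

According to the table below, a `yarn-site` change like this one restarts the YARN services, Livy Server, and MapReduce-HistoryServer.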


**emr-6.11.1 classifications**  

| Classifications | Description | Reconfiguration Actions | 
| --- | --- | --- | 
| capacity-scheduler | Change values in Hadoop's capacity-scheduler.xml file. | Restarts the ResourceManager service. | 
| container-executor | Change values in Hadoop YARN's container-executor.cfg file. | Not available. | 
| container-log4j | Change values in Hadoop YARN's container-log4j.properties file. | Not available. | 
| core-site | Change values in Hadoop's core-site.xml file. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Ranger KMS, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| docker-conf | Change docker related settings. | Not available. | 
| emrfs-site | Change EMRFS settings. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts HBaseRegionserver, HBaseMaster, HBaseThrift, HBaseRest, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| flink-conf | Change flink-conf.yaml settings. | Restarts Flink history server. | 
| flink-log4j | Change Flink log4j.properties settings. | Restarts Flink history server. | 
| flink-log4j-session | Change Flink log4j-session.properties settings for Kubernetes/Yarn session. | Restarts Flink history server. | 
| flink-log4j-cli | Change Flink log4j-cli.properties settings. | Restarts Flink history server. | 
| hadoop-env | Change values in the Hadoop environment for all Hadoop components. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts PhoenixQueryserver, HiveServer2, Hive MetaStore, and MapReduce-HistoryServer. | 
| hadoop-log4j | Change values in Hadoop's log4j.properties file. | Restarts the Hadoop HDFS services SecondaryNamenode, Datanode, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| hadoop-ssl-server | Change hadoop ssl server configuration | Not available. | 
| hadoop-ssl-client | Change hadoop ssl client configuration | Not available. | 
| hbase | Amazon EMR-curated settings for Apache HBase. | Custom EMR specific property. Sets emrfs-site and hbase-site configs. See those for their associated restarts. | 
| hbase-env | Change values in HBase's environment. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-log4j | Change values in HBase's hbase-log4j.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-metrics | Change values in HBase's hadoop-metrics2-hbase.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-policy | Change values in HBase's hbase-policy.xml file. | Not available. | 
| hbase-site | Change values in HBase's hbase-site.xml file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. Additionally restarts Phoenix QueryServer. | 
| hdfs-encryption-zones | Configure HDFS encryption zones. | This classification should not be reconfigured. | 
| hdfs-env | Change values in the HDFS environment. | Restarts Hadoop HDFS services Namenode, Datanode, and ZKFC. | 
| hdfs-site | Change values in HDFS's hdfs-site.xml. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Additionally restarts Hadoop Httpfs. | 
| hcatalog-env | Change values in HCatalog's environment. | Restarts Hive HCatalog Server. | 
| hcatalog-server-jndi | Change values in HCatalog's jndi.properties. | Restarts Hive HCatalog Server. | 
| hcatalog-server-proto-hive-site | Change values in HCatalog's proto-hive-site.xml. | Restarts Hive HCatalog Server. | 
| hcatalog-webhcat-env | Change values in HCatalog WebHCat's environment. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-log4j2 | Change values in HCatalog WebHCat's log4j2.properties. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-site | Change values in HCatalog WebHCat's webhcat-site.xml file. | Restarts Hive WebHCat server. | 
| hive | Amazon EMR-curated settings for Apache Hive. | Sets configurations to launch Hive LLAP service. | 
| hive-beeline-log4j2 | Change values in Hive's beeline-log4j2.properties file. | Not available. | 
| hive-parquet-logging | Change values in Hive's parquet-logging.properties file. | Not available. | 
| hive-env | Change values in the Hive environment. | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. | 
| hive-exec-log4j2 | Change values in Hive's hive-exec-log4j2.properties file. | Not available. | 
| hive-llap-daemon-log4j2 | Change values in Hive's llap-daemon-log4j2.properties file. | Not available. | 
| hive-log4j2 | Change values in Hive's hive-log4j2.properties file. | Not available. | 
| hive-site | Change values in Hive's hive-site.xml file | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. Also restarts Oozie and Zeppelin. | 
| hiveserver2-site | Change values in Hive Server2's hiveserver2-site.xml file | Not available. | 
| hue-ini | Change values in Hue's ini file | Restarts Hue. Also activates Hue config override CLI commands to pick up new configurations. | 
| httpfs-env | Change values in the HTTPFS environment. | Restarts Hadoop Httpfs service. | 
| httpfs-site | Change values in Hadoop's httpfs-site.xml file. | Restarts Hadoop Httpfs service. | 
| hadoop-kms-acls | Change values in Hadoop's kms-acls.xml file. | Not available. | 
| hadoop-kms-env | Change values in the Hadoop KMS environment. | Restarts Hadoop-KMS service. | 
| hadoop-kms-log4j | Change values in Hadoop's kms-log4j.properties file. | Not available. | 
| hadoop-kms-site | Change values in Hadoop's kms-site.xml file. | Restarts Hadoop-KMS and Ranger-KMS service. | 
| hudi-env | Change values in the Hudi environment. | Not available. | 
| hudi-defaults | Change values in Hudi's hudi-defaults.conf file. | Not available. | 
| iceberg-defaults | Change values in Iceberg's iceberg-defaults.conf file. | Not available. | 
| delta-defaults | Change values in Delta's delta-defaults.conf file. | Not available. | 
| jupyter-notebook-conf | Change values in Jupyter Notebook's jupyter_notebook_config.py file. | Not available. | 
| jupyter-hub-conf | Change values in JupyterHub's jupyterhub_config.py file. | Not available. | 
| jupyter-s3-conf | Configure Jupyter Notebook S3 persistence. | Not available. | 
| jupyter-sparkmagic-conf | Change values in Sparkmagic's config.json file. | Not available. | 
| livy-conf | Change values in Livy's livy.conf file. | Restarts Livy Server. | 
| livy-env | Change values in the Livy environment. | Restarts Livy Server. | 
| livy-log4j2 | Change Livy log4j2.properties settings. | Restarts Livy Server. | 
| mapred-env | Change values in the MapReduce application's environment. | Restarts Hadoop MapReduce-HistoryServer. | 
| mapred-site | Change values in the MapReduce application's mapred-site.xml file. | Restarts Hadoop MapReduce-HistoryServer. | 
| oozie-env | Change values in Oozie's environment. | Restarts Oozie. | 
| oozie-log4j | Change values in Oozie's oozie-log4j.properties file. | Restarts Oozie. | 
| oozie-site | Change values in Oozie's oozie-site.xml file. | Restarts Oozie. | 
| phoenix-hbase-metrics | Change values in Phoenix's hadoop-metrics2-hbase.properties file. | Not available. | 
| phoenix-hbase-site | Change values in Phoenix's hbase-site.xml file. | Not available. | 
| phoenix-log4j | Change values in Phoenix's log4j.properties file. | Restarts Phoenix-QueryServer. | 
| phoenix-metrics | Change values in Phoenix's hadoop-metrics2-phoenix.properties file. | Not available. | 
| pig-env | Change values in the Pig environment. | Not available. | 
| pig-properties | Change values in Pig's pig.properties file. | Restarts Oozie. | 
| pig-log4j | Change values in Pig's log4j.properties file. | Not available. | 
| presto-log | Change values in Presto's log.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-config | Change values in Presto's config.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-password-authenticator | Change values in Presto's password-authenticator.properties file. | Not available. | 
| presto-env | Change values in Presto's presto-env.sh file. | Restarts Presto-Server (for PrestoDB) | 
| presto-node | Change values in Presto's node.properties file. | Not available. | 
| presto-connector-blackhole | Change values in Presto's blackhole.properties file. | Not available. | 
| presto-connector-cassandra | Change values in Presto's cassandra.properties file. | Not available. | 
| presto-connector-hive | Change values in Presto's hive.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-jmx | Change values in Presto's jmx.properties file. | Not available. | 
| presto-connector-kafka | Change values in Presto's kafka.properties file. | Not available. | 
| presto-connector-lakeformation | Change values in Presto's lakeformation.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-localfile | Change values in Presto's localfile.properties file. | Not available. | 
| presto-connector-memory | Change values in Presto's memory.properties file. | Not available. | 
| presto-connector-mongodb | Change values in Presto's mongodb.properties file. | Not available. | 
| presto-connector-mysql | Change values in Presto's mysql.properties file. | Not available. | 
| presto-connector-postgresql | Change values in Presto's postgresql.properties file. | Not available. | 
| presto-connector-raptor | Change values in Presto's raptor.properties file. | Not available. | 
| presto-connector-redis | Change values in Presto's redis.properties file. | Not available. | 
| presto-connector-redshift | Change values in Presto's redshift.properties file. | Not available. | 
| presto-connector-tpch | Change values in Presto's tpch.properties file. | Not available. | 
| presto-connector-tpcds | Change values in Presto's tpcds.properties file. | Not available. | 
| trino-log | Change values in Trino's log.properties file. | Restarts Trino-Server (for Trino) | 
| trino-config | Change values in Trino's config.properties file. | Restarts Trino-Server (for Trino) | 
| trino-password-authenticator | Change values in Trino's password-authenticator.properties file. | Restarts Trino-Server (for Trino) | 
| trino-env | Change values in Trino's trino-env.sh file. | Restarts Trino-Server (for Trino) | 
| trino-node | Change values in Trino's node.properties file. | Not available. | 
| trino-connector-blackhole | Change values in Trino's blackhole.properties file. | Not available. | 
| trino-connector-cassandra | Change values in Trino's cassandra.properties file. | Not available. | 
| trino-connector-delta | Change values in Trino's delta.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-hive | Change values in Trino's hive.properties file. | Restarts Trino-Server (for Trino) | 
| trino-exchange-manager | Change values in Trino's exchange-manager.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-iceberg | Change values in Trino's iceberg.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-hudi | Change values in Trino's hudi.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-jmx | Change values in Trino's jmx.properties file. | Not available. | 
| trino-connector-kafka | Change values in Trino's kafka.properties file. | Not available. | 
| trino-connector-localfile | Change values in Trino's localfile.properties file. | Not available. | 
| trino-connector-memory | Change values in Trino's memory.properties file. | Not available. | 
| trino-connector-mongodb | Change values in Trino's mongodb.properties file. | Not available. | 
| trino-connector-mysql | Change values in Trino's mysql.properties file. | Not available. | 
| trino-connector-postgresql | Change values in Trino's postgresql.properties file. | Not available. | 
| trino-connector-raptor | Change values in Trino's raptor.properties file. | Not available. | 
| trino-connector-redis | Change values in Trino's redis.properties file. | Not available. | 
| trino-connector-redshift | Change values in Trino's redshift.properties file. | Not available. | 
| trino-connector-tpch | Change values in Trino's tpch.properties file. | Not available. | 
| trino-connector-tpcds | Change values in Trino's tpcds.properties file. | Not available. | 
| ranger-kms-dbks-site | Change values in dbks-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-site | Change values in ranger-kms-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-env | Change values in the Ranger KMS environment. | Restarts Ranger KMS Server. | 
| ranger-kms-logback | Change values in kms-logback.xml file of Ranger KMS. | Not available. | 
| ranger-kms-db-ca | Change values for CA file on S3 for MySQL SSL connection with Ranger KMS. | Not available. | 
| spark | Amazon EMR-curated settings for Apache Spark. | This property modifies spark-defaults. See actions there. | 
| spark-defaults | Change values in Spark's spark-defaults.conf file. | Restarts Spark history server and Spark thrift server. | 
| spark-env | Change values in the Spark environment. | Restarts Spark history server and Spark thrift server. | 
| spark-hive-site | Change values in Spark's hive-site.xml file | Not available. | 
| spark-log4j2 | Change values in Spark's log4j2.properties file. | Restarts Spark history server and Spark thrift server. | 
| spark-metrics | Change values in Spark's metrics.properties file. | Restarts Spark history server and Spark thrift server. | 
| sqoop-env | Change values in Sqoop's environment. | Not available. | 
| sqoop-oraoop-site | Change values in Sqoop OraOop's oraoop-site.xml file. | Not available. | 
| sqoop-site | Change values in Sqoop's sqoop-site.xml file. | Not available. | 
| tez-site | Change values in Tez's tez-site.xml file. | Restarts Oozie and HiveServer2. | 
| yarn-env | Change values in the YARN environment. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts MapReduce-HistoryServer. | 
| yarn-site | Change values in YARN's yarn-site.xml file. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Livy Server and MapReduce-HistoryServer. | 
| zeppelin-env | Change values in the Zeppelin environment. | Restarts Zeppelin. | 
| zeppelin-site | Change configuration settings in zeppelin-site.xml. | Restarts Zeppelin. | 
| zookeeper-config | Change values in ZooKeeper's zoo.cfg file. | Restarts Zookeeper server. | 
| zookeeper-log4j | Change values in ZooKeeper's log4j.properties file. | Restarts Zookeeper server. | 
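
Before you reconfigure a running cluster, it can help to check which of your planned changes trigger a restart. The following sketch copies a small excerpt of the table above into a dictionary. The names `RECONFIG_ACTIONS` and `triggers_restart` are hypothetical, and the excerpt is deliberately incomplete.

```python
# Excerpt of the emr-6.11.1 classifications table: name -> reconfiguration action.
RECONFIG_ACTIONS = {
    "spark-defaults": "Restarts Spark history server and Spark thrift server.",
    "livy-conf": "Restarts Livy Server.",
    "oozie-site": "Restarts Oozie.",
    "pig-env": "Not available.",
    "hdfs-encryption-zones": "This classification should not be reconfigured.",
}


def triggers_restart(classification):
    """Return True if the excerpt lists a restart for this classification.

    Raises KeyError for classifications that are not in the excerpt,
    rather than guessing at their behavior.
    """
    return RECONFIG_ACTIONS[classification].startswith("Restarts")


if __name__ == "__main__":
    for name in ("spark-defaults", "pig-env"):
        print(name, "->", RECONFIG_ACTIONS[name], "| restart:", triggers_restart(name))
```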

## 6.11.1 change log
<a name="6111-changelog"></a>


**Change log for 6.11.1 release and release notes**  

| Date | Event | Description | 
| --- | --- | --- | 
| 2023-08-30 | Update release notes | Added several control-plane related fixes to the release notes | 
| 2023-08-21 | Docs publication | Amazon EMR 6.11.1 release notes first published | 
| 2023-08-16 | Deployment complete | Amazon EMR 6.11.1 fully deployed to all [supported Regions](https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/) | 
| 2023-08-04 | Initial release | Amazon EMR 6.11.1 first deployed to limited commercial Regions | 

# Amazon EMR release 6.11.0
<a name="emr-6110-release"></a>

## 6.11.0 application versions
<a name="emr-6110-app-versions"></a>

This release includes the following applications: [Delta](https://delta.io/), [Flink](https://flink.apache.org/), [Ganglia](http://ganglia.info), [HBase](http://hbase.apache.org/), [HCatalog](https://cwiki.apache.org/confluence/display/Hive/HCatalog), [Hadoop](http://hadoop.apache.org/docs/current/), [Hive](http://hive.apache.org/), [Hudi](https://hudi.apache.org), [Hue](http://gethue.com/), [Iceberg](https://iceberg.apache.org/), [JupyterEnterpriseGateway](https://jupyter-enterprise-gateway.readthedocs.io/en/latest/), [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/#), [Livy](https://livy.incubator.apache.org/), [MXNet](https://mxnet.incubator.apache.org/), [Oozie](http://oozie.apache.org/), [Phoenix](https://phoenix.apache.org/), [Pig](http://pig.apache.org/), [Presto](https://prestodb.io/), [Spark](https://spark.apache.org/docs/latest/), [Sqoop](http://sqoop.apache.org/), [TensorFlow](https://www.tensorflow.org/), [Tez](https://tez.apache.org/), [Trino](https://trino.io/), [Zeppelin](https://zeppelin.incubator.apache.org/), and [ZooKeeper](https://zookeeper.apache.org).

The table below lists the application versions available in this release of Amazon EMR and the application versions in the preceding three Amazon EMR releases (when applicable).

For a comprehensive history of application versions for each release of Amazon EMR, see the following topics:
+ [Application versions in Amazon EMR 7.x releases](emr-release-app-versions-7.x.md)
+ [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md)
+ [Application versions in Amazon EMR 5.x releases](emr-release-app-versions-5.x.md)
+ [Application versions in Amazon EMR 4.x releases](emr-release-app-versions-4.x.md)


**Application version information**  

|  | emr-6.11.0 | emr-6.10.1 | emr-6.10.0 | emr-6.9.1 | 
| --- | --- | --- | --- | --- | 
| AWS SDK for Java | 1.12.446 | 1.12.397 | 1.12.397 | 1.12.170 | 
| Python | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 
| Scala | 2.12.15 | 2.12.15 | 2.12.15 | 2.12.15 | 
| AmazonCloudWatchAgent |  -  |  -  |  -  |  -  | 
| Delta | 2.2.0 | 2.2.0 | 2.2.0 | 2.1.0 | 
| Flink | 1.16.0 | 1.16.0 | 1.16.0 | 1.15.2 | 
| Ganglia | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 
| HBase | 2.4.15-amzn-1 | 2.4.15-amzn-0.1 | 2.4.15-amzn-0 | 2.4.13-amzn-0.1 | 
| HCatalog | 3.1.3-amzn-4 | 3.1.3-amzn-3.1 | 3.1.3-amzn-3 | 3.1.3-amzn-2.1 | 
| Hadoop | 3.3.3-amzn-3 | 3.3.3-amzn-2.1 | 3.3.3-amzn-2 | 3.3.3-amzn-1.1 | 
| Hive | 3.1.3-amzn-4 | 3.1.3-amzn-3.1 | 3.1.3-amzn-3 | 3.1.3-amzn-2.1 | 
| Hudi | 0.13.0-amzn-0 | 0.12.2-amzn-0 | 0.12.2-amzn-0 | 0.12.1-amzn-0 | 
| Hue | 4.11.0 | 4.10.0 | 4.10.0 | 4.10.0 | 
| Iceberg | 1.2.0-amzn-0 | 1.1.0-amzn-0 | 1.1.0-amzn-0 | 0.14.1-amzn-0 | 
| JupyterEnterpriseGateway | 2.6.0 | 2.6.0 | 2.6.0 | 2.6.0 | 
| JupyterHub | 1.4.1 | 1.5.0 | 1.5.0 | 1.4.1 | 
| Livy | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 
| MXNet | 1.9.1 | 1.9.1 | 1.9.1 | 1.9.1 | 
| Mahout |  -  |  -  |  -  |  -  | 
| Oozie | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 
| Phoenix | 5.1.2 | 5.1.2 | 5.1.2 | 5.1.2 | 
| Pig | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 
| Presto | 0.279-amzn-0 | 0.278.1-amzn-0 | 0.278.1-amzn-0 | 0.276-amzn-0 | 
| Spark | 3.3.2-amzn-0 | 3.3.1-amzn-0.1 | 3.3.1-amzn-0 | 3.3.0-amzn-1.1 | 
| Sqoop | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 
| TensorFlow | 2.11.0 | 2.11.0 | 2.11.0 | 2.10.0 | 
| Tez | 0.10.2-amzn-2 | 0.10.2-amzn-1.1 | 0.10.2-amzn-1 | 0.10.2-amzn-0.1 | 
| Trino (PrestoSQL) | 410-amzn-0 | 403-amzn-0 | 403-amzn-0 | 398-amzn-0 | 
| Zeppelin | 0.10.1 | 0.10.1 | 0.10.1 | 0.10.1 | 
| ZooKeeper | 3.5.10 | 3.5.10 | 3.5.10 | 3.5.10 | 

## 6.11.0 release notes
<a name="emr-6110-relnotes"></a>

The following release notes include information for Amazon EMR release 6.11.0. Changes are relative to 6.10.0. For information on the release timeline, see the [change log](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-6110-release.html#6110-changelog).

**New features**
+ Amazon EMR 6.11.0 supports Apache Spark 3.3.2-amzn-0, Apache Spark RAPIDS 23.02.0-amzn-0, CUDA 11.8.0, Apache Hudi 0.13.0-amzn-0, Apache Iceberg 1.2.0-amzn-0, Trino 410-amzn-0, and PrestoDB 0.279-amzn-0.

**Changes, enhancements, and resolved issues**
+ Starting with Spark 3.3.1 (supported in Amazon EMR releases 6.10 and higher), all executors on a decommissioning host are set to a new `ExecutorState` called *DECOMMISSIONING*. YARN can't allocate tasks to executors that are being decommissioned, so it requests new executors, if needed, for the tasks that are running. As a result, if you disable Spark dynamic resource allocation (DRA) while you use EMR Managed Scaling, EMR Auto Scaling, or any custom scaling mechanism on EMR-EC2 clusters, YARN might request the maximum permissible number of executors for each job. To avoid this issue, leave the `spark.dynamicAllocation.enabled` property set to `true` (the default) when you use this combination of features. You can also set the `spark.dynamicAllocation.minExecutors` and `spark.dynamicAllocation.maxExecutors` properties for your Spark jobs to restrict the number of executors allocated during the job's execution.
+ With Amazon EMR 6.11.0, the DynamoDB connector has been upgraded to version 5.0.0. Version 5.0.0 uses AWS SDK for Java 2.x. Previous releases used AWS SDK for Java 1.x. As a result of this upgrade, we strongly advise you to test your code before you use the DynamoDB connector with Amazon EMR 6.11.
+ When the DynamoDB connector for Amazon EMR 6.11.0 calls the DynamoDB service, it uses the Region value that you provide for the `dynamodb.endpoint` property. We recommend that you also configure `dynamodb.region` when you use `dynamodb.endpoint`, and that both properties target the same AWS Region. If you use `dynamodb.endpoint` and you don't configure `dynamodb.region`, the DynamoDB connector for Amazon EMR 6.11.0 will return an invalid Region exception and attempt to reconcile your AWS Region information from the Amazon EC2 instance metadata service (IMDS). If the connector can't retrieve the Region from IMDS, it defaults to US East (N. Virginia) (`us-east-1`). The following error is an example of the invalid Region exception that you might get if you don't properly configure the `dynamodb.region` property: `error software.amazon.awssdk.services.dynamodb.model.DynamoDbException: Credential should be scoped to a valid region.` For more information on the classes that are affected by the AWS SDK for Java upgrade to 2.x, see the [Upgrade AWS SDK for Java from 1.x to 2.x (#175)](https://github.com/awslabs/emr-dynamodb-connector/commit/1dec9d1972d3673c3fae6c6ea51f19f295147ccf) commit in the GitHub repo for the Amazon EMR - DynamoDB connector.
+ This release fixes an issue where column data becomes `NULL` when you use Delta Lake to store Delta table data in Amazon S3 after a column rename operation. For more information about this experimental feature in Delta Lake, see [Column rename operation](https://docs.delta.io/latest/delta-batch.html#rename-columns) in the Delta Lake User Guide.
+ The 6.11.0 release fixes an issue that might occur when you create an edge node by replicating one of the primary nodes from a cluster with multiple primary nodes. The replicated edge node could cause delays with scale-down operations, or result in high memory-utilization on the primary nodes. For more information on how to create an edge node to communicate with your EMR cluster, see [Edge Node Creator](https://github.com/aws-samples/aws-emr-utilities/tree/main/utilities/emr-edge-node-creator) in the `aws-samples` repo on GitHub.
+ The 6.11.0 release improves the automation process that Amazon EMR uses to re-mount Amazon EBS volumes to an instance after a reboot.
+ The 6.11.0 release fixes an issue that resulted in intermittent gaps in the Hadoop metrics that Amazon EMR publishes to Amazon CloudWatch.
+ The 6.11.0 release fixes an issue with EMR clusters where an update to the YARN configuration file that contains the exclusion list of nodes for the cluster is interrupted due to disk over-utilization. The incomplete update hinders future cluster scale-down operations. This release ensures that your cluster remains healthy, and that scaling operations work as expected.
+ The default root volume size has increased to 15 GB in Amazon EMR 6.10.0 and higher. Earlier releases have a default root volume size of 10 GB.
+ Hadoop 3.3.3 introduced a change in YARN ([YARN-9608](https://issues.apache.org/jira/browse/YARN-9608)) that keeps nodes where containers ran in a decommissioning state until the application completes. This change ensures that local data such as shuffle data doesn't get lost, and you don't need to re-run the job. This approach might also lead to underutilization of resources on clusters with or without managed scaling enabled.

  With Amazon EMR releases 6.11.0 and higher as well as 6.8.1, 6.9.1, and 6.10.1, the value of `yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications` is set to `false` in `yarn-site.xml` to resolve this issue.

  While the fix addresses the issues that were introduced by YARN-9608, it might cause Hive jobs to fail due to shuffle data loss on clusters that have managed scaling enabled. We've mitigated that risk in this release by also setting `yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-shuffle-data` for Hive workloads. This config is only available with Amazon EMR releases 6.11.0 and higher.
+ When you launch a cluster with *the latest patch release* of Amazon EMR 5.36 or higher, 6.6 or higher, or 7.0 or higher, Amazon EMR uses the latest Amazon Linux 2023 or Amazon Linux 2 release for the default Amazon EMR AMI. For more information, see [Using the default Amazon Linux AMI for Amazon EMR](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-default-ami.html).
**Note**  
This release no longer gets automatic AMI updates since it has been succeeded by one or more patch releases. The patch release is denoted by the number after the second decimal point (`6.8.1`). To see if you're using the latest patch release, check the available releases in the [Release Guide](https://docs.aws.amazon.com/emr/latest/ReleaseGuide), check the **Amazon EMR release** dropdown when you create a cluster in the console, or use the [ListReleaseLabels](https://docs.aws.amazon.com/emr/latest/APIReference/API_ListReleaseLabels.html) API or [list-release-labels](https://docs.aws.amazon.com/cli/latest/reference/emr/list-release-labels.html) CLI action. To get updates about new releases, subscribe to the RSS feed on the [What's new?](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-whatsnew.html) page.  
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-6110-release.html)
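
The Spark dynamic allocation guidance in the release notes above can be expressed as a `spark-defaults` configuration classification. The following is a minimal sketch, not an official sample: the helper name `spark_dra_configuration` and the executor bounds passed to it are assumptions chosen for illustration, and you would pick bounds that suit your own workload.

```python
import json


def spark_dra_configuration(min_executors: int, max_executors: int) -> list:
    """Build a spark-defaults classification that keeps dynamic allocation
    enabled and bounds the number of executors a job can hold.

    The helper name and the bounds are illustrative assumptions.
    """
    if min_executors < 0 or max_executors < min_executors:
        raise ValueError("expected 0 <= min_executors <= max_executors")
    return [
        {
            "Classification": "spark-defaults",
            "Properties": {
                # Leave dynamic allocation on (the default) when scaling is in use.
                "spark.dynamicAllocation.enabled": "true",
                # Property values are passed as strings.
                "spark.dynamicAllocation.minExecutors": str(min_executors),
                "spark.dynamicAllocation.maxExecutors": str(max_executors),
            },
        }
    ]


if __name__ == "__main__":
    # Print JSON that could be supplied as a cluster configuration.
    print(json.dumps(spark_dra_configuration(2, 20), indent=2))
```

The printed JSON follows the `Classification` and `Properties` layout that configuration classifications use, so it could be supplied when you create a cluster or reconfigure an instance group.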

## 6.11.0 component versions
<a name="emr-6110-components"></a>

The components that Amazon EMR installs with this release are listed below. Some are installed as part of big-data application packages. Others are unique to Amazon EMR and installed for system processes and features. These typically start with `emr` or `aws`. Big-data application packages in the most recent Amazon EMR release are usually the latest version found in the community. We make community releases available in Amazon EMR as quickly as possible.

Some components in Amazon EMR differ from community versions. These components have a version label in the form `CommunityVersion-amzn-EmrVersion`. The `EmrVersion` starts at 0. For example, if an open source community component named `myapp-component` with version 2.2 has been modified three times for inclusion in different Amazon EMR releases, its release version is listed as `2.2-amzn-2`.
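
To illustrate the `CommunityVersion-amzn-EmrVersion` label format, the following sketch splits a version label into its two parts. The function and type names are hypothetical and not part of any Amazon EMR tooling. The sketch treats `EmrVersion` as an opaque string because the tables on this page also show dotted suffixes such as `2.4.15-amzn-0.1`. It returns `None` for labels without an `-amzn-` suffix.

```python
import re
from typing import NamedTuple, Optional


class ComponentVersion(NamedTuple):
    community_version: str
    emr_version: Optional[str]  # None when the label has no -amzn- suffix


# Hypothetical pattern: everything before "-amzn-" is the community version,
# and the numeric (optionally dotted) remainder is the EMR revision.
_AMZN_LABEL = re.compile(r"^(?P<community>.+)-amzn-(?P<emr>\d+(?:\.\d+)*)$")


def parse_component_version(label: str) -> ComponentVersion:
    """Split a component version label into community and EMR parts."""
    match = _AMZN_LABEL.match(label)
    if match is None:
        # Unmodified community version, for example "1.16.0".
        return ComponentVersion(label, None)
    return ComponentVersion(match["community"], match["emr"])


if __name__ == "__main__":
    for label in ("2.2-amzn-2", "3.3.3-amzn-3", "2.4.15-amzn-0.1", "1.16.0"):
        print(label, "->", parse_component_version(label))
```

With this sketch, `2.2-amzn-2` yields a community version of `2.2` and an EMR revision of `2`, which matches the third modification described above because numbering starts at 0.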


| Component | Version | Description | 
| --- | --- | --- | 
| aws-sagemaker-spark-sdk | 1.4.2 | Amazon SageMaker Spark SDK | 
| delta | 2.2.0 | Delta Lake is an open table format for huge analytic datasets | 
| delta-standalone-connectors | 0.6.0 | Delta Connectors provide different runtimes to integrate Delta Lake with engines like Flink, Hive and Presto. | 
| emr-ddb | 5.1.0 | Amazon DynamoDB connector for Hadoop ecosystem applications. | 
| emr-goodies | 3.4.0 | Extra convenience libraries for the Hadoop ecosystem. | 
| emr-kinesis | 3.8.0 | Amazon Kinesis connector for Hadoop ecosystem applications. | 
| emr-notebook-env | 1.7.0 | Conda env for emr notebook which includes jupyter enterprise gateway | 
| emr-s3-dist-cp | 2.25.0 | Distributed copy application optimized for Amazon S3. | 
| emr-s3-select | 2.4.0 | EMR S3Select Connector | 
| emr-wal-cli | 1.1.0 | Cli used for emrwal list/deletion. | 
| emrfs | 2.56.0 | Amazon S3 connector for Hadoop ecosystem applications. | 
| flink-client | 1.16.0 | Apache Flink command line client scripts and applications. | 
| flink-jobmanager-config | 1.16.0 | Managing resources on EMR nodes for Apache Flink JobManager. | 
| ganglia-monitor | 3.7.2 | Embedded Ganglia agent for Hadoop ecosystem applications along with the Ganglia monitoring agent. | 
| ganglia-metadata-collector | 3.7.2 | Ganglia metadata collector for aggregating metrics from Ganglia monitoring agents. | 
| ganglia-web | 3.7.1 | Web application for viewing metrics collected by the Ganglia metadata collector. | 
| hadoop-client | 3.3.3-amzn-3 | Hadoop command-line clients such as 'hdfs', 'hadoop', or 'yarn'. | 
| hadoop-hdfs-datanode | 3.3.3-amzn-3 | HDFS node-level service for storing blocks. | 
| hadoop-hdfs-library | 3.3.3-amzn-3 | HDFS command-line client and library | 
| hadoop-hdfs-namenode | 3.3.3-amzn-3 | HDFS service for tracking file names and block locations. | 
| hadoop-hdfs-journalnode | 3.3.3-amzn-3 | HDFS service for managing the Hadoop filesystem journal on HA clusters. | 
| hadoop-httpfs-server | 3.3.3-amzn-3 | HTTP endpoint for HDFS operations. | 
| hadoop-kms-server | 3.3.3-amzn-3 | Cryptographic key management server based on Hadoop's KeyProvider API. | 
| hadoop-mapred | 3.3.3-amzn-3 | MapReduce execution engine libraries for running a MapReduce application. | 
| hadoop-yarn-nodemanager | 3.3.3-amzn-3 | YARN service for managing containers on an individual node. | 
| hadoop-yarn-resourcemanager | 3.3.3-amzn-3 | YARN service for allocating and managing cluster resources and distributed applications. | 
| hadoop-yarn-timeline-server | 3.3.3-amzn-3 | Service for retrieving current and historical information for YARN applications. | 
| hbase-hmaster | 2.4.15-amzn-1 | Service for an HBase cluster responsible for coordination of Regions and execution of administrative commands. | 
| hbase-region-server | 2.4.15-amzn-1 | Service for serving one or more HBase regions. | 
| hbase-client | 2.4.15-amzn-1 | HBase command-line client. | 
| hbase-rest-server | 2.4.15-amzn-1 | Service providing a RESTful HTTP endpoint for HBase. | 
| hbase-thrift-server | 2.4.15-amzn-1 | Service providing a Thrift endpoint to HBase. | 
| hbase-operator-tools | 2.4.15-amzn-1 | Repair tool for Apache HBase clusters. | 
| hcatalog-client | 3.1.3-amzn-4 | The 'hcat' command line client for manipulating hcatalog-server. | 
| hcatalog-server | 3.1.3-amzn-4 | Service providing HCatalog, a table and storage management layer for distributed applications. | 
| hcatalog-webhcat-server | 3.1.3-amzn-4 | HTTP endpoint providing a REST interface to HCatalog. | 
| hive-client | 3.1.3-amzn-4 | Hive command line client. | 
| hive-hbase | 3.1.3-amzn-4 | Hive-hbase client. | 
| hive-metastore-server | 3.1.3-amzn-4 | Service for accessing the Hive metastore, a semantic repository storing metadata for SQL on Hadoop operations. | 
| hive-server2 | 3.1.3-amzn-4 | Service for accepting Hive queries as web requests. | 
| hudi | 0.13.0-amzn-0 | Incremental processing framework to power data pipeline at low latency and high efficiency. | 
| hudi-presto | 0.13.0-amzn-0 | Bundle library for running Presto with Hudi. | 
| hudi-trino | 0.13.0-amzn-0 | Bundle library for running Trino with Hudi. | 
| hudi-spark | 0.13.0-amzn-0 | Bundle library for running Spark with Hudi. | 
| hue-server | 4.11.0 | Web application for analyzing data using Hadoop ecosystem applications | 
| iceberg | 1.2.0-amzn-0 | Apache Iceberg is an open table format for huge analytic datasets | 
| jupyterhub | 1.4.1 | Multi-user server for Jupyter notebooks | 
| livy-server | 0.7.1-incubating | REST interface for interacting with Apache Spark | 
| nginx | 1.12.1 | nginx [engine x] is an HTTP and reverse proxy server | 
| mxnet | 1.9.1 | A flexible, scalable, and efficient library for deep learning. | 
| mariadb-server | 5.5.68+ | MariaDB database server. | 
| nvidia-cuda | 11.8.0 | Nvidia drivers and Cuda toolkit | 
| oozie-client | 5.2.1 | Oozie command-line client. | 
| oozie-server | 5.2.1 | Service for accepting Oozie workflow requests. | 
| opencv | 4.5.0 | Open Source Computer Vision Library. | 
| phoenix-library | 5.1.2 | The phoenix libraries for server and client | 
| phoenix-connectors | 5.1.2 | Apache Phoenix-Connectors for Spark-3 | 
| phoenix-query-server | 6.0.0 | A lightweight server providing JDBC access as well as Protocol Buffers and JSON format access to the Avatica API | 
| presto-coordinator | 0.279-amzn-0 | Service for accepting queries and managing query execution among presto-workers. | 
| presto-worker | 0.279-amzn-0 | Service for executing pieces of a query. | 
| presto-client | 0.279-amzn-0 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| trino-coordinator | 410-amzn-0 | Service for accepting queries and managing query execution among trino-workers. | 
| trino-worker | 410-amzn-0 | Service for executing pieces of a query. | 
| trino-client | 410-amzn-0 | Trino command-line client which is installed on an HA cluster's stand-by masters where Trino server is not started. | 
| pig-client | 0.17.0 | Pig command-line client. | 
| r | 4.0.2 | The R Project for Statistical Computing | 
| ranger-kms-server | 2.0.0 | Apache Ranger Key Management System | 
| spark-client | 3.3.2-amzn-0 | Spark command-line clients. | 
| spark-history-server | 3.3.2-amzn-0 | Web UI for viewing logged events for the lifetime of a completed Spark application. | 
| spark-on-yarn | 3.3.2-amzn-0 | In-memory execution engine for YARN. | 
| spark-yarn-slave | 3.3.2-amzn-0 | Apache Spark libraries needed by YARN slaves. | 
| spark-rapids | 23.02.0-amzn-0 | Nvidia Spark RAPIDS plugin that accelerates Apache Spark with GPUs. | 
| sqoop-client | 1.4.7 | Apache Sqoop command-line client. | 
| tensorflow | 2.11.0 | TensorFlow open source software library for high performance numerical computation. | 
| tez-on-yarn | 0.10.2-amzn-2 | The tez YARN application and libraries. | 
| tez-on-worker | 0.10.2-amzn-2 | The tez YARN application and libraries for worker nodes. | 
| webserver | 2.4.41+ | Apache HTTP server. | 
| zeppelin-server | 0.10.1 | Web-based notebook that enables interactive data analytics. | 
| zookeeper-server | 3.5.10 | Centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. | 
| zookeeper-client | 3.5.10 | ZooKeeper command line client. | 

## 6.11.0 configuration classifications
<a name="emr-6110-class"></a>

Configuration classifications allow you to customize applications. These often correspond to a configuration XML file for the application, such as `hive-site.xml`. For more information, see [Configure applications](emr-configure-apps.md).

Reconfiguration actions occur when you specify a configuration for instance groups in a running cluster. Amazon EMR only initiates reconfiguration actions for the classifications that you modify. For more information, see [Reconfigure an instance group in a running cluster](emr-configure-apps-running-cluster.md).
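
As a rough illustration of how reconfiguration is scoped to the classifications that you modify, the following sketch compares two configuration lists and looks up the documented action for each classification that changed. It is not how Amazon EMR implements reconfiguration. The function names and the placeholder property `example.key` are assumptions, and the lookup table is a small excerpt copied from the classifications table on this page.

```python
from typing import Dict, List

# Excerpt of reconfiguration actions, copied from the classifications table.
RECONFIGURATION_ACTIONS: Dict[str, str] = {
    "capacity-scheduler": "Restarts the ResourceManager service.",
    "livy-conf": "Restarts Livy Server.",
    "oozie-site": "Restarts Oozie.",
    "pig-env": "Not available.",
    "spark-defaults": "Restarts Spark history server and Spark thrift server.",
    "zeppelin-env": "Restarts Zeppelin.",
}


def _by_classification(configs: List[dict]) -> Dict[str, dict]:
    """Index a configuration list by classification name."""
    return {c["Classification"]: c.get("Properties", {}) for c in configs}


def modified_classifications(current: List[dict], desired: List[dict]) -> List[str]:
    """Return the classifications whose properties differ between two lists."""
    before = _by_classification(current)
    after = _by_classification(desired)
    names = before.keys() | after.keys()
    return sorted(n for n in names if before.get(n) != after.get(n))


def expected_actions(current: List[dict], desired: List[dict]) -> Dict[str, str]:
    """Map each modified classification to its documented action."""
    return {
        name: RECONFIGURATION_ACTIONS.get(name, "unknown (not in this excerpt)")
        for name in modified_classifications(current, desired)
    }


if __name__ == "__main__":
    current = [
        {"Classification": "livy-conf", "Properties": {"example.key": "1"}},
        {"Classification": "oozie-site", "Properties": {"example.key": "1"}},
    ]
    desired = [
        {"Classification": "livy-conf", "Properties": {"example.key": "2"}},
        {"Classification": "oozie-site", "Properties": {"example.key": "1"}},
    ]
    print(expected_actions(current, desired))
```

In this example only `livy-conf` changes, so only its action is reported. The unchanged `oozie-site` classification is left out, which mirrors the behavior described above.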


**emr-6.11.0 classifications**  

| Classifications | Description | Reconfiguration Actions | 
| --- | --- | --- | 
| capacity-scheduler | Change values in Hadoop's capacity-scheduler.xml file. | Restarts the ResourceManager service. | 
| container-executor | Change values in Hadoop YARN's container-executor.cfg file. | Not available. | 
| container-log4j | Change values in Hadoop YARN's container-log4j.properties file. | Not available. | 
| core-site | Change values in Hadoop's core-site.xml file. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Ranger KMS, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| docker-conf | Change docker related settings. | Not available. | 
| emrfs-site | Change EMRFS settings. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts HBaseRegionserver, HBaseMaster, HBaseThrift, HBaseRest, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| flink-conf | Change flink-conf.yaml settings. | Restarts Flink history server. | 
| flink-log4j | Change Flink log4j.properties settings. | Restarts Flink history server. | 
| flink-log4j-session | Change Flink log4j-session.properties settings for Kubernetes/Yarn session. | Restarts Flink history server. | 
| flink-log4j-cli | Change Flink log4j-cli.properties settings. | Restarts Flink history server. | 
| hadoop-env | Change values in the Hadoop environment for all Hadoop components. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts PhoenixQueryserver, HiveServer2, Hive MetaStore, and MapReduce-HistoryServer. | 
| hadoop-log4j | Change values in Hadoop's log4j.properties file. | Restarts the Hadoop HDFS services SecondaryNamenode, Datanode, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| hadoop-ssl-server | Change hadoop ssl server configuration | Not available. | 
| hadoop-ssl-client | Change hadoop ssl client configuration | Not available. | 
| hbase | Amazon EMR-curated settings for Apache HBase. | Custom EMR specific property. Sets emrfs-site and hbase-site configs. See those for their associated restarts. | 
| hbase-env | Change values in HBase's environment. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-log4j | Change values in HBase's hbase-log4j.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-metrics | Change values in HBase's hadoop-metrics2-hbase.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-policy | Change values in HBase's hbase-policy.xml file. | Not available. | 
| hbase-site | Change values in HBase's hbase-site.xml file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. Additionally restarts Phoenix QueryServer. | 
| hdfs-encryption-zones | Configure HDFS encryption zones. | This classification should not be reconfigured. | 
| hdfs-env | Change values in the HDFS environment. | Restarts Hadoop HDFS services Namenode, Datanode, and ZKFC. | 
| hdfs-site | Change values in HDFS's hdfs-site.xml. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Additionally restarts Hadoop Httpfs. | 
| hcatalog-env | Change values in HCatalog's environment. | Restarts Hive HCatalog Server. | 
| hcatalog-server-jndi | Change values in HCatalog's jndi.properties. | Restarts Hive HCatalog Server. | 
| hcatalog-server-proto-hive-site | Change values in HCatalog's proto-hive-site.xml. | Restarts Hive HCatalog Server. | 
| hcatalog-webhcat-env | Change values in HCatalog WebHCat's environment. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-log4j2 | Change values in HCatalog WebHCat's log4j2.properties. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-site | Change values in HCatalog WebHCat's webhcat-site.xml file. | Restarts Hive WebHCat server. | 
| hive | Amazon EMR-curated settings for Apache Hive. | Sets configurations to launch Hive LLAP service. | 
| hive-beeline-log4j2 | Change values in Hive's beeline-log4j2.properties file. | Not available. | 
| hive-parquet-logging | Change values in Hive's parquet-logging.properties file. | Not available. | 
| hive-env | Change values in the Hive environment. | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. | 
| hive-exec-log4j2 | Change values in Hive's hive-exec-log4j2.properties file. | Not available. | 
| hive-llap-daemon-log4j2 | Change values in Hive's llap-daemon-log4j2.properties file. | Not available. | 
| hive-log4j2 | Change values in Hive's hive-log4j2.properties file. | Not available. | 
| hive-site | Change values in Hive's hive-site.xml file | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. Also restarts Oozie and Zeppelin. | 
| hiveserver2-site | Change values in Hive Server2's hiveserver2-site.xml file | Not available. | 
| hue-ini | Change values in Hue's ini file | Restarts Hue. Also activates Hue config override CLI commands to pick up new configurations. | 
| httpfs-env | Change values in the HTTPFS environment. | Restarts Hadoop Httpfs service. | 
| httpfs-site | Change values in Hadoop's httpfs-site.xml file. | Restarts Hadoop Httpfs service. | 
| hadoop-kms-acls | Change values in Hadoop's kms-acls.xml file. | Not available. | 
| hadoop-kms-env | Change values in the Hadoop KMS environment. | Restarts Hadoop-KMS service. | 
| hadoop-kms-log4j | Change values in Hadoop's kms-log4j.properties file. | Not available. | 
| hadoop-kms-site | Change values in Hadoop's kms-site.xml file. | Restarts Hadoop-KMS and Ranger-KMS service. | 
| hudi-env | Change values in the Hudi environment. | Not available. | 
| hudi-defaults | Change values in Hudi's hudi-defaults.conf file. | Not available. | 
| iceberg-defaults | Change values in Iceberg's iceberg-defaults.conf file. | Not available. | 
| delta-defaults | Change values in Delta's delta-defaults.conf file. | Not available. | 
| jupyter-notebook-conf | Change values in Jupyter Notebook's jupyter_notebook_config.py file. | Not available. | 
| jupyter-hub-conf | Change values in JupyterHub's jupyterhub_config.py file. | Not available. | 
| jupyter-s3-conf | Configure Jupyter Notebook S3 persistence. | Not available. | 
| jupyter-sparkmagic-conf | Change values in Sparkmagic's config.json file. | Not available. | 
| livy-conf | Change values in Livy's livy.conf file. | Restarts Livy Server. | 
| livy-env | Change values in the Livy environment. | Restarts Livy Server. | 
| livy-log4j2 | Change Livy log4j2.properties settings. | Restarts Livy Server. | 
| mapred-env | Change values in the MapReduce application's environment. | Restarts Hadoop MapReduce-HistoryServer. | 
| mapred-site | Change values in the MapReduce application's mapred-site.xml file. | Restarts Hadoop MapReduce-HistoryServer. | 
| oozie-env | Change values in Oozie's environment. | Restarts Oozie. | 
| oozie-log4j | Change values in Oozie's oozie-log4j.properties file. | Restarts Oozie. | 
| oozie-site | Change values in Oozie's oozie-site.xml file. | Restarts Oozie. | 
| phoenix-hbase-metrics | Change values in Phoenix's hadoop-metrics2-hbase.properties file. | Not available. | 
| phoenix-hbase-site | Change values in Phoenix's hbase-site.xml file. | Not available. | 
| phoenix-log4j | Change values in Phoenix's log4j.properties file. | Restarts Phoenix-QueryServer. | 
| phoenix-metrics | Change values in Phoenix's hadoop-metrics2-phoenix.properties file. | Not available. | 
| pig-env | Change values in the Pig environment. | Not available. | 
| pig-properties | Change values in Pig's pig.properties file. | Restarts Oozie. | 
| pig-log4j | Change values in Pig's log4j.properties file. | Not available. | 
| presto-log | Change values in Presto's log.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-config | Change values in Presto's config.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-password-authenticator | Change values in Presto's password-authenticator.properties file. | Not available. | 
| presto-env | Change values in Presto's presto-env.sh file. | Restarts Presto-Server (for PrestoDB) | 
| presto-node | Change values in Presto's node.properties file. | Not available. | 
| presto-connector-blackhole | Change values in Presto's blackhole.properties file. | Not available. | 
| presto-connector-cassandra | Change values in Presto's cassandra.properties file. | Not available. | 
| presto-connector-hive | Change values in Presto's hive.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-jmx | Change values in Presto's jmx.properties file. | Not available. | 
| presto-connector-kafka | Change values in Presto's kafka.properties file. | Not available. | 
| presto-connector-lakeformation | Change values in Presto's lakeformation.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-localfile | Change values in Presto's localfile.properties file. | Not available. | 
| presto-connector-memory | Change values in Presto's memory.properties file. | Not available. | 
| presto-connector-mongodb | Change values in Presto's mongodb.properties file. | Not available. | 
| presto-connector-mysql | Change values in Presto's mysql.properties file. | Not available. | 
| presto-connector-postgresql | Change values in Presto's postgresql.properties file. | Not available. | 
| presto-connector-raptor | Change values in Presto's raptor.properties file. | Not available. | 
| presto-connector-redis | Change values in Presto's redis.properties file. | Not available. | 
| presto-connector-redshift | Change values in Presto's redshift.properties file. | Not available. | 
| presto-connector-tpch | Change values in Presto's tpch.properties file. | Not available. | 
| presto-connector-tpcds | Change values in Presto's tpcds.properties file. | Not available. | 
| trino-log | Change values in Trino's log.properties file. | Restarts Trino-Server (for Trino) | 
| trino-config | Change values in Trino's config.properties file. | Restarts Trino-Server (for Trino) | 
| trino-password-authenticator | Change values in Trino's password-authenticator.properties file. | Restarts Trino-Server (for Trino) | 
| trino-env | Change values in Trino's trino-env.sh file. | Restarts Trino-Server (for Trino) | 
| trino-node | Change values in Trino's node.properties file. | Not available. | 
| trino-connector-blackhole | Change values in Trino's blackhole.properties file. | Not available. | 
| trino-connector-cassandra | Change values in Trino's cassandra.properties file. | Not available. | 
| trino-connector-delta | Change values in Trino's delta.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-hive | Change values in Trino's hive.properties file. | Restarts Trino-Server (for Trino) | 
| trino-exchange-manager | Change values in Trino's exchange-manager.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-iceberg | Change values in Trino's iceberg.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-hudi | Change values in Trino's hudi.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-jmx | Change values in Trino's jmx.properties file. | Not available. | 
| trino-connector-kafka | Change values in Trino's kafka.properties file. | Not available. | 
| trino-connector-localfile | Change values in Trino's localfile.properties file. | Not available. | 
| trino-connector-memory | Change values in Trino's memory.properties file. | Not available. | 
| trino-connector-mongodb | Change values in Trino's mongodb.properties file. | Not available. | 
| trino-connector-mysql | Change values in Trino's mysql.properties file. | Not available. | 
| trino-connector-postgresql | Change values in Trino's postgresql.properties file. | Not available. | 
| trino-connector-raptor | Change values in Trino's raptor.properties file. | Not available. | 
| trino-connector-redis | Change values in Trino's redis.properties file. | Not available. | 
| trino-connector-redshift | Change values in Trino's redshift.properties file. | Not available. | 
| trino-connector-tpch | Change values in Trino's tpch.properties file. | Not available. | 
| trino-connector-tpcds | Change values in Trino's tpcds.properties file. | Not available. | 
| ranger-kms-dbks-site | Change values in dbks-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-site | Change values in ranger-kms-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-env | Change values in the Ranger KMS environment. | Restarts Ranger KMS Server. | 
| ranger-kms-logback | Change values in kms-logback.xml file of Ranger KMS. | Not available. | 
| ranger-kms-db-ca | Change values for CA file on S3 for MySQL SSL connection with Ranger KMS. | Not available. | 
| spark | Amazon EMR-curated settings for Apache Spark. | This property modifies spark-defaults. See actions there. | 
| spark-defaults | Change values in Spark's spark-defaults.conf file. | Restarts Spark history server and Spark thrift server. | 
| spark-env | Change values in the Spark environment. | Restarts Spark history server and Spark thrift server. | 
| spark-hive-site | Change values in Spark's hive-site.xml file | Not available. | 
| spark-log4j2 | Change values in Spark's log4j2.properties file. | Restarts Spark history server and Spark thrift server. | 
| spark-metrics | Change values in Spark's metrics.properties file. | Restarts Spark history server and Spark thrift server. | 
| sqoop-env | Change values in Sqoop's environment. | Not available. | 
| sqoop-oraoop-site | Change values in Sqoop OraOop's oraoop-site.xml file. | Not available. | 
| sqoop-site | Change values in Sqoop's sqoop-site.xml file. | Not available. | 
| tez-site | Change values in Tez's tez-site.xml file. | Restarts Oozie and HiveServer2. | 
| yarn-env | Change values in the YARN environment. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts MapReduce-HistoryServer. | 
| yarn-site | Change values in YARN's yarn-site.xml file. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Livy Server and MapReduce-HistoryServer. | 
| zeppelin-env | Change values in the Zeppelin environment. | Restarts Zeppelin. | 
| zeppelin-site | Change configuration settings in zeppelin-site.xml. | Restarts Zeppelin. | 
| zookeeper-config | Change values in ZooKeeper's zoo.cfg file. | Restarts ZooKeeper server. | 
| zookeeper-log4j | Change values in ZooKeeper's log4j.properties file. | Restarts ZooKeeper server. | 

## 6.11.0 change log
<a name="6110-changelog"></a>


**Change log for 6.11.0 release and release notes**  

| Date | Event | Description | 
| --- | --- | --- | 
| 2023-08-21 | Update | Fixed issue introduced with Hadoop 3.3.3. | 
| 2023-07-26 | Update | New OS release labels 2.0.20230612.0 and 2.0.20230628.0. | 
| 2023-06-09 | Deployment complete | Amazon EMR 6.11.0 fully deployed to all [supported Regions](https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/) | 
| 2023-06-09 | Docs publication | Amazon EMR 6.11.0 release notes first published | 
| 2023-06-08 | Initial release | Amazon EMR 6.11.0 first deployed to initial commercial Regions | 

# Amazon EMR release 6.10.1
<a name="emr-6101-release"></a>

## 6.10.1 application versions
<a name="emr-6101-app-versions"></a>

This release includes the following applications: [Delta](https://delta.io/), [Flink](https://flink.apache.org/), [Ganglia](http://ganglia.info), [HBase](http://hbase.apache.org/), [HCatalog](https://cwiki.apache.org/confluence/display/Hive/HCatalog), [Hadoop](http://hadoop.apache.org/docs/current/), [Hive](http://hive.apache.org/), [Hudi](https://hudi.apache.org), [Hue](http://gethue.com/), [Iceberg](https://iceberg.apache.org/), [JupyterEnterpriseGateway](https://jupyter-enterprise-gateway.readthedocs.io/en/latest/), [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/#), [Livy](https://livy.incubator.apache.org/), [MXNet](https://mxnet.incubator.apache.org/), [Oozie](http://oozie.apache.org/), [Phoenix](https://phoenix.apache.org/), [Pig](http://pig.apache.org/), [Presto](https://prestodb.io/), [Spark](https://spark.apache.org/docs/latest/), [Sqoop](http://sqoop.apache.org/), [TensorFlow](https://www.tensorflow.org/), [Tez](https://tez.apache.org/), [Trino](https://trino.io/), [Zeppelin](https://zeppelin.incubator.apache.org/), and [ZooKeeper](https://zookeeper.apache.org).

The table below lists the application versions available in this release of Amazon EMR and the application versions in the preceding three Amazon EMR releases (when applicable).

For a comprehensive history of application versions for each release of Amazon EMR, see the following topics:
+ [Application versions in Amazon EMR 7.x releases](emr-release-app-versions-7.x.md)
+ [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md)
+ [Application versions in Amazon EMR 5.x releases](emr-release-app-versions-5.x.md)
+ [Application versions in Amazon EMR 4.x releases](emr-release-app-versions-4.x.md)


**Application version information**  

|  | emr-6.10.1 | emr-6.10.0 | emr-6.9.1 | emr-6.9.0 | 
| --- | --- | --- | --- | --- | 
| AWS SDK for Java | 1.12.397 | 1.12.397 | 1.12.170 | 1.12.170 | 
| Python | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 
| Scala | 2.12.15 | 2.12.15 | 2.12.15 | 2.12.15 | 
| AmazonCloudWatchAgent |  -  |  -  |  -  |  -  | 
| Delta | 2.2.0 | 2.2.0 | 2.1.0 | 2.1.0 | 
| Flink | 1.16.0 | 1.16.0 | 1.15.2 | 1.15.2 | 
| Ganglia | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 
| HBase | 2.4.15-amzn-0.1 | 2.4.15-amzn-0 | 2.4.13-amzn-0.1 | 2.4.13-amzn-0 | 
| HCatalog | 3.1.3-amzn-3.1 | 3.1.3-amzn-3 | 3.1.3-amzn-2.1 | 3.1.3-amzn-2 | 
| Hadoop | 3.3.3-amzn-2.1 | 3.3.3-amzn-2 | 3.3.3-amzn-1.1 | 3.3.3-amzn-1 | 
| Hive | 3.1.3-amzn-3.1 | 3.1.3-amzn-3 | 3.1.3-amzn-2.1 | 3.1.3-amzn-2 | 
| Hudi | 0.12.2-amzn-0 | 0.12.2-amzn-0 | 0.12.1-amzn-0 | 0.12.1-amzn-0 | 
| Hue | 4.10.0 | 4.10.0 | 4.10.0 | 4.10.0 | 
| Iceberg | 1.1.0-amzn-0 | 1.1.0-amzn-0 | 0.14.1-amzn-0 | 0.14.1-amzn-0 | 
| JupyterEnterpriseGateway | 2.6.0 | 2.6.0 | 2.6.0 | 2.6.0 | 
| JupyterHub | 1.5.0 | 1.5.0 | 1.4.1 | 1.4.1 | 
| Livy | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 
| MXNet | 1.9.1 | 1.9.1 | 1.9.1 | 1.9.1 | 
| Mahout |  -  |  -  |  -  |  -  | 
| Oozie | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 
| Phoenix | 5.1.2 | 5.1.2 | 5.1.2 | 5.1.2 | 
| Pig | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 
| Presto | 0.278.1-amzn-0 | 0.278.1-amzn-0 | 0.276-amzn-0 | 0.276-amzn-0 | 
| Spark | 3.3.1-amzn-0.1 | 3.3.1-amzn-0 | 3.3.0-amzn-1.1 | 3.3.0-amzn-1 | 
| Sqoop | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 
| TensorFlow | 2.11.0 | 2.11.0 | 2.10.0 | 2.10.0 | 
| Tez | 0.10.2-amzn-1.1 | 0.10.2-amzn-1 | 0.10.2-amzn-0.1 | 0.10.2-amzn-0 | 
| Trino (PrestoSQL) | 403-amzn-0 | 403-amzn-0 | 398-amzn-0 | 398-amzn-0 | 
| Zeppelin | 0.10.1 | 0.10.1 | 0.10.1 | 0.10.1 | 
| ZooKeeper | 3.5.10 | 3.5.10 | 3.5.10 | 3.5.10 | 

## 6.10.1 release notes
<a name="emr-6101-relnotes"></a>

The following release notes include information for Amazon EMR release 6.10.1. Changes are relative to 6.10.0. For information on the release timeline, see the [6.10.1 change log](#6101-changelog).

**Known Issues**
+ Starting with Spark 3.3.1 (supported in Amazon EMR releases 6.10 and higher), all executors on a decommissioning host are set to a new `ExecutorState` called *DECOMMISSIONING*. YARN can't allocate tasks to executors that are being decommissioned, so it requests new executors, if needed, for the tasks that are running. As a result, if you disable Spark dynamic resource allocation (DRA) while you use EMR Managed Scaling, EMR Auto Scaling, or any custom scaling mechanism on EMR-EC2 clusters, YARN might request the maximum permissible number of executors for each job. To avoid this issue, leave the `spark.dynamicAllocation.enabled` property set to `true` (the default) when you use this combination of features. You can also set the `spark.dynamicAllocation.minExecutors` and `spark.dynamicAllocation.maxExecutors` properties for your Spark jobs to restrict the number of executors allocated during the job’s execution.
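
As a minimal sketch of the mitigation described in this known issue, the following Python snippet builds a `spark-defaults` configuration object that keeps dynamic allocation enabled and bounds the executor count. The property names come from the note above. The helper name `spark_dra_configuration` and the executor limits (`1` and `20`) are illustrative assumptions, not recommended values.

```python
import json


def spark_dra_configuration(min_executors: int, max_executors: int) -> list:
    """Build a spark-defaults classification that keeps Spark DRA enabled.

    The bounds are hypothetical example values; choose limits that fit your workload.
    """
    if min_executors < 0 or max_executors < min_executors:
        raise ValueError("expected 0 <= min_executors <= max_executors")
    return [
        {
            "Classification": "spark-defaults",
            "Properties": {
                # Leave dynamic allocation on (the default) while the cluster scales.
                "spark.dynamicAllocation.enabled": "true",
                # Bound how many executors a single job can hold.
                "spark.dynamicAllocation.minExecutors": str(min_executors),
                "spark.dynamicAllocation.maxExecutors": str(max_executors),
            },
        }
    ]


if __name__ == "__main__":
    # Print the configuration as JSON, the format used for configuration classifications.
    print(json.dumps(spark_dra_configuration(1, 20), indent=2))
```

Property values are written as strings because configuration classifications carry string key-value pairs.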

**Changes, enhancements, and resolved issues**
+ Due to lock contention, a node can enter a deadlock if it's added or removed at the same time that it attempts to decommission. As a result, the Hadoop Resource Manager (YARN) becomes unresponsive, which affects all incoming and currently running containers.
+ Hadoop 3.3.3 introduced a change in YARN ([YARN-9608](https://issues.apache.org/jira/browse/YARN-9608)) that keeps nodes where containers ran in a decommissioning state until the application completes. This change ensures that local data such as shuffle data doesn't get lost, and you don't need to re-run the job. This approach might also lead to underutilization of resources on clusters with or without managed scaling enabled.

  With Amazon EMR releases 6.11.0 and higher as well as 6.8.1, 6.9.1, and 6.10.1, the value of `yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications` is set to `false` in `yarn-site.xml` to resolve this issue.

  While the fix addresses the issues that were introduced by YARN-9608, it might cause Hive jobs to fail due to shuffle data loss on clusters that have managed scaling enabled. We've mitigated that risk in this release by also setting `yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-shuffle-data` for Hive workloads. This config is only available with Amazon EMR releases 6.11.0 and higher.
+ Metrics collector will not send any metrics to the control plane after failover of primary node in clusters with the instance groups configuration.
+ This release includes a change that allows high-availability clusters to recover from failed state after restart.
+ This release includes security fixes for Hue and HBase.
+ This release fixes an issue where clusters that are running workloads on Spark with Amazon EMR might silently receive incorrect results with `contains`, `startsWith`, `endsWith`, and `like`. This issue occurs when you use the expressions on partitioned fields that have metadata in the Amazon EMR Hive3 Metastore Server (HMS).
+ This release fixes an issue with throttling on the Glue side when there are no user-defined functions (UDF).
+ This release fixes an issue where the node log aggregation service deletes container logs before log pusher can push them to S3 during YARN decommissioning.
+ This release fixes an issue with FairShare Scheduler metrics when Node Label is enabled for Hadoop.
+ This release fixes an issue that impacted Spark performance when you set a default `true` value for the `spark.yarn.heterogeneousExecutors.enabled` config in `spark-defaults.conf`.
+ This release fixes an issue with Reduce Task failing to read shuffle data. The issue caused Hive query failures with a corrupted memory error.
+ This release adds a new retry mechanism to the cluster scaling workflow for EMR clusters that run Presto or Trino. This improvement reduces the risk that cluster resizing will indefinitely stall due to a single failed resize operation. It also improves cluster utilization, because your cluster scales up and down faster.
+ This release improves cluster scale-down logic so that your cluster doesn't attempt a scale-down of core nodes below the HDFS replication factor setting for the cluster. This aligns with your data redundancy requirements, and reduces the chance that a scaling operation might stall.
+ The log management daemon has been upgraded to identify all logs that are in active use with open file handles on the local instance storage, and the associated processes. This upgrade ensures that Amazon EMR properly deletes the files and reclaims storage space after the logs are archived to Amazon S3.
+ This release includes a log-management daemon enhancement that deletes empty, unused steps directories in the local cluster file system. An excessively large number of empty directories can degrade the performance of Amazon EMR daemons and result in disk over-utilization.
+ This release fixes an issue that might occur when you create an edge node by replicating one of the primary nodes from a cluster with multiple primary nodes. The replicated edge node could cause delays with scale-down operations, or result in high memory-utilization on the primary nodes. For more information on how to create an edge node to communicate with your EMR cluster, see [Edge Node Creator](https://github.com/aws-samples/aws-emr-utilities/tree/main/utilities/emr-edge-node-creator) in the `aws-samples` repo on GitHub.
+ This release improves the automation process that Amazon EMR uses to re-mount Amazon EBS volumes to an instance after a reboot.
+ This release fixes an issue that resulted in intermittent gaps in the Hadoop metrics that Amazon EMR publishes to Amazon CloudWatch.
+ This release fixes an issue with EMR clusters where an update to the YARN configuration file that contains the exclusion list of nodes for the cluster is interrupted due to disk over-utilization. The incomplete update hinders future cluster scale-down operations. This release ensures that your cluster remains healthy, and that scaling operations work as expected.
+ When you launch a cluster with *the latest patch release* of Amazon EMR 5.36 or higher, 6.6 or higher, or 7.0 or higher, Amazon EMR uses the latest Amazon Linux 2023 or Amazon Linux 2 release for the default Amazon EMR AMI. For more information, see [Using the default Amazon Linux AMI for Amazon EMR](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-default-ami.html).    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-6101-release.html)
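
The YARN decommissioning setting described in the list above can be checked by reading a `yarn-site.xml` file. The following is a minimal sketch that parses Hadoop-style configuration XML and reports whether decommissioning nodes wait for running applications. The XML excerpt is hypothetical sample input inlined for self-containment, and the helper names are illustrative. Treating an absent property as `true` is an assumption that mirrors the behavior the release note attributes to Hadoop 3.3.3; verify it against your Hadoop version.

```python
import xml.etree.ElementTree as ET

WAIT_FOR_APPS = "yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications"

# Hypothetical excerpt of a yarn-site.xml file, inlined so the sketch is self-contained.
SAMPLE_YARN_SITE = """<configuration>
  <property>
    <name>yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications</name>
    <value>false</value>
  </property>
</configuration>"""


def read_hadoop_properties(xml_text: str) -> dict:
    """Return the name/value pairs from Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def waits_for_applications(props: dict) -> bool:
    """True when decommissioning nodes wait for running applications to finish."""
    # Assumption: a missing property is treated as "true" (see the lead-in).
    return props.get(WAIT_FOR_APPS, "true").lower() == "true"


if __name__ == "__main__":
    properties = read_hadoop_properties(SAMPLE_YARN_SITE)
    print("waits for applications:", waits_for_applications(properties))
```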

## 6.10.1 component versions
<a name="emr-6101-components"></a>

The components that Amazon EMR installs with this release are listed below. Some are installed as part of big-data application packages. Others are unique to Amazon EMR and installed for system processes and features. These typically start with `emr` or `aws`. Big-data application packages in the most recent Amazon EMR release are usually the latest version found in the community. We make community releases available in Amazon EMR as quickly as possible.

Some components in Amazon EMR differ from community versions. These components have a version label in the form `CommunityVersion-amzn-EmrVersion`. The `EmrVersion` starts at 0. For example, if an open source community component named `myapp-component` with version 2.2 has been modified three times for inclusion in different Amazon EMR releases, its release version is listed as `2.2-amzn-2`.
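
The label format can be split mechanically. The following is a minimal sketch that separates a version label into its community and EMR parts. The names `ComponentVersion` and `parse_component_version` are hypothetical. The pattern also accepts dotted EMR versions such as `2.1`, which appear in the table below; what the dotted suffix signifies is not defined here, so the sketch returns it unchanged.

```python
import re
from typing import NamedTuple, Optional


class ComponentVersion(NamedTuple):
    community: str
    emr: Optional[str]  # None when the label has no "-amzn-" part


# CommunityVersion-amzn-EmrVersion, where EmrVersion is digits with optional dots.
_LABEL = re.compile(r"^(?P<community>.+?)-amzn-(?P<emr>\d+(?:\.\d+)*)$")


def parse_component_version(label: str) -> ComponentVersion:
    """Split a component version label into community and EMR versions."""
    label = label.strip()
    match = _LABEL.match(label)
    if match is None:
        # Unmodified community version, for example "1.16.0".
        return ComponentVersion(label, None)
    return ComponentVersion(match["community"], match["emr"])


if __name__ == "__main__":
    for sample in ("2.2-amzn-2", "3.3.3-amzn-2.1", "1.16.0"):
        print(sample, "->", parse_component_version(sample))
```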


| Component | Version | Description | 
| --- | --- | --- | 
| aws-sagemaker-spark-sdk | 1.4.2 | Amazon SageMaker Spark SDK | 
| delta | 2.2.0 | Delta Lake is an open table format for huge analytic datasets | 
| emr-ddb | 4.16.0 | Amazon DynamoDB connector for Hadoop ecosystem applications. | 
| emr-goodies | 3.3.0 | Extra convenience libraries for the Hadoop ecosystem. | 
| emr-kinesis | 3.7.0 | Amazon Kinesis connector for Hadoop ecosystem applications. | 
| emr-notebook-env | 1.7.0 | Conda env for emr notebook which includes jupyter enterprise gateway | 
| emr-s3-dist-cp | 2.24.0 | Distributed copy application optimized for Amazon S3. | 
| emr-s3-select | 2.3.0 | EMR S3Select Connector | 
| emr-wal-cli | 1.0.0 | Cli used for emrwal list/deletion. | 
| emrfs | 2.55.0 | Amazon S3 connector for Hadoop ecosystem applications. | 
| flink-client | 1.16.0 | Apache Flink command line client scripts and applications. | 
| flink-jobmanager-config | 1.16.0 | Managing resources on EMR nodes for Apache Flink JobManager. | 
| ganglia-monitor | 3.7.2 | Embedded Ganglia agent for Hadoop ecosystem applications along with the Ganglia monitoring agent. | 
| ganglia-metadata-collector | 3.7.2 | Ganglia metadata collector for aggregating metrics from Ganglia monitoring agents. | 
| ganglia-web | 3.7.1 | Web application for viewing metrics collected by the Ganglia metadata collector. | 
| hadoop-client | 3.3.3-amzn-2.1 | Hadoop command-line clients such as 'hdfs', 'hadoop', or 'yarn'. | 
| hadoop-hdfs-datanode | 3.3.3-amzn-2.1 | HDFS node-level service for storing blocks. | 
| hadoop-hdfs-library | 3.3.3-amzn-2.1 | HDFS command-line client and library | 
| hadoop-hdfs-namenode | 3.3.3-amzn-2.1 | HDFS service for tracking file names and block locations. | 
| hadoop-hdfs-journalnode | 3.3.3-amzn-2.1 | HDFS service for managing the Hadoop filesystem journal on HA clusters. | 
| hadoop-httpfs-server | 3.3.3-amzn-2.1 | HTTP endpoint for HDFS operations. | 
| hadoop-kms-server | 3.3.3-amzn-2.1 | Cryptographic key management server based on Hadoop's KeyProvider API. | 
| hadoop-mapred | 3.3.3-amzn-2.1 | MapReduce execution engine libraries for running a MapReduce application. | 
| hadoop-yarn-nodemanager | 3.3.3-amzn-2.1 | YARN service for managing containers on an individual node. | 
| hadoop-yarn-resourcemanager | 3.3.3-amzn-2.1 | YARN service for allocating and managing cluster resources and distributed applications. | 
| hadoop-yarn-timeline-server | 3.3.3-amzn-2.1 | Service for retrieving current and historical information for YARN applications. | 
| hbase-hmaster | 2.4.15-amzn-0.1 | Service for an HBase cluster responsible for coordination of Regions and execution of administrative commands. | 
| hbase-region-server | 2.4.15-amzn-0.1 | Service for serving one or more HBase regions. | 
| hbase-client | 2.4.15-amzn-0.1 | HBase command-line client. | 
| hbase-rest-server | 2.4.15-amzn-0.1 | Service providing a RESTful HTTP endpoint for HBase. | 
| hbase-thrift-server | 2.4.15-amzn-0.1 | Service providing a Thrift endpoint to HBase. | 
| hbase-operator-tools | 2.4.15-amzn-0.1 | Repair tool for Apache HBase clusters. | 
| hcatalog-client | 3.1.3-amzn-3.1 | The 'hcat' command line client for manipulating hcatalog-server. | 
| hcatalog-server | 3.1.3-amzn-3.1 | Service providing HCatalog, a table and storage management layer for distributed applications. | 
| hcatalog-webhcat-server | 3.1.3-amzn-3.1 | HTTP endpoint providing a REST interface to HCatalog. | 
| hive-client | 3.1.3-amzn-3.1 | Hive command line client. | 
| hive-hbase | 3.1.3-amzn-3.1 | Hive-hbase client. | 
| hive-metastore-server | 3.1.3-amzn-3.1 | Service for accessing the Hive metastore, a semantic repository storing metadata for SQL on Hadoop operations. | 
| hive-server2 | 3.1.3-amzn-3.1 | Service for accepting Hive queries as web requests. | 
| hudi | 0.12.2-amzn-0 | Incremental processing framework to power data pipeline at low latency and high efficiency. | 
| hudi-presto | 0.12.2-amzn-0 | Bundle library for running Presto with Hudi. | 
| hudi-trino | 0.12.2-amzn-0 | Bundle library for running Trino with Hudi. | 
| hudi-spark | 0.12.2-amzn-0 | Bundle library for running Spark with Hudi. | 
| hue-server | 4.10.0 | Web application for analyzing data using Hadoop ecosystem applications | 
| iceberg | 1.1.0-amzn-0 | Apache Iceberg is an open table format for huge analytic datasets | 
| jupyterhub | 1.5.0 | Multi-user server for Jupyter notebooks | 
| livy-server | 0.7.1-incubating | REST interface for interacting with Apache Spark | 
| nginx | 1.12.1 | nginx [engine x] is an HTTP and reverse proxy server | 
| mxnet | 1.9.1 | A flexible, scalable, and efficient library for deep learning. | 
| mariadb-server | 5.5.68+ | MariaDB database server. | 
| nvidia-cuda | 11.8.0 | Nvidia drivers and Cuda toolkit | 
| oozie-client | 5.2.1 | Oozie command-line client. | 
| oozie-server | 5.2.1 | Service for accepting Oozie workflow requests. | 
| opencv | 4.5.0 | Open Source Computer Vision Library. | 
| phoenix-library | 5.1.2 | The phoenix libraries for server and client | 
| phoenix-connectors | 6.0.0-SNAPSHOT | Apache Phoenix-Connectors for Spark-3 | 
| phoenix-query-server | 6.0.0 | A light weight server providing JDBC access as well as Protocol Buffers and JSON format access to the Avatica API  | 
| presto-coordinator | 0.278.1-amzn-0 | Service for accepting queries and managing query execution among presto-workers. | 
| presto-worker | 0.278.1-amzn-0 | Service for executing pieces of a query. | 
| presto-client | 0.278.1-amzn-0 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| trino-coordinator | 403-amzn-0 | Service for accepting queries and managing query execution among trino-workers. | 
| trino-worker | 403-amzn-0 | Service for executing pieces of a query. | 
| trino-client | 403-amzn-0 | Trino command-line client which is installed on an HA cluster's stand-by masters where Trino server is not started. | 
| pig-client | 0.17.0 | Pig command-line client. | 
| r | 4.0.2 | The R Project for Statistical Computing | 
| ranger-kms-server | 2.0.0 | Apache Ranger Key Management System | 
| spark-client | 3.3.1-amzn-0.1 | Spark command-line clients. | 
| spark-history-server | 3.3.1-amzn-0.1 | Web UI for viewing logged events for the lifetime of a completed Spark application. | 
| spark-on-yarn | 3.3.1-amzn-0.1 | In-memory execution engine for YARN. | 
| spark-yarn-slave | 3.3.1-amzn-0.1 | Apache Spark libraries needed by YARN slaves. | 
| spark-rapids | 22.12.0-amzn-0 | Nvidia Spark RAPIDS plugin that accelerates Apache Spark with GPUs. | 
| sqoop-client | 1.4.7 | Apache Sqoop command-line client. | 
| tensorflow | 2.11.0 | TensorFlow open source software library for high performance numerical computation. | 
| tez-on-yarn | 0.10.2-amzn-1.1 | The tez YARN application and libraries. | 
| tez-on-worker | 0.10.2-amzn-1.1 | The tez YARN application and libraries for worker nodes. | 
| webserver | 2.4.41+ | Apache HTTP server. | 
| zeppelin-server | 0.10.1 | Web-based notebook that enables interactive data analytics. | 
| zookeeper-server | 3.5.10 | Centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. | 
| zookeeper-client | 3.5.10 | ZooKeeper command line client. | 

## 6.10.1 configuration classifications
<a name="emr-6101-class"></a>

Configuration classifications allow you to customize applications. These often correspond to a configuration XML file for the application, such as `hive-site.xml`. For more information, see [Configure applications](emr-configure-apps.md).

Reconfiguration actions occur when you specify a configuration for instance groups in a running cluster. Amazon EMR only initiates reconfiguration actions for the classifications that you modify. For more information, see [Reconfigure an instance group in a running cluster](emr-configure-apps-running-cluster.md).
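
To illustrate the shape of a reconfiguration request, the following sketch builds an instance group entry that carries one modified classification. The helper name, the instance group ID `ig-EXAMPLE12345`, and the `hive.exec.parallel` value are hypothetical. The structure is intended to match the `InstanceGroups` list accepted by the Amazon EMR `ModifyInstanceGroups` operation; treat that mapping as an assumption and confirm it against the API reference before use.

```python
import json


def instance_group_reconfiguration(
    instance_group_id: str, classification: str, properties: dict
) -> list:
    """Build an InstanceGroups list that reconfigures one classification."""
    return [
        {
            "InstanceGroupId": instance_group_id,
            "Configurations": [
                {
                    "Classification": classification,
                    # Classification properties are string key-value pairs.
                    "Properties": {key: str(value) for key, value in properties.items()},
                    "Configurations": [],
                }
            ],
        }
    ]


if __name__ == "__main__":
    # Hypothetical instance group ID and property value, for illustration only.
    payload = instance_group_reconfiguration(
        "ig-EXAMPLE12345", "hive-site", {"hive.exec.parallel": "true"}
    )
    print(json.dumps(payload, indent=2))
```

Because only modified classifications trigger reconfiguration actions, a request like this one would be expected to trigger only the actions listed for `hive-site` in the table below.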


**emr-6.10.1 classifications**  

| Classifications | Description | Reconfiguration Actions | 
| --- | --- | --- | 
| capacity-scheduler | Change values in Hadoop's capacity-scheduler.xml file. | Restarts the ResourceManager service. | 
| container-executor | Change values in Hadoop YARN's container-executor.cfg file. | Not available. | 
| container-log4j | Change values in Hadoop YARN's container-log4j.properties file. | Not available. | 
| core-site | Change values in Hadoop's core-site.xml file. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Ranger KMS, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| docker-conf | Change docker related settings. | Not available. | 
| emrfs-site | Change EMRFS settings. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts HBaseRegionserver, HBaseMaster, HBaseThrift, HBaseRest, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| flink-conf | Change flink-conf.yaml settings. | Restarts Flink history server. | 
| flink-log4j | Change Flink log4j.properties settings. | Restarts Flink history server. | 
| flink-log4j-session | Change Flink log4j-session.properties settings for Kubernetes/Yarn session. | Restarts Flink history server. | 
| flink-log4j-cli | Change Flink log4j-cli.properties settings. | Restarts Flink history server. | 
| hadoop-env | Change values in the Hadoop environment for all Hadoop components. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts PhoenixQueryserver, HiveServer2, Hive MetaStore, and MapReduce-HistoryServer. | 
| hadoop-log4j | Change values in Hadoop's log4j.properties file. | Restarts the Hadoop HDFS services SecondaryNamenode, Datanode, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| hadoop-ssl-server | Change Hadoop SSL server configuration. | Not available. | 
| hadoop-ssl-client | Change Hadoop SSL client configuration. | Not available. | 
| hbase | Amazon EMR-curated settings for Apache HBase. | Custom EMR specific property. Sets emrfs-site and hbase-site configs. See those for their associated restarts. | 
| hbase-env | Change values in HBase's environment. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-log4j | Change values in HBase's hbase-log4j.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-metrics | Change values in HBase's hadoop-metrics2-hbase.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-policy | Change values in HBase's hbase-policy.xml file. | Not available. | 
| hbase-site | Change values in HBase's hbase-site.xml file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. Additionally restarts Phoenix QueryServer. | 
| hdfs-encryption-zones | Configure HDFS encryption zones. | This classification should not be reconfigured. | 
| hdfs-env | Change values in the HDFS environment. | Restarts Hadoop HDFS services Namenode, Datanode, and ZKFC. | 
| hdfs-site | Change values in HDFS's hdfs-site.xml. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Additionally restarts Hadoop Httpfs. | 
| hcatalog-env | Change values in HCatalog's environment. | Restarts Hive HCatalog Server. | 
| hcatalog-server-jndi | Change values in HCatalog's jndi.properties. | Restarts Hive HCatalog Server. | 
| hcatalog-server-proto-hive-site | Change values in HCatalog's proto-hive-site.xml. | Restarts Hive HCatalog Server. | 
| hcatalog-webhcat-env | Change values in HCatalog WebHCat's environment. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-log4j2 | Change values in HCatalog WebHCat's log4j2.properties. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-site | Change values in HCatalog WebHCat's webhcat-site.xml file. | Restarts Hive WebHCat server. | 
| hive | Amazon EMR-curated settings for Apache Hive. | Sets configurations to launch Hive LLAP service. | 
| hive-beeline-log4j2 | Change values in Hive's beeline-log4j2.properties file. | Not available. | 
| hive-parquet-logging | Change values in Hive's parquet-logging.properties file. | Not available. | 
| hive-env | Change values in the Hive environment. | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. | 
| hive-exec-log4j2 | Change values in Hive's hive-exec-log4j2.properties file. | Not available. | 
| hive-llap-daemon-log4j2 | Change values in Hive's llap-daemon-log4j2.properties file. | Not available. | 
| hive-log4j2 | Change values in Hive's hive-log4j2.properties file. | Not available. | 
| hive-site | Change values in Hive's hive-site.xml file. | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. Also restarts Oozie and Zeppelin. | 
| hiveserver2-site | Change values in Hive Server2's hiveserver2-site.xml file. | Not available. | 
| hue-ini | Change values in Hue's ini file. | Restarts Hue. Also activates Hue config override CLI commands to pick up new configurations. | 
| httpfs-env | Change values in the HTTPFS environment. | Restarts Hadoop Httpfs service. | 
| httpfs-site | Change values in Hadoop's httpfs-site.xml file. | Restarts Hadoop Httpfs service. | 
| hadoop-kms-acls | Change values in Hadoop's kms-acls.xml file. | Not available. | 
| hadoop-kms-env | Change values in the Hadoop KMS environment. | Restarts Hadoop-KMS service. | 
| hadoop-kms-log4j | Change values in Hadoop's kms-log4j.properties file. | Not available. | 
| hadoop-kms-site | Change values in Hadoop's kms-site.xml file. | Restarts Hadoop-KMS and Ranger-KMS service. | 
| hudi-env | Change values in the Hudi environment. | Not available. | 
| hudi-defaults | Change values in Hudi's hudi-defaults.conf file. | Not available. | 
| iceberg-defaults | Change values in Iceberg's iceberg-defaults.conf file. | Not available. | 
| delta-defaults | Change values in Delta's delta-defaults.conf file. | Not available. | 
| jupyter-notebook-conf | Change values in Jupyter Notebook's jupyter_notebook_config.py file. | Not available. | 
| jupyter-hub-conf | Change values in JupyterHub's jupyterhub_config.py file. | Not available. | 
| jupyter-s3-conf | Configure Jupyter Notebook S3 persistence. | Not available. | 
| jupyter-sparkmagic-conf | Change values in Sparkmagic's config.json file. | Not available. | 
| livy-conf | Change values in Livy's livy.conf file. | Restarts Livy Server. | 
| livy-env | Change values in the Livy environment. | Restarts Livy Server. | 
| livy-log4j2 | Change Livy log4j2.properties settings. | Restarts Livy Server. | 
| mapred-env | Change values in the MapReduce application's environment. | Restarts Hadoop MapReduce-HistoryServer. | 
| mapred-site | Change values in the MapReduce application's mapred-site.xml file. | Restarts Hadoop MapReduce-HistoryServer. | 
| oozie-env | Change values in Oozie's environment. | Restarts Oozie. | 
| oozie-log4j | Change values in Oozie's oozie-log4j.properties file. | Restarts Oozie. | 
| oozie-site | Change values in Oozie's oozie-site.xml file. | Restarts Oozie. | 
| phoenix-hbase-metrics | Change values in Phoenix's hadoop-metrics2-hbase.properties file. | Not available. | 
| phoenix-hbase-site | Change values in Phoenix's hbase-site.xml file. | Not available. | 
| phoenix-log4j | Change values in Phoenix's log4j.properties file. | Restarts Phoenix-QueryServer. | 
| phoenix-metrics | Change values in Phoenix's hadoop-metrics2-phoenix.properties file. | Not available. | 
| pig-env | Change values in the Pig environment. | Not available. | 
| pig-properties | Change values in Pig's pig.properties file. | Restarts Oozie. | 
| pig-log4j | Change values in Pig's log4j.properties file. | Not available. | 
| presto-log | Change values in Presto's log.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-config | Change values in Presto's config.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-password-authenticator | Change values in Presto's password-authenticator.properties file. | Not available. | 
| presto-env | Change values in Presto's presto-env.sh file. | Restarts Presto-Server (for PrestoDB) | 
| presto-node | Change values in Presto's node.properties file. | Not available. | 
| presto-connector-blackhole | Change values in Presto's blackhole.properties file. | Not available. | 
| presto-connector-cassandra | Change values in Presto's cassandra.properties file. | Not available. | 
| presto-connector-hive | Change values in Presto's hive.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-jmx | Change values in Presto's jmx.properties file. | Not available. | 
| presto-connector-kafka | Change values in Presto's kafka.properties file. | Not available. | 
| presto-connector-lakeformation | Change values in Presto's lakeformation.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-localfile | Change values in Presto's localfile.properties file. | Not available. | 
| presto-connector-memory | Change values in Presto's memory.properties file. | Not available. | 
| presto-connector-mongodb | Change values in Presto's mongodb.properties file. | Not available. | 
| presto-connector-mysql | Change values in Presto's mysql.properties file. | Not available. | 
| presto-connector-postgresql | Change values in Presto's postgresql.properties file. | Not available. | 
| presto-connector-raptor | Change values in Presto's raptor.properties file. | Not available. | 
| presto-connector-redis | Change values in Presto's redis.properties file. | Not available. | 
| presto-connector-redshift | Change values in Presto's redshift.properties file. | Not available. | 
| presto-connector-tpch | Change values in Presto's tpch.properties file. | Not available. | 
| presto-connector-tpcds | Change values in Presto's tpcds.properties file. | Not available. | 
| trino-log | Change values in Trino's log.properties file. | Restarts Trino-Server (for Trino) | 
| trino-config | Change values in Trino's config.properties file. | Restarts Trino-Server (for Trino) | 
| trino-password-authenticator | Change values in Trino's password-authenticator.properties file. | Restarts Trino-Server (for Trino) | 
| trino-env | Change values in Trino's trino-env.sh file. | Restarts Trino-Server (for Trino) | 
| trino-node | Change values in Trino's node.properties file. | Not available. | 
| trino-connector-blackhole | Change values in Trino's blackhole.properties file. | Not available. | 
| trino-connector-cassandra | Change values in Trino's cassandra.properties file. | Not available. | 
| trino-connector-delta | Change values in Trino's delta.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-hive | Change values in Trino's hive.properties file. | Restarts Trino-Server (for Trino) | 
| trino-exchange-manager | Change values in Trino's exchange-manager.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-iceberg | Change values in Trino's iceberg.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-hudi | Change values in Trino's hudi.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-jmx | Change values in Trino's jmx.properties file. | Not available. | 
| trino-connector-kafka | Change values in Trino's kafka.properties file. | Not available. | 
| trino-connector-localfile | Change values in Trino's localfile.properties file. | Not available. | 
| trino-connector-memory | Change values in Trino's memory.properties file. | Not available. | 
| trino-connector-mongodb | Change values in Trino's mongodb.properties file. | Not available. | 
| trino-connector-mysql | Change values in Trino's mysql.properties file. | Not available. | 
| trino-connector-postgresql | Change values in Trino's postgresql.properties file. | Not available. | 
| trino-connector-raptor | Change values in Trino's raptor.properties file. | Not available. | 
| trino-connector-redis | Change values in Trino's redis.properties file. | Not available. | 
| trino-connector-redshift | Change values in Trino's redshift.properties file. | Not available. | 
| trino-connector-tpch | Change values in Trino's tpch.properties file. | Not available. | 
| trino-connector-tpcds | Change values in Trino's tpcds.properties file. | Not available. | 
| ranger-kms-dbks-site | Change values in dbks-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-site | Change values in ranger-kms-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-env | Change values in the Ranger KMS environment. | Restarts Ranger KMS Server. | 
| ranger-kms-logback | Change values in kms-logback.xml file of Ranger KMS. | Not available. | 
| ranger-kms-db-ca | Change values for CA file on S3 for MySQL SSL connection with Ranger KMS. | Not available. | 
| spark | Amazon EMR-curated settings for Apache Spark. | This property modifies spark-defaults. See actions there. | 
| spark-defaults | Change values in Spark's spark-defaults.conf file. | Restarts Spark history server and Spark thrift server. | 
| spark-env | Change values in the Spark environment. | Restarts Spark history server and Spark thrift server. | 
| spark-hive-site | Change values in Spark's hive-site.xml file | Not available. | 
| spark-log4j2 | Change values in Spark's log4j2.properties file. | Restarts Spark history server and Spark thrift server. | 
| spark-metrics | Change values in Spark's metrics.properties file. | Restarts Spark history server and Spark thrift server. | 
| sqoop-env | Change values in Sqoop's environment. | Not available. | 
| sqoop-oraoop-site | Change values in Sqoop OraOop's oraoop-site.xml file. | Not available. | 
| sqoop-site | Change values in Sqoop's sqoop-site.xml file. | Not available. | 
| tez-site | Change values in Tez's tez-site.xml file. | Restarts Oozie and HiveServer2. | 
| yarn-env | Change values in the YARN environment. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts MapReduce-HistoryServer. | 
| yarn-site | Change values in YARN's yarn-site.xml file. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Livy Server and MapReduce-HistoryServer. | 
| zeppelin-env | Change values in the Zeppelin environment. | Restarts Zeppelin. | 
| zeppelin-site | Change configuration settings in zeppelin-site.xml. | Restarts Zeppelin. | 
| zookeeper-config | Change values in ZooKeeper's zoo.cfg file. | Restarts ZooKeeper server. | 
| zookeeper-log4j | Change values in ZooKeeper's log4j.properties file. | Restarts ZooKeeper server. | 

## 6.10.1 change log
<a name="6101-changelog"></a>


**Change log for 6.10.1 release and release notes**  

| Date | Event | Description | 
| --- | --- | --- | 
| 2023-08-30 | Update release notes | Added several control-plane related fixes to the release notes | 
| 2023-08-21 | Docs publication | Amazon EMR 6.10.1 release notes first published | 
| 2023-08-16 | Deployment complete | Amazon EMR 6.10.1 fully deployed to all [supported Regions](https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/) | 
| 2023-08-04 | Initial release | Amazon EMR 6.10.1 first deployed to limited commercial Regions | 

# Amazon EMR release 6.10.0
<a name="emr-6100-release"></a>

## 6.10.0 application versions
<a name="emr-6100-app-versions"></a>

This release includes the following applications: [Delta](https://delta.io/), [Flink](https://flink.apache.org/), [Ganglia](http://ganglia.info), [HBase](http://hbase.apache.org/), [HCatalog](https://cwiki.apache.org/confluence/display/Hive/HCatalog), [Hadoop](http://hadoop.apache.org/docs/current/), [Hive](http://hive.apache.org/), [Hudi](https://hudi.apache.org), [Hue](http://gethue.com/), [Iceberg](https://iceberg.apache.org/), [JupyterEnterpriseGateway](https://jupyter-enterprise-gateway.readthedocs.io/en/latest/), [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/#), [Livy](https://livy.incubator.apache.org/), [MXNet](https://mxnet.incubator.apache.org/), [Oozie](http://oozie.apache.org/), [Phoenix](https://phoenix.apache.org/), [Pig](http://pig.apache.org/), [Presto](https://prestodb.io/), [Spark](https://spark.apache.org/docs/latest/), [Sqoop](http://sqoop.apache.org/), [TensorFlow](https://www.tensorflow.org/), [Tez](https://tez.apache.org/), [Trino](https://trino.io/), [Zeppelin](https://zeppelin.incubator.apache.org/), and [ZooKeeper](https://zookeeper.apache.org).

The table below lists the application versions available in this release of Amazon EMR and the application versions in the preceding three Amazon EMR releases (when applicable).

For a comprehensive history of application versions for each release of Amazon EMR, see the following topics:
+ [Application versions in Amazon EMR 7.x releases](emr-release-app-versions-7.x.md)
+ [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md)
+ [Application versions in Amazon EMR 5.x releases](emr-release-app-versions-5.x.md)
+ [Application versions in Amazon EMR 4.x releases](emr-release-app-versions-4.x.md)


**Application version information**  

|  | emr-6.10.0 | emr-6.9.1 | emr-6.9.0 | emr-6.8.1 | 
| --- | --- | --- | --- | --- | 
| AWS SDK for Java | 1.12.397 | 1.12.170 | 1.12.170 | 1.12.170 | 
| Python | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 
| Scala | 2.12.15 | 2.12.15 | 2.12.15 | 2.12.15 | 
| AmazonCloudWatchAgent |  -  |  -  |  -  |  -  | 
| Delta | 2.2.0 | 2.1.0 | 2.1.0 |  -  | 
| Flink | 1.16.0 | 1.15.2 | 1.15.2 | 1.15.1 | 
| Ganglia | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 
| HBase | 2.4.15-amzn-0 | 2.4.13-amzn-0.1 | 2.4.13-amzn-0 | 2.4.12-amzn-0.1 | 
| HCatalog | 3.1.3-amzn-3 | 3.1.3-amzn-2.1 | 3.1.3-amzn-2 | 3.1.3-amzn-1.1 | 
| Hadoop | 3.3.3-amzn-2 | 3.3.3-amzn-1.1 | 3.3.3-amzn-1 | 3.2.1-amzn-8.1 | 
| Hive | 3.1.3-amzn-3 | 3.1.3-amzn-2.1 | 3.1.3-amzn-2 | 3.1.3-amzn-1.1 | 
| Hudi | 0.12.2-amzn-0 | 0.12.1-amzn-0 | 0.12.1-amzn-0 | 0.11.1-amzn-0 | 
| Hue | 4.10.0 | 4.10.0 | 4.10.0 | 4.10.0 | 
| Iceberg | 1.1.0-amzn-0 | 0.14.1-amzn-0 | 0.14.1-amzn-0 | 0.14.0-amzn-0 | 
| JupyterEnterpriseGateway | 2.6.0 | 2.6.0 | 2.6.0 | 2.1.0 | 
| JupyterHub | 1.5.0 | 1.4.1 | 1.4.1 | 1.4.1 | 
| Livy | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 
| MXNet | 1.9.1 | 1.9.1 | 1.9.1 | 1.9.1 | 
| Mahout |  -  |  -  |  -  |  -  | 
| Oozie | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 
| Phoenix | 5.1.2 | 5.1.2 | 5.1.2 | 5.1.2 | 
| Pig | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 
| Presto | 0.278.1-amzn-0 | 0.276-amzn-0 | 0.276-amzn-0 | 0.273.3-amzn-0 | 
| Spark | 3.3.1-amzn-0 | 3.3.0-amzn-1.1 | 3.3.0-amzn-1 | 3.3.0-amzn-0.1 | 
| Sqoop | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 
| TensorFlow | 2.11.0 | 2.10.0 | 2.10.0 | 2.9.1 | 
| Tez | 0.10.2-amzn-1 | 0.10.2-amzn-0.1 | 0.10.2-amzn-0 | 0.9.2 | 
| Trino (PrestoSQL) | 403-amzn-0 | 398-amzn-0 | 398-amzn-0 | 388-amzn-0 | 
| Zeppelin | 0.10.1 | 0.10.1 | 0.10.1 | 0.10.1 | 
| ZooKeeper | 3.5.10 | 3.5.10 | 3.5.10 | 3.5.10 | 

## 6.10.0 release notes
<a name="emr-6100-relnotes"></a>

The following release notes include information for Amazon EMR release 6.10.0. Changes are relative to 6.9.0. For information on the release timeline, see the [change log](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-6100-release.html#6100-changelog).

**New features**
+ Amazon EMR 6.10.0 supports Apache Spark 3.3.1, Apache Spark RAPIDS 22.12.0, CUDA 11.8.0, Apache Hudi 0.12.2-amzn-0, Apache Iceberg 1.1.0-amzn-0, Trino 403, and PrestoDB 0.278.1.
+ Amazon EMR 6.10.0 includes a native Trino-Hudi connector that provides read access to data in Hudi tables. You can activate the connector with `trino-cli --catalog hudi`, and configure the connector for your requirements with `trino-connector-hudi`. The native integration with Amazon EMR means that you no longer need to use `trino-connector-hive` to query Hudi tables. For a list of supported configurations with the new connector, see the [Hudi connector](https://trino.io/docs/current/connector/hudi.html) page of the Trino documentation.
+ Amazon EMR releases 6.10.0 and higher support Apache Zeppelin integration with Apache Flink. See [Working with Flink jobs from Zeppelin in Amazon EMR](flink-zeppelin.md) for more information.
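
The `trino-connector-hudi` classification mentioned above can be sketched as a configuration list in the `Classification`/`Properties` shape that Amazon EMR accepts when you launch a cluster. This is a minimal illustration, not a recommended configuration: the helper name `trino_hudi_configuration` is hypothetical, and the `hudi.metadata-enabled` property is an assumed example. Check the Hudi connector page of the Trino documentation for the properties that your Trino version supports.

```python
import json


def trino_hudi_configuration(properties):
    """Return an EMR configuration list for the trino-connector-hudi classification."""
    return [
        {
            "Classification": "trino-connector-hudi",
            # EMR configuration property values are passed as strings.
            "Properties": {key: str(value) for key, value in properties.items()},
        }
    ]


# Assumed example property; substitute the connector settings you need.
configurations = trino_hudi_configuration({"hudi.metadata-enabled": "true"})
print(json.dumps(configurations, indent=2))
```

The printed JSON could be saved to a file and supplied as the cluster's configurations at launch.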

**Known Issues**
+ Hadoop 3.3.3 introduced a change in YARN ([YARN-9608](https://issues.apache.org/jira/browse/YARN-9608)) that keeps nodes where containers ran in a decommissioning state until the application completes. This change ensures that local data such as shuffle data isn't lost, and that you don't need to re-run the job. However, this approach might also lead to underutilization of resources on clusters with or without managed scaling enabled.

  To work around this issue in Amazon EMR 6.10.0, you can set the value of `yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications` to `false` in `yarn-site.xml`. In Amazon EMR releases 6.11.0 and higher as well as 6.8.1, 6.9.1, and 6.10.1, the config is set to `false` by default to resolve this issue.
+ Starting with Spark 3.3.1 (supported in Amazon EMR releases 6.10 and higher), all executors on a decommissioning host are set to a new `ExecutorState` called *DECOMMISSIONING*. YARN can't allocate tasks to executors that are being decommissioned, so it requests new executors, if needed, for the tasks being executed. As a result, if you disable Spark dynamic resource allocation (DRA) while using EMR Managed Scaling, EMR Auto Scaling, or any custom scaling mechanism on EMR-EC2 clusters, YARN might request the maximum permissible number of executors for each job. To avoid this issue, leave the `spark.dynamicAllocation.enabled` property set to `TRUE` (which is the default) when you use this combination of features. You can also set the `spark.dynamicAllocation.minExecutors` and `spark.dynamicAllocation.maxExecutors` properties for your Spark jobs to restrict the number of executors allocated during a job’s execution.
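
The two mitigations described in the known issues above can be expressed as configuration classifications. The following is a minimal sketch, assuming you apply the settings through the `yarn-site` and `spark-defaults` classifications. The function name `known_issue_workarounds` and the executor bounds of 1 and 20 are placeholders for illustration, not recommended values.

```python
import json


def known_issue_workarounds(min_executors, max_executors):
    """Build EMR configurations for the YARN and Spark known issues in 6.10.0."""
    if min_executors > max_executors:
        raise ValueError("min_executors must not exceed max_executors")
    return [
        {
            # Stop YARN from holding nodes in a decommissioning state
            # until the application completes.
            "Classification": "yarn-site",
            "Properties": {
                "yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications": "false"
            },
        },
        {
            # Keep dynamic allocation enabled and bound the executor count.
            "Classification": "spark-defaults",
            "Properties": {
                "spark.dynamicAllocation.enabled": "true",
                "spark.dynamicAllocation.minExecutors": str(min_executors),
                "spark.dynamicAllocation.maxExecutors": str(max_executors),
            },
        },
    ]


# Placeholder bounds; choose values that fit your workload.
workarounds = known_issue_workarounds(min_executors=1, max_executors=20)
print(json.dumps(workarounds, indent=2))
```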

**Changes, enhancements, and resolved issues**
+ Amazon EMR 6.10.0 removes the dependency on `minimal-json.jar` for the [Amazon Redshift integration for Apache Spark](emr-spark-redshift-launch.md), and automatically adds the required Spark-Redshift related jars to the executor class path for Spark: `spark-redshift.jar`, `spark-avro.jar`, and `RedshiftJDBC.jar`.
+ The 6.10.0 release improves the on-cluster log management daemon to monitor additional log folders in your EMR cluster. This improvement minimizes disk over-utilization scenarios.
+ The 6.10.0 release automatically restarts the on-cluster log management daemon when it stops. This improvement reduces the risk of nodes appearing unhealthy due to disk over-utilization.
+ Amazon EMR 6.10.0 supports regional endpoints for EMRFS user mapping.
+ The default root volume size has increased to 15 GB in Amazon EMR 6.10.0 and higher. Earlier releases have a default root volume size of 10 GB.
+ The 6.10.0 release fixes an issue that caused Spark jobs to stall when all remaining Spark executors are on a decommissioning host with the YARN resource manager. 
+ With Amazon EMR 6.6.0 through 6.9.x, INSERT queries with dynamic partition and an ORDER BY or SORT BY clause will always have two reducers. This issue is caused by OSS change [HIVE-20703](https://issues.apache.org/jira/browse/HIVE-20703), which puts dynamic sort partition optimization under cost-based decision. If your workload doesn't require sorting of dynamic partitions, we recommend that you set the `hive.optimize.sort.dynamic.partition.threshold` property to `-1` to disable the new feature and get the correctly calculated number of reducers. This issue is fixed in OSS Hive as part of [HIVE-22269](https://issues.apache.org/jira/browse/HIVE-22269) and is fixed in Amazon EMR 6.10.0.
+ When you launch a cluster with *the latest patch release* of Amazon EMR 5.36 or higher, 6.6 or higher, or 7.0 or higher, Amazon EMR uses the latest Amazon Linux 2023 or Amazon Linux 2 release for the default Amazon EMR AMI. For more information, see [Using the default Amazon Linux AMI for Amazon EMR](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-default-ami.html).
**Note**  
This release no longer gets automatic AMI updates since it has been succeeded by one or more patch releases. The patch release is denoted by the number after the second decimal point (`6.8.1`). To see if you're using the latest patch release, check the available releases in the [Release Guide](https://docs.aws.amazon.com/emr/latest/ReleaseGuide), check the **Amazon EMR release** dropdown when you create a cluster in the console, or use the [ListReleaseLabels](https://docs.aws.amazon.com/emr/latest/APIReference/API_ListReleaseLabels.html) API or [list-release-labels](https://docs.aws.amazon.com/cli/latest/reference/emr/list-release-labels.html) CLI action. To get updates about new releases, subscribe to the RSS feed on the [What's new?](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-whatsnew.html) page.  
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-6100-release.html)
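
The patch-release check described in the note above can be sketched in a few lines of Python. This is an illustrative helper only: the function names are hypothetical, and the `available` list is hard-coded sample input that in practice would come from the ListReleaseLabels API or the `list-release-labels` CLI action.

```python
import re

# Release labels take the form emr-x.x.x (major.minor.patch).
LABEL_PATTERN = re.compile(r"^emr-(\d+)\.(\d+)\.(\d+)$")


def parse_release_label(label):
    """Split a release label into a (major, minor, patch) tuple of ints."""
    match = LABEL_PATTERN.match(label)
    if match is None:
        raise ValueError(f"Unrecognized release label: {label!r}")
    return tuple(int(part) for part in match.groups())


def latest_patch(label, available_labels):
    """Return the highest patch release that shares major.minor with label."""
    current = parse_release_label(label)
    versions = [current]
    for candidate in available_labels:
        if LABEL_PATTERN.match(candidate):  # skip labels in other formats
            versions.append(parse_release_label(candidate))
    same_minor = [v for v in versions if v[:2] == current[:2]]
    return "emr-{}.{}.{}".format(*max(same_minor))


# Sample input; a real list would come from ListReleaseLabels.
available = ["emr-6.11.1", "emr-6.11.0", "emr-6.10.1", "emr-6.10.0", "emr-6.9.1"]
newest = latest_patch("emr-6.10.0", available)
status = "up to date" if newest == "emr-6.10.0" else "newer patch available"
print(newest, f"({status})")
```

Comparing the versions as integer tuples avoids the string-ordering trap in which `6.9.1` would sort after `6.10.0`.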

## 6.10.0 component versions
<a name="emr-6100-components"></a>

The components that Amazon EMR installs with this release are listed below. Some are installed as part of big-data application packages. Others are unique to Amazon EMR and installed for system processes and features. These typically start with `emr` or `aws`. Big-data application packages in the most recent Amazon EMR release are usually the latest version found in the community. We make community releases available in Amazon EMR as quickly as possible.

Some components in Amazon EMR differ from community versions. These components have a version label in the form `CommunityVersion-amzn-EmrVersion`. The `EmrVersion` starts at 0. For example, if an open source community component named `myapp-component` with version 2.2 has been modified three times for inclusion in different Amazon EMR releases, its release version is listed as `2.2-amzn-2`.
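
To make the `CommunityVersion-amzn-EmrVersion` convention concrete, here is a minimal sketch that splits a version label into its two parts. The function name and the returned dictionary keys are hypothetical, chosen only for illustration. Labels without an `-amzn-` suffix are treated as unmodified community versions.

```python
import re

# CommunityVersion-amzn-EmrVersion, for example 2.2-amzn-2 or 2.4.13-amzn-0.1
VERSION_PATTERN = re.compile(r"^(?P<community>.+?)-amzn-(?P<emr>\d+(?:\.\d+)*)$")


def parse_component_version(version):
    """Split a component version into its community and EMR-specific parts."""
    match = VERSION_PATTERN.match(version)
    if match is None:
        # No -amzn- suffix: the component matches the community version.
        return {"community_version": version, "emr_version": None}
    return {
        "community_version": match.group("community"),
        "emr_version": match.group("emr"),
    }


for label in ("2.2-amzn-2", "3.3.3-amzn-2", "5.2.1"):
    print(label, "->", parse_component_version(label))
```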


| Component | Version | Description | 
| --- | --- | --- | 
| aws-sagemaker-spark-sdk | 1.4.2 | Amazon SageMaker Spark SDK | 
| delta | 2.2.0 | Delta Lake is an open table format for huge analytic datasets | 
| emr-ddb | 4.16.0 | Amazon DynamoDB connector for Hadoop ecosystem applications. | 
| emr-goodies | 3.3.0 | Extra convenience libraries for the Hadoop ecosystem. | 
| emr-kinesis | 3.7.0 | Amazon Kinesis connector for Hadoop ecosystem applications. | 
| emr-notebook-env | 1.7.0 | Conda env for emr notebook which includes jupyter enterprise gateway | 
| emr-s3-dist-cp | 2.24.0 | Distributed copy application optimized for Amazon S3. | 
| emr-s3-select | 2.3.0 | EMR S3Select Connector | 
| emr-wal-cli | 1.0.0 | Cli used for emrwal list/deletion. | 
| emrfs | 2.55.0 | Amazon S3 connector for Hadoop ecosystem applications. | 
| flink-client | 1.16.0 | Apache Flink command line client scripts and applications. | 
| flink-jobmanager-config | 1.16.0 | Managing resources on EMR nodes for Apache Flink JobManager. | 
| ganglia-monitor | 3.7.2 | Embedded Ganglia agent for Hadoop ecosystem applications along with the Ganglia monitoring agent. | 
| ganglia-metadata-collector | 3.7.2 | Ganglia metadata collector for aggregating metrics from Ganglia monitoring agents. | 
| ganglia-web | 3.7.1 | Web application for viewing metrics collected by the Ganglia metadata collector. | 
| hadoop-client | 3.3.3-amzn-2 | Hadoop command-line clients such as 'hdfs', 'hadoop', or 'yarn'. | 
| hadoop-hdfs-datanode | 3.3.3-amzn-2 | HDFS node-level service for storing blocks. | 
| hadoop-hdfs-library | 3.3.3-amzn-2 | HDFS command-line client and library | 
| hadoop-hdfs-namenode | 3.3.3-amzn-2 | HDFS service for tracking file names and block locations. | 
| hadoop-hdfs-journalnode | 3.3.3-amzn-2 | HDFS service for managing the Hadoop filesystem journal on HA clusters. | 
| hadoop-httpfs-server | 3.3.3-amzn-2 | HTTP endpoint for HDFS operations. | 
| hadoop-kms-server | 3.3.3-amzn-2 | Cryptographic key management server based on Hadoop's KeyProvider API. | 
| hadoop-mapred | 3.3.3-amzn-2 | MapReduce execution engine libraries for running a MapReduce application. | 
| hadoop-yarn-nodemanager | 3.3.3-amzn-2 | YARN service for managing containers on an individual node. | 
| hadoop-yarn-resourcemanager | 3.3.3-amzn-2 | YARN service for allocating and managing cluster resources and distributed applications. | 
| hadoop-yarn-timeline-server | 3.3.3-amzn-2 | Service for retrieving current and historical information for YARN applications. | 
| hbase-hmaster | 2.4.15-amzn-0 | Service for an HBase cluster responsible for coordination of Regions and execution of administrative commands. | 
| hbase-region-server | 2.4.15-amzn-0 | Service for serving one or more HBase regions. | 
| hbase-client | 2.4.15-amzn-0 | HBase command-line client. | 
| hbase-rest-server | 2.4.15-amzn-0 | Service providing a RESTful HTTP endpoint for HBase. | 
| hbase-thrift-server | 2.4.15-amzn-0 | Service providing a Thrift endpoint to HBase. | 
| hbase-operator-tools | 2.4.15-amzn-0 | Repair tool for Apache HBase clusters. | 
| hcatalog-client | 3.1.3-amzn-3 | The 'hcat' command line client for manipulating hcatalog-server. | 
| hcatalog-server | 3.1.3-amzn-3 | Service providing HCatalog, a table and storage management layer for distributed applications. | 
| hcatalog-webhcat-server | 3.1.3-amzn-3 | HTTP endpoint providing a REST interface to HCatalog. | 
| hive-client | 3.1.3-amzn-3 | Hive command line client. | 
| hive-hbase | 3.1.3-amzn-3 | Hive-hbase client. | 
| hive-metastore-server | 3.1.3-amzn-3 | Service for accessing the Hive metastore, a semantic repository storing metadata for SQL on Hadoop operations. | 
| hive-server2 | 3.1.3-amzn-3 | Service for accepting Hive queries as web requests. | 
| hudi | 0.12.2-amzn-0 | Incremental processing framework to power data pipeline at low latency and high efficiency. | 
| hudi-presto | 0.12.2-amzn-0 | Bundle library for running Presto with Hudi. | 
| hudi-trino | 0.12.2-amzn-0 | Bundle library for running Trino with Hudi. | 
| hudi-spark | 0.12.2-amzn-0 | Bundle library for running Spark with Hudi. | 
| hue-server | 4.10.0 | Web application for analyzing data using Hadoop ecosystem applications | 
| iceberg | 1.1.0-amzn-0 | Apache Iceberg is an open table format for huge analytic datasets | 
| jupyterhub | 1.5.0 | Multi-user server for Jupyter notebooks | 
| livy-server | 0.7.1-incubating | REST interface for interacting with Apache Spark | 
| nginx | 1.12.1 | nginx [engine x] is an HTTP and reverse proxy server | 
| mxnet | 1.9.1 | A flexible, scalable, and efficient library for deep learning. | 
| mariadb-server | 5.5.68+ | MariaDB database server. | 
| nvidia-cuda | 11.8.0 | Nvidia drivers and Cuda toolkit | 
| oozie-client | 5.2.1 | Oozie command-line client. | 
| oozie-server | 5.2.1 | Service for accepting Oozie workflow requests. | 
| opencv | 4.5.0 | Open Source Computer Vision Library. | 
| phoenix-library | 5.1.2 | The phoenix libraries for server and client | 
| phoenix-connectors | 6.0.0-SNAPSHOT | Apache Phoenix-Connectors for Spark-3 | 
| phoenix-query-server | 6.0.0 | A lightweight server providing JDBC access as well as Protocol Buffers and JSON format access to the Avatica API | 
| presto-coordinator | 0.278.1-amzn-0 | Service for accepting queries and managing query execution among presto-workers. | 
| presto-worker | 0.278.1-amzn-0 | Service for executing pieces of a query. | 
| presto-client | 0.278.1-amzn-0 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| trino-coordinator | 403-amzn-0 | Service for accepting queries and managing query execution among trino-workers. | 
| trino-worker | 403-amzn-0 | Service for executing pieces of a query. | 
| trino-client | 403-amzn-0 | Trino command-line client which is installed on an HA cluster's stand-by masters where Trino server is not started. | 
| pig-client | 0.17.0 | Pig command-line client. | 
| r | 4.0.2 | The R Project for Statistical Computing | 
| ranger-kms-server | 2.0.0 | Apache Ranger Key Management System | 
| spark-client | 3.3.1-amzn-0 | Spark command-line clients. | 
| spark-history-server | 3.3.1-amzn-0 | Web UI for viewing logged events for the lifetime of a completed Spark application. | 
| spark-on-yarn | 3.3.1-amzn-0 | In-memory execution engine for YARN. | 
| spark-yarn-slave | 3.3.1-amzn-0 | Apache Spark libraries needed by YARN slaves. | 
| spark-rapids | 22.12.0-amzn-0 | Nvidia Spark RAPIDS plugin that accelerates Apache Spark with GPUs. | 
| sqoop-client | 1.4.7 | Apache Sqoop command-line client. | 
| tensorflow | 2.11.0 | TensorFlow open source software library for high performance numerical computation. | 
| tez-on-yarn | 0.10.2-amzn-1 | The tez YARN application and libraries. | 
| tez-on-worker | 0.10.2-amzn-1 | The tez YARN application and libraries for worker nodes. | 
| webserver | 2.4.41+ | Apache HTTP server. | 
| zeppelin-server | 0.10.1 | Web-based notebook that enables interactive data analytics. | 
| zookeeper-server | 3.5.10 | Centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. | 
| zookeeper-client | 3.5.10 | ZooKeeper command line client. | 

## 6.10.0 configuration classifications
<a name="emr-6100-class"></a>

Configuration classifications allow you to customize applications. These often correspond to a configuration XML file for the application, such as `hive-site.xml`. For more information, see [Configure applications](emr-configure-apps.md).

Reconfiguration actions occur when you specify a configuration for instance groups in a running cluster. Amazon EMR only initiates reconfiguration actions for the classifications that you modify. For more information, see [Reconfigure an instance group in a running cluster](emr-configure-apps-running-cluster.md).
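
As a rough sketch of what a reconfiguration request for a running cluster might look like, the following Python builds a payload that pairs an instance group with a modified classification. The dictionary is assumed to mirror the `ClusterId` and `InstanceGroups` parameters of the ModifyInstanceGroups API. The cluster ID, the instance group ID, and the property value are hypothetical placeholders. Per the table below, a change to `capacity-scheduler` restarts the ResourceManager service.

```python
import json


def reconfiguration_request(cluster_id, instance_group_id, configurations):
    """Build a request that reconfigures one instance group in a running cluster."""
    return {
        "ClusterId": cluster_id,
        "InstanceGroups": [
            {
                "InstanceGroupId": instance_group_id,
                # Only the classifications listed here trigger reconfiguration actions.
                "Configurations": configurations,
            }
        ],
    }


# Placeholder IDs and an illustrative property value.
request = reconfiguration_request(
    cluster_id="j-EXAMPLECLUSTER",
    instance_group_id="ig-EXAMPLEGROUP",
    configurations=[
        {
            "Classification": "capacity-scheduler",
            "Properties": {"yarn.scheduler.capacity.maximum-am-resource-percent": "0.2"},
        }
    ],
)
print(json.dumps(request, indent=2))
```

With an AWS SDK such as boto3, a payload like this could be passed to the EMR client's `modify_instance_groups` call as keyword arguments. That step is omitted here so the sketch runs without credentials.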


**emr-6.10.0 classifications**  

| Classifications | Description | Reconfiguration Actions | 
| --- | --- | --- | 
| capacity-scheduler | Change values in Hadoop's capacity-scheduler.xml file. | Restarts the ResourceManager service. | 
| container-executor | Change values in Hadoop YARN's container-executor.cfg file. | Not available. | 
| container-log4j | Change values in Hadoop YARN's container-log4j.properties file. | Not available. | 
| core-site | Change values in Hadoop's core-site.xml file. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Ranger KMS, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| docker-conf | Change docker related settings. | Not available. | 
| emrfs-site | Change EMRFS settings. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts HBaseRegionserver, HBaseMaster, HBaseThrift, HBaseRest, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| flink-conf | Change flink-conf.yaml settings. | Restarts Flink history server. | 
| flink-log4j | Change Flink log4j.properties settings. | Restarts Flink history server. | 
| flink-log4j-session | Change Flink log4j-session.properties settings for Kubernetes/Yarn session. | Restarts Flink history server. | 
| flink-log4j-cli | Change Flink log4j-cli.properties settings. | Restarts Flink history server. | 
| hadoop-env | Change values in the Hadoop environment for all Hadoop components. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts PhoenixQueryserver, HiveServer2, Hive MetaStore, and MapReduce-HistoryServer. | 
| hadoop-log4j | Change values in Hadoop's log4j.properties file. | Restarts the Hadoop HDFS services SecondaryNamenode, Datanode, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| hadoop-ssl-server | Change hadoop ssl server configuration | Not available. | 
| hadoop-ssl-client | Change hadoop ssl client configuration | Not available. | 
| hbase | Amazon EMR-curated settings for Apache HBase. | Custom EMR specific property. Sets emrfs-site and hbase-site configs. See those for their associated restarts. | 
| hbase-env | Change values in HBase's environment. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-log4j | Change values in HBase's hbase-log4j.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-metrics | Change values in HBase's hadoop-metrics2-hbase.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-policy | Change values in HBase's hbase-policy.xml file. | Not available. | 
| hbase-site | Change values in HBase's hbase-site.xml file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. Additionally restarts Phoenix QueryServer. | 
| hdfs-encryption-zones | Configure HDFS encryption zones. | This classification should not be reconfigured. | 
| hdfs-env | Change values in the HDFS environment. | Restarts Hadoop HDFS services Namenode, Datanode, and ZKFC. | 
| hdfs-site | Change values in HDFS's hdfs-site.xml. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Additionally restarts Hadoop Httpfs. | 
| hcatalog-env | Change values in HCatalog's environment. | Restarts Hive HCatalog Server. | 
| hcatalog-server-jndi | Change values in HCatalog's jndi.properties. | Restarts Hive HCatalog Server. | 
| hcatalog-server-proto-hive-site | Change values in HCatalog's proto-hive-site.xml. | Restarts Hive HCatalog Server. | 
| hcatalog-webhcat-env | Change values in HCatalog WebHCat's environment. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-log4j2 | Change values in HCatalog WebHCat's log4j2.properties. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-site | Change values in HCatalog WebHCat's webhcat-site.xml file. | Restarts Hive WebHCat server. | 
| hive | Amazon EMR-curated settings for Apache Hive. | Sets configurations to launch Hive LLAP service. | 
| hive-beeline-log4j2 | Change values in Hive's beeline-log4j2.properties file. | Not available. | 
| hive-parquet-logging | Change values in Hive's parquet-logging.properties file. | Not available. | 
| hive-env | Change values in the Hive environment. | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. | 
| hive-exec-log4j2 | Change values in Hive's hive-exec-log4j2.properties file. | Not available. | 
| hive-llap-daemon-log4j2 | Change values in Hive's llap-daemon-log4j2.properties file. | Not available. | 
| hive-log4j2 | Change values in Hive's hive-log4j2.properties file. | Not available. | 
| hive-site | Change values in Hive's hive-site.xml file | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. Also restarts Oozie and Zeppelin. | 
| hiveserver2-site | Change values in Hive Server2's hiveserver2-site.xml file | Not available. | 
| hue-ini | Change values in Hue's ini file | Restarts Hue. Also activates Hue config override CLI commands to pick up new configurations. | 
| httpfs-env | Change values in the HTTPFS environment. | Restarts Hadoop Httpfs service. | 
| httpfs-site | Change values in Hadoop's httpfs-site.xml file. | Restarts Hadoop Httpfs service. | 
| hadoop-kms-acls | Change values in Hadoop's kms-acls.xml file. | Not available. | 
| hadoop-kms-env | Change values in the Hadoop KMS environment. | Restarts Hadoop-KMS service. | 
| hadoop-kms-log4j | Change values in Hadoop's kms-log4j.properties file. | Not available. | 
| hadoop-kms-site | Change values in Hadoop's kms-site.xml file. | Restarts Hadoop-KMS and Ranger-KMS service. | 
| hudi-env | Change values in the Hudi environment. | Not available. | 
| hudi-defaults | Change values in Hudi's hudi-defaults.conf file. | Not available. | 
| iceberg-defaults | Change values in Iceberg's iceberg-defaults.conf file. | Not available. | 
| delta-defaults | Change values in Delta's delta-defaults.conf file. | Not available. | 
| jupyter-notebook-conf | Change values in Jupyter Notebook's jupyter_notebook_config.py file. | Not available. | 
| jupyter-hub-conf | Change values in JupyterHub's jupyterhub_config.py file. | Not available. | 
| jupyter-s3-conf | Configure Jupyter Notebook S3 persistence. | Not available. | 
| jupyter-sparkmagic-conf | Change values in Sparkmagic's config.json file. | Not available. | 
| livy-conf | Change values in Livy's livy.conf file. | Restarts Livy Server. | 
| livy-env | Change values in the Livy environment. | Restarts Livy Server. | 
| livy-log4j2 | Change Livy log4j2.properties settings. | Restarts Livy Server. | 
| mapred-env | Change values in the MapReduce application's environment. | Restarts Hadoop MapReduce-HistoryServer. | 
| mapred-site | Change values in the MapReduce application's mapred-site.xml file. | Restarts Hadoop MapReduce-HistoryServer. | 
| oozie-env | Change values in Oozie's environment. | Restarts Oozie. | 
| oozie-log4j | Change values in Oozie's oozie-log4j.properties file. | Restarts Oozie. | 
| oozie-site | Change values in Oozie's oozie-site.xml file. | Restarts Oozie. | 
| phoenix-hbase-metrics | Change values in Phoenix's hadoop-metrics2-hbase.properties file. | Not available. | 
| phoenix-hbase-site | Change values in Phoenix's hbase-site.xml file. | Not available. | 
| phoenix-log4j | Change values in Phoenix's log4j.properties file. | Restarts Phoenix-QueryServer. | 
| phoenix-metrics | Change values in Phoenix's hadoop-metrics2-phoenix.properties file. | Not available. | 
| pig-env | Change values in the Pig environment. | Not available. | 
| pig-properties | Change values in Pig's pig.properties file. | Restarts Oozie. | 
| pig-log4j | Change values in Pig's log4j.properties file. | Not available. | 
| presto-log | Change values in Presto's log.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-config | Change values in Presto's config.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-password-authenticator | Change values in Presto's password-authenticator.properties file. | Not available. | 
| presto-env | Change values in Presto's presto-env.sh file. | Restarts Presto-Server (for PrestoDB) | 
| presto-node | Change values in Presto's node.properties file. | Not available. | 
| presto-connector-blackhole | Change values in Presto's blackhole.properties file. | Not available. | 
| presto-connector-cassandra | Change values in Presto's cassandra.properties file. | Not available. | 
| presto-connector-hive | Change values in Presto's hive.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-jmx | Change values in Presto's jmx.properties file. | Not available. | 
| presto-connector-kafka | Change values in Presto's kafka.properties file. | Not available. | 
| presto-connector-lakeformation | Change values in Presto's lakeformation.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-localfile | Change values in Presto's localfile.properties file. | Not available. | 
| presto-connector-memory | Change values in Presto's memory.properties file. | Not available. | 
| presto-connector-mongodb | Change values in Presto's mongodb.properties file. | Not available. | 
| presto-connector-mysql | Change values in Presto's mysql.properties file. | Not available. | 
| presto-connector-postgresql | Change values in Presto's postgresql.properties file. | Not available. | 
| presto-connector-raptor | Change values in Presto's raptor.properties file. | Not available. | 
| presto-connector-redis | Change values in Presto's redis.properties file. | Not available. | 
| presto-connector-redshift | Change values in Presto's redshift.properties file. | Not available. | 
| presto-connector-tpch | Change values in Presto's tpch.properties file. | Not available. | 
| presto-connector-tpcds | Change values in Presto's tpcds.properties file. | Not available. | 
| trino-log | Change values in Trino's log.properties file. | Restarts Trino-Server (for Trino) | 
| trino-config | Change values in Trino's config.properties file. | Restarts Trino-Server (for Trino) | 
| trino-password-authenticator | Change values in Trino's password-authenticator.properties file. | Restarts Trino-Server (for Trino) | 
| trino-env | Change values in Trino's trino-env.sh file. | Restarts Trino-Server (for Trino) | 
| trino-node | Change values in Trino's node.properties file. | Not available. | 
| trino-connector-blackhole | Change values in Trino's blackhole.properties file. | Not available. | 
| trino-connector-cassandra | Change values in Trino's cassandra.properties file. | Not available. | 
| trino-connector-delta | Change values in Trino's delta.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-hive | Change values in Trino's hive.properties file. | Restarts Trino-Server (for Trino) | 
| trino-exchange-manager | Change values in Trino's exchange-manager.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-iceberg | Change values in Trino's iceberg.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-hudi | Change values in Trino's hudi.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-jmx | Change values in Trino's jmx.properties file. | Not available. | 
| trino-connector-kafka | Change values in Trino's kafka.properties file. | Not available. | 
| trino-connector-localfile | Change values in Trino's localfile.properties file. | Not available. | 
| trino-connector-memory | Change values in Trino's memory.properties file. | Not available. | 
| trino-connector-mongodb | Change values in Trino's mongodb.properties file. | Not available. | 
| trino-connector-mysql | Change values in Trino's mysql.properties file. | Not available. | 
| trino-connector-postgresql | Change values in Trino's postgresql.properties file. | Not available. | 
| trino-connector-raptor | Change values in Trino's raptor.properties file. | Not available. | 
| trino-connector-redis | Change values in Trino's redis.properties file. | Not available. | 
| trino-connector-redshift | Change values in Trino's redshift.properties file. | Not available. | 
| trino-connector-tpch | Change values in Trino's tpch.properties file. | Not available. | 
| trino-connector-tpcds | Change values in Trino's tpcds.properties file. | Not available. | 
| ranger-kms-dbks-site | Change values in dbks-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-site | Change values in ranger-kms-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-env | Change values in the Ranger KMS environment. | Restarts Ranger KMS Server. | 
| ranger-kms-logback | Change values in kms-logback.xml file of Ranger KMS. | Not available. | 
| ranger-kms-db-ca | Change values for CA file on S3 for MySQL SSL connection with Ranger KMS. | Not available. | 
| spark | Amazon EMR-curated settings for Apache Spark. | This property modifies spark-defaults. See actions there. | 
| spark-defaults | Change values in Spark's spark-defaults.conf file. | Restarts Spark history server and Spark thrift server. | 
| spark-env | Change values in the Spark environment. | Restarts Spark history server and Spark thrift server. | 
| spark-hive-site | Change values in Spark's hive-site.xml file | Not available. | 
| spark-log4j2 | Change values in Spark's log4j2.properties file. | Restarts Spark history server and Spark thrift server. | 
| spark-metrics | Change values in Spark's metrics.properties file. | Restarts Spark history server and Spark thrift server. | 
| sqoop-env | Change values in Sqoop's environment. | Not available. | 
| sqoop-oraoop-site | Change values in Sqoop OraOop's oraoop-site.xml file. | Not available. | 
| sqoop-site | Change values in Sqoop's sqoop-site.xml file. | Not available. | 
| tez-site | Change values in Tez's tez-site.xml file. | Restarts Oozie and HiveServer2. | 
| yarn-env | Change values in the YARN environment. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts MapReduce-HistoryServer. | 
| yarn-site | Change values in YARN's yarn-site.xml file. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Livy Server and MapReduce-HistoryServer. | 
| zeppelin-env | Change values in the Zeppelin environment. | Restarts Zeppelin. | 
| zeppelin-site | Change configuration settings in zeppelin-site.xml. | Restarts Zeppelin. | 
| zookeeper-config | Change values in ZooKeeper's zoo.cfg file. | Restarts Zookeeper server. | 
| zookeeper-log4j | Change values in ZooKeeper's log4j.properties file. | Restarts Zookeeper server. | 
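
Before you modify a classification on a running cluster, it can help to check which services the change would restart. The following is a minimal sketch, not an official tool: it parses a few rows copied from the table above and lists the classifications whose reconfiguration action mentions a given service. The helper names `parse_rows` and `classifications_restarting` are hypothetical, and `ROWS` is only a subset of the table.

```python
# A few rows copied from the classifications table above (not the full table).
ROWS = """\
| oozie-site | Change values in Oozie's oozie-site.xml file. | Restarts Oozie. |
| pig-properties | Change values in Pig's pig.properties file. | Restarts Oozie. |
| spark-hive-site | Change values in Spark's hive-site.xml file | Not available. |
| zeppelin-env | Change values in the Zeppelin environment. | Restarts Zeppelin. |
"""


def parse_rows(table_text):
    """Return {classification: reconfiguration action} from pipe-delimited rows."""
    actions = {}
    for line in table_text.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Skip blank lines and the '---' separator row of a Markdown table.
        if len(cells) != 3 or set(cells[0]) <= {"-"}:
            continue
        name, _description, action = cells
        actions[name] = action
    return actions


def classifications_restarting(actions, service):
    """List classifications whose action text restarts the named service."""
    return sorted(
        name
        for name, action in actions.items()
        if action.startswith("Restart") and service in action
    )


actions = parse_rows(ROWS)
print(classifications_restarting(actions, "Oozie"))
```

The match is a plain substring test on the action text, so service names must be spelled the way the table spells them.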

## 6.10.0 change log
<a name="6100-changelog"></a>


**Change log for 6.10.0 release and release notes**  

| Date | Event | Description | 
| --- | --- | --- | 
| 2023-08-21 | Update | Added a known issue introduced with Hadoop 3.3.3. | 
| 2023-07-26 | Update | New OS release labels 2.0.20230612.0 and 2.0.20230628.0. | 
| 2023-03-02 | Deployment complete | Amazon EMR 6.10 fully deployed to all [supported Regions](https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/) | 
| 2023-03-02 | Docs publication | Amazon EMR 6.10 release notes first published | 
| 2023-02-27 | Initial release | Amazon EMR 6.10 deployed to limited commercial Regions | 

# Amazon EMR release 6.9.1
<a name="emr-691-release"></a>

## 6.9.1 application versions
<a name="emr-691-app-versions"></a>

This release includes the following applications: [Delta](https://delta.io/), [Flink](https://flink.apache.org/), [Ganglia](http://ganglia.info), [HBase](http://hbase.apache.org/), [HCatalog](https://cwiki.apache.org/confluence/display/Hive/HCatalog), [Hadoop](http://hadoop.apache.org/docs/current/), [Hive](http://hive.apache.org/), [Hudi](https://hudi.apache.org), [Hue](http://gethue.com/), [Iceberg](https://iceberg.apache.org/), [JupyterEnterpriseGateway](https://jupyter-enterprise-gateway.readthedocs.io/en/latest/), [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/#), [Livy](https://livy.incubator.apache.org/), [MXNet](https://mxnet.incubator.apache.org/), [Oozie](http://oozie.apache.org/), [Phoenix](https://phoenix.apache.org/), [Pig](http://pig.apache.org/), [Presto](https://prestodb.io/), [Spark](https://spark.apache.org/docs/latest/), [Sqoop](http://sqoop.apache.org/), [TensorFlow](https://www.tensorflow.org/), [Tez](https://tez.apache.org/), [Trino](https://trino.io/), [Zeppelin](https://zeppelin.incubator.apache.org/), and [ZooKeeper](https://zookeeper.apache.org).

The table below lists the application versions available in this release of Amazon EMR and the application versions in the preceding three Amazon EMR releases (when applicable).

For a comprehensive history of application versions for each release of Amazon EMR, see the following topics:
+ [Application versions in Amazon EMR 7.x releases](emr-release-app-versions-7.x.md)
+ [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md)
+ [Application versions in Amazon EMR 5.x releases](emr-release-app-versions-5.x.md)
+ [Application versions in Amazon EMR 4.x releases](emr-release-app-versions-4.x.md)


**Application version information**  

|  | emr-6.9.1 | emr-6.9.0 | emr-6.8.1 | emr-6.8.0 | 
| --- | --- | --- | --- | --- | 
| AWS SDK for Java | 1.12.170 | 1.12.170 | 1.12.170 | 1.12.170 | 
| Python | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 
| Scala | 2.12.15 | 2.12.15 | 2.12.15 | 2.12.15 | 
| AmazonCloudWatchAgent |  -  |  -  |  -  |  -  | 
| Delta | 2.1.0 | 2.1.0 |  -  |  -  | 
| Flink | 1.15.2 | 1.15.2 | 1.15.1 | 1.15.1 | 
| Ganglia | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 
| HBase | 2.4.13-amzn-0.1 | 2.4.13-amzn-0 | 2.4.12-amzn-0.1 | 2.4.12-amzn-0 | 
| HCatalog | 3.1.3-amzn-2.1 | 3.1.3-amzn-2 | 3.1.3-amzn-1.1 | 3.1.3-amzn-1 | 
| Hadoop | 3.3.3-amzn-1.1 | 3.3.3-amzn-1 | 3.2.1-amzn-8.1 | 3.2.1-amzn-8 | 
| Hive | 3.1.3-amzn-2.1 | 3.1.3-amzn-2 | 3.1.3-amzn-1.1 | 3.1.3-amzn-1 | 
| Hudi | 0.12.1-amzn-0 | 0.12.1-amzn-0 | 0.11.1-amzn-0 | 0.11.1-amzn-0 | 
| Hue | 4.10.0 | 4.10.0 | 4.10.0 | 4.10.0 | 
| Iceberg | 0.14.1-amzn-0 | 0.14.1-amzn-0 | 0.14.0-amzn-0 | 0.14.0-amzn-0 | 
| JupyterEnterpriseGateway | 2.6.0 | 2.6.0 | 2.1.0 | 2.1.0 | 
| JupyterHub | 1.4.1 | 1.4.1 | 1.4.1 | 1.4.1 | 
| Livy | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 
| MXNet | 1.9.1 | 1.9.1 | 1.9.1 | 1.9.1 | 
| Mahout |  -  |  -  |  -  |  -  | 
| Oozie | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 
| Phoenix | 5.1.2 | 5.1.2 | 5.1.2 | 5.1.2 | 
| Pig | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 
| Presto | 0.276-amzn-0 | 0.276-amzn-0 | 0.273.3-amzn-0 | 0.273.3-amzn-0 | 
| Spark | 3.3.0-amzn-1.1 | 3.3.0-amzn-1 | 3.3.0-amzn-0.1 | 3.3.0-amzn-0 | 
| Sqoop | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 
| TensorFlow | 2.10.0 | 2.10.0 | 2.9.1 | 2.9.1 | 
| Tez | 0.10.2-amzn-0.1 | 0.10.2-amzn-0 | 0.9.2 | 0.9.2 | 
| Trino (PrestoSQL) | 398-amzn-0 | 398-amzn-0 | 388-amzn-0 | 388-amzn-0 | 
| Zeppelin | 0.10.1 | 0.10.1 | 0.10.1 | 0.10.1 | 
| ZooKeeper | 3.5.10 | 3.5.10 | 3.5.10 | 3.5.10 | 
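
To see at a glance what a patch release changes, you can compare two columns of the table. The sketch below uses a handful of rows copied from the table above. The `VERSIONS` dictionary and the `changed_between` helper are illustrative names, not part of any Amazon EMR tooling.

```python
# Subset of the application version table above, keyed by application name.
VERSIONS = {
    "Flink": {"emr-6.9.1": "1.15.2", "emr-6.9.0": "1.15.2"},
    "Hadoop": {"emr-6.9.1": "3.3.3-amzn-1.1", "emr-6.9.0": "3.3.3-amzn-1"},
    "Spark": {"emr-6.9.1": "3.3.0-amzn-1.1", "emr-6.9.0": "3.3.0-amzn-1"},
    "Trino (PrestoSQL)": {"emr-6.9.1": "398-amzn-0", "emr-6.9.0": "398-amzn-0"},
}


def changed_between(versions, old_release, new_release):
    """Return {application: (old version, new version)} for versions that differ."""
    return {
        app: (by_release[old_release], by_release[new_release])
        for app, by_release in versions.items()
        if by_release[old_release] != by_release[new_release]
    }


changed = changed_between(VERSIONS, "emr-6.9.0", "emr-6.9.1")
for app, (old, new) in sorted(changed.items()):
    print(f"{app}: {old} -> {new}")
```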

## 6.9.1 release notes
<a name="emr-691-relnotes"></a>

The following release notes include information for Amazon EMR release 6.9.1. Changes are relative to 6.9.0. For information on the release timeline, see the [6.9.1 change log](#691-changelog).

**Changes, enhancements, and resolved issues**
+ Hadoop 3.3.3 introduced a change in YARN ([YARN-9608](https://issues.apache.org/jira/browse/YARN-9608)) that keeps nodes where containers ran in a decommissioning state until the application completes. This change ensures that local data such as shuffle data doesn't get lost, and you don't need to re-run the job. This approach might also lead to underutilization of resources on clusters with or without managed scaling enabled.

  With Amazon EMR releases 6.11.0 and higher as well as 6.8.1, 6.9.1, and 6.10.1, the value of `yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications` is set to `false` in `yarn-site.xml` to resolve this issue.

  While the fix addresses the issues that were introduced by YARN-9608, it might cause Hive jobs to fail due to shuffle data loss on clusters that have managed scaling enabled. We've mitigated that risk in this release by also setting `yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-shuffle-data` for Hive workloads. This config is only available with Amazon EMR releases 6.11.0 and higher.
+ Metrics collector won't send any metrics to the control plane after failover of primary node in clusters with the instance groups configuration.
+ This release eliminates retries on failed HTTP requests to metrics collector endpoints.
+ This release includes a change that allows high-availability clusters to recover from failed state after restart.
+ This release fixes an issue where large user-created UIDs caused overflow exceptions.
+ This release fixes timeout issues with the Amazon EMR reconfiguration process.
+ This release includes security fixes.
+ This release fixes an issue where clusters that are running workloads on Spark with Amazon EMR might silently receive incorrect results with `contains`, `startsWith`, `endsWith`, and `like`. This issue occurs when you use the expressions on partitioned fields that have metadata in the Amazon EMR Hive3 Metastore Server (HMS).
+ With Amazon EMR 6.6.0 through 6.9.x, INSERT queries with dynamic partition and an ORDER BY or SORT BY clause will always have two reducers. This issue is caused by OSS change [HIVE-20703](https://issues.apache.org/jira/browse/HIVE-20703), which puts dynamic sort partition optimization under cost-based decision. If your workload doesn't require sorting of dynamic partitions, we recommend that you set the `hive.optimize.sort.dynamic.partition.threshold` property to `-1` to disable the new feature and get the correctly calculated number of reducers. This issue is fixed in OSS Hive as part of [HIVE-22269](https://issues.apache.org/jira/browse/HIVE-22269) and is fixed in Amazon EMR 6.10.0.
+ Hive might experience data loss when you use HDFS as a scratch directory and you have enabled merge small files, and the table contains static partition paths.
+ This release fixes a performance issue with Hive that occurs when merge small files (disabled by default) is enabled at the end of an ETL job.
+ This release fixes an issue with throttling on the Glue side when there are no user-defined functions (UDF).
+ This release fixes an issue where the node log aggregation service deleted container logs before the log pusher could push them to S3 during YARN decommissioning.
+ This release fixes handling of compacted/archived files with persistent storefile tracking for HBase.
+ This release fixes an issue that impacted Spark performance when you set a default `true` value for the `spark.yarn.heterogeneousExecutors.enabled` config in `spark-defaults.conf`.
+ This release fixes an issue with Reduce Task failing to read shuffle data. The issue caused Hive query failures with a corrupted memory error.
+ This release fixes an issue that caused the node provisioner to fail if the HDFS NameNode (NN) service was stuck in safemode during node replacement.
+ This release adds a new retry mechanism to the cluster scaling workflow for EMR clusters that run Presto or Trino. This improvement reduces the risk that cluster resizing will indefinitely stall due to a single failed resize operation. It also improves cluster utilization, because your cluster scales up and down faster.
+ This release improves cluster scale-down logic so that your cluster doesn't attempt a scale-down of core nodes below the HDFS replication factor setting for the cluster. This aligns with your data redundancy requirements, and reduces the chance that a scaling operation might stall.
+ The log management daemon has been upgraded to identify all logs that are in active use with open file handles on the local instance storage, and the associated processes. This upgrade ensures that Amazon EMR properly deletes the files and reclaims storage space after the logs are archived to Amazon S3.
+ This release includes a log-management daemon enhancement that deletes empty, unused steps directories in the local cluster file system. An excessively large number of empty directories can degrade the performance of Amazon EMR daemons and result in disk over-utilization.
+ This release fixes an issue that might occur when you create an edge node by replicating one of the primary nodes from a cluster with multiple primary nodes. The replicated edge node could cause delays with scale-down operations, or result in high memory-utilization on the primary nodes. For more information on how to create an edge node to communicate with your EMR cluster, see [Edge Node Creator](https://github.com/aws-samples/aws-emr-utilities/tree/main/utilities/emr-edge-node-creator) in the `aws-samples` repo on GitHub.
+ This release improves the automation process that Amazon EMR uses to re-mount Amazon EBS volumes to an instance after a reboot.
+ This release fixes an issue that resulted in intermittent gaps in the Hadoop metrics that Amazon EMR publishes to Amazon CloudWatch.
+ This release fixes an issue with EMR clusters where an update to the YARN configuration file that contains the exclusion list of nodes for the cluster is interrupted due to disk over-utilization. The incomplete update hinders future cluster scale-down operations. This release ensures that your cluster remains healthy, and that scaling operations work as expected.
+ This release improves the on-cluster log management daemon to monitor additional log folders in your EMR cluster. This improvement minimizes disk over-utilization scenarios.
+ This release automatically restarts the on-cluster log management daemon when it stops. This improvement reduces the risk for nodes to appear unhealthy due to disk over-utilization. 
+ When you launch a cluster with *the latest patch release* of Amazon EMR 5.36 or higher, 6.6 or higher, or 7.0 or higher, Amazon EMR uses the latest Amazon Linux 2023 or Amazon Linux 2 release for the default Amazon EMR AMI. For more information, see [Using the default Amazon Linux AMI for Amazon EMR](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-default-ami.html).    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-691-release.html)
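
Two of the notes above name specific properties: `yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications` in `yarn-site.xml`, and `hive.optimize.sort.dynamic.partition.threshold` for Hive. As a minimal sketch, the snippet below shows how such values could be written as configuration classifications, assuming the `yarn-site` and `hive-site` classifications and the list-of-objects JSON shape that Amazon EMR uses for application configuration. This release already sets the YARN property to `false`, so that entry is only relevant on a release without the fix. Whether to apply the Hive setting depends on whether your workload needs dynamic partitions sorted.

```python
import json

# Property values are written as strings, as in EMR configuration objects.
configurations = [
    {
        "Classification": "yarn-site",
        "Properties": {
            # Already the default in 6.8.1, 6.9.1, 6.10.1, and 6.11.0 and higher.
            "yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications": "false",
        },
    },
    {
        "Classification": "hive-site",
        "Properties": {
            # Workaround from the note above for Amazon EMR 6.6.0 through 6.9.x.
            "hive.optimize.sort.dynamic.partition.threshold": "-1",
        },
    },
]

# Print JSON that could be saved to a file and supplied at cluster creation.
print(json.dumps(configurations, indent=2))
```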

## 6.9.1 component versions
<a name="emr-691-components"></a>

The components that Amazon EMR installs with this release are listed below. Some are installed as part of big-data application packages. Others are unique to Amazon EMR and installed for system processes and features. These typically start with `emr` or `aws`. Big-data application packages in the most recent Amazon EMR release are usually the latest version found in the community. We make community releases available in Amazon EMR as quickly as possible.

Some components in Amazon EMR differ from community versions. These components have a version label in the form `CommunityVersion-amzn-EmrVersion`. The `EmrVersion` starts at 0. For example, if an open source community component named `myapp-component` with version 2.2 has been modified three times for inclusion in different Amazon EMR releases, its release version is listed as `2.2-amzn-2`.
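
The label format can be split mechanically. The following sketch is an illustration rather than an official parser: `parse_component_version` and `ComponentVersion` are hypothetical names. The Amazon EMR suffix is kept as a string because some labels in the table below, such as `3.3.3-amzn-1.1`, carry a dotted suffix.

```python
import re
from typing import NamedTuple, Optional


class ComponentVersion(NamedTuple):
    community: str               # upstream community version, e.g. "2.2"
    emr_revision: Optional[str]  # text after "-amzn-", or None if unmodified


# CommunityVersion-amzn-EmrVersion, where EmrVersion is digits with optional dots.
_LABEL = re.compile(r"^(?P<community>.+?)-amzn-(?P<emr>\d+(?:\.\d+)*)$")


def parse_component_version(label):
    """Split a version label into its community and Amazon EMR parts."""
    match = _LABEL.match(label)
    if match is None:
        # No "-amzn-" suffix: treat the label as a plain community version.
        return ComponentVersion(label, None)
    return ComponentVersion(match.group("community"), match.group("emr"))


for label in ("2.2-amzn-2", "3.3.3-amzn-1.1", "1.15.2"):
    print(label, "->", parse_component_version(label))
```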


| Component | Version | Description | 
| --- | --- | --- | 
| aws-sagemaker-spark-sdk | 1.4.2 | Amazon SageMaker Spark SDK | 
| delta | 2.1.0 | Delta Lake is an open table format for huge analytic datasets | 
| emr-ddb | 4.16.0 | Amazon DynamoDB connector for Hadoop ecosystem applications. | 
| emr-goodies | 3.3.0 | Extra convenience libraries for the Hadoop ecosystem. | 
| emr-kinesis | 3.6.0 | Amazon Kinesis connector for Hadoop ecosystem applications. | 
| emr-notebook-env | 1.7.0 | Conda env for emr notebook which includes jupyter enterprise gateway | 
| emr-s3-dist-cp | 2.23.0 | Distributed copy application optimized for Amazon S3. | 
| emr-s3-select | 2.2.0 | EMR S3Select Connector | 
| emrfs | 2.54.0 | Amazon S3 connector for Hadoop ecosystem applications. | 
| flink-client | 1.15.2 | Apache Flink command line client scripts and applications. | 
| flink-jobmanager-config | 1.15.2 | Managing resources on EMR nodes for Apache Flink JobManager. | 
| ganglia-monitor | 3.7.2 | Embedded Ganglia agent for Hadoop ecosystem applications along with the Ganglia monitoring agent. | 
| ganglia-metadata-collector | 3.7.2 | Ganglia metadata collector for aggregating metrics from Ganglia monitoring agents. | 
| ganglia-web | 3.7.1 | Web application for viewing metrics collected by the Ganglia metadata collector. | 
| hadoop-client | 3.3.3-amzn-1.1 | Hadoop command-line clients such as 'hdfs', 'hadoop', or 'yarn'. | 
| hadoop-hdfs-datanode | 3.3.3-amzn-1.1 | HDFS node-level service for storing blocks. | 
| hadoop-hdfs-library | 3.3.3-amzn-1.1 | HDFS command-line client and library | 
| hadoop-hdfs-namenode | 3.3.3-amzn-1.1 | HDFS service for tracking file names and block locations. | 
| hadoop-hdfs-journalnode | 3.3.3-amzn-1.1 | HDFS service for managing the Hadoop filesystem journal on HA clusters. | 
| hadoop-httpfs-server | 3.3.3-amzn-1.1 | HTTP endpoint for HDFS operations. | 
| hadoop-kms-server | 3.3.3-amzn-1.1 | Cryptographic key management server based on Hadoop's KeyProvider API. | 
| hadoop-mapred | 3.3.3-amzn-1.1 | MapReduce execution engine libraries for running a MapReduce application. | 
| hadoop-yarn-nodemanager | 3.3.3-amzn-1.1 | YARN service for managing containers on an individual node. | 
| hadoop-yarn-resourcemanager | 3.3.3-amzn-1.1 | YARN service for allocating and managing cluster resources and distributed applications. | 
| hadoop-yarn-timeline-server | 3.3.3-amzn-1.1 | Service for retrieving current and historical information for YARN applications. | 
| hbase-hmaster | 2.4.13-amzn-0.1 | Service for an HBase cluster responsible for coordination of Regions and execution of administrative commands. | 
| hbase-region-server | 2.4.13-amzn-0.1 | Service for serving one or more HBase regions. | 
| hbase-client | 2.4.13-amzn-0.1 | HBase command-line client. | 
| hbase-rest-server | 2.4.13-amzn-0.1 | Service providing a RESTful HTTP endpoint for HBase. | 
| hbase-thrift-server | 2.4.13-amzn-0.1 | Service providing a Thrift endpoint to HBase. | 
| hbase-operator-tools | 2.4.13-amzn-0.1 | Repair tool for Apache HBase clusters. | 
| hcatalog-client | 3.1.3-amzn-2.1 | The 'hcat' command line client for manipulating hcatalog-server. | 
| hcatalog-server | 3.1.3-amzn-2.1 | Service providing HCatalog, a table and storage management layer for distributed applications. | 
| hcatalog-webhcat-server | 3.1.3-amzn-2.1 | HTTP endpoint providing a REST interface to HCatalog. | 
| hive-client | 3.1.3-amzn-2.1 | Hive command line client. | 
| hive-hbase | 3.1.3-amzn-2.1 | Hive-hbase client. | 
| hive-metastore-server | 3.1.3-amzn-2.1 | Service for accessing the Hive metastore, a semantic repository storing metadata for SQL on Hadoop operations. | 
| hive-server2 | 3.1.3-amzn-2.1 | Service for accepting Hive queries as web requests. | 
| hudi | 0.12.1-amzn-0 | Incremental processing framework to power data pipeline at low latency and high efficiency. | 
| hudi-presto | 0.12.1-amzn-0 | Bundle library for running Presto with Hudi. | 
| hudi-trino | 0.12.1-amzn-0 | Bundle library for running Trino with Hudi. | 
| hudi-spark | 0.12.1-amzn-0 | Bundle library for running Spark with Hudi. | 
| hue-server | 4.10.0 | Web application for analyzing data using Hadoop ecosystem applications | 
| iceberg | 0.14.1-amzn-0 | Apache Iceberg is an open table format for huge analytic datasets | 
| jupyterhub | 1.4.1 | Multi-user server for Jupyter notebooks | 
| livy-server | 0.7.1-incubating | REST interface for interacting with Apache Spark | 
| nginx | 1.12.1 | nginx [engine x] is an HTTP and reverse proxy server | 
| mxnet | 1.9.1 | A flexible, scalable, and efficient library for deep learning. | 
| mariadb-server | 5.5.68+ | MariaDB database server. | 
| nvidia-cuda | 11.7.0 | Nvidia drivers and Cuda toolkit | 
| oozie-client | 5.2.1 | Oozie command-line client. | 
| oozie-server | 5.2.1 | Service for accepting Oozie workflow requests. | 
| opencv | 4.5.0 | Open Source Computer Vision Library. | 
| phoenix-library | 5.1.2 | The phoenix libraries for server and client | 
| phoenix-connectors | 6.0.0-SNAPSHOT | Apache Phoenix-Connectors for Spark-3 | 
| phoenix-query-server | 6.0.0 | A light weight server providing JDBC access as well as Protocol Buffers and JSON format access to the Avatica API  | 
| presto-coordinator | 0.276-amzn-0 | Service for accepting queries and managing query execution among presto-workers. | 
| presto-worker | 0.276-amzn-0 | Service for executing pieces of a query. | 
| presto-client | 0.276-amzn-0 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| trino-coordinator | 398-amzn-0 | Service for accepting queries and managing query execution among trino-workers. | 
| trino-worker | 398-amzn-0 | Service for executing pieces of a query. | 
| trino-client | 398-amzn-0 | Trino command-line client which is installed on an HA cluster's stand-by masters where Trino server is not started. | 
| pig-client | 0.17.0 | Pig command-line client. | 
| r | 4.0.2 | The R Project for Statistical Computing | 
| ranger-kms-server | 2.0.0 | Apache Ranger Key Management System | 
| spark-client | 3.3.0-amzn-1.1 | Spark command-line clients. | 
| spark-history-server | 3.3.0-amzn-1.1 | Web UI for viewing logged events for the lifetime of a completed Spark application. | 
| spark-on-yarn | 3.3.0-amzn-1.1 | In-memory execution engine for YARN. | 
| spark-yarn-slave | 3.3.0-amzn-1.1 | Apache Spark libraries needed by YARN slaves. | 
| spark-rapids | 22.08.0-amzn-0 | Nvidia Spark RAPIDS plugin that accelerates Apache Spark with GPUs. | 
| sqoop-client | 1.4.7 | Apache Sqoop command-line client. | 
| tensorflow | 2.10.0 | TensorFlow open source software library for high performance numerical computation. | 
| tez-on-yarn | 0.10.2-amzn-0.1 | The tez YARN application and libraries. | 
| webserver | 2.4.41+ | Apache HTTP server. | 
| zeppelin-server | 0.10.1 | Web-based notebook that enables interactive data analytics. | 
| zookeeper-server | 3.5.10 | Centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. | 
| zookeeper-client | 3.5.10 | ZooKeeper command line client. | 

## 6.9.1 configuration classifications
<a name="emr-691-class"></a>

Configuration classifications allow you to customize applications. These often correspond to a configuration XML file for the application, such as `hive-site.xml`. For more information, see [Configure applications](emr-configure-apps.md).

Reconfiguration actions occur when you specify a configuration for instance groups in a running cluster. Amazon EMR only initiates reconfiguration actions for the classifications that you modify. For more information, see [Reconfigure an instance group in a running cluster](emr-configure-apps-running-cluster.md).
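
As a rough sketch of what "specifying a configuration for an instance group" looks like, the snippet below builds a reconfiguration payload for one instance group. It assumes the `InstanceGroupId` / `Configurations` structure accepted by the Amazon EMR `ModifyInstanceGroups` API. The instance group ID `ig-EXAMPLE`, the helper name `build_reconfiguration`, and the property value are placeholders. Only the payload is constructed here, and no request is sent.

```python
import json


def build_reconfiguration(instance_group_id, classification, properties):
    """Build a one-group reconfiguration payload (hypothetical helper)."""
    return [
        {
            "InstanceGroupId": instance_group_id,
            "Configurations": [
                {
                    "Classification": classification,
                    # Property values are passed as strings.
                    "Properties": {key: str(value) for key, value in properties.items()},
                }
            ],
        }
    ]


# Placeholder ID and value; replace with those of your own cluster.
payload = build_reconfiguration(
    "ig-EXAMPLE", "spark-defaults", {"spark.executor.memory": "4g"}
)
print(json.dumps(payload, indent=2))
```

Because Amazon EMR only initiates reconfiguration actions for the classifications that you modify, keeping each payload to the classifications you actually intend to change limits which services restart.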


**emr-6.9.1 classifications**  

| Classifications | Description | Reconfiguration Actions | 
| --- | --- | --- | 
| capacity-scheduler | Change values in Hadoop's capacity-scheduler.xml file. | Restarts the ResourceManager service. | 
| container-executor | Change values in Hadoop YARN's container-executor.cfg file. | Not available. | 
| container-log4j | Change values in Hadoop YARN's container-log4j.properties file. | Not available. | 
| core-site | Change values in Hadoop's core-site.xml file. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Ranger KMS, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| docker-conf | Change docker related settings. | Not available. | 
| emrfs-site | Change EMRFS settings. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts HBaseRegionserver, HBaseMaster, HBaseThrift, HBaseRest, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| flink-conf | Change flink-conf.yaml settings. | Restarts Flink history server. | 
| flink-log4j | Change Flink log4j.properties settings. | Restarts Flink history server. | 
| flink-log4j-session | Change Flink log4j-session.properties settings for Kubernetes/Yarn session. | Restarts Flink history server. | 
| flink-log4j-cli | Change Flink log4j-cli.properties settings. | Restarts Flink history server. | 
| hadoop-env | Change values in the Hadoop environment for all Hadoop components. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts PhoenixQueryserver, HiveServer2, Hive MetaStore, and MapReduce-HistoryServer. | 
| hadoop-log4j | Change values in Hadoop's log4j.properties file. | Restarts the Hadoop HDFS services SecondaryNamenode, Datanode, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| hadoop-ssl-server | Change hadoop ssl server configuration | Not available. | 
| hadoop-ssl-client | Change hadoop ssl client configuration | Not available. | 
| hbase | Amazon EMR-curated settings for Apache HBase. | Custom EMR specific property. Sets emrfs-site and hbase-site configs. See those for their associated restarts. | 
| hbase-env | Change values in HBase's environment. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-log4j | Change values in HBase's hbase-log4j.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-metrics | Change values in HBase's hadoop-metrics2-hbase.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-policy | Change values in HBase's hbase-policy.xml file. | Not available. | 
| hbase-site | Change values in HBase's hbase-site.xml file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. Additionally restarts Phoenix QueryServer. | 
| hdfs-encryption-zones | Configure HDFS encryption zones. | This classification should not be reconfigured. | 
| hdfs-env | Change values in the HDFS environment. | Restarts Hadoop HDFS services Namenode, Datanode, and ZKFC. | 
| hdfs-site | Change values in HDFS's hdfs-site.xml. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Additionally restarts Hadoop Httpfs. | 
| hcatalog-env | Change values in HCatalog's environment. | Restarts Hive HCatalog Server. | 
| hcatalog-server-jndi | Change values in HCatalog's jndi.properties. | Restarts Hive HCatalog Server. | 
| hcatalog-server-proto-hive-site | Change values in HCatalog's proto-hive-site.xml. | Restarts Hive HCatalog Server. | 
| hcatalog-webhcat-env | Change values in HCatalog WebHCat's environment. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-log4j2 | Change values in HCatalog WebHCat's log4j2.properties. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-site | Change values in HCatalog WebHCat's webhcat-site.xml file. | Restarts Hive WebHCat server. | 
| hive | Amazon EMR-curated settings for Apache Hive. | Sets configurations to launch Hive LLAP service. | 
| hive-beeline-log4j2 | Change values in Hive's beeline-log4j2.properties file. | Not available. | 
| hive-parquet-logging | Change values in Hive's parquet-logging.properties file. | Not available. | 
| hive-env | Change values in the Hive environment. | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. | 
| hive-exec-log4j2 | Change values in Hive's hive-exec-log4j2.properties file. | Not available. | 
| hive-llap-daemon-log4j2 | Change values in Hive's llap-daemon-log4j2.properties file. | Not available. | 
| hive-log4j2 | Change values in Hive's hive-log4j2.properties file. | Not available. | 
| hive-site | Change values in Hive's hive-site.xml file | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. Also restarts Oozie and Zeppelin. | 
| hiveserver2-site | Change values in Hive Server2's hiveserver2-site.xml file | Not available. | 
| hue-ini | Change values in Hue's ini file | Restarts Hue. Also activates Hue config override CLI commands to pick up new configurations. | 
| httpfs-env | Change values in the HTTPFS environment. | Restarts Hadoop Httpfs service. | 
| httpfs-site | Change values in Hadoop's httpfs-site.xml file. | Restarts Hadoop Httpfs service. | 
| hadoop-kms-acls | Change values in Hadoop's kms-acls.xml file. | Not available. | 
| hadoop-kms-env | Change values in the Hadoop KMS environment. | Restarts Hadoop-KMS service. | 
| hadoop-kms-log4j | Change values in Hadoop's kms-log4j.properties file. | Not available. | 
| hadoop-kms-site | Change values in Hadoop's kms-site.xml file. | Restarts Hadoop-KMS and Ranger-KMS service. | 
| hudi-env | Change values in the Hudi environment. | Not available. | 
| hudi-defaults | Change values in Hudi's hudi-defaults.conf file. | Not available. | 
| iceberg-defaults | Change values in Iceberg's iceberg-defaults.conf file. | Not available. | 
| delta-defaults | Change values in Delta's delta-defaults.conf file. | Not available. | 
| jupyter-notebook-conf | Change values in Jupyter Notebook's jupyter\_notebook\_config.py file. | Not available. | 
| jupyter-hub-conf | Change values in JupyterHub's jupyterhub\_config.py file. | Not available. | 
| jupyter-s3-conf | Configure Jupyter Notebook S3 persistence. | Not available. | 
| jupyter-sparkmagic-conf | Change values in Sparkmagic's config.json file. | Not available. | 
| livy-conf | Change values in Livy's livy.conf file. | Restarts Livy Server. | 
| livy-env | Change values in the Livy environment. | Restarts Livy Server. | 
| livy-log4j2 | Change Livy log4j2.properties settings. | Restarts Livy Server. | 
| mapred-env | Change values in the MapReduce application's environment. | Restarts Hadoop MapReduce-HistoryServer. | 
| mapred-site | Change values in the MapReduce application's mapred-site.xml file. | Restarts Hadoop MapReduce-HistoryServer. | 
| oozie-env | Change values in Oozie's environment. | Restarts Oozie. | 
| oozie-log4j | Change values in Oozie's oozie-log4j.properties file. | Restarts Oozie. | 
| oozie-site | Change values in Oozie's oozie-site.xml file. | Restarts Oozie. | 
| phoenix-hbase-metrics | Change values in Phoenix's hadoop-metrics2-hbase.properties file. | Not available. | 
| phoenix-hbase-site | Change values in Phoenix's hbase-site.xml file. | Not available. | 
| phoenix-log4j | Change values in Phoenix's log4j.properties file. | Restarts Phoenix-QueryServer. | 
| phoenix-metrics | Change values in Phoenix's hadoop-metrics2-phoenix.properties file. | Not available. | 
| pig-env | Change values in the Pig environment. | Not available. | 
| pig-properties | Change values in Pig's pig.properties file. | Restarts Oozie. | 
| pig-log4j | Change values in Pig's log4j.properties file. | Not available. | 
| presto-log | Change values in Presto's log.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-config | Change values in Presto's config.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-password-authenticator | Change values in Presto's password-authenticator.properties file. | Not available. | 
| presto-env | Change values in Presto's presto-env.sh file. | Restarts Presto-Server (for PrestoDB) | 
| presto-node | Change values in Presto's node.properties file. | Not available. | 
| presto-connector-blackhole | Change values in Presto's blackhole.properties file. | Not available. | 
| presto-connector-cassandra | Change values in Presto's cassandra.properties file. | Not available. | 
| presto-connector-hive | Change values in Presto's hive.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-jmx | Change values in Presto's jmx.properties file. | Not available. | 
| presto-connector-kafka | Change values in Presto's kafka.properties file. | Not available. | 
| presto-connector-lakeformation | Change values in Presto's lakeformation.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-localfile | Change values in Presto's localfile.properties file. | Not available. | 
| presto-connector-memory | Change values in Presto's memory.properties file. | Not available. | 
| presto-connector-mongodb | Change values in Presto's mongodb.properties file. | Not available. | 
| presto-connector-mysql | Change values in Presto's mysql.properties file. | Not available. | 
| presto-connector-postgresql | Change values in Presto's postgresql.properties file. | Not available. | 
| presto-connector-raptor | Change values in Presto's raptor.properties file. | Not available. | 
| presto-connector-redis | Change values in Presto's redis.properties file. | Not available. | 
| presto-connector-redshift | Change values in Presto's redshift.properties file. | Not available. | 
| presto-connector-tpch | Change values in Presto's tpch.properties file. | Not available. | 
| presto-connector-tpcds | Change values in Presto's tpcds.properties file. | Not available. | 
| trino-log | Change values in Trino's log.properties file. | Restarts Trino-Server (for Trino) | 
| trino-config | Change values in Trino's config.properties file. | Restarts Trino-Server (for Trino) | 
| trino-password-authenticator | Change values in Trino's password-authenticator.properties file. | Restarts Trino-Server (for Trino) | 
| trino-env | Change values in Trino's trino-env.sh file. | Restarts Trino-Server (for Trino) | 
| trino-node | Change values in Trino's node.properties file. | Not available. | 
| trino-connector-blackhole | Change values in Trino's blackhole.properties file. | Not available. | 
| trino-connector-cassandra | Change values in Trino's cassandra.properties file. | Not available. | 
| trino-connector-delta | Change values in Trino's delta.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-hive | Change values in Trino's hive.properties file. | Restarts Trino-Server (for Trino) | 
| trino-exchange-manager | Change values in Trino's exchange-manager.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-iceberg | Change values in Trino's iceberg.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-jmx | Change values in Trino's jmx.properties file. | Not available. | 
| trino-connector-kafka | Change values in Trino's kafka.properties file. | Not available. | 
| trino-connector-localfile | Change values in Trino's localfile.properties file. | Not available. | 
| trino-connector-memory | Change values in Trino's memory.properties file. | Not available. | 
| trino-connector-mongodb | Change values in Trino's mongodb.properties file. | Not available. | 
| trino-connector-mysql | Change values in Trino's mysql.properties file. | Not available. | 
| trino-connector-postgresql | Change values in Trino's postgresql.properties file. | Not available. | 
| trino-connector-raptor | Change values in Trino's raptor.properties file. | Not available. | 
| trino-connector-redis | Change values in Trino's redis.properties file. | Not available. | 
| trino-connector-redshift | Change values in Trino's redshift.properties file. | Not available. | 
| trino-connector-tpch | Change values in Trino's tpch.properties file. | Not available. | 
| trino-connector-tpcds | Change values in Trino's tpcds.properties file. | Not available. | 
| ranger-kms-dbks-site | Change values in dbks-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-site | Change values in ranger-kms-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-env | Change values in the Ranger KMS environment. | Restarts Ranger KMS Server. | 
| ranger-kms-log4j | Change values in kms-log4j.properties file of Ranger KMS. | Not available. | 
| ranger-kms-db-ca | Change values for CA file on S3 for MySQL SSL connection with Ranger KMS. | Not available. | 
| spark | Amazon EMR-curated settings for Apache Spark. | This property modifies spark-defaults. See actions there. | 
| spark-defaults | Change values in Spark's spark-defaults.conf file. | Restarts Spark history server and Spark thrift server. | 
| spark-env | Change values in the Spark environment. | Restarts Spark history server and Spark thrift server. | 
| spark-hive-site | Change values in Spark's hive-site.xml file | Not available. | 
| spark-log4j2 | Change values in Spark's log4j2.properties file. | Restarts Spark history server and Spark thrift server. | 
| spark-metrics | Change values in Spark's metrics.properties file. | Restarts Spark history server and Spark thrift server. | 
| sqoop-env | Change values in Sqoop's environment. | Not available. | 
| sqoop-oraoop-site | Change values in Sqoop OraOop's oraoop-site.xml file. | Not available. | 
| sqoop-site | Change values in Sqoop's sqoop-site.xml file. | Not available. | 
| tez-site | Change values in Tez's tez-site.xml file. | Restarts Oozie and HiveServer2. | 
| yarn-env | Change values in the YARN environment. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts MapReduce-HistoryServer. | 
| yarn-site | Change values in YARN's yarn-site.xml file. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Livy Server and MapReduce-HistoryServer. | 
| zeppelin-env | Change values in the Zeppelin environment. | Restarts Zeppelin. | 
| zeppelin-site | Change configuration settings in zeppelin-site.xml. | Restarts Zeppelin. | 
| zookeeper-config | Change values in ZooKeeper's zoo.cfg file. | Restarts Zookeeper server. | 
| zookeeper-log4j | Change values in ZooKeeper's log4j.properties file. | Restarts Zookeeper server. | 
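
The classification names in the table above are the values you supply as `Classification` when you reconfigure an instance group in a running cluster. As a minimal sketch, assuming the `InstanceGroupId` and `Configurations` JSON layout used for instance group reconfiguration, the following Python builds such a payload. The instance group ID and the property value are placeholders for illustration, not values taken from this release.

```python
import json


def build_reconfiguration(instance_group_id, classification, properties):
    """Return a one-element list describing a reconfiguration request
    for a single instance group."""
    return [
        {
            "InstanceGroupId": instance_group_id,
            "Configurations": [
                {
                    "Classification": classification,
                    # Property values are passed as strings.
                    "Properties": {k: str(v) for k, v in properties.items()},
                    "Configurations": [],
                }
            ],
        }
    ]


# Hypothetical instance group ID and property value, for illustration only.
payload = build_reconfiguration(
    "ig-EXAMPLE12345",
    "spark-defaults",
    {"spark.executor.memory": "4g"},
)
print(json.dumps(payload, indent=2))
```

You could save the printed JSON to a file and pass it to `aws emr modify-instance-groups` with the `--instance-groups` option. According to the table, a change to `spark-defaults` restarts the Spark history server and the Spark thrift server.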

## 6.9.1 change log
<a name="691-changelog"></a>


**Change log for 6.9.1 release and release notes**  

| Date | Event | Description | 
| --- | --- | --- | 
| 2023-08-30 | Update release notes | Added several control-plane related fixes to the release notes | 
| 2023-08-21 | Docs publication | Amazon EMR 6.9.1 release notes first published | 
| 2023-08-16 | Deployment complete | Amazon EMR 6.9.1 fully deployed to all [supported Regions](https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/) | 
| 2023-08-04 | Initial release | Amazon EMR 6.9.1 first deployed to limited commercial Regions | 

# Amazon EMR release 6.9.0
<a name="emr-690-release"></a>

## 6.9.0 application versions
<a name="emr-690-app-versions"></a>

This release includes the following applications: [Delta](https://delta.io/), [Flink](https://flink.apache.org/), [Ganglia](http://ganglia.info), [HBase](http://hbase.apache.org/), [HCatalog](https://cwiki.apache.org/confluence/display/Hive/HCatalog), [Hadoop](http://hadoop.apache.org/docs/current/), [Hive](http://hive.apache.org/), [Hudi](https://hudi.apache.org), [Hue](http://gethue.com/), [Iceberg](https://iceberg.apache.org/), [JupyterEnterpriseGateway](https://jupyter-enterprise-gateway.readthedocs.io/en/latest/), [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/#), [Livy](https://livy.incubator.apache.org/), [MXNet](https://mxnet.incubator.apache.org/), [Oozie](http://oozie.apache.org/), [Phoenix](https://phoenix.apache.org/), [Pig](http://pig.apache.org/), [Presto](https://prestodb.io/), [Spark](https://spark.apache.org/docs/latest/), [Sqoop](http://sqoop.apache.org/), [TensorFlow](https://www.tensorflow.org/), [Tez](https://tez.apache.org/), [Trino](https://trino.io/), [Zeppelin](https://zeppelin.incubator.apache.org/), and [ZooKeeper](https://zookeeper.apache.org).

The table below lists the application versions available in this release of Amazon EMR and the application versions in the preceding three Amazon EMR releases (when applicable).

For a comprehensive history of application versions for each release of Amazon EMR, see the following topics:
+ [Application versions in Amazon EMR 7.x releases](emr-release-app-versions-7.x.md)
+ [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md)
+ [Application versions in Amazon EMR 5.x releases](emr-release-app-versions-5.x.md)
+ [Application versions in Amazon EMR 4.x releases](emr-release-app-versions-4.x.md)


**Application version information**  

|  | emr-6.9.0 | emr-6.8.1 | emr-6.8.0 | emr-6.7.0 | 
| --- | --- | --- | --- | --- | 
| AWS SDK for Java | 1.12.170 | 1.12.170 | 1.12.170 | 1.12.170 | 
| Python | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 
| Scala | 2.12.15 | 2.12.15 | 2.12.15 | 2.12.15 | 
| AmazonCloudWatchAgent |  -  |  -  |  -  |  -  | 
| Delta | 2.1.0 |  -  |  -  |  -  | 
| Flink | 1.15.2 | 1.15.1 | 1.15.1 | 1.14.2 | 
| Ganglia | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 
| HBase | 2.4.13-amzn-0 | 2.4.12-amzn-0.1 | 2.4.12-amzn-0 | 2.4.4-amzn-3 | 
| HCatalog | 3.1.3-amzn-2 | 3.1.3-amzn-1.1 | 3.1.3-amzn-1 | 3.1.3-amzn-0 | 
| Hadoop | 3.3.3-amzn-1 | 3.2.1-amzn-8.1 | 3.2.1-amzn-8 | 3.2.1-amzn-7 | 
| Hive | 3.1.3-amzn-2 | 3.1.3-amzn-1.1 | 3.1.3-amzn-1 | 3.1.3-amzn-0 | 
| Hudi | 0.12.1-amzn-0 | 0.11.1-amzn-0 | 0.11.1-amzn-0 | 0.11.0-amzn-0 | 
| Hue | 4.10.0 | 4.10.0 | 4.10.0 | 4.10.0 | 
| Iceberg | 0.14.1-amzn-0 | 0.14.0-amzn-0 | 0.14.0-amzn-0 | 0.13.1-amzn-0 | 
| JupyterEnterpriseGateway | 2.6.0 | 2.1.0 | 2.1.0 | 2.1.0 | 
| JupyterHub | 1.4.1 | 1.4.1 | 1.4.1 | 1.4.1 | 
| Livy | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 
| MXNet | 1.9.1 | 1.9.1 | 1.9.1 | 1.8.0 | 
| Mahout |  -  |  -  |  -  |  -  | 
| Oozie | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 
| Phoenix | 5.1.2 | 5.1.2 | 5.1.2 | 5.1.2 | 
| Pig | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 
| Presto | 0.276-amzn-0 | 0.273.3-amzn-0 | 0.273.3-amzn-0 | 0.272-amzn-0 | 
| Spark | 3.3.0-amzn-1 | 3.3.0-amzn-0.1 | 3.3.0-amzn-0 | 3.2.1-amzn-0 | 
| Sqoop | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 
| TensorFlow | 2.10.0 | 2.9.1 | 2.9.1 | 2.4.1 | 
| Tez | 0.10.2-amzn-0 | 0.9.2 | 0.9.2 | 0.9.2 | 
| Trino (PrestoSQL) | 398-amzn-0 | 388-amzn-0 | 388-amzn-0 | 378-amzn-0 | 
| Zeppelin | 0.10.1 | 0.10.1 | 0.10.1 | 0.10.0 | 
| ZooKeeper | 3.5.10 | 3.5.10 | 3.5.10 | 3.5.7 | 

## 6.9.0 release notes
<a name="emr-690-relnotes"></a>

The following release notes include information for Amazon EMR release 6.9.0. Changes are relative to Amazon EMR release 6.8.0. For information on the release timeline, see the [change log](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-690-release.html#690-changelog).

**New Features**
+ Amazon EMR release 6.9.0 supports Apache Spark RAPIDS 22.08.0, Apache Hudi 0.12.1, Apache Iceberg 0.14.1, Trino 398, and Tez 0.10.2.
+ Amazon EMR release 6.9.0 includes a new open-source application, [Delta Lake](emr-delta.md) 2.1.0.
+ The Amazon Redshift integration for Apache Spark is included in Amazon EMR releases 6.9.0 and later. Previously an open-source tool, the native integration is a Spark connector that you can use to build Apache Spark applications that read from and write to data in Amazon Redshift and Amazon Redshift Serverless. For more information, see [Using Amazon Redshift integration for Apache Spark with Amazon EMR](emr-spark-redshift.md).
+ Amazon EMR release 6.9.0 adds support for archiving logs to Amazon S3 during cluster scale-down. Previously, you could only archive log files to Amazon S3 during cluster termination. The new capability ensures that log files generated on the cluster persist on Amazon S3 even after the node is terminated. For more information, see [Configure cluster logging and debugging](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-debugging.html).
+ To support long running queries, Trino now includes a fault-tolerant execution mechanism. Fault-tolerant execution mitigates query failures by retrying failed queries or their component tasks.
+ You can use Apache Flink on Amazon EMR for unified `BATCH` and `STREAM` processing of Apache Hive Tables or metadata of any Flink tablesource such as Iceberg, Kinesis or Kafka. You can specify the AWS Glue Data Catalog as the metastore for Flink using the AWS Management Console, AWS CLI, or Amazon EMR API. For more information, see [Configuring Flink in Amazon EMR](flink-configure.md).
+ You can now specify AWS Identity and Access Management (IAM) runtime roles and AWS Lake Formation-based access control for Apache Spark, Apache Hive, and Presto queries on Amazon EMR on EC2 clusters with Amazon SageMaker AI Studio. For more information, see [Configure runtime roles for Amazon EMR steps](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-steps-runtime-roles.html). 
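
The Trino fault-tolerant execution feature listed above is typically turned on through configuration classifications. The following Python is a sketch only: it assumes that Trino's `retry-policy` property (in `config.properties`) and `exchange.base-directories` property (in `exchange-manager.properties`) are set through the `trino-config` and `trino-exchange-manager` classifications. The bucket name is a hypothetical placeholder. Check the Trino documentation for the properties that your version supports.

```python
import json


def trino_fault_tolerant_configurations(retry_policy, exchange_uri):
    """Build a configurations list that enables Trino retries.

    retry_policy: "QUERY" retries whole queries, "TASK" retries tasks.
    exchange_uri: location for spooled exchange data (assumed S3 URI).
    """
    if retry_policy not in ("QUERY", "TASK"):
        raise ValueError("retry_policy must be 'QUERY' or 'TASK'")
    return [
        {
            "Classification": "trino-config",
            "Properties": {"retry-policy": retry_policy},
        },
        {
            "Classification": "trino-exchange-manager",
            "Properties": {"exchange.base-directories": exchange_uri},
        },
    ]


# Hypothetical bucket, for illustration only.
configurations = trino_fault_tolerant_configurations(
    "TASK", "s3://amzn-s3-demo-bucket/trino-exchange"
)
print(json.dumps(configurations, indent=2))
```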

**Known Issues**
+ For Amazon EMR release 6.9.0, Trino does not work on clusters enabled for Apache Ranger. If you need to use Trino with Ranger, contact [Support](https://console.aws.amazon.com/support/home#/).
+ If you use the Amazon Redshift integration for Apache Spark and have a time, timetz, timestamp, or timestamptz with microsecond precision in Parquet format, the connector rounds the time values to the nearest millisecond value. As a workaround, use the `unload_s3_format` parameter to select the text unload format.
+ When you use Spark with Hive partition location formatting to read data in Amazon S3, and you run Spark on Amazon EMR releases 5.30.0 to 5.36.0, and 6.2.0 to 6.9.0, you might encounter an issue that prevents your cluster from reading data correctly. This can happen if your partitions have all of the following characteristics:
  + Two or more partitions are scanned from the same table.
  + At least one partition directory path is a prefix of at least one other partition directory path, for example, `s3://bucket/table/p=a` is a prefix of `s3://bucket/table/p=a b`.
  + The first character that follows the prefix in the other partition directory has a UTF-8 value that’s less than the `/` character (U+002F). For example, the space character (U+0020) that occurs between a and b in `s3://bucket/table/p=a b` falls into this category. Note that there are 14 other non-control characters: `!"#$%&'()*+,-.`. For more information, see [UTF-8 encoding table and Unicode characters](https://www.utf8-chartable.de/).

  As a workaround to this issue, set the `spark.sql.sources.fastS3PartitionDiscovery.enabled` configuration to `false` in the `spark-defaults` classification.
+ Connections to Amazon EMR clusters from Amazon SageMaker AI Studio may intermittently fail with a **403 Forbidden** response code. This error happens when setup of the IAM role on the cluster takes longer than 60 seconds. As a workaround, you can install an Amazon EMR patch to enable retries and increase the timeout to a minimum of 300 seconds. Use the following steps to apply the bootstrap action when you launch your cluster.

  1.  Download the bootstrap script and RPM files from the following Amazon S3 URIs.

     ```
     s3://emr-data-access-control-us-east-1/customer-bootstrap-actions/gcsc/replace-rpms.sh
     s3://emr-data-access-control-us-east-1/customer-bootstrap-actions/gcsc/emr-secret-agent-1.18.0-SNAPSHOT20221121212949.noarch.rpm
     ```

  1. Upload the files from the previous step to an Amazon S3 bucket that you own. The bucket must be in the same AWS Region where you plan to launch the cluster.

  1. Include the following bootstrap action when you launch your EMR cluster. Replace *bootstrap\_URI* and *RPM\_URI* with the corresponding URIs from Amazon S3. 

     ```
     --bootstrap-actions "Path=bootstrap_URI,Args=[RPM_URI]"
     ```
+ With Amazon EMR releases 5.36.0 and 6.6.0 through 6.9.0, `SecretAgent` and `RecordServer` service components may experience log data loss due to an incorrect file name pattern configuration in Log4j2 properties. The incorrect configuration causes the components to generate only one log file per day. When the rotation strategy occurs, it overwrites the existing file instead of generating a new log file as expected. As a workaround, use a bootstrap action to generate log files each hour and append an auto-increment integer in the file name to handle the rotation.

  For Amazon EMR 6.6.0 through 6.9.0 releases, use the following bootstrap action when you launch a cluster. 

  ```
  --bootstrap-actions "Path=s3://emr-data-access-control-us-east-1/customer-bootstrap-actions/log-rotation-emr-6x/replace-puppet.sh,Args=[]"
  ```

  For Amazon EMR 5.36.0, use the following bootstrap action when you launch a cluster.

  ```
  --bootstrap-actions "Path=s3://emr-data-access-control-us-east-1/customer-bootstrap-actions/log-rotation-emr-5x/replace-puppet.sh,Args=[]"
  ```
+ Apache Flink provides Native S3 FileSystem and Hadoop FileSystem Connectors, which let applications create a FileSink and write the data into Amazon S3. This FileSink fails with one of the following two exceptions.

  ```
  java.lang.UnsupportedOperationException: Recoverable writers on Hadoop are only supported for HDFS
  ```

  ```
  Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.io.retry.RetryPolicies.retryOtherThanRemoteAndSaslException(Lorg/apache/hadoop/io/retry/RetryPolicy;Ljava/util/Map;)Lorg/apache/hadoop/io/retry/RetryPolicy;
                                          at org.apache.hadoop.yarn.client.RMProxy.createRetryPolicy(RMProxy.java:302) ~[hadoop-yarn-common-3.3.3-amzn-0.jar:?]
  ```

  As a workaround, you can install an Amazon EMR patch, which fixes the above issue in Flink. To apply the bootstrap action when you launch your cluster, complete the following steps.

  1. Download the flink-rpm to your Amazon S3 bucket. Your RPM path is `s3://DOC-EXAMPLE-BUCKET/rpms/flink/`.

  1. Download the bootstrap script and RPM files from Amazon S3 using the following URI. Replace `regionName` with the AWS Region where you plan to launch the cluster.

     ```
     s3://emr-data-access-control-regionName/customer-bootstrap-actions/gcsc/replace-rpms.sh
     ```

+ Hadoop 3.3.3 introduced a change in YARN ([YARN-9608](https://issues.apache.org/jira/browse/YARN-9608)) that keeps nodes where containers ran in a decommissioning state until the application completes. This change ensures that local data such as shuffle data doesn't get lost, and you don't need to re-run the job. In Amazon EMR 6.8.0 and 6.9.0, this approach might also lead to underutilization of resources on clusters with or without managed scaling enabled.

  With [Amazon EMR 6.10.0](emr-6100-release.md#emr-6100-relnotes), you can work around this issue by setting the value of `yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications` to `false` in `yarn-site.xml`. In Amazon EMR releases 6.11.0 and higher as well as 6.8.1, 6.9.1, and 6.10.1, the config is set to `false` by default to resolve this issue.
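
For the Spark partition-location issue described earlier in this list, the partition characteristics can be checked mechanically. The following Python is a rough sketch, not an official tool. It assumes that you already have the partition directory paths of a single table as strings, and the function names are made up for this example. It flags pairs where one path is a prefix of another and the next character has a UTF-8 value below `/`, then prints the `spark-defaults` workaround from the text.

```python
import json

SLASH_BYTES = "/".encode("utf-8")


def affected_partition_pairs(partition_paths):
    """Return (prefix, other) pairs where `prefix` is a prefix of `other`
    and the character that follows it sorts before '/' in UTF-8."""
    paths = [p.rstrip("/") for p in partition_paths]
    pairs = []
    for prefix in paths:
        for other in paths:
            if other == prefix or not other.startswith(prefix):
                continue
            # `other` is strictly longer than `prefix`, so this index is valid.
            next_char = other[len(prefix)]
            if next_char.encode("utf-8") < SLASH_BYTES:
                pairs.append((prefix, other))
    return pairs


def fast_discovery_workaround():
    """Configuration that applies the workaround named in the text."""
    return [
        {
            "Classification": "spark-defaults",
            "Properties": {
                "spark.sql.sources.fastS3PartitionDiscovery.enabled": "false"
            },
        }
    ]


# Example paths; the first two come from the text, the third is a control.
paths = [
    "s3://bucket/table/p=a",
    "s3://bucket/table/p=a b",
    "s3://bucket/table/p=ab",
]
if affected_partition_pairs(paths):
    print(json.dumps(fast_discovery_workaround(), indent=2))
```

In this example only the `p=a` and `p=a b` pair is flagged, because the space character sorts before `/` while `b` does not.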

**Changes, Enhancements, and Resolved Issues**
+ For Amazon EMR release 6.9.0 and later, all components installed by Amazon EMR that use Log4j libraries use Log4j version 2.17.1 or later.
+ When you use the DynamoDB connector with Spark on Amazon EMR versions 6.6.0, 6.7.0, and 6.8.0, all reads from your table return an empty result, even though the input split references non-empty data. Amazon EMR release 6.9.0 fixes this issue.
+ Amazon EMR 6.9.0 adds limited support for Lake Formation-based access control with Apache Hudi when reading data using Spark SQL. The support is for SELECT queries using Spark SQL and is limited to column-level access control. For more information, see [Hudi and Lake Formation](https://docs.aws.amazon.com/emr/latest/ManagementGuide/hudi-with-lake-formation.html).
+ When you use Amazon EMR 6.9.0 to create a Hadoop cluster with [Node Labels](https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/NodeLabel.html) enabled, the [YARN metrics API](https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Metrics_API) returns aggregated information across all partitions, instead of the default partition. For more information, see [YARN-11414](https://issues.apache.org/jira/browse/YARN-11414).
+ With Amazon EMR release 6.9.0, we've updated Trino to version 398, which uses Java 17. The previous supported version of Trino for Amazon EMR 6.8.0 was Trino 388 running on Java 11. For more information about this change, see [Trino updates to Java 17](https://trino.io/blog/2022/07/14/trino-updates-to-java-17.html) on the Trino blog.
+ This release fixes a timing sequence mismatch issue between Apache BigTop and the Amazon EMR on EC2 cluster startup sequence. This timing sequence mismatch occurs when a system attempts to perform two or more operations at the same time instead of doing them in the proper sequence. As a result, certain cluster configurations experienced instance startup timeouts and slower cluster startup times.
+ When you launch a cluster with *the latest patch release* of Amazon EMR 5.36 or higher, 6.6 or higher, or 7.0 or higher, Amazon EMR uses the latest Amazon Linux 2023 or Amazon Linux 2 release for the default Amazon EMR AMI. For more information, see [Using the default Amazon Linux AMI for Amazon EMR](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-default-ami.html).
**Note**  
This release no longer gets automatic AMI updates since it has been succeeded by one or more patch releases. The patch release is denoted by the number after the second decimal point (`6.8.1`). To see if you're using the latest patch release, check the available releases in the [Release Guide](https://docs.aws.amazon.com/emr/latest/ReleaseGuide), or check the **Amazon EMR release** dropdown when you create a cluster in the console, or use the [ListReleaseLabels](https://docs.aws.amazon.com/emr/latest/APIReference/API_ListReleaseLabels.html) API or [list-release-labels](https://docs.aws.amazon.com/cli/latest/reference/emr/list-release-labels.html) CLI action. To get updates about new releases, subscribe to the RSS feed on the [What's new?](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-whatsnew.html) page.    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-690-release.html)
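
The patch-release check described in the note can be sketched in a few lines of Python. This example assumes you already have a list of release labels, for instance from the ListReleaseLabels API or the `list-release-labels` CLI action. The helper names are hypothetical and the label list is sample input.

```python
import re

LABEL_PATTERN = re.compile(r"^emr-(\d+)\.(\d+)\.(\d+)$")


def parse_label(label):
    """Return (major, minor, patch) for labels like 'emr-6.9.1', else None."""
    match = LABEL_PATTERN.match(label)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def latest_patch(labels, major, minor):
    """Return the highest patch release label for the given major.minor."""
    versions = [parse_label(label) for label in labels]
    matching = [v for v in versions if v is not None and v[:2] == (major, minor)]
    if not matching:
        return None
    return "emr-{}.{}.{}".format(*max(matching))


# Sample labels, for illustration only.
labels = ["emr-6.10.0", "emr-6.9.1", "emr-6.9.0", "emr-6.8.1"]
print(latest_patch(labels, 6, 9))  # emr-6.9.1
```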

## 6.9.0 component versions
<a name="emr-690-components"></a>

The components that Amazon EMR installs with this release are listed below. Some are installed as part of big-data application packages. Others are unique to Amazon EMR and installed for system processes and features. These typically start with `emr` or `aws`. Big-data application packages in the most recent Amazon EMR release are usually the latest version found in the community. We make community releases available in Amazon EMR as quickly as possible.

Some components in Amazon EMR differ from community versions. These components have a version label in the form `CommunityVersion-amzn-EmrVersion`. The `EmrVersion` starts at 0. For example, if an open-source community component named `myapp-component` with version 2.2 has been modified three times for inclusion in different Amazon EMR releases, its release version is listed as `2.2-amzn-2`.
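
As an illustration of the `CommunityVersion-amzn-EmrVersion` label format, the following Python sketch splits a version string into its two parts. The function name is hypothetical. Labels without an `-amzn-` suffix are treated as unmodified community versions.

```python
def parse_component_version(version):
    """Split 'CommunityVersion-amzn-EmrVersion' into its two parts.

    Returns (community_version, emr_version); emr_version is None when the
    label has no '-amzn-' suffix.
    """
    community, separator, emr = version.partition("-amzn-")
    return community, (emr if separator else None)


print(parse_component_version("2.2-amzn-2"))       # ('2.2', '2')
print(parse_component_version("2.4.12-amzn-0.1"))  # ('2.4.12', '0.1')
print(parse_component_version("3.7.2"))            # ('3.7.2', None)
```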


| Component | Version | Description | 
| --- | --- | --- | 
| aws-sagemaker-spark-sdk | 1.4.2 | Amazon SageMaker Spark SDK | 
| delta | 2.1.0 | Delta lake is an open table format for huge analytic datasets | 
| emr-ddb | 4.16.0 | Amazon DynamoDB connector for Hadoop ecosystem applications. | 
| emr-goodies | 3.3.0 | Extra convenience libraries for the Hadoop ecosystem. | 
| emr-kinesis | 3.6.0 | Amazon Kinesis connector for Hadoop ecosystem applications. | 
| emr-notebook-env | 1.7.0 | Conda env for emr notebook which includes jupyter enterprise gateway | 
| emr-s3-dist-cp | 2.23.0 | Distributed copy application optimized for Amazon S3. | 
| emr-s3-select | 2.2.0 | EMR S3Select Connector | 
| emrfs | 2.54.0 | Amazon S3 connector for Hadoop ecosystem applications. | 
| flink-client | 1.15.2 | Apache Flink command line client scripts and applications. | 
| flink-jobmanager-config | 1.15.2 | Managing resources on EMR nodes for Apache Flink JobManager. | 
| ganglia-monitor | 3.7.2 | Embedded Ganglia agent for Hadoop ecosystem applications along with the Ganglia monitoring agent. | 
| ganglia-metadata-collector | 3.7.2 | Ganglia metadata collector for aggregating metrics from Ganglia monitoring agents. | 
| ganglia-web | 3.7.1 | Web application for viewing metrics collected by the Ganglia metadata collector. | 
| hadoop-client | 3.3.3-amzn-1 | Hadoop command-line clients such as 'hdfs', 'hadoop', or 'yarn'. | 
| hadoop-hdfs-datanode | 3.3.3-amzn-1 | HDFS node-level service for storing blocks. | 
| hadoop-hdfs-library | 3.3.3-amzn-1 | HDFS command-line client and library | 
| hadoop-hdfs-namenode | 3.3.3-amzn-1 | HDFS service for tracking file names and block locations. | 
| hadoop-hdfs-journalnode | 3.3.3-amzn-1 | HDFS service for managing the Hadoop filesystem journal on HA clusters. | 
| hadoop-httpfs-server | 3.3.3-amzn-1 | HTTP endpoint for HDFS operations. | 
| hadoop-kms-server | 3.3.3-amzn-1 | Cryptographic key management server based on Hadoop's KeyProvider API. | 
| hadoop-mapred | 3.3.3-amzn-1 | MapReduce execution engine libraries for running a MapReduce application. | 
| hadoop-yarn-nodemanager | 3.3.3-amzn-1 | YARN service for managing containers on an individual node. | 
| hadoop-yarn-resourcemanager | 3.3.3-amzn-1 | YARN service for allocating and managing cluster resources and distributed applications. | 
| hadoop-yarn-timeline-server | 3.3.3-amzn-1 | Service for retrieving current and historical information for YARN applications. | 
| hbase-hmaster | 2.4.13-amzn-0 | Service for an HBase cluster responsible for coordination of Regions and execution of administrative commands. | 
| hbase-region-server | 2.4.13-amzn-0 | Service for serving one or more HBase regions. | 
| hbase-client | 2.4.13-amzn-0 | HBase command-line client. | 
| hbase-rest-server | 2.4.13-amzn-0 | Service providing a RESTful HTTP endpoint for HBase. | 
| hbase-thrift-server | 2.4.13-amzn-0 | Service providing a Thrift endpoint to HBase. | 
| hbase-operator-tools | 2.4.13-amzn-0 | Repair tool for Apache HBase clusters. | 
| hcatalog-client | 3.1.3-amzn-2 | The 'hcat' command line client for manipulating hcatalog-server. | 
| hcatalog-server | 3.1.3-amzn-2 | Service providing HCatalog, a table and storage management layer for distributed applications. | 
| hcatalog-webhcat-server | 3.1.3-amzn-2 | HTTP endpoint providing a REST interface to HCatalog. | 
| hive-client | 3.1.3-amzn-2 | Hive command line client. | 
| hive-hbase | 3.1.3-amzn-2 | Hive-hbase client. | 
| hive-metastore-server | 3.1.3-amzn-2 | Service for accessing the Hive metastore, a semantic repository storing metadata for SQL on Hadoop operations. | 
| hive-server2 | 3.1.3-amzn-2 | Service for accepting Hive queries as web requests. | 
| hudi | 0.12.1-amzn-0 | Incremental processing framework to power data pipeline at low latency and high efficiency. | 
| hudi-presto | 0.12.1-amzn-0 | Bundle library for running Presto with Hudi. | 
| hudi-trino | 0.12.1-amzn-0 | Bundle library for running Trino with Hudi. | 
| hudi-spark | 0.12.1-amzn-0 | Bundle library for running Spark with Hudi. | 
| hue-server | 4.10.0 | Web application for analyzing data using Hadoop ecosystem applications | 
| iceberg | 0.14.1-amzn-0 | Apache Iceberg is an open table format for huge analytic datasets | 
| jupyterhub | 1.4.1 | Multi-user server for Jupyter notebooks | 
| livy-server | 0.7.1-incubating | REST interface for interacting with Apache Spark | 
| nginx | 1.12.1 | nginx [engine x] is an HTTP and reverse proxy server | 
| mxnet | 1.9.1 | A flexible, scalable, and efficient library for deep learning. | 
| mariadb-server | 5.5.68+ | MariaDB database server. | 
| nvidia-cuda | 11.7.0 | Nvidia drivers and Cuda toolkit | 
| oozie-client | 5.2.1 | Oozie command-line client. | 
| oozie-server | 5.2.1 | Service for accepting Oozie workflow requests. | 
| opencv | 4.5.0 | Open Source Computer Vision Library. | 
| phoenix-library | 5.1.2 | The phoenix libraries for server and client | 
| phoenix-connectors | 6.0.0-SNAPSHOT | Apache Phoenix-Connectors for Spark-3 | 
| phoenix-query-server | 6.0.0 | A light weight server providing JDBC access as well as Protocol Buffers and JSON format access to the Avatica API  | 
| presto-coordinator | 0.276-amzn-0 | Service for accepting queries and managing query execution among presto-workers. | 
| presto-worker | 0.276-amzn-0 | Service for executing pieces of a query. | 
| presto-client | 0.276-amzn-0 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| trino-coordinator | 398-amzn-0 | Service for accepting queries and managing query execution among trino-workers. | 
| trino-worker | 398-amzn-0 | Service for executing pieces of a query. | 
| trino-client | 398-amzn-0 | Trino command-line client which is installed on an HA cluster's stand-by masters where Trino server is not started. | 
| pig-client | 0.17.0 | Pig command-line client. | 
| r | 4.0.2 | The R Project for Statistical Computing | 
| ranger-kms-server | 2.0.0 | Apache Ranger Key Management System | 
| spark-client | 3.3.0-amzn-1 | Spark command-line clients. | 
| spark-history-server | 3.3.0-amzn-1 | Web UI for viewing logged events for the lifetime of a completed Spark application. | 
| spark-on-yarn | 3.3.0-amzn-1 | In-memory execution engine for YARN. | 
| spark-yarn-slave | 3.3.0-amzn-1 | Apache Spark libraries needed by YARN slaves. | 
| spark-rapids | 22.08.0-amzn-0 | Nvidia Spark RAPIDS plugin that accelerates Apache Spark with GPUs. | 
| sqoop-client | 1.4.7 | Apache Sqoop command-line client. | 
| tensorflow | 2.10.0 | TensorFlow open source software library for high performance numerical computation. | 
| tez-on-yarn | 0.10.2-amzn-0 | The tez YARN application and libraries. | 
| webserver | 2.4.41+ | Apache HTTP server. | 
| zeppelin-server | 0.10.1 | Web-based notebook that enables interactive data analytics. | 
| zookeeper-server | 3.5.10 | Centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. | 
| zookeeper-client | 3.5.10 | ZooKeeper command line client. | 

## 6.9.0 configuration classifications
<a name="emr-690-class"></a>

Configuration classifications allow you to customize applications. These often correspond to a configuration XML file for the application, such as `hive-site.xml`. For more information, see [Configure applications](emr-configure-apps.md).

Reconfiguration actions occur when you specify a configuration for instance groups in a running cluster. Amazon EMR only initiates reconfiguration actions for the classifications that you modify. For more information, see [Reconfigure an instance group in a running cluster](emr-configure-apps-running-cluster.md).
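
As an illustration of how a classification maps to an application's configuration file, the sketch below builds a configuration list for the `hive-site` classification and serializes it to JSON. The helper names `make_configuration` and `_as_property_value` are hypothetical, and the property and value shown (`hive.execution.engine` set to `tez`) are placeholders chosen for the example, not recommendations. Property values are converted to strings on the assumption that the configuration object expects string values.

```python
import json


def _as_property_value(value):
    """Convert a Python value to the string form used in a Properties map."""
    if isinstance(value, bool):
        # Write booleans as lowercase "true"/"false" rather than Python's "True"/"False".
        return "true" if value else "false"
    return str(value)


def make_configuration(classification, properties):
    """Build one configuration object: a classification name plus its properties."""
    return {
        "Classification": classification,
        "Properties": {key: _as_property_value(val) for key, val in properties.items()},
    }


# "hive-site" corresponds to Hive's hive-site.xml file.
configurations = [
    make_configuration("hive-site", {"hive.execution.engine": "tez"}),
]

print(json.dumps(configurations, indent=2))
```

The resulting JSON can be saved to a file and supplied when you create a cluster or reconfigure an instance group; see the linked topics for the supported ways to pass it.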


**emr-6.9.0 classifications**  

| Classifications | Description | Reconfiguration Actions | 
| --- | --- | --- | 
| capacity-scheduler | Change values in Hadoop's capacity-scheduler.xml file. | Restarts the ResourceManager service. | 
| container-executor | Change values in Hadoop YARN's container-executor.cfg file. | Not available. | 
| container-log4j | Change values in Hadoop YARN's container-log4j.properties file. | Not available. | 
| core-site | Change values in Hadoop's core-site.xml file. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Ranger KMS, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| docker-conf | Change docker related settings. | Not available. | 
| emrfs-site | Change EMRFS settings. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts HBaseRegionserver, HBaseMaster, HBaseThrift, HBaseRest, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| flink-conf | Change flink-conf.yaml settings. | Restarts Flink history server. | 
| flink-log4j | Change Flink log4j.properties settings. | Restarts Flink history server. | 
| flink-log4j-session | Change Flink log4j-session.properties settings for Kubernetes/Yarn session. | Restarts Flink history server. | 
| flink-log4j-cli | Change Flink log4j-cli.properties settings. | Restarts Flink history server. | 
| hadoop-env | Change values in the Hadoop environment for all Hadoop components. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts PhoenixQueryserver, HiveServer2, Hive MetaStore, and MapReduce-HistoryServer. | 
| hadoop-log4j | Change values in Hadoop's log4j.properties file. | Restarts the Hadoop HDFS services SecondaryNamenode, Datanode, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| hadoop-ssl-server | Change hadoop ssl server configuration | Not available. | 
| hadoop-ssl-client | Change hadoop ssl client configuration | Not available. | 
| hbase | Amazon EMR-curated settings for Apache HBase. | Custom EMR specific property. Sets emrfs-site and hbase-site configs. See those for their associated restarts. | 
| hbase-env | Change values in HBase's environment. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-log4j | Change values in HBase's hbase-log4j.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-metrics | Change values in HBase's hadoop-metrics2-hbase.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-policy | Change values in HBase's hbase-policy.xml file. | Not available. | 
| hbase-site | Change values in HBase's hbase-site.xml file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. Additionally restarts Phoenix QueryServer. | 
| hdfs-encryption-zones | Configure HDFS encryption zones. | This classification should not be reconfigured. | 
| hdfs-env | Change values in the HDFS environment. | Restarts Hadoop HDFS services Namenode, Datanode, and ZKFC. | 
| hdfs-site | Change values in HDFS's hdfs-site.xml. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Additionally restarts Hadoop Httpfs. | 
| hcatalog-env | Change values in HCatalog's environment. | Restarts Hive HCatalog Server. | 
| hcatalog-server-jndi | Change values in HCatalog's jndi.properties. | Restarts Hive HCatalog Server. | 
| hcatalog-server-proto-hive-site | Change values in HCatalog's proto-hive-site.xml. | Restarts Hive HCatalog Server. | 
| hcatalog-webhcat-env | Change values in HCatalog WebHCat's environment. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-log4j2 | Change values in HCatalog WebHCat's log4j2.properties. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-site | Change values in HCatalog WebHCat's webhcat-site.xml file. | Restarts Hive WebHCat server. | 
| hive | Amazon EMR-curated settings for Apache Hive. | Sets configurations to launch Hive LLAP service. | 
| hive-beeline-log4j2 | Change values in Hive's beeline-log4j2.properties file. | Not available. | 
| hive-parquet-logging | Change values in Hive's parquet-logging.properties file. | Not available. | 
| hive-env | Change values in the Hive environment. | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. | 
| hive-exec-log4j2 | Change values in Hive's hive-exec-log4j2.properties file. | Not available. | 
| hive-llap-daemon-log4j2 | Change values in Hive's llap-daemon-log4j2.properties file. | Not available. | 
| hive-log4j2 | Change values in Hive's hive-log4j2.properties file. | Not available. | 
| hive-site | Change values in Hive's hive-site.xml file | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. Also restarts Oozie and Zeppelin. | 
| hiveserver2-site | Change values in Hive Server2's hiveserver2-site.xml file | Not available. | 
| hue-ini | Change values in Hue's ini file | Restarts Hue. Also activates Hue config override CLI commands to pick up new configurations. | 
| httpfs-env | Change values in the HTTPFS environment. | Restarts Hadoop Httpfs service. | 
| httpfs-site | Change values in Hadoop's httpfs-site.xml file. | Restarts Hadoop Httpfs service. | 
| hadoop-kms-acls | Change values in Hadoop's kms-acls.xml file. | Not available. | 
| hadoop-kms-env | Change values in the Hadoop KMS environment. | Restarts Hadoop-KMS service. | 
| hadoop-kms-log4j | Change values in Hadoop's kms-log4j.properties file. | Not available. | 
| hadoop-kms-site | Change values in Hadoop's kms-site.xml file. | Restarts Hadoop-KMS and Ranger-KMS service. | 
| hudi-env | Change values in the Hudi environment. | Not available. | 
| hudi-defaults | Change values in Hudi's hudi-defaults.conf file. | Not available. | 
| iceberg-defaults | Change values in Iceberg's iceberg-defaults.conf file. | Not available. | 
| delta-defaults | Change values in Delta's delta-defaults.conf file. | Not available. | 
| jupyter-notebook-conf | Change values in Jupyter Notebook's jupyter_notebook_config.py file. | Not available. | 
| jupyter-hub-conf | Change values in JupyterHub's jupyterhub_config.py file. | Not available. | 
| jupyter-s3-conf | Configure Jupyter Notebook S3 persistence. | Not available. | 
| jupyter-sparkmagic-conf | Change values in Sparkmagic's config.json file. | Not available. | 
| livy-conf | Change values in Livy's livy.conf file. | Restarts Livy Server. | 
| livy-env | Change values in the Livy environment. | Restarts Livy Server. | 
| livy-log4j2 | Change Livy log4j2.properties settings. | Restarts Livy Server. | 
| mapred-env | Change values in the MapReduce application's environment. | Restarts Hadoop MapReduce-HistoryServer. | 
| mapred-site | Change values in the MapReduce application's mapred-site.xml file. | Restarts Hadoop MapReduce-HistoryServer. | 
| oozie-env | Change values in Oozie's environment. | Restarts Oozie. | 
| oozie-log4j | Change values in Oozie's oozie-log4j.properties file. | Restarts Oozie. | 
| oozie-site | Change values in Oozie's oozie-site.xml file. | Restarts Oozie. | 
| phoenix-hbase-metrics | Change values in Phoenix's hadoop-metrics2-hbase.properties file. | Not available. | 
| phoenix-hbase-site | Change values in Phoenix's hbase-site.xml file. | Not available. | 
| phoenix-log4j | Change values in Phoenix's log4j.properties file. | Restarts Phoenix-QueryServer. | 
| phoenix-metrics | Change values in Phoenix's hadoop-metrics2-phoenix.properties file. | Not available. | 
| pig-env | Change values in the Pig environment. | Not available. | 
| pig-properties | Change values in Pig's pig.properties file. | Restarts Oozie. | 
| pig-log4j | Change values in Pig's log4j.properties file. | Not available. | 
| presto-log | Change values in Presto's log.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-config | Change values in Presto's config.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-password-authenticator | Change values in Presto's password-authenticator.properties file. | Not available. | 
| presto-env | Change values in Presto's presto-env.sh file. | Restarts Presto-Server (for PrestoDB) | 
| presto-node | Change values in Presto's node.properties file. | Not available. | 
| presto-connector-blackhole | Change values in Presto's blackhole.properties file. | Not available. | 
| presto-connector-cassandra | Change values in Presto's cassandra.properties file. | Not available. | 
| presto-connector-hive | Change values in Presto's hive.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-jmx | Change values in Presto's jmx.properties file. | Not available. | 
| presto-connector-kafka | Change values in Presto's kafka.properties file. | Not available. | 
| presto-connector-lakeformation | Change values in Presto's lakeformation.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-localfile | Change values in Presto's localfile.properties file. | Not available. | 
| presto-connector-memory | Change values in Presto's memory.properties file. | Not available. | 
| presto-connector-mongodb | Change values in Presto's mongodb.properties file. | Not available. | 
| presto-connector-mysql | Change values in Presto's mysql.properties file. | Not available. | 
| presto-connector-postgresql | Change values in Presto's postgresql.properties file. | Not available. | 
| presto-connector-raptor | Change values in Presto's raptor.properties file. | Not available. | 
| presto-connector-redis | Change values in Presto's redis.properties file. | Not available. | 
| presto-connector-redshift | Change values in Presto's redshift.properties file. | Not available. | 
| presto-connector-tpch | Change values in Presto's tpch.properties file. | Not available. | 
| presto-connector-tpcds | Change values in Presto's tpcds.properties file. | Not available. | 
| trino-log | Change values in Trino's log.properties file. | Restarts Trino-Server (for Trino) | 
| trino-config | Change values in Trino's config.properties file. | Restarts Trino-Server (for Trino) | 
| trino-password-authenticator | Change values in Trino's password-authenticator.properties file. | Restarts Trino-Server (for Trino) | 
| trino-env | Change values in Trino's trino-env.sh file. | Restarts Trino-Server (for Trino) | 
| trino-node | Change values in Trino's node.properties file. | Not available. | 
| trino-connector-blackhole | Change values in Trino's blackhole.properties file. | Not available. | 
| trino-connector-cassandra | Change values in Trino's cassandra.properties file. | Not available. | 
| trino-connector-delta | Change values in Trino's delta.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-hive | Change values in Trino's hive.properties file. | Restarts Trino-Server (for Trino) | 
| trino-exchange-manager | Change values in Trino's exchange-manager.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-iceberg | Change values in Trino's iceberg.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-jmx | Change values in Trino's jmx.properties file. | Not available. | 
| trino-connector-kafka | Change values in Trino's kafka.properties file. | Not available. | 
| trino-connector-localfile | Change values in Trino's localfile.properties file. | Not available. | 
| trino-connector-memory | Change values in Trino's memory.properties file. | Not available. | 
| trino-connector-mongodb | Change values in Trino's mongodb.properties file. | Not available. | 
| trino-connector-mysql | Change values in Trino's mysql.properties file. | Not available. | 
| trino-connector-postgresql | Change values in Trino's postgresql.properties file. | Not available. | 
| trino-connector-raptor | Change values in Trino's raptor.properties file. | Not available. | 
| trino-connector-redis | Change values in Trino's redis.properties file. | Not available. | 
| trino-connector-redshift | Change values in Trino's redshift.properties file. | Not available. | 
| trino-connector-tpch | Change values in Trino's tpch.properties file. | Not available. | 
| trino-connector-tpcds | Change values in Trino's tpcds.properties file. | Not available. | 
| ranger-kms-dbks-site | Change values in dbks-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-site | Change values in ranger-kms-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-env | Change values in the Ranger KMS environment. | Restarts Ranger KMS Server. | 
| ranger-kms-log4j | Change values in kms-log4j.properties file of Ranger KMS. | Not available. | 
| ranger-kms-db-ca | Change values for CA file on S3 for MySQL SSL connection with Ranger KMS. | Not available. | 
| spark | Amazon EMR-curated settings for Apache Spark. | This property modifies spark-defaults. See actions there. | 
| spark-defaults | Change values in Spark's spark-defaults.conf file. | Restarts Spark history server and Spark thrift server. | 
| spark-env | Change values in the Spark environment. | Restarts Spark history server and Spark thrift server. | 
| spark-hive-site | Change values in Spark's hive-site.xml file | Not available. | 
| spark-log4j2 | Change values in Spark's log4j2.properties file. | Restarts Spark history server and Spark thrift server. | 
| spark-metrics | Change values in Spark's metrics.properties file. | Restarts Spark history server and Spark thrift server. | 
| sqoop-env | Change values in Sqoop's environment. | Not available. | 
| sqoop-oraoop-site | Change values in Sqoop OraOop's oraoop-site.xml file. | Not available. | 
| sqoop-site | Change values in Sqoop's sqoop-site.xml file. | Not available. | 
| tez-site | Change values in Tez's tez-site.xml file. | Restarts Oozie and HiveServer2. | 
| yarn-env | Change values in the YARN environment. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts MapReduce-HistoryServer. | 
| yarn-site | Change values in YARN's yarn-site.xml file. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Livy Server and MapReduce-HistoryServer. | 
| zeppelin-env | Change values in the Zeppelin environment. | Restarts Zeppelin. | 
| zeppelin-site | Change configuration settings in zeppelin-site.xml. | Restarts Zeppelin. | 
| zookeeper-config | Change values in ZooKeeper's zoo.cfg file. | Restarts Zookeeper server. | 
| zookeeper-log4j | Change values in ZooKeeper's log4j.properties file. | Restarts Zookeeper server. | 

## 6.9.0 change log
<a name="690-changelog"></a>


**Change log for 6.9.0 release and release notes**  

| Date | Event | Description | 
| --- | --- | --- | 
| 2023-08-30 | Update release notes | Added fix for timing sequence mismatch issue | 
| 2023-08-21 | Update release notes | Added a known issue with Hadoop 3.3.3. | 
| 2023-07-26 | Update | New OS release labels 2.0.20230612.0 and 2.0.20230628.0. | 
| 2022-12-13 | Release notes updated | Added feature and known issue for runtime with SageMaker AI | 
| 2022-11-29 | Release notes and documentation updated | Added feature for Amazon Redshift integration for Apache Spark | 
| 2022-11-23 | Release notes updated | Removed Log4j entry | 
| 2022-11-18 | Deployment complete | Amazon EMR 6.9 fully deployed to all [supported Regions](https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/) | 
| 2022-11-18 | Docs publication | Amazon EMR 6.9 release notes first published | 
| 2022-11-14 | Initial release | Amazon EMR 6.9 deployed to limited commercial Regions | 

# Amazon EMR release 6.8.1
<a name="emr-681-release"></a>

## 6.8.1 application versions
<a name="emr-681-app-versions"></a>

This release includes the following applications: [Flink](https://flink.apache.org/), [Ganglia](http://ganglia.info), [HBase](http://hbase.apache.org/), [HCatalog](https://cwiki.apache.org/confluence/display/Hive/HCatalog), [Hadoop](http://hadoop.apache.org/docs/current/), [Hive](http://hive.apache.org/), [Hudi](https://hudi.apache.org), [Hue](http://gethue.com/), [Iceberg](https://iceberg.apache.org/), [JupyterEnterpriseGateway](https://jupyter-enterprise-gateway.readthedocs.io/en/latest/), [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/#), [Livy](https://livy.incubator.apache.org/), [MXNet](https://mxnet.incubator.apache.org/), [Oozie](http://oozie.apache.org/), [Phoenix](https://phoenix.apache.org/), [Pig](http://pig.apache.org/), [Presto](https://prestodb.io/), [Spark](https://spark.apache.org/docs/latest/), [Sqoop](http://sqoop.apache.org/), [TensorFlow](https://www.tensorflow.org/), [Tez](https://tez.apache.org/), [Trino](https://trino.io/), [Zeppelin](https://zeppelin.incubator.apache.org/), and [ZooKeeper](https://zookeeper.apache.org).

The table below lists the application versions available in this release of Amazon EMR and the application versions in the preceding three Amazon EMR releases (when applicable).

For a comprehensive history of application versions for each release of Amazon EMR, see the following topics:
+ [Application versions in Amazon EMR 7.x releases](emr-release-app-versions-7.x.md)
+ [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md)
+ [Application versions in Amazon EMR 5.x releases](emr-release-app-versions-5.x.md)
+ [Application versions in Amazon EMR 4.x releases](emr-release-app-versions-4.x.md)


**Application version information**  

|  | emr-6.8.1 | emr-6.8.0 | emr-6.7.0 | emr-6.6.0 | 
| --- | --- | --- | --- | --- | 
| AWS SDK for Java | 1.12.170 | 1.12.170 | 1.12.170 | 1.12.170 | 
| Python | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 
| Scala | 2.12.15 | 2.12.15 | 2.12.15 | 2.12.10 | 
| AmazonCloudWatchAgent |  -  |  -  |  -  |  -  | 
| Delta |  -  |  -  |  -  |  -  | 
| Flink | 1.15.1 | 1.15.1 | 1.14.2 | 1.14.2 | 
| Ganglia | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 
| HBase | 2.4.12-amzn-0.1 | 2.4.12-amzn-0 | 2.4.4-amzn-3 | 2.4.4-amzn-2 | 
| HCatalog | 3.1.3-amzn-1.1 | 3.1.3-amzn-1 | 3.1.3-amzn-0 | 3.1.2-amzn-7 | 
| Hadoop | 3.2.1-amzn-8.1 | 3.2.1-amzn-8 | 3.2.1-amzn-7 | 3.2.1-amzn-6 | 
| Hive | 3.1.3-amzn-1.1 | 3.1.3-amzn-1 | 3.1.3-amzn-0 | 3.1.2-amzn-7 | 
| Hudi | 0.11.1-amzn-0 | 0.11.1-amzn-0 | 0.11.0-amzn-0 | 0.10.1-amzn-0 | 
| Hue | 4.10.0 | 4.10.0 | 4.10.0 | 4.10.0 | 
| Iceberg | 0.14.0-amzn-0 | 0.14.0-amzn-0 | 0.13.1-amzn-0 | 0.13.1 | 
| JupyterEnterpriseGateway | 2.1.0 | 2.1.0 | 2.1.0 | 2.1.0 | 
| JupyterHub | 1.4.1 | 1.4.1 | 1.4.1 | 1.4.1 | 
| Livy | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 
| MXNet | 1.9.1 | 1.9.1 | 1.8.0 | 1.8.0 | 
| Mahout |  -  |  -  |  -  |  -  | 
| Oozie | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 
| Phoenix | 5.1.2 | 5.1.2 | 5.1.2 | 5.1.2 | 
| Pig | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 
| Presto | 0.273.3-amzn-0 | 0.273.3-amzn-0 | 0.272-amzn-0 | 0.267-amzn-0 | 
| Spark | 3.3.0-amzn-0.1 | 3.3.0-amzn-0 | 3.2.1-amzn-0 | 3.2.0-amzn-0 | 
| Sqoop | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 
| TensorFlow | 2.9.1 | 2.9.1 | 2.4.1 | 2.4.1 | 
| Tez | 0.9.2 | 0.9.2 | 0.9.2 | 0.9.2 | 
| Trino (PrestoSQL) | 388-amzn-0 | 388-amzn-0 | 378-amzn-0 | 367-amzn-0 | 
| Zeppelin | 0.10.1 | 0.10.1 | 0.10.0 | 0.10.0 | 
| ZooKeeper | 3.5.10 | 3.5.10 | 3.5.7 | 3.5.7 | 

## 6.8.1 release notes
<a name="emr-681-relnotes"></a>

The following release notes include information for Amazon EMR release 6.8.1. Changes are relative to 6.8.0. For information on the release timeline, see the [6.8.1 change log](#681-changelog).

**Changes, enhancements, and resolved issues**
+ Hadoop 3.3.3 introduced a change in YARN ([YARN-9608](https://issues.apache.org/jira/browse/YARN-9608)) that keeps nodes where containers ran in a decommissioning state until the application completes. This change ensures that local data such as shuffle data doesn't get lost, and you don't need to re-run the job. This approach might also lead to underutilization of resources on clusters with or without managed scaling enabled.

  With Amazon EMR releases 6.11.0 and higher as well as 6.8.1, 6.9.1, and 6.10.1, the value of `yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications` is set to `false` in `yarn-site.xml` to resolve this issue.

  While the fix addresses the issues that were introduced by YARN-9608, it might cause Hive jobs to fail due to shuffle data loss on clusters that have managed scaling enabled. We've mitigated that risk in this release by also setting `yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-shuffle-data` for Hive workloads. This config is only available with Amazon EMR releases 6.11.0 and higher.
+ Metrics collector won't send any metrics to the control plane after failover of primary node in clusters with the instance groups configuration.
+ This release eliminates retries on failed HTTP requests to metrics collector endpoints.
+ This release includes a change that allows high-availability clusters to recover from failed state after restart.
+ This release fixes an issue where large user-created UIDs caused overflow exceptions.
+ This release fixes timeout issues with the Amazon EMR reconfiguration process.
+ This release prevents an issue where failed reconfiguration might break other, unrelated processes.
+ This release includes security fixes.
+ This release fixes an issue where clusters that are running workloads on Spark with Amazon EMR might silently receive incorrect results with `contains`, `startsWith`, `endsWith`, and `like`. This issue occurs when you use the expressions on partitioned fields that have metadata in the Amazon EMR Hive3 Metastore Server (HMS).
+ With Amazon EMR 6.6.0 through 6.9.x, INSERT queries with dynamic partition and an ORDER BY or SORT BY clause will always have two reducers. This issue is caused by OSS change [HIVE-20703](https://issues.apache.org/jira/browse/HIVE-20703), which puts dynamic sort partition optimization under cost-based decision. If your workload doesn't require sorting of dynamic partitions, we recommend that you set the `hive.optimize.sort.dynamic.partition.threshold` property to `-1` to disable the new feature and get the correctly calculated number of reducers. This issue is fixed in OSS Hive as part of [HIVE-22269](https://issues.apache.org/jira/browse/HIVE-22269) and is fixed in Amazon EMR 6.10.0.
+ Hive might experience data loss when you use HDFS as a scratch directory and you have enabled merge small files, and the table contains static partition paths.
+ This release fixes a performance issue with Hive that occurs when merge small files (disabled by default) is enabled at the end of an ETL job.
+ This release fixes an issue with throttling on the Glue side when there are no user-defined functions (UDF).
+ This release fixes an issue where the node log aggregation service deleted container logs before log pusher could push them to S3 during YARN decommissioning.
+ This release fixes handling of compacted/archived files with persistent storefile tracking for HBase.
+ This release fixes an issue that impacted Spark performance when you set a default `true` value for the `spark.yarn.heterogeneousExecutors.enabled` config in `spark-defaults.conf`.
+ This release fixes an issue with Reduce Task failing to read shuffle data. The issue caused Hive query failures with a corrupted memory error.
+ This release fixes an issue that caused the node provisioner to fail if the HDFS NameNode (NN) service was stuck in safemode during node replacement.
+ This release adds a new retry mechanism to the cluster scaling workflow for EMR clusters that run Presto or Trino. This improvement reduces the risk that cluster resizing will indefinitely stall due to a single failed resize operation. It also improves cluster utilization, because your cluster scales up and down faster.
+ This release improves cluster scale-down logic so that your cluster doesn't attempt a scale-down of core nodes below the HDFS replication factor setting for the cluster. This aligns with your data redundancy requirements, and reduces the chance that a scaling operation might stall.
+ The log management daemon has been upgraded to identify all logs that are in active use with open file handles on the local instance storage, and the associated processes. This upgrade ensures that Amazon EMR properly deletes the files and reclaims storage space after the logs are archived to Amazon S3.
+ This release includes a log-management daemon enhancement that deletes empty, unused steps directories in the local cluster file system. An excessively large number of empty directories can degrade the performance of Amazon EMR daemons and result in disk over-utilization.
+ This release fixes an issue that might occur when you create an edge node by replicating one of the primary nodes from a cluster with multiple primary nodes. The replicated edge node could cause delays with scale-down operations, or result in high memory-utilization on the primary nodes. For more information on how to create an edge node to communicate with your EMR cluster, see [Edge Node Creator](https://github.com/aws-samples/aws-emr-utilities/tree/main/utilities/emr-edge-node-creator) in the `aws-samples` repo on GitHub.
+ This release improves the automation process that Amazon EMR uses to re-mount Amazon EBS volumes to an instance after a reboot.
+ This release fixes an issue that resulted in intermittent gaps in the Hadoop metrics that Amazon EMR publishes to Amazon CloudWatch.
+ This release fixes an issue with EMR clusters where an update to the YARN configuration file that contains the exclusion list of nodes for the cluster is interrupted due to disk over-utilization. The incomplete update hinders future cluster scale-down operations. This release ensures that your cluster remains healthy, and that scaling operations work as expected.
+ This release improves the on-cluster log management daemon to monitor additional log folders in your EMR cluster. This improvement minimizes disk over-utilization scenarios.
+ This release automatically restarts the on-cluster log management daemon when it stops. This improvement reduces the risk for nodes to appear unhealthy due to disk over-utilization. 
+ This release adds support for archiving logs to Amazon S3 during cluster scale-down. Previously, you could only archive log files to Amazon S3 during cluster termination. The new capability ensures that log files generated on the cluster persist on Amazon S3 even after the node is terminated. For more information, see [Configure cluster logging and debugging](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-debugging.html).
+ This release fixes an issue that occurred when the Amazon S3 URI for a bootstrap action ended with a port number, for example: `a.b.c.d:4345`. Amazon EMR was incorrectly parsing these URIs, so any associated bootstrap actions would fail.
+ This release fixes a timing sequence mismatch issue between Apache BigTop and the Amazon EMR on EC2 cluster startup sequence. This timing sequence mismatch occurs when a system attempts to perform two or more operations at the same time instead of doing them in the proper sequence. As a result, certain cluster configurations experienced instance startup timeouts and slower cluster startup times.
+ When you launch a cluster with *the latest patch release* of Amazon EMR 5.36 or higher, 6.6 or higher, or 7.0 or higher, Amazon EMR uses the latest Amazon Linux 2023 or Amazon Linux 2 release for the default Amazon EMR AMI. For more information, see [Using the default Amazon Linux AMI for Amazon EMR](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-default-ami.html).    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-681-release.html)
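
The release ranges named in the last note can be checked programmatically. The following is a minimal sketch, assuming release labels follow the `emr-x.x.x` form. The function names `parse_release_label` and `uses_latest_amazon_linux` are hypothetical and not part of any AWS SDK, and the check can't tell whether a given label is the *latest patch release* in its line.

```python
import re

# Minimum minor version per major line, taken from the note above:
# 5.36 or higher, 6.6 or higher, 7.0 or higher.
MIN_MINOR_BY_MAJOR = {5: 36, 6: 6, 7: 0}


def parse_release_label(label):
    """Split a label such as 'emr-6.8.1' into (major, minor, patch)."""
    match = re.fullmatch(r"emr-(\d+)\.(\d+)\.(\d+)", label)
    if match is None:
        raise ValueError(f"not a release label of the form emr-x.x.x: {label!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def uses_latest_amazon_linux(label):
    """Return True if the label falls in one of the ranges named in the note.

    This only checks the major.minor range. It does not verify that the
    label is the latest patch release of that line.
    """
    major, minor, _patch = parse_release_label(label)
    if major in MIN_MINOR_BY_MAJOR:
        return minor >= MIN_MINOR_BY_MAJOR[major]
    # Anything above the highest listed major counts as "7.0 or higher".
    return major > max(MIN_MINOR_BY_MAJOR)
```

For example, `uses_latest_amazon_linux("emr-6.8.1")` is true, while `uses_latest_amazon_linux("emr-6.5.0")` is false because 6.5 is below the 6.6 threshold.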

## 6.8.1 component versions
<a name="emr-681-components"></a>

The components that Amazon EMR installs with this release are listed below. Some are installed as part of big-data application packages. Others are unique to Amazon EMR and installed for system processes and features. These typically start with `emr` or `aws`. Big-data application packages in the most recent Amazon EMR release are usually the latest version found in the community. We make community releases available in Amazon EMR as quickly as possible.

Some components in Amazon EMR differ from community versions. These components have a version label in the form `CommunityVersion-amzn-EmrVersion`. The `EmrVersion` starts at 0. For example, if an open source community component named `myapp-component` with version 2.2 has been modified three times for inclusion in different Amazon EMR releases, its release version is listed as `2.2-amzn-2`.
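
The label format can be split mechanically when you compare component versions across releases. The following is a minimal sketch. The names `ComponentVersion` and `parse_component_version` are hypothetical helpers, not part of any AWS tooling. The `EmrVersion` part is kept as a string because some labels in the table below carry a dotted suffix such as `8.1`, which this section doesn't explain.

```python
from typing import NamedTuple, Optional


class ComponentVersion(NamedTuple):
    community_version: str
    emr_version: Optional[str]  # None when the label has no -amzn- part


def parse_component_version(label):
    """Split 'CommunityVersion-amzn-EmrVersion' into its two parts.

    Labels without the '-amzn-' marker (for example '1.4.2' or
    '0.7.1-incubating') are returned unchanged as the community version.
    """
    community, marker, emr = label.partition("-amzn-")
    if not marker:
        return ComponentVersion(label, None)
    return ComponentVersion(community, emr)
```

With this sketch, `parse_component_version("2.2-amzn-2")` yields community version `2.2` and EMR version `2`, which under the numbering above is the third modification.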


| Component | Version | Description | 
| --- | --- | --- | 
| aws-sagemaker-spark-sdk | 1.4.2 | Amazon SageMaker Spark SDK | 
| emr-ddb | 4.16.0 | Amazon DynamoDB connector for Hadoop ecosystem applications. | 
| emr-goodies | 3.2.0 | Extra convenience libraries for the Hadoop ecosystem. | 
| emr-kinesis | 3.5.0 | Amazon Kinesis connector for Hadoop ecosystem applications. | 
| emr-notebook-env | 1.7.0 | Conda env for emr notebook which includes jupyter enterprise gateway | 
| emr-s3-dist-cp | 2.22.0 | Distributed copy application optimized for Amazon S3. | 
| emr-s3-select | 2.1.0 | EMR S3Select Connector | 
| emrfs | 2.53.0 | Amazon S3 connector for Hadoop ecosystem applications. | 
| flink-client | 1.15.1 | Apache Flink command line client scripts and applications. | 
| flink-jobmanager-config | 1.15.1 | Managing resources on EMR nodes for Apache Flink JobManager. | 
| ganglia-monitor | 3.7.2 | Embedded Ganglia agent for Hadoop ecosystem applications along with the Ganglia monitoring agent. | 
| ganglia-metadata-collector | 3.7.2 | Ganglia metadata collector for aggregating metrics from Ganglia monitoring agents. | 
| ganglia-web | 3.7.1 | Web application for viewing metrics collected by the Ganglia metadata collector. | 
| hadoop-client | 3.2.1-amzn-8.1 | Hadoop command-line clients such as 'hdfs', 'hadoop', or 'yarn'. | 
| hadoop-hdfs-datanode | 3.2.1-amzn-8.1 | HDFS node-level service for storing blocks. | 
| hadoop-hdfs-library | 3.2.1-amzn-8.1 | HDFS command-line client and library | 
| hadoop-hdfs-namenode | 3.2.1-amzn-8.1 | HDFS service for tracking file names and block locations. | 
| hadoop-hdfs-journalnode | 3.2.1-amzn-8.1 | HDFS service for managing the Hadoop filesystem journal on HA clusters. | 
| hadoop-httpfs-server | 3.2.1-amzn-8.1 | HTTP endpoint for HDFS operations. | 
| hadoop-kms-server | 3.2.1-amzn-8.1 | Cryptographic key management server based on Hadoop's KeyProvider API. | 
| hadoop-mapred | 3.2.1-amzn-8.1 | MapReduce execution engine libraries for running a MapReduce application. | 
| hadoop-yarn-nodemanager | 3.2.1-amzn-8.1 | YARN service for managing containers on an individual node. | 
| hadoop-yarn-resourcemanager | 3.2.1-amzn-8.1 | YARN service for allocating and managing cluster resources and distributed applications. | 
| hadoop-yarn-timeline-server | 3.2.1-amzn-8.1 | Service for retrieving current and historical information for YARN applications. | 
| hbase-hmaster | 2.4.12-amzn-0.1 | Service for an HBase cluster responsible for coordination of Regions and execution of administrative commands. | 
| hbase-region-server | 2.4.12-amzn-0.1 | Service for serving one or more HBase regions. | 
| hbase-client | 2.4.12-amzn-0.1 | HBase command-line client. | 
| hbase-rest-server | 2.4.12-amzn-0.1 | Service providing a RESTful HTTP endpoint for HBase. | 
| hbase-thrift-server | 2.4.12-amzn-0.1 | Service providing a Thrift endpoint to HBase. | 
| hbase-operator-tools | 2.4.12-amzn-0.1 | Repair tool for Apache HBase clusters. | 
| hcatalog-client | 3.1.3-amzn-1.1 | The 'hcat' command line client for manipulating hcatalog-server. | 
| hcatalog-server | 3.1.3-amzn-1.1 | Service providing HCatalog, a table and storage management layer for distributed applications. | 
| hcatalog-webhcat-server | 3.1.3-amzn-1.1 | HTTP endpoint providing a REST interface to HCatalog. | 
| hive-client | 3.1.3-amzn-1.1 | Hive command line client. | 
| hive-hbase | 3.1.3-amzn-1.1 | Hive-hbase client. | 
| hive-metastore-server | 3.1.3-amzn-1.1 | Service for accessing the Hive metastore, a semantic repository storing metadata for SQL on Hadoop operations. | 
| hive-server2 | 3.1.3-amzn-1.1 | Service for accepting Hive queries as web requests. | 
| hudi | 0.11.1-amzn-0 | Incremental processing framework to power data pipeline at low latency and high efficiency. | 
| hudi-presto | 0.11.1-amzn-0 | Bundle library for running Presto with Hudi. | 
| hudi-trino | 0.11.1-amzn-0 | Bundle library for running Trino with Hudi. | 
| hudi-spark | 0.11.1-amzn-0 | Bundle library for running Spark with Hudi. | 
| hue-server | 4.10.0 | Web application for analyzing data using Hadoop ecosystem applications | 
| iceberg | 0.14.0-amzn-0 | Apache Iceberg is an open table format for huge analytic datasets | 
| jupyterhub | 1.4.1 | Multi-user server for Jupyter notebooks | 
| livy-server | 0.7.1-incubating | REST interface for interacting with Apache Spark | 
| nginx | 1.12.1 | nginx [engine x] is an HTTP and reverse proxy server | 
| mxnet | 1.9.1 | A flexible, scalable, and efficient library for deep learning. | 
| mariadb-server | 5.5.68+ | MariaDB database server. | 
| nvidia-cuda | 11.7.0 | Nvidia drivers and Cuda toolkit | 
| oozie-client | 5.2.1 | Oozie command-line client. | 
| oozie-server | 5.2.1 | Service for accepting Oozie workflow requests. | 
| opencv | 4.5.0 | Open Source Computer Vision Library. | 
| phoenix-library | 5.1.2 | The phoenix libraries for server and client | 
| phoenix-connectors | 5.1.2 | Apache Phoenix-Connectors for Spark-3 | 
| phoenix-query-server | 5.1.2 | A light weight server providing JDBC access as well as Protocol Buffers and JSON format access to the Avatica API  | 
| presto-coordinator | 0.273.3-amzn-0 | Service for accepting queries and managing query execution among presto-workers. | 
| presto-worker | 0.273.3-amzn-0 | Service for executing pieces of a query. | 
| presto-client | 0.273.3-amzn-0 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| trino-coordinator | 388-amzn-0 | Service for accepting queries and managing query execution among trino-workers. | 
| trino-worker | 388-amzn-0 | Service for executing pieces of a query. | 
| trino-client | 388-amzn-0 | Trino command-line client which is installed on an HA cluster's stand-by masters where Trino server is not started. | 
| pig-client | 0.17.0 | Pig command-line client. | 
| r | 4.0.2 | The R Project for Statistical Computing | 
| ranger-kms-server | 2.0.0 | Apache Ranger Key Management System | 
| spark-client | 3.3.0-amzn-0.1 | Spark command-line clients. | 
| spark-history-server | 3.3.0-amzn-0.1 | Web UI for viewing logged events for the lifetime of a completed Spark application. | 
| spark-on-yarn | 3.3.0-amzn-0.1 | In-memory execution engine for YARN. | 
| spark-yarn-slave | 3.3.0-amzn-0.1 | Apache Spark libraries needed by YARN slaves. | 
| spark-rapids | 22.06.0-amzn-0 | Nvidia Spark RAPIDS plugin that accelerates Apache Spark with GPUs. | 
| sqoop-client | 1.4.7 | Apache Sqoop command-line client. | 
| tensorflow | 2.9.1 | TensorFlow open source software library for high performance numerical computation. | 
| tez-on-yarn | 0.9.2 | The tez YARN application and libraries. | 
| webserver | 2.4.41+ | Apache HTTP server. | 
| zeppelin-server | 0.10.1 | Web-based notebook that enables interactive data analytics. | 
| zookeeper-server | 3.5.10 | Centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. | 
| zookeeper-client | 3.5.10 | ZooKeeper command line client. | 

## 6.8.1 configuration classifications
<a name="emr-681-class"></a>

Configuration classifications allow you to customize applications. These often correspond to a configuration XML file for the application, such as `hive-site.xml`. For more information, see [Configure applications](emr-configure-apps.md).

Reconfiguration actions occur when you specify a configuration for instance groups in a running cluster. Amazon EMR only initiates reconfiguration actions for the classifications that you modify. For more information, see [Reconfigure an instance group in a running cluster](emr-configure-apps-running-cluster.md).


**emr-6.8.1 classifications**  

| Classifications | Description | Reconfiguration Actions | 
| --- | --- | --- | 
| capacity-scheduler | Change values in Hadoop's capacity-scheduler.xml file. | Restarts the ResourceManager service. | 
| container-executor | Change values in Hadoop YARN's container-executor.cfg file. | Not available. | 
| container-log4j | Change values in Hadoop YARN's container-log4j.properties file. | Not available. | 
| core-site | Change values in Hadoop's core-site.xml file. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Ranger KMS, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| docker-conf | Change docker related settings. | Not available. | 
| emrfs-site | Change EMRFS settings. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts HBaseRegionserver, HBaseMaster, HBaseThrift, HBaseRest, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| flink-conf | Change flink-conf.yaml settings. | Restarts Flink history server. | 
| flink-log4j | Change Flink log4j.properties settings. | Restarts Flink history server. | 
| flink-log4j-session | Change Flink log4j-session.properties settings for Kubernetes/Yarn session. | Restarts Flink history server. | 
| flink-log4j-cli | Change Flink log4j-cli.properties settings. | Restarts Flink history server. | 
| hadoop-env | Change values in the Hadoop environment for all Hadoop components. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts PhoenixQueryserver, HiveServer2, Hive MetaStore, and MapReduce-HistoryServer. | 
| hadoop-log4j | Change values in Hadoop's log4j.properties file. | Restarts the Hadoop HDFS services SecondaryNamenode, Datanode, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| hadoop-ssl-server | Change hadoop ssl server configuration | Not available. | 
| hadoop-ssl-client | Change hadoop ssl client configuration | Not available. | 
| hbase | Amazon EMR-curated settings for Apache HBase. | Custom EMR specific property. Sets emrfs-site and hbase-site configs. See those for their associated restarts. | 
| hbase-env | Change values in HBase's environment. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-log4j | Change values in HBase's hbase-log4j.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-metrics | Change values in HBase's hadoop-metrics2-hbase.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-policy | Change values in HBase's hbase-policy.xml file. | Not available. | 
| hbase-site | Change values in HBase's hbase-site.xml file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. Additionally restarts Phoenix QueryServer. | 
| hdfs-encryption-zones | Configure HDFS encryption zones. | This classification should not be reconfigured. | 
| hdfs-env | Change values in the HDFS environment. | Restarts Hadoop HDFS services Namenode, Datanode, and ZKFC. | 
| hdfs-site | Change values in HDFS's hdfs-site.xml. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Additionally restarts Hadoop Httpfs. | 
| hcatalog-env | Change values in HCatalog's environment. | Restarts Hive HCatalog Server. | 
| hcatalog-server-jndi | Change values in HCatalog's jndi.properties. | Restarts Hive HCatalog Server. | 
| hcatalog-server-proto-hive-site | Change values in HCatalog's proto-hive-site.xml. | Restarts Hive HCatalog Server. | 
| hcatalog-webhcat-env | Change values in HCatalog WebHCat's environment. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-log4j2 | Change values in HCatalog WebHCat's log4j2.properties. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-site | Change values in HCatalog WebHCat's webhcat-site.xml file. | Restarts Hive WebHCat server. | 
| hive | Amazon EMR-curated settings for Apache Hive. | Sets configurations to launch Hive LLAP service. | 
| hive-beeline-log4j2 | Change values in Hive's beeline-log4j2.properties file. | Not available. | 
| hive-parquet-logging | Change values in Hive's parquet-logging.properties file. | Not available. | 
| hive-env | Change values in the Hive environment. | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. | 
| hive-exec-log4j2 | Change values in Hive's hive-exec-log4j2.properties file. | Not available. | 
| hive-llap-daemon-log4j2 | Change values in Hive's llap-daemon-log4j2.properties file. | Not available. | 
| hive-log4j2 | Change values in Hive's hive-log4j2.properties file. | Not available. | 
| hive-site | Change values in Hive's hive-site.xml file | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. Also restarts Oozie and Zeppelin. | 
| hiveserver2-site | Change values in Hive Server2's hiveserver2-site.xml file | Not available. | 
| hue-ini | Change values in Hue's ini file | Restarts Hue. Also activates Hue config override CLI commands to pick up new configurations. | 
| httpfs-env | Change values in the HTTPFS environment. | Restarts Hadoop Httpfs service. | 
| httpfs-site | Change values in Hadoop's httpfs-site.xml file. | Restarts Hadoop Httpfs service. | 
| hadoop-kms-acls | Change values in Hadoop's kms-acls.xml file. | Not available. | 
| hadoop-kms-env | Change values in the Hadoop KMS environment. | Restarts Hadoop-KMS service. | 
| hadoop-kms-log4j | Change values in Hadoop's kms-log4j.properties file. | Not available. | 
| hadoop-kms-site | Change values in Hadoop's kms-site.xml file. | Restarts Hadoop-KMS and Ranger-KMS service. | 
| hudi-env | Change values in the Hudi environment. | Not available. | 
| hudi-defaults | Change values in Hudi's hudi-defaults.conf file. | Not available. | 
| iceberg-defaults | Change values in Iceberg's iceberg-defaults.conf file. | Not available. | 
| jupyter-notebook-conf | Change values in Jupyter Notebook's jupyter\_notebook\_config.py file. | Not available. | 
| jupyter-hub-conf | Change values in JupyterHub's jupyterhub\_config.py file. | Not available. | 
| jupyter-s3-conf | Configure Jupyter Notebook S3 persistence. | Not available. | 
| jupyter-sparkmagic-conf | Change values in Sparkmagic's config.json file. | Not available. | 
| livy-conf | Change values in Livy's livy.conf file. | Restarts Livy Server. | 
| livy-env | Change values in the Livy environment. | Restarts Livy Server. | 
| livy-log4j | Change Livy log4j.properties settings. | Restarts Livy Server. | 
| mapred-env | Change values in the MapReduce application's environment. | Restarts Hadoop MapReduce-HistoryServer. | 
| mapred-site | Change values in the MapReduce application's mapred-site.xml file. | Restarts Hadoop MapReduce-HistoryServer. | 
| oozie-env | Change values in Oozie's environment. | Restarts Oozie. | 
| oozie-log4j | Change values in Oozie's oozie-log4j.properties file. | Restarts Oozie. | 
| oozie-site | Change values in Oozie's oozie-site.xml file. | Restarts Oozie. | 
| phoenix-hbase-metrics | Change values in Phoenix's hadoop-metrics2-hbase.properties file. | Not available. | 
| phoenix-hbase-site | Change values in Phoenix's hbase-site.xml file. | Not available. | 
| phoenix-log4j | Change values in Phoenix's log4j.properties file. | Restarts Phoenix-QueryServer. | 
| phoenix-metrics | Change values in Phoenix's hadoop-metrics2-phoenix.properties file. | Not available. | 
| pig-env | Change values in the Pig environment. | Not available. | 
| pig-properties | Change values in Pig's pig.properties file. | Restarts Oozie. | 
| pig-log4j | Change values in Pig's log4j.properties file. | Not available. | 
| presto-log | Change values in Presto's log.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-config | Change values in Presto's config.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-password-authenticator | Change values in Presto's password-authenticator.properties file. | Not available. | 
| presto-env | Change values in Presto's presto-env.sh file. | Restarts Presto-Server (for PrestoDB) | 
| presto-node | Change values in Presto's node.properties file. | Not available. | 
| presto-connector-blackhole | Change values in Presto's blackhole.properties file. | Not available. | 
| presto-connector-cassandra | Change values in Presto's cassandra.properties file. | Not available. | 
| presto-connector-hive | Change values in Presto's hive.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-jmx | Change values in Presto's jmx.properties file. | Not available. | 
| presto-connector-kafka | Change values in Presto's kafka.properties file. | Not available. | 
| presto-connector-localfile | Change values in Presto's localfile.properties file. | Not available. | 
| presto-connector-memory | Change values in Presto's memory.properties file. | Not available. | 
| presto-connector-mongodb | Change values in Presto's mongodb.properties file. | Not available. | 
| presto-connector-mysql | Change values in Presto's mysql.properties file. | Not available. | 
| presto-connector-postgresql | Change values in Presto's postgresql.properties file. | Not available. | 
| presto-connector-raptor | Change values in Presto's raptor.properties file. | Not available. | 
| presto-connector-redis | Change values in Presto's redis.properties file. | Not available. | 
| presto-connector-redshift | Change values in Presto's redshift.properties file. | Not available. | 
| presto-connector-tpch | Change values in Presto's tpch.properties file. | Not available. | 
| presto-connector-tpcds | Change values in Presto's tpcds.properties file. | Not available. | 
| trino-log | Change values in Trino's log.properties file. | Restarts Trino-Server (for Trino) | 
| trino-config | Change values in Trino's config.properties file. | Restarts Trino-Server (for Trino) | 
| trino-password-authenticator | Change values in Trino's password-authenticator.properties file. | Restarts Trino-Server (for Trino) | 
| trino-env | Change values in Trino's trino-env.sh file. | Restarts Trino-Server (for Trino) | 
| trino-node | Change values in Trino's node.properties file. | Not available. | 
| trino-connector-blackhole | Change values in Trino's blackhole.properties file. | Not available. | 
| trino-connector-cassandra | Change values in Trino's cassandra.properties file. | Not available. | 
| trino-connector-hive | Change values in Trino's hive.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-iceberg | Change values in Trino's iceberg.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-jmx | Change values in Trino's jmx.properties file. | Not available. | 
| trino-connector-kafka | Change values in Trino's kafka.properties file. | Not available. | 
| trino-connector-localfile | Change values in Trino's localfile.properties file. | Not available. | 
| trino-connector-memory | Change values in Trino's memory.properties file. | Not available. | 
| trino-connector-mongodb | Change values in Trino's mongodb.properties file. | Not available. | 
| trino-connector-mysql | Change values in Trino's mysql.properties file. | Not available. | 
| trino-connector-postgresql | Change values in Trino's postgresql.properties file. | Not available. | 
| trino-connector-raptor | Change values in Trino's raptor.properties file. | Not available. | 
| trino-connector-redis | Change values in Trino's redis.properties file. | Not available. | 
| trino-connector-redshift | Change values in Trino's redshift.properties file. | Not available. | 
| trino-connector-tpch | Change values in Trino's tpch.properties file. | Not available. | 
| trino-connector-tpcds | Change values in Trino's tpcds.properties file. | Not available. | 
| ranger-kms-dbks-site | Change values in dbks-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-site | Change values in ranger-kms-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-env | Change values in the Ranger KMS environment. | Restarts Ranger KMS Server. | 
| ranger-kms-log4j | Change values in kms-log4j.properties file of Ranger KMS. | Not available. | 
| ranger-kms-db-ca | Change values for CA file on S3 for MySQL SSL connection with Ranger KMS. | Not available. | 
| spark | Amazon EMR-curated settings for Apache Spark. | This property modifies spark-defaults. See actions there. | 
| spark-defaults | Change values in Spark's spark-defaults.conf file. | Restarts Spark history server and Spark thrift server. | 
| spark-env | Change values in the Spark environment. | Restarts Spark history server and Spark thrift server. | 
| spark-hive-site | Change values in Spark's hive-site.xml file | Not available. | 
| spark-log4j2 | Change values in Spark's log4j2.properties file. | Restarts Spark history server and Spark thrift server. | 
| spark-metrics | Change values in Spark's metrics.properties file. | Restarts Spark history server and Spark thrift server. | 
| sqoop-env | Change values in Sqoop's environment. | Not available. | 
| sqoop-oraoop-site | Change values in Sqoop OraOop's oraoop-site.xml file. | Not available. | 
| sqoop-site | Change values in Sqoop's sqoop-site.xml file. | Not available. | 
| tez-site | Change values in Tez's tez-site.xml file. | Restarts Oozie and HiveServer2. | 
| yarn-env | Change values in the YARN environment. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts MapReduce-HistoryServer. | 
| yarn-site | Change values in YARN's yarn-site.xml file. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Livy Server and MapReduce-HistoryServer. | 
| zeppelin-env | Change values in the Zeppelin environment. | Restarts Zeppelin. | 
| zeppelin-site | Change configuration settings in zeppelin-site.xml. | Restarts Zeppelin. | 
| zookeeper-config | Change values in ZooKeeper's zoo.cfg file. | Restarts Zookeeper server. | 
| zookeeper-log4j | Change values in ZooKeeper's log4j.properties file. | Restarts Zookeeper server. | 
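
Because Amazon EMR only initiates reconfiguration actions for the classifications that you modify, it can help to preview which services a change would restart. The following is a minimal sketch that copies a handful of rows from the table above into a dictionary. The names `RECONFIGURATION_ACTIONS` and `actions_for` are hypothetical, and the lookup is only as complete as the rows you copy in.

```python
# A few rows copied verbatim from the classifications table above.
RECONFIGURATION_ACTIONS = {
    "capacity-scheduler": "Restarts the ResourceManager service.",
    "livy-conf": "Restarts Livy Server.",
    "oozie-site": "Restarts Oozie.",
    "zeppelin-env": "Restarts Zeppelin.",
    "presto-node": "Not available.",
}

# Placeholder for classifications that were not copied into the dictionary.
NOT_IN_EXCERPT = "not in this excerpt"


def actions_for(modified_classifications):
    """Map each modified classification to its documented action."""
    return {
        name: RECONFIGURATION_ACTIONS.get(name, NOT_IN_EXCERPT)
        for name in modified_classifications
    }
```

For example, `actions_for(["livy-conf", "presto-node"])` reports that the first change restarts Livy Server and that no reconfiguration action is available for the second.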

## 6.8.1 change log
<a name="681-changelog"></a>


**Change log for 6.8.1 release and release notes**  

| Date | Event | Description | 
| --- | --- | --- | 
| 2023-08-30 | Update release notes | Added several control-plane related fixes to the release notes | 
| 2023-08-21 | Docs publication | Amazon EMR 6.8.1 release notes first published | 
| 2023-08-16 | Deployment complete | Amazon EMR 6.8.1 fully deployed to all [supported Regions](https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/) | 
| 2023-08-04 | Initial release | Amazon EMR 6.8.1 first deployed to limited commercial Regions | 

# Amazon EMR release 6.8.0
<a name="emr-680-release"></a>

## 6.8.0 application versions
<a name="emr-680-app-versions"></a>

This release includes the following applications: [Flink](https://flink.apache.org/), [Ganglia](http://ganglia.info), [HBase](http://hbase.apache.org/), [HCatalog](https://cwiki.apache.org/confluence/display/Hive/HCatalog), [Hadoop](http://hadoop.apache.org/docs/current/), [Hive](http://hive.apache.org/), [Hudi](https://hudi.apache.org), [Hue](http://gethue.com/), [Iceberg](https://iceberg.apache.org/), [JupyterEnterpriseGateway](https://jupyter-enterprise-gateway.readthedocs.io/en/latest/), [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/#), [Livy](https://livy.incubator.apache.org/), [MXNet](https://mxnet.incubator.apache.org/), [Oozie](http://oozie.apache.org/), [Phoenix](https://phoenix.apache.org/), [Pig](http://pig.apache.org/), [Presto](https://prestodb.io/), [Spark](https://spark.apache.org/docs/latest/), [Sqoop](http://sqoop.apache.org/), [TensorFlow](https://www.tensorflow.org/), [Tez](https://tez.apache.org/), [Trino](https://trino.io/), [Zeppelin](https://zeppelin.incubator.apache.org/), and [ZooKeeper](https://zookeeper.apache.org).

The table below lists the application versions available in this release of Amazon EMR and the application versions in the preceding three Amazon EMR releases (when applicable).

For a comprehensive history of application versions for each release of Amazon EMR, see the following topics:
+ [Application versions in Amazon EMR 7.x releases](emr-release-app-versions-7.x.md)
+ [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md)
+ [Application versions in Amazon EMR 5.x releases](emr-release-app-versions-5.x.md)
+ [Application versions in Amazon EMR 4.x releases](emr-release-app-versions-4.x.md)


**Application version information**  

|  | emr-6.8.0 | emr-6.7.0 | emr-6.6.0 | emr-6.5.0 | 
| --- | --- | --- | --- | --- | 
| AWS SDK for Java | 1.12.170 | 1.12.170 | 1.12.170 | 1.12.31 | 
| Python | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 
| Scala | 2.12.15 | 2.12.15 | 2.12.10 | 2.12.10 | 
| AmazonCloudWatchAgent |  -  |  -  |  -  |  -  | 
| Delta |  -  |  -  |  -  |  -  | 
| Flink | 1.15.1 | 1.14.2 | 1.14.2 | 1.14.0 | 
| Ganglia | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 
| HBase | 2.4.12-amzn-0 | 2.4.4-amzn-3 | 2.4.4-amzn-2 | 2.4.4-amzn-1 | 
| HCatalog | 3.1.3-amzn-1 | 3.1.3-amzn-0 | 3.1.2-amzn-7 | 3.1.2-amzn-6 | 
| Hadoop | 3.2.1-amzn-8 | 3.2.1-amzn-7 | 3.2.1-amzn-6 | 3.2.1-amzn-5 | 
| Hive | 3.1.3-amzn-1 | 3.1.3-amzn-0 | 3.1.2-amzn-7 | 3.1.2-amzn-6 | 
| Hudi | 0.11.1-amzn-0 | 0.11.0-amzn-0 | 0.10.1-amzn-0 | 0.9.0-amzn-1 | 
| Hue | 4.10.0 | 4.10.0 | 4.10.0 | 4.9.0 | 
| Iceberg | 0.14.0-amzn-0 | 0.13.1-amzn-0 | 0.13.1 | 0.12.0 | 
| JupyterEnterpriseGateway | 2.1.0 | 2.1.0 | 2.1.0 | 2.1.0 | 
| JupyterHub | 1.4.1 | 1.4.1 | 1.4.1 | 1.4.1 | 
| Livy | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 
| MXNet | 1.9.1 | 1.8.0 | 1.8.0 | 1.8.0 | 
| Mahout |  -  |  -  |  -  |  -  | 
| Oozie | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 
| Phoenix | 5.1.2 | 5.1.2 | 5.1.2 | 5.1.2 | 
| Pig | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 
| Presto | 0.273.3-amzn-0 | 0.272-amzn-0 | 0.267-amzn-0 | 0.261-amzn-0 | 
| Spark | 3.3.0-amzn-0 | 3.2.1-amzn-0 | 3.2.0-amzn-0 | 3.1.2-amzn-1 | 
| Sqoop | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 
| TensorFlow | 2.9.1 | 2.4.1 | 2.4.1 | 2.4.1 | 
| Tez | 0.9.2 | 0.9.2 | 0.9.2 | 0.9.2 | 
| Trino (PrestoSQL) | 388-amzn-0 | 378-amzn-0 | 367-amzn-0 | 360 | 
| Zeppelin | 0.10.1 | 0.10.0 | 0.10.0 | 0.10.0 | 
| ZooKeeper | 3.5.10 | 3.5.7 | 3.5.7 | 3.5.7 | 

## 6.8.0 release notes
<a name="emr-680-relnotes"></a>

The following release notes include information for Amazon EMR release 6.8.0. Changes are relative to 6.7.0.

**New Features**
+ The Amazon EMR steps feature now supports the Apache Livy endpoint and JDBC/ODBC clients. For more information, see [Configure runtime roles for Amazon EMR steps](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-steps-runtime-roles.html).
+ Amazon EMR release 6.8.0 comes with Apache HBase release 2.4.12. With this HBase release, you can both archive and delete your HBase tables. The Amazon S3 archive process renames all table files to the archive directory. This can be a costly and lengthy process. Now, you can skip the archive process and quickly drop and delete large tables. For more information, see [Using the HBase shell](emr-hbase-connect.md).

**Known Issues**
+ Hadoop 3.3.3 introduced a change in YARN ([YARN-9608](https://issues.apache.org/jira/browse/YARN-9608)) that keeps nodes where containers ran in a decommissioning state until the application completes. This change ensures that local data such as shuffle data doesn't get lost, and you don't need to re-run the job. In Amazon EMR 6.8.0 and 6.9.0, this approach might also lead to underutilization of resources on clusters with or without managed scaling enabled.

  With [Amazon EMR 6.10.0](emr-6100-release.md#emr-6100-relnotes), you can work around this issue by setting the value of `yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications` to `false` in `yarn-site.xml`. In Amazon EMR releases 6.11.0 and higher, as well as 6.8.1, 6.9.1, and 6.10.1, the property is set to `false` by default to resolve this issue.
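
The workaround above can be expressed as a configuration object for the `yarn-site` classification. The following Python sketch only builds and prints that JSON. The property name and value come from the known issue above; the function name is hypothetical, and how you supply the JSON (for example, at cluster creation) depends on your tooling.

```python
import json

# Property named in the known issue above.
WAIT_FOR_APPS = (
    "yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications"
)


def yarn_decommission_workaround():
    """Return a configuration list that sets the property to "false"."""
    return [
        {
            "Classification": "yarn-site",
            "Properties": {WAIT_FOR_APPS: "false"},
        }
    ]


# Serialize the configuration so that it can be saved to a file or
# passed to the tool that you use to launch clusters.
config_json = json.dumps(yarn_decommission_workaround(), indent=2)
print(config_json)
```

Property values are written as strings because EMR configuration objects carry their properties as string key-value pairs.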

**Changes, Enhancements, and Resolved Issues**
+ When Amazon EMR release 6.5.0, 6.6.0, or 6.7.0 read Apache Phoenix tables through the Apache Spark shell, Amazon EMR produced a `NoSuchMethodError`. Amazon EMR release 6.8.0 fixes this issue.
+ Amazon EMR release 6.8.0 comes with [Apache Hudi](https://hudi.apache.org/) 0.11.1; however, Amazon EMR 6.8.0 clusters are also compatible with the open-source `hudi-spark3.3-bundle_2.12` from Hudi 0.12.0.
+ Amazon EMR release 6.8.0 comes with Apache Spark 3.3.0. This Spark release uses Apache Log4j 2 and the `log4j2.properties` file to configure Log4j in Spark processes. If you use Spark in the cluster or create EMR clusters with custom configuration parameters, and you want to upgrade to Amazon EMR release 6.8.0, you must migrate to the new `spark-log4j2` configuration classification and key format for Apache Log4j 2. For more information, see [Migrating from Apache Log4j 1.x to Log4j 2.x](emr-spark-configure.md#spark-migrate-logj42).
+ When you launch a cluster with *the latest patch release* of Amazon EMR 5.36 or higher, 6.6 or higher, or 7.0 or higher, Amazon EMR uses the latest Amazon Linux 2023 or Amazon Linux 2 release for the default Amazon EMR AMI. For more information, see [Using the default Amazon Linux AMI for Amazon EMR](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-default-ami.html).
**Note**  
This release no longer gets automatic AMI updates because it has been succeeded by one or more patch releases. The patch release is denoted by the number after the second decimal point (`6.8.1`). To see if you're using the latest patch release, check the available releases in the [Release Guide](https://docs.aws.amazon.com/emr/latest/ReleaseGuide), check the **Amazon EMR release** dropdown when you create a cluster in the console, or use the [ListReleaseLabels](https://docs.aws.amazon.com/emr/latest/APIReference/API_ListReleaseLabels.html) API or [list-release-labels](https://docs.aws.amazon.com/cli/latest/reference/emr/list-release-labels.html) CLI action. To get updates about new releases, subscribe to the RSS feed on the [What's new?](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-whatsnew.html) page.  
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-680-release.html)
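
To illustrate the patch-release check described in the note above, the following Python sketch picks the highest patch number for a given `major.minor` series from a list of release labels. The sample list is a hypothetical stand-in for the output of `ListReleaseLabels`, and the function name is an assumption made for this example.

```python
import re

# Release labels have the form emr-x.x.x.
LABEL_RE = re.compile(r"^emr-(\d+)\.(\d+)\.(\d+)$")


def latest_patch(labels, major, minor):
    """Return the label with the highest patch number for emr-<major>.<minor>.x."""
    patches = []
    for label in labels:
        match = LABEL_RE.match(label)
        if match and (int(match.group(1)), int(match.group(2))) == (major, minor):
            patches.append((int(match.group(3)), label))
    # Tuples sort by patch number first, so max() yields the newest patch.
    return max(patches)[1] if patches else None


# Hypothetical sample; in practice, the list comes from ListReleaseLabels.
sample_labels = ["emr-6.9.0", "emr-6.8.1", "emr-6.8.0", "emr-6.7.0"]
print(latest_patch(sample_labels, 6, 8))  # emr-6.8.1
```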

**Known Issues**
+ When you use the DynamoDB connector with Spark on Amazon EMR versions 6.6.0, 6.7.0, and 6.8.0, all reads from your table return an empty result, even though the input split references non-empty data. This is because Spark 3.2.0 sets `spark.hadoopRDD.ignoreEmptySplits` to `true` by default. As a workaround, explicitly set `spark.hadoopRDD.ignoreEmptySplits` to `false`. Amazon EMR release 6.9.0 fixes this issue.
+ When you use Spark with Hive partition location formatting to read data in Amazon S3, and you run Spark on Amazon EMR releases 5.30.0 to 5.36.0, and 6.2.0 to 6.9.0, you might encounter an issue that prevents your cluster from reading data correctly. This can happen if your partitions have all of the following characteristics:
  + Two or more partitions are scanned from the same table.
  + At least one partition directory path is a prefix of at least one other partition directory path, for example, `s3://bucket/table/p=a` is a prefix of `s3://bucket/table/p=a b`.
  + The first character that follows the prefix in the other partition directory has a UTF-8 value that's less than the `/` character (U+002F). For example, the space character (U+0020) that occurs between a and b in `s3://bucket/table/p=a b` falls into this category. Note that there are 14 other non-control characters: `!"#$%&'()*+,-.`. For more information, see [UTF-8 encoding table and Unicode characters](https://www.utf8-chartable.de/).

  As a workaround to this issue, set the `spark.sql.sources.fastS3PartitionDiscovery.enabled` configuration to `false` in the `spark-defaults` classification.
+ With Amazon EMR releases 5.36.0 and 6.6.0 through 6.9.0, `SecretAgent` and `RecordServer` service components may experience log data loss due to an incorrect file name pattern configuration in Log4j2 properties. The incorrect configuration causes the components to generate only one log file per day. When log rotation occurs, it overwrites the existing file instead of generating a new log file as expected. As a workaround, use a bootstrap action to generate log files each hour and append an auto-increment integer in the file name to handle the rotation.

  For Amazon EMR 6.6.0 through 6.9.0 releases, use the following bootstrap action when you launch a cluster. 

  ```
  --bootstrap-actions "Path=s3://emr-data-access-control-us-east-1/customer-bootstrap-actions/log-rotation-emr-6x/replace-puppet.sh,Args=[]"
  ```

  For Amazon EMR 5.36.0, use the following bootstrap action when you launch a cluster.

  ```
  --bootstrap-actions "Path=s3://emr-data-access-control-us-east-1/customer-bootstrap-actions/log-rotation-emr-5x/replace-puppet.sh,Args=[]"
  ```
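
As a rough illustration of the two Spark issues above, the following Python sketch does two things. First, it flags pairs of partition paths that match the prefix condition: one path is a prefix of another, and the next character sorts before `/`. Second, it builds a `spark-defaults` configuration that applies both workarounds. The function and variable names are hypothetical. Placing `spark.hadoopRDD.ignoreEmptySplits` in `spark-defaults` is an assumption, because the text above doesn't say where to set that property.

```python
import json


def risky_partition_pairs(paths):
    """Return (prefix, other) pairs that match the conditions listed above."""
    pairs = []
    for prefix in paths:
        for other in paths:
            # other must extend prefix, and the first character after the
            # prefix must sort before "/" (U+002F).
            if (
                other != prefix
                and other.startswith(prefix)
                and other[len(prefix)] < "/"
            ):
                pairs.append((prefix, other))
    return pairs


# Both workarounds as one spark-defaults classification (assumed placement
# for ignoreEmptySplits; the fastS3PartitionDiscovery setting is per the text).
spark_defaults_workarounds = [
    {
        "Classification": "spark-defaults",
        "Properties": {
            "spark.hadoopRDD.ignoreEmptySplits": "false",
            "spark.sql.sources.fastS3PartitionDiscovery.enabled": "false",
        },
    }
]

example_paths = [
    "s3://bucket/table/p=a",
    "s3://bucket/table/p=a b",
    "s3://bucket/table/p=c",
]
print(risky_partition_pairs(example_paths))
print(json.dumps(spark_defaults_workarounds, indent=2))
```

With the example paths, only the `p=a` and `p=a b` pair is reported, because the space character sorts before `/`.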

For more information on the release timeline, see the [change log](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-680-release.html#680-changelog).

## 6.8.0 component versions
<a name="emr-680-components"></a>

The components that Amazon EMR installs with this release are listed below. Some are installed as part of big-data application packages. Others are unique to Amazon EMR and installed for system processes and features. These typically start with `emr` or `aws`. Big-data application packages in the most recent Amazon EMR release are usually the latest version found in the community. We make community releases available in Amazon EMR as quickly as possible.

Some components in Amazon EMR differ from community versions. These components have a version label in the form `CommunityVersion-amzn-EmrVersion`. The `EmrVersion` starts at 0. For example, if an open source community component named `myapp-component` with version 2.2 has been modified three times for inclusion in different Amazon EMR releases, its release version is listed as `2.2-amzn-2`.
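
The label format described above can be split mechanically. The following Python sketch separates the community version from the Amazon EMR revision. The function name and regular expression are assumptions made for this example, not part of any Amazon EMR tooling.

```python
import re

# Matches labels of the form CommunityVersion-amzn-EmrVersion.
AMZN_RE = re.compile(r"^(?P<community>.+)-amzn-(?P<emr>\d+)$")


def parse_component_version(version):
    """Split a version label into (community_version, emr_revision or None)."""
    match = AMZN_RE.match(version)
    if match:
        return match.group("community"), int(match.group("emr"))
    # No -amzn- suffix: the component matches the community version.
    return version, None


print(parse_component_version("3.2.1-amzn-8"))  # ('3.2.1', 8)
print(parse_component_version("0.17.0"))        # ('0.17.0', None)
```

Because `EmrVersion` starts at 0, a revision of `8` means the ninth Amazon EMR modification of that community version.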


| Component | Version | Description | 
| --- | --- | --- | 
| aws-sagemaker-spark-sdk | 1.4.2 | Amazon SageMaker Spark SDK | 
| emr-ddb | 4.16.0 | Amazon DynamoDB connector for Hadoop ecosystem applications. | 
| emr-goodies | 3.2.0 | Extra convenience libraries for the Hadoop ecosystem. | 
| emr-kinesis | 3.5.0 | Amazon Kinesis connector for Hadoop ecosystem applications. | 
| emr-notebook-env | 1.7.0 | Conda env for emr notebook which includes jupyter enterprise gateway | 
| emr-s3-dist-cp | 2.22.0 | Distributed copy application optimized for Amazon S3. | 
| emr-s3-select | 2.1.0 | EMR S3Select Connector | 
| emrfs | 2.53.0 | Amazon S3 connector for Hadoop ecosystem applications. | 
| flink-client | 1.15.1 | Apache Flink command line client scripts and applications. | 
| flink-jobmanager-config | 1.15.1 | Managing resources on EMR nodes for Apache Flink JobManager. | 
| ganglia-monitor | 3.7.2 | Embedded Ganglia agent for Hadoop ecosystem applications along with the Ganglia monitoring agent. | 
| ganglia-metadata-collector | 3.7.2 | Ganglia metadata collector for aggregating metrics from Ganglia monitoring agents. | 
| ganglia-web | 3.7.1 | Web application for viewing metrics collected by the Ganglia metadata collector. | 
| hadoop-client | 3.2.1-amzn-8 | Hadoop command-line clients such as 'hdfs', 'hadoop', or 'yarn'. | 
| hadoop-hdfs-datanode | 3.2.1-amzn-8 | HDFS node-level service for storing blocks. | 
| hadoop-hdfs-library | 3.2.1-amzn-8 | HDFS command-line client and library | 
| hadoop-hdfs-namenode | 3.2.1-amzn-8 | HDFS service for tracking file names and block locations. | 
| hadoop-hdfs-journalnode | 3.2.1-amzn-8 | HDFS service for managing the Hadoop filesystem journal on HA clusters. | 
| hadoop-httpfs-server | 3.2.1-amzn-8 | HTTP endpoint for HDFS operations. | 
| hadoop-kms-server | 3.2.1-amzn-8 | Cryptographic key management server based on Hadoop's KeyProvider API. | 
| hadoop-mapred | 3.2.1-amzn-8 | MapReduce execution engine libraries for running a MapReduce application. | 
| hadoop-yarn-nodemanager | 3.2.1-amzn-8 | YARN service for managing containers on an individual node. | 
| hadoop-yarn-resourcemanager | 3.2.1-amzn-8 | YARN service for allocating and managing cluster resources and distributed applications. | 
| hadoop-yarn-timeline-server | 3.2.1-amzn-8 | Service for retrieving current and historical information for YARN applications. | 
| hbase-hmaster | 2.4.12-amzn-0 | Service for an HBase cluster responsible for coordination of Regions and execution of administrative commands. | 
| hbase-region-server | 2.4.12-amzn-0 | Service for serving one or more HBase regions. | 
| hbase-client | 2.4.12-amzn-0 | HBase command-line client. | 
| hbase-rest-server | 2.4.12-amzn-0 | Service providing a RESTful HTTP endpoint for HBase. | 
| hbase-thrift-server | 2.4.12-amzn-0 | Service providing a Thrift endpoint to HBase. | 
| hbase-operator-tools | 2.4.12-amzn-0 | Repair tool for Apache HBase clusters. | 
| hcatalog-client | 3.1.3-amzn-1 | The 'hcat' command line client for manipulating hcatalog-server. | 
| hcatalog-server | 3.1.3-amzn-1 | Service providing HCatalog, a table and storage management layer for distributed applications. | 
| hcatalog-webhcat-server | 3.1.3-amzn-1 | HTTP endpoint providing a REST interface to HCatalog. | 
| hive-client | 3.1.3-amzn-1 | Hive command line client. | 
| hive-hbase | 3.1.3-amzn-1 | Hive-hbase client. | 
| hive-metastore-server | 3.1.3-amzn-1 | Service for accessing the Hive metastore, a semantic repository storing metadata for SQL on Hadoop operations. | 
| hive-server2 | 3.1.3-amzn-1 | Service for accepting Hive queries as web requests. | 
| hudi | 0.11.1-amzn-0 | Incremental processing framework to power data pipeline at low latency and high efficiency. | 
| hudi-presto | 0.11.1-amzn-0 | Bundle library for running Presto with Hudi. | 
| hudi-trino | 0.11.1-amzn-0 | Bundle library for running Trino with Hudi. | 
| hudi-spark | 0.11.1-amzn-0 | Bundle library for running Spark with Hudi. | 
| hue-server | 4.10.0 | Web application for analyzing data using Hadoop ecosystem applications | 
| iceberg | 0.14.0-amzn-0 | Apache Iceberg is an open table format for huge analytic datasets | 
| jupyterhub | 1.4.1 | Multi-user server for Jupyter notebooks | 
| livy-server | 0.7.1-incubating | REST interface for interacting with Apache Spark | 
| nginx | 1.12.1 | nginx [engine x] is an HTTP and reverse proxy server | 
| mxnet | 1.9.1 | A flexible, scalable, and efficient library for deep learning. | 
| mariadb-server | 5.5.68+ | MariaDB database server. | 
| nvidia-cuda | 11.7.0 | Nvidia drivers and Cuda toolkit | 
| oozie-client | 5.2.1 | Oozie command-line client. | 
| oozie-server | 5.2.1 | Service for accepting Oozie workflow requests. | 
| opencv | 4.5.0 | Open Source Computer Vision Library. | 
| phoenix-library | 5.1.2 | The phoenix libraries for server and client | 
| phoenix-connectors | 5.1.2 | Apache Phoenix-Connectors for Spark-3 | 
| phoenix-query-server | 5.1.2 | A light weight server providing JDBC access as well as Protocol Buffers and JSON format access to the Avatica API  | 
| presto-coordinator | 0.273.3-amzn-0 | Service for accepting queries and managing query execution among presto-workers. | 
| presto-worker | 0.273.3-amzn-0 | Service for executing pieces of a query. | 
| presto-client | 0.273.3-amzn-0 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| trino-coordinator | 388-amzn-0 | Service for accepting queries and managing query execution among trino-workers. | 
| trino-worker | 388-amzn-0 | Service for executing pieces of a query. | 
| trino-client | 388-amzn-0 | Trino command-line client which is installed on an HA cluster's stand-by masters where Trino server is not started. | 
| pig-client | 0.17.0 | Pig command-line client. | 
| r | 4.0.2 | The R Project for Statistical Computing | 
| ranger-kms-server | 2.0.0 | Apache Ranger Key Management System | 
| spark-client | 3.3.0-amzn-0 | Spark command-line clients. | 
| spark-history-server | 3.3.0-amzn-0 | Web UI for viewing logged events for the lifetime of a completed Spark application. | 
| spark-on-yarn | 3.3.0-amzn-0 | In-memory execution engine for YARN. | 
| spark-yarn-slave | 3.3.0-amzn-0 | Apache Spark libraries needed by YARN slaves. | 
| spark-rapids | 22.06.0-amzn-0 | Nvidia Spark RAPIDS plugin that accelerates Apache Spark with GPUs. | 
| sqoop-client | 1.4.7 | Apache Sqoop command-line client. | 
| tensorflow | 2.9.1 | TensorFlow open source software library for high performance numerical computation. | 
| tez-on-yarn | 0.9.2 | The tez YARN application and libraries. | 
| webserver | 2.4.41+ | Apache HTTP server. | 
| zeppelin-server | 0.10.1 | Web-based notebook that enables interactive data analytics. | 
| zookeeper-server | 3.5.10 | Centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. | 
| zookeeper-client | 3.5.10 | ZooKeeper command line client. | 

## 6.8.0 configuration classifications
<a name="emr-680-class"></a>

Configuration classifications allow you to customize applications. These often correspond to a configuration XML file for the application, such as `hive-site.xml`. For more information, see [Configure applications](emr-configure-apps.md).

Reconfiguration actions occur when you specify a configuration for instance groups in a running cluster. Amazon EMR only initiates reconfiguration actions for the classifications that you modify. For more information, see [Reconfigure an instance group in a running cluster](emr-configure-apps-running-cluster.md).
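
As a sketch of what a reconfiguration request can look like, the following Python code builds a per-instance-group configuration payload. The instance group ID is a placeholder, the helper name is hypothetical, and the `yarn-site` property is only an illustrative setting. The overall shape is assumed to mirror the input that the `modify-instance-groups` operation accepts, so verify it against the API reference before you use it.

```python
import json


def reconfiguration_request(instance_group_id, classification, properties):
    """Build a one-group reconfiguration payload for a single classification."""
    return [
        {
            "InstanceGroupId": instance_group_id,
            "Configurations": [
                {
                    "Classification": classification,
                    # Copy so that later changes to the caller's dict
                    # don't alter the request.
                    "Properties": dict(properties),
                }
            ],
        }
    ]


# Placeholder ID and an illustrative yarn-site property.
request = reconfiguration_request(
    "ig-EXAMPLE",
    "yarn-site",
    {"yarn.nodemanager.disk-health-checker.enable": "true"},
)
print(json.dumps(request, indent=2))
```

According to the table below, modifying `yarn-site` restarts the YARN services along with Livy Server and MapReduce-HistoryServer, so group related property changes into a single request where possible.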


**emr-6.8.0 classifications**  

| Classifications | Description | Reconfiguration Actions | 
| --- | --- | --- | 
| capacity-scheduler | Change values in Hadoop's capacity-scheduler.xml file. | Restarts the ResourceManager service. | 
| container-executor | Change values in Hadoop YARN's container-executor.cfg file. | Not available. | 
| container-log4j | Change values in Hadoop YARN's container-log4j.properties file. | Not available. | 
| core-site | Change values in Hadoop's core-site.xml file. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Ranger KMS, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| docker-conf | Change docker related settings. | Not available. | 
| emrfs-site | Change EMRFS settings. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts HBaseRegionserver, HBaseMaster, HBaseThrift, HBaseRest, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| flink-conf | Change flink-conf.yaml settings. | Restarts Flink history server. | 
| flink-log4j | Change Flink log4j.properties settings. | Restarts Flink history server. | 
| flink-log4j-session | Change Flink log4j-session.properties settings for Kubernetes/Yarn session. | Restarts Flink history server. | 
| flink-log4j-cli | Change Flink log4j-cli.properties settings. | Restarts Flink history server. | 
| hadoop-env | Change values in the Hadoop environment for all Hadoop components. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts PhoenixQueryserver, HiveServer2, Hive MetaStore, and MapReduce-HistoryServer. | 
| hadoop-log4j | Change values in Hadoop's log4j.properties file. | Restarts the Hadoop HDFS services SecondaryNamenode, Datanode, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| hadoop-ssl-server | Change hadoop ssl server configuration | Not available. | 
| hadoop-ssl-client | Change hadoop ssl client configuration | Not available. | 
| hbase | Amazon EMR-curated settings for Apache HBase. | Custom EMR specific property. Sets emrfs-site and hbase-site configs. See those for their associated restarts. | 
| hbase-env | Change values in HBase's environment. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-log4j | Change values in HBase's hbase-log4j.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-metrics | Change values in HBase's hadoop-metrics2-hbase.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-policy | Change values in HBase's hbase-policy.xml file. | Not available. | 
| hbase-site | Change values in HBase's hbase-site.xml file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. Additionally restarts Phoenix QueryServer. | 
| hdfs-encryption-zones | Configure HDFS encryption zones. | This classification should not be reconfigured. | 
| hdfs-env | Change values in the HDFS environment. | Restarts Hadoop HDFS services Namenode, Datanode, and ZKFC. | 
| hdfs-site | Change values in HDFS's hdfs-site.xml. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Additionally restarts Hadoop Httpfs. | 
| hcatalog-env | Change values in HCatalog's environment. | Restarts Hive HCatalog Server. | 
| hcatalog-server-jndi | Change values in HCatalog's jndi.properties. | Restarts Hive HCatalog Server. | 
| hcatalog-server-proto-hive-site | Change values in HCatalog's proto-hive-site.xml. | Restarts Hive HCatalog Server. | 
| hcatalog-webhcat-env | Change values in HCatalog WebHCat's environment. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-log4j2 | Change values in HCatalog WebHCat's log4j2.properties. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-site | Change values in HCatalog WebHCat's webhcat-site.xml file. | Restarts Hive WebHCat server. | 
| hive | Amazon EMR-curated settings for Apache Hive. | Sets configurations to launch Hive LLAP service. | 
| hive-beeline-log4j2 | Change values in Hive's beeline-log4j2.properties file. | Not available. | 
| hive-parquet-logging | Change values in Hive's parquet-logging.properties file. | Not available. | 
| hive-env | Change values in the Hive environment. | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. | 
| hive-exec-log4j2 | Change values in Hive's hive-exec-log4j2.properties file. | Not available. | 
| hive-llap-daemon-log4j2 | Change values in Hive's llap-daemon-log4j2.properties file. | Not available. | 
| hive-log4j2 | Change values in Hive's hive-log4j2.properties file. | Not available. | 
| hive-site | Change values in Hive's hive-site.xml file | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. Also restarts Oozie and Zeppelin. | 
| hiveserver2-site | Change values in Hive Server2's hiveserver2-site.xml file | Not available. | 
| hue-ini | Change values in Hue's ini file | Restarts Hue. Also activates Hue config override CLI commands to pick up new configurations. | 
| httpfs-env | Change values in the HTTPFS environment. | Restarts Hadoop Httpfs service. | 
| httpfs-site | Change values in Hadoop's httpfs-site.xml file. | Restarts Hadoop Httpfs service. | 
| hadoop-kms-acls | Change values in Hadoop's kms-acls.xml file. | Not available. | 
| hadoop-kms-env | Change values in the Hadoop KMS environment. | Restarts Hadoop-KMS service. | 
| hadoop-kms-log4j | Change values in Hadoop's kms-log4j.properties file. | Not available. | 
| hadoop-kms-site | Change values in Hadoop's kms-site.xml file. | Restarts Hadoop-KMS and Ranger-KMS service. | 
| hudi-env | Change values in the Hudi environment. | Not available. | 
| hudi-defaults | Change values in Hudi's hudi-defaults.conf file. | Not available. | 
| iceberg-defaults | Change values in Iceberg's iceberg-defaults.conf file. | Not available. | 
| jupyter-notebook-conf | Change values in Jupyter Notebook's jupyter_notebook_config.py file. | Not available. | 
| jupyter-hub-conf | Change values in JupyterHub's jupyterhub_config.py file. | Not available. | 
| jupyter-s3-conf | Configure Jupyter Notebook S3 persistence. | Not available. | 
| jupyter-sparkmagic-conf | Change values in Sparkmagic's config.json file. | Not available. | 
| livy-conf | Change values in Livy's livy.conf file. | Restarts Livy Server. | 
| livy-env | Change values in the Livy environment. | Restarts Livy Server. | 
| livy-log4j | Change Livy log4j.properties settings. | Restarts Livy Server. | 
| mapred-env | Change values in the MapReduce application's environment. | Restarts Hadoop MapReduce-HistoryServer. | 
| mapred-site | Change values in the MapReduce application's mapred-site.xml file. | Restarts Hadoop MapReduce-HistoryServer. | 
| oozie-env | Change values in Oozie's environment. | Restarts Oozie. | 
| oozie-log4j | Change values in Oozie's oozie-log4j.properties file. | Restarts Oozie. | 
| oozie-site | Change values in Oozie's oozie-site.xml file. | Restarts Oozie. | 
| phoenix-hbase-metrics | Change values in Phoenix's hadoop-metrics2-hbase.properties file. | Not available. | 
| phoenix-hbase-site | Change values in Phoenix's hbase-site.xml file. | Not available. | 
| phoenix-log4j | Change values in Phoenix's log4j.properties file. | Restarts Phoenix-QueryServer. | 
| phoenix-metrics | Change values in Phoenix's hadoop-metrics2-phoenix.properties file. | Not available. | 
| pig-env | Change values in the Pig environment. | Not available. | 
| pig-properties | Change values in Pig's pig.properties file. | Restarts Oozie. | 
| pig-log4j | Change values in Pig's log4j.properties file. | Not available. | 
| presto-log | Change values in Presto's log.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-config | Change values in Presto's config.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-password-authenticator | Change values in Presto's password-authenticator.properties file. | Not available. | 
| presto-env | Change values in Presto's presto-env.sh file. | Restarts Presto-Server (for PrestoDB) | 
| presto-node | Change values in Presto's node.properties file. | Not available. | 
| presto-connector-blackhole | Change values in Presto's blackhole.properties file. | Not available. | 
| presto-connector-cassandra | Change values in Presto's cassandra.properties file. | Not available. | 
| presto-connector-hive | Change values in Presto's hive.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-jmx | Change values in Presto's jmx.properties file. | Not available. | 
| presto-connector-kafka | Change values in Presto's kafka.properties file. | Not available. | 
| presto-connector-localfile | Change values in Presto's localfile.properties file. | Not available. | 
| presto-connector-memory | Change values in Presto's memory.properties file. | Not available. | 
| presto-connector-mongodb | Change values in Presto's mongodb.properties file. | Not available. | 
| presto-connector-mysql | Change values in Presto's mysql.properties file. | Not available. | 
| presto-connector-postgresql | Change values in Presto's postgresql.properties file. | Not available. | 
| presto-connector-raptor | Change values in Presto's raptor.properties file. | Not available. | 
| presto-connector-redis | Change values in Presto's redis.properties file. | Not available. | 
| presto-connector-redshift | Change values in Presto's redshift.properties file. | Not available. | 
| presto-connector-tpch | Change values in Presto's tpch.properties file. | Not available. | 
| presto-connector-tpcds | Change values in Presto's tpcds.properties file. | Not available. | 
| trino-log | Change values in Trino's log.properties file. | Restarts Trino-Server (for Trino) | 
| trino-config | Change values in Trino's config.properties file. | Restarts Trino-Server (for Trino) | 
| trino-password-authenticator | Change values in Trino's password-authenticator.properties file. | Restarts Trino-Server (for Trino) | 
| trino-env | Change values in Trino's trino-env.sh file. | Restarts Trino-Server (for Trino) | 
| trino-node | Change values in Trino's node.properties file. | Not available. | 
| trino-connector-blackhole | Change values in Trino's blackhole.properties file. | Not available. | 
| trino-connector-cassandra | Change values in Trino's cassandra.properties file. | Not available. | 
| trino-connector-hive | Change values in Trino's hive.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-iceberg | Change values in Trino's iceberg.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-jmx | Change values in Trino's jmx.properties file. | Not available. | 
| trino-connector-kafka | Change values in Trino's kafka.properties file. | Not available. | 
| trino-connector-localfile | Change values in Trino's localfile.properties file. | Not available. | 
| trino-connector-memory | Change values in Trino's memory.properties file. | Not available. | 
| trino-connector-mongodb | Change values in Trino's mongodb.properties file. | Not available. | 
| trino-connector-mysql | Change values in Trino's mysql.properties file. | Not available. | 
| trino-connector-postgresql | Change values in Trino's postgresql.properties file. | Not available. | 
| trino-connector-raptor | Change values in Trino's raptor.properties file. | Not available. | 
| trino-connector-redis | Change values in Trino's redis.properties file. | Not available. | 
| trino-connector-redshift | Change values in Trino's redshift.properties file. | Not available. | 
| trino-connector-tpch | Change values in Trino's tpch.properties file. | Not available. | 
| trino-connector-tpcds | Change values in Trino's tpcds.properties file. | Not available. | 
| ranger-kms-dbks-site | Change values in dbks-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-site | Change values in ranger-kms-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-env | Change values in the Ranger KMS environment. | Restarts Ranger KMS Server. | 
| ranger-kms-log4j | Change values in kms-log4j.properties file of Ranger KMS. | Not available. | 
| ranger-kms-db-ca | Change values for CA file on S3 for MySQL SSL connection with Ranger KMS. | Not available. | 
| spark | Amazon EMR-curated settings for Apache Spark. | This property modifies spark-defaults. See actions there. | 
| spark-defaults | Change values in Spark's spark-defaults.conf file. | Restarts Spark history server and Spark thrift server. | 
| spark-env | Change values in the Spark environment. | Restarts Spark history server and Spark thrift server. | 
| spark-hive-site | Change values in Spark's hive-site.xml file | Not available. | 
| spark-log4j2 | Change values in Spark's log4j2.properties file. | Restarts Spark history server and Spark thrift server. | 
| spark-metrics | Change values in Spark's metrics.properties file. | Restarts Spark history server and Spark thrift server. | 
| sqoop-env | Change values in Sqoop's environment. | Not available. | 
| sqoop-oraoop-site | Change values in Sqoop OraOop's oraoop-site.xml file. | Not available. | 
| sqoop-site | Change values in Sqoop's sqoop-site.xml file. | Not available. | 
| tez-site | Change values in Tez's tez-site.xml file. | Restarts Oozie and HiveServer2. | 
| yarn-env | Change values in the YARN environment. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts MapReduce-HistoryServer. | 
| yarn-site | Change values in YARN's yarn-site.xml file. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Livy Server and MapReduce-HistoryServer. | 
| zeppelin-env | Change values in the Zeppelin environment. | Restarts Zeppelin. | 
| zeppelin-site | Change configuration settings in zeppelin-site.xml. | Restarts Zeppelin. | 
| zookeeper-config | Change values in ZooKeeper's zoo.cfg file. | Restarts Zookeeper server. | 
| zookeeper-log4j | Change values in ZooKeeper's log4j.properties file. | Restarts Zookeeper server. | 

## 6.8.0 change log
<a name="680-changelog"></a>


**Change log for 6.8.0 release and release notes**  

| Date | Event | Description | 
| --- | --- | --- | 
| 2023-08-21 | Update | Added a known issue with Hadoop 3.3.3. | 
| 2023-07-26 | Update | New OS release labels 2.0.20230612.0 and 2.0.20230628.0. | 
| 2022-09-06 | Deployment complete | Amazon EMR 6.8 fully deployed to all [supported Regions](https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/) | 
| 2022-09-06 | Initial publication | Amazon EMR 6.8 release notes first published | 
| 2022-08-31 | Initial release | Amazon EMR 6.8 released to limited commercial Regions | 

# Amazon EMR release 6.7.0
<a name="emr-670-release"></a>

## 6.7.0 application versions
<a name="emr-670-app-versions"></a>

This release includes the following applications: [Flink](https://flink.apache.org/), [Ganglia](http://ganglia.info), [HBase](http://hbase.apache.org/), [HCatalog](https://cwiki.apache.org/confluence/display/Hive/HCatalog), [Hadoop](http://hadoop.apache.org/docs/current/), [Hive](http://hive.apache.org/), [Hudi](https://hudi.apache.org), [Hue](http://gethue.com/), [Iceberg](https://iceberg.apache.org/), [JupyterEnterpriseGateway](https://jupyter-enterprise-gateway.readthedocs.io/en/latest/), [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/#), [Livy](https://livy.incubator.apache.org/), [MXNet](https://mxnet.incubator.apache.org/), [Oozie](http://oozie.apache.org/), [Phoenix](https://phoenix.apache.org/), [Pig](http://pig.apache.org/), [Presto](https://prestodb.io/), [Spark](https://spark.apache.org/docs/latest/), [Sqoop](http://sqoop.apache.org/), [TensorFlow](https://www.tensorflow.org/), [Tez](https://tez.apache.org/), [Trino](https://trino.io/), [Zeppelin](https://zeppelin.incubator.apache.org/), and [ZooKeeper](https://zookeeper.apache.org).

The table below lists the application versions available in this release of Amazon EMR and the application versions in the preceding three Amazon EMR releases (when applicable).

For a comprehensive history of application versions for each release of Amazon EMR, see the following topics:
+ [Application versions in Amazon EMR 7.x releases](emr-release-app-versions-7.x.md)
+ [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md)
+ [Application versions in Amazon EMR 5.x releases](emr-release-app-versions-5.x.md)
+ [Application versions in Amazon EMR 4.x releases](emr-release-app-versions-4.x.md)


**Application version information**  

|  | emr-6.7.0 | emr-6.6.0 | emr-6.5.0 | emr-6.4.0 | 
| --- | --- | --- | --- | --- | 
| AWS SDK for Java | 1.12.170 | 1.12.170 | 1.12.31 | 1.12.31 | 
| Python | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 
| Scala | 2.12.15 | 2.12.10 | 2.12.10 | 2.12.10 | 
| AmazonCloudWatchAgent |  -  |  -  |  -  |  -  | 
| Delta |  -  |  -  |  -  |  -  | 
| Flink | 1.14.2 | 1.14.2 | 1.14.0 | 1.13.1 | 
| Ganglia | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 
| HBase | 2.4.4-amzn-3 | 2.4.4-amzn-2 | 2.4.4-amzn-1 | 2.4.4-amzn-0 | 
| HCatalog | 3.1.3-amzn-0 | 3.1.2-amzn-7 | 3.1.2-amzn-6 | 3.1.2-amzn-5 | 
| Hadoop | 3.2.1-amzn-7 | 3.2.1-amzn-6 | 3.2.1-amzn-5 | 3.2.1-amzn-4 | 
| Hive | 3.1.3-amzn-0 | 3.1.2-amzn-7 | 3.1.2-amzn-6 | 3.1.2-amzn-5 | 
| Hudi | 0.11.0-amzn-0 | 0.10.1-amzn-0 | 0.9.0-amzn-1 | 0.8.0-amzn-0 | 
| Hue | 4.10.0 | 4.10.0 | 4.9.0 | 4.9.0 | 
| Iceberg | 0.13.1-amzn-0 | 0.13.1 | 0.12.0 |  -  | 
| JupyterEnterpriseGateway | 2.1.0 | 2.1.0 | 2.1.0 | 2.1.0 | 
| JupyterHub | 1.4.1 | 1.4.1 | 1.4.1 | 1.4.1 | 
| Livy | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 
| MXNet | 1.8.0 | 1.8.0 | 1.8.0 | 1.8.0 | 
| Mahout |  -  |  -  |  -  |  -  | 
| Oozie | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 
| Phoenix | 5.1.2 | 5.1.2 | 5.1.2 | 5.1.2 | 
| Pig | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 
| Presto | 0.272-amzn-0 | 0.267-amzn-0 | 0.261-amzn-0 | 0.254.1-amzn-0 | 
| Spark | 3.2.1-amzn-0 | 3.2.0-amzn-0 | 3.1.2-amzn-1 | 3.1.2-amzn-0 | 
| Sqoop | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 
| TensorFlow | 2.4.1 | 2.4.1 | 2.4.1 | 2.4.1 | 
| Tez | 0.9.2 | 0.9.2 | 0.9.2 | 0.9.2 | 
| Trino (PrestoSQL) | 378-amzn-0 | 367-amzn-0 | 360 | 359 | 
| Zeppelin | 0.10.0 | 0.10.0 | 0.10.0 | 0.9.0 | 
| ZooKeeper | 3.5.7 | 3.5.7 | 3.5.7 | 3.5.7 | 

## 6.7.0 release notes
<a name="emr-670-relnotes"></a>

The following release notes include information for Amazon EMR release 6.7.0. Changes are relative to 6.6.0.

Initial release date: July 15, 2022

**New Features**
+ Amazon EMR now supports Apache Spark 3.2.1, Apache Hive 3.1.3, Apache Hudi 0.11, PrestoDB 0.272, and Trino 378.
+ Supports IAM Role and Lake Formation-based access controls with EMR steps (Spark, Hive) for Amazon EMR on EC2 clusters.
+ Supports Apache Spark data definition statements on Apache Ranger enabled clusters. This now includes support for Trino applications reading and writing Apache Hive metadata on Apache Ranger enabled clusters. For more information, see [Enable federated governance using Trino and Apache Ranger on Amazon EMR](https://aws.amazon.com/blogs/big-data/enable-federated-governance-using-trino-and-apache-ranger-on-amazon-emr/).
+ When you launch a cluster with *the latest patch release* of Amazon EMR 5.36 or higher, 6.6 or higher, or 7.0 or higher, Amazon EMR uses the latest Amazon Linux 2023 or Amazon Linux 2 release for the default Amazon EMR AMI. For more information, see [Using the default Amazon Linux AMI for Amazon EMR](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-default-ami.html).    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-670-release.html)

**Known Issues**
+ When you use Amazon EMR release 6.5.0, 6.6.0, or 6.7.0 to read Apache Phoenix tables through the Apache Spark shell, a `NoSuchMethodError` occurs because Amazon EMR uses an incorrect `Hbase.compat.version`. Amazon EMR release 6.8.0 fixes this issue.
+ When you use the DynamoDB connector with Spark on Amazon EMR versions 6.6.0, 6.7.0, and 6.8.0, all reads from your table return an empty result, even though the input split references non-empty data. This is because Spark 3.2.0 sets `spark.hadoopRDD.ignoreEmptySplits` to `true` by default. As a workaround, explicitly set `spark.hadoopRDD.ignoreEmptySplits` to `false`. Amazon EMR release 6.9.0 fixes this issue.
+ When you use Spark with Hive partition location formatting to read data in Amazon S3, and you run Spark on Amazon EMR releases 5.30.0 to 5.36.0, and 6.2.0 to 6.9.0, you might encounter an issue that prevents your cluster from reading data correctly. This can happen if your partitions have all of the following characteristics:
  + Two or more partitions are scanned from the same table.
  + At least one partition directory path is a prefix of at least one other partition directory path, for example, `s3://bucket/table/p=a` is a prefix of `s3://bucket/table/p=a b`.
  + The first character that follows the prefix in the other partition directory has a UTF-8 value that’s less than the `/` character (U+002F). For example, the space character (U+0020) that occurs between a and b in `s3://bucket/table/p=a b` falls into this category. Note that there are 14 other non-control characters: `!"#$%&'()*+,-`. For more information, see [UTF-8 encoding table and Unicode characters](https://www.utf8-chartable.de/).

  As a workaround to this issue, set the `spark.sql.sources.fastS3PartitionDiscovery.enabled` configuration to `false` in the `spark-defaults` classification.
+ With Amazon EMR releases 5.36.0 and 6.6.0 through 6.9.0, `SecretAgent` and `RecordServer` service components may experience log data loss due to an incorrect file name pattern configuration in Log4j2 properties. The incorrect configuration causes the components to generate only one log file per day. When the rotation strategy occurs, it overwrites the existing file instead of generating a new log file as expected. As a workaround, use a bootstrap action to generate log files each hour and append an auto-increment integer in the file name to handle the rotation.

  For Amazon EMR 6.6.0 through 6.9.0 releases, use the following bootstrap action when you launch a cluster. 

  ```
  --bootstrap-actions "Path=s3://emr-data-access-control-us-east-1/customer-bootstrap-actions/log-rotation-emr-6x/replace-puppet.sh,Args=[]"
  ```

  For Amazon EMR 5.36.0, use the following bootstrap action when you launch a cluster.

  ```
  --bootstrap-actions "Path=s3://emr-data-access-control-us-east-1/customer-bootstrap-actions/log-rotation-emr-5x/replace-puppet.sh,Args=[]"
  ```
+ The `GetClusterSessionCredentials` API isn't supported with clusters that run on Amazon EMR 6.7 or lower.
+ The following Hadoop commits were backported.

  - [[HADOOP-16080]](https://issues.apache.org/jira/browse/HADOOP-16080) Fix issue where `hadoop-aws` was not working with `hadoop-client-api`.

  - [[HADOOP-18237]](https://issues.apache.org/jira/browse/HADOOP-18237) Upgrade Apache Xerces Java to 2.12.2.

  - [[YARN-11092]](https://issues.apache.org/jira/browse/YARN-11092) Upgrade jquery ui to 1.13.1.

  - [[YARN-10720]](https://issues.apache.org/jira/browse/YARN-10720) YARN WebAppProxyServlet should support connection timeout to prevent proxy server from hanging.
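The partition-path conditions in the Spark and Hive partition issue above can be checked ahead of time. The following is a minimal sketch, not an official tool: the helper names `is_affected_pair` and `affected_pairs` are hypothetical, and the paths are assumed to be partition directory paths without a trailing slash. It also builds a `spark-defaults` classification that applies the two Spark workarounds described above. Placing `spark.hadoopRDD.ignoreEmptySplits` in `spark-defaults` is an assumption, because the note only says to set it explicitly.

```python
import json
from itertools import permutations


def is_affected_pair(prefix_path: str, other_path: str) -> bool:
    """Return True when prefix_path is a proper prefix of other_path and the
    first character after the prefix sorts before '/' (U+002F)."""
    if prefix_path == other_path or not other_path.startswith(prefix_path):
        return False
    next_char = other_path[len(prefix_path)]
    return ord(next_char) < ord("/")


def affected_pairs(partition_paths):
    """List every (prefix, other) pair that matches the conditions above."""
    return [
        (a, b)
        for a, b in permutations(partition_paths, 2)
        if is_affected_pair(a, b)
    ]


# Configuration classification that applies both Spark workarounds.
# Assumption: both properties are set through spark-defaults.
SPARK_DEFAULTS_WORKAROUND = [
    {
        "Classification": "spark-defaults",
        "Properties": {
            "spark.sql.sources.fastS3PartitionDiscovery.enabled": "false",
            "spark.hadoopRDD.ignoreEmptySplits": "false",
        },
    }
]

if __name__ == "__main__":
    paths = ["s3://bucket/table/p=a", "s3://bucket/table/p=a b"]
    print(affected_pairs(paths))
    print(json.dumps(SPARK_DEFAULTS_WORKAROUND, indent=2))
```

The check is pairwise, so it only suits a modest number of partitions. For large tables, sorting the paths and comparing neighbors would be a reasonable alternative.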

## 6.7.0 component versions
<a name="emr-670-components"></a>

The components that Amazon EMR installs with this release are listed below. Some are installed as part of big-data application packages. Others are unique to Amazon EMR and installed for system processes and features. These typically start with `emr` or `aws`. Big-data application packages in the most recent Amazon EMR release are usually the latest version found in the community. We make community releases available in Amazon EMR as quickly as possible.

Some components in Amazon EMR differ from community versions. These components have a version label in the form `CommunityVersion-amzn-EmrVersion`. The `EmrVersion` starts at 0. For example, if an open source community component named `myapp-component` with version 2.2 has been modified three times for inclusion in different Amazon EMR releases, its release version is listed as `2.2-amzn-2`.
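To illustrate the label format, the following sketch splits a version string into its community version and its Amazon EMR revision. The function name `parse_component_version` and the `ComponentVersion` type are hypothetical and exist only for this example. The revision is kept as a string because some labels in these tables, such as `3.2.1-amzn-3.1`, carry a dotted revision.

```python
import re
from typing import NamedTuple, Optional


class ComponentVersion(NamedTuple):
    community_version: str
    # None when the label has no -amzn- suffix (unmodified community version).
    emr_revision: Optional[str]


# CommunityVersion-amzn-EmrVersion, where EmrVersion is numeric (possibly dotted).
_AMZN_LABEL = re.compile(r"^(?P<community>.+)-amzn-(?P<emr>\d+(?:\.\d+)*)$")


def parse_component_version(label: str) -> ComponentVersion:
    """Split a version label into community version and EMR revision."""
    label = label.strip()
    match = _AMZN_LABEL.match(label)
    if match is None:
        return ComponentVersion(label, None)
    return ComponentVersion(match.group("community"), match.group("emr"))


if __name__ == "__main__":
    for label in ("2.2-amzn-2", "3.2.1-amzn-7", "3.7.2"):
        print(label, "->", parse_component_version(label))
```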


| Component | Version | Description | 
| --- | --- | --- | 
| aws-sagemaker-spark-sdk | 1.4.1 | Amazon SageMaker Spark SDK | 
| emr-ddb | 4.16.0 | Amazon DynamoDB connector for Hadoop ecosystem applications. | 
| emr-goodies | 3.2.0 | Extra convenience libraries for the Hadoop ecosystem. | 
| emr-kinesis | 3.5.0 | Amazon Kinesis connector for Hadoop ecosystem applications. | 
| emr-notebook-env | 1.6.0 | Conda env for emr notebook which includes jupyter enterprise gateway | 
| emr-s3-dist-cp | 2.22.0 | Distributed copy application optimized for Amazon S3. | 
| emr-s3-select | 2.1.0 | EMR S3Select Connector | 
| emrfs | 2.52.0 | Amazon S3 connector for Hadoop ecosystem applications. | 
| flink-client | 1.14.2 | Apache Flink command line client scripts and applications. | 
| flink-jobmanager-config | 1.14.2 | Managing resources on EMR nodes for Apache Flink JobManager. | 
| ganglia-monitor | 3.7.2 | Embedded Ganglia agent for Hadoop ecosystem applications along with the Ganglia monitoring agent. | 
| ganglia-metadata-collector | 3.7.2 | Ganglia metadata collector for aggregating metrics from Ganglia monitoring agents. | 
| ganglia-web | 3.7.1 | Web application for viewing metrics collected by the Ganglia metadata collector. | 
| hadoop-client | 3.2.1-amzn-7 | Hadoop command-line clients such as 'hdfs', 'hadoop', or 'yarn'. | 
| hadoop-hdfs-datanode | 3.2.1-amzn-7 | HDFS node-level service for storing blocks. | 
| hadoop-hdfs-library | 3.2.1-amzn-7 | HDFS command-line client and library | 
| hadoop-hdfs-namenode | 3.2.1-amzn-7 | HDFS service for tracking file names and block locations. | 
| hadoop-hdfs-journalnode | 3.2.1-amzn-7 | HDFS service for managing the Hadoop filesystem journal on HA clusters. | 
| hadoop-httpfs-server | 3.2.1-amzn-7 | HTTP endpoint for HDFS operations. | 
| hadoop-kms-server | 3.2.1-amzn-7 | Cryptographic key management server based on Hadoop's KeyProvider API. | 
| hadoop-mapred | 3.2.1-amzn-7 | MapReduce execution engine libraries for running a MapReduce application. | 
| hadoop-yarn-nodemanager | 3.2.1-amzn-7 | YARN service for managing containers on an individual node. | 
| hadoop-yarn-resourcemanager | 3.2.1-amzn-7 | YARN service for allocating and managing cluster resources and distributed applications. | 
| hadoop-yarn-timeline-server | 3.2.1-amzn-7 | Service for retrieving current and historical information for YARN applications. | 
| hbase-hmaster | 2.4.4-amzn-3 | Service for an HBase cluster responsible for coordination of Regions and execution of administrative commands. | 
| hbase-region-server | 2.4.4-amzn-3 | Service for serving one or more HBase regions. | 
| hbase-client | 2.4.4-amzn-3 | HBase command-line client. | 
| hbase-rest-server | 2.4.4-amzn-3 | Service providing a RESTful HTTP endpoint for HBase. | 
| hbase-thrift-server | 2.4.4-amzn-3 | Service providing a Thrift endpoint to HBase. | 
| hbase-operator-tools | 2.4.4-amzn-3 | Repair tool for Apache HBase clusters. | 
| hcatalog-client | 3.1.3-amzn-0 | The 'hcat' command line client for manipulating hcatalog-server. | 
| hcatalog-server | 3.1.3-amzn-0 | Service providing HCatalog, a table and storage management layer for distributed applications. | 
| hcatalog-webhcat-server | 3.1.3-amzn-0 | HTTP endpoint providing a REST interface to HCatalog. | 
| hive-client | 3.1.3-amzn-0 | Hive command line client. | 
| hive-hbase | 3.1.3-amzn-0 | Hive-hbase client. | 
| hive-metastore-server | 3.1.3-amzn-0 | Service for accessing the Hive metastore, a semantic repository storing metadata for SQL on Hadoop operations. | 
| hive-server2 | 3.1.3-amzn-0 | Service for accepting Hive queries as web requests. | 
| hudi | 0.11.0-amzn-0 | Incremental processing framework to power data pipeline at low latency and high efficiency. | 
| hudi-presto | 0.11.0-amzn-0 | Bundle library for running Presto with Hudi. | 
| hudi-trino | 0.11.0-amzn-0 | Bundle library for running Trino with Hudi. | 
| hudi-spark | 0.11.0-amzn-0 | Bundle library for running Spark with Hudi. | 
| hue-server | 4.10.0 | Web application for analyzing data using Hadoop ecosystem applications | 
| iceberg | 0.13.1-amzn-0 | Apache Iceberg is an open table format for huge analytic datasets | 
| jupyterhub | 1.4.1 | Multi-user server for Jupyter notebooks | 
| livy-server | 0.7.1-incubating | REST interface for interacting with Apache Spark | 
| nginx | 1.12.1 | nginx [engine x] is an HTTP and reverse proxy server | 
| mxnet | 1.8.0 | A flexible, scalable, and efficient library for deep learning. | 
| mariadb-server | 5.5.68+ | MariaDB database server. | 
| nvidia-cuda | 11.0.194 | Nvidia drivers and Cuda toolkit | 
| oozie-client | 5.2.1 | Oozie command-line client. | 
| oozie-server | 5.2.1 | Service for accepting Oozie workflow requests. | 
| opencv | 4.5.0 | Open Source Computer Vision Library. | 
| phoenix-library | 5.1.2 | The phoenix libraries for server and client | 
| phoenix-connectors | 5.1.2 | Apache Phoenix-Connectors for Spark-3 | 
| phoenix-query-server | 5.1.2 | A light weight server providing JDBC access as well as Protocol Buffers and JSON format access to the Avatica API  | 
| presto-coordinator | 0.272-amzn-0 | Service for accepting queries and managing query execution among presto-workers. | 
| presto-worker | 0.272-amzn-0 | Service for executing pieces of a query. | 
| presto-client | 0.272-amzn-0 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| trino-coordinator | 378-amzn-0 | Service for accepting queries and managing query execution among trino-workers. | 
| trino-worker | 378-amzn-0 | Service for executing pieces of a query. | 
| trino-client | 378-amzn-0 | Trino command-line client which is installed on an HA cluster's stand-by masters where Trino server is not started. | 
| pig-client | 0.17.0 | Pig command-line client. | 
| r | 4.0.2 | The R Project for Statistical Computing | 
| ranger-kms-server | 2.0.0 | Apache Ranger Key Management System | 
| spark-client | 3.2.1-amzn-0 | Spark command-line clients. | 
| spark-history-server | 3.2.1-amzn-0 | Web UI for viewing logged events for the lifetime of a completed Spark application. | 
| spark-on-yarn | 3.2.1-amzn-0 | In-memory execution engine for YARN. | 
| spark-yarn-slave | 3.2.1-amzn-0 | Apache Spark libraries needed by YARN slaves. | 
| spark-rapids | 22.02.0-amzn-1 | Nvidia Spark RAPIDS plugin that accelerates Apache Spark with GPUs. | 
| sqoop-client | 1.4.7 | Apache Sqoop command-line client. | 
| tensorflow | 2.4.1 | TensorFlow open source software library for high performance numerical computation. | 
| tez-on-yarn | 0.9.2 | The tez YARN application and libraries. | 
| webserver | 2.4.41+ | Apache HTTP server. | 
| zeppelin-server | 0.10.0 | Web-based notebook that enables interactive data analytics. | 
| zookeeper-server | 3.5.7 | Centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. | 
| zookeeper-client | 3.5.7 | ZooKeeper command line client. | 

## 6.7.0 configuration classifications
<a name="emr-670-class"></a>

Configuration classifications allow you to customize applications. These often correspond to a configuration XML file for the application, such as `hive-site.xml`. For more information, see [Configure applications](emr-configure-apps.md).

Reconfiguration actions occur when you specify a configuration for instance groups in a running cluster. Amazon EMR only initiates reconfiguration actions for the classifications that you modify. For more information, see [Reconfigure an instance group in a running cluster](emr-configure-apps-running-cluster.md).
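As a rough illustration of what a reconfiguration request carries, the sketch below builds one instance group entry holding a single classification. The helper name `build_instance_group_reconfiguration`, the instance group ID `ig-EXAMPLE`, and the `hive.exec.parallel` property are placeholders chosen for this example. The `InstanceGroupId`, `Configurations`, `Classification`, and `Properties` keys follow the shape the Amazon EMR API is understood to accept for modifying instance groups. Check the linked topic before relying on it.

```python
import json
from typing import Dict


def build_instance_group_reconfiguration(
    instance_group_id: str, classification: str, properties: Dict[str, str]
) -> dict:
    """Build one instance group entry with a single configuration classification.

    Property names and values must be strings, as in a configuration JSON file.
    """
    for key, value in properties.items():
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError(f"property {key!r} must map a string to a string")
    return {
        "InstanceGroupId": instance_group_id,
        "Configurations": [
            {"Classification": classification, "Properties": dict(properties)}
        ],
    }


if __name__ == "__main__":
    # Placeholder ID and property; only the modified classification is listed.
    entry = build_instance_group_reconfiguration(
        "ig-EXAMPLE", "hive-site", {"hive.exec.parallel": "true"}
    )
    print(json.dumps([entry], indent=2))
```

Because Amazon EMR only initiates actions for the classifications that you modify, a request like this one would be expected to trigger only the `hive-site` actions listed in the table that follows.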


**emr-6.7.0 classifications**  

| Classifications | Description | Reconfiguration Actions | 
| --- | --- | --- | 
| capacity-scheduler | Change values in Hadoop's capacity-scheduler.xml file. | Restarts the ResourceManager service. | 
| container-executor | Change values in Hadoop YARN's container-executor.cfg file. | Not available. | 
| container-log4j | Change values in Hadoop YARN's container-log4j.properties file. | Not available. | 
| core-site | Change values in Hadoop's core-site.xml file. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Ranger KMS, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| docker-conf | Change docker related settings. | Not available. | 
| emrfs-site | Change EMRFS settings. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts HBaseRegionserver, HBaseMaster, HBaseThrift, HBaseRest, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| flink-conf | Change flink-conf.yaml settings. | Restarts Flink history server. | 
| flink-log4j | Change Flink log4j.properties settings. | Restarts Flink history server. | 
| flink-log4j-session | Change Flink log4j-session.properties settings for Kubernetes/Yarn session. | Restarts Flink history server. | 
| flink-log4j-cli | Change Flink log4j-cli.properties settings. | Restarts Flink history server. | 
| hadoop-env | Change values in the Hadoop environment for all Hadoop components. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts PhoenixQueryserver, HiveServer2, Hive MetaStore, and MapReduce-HistoryServer. | 
| hadoop-log4j | Change values in Hadoop's log4j.properties file. | Restarts the Hadoop HDFS services SecondaryNamenode, Datanode, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| hadoop-ssl-server | Change hadoop ssl server configuration | Not available. | 
| hadoop-ssl-client | Change hadoop ssl client configuration | Not available. | 
| hbase | Amazon EMR-curated settings for Apache HBase. | Custom EMR specific property. Sets emrfs-site and hbase-site configs. See those for their associated restarts. | 
| hbase-env | Change values in HBase's environment. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-log4j | Change values in HBase's hbase-log4j.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-metrics | Change values in HBase's hadoop-metrics2-hbase.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-policy | Change values in HBase's hbase-policy.xml file. | Not available. | 
| hbase-site | Change values in HBase's hbase-site.xml file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. Additionally restarts Phoenix QueryServer. | 
| hdfs-encryption-zones | Configure HDFS encryption zones. | This classification should not be reconfigured. | 
| hdfs-env | Change values in the HDFS environment. | Restarts Hadoop HDFS services Namenode, Datanode, and ZKFC. | 
| hdfs-site | Change values in HDFS's hdfs-site.xml. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Additionally restarts Hadoop Httpfs. | 
| hcatalog-env | Change values in HCatalog's environment. | Restarts Hive HCatalog Server. | 
| hcatalog-server-jndi | Change values in HCatalog's jndi.properties. | Restarts Hive HCatalog Server. | 
| hcatalog-server-proto-hive-site | Change values in HCatalog's proto-hive-site.xml. | Restarts Hive HCatalog Server. | 
| hcatalog-webhcat-env | Change values in HCatalog WebHCat's environment. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-log4j2 | Change values in HCatalog WebHCat's log4j2.properties. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-site | Change values in HCatalog WebHCat's webhcat-site.xml file. | Restarts Hive WebHCat server. | 
| hive | Amazon EMR-curated settings for Apache Hive. | Sets configurations to launch Hive LLAP service. | 
| hive-beeline-log4j2 | Change values in Hive's beeline-log4j2.properties file. | Not available. | 
| hive-parquet-logging | Change values in Hive's parquet-logging.properties file. | Not available. | 
| hive-env | Change values in the Hive environment. | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. | 
| hive-exec-log4j2 | Change values in Hive's hive-exec-log4j2.properties file. | Not available. | 
| hive-llap-daemon-log4j2 | Change values in Hive's llap-daemon-log4j2.properties file. | Not available. | 
| hive-log4j2 | Change values in Hive's hive-log4j2.properties file. | Not available. | 
| hive-site | Change values in Hive's hive-site.xml file | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. Also restarts Oozie and Zeppelin. | 
| hiveserver2-site | Change values in Hive Server2's hiveserver2-site.xml file | Not available. | 
| hue-ini | Change values in Hue's ini file | Restarts Hue. Also activates Hue config override CLI commands to pick up new configurations. | 
| httpfs-env | Change values in the HTTPFS environment. | Restarts Hadoop Httpfs service. | 
| httpfs-site | Change values in Hadoop's httpfs-site.xml file. | Restarts Hadoop Httpfs service. | 
| hadoop-kms-acls | Change values in Hadoop's kms-acls.xml file. | Not available. | 
| hadoop-kms-env | Change values in the Hadoop KMS environment. | Restarts Hadoop-KMS service. | 
| hadoop-kms-log4j | Change values in Hadoop's kms-log4j.properties file. | Not available. | 
| hadoop-kms-site | Change values in Hadoop's kms-site.xml file. | Restarts Hadoop-KMS and Ranger-KMS service. | 
| hudi-env | Change values in the Hudi environment. | Not available. | 
| hudi-defaults | Change values in Hudi's hudi-defaults.conf file. | Not available. | 
| iceberg-defaults | Change values in Iceberg's iceberg-defaults.conf file. | Not available. | 
| jupyter-notebook-conf | Change values in Jupyter Notebook's jupyter_notebook_config.py file. | Not available. | 
| jupyter-hub-conf | Change values in JupyterHub's jupyterhub_config.py file. | Not available. | 
| jupyter-s3-conf | Configure Jupyter Notebook S3 persistence. | Not available. | 
| jupyter-sparkmagic-conf | Change values in Sparkmagic's config.json file. | Not available. | 
| livy-conf | Change values in Livy's livy.conf file. | Restarts Livy Server. | 
| livy-env | Change values in the Livy environment. | Restarts Livy Server. | 
| livy-log4j | Change Livy log4j.properties settings. | Restarts Livy Server. | 
| mapred-env | Change values in the MapReduce application's environment. | Restarts Hadoop MapReduce-HistoryServer. | 
| mapred-site | Change values in the MapReduce application's mapred-site.xml file. | Restarts Hadoop MapReduce-HistoryServer. | 
| oozie-env | Change values in Oozie's environment. | Restarts Oozie. | 
| oozie-log4j | Change values in Oozie's oozie-log4j.properties file. | Restarts Oozie. | 
| oozie-site | Change values in Oozie's oozie-site.xml file. | Restarts Oozie. | 
| phoenix-hbase-metrics | Change values in Phoenix's hadoop-metrics2-hbase.properties file. | Not available. | 
| phoenix-hbase-site | Change values in Phoenix's hbase-site.xml file. | Not available. | 
| phoenix-log4j | Change values in Phoenix's log4j.properties file. | Restarts Phoenix-QueryServer. | 
| phoenix-metrics | Change values in Phoenix's hadoop-metrics2-phoenix.properties file. | Not available. | 
| pig-env | Change values in the Pig environment. | Not available. | 
| pig-properties | Change values in Pig's pig.properties file. | Restarts Oozie. | 
| pig-log4j | Change values in Pig's log4j.properties file. | Not available. | 
| presto-log | Change values in Presto's log.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-config | Change values in Presto's config.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-password-authenticator | Change values in Presto's password-authenticator.properties file. | Not available. | 
| presto-env | Change values in Presto's presto-env.sh file. | Restarts Presto-Server (for PrestoDB) | 
| presto-node | Change values in Presto's node.properties file. | Not available. | 
| presto-connector-blackhole | Change values in Presto's blackhole.properties file. | Not available. | 
| presto-connector-cassandra | Change values in Presto's cassandra.properties file. | Not available. | 
| presto-connector-hive | Change values in Presto's hive.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-jmx | Change values in Presto's jmx.properties file. | Not available. | 
| presto-connector-kafka | Change values in Presto's kafka.properties file. | Not available. | 
| presto-connector-localfile | Change values in Presto's localfile.properties file. | Not available. | 
| presto-connector-memory | Change values in Presto's memory.properties file. | Not available. | 
| presto-connector-mongodb | Change values in Presto's mongodb.properties file. | Not available. | 
| presto-connector-mysql | Change values in Presto's mysql.properties file. | Not available. | 
| presto-connector-postgresql | Change values in Presto's postgresql.properties file. | Not available. | 
| presto-connector-raptor | Change values in Presto's raptor.properties file. | Not available. | 
| presto-connector-redis | Change values in Presto's redis.properties file. | Not available. | 
| presto-connector-redshift | Change values in Presto's redshift.properties file. | Not available. | 
| presto-connector-tpch | Change values in Presto's tpch.properties file. | Not available. | 
| presto-connector-tpcds | Change values in Presto's tpcds.properties file. | Not available. | 
| trino-log | Change values in Trino's log.properties file. | Restarts Trino-Server (for Trino) | 
| trino-config | Change values in Trino's config.properties file. | Restarts Trino-Server (for Trino) | 
| trino-password-authenticator | Change values in Trino's password-authenticator.properties file. | Restarts Trino-Server (for Trino) | 
| trino-env | Change values in Trino's trino-env.sh file. | Restarts Trino-Server (for Trino) | 
| trino-node | Change values in Trino's node.properties file. | Not available. | 
| trino-connector-blackhole | Change values in Trino's blackhole.properties file. | Not available. | 
| trino-connector-cassandra | Change values in Trino's cassandra.properties file. | Not available. | 
| trino-connector-hive | Change values in Trino's hive.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-iceberg | Change values in Trino's iceberg.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-jmx | Change values in Trino's jmx.properties file. | Not available. | 
| trino-connector-kafka | Change values in Trino's kafka.properties file. | Not available. | 
| trino-connector-localfile | Change values in Trino's localfile.properties file. | Not available. | 
| trino-connector-memory | Change values in Trino's memory.properties file. | Not available. | 
| trino-connector-mongodb | Change values in Trino's mongodb.properties file. | Not available. | 
| trino-connector-mysql | Change values in Trino's mysql.properties file. | Not available. | 
| trino-connector-postgresql | Change values in Trino's postgresql.properties file. | Not available. | 
| trino-connector-raptor | Change values in Trino's raptor.properties file. | Not available. | 
| trino-connector-redis | Change values in Trino's redis.properties file. | Not available. | 
| trino-connector-redshift | Change values in Trino's redshift.properties file. | Not available. | 
| trino-connector-tpch | Change values in Trino's tpch.properties file. | Not available. | 
| trino-connector-tpcds | Change values in Trino's tpcds.properties file. | Not available. | 
| ranger-kms-dbks-site | Change values in dbks-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-site | Change values in ranger-kms-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-env | Change values in the Ranger KMS environment. | Restarts Ranger KMS Server. | 
| ranger-kms-log4j | Change values in kms-log4j.properties file of Ranger KMS. | Not available. | 
| ranger-kms-db-ca | Change values for CA file on S3 for MySQL SSL connection with Ranger KMS. | Not available. | 
| spark | Amazon EMR-curated settings for Apache Spark. | This property modifies spark-defaults. See actions there. | 
| spark-defaults | Change values in Spark's spark-defaults.conf file. | Restarts Spark history server and Spark thrift server. | 
| spark-env | Change values in the Spark environment. | Restarts Spark history server and Spark thrift server. | 
| spark-hive-site | Change values in Spark's hive-site.xml file | Not available. | 
| spark-log4j | Change values in Spark's log4j.properties file. | Restarts Spark history server and Spark thrift server. | 
| spark-metrics | Change values in Spark's metrics.properties file. | Restarts Spark history server and Spark thrift server. | 
| sqoop-env | Change values in Sqoop's environment. | Not available. | 
| sqoop-oraoop-site | Change values in Sqoop OraOop's oraoop-site.xml file. | Not available. | 
| sqoop-site | Change values in Sqoop's sqoop-site.xml file. | Not available. | 
| tez-site | Change values in Tez's tez-site.xml file. | Restarts Oozie and HiveServer2. | 
| yarn-env | Change values in the YARN environment. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts MapReduce-HistoryServer. | 
| yarn-site | Change values in YARN's yarn-site.xml file. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Livy Server and MapReduce-HistoryServer. | 
| zeppelin-env | Change values in the Zeppelin environment. | Restarts Zeppelin. | 
| zeppelin-site | Change configuration settings in zeppelin-site.xml. | Restarts Zeppelin. | 
| zookeeper-config | Change values in ZooKeeper's zoo.cfg file. | Restarts Zookeeper server. | 
| zookeeper-log4j | Change values in ZooKeeper's log4j.properties file. | Restarts Zookeeper server. | 

# Amazon EMR release 6.6.0
<a name="emr-660-release"></a>

## 6.6.0 application versions
<a name="emr-660-app-versions"></a>

This release includes the following applications: [Flink](https://flink.apache.org/), [Ganglia](http://ganglia.info), [HBase](http://hbase.apache.org/), [HCatalog](https://cwiki.apache.org/confluence/display/Hive/HCatalog), [Hadoop](http://hadoop.apache.org/docs/current/), [Hive](http://hive.apache.org/), [Hudi](https://hudi.apache.org), [Hue](http://gethue.com/), [Iceberg](https://iceberg.apache.org/), [JupyterEnterpriseGateway](https://jupyter-enterprise-gateway.readthedocs.io/en/latest/), [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/#), [Livy](https://livy.incubator.apache.org/), [MXNet](https://mxnet.incubator.apache.org/), [Oozie](http://oozie.apache.org/), [Phoenix](https://phoenix.apache.org/), [Pig](http://pig.apache.org/), [Presto](https://prestodb.io/), [Spark](https://spark.apache.org/docs/latest/), [Sqoop](http://sqoop.apache.org/), [TensorFlow](https://www.tensorflow.org/), [Tez](https://tez.apache.org/), [Trino](https://trino.io/), [Zeppelin](https://zeppelin.incubator.apache.org/), and [ZooKeeper](https://zookeeper.apache.org).

The table below lists the application versions available in this release of Amazon EMR and the application versions in the preceding three Amazon EMR releases (when applicable).

For a comprehensive history of application versions for each release of Amazon EMR, see the following topics:
+ [Application versions in Amazon EMR 7.x releases](emr-release-app-versions-7.x.md)
+ [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md)
+ [Application versions in Amazon EMR 5.x releases](emr-release-app-versions-5.x.md)
+ [Application versions in Amazon EMR 4.x releases](emr-release-app-versions-4.x.md)


**Application version information**  

|  | emr-6.6.0 | emr-6.5.0 | emr-6.4.0 | emr-6.3.1 | 
| --- | --- | --- | --- | --- | 
| AWS SDK for Java | 1.12.170 | 1.12.31 | 1.12.31 | 1.11.977 | 
| Python | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 
| Scala | 2.12.10 | 2.12.10 | 2.12.10 | 2.12.10 | 
| AmazonCloudWatchAgent |  -  |  -  |  -  |  -  | 
| Delta |  -  |  -  |  -  |  -  | 
| Flink | 1.14.2 | 1.14.0 | 1.13.1 | 1.12.1 | 
| Ganglia | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 
| HBase | 2.4.4-amzn-2 | 2.4.4-amzn-1 | 2.4.4-amzn-0 | 2.2.6-amzn-1 | 
| HCatalog | 3.1.2-amzn-7 | 3.1.2-amzn-6 | 3.1.2-amzn-5 | 3.1.2-amzn-4 | 
| Hadoop | 3.2.1-amzn-6 | 3.2.1-amzn-5 | 3.2.1-amzn-4 | 3.2.1-amzn-3.1 | 
| Hive | 3.1.2-amzn-7 | 3.1.2-amzn-6 | 3.1.2-amzn-5 | 3.1.2-amzn-4 | 
| Hudi | 0.10.1-amzn-0 | 0.9.0-amzn-1 | 0.8.0-amzn-0 | 0.7.0-amzn-0 | 
| Hue | 4.10.0 | 4.9.0 | 4.9.0 | 4.9.0 | 
| Iceberg | 0.13.1 | 0.12.0 |  -  |  -  | 
| JupyterEnterpriseGateway | 2.1.0 | 2.1.0 | 2.1.0 | 2.1.0 | 
| JupyterHub | 1.4.1 | 1.4.1 | 1.4.1 | 1.2.2 | 
| Livy | 0.7.1-incubating | 0.7.1-incubating | 0.7.1-incubating | 0.7.0-incubating | 
| MXNet | 1.8.0 | 1.8.0 | 1.8.0 | 1.7.0 | 
| Mahout |  -  |  -  |  -  |  -  | 
| Oozie | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 
| Phoenix | 5.1.2 | 5.1.2 | 5.1.2 | 5.0.0-HBase-2.0 | 
| Pig | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 
| Presto | 0.267-amzn-0 | 0.261-amzn-0 | 0.254.1-amzn-0 | 0.245.1-amzn-0 | 
| Spark | 3.2.0-amzn-0 | 3.1.2-amzn-1 | 3.1.2-amzn-0 | 3.1.1-amzn-0.1 | 
| Sqoop | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 
| TensorFlow | 2.4.1 | 2.4.1 | 2.4.1 | 2.4.1 | 
| Tez | 0.9.2 | 0.9.2 | 0.9.2 | 0.9.2 | 
| Trino (PrestoSQL) | 367-amzn-0 | 360 | 359 | 350 | 
| Zeppelin | 0.10.0 | 0.10.0 | 0.9.0 | 0.9.0 | 
| ZooKeeper | 3.5.7 | 3.5.7 | 3.5.7 | 3.4.14 | 

## 6.6.0 release notes
<a name="emr-660-relnotes"></a>

The following release notes include information for Amazon EMR release 6.6.0. Changes are relative to 6.5.0.

Initial release date: May 9, 2022

Updated documentation date: June 15, 2022

**New Features**
+ Amazon EMR 6.6 now supports Apache Spark 3.2, Apache Spark RAPIDS 22.02, CUDA 11, Apache Hudi 0.10.1, Apache Iceberg 0.13, Trino 367, and PrestoDB 0.267.
+ When you launch a cluster with *the latest patch release* of Amazon EMR 5.36 or higher, 6.6 or higher, or 7.0 or higher, Amazon EMR uses the latest Amazon Linux 2023 or Amazon Linux 2 release for the default Amazon EMR AMI. For more information, see [Using the default Amazon Linux AMI for Amazon EMR](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-default-ami.html).    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-660-release.html)
+ With Amazon EMR 6.6 and later, applications that use Log4j 1.x and Log4j 2.x are upgraded to use Log4j 1.2.17 (or higher) and Log4j 2.17.1 (or higher) respectively, and do not require using the [bootstrap actions](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-log4j-vulnerability.html) provided to mitigate the CVE issues.
+ **[Managed scaling] Spark shuffle data managed scaling optimization** - For Amazon EMR versions 5.34.0 and later, and EMR versions 6.4.0 and later, managed scaling is now aware of Spark shuffle data, which is the data that Spark redistributes across partitions to perform specific operations. For more information on shuffle operations, see [Using EMR managed scaling in Amazon EMR](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-managed-scaling.html) in the *Amazon EMR Management Guide* and [Spark Programming Guide](https://spark.apache.org/docs/latest/rdd-programming-guide.html#shuffle-operations).
+ Starting with Amazon EMR 5.32.0 and 6.5.0, dynamic executor sizing for Apache Spark is enabled by default. To turn this feature on or off, you can use the `spark.yarn.heterogeneousExecutors.enabled` configuration parameter.
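
As a minimal sketch of the last item above, the following script writes a configuration file that turns dynamic executor sizing off. It assumes that the parameter is supplied through the `spark-defaults` classification, and the file name `spark-executor-sizing.json` is hypothetical.

```shell
#!/bin/bash
set -e

# Hypothetical output file name; adjust to your environment.
CONFIG_FILE="spark-executor-sizing.json"

# Assumption: the parameter is supplied through the spark-defaults
# classification. Use "true" to turn dynamic executor sizing back on.
cat > "${CONFIG_FILE}" <<'EOF'
[
  {
    "Classification": "spark-defaults",
    "Properties": {
      "spark.yarn.heterogeneousExecutors.enabled": "false"
    }
  }
]
EOF

echo "Wrote ${CONFIG_FILE}"
```

You could then pass the file at launch, for example with `aws emr create-cluster --release-label emr-6.6.0 --applications Name=Spark --configurations file://spark-executor-sizing.json`, together with the other options that your cluster requires.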

**Changes, Enhancements, and Resolved Issues**
+ Amazon EMR reduces cluster startup time by up to 80 seconds on average for clusters that use the EMR default AMI option and only install common applications, such as Apache Hadoop, Apache Spark and Apache Hive.

**Known Issues**
+ When you read Apache Phoenix tables through the Apache Spark shell on Amazon EMR release 6.5.0, 6.6.0, or 6.7.0, a `NoSuchMethodError` occurs because Amazon EMR uses an incorrect `Hbase.compat.version`. Amazon EMR release 6.8.0 fixes this issue.
+ When you use the DynamoDB connector with Spark on Amazon EMR versions 6.6.0, 6.7.0, and 6.8.0, all reads from your table return an empty result, even though the input split references non-empty data. This is because Spark 3.2.0 sets `spark.hadoopRDD.ignoreEmptySplits` to `true` by default. As a workaround, explicitly set `spark.hadoopRDD.ignoreEmptySplits` to `false`. Amazon EMR release 6.9.0 fixes this issue.
+ On Trino long-running clusters, Amazon EMR 6.6.0 enables Garbage Collection logging parameters in the Trino jvm.config to get better insights from the Garbage Collection logs. This change appends many Garbage Collection logs to the launcher.log (/var/log/trino/launcher.log) file. If you are running Trino clusters in Amazon EMR 6.6.0, you may encounter nodes running out of disk space after the cluster has been running for a couple of days due to the appended logs.

  The workaround for this issue is to run the script below as a Bootstrap Action to disable the Garbage Collection logging parameters in jvm.config while creating or cloning the cluster for Amazon EMR 6.6.0.

  ```
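  #!/bin/bash
  set -ex
  # Puppet module directory that holds the Trino templates on the node
  PRESTO_PUPPET_DIR='/var/aws/emr/bigtop-deploy/puppet/modules/trino'
  # Delete every line that contains -Xlog (GC logging) from the jvm.config template
  sudo bash -c "sed -i '/-Xlog/d' ${PRESTO_PUPPET_DIR}/templates/jvm.config"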
  ```
+ When you use Spark with Hive partition location formatting to read data in Amazon S3, and you run Spark on Amazon EMR releases 5.30.0 to 5.36.0, and 6.2.0 to 6.9.0, you might encounter an issue that prevents your cluster from reading data correctly. This can happen if your partitions have all of the following characteristics:
  + Two or more partitions are scanned from the same table.
  + At least one partition directory path is a prefix of at least one other partition directory path, for example, `s3://bucket/table/p=a` is a prefix of `s3://bucket/table/p=a b`.
  + The first character that follows the prefix in the other partition directory has a UTF-8 value that’s less than the `/` character (U+002F). For example, the space character (U+0020) that occurs between a and b in `s3://bucket/table/p=a b` falls into this category. Note that there are 14 other non-control characters: `!"#$%&'()*+,-.`. For more information, see [UTF-8 encoding table and Unicode characters](https://www.utf8-chartable.de/).

  As a workaround to this issue, set the `spark.sql.sources.fastS3PartitionDiscovery.enabled` configuration to `false` in the `spark-defaults` classification.
+ With Amazon EMR releases 5.36.0 and 6.6.0 through 6.9.0, `SecretAgent` and `RecordServer` service components may experience log data loss due to an incorrect file name pattern configuration in Log4j2 properties. The incorrect configuration causes the components to generate only one log file per day. When the rotation strategy occurs, it overwrites the existing file instead of generating a new log file as expected. As a workaround, use a bootstrap action to generate log files each hour and append an auto-increment integer in the file name to handle the rotation.

  For Amazon EMR 6.6.0 through 6.9.0 releases, use the following bootstrap action when you launch a cluster. 

  ```
  --bootstrap-actions "Path=s3://emr-data-access-control-us-east-1/customer-bootstrap-actions/log-rotation-emr-6x/replace-puppet.sh,Args=[]"
  ```

  For Amazon EMR 5.36.0, use the following bootstrap action when you launch a cluster.

  ```
  --bootstrap-actions "Path=s3://emr-data-access-control-us-east-1/customer-bootstrap-actions/log-rotation-emr-5x/replace-puppet.sh,Args=[]"
  ```
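
The partition characteristics described in the Hive partition location issue above can be checked for a pair of paths with a short shell function. This is only a sketch: the function name `is_affected_pair` is hypothetical, it compares two paths at a time, and it does not inspect your table or Amazon S3.

```shell
# Returns 0 (true) when path $2 extends path $1 and the first character
# after the shared prefix sorts before '/' (U+002F).
# The function name is hypothetical and only illustrates the conditions.
is_affected_pair() {
  prefix="$1"
  other="$2"

  # $1 must be a strict prefix of $2 (at least one character follows it).
  case "${other}" in
    "${prefix}"?*) ;;
    *) return 1 ;;
  esac

  # Isolate the first character that follows the prefix.
  rest="${other#"${prefix}"}"
  next_char="${rest%"${rest#?}"}"

  # Convert the character to its numeric code and compare with '/' (47).
  code=$(printf '%d' "'${next_char}")
  [ "${code}" -lt 47 ]
}

# The pair from the example above: a space follows the prefix.
if is_affected_pair "s3://bucket/table/p=a" "s3://bucket/table/p=a b"; then
  echo "These partition paths match the described pattern."
fi
```

If a pair of scanned partitions matches, the `spark.sql.sources.fastS3PartitionDiscovery.enabled` workaround described above is the relevant setting to review.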

## 6.6.0 component versions
<a name="emr-660-components"></a>

The components that Amazon EMR installs with this release are listed below. Some are installed as part of big-data application packages. Others are unique to Amazon EMR and installed for system processes and features. These typically start with `emr` or `aws`. Big-data application packages in the most recent Amazon EMR release are usually the latest version found in the community. We make community releases available in Amazon EMR as quickly as possible.

Some components in Amazon EMR differ from community versions. These components have a version label in the form `CommunityVersion-amzn-EmrVersion`. The `EmrVersion` starts at 0. For example, if an open source community component named `myapp-component` with version 2.2 has been modified three times for inclusion in different Amazon EMR releases, its release version is listed as `2.2-amzn-2`.
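
To make the label format concrete, the following sketch splits a version label into its community version and its Amazon EMR revision counter. The function name `split_version_label` and the output format are hypothetical, chosen only for illustration.

```shell
# Splits a component version label of the form
# CommunityVersion-amzn-EmrVersion into its two parts.
# The function name and output format are hypothetical.
split_version_label() {
  label="$1"
  case "${label}" in
    *-amzn-*)
      # Text before "-amzn-" is the community version; text after it
      # is the Amazon EMR revision counter, which starts at 0.
      echo "community=${label%%-amzn-*} emr=${label#*-amzn-}"
      ;;
    *)
      # The label carries no "-amzn-" suffix.
      echo "community=${label} emr=none"
      ;;
  esac
}

split_version_label "2.2-amzn-2"     # the myapp-component example above
split_version_label "3.2.1-amzn-6"   # hadoop-client in this release
split_version_label "1.14.2"         # flink-client in this release
```

Because the counter starts at 0, an `emr` value of 2 corresponds to the third modification, which matches the `myapp-component` example.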


| Component | Version | Description | 
| --- | --- | --- | 
| aws-sagemaker-spark-sdk | 1.4.1 | Amazon SageMaker Spark SDK | 
| emr-ddb | 4.16.0 | Amazon DynamoDB connector for Hadoop ecosystem applications. | 
| emr-goodies | 3.2.0 | Extra convenience libraries for the Hadoop ecosystem. | 
| emr-kinesis | 3.5.0 | Amazon Kinesis connector for Hadoop ecosystem applications. | 
| emr-notebook-env | 1.5.0 | Conda env for emr notebook which includes jupyter enterprise gateway | 
| emr-s3-dist-cp | 2.20.0 | Distributed copy application optimized for Amazon S3. | 
| emr-s3-select | 2.1.0 | EMR S3Select Connector | 
| emrfs | 2.50.0 | Amazon S3 connector for Hadoop ecosystem applications. | 
| flink-client | 1.14.2 | Apache Flink command line client scripts and applications. | 
| flink-jobmanager-config | 1.14.2 | Managing resources on EMR nodes for Apache Flink JobManager. | 
| ganglia-monitor | 3.7.2 | Embedded Ganglia agent for Hadoop ecosystem applications along with the Ganglia monitoring agent. | 
| ganglia-metadata-collector | 3.7.2 | Ganglia metadata collector for aggregating metrics from Ganglia monitoring agents. | 
| ganglia-web | 3.7.1 | Web application for viewing metrics collected by the Ganglia metadata collector. | 
| hadoop-client | 3.2.1-amzn-6 | Hadoop command-line clients such as 'hdfs', 'hadoop', or 'yarn'. | 
| hadoop-hdfs-datanode | 3.2.1-amzn-6 | HDFS node-level service for storing blocks. | 
| hadoop-hdfs-library | 3.2.1-amzn-6 | HDFS command-line client and library | 
| hadoop-hdfs-namenode | 3.2.1-amzn-6 | HDFS service for tracking file names and block locations. | 
| hadoop-hdfs-journalnode | 3.2.1-amzn-6 | HDFS service for managing the Hadoop filesystem journal on HA clusters. | 
| hadoop-httpfs-server | 3.2.1-amzn-6 | HTTP endpoint for HDFS operations. | 
| hadoop-kms-server | 3.2.1-amzn-6 | Cryptographic key management server based on Hadoop's KeyProvider API. | 
| hadoop-mapred | 3.2.1-amzn-6 | MapReduce execution engine libraries for running a MapReduce application. | 
| hadoop-yarn-nodemanager | 3.2.1-amzn-6 | YARN service for managing containers on an individual node. | 
| hadoop-yarn-resourcemanager | 3.2.1-amzn-6 | YARN service for allocating and managing cluster resources and distributed applications. | 
| hadoop-yarn-timeline-server | 3.2.1-amzn-6 | Service for retrieving current and historical information for YARN applications. | 
| hbase-hmaster | 2.4.4-amzn-2 | Service for an HBase cluster responsible for coordination of Regions and execution of administrative commands. | 
| hbase-region-server | 2.4.4-amzn-2 | Service for serving one or more HBase regions. | 
| hbase-client | 2.4.4-amzn-2 | HBase command-line client. | 
| hbase-rest-server | 2.4.4-amzn-2 | Service providing a RESTful HTTP endpoint for HBase. | 
| hbase-thrift-server | 2.4.4-amzn-2 | Service providing a Thrift endpoint to HBase. | 
| hbase-operator-tools | 2.4.4-amzn-2 | Repair tool for Apache HBase clusters. | 
| hcatalog-client | 3.1.2-amzn-7 | The 'hcat' command line client for manipulating hcatalog-server. | 
| hcatalog-server | 3.1.2-amzn-7 | Service providing HCatalog, a table and storage management layer for distributed applications. | 
| hcatalog-webhcat-server | 3.1.2-amzn-7 | HTTP endpoint providing a REST interface to HCatalog. | 
| hive-client | 3.1.2-amzn-7 | Hive command line client. | 
| hive-hbase | 3.1.2-amzn-7 | Hive-hbase client. | 
| hive-metastore-server | 3.1.2-amzn-7 | Service for accessing the Hive metastore, a semantic repository storing metadata for SQL on Hadoop operations. | 
| hive-server2 | 3.1.2-amzn-7 | Service for accepting Hive queries as web requests. | 
| hudi | 0.10.1-amzn-0 | Incremental processing framework to power data pipeline at low latency and high efficiency. | 
| hudi-presto | 0.10.1-amzn-0 | Bundle library for running Presto with Hudi. | 
| hudi-trino | 0.10.1-amzn-0 | Bundle library for running Trino with Hudi. | 
| hudi-spark | 0.10.1-amzn-0 | Bundle library for running Spark with Hudi. | 
| hue-server | 4.10.0 | Web application for analyzing data using Hadoop ecosystem applications | 
| iceberg | 0.13.1 | Apache Iceberg is an open table format for huge analytic datasets | 
| jupyterhub | 1.4.1 | Multi-user server for Jupyter notebooks | 
| livy-server | 0.7.1-incubating | REST interface for interacting with Apache Spark | 
| nginx | 1.12.1 | nginx [engine x] is an HTTP and reverse proxy server | 
| mxnet | 1.8.0 | A flexible, scalable, and efficient library for deep learning. | 
| mariadb-server | 5.5.68+ | MariaDB database server. | 
| nvidia-cuda | 11.0.194 | Nvidia drivers and Cuda toolkit | 
| oozie-client | 5.2.1 | Oozie command-line client. | 
| oozie-server | 5.2.1 | Service for accepting Oozie workflow requests. | 
| opencv | 4.5.0 | Open Source Computer Vision Library. | 
| phoenix-library | 5.1.2 | The phoenix libraries for server and client | 
| phoenix-connectors | 5.1.2 | Apache Phoenix-Connectors for Spark-3 | 
| phoenix-query-server | 5.1.2 | A light weight server providing JDBC access as well as Protocol Buffers and JSON format access to the Avatica API  | 
| presto-coordinator | 0.267-amzn-0 | Service for accepting queries and managing query execution among presto-workers. | 
| presto-worker | 0.267-amzn-0 | Service for executing pieces of a query. | 
| presto-client | 0.267-amzn-0 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| trino-coordinator | 367-amzn-0 | Service for accepting queries and managing query execution among trino-workers. | 
| trino-worker | 367-amzn-0 | Service for executing pieces of a query. | 
| trino-client | 367-amzn-0 | Trino command-line client which is installed on an HA cluster's stand-by masters where Trino server is not started. | 
| pig-client | 0.17.0 | Pig command-line client. | 
| r | 4.0.2 | The R Project for Statistical Computing | 
| ranger-kms-server | 2.0.0 | Apache Ranger Key Management System | 
| spark-client | 3.2.0-amzn-0 | Spark command-line clients. | 
| spark-history-server | 3.2.0-amzn-0 | Web UI for viewing logged events for the lifetime of a completed Spark application. | 
| spark-on-yarn | 3.2.0-amzn-0 | In-memory execution engine for YARN. | 
| spark-yarn-slave | 3.2.0-amzn-0 | Apache Spark libraries needed by YARN slaves. | 
| spark-rapids | 22.02.0-amzn-0 | Nvidia Spark RAPIDS plugin that accelerates Apache Spark with GPUs. | 
| sqoop-client | 1.4.7 | Apache Sqoop command-line client. | 
| tensorflow | 2.4.1 | TensorFlow open source software library for high performance numerical computation. | 
| tez-on-yarn | 0.9.2 | The tez YARN application and libraries. | 
| webserver | 2.4.41+ | Apache HTTP server. | 
| zeppelin-server | 0.10.0 | Web-based notebook that enables interactive data analytics. | 
| zookeeper-server | 3.5.7 | Centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. | 
| zookeeper-client | 3.5.7 | ZooKeeper command line client. | 

## 6.6.0 configuration classifications
<a name="emr-660-class"></a>

Configuration classifications allow you to customize applications. These often correspond to a configuration XML file for the application, such as `hive-site.xml`. For more information, see [Configure applications](emr-configure-apps.md).

Reconfiguration actions occur when you specify a configuration for instance groups in a running cluster. Amazon EMR only initiates reconfiguration actions for the classifications that you modify. For more information, see [Reconfigure an instance group in a running cluster](emr-configure-apps-running-cluster.md).
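
As an illustrative sketch of a reconfiguration request, the following script writes an instance group configuration file that modifies a single classification. The instance group ID `ig-EXAMPLE` and the file name `instanceGroups.json` are placeholders, and the property shown is the Spark partition discovery workaround from the 6.6.0 release notes, used here only as an example value.

```shell
#!/bin/bash
set -e

# Placeholder values; replace them with your own instance group ID.
INSTANCE_GROUP_ID="ig-EXAMPLE"
OUTPUT_FILE="instanceGroups.json"

# Only the spark-defaults classification is listed, so only the
# reconfiguration action for that classification would be initiated.
cat > "${OUTPUT_FILE}" <<EOF
[
  {
    "InstanceGroupId": "${INSTANCE_GROUP_ID}",
    "Configurations": [
      {
        "Classification": "spark-defaults",
        "Properties": {
          "spark.sql.sources.fastS3PartitionDiscovery.enabled": "false"
        }
      }
    ]
  }
]
EOF

echo "Wrote ${OUTPUT_FILE}"
```

You could submit the file to a running cluster with `aws emr modify-instance-groups --cluster-id <cluster-id> --instance-groups file://instanceGroups.json`. According to the table below, modifying `spark-defaults` restarts the Spark history server and Spark thrift server.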


**emr-6.6.0 classifications**  

| Classifications | Description | Reconfiguration Actions | 
| --- | --- | --- | 
| capacity-scheduler | Change values in Hadoop's capacity-scheduler.xml file. | Restarts the ResourceManager service. | 
| container-executor | Change values in Hadoop YARN's container-executor.cfg file. | Not available. | 
| container-log4j | Change values in Hadoop YARN's container-log4j.properties file. | Not available. | 
| core-site | Change values in Hadoop's core-site.xml file. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Ranger KMS, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| docker-conf | Change docker related settings. | Not available. | 
| emrfs-site | Change EMRFS settings. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts HBaseRegionserver, HBaseMaster, HBaseThrift, HBaseRest, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| flink-conf | Change flink-conf.yaml settings. | Restarts Flink history server. | 
| flink-log4j | Change Flink log4j.properties settings. | Restarts Flink history server. | 
| flink-log4j-session | Change Flink log4j-session.properties settings for Kubernetes/Yarn session. | Restarts Flink history server. | 
| flink-log4j-cli | Change Flink log4j-cli.properties settings. | Restarts Flink history server. | 
| hadoop-env | Change values in the Hadoop environment for all Hadoop components. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts PhoenixQueryserver, HiveServer2, Hive MetaStore, and MapReduce-HistoryServer. | 
| hadoop-log4j | Change values in Hadoop's log4j.properties file. | Restarts the Hadoop HDFS services SecondaryNamenode, Datanode, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| hadoop-ssl-server | Change hadoop ssl server configuration | Not available. | 
| hadoop-ssl-client | Change hadoop ssl client configuration | Not available. | 
| hbase | Amazon EMR-curated settings for Apache HBase. | Custom EMR specific property. Sets emrfs-site and hbase-site configs. See those for their associated restarts. | 
| hbase-env | Change values in HBase's environment. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-log4j | Change values in HBase's hbase-log4j.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-metrics | Change values in HBase's hadoop-metrics2-hbase.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-policy | Change values in HBase's hbase-policy.xml file. | Not available. | 
| hbase-site | Change values in HBase's hbase-site.xml file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. Additionally restarts Phoenix QueryServer. | 
| hdfs-encryption-zones | Configure HDFS encryption zones. | This classification should not be reconfigured. | 
| hdfs-env | Change values in the HDFS environment. | Restarts Hadoop HDFS services Namenode, Datanode, and ZKFC. | 
| hdfs-site | Change values in HDFS's hdfs-site.xml. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Additionally restarts Hadoop Httpfs. | 
| hcatalog-env | Change values in HCatalog's environment. | Restarts Hive HCatalog Server. | 
| hcatalog-server-jndi | Change values in HCatalog's jndi.properties. | Restarts Hive HCatalog Server. | 
| hcatalog-server-proto-hive-site | Change values in HCatalog's proto-hive-site.xml. | Restarts Hive HCatalog Server. | 
| hcatalog-webhcat-env | Change values in HCatalog WebHCat's environment. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-log4j2 | Change values in HCatalog WebHCat's log4j2.properties. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-site | Change values in HCatalog WebHCat's webhcat-site.xml file. | Restarts Hive WebHCat server. | 
| hive | Amazon EMR-curated settings for Apache Hive. | Sets configurations to launch Hive LLAP service. | 
| hive-beeline-log4j2 | Change values in Hive's beeline-log4j2.properties file. | Not available. | 
| hive-parquet-logging | Change values in Hive's parquet-logging.properties file. | Not available. | 
| hive-env | Change values in the Hive environment. | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. | 
| hive-exec-log4j2 | Change values in Hive's hive-exec-log4j2.properties file. | Not available. | 
| hive-llap-daemon-log4j2 | Change values in Hive's llap-daemon-log4j2.properties file. | Not available. | 
| hive-log4j2 | Change values in Hive's hive-log4j2.properties file. | Not available. | 
| hive-site | Change values in Hive's hive-site.xml file | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. Also restarts Oozie and Zeppelin. | 
| hiveserver2-site | Change values in Hive Server2's hiveserver2-site.xml file | Not available. | 
| hue-ini | Change values in Hue's ini file | Restarts Hue. Also activates Hue config override CLI commands to pick up new configurations. | 
| httpfs-env | Change values in the HTTPFS environment. | Restarts Hadoop Httpfs service. | 
| httpfs-site | Change values in Hadoop's httpfs-site.xml file. | Restarts Hadoop Httpfs service. | 
| hadoop-kms-acls | Change values in Hadoop's kms-acls.xml file. | Not available. | 
| hadoop-kms-env | Change values in the Hadoop KMS environment. | Restarts Hadoop-KMS service. | 
| hadoop-kms-log4j | Change values in Hadoop's kms-log4j.properties file. | Not available. | 
| hadoop-kms-site | Change values in Hadoop's kms-site.xml file. | Restarts Hadoop-KMS and Ranger-KMS service. | 
| hudi-env | Change values in the Hudi environment. | Not available. | 
| hudi-defaults | Change values in Hudi's hudi-defaults.conf file. | Not available. | 
| iceberg-defaults | Change values in Iceberg's iceberg-defaults.conf file. | Not available. | 
| jupyter-notebook-conf | Change values in Jupyter Notebook's jupyter_notebook_config.py file. | Not available. | 
| jupyter-hub-conf | Change values in JupyterHub's jupyterhub_config.py file. | Not available. | 
| jupyter-s3-conf | Configure Jupyter Notebook S3 persistence. | Not available. | 
| jupyter-sparkmagic-conf | Change values in Sparkmagic's config.json file. | Not available. | 
| livy-conf | Change values in Livy's livy.conf file. | Restarts Livy Server. | 
| livy-env | Change values in the Livy environment. | Restarts Livy Server. | 
| livy-log4j | Change Livy log4j.properties settings. | Restarts Livy Server. | 
| mapred-env | Change values in the MapReduce application's environment. | Restarts Hadoop MapReduce-HistoryServer. | 
| mapred-site | Change values in the MapReduce application's mapred-site.xml file. | Restarts Hadoop MapReduce-HistoryServer. | 
| oozie-env | Change values in Oozie's environment. | Restarts Oozie. | 
| oozie-log4j | Change values in Oozie's oozie-log4j.properties file. | Restarts Oozie. | 
| oozie-site | Change values in Oozie's oozie-site.xml file. | Restarts Oozie. | 
| phoenix-hbase-metrics | Change values in Phoenix's hadoop-metrics2-hbase.properties file. | Not available. | 
| phoenix-hbase-site | Change values in Phoenix's hbase-site.xml file. | Not available. | 
| phoenix-log4j | Change values in Phoenix's log4j.properties file. | Restarts Phoenix-QueryServer. | 
| phoenix-metrics | Change values in Phoenix's hadoop-metrics2-phoenix.properties file. | Not available. | 
| pig-env | Change values in the Pig environment. | Not available. | 
| pig-properties | Change values in Pig's pig.properties file. | Restarts Oozie. | 
| pig-log4j | Change values in Pig's log4j.properties file. | Not available. | 
| presto-log | Change values in Presto's log.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-config | Change values in Presto's config.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-password-authenticator | Change values in Presto's password-authenticator.properties file. | Not available. | 
| presto-env | Change values in Presto's presto-env.sh file. | Restarts Presto-Server (for PrestoDB) | 
| presto-node | Change values in Presto's node.properties file. | Not available. | 
| presto-connector-blackhole | Change values in Presto's blackhole.properties file. | Not available. | 
| presto-connector-cassandra | Change values in Presto's cassandra.properties file. | Not available. | 
| presto-connector-hive | Change values in Presto's hive.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-jmx | Change values in Presto's jmx.properties file. | Not available. | 
| presto-connector-kafka | Change values in Presto's kafka.properties file. | Not available. | 
| presto-connector-localfile | Change values in Presto's localfile.properties file. | Not available. | 
| presto-connector-memory | Change values in Presto's memory.properties file. | Not available. | 
| presto-connector-mongodb | Change values in Presto's mongodb.properties file. | Not available. | 
| presto-connector-mysql | Change values in Presto's mysql.properties file. | Not available. | 
| presto-connector-postgresql | Change values in Presto's postgresql.properties file. | Not available. | 
| presto-connector-raptor | Change values in Presto's raptor.properties file. | Not available. | 
| presto-connector-redis | Change values in Presto's redis.properties file. | Not available. | 
| presto-connector-redshift | Change values in Presto's redshift.properties file. | Not available. | 
| presto-connector-tpch | Change values in Presto's tpch.properties file. | Not available. | 
| presto-connector-tpcds | Change values in Presto's tpcds.properties file. | Not available. | 
| trino-log | Change values in Trino's log.properties file. | Restarts Trino-Server (for Trino) | 
| trino-config | Change values in Trino's config.properties file. | Restarts Trino-Server (for Trino) | 
| trino-password-authenticator | Change values in Trino's password-authenticator.properties file. | Restarts Trino-Server (for Trino) | 
| trino-env | Change values in Trino's trino-env.sh file. | Restarts Trino-Server (for Trino) | 
| trino-node | Change values in Trino's node.properties file. | Not available. | 
| trino-connector-blackhole | Change values in Trino's blackhole.properties file. | Not available. | 
| trino-connector-cassandra | Change values in Trino's cassandra.properties file. | Not available. | 
| trino-connector-hive | Change values in Trino's hive.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-iceberg | Change values in Trino's iceberg.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-jmx | Change values in Trino's jmx.properties file. | Not available. | 
| trino-connector-kafka | Change values in Trino's kafka.properties file. | Not available. | 
| trino-connector-localfile | Change values in Trino's localfile.properties file. | Not available. | 
| trino-connector-memory | Change values in Trino's memory.properties file. | Not available. | 
| trino-connector-mongodb | Change values in Trino's mongodb.properties file. | Not available. | 
| trino-connector-mysql | Change values in Trino's mysql.properties file. | Not available. | 
| trino-connector-postgresql | Change values in Trino's postgresql.properties file. | Not available. | 
| trino-connector-raptor | Change values in Trino's raptor.properties file. | Not available. | 
| trino-connector-redis | Change values in Trino's redis.properties file. | Not available. | 
| trino-connector-redshift | Change values in Trino's redshift.properties file. | Not available. | 
| trino-connector-tpch | Change values in Trino's tpch.properties file. | Not available. | 
| trino-connector-tpcds | Change values in Trino's tpcds.properties file. | Not available. | 
| ranger-kms-dbks-site | Change values in dbks-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-site | Change values in ranger-kms-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-env | Change values in the Ranger KMS environment. | Restarts Ranger KMS Server. | 
| ranger-kms-log4j | Change values in kms-log4j.properties file of Ranger KMS. | Not available. | 
| ranger-kms-db-ca | Change values for CA file on S3 for MySQL SSL connection with Ranger KMS. | Not available. | 
| spark | Amazon EMR-curated settings for Apache Spark. | This property modifies spark-defaults. See actions there. | 
| spark-defaults | Change values in Spark's spark-defaults.conf file. | Restarts Spark history server and Spark thrift server. | 
| spark-env | Change values in the Spark environment. | Restarts Spark history server and Spark thrift server. | 
| spark-hive-site | Change values in Spark's hive-site.xml file | Not available. | 
| spark-log4j | Change values in Spark's log4j.properties file. | Restarts Spark history server and Spark thrift server. | 
| spark-metrics | Change values in Spark's metrics.properties file. | Restarts Spark history server and Spark thrift server. | 
| sqoop-env | Change values in Sqoop's environment. | Not available. | 
| sqoop-oraoop-site | Change values in Sqoop OraOop's oraoop-site.xml file. | Not available. | 
| sqoop-site | Change values in Sqoop's sqoop-site.xml file. | Not available. | 
| tez-site | Change values in Tez's tez-site.xml file. | Restarts Oozie and HiveServer2. | 
| yarn-env | Change values in the YARN environment. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts MapReduce-HistoryServer. | 
| yarn-site | Change values in YARN's yarn-site.xml file. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Livy Server and MapReduce-HistoryServer. | 
| zeppelin-env | Change values in the Zeppelin environment. | Restarts Zeppelin. | 
| zeppelin-site | Change configuration settings in zeppelin-site.xml. | Restarts Zeppelin. | 
| zookeeper-config | Change values in ZooKeeper's zoo.cfg file. | Restarts Zookeeper server. | 
| zookeeper-log4j | Change values in ZooKeeper's log4j.properties file. | Restarts Zookeeper server. | 

# Amazon EMR release 6.5.0
<a name="emr-650-release"></a>

## 6.5.0 application versions
<a name="emr-650-app-versions"></a>

This release includes the following applications: [Flink](https://flink.apache.org/), [Ganglia](http://ganglia.info), [HBase](http://hbase.apache.org/), [HCatalog](https://cwiki.apache.org/confluence/display/Hive/HCatalog), [Hadoop](http://hadoop.apache.org/docs/current/), [Hive](http://hive.apache.org/), [Hudi](https://hudi.apache.org), [Hue](http://gethue.com/), [Iceberg](https://iceberg.apache.org/), [JupyterEnterpriseGateway](https://jupyter-enterprise-gateway.readthedocs.io/en/latest/), [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/#), [Livy](https://livy.incubator.apache.org/), [MXNet](https://mxnet.incubator.apache.org/), [Oozie](http://oozie.apache.org/), [Phoenix](https://phoenix.apache.org/), [Pig](http://pig.apache.org/), [Presto](https://prestodb.io/), [Spark](https://spark.apache.org/docs/latest/), [Sqoop](http://sqoop.apache.org/), [TensorFlow](https://www.tensorflow.org/), [Tez](https://tez.apache.org/), [Trino](https://trino.io/), [Zeppelin](https://zeppelin.incubator.apache.org/), and [ZooKeeper](https://zookeeper.apache.org).

The table below lists the application versions available in this release of Amazon EMR and the application versions in the preceding three Amazon EMR releases (when applicable).

For a comprehensive history of application versions for each release of Amazon EMR, see the following topics:
+ [Application versions in Amazon EMR 7.x releases](emr-release-app-versions-7.x.md)
+ [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md)
+ [Application versions in Amazon EMR 5.x releases](emr-release-app-versions-5.x.md)
+ [Application versions in Amazon EMR 4.x releases](emr-release-app-versions-4.x.md)


**Application version information**  

|  | emr-6.5.0 | emr-6.4.0 | emr-6.3.1 | emr-6.3.0 | 
| --- | --- | --- | --- | --- | 
| AWS SDK for Java | 1.12.31 | 1.12.31 | 1.11.977 | 1.11.977 | 
| Python | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 
| Scala | 2.12.10 | 2.12.10 | 2.12.10 | 2.12.10 | 
| AmazonCloudWatchAgent |  -  |  -  |  -  |  -  | 
| Delta |  -  |  -  |  -  |  -  | 
| Flink | 1.14.0 | 1.13.1 | 1.12.1 | 1.12.1 | 
| Ganglia | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 
| HBase | 2.4.4-amzn-1 | 2.4.4-amzn-0 | 2.2.6-amzn-1 | 2.2.6-amzn-1 | 
| HCatalog | 3.1.2-amzn-6 | 3.1.2-amzn-5 | 3.1.2-amzn-4 | 3.1.2-amzn-4 | 
| Hadoop | 3.2.1-amzn-5 | 3.2.1-amzn-4 | 3.2.1-amzn-3.1 | 3.2.1-amzn-3 | 
| Hive | 3.1.2-amzn-6 | 3.1.2-amzn-5 | 3.1.2-amzn-4 | 3.1.2-amzn-4 | 
| Hudi | 0.9.0-amzn-1 | 0.8.0-amzn-0 | 0.7.0-amzn-0 | 0.7.0-amzn-0 | 
| Hue | 4.9.0 | 4.9.0 | 4.9.0 | 4.9.0 | 
| Iceberg | 0.12.0 |  -  |  -  |  -  | 
| JupyterEnterpriseGateway | 2.1.0 | 2.1.0 | 2.1.0 | 2.1.0 | 
| JupyterHub | 1.4.1 | 1.4.1 | 1.2.2 | 1.2.2 | 
| Livy | 0.7.1-incubating | 0.7.1-incubating | 0.7.0-incubating | 0.7.0-incubating | 
| MXNet | 1.8.0 | 1.8.0 | 1.7.0 | 1.7.0 | 
| Mahout |  -  |  -  |  -  |  -  | 
| Oozie | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 
| Phoenix | 5.1.2 | 5.1.2 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 
| Pig | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 
| Presto | 0.261-amzn-0 | 0.254.1-amzn-0 | 0.245.1-amzn-0 | 0.245.1-amzn-0 | 
| Spark | 3.1.2-amzn-1 | 3.1.2-amzn-0 | 3.1.1-amzn-0.1 | 3.1.1-amzn-0 | 
| Sqoop | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 
| TensorFlow | 2.4.1 | 2.4.1 | 2.4.1 | 2.4.1 | 
| Tez | 0.9.2 | 0.9.2 | 0.9.2 | 0.9.2 | 
| Trino (PrestoSQL) | 360 | 359 | 350 | 350 | 
| Zeppelin | 0.10.0 | 0.9.0 | 0.9.0 | 0.9.0 | 
| ZooKeeper | 3.5.7 | 3.5.7 | 3.4.14 | 3.4.14 | 

## 6.5.0 release notes
<a name="emr-650-relnotes"></a>

The following release notes include information for Amazon EMR release 6.5.0. Changes are relative to 6.4.0.

Initial release date: January 20, 2022

Updated release date: March 21, 2022

**New Features**
+ **[Managed scaling] Spark shuffle data managed scaling optimization** - For Amazon EMR versions 5.34.0 and later, and Amazon EMR versions 6.4.0 and later, managed scaling is now aware of Spark shuffle data (data that Spark redistributes across partitions to perform specific operations). For more information on shuffle operations, see [Using EMR managed scaling in Amazon EMR](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-managed-scaling.html) in the *Amazon EMR Management Guide* and the [Spark Programming Guide](https://spark.apache.org/docs/latest/rdd-programming-guide.html#shuffle-operations).
+ Starting with Amazon EMR 5.32.0 and 6.5.0, dynamic executor sizing for Apache Spark is enabled by default. To turn this feature on or off, you can use the `spark.yarn.heterogeneousExecutors.enabled` configuration parameter.
+ Support for Apache Iceberg open table format for huge analytic datasets.
+ Support for ranger-trino-plugin 2.0.1-amzn-1
+ Support for toree 0.5.0
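
The dynamic executor sizing switch mentioned above can be expressed as a configuration classification. The following is a minimal sketch, not an official example: it assumes the parameter is supplied through the `spark-defaults` classification like other Spark properties, and the helper name `executor_sizing_config` is hypothetical.

```python
import json


def executor_sizing_config(enabled: bool) -> list:
    """Build a configuration list that turns dynamic executor sizing on or off.

    Assumption: the property is set through the `spark-defaults`
    classification, like other Spark configuration properties.
    """
    return [
        {
            "Classification": "spark-defaults",
            "Properties": {
                # Property values in a classification are strings.
                "spark.yarn.heterogeneousExecutors.enabled": str(enabled).lower(),
            },
        }
    ]


if __name__ == "__main__":
    # Print JSON that could be saved to a file and supplied at cluster launch.
    print(json.dumps(executor_sizing_config(False), indent=2))
```

The printed JSON is intended to be saved to a file and passed as the cluster configuration when you launch a cluster.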

**Changes, Enhancements, and Resolved Issues**
+ Amazon EMR 6.5 release version now supports Apache Iceberg 0.12.0, and provides runtime improvements with Amazon EMR Runtime for Apache Spark, Amazon EMR Runtime for Presto, and Amazon EMR Runtime for Apache Hive.
+ [Apache Iceberg](https://iceberg.apache.org/) is an open table format for large data sets in Amazon S3 and provides fast query performance over large tables, atomic commits, concurrent writes, and SQL-compatible table evolution. With EMR 6.5, you can use Apache Spark 3.1.2 with the Iceberg table format.
+ Apache Hudi 0.9 adds Spark SQL DDL and DML support. This allows you to create and upsert Hudi tables using only SQL statements. Apache Hudi 0.9 also includes query-side and writer-side performance improvements.
+ Amazon EMR Runtime for Apache Hive improves Apache Hive performance on Amazon S3 by removing rename operations during staging operations, and improves performance for metastore check (MSCK) commands used for repairing tables.

**Known Issues**
+ When you use Amazon EMR release 6.5.0, 6.6.0, or 6.7.0 to read Apache Phoenix tables through the Apache Spark shell, a `NoSuchMethodError` occurs because Amazon EMR uses an incorrect `Hbase.compat.version`. Amazon EMR release 6.8.0 fixes this issue.
+ HBase bundle clusters in high availability (HA) fail to provision with the default volume size and instance type. The workaround for this issue is to increase the root volume size.
+ To use Spark actions with Apache Oozie, you must add the following configuration to your Oozie `workflow.xml` file. Otherwise, several critical libraries such as Hadoop and EMRFS will be missing from the classpath of the Spark executors that Oozie launches.

  ```
  <spark-opts>--conf spark.yarn.populateHadoopClasspath=true</spark-opts>
  ```
+ When you use Spark with Hive partition location formatting to read data in Amazon S3, and you run Spark on Amazon EMR releases 5.30.0 to 5.36.0, and 6.2.0 to 6.9.0, you might encounter an issue that prevents your cluster from reading data correctly. This can happen if your partitions have all of the following characteristics:
  + Two or more partitions are scanned from the same table.
  + At least one partition directory path is a prefix of at least one other partition directory path, for example, `s3://bucket/table/p=a` is a prefix of `s3://bucket/table/p=a b`.
  + The first character that follows the prefix in the other partition directory has a UTF-8 value that’s less than the `/` character (U+002F). For example, the space character (U+0020) that occurs between a and b in `s3://bucket/table/p=a b` falls into this category. Note that there are 14 other non-control characters: `!"#$%&'()*+,-.`. For more information, see [UTF-8 encoding table and Unicode characters](https://www.utf8-chartable.de/).

  As a workaround to this issue, set the `spark.sql.sources.fastS3PartitionDiscovery.enabled` configuration to `false` in the `spark-defaults` classification.
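
To check whether a table's partition layout matches the characteristics listed above, a small script can compare the partition directory paths pairwise. The following is an illustrative sketch only: the function name `affected_pairs` and the sample paths are hypothetical, and the comparison uses Unicode code points, which is equivalent to comparing UTF-8 byte values for the ASCII characters involved. The `WORKAROUND` list expresses the `spark-defaults` setting described above as a configuration classification.

```python
from itertools import permutations


def affected_pairs(partition_paths):
    """Return (prefix, other) pairs where `prefix` is a prefix of `other` and
    the next character in `other` sorts before '/' (U+002F)."""
    pairs = []
    for prefix, other in permutations(partition_paths, 2):
        if other.startswith(prefix) and len(other) > len(prefix):
            next_char = other[len(prefix)]
            # For ASCII characters, code point order matches UTF-8 byte order.
            if next_char < "/":
                pairs.append((prefix, other))
    return pairs


# Workaround from the release notes, expressed as a configuration classification.
WORKAROUND = [
    {
        "Classification": "spark-defaults",
        "Properties": {
            "spark.sql.sources.fastS3PartitionDiscovery.enabled": "false",
        },
    }
]

if __name__ == "__main__":
    # Hypothetical partition paths, reusing the example from the text.
    paths = [
        "s3://bucket/table/p=a",
        "s3://bucket/table/p=a b",
        "s3://bucket/table/p=c",
    ]
    for prefix, other in affected_pairs(paths):
        print(f"{prefix!r} is a risky prefix of {other!r}")
```

If the check reports any pairs and more than one of those partitions can be scanned in the same query, consider applying the workaround classification.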

## 6.5.0 component versions
<a name="emr-650-components"></a>

The components that Amazon EMR installs with this release are listed below. Some are installed as part of big-data application packages. Others are unique to Amazon EMR and installed for system processes and features. These typically start with `emr` or `aws`. Big-data application packages in the most recent Amazon EMR release are usually the latest version found in the community. We make community releases available in Amazon EMR as quickly as possible.

Some components in Amazon EMR differ from community versions. These components have a version label in the form `CommunityVersion-amzn-EmrVersion`. The `EmrVersion` starts at 0. For example, if an open source community component named `myapp-component` with version 2.2 has been modified three times for inclusion in different Amazon EMR releases, its release version is listed as `2.2-amzn-2`.
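
When you compare components across releases in a script, it can help to split a version label into its community part and its Amazon EMR revision. The following is a minimal sketch of such a parser. The names `ComponentVersion` and `parse_version_label` are hypothetical, and the pattern is an assumption based on the label form described above. It treats labels without an `-amzn-` suffix as unmodified community versions.

```python
import re
from typing import NamedTuple, Optional


class ComponentVersion(NamedTuple):
    community_version: str
    emr_version: Optional[str]  # None when the label has no -amzn- suffix


# CommunityVersion-amzn-EmrVersion; EmrVersion may contain dots (for example, 3.1).
_LABEL = re.compile(r"^(?P<community>.+?)-amzn-(?P<emr>[0-9.]+)$")


def parse_version_label(label: str) -> ComponentVersion:
    """Split a component version label into community and EMR parts."""
    match = _LABEL.match(label)
    if match is None:
        # No Amazon-specific suffix: treat the whole label as the community version.
        return ComponentVersion(label, None)
    return ComponentVersion(match.group("community"), match.group("emr"))


if __name__ == "__main__":
    for label in ("3.2.1-amzn-5", "3.2.1-amzn-3.1", "0.7.1-incubating"):
        print(label, "->", parse_version_label(label))
```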


| Component | Version | Description | 
| --- | --- | --- | 
| aws-sagemaker-spark-sdk | 1.4.1 | Amazon SageMaker Spark SDK | 
| emr-ddb | 4.16.0 | Amazon DynamoDB connector for Hadoop ecosystem applications. | 
| emr-goodies | 3.2.0 | Extra convenience libraries for the Hadoop ecosystem. | 
| emr-kinesis | 3.5.0 | Amazon Kinesis connector for Hadoop ecosystem applications. | 
| emr-notebook-env | 1.4.0 | Conda environment for EMR Notebooks that includes Jupyter Enterprise Gateway. | 
| emr-s3-dist-cp | 2.19.0 | Distributed copy application optimized for Amazon S3. | 
| emr-s3-select | 2.1.0 | EMR S3Select Connector | 
| emrfs | 2.48.0 | Amazon S3 connector for Hadoop ecosystem applications. | 
| flink-client | 1.14.0 | Apache Flink command line client scripts and applications. | 
| flink-jobmanager-config | 1.14.0 | Managing resources on EMR nodes for Apache Flink JobManager. | 
| ganglia-monitor | 3.7.2 | Embedded Ganglia agent for Hadoop ecosystem applications along with the Ganglia monitoring agent. | 
| ganglia-metadata-collector | 3.7.2 | Ganglia metadata collector for aggregating metrics from Ganglia monitoring agents. | 
| ganglia-web | 3.7.1 | Web application for viewing metrics collected by the Ganglia metadata collector. | 
| hadoop-client | 3.2.1-amzn-5 | Hadoop command-line clients such as 'hdfs', 'hadoop', or 'yarn'. | 
| hadoop-hdfs-datanode | 3.2.1-amzn-5 | HDFS node-level service for storing blocks. | 
| hadoop-hdfs-library | 3.2.1-amzn-5 | HDFS command-line client and library | 
| hadoop-hdfs-namenode | 3.2.1-amzn-5 | HDFS service for tracking file names and block locations. | 
| hadoop-hdfs-journalnode | 3.2.1-amzn-5 | HDFS service for managing the Hadoop filesystem journal on HA clusters. | 
| hadoop-httpfs-server | 3.2.1-amzn-5 | HTTP endpoint for HDFS operations. | 
| hadoop-kms-server | 3.2.1-amzn-5 | Cryptographic key management server based on Hadoop's KeyProvider API. | 
| hadoop-mapred | 3.2.1-amzn-5 | MapReduce execution engine libraries for running a MapReduce application. | 
| hadoop-yarn-nodemanager | 3.2.1-amzn-5 | YARN service for managing containers on an individual node. | 
| hadoop-yarn-resourcemanager | 3.2.1-amzn-5 | YARN service for allocating and managing cluster resources and distributed applications. | 
| hadoop-yarn-timeline-server | 3.2.1-amzn-5 | Service for retrieving current and historical information for YARN applications. | 
| hbase-hmaster | 2.4.4-amzn-1 | Service for an HBase cluster responsible for coordination of Regions and execution of administrative commands. | 
| hbase-region-server | 2.4.4-amzn-1 | Service for serving one or more HBase regions. | 
| hbase-client | 2.4.4-amzn-1 | HBase command-line client. | 
| hbase-rest-server | 2.4.4-amzn-1 | Service providing a RESTful HTTP endpoint for HBase. | 
| hbase-thrift-server | 2.4.4-amzn-1 | Service providing a Thrift endpoint to HBase. | 
| hcatalog-client | 3.1.2-amzn-6 | The 'hcat' command line client for manipulating hcatalog-server. | 
| hcatalog-server | 3.1.2-amzn-6 | Service providing HCatalog, a table and storage management layer for distributed applications. | 
| hcatalog-webhcat-server | 3.1.2-amzn-6 | HTTP endpoint providing a REST interface to HCatalog. | 
| hive-client | 3.1.2-amzn-6 | Hive command line client. | 
| hive-hbase | 3.1.2-amzn-6 | Hive-hbase client. | 
| hive-metastore-server | 3.1.2-amzn-6 | Service for accessing the Hive metastore, a semantic repository storing metadata for SQL on Hadoop operations. | 
| hive-server2 | 3.1.2-amzn-6 | Service for accepting Hive queries as web requests. | 
| hudi | 0.9.0-amzn-1 | Incremental processing framework to power data pipeline at low latency and high efficiency. | 
| hudi-presto | 0.9.0-amzn-1 | Bundle library for running Presto with Hudi. | 
| hudi-trino | 0.9.0-amzn-1 | Bundle library for running Trino with Hudi. | 
| hudi-spark | 0.9.0-amzn-1 | Bundle library for running Spark with Hudi. | 
| hue-server | 4.9.0 | Web application for analyzing data using Hadoop ecosystem applications | 
| iceberg | 0.12.0 | Apache Iceberg is an open table format for huge analytic datasets | 
| jupyterhub | 1.4.1 | Multi-user server for Jupyter notebooks | 
| livy-server | 0.7.1-incubating | REST interface for interacting with Apache Spark | 
| nginx | 1.12.1 | nginx [engine x] is an HTTP and reverse proxy server | 
| mxnet | 1.8.0 | A flexible, scalable, and efficient library for deep learning. | 
| mariadb-server | 5.5.68+ | MariaDB database server. | 
| nvidia-cuda | 10.1.243 | Nvidia drivers and Cuda toolkit | 
| oozie-client | 5.2.1 | Oozie command-line client. | 
| oozie-server | 5.2.1 | Service for accepting Oozie workflow requests. | 
| opencv | 4.5.0 | Open Source Computer Vision Library. | 
| phoenix-library | 5.1.2 | The phoenix libraries for server and client | 
| phoenix-query-server | 5.1.2 | A light weight server providing JDBC access as well as Protocol Buffers and JSON format access to the Avatica API  | 
| presto-coordinator | 0.261-amzn-0 | Service for accepting queries and managing query execution among presto-workers. | 
| presto-worker | 0.261-amzn-0 | Service for executing pieces of a query. | 
| presto-client | 0.261-amzn-0 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| trino-coordinator | 360 | Service for accepting queries and managing query execution among trino-workers. | 
| trino-worker | 360 | Service for executing pieces of a query. | 
| trino-client | 360 | Trino command-line client which is installed on an HA cluster's stand-by masters where Trino server is not started. | 
| pig-client | 0.17.0 | Pig command-line client. | 
| r | 4.0.2 | The R Project for Statistical Computing | 
| ranger-kms-server | 2.0.0 | Apache Ranger Key Management System | 
| spark-client | 3.1.2-amzn-1 | Spark command-line clients. | 
| spark-history-server | 3.1.2-amzn-1 | Web UI for viewing logged events for the lifetime of a completed Spark application. | 
| spark-on-yarn | 3.1.2-amzn-1 | In-memory execution engine for YARN. | 
| spark-yarn-slave | 3.1.2-amzn-1 | Apache Spark libraries needed by YARN slaves. | 
| spark-rapids | 0.4.1 | Nvidia Spark RAPIDS plugin that accelerates Apache Spark with GPUs. | 
| sqoop-client | 1.4.7 | Apache Sqoop command-line client. | 
| tensorflow | 2.4.1 | TensorFlow open source software library for high performance numerical computation. | 
| tez-on-yarn | 0.9.2 | The tez YARN application and libraries. | 
| webserver | 2.4.41+ | Apache HTTP server. | 
| zeppelin-server | 0.10.0 | Web-based notebook that enables interactive data analytics. | 
| zookeeper-server | 3.5.7 | Centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. | 
| zookeeper-client | 3.5.7 | ZooKeeper command line client. | 

## 6.5.0 configuration classifications
<a name="emr-650-class"></a>

Configuration classifications allow you to customize applications. These often correspond to a configuration XML file for the application, such as `hive-site.xml`. For more information, see [Configure applications](emr-configure-apps.md).

Reconfiguration actions occur when you specify a configuration for instance groups in a running cluster. Amazon EMR only initiates reconfiguration actions for the classifications that you modify. For more information, see [Reconfigure an instance group in a running cluster](emr-configure-apps-running-cluster.md).
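
As a rough illustration of a reconfiguration request, the sketch below builds a payload that pairs an instance group with the classifications to override. It rests on several assumptions: the payload shape (`InstanceGroupId` plus a `Configurations` list) follows the instance group reconfiguration format as we understand it, the instance group ID is a placeholder, the YARN property is only an example value, and the helper name `reconfiguration_request` is hypothetical.

```python
import json


def reconfiguration_request(instance_group_id, overrides):
    """Build a reconfiguration payload for one instance group.

    `overrides` maps a classification name (for example, "yarn-site")
    to a dict of property names and string values.
    """
    configurations = [
        {"Classification": name, "Properties": dict(properties)}
        for name, properties in sorted(overrides.items())
    ]
    return [
        {
            "InstanceGroupId": instance_group_id,  # placeholder ID supplied by the caller
            "Configurations": configurations,
        }
    ]


if __name__ == "__main__":
    request = reconfiguration_request(
        "ig-EXAMPLE",  # hypothetical instance group ID
        {"yarn-site": {"yarn.log-aggregation.retain-seconds": "86400"}},
    )
    print(json.dumps(request, indent=2))
```

Because Amazon EMR only initiates reconfiguration actions for the classifications that you modify, keeping the override set small limits which services restart.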


**emr-6.5.0 classifications**  

| Classifications | Description | Reconfiguration Actions | 
| --- | --- | --- | 
| capacity-scheduler | Change values in Hadoop's capacity-scheduler.xml file. | Restarts the ResourceManager service. | 
| container-executor | Change values in Hadoop YARN's container-executor.cfg file. | Not available. | 
| container-log4j | Change values in Hadoop YARN's container-log4j.properties file. | Not available. | 
| core-site | Change values in Hadoop's core-site.xml file. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Ranger KMS, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| docker-conf | Change docker related settings. | Not available. | 
| emrfs-site | Change EMRFS settings. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts HBaseRegionserver, HBaseMaster, HBaseThrift, HBaseRest, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| flink-conf | Change flink-conf.yaml settings. | Restarts Flink history server. | 
| flink-log4j | Change Flink log4j.properties settings. | Restarts Flink history server. | 
| flink-log4j-session | Change Flink log4j-session.properties settings for Kubernetes/Yarn session. | Restarts Flink history server. | 
| flink-log4j-cli | Change Flink log4j-cli.properties settings. | Restarts Flink history server. | 
| hadoop-env | Change values in the Hadoop environment for all Hadoop components. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts PhoenixQueryserver, HiveServer2, Hive MetaStore, and MapReduce-HistoryServer. | 
| hadoop-log4j | Change values in Hadoop's log4j.properties file. | Restarts the Hadoop HDFS services SecondaryNamenode, Datanode, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| hadoop-ssl-server | Change hadoop ssl server configuration | Not available. | 
| hadoop-ssl-client | Change hadoop ssl client configuration | Not available. | 
| hbase | Amazon EMR-curated settings for Apache HBase. | Custom EMR specific property. Sets emrfs-site and hbase-site configs. See those for their associated restarts. | 
| hbase-env | Change values in HBase's environment. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-log4j | Change values in HBase's hbase-log4j.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-metrics | Change values in HBase's hadoop-metrics2-hbase.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-policy | Change values in HBase's hbase-policy.xml file. | Not available. | 
| hbase-site | Change values in HBase's hbase-site.xml file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. Additionally restarts Phoenix QueryServer. | 
| hdfs-encryption-zones | Configure HDFS encryption zones. | This classification should not be reconfigured. | 
| hdfs-env | Change values in the HDFS environment. | Restarts Hadoop HDFS services Namenode, Datanode, and ZKFC. | 
| hdfs-site | Change values in HDFS's hdfs-site.xml. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Additionally restarts Hadoop Httpfs. | 
| hcatalog-env | Change values in HCatalog's environment. | Restarts Hive HCatalog Server. | 
| hcatalog-server-jndi | Change values in HCatalog's jndi.properties. | Restarts Hive HCatalog Server. | 
| hcatalog-server-proto-hive-site | Change values in HCatalog's proto-hive-site.xml. | Restarts Hive HCatalog Server. | 
| hcatalog-webhcat-env | Change values in HCatalog WebHCat's environment. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-log4j2 | Change values in HCatalog WebHCat's log4j2.properties. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-site | Change values in HCatalog WebHCat's webhcat-site.xml file. | Restarts Hive WebHCat server. | 
| hive | Amazon EMR-curated settings for Apache Hive. | Sets configurations to launch Hive LLAP service. | 
| hive-beeline-log4j2 | Change values in Hive's beeline-log4j2.properties file. | Not available. | 
| hive-parquet-logging | Change values in Hive's parquet-logging.properties file. | Not available. | 
| hive-env | Change values in the Hive environment. | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. | 
| hive-exec-log4j2 | Change values in Hive's hive-exec-log4j2.properties file. | Not available. | 
| hive-llap-daemon-log4j2 | Change values in Hive's llap-daemon-log4j2.properties file. | Not available. | 
| hive-log4j2 | Change values in Hive's hive-log4j2.properties file. | Not available. | 
| hive-site | Change values in Hive's hive-site.xml file | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. Also restarts Oozie and Zeppelin. | 
| hiveserver2-site | Change values in Hive Server2's hiveserver2-site.xml file | Not available. | 
| hue-ini | Change values in Hue's ini file | Restarts Hue. Also activates Hue config override CLI commands to pick up new configurations. | 
| httpfs-env | Change values in the HTTPFS environment. | Restarts Hadoop Httpfs service. | 
| httpfs-site | Change values in Hadoop's httpfs-site.xml file. | Restarts Hadoop Httpfs service. | 
| hadoop-kms-acls | Change values in Hadoop's kms-acls.xml file. | Not available. | 
| hadoop-kms-env | Change values in the Hadoop KMS environment. | Restarts Hadoop-KMS service. | 
| hadoop-kms-log4j | Change values in Hadoop's kms-log4j.properties file. | Not available. | 
| hadoop-kms-site | Change values in Hadoop's kms-site.xml file. | Restarts Hadoop-KMS and Ranger-KMS service. | 
| hudi-env | Change values in the Hudi environment. | Not available. | 
| hudi-defaults | Change values in Hudi's hudi-defaults.conf file. | Not available. | 
| iceberg-defaults | Change values in Iceberg's iceberg-defaults.conf file. | Not available. | 
| jupyter-notebook-conf | Change values in Jupyter Notebook's jupyter_notebook_config.py file. | Not available. | 
| jupyter-hub-conf | Change values in JupyterHub's jupyterhub_config.py file. | Not available. | 
| jupyter-s3-conf | Configure Jupyter Notebook S3 persistence. | Not available. | 
| jupyter-sparkmagic-conf | Change values in Sparkmagic's config.json file. | Not available. | 
| livy-conf | Change values in Livy's livy.conf file. | Restarts Livy Server. | 
| livy-env | Change values in the Livy environment. | Restarts Livy Server. | 
| livy-log4j | Change Livy log4j.properties settings. | Restarts Livy Server. | 
| mapred-env | Change values in the MapReduce application's environment. | Restarts Hadoop MapReduce-HistoryServer. | 
| mapred-site | Change values in the MapReduce application's mapred-site.xml file. | Restarts Hadoop MapReduce-HistoryServer. | 
| oozie-env | Change values in Oozie's environment. | Restarts Oozie. | 
| oozie-log4j | Change values in Oozie's oozie-log4j.properties file. | Restarts Oozie. | 
| oozie-site | Change values in Oozie's oozie-site.xml file. | Restarts Oozie. | 
| phoenix-hbase-metrics | Change values in Phoenix's hadoop-metrics2-hbase.properties file. | Not available. | 
| phoenix-hbase-site | Change values in Phoenix's hbase-site.xml file. | Not available. | 
| phoenix-log4j | Change values in Phoenix's log4j.properties file. | Restarts Phoenix-QueryServer. | 
| phoenix-metrics | Change values in Phoenix's hadoop-metrics2-phoenix.properties file. | Not available. | 
| pig-env | Change values in the Pig environment. | Not available. | 
| pig-properties | Change values in Pig's pig.properties file. | Restarts Oozie. | 
| pig-log4j | Change values in Pig's log4j.properties file. | Not available. | 
| presto-log | Change values in Presto's log.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-config | Change values in Presto's config.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-password-authenticator | Change values in Presto's password-authenticator.properties file. | Not available. | 
| presto-env | Change values in Presto's presto-env.sh file. | Restarts Presto-Server (for PrestoDB) | 
| presto-node | Change values in Presto's node.properties file. | Not available. | 
| presto-connector-blackhole | Change values in Presto's blackhole.properties file. | Not available. | 
| presto-connector-cassandra | Change values in Presto's cassandra.properties file. | Not available. | 
| presto-connector-hive | Change values in Presto's hive.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-jmx | Change values in Presto's jmx.properties file. | Not available. | 
| presto-connector-kafka | Change values in Presto's kafka.properties file. | Not available. | 
| presto-connector-localfile | Change values in Presto's localfile.properties file. | Not available. | 
| presto-connector-memory | Change values in Presto's memory.properties file. | Not available. | 
| presto-connector-mongodb | Change values in Presto's mongodb.properties file. | Not available. | 
| presto-connector-mysql | Change values in Presto's mysql.properties file. | Not available. | 
| presto-connector-postgresql | Change values in Presto's postgresql.properties file. | Not available. | 
| presto-connector-raptor | Change values in Presto's raptor.properties file. | Not available. | 
| presto-connector-redis | Change values in Presto's redis.properties file. | Not available. | 
| presto-connector-redshift | Change values in Presto's redshift.properties file. | Not available. | 
| presto-connector-tpch | Change values in Presto's tpch.properties file. | Not available. | 
| presto-connector-tpcds | Change values in Presto's tpcds.properties file. | Not available. | 
| trino-log | Change values in Trino's log.properties file. | Restarts Trino-Server (for Trino) | 
| trino-config | Change values in Trino's config.properties file. | Restarts Trino-Server (for Trino) | 
| trino-password-authenticator | Change values in Trino's password-authenticator.properties file. | Restarts Trino-Server (for Trino) | 
| trino-env | Change values in Trino's trino-env.sh file. | Restarts Trino-Server (for Trino) | 
| trino-node | Change values in Trino's node.properties file. | Not available. | 
| trino-connector-blackhole | Change values in Trino's blackhole.properties file. | Not available. | 
| trino-connector-cassandra | Change values in Trino's cassandra.properties file. | Not available. | 
| trino-connector-hive | Change values in Trino's hive.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-jmx | Change values in Trino's jmx.properties file. | Not available. | 
| trino-connector-kafka | Change values in Trino's kafka.properties file. | Not available. | 
| trino-connector-localfile | Change values in Trino's localfile.properties file. | Not available. | 
| trino-connector-memory | Change values in Trino's memory.properties file. | Not available. | 
| trino-connector-mongodb | Change values in Trino's mongodb.properties file. | Not available. | 
| trino-connector-mysql | Change values in Trino's mysql.properties file. | Not available. | 
| trino-connector-postgresql | Change values in Trino's postgresql.properties file. | Not available. | 
| trino-connector-raptor | Change values in Trino's raptor.properties file. | Not available. | 
| trino-connector-redis | Change values in Trino's redis.properties file. | Not available. | 
| trino-connector-redshift | Change values in Trino's redshift.properties file. | Not available. | 
| trino-connector-tpch | Change values in Trino's tpch.properties file. | Not available. | 
| trino-connector-tpcds | Change values in Trino's tpcds.properties file. | Not available. | 
| ranger-kms-dbks-site | Change values in dbks-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-site | Change values in ranger-kms-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-env | Change values in the Ranger KMS environment. | Restarts Ranger KMS Server. | 
| ranger-kms-log4j | Change values in kms-log4j.properties file of Ranger KMS. | Not available. | 
| ranger-kms-db-ca | Change values for CA file on S3 for MySQL SSL connection with Ranger KMS. | Not available. | 
| spark | Amazon EMR-curated settings for Apache Spark. | This property modifies spark-defaults. See actions there. | 
| spark-defaults | Change values in Spark's spark-defaults.conf file. | Restarts Spark history server and Spark thrift server. | 
| spark-env | Change values in the Spark environment. | Restarts Spark history server and Spark thrift server. | 
| spark-hive-site | Change values in Spark's hive-site.xml file | Not available. | 
| spark-log4j | Change values in Spark's log4j.properties file. | Restarts Spark history server and Spark thrift server. | 
| spark-metrics | Change values in Spark's metrics.properties file. | Restarts Spark history server and Spark thrift server. | 
| sqoop-env | Change values in Sqoop's environment. | Not available. | 
| sqoop-oraoop-site | Change values in Sqoop OraOop's oraoop-site.xml file. | Not available. | 
| sqoop-site | Change values in Sqoop's sqoop-site.xml file. | Not available. | 
| tez-site | Change values in Tez's tez-site.xml file. | Restarts Oozie and HiveServer2. | 
| yarn-env | Change values in the YARN environment. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts MapReduce-HistoryServer. | 
| yarn-site | Change values in YARN's yarn-site.xml file. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Livy Server and MapReduce-HistoryServer. | 
| zeppelin-env | Change values in the Zeppelin environment. | Restarts Zeppelin. | 
| zeppelin-site | Change configuration settings in zeppelin-site.xml. | Restarts Zeppelin. | 
| zookeeper-config | Change values in ZooKeeper's zoo.cfg file. | Restarts ZooKeeper server. | 
| zookeeper-log4j | Change values in ZooKeeper's log4j.properties file. | Restarts ZooKeeper server. | 

# Amazon EMR release 6.4.0
<a name="emr-640-release"></a>

## 6.4.0 application versions
<a name="emr-640-app-versions"></a>

This release includes the following applications: [Flink](https://flink.apache.org/), [Ganglia](http://ganglia.info), [HBase](http://hbase.apache.org/), [HCatalog](https://cwiki.apache.org/confluence/display/Hive/HCatalog), [Hadoop](http://hadoop.apache.org/docs/current/), [Hive](http://hive.apache.org/), [Hudi](https://hudi.apache.org), [Hue](http://gethue.com/), [JupyterEnterpriseGateway](https://jupyter-enterprise-gateway.readthedocs.io/en/latest/), [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/#), [Livy](https://livy.incubator.apache.org/), [MXNet](https://mxnet.incubator.apache.org/), [Oozie](http://oozie.apache.org/), [Phoenix](https://phoenix.apache.org/), [Pig](http://pig.apache.org/), [Presto](https://prestodb.io/), [Spark](https://spark.apache.org/docs/latest/), [Sqoop](http://sqoop.apache.org/), [TensorFlow](https://www.tensorflow.org/), [Tez](https://tez.apache.org/), [Trino](https://trino.io/), [Zeppelin](https://zeppelin.incubator.apache.org/), and [ZooKeeper](https://zookeeper.apache.org).

The table below lists the application versions available in this release of Amazon EMR and the application versions in the preceding three Amazon EMR releases (when applicable).

For a comprehensive history of application versions for each release of Amazon EMR, see the following topics:
+ [Application versions in Amazon EMR 7.x releases](emr-release-app-versions-7.x.md)
+ [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md)
+ [Application versions in Amazon EMR 5.x releases](emr-release-app-versions-5.x.md)
+ [Application versions in Amazon EMR 4.x releases](emr-release-app-versions-4.x.md)


**Application version information**  

|  | emr-6.4.0 | emr-6.3.1 | emr-6.3.0 | emr-6.2.1 | 
| --- | --- | --- | --- | --- | 
| AWS SDK for Java | 1.12.31 | 1.11.977 | 1.11.977 | 1.11.880 | 
| Python | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 
| Scala | 2.12.10 | 2.12.10 | 2.12.10 | 2.12.10 | 
| AmazonCloudWatchAgent |  -  |  -  |  -  |  -  | 
| Delta |  -  |  -  |  -  |  -  | 
| Flink | 1.13.1 | 1.12.1 | 1.12.1 | 1.11.2 | 
| Ganglia | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 
| HBase | 2.4.4-amzn-0 | 2.2.6-amzn-1 | 2.2.6-amzn-1 | 2.2.6-amzn-0 | 
| HCatalog | 3.1.2-amzn-5 | 3.1.2-amzn-4 | 3.1.2-amzn-4 | 3.1.2-amzn-3 | 
| Hadoop | 3.2.1-amzn-4 | 3.2.1-amzn-3.1 | 3.2.1-amzn-3 | 3.2.1-amzn-2.1 | 
| Hive | 3.1.2-amzn-5 | 3.1.2-amzn-4 | 3.1.2-amzn-4 | 3.1.2-amzn-3 | 
| Hudi | 0.8.0-amzn-0 | 0.7.0-amzn-0 | 0.7.0-amzn-0 | 0.6.0-amzn-1 | 
| Hue | 4.9.0 | 4.9.0 | 4.9.0 | 4.8.0 | 
| Iceberg |  -  |  -  |  -  |  -  | 
| JupyterEnterpriseGateway | 2.1.0 | 2.1.0 | 2.1.0 | 2.1.0 | 
| JupyterHub | 1.4.1 | 1.2.2 | 1.2.2 | 1.1.0 | 
| Livy | 0.7.1-incubating | 0.7.0-incubating | 0.7.0-incubating | 0.7.0-incubating | 
| MXNet | 1.8.0 | 1.7.0 | 1.7.0 | 1.7.0 | 
| Mahout |  -  |  -  |  -  |  -  | 
| Oozie | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.0 | 
| Phoenix | 5.1.2 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 
| Pig | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 
| Presto | 0.254.1-amzn-0 | 0.245.1-amzn-0 | 0.245.1-amzn-0 | 0.238.3-amzn-1 | 
| Spark | 3.1.2-amzn-0 | 3.1.1-amzn-0.1 | 3.1.1-amzn-0 | 3.0.1-amzn-0.1 | 
| Sqoop | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 
| TensorFlow | 2.4.1 | 2.4.1 | 2.4.1 | 2.3.1 | 
| Tez | 0.9.2 | 0.9.2 | 0.9.2 | 0.9.2 | 
| Trino (PrestoSQL) | 359 | 350 | 350 | 343 | 
| Zeppelin | 0.9.0 | 0.9.0 | 0.9.0 | 0.9.0-preview1 | 
| ZooKeeper | 3.5.7 | 3.4.14 | 3.4.14 | 3.4.14 | 

## 6.4.0 release notes
<a name="emr-640-relnotes"></a>

The following release notes include information for Amazon EMR release 6.4.0. Changes are relative to 6.3.0.

Initial release date: Sept 20, 2021

Updated release date: March 21, 2022

**Supported applications**
+ AWS SDK for Java version 1.12.31
+ CloudWatch Sink version 2.2.0
+ DynamoDB Connector version 4.16.0
+ EMRFS version 2.47.0
+ Amazon EMR Goodies version 3.2.0
+ Amazon EMR Kinesis Connector version 3.5.0
+ Amazon EMR Record Server version 2.1.0
+ Amazon EMR Scripts version 2.5.0
+ Flink version 1.13.1
+ Ganglia version 3.7.2
+ AWS Glue Hive Metastore Client version 3.3.0
+ Hadoop version 3.2.1-amzn-4
+ HBase version 2.4.4-amzn-0
+ HBase-operator-tools 1.1.0
+ HCatalog version 3.1.2-amzn-5
+ Hive version 3.1.2-amzn-5
+ Hudi version 0.8.0-amzn-0
+ Hue version 4.9.0
+ Java JDK version Corretto-8.302.08.1 (build 1.8.0_302-b08)
+ JupyterHub version 1.4.1
+ Livy version 0.7.1-incubating
+ MXNet version 1.8.0
+ Oozie version 5.2.1
+ Phoenix version 5.1.2
+ Pig version 0.17.0
+ Presto version 0.254.1-amzn-0
+ Trino version 359
+ Apache Ranger KMS (multi-master transparent encryption) version 2.0.0
+ ranger-plugins 2.0.1-amzn-0
+ ranger-s3-plugin 1.2.0
+ SageMaker Spark SDK version 1.4.1
+ Scala version 2.12.10 (OpenJDK 64-Bit Server VM, Java 1.8.0_282)
+ Spark version 3.1.2-amzn-0
+ spark-rapids 0.4.1
+ Sqoop version 1.4.7
+ TensorFlow version 2.4.1
+ Tez version 0.9.2
+ Zeppelin version 0.9.0
+ ZooKeeper version 3.5.7
+ Connectors and drivers: DynamoDB Connector 4.16.0

**New features**
+ **[Managed scaling] Spark shuffle data managed scaling optimization** - For Amazon EMR versions 5.34.0 and later, and EMR versions 6.4.0 and later, managed scaling is now Spark shuffle data aware (data that Spark redistributes across partitions to perform specific operations). For more information on shuffle operations, see [Using EMR managed scaling in Amazon EMR](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-managed-scaling.html) in the *Amazon EMR Management Guide* and [Spark Programming Guide](https://spark.apache.org/docs/latest/rdd-programming-guide.html#shuffle-operations).
+ On Apache Ranger-enabled Amazon EMR clusters, you can use Apache Spark SQL to insert data into or update the Apache Hive metastore tables using `INSERT INTO`, `INSERT OVERWRITE`, and `ALTER TABLE`. When using ALTER TABLE with Spark SQL, a partition location must be the child directory of a table location. Amazon EMR does not currently support inserting data into a partition where the partition location is different from the table location.
+ PrestoSQL has been [renamed to Trino.](https://trino.io/blog/2020/12/27/announcing-trino.html) 
+ Hive: Execution of simple SELECT queries with a LIMIT clause is accelerated by stopping query execution as soon as the number of records specified in the LIMIT clause is fetched. Simple SELECT queries are queries that do not have a GROUP BY or ORDER BY clause, or queries that do not have a reducer stage. For example, `SELECT * from <TABLE> WHERE <Condition> LIMIT <Number>`.

**Hudi Concurrency Control**
+ Hudi now supports Optimistic Concurrency Control (OCC), which can be used with write operations like UPSERT and INSERT to allow changes from multiple writers to the same Hudi table. This is file-level OCC, so any two commits (or writers) can write to the same table if their changes do not conflict. For more information, see [Hudi concurrency control](https://hudi.apache.org/docs/concurrency_control/).
+ Amazon EMR clusters have Zookeeper installed, which can be leveraged as the lock provider for OCC. To make it easier to use this feature, Amazon EMR clusters have the following properties pre-configured:

  ```
  hoodie.write.lock.provider=org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider
  hoodie.write.lock.zookeeper.url=<EMR Zookeeper URL>
  hoodie.write.lock.zookeeper.port=<EMR Zookeeper Port>
  hoodie.write.lock.zookeeper.base_path=/hudi
  ```

  To enable OCC, set the following properties either as Hudi job options or at the cluster level using the Amazon EMR configurations API:

  ```
  hoodie.write.concurrency.mode=optimistic_concurrency_control
  # Cleans up failed writes lazily instead of inline with every write.
  hoodie.cleaner.policy.failed.writes=LAZY
  # Key that uniquely identifies the Hudi table. The table name is a good option.
  hoodie.write.lock.zookeeper.lock_key=<Key to uniquely identify the Hudi table>
  ```
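
As a minimal sketch of the OCC settings above, the following Python snippet collects the three properties into a dictionary that could be passed as Hudi job options, and wraps them in a `hudi-defaults` configuration object for the cluster-level route. The function names `hudi_occ_options` and `as_hudi_defaults_classification` are hypothetical helpers introduced for illustration, and using the table name as the lock key follows the suggestion above rather than any requirement.

```python
import json


def hudi_occ_options(table_name: str) -> dict:
    """Return the three properties the release notes list for enabling OCC.

    The lock key is set to the table name, which the notes call a good option.
    """
    if not table_name:
        raise ValueError("table_name must be non-empty")
    return {
        "hoodie.write.concurrency.mode": "optimistic_concurrency_control",
        "hoodie.cleaner.policy.failed.writes": "LAZY",
        "hoodie.write.lock.zookeeper.lock_key": table_name,
    }


def as_hudi_defaults_classification(properties: dict) -> list:
    """Wrap properties in the configurations JSON shape, using hudi-defaults."""
    return [{"Classification": "hudi-defaults", "Properties": dict(properties)}]


if __name__ == "__main__":
    # "my_table" is a placeholder table name.
    options = hudi_occ_options("my_table")
    print(json.dumps(as_hudi_defaults_classification(options), indent=2))
```

The same dictionary can be supplied per job or once per cluster. The lock provider properties are not repeated here because Amazon EMR pre-configures them.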

**Hudi Monitoring: Amazon CloudWatch integration to report Hudi Metrics**
+ Amazon EMR supports publishing Hudi Metrics to Amazon CloudWatch. It is enabled by setting the following required configurations:

  ```
  hoodie.metrics.on=true
  hoodie.metrics.reporter.type=CLOUDWATCH
  ```
+ The following are optional Hudi configurations that you can change:    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-640-release.html)

**Amazon EMR Hudi configurations support and improvements**
+ You can now use the EMR configurations API and the reconfiguration feature to set Hudi configurations at the cluster level. File-based configuration is now supported through `/etc/hudi/conf/hudi-defaults.conf`, along the lines of other applications such as Spark and Hive. Amazon EMR configures a few defaults to improve the user experience:
  + `hoodie.datasource.hive_sync.jdbcurl` is configured to the cluster Hive server URL and no longer needs to be specified. This is particularly useful when running a job in Spark cluster mode, where you previously had to specify the Amazon EMR master IP.
  + HBase-specific configurations, which are useful for using the HBase index with Hudi.
  + ZooKeeper lock provider configuration, as discussed under concurrency control, which makes it easier to use Optimistic Concurrency Control (OCC).
+ Additional changes reduce the number of configurations that you need to pass by inferring values automatically where possible:
  + The `partitionBy` keyword can be used to specify the partition column.
  + When enabling Hive Sync, it is no longer mandatory to pass `HIVE_TABLE_OPT_KEY`, `HIVE_PARTITION_FIELDS_OPT_KEY`, or `HIVE_PARTITION_EXTRACTOR_CLASS_OPT_KEY`. Those values can be inferred from the Hudi table name and partition field.
  + `KEYGENERATOR_CLASS_OPT_KEY` is no longer mandatory and can be inferred for the simpler cases of `SimpleKeyGenerator` and `ComplexKeyGenerator`.

**Hudi Caveats**
+ Hudi does not support vectorized execution in Hive for Merge on Read (MoR) and Bootstrap tables. For example, `count(*)` fails with a Hudi realtime table when `hive.vectorized.execution.enabled` is set to `true`. As a workaround, you can disable vectorized reading by setting `hive.vectorized.execution.enabled` to `false`.
+ Multi-writer support is not compatible with the Hudi bootstrap feature.
+ Flink Streamer and Flink SQL are experimental features in this release. These features are not recommended for use in production deployments.

**Changes, enhancements, and resolved issues**

This release fixes issues where Amazon EMR scaling fails to scale a cluster up or down successfully, or causes application failures.
+ Previously, manual restart of the resource manager on a multi-master cluster caused Amazon EMR on-cluster daemons, like Zookeeper, to reload all previously decommissioned or lost nodes in the Zookeeper znode file. This caused default limits to be exceeded in certain situations. Amazon EMR now removes the decommissioned or lost node records older than one hour from the Zookeeper file and the internal limits have been increased.
+ Fixed an issue where scaling requests failed for a large, highly utilized cluster when Amazon EMR on-cluster daemons were running health checking activities, such as gathering YARN node state and HDFS node state. This was happening because on-cluster daemons were not able to communicate the health status data of a node to internal Amazon EMR components.
+ Improved EMR on-cluster daemons to correctly track the node states when IP addresses are reused to improve reliability during scaling operations.
+ [SPARK-29683](https://issues.apache.org/jira/browse/SPARK-29683). Fixed an issue where job failures occurred during cluster scale-down as Spark was assuming all available nodes were deny-listed.
+ [YARN-9011](https://issues.apache.org/jira/browse/YARN-9011). Fixed an issue where job failures occurred due to a race condition in YARN decommissioning when the cluster tried to scale up or down.
+ Fixed an issue with step or job failures during cluster scaling by ensuring that the node states are always consistent between the Amazon EMR on-cluster daemons and YARN/HDFS.
+ Fixed an issue where cluster operations such as scale down and step submission failed for Amazon EMR clusters enabled with Kerberos authentication. This was because the Amazon EMR on-cluster daemon did not renew the Kerberos ticket, which is required to securely communicate with HDFS/YARN running on the primary node.
+ **Configuring a cluster to fix Apache YARN Timeline Server version 1 and 1.5 performance issues**

  Apache YARN Timeline Server version 1 and 1.5 can cause performance issues with very active, large EMR clusters, particularly with `yarn.resourcemanager.system-metrics-publisher.enabled=true`, which is the default setting in Amazon EMR. An open source YARN Timeline Server v2 solves the performance issue related to YARN Timeline Server scalability.

  Other workarounds for this issue include:
  + Configuring `yarn.resourcemanager.system-metrics-publisher.enabled=false` in `yarn-site.xml`.
  + Enabling the fix for this issue when creating a cluster, as described below.

  The following Amazon EMR releases contain a fix for this YARN Timeline Server performance issue.

  EMR 5.30.2, 5.31.1, 5.32.1, 5.33.1, 5.34.x, 6.0.1, 6.1.1, 6.2.1, 6.3.1, 6.4.x

  To enable the fix on any of the above specified Amazon EMR releases, set these properties to `true` in a configurations JSON file that is passed in using the [`aws emr create-cluster` command parameter](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-configure-apps-create-cluster.html): `--configurations file://./configurations.json`. Or enable the fix using the [reconfiguration console UI](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-configure-apps-running-cluster.html).

  Example of the configurations.json file contents:

  ```
  [
    {
      "Classification": "yarn-site",
      "Properties": {
        "yarn.resourcemanager.system-metrics-publisher.timeline-server-v1.enable-batch": "true",
        "yarn.resourcemanager.system-metrics-publisher.enabled": "true"
      },
      "Configurations": []
    }
  ]
  ```
+ WebHDFS and HttpFS server are disabled by default. You can re-enable WebHDFS using the Hadoop configuration, `dfs.webhdfs.enabled`. HttpFS server can be started by using `sudo systemctl start hadoop-httpfs`.
+ HTTPS is now enabled by default for Amazon Linux repositories. If you are using an Amazon S3 VPCE policy to restrict access to specific buckets, you must add the new Amazon Linux bucket ARN `arn:aws:s3:::amazonlinux-2-repos-$region/*` to your policy (replace `$region` with the Region where the endpoint is located). For more information, see [Announcement: Amazon Linux 2 now supports the ability to use HTTPS while connecting to package repositories](https://forums.aws.amazon.com/ann.jspa?annID=8528) in the AWS discussion forums.
+ Hive: Write query performance is improved by enabling the use of a scratch directory on HDFS for the last job. The temporary data for the final job is written to HDFS instead of Amazon S3, and performance improves because the data is moved from HDFS to the final table location (Amazon S3) instead of between Amazon S3 devices.
+ Hive: Query compilation time improvement up to 2.5x with Glue metastore Partition Pruning.
+ By default, when built-in UDFs are passed by Hive to the Hive Metastore Server, only a subset of those built-in UDFs is passed to the Glue Metastore, because Glue supports only limited expression operators. If you set `hive.glue.partition.pruning.client=true`, then all partition pruning happens on the client side. If you set `hive.glue.partition.pruning.server=true`, then all partition pruning happens on the server side.
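
The YARN Timeline Server guidance earlier in this list can be sketched in Python as follows. The snippet checks whether a release label appears in the list of releases that contain the fix and, if so, writes the `configurations.json` file shown above. The helper names are hypothetical. The release list is copied from the notes above, and treating every unlisted release as lacking the fix is an assumption of this sketch, not a statement about later releases.

```python
import json

# Releases named in the notes above. "5.34.x" and "6.4.x" cover a whole minor line.
EXACT_RELEASES = {
    "5.30.2", "5.31.1", "5.32.1", "5.33.1",
    "6.0.1", "6.1.1", "6.2.1", "6.3.1",
}
MINOR_LINES = {"5.34", "6.4"}

FIX_PROPERTIES = (
    "yarn.resourcemanager.system-metrics-publisher.timeline-server-v1.enable-batch",
    "yarn.resourcemanager.system-metrics-publisher.enabled",
)


def release_has_timeline_fix(release_label: str) -> bool:
    """Return True if the label (with or without 'emr-') is in the listed releases."""
    version = release_label[4:] if release_label.startswith("emr-") else release_label
    major_minor = ".".join(version.split(".")[:2])
    return version in EXACT_RELEASES or major_minor in MINOR_LINES


def timeline_fix_configurations() -> list:
    """Build the yarn-site configuration object that sets both properties to true."""
    return [{
        "Classification": "yarn-site",
        "Properties": {name: "true" for name in FIX_PROPERTIES},
        "Configurations": [],
    }]


def write_configurations(path: str) -> None:
    """Write the configuration list to a JSON file for use with --configurations."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(timeline_fix_configurations(), handle, indent=2)


if __name__ == "__main__":
    if release_has_timeline_fix("emr-6.4.0"):
        write_configurations("configurations.json")
```

The resulting file can then be passed to `aws emr create-cluster` with `--configurations file://./configurations.json`, as described above.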

**Known issues**
+ Hue queries do not work in Amazon EMR 6.4.0 because Apache Hadoop HttpFS server is disabled by default. To use Hue on Amazon EMR 6.4.0, either manually start HttpFS server on the Amazon EMR primary node using `sudo systemctl start hadoop-httpfs`, or [use an Amazon EMR step](https://docs.aws.amazon.com/emr/latest/ManagementGuide/add-step-cli.html).
+ The Amazon EMR Notebooks feature used with Livy user impersonation does not work because HttpFS is disabled by default. In this case, the EMR notebook cannot connect to the cluster that has Livy impersonation enabled. The workaround is to start HttpFS server before connecting the EMR notebook to the cluster using `sudo systemctl start hadoop-httpfs`.
+ In Amazon EMR version 6.4.0, Phoenix does not support the Phoenix connectors component.
+ To use Spark actions with Apache Oozie, you must add the following configuration to your Oozie `workflow.xml` file. Otherwise, several critical libraries such as Hadoop and EMRFS will be missing from the classpath of the Spark executors that Oozie launches.

  ```
  <spark-opts>--conf spark.yarn.populateHadoopClasspath=true</spark-opts>
  ```
+ When you use Spark with Hive partition location formatting to read data in Amazon S3, and you run Spark on Amazon EMR releases 5.30.0 to 5.36.0, and 6.2.0 to 6.9.0, you might encounter an issue that prevents your cluster from reading data correctly. This can happen if your partitions have all of the following characteristics:
  + Two or more partitions are scanned from the same table.
  + At least one partition directory path is a prefix of at least one other partition directory path, for example, `s3://bucket/table/p=a` is a prefix of `s3://bucket/table/p=a b`.
  + The first character that follows the prefix in the other partition directory has a UTF-8 value that is less than the `/` character (U+002F). For example, the space character (U+0020) that occurs between a and b in `s3://bucket/table/p=a b` falls into this category. Note that there are 14 other non-control characters: `!"#$%&'()*+,-`. For more information, see [UTF-8 encoding table and Unicode characters](https://www.utf8-chartable.de/).

  As a workaround to this issue, set the `spark.sql.sources.fastS3PartitionDiscovery.enabled` configuration to `false` in the `spark-defaults` classification.
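
To check whether a table's partitions match the pattern described in the last known issue, a small script along the following lines may help. This is an illustrative sketch only: `risky_partition_pairs` is a hypothetical helper, it works on a list of partition paths that you supply rather than listing Amazon S3 itself, and it compares characters by code point, which matches UTF-8 byte order for the ASCII characters involved.

```python
from itertools import permutations

SLASH = "/"  # U+002F


def risky_partition_pairs(paths):
    """Return (prefix_path, longer_path) pairs that match the described pattern.

    A pair matches when one partition path is a prefix of another and the first
    character after the prefix sorts below '/'.
    """
    cleaned = [p.rstrip("/") for p in paths]
    pairs = []
    for shorter, longer in permutations(cleaned, 2):
        if len(longer) > len(shorter) and longer.startswith(shorter):
            next_char = longer[len(shorter)]
            if next_char < SLASH:
                pairs.append((shorter, longer))
    return pairs


# Workaround from the notes above, in configurations JSON shape.
WORKAROUND = [{
    "Classification": "spark-defaults",
    "Properties": {"spark.sql.sources.fastS3PartitionDiscovery.enabled": "false"},
}]

if __name__ == "__main__":
    sample = [
        "s3://bucket/table/p=a",
        "s3://bucket/table/p=a b",  # space (U+0020) follows the prefix
        "s3://bucket/table/p=ab",   # 'b' sorts above '/', so not affected
    ]
    for prefix, longer in risky_partition_pairs(sample):
        print(f"{prefix!r} is a risky prefix of {longer!r}")
```

If the function returns any pairs, applying the `WORKAROUND` configuration is the mitigation that the notes describe.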

## 6.4.0 component versions
<a name="emr-640-components"></a>

The components that Amazon EMR installs with this release are listed below. Some are installed as part of big-data application packages. Others are unique to Amazon EMR and installed for system processes and features. These typically start with `emr` or `aws`. Big-data application packages in the most recent Amazon EMR release are usually the latest version found in the community. We make community releases available in Amazon EMR as quickly as possible.

Some components in Amazon EMR differ from community versions. These components have a version label in the form `CommunityVersion-amzn-EmrVersion`. The `EmrVersion` starts at 0. For example, if an open source community component named `myapp-component` with version 2.2 has been modified three times for inclusion in different Amazon EMR releases, its release version is listed as `2.2-amzn-2`.
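
The version label format can be split programmatically, for example when comparing component versions across releases. The following Python sketch assumes that the EMR part is numeric with optional dotted segments, based on labels such as `3.2.1-amzn-3.1` that appear on this page. `ComponentVersion` and `parse_component_version` are hypothetical names introduced for illustration.

```python
import re
from typing import NamedTuple, Optional


class ComponentVersion(NamedTuple):
    community: str               # upstream community version, for example "2.2"
    emr_revision: Optional[str]  # EMR revision, or None for unmodified components


# CommunityVersion-amzn-EmrVersion, where EmrVersion is digits with optional dots.
_AMZN_LABEL = re.compile(r"^(?P<community>.+)-amzn-(?P<emr>\d+(?:\.\d+)*)$")


def parse_component_version(label: str) -> ComponentVersion:
    """Split a version label into its community version and EMR revision."""
    text = label.strip()
    match = _AMZN_LABEL.match(text)
    if match:
        return ComponentVersion(match.group("community"), match.group("emr"))
    # No -amzn- suffix: the component ships at the community version.
    return ComponentVersion(text, None)


if __name__ == "__main__":
    for label in ("2.2-amzn-2", "3.2.1-amzn-4", "0.9.2"):
        print(label, "->", parse_component_version(label))
```

The EMR revision is kept as a string because some labels use a dotted revision rather than a single integer.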


| Component | Version | Description | 
| --- | --- | --- | 
| aws-sagemaker-spark-sdk | 1.4.1 | Amazon SageMaker Spark SDK | 
| emr-ddb | 4.16.0 | Amazon DynamoDB connector for Hadoop ecosystem applications. | 
| emr-goodies | 3.2.0 | Extra convenience libraries for the Hadoop ecosystem. | 
| emr-kinesis | 3.5.0 | Amazon Kinesis connector for Hadoop ecosystem applications. | 
| emr-notebook-env | 1.3.0 | Conda environment for EMR Notebooks that includes Jupyter Enterprise Gateway. | 
| emr-s3-dist-cp | 2.18.0 | Distributed copy application optimized for Amazon S3. | 
| emr-s3-select | 2.1.0 | EMR S3Select Connector | 
| emrfs | 2.47.0 | Amazon S3 connector for Hadoop ecosystem applications. | 
| flink-client | 1.13.1 | Apache Flink command line client scripts and applications. | 
| flink-jobmanager-config | 1.13.1 | Managing resources on EMR nodes for Apache Flink JobManager. | 
| ganglia-monitor | 3.7.2 | Embedded Ganglia agent for Hadoop ecosystem applications along with the Ganglia monitoring agent. | 
| ganglia-metadata-collector | 3.7.2 | Ganglia metadata collector for aggregating metrics from Ganglia monitoring agents. | 
| ganglia-web | 3.7.1 | Web application for viewing metrics collected by the Ganglia metadata collector. | 
| hadoop-client | 3.2.1-amzn-4 | Hadoop command-line clients such as 'hdfs', 'hadoop', or 'yarn'. | 
| hadoop-hdfs-datanode | 3.2.1-amzn-4 | HDFS node-level service for storing blocks. | 
| hadoop-hdfs-library | 3.2.1-amzn-4 | HDFS command-line client and library | 
| hadoop-hdfs-namenode | 3.2.1-amzn-4 | HDFS service for tracking file names and block locations. | 
| hadoop-hdfs-journalnode | 3.2.1-amzn-4 | HDFS service for managing the Hadoop filesystem journal on HA clusters. | 
| hadoop-httpfs-server | 3.2.1-amzn-4 | HTTP endpoint for HDFS operations. | 
| hadoop-kms-server | 3.2.1-amzn-4 | Cryptographic key management server based on Hadoop's KeyProvider API. | 
| hadoop-mapred | 3.2.1-amzn-4 | MapReduce execution engine libraries for running a MapReduce application. | 
| hadoop-yarn-nodemanager | 3.2.1-amzn-4 | YARN service for managing containers on an individual node. | 
| hadoop-yarn-resourcemanager | 3.2.1-amzn-4 | YARN service for allocating and managing cluster resources and distributed applications. | 
| hadoop-yarn-timeline-server | 3.2.1-amzn-4 | Service for retrieving current and historical information for YARN applications. | 
| hbase-hmaster | 2.4.4-amzn-0 | Service for an HBase cluster responsible for coordination of Regions and execution of administrative commands. | 
| hbase-region-server | 2.4.4-amzn-0 | Service for serving one or more HBase regions. | 
| hbase-client | 2.4.4-amzn-0 | HBase command-line client. | 
| hbase-rest-server | 2.4.4-amzn-0 | Service providing a RESTful HTTP endpoint for HBase. | 
| hbase-thrift-server | 2.4.4-amzn-0 | Service providing a Thrift endpoint to HBase. | 
| hcatalog-client | 3.1.2-amzn-5 | The 'hcat' command line client for manipulating hcatalog-server. | 
| hcatalog-server | 3.1.2-amzn-5 | Service providing HCatalog, a table and storage management layer for distributed applications. | 
| hcatalog-webhcat-server | 3.1.2-amzn-5 | HTTP endpoint providing a REST interface to HCatalog. | 
| hive-client | 3.1.2-amzn-5 | Hive command line client. | 
| hive-hbase | 3.1.2-amzn-5 | Hive-hbase client. | 
| hive-metastore-server | 3.1.2-amzn-5 | Service for accessing the Hive metastore, a semantic repository storing metadata for SQL on Hadoop operations. | 
| hive-server2 | 3.1.2-amzn-5 | Service for accepting Hive queries as web requests. | 
| hudi | 0.8.0-amzn-0 | Incremental processing framework to power data pipelines at low latency and high efficiency. | 
| hudi-presto | 0.8.0-amzn-0 | Bundle library for running Presto with Hudi. | 
| hudi-trino | 0.8.0-amzn-0 | Bundle library for running Trino with Hudi. | 
| hudi-spark | 0.8.0-amzn-0 | Bundle library for running Spark with Hudi. | 
| hue-server | 4.9.0 | Web application for analyzing data using Hadoop ecosystem applications | 
| jupyterhub | 1.4.1 | Multi-user server for Jupyter notebooks | 
| livy-server | 0.7.1-incubating | REST interface for interacting with Apache Spark | 
| nginx | 1.12.1 | nginx [engine x] is an HTTP and reverse proxy server | 
| mxnet | 1.8.0 | A flexible, scalable, and efficient library for deep learning. | 
| mariadb-server | 5.5.68+ | MariaDB database server. | 
| nvidia-cuda | 10.1.243 | Nvidia drivers and Cuda toolkit | 
| oozie-client | 5.2.1 | Oozie command-line client. | 
| oozie-server | 5.2.1 | Service for accepting Oozie workflow requests. | 
| opencv | 4.5.0 | Open Source Computer Vision Library. | 
| phoenix-library | 5.1.2 | The phoenix libraries for server and client | 
| phoenix-query-server | 5.1.2 | A light weight server providing JDBC access as well as Protocol Buffers and JSON format access to the Avatica API  | 
| presto-coordinator | 0.254.1-amzn-0 | Service for accepting queries and managing query execution among presto-workers. | 
| presto-worker | 0.254.1-amzn-0 | Service for executing pieces of a query. | 
| presto-client | 0.254.1-amzn-0 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| trino-coordinator | 359 | Service for accepting queries and managing query execution among trino-workers. | 
| trino-worker | 359 | Service for executing pieces of a query. | 
| trino-client | 359 | Trino command-line client which is installed on an HA cluster's stand-by masters where Trino server is not started. | 
| pig-client | 0.17.0 | Pig command-line client. | 
| r | 4.0.2 | The R Project for Statistical Computing | 
| ranger-kms-server | 2.0.0 | Apache Ranger Key Management System | 
| spark-client | 3.1.2-amzn-0 | Spark command-line clients. | 
| spark-history-server | 3.1.2-amzn-0 | Web UI for viewing logged events for the lifetime of a completed Spark application. | 
| spark-on-yarn | 3.1.2-amzn-0 | In-memory execution engine for YARN. | 
| spark-yarn-slave | 3.1.2-amzn-0 | Apache Spark libraries needed by YARN slaves. | 
| spark-rapids | 0.4.1 | Nvidia Spark RAPIDS plugin that accelerates Apache Spark with GPUs. | 
| sqoop-client | 1.4.7 | Apache Sqoop command-line client. | 
| tensorflow | 2.4.1 | TensorFlow open source software library for high performance numerical computation. | 
| tez-on-yarn | 0.9.2 | The tez YARN application and libraries. | 
| webserver | 2.4.41+ | Apache HTTP server. | 
| zeppelin-server | 0.9.0 | Web-based notebook that enables interactive data analytics. | 
| zookeeper-server | 3.5.7 | Centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. | 
| zookeeper-client | 3.5.7 | ZooKeeper command line client. | 

## 6.4.0 configuration classifications
<a name="emr-640-class"></a>

Configuration classifications allow you to customize applications. These often correspond to a configuration XML file for the application, such as `hive-site.xml`. For more information, see [Configure applications](emr-configure-apps.md).

Reconfiguration actions occur when you specify a configuration for instance groups in a running cluster. Amazon EMR only initiates reconfiguration actions for the classifications that you modify. For more information, see [Reconfigure an instance group in a running cluster](emr-configure-apps-running-cluster.md).


**emr-6.4.0 classifications**  

| Classifications | Description | Reconfiguration Actions | 
| --- | --- | --- | 
| capacity-scheduler | Change values in Hadoop's capacity-scheduler.xml file. | Restarts the ResourceManager service. | 
| container-executor | Change values in Hadoop YARN's container-executor.cfg file. | Not available. | 
| container-log4j | Change values in Hadoop YARN's container-log4j.properties file. | Not available. | 
| core-site | Change values in Hadoop's core-site.xml file. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Ranger KMS, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| docker-conf | Change docker related settings. | Not available. | 
| emrfs-site | Change EMRFS settings. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts HBaseRegionserver, HBaseMaster, HBaseThrift, HBaseRest, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| flink-conf | Change flink-conf.yaml settings. | Restarts Flink history server. | 
| flink-log4j | Change Flink log4j.properties settings. | Restarts Flink history server. | 
| flink-log4j-session | Change Flink log4j-session.properties settings for Kubernetes/Yarn session. | Restarts Flink history server. | 
| flink-log4j-cli | Change Flink log4j-cli.properties settings. | Restarts Flink history server. | 
| hadoop-env | Change values in the Hadoop environment for all Hadoop components. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts PhoenixQueryserver, HiveServer2, Hive MetaStore, and MapReduce-HistoryServer. | 
| hadoop-log4j | Change values in Hadoop's log4j.properties file. | Restarts the Hadoop HDFS services SecondaryNamenode, Datanode, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| hadoop-ssl-server | Change hadoop ssl server configuration | Not available. | 
| hadoop-ssl-client | Change hadoop ssl client configuration | Not available. | 
| hbase | Amazon EMR-curated settings for Apache HBase. | Custom EMR specific property. Sets emrfs-site and hbase-site configs. See those for their associated restarts. | 
| hbase-env | Change values in HBase's environment. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-log4j | Change values in HBase's hbase-log4j.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-metrics | Change values in HBase's hadoop-metrics2-hbase.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-policy | Change values in HBase's hbase-policy.xml file. | Not available. | 
| hbase-site | Change values in HBase's hbase-site.xml file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. Additionally restarts Phoenix QueryServer. | 
| hdfs-encryption-zones | Configure HDFS encryption zones. | This classification should not be reconfigured. | 
| hdfs-env | Change values in the HDFS environment. | Restarts Hadoop HDFS services Namenode, Datanode, and ZKFC. | 
| hdfs-site | Change values in HDFS's hdfs-site.xml. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Additionally restarts Hadoop Httpfs. | 
| hcatalog-env | Change values in HCatalog's environment. | Restarts Hive HCatalog Server. | 
| hcatalog-server-jndi | Change values in HCatalog's jndi.properties. | Restarts Hive HCatalog Server. | 
| hcatalog-server-proto-hive-site | Change values in HCatalog's proto-hive-site.xml. | Restarts Hive HCatalog Server. | 
| hcatalog-webhcat-env | Change values in HCatalog WebHCat's environment. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-log4j2 | Change values in HCatalog WebHCat's log4j2.properties. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-site | Change values in HCatalog WebHCat's webhcat-site.xml file. | Restarts Hive WebHCat server. | 
| hive | Amazon EMR-curated settings for Apache Hive. | Sets configurations to launch Hive LLAP service. | 
| hive-beeline-log4j2 | Change values in Hive's beeline-log4j2.properties file. | Not available. | 
| hive-parquet-logging | Change values in Hive's parquet-logging.properties file. | Not available. | 
| hive-env | Change values in the Hive environment. | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. | 
| hive-exec-log4j2 | Change values in Hive's hive-exec-log4j2.properties file. | Not available. | 
| hive-llap-daemon-log4j2 | Change values in Hive's llap-daemon-log4j2.properties file. | Not available. | 
| hive-log4j2 | Change values in Hive's hive-log4j2.properties file. | Not available. | 
| hive-site | Change values in Hive's hive-site.xml file | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. Also restarts Oozie and Zeppelin. | 
| hiveserver2-site | Change values in Hive Server2's hiveserver2-site.xml file | Not available. | 
| hue-ini | Change values in Hue's ini file | Restarts Hue. Also activates Hue config override CLI commands to pick up new configurations. | 
| httpfs-env | Change values in the HTTPFS environment. | Restarts Hadoop Httpfs service. | 
| httpfs-site | Change values in Hadoop's httpfs-site.xml file. | Restarts Hadoop Httpfs service. | 
| hadoop-kms-acls | Change values in Hadoop's kms-acls.xml file. | Not available. | 
| hadoop-kms-env | Change values in the Hadoop KMS environment. | Restarts Hadoop-KMS service. | 
| hadoop-kms-log4j | Change values in Hadoop's kms-log4j.properties file. | Not available. | 
| hadoop-kms-site | Change values in Hadoop's kms-site.xml file. | Restarts Hadoop-KMS and Ranger-KMS service. | 
| hudi-env | Change values in the Hudi environment. | Not available. | 
| hudi-defaults | Change values in Hudi's hudi-defaults.conf file. | Not available. | 
| jupyter-notebook-conf | Change values in Jupyter Notebook's jupyter_notebook_config.py file. | Not available. | 
| jupyter-hub-conf | Change values in JupyterHub's jupyterhub_config.py file. | Not available. | 
| jupyter-s3-conf | Configure Jupyter Notebook S3 persistence. | Not available. | 
| jupyter-sparkmagic-conf | Change values in Sparkmagic's config.json file. | Not available. | 
| livy-conf | Change values in Livy's livy.conf file. | Restarts Livy Server. | 
| livy-env | Change values in the Livy environment. | Restarts Livy Server. | 
| livy-log4j | Change Livy log4j.properties settings. | Restarts Livy Server. | 
| mapred-env | Change values in the MapReduce application's environment. | Restarts Hadoop MapReduce-HistoryServer. | 
| mapred-site | Change values in the MapReduce application's mapred-site.xml file. | Restarts Hadoop MapReduce-HistoryServer. | 
| oozie-env | Change values in Oozie's environment. | Restarts Oozie. | 
| oozie-log4j | Change values in Oozie's oozie-log4j.properties file. | Restarts Oozie. | 
| oozie-site | Change values in Oozie's oozie-site.xml file. | Restarts Oozie. | 
| phoenix-hbase-metrics | Change values in Phoenix's hadoop-metrics2-hbase.properties file. | Not available. | 
| phoenix-hbase-site | Change values in Phoenix's hbase-site.xml file. | Not available. | 
| phoenix-log4j | Change values in Phoenix's log4j.properties file. | Restarts Phoenix-QueryServer. | 
| phoenix-metrics | Change values in Phoenix's hadoop-metrics2-phoenix.properties file. | Not available. | 
| pig-env | Change values in the Pig environment. | Not available. | 
| pig-properties | Change values in Pig's pig.properties file. | Restarts Oozie. | 
| pig-log4j | Change values in Pig's log4j.properties file. | Not available. | 
| presto-log | Change values in Presto's log.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-config | Change values in Presto's config.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-password-authenticator | Change values in Presto's password-authenticator.properties file. | Not available. | 
| presto-env | Change values in Presto's presto-env.sh file. | Restarts Presto-Server (for PrestoDB) | 
| presto-node | Change values in Presto's node.properties file. | Not available. | 
| presto-connector-blackhole | Change values in Presto's blackhole.properties file. | Not available. | 
| presto-connector-cassandra | Change values in Presto's cassandra.properties file. | Not available. | 
| presto-connector-hive | Change values in Presto's hive.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-jmx | Change values in Presto's jmx.properties file. | Not available. | 
| presto-connector-kafka | Change values in Presto's kafka.properties file. | Not available. | 
| presto-connector-localfile | Change values in Presto's localfile.properties file. | Not available. | 
| presto-connector-memory | Change values in Presto's memory.properties file. | Not available. | 
| presto-connector-mongodb | Change values in Presto's mongodb.properties file. | Not available. | 
| presto-connector-mysql | Change values in Presto's mysql.properties file. | Not available. | 
| presto-connector-postgresql | Change values in Presto's postgresql.properties file. | Not available. | 
| presto-connector-raptor | Change values in Presto's raptor.properties file. | Not available. | 
| presto-connector-redis | Change values in Presto's redis.properties file. | Not available. | 
| presto-connector-redshift | Change values in Presto's redshift.properties file. | Not available. | 
| presto-connector-tpch | Change values in Presto's tpch.properties file. | Not available. | 
| presto-connector-tpcds | Change values in Presto's tpcds.properties file. | Not available. | 
| trino-log | Change values in Trino's log.properties file. | Restarts Trino-Server (for Trino) | 
| trino-config | Change values in Trino's config.properties file. | Restarts Trino-Server (for Trino) | 
| trino-password-authenticator | Change values in Trino's password-authenticator.properties file. | Restarts Trino-Server (for Trino) | 
| trino-env | Change values in Trino's trino-env.sh file. | Restarts Trino-Server (for Trino) | 
| trino-node | Change values in Trino's node.properties file. | Not available. | 
| trino-connector-blackhole | Change values in Trino's blackhole.properties file. | Not available. | 
| trino-connector-cassandra | Change values in Trino's cassandra.properties file. | Not available. | 
| trino-connector-hive | Change values in Trino's hive.properties file. | Restarts Trino-Server (for Trino) | 
| trino-connector-jmx | Change values in Trino's jmx.properties file. | Not available. | 
| trino-connector-kafka | Change values in Trino's kafka.properties file. | Not available. | 
| trino-connector-localfile | Change values in Trino's localfile.properties file. | Not available. | 
| trino-connector-memory | Change values in Trino's memory.properties file. | Not available. | 
| trino-connector-mongodb | Change values in Trino's mongodb.properties file. | Not available. | 
| trino-connector-mysql | Change values in Trino's mysql.properties file. | Not available. | 
| trino-connector-postgresql | Change values in Trino's postgresql.properties file. | Not available. | 
| trino-connector-raptor | Change values in Trino's raptor.properties file. | Not available. | 
| trino-connector-redis | Change values in Trino's redis.properties file. | Not available. | 
| trino-connector-redshift | Change values in Trino's redshift.properties file. | Not available. | 
| trino-connector-tpch | Change values in Trino's tpch.properties file. | Not available. | 
| trino-connector-tpcds | Change values in Trino's tpcds.properties file. | Not available. | 
| ranger-kms-dbks-site | Change values in dbks-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-site | Change values in ranger-kms-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-env | Change values in the Ranger KMS environment. | Restarts Ranger KMS Server. | 
| ranger-kms-log4j | Change values in kms-log4j.properties file of Ranger KMS. | Not available. | 
| ranger-kms-db-ca | Change values for CA file on S3 for MySQL SSL connection with Ranger KMS. | Not available. | 
| spark | Amazon EMR-curated settings for Apache Spark. | This property modifies spark-defaults. See actions there. | 
| spark-defaults | Change values in Spark's spark-defaults.conf file. | Restarts Spark history server and Spark thrift server. | 
| spark-env | Change values in the Spark environment. | Restarts Spark history server and Spark thrift server. | 
| spark-hive-site | Change values in Spark's hive-site.xml file. | Not available. | 
| spark-log4j | Change values in Spark's log4j.properties file. | Restarts Spark history server and Spark thrift server. | 
| spark-metrics | Change values in Spark's metrics.properties file. | Restarts Spark history server and Spark thrift server. | 
| sqoop-env | Change values in Sqoop's environment. | Not available. | 
| sqoop-oraoop-site | Change values in Sqoop OraOop's oraoop-site.xml file. | Not available. | 
| sqoop-site | Change values in Sqoop's sqoop-site.xml file. | Not available. | 
| tez-site | Change values in Tez's tez-site.xml file. | Restarts Oozie and HiveServer2. | 
| yarn-env | Change values in the YARN environment. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts MapReduce-HistoryServer. | 
| yarn-site | Change values in YARN's yarn-site.xml file. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Livy Server and MapReduce-HistoryServer. | 
| zeppelin-env | Change values in the Zeppelin environment. | Restarts Zeppelin. | 
| zeppelin-site | Change configuration settings in zeppelin-site.xml. | Restarts Zeppelin. | 
| zookeeper-config | Change values in ZooKeeper's zoo.cfg file. | Restarts Zookeeper server. | 
| zookeeper-log4j | Change values in ZooKeeper's log4j.properties file. | Restarts Zookeeper server. | 

# Amazon EMR release 6.3.1
<a name="emr-631-release"></a>

## 6.3.1 application versions
<a name="emr-631-app-versions"></a>

This release includes the following applications: [Flink](https://flink.apache.org/), [Ganglia](http://ganglia.info), [HBase](http://hbase.apache.org/), [HCatalog](https://cwiki.apache.org/confluence/display/Hive/HCatalog), [Hadoop](http://hadoop.apache.org/docs/current/), [Hive](http://hive.apache.org/), [Hudi](https://hudi.apache.org), [Hue](http://gethue.com/), [JupyterEnterpriseGateway](https://jupyter-enterprise-gateway.readthedocs.io/en/latest/), [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/#), [Livy](https://livy.incubator.apache.org/), [MXNet](https://mxnet.incubator.apache.org/), [Oozie](http://oozie.apache.org/), [Phoenix](https://phoenix.apache.org/), [Pig](http://pig.apache.org/), [Presto](https://prestodb.io/), [Trino (PrestoSQL)](https://prestosql.io/), [Spark](https://spark.apache.org/docs/latest/), [Sqoop](http://sqoop.apache.org/), [TensorFlow](https://www.tensorflow.org/), [Tez](https://tez.apache.org/), [Zeppelin](https://zeppelin.incubator.apache.org/), and [ZooKeeper](https://zookeeper.apache.org).

The table below lists the application versions available in this release of Amazon EMR and the application versions in the preceding three Amazon EMR releases (when applicable).

For a comprehensive history of application versions for each release of Amazon EMR, see the following topics:
+ [Application versions in Amazon EMR 7.x releases](emr-release-app-versions-7.x.md)
+ [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md)
+ [Application versions in Amazon EMR 5.x releases](emr-release-app-versions-5.x.md)
+ [Application versions in Amazon EMR 4.x releases](emr-release-app-versions-4.x.md)


**Application version information**  

|  | emr-6.3.1 | emr-6.3.0 | emr-6.2.1 | emr-6.2.0 | 
| --- | --- | --- | --- | --- | 
| AWS SDK for Java | 1.11.977 | 1.11.977 | 1.11.880 | 1.11.880 | 
| Python | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 
| Scala | 2.12.10 | 2.12.10 | 2.12.10 | 2.12.10 | 
| AmazonCloudWatchAgent |  -  |  -  |  -  |  -  | 
| Delta |  -  |  -  |  -  |  -  | 
| Flink | 1.12.1 | 1.12.1 | 1.11.2 | 1.11.2 | 
| Ganglia | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 
| HBase | 2.2.6-amzn-1 | 2.2.6-amzn-1 | 2.2.6-amzn-0 | 2.2.6-amzn-0 | 
| HCatalog | 3.1.2-amzn-4 | 3.1.2-amzn-4 | 3.1.2-amzn-3 | 3.1.2-amzn-3 | 
| Hadoop | 3.2.1-amzn-3.1 | 3.2.1-amzn-3 | 3.2.1-amzn-2.1 | 3.2.1-amzn-2 | 
| Hive | 3.1.2-amzn-4 | 3.1.2-amzn-4 | 3.1.2-amzn-3 | 3.1.2-amzn-3 | 
| Hudi | 0.7.0-amzn-0 | 0.7.0-amzn-0 | 0.6.0-amzn-1 | 0.6.0-amzn-1 | 
| Hue | 4.9.0 | 4.9.0 | 4.8.0 | 4.8.0 | 
| Iceberg |  -  |  -  |  -  |  -  | 
| JupyterEnterpriseGateway | 2.1.0 | 2.1.0 | 2.1.0 | 2.1.0 | 
| JupyterHub | 1.2.2 | 1.2.2 | 1.1.0 | 1.1.0 | 
| Livy | 0.7.0-incubating | 0.7.0-incubating | 0.7.0-incubating | 0.7.0-incubating | 
| MXNet | 1.7.0 | 1.7.0 | 1.7.0 | 1.7.0 | 
| Mahout |  -  |  -  |  -  |  -  | 
| Oozie | 5.2.1 | 5.2.1 | 5.2.0 | 5.2.0 | 
| Phoenix | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 
| Pig | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 
| Presto | 0.245.1-amzn-0 | 0.245.1-amzn-0 | 0.238.3-amzn-1 | 0.238.3-amzn-1 | 
| Spark | 3.1.1-amzn-0.1 | 3.1.1-amzn-0 | 3.0.1-amzn-0.1 | 3.0.1-amzn-0 | 
| Sqoop | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 
| TensorFlow | 2.4.1 | 2.4.1 | 2.3.1 | 2.3.1 | 
| Tez | 0.9.2 | 0.9.2 | 0.9.2 | 0.9.2 | 
| Trino (PrestoSQL) | 350 | 350 | 343 | 343 | 
| Zeppelin | 0.9.0 | 0.9.0 | 0.9.0-preview1 | 0.9.0-preview1 | 
| ZooKeeper | 3.4.14 | 3.4.14 | 3.4.14 | 3.4.14 | 

## 6.3.1 release notes
<a name="emr-631-relnotes"></a>

This release fixes issues where Amazon EMR Scaling fails to scale a cluster up or down successfully, or causes application failures.

**Changes, Enhancements, and Resolved Issues**
+ Fixed an issue where scaling requests failed for a large, highly utilized cluster when Amazon EMR on-cluster daemons were running health checking activities, such as gathering YARN node state and HDFS node state. This was happening because on-cluster daemons were not able to communicate the health status data of a node to internal Amazon EMR components.
+ Improved EMR on-cluster daemons to correctly track the node states when IP addresses are reused to improve reliability during scaling operations.
+ [SPARK-29683](https://issues.apache.org/jira/browse/SPARK-29683). Fixed an issue where job failures occurred during cluster scale-down as Spark was assuming all available nodes were deny-listed.
+ [YARN-9011](https://issues.apache.org/jira/browse/YARN-9011). Fixed an issue where job failures occurred due to a race condition in YARN decommissioning when the cluster tried to scale up or down.
+ Fixed an issue with step or job failures during cluster scaling by ensuring that the node states are always consistent between the Amazon EMR on-cluster daemons and YARN/HDFS.
+ Fixed an issue where cluster operations such as scale down and step submission failed for Amazon EMR clusters enabled with Kerberos authentication. This was because the Amazon EMR on-cluster daemon did not renew the Kerberos ticket, which is required to securely communicate with HDFS/YARN running on the primary node.
+ HTTPS is now enabled by default for Amazon Linux repositories. If you are using an Amazon S3 VPCE policy to restrict access to specific buckets, you must add the new Amazon Linux bucket ARN `arn:aws:s3:::amazonlinux-2-repos-$region/*` to your policy (replace `$region` with the Region where the endpoint is). For more information, see [Announcement: Amazon Linux 2 now supports the ability to use HTTPS while connecting to package repositories](https://forums.aws.amazon.com/ann.jspa?annID=8528) in the AWS discussion forums.
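The ARN substitution described in the last item can be sketched in Python. This is only an illustration: the statement ID, the `Principal` value, the `s3:GetObject` action, and the helper names are assumptions chosen for the example, not values taken from the release notes. Adapt them to the policy you already use.

```python
import json


def amazon_linux_repo_arn(region: str) -> str:
    """Return the Amazon Linux 2 repository bucket ARN for one Region."""
    return f"arn:aws:s3:::amazonlinux-2-repos-{region}/*"


def add_repo_access(policy: dict, region: str) -> dict:
    """Return a copy of `policy` with one extra statement for the repo bucket.

    The statement contents (Sid, Principal, Action) are assumptions made for
    this sketch; only the Resource ARN pattern comes from the release notes.
    """
    statement = {
        "Sid": "AmazonLinux2Repos",  # hypothetical statement ID
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",  # assumed read-only action
        "Resource": [amazon_linux_repo_arn(region)],
    }
    # Build a new dict and a new list so the caller's policy is not mutated.
    return {**policy, "Statement": [*policy.get("Statement", []), statement]}


existing = {"Version": "2012-10-17", "Statement": []}
print(json.dumps(add_repo_access(existing, "us-east-1"), indent=2))
```

Use the Region of the VPC endpoint itself when you build the ARN, as the release note specifies.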

**Known issues**
+ When you use Spark with Hive partition location formatting to read data in Amazon S3, and you run Spark on Amazon EMR releases 5.30.0 to 5.36.0, and 6.2.0 to 6.9.0, you might encounter an issue that prevents your cluster from reading data correctly. This can happen if your partitions have all of the following characteristics:
  + Two or more partitions are scanned from the same table.
  + At least one partition directory path is a prefix of at least one other partition directory path, for example, `s3://bucket/table/p=a` is a prefix of `s3://bucket/table/p=a b`.
  + The first character that follows the prefix in the other partition directory has a UTF-8 value that’s less than the `/` character (U+002F). For example, the space character (U+0020) that occurs between a and b in `s3://bucket/table/p=a b` falls into this category. Note that there are 14 other non-control characters: `!"#$%&'()*+,-.`. For more information, see [UTF-8 encoding table and Unicode characters](https://www.utf8-chartable.de/).

  As a workaround to this issue, set the `spark.sql.sources.fastS3PartitionDiscovery.enabled` configuration to `false` in the `spark-defaults` classification.
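The partition conditions and the workaround above can be sketched in Python. The helper names `is_affected_pair` and `affected_pairs` are hypothetical. The check compares Unicode code points, which gives the same result as comparing UTF-8 values against `/` because every non-ASCII character sorts above U+002F. Control characters also fall below `/`, so this sketch flags them too.

```python
import json
from itertools import permutations

SLASH = ord("/")  # U+002F


def is_affected_pair(prefix_path: str, other_path: str) -> bool:
    """True if prefix_path is a strict prefix of other_path and the first
    character after the prefix has a value below '/'."""
    a = prefix_path.rstrip("/")
    b = other_path.rstrip("/")
    return len(b) > len(a) and b.startswith(a) and ord(b[len(a)]) < SLASH


def affected_pairs(paths):
    """Return every ordered (prefix, other) pair that meets the conditions."""
    return [(a, b) for a, b in permutations(paths, 2) if is_affected_pair(a, b)]


# Workaround from the known issue, written as a configuration classification.
WORKAROUND = [
    {
        "Classification": "spark-defaults",
        "Properties": {
            "spark.sql.sources.fastS3PartitionDiscovery.enabled": "false"
        },
    }
]

paths = [
    "s3://bucket/table/p=a",
    "s3://bucket/table/p=a b",
    "s3://bucket/table/p=ab",
]
print(affected_pairs(paths))
print(json.dumps(WORKAROUND, indent=2))
```

A non-empty result from `affected_pairs` only suggests that a table layout matches the documented pattern. It does not confirm that a given query read data incorrectly.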

## 6.3.1 component versions
<a name="emr-631-components"></a>

The components that Amazon EMR installs with this release are listed below. Some are installed as part of big-data application packages. Others are unique to Amazon EMR and installed for system processes and features. These typically start with `emr` or `aws`. Big-data application packages in the most recent Amazon EMR release are usually the latest version found in the community. We make community releases available in Amazon EMR as quickly as possible.

Some components in Amazon EMR differ from community versions. These components have a version label in the form `CommunityVersion-amzn-EmrVersion`. The `EmrVersion` starts at 0. For example, if an open source community component named `myapp-component` with version 2.2 has been modified three times for inclusion in different Amazon EMR releases, its release version is listed as `2.2-amzn-2`.
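As a minimal sketch of the label format described above, the following Python splits a version label into its community and Amazon EMR parts. The names `ComponentVersion` and `parse_component_version` are hypothetical. The EMR part is kept as a string because labels such as `3.2.1-amzn-3.1` in the table below carry a dotted suffix.

```python
from typing import NamedTuple, Optional


class ComponentVersion(NamedTuple):
    community: str
    emr: Optional[str]  # None when the label has no "-amzn-" suffix


def parse_component_version(label: str) -> ComponentVersion:
    """Split 'CommunityVersion-amzn-EmrVersion' into its two parts."""
    community, sep, emr = label.partition("-amzn-")
    return ComponentVersion(community, emr if sep else None)


for label in ("2.2-amzn-2", "3.2.1-amzn-3.1", "0.17.0"):
    print(label, "->", parse_component_version(label))
```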


| Component | Version | Description | 
| --- | --- | --- | 
| aws-sagemaker-spark-sdk | 1.4.1 | Amazon SageMaker Spark SDK | 
| emr-ddb | 4.16.0 | Amazon DynamoDB connector for Hadoop ecosystem applications. | 
| emr-goodies | 3.2.0 | Extra convenience libraries for the Hadoop ecosystem. | 
| emr-kinesis | 3.5.0 | Amazon Kinesis connector for Hadoop ecosystem applications. | 
| emr-notebook-env | 1.2.0 | Conda env for emr notebook which includes jupyter enterprise gateway | 
| emr-s3-dist-cp | 2.18.0 | Distributed copy application optimized for Amazon S3. | 
| emr-s3-select | 2.1.0 | EMR S3Select Connector | 
| emrfs | 2.46.0 | Amazon S3 connector for Hadoop ecosystem applications. | 
| flink-client | 1.12.1 | Apache Flink command line client scripts and applications. | 
| flink-jobmanager-config | 1.12.1 | Managing resources on EMR nodes for Apache Flink JobManager. | 
| ganglia-monitor | 3.7.2 | Embedded Ganglia agent for Hadoop ecosystem applications along with the Ganglia monitoring agent. | 
| ganglia-metadata-collector | 3.7.2 | Ganglia metadata collector for aggregating metrics from Ganglia monitoring agents. | 
| ganglia-web | 3.7.1 | Web application for viewing metrics collected by the Ganglia metadata collector. | 
| hadoop-client | 3.2.1-amzn-3.1 | Hadoop command-line clients such as 'hdfs', 'hadoop', or 'yarn'. | 
| hadoop-hdfs-datanode | 3.2.1-amzn-3.1 | HDFS node-level service for storing blocks. | 
| hadoop-hdfs-library | 3.2.1-amzn-3.1 | HDFS command-line client and library | 
| hadoop-hdfs-namenode | 3.2.1-amzn-3.1 | HDFS service for tracking file names and block locations. | 
| hadoop-hdfs-journalnode | 3.2.1-amzn-3.1 | HDFS service for managing the Hadoop filesystem journal on HA clusters. | 
| hadoop-httpfs-server | 3.2.1-amzn-3.1 | HTTP endpoint for HDFS operations. | 
| hadoop-kms-server | 3.2.1-amzn-3.1 | Cryptographic key management server based on Hadoop's KeyProvider API. | 
| hadoop-mapred | 3.2.1-amzn-3.1 | MapReduce execution engine libraries for running a MapReduce application. | 
| hadoop-yarn-nodemanager | 3.2.1-amzn-3.1 | YARN service for managing containers on an individual node. | 
| hadoop-yarn-resourcemanager | 3.2.1-amzn-3.1 | YARN service for allocating and managing cluster resources and distributed applications. | 
| hadoop-yarn-timeline-server | 3.2.1-amzn-3.1 | Service for retrieving current and historical information for YARN applications. | 
| hbase-hmaster | 2.2.6-amzn-1 | Service for an HBase cluster responsible for coordination of Regions and execution of administrative commands. | 
| hbase-region-server | 2.2.6-amzn-1 | Service for serving one or more HBase regions. | 
| hbase-client | 2.2.6-amzn-1 | HBase command-line client. | 
| hbase-rest-server | 2.2.6-amzn-1 | Service providing a RESTful HTTP endpoint for HBase. | 
| hbase-thrift-server | 2.2.6-amzn-1 | Service providing a Thrift endpoint to HBase. | 
| hcatalog-client | 3.1.2-amzn-4 | The 'hcat' command line client for manipulating hcatalog-server. | 
| hcatalog-server | 3.1.2-amzn-4 | Service providing HCatalog, a table and storage management layer for distributed applications. | 
| hcatalog-webhcat-server | 3.1.2-amzn-4 | HTTP endpoint providing a REST interface to HCatalog. | 
| hive-client | 3.1.2-amzn-4 | Hive command line client. | 
| hive-hbase | 3.1.2-amzn-4 | Hive-hbase client. | 
| hive-metastore-server | 3.1.2-amzn-4 | Service for accessing the Hive metastore, a semantic repository storing metadata for SQL on Hadoop operations. | 
| hive-server2 | 3.1.2-amzn-4 | Service for accepting Hive queries as web requests. | 
| hudi | 0.7.0-amzn-0 | Incremental processing framework to power data pipeline at low latency and high efficiency. | 
| hudi-presto | 0.7.0-amzn-0 | Bundle library for running Presto with Hudi. | 
| hudi-prestosql | 0.7.0-amzn-0 | Bundle library for running PrestoSQL with Hudi. | 
| hudi-spark | 0.7.0-amzn-0 | Bundle library for running Spark with Hudi. | 
| hue-server | 4.9.0 | Web application for analyzing data using Hadoop ecosystem applications | 
| jupyterhub | 1.2.2 | Multi-user server for Jupyter notebooks | 
| livy-server | 0.7.0-incubating | REST interface for interacting with Apache Spark | 
| nginx | 1.12.1 | nginx [engine x] is an HTTP and reverse proxy server | 
| mxnet | 1.7.0 | A flexible, scalable, and efficient library for deep learning. | 
| mariadb-server | 5.5.68+ | MariaDB database server. | 
| nvidia-cuda | 10.1.243 | Nvidia drivers and Cuda toolkit | 
| oozie-client | 5.2.1 | Oozie command-line client. | 
| oozie-server | 5.2.1 | Service for accepting Oozie workflow requests. | 
| opencv | 4.5.0 | Open Source Computer Vision Library. | 
| phoenix-library | 5.0.0-HBase-2.0 | The phoenix libraries for server and client | 
| phoenix-query-server | 5.0.0-HBase-2.0 | A lightweight server providing JDBC access as well as Protocol Buffers and JSON format access to the Avatica API | 
| presto-coordinator | 0.245.1-amzn-0 | Service for accepting queries and managing query execution among presto-workers. | 
| presto-worker | 0.245.1-amzn-0 | Service for executing pieces of a query. | 
| presto-client | 0.245.1-amzn-0 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| prestosql-coordinator | 350 | Service for accepting queries and managing query execution among prestosql-workers. | 
| prestosql-worker | 350 | Service for executing pieces of a query. | 
| prestosql-client | 350 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| pig-client | 0.17.0 | Pig command-line client. | 
| r | 4.0.2 | The R Project for Statistical Computing | 
| ranger-kms-server | 2.0.0 | Apache Ranger Key Management System | 
| spark-client | 3.1.1-amzn-0.1 | Spark command-line clients. | 
| spark-history-server | 3.1.1-amzn-0.1 | Web UI for viewing logged events for the lifetime of a completed Spark application. | 
| spark-on-yarn | 3.1.1-amzn-0.1 | In-memory execution engine for YARN. | 
| spark-yarn-slave | 3.1.1-amzn-0.1 | Apache Spark libraries needed by YARN slaves. | 
| spark-rapids | 0.4.1 | Nvidia Spark RAPIDS plugin that accelerates Apache Spark with GPUs. | 
| sqoop-client | 1.4.7 | Apache Sqoop command-line client. | 
| tensorflow | 2.4.1 | TensorFlow open source software library for high performance numerical computation. | 
| tez-on-yarn | 0.9.2 | The tez YARN application and libraries. | 
| webserver | 2.4.41+ | Apache HTTP server. | 
| zeppelin-server | 0.9.0 | Web-based notebook that enables interactive data analytics. | 
| zookeeper-server | 3.4.14 | Centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. | 
| zookeeper-client | 3.4.14 | ZooKeeper command line client. | 

## 6.3.1 configuration classifications
<a name="emr-631-class"></a>

Configuration classifications allow you to customize applications. These often correspond to a configuration XML file for the application, such as `hive-site.xml`. For more information, see [Configure applications](emr-configure-apps.md).

Reconfiguration actions occur when you specify a configuration for instance groups in a running cluster. Amazon EMR only initiates reconfiguration actions for the classifications that you modify. For more information, see [Reconfigure an instance group in a running cluster](emr-configure-apps-running-cluster.md).
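As a rough illustration of the two paragraphs above, the following Python builds a configuration object for one classification and wraps it in a request body for reconfiguring an instance group. The helper names, the cluster and instance group IDs, and the chosen Hive property are placeholders for this sketch. The field names follow the shape that the Amazon EMR `ModifyInstanceGroups` API accepts, but verify them against the API reference before you rely on them.

```python
import json


def classification(name: str, properties: dict) -> dict:
    """Build one configuration object for a classification in the table below."""
    return {"Classification": name, "Properties": properties}


def reconfiguration_request(cluster_id: str, instance_group_id: str,
                            configurations: list) -> dict:
    """Wrap configurations in a request body for one instance group."""
    return {
        "ClusterId": cluster_id,
        "InstanceGroups": [
            {
                "InstanceGroupId": instance_group_id,
                "Configurations": configurations,
            }
        ],
    }


# Illustrative property only; choose the settings your workload needs.
hive_site = classification(
    "hive-site", {"hive.exec.dynamic.partition.mode": "nonstrict"}
)

# "j-EXAMPLE" and "ig-EXAMPLE" are placeholder IDs, not real resources.
request = reconfiguration_request("j-EXAMPLE", "ig-EXAMPLE", [hive_site])
print(json.dumps(request, indent=2))
```

Because only `hive-site` is modified in this request, only the reconfiguration actions listed for `hive-site` in the table below would be expected to run.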


**emr-6.3.1 classifications**  

| Classifications | Description | Reconfiguration Actions | 
| --- | --- | --- | 
| capacity-scheduler | Change values in Hadoop's capacity-scheduler.xml file. | Restarts the ResourceManager service. | 
| container-executor | Change values in Hadoop YARN's container-executor.cfg file. | Not available. | 
| container-log4j | Change values in Hadoop YARN's container-log4j.properties file. | Not available. | 
| core-site | Change values in Hadoop's core-site.xml file. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Ranger KMS, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| docker-conf | Change docker related settings. | Not available. | 
| emrfs-site | Change EMRFS settings. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts HBaseRegionserver, HBaseMaster, HBaseThrift, HBaseRest, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| flink-conf | Change flink-conf.yaml settings. | Restarts Flink history server. | 
| flink-log4j | Change Flink log4j.properties settings. | Restarts Flink history server. | 
| flink-log4j-session | Change Flink log4j-session.properties settings for Kubernetes/Yarn session. | Restarts Flink history server. | 
| flink-log4j-cli | Change Flink log4j-cli.properties settings. | Restarts Flink history server. | 
| hadoop-env | Change values in the Hadoop environment for all Hadoop components. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts PhoenixQueryserver, HiveServer2, Hive MetaStore, and MapReduce-HistoryServer. | 
| hadoop-log4j | Change values in Hadoop's log4j.properties file. | Restarts the Hadoop HDFS services SecondaryNamenode, Datanode, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| hadoop-ssl-server | Change Hadoop SSL server configuration. | Not available. | 
| hadoop-ssl-client | Change Hadoop SSL client configuration. | Not available. | 
| hbase | Amazon EMR-curated settings for Apache HBase. | Custom EMR specific property. Sets emrfs-site and hbase-site configs. See those for their associated restarts. | 
| hbase-env | Change values in HBase's environment. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-log4j | Change values in HBase's hbase-log4j.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-metrics | Change values in HBase's hadoop-metrics2-hbase.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-policy | Change values in HBase's hbase-policy.xml file. | Not available. | 
| hbase-site | Change values in HBase's hbase-site.xml file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. Additionally restarts Phoenix QueryServer. | 
| hdfs-encryption-zones | Configure HDFS encryption zones. | This classification should not be reconfigured. | 
| hdfs-env | Change values in the HDFS environment. | Restarts Hadoop HDFS services Namenode, Datanode, and ZKFC. | 
| hdfs-site | Change values in HDFS's hdfs-site.xml. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Additionally restarts Hadoop Httpfs. | 
| hcatalog-env | Change values in HCatalog's environment. | Restarts Hive HCatalog Server. | 
| hcatalog-server-jndi | Change values in HCatalog's jndi.properties. | Restarts Hive HCatalog Server. | 
| hcatalog-server-proto-hive-site | Change values in HCatalog's proto-hive-site.xml. | Restarts Hive HCatalog Server. | 
| hcatalog-webhcat-env | Change values in HCatalog WebHCat's environment. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-log4j2 | Change values in HCatalog WebHCat's log4j2.properties. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-site | Change values in HCatalog WebHCat's webhcat-site.xml file. | Restarts Hive WebHCat server. | 
| hive | Amazon EMR-curated settings for Apache Hive. | Sets configurations to launch Hive LLAP service. | 
| hive-beeline-log4j2 | Change values in Hive's beeline-log4j2.properties file. | Not available. | 
| hive-parquet-logging | Change values in Hive's parquet-logging.properties file. | Not available. | 
| hive-env | Change values in the Hive environment. | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. | 
| hive-exec-log4j2 | Change values in Hive's hive-exec-log4j2.properties file. | Not available. | 
| hive-llap-daemon-log4j2 | Change values in Hive's llap-daemon-log4j2.properties file. | Not available. | 
| hive-log4j2 | Change values in Hive's hive-log4j2.properties file. | Not available. | 
| hive-site | Change values in Hive's hive-site.xml file. | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. Also restarts Oozie and Zeppelin. | 
| hiveserver2-site | Change values in Hive Server2's hiveserver2-site.xml file. | Not available. | 
| hue-ini | Change values in Hue's ini file. | Restarts Hue. Also activates Hue config override CLI commands to pick up new configurations. | 
| httpfs-env | Change values in the HTTPFS environment. | Restarts Hadoop Httpfs service. | 
| httpfs-site | Change values in Hadoop's httpfs-site.xml file. | Restarts Hadoop Httpfs service. | 
| hadoop-kms-acls | Change values in Hadoop's kms-acls.xml file. | Not available. | 
| hadoop-kms-env | Change values in the Hadoop KMS environment. | Restarts Hadoop-KMS service. | 
| hadoop-kms-log4j | Change values in Hadoop's kms-log4j.properties file. | Not available. | 
| hadoop-kms-site | Change values in Hadoop's kms-site.xml file. | Restarts Hadoop-KMS and Ranger-KMS service. | 
| hudi-env | Change values in the Hudi environment. | Not available. | 
| jupyter-notebook-conf | Change values in Jupyter Notebook's jupyter_notebook_config.py file. | Not available. | 
| jupyter-hub-conf | Change values in JupyterHub's jupyterhub_config.py file. | Not available. | 
| jupyter-s3-conf | Configure Jupyter Notebook S3 persistence. | Not available. | 
| jupyter-sparkmagic-conf | Change values in Sparkmagic's config.json file. | Not available. | 
| livy-conf | Change values in Livy's livy.conf file. | Restarts Livy Server. | 
| livy-env | Change values in the Livy environment. | Restarts Livy Server. | 
| livy-log4j | Change Livy log4j.properties settings. | Restarts Livy Server. | 
| mapred-env | Change values in the MapReduce application's environment. | Restarts Hadoop MapReduce-HistoryServer. | 
| mapred-site | Change values in the MapReduce application's mapred-site.xml file. | Restarts Hadoop MapReduce-HistoryServer. | 
| oozie-env | Change values in Oozie's environment. | Restarts Oozie. | 
| oozie-log4j | Change values in Oozie's oozie-log4j.properties file. | Restarts Oozie. | 
| oozie-site | Change values in Oozie's oozie-site.xml file. | Restarts Oozie. | 
| phoenix-hbase-metrics | Change values in Phoenix's hadoop-metrics2-hbase.properties file. | Not available. | 
| phoenix-hbase-site | Change values in Phoenix's hbase-site.xml file. | Not available. | 
| phoenix-log4j | Change values in Phoenix's log4j.properties file. | Restarts Phoenix-QueryServer. | 
| phoenix-metrics | Change values in Phoenix's hadoop-metrics2-phoenix.properties file. | Not available. | 
| pig-env | Change values in the Pig environment. | Not available. | 
| pig-properties | Change values in Pig's pig.properties file. | Restarts Oozie. | 
| pig-log4j | Change values in Pig's log4j.properties file. | Not available. | 
| presto-log | Change values in Presto's log.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-config | Change values in Presto's config.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-password-authenticator | Change values in Presto's password-authenticator.properties file. | Not available. | 
| presto-env | Change values in Presto's presto-env.sh file. | Restarts Presto-Server (for PrestoDB) | 
| presto-node | Change values in Presto's node.properties file. | Not available. | 
| presto-connector-blackhole | Change values in Presto's blackhole.properties file. | Not available. | 
| presto-connector-cassandra | Change values in Presto's cassandra.properties file. | Not available. | 
| presto-connector-hive | Change values in Presto's hive.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-jmx | Change values in Presto's jmx.properties file. | Not available. | 
| presto-connector-kafka | Change values in Presto's kafka.properties file. | Not available. | 
| presto-connector-localfile | Change values in Presto's localfile.properties file. | Not available. | 
| presto-connector-memory | Change values in Presto's memory.properties file. | Not available. | 
| presto-connector-mongodb | Change values in Presto's mongodb.properties file. | Not available. | 
| presto-connector-mysql | Change values in Presto's mysql.properties file. | Not available. | 
| presto-connector-postgresql | Change values in Presto's postgresql.properties file. | Not available. | 
| presto-connector-raptor | Change values in Presto's raptor.properties file. | Not available. | 
| presto-connector-redis | Change values in Presto's redis.properties file. | Not available. | 
| presto-connector-redshift | Change values in Presto's redshift.properties file. | Not available. | 
| presto-connector-tpch | Change values in Presto's tpch.properties file. | Not available. | 
| presto-connector-tpcds | Change values in Presto's tpcds.properties file. | Not available. | 
| prestosql-log | Change values in Presto's log.properties file. | Restarts Presto-Server (for PrestoSQL) | 
| prestosql-config | Change values in Presto's config.properties file. | Restarts Presto-Server (for PrestoSQL) | 
| prestosql-password-authenticator | Change values in Presto's password-authenticator.properties file. | Restarts Presto-Server (for PrestoSQL) | 
| prestosql-env | Change values in Presto's presto-env.sh file. | Restarts Presto-Server (for PrestoSQL) | 
| prestosql-node | Change values in PrestoSQL's node.properties file. | Not available. | 
| prestosql-connector-blackhole | Change values in PrestoSQL's blackhole.properties file. | Not available. | 
| prestosql-connector-cassandra | Change values in PrestoSQL's cassandra.properties file. | Not available. | 
| prestosql-connector-hive | Change values in PrestoSQL's hive.properties file. | Restarts Presto-Server (for PrestoSQL) | 
| prestosql-connector-jmx | Change values in PrestoSQL's jmx.properties file. | Not available. | 
| prestosql-connector-kafka | Change values in PrestoSQL's kafka.properties file. | Not available. | 
| prestosql-connector-localfile | Change values in PrestoSQL's localfile.properties file. | Not available. | 
| prestosql-connector-memory | Change values in PrestoSQL's memory.properties file. | Not available. | 
| prestosql-connector-mongodb | Change values in PrestoSQL's mongodb.properties file. | Not available. | 
| prestosql-connector-mysql | Change values in PrestoSQL's mysql.properties file. | Not available. | 
| prestosql-connector-postgresql | Change values in PrestoSQL's postgresql.properties file. | Not available. | 
| prestosql-connector-raptor | Change values in PrestoSQL's raptor.properties file. | Not available. | 
| prestosql-connector-redis | Change values in PrestoSQL's redis.properties file. | Not available. | 
| prestosql-connector-redshift | Change values in PrestoSQL's redshift.properties file. | Not available. | 
| prestosql-connector-tpch | Change values in PrestoSQL's tpch.properties file. | Not available. | 
| prestosql-connector-tpcds | Change values in PrestoSQL's tpcds.properties file. | Not available. | 
| ranger-kms-dbks-site | Change values in dbks-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-site | Change values in ranger-kms-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-env | Change values in the Ranger KMS environment. | Restarts Ranger KMS Server. | 
| ranger-kms-log4j | Change values in kms-log4j.properties file of Ranger KMS. | Not available. | 
| ranger-kms-db-ca | Change values for CA file on S3 for MySQL SSL connection with Ranger KMS. | Not available. | 
| spark | Amazon EMR-curated settings for Apache Spark. | This property modifies spark-defaults. See actions there. | 
| spark-defaults | Change values in Spark's spark-defaults.conf file. | Restarts Spark history server and Spark thrift server. | 
| spark-env | Change values in the Spark environment. | Restarts Spark history server and Spark thrift server. | 
| spark-hive-site | Change values in Spark's hive-site.xml file. | Not available. | 
| spark-log4j | Change values in Spark's log4j.properties file. | Restarts Spark history server and Spark thrift server. | 
| spark-metrics | Change values in Spark's metrics.properties file. | Restarts Spark history server and Spark thrift server. | 
| sqoop-env | Change values in Sqoop's environment. | Not available. | 
| sqoop-oraoop-site | Change values in Sqoop OraOop's oraoop-site.xml file. | Not available. | 
| sqoop-site | Change values in Sqoop's sqoop-site.xml file. | Not available. | 
| tez-site | Change values in Tez's tez-site.xml file. | Restarts Oozie and HiveServer2. | 
| yarn-env | Change values in the YARN environment. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts MapReduce-HistoryServer. | 
| yarn-site | Change values in YARN's yarn-site.xml file. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Livy Server and MapReduce-HistoryServer. | 
| zeppelin-env | Change values in the Zeppelin environment. | Restarts Zeppelin. | 
| zeppelin-site | Change configuration settings in zeppelin-site.xml. | Restarts Zeppelin. | 
| zookeeper-config | Change values in ZooKeeper's zoo.cfg file. | Restarts Zookeeper server. | 
| zookeeper-log4j | Change values in ZooKeeper's log4j.properties file. | Restarts Zookeeper server. | 
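
Each classification in the table is supplied to Amazon EMR as a JSON object with a `Classification` name and a `Properties` map. The following is a minimal sketch of building such a list in Python. The helper name `classification` and the property values are illustrative assumptions, not recommendations.

```python
import json

def classification(name, properties):
    """Build one configuration object in the JSON shape Amazon EMR accepts."""
    return {"Classification": name, "Properties": dict(properties)}

# Property values here are illustrative placeholders, not recommendations.
configurations = [
    classification("spark-defaults", {"spark.executor.memory": "4g"}),
    classification("yarn-site", {"yarn.log-aggregation.retain-seconds": "86400"}),
]

config_json = json.dumps(configurations, indent=2)
print(config_json)
```

You could save the printed JSON to a file and pass it with the `--configurations` option of `aws emr create-cluster`. When you reconfigure a running cluster, expect the restart actions that the table lists for each classification you change.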

# Amazon EMR release 6.3.0
<a name="emr-630-release"></a>

## 6.3.0 application versions
<a name="emr-630-app-versions"></a>

This release includes the following applications: [Flink](https://flink.apache.org/), [Ganglia](http://ganglia.info), [HBase](http://hbase.apache.org/), [HCatalog](https://cwiki.apache.org/confluence/display/Hive/HCatalog), [Hadoop](http://hadoop.apache.org/docs/current/), [Hive](http://hive.apache.org/), [Hudi](https://hudi.apache.org), [Hue](http://gethue.com/), [JupyterEnterpriseGateway](https://jupyter-enterprise-gateway.readthedocs.io/en/latest/), [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/#), [Livy](https://livy.incubator.apache.org/), [MXNet](https://mxnet.incubator.apache.org/), [Oozie](http://oozie.apache.org/), [Phoenix](https://phoenix.apache.org/), [Pig](http://pig.apache.org/), [Presto](https://prestodb.io/), [PrestoSQL](https://prestosql.io/), [Spark](https://spark.apache.org/docs/latest/), [Sqoop](http://sqoop.apache.org/), [TensorFlow](https://www.tensorflow.org/), [Tez](https://tez.apache.org/), [Zeppelin](https://zeppelin.incubator.apache.org/), and [ZooKeeper](https://zookeeper.apache.org).

The table below lists the application versions available in this release of Amazon EMR and the application versions in the preceding three Amazon EMR releases (when applicable).

For a comprehensive history of application versions for each release of Amazon EMR, see the following topics:
+ [Application versions in Amazon EMR 7.x releases](emr-release-app-versions-7.x.md)
+ [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md)
+ [Application versions in Amazon EMR 5.x releases](emr-release-app-versions-5.x.md)
+ [Application versions in Amazon EMR 4.x releases](emr-release-app-versions-4.x.md)


**Application version information**  

|  | emr-6.3.0 | emr-6.2.1 | emr-6.2.0 | emr-6.1.1 | 
| --- | --- | --- | --- | --- | 
| AWS SDK for Java | 1.11.977 | 1.11.880 | 1.11.880 | 1.11.828 | 
| Python | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 
| Scala | 2.12.10 | 2.12.10 | 2.12.10 | 2.12.10 | 
| AmazonCloudWatchAgent |  -  |  -  |  -  |  -  | 
| Delta |  -  |  -  |  -  |  -  | 
| Flink | 1.12.1 | 1.11.2 | 1.11.2 | 1.11.0 | 
| Ganglia | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 
| HBase | 2.2.6-amzn-1 | 2.2.6-amzn-0 | 2.2.6-amzn-0 | 2.2.5 | 
| HCatalog | 3.1.2-amzn-4 | 3.1.2-amzn-3 | 3.1.2-amzn-3 | 3.1.2-amzn-2 | 
| Hadoop | 3.2.1-amzn-3 | 3.2.1-amzn-2.1 | 3.2.1-amzn-2 | 3.2.1-amzn-1.1 | 
| Hive | 3.1.2-amzn-4 | 3.1.2-amzn-3 | 3.1.2-amzn-3 | 3.1.2-amzn-2 | 
| Hudi | 0.7.0-amzn-0 | 0.6.0-amzn-1 | 0.6.0-amzn-1 | 0.5.2-incubating-amzn-2 | 
| Hue | 4.9.0 | 4.8.0 | 4.8.0 | 4.7.1 | 
| Iceberg |  -  |  -  |  -  |  -  | 
| JupyterEnterpriseGateway | 2.1.0 | 2.1.0 | 2.1.0 |  -  | 
| JupyterHub | 1.2.2 | 1.1.0 | 1.1.0 | 1.1.0 | 
| Livy | 0.7.0-incubating | 0.7.0-incubating | 0.7.0-incubating | 0.7.0-incubating | 
| MXNet | 1.7.0 | 1.7.0 | 1.7.0 | 1.6.0 | 
| Mahout |  -  |  -  |  -  |  -  | 
| Oozie | 5.2.1 | 5.2.0 | 5.2.0 | 5.2.0 | 
| Phoenix | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 
| Pig | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 
| Presto | 0.245.1-amzn-0 | 0.238.3-amzn-1 | 0.238.3-amzn-1 | 0.232 | 
| Spark | 3.1.1-amzn-0 | 3.0.1-amzn-0.1 | 3.0.1-amzn-0 | 3.0.0-amzn-0.1 | 
| Sqoop | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 
| TensorFlow | 2.4.1 | 2.3.1 | 2.3.1 | 2.1.0 | 
| Tez | 0.9.2 | 0.9.2 | 0.9.2 | 0.9.2 | 
| Trino (PrestoSQL) | 350 | 343 | 343 | 338 | 
| Zeppelin | 0.9.0 | 0.9.0-preview1 | 0.9.0-preview1 | 0.9.0-preview1 | 
| ZooKeeper | 3.4.14 | 3.4.14 | 3.4.14 | 3.4.14 | 

## 6.3.0 release notes
<a name="emr-630-relnotes"></a>

The following release notes include information for Amazon EMR release 6.3.0. Changes are relative to 6.2.0.

Initial release date: May 12, 2021

Last updated date: August 9, 2021

**Supported applications**
+ AWS SDK for Java version 1.11.977
+ CloudWatch Sink version 2.1.0
+ DynamoDB Connector version 4.16.0
+ EMRFS version 2.46.0
+ Amazon EMR Goodies version 3.2.0
+ Amazon EMR Kinesis Connector version 3.5.0
+ Amazon EMR Record Server version 2.0.0
+ Amazon EMR Scripts version 2.5.0
+ Flink version 1.12.1
+ Ganglia version 3.7.2
+ AWS Glue Hive Metastore Client version 3.2.0
+ Hadoop version 3.2.1-amzn-3
+ HBase version 2.2.6-amzn-1
+ HBase-operator-tools 1.0.0
+ HCatalog version 3.1.2-amzn-0
+ Hive version 3.1.2-amzn-4
+ Hudi version 0.7.0-amzn-0
+ Hue version 4.9.0
+ Java JDK version Corretto-8.282.08.1 (build 1.8.0_282-b08)
+ JupyterHub version 1.2.0
+ Livy version 0.7.0-incubating
+ MXNet version 1.7.0
+ Oozie version 5.2.1
+ Phoenix version 5.0.0
+ Pig version 0.17.0
+ Presto version 0.245.1-amzn-0
+ PrestoSQL version 350
+ Apache Ranger KMS (multi-master transparent encryption) version 2.0.0
+ ranger-plugins 2.0.1-amzn-0
+ ranger-s3-plugin 1.1.0
+ SageMaker Spark SDK version 1.4.1
+ Scala version 2.12.10 (OpenJDK 64-Bit Server VM, Java 1.8.0_282)
+ Spark version 3.1.1-amzn-0
+ spark-rapids 0.4.1
+ Sqoop version 1.4.7
+ TensorFlow version 2.4.1
+ Tez version 0.9.2
+ Zeppelin version 0.9.0
+ ZooKeeper version 3.4.14
+ Connectors and drivers: DynamoDB Connector 4.16.0

**New features**
+ Amazon EMR supports Amazon S3 Access Points, a feature of Amazon S3 that allows you to easily manage access for shared data lakes. Using your Amazon S3 Access Point alias, you can simplify your data access at scale on Amazon EMR. You can use Amazon S3 Access Points with all versions of Amazon EMR at no additional cost in all AWS regions where Amazon EMR is available. To learn more about Amazon S3 Access Points and Access Point aliases, see [Using a bucket-style alias for your access point](https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-points-alias.html) in the *Amazon S3 User Guide*.
+ New `DescribeReleaseLabel` and `ListReleaseLabels` API operations provide Amazon EMR release label details. You can programmatically list releases available in the Region where the API request is run, and list the available applications for a specific Amazon EMR release label. These operations can also list Amazon EMR releases that support a specified application, such as Spark. This information can be used to programmatically launch Amazon EMR clusters. For example, you can launch a cluster using the latest release version from the `ListReleaseLabels` results. For more information, see [DescribeReleaseLabel](https://docs.aws.amazon.com/emr/latest/APIReference/API_DescribeReleaseLabel.html) and [ListReleaseLabels](https://docs.aws.amazon.com/emr/latest/APIReference/API_ListReleaseLabels.html) in the *Amazon EMR API Reference*.
+ With Amazon EMR 6.3.0, you can launch a cluster that natively integrates with Apache Ranger. Apache Ranger is an open-source framework to enable, monitor, and manage comprehensive data security across the Hadoop platform. For more information, see [Apache Ranger](https://ranger.apache.org/). With native integration, you can bring your own Apache Ranger to enforce fine-grained data access control on Amazon EMR. See [Integrate Amazon EMR with Apache Ranger](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-ranger.html) in the Amazon EMR Management Guide.
+ Scoped managed policies: To align with AWS best practices, Amazon EMR has introduced v2 EMR-scoped default managed policies as replacements for policies that will be deprecated. See [Amazon EMR Managed Policies](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-managed-iam-policies.html).
+ Instance Metadata Service (IMDS) V2 support status: For Amazon EMR 6.2 or later, Amazon EMR components use IMDSv2 for all IMDS calls. For IMDS calls in your application code, you can use both IMDSv1 and IMDSv2, or configure the IMDS to use only IMDSv2 for added security. If you disable IMDSv1 in earlier Amazon EMR 6.x releases, it causes cluster startup failure.
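
As an illustration of launching with the latest release from a list of release labels, the following sketch picks the highest label by comparing version numbers instead of text. The function names and the sample list are assumptions for this example. A real list would come from the release label API.

```python
import re

def release_key(label):
    """Turn 'emr-6.3.0' into (6, 3, 0) so labels sort numerically, not as text."""
    match = re.fullmatch(r"emr-(\d+)\.(\d+)\.(\d+)", label)
    if match is None:
        raise ValueError(f"unexpected release label: {label!r}")
    return tuple(int(part) for part in match.groups())

def latest_release(labels, major=None):
    """Return the highest release label, optionally restricted to one major version."""
    candidates = [l for l in labels if major is None or release_key(l)[0] == major]
    if not candidates:
        raise ValueError("no matching release labels")
    return max(candidates, key=release_key)

# Hypothetical list contents; a real list would come from the API response.
sample_labels = ["emr-6.2.1", "emr-6.10.0", "emr-6.3.0", "emr-5.33.0", "emr-6.9.1"]
print(latest_release(sample_labels, major=6))  # prints emr-6.10.0
```

A plain string sort would rank `emr-6.9.1` above `emr-6.10.0`, which is why the sketch compares integer tuples. Note that new releases reach Regions over several days, so the latest label can differ between Regions.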

**Changes, enhancements, and resolved issues**
+ This is a release to fix issues with Amazon EMR Scaling when it fails to scale up/scale down a cluster successfully or causes application failures.
+ Fixed an issue where scaling requests failed for a large, highly utilized cluster when Amazon EMR on-cluster daemons were running health checking activities, such as gathering YARN node state and HDFS node state. This was happening because on-cluster daemons were not able to communicate the health status data of a node to internal Amazon EMR components.
+ Improved EMR on-cluster daemons to correctly track the node states when IP addresses are reused to improve reliability during scaling operations.
+ [SPARK-29683](https://issues.apache.org/jira/browse/SPARK-29683). Fixed an issue where job failures occurred during cluster scale-down as Spark was assuming all available nodes were deny-listed.
+ [YARN-9011](https://issues.apache.org/jira/browse/YARN-9011). Fixed an issue where job failures occurred due to a race condition in YARN decommissioning when the cluster tried to scale up or down.
+ Fixed an issue with step or job failures during cluster scaling by ensuring that the node states are always consistent between the Amazon EMR on-cluster daemons and YARN/HDFS.
+ Fixed an issue where cluster operations such as scale down and step submission failed for Amazon EMR clusters enabled with Kerberos authentication. This was because the Amazon EMR on-cluster daemon did not renew the Kerberos ticket, which is required to securely communicate with HDFS/YARN running on the primary node.
+ Newer Amazon EMR releases fix the issue with a lower "Max open files" limit on older AL2 in Amazon EMR. Amazon EMR releases 5.30.1, 5.30.2, 5.31.1, 5.32.1, 6.0.1, 6.1.1, 6.2.1, 5.33.0, 6.3.0 and later now include a permanent fix with a higher "Max open files" setting.
+ Spark SQL UI explain mode default changed from `extended` to `formatted` in [Spark 3.1](https://issues.apache.org/jira/browse/SPARK-31325). Amazon EMR reverted it back to `extended` to include logical plan information in the Spark SQL UI. This can be reverted by setting `spark.sql.ui.explainMode` to `formatted`.
+ The following commits were backported from the Spark master branch.

  - [[SPARK-34752]](https://issues.apache.org/jira/browse/SPARK-34752)[BUILD] Bump Jetty to 9.4.37 to address CVE-2020-27223.

  - [[SPARK-34534]](https://issues.apache.org/jira/browse/SPARK-34534) Fix blockIds order when use FetchShuffleBlocks to fetch blocks.

  - [[SPARK-34681]](https://issues.apache.org/jira/browse/SPARK-34681) [SQL] Fix bug for full outer shuffled hash join when building left side with non-equal condition.

  - [[SPARK-34497]](https://issues.apache.org/jira/browse/SPARK-34497) [SQL] Fix built-in JDBC connection providers to restore JVM security context changes.
+ To improve interoperability with the Nvidia Spark RAPIDS plugin, added a workaround for an issue that prevented dynamic partition pruning from triggering when using Nvidia Spark RAPIDS with adaptive query execution disabled. See [RAPIDS issue #1378](https://github.com/NVIDIA/spark-rapids/issues/1378) and [RAPIDS issue #1386](https://github.com/NVIDIA/spark-rapids/issues/1386). For details of the new configuration `spark.sql.optimizer.dynamicPartitionPruning.enforceBroadcastReuse`, see [Amazon EMR optimizing Spark performance - dynamic partition pruning](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-spark-performance.html#emr-spark-performance-dynamic).
+ The file output committer default algorithm has been changed from the v2 algorithm to the v1 algorithm in open source Spark 3.1. For more information, see [SPARK-33019](https://issues.apache.org/jira/browse/SPARK-33019).
+ Amazon EMR reverted to the v2 algorithm, the default used in prior Amazon EMR 6.x releases, to prevent performance regression. To restore the open source Spark 3.1 behavior, set `spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version` to `1`. Open source Spark made this change because task commit in file output committer algorithm v2 is not atomic, which can cause an output data correctness issue in some cases. However, task commit in algorithm v1 is also not atomic. In some scenarios task commit includes a delete performed before a rename. This can result in a silent data correctness issue.
+ Fixed Managed Scaling issues in earlier Amazon EMR releases and made improvements so application failure rates are significantly reduced.
+ Installed the AWS Java SDK Bundle on each new cluster. This is a single jar containing all service SDKs and their dependencies, instead of individual component jars. For more information, see [Java SDK Bundled Dependency](https://aws.amazon.com/blogs/developer/java-sdk-bundle/).
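
Two of the changes above name Spark settings that you can set to return to the open source Spark 3.1 defaults. The following sketch builds a `spark-defaults` classification that carries both settings. The variable and function names are assumptions for this example. Whether you want either override depends on your workload.

```python
import json

# Settings named in the release notes above, with the values that restore
# the open source Spark 3.1 defaults instead of the Amazon EMR defaults.
OPEN_SOURCE_SPARK_31_DEFAULTS = {
    "spark.sql.ui.explainMode": "formatted",
    "spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version": "1",
}

def spark_defaults(properties):
    """Wrap Spark properties in a single spark-defaults configuration object."""
    return [{"Classification": "spark-defaults", "Properties": dict(properties)}]

spark_config_json = json.dumps(spark_defaults(OPEN_SOURCE_SPARK_31_DEFAULTS), indent=2)
print(spark_config_json)
```

Leaving both properties unset keeps the Amazon EMR 6.3.0 behavior: `extended` explain mode and the v2 file output committer algorithm.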

**Known issues**
+ For Amazon EMR 6.3.0 and 6.2.0 private subnet clusters, you cannot access the Ganglia web UI. You will get an "access denied (403)" error. Other web UIs, such as Spark, Hue, JupyterHub, Zeppelin, Livy, and Tez, work normally. Ganglia web UI access on public subnet clusters also works normally. To resolve this issue, restart the httpd service on the primary node with `sudo systemctl restart httpd`. This issue is fixed in Amazon EMR 6.4.0.
+ When AWS Glue Data Catalog is enabled, using Spark to access an AWS Glue database with a null string location URI may fail. This also happens in earlier Amazon EMR releases, but [SPARK-31709](https://issues.apache.org/jira/browse/SPARK-31709) makes it apply to more cases. For example, when creating a table within the default AWS Glue database whose location URI is a null string, `spark.sql("CREATE TABLE mytest (key string) location '/table_path';")` fails with the message, "Cannot create a Path from an empty string." To work around this, manually set a location URI for your AWS Glue databases, then create tables within these databases using Spark.
+ In Amazon EMR 6.3.0, PrestoSQL has upgraded from version 343 to version 350. There are two security related changes from the open source that relate to this version change. File-based catalog access control is changed from `deny` to `allow` when table, schema, or session property rules are not defined. Also, file-based system access control is changed to support files without catalog rules defined. In this case, all access to catalogs is allowed.

  For more information, see [Release 344 (9 Oct 2020)](https://trino.io/docs/current/release/release-344.html#security).
+ Note that the Hadoop user directory (/home/hadoop) is readable by everyone. It has Unix 755 (drwxr-xr-x) directory permissions to allow read access by frameworks like Hive. You can put files in /home/hadoop and its subdirectories, but be aware of the permissions on those directories to protect sensitive information.
+ **Lower "Max open files" limit on older AL2 [fixed in newer releases].** Amazon EMR releases emr-5.30.x, emr-5.31.0, emr-5.32.0, emr-6.0.0, emr-6.1.0, and emr-6.2.0 are based on older versions of Amazon Linux 2 (AL2), which have a lower ulimit setting for "Max open files" when Amazon EMR clusters are created with the default AMI. Amazon EMR releases 5.30.1, 5.30.2, 5.31.1, 5.32.1, 6.0.1, 6.1.1, 6.2.1, 5.33.0, 6.3.0 and later include a permanent fix with a higher "Max open files" setting. Releases with the lower open file limit cause a "Too many open files" error when submitting a Spark job. In the impacted releases, the Amazon EMR default AMI has a default ulimit setting of 4096 for "Max open files," which is lower than the 65536 file limit in the latest Amazon Linux 2 AMI. The lower ulimit setting for "Max open files" causes Spark job failure when the Spark driver and executor try to open more than 4096 files. To fix the issue, Amazon EMR has a bootstrap action (BA) script that adjusts the ulimit setting at cluster creation.

  If you are using an older Amazon EMR version that doesn't have the permanent fix for this issue, the following workaround lets you explicitly set the instance-controller ulimit to a maximum of 65536 files.

**Explicitly set a ulimit from the command line**

  1. Edit `/etc/systemd/system/instance-controller.service` to add the following parameters to the Service section.

     `LimitNOFILE=65536`

     `LimitNPROC=65536`

  1. Restart InstanceController

     `$ sudo systemctl daemon-reload`

     `$ sudo systemctl restart instance-controller`

  **Set a ulimit using bootstrap action (BA)**

  You can also use a bootstrap action (BA) script to configure the instance-controller ulimit to 65536 files at cluster creation.

  ```
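  #!/bin/bash
  # Raise the per-user open-file and process limits for the service users.
  for user in hadoop spark hive; do
  sudo tee /etc/security/limits.d/$user.conf << EOF
  $user - nofile 65536
  $user - nproc 65536
  EOF
  done
  # Add a systemd override with the higher limits for each Amazon EMR daemon.
  for proc in instancecontroller logpusher; do
  sudo mkdir -p /etc/systemd/system/$proc.service.d/
  sudo tee /etc/systemd/system/$proc.service.d/override.conf << EOF
  [Service]
  LimitNOFILE=65536
  LimitNPROC=65536
  EOF
  # Also raise the limits of the daemon process that is already running.
  pid=$(pgrep -f aws157.$proc.Main)
  sudo prlimit --pid $pid --nofile=65535:65535 --nproc=65535:65535
  done
  # Reload systemd so the override files take effect on the next service start.
  sudo systemctl daemon-reload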
  ```
+ **Important**  
EMR clusters that run Amazon Linux or Amazon Linux 2 Amazon Machine Images (AMIs) use default Amazon Linux behavior, and do not automatically download and install important and critical kernel updates that require a reboot. This is the same behavior as other Amazon EC2 instances that run the default Amazon Linux AMI. If new Amazon Linux software updates that require a reboot (such as kernel, NVIDIA, and CUDA updates) become available after an Amazon EMR release becomes available, EMR cluster instances that run the default AMI do not automatically download and install those updates. To get kernel updates, you can [customize your Amazon EMR AMI](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-custom-ami.html) to [use the latest Amazon Linux AMI](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/finding-an-ami.html).
+ To use Spark actions with Apache Oozie, you must add the following configuration to your Oozie `workflow.xml` file. Otherwise, several critical libraries such as Hadoop and EMRFS will be missing from the classpath of the Spark executors that Oozie launches.

  ```
  <spark-opts>--conf spark.yarn.populateHadoopClasspath=true</spark-opts>
  ```
+ When you use Spark with Hive partition location formatting to read data in Amazon S3, and you run Spark on Amazon EMR releases 5.30.0 to 5.36.0, and 6.2.0 to 6.9.0, you might encounter an issue that prevents your cluster from reading data correctly. This can happen if your partitions have all of the following characteristics:
  + Two or more partitions are scanned from the same table.
  + At least one partition directory path is a prefix of at least one other partition directory path, for example, `s3://bucket/table/p=a` is a prefix of `s3://bucket/table/p=a b`.
  + The first character that follows the prefix in the other partition directory has a UTF-8 value that is less than the `/` character (U+002F). For example, the space character (U+0020) that occurs between a and b in `s3://bucket/table/p=a b` falls into this category. Note that there are 14 other non-control characters: `!"#$%&'()*+,-`. For more information, see [UTF-8 encoding table and Unicode characters](https://www.utf8-chartable.de/).

  As a workaround to this issue, set the `spark.sql.sources.fastS3PartitionDiscovery.enabled` configuration to `false` in the `spark-defaults` classification.
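
To check whether a table's partition paths match the conditions in the last known issue, you could use a sketch like the following. The function name and sample paths are assumptions for this example, and the check assumes that all paths belong to the same table and are scanned together.

```python
from itertools import permutations

def affected_pairs(partition_paths):
    """Find (prefix, other) pairs matching the conditions in the known issue.

    A pair matches when `prefix` is a proper prefix of `other` and the
    character right after the prefix sorts below '/' (U+002F).
    """
    pairs = []
    for prefix, other in permutations(partition_paths, 2):
        if other.startswith(prefix) and len(other) > len(prefix):
            if other[len(prefix)] < "/":
                pairs.append((prefix, other))
    return pairs

paths = [
    "s3://bucket/table/p=a",
    "s3://bucket/table/p=a b",   # space (U+0020) follows the prefix
    "s3://bucket/table/p=ab",    # 'b' sorts above '/', so not affected
]
print(affected_pairs(paths))
```

An empty result suggests that the paths you checked do not meet the prefix condition. If any pair is returned, consider applying the `spark-defaults` workaround described above.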

## 6.3.0 component versions
<a name="emr-630-components"></a>

The components that Amazon EMR installs with this release are listed below. Some are installed as part of big-data application packages. Others are unique to Amazon EMR and installed for system processes and features. These typically start with `emr` or `aws`. Big-data application packages in the most recent Amazon EMR release are usually the latest version found in the community. We make community releases available in Amazon EMR as quickly as possible.

Some components in Amazon EMR differ from community versions. These components have a version label in the form `CommunityVersion-amzn-EmrVersion`. The `EmrVersion` starts at 0. For example, if an open source community component named `myapp-component` with version 2.2 has been modified three times for inclusion in different Amazon EMR releases, its release version is listed as `2.2-amzn-2`.
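
If you process these version labels in scripts, the following sketch shows one way to split a label into its community version and Amazon EMR revision. The function name is an assumption for this example. The revision is kept as a string because some labels in this guide, such as `3.2.1-amzn-2.1`, carry a dotted revision.

```python
def parse_component_version(version):
    """Split a component version into (community_version, emr_revision).

    emr_revision is None when the version carries no '-amzn-' suffix,
    meaning the component matches the community release.
    """
    community, marker, revision = version.rpartition("-amzn-")
    if not marker:
        return version, None
    return community, revision

for label in ["2.2-amzn-2", "3.2.1-amzn-3", "0.9.2"]:
    print(label, "->", parse_component_version(label))
```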


| Component | Version | Description | 
| --- | --- | --- | 
| aws-sagemaker-spark-sdk | 1.4.1 | Amazon SageMaker Spark SDK | 
| emr-ddb | 4.16.0 | Amazon DynamoDB connector for Hadoop ecosystem applications. | 
| emr-goodies | 3.2.0 | Extra convenience libraries for the Hadoop ecosystem. | 
| emr-kinesis | 3.5.0 | Amazon Kinesis connector for Hadoop ecosystem applications. | 
| emr-notebook-env | 1.2.0 | Conda env for emr notebook which includes jupyter enterprise gateway | 
| emr-s3-dist-cp | 2.18.0 | Distributed copy application optimized for Amazon S3. | 
| emr-s3-select | 2.1.0 | EMR S3Select Connector | 
| emrfs | 2.46.0 | Amazon S3 connector for Hadoop ecosystem applications. | 
| flink-client | 1.12.1 | Apache Flink command line client scripts and applications. | 
| flink-jobmanager-config | 1.12.1 | Managing resources on EMR nodes for Apache Flink JobManager. | 
| ganglia-monitor | 3.7.2 | Embedded Ganglia agent for Hadoop ecosystem applications along with the Ganglia monitoring agent. | 
| ganglia-metadata-collector | 3.7.2 | Ganglia metadata collector for aggregating metrics from Ganglia monitoring agents. | 
| ganglia-web | 3.7.1 | Web application for viewing metrics collected by the Ganglia metadata collector. | 
| hadoop-client | 3.2.1-amzn-3 | Hadoop command-line clients such as 'hdfs', 'hadoop', or 'yarn'. | 
| hadoop-hdfs-datanode | 3.2.1-amzn-3 | HDFS node-level service for storing blocks. | 
| hadoop-hdfs-library | 3.2.1-amzn-3 | HDFS command-line client and library | 
| hadoop-hdfs-namenode | 3.2.1-amzn-3 | HDFS service for tracking file names and block locations. | 
| hadoop-hdfs-journalnode | 3.2.1-amzn-3 | HDFS service for managing the Hadoop filesystem journal on HA clusters. | 
| hadoop-httpfs-server | 3.2.1-amzn-3 | HTTP endpoint for HDFS operations. | 
| hadoop-kms-server | 3.2.1-amzn-3 | Cryptographic key management server based on Hadoop's KeyProvider API. | 
| hadoop-mapred | 3.2.1-amzn-3 | MapReduce execution engine libraries for running a MapReduce application. | 
| hadoop-yarn-nodemanager | 3.2.1-amzn-3 | YARN service for managing containers on an individual node. | 
| hadoop-yarn-resourcemanager | 3.2.1-amzn-3 | YARN service for allocating and managing cluster resources and distributed applications. | 
| hadoop-yarn-timeline-server | 3.2.1-amzn-3 | Service for retrieving current and historical information for YARN applications. | 
| hbase-hmaster | 2.2.6-amzn-1 | Service for an HBase cluster responsible for coordination of Regions and execution of administrative commands. | 
| hbase-region-server | 2.2.6-amzn-1 | Service for serving one or more HBase regions. | 
| hbase-client | 2.2.6-amzn-1 | HBase command-line client. | 
| hbase-rest-server | 2.2.6-amzn-1 | Service providing a RESTful HTTP endpoint for HBase. | 
| hbase-thrift-server | 2.2.6-amzn-1 | Service providing a Thrift endpoint to HBase. | 
| hcatalog-client | 3.1.2-amzn-4 | The 'hcat' command line client for manipulating hcatalog-server. | 
| hcatalog-server | 3.1.2-amzn-4 | Service providing HCatalog, a table and storage management layer for distributed applications. | 
| hcatalog-webhcat-server | 3.1.2-amzn-4 | HTTP endpoint providing a REST interface to HCatalog. | 
| hive-client | 3.1.2-amzn-4 | Hive command line client. | 
| hive-hbase | 3.1.2-amzn-4 | Hive-hbase client. | 
| hive-metastore-server | 3.1.2-amzn-4 | Service for accessing the Hive metastore, a semantic repository storing metadata for SQL on Hadoop operations. | 
| hive-server2 | 3.1.2-amzn-4 | Service for accepting Hive queries as web requests. | 
| hudi | 0.7.0-amzn-0 | Incremental processing framework to power data pipeline at low latency and high efficiency. | 
| hudi-presto | 0.7.0-amzn-0 | Bundle library for running Presto with Hudi. | 
| hudi-prestosql | 0.7.0-amzn-0 | Bundle library for running PrestoSQL with Hudi. | 
| hudi-spark | 0.7.0-amzn-0 | Bundle library for running Spark with Hudi. | 
| hue-server | 4.9.0 | Web application for analyzing data using Hadoop ecosystem applications | 
| jupyterhub | 1.2.2 | Multi-user server for Jupyter notebooks | 
| livy-server | 0.7.0-incubating | REST interface for interacting with Apache Spark | 
| nginx | 1.12.1 | nginx [engine x] is an HTTP and reverse proxy server | 
| mxnet | 1.7.0 | A flexible, scalable, and efficient library for deep learning. | 
| mariadb-server | 5.5.68+ | MariaDB database server. | 
| nvidia-cuda | 10.1.243 | Nvidia drivers and Cuda toolkit | 
| oozie-client | 5.2.1 | Oozie command-line client. | 
| oozie-server | 5.2.1 | Service for accepting Oozie workflow requests. | 
| opencv | 4.5.0 | Open Source Computer Vision Library. | 
| phoenix-library | 5.0.0-HBase-2.0 | The phoenix libraries for server and client | 
| phoenix-query-server | 5.0.0-HBase-2.0 | A lightweight server providing JDBC access as well as Protocol Buffers and JSON format access to the Avatica API. | 
| presto-coordinator | 0.245.1-amzn-0 | Service for accepting queries and managing query execution among presto-workers. | 
| presto-worker | 0.245.1-amzn-0 | Service for executing pieces of a query. | 
| presto-client | 0.245.1-amzn-0 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| prestosql-coordinator | 350 | Service for accepting queries and managing query execution among prestosql-workers. | 
| prestosql-worker | 350 | Service for executing pieces of a query. | 
| prestosql-client | 350 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| pig-client | 0.17.0 | Pig command-line client. | 
| r | 4.0.2 | The R Project for Statistical Computing | 
| ranger-kms-server | 2.0.0 | Apache Ranger Key Management System | 
| spark-client | 3.1.1-amzn-0 | Spark command-line clients. | 
| spark-history-server | 3.1.1-amzn-0 | Web UI for viewing logged events for the lifetime of a completed Spark application. | 
| spark-on-yarn | 3.1.1-amzn-0 | In-memory execution engine for YARN. | 
| spark-yarn-slave | 3.1.1-amzn-0 | Apache Spark libraries needed by YARN slaves. | 
| spark-rapids | 0.4.1 | Nvidia Spark RAPIDS plugin that accelerates Apache Spark with GPUs. | 
| sqoop-client | 1.4.7 | Apache Sqoop command-line client. | 
| tensorflow | 2.4.1 | TensorFlow open source software library for high performance numerical computation. | 
| tez-on-yarn | 0.9.2 | The tez YARN application and libraries. | 
| webserver | 2.4.41+ | Apache HTTP server. | 
| zeppelin-server | 0.9.0 | Web-based notebook that enables interactive data analytics. | 
| zookeeper-server | 3.4.14 | Centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. | 
| zookeeper-client | 3.4.14 | ZooKeeper command line client. | 

## 6.3.0 configuration classifications
<a name="emr-630-class"></a>

Configuration classifications allow you to customize applications. These often correspond to a configuration XML file for the application, such as `hive-site.xml`. For more information, see [Configure applications](emr-configure-apps.md).

Reconfiguration actions occur when you specify a configuration for instance groups in a running cluster. Amazon EMR only initiates reconfiguration actions for the classifications that you modify. For more information, see [Reconfigure an instance group in a running cluster](emr-configure-apps-running-cluster.md).
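As a minimal sketch of what a configuration object looks like, the following Python builds a one-item classification list for `hive-site` and serializes it to JSON of the kind you can supply when you create a cluster. The helper name `build_configuration` and the property shown are illustrative assumptions, not part of Amazon EMR.

```python
import json


def build_configuration(classification, properties):
    """Return one configuration object in the Classification/Properties shape.

    `build_configuration` is a hypothetical helper used only for illustration.
    """
    return {
        "Classification": classification,
        # Property values are stored as strings.
        "Properties": {key: str(value) for key, value in properties.items()},
    }


# Example property chosen for illustration; substitute your own settings.
configurations = [
    build_configuration("hive-site", {"hive.exec.parallel": "true"}),
]

configurations_json = json.dumps(configurations, indent=2)
print(configurations_json)
```

Each entry in the list targets one classification from the table that follows, so several applications can be customized in a single list.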


**emr-6.3.0 classifications**  

| Classifications | Description | Reconfiguration Actions | 
| --- | --- | --- | 
| capacity-scheduler | Change values in Hadoop's capacity-scheduler.xml file. | Restarts the ResourceManager service. | 
| container-executor | Change values in Hadoop YARN's container-executor.cfg file. | Not available. | 
| container-log4j | Change values in Hadoop YARN's container-log4j.properties file. | Not available. | 
| core-site | Change values in Hadoop's core-site.xml file. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Ranger KMS, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| docker-conf | Change Docker-related settings. | Not available. | 
| emrfs-site | Change EMRFS settings. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts HBaseRegionserver, HBaseMaster, HBaseThrift, HBaseRest, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| flink-conf | Change flink-conf.yaml settings. | Restarts Flink history server. | 
| flink-log4j | Change Flink log4j.properties settings. | Restarts Flink history server. | 
| flink-log4j-session | Change Flink log4j-session.properties settings for Kubernetes/Yarn session. | Restarts Flink history server. | 
| flink-log4j-cli | Change Flink log4j-cli.properties settings. | Restarts Flink history server. | 
| hadoop-env | Change values in the Hadoop environment for all Hadoop components. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts PhoenixQueryserver, HiveServer2, Hive MetaStore, and MapReduce-HistoryServer. | 
| hadoop-log4j | Change values in Hadoop's log4j.properties file. | Restarts the Hadoop HDFS services SecondaryNamenode, Datanode, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| hadoop-ssl-server | Change Hadoop SSL server configuration. | Not available. | 
| hadoop-ssl-client | Change Hadoop SSL client configuration. | Not available. | 
| hbase | Amazon EMR-curated settings for Apache HBase. | Custom EMR specific property. Sets emrfs-site and hbase-site configs. See those for their associated restarts. | 
| hbase-env | Change values in HBase's environment. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-log4j | Change values in HBase's hbase-log4j.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-metrics | Change values in HBase's hadoop-metrics2-hbase.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-policy | Change values in HBase's hbase-policy.xml file. | Not available. | 
| hbase-site | Change values in HBase's hbase-site.xml file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. Additionally restarts Phoenix QueryServer. | 
| hdfs-encryption-zones | Configure HDFS encryption zones. | This classification should not be reconfigured. | 
| hdfs-env | Change values in the HDFS environment. | Restarts Hadoop HDFS services Namenode, Datanode, and ZKFC. | 
| hdfs-site | Change values in HDFS's hdfs-site.xml. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Additionally restarts Hadoop Httpfs. | 
| hcatalog-env | Change values in HCatalog's environment. | Restarts Hive HCatalog Server. | 
| hcatalog-server-jndi | Change values in HCatalog's jndi.properties. | Restarts Hive HCatalog Server. | 
| hcatalog-server-proto-hive-site | Change values in HCatalog's proto-hive-site.xml. | Restarts Hive HCatalog Server. | 
| hcatalog-webhcat-env | Change values in HCatalog WebHCat's environment. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-log4j2 | Change values in HCatalog WebHCat's log4j2.properties. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-site | Change values in HCatalog WebHCat's webhcat-site.xml file. | Restarts Hive WebHCat server. | 
| hive | Amazon EMR-curated settings for Apache Hive. | Sets configurations to launch Hive LLAP service. | 
| hive-beeline-log4j2 | Change values in Hive's beeline-log4j2.properties file. | Not available. | 
| hive-parquet-logging | Change values in Hive's parquet-logging.properties file. | Not available. | 
| hive-env | Change values in the Hive environment. | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. | 
| hive-exec-log4j2 | Change values in Hive's hive-exec-log4j2.properties file. | Not available. | 
| hive-llap-daemon-log4j2 | Change values in Hive's llap-daemon-log4j2.properties file. | Not available. | 
| hive-log4j2 | Change values in Hive's hive-log4j2.properties file. | Not available. | 
| hive-site | Change values in Hive's hive-site.xml file. | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. Also restarts Oozie and Zeppelin. | 
| hiveserver2-site | Change values in Hive Server2's hiveserver2-site.xml file. | Not available. | 
| hue-ini | Change values in Hue's ini file. | Restarts Hue. Also activates Hue config override CLI commands to pick up new configurations. | 
| httpfs-env | Change values in the HTTPFS environment. | Restarts Hadoop Httpfs service. | 
| httpfs-site | Change values in Hadoop's httpfs-site.xml file. | Restarts Hadoop Httpfs service. | 
| hadoop-kms-acls | Change values in Hadoop's kms-acls.xml file. | Not available. | 
| hadoop-kms-env | Change values in the Hadoop KMS environment. | Restarts Hadoop-KMS service. | 
| hadoop-kms-log4j | Change values in Hadoop's kms-log4j.properties file. | Not available. | 
| hadoop-kms-site | Change values in Hadoop's kms-site.xml file. | Restarts Hadoop-KMS and Ranger-KMS service. | 
| hudi-env | Change values in the Hudi environment. | Not available. | 
| jupyter-notebook-conf | Change values in Jupyter Notebook's jupyter_notebook_config.py file. | Not available. | 
| jupyter-hub-conf | Change values in JupyterHub's jupyterhub_config.py file. | Not available. | 
| jupyter-s3-conf | Configure Jupyter Notebook S3 persistence. | Not available. | 
| jupyter-sparkmagic-conf | Change values in Sparkmagic's config.json file. | Not available. | 
| livy-conf | Change values in Livy's livy.conf file. | Restarts Livy Server. | 
| livy-env | Change values in the Livy environment. | Restarts Livy Server. | 
| livy-log4j | Change Livy log4j.properties settings. | Restarts Livy Server. | 
| mapred-env | Change values in the MapReduce application's environment. | Restarts Hadoop MapReduce-HistoryServer. | 
| mapred-site | Change values in the MapReduce application's mapred-site.xml file. | Restarts Hadoop MapReduce-HistoryServer. | 
| oozie-env | Change values in Oozie's environment. | Restarts Oozie. | 
| oozie-log4j | Change values in Oozie's oozie-log4j.properties file. | Restarts Oozie. | 
| oozie-site | Change values in Oozie's oozie-site.xml file. | Restarts Oozie. | 
| phoenix-hbase-metrics | Change values in Phoenix's hadoop-metrics2-hbase.properties file. | Not available. | 
| phoenix-hbase-site | Change values in Phoenix's hbase-site.xml file. | Not available. | 
| phoenix-log4j | Change values in Phoenix's log4j.properties file. | Restarts Phoenix-QueryServer. | 
| phoenix-metrics | Change values in Phoenix's hadoop-metrics2-phoenix.properties file. | Not available. | 
| pig-env | Change values in the Pig environment. | Not available. | 
| pig-properties | Change values in Pig's pig.properties file. | Restarts Oozie. | 
| pig-log4j | Change values in Pig's log4j.properties file. | Not available. | 
| presto-log | Change values in Presto's log.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-config | Change values in Presto's config.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-password-authenticator | Change values in Presto's password-authenticator.properties file. | Not available. | 
| presto-env | Change values in Presto's presto-env.sh file. | Restarts Presto-Server (for PrestoDB) | 
| presto-node | Change values in Presto's node.properties file. | Not available. | 
| presto-connector-blackhole | Change values in Presto's blackhole.properties file. | Not available. | 
| presto-connector-cassandra | Change values in Presto's cassandra.properties file. | Not available. | 
| presto-connector-hive | Change values in Presto's hive.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-jmx | Change values in Presto's jmx.properties file. | Not available. | 
| presto-connector-kafka | Change values in Presto's kafka.properties file. | Not available. | 
| presto-connector-localfile | Change values in Presto's localfile.properties file. | Not available. | 
| presto-connector-memory | Change values in Presto's memory.properties file. | Not available. | 
| presto-connector-mongodb | Change values in Presto's mongodb.properties file. | Not available. | 
| presto-connector-mysql | Change values in Presto's mysql.properties file. | Not available. | 
| presto-connector-postgresql | Change values in Presto's postgresql.properties file. | Not available. | 
| presto-connector-raptor | Change values in Presto's raptor.properties file. | Not available. | 
| presto-connector-redis | Change values in Presto's redis.properties file. | Not available. | 
| presto-connector-redshift | Change values in Presto's redshift.properties file. | Not available. | 
| presto-connector-tpch | Change values in Presto's tpch.properties file. | Not available. | 
| presto-connector-tpcds | Change values in Presto's tpcds.properties file. | Not available. | 
| prestosql-log | Change values in Presto's log.properties file. | Restarts Presto-Server (for PrestoSQL) | 
| prestosql-config | Change values in Presto's config.properties file. | Restarts Presto-Server (for PrestoSQL) | 
| prestosql-password-authenticator | Change values in Presto's password-authenticator.properties file. | Restarts Presto-Server (for PrestoSQL) | 
| prestosql-env | Change values in Presto's presto-env.sh file. | Restarts Presto-Server (for PrestoSQL) | 
| prestosql-node | Change values in PrestoSQL's node.properties file. | Not available. | 
| prestosql-connector-blackhole | Change values in PrestoSQL's blackhole.properties file. | Not available. | 
| prestosql-connector-cassandra | Change values in PrestoSQL's cassandra.properties file. | Not available. | 
| prestosql-connector-hive | Change values in PrestoSQL's hive.properties file. | Restarts Presto-Server (for PrestoSQL) | 
| prestosql-connector-jmx | Change values in PrestoSQL's jmx.properties file. | Not available. | 
| prestosql-connector-kafka | Change values in PrestoSQL's kafka.properties file. | Not available. | 
| prestosql-connector-localfile | Change values in PrestoSQL's localfile.properties file. | Not available. | 
| prestosql-connector-memory | Change values in PrestoSQL's memory.properties file. | Not available. | 
| prestosql-connector-mongodb | Change values in PrestoSQL's mongodb.properties file. | Not available. | 
| prestosql-connector-mysql | Change values in PrestoSQL's mysql.properties file. | Not available. | 
| prestosql-connector-postgresql | Change values in PrestoSQL's postgresql.properties file. | Not available. | 
| prestosql-connector-raptor | Change values in PrestoSQL's raptor.properties file. | Not available. | 
| prestosql-connector-redis | Change values in PrestoSQL's redis.properties file. | Not available. | 
| prestosql-connector-redshift | Change values in PrestoSQL's redshift.properties file. | Not available. | 
| prestosql-connector-tpch | Change values in PrestoSQL's tpch.properties file. | Not available. | 
| prestosql-connector-tpcds | Change values in PrestoSQL's tpcds.properties file. | Not available. | 
| ranger-kms-dbks-site | Change values in dbks-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-site | Change values in ranger-kms-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-env | Change values in the Ranger KMS environment. | Restarts Ranger KMS Server. | 
| ranger-kms-log4j | Change values in kms-log4j.properties file of Ranger KMS. | Not available. | 
| ranger-kms-db-ca | Change values for CA file on S3 for MySQL SSL connection with Ranger KMS. | Not available. | 
| spark | Amazon EMR-curated settings for Apache Spark. | This property modifies spark-defaults. See actions there. | 
| spark-defaults | Change values in Spark's spark-defaults.conf file. | Restarts Spark history server and Spark thrift server. | 
| spark-env | Change values in the Spark environment. | Restarts Spark history server and Spark thrift server. | 
| spark-hive-site | Change values in Spark's hive-site.xml file. | Not available. | 
| spark-log4j | Change values in Spark's log4j.properties file. | Restarts Spark history server and Spark thrift server. | 
| spark-metrics | Change values in Spark's metrics.properties file. | Restarts Spark history server and Spark thrift server. | 
| sqoop-env | Change values in Sqoop's environment. | Not available. | 
| sqoop-oraoop-site | Change values in Sqoop OraOop's oraoop-site.xml file. | Not available. | 
| sqoop-site | Change values in Sqoop's sqoop-site.xml file. | Not available. | 
| tez-site | Change values in Tez's tez-site.xml file. | Restarts Oozie and HiveServer2. | 
| yarn-env | Change values in the YARN environment. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts MapReduce-HistoryServer. | 
| yarn-site | Change values in YARN's yarn-site.xml file. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Livy Server and MapReduce-HistoryServer. | 
| zeppelin-env | Change values in the Zeppelin environment. | Restarts Zeppelin. | 
| zeppelin-site | Change configuration settings in zeppelin-site.xml. | Restarts Zeppelin. | 
| zookeeper-config | Change values in ZooKeeper's zoo.cfg file. | Restarts Zookeeper server. | 
| zookeeper-log4j | Change values in ZooKeeper's log4j.properties file. | Restarts Zookeeper server. | 

# Amazon EMR release 6.2.1
<a name="emr-621-release"></a>

## 6.2.1 application versions
<a name="emr-621-app-versions"></a>

This release includes the following applications: [Flink](https://flink.apache.org/), [Ganglia](http://ganglia.info), [HBase](http://hbase.apache.org/), [HCatalog](https://cwiki.apache.org/confluence/display/Hive/HCatalog), [Hadoop](http://hadoop.apache.org/docs/current/), [Hive](http://hive.apache.org/), [Hudi](https://hudi.apache.org), [Hue](http://gethue.com/), [JupyterEnterpriseGateway](https://jupyter-enterprise-gateway.readthedocs.io/en/latest/), [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/#), [Livy](https://livy.incubator.apache.org/), [MXNet](https://mxnet.incubator.apache.org/), [Oozie](http://oozie.apache.org/), [Phoenix](https://phoenix.apache.org/), [Pig](http://pig.apache.org/), [Presto](https://prestodb.io/), [PrestoSQL](https://prestosql.io/), [Spark](https://spark.apache.org/docs/latest/), [Sqoop](http://sqoop.apache.org/), [TensorFlow](https://www.tensorflow.org/), [Tez](https://tez.apache.org/), [Zeppelin](https://zeppelin.incubator.apache.org/), and [ZooKeeper](https://zookeeper.apache.org).

The table below lists the application versions available in this release of Amazon EMR and the application versions in the preceding three Amazon EMR releases (when applicable).

For a comprehensive history of application versions for each release of Amazon EMR, see the following topics:
+ [Application versions in Amazon EMR 7.x releases](emr-release-app-versions-7.x.md)
+ [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md)
+ [Application versions in Amazon EMR 5.x releases](emr-release-app-versions-5.x.md)
+ [Application versions in Amazon EMR 4.x releases](emr-release-app-versions-4.x.md)


**Application version information**  

|  | emr-6.2.1 | emr-6.2.0 | emr-6.1.1 | emr-6.1.0 | 
| --- | --- | --- | --- | --- | 
| AWS SDK for Java | 1.11.880 | 1.11.880 | 1.11.828 | 1.11.828 | 
| Python | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 
| Scala | 2.12.10 | 2.12.10 | 2.12.10 | 2.12.10 | 
| AmazonCloudWatchAgent |  -  |  -  |  -  |  -  | 
| Delta |  -  |  -  |  -  |  -  | 
| Flink | 1.11.2 | 1.11.2 | 1.11.0 | 1.11.0 | 
| Ganglia | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 
| HBase | 2.2.6-amzn-0 | 2.2.6-amzn-0 | 2.2.5 | 2.2.5 | 
| HCatalog | 3.1.2-amzn-3 | 3.1.2-amzn-3 | 3.1.2-amzn-2 | 3.1.2-amzn-2 | 
| Hadoop | 3.2.1-amzn-2.1 | 3.2.1-amzn-2 | 3.2.1-amzn-1.1 | 3.2.1-amzn-1 | 
| Hive | 3.1.2-amzn-3 | 3.1.2-amzn-3 | 3.1.2-amzn-2 | 3.1.2-amzn-2 | 
| Hudi | 0.6.0-amzn-1 | 0.6.0-amzn-1 | 0.5.2-incubating-amzn-2 | 0.5.2-incubating-amzn-2 | 
| Hue | 4.8.0 | 4.8.0 | 4.7.1 | 4.7.1 | 
| Iceberg |  -  |  -  |  -  |  -  | 
| JupyterEnterpriseGateway | 2.1.0 | 2.1.0 |  -  |  -  | 
| JupyterHub | 1.1.0 | 1.1.0 | 1.1.0 | 1.1.0 | 
| Livy | 0.7.0-incubating | 0.7.0-incubating | 0.7.0-incubating | 0.7.0-incubating | 
| MXNet | 1.7.0 | 1.7.0 | 1.6.0 | 1.6.0 | 
| Mahout |  -  |  -  |  -  |  -  | 
| Oozie | 5.2.0 | 5.2.0 | 5.2.0 | 5.2.0 | 
| Phoenix | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 
| Pig | 0.17.0 | 0.17.0 | 0.17.0 | 0.17.0 | 
| Presto | 0.238.3-amzn-1 | 0.238.3-amzn-1 | 0.232 | 0.232 | 
| Spark | 3.0.1-amzn-0.1 | 3.0.1-amzn-0 | 3.0.0-amzn-0.1 | 3.0.0-amzn-0 | 
| Sqoop | 1.4.7 | 1.4.7 | 1.4.7 | 1.4.7 | 
| TensorFlow | 2.3.1 | 2.3.1 | 2.1.0 | 2.1.0 | 
| Tez | 0.9.2 | 0.9.2 | 0.9.2 | 0.9.2 | 
| Trino (PrestoSQL) | 343 | 343 | 338 | 338 | 
| Zeppelin | 0.9.0-preview1 | 0.9.0-preview1 | 0.9.0-preview1 | 0.9.0-preview1 | 
| ZooKeeper | 3.4.14 | 3.4.14 | 3.4.14 | 3.4.14 | 

## 6.2.1 release notes
<a name="emr-621-relnotes"></a>

This release fixes issues where Amazon EMR Scaling fails to scale a cluster up or down successfully, or causes application failures.

**Changes, Enhancements, and Resolved Issues**
+ Fixed an issue where scaling requests failed for a large, highly utilized cluster when Amazon EMR on-cluster daemons were running health checking activities, such as gathering YARN node state and HDFS node state. This was happening because on-cluster daemons were not able to communicate the health status data of a node to internal Amazon EMR components.
+ Improved EMR on-cluster daemons to correctly track the node states when IP addresses are reused to improve reliability during scaling operations.
+ [SPARK-29683](https://issues.apache.org/jira/browse/SPARK-29683). Fixed an issue where job failures occurred during cluster scale-down as Spark was assuming all available nodes were deny-listed.
+ [YARN-9011](https://issues.apache.org/jira/browse/YARN-9011). Fixed an issue where job failures occurred due to a race condition in YARN decommissioning when the cluster tried to scale up or down.
+ Fixed an issue with step or job failures during cluster scaling by ensuring that the node states are always consistent between the Amazon EMR on-cluster daemons and YARN/HDFS.
+ Fixed an issue where cluster operations such as scale down and step submission failed for Amazon EMR clusters enabled with Kerberos authentication. This was because the Amazon EMR on-cluster daemon did not renew the Kerberos ticket, which is required to securely communicate with HDFS/YARN running on the primary node.
+ Newer Amazon EMR releases fix the issue with a lower "Max open files" limit on older AL2 in Amazon EMR. Amazon EMR releases 5.30.1, 5.30.2, 5.31.1, 5.32.1, 6.0.1, 6.1.1, 6.2.1, 5.33.0, 6.3.0 and later now include a permanent fix with a higher "Max open files" setting.
+ HTTPS is now enabled by default for Amazon Linux repositories. If you use an Amazon S3 VPCE policy to restrict access to specific buckets, you must add the new Amazon Linux bucket ARN `arn:aws:s3:::amazonlinux-2-repos-$region/*` to your policy (replace `$region` with the Region where the endpoint is located). For more information, see [Announcement: Amazon Linux 2 now supports the ability to use HTTPS while connecting to package repositories](https://forums.aws.amazon.com/ann.jspa?annID=8528) in the AWS discussion forums.
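
The ARN substitution in the last item can be sketched as follows. This is a hedged illustration: the function names are hypothetical, and the statement shape (`Effect`, `Principal`, `Action`) is an assumption about a typical endpoint policy rather than a required form. Only the ARN pattern comes from the release notes.

```python
import json


def amazon_linux_repo_arn(region):
    """Build the Amazon Linux 2 repository bucket ARN for one Region."""
    return f"arn:aws:s3:::amazonlinux-2-repos-{region}/*"


def repo_access_statement(region):
    """Return a policy statement that allows reads from the repository bucket.

    The statement shape is an assumption for illustration; merge the
    Resource entry into your existing endpoint policy as appropriate.
    """
    return {
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": [amazon_linux_repo_arn(region)],
    }


# "us-east-1" is only an example Region.
statement = repo_access_statement("us-east-1")
print(json.dumps(statement, indent=2))
```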

**Known issues**
+ When you use Spark with Hive partition location formatting to read data in Amazon S3, and you run Spark on Amazon EMR releases 5.30.0 to 5.36.0, and 6.2.0 to 6.9.0, you might encounter an issue that prevents your cluster from reading data correctly. This can happen if your partitions have all of the following characteristics:
  + Two or more partitions are scanned from the same table.
  + At least one partition directory path is a prefix of at least one other partition directory path, for example, `s3://bucket/table/p=a` is a prefix of `s3://bucket/table/p=a b`.
  + The first character that follows the prefix in the other partition directory has a UTF-8 value that’s less than the `/` character (U+002F). For example, the space character (U+0020) that occurs between a and b in `s3://bucket/table/p=a b` falls into this category. Note that there are 14 other non-control characters: `!"#$%&'()*+,-.`. For more information, see [UTF-8 encoding table and Unicode characters](https://www.utf8-chartable.de/).

  As a workaround to this issue, set the `spark.sql.sources.fastS3PartitionDiscovery.enabled` configuration to `false` in the `spark-defaults` classification.
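
To check whether a table is likely to be affected, the characteristics above can be sketched as a small Python check, followed by the workaround expressed as a configuration object. The function name `affected_pairs` and the sample paths are hypothetical. The check only inspects path strings and does not contact Amazon S3.

```python
from itertools import permutations


def affected_pairs(partition_paths):
    """Return (prefix, other) pairs that match the characteristics above.

    A pair matches when `prefix` is a proper prefix of `other` and the first
    character after the prefix sorts below '/' (U+002F). Comparing code
    points matches comparing UTF-8 values here, because every character
    below U+002F is single-byte ASCII.
    """
    pairs = []
    for prefix, other in permutations(partition_paths, 2):
        if other.startswith(prefix) and len(other) > len(prefix):
            if other[len(prefix)] < "/":
                pairs.append((prefix, other))
    return pairs


paths = [
    "s3://bucket/table/p=a",
    "s3://bucket/table/p=a b",
    "s3://bucket/table/p=c",
]
print(affected_pairs(paths))

# The workaround from the release notes as a configuration object.
workaround = [
    {
        "Classification": "spark-defaults",
        "Properties": {
            "spark.sql.sources.fastS3PartitionDiscovery.enabled": "false"
        },
    }
]
```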

## 6.2.1 component versions
<a name="emr-621-components"></a>

The components that Amazon EMR installs with this release are listed below. Some are installed as part of big-data application packages. Others are unique to Amazon EMR and installed for system processes and features. These typically start with `emr` or `aws`. Big-data application packages in the most recent Amazon EMR release are usually the latest version found in the community. We make community releases available in Amazon EMR as quickly as possible.

Some components in Amazon EMR differ from community versions. These components have a version label in the form `CommunityVersion-amzn-EmrVersion`. The `EmrVersion` starts at 0. For example, if an open source community component named `myapp-component` with version 2.2 has been modified three times for inclusion in different Amazon EMR releases, its release version is listed as `2.2-amzn-2`.
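
The label format can be illustrated with a short parsing sketch. The function name `split_version_label` is hypothetical, and treating labels without an `-amzn-` suffix as unmodified community versions is an assumption of this sketch.

```python
import re

# CommunityVersion-amzn-EmrVersion, for example "2.2-amzn-2" or "3.2.1-amzn-2.1".
_AMZN_LABEL = re.compile(r"^(?P<community>.+)-amzn-(?P<emr>\d+(?:\.\d+)*)$")


def split_version_label(label):
    """Split a component version label into (community_version, emr_version).

    Returns (label, None) when the label carries no `-amzn-` suffix.
    """
    match = _AMZN_LABEL.match(label)
    if match is None:
        return label, None
    return match.group("community"), match.group("emr")


for label in ("2.2-amzn-2", "3.2.1-amzn-2.1", "0.9.2"):
    print(label, "->", split_version_label(label))
```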


| Component | Version | Description | 
| --- | --- | --- | 
| aws-sagemaker-spark-sdk | 1.4.1 | Amazon SageMaker Spark SDK | 
| emr-ddb | 4.16.0 | Amazon DynamoDB connector for Hadoop ecosystem applications. | 
| emr-goodies | 3.1.0 | Extra convenience libraries for the Hadoop ecosystem. | 
| emr-kinesis | 3.5.0 | Amazon Kinesis connector for Hadoop ecosystem applications. | 
| emr-notebook-env | 1.0.0 | Conda env for EMR notebook, which includes Jupyter Enterprise Gateway. | 
| emr-s3-dist-cp | 2.16.0 | Distributed copy application optimized for Amazon S3. | 
| emr-s3-select | 2.0.0 | EMR S3Select Connector | 
| emrfs | 2.44.0 | Amazon S3 connector for Hadoop ecosystem applications. | 
| flink-client | 1.11.2 | Apache Flink command line client scripts and applications. | 
| flink-jobmanager-config | 1.11.2 | Managing resources on EMR nodes for Apache Flink JobManager. | 
| ganglia-monitor | 3.7.2 | Embedded Ganglia agent for Hadoop ecosystem applications along with the Ganglia monitoring agent. | 
| ganglia-metadata-collector | 3.7.2 | Ganglia metadata collector for aggregating metrics from Ganglia monitoring agents. | 
| ganglia-web | 3.7.1 | Web application for viewing metrics collected by the Ganglia metadata collector. | 
| hadoop-client | 3.2.1-amzn-2.1 | Hadoop command-line clients such as 'hdfs', 'hadoop', or 'yarn'. | 
| hadoop-hdfs-datanode | 3.2.1-amzn-2.1 | HDFS node-level service for storing blocks. | 
| hadoop-hdfs-library | 3.2.1-amzn-2.1 | HDFS command-line client and library | 
| hadoop-hdfs-namenode | 3.2.1-amzn-2.1 | HDFS service for tracking file names and block locations. | 
| hadoop-hdfs-journalnode | 3.2.1-amzn-2.1 | HDFS service for managing the Hadoop filesystem journal on HA clusters. | 
| hadoop-httpfs-server | 3.2.1-amzn-2.1 | HTTP endpoint for HDFS operations. | 
| hadoop-kms-server | 3.2.1-amzn-2.1 | Cryptographic key management server based on Hadoop's KeyProvider API. | 
| hadoop-mapred | 3.2.1-amzn-2.1 | MapReduce execution engine libraries for running a MapReduce application. | 
| hadoop-yarn-nodemanager | 3.2.1-amzn-2.1 | YARN service for managing containers on an individual node. | 
| hadoop-yarn-resourcemanager | 3.2.1-amzn-2.1 | YARN service for allocating and managing cluster resources and distributed applications. | 
| hadoop-yarn-timeline-server | 3.2.1-amzn-2.1 | Service for retrieving current and historical information for YARN applications. | 
| hbase-hmaster | 2.2.6-amzn-0 | Service for an HBase cluster responsible for coordination of Regions and execution of administrative commands. | 
| hbase-region-server | 2.2.6-amzn-0 | Service for serving one or more HBase regions. | 
| hbase-client | 2.2.6-amzn-0 | HBase command-line client. | 
| hbase-rest-server | 2.2.6-amzn-0 | Service providing a RESTful HTTP endpoint for HBase. | 
| hbase-thrift-server | 2.2.6-amzn-0 | Service providing a Thrift endpoint to HBase. | 
| hcatalog-client | 3.1.2-amzn-3 | The 'hcat' command line client for manipulating hcatalog-server. | 
| hcatalog-server | 3.1.2-amzn-3 | Service providing HCatalog, a table and storage management layer for distributed applications. | 
| hcatalog-webhcat-server | 3.1.2-amzn-3 | HTTP endpoint providing a REST interface to HCatalog. | 
| hive-client | 3.1.2-amzn-3 | Hive command line client. | 
| hive-hbase | 3.1.2-amzn-3 | Hive-hbase client. | 
| hive-metastore-server | 3.1.2-amzn-3 | Service for accessing the Hive metastore, a semantic repository storing metadata for SQL on Hadoop operations. | 
| hive-server2 | 3.1.2-amzn-3 | Service for accepting Hive queries as web requests. | 
| hudi | 0.6.0-amzn-1 | Incremental processing framework to power data pipelines at low latency and high efficiency. | 
| hudi-presto | 0.6.0-amzn-1 | Bundle library for running Presto with Hudi. | 
| hudi-prestosql | 0.6.0-amzn-1 | Bundle library for running PrestoSQL with Hudi. | 
| hudi-spark | 0.6.0-amzn-1 | Bundle library for running Spark with Hudi. | 
| hue-server | 4.8.0 | Web application for analyzing data using Hadoop ecosystem applications | 
| jupyterhub | 1.1.0 | Multi-user server for Jupyter notebooks | 
| livy-server | 0.7.0-incubating | REST interface for interacting with Apache Spark | 
| nginx | 1.12.1 | nginx [engine x] is an HTTP and reverse proxy server | 
| mxnet | 1.7.0 | A flexible, scalable, and efficient library for deep learning. | 
| mariadb-server | 5.5.64+ | MariaDB database server. | 
| nvidia-cuda | 10.1.243 | Nvidia drivers and Cuda toolkit | 
| oozie-client | 5.2.0 | Oozie command-line client. | 
| oozie-server | 5.2.0 | Service for accepting Oozie workflow requests. | 
| opencv | 4.4.0 | Open Source Computer Vision Library. | 
| phoenix-library | 5.0.0-HBase-2.0 | The phoenix libraries for server and client | 
| phoenix-query-server | 5.0.0-HBase-2.0 | A lightweight server providing JDBC access as well as Protocol Buffers and JSON format access to the Avatica API. | 
| presto-coordinator | 0.238.3-amzn-1 | Service for accepting queries and managing query execution among presto-workers. | 
| presto-worker | 0.238.3-amzn-1 | Service for executing pieces of a query. | 
| presto-client | 0.238.3-amzn-1 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| prestosql-coordinator | 343 | Service for accepting queries and managing query execution among prestosql-workers. | 
| prestosql-worker | 343 | Service for executing pieces of a query. | 
| prestosql-client | 343 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| pig-client | 0.17.0 | Pig command-line client. | 
| r | 3.4.3 | The R Project for Statistical Computing | 
| ranger-kms-server | 2.0.0 | Apache Ranger Key Management System | 
| spark-client | 3.0.1-amzn-0.1 | Spark command-line clients. | 
| spark-history-server | 3.0.1-amzn-0.1 | Web UI for viewing logged events for the lifetime of a completed Spark application. | 
| spark-on-yarn | 3.0.1-amzn-0.1 | In-memory execution engine for YARN. | 
| spark-yarn-slave | 3.0.1-amzn-0.1 | Apache Spark libraries needed by YARN slaves. | 
| spark-rapids | 0.2.0 | Nvidia Spark RAPIDS plugin that accelerates Apache Spark with GPUs. | 
| sqoop-client | 1.4.7 | Apache Sqoop command-line client. | 
| tensorflow | 2.3.1 | TensorFlow open source software library for high performance numerical computation. | 
| tez-on-yarn | 0.9.2 | The tez YARN application and libraries. | 
| webserver | 2.4.41+ | Apache HTTP server. | 
| zeppelin-server | 0.9.0-preview1 | Web-based notebook that enables interactive data analytics. | 
| zookeeper-server | 3.4.14 | Centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. | 
| zookeeper-client | 3.4.14 | ZooKeeper command line client. | 

## 6.2.1 configuration classifications
<a name="emr-621-class"></a>

Configuration classifications allow you to customize applications. These often correspond to a configuration XML file for the application, such as `hive-site.xml`. For more information, see [Configure applications](emr-configure-apps.md).

Reconfiguration actions occur when you specify a configuration for instance groups in a running cluster. Amazon EMR only initiates reconfiguration actions for the classifications that you modify. For more information, see [Reconfigure an instance group in a running cluster](emr-configure-apps-running-cluster.md).


**emr-6.2.1 classifications**  

| Classifications | Description | Reconfiguration Actions | 
| --- | --- | --- | 
| capacity-scheduler | Change values in Hadoop's capacity-scheduler.xml file. | Restarts the ResourceManager service. | 
| container-executor | Change values in Hadoop YARN's container-executor.cfg file. | Not available. | 
| container-log4j | Change values in Hadoop YARN's container-log4j.properties file. | Not available. | 
| core-site | Change values in Hadoop's core-site.xml file. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Ranger KMS, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| docker-conf | Change docker related settings. | Not available. | 
| emrfs-site | Change EMRFS settings. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts HBaseRegionserver, HBaseMaster, HBaseThrift, HBaseRest, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| flink-conf | Change flink-conf.yaml settings. | Not available. | 
| flink-log4j | Change Flink log4j.properties settings. | Not available. | 
| flink-log4j-yarn-session | Change Flink log4j-yarn-session.properties settings. | Not available. | 
| flink-log4j-cli | Change Flink log4j-cli.properties settings. | Not available. | 
| hadoop-env | Change values in the Hadoop environment for all Hadoop components. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts PhoenixQueryserver, HiveServer2, Hive MetaStore, and MapReduce-HistoryServer. | 
| hadoop-log4j | Change values in Hadoop's log4j.properties file. | Restarts the Hadoop HDFS services SecondaryNamenode, Datanode, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| hadoop-ssl-server | Change hadoop ssl server configuration | Not available. | 
| hadoop-ssl-client | Change hadoop ssl client configuration | Not available. | 
| hbase | Amazon EMR-curated settings for Apache HBase. | Custom EMR specific property. Sets emrfs-site and hbase-site configs. See those for their associated restarts. | 
| hbase-env | Change values in HBase's environment. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-log4j | Change values in HBase's hbase-log4j.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-metrics | Change values in HBase's hadoop-metrics2-hbase.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-policy | Change values in HBase's hbase-policy.xml file. | Not available. | 
| hbase-site | Change values in HBase's hbase-site.xml file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. Additionally restarts Phoenix QueryServer. | 
| hdfs-encryption-zones | Configure HDFS encryption zones. | This classification should not be reconfigured. | 
| hdfs-env | Change values in the HDFS environment. | Restarts Hadoop HDFS ZKFC. | 
| hdfs-site | Change values in HDFS's hdfs-site.xml. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Additionally restarts Hadoop Httpfs. | 
| hcatalog-env | Change values in HCatalog's environment. | Restarts Hive HCatalog Server. | 
| hcatalog-server-jndi | Change values in HCatalog's jndi.properties. | Restarts Hive HCatalog Server. | 
| hcatalog-server-proto-hive-site | Change values in HCatalog's proto-hive-site.xml. | Restarts Hive HCatalog Server. | 
| hcatalog-webhcat-env | Change values in HCatalog WebHCat's environment. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-log4j2 | Change values in HCatalog WebHCat's log4j2.properties. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-site | Change values in HCatalog WebHCat's webhcat-site.xml file. | Restarts Hive WebHCat server. | 
| hive | Amazon EMR-curated settings for Apache Hive. | Sets configurations to launch Hive LLAP service. | 
| hive-beeline-log4j2 | Change values in Hive's beeline-log4j2.properties file. | Not available. | 
| hive-parquet-logging | Change values in Hive's parquet-logging.properties file. | Not available. | 
| hive-env | Change values in the Hive environment. | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. | 
| hive-exec-log4j2 | Change values in Hive's hive-exec-log4j2.properties file. | Not available. | 
| hive-llap-daemon-log4j2 | Change values in Hive's llap-daemon-log4j2.properties file. | Not available. | 
| hive-log4j2 | Change values in Hive's hive-log4j2.properties file. | Not available. | 
| hive-site | Change values in Hive's hive-site.xml file | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. Also restarts Oozie and Zeppelin. | 
| hiveserver2-site | Change values in Hive Server2's hiveserver2-site.xml file | Not available. | 
| hue-ini | Change values in Hue's ini file | Restarts Hue. Also activates Hue config override CLI commands to pick up new configurations. | 
| httpfs-env | Change values in the HTTPFS environment. | Restarts Hadoop Httpfs service. | 
| httpfs-site | Change values in Hadoop's httpfs-site.xml file. | Restarts Hadoop Httpfs service. | 
| hadoop-kms-acls | Change values in Hadoop's kms-acls.xml file. | Not available. | 
| hadoop-kms-env | Change values in the Hadoop KMS environment. | Restarts Hadoop-KMS service. | 
| hadoop-kms-log4j | Change values in Hadoop's kms-log4j.properties file. | Not available. | 
| hadoop-kms-site | Change values in Hadoop's kms-site.xml file. | Restarts Hadoop-KMS and Ranger-KMS service. | 
| hudi-env | Change values in the Hudi environment. | Not available. | 
| jupyter-notebook-conf | Change values in Jupyter Notebook's jupyter_notebook_config.py file. | Not available. | 
| jupyter-hub-conf | Change values in JupyterHub's jupyterhub_config.py file. | Not available. | 
| jupyter-s3-conf | Configure Jupyter Notebook S3 persistence. | Not available. | 
| jupyter-sparkmagic-conf | Change values in Sparkmagic's config.json file. | Not available. | 
| livy-conf | Change values in Livy's livy.conf file. | Restarts Livy Server. | 
| livy-env | Change values in the Livy environment. | Restarts Livy Server. | 
| livy-log4j | Change Livy log4j.properties settings. | Restarts Livy Server. | 
| mapred-env | Change values in the MapReduce application's environment. | Restarts Hadoop MapReduce-HistoryServer. | 
| mapred-site | Change values in the MapReduce application's mapred-site.xml file. | Restarts Hadoop MapReduce-HistoryServer. | 
| oozie-env | Change values in Oozie's environment. | Restarts Oozie. | 
| oozie-log4j | Change values in Oozie's oozie-log4j.properties file. | Restarts Oozie. | 
| oozie-site | Change values in Oozie's oozie-site.xml file. | Restarts Oozie. | 
| phoenix-hbase-metrics | Change values in Phoenix's hadoop-metrics2-hbase.properties file. | Not available. | 
| phoenix-hbase-site | Change values in Phoenix's hbase-site.xml file. | Not available. | 
| phoenix-log4j | Change values in Phoenix's log4j.properties file. | Restarts Phoenix-QueryServer. | 
| phoenix-metrics | Change values in Phoenix's hadoop-metrics2-phoenix.properties file. | Not available. | 
| pig-env | Change values in the Pig environment. | Not available. | 
| pig-properties | Change values in Pig's pig.properties file. | Restarts Oozie. | 
| pig-log4j | Change values in Pig's log4j.properties file. | Not available. | 
| presto-log | Change values in Presto's log.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-config | Change values in Presto's config.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-password-authenticator | Change values in Presto's password-authenticator.properties file. | Not available. | 
| presto-env | Change values in Presto's presto-env.sh file. | Restarts Presto-Server (for PrestoDB) | 
| presto-node | Change values in Presto's node.properties file. | Not available. | 
| presto-connector-blackhole | Change values in Presto's blackhole.properties file. | Not available. | 
| presto-connector-cassandra | Change values in Presto's cassandra.properties file. | Not available. | 
| presto-connector-hive | Change values in Presto's hive.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-jmx | Change values in Presto's jmx.properties file. | Not available. | 
| presto-connector-kafka | Change values in Presto's kafka.properties file. | Not available. | 
| presto-connector-localfile | Change values in Presto's localfile.properties file. | Not available. | 
| presto-connector-memory | Change values in Presto's memory.properties file. | Not available. | 
| presto-connector-mongodb | Change values in Presto's mongodb.properties file. | Not available. | 
| presto-connector-mysql | Change values in Presto's mysql.properties file. | Not available. | 
| presto-connector-postgresql | Change values in Presto's postgresql.properties file. | Not available. | 
| presto-connector-raptor | Change values in Presto's raptor.properties file. | Not available. | 
| presto-connector-redis | Change values in Presto's redis.properties file. | Not available. | 
| presto-connector-redshift | Change values in Presto's redshift.properties file. | Not available. | 
| presto-connector-tpch | Change values in Presto's tpch.properties file. | Not available. | 
| presto-connector-tpcds | Change values in Presto's tpcds.properties file. | Not available. | 
| prestosql-log | Change values in Presto's log.properties file. | Restarts Presto-Server (for PrestoSQL) | 
| prestosql-config | Change values in Presto's config.properties file. | Restarts Presto-Server (for PrestoSQL) | 
| prestosql-password-authenticator | Change values in Presto's password-authenticator.properties file. | Restarts Presto-Server (for PrestoSQL) | 
| prestosql-env | Change values in Presto's presto-env.sh file. | Restarts Presto-Server (for PrestoSQL) | 
| prestosql-node | Change values in PrestoSQL's node.properties file. | Not available. | 
| prestosql-connector-blackhole | Change values in PrestoSQL's blackhole.properties file. | Not available. | 
| prestosql-connector-cassandra | Change values in PrestoSQL's cassandra.properties file. | Not available. | 
| prestosql-connector-hive | Change values in PrestoSQL's hive.properties file. | Restarts Presto-Server (for PrestoSQL) | 
| prestosql-connector-jmx | Change values in PrestoSQL's jmx.properties file. | Not available. | 
| prestosql-connector-kafka | Change values in PrestoSQL's kafka.properties file. | Not available. | 
| prestosql-connector-localfile | Change values in PrestoSQL's localfile.properties file. | Not available. | 
| prestosql-connector-memory | Change values in PrestoSQL's memory.properties file. | Not available. | 
| prestosql-connector-mongodb | Change values in PrestoSQL's mongodb.properties file. | Not available. | 
| prestosql-connector-mysql | Change values in PrestoSQL's mysql.properties file. | Not available. | 
| prestosql-connector-postgresql | Change values in PrestoSQL's postgresql.properties file. | Not available. | 
| prestosql-connector-raptor | Change values in PrestoSQL's raptor.properties file. | Not available. | 
| prestosql-connector-redis | Change values in PrestoSQL's redis.properties file. | Not available. | 
| prestosql-connector-redshift | Change values in PrestoSQL's redshift.properties file. | Not available. | 
| prestosql-connector-tpch | Change values in PrestoSQL's tpch.properties file. | Not available. | 
| prestosql-connector-tpcds | Change values in PrestoSQL's tpcds.properties file. | Not available. | 
| ranger-kms-dbks-site | Change values in dbks-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-site | Change values in ranger-kms-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-env | Change values in the Ranger KMS environment. | Restarts Ranger KMS Server. | 
| ranger-kms-log4j | Change values in kms-log4j.properties file of Ranger KMS. | Not available. | 
| ranger-kms-db-ca | Change values for CA file on S3 for MySQL SSL connection with Ranger KMS. | Not available. | 
| spark | Amazon EMR-curated settings for Apache Spark. | This property modifies spark-defaults. See actions there. | 
| spark-defaults | Change values in Spark's spark-defaults.conf file. | Restarts Spark history server and Spark thrift server. | 
| spark-env | Change values in the Spark environment. | Restarts Spark history server and Spark thrift server. | 
| spark-hive-site | Change values in Spark's hive-site.xml file | Not available. | 
| spark-log4j | Change values in Spark's log4j.properties file. | Restarts Spark history server and Spark thrift server. | 
| spark-metrics | Change values in Spark's metrics.properties file. | Restarts Spark history server and Spark thrift server. | 
| sqoop-env | Change values in Sqoop's environment. | Not available. | 
| sqoop-oraoop-site | Change values in Sqoop OraOop's oraoop-site.xml file. | Not available. | 
| sqoop-site | Change values in Sqoop's sqoop-site.xml file. | Not available. | 
| tez-site | Change values in Tez's tez-site.xml file. | Restarts Oozie. | 
| yarn-env | Change values in the YARN environment. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts MapReduce-HistoryServer. | 
| yarn-site | Change values in YARN's yarn-site.xml file. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Livy Server and MapReduce-HistoryServer. | 
| zeppelin-env | Change values in the Zeppelin environment. | Restarts Zeppelin. | 
| zookeeper-config | Change values in ZooKeeper's zoo.cfg file. | Restarts Zookeeper server. | 
| zookeeper-log4j | Change values in ZooKeeper's log4j.properties file. | Restarts Zookeeper server. | 
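
As a rough illustration of how the classifications in the table above are supplied, the following sketch writes a configurations file and shows, in a comment, how it might be passed at cluster launch. The function name `write_configurations`, the file name `configurations.json`, the property values, and the instance settings are assumptions chosen for this example, not recommended values.

```shell
#!/bin/bash
# Sketch: write a configurations file that uses two classifications from the table.
# The property values below are illustrative assumptions, not tuned recommendations.
write_configurations() {
  local out_file="$1"
  cat > "$out_file" << 'EOF'
[
  {
    "Classification": "spark-defaults",
    "Properties": {
      "spark.executor.memory": "4g"
    }
  },
  {
    "Classification": "hive-site",
    "Properties": {
      "hive.exec.parallel": "true"
    }
  }
]
EOF
}

write_configurations configurations.json

# Hypothetical launch command; adjust applications and instance settings to your needs:
# aws emr create-cluster --release-label emr-6.2.1 \
#   --applications Name=Spark Name=Hive \
#   --configurations file://configurations.json \
#   --instance-type m5.xlarge --instance-count 3 --use-default-roles
```

If you apply the same classifications to an instance group in a running cluster, the reconfiguration actions listed in the table for `spark-defaults` and `hive-site` apply.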

# Amazon EMR release 6.2.0
<a name="emr-620-release"></a>

## 6.2.0 application versions
<a name="emr-620-app-versions"></a>

This release includes the following applications: [Flink](https://flink.apache.org/), [Ganglia](http://ganglia.info), [HBase](http://hbase.apache.org/), [HCatalog](https://cwiki.apache.org/confluence/display/Hive/HCatalog), [Hadoop](http://hadoop.apache.org/docs/current/), [Hive](http://hive.apache.org/), [Hudi](https://hudi.apache.org), [Hue](http://gethue.com/), [JupyterEnterpriseGateway](https://jupyter-enterprise-gateway.readthedocs.io/en/latest/), [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/#), [Livy](https://livy.incubator.apache.org/), [MXNet](https://mxnet.incubator.apache.org/), [Oozie](http://oozie.apache.org/), [Phoenix](https://phoenix.apache.org/), [Pig](http://pig.apache.org/), [Presto](https://prestodb.io/), [PrestoSQL](https://prestosql.io/), [Spark](https://spark.apache.org/docs/latest/), [Sqoop](http://sqoop.apache.org/), [TensorFlow](https://www.tensorflow.org/), [Tez](https://tez.apache.org/), [Zeppelin](https://zeppelin.incubator.apache.org/), and [ZooKeeper](https://zookeeper.apache.org).

The table below lists the application versions available in this release of Amazon EMR and the application versions in the preceding three Amazon EMR releases (when applicable).

For a comprehensive history of application versions for each release of Amazon EMR, see the following topics:
+ [Application versions in Amazon EMR 7.x releases](emr-release-app-versions-7.x.md)
+ [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md)
+ [Application versions in Amazon EMR 5.x releases](emr-release-app-versions-5.x.md)
+ [Application versions in Amazon EMR 4.x releases](emr-release-app-versions-4.x.md)


**Application version information**  

|  | emr-6.2.0 | emr-6.1.1 | emr-6.1.0 | emr-6.0.1 | 
| --- | --- | --- | --- | --- | 
| AWS SDK for Java | 1.11.880 | 1.11.828 | 1.11.828 | 1.11.711 | 
| Python | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 
| Scala | 2.12.10 | 2.12.10 | 2.12.10 | 2.12.10 | 
| AmazonCloudWatchAgent |  -  |  -  |  -  |  -  | 
| Delta |  -  |  -  |  -  |  -  | 
| Flink | 1.11.2 | 1.11.0 | 1.11.0 |  -  | 
| Ganglia | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 
| HBase | 2.2.6-amzn-0 | 2.2.5 | 2.2.5 | 2.2.3 | 
| HCatalog | 3.1.2-amzn-3 | 3.1.2-amzn-2 | 3.1.2-amzn-2 | 3.1.2-amzn-0 | 
| Hadoop | 3.2.1-amzn-2 | 3.2.1-amzn-1.1 | 3.2.1-amzn-1 | 3.2.1-amzn-0.1 | 
| Hive | 3.1.2-amzn-3 | 3.1.2-amzn-2 | 3.1.2-amzn-2 | 3.1.2-amzn-0 | 
| Hudi | 0.6.0-amzn-1 | 0.5.2-incubating-amzn-2 | 0.5.2-incubating-amzn-2 | 0.5.0-incubating-amzn-1 | 
| Hue | 4.8.0 | 4.7.1 | 4.7.1 | 4.4.0 | 
| Iceberg |  -  |  -  |  -  |  -  | 
| JupyterEnterpriseGateway | 2.1.0 |  -  |  -  |  -  | 
| JupyterHub | 1.1.0 | 1.1.0 | 1.1.0 | 1.0.0 | 
| Livy | 0.7.0-incubating | 0.7.0-incubating | 0.7.0-incubating | 0.6.0-incubating | 
| MXNet | 1.7.0 | 1.6.0 | 1.6.0 | 1.5.1 | 
| Mahout |  -  |  -  |  -  |  -  | 
| Oozie | 5.2.0 | 5.2.0 | 5.2.0 | 5.1.0 | 
| Phoenix | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 
| Pig | 0.17.0 | 0.17.0 | 0.17.0 |  -  | 
| Presto | 0.238.3-amzn-1 | 0.232 | 0.232 | 0.230 | 
| Spark | 3.0.1-amzn-0 | 3.0.0-amzn-0.1 | 3.0.0-amzn-0 | 2.4.4 | 
| Sqoop | 1.4.7 | 1.4.7 | 1.4.7 |  -  | 
| TensorFlow | 2.3.1 | 2.1.0 | 2.1.0 | 1.14.0 | 
| Tez | 0.9.2 | 0.9.2 | 0.9.2 | 0.9.2 | 
| Trino (PrestoSQL) | 343 | 338 | 338 |  -  | 
| Zeppelin | 0.9.0-preview1 | 0.9.0-preview1 | 0.9.0-preview1 | 0.9.0-SNAPSHOT | 
| ZooKeeper | 3.4.14 | 3.4.14 | 3.4.14 | 3.4.14 | 
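
To check the same version information from the command line, something like the following may work. It assumes a recent AWS CLI that includes the `describe-release-label` command, and credentials that are allowed to call it. The function name `list_app_versions` is made up for this sketch.

```shell
#!/bin/bash
# Sketch: print "Name<TAB>Version" for each application in a release label.
# Assumes the AWS CLI is installed and configured for a Region where the release exists.
list_app_versions() {
  local release_label="$1"
  aws emr describe-release-label \
    --release-label "$release_label" \
    --query 'Applications[].[Name,Version]' \
    --output text
}

# Example: list_app_versions emr-6.2.0
```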

## 6.2.0 release notes
<a name="emr-620-relnotes"></a>

The following release notes include information for Amazon EMR release 6.2.0. Changes are relative to 6.1.0.

Initial release date: Dec 09, 2020

Last updated date: Oct 04, 2021

**Supported applications**
+ AWS SDK for Java version 1.11.880
+ emr-record-server version 1.7.0
+ Flink version 1.11.2
+ Ganglia version 3.7.2
+ Hadoop version 3.2.1-amzn-2
+ HBase version 2.2.6-amzn-0
+ HBase-operator-tools 1.0.0
+ HCatalog version 3.1.2-amzn-3
+ Hive version 3.1.2-amzn-3
+ Hudi version 0.6.0-amzn-1
+ Hue version 4.8.0
+ JupyterHub version 1.1.0
+ Livy version 0.7.0
+ MXNet version 1.7.0
+ Oozie version 5.2.0
+ Phoenix version 5.0.0
+ Pig version 0.17.0
+ Presto version 0.238.3-amzn-1
+ PrestoSQL version 343
+ Spark version 3.0.1-amzn-0
+ spark-rapids 0.2.0
+ TensorFlow version 2.3.1
+ Zeppelin version 0.9.0-preview1
+ Zookeeper version 3.4.14
+ Connectors and drivers: DynamoDB Connector 4.16.0

**New features**
+ HBase: Removed rename in commit phase and added persistent HFile tracking. See [Persistent HFile Tracking](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hbase-s3.html#emr-hbase-s3-hfile-tracking) in the *Amazon EMR Release Guide*.
+ HBase: Backported [Create a config that forces to cache blocks on compaction](https://issues.apache.org/jira/browse/HBASE-23066).
+ PrestoDB: Improvements to Dynamic Partition Pruning. Rule-based Join Reorder works on non-partitioned data.
+ Scoped managed policies: To align with AWS best practices, Amazon EMR has introduced v2 EMR-scoped default managed policies as replacements for policies that will be deprecated. See [Amazon EMR Managed Policies](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-managed-iam-policies.html).
+ Instance Metadata Service (IMDS) V2 support status: For Amazon EMR 6.2 or later, Amazon EMR components use IMDSv2 for all IMDS calls. For IMDS calls in your application code, you can use both IMDSv1 and IMDSv2, or configure the IMDS to use only IMDSv2 for added security. If you disable IMDSv1 in earlier Amazon EMR 6.x releases, it causes cluster startup failure.
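
For IMDS calls in your own application code, an IMDSv2 request is a two-step exchange: first obtain a session token, then send it with the metadata request. The following is a minimal sketch of that exchange with `curl`. The function name `imds_get` and the 6-hour token lifetime are assumptions made for this example.

```shell
#!/bin/bash
# Sketch: read instance metadata using IMDSv2 (session-token) requests.
# 169.254.169.254 is the standard link-local IMDS address on EC2 instances.
IMDS_BASE="http://169.254.169.254/latest"

imds_get() {
  local path="$1"
  local token
  # Step 1: request a session token (valid here for 6 hours).
  token=$(curl -s -X PUT "$IMDS_BASE/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 21600") || return 1
  # Step 2: send the token with the metadata request.
  curl -s -H "X-aws-ec2-metadata-token: $token" "$IMDS_BASE/$path"
}

# Example (run on a cluster node): imds_get meta-data/instance-id
```

Code written this way keeps working if you later configure the instances to accept only IMDSv2.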

**Changes, enhancements, and resolved issues**
+ This release fixes issues with Amazon EMR scaling where a cluster fails to scale up or scale down successfully, or where scaling causes application failures.
+ Fixed an issue where scaling requests failed for a large, highly utilized cluster when Amazon EMR on-cluster daemons were running health checking activities, such as gathering YARN node state and HDFS node state. This was happening because on-cluster daemons were not able to communicate the health status data of a node to internal Amazon EMR components.
+ Improved EMR on-cluster daemons to correctly track the node states when IP addresses are reused to improve reliability during scaling operations.
+ [SPARK-29683](https://issues.apache.org/jira/browse/SPARK-29683). Fixed an issue where job failures occurred during cluster scale-down as Spark was assuming all available nodes were deny-listed.
+ [YARN-9011](https://issues.apache.org/jira/browse/YARN-9011). Fixed an issue where job failures occurred due to a race condition in YARN decommissioning when the cluster tried to scale up or down.
+ Fixed an issue with step or job failures during cluster scaling by ensuring that the node states are always consistent between the Amazon EMR on-cluster daemons and YARN/HDFS.
+ Fixed an issue where cluster operations such as scale down and step submission failed for Amazon EMR clusters enabled with Kerberos authentication. This was because the Amazon EMR on-cluster daemon did not renew the Kerberos ticket, which is required to securely communicate with HDFS/YARN running on the primary node.
+ Newer Amazon EMR releases fix the issue with a lower "Max open files" limit on older AL2 in Amazon EMR. Amazon EMR releases 5.30.1, 5.30.2, 5.31.1, 5.32.1, 6.0.1, 6.1.1, 6.2.1, 5.33.0, 6.3.0 and later now include a permanent fix with a higher "Max open files" setting.
+ Spark: Performance improvements in Spark runtime.

**Known issues**
+ Amazon EMR 6.2.0 has incorrect permissions set on the `/etc/cron.d/libinstance-controller-java` file. Permissions on the file are 645 (-rw-r--r-x), when they should be 644 (-rw-r--r--). As a result, Amazon EMR version 6.2 does not log instance-state logs, and the `/emr/instance-logs` directory is empty. This issue is fixed in Amazon EMR 6.3.0 and later.

  To work around this issue, run the following script as a bootstrap action at cluster launch. 

  ```
  #!/bin/bash
  sudo chmod 644 /etc/cron.d/libinstance-controller-java
  ```
+ For Amazon EMR 6.2.0 and 6.3.0 private subnet clusters, you cannot access the Ganglia web UI. You will get an "access denied (403)" error. Other web UIs, such as Spark, Hue, JupyterHub, Zeppelin, Livy, and Tez, work normally. Ganglia web UI access on public subnet clusters also works normally. To resolve this issue, restart the httpd service on the primary node with `sudo systemctl restart httpd`. This issue is fixed in Amazon EMR 6.4.0.
+ There is an issue in Amazon EMR 6.2.0 where httpd continuously fails, causing Ganglia to be unavailable. You get a "cannot connect to the server" error. To fix a cluster that is already running with this issue, SSH to the cluster primary node and add the line `Listen 80` to the file `httpd.conf` located at `/etc/httpd/conf/httpd.conf`. This issue is fixed in Amazon EMR 6.3.0.
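
  One way to apply this edit without adding the line twice is sketched below. The function name `ensure_listen_80` is hypothetical, and the restart of httpd in the commented example is an assumption about what is needed for the change to take effect.

  ```shell
  #!/bin/bash
  # Sketch: add "Listen 80" to an httpd.conf file only if it is not already present.
  # On a cluster, run this with root privileges against /etc/httpd/conf/httpd.conf.
  ensure_listen_80() {
    local conf_file="$1"
    # -x matches the whole line, -F treats the pattern as a fixed string.
    if ! grep -qxF 'Listen 80' "$conf_file"; then
      echo 'Listen 80' >> "$conf_file"
    fi
  }

  # Example, run as root on the primary node:
  # ensure_listen_80 /etc/httpd/conf/httpd.conf && systemctl restart httpd
  ```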
+ HTTPD fails on EMR 6.2.0 clusters when you use a security configuration. This makes the Ganglia web application user interface unavailable. To access the Ganglia web application user interface, add `Listen 80` to the `/etc/httpd/conf/httpd.conf` file on the primary node of your cluster. For information about connecting to your cluster, see [Connect to the Primary Node Using SSH](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-connect-master-node-ssh.html).

  EMR Notebooks also fail to establish a connection with EMR 6.2.0 clusters when you use a security configuration. The notebook will fail to list kernels and submit Spark jobs. We recommend that you use EMR Notebooks with another version of Amazon EMR instead.
+ **Lower "Max open files" limit on older AL2 [fixed in newer releases].** Amazon EMR releases emr-5.30.x, emr-5.31.0, emr-5.32.0, emr-6.0.0, emr-6.1.0, and emr-6.2.0 are based on older versions of Amazon Linux 2 (AL2), which have a lower ulimit setting for "Max open files" when Amazon EMR clusters are created with the default AMI. Amazon EMR releases 5.30.1, 5.30.2, 5.31.1, 5.32.1, 6.0.1, 6.1.1, 6.2.1, 5.33.0, 6.3.0 and later include a permanent fix with a higher "Max open files" setting. Releases with the lower open file limit cause a "Too many open files" error when you submit a Spark job. In the impacted releases, the Amazon EMR default AMI has a default ulimit setting of 4096 for "Max open files," which is lower than the 65536 file limit in the latest Amazon Linux 2 AMI. The lower ulimit setting for "Max open files" causes Spark job failure when the Spark driver and executor try to open more than 4096 files. To fix the issue, Amazon EMR has a bootstrap action (BA) script that adjusts the ulimit setting at cluster creation. 

  If you are using an older Amazon EMR version that doesn't have the permanent fix for this issue, the following workaround lets you explicitly set the instance-controller ulimit to a maximum of 65536 files.

  **Explicitly set a ulimit from the command line**

  1. Edit `/etc/systemd/system/instance-controller.service` to add the following parameters to the Service section.

     `LimitNOFILE=65536`

     `LimitNPROC=65536`

  1. Restart InstanceController.

     `$ sudo systemctl daemon-reload`

     `$ sudo systemctl restart instance-controller`

  **Set a ulimit using bootstrap action (BA)**

  You can also use a bootstrap action (BA) script to configure the instance-controller ulimit to 65536 files at cluster creation.

  ```
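  #!/bin/bash
  # Raise the per-user open file and process limits for the hadoop, spark, and hive users.
  for user in hadoop spark hive; do
  sudo tee /etc/security/limits.d/$user.conf << EOF
  $user - nofile 65536
  $user - nproc 65536
  EOF
  done
  # Add a systemd override for each on-cluster daemon, then raise the limits of the running process.
  for proc in instancecontroller logpusher; do
  sudo mkdir -p /etc/systemd/system/$proc.service.d/
  sudo tee /etc/systemd/system/$proc.service.d/override.conf << EOF
  [Service]
  LimitNOFILE=65536
  LimitNPROC=65536
  EOF
  pid=$(pgrep -f aws157.$proc.Main)
  sudo prlimit --pid $pid --nofile=65536:65536 --nproc=65536:65536
  done
  # Reload systemd so that the override files are picked up.
  sudo systemctl daemon-reload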
  ```
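
  To confirm that a workaround took effect, you can inspect the limits of the running process. The following sketch reads the soft "Max open files" value from a `/proc/<pid>/limits` file. The function name `soft_open_files_limit` is hypothetical, and the commented example assumes that the `pgrep` pattern from the script above matches exactly one process.

  ```shell
  #!/bin/bash
  # Sketch: print the soft "Max open files" limit from a /proc/<pid>/limits file.
  soft_open_files_limit() {
    local limits_file="$1"
    # Line format: "Max open files  <soft>  <hard>  files", so the soft limit is field 4.
    awk '/^Max open files/ {print $4}' "$limits_file"
  }

  # Example on a cluster node:
  # soft_open_files_limit "/proc/$(pgrep -f aws157.instancecontroller.Main)/limits"
  ```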
+ **Important**  
  Amazon EMR 6.1.0 and 6.2.0 include a performance issue that can critically affect all Hudi insert, upsert, and delete operations. If you plan to use Hudi with Amazon EMR 6.1.0 or 6.2.0, you should contact AWS support to obtain a patched Hudi RPM.
+ **Important**  
  EMR clusters that run Amazon Linux or Amazon Linux 2 Amazon Machine Images (AMIs) use default Amazon Linux behavior, and do not automatically download and install important and critical kernel updates that require a reboot. This is the same behavior as other Amazon EC2 instances that run the default Amazon Linux AMI. If new Amazon Linux software updates that require a reboot (such as kernel, NVIDIA, and CUDA updates) become available after an Amazon EMR release becomes available, EMR cluster instances that run the default AMI do not automatically download and install those updates. To get kernel updates, you can [customize your Amazon EMR AMI](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-custom-ami.html) to [use the latest Amazon Linux AMI](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/finding-an-ami.html).
+ Amazon EMR 6.2.0 Maven artifacts are not published. They will be published with a future release of Amazon EMR.
+ Persistent HFile tracking using the HBase storefile system table does not support the HBase region replication feature. For more information about HBase region replication, see [Timeline-consistent High Available Reads](http://hbase.apache.org/book.html#arch.timelineconsistent.reads).
+ Amazon EMR 6.x and EMR 5.x Hive bucketing version differences

  EMR 5.x uses OSS Apache Hive 2, while EMR 6.x uses OSS Apache Hive 3. Open source Hive 2 uses bucketing version 1, while open source Hive 3 uses bucketing version 2. Because of this difference between Hive 2 (EMR 5.x) and Hive 3 (EMR 6.x), the Hive bucketing hash function produces different results in the two release series. See the example below.

  The following example creates the same table in EMR 6.x and EMR 5.x, using a different `LOCATION` for each.

  ```
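  -- In EMR 6.x, create the table with the following LOCATION.
  CREATE TABLE test_bucketing (id INT, desc STRING)
  PARTITIONED BY (day STRING)
  CLUSTERED BY(id) INTO 128 BUCKETS
  LOCATION 's3://your-own-s3-bucket/emr-6-bucketing/';

  -- In EMR 5.x, create the same table with the following LOCATION.
  CREATE TABLE test_bucketing (id INT, desc STRING)
  PARTITIONED BY (day STRING)
  CLUSTERED BY(id) INTO 128 BUCKETS
  LOCATION 's3://your-own-s3-bucket/emr-5-bucketing/';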
  ```

  Insert the same data in both EMR 6.x and EMR 5.x.

  ```
  INSERT INTO test_bucketing PARTITION (day='01') VALUES(66, 'some_data');
  INSERT INTO test_bucketing PARTITION (day='01') VALUES(200, 'some_data');
  ```

  Checking the S3 location shows that the bucket file names differ, because the hashing function differs between EMR 6.x (Hive 3) and EMR 5.x (Hive 2).

  ```
  [hadoop@ip-10-0-0-122 ~]$ aws s3 ls s3://your-own-s3-bucket/emr-6-bucketing/day=01/
  2020-10-21 20:35:16         13 000025_0
  2020-10-21 20:35:22         14 000121_0
  [hadoop@ip-10-0-0-122 ~]$ aws s3 ls s3://your-own-s3-bucket/emr-5-bucketing/day=01/
  2020-10-21 20:32:07         13 000066_0
  2020-10-21 20:32:51         14 000072_0
  ```

  You can also see the version difference by running the following command in Hive CLI in EMR 6.x. Note that it returns bucketing version 2.

  ```
  hive> DESCRIBE FORMATTED test_bucketing;
  ...
  Table Parameters:
      bucketing_version       2
  ...
  ```
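The EMR 5.x file names in the listing are consistent with bucketing version 1, where the bucket index for an `INT` key is the key's hash modulo the bucket count. The following Python sketch only illustrates that arithmetic and is not Hive's implementation. The function names are hypothetical, and the sketch assumes that the version 1 hash of an `INT` is the integer value itself. It does not reproduce bucketing version 2, which uses a different, Murmur3-based hash.

```python
def bucket_v1_for_int(value: int, num_buckets: int) -> int:
    """Bucket index for an INT key under bucketing version 1 (illustrative only)."""
    # Assumption: the version 1 hash of an INT is the value itself, masked to a
    # non-negative 31-bit number before the modulo is applied.
    return (value & 0x7FFFFFFF) % num_buckets


def bucket_file_name(bucket: int) -> str:
    """Format a bucket index the way the S3 listing shows it, for example 000066_0."""
    return f"{bucket:06d}_0"


if __name__ == "__main__":
    # Same keys and bucket count as the test_bucketing example.
    for key in (66, 200):
        print(key, bucket_file_name(bucket_v1_for_int(key, 128)))
```

With 128 buckets, keys 66 and 200 map to `000066_0` and `000072_0`, which matches the `emr-5-bucketing` listing.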
+ Known issue in clusters with multiple primary nodes and Kerberos authentication

  If you run clusters with multiple primary nodes and Kerberos authentication in Amazon EMR releases 5.20.0 and later, you may encounter problems with cluster operations such as scale-down or step submission after the cluster has been running for some time. The time period depends on the Kerberos ticket validity period that you defined. The scale-down problem impacts both automatic scale-down and explicit scale-down requests that you submit. Additional cluster operations can also be impacted.

  Workaround:
  + Connect with SSH as the `hadoop` user to the lead primary node of the EMR cluster with multiple primary nodes.
  + Run the following command to renew the Kerberos ticket for the `hadoop` user.

    ```
    kinit -kt <keytab_file> <principal>
    ```

    Typically, the keytab file is located at `/etc/hadoop.keytab` and the principal is in the form of `hadoop/<hostname>@<REALM>`.
**Note**  
This workaround is effective for as long as the Kerberos ticket is valid. This duration is 10 hours by default, but can be configured in your Kerberos settings. You must re-run the above command once the Kerberos ticket expires.
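Because the `kinit` command has to be repeated each time the ticket expires, you might wrap it in a small script that you run on a schedule. The following Python sketch is one possible way to do that, not an Amazon EMR feature. The function names, host name, and realm are hypothetical placeholders, and the keytab path is the typical location mentioned in the workaround.

```python
import subprocess
from typing import List


def hadoop_principal(hostname: str, realm: str) -> str:
    """Compose a principal in the form hadoop/<hostname>@<REALM>."""
    return f"hadoop/{hostname}@{realm}"


def build_kinit_command(keytab_file: str, principal: str) -> List[str]:
    """Return the argument list for `kinit -kt <keytab_file> <principal>`."""
    return ["kinit", "-kt", keytab_file, principal]


def renew_ticket(keytab_file: str, principal: str) -> None:
    """Run kinit and raise CalledProcessError if the renewal fails."""
    subprocess.run(build_kinit_command(keytab_file, principal), check=True)


if __name__ == "__main__":
    # Hypothetical host name and realm; substitute the values for your cluster.
    principal = hadoop_principal("ip-10-0-0-122.ec2.internal", "EC2.INTERNAL")
    # Print the command instead of running it, so the sketch is safe to try anywhere.
    print(" ".join(build_kinit_command("/etc/hadoop.keytab", principal)))
```

To renew the ticket instead of printing the command, call `renew_ticket` as the `hadoop` user on the lead primary node at an interval shorter than the ticket validity period.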
+ When you use Spark with Hive partition location formatting to read data in Amazon S3, and you run Spark on Amazon EMR releases 5.30.0 to 5.36.0, and 6.2.0 to 6.9.0, you might encounter an issue that prevents your cluster from reading data correctly. This can happen if your partitions have all of the following characteristics:
  + Two or more partitions are scanned from the same table.
  + At least one partition directory path is a prefix of at least one other partition directory path, for example, `s3://bucket/table/p=a` is a prefix of `s3://bucket/table/p=a b`.
  + The first character that follows the prefix in the other partition directory has a UTF-8 value that’s less than the `/` character (U+002F). For example, the space character (U+0020) that occurs between a and b in `s3://bucket/table/p=a b` falls into this category. Note that there are 14 other non-control characters: `!"#$%&'()*+,-.`. For more information, see [UTF-8 encoding table and Unicode characters](https://www.utf8-chartable.de/).

  As a workaround to this issue, set the `spark.sql.sources.fastS3PartitionDiscovery.enabled` configuration to `false` in the `spark-defaults` classification.
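To judge whether a table might be affected, you can compare its partition directory paths against the characteristics listed above. The following Python sketch is a hypothetical helper, not part of Spark or Amazon EMR. It flags path pairs where one path is a prefix of another and the next character sorts before `/`. It also shows the workaround expressed as a `spark-defaults` classification.

```python
import json
from typing import List, Tuple


def risky_partition_pairs(paths: List[str]) -> List[Tuple[str, str]]:
    """Return (prefix, other) pairs that match the characteristics described above."""
    pairs = []
    for prefix in paths:
        for other in paths:
            if other != prefix and other.startswith(prefix):
                # First character of `other` that follows the shared prefix.
                next_char = other[len(prefix)]
                # Characters that sort before "/" (U+002F) include the space character.
                if next_char < "/":
                    pairs.append((prefix, other))
    return pairs


# Workaround from the release notes, written as a configuration classification.
SPARK_DEFAULTS_WORKAROUND = [
    {
        "Classification": "spark-defaults",
        "Properties": {
            "spark.sql.sources.fastS3PartitionDiscovery.enabled": "false"
        },
    }
]

if __name__ == "__main__":
    example_paths = ["s3://bucket/table/p=a", "s3://bucket/table/p=a b"]
    print(risky_partition_pairs(example_paths))
    print(json.dumps(SPARK_DEFAULTS_WORKAROUND, indent=2))
```

The check covers only the second and third characteristics. Whether two or more of the flagged partitions are scanned in the same query depends on your workload.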

## 6.2.0 component versions
<a name="emr-620-components"></a>

The components that Amazon EMR installs with this release are listed below. Some are installed as part of big-data application packages. Others are unique to Amazon EMR and installed for system processes and features. These typically start with `emr` or `aws`. Big-data application packages in the most recent Amazon EMR release are usually the latest version found in the community. We make community releases available in Amazon EMR as quickly as possible.

Some components in Amazon EMR differ from community versions. These components have a version label in the form `CommunityVersion-amzn-EmrVersion`. The `EmrVersion` starts at 0. For example, if an open source community component named `myapp-component` with version 2.2 has been modified three times for inclusion in different Amazon EMR releases, its release version is listed as `2.2-amzn-2`.
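If you process component versions in scripts, it can help to split a label into its community and Amazon EMR parts. The following Python sketch assumes the `CommunityVersion-amzn-EmrVersion` form described above. The function name and regular expression are illustrative and are not part of any Amazon EMR tooling.

```python
import re
from typing import Optional, Tuple

# Matches labels such as 3.2.1-amzn-2 or 3.2.1-amzn-1.1 (assumed pattern).
_AMZN_LABEL = re.compile(r"^(?P<community>.+)-amzn-(?P<emr>\d+(?:\.\d+)*)$")


def split_version_label(label: str) -> Tuple[str, Optional[str]]:
    """Return (community_version, emr_version); emr_version is None for unmodified components."""
    match = _AMZN_LABEL.match(label)
    if match is None:
        return label, None
    return match.group("community"), match.group("emr")


if __name__ == "__main__":
    for label in ("2.2-amzn-2", "3.2.1-amzn-1.1", "1.11.2"):
        print(label, split_version_label(label))
```

A label without an `-amzn-` suffix, such as `1.11.2`, is returned unchanged with `None` as the Amazon EMR part.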


| Component | Version | Description | 
| --- | --- | --- | 
| aws-sagemaker-spark-sdk | 1.4.1 | Amazon SageMaker Spark SDK | 
| emr-ddb | 4.16.0 | Amazon DynamoDB connector for Hadoop ecosystem applications. | 
| emr-goodies | 3.1.0 | Extra convenience libraries for the Hadoop ecosystem. | 
| emr-kinesis | 3.5.0 | Amazon Kinesis connector for Hadoop ecosystem applications. | 
| emr-notebook-env | 1.0.0 | Conda env for emr notebook which includes jupyter enterprise gateway | 
| emr-s3-dist-cp | 2.16.0 | Distributed copy application optimized for Amazon S3. | 
| emr-s3-select | 2.0.0 | EMR S3Select Connector | 
| emrfs | 2.44.0 | Amazon S3 connector for Hadoop ecosystem applications. | 
| flink-client | 1.11.2 | Apache Flink command line client scripts and applications. | 
| flink-jobmanager-config | 1.11.2 | Managing resources on EMR nodes for Apache Flink JobManager. | 
| ganglia-monitor | 3.7.2 | Embedded Ganglia agent for Hadoop ecosystem applications along with the Ganglia monitoring agent. | 
| ganglia-metadata-collector | 3.7.2 | Ganglia metadata collector for aggregating metrics from Ganglia monitoring agents. | 
| ganglia-web | 3.7.1 | Web application for viewing metrics collected by the Ganglia metadata collector. | 
| hadoop-client | 3.2.1-amzn-2 | Hadoop command-line clients such as 'hdfs', 'hadoop', or 'yarn'. | 
| hadoop-hdfs-datanode | 3.2.1-amzn-2 | HDFS node-level service for storing blocks. | 
| hadoop-hdfs-library | 3.2.1-amzn-2 | HDFS command-line client and library | 
| hadoop-hdfs-namenode | 3.2.1-amzn-2 | HDFS service for tracking file names and block locations. | 
| hadoop-hdfs-journalnode | 3.2.1-amzn-2 | HDFS service for managing the Hadoop filesystem journal on HA clusters. | 
| hadoop-httpfs-server | 3.2.1-amzn-2 | HTTP endpoint for HDFS operations. | 
| hadoop-kms-server | 3.2.1-amzn-2 | Cryptographic key management server based on Hadoop's KeyProvider API. | 
| hadoop-mapred | 3.2.1-amzn-2 | MapReduce execution engine libraries for running a MapReduce application. | 
| hadoop-yarn-nodemanager | 3.2.1-amzn-2 | YARN service for managing containers on an individual node. | 
| hadoop-yarn-resourcemanager | 3.2.1-amzn-2 | YARN service for allocating and managing cluster resources and distributed applications. | 
| hadoop-yarn-timeline-server | 3.2.1-amzn-2 | Service for retrieving current and historical information for YARN applications. | 
| hbase-hmaster | 2.2.6-amzn-0 | Service for an HBase cluster responsible for coordination of Regions and execution of administrative commands. | 
| hbase-region-server | 2.2.6-amzn-0 | Service for serving one or more HBase regions. | 
| hbase-client | 2.2.6-amzn-0 | HBase command-line client. | 
| hbase-rest-server | 2.2.6-amzn-0 | Service providing a RESTful HTTP endpoint for HBase. | 
| hbase-thrift-server | 2.2.6-amzn-0 | Service providing a Thrift endpoint to HBase. | 
| hcatalog-client | 3.1.2-amzn-3 | The 'hcat' command line client for manipulating hcatalog-server. | 
| hcatalog-server | 3.1.2-amzn-3 | Service providing HCatalog, a table and storage management layer for distributed applications. | 
| hcatalog-webhcat-server | 3.1.2-amzn-3 | HTTP endpoint providing a REST interface to HCatalog. | 
| hive-client | 3.1.2-amzn-3 | Hive command line client. | 
| hive-hbase | 3.1.2-amzn-3 | Hive-hbase client. | 
| hive-metastore-server | 3.1.2-amzn-3 | Service for accessing the Hive metastore, a semantic repository storing metadata for SQL on Hadoop operations. | 
| hive-server2 | 3.1.2-amzn-3 | Service for accepting Hive queries as web requests. | 
| hudi | 0.6.0-amzn-1 | Incremental processing framework to power data pipelines at low latency and high efficiency. | 
| hudi-presto | 0.6.0-amzn-1 | Bundle library for running Presto with Hudi. | 
| hudi-prestosql | 0.6.0-amzn-1 | Bundle library for running PrestoSQL with Hudi. | 
| hudi-spark | 0.6.0-amzn-1 | Bundle library for running Spark with Hudi. | 
| hue-server | 4.8.0 | Web application for analyzing data using Hadoop ecosystem applications | 
| jupyterhub | 1.1.0 | Multi-user server for Jupyter notebooks | 
| livy-server | 0.7.0-incubating | REST interface for interacting with Apache Spark | 
| nginx | 1.12.1 | nginx [engine x] is an HTTP and reverse proxy server | 
| mxnet | 1.7.0 | A flexible, scalable, and efficient library for deep learning. | 
| mariadb-server | 5.5.64+ | MariaDB database server. | 
| nvidia-cuda | 10.1.243 | Nvidia drivers and Cuda toolkit | 
| oozie-client | 5.2.0 | Oozie command-line client. | 
| oozie-server | 5.2.0 | Service for accepting Oozie workflow requests. | 
| opencv | 4.4.0 | Open Source Computer Vision Library. | 
| phoenix-library | 5.0.0-HBase-2.0 | The phoenix libraries for server and client | 
| phoenix-query-server | 5.0.0-HBase-2.0 | A light weight server providing JDBC access as well as Protocol Buffers and JSON format access to the Avatica API  | 
| presto-coordinator | 0.238.3-amzn-1 | Service for accepting queries and managing query execution among presto-workers. | 
| presto-worker | 0.238.3-amzn-1 | Service for executing pieces of a query. | 
| presto-client | 0.238.3-amzn-1 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| prestosql-coordinator | 343 | Service for accepting queries and managing query execution among prestosql-workers. | 
| prestosql-worker | 343 | Service for executing pieces of a query. | 
| prestosql-client | 343 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| pig-client | 0.17.0 | Pig command-line client. | 
| r | 3.4.3 | The R Project for Statistical Computing | 
| ranger-kms-server | 2.0.0 | Apache Ranger Key Management System | 
| spark-client | 3.0.1-amzn-0 | Spark command-line clients. | 
| spark-history-server | 3.0.1-amzn-0 | Web UI for viewing logged events for the lifetime of a completed Spark application. | 
| spark-on-yarn | 3.0.1-amzn-0 | In-memory execution engine for YARN. | 
| spark-yarn-slave | 3.0.1-amzn-0 | Apache Spark libraries needed by YARN slaves. | 
| spark-rapids | 0.2.0 | Nvidia Spark RAPIDS plugin that accelerates Apache Spark with GPUs. | 
| sqoop-client | 1.4.7 | Apache Sqoop command-line client. | 
| tensorflow | 2.3.1 | TensorFlow open source software library for high performance numerical computation. | 
| tez-on-yarn | 0.9.2 | The tez YARN application and libraries. | 
| webserver | 2.4.41+ | Apache HTTP server. | 
| zeppelin-server | 0.9.0-preview1 | Web-based notebook that enables interactive data analytics. | 
| zookeeper-server | 3.4.14 | Centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. | 
| zookeeper-client | 3.4.14 | ZooKeeper command line client. | 

## 6.2.0 configuration classifications
<a name="emr-620-class"></a>

Configuration classifications allow you to customize applications. These often correspond to a configuration XML file for the application, such as `hive-site.xml`. For more information, see [Configure applications](emr-configure-apps.md).

Reconfiguration actions occur when you specify a configuration for instance groups in a running cluster. Amazon EMR only initiates reconfiguration actions for the classifications that you modify. For more information, see [Reconfigure an instance group in a running cluster](emr-configure-apps-running-cluster.md).
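As an illustration of how a classification is supplied when you reconfigure a running instance group, the following Python sketch builds a JSON payload with one classification. The instance group ID and the property value are hypothetical placeholders, and the helper names are not part of any AWS SDK. Check the payload shape against the reconfiguration topic linked above before you rely on it.

```python
import json
from typing import Dict, List


def classification(name: str, properties: Dict[str, str]) -> dict:
    """Build one configuration object for the given classification name."""
    return {"Classification": name, "Properties": properties, "Configurations": []}


def instance_group_reconfiguration(instance_group_id: str, configurations: List[dict]) -> List[dict]:
    """Wrap configuration objects in a per-instance-group payload."""
    return [{"InstanceGroupId": instance_group_id, "Configurations": configurations}]


if __name__ == "__main__":
    payload = instance_group_reconfiguration(
        "ig-EXAMPLE",  # hypothetical instance group ID
        [classification("yarn-site", {"yarn.nodemanager.disk-health-checker.enable": "true"})],
    )
    print(json.dumps(payload, indent=2))
```

You could save the printed JSON to a file and pass it to `aws emr modify-instance-groups` with the `--instance-groups file://...` option. Only the classifications in the payload trigger the reconfiguration actions listed in the table.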


**emr-6.2.0 classifications**  

| Classifications | Description | Reconfiguration Actions | 
| --- | --- | --- | 
| capacity-scheduler | Change values in Hadoop's capacity-scheduler.xml file. | Restarts the ResourceManager service. | 
| container-executor | Change values in Hadoop YARN's container-executor.cfg file. | Not available. | 
| container-log4j | Change values in Hadoop YARN's container-log4j.properties file. | Not available. | 
| core-site | Change values in Hadoop's core-site.xml file. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Ranger KMS, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| docker-conf | Change docker related settings. | Not available. | 
| emrfs-site | Change EMRFS settings. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts HBaseRegionserver, HBaseMaster, HBaseThrift, HBaseRest, HiveServer2, Hive MetaStore, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| flink-conf | Change flink-conf.yaml settings. | Not available. | 
| flink-log4j | Change Flink log4j.properties settings. | Not available. | 
| flink-log4j-yarn-session | Change Flink log4j-yarn-session.properties settings. | Not available. | 
| flink-log4j-cli | Change Flink log4j-cli.properties settings. | Not available. | 
| hadoop-env | Change values in the Hadoop environment for all Hadoop components. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts PhoenixQueryserver, HiveServer2, Hive MetaStore, and MapReduce-HistoryServer. | 
| hadoop-log4j | Change values in Hadoop's log4j.properties file. | Restarts the Hadoop HDFS services SecondaryNamenode, Datanode, and Journalnode. Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Hadoop KMS, Hadoop Httpfs, and MapReduce-HistoryServer. | 
| hadoop-ssl-server | Change hadoop ssl server configuration | Not available. | 
| hadoop-ssl-client | Change hadoop ssl client configuration | Not available. | 
| hbase | Amazon EMR-curated settings for Apache HBase. | Custom EMR specific property. Sets emrfs-site and hbase-site configs. See those for their associated restarts. | 
| hbase-env | Change values in HBase's environment. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-log4j | Change values in HBase's hbase-log4j.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-metrics | Change values in HBase's hadoop-metrics2-hbase.properties file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. | 
| hbase-policy | Change values in HBase's hbase-policy.xml file. | Not available. | 
| hbase-site | Change values in HBase's hbase-site.xml file. | Restarts the HBase services RegionServer, HBaseMaster, ThriftServer, RestServer. Additionally restarts Phoenix QueryServer. | 
| hdfs-encryption-zones | Configure HDFS encryption zones. | This classification should not be reconfigured. | 
| hdfs-env | Change values in the HDFS environment. | Restarts Hadoop HDFS ZKFC. | 
| hdfs-site | Change values in HDFS's hdfs-site.xml. | Restarts the Hadoop HDFS services Namenode, SecondaryNamenode, Datanode, ZKFC, and Journalnode. Additionally restarts Hadoop Httpfs. | 
| hcatalog-env | Change values in HCatalog's environment. | Restarts Hive HCatalog Server. | 
| hcatalog-server-jndi | Change values in HCatalog's jndi.properties. | Restarts Hive HCatalog Server. | 
| hcatalog-server-proto-hive-site | Change values in HCatalog's proto-hive-site.xml. | Restarts Hive HCatalog Server. | 
| hcatalog-webhcat-env | Change values in HCatalog WebHCat's environment. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-log4j2 | Change values in HCatalog WebHCat's log4j2.properties. | Restarts Hive WebHCat server. | 
| hcatalog-webhcat-site | Change values in HCatalog WebHCat's webhcat-site.xml file. | Restarts Hive WebHCat server. | 
| hive | Amazon EMR-curated settings for Apache Hive. | Sets configurations to launch Hive LLAP service. | 
| hive-beeline-log4j2 | Change values in Hive's beeline-log4j2.properties file. | Not available. | 
| hive-parquet-logging | Change values in Hive's parquet-logging.properties file. | Not available. | 
| hive-env | Change values in the Hive environment. | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. | 
| hive-exec-log4j2 | Change values in Hive's hive-exec-log4j2.properties file. | Not available. | 
| hive-llap-daemon-log4j2 | Change values in Hive's llap-daemon-log4j2.properties file. | Not available. | 
| hive-log4j2 | Change values in Hive's hive-log4j2.properties file. | Not available. | 
| hive-site | Change values in Hive's hive-site.xml file | Restarts HiveServer2, HiveMetastore, and Hive HCatalog-Server. Runs Hive schemaTool CLI commands to verify hive-metastore. Also restarts Oozie and Zeppelin. | 
| hiveserver2-site | Change values in Hive Server2's hiveserver2-site.xml file | Not available. | 
| hue-ini | Change values in Hue's ini file | Restarts Hue. Also activates Hue config override CLI commands to pick up new configurations. | 
| httpfs-env | Change values in the HTTPFS environment. | Restarts Hadoop Httpfs service. | 
| httpfs-site | Change values in Hadoop's httpfs-site.xml file. | Restarts Hadoop Httpfs service. | 
| hadoop-kms-acls | Change values in Hadoop's kms-acls.xml file. | Not available. | 
| hadoop-kms-env | Change values in the Hadoop KMS environment. | Restarts Hadoop-KMS service. | 
| hadoop-kms-log4j | Change values in Hadoop's kms-log4j.properties file. | Not available. | 
| hadoop-kms-site | Change values in Hadoop's kms-site.xml file. | Restarts Hadoop-KMS and Ranger-KMS service. | 
| hudi-env | Change values in the Hudi environment. | Not available. | 
| jupyter-notebook-conf | Change values in Jupyter Notebook's jupyter_notebook_config.py file. | Not available. | 
| jupyter-hub-conf | Change values in JupyterHub's jupyterhub_config.py file. | Not available. | 
| jupyter-s3-conf | Configure Jupyter Notebook S3 persistence. | Not available. | 
| jupyter-sparkmagic-conf | Change values in Sparkmagic's config.json file. | Not available. | 
| livy-conf | Change values in Livy's livy.conf file. | Restarts Livy Server. | 
| livy-env | Change values in the Livy environment. | Restarts Livy Server. | 
| livy-log4j | Change Livy log4j.properties settings. | Restarts Livy Server. | 
| mapred-env | Change values in the MapReduce application's environment. | Restarts Hadoop MapReduce-HistoryServer. | 
| mapred-site | Change values in the MapReduce application's mapred-site.xml file. | Restarts Hadoop MapReduce-HistoryServer. | 
| oozie-env | Change values in Oozie's environment. | Restarts Oozie. | 
| oozie-log4j | Change values in Oozie's oozie-log4j.properties file. | Restarts Oozie. | 
| oozie-site | Change values in Oozie's oozie-site.xml file. | Restarts Oozie. | 
| phoenix-hbase-metrics | Change values in Phoenix's hadoop-metrics2-hbase.properties file. | Not available. | 
| phoenix-hbase-site | Change values in Phoenix's hbase-site.xml file. | Not available. | 
| phoenix-log4j | Change values in Phoenix's log4j.properties file. | Restarts Phoenix-QueryServer. | 
| phoenix-metrics | Change values in Phoenix's hadoop-metrics2-phoenix.properties file. | Not available. | 
| pig-env | Change values in the Pig environment. | Not available. | 
| pig-properties | Change values in Pig's pig.properties file. | Restarts Oozie. | 
| pig-log4j | Change values in Pig's log4j.properties file. | Not available. | 
| presto-log | Change values in Presto's log.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-config | Change values in Presto's config.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-password-authenticator | Change values in Presto's password-authenticator.properties file. | Not available. | 
| presto-env | Change values in Presto's presto-env.sh file. | Restarts Presto-Server (for PrestoDB) | 
| presto-node | Change values in Presto's node.properties file. | Not available. | 
| presto-connector-blackhole | Change values in Presto's blackhole.properties file. | Not available. | 
| presto-connector-cassandra | Change values in Presto's cassandra.properties file. | Not available. | 
| presto-connector-hive | Change values in Presto's hive.properties file. | Restarts Presto-Server (for PrestoDB) | 
| presto-connector-jmx | Change values in Presto's jmx.properties file. | Not available. | 
| presto-connector-kafka | Change values in Presto's kafka.properties file. | Not available. | 
| presto-connector-localfile | Change values in Presto's localfile.properties file. | Not available. | 
| presto-connector-memory | Change values in Presto's memory.properties file. | Not available. | 
| presto-connector-mongodb | Change values in Presto's mongodb.properties file. | Not available. | 
| presto-connector-mysql | Change values in Presto's mysql.properties file. | Not available. | 
| presto-connector-postgresql | Change values in Presto's postgresql.properties file. | Not available. | 
| presto-connector-raptor | Change values in Presto's raptor.properties file. | Not available. | 
| presto-connector-redis | Change values in Presto's redis.properties file. | Not available. | 
| presto-connector-redshift | Change values in Presto's redshift.properties file. | Not available. | 
| presto-connector-tpch | Change values in Presto's tpch.properties file. | Not available. | 
| presto-connector-tpcds | Change values in Presto's tpcds.properties file. | Not available. | 
| prestosql-log | Change values in Presto's log.properties file. | Restarts Presto-Server (for PrestoSQL) | 
| prestosql-config | Change values in Presto's config.properties file. | Restarts Presto-Server (for PrestoSQL) | 
| prestosql-password-authenticator | Change values in Presto's password-authenticator.properties file. | Restarts Presto-Server (for PrestoSQL) | 
| prestosql-env | Change values in Presto's presto-env.sh file. | Restarts Presto-Server (for PrestoSQL) | 
| prestosql-node | Change values in PrestoSQL's node.properties file. | Not available. | 
| prestosql-connector-blackhole | Change values in PrestoSQL's blackhole.properties file. | Not available. | 
| prestosql-connector-cassandra | Change values in PrestoSQL's cassandra.properties file. | Not available. | 
| prestosql-connector-hive | Change values in PrestoSQL's hive.properties file. | Restarts Presto-Server (for PrestoSQL) | 
| prestosql-connector-jmx | Change values in PrestoSQL's jmx.properties file. | Not available. | 
| prestosql-connector-kafka | Change values in PrestoSQL's kafka.properties file. | Not available. | 
| prestosql-connector-localfile | Change values in PrestoSQL's localfile.properties file. | Not available. | 
| prestosql-connector-memory | Change values in PrestoSQL's memory.properties file. | Not available. | 
| prestosql-connector-mongodb | Change values in PrestoSQL's mongodb.properties file. | Not available. | 
| prestosql-connector-mysql | Change values in PrestoSQL's mysql.properties file. | Not available. | 
| prestosql-connector-postgresql | Change values in PrestoSQL's postgresql.properties file. | Not available. | 
| prestosql-connector-raptor | Change values in PrestoSQL's raptor.properties file. | Not available. | 
| prestosql-connector-redis | Change values in PrestoSQL's redis.properties file. | Not available. | 
| prestosql-connector-redshift | Change values in PrestoSQL's redshift.properties file. | Not available. | 
| prestosql-connector-tpch | Change values in PrestoSQL's tpch.properties file. | Not available. | 
| prestosql-connector-tpcds | Change values in PrestoSQL's tpcds.properties file. | Not available. | 
| ranger-kms-dbks-site | Change values in dbks-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-site | Change values in ranger-kms-site.xml file of Ranger KMS. | Restarts Ranger KMS Server. | 
| ranger-kms-env | Change values in the Ranger KMS environment. | Restarts Ranger KMS Server. | 
| ranger-kms-log4j | Change values in kms-log4j.properties file of Ranger KMS. | Not available. | 
| ranger-kms-db-ca | Change values for CA file on S3 for MySQL SSL connection with Ranger KMS. | Not available. | 
| spark | Amazon EMR-curated settings for Apache Spark. | This property modifies spark-defaults. See actions there. | 
| spark-defaults | Change values in Spark's spark-defaults.conf file. | Restarts Spark history server and Spark thrift server. | 
| spark-env | Change values in the Spark environment. | Restarts Spark history server and Spark thrift server. | 
| spark-hive-site | Change values in Spark's hive-site.xml file | Not available. | 
| spark-log4j | Change values in Spark's log4j.properties file. | Restarts Spark history server and Spark thrift server. | 
| spark-metrics | Change values in Spark's metrics.properties file. | Restarts Spark history server and Spark thrift server. | 
| sqoop-env | Change values in Sqoop's environment. | Not available. | 
| sqoop-oraoop-site | Change values in Sqoop OraOop's oraoop-site.xml file. | Not available. | 
| sqoop-site | Change values in Sqoop's sqoop-site.xml file. | Not available. | 
| tez-site | Change values in Tez's tez-site.xml file. | Restarts Oozie. | 
| yarn-env | Change values in the YARN environment. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts MapReduce-HistoryServer. | 
| yarn-site | Change values in YARN's yarn-site.xml file. | Restarts the Hadoop YARN services ResourceManager, NodeManager, ProxyServer, and TimelineServer. Additionally restarts Livy Server and MapReduce-HistoryServer. | 
| zeppelin-env | Change values in the Zeppelin environment. | Restarts Zeppelin. | 
| zookeeper-config | Change values in ZooKeeper's zoo.cfg file. | Restarts Zookeeper server. | 
| zookeeper-log4j | Change values in ZooKeeper's log4j.properties file. | Restarts Zookeeper server. | 

# Amazon EMR release 6.1.1
<a name="emr-611-release"></a>

## 6.1.1 application versions
<a name="emr-611-app-versions"></a>

This release includes the following applications: [Flink](https://flink.apache.org/), [Ganglia](http://ganglia.info), [HBase](http://hbase.apache.org/), [HCatalog](https://cwiki.apache.org/confluence/display/Hive/HCatalog), [Hadoop](http://hadoop.apache.org/docs/current/), [Hive](http://hive.apache.org/), [Hudi](https://hudi.apache.org), [Hue](http://gethue.com/), [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/#), [Livy](https://livy.incubator.apache.org/), [MXNet](https://mxnet.incubator.apache.org/), [Oozie](http://oozie.apache.org/), [Phoenix](https://phoenix.apache.org/), [Pig](http://pig.apache.org/), [Presto](https://prestodb.io/), [PrestoSQL](https://prestosql.io/), [Spark](https://spark.apache.org/docs/latest/), [Sqoop](http://sqoop.apache.org/), [TensorFlow](https://www.tensorflow.org/), [Tez](https://tez.apache.org/), [Zeppelin](https://zeppelin.incubator.apache.org/), and [ZooKeeper](https://zookeeper.apache.org).

The table below lists the application versions available in this release of Amazon EMR and the application versions in the preceding three Amazon EMR releases (when applicable).

For a comprehensive history of application versions for each release of Amazon EMR, see the following topics:
+ [Application versions in Amazon EMR 7.x releases](emr-release-app-versions-7.x.md)
+ [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md)
+ [Application versions in Amazon EMR 5.x releases](emr-release-app-versions-5.x.md)
+ [Application versions in Amazon EMR 4.x releases](emr-release-app-versions-4.x.md)


**Application version information**  

|  | emr-6.1.1 | emr-6.1.0 | emr-6.0.1 | emr-6.0.0 | 
| --- | --- | --- | --- | --- | 
| AWS SDK for Java | 1.11.828 | 1.11.828 | 1.11.711 | 1.11.711 | 
| Python | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 
| Scala | 2.12.10 | 2.12.10 | 2.12.10 | 2.11.12 | 
| AmazonCloudWatchAgent |  -  |  -  |  -  |  -  | 
| Delta |  -  |  -  |  -  |  -  | 
| Flink | 1.11.0 | 1.11.0 |  -  |  -  | 
| Ganglia | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 
| HBase | 2.2.5 | 2.2.5 | 2.2.3 | 2.2.3 | 
| HCatalog | 3.1.2-amzn-2 | 3.1.2-amzn-2 | 3.1.2-amzn-0 | 3.1.2-amzn-0 | 
| Hadoop | 3.2.1-amzn-1.1 | 3.2.1-amzn-1 | 3.2.1-amzn-0.1 | 3.2.1-amzn-0 | 
| Hive | 3.1.2-amzn-2 | 3.1.2-amzn-2 | 3.1.2-amzn-0 | 3.1.2-amzn-0 | 
| Hudi | 0.5.2-incubating-amzn-2 | 0.5.2-incubating-amzn-2 | 0.5.0-incubating-amzn-1 | 0.5.0-incubating-amzn-1 | 
| Hue | 4.7.1 | 4.7.1 | 4.4.0 | 4.4.0 | 
| Iceberg |  -  |  -  |  -  |  -  | 
| JupyterEnterpriseGateway |  -  |  -  |  -  |  -  | 
| JupyterHub | 1.1.0 | 1.1.0 | 1.0.0 | 1.0.0 | 
| Livy | 0.7.0-incubating | 0.7.0-incubating | 0.6.0-incubating | 0.6.0-incubating | 
| MXNet | 1.6.0 | 1.6.0 | 1.5.1 | 1.5.1 | 
| Mahout |  -  |  -  |  -  |  -  | 
| Oozie | 5.2.0 | 5.2.0 | 5.1.0 | 5.1.0 | 
| Phoenix | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 
| Pig | 0.17.0 | 0.17.0 |  -  |  -  | 
| Presto | 0.232 | 0.232 | 0.230 | 0.230 | 
| Spark | 3.0.0-amzn-0.1 | 3.0.0-amzn-0 | 2.4.4 | 2.4.4 | 
| Sqoop | 1.4.7 | 1.4.7 |  -  |  -  | 
| TensorFlow | 2.1.0 | 2.1.0 | 1.14.0 | 1.14.0 | 
| Tez | 0.9.2 | 0.9.2 | 0.9.2 | 0.9.2 | 
| Trino (PrestoSQL) | 338 | 338 |  -  |  -  | 
| Zeppelin | 0.9.0-preview1 | 0.9.0-preview1 | 0.9.0-SNAPSHOT | 0.9.0-SNAPSHOT | 
| ZooKeeper | 3.4.14 | 3.4.14 | 3.4.14 | 3.4.14 | 

## 6.1.1 release notes
<a name="emr-611-relnotes"></a>

This release fixes issues where Amazon EMR Scaling fails to scale a cluster up or down successfully, or causes application failures.

**Changes, Enhancements, and Resolved Issues**
+ Fixed an issue where scaling requests failed for a large, highly utilized cluster when Amazon EMR on-cluster daemons were running health checking activities, such as gathering YARN node state and HDFS node state. This was happening because on-cluster daemons were not able to communicate the health status data of a node to internal Amazon EMR components.
+ Improved EMR on-cluster daemons to correctly track the node states when IP addresses are reused to improve reliability during scaling operations.
+ [SPARK-29683](https://issues.apache.org/jira/browse/SPARK-29683). Fixed an issue where job failures occurred during cluster scale-down as Spark was assuming all available nodes were deny-listed.
+ [YARN-9011](https://issues.apache.org/jira/browse/YARN-9011). Fixed an issue where job failures occurred due to a race condition in YARN decommissioning when cluster tried to scale up or down.
+ Fixed an issue with step or job failures during cluster scaling by ensuring that the node states are always consistent between the Amazon EMR on-cluster daemons and YARN/HDFS.
+ Fixed an issue where cluster operations such as scale down and step submission failed for Amazon EMR clusters enabled with Kerberos authentication. This was because the Amazon EMR on-cluster daemon did not renew the Kerberos ticket, which is required to securely communicate with HDFS/YARN running on the primary node.
+ Newer Amazon EMR releases fix the issue with a lower "Max open files" limit on older AL2 in Amazon EMR. Amazon EMR releases 5.30.1, 5.30.2, 5.31.1, 5.32.1, 6.0.1, 6.1.1, 6.2.1, 5.33.0, 6.3.0 and later now include a permanent fix with a higher "Max open files" setting.
+ HTTPS is now enabled by default for Amazon Linux repositories. If you are using an Amazon S3 VPCE policy to restrict access to specific buckets, you must add the new Amazon Linux bucket ARN `arn:aws:s3:::amazonlinux-2-repos-$region/*` to your policy (replace `$region` with the Region where the endpoint is). For more information, see [Announcement: Amazon Linux 2 now supports the ability to use HTTPS while connecting to package repositories](https://forums.aws.amazon.com/ann.jspa?annID=8528) in the AWS discussion forums.
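For the Amazon Linux repository ARN in the last item, the following Python sketch shows how the `$region` placeholder might be filled in and placed in a policy statement. The ARN pattern comes from the release note. The statement shape (`Effect`, `Principal`, `Action`) is an assumption about a typical VPC endpoint policy, so adapt it to the policy you already use. The function names are hypothetical.

```python
import json


def amazon_linux_2_repo_arn(region: str) -> str:
    """Fill in the $region placeholder in the Amazon Linux 2 repository ARN."""
    return f"arn:aws:s3:::amazonlinux-2-repos-{region}/*"


def repo_policy_statement(region: str) -> dict:
    """Build an allow statement for the repository bucket (assumed statement shape)."""
    return {
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": [amazon_linux_2_repo_arn(region)],
    }


if __name__ == "__main__":
    # us-east-1 is only an example; use the Region where your endpoint is.
    print(json.dumps(repo_policy_statement("us-east-1"), indent=2))
```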

## 6.1.1 component versions
<a name="emr-611-components"></a>

The components that Amazon EMR installs with this release are listed below. Some are installed as part of big-data application packages. Others are unique to Amazon EMR and installed for system processes and features. These typically start with `emr` or `aws`. Big-data application packages in the most recent Amazon EMR release are usually the latest version found in the community. We make community releases available in Amazon EMR as quickly as possible.

Some components in Amazon EMR differ from community versions. These components have a version label in the form `CommunityVersion-amzn-EmrVersion`. The `EmrVersion` starts at 0. For example, if an open source community component named `myapp-component` with version 2.2 has been modified three times for inclusion in different Amazon EMR releases, its release version is listed as `2.2-amzn-2`.
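
To make the label format concrete, the following sketch splits a version label into its community part and its `EmrVersion` part using shell parameter expansion. The function name `split_version_label` and the output format are assumptions made for this example. Labels without an `-amzn-` marker are treated as unmodified community versions.

```bash
#!/bin/bash
# Split a label of the form CommunityVersion-amzn-EmrVersion.
# split_version_label is a hypothetical helper, not an Amazon EMR tool.
split_version_label() {
  local label="$1"
  case "$label" in
    *-amzn-*)
      # Text before the last "-amzn-" is the community version,
      # text after it is the EMR revision.
      printf 'community=%s emr=%s\n' "${label%-amzn-*}" "${label##*-amzn-}"
      ;;
    *)
      # No "-amzn-" marker: unmodified community version.
      printf 'community=%s emr=\n' "$label"
      ;;
  esac
}

split_version_label 2.2-amzn-2                # community=2.2 emr=2
split_version_label 0.5.2-incubating-amzn-2   # community=0.5.2-incubating emr=2
```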


| Component | Version | Description | 
| --- | --- | --- | 
| aws-sagemaker-spark-sdk | 1.3.0 | Amazon SageMaker Spark SDK | 
| emr-ddb | 4.14.0 | Amazon DynamoDB connector for Hadoop ecosystem applications. | 
| emr-goodies | 3.1.0 | Extra convenience libraries for the Hadoop ecosystem. | 
| emr-kinesis | 3.5.0 | Amazon Kinesis connector for Hadoop ecosystem applications. | 
| emr-s3-dist-cp | 2.14.0 | Distributed copy application optimized for Amazon S3. | 
| emr-s3-select | 2.0.0 | EMR S3Select Connector | 
| emrfs | 2.42.0 | Amazon S3 connector for Hadoop ecosystem applications. | 
| flink-client | 1.11.0 | Apache Flink command line client scripts and applications. | 
| ganglia-monitor | 3.7.2 | Embedded Ganglia agent for Hadoop ecosystem applications along with the Ganglia monitoring agent. | 
| ganglia-metadata-collector | 3.7.2 | Ganglia metadata collector for aggregating metrics from Ganglia monitoring agents. | 
| ganglia-web | 3.7.1 | Web application for viewing metrics collected by the Ganglia metadata collector. | 
| hadoop-client | 3.2.1-amzn-1.1 | Hadoop command-line clients such as 'hdfs', 'hadoop', or 'yarn'. | 
| hadoop-hdfs-datanode | 3.2.1-amzn-1.1 | HDFS node-level service for storing blocks. | 
| hadoop-hdfs-library | 3.2.1-amzn-1.1 | HDFS command-line client and library | 
| hadoop-hdfs-namenode | 3.2.1-amzn-1.1 | HDFS service for tracking file names and block locations. | 
| hadoop-hdfs-journalnode | 3.2.1-amzn-1.1 | HDFS service for managing the Hadoop filesystem journal on HA clusters. | 
| hadoop-httpfs-server | 3.2.1-amzn-1.1 | HTTP endpoint for HDFS operations. | 
| hadoop-kms-server | 3.2.1-amzn-1.1 | Cryptographic key management server based on Hadoop's KeyProvider API. | 
| hadoop-mapred | 3.2.1-amzn-1.1 | MapReduce execution engine libraries for running a MapReduce application. | 
| hadoop-yarn-nodemanager | 3.2.1-amzn-1.1 | YARN service for managing containers on an individual node. | 
| hadoop-yarn-resourcemanager | 3.2.1-amzn-1.1 | YARN service for allocating and managing cluster resources and distributed applications. | 
| hadoop-yarn-timeline-server | 3.2.1-amzn-1.1 | Service for retrieving current and historical information for YARN applications. | 
| hbase-hmaster | 2.2.5 | Service for an HBase cluster responsible for coordination of Regions and execution of administrative commands. | 
| hbase-region-server | 2.2.5 | Service for serving one or more HBase regions. | 
| hbase-client | 2.2.5 | HBase command-line client. | 
| hbase-rest-server | 2.2.5 | Service providing a RESTful HTTP endpoint for HBase. | 
| hbase-thrift-server | 2.2.5 | Service providing a Thrift endpoint to HBase. | 
| hcatalog-client | 3.1.2-amzn-2 | The 'hcat' command line client for manipulating hcatalog-server. | 
| hcatalog-server | 3.1.2-amzn-2 | Service providing HCatalog, a table and storage management layer for distributed applications. | 
| hcatalog-webhcat-server | 3.1.2-amzn-2 | HTTP endpoint providing a REST interface to HCatalog. | 
| hive-client | 3.1.2-amzn-2 | Hive command line client. | 
| hive-hbase | 3.1.2-amzn-2 | Hive-hbase client. | 
| hive-metastore-server | 3.1.2-amzn-2 | Service for accessing the Hive metastore, a semantic repository storing metadata for SQL on Hadoop operations. | 
| hive-server2 | 3.1.2-amzn-2 | Service for accepting Hive queries as web requests. | 
| hudi | 0.5.2-incubating-amzn-2 | Incremental processing framework to power data pipelines at low latency and high efficiency. | 
| hudi-presto | 0.5.2-incubating-amzn-2 | Bundle library for running Presto with Hudi. | 
| hudi-prestosql | 0.5.2-incubating-amzn-2 | Bundle library for running PrestoSQL with Hudi. | 
| hudi-spark | 0.5.2-incubating-amzn-2 | Bundle library for running Spark with Hudi. | 
| hue-server | 4.7.1 | Web application for analyzing data using Hadoop ecosystem applications | 
| jupyterhub | 1.1.0 | Multi-user server for Jupyter notebooks | 
| livy-server | 0.7.0-incubating | REST interface for interacting with Apache Spark | 
| nginx | 1.12.1 | nginx [engine x] is an HTTP and reverse proxy server | 
| mxnet | 1.6.0 | A flexible, scalable, and efficient library for deep learning. | 
| mariadb-server | 5.5.64+ | MariaDB database server. | 
| nvidia-cuda | 9.2.88 | Nvidia drivers and Cuda toolkit | 
| oozie-client | 5.2.0 | Oozie command-line client. | 
| oozie-server | 5.2.0 | Service for accepting Oozie workflow requests. | 
| opencv | 4.3.0 | Open Source Computer Vision Library. | 
| phoenix-library | 5.0.0-HBase-2.0 | The phoenix libraries for server and client | 
| phoenix-query-server | 5.0.0-HBase-2.0 | A light weight server providing JDBC access as well as Protocol Buffers and JSON format access to the Avatica API  | 
| presto-coordinator | 0.232 | Service for accepting queries and managing query execution among presto-workers. | 
| presto-worker | 0.232 | Service for executing pieces of a query. | 
| presto-client | 0.232 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| prestosql-coordinator | 338 | Service for accepting queries and managing query execution among prestosql-workers. | 
| prestosql-worker | 338 | Service for executing pieces of a query. | 
| prestosql-client | 338 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| pig-client | 0.17.0 | Pig command-line client. | 
| r | 3.4.3 | The R Project for Statistical Computing | 
| ranger-kms-server | 2.0.0 | Apache Ranger Key Management System | 
| spark-client | 3.0.0-amzn-0.1 | Spark command-line clients. | 
| spark-history-server | 3.0.0-amzn-0.1 | Web UI for viewing logged events for the lifetime of a completed Spark application. | 
| spark-on-yarn | 3.0.0-amzn-0.1 | In-memory execution engine for YARN. | 
| spark-yarn-slave | 3.0.0-amzn-0.1 | Apache Spark libraries needed by YARN slaves. | 
| sqoop-client | 1.4.7 | Apache Sqoop command-line client. | 
| tensorflow | 2.1.0 | TensorFlow open source software library for high performance numerical computation. | 
| tez-on-yarn | 0.9.2 | The tez YARN application and libraries. | 
| webserver | 2.4.41+ | Apache HTTP server. | 
| zeppelin-server | 0.9.0-preview1 | Web-based notebook that enables interactive data analytics. | 
| zookeeper-server | 3.4.14 | Centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. | 
| zookeeper-client | 3.4.14 | ZooKeeper command line client. | 

## 6.1.1 configuration classifications
<a name="emr-611-class"></a>

Configuration classifications allow you to customize applications. These often correspond to a configuration XML file for the application, such as `hive-site.xml`. For more information, see [Configure applications](emr-configure-apps.md).


**emr-6.1.1 classifications**  

| Classifications | Description | 
| --- | --- | 
| capacity-scheduler | Change values in Hadoop's capacity-scheduler.xml file. | 
| container-executor | Change values in Hadoop YARN's container-executor.cfg file. | 
| container-log4j | Change values in Hadoop YARN's container-log4j.properties file. | 
| core-site | Change values in Hadoop's core-site.xml file. | 
| emrfs-site | Change EMRFS settings. | 
| flink-conf | Change flink-conf.yaml settings. | 
| flink-log4j | Change Flink log4j.properties settings. | 
| flink-log4j-yarn-session | Change Flink log4j-yarn-session.properties settings. | 
| flink-log4j-cli | Change Flink log4j-cli.properties settings. | 
| hadoop-env | Change values in the Hadoop environment for all Hadoop components. | 
| hadoop-log4j | Change values in Hadoop's log4j.properties file. | 
| hadoop-ssl-server | Change Hadoop SSL server configuration. | 
| hadoop-ssl-client | Change Hadoop SSL client configuration. | 
| hbase | Amazon EMR-curated settings for Apache HBase. | 
| hbase-env | Change values in HBase's environment. | 
| hbase-log4j | Change values in HBase's hbase-log4j.properties file. | 
| hbase-metrics | Change values in HBase's hadoop-metrics2-hbase.properties file. | 
| hbase-policy | Change values in HBase's hbase-policy.xml file. | 
| hbase-site | Change values in HBase's hbase-site.xml file. | 
| hdfs-encryption-zones | Configure HDFS encryption zones. | 
| hdfs-env | Change values in the HDFS environment. | 
| hdfs-site | Change values in HDFS's hdfs-site.xml. | 
| hcatalog-env | Change values in HCatalog's environment. | 
| hcatalog-server-jndi | Change values in HCatalog's jndi.properties. | 
| hcatalog-server-proto-hive-site | Change values in HCatalog's proto-hive-site.xml. | 
| hcatalog-webhcat-env | Change values in HCatalog WebHCat's environment. | 
| hcatalog-webhcat-log4j2 | Change values in HCatalog WebHCat's log4j2.properties. | 
| hcatalog-webhcat-site | Change values in HCatalog WebHCat's webhcat-site.xml file. | 
| hive | Amazon EMR-curated settings for Apache Hive. | 
| hive-beeline-log4j2 | Change values in Hive's beeline-log4j2.properties file. | 
| hive-parquet-logging | Change values in Hive's parquet-logging.properties file. | 
| hive-env | Change values in the Hive environment. | 
| hive-exec-log4j2 | Change values in Hive's hive-exec-log4j2.properties file. | 
| hive-llap-daemon-log4j2 | Change values in Hive's llap-daemon-log4j2.properties file. | 
| hive-log4j2 | Change values in Hive's hive-log4j2.properties file. | 
| hive-site | Change values in Hive's hive-site.xml file | 
| hiveserver2-site | Change values in Hive Server2's hiveserver2-site.xml file | 
| hue-ini | Change values in Hue's ini file | 
| httpfs-env | Change values in the HTTPFS environment. | 
| httpfs-site | Change values in Hadoop's httpfs-site.xml file. | 
| hadoop-kms-acls | Change values in Hadoop's kms-acls.xml file. | 
| hadoop-kms-env | Change values in the Hadoop KMS environment. | 
| hadoop-kms-log4j | Change values in Hadoop's kms-log4j.properties file. | 
| hadoop-kms-site | Change values in Hadoop's kms-site.xml file. | 
| hudi-env | Change values in the Hudi environment. | 
| jupyter-notebook-conf | Change values in Jupyter Notebook's jupyter\_notebook\_config.py file. | 
| jupyter-hub-conf | Change values in JupyterHub's jupyterhub\_config.py file. | 
| jupyter-s3-conf | Configure Jupyter Notebook S3 persistence. | 
| jupyter-sparkmagic-conf | Change values in Sparkmagic's config.json file. | 
| livy-conf | Change values in Livy's livy.conf file. | 
| livy-env | Change values in the Livy environment. | 
| livy-log4j | Change Livy log4j.properties settings. | 
| mapred-env | Change values in the MapReduce application's environment. | 
| mapred-site | Change values in the MapReduce application's mapred-site.xml file. | 
| oozie-env | Change values in Oozie's environment. | 
| oozie-log4j | Change values in Oozie's oozie-log4j.properties file. | 
| oozie-site | Change values in Oozie's oozie-site.xml file. | 
| phoenix-hbase-metrics | Change values in Phoenix's hadoop-metrics2-hbase.properties file. | 
| phoenix-hbase-site | Change values in Phoenix's hbase-site.xml file. | 
| phoenix-log4j | Change values in Phoenix's log4j.properties file. | 
| phoenix-metrics | Change values in Phoenix's hadoop-metrics2-phoenix.properties file. | 
| pig-env | Change values in the Pig environment. | 
| pig-properties | Change values in Pig's pig.properties file. | 
| pig-log4j | Change values in Pig's log4j.properties file. | 
| presto-log | Change values in Presto's log.properties file. | 
| presto-config | Change values in Presto's config.properties file. | 
| presto-password-authenticator | Change values in Presto's password-authenticator.properties file. | 
| presto-env | Change values in Presto's presto-env.sh file. | 
| presto-node | Change values in Presto's node.properties file. | 
| presto-connector-blackhole | Change values in Presto's blackhole.properties file. | 
| presto-connector-cassandra | Change values in Presto's cassandra.properties file. | 
| presto-connector-hive | Change values in Presto's hive.properties file. | 
| presto-connector-jmx | Change values in Presto's jmx.properties file. | 
| presto-connector-kafka | Change values in Presto's kafka.properties file. | 
| presto-connector-localfile | Change values in Presto's localfile.properties file. | 
| presto-connector-memory | Change values in Presto's memory.properties file. | 
| presto-connector-mongodb | Change values in Presto's mongodb.properties file. | 
| presto-connector-mysql | Change values in Presto's mysql.properties file. | 
| presto-connector-postgresql | Change values in Presto's postgresql.properties file. | 
| presto-connector-raptor | Change values in Presto's raptor.properties file. | 
| presto-connector-redis | Change values in Presto's redis.properties file. | 
| presto-connector-redshift | Change values in Presto's redshift.properties file. | 
| presto-connector-tpch | Change values in Presto's tpch.properties file. | 
| presto-connector-tpcds | Change values in Presto's tpcds.properties file. | 
| prestosql-log | Change values in PrestoSQL's log.properties file. | 
| prestosql-config | Change values in PrestoSQL's config.properties file. | 
| prestosql-password-authenticator | Change values in PrestoSQL's password-authenticator.properties file. | 
| prestosql-env | Change values in PrestoSQL's presto-env.sh file. | 
| prestosql-node | Change values in PrestoSQL's node.properties file. | 
| prestosql-connector-blackhole | Change values in PrestoSQL's blackhole.properties file. | 
| prestosql-connector-cassandra | Change values in PrestoSQL's cassandra.properties file. | 
| prestosql-connector-hive | Change values in PrestoSQL's hive.properties file. | 
| prestosql-connector-jmx | Change values in PrestoSQL's jmx.properties file. | 
| prestosql-connector-kafka | Change values in PrestoSQL's kafka.properties file. | 
| prestosql-connector-localfile | Change values in PrestoSQL's localfile.properties file. | 
| prestosql-connector-memory | Change values in PrestoSQL's memory.properties file. | 
| prestosql-connector-mongodb | Change values in PrestoSQL's mongodb.properties file. | 
| prestosql-connector-mysql | Change values in PrestoSQL's mysql.properties file. | 
| prestosql-connector-postgresql | Change values in PrestoSQL's postgresql.properties file. | 
| prestosql-connector-raptor | Change values in PrestoSQL's raptor.properties file. | 
| prestosql-connector-redis | Change values in PrestoSQL's redis.properties file. | 
| prestosql-connector-redshift | Change values in PrestoSQL's redshift.properties file. | 
| prestosql-connector-tpch | Change values in PrestoSQL's tpch.properties file. | 
| prestosql-connector-tpcds | Change values in PrestoSQL's tpcds.properties file. | 
| ranger-kms-dbks-site | Change values in dbks-site.xml file of Ranger KMS. | 
| ranger-kms-site | Change values in ranger-kms-site.xml file of Ranger KMS. | 
| ranger-kms-env | Change values in the Ranger KMS environment. | 
| ranger-kms-log4j | Change values in kms-log4j.properties file of Ranger KMS. | 
| ranger-kms-db-ca | Change values for CA file on S3 for MySQL SSL connection with Ranger KMS. | 
| spark | Amazon EMR-curated settings for Apache Spark. | 
| spark-defaults | Change values in Spark's spark-defaults.conf file. | 
| spark-env | Change values in the Spark environment. | 
| spark-hive-site | Change values in Spark's hive-site.xml file | 
| spark-log4j | Change values in Spark's log4j.properties file. | 
| spark-metrics | Change values in Spark's metrics.properties file. | 
| sqoop-env | Change values in Sqoop's environment. | 
| sqoop-oraoop-site | Change values in Sqoop OraOop's oraoop-site.xml file. | 
| sqoop-site | Change values in Sqoop's sqoop-site.xml file. | 
| tez-site | Change values in Tez's tez-site.xml file. | 
| yarn-env | Change values in the YARN environment. | 
| yarn-site | Change values in YARN's yarn-site.xml file. | 
| zeppelin-env | Change values in the Zeppelin environment. | 
| zookeeper-config | Change values in ZooKeeper's zoo.cfg file. | 
| zookeeper-log4j | Change values in ZooKeeper's log4j.properties file. | 
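
A classification from the table above is typically supplied as a JSON list of objects with `Classification` and `Properties` keys. The following sketch writes such a file for the `hive-site` classification. The file location and the choice of `hive.exec.parallel` as the property are illustrative assumptions. Substitute the properties that your workload needs.

```bash
#!/bin/bash
# Write a configuration file that overrides one hive-site.xml property.
# The path and the property shown here are example values only.
config_file="${TMPDIR:-/tmp}/emr-configurations.json"

cat > "$config_file" << 'EOF'
[
  {
    "Classification": "hive-site",
    "Properties": {
      "hive.exec.parallel": "true"
    }
  }
]
EOF

# The file can then be passed at cluster creation, for example:
#   aws emr create-cluster --release-label emr-6.1.1 \
#     --applications Name=Hive \
#     --configurations "file://$config_file" ...
echo "Wrote $config_file"
```

The quoted `'EOF'` delimiter keeps the shell from expanding anything inside the JSON body.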

# Amazon EMR release 6.1.0
<a name="emr-610-release"></a>

## 6.1.0 application versions
<a name="emr-610-app-versions"></a>

This release includes the following applications: [Flink](https://flink.apache.org/), [Ganglia](http://ganglia.info), [HBase](http://hbase.apache.org/), [HCatalog](https://cwiki.apache.org/confluence/display/Hive/HCatalog), [Hadoop](http://hadoop.apache.org/docs/current/), [Hive](http://hive.apache.org/), [Hudi](https://hudi.apache.org), [Hue](http://gethue.com/), [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/#), [Livy](https://livy.incubator.apache.org/), [MXNet](https://mxnet.incubator.apache.org/), [Oozie](http://oozie.apache.org/), [Phoenix](https://phoenix.apache.org/), [Pig](http://pig.apache.org/), [Presto](https://prestodb.io/), [PrestoSQL](https://prestosql.io/), [Spark](https://spark.apache.org/docs/latest/), [Sqoop](http://sqoop.apache.org/), [TensorFlow](https://www.tensorflow.org/), [Tez](https://tez.apache.org/), [Zeppelin](https://zeppelin.incubator.apache.org/), and [ZooKeeper](https://zookeeper.apache.org).

The table below lists the application versions available in this release of Amazon EMR and the application versions in the preceding three Amazon EMR releases (when applicable).

For a comprehensive history of application versions for each release of Amazon EMR, see the following topics:
+ [Application versions in Amazon EMR 7.x releases](emr-release-app-versions-7.x.md)
+ [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md)
+ [Application versions in Amazon EMR 5.x releases](emr-release-app-versions-5.x.md)
+ [Application versions in Amazon EMR 4.x releases](emr-release-app-versions-4.x.md)


**Application version information**  

|  | emr-6.1.1 | emr-6.1.0 | emr-6.0.1 | emr-6.0.0 | 
| --- | --- | --- | --- | --- | 
| AWS SDK for Java | 1.11.828 | 1.11.828 | 1.11.711 | 1.11.711 | 
| Python | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 
| Scala | 2.12.10 | 2.12.10 | 2.12.10 | 2.11.12 | 
| AmazonCloudWatchAgent |  -  |  -  |  -  |  -  | 
| Delta |  -  |  -  |  -  |  -  | 
| Flink | 1.11.0 | 1.11.0 |  -  |  -  | 
| Ganglia | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 
| HBase | 2.2.5 | 2.2.5 | 2.2.3 | 2.2.3 | 
| HCatalog | 3.1.2-amzn-2 | 3.1.2-amzn-2 | 3.1.2-amzn-0 | 3.1.2-amzn-0 | 
| Hadoop | 3.2.1-amzn-1.1 | 3.2.1-amzn-1 | 3.2.1-amzn-0.1 | 3.2.1-amzn-0 | 
| Hive | 3.1.2-amzn-2 | 3.1.2-amzn-2 | 3.1.2-amzn-0 | 3.1.2-amzn-0 | 
| Hudi | 0.5.2-incubating-amzn-2 | 0.5.2-incubating-amzn-2 | 0.5.0-incubating-amzn-1 | 0.5.0-incubating-amzn-1 | 
| Hue | 4.7.1 | 4.7.1 | 4.4.0 | 4.4.0 | 
| Iceberg |  -  |  -  |  -  |  -  | 
| JupyterEnterpriseGateway |  -  |  -  |  -  |  -  | 
| JupyterHub | 1.1.0 | 1.1.0 | 1.0.0 | 1.0.0 | 
| Livy | 0.7.0-incubating | 0.7.0-incubating | 0.6.0-incubating | 0.6.0-incubating | 
| MXNet | 1.6.0 | 1.6.0 | 1.5.1 | 1.5.1 | 
| Mahout |  -  |  -  |  -  |  -  | 
| Oozie | 5.2.0 | 5.2.0 | 5.1.0 | 5.1.0 | 
| Phoenix | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 
| Pig | 0.17.0 | 0.17.0 |  -  |  -  | 
| Presto | 0.232 | 0.232 | 0.230 | 0.230 | 
| Spark | 3.0.0-amzn-0.1 | 3.0.0-amzn-0 | 2.4.4 | 2.4.4 | 
| Sqoop | 1.4.7 | 1.4.7 |  -  |  -  | 
| TensorFlow | 2.1.0 | 2.1.0 | 1.14.0 | 1.14.0 | 
| Tez | 0.9.2 | 0.9.2 | 0.9.2 | 0.9.2 | 
| Trino (PrestoSQL) | 338 | 338 |  -  |  -  | 
| Zeppelin | 0.9.0-preview1 | 0.9.0-preview1 | 0.9.0-SNAPSHOT | 0.9.0-SNAPSHOT | 
| ZooKeeper | 3.4.14 | 3.4.14 | 3.4.14 | 3.4.14 | 

## 6.1.0 release notes
<a name="emr-610-relnotes"></a>

The following release notes include information for Amazon EMR release 6.1.0. Changes are relative to 6.0.0.

Initial release date: Sept 04, 2020

Last updated date: Oct 15, 2020

**Supported applications**
+ AWS SDK for Java version 1.11.828
+ Flink version 1.11.0
+ Ganglia version 3.7.2
+ Hadoop version 3.2.1-amzn-1
+ HBase version 2.2.5
+ HBase-operator-tools 1.0.0
+ HCatalog version 3.1.2-amzn-0
+ Hive version 3.1.2-amzn-1
+ Hudi version 0.5.2-incubating
+ Hue version 4.7.1
+ JupyterHub version 1.1.0
+ Livy version 0.7.0
+ MXNet version 1.6.0
+ Oozie version 5.2.0
+ Phoenix version 5.0.0
+ Presto version 0.232
+ PrestoSQL version 338
+ Spark version 3.0.0-amzn-0
+ TensorFlow version 2.1.0
+ Zeppelin version 0.9.0-preview1
+ ZooKeeper version 3.4.14
+ Connectors and drivers: DynamoDB Connector 4.14.0

**New features**
+ ARM instance types are supported starting with Amazon EMR version 5.30.0 and Amazon EMR version 6.1.0.
+ M6g general purpose instance types are supported starting with Amazon EMR versions 6.1.0 and 5.30.0. For more information, see [Supported Instance Types](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-supported-instance-types.html) in the *Amazon EMR Management Guide*.
+ The EC2 placement group feature is supported starting with Amazon EMR version 5.23.0 as an option for multiple primary node clusters. Currently, only primary node types are supported by the placement group feature, and the `SPREAD` strategy is applied to those primary nodes. The `SPREAD` strategy places a small group of instances across separate underlying hardware to guard against the loss of multiple primary nodes in the event of a hardware failure. For more information, see [EMR Integration with EC2 Placement Group](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-ha-placementgroup.html) in the *Amazon EMR Management Guide*.
+ Managed Scaling – With Amazon EMR version 6.1.0, you can enable Amazon EMR managed scaling to automatically increase or decrease the number of instances or units in your cluster based on workload. Amazon EMR continuously evaluates cluster metrics to make scaling decisions that optimize your clusters for cost and speed. Managed Scaling is also available on Amazon EMR version 5.30.0 and later, except 6.0.0. For more information, see [Scaling Cluster Resources](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-scale-on-demand.html) in the *Amazon EMR Management Guide*.
+ PrestoSQL version 338 is supported with EMR 6.1.0. For more information, see [Presto](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-presto.html).
  + PrestoSQL is supported on EMR 6.1.0 and later versions only, not on EMR 6.0.0 or EMR 5.x.
  + The application name `Presto` continues to be used to install PrestoDB on clusters. To install PrestoSQL on clusters, use the application name `PrestoSQL`.
  + You can install either PrestoDB or PrestoSQL, but you cannot install both on a single cluster. If both PrestoDB and PrestoSQL are specified when attempting to create a cluster, a validation error occurs and the cluster creation request fails.
  + PrestoSQL is supported on both single-master and multi-master clusters. On multi-master clusters, an external Hive metastore is required to run PrestoSQL or PrestoDB. See [Supported applications in an EMR cluster with multiple primary nodes](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-ha-applications.html#emr-plan-ha-applications-list).
+ ECR auto authentication support on Apache Hadoop and Apache Spark with Docker: Spark users can use Docker images from Docker Hub and Amazon Elastic Container Registry (Amazon ECR) to define environment and library dependencies.

  For more information, see [Configure Docker](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-docker.html) and [Run Spark Applications with Docker Using Amazon EMR 6.x](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-spark-docker.html).
+ EMR supports Apache Hive ACID transactions: Amazon EMR 6.1.0 adds support for Hive ACID transactions so that Hive complies with the ACID properties of a database. With this feature, you can run `INSERT`, `UPDATE`, `DELETE`, and `MERGE` operations in Hive managed tables with data in Amazon Simple Storage Service (Amazon S3). This is a key feature for use cases like streaming ingestion, data restatement, bulk updates using `MERGE`, and slowly changing dimensions. For more information, including configuration examples and use cases, see [Amazon EMR supports Apache Hive ACID transactions](https://aws.amazon.com/blogs/big-data/amazon-emr-supports-apache-hive-acid-transactions).
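
The PrestoDB and PrestoSQL rule in the list above (install one or the other, never both) can be checked locally before you submit a cluster request. The following is a minimal sketch of such a pre-check. It is not how Amazon EMR implements its validation. The function name `check_presto_apps` and the case-insensitive matching are assumptions made for this example.

```bash
#!/bin/bash
# Return 0 if the application list is acceptable, or 1 if it names both
# Presto (PrestoDB) and PrestoSQL. check_presto_apps is a hypothetical
# local helper, not part of Amazon EMR or the AWS CLI.
check_presto_apps() {
  local has_prestodb=0 has_prestosql=0 app
  for app in "$@"; do
    # Compare names in lowercase (an assumption for this sketch).
    case "$(printf '%s' "$app" | tr '[:upper:]' '[:lower:]')" in
      presto)    has_prestodb=1 ;;
      prestosql) has_prestosql=1 ;;
    esac
  done
  if [ "$has_prestodb" -eq 1 ] && [ "$has_prestosql" -eq 1 ]; then
    echo "error: specify either Presto or PrestoSQL, not both" >&2
    return 1
  fi
  return 0
}

check_presto_apps Hadoop Spark PrestoSQL && echo "application list OK"
```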

**Changes, enhancements, and resolved issues**
+ This is a release to fix issues with Amazon EMR Scaling when it fails to scale up/scale down a cluster successfully or causes application failures.
+ Fixed an issue where scaling requests failed for a large, highly utilized cluster when Amazon EMR on-cluster daemons were running health checking activities, such as gathering YARN node state and HDFS node state. This was happening because on-cluster daemons were not able to communicate the health status data of a node to internal Amazon EMR components.
+ Improved EMR on-cluster daemons to correctly track the node states when IP addresses are reused to improve reliability during scaling operations.
+ [SPARK-29683](https://issues.apache.org/jira/browse/SPARK-29683). Fixed an issue where job failures occurred during cluster scale-down as Spark was assuming all available nodes were deny-listed.
+ [YARN-9011](https://issues.apache.org/jira/browse/YARN-9011). Fixed an issue where job failures occurred due to a race condition in YARN decommissioning when the cluster tried to scale up or down.
+ Fixed an issue with step or job failures during cluster scaling by ensuring that the node states are always consistent between the Amazon EMR on-cluster daemons and YARN/HDFS.
+ Fixed an issue where cluster operations such as scale down and step submission failed for Amazon EMR clusters enabled with Kerberos authentication. This was because the Amazon EMR on-cluster daemon did not renew the Kerberos ticket, which is required to securely communicate with HDFS/YARN running on the primary node.
+ Newer Amazon EMR releases fix the issue with a lower "Max open files" limit on older AL2 in Amazon EMR. Amazon EMR releases 5.30.1, 5.30.2, 5.31.1, 5.32.1, 6.0.1, 6.1.1, 6.2.1, 5.33.0, 6.3.0 and later now include a permanent fix with a higher "Max open files" setting.
+ Apache Flink is not supported on EMR 6.0.0, but it is supported on EMR 6.1.0 with Flink 1.11.0. This is the first version of Flink to officially support Hadoop 3. See [Apache Flink 1.11.0 Release Announcement](https://flink.apache.org/news/2020/07/06/release-1.11.0.html).
+ Ganglia has been removed from default EMR 6.1.0 package bundles.

**Known issues**
+ **Lower "Max open files" limit on older AL2 [fixed in newer releases].** Amazon EMR releases emr-5.30.x, emr-5.31.0, emr-5.32.0, emr-6.0.0, emr-6.1.0, and emr-6.2.0 are based on older versions of Amazon Linux 2 (AL2), which have a lower ulimit setting for "Max open files" when Amazon EMR clusters are created with the default AMI. Amazon EMR releases 5.30.1, 5.30.2, 5.31.1, 5.32.1, 6.0.1, 6.1.1, 6.2.1, 5.33.0, 6.3.0 and later include a permanent fix with a higher "Max open files" setting. Releases with the lower open file limit cause a "Too many open files" error when you submit a Spark job. In the impacted releases, the Amazon EMR default AMI has a default ulimit setting of 4096 for "Max open files," which is lower than the 65536 file limit in the latest Amazon Linux 2 AMI. The lower ulimit setting for "Max open files" causes Spark job failure when the Spark driver and executor try to open more than 4096 files. To fix the issue, Amazon EMR has a bootstrap action (BA) script that adjusts the ulimit setting at cluster creation.

  If you are using an older Amazon EMR version that doesn't have the permanent fix for this issue, the following workaround lets you explicitly set the instance-controller ulimit to a maximum of 65536 files.

  **Explicitly set a ulimit from the command line**

  1. Edit `/etc/systemd/system/instance-controller.service` to add the following parameters to the Service section.

     `LimitNOFILE=65536`

     `LimitNPROC=65536`

  1. Restart InstanceController

     `$ sudo systemctl daemon-reload`

     `$ sudo systemctl restart instance-controller`

  **Set a ulimit using bootstrap action (BA)**

  You can also use a bootstrap action (BA) script to configure the instance-controller ulimit to 65536 files at cluster creation.

  ```
  #!/bin/bash
  for user in hadoop spark hive; do
  sudo tee /etc/security/limits.d/$user.conf << EOF
  $user - nofile 65536
  $user - nproc 65536
  EOF
  done
  for proc in instancecontroller logpusher; do
  sudo mkdir -p /etc/systemd/system/$proc.service.d/
  sudo tee /etc/systemd/system/$proc.service.d/override.conf << EOF
  [Service]
  LimitNOFILE=65536
  LimitNPROC=65536
  EOF
  pid=$(pgrep -f aws157.$proc.Main)
  sudo prlimit --pid $pid --nofile=65535:65535 --nproc=65535:65535
  done
  sudo systemctl daemon-reload
  ```
+ 
**Important**  
Amazon EMR 6.1.0 and 6.2.0 include a performance issue that can critically affect all Hudi insert, upsert, and delete operations. If you plan to use Hudi with Amazon EMR 6.1.0 or 6.2.0, you should contact AWS support to obtain a patched Hudi RPM.
+ If you set custom garbage collection configuration with `spark.driver.extraJavaOptions` and `spark.executor.extraJavaOptions`, this will result in driver/executor launch failure with EMR 6.1 due to conflicting garbage collection configuration. With EMR Release 6.1.0, you should specify custom Spark garbage collection configuration for drivers and executors with the properties `spark.driver.defaultJavaOptions` and `spark.executor.defaultJavaOptions` instead. Read more in [Apache Spark Runtime Environment](https://spark.apache.org/docs/latest/configuration.html#runtime-environment) and [Configuring Spark Garbage Collection on Amazon EMR 6.1.0](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-spark-configure.html#spark-gc-config).
+ Using Pig with Oozie (and within Hue, since Hue uses Oozie actions to run Pig scripts) generates an error that a native-lzo library cannot be loaded. This error message is informational and does not block Pig from running.
+ Hudi Concurrency Support: Currently Hudi doesn't support concurrent writes to a single Hudi table. In addition, Hudi rolls back any changes being done by in-progress writers before allowing a new writer to start. Concurrent writes can interfere with this mechanism and introduce race conditions, which can lead to data corruption. You should ensure that as part of your data processing workflow, there is only a single Hudi writer operating against a Hudi table at any time. Hudi does support multiple concurrent readers operating against the same Hudi table.
+ Known issue in clusters with multiple primary nodes and Kerberos authentication

  If you run clusters with multiple primary nodes and Kerberos authentication in Amazon EMR releases 5.20.0 and later, you may encounter problems with cluster operations such as scale-down or step submission after the cluster has been running for some time. The time period depends on the Kerberos ticket validity period that you defined. The scale-down problem affects both automatic scale-down and explicit scale-down requests that you submit. Additional cluster operations can also be affected.

  Workaround:
  + Connect over SSH as the `hadoop` user to the lead primary node of the EMR cluster with multiple primary nodes.
  + Run the following command to renew the Kerberos ticket for the `hadoop` user.

    ```
    kinit -kt <keytab_file> <principal>
    ```

    Typically, the keytab file is located at `/etc/hadoop.keytab` and the principal is in the form of `hadoop/<hostname>@<REALM>`.
**Note**  
This workaround is effective for as long as the Kerberos ticket is valid. This duration is 10 hours by default, but can be configured in your Kerberos settings. You must re-run the above command once the Kerberos ticket expires.
+ There is an issue in Amazon EMR 6.1.0 that affects clusters running Presto. After an extended period of time (days), the cluster may throw errors such as "su: failed to execute /bin/bash: Resource temporarily unavailable" or "shell request failed on channel 0". This issue is caused by an internal Amazon EMR process (InstanceController) that spawns too many lightweight processes (LWPs), which eventually causes the `hadoop` user to exceed its nproc limit. This prevents the user from opening additional processes. The solution for this issue is to upgrade to Amazon EMR 6.2.0.
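
For the Spark garbage collection item in the preceding list, the following is a minimal sketch of a `spark-defaults` classification that sets the options through the `defaultJavaOptions` properties. The file name `spark-gc-config.json` and the `-XX:+UseG1GC` flag are assumptions chosen for illustration; substitute the garbage collection options that your workload requires.

```bash
#!/usr/bin/env bash
# Sketch: write a spark-defaults classification that sets GC options through
# the defaultJavaOptions properties instead of extraJavaOptions.
# Assumptions: the file name and the -XX:+UseG1GC flag are illustrative only.
config_file="spark-gc-config.json"

cat > "$config_file" <<'EOF'
[
  {
    "Classification": "spark-defaults",
    "Properties": {
      "spark.driver.defaultJavaOptions": "-XX:+UseG1GC",
      "spark.executor.defaultJavaOptions": "-XX:+UseG1GC"
    }
  }
]
EOF

echo "Wrote $config_file"
```

You could then supply the file when you create the cluster, for example through the `--configurations file://spark-gc-config.json` option of `aws emr create-cluster`.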

## 6.1.0 component versions
<a name="emr-610-components"></a>

The components that Amazon EMR installs with this release are listed below. Some are installed as part of big-data application packages. Others are unique to Amazon EMR and installed for system processes and features. These typically start with `emr` or `aws`. Big-data application packages in the most recent Amazon EMR release are usually the latest version found in the community. We make community releases available in Amazon EMR as quickly as possible.

Some components in Amazon EMR differ from community versions. These components have a version label in the form `CommunityVersion-amzn-EmrVersion`. The `EmrVersion` starts at 0. For example, if an open source community component named `myapp-component` with version 2.2 has been modified three times for inclusion in different Amazon EMR releases, its release version is listed as `2.2-amzn-2`.
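
As an illustration of the label format described above, the following sketch splits a version label into its community and Amazon EMR parts with shell parameter expansion. The function name `parse_version_label` is hypothetical and is not part of Amazon EMR.

```bash
#!/usr/bin/env bash
# Sketch: split a component version label of the form
# CommunityVersion-amzn-EmrVersion into its two parts.
# parse_version_label is a hypothetical helper, not part of Amazon EMR.
parse_version_label() {
  local label="$1"
  if [[ "$label" == *-amzn-* ]]; then
    # Everything before the last "-amzn-" is the community version;
    # everything after it is the EMR-specific version.
    echo "community=${label%-amzn-*} emr=${label##*-amzn-}"
  else
    # No "-amzn-" marker: the label carries only a community version.
    echo "community=${label} emr=none"
  fi
}

parse_version_label "2.2-amzn-2"
parse_version_label "3.2.1-amzn-1"
parse_version_label "2.2.5"
```

Splitting on the last `-amzn-` keeps community suffixes such as `-incubating` together with the community version.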


| Component | Version | Description | 
| --- | --- | --- | 
| aws-sagemaker-spark-sdk | 1.3.0 | Amazon SageMaker Spark SDK | 
| emr-ddb | 4.14.0 | Amazon DynamoDB connector for Hadoop ecosystem applications. | 
| emr-goodies | 3.1.0 | Extra convenience libraries for the Hadoop ecosystem. | 
| emr-kinesis | 3.5.0 | Amazon Kinesis connector for Hadoop ecosystem applications. | 
| emr-s3-dist-cp | 2.14.0 | Distributed copy application optimized for Amazon S3. | 
| emr-s3-select | 2.0.0 | EMR S3Select Connector | 
| emrfs | 2.42.0 | Amazon S3 connector for Hadoop ecosystem applications. | 
| flink-client | 1.11.0 | Apache Flink command line client scripts and applications. | 
| ganglia-monitor | 3.7.2 | Embedded Ganglia agent for Hadoop ecosystem applications along with the Ganglia monitoring agent. | 
| ganglia-metadata-collector | 3.7.2 | Ganglia metadata collector for aggregating metrics from Ganglia monitoring agents. | 
| ganglia-web | 3.7.1 | Web application for viewing metrics collected by the Ganglia metadata collector. | 
| hadoop-client | 3.2.1-amzn-1 | Hadoop command-line clients such as 'hdfs', 'hadoop', or 'yarn'. | 
| hadoop-hdfs-datanode | 3.2.1-amzn-1 | HDFS node-level service for storing blocks. | 
| hadoop-hdfs-library | 3.2.1-amzn-1 | HDFS command-line client and library | 
| hadoop-hdfs-namenode | 3.2.1-amzn-1 | HDFS service for tracking file names and block locations. | 
| hadoop-hdfs-journalnode | 3.2.1-amzn-1 | HDFS service for managing the Hadoop filesystem journal on HA clusters. | 
| hadoop-httpfs-server | 3.2.1-amzn-1 | HTTP endpoint for HDFS operations. | 
| hadoop-kms-server | 3.2.1-amzn-1 | Cryptographic key management server based on Hadoop's KeyProvider API. | 
| hadoop-mapred | 3.2.1-amzn-1 | MapReduce execution engine libraries for running a MapReduce application. | 
| hadoop-yarn-nodemanager | 3.2.1-amzn-1 | YARN service for managing containers on an individual node. | 
| hadoop-yarn-resourcemanager | 3.2.1-amzn-1 | YARN service for allocating and managing cluster resources and distributed applications. | 
| hadoop-yarn-timeline-server | 3.2.1-amzn-1 | Service for retrieving current and historical information for YARN applications. | 
| hbase-hmaster | 2.2.5 | Service for an HBase cluster responsible for coordination of regions and execution of administrative commands. | 
| hbase-region-server | 2.2.5 | Service for serving one or more HBase regions. | 
| hbase-client | 2.2.5 | HBase command-line client. | 
| hbase-rest-server | 2.2.5 | Service providing a RESTful HTTP endpoint for HBase. | 
| hbase-thrift-server | 2.2.5 | Service providing a Thrift endpoint to HBase. | 
| hcatalog-client | 3.1.2-amzn-2 | The 'hcat' command line client for manipulating hcatalog-server. | 
| hcatalog-server | 3.1.2-amzn-2 | Service providing HCatalog, a table and storage management layer for distributed applications. | 
| hcatalog-webhcat-server | 3.1.2-amzn-2 | HTTP endpoint providing a REST interface to HCatalog. | 
| hive-client | 3.1.2-amzn-2 | Hive command line client. | 
| hive-hbase | 3.1.2-amzn-2 | Hive-hbase client. | 
| hive-metastore-server | 3.1.2-amzn-2 | Service for accessing the Hive metastore, a semantic repository storing metadata for SQL on Hadoop operations. | 
| hive-server2 | 3.1.2-amzn-2 | Service for accepting Hive queries as web requests. | 
| hudi | 0.5.2-incubating-amzn-2 | Incremental processing framework to power data pipelines at low latency and high efficiency. | 
| hudi-presto | 0.5.2-incubating-amzn-2 | Bundle library for running Presto with Hudi. | 
| hudi-prestosql | 0.5.2-incubating-amzn-2 | Bundle library for running PrestoSQL with Hudi. | 
| hudi-spark | 0.5.2-incubating-amzn-2 | Bundle library for running Spark with Hudi. | 
| hue-server | 4.7.1 | Web application for analyzing data using Hadoop ecosystem applications | 
| jupyterhub | 1.1.0 | Multi-user server for Jupyter notebooks | 
| livy-server | 0.7.0-incubating | REST interface for interacting with Apache Spark | 
| nginx | 1.12.1 | nginx [engine x] is an HTTP and reverse proxy server | 
| mxnet | 1.6.0 | A flexible, scalable, and efficient library for deep learning. | 
| mariadb-server | 5.5.64+ | MariaDB database server. | 
| nvidia-cuda | 9.2.88 | Nvidia drivers and Cuda toolkit | 
| oozie-client | 5.2.0 | Oozie command-line client. | 
| oozie-server | 5.2.0 | Service for accepting Oozie workflow requests. | 
| opencv | 4.3.0 | Open Source Computer Vision Library. | 
| phoenix-library | 5.0.0-HBase-2.0 | The phoenix libraries for server and client | 
| phoenix-query-server | 5.0.0-HBase-2.0 | A lightweight server providing JDBC access as well as Protocol Buffers and JSON format access to the Avatica API | 
| presto-coordinator | 0.232 | Service for accepting queries and managing query execution among presto-workers. | 
| presto-worker | 0.232 | Service for executing pieces of a query. | 
| presto-client | 0.232 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| prestosql-coordinator | 338 | Service for accepting queries and managing query execution among prestosql-workers. | 
| prestosql-worker | 338 | Service for executing pieces of a query. | 
| prestosql-client | 338 | PrestoSQL command-line client which is installed on an HA cluster's stand-by masters where PrestoSQL server is not started. | 
| pig-client | 0.17.0 | Pig command-line client. | 
| r | 3.4.3 | The R Project for Statistical Computing | 
| ranger-kms-server | 2.0.0 | Apache Ranger Key Management System | 
| spark-client | 3.0.0-amzn-0 | Spark command-line clients. | 
| spark-history-server | 3.0.0-amzn-0 | Web UI for viewing logged events for the lifetime of a completed Spark application. | 
| spark-on-yarn | 3.0.0-amzn-0 | In-memory execution engine for YARN. | 
| spark-yarn-slave | 3.0.0-amzn-0 | Apache Spark libraries needed by YARN slaves. | 
| sqoop-client | 1.4.7 | Apache Sqoop command-line client. | 
| tensorflow | 2.1.0 | TensorFlow open source software library for high performance numerical computation. | 
| tez-on-yarn | 0.9.2 | The tez YARN application and libraries. | 
| webserver | 2.4.41+ | Apache HTTP server. | 
| zeppelin-server | 0.9.0-preview1 | Web-based notebook that enables interactive data analytics. | 
| zookeeper-server | 3.4.14 | Centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. | 
| zookeeper-client | 3.4.14 | ZooKeeper command line client. | 

## 6.1.0 configuration classifications
<a name="emr-610-class"></a>

Configuration classifications allow you to customize applications. These often correspond to a configuration XML file for the application, such as `hive-site.xml`. For more information, see [Configure applications](emr-configure-apps.md).
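
To make the relationship between a classification and its configuration file concrete, the following sketch prints a one-entry configuration list. The helper name `emit_classification` is hypothetical, and the `hive.exec.parallel` property and its value are illustrative assumptions rather than recommended settings.

```bash
#!/usr/bin/env bash
# Sketch: print a one-entry configuration list for a given classification.
# emit_classification is a hypothetical helper; the property and value passed
# below are illustrative.
emit_classification() {
  local classification="$1" property="$2" value="$3"
  cat <<EOF
[
  {
    "Classification": "${classification}",
    "Properties": {
      "${property}": "${value}"
    }
  }
]
EOF
}

# Example: set one hive-site.xml property through the hive-site classification.
emit_classification "hive-site" "hive.exec.parallel" "true"
```

The helper does not escape its arguments, so values that contain double quotes or backslashes would need JSON escaping before use.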


**emr-6.1.0 classifications**  

| Classifications | Description | 
| --- | --- | 
| capacity-scheduler | Change values in Hadoop's capacity-scheduler.xml file. | 
| container-executor | Change values in Hadoop YARN's container-executor.cfg file. | 
| container-log4j | Change values in Hadoop YARN's container-log4j.properties file. | 
| core-site | Change values in Hadoop's core-site.xml file. | 
| emrfs-site | Change EMRFS settings. | 
| flink-conf | Change flink-conf.yaml settings. | 
| flink-log4j | Change Flink log4j.properties settings. | 
| flink-log4j-yarn-session | Change Flink log4j-yarn-session.properties settings. | 
| flink-log4j-cli | Change Flink log4j-cli.properties settings. | 
| hadoop-env | Change values in the Hadoop environment for all Hadoop components. | 
| hadoop-log4j | Change values in Hadoop's log4j.properties file. | 
| hadoop-ssl-server | Change Hadoop SSL server configuration. | 
| hadoop-ssl-client | Change Hadoop SSL client configuration. | 
| hbase | Amazon EMR-curated settings for Apache HBase. | 
| hbase-env | Change values in HBase's environment. | 
| hbase-log4j | Change values in HBase's hbase-log4j.properties file. | 
| hbase-metrics | Change values in HBase's hadoop-metrics2-hbase.properties file. | 
| hbase-policy | Change values in HBase's hbase-policy.xml file. | 
| hbase-site | Change values in HBase's hbase-site.xml file. | 
| hdfs-encryption-zones | Configure HDFS encryption zones. | 
| hdfs-env | Change values in the HDFS environment. | 
| hdfs-site | Change values in HDFS's hdfs-site.xml. | 
| hcatalog-env | Change values in HCatalog's environment. | 
| hcatalog-server-jndi | Change values in HCatalog's jndi.properties. | 
| hcatalog-server-proto-hive-site | Change values in HCatalog's proto-hive-site.xml. | 
| hcatalog-webhcat-env | Change values in HCatalog WebHCat's environment. | 
| hcatalog-webhcat-log4j2 | Change values in HCatalog WebHCat's log4j2.properties. | 
| hcatalog-webhcat-site | Change values in HCatalog WebHCat's webhcat-site.xml file. | 
| hive | Amazon EMR-curated settings for Apache Hive. | 
| hive-beeline-log4j2 | Change values in Hive's beeline-log4j2.properties file. | 
| hive-parquet-logging | Change values in Hive's parquet-logging.properties file. | 
| hive-env | Change values in the Hive environment. | 
| hive-exec-log4j2 | Change values in Hive's hive-exec-log4j2.properties file. | 
| hive-llap-daemon-log4j2 | Change values in Hive's llap-daemon-log4j2.properties file. | 
| hive-log4j2 | Change values in Hive's hive-log4j2.properties file. | 
| hive-site | Change values in Hive's hive-site.xml file | 
| hiveserver2-site | Change values in Hive Server2's hiveserver2-site.xml file | 
| hue-ini | Change values in Hue's ini file | 
| httpfs-env | Change values in the HTTPFS environment. | 
| httpfs-site | Change values in Hadoop's httpfs-site.xml file. | 
| hadoop-kms-acls | Change values in Hadoop's kms-acls.xml file. | 
| hadoop-kms-env | Change values in the Hadoop KMS environment. | 
| hadoop-kms-log4j | Change values in Hadoop's kms-log4j.properties file. | 
| hadoop-kms-site | Change values in Hadoop's kms-site.xml file. | 
| hudi-env | Change values in the Hudi environment. | 
| jupyter-notebook-conf | Change values in Jupyter Notebook's jupyter_notebook_config.py file. | 
| jupyter-hub-conf | Change values in JupyterHub's jupyterhub_config.py file. | 
| jupyter-s3-conf | Configure Jupyter Notebook S3 persistence. | 
| jupyter-sparkmagic-conf | Change values in Sparkmagic's config.json file. | 
| livy-conf | Change values in Livy's livy.conf file. | 
| livy-env | Change values in the Livy environment. | 
| livy-log4j | Change Livy log4j.properties settings. | 
| mapred-env | Change values in the MapReduce application's environment. | 
| mapred-site | Change values in the MapReduce application's mapred-site.xml file. | 
| oozie-env | Change values in Oozie's environment. | 
| oozie-log4j | Change values in Oozie's oozie-log4j.properties file. | 
| oozie-site | Change values in Oozie's oozie-site.xml file. | 
| phoenix-hbase-metrics | Change values in Phoenix's hadoop-metrics2-hbase.properties file. | 
| phoenix-hbase-site | Change values in Phoenix's hbase-site.xml file. | 
| phoenix-log4j | Change values in Phoenix's log4j.properties file. | 
| phoenix-metrics | Change values in Phoenix's hadoop-metrics2-phoenix.properties file. | 
| pig-env | Change values in the Pig environment. | 
| pig-properties | Change values in Pig's pig.properties file. | 
| pig-log4j | Change values in Pig's log4j.properties file. | 
| presto-log | Change values in Presto's log.properties file. | 
| presto-config | Change values in Presto's config.properties file. | 
| presto-password-authenticator | Change values in Presto's password-authenticator.properties file. | 
| presto-env | Change values in Presto's presto-env.sh file. | 
| presto-node | Change values in Presto's node.properties file. | 
| presto-connector-blackhole | Change values in Presto's blackhole.properties file. | 
| presto-connector-cassandra | Change values in Presto's cassandra.properties file. | 
| presto-connector-hive | Change values in Presto's hive.properties file. | 
| presto-connector-jmx | Change values in Presto's jmx.properties file. | 
| presto-connector-kafka | Change values in Presto's kafka.properties file. | 
| presto-connector-localfile | Change values in Presto's localfile.properties file. | 
| presto-connector-memory | Change values in Presto's memory.properties file. | 
| presto-connector-mongodb | Change values in Presto's mongodb.properties file. | 
| presto-connector-mysql | Change values in Presto's mysql.properties file. | 
| presto-connector-postgresql | Change values in Presto's postgresql.properties file. | 
| presto-connector-raptor | Change values in Presto's raptor.properties file. | 
| presto-connector-redis | Change values in Presto's redis.properties file. | 
| presto-connector-redshift | Change values in Presto's redshift.properties file. | 
| presto-connector-tpch | Change values in Presto's tpch.properties file. | 
| presto-connector-tpcds | Change values in Presto's tpcds.properties file. | 
| prestosql-log | Change values in PrestoSQL's log.properties file. | 
| prestosql-config | Change values in PrestoSQL's config.properties file. | 
| prestosql-password-authenticator | Change values in PrestoSQL's password-authenticator.properties file. | 
| prestosql-env | Change values in PrestoSQL's presto-env.sh file. | 
| prestosql-node | Change values in PrestoSQL's node.properties file. | 
| prestosql-connector-blackhole | Change values in PrestoSQL's blackhole.properties file. | 
| prestosql-connector-cassandra | Change values in PrestoSQL's cassandra.properties file. | 
| prestosql-connector-hive | Change values in PrestoSQL's hive.properties file. | 
| prestosql-connector-jmx | Change values in PrestoSQL's jmx.properties file. | 
| prestosql-connector-kafka | Change values in PrestoSQL's kafka.properties file. | 
| prestosql-connector-localfile | Change values in PrestoSQL's localfile.properties file. | 
| prestosql-connector-memory | Change values in PrestoSQL's memory.properties file. | 
| prestosql-connector-mongodb | Change values in PrestoSQL's mongodb.properties file. | 
| prestosql-connector-mysql | Change values in PrestoSQL's mysql.properties file. | 
| prestosql-connector-postgresql | Change values in PrestoSQL's postgresql.properties file. | 
| prestosql-connector-raptor | Change values in PrestoSQL's raptor.properties file. | 
| prestosql-connector-redis | Change values in PrestoSQL's redis.properties file. | 
| prestosql-connector-redshift | Change values in PrestoSQL's redshift.properties file. | 
| prestosql-connector-tpch | Change values in PrestoSQL's tpch.properties file. | 
| prestosql-connector-tpcds | Change values in PrestoSQL's tpcds.properties file. | 
| ranger-kms-dbks-site | Change values in dbks-site.xml file of Ranger KMS. | 
| ranger-kms-site | Change values in ranger-kms-site.xml file of Ranger KMS. | 
| ranger-kms-env | Change values in the Ranger KMS environment. | 
| ranger-kms-log4j | Change values in kms-log4j.properties file of Ranger KMS. | 
| ranger-kms-db-ca | Change values for CA file on S3 for MySQL SSL connection with Ranger KMS. | 
| spark | Amazon EMR-curated settings for Apache Spark. | 
| spark-defaults | Change values in Spark's spark-defaults.conf file. | 
| spark-env | Change values in the Spark environment. | 
| spark-hive-site | Change values in Spark's hive-site.xml file | 
| spark-log4j | Change values in Spark's log4j.properties file. | 
| spark-metrics | Change values in Spark's metrics.properties file. | 
| sqoop-env | Change values in Sqoop's environment. | 
| sqoop-oraoop-site | Change values in Sqoop OraOop's oraoop-site.xml file. | 
| sqoop-site | Change values in Sqoop's sqoop-site.xml file. | 
| tez-site | Change values in Tez's tez-site.xml file. | 
| yarn-env | Change values in the YARN environment. | 
| yarn-site | Change values in YARN's yarn-site.xml file. | 
| zeppelin-env | Change values in the Zeppelin environment. | 
| zookeeper-config | Change values in ZooKeeper's zoo.cfg file. | 
| zookeeper-log4j | Change values in ZooKeeper's log4j.properties file. | 

# Amazon EMR release 6.0.1
<a name="emr-601-release"></a>

## 6.0.1 application versions
<a name="emr-601-app-versions"></a>

This release includes the following applications: [Ganglia](http://ganglia.info), [HBase](http://hbase.apache.org/), [HCatalog](https://cwiki.apache.org/confluence/display/Hive/HCatalog), [Hadoop](http://hadoop.apache.org/docs/current/), [Hive](http://hive.apache.org/), [Hudi](https://hudi.apache.org), [Hue](http://gethue.com/), [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/#), [Livy](https://livy.incubator.apache.org/), [MXNet](https://mxnet.incubator.apache.org/), [Oozie](http://oozie.apache.org/), [Phoenix](https://phoenix.apache.org/), [Presto](https://prestodb.io/), [Spark](https://spark.apache.org/docs/latest/), [TensorFlow](https://www.tensorflow.org/), [Tez](https://tez.apache.org/), [Zeppelin](https://zeppelin.incubator.apache.org/), and [ZooKeeper](https://zookeeper.apache.org).

The table below lists the application versions available in this release of Amazon EMR and the application versions in the preceding three Amazon EMR releases (when applicable).

For a comprehensive history of application versions for each release of Amazon EMR, see the following topics:
+ [Application versions in Amazon EMR 7.x releases](emr-release-app-versions-7.x.md)
+ [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md)
+ [Application versions in Amazon EMR 5.x releases](emr-release-app-versions-5.x.md)
+ [Application versions in Amazon EMR 4.x releases](emr-release-app-versions-4.x.md)


**Application version information**  

|  | emr-6.1.1 | emr-6.1.0 | emr-6.0.1 | emr-6.0.0 | 
| --- | --- | --- | --- | --- | 
| AWS SDK for Java | 1.11.828 | 1.11.828 | 1.11.711 | 1.11.711 | 
| Python | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 
| Scala | 2.12.10 | 2.12.10 | 2.12.10 | 2.11.12 | 
| AmazonCloudWatchAgent |  -  |  -  |  -  |  -  | 
| Delta |  -  |  -  |  -  |  -  | 
| Flink | 1.11.0 | 1.11.0 |  -  |  -  | 
| Ganglia | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 
| HBase | 2.2.5 | 2.2.5 | 2.2.3 | 2.2.3 | 
| HCatalog | 3.1.2-amzn-2 | 3.1.2-amzn-2 | 3.1.2-amzn-0 | 3.1.2-amzn-0 | 
| Hadoop | 3.2.1-amzn-1.1 | 3.2.1-amzn-1 | 3.2.1-amzn-0.1 | 3.2.1-amzn-0 | 
| Hive | 3.1.2-amzn-2 | 3.1.2-amzn-2 | 3.1.2-amzn-0 | 3.1.2-amzn-0 | 
| Hudi | 0.5.2-incubating-amzn-2 | 0.5.2-incubating-amzn-2 | 0.5.0-incubating-amzn-1 | 0.5.0-incubating-amzn-1 | 
| Hue | 4.7.1 | 4.7.1 | 4.4.0 | 4.4.0 | 
| Iceberg |  -  |  -  |  -  |  -  | 
| JupyterEnterpriseGateway |  -  |  -  |  -  |  -  | 
| JupyterHub | 1.1.0 | 1.1.0 | 1.0.0 | 1.0.0 | 
| Livy | 0.7.0-incubating | 0.7.0-incubating | 0.6.0-incubating | 0.6.0-incubating | 
| MXNet | 1.6.0 | 1.6.0 | 1.5.1 | 1.5.1 | 
| Mahout |  -  |  -  |  -  |  -  | 
| Oozie | 5.2.0 | 5.2.0 | 5.1.0 | 5.1.0 | 
| Phoenix | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 
| Pig | 0.17.0 | 0.17.0 |  -  |  -  | 
| Presto | 0.232 | 0.232 | 0.230 | 0.230 | 
| Spark | 3.0.0-amzn-0.1 | 3.0.0-amzn-0 | 2.4.4 | 2.4.4 | 
| Sqoop | 1.4.7 | 1.4.7 |  -  |  -  | 
| TensorFlow | 2.1.0 | 2.1.0 | 1.14.0 | 1.14.0 | 
| Tez | 0.9.2 | 0.9.2 | 0.9.2 | 0.9.2 | 
| Trino (PrestoSQL) | 338 | 338 |  -  |  -  | 
| Zeppelin | 0.9.0-preview1 | 0.9.0-preview1 | 0.9.0-SNAPSHOT | 0.9.0-SNAPSHOT | 
| ZooKeeper | 3.4.14 | 3.4.14 | 3.4.14 | 3.4.14 | 

## 6.0.1 release notes
<a name="emr-601-relnotes"></a>

This release fixes issues with Amazon EMR scaling where a cluster fails to scale up or scale down successfully, or where scaling causes application failures.

**Changes, Enhancements, and Resolved Issues**
+ Fixed an issue where scaling requests failed for a large, highly utilized cluster when Amazon EMR on-cluster daemons were running health checking activities, such as gathering YARN node state and HDFS node state. This was happening because on-cluster daemons were not able to communicate the health status data of a node to internal Amazon EMR components.
+ Improved EMR on-cluster daemons to correctly track the node states when IP addresses are reused to improve reliability during scaling operations.
+ [SPARK-29683](https://issues.apache.org/jira/browse/SPARK-29683). Fixed an issue where job failures occurred during cluster scale-down as Spark was assuming all available nodes were deny-listed.
+ [YARN-9011](https://issues.apache.org/jira/browse/YARN-9011). Fixed an issue where job failures occurred due to a race condition in YARN decommissioning when the cluster tried to scale up or down.
+ Fixed an issue with step or job failures during cluster scaling by ensuring that the node states are always consistent between the Amazon EMR on-cluster daemons and YARN/HDFS.
+ Fixed an issue where cluster operations such as scale down and step submission failed for Amazon EMR clusters enabled with Kerberos authentication. This was because the Amazon EMR on-cluster daemon did not renew the Kerberos ticket, which is required to securely communicate with HDFS/YARN running on the primary node.
+ Newer Amazon EMR releases fix the issue with a lower "Max open files" limit on older AL2 in Amazon EMR. Amazon EMR releases 5.30.1, 5.30.2, 5.31.1, 5.32.1, 6.0.1, 6.1.1, 6.2.1, 5.33.0, 6.3.0 and later now include a permanent fix with a higher "Max open files" setting.
+ HTTPS is now enabled by default for Amazon Linux repositories. If you are using an Amazon S3 VPCE policy to restrict access to specific buckets, you must add the new Amazon Linux bucket ARN `arn:aws:s3:::amazonlinux-2-repos-$region/*` to your policy (replace `$region` with the Region where the endpoint is). For more information, see [Announcement: Amazon Linux 2 now supports the ability to use HTTPS while connecting to package repositories](https://forums.aws.amazon.com/ann.jspa?annID=8528) in the AWS discussion forums.
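
For the Amazon S3 VPCE policy item in the preceding list, the following sketch builds the Amazon Linux 2 repository ARN for a Region and prints a policy statement that references it. The helper name `al2_repo_arn`, the Region value, and the statement fields (`Effect`, `Principal`, `Action`) are assumptions for illustration; only the ARN pattern comes from the release note.

```bash
#!/usr/bin/env bash
# Sketch: build the Amazon Linux 2 repository ARN for a Region and print a
# policy statement that allows reads from it.
# Assumptions: the Region value, the helper name, and the statement shape
# (Effect, Principal, Action) are illustrative; adapt them to your policy.
al2_repo_arn() {
  local region="$1"
  echo "arn:aws:s3:::amazonlinux-2-repos-${region}/*"
}

region="us-west-2"
cat <<EOF
{
  "Effect": "Allow",
  "Principal": "*",
  "Action": "s3:GetObject",
  "Resource": "$(al2_repo_arn "$region")"
}
EOF
```

You would merge a statement like this into the existing endpoint policy rather than replace the policy with it.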

## 6.0.1 component versions
<a name="emr-601-components"></a>

The components that Amazon EMR installs with this release are listed below. Some are installed as part of big-data application packages. Others are unique to Amazon EMR and installed for system processes and features. These typically start with `emr` or `aws`. Big-data application packages in the most recent Amazon EMR release are usually the latest version found in the community. We make community releases available in Amazon EMR as quickly as possible.

Some components in Amazon EMR differ from community versions. These components have a version label in the form `CommunityVersion-amzn-EmrVersion`. The `EmrVersion` starts at 0. For example, if an open source community component named `myapp-component` with version 2.2 has been modified three times for inclusion in different Amazon EMR releases, its release version is listed as `2.2-amzn-2`.


| Component | Version | Description | 
| --- | --- | --- | 
| aws-sagemaker-spark-sdk | 1.2.6 | Amazon SageMaker Spark SDK | 
| emr-ddb | 4.14.0 | Amazon DynamoDB connector for Hadoop ecosystem applications. | 
| emr-goodies | 3.0.0 | Extra convenience libraries for the Hadoop ecosystem. | 
| emr-kinesis | 3.5.0 | Amazon Kinesis connector for Hadoop ecosystem applications. | 
| emr-s3-dist-cp | 2.14.0 | Distributed copy application optimized for Amazon S3. | 
| emr-s3-select | 1.5.0 | EMR S3Select Connector | 
| emrfs | 2.39.0 | Amazon S3 connector for Hadoop ecosystem applications. | 
| ganglia-monitor | 3.7.2 | Embedded Ganglia agent for Hadoop ecosystem applications along with the Ganglia monitoring agent. | 
| ganglia-metadata-collector | 3.7.2 | Ganglia metadata collector for aggregating metrics from Ganglia monitoring agents. | 
| ganglia-web | 3.7.1 | Web application for viewing metrics collected by the Ganglia metadata collector. | 
| hadoop-client | 3.2.1-amzn-0.1 | Hadoop command-line clients such as 'hdfs', 'hadoop', or 'yarn'. | 
| hadoop-hdfs-datanode | 3.2.1-amzn-0.1 | HDFS node-level service for storing blocks. | 
| hadoop-hdfs-library | 3.2.1-amzn-0.1 | HDFS command-line client and library | 
| hadoop-hdfs-namenode | 3.2.1-amzn-0.1 | HDFS service for tracking file names and block locations. | 
| hadoop-hdfs-journalnode | 3.2.1-amzn-0.1 | HDFS service for managing the Hadoop filesystem journal on HA clusters. | 
| hadoop-httpfs-server | 3.2.1-amzn-0.1 | HTTP endpoint for HDFS operations. | 
| hadoop-kms-server | 3.2.1-amzn-0.1 | Cryptographic key management server based on Hadoop's KeyProvider API. | 
| hadoop-mapred | 3.2.1-amzn-0.1 | MapReduce execution engine libraries for running a MapReduce application. | 
| hadoop-yarn-nodemanager | 3.2.1-amzn-0.1 | YARN service for managing containers on an individual node. | 
| hadoop-yarn-resourcemanager | 3.2.1-amzn-0.1 | YARN service for allocating and managing cluster resources and distributed applications. | 
| hadoop-yarn-timeline-server | 3.2.1-amzn-0.1 | Service for retrieving current and historical information for YARN applications. | 
| hbase-hmaster | 2.2.3 | Service for an HBase cluster responsible for coordination of regions and execution of administrative commands. | 
| hbase-region-server | 2.2.3 | Service for serving one or more HBase regions. | 
| hbase-client | 2.2.3 | HBase command-line client. | 
| hbase-rest-server | 2.2.3 | Service providing a RESTful HTTP endpoint for HBase. | 
| hbase-thrift-server | 2.2.3 | Service providing a Thrift endpoint to HBase. | 
| hcatalog-client | 3.1.2-amzn-0 | The 'hcat' command line client for manipulating hcatalog-server. | 
| hcatalog-server | 3.1.2-amzn-0 | Service providing HCatalog, a table and storage management layer for distributed applications. | 
| hcatalog-webhcat-server | 3.1.2-amzn-0 | HTTP endpoint providing a REST interface to HCatalog. | 
| hive-client | 3.1.2-amzn-0 | Hive command line client. | 
| hive-hbase | 3.1.2-amzn-0 | Hive-hbase client. | 
| hive-metastore-server | 3.1.2-amzn-0 | Service for accessing the Hive metastore, a semantic repository storing metadata for SQL on Hadoop operations. | 
| hive-server2 | 3.1.2-amzn-0 | Service for accepting Hive queries as web requests. | 
| hudi | 0.5.0-incubating-amzn-1 | Incremental processing framework to power data pipelines at low latency and high efficiency. | 
| hudi-presto | 0.5.0-incubating-amzn-1 | Bundle library for running Presto with Hudi. | 
| hue-server | 4.4.0 | Web application for analyzing data using Hadoop ecosystem applications | 
| jupyterhub | 1.0.0 | Multi-user server for Jupyter notebooks | 
| livy-server | 0.6.0-incubating | REST interface for interacting with Apache Spark | 
| nginx | 1.12.1 | nginx [engine x] is an HTTP and reverse proxy server | 
| mxnet | 1.5.1 | A flexible, scalable, and efficient library for deep learning. | 
| mariadb-server | 5.5.64+ | MariaDB database server. | 
| nvidia-cuda | 9.2.88 | Nvidia drivers and Cuda toolkit | 
| oozie-client | 5.1.0 | Oozie command-line client. | 
| oozie-server | 5.1.0 | Service for accepting Oozie workflow requests. | 
| opencv | 3.4.0 | Open Source Computer Vision Library. | 
| phoenix-library | 5.0.0-HBase-2.0 | The phoenix libraries for server and client | 
| phoenix-query-server | 5.0.0-HBase-2.0 | A lightweight server providing JDBC access as well as Protocol Buffers and JSON format access to the Avatica API | 
| presto-coordinator | 0.230 | Service for accepting queries and managing query execution among presto-workers. | 
| presto-worker | 0.230 | Service for executing pieces of a query. | 
| presto-client | 0.230 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| r | 3.4.3 | The R Project for Statistical Computing | 
| spark-client | 2.4.4 | Spark command-line clients. | 
| spark-history-server | 2.4.4 | Web UI for viewing logged events for the lifetime of a completed Spark application. | 
| spark-on-yarn | 2.4.4 | In-memory execution engine for YARN. | 
| spark-yarn-slave | 2.4.4 | Apache Spark libraries needed by YARN slaves. | 
| tensorflow | 1.14.0 | TensorFlow open source software library for high performance numerical computation. | 
| tez-on-yarn | 0.9.2 | The tez YARN application and libraries. | 
| webserver | 2.4.41+ | Apache HTTP server. | 
| zeppelin-server | 0.9.0-SNAPSHOT | Web-based notebook that enables interactive data analytics. | 
| zookeeper-server | 3.4.14 | Centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. | 
| zookeeper-client | 3.4.14 | ZooKeeper command line client. | 

## 6.0.1 configuration classifications
<a name="emr-601-class"></a>

Configuration classifications allow you to customize applications. These often correspond to a configuration XML file for the application, such as `hive-site.xml`. For more information, see [Configure applications](emr-configure-apps.md).


**emr-6.0.1 classifications**  

| Classifications | Description | 
| --- | --- | 
| capacity-scheduler | Change values in Hadoop's capacity-scheduler.xml file. | 
| container-executor | Change values in Hadoop YARN's container-executor.cfg file. | 
| container-log4j | Change values in Hadoop YARN's container-log4j.properties file. | 
| core-site | Change values in Hadoop's core-site.xml file. | 
| emrfs-site | Change EMRFS settings. | 
| hadoop-env | Change values in the Hadoop environment for all Hadoop components. | 
| hadoop-log4j | Change values in Hadoop's log4j.properties file. | 
| hadoop-ssl-server | Change hadoop ssl server configuration | 
| hadoop-ssl-client | Change hadoop ssl client configuration | 
| hbase | Amazon EMR-curated settings for Apache HBase. | 
| hbase-env | Change values in HBase's environment. | 
| hbase-log4j | Change values in HBase's hbase-log4j.properties file. | 
| hbase-metrics | Change values in HBase's hadoop-metrics2-hbase.properties file. | 
| hbase-policy | Change values in HBase's hbase-policy.xml file. | 
| hbase-site | Change values in HBase's hbase-site.xml file. | 
| hdfs-encryption-zones | Configure HDFS encryption zones. | 
| hdfs-env | Change values in the HDFS environment. | 
| hdfs-site | Change values in HDFS's hdfs-site.xml. | 
| hcatalog-env | Change values in HCatalog's environment. | 
| hcatalog-server-jndi | Change values in HCatalog's jndi.properties. | 
| hcatalog-server-proto-hive-site | Change values in HCatalog's proto-hive-site.xml. | 
| hcatalog-webhcat-env | Change values in HCatalog WebHCat's environment. | 
| hcatalog-webhcat-log4j2 | Change values in HCatalog WebHCat's log4j2.properties. | 
| hcatalog-webhcat-site | Change values in HCatalog WebHCat's webhcat-site.xml file. | 
| hive | Amazon EMR-curated settings for Apache Hive. | 
| hive-beeline-log4j2 | Change values in Hive's beeline-log4j2.properties file. | 
| hive-parquet-logging | Change values in Hive's parquet-logging.properties file. | 
| hive-env | Change values in the Hive environment. | 
| hive-exec-log4j2 | Change values in Hive's hive-exec-log4j2.properties file. | 
| hive-llap-daemon-log4j2 | Change values in Hive's llap-daemon-log4j2.properties file. | 
| hive-log4j2 | Change values in Hive's hive-log4j2.properties file. | 
| hive-site | Change values in Hive's hive-site.xml file | 
| hiveserver2-site | Change values in Hive Server2's hiveserver2-site.xml file | 
| hue-ini | Change values in Hue's ini file | 
| httpfs-env | Change values in the HTTPFS environment. | 
| httpfs-site | Change values in Hadoop's httpfs-site.xml file. | 
| hadoop-kms-acls | Change values in Hadoop's kms-acls.xml file. | 
| hadoop-kms-env | Change values in the Hadoop KMS environment. | 
| hadoop-kms-log4j | Change values in Hadoop's kms-log4j.properties file. | 
| hadoop-kms-site | Change values in Hadoop's kms-site.xml file. | 
| jupyter-notebook-conf | Change values in Jupyter Notebook's jupyter_notebook_config.py file. | 
| jupyter-hub-conf | Change values in JupyterHub's jupyterhub_config.py file. | 
| jupyter-s3-conf | Configure Jupyter Notebook S3 persistence. | 
| jupyter-sparkmagic-conf | Change values in Sparkmagic's config.json file. | 
| livy-conf | Change values in Livy's livy.conf file. | 
| livy-env | Change values in the Livy environment. | 
| livy-log4j | Change Livy log4j.properties settings. | 
| mapred-env | Change values in the MapReduce application's environment. | 
| mapred-site | Change values in the MapReduce application's mapred-site.xml file. | 
| oozie-env | Change values in Oozie's environment. | 
| oozie-log4j | Change values in Oozie's oozie-log4j.properties file. | 
| oozie-site | Change values in Oozie's oozie-site.xml file. | 
| phoenix-hbase-metrics | Change values in Phoenix's hadoop-metrics2-hbase.properties file. | 
| phoenix-hbase-site | Change values in Phoenix's hbase-site.xml file. | 
| phoenix-log4j | Change values in Phoenix's log4j.properties file. | 
| phoenix-metrics | Change values in Phoenix's hadoop-metrics2-phoenix.properties file. | 
| presto-log | Change values in Presto's log.properties file. | 
| presto-config | Change values in Presto's config.properties file. | 
| presto-password-authenticator | Change values in Presto's password-authenticator.properties file. | 
| presto-env | Change values in Presto's presto-env.sh file. | 
| presto-node | Change values in Presto's node.properties file. | 
| presto-connector-blackhole | Change values in Presto's blackhole.properties file. | 
| presto-connector-cassandra | Change values in Presto's cassandra.properties file. | 
| presto-connector-hive | Change values in Presto's hive.properties file. | 
| presto-connector-jmx | Change values in Presto's jmx.properties file. | 
| presto-connector-kafka | Change values in Presto's kafka.properties file. | 
| presto-connector-localfile | Change values in Presto's localfile.properties file. | 
| presto-connector-memory | Change values in Presto's memory.properties file. | 
| presto-connector-mongodb | Change values in Presto's mongodb.properties file. | 
| presto-connector-mysql | Change values in Presto's mysql.properties file. | 
| presto-connector-postgresql | Change values in Presto's postgresql.properties file. | 
| presto-connector-raptor | Change values in Presto's raptor.properties file. | 
| presto-connector-redis | Change values in Presto's redis.properties file. | 
| presto-connector-redshift | Change values in Presto's redshift.properties file. | 
| presto-connector-tpch | Change values in Presto's tpch.properties file. | 
| presto-connector-tpcds | Change values in Presto's tpcds.properties file. | 
| ranger-kms-dbks-site | Change values in dbks-site.xml file of Ranger KMS. | 
| ranger-kms-site | Change values in ranger-kms-site.xml file of Ranger KMS. | 
| ranger-kms-env | Change values in the Ranger KMS environment. | 
| ranger-kms-log4j | Change values in kms-log4j.properties file of Ranger KMS. | 
| ranger-kms-db-ca | Change values for CA file on S3 for MySQL SSL connection with Ranger KMS. | 
| recordserver-env | Change values in the EMR RecordServer environment. | 
| recordserver-conf | Change values in EMR RecordServer's server.properties file. | 
| recordserver-log4j | Change values in EMR RecordServer's log4j.properties file. | 
| spark | Amazon EMR-curated settings for Apache Spark. | 
| spark-defaults | Change values in Spark's spark-defaults.conf file. | 
| spark-env | Change values in the Spark environment. | 
| spark-hive-site | Change values in Spark's hive-site.xml file | 
| spark-log4j | Change values in Spark's log4j.properties file. | 
| spark-metrics | Change values in Spark's metrics.properties file. | 
| tez-site | Change values in Tez's tez-site.xml file. | 
| yarn-env | Change values in the YARN environment. | 
| yarn-site | Change values in YARN's yarn-site.xml file. | 
| zeppelin-env | Change values in the Zeppelin environment. | 
| zookeeper-config | Change values in ZooKeeper's zoo.cfg file. | 
| zookeeper-log4j | Change values in ZooKeeper's log4j.properties file. | 

# Amazon EMR release 6.0.0
<a name="emr-600-release"></a>

## 6.0.0 application versions
<a name="emr-600-app-versions"></a>

This release includes the following applications: [Ganglia](http://ganglia.info), [HBase](http://hbase.apache.org/), [HCatalog](https://cwiki.apache.org/confluence/display/Hive/HCatalog), [Hadoop](http://hadoop.apache.org/docs/current/), [Hive](http://hive.apache.org/), [Hudi](https://hudi.apache.org), [Hue](http://gethue.com/), [JupyterHub](https://jupyterhub.readthedocs.io/en/latest/#), [Livy](https://livy.incubator.apache.org/), [MXNet](https://mxnet.incubator.apache.org/), [Oozie](http://oozie.apache.org/), [Phoenix](https://phoenix.apache.org/), [Presto](https://prestodb.io/), [Spark](https://spark.apache.org/docs/latest/), [TensorFlow](https://www.tensorflow.org/), [Tez](https://tez.apache.org/), [Zeppelin](https://zeppelin.incubator.apache.org/), and [ZooKeeper](https://zookeeper.apache.org).

The table below lists the application versions available in this release of Amazon EMR and the application versions in the preceding three Amazon EMR releases (when applicable).

For a comprehensive history of application versions for each release of Amazon EMR, see the following topics:
+ [Application versions in Amazon EMR 7.x releases](emr-release-app-versions-7.x.md)
+ [Application versions in Amazon EMR 6.x releases](emr-release-app-versions-6.x.md)
+ [Application versions in Amazon EMR 5.x releases](emr-release-app-versions-5.x.md)
+ [Application versions in Amazon EMR 4.x releases](emr-release-app-versions-4.x.md)


**Application version information**  

|  | emr-6.1.1 | emr-6.1.0 | emr-6.0.1 | emr-6.0.0 | 
| --- | --- | --- | --- | --- | 
| AWS SDK for Java | 1.11.828 | 1.11.828 | 1.11.711 | 1.11.711 | 
| Python | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 2.7, 3.7 | 
| Scala | 2.12.10 | 2.12.10 | 2.12.10 | 2.12.10 | 
| AmazonCloudWatchAgent |  -  |  -  |  -  |  -  | 
| Delta |  -  |  -  |  -  |  -  | 
| Flink | 1.11.0 | 1.11.0 |  -  |  -  | 
| Ganglia | 3.7.2 | 3.7.2 | 3.7.2 | 3.7.2 | 
| HBase | 2.2.5 | 2.2.5 | 2.2.3 | 2.2.3 | 
| HCatalog | 3.1.2-amzn-2 | 3.1.2-amzn-2 | 3.1.2-amzn-0 | 3.1.2-amzn-0 | 
| Hadoop | 3.2.1-amzn-1.1 | 3.2.1-amzn-1 | 3.2.1-amzn-0.1 | 3.2.1-amzn-0 | 
| Hive | 3.1.2-amzn-2 | 3.1.2-amzn-2 | 3.1.2-amzn-0 | 3.1.2-amzn-0 | 
| Hudi | 0.5.2-incubating-amzn-2 | 0.5.2-incubating-amzn-2 | 0.5.0-incubating-amzn-1 | 0.5.0-incubating-amzn-1 | 
| Hue | 4.7.1 | 4.7.1 | 4.4.0 | 4.4.0 | 
| Iceberg |  -  |  -  |  -  |  -  | 
| JupyterEnterpriseGateway |  -  |  -  |  -  |  -  | 
| JupyterHub | 1.1.0 | 1.1.0 | 1.0.0 | 1.0.0 | 
| Livy | 0.7.0-incubating | 0.7.0-incubating | 0.6.0-incubating | 0.6.0-incubating | 
| MXNet | 1.6.0 | 1.6.0 | 1.5.1 | 1.5.1 | 
| Mahout |  -  |  -  |  -  |  -  | 
| Oozie | 5.2.0 | 5.2.0 | 5.1.0 | 5.1.0 | 
| Phoenix | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 5.0.0-HBase-2.0 | 
| Pig | 0.17.0 | 0.17.0 |  -  |  -  | 
| Presto | 0.232 | 0.232 | 0.230 | 0.230 | 
| Spark | 3.0.0-amzn-0.1 | 3.0.0-amzn-0 | 2.4.4 | 2.4.4 | 
| Sqoop | 1.4.7 | 1.4.7 |  -  |  -  | 
| TensorFlow | 2.1.0 | 2.1.0 | 1.14.0 | 1.14.0 | 
| Tez | 0.9.2 | 0.9.2 | 0.9.2 | 0.9.2 | 
| Trino (PrestoSQL) | 338 | 338 |  -  |  -  | 
| Zeppelin | 0.9.0-preview1 | 0.9.0-preview1 | 0.9.0-SNAPSHOT | 0.9.0-SNAPSHOT | 
| ZooKeeper | 3.4.14 | 3.4.14 | 3.4.14 | 3.4.14 | 

## 6.0.0 release notes
<a name="emr-600-relnotes"></a>

The following release notes include information for Amazon EMR release 6.0.0.

Initial release date: March 10, 2020

**Supported applications**
+ AWS SDK for Java version 1.11.711
+ Ganglia version 3.7.2
+ Hadoop version 3.2.1
+ HBase version 2.2.3
+ HCatalog version 3.1.2
+ Hive version 3.1.2
+ Hudi version 0.5.0-incubating
+ Hue version 4.4.0
+ JupyterHub version 1.0.0
+ Livy version 0.6.0
+ MXNet version 1.5.1
+ Oozie version 5.1.0
+ Phoenix version 5.0.0
+ Presto version 0.230
+ Spark version 2.4.4
+ TensorFlow version 1.14.0
+ Zeppelin version 0.9.0-SNAPSHOT
+ ZooKeeper version 3.4.14
+ Connectors and drivers: DynamoDB Connector 4.14.0

**Note**  
Flink, Sqoop, Pig, and Mahout are not available in Amazon EMR version 6.0.0. 

**New features**
+ YARN Docker Runtime Support - YARN applications, such as Spark jobs, can now run in the context of a Docker container. This allows you to easily define dependencies in a Docker image without the need to install custom libraries on your Amazon EMR cluster. For more information, see [Configure Docker Integration](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-docker.html) and [Run Spark applications with Docker using Amazon EMR 6.0.0](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-spark-docker.html).
+ Hive LLAP Support - Hive now supports the LLAP execution mode for improved query performance. For more information, see [Using Hive LLAP](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hive-llap.html).
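
As a rough illustration of the YARN Docker runtime feature, the sketch below prints a `spark-submit` command that asks YARN to run both the Spark application master and the executors inside a Docker image. The `YARN_CONTAINER_RUNTIME_TYPE` and `YARN_CONTAINER_RUNTIME_DOCKER_IMAGE` environment variables come from the YARN Docker runtime. The helper name `build_docker_submit`, the image name, and the application path are hypothetical placeholders. The sketch assumes the cluster already has Docker integration configured as described in the linked topics.

```shell
#!/bin/bash
# Print a spark-submit command line that requests the Docker runtime for the
# Spark application master and the executors.
# build_docker_submit, the image name, and the application path are
# illustrative placeholders, not part of Amazon EMR.

build_docker_submit() {
  local image="$1"   # Docker image that contains the job's dependencies
  local app="$2"     # path to the Spark application

  # printf repeats the '%s ' format once per argument.
  printf '%s ' spark-submit --master yarn --deploy-mode cluster \
    --conf "spark.executorEnv.YARN_CONTAINER_RUNTIME_TYPE=docker" \
    --conf "spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=${image}" \
    --conf "spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_TYPE=docker" \
    --conf "spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=${image}" \
    "${app}"
  printf '\n'
}

# Print the command for review before running it on the primary node.
build_docker_submit "my-registry/my-spark-image:latest" "my_app.py"
```

Printing the command rather than running it keeps the sketch safe to try anywhere. On a cluster, you would review the output and then run it as the `hadoop` user.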

**Changes, enhancements, and resolved issues**
+ This is a release to fix issues with Amazon EMR Scaling when it fails to scale up/scale down a cluster successfully or causes application failures.
+ Fixed an issue where scaling requests failed for a large, highly utilized cluster when Amazon EMR on-cluster daemons were running health checking activities, such as gathering YARN node state and HDFS node state. This was happening because on-cluster daemons were not able to communicate the health status data of a node to internal Amazon EMR components.
+ Improved EMR on-cluster daemons to correctly track the node states when IP addresses are reused to improve reliability during scaling operations.
+ [SPARK-29683](https://issues.apache.org/jira/browse/SPARK-29683). Fixed an issue where job failures occurred during cluster scale-down as Spark was assuming all available nodes were deny-listed.
+ [YARN-9011](https://issues.apache.org/jira/browse/YARN-9011). Fixed an issue where job failures occurred due to a race condition in YARN decommissioning when the cluster tried to scale up or down.
+ Fixed an issue with step or job failures during cluster scaling by ensuring that the node states are always consistent between the Amazon EMR on-cluster daemons and YARN/HDFS.
+ Fixed an issue where cluster operations such as scale down and step submission failed for Amazon EMR clusters enabled with Kerberos authentication. This was because the Amazon EMR on-cluster daemon did not renew the Kerberos ticket, which is required to securely communicate with HDFS/YARN running on the primary node.
+ Newer Amazon EMR releases fix the issue with a lower "Max open files" limit on older AL2 in Amazon EMR. Amazon EMR releases 5.30.1, 5.30.2, 5.31.1, 5.32.1, 6.0.1, 6.1.1, 6.2.1, 5.33.0, 6.3.0 and later now include a permanent fix with a higher "Max open files" setting.
+ Amazon Linux
  + Amazon Linux 2 is the operating system for the EMR 6.x release series.
  + `systemd` is used for service management instead of `upstart` used in Amazon Linux 1.
+ Java Development Kit (JDK)
  + Corretto JDK 8 is the default JDK for the EMR 6.x release series.
+ Scala
  + Scala 2.12 is used with Apache Spark and Apache Livy.
+ Python 3
  + Python 3 is now the default version of Python in EMR.
+ YARN node labels
  + Beginning with the Amazon EMR 6.x release series, the YARN node labels feature is disabled by default. The application master processes can run on both core and task nodes by default. You can enable the YARN node labels feature by configuring the following properties: `yarn.node-labels.enabled` and `yarn.node-labels.am.default-node-label-expression`. For more information, see [Understanding Primary, Core, and Task Nodes](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-master-core-task-nodes.html).
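
For example, the two YARN node label properties named above could be supplied through the `yarn-site` configuration classification. The following is a minimal sketch that writes such a configuration file. The `CORE` label value and the file name `node-labels.json` are assumptions for illustration, so confirm the label names that your cluster uses before relying on them.

```shell
#!/bin/bash
# Write a configuration file that re-enables YARN node labels and asks YARN
# to place application masters on nodes labeled CORE.
# The CORE value and the file name are assumptions for illustration.
cat > node-labels.json << 'EOF'
[
  {
    "Classification": "yarn-site",
    "Properties": {
      "yarn.node-labels.enabled": "true",
      "yarn.node-labels.am.default-node-label-expression": "CORE"
    }
  }
]
EOF
echo "Wrote node-labels.json"

# Pass the file at cluster creation, for example:
#   aws emr create-cluster --release-label emr-6.0.0 \
#     --configurations file://node-labels.json ...
```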

**Known issues**
+ **Lower "Max open files" limit on older AL2 [fixed in newer releases].** Amazon EMR releases emr-5.30.x, emr-5.31.0, emr-5.32.0, emr-6.0.0, emr-6.1.0, and emr-6.2.0 are based on older versions of Amazon Linux 2 (AL2), which have a lower ulimit setting for "Max open files" when Amazon EMR clusters are created with the default AMI. Amazon EMR releases 5.30.1, 5.30.2, 5.31.1, 5.32.1, 6.0.1, 6.1.1, 6.2.1, 5.33.0, 6.3.0 and later include a permanent fix with a higher "Max open files" setting. Releases with the lower open file limit cause a "Too many open files" error when you submit a Spark job. In the impacted releases, the Amazon EMR default AMI has a default ulimit setting of 4096 for "Max open files," which is lower than the 65536 file limit in the latest Amazon Linux 2 AMI. The lower ulimit setting for "Max open files" causes Spark job failure when the Spark driver and executor try to open more than 4096 files. To fix the issue, Amazon EMR has a bootstrap action (BA) script that adjusts the ulimit setting at cluster creation. 

  If you are using an older Amazon EMR version that doesn't have the permanent fix for this issue, the following workaround lets you explicitly set the instance-controller ulimit to a maximum of 65536 files.

  **Explicitly set a ulimit from the command line**

  1. Edit `/etc/systemd/system/instance-controller.service` to add the following parameters to the `[Service]` section.

     `LimitNOFILE=65536`

     `LimitNPROC=65536`

  1. Restart InstanceController

     `$ sudo systemctl daemon-reload`

     `$ sudo systemctl restart instance-controller`

  **Set a ulimit using bootstrap action (BA)**

  You can also use a bootstrap action (BA) script to configure the instance-controller ulimit to 65536 files at cluster creation.

  ```
  #!/bin/bash
  for user in hadoop spark hive; do
  sudo tee /etc/security/limits.d/$user.conf << EOF
  $user - nofile 65536
  $user - nproc 65536
  EOF
  done
  for proc in instancecontroller logpusher; do
  sudo mkdir -p /etc/systemd/system/$proc.service.d/
  sudo tee /etc/systemd/system/$proc.service.d/override.conf << EOF
  [Service]
  LimitNOFILE=65536
  LimitNPROC=65536
  EOF
  pid=$(pgrep -f aws157.$proc.Main)
  sudo prlimit --pid $pid --nofile=65535:65535 --nproc=65535:65535
  done
  sudo systemctl daemon-reload
  ```
+ Spark interactive shell, including PySpark, SparkR, and spark-shell, does not support using Docker with additional libraries.
+ To use Python 3 with Amazon EMR version 6.0.0, you must add `PATH` to `yarn.nodemanager.env-whitelist`.
+ The Live Long and Process (LLAP) functionality is not supported when you use the AWS Glue Data Catalog as the metastore for Hive.
+ When using Amazon EMR 6.0.0 with Spark and Docker integration, you need to configure the instances in your cluster with the same instance type and the same number of EBS volumes to avoid failure when submitting a Spark job with Docker runtime.
+ In Amazon EMR 6.0.0, HBase on Amazon S3 storage mode is impacted by the [HBASE-24286](https://issues.apache.org/jira/browse/HBASE-24286) issue. The HBase master cannot initialize when the cluster is created using existing S3 data.
+ Known issue in clusters with multiple primary nodes and Kerberos authentication

  If you run clusters with multiple primary nodes and Kerberos authentication in Amazon EMR releases 5.20.0 and later, you may encounter problems with cluster operations such as scale down or step submission, after the cluster has been running for some time. The time period depends on the Kerberos ticket validity period that you defined. The scale-down problem impacts both automatic scale-down and explicit scale down requests that you submitted. Additional cluster operations can also be impacted. 

  Workaround:
  + SSH as the `hadoop` user to the lead primary node of the EMR cluster with multiple primary nodes.
  + Run the following command to renew the Kerberos ticket for the `hadoop` user. 

    ```
    kinit -kt <keytab_file> <principal>
    ```

    Typically, the keytab file is located at `/etc/hadoop.keytab` and the principal is in the form of `hadoop/<hostname>@<REALM>`.
**Note**  
This workaround is effective for as long as the Kerberos ticket is valid. This duration is 10 hours by default, but can be configured by your Kerberos settings. You must re-run the above command once the Kerberos ticket expires.
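
Because the ticket must be renewed each time it expires, the renewal could be wrapped in a small helper that runs periodically. The following is a minimal sketch, not an Amazon EMR utility. The function name `renew_ticket_if_needed` and the `EXAMPLE.COM` realm are hypothetical, and the default keytab path is the typical location mentioned above. It relies on `klist -s`, which exits with a non-zero status when the credentials cache is missing or expired.

```shell
#!/bin/bash
# Renew the Kerberos ticket only when the credentials cache holds no valid
# ticket. renew_ticket_if_needed and the EXAMPLE.COM realm are hypothetical;
# replace the keytab path and principal with the values for your cluster.

renew_ticket_if_needed() {
  local keytab="${1:-/etc/hadoop.keytab}"
  local principal="${2:-hadoop/$(hostname -f)@EXAMPLE.COM}"

  # klist -s prints nothing; its exit status reports whether a valid ticket exists.
  if klist -s; then
    echo "Kerberos ticket still valid; nothing to do."
  else
    echo "Renewing Kerberos ticket for ${principal}"
    kinit -kt "${keytab}" "${principal}"
  fi
}

# Usage, as the hadoop user on the lead primary node:
#   renew_ticket_if_needed /etc/hadoop.keytab "hadoop/<hostname>@<REALM>"
```

The helper only checks that a valid ticket exists, so schedule it more often than the ticket lifetime if you want to avoid gaps.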

## 6.0.0 component versions
<a name="emr-600-components"></a>

The components that Amazon EMR installs with this release are listed below. Some are installed as part of big-data application packages. Others are unique to Amazon EMR and installed for system processes and features. These typically start with `emr` or `aws`. Big-data application packages in the most recent Amazon EMR release are usually the latest version found in the community. We make community releases available in Amazon EMR as quickly as possible.

Some components in Amazon EMR differ from community versions. These components have a version label in the form `CommunityVersion-amzn-EmrVersion`. The `EmrVersion` starts at 0. For example, if open source community component named `myapp-component` with version 2.2 has been modified three times for inclusion in different Amazon EMR releases, its release version is listed as `2.2-amzn-2`.
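
To make the label format concrete, the following sketch splits a version label into its community version and Amazon EMR revision using shell parameter expansion. The function name `parse_emr_version` is hypothetical. Labels without an `-amzn-` suffix are treated here as plain community versions.

```shell
#!/bin/bash
# Split a component version label of the form CommunityVersion-amzn-EmrVersion.
# parse_emr_version is an illustrative helper, not an Amazon EMR tool.

parse_emr_version() {
  local label="$1"
  case "$label" in
    *-amzn-*)
      # Everything before "-amzn-" is the community version;
      # everything after it is the EMR revision.
      echo "community=${label%%-amzn-*} emr_revision=${label##*-amzn-}"
      ;;
    *)
      echo "community=${label} emr_revision=none"
      ;;
  esac
}

parse_emr_version "3.2.1-amzn-0"             # hadoop-client in this release
parse_emr_version "0.5.0-incubating-amzn-1"  # hudi in this release
parse_emr_version "2.4.4"                    # spark-client in this release
```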


| Component | Version | Description | 
| --- | --- | --- | 
| aws-sagemaker-spark-sdk | 1.2.6 | Amazon SageMaker Spark SDK | 
| emr-ddb | 4.14.0 | Amazon DynamoDB connector for Hadoop ecosystem applications. | 
| emr-goodies | 3.0.0 | Extra convenience libraries for the Hadoop ecosystem. | 
| emr-kinesis | 3.5.0 | Amazon Kinesis connector for Hadoop ecosystem applications. | 
| emr-s3-dist-cp | 2.14.0 | Distributed copy application optimized for Amazon S3. | 
| emr-s3-select | 1.5.0 | EMR S3Select Connector | 
| emrfs | 2.39.0 | Amazon S3 connector for Hadoop ecosystem applications. | 
| ganglia-monitor | 3.7.2 | Embedded Ganglia agent for Hadoop ecosystem applications along with the Ganglia monitoring agent. | 
| ganglia-metadata-collector | 3.7.2 | Ganglia metadata collector for aggregating metrics from Ganglia monitoring agents. | 
| ganglia-web | 3.7.1 | Web application for viewing metrics collected by the Ganglia metadata collector. | 
| hadoop-client | 3.2.1-amzn-0 | Hadoop command-line clients such as 'hdfs', 'hadoop', or 'yarn'. | 
| hadoop-hdfs-datanode | 3.2.1-amzn-0 | HDFS node-level service for storing blocks. | 
| hadoop-hdfs-library | 3.2.1-amzn-0 | HDFS command-line client and library | 
| hadoop-hdfs-namenode | 3.2.1-amzn-0 | HDFS service for tracking file names and block locations. | 
| hadoop-hdfs-journalnode | 3.2.1-amzn-0 | HDFS service for managing the Hadoop filesystem journal on HA clusters. | 
| hadoop-httpfs-server | 3.2.1-amzn-0 | HTTP endpoint for HDFS operations. | 
| hadoop-kms-server | 3.2.1-amzn-0 | Cryptographic key management server based on Hadoop's KeyProvider API. | 
| hadoop-mapred | 3.2.1-amzn-0 | MapReduce execution engine libraries for running a MapReduce application. | 
| hadoop-yarn-nodemanager | 3.2.1-amzn-0 | YARN service for managing containers on an individual node. | 
| hadoop-yarn-resourcemanager | 3.2.1-amzn-0 | YARN service for allocating and managing cluster resources and distributed applications. | 
| hadoop-yarn-timeline-server | 3.2.1-amzn-0 | Service for retrieving current and historical information for YARN applications. | 
| hbase-hmaster | 2.2.3 | Service for an HBase cluster responsible for coordination of Regions and execution of administrative commands. | 
| hbase-region-server | 2.2.3 | Service for serving one or more HBase regions. | 
| hbase-client | 2.2.3 | HBase command-line client. | 
| hbase-rest-server | 2.2.3 | Service providing a RESTful HTTP endpoint for HBase. | 
| hbase-thrift-server | 2.2.3 | Service providing a Thrift endpoint to HBase. | 
| hcatalog-client | 3.1.2-amzn-0 | The 'hcat' command line client for manipulating hcatalog-server. | 
| hcatalog-server | 3.1.2-amzn-0 | Service providing HCatalog, a table and storage management layer for distributed applications. | 
| hcatalog-webhcat-server | 3.1.2-amzn-0 | HTTP endpoint providing a REST interface to HCatalog. | 
| hive-client | 3.1.2-amzn-0 | Hive command line client. | 
| hive-hbase | 3.1.2-amzn-0 | Hive-hbase client. | 
| hive-metastore-server | 3.1.2-amzn-0 | Service for accessing the Hive metastore, a semantic repository storing metadata for SQL on Hadoop operations. | 
| hive-server2 | 3.1.2-amzn-0 | Service for accepting Hive queries as web requests. | 
| hudi | 0.5.0-incubating-amzn-1 | Incremental processing framework to power data pipelines at low latency and high efficiency. | 
| hudi-presto | 0.5.0-incubating-amzn-1 | Bundle library for running Presto with Hudi. | 
| hue-server | 4.4.0 | Web application for analyzing data using Hadoop ecosystem applications | 
| jupyterhub | 1.0.0 | Multi-user server for Jupyter notebooks | 
| livy-server | 0.6.0-incubating | REST interface for interacting with Apache Spark | 
| nginx | 1.12.1 | nginx [engine x] is an HTTP and reverse proxy server | 
| mxnet | 1.5.1 | A flexible, scalable, and efficient library for deep learning. | 
| mariadb-server | 5.5.64+ | MariaDB database server. | 
| nvidia-cuda | 9.2.88 | Nvidia drivers and Cuda toolkit | 
| oozie-client | 5.1.0 | Oozie command-line client. | 
| oozie-server | 5.1.0 | Service for accepting Oozie workflow requests. | 
| opencv | 3.4.0 | Open Source Computer Vision Library. | 
| phoenix-library | 5.0.0-HBase-2.0 | The phoenix libraries for server and client | 
| phoenix-query-server | 5.0.0-HBase-2.0 | A light weight server providing JDBC access as well as Protocol Buffers and JSON format access to the Avatica API  | 
| presto-coordinator | 0.230 | Service for accepting queries and managing query execution among presto-workers. | 
| presto-worker | 0.230 | Service for executing pieces of a query. | 
| presto-client | 0.230 | Presto command-line client which is installed on an HA cluster's stand-by masters where Presto server is not started. | 
| r | 3.4.3 | The R Project for Statistical Computing | 
| spark-client | 2.4.4 | Spark command-line clients. | 
| spark-history-server | 2.4.4 | Web UI for viewing logged events for the lifetime of a completed Spark application. | 
| spark-on-yarn | 2.4.4 | In-memory execution engine for YARN. | 
| spark-yarn-slave | 2.4.4 | Apache Spark libraries needed by YARN slaves. | 
| tensorflow | 1.14.0 | TensorFlow open source software library for high performance numerical computation. | 
| tez-on-yarn | 0.9.2 | The tez YARN application and libraries. | 
| webserver | 2.4.41+ | Apache HTTP server. | 
| zeppelin-server | 0.9.0-SNAPSHOT | Web-based notebook that enables interactive data analytics. | 
| zookeeper-server | 3.4.14 | Centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. | 
| zookeeper-client | 3.4.14 | ZooKeeper command line client. | 

## 6.0.0 configuration classifications
<a name="emr-600-class"></a>

Configuration classifications allow you to customize applications. These often correspond to a configuration XML file for the application, such as `hive-site.xml`. For more information, see [Configure applications](emr-configure-apps.md).
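
As a minimal sketch of what a classification looks like in practice, the script below writes a configuration file that overrides a single property in `hive-site.xml`. The property, its value, and the file name `configurations.json` are illustrative only. Any classification from the table that follows can take the place of `hive-site`.

```shell
#!/bin/bash
# Write a configuration object that overrides one property in hive-site.xml.
# The property, value, and file name are illustrative assumptions.
cat > configurations.json << 'EOF'
[
  {
    "Classification": "hive-site",
    "Properties": {
      "hive.exec.dynamic.partition.mode": "nonstrict"
    }
  }
]
EOF
echo "Wrote configurations.json"

# Supply the file when creating a cluster, for example:
#   aws emr create-cluster --release-label emr-6.0.0 --applications Name=Hive \
#     --configurations file://configurations.json ...
```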


**emr-6.0.0 classifications**  

| Classifications | Description | 
| --- | --- | 
| capacity-scheduler | Change values in Hadoop's capacity-scheduler.xml file. | 
| container-executor | Change values in Hadoop YARN's container-executor.cfg file. | 
| container-log4j | Change values in Hadoop YARN's container-log4j.properties file. | 
| core-site | Change values in Hadoop's core-site.xml file. | 
| emrfs-site | Change EMRFS settings. | 
| hadoop-env | Change values in the Hadoop environment for all Hadoop components. | 
| hadoop-log4j | Change values in Hadoop's log4j.properties file. | 
| hadoop-ssl-server | Change hadoop ssl server configuration | 
| hadoop-ssl-client | Change hadoop ssl client configuration | 
| hbase | Amazon EMR-curated settings for Apache HBase. | 
| hbase-env | Change values in HBase's environment. | 
| hbase-log4j | Change values in HBase's hbase-log4j.properties file. | 
| hbase-metrics | Change values in HBase's hadoop-metrics2-hbase.properties file. | 
| hbase-policy | Change values in HBase's hbase-policy.xml file. | 
| hbase-site | Change values in HBase's hbase-site.xml file. | 
| hdfs-encryption-zones | Configure HDFS encryption zones. | 
| hdfs-env | Change values in the HDFS environment. | 
| hdfs-site | Change values in HDFS's hdfs-site.xml. | 
| hcatalog-env | Change values in HCatalog's environment. | 
| hcatalog-server-jndi | Change values in HCatalog's jndi.properties. | 
| hcatalog-server-proto-hive-site | Change values in HCatalog's proto-hive-site.xml. | 
| hcatalog-webhcat-env | Change values in HCatalog WebHCat's environment. | 
| hcatalog-webhcat-log4j2 | Change values in HCatalog WebHCat's log4j2.properties. | 
| hcatalog-webhcat-site | Change values in HCatalog WebHCat's webhcat-site.xml file. | 
| hive | Amazon EMR-curated settings for Apache Hive. | 
| hive-beeline-log4j2 | Change values in Hive's beeline-log4j2.properties file. | 
| hive-parquet-logging | Change values in Hive's parquet-logging.properties file. | 
| hive-env | Change values in the Hive environment. | 
| hive-exec-log4j2 | Change values in Hive's hive-exec-log4j2.properties file. | 
| hive-llap-daemon-log4j2 | Change values in Hive's llap-daemon-log4j2.properties file. | 
| hive-log4j2 | Change values in Hive's hive-log4j2.properties file. | 
| hive-site | Change values in Hive's hive-site.xml file | 
| hiveserver2-site | Change values in Hive Server2's hiveserver2-site.xml file | 
| hue-ini | Change values in Hue's ini file | 
| httpfs-env | Change values in the HTTPFS environment. | 
| httpfs-site | Change values in Hadoop's httpfs-site.xml file. | 
| hadoop-kms-acls | Change values in Hadoop's kms-acls.xml file. | 
| hadoop-kms-env | Change values in the Hadoop KMS environment. | 
| hadoop-kms-log4j | Change values in Hadoop's kms-log4j.properties file. | 
| hadoop-kms-site | Change values in Hadoop's kms-site.xml file. | 
| jupyter-notebook-conf | Change values in Jupyter Notebook's jupyter_notebook_config.py file. | 
| jupyter-hub-conf | Change values in JupyterHub's jupyterhub_config.py file. | 
| jupyter-s3-conf | Configure Jupyter Notebook S3 persistence. | 
| jupyter-sparkmagic-conf | Change values in Sparkmagic's config.json file. | 
| livy-conf | Change values in Livy's livy.conf file. | 
| livy-env | Change values in the Livy environment. | 
| livy-log4j | Change Livy log4j.properties settings. | 
| mapred-env | Change values in the MapReduce application's environment. | 
| mapred-site | Change values in the MapReduce application's mapred-site.xml file. | 
| oozie-env | Change values in Oozie's environment. | 
| oozie-log4j | Change values in Oozie's oozie-log4j.properties file. | 
| oozie-site | Change values in Oozie's oozie-site.xml file. | 
| phoenix-hbase-metrics | Change values in Phoenix's hadoop-metrics2-hbase.properties file. | 
| phoenix-hbase-site | Change values in Phoenix's hbase-site.xml file. | 
| phoenix-log4j | Change values in Phoenix's log4j.properties file. | 
| phoenix-metrics | Change values in Phoenix's hadoop-metrics2-phoenix.properties file. | 
| presto-log | Change values in Presto's log.properties file. | 
| presto-config | Change values in Presto's config.properties file. | 
| presto-password-authenticator | Change values in Presto's password-authenticator.properties file. | 
| presto-env | Change values in Presto's presto-env.sh file. | 
| presto-node | Change values in Presto's node.properties file. | 
| presto-connector-blackhole | Change values in Presto's blackhole.properties file. | 
| presto-connector-cassandra | Change values in Presto's cassandra.properties file. | 
| presto-connector-hive | Change values in Presto's hive.properties file. | 
| presto-connector-jmx | Change values in Presto's jmx.properties file. | 
| presto-connector-kafka | Change values in Presto's kafka.properties file. | 
| presto-connector-localfile | Change values in Presto's localfile.properties file. | 
| presto-connector-memory | Change values in Presto's memory.properties file. | 
| presto-connector-mongodb | Change values in Presto's mongodb.properties file. | 
| presto-connector-mysql | Change values in Presto's mysql.properties file. | 
| presto-connector-postgresql | Change values in Presto's postgresql.properties file. | 
| presto-connector-raptor | Change values in Presto's raptor.properties file. | 
| presto-connector-redis | Change values in Presto's redis.properties file. | 
| presto-connector-redshift | Change values in Presto's redshift.properties file. | 
| presto-connector-tpch | Change values in Presto's tpch.properties file. | 
| presto-connector-tpcds | Change values in Presto's tpcds.properties file. | 
| ranger-kms-dbks-site | Change values in dbks-site.xml file of Ranger KMS. | 
| ranger-kms-site | Change values in ranger-kms-site.xml file of Ranger KMS. | 
| ranger-kms-env | Change values in the Ranger KMS environment. | 
| ranger-kms-log4j | Change values in kms-log4j.properties file of Ranger KMS. | 
| ranger-kms-db-ca | Change values for CA file on S3 for MySQL SSL connection with Ranger KMS. | 
| recordserver-env | Change values in the EMR RecordServer environment. | 
| recordserver-conf | Change values in EMR RecordServer's server.properties file. | 
| recordserver-log4j | Change values in EMR RecordServer's log4j.properties file. | 
| spark | Amazon EMR-curated settings for Apache Spark. | 
| spark-defaults | Change values in Spark's spark-defaults.conf file. | 
| spark-env | Change values in the Spark environment. | 
| spark-hive-site | Change values in Spark's hive-site.xml file | 
| spark-log4j | Change values in Spark's log4j.properties file. | 
| spark-metrics | Change values in Spark's metrics.properties file. | 
| tez-site | Change values in Tez's tez-site.xml file. | 
| yarn-env | Change values in the YARN environment. | 
| yarn-site | Change values in YARN's yarn-site.xml file. | 
| zeppelin-env | Change values in the Zeppelin environment. | 
| zookeeper-config | Change values in ZooKeeper's zoo.cfg file. | 
| zookeeper-log4j | Change values in ZooKeeper's log4j.properties file. | 