

# Managing indexes in Amazon OpenSearch Service
<a name="managing-indices"></a>

After you add data to Amazon OpenSearch Service, you often need to reindex that data, work with index aliases, move an index to more cost-effective storage, or delete it altogether. This chapter covers storage tiering solutions, Index State Management, and other index operations. For information on the OpenSearch index APIs, see the [OpenSearch documentation](https://docs.opensearch.org/latest/opensearch/reindex-data/).

**Topics**
+ [OpenSearch Optimized Instances for Amazon OpenSearch Service domains](or1.md)
+ [Multi-tier storage for Amazon OpenSearch Service](multi-tier-storage.md)
+ [UltraWarm storage for Amazon OpenSearch Service](ultrawarm.md)
+ [Cold storage for Amazon OpenSearch Service](cold-storage.md)
+ [Index State Management in Amazon OpenSearch Service](ism.md)
+ [Summarizing indexes in Amazon OpenSearch Service with index rollups](rollup.md)
+ [Transforming indexes in Amazon OpenSearch Service](transforms.md)
+ [Cross-cluster replication for Amazon OpenSearch Service](replication.md)
+ [Migrating Amazon OpenSearch Service indexes using remote reindex](remote-reindex.md)
+ [Managing time-series data in Amazon OpenSearch Service with data streams](data-streams.md)

# OpenSearch Optimized Instances for Amazon OpenSearch Service domains
<a name="or1"></a>

The OpenSearch optimized instance family for Amazon OpenSearch Service is a cost-effective solution for storing large volumes of data. A domain with OpenSearch optimized instances uses local storage as the primary storage, with data copied synchronously to Amazon S3 as it arrives. This storage structure provides increased indexing throughput with high durability. OR1, OR2, and OM2 instances use Amazon Elastic Block Store (Amazon EBS) `gp3` or `io1` volumes locally, whereas OI2 instances use local NVMe disks. The OpenSearch optimized instance family also supports automatic data recovery in the event of failure. For information about OpenSearch optimized instance type options, see [Current generation instance types](supported-instance-types.md#latest-gen). 

If you're running indexing heavy operational analytics workloads such as log analytics, observability, or security analytics, you can benefit from the improved performance and compute efficiency of OpenSearch optimized instances. In addition, the automatic data recovery offered by OpenSearch optimized instances improves the overall reliability of your domain.

OpenSearch Service sends storage-related OpenSearch optimized instance metrics to Amazon CloudWatch. For a list of available metrics, see [OpenSearch Optimized Instances (OR1) metrics](managedomains-cloudwatchmetrics.md#managedomains-cloudwatchmetrics-or1).

OpenSearch optimized instances are available on-demand or with Reserved Instance pricing, with an hourly rate for the instances and storage provisioned in Amazon EBS and Amazon S3. 

**Topics**
+ [Limitations](#or1-considerations)
+ [Tuning for better ingestion throughput](#or1-ultrawarm-tuning)
+ [How OpenSearch optimized instances differ from other instances](#or1-optimized-instances)
+ [How OpenSearch optimized instances differ from UltraWarm instances](#or1-ultrawarm-differences)
+ [Provisioning a domain with OpenSearch optimized instances](#or1-using)

## Limitations
<a name="or1-considerations"></a>

Consider the following limitations when using OpenSearch optimized instances for your domain.
+ Newly created domains must be running OpenSearch version 2.11 or higher.
+ Existing domains must be running OpenSearch version 2.15 or higher.
+ Your domain must have encryption at rest enabled. For more information, see [Encryption of data at rest for Amazon OpenSearch Service](encryption-at-rest.md).
+ If your domain uses dedicated master nodes, they must use Graviton instances. For more information about dedicated master nodes, see [Dedicated master nodes in Amazon OpenSearch Service](managedomains-dedicatedmasternodes.md).
+ The refresh interval for indexes on OpenSearch optimized instances must be 10 seconds or higher. The default refresh interval for OpenSearch optimized instances is 10 seconds.
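For example, the following request sets a 30-second refresh interval on an index, which satisfies the 10-second minimum (`my-index` is a placeholder name):

```
PUT my-index/_settings
{
  "index": {
    "refresh_interval": "30s"
  }
}
```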

## Tuning for better ingestion throughput
<a name="or1-ultrawarm-tuning"></a>

To get the best indexing throughput from your OpenSearch optimized instances, we recommend that you do the following:
+ Use large bulk sizes to improve buffer utilization. The recommended size is 10 MB.
+ Use multiple clients to improve parallel processing performance.
+ Set your number of active primary shards to match the number of data nodes to maximize resource utilization.
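As a sketch of the last recommendation, on a domain with three data nodes you might create an index with three primary shards so that each data node hosts one active primary (`logs-2025-01` is a placeholder index name):

```
PUT logs-2025-01
{
  "settings": {
    "index": {
      "number_of_shards": 3,
      "number_of_replicas": 1
    }
  }
}
```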

## How OpenSearch optimized instances differ from other instances
<a name="or1-optimized-instances"></a>

OpenSearch optimized instances differ from non-optimized instances in the following ways:
+ For OpenSearch optimized instances, indexing is only performed on primary shards. 
+ If OpenSearch optimized instances are configured with replicas, the reported indexing rate might appear lower than the actual rate. For example, with one primary shard and one replica shard, the reported rate might be 1,000 when the actual indexing rate is 2,000.
+ OpenSearch optimized instances buffer operations before sending data to the remote store. This results in higher ingestion latencies. 
**Note**  
The `IndexingLatency` metric is not affected, as it doesn't include the time to sync the translog.
+ Replica shards can be a few seconds behind primary shards. You can monitor the lag using the `ReplicationLagMaxTime` Amazon CloudWatch metric.
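For example, you could retrieve recent values of `ReplicationLagMaxTime` with the AWS CLI. OpenSearch Service publishes metrics to the `AWS/ES` namespace; the domain name, account ID, and time range below are placeholders:

```
aws cloudwatch get-metric-statistics \
  --namespace AWS/ES \
  --metric-name ReplicationLagMaxTime \
  --dimensions Name=DomainName,Value=test-domain Name=ClientId,Value=123456789012 \
  --start-time 2025-01-01T00:00:00Z \
  --end-time 2025-01-01T01:00:00Z \
  --period 60 \
  --statistics Maximum
```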

## How OpenSearch optimized instances differ from UltraWarm instances
<a name="or1-ultrawarm-differences"></a>

OpenSearch Service provides UltraWarm instances that are a cost-effective way to store large amounts of read-only data. Both OpenSearch optimized and UltraWarm instances store data locally in Amazon EBS and remotely in Amazon S3. However, OpenSearch optimized and UltraWarm instances differ in several important ways:
+ OpenSearch optimized instances keep a copy of data in *both* your local and remote store. In UltraWarm instances, data is kept primarily in remote store to reduce storage costs. Depending on your usage patterns, data can be moved to local storage.
+ OpenSearch optimized instances are active and can accept read and write operations, whereas the data on UltraWarm instances is read-only until you manually move it back to hot storage.
+ UltraWarm relies on index snapshots for data durability. OpenSearch optimized instances, by comparison, perform replication and recovery behind the scenes. In the event of a red index, OpenSearch optimized instances will automatically restore missing shards from your remote storage in Amazon S3. The recovery time varies depending on the volume of data to be recovered. 

For more information about UltraWarm storage, see [UltraWarm storage for Amazon OpenSearch Service](ultrawarm.md).

## Provisioning a domain with OpenSearch optimized instances
<a name="or1-using"></a>

You can select OpenSearch optimized instances for your data nodes when you create a new domain with the AWS Management Console or the AWS Command Line Interface (AWS CLI). You can then index and query the data using your existing tools.

### Console
<a name="or1-console"></a>

1. Navigate to the Amazon OpenSearch Service console at [https://console.aws.amazon.com/aos/](https://console.aws.amazon.com/aos/).

1. In the left navigation pane, choose **Domains**. 

1. Choose **Create domain**.

1. In the **Number of data nodes** section, expand the **Instance family** menu and choose **OpenSearch optimized**.

1. Choose the instance type and other storage settings.

1. In the **Encryption** section, make sure that **Enable encryption of data at rest** is selected.

1. Configure the rest of your domain and choose **Create**.

### AWS CLI
<a name="or1-cli"></a>

To provision a domain that uses OpenSearch optimized storage using the AWS CLI, specify an instance type and size from the OpenSearch optimized family (OR1, OR2, OM2, or OI2) in `InstanceType`, such as `or1.2xlarge.search`.

The following example creates a domain with OR1 instances of size `2xlarge` and enables encryption at rest.

```
aws opensearch create-domain \
  --domain-name test-domain \
  --engine-version OpenSearch_2.11 \
  --cluster-config "InstanceType=or1.2xlarge.search,InstanceCount=3,DedicatedMasterEnabled=true,DedicatedMasterType=r6g.large.search,DedicatedMasterCount=3" \
  --ebs-options "EBSEnabled=true,VolumeType=gp3,VolumeSize=200" \
  --encryption-at-rest-options Enabled=true \
  --advanced-security-options "Enabled=true,InternalUserDatabaseEnabled=true,MasterUserOptions={MasterUserName=test-user,MasterUserPassword=test-password}" \
  --node-to-node-encryption-options Enabled=true \
  --domain-endpoint-options EnforceHTTPS=true \
  --access-policies '{"Version": "2012-10-17","Statement":[{"Effect":"Allow","Principal":{"AWS":"*"},"Action":"es:*","Resource":"arn:aws:es:us-east-1:account-id:domain/test-domain/*"}]}'
```

The following example creates a domain with OI2 instances of size `2xlarge` and enables encryption at rest. Note that OI2 instances don't require EBS configuration because they use local NVMe storage.

```
aws opensearch create-domain \
  --domain-name test-domain-oi2 \
  --engine-version OpenSearch_2.11 \
  --cluster-config "InstanceType=oi2.2xlarge.search,InstanceCount=3,DedicatedMasterEnabled=true,DedicatedMasterType=r6g.large.search,DedicatedMasterCount=3" \
  --encryption-at-rest-options Enabled=true \
  --advanced-security-options "Enabled=true,InternalUserDatabaseEnabled=true,MasterUserOptions={MasterUserName=test-user,MasterUserPassword=test-password}" \
  --node-to-node-encryption-options Enabled=true \
  --domain-endpoint-options EnforceHTTPS=true \
  --access-policies '{"Version": "2012-10-17","Statement":[{"Effect":"Allow","Principal":{"AWS":"*"},"Action":"es:*","Resource":"arn:aws:es:us-east-1:account-id:domain/test-domain-oi2/*"}]}'
```

# Multi-tier storage for Amazon OpenSearch Service
<a name="multi-tier-storage"></a>

Multi-tier storage for Amazon OpenSearch Service is an intelligent data management solution that optimizes both performance and costs by managing data across different storage tiers. This architecture allows organizations to efficiently balance the trade-off between performance and cost by keeping frequently accessed data in high-performance hot storage while moving less frequently accessed data to more cost-effective warm storage.

Amazon OpenSearch Service offers two architecture options for the hot/warm storage tiers:
+ **OpenSearch Multi-tier Storage Architecture**
  + Combines Amazon S3 with local instance storage
  + Powered by OpenSearch Optimized Instances
  + Supports write operations in warm tier
  + Supports seamless data migration between hot and warm tier
  + Available on OpenSearch 3.3 and above
  + Does not support Cold Tier
+ **UltraWarm-based Architecture**
  + Combines Amazon S3 with local instance storage
  + Powered by UltraWarm Instances
  + Optimized for read-only warm tier workloads
  + Available on Elasticsearch version 6.8 and above and all OpenSearch versions
  + Supports Cold Tier

**Note**  
This topic focuses only on the multi-tier architecture. For the UltraWarm storage architecture, see [UltraWarm](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/ultrawarm.html) and [Cold storage](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/cold-storage.html).

## Multi-tier storage architecture
<a name="multi-tier"></a>

**Topics**
+ [Key benefits](#multi-tier-benefits)
+ [Prerequisites](#multi-tier-prerequisites)
+ [Limitations](#multi-tier-limitations)
+ [Things to note](#things-to-note)
+ [Creating a multi-tier domain](#multi-tier-creating)
+ [Managing tier migrations](#multi-tier-managing)
+ [Security configuration](#multi-tier-security)
+ [Best practices](#multi-tier-best-practices)
+ [Monitoring metrics](#multi-tier-metrics)

### Key benefits
<a name="multi-tier-benefits"></a>
+ **Writable Warm:** Supports write operations on warm indices
+ **Seamless Migration:** Seamless movement of data across storage tiers
+ **Cost Optimization:** Reduce storage costs by moving less-active data to cost-effective warm storage
+ **Performance Enhancement:** Maintain high performance for frequently accessed data in the hot tier
+ **Flexible Data Management:** Choose the architecture that best fits your workload requirements
+ **Automated Management:** Simplified data lifecycle management across storage tiers

### Prerequisites
<a name="multi-tier-prerequisites"></a>
+ **Engine version:** OpenSearch **3.3 or later**
+ **Instance families:**
  + Hot nodes: OR1, OR2, OM2, or OI2
  + Warm nodes: OI2
+ **Security:** Node-to-node encryption, encryption at rest, HTTPS enforced

### Limitations
<a name="multi-tier-limitations"></a>
+ Available only on domains with OpenSearch optimized instances that don't already have UltraWarm enabled
+ No support for cold tier

### Things to note
<a name="things-to-note"></a>
+ Hot-to-warm migrations don't trigger a force merge in the multi-tier architecture. If needed, you can still orchestrate force merges using an Index State Management policy. 
+ Apart from indexing, warm nodes also perform background merge operations (similar to hot nodes).
+ All search requests on warm indices are routed to the primary shard; replicas serve reads only when the primary shard is down.
+ Automated snapshots for warm indices are also supported with this architecture.
+ Cross-cluster replication is only supported for hot indices.
+ Index APIs such as Shrink, Split, and Clone don't work with warm indices.
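To orchestrate force merges with Index State Management, you could use a minimal policy sketch like the following (the policy name and segment count are illustrative, not prescribed):

```
PUT _plugins/_ism/policies/warm_force_merge
{
  "policy": {
    "description": "Force merge indexes down to one segment",
    "default_state": "merge",
    "states": [
      {
        "name": "merge",
        "actions": [
          {
            "force_merge": {
              "max_num_segments": 1
            }
          }
        ],
        "transitions": []
      }
    ]
  }
}
```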

### Creating a multi-tier domain
<a name="multi-tier-creating"></a>

#### Step 1: Create the domain
<a name="multi-tier-step1"></a>

```
aws opensearch create-domain \
      --domain-name my-domain \
      --engine-version OpenSearch_3.3 \
      --cluster-config InstanceCount=3,InstanceType=or2.large.search,DedicatedMasterEnabled=true,DedicatedMasterType=m6g.large.search,DedicatedMasterCount=3,WarmEnabled=true,WarmCount=3,WarmType=oi2.2xlarge.search \
      --ebs-options EBSEnabled=true,VolumeType=gp2,VolumeSize=11 \
      --node-to-node-encryption-options Enabled=true \
      --encryption-at-rest-options Enabled=true \
      --domain-endpoint-options EnforceHTTPS=true,TLSSecurityPolicy=Policy-Min-TLS-1-2-2019-07 \
      --advanced-security-options '{"Enabled":true,"InternalUserDatabaseEnabled":true,"MasterUserOptions":{"MasterUserName":"user_name","MasterUserPassword":"your_pass"}}' \
      --access-policies '{"Version": "2012-10-17","Statement":[{"Effect":"Allow","Principal":"*","Action":"es:*","Resource":"*"}]}' \
      --region us-east-1
```

#### Step 2: Verify warm nodes
<a name="multi-tier-step2"></a>

```
aws opensearch describe-domain-nodes --domain-name my-domain --region us-east-1
```

Sample response (excerpt):

```
{
      "NodeType": "Warm",
      "InstanceType": "oi2.2xlarge.search",
      "NodeStatus": "Active"
    }
```

### Managing tier migrations
<a name="multi-tier-managing"></a>

Multi-tier domains support:
+ New Tiering APIs for a simplified experience
+ Legacy UltraWarm APIs for compatibility

#### New tiering APIs
<a name="multi-tier-new-apis"></a>

**Migrate an index to warm:**

```
curl -XPOST 'https://localhost:9200/index-name/_tier/warm'
```

Response:

```
{"acknowledged": true}
```

**Migrate an index to hot:**

```
curl -XPOST 'https://localhost:9200/index-name/_tier/hot'
```

Response:

```
{"acknowledged": true}
```

**Check tiering status:**

```
curl -XGET 'https://localhost:9200/index-name/_tier'
```

Example response:

```
{
      "tiering_status": {
         "index": "index-name",
         "state": "RUNNING_SHARD_RELOCATION",
         "source": "HOT",
         "target": "WARM",
         "start_time": 1745836500563,
         "shard_level_status": {
           "running": 0,
           "total": 100,
           "pending": 100,
           "succeeded": 0
         }
      }
    }
```

**Detailed shard view:**

```
curl 'https://localhost:9200/index1/_tier?detailed=true'
```

**List all ongoing migrations (text):**

```
curl 'https://localhost:9200/_tier/all'
```

**List all ongoing migrations (JSON):**

```
curl 'https://localhost:9200/_tier/all?format=json'
```

**Filter by target tier:**

```
curl 'https://localhost:9200/_tier/all?target=_warm'
```

#### Legacy UltraWarm APIs for compatibility
<a name="multi-tier-legacy-apis"></a>

**Migrate to warm:**

```
curl -XPOST localhost:9200/_ultrawarm/migration/index2/_warm
```

**Migrate to hot:**

```
curl -XPOST localhost:9200/_ultrawarm/migration/index2/_hot
```

**Check status:**

```
curl -XGET localhost:9200/_ultrawarm/migration/index2/_status
```

### Security configuration
<a name="multi-tier-security"></a>

If you enable multi-tier storage on a preexisting Amazon OpenSearch Service domain, the `storage_tiering_manager` role might not be defined on the domain. Non-admin users must be mapped to this role in order to manage warm indexes on domains using fine-grained access control. To manually create the `storage_tiering_manager` role, perform the following steps:

1. In OpenSearch Dashboards, go to **Security** and choose **Permissions**.

1. Choose **Create action group** and configure the following groups:    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/multi-tier-storage.html)

1. Choose **Roles** and **Create role**.

1. Name the role `storage_tiering_manager`.

1. For **Cluster permissions**, select `storage_tiering_cluster` and `cluster_monitor`.

1. For **Index**, type `*`.

1. For **Index permissions**, select `storage_tiering_index_read`, `storage_tiering_index_write` and `indices_monitor`.

1. Choose **Create**.

1. After you create the role, map it to any user or backend role that will manage multi-tier indexes.
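Alternatively, if you prefer the REST API over Dashboards for the mapping step, you can use the security plugin's roles-mapping endpoint (the user name and backend role ARN shown are placeholders):

```
PUT _plugins/_security/api/rolesmapping/storage_tiering_manager
{
  "users": ["tiering-user"],
  "backend_roles": ["arn:aws:iam::123456789012:role/TieringManager"]
}
```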

### Best practices
<a name="multi-tier-best-practices"></a>

As you implement multi-tier storage in your Amazon OpenSearch Service domains, consider the following best practices:
+ Regularly review your data access patterns to optimize tier allocation
+ Monitor performance metrics to ensure efficient resource utilization
+ Use the new tiering APIs for granular control over data migration

### Monitoring metrics
<a name="multi-tier-metrics"></a>

Multi-tier storage domains provide additional metrics for monitoring warm tier performance. These metrics include both existing UltraWarm metrics and new metrics specific to the OpenSearch Optimized Instances architecture:

#### New metrics
<a name="multi-tier-new-metrics"></a>


| Metric Name | Node Level Stats | Cluster Level Stats | Granularity | 
| --- | --- | --- | --- | 
| WarmIndexingLatency | Average | Average | 1 min | 
| WarmIndexingRate | Average | Average, Maximum, Sum | 1 min | 
| WarmThreadpoolIndexingQueue | Maximum | Sum, Maximum, Average | 1 min | 
| WarmThreadpoolIndexingRejected | Maximum | Sum | 1 min | 
| WarmThreadpoolIndexingThreads | Maximum | Sum, Average | 1 min | 

# UltraWarm storage for Amazon OpenSearch Service
<a name="ultrawarm"></a>

UltraWarm provides a cost-effective way to store large amounts of read-only data on Amazon OpenSearch Service. Standard data nodes use "hot" storage, which takes the form of instance stores or Amazon EBS volumes attached to each node. Hot storage provides the fastest possible performance for indexing and searching new data.

Rather than attached storage, UltraWarm nodes use Amazon S3 and a sophisticated caching solution to improve performance. For indexes that you are not actively writing to, query less frequently, and don't need the same performance from, UltraWarm offers significantly lower costs per GiB of data. Because warm indexes are read-only unless you return them to hot storage, UltraWarm is best-suited to immutable data, such as logs.

In OpenSearch, warm indexes behave just like any other index. You can query them using the same APIs or use them to create visualizations in OpenSearch Dashboards.

**Topics**
+ [Prerequisites](#ultrawarm-pp)
+ [UltraWarm storage requirements and performance considerations](#ultrawarm-calc)
+ [UltraWarm pricing](#ultrawarm-pricing)
+ [Enabling UltraWarm](#ultrawarm-enable)
+ [Migrating indexes to UltraWarm storage](#ultrawarm-migrating)
+ [Automating migrations](#ultrawarm-ism)
+ [Migration tuning](#ultrawarm-settings)
+ [Cancelling migrations](#ultrawarm-cancel)
+ [Listing hot and warm indexes](#ultrawarm-api)
+ [Returning warm indexes to hot storage](#ultrawarm-migrating-back)
+ [Restoring warm indexes from snapshots](#ultrawarm-snapshot)
+ [Manual snapshots of warm indexes](#ultrawarm-manual-snapshot)
+ [Migrating warm indexes to cold storage](#ultrawarm-cold)
+ [Best practices for KNN indexes](#ultrawarm-recommendations)
+ [Disabling UltraWarm](#ultrawarm-disable)

## Prerequisites
<a name="ultrawarm-pp"></a>

UltraWarm has a few important prerequisites:
+ UltraWarm requires Elasticsearch 6.8 or higher, or any OpenSearch version.
+ To use warm storage, domains must have [dedicated master nodes](managedomains-dedicatedmasternodes.md).
+ When using a [Multi-AZ with Standby](managedomains-multiaz.md#managedomains-za-standby) domain, the number of warm nodes must be a multiple of the number of Availability Zones being used.
+ If your domain uses a T2 or T3 instance type for your data nodes, you can't use warm storage.
+ If your index uses approximate k-NN (`"index.knn":true`), you can move it to warm storage from version 2.17 and later. Domains on versions earlier than 2.17 can upgrade to 2.17 to use this functionality, but KNN indices created on versions earlier than 2.x can't migrate to UltraWarm. 
+ If the domain uses [fine-grained access control](fgac.md), users must be mapped to the `ultrawarm_manager` role in OpenSearch Dashboards to make UltraWarm API calls.

**Note**  
The `ultrawarm_manager` role might not be defined on some preexisting OpenSearch Service domains. If you don't see the role in Dashboards, you need to [manually create it](#ultrawarm-create-role).

### Configure permissions
<a name="ultrawarm-create-role"></a>

If you enable UltraWarm on a preexisting OpenSearch Service domain, the `ultrawarm_manager` role might not be defined on the domain. Non-admin users must be mapped to this role in order to manage warm indexes on domains using fine-grained access control. To manually create the `ultrawarm_manager` role, perform the following steps:

1. In OpenSearch Dashboards, go to **Security** and choose **Permissions**.

1. Choose **Create action group** and configure the following groups:     
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/ultrawarm.html)

1. Choose **Roles** and **Create role**.

1. Name the role `ultrawarm_manager`.

1. For **Cluster permissions**, select `ultrawarm_cluster` and `cluster_monitor`.

1. For **Index**, type `*`.

1. For **Index permissions**, select `ultrawarm_index_read`, `ultrawarm_index_write`, and `indices_monitor`.

1. Choose **Create**.

1. After you create the role, [map it](fgac.md#fgac-mapping) to any user or backend role that will manage UltraWarm indexes.

## UltraWarm storage requirements and performance considerations
<a name="ultrawarm-calc"></a>

As covered in [Calculating storage requirements](bp-storage.md), data in hot storage incurs significant overhead: replicas, Linux reserved space, and OpenSearch Service reserved space. For example, a 20 GiB primary shard with one replica shard requires roughly 58 GiB of hot storage.
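That figure follows from the minimum-storage formula in that topic (source data × (1 + number of replicas) × 1.45), which accounts for replicas plus the reserved-space overhead:

```shell
# Hot storage needed for a 20 GiB primary shard with one replica,
# using the formula: source data x (1 + replicas) x 1.45
awk 'BEGIN { printf "%.0f GiB\n", 20 * (1 + 1) * 1.45 }'
```

This prints `58 GiB`, the hot-storage estimate quoted above.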

Because it uses Amazon S3, UltraWarm incurs none of this overhead. When calculating UltraWarm storage requirements, you consider only the size of the primary shards. The durability of data in S3 removes the need for replicas, and S3 abstracts away any operating system or service considerations. That same 20 GiB shard requires 20 GiB of warm storage. If you provision an `ultrawarm1.large.search` instance, you can use all 20 TiB of its maximum storage for primary shards. See [UltraWarm storage quotas](limits.md#limits-ultrawarm) for a summary of instance types and the maximum amount of storage that each can address.

With UltraWarm, we still recommend a maximum shard size of 50 GiB. The [number of CPU cores and amount of RAM allocated to each UltraWarm instance type](#ultrawarm-pricing) gives you an idea of the number of shards they can simultaneously search. Note that while only primary shards count toward UltraWarm storage in S3, OpenSearch Dashboards and `_cat/indices` still report UltraWarm index size as the *total* of all primary and replica shards.

For example, each `ultrawarm1.medium.search` instance has two CPU cores and can address up to 1.5 TiB of storage on S3. Two of these instances have a combined 3 TiB of storage, which works out to approximately 62 shards if each shard is 50 GiB. If a request to the cluster only searches four of these shards, performance might be excellent. If the request is broad and searches all 62 of them, the four CPU cores might struggle to perform the operation. Monitor the `WarmCPUUtilization` and `WarmJVMMemoryPressure` [UltraWarm metrics](managedomains-cloudwatchmetrics.md#managedomains-cloudwatchmetrics-uw) to understand how the instances handle your workloads.

If your searches are broad or frequent, consider leaving the indexes in hot storage. Just like any other OpenSearch workload, the most important step to determining if UltraWarm meets your needs is to perform representative client testing using a realistic dataset.

## UltraWarm pricing
<a name="ultrawarm-pricing"></a>

With hot storage, you pay for what you provision. Some instances require an attached Amazon EBS volume, while others include an instance store. Whether that storage is empty or full, you pay the same price.

With UltraWarm storage, you pay for what you use. An `ultrawarm1.large.search` instance can address up to 20 TiB of storage on S3, but if you store only 1 TiB of data, you're only billed for 1 TiB of data. Like all other node types, you also pay an hourly rate for each UltraWarm node. For more information, see [Pricing for Amazon OpenSearch Service](what-is.md#pricing).

## Enabling UltraWarm
<a name="ultrawarm-enable"></a>

The console is the simplest way to create a domain that uses warm storage. While creating the domain, choose **Enable warm data nodes** and the number of warm nodes that you want. The same basic process works on existing domains, provided they meet the [prerequisites](#ultrawarm-pp). Even after the domain state changes from **Processing** to **Active**, UltraWarm might not be available to use for several hours.

When using a Multi-AZ with Standby domain, the number of warm nodes must be a multiple of the number of Availability Zones. For more information, see [Multi-AZ with Standby](managedomains-multiaz.md#managedomains-za-standby).

You can also use the [AWS CLI](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/opensearch/index.html) or [configuration API](https://docs.aws.amazon.com/opensearch-service/latest/APIReference/Welcome.html) to enable UltraWarm, specifically the `WarmEnabled`, `WarmCount`, and `WarmType` options in `ClusterConfig`.
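For example, the following command sketch enables UltraWarm on an existing domain using those options (the domain name, warm node count, and warm instance type are placeholders; the domain must meet the prerequisites above):

```
aws opensearch update-domain-config \
  --domain-name my-domain \
  --cluster-config WarmEnabled=true,WarmCount=2,WarmType=ultrawarm1.medium.search
```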

**Note**  
Domains support a maximum number of warm nodes. For details, see [Amazon OpenSearch Service quotas](limits.md).

### Sample CLI command
<a name="ultrawarm-sample-cli"></a>

The following AWS CLI command creates a domain with three data nodes, three dedicated master nodes, six warm nodes, and fine-grained access control enabled:

```
aws opensearch create-domain \
  --domain-name my-domain \
  --engine-version OpenSearch_1.0 \
  --cluster-config InstanceCount=3,InstanceType=r6g.large.search,DedicatedMasterEnabled=true,DedicatedMasterType=r6g.large.search,DedicatedMasterCount=3,ZoneAwarenessEnabled=true,ZoneAwarenessConfig={AvailabilityZoneCount=3},WarmEnabled=true,WarmCount=6,WarmType=ultrawarm1.medium.search \
  --ebs-options EBSEnabled=true,VolumeType=gp2,VolumeSize=11 \
  --node-to-node-encryption-options Enabled=true \
  --encryption-at-rest-options Enabled=true \
  --domain-endpoint-options EnforceHTTPS=true,TLSSecurityPolicy=Policy-Min-TLS-1-2-2019-07 \
  --advanced-security-options Enabled=true,InternalUserDatabaseEnabled=true,MasterUserOptions='{MasterUserName=master-user,MasterUserPassword=master-password}' \
  --access-policies '{"Version": "2012-10-17","Statement":[{"Effect":"Allow","Principal":{"AWS":["123456789012"]},"Action":["es:*"],"Resource":"arn:aws:es:us-east-1:123456789012:domain/my-domain/*"}]}' \
  --region us-east-1
```

For detailed information, see the [AWS CLI Command Reference](https://docs.aws.amazon.com/cli/latest/reference/).

### Sample configuration API request
<a name="ultrawarm-sample-config-api"></a>

The following request to the configuration API creates a domain with three data nodes, three dedicated master nodes, and six warm nodes with fine-grained access control enabled and a restrictive access policy:

```
POST https://es.us-east-2.amazonaws.com/2021-01-01/opensearch/domain
{
  "ClusterConfig": {
    "InstanceCount": 3,
    "InstanceType": "r6g.large.search",
    "DedicatedMasterEnabled": true,
    "DedicatedMasterType": "r6g.large.search",
    "DedicatedMasterCount": 3,
    "ZoneAwarenessEnabled": true,
    "ZoneAwarenessConfig": {
      "AvailabilityZoneCount": 3
    },
    "WarmEnabled": true,
    "WarmCount": 6,
    "WarmType": "ultrawarm1.medium.search"
  },
  "EBSOptions": {
    "EBSEnabled": true,
    "VolumeType": "gp2",
    "VolumeSize": 11
  },
  "EncryptionAtRestOptions": {
    "Enabled": true
  },
  "NodeToNodeEncryptionOptions": {
    "Enabled": true
  },
  "DomainEndpointOptions": {
    "EnforceHTTPS": true,
    "TLSSecurityPolicy": "Policy-Min-TLS-1-2-2019-07"
  },
   "AdvancedSecurityOptions": {
    "Enabled": true,
    "InternalUserDatabaseEnabled": true,
    "MasterUserOptions": {
      "MasterUserName": "master-user",
      "MasterUserPassword": "master-password"
    }
  },
  "EngineVersion": "OpenSearch_1.0",
  "DomainName": "my-domain",
  "AccessPolicies": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"AWS\":[\"123456789012\"]},\"Action\":[\"es:*\"],\"Resource\":\"arn:aws:es:us-east-1:123456789012:domain/my-domain/*\"}]}"
}
```

For detailed information, see the [Amazon OpenSearch Service API Reference](https://docs.aws.amazon.com/opensearch-service/latest/APIReference/Welcome.html).

## Migrating indexes to UltraWarm storage
<a name="ultrawarm-migrating"></a>

If you finished writing to an index and no longer need the fastest possible search performance, migrate it from hot to UltraWarm:

```
POST _ultrawarm/migration/my-index/_warm
```

Then check the status of the migration:

```
GET _ultrawarm/migration/my-index/_status

{
  "migration_status": {
    "index": "my-index",
    "state": "RUNNING_SHARD_RELOCATION",
    "migration_type": "HOT_TO_WARM",
    "shard_level_status": {
      "running": 0,
      "total": 5,
      "pending": 3,
      "failed": 0,
      "succeeded": 2
    }
  }
}
```

Index health must be green to perform a migration. If you migrate several indexes in quick succession, you can get a summary of all migrations in plaintext, similar to the `_cat` API:

```
GET _ultrawarm/migration/_status?v

index    migration_type state
my-index HOT_TO_WARM    RUNNING_SHARD_RELOCATION
```

OpenSearch Service migrates one index at a time to UltraWarm. You can have up to 200 migrations in the queue. Any request that exceeds the limit will be rejected. To check the current number of migrations in the queue, monitor the `HotToWarmMigrationQueueSize` [metric](managedomains-cloudwatchmetrics.md#managedomains-cloudwatchmetrics-uw). Indexes remain available throughout the migration process—no downtime.

The migration process has the following states:

```
PENDING_INCREMENTAL_SNAPSHOT
RUNNING_INCREMENTAL_SNAPSHOT
FAILED_INCREMENTAL_SNAPSHOT
PENDING_FORCE_MERGE
RUNNING_FORCE_MERGE
FAILED_FORCE_MERGE
PENDING_FULL_SNAPSHOT
RUNNING_FULL_SNAPSHOT
FAILED_FULL_SNAPSHOT
PENDING_SHARD_RELOCATION
RUNNING_SHARD_RELOCATION
FINISHED_SHARD_RELOCATION
```

As these states indicate, migrations might fail during snapshots, shard relocations, or force merges. Failures during snapshots or shard relocation are typically due to node failures or S3 connectivity issues. Lack of disk space is usually the underlying cause of force merge failures.
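In a monitoring script, you can map a failed state back to its likely cause. This Python sketch summarizes the troubleshooting guidance above; the hint strings are illustrative, not service output:

```python
# Likely causes of FAILED_* states, keyed by migration stage.
FAILURE_HINTS = {
    "INCREMENTAL_SNAPSHOT": "check node health and S3 connectivity",
    "FULL_SNAPSHOT": "check node health and S3 connectivity",
    "SHARD_RELOCATION": "check node health and S3 connectivity",
    "FORCE_MERGE": "check available disk space",
}


def classify_state(state: str) -> tuple[str, str]:
    """Split a migration state like FAILED_FORCE_MERGE into its
    status prefix and stage, e.g. ("FAILED", "FORCE_MERGE")."""
    status, _, stage = state.partition("_")
    return status, stage


def failure_hint(state: str):
    """Return a troubleshooting hint for FAILED_* states, else None."""
    status, stage = classify_state(state)
    return FAILURE_HINTS.get(stage) if status == "FAILED" else None
```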

After a migration finishes, the same `_status` request returns an error. If you check the index at that time, you can see some settings that are unique to warm indexes:

```
GET my-index/_settings

{
  "my-index": {
    "settings": {
      "index": {
        "refresh_interval": "-1",
        "auto_expand_replicas": "false",
        "provided_name": "my-index",
        "creation_date": "1599241458998",
        "unassigned": {
          "node_left": {
            "delayed_timeout": "5m"
          }
        },
        "number_of_replicas": "1",
        "uuid": "GswyCdR0RSq0SJYmzsIpiw",
        "version": {
          "created": "7070099"
        },
        "routing": {
          "allocation": {
            "require": {
              "box_type": "warm"
            }
          }
        },
        "number_of_shards": "5",
        "merge": {
          "policy": {
            "max_merge_at_once_explicit": "50"
          }
        }
      }
    }
  }
}
```
+ `number_of_replicas`, in this case, is the number of passive replicas, which don't consume disk space.
+ `routing.allocation.require.box_type` specifies that the index should use warm nodes rather than standard data nodes.
+ `merge.policy.max_merge_at_once_explicit` specifies the number of segments to simultaneously merge during the migration.

Indexes in warm storage are read-only unless you [return them to hot storage](#ultrawarm-migrating-back), which makes UltraWarm best-suited to immutable data, such as logs. You can query the indexes and delete them, but you can't add, update, or delete individual documents. If you try, you might encounter the following error:

```
{
  "error" : {
    "root_cause" : [
      {
        "type" : "cluster_block_exception",
        "reason" : "index [indexname] blocked by: [TOO_MANY_REQUESTS/12/disk usage exceeded flood-stage watermark, index has read-only-allow-delete block];"
      }
    ],
    "type" : "cluster_block_exception",
    "reason" : "index [indexname] blocked by: [TOO_MANY_REQUESTS/12/disk usage exceeded flood-stage watermark, index has read-only-allow-delete block];"
  },
  "status" : 429
}
```

## Automating migrations
<a name="ultrawarm-ism"></a>

We recommend using [Index State Management in Amazon OpenSearch Service](ism.md) to automate the migration process after an index reaches a certain age or meets other conditions. See the [sample policy](ism.md#ism-example-cold) that demonstrates this workflow.

## Migration tuning
<a name="ultrawarm-settings"></a>

Index migrations to UltraWarm storage require a force merge. Each OpenSearch index is composed of some number of shards, and each shard is composed of some number of Lucene segments. The force merge operation purges documents that were marked for deletion and conserves disk space. By default, UltraWarm merges indexes into one segment, except for k-NN indexes, which use a default value of 20.

You can set this value as high as 1,000 segments using the `index.ultrawarm.migration.force_merge.max_num_segments` setting. Higher values speed up the migration process, but increase query latency for the warm index after the migration finishes. To change the setting, make the following request:

```
PUT my-index/_settings
{
  "index": {
    "ultrawarm": {
      "migration": {
        "force_merge": {
          "max_num_segments": 1
        }
      }
    }
  }
}
```
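If you set this value from a script, a helper can build the request body and enforce the documented range. A minimal Python sketch (it only builds the JSON body shown above; sending it to your domain is up to you):

```python
def force_merge_settings(max_num_segments: int) -> dict:
    """Build the PUT <index>/_settings body that sets the UltraWarm
    force-merge segment count. Valid values are 1 through 1,000."""
    if not 1 <= max_num_segments <= 1000:
        raise ValueError("max_num_segments must be between 1 and 1,000")
    return {
        "index": {
            "ultrawarm": {
                "migration": {
                    "force_merge": {"max_num_segments": max_num_segments}
                }
            }
        }
    }
```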

To check how long this stage of the migration process takes, monitor the `HotToWarmMigrationForceMergeLatency` [metric](managedomains-cloudwatchmetrics.md#managedomains-cloudwatchmetrics-uw).

## Canceling migrations
<a name="ultrawarm-cancel"></a>

UltraWarm handles migrations sequentially, in a queue. If a migration is in the queue, but has not yet started, you can remove it from the queue using the following request:

```
POST _ultrawarm/migration/_cancel/my-index
```

If your domain uses fine-grained access control, you must have the `indices:admin/ultrawarm/migration/cancel` permission to make this request.

## Listing hot and warm indexes
<a name="ultrawarm-api"></a>

UltraWarm adds two additional options, similar to `_all`, to help manage hot and warm indexes. For a list of all warm or hot indexes, make the following requests:

```
GET _warm
GET _hot
```

You can use these options in other requests that specify indexes, such as:

```
_cat/indices/_warm
_cluster/state/_all/_hot
```

## Returning warm indexes to hot storage
<a name="ultrawarm-migrating-back"></a>

If you need to write to an index again, migrate it back to hot storage:

```
POST _ultrawarm/migration/my-index/_hot
```

You can have up to 10 queued migrations from warm to hot storage at a time. OpenSearch Service processes migration requests one at a time, in the order that they were queued. To check the current number, monitor the `WarmToHotMigrationQueueSize` [metric](managedomains-cloudwatchmetrics.md#managedomains-cloudwatchmetrics-uw).

After the migration finishes, check the index settings to make sure they meet your needs. Indexes return to hot storage with one replica.

## Restoring warm indexes from snapshots
<a name="ultrawarm-snapshot"></a>

In addition to the standard repository for automated snapshots, UltraWarm adds a second repository for warm indexes, `cs-ultrawarm`. Each snapshot in this repository contains only one index. If you delete a warm index, its snapshot remains in the `cs-ultrawarm` repository for 14 days, just like any other automated snapshot.

When you restore a snapshot from `cs-ultrawarm`, it restores to warm storage, not hot storage. Snapshots in the `cs-automated` and `cs-automated-enc` repositories restore to hot storage.

**To restore an UltraWarm snapshot to warm storage**

1. Identify the latest snapshot that contains the index you want to restore:

   ```
   GET _snapshot/cs-ultrawarm/_all?verbose=false
   
   {
     "snapshots": [{
       "snapshot": "snapshot-name",
       "version": "1.0",
       "indices": [
         "my-index"
       ]
     }]
   }
   ```
**Note**  
By default, the `GET _snapshot/<repo>` operation returns verbose information for each snapshot in a repository, such as its start time, end time, and duration, which it retrieves from the files of each snapshot. If you only need the name and index information of each snapshot, we recommend adding the `verbose=false` parameter to minimize processing time and prevent timeouts.

1. If the index already exists, delete it:

   ```
   DELETE my-index
   ```

   If you don't want to delete the index, [return it to hot storage](#ultrawarm-migrating-back) and [reindex](https://docs.opensearch.org/latest/opensearch/reindex-data/) it.

1. Restore the snapshot:

   ```
   POST _snapshot/cs-ultrawarm/snapshot-name/_restore
   ```

   UltraWarm ignores any index settings you specify in this restore request, but you can specify options like `rename_pattern` and `rename_replacement`. For a summary of OpenSearch snapshot restore options, see the [OpenSearch documentation](https://docs.opensearch.org/latest/opensearch/snapshot-restore/#restore-snapshots).
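When a repository holds many snapshots, step 1 amounts to filtering the listing for your index. The following Python sketch operates on a response shaped like the one in step 1 (it doesn't query a live domain):

```python
def snapshots_containing(listing: dict, index: str) -> list[str]:
    """Given a GET _snapshot/cs-ultrawarm/_all?verbose=false response,
    return the names of snapshots that contain the given index."""
    return [
        snapshot["snapshot"]
        for snapshot in listing.get("snapshots", [])
        if index in snapshot.get("indices", [])
    ]


listing = {
    "snapshots": [
        {"snapshot": "snapshot-name", "version": "1.0", "indices": ["my-index"]}
    ]
}
print(snapshots_containing(listing, "my-index"))
```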

## Manual snapshots of warm indexes
<a name="ultrawarm-manual-snapshot"></a>

You *can* take manual snapshots of warm indexes, but we don't recommend it. The automated `cs-ultrawarm` repository already contains a snapshot for each warm index, taken during the migration, at no additional charge.

By default, OpenSearch Service does not include warm indexes in manual snapshots. For example, the following call only includes hot indexes:

```
PUT _snapshot/my-repository/my-snapshot
```

If you choose to take manual snapshots of warm indexes, several important considerations apply.
+ You can't mix hot and warm indexes. For example, the following request fails:

  ```
  PUT _snapshot/my-repository/my-snapshot
  {
    "indices": "warm-index-1,hot-index-1",
    "include_global_state": false
  }
  ```

  Wildcard (`*`) statements also fail if they match a mix of hot and warm indexes.
+ You can only include one warm index per snapshot. For example, the following request fails:

  ```
  PUT _snapshot/my-repository/my-snapshot
  {
    "indices": "warm-index-1,warm-index-2,other-warm-indices-*",
    "include_global_state": false
  }
  ```

  This request succeeds:

  ```
  PUT _snapshot/my-repository/my-snapshot
  {
    "indices": "warm-index-1",
    "include_global_state": false
  }
  ```
+ Manual snapshots always restore to hot storage, even if they originally included a warm index.
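These constraints are easy to check before you send a manual snapshot request. A Python sketch follows; the `tier` lookup is hypothetical (you could build one from the `_hot` and `_warm` index lists):

```python
def validate_snapshot_indexes(indexes: list[str], tier: dict[str, str]) -> None:
    """Raise ValueError if a manual snapshot request would violate the
    constraints above: no mixing hot and warm indexes, and at most one
    warm index per snapshot. `tier` maps index name -> "hot" or "warm"."""
    warm = [i for i in indexes if tier.get(i) == "warm"]
    hot = [i for i in indexes if tier.get(i) == "hot"]
    if warm and hot:
        raise ValueError("can't mix hot and warm indexes in one snapshot")
    if len(warm) > 1:
        raise ValueError("only one warm index is allowed per snapshot")
```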

## Migrating warm indexes to cold storage
<a name="ultrawarm-cold"></a>

If you have data in UltraWarm that you query infrequently, consider migrating it to cold storage. Cold storage is meant for data that you access only occasionally or that's no longer in active use. You can't read from or write to cold indexes, but you can migrate them back to warm storage at no cost whenever you need to query them. For instructions, see [Migrating indexes to cold storage](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/cold-storage.html#coldstorage-migrating).

## Best practices for KNN indexes
<a name="ultrawarm-recommendations"></a>
+ The UltraWarm and cold tiers are available for all k-NN index engine types. We recommend them for k-NN indexes that use the Lucene engine or disk-optimized vector search, neither of which requires the graph data to be fully loaded into off-heap memory. If you use them with native in-memory engines like FAISS and NMSLIB, you must account for the graph size of the shards that are actively searched, and provision your UltraWarm instances (preferably of the `uw.large` instance type) accordingly. For example, if you have two `uw.large` instances configured, each has approximately `knn.memory.circuit_breaker.limit * 61` GiB of off-heap memory available. You get optimal performance when all your warm queries target shards whose cumulative graph size doesn't exceed the available off-heap memory. If the available memory is lower than the graphs require, latency increases because of evictions and waiting for off-heap memory to become available. For this reason, we don't recommend `uw.medium` instances for in-memory engines, or for high search throughput use cases, regardless of engine.
+ k-NN indexes migrating to UltraWarm aren't force-merged to a single segment. This avoids out-of-memory issues on the hot and warm nodes that can occur when graphs become too big for in-memory engines. Because of the increased number of segments per shard, migrated indexes might consume more local cache space, allowing fewer indexes to migrate to the warm tier. You can still force-merge indexes to a single segment by overriding the `index.ultrawarm.migration.force_merge.max_num_segments` setting before migrating them to the warm tier. For more information, see [Migration tuning](#ultrawarm-settings).
+ If you have indexes that are searched infrequently and don't serve a latency-sensitive workload, consider migrating them to the UltraWarm tier. This lets you scale down your hot tier compute and have the UltraWarm tier handle queries on these lower-priority indexes. It also isolates the resources that queries on low- and high-priority indexes consume, so they don't impact each other.

## Disabling UltraWarm
<a name="ultrawarm-disable"></a>

The console is the simplest way to disable UltraWarm. Choose the domain, **Actions**, and **Edit cluster configuration**. Deselect **Enable warm data nodes** and choose **Save changes**. You can also use the `WarmEnabled` option in the AWS CLI and configuration API.

Before you disable UltraWarm, you must either [delete](https://opensearch.org/docs/latest/opensearch/rest-api/index-apis/delete-index/) all warm indexes or [migrate them back to hot storage](#ultrawarm-migrating-back). After warm storage is empty, wait five minutes before attempting to disable UltraWarm.

# Cold storage for Amazon OpenSearch Service
<a name="cold-storage"></a>

Cold storage lets you store any amount of infrequently accessed or historical data on your Amazon OpenSearch Service domain and analyze it on demand, at a lower cost than other storage tiers. Cold storage is appropriate if you need to do periodic research or forensic analysis on your older data. Practical examples of data suitable for cold storage include infrequently accessed logs, data that must be preserved to meet compliance requirements, or logs that have historical value. 

Similar to [UltraWarm](ultrawarm.md) storage, cold storage is backed by Amazon S3. When you need to query cold data, you can selectively attach it to existing UltraWarm nodes. You can manage the migration and lifecycle of your cold data manually or with Index State Management policies.

**Topics**
+ [Prerequisites](#coldstorage-pp)
+ [Cold storage requirements and performance considerations](#coldstorage-calc)
+ [Cold storage pricing](#coldstorage-pricing)
+ [Enabling cold storage](#coldstorage-enable)
+ [Managing cold indexes in OpenSearch Dashboards](#coldstorage-dashboards)
+ [Migrating indexes to cold storage](#coldstorage-migrating)
+ [Automating migrations to cold storage](#coldstorage-ism)
+ [Canceling migrations to cold storage](#coldstorage-cancel)
+ [Listing cold indexes](#coldstorage-list)
+ [Migrating cold indexes to warm storage](#coldstorage-migrating-back)
+ [Restoring cold indexes from snapshots](#cold-snapshot)
+ [Canceling migrations from cold to warm storage](#coldtowarm-cancel)
+ [Updating cold index metadata](#cold-update-metadata)
+ [Deleting cold indexes](#cold-delete)
+ [Disabling cold storage](#coldstorage-disable)

## Prerequisites
<a name="coldstorage-pp"></a>

Cold storage has the following prerequisites:
+ Cold storage requires OpenSearch or Elasticsearch version 7.9 or later.
+ To enable cold storage on an OpenSearch Service domain, you must also enable warm storage on the same domain.
+ To use cold storage, domains must have [dedicated master nodes](managedomains-dedicatedmasternodes.md).
+ If your domain uses a T2 or T3 instance type for your data nodes, you can't use cold storage.
+ If your index uses approximate k-NN (`"index.knn": true`), you can move it to cold storage on version 2.17 and later. Domains on earlier versions can upgrade to 2.17 to use this functionality, but k-NN indexes created on versions earlier than 2.x can't migrate to cold storage.
+ If the domain uses [fine-grained access control](fgac.md), non-admin users must be [mapped](fgac.md#fgac-mapping) to the `cold_manager` role in OpenSearch Dashboards in order to manage cold indexes.

**Note**  
The `cold_manager` role might not exist on some preexisting OpenSearch Service domains. If you don't see the role in Dashboards, you need to [manually create it](#coldstorage-create-role).

### Configure permissions
<a name="coldstorage-create-role"></a>

If you enable cold storage on a preexisting OpenSearch Service domain, the `cold_manager` role might not be defined on the domain. If the domain uses [fine-grained access control](fgac.md), non-admin users must be mapped to this role in order to manage cold indexes. To manually create the `cold_manager` role, perform the following steps:

1. In OpenSearch Dashboards, go to **Security** and choose **Permissions**.

1. Choose **Create action group** and configure the following groups:     
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/cold-storage.html)

1. Choose **Roles** and **Create role**.

1. Name the role **cold_manager**.

1. For **Cluster permissions**, choose the `cold_cluster` group you created.

1. For **Index**, enter `*`.

1. For **Index permissions**, choose the `cold_index` group you created.

1. Choose **Create**.

1. After you create the role, [map it](fgac.md#fgac-mapping) to any user or backend role that manages cold indexes.

## Cold storage requirements and performance considerations
<a name="coldstorage-calc"></a>

Because cold storage uses Amazon S3, it incurs none of the overhead of hot storage, such as replicas, Linux reserved space, and OpenSearch Service reserved space. Cold storage doesn't have specific instance types because it doesn't have any compute capacity attached to it. You can store any amount of data in cold storage. Monitor the `ColdStorageSpaceUtilization` metric in Amazon CloudWatch to see how much cold storage space you're using.

## Cold storage pricing
<a name="coldstorage-pricing"></a>

Similar to UltraWarm storage, with cold storage you only pay for data storage. There's no compute cost for cold data, and you aren't billed if there's no data in cold storage.

You don't incur any transfer charges when moving data between cold and warm storage. While indexes are being migrated between warm and cold storage, you continue to pay for only one copy of the index. After the migration completes, the index is billed according to the storage tier it was migrated to. For more information about cold storage pricing, see [Amazon OpenSearch Service pricing](https://aws.amazon.com/opensearch-service/pricing/).

## Enabling cold storage
<a name="coldstorage-enable"></a>

The console is the simplest way to create a domain that uses cold storage. While you create the domain, first choose **Enable warm data nodes**, because you must enable warm storage on the same domain. Then, choose **Enable cold storage**. 

The same process works on existing domains as long as you meet the [prerequisites](#coldstorage-pp). Even after the domain state changes from **Processing** to **Active**, cold storage might not be available for several hours.

You can also use the [AWS CLI](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/opensearch/index.html) or [configuration API](https://docs.aws.amazon.com/opensearch-service/latest/APIReference/Welcome.html) to enable cold storage.

### Sample CLI command
<a name="coldstorage-sample-cli"></a>

The following AWS CLI command creates a domain with three data nodes, three dedicated master nodes, cold storage enabled, and fine-grained access control enabled:

```
aws opensearch create-domain \
  --domain-name my-domain \
  --engine-version OpenSearch_1.0 \
  --cluster-config ColdStorageOptions={Enabled=true},WarmEnabled=true,WarmCount=4,WarmType=ultrawarm1.medium.search,InstanceType=r6g.large.search,DedicatedMasterEnabled=true,DedicatedMasterType=r6g.large.search,DedicatedMasterCount=3,InstanceCount=3 \
  --ebs-options EBSEnabled=true,VolumeType=gp2,VolumeSize=11 \
  --node-to-node-encryption-options Enabled=true \
  --encryption-at-rest-options Enabled=true \
  --domain-endpoint-options EnforceHTTPS=true,TLSSecurityPolicy=Policy-Min-TLS-1-2-2019-07 \
  --advanced-security-options Enabled=true,InternalUserDatabaseEnabled=true,MasterUserOptions='{MasterUserName=master-user,MasterUserPassword=master-password}' \
  --region us-east-2
```

For detailed information, see the [AWS CLI Command Reference](https://docs.aws.amazon.com/cli/latest/reference/).

### Sample configuration API request
<a name="coldstorage-sample-config-api"></a>

The following request to the configuration API creates a domain with three data nodes, three dedicated master nodes, cold storage enabled, and fine-grained access control enabled:

```
POST https://es.us-east-2.amazonaws.com/2021-01-01/opensearch/domain
{
  "ClusterConfig": {
    "InstanceCount": 3,
    "InstanceType": "r6g.large.search",
    "DedicatedMasterEnabled": true,
    "DedicatedMasterType": "r6g.large.search",
    "DedicatedMasterCount": 3,
    "ZoneAwarenessEnabled": true,
    "ZoneAwarenessConfig": {
      "AvailabilityZoneCount": 3
     },
    "WarmEnabled": true,
    "WarmCount": 4,
    "WarmType": "ultrawarm1.medium.search",
    "ColdStorageOptions": {
       "Enabled": true
     }
  },
  "EBSOptions": {
    "EBSEnabled": true,
    "VolumeType": "gp2",
    "VolumeSize": 11
  },
  "EncryptionAtRestOptions": {
    "Enabled": true
  },
  "NodeToNodeEncryptionOptions": {
    "Enabled": true
  },
  "DomainEndpointOptions": {
    "EnforceHTTPS": true,
    "TLSSecurityPolicy": "Policy-Min-TLS-1-2-2019-07"
  },
   "AdvancedSecurityOptions": {
    "Enabled": true,
    "InternalUserDatabaseEnabled": true,
    "MasterUserOptions": {
      "MasterUserName": "master-user",
      "MasterUserPassword": "master-password"
    }
  },
  "EngineVersion": "OpenSearch_1.0",
  "DomainName": "my-domain"
}
```

For detailed information, see the [Amazon OpenSearch Service API Reference](https://docs.aws.amazon.com/opensearch-service/latest/APIReference/Welcome.html).

## Managing cold indexes in OpenSearch Dashboards
<a name="coldstorage-dashboards"></a>

You can manage hot, warm, and cold indexes with the existing Dashboards interface in your OpenSearch Service domain. Dashboards enables you to migrate indexes between warm and cold storage, and to monitor index migration status, without using the CLI or configuration API. For more information, see [Managing indexes in OpenSearch Dashboards](dashboards.md#dashboards-indices).

## Migrating indexes to cold storage
<a name="coldstorage-migrating"></a>

When you migrate indexes to cold storage, you provide a time range for the data to make discovery easier. You can select a timestamp field based on the data in your index, manually provide a start and end timestamp, or choose to not specify one.


| Parameter | Supported value | Description | 
| --- | --- | --- | 
| `timestamp_field` | The date/time field from the index mapping. |  The minimum and maximum values of the provided field are computed and stored as the `start_time` and `end_time` metadata for the cold index.  | 
| `start_time` and `end_time` |  One of the following formats: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/cold-storage.html)  |  The provided values are stored as the `start_time` and `end_time` metadata for the cold index.   | 

If you don't want to specify a timestamp, add `?ignore=timestamp` to the request instead.

The following request migrates a warm index to cold storage and provides start and end times for the data in that index:

```
POST _ultrawarm/migration/my-index/_cold
  {
    "start_time": "2020-03-09",
    "end_time": "2020-03-09T23:00:00Z"
  }
```
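When you pass a `timestamp_field` instead, the service computes the minimum and maximum values for you. To illustrate those semantics only (this runs client side and isn't part of the migration API), a Python sketch that derives the same `start_time`/`end_time` body from sample documents:

```python
from datetime import datetime


def time_range_body(docs: list[dict], timestamp_field: str) -> dict:
    """Derive a start_time/end_time request body from the minimum and
    maximum values of a timestamp field across the given documents."""
    stamps = [datetime.fromisoformat(doc[timestamp_field]) for doc in docs]
    return {
        "start_time": min(stamps).isoformat(),
        "end_time": max(stamps).isoformat(),
    }


docs = [
    {"@timestamp": "2020-03-09T23:00:00"},
    {"@timestamp": "2020-03-09T00:00:00"},
]
print(time_range_body(docs, "@timestamp"))
```

The `@timestamp` field name and the document shapes are placeholders for whatever your index mapping defines.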

Then check the status of the migration:

```
GET _ultrawarm/migration/my-index/_status

{
  "migration_status": {
    "index": "my-index",
    "state": "RUNNING_METADATA_RELOCATION",
    "migration_type": "WARM_TO_COLD"
  }
}
```

OpenSearch Service migrates one index at a time to cold storage. You can have up to 100 migrations in the queue. Any request that exceeds the limit is rejected. To check the current number of migrations in the queue, monitor the `WarmToColdMigrationQueueSize` [metric](managedomains-cloudwatchmetrics.md#managedomains-cloudwatchmetrics-coldstorage). The migration process has the following states:

```
ACCEPTED_COLD_MIGRATION - Migration request is accepted and queued.
RUNNING_METADATA_MIGRATION - The migration request was selected for execution and metadata is migrating to cold storage.
FAILED_METADATA_MIGRATION - The attempt to add index metadata has failed and all retries are exhausted.
PENDING_INDEX_DETACH - Index metadata migration to cold storage is completed. Preparing to detach the warm index state from the local cluster.
RUNNING_INDEX_DETACH - Local warm index state from the cluster is being removed. Upon success, the migration request will be completed.
FAILED_INDEX_DETACH - The index detach process failed and all retries are exhausted.
```

## Automating migrations to cold storage
<a name="coldstorage-ism"></a>

You can use [Index State Management](ism.md) to automate the migration process after an index reaches a certain age or meets other conditions. See the [sample policy](ism.md#ism-example-cold), which demonstrates how to automatically migrate indexes from hot to UltraWarm to cold storage.

**Note**  
An explicit `timestamp_field` is required in order to move indexes to cold storage using an Index State Management policy.

## Canceling migrations to cold storage
<a name="coldstorage-cancel"></a>

If a migration to cold storage is queued or in a failed state, you can cancel the migration using the following request:

```
POST _ultrawarm/migration/_cancel/my-index

{
  "acknowledged" : true
}
```

If your domain uses fine-grained access control, you need the `indices:admin/ultrawarm/migration/cancel` permission to make this request.

## Listing cold indexes
<a name="coldstorage-list"></a>

Before querying, you can list the indexes in cold storage to decide which ones to migrate to UltraWarm for further analysis. The following request lists all cold indexes, sorted by index name:

```
GET _cold/indices/_search
```

**Sample response**

```
{
  "pagination_id" : "je7MtGbClwBF/2Zp9Utk/h3yCo8nvbEXAMPLEKEY",
  "total_results" : 3,
  "indices" : [
    {
      "index" : "my-index-1",
      "index_cold_uuid" : "hjEoh26mRRCFxRIMdgvLmg",
      "size" : 10339,
      "creation_date" : "2021-06-28T20:23:31.206Z",
      "start_time" : "2020-03-09T00:00Z",
      "end_time" : "2020-03-09T23:00Z"
    },
    {
      "index" : "my-index-2",
      "index_cold_uuid" : "0vIS2n-oROmOWDFmwFIgdw",
      "size" : 6068,
      "creation_date" : "2021-07-15T19:41:18.046Z",
      "start_time" : "2020-03-09T00:00Z",
      "end_time" : "2020-03-09T23:00Z"
    },
    {
      "index" : "my-index-3",
      "index_cold_uuid" : "EaeXOBodTLiDYcivKsXVLQ",
      "size" : 32403,
      "creation_date" : "2021-07-08T00:12:01.523Z",
      "start_time" : "2020-03-09T00:00Z",
      "end_time" : "2020-03-09T23:00Z"
    }
  ]
}
```

### Filtering
<a name="coldstorage-filter"></a>

You can filter cold indexes based on a prefix-based index pattern and time range offsets. 

The following request lists indexes that match the prefix pattern of `event-*`:

```
GET _cold/indices/_search
 {
   "filters":{
      "index_pattern": "event-*"
   }
 }
```

**Sample response**

```
{
  "pagination_id" : "je7MtGbClwBF/2Zp9Utk/h3yCo8nvbEXAMPLEKEY",
  "total_results" : 1,
  "indices" : [
    {
      "index" : "events-index",
      "index_cold_uuid" : "4eFiab7rRfSvp3slrIsIKA",
      "size" : 32263273,
      "creation_date" : "2021-08-18T18:25:31.845Z",
      "start_time" : "2020-03-09T00:00Z",
      "end_time" : "2020-03-09T23:00Z"
    }
  ]
}
```

The following request returns indexes with `start_time` and `end_time` metadata fields between `2019-03-01` and `2020-03-01`:

```
GET _cold/indices/_search
{
  "filters": {
    "time_range": {
      "start_time": "2019-03-01",
      "end_time": "2020-03-01"
    }
  }
}
```

**Sample response**

```
{
  "pagination_id" : "je7MtGbClwBF/2Zp9Utk/h3yCo8nvbEXAMPLEKEY",
  "total_results" : 1,
  "indices" : [
    {
      "index" : "my-index",
      "index_cold_uuid" : "4eFiab7rRfSvp3slrIsIKA",
      "size" : 32263273,
      "creation_date" : "2021-08-18T18:25:31.845Z",
      "start_time" : "2019-05-09T00:00Z",
      "end_time" : "2019-09-09T23:00Z"
    }
  ]
}
```

### Sorting
<a name="coldstorage-sort"></a>

You can sort cold indexes by metadata fields such as index name or size. The following request lists all indexes sorted by size in descending order:

```
GET _cold/indices/_search
 {
 "sort_key": "size:desc"
 }
```

**Sample response**

```
{
  "pagination_id" : "je7MtGbClwBF/2Zp9Utk/h3yCo8nvbEXAMPLEKEY",
  "total_results" : 5,
  "indices" : [
    {
      "index" : "my-index-6",
      "index_cold_uuid" : "4eFiab7rRfSvp3slrIsIKA",
      "size" : 32263273,
      "creation_date" : "2021-08-18T18:25:31.845Z",
      "start_time" : "2020-03-09T00:00Z",
      "end_time" : "2020-03-09T23:00Z"
    },
    {
      "index" : "my-index-9",
      "index_cold_uuid" : "mbD3ZRVDRI6ONqgEOsJyUA",
      "size" : 57922,
      "creation_date" : "2021-07-07T23:41:35.640Z",
      "start_time" : "2020-03-09T00:00Z",
      "end_time" : "2020-03-09T23:00Z"
    },
    {
      "index" : "my-index-5",
      "index_cold_uuid" : "EaeXOBodTLiDYcivKsXVLQ",
      "size" : 32403,
      "creation_date" : "2021-07-08T00:12:01.523Z",
      "start_time" : "2020-03-09T00:00Z",
      "end_time" : "2020-03-09T23:00Z"
    }
  ]
}
```

Other valid sort keys are `start_time:asc/desc`, `end_time:asc/desc`, and `index_name:asc/desc`.
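A client-side mirror of `sort_key` is straightforward if you cache listings. A Python sketch (note that the response names the field `index` while the sort key is spelled `index_name`):

```python
def sort_indices(indices: list[dict], sort_key: str) -> list[dict]:
    """Sort cold index metadata the way the sort_key parameter does,
    e.g. "size:desc" or "index_name:asc"."""
    field, _, order = sort_key.partition(":")
    if field == "index_name":
        field = "index"  # the response calls this field "index"
    return sorted(indices, key=lambda i: i[field], reverse=order == "desc")


listing = [{"index": "a", "size": 32403}, {"index": "b", "size": 57922}]
print([i["index"] for i in sort_indices(listing, "size:desc")])
```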

### Pagination
<a name="coldstorage-pagination"></a>

You can paginate a list of cold indexes. Configure the number of indexes to return per page with the `page_size` parameter (the default is 10). Every `_search` request on your cold indexes returns a `pagination_id`, which you can use for subsequent calls.

The following request paginates the results of a `_search` request of your cold indexes and displays the next 100 results:

```
GET _cold/indices/_search?page_size=100
{
"pagination_id": "je7MtGbClwBF/2Zp9Utk/h3yCo8nvbEXAMPLEKEY"
}
```

## Migrating cold indexes to warm storage
<a name="coldstorage-migrating-back"></a>

After you narrow down your list of cold indexes with the filtering criteria in the previous section, migrate them back to UltraWarm where you can query the data and use it to create visualizations. 

The following request migrates two cold indexes back to warm storage:

```
POST _cold/migration/_warm
 {
 "indices": "my-index1,my-index2"
 }


{
  "acknowledged" : true
}
```

To check the status of the migration and retrieve the migration ID, send the following request:

```
GET _cold/migration/_status
```

**Sample response**

```
{
  "cold_to_warm_migration_status" : [
    {
      "migration_id" : "tyLjXCA-S76zPQbPVHkOKA",
      "indices" : [
        "my-index1,my-index2"
      ],
      "state" : "RUNNING_INDEX_CREATION"
    }
  ]
}
```

To get index-specific migration information, include the index name:

```
GET _cold/migration/my-index/_status
```

Rather than specifying an index, you can list the indexes by their current migration status. Valid values are `_failed`, `_accepted`, and `_all`.

The following command gets the status of all indexes in a single migration request:

```
GET _cold/migration/_status?migration_id=my-migration-id
```

Retrieve the migration ID using the status request. For detailed migration information, add `&verbose=true`.

You can migrate indexes from cold to warm storage in batches of 10 or fewer, with a maximum of 100 indexes migrating simultaneously. Any request that exceeds the limit is rejected. To check the number of migrations currently taking place, monitor the `ColdToWarmMigrationQueueSize` [metric](managedomains-cloudwatchmetrics.md#managedomains-cloudwatchmetrics-coldstorage). The migration process has the following states:

```
ACCEPTED_MIGRATION_REQUEST - Migration request is accepted and queued.
RUNNING_INDEX_CREATION - Migration request is picked up for processing and will create warm indexes in the cluster.
PENDING_COLD_METADATA_CLEANUP - Warm index is created and the migration service will attempt to clean up cold metadata.
RUNNING_COLD_METADATA_CLEANUP - Cleaning up cold metadata from the indexes migrated to warm storage.
FAILED_COLD_METADATA_CLEANUP - Failed to clean up metadata in the cold tier.
FAILED_INDEX_CREATION - Failed to create an index in the warm tier.
```

## Restoring cold indexes from snapshots
<a name="cold-snapshot"></a>

If you need to restore a deleted cold index, you can restore it back to the warm tier by following the instructions in [Restoring warm indexes from snapshots](ultrawarm.md#ultrawarm-snapshot) and then migrating the index back to cold tier again. You can't restore a deleted cold index directly back to the cold tier. OpenSearch Service retains cold indexes for 14 days after they've been deleted.
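As a sketch, the full flow combines a standard snapshot restore with the warm-to-cold migration request described in the cold storage section of this guide. The repository, snapshot, and index names below are placeholders; substitute the values for your domain:

```
POST _snapshot/my-repository/my-snapshot/_restore
{
  "indices": "my-index"
}

POST _ultrawarm/migration/my-index/_cold
{
  "timestamp_field": "@timestamp"
}
```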

## Canceling migrations from cold to warm storage
<a name="coldtowarm-cancel"></a>

If an index migration from cold to warm storage is queued or in a failed state, you can cancel it with the following request:

```
POST _cold/migration/my-index/_cancel

{
  "acknowledged" : true
}
```

To cancel migration for a batch of indexes (maximum of 10 at a time), specify the migration ID:

```
POST _cold/migration/_cancel?migration_id=my-migration-id

{
  "acknowledged" : true
}
```

Retrieve the migration ID using the status request.

## Updating cold index metadata
<a name="cold-update-metadata"></a>

You can update the `start_time` and `end_time` fields for a cold index:

```
PATCH _cold/my-index
{
  "start_time": "2020-01-01",
  "end_time": "2020-02-01"
}
```

You can't update the `timestamp_field` of an index in cold storage.

**Note**  
OpenSearch Dashboards doesn't support the PATCH method. Use [curl](https://curl.haxx.se/), [Postman](https://www.getpostman.com/), or some other method to update cold metadata.
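For example, with curl the request might look like the following sketch. The endpoint and credentials are placeholders; replace them with your own, and omit `-u` if your domain doesn't use basic authentication:

```
curl -XPATCH -u 'master-user:master-password' \
  -H 'Content-Type: application/json' \
  'https://domain-endpoint/_cold/my-index' \
  -d '{"start_time": "2020-01-01", "end_time": "2020-02-01"}'
```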

## Deleting cold indexes
<a name="cold-delete"></a>

If you're not using an ISM policy, you can delete cold indexes manually. The following request deletes a cold index:

```
DELETE _cold/my-index

{
  "acknowledged" : true
}
```

## Disabling cold storage
<a name="coldstorage-disable"></a>

The OpenSearch Service console is the simplest way to disable cold storage. Select the domain and choose **Actions**, **Edit cluster configuration**, then deselect **Enable cold storage**. 

To use the AWS CLI or configuration API, under `ColdStorageOptions`, set `"Enabled"="false"`.
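For example, with the AWS CLI, the update might look like the following sketch (the domain name is a placeholder; confirm the option syntax for your CLI version):

```
aws opensearch update-domain-config \
  --domain-name my-domain \
  --cold-storage-options Enabled=false
```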

Before you disable cold storage, you must either delete all cold indexes or migrate them back to warm storage; otherwise, the disable action fails. 

# Index State Management in Amazon OpenSearch Service
<a name="ism"></a>

Index State Management (ISM) in Amazon OpenSearch Service lets you define custom management policies that automate routine tasks, and apply them to indexes and index patterns. You no longer need to set up and manage external processes to run your index operations.

A policy contains a default state and a list of states for the index to transition between. Within each state, you can define a list of actions to perform and conditions that trigger these transitions. A typical use case is to periodically delete old indexes after a certain period of time. For example, you can define a policy that moves your index into a `read_only` state after 30 days and then ultimately deletes it after 90 days.

After you attach a policy to an index, ISM creates a job that runs every 5 to 8 minutes (or 30 to 48 minutes for pre-1.3 clusters) to perform policy actions, check conditions, and transition the index between states. The base interval for this job is 5 minutes, and a random 0-60% jitter is added to it so that you don't see a surge of activity from all your indexes at the same time. ISM doesn't run jobs if the cluster state is red.

ISM requires OpenSearch or Elasticsearch 6.8 or later.

**Note**  
This documentation provides a brief overview of ISM and several sample policies. It also explains how ISM for Amazon OpenSearch Service domains differs from ISM on self-managed OpenSearch clusters. For full documentation of ISM, including a comprehensive parameter reference, descriptions of each setting, and an API reference, see [Index State Management](https://docs.opensearch.org/latest/im-plugin/ism/index/) in the OpenSearch documentation.

**Important**  
You can no longer use index templates to apply ISM policies to newly created indexes. You can continue to automatically manage newly created indexes with the [ISM template field](https://opensearch.org/docs/latest/im-plugin/ism/policies/#sample-policy-with-ism-template-for-auto-rollover). This update introduces a breaking change that affects existing CloudFormation templates using this setting. 

## Create an ISM policy
<a name="ism-start"></a>

**To get started with Index State Management**

1. Open the Amazon OpenSearch Service console at [https://console.aws.amazon.com/aos/home](https://console.aws.amazon.com/aos/home).

1. Select the domain that you want to create an ISM policy for.

1. From the domain's dashboard, navigate to the OpenSearch Dashboards URL and sign in with your master username and password. The URL follows this format:

   ```
   domain-endpoint/_dashboards/
   ```

1. Open the left navigation panel within OpenSearch Dashboards and choose **Index Management**, then **Create policy**. 

1. Use the [visual editor](https://opensearch.org/docs/latest/im-plugin/ism/index/#visual-editor) or [JSON editor](https://opensearch.org/docs/latest/im-plugin/ism/index/#json-editor) to create policies. We recommend the visual editor because it offers a more structured way of defining policies. For help creating policies, see the [sample policies](#ism-example) below.

1. After you create a policy, attach it to one or more indexes:

   ```
   POST _plugins/_ism/add/my-index
   {
     "policy_id": "my-policy-id"
   }
   ```
**Note**  
If your domain is running a legacy Elasticsearch version, use `_opendistro` instead of `_plugins`.

   Alternatively, select the index in OpenSearch Dashboards and choose **Apply policy**.

## Sample policies
<a name="ism-example"></a>

The following sample policies demonstrate how to automate common ISM use cases.

### Hot to warm to cold storage
<a name="ism-example-cold"></a>

This sample policy moves an index from hot storage to [UltraWarm](ultrawarm.md), and eventually to [cold storage](cold-storage.md). Then, it deletes the index.

The index is initially in the `hot` state. After 10 days, ISM moves it to the `warm` state. After another 80 days (when the index is 90 days old), ISM moves the index to the `cold` state. After a year, the service sends a notification to an Amazon Chime room that the index is being deleted, and then permanently deletes it. 

Note that cold indexes require the `cold_delete` operation rather than the normal `delete` operation. Also note that an explicit `timestamp_field` is required in your data in order to manage cold indexes with ISM.

```
{
  "policy": {
    "description": "Demonstrate a hot-warm-cold-delete workflow.",
    "default_state": "hot",
    "schema_version": 1,
    "states": [{
        "name": "hot",
        "actions": [],
        "transitions": [{
          "state_name": "warm",
          "conditions": {
            "min_index_age": "10d"
          }
        }]
      },
      {
        "name": "warm",
        "actions": [{
          "warm_migration": {},
          "retry": {
            "count": 5,
            "delay": "1h"
          }
        }],
        "transitions": [{
          "state_name": "cold",
          "conditions": {
            "min_index_age": "90d"
          }
        }]
      },
      {
        "name": "cold",
        "actions": [{
            "cold_migration": {
              "timestamp_field": "<your timestamp field>"
            }
          }
        ],
        "transitions": [{
          "state_name": "delete",
          "conditions": {
             "min_index_age": "365d"
          }
        }]
      },
      {
        "name": "delete",
        "actions": [{
          "notification": {
            "destination": {
              "chime": {
                "url": "<URL>"
              }
            },
            "message_template": {
              "source": "The index {{ctx.index}} is being deleted."
            }
          }
        },
        {
          "cold_delete": {}
        }]
      }
    ]
  }
}
```

### Reduce replica count
<a name="ism-example-replica"></a>

This sample policy reduces replica count to zero after seven days to conserve disk space and then deletes the index after 21 days. This policy assumes your index is non-critical and no longer receiving write requests; having zero replicas carries some risk of data loss.

```
{
  "policy": {
    "description": "Changes replica count and deletes.",
    "schema_version": 1,
    "default_state": "current",
    "states": [{
        "name": "current",
        "actions": [],
        "transitions": [{
          "state_name": "old",
          "conditions": {
            "min_index_age": "7d"
          }
        }]
      },
      {
        "name": "old",
        "actions": [{
          "replica_count": {
            "number_of_replicas": 0
          }
        }],
        "transitions": [{
          "state_name": "delete",
          "conditions": {
            "min_index_age": "21d"
          }
        }]
      },
      {
        "name": "delete",
        "actions": [{
          "delete": {}
        }],
        "transitions": []
      }
    ]
  }
}
```

### Take an index snapshot
<a name="ism-example-snapshot"></a>

This sample policy uses the [`snapshot`](https://docs.opensearch.org/latest/im-plugin/ism/policies/#snapshot) operation to take a snapshot of an index as soon as it contains at least one document. `repository` is the name of the manual snapshot repository you registered in Amazon S3. `snapshot` is the name of the snapshot. For snapshot prerequisites and steps to register a repository, see [Creating index snapshots in Amazon OpenSearch Service](managedomains-snapshots.md).

```
{
  "policy": {
    "description": "Takes an index snapshot.",
    "schema_version": 1,
    "default_state": "empty",
    "states": [{
        "name": "empty",
        "actions": [],
        "transitions": [{
          "state_name": "occupied",
          "conditions": {
            "min_doc_count": 1
          }
        }]
      },
      {
        "name": "occupied",
        "actions": [{
          "snapshot": {
            "repository": "<my-repository>",
            "snapshot": "<my-snapshot>"
            }
          }],
          "transitions": []
      }
    ]
  }
}
```

## ISM templates
<a name="ism-template"></a>

You can set up an `ism_template` field in a policy so when you create an index that matches the template pattern, the policy is automatically attached to that index. In this example, any index you create with a name that begins with "log" is automatically matched to the ISM policy `my-policy-id`:

```
PUT _plugins/_ism/policies/my-policy-id
{
  "policy": {
    "description": "Example policy.",
    "default_state": "...",
    "states": [...],
    "ism_template": {
      "index_patterns": ["log*"],
      "priority": 100
    }
  }
}
```

For a more detailed example, see [Sample policy with ISM template for auto rollover](https://opensearch.org/docs/latest/im-plugin/ism/policies/#sample-policy-with-ism-template-for-auto-rollover).

## Differences
<a name="ism-diff"></a>

Compared to OpenSearch and Elasticsearch, ISM for Amazon OpenSearch Service has several differences. 

### ISM operations
<a name="alerting-diff-op"></a>
+ OpenSearch Service supports three unique ISM operations, `warm_migration`, `cold_migration`, and `cold_delete`:
  + If your domain has [UltraWarm](ultrawarm.md) enabled, the `warm_migration` action transitions the index to warm storage.
  + If your domain has [cold storage](cold-storage.md) enabled, the `cold_migration` action transitions the index to cold storage, and the `cold_delete` action deletes the index from cold storage.

  Even if one of these actions doesn't complete within the [set timeout period](https://docs.opensearch.org/latest/im-plugin/ism/policies/#actions), the migration or deletion of indexes still continues. Setting an [error notification](https://opensearch.org/docs/latest/im-plugin/ism/policies/#error-notifications) for one of these actions notifies you if the action didn't complete within the timeout period, but the notification is only for your reference. The actual operation has no inherent timeout and continues to run until it eventually succeeds or fails. 
+ If your domain runs OpenSearch or Elasticsearch 7.4 or later, OpenSearch Service supports the ISM `open` and `close` operations.
+ If your domain runs OpenSearch or Elasticsearch 7.7 or later, OpenSearch Service supports the ISM `snapshot` operation.

### Cold storage ISM operations
<a name="ism-cold-storage"></a>

For cold indexes, you must specify a `?type=_cold` parameter when you use the following ISM APIs:
+ [Add policy](https://opensearch.org/docs/latest/im-plugin/ism/api/#add-policy)
+ [Remove policy](https://opensearch.org/docs/latest/im-plugin/ism/api/#remove-policy-from-index)
+ [Update policy](https://opensearch.org/docs/latest/im-plugin/ism/api/#update-policy)
+ [Retry failed index](https://opensearch.org/docs/latest/im-plugin/ism/api/#retry-failed-index)
+ [Explain index](https://opensearch.org/docs/latest/im-plugin/ism/api/#explain-index)

These APIs for cold indexes have the following additional differences:
+ Wildcard operators are supported only at the end of a pattern. For example, `_plugins/_ism/<add, remove, change_policy, retry, explain>/logstash-*` is supported, but `_plugins/_ism/<add, remove, change_policy, retry, explain>/iad-*-prod` isn't supported.
+ Multiple index names and patterns are not supported. For example, `_plugins/_ism/<add, remove, change_policy, retry, explain>/app-logs` is supported but `_plugins/_ism/<add, remove, change_policy, retry, explain>/app-logs,sample-data` isn’t supported.
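For example, attaching a policy to a cold index combines the add policy call shown earlier with the `?type=_cold` parameter:

```
POST _plugins/_ism/add/my-cold-index?type=_cold
{
  "policy_id": "my-policy-id"
}
```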

### ISM settings
<a name="ism-diff-settings"></a>

OpenSearch and Elasticsearch let you change all available ISM settings using the `_cluster/settings` API. On Amazon OpenSearch Service, you can only change the following [ISM settings](https://opensearch.org/docs/latest/im-plugin/ism/settings/):
+ **Cluster-level settings:**
  + `plugins.index_state_management.enabled`
  + `plugins.index_state_management.history.enabled`
+ **Index-level settings:**
  + `plugins.index_state_management.rollover_alias`
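For example, you can turn off the ISM history index (one of the supported cluster-level settings) with a standard `_cluster/settings` call:

```
PUT _cluster/settings
{
  "persistent": {
    "plugins.index_state_management.history.enabled": false
  }
}
```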

   

# Tutorial: Automating Index State Management processes
<a name="ism-tutorial"></a>

This tutorial demonstrates how to implement an ISM policy that automates routine index management tasks, and how to apply that policy to indexes and index patterns.

[Index State Management (ISM)](ism.md) in Amazon OpenSearch Service lets you automate recurring index management activities, so you can avoid using additional tools to manage index lifecycles. You can create a policy that automates these operations based on index age, size, and other conditions, all from within your Amazon OpenSearch Service domain.

OpenSearch Service supports three storage tiers: the default "hot" state for active writing and low-latency analytics, UltraWarm for read-only data up to three petabytes, and cold storage for unlimited long-term archival.

This tutorial presents a sample use case of handling time-series data in daily indexes. In this tutorial, you set up a policy that takes an automated snapshot of each attached index after 24 hours. It then migrates the index from the default hot state to UltraWarm storage after two days, cold storage after 30 days, and finally deletes the index after 60 days.

## Prerequisites
<a name="ism-tutorialprerequisites"></a>
+ Your OpenSearch Service domain must be running Elasticsearch version 6.8 or later.
+ Your domain must have [UltraWarm](ultrawarm.md) and [cold storage](cold-storage.md) enabled.
+ You must [register a manual snapshot repository](managedomains-snapshot-registerdirectory.md) for your domain. 
+ Your user role needs sufficient permissions to access the OpenSearch Service console. If necessary, validate and [configure access to your domain](ac.md).

## Step 1: Configure the ISM policy
<a name="ism-tutorial-policy"></a>

First, configure an ISM policy in OpenSearch Dashboards.

1. From your domain dashboard in the OpenSearch Service console, navigate to the OpenSearch Dashboards URL and sign in with your master username and password. The URL follows this format: `domain-endpoint/_dashboards/`.

1. In OpenSearch Dashboards, choose **Add sample data** and add one or more of the sample indexes to your domain.

1. Open the left navigation panel and choose **Index Management**, then choose **Create policy**.

1. Name the policy `ism-policy-example`.

1. Replace the default policy with the following policy:

   ```
   {
     "policy": {
       "description": "Move indexes between storage tiers",
       "default_state": "hot",
       "states": [
         {
           "name": "hot",
           "actions": [],
           "transitions": [
             {
               "state_name": "snapshot",
               "conditions": {
                 "min_index_age": "24h"
               }
             }
           ]
         },
         {
           "name": "snapshot",
           "actions": [
             {
               "retry": {
                 "count": 5,
                 "backoff": "exponential",
                 "delay": "30m"
               },
               "snapshot": {
                 "repository": "snapshot-repo",
                 "snapshot": "ism-snapshot"
               }
             }
           ],
           "transitions": [
             {
               "state_name": "warm",
               "conditions": {
                 "min_index_age": "2d"
               }
             }
           ]
         },
         {
           "name": "warm",
           "actions": [
             {
               "retry": {
                 "count": 5,
                 "backoff": "exponential",
                 "delay": "1h"
               },
               "warm_migration": {}
             }
           ],
           "transitions": [
             {
               "state_name": "cold",
               "conditions": {
                 "min_index_age": "30d"
               }
             }
           ]
         },
         {
           "name": "cold",
           "actions": [
             {
               "retry": {
                 "count": 5,
                 "backoff": "exponential",
                 "delay": "1h"
               },
               "cold_migration": {
                 "start_time": null,
                 "end_time": null,
                 "timestamp_field": "@timestamp",
                 "ignore": "none"
               }
             }
           ],
           "transitions": [
             {
               "state_name": "delete",
               "conditions": {
                 "min_index_age": "60d"
               }
             }
           ]
         },
         {
           "name": "delete",
           "actions": [
             {
               "cold_delete": {}
             }
           ],
           "transitions": []
         }
       ],
       "ism_template": [
         {
           "index_patterns": [
             "index-*"
           ],
           "priority": 100
         }
       ]
     }
   }
   ```
**Note**  
The `ism_template` field automatically attaches the policy to any newly created index that matches one of the specified `index_patterns`. In this case, all indexes that start with `index-`. You can modify this field to match an index format in your environment. For more information, see [ISM templates](ism.md#ism-template). 

1. In the `snapshot` section of the policy, replace `snapshot-repo` with the name of the [snapshot repository](managedomains-snapshot-registerdirectory.md) that you registered for your domain. You can also optionally replace `ism-snapshot`, which will be the name of the snapshot when it's created.

1. Choose **Create**. The policy is now visible on the **State management policies** page.

## Step 2: Attach the policy to one or more indexes
<a name="ism-tutorial-attach"></a>

Now that you created your policy, attach it to one or more indexes in your cluster.

1. Go to the **Hot indices** tab and search for `opensearch_dashboards_sample`, which lists all of the sample indexes that you added in step 1.

1. Select all of the indexes and choose **Apply policy**, then choose the **ism-policy-example** policy that you just created.

1. Choose **Apply**.

You can monitor the indexes as they move through the various states on the **Policy managed indices** page.

# Summarizing indexes in Amazon OpenSearch Service with index rollups
<a name="rollup"></a>

Index rollups in Amazon OpenSearch Service let you reduce storage costs by periodically rolling up old data into summarized indexes.

You pick the fields that interest you and use an index rollup to create a new index with only those fields aggregated into coarser time buckets. You can store months or years of historical data at a fraction of the cost with the same query performance.

Index rollups require OpenSearch or Elasticsearch 7.9 or later. 

**Note**  
This documentation helps you get started with creating an index rollup job in Amazon OpenSearch Service. For comprehensive documentation, including a list of all available settings and a full API reference, see [Index rollups](https://docs.opensearch.org/latest/im-plugin/index-rollups/) in the OpenSearch documentation.

## Creating an index rollup job
<a name="rollup-example"></a>

To get started, choose **Index Management** in OpenSearch Dashboards. Select **Rollup Jobs** and choose **Create rollup job**.

### Step 1: Set up indexes
<a name="rollup-example-1"></a>

Set up the source and target indexes. The source index is the one that you want to roll up. The target index is where the index rollup results are saved.

After you create an index rollup job, you can’t change your index selections.

### Step 2: Define aggregations and metrics
<a name="rollup-example-2"></a>

Select the attributes with the aggregations (terms and histograms) and metrics (avg, sum, max, min, and value count) that you want to roll up. Make sure you don’t add a lot of highly granular attributes, because you won’t save much space.

### Step 3: Specify schedules
<a name="rollup-example-3"></a>

Specify a schedule to roll up your indexes as data is being ingested. The index rollup job is enabled by default.

### Step 4: Review and create
<a name="rollup-example-4"></a>

Review your configuration and select **Create**.

### Step 5: Search the target index
<a name="rollup-example-5"></a>

You can use the standard `_search` API to search the target index. You can’t access the internal structure of the data in the target index because the plugin automatically rewrites the query in the background to suit the target index. This is to make sure you can use the same query for the source and target index.

To query the target index, set `size` to 0:

```
GET target_index/_search
{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "avg_cpu": {
      "avg": {
        "field": "cpu_usage"
      }
    }
  }
}
```

**Note**  
OpenSearch versions 2.2 and later support searching multiple rollup indexes in one request. OpenSearch versions prior to 2.2 and legacy Elasticsearch OSS versions only support one rollup index per search.
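For example, on a domain running OpenSearch 2.2 or later, you can aggregate across two rollup target indexes in a single request (the index names are placeholders, and the field comes from the earlier example):

```
GET rollup_index_1,rollup_index_2/_search
{
  "size": 0,
  "aggs": {
    "avg_cpu": {
      "avg": {
        "field": "cpu_usage"
      }
    }
  }
}
```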

# Transforming indexes in Amazon OpenSearch Service
<a name="transforms"></a>

Whereas [index rollup jobs](rollup.md) let you reduce data granularity by rolling up old data into condensed indexes, transform jobs let you create a different, summarized view of your data centered around certain fields, so you can visualize or analyze the data in different ways.

Index transforms have an OpenSearch Dashboards user interface and a REST API. The feature requires OpenSearch 1.0 or later.

**Note**  
This documentation provides a brief overview of index transforms to help you get started using them on an Amazon OpenSearch Service domain. For comprehensive documentation and a REST API reference, see [Index transforms](https://docs.opensearch.org/latest/im-plugin/index-transforms/) in the open source OpenSearch documentation.

## Creating an index transform job
<a name="transforms-example"></a>

If you don’t have any data in your cluster, use the sample flight data within OpenSearch Dashboards to try out transform jobs. After adding the data, launch OpenSearch Dashboards. Then choose **Index Management**, **Transform Jobs**, and **Create Transform Job**.

### Step 1: Choose indexes
<a name="transforms-example-1"></a>

In the **Indices** section, select the source and target index. You can either select an existing target index or create a new one by entering a name for it.

If you want to transform just a subset of your source index, choose **Add Data Filter**, and use the OpenSearch [query DSL](https://docs.opensearch.org/latest/opensearch/query-dsl/) to specify a subset of your source index.
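For example, a data filter that limits a transform of the sample flight data to flights bound for SFO might look like the following sketch (the field name comes from the sample data set):

```
{
  "match": {
    "DestAirportID": "SFO"
  }
}
```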

### Step 2: Choose fields
<a name="transforms-example-2"></a>

After choosing your indexes, choose the fields you want to use in your transform job, as well as whether to use groupings or aggregations.
+ You can use groupings to place your data into separate buckets in your transformed index. For example, if you want to group all of the airport destinations within the sample flight data, group the `DestAirportID` field into a target field of `DestAirportID_terms` field, and you can find the grouped airport IDs in your transformed index after the transform job finishes.
+ On the other hand, aggregations let you perform simple calculations. For example, you might include an aggregation in your transform job to define a new field of `sum_of_total_ticket_price` that calculates the sum of all airplane tickets. Then you can analyze the new data in your transformed index.

### Step 3: Specify a schedule
<a name="transforms-example-3"></a>

Transform jobs are enabled by default and run on schedules. For **transform execution interval**, specify an interval in minutes, hours, or days.

### Step 4: Review and monitor
<a name="transforms-example-4"></a>

Review your configuration and select **Create**. Then monitor the **Transform job status** column.

### Step 5: Search the target index
<a name="transforms-example-5"></a>

After the job finishes, you can use the standard `_search` API to search the target index. 

For example, after running a transform job that transforms the flight data based on the `DestAirportID` field, you can run the following request to return all fields that have a value of `SFO`:

```
GET target_index/_search
{
  "query": {
    "match": {
      "DestAirportID_terms" : "SFO"
    }
  }
}
```

# Cross-cluster replication for Amazon OpenSearch Service
<a name="replication"></a>

With cross-cluster replication in Amazon OpenSearch Service, you can replicate user indexes, mappings, and metadata from one OpenSearch Service domain to another. Using cross-cluster replication helps to ensure disaster recovery if there is an outage, and allows you to replicate data across geographically distant data centers to reduce latency. You pay [standard AWS data transfer charges](https://aws.amazon.com/opensearch-service/pricing/) for the data transferred between domains. 

Cross-cluster replication follows an active-passive replication model where the *local* or *follower* index pulls data from the *remote* or *leader* index. The leader index refers to the source of the data, or the index that you want to replicate data from. The follower index refers to the target for the data, or the index that you want to replicate data to.

Cross-cluster replication is available on domains running Elasticsearch 7.10 or OpenSearch 1.1 or later. 

**Note**  
This documentation describes how to set up cross-cluster replication from an Amazon OpenSearch Service perspective. This includes using the AWS Management Console to set up cross-cluster connections, which is not possible on a self-managed OpenSearch cluster. For full documentation, including a settings reference and a comprehensive API reference, see [Cross-cluster replication](https://docs.opensearch.org/latest/tuning-your-cluster/replication-plugin/index/) in the OpenSearch documentation.

**Topics**
+ [Limitations](#replication-limitations)
+ [Prerequisites](#replication-prereqs)
+ [Permissions requirements](#replication-permissions)
+ [Set up a cross-cluster connection](#replication-connect)
+ [Start replication](#replication-start)
+ [Confirm replication](#replication-confirm)
+ [Pause and resume replication](#replication-pause-resume)
+ [Stop replication](#replication-stop)
+ [Auto-follow](#replication-autofollow)
+ [Upgrading connected domains](#replication-upgrade)

## Limitations
<a name="replication-limitations"></a>

Cross-cluster replication has the following limitations:
+ You can't replicate data between Amazon OpenSearch Service domains and self-managed OpenSearch or Elasticsearch clusters.
+ You can't replicate an index from a follower domain to another follower domain. If you want to replicate an index to multiple follower domains, you can only replicate it from the single leader domain.
+ A domain can be connected, through a combination of inbound and outbound connections, to a maximum of 20 other domains.
+ When you initially set up a cross-cluster connection, the leader domain must be on the same or a higher version than the follower domain.
+ You can't use CloudFormation to connect domains.
+ You can't use cross-cluster replication on M3 or burstable (T2 and T3) instances.
+ You can't replicate data between UltraWarm or cold indexes. Both indexes must be in hot storage.
+ When you delete an index from the leader domain, the corresponding index on the follower domain isn't automatically deleted.
+ Cross-cluster replication is not supported between default Regions and [opt-in Regions](https://docs.aws.amazon.com/general/latest/gr/rande-manage.html). Both domains must be either in default Regions or in opt-in Regions.

## Prerequisites
<a name="replication-prereqs"></a>

Before you set up cross-cluster replication, make sure that your domains meet the following requirements:
+ Elasticsearch 7.10 or OpenSearch 1.1 or later
+ [Fine-grained access control](fgac.md) enabled
+ [Node-to-node encryption](ntn.md) enabled
+ Leader indexes must have `index.soft_deletes.enabled` set to `true`. Indexes created in Elasticsearch 7.0 or OpenSearch 1.0 and later have this setting enabled by default. However, indexes created in Elasticsearch 6.x and then upgraded retain `soft_deletes=false`. To replicate such indexes, you must reindex them first.

  To check if an index has soft deletes enabled:

  ```
  GET <index-name>/_settings?include_defaults=true&flat_settings=true&filter_path=*.settings.index.soft_deletes.enabled
  ```

  If `soft_deletes` is `false`, reindex the data to a new index before starting replication.
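  For example, the standard `_reindex` API copies the data into a new index, which has soft deletes enabled by default (the index names are placeholders):

  ```
  POST _reindex
  {
    "source": { "index": "my-old-index" },
    "dest": { "index": "my-new-index" }
  }
  ```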

## Permissions requirements
<a name="replication-permissions"></a>

In order to start replication, you must include the `es:ESCrossClusterGet` permission on the remote (leader) domain. We recommend the following IAM policy on the remote domain. This policy also lets you perform other operations, such as indexing documents and performing standard searches:


```
{
  "Version":"2012-10-17",		 	 	 
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "*"
        ]
      },
      "Action": [
        "es:ESHttp*"
      ],
      "Resource": "arn:aws:es:us-east-1:111122223333:domain/leader-domain/*"
    },
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "*"
      },
      "Action": "es:ESCrossClusterGet",
      "Resource": "arn:aws:es:us-east-1:111122223333:domain/leader-domain"
    }
  ]
}
```


Make sure that the `es:ESCrossClusterGet` permission is applied for `/leader-domain` and not `/leader-domain/*`.

For non-admin users to perform replication activities, you must map them to the appropriate permissions. Most permissions correspond to specific [REST API operations](https://docs.opensearch.org/latest/tuning-your-cluster/replication-plugin/api/). For example, the `indices:admin/plugins/replication/index/_resume` permission lets a user resume replication of an index. For a full list of permissions, see [Replication permissions](https://docs.opensearch.org/latest/tuning-your-cluster/replication-plugin/permissions/#replication-permissions) in the OpenSearch documentation.

**Note**  
The commands to start replication and create a replication rule are special cases. Because they invoke background processes on the leader and follower domains, you must pass a `leader_cluster_role` and `follower_cluster_role` in the request. OpenSearch Service uses these roles in all backend replication tasks. For information about mapping and using these roles, see [Map the leader and follower cluster roles](https://docs.opensearch.org/latest/tuning-your-cluster/replication-plugin/permissions/#map-the-leader-and-follower-cluster-roles) in the OpenSearch documentation.

## Set up a cross-cluster connection
<a name="replication-connect"></a>

To replicate indexes from one domain to another, you need to set up a cross-cluster connection between the domains. The easiest way to connect domains is through the **Connections** tab of the domain dashboard. You can also use the [configuration API](https://docs.aws.amazon.com/opensearch-service/latest/APIReference/Welcome.html) or the [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/opensearch/create-outbound-connection.html). Because cross-cluster replication follows a "pull" model, you initiate connections from the follower domain.

**Note**  
If you previously connected two domains to perform [cross-cluster searches](cross-cluster-search.md), you can't use that same connection for replication. The connection is marked as `SEARCH_ONLY` in the console. In order to perform replication between two previously connected domains, you must delete the connection and recreate it. When you've done this, the connection is available for both cross-cluster search and cross-cluster replication.

**To set up a connection**

1. In the Amazon OpenSearch Service console, select the follower domain, go to the **Connections** tab, and choose **Request**.

1. For **Connection alias**, enter a name for your connection.

1. Choose between connecting to a domain in your AWS account and Region or in another account or Region.
   + To connect to a domain in your AWS account and Region, select the domain and choose **Request**.
   + To connect to a domain in another AWS account or Region, specify the ARN of the remote domain and choose **Request**.

OpenSearch Service validates the connection request. If the domains are incompatible, the connection fails. If validation succeeds, the request is sent to the destination domain for approval. When the destination domain approves the request, you can begin replication.

Cross-cluster replication supports bidirectional replication. This means that you can create an outbound connection from domain A to domain B, and another outbound connection from domain B to domain A. You can then set up replication so that domain A follows an index in domain B, and domain B follows an index in domain A.

## Start replication
<a name="replication-start"></a>

After you establish a cross-cluster connection, you can begin to replicate data. First, create an index on the leader domain to replicate: 

```
PUT leader-01
```

To replicate that index, send this command to the follower domain:

```
PUT _plugins/_replication/follower-01/_start
{
   "leader_alias": "connection-alias",
   "leader_index": "leader-01",
   "use_roles":{
      "leader_cluster_role": "all_access",
      "follower_cluster_role": "all_access"
   }
}
```

You can find the connection alias on the **Connections** tab on your domain dashboard.

This example assumes that an admin is issuing the request and uses `all_access` for the `leader_cluster_role` and `follower_cluster_role` for simplicity. In production environments, however, we recommend that you create replication users on both the leader and follower indexes, and map them accordingly. The usernames must be identical. For information about these roles and how to map them, see [Map the leader and follower cluster roles](https://docs.opensearch.org/latest/tuning-your-cluster/replication-plugin/permissions/#map-the-leader-and-follower-cluster-roles) in the OpenSearch documentation.

## Confirm replication
<a name="replication-confirm"></a>

To confirm that replication is happening, get the replication status:

```
GET _plugins/_replication/follower-01/_status

{
  "status" : "SYNCING",
  "reason" : "User initiated",
  "leader_alias" : "connection-alias",
  "leader_index" : "leader-01",
  "follower_index" : "follower-01",
  "syncing_details" : {
    "leader_checkpoint" : -5,
    "follower_checkpoint" : -5,
    "seq_no" : 0
  }
}
```

The leader and follower checkpoint values begin as negative integers and reflect the number of shards you have (-1 for one shard, -5 for five shards, and so on). The values increment to positive integers with each change that you make. If the values are the same, it means that the indexes are fully synced. You can use these checkpoint values to measure replication latency across your domains.

To further validate replication, add a document to the leader index:

```
PUT leader-01/_doc/1
{
   "Doctor Sleep":"Stephen King"
}
```

And confirm that it shows up on the follower index:

```
GET follower-01/_search

{
    ...
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "follower-01",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "Doctor Sleep" : "Stephen King"
        }
      }
    ]
  }
}
```

## Pause and resume replication
<a name="replication-pause-resume"></a>

You can temporarily pause replication if you need to remediate issues or reduce load on the leader domain. Send this request to the follower domain. Make sure to include an empty request body:

```
POST _plugins/_replication/follower-01/_pause
{}
```

Then get the status to ensure that replication is paused:

```
GET _plugins/_replication/follower-01/_status

{
  "status" : "PAUSED",
  "reason" : "User initiated",
  "leader_alias" : "connection-alias",
  "leader_index" : "leader-01",
  "follower_index" : "follower-01"
}
```

When you're done making changes, resume replication. Send this request to the follower domain. Make sure to include an empty request body:

```
POST _plugins/_replication/follower-01/_resume
{}
```

You can't resume replication after it's been paused for more than 12 hours. In that case, you must stop replication, delete the follower index, and restart replication of the leader index.
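
The recovery sequence can be sketched as follows, reusing the follower index from the earlier examples:

```
POST _plugins/_replication/follower-01/_stop
{}

DELETE follower-01

PUT _plugins/_replication/follower-01/_start
{
   "leader_alias": "connection-alias",
   "leader_index": "leader-01",
   "use_roles":{
      "leader_cluster_role": "all_access",
      "follower_cluster_role": "all_access"
   }
}
```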

## Stop replication
<a name="replication-stop"></a>

When you stop replication completely, the follower index unfollows the leader and becomes a standard index. You can't restart replication after you stop it.

Stop replication from the follower domain. Make sure to include an empty request body:

```
POST _plugins/_replication/follower-01/_stop
{}
```

## Auto-follow
<a name="replication-autofollow"></a>

You can define a set of replication rules against a single leader domain that automatically replicate indexes that match a specified pattern. When an index on the leader domain matches one of the patterns (for example, `books*`), a matching follower index is created on the follower domain. OpenSearch Service replicates any existing indexes that match the pattern, as well as new indexes that you create. It does not replicate indexes that already exist on the follower domain.

To replicate all indexes (with the exception of system-created indexes, and those that already exist on the follower domain), use a wildcard (`*`) pattern. 

### Create a replication rule
<a name="replication-rule-create"></a>

Create a replication rule on the follower domain, and specify the name of the cross-cluster connection:

```
POST _plugins/_replication/_autofollow
{
   "leader_alias" : "connection-alias",
   "name": "rule-name",
   "pattern": "books*",
   "use_roles":{
      "leader_cluster_role": "all_access",
      "follower_cluster_role": "all_access"
   }
}
```

You can find the connection alias on the **Connections** tab on your domain dashboard.

This example assumes that an admin is issuing the request, and it uses `all_access` as the leader and follower domain roles for simplicity. In production environments, however, we recommend that you create replication users on both the leader and follower indexes and map them accordingly. The usernames must be identical. For information about these roles and how to map them, see [Map the leader and follower cluster roles](https://docs.opensearch.org/latest/tuning-your-cluster/replication-plugin/permissions/#map-the-leader-and-follower-cluster-roles) in the OpenSearch documentation.

To retrieve a list of existing replication rules on a domain, use the [auto-follow stats API operation](https://docs.opensearch.org/latest/tuning-your-cluster/replication-plugin/api/#get-auto-follow-stats).
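
As a sketch, the request follows the replication plugin API paths used elsewhere in this section:

```
GET _plugins/_replication/autofollow_stats
```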

To test the rule, create an index that matches the pattern on the leader domain:

```
PUT books-are-fun
```

And check that its replica appears on the follower domain:

```
GET _cat/indices

health status index          uuid                     pri rep docs.count docs.deleted store.size pri.store.size
green  open   books-are-fun  ldfHO78xYYdxRMULuiTvSQ     1   1          0            0       208b           208b
```

### Delete a replication rule
<a name="replication-rule-delete"></a>

When you delete a replication rule, OpenSearch Service stops replicating *new* indexes that match the pattern, but continues existing replication activity until you [stop replication](#replication-stop) of those indexes.

Delete replication rules from the follower domain:

```
DELETE _plugins/_replication/_autofollow
{
   "leader_alias" : "connection-alias",
   "name": "rule-name"
}
```

## Upgrading connected domains
<a name="replication-upgrade"></a>

To upgrade the engine version of two domains that have a cross-cluster connection, upgrade the follower domain first and then the leader domain. Don't delete the connection between them; otherwise, replication pauses and you won't be able to resume it.

# Migrating Amazon OpenSearch Service indexes using remote reindex
<a name="remote-reindex"></a>

Remote reindex lets you copy indexes from one Amazon OpenSearch Service domain to another. You can migrate indexes from other OpenSearch Service domains or from self-managed OpenSearch and Elasticsearch clusters.

A *remote* domain and index refers to the source of the data, or the domain and index that you want to copy data from. A *local* domain and index refers to the target for the data, or the domain and index that you want to copy data to.

Remote reindexing requires OpenSearch 1.0 or later, or Elasticsearch 6.7 or later, on the local domain. The remote domain must be running the same or a lower major version than the local domain. Elasticsearch versions are considered *lower* than OpenSearch versions, meaning you can reindex data from Elasticsearch domains to OpenSearch domains. Within the same major version, the remote domain can be any minor version. For example, remote reindexing from Elasticsearch 7.10.x to 7.9 is supported, but OpenSearch 1.0 to Elasticsearch 7.10.x isn't.

**Note**  
This documentation describes how to reindex data between Amazon OpenSearch Service domains. For full documentation for the `reindex` operation, including detailed steps and supported options, see [Reindex document](https://docs.opensearch.org/latest/opensearch/reindex-data/) in the OpenSearch documentation.

**Topics**
+ [Prerequisites](#remote-reindex-prereq)
+ [Reindex data between OpenSearch Service internet domains](#remote-reindex-domain)
+ [Reindex data between OpenSearch Service domains when the remote is in a VPC](#remote-reindex-vpc)
+ [Reindex data between non-OpenSearch Service domains](#remote-reindex-non-aos)
+ [Reindex large datasets](#remote-reindex-largedatasets)
+ [Remote reindex settings](#remote-reindex-settings)

## Prerequisites
<a name="remote-reindex-prereq"></a>

Remote reindex has the following requirements:
+ The remote domain must be accessible from the local domain. For a remote domain that resides within a VPC, the local domain must have access to the VPC. This process varies by network configuration, but likely involves connecting to a VPN or managed network, or using the native [VPC endpoint connection](#remote-reindex-vpc). To learn more, see [Launching your Amazon OpenSearch Service domains within a VPC](vpc.md). 
+ The request must be authorized by the remote domain like any other REST request. If the remote domain has fine-grained access control enabled, you must have permission to perform reindex on the remote domain and read the index on the local domain. For more security considerations, see [Fine-grained access control in Amazon OpenSearch Service](fgac.md).
+ We recommend you create an index with the desired setting on your local domain before you start the reindex process.
+ If your domain uses a T2 or T3 instance type for your data nodes, you can't use remote reindex.
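
For example, you might create the local index with explicit shard and replica settings before starting the reindex (the values here are placeholders):

```
PUT local_index
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1
  }
}
```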

## Reindex data between OpenSearch Service internet domains
<a name="remote-reindex-domain"></a>

In the most basic scenario, the remote index is in the same AWS Region as your local domain, the remote domain has a publicly accessible endpoint, and you have signed IAM credentials.

From the remote domain, specify the remote index to reindex from and the local index to reindex to:

```
POST _reindex
{
  "source": {
    "remote": {
      "host": "https://remote-domain-endpoint:443"
    },
    "index": "remote_index"
  },
  "dest": {
    "index": "local_index"
  }
}
```

You must append port 443 to the remote domain endpoint for a validation check.

To verify that the index is copied over to the local domain, send this request to the local domain:

```
GET local_index/_search
```

If the remote index is in a Region different from your local domain, pass in its Region name, such as in this sample request:

```
POST _reindex
{
  "source": {
    "remote": {
      "host": "https://remote-domain-endpoint:443",
      "region": "eu-west-1"
    },
    "index": "remote_index"
  },
  "dest": {
    "index": "local_index"
  }
}
```

In isolated Regions, such as AWS GovCloud (US) or the China Regions, the endpoint might not be accessible because your IAM user isn't recognized in those Regions.

If the remote domain is secured with [basic authentication](fgac-http-auth.md), specify the username and password:

```
POST _reindex
{
  "source": {
    "remote": {
      "host": "https://remote-domain-endpoint:443",
      "username": "username",
      "password": "password"
    },
    "index": "remote_index"
  },
  "dest": {
    "index": "local_index"
  }
}
```

## Reindex data between OpenSearch Service domains when the remote is in a VPC
<a name="remote-reindex-vpc"></a>

Every OpenSearch Service domain is made up of its own internal virtual private cloud (VPC) infrastructure. When you create a new domain in an existing OpenSearch Service VPC, an elastic network interface is created for each data node in the VPC. 

Because the remote reindex operation is performed from the local OpenSearch Service domain, and therefore within its own private VPC, you need a way to access the remote domain’s VPC. You can either do this by using the built-in VPC endpoint connection feature to establish a connection through AWS PrivateLink, or by configuring a proxy.

If your local domain uses OpenSearch version 1.0 or later, you can use the console or the AWS CLI to create an AWS PrivateLink connection. An AWS PrivateLink connection allows resources in the local VPC to privately connect to resources in the remote VPC within the same AWS Region.

To create a VPC endpoint connection, the source domain to be reindexed must be in a VPC, and both the source and destination domains must be in the same AWS Region.

### Reindex data with the AWS Management Console
<a name="reindex-console"></a>

You can use remote reindex with the console to copy indexes between two domains that share a VPC endpoint connection.

1. Navigate to the Amazon OpenSearch Service console at [https://console.aws.amazon.com/aos/](https://console.aws.amazon.com/aos/).

1. In the left navigation pane, choose **Domains**. 

1. Select the local domain, or the domain that you want to copy data to. This opens the domain details page. Choose the **Connections** tab below the general information and choose **Request**.

1. On the **Request connection** page, select **VPC Endpoint Connection** for your connection mode and enter other relevant details. These details include the remote domain, which is the domain that you want to copy data from. Then, choose **Request**.

1. Navigate to the remote domain's details page, choose the **Connections** tab, and find the **Inbound connections** table. Select the check box next to the name of the domain that you just created the connection from (the local domain). Choose **Approve**.

1. Navigate back to the local domain, choose the **Connections** tab, and find the **Outbound connections** table. After the connection between the two domains is active, an endpoint becomes available in the **Endpoint** column in the table. Copy the endpoint.

1. Open the dashboard for the local domain and choose **Dev Tools** in the left navigation. To confirm that the remote domain index doesn't exist on your local domain yet, run the following GET request. Replace *remote-domain-index-name* with your own index name.

   ```
   GET remote-domain-index-name/_search
   {
      "query":{
         "match_all":{}
      }
   }
   ```

   In the output, you should see an error that indicates that the index wasn't found.

1. Below your GET request, create a POST request and use your endpoint as the remote host, as follows.

   ```
   POST _reindex
   {
      "source":{
         "remote":{
            "host":"connection-endpoint",
            "username":"username",
            "password":"password"
         },
         "index":"remote-domain-index-name"
      },
      "dest":{
         "index":"local-domain-index-name"
      }
   }
   ```

   Run this request.

1. Run the GET request again. The output should now indicate that the local index exists. You can query this index to verify that OpenSearch copied all the data from the remote index.

### Reindex data with OpenSearch Service API operations
<a name="reindex-api"></a>

You can use remote reindex with the API to copy indexes between two domains that share a VPC endpoint connection.

1. Use the [CreateOutboundConnection](https://docs.aws.amazon.com/opensearch-service/latest/APIReference/API_CreateOutboundConnection.html) API operation to request a new connection from your local domain to your remote domain.

   ```
   POST https://es.region.amazonaws.com/2021-01-01/opensearch/cc/outboundConnection
   
   {
      "ConnectionAlias": "remote-reindex-example",
      "ConnectionMode": "VPC_ENDPOINT",
      "LocalDomainInfo": { 
         "AWSDomainInformation": { 
            "DomainName": "local-domain-name",
            "OwnerId": "aws-account-id",
            "Region": "region"
         }
      },
      "RemoteDomainInfo": { 
         "AWSDomainInformation": { 
            "DomainName": "remote-domain-name",
            "OwnerId": "aws-account-id",
            "Region": "region"
         }
      }
   }
   ```

   You receive a `ConnectionId` in the response. Save this ID to use in the next step.

1. Use the [AcceptInboundConnection](https://docs.aws.amazon.com/opensearch-service/latest/APIReference/API_AcceptInboundConnection.html) API operation with your connection ID to approve the request from the local domain.

   ```
   PUT https://es.region.amazonaws.com/2021-01-01/opensearch/cc/inboundConnection/ConnectionId/accept
   ```

1. Use the [DescribeOutboundConnections](https://docs.aws.amazon.com/opensearch-service/latest/APIReference/API_DescribeOutboundConnections.html) API operation to retrieve the endpoint for your remote domain. 

   ```
   {
       "Connections": [
           {
               "ConnectionAlias": "remote-reindex-example",
               "ConnectionId": "connection-id",
               "ConnectionMode": "VPC_ENDPOINT",
               "ConnectionProperties": {
                   "Endpoint": "connection-endpoint"
               },
               ...
           }
       ]
   }
   ```

   Save the *connection-endpoint* to use in Step 5.

1. To confirm that the remote domain index doesn't exist on your local domain yet, run the following GET request. Replace *remote-domain-index-name* with your own index name.

   ```
   GET local-domain-endpoint/remote-domain-index-name/_search
   {
      "query":{
         "match_all":{}
      }
   }
   ```

   In the output, you should see an error that indicates that the index wasn't found.

1. Create a POST request and use your endpoint as the remote host, as follows.

   ```
   POST local-domain-endpoint/_reindex
   {
      "source":{
         "remote":{
            "host":"connection-endpoint",
            "username":"username",
            "password":"password"
         },
         "index":"remote-domain-index-name"
      },
      "dest":{
         "index":"local-domain-index-name"
      }
   }
   ```

   Run this request.

1. Run the GET request again. The output should now indicate that the local index exists. You can query this index to verify that OpenSearch copied all the data from the remote index.

If the remote domain is hosted inside a VPC and you don't want to use the VPC endpoint connection feature, you must configure a proxy with a publicly accessible endpoint. In this case, OpenSearch Service requires a public endpoint because it doesn't have the ability to send traffic into your VPC. 

When you run a domain in [VPC mode](vpc.md), one or more endpoints are placed in your VPC. However, these endpoints are only for traffic coming into the domain within the VPC, and they don't permit traffic into the VPC itself. 

The remote reindex command is run from the local domain, so the originating traffic isn't able to use those endpoints to access the remote domain. That's why a proxy is required in this use case. The proxy domain must have a certificate signed by a public certificate authority (CA). Self-signed or private CA-signed certificates are not supported.

## Reindex data between non-OpenSearch Service domains
<a name="remote-reindex-non-aos"></a>

If the remote index is hosted outside of OpenSearch Service, such as on a self-managed Amazon EC2 instance, set the `external` parameter to `true`:

```
POST _reindex
{
  "source": {
    "remote": {
      "host": "https://remote-domain-endpoint:443",
      "username": "username",
      "password": "password",
      "external": true
    },
    "index": "remote_index"
  },
  "dest": {
    "index": "local_index"
  }
}
```

In this case, only [basic authentication](fgac-http-auth.md) with a username and password is supported. The remote domain must have a publicly accessible endpoint (even if it's in the same VPC as the local OpenSearch Service domain) and a certificate signed by a public CA. Self-signed or private CA-signed certificates aren't supported.

## Reindex large datasets
<a name="remote-reindex-largedatasets"></a>

Remote reindex sends a scroll request to the remote domain with the following default values: 
+ Search context of 5 minutes
+ Socket timeout of 30 seconds
+ Batch size of 1,000

We recommend tuning these parameters to accommodate your data. For large documents, consider a smaller batch size, a longer timeout, or both. For more information, see [Paginate results](https://docs.opensearch.org/docs/latest/search-plugins/searching-data/paginate/).

```
POST _reindex?pretty=true&scroll=10h&wait_for_completion=false
{
  "source": {
    "remote": {
      "host": "https://remote-domain-endpoint:443",
      "socket_timeout": "60m"
    },
    "size": 100,
    "index": "remote_index"
  },
  "dest": {
    "index": "local_index"
  }
}
```

We also recommend adding the following settings to the local index for better performance:

```
PUT local_index
{
  "settings": {
    "refresh_interval": -1,
    "number_of_replicas": 0
  }
}
```

After the reindex process is complete, you can set your desired replica count and remove the refresh interval setting.
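
As a sketch, the following request resets the refresh interval to its default and restores a single replica (adjust the replica count for your workload):

```
PUT local_index/_settings
{
  "refresh_interval": null,
  "number_of_replicas": 1
}
```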

To reindex only a subset of documents that you select through a query, send this request to the local domain:

```
POST _reindex
{
  "source": {
    "remote": {
      "host": "https://remote-domain-endpoint:443"
    },
    "index": "remote_index",
    "query": {
      "match": {
        "field_name": "text"
      }
    }
  },
  "dest": {
    "index": "local_index"
  }
}
```

Remote reindex doesn't support slicing, so you can't perform multiple scroll operations for the same request in parallel.

## Remote reindex settings
<a name="remote-reindex-settings"></a>

In addition to the standard reindexing options, OpenSearch Service supports the following options:


| Option | Valid values | Description | Required | 
| --- | --- | --- | --- | 
| external | Boolean | If the remote domain is not an OpenSearch Service domain, or if you're reindexing between two VPC domains, specify as true. | No | 
| region | String | If the remote domain is in a different Region, specify the Region name. | No | 

# Managing time-series data in Amazon OpenSearch Service with data streams
<a name="data-streams"></a>

A typical workflow to manage time-series data involves multiple steps, such as creating a rollover index alias, defining a write index, and defining common mappings and settings for the backing indices.

Data streams in Amazon OpenSearch Service help simplify this initial setup process. Data streams work out of the box for time-based data such as application logs that are typically append-only in nature. 

Data streams require OpenSearch version 1.0 or later. 

**Note**  
This documentation provides basic steps to help you get started with data streams on an Amazon OpenSearch Service domain. For comprehensive documentation, see [Data streams](https://docs.opensearch.org/latest/opensearch/data-streams/) in the OpenSearch documentation. 

## Getting started with data streams
<a name="data-streams-example"></a>

A data stream is internally composed of multiple backing indices. Search requests are routed to all the backing indices, while indexing requests are routed to the latest write index.

### Step 1: Create an index template
<a name="data-streams-example-1"></a>

To create a data stream, you first need to create an index template that configures a set of indexes as a data stream. The `data_stream` object indicates that it’s a data stream and not a regular index template. The index pattern must match the name of the data stream:

```
PUT _index_template/logs-template
{
  "index_patterns": [
    "my-data-stream",
    "logs-*"
  ],
  "data_stream": {},
  "priority": 100
}
```

In this case, each ingested document must have an `@timestamp` field. You can also define your own custom timestamp field as a property in the `data_stream` object:

```
PUT _index_template/logs-template
{
  "index_patterns": "my-data-stream",
  "data_stream": {
    "timestamp_field": {
      "name": "request_time"
    }
  }
}
```

### Step 2: Create a data stream
<a name="data-streams-example-2"></a>

After you create an index template, you can start ingesting data directly, without explicitly creating the data stream first. 

Because we have a matching index template with a `data_stream` object, OpenSearch automatically creates the data stream:

```
POST logs-staging/_doc
{
  "message": "login attempt failed",
  "@timestamp": "2013-03-01T00:00:00"
}
```
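
To confirm that the data stream exists, you can retrieve it. The response lists the stream's backing indexes, which typically have names like `.ds-logs-staging-000001`:

```
GET _data_stream/logs-staging
```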

### Step 3: Ingest data into the data stream
<a name="data-streams-example-3"></a>

To ingest data into a data stream, you can use the regular indexing APIs. Make sure every document that you index has a timestamp field. If you try to ingest a document that doesn’t have a timestamp field, you get an error.

```
POST logs-redis/_doc
{
  "message": "login attempt",
  "@timestamp": "2013-03-01T00:00:00"
}
```

### Step 4: Searching a data stream
<a name="data-streams-example-4"></a>

You can search a data stream just like you search a regular index or an index alias. The search operation applies to all of the backing indexes (all data present in the stream).

```
GET logs-redis/_search
{
  "query": {
    "match": {
      "message": "login"
    }
  }
}
```

### Step 5: Rollover a data stream
<a name="data-streams-example-5"></a>

You can set up an [Index State Management (ISM)](ism.md) policy to automate the rollover process for the data stream. The ISM policy is applied to the backing indexes at the time of their creation. When you associate a policy with a data stream, it only affects the future backing indexes of that data stream. You also don’t need to provide the `rollover_alias` setting, because the ISM policy infers this information from the backing index.
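
A minimal rollover policy might look like the following sketch. The policy ID, rollover condition, and index pattern are placeholder values; the `ism_template` attaches the policy to new backing indexes of matching data streams:

```
PUT _plugins/_ism/policies/datastream-rollover-policy
{
  "policy": {
    "description": "Roll over data stream backing indexes",
    "default_state": "rollover",
    "states": [
      {
        "name": "rollover",
        "actions": [
          {
            "rollover": {
              "min_index_age": "1d"
            }
          }
        ],
        "transitions": []
      }
    ],
    "ism_template": {
      "index_patterns": ["my-data-stream"],
      "priority": 100
    }
  }
}
```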

**Note**  
If you migrate a backing index to [cold storage](cold-storage.md), OpenSearch removes this index from the data stream. Even if you move the index back to [UltraWarm](ultrawarm.md), the index remains independent and not part of the original data stream. After an index has been removed from the data stream, searching against the stream won't return any data from the index.

**Warning**  
The write index for a data stream can't be migrated to cold storage. If you want to migrate data in your data stream to cold storage, you must roll over the data stream before migration.
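
To roll over a data stream manually before such a migration, you can use the rollover API, which creates a new write index behind the stream (using the stream name from the earlier examples):

```
POST logs-staging/_rollover
```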

### Step 6: Manage data streams in OpenSearch Dashboards
<a name="data-streams-example-6"></a>

To manage data streams from OpenSearch Dashboards, open **OpenSearch Dashboards**, choose **Index Management**, and then select **Indices** or **Policy managed indices**.

### Step 7: Delete a data stream
<a name="data-streams-example-7"></a>

The delete operation first deletes the backing indexes of a data stream and then deletes the data stream itself.

To delete a data stream and all of its hidden backing indexes:

```
DELETE _data_stream/name_of_data_stream
```