

# Build an environment to upgrade, migrate, and compare OpenSearch clusters
Solution overview

OpenSearch is widely adopted for log analytics and search. However, self-managing OpenSearch can be operationally demanding. Amazon OpenSearch Service offers a more manageable alternative, but transitioning to it or upgrading to the latest OpenSearch version has historically been complex. It can also be difficult to predict the outcome of a migration. The Migration Assistant for Amazon OpenSearch Service solution addresses these challenges: it simplifies the migration process, ensures integrity, and validates performance after migration.

The Migration Assistant for Amazon OpenSearch Service solution is a toolkit designed to ease the transition to OpenSearch, facilitate upgrades to the latest OpenSearch versions, and refine cluster configurations based on observed traffic patterns. Whether you’re looking to set up a proof-of-concept in AWS, transition production workloads with confidence, or enhance your current OpenSearch clusters, this guide provides references to step-by-step instructions, best practices, and insights to leverage the full potential of the OpenSearch migrations package.

 **Supported migration paths** 

The following matrix shows which source versions can be directly migrated to which OpenSearch target versions:


| Source Version | OpenSearch 1.x | OpenSearch 2.x | OpenSearch 3.x | 
| --- | --- | --- | --- | 
|  Elasticsearch 1.x  |  ✓  |  ✓  |  ✓  | 
|  Elasticsearch 2.x  |  ✓  |  ✓  |  ✓  | 
|  Elasticsearch 5.x  |  ✓  |  ✓  |  ✓  | 
|  Elasticsearch 6.x  |  ✓  |  ✓  |  ✓  | 
|  Elasticsearch 7.x  |  ✓  |  ✓  |  ✓  | 
|  Elasticsearch 8.x  |  |  ✓  |  ✓  | 
|  OpenSearch 1.x  |  ✓  |  ✓  |  ✓  | 
|  OpenSearch 2.x  |  |  ✓  |  ✓  | 
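
As an illustration only, the matrix above can be encoded as a lookup table for use in planning scripts. The following is a minimal sketch: the names `SUPPORTED_TARGETS` and `is_supported` are hypothetical and are not part of the solution's tooling.

```python
# Hypothetical encoding of the support matrix above (major versions only).
# This is an illustrative sketch, not an API provided by Migration Assistant.
ALL_TARGETS = {"OpenSearch 1.x", "OpenSearch 2.x", "OpenSearch 3.x"}

SUPPORTED_TARGETS = {
    "Elasticsearch 1.x": ALL_TARGETS,
    "Elasticsearch 2.x": ALL_TARGETS,
    "Elasticsearch 5.x": ALL_TARGETS,
    "Elasticsearch 6.x": ALL_TARGETS,
    "Elasticsearch 7.x": ALL_TARGETS,
    "Elasticsearch 8.x": {"OpenSearch 2.x", "OpenSearch 3.x"},
    "OpenSearch 1.x": ALL_TARGETS,
    "OpenSearch 2.x": {"OpenSearch 2.x", "OpenSearch 3.x"},
}


def is_supported(source: str, target: str) -> bool:
    """Return True if the matrix marks source -> target as a direct path."""
    return target in SUPPORTED_TARGETS.get(source, set())


if __name__ == "__main__":
    print(is_supported("Elasticsearch 7.x", "OpenSearch 2.x"))
    print(is_supported("Elasticsearch 8.x", "OpenSearch 1.x"))
```

Unknown source versions return `False` rather than raising an error, which keeps the check safe to use in a pre-flight script.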

 **Benefits of using this solution:** 
+ Migrate cluster metadata, including index settings, type mappings, index templates, and aliases.
+ Migrate existing data from legacy clusters to OpenSearch clusters, including Amazon OpenSearch Service (AOS) Domains.
+ Intercept and redirect live traffic from self-managed Elasticsearch or OpenSearch clusters to Amazon OpenSearch Service domains with minimal latency.
+ Replicate production traffic on target clusters to validate and ensure accuracy.
+ Simulate real-world traffic by capturing and replaying request patterns to fine-tune system performance.
+ Deploy across the most common AWS Regions for global reach and scalability.
+ Follow a recommended migration path while maintaining service availability.

This implementation guide provides an overview of the Migration Assistant for Amazon OpenSearch Service solution, its reference architecture and components, considerations for planning the deployment, and configuration steps for deploying the solution to the Amazon Web Services (AWS) Cloud. It also references the solution’s open-source documentation on [GitHub](https://github.com/opensearch-project/opensearch-migrations), which includes a User guide, developer documentation, and tips to enhance and contribute to the solution.

The intended audience for using this solution’s features and capabilities in their environment includes solution architects, business decision makers, DevOps engineers, data scientists, and cloud professionals.

Use this navigation table to quickly find answers to these questions:


| If you want to . . . | Read . . . | 
| --- | --- | 
|  Know the cost for running this solution. The estimated cost for running this solution in the US East (N. Virginia) Region is approximately USD \$13,096 for a 15-day migration with 100 TB of existing data and 15 MBps of live traffic.  |   [Cost](cost.md)   | 
|  Understand the security considerations for this solution.  |   [Security](security-1.md)   | 
|  Know how to plan for quotas for this solution.  |   [Quotas](quotas.md)   | 
|  Know which AWS Regions support this solution.  |   [Supported AWS Regions](plan-your-deployment.md#supported-aws-regions)   | 
|  View or download the AWS CloudFormation template included in this solution to automatically deploy the infrastructure resources (the "stack") for this solution.  |      | 
|  Access the source code and optionally use the AWS Cloud Development Kit (AWS CDK) to deploy the solution.  |   [GitHub repository](https://github.com/aws-solutions/migration-assistant-for-amazon-opensearch/tree/release/v2.0.0)   | 

# Features and benefits


The solution provides the following features:

 **Backfill with reindex-from-snapshot** 

This solution guides users through the process of transferring data from a snapshot stored in an [Amazon Simple Storage Service](https://aws.amazon.com/s3) (Amazon S3) bucket to a designated (target) cluster.

 **Live traffic capture and replay** 

The solution offers guidance and tools to intercept traffic intended for an original cluster and archive it for future replay on a destination cluster. Typically, the replay occurs at the same rate and concurrency as the original traffic to precisely mimic the workload experienced by the source cluster. Users can also replay the recorded traffic at a later time or adjust the replay speed. This flexibility enables users to fine-tune the target cluster and improve its performance to suit their requirements.
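
To make the idea of rate-preserving replay concrete, the following sketch computes when each captured request would be sent relative to the start of a replay, with an optional speed-up factor. It is a simplified illustration under stated assumptions, not the Traffic Replayer's implementation: the `CapturedRequest` type, its fields, and the `replay_offsets` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapturedRequest:
    # Hypothetical record: capture time in seconds and the request path.
    captured_at: float
    path: str


def replay_offsets(requests, speedup=1.0):
    """Return (delay_seconds, request) pairs relative to the replay start.

    A speedup of 1.0 keeps the original spacing between requests;
    2.0 halves every gap, and 0.5 doubles it.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    ordered = sorted(requests, key=lambda r: r.captured_at)
    if not ordered:
        return []
    start = ordered[0].captured_at
    return [((r.captured_at - start) / speedup, r) for r in ordered]


if __name__ == "__main__":
    captured = [
        CapturedRequest(10.0, "/logs/_search"),
        CapturedRequest(12.0, "/logs/_doc"),
        CapturedRequest(16.0, "/logs/_search"),
    ]
    for delay, request in replay_offsets(captured, speedup=2.0):
        print(f"send {request.path} at +{delay:.1f}s")
```

Scaling only the gaps between requests preserves their relative order and burst pattern, which is what lets a faster or slower replay remain comparable to the original workload.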

 **Traffic validation** 

The solution records requests and responses between the source and destination clusters for comparison. It then forwards the latency metrics and response codes to an analytics platform, enabling users to analyze the data essential for transitioning their traffic from a legacy system to a new Amazon OpenSearch Service destination. The solution stores each request and response for both the source and target for deeper inspection if needed.
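
The kind of comparison described here can be sketched as follows. This example assumes a hypothetical `ResponsePair` record that holds the status code and latency observed on each cluster for the same request; the record layout and the `summarize` function are illustrative assumptions and do not reflect the solution's actual output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResponsePair:
    # Hypothetical record pairing source and target results for one request.
    request_id: str
    source_status: int
    target_status: int
    source_latency_ms: float
    target_latency_ms: float


def summarize(pairs):
    """Summarize status-code mismatches and the mean latency difference.

    A positive avg_latency_delta_ms means the target was slower on average.
    """
    total = len(pairs)
    mismatches = [p.request_id for p in pairs if p.source_status != p.target_status]
    if total:
        delta = sum(p.target_latency_ms - p.source_latency_ms for p in pairs) / total
    else:
        delta = 0.0
    return {
        "total": total,
        "status_mismatches": mismatches,
        "avg_latency_delta_ms": delta,
    }


if __name__ == "__main__":
    sample = [
        ResponsePair("r1", 200, 200, 10.0, 12.0),
        ResponsePair("r2", 200, 404, 20.0, 16.0),
    ]
    print(summarize(sample))
```

Listing the mismatched request IDs, rather than only counting them, makes it easier to look up the stored request and response pairs for deeper inspection.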

 **Integration with AWS Service Catalog AppRegistry and Application Manager, a capability of AWS Systems Manager** 

This solution includes a [Service Catalog AppRegistry](https://docs.aws.amazon.com/servicecatalog/latest/arguide/intro-app-registry.html) resource to register the solution’s CloudFormation template and its underlying resources as an application in both Service Catalog AppRegistry and [Application Manager](https://docs.aws.amazon.com/systems-manager/latest/userguide/application-manager.html). With this integration, you can centrally manage the solution’s resources and enable application search, reporting, and management actions.

# Use cases


 **Migrating existing data** 

Migration Assistant for Amazon OpenSearch Service offers various options for migrating existing data, including detailed guidance on running a reindex-from-snapshot applicable across all supported migration routes, such as from Elasticsearch 5.6, 6.8, 7.10.2, or 7.17 to OpenSearch 2.19.

 **Near real-time migration of HTTP traffic between clusters** 

The solution offers you the option to capture data destined for a source cluster and store this data for reuse. A user can replay this data to a target cluster in near real-time to migrate as soon as possible, or replay at a later time.

 **Replay traffic to multiple targets** 

The solution allows you to capture traffic for replay through multiple instances or in sequential runs, facilitating the validation of diverse cluster workloads and configurations.

 **Precise simulation of your cluster workloads** 

The solution allows users to capture traffic and replay it either simultaneously across multiple instances or in separate sequential runs, which aids in validating different cluster workloads and configurations. By default, the Traffic Replayer preserves the original concurrency and request rate to accurately simulate production loads, ensuring a fair like-for-like comparison.

 **Validate target cluster results** 

The solution helps users compare source and target traffic for accuracy and performance. It captures metrics and logs for analysis, giving users the confidence they need to migrate their production traffic to a new target.

# Concepts and definitions


This section describes key concepts and defines terminology specific to this solution:

 **source cluster** 

The originating cluster on a specific version of Elasticsearch or OpenSearch that the user is attempting to either upgrade or decommission.

 **target cluster** 

The destination cluster that the user is trying to upgrade to, migrate to, or optimize.

 **capture proxy** 

A pass-through HTTP proxy designed to capture and log all of the request and response traffic to a durable source for later reuse.

 **Traffic Replayer** 

A tool designed to simulate original traffic workloads by retrieving recorded request traffic and sending it to a target cluster. The Traffic Replayer correlates the request and response traffic of the originating request with the request and response traffic to the target, and stores the traffic persistently.

 **existing data** 

Documents that were on the source cluster at the point where a snapshot is taken.

 **live/continuous data** 

Data intercepted by the Capture Proxy and subsequently processed through a Traffic Replayer. Clients first send this data to the source cluster, and the Capture Proxy intercepts it on the way. The Traffic Replayer then replays the data to the designated target cluster.

**Note**  
For a general reference of AWS terms, see the [AWS Glossary](https://docs.aws.amazon.com/general/latest/gr/glos-chap.html).