Use the solution - Migration Assistant for Amazon OpenSearch Service

Use the solution

This section describes how to run a migration using the Migration Assistant for Amazon OpenSearch Service solution after you have deployed it on Amazon EKS. The day-to-day operator interface is the Workflow CLI, which runs in the Migration Console pod (migration-console-0) on Amazon EKS. The supporting console CLI provides component-level inspection and ad-hoc operations during validation and troubleshooting.

Getting started with the Workflow CLI

This sequence is the shortest safe path to your first migration to Amazon OpenSearch Service or Amazon OpenSearch Serverless: load the right schema for your version, prove connectivity, run a small pilot, and only then run the full workflow.

Before you start

Make sure all of the following are true:

  • Migration Assistant is deployed on Amazon EKS. See Deploy the solution.

  • The source cluster and the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection are reachable from the Amazon EKS cluster.

  • Snapshot storage is ready in Amazon S3 if you plan to run backfill.

  • Any basic-auth Kubernetes secrets you need can be created in the ma namespace.

Step 1: Access the Migration Console

kubectl exec -it migration-console-0 -n ma -- /bin/bash

If you are accessing Amazon EKS from a new shell, refresh your kubeconfig first:

aws eks update-kubeconfig --region <REGION> --name migration-eks-cluster-<STAGE>-<REGION>

Step 2: Confirm the installed version

console --version

This matters because the workflow schema can change by release.

Step 3: Load the version-matched sample

workflow configure sample --load

This gives you the safest starting point for your installed release.

Step 4: Edit the workflow configuration

workflow configure edit

Fill in the fields that describe your migration:

  • Source endpoint, version, and authentication.

  • Target endpoint and authentication. For Amazon OpenSearch Serverless, set service: aoss in the SigV4 authConfig. For Amazon OpenSearch Service, set service: es.

  • Snapshot repository details if you are running backfill.

  • The migration pattern: backfill only, capture and replay only, or both.

Note

Do not start by editing every possible field. Start with the minimum required fields for your path.

Target configuration for Amazon OpenSearch Serverless

When the target is an Amazon OpenSearch Serverless collection, set the target cluster like this:

{
  "targetClusters": {
    "target": {
      "endpoint": "https://<collection-id>.<region>.aoss.amazonaws.com",
      "authConfig": {
        "sigv4": {
          "region": "<region>",
          "service": "aoss"
        }
      }
    }
  }
}

The migration IAM role created by the Amazon EKS deployment must also be added as a principal in your collection’s data access policy. The IAM role is named <eks-cluster-name>-migrations-role. Add it to the collection’s data access policy with both collection-level and index-level permissions before running the workflow.
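The data access policy requirement can be sketched as a small shell helper that prints a policy document naming the migration role as principal, with one collection-level rule and one index-level rule. The function name, the argument order, the standard `aws` partition in the ARN, and the specific permission list are illustrative assumptions, not values taken from the solution; adjust the permissions to what your security review allows.

```shell
# Print an OpenSearch Serverless data access policy for the migration role.
# Arguments (all placeholders): <account-id> <eks-cluster-name> <collection-name>
# Assumption: the permission lists below are illustrative, not prescribed by the solution.
aoss_data_policy() {
  account_id="$1"
  cluster_name="$2"
  collection="$3"
  cat <<EOF
[
  {
    "Rules": [
      {
        "ResourceType": "collection",
        "Resource": ["collection/${collection}"],
        "Permission": ["aoss:CreateCollectionItems", "aoss:UpdateCollectionItems", "aoss:DescribeCollectionItems"]
      },
      {
        "ResourceType": "index",
        "Resource": ["index/${collection}/*"],
        "Permission": ["aoss:CreateIndex", "aoss:UpdateIndex", "aoss:DescribeIndex", "aoss:ReadDocument", "aoss:WriteDocument"]
      }
    ],
    "Principal": ["arn:aws:iam::${account_id}:role/${cluster_name}-migrations-role"]
  }
]
EOF
}
```

For example, `aoss_data_policy 111122223333 my-eks-cluster my-collection > policy.json` writes a document that you can review and then attach with `aws opensearchserverless create-access-policy --type data`.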

Target configuration for Amazon OpenSearch Service

When the target is an Amazon OpenSearch Service domain:

{
  "targetClusters": {
    "target": {
      "endpoint": "https://<domain-endpoint>",
      "authConfig": {
        "sigv4": {
          "region": "<region>",
          "service": "es"
        }
      }
    }
  }
}

If your domain has fine-grained access control (FGAC) enabled, map the migration IAM role to a security role on the domain (typically all_access during migration, then scoped down). See Troubleshooting.

Step 5: Create Kubernetes secrets if you use basic authentication

kubectl create secret generic source-credentials \
  --from-literal=username=<SOURCE_USER> \
  --from-literal=password=<SOURCE_PASSWORD> \
  -n ma

kubectl create secret generic target-credentials \
  --from-literal=username=<TARGET_USER> \
  --from-literal=password=<TARGET_PASSWORD> \
  -n ma

Reference those secret names in authConfig.basic.secretName in your workflow configuration.

Step 6: Verify connectivity before submitting a workflow

console clusters connection-check

The check runs against both source and target by default. To narrow it to one side:

console clusters connection-check --cluster source
console clusters connection-check --cluster target

For a direct API check:

console clusters curl source /
console clusters curl target /

If these checks fail, stop and fix connectivity or authentication first. Do not start a workflow yet.
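If you script this step, the stop-on-failure rule can be expressed as a small wrapper. This is a minimal sketch: the function name is hypothetical, and it assumes that `console clusters connection-check` exits with a non-zero status when the check fails, which you should confirm on your installed version.

```shell
# Run the connection check for each side and stop at the first failure.
# Assumption: `console clusters connection-check` exits non-zero on failure.
preflight_connectivity() {
  for side in source target; do
    if ! console clusters connection-check --cluster "$side"; then
      echo "connection-check failed for ${side}; fix it before submitting a workflow" >&2
      return 1
    fi
  done
  echo "source and target are reachable"
}
```

Calling `preflight_connectivity && workflow submit` keeps a failed check from ever reaching the submit step.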

Step 7: Verify AWS identity if you use SigV4

If your source or target uses Amazon OpenSearch Service or Amazon OpenSearch Serverless, verify pod identity is working from the Migration Console pod:

aws sts get-caller-identity

If console clusters connection-check works in the Migration Console but the workflow later fails with HTTP 401 or 403, verify that the Argo workflow executor pods are using the argo-workflow-executor service account with its EKS Pod Identity association. On Amazon EKS, both the Migration Console pod and the workflow executor pods get Pod Identity-backed AWS credentials automatically through the bootstrap script.
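To make the identity check less error-prone, you can compare the caller ARN against the role you expect. The sketch below assumes the pod's credentials appear as an assumed-role ARN and that the role you pass in is the one you granted access on the target (for example, the `<eks-cluster-name>-migrations-role` pattern described earlier). The function name is hypothetical.

```shell
# Check that the caller ARN seen from this pod belongs to the expected role.
# Assumption: Pod Identity credentials show up as .../assumed-role/<role>/<session>.
identity_uses_role() {
  expected_role="$1"
  arn="$(aws sts get-caller-identity --query Arn --output text)" || return 2
  case "$arn" in
    *":assumed-role/${expected_role}/"*|*":role/${expected_role}") return 0 ;;
    *) echo "unexpected identity: ${arn}" >&2; return 1 ;;
  esac
}
```

A return status of 1 means the pod has credentials but for a different principal than the one your target trusts, which is a common cause of HTTP 403 responses.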

Step 8: Run a pilot migration first

Use a small allowlist or a representative subset before you attempt the full migration. This is the easiest way to catch mapping issues, authentication issues, and throughput problems early.

workflow submit
workflow manage

Use workflow manage to watch the run and approve any gated steps.

Step 9: Validate the pilot

Check counts and basic behavior on the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection before you expand scope:

console clusters cat-indices
console clusters curl target /<index>/_count
console clusters curl target '/<index>/_search?size=5&pretty'

If you are migrating applications with live traffic, also validate representative queries against the target.
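The count comparison can be sketched as a helper that reads `_count` from both clusters and reports a mismatch. It assumes that `console clusters curl` prints the raw JSON response body; the function names are hypothetical. Counts can legitimately differ while the source is still receiving writes, so treat a mismatch as a prompt to investigate rather than a hard failure.

```shell
# Extract the numeric "count" field from an _count response body on stdin.
extract_count() {
  sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Compare document counts for one index on source and target.
# Assumption: `console clusters curl` prints the raw JSON response body.
compare_counts() {
  index="$1"
  src="$(console clusters curl source "/${index}/_count" | extract_count)"
  tgt="$(console clusters curl target "/${index}/_count" | extract_count)"
  if [ -n "$src" ] && [ "$src" = "$tgt" ]; then
    echo "${index}: OK (${src} documents)"
  else
    echo "${index}: MISMATCH (source=${src:-?} target=${tgt:-?})"
    return 1
  fi
}
```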

Step 10: Run the real migration

After the pilot succeeds, widen the configuration to the full index set and submit again:

workflow configure edit
workflow submit
workflow manage

Step 11: Use logs if anything fails

workflow status
workflow output
workflow output --follow

workflow submit automatically stops and replaces an existing workflow with the same name, so you do not need to manually clean up between runs. If a previous run left orphaned migration custom resources, use workflow reset instead of deleting Argo workflows directly:

workflow reset                                           # interactive: lists migration custom resources and prompts before deleting
workflow reset migration-foo                             # delete a specific resource by name
workflow reset --all                                     # delete everything (capture proxies are protected)
workflow reset --all --include-proxies --delete-storage  # also remove capture proxies and Apache Kafka PVCs

Core commands

The workflow CLI orchestrates a full migration; the console CLI inspects or manually drives a single component during validation and troubleshooting.

Workflow commands

Command                               Why you use it
workflow configure sample             Shows the sample schema for your installed version
workflow configure sample --load      Loads that sample as your starting point
workflow configure edit               Opens the workflow config in your editor ($EDITOR, defaults to vi). Use --stdin to pipe configuration from stdin instead of launching an editor
workflow configure view               Prints the current config
workflow configure clear              Clears the current config and lets you start over. Use --confirm to skip the confirmation prompt
workflow submit                       Starts the migration workflow (auto-stops and replaces an existing one with the same name)
workflow submit --wait --timeout 300  Submits and blocks until the workflow completes or the timeout is reached
workflow manage                       Primary day-to-day interface for monitoring, approvals, and logs (interactive TUI)
workflow status                       Shows the current workflow tree in a non-interactive form
workflow status --all                 Shows running and completed workflows
workflow output                       Shows logs across workflow pods
workflow output --follow              Streams logs live
workflow approve <PATTERN>            Approves pending gates that match exact names or globs
workflow reset                        Lists migration custom resources and lets you delete them safely

Workflow command flags

The flags below cover all options available on each workflow subcommand. Global flags (-v/--verbose) can be placed before any subcommand to increase logging.

workflow submit

Flag                        Purpose
--wait                      Block until the workflow completes or times out (default: return immediately)
--timeout <seconds>         Timeout when using --wait (default: 120)
--wait-interval <seconds>   Polling interval between status checks when using --wait (default: 2)
--session <name>            Configuration session name to load parameters from (default: default)
--workflow-name <name>      Name of the workflow to replace if one already exists (default: migration)
--namespace <ns>            Kubernetes namespace (default: ma)

workflow status

Flag                        Purpose
--all                       Show all workflows including completed ones (default: only running)
--all-workflows             Show status for all workflows regardless of name (mutually exclusive with --workflow-name)
--workflow-name <name>      Show status for a specific workflow (default: migration)
--live-status               Run a current status check for each snapshot and backfill still in progress
--argo-server <url>         Argo Server URL (default: auto-detected from environment)
--namespace <ns>            Kubernetes namespace (default: ma)
--insecure                  Skip TLS certificate verification (default: true)
--token <token>             Bearer token for Argo Server authentication

workflow output

Flag                        Purpose
--follow / -f               Stream logs live using stern (default: show historical logs)
--timestamps                Include RFC 3339 timestamps in log output
--all-workflows             Show output for all workflows (mutually exclusive with --workflow-name)
--workflow-name <name>      Filter logs to a specific workflow (default: migration)
-l / --selector <labels>    Label selector to filter pods (e.g. source=a,target=b)
--namespace <ns>            Kubernetes namespace (default: ma)
--argo-server <url>         Argo Server URL
--insecure                  Skip TLS certificate verification (default: true)
--token <token>             Bearer token for authentication

workflow manage

Flag                        Purpose
--workflow-name <name>      Workflow to manage (default: migration)
--argo-server <url>         Argo Server URL (default: auto-detected from environment)
--namespace <ns>            Kubernetes namespace (default: ma)
--insecure                  Skip TLS certificate verification (default: true)
--token <token>             Bearer token for authentication

workflow reset

Flag                        Purpose
<path>                      Name or glob pattern of a specific resource to delete (e.g. migration-foo, kafka-*)
--all                       Delete all migration custom resources
--cascade                   Also delete resources that depend on the targeted resource
--include-proxies           Include capture proxies in deletion (they are protected by default)
--delete-storage            Delete Kafka PVCs and orphaned PVs during reset
--namespace <ns>            Kubernetes namespace (default: ma)

workflow approve

Flag                        Purpose
<TASK_NAMES> (one or more)  Exact names or glob patterns of pending approval gates to approve (e.g. *.evaluateMetadata)
--workflow-name <name>      Workflow containing the gates (default: migration)
--argo-server <url>         Argo Server URL (default: auto-detected from environment)
--namespace <ns>            Kubernetes namespace (default: ma)
--insecure                  Skip TLS certificate verification (default: true)
--token <token>             Bearer token for authentication

Console commands

The console CLI groups operations by component:

Command                                                             Why you use it
console --version                                                   Confirms which schema and behavior your Migration Console is running
console clusters connection-check                                   Verifies the Migration Console can reach and authenticate to source and target
console clusters cat-indices [--cluster source|target|proxy]        Lists indexes on one or both clusters
console clusters curl source /_cat/indices?v                        Issues a direct API request against the named cluster
console clusters clear-indices --cluster target --acknowledge-risk  Destructive: deletes all indexes on the named cluster
console snapshot {create|status|delete|unregister-repo}             Manage snapshots in Amazon S3
console metadata {migrate|evaluate}                                 Run or preview metadata migration outside the workflow
console backfill {describe|start|pause|stop|scale|status}           Inspect or drive RFS backfill
console replay {describe|start|stop|scale|status}                   Inspect or drive Traffic Replayer
console metrics {list|get-data}                                     Inspect Migration Assistant metrics
console kafka {create-topic|list-topics|delete-topic|…}             Inspect Strimzi-managed Apache Kafka used by capture and replay
console tuples                                                      Inspect captured request/response tuples for replay validation

Note

The workflow path drives metadata, backfill, and replay automatically. Reach for the equivalent console command only when you want to inspect state or work around a specific failure (for example, to call console snapshot status while a long-running snapshot is in progress).

Approval gates

Not every migration step should run without human review. Approval gates let the workflow stop at meaningful checkpoints — typically transitions after metadata work, backfill milestones, and cutover-sensitive steps — so you can validate before continuing.

workflow manage
workflow approve <STEP_NAME>
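Because workflow approve accepts glob patterns as well as exact names, it can help to preview which gate names a pattern would cover before you approve anything. The sketch below assumes the patterns behave like shell globs, where `*` matches any run of characters; the function name and the gate names in the usage note are hypothetical.

```shell
# Return success when a gate name matches an approval pattern.
# Assumption: approval patterns behave like shell globs (* matches any characters).
gate_matches() {
  pattern="$1"
  name="$2"
  case "$name" in
    $pattern) return 0 ;;   # unquoted on purpose so the shell treats it as a glob
    *) return 1 ;;
  esac
}
```

For example, `gate_matches '*.evaluateMetadata' 'source1.target1.evaluateMetadata'` succeeds, while the same pattern does not match a gate named `source1.target1.migrateMetadata`.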

Status symbols

Symbol Meaning

Succeeded

Running

Pending

Failed

Waiting for approval

Migration scenarios

Migration Assistant supports three migration patterns. Pick the one that matches your downtime tolerance.

Scenario 1: Backfill only

Best when you can tolerate a brief write freeze, or when writes can be paused and replayed from an external queue.

Snapshot source → Migrate metadata → Backfill documents → Verify → Switch traffic

Scenario 2: Capture and Replay only

Best when the data is small enough that live replay alone can synchronize the target on Amazon OpenSearch Service or Amazon OpenSearch Serverless, or when you want to replay traffic against multiple targets to compare results.

Reroute traffic to capture proxy → Migrate metadata → Replay traffic → Verify → Switch traffic to target

Scenario 3: Backfill + Capture and Replay (zero-downtime)

The most comprehensive approach. Capture begins first so no writes are lost, then backfill brings over historical data, then replay catches the target up to real-time.

Reroute traffic to capture proxy → Snapshot source → Migrate metadata → Backfill documents → Replay captured traffic → Verify → Switch traffic to target

Backfill tuning

Useful Reindex-from-Snapshot settings include:

  • podReplicas — number of RFS pods running in parallel (one shard per pod).

  • maxConnections — bulk-indexer concurrency to the target.

  • documentsPerBulkRequest — bulk batch size.

  • maxShardSizeBytes — maximum supported shard size (default 80 GiB). Larger shards must be reduced before backfill (force-merge or split).

  • initialLeaseDuration — ISO-8601 duration each worker holds a shard lease before re-acquisition (default PT10M).

  • allowedDocExceptionTypes — list of exception class names from the target’s response that should be counted as success rather than retried.

  • allowLooseVersionMatching — bypass the strict source/target version compatibility check.

Because RFS reads from snapshot storage in Amazon S3, increasing worker count does not add live read load to the source cluster. It mainly changes how quickly the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection is driven.
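Two of these settings lend themselves to quick arithmetic before you start. The sketch below converts the 80 GiB default shard limit to bytes and gives a rough estimate of how many passes the workers need when each pod handles one shard at a time. The function names are hypothetical, and the pass count is only a planning aid, because shards differ in size and workers pick up new shards as they finish.

```shell
# Default maximum shard size from the list above: 80 GiB expressed in bytes.
MAX_SHARD_SIZE_BYTES=$((80 * 1024 * 1024 * 1024))

# Succeeds when a shard of the given size (in bytes) fits under the limit.
shard_fits() {
  [ "$1" -le "$MAX_SHARD_SIZE_BYTES" ]
}

# Rough number of passes when each pod works on one shard at a time:
# ceil(shards / pods). Real runs vary because shards differ in size.
backfill_passes() {
  shards="$1"
  pods="$2"
  echo $(( (shards + pods - 1) / pods ))
}
```

With 62 shards and 10 pods, `backfill_passes 62 10` prints 7, so raising podReplicas beyond the shard count brings no further parallelism.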

Replay tuning

Useful Traffic Replayer settings include:

  • podReplicas — number of replayer pods.

  • speedupFactor (default 1.1) sets how fast captured traffic is replayed relative to the original timeline. A value of 2.0 replays traffic twice as fast as it was captured.

  • removeAuthHeader — strips the captured Authorization header before replaying. Useful when the captured traffic carries credentials that would not be valid against the target.

  • authHeaderOverride — replaces the captured Authorization header with a static value.

  • dependsOnSnapshotMigrations — ensures replay only starts after backfill completes.

  • nonRetryableDocExceptionTypes — list of exception class names that should be counted as failures but not retried.

Warning

Setting both replayerConfig.removeAuthHeader: true and an authConfig block on the same target is rejected by the schema. Pick one — either rely on the target’s authConfig (the Traffic Replayer applies it for you) or strip the captured header.

Cutover and rollback

Switching traffic to the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection is the cutover step. By this point, capture has already protected writes during backfill, replay has caught the target up, and validation is complete.

Before you switch:

  • replay has reached the live edge,

  • the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection is healthy,

  • representative application queries work on the target,

  • the application team is ready to move traffic, and

  • the rollback path is still available.

The exact cutover mechanism depends on your environment, but the principle is always the same:

  1. Stop pointing clients at the capture proxy.

  2. Point clients directly at the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection.

  3. Watch the target closely during the first production traffic window.

In practice, that usually means updating a DNS record, a load balancer backend, an application connection string, or a service-discovery entry. Keep the source cluster available during a rollback window (typically 24–72 hours) before decommissioning. After the rollback window has passed, see Uninstall the solution to remove the Migration Assistant infrastructure.

Assessment and breaking changes

Before running a migration, review breaking changes between your source and target versions. Breaking changes may require modifications to your client applications after the migration completes.

Understanding breaking changes

Between major versions of Elasticsearch and OpenSearch, features may be deprecated or removed. Common breaking changes include:

  • Removal of mapping types (Elasticsearch 6.x to 7.x and later)

  • Changes to field types (for example, string replaced by text and keyword)

  • Query DSL syntax changes

  • REST API endpoint changes

  • Plugin compatibility differences

For a complete list of breaking changes for your migration path, see the Migration Assistant documentation on GitHub.

Impact of data transformations

Any time you apply a transformation to your data — such as changing index names, modifying field mappings, or splitting indexes with type mappings — these changes may need to be reflected in your client configurations. Run production-like queries against the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection before switching over actual production traffic. This verifies that clients can communicate with the target, locate the necessary indexes and fields, and retrieve expected results.

For complex migrations involving multiple transformations, perform a trial migration with representative non-production data to fully test client compatibility.

Creating a snapshot

Once you have your change data capture solution in place or have disabled indexing to your source cluster, create a snapshot. The snapshot captures all the metadata and documents to be migrated to the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection.

Create a snapshot

Run the following command from the Migration Console pod to initiate snapshot creation:

console snapshot create

Migration Assistant automatically generates a snapshot name and configures the necessary Amazon S3 repository. Alternatively, you can use an existing snapshot — see the snapshot configuration in your workflow configuration for details.

Check snapshot status

To check the snapshot creation status:

console snapshot status

To retrieve detailed information about the snapshot:

console snapshot status --deep-check

Wait for snapshot creation to complete before proceeding to the metadata migration phase. A completed snapshot returns output similar to:

SUCCESS
Snapshot is SUCCESS.
Percent completed: 100.00%
Data GiB done: 29.211/29.211
Total shards: 40
Successful shards: 40
Failed shards: 0
Start time: 2024-07-22 18:21:42
Duration: 0h 13m 4s
Anticipated duration remaining: 0h 0m 0s
Throughput: 38.13 MiB/sec
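If you prefer not to re-run the status command by hand, the wait can be scripted as a polling loop. This is a sketch under stated assumptions: a finished snapshot prints output that starts with SUCCESS, as in the sample output, and a failed snapshot prints the word FAILED, which is not confirmed by this guide. The function name and its two arguments are hypothetical.

```shell
# Poll `console snapshot status` until the snapshot succeeds.
# Assumptions: finished output starts with SUCCESS (as in the sample output);
# failed output contains the word FAILED (not confirmed by this guide).
wait_for_snapshot() {
  interval="${1:-30}"      # seconds between polls
  max_polls="${2:-120}"    # give up after this many polls
  i=0
  while [ "$i" -lt "$max_polls" ]; do
    status="$(console snapshot status)"
    case "$status" in
      SUCCESS*) echo "snapshot complete"; return 0 ;;
      *FAILED*) echo "snapshot failed" >&2; return 1 ;;
    esac
    i=$((i + 1))
    sleep "$interval"
  done
  echo "timed out waiting for snapshot" >&2
  return 2
}
```

Running `wait_for_snapshot 60 180` polls once a minute for up to three hours before giving up.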

Managing slow snapshot speeds

Depending on the data size in the source cluster and the bandwidth allocated for snapshots, the process can take some time. Use the --max-snapshot-rate-mb-per-node option to adjust the maximum rate at which the source cluster’s nodes create the snapshot. Increasing the snapshot rate consumes more source node resources, which may affect the cluster’s ability to handle normal traffic.

Migrating metadata

Metadata migration involves creating a snapshot of your source cluster and then migrating the metadata (index settings, mappings, templates, and aliases) to the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection.

The metadata migration tool gathers information from the source cluster through a snapshot or HTTP requests. These snapshots are fully compatible with the backfill process for Reindex-from-Snapshot (RFS).

Evaluating metadata

Before migrating, use the evaluate command to preview what will be migrated:

console metadata evaluate

This scans the source cluster, applies filtering and modifications, and produces a list of items that will be migrated. Items not listed will not be migrated. Use this as a safety check before modifying the target.

Example output:

Starting Metadata Evaluation
Clusters:
   Source:
      Remote Cluster: OpenSearch 1.3.16
   Target:
      Remote Cluster: OpenSearch 2.14.0
Migration Candidates:
   Index Templates:
      simple_index_template
   Component Templates:
      simple_component_template
   Indexes:
      blog_2023, movies_2023
   Aliases:
      alias1, movies-alias
Results:
   0 issue(s) detected
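To use evaluate as a safety check in a script, you can read the issue count from its summary and run migrate only when that count is zero. The sketch assumes the output contains an "N issue(s) detected" summary like the example output; the function names are hypothetical.

```shell
# Read evaluate output on stdin and print the number of issues it reports.
issues_detected() {
  grep -o '[0-9][0-9]* issue(s) detected' | head -n 1 | cut -d ' ' -f 1
}

# Run evaluate, and run migrate only when evaluate reports zero issues.
# Assumption: the output ends with an "N issue(s) detected" summary.
evaluate_then_migrate() {
  count="$(console metadata evaluate | issues_detected)"
  if [ "$count" = "0" ]; then
    console metadata migrate
  else
    echo "evaluate reported ${count:-an unknown number of} issue(s); not migrating" >&2
    return 1
  fi
}
```

A zero issue count does not replace reading the candidate list, because items that are missing from the list are silently left out of the migration.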

Running metadata migration

The migrate command applies all evaluated items to the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection:

console metadata migrate

If re-run multiple times, items that were previously migrated are not recreated. To re-migrate an item, delete it from the target first, then rerun evaluate and migrate.

Example output:

Starting Metadata Migration
Clusters:
   Source:
      Snapshot: OpenSearch 1.3.16
   Target:
      Remote Cluster: OpenSearch 2.14.0
Migrated Items:
   Index Templates:
      simple_index_template
   Component Templates:
      simple_component_template
   Indexes:
      blog_2023, movies_2023
   Aliases:
      alias1, movies-alias
Results:
   0 issue(s) detected

Verifying metadata

Before proceeding to backfill, confirm the target cluster details. Depending on your configuration, check the sharding strategy or verify that index mappings are correctly defined by ingesting a test document:

console clusters curl target /<index>/_mapping?pretty
console clusters curl target /<index>/_settings?pretty

Field type transformations

When migrating between versions, some field types may be deprecated or incompatible. Migration Assistant includes built-in transformations:

  • string to text/keyword — Automatically transforms the deprecated string type (Elasticsearch 1.x–5.x) to text or keyword based on the original index property.

  • flattened to flat_object — Automatically transforms flattened field type (Elasticsearch 7.3+) to flat_object (OpenSearch 2.7+).

  • dense_vector to knn_vector — Automatically transforms dense_vector (Elasticsearch 7.x) to knn_vector with appropriate similarity mappings and HNSW algorithm parameters.

Custom field type transformations

For field types not covered by automatic transformations, create a custom JavaScript transformation:

  1. Access the Migration Console pod:

    kubectl exec -it migration-console-0 -n ma -- /bin/bash
  2. Create a JavaScript transformation file:

    cat > /shared-logs-output/field-type-converter.js << 'SCRIPT'
    function main(context) {
      // Each rule rewrites any mapping node whose properties match `when`.
      const rules = [
        { when: { type: "string" }, set: { type: "text" } },
        { when: { type: "flattened" }, set: { type: "flat_object" }, remove: ["index"] }
      ];
      // Walk arrays, Maps, and plain objects, applying every matching rule.
      function applyRules(node, rules) {
        if (Array.isArray(node)) {
          node.forEach((child) => applyRules(child, rules));
        } else if (node instanceof Map) {
          for (const { when, set, remove = [] } of rules) {
            const matches = Object.entries(when).every(([k, v]) => node.get(k) === v);
            if (matches) {
              Object.entries(set).forEach(([k, v]) => node.set(k, v));
              remove.forEach((key) => node.delete(key));
            }
          }
          for (const child of node.values()) {
            applyRules(child, rules);
          }
        } else if (node && typeof node === "object") {
          for (const { when, set, remove = [] } of rules) {
            const matches = Object.entries(when).every(([k, v]) => node[k] === v);
            if (matches) {
              Object.assign(node, set);
              remove.forEach((key) => delete node[key]);
            }
          }
          Object.values(node).forEach((child) => applyRules(child, rules));
        }
      }
      // Only items that carry type, name, and body are transformed.
      return (doc) => {
        if (doc && doc.type && doc.name && doc.body) {
          applyRules(doc, rules);
        }
        return doc;
      };
    }
    (() => main)();
    SCRIPT
  3. Create a transformation descriptor:

    cat > /shared-logs-output/transformation.json << 'EOF'
    [
      {
        "JsonJSTransformerProvider": {
          "initializationScriptFile": "/shared-logs-output/field-type-converter.js",
          "bindingsObject": "{}"
        }
      }
    ]
    EOF
  4. Run metadata migration with the transformation:

    console metadata migrate --transformer-config-file /shared-logs-output/transformation.json

Handling type mapping deprecation (Elasticsearch 6.x)

In Elasticsearch versions prior to 7.0, an index could contain multiple types. OpenSearch no longer supports multiple mapping types. The TypeMappingSanitizationTransformer manages this deprecation by routing, merging, or dropping types during migration.

To configure type mapping handling, add a transformation configuration with the TypeMappingSanitizationTransformerProvider:

[
  {
    "TypeMappingSanitizationTransformerProvider": {
      "staticMappings": {
        "activity": {
          "user": "new_users",
          "post": "new_posts"
        }
      },
      "sourceProperties": {
        "version": {
          "major": 6,
          "minor": 8
        }
      }
    }
  }
]

Supported strategies:

  • Route types to separate indexes — Split different types into their own indexes on the Amazon OpenSearch Service domain.

  • Merge all types into one index — Combine multiple types into a single index by mapping them all to the same target index name.

  • Drop specific types — Selectively migrate only specific types; omitted types are not migrated.

  • Keep original structure — Use regex patterns to maintain the same index name while removing the type layer.
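As an illustration of the merge strategy, the helper below prints a configuration in which every listed type of one source index maps to the same target index name. It reuses the staticMappings and sourceProperties fields from the example configuration; the function name, the index and type names, and the hard-coded 6.8 source version are assumptions for the sketch.

```shell
# Print a TypeMappingSanitizationTransformerProvider config that merges every
# listed type of one source index into a single target index.
# Usage: merge_types_config <source-index> <target-index> <type>...
# Assumption: the source version is fixed at 6.8, as in the example configuration.
merge_types_config() {
  source_index="$1"
  target_index="$2"
  shift 2
  mappings=""
  for type_name in "$@"; do
    # Append "type": "target", separated by commas after the first entry.
    mappings="${mappings}${mappings:+, }\"${type_name}\": \"${target_index}\""
  done
  cat <<EOF
[
  {
    "TypeMappingSanitizationTransformerProvider": {
      "staticMappings": { "${source_index}": { ${mappings} } },
      "sourceProperties": { "version": { "major": 6, "minor": 8 } }
    }
  }
]
EOF
}
```

For example, `merge_types_config activity activity_merged user post` maps both the user and post types of the activity index to a single activity_merged index.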

Important

Whenever the transformation configuration is updated, the backfill and replayer tools must be stopped and restarted to apply the changes. Previously migrated data and metadata may need to be cleared to avoid an inconsistent state.

Metadata troubleshooting

Accessing detailed logs

Metadata migration creates a detailed log file for each run:

ls -al /shared-logs-output/migration-console-default/*/metadata/
tail /shared-logs-output/migration-console-default/*/metadata/*.log

OpenSearch running in compatibility mode

If you encounter errors about being unable to update an Elasticsearch 7.10.2 cluster, compatibility mode may be enabled on the Amazon OpenSearch Service domain. Disable compatibility mode to continue. See Enable compatibility mode in the Amazon OpenSearch Service documentation.

Running backfill

After metadata has been migrated, use Reindex-from-Snapshot (RFS) to backfill documents from the source cluster snapshot into the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection.

Starting the backfill

Start the RFS backfill:

console backfill start

The status reports Running even if all shards have been migrated. See Stopping the backfill for how to finalize.

Scaling the backfill

To speed up the migration, increase the number of RFS worker pods. Each worker processes shards in parallel from the snapshot in Amazon S3, so scaling up does not add live read load to the source cluster — it only increases the indexing rate on the target.

console backfill scale <NUM_WORKERS>

For example, to scale to 5 workers:

console backfill scale 5

It may take a few minutes for additional workers to come online. Scale up gradually while monitoring the health metrics of the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection to avoid oversaturating it. Amazon OpenSearch Service provides cluster health and performance metrics that you can use for this monitoring.

Monitoring the backfill

Check the status of the backfill:

console backfill status

For detailed shard-level progress:

console backfill status --deep-check

Example output:

BackfillStatus.RUNNING
Running=9
Pending=1
Desired=10
Shards total: 62
Shards completed: 46
Shards incomplete: 16
Shards in progress: 11
Shards unclaimed: 5
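The shard counters in this output can be reduced to a single progress figure. The sketch below reads the status text on standard input and prints the percentage of shards completed, rounded down. It assumes the "Shards total:" and "Shards completed:" labels appear as in the example output; the function name is hypothetical.

```shell
# Read `console backfill status --deep-check` output on stdin and print the
# percentage of shards completed (integer, rounded down).
backfill_percent() {
  awk '
    /Shards total:/     { for (i = 1; i <= NF; i++) if ($i == "total:")     total = $(i + 1) }
    /Shards completed:/ { for (i = 1; i <= NF; i++) if ($i == "completed:") completed = $(i + 1) }
    END {
      if (total > 0) printf "%d\n", int(completed * 100 / total)
      else { print "unknown"; exit 1 }
    }
  '
}
```

With the example output, `console backfill status --deep-check | backfill_percent` prints 74, because 46 of 62 shards are complete.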

Amazon CloudWatch metrics and dashboard

Migration Assistant creates two Amazon CloudWatch dashboards to visualize migration health and performance:

  • MA-<STAGE>-<REGION>-ReindexFromSnapshot — metrics for the RFS workers and, for Amazon OpenSearch Service targets, the domain’s cluster metrics.

  • MA-<STAGE>-<REGION>-CaptureReplay — metrics for the capture proxy and traffic replayer.

Replace <STAGE> and <REGION> with the values you used during deployment (for example, MA-prod-us-east-1-ReindexFromSnapshot).

Find the dashboards in the Amazon CloudWatch console in the AWS Region where you deployed Migration Assistant. The metric graphs for your target domain are blank until you select the OpenSearch domain you are migrating to from the dropdown menu at the top of the dashboard.

Pausing the backfill

To pause a migration without losing progress:

console backfill pause

This stops all existing workers while leaving the backfill operation in a resumable state. To restart:

  • Run console backfill start, or

  • Scale up the worker count with console backfill scale <NUM_WORKERS>.

Stopping the backfill

Completing the backfill requires manually stopping the migration. Stopping shuts down all workers and cleans up all coordination metadata. After status checks report that data has been completely migrated, stop the migration:

console backfill stop

Example output:

Backfill stopped successfully.
Archiving the working state of the backfill operation...
Backfill working state archived to: /shared-logs-output/migration-console-default/backfill_working_state/working_state_backup_20241115174822.json

Warning

You cannot restart a stopped migration. If you may need to resume later, use console backfill pause instead.

Validating the backfill

After the backfill completes and workers have stopped, verify the contents of your Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection:

console clusters cat-indices --refresh

This displays the document count for each index. Compare source and target counts to verify completeness:

SOURCE CLUSTER
health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   my-index -DqPQDrATw25hhe5Ss34bQ   1   0          3            0     12.7kb         12.7kb

TARGET CLUSTER
health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   my-index bGfGtYoeSU6U6p8leR5NAQ   1   0          3            0      5.5kb          5.5kb

Verifying no failed documents

Use the following Amazon CloudWatch Logs Insights query to identify failed documents:

fields @message
| filter @message like "Bulk request succeeded, but some operations failed."
| sort @timestamp desc
| limit 10000

Run this query against the OpenSearchMigrations log group in Amazon CloudWatch. If failed documents are identified, you can index them directly rather than re-running the full backfill.

Using Traffic Replayer

Note

This section is only relevant if you are using Capture and Replay to avoid downtime during a migration to Amazon OpenSearch Service or Amazon OpenSearch Serverless. If you are performing backfill only, skip this section.

Traffic Replayer replays captured traffic from the source cluster to the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection. This verifies that the target can handle requests in the same way as the source and catches up to real-time traffic for a smooth migration.

When to run Traffic Replayer

Start Traffic Replayer only after all metadata and documents have been migrated. Running it before the document migration completes may cause operations to execute out of order. For example, a deletion captured after the snapshot was taken could execute before the document is added to the target.
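The ordering problem can be illustrated with a small in-memory model. This is a sketch only: the `Map` stands in for the target and the operation objects are hypothetical, not the solution's data format.

```javascript
// Apply operations in order to an in-memory stand-in for the target, keyed by document ID.
function applyOperations(operations) {
  const target = new Map();
  for (const op of operations) {
    if (op.type === "index") {
      target.set(op.id, op.body);
    } else if (op.type === "delete") {
      target.delete(op.id); // deleting a missing document is a no-op
    }
  }
  return target;
}

// The document exists in the snapshot and is deleted on the source afterwards.
const backfillWrite = { type: "index", id: "doc-1", body: { title: "from snapshot" } };
const capturedDelete = { type: "delete", id: "doc-1" };

// Backfill first, then replay: the target ends up matching the source.
const backfillThenReplay = applyOperations([backfillWrite, capturedDelete]);
console.log(backfillThenReplay.has("doc-1")); // false

// Replay before the backfill has written the document: the delete finds nothing,
// and the backfill later restores a document the source no longer has.
const replayThenBackfill = applyOperations([capturedDelete, backfillWrite]);
console.log(replayThenBackfill.has("doc-1")); // true
```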

Starting Traffic Replayer

console replay start

Example output:

Replayer started successfully.

Checking replay status

console replay status

The status reports:

  • Running — How many container instances are actively running.

  • Pending — How many instances are being provisioned.

  • Desired — The total number of instances that should be running.

Stopping Traffic Replayer

console replay stop

Delivery guarantees

Traffic Replayer retrieves traffic from Apache Kafka and updates its commit cursor only after sending requests to the target. This provides an "at least once" delivery guarantee: every captured request is sent, but a request can be sent more than once if Traffic Replayer restarts before the cursor is updated. Monitor metrics and validate externally to confirm the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection is functioning as expected.
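The consequence of committing after sending can be seen in a simplified model. The sketch below is not Traffic Replayer's implementation. It uses an array in place of a Kafka topic, and the function name and the `crashAfterSendAt` parameter are hypothetical, added to simulate a failure between the send and the commit.

```javascript
// Send each record, then advance the committed offset (commit after send).
// crashAfterSendAt simulates a failure after a send but before its commit.
function replayWithCommitAfterSend(records, committedOffset, send, crashAfterSendAt = -1) {
  for (let offset = committedOffset; offset < records.length; offset++) {
    send(records[offset]);
    if (offset === crashAfterSendAt) {
      return committedOffset; // the request went out, but the commit never happened
    }
    committedOffset = offset + 1;
  }
  return committedOffset;
}

const records = ["req-0", "req-1", "req-2"];
const sent = [];

// First run fails right after sending req-1.
let committed = replayWithCommitAfterSend(records, 0, (r) => sent.push(r), 1);
console.log(committed); // 1

// The restart resumes from the last committed offset, so req-1 is sent again.
committed = replayWithCommitAfterSend(records, committed, (r) => sent.push(r));
console.log(sent.join(", ")); // req-0, req-1, req-1, req-2
```

No request is lost, but req-1 reaches the target twice. This is why external validation matters for requests that are not idempotent.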

Time scaling

Traffic Replayer sends requests in the same order they were received on each connection. With a speedupFactor greater than 1, requests are sent faster than original timing:

  • speedupFactor 1 — Same rate and idle periods as the source.

  • speedupFactor 2 — Twice as fast. GETs sent every 500 ms instead of every second.

  • speedupFactor 10 — 10x faster, as long as the target responds quickly.

If the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection cannot respond quickly enough, Traffic Replayer waits for the previous request to complete before sending the next one.
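The arithmetic behind these settings can be sketched as follows. This is a simplified model of a single connection under the behavior described above, not Traffic Replayer's scheduler. The function names and the fixed `targetLatencyMs` are assumptions made for illustration.

```javascript
// Scale the original send offsets (ms since the first request) by speedupFactor.
function scheduleReplay(sourceOffsetsMs, speedupFactor) {
  if (!(speedupFactor > 0)) throw new RangeError("speedupFactor must be positive");
  return sourceOffsetsMs.map((t) => t / speedupFactor);
}

// Model one connection: a request is sent at its scheduled time, or when the
// previous response arrives, whichever is later.
function simulateConnection(sourceOffsetsMs, speedupFactor, targetLatencyMs) {
  const sendTimes = [];
  let previousDone = 0;
  for (const planned of scheduleReplay(sourceOffsetsMs, speedupFactor)) {
    const sendAt = Math.max(planned, previousDone);
    sendTimes.push(sendAt);
    previousDone = sendAt + targetLatencyMs;
  }
  return sendTimes;
}

const oncePerSecond = [0, 1000, 2000];

// speedupFactor 2: one request every 500 ms instead of every second.
console.log(scheduleReplay(oncePerSecond, 2)); // [ 0, 500, 1000 ]

// speedupFactor 10 with a target that takes 300 ms per request:
// the schedule asks for 0, 100, 200 ms, but each send waits for the previous response.
console.log(simulateConnection(oncePerSecond, 10, 300)); // [ 0, 300, 600 ]
```

In the second case the effective speedup is limited by the target's response time, not by the configured factor.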

Transformations

During migrations, some requests may need to be transformed between versions. Traffic Replayer automatically rewrites host and authentication headers. For more complex transformations, specify custom transformation rules in your workflow configuration using the replayerConfig section.

Elasticsearch content-type header compatibility

Newer Elasticsearch clients (version 7.11 and later, including all 8.x versions) use Elasticsearch-specific media types in Content-Type and Accept headers, such as application/vnd.elasticsearch+json;compatible-with=8. Amazon OpenSearch Service and Amazon OpenSearch Serverless do not support these media types.

If you are migrating traffic from Elasticsearch clients version 7.11 or later, apply a transformation to convert these headers to standard application/json:

  1. Create a JavaScript transformation file in the Migration Console pod:

    cat > /shared-logs-output/content-type-transformer.js << 'SCRIPT'
    const NEW_CONTENT_TYPE = "application/json";
    const ELASTIC_CONTENT_TYPE = "application/vnd.elasticsearch+json";

    function transform(request, context) {
      let headers = request.get("headers");
      if (headers) {
        let contentType = headers.get("Content-Type");
        if (Array.isArray(contentType)) {
          headers.set("Content-Type", contentType.map(v => v.includes(ELASTIC_CONTENT_TYPE) ? NEW_CONTENT_TYPE : v));
        } else if (typeof contentType === "string") {
          if (contentType.includes(ELASTIC_CONTENT_TYPE)) {
            headers.set("Content-Type", NEW_CONTENT_TYPE);
          }
        }
        let accept = headers.get("Accept");
        if (Array.isArray(accept)) {
          headers.set("Accept", accept.map(v => v.includes(ELASTIC_CONTENT_TYPE) ? NEW_CONTENT_TYPE : v));
        } else if (typeof accept === "string") {
          if (accept.includes(ELASTIC_CONTENT_TYPE)) {
            headers.set("Accept", NEW_CONTENT_TYPE);
          }
        }
      }
      return request;
    }

    function main(context) {
      return (request) => {
        if (Array.isArray(request)) {
          return request.flat().map(item => transform(item, context));
        }
        return transform(request, context);
      };
    }

    (() => main)();
    SCRIPT

  2. Create a transformation configuration file:

    cat > /shared-logs-output/replayer-transformation.json << 'EOF'
    [
      {
        "JsonJSTransformerProvider": {
          "initializationScriptFile": "/shared-logs-output/content-type-transformer.js",
          "bindingsObject": "{}"
        }
      }
    ]
    EOF

  3. Reference the transformation in your workflow configuration under replayerConfig, or pass it as extra arguments.
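Before wiring the transformation into a replay, it can help to exercise the header-rewriting logic locally. The sketch below restates that logic in a compact form and runs it against plain `Map` objects. The `Map` objects are stand-ins, on the assumption that the request passed in by Traffic Replayer exposes the same `get` and `set` calls that the script uses. The `rewriteHeader` helper and the sample request are illustrative.

```javascript
const NEW_CONTENT_TYPE = "application/json";
const ELASTIC_CONTENT_TYPE = "application/vnd.elasticsearch+json";

// Rewrite one header in place, handling both single values and arrays of values.
function rewriteHeader(headers, name) {
  const value = headers.get(name);
  if (Array.isArray(value)) {
    headers.set(name, value.map((v) => (v.includes(ELASTIC_CONTENT_TYPE) ? NEW_CONTENT_TYPE : v)));
  } else if (typeof value === "string" && value.includes(ELASTIC_CONTENT_TYPE)) {
    headers.set(name, NEW_CONTENT_TYPE);
  }
}

function transform(request) {
  const headers = request.get("headers");
  if (headers) {
    rewriteHeader(headers, "Content-Type");
    rewriteHeader(headers, "Accept");
  }
  return request;
}

// Stand-in request: Map objects in place of the runtime's request object (assumption).
const request = new Map([
  ["method", "GET"],
  ["headers", new Map([
    ["Content-Type", "application/vnd.elasticsearch+json;compatible-with=8"],
    ["Accept", ["application/vnd.elasticsearch+json;compatible-with=8", "text/plain"]],
  ])],
]);

const result = transform(request);
console.log(result.get("headers").get("Content-Type")); // application/json
console.log(result.get("headers").get("Accept")); // [ 'application/json', 'text/plain' ]
```

Like the script above, this sketch looks up the exact keys Content-Type and Accept, so headers captured with different capitalization would pass through unchanged.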

Result logs

HTTP transactions from the source capture and those replayed to the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection are logged at:

/shared-logs-output/traffic-replayer-default/*/tuples/tuples.log

Each log entry is a newline-delimited JSON object containing source and target request/response pairs along with transaction details such as response times. Previous runs are available in gzipped format.
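Because the log is newline-delimited JSON, it can be processed one line at a time. The following sketch counts entries where the source and target status codes differ. The field names in the sample (`sourceResponse.status`, `targetResponse.status`) are hypothetical placeholders, not the documented tuple schema, so inspect a real entry and adjust the two accessor functions before using anything like this.

```javascript
// Parse newline-delimited JSON, skipping blank lines.
function parseNdjson(text) {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

// Count entries whose source and target status codes differ.
// The accessors are passed in because the real field names are not assumed here.
function countStatusMismatches(entries, getSourceStatus, getTargetStatus) {
  let mismatches = 0;
  for (const entry of entries) {
    if (getSourceStatus(entry) !== getTargetStatus(entry)) mismatches++;
  }
  return mismatches;
}

// HYPOTHETICAL sample entries; real tuples use their own field names.
const sampleLog = [
  JSON.stringify({ sourceResponse: { status: 200 }, targetResponse: { status: 200 } }),
  JSON.stringify({ sourceResponse: { status: 200 }, targetResponse: { status: 404 } }),
].join("\n");

const entries = parseNdjson(sampleLog);
const mismatchCount = countStatusMismatches(
  entries,
  (e) => e.sourceResponse.status,
  (e) => e.targetResponse.status
);
console.log(`${mismatchCount} of ${entries.length} entries differ`); // 1 of 2 entries differ
```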

To view logs in human-readable format:

console tuples show --in /shared-logs-output/traffic-replayer-default/<ID>/tuples/tuples.log > readable-tuples.log

Note

These logs contain the contents of all requests, including authorization headers and HTTP message bodies. Ensure that access to the migration environment is restricted.

Amazon CloudWatch metrics

Traffic Replayer emits OpenTelemetry metrics to Amazon CloudWatch and traces through AWS X-Ray. Key metrics include:

Metric | Description
--- | ---
sourceStatusCode | HTTP status codes for source and target, with dimensions for HTTP verb and status code family. Quickly identifies discrepancies between source and target responses.
lagBetweenSourceAndTargetRequests | Delay between requests hitting the source and target. With a speedup factor greater than 1, this value should decrease as replay progresses.
bytesWrittenToTarget / bytesReadFromTarget | Throughput to and from the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection.
numRetriedRequests | Requests retried because of status code mismatches between source and target.
Various (*)Count | Event counts for completed operations.
Various (*)Duration | Duration of each processing step.
Various (*)ExceptionCount | Exceptions encountered during each processing phase.

Note

Metrics pushed to Amazon CloudWatch may experience a visibility lag of approximately 5 minutes. Amazon CloudWatch retains higher-resolution data for a shorter period than lower-resolution data. For more information, see Amazon CloudWatch concepts.
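When reading lagBetweenSourceAndTargetRequests, a rough estimate of how long the lag should take to close can be useful. The sketch below assumes an idealized replay in which the target always keeps up: with a speedup factor of s, each second of replay covers s seconds of captured traffic, so the lag shrinks by (s - 1) seconds per second. The function name is illustrative, and a real replay is likely to take longer.

```javascript
// Estimate the seconds needed for the replay lag to reach zero,
// assuming the target keeps up with the configured speedup factor.
function estimateCatchUpSeconds(initialLagSeconds, speedupFactor) {
  if (initialLagSeconds < 0) throw new RangeError("initialLagSeconds must not be negative");
  if (initialLagSeconds === 0) return 0;
  if (speedupFactor <= 1) return Infinity; // at 1x or slower, the lag never shrinks
  return initialLagSeconds / (speedupFactor - 1);
}

// One hour of lag at speedup factor 2 closes in about one hour.
console.log(estimateCatchUpSeconds(3600, 2)); // 3600

// The same lag at speedup factor 10 closes in about 400 seconds.
console.log(estimateCatchUpSeconds(3600, 10)); // 400
```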

What is not migrated automatically

Plan separate work for:

  • Security configuration

  • ISM/ILM policies

  • Ingest pipelines

  • OpenSearch Dashboards or Kibana saved objects

  • Data streams

  • Cluster-level tuning