Use the solution
This section describes how to run a migration using the Migration Assistant for Amazon OpenSearch Service solution after you have deployed it on Amazon EKS. The day-to-day operator interface is the Workflow CLI, which runs in the Migration Console pod (migration-console-0) on Amazon EKS. The supporting console CLI provides component-level inspection and ad-hoc operations during validation and troubleshooting.
Getting started with the Workflow CLI
This sequence is the shortest safe path to your first migration to Amazon OpenSearch Service or Amazon OpenSearch Serverless: load the right schema for your version, prove connectivity, run a small pilot, and only then run the full workflow.
Before you start
Make sure all of the following are true:
-
Migration Assistant is deployed on Amazon EKS. See Deploy the solution.
-
The source cluster and the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection are reachable from the Amazon EKS cluster.
-
Snapshot storage is ready in Amazon S3 if you plan to run backfill.
-
Any basic-auth Kubernetes secrets you need can be created in the ma namespace.
Step 1: Access the Migration Console
kubectl exec -it migration-console-0 -n ma -- /bin/bash
If you are accessing Amazon EKS from a new shell, refresh your kubeconfig first:
aws eks update-kubeconfig --region <REGION> --name migration-eks-cluster-<STAGE>-<REGION>
Step 2: Confirm the installed version
console --version
This matters because the workflow schema can change by release.
Step 3: Load the version-matched sample
workflow configure sample --load
This gives you the safest starting point for your installed release.
Step 4: Edit the workflow configuration
workflow configure edit
Fill in the fields that describe your migration:
-
Source endpoint, version, and authentication.
-
Target endpoint and authentication. For Amazon OpenSearch Serverless, set service: aoss in the SigV4 authConfig. For Amazon OpenSearch Service, set service: es.
-
Snapshot repository details if you are running backfill.
-
The migration pattern: backfill only, capture and replay only, or both.
Note
Do not start by editing every possible field. Start with the minimum required fields for your path.
Target configuration for Amazon OpenSearch Serverless
When the target is an Amazon OpenSearch Serverless collection, set the target cluster like this:
{
  "targetClusters": {
    "target": {
      "endpoint": "https://<collection-id>.<region>.aoss.amazonaws.com",
      "authConfig": {
        "sigv4": {
          "region": "<region>",
          "service": "aoss"
        }
      }
    }
  }
}
The migration IAM role created by the Amazon EKS deployment must also be added as a principal in your collection’s data access policy. The IAM role is named <eks-cluster-name>-migrations-role. Add it to the collection’s data access policy with both collection-level and index-level permissions before running the workflow.
Target configuration for Amazon OpenSearch Service
When the target is an Amazon OpenSearch Service domain:
{
  "targetClusters": {
    "target": {
      "endpoint": "https://<domain-endpoint>",
      "authConfig": {
        "sigv4": {
          "region": "<region>",
          "service": "es"
        }
      }
    }
  }
}
If your domain has fine-grained access control (FGAC) enabled, map the migration IAM role to a security role on the domain (typically all_access during migration, then scoped down). See Troubleshooting.
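The two target fragments above differ only in the endpoint and the SigV4 service name. As a minimal sketch of that difference, the following JavaScript builds the fragment for either target type. The helper name buildTargetCluster is hypothetical and not part of the Workflow CLI, and the rule that a Serverless endpoint contains ".aoss.amazonaws.com" is an assumption drawn from the example endpoint shown earlier.

```javascript
// Hypothetical helper (not part of the Workflow CLI): builds the
// targetClusters fragment for either target type.
function buildTargetCluster(endpoint, region) {
  // Assumption: Serverless collection endpoints contain ".aoss.amazonaws.com",
  // as in the example endpoint; anything else is treated as a domain.
  const service = endpoint.includes(".aoss.amazonaws.com") ? "aoss" : "es";
  return {
    targetClusters: {
      target: {
        endpoint,
        authConfig: { sigv4: { region, service } },
      },
    },
  };
}

// Example endpoint is a placeholder, not a real collection.
const fragment = buildTargetCluster(
  "https://abc123.us-east-1.aoss.amazonaws.com",
  "us-east-1"
);
console.log(JSON.stringify(fragment, null, 2));
```

You can paste the printed JSON into the editor opened by workflow configure edit and then adjust it against the sample schema for your installed release.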
Step 5: Create Kubernetes secrets if you use basic authentication
kubectl create secret generic source-credentials \
  --from-literal=username=<SOURCE_USER> \
  --from-literal=password=<SOURCE_PASSWORD> \
  -n ma

kubectl create secret generic target-credentials \
  --from-literal=username=<TARGET_USER> \
  --from-literal=password=<TARGET_PASSWORD> \
  -n ma
Reference those secret names in authConfig.basic.secretName in your workflow configuration.
Step 6: Verify connectivity before submitting a workflow
console clusters connection-check
The check runs against both source and target by default. To narrow it to one side:
console clusters connection-check --cluster source
console clusters connection-check --cluster target
For a direct API check:
console clusters curl source /
console clusters curl target /
If these checks fail, stop and fix connectivity or authentication first. Do not start a workflow yet.
Step 7: Verify AWS identity if you use SigV4
If your source or target uses Amazon OpenSearch Service or Amazon OpenSearch Serverless, verify pod identity is working from the Migration Console pod:
aws sts get-caller-identity
If console clusters connection-check works in the Migration Console but the workflow later fails with HTTP 401 or 403, verify that the Argo workflow executor pods are using the argo-workflow-executor service account with its EKS Pod Identity association. On Amazon EKS, both the Migration Console pod and the workflow executor pods get Pod Identity-backed AWS credentials automatically through the bootstrap script.
Step 8: Run a pilot migration first
Use a small allowlist or a representative subset before you attempt the full migration. This is the easiest way to catch mapping issues, authentication issues, and throughput problems early.
workflow submit
workflow manage
Use workflow manage to watch the run and approve any gated steps.
Step 9: Validate the pilot
Check counts and basic behavior on the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection before you expand scope:
console clusters cat-indices
console clusters curl target /<index>/_count
console clusters curl target '/<index>/_search?size=5&pretty'
If you are migrating applications with live traffic, also validate representative queries against the target.
Step 10: Run the real migration
After the pilot succeeds, widen the configuration to the full index set and submit again:
workflow configure edit
workflow submit
workflow manage
Step 11: Use logs if anything fails
workflow status
workflow output
workflow output --follow
workflow submit automatically stops and replaces an existing workflow with the same name, so you do not need to manually clean up between runs. If a previous run left orphaned migration custom resources, use workflow reset instead of deleting Argo workflows directly:
workflow reset                                            # interactive: lists CRDs and prompts before delete
workflow reset migration-foo                              # delete a specific resource by name
workflow reset --all                                      # delete everything (capture proxies are protected)
workflow reset --all --include-proxies --delete-storage   # also remove capture proxies and Apache Kafka PVCs
Core commands
The workflow CLI orchestrates a full migration; the console CLI inspects or manually drives a single component during validation and troubleshooting.
Workflow commands
| Command | Why you use it |
|---|---|
|
|
Shows the sample schema for your installed version |
|
|
Loads that sample as your starting point |
|
|
Opens the workflow config in your editor ( |
|
|
Prints the current config |
|
|
Clears the current config and lets you start over. Use |
|
|
Starts the migration workflow (auto-stops and replaces an existing one with the same name) |
|
|
Submits and blocks until the workflow completes or the timeout is reached |
|
|
Primary day-to-day interface for monitoring, approvals, and logs (interactive TUI) |
|
|
Shows the current workflow tree in a non-interactive form |
|
|
Shows running and completed workflows |
|
|
Shows logs across workflow pods |
|
|
Streams logs live |
|
|
Approves pending gates that match exact names or globs |
|
|
Lists migration custom resources and lets you delete them safely |
Workflow command flags
The flags below cover all options available on each workflow subcommand. Global flags (-v/--verbose) can be placed before any subcommand to increase logging.
workflow submit
| Flag | Purpose |
|---|---|
|
|
Block until the workflow completes or times out (default: return immediately) |
|
|
Timeout when using |
|
|
Polling interval between status checks when using |
|
|
Configuration session name to load parameters from (default: |
|
|
Name of the workflow to replace if one already exists (default: |
|
|
Kubernetes namespace (default: |
workflow status
| Flag | Purpose |
|---|---|
|
|
Show all workflows including completed ones (default: only running) |
|
|
Show status for all workflows regardless of name (mutually exclusive with |
|
|
Show status for a specific workflow (default: |
|
|
Run a current status check for each snapshot and backfill still in progress |
|
|
Argo Server URL (default: auto-detected from environment) |
|
|
Kubernetes namespace (default: |
|
|
Skip TLS certificate verification (default: true) |
|
|
Bearer token for Argo Server authentication |
workflow output
| Flag | Purpose |
|---|---|
|
|
Stream logs live using |
|
|
Include RFC 3339 timestamps in log output |
|
|
Show output for all workflows (mutually exclusive with |
|
|
Filter logs to a specific workflow (default: |
|
|
Label selector to filter pods (e.g. |
|
|
Kubernetes namespace (default: |
|
|
Argo Server URL |
|
|
Skip TLS certificate verification (default: true) |
|
|
Bearer token for authentication |
workflow manage
| Flag | Purpose |
|---|---|
|
|
Workflow to manage (default: |
|
|
Argo Server URL (default: auto-detected from environment) |
|
|
Kubernetes namespace (default: |
|
|
Skip TLS certificate verification (default: true) |
|
|
Bearer token for authentication |
workflow reset
| Flag | Purpose |
|---|---|
|
|
Name or glob pattern of a specific resource to delete (e.g. |
|
|
Delete all migration custom resources |
|
|
Also delete resources that depend on the targeted resource |
|
|
Include capture proxies in deletion (they are protected by default) |
|
|
Delete Kafka PVCs and orphaned PVs during reset |
|
|
Kubernetes namespace (default: |
workflow approve
| Flag | Purpose |
|---|---|
|
|
Exact names or glob patterns of pending approval gates to approve (e.g. |
|
|
Workflow containing the gates (default: |
|
|
Argo Server URL (default: auto-detected from environment) |
|
|
Kubernetes namespace (default: |
|
|
Skip TLS certificate verification (default: true) |
|
|
Bearer token for authentication |
Console commands
The console CLI groups operations by component:
| Command | Why you use it |
|---|---|
|
|
Confirms which schema and behavior your Migration Console is running |
|
|
Verifies the Migration Console can reach and authenticate to source and target |
|
|
Lists indexes on one or both clusters |
|
|
Issues a direct API request against the named cluster |
|
|
Destructive — deletes all indexes on the named cluster |
|
|
Manage snapshots in Amazon S3 |
|
|
Run or preview metadata migration outside the workflow |
|
|
Inspect or drive RFS backfill |
|
|
Inspect or drive Traffic Replayer |
|
|
Inspect Migration Assistant metrics |
|
|
Inspect Strimzi-managed Apache Kafka used by capture and replay |
|
|
Inspect captured request/response tuples for replay validation |
Note
The workflow path drives metadata, backfill, and replay automatically. Reach for the equivalent console command only when you want to inspect state or work around a specific failure (for example, to call console snapshot status while a long-running snapshot is in progress).
Approval gates
Not every migration step should run without human review. Approval gates let the workflow stop at meaningful checkpoints — typically transitions after metadata work, backfill milestones, and cutover-sensitive steps — so you can validate before continuing.
workflow manage
workflow approve <STEP_NAME>
Status symbols
| Symbol | Meaning |
|---|---|
|
|
Succeeded |
|
|
Running |
|
|
Pending |
|
|
Failed |
|
|
Waiting for approval |
Migration scenarios
Migration Assistant supports three migration patterns. Pick the one that matches your downtime tolerance.
Scenario 1: Backfill only
Best when you can tolerate a brief write freeze, or when writes can be paused and replayed from an external queue.
Snapshot source → Migrate metadata → Backfill documents → Verify → Switch traffic
Scenario 2: Capture and Replay only
Best when the data is small enough that live replay alone can synchronize the target on Amazon OpenSearch Service or Amazon OpenSearch Serverless, or when you want to replay traffic against multiple targets to compare results.
Reroute traffic to capture proxy → Migrate metadata → Replay traffic → Verify → Switch traffic to target
Scenario 3: Backfill + Capture and Replay (zero-downtime)
The most comprehensive approach. Capture begins first so no writes are lost, then backfill brings over historical data, then replay catches the target up to real-time.
Reroute traffic to capture proxy → Snapshot source → Migrate metadata → Backfill documents → Replay captured traffic → Verify → Switch traffic to target
Backfill tuning
Useful Reindex-from-Snapshot settings include:
-
podReplicas: number of RFS pods running in parallel (one shard per pod).
-
maxConnections: bulk-indexer concurrency to the target.
-
documentsPerBulkRequest: bulk batch size.
-
maxShardSizeBytes: maximum supported shard size (default 80 GiB). Larger shards must be reduced before backfill (force-merge or split).
-
initialLeaseDuration: ISO-8601 duration each worker holds a shard lease before re-acquisition (default PT10M).
-
allowedDocExceptionTypes: list of exception class names from the target's response that should be counted as success rather than retried.
-
allowLooseVersionMatching: bypass the strict source/target version compatibility check.
Because RFS reads from snapshot storage in Amazon S3, increasing worker count does not add live read load to the source cluster. It mainly changes how quickly the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection is driven.
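Because shards larger than maxShardSizeBytes must be reduced before backfill, it can help to screen shard sizes up front. The sketch below is illustrative only: the function name and the input shape (objects with index, shard, and sizeBytes) are assumptions, and you would need to gather the sizes yourself, for example from a shard listing on the source cluster.

```javascript
const GIB = 1024 ** 3;

// Hypothetical pre-check: return the shards that exceed the backfill limit.
// The default mirrors the documented 80 GiB default for maxShardSizeBytes.
function findOversizedShards(shards, maxShardSizeBytes = 80 * GIB) {
  return shards.filter((s) => s.sizeBytes > maxShardSizeBytes);
}

// Sample data is invented for illustration.
const sampleShards = [
  { index: "logs-2023", shard: 0, sizeBytes: 42 * GIB },
  { index: "logs-2023", shard: 1, sizeBytes: 95 * GIB },
  { index: "movies", shard: 0, sizeBytes: 3 * GIB },
];

for (const s of findOversizedShards(sampleShards)) {
  console.log(`${s.index}[${s.shard}] is ${(s.sizeBytes / GIB).toFixed(1)} GiB; reduce it before backfill`);
}
```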
Replay tuning
Useful Traffic Replayer settings include:
-
podReplicas: number of replayer pods.
-
speedupFactor: default 1.1. A value of 2.0 replays traffic twice as fast as the original timeline.
-
removeAuthHeader: strips the captured Authorization header before replaying. Useful when the captured traffic carries credentials that would not be valid against the target.
-
authHeaderOverride: replaces the captured Authorization header with a static value.
-
dependsOnSnapshotMigrations: ensures replay only starts after backfill completes.
-
nonRetryableDocExceptionTypes: list of exception class names that should be counted as failures but not retried.
Warning
Setting both replayerConfig.removeAuthHeader: true and an authConfig block on the same target is rejected by the schema. Pick one — either rely on the target’s authConfig (the Traffic Replayer applies it for you) or strip the captured header.
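A quick local pre-check can catch this conflict before the schema rejects the configuration. The following is a sketch under assumptions: the function name is hypothetical, and because where replayerConfig sits relative to the target in your release's schema is not shown here, the two objects are passed in separately rather than read from a fixed path.

```javascript
// Hypothetical pre-check for the conflict described in the warning:
// removeAuthHeader: true together with an authConfig block on the target.
function hasAuthConflict(targetCluster, replayerConfig) {
  const stripsHeader = replayerConfig?.removeAuthHeader === true;
  const hasAuthConfig = targetCluster?.authConfig != null;
  return stripsHeader && hasAuthConfig;
}

// Illustrative values only.
const target = {
  endpoint: "https://<domain-endpoint>",
  authConfig: { sigv4: { region: "us-east-1", service: "es" } },
};

console.log(hasAuthConflict(target, { removeAuthHeader: true }));  // true
console.log(hasAuthConflict(target, { removeAuthHeader: false })); // false
```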
Cutover and rollback
Switching traffic to the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection is the cutover step. By this point, capture has already protected writes during backfill, replay has caught the target up, and validation is complete.
Before you switch:
-
replay has reached the live edge,
-
the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection is healthy,
-
representative application queries work on the target,
-
the application team is ready to move traffic, and
-
the rollback path is still available.
The exact cutover mechanism depends on your environment, but the principle is always the same:
-
Stop pointing clients at the capture proxy.
-
Point clients directly at the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection.
-
Watch the target closely during the first production traffic window.
In practice, that usually means updating a DNS record, a load balancer backend, an application connection string, or a service-discovery entry. Keep the source cluster available during a rollback window (typically 24–72 hours) before decommissioning. After the rollback window has passed, see Uninstall the solution to remove the Migration Assistant infrastructure.
Assessment and breaking changes
Before running a migration, review breaking changes between your source and target versions. Breaking changes may require modifications to your client applications after the migration completes.
Understanding breaking changes
Between major versions of Elasticsearch and OpenSearch, features may be deprecated or removed. Common breaking changes include:
-
Removal of mapping types (Elasticsearch 6.x to 7.x and later)
-
Changes to field types (for example, string replaced by text and keyword)
-
Query DSL syntax changes
-
REST API endpoint changes
-
Plugin compatibility differences
For a complete list of breaking changes for your migration path, see the Migration Assistant documentation on GitHub.
Impact of data transformations
Any time you apply a transformation to your data — such as changing index names, modifying field mappings, or splitting indexes with type mappings — these changes may need to be reflected in your client configurations. Run production-like queries against the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection before switching over actual production traffic. This verifies that clients can communicate with the target, locate the necessary indexes and fields, and retrieve expected results.
For complex migrations involving multiple transformations, perform a trial migration with representative non-production data to fully test client compatibility.
Creating a snapshot
Once you have your change data capture solution in place or have disabled indexing to your source cluster, create a snapshot. The snapshot captures all the metadata and documents to be migrated to the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection.
Create a snapshot
Run the following command from the Migration Console pod to initiate snapshot creation:
console snapshot create
Migration Assistant automatically generates a snapshot name and configures the necessary Amazon S3 repository. Alternatively, you can use an existing snapshot — see the snapshot configuration in your workflow configuration for details.
Check snapshot status
To check the snapshot creation status:
console snapshot status
To retrieve detailed information about the snapshot:
console snapshot status --deep-check
Wait for snapshot creation to complete before proceeding to the metadata migration phase. A completed snapshot returns output similar to:
SUCCESS
Snapshot is SUCCESS.
Percent completed: 100.00%
Data GiB done: 29.211/29.211
Total shards: 40
Successful shards: 40
Failed shards: 0
Start time: 2024-07-22 18:21:42
Duration: 0h 13m 4s
Anticipated duration remaining: 0h 0m 0s
Throughput: 38.13 MiB/sec
Managing slow snapshot speeds
Depending on the data size in the source cluster and the bandwidth allocated for snapshots, the process can take some time. Use the --max-snapshot-rate-mb-per-node option to adjust the maximum rate at which the source cluster’s nodes create the snapshot. Increasing the snapshot rate consumes more source node resources, which may affect the cluster’s ability to handle normal traffic.
Migrating metadata
Metadata migration involves creating a snapshot of your source cluster and then migrating the metadata (index settings, mappings, templates, and aliases) to the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection.
The metadata migration tool gathers information from the source cluster through a snapshot or HTTP requests. These snapshots are fully compatible with the backfill process for Reindex-from-Snapshot (RFS).
Evaluating metadata
Before migrating, use the evaluate command to preview what will be migrated:
console metadata evaluate
This scans the source cluster, applies filtering and modifications, and produces a list of items that will be migrated. Items not listed will not be migrated. Use this as a safety check before modifying the target.
Example output:
Starting Metadata Evaluation
Clusters:
   Source:
      Remote Cluster: OpenSearch 1.3.16
   Target:
      Remote Cluster: OpenSearch 2.14.0
Migration Candidates:
   Index Templates:
      simple_index_template
   Component Templates:
      simple_component_template
   Indexes:
      blog_2023, movies_2023
   Aliases:
      alias1, movies-alias
Results:
   0 issue(s) detected
Running metadata migration
The migrate command applies all evaluated items to the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection:
console metadata migrate
If re-run multiple times, items that were previously migrated are not recreated. To re-migrate an item, delete it from the target first, then rerun evaluate and migrate.
Example output:
Starting Metadata Migration
Clusters:
   Source:
      Snapshot: OpenSearch 1.3.16
   Target:
      Remote Cluster: OpenSearch 2.14.0
Migrated Items:
   Index Templates:
      simple_index_template
   Component Templates:
      simple_component_template
   Indexes:
      blog_2023, movies_2023
   Aliases:
      alias1, movies-alias
Results:
   0 issue(s) detected
Verifying metadata
Before proceeding to backfill, confirm the target cluster details. Depending on your configuration, check the sharding strategy or verify that index mappings are correctly defined by ingesting a test document:
console clusters curl target /<index>/_mapping?pretty
console clusters curl target /<index>/_settings?pretty
Field type transformations
When migrating between versions, some field types may be deprecated or incompatible. Migration Assistant includes built-in transformations:
-
string to text/keyword: Automatically transforms the deprecated string type (Elasticsearch 1.x–5.x) to text or keyword based on the original index property.
-
flattened to flat_object: Automatically transforms the flattened field type (Elasticsearch 7.3+) to flat_object (OpenSearch 2.7+).
-
dense_vector to knn_vector: Automatically transforms dense_vector (Elasticsearch 7.x) to knn_vector with appropriate similarity mappings and HNSW algorithm parameters.
Custom field type transformations
For field types not covered by automatic transformations, create a custom JavaScript transformation:
-
Access the Migration Console pod:
kubectl exec -it migration-console-0 -n ma -- /bin/bash
-
Create a JavaScript transformation file:
cat > /shared-logs-output/field-type-converter.js << 'SCRIPT'
function main(context) {
  // Each rule: when every `when` entry matches a node, apply `set` and delete the `remove` keys.
  const rules = [
    { when: { type: "string" }, set: { type: "text" } },
    { when: { type: "flattened" }, set: { type: "flat_object" }, remove: ["index"] }
  ];

  // Recursively walk arrays, Maps, and plain objects, applying the rules to each node.
  function applyRules(node, rules) {
    if (Array.isArray(node)) {
      node.forEach((child) => applyRules(child, rules));
    } else if (node instanceof Map) {
      for (const { when, set, remove = [] } of rules) {
        const matches = Object.entries(when).every(([k, v]) => node.get(k) === v);
        if (matches) {
          Object.entries(set).forEach(([k, v]) => node.set(k, v));
          remove.forEach((key) => node.delete(key));
        }
      }
      for (const child of node.values()) {
        applyRules(child, rules);
      }
    } else if (node && typeof node === "object") {
      for (const { when, set, remove = [] } of rules) {
        const matches = Object.entries(when).every(([k, v]) => node[k] === v);
        if (matches) {
          Object.assign(node, set);
          remove.forEach((key) => delete node[key]);
        }
      }
      Object.values(node).forEach((child) => applyRules(child, rules));
    }
  }

  // Only transform items that carry a type, a name, and a body.
  return (doc) => {
    if (doc && doc.type && doc.name && doc.body) {
      applyRules(doc, rules);
    }
    return doc;
  };
}
(() => main)();
SCRIPT
-
Create a transformation descriptor:
cat > /shared-logs-output/transformation.json << 'EOF'
[
  {
    "JsonJSTransformerProvider": {
      "initializationScriptFile": "/shared-logs-output/field-type-converter.js",
      "bindingsObject": "{}"
    }
  }
]
EOF
-
Run metadata migration with the transformation:
console metadata migrate --transformer-config-file /shared-logs-output/transformation.json
Handling type mapping deprecation (Elasticsearch 6.x)
In Elasticsearch versions prior to 7.0, an index could contain multiple types. OpenSearch no longer supports multiple mapping types. The TypeMappingSanitizationTransformer manages this deprecation by routing, merging, or dropping types during migration.
To configure type mapping handling, add a transformation configuration with the TypeMappingSanitizationTransformerProvider:
[
  {
    "TypeMappingSanitizationTransformerProvider": {
      "staticMappings": {
        "activity": {
          "user": "new_users",
          "post": "new_posts"
        }
      },
      "sourceProperties": {
        "version": {
          "major": 6,
          "minor": 8
        }
      }
    }
  }
]
Supported strategies:
-
Route types to separate indexes — Split different types into their own indexes on the Amazon OpenSearch Service domain.
-
Merge all types into one index — Combine multiple types into a single index by mapping them all to the same target index name.
-
Drop specific types — Selectively migrate only specific types; omitted types are not migrated.
-
Keep original structure — Use regex patterns to maintain the same index name while removing the type layer.
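To make the route, merge, and drop strategies concrete, the sketch below models how a staticMappings block could resolve a source index and type to a target index. It is a simplified illustration, not the TypeMappingSanitizationTransformer implementation: the function name is hypothetical, regex-based mappings are not modeled, and indexes absent from staticMappings are reported as "not covered" rather than guessed at.

```javascript
// Simplified model of staticMappings lookup (illustration only).
// Returns the target index name, null when the type is omitted (dropped),
// or undefined when the source index is not covered by staticMappings.
function resolveTargetIndex(staticMappings, sourceIndex, sourceType) {
  const typeMap = staticMappings[sourceIndex];
  if (typeMap === undefined) return undefined;
  return Object.prototype.hasOwnProperty.call(typeMap, sourceType)
    ? typeMap[sourceType]
    : null;
}

// Route: each type goes to its own index (same mapping as the example above).
const route = { activity: { user: "new_users", post: "new_posts" } };
// Merge: both types map to one index. The name "activity_all" is invented.
const merge = { activity: { user: "activity_all", post: "activity_all" } };
// Drop: "post" is omitted, so only "user" documents are migrated.
const drop = { activity: { user: "new_users" } };

console.log(resolveTargetIndex(route, "activity", "post")); // new_posts
console.log(resolveTargetIndex(merge, "activity", "post")); // activity_all
console.log(resolveTargetIndex(drop, "activity", "post"));  // null
```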
Important
Whenever the transformation configuration is updated, the backfill and replayer tools must be stopped and restarted to apply the changes. Previously migrated data and metadata may need to be cleared to avoid an inconsistent state.
Metadata troubleshooting
Accessing detailed logs
Metadata migration creates a detailed log file for each run:
ls -al /shared-logs-output/migration-console-default/*/metadata/
tail /shared-logs-output/migration-console-default/*/metadata/*.log
OpenSearch running in compatibility mode
If you encounter errors about being unable to update an Elasticsearch 7.10.2 cluster, compatibility mode may be enabled on the Amazon OpenSearch Service domain. Disable compatibility mode to continue. See Enable compatibility mode in the Amazon OpenSearch Service documentation.
Running backfill
After metadata has been migrated, use Reindex-from-Snapshot (RFS) to backfill documents from the source cluster snapshot into the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection.
Starting the backfill
Start the RFS backfill:
console backfill start
The status reports Running even if all shards have been migrated. See Stopping the backfill for how to finalize.
Scaling the backfill
To speed up the migration, increase the number of RFS worker pods. Each worker processes shards in parallel from the snapshot in Amazon S3, so scaling up does not add live read load to the source cluster — it only increases the indexing rate on the target.
console backfill scale <NUM_WORKERS>
For example, to scale to 5 workers:
console backfill scale 5
It may take a few minutes for additional workers to come online. Scale up gradually while monitoring the health metrics of the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection to avoid oversaturating it. Amazon OpenSearch Service provides cluster health and performance metrics that you can use for this monitoring.
Monitoring the backfill
Check the status of the backfill:
console backfill status
For detailed shard-level progress:
console backfill status --deep-check
Example output:
BackfillStatus.RUNNING
Running=9
Pending=1
Desired=10
Shards total: 62
Shards completed: 46
Shards incomplete: 16
Shards in progress: 11
Shards unclaimed: 5
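Because the status keeps reporting Running after all shards are migrated, the shard counters are the signal to watch. The following sketch parses the counters from text shaped like the example output above and reports whether every shard is complete. The function names are hypothetical, and the exact output format may differ between releases, so treat the label strings as assumptions.

```javascript
// Extract the shard counters from deep-check status text.
// Assumes labels of the form "Shards total: 62", as in the example output.
function parseBackfillStatus(text) {
  const read = (label) => {
    const m = text.match(new RegExp(label + ":\\s*(\\d+)"));
    return m ? Number(m[1]) : null;
  };
  return {
    total: read("Shards total"),
    completed: read("Shards completed"),
    incomplete: read("Shards incomplete"),
    inProgress: read("Shards in progress"),
    unclaimed: read("Shards unclaimed"),
  };
}

// True only when counters were found and every shard is completed.
function allShardsMigrated(status) {
  return status.total !== null && status.total > 0 && status.completed === status.total;
}

const sampleStatus = `BackfillStatus.RUNNING
Running=9 Pending=1 Desired=10
Shards total: 62
Shards completed: 46
Shards incomplete: 16
Shards in progress: 11
Shards unclaimed: 5`;

const parsed = parseBackfillStatus(sampleStatus);
console.log(parsed, allShardsMigrated(parsed));
```

For the sample above the check returns false, because 16 of the 62 shards are still incomplete.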
Amazon CloudWatch metrics and dashboard
Migration Assistant creates two Amazon CloudWatch dashboards to visualize migration health and performance:
-
MA-<STAGE>-<REGION>-ReindexFromSnapshot: metrics for the RFS workers and, for Amazon OpenSearch Service targets, the domain's cluster metrics.
-
MA-<STAGE>-<REGION>-CaptureReplay: metrics for the capture proxy and Traffic Replayer.
Replace <STAGE> and <REGION> with the values you used during deployment (for example, MA-prod-us-east-1-ReindexFromSnapshot).
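As a small illustration of the naming pattern, this sketch derives both dashboard names from a stage and Region. The function name is hypothetical; the pattern itself comes from the names listed above.

```javascript
// Build the two dashboard names from the deployment stage and AWS Region.
function dashboardNames(stage, region) {
  const prefix = `MA-${stage}-${region}`;
  return [`${prefix}-ReindexFromSnapshot`, `${prefix}-CaptureReplay`];
}

console.log(dashboardNames("prod", "us-east-1"));
// [ 'MA-prod-us-east-1-ReindexFromSnapshot', 'MA-prod-us-east-1-CaptureReplay' ]
```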
Find the dashboards in the Amazon CloudWatch console in the AWS Region where you deployed Migration Assistant. The metric graphs for your target domain are blank until you select the OpenSearch domain you are migrating to from the dropdown menu at the top of the dashboard.
Pausing the backfill
To pause a migration without losing progress:
console backfill pause
This stops all existing workers while leaving the backfill operation in a resumable state. To restart:
-
Run console backfill start, or
-
Scale up the worker count with console backfill scale <NUM_WORKERS>.
Stopping the backfill
Completing the backfill requires manually stopping the migration. Stopping shuts down all workers and cleans up all coordination metadata. After status checks report that data has been completely migrated, stop the migration:
console backfill stop
Example output:
Backfill stopped successfully.
Archiving the working state of the backfill operation...
Backfill working state archived to: /shared-logs-output/migration-console-default/backfill_working_state/working_state_backup_20241115174822.json
Warning
You cannot restart a stopped migration. If you may need to resume later, use console backfill pause instead.
Validating the backfill
After the backfill completes and workers have stopped, verify the contents of your Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection:
console clusters cat-indices --refresh
This displays the document count for each index. Compare source and target counts to verify completeness:
SOURCE CLUSTER
health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   my-index -DqPQDrATw25hhe5Ss34bQ 1   0   3          0            12.7kb     12.7kb

TARGET CLUSTER
health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   my-index bGfGtYoeSU6U6p8leR5NAQ 1   0   3          0            5.5kb      5.5kb
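Comparing counts by eye gets error-prone with many indexes. The sketch below parses one cluster's table (header row included) and lists every index whose target count is missing or different. It assumes you pass the source and target sections separately, that columns are whitespace-separated as in the example above, and that index names are unchanged by transformations; the function names are hypothetical.

```javascript
// Parse one cluster's cat-indices table into a Map of index name -> docs.count.
function parseCatIndices(text) {
  const lines = text.trim().split("\n").map((l) => l.trim()).filter(Boolean);
  const header = lines[0].split(/\s+/);
  const nameCol = header.indexOf("index");
  const countCol = header.indexOf("docs.count");
  if (nameCol < 0 || countCol < 0) {
    throw new Error("expected 'index' and 'docs.count' columns");
  }
  const counts = new Map();
  for (const line of lines.slice(1)) {
    const cols = line.split(/\s+/);
    counts.set(cols[nameCol], Number(cols[countCol]));
  }
  return counts;
}

// Return every source index whose target count is missing or different.
function findCountMismatches(source, target) {
  const mismatches = [];
  for (const [index, sourceCount] of source) {
    const targetCount = target.has(index) ? target.get(index) : null;
    if (targetCount !== sourceCount) {
      mismatches.push({ index, sourceCount, targetCount });
    }
  }
  return mismatches;
}

const sourceTable = `
health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   my-index -DqPQDrATw25hhe5Ss34bQ 1   0   3          0            12.7kb     12.7kb`;
const targetTable = `
health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   my-index bGfGtYoeSU6U6p8leR5NAQ 1   0   3          0            5.5kb      5.5kb`;

console.log(findCountMismatches(parseCatIndices(sourceTable), parseCatIndices(targetTable)));
```

An empty result means every source index has a matching document count on the target. Store sizes are expected to differ and are not compared.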
Verifying no failed documents
Use the following Amazon CloudWatch Logs Insights query to identify failed documents:
fields @message
| filter @message like "Bulk request succeeded, but some operations failed."
| sort @timestamp desc
| limit 10000
Run this query against the OpenSearchMigrations log group in Amazon CloudWatch. If failed documents are identified, you can index them directly rather than re-running the full backfill.
Using Traffic Replayer
Note
This section is only relevant if you are using Capture and Replay to avoid downtime during a migration to Amazon OpenSearch Service or Amazon OpenSearch Serverless. If you are performing backfill only, skip this section.
Traffic Replayer replays captured traffic from the source cluster to the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection. This verifies that the target can handle requests in the same way as the source and catches up to real-time traffic for a smooth migration.
When to run Traffic Replayer
Start Traffic Replayer only after all metadata and documents have been migrated. Running it before the document migration completes may cause operations to execute out of order. For example, a deletion captured after the snapshot was taken could execute before the document is added to the target.
Starting Traffic Replayer
console replay start
Example output:
Replayer started successfully.
Checking replay status
console replay status
The status reports:
-
Running — How many container instances are actively running.
-
Pending — How many instances are being provisioned.
-
Desired — The total number of instances that should be running.
Stopping Traffic Replayer
console replay stop
Delivery guarantees
Traffic Replayer retrieves traffic from Apache Kafka and updates its commit cursor after sending requests to the target. This provides an "at least once" delivery guarantee. Monitor metrics and validate externally to confirm the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection is functioning as expected.
Time scaling
Traffic Replayer sends requests in the same order they were received on each connection. With a speedupFactor greater than 1, requests are sent faster than original timing:
-
speedupFactor 1 — Same rate and idle periods as the source.
-
speedupFactor 2 — Twice as fast. GETs sent every 500 ms instead of every second.
-
speedupFactor 10 — 10x faster, as long as the target responds quickly.
If the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection cannot respond quickly enough, Traffic Replayer waits for the previous request to complete before sending the next one.
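The time-scaling arithmetic can be sketched as dividing each captured time offset by speedupFactor. This is illustrative only and is not the Traffic Replayer's implementation: the function name is hypothetical, and it models only the scheduled send time, not the back-pressure described above when the target responds slowly.

```javascript
// Illustrative arithmetic: the scheduled replay offset for a request that was
// captured `originalOffsetMs` after the start of its connection.
function replayOffsetMs(originalOffsetMs, speedupFactor) {
  if (!(speedupFactor > 0)) {
    throw new RangeError("speedupFactor must be positive");
  }
  return originalOffsetMs / speedupFactor;
}

// GETs captured once per second are scheduled every 500 ms at speedupFactor 2.
const capturedOffsets = [0, 1000, 2000, 3000];
console.log(capturedOffsets.map((t) => replayOffsetMs(t, 2)));
// [ 0, 500, 1000, 1500 ]
```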
Transformations
During migrations, some requests may need to be transformed between versions. Traffic Replayer automatically rewrites host and authentication headers. For more complex transformations, specify custom transformation rules in your workflow configuration using the replayerConfig section.
Elasticsearch content-type header compatibility
Newer Elasticsearch clients (version 7.11 and later, including all 8.x versions) use Elasticsearch-specific media types in Content-Type and Accept headers, such as application/vnd.elasticsearch+json;compatible-with=8. Amazon OpenSearch Service and Amazon OpenSearch Serverless do not support these media types.
If you are migrating traffic from Elasticsearch clients version 7.11 or later, apply a transformation to convert these headers to standard application/json:
-
Create a JavaScript transformation file in the Migration Console pod:
cat > /shared-logs-output/content-type-transformer.js << 'SCRIPT'
const NEW_CONTENT_TYPE = "application/json";
const ELASTIC_CONTENT_TYPE = "application/vnd.elasticsearch+json";

// Replace Elasticsearch-specific media types in the Content-Type and Accept headers.
function transform(request, context) {
  let headers = request.get("headers");
  if (headers) {
    let contentType = headers.get("Content-Type");
    if (Array.isArray(contentType)) {
      headers.set("Content-Type", contentType.map(v => v.includes(ELASTIC_CONTENT_TYPE) ? NEW_CONTENT_TYPE : v));
    } else if (typeof contentType === "string") {
      if (contentType.includes(ELASTIC_CONTENT_TYPE)) {
        headers.set("Content-Type", NEW_CONTENT_TYPE);
      }
    }
    let accept = headers.get("Accept");
    if (Array.isArray(accept)) {
      headers.set("Accept", accept.map(v => v.includes(ELASTIC_CONTENT_TYPE) ? NEW_CONTENT_TYPE : v));
    } else if (typeof accept === "string") {
      if (accept.includes(ELASTIC_CONTENT_TYPE)) {
        headers.set("Accept", NEW_CONTENT_TYPE);
      }
    }
  }
  return request;
}

// Return a function that transforms a single request or a list of requests.
function main(context) {
  return (request) => {
    if (Array.isArray(request)) {
      return request.flat().map(item => transform(item, context));
    }
    return transform(request, context);
  };
}

// The script evaluates to the main function.
(() => main)();
SCRIPT
-
Create a transformation configuration file:
cat > /shared-logs-output/replayer-transformation.json << 'EOF'
[
  {
    "JsonJSTransformerProvider": {
      "initializationScriptFile": "/shared-logs-output/content-type-transformer.js",
      "bindingsObject": "{}"
    }
  }
]
EOF
-
Reference the transformation in your workflow configuration under replayerConfig, or pass it as extra arguments.
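Before attaching the transformation to a replay, you can sanity-check the header rewrite against an in-memory request in Node.js. The sketch below assumes that requests and their headers behave like JavaScript Map objects, as the get and set calls in the script suggest. The helper rewriteMediaType and the sample request are hypothetical, and the logic is a condensed copy of the script rather than the file itself.

```javascript
// Local sanity check for the header rewrite.
// ASSUMPTION: requests and headers are Map-like, as in content-type-transformer.js.
const ELASTIC_CONTENT_TYPE = "application/vnd.elasticsearch+json";
const NEW_CONTENT_TYPE = "application/json";

// Hypothetical helper: condensed version of the per-header logic in the script.
function rewriteMediaType(headers, name) {
  const value = headers.get(name);
  const fix = (v) => (v.includes(ELASTIC_CONTENT_TYPE) ? NEW_CONTENT_TYPE : v);
  if (Array.isArray(value)) {
    headers.set(name, value.map(fix));
  } else if (typeof value === "string") {
    headers.set(name, fix(value));
  }
}

function transform(request) {
  const headers = request.get("headers");
  if (headers) {
    rewriteMediaType(headers, "Content-Type");
    rewriteMediaType(headers, "Accept");
  }
  return request;
}

// Made-up request that mimics an Elasticsearch 8.x client.
const sample = new Map([
  ["method", "POST"],
  ["URI", "/my-index/_search"],
  ["headers", new Map([
    ["Content-Type", "application/vnd.elasticsearch+json;compatible-with=8"],
    ["Accept", ["application/vnd.elasticsearch+json;compatible-with=8", "text/plain"]],
  ])],
]);

const result = transform(sample).get("headers");
console.log(result.get("Content-Type")); // application/json
console.log(result.get("Accept"));       // [ 'application/json', 'text/plain' ]
```

Because this check uses a plain Map, header names are matched case-sensitively. Confirm against replayed traffic that the header names in your captured requests match the names the script looks up.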
Result logs
HTTP transactions from the source capture and those replayed to the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection are logged at:
/shared-logs-output/traffic-replayer-default/*/tuples/tuples.log
Each log entry is a newline-delimited JSON object containing source and target request/response pairs along with transaction details such as response times. Previous runs are available in gzipped format.
To view logs in human-readable format:
console tuples show --in /shared-logs-output/traffic-replayer-default/<ID>/tuples/tuples.log > readable-tuples.log
Note
These logs contain the contents of all requests, including authorization headers and HTTP message bodies. Ensure that access to the migration environment is restricted.
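Because each entry is one JSON object per line, a short Node.js script can summarize how often source and target status codes agree. The sketch below is illustrative only: summarizeTuples is a hypothetical helper, and the field names it reads (sourceResponse, targetResponses, Status-Code) are assumptions that you should check against an actual entry in your tuples.log before relying on the counts.

```javascript
// Summarize status-code agreement in a tuples log (newline-delimited JSON).
// ASSUMPTION: the field names below are illustrative; inspect a real entry and adjust them.
const SOURCE_STATUS = (t) => t.sourceResponse?.["Status-Code"];
const TARGET_STATUS = (t) => t.targetResponses?.[0]?.["Status-Code"];

function summarizeTuples(ndjsonText) {
  const summary = { total: 0, matched: 0, mismatched: 0, missing: 0, unparsable: 0 };
  for (const line of ndjsonText.split("\n")) {
    if (line.trim() === "") continue; // skip blank lines
    let tuple;
    try {
      tuple = JSON.parse(line);
    } catch {
      summary.unparsable += 1;        // keep going past a damaged line
      continue;
    }
    summary.total += 1;
    const source = SOURCE_STATUS(tuple);
    const target = TARGET_STATUS(tuple);
    if (source === undefined || target === undefined) summary.missing += 1;
    else if (source === target) summary.matched += 1;
    else summary.mismatched += 1;
  }
  return summary;
}

// Inline sample so the sketch runs without a log file.
const sampleLog = [
  '{"sourceResponse":{"Status-Code":200},"targetResponses":[{"Status-Code":200}]}',
  '{"sourceResponse":{"Status-Code":200},"targetResponses":[{"Status-Code":404}]}',
].join("\n");
console.log(summarizeTuples(sampleLog));
```

To run it against a real log, read the file with fs.readFileSync(path, "utf8") and pass the text to summarizeTuples. A high missing count most likely means the assumed field names do not match your log format.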
Amazon CloudWatch metrics
Traffic Replayer emits OpenTelemetry metrics to Amazon CloudWatch and traces through AWS X-Ray. Key metrics include:
| Metric | Description |
|---|---|
| | HTTP status codes for source and target, with dimensions for HTTP verb and status code family. Quickly identifies discrepancies between source and target responses. |
| | Delay between requests hitting the source and target. With a speedup factor greater than 1, this value should decrease as replay progresses. |
| | Throughput to and from the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection. |
| | Requests retried because of status code mismatches between source and target. |
| Various | Event counts for completed operations. |
| Various | Duration of each processing step. |
| Various | Exceptions encountered during each processing phase. |
Note
Metrics pushed to Amazon CloudWatch may experience a visibility lag of approximately 5 minutes. Amazon CloudWatch retains higher-resolution data for a shorter period than lower-resolution data. For more information, see Amazon CloudWatch concepts.
What is not migrated automatically
Plan separate work for:
-
Security configuration.
-
Index State Management (ISM) and Index Lifecycle Management (ILM) policies.
-
Ingest pipelines.
-
OpenSearch Dashboards or Kibana saved objects.
-
Data streams.
-
Cluster-level tuning.