Use the solution
This section describes how to run a migration using the Migration Assistant for Amazon OpenSearch Service solution after you have deployed it on Amazon EKS. The day-to-day operator interface is the Workflow CLI, which runs in the Migration Console pod (migration-console-0) on Amazon EKS. The supporting console CLI provides component-level inspection and ad-hoc operations during validation and troubleshooting.
Getting started with the Workflow CLI
This sequence is the shortest safe path to your first migration to Amazon OpenSearch Service or Amazon OpenSearch Serverless: load the right schema for your version, prove connectivity, run a small pilot, and only then run the full workflow.
Before you start
Make sure all of the following are true:
- Migration Assistant is deployed on Amazon EKS. See Deploy the solution.
- The source cluster and the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection are reachable from the Amazon EKS cluster.
- Snapshot storage is ready in Amazon S3 if you plan to run backfill.
- Any basic-auth Kubernetes secrets you need can be created in the ma namespace.
Step 1: Access the Migration Console
kubectl exec -it migration-console-0 -n ma -- /bin/bash
If you are accessing Amazon EKS from a new shell, refresh your kubeconfig first:
aws eks update-kubeconfig --region <REGION> --name migration-eks-cluster-<STAGE>-<REGION>
Step 2: Confirm the installed version
console --version
This matters because the workflow schema can change by release.
Step 3: Load the version-matched sample
workflow configure sample --load
This gives you the safest starting point for your installed release.
Step 4: Edit the workflow configuration
workflow configure edit
Fill in the fields that describe your migration:
- Source endpoint, version, and authentication.
- Target endpoint and authentication. For Amazon OpenSearch Serverless, set service: aoss in the SigV4 authConfig. For Amazon OpenSearch Service, set service: es.
- Snapshot repository details if you are running backfill.
- The migration pattern: backfill only, capture and replay only, or both.
Note
Do not start by editing every possible field. Start with the minimum required fields for your path.
Target configuration for Amazon OpenSearch Serverless
When the target is an Amazon OpenSearch Serverless collection, set the target cluster like this:
{
  "targetClusters": {
    "target": {
      "endpoint": "https://<collection-id>.<region>.aoss.amazonaws.com",
      "authConfig": {
        "sigv4": {
          "region": "<region>",
          "service": "aoss"
        }
      }
    }
  }
}
The migration IAM role created by the Amazon EKS deployment must also be added as a principal in your collection’s data access policy. The IAM role is named <eks-cluster-name>-migrations-role. Add it to the collection’s data access policy with both collection-level and index-level permissions before running the workflow.
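As a rough sketch of what that data access policy could look like, the helper below renders a policy document with one collection-level rule and one index-level rule for a single principal. The function name, the policy name in the usage comment, and the broad aoss:* permissions are illustrative assumptions, not values taken from the solution; scope the permissions down to what your migration actually needs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints an OpenSearch Serverless data access policy
# that grants one IAM principal collection-level and index-level access.
# Args: $1 = collection name, $2 = IAM role ARN of the migration role.
render_data_access_policy() {
  local collection="$1" role_arn="$2"
  cat <<EOF
[
  {
    "Description": "Migration Assistant access (illustrative)",
    "Rules": [
      {
        "ResourceType": "collection",
        "Resource": ["collection/${collection}"],
        "Permission": ["aoss:*"]
      },
      {
        "ResourceType": "index",
        "Resource": ["index/${collection}/*"],
        "Permission": ["aoss:*"]
      }
    ],
    "Principal": ["${role_arn}"]
  }
]
EOF
}

# Example usage (not executed here; the policy name is a placeholder):
#   render_data_access_policy my-collection \
#     "arn:aws:iam::<ACCOUNT_ID>:role/<eks-cluster-name>-migrations-role" > policy.json
#   aws opensearchserverless create-access-policy \
#     --name migration-access --type data --policy file://policy.json
```

If the collection already has a data access policy, adding the role to its existing Principal list is usually simpler than creating a second policy.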
Target configuration for Amazon OpenSearch Service
When the target is an Amazon OpenSearch Service domain:
{
  "targetClusters": {
    "target": {
      "endpoint": "https://<domain-endpoint>",
      "authConfig": {
        "sigv4": {
          "region": "<region>",
          "service": "es"
        }
      }
    }
  }
}
If your domain has fine-grained access control (FGAC) enabled, map the migration IAM role to a security role on the domain (typically all_access during migration, then scoped down). See Troubleshooting.
Step 5: Create Kubernetes secrets if you use basic authentication
kubectl create secret generic source-credentials \
  --from-literal=username=<SOURCE_USER> \
  --from-literal=password=<SOURCE_PASSWORD> \
  -n ma

kubectl create secret generic target-credentials \
  --from-literal=username=<TARGET_USER> \
  --from-literal=password=<TARGET_PASSWORD> \
  -n ma
Reference those secret names in authConfig.basic.secretName in your workflow configuration.
Step 6: Verify connectivity before submitting a workflow
console clusters connection-check
The check runs against both source and target by default. To narrow it to one side:
console clusters connection-check --cluster source
console clusters connection-check --cluster target
For a direct API check:
console clusters curl source /
console clusters curl target /
If these checks fail, stop and fix connectivity or authentication first. Do not start a workflow yet.
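One way to enforce that rule is to wrap both checks in a small gate and only submit when it passes. This is a minimal sketch: the preflight function name is made up for illustration, and it assumes console clusters connection-check exits with a non-zero status on failure, which you should confirm for your installed version.

```shell
#!/usr/bin/env bash
# Hypothetical preflight gate for use inside the Migration Console pod.
# Assumption: `console clusters connection-check` returns non-zero on failure.
preflight() {
  local side
  for side in source target; do
    if ! console clusters connection-check --cluster "$side"; then
      echo "connection check failed for ${side}; fix connectivity or authentication first" >&2
      return 1
    fi
  done
  echo "both clusters reachable"
}

# Example usage (not executed here):
#   preflight && workflow submit
```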
Step 7: Verify AWS identity if you use SigV4
If your source or target uses Amazon OpenSearch Service or Amazon OpenSearch Serverless, verify pod identity is working from the Migration Console pod:
aws sts get-caller-identity
If console clusters connection-check works in the Migration Console but the workflow later fails with HTTP 401 or 403, verify that the Argo workflow executor pods are using the IRSA-backed argo-workflow-executor service account. On Amazon EKS, both the Migration Console pod and the workflow executor pods get IRSA-backed identity automatically through the bootstrap script.
Step 8: Run a pilot migration first
Use a small allowlist or a representative subset before you attempt the full migration. This is the easiest way to catch mapping issues, authentication issues, and throughput problems early.
workflow submit
workflow manage
Use workflow manage to watch the run and approve any gated steps.
Step 9: Validate the pilot
Check counts and basic behavior on the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection before you expand scope:
console clusters cat-indices
console clusters curl target /<index>/_count
console clusters curl target "/<index>/_search?size=5&pretty"
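For a pilot with only a few indexes, the count check can be scripted so that source and target are compared side by side. The sketch below assumes console clusters curl prints the raw JSON response body; extract_count and compare_counts are hypothetical helper names, not part of the console CLI.

```shell
#!/usr/bin/env bash
# Pull the integer that follows "count" out of a _count response such as
# {"count":42,"_shards":{...}}. Reads the response on stdin.
extract_count() {
  sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Compare document counts for one index on source and target.
# Assumption: `console clusters curl <cluster> <path>` prints the raw body.
# Returns 0 only when both counts were read and are equal.
compare_counts() {
  local index="$1" src tgt
  src="$(console clusters curl source "/${index}/_count" | extract_count)"
  tgt="$(console clusters curl target "/${index}/_count" | extract_count)"
  echo "${index}: source=${src:-?} target=${tgt:-?}"
  [ -n "$src" ] && [ "$src" = "$tgt" ]
}

# Example usage (not executed here):
#   compare_counts my-pilot-index || echo "counts differ, investigate before widening scope"
```

If the source is still receiving writes, expect the counts to drift until replay has caught up, so treat a mismatch as a prompt to investigate rather than proof of failure.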
If you are migrating applications with live traffic, also validate representative queries against the target.
Step 10: Run the real migration
After the pilot succeeds, widen the configuration to the full index set and submit again:
workflow configure edit
workflow submit
workflow manage
Step 11: Use logs if anything fails
workflow status
workflow output
workflow output --follow
workflow submit automatically stops and replaces an existing workflow with the same name, so you do not need to manually clean up between runs. If a previous run left orphaned migration custom resources, use workflow reset instead of deleting Argo workflows directly:
workflow reset                  # interactive: lists CRDs and prompts before delete
workflow reset migration-foo    # delete a specific resource by name
workflow reset --all            # delete everything (capture proxies are protected)
workflow reset --all --include-proxies --delete-storage    # also remove capture proxies and Apache Kafka PVCs
Core commands
The workflow CLI orchestrates a full migration; the console CLI inspects or manually drives a single component during validation and troubleshooting.
Workflow commands
| Command | Why you use it |
|---|---|
| workflow configure sample | Shows the sample schema for your installed version |
| workflow configure sample --load | Loads that sample as your starting point |
| workflow configure edit | Opens the workflow config in your editor |
|  | Prints the current config |
|  | Clears the current config and lets you start over |
| workflow submit | Starts the migration workflow (auto-stops and replaces an existing one with the same name) |
|  | Submits and blocks until the workflow completes or the timeout is reached |
| workflow manage | Primary day-to-day interface for monitoring, approvals, and logs (interactive TUI) |
|  | Shows the current workflow tree in a non-interactive form |
|  | Shows running and completed workflows |
| workflow output | Shows logs across workflow pods |
| workflow output --follow | Streams logs live |
| workflow approve <STEP_NAME> | Approves pending gates that match exact names or globs |
| workflow reset | Lists migration custom resources and lets you delete them safely |
Console commands
The console CLI groups operations by component:
| Command | Why you use it |
|---|---|
| console --version | Confirms which schema and behavior your Migration Console is running |
| console clusters connection-check | Verifies the Migration Console can reach and authenticate to source and target |
| console clusters cat-indices | Lists indexes on one or both clusters |
| console clusters curl | Issues a direct API request against the named cluster |
|  | Destructive: deletes all indexes on the named cluster |
| console snapshot | Manage snapshots in Amazon S3 |
|  | Run or preview metadata migration outside the workflow |
|  | Inspect or drive RFS backfill |
|  | Inspect or drive Traffic Replayer |
|  | Inspect Migration Assistant metrics |
|  | Inspect Strimzi-managed Apache Kafka used by capture and replay |
|  | Inspect captured request/response tuples for replay validation |
Note
The workflow path drives metadata, backfill, and replay automatically. Reach for the equivalent console command only when you want to inspect state or work around a specific failure (for example, to call console snapshot status while a long-running snapshot is in progress).
Approval gates
Not every migration step should run without human review. Approval gates let the workflow stop at meaningful checkpoints — typically transitions after metadata work, backfill milestones, and cutover-sensitive steps — so you can validate before continuing.
workflow manage
workflow approve <STEP_NAME>
Status symbols
| Symbol | Meaning |
|---|---|
|  | Succeeded |
|  | Running |
|  | Pending |
|  | Failed |
|  | Waiting for approval |
Migration scenarios
Migration Assistant supports three migration patterns. Pick the one that matches your downtime tolerance.
Scenario 1: Backfill only
Best when you can tolerate a brief write freeze, or when writes can be paused and replayed from an external queue.
Snapshot source → Migrate metadata → Backfill documents → Verify → Switch traffic
Scenario 2: Capture and Replay only
Best when the data is small enough that live replay alone can synchronize the target on Amazon OpenSearch Service or Amazon OpenSearch Serverless, or when you want to replay traffic against multiple targets to compare results.
Reroute traffic to capture proxy → Migrate metadata → Replay traffic → Verify → Switch traffic to target
Scenario 3: Backfill + Capture and Replay (zero-downtime)
The most comprehensive approach. Capture begins first so no writes are lost, then backfill brings over historical data, then replay catches the target up to real-time.
Reroute traffic to capture proxy → Snapshot source → Migrate metadata → Backfill documents → Replay captured traffic → Verify → Switch traffic to target
Backfill tuning
Useful Reindex-from-Snapshot settings include:
- podReplicas: number of RFS pods running in parallel (one shard per pod).
- maxConnections: bulk-indexer concurrency to the target.
- documentsPerBulkRequest: bulk batch size.
- maxShardSizeBytes: maximum supported shard size (default 80 GiB). Larger shards must be reduced before backfill (force-merge or split).
- initialLeaseDuration: ISO-8601 duration each worker holds a shard lease before re-acquisition (default PT10M).
- allowedDocExceptionTypes: list of exception class names from the target's response that should be counted as success rather than retried.
- allowLooseVersionMatching: bypass the strict source/target version compatibility check.
Because RFS reads from snapshot storage in Amazon S3, increasing worker count does not add live read load to the source cluster. It mainly changes how quickly the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection is driven.
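Before a long backfill, it can help to check whether any source shard exceeds the default maxShardSizeBytes limit. The sketch below filters output from the _cat/shards API for primary shards larger than 80 GiB. It assumes console clusters curl prints the raw response body, and oversized_primaries is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Default maxShardSizeBytes from the list above: 80 GiB expressed in bytes.
MAX_SHARD_SIZE_BYTES=$((80 * 1024 * 1024 * 1024))

# Reads "index shard prirep store" rows on stdin (store in bytes) and prints
# "index shard store" for every primary shard larger than the limit.
oversized_primaries() {
  awk -v limit="$MAX_SHARD_SIZE_BYTES" \
    '$3 == "p" && ($4 + 0) > (limit + 0) { print $1, $2, $4 }'
}

# Example usage inside the Migration Console (not executed here):
#   console clusters curl source "/_cat/shards?bytes=b&h=index,shard,prirep,store" \
#     | oversized_primaries
```

An empty result suggests no primary shard is over the default limit; any rows printed are candidates for the force-merge or split mentioned above.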
Replay tuning
Useful Traffic Replayer settings include:
- podReplicas: number of replayer pods.
- speedupFactor: replay speed relative to the captured traffic timeline (default 1.1). A value of 2.0 replays traffic at twice the original speed.
- removeAuthHeader: strips the captured Authorization header before replaying. Useful when the captured traffic carries credentials that would not be valid against the target.
- authHeaderOverride: replaces the captured Authorization header with a static value.
- dependsOnSnapshotMigrations: ensures replay only starts after backfill completes.
- nonRetryableDocExceptionTypes: list of exception class names that should be counted as failures but not retried.
Warning
Setting both replayerConfig.removeAuthHeader: true and an authConfig block on the same target is rejected by the schema. Pick one — either rely on the target’s authConfig (the Traffic Replayer applies it for you) or strip the captured header.
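Returning to speedupFactor, a back-of-the-envelope model shows why the value matters for reaching the live edge. If the replayer starts with a backlog of captured traffic and new traffic keeps arriving at the original rate, the backlog shrinks at roughly (speedupFactor - 1) seconds of traffic per wall-clock second. This is a simplified assumption that ignores target throttling and bursty traffic, and catchup_seconds is a hypothetical helper, not a Migration Assistant command.

```shell
#!/usr/bin/env bash
# Estimate wall-clock seconds needed to drain a replay backlog.
# Args: $1 = backlog in seconds of captured traffic, $2 = speedupFactor.
# Simplified model: catch-up time = backlog / (speedupFactor - 1).
catchup_seconds() {
  awk -v backlog="$1" -v speedup="$2" 'BEGIN {
    if (speedup + 0 <= 1) { print "never"; exit }
    printf "%.0f\n", backlog / (speedup - 1)
  }'
}

# Example usage:
#   catchup_seconds 3600 1.1   # one hour of backlog at the default factor
#   catchup_seconds 3600 2.0   # the same backlog at double speed
```

Under this model, one hour of backlog takes about ten hours to drain at the default 1.1 and about one hour at 2.0, provided the target can sustain the higher request rate.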
Cutover and rollback
Switching traffic to the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection is the cutover step. By this point, capture has already protected writes during backfill, replay has caught the target up, and validation is complete.
Before you switch:
- replay has reached the live edge,
- the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection is healthy,
- representative application queries work on the target,
- the application team is ready to move traffic, and
- the rollback path is still available.
The exact cutover mechanism depends on your environment, but the principle is always the same:
- Stop pointing clients at the capture proxy.
- Point clients directly at the Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection.
- Watch the target closely during the first production traffic window.
In practice, that usually means updating a DNS record, a load balancer backend, an application connection string, or a service-discovery entry. Keep the source cluster available during a rollback window (typically 24–72 hours) before decommissioning. After the rollback window has passed, see Uninstall the solution to remove the Migration Assistant infrastructure.
What is not migrated automatically
Plan separate work for:
- security configuration,
- ISM/ILM policies,
- ingest pipelines,
- OpenSearch Dashboards or Kibana saved objects,
- data streams,
- and cluster-level tuning.