
Transform Solr traffic

The Solr transform providers translate captured Solr traffic into OpenSearch-compatible requests.

Provider configuration

Use SolrToOpenSearchTransformProvider as a request transform for captured Solr traffic. It accepts Solr schema and config XML so the replayer can translate /solr/<collection>/select and /solr/<collection>/update requests with collection-aware behavior:

managed-schema.xml supplies field metadata. The transformer uses explicit fields and dynamic field patterns to choose match queries for solr.TextField fields, term queries for known non-text fields, and match as the fallback for unknown fields. It does not read the schema uniqueKey for live write replay; write transforms require a literal document id field.
solrconfig.xml supplies request handler parameters. The transformer applies handler defaults only when a request omits the parameter, invariants as overrides, and appends as additional values before translating query parameters. The parser keeps one value per parameter name in each handler block, so repeated same-name entries in defaults, invariants, or appends collapse to the last value read. These settings are applied for the translated endpoint name, such as a requestHandler named /select; custom handler aliases such as /browse are not translated unless you add a custom transform.
targetType controls target-specific behavior. OpenSearch is the default. OpenSearchServerless and NextGenOpenSearchServerless suppress ?refresh=true on target-type-gated write translations such as document ingest, bulk add/delete, and delete-by-query. They do not remove standalone _refresh requests produced from Solr commit commands, and they do not currently suppress ?refresh=true on standalone delete-by-ID translations when the captured request includes commit=true or commitWithin.
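The handler-parameter rules above can be sketched as a small merge function. This is an illustration of the described behavior, not the provider's code: `apply_handler_params` and its argument shapes are hypothetical, and applying invariants before appends is an assumption because the text does not state the order.

```python
from typing import Dict, List

Params = Dict[str, List[str]]


def apply_handler_params(
    request: Params,
    defaults: Dict[str, str],
    invariants: Dict[str, str],
    appends: Dict[str, str],
) -> Params:
    """Merge requestHandler settings into request parameters (illustrative only)."""
    merged = {name: list(values) for name, values in request.items()}
    # defaults: used only when the request omits the parameter
    for name, value in defaults.items():
        merged.setdefault(name, [value])
    # invariants: always override whatever the request sent
    for name, value in invariants.items():
        merged[name] = [value]
    # appends: added as extra values next to any existing ones
    # (assumption: applied after invariants)
    for name, value in appends.items():
        merged.setdefault(name, []).append(value)
    return merged
```

Each handler block is modeled as a plain `dict` with one value per name, which mirrors the note that repeated same-name entries collapse to the last value read.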

For a single schema/config pair, configure the default keys solrSchemaXml and solrConfigXml. For multiple collections or config sets in the same replay stream, prefix the keys with the Solr collection name from the request path:


"context": {
  "values": {
    "targetType": { "value": "OpenSearch" },
    "products.solrSchemaXml": {
      "fromFile": {
        "configMap": "solr-config",
        "path": "products-managed-schema.xml"
      }
    },
    "products.solrConfigXml": {
      "fromFile": {
        "configMap": "solr-config",
        "path": "products-solrconfig.xml"
      }
    },
    "reviews.solrSchemaXml": {
      "fromFile": {
        "configMap": "solr-config",
        "path": "reviews-managed-schema.xml"
      }
    },
    "reviews.solrConfigXml": {
      "fromFile": {
        "configMap": "solr-config",
        "path": "reviews-solrconfig.xml"
      }
    }
  }
}

Collection-specific keys are scoped by collection, not by ZooKeeper config set name or ConfigMap file name. For request paths such as /solr/products/select, the provider looks for keys beginning with products.. Requests for collections without a collection-specific entry fall back to the unprefixed defaults. Schema and config fall back independently, so a collection can override only solrSchemaXml while using the default solrConfigXml, or the other way around. If several collections share one Solr config set, either make that solrConfigXml the default or reference the same ConfigMap file under each collection prefix.
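The lookup order described above can be sketched as follows. The helper names `collection_from_path` and `resolve_scoped_value` are hypothetical and only illustrate the documented fallback; they are not part of the provider.

```python
from typing import Mapping, Optional


def collection_from_path(path: str) -> Optional[str]:
    """Return the collection segment of a /solr/<collection>/... request path."""
    parts = path.strip("/").split("/")
    if len(parts) >= 2 and parts[0] == "solr":
        return parts[1]
    return None


def resolve_scoped_value(
    values: Mapping[str, str], collection: str, key: str
) -> Optional[str]:
    """Prefer '<collection>.<key>', then fall back to the unprefixed default."""
    scoped = values.get(f"{collection}.{key}")
    if scoped is not None:
        return scoped
    return values.get(key)
```

Because each key is resolved on its own, a collection that overrides only `solrSchemaXml` still picks up the default `solrConfigXml`.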

The provider also supports solrSchemaXmlFile, solrConfigXmlFile, and their <collection>.-prefixed forms for direct filesystem-based replayer configuration, but workflow configurations should usually use fromFile with the XML keys so files are loaded from ConfigMaps.

For each scope, configure either the inline XML key or the file key for a given input, not both. For example, do not set both solrSchemaXml and solrSchemaXmlFile, or both products.solrConfigXml and products.solrConfigXmlFile. The provider rejects those combinations during startup. A nonblank targetType must be one of OpenSearch, OpenSearchServerless, or NextGenOpenSearchServerless; invalid values also fail provider startup.
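A pre-flight check along these lines can catch both startup failures before you deploy a configuration. This is a sketch over a raw key/value configuration; `find_config_problems` is a hypothetical helper, and treating `targetType` as an exact, case-sensitive match is an assumption.

```python
VALID_TARGET_TYPES = {
    "OpenSearch",
    "OpenSearchServerless",
    "NextGenOpenSearchServerless",
}
XML_KEYS = ("solrSchemaXml", "solrConfigXml")


def find_config_problems(config: dict) -> list:
    """Return human-readable problems that would be expected to fail startup."""
    problems = []
    for key in sorted(config):
        # Matches both the default key and any '<collection>.'-prefixed form.
        is_xml_key = any(key == base or key.endswith("." + base) for base in XML_KEYS)
        if is_xml_key and (key + "File") in config:
            problems.append(f"{key} and {key}File are both set")
    target = config.get("targetType")
    # Blank or missing targetType is allowed; only nonblank values are checked.
    if isinstance(target, str) and target.strip() and target not in VALID_TARGET_TYPES:
        problems.append(f"invalid targetType: {target}")
    return problems
```

An empty list means the check found neither conflict; it does not validate the XML itself.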

Blank, missing, or unparsable schema/config XML does not stop the replayer. The provider logs the problem and continues with an empty schema or config for that scope. That keeps replay moving, but request translations fall back to generic behavior: unknown fields use the default query handling, and request-handler defaults are not applied. Verify ConfigMap paths and replayer logs before treating Solr replay results as schema-aware.

The example above uses the workflow transform pipeline syntax. In context.values, wrap literal provider values as { "value": "…" } and file-backed values as { "fromFile": … }; the workflow materializes those entries before invoking SolrToOpenSearchTransformProvider. If you bypass the workflow pipeline and use raw transformerConfig directly, use raw provider values instead, such as "targetType": "OpenSearch" and "products.solrSchemaXmlFile": "/path/to/products-schema.xml".

Only Solr select and update traffic is translated. Other Solr endpoints, including admin, schema, config, replication, ping, and metrics endpoints, are not rewritten into OpenSearch equivalents. Suppress that traffic at the capture proxy when it is not needed for validation, or add a custom request transform before replay.

Include SolrTupleTransformProvider under tupleTransforms when you want replay tuple audit records back-translated for Solr/OpenSearch comparison. It adds targetResponsesTransformed to tuple output records so you can compare the captured Solr response with a Solr-shaped view of the replayed target response for translated select and update traffic. Entries are null when the tuple’s source request path is not recognized by the provider, and entries contain an error object when response back-translation fails for that tuple. Recognized but unsupported Solr endpoints, such as admin, schema, config, ping, and metrics traffic, are not reliable Solr-equivalent comparisons; suppress them during capture or add a custom tuple transform if you need to analyze them. SolrTupleTransformProvider does not require configuration, does not affect replayed requests, and can be omitted when you do not need Solr-shaped tuple output. Schema-aware request translation is handled by SolrToOpenSearchTransformProvider.

The tuple transform does not consume the managed-schema.xml or solrconfig.xml values used by the request transform. When it reconstructs Solr-shaped response.docs from OpenSearch hits, it uses a schemaless fallback: existing array values remain arrays, numeric and boolean values remain scalar, and other unknown scalar values are returned as single-item arrays. Treat this as audit output for comparison, not as a byte-for-byte Solr response contract. If your comparison workflow needs exact field cardinality, normalize the tuple output downstream or add a custom tuple transform.
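The schemaless fallback described above can be approximated as below, which may help when writing a downstream normalizer. `to_solr_shaped_doc` is a hypothetical name, and wrapping every remaining value type (strings, objects, nulls) in a single-item array is an assumption beyond what the text states.

```python
def to_solr_shaped_doc(source: dict) -> dict:
    """Approximate the schemaless shape of one reconstructed response doc."""
    doc = {}
    for field, value in source.items():
        if isinstance(value, list):
            # Existing arrays stay arrays.
            doc[field] = value
        elif isinstance(value, (bool, int, float)):
            # Numeric and boolean values stay scalar.
            doc[field] = value
        else:
            # Other scalars are wrapped (assumption: applies to all other types).
            doc[field] = [value]
    return doc
```

A comparison step that needs exact cardinality could reverse this by unwrapping single-item arrays for fields the Solr schema declares as single-valued.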

For a full capture-and-replay workflow example, see Capture and replay live traffic from Solr.

Supported transformations

Write operations:

Solr operation	Status
Single document ingest (`POST /update/json/docs {…}`)	✓ Supported
Batch document ingest (`POST /update/json/docs [{…},{…}]`)	✓ Supported
Bare array ingest (`POST /update [{…},{…}]`)	✓ Supported
Add command (`POST /update {"add":{"doc":{…}}}`)	✓ Supported
Bulk add (`POST /update {"add":[{"doc":{…}},{"doc":{…}}]}`)	✓ Supported
Delete by ID (`POST /update {"delete":{"id":"…"}}`)	✓ Supported
Bulk delete by ID (`POST /update {"delete":["1","2"]}`)	✓ Supported
Delete by ID with routing or version fields (`{"delete":{"id":"1","_route_":"r","_version_":10}}`)	✗ Not supported. Delete commands must contain only `id` or only `query`.
Delete by query (`POST /update {"delete":{"query":"…"}}`)	✓ Supported
Commit (`POST /update {"commit":{}}`)	✓ Supported
Mixed add, delete-by-ID, and commit commands (`{"add":[…], "delete":[…], "commit":{}}`)	✓ Supported
Delete by query mixed with other update commands (`{"add":[…], "delete":{"query":"…"}}`)	✗ Not supported. Send delete-by-query as a standalone update request. The mixed-command path only flattens add operations, delete-by-ID operations, and commit semantics; it does not apply a delete-by-query embedded in the same update body.
Empty update array (`POST /update []`)	✓ Supported. Translated to `POST /<collection>/_refresh` and returned as a Solr-shaped success response.
XML content type (`POST /update` with `text/xml`)	✗ Not supported
Document with boost (`{"add":{"doc":{…}, "boost": 1.5}}`)	✗ Not supported
Document with `overwrite: false` (`{"add":{"doc":{…}, "overwrite": false}}`)	✗ Not supported
Documents without a literal `id` field, including schemas whose `uniqueKey` is not `id`	✗ Not supported. Live write replay uses the document’s `id` field as the OpenSearch document ID.
Optimize / Rollback (`{"optimize":{}}`)	✗ Not supported
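To screen captured traffic against the delete rows in the table above, a classifier like the following can flag delete commands that fall outside the listed shapes. `delete_command_kind` and its return labels are hypothetical; shapes not shown in the table are reported as `not_listed` rather than assumed to fail.

```python
def delete_command_kind(delete) -> str:
    """Classify the value of a Solr JSON 'delete' command (illustrative only)."""
    if isinstance(delete, list):
        # Bulk delete by ID, as in {"delete": ["1", "2"]}.
        if delete and all(isinstance(item, str) for item in delete):
            return "bulk_by_id"
        return "not_listed"
    if isinstance(delete, dict):
        keys = set(delete)
        if keys == {"id"}:
            return "by_id"
        if keys == {"query"}:
            return "by_query"
    # Extra keys next to id or query, or any other shape.
    return "not_listed"
```

Commands classified as `by_query` still need to be sent as standalone update requests, because the mixed-command path does not apply them.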

Search and query transformation:

Feature	Status
Query parsing — `q`, `df`, standard parser behavior, `defType=dismax`, `defType=edismax`, `q.op`	✓ Supported
Query fields — `qf=title^3 body^1`, `tie=0.3`	✓ Supported
Minimum match — `mm=75%`, `mm=2`	✓ Supported
Phrase fields and slop — `pf=title^5`, `ps=2`, `qs=2`	✓ Supported
Boost queries — `defType=edismax&bq=category:electronics^2`	✓ Supported with `defType=dismax` or `defType=edismax`
Filter queries — `fq=status:active`	✓ Supported
Standard parser filter syntax — `q=category:books AND filter(inStock:true)`	✓ Supported as a non-scoring filter clause
Field list — `fl=id,title`, `fl=id,na*`	✓ Supported for source fields and glob patterns
Simple sorting — `sort=price asc, score desc`	✓ Supported for field and score sorts
Pagination — `start=10&rows=20`	✓ Supported
JSON Request API body keys — `query`, `limit`, `offset`, `sort`, `filter`, `fields`, `params`, `facet`	✓ Supported when the JSON body includes `query`
JSON Facet API — terms, range, query, nested facets, and metric stat facets such as `avg`, `sum`, `min`, `max`, `unique`, `hll`, `countvals`, and `count`	✓ Supported
Highlighting — `hl=true&hl.fl=title,body`	✓ Supported
Boolean operators — `AND`, `OR`, `NOT`, `&&`, `||`, `!`, `+`, `-`	✓ Supported
Lowercase eDisMax operators — `lowercaseOperators=true`	✓ Supported
Range queries — `price:[10 TO 100]`, `stock:{0 TO 50}`, `event_date:[NOW-7DAYS TO NOW]`, `field:[* TO *]`	✓ Supported. Inclusive and exclusive bounds map to `gte`, `gt`, `lte`, and `lt`; unbounded bounds are omitted, and `[* TO *]` becomes an OpenSearch `exists` query.
Wildcard and fuzzy — `title:java*`, `title:jav~2`	✓ Supported
Solrconfig defaults from request handlers	✓ Supported
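The range-bound mapping in the table above can be illustrated with a small parser for a single `field:[low TO high]` clause. This is a sketch, not the transform's parser: `translate_range` is hypothetical, bound values are passed through as strings without type or date-math conversion, and treating `{* TO *}` like `[* TO *]` is an assumption.

```python
import re
from typing import Optional

_RANGE = re.compile(
    r"^(?P<field>[^:\s]+):(?P<open>[\[{])(?P<low>\S+) TO (?P<high>\S+)(?P<close>[\]}])$"
)


def translate_range(clause: str) -> Optional[dict]:
    """Map one Solr range clause to an OpenSearch-style query dict."""
    match = _RANGE.match(clause)
    if match is None:
        return None
    field, low, high = match["field"], match["low"], match["high"]
    if low == "*" and high == "*":
        # Fully open range: only checks that the field has a value.
        return {"exists": {"field": field}}
    bounds = {}
    if low != "*":
        bounds["gte" if match["open"] == "[" else "gt"] = low
    if high != "*":
        bounds["lte" if match["close"] == "]" else "lt"] = high
    return {"range": {field: bounds}}
```

For example, `translate_range("stock:{0 TO 50}")` yields a `range` query with `gt` and `lt` bounds, while an open upper bound such as `price:[10 TO *]` produces only `gte`.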

The defType=dismax path follows DisMax query parsing semantics: explicit field syntax such as title:java, ranges, and fuzzy markers are treated as literal query text unless the request uses defType=edismax or the standard parser. Boost queries in bq are parsed with standard query syntax even when the main q parameter uses DisMax or eDisMax, matching Solr’s behavior.

For JSON Request API bodies, normalization runs only when the body contains query. The transform maps top-level JSON keys to their URL-parameter equivalents (query to q, limit to rows, offset to start, filter to fq, and fields to fl) and treats top-level facet as JSON Facet API input. The mapped top-level values take precedence over URL parameters. Entries under the JSON body’s params object fill in only parameters that are not already present in the URL. Top-level facet is used only when the URL does not already include json.facet. For facet-only JSON bodies, include "query": "*:*" so the body is normalized before replay.
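The precedence rules in the preceding paragraph can be sketched as a merge over URL parameters and the JSON body. `normalize_json_request` is a hypothetical helper, the `sort` body key is left out because its mapping is not described here, and storing the body's `facet` under a `json.facet` entry is an illustrative choice. When a name appears both as a mapped top-level key and under `params`, the sketch keeps the top-level value, which is an assumption.

```python
from typing import Any, Dict

BODY_TO_PARAM = {
    "query": "q",
    "limit": "rows",
    "offset": "start",
    "filter": "fq",
    "fields": "fl",
}


def normalize_json_request(url_params: Dict[str, Any], body: Dict[str, Any]) -> Dict[str, Any]:
    """Fold a JSON Request API body into URL-style parameters (illustrative only)."""
    merged = dict(url_params)
    if "query" not in body:
        # Bodies without 'query' are not normalized.
        return merged
    # Mapped top-level keys win over URL parameters.
    for body_key, param in BODY_TO_PARAM.items():
        if body_key in body:
            merged[param] = body[body_key]
    # 'params' entries only fill gaps.
    for name, value in body.get("params", {}).items():
        merged.setdefault(name, value)
    # Top-level facet is used only when the URL has no json.facet.
    if "facet" in body and "json.facet" not in url_params:
        merged["json.facet"] = body["facet"]
    return merged
```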

The JSON body filter key is normalized as one fq value. If you need multiple filter queries, send repeated URL fq parameters or add a custom request transform; the JSON Request API normalization does not expand a JSON filter array into multiple fq clauses.

For highlighting, hl=true creates an OpenSearch highlight block and response transformation moves per-hit highlights back to Solr’s top-level highlighting object. The transform maps hl.fl, hl.method (unified, original, fastVector), hl.snippets, hl.fragsize, hl.simple.pre, hl.simple.post, hl.tag.pre, hl.tag.post, hl.requireFieldMatch, hl.encoder, hl.maxAnalyzedChars, and hl.q. The exact highlighted fragment boundaries can differ from Solr because OpenSearch performs its own passage selection.

Query features with limitations:

Feature	Limitation
`cursorMark` continuation tokens	Not supported in Traffic Replayer. Requests that include `cursorMark` are rejected.
Boost functions (`bf`) and multiplicative boost (`boost`)	Not supported. Function-based scoring requires a dedicated transform.
`mm.autoRelax` and advanced eDisMax minimum-match edge cases	`mm.autoRelax` is not supported. Minimum-match behavior with mixed explicit/implicit operators or per-field stopword removal can differ from Solr.
`sow=true` (split on whitespace)	Not supported. OpenSearch has no equivalent per-token analysis split.
Negative boost in `bq` (for example, `^-10`)	Not supported. OpenSearch does not support negative boost values.
`bq` on standard-parser requests	Not supported. The transform rejects `bq` unless the request sets `defType=dismax` or `defType=edismax`, matching Solr’s boost-query parser scope.
Terms facet `offset`	Approximated by requesting `size = offset + limit`. Clients must trim leading buckets.
Multi-unit date range gaps (for example, `+2MONTHS`)	Approximated using fixed intervals. Bucket boundaries may drift.
Filter query local params (`cache`, `cost`, `frange`, `geofilt`)	Not supported. Requests using these local params are rejected because OpenSearch has no equivalent cache, cost, post-filter, function-range, or Solr geospatial filter controls.
Range facet boundary options	OpenSearch range aggregations use inclusive `from` and exclusive `to` boundaries. Solr range boundary variants such as `(10,20]` cannot be represented exactly.
Range facet `hardend`, `include`, and `other`	No direct OpenSearch histogram equivalent. The transform still translates the range facet, but these options are not applied exactly; validate bucket boundaries and extra before/after/between counts.
JSON facet `type=query`	Translated as an OpenSearch `query_string` filter aggregation. Complex Solr-only query syntax inside the facet query can behave differently from the main `q` translation path.
Advanced JSON facet controls and unsupported facet types	Facet `domain`, tag/exclusion behavior, refinement controls, `allBuckets`, `numBuckets`, `overrequest`, and `prelim_sort` are not translated. Unknown facet keys are logged as warnings and ignored when the facet type itself can be translated. Facet types other than `terms`, `range`, `query`, and supported string stat facets are rejected.
Local params syntax (`{!…}`) in `q`, `sort`, `fl`, or `bq`	Not supported. These forms are rejected instead of being passed through.
Filter-query local params (`fq={!cache=…}`, `fq={!cost=…}`, `fq={!frange …}`, `fq={!geofilt …}`)	Not supported. Plain `fq` values are translated, but these Solr-specific filter execution hints and specialized filter parsers are rejected.
Function-based sorting (`sort=div(popularity,price) desc`, `sort=field(categories,min) asc`)	Not supported. Use simple field or `score` sorting, or add a custom request transform.
Classic Solr faceting (`facet=true`, `facet.field`, `facet.range`)	Not supported. Use the JSON Facet API through `json.facet` or a JSON Request API body with `facet`.
Field-list pseudo-fields and document transformers (`fl=score`, `fl=[explain]`, `fl=[child]`)	Not translated into Solr response fields. The transform ignores pseudo-fields and bracketed document transformer entries when building the OpenSearch `_source` filter.
`json.<param>` URL prefix	Generic `json.<param>` passthrough is not supported. Use JSON body keys or standard URL params. `json.facet` and `json.facet.*` are supported for JSON facets.
JSON Request API `queries` key	Not supported. Named sub-queries with local param references are not translated; remove them or add a custom request transform before replay.
JSON Request API body without `query`	Not normalized. Include `query` in the JSON body, using `*:*` for match-all requests, or express the request with URL parameters.
Response writer parameters (`wt`, `indent`, `echoParams`)	Accepted only for JSON-compatible replay. The transform does not render XML or CSV output for `wt=xml` or `wt=csv`, and `indent` and `echoParams` are treated as compatibility no-ops.
Unknown Solr select URL parameters	Rejected during URL-parameter validation unless they are explicitly supported by the transform or are private parameters whose names start with `_`. Strip unsupported query-string parameters before replay or add a custom request transform.
Highlighting: `hl.alternateField`, `hl.maxAlternateFieldLength`, `hl.mergeContiguous`, `hl.preserveMulti`, `hl.fragmenter`, `hl.tag.ellipsis`, `hl.fragListBuilder`, `hl.boundaryScanner`	No OpenSearch equivalent. Skipped during transformation.
XML update requests	Not supported. Use JSON format.
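For the terms facet `offset` row above, client-side trimming can look like the following sketch. Both helper names are hypothetical; `buckets` stands for the ordered bucket list returned for the facet.

```python
def opensearch_terms_size(offset: int, limit: int) -> int:
    """Bucket count to request so that the wanted page is included."""
    return offset + limit


def trim_buckets(buckets: list, offset: int, limit: int) -> list:
    """Drop the leading 'offset' buckets and keep at most 'limit' of the rest."""
    return buckets[offset:offset + limit]
```

With `offset=2` and `limit=3`, the target is asked for five buckets and the client keeps the last three.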

Behavioral differences:

Solr behavior	OpenSearch translation	Impact
`commitWithin=N` (timed batch commit)	`?refresh=true` (immediate refresh; suppressed on Serverless targets for document ingest, bulk add/delete, and delete-by-query)	More aggressive than Solr’s batched approach. May increase refresh overhead under high write throughput. For Serverless targets, avoid `commit=true` or `commitWithin` on delete-by-ID traffic because the current delete-by-ID translation still appends `?refresh=true`.
`commit` vs `softCommit`, including empty update arrays	Both map to `_refresh`	OpenSearch has no distinction. Durability is automatic via the translog. `targetType` does not turn commit-only update commands into no-ops; suppress or custom-transform commit-only traffic if your target rejects `_refresh`.
Batch failure semantics (strict mode)	Tolerant mode (processes independently)	Partial failures possible. No rollback of successful items.
`_version_` optimistic concurrency	Not translated	Conditional writes based on `_version_` are not enforced. If `_version_` appears inside an add or `/update/json/docs` document body, it is treated like an ordinary field. Delete commands with version or routing fields are rejected because only plain delete-by-ID and standalone delete-by-query are supported.
Atomic update modifiers (`{"field":{"set":"value"}}`, `inc`, `add`, `remove`)	Not translated to partial update semantics	Update requests are replayed as full document index operations. Convert atomic updates to full documents before replay or add a custom transform; otherwise modifier objects can be indexed as field values.
Custom Solr `uniqueKey`	Not read by the live traffic transform	For replayed add and delete operations, the transform expects `id` in the captured request body or delete command. If your collection’s unique key uses another field name, add a custom request transform that maps it to `id` before `SolrToOpenSearchTransformProvider` runs.
Delete by query	Synchronous `_delete_by_query` with `wait_for_completion=true`	Standalone delete-by-query requests must translate without query-string passthrough. Version conflicts are reported as partial failures rather than aborting the whole operation. Do not mix delete-by-query with add, delete-by-ID, or commit commands in the same Solr update body.
Terms facet counts (exact in Solr)	Approximate in OpenSearch	Multi-shard indexes produce approximate counts. Inspect `doc_count_error_upper_bound`.
Query parse or transform failure	Request rejected	Unsupported query syntax fails fast instead of sending the raw Solr query to OpenSearch’s `query_string` parser.
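Several rows above depend on the shape of captured documents. A screening pass like the following can flag documents that would replay differently from Solr before you start a replay. `find_write_issues` is a hypothetical helper, and its modifier detection is a heuristic: it flags any field whose value is a non-empty object made up only of the modifier names listed above.

```python
ATOMIC_MODIFIERS = {"set", "inc", "add", "remove"}


def find_write_issues(doc: dict) -> list:
    """Flag document traits that the behavioral-differences table calls out."""
    issues = []
    if "id" not in doc:
        # Live write replay needs a literal id field.
        issues.append("missing literal id field")
    for field, value in doc.items():
        if isinstance(value, dict) and value and set(value) <= ATOMIC_MODIFIERS:
            # Would be indexed as a field value, not applied as a partial update.
            issues.append(f"atomic update modifier on field '{field}'")
    return issues
```

Documents that are flagged can be converted to full documents, or given an `id` field, by a custom request transform that runs before `SolrToOpenSearchTransformProvider`.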
