
Migration flow - AWS Prescriptive Guidance

Migration flow

This section describes how you can apply an iterative approach to migrating your Solr schema to an Amazon OpenSearch Service index. 

Solr and OpenSearch organize search configurations differently, but their core concepts align closely. We recommend that you fully refactor your search solution to optimize it for OpenSearch.

The migration process starts with primitive field mappings and progressively handles more complex configurations, as follows:

  1. Primitive field mappings

  2. Text field mappings

    1. Custom dictionary mappings

    2. Analyzer mappings

  3. Custom field type mappings

  4. Copy field mappings 

  5. Dynamic field mappings

The mappings and configurations in the following tables compare Apache Solr 9.x with OpenSearch 2.x.

Field mappings:

| Solr field type class | OpenSearch field type | Analyzer support | Use case |
| --- | --- | --- | --- |
| solr.TextField, solr.SortableTextField | text | Yes | Full-text search. For SortableTextField, map a keyword subfield with the ignore_above parameter set to 1000. |
| solr.StrField | keyword | No | Exact matching. |
| solr.IntPointField | integer | No | Numeric values. |
| solr.LongPointField | long | No | Large numbers. |
| solr.FloatPointField | float | No | Decimal numbers. |
| solr.DoublePointField | double | No | High-precision decimals. |
| solr.DatePointField | date | No | Date/time values. |
| solr.BoolField | boolean | No | True/false values. |
| solr.BinaryField | binary | No | Binary data. |
| solr.LatLonPointSpatialField | geo_point | No | Geographic coordinates. |
| solr.BBoxField | geo_shape | No | Storing and querying complex geographic shapes. |
| solr.PointType | xy_point | No | N-dimensional points. |
| solr.NestPathField | nested | No | Complex objects. |
| solr.RankField | rank_feature | No | Boosting or decreasing the relevance score of documents. |
| solr.CurrencyField | No direct mapping. | N/A | N/A |
| solr.EnumFieldType | No direct mapping. | N/A | N/A |

Field attribute mappings:

| Solr attribute | OpenSearch mapping parameter | Description | Example |
| --- | --- | --- | --- |
| indexed="true" | "index": true | Field is searchable. | Text search, filtering. |
| stored="true" | "store": true | Original value is stored. | Highlighting, retrieval. |
| docValues="true" | "doc_values": true | Field supports sorting and aggregation. | Faceting, sorting. |
| multiValued="true" | Native array support. | Field accepts multiple values. | Tags, categories. |
| required="true" | Not supported; validate in the application or an ingest pipeline. | Field must have a value. | Validation. |
| useDocValuesAsStored="true" | "doc_values": true | Use doc values for storage. | Memory optimization. |
| omitNorms="true" | "norms": false | Skip scoring normalization. | Exact-match fields. |
| termVectors="true" | Not supported. | Store term vectors. | Logged as unknown. |
| termPositions="true" | Not supported. | Include position information. | Logged as unknown. |
| termOffsets="true" | Not supported. | Include offset information. | Logged as unknown. |

Tokenizer mappings:

| Solr class | OpenSearch type | Solr parameter | Maps to |
| --- | --- | --- | --- |
| solr.ClassicTokenizerFactory | standard | maxTokenLength | max_token_length (default: 255) |
| solr.KeywordTokenizerFactory | keyword | maxTokenLen | buffer_size (default: 256) |
| solr.LetterTokenizerFactory | letter | No parameters. | N/A |
| solr.LowerCaseTokenizerFactory | lowercase | No parameters. | N/A |
| solr.NGramTokenizerFactory | ngram | minGramSize, maxGramSize | min_gram (default: 1), max_gram (default: 2) |
| solr.EdgeNGramTokenizerFactory | edge_ngram | minGramSize, maxGramSize | min_gram (default: 1), max_gram (default: 2) |
| solr.PathHierarchyTokenizerFactory | path_hierarchy | reverse, skip, delimiter, replace | reverse (default: false), skip (default: 0), delimiter (default: "/"), replace (default: "/") |
| solr.PatternTokenizerFactory | pattern | pattern, group | pattern (default: ""), group (default: -1) |
| solr.SimplePatternTokenizerFactory | simple_pattern | pattern | pattern (default: "") |
| solr.SimplePatternSplitTokenizerFactory | simple_pattern_split | pattern | pattern (default: "") |
| solr.StandardTokenizerFactory | standard | maxTokenLength | max_token_length (default: 255) |
| solr.UAX29URLEmailTokenizerFactory | uax_url_email | maxTokenLength | max_token_length (default: 255) |
| solr.WhitespaceTokenizerFactory | whitespace | No parameters. | N/A |

Filter mappings:

| Solr factory class | OpenSearch type | Solr parameter | Maps to |
| --- | --- | --- | --- |
| solr.ASCIIFoldingFilterFactory | asciifolding | preserveOriginal | preserve_original (default: false) |
| solr.ApostropheFilterFactory | apostrophe | No parameters. | N/A |
| solr.CommonGramsFilterFactory | common_grams | ignoreCase, words | If query_mode: false (default): ignore_case (default: false), common_words_path (package). If query_mode: true, Solr uses a separate filter (solr.CommonGramsQueryFilterFactory). |
| solr.CJKWidthFilterFactory | cjk_width | No parameters. | N/A |
| solr.ClassicFilterFactory | classic | No parameters. | N/A |
| solr.DecimalDigitFilterFactory | decimal_digit | No parameters. | N/A |
| solr.EdgeNGramFilterFactory | edge_ngram | minGramSize, maxGramSize, preserveOriginal | min_gram (default: 1), max_gram (default: 1), preserve_original (default: false) |
| solr.FingerprintFilterFactory | fingerprint | maxOutputTokenSize, separator | max_output_size (default: 255), separator (default: " ") |
| solr.FlattenGraphFilterFactory | flatten_graph | No parameters. | N/A |
| solr.KeepWordFilterFactory | keep | words, ignoreCase | keep_words (package), keep_words_case (default: false) |
| solr.KeywordMarkerFilterFactory | keyword_marker | protected | keywords_path (package) |
| solr.KStemFilterFactory | kstem | No parameters. | N/A |
| solr.LengthFilterFactory | length | min, max | min (default: 0), max (default: 2147483647) |
| solr.LimitTokenCountFilterFactory | limit | maxTokenCount, consumeAllTokens | max_token_count (default: 1), consume_all_tokens (default: false) |
| solr.LowerCaseFilterFactory | lowercase | No parameters. | N/A |
| solr.NGramFilterFactory | ngram | minGramSize, maxGramSize, preserveOriginal | min_gram (default: 1), max_gram (default: 2), preserve_original (default: false) |
| solr.PatternReplaceFilterFactory | pattern_replace | pattern, replacement | pattern (default: ""), replacement (default: "") |
| solr.PhoneticFilterFactory | phonetic | encoder | encoder (default: "metaphone") |
| solr.PorterStemFilterFactory | porter_stem | No parameters. | N/A |
| solr.RemoveDuplicatesTokenFilterFactory | remove_duplicates | No parameters. | N/A |
| solr.ReverseStringFilterFactory | reverse | No parameters. | N/A |
| solr.ShingleFilterFactory | shingle | minShingleSize, maxShingleSize, outputUnigrams, outputUnigramsIfNoShingles, tokenSeparator, fillerToken | min_shingle_size (default: 2), max_shingle_size (default: 2), output_unigrams (default: true), output_unigrams_if_no_shingles (default: false), token_separator (default: " "), filler_token (default: "_") |
| solr.SnowballPorterFilterFactory | snowball | language | language (default: "English") |
| solr.StopFilterFactory | stop | ignoreCase, words | ignore_case (default: false), stopwords_path (package), stopwords (default: "none") |
| solr.SynonymGraphFilterFactory | synonym | expand, synonyms | expand (default: true), synonyms_path (package) |
| solr.TrimFilterFactory | trim | No parameters. | N/A |
| solr.UpperCaseFilterFactory | uppercase | No parameters. | N/A |
| solr.WordDelimiterGraphFilterFactory | word_delimiter_graph | No parameters. | N/A |
| solr.StemmerOverrideFilterFactory | stemmer_override | dictionary | rules_path (package) |

CharFilter mappings:

| Solr factory class | OpenSearch type | Solr parameter | Maps to |
| --- | --- | --- | --- |
| solr.HTMLStripCharFilterFactory | html_strip | No parameters. | N/A |
| solr.MappingCharFilterFactory | mapping | mapping | mappings_path (package), mappings (array) |
| solr.PatternReplaceCharFilterFactory | pattern_replace | pattern, replacement | pattern (required), replacement (default: "") |

The following sections describe these mappings in detail and also explain how OpenSearch automatically handles unique keys and similarity search configurations.

Step 1. Map primitive fields

Start by analyzing your Solr schema, and focus first on straightforward field mappings. This creates a foundation for fields that have more complex transformations. OpenSearch has a simpler field configuration than Solr and handles many Solr field attributes automatically without explicit configuration. 

For each field, identify the field name (such as product_id, price), which serves as the field identifier, the field type reference (such as string, float), and any field attributes, such as indexed or stored properties. After you identify these field components, identify the referenced field type, and map the Solr field type to its OpenSearch type equivalent. 

Map Solr field attributes to OpenSearch field mapping parameters only when necessary. For example, OpenSearch fields are indexed by default, so you don't have to map the Solr attribute indexed="true" explicitly. 
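Because so many Solr attributes collapse into OpenSearch defaults, the translation can be sketched as a table of only the non-default cases. The following is a hypothetical Python helper, not part of any migration tool; the attribute pairs it covers come from the field attribute table earlier in this section:

```python
# Hypothetical sketch: translate Solr field attributes to OpenSearch mapping
# parameters, emitting a parameter only when it differs from the OpenSearch
# default. indexed="true" and docValues="true" therefore produce nothing.
ATTRIBUTE_MAP = {
    ("stored", "true"): ("store", True),
    ("indexed", "false"): ("index", False),
    ("docValues", "false"): ("doc_values", False),
    ("omitNorms", "true"): ("norms", False),
}

def translate_attributes(solr_attrs):
    params = {}
    for attr, value in solr_attrs.items():
        mapped = ATTRIBUTE_MAP.get((attr, value))
        if mapped is not None:
            key, os_value = mapped
            params[key] = os_value
    return params
```

For example, `translate_attributes({"indexed": "true", "stored": "true"})` yields only `{"store": True}`, because indexed fields are already the OpenSearch default.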

The following example demonstrates the migration of the Solr fields named product_id, price, category, and brand to Amazon OpenSearch Service. It shows how solr.StrField maps to keyword and solr.FloatPointField maps to float.

Solr basic fields:

    <!-- field types -->
    <fieldType name="string" class="solr.StrField"/>
    <fieldType name="float" class="solr.FloatPointField"/>

    <!-- fields -->
    <field name="product_id" type="string" indexed="true" stored="true"/>
    <field name="price" type="float" indexed="true" stored="true"/>
    <field name="category" type="string" indexed="true" stored="true"/>
    <field name="brand" type="string" indexed="true" stored="true"/>

After mapping to Amazon OpenSearch Service:

    {
      "mappings": {
        "properties": {
          "product_id": { "type": "keyword" },
          "price": { "type": "float" },
          "category": { "type": "keyword" },
          "brand": { "type": "keyword" }
        }
      }
    }
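Because this step is mechanical, it can also be scripted. The following sketch (a hypothetical helper, not part of any migration tool) builds the OpenSearch mappings body from a list of Solr field definitions by using a lookup table drawn from the field-type table earlier in this section:

```python
# Lookup table for primitive types, taken from the field-type mapping table.
SOLR_TO_OPENSEARCH = {
    "solr.StrField": "keyword",
    "solr.IntPointField": "integer",
    "solr.LongPointField": "long",
    "solr.FloatPointField": "float",
    "solr.DoublePointField": "double",
    "solr.DatePointField": "date",
    "solr.BoolField": "boolean",
}

def build_mappings(fields):
    """fields: iterable of (name, solr_type_class) pairs."""
    properties = {}
    for name, solr_class in fields:
        os_type = SOLR_TO_OPENSEARCH.get(solr_class)
        if os_type is None:
            raise ValueError(f"no direct mapping for {solr_class}")
        properties[name] = {"type": os_type}
    return {"mappings": {"properties": properties}}

body = build_mappings([
    ("product_id", "solr.StrField"),
    ("price", "solr.FloatPointField"),
    ("category", "solr.StrField"),
    ("brand", "solr.StrField"),
])
```

The resulting `body` matches the mapping JSON shown above and can be passed directly to an index-creation request.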

Step 2. Map text fields

In this step, you identify text fields that require text analysis and field types that have the analyzer element defined. 

In the following example, you'll migrate the field named title. This field uses the text_general type, which has two analyzers defined.

    <!-- Solr Text Field with Analysis -->
    <fieldType name="text_general" class="solr.TextField">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StopFilterFactory" words="stopwords.txt"/>
        <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="3"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" words="stopwords.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

    <field name="title" type="text_general" indexed="true" stored="true"/>
    <field name="description" type="text_general" indexed="true" stored="true"/>

Review the built-in analyzers in the OpenSearch documentation to find the best match. 

Use built-in OpenSearch analyzers if they provide similar functionality. However, direct one-to-one mappings might not exist between Solr and OpenSearch components. Create custom analyzers if the built-in options don't meet your requirements. 

Mapping custom dictionaries

Your Solr analyzers might depend on external files (such as stopwords.txt or synonyms.txt). You'll need to handle these dependencies when migrating to Amazon OpenSearch Service. 

You have two options for handling custom dictionaries in Amazon OpenSearch Service: inline and by uploading files.

Inline configuration: Include word lists directly in your index settings.

    "filter": {
      "custom_stop": {
        "type": "stop",
        "stopwords": ["the", "is", "at", "which", "on"]
      }
    }

Uploading files: This is the option we recommend. You can upload custom dictionary files, such as stopwords.txt and synonyms.txt, and associate them with your domain. Create custom packages by copying your Solr dictionary files to an S3 bucket, and then create an OpenSearch package and associate it with your domain. After you associate a file with a domain, you can use it in parameters such as synonyms_path and stopwords_path:

    "filter": {
      "custom_stop": {
        "type": "stop",
        "stopwords_path": "analyzers/Fxxxxxxx"
      }
    }
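Both options can be captured in one small builder. This is an illustrative Python sketch; `custom_stop` is the filter name used in the examples above, and the package path is whatever ID Amazon OpenSearch Service assigns when you associate the file:

```python
def build_stop_filter(stopwords=None, package_path=None):
    """Build a stop filter definition from an inline word list or an
    associated package file; exactly one source must be provided."""
    if (stopwords is None) == (package_path is None):
        raise ValueError("provide exactly one of stopwords or package_path")
    body = {"type": "stop"}
    if stopwords is not None:
        body["stopwords"] = list(stopwords)      # inline configuration
    else:
        body["stopwords_path"] = package_path    # uploaded package (recommended)
    return {"filter": {"custom_stop": body}}
```

For example, `build_stop_filter(package_path="analyzers/Fxxxxxxx")` produces the package-based variant shown above.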

Mapping analyzers

To create a custom analyzer, identify the tokenizer, filter, and charFilter sections under the analyzer element in your Solr fieldType element.

To migrate your text field analyzers, identify the analyzer configuration for your field type and determine whether your field type has distinct analyzers for indexing and querying. Separate the index and query analyzer configurations and document all components within each analyzer. For each analyzer, carefully examine the configuration to identify the tokenizer, filters, and character filters. 

Map each tokenizer, filter, and character filter to its OpenSearch equivalent. For example, StandardTokenizerFactory maps to the OpenSearch standard tokenizer, and LowerCaseFilterFactory maps to the OpenSearch lowercase filter. For detailed component mapping information, see the tables earlier in this section. 

Establish a predictable naming strategy that combines the field type name with the analyzer type by using the format {field_type_name}_{analyzer_type}. For example, your index analyzer becomes text_general_index and your query analyzer becomes text_general_search. You can then refer to the analyzer in OpenSearch fields by using the analyzer or search_analyzer field mapping parameters.
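The naming convention is trivial to encode. One detail assumed here (it matches the examples in this section but is a convention, not a rule): Solr's analyzer type="query" maps to a _search suffix so that the name lines up with the search_analyzer parameter:

```python
# Map the Solr analyzer type to the suffix used in this naming convention;
# "query" becomes "search" to match the search_analyzer parameter.
ROLE_SUFFIX = {"index": "index", "query": "search"}

def analyzer_name(field_type_name, solr_analyzer_type):
    return f"{field_type_name}_{ROLE_SUFFIX[solr_analyzer_type]}"
```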

The following example demonstrates an Amazon OpenSearch Service index with custom analysis settings for text fields. The configuration includes two analyzers: text_general_index for indexing and text_general_search for searching. Both analyzers use a standard tokenizer with custom filters. The index analyzer includes lowercase conversion, custom stop words referenced from OpenSearch packages, and N-gram filtering with token sizes ranging from 2 to 3 characters. The search analyzer uses only lowercase and stop word filtering to process queries more efficiently.

In the mappings section, both title and description fields are configured as text types with distinct analyzer settings. The analyzer parameter specifies text_general_index for processing text during document indexing, and the search_analyzer parameter specifies text_general_search for processing search queries:

    {
      "settings": {
        "analysis": {
          "analyzer": {
            "text_general_index": {
              "type": "custom",
              "tokenizer": "standard",
              "filter": ["lowercase", "custom_stop", "ngram_filter"]
            },
            "text_general_search": {
              "type": "custom",
              "tokenizer": "standard",
              "filter": ["custom_stop", "lowercase"]
            }
          },
          "filter": {
            "custom_stop": {
              "type": "stop",
              "stopwords_path": "analyzers/FXXXXXXX"
            },
            "ngram_filter": {
              "type": "ngram",
              "min_gram": 2,
              "max_gram": 3
            }
          }
        }
      },
      "mappings": {
        "properties": {
          "title": {
            "type": "text",
            "analyzer": "text_general_index",
            "search_analyzer": "text_general_search"
          },
          "description": {
            "type": "text",
            "analyzer": "text_general_index",
            "search_analyzer": "text_general_search"
          }
        }
      }
    }

Validating text analysis

After you create your analyzer configuration in OpenSearch, validate that it works as expected before you index large amounts of data. Amazon OpenSearch Service provides the _analyze API to test your analyzers:

    POST /your-index/_analyze
    {
      "analyzer": "text_general_index",
      "text": "The Quick Brown Fox Jumps"
    }
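To reason about what the _analyze response should contain before you call it, the following rough pure-Python simulation approximates an index analyzer chain like the one in this section (word-ish tokenization, lowercasing, stop-word removal, then 2-3-character N-grams). It is not the real Lucene pipeline, so exact token order and edge cases can differ:

```python
import re

STOPWORDS = {"the", "is", "at", "which", "on"}  # mirrors the inline stop list

def simulate_index_analyzer(text, min_gram=2, max_gram=3):
    # Crude stand-in for the standard tokenizer: split on non-word characters.
    tokens = [t.lower() for t in re.findall(r"\w+", text)]
    tokens = [t for t in tokens if t not in STOPWORDS]
    grams = []
    for token in tokens:
        for n in range(min_gram, max_gram + 1):
            for i in range(len(token) - n + 1):
                grams.append(token[i:i + n])
    return grams
```

For instance, `simulate_index_analyzer("The Fox")` returns `['fo', 'ox', 'fox']`, a useful baseline to compare against the tokens that _analyze returns.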

Step 3. Map custom field types

To convert Solr custom fields to OpenSearch, evaluate whether OpenSearch native features can achieve the desired functionality before you consider custom development. 

In the following example, you'll migrate the field named custom_title, which uses the custom_text_general type. This field type uses a custom tokenizer implementation, com.mycompany.CustomTokenizerFactory.

    <!-- Custom Field Types -->
    <fieldType name="custom_text_general" class="solr.TextField">
      <analyzer type="index">
        <tokenizer class="com.mycompany.CustomTokenizerFactory"/>
      </analyzer>
    </fieldType>

    <field name="custom_title" type="custom_text_general" indexed="true" stored="true"/>

To migrate custom field types from Solr to OpenSearch, you can choose from two approaches: using OpenSearch built-in tokenizers and analyzers, or developing custom plugins. 

The preferred option is to use OpenSearch built-in tokenizers and analyzers, which you can configure through JSON settings. This involves creating a custom analyzer definition that combines existing components such as tokenizers, token filters, and character filters to achieve the desired text analysis behavior. For example, you might use the pattern tokenizer with specific patterns, combine it with lowercase filters, or use other built-in components to replicate the functionality of your custom Solr tokenizer.

We recommend that you consider the second option only if the OpenSearch built-in components don't meet your requirements. This option involves creating a custom plugin that implements your tokenizer's text analysis logic and installing the plugin in OpenSearch. The plugin approach requires more development effort and ongoing maintenance but provides maximum flexibility for implementing complex text analysis logic.

To choose between these options, consider factors such as maintenance overhead, performance requirements, and the complexity of your text analysis. We recommend that you thoroughly evaluate whether the rich set of built-in analysis components in OpenSearch can meet your requirements before you develop a custom plugin.

The following example demonstrates an Amazon OpenSearch Service index with a custom text analyzer configuration. The configuration includes a single custom analyzer named custom_text_analyzer that uses a specialized tokenizer defined as custom_tokenizer. In the mapping section, a field named custom_title is configured as a text type with the custom analyzer setting. The analyzer parameter specifies custom_text_analyzer for processing text during both document indexing and search operations. 

    // Index settings
    PUT /my_index
    {
      "settings": {
        "analysis": {
          "analyzer": {
            "custom_text_analyzer": {
              "type": "custom",
              "tokenizer": "custom_tokenizer"
            }
          }
        }
      }
    }

    // Field mapping
    PUT /my_index/_mapping
    {
      "properties": {
        "custom_title": {
          "type": "text",
          "analyzer": "custom_text_analyzer"
        }
      }
    }

Step 4. Map copy fields

When you convert your Solr schema to Amazon OpenSearch Service, you can implement the copyField directive by using the OpenSearch copy_to parameter.

For example, the following Solr elements:

    <!-- Unified search field - copy multiple fields to one destination -->
    <copyField source="title" dest="text"/>
    <copyField source="description" dest="text"/>
    <copyField source="brand" dest="text"/>
    <copyField source="category" dest="text"/>

are converted to:

    "mappings": {
      "properties": {
        "text": { "type": "text" },
        "title": { "type": "text", "copy_to": "text" },
        "description": { "type": "text", "copy_to": "text" },
        "brand": { "type": "keyword", "copy_to": "text" },
        "category": { "type": "keyword", "copy_to": "text" }
      }
    }
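Converting a batch of copyField directives can be mechanized. The following is a hypothetical Python sketch that adds copy_to entries to an existing properties map and makes sure each destination field has its own mapping entry:

```python
def apply_copy_fields(properties, copy_fields):
    """copy_fields: iterable of (source, dest) pairs from <copyField> elements."""
    for source, dest in copy_fields:
        # Destination fields need their own mapping entry to be searchable.
        properties.setdefault(dest, {"type": "text"})
        field = properties.setdefault(source, {"type": "text"})
        field.setdefault("copy_to", [])
        if dest not in field["copy_to"]:
            field["copy_to"].append(dest)
    return properties
```

Running it over the four copyField directives above produces one `text` destination field plus a `copy_to` entry on each source field.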

Step 5. Map dynamic fields

Amazon OpenSearch Service implements dynamic fields by using dynamic templates, which match field patterns that are similar to dynamicField in Solr.

For example, the following Solr element:

    <dynamicField name="attr_*" type="text_general"/>

transforms into a dynamic template in OpenSearch:

    "dynamic_templates": [
      {
        "attributes": {
          "match": "attr_*",
          "mapping": {
            "type": "text",
            "analyzer": "text_general"
          }
        }
      }
    ]

This pattern-based mapping automatically applies specified settings to any new field that matches the pattern, so it maintains the same flexible schema behavior as Solr dynamic fields.
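Template matching on simple patterns such as attr_* behaves like shell-style globbing, which the following sketch imitates. It is illustrative only; real dynamic templates also support options such as match_mapping_type, path_match, and regex matching:

```python
from fnmatch import fnmatch

def resolve_dynamic_mapping(field_name, dynamic_templates):
    """Return the mapping of the first template whose match pattern fits."""
    for template in dynamic_templates:
        (_name, spec), = template.items()  # each template is a single-key dict
        if fnmatch(field_name, spec["match"]):
            return spec["mapping"]
    return None

templates = [
    {"attributes": {"match": "attr_*",
                    "mapping": {"type": "text", "analyzer": "text_general"}}}
]
```

Here `resolve_dynamic_mapping("attr_color", templates)` returns the text mapping, while a non-matching name such as `price` returns None.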

Handling unique keys

Amazon OpenSearch Service and Solr handle unique identifiers differently. In Solr, <uniqueKey>product_id</uniqueKey> requires explicit configuration, whereas OpenSearch automatically provides a unique identifier through its _id field for each document. You can still use the product_id field value as the document's _id when you index documents.
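For example, a _bulk payload can carry the Solr unique key as the document _id. The sketch below only builds the newline-delimited request body (the index and field names are illustrative); actually sending it to the cluster is omitted:

```python
import json

def bulk_body(index_name, docs, id_field="product_id"):
    """Build an NDJSON _bulk body, reusing the Solr unique key as _id."""
    lines = []
    for doc in docs:
        # Action line names the target index and sets _id explicitly.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc[id_field]}}))
        # Source line carries the document itself.
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"
```

Documents indexed this way keep stable identifiers, so re-running the migration overwrites rather than duplicates them.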

Handling similarity configurations

In Solr, the similarity configuration controls scoring algorithms for search relevance. This feature maps to the similarity settings in OpenSearch. Amazon OpenSearch Service uses BM25 as the default ranking framework, but it supports other similarities such as Boolean as well. 
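As a minimal sketch (the field name is illustrative), overriding the default BM25 with Boolean similarity is a per-field mapping parameter:

```python
# Per-field similarity override: BM25 is the default, so only non-default
# choices such as "boolean" need to be declared in the mapping.
mapping = {
    "mappings": {
        "properties": {
            "tag": {"type": "keyword", "similarity": "boolean"}
        }
    }
}
```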

Best practices

Migrating from Solr to Amazon OpenSearch Service offers a straightforward path through one-to-one mapping of fields, analyzers, and configurations. It also presents a valuable opportunity to reassess and optimize your search infrastructure. 

Instead of lifting and shifting your existing Solr configurations, we recommend that you take the time to evaluate each field's necessity, validate data types for optimal performance, and simplify complex configurations where possible. 

Consider whether custom Solr field types could be replaced with OpenSearch native functionality. This strategic approach not only ensures a successful migration but also takes advantage of the strengths in Amazon OpenSearch Service to help you build a more efficient, maintainable search solution. The goal isn't only to replicate Solr's functionality, but to enhance your search capabilities while reducing unnecessary complexity.