Automatic semantic enrichment for Amazon OpenSearch Service
Amazon OpenSearch Service uses word-to-word matching (lexical search) to find results, similar to other traditional search engines. This approach works well for specific queries like product codes or model numbers, but struggles with abstract searches where understanding user intent becomes crucial. For example, when you search for "shoes for the beach," lexical search matches individual words "shoes," "beach," "for," and "the" in catalog items, potentially missing relevant products like "water-resistant sandals" or "surf footwear" that don't contain the exact search terms.
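The limitation described above can be illustrated with a toy word-overlap scorer. This is only a sketch of the idea of lexical matching, not how OpenSearch actually scores documents; the function names and the third catalog item are made up for the illustration.

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def lexical_score(query: str, document: str) -> int:
    """Count how many distinct query words appear verbatim in the document."""
    return len(tokenize(query) & tokenize(document))

# Hypothetical catalog items; the first two come from the example in the text.
catalog = [
    "Water-resistant sandals",
    "Surf footwear",
    "Leather dress shoes",
]

query = "shoes for the beach"
scores = {item: lexical_score(query, item) for item in catalog}

for item, score in scores.items():
    print(f"{score}  {item}")
```

In this sketch the two beach-appropriate items share no words with the query, so pure word overlap ranks them below an item that merely contains "shoes".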
Automatic Semantic Enrichment solves this limitation by considering both keyword matches and the contextual meaning behind searches. This feature understands search intent and improves search relevance by up to 20%. Enable this feature for text fields in your index to enhance search results.
Note
Automatic semantic enrichment is available for OpenSearch Service domains running version 2.19 or later. Domains running OpenSearch version 2.19 must also be on the latest service software update.
How it works
The enrichment process analyzes designated text fields and generates semantic embeddings that capture meaning and context. These embeddings help the search engine understand relationships between concepts, synonyms, and related terms even when they don't appear in your search query. For example, if a user searches for "how to treat a headache", a semantic search system might return the following results:
- Migraine remedies
- Pain management techniques
- Over-the-counter pain relievers
- Natural headache relief methods
The system understands the underlying intent even when these exact phrases aren't in the original query.
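One way to picture how a sparse encoding can connect a query to documents that use different words is the sketch below. It assumes a sparse encoding is a map from terms to weights and that relevance is the dot product over shared terms. The weights, the expanded terms, and the unrelated third document are hand-written for illustration and do not come from the service's model.

```python
def sparse_score(query_vec: dict[str, float], doc_vec: dict[str, float]) -> float:
    """Dot product over the terms that both sparse vectors contain."""
    return sum(w * doc_vec[t] for t, w in query_vec.items() if t in doc_vec)

# Hypothetical expansion of the query "how to treat a headache".
query_vec = {"headache": 1.0, "treat": 0.8, "pain": 0.6, "migraine": 0.5, "relief": 0.5}

# Hypothetical sparse encodings stored alongside each document at index time.
docs = {
    "Migraine remedies": {"migraine": 1.0, "remedies": 0.9, "headache": 0.7, "pain": 0.4},
    "Pain management techniques": {"pain": 1.0, "management": 0.8, "techniques": 0.6, "relief": 0.3},
    "Garden hose repair": {"garden": 1.0, "hose": 0.9, "repair": 0.8},
}

ranked = sorted(docs, key=lambda title: sparse_score(query_vec, docs[title]), reverse=True)
for title in ranked:
    print(f"{sparse_score(query_vec, docs[title]):.2f}  {title}")
```

Because the expanded terms overlap, "Migraine remedies" scores above zero even though the word "migraine" does not appear in the query as typed.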
Automatic semantic enrichment offers the following benefits:
- Simplified implementation – You don't need machine learning expertise or complex integrations to implement semantic search capabilities.
- Index-level configuration – Semantic enrichment is configured at the index level during creation, giving you granular control over which data receives semantic processing.
- Minimal impact on search latency – Automatic Semantic Enrichment stores sparse encodings directly in your index during indexing. You don't need separate KNN indices. Your searches maintain their original speed while delivering enhanced results.
- Automated process – Semantic enrichment happens automatically during data ingestion without requiring manual intervention.
- Improved search relevance – Semantic enrichment enhances the quality and contextual accuracy of search results by understanding user intent.
- Scalability – Semantic enrichment applies semantic search capabilities to large datasets without manual intervention.
Requirements and considerations
Before implementing automatic semantic enrichment, consider the following requirements and limitations:
- Version requirements – Automatic semantic enrichment is available for Amazon OpenSearch Service version 2.19 and later. For existing domains running Amazon OpenSearch Service version 2.19 or 3.1, you must update to the latest patch version to use this feature.
- Public domains only – Automatic Semantic Enrichment is available only for public domains. You can't use it with VPC domains.
- Processing overhead – The enrichment process adds minimal processing time during data ingestion as the system generates semantic embeddings for designated fields.
- Storage implications – Enriched data requires additional storage space for the semantic embeddings generated alongside your original data.
- Language support – Automatic semantic enrichment for managed domains offers the following language options:
  - English-only option
    - Ideal for applications primarily dealing with English text
  - Multi-lingual option
    - Supports the following languages: Arabic, Bengali, Chinese, English, Finnish, French, Hindi, Indonesian, Japanese, Korean, Persian, Russian, Spanish, Swahili, and Telugu
    - Perfect for diverse, international content or multilingual applications
Pricing
With Automatic Semantic Enrichment, you pay only for the resources your workload consumes. Compute capacity is measured in OpenSearch Compute Units (OCUs). For pricing details and a pricing illustration for your specific Region, see the Amazon OpenSearch Service pricing page at https://aws.amazon.com/opensearch-service/pricing/
Index setup example
For a practical example, refer to the blog post https://aws.amazon.com/blogs/big-data/boosting-search-relevance-automatic-semantic-enrichment-in-amazon-opensearch-serverless/
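As a rough sketch of what an index schema with an enriched text field might look like, the snippet below builds the mapping as a Python dictionary and prints it as JSON. The property names `semantic_enrichment`, `status`, and `language_options`, and the values `ENABLED`, `english`, and `multi-lingual`, are assumptions based on the shape used in the Serverless blog post; they are not confirmed by this page, so check them against the current API reference. The field names `product_id` and `product_title` and the helper `semantic_text_field` are hypothetical.

```python
import json

# Assumed value strings for the two language options described above.
ALLOWED_LANGUAGE_OPTIONS = {"english", "multi-lingual"}

def semantic_text_field(language_option: str = "english") -> dict:
    """Return a text field mapping with semantic enrichment enabled (assumed shape)."""
    if language_option not in ALLOWED_LANGUAGE_OPTIONS:
        raise ValueError(f"unsupported language option: {language_option}")
    return {
        "type": "text",
        "semantic_enrichment": {
            "status": "ENABLED",
            "language_options": language_option,
        },
    }

# Hypothetical schema: one plain keyword field and one enriched text field.
index_schema = {
    "mappings": {
        "properties": {
            "product_id": {"type": "keyword"},
            "product_title": semantic_text_field("english"),
        }
    }
}

print(json.dumps(index_schema, indent=2))
```

Only the fields that should receive semantic processing carry the enrichment block, which matches the index-level, per-field control described earlier.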
Configuring permissions for automatic semantic enrichment
Before creating an index with automatic semantic enrichment, you need to configure the required permissions. This section explains the permissions needed for different index operations and how to set them up for both AWS Identity and Access Management (IAM) and fine-grained access control scenarios.
IAM permissions
The following IAM permissions are required for automatic semantic enrichment operations. These permissions vary depending on the specific index operation you want to perform.
CreateIndex API permissions
To create an index with automatic semantic enrichment, you need the following IAM permissions:
- es:CreateIndex – Create an index with semantic enrichment capabilities.
- es:ESHttpHead – Perform HEAD requests to check index existence.
- es:ESHttpPut – Perform PUT requests for index creation.
- es:ESHttpPost – Perform POST requests for index operations.
UpdateIndex API permissions
To update an existing index with automatic semantic enrichment, you need the following IAM permissions:
- es:UpdateIndex – Update index settings and mappings.
- es:ESHttpPut – Perform PUT requests for index updates.
- es:ESHttpGet – Perform GET requests to retrieve index information.
- es:ESHttpPost – Perform POST requests for index operations.
GetIndex API permissions
To retrieve information about an index with automatic semantic enrichment, you need the following IAM permissions:
- es:GetIndex – Retrieve index information and settings.
- es:ESHttpGet – Perform GET requests to retrieve index data.
DeleteIndex API permissions
To delete an index with automatic semantic enrichment, you need the following IAM permissions:
- es:DeleteIndex – Delete an index and its semantic enrichment components.
- es:ESHttpDelete – Perform DELETE requests for index removal.
Sample IAM policy
The following sample identity-based access policy provides the permissions necessary for a user to manage indexes with automatic semantic enrichment:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowSemanticEnrichmentIndexOperations",
      "Effect": "Allow",
      "Action": [
        "es:CreateIndex",
        "es:UpdateIndex",
        "es:GetIndex",
        "es:DeleteIndex",
        "es:ESHttpHead",
        "es:ESHttpGet",
        "es:ESHttpPut",
        "es:ESHttpPost",
        "es:ESHttpDelete"
      ],
      "Resource": "arn:aws:es:aws-region:111122223333:domain/domain-name/*"
    }
  ]
}
Replace aws-region, 111122223333, and domain-name with your specific values. You can further restrict access by specifying particular index patterns in the resource ARN.
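If you want to grant only some of the index operations, or scope the policy to an index pattern, a small helper along these lines could assemble the policy document. It is a sketch: the action lists are copied from this section, while the function name `build_policy`, the Region `us-east-1`, the domain name `my-domain`, and the index pattern `products-*` are placeholders chosen for the example.

```python
import json

# IAM actions per index operation, copied from the lists in this section.
IAM_ACTIONS = {
    "CreateIndex": ["es:CreateIndex", "es:ESHttpHead", "es:ESHttpPut", "es:ESHttpPost"],
    "UpdateIndex": ["es:UpdateIndex", "es:ESHttpPut", "es:ESHttpGet", "es:ESHttpPost"],
    "GetIndex": ["es:GetIndex", "es:ESHttpGet"],
    "DeleteIndex": ["es:DeleteIndex", "es:ESHttpDelete"],
}

def build_policy(region, account_id, domain_name, operations, index_pattern="*"):
    """Build an identity-based policy covering only the requested operations."""
    actions = []
    for op in operations:
        for action in IAM_ACTIONS[op]:
            if action not in actions:  # skip actions shared between operations
                actions.append(action)
    resource = f"arn:aws:es:{region}:{account_id}:domain/{domain_name}/{index_pattern}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowSemanticEnrichmentIndexOperations",
                "Effect": "Allow",
                "Action": actions,
                "Resource": resource,
            }
        ],
    }

# Placeholder values: read and update access limited to indexes matching products-*.
policy = build_policy(
    "us-east-1", "111122223333", "my-domain",
    ["GetIndex", "UpdateIndex"], index_pattern="products-*",
)
print(json.dumps(policy, indent=2))
```

Passing all four operation names with the default index pattern yields the same set of actions as the sample policy above.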
Fine-grained access control permissions
If your Amazon OpenSearch Service domain has fine-grained access control enabled, you need additional permissions beyond the IAM permissions. The following permissions are required for each index operation.
CreateIndex API permissions
When fine-grained access control is enabled, the following additional permissions are required for creating an index with automatic semantic enrichment:
- indices:admin/create – Create index operations.
- indices:admin/mapping/put – Create and update index mappings.
- cluster:admin/opensearch/ml/create_connector – Create machine learning connectors for semantic processing.
- cluster:admin/opensearch/ml/register_model – Register machine learning models for semantic enrichment.
- cluster:admin/ingest/pipeline/put – Create ingest pipelines for data processing.
- cluster:admin/search/pipeline/put – Create search pipelines for query processing.
UpdateIndex API permissions
When fine-grained access control is enabled, the following additional permissions are required for updating an index with automatic semantic enrichment:
- indices:admin/get – Retrieve index information.
- indices:admin/settings/update – Update index settings.
- indices:admin/mapping/put – Update index mappings.
- cluster:admin/opensearch/ml/create_connector – Create machine learning connectors.
- cluster:admin/opensearch/ml/register_model – Register machine learning models.
- cluster:admin/ingest/pipeline/put – Create ingest pipelines.
- cluster:admin/search/pipeline/put – Create search pipelines.
- cluster:admin/ingest/pipeline/get – Retrieve ingest pipeline information.
- cluster:admin/search/pipeline/get – Retrieve search pipeline information.
GetIndex API permissions
When fine-grained access control is enabled, the following additional permissions are required for retrieving information about an index with automatic semantic enrichment:
- indices:admin/get – Retrieve index information.
- cluster:admin/ingest/pipeline/get – Retrieve ingest pipeline information.
- cluster:admin/search/pipeline/get – Retrieve search pipeline information.
DeleteIndex API permissions
When fine-grained access control is enabled, the following additional permission is required for deleting an index with automatic semantic enrichment:
- indices:admin/delete – Delete index operations.
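The per-operation permissions above could be collected into a role definition for the OpenSearch security plugin. The sketch below assumes the role body format that uses `cluster_permissions` and `index_permissions` (with `index_patterns` and `allowed_actions`), and it assumes that permissions prefixed with `cluster:` belong at cluster level and those prefixed with `indices:` at index level. The function name `build_role` and the index pattern `products-*` are hypothetical.

```python
import json

# Fine-grained access control permissions per operation, copied from this section.
FGAC_ACTIONS = {
    "CreateIndex": [
        "indices:admin/create",
        "indices:admin/mapping/put",
        "cluster:admin/opensearch/ml/create_connector",
        "cluster:admin/opensearch/ml/register_model",
        "cluster:admin/ingest/pipeline/put",
        "cluster:admin/search/pipeline/put",
    ],
    "UpdateIndex": [
        "indices:admin/get",
        "indices:admin/settings/update",
        "indices:admin/mapping/put",
        "cluster:admin/opensearch/ml/create_connector",
        "cluster:admin/opensearch/ml/register_model",
        "cluster:admin/ingest/pipeline/put",
        "cluster:admin/search/pipeline/put",
        "cluster:admin/ingest/pipeline/get",
        "cluster:admin/search/pipeline/get",
    ],
    "GetIndex": [
        "indices:admin/get",
        "cluster:admin/ingest/pipeline/get",
        "cluster:admin/search/pipeline/get",
    ],
    "DeleteIndex": ["indices:admin/delete"],
}

def build_role(operations, index_patterns):
    """Split the required permissions into cluster-level and index-level lists."""
    cluster, index = [], []
    for op in operations:
        for action in FGAC_ACTIONS[op]:
            target = cluster if action.startswith("cluster:") else index
            if action not in target:  # avoid duplicates across operations
                target.append(action)
    return {
        "cluster_permissions": cluster,
        "index_permissions": [
            {"index_patterns": list(index_patterns), "allowed_actions": index}
        ],
    }

# Hypothetical role that can read and delete indexes matching products-*.
role = build_role(["GetIndex", "DeleteIndex"], ["products-*"])
print(json.dumps(role, indent=2))
```

The resulting dictionary is only the role body; how you create the role and map users to it depends on how your domain's security configuration is managed.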