

# Using CloudFormation to set up remote inference for semantic search

Starting with OpenSearch version 2.9, you can use remote inference with [semantic search](https://opensearch.org/docs/latest/search-plugins/semantic-search/) to host your own machine learning (ML) models. Remote inference uses the [ML Commons plugin](https://opensearch.org/docs/latest/ml-commons-plugin/index/).

With remote inference, you can host your models remotely on ML services, such as Amazon SageMaker AI and Amazon Bedrock, and connect them to Amazon OpenSearch Service with ML connectors. 

To ease the setup of remote inference, Amazon OpenSearch Service provides an [AWS CloudFormation](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/Welcome.html) template in the console. CloudFormation is an AWS service that you can use to model, provision, and manage AWS and third-party resources by treating infrastructure as code. 

The OpenSearch CloudFormation template automates the model provisioning process for you, so that you can easily create a model in your OpenSearch Service domain and then use the model ID to ingest data and run neural search queries.
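To illustrate what that model ID is used for, the following sketch builds the request bodies for an ingest pipeline with a `text_embedding` processor and a `neural` search query. The index field names and model ID are placeholders; substitute the model ID returned by your CloudFormation stack.

```python
# Sketch: request bodies for ingesting with a text_embedding pipeline
# (PUT _ingest/pipeline/<name>) and running a neural query
# (GET <index>/_search). All names and IDs below are placeholders.

def build_ingest_pipeline(model_id: str) -> dict:
    """Pipeline that embeds the 'text' field into 'text_embedding'."""
    return {
        "processors": [
            {
                "text_embedding": {
                    "model_id": model_id,
                    "field_map": {"text": "text_embedding"},
                }
            }
        ]
    }

def build_neural_query(model_id: str, query_text: str, k: int = 5) -> dict:
    """Neural query against the embedded field."""
    return {
        "query": {
            "neural": {
                "text_embedding": {
                    "query_text": query_text,
                    "model_id": model_id,
                    "k": k,
                }
            }
        }
    }

pipeline = build_ingest_pipeline("my-model-id")
query = build_neural_query("my-model-id", "wild west")
```

Send these bodies to your domain with your usual signed or authenticated HTTP client; see the OpenSearch semantic search documentation linked above for the full workflow.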

When you use neural sparse encoders with OpenSearch Service version 2.12 and later, we recommend that you use the tokenizer model locally instead of deploying it remotely. For more information, see [Sparse encoding models](https://opensearch.org/docs/latest/ml-commons-plugin/pretrained-models/#sparse-encoding-models) in the OpenSearch documentation. 

**Topics**
+ [Available CloudFormation templates](#cfn-template-list)
+ [Prerequisites](#cfn-template-prereq)
+ [Amazon Bedrock templates](cfn-template-bedrock.md)
+ [Configuring Agentic Search with Bedrock Claude](cfn-template-agentic-search.md)
+ [MCP server integration templates](cfn-template-mcp-server.md)
+ [Amazon SageMaker templates](cfn-template-sm.md)
+ [Remote inference for semantic highlighting templates](#cfn-template-semantic-highlighting)

## Available CloudFormation templates


The following AWS CloudFormation machine learning (ML) templates are available for use:

**[Amazon Bedrock templates](cfn-template-bedrock.md)**

**Amazon Titan Text Embeddings Integration**  
Connects to Amazon Bedrock's hosted ML models, eliminates the need for separate model deployment, and uses predetermined Amazon Bedrock endpoints. For more information, see [Amazon Titan Text Embeddings](https://docs.aws.amazon.com/bedrock/latest/userguide/titan-embedding-models.html) in the *Amazon Bedrock User Guide*.

**Cohere Embed Integration**  
Provides access to Cohere Embed models, and is optimized for specific text processing workflows. For more information, see [Embed](https://docs.cohere.com/docs/cohere-embed) on the *Cohere docs* website.

**Amazon Titan Multimodal Embeddings**  
Supports both text and image embeddings, and enables multimodal search capabilities. For more information, see [Amazon Titan Multimodal Embeddings](https://docs.aws.amazon.com/bedrock/latest/userguide/titan-multiemb-models.html) in the *Amazon Bedrock User Guide*.

**[MCP server integration templates](cfn-template-mcp-server.md)**

**MCP server integration**  
Deploys an [Amazon Bedrock AgentCore Runtime](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/what-is-bedrock-agentcore.html), provides an agent endpoint, handles inbound and outbound authentication, and supports OAuth for enterprise authentication.

**[Amazon SageMaker templates](cfn-template-sm.md)**

**Integration with text embedding models through Amazon SageMaker**  
Deploys text embedding models in Amazon SageMaker Runtime, creates IAM roles for model artifact access, and establishes ML connectors for semantic search.

**Integration with Sparse Encoders through SageMaker**  
Sets up sparse encoding models for neural search, creates AWS Lambda functions for connector management, and returns model IDs for immediate use.

## Prerequisites


To use a CloudFormation template with OpenSearch Service, complete the following prerequisites.

### Set up an OpenSearch Service domain


Before you can use a CloudFormation template, you must set up an [Amazon OpenSearch Service domain](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/osis-get-started.html) with version 2.9 or later and fine-grained access control enabled. [Create an OpenSearch Service backend role](fgac.md#fgac-roles) to give the ML Commons plugin permission to create your connector for you. 

The CloudFormation template creates a Lambda IAM role for you with the default name `LambdaInvokeOpenSearchMLCommonsRole`, which you can override if you want to choose a different name. After the template creates this IAM role, you need to give the Lambda function permission to call your OpenSearch Service domain. To do so, [map the role](fgac.md#fgac-mapping) named `ml_full_access` to your OpenSearch Service backend role with the following steps:

1. Navigate to the OpenSearch Dashboards plugin for your OpenSearch Service domain. You can find the Dashboards endpoint on your domain dashboard on the OpenSearch Service console. 

1. From the main menu choose **Security**, **Roles**, and select the **ml_full_access** role.

1. Choose **Mapped users**, **Manage mapping**. 

1. Under **Backend roles**, add the ARN of the Lambda role that needs permission to call your domain.

   ```
   arn:aws:iam::account-id:role/role-name
   ```

1. Select **Map** and confirm the user or role shows up under **Mapped users**.

After you've mapped the role, navigate to the security configuration of your domain and add the Lambda IAM role to your OpenSearch Service access policy.
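As an alternative to the Dashboards steps above, the same mapping can be applied through the fine-grained access control REST API. The following sketch only builds the request body; the domain endpoint, credentials, and role ARN are placeholders.

```python
# Sketch: body for PUT _plugins/_security/api/rolesmapping/ml_full_access.
# Note that PUT replaces any existing mappings for the role, so include
# every backend role you want mapped. All values are placeholders.
import json

def build_role_mapping(lambda_role_arn: str) -> dict:
    return {"backend_roles": [lambda_role_arn]}

mapping = build_role_mapping(
    "arn:aws:iam::123456789012:role/LambdaInvokeOpenSearchMLCommonsRole"
)

# Example of sending it (master-user credentials assumed):
# requests.put(
#     f"https://{domain_endpoint}/_plugins/_security/api/rolesmapping/ml_full_access",
#     auth=(master_user, master_password), json=mapping)
print(json.dumps(mapping))
```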

### Enable permissions on your AWS account


Your AWS account must have permission to access CloudFormation and Lambda, along with whichever AWS service you choose for your template – either SageMaker Runtime or Amazon Bedrock. 

If you're using Amazon Bedrock, you must also register your model. See [Model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html) in the *Amazon Bedrock User Guide* to register your model. 

If you're using your own Amazon S3 bucket to provide model artifacts, you must add the CloudFormation IAM role to your S3 access policy. For more information, see [Adding and removing IAM identity permissions](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html) in the *IAM User Guide*.

# Amazon Bedrock templates


The Amazon Bedrock CloudFormation templates provision the AWS resources needed to create connectors between OpenSearch Service and Amazon Bedrock. 

First, the template creates an IAM role that allows the Lambda function to access your OpenSearch Service domain. The template then creates the Lambda function, which directs the domain to create a connector using the ML Commons plugin. After OpenSearch Service creates the connector, the remote inference setup is complete and you can run semantic searches using the Amazon Bedrock API operations.
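For a sense of what the Lambda function sends to `POST /_plugins/_ml/connectors/_create`, here is a sketch of a Bedrock embeddings connector body. The region, role ARN, and model name are illustrative, and the actual blueprint the template uses may differ.

```python
# Sketch of an ML Commons connector body for Amazon Bedrock
# (aws_sigv4 protocol). All values below are placeholders and may
# differ from the blueprint the CloudFormation template actually uses.

def build_bedrock_connector(region: str, role_arn: str) -> dict:
    return {
        "name": "Amazon Bedrock connector: Titan embeddings",
        "description": "Connector to the Amazon Titan Text Embeddings model",
        "version": 1,
        "protocol": "aws_sigv4",
        "parameters": {"region": region, "service_name": "bedrock"},
        "credential": {"roleArn": role_arn},
        "actions": [
            {
                "action_type": "predict",
                "method": "POST",
                # Placeholder model path; check the Bedrock model ID you enabled.
                "url": f"https://bedrock-runtime.{region}.amazonaws.com/model/amazon.titan-embed-text-v1/invoke",
                "headers": {"content-type": "application/json"},
                "request_body": '{ "inputText": "${parameters.inputText}" }',
            }
        ],
    }

connector = build_bedrock_connector(
    "us-east-1", "arn:aws:iam::123456789012:role/bedrock-invoke-role"
)
```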

**Note**  
Since Amazon Bedrock hosts its own ML models, you don’t need to deploy a model to SageMaker Runtime. Instead, the template uses a predetermined endpoint for Amazon Bedrock and skips the endpoint provision steps.

**To use the Amazon Bedrock CloudFormation template**

1. Open the [Amazon OpenSearch Service console](https://console.aws.amazon.com/aos/home).

1. In the left navigation pane, choose **Integrations**.

1. Under **Integrate with Amazon Titan Text Embeddings model through Amazon Bedrock**, choose **Configure domain**, **Configure public domain**.

1. Follow the prompt to set up your model.

**Note**  
OpenSearch Service also provides a separate template to configure an Amazon VPC domain. If you use this template, you need to provide the Amazon VPC ID for the Lambda function.

In addition, OpenSearch Service provides the following Amazon Bedrock templates to connect to the Cohere model and the Amazon Titan Multimodal Embeddings model:
+ `Integration with Cohere Embed through Amazon Bedrock`
+ `Integrate with Amazon Bedrock Titan Multi-modal`

# Configuring Agentic Search with Bedrock Claude


Agentic search uses autonomous agents to run complex searches on your behalf. The agents understand user intent, orchestrate the right tools, generate optimized queries, and provide transparent summaries of their decisions through a natural language interface. These agents are powered by reasoning models, such as Bedrock Claude.

Follow the steps below to open and run a CloudFormation template that automatically configures Bedrock Claude models for agentic search, and then to configure and create your agents in the AI Search Flows plugin in OpenSearch Dashboards.

## Enabling Bedrock Claude Access


1. **Prerequisite:** If your domain uses fine-grained access control, map `arn:aws:iam::your-account-id:role/LambdaInvokeOpenSearchMLCommonsRole` as a backend role to the `ml_full_access` role before running the template. This IAM role will be created automatically by CloudFormation if it doesn't already exist. For more information on how to configure the mapping, see [Map the ML role in OpenSearch Dashboards (if using fine-grained access control)](ml-external-connector.md#connector-external-fgac).

1. Open the Amazon OpenSearch Service console at [https://console.aws.amazon.com/aos/home](https://console.aws.amazon.com/aos/home).

1. In the left navigation, choose **Integrations**.

1. Under **Integration with Bedrock Claude for Agentic Search**, choose **Configure domain**. Make sure that your domain is on version 3.3 or later.

1. In the CloudFormation template, enter your OpenSearch Service domain endpoint and select a model. The remaining fields are optional or prefilled. Choose **Create stack** and wait for provisioning to complete.

1. From the Amazon OpenSearch Service console, choose **Domains**, and then choose your domain. Choose the **OpenSearch Dashboards URL** to access OpenSearch Dashboards.

## Building agents and running Agentic Search


1. From OpenSearch Dashboards, open the menu on the left side. Choose **OpenSearch Plugins**, **AI Search Flows** to access the plugin.

1. On the **Workflows** page, choose the **New workflow** tab, and under the **Agentic Search** card, choose **Create**.

1. Provide a unique name for your search configuration, and choose **Create**.

1. Under **Configure agent**, choose **Create new agent**. Select your newly created Bedrock Claude model, then choose **Create agent**. If the button is unavailable, check **Advanced Settings**, **LLM Interface**, and make sure that a valid interface is selected. All models created by the CloudFormation template are Bedrock Claude models, so select **Bedrock Claude** if it isn't already selected, then choose **Create agent**.

1. Under **Test flow**, try running agentic searches. Provide a natural language search query, and choose **Search**.

For complete documentation of the AI Search Flows plugin, see [Configuring Agentic Search](https://docs.opensearch.org/latest/vector-search/ai-search/building-agentic-search-flows/) in the OpenSearch documentation.

For more information about how Agentic Search works, see [Agentic Search](https://opensearch.org/docs/latest/vector-search/ai-search/agentic-search/) in the OpenSearch documentation.
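As a purely illustrative sketch of the kind of request an agentic search accepts, the following builds a natural-language query body. The exact query DSL, including the query type name and any agent or pipeline wiring, depends on your OpenSearch version, so treat this as a placeholder and consult the linked OpenSearch documentation for the authoritative syntax.

```python
# Illustrative only: a natural-language search body for agentic search.
# The "agentic" query type name and its parameters are assumptions here;
# verify them against the OpenSearch docs for your version.

def build_agentic_query(query_text: str) -> dict:
    return {"query": {"agentic": {"query_text": query_text}}}

body = build_agentic_query("shoes under 50 dollars with good reviews")
```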

# MCP server integration templates


With the Model Context Protocol (MCP) server templates, you can deploy an OpenSearch hosted MCP server on Amazon Bedrock AgentCore, reducing the integration complexity between AI agents and OpenSearch tools. For more information, see [What is Amazon Bedrock AgentCore?](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/what-is-bedrock-agentcore.html).

## Template features


This template includes the following key features for deploying and managing your MCP server.

**Managed MCP server deployment**  
Deploys **opensearch-mcp-server-py** using Amazon Bedrock AgentCore Runtime, and provides an agent endpoint that proxies requests to the underlying MCP server. For more information, see [opensearch-mcp-server-py](https://github.com/opensearch-project/opensearch-mcp-server-py) on *GitHub*.

**Authentication and security**  
Handles both inbound authentication (from users to MCP server) and outbound authentication (from MCP server to OpenSearch), and supports OAuth for enterprise authentication.

**Note**  
The MCP server template is only available in the following AWS Regions:  
+ US East (N. Virginia)
+ US West (Oregon)
+ Europe (Frankfurt)
+ Asia Pacific (Sydney)

## To use the MCP server template

Follow these steps to deploy the MCP server template and connect it to your OpenSearch domain.

1. Open the [Amazon OpenSearch Service console](https://console.aws.amazon.com/aos/home). 

1. In the left navigation pane, choose **Integrations**.

1. Locate the **MCP server integration** template.

1. Choose **Configure domain**. Then, enter your OpenSearch domain endpoint.

The template creates an AgentCore Runtime and the following components, if the corresponding optional parameters are not specified:
+ An Amazon ECR repository
+ An Amazon Cognito user pool as the OAuth authorizer
+ An execution role used by the AgentCore Runtime

After you complete this procedure, you should follow these post-creation steps:

1. **For Amazon OpenSearch Service**: Map your execution role ARN to an OpenSearch backend role to control access to your domain.

   **For Amazon OpenSearch Serverless**: Create a data access policy that allows your execution role to access your collection.

1. Get an OAuth access token from your authorizer. Then use this token to access the MCP server at the URL listed in your CloudFormation stack output.

For more information, see [Policy actions for OpenSearch Serverless](security-iam-serverless.md#security-iam-serverless-id-based-policies-actions).

## Integration with AI agents


After deployment, you can integrate the MCP server with any MCP-compatible agent. For more information, see [Invoke your deployed MCP server](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/runtime-mcp.html#runtime-mcp-invoke-server) in the *Amazon Bedrock AgentCore Developer Guide*. 

**Developer Integration**  
You can add the MCP server endpoint to your agent configuration. You can also use it with the Amazon Q Developer CLI, custom agents, or other MCP-compatible agents.

**Enterprise Deployment**  
Centrally hosted agents can connect to multiple services, with OpenSearch as one component. The agent supports OAuth and enterprise authentication systems, and scales to support multiple users and use cases.

### Example using the Strands Agents framework


```python
import os

import requests
from strands import Agent
from strands.tools.mcp import MCPClient
from mcp.client.streamable_http import streamablehttp_client

def get_bearer_token(discovery_url: str, client_id: str, client_secret: str):
    """Fetch an OAuth access token using the client credentials grant."""
    # Look up the token endpoint from the OAuth discovery document.
    response = requests.get(discovery_url)
    response.raise_for_status()
    token_endpoint = response.json()['token_endpoint']

    data = {
        'grant_type': 'client_credentials',
        'client_id': client_id,
        'client_secret': client_secret
    }
    headers = {
        'Content-Type': 'application/x-www-form-urlencoded'
    }

    response = requests.post(token_endpoint, data=data, headers=headers)
    response.raise_for_status()
    return response.json()['access_token']

if __name__ == "__main__":
    # Values come from your OAuth authorizer and CloudFormation stack outputs.
    discovery_url = os.environ["DISCOVERY_URL"]
    client_id = os.environ["CLIENT_ID"]
    client_secret = os.environ["CLIENT_SECRET"]
    mcp_url = os.environ["MCP_URL"]

    bearer_token = get_bearer_token(discovery_url, client_id, client_secret)

    # Connect to the MCP server over streamable HTTP with the bearer token.
    opensearch_mcp_client = MCPClient(lambda: streamablehttp_client(mcp_url, {
        "authorization": f"Bearer {bearer_token}",
        "Content-Type": "application/json"
    }))

    with opensearch_mcp_client:
        # Expose the server's tools to a Strands agent and run a prompt.
        tools = opensearch_mcp_client.list_tools_sync()
        agent = Agent(tools=tools)
        agent("list indices")
```

For more information, see [Hosting OpenSearch MCP Server with Amazon Bedrock AgentCore](https://opensearch.org/blog/hosting-opensearch-mcp-server-with-amazon-bedrock-agentcore/) on the *OpenSearch website*.

# Amazon SageMaker templates


The Amazon SageMaker CloudFormation templates define the AWS resources needed to set up the Neural Search plugin and semantic search for you. 

Begin by using the **Integration with text embedding models through Amazon SageMaker** template to deploy a text embedding model to a SageMaker Runtime endpoint. If you don't provide a model endpoint, CloudFormation creates an IAM role that allows SageMaker Runtime to download model artifacts from Amazon S3 and deploy them to the endpoint. If you provide an endpoint, CloudFormation creates an IAM role that allows the Lambda function to access the OpenSearch Service domain or, if the role already exists, updates and reuses it. The endpoint serves the remote model that the ML connector uses with the ML Commons plugin. 

Then, use the **Integration with Sparse Encoders through Amazon SageMaker** template to create a Lambda function that directs your domain to set up a remote inference connector. After the connector is created in OpenSearch Service, remote inference can run semantic search using the remote model in SageMaker Runtime. The template returns the model ID in your domain so that you can start searching.
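Once you have that model ID, a sparse neural query can reference it directly. The following sketch builds such a query body; the field name and model ID are placeholders.

```python
# Sketch: a neural_sparse query body using the model ID returned by the
# CloudFormation template. The index field name and ID are placeholders.

def build_sparse_query(model_id: str, query_text: str) -> dict:
    return {
        "query": {
            "neural_sparse": {
                "sparse_embedding": {
                    "query_text": query_text,
                    "model_id": model_id,
                }
            }
        }
    }

sparse = build_sparse_query("my-sparse-model-id", "wild west")
```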

**To use the Amazon SageMaker AI CloudFormation templates**

1. Open the [Amazon OpenSearch Service console](https://console.aws.amazon.com/aos/home).

1. In the left navigation pane, choose **Integrations**.

1. Under each of the Amazon SageMaker AI templates, choose **Configure domain**, **Configure public domain**.

1. Follow the prompt in the CloudFormation console to provision your stack and set up a model.

**Note**  
OpenSearch Service also provides a separate template to configure an Amazon VPC domain. If you use this template, you need to provide the Amazon VPC ID for the Lambda function.

## Remote inference for semantic highlighting templates


Semantic highlighting is an advanced search feature that enhances result relevance by analyzing the meaning and context of queries rather than relying solely on exact keyword matches. This capability uses ML models to evaluate the semantic similarity between search queries and document content, identifying and highlighting the most contextually relevant sentences or passages within documents.

Unlike traditional highlighting methods that focus on exact term matches, semantic highlighting uses AI models to assess each sentence with contextual information from both the query and the surrounding text. This means it can surface pertinent information even when the exact search terms aren't present in the highlighted passages, which is particularly valuable for AI-driven search implementations where semantic meaning matters more than literal word matching. For more information, see [Using semantic highlighting](https://docs.opensearch.org/latest/tutorials/vector-search/semantic-highlighting-tutorial/) in the OpenSearch documentation.
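To make the feature concrete, the following sketch builds a search body that pairs an ordinary match query with a semantic highlighter. The field name and the highlighting model ID are placeholders, and the exact highlight options for your OpenSearch version are described in the tutorial linked above.

```python
# Sketch: a search body combining a match query with semantic highlighting.
# Field name and model ID are placeholders; the model ID is the one the
# semantic highlighting CloudFormation stack returns.

def build_highlight_query(highlight_model_id: str, query_text: str) -> dict:
    return {
        "query": {"match": {"text": query_text}},
        "highlight": {
            "fields": {"text": {"type": "semantic"}},
            "options": {"model_id": highlight_model_id},
        },
    }

hl = build_highlight_query(
    "my-highlight-model-id", "treatments for neurodegenerative diseases"
)
```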

Use the following procedure to open and run a CloudFormation template that automatically configures Amazon SageMaker models for semantic highlighting.

**To use the semantic highlighting CloudFormation template**

1. Open the Amazon OpenSearch Service console at [https://console.aws.amazon.com/aos/home](https://console.aws.amazon.com/aos/home).

1. In the left navigation, choose **Integrations**.

1. Under **Enable Semantic Highlighting through Amazon SageMaker integration**, choose **Configure domain**, **Configure public domain**.

1. Follow the prompt to set up your model.

**Note**  
OpenSearch Service also provides a separate template to configure an Amazon VPC domain. If you use this template, you need to provide the Amazon VPC ID for the Lambda function.