

# Vector search


Vector search in Amazon OpenSearch Service enables you to search for semantically similar content using machine learning embeddings rather than traditional keyword matching. Vector search converts your data (text, images, audio, etc.) into high-dimensional numerical vectors (embeddings) that capture the semantic meaning of the content. When you perform a search, OpenSearch compares the vector representation of your query against the stored vectors to find the most similar items.

Vector search includes the following key components. 

**Vector fields**  
OpenSearch supports the `knn_vector` field type to store dense vectors with configurable dimensions (up to 16,000).

**Search methods**  
+ **k-NN (k-nearest neighbors)**: Finds the k most similar vectors
+ **Approximate k-NN**: Uses algorithms like HNSW (Hierarchical Navigable Small World) for faster searches on large datasets

**Distance metrics**  
Supports various similarity calculations including:  
+ Euclidean distance
+ Cosine similarity
+ Dot product
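For intuition, these three similarity calculations can be written in a few lines of plain Python. This is purely illustrative; OpenSearch computes distances internally, so you never call these yourself.

```python
import math

# Illustrative implementations of the three distance metrics listed above.
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    # Ranges from -1 (opposite) to 1 (identical direction).
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
```

A higher cosine similarity or dot product means more similar vectors, while a lower Euclidean distance does.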

**Common use cases**  
Vector search supports the following common use cases.
+ **Semantic search**: Find documents with similar meaning, not just matching keywords
+ **Recommendation systems**: Suggest similar products, content, or users
+ **Image search**: Find visually similar images
+ **Anomaly detection**: Identify outliers in data patterns
+ **RAG (Retrieval Augmented Generation)**: Enhance LLM responses with relevant context

**Integration with machine learning**  
OpenSearch integrates with the following machine learning services and models:
+ **Amazon Bedrock**: For generating embeddings using foundation models
+ **Amazon SageMaker AI**: For custom ML model deployment
+ **Hugging Face models**: Pre-trained embedding models
+ **Custom models**: Your own trained embedding models

With vector search, you can build sophisticated AI-powered applications that understand context and meaning, going far beyond traditional text matching capabilities.

# Import from Amazon S3 Vectors to OpenSearch Serverless


Amazon S3 Vectors delivers the first cloud object store with native support to store and query vectors. S3 Vectors provides cost-effective, elastic, and durable vector storage that can be queried based on semantic meaning and similarity. It delivers sub-second query response times and up to 90% lower costs for uploading, storing, and querying vectors.

Amazon S3 Vectors introduces S3 vector buckets, which you can use to store, access, and query vector data without provisioning any infrastructure. Inside a vector bucket, you can organize your vector data within vector indexes. Your vector bucket can have multiple vector indexes, and each vector index can hold millions of vectors. For more information, see [Working with Amazon S3 Vectors and vector buckets](https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-vectors.html) in the *Amazon S3 User Guide*. 

Each vector consists of:
+ A unique key
+ Vector data
+ Optional metadata in JSON format

Vector indexes support Euclidean and Cosine distance functions for similarity search operations.
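As a sketch, the vector structure above (key, vector data, optional JSON metadata) maps naturally onto a `PutVectors`-style request. The boto3 client name (`s3vectors`) and the parameter names below are assumptions based on the S3 Vectors API shape and may differ in your SDK version, so the actual call is left commented out.

```python
def build_put_vectors_payload(bucket, index, items):
    """Build a PutVectors-style payload from (key, values, metadata) tuples.

    Field names (vectorBucketName, indexName, key, data.float32, metadata)
    are illustrative assumptions, not verified SDK parameters.
    """
    return {
        "vectorBucketName": bucket,
        "indexName": index,
        "vectors": [
            {"key": key, "data": {"float32": values}, "metadata": metadata}
            for key, values, metadata in items
        ],
    }

payload = build_put_vectors_payload(
    "example-bucket", "example-index",
    [("doc-1", [0.12, 0.98, 0.43], {"genre": "scifi"})],
)
# import boto3
# s3vectors = boto3.client("s3vectors")
# s3vectors.put_vectors(**payload)  # verify the exact call in your SDK version
```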

**Note**  
The primary advantage of vector buckets is their ability to store massive datasets at extremely low cost while providing direct API access for vector operations.

For more information about Amazon S3 vector buckets, including how to create one, see [Working with Amazon S3 Vectors and vector buckets](https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-vectors.html) in the *Amazon S3 User Guide*. For more information about integration with OpenSearch Service beyond what's described in this topic, see [Using S3 Vectors with OpenSearch Service](https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-vectors-integrating-opensource.html) in the *Amazon S3 User Guide*.

You can use S3 Vectors with Amazon OpenSearch Service to lower the cost of vector storage when queries are less frequent, and then quickly move those datasets to OpenSearch as demands increase or to enhance search capabilities. 

OpenSearch Service integrates with Amazon S3 Vectors to provide enhanced performance and functionality beyond what Amazon S3 vector buckets offer by themselves. Consider this integration when you need:
+ Higher query throughput
+ Sub-second search latency
+ Advanced analytics capabilities like aggregations
+ Hybrid search combining text and vector data

This integration is particularly useful when multiple applications consume the same vector data with different performance requirements. You can have some applications interact directly with Amazon S3 vector buckets for cost-sensitive use cases, while others leverage OpenSearch integration for performance-critical operations.

## Integration architecture


The integration uses Amazon OpenSearch Ingestion (OSI) as the data pipeline between Amazon S3 vector indexes and Amazon OpenSearch Serverless vector collections. OpenSearch Ingestion automatically exports vector data from your specified vector index and ingests it into OpenSearch Serverless vector collections for high-performance search operations.

**Note**  
After export, your data is still present in the S3 vector index. You have two copies of the data.

Each vector index maps to a corresponding index in the OpenSearch Serverless collection. The integration:
+ Preserves vector dimensions
+ Retains metadata
+ Optimizes data structure for OpenSearch's vector search capabilities

After configuration, OpenSearch Ingestion begins the data export process by consuming vectors from the specified vector index using the Amazon S3 ListVectors API. The service processes vectors in parallel to optimize ingestion speed while respecting the scaling limits of both OpenSearch Ingestion and Amazon OpenSearch Serverless.
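The export loop can be pictured as token-based pagination over a ListVectors-style API. This is an illustrative sketch of the pattern, not OpenSearch Ingestion's actual implementation; the `vectors` and `nextToken` field names are assumptions.

```python
def read_all_vectors(list_page):
    """Drain a paginated ListVectors-style API.

    `list_page` is any callable that takes an optional continuation token
    and returns a dict with a "vectors" list and an optional "nextToken".
    """
    vectors, token = [], None
    while True:
        page = list_page(token)
        vectors.extend(page.get("vectors", []))
        token = page.get("nextToken")
        if not token:
            return vectors
```

In practice, OpenSearch Ingestion parallelizes this consumption across workers rather than reading pages sequentially.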

During ingestion, the service:
+ Transforms vector data to match the expected format for OpenSearch Service
+ Preserves essential information including vector values, metadata, and distance metrics
+ Handles failure scenarios through intelligent retry mechanisms
+ Places problematic records in an Amazon S3 bucket used as a dead letter queue for later analysis

The integration handles massive datasets efficiently, with performance depending on vector dimensions, dataset size, and configured scaling limits. OSI can scale up to 16 workers per pipeline, while OpenSearch Serverless automatically adjusts capacity based on ingestion demands. By default, OpenSearch increases the `maxSearch` OpenSearch Compute Unit (OCU) limit on the OpenSearch Serverless side to 100.

**Note**  
The integration prioritizes cost efficiency through:  
+ Automatic pipeline shutdown after export completion
+ OpenSearch Serverless collection scaling
+ Pay-per-use resource model

## Required IAM permissions


The integration requires careful configuration of IAM permissions to enable secure communication between services. OpenSearch Ingestion needs permissions to read from Amazon S3 vector indexes, write to OpenSearch Service vector collections, and manage associated security policies.

When you enable integration using the procedure later in this topic, you can choose one of the following options for permissions management:
+ Allow the system to automatically create a service role with the necessary permissions
+ Provide an existing role that meets the requirements

The automatically created role includes policies for:
+ Accessing Amazon S3 vector index APIs
+ Managing OpenSearch Service collection operations
+ Handling dead letter queue operations for failed ingestion attempts

If you choose to specify an existing role, verify that the role has the following IAM permissions:

**(Required)**: Data pipeline permissions between OpenSearch Ingestion and OpenSearch Serverless

------
#### [ JSON ]


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "allowAPIs",
            "Effect": "Allow",
            "Action": [ "aoss:APIAccessAll", "aoss:BatchGetCollection" ],
            "Resource": [ "arn:aws:aoss:*:111122223333:collection/collection-id" ]
        },
        {
            "Sid": "allowSecurityPolicy",
            "Effect": "Allow",
            "Action": [
                "aoss:CreateSecurityPolicy",
                "aoss:UpdateSecurityPolicy",
                "aoss:GetSecurityPolicy"
            ],
            "Resource": "*",
            "Condition":{
               "StringLike":{
                  "aoss:collection": [ "collection-name" ]
                },
               "StringEquals": {
                  "aws:ResourceAccount": [ "111122223333" ]
               }
            }
        }
    ]
}
```

------

**(Required)**: Data ingestion permissions between OpenSearch Ingestion and Amazon S3 dead-letter queue

------
#### [ JSON ]


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "s3Access",
            "Effect": "Allow",
            "Action": [
              "s3:PutObject"
            ],
            "Resource": [ "arn:aws:s3:::bucket/*" ]
        }
    ]
}
```

------

**(Required)**: Data ingestion permissions between OpenSearch Ingestion and Amazon S3 Vectors

------
#### [ JSON ]


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowS3VectorIndexAccess",
            "Effect": "Allow",
            "Action": [
               "s3vectors:ListVectors",
               "s3vectors:GetVectors"
            ],
            "Resource": [
                "arn:aws:s3vectors:us-east-1:111122223333:bucket/bucket-name/index/index-name"
            ]
        }
    ]
}
```

------

**(Required if AWS KMS encryption is enabled)**: Decryption permissions for communication between OpenSearch Ingestion and Amazon S3 Vectors

------
#### [ JSON ]


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "allowS3VectorDecryptionOfCustomManagedKey",
            "Effect": "Allow",
            "Action": [
                "kms:Decrypt"
            ],
            "Resource": [ "arn:aws:kms:us-east-1:111122223333:key/key-id" ],
            "Condition": {
                "StringEquals": {
                    "kms:ViaService": "s3vectors.us-east-1.amazonaws.com",
                    "kms:EncryptionContext:aws:s3vectors:arn": [
                        "arn:aws:s3vectors:us-east-1:111122223333:bucket/example-bucket",
                        "arn:aws:s3vectors:us-east-1:111122223333:bucket/example-bucket/index/example-index"
                        ]
                 }
             }
        }
    ]
}
```

------

## Configuring Amazon S3 Vectors integration with OpenSearch


Use the following procedure to configure Amazon S3 Vectors integration with OpenSearch Serverless.

**Note**  
If you started the process of configuring integration from the Amazon S3 console by choosing the **Export to OpenSearch** option in the **Vector buckets** page, some of the steps in the following procedure aren't applicable, as noted in the procedure.

**To configure Amazon S3 Vectors integration with OpenSearch Serverless**

1. Open the **Import S3 vector index to OpenSearch vector engine** page in the Amazon OpenSearch Service console. The page displays automatically if you chose **Export to OpenSearch** in the Amazon S3 console. If you're starting in the OpenSearch console, choose **Integration** in the left navigation and then choose **Import S3 vector index**.

1. In the **Source** section, if you started in the Amazon S3 console, verify that the name of vector index and its Amazon Resource Name (ARN) are already specified. If you started in the OpenSearch console, enter the index ARN in the **S3 vector index ARN** field.

1. In the **Service access** section, choose an option. If you choose an existing role, verify it has all required permissions for integration as described in [Required IAM permissions](#vector-search-iam-permissions).

1. (Optional) Expand **Additional settings**. For **Enable redundancy (active replicas)** we recommend leaving this option selected for production environments. When you create your first collection, OpenSearch Serverless instantiates two OCUs—one for indexing and one for search. To ensure high availability, it also launches a standby set of nodes in another Availability Zone. For development and testing purposes, you can disable the **Enable redundancy** setting for a collection, which eliminates the two standby replicas and only instantiates two OCUs. By default, the redundant active replicas are enabled, which means that a total of four OCUs are instantiated for the first collection in an account.

   For **Add customer-managed AWS KMS key for Amazon OpenSearch Serverless vector**, choose this option to encrypt data in the vector collection using a customer managed key. By default, OpenSearch uses an AWS managed key.

1. If you started this process by choosing the **Export to OpenSearch** option in the Amazon S3 console, the **Export details** section lists the steps OpenSearch will take next. When you're ready, choose **Export**.

   If you started this process in the OpenSearch Service console, the **Import details** section lists the steps OpenSearch will take next. When you're ready, choose **Import**.

   OpenSearch opens the history page to display all exports/imports of Amazon S3 vector indexes to OpenSearch Serverless indexes.

After successful ingestion, OSI automatically stops the pipeline to prevent unnecessary costs while maintaining exported data in OpenSearch. You can monitor integration progress through CloudWatch metrics and access detailed logs for troubleshooting.

The OpenSearch collection remains active and available for queries after initial ingestion is completed. You can perform:
+ Similarity searches
+ Aggregations
+ Analytics operations

# Advanced search capabilities with an Amazon S3 vector engine


Amazon OpenSearch Service offers the ability to use Amazon S3 as a vector engine for vector indexes. This feature allows you to offload vector data to Amazon S3 while maintaining sub-second vector search capabilities at low cost.

With this feature, OpenSearch stores vector embeddings in an Amazon S3 vector index while keeping other document fields in the OpenSearch cluster's storage. This architecture offers the following benefits:
+ **Durability**: Data written to S3 Vectors is stored on S3, which is designed for 11 9s of data durability.
+ **Scalability**: Offload large vector datasets to S3 without consuming cluster storage.
+ **Cost-effectiveness**: Optimize storage costs for vector-heavy workloads.

OpenSearch has the following requirements for using S3 vector indexes:
+ OpenSearch version 2.19 or later
+ OpenSearch Optimized instances
+ Latest patch version for your OpenSearch release

## Enabling S3 Vectors


When [creating a new domain](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/createupdatedomains.html) or updating an existing domain, you can choose **Enable S3 Vectors as an engine** in the **Advanced features** section. This setting allows OpenSearch to create an S3 vector bucket when you use S3 Vectors as your engine. When you enable this option, OpenSearch configures S3 Vectors for your domain by:

1. Creating two new grants on the AWS KMS key configured with your domain:
   + A grant for the S3 Vectors background indexing jobs with decrypt privileges
   + A grant for OpenSearch to create S3 vectors buckets with `GenerateDataKey` permissions

1. Configuring the KMS key used by your OpenSearch domain as the customer managed key (CMK) for encryption at rest of all vector index data.

## Creating indexes with S3 vector engine


After you configure a domain, you can create one or more k-NN indexes with fields using `s3vector` as the backend vector engine in the index mappings. You can configure different vector fields with different engine types based on your use case.

**Important**  
You can only use the `s3vector` engine in mapping a field definition during index creation. You can't add or update the mapping with `s3vector` engine after index creation.

Here are some examples that create S3 vector engine indexes.

**Example: Creating a k-NN index with S3 vector engine**

```
PUT my-first-s3vector-index
{
  "settings": {
    "index": {
      "knn": true
    }
  },
  "mappings": {
    "properties": {
        "my_vector_1": {
          "type": "knn_vector",
          "dimension": 2,
          "space_type": "l2",
          "method": {
            "engine": "s3vector"
          }
        },
        "price": {
          "type": "float"
        }
    }
  }
}
```

**Example: Creating a k-NN index with both S3 vector and FAISS engines**

This example highlights that you can use multiple vector engines within the same index.

```
PUT my-vector-index
{
  "settings": {
    "index": {
      "knn": true
    }
  },
  "mappings": {
    "properties": {
        "my_vector_1": {
          "type": "knn_vector",
          "dimension": 2,
          "space_type": "l2",
          "method": {
            "engine": "s3vector"
          }
        },
        "price": {
          "type": "float"
        },
        "my_vector_2": {
            "type": "knn_vector",
            "dimension": 2,
            "space_type": "cosine",
            "method": {
                "name": "hnsw",
                "engine": "faiss",
                "parameters": {
                    "ef_construction": 128,
                    "m": 24
                }
            }
        }
    }
  }
}
```

**Unsupported example: Adding S3 vector engine after index creation**

The following approach is not supported and will fail.

```
PUT my-first-s3vector-index
{
  "settings": {
    "index": {
      "knn": true
    }
  }
}

PUT my-first-s3vector-index/_mapping
{
  "properties": {
        "my_vector_1": {
          "type": "knn_vector",
          "dimension": 2,
          "space_type": "l2",
          "method": {
            "engine": "s3vector"
          }
        },
        "price": {
          "type": "float"
        }
    }
}
```

## Functional limitations


Consider the following limitations before using `s3vector` engine in an index:


**Features and behaviors not supported with s3vector engine**  

| Feature | Behavior | 
| --- | --- | 
| Split/Shrink/Clone index | These APIs fail when used with an index configured with `s3vector` engine in `knn_vector` field. | 
| Snapshots |  Indices using `s3vector` engine don't support snapshots. While snapshots aren't supported for point-in-time recovery, the `s3vector` engine, along with OpenSearch Optimized instances, provides 11 nines of durability.  | 
| UltraWarm tier | Indices configured with `s3vector` engine can't migrate to UltraWarm tier. | 
| Cross-cluster replication | Indices configured with `s3vector` engine don't support cross-cluster replication. | 
| Accidental delete protection |  Because snapshots aren't supported for indices using `s3vector` engine, accidental delete protection isn't available. You can still restore other indices in the domain.  | 
| Radial search | Queries with radial search aren't supported on fields using `s3vector` engine. | 

## Indexing documents


After creating an index with S3 vector engine, you can ingest documents using the standard `_bulk` API. OpenSearch automatically offloads vector data of `knn_vector` fields using the `s3vector` engine to the S3 vector index in real time. Data belonging to other fields or `knn_vector` fields using different engines will be persisted by OpenSearch in its own storage layer.

For all bulk requests that are acknowledged, OpenSearch guarantees that all data (vector and non-vector) is durable. If a request receives a negative acknowledgment, there are no guarantees about the durability of the documents in that bulk request. In these rare cases, retry the request, preferably after first deleting the documents from the failed request by document ID, to avoid duplicate documents.

**Example bulk indexing**

```
POST _bulk
{ "index": { "_index": "my-first-s3vector-index", "_id": "1" } }
{ "my_vector_1": [1.5, 2.5], "price": 12.2 }
{ "index": { "_index": "my-first-s3vector-index", "_id": "2" } }
{ "my_vector_1": [2.5, 3.5], "price": 7.1 }
{ "index": { "_index": "my-first-s3vector-index", "_id": "3" } }
{ "my_vector_1": [3.5, 4.5], "price": 12.9 }
{ "index": { "_index": "my-first-s3vector-index", "_id": "4" } }
{ "my_vector_1": [5.5, 6.5], "price": 1.2 }
{ "index": { "_index": "my-first-s3vector-index", "_id": "5" } }
{ "my_vector_1": [4.5, 5.5], "price": 3.7 }
```
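The retry guidance above can be sketched as a small helper that inspects the standard `_bulk` response and collects the IDs of failed items so you can delete and retry them. This is an illustrative sketch, not an official SDK helper; it assumes the standard bulk reply shape (`{"errors": bool, "items": [{"index": {"_id": ..., "status": ...}}]}`).

```python
def failed_doc_ids(bulk_response):
    """Return the _id values of items that failed in a _bulk response.

    Delete these documents, then retry the bulk request, to avoid
    duplicates after a negatively acknowledged request.
    """
    if not bulk_response.get("errors"):
        return []
    ids = []
    for item in bulk_response.get("items", []):
        # Each item is keyed by its action type (e.g. "index", "create").
        for action in item.values():
            if action.get("status", 200) >= 300:
                ids.append(action.get("_id"))
    return ids
```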

## Searching documents


You can search your index using the standard `_search` API to execute text, k-NN, or hybrid queries. For queries on `knn_vector` fields configured with `s3vector` engine, OpenSearch automatically offloads the query to the corresponding S3 vectors index.

**Note**  
With `s3vector` engine, k-NN search queries support a maximum `k` value of 100. This means a maximum of 100 nearest neighbors can be returned in the search results.

**Example search query**

```
GET my-first-s3vector-index/_search
{
  "size": 2,
  "query": {
    "knn": {
      "my_vector_1": {
        "vector": [2.5, 3.5],
        "k": 2
      }
    }
  }
}
```

You can run filtered vector search on an OpenSearch k-NN index using the `s3vector` engine. OpenSearch applies the filter as a post filter and uses an oversampling mechanism, based on heuristics, to balance recall against latency.

**Example search query with filter:**

```
GET my-index/_search
{
  "size": 10,
  "query": {
    "knn": {
      "my_vector_field": {
        "vector": [2.5, 3.5, 1.2, 4.8],
        "k": 10,
        "filter": {
          "range": {
            "price": {
              "gte": 10,
              "lte": 100
            }
          }
        }
      }
    }
  }
}
```

## Supported mapping parameters


With `s3vector` engine, the `knn_vector` field supports the following parameters in the mappings.


**Vector field parameters**  

| Parameter | Required | Description | Supported values | 
| --- | --- | --- | --- | 
| type | Yes | The type of field present in the document. | knn_vector | 
| dimension | Yes | The dimension of each vector that will be ingested into the index. | >0, <=4096 | 
| space_type | No | The vector space used to calculate the distance between vectors. | l2, cosinesimil | 
| method.engine | Yes | The approximate k-NN engine to use for indexing and search. | s3vector | 
| method.name | No | The nearest neighbor method. | "" | 
| store | N/A | Enabling or disabling this mapping parameter is a no-op because knn_vector data is not stored in OpenSearch. | Not supported | 
| doc_values | N/A | Enabling or disabling this mapping parameter is a no-op because knn_vector data is not stored in OpenSearch. | Not supported | 

**Important**  
Nested `knn_vector` field types aren't supported with the `s3vector` engine.

## Metering and billing


For information about metering and billing for this feature, see [Amazon OpenSearch Service pricing](https://aws.amazon.com/opensearch-service/pricing/).

## Disabling the s3vector engine


Before you disable the `s3vector` engine, delete *all* indexes that are currently using it. If you don't, any attempt to disable the engine fails.

Also note that enabling or disabling the `s3vector` engine triggers a [blue/green deployment](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/managedomains-configuration-changes.html) on your domain.

To disable the `s3vector` engine, [edit your domain configuration](https://docs.aws.amazon.com/cli/latest/reference/opensearch/update-domain-config.html) and set `S3VectorsEngine.Enabled: false`.

# k-Nearest Neighbor (k-NN) search in Amazon OpenSearch Service

Short for its associated *k-nearest neighbors* algorithm, k-NN for Amazon OpenSearch Service lets you search for points in a vector space and find the "nearest neighbors" for those points by Euclidean distance or cosine similarity. Use cases include recommendations (for example, an "other songs you might like" feature in a music application), image recognition, and fraud detection.

**Note**  
This documentation provides a brief overview of the k-NN plugin, as well as limitations when using the plugin with managed OpenSearch Service. For comprehensive documentation of the k-NN plugin, including simple and complex examples, parameter references, and the complete API reference, see the open source [OpenSearch documentation](https://opensearch.org/docs/latest/search-plugins/knn/index/). The open source documentation also covers performance tuning and k-NN-specific cluster settings. 

## Getting started with k-NN


To use k-NN, you must create an index with the `index.knn` setting and add one or more fields of the `knn_vector` data type.

```
PUT my-index

{
  "settings": {
    "index.knn": true
  },
  "mappings": {
    "properties": {
      "my_vector1": {
        "type": "knn_vector",
        "dimension": 2
      },
      "my_vector2": {
        "type": "knn_vector",
        "dimension": 4
      }
    }
  }
}
```

The `knn_vector` data type supports a single list of up to 10,000 floats, with the number of floats defined by the required `dimension` parameter. After you create the index, add some data to it.

```
POST _bulk

{ "index": { "_index": "my-index", "_id": "1" } }
{ "my_vector1": [1.5, 2.5], "price": 12.2 }
{ "index": { "_index": "my-index", "_id": "2" } }
{ "my_vector1": [2.5, 3.5], "price": 7.1 }
{ "index": { "_index": "my-index", "_id": "3" } }
{ "my_vector1": [3.5, 4.5], "price": 12.9 }
{ "index": { "_index": "my-index", "_id": "4" } }
{ "my_vector1": [5.5, 6.5], "price": 1.2 }
{ "index": { "_index": "my-index", "_id": "5" } }
{ "my_vector1": [4.5, 5.5], "price": 3.7 }
{ "index": { "_index": "my-index", "_id": "6" } }
{ "my_vector2": [1.5, 5.5, 4.5, 6.4], "price": 10.3 }
{ "index": { "_index": "my-index", "_id": "7" } }
{ "my_vector2": [2.5, 3.5, 5.6, 6.7], "price": 5.5 }
{ "index": { "_index": "my-index", "_id": "8" } }
{ "my_vector2": [4.5, 5.5, 6.7, 3.7], "price": 4.4 }
{ "index": { "_index": "my-index", "_id": "9" } }
{ "my_vector2": [1.5, 5.5, 4.5, 6.4], "price": 8.9 }
```

Then you can search the data using the `knn` query type.

```
GET my-index/_search
{
  "size": 2,
  "query": {
    "knn": {
      "my_vector2": {
        "vector": [2, 3, 5, 6],
        "k": 2
      }
    }
  }
}
```

In this case, `k` is the number of neighbors you want the query to return, but you must also include the `size` option. Otherwise, you get `k` results for each shard (and each segment) rather than `k` results for the entire query. k-NN supports a maximum `k` value of 10,000.

If you mix the `knn` query with other clauses, you might receive fewer than `k` results. In this example, the `post_filter` clause reduces the number of results from 2 to 1.

```
GET my-index/_search

{
  "size": 2,
  "query": {
    "knn": {
      "my_vector2": {
        "vector": [2, 3, 5, 6],
        "k": 2
      }
    }
  },
  "post_filter": {
    "range": {
      "price": {
        "gte": 6,
        "lte": 10
      }
    }
  }
}
```

If you need to handle a large volume of queries while maintaining optimal performance, you can use the [Multi-search](https://opensearch.org/docs/latest/api-reference/multi-search/) API to construct a bulk search with JSON and send a single request to perform multiple searches:

```
GET _msearch

{ "index": "my-index"}
{ "query": { "knn": {"my_vector2":{"vector": [2, 3, 5, 6],"k":2 }} } }
{ "index": "my-index", "search_type": "dfs_query_then_fetch"}
{ "query": { "knn": {"my_vector1":{"vector": [2, 3],"k":2 }} } }
```

The following video demonstrates how to set up bulk vector searches for K-NN queries.

[![AWS Videos](http://img.youtube.com/vi/Umi67JCfCbU/0.jpg)](http://www.youtube.com/watch?v=Umi67JCfCbU)


## k-NN differences, tuning, and limitations


OpenSearch lets you modify all [k-NN settings](https://docs.opensearch.org/latest/vector-search/settings/) using the `_cluster/settings` API. On OpenSearch Service, you can change all settings except `knn.memory.circuit_breaker.enabled` and `knn.circuit_breaker.triggered`. k-NN statistics are included as [Amazon CloudWatch metrics](managedomains-cloudwatchmetrics.md).

In particular, check the `KNNGraphMemoryUsage` metric on each data node against the `knn.memory.circuit_breaker.limit` statistic and the available RAM for the instance type. OpenSearch Service uses half of an instance's RAM for the Java heap (up to a heap size of 32 GiB). By default, k-NN uses up to 50% of the remaining half, so an instance type with 32 GiB of RAM can accommodate 8 GiB of graphs (32 × 0.5 × 0.5). Performance can suffer if graph memory usage exceeds this value.
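A back-of-the-envelope sketch of the arithmetic above, assuming the half-of-RAM heap (capped at 32 GiB) and the default 50% k-NN share:

```python
def knn_graph_memory_gib(instance_ram_gib):
    # Java heap: half of the instance RAM, capped at 32 GiB.
    heap = min(instance_ram_gib / 2, 32)
    # By default, k-NN may use up to 50% of the memory left after the heap.
    return 0.5 * (instance_ram_gib - heap)
```

For a 32 GiB instance this yields the 8 GiB figure from the text; larger instances scale accordingly.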

You can migrate a k-NN index created on version 2.x or later to [UltraWarm](ultrawarm.md) or [cold storage](cold-storage.md) on a domain with version 2.17 or later.

The Clear Cache and Warmup APIs for k-NN indices are blocked for warm indices. When the first query is initiated against a warm index, OpenSearch downloads the graph files from Amazon S3 and loads them into memory. Similarly, when the TTL for the graphs expires, the files are automatically evicted from memory.

# Vector ingestion

Vector ingestion helps you quickly ingest data into, and index, OpenSearch domains and OpenSearch Serverless collections. The service examines your domain or collection and creates an ingestion pipeline on your behalf to load your data into OpenSearch. Vector ingestion manages the ingestion and indexing for your domain or collection.

You can accelerate and optimize the indexing process by enabling [GPU-acceleration for vector indexing](gpu-acceleration-vector-index.md) and [Auto-optimize](serverless-auto-optimize.md) features. With Vector ingestion, you don't need to manage the underlying infrastructure, patch software, or scale clusters to support your vector database indexing and ingestion. This allows you to quickly build your vector database to meet your needs.

## How it works


Vector ingestion examines your domain or collection and its indexes. You can manually configure your vector index fields or allow OpenSearch to use automatic configuration.

Vector ingestion uses OpenSearch Ingestion (OSI) as the data pipeline between Amazon S3 and OpenSearch. The service processes vectors in parallel to optimize ingestion speed while respecting the scaling limits of both OSI and OpenSearch.

## OpenSearch Vector ingestion pricing


At any specific time, you only pay for the number of vector ingestion OCUs that are allocated to a pipeline, regardless of whether there's data flowing through the pipeline. OpenSearch vector ingestion immediately accommodates your workloads by scaling pipeline capacity up or down based on usage.

For full pricing details, see [Amazon OpenSearch Service Pricing](https://aws.amazon.com/opensearch-service/pricing/).

## Prerequisites


Before using vector ingestion, ensure you have the following resources:
+ Amazon S3 bucket containing your OpenSearch JSON documents in Parquet or JSONL format
+ OpenSearch resource - either a domain or collection
+ OpenSearch version `2.19` or later (required for auto-optimize integration)
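As an example of the JSONL source format, each line of the file is one standalone JSON document. The field names below are illustrative, not required:

```python
import json

# Two example OpenSearch documents serialized as JSONL: one JSON object
# per line, no enclosing array. This is what vector ingestion reads from
# your Amazon S3 bucket when you choose the JSONL format.
docs = [
    {"my_vector": [1.5, 2.5], "price": 12.2},
    {"my_vector": [2.5, 3.5], "price": 7.1},
]
jsonl = "\n".join(json.dumps(d) for d in docs)

# Reading it back: parse each line independently.
parsed = [json.loads(line) for line in jsonl.splitlines()]
```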

## Create vector database


Use the vector ingestion job creation workflow to set up automated vector index tuning and accelerate large-scale index builds.

**Note**  
The procedural content in this section is subject to change as the user interface is finalized. The workflow may be updated in future releases to reflect the latest console experience.

**To create a vector ingestion job**

1. In the **Vector ingestion job details** section, for **Name**, enter a name for your ingestion job.

1. In the **Data source** section, configure the following:

   1. For **Amazon S3 URI**, enter the Amazon S3 bucket location containing your OpenSearch Service JSON documents.

   1. Choose **Browse Amazon S3** to select from available buckets, or choose **View** to preview the bucket contents.

   1. For **Content type**, select one of the following:
      + **Vectors** - Documents already contain vectors and don't require further vector embedding generation.
      + **Text, image, or audio** - Documents contain content such as text, images, or audio bytes that need to be encoded into vector embeddings.

1. In the **Data source permissions** section, configure access permissions:

   1. For **IAM role**, choose one of the following:
      + **Create a new role**
      + **Use an existing role**

   1. For **IAM role name**, enter a name for the role.

1. In the **Destination** section, configure the OpenSearch Service endpoint:

   1. For **Endpoint**, choose **Choose an option** to select from your compatible domains or collections in the current region.

   1. Choose **Next** to proceed with the selected endpoint.

1. Choose **Next** to continue to the next step, or choose **Cancel** to exit without saving.

## Related features


Vector ingestion works with the following Amazon OpenSearch Service features to optimize your vector database performance:

[GPU-acceleration for vector indexing](gpu-acceleration-vector-index.md)  
GPU-acceleration reduces the time needed to create, update, and delete vector indexes. When used with vector ingestion, you can significantly accelerate the ingestion and indexing process for large-scale vector databases.

[Auto-optimize](serverless-auto-optimize.md)  
Auto-optimize automatically discovers optimal trade-offs between search latency, quality, and memory requirements. Vector ingestion can apply auto-optimize recommendations during the ingestion process to ensure your vector indexes are optimally configured.

For best results, consider enabling both GPU-acceleration and Auto-optimize when using vector ingestion to build large-scale vector databases.

# Export Amazon S3 vector index to OpenSearch Service vector engine


You can perform a point-in-time export of your selected Amazon S3 vector index to OpenSearch Service. The OpenSearch Service vector engine provides a simple and scalable vector store with advanced search functionality.

**To export Amazon S3 vector index to OpenSearch Service vector engine**

1. In the **Source** section, verify the Amazon S3 vector index details:
   + **Amazon S3 vector index** - The name of your source index
   + **Amazon S3 vector index ARN** - The Amazon Resource Name of your index

1. In the **Service access** section, configure OpenSearch Service authorization:

   1. For **Choose a method to authorize OpenSearch Service**, select one of the following:
      + **Create and use a new service role**
      + **Use an existing service role**

   1. For **Service role name**, enter a name for the service role.
**Note**  
Service role name must be 1 to 64 characters. Valid characters are a-z, A-Z, 0-9, and periods (.).

   1. Choose **View permission details** to review the required permissions.

1. Expand **Additional settings - optional** to configure advanced options if needed.

1. In the **Export details** section, configure the following options:
   + **Automate OpenSearch Service vector collection creation** - OpenSearch Service collections are used to store vector data. Serverless compute capacity is measured in OpenSearch Compute Units (OCUs); by default, the maximum OCU capacity is 50.
   + **Automate IAM role creation for service access** - This role is used by OpenSearch Service to read the Amazon S3 vector index and write to the OpenSearch Service collection.
   + **Automate OpenSearch Service ingestion pipeline creation** - OpenSearch Service ingestion pipelines are used to ingest data. As a best practice, an Amazon S3 bucket is created as a dead-letter queue (DLQ) to capture and store failed events, enabling easy access for troubleshooting and analysis.

1. Choose **Export** to start the export process, or choose **Cancel** to exit without exporting.

# Import Amazon S3 vector namespace to OpenSearch Service vector engine


Analyzing your vector data with OpenSearch Service requires a one-time OpenSearch Service collection and IAM permission setup.

**To import Amazon S3 vector namespace to OpenSearch Service vector engine**

1. In the **Source** section, configure the Amazon S3 vector index:

   1. For **Amazon S3 vector index ARN**, enter the ARN of your Amazon S3 vector index.
**Note**  
Must be in format arn:aws:iam::account-id:vector-bucket-name/\$1:index

1. In the **Service access** section, configure OpenSearch Service authorization:

   1. For **Choose a method to authorize OpenSearch Service**, select one of the following:
      + **Create and use a new service role**
      + **Use an existing service role**

   1. For **Service role name**, enter a name for the service role.
**Note**  
Service role name must be 1 to 64 characters. Valid characters are a-z, A-Z, 0-9, and periods (.).

   1. Choose **View permission details** to review the required permissions.

1. Expand **Additional settings - optional** to configure advanced options if needed.

1. In the **Import steps** section, configure the following automation options:
   + **Automate OpenSearch Service vector collection creation** - OpenSearch Service collections are used to store vector data. Serverless compute capacity is measured in OpenSearch Compute Units (OCUs); by default, the maximum OCU capacity is 50.
   + **Automate IAM role creation for service access** - This role is used by OpenSearch Service to read the Amazon S3 vector index and write to the OpenSearch Service collection.
   + **Automate OpenSearch Service ingestion pipeline creation** - OpenSearch Service ingestion pipelines are used to ingest data. As a best practice, an Amazon S3 bucket is created as a dead-letter queue (DLQ) to capture and store failed events, enabling easy access for troubleshooting and analysis.

1. Choose **Import** to start the import process, or choose **Cancel** to exit without importing.

# View vector ingestion jobs and import history


Vector ingestion jobs create a pipeline for vectorizing data sets, automating vector index tuning and accelerating large-scale index builds.

**To view vector ingestion jobs**

1. In the **Vector ingestion jobs** section, view the summary information:
   + **Jobs** - Total number of ingestion jobs
   + Choose **Create vector database** to create a new ingestion job

1. In the **Amazon S3 vectors imports** section, view the import summary:
   + **Total imports** - Number of completed imports
   + Choose **Import Amazon S3 vectors** to start a new import

1. In the **Vector ingestion jobs** table, monitor active jobs with the following information:
   + **Name** - The job name
   + **Status** - Current job status (e.g., Active)
   + **Data source** - Source location (e.g., s3://location)
   + **Destination** - Target destination
   + **Last updated** - Most recent update timestamp

1. Use the search box to **Find vector ingestion job** to locate specific jobs.

1. To manage jobs, choose from the following actions:
   + Choose **Delete** to remove selected jobs
   + Choose **Create vector database** to create additional jobs

1. In the **Amazon S3 vectors import history** section, track import events:

   1. Use the **Date range** filter to specify a time period for import history.

   1. Use the **Status** dropdown to filter by import status (e.g., Any status).

   1. Use the search box to **Find imports by Amazon S3 vector index na...** to locate specific imports.

   1. View import details including:
      + **Import initiated on (UTC)** - When the import started
      + **Import status** - Current status (In progress, Complete, Failed, Partially complete)
      + **Amazon S3 vector index ARN** - Source index identifier
      + **OpenSearch Service vector collection** - Destination collection

1. Choose **Import Amazon S3 vectors** to start a new import process.

# Auto-optimize

Auto-optimize is a service that automates vector index optimizations, enabling users to balance search quality, speed, and cost without requiring weeks of manual expert tuning. It evaluates index configurations based on user-defined latency and recall requirements and generates optimization recommendations, so minimal expertise is required. Recommendations are typically delivered within 30-60 minutes.

Traditional vector index configuration requires significant expertise and experimentation to achieve optimal performance. Parameters like `ef_construction` (which controls index build quality), `m` (which determines the number of graph connections), `ef_search` (which controls HNSW search breadth), and quantization methods such as binary quantization (32x, 16x, 8x) and scalar quantization (4x) significantly impact both search accuracy and resource utilization. Auto-optimize uses hyperparameter optimization algorithms to discover index configurations that are uniquely optimal for your dataset within your defined latency and recall requirements.
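For reference, these are the same parameters that appear in a standard OpenSearch `knn_vector` index mapping. The following sketch builds such a mapping in Python; the field name, dimension, and parameter values are illustrative placeholders, not auto-optimize output:

```python
import json

# Illustrative HNSW index body for a knn_vector field. The values below
# are examples only -- auto-optimize recommends values tuned to your data.
index_body = {
    "settings": {
        "index.knn": True,
        "index.knn.algo_param.ef_search": 100,  # search-time candidate list size
    },
    "mappings": {
        "properties": {
            "train_data": {
                "type": "knn_vector",
                "dimension": 768,
                "method": {
                    "name": "hnsw",
                    "space_type": "l2",
                    "engine": "faiss",
                    "parameters": {
                        "ef_construction": 128,  # build-time graph quality
                        "m": 16,                 # graph connections per node
                    },
                },
            }
        }
    },
}

print(json.dumps(index_body, indent=2))
```

You would send this body in a `PUT /index-name` request; larger `ef_construction` and `m` values generally raise recall at the cost of build time and memory.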

## Benefits


Auto-optimize for OpenSearch provides the following benefits:
+ **Automated parameter tuning** - Eliminates manual experimentation with algorithm (HNSW), quantization, rescoring and engine parameters, saving time and reducing the learning curve for vector search optimization.
+ **Optimize search speed** - By default, OpenSearch is configured for in-memory performance. Auto-optimize discovers favorable trade-offs that improve search quality and reduce costs while maintaining acceptable search speed.
+ **Cost optimization** - Reduces cost by finding configurations that lower your index memory requirements while minimizing search quality and speed trade-offs.
+ **Optimize search quality** - Potentially deliver higher recall than default settings, or discover favorable trade-offs that deliver significant cost savings with minimal recall loss.

Auto-optimize works alongside other OpenSearch features such as [GPU-acceleration for vector indexing](gpu-acceleration-vector-index.md) to provide comprehensive performance optimization for vector search workloads.

## How it works


Auto-optimize operates through a job-based architecture that analyzes your vector data and provides optimization recommendations. Key points:
+ Users share their datasets in Parquet or JSONL format in an Amazon S3 bucket.
+ They configure serverless auto-optimize jobs by specifying acceptable recall and latency thresholds. More relaxed thresholds allow the service to discover more significant cost optimizations.
+ Auto-optimize jobs run on infrastructure fully managed by Amazon OpenSearch Service. Jobs don't consume resources on your domains or collections. Workers run in parallel to evaluate index configurations, and use sampling on large datasets to deliver results typically within 30-60 minutes.
+ Each job is billed on a predictable flat rate. For pricing information, see [Amazon OpenSearch Service Pricing](https://aws.amazon.com/opensearch-service/pricing/).

## Prerequisites

+ **Dataset format and permissions** - You must have your dataset available as one or more Parquet or JSONL files in an Amazon S3 bucket folder. For example:
  + Parquet: `s3://dataset-bucket-us-east-1/dataset_folder/first_half.parquet` and `s3://dataset-bucket-us-east-1/dataset_folder/second_half.parquet`
  + JSONL: `s3://dataset-bucket-us-east-1/dataset_folder/data.jsonl`

  Provide the enclosing folder URI (for example, `s3://dataset-bucket-us-east-1/dataset_folder/`). The folder must contain files of a single format — do not mix Parquet and JSONL files in the same folder. This dataset will be used to generate the recommendations. Ensure that your federated role has the following Amazon S3 permissions on that resource: `"s3:Get*", "s3:List*", "s3:Describe*"`.
+ **Specify correct dataset metadata** - The provided dataset must contain rows of float values. The name of each vector column and the dimensionality of each vector must match the options provided in the console. For example, if the dataset contains vectors named `train_data` that each have `768` dimensions, these values must match what you enter in the auto-optimize console.
+ **(If using vector ingestion feature)** - If you plan to utilize the ingestion feature (taking auto-optimize recommendations to automatically create index and ingest data), you must configure your OpenSearch cluster to give auto-optimize permission to ingest your dataset into the OpenSearch cluster. For OpenSearch domains with a domain access policy, grant the newly created role access through that policy. For OpenSearch domains with fine-grained access control, add the pipeline role as a backend role. For OpenSearch Serverless collections, add the pipeline role to the data access policy.
+ **IAM permissions** - You need the following IAM permissions to use auto-optimize:
  + `opensearch:SubmitAutoOptimizeJob`
  + `opensearch:GetAutoOptimizeJob`
  + `opensearch:DeleteAutoOptimizeJob`
  + `opensearch:CancelAutoOptimizeJob`
  + `opensearch:ListAutoOptimizeJobs`
**Note**  
These are identity-based policies. Auto-optimize does not support resource-based policies.
+ **Credential expiry** - Configure your federated user session to have a minimum credential expiry of at least 1 hour. For very large datasets or high dimensions, consider increasing the expiration duration up to 3 hours.
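Because a mismatch between the metadata you enter and the actual dataset causes job failures, it can help to spot-check a JSONL file before submitting a job. The following stdlib-only sketch (the file path, field name, and dimension are assumptions you should replace with your own) verifies that every row has the expected vector field and dimensionality:

```python
import json

def check_jsonl(path: str, field: str, dimension: int) -> int:
    """Return the row count after verifying each row's vector field and dimension."""
    rows = 0
    with open(path) as f:
        for i, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue
            rec = json.loads(line)
            vec = rec.get(field)
            if not isinstance(vec, list) or len(vec) != dimension:
                raise ValueError(f"row {i}: expected a {dimension}-dimension list in '{field}'")
            if not all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in vec):
                raise ValueError(f"row {i}: non-numeric value in '{field}'")
            rows += 1
    return rows

# Hypothetical sample file; point this at a local copy of your own dataset.
with open("sample.jsonl", "w") as f:
    f.write('{"id": 1, "train_data": [0.12, 0.45, 0.78]}\n')
    f.write('{"id": 2, "train_data": [0.56, 0.89, 0.12]}\n')

print(check_jsonl("sample.jsonl", "train_data", 3))  # 2
```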

## Use cases for auto-optimize


Auto-optimize is particularly valuable in the following scenarios:

**Initial configuration optimization**  
When first deploying vector search applications, determining optimal HNSW parameters often requires extensive testing and domain expertise. Auto-optimize eliminates this trial-and-error process by analyzing your data and workload characteristics to recommend production-ready configurations.

This use case is ideal for teams new to vector search or those migrating from other vector database platforms who need to establish baseline configurations quickly.

**Scaling optimization**  
As your vector dataset grows from thousands to millions of vectors, parameters that worked well initially may become suboptimal. Auto-optimize recommends adjustments to maintain performance at scale.

**Cost reduction**  
Vector indexes can consume significant compute and storage resources, especially with high-dimensional embeddings. Auto-optimize identifies opportunities to reduce costs by finding more efficient parameter configurations that maintain your required performance levels while using fewer resources.

For example, auto-optimize might discover that your current `m` value (graph connectedness) is higher than necessary for your accuracy requirements, allowing you to reduce indexing time and storage without impacting search quality.

**Performance troubleshooting**  
When experiencing slow query performance or high latency in vector search operations, auto-optimize can analyze your dataset and identify a more optimal configuration. The service provides specific recommendations to address performance bottlenecks, such as adjusting graph connectivity or search parameters.

## Limitations

+ **Regional availability** - Auto-optimize is available only in the following AWS Regions:
  + ap-south-1
  + eu-west-1
  + us-west-2
  + us-east-2
  + us-east-1
  + eu-central-1
  + ap-southeast-2
  + ap-northeast-1
  + ap-southeast-1
+ **Collection types** - Auto-optimize is supported only for vector search collections and OpenSearch domains (versions 2.19, 3.1, and 3.3).
+ **Engine support** - Engine support varies by deployment type. [See the AWS documentation website for more details](http://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-auto-optimize.html)
+ **Algorithm support** - Auto-optimize supports only HNSW-based vector indexes.
+ **Concurrent jobs** - You can run up to 10 concurrent optimization jobs per account per Region. New jobs aren't accepted after you reach this limit.
+ **Job duration** - Optimization jobs can take from 15 minutes to several hours depending on dataset size, dimension, and required performance metrics.
+ **Recommendations** - Auto-optimize provides up to three recommendations.
+ **Dataset**
  + Supported formats: Parquet, JSONL
  + Data store: Amazon S3

## Billing and costs


Auto-optimize uses a per-job pricing model where you pay for each successful optimization job, irrespective of dataset size and optimization configuration. You aren't charged for failed or canceled jobs. Additionally, auto-optimize runs on infrastructure separate from managed or serverless OpenSearch clusters, so it doesn't affect the resource utilization of existing clusters.

**Pricing model**  
Auto-optimize costs are billed separately from standard OpenSearch Serverless or OpenSearch Managed domain compute and storage costs.

For pricing information, see [Amazon OpenSearch Service Pricing](https://aws.amazon.com/opensearch-service/pricing/).

## Supported data formats


Auto-optimize supports the following data formats for vector datasets stored in Amazon S3:

### Parquet format


Parquet is a columnar storage format optimized for analytical workloads. Each Parquet file should contain a column of float arrays representing your vector data.

Example Parquet file structure (viewed as a table):

```
| id  | train_data                     |
|-----|--------------------------------|
| 1   | [0.12, 0.45, 0.78, ..., 0.33] |
| 2   | [0.56, 0.89, 0.12, ..., 0.67] |
| 3   | [0.34, 0.67, 0.90, ..., 0.11] |
```

### JSONL format


JSONL (JSON Lines) is a text format where each line is a valid JSON object. Each line should contain a field with a float array representing your vector data.

Example JSONL file:

```
{"id": 1, "train_data": [0.12, 0.45, 0.78, 0.33]}
{"id": 2, "train_data": [0.56, 0.89, 0.12, 0.67]}
{"id": 3, "train_data": [0.34, 0.67, 0.90, 0.11]}
```

### Converting between formats


If your data is in a different format, you can use the following Python scripts to convert it.

**Convert JSON or JSONL to Parquet**  


```
#!/usr/bin/env python3
import json
import pyarrow as pa
import pyarrow.parquet as pq
from pathlib import Path
from typing import Any, Dict, List


def load_json_any(path: Path) -> List[Dict[str, Any]]:
    """
    Load JSON that can be:
      - a list of objects
      - a single object
      - JSON Lines (one object per line)
    Returns list[dict].
    """
    text = path.read_text().strip()

    # Try full JSON file
    try:
        obj = json.loads(text)
        if isinstance(obj, list):
            return obj
        if isinstance(obj, dict):
            return [obj]
    except json.JSONDecodeError:
        pass

    # Fallback → JSON Lines
    records = []
    for i, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        try:
            rec = json.loads(line)
        except json.JSONDecodeError as e:
            raise ValueError(f"Invalid JSON on line {i}: {e}")
        if not isinstance(rec, dict):
            raise ValueError(f"Line {i} must contain a JSON object")
        records.append(rec)

    return records


def json_to_parquet(json_path: str, parquet_path: str, compression: str = "snappy"):
    """Convert ANY JSON to Parquet (schema inferred)."""
    records = load_json_any(Path(json_path))
    table = pa.Table.from_pylist(records)
    pq.write_table(table, parquet_path, compression=compression)
    print(f"Wrote {len(records)} rows to {parquet_path}")


if __name__ == "__main__":
    INPUT_JSON = "vectors.jsonl"
    OUTPUT_PARQUET = "vectors.parquet"
    json_to_parquet(INPUT_JSON, OUTPUT_PARQUET)
```

**Convert Parquet to JSONL**  


```
#!/usr/bin/env python3
import json
import pyarrow.parquet as pq


def parquet_to_jsonl(parquet_path: str, jsonl_path: str):
    """Convert a Parquet file to JSONL format."""
    table = pq.read_table(parquet_path)
    rows = table.to_pylist()
    with open(jsonl_path, "w") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")
    print(f"Wrote {len(rows)} rows to {jsonl_path}")


if __name__ == "__main__":
    INPUT_PARQUET = "vectors.parquet"
    OUTPUT_JSONL = "vectors.jsonl"
    parquet_to_jsonl(INPUT_PARQUET, OUTPUT_JSONL)
```

# Using auto-optimize in the console


You can use the Amazon OpenSearch Service console to create vector ingestion jobs, monitor their progress, view optimization recommendations, and build indexes based on those recommendations.

## Prerequisites


Before you can use auto-optimize in the console, you must have the following:
+ An active AWS account with access to the OpenSearch console.
+ An existing OpenSearch Serverless collection of type *vector search* or a Managed OpenSearch domain.
+ IAM permissions for the following actions:
  + `opensearch:SubmitAutoOptimizeJob`
  + `opensearch:GetAutoOptimizeJob`
  + `opensearch:DeleteAutoOptimizeJob`
  + `opensearch:CancelAutoOptimizeJob`
  + `opensearch:ListAutoOptimizeJobs`
**Note**  
These are identity-based policies. AWS does not support resource-based policies for auto-optimize resources.
+ Configure your federated user session to have a minimum credential expiry of at least 1 hour. For very large datasets or high dimensions, consider increasing the expiration duration up to 3 hours.

## Creating a vector ingestion job


A vector ingestion job analyzes your vector data and provides optimization recommendations for index configuration.

**To create a vector ingestion job**

1. Sign in to the Amazon OpenSearch Service console at [AWS Management Console](https://console.aws.amazon.com/aos/home).

1. In the navigation pane, choose **Auto-Optimize**.

1. Choose **Create vector ingestion job**.

1. Under **Job details**, enter a name for your vector ingestion job. This name helps you identify the job in the console.

1. Under **Data source**, configure the following:

   1. For **Amazon S3 URI**, enter the Amazon S3 URI of the folder containing your data files (Parquet or JSONL). The URI must point to the enclosing folder, not individual files. For example, if your file is at `s3://my-bucket/my-folder/file1.parquet` or `s3://my-bucket/my-folder/data.jsonl`, enter `s3://my-bucket/my-folder/`.
**Note**  
The folder must contain files of a single format. Do not mix Parquet and JSONL files in the same folder.

   1. For **Region**, select the AWS Region where your Amazon S3 bucket is located. The Region must match the bucket location.

1. Under **OpenSearch domain**, select an existing domain or collection, or choose **Create new** to create one.
**Note**  
You can specify either a managed OpenSearch domain or an OpenSearch Serverless collection.

1. Under **Data source permissions**, specify the IAM role that has permissions to access your Amazon S3 bucket and OpenSearch domain or collection. The role must have the necessary permissions based on your domain or collection configuration:
   + For OpenSearch domains with a domain access policy, grant the role access through that policy.
   + For OpenSearch domains with fine-grained access control, add the role as a backend role.
   + For OpenSearch Serverless collections, add the role to the data access policy.

1. Choose **Next**.

1. Under **Configure index**, specify the following:

   1. For **Field name**, enter the field name from your dataset that contains the vector data.

   1. For **Space type**, select the distance metric used to calculate the distance between vectors:
      + **l2** - Euclidean distance
      + **cosinesimil** - Cosine similarity
      + **innerproduct** - Inner product

   1. For **Dimension**, enter the number of floating point values in each vector.

1. Under **Performance requirements**, configure the following:

   1. For **Recall**, specify your desired search quality as a decimal value between 0 and 1. Higher recall values return more relevant results. For example:
      + 0.95 indicates that, on average, 19 of the 20 true nearest document vectors to a query vector are returned
      + 0.9 indicates that 9 in 10 are returned
      + 0.8 indicates that 8 in 10 are returned

   1. For **Search latency requirements**, select your latency tolerance. Modest requirements allow for more cost savings through compression methods that decrease memory requirements.

1. Choose **Next**.

1. Review your configuration and choose **Create**.

The job begins processing. You can monitor its progress in the **Vector Ingestion Jobs** table.
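The recall figures above can be made concrete with a small computation: recall at k is the fraction of the true k nearest neighbors that the approximate search actually returns. A minimal sketch with toy IDs (the data is invented for illustration):

```python
def recall_at_k(true_neighbors, returned, k):
    """Fraction of the true top-k neighbor IDs present in the returned IDs."""
    return len(set(true_neighbors[:k]) & set(returned[:k])) / k

# Toy example: of the 20 true nearest vectors, approximate search misses one.
true_top = list(range(20))          # IDs of the 20 true nearest vectors
approx   = list(range(19)) + [99]   # approximate search returns 19 of them
print(recall_at_k(true_top, approx, 20))  # 19/20 = 0.95
```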

## Monitoring optimization jobs


You can monitor the status of your vector ingestion jobs from the auto-optimize landing page.

**To monitor optimization jobs**

1. Sign in to the Amazon OpenSearch Service console at [https://console.aws.amazon.com/aos/home](https://console.aws.amazon.com/aos/home).

1. In the navigation pane, choose **Auto-optimize**.

1. The **Vector Ingestion Jobs** table displays all jobs with their current status. Refresh the page to see updated status information.
**Note**  
There is no automatic refresh or notification mechanism. You must manually refresh the console to see when a job completes.

### Understanding job status states


Auto-optimize jobs can have the following status values:

Pending  
The job is queued and waiting to start.

Running  
The auto-optimize job is actively analyzing your data and generating recommendations.

Completed  
The auto-optimize job has finished successfully. All analysis, evaluation, and recommendations are complete and available for viewing.

Failed  
The job encountered an error. View the error details in the job details page to determine the cause.

Active  
An index has been created in the attached cluster and data has been ingested.

Job duration depends primarily on dataset size and current service load. Typical jobs complete within 15 minutes to several hours.

## Viewing job details


You can view detailed information about a specific optimization job, including its configuration and status.

**To view job details**

1. Sign in to the Amazon OpenSearch Service console at [https://console.aws.amazon.com/aos/home](https://console.aws.amazon.com/aos/home).

1. In the navigation pane, choose **Auto-Optimize**.

1. In the **Vector Ingestion Jobs** table, choose the job name.

1. The job details page displays the following information:
   + Job name and status
   + Data source configuration (Amazon S3 URI and Region)
   + OpenSearch domain or collection
   + Index configuration (field name, space type, dimension)
   + Performance requirements (recall and latency)
   + Error messages (if the job failed)

## Viewing and understanding results


After a job completes successfully, you can view the optimization recommendations.

**To view optimization results**

1. Sign in to the Amazon OpenSearch Service console at [https://console.aws.amazon.com/aos/home](https://console.aws.amazon.com/aos/home).

1. In the navigation pane, choose **Auto-Optimize**.

1. In the **Vector Ingestion Jobs** table, choose a job with **Completed** status.

1. The results page displays the following sections:
   + **Results overview** - Shows the estimated search quality recall compared to your requirement and the index memory footprint compared to the top recommended configuration.
   + **Recommendations** - Lists up to three optimization recommendations, ordered with the top recommendation as the best match for your configuration. Each recommendation includes:
     + Index configuration parameters
     + Search configuration parameters
     + Expected performance metrics
     + Memory footprint estimates
**Note**  
While recommendations are ordered by best match, you can select any recommendation that better fits your specific use case. Auto-optimize attempts to find the closest matches to your chosen recall criteria.

## Building an index from recommendations


After reviewing the optimization recommendations, you can either manually create an index using the recommended configuration or automatically build an index with the selected recommendation.

**To build an index automatically**

1. Sign in to the Amazon OpenSearch Service console at [https://console.aws.amazon.com/aos/home](https://console.aws.amazon.com/aos/home).

1. In the navigation pane, choose **Auto-Optimize**.

1. In the **Vector Ingestion Jobs** table, choose a job with **Completed** status.

1. Review the recommendations and select the one you want to use.

1. Choose **Build index**.

1. The system automatically creates an index in your cluster using the selected recommendation and ingests the vector data from your dataset.

**To build an index manually**

1. Sign in to the Amazon OpenSearch Service console at [https://console.aws.amazon.com/aos/home](https://console.aws.amazon.com/aos/home).

1. In the navigation pane, choose **Auto-Optimize**.

1. In the **Vector Ingestion Jobs** table, choose a job with **Completed** status.

1. Review the recommendations and note the index configuration and search configuration parameters for your chosen recommendation.

1. Use the OpenSearch API or console to manually create an index with the recommended parameters.

## Related features


Auto-optimize works together with other Amazon OpenSearch Service features to help you build and optimize vector search applications:
+ [GPU-acceleration for vector indexing](gpu-acceleration-vector-index.md) - Accelerate vector index builds using GPU-acceleration to reduce indexing time and costs.
+ [Vector ingestion](serverless-vector-ingestion.md) - Quickly ingest and index vector data from Amazon S3 into your domain or collection.

# GPU-acceleration for vector indexing

GPU-acceleration helps you build large-scale vector databases faster and more efficiently. You can enable this feature on new or existing OpenSearch domains and OpenSearch Serverless collections. This feature uses GPU-acceleration to reduce the time needed to index data into vector indexes.

With GPU-acceleration, you can increase vector indexing speed by up to 10x at a quarter of the indexing cost.

## Prerequisites


GPU-acceleration is supported on OpenSearch domains running OpenSearch version `3.1` or later, and OpenSearch Serverless collections. For more information, see [Upgrading Amazon OpenSearch Service domains](version-migration.md), [UpdateDomainConfig](https://docs.aws.amazon.com/opensearch-service/latest/APIReference/API_UpdateDomainConfig.html), and [UpdateCollection](https://docs.aws.amazon.com/opensearch-service/latest/ServerlessAPIReference/API_UpdateCollection.html) APIs.

## How it works


Vector indexes require significant compute resources to build data structures such as Hierarchical Navigable Small Worlds (HNSW) graphs. When you enable GPU-acceleration on your domain or collection, OpenSearch automatically detects opportunities to accelerate your index builds and offloads the index builds to GPU instances. OpenSearch Service manages the GPU instances on your behalf, assigning them to your domain or collection when needed. This means you don't manage utilization or pay for idle time.

You pay only for useful processing, billed through OpenSearch Compute Units (OCUs) for Vector Acceleration. Each Vector Acceleration OCU is a combination of approximately 8 GiB of CPU memory, 2 vCPUs, and 6 GiB of GPU memory. For more information, see [GPU Acceleration Pricing](#gpu-acceleration-pricing).

To enable GPU acceleration for your domain or collection, see [Enabling GPU-acceleration](gpu-acceleration-enabling.md).

## GPU-acceleration pricing


AWS charges you when OpenSearch detects opportunities to accelerate your domain's or collection's index build workloads. Each Vector Acceleration OCU is a combination of approximately 8 GiB of CPU memory, 2 vCPUs, and 6 GiB of GPU memory.

AWS bills OCUs at second-level granularity. In your account statement, you'll see an entry for compute in OCU-hours.

For example, when you use GPU-acceleration for one hour to create an index using 2 vCPUs and 1 GiB of GPU memory, you're billed 1 OCU-hour. If you instead use 9 GiB of CPU memory, which exceeds the approximately 8 GiB included in a single OCU, you're billed 2 OCU-hours.

OpenSearch Serverless adds additional OCUs in increments of 1 OCU based on the compute power and storage needed to support your collections. You can configure a maximum number of OCUs for your account to control costs.

**Note**  
The number of OCUs provisioned at any time can vary and isn't exact. Over time, the algorithms that OpenSearch and OpenSearch Serverless use will continue to improve to further minimize system usage.

For full pricing details, see [Amazon OpenSearch Service Pricing](https://aws.amazon.com/opensearch-service/pricing/).

## GPU-acceleration and write operations


GPU-acceleration is activated when OpenSearch's vector ingestion rate (MB/sec) falls within a configurable range. On OpenSearch domains, you can [configure this range](https://docs.opensearch.org/3.2/vector-search/remote-index-build/#using-the-remote-index-build-service) through the `index.knn.remote_index_build.size.min` and `index.knn.remote_index_build.size.max` settings. For example, with the default lower bound of 50 MB, writing 15,000 full-precision 768-dimensional vectors between [refresh intervals](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/bp.html#bp-perf) triggers GPU-acceleration by default.
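As a sketch, the following request lowers the minimum size on an existing domain index. The index name and the `50mb` value are illustrative; the exact value format and whether the setting can be updated dynamically are defined in the OpenSearch remote index build documentation linked above:

```
PUT my-vector-index/_settings
{
  "index.knn.remote_index_build.size.min": "50mb"
}
```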

Data is written with the following API operations:
+ [Flush](https://docs.opensearch.org/latest/api-reference/index-apis/flush/)
+ [Bulk](https://docs.opensearch.org/latest/api-reference/document-apis/bulk/)
+ [Reindex](https://docs.opensearch.org/latest/api-reference/document-apis/reindex/)
+ [Index](https://docs.opensearch.org/latest/api-reference/index-apis/index/)
+ [Update](https://docs.opensearch.org/latest/api-reference/document-apis/update-document/)
+ [Delete](https://docs.opensearch.org/latest/api-reference/document-apis/delete-document/)
+ [Force Merge](https://docs.opensearch.org/latest/api-reference/index-apis/force-merge/)

GPU-acceleration is activated with both automatic and [manual](https://docs.opensearch.org/latest/api-reference/index-apis/force-merge/) segment merges.

## Supported index configurations


The [Faiss](https://docs.opensearch.org/latest/field-types/supported-field-types/knn-methods-engines/#faiss-engine) engine supports GPU-acceleration.

The following configurations do not support GPU-acceleration:
+ [Faiss product quantization](https://docs.opensearch.org/latest/vector-search/optimizing-storage/faiss-product-quantization/)
+ [Inverted File Index (IVF)](https://docs.opensearch.org/latest/field-types/supported-field-types/knn-methods-engines/#ivf-parameters)
+ [Non-Metric Space Library](https://docs.opensearch.org/latest/field-types/supported-field-types/knn-methods-engines/#nmslib-engine-deprecated)
+ [Lucene engine](https://docs.opensearch.org/latest/field-types/supported-field-types/knn-methods-engines/#lucene-engine)
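For example, a mapping that explicitly selects the Faiss engine with the HNSW method, and therefore qualifies for GPU-acceleration, might look like the following. The index name, field name, and dimension are illustrative:

```
PUT my-vector-index
{
  "settings": {
    "index.knn": true
  },
  "mappings": {
    "properties": {
      "vector_field": {
        "type": "knn_vector",
        "dimension": 768,
        "method": {
          "name": "hnsw",
          "engine": "faiss",
          "space_type": "l2"
        }
      }
    }
  }
}
```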

## Supported AWS Regions


GPU-acceleration is available in the following AWS Regions:
+ US East (N. Virginia)
+ US West (Oregon)
+ Asia Pacific (Sydney)
+ Asia Pacific (Tokyo)
+ Europe (Ireland)

# Enabling GPU-acceleration

You can enable GPU-acceleration when creating or updating an OpenSearch domain or OpenSearch Serverless collection with the AWS Management Console, AWS CLI, or AWS SDK.

After you enable GPU-acceleration on your domain or collection, the feature is enabled by default on all indexes. If you need to disable it at the index level, see [Creating GPU-accelerated vector indexes](gpu-acceleration-creating-indexes.md).

## Console


The following procedures enable GPU-acceleration for OpenSearch domains and OpenSearch Serverless collections using the OpenSearch Service management console.

------
#### [ Create new domain ]

To create an OpenSearch domain with GPU-acceleration enabled, see [Creating OpenSearch Service domains](createupdatedomains.md#createdomains).

------
#### [ Edit existing domain ]

1. Open the [OpenSearch Service](https://console.aws.amazon.com/aos/home ) management console.

1. In the navigation pane, choose **Domains**.

1. Choose your domain name to open the domain details page.

1. Choose **Actions**, then **Edit domain**.

1. In the **Advanced features** section, select **Enable GPU acceleration**. Once this feature is enabled, your vector indexing operations are [accelerated](gpu-acceleration-vector-index.md#gpu-acceleration-write-operations).

1. Choose **Save changes**.

------
#### [ Create new collection ]

To create an OpenSearch Serverless collection with GPU-acceleration enabled, see [Tutorial: Getting started with Amazon OpenSearch Serverless](serverless-getting-started.md). During collection creation, ensure you select the **Vector search** collection type and enable GPU-acceleration in the vector search configuration.

------
#### [ Edit existing collection ]

1. Open the [OpenSearch Service](https://console.aws.amazon.com/aos/home ) management console.

1. In the navigation pane, choose **Collections**.

1. Choose your collection name to open the collection details page.

1. In the **Deployment options** section, choose **Edit** next to **Vector GPU acceleration**.

1. Enable or disable GPU acceleration.

1. Choose **Save changes**.

------

## AWS CLI


------
#### [ Create new domain ]

The following AWS CLI example creates an OpenSearch domain with GPU-acceleration enabled in US East (N. Virginia). Replace the example values with your own configuration.

```
aws opensearch create-domain \
    --domain-name my-domain \
    --engine-version OpenSearch_3.1 \
    --cluster-config "InstanceType=r6g.xlarge.search,InstanceCount=1,DedicatedMasterEnabled=true,DedicatedMasterCount=3,DedicatedMasterType=m6g.large.search" \
    --ebs-options "EBSEnabled=true,VolumeType=gp3,VolumeSize=2000" \
    --encryption-at-rest-options '{"Enabled":true}' \
    --aiml-options '{"ServerlessVectorAcceleration": {"Enabled": true}}' \
    --node-to-node-encryption-options '{"Enabled":true}' \
    --domain-endpoint-options '{"EnforceHTTPS":true,"TLSSecurityPolicy":"Policy-Min-TLS-1-0-2019-07"}' \
    --access-policies '{"Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": "*"},
            "Action": "es:*",
            "Resource": "arn:aws:es:us-east-1:123456789012:domain/my-domain/*"
        }]}' \
    --advanced-security-options '{
        "Enabled":true,
        "InternalUserDatabaseEnabled":true,
        "MasterUserOptions": {
            "MasterUserName":"USER_NAME",
            "MasterUserPassword":"PASSWORD"
        }}' \
    --region us-east-1
```

------
#### [ Edit existing domain ]

The following AWS CLI example enables GPU-acceleration for an existing OpenSearch domain. Replace the example values with your own configuration.

```
aws opensearch update-domain-config \
    --domain-name my-domain \
    --cluster-config InstanceType=r7g.16xlarge.search,InstanceCount=3 \
    --aiml-options '{"ServerlessVectorAcceleration": {"Enabled": true}}'
```

------
#### [ Create new collection ]

The following AWS CLI example creates an OpenSearch Serverless collection with GPU-acceleration enabled in US East (N. Virginia). Replace the example values with your own configuration.

```
aws opensearchserverless create-collection \
    --name "my-collection" \
    --type "VECTORSEARCH" \
    --description "My vector collection with GPU acceleration" \
    --vector-options '{"ServerlessVectorAcceleration": "ENABLED"}' \
    --region us-east-1
```

------
#### [ Edit existing collection ]

The following AWS CLI example enables GPU-acceleration for an existing OpenSearch Serverless collection. Replace the example values with your own configuration.

```
aws opensearchserverless update-collection \
    --id 07tjusf2h91cunochc \
    --vector-options '{"ServerlessVectorAcceleration": "ENABLED"}' \
    --region us-east-1
```

------

# Creating GPU-accelerated vector indexes

After enabling GPU-acceleration on your domain or collection, create vector indexes that can take advantage of GPU processing.

**Note**  
When you create a domain with GPU-acceleration enabled, the `index.knn.remote_index_build.enabled` setting is `true` by default. You don't need to explicitly set this setting when creating indexes. For collections, you must explicitly specify a value for this setting.

------
#### [ Creating index with GPU-acceleration ]

The following example creates a vector index optimized for GPU processing. This index stores 768-dimensional vectors (common for text embeddings).

```
PUT my-vector-index
{
  "settings": {
    "index.knn": true,
    "index.knn.remote_index_build.enabled": true
  },
  "mappings": {
    "properties": {
      "vector_field": {
        "type": "knn_vector",
        "dimension": 768
      },
      "text": {
        "type": "text"
      }
    }
  }
}
```

Key configuration elements:
+ `"index.knn": true` - Enables k-nearest neighbor functionality
+ `"index.knn.remote_index_build.enabled": true` - Enables GPU processing for this index. When the domain has GPU-acceleration enabled, not specifying this setting defaults to `true`. For collections, you must explicitly specify a value for this setting.
+ `"dimension": 768` - Specifies vector size (adjust based on your embedding model)

------
#### [ Creating index without GPU-acceleration ]

The following example creates a vector index where GPU processing is disabled. This index stores 768-dimensional vectors (common for text embeddings).

```
PUT my-vector-index
{
  "settings": {
    "index.knn": true,
    "index.knn.remote_index_build.enabled": false
  },
  "mappings": {
    "properties": {
      "vector_field": {
        "type": "knn_vector",
        "dimension": 768
      },
      "text": {
        "type": "text"
      }
    }
  }
}
```

------

# Indexing vector data and force-merging

Once you've created a GPU-accelerated vector index on your domain or collection, you can add vector data and optimize your index using standard OpenSearch operations. GPU-acceleration automatically enhances both indexing performance and force-merge operations, making it faster to build and maintain large-scale vector search applications without requiring changes to your existing workflows.

## Indexing vector data


Index vector data as you normally would; GPU-acceleration automatically applies to indexing and force-merge operations. The following example demonstrates how to add vector documents to your index using the [bulk](https://docs.opensearch.org/latest/api-reference/document-apis/bulk/#index) API. Each document contains a vector field with numerical values and associated text content:

```
POST _bulk
{"index": {"_index": "my-vector-index"}}
{"vector_field": [0.1, 0.2, 0.3, ...], "text": "Sample document 1"}
{"index": {"_index": "my-vector-index"}}
{"vector_field": [0.4, 0.5, 0.6, ...], "text": "Sample document 2"}
```

### Force-merge operations


GPU-acceleration also applies to [force-merge](https://docs.opensearch.org/latest/api-reference/index-apis/force-merge/) operations, which can significantly reduce the time required to optimize vector indexes. Note that force-merge operations aren't supported on collections. The following example demonstrates how to optimize your vector index by consolidating all segments into a single segment:

```
POST my-vector-index/_forcemerge?max_num_segments=1
```
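To confirm that the merge completed, you can list the index's segments; a single row per shard indicates that the segments were consolidated:

```
GET _cat/segments/my-vector-index?v
```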

## Best practices


Follow these best practices to maximize the benefits of GPU-acceleration for your vector search workloads:
+ **Increase index clients** - To take full advantage of GPUs during the index build, increase the number of index clients that are ingesting data into OpenSearch. This allows for better parallelization and utilization of GPU resources.
+ **Adjust approximate threshold** - Adjust the `index.knn.advanced.approximate_threshold` setting so that vector data structures aren't built for smaller segments, which improves the overall speed of ingestion. A value of 10,000 documents is a good starting point. For collections, you must explicitly specify a value for this setting.
+ **Optimize shard size** - Try creating shards that have at least 1 million documents. Shards with fewer than this number of documents may not see overall benefits from GPU-acceleration.
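As a sketch of the second practice above, the threshold can be applied to an existing index with a settings update. The index name is illustrative, and this assumes the setting can be updated dynamically; otherwise, set it at index creation:

```
PUT my-vector-index/_settings
{
  "index.knn.advanced.approximate_threshold": 10000
}
```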