Metadata filtering
Note
Amazon S3 Vectors is in preview release for Amazon Simple Storage Service and is subject to change.
Metadata filtering allows you to filter query results based on specific attributes attached to your vectors. You can use metadata filters with query operations to find vectors that match both similarity criteria and specific metadata conditions.
S3 Vectors supports two types of metadata: filterable metadata and non-filterable metadata. The key difference is that filterable metadata can be used in query filters but has stricter size limitations, while non-filterable metadata can't be used in filters but can store larger amounts of data within its size limits. For more information about metadata limits, including size limits per vector and maximum metadata keys per vector, see Limitations and restrictions.
Filterable metadata
Filterable metadata allows you to filter query results based on specific metadata values. By default, all metadata fields are filterable in a similarity query unless you explicitly specify them as non-filterable when you create the vector index. S3 Vectors supports string, number, boolean, and list metadata types, subject to a size limit per vector. Filterable metadata is ideal for attributes that you want to filter on, such as categories, timestamps, or status values.
If the metadata size exceeds the supported limits, the PutVectors API operation returns a 400 Bad Request error. For more information about the filterable metadata size limit per vector, see Limitations and restrictions.
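As a rough illustration of the supported value types, the following sketch checks a metadata dictionary on the client before it is sent. The helper names (`is_supported_value`, `unsupported_keys`) are hypothetical, and the rule that list elements must themselves be primitives is an assumption that this page does not spell out. The sketch does not check size limits, which are documented in Limitations and restrictions.

```python
# Hypothetical client-side check; not part of any AWS SDK.
PRIMITIVE_TYPES = (str, bool, int, float)


def is_supported_value(value):
    """Return True for a string, number, boolean, or a list of those."""
    if isinstance(value, PRIMITIVE_TYPES):
        return True
    if isinstance(value, list):
        # Assumption: list elements must be primitives (no nested lists or objects).
        return all(isinstance(item, PRIMITIVE_TYPES) for item in value)
    return False


def unsupported_keys(metadata):
    """Return the sorted metadata keys whose values have an unsupported type."""
    return sorted(key for key, value in metadata.items() if not is_supported_value(value))


metadata = {
    "genre": ["documentary", "history"],  # list
    "year": 2021,                         # number
    "rating": 8.4,                        # number
    "available": True,                    # boolean
}
```

Running `unsupported_keys(metadata)` before calling PutVectors can surface a nested object or a null value early, instead of relying on the service to reject the request.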
The following operators can be used in filters on filterable metadata.
Operator | Valid Input Types | Description |
---|---|---|
$eq | String, Number, Boolean | Exact match comparison for single values. When comparing with an array metadata value, returns true if the input value matches any element in the array. |
$ne | String, Number, Boolean | Not equal comparison |
$gt | Number | Greater than comparison |
$gte | Number | Greater than or equal comparison |
$lt | Number | Less than comparison |
$lte | Number | Less than or equal comparison |
$in | Non-empty array of primitives | Match any value in array |
$nin | Non-empty array of primitives | Match none of the values in array |
$exists | Boolean | Check if field exists |
$and | Non-empty array of filters | Logical AND of multiple conditions |
$or | Non-empty array of filters | Logical OR of multiple conditions |
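To make the operator semantics in the table concrete, here is a minimal local evaluator written in Python. It is a sketch for reasoning about filters, not the service's implementation. The function names are hypothetical, and two behaviors are assumptions that the table does not state: a missing key satisfies only $ne, $nin, and {"$exists": false}, and the operators other than $eq treat an array metadata value element by element.

```python
def matches(metadata, flt):
    """Return True if the metadata dictionary satisfies the filter document."""
    for key, condition in flt.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(metadata, sub) for sub in condition):
                return False
        else:
            # A bare value is shorthand for {"$eq": value}.
            if not isinstance(condition, dict):
                condition = {"$eq": condition}
            # Several operators on one key must all hold.
            if not all(_check(metadata, key, op, arg) for op, arg in condition.items()):
                return False
    return True


def _check(metadata, key, op, arg):
    present = key in metadata
    if op == "$exists":
        return present == arg
    if not present:
        # Assumption: a missing key satisfies only the negative operators.
        return op in ("$ne", "$nin")
    stored = metadata[key]
    # Treat scalars and arrays alike by wrapping scalars in a list.
    values = stored if isinstance(stored, list) else [stored]
    if op == "$eq":
        return any(v == arg for v in values)
    if op == "$ne":
        return all(v != arg for v in values)
    if op == "$in":
        return any(v in arg for v in values)
    if op == "$nin":
        return all(v not in arg for v in values)
    ordering = {
        "$gt": lambda v: v > arg,
        "$gte": lambda v: v >= arg,
        "$lt": lambda v: v < arg,
        "$lte": lambda v: v <= arg,
    }
    if op in ordering:
        # Range operators apply to numbers only; booleans are excluded.
        return any(
            isinstance(v, (int, float)) and not isinstance(v, bool) and ordering[op](v)
            for v in values
        )
    raise ValueError(f"unsupported operator: {op}")


vector = {"genre": ["documentary", "history"], "year": 2021, "price": 25}
```

With this sample, `matches(vector, {"genre": "documentary"})` is true because $eq matches any element of an array value, while `matches(vector, {"genre": {"$nin": ["comedy", "documentary"]}})` is false under the element-wise assumption above.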
Examples of valid filterable metadata
- Simple equality
  {"genre": "documentary"}
  This filter matches vectors where the genre metadata key equals "documentary". When you don't specify an operator, S3 Vectors automatically uses the $eq operator.
- Explicit equality
  // Example: Exact match
  {"genre": {"$eq": "documentary"}}
  // Example: Not equal to
  {"genre": {"$ne": "drama"}}
- Numeric comparison
  {"year": {"$gt": 2019}}
  {"year": {"$gte": 2020}}
  {"year": {"$lt": 2020}}
  {"year": {"$lte": 2020}}
- Array operations
  {"genre": {"$in": ["comedy", "documentary"]}}
  {"genre": {"$nin": ["comedy", "documentary"]}}
- Existence check
  {"genre": {"$exists": true}}
  The $exists filter matches vectors that have a "genre" metadata key, regardless of the value that's stored for that metadata key.
- Logical operations
  {"$and": [{"genre": {"$eq": "drama"}}, {"year": {"$gte": 2020}}]}
  {"$or": [{"genre": {"$eq": "drama"}}, {"year": {"$gte": 2020}}]}
- Price range (Multiple conditions on the same field)
  {"price": {"$gte": 10, "$lte": 50}}
For more information about how to query vectors with metadata filtering, see Metadata filtering.
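The filters above are passed along with a query vector. The following Python sketch shows one way to assemble such a request. The bucket name, index name, and helper functions are hypothetical, and the request field names (vectorBucketName, indexName, queryVector, topK, filter, returnMetadata, returnDistance) and the `query_vectors` client method are assumptions based on the preview SDK for Python (boto3); verify them against the current API reference before relying on them.

```python
def build_query_request(bucket, index, query_vector, top_k, metadata_filter=None):
    """Assemble keyword arguments for a similarity query with an optional filter."""
    request = {
        "vectorBucketName": bucket,                      # assumed field name
        "indexName": index,                              # assumed field name
        "queryVector": {"float32": list(query_vector)},  # assumed shape
        "topK": top_k,
        "returnMetadata": True,   # ask for metadata alongside each match
        "returnDistance": True,
    }
    if metadata_filter:
        request["filter"] = metadata_filter
    return request


def run_query(request):
    """Send the request. Requires AWS credentials and a boto3 version with S3 Vectors."""
    import boto3  # imported lazily so that building a request needs no SDK

    client = boto3.client("s3vectors")
    return client.query_vectors(**request)


# Dramas released in 2020 or later, using the $and example from this page.
recent_dramas = {"$and": [{"genre": {"$eq": "drama"}}, {"year": {"$gte": 2020}}]}
request = build_query_request("media-bucket", "movies", [0.1, 0.2, 0.3], 5, recent_dramas)
```

Keeping request construction separate from the network call makes the filter easy to inspect or unit test without AWS access.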
Non-filterable metadata
Non-filterable metadata can't be used in query filters but can store larger amounts of contextual data than filterable metadata. It's ideal for storing large text chunks, detailed descriptions, or other contextual information that doesn't need to be searchable but can be returned with query results. For example, you might store full document text, image descriptions, or detailed product specifications as non-filterable metadata.
Non-filterable metadata keys must be explicitly configured during vector index creation. Once a metadata key is designated as non-filterable during index creation, it can't be changed to filterable later. You can configure multiple metadata keys as non-filterable per vector index, with each metadata key name limited to 63 characters. For more information about the maximum number of non-filterable metadata keys that's allowed per vector index, see Limitations and restrictions.
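Because non-filterable keys are fixed when the index is created, it can help to validate them up front. The sketch below checks the 63-character key name limit mentioned above and assembles a create-index request. The helper name, bucket name, index name, key names, and dimension are hypothetical, and the request field names (including metadataConfiguration and nonFilterableMetadataKeys) are assumptions based on the preview API; the maximum number of keys per index is not checked here.

```python
MAX_KEY_NAME_LENGTH = 63  # per-key name limit stated in the text above


def build_create_index_request(bucket, index, dimension, non_filterable_keys):
    """Validate key names and assemble keyword arguments for index creation."""
    too_long = [key for key in non_filterable_keys if len(key) > MAX_KEY_NAME_LENGTH]
    if too_long:
        raise ValueError(f"key names longer than {MAX_KEY_NAME_LENGTH} characters: {too_long}")
    return {
        "vectorBucketName": bucket,   # assumed field name
        "indexName": index,           # assumed field name
        "dataType": "float32",        # assumed value
        "dimension": dimension,
        "distanceMetric": "cosine",   # assumed value
        "metadataConfiguration": {
            # These keys cannot be changed to filterable after creation.
            "nonFilterableMetadataKeys": list(non_filterable_keys),
        },
    }


create_request = build_create_index_request(
    "media-bucket", "movies", 1024, ["source_text", "description"]
)
```

Any metadata key that is not listed stays filterable by default, so only keys holding large or purely contextual values need to appear in the list.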
While you can't filter on non-filterable metadata, you can retrieve it alongside query results using the return-metadata parameter.
Typical uses for non-filterable metadata include the following:
- Provide context for your application without parsing separate data sources.
- Store larger text chunks that would exceed the filterable metadata size limits.
- Include it in vector exports by using the ListVectors API operation.
For more information about configuring non-filterable metadata, see Creating a vector index in a vector bucket.