# Vector database comparison
<a name="vector-db-comparison"></a>

AWS provides multiple approaches to implementing vector search capabilities, ranging from individual vector databases to Amazon Bedrock Knowledge Bases, which is a fully managed service. When evaluating these options, organizations must consider various aspects including architecture, scalability, integration capabilities, performance characteristics, and security features.

## Individual vector databases
<a name="individual-vector-databases"></a>

The following table compares key features of individual vector database options on AWS, focusing on their architectures, scaling capabilities, data source integrations, and performance characteristics.


| **Feature** | **Amazon Kendra** | **Amazon OpenSearch Service** | **Amazon RDS for PostgreSQL with pgvector** | **Amazon DocumentDB** | **Amazon MemoryDB** | **Amazon Neptune Analytics** | **Amazon S3 Vectors** | 
| --- |--- |--- |--- |--- |--- |--- |--- |
| Primary use case | Enterprise search and RAG | Distributed search and analytics | Relational DB with vector support | Document DB with vector search | Real-time in-memory vector search | Graph analytics with vector search | Cost-optimized vector storage | 
| Architecture | Fully managed | Distributed cluster | Relational database | Document-oriented | In-memory database | Graph analytics engine | Serverless object storage | 
| Data model | Document-based | JSON documents | Relational tables | JSON documents | Key-value with JSON | Property graph | Object storage | 
| Vector dimensions | Managed automatically | Up to 16,000 | Configurable | Up to 2,000 (indexed); 16,000 (unindexed) | Up to 32,768 | Configurable | Up to 4,096 | 
| Indexing methods | Automatic | HNSW, IVF | HNSW, IVFFlat | HNSW, IVFFlat | HNSW | Native graph and vector | Automatic | 
| Distance metrics | Automatic | Cosine, Euclidean, dot product | Cosine, Euclidean, inner product | Cosine, Euclidean, dot product | Cosine, Euclidean, inner product | Cosine, Euclidean | Cosine, Euclidean | 
| Query latency | Sub-second | Sub-10 ms (GPU-accelerated) | 10-100 ms | Millisecond | Sub-millisecond | Sub-second | Sub-100 ms | 
| Scaling model | Automatic | Horizontal (add nodes) | Vertical and read replicas | Horizontal (add instances) | Vertical and replicas | Automatic | Automatic (serverless) | 
| Maximum vectors | Managed | Billions (cluster-dependent) | Millions (instance-dependent) | Millions per collection | Millions per database | Billions | 2 billion per index; 10,000 indexes per bucket | 
| Throughput | High | Very high (thousands of QPS) | Medium | High | Very high (millions of requests per day) | High | Medium (optimized for infrequent queries) | 
| Data durability | 99.999999999% (11 9s) | Configurable with replicas | 99.99% (Multi-AZ) | 99.99% (Multi-AZ) | 99.99% (Multi-AZ) | 99.99% | 99.999999999% (11 9s) | 
| Consistency model | Eventual | Eventual (configurable) | Strong (ACID) | Eventual | Strong | Strong | Strong | 
| Additional capabilities | 40 or more data connectors, NLP | Full-text search, analytics, dashboards | SQL queries, ACID transactions | MongoDB API compatibility | Redis API compatibility, caching | Graph algorithms, traversals | Amazon S3 integration, lifecycle policies | 
| Pricing model | Pay per query and storage | Instance hours and storage | Instance hours and storage | Instance hours and storage | Instance hours and storage | Capacity units and storage | Storage, queries, and data transfer | 
| Cost optimization | Usage-based | Reserved instances, auto-scaling | Reserved instances, Aurora Serverless | Reserved instances | Reserved instances | Auto-scaling | Up to 90% savings vs specialized DBs | 
| Best for | Enterprise search with minimal setup | High-throughput, low-latency queries | Hybrid SQL and vector workloads | MongoDB-compatible apps needing vectors | Real-time, ultra-low latency apps | GraphRAG and knowledge graphs | Long-term, cost-effective storage | 
| Ideal query pattern | Frequent enterprise searches | High-frequency real-time queries | Mixed SQL and vector queries | Document queries with semantic search | Millions of requests per day | Graph traversals with vector search | Infrequent queries (minutes to hours) | 
| Setup complexity | Low (fully managed) | Medium (cluster configuration) | Medium (extension setup) | Medium (cluster configuration) | Medium (cluster configuration) | Low (fully managed) | Low (serverless) | 
| Team expertise required | Minimal | OpenSearch or Elasticsearch | PostgreSQL, SQL | MongoDB | Redis | Graph databases | Amazon S3, basic vector concepts | 
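
The distance metrics row in the preceding table lists cosine, Euclidean, and dot product (inner product). As a minimal sketch of what these metrics compute, the following pure-Python functions implement the standard textbook definitions. The function names and sample vectors are illustrative assumptions, and this is not how any of the listed services implement similarity search internally.

```python
import math
from typing import Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner (dot) product: larger values mean more similar vectors."""
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean (L2) distance: smaller values mean more similar vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; ignores vector magnitude."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot_product(a, b) / (norm_a * norm_b)


# Hypothetical two-dimensional embeddings, chosen only for illustration.
query = [1.0, 0.0]
same_direction = [2.0, 0.0]
orthogonal = [0.0, 3.0]

print(cosine_similarity(query, same_direction))
print(euclidean_distance(query, same_direction))
print(dot_product(query, orthogonal))
```

Cosine similarity depends only on direction, whereas Euclidean distance and dot product are also sensitive to vector magnitude. For that reason, the choice of metric usually follows the way the embedding model was trained.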

## Managed service – Amazon Bedrock Knowledge Bases
<a name="managed-service"></a>

Amazon Bedrock Knowledge Bases provides a fully managed solution with multiple vector storage options. The following table compares these storage options.


| **Feature** | **Aurora PostgreSQL with pgvector** | **Neptune Analytics** | **OpenSearch Service Serverless** | **Amazon S3 Vectors** | **Pinecone** | **Redis Enterprise Cloud** | 
| --- |--- |--- |--- |--- |--- |--- |
| Primary use case | Relational DB with vector RAG | Graph-based vector search for GraphRAG | Knowledge management RAG | Cost-optimized vector RAG | High-performance vector search | In-memory vector search | 
| Architecture | Fully managed relational | Fully managed graph analytics | Fully managed serverless | Serverless object storage | Fully managed hybrid cloud | Fully managed in-memory | 
| Data model | Relational tables | Property graph | JSON documents | Object storage | Purpose-built vectors | Key-value with vectors | 
| Vector storage | Through pgvector extension | Native graph vectors | Through OpenSearch engine | Native Amazon S3 vector storage | Native vector database | In-memory vectors | 
| Amazon Bedrock integration | Native | Native | Native | Native | Native | Native | 
| Automatic ingestion | Yes (via Amazon Bedrock) | Yes (via Amazon Bedrock) | Yes (via Amazon Bedrock) | Yes (via Amazon Bedrock) | Yes (via Amazon Bedrock) | Yes (via Amazon Bedrock) | 
| Automatic vectorization | Yes (via Amazon Bedrock) | Yes (via Amazon Bedrock) | Yes (via Amazon Bedrock) | Yes (via Amazon Bedrock) | Yes (via Amazon Bedrock) | Yes (via Amazon Bedrock) | 
| Scaling | Auto-scaling (Aurora Serverless) | Automatic graph scaling | Automatic serverless | Automatic (billions of vectors) | Auto-scaling pods | Auto-scaling clusters | 
| Query performance | High for relational or vector | High for graph vectors | High | Medium (100 ms or more latency) | Very high | Very high | 
| Maximum vectors | Millions (instance-dependent) | Billions | Billions | 2 billion per index | Billions | Millions (memory-dependent) | 
| Additional capabilities | SQL queries, ACID transactions | Graph algorithms, traversals | Full-text search, analytics | Amazon S3 lifecycle, tiering | Metadata filtering, namespaces | Redis data structures, caching | 
| Cost optimization | Moderate (Aurora Serverless) | Moderate (capacity units) | High (serverless, pay-per-use) | Very high (up to 90% savings) | Moderate (pod-based pricing) | Low (in-memory premium) | 
| Best for | Hybrid SQL/vector workloads | Connected knowledge graphs | Full-text with vector search | Long-term, infrequent-access vectors | Real-time vector search at scale | Ultra-low latency needs | 
| Ideal query pattern | Mixed SQL and vector queries | Graph traversals with vectors | Frequent searches with analytics | Infrequent retrieval (minutes to hours) | High-frequency real-time queries | Millions of requests per second | 
| Setup with Amazon Bedrock | Simple (managed by Amazon Bedrock) | Simple (managed by Amazon Bedrock) | Simple (managed by Amazon Bedrock) | Simple (managed by Amazon Bedrock) | Simple (managed by Amazon Bedrock) | Simple (managed by Amazon Bedrock) | 
| Data residency | AWS Regions | AWS Regions | AWS Regions | AWS Regions | Multi-cloud (AWS and others) | Multi-cloud (AWS and others) | 
| Pricing model | Instance hours and storage | Capacity units and storage | Compute and storage (serverless) | Storage, queries, and transfer | Pod hours and storage | Node hours and storage | 
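
Whichever storage option backs the knowledge base, applications query it through the same Amazon Bedrock API. The following is a minimal sketch, assuming the AWS SDK for Python (Boto3) and its `bedrock-agent-runtime` client with the `retrieve` operation. The helper names `build_retrieve_request` and `retrieve_chunks`, the placeholder knowledge base ID, and the default of five results are hypothetical choices made for illustration.

```python
from typing import Any, Dict, List


def build_retrieve_request(
    knowledge_base_id: str, query_text: str, top_k: int = 5
) -> Dict[str, Any]:
    """Build keyword arguments for the bedrock-agent-runtime `retrieve` call."""
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    return {
        "knowledgeBaseId": knowledge_base_id,
        "retrievalQuery": {"text": query_text},
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {"numberOfResults": top_k}
        },
    }


def retrieve_chunks(
    knowledge_base_id: str, query_text: str, top_k: int = 5
) -> List[str]:
    """Query a knowledge base and return the text of the matching chunks.

    Requires boto3, AWS credentials, and an existing knowledge base.
    """
    import boto3  # imported lazily so the request builder works without it

    client = boto3.client("bedrock-agent-runtime")
    response = client.retrieve(
        **build_retrieve_request(knowledge_base_id, query_text, top_k)
    )
    # Assumes text results; other content types may not carry a "text" field.
    return [
        result["content"].get("text", "")
        for result in response["retrievalResults"]
    ]


# "EXAMPLEKBID" is a placeholder, not a real knowledge base ID.
request = build_retrieve_request("EXAMPLEKBID", "What is our refund policy?", top_k=3)
print(request)
```

Because the request shape does not depend on the vector store, changing the storage option behind a knowledge base should not require changes to retrieval code like this.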

## Choosing between individual and managed options
<a name="decision-matrix"></a>

The following table summarizes the considerations for choosing between an individual vector database and Amazon Bedrock Knowledge Bases.

| **Consideration** | **Choose individual vector DB** | **Choose Amazon Bedrock Knowledge Bases (managed)** | 
| --- |--- |--- |
| RAG implementation | You want full control over RAG pipeline | You want fully managed RAG with minimal setup | 
| Customization | You need custom retrieval logic and preprocessing | Standard RAG patterns meet your needs | 
| Existing infrastructure | You already have the database deployed | You're starting fresh or want simplified management | 
| Team expertise | Your team has database administration expertise | You prefer to focus on application logic, not infrastructure | 
| Integration complexity | You need deep integration with existing systems | You want quick integration with Amazon Bedrock models | 
| Operational overhead | You can manage database operations | You want AWS to handle operations | 
| Cost structure | You prefer direct database pricing | You prefer unified Amazon Bedrock pricing | 
| Time to market | You have time for custom implementation | You need rapid deployment |