This page provides details about vector search features and limitations.
Availability
Vector search is available on all Memorystore for Redis Cluster versions, across all tiers and all supported regions. Only instances created after the launch date of September 13, 2024 have vector search enabled.
Index restrictions
The following limits apply to an index:
- The maximum number of attributes in an index is 10.
- A vector's dimension count cannot exceed 32,768.
- The M value for HNSW cannot exceed 2M.
- The EF construction value for HNSW cannot exceed 4,096.
- The EF runtime value for HNSW cannot exceed 4,096.
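The limits above can be checked before creating an index. The following is a minimal sketch (the function name and structure are ours, not part of any Memorystore or Redis API; it reads the "2M" limit as 2,000,000, which is an assumption):

```python
# Limits taken from the list above; "2M" is interpreted here as
# 2,000,000, which is an assumption about the page's shorthand.
MAX_ATTRIBUTES = 10
MAX_DIMENSIONS = 32_768
MAX_M = 2_000_000
MAX_EF_CONSTRUCTION = 4_096
MAX_EF_RUNTIME = 4_096

def validate_hnsw_params(num_attributes, dimensions, m,
                         ef_construction, ef_runtime):
    """Return a list of limit violations; an empty list means the
    proposed HNSW index parameters fit within the documented limits."""
    problems = []
    if num_attributes > MAX_ATTRIBUTES:
        problems.append(f"attributes {num_attributes} > {MAX_ATTRIBUTES}")
    if dimensions > MAX_DIMENSIONS:
        problems.append(f"dimensions {dimensions} > {MAX_DIMENSIONS}")
    if m > MAX_M:
        problems.append(f"M {m} > {MAX_M}")
    if ef_construction > MAX_EF_CONSTRUCTION:
        problems.append(f"EF construction {ef_construction} > {MAX_EF_CONSTRUCTION}")
    if ef_runtime > MAX_EF_RUNTIME:
        problems.append(f"EF runtime {ef_runtime} > {MAX_EF_RUNTIME}")
    return problems
```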
Impacts on performance
Several variables affect the performance of vector search.
Node type
Vector search scales vertically by using thread pools dedicated to executing vector search operations, so performance is tied to the number of vCPUs on each node in your cluster. For details on the number of vCPUs available on each node type, see Cluster and node specification.
Number of shards
Memorystore for Redis Cluster implements a local indexing technique for all vectors. This means that the index stored on each shard contains only the documents held on that shard. Because of this, indexing speed and total vector capacity scale linearly with the number of shards in the cluster.
Because each local index contains only the contents of a single shard, searching the cluster requires searching every shard and aggregating the results. With a fixed number of vectors, increasing the number of shards improves search performance logarithmically for HNSW indexes and linearly for FLAT indexes, because each local index holds fewer vectors.
Note that because of the extra work needed to search every shard, the observable latency for a given search request may increase as shards are added. Despite this, even the largest clusters support single-digit millisecond latencies.
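The scatter-gather pattern described above can be sketched as follows: fan the query out to every shard's local index, then merge the per-shard top-k hits into a global top-k by distance. This is a simplified model for illustration, not Memorystore's implementation; `shards` and `search_shard` are placeholders.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

def search_cluster(shards, query, k, search_shard):
    """Query every shard's local index concurrently and merge the
    per-shard top-k results into a global top-k (ascending distance).

    `shards` is a list of opaque shard handles, and
    `search_shard(shard, query, k)` returns [(distance, doc_id), ...]
    for that shard's local index -- both are illustrative stand-ins.
    """
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        per_shard = pool.map(lambda s: search_shard(s, query, k), shards)
        hits = [hit for partial in per_shard for hit in partial]
    # Each shard contributed at most k candidates; keep the k closest.
    return heapq.nsmallest(k, hits)
```

Because every shard must return its own top k before the merge, adding shards reduces per-shard work but adds fan-out overhead, which matches the latency behavior described above.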
Number of replicas
Adding replicas increases search throughput linearly by allowing search requests to be load balanced across read replicas.
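The load balancing above can be sketched as client-side round-robin dispatch over replica endpoints. This is a minimal illustration under our own naming; real clients typically let the cluster topology drive routing.

```python
from itertools import cycle

def make_dispatcher(replicas):
    """Return a function that assigns each search request to the next
    read replica in round-robin order. `replicas` is any list of
    replica handles or endpoints (illustrative placeholders)."""
    ring = cycle(replicas)
    def dispatch(request):
        # Pair the request with the replica that should serve it.
        return next(ring), request
    return dispatch
```

With N replicas, each replica serves roughly 1/N of the read traffic, which is where the linear throughput scaling comes from.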
Scaling events
When you resize your Redis cluster, documents within your indexes are moved to distribute the data uniformly across the new shard count. Documents that move across nodes are re-indexed in the background. After the scaling operation completes, you can monitor the mutation_queue_size value in the FT.INFO output to track the progress of this re-indexing.
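FT.INFO replies are commonly returned as a flat list of name/value pairs, so a small helper (ours, for illustration) can pull out mutation_queue_size; a value of 0 indicates re-indexing has caught up. The sample reply below is abbreviated and illustrative, not captured from a real instance.

```python
def mutation_queue_size(ft_info_reply):
    """Extract mutation_queue_size from an FT.INFO reply, assuming the
    flat [name, value, name, value, ...] list shape."""
    fields = dict(zip(ft_info_reply[::2], ft_info_reply[1::2]))
    return int(fields["mutation_queue_size"])

# Abbreviated, illustrative reply:
sample_reply = ["index_name", "my_index",
                "num_docs", "120000",
                "mutation_queue_size", "4531"]
```

You would poll this value after a scaling operation and treat 0 as "re-indexing complete".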
Memory consumption
Vectors are duplicated: each vector is stored both in the Redis keyspace and in the vector search index.
Transactions
Because thread pools execute tasks asynchronously, vector search operations don't adhere to transactional semantics.