Module vectorstore (0.3.0)

API documentation for the vectorstore module.

Classes

MySQLVectorStore

MySQLVectorStore(
    engine: langchain_google_cloud_sql_mysql.engine.MySQLEngine,
    embedding_service: langchain_core.embeddings.embeddings.Embeddings,
    table_name: str,
    content_column: str = 'content',
    embedding_column: str = 'embedding',
    metadata_columns: typing.List[str] = [],
    ignore_metadata_columns: typing.Optional[typing.List[str]] = None,
    id_column: str = 'langchain_id',
    metadata_json_column: typing.Optional[str] = 'langchain_metadata',
    k: int = 4,
    fetch_k: int = 20,
    lambda_mult: float = 0.5,
    query_options: langchain_google_cloud_sql_mysql.indexes.QueryOptions = QueryOptions(
        num_partitions=None,
        num_neighbors=10,
        distance_measure=<DistanceMeasure.L2_SQUARED: 'l2_squared'>,
        search_type=<SearchType.KNN: 'KNN'>,
    ),
)

Constructor for MySQLVectorStore.

Parameters
Name Description
engine MySQLEngine

Connection pool engine for managing connections to Cloud SQL for MySQL database.

embedding_service Embeddings

Text embedding model to use.

table_name str

Name of an existing table or table to be created.

content_column str

Column that represents a Document's page_content. Defaults to "content".

embedding_column str

Column for embedding vectors. The embedding is generated from the document's page_content. Defaults to "embedding".

metadata_columns List[str]

Column(s) that represent a document's metadata. Defaults to an empty list.

ignore_metadata_columns List[str]

Column(s) to ignore in pre-existing tables for a document's metadata. Cannot be used with metadata_columns. Defaults to None.

id_column str

Column that represents the Document's id. Defaults to "langchain_id".

metadata_json_column str

Column to store metadata as JSON. Defaults to "langchain_metadata".

k int

The number of documents to return as the final result of a similarity search. Defaults to 4.

fetch_k int

The number of documents to initially retrieve from the database during a similarity search. These documents are then re-ranked using MMR to select the final k documents. Defaults to 20.

lambda_mult float

The weight, between 0 and 1, used to balance relevance and diversity in the MMR algorithm. A higher value prioritizes relevance to the query, while a lower value emphasizes diversity among the results. Defaults to 0.5.

query_options QueryOptions

Additional options for the similarity search query. Defaults to a KNN search using the L2_SQUARED distance measure with num_neighbors=10.

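The constructor parameters above might be wired together roughly as follows. This is a minimal sketch, not an official example: the helper name build_store, the table name "documents", and the vector size 768 are assumptions made for illustration, and DeterministicFakeEmbedding stands in for a real embedding model. Calling the function requires a reachable Cloud SQL for MySQL instance.

```python
def build_store(project_id: str, region: str, instance: str, database: str):
    """Return a MySQLVectorStore backed by a hypothetical "documents" table."""
    # Imported inside the function so this sketch can be loaded even where
    # the packages are not installed.
    from langchain_core.embeddings import DeterministicFakeEmbedding
    from langchain_google_cloud_sql_mysql import MySQLEngine, MySQLVectorStore

    # Placeholder model returning 768-dimensional vectors (an assumption);
    # substitute a real Embeddings implementation in practice.
    embedding_service = DeterministicFakeEmbedding(size=768)

    # Connection pool for the Cloud SQL for MySQL instance.
    engine = MySQLEngine.from_instance(
        project_id=project_id,
        region=region,
        instance=instance,
        database=database,
    )

    # One-time setup of the backing table; vector_size must match the
    # dimensionality of the embedding model.
    engine.init_vectorstore_table(table_name="documents", vector_size=768)

    return MySQLVectorStore(
        engine=engine,
        embedding_service=embedding_service,
        table_name="documents",
        k=4,              # documents returned by a search
        fetch_k=20,       # candidates fetched before MMR re-ranking
        lambda_mult=0.5,  # relevance/diversity weight for MMR
    )
```

The k, fetch_k, and lambda_mult arguments are spelled out only to show where they go; omitting them gives the same defaults.
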
Functions

cosine_similarity

cosine_similarity(
    X: typing.Union[
        typing.List[typing.List[float]], typing.List[numpy.ndarray], numpy.ndarray
    ],
    Y: typing.Union[
        typing.List[typing.List[float]], typing.List[numpy.ndarray], numpy.ndarray
    ],
) -> numpy.ndarray

Row-wise cosine similarity between two equal-width matrices.
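
As an illustration of what this computes, here is a minimal NumPy sketch. It assumes the result is a matrix whose entry (i, j) is the cosine similarity between row i of X and row j of Y. The name cosine_similarity_sketch is hypothetical and this is not the library's implementation; in particular, rows of all zeros are not handled here.

```python
import numpy as np


def cosine_similarity_sketch(X, Y):
    """Cosine similarity between every row of X and every row of Y."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.shape[1] != Y.shape[1]:
        raise ValueError("X and Y must have the same number of columns")
    # Length of each row, kept as a column so it broadcasts correctly.
    x_norm = np.linalg.norm(X, axis=1, keepdims=True)
    y_norm = np.linalg.norm(Y, axis=1, keepdims=True)
    # Dot products of all row pairs, divided by the product of their lengths.
    return (X @ Y.T) / (x_norm * y_norm.T)


sims = cosine_similarity_sketch(
    [[1.0, 0.0], [1.0, 1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
)
print(sims.shape)  # (2, 2)
```

Here the first row of X is identical to the first row of Y (similarity 1) and orthogonal to the second (similarity 0), while the second row of X sits at 45 degrees to both.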

maximal_marginal_relevance

maximal_marginal_relevance(
    query_embedding: numpy.ndarray,
    embedding_list: list,
    lambda_mult: float = 0.5,
    k: int = 4,
) -> typing.List[int]

Calculate maximal marginal relevance and return the indices of the selected embeddings in embedding_list.
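
The selection loop can be sketched as below. This assumes the common MMR formulation, in which each candidate is scored as lambda_mult * similarity_to_query - (1 - lambda_mult) * highest_similarity_to_an_already_selected_item, with cosine similarity as the measure. The name mmr_sketch is hypothetical and the code is an illustration, not the library's source.

```python
import numpy as np


def mmr_sketch(query_embedding, embedding_list, lambda_mult=0.5, k=4):
    """Return indices of up to k embeddings chosen greedily by MMR."""
    if min(k, len(embedding_list)) <= 0:
        return []
    E = np.asarray(embedding_list, dtype=float)
    q = np.asarray(query_embedding, dtype=float).reshape(-1)
    # Normalize so that dot products equal cosine similarities.
    E_unit = E / np.linalg.norm(E, axis=1, keepdims=True)
    q_unit = q / np.linalg.norm(q)
    sim_to_query = E_unit @ q_unit

    # Start with the embedding closest to the query.
    selected = [int(np.argmax(sim_to_query))]
    while len(selected) < min(k, len(E)):
        # For each candidate, similarity to its nearest already-selected item.
        redundancy = (E_unit @ E_unit[selected].T).max(axis=1)
        scores = lambda_mult * sim_to_query - (1 - lambda_mult) * redundancy
        scores[selected] = -np.inf  # never pick the same index twice
        selected.append(int(np.argmax(scores)))
    return selected


query = [1.0, 0.0]
docs = [[1.0, 0.18], [1.0, 0.21], [1.0, -0.84]]  # docs[1] nearly duplicates docs[0]
print(mmr_sketch(query, docs, lambda_mult=0.5, k=2))  # [0, 2]
print(mmr_sketch(query, docs, lambda_mult=1.0, k=2))  # [0, 1]
```

With lambda_mult=0.5 the near-duplicate at index 1 is skipped in favor of the more distinct index 2; with lambda_mult=1.0 the redundancy term vanishes and the ranking follows query similarity alone.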