embeddings_filter.py — langchain Source File
Architecture documentation for embeddings_filter.py, a Python file in the langchain codebase. 10 imports, 0 dependents.
Dependency Diagram
graph LR
    56627fd7_312b_b0f6_2695_523285254063["embeddings_filter.py"]
    cfe2bde5_180e_e3b0_df2b_55b3ebaca8e7["collections.abc"]
    56627fd7_312b_b0f6_2695_523285254063 --> cfe2bde5_180e_e3b0_df2b_55b3ebaca8e7
    f3bc7443_c889_119d_0744_aacc3620d8d2["langchain_core.callbacks"]
    56627fd7_312b_b0f6_2695_523285254063 --> f3bc7443_c889_119d_0744_aacc3620d8d2
    c554676d_b731_47b2_a98f_c1c2d537c0aa["langchain_core.documents"]
    56627fd7_312b_b0f6_2695_523285254063 --> c554676d_b731_47b2_a98f_c1c2d537c0aa
    bc46b61d_cfdf_3f6b_a9dd_ac2a328d84b3["langchain_core.embeddings"]
    56627fd7_312b_b0f6_2695_523285254063 --> bc46b61d_cfdf_3f6b_a9dd_ac2a328d84b3
    f4d905c6_a2b2_eb8f_be9b_7808b72f6a16["langchain_core.utils"]
    56627fd7_312b_b0f6_2695_523285254063 --> f4d905c6_a2b2_eb8f_be9b_7808b72f6a16
    6e58aaea_f08e_c099_3cc7_f9567bfb1ae7["pydantic"]
    56627fd7_312b_b0f6_2695_523285254063 --> 6e58aaea_f08e_c099_3cc7_f9567bfb1ae7
    91721f45_4909_e489_8c1f_084f8bd87145["typing_extensions"]
    56627fd7_312b_b0f6_2695_523285254063 --> 91721f45_4909_e489_8c1f_084f8bd87145
    8593fff4_7ff6_3339_11fb_f4c14be375c2["langchain_community.utils.math"]
    56627fd7_312b_b0f6_2695_523285254063 --> 8593fff4_7ff6_3339_11fb_f4c14be375c2
    4e227489_ec66_d1c2_22fe_722af1af7107["langchain_community.document_transformers.embeddings_redundant_filter"]
    56627fd7_312b_b0f6_2695_523285254063 --> 4e227489_ec66_d1c2_22fe_722af1af7107
    cd17727f_b882_7f06_aadc_71fbf75bebb0["numpy"]
    56627fd7_312b_b0f6_2695_523285254063 --> cd17727f_b882_7f06_aadc_71fbf75bebb0
    style 56627fd7_312b_b0f6_2695_523285254063 fill:#6366f1,stroke:#818cf8,color:#fff
Source Code
from collections.abc import Callable, Sequence

from langchain_core.callbacks import Callbacks
from langchain_core.documents import BaseDocumentCompressor, Document
from langchain_core.embeddings import Embeddings
from langchain_core.utils import pre_init
from pydantic import ConfigDict, Field
from typing_extensions import override


def _get_similarity_function() -> Callable:
    try:
        from langchain_community.utils.math import cosine_similarity
    except ImportError as e:
        msg = (
            "To use please install langchain-community "
            "with `pip install langchain-community`."
        )
        raise ImportError(msg) from e
    return cosine_similarity


class EmbeddingsFilter(BaseDocumentCompressor):
    """Embeddings Filter.

    Document compressor that uses embeddings to drop documents unrelated to the query.
    """

    embeddings: Embeddings
    """Embeddings to use for embedding document contents and queries."""
    similarity_fn: Callable = Field(default_factory=_get_similarity_function)
    """Similarity function for comparing documents. Function expected to take as input
    two matrices (List[List[float]]) and return a matrix of scores where higher values
    indicate greater similarity."""
    k: int | None = 20
    """The number of relevant documents to return. Can be set to `None`, in which case
    `similarity_threshold` must be specified."""
    similarity_threshold: float | None = None
    """Threshold for determining when two documents are similar enough
    to be considered redundant. Defaults to `None`, must be specified if `k` is set
    to None."""

    model_config = ConfigDict(
        arbitrary_types_allowed=True,
    )

    @pre_init
    def validate_params(cls, values: dict) -> dict:
        """Validate similarity parameters."""
        if values["k"] is None and values["similarity_threshold"] is None:
            msg = "Must specify one of `k` or `similarity_threshold`."
            raise ValueError(msg)
        return values

    @override
    def compress_documents(
        self,
        documents: Sequence[Document],
        query: str,
        callbacks: Callbacks | None = None,
    // ... (82 more lines)
Domain
- CoreAbstractions
Subdomains
- RunnableInterface
Functions
- _get_similarity_function
Classes
- EmbeddingsFilter
Dependencies
- collections.abc
- langchain_community.document_transformers.embeddings_redundant_filter
- langchain_community.utils.math
- langchain_core.callbacks
- langchain_core.documents
- langchain_core.embeddings
- langchain_core.utils
- numpy
- pydantic
- typing_extensions
Source
- libs/langchain/langchain_classic/retrievers/document_compressors/embeddings_filter.py
Frequently Asked Questions
What does embeddings_filter.py do?
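embeddings_filter.py is a Python source file in the langchain codebase. It defines EmbeddingsFilter, a document compressor that uses embeddings to drop documents unrelated to the query. It belongs to the CoreAbstractions domain, RunnableInterface subdomain.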
What functions are defined in embeddings_filter.py?
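embeddings_filter.py defines one module-level function, _get_similarity_function. It imports and returns cosine_similarity from langchain_community.utils.math, and raises an ImportError with installation instructions if langchain-community is not installed. The function serves as the default factory for the similarity_fn field of EmbeddingsFilter.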
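The similarity_fn docstring in the source describes the expected contract: a function that takes two matrices and returns a matrix of scores where higher values indicate greater similarity. As a rough illustration of that contract, here is a pure-Python cosine similarity sketch. The name `cosine_similarity_matrix` and the choice to score zero-length vectors as 0.0 are assumptions of this example, not the behavior of langchain_community.utils.math.cosine_similarity.

```python
import math
from collections.abc import Sequence

Matrix = Sequence[Sequence[float]]


def cosine_similarity_matrix(x: Matrix, y: Matrix) -> list[list[float]]:
    """Return scores where scores[i][j] compares row i of x with row j of y."""

    def norm(v: Sequence[float]) -> float:
        # Euclidean length of a vector.
        return math.sqrt(sum(a * a for a in v))

    scores: list[list[float]] = []
    for xv in x:
        row: list[float] = []
        for yv in y:
            if len(xv) != len(yv):
                raise ValueError("All rows must have the same dimension.")
            dot = sum(a * b for a, b in zip(xv, yv))
            denom = norm(xv) * norm(yv)
            # Assumption: a zero vector is treated as having similarity 0.0.
            row.append(dot / denom if denom else 0.0)
        scores.append(row)
    return scores
```

A function with this shape could, in principle, be passed as `similarity_fn`; whether the elided part of compress_documents needs the result to be a NumPy array rather than nested lists is not visible in the listing.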
What does embeddings_filter.py depend on?
embeddings_filter.py imports 10 modules: collections.abc, langchain_community.document_transformers.embeddings_redundant_filter, langchain_community.utils.math, langchain_core.callbacks, langchain_core.documents, langchain_core.embeddings, langchain_core.utils, numpy, pydantic, and typing_extensions.
Where is embeddings_filter.py in the architecture?
embeddings_filter.py is located at libs/langchain/langchain_classic/retrievers/document_compressors/embeddings_filter.py (domain: CoreAbstractions, subdomain: RunnableInterface, directory: libs/langchain/langchain_classic/retrievers/document_compressors).