embeddings_filter.py — langchain Source File

Architecture documentation for embeddings_filter.py, a Python file in the langchain codebase. 10 imports, 0 dependents.

File | Python | CoreAbstractions | RunnableInterface | 10 imports | 1 function | 1 class

Dependency Diagram

graph LR
  56627fd7_312b_b0f6_2695_523285254063["embeddings_filter.py"]
  cfe2bde5_180e_e3b0_df2b_55b3ebaca8e7["collections.abc"]
  56627fd7_312b_b0f6_2695_523285254063 --> cfe2bde5_180e_e3b0_df2b_55b3ebaca8e7
  f3bc7443_c889_119d_0744_aacc3620d8d2["langchain_core.callbacks"]
  56627fd7_312b_b0f6_2695_523285254063 --> f3bc7443_c889_119d_0744_aacc3620d8d2
  c554676d_b731_47b2_a98f_c1c2d537c0aa["langchain_core.documents"]
  56627fd7_312b_b0f6_2695_523285254063 --> c554676d_b731_47b2_a98f_c1c2d537c0aa
  bc46b61d_cfdf_3f6b_a9dd_ac2a328d84b3["langchain_core.embeddings"]
  56627fd7_312b_b0f6_2695_523285254063 --> bc46b61d_cfdf_3f6b_a9dd_ac2a328d84b3
  f4d905c6_a2b2_eb8f_be9b_7808b72f6a16["langchain_core.utils"]
  56627fd7_312b_b0f6_2695_523285254063 --> f4d905c6_a2b2_eb8f_be9b_7808b72f6a16
  6e58aaea_f08e_c099_3cc7_f9567bfb1ae7["pydantic"]
  56627fd7_312b_b0f6_2695_523285254063 --> 6e58aaea_f08e_c099_3cc7_f9567bfb1ae7
  91721f45_4909_e489_8c1f_084f8bd87145["typing_extensions"]
  56627fd7_312b_b0f6_2695_523285254063 --> 91721f45_4909_e489_8c1f_084f8bd87145
  8593fff4_7ff6_3339_11fb_f4c14be375c2["langchain_community.utils.math"]
  56627fd7_312b_b0f6_2695_523285254063 --> 8593fff4_7ff6_3339_11fb_f4c14be375c2
  4e227489_ec66_d1c2_22fe_722af1af7107["langchain_community.document_transformers.embeddings_redundant_filter"]
  56627fd7_312b_b0f6_2695_523285254063 --> 4e227489_ec66_d1c2_22fe_722af1af7107
  cd17727f_b882_7f06_aadc_71fbf75bebb0["numpy"]
  56627fd7_312b_b0f6_2695_523285254063 --> cd17727f_b882_7f06_aadc_71fbf75bebb0
  style 56627fd7_312b_b0f6_2695_523285254063 fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

from collections.abc import Callable, Sequence

from langchain_core.callbacks import Callbacks
from langchain_core.documents import BaseDocumentCompressor, Document
from langchain_core.embeddings import Embeddings
from langchain_core.utils import pre_init
from pydantic import ConfigDict, Field
from typing_extensions import override


def _get_similarity_function() -> Callable:
    try:
        from langchain_community.utils.math import cosine_similarity
    except ImportError as e:
        msg = (
            "To use please install langchain-community "
            "with `pip install langchain-community`."
        )
        raise ImportError(msg) from e
    return cosine_similarity


class EmbeddingsFilter(BaseDocumentCompressor):
    """Embeddings Filter.

    Document compressor that uses embeddings to drop documents unrelated to the query.
    """

    embeddings: Embeddings
    """Embeddings to use for embedding document contents and queries."""
    similarity_fn: Callable = Field(default_factory=_get_similarity_function)
    """Similarity function for comparing documents. Function expected to take as input
    two matrices (List[List[float]]) and return a matrix of scores where higher values
    indicate greater similarity."""
    k: int | None = 20
    """The number of relevant documents to return. Can be set to `None`, in which case
    `similarity_threshold` must be specified."""
    similarity_threshold: float | None = None
    """Threshold for determining when two documents are similar enough
    to be considered redundant. Defaults to `None`, must be specified if `k` is set
    to None."""

    model_config = ConfigDict(
        arbitrary_types_allowed=True,
    )

    @pre_init
    def validate_params(cls, values: dict) -> dict:
        """Validate similarity parameters."""
        if values["k"] is None and values["similarity_threshold"] is None:
            msg = "Must specify one of `k` or `similarity_threshold`."
            raise ValueError(msg)
        return values

    @override
    def compress_documents(
        self,
        documents: Sequence[Document],
        query: str,
        callbacks: Callbacks | None = None,
// ... (82 more lines)

Dependencies

  • collections.abc
  • langchain_community.document_transformers.embeddings_redundant_filter
  • langchain_community.utils.math
  • langchain_core.callbacks
  • langchain_core.documents
  • langchain_core.embeddings
  • langchain_core.utils
  • numpy
  • pydantic
  • typing_extensions
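
Not every dependency above is imported when the module loads. In the listing, `langchain_community.utils.math` is imported inside `_get_similarity_function`, so a missing package only surfaces when the default similarity function is requested. A generic sketch of that pattern, using only the standard library, might look like the following; `load_optional` and its parameters are hypothetical names chosen for illustration.

```python
import importlib
from collections.abc import Callable


def load_optional(module_name: str, attr: str, pip_name: str) -> Callable:
    """Import `attr` from `module_name` on demand, with an install hint on failure."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as e:
        # Re-raise with an actionable message, keeping the original error chained.
        msg = f"To use {attr}, install {pip_name} with `pip install {pip_name}`."
        raise ImportError(msg) from e
    return getattr(module, attr)
```

Pairing such a loader with a `default_factory`, as the class does for `similarity_fn`, defers the import until an instance is created without an explicit function.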

Frequently Asked Questions

What does embeddings_filter.py do?
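embeddings_filter.py is a Python source file in the langchain codebase. It defines EmbeddingsFilter, a document compressor that uses embeddings to drop documents unrelated to the query. It belongs to the CoreAbstractions domain, RunnableInterface subdomain.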
What functions are defined in embeddings_filter.py?
embeddings_filter.py defines 1 function: _get_similarity_function.
What does embeddings_filter.py depend on?
embeddings_filter.py imports 10 modules: collections.abc, langchain_community.document_transformers.embeddings_redundant_filter, langchain_community.utils.math, langchain_core.callbacks, langchain_core.documents, langchain_core.embeddings, langchain_core.utils, numpy, and 2 more.
Where is embeddings_filter.py in the architecture?
embeddings_filter.py is located at libs/langchain/langchain_classic/retrievers/document_compressors/embeddings_filter.py (domain: CoreAbstractions, subdomain: RunnableInterface, directory: libs/langchain/langchain_classic/retrievers/document_compressors).
