transformers.py — langchain Source File

Architecture documentation for transformers.py, a Python file in the langchain codebase. 5 imports, 0 dependents.

File | Python | DocumentProcessing | DataLoaders | 5 imports | 1 class | 2 methods

Dependency Diagram

graph LR
  2fc89e25_3b5f_37e3_4816_8dcd082ad581["transformers.py"]
  cccbe73e_4644_7211_4d55_e8fb133a8014["abc"]
  2fc89e25_3b5f_37e3_4816_8dcd082ad581 --> cccbe73e_4644_7211_4d55_e8fb133a8014
  8e2034b7_ceb8_963f_29fc_2ea6b50ef9b3["typing"]
  2fc89e25_3b5f_37e3_4816_8dcd082ad581 --> 8e2034b7_ceb8_963f_29fc_2ea6b50ef9b3
  2971f9da_6393_a3e3_610e_ace3d35ee978["langchain_core.runnables.config"]
  2fc89e25_3b5f_37e3_4816_8dcd082ad581 --> 2971f9da_6393_a3e3_610e_ace3d35ee978
  cfe2bde5_180e_e3b0_df2b_55b3ebaca8e7["collections.abc"]
  2fc89e25_3b5f_37e3_4816_8dcd082ad581 --> cfe2bde5_180e_e3b0_df2b_55b3ebaca8e7
  c554676d_b731_47b2_a98f_c1c2d537c0aa["langchain_core.documents"]
  2fc89e25_3b5f_37e3_4816_8dcd082ad581 --> c554676d_b731_47b2_a98f_c1c2d537c0aa
  style 2fc89e25_3b5f_37e3_4816_8dcd082ad581 fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

"""Document transformers."""

from __future__ import annotations

from abc import ABC, abstractmethod
from typing import TYPE_CHECKING, Any

from langchain_core.runnables.config import run_in_executor

if TYPE_CHECKING:
    from collections.abc import Sequence

    from langchain_core.documents import Document


class BaseDocumentTransformer(ABC):
    """Abstract base class for document transformation.

    A document transformation takes a sequence of `Document` objects and returns a
    sequence of transformed `Document` objects.

    Example:
        ```python
        class EmbeddingsRedundantFilter(BaseDocumentTransformer, BaseModel):
            embeddings: Embeddings
            similarity_fn: Callable = cosine_similarity
            similarity_threshold: float = 0.95

            class Config:
                arbitrary_types_allowed = True

            def transform_documents(
                self, documents: Sequence[Document], **kwargs: Any
            ) -> Sequence[Document]:
                stateful_documents = get_stateful_documents(documents)
                embedded_documents = _get_embeddings_from_stateful_docs(
                    self.embeddings, stateful_documents
                )
                included_idxs = _filter_similar_embeddings(
                    embedded_documents,
                    self.similarity_fn,
                    self.similarity_threshold,
                )
                return [stateful_documents[i] for i in sorted(included_idxs)]

            async def atransform_documents(
                self, documents: Sequence[Document], **kwargs: Any
            ) -> Sequence[Document]:
                raise NotImplementedError
        ```
    """

    @abstractmethod
    def transform_documents(
        self, documents: Sequence[Document], **kwargs: Any
    ) -> Sequence[Document]:
        """Transform a list of documents.

        Args:
            documents: A sequence of `Document` objects to be transformed.

        Returns:
            A sequence of transformed `Document` objects.
        """

    async def atransform_documents(
        self, documents: Sequence[Document], **kwargs: Any
    ) -> Sequence[Document]:
        """Asynchronously transform a list of documents.

        Args:
            documents: A sequence of `Document` objects to be transformed.

        Returns:
            A sequence of transformed `Document` objects.
        """
        return await run_in_executor(
            None, self.transform_documents, documents, **kwargs
        )
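
The listing above pairs one abstract synchronous method with a default asynchronous wrapper. The following is a minimal sketch of how a subclass might use that pattern, not the library's own code. To keep it runnable with the standard library alone, Doc and SketchTransformer are hypothetical stand-ins for Document and BaseDocumentTransformer, MinLengthFilter and its min_chars parameter are invented for illustration, and the thread-pool hand-off is an assumed approximation of what run_in_executor does.

```python
import asyncio
from abc import ABC, abstractmethod
from collections.abc import Sequence
from dataclasses import dataclass, field
from functools import partial
from typing import Any


@dataclass
class Doc:
    """Hypothetical stand-in for langchain_core.documents.Document."""

    page_content: str
    metadata: dict[str, Any] = field(default_factory=dict)


class SketchTransformer(ABC):
    """Stand-in that mirrors the shape of BaseDocumentTransformer."""

    @abstractmethod
    def transform_documents(
        self, documents: Sequence[Doc], **kwargs: Any
    ) -> Sequence[Doc]:
        """Subclasses must provide the synchronous transformation."""

    async def atransform_documents(
        self, documents: Sequence[Doc], **kwargs: Any
    ) -> Sequence[Doc]:
        # Assumption: handing the sync method to the default thread pool
        # approximates the run_in_executor call in the listing.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(
            None, partial(self.transform_documents, documents, **kwargs)
        )


class MinLengthFilter(SketchTransformer):
    """Hypothetical transformer: drop documents shorter than min_chars."""

    def __init__(self, min_chars: int = 10) -> None:
        self.min_chars = min_chars

    def transform_documents(
        self, documents: Sequence[Doc], **kwargs: Any
    ) -> Sequence[Doc]:
        # A per-call keyword argument may override the configured threshold.
        min_chars = kwargs.get("min_chars", self.min_chars)
        return [d for d in documents if len(d.page_content) >= min_chars]


docs = [Doc("short"), Doc("long enough to keep")]

# Only the sync method is implemented; the async path is inherited.
sync_result = MinLengthFilter().transform_documents(docs)
async_result = asyncio.run(MinLengthFilter().atransform_documents(docs))
```

Because the subclass overrides only transform_documents, both calls go through the same filtering logic. A transformer with a natively asynchronous backend could instead override atransform_documents, as the docstring example in the listing does.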

Dependencies

  • abc
  • collections.abc
  • langchain_core.documents
  • langchain_core.runnables.config
  • typing

Frequently Asked Questions

What does transformers.py do?
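transformers.py defines BaseDocumentTransformer, the abstract base class for document transformation in the langchain codebase: a transformation takes a sequence of Document objects and returns a sequence of transformed Document objects. The file is written in Python and belongs to the DocumentProcessing domain, DataLoaders subdomain.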
What functions are defined in transformers.py?
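transformers.py defines no module-level functions. It defines one class, BaseDocumentTransformer, with two methods: the abstract transform_documents and the asynchronous atransform_documents, which by default runs transform_documents through run_in_executor.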
What does transformers.py depend on?
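transformers.py imports 5 modules: abc, collections.abc, langchain_core.documents, langchain_core.runnables.config, and typing. Of these, collections.abc and langchain_core.documents are imported only inside the TYPE_CHECKING block, so they serve type annotations and are not loaded by this file at runtime.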
Where is transformers.py in the architecture?
transformers.py is located at libs/core/langchain_core/documents/transformers.py (domain: DocumentProcessing, subdomain: DataLoaders, directory: libs/core/langchain_core/documents).
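
In the source listing, Sequence and Document are imported inside an if TYPE_CHECKING: block, so two of the five dependencies exist only for static type checkers. The sketch below illustrates that idiom under stated assumptions: decimal.Decimal is an arbitrary stand-in for the guarded imports, double is a hypothetical function, and the annotations are quoted by hand, which has the same deferring effect as the from __future__ import annotations line in the listing.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by static type checkers only; this import never runs at runtime.
    from decimal import Decimal


def double(value: "Decimal") -> "Decimal":
    # The quoted annotation is stored as a plain string, so the missing
    # runtime name does not raise a NameError.
    return value + value


# The annotation stays an unevaluated string.
annotation = double.__annotations__["value"]

# The guarded name was never bound in this module at runtime.
name_bound = "Decimal" in globals()
```

The trade-off is that such names are unavailable for runtime checks like isinstance; in the listing they appear only in signatures, so the guard costs nothing there.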
