DocumentCompressorPipeline Class — langchain Architecture

Architecture documentation for the DocumentCompressorPipeline class in base.py from the langchain codebase.

Dependency Diagram

graph TD
  94714de4_7c52_d36e_a00e_88086fb437ff["DocumentCompressorPipeline"]
  1c219081_6061_3fb9_0ccd_08e0b97c9474["BaseDocumentCompressor"]
  94714de4_7c52_d36e_a00e_88086fb437ff -->|extends| 1c219081_6061_3fb9_0ccd_08e0b97c9474
  91ea4f6e_168e_8d34_bca6_53e61cdc1840["BaseDocumentTransformer"]
  94714de4_7c52_d36e_a00e_88086fb437ff -->|uses| 91ea4f6e_168e_8d34_bca6_53e61cdc1840
  75de190b_0731_ecff_0670_12db0d5e9795["base.py"]
  94714de4_7c52_d36e_a00e_88086fb437ff -->|defined in| 75de190b_0731_ecff_0670_12db0d5e9795
  feea04a0_095e_4647_5bea_20b1cd5ddf1d["compress_documents()"]
  94714de4_7c52_d36e_a00e_88086fb437ff -->|method| feea04a0_095e_4647_5bea_20b1cd5ddf1d
  0c4d469a_35ad_409f_44d2_601ec72d0728["acompress_documents()"]
  94714de4_7c52_d36e_a00e_88086fb437ff -->|method| 0c4d469a_35ad_409f_44d2_601ec72d0728

Source Code

libs/langchain/langchain_classic/retrievers/document_compressors/base.py lines 13–81

class DocumentCompressorPipeline(BaseDocumentCompressor):
    """Document compressor that uses a pipeline of Transformers."""

    transformers: list[BaseDocumentTransformer | BaseDocumentCompressor]
    """List of document filters that are chained together and run in sequence."""

    model_config = ConfigDict(
        arbitrary_types_allowed=True,
    )

    def compress_documents(
        self,
        documents: Sequence[Document],
        query: str,
        callbacks: Callbacks | None = None,
    ) -> Sequence[Document]:
        """Transform a list of documents."""
        for _transformer in self.transformers:
            if isinstance(_transformer, BaseDocumentCompressor):
                accepts_callbacks = (
                    signature(_transformer.compress_documents).parameters.get(
                        "callbacks",
                    )
                    is not None
                )
                if accepts_callbacks:
                    documents = _transformer.compress_documents(
                        documents,
                        query,
                        callbacks=callbacks,
                    )
                else:
                    documents = _transformer.compress_documents(documents, query)
            elif isinstance(_transformer, BaseDocumentTransformer):
                documents = _transformer.transform_documents(documents)
            else:
                msg = f"Got unexpected transformer type: {_transformer}"  # type: ignore[unreachable]
                raise ValueError(msg)  # noqa: TRY004
        return documents

    async def acompress_documents(
        self,
        documents: Sequence[Document],
        query: str,
        callbacks: Callbacks | None = None,
    ) -> Sequence[Document]:
        """Compress retrieved documents given the query context."""
        for _transformer in self.transformers:
            if isinstance(_transformer, BaseDocumentCompressor):
                accepts_callbacks = (
                    signature(_transformer.acompress_documents).parameters.get(
                        "callbacks",
                    )
                    is not None
                )
                if accepts_callbacks:
                    documents = await _transformer.acompress_documents(
                        documents,
                        query,
                        callbacks=callbacks,
                    )
                else:
                    documents = await _transformer.acompress_documents(documents, query)
            elif isinstance(_transformer, BaseDocumentTransformer):
                documents = await _transformer.atransform_documents(documents)
            else:
                msg = f"Got unexpected transformer type: {_transformer}"  # type: ignore[unreachable]
                raise ValueError(msg)  # noqa: TRY004
        return documents
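
The dispatch logic in compress_documents() can be tried without installing langchain. The following is a minimal sketch, not the library's implementation: Document, BaseCompressor, BaseTransformer, the three concrete step classes, and run_pipeline() are all hypothetical stand-ins written for illustration. Only the control flow (type check, then a signature check for a `callbacks` parameter, then a query-free transform as the fallback) mirrors the listing above.

```python
from dataclasses import dataclass
from inspect import signature
from typing import Sequence


@dataclass
class Document:
    """Hypothetical, simplified stand-in for langchain's Document."""
    page_content: str


class BaseCompressor:
    """Hypothetical stand-in for BaseDocumentCompressor (query-aware step)."""


class BaseTransformer:
    """Hypothetical stand-in for BaseDocumentTransformer (query-free step)."""


class KeywordCompressor(BaseCompressor):
    """Keeps only documents whose text mentions the query."""

    def compress_documents(self, documents, query, callbacks=None):
        return [d for d in documents if query.lower() in d.page_content.lower()]


class LegacyCompressor(BaseCompressor):
    """Keeps the first two documents; its signature has no `callbacks`."""

    def compress_documents(self, documents, query):
        return list(documents)[:2]


class UppercaseTransformer(BaseTransformer):
    """Rewrites every document without looking at the query."""

    def transform_documents(self, documents):
        return [Document(d.page_content.upper()) for d in documents]


def run_pipeline(steps, documents: Sequence[Document], query: str, callbacks=None):
    """Run each step in order, feeding its output to the next step."""
    for step in steps:
        if isinstance(step, BaseCompressor):
            # Only forward `callbacks` to compressors that declare it.
            params = signature(step.compress_documents).parameters
            if params.get("callbacks") is not None:
                documents = step.compress_documents(
                    documents, query, callbacks=callbacks
                )
            else:
                documents = step.compress_documents(documents, query)
        elif isinstance(step, BaseTransformer):
            documents = step.transform_documents(documents)
        else:
            raise ValueError(f"Got unexpected transformer type: {step}")
    return documents


docs = [
    Document("Cats purr."),
    Document("Dogs bark."),
    Document("cats nap a lot."),
    Document("Big cats roar."),
]
steps = [KeywordCompressor(), LegacyCompressor(), UppercaseTransformer()]
result = run_pipeline(steps, docs, "cats")
print([d.page_content for d in result])
```

Order matters in this sketch: the keyword filter runs before the truncating step, so the two surviving documents are the first two that mention the query. The signature check lets a step like LegacyCompressor, which does not accept `callbacks`, run without raising a TypeError.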

Frequently Asked Questions

What is the DocumentCompressorPipeline class?
DocumentCompressorPipeline is a document compressor that runs a list of transformers and compressors in sequence, passing each step's output to the next. It is defined in libs/langchain/langchain_classic/retrievers/document_compressors/base.py.
Where is DocumentCompressorPipeline defined?
DocumentCompressorPipeline is defined in libs/langchain/langchain_classic/retrievers/document_compressors/base.py at line 13.
What does DocumentCompressorPipeline extend?
DocumentCompressorPipeline extends BaseDocumentCompressor. BaseDocumentTransformer is not a base class; it is one of the two types accepted in the transformers list.
