test_ensemble.py — langchain Source File

Architecture documentation for test_ensemble.py, a Python unit-test file in the langchain codebase. 5 imports, 0 dependents.

Dependency Diagram

graph LR
  e864dbf7_6db6_640c_a5ae_86c3e52632af["test_ensemble.py"]
  e8ec017e_6c91_4b34_675f_2a96c5aa9be6["langchain_core.callbacks.manager"]
  e864dbf7_6db6_640c_a5ae_86c3e52632af --> e8ec017e_6c91_4b34_675f_2a96c5aa9be6
  c554676d_b731_47b2_a98f_c1c2d537c0aa["langchain_core.documents"]
  e864dbf7_6db6_640c_a5ae_86c3e52632af --> c554676d_b731_47b2_a98f_c1c2d537c0aa
  38bc5323_3713_7377_32f8_091293bea54b["langchain_core.retrievers"]
  e864dbf7_6db6_640c_a5ae_86c3e52632af --> 38bc5323_3713_7377_32f8_091293bea54b
  91721f45_4909_e489_8c1f_084f8bd87145["typing_extensions"]
  e864dbf7_6db6_640c_a5ae_86c3e52632af --> 91721f45_4909_e489_8c1f_084f8bd87145
  6c572730_ac39_9f50_98ed_f75c67bbdfe6["langchain_classic.retrievers.ensemble"]
  e864dbf7_6db6_640c_a5ae_86c3e52632af --> 6c572730_ac39_9f50_98ed_f75c67bbdfe6
  style e864dbf7_6db6_640c_a5ae_86c3e52632af fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

from langchain_core.callbacks.manager import CallbackManagerForRetrieverRun
from langchain_core.documents import Document
from langchain_core.retrievers import BaseRetriever
from typing_extensions import override

from langchain_classic.retrievers.ensemble import EnsembleRetriever


class MockRetriever(BaseRetriever):
    docs: list[Document]

    @override
    def _get_relevant_documents(
        self,
        query: str,
        *,
        run_manager: CallbackManagerForRetrieverRun | None = None,
    ) -> list[Document]:
        """Return the documents."""
        return self.docs


def test_invoke() -> None:
    documents1 = [
        Document(page_content="a", metadata={"id": 1}),
        Document(page_content="b", metadata={"id": 2}),
        Document(page_content="c", metadata={"id": 3}),
    ]
    documents2 = [Document(page_content="b")]

    retriever1 = MockRetriever(docs=documents1)
    retriever2 = MockRetriever(docs=documents2)

    ensemble_retriever = EnsembleRetriever(
        retrievers=[retriever1, retriever2],
        weights=[0.5, 0.5],
        id_key=None,
    )
    ranked_documents = ensemble_retriever.invoke("_")

    # The document with page_content "b" in documents2
    # will be merged with the document with page_content "b"
    # in documents1, so the length of ranked_documents should be 3.
    # Additionally, the document with page_content "b" will be ranked 1st.
    assert len(ranked_documents) == 3
    assert ranked_documents[0].page_content == "b"

    documents1 = [
        Document(page_content="a", metadata={"id": 1}),
        Document(page_content="b", metadata={"id": 2}),
        Document(page_content="c", metadata={"id": 3}),
    ]
    documents2 = [Document(page_content="d")]

    retriever1 = MockRetriever(docs=documents1)
    retriever2 = MockRetriever(docs=documents2)

    ensemble_retriever = EnsembleRetriever(
        retrievers=[retriever1, retriever2],
        weights=[0.5, 0.5],
        id_key=None,
    )
    ranked_documents = ensemble_retriever.invoke("_")

    # The document with page_content "d" in documents2 will not be merged
    # with any document in documents1, so the length of ranked_documents
    # should be 4. The document with page_content "a" and the document
    # with page_content "d" will have the same score, but the document
    # with page_content "a" will be ranked 1st because retriever1 has a smaller index.
    assert len(ranked_documents) == 4
    assert ranked_documents[0].page_content == "a"

    documents1 = [
        Document(page_content="a", metadata={"id": 1}),
        Document(page_content="b", metadata={"id": 2}),
        Document(page_content="c", metadata={"id": 3}),
    ]
    documents2 = [Document(page_content="d", metadata={"id": 2})]

    retriever1 = MockRetriever(docs=documents1)
    retriever2 = MockRetriever(docs=documents2)

    ensemble_retriever = EnsembleRetriever(
        retrievers=[retriever1, retriever2],
        weights=[0.5, 0.5],
        id_key="id",
    )
    ranked_documents = ensemble_retriever.invoke("_")

    # Since id_key is specified, the document with id 2 will be merged.
    # Therefore, the length of ranked_documents should be 3.
    # Additionally, the document with page_content "b" will be ranked 1st.
    assert len(ranked_documents) == 3
    assert ranked_documents[0].page_content == "b"

Domain

CoreAbstractions

Subdomains

Serialization

Functions

test_invoke()

Classes

MockRetriever

Dependencies

langchain_classic.retrievers.ensemble
langchain_core.callbacks.manager
langchain_core.documents
langchain_core.retrievers
typing_extensions

Frequently Asked Questions

What does test_ensemble.py do?

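test_ensemble.py is a Python unit-test file in the langchain codebase. It defines MockRetriever, a BaseRetriever subclass that returns a fixed list of documents, and test_invoke(), which checks how EnsembleRetriever merges and ranks the results of two retrievers with equal weights. Three cases are covered: duplicates merged by page_content (id_key=None), no duplicates (a score tie broken in favor of the first retriever), and duplicates merged by a metadata key (id_key="id"). The file belongs to the CoreAbstractions domain, Serialization subdomain.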
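The rankings asserted in test_invoke() can be reproduced with a small standalone sketch of weighted Reciprocal Rank Fusion. This is not the library's code: it assumes each document's score is the sum of weight / (rank + c) over the retrievers that returned it, with c = 60, and that ties keep first-seen order. The helper name weighted_rrf and the plain-dict documents are hypothetical and used only for illustration.

```python
from collections import defaultdict


def weighted_rrf(doc_lists, weights, key=lambda d: d["page_content"], c=60):
    """Fuse ranked lists; assumed scoring: sum of weight / (rank + c)."""
    scores = defaultdict(float)
    first_seen = {}  # dedup key -> first document seen with that key
    for docs, weight in zip(doc_lists, weights):
        for rank, doc in enumerate(docs, start=1):
            k = key(doc)
            scores[k] += weight / (rank + c)
            first_seen.setdefault(k, doc)
    # sorted() is stable, so equal scores keep first-seen order,
    # which favors the retriever with the smaller index.
    return sorted(first_seen.values(), key=lambda d: scores[key(d)], reverse=True)


# Plain dicts stand in for Document objects (illustration only).
DOCS1 = [
    {"page_content": "a", "metadata": {"id": 1}},
    {"page_content": "b", "metadata": {"id": 2}},
    {"page_content": "c", "metadata": {"id": 3}},
]

# Case 1: "b" appears in both lists, so its two scores add up.
case1 = weighted_rrf([DOCS1, [{"page_content": "b", "metadata": {}}]], [0.5, 0.5])

# Case 2: "d" is new; it ties with "a" (both rank 1, weight 0.5).
case2 = weighted_rrf([DOCS1, [{"page_content": "d", "metadata": {}}]], [0.5, 0.5])

# Case 3: deduplicate on metadata["id"], so "d" (id 2) merges into "b".
case3 = weighted_rrf(
    [DOCS1, [{"page_content": "d", "metadata": {"id": 2}}]],
    [0.5, 0.5],
    key=lambda d: d["metadata"]["id"],
)

print([d["page_content"] for d in case1])
print([d["page_content"] for d in case2])
print([d["page_content"] for d in case3])
```

Under these assumptions, case 1 and case 3 put "b" first with three results, and case 2 puts "a" first with four results, matching the assertions in the test.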

What functions are defined in test_ensemble.py?

test_ensemble.py defines one function, test_invoke(), and one class, MockRetriever.

What does test_ensemble.py depend on?

test_ensemble.py imports five modules: langchain_classic.retrievers.ensemble, langchain_core.callbacks.manager, langchain_core.documents, langchain_core.retrievers, typing_extensions.
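A dependency list like this one can be derived from the source with Python's standard ast module. The sketch below is one possible approach, not a description of how this page was generated; the helper name imported_modules is hypothetical.

```python
import ast


def imported_modules(source: str) -> list[str]:
    """Return the sorted, de-duplicated module names imported by `source`."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # `import a.b, c` -> "a.b", "c"
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            # `from a.b import x` -> "a.b"; bare relative imports are skipped
            modules.add(node.module)
    return sorted(modules)


# The import header of test_ensemble.py, copied from the source listing.
HEADER = """
from langchain_core.callbacks.manager import CallbackManagerForRetrieverRun
from langchain_core.documents import Document
from langchain_core.retrievers import BaseRetriever
from typing_extensions import override

from langchain_classic.retrievers.ensemble import EnsembleRetriever
"""

for name in imported_modules(HEADER):
    print(name)
```

Because the file is only parsed and never executed, this works even when the imported packages are not installed.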

Where is test_ensemble.py in the architecture?

test_ensemble.py is located at libs/langchain/tests/unit_tests/retrievers/test_ensemble.py (domain: CoreAbstractions, subdomain: Serialization, directory: libs/langchain/tests/unit_tests/retrievers).
