test_eval_chain.py — langchain Source File
Architecture documentation for test_eval_chain.py, a Python file in the langchain codebase. 4 imports, 0 dependents.
Entity Profile
Dependency Diagram
graph LR
  5816c849_7673_f83b_f539_9c245cf43f14["test_eval_chain.py"]
  120e2591_3e15_b895_72b6_cb26195e40a6["pytest"]
  5816c849_7673_f83b_f539_9c245cf43f14 --> 120e2591_3e15_b895_72b6_cb26195e40a6
  be300afc_e29c_5acc_fb97_ba6637c7d942["langchain_classic.evaluation.criteria.eval_chain"]
  5816c849_7673_f83b_f539_9c245cf43f14 --> be300afc_e29c_5acc_fb97_ba6637c7d942
  538b302b_528d_b6e6_cf56_04147780d18b["langchain_classic.evaluation.schema"]
  5816c849_7673_f83b_f539_9c245cf43f14 --> 538b302b_528d_b6e6_cf56_04147780d18b
  7e88f5ce_ff41_7d87_8fb2_f355489a149e["tests.unit_tests.llms.fake_llm"]
  5816c849_7673_f83b_f539_9c245cf43f14 --> 7e88f5ce_ff41_7d87_8fb2_f355489a149e
  style 5816c849_7673_f83b_f539_9c245cf43f14 fill:#6366f1,stroke:#818cf8,color:#fff
Source Code
"""Test the criteria eval chain."""
import pytest
from langchain_classic.evaluation.criteria.eval_chain import (
_SUPPORTED_CRITERIA,
Criteria,
CriteriaEvalChain,
CriteriaResultOutputParser,
LabeledCriteriaEvalChain,
)
from langchain_classic.evaluation.schema import StringEvaluator
from tests.unit_tests.llms.fake_llm import FakeLLM
def test_resolve_criteria_str() -> None:
assert CriteriaEvalChain.resolve_criteria("helpfulness") == {
"helpfulness": _SUPPORTED_CRITERIA[Criteria.HELPFULNESS],
}
assert CriteriaEvalChain.resolve_criteria("correctness") == {
"correctness": _SUPPORTED_CRITERIA[Criteria.CORRECTNESS],
}
@pytest.mark.parametrize(
("text", "want"),
[
("Y", {"reasoning": "", "value": "Y", "score": 1}),
(
"Here is my step-by-step reasoning for the given criteria:\n"
'The criterion is: "Do you like cake?" I like cake.\n'
"Y",
{
"reasoning": "Here is my step-by-step reasoning for the given criteria:"
'\nThe criterion is: "Do you like cake?" I like cake.',
"value": "Y",
"score": 1,
},
),
(
" NThe submission N is correct, accurate, and factual. It accurately"
" identifies the specific effects of knowledge and interest on"
" these factors. Therefore, the submission Y meets the criteria. Y",
{
"reasoning": "NThe submission N is correct, accurate, and factual. It"
" accurately identifies the specific effects of knowledge and interest"
" on these factors. Therefore, the submission Y meets the criteria.",
"value": "Y",
"score": 1,
},
),
],
)
def test_criteria_result_output_parser_parse(text: str, want: dict) -> None:
output_parser = CriteriaResultOutputParser()
got = output_parser.parse(text)
assert got.get("reasoning") == want["reasoning"]
assert got.get("value") == want["value"]
assert got.get("score") == want["score"]
@pytest.mark.parametrize("criterion", list(Criteria))
def test_resolve_criteria_enum(criterion: Criteria) -> None:
assert CriteriaEvalChain.resolve_criteria(criterion) == {
criterion.value: _SUPPORTED_CRITERIA[criterion],
}
def test_criteria_eval_chain() -> None:
chain = CriteriaEvalChain.from_llm(
llm=FakeLLM(
queries={"text": "The meaning of life\nY"},
sequential_responses=True,
),
criteria={"my criterion": "my criterion description"},
)
with pytest.warns(UserWarning, match=chain._skip_reference_warning):
result = chain.evaluate_strings(
prediction="my prediction",
reference="my reference",
input="my input",
)
assert result["reasoning"] == "The meaning of life"
def test_criteria_eval_chain_missing_reference() -> None:
chain = LabeledCriteriaEvalChain.from_llm(
llm=FakeLLM(
queries={"text": "The meaning of life\nY"},
sequential_responses=True,
),
criteria={"my criterion": "my criterion description"},
)
with pytest.raises(
ValueError, match="LabeledCriteriaEvalChain requires a reference string"
):
chain.evaluate_strings(prediction="my prediction", input="my input")
def test_implements_string_protocol() -> None:
assert issubclass(CriteriaEvalChain, StringEvaluator)
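The tests above stub out the model with FakeLLM from tests.unit_tests.llms.fake_llm, which is not shown in this file. The sketch below is a hypothetical, simplified stand-in (FakeLLMSketch is an invented name; the real class subclasses LangChain's LLM base and lives in tests/unit_tests/llms/fake_llm.py). It only illustrates the contract the tests rely on: with sequential_responses=True the stub ignores the rendered prompt and replays the configured responses in order, so the criteria chain always receives "The meaning of life\nY".

# Hypothetical sketch of the FakeLLM contract; not the real implementation.
class FakeLLMSketch:
    """Return canned responses instead of calling a real model."""

    def __init__(self, queries: dict[str, str], sequential_responses: bool = False) -> None:
        self.queries = queries
        self.sequential_responses = sequential_responses
        self._index = 0  # position in the canned-response sequence

    def _call(self, prompt: str) -> str:
        if self.sequential_responses:
            # Ignore the prompt and replay the configured values in order.
            response = list(self.queries.values())[self._index]
            self._index += 1
            return response
        # Otherwise treat `queries` as an exact prompt -> response lookup.
        return self.queries[prompt]


# The chain's output parser then splits "The meaning of life\nY" into
# reasoning "The meaning of life" and verdict "Y" (score 1).
llm = FakeLLMSketch(queries={"text": "The meaning of life\nY"}, sequential_responses=True)
assert llm._call("any rendered prompt") == "The meaning of life\nY"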
Domain
- CoreAbstractions
Subdomains
- RunnableInterface
Functions
- test_criteria_eval_chain
- test_criteria_eval_chain_missing_reference
- test_criteria_result_output_parser_parse
- test_implements_string_protocol
- test_resolve_criteria_enum
- test_resolve_criteria_str
Dependencies
- langchain_classic.evaluation.criteria.eval_chain
- langchain_classic.evaluation.schema
- pytest
- tests.unit_tests.llms.fake_llm
Source
- libs/langchain/tests/unit_tests/evaluation/criteria/test_eval_chain.py
Frequently Asked Questions
What does test_eval_chain.py do?
test_eval_chain.py is a source file in the langchain codebase, written in Python. It contains the unit tests for the criteria evaluation chain: resolving criterion names and Criteria enum members to their descriptions, parsing evaluator output into reasoning/value/score, running CriteriaEvalChain and LabeledCriteriaEvalChain against a fake LLM, and checking that CriteriaEvalChain implements the StringEvaluator protocol. It belongs to the CoreAbstractions domain, RunnableInterface subdomain.
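As a rough sketch of the API these tests exercise (the calls mirror the source listing above; my_llm is a placeholder, not something defined in this file):

from langchain_classic.evaluation.criteria.eval_chain import CriteriaEvalChain

# Map a criterion name to its full description (exercised by
# test_resolve_criteria_str above).
print(CriteriaEvalChain.resolve_criteria("helpfulness"))

# With a real LLM in place of `my_llm`, evaluation returns a dict with
# "reasoning", "value", and "score" keys:
# chain = CriteriaEvalChain.from_llm(
#     llm=my_llm,
#     criteria={"helpfulness": "Is the submission helpful?"},
# )
# result = chain.evaluate_strings(prediction="...", input="...")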
What functions are defined in test_eval_chain.py?
test_eval_chain.py defines 6 functions: test_criteria_eval_chain, test_criteria_eval_chain_missing_reference, test_criteria_result_output_parser_parse, test_implements_string_protocol, test_resolve_criteria_enum, and test_resolve_criteria_str.
What does test_eval_chain.py depend on?
test_eval_chain.py imports 4 modules: langchain_classic.evaluation.criteria.eval_chain, langchain_classic.evaluation.schema, pytest, and tests.unit_tests.llms.fake_llm.
Where is test_eval_chain.py in the architecture?
test_eval_chain.py is located at libs/langchain/tests/unit_tests/evaluation/criteria/test_eval_chain.py (domain: CoreAbstractions, subdomain: RunnableInterface, directory: libs/langchain/tests/unit_tests/evaluation/criteria).
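To run just this module from the repository root (assuming pytest and the langchain test dependencies are installed), a minimal invocation via pytest's Python entry point:

import pytest

# Equivalent to `pytest -q <path>` on the command line.
raise SystemExit(
    pytest.main(
        ["-q", "libs/langchain/tests/unit_tests/evaluation/criteria/test_eval_chain.py"]
    )
)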