PairwiseStringEvaluator Class — langchain Architecture

Architecture documentation for the PairwiseStringEvaluator class in schema.py from the langchain codebase.

Class python

Dependency Diagram

graph TD
  910b9203_afa5_b8ca_1a1e_3933f70c340f["PairwiseStringEvaluator"]
  c3049278_87a9_b830_293b_a3dcf7ed9dd3["_EvalArgsMixin"]
  910b9203_afa5_b8ca_1a1e_3933f70c340f -->|extends| c3049278_87a9_b830_293b_a3dcf7ed9dd3
  b8a2957b_df2e_04bc_f892_0752a91e1a55["schema.py"]
  910b9203_afa5_b8ca_1a1e_3933f70c340f -->|defined in| b8a2957b_df2e_04bc_f892_0752a91e1a55
  6695fe03_0fd9_7552_8ebc_8e87d5983012["_evaluate_string_pairs()"]
  910b9203_afa5_b8ca_1a1e_3933f70c340f -->|method| 6695fe03_0fd9_7552_8ebc_8e87d5983012
  a7651907_5732_2675_07d4_747590e3fc72["_aevaluate_string_pairs()"]
  910b9203_afa5_b8ca_1a1e_3933f70c340f -->|method| a7651907_5732_2675_07d4_747590e3fc72
  3ea3b093_5648_88aa_046a_77ffd66a2814["evaluate_string_pairs()"]
  910b9203_afa5_b8ca_1a1e_3933f70c340f -->|method| 3ea3b093_5648_88aa_046a_77ffd66a2814
  d7002ca7_f638_ad53_b56f_74502f453dd5["aevaluate_string_pairs()"]
  910b9203_afa5_b8ca_1a1e_3933f70c340f -->|method| d7002ca7_f638_ad53_b56f_74502f453dd5
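
The diagram shows two private hooks and two public entry points. As a rough illustration of how those four methods fit together, here is a minimal, self-contained sketch. It does not import langchain: `PairwiseEvaluatorSketch` and `ShorterAnswerEvaluator` are hypothetical stand-ins, the `"value"`/`"score"` result keys are illustrative assumptions, and the async default uses the standard library's `loop.run_in_executor` rather than langchain's own `run_in_executor` helper.

```python
import asyncio
from abc import ABC, abstractmethod
from functools import partial
from typing import Any, Optional


class PairwiseEvaluatorSketch(ABC):
    """Hypothetical stand-in mirroring the four-method layout in the diagram."""

    @abstractmethod
    def _evaluate_string_pairs(
        self,
        *,
        prediction: str,
        prediction_b: str,
        reference: Optional[str] = None,
        input: Optional[str] = None,
        **kwargs: Any,
    ) -> dict:
        """Subclasses put the actual comparison logic here."""

    async def _aevaluate_string_pairs(self, **kwargs: Any) -> dict:
        # Default async hook: run the sync hook in the default thread pool.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(
            None, partial(self._evaluate_string_pairs, **kwargs)
        )

    def evaluate_string_pairs(self, **kwargs: Any) -> dict:
        # Public sync entry point (argument validation is omitted in this sketch).
        return self._evaluate_string_pairs(**kwargs)

    async def aevaluate_string_pairs(self, **kwargs: Any) -> dict:
        # Public async entry point; delegates to the async hook.
        return await self._aevaluate_string_pairs(**kwargs)


class ShorterAnswerEvaluator(PairwiseEvaluatorSketch):
    """Toy evaluator: prefers the shorter of the two predictions."""

    def _evaluate_string_pairs(
        self, *, prediction, prediction_b, reference=None, input=None, **kwargs
    ):
        prefers_a = len(prediction) <= len(prediction_b)
        # Illustrative result keys; not a documented contract.
        return {"value": "A" if prefers_a else "B", "score": 1 if prefers_a else 0}


evaluator = ShorterAnswerEvaluator()
sync_result = evaluator.evaluate_string_pairs(
    prediction="Paris", prediction_b="It is Paris, France"
)
async_result = asyncio.run(
    evaluator.aevaluate_string_pairs(
        prediction="Paris", prediction_b="It is Paris, France"
    )
)
```

Because only `_evaluate_string_pairs` is abstract, a subclass that implements that one hook gets a working async path through the thread pool; overriding `_aevaluate_string_pairs` is only worthwhile when a natively asynchronous implementation is available.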

Source Code

libs/langchain/langchain_classic/evaluation/schema.py lines 265–380

class PairwiseStringEvaluator(_EvalArgsMixin, ABC):
    """Compare the output of two models (or two outputs of the same model)."""

    @abstractmethod
    def _evaluate_string_pairs(
        self,
        *,
        prediction: str,
        prediction_b: str,
        reference: str | None = None,
        input: str | None = None,  # noqa: A002
        **kwargs: Any,
    ) -> dict:
        """Evaluate the output string pairs.

        Args:
            prediction: The output string from the first model.
            prediction_b: The output string from the second model.
            reference: The expected output / reference string.
            input: The input string.
            **kwargs: Additional keyword arguments, such as callbacks and optional reference strings.

        Returns:
            `dict` containing the preference, scores, and/or other information.
        """  # noqa: E501

    async def _aevaluate_string_pairs(
        self,
        *,
        prediction: str,
        prediction_b: str,
        reference: str | None = None,
        input: str | None = None,  # noqa: A002
        **kwargs: Any,
    ) -> dict:
        """Asynchronously evaluate the output string pairs.

        Args:
            prediction: The output string from the first model.
            prediction_b: The output string from the second model.
            reference: The expected output / reference string.
            input: The input string.
            **kwargs: Additional keyword arguments, such as callbacks and optional reference strings.

        Returns:
            `dict` containing the preference, scores, and/or other information.
        """  # noqa: E501
        return await run_in_executor(
            None,
            self._evaluate_string_pairs,
            prediction=prediction,
            prediction_b=prediction_b,
            reference=reference,
            input=input,
            **kwargs,
        )

    def evaluate_string_pairs(
        self,
        *,
        prediction: str,
        prediction_b: str,
        reference: str | None = None,
        input: str | None = None,  # noqa: A002
        **kwargs: Any,
    ) -> dict:
        """Evaluate the output string pairs.

        Args:
            prediction: The output string from the first model.
            prediction_b: The output string from the second model.
            reference: The expected output / reference string.
            input: The input string.
            **kwargs: Additional keyword arguments, such as callbacks and optional reference strings.

        Returns:
            `dict` containing the preference, scores, and/or other information.
        """  # noqa: E501
        self._check_evaluation_args(reference=reference, input_=input)
        return self._evaluate_string_pairs(
            prediction=prediction,