EvaluatorCallbackHandler Class — langchain Architecture

Architecture documentation for the EvaluatorCallbackHandler class in evaluation.py from the langchain codebase.

Class python

Dependency Diagram

graph TD
  d98d30f4_d5fd_24fc_54d0_e2f82eecc3cd["EvaluatorCallbackHandler"]
  0f6b3261_31fa_c34e_ca33_cb141bdf78ff["BaseTracer"]
  d98d30f4_d5fd_24fc_54d0_e2f82eecc3cd -->|extends| 0f6b3261_31fa_c34e_ca33_cb141bdf78ff
  591182fd_2224_b396_14bb_b20582f6b720["evaluation.py"]
  d98d30f4_d5fd_24fc_54d0_e2f82eecc3cd -->|defined in| 591182fd_2224_b396_14bb_b20582f6b720
  cbca36a7_cf61_b4f4_35e4_e5b9a1bc5b1e["__init__()"]
  d98d30f4_d5fd_24fc_54d0_e2f82eecc3cd -->|method| cbca36a7_cf61_b4f4_35e4_e5b9a1bc5b1e
  0c53e289_4919_bbc6_c165_0a9bf3c71d14["_evaluate_in_project()"]
  d98d30f4_d5fd_24fc_54d0_e2f82eecc3cd -->|method| 0c53e289_4919_bbc6_c165_0a9bf3c71d14
  b0dc41a9_2b64_0617_60e8_135a3bffaff5["_select_eval_results()"]
  d98d30f4_d5fd_24fc_54d0_e2f82eecc3cd -->|method| b0dc41a9_2b64_0617_60e8_135a3bffaff5
  9cdfe3a0_d3ac_e744_24ad_fa3959981970["_log_evaluation_feedback()"]
  d98d30f4_d5fd_24fc_54d0_e2f82eecc3cd -->|method| 9cdfe3a0_d3ac_e744_24ad_fa3959981970
  f8b5f1f4_e3b0_be12_bf7c_561a99a6e105["_persist_run()"]
  d98d30f4_d5fd_24fc_54d0_e2f82eecc3cd -->|method| f8b5f1f4_e3b0_be12_bf7c_561a99a6e105
  0ece466d_c5f2_7e8f_ed12_9492fc042332["wait_for_futures()"]
  d98d30f4_d5fd_24fc_54d0_e2f82eecc3cd -->|method| 0ece466d_c5f2_7e8f_ed12_9492fc042332

Source Code

libs/core/langchain_core/tracers/evaluation.py lines 38–226

class EvaluatorCallbackHandler(BaseTracer):
    """Tracer that runs a run evaluator whenever a run is persisted.

    Attributes:
        client: The LangSmith client instance used for evaluating the runs.
    """

    name: str = "evaluator_callback_handler"

    example_id: UUID | None = None
    """The example ID associated with the runs."""

    client: langsmith.Client
    """The LangSmith client instance used for evaluating the runs."""

    evaluators: Sequence[langsmith.RunEvaluator] = ()
    """The sequence of run evaluators to be executed."""

    executor: ThreadPoolExecutor | None = None
    """The thread pool executor used for running the evaluators."""

    futures: weakref.WeakSet[Future] = weakref.WeakSet()
    """The set of futures representing the running evaluators."""

    skip_unfinished: bool = True
    """Whether to skip runs that are not finished or raised an error."""

    project_name: str | None = None
    """The LangSmith project name to be organize eval chain runs under."""

    logged_eval_results: dict[tuple[str, str], list[EvaluationResult]]

    lock: threading.Lock

    def __init__(
        self,
        evaluators: Sequence[langsmith.RunEvaluator],
        client: langsmith.Client | None = None,
        example_id: UUID | str | None = None,
        skip_unfinished: bool = True,  # noqa: FBT001,FBT002
        project_name: str | None = "evaluators",
        max_concurrency: int | None = None,
        **kwargs: Any,
    ) -> None:
        """Create an EvaluatorCallbackHandler.

        Args:
            evaluators: The run evaluators to apply to all top level runs.
            client: The LangSmith client instance to use for evaluating the runs.

                If not specified, a new instance will be created.
            example_id: The example ID to be associated with the runs.
            skip_unfinished: Whether to skip unfinished runs.
            project_name: The LangSmith project name to organize eval chain runs
                under.
            max_concurrency: The maximum number of concurrent evaluators to run.
        """
        super().__init__(**kwargs)
        self.example_id = (
            UUID(example_id) if isinstance(example_id, str) else example_id
        )
        self.client = client or langchain_tracer.get_client()
        self.evaluators = evaluators
        if max_concurrency is None:
            self.executor = _get_executor()
        elif max_concurrency > 0:
            self.executor = ThreadPoolExecutor(max_workers=max_concurrency)
            weakref.finalize(
                self,
                lambda: cast("ThreadPoolExecutor", self.executor).shutdown(wait=True),
            )
        else:
            self.executor = None
        self.futures = weakref.WeakSet[Future[None]]()
        self.skip_unfinished = skip_unfinished
        self.project_name = project_name
        self.logged_eval_results = {}
        self.lock = threading.Lock()
        _TRACERS.add(self)

    def _evaluate_in_project(self, run: Run, evaluator: langsmith.RunEvaluator) -> None: