arun_on_dataset() — langchain Function Reference

Architecture documentation for the arun_on_dataset() function in runner_utils.py from the langchain codebase.

Entity Profile

Dependency Diagram

graph TD
  1ac4f6b0_183e_bd75_7f40_81c2246416b6["arun_on_dataset()"]
  8253c602_7d0c_9195_a7e1_3e9b19304131["runner_utils.py"]
  1ac4f6b0_183e_bd75_7f40_81c2246416b6 -->|defined in| 8253c602_7d0c_9195_a7e1_3e9b19304131
  00d82cfb_ba59_4f67_e504_1faad0617f06["prepare()"]
  1ac4f6b0_183e_bd75_7f40_81c2246416b6 -->|calls| 00d82cfb_ba59_4f67_e504_1faad0617f06
  f2fb82ef_40a0_07e3_1d8e_3a52a5a502ce["finish()"]
  1ac4f6b0_183e_bd75_7f40_81c2246416b6 -->|calls| f2fb82ef_40a0_07e3_1d8e_3a52a5a502ce
  style 1ac4f6b0_183e_bd75_7f40_81c2246416b6 fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

libs/langchain/langchain_classic/smith/evaluation/runner_utils.py lines 1338–1509

async def arun_on_dataset(
    client: Client | None,
    dataset_name: str,
    llm_or_chain_factory: MODEL_OR_CHAIN_FACTORY,
    *,
    evaluation: smith_eval.RunEvalConfig | None = None,
    dataset_version: datetime | str | None = None,
    concurrency_level: int = 5,
    project_name: str | None = None,
    project_metadata: dict[str, Any] | None = None,
    verbose: bool = False,
    revision_id: str | None = None,
    **kwargs: Any,
) -> dict[str, Any]:
    """Run on dataset.

    Run the Chain or language model on a dataset and store traces
    to the specified project name.

    For the synchronous version of this function,
    see `run_on_dataset`.

    Args:
        dataset_name: Name of the dataset to run the chain on.
        llm_or_chain_factory: Language model or Chain constructor to run
            over the dataset. The Chain constructor is used to permit
            independent calls on each example without carrying over state.
        evaluation: Configuration for evaluators to run on the
            results of the chain.
        dataset_version: Optional version of the dataset.
        concurrency_level: The number of async tasks to run concurrently.
        project_name: Name of the project to store the traces in.
            Defaults to `{dataset_name}-{chain class name}-{datetime}`.
        project_metadata: Optional metadata to add to the project.
            Useful for storing information about the test variant
            (prompt version, model version, etc.).
        client: LangSmith client to use to access the dataset and to
            log feedback and run traces.
        verbose: Whether to print progress.
        revision_id: Optional revision identifier to assign this test run to
            track the performance of different versions of your system.
        **kwargs: Should not be used, but is provided for backwards compatibility.

    Returns:
        `dict` containing the run's project name and the resulting model outputs.

    Examples:
    ```python
    from langsmith import Client
    from langchain_openai import ChatOpenAI
    from langchain_classic.chains import LLMChain
from langchain_classic import smith as smith_eval
from langchain_classic.smith import arun_on_dataset

    # Chains may have memory. Passing in a constructor function lets the
    # evaluation framework avoid cross-contamination between runs.
    def construct_chain():
        model = ChatOpenAI(temperature=0)
        chain = LLMChain.from_string(
            model,
            "What's the answer to {your_input_key}"
        )
        return chain

    # Load off-the-shelf evaluators via config or the EvaluatorType (string or enum)
    evaluation_config = smith_eval.RunEvalConfig(
        evaluators=[
            "qa",  # "Correctness" against a reference answer
            "embedding_distance",
            smith_eval.RunEvalConfig.Criteria("helpfulness"),
            smith_eval.RunEvalConfig.Criteria({
                "fifth-grader-score": "Do you have to be smarter than a fifth "
                "grader to answer this question?"
            }),
        ]
    )

    client = Client()
await arun_on_dataset(
    client,
    dataset_name="<my_dataset_name>",
    llm_or_chain_factory=construct_chain,
    evaluation=evaluation_config,
)
```

Frequently Asked Questions

What does arun_on_dataset() do?
arun_on_dataset() asynchronously runs a chain or language model over a LangSmith dataset and stores the resulting traces (and any configured evaluator feedback) under a project. It is defined in libs/langchain/langchain_classic/smith/evaluation/runner_utils.py.
Where is arun_on_dataset() defined?
arun_on_dataset() is defined in libs/langchain/langchain_classic/smith/evaluation/runner_utils.py at line 1338.
What does arun_on_dataset() call?
arun_on_dataset() calls two functions: prepare() and finish().
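The docstring's example passes a constructor (`construct_chain`) rather than a chain instance because chains may carry memory between calls. A minimal plain-Python sketch of that design choice, using a hypothetical `StatefulChain` class (not part of langchain), shows why a fresh instance per dataset example avoids cross-contamination:

```python
class StatefulChain:
    """Toy stand-in for a chain with memory (hypothetical, for illustration)."""

    def __init__(self):
        self.history = []

    def run(self, text):
        # Output depends on accumulated state, like a chain with memory.
        self.history.append(text)
        return f"turn {len(self.history)}: {text}"


def chain_factory():
    # A fresh instance per dataset example: no shared history.
    return StatefulChain()


# Reusing one instance leaks state across examples...
shared = StatefulChain()
leaked = [shared.run(x) for x in ["a", "b"]]
print(leaked)    # ['turn 1: a', 'turn 2: b']

# ...while the factory pattern keeps each run independent.
isolated = [chain_factory().run(x) for x in ["a", "b"]]
print(isolated)  # ['turn 1: a', 'turn 1: b']
```

This is why stateless models can be passed directly, while stateful chains should be wrapped in a factory.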

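`concurrency_level` bounds how many dataset examples are evaluated at once. The sketch below illustrates that kind of bounded fan-out with `asyncio.Semaphore`; it is an illustration of the pattern only, not the function's actual implementation, and `fake_eval` is a hypothetical stand-in for a chain call:

```python
import asyncio


async def run_bounded(items, worker, concurrency_level=5):
    # Allow at most `concurrency_level` workers in flight at once.
    sem = asyncio.Semaphore(concurrency_level)

    async def guarded(item):
        async with sem:
            return await worker(item)

    return await asyncio.gather(*(guarded(i) for i in items))


async def fake_eval(example):
    # Hypothetical per-example evaluation; the sleep stands in for an LLM call.
    await asyncio.sleep(0.01)
    return {"input": example, "output": example.upper()}


results = asyncio.run(run_bounded(["a", "b", "c"], fake_eval, concurrency_level=2))
print(results[0]["output"])  # A
```

`asyncio.gather` preserves input order, so results line up with the dataset examples even though tasks may finish out of order.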