run_on_dataset() — langchain Function Reference
Architecture documentation for the run_on_dataset() function in runner_utils.py from the langchain codebase.
Entity Profile
Dependency Diagram
graph TD
    0522e7ee_f1e6_b6f7_6738_1dad72cfffba["run_on_dataset()"]
    8253c602_7d0c_9195_a7e1_3e9b19304131["runner_utils.py"]
    0522e7ee_f1e6_b6f7_6738_1dad72cfffba -->|defined in| 8253c602_7d0c_9195_a7e1_3e9b19304131
    c2ae8ee6_ba74_2f11_df16_cafb61b88f1e["_wrap_in_chain_factory()"]
    c2ae8ee6_ba74_2f11_df16_cafb61b88f1e -->|calls| 0522e7ee_f1e6_b6f7_6738_1dad72cfffba
    9a9f493e_7864_c75d_ebe0_af192df494f6["_prepare_eval_run()"]
    9a9f493e_7864_c75d_ebe0_af192df494f6 -->|calls| 0522e7ee_f1e6_b6f7_6738_1dad72cfffba
    00d82cfb_ba59_4f67_e504_1faad0617f06["prepare()"]
    0522e7ee_f1e6_b6f7_6738_1dad72cfffba -->|calls| 00d82cfb_ba59_4f67_e504_1faad0617f06
    385c1e91_d947_1192_8746_ee1dd66ceb54["_run_llm_or_chain()"]
    0522e7ee_f1e6_b6f7_6738_1dad72cfffba -->|calls| 385c1e91_d947_1192_8746_ee1dd66ceb54
    f2fb82ef_40a0_07e3_1d8e_3a52a5a502ce["finish()"]
    0522e7ee_f1e6_b6f7_6738_1dad72cfffba -->|calls| f2fb82ef_40a0_07e3_1d8e_3a52a5a502ce
    style 0522e7ee_f1e6_b6f7_6738_1dad72cfffba fill:#6366f1,stroke:#818cf8,color:#fff
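Read as a call flow, the diagram says run_on_dataset() first prepares the evaluation run, then invokes the model or chain on each example via _run_llm_or_chain(), and finally aggregates everything in finish(). The sketch below is a simplified, self-contained illustration of that three-step shape only; the container dataclass, helper bodies, and signatures are stand-ins invented for clarity and are not langchain's actual implementation.

```python
# Illustrative stand-ins only -- the real prepare()/_run_llm_or_chain()/finish()
# live in runner_utils.py and have different signatures.
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class _RunContainer:
    """Minimal stand-in for the state prepare() hands to the later steps."""

    wrapped_model: Callable[[dict], Any]
    examples: list[dict]
    project_name: str


def prepare(dataset_name: str, chain_factory: Callable[[], Callable[[dict], Any]]) -> _RunContainer:
    # The real prepare step resolves the dataset through the LangSmith client
    # and creates a tracing project; here we fabricate two examples.
    examples = [{"your_input_key": "2 + 2"}, {"your_input_key": "capital of France"}]
    return _RunContainer(
        wrapped_model=lambda inputs: chain_factory()(inputs),  # fresh chain per call
        examples=examples,
        project_name=f"{dataset_name}-demo",
    )


def _run_llm_or_chain(example: dict, model: Callable[[dict], Any]) -> Any:
    # The real helper also records a trace for this example's run.
    return model(example)


def finish(container: _RunContainer, batch_results: list[Any]) -> dict[str, Any]:
    # The real finish step attaches evaluator feedback before returning.
    return {"project_name": container.project_name, "results": batch_results}


def run_on_dataset_sketch(dataset_name: str, chain_factory, concurrency_level: int = 5) -> dict[str, Any]:
    container = prepare(dataset_name, chain_factory)
    with ThreadPoolExecutor(max_workers=concurrency_level) as pool:
        batch_results = list(
            pool.map(lambda ex: _run_llm_or_chain(ex, container.wrapped_model), container.examples)
        )
    return finish(container, batch_results)


# A "chain" here is just any callable over an inputs dict.
print(run_on_dataset_sketch("my-dataset", lambda: (lambda inputs: inputs["your_input_key"].upper())))
```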
Source Code
libs/langchain/langchain_classic/smith/evaluation/runner_utils.py lines 1512–1698
def run_on_dataset(
    client: Client | None,
    dataset_name: str,
    llm_or_chain_factory: MODEL_OR_CHAIN_FACTORY,
    *,
    evaluation: smith_eval.RunEvalConfig | None = None,
    dataset_version: datetime | str | None = None,
    concurrency_level: int = 5,
    project_name: str | None = None,
    project_metadata: dict[str, Any] | None = None,
    verbose: bool = False,
    revision_id: str | None = None,
    **kwargs: Any,
) -> dict[str, Any]:
    """Run on dataset.

    Run the Chain or language model on a dataset and store traces
    to the specified project name.

    For the (usually faster) async version of this function,
    see `arun_on_dataset`.

    Args:
        dataset_name: Name of the dataset to run the chain on.
        llm_or_chain_factory: Language model or Chain constructor to run
            over the dataset. The Chain constructor is used to permit
            independent calls on each example without carrying over state.
        evaluation: Configuration for evaluators to run on the
            results of the chain.
        dataset_version: Optional version of the dataset.
        concurrency_level: The number of async tasks to run concurrently.
        project_name: Name of the project to store the traces in.
            Defaults to `{dataset_name}-{chain class name}-{datetime}`.
        project_metadata: Optional metadata to add to the project.
            Useful for storing information about the test variant
            (prompt version, model version, etc.).
        client: LangSmith client to use to access the dataset and to
            log feedback and run traces.
        verbose: Whether to print progress.
        revision_id: Optional revision identifier to assign this test run to
            track the performance of different versions of your system.
        **kwargs: Should not be used, but is provided for backwards compatibility.

    Returns:
        `dict` containing the run's project name and the resulting model outputs.

    Examples:
        ```python
        from langsmith import Client
        from langchain_openai import ChatOpenAI
        from langchain_classic.chains import LLMChain
        from langchain_classic.smith import RunEvalConfig, run_on_dataset

        # Chains may have memory. Passing in a constructor function lets the
        # evaluation framework avoid cross-contamination between runs.
        def construct_chain():
            model = ChatOpenAI(temperature=0)
            chain = LLMChain.from_string(
                model,
                "What's the answer to {your_input_key}"
            )
            return chain

        # Load off-the-shelf evaluators via config or the EvaluatorType (string or enum)
        evaluation_config = RunEvalConfig(
            evaluators=[
                "qa",  # "Correctness" against a reference answer
                "embedding_distance",
                RunEvalConfig.Criteria("helpfulness"),
                RunEvalConfig.Criteria({
                    "fifth-grader-score": "Do you have to be smarter than a fifth "
                    "grader to answer this question?"
                }),
            ]
        )

        client = Client()
        run_on_dataset(
            client,
            dataset_name="<my_dataset_name>",
            llm_or_chain_factory=construct_chain,
            evaluation=evaluation_config,
        )
        ```
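Putting the Args and Returns sections together, a fuller call than the docstring's minimal example might look like the hedged sketch below. The import path follows the docstring example above; the project_name and project_metadata values are placeholders, and the keys read from the returned dict are assumptions based on the Returns description rather than verified output. Running it requires valid LangSmith and model-provider credentials plus an existing dataset.

```python
from langsmith import Client
from langchain_openai import ChatOpenAI
from langchain_classic.chains import LLMChain
from langchain_classic.smith import RunEvalConfig, run_on_dataset


def construct_chain():
    # A fresh chain per example keeps memory from leaking across runs.
    return LLMChain.from_string(
        ChatOpenAI(temperature=0),
        "What's the answer to {your_input_key}",
    )


results = run_on_dataset(
    Client(),
    dataset_name="<my_dataset_name>",
    llm_or_chain_factory=construct_chain,
    evaluation=RunEvalConfig(evaluators=["qa"]),
    project_name="qa-eval-variant-v1",          # optional; auto-named if omitted
    project_metadata={"prompt_version": "v1"},  # tag the variant under test
    concurrency_level=5,
    verbose=True,
)

print(results["project_name"])  # assumed key: name of the traced project
print(results["results"])       # assumed key: per-example outputs and feedback
```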
Frequently Asked Questions
What does run_on_dataset() do?
run_on_dataset() runs a Chain or language model over each example in a LangSmith dataset and stores the resulting traces (plus any evaluator feedback) under the specified project. It is defined in libs/langchain/langchain_classic/smith/evaluation/runner_utils.py; its async counterpart is arun_on_dataset.
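The docstring above points to arun_on_dataset as the usually faster async variant. Below is a minimal hedged sketch, assuming arun_on_dataset is exported alongside run_on_dataset and accepts the same arguments; the dataset name is a placeholder and credentials are required to actually run it.

```python
import asyncio

from langsmith import Client
from langchain_openai import ChatOpenAI
from langchain_classic.chains import LLMChain
from langchain_classic.smith import arun_on_dataset


def construct_chain():
    # Same factory pattern as the sync example: a fresh chain per example.
    return LLMChain.from_string(ChatOpenAI(temperature=0), "What's the answer to {your_input_key}")


async def main() -> None:
    results = await arun_on_dataset(
        Client(),
        dataset_name="<my_dataset_name>",
        llm_or_chain_factory=construct_chain,
        concurrency_level=5,
    )
    print(results)


asyncio.run(main())
```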
Where is run_on_dataset() defined?
run_on_dataset() is defined in libs/langchain/langchain_classic/smith/evaluation/runner_utils.py at line 1512.
What does run_on_dataset() call?
run_on_dataset() calls three functions: prepare(), _run_llm_or_chain(), and finish().
What calls run_on_dataset()?
run_on_dataset() is called by two functions: _prepare_eval_run() and _wrap_in_chain_factory().