ToolCallLimitMiddleware Class — langchain Architecture

Architecture documentation for the ToolCallLimitMiddleware class in tool_call_limit.py from the langchain codebase.

Dependency Diagram

graph TD
  68f7858b_b207_3223_e360_e5e5b35adf21["ToolCallLimitMiddleware"]
  de5a7878_b3fe_95d7_2575_7f534546dc1e["AgentMiddleware"]
  68f7858b_b207_3223_e360_e5e5b35adf21 -->|extends| de5a7878_b3fe_95d7_2575_7f534546dc1e
  4eaf9f24_fce9_0af1_80e1_f2e4f9935aeb["tool_call_limit.py"]
  68f7858b_b207_3223_e360_e5e5b35adf21 -->|defined in| 4eaf9f24_fce9_0af1_80e1_f2e4f9935aeb
  334dcb67_a99d_7d1b_fe5a_d24003d136d6["__init__()"]
  68f7858b_b207_3223_e360_e5e5b35adf21 -->|method| 334dcb67_a99d_7d1b_fe5a_d24003d136d6
  fc849828_dd4d_4c4b_d266_09dd9b4abea8["name()"]
  68f7858b_b207_3223_e360_e5e5b35adf21 -->|method| fc849828_dd4d_4c4b_d266_09dd9b4abea8
  2e791b1d_2872_d64f_168d_31d605574212["_would_exceed_limit()"]
  68f7858b_b207_3223_e360_e5e5b35adf21 -->|method| 2e791b1d_2872_d64f_168d_31d605574212
  69072976_06ba_948b_d3cc_f7ede1e73d0f["_matches_tool_filter()"]
  68f7858b_b207_3223_e360_e5e5b35adf21 -->|method| 69072976_06ba_948b_d3cc_f7ede1e73d0f
  6385d4be_67e3_d055_9df4_4d5383df6d8d["_separate_tool_calls()"]
  68f7858b_b207_3223_e360_e5e5b35adf21 -->|method| 6385d4be_67e3_d055_9df4_4d5383df6d8d
  2d4474df_60a6_0dd0_a7de_afbe664f4896["after_model()"]
  68f7858b_b207_3223_e360_e5e5b35adf21 -->|method| 2d4474df_60a6_0dd0_a7de_afbe664f4896
  b3442318_f785_696a_62ca_a4ede582e8fd["aafter_model()"]
  68f7858b_b207_3223_e360_e5e5b35adf21 -->|method| b3442318_f785_696a_62ca_a4ede582e8fd
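
The class docstring in the source listing distinguishes thread-level counts (persistent across runs) from run-level counts (per invocation), and its default `'continue'` behavior blocks only the calls that exceed a limit. The following is a minimal, framework-free sketch of that bookkeeping, not the library's implementation: `CallCounter`, `start_run`, `would_exceed`, and `separate` are hypothetical names chosen for illustration, and the choice not to count blocked calls is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CallCounter:
    """Hypothetical stand-in for the middleware's counting state."""

    thread_limit: Optional[int] = None  # None means no thread-level limit
    run_limit: Optional[int] = None  # None means no run-level limit
    thread_count: int = 0  # persists across runs
    run_count: int = 0  # reset at the start of each run

    def start_run(self) -> None:
        """Begin a new invocation: only the run-level counter resets."""
        self.run_count = 0

    def would_exceed(self) -> bool:
        """Return True if one more call would break either limit."""
        if self.thread_limit is not None and self.thread_count + 1 > self.thread_limit:
            return True
        if self.run_limit is not None and self.run_count + 1 > self.run_limit:
            return True
        return False

    def separate(self, tool_calls: list[str]) -> tuple[list[str], list[str]]:
        """Split pending calls into (allowed, blocked), counting allowed ones.

        Assumption: blocked calls are not counted. The real middleware may
        account for them differently.
        """
        allowed: list[str] = []
        blocked: list[str] = []
        for name in tool_calls:
            if self.would_exceed():
                blocked.append(name)
            else:
                allowed.append(name)
                self.thread_count += 1
                self.run_count += 1
        return allowed, blocked


counter = CallCounter(thread_limit=3, run_limit=2)

# First run: the run limit (2) is reached before the thread limit (3).
counter.start_run()
first_allowed, first_blocked = counter.separate(["search", "search", "search"])

# Second run: the run counter resets, but the thread counter carries over,
# so only one more call fits under the thread limit.
counter.start_run()
second_allowed, second_blocked = counter.separate(["search", "search"])
```

In this sketch the first run allows two calls and blocks one, and the second run allows one and blocks one, because the thread-level count survives the reset.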

Source Code

libs/langchain_v1/langchain/agents/middleware/tool_call_limit.py lines 140–488

class ToolCallLimitMiddleware(AgentMiddleware[ToolCallLimitState[ResponseT], ContextT, ResponseT]):
    """Track tool call counts and enforce limits during agent execution.

    This middleware monitors the number of tool calls made and can terminate or
    restrict execution when limits are exceeded. It supports both thread-level
    (persistent across runs) and run-level (per invocation) call counting.

    Configuration:
        - `exit_behavior`: How to handle when limits are exceeded
            - `'continue'`: Block exceeded tools, let execution continue (default)
            - `'error'`: Raise an exception
            - `'end'`: Stop immediately with a `ToolMessage` + AI message for the single
                tool call that exceeded the limit (raises `NotImplementedError` if there
                are other pending tool calls due to parallel tool calling).

    Examples:
        !!! example "Continue execution with blocked tools (default)"

            ```python
            from langchain.agents.middleware.tool_call_limit import ToolCallLimitMiddleware
            from langchain.agents import create_agent

            # Block exceeded tools but let other tools and model continue
            limiter = ToolCallLimitMiddleware(
                thread_limit=20,
                run_limit=10,
                exit_behavior="continue",  # default
            )

            agent = create_agent("openai:gpt-4o", middleware=[limiter])
            ```

        !!! example "Stop immediately when limit exceeded"

            ```python
            # End execution immediately with an AI message
            limiter = ToolCallLimitMiddleware(run_limit=5, exit_behavior="end")

            agent = create_agent("openai:gpt-4o", middleware=[limiter])
            ```

        !!! example "Raise exception on limit"

            ```python
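            from langchain_core.messages import HumanMessage
            from langchain.agents.middleware.tool_call_limit import ToolCallLimitExceededError

            # Strict limit with exception handling
            limiter = ToolCallLimitMiddleware(
                tool_name="search", thread_limit=5, exit_behavior="error"
            )

            agent = create_agent("openai:gpt-4o", middleware=[limiter])

            try:
                # `invoke` is synchronous; in async code use `await agent.ainvoke(...)`
                result = agent.invoke({"messages": [HumanMessage("Task")]})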
            except ToolCallLimitExceededError as e:
                print(f"Search limit exceeded: {e}")
            ```

    """

    state_schema = ToolCallLimitState  # type: ignore[assignment]

    def __init__(
        self,
        *,
        tool_name: str | None = None,
        thread_limit: int | None = None,
        run_limit: int | None = None,
        exit_behavior: ExitBehavior = "continue",
    ) -> None:
        """Initialize the tool call limit middleware.

        Args:
            tool_name: Name of the specific tool to limit. If `None`, limits apply
                to all tools.
            thread_limit: Maximum number of tool calls allowed per thread.
                `None` means no limit.
            run_limit: Maximum number of tool calls allowed per run.
                `None` means no limit.
            exit_behavior: How to handle when limits are exceeded.

                - `'continue'`: Block exceeded tools with error messages, let other