RefineDocumentsChain Class — langchain Architecture

Architecture documentation for the RefineDocumentsChain class in refine.py from the langchain codebase.

Class python

Dependency Diagram

graph TD
  d0a9b180_8f2a_0a58_5392_0a04880a6b38["RefineDocumentsChain"]
  2f364d76_a69d_403d_0a63_04792fe626bb["BaseCombineDocumentsChain"]
  d0a9b180_8f2a_0a58_5392_0a04880a6b38 -->|extends| 2f364d76_a69d_403d_0a63_04792fe626bb
  74d6ec2a_05d5_3e6e_3e6a_6069ad631556["refine.py"]
  d0a9b180_8f2a_0a58_5392_0a04880a6b38 -->|defined in| 74d6ec2a_05d5_3e6e_3e6a_6069ad631556
  3d1029f5_05d1_4078_b275_efe2b3eeb0fd["output_keys()"]
  d0a9b180_8f2a_0a58_5392_0a04880a6b38 -->|method| 3d1029f5_05d1_4078_b275_efe2b3eeb0fd
  0c5bd9e1_b148_fbaf_9a71_78d3c6a7c683["get_return_intermediate_steps()"]
  d0a9b180_8f2a_0a58_5392_0a04880a6b38 -->|method| 0c5bd9e1_b148_fbaf_9a71_78d3c6a7c683
  0e47fcc0_ce15_4829_b595_ad6347624db4["get_default_document_variable_name()"]
  d0a9b180_8f2a_0a58_5392_0a04880a6b38 -->|method| 0e47fcc0_ce15_4829_b595_ad6347624db4
  65338410_0245_c6de_1cd9_f26020d9e985["combine_docs()"]
  d0a9b180_8f2a_0a58_5392_0a04880a6b38 -->|method| 65338410_0245_c6de_1cd9_f26020d9e985
  b8ccc423_c361_3ff8_e8ee_a7fc228a1cb1["acombine_docs()"]
  d0a9b180_8f2a_0a58_5392_0a04880a6b38 -->|method| b8ccc423_c361_3ff8_e8ee_a7fc228a1cb1
  04688adf_0639_ab43_5c3d_32ac51e84466["_construct_result()"]
  d0a9b180_8f2a_0a58_5392_0a04880a6b38 -->|method| 04688adf_0639_ab43_5c3d_32ac51e84466
  546e08a8_7a35_4f89_2107_54e2c8bffe5f["_construct_refine_inputs()"]
  d0a9b180_8f2a_0a58_5392_0a04880a6b38 -->|method| 546e08a8_7a35_4f89_2107_54e2c8bffe5f
  37a82994_6ed9_4853_ba28_ec381055abd1["_construct_initial_inputs()"]
  d0a9b180_8f2a_0a58_5392_0a04880a6b38 -->|method| 37a82994_6ed9_4853_ba28_ec381055abd1
  752909cd_d766_2a12_1d46_27c062827115["_chain_type()"]
  d0a9b180_8f2a_0a58_5392_0a04880a6b38 -->|method| 752909cd_d766_2a12_1d46_27c062827115
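
The methods in the diagram above (`combine_docs()`, `_construct_initial_inputs()`, `_construct_refine_inputs()`) implement the two-phase loop described in the class docstring: one call on the first document, then one refine call per remaining document. The following is a minimal sketch of that control flow in plain Python, not the library's implementation. The function `refine_documents` and the two `fake_*` callables are hypothetical stand-ins for the LLM chains; the empty-list check and the choice to record the initial response among the intermediate steps are assumptions of this sketch.

```python
from typing import Callable

# Hypothetical type alias: a "step" takes prompt variables and returns text.
Step = Callable[[dict[str, str]], str]


def refine_documents(
    docs: list[str],
    initial_step: Step,
    refine_step: Step,
    document_variable_name: str = "context",
    initial_response_name: str = "prev_response",
) -> tuple[str, list[str]]:
    """Illustrative refine loop (hypothetical helper, not part of langchain)."""
    if not docs:
        # Assumption of this sketch: reject empty input explicitly.
        raise ValueError("at least one document is required")

    # First pass: only the first document is supplied.
    response = initial_step({document_variable_name: docs[0]})
    intermediate_steps = [response]

    # Refine pass: each remaining document plus the previous response.
    for doc in docs[1:]:
        response = refine_step(
            {document_variable_name: doc, initial_response_name: response}
        )
        intermediate_steps.append(response)

    return response, intermediate_steps


def fake_initial(inputs: dict[str, str]) -> str:
    # Stand-in for `initial_llm_chain`: wraps the document text.
    return f"summary({inputs['context']})"


def fake_refine(inputs: dict[str, str]) -> str:
    # Stand-in for `refine_llm_chain`: appends the new document to the prior response.
    return f"{inputs['prev_response']}+{inputs['context']}"


final, steps = refine_documents(["a", "b", "c"], fake_initial, fake_refine)
print(final)  # summary(a)+b+c
print(steps)  # ['summary(a)', 'summary(a)+b', 'summary(a)+b+c']
```

Each refine call depends on the previous response, so the loop is inherently sequential: N documents cost N step calls that cannot run in parallel.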

Source Code

libs/langchain/langchain_classic/chains/combine_documents/refine.py lines 33–229

class RefineDocumentsChain(BaseCombineDocumentsChain):
    """Combine documents by doing a first pass and then refining on more documents.

    This algorithm first calls `initial_llm_chain` on the first document, passing
    that first document in with the variable name `document_variable_name`, and
    produces a new variable with the variable name `initial_response_name`.

    Then, it loops over every remaining document. This is called the "refine" step.
    It calls `refine_llm_chain`,
    passing in that document with the variable name `document_variable_name`
    as well as the previous response with the variable name `initial_response_name`.

    Example:
        ```python
        from langchain_classic.chains import RefineDocumentsChain, LLMChain
        from langchain_core.prompts import PromptTemplate
        from langchain_openai import OpenAI

        # This controls how each document will be formatted. Specifically,
        # it will be passed to `format_document` - see that function for more
        # details.
        document_prompt = PromptTemplate(
            input_variables=["page_content"], template="{page_content}"
        )
        document_variable_name = "context"
        model = OpenAI()
        # The prompt here should take as an input variable the
        # `document_variable_name`
        prompt = PromptTemplate.from_template("Summarize this content: {context}")
        initial_llm_chain = LLMChain(llm=model, prompt=prompt)
        initial_response_name = "prev_response"
        # The prompt here should take as an input variable the
        # `document_variable_name` as well as `initial_response_name`
        prompt_refine = PromptTemplate.from_template(
            "Here's your first summary: {prev_response}. "
            "Now add to it based on the following context: {context}"
        )
        refine_llm_chain = LLMChain(llm=model, prompt=prompt_refine)
        chain = RefineDocumentsChain(
            initial_llm_chain=initial_llm_chain,
            refine_llm_chain=refine_llm_chain,
            document_prompt=document_prompt,
            document_variable_name=document_variable_name,
            initial_response_name=initial_response_name,
        )
        ```
    """

    initial_llm_chain: LLMChain
    """LLM chain to use on initial document."""
    refine_llm_chain: LLMChain
    """LLM chain to use when refining."""
    document_variable_name: str
    """The variable name in the initial_llm_chain to put the documents in.
    If only one variable in the initial_llm_chain, this need not be provided."""
    initial_response_name: str
    """The variable name to format the initial response in when refining."""
    document_prompt: BasePromptTemplate = Field(
        default_factory=_get_default_document_prompt,
    )
    """Prompt to use to format each document, gets passed to `format_document`."""
    return_intermediate_steps: bool = False
    """Return the results of the refine steps in the output."""

    @property
    def output_keys(self) -> list[str]:
        """Expect input key."""
        _output_keys = super().output_keys
        if self.return_intermediate_steps:
            _output_keys = [*_output_keys, "intermediate_steps"]
        return _output_keys

    model_config = ConfigDict(
        arbitrary_types_allowed=True,
        extra="forbid",
    )

    @model_validator(mode="before")
    @classmethod
    def get_return_intermediate_steps(cls, values: dict) -> Any:
        """For backwards compatibility."""