split_documents() — langchain Function Reference

Architecture documentation for the split_documents() function in html.py from the langchain codebase.

Dependency Diagram

graph TD
  242347f4_37b6_e8c6_d9d5_c00530e34196["split_documents()"]
  0c8a5f97_7cb0_fe24_746d_9689c4e5426c["HTMLSectionSplitter"]
  242347f4_37b6_e8c6_d9d5_c00530e34196 -->|defined in| 0c8a5f97_7cb0_fe24_746d_9689c4e5426c
  fb63895c_3000_9932_3530_3357c6736f4f["create_documents()"]
  242347f4_37b6_e8c6_d9d5_c00530e34196 -->|calls| fb63895c_3000_9932_3530_3357c6736f4f
  style 242347f4_37b6_e8c6_d9d5_c00530e34196 fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

libs/text-splitters/langchain_text_splitters/html.py lines 376–393

    def split_documents(self, documents: Iterable[Document]) -> list[Document]:
        """Split documents.

        Args:
            documents: Iterable of `Document` objects to be split.

        Returns:
            A list of split `Document` objects.
        """
        texts, metadatas = [], []
        for doc in documents:
            texts.append(doc.page_content)
            metadatas.append(doc.metadata)
        results = self.create_documents(texts, metadatas=metadatas)

        text_splitter = RecursiveCharacterTextSplitter(**self.kwargs)

        return text_splitter.split_documents(results)
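
The two-stage flow in the method above (split by section first, then re-split the sections to a size limit) can be sketched without the library. This is a minimal illustration only: `Doc` and `ToySectionSplitter` are hypothetical stand-ins, sections are assumed to be separated by blank lines rather than HTML headers, and the second stage uses fixed-width slicing in place of `RecursiveCharacterTextSplitter`.

```python
from dataclasses import dataclass, field
from typing import Iterable


@dataclass
class Doc:
    """Hypothetical stand-in for langchain's Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


class ToySectionSplitter:
    """Illustrative only: mirrors the control flow, not the real HTML parsing."""

    def __init__(self, chunk_size: int = 20) -> None:
        self.chunk_size = chunk_size

    def create_documents(self, texts, metadatas=None) -> list[Doc]:
        # Stage 1: one Doc per blank-line-separated section,
        # each carrying a copy of its source document's metadata.
        metadatas = metadatas or [{} for _ in texts]
        sections = []
        for text, meta in zip(texts, metadatas):
            for part in text.split("\n\n"):
                if part.strip():
                    sections.append(Doc(part.strip(), dict(meta)))
        return sections

    def split_documents(self, documents: Iterable[Doc]) -> list[Doc]:
        # Same shape as the method above: unzip content and metadata,
        # then delegate to create_documents().
        texts, metadatas = [], []
        for doc in documents:
            texts.append(doc.page_content)
            metadatas.append(doc.metadata)
        sections = self.create_documents(texts, metadatas=metadatas)

        # Stage 2 (assumption: fixed-width slices instead of a recursive splitter).
        chunks = []
        for sec in sections:
            for i in range(0, len(sec.page_content), self.chunk_size):
                piece = sec.page_content[i:i + self.chunk_size]
                chunks.append(Doc(piece, dict(sec.metadata)))
        return chunks


docs = [Doc("Intro text\n\nA much longer second section body", {"source": "a.html"})]
chunks = ToySectionSplitter(chunk_size=20).split_documents(docs)
for chunk in chunks:
    print(repr(chunk.page_content), chunk.metadata)
```

In this sketch, every chunk keeps the metadata of the document it came from, and a section longer than `chunk_size` yields more than one chunk. The real method gets the same effect by handing the section-level documents to a second splitter.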

Frequently Asked Questions

What does split_documents() do?
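split_documents() is a method of HTMLSectionSplitter, defined in libs/text-splitters/langchain_text_splitters/html.py. It collects the page_content and metadata of each input Document, passes them to create_documents() to split by HTML section, and then runs the resulting documents through a RecursiveCharacterTextSplitter built from self.kwargs.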
Where is split_documents() defined?
split_documents() is defined in libs/text-splitters/langchain_text_splitters/html.py at line 376.
What does split_documents() call?
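split_documents() calls create_documents() on the same class. It then constructs a RecursiveCharacterTextSplitter and calls that splitter's own split_documents() on the results.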