transform_documents() — langchain Function Reference

Architecture documentation for the transform_documents() function in html.py from the langchain codebase.

Dependency Diagram

graph TD
  e9c69e37_40ed_2949_d6dc_f6a7770ff7b8["transform_documents()"]
  5af47ada_f6e1_33df_ed07_12ca64351fa0["HTMLSemanticPreservingSplitter"]
  e9c69e37_40ed_2949_d6dc_f6a7770ff7b8 -->|defined in| 5af47ada_f6e1_33df_ed07_12ca64351fa0
  127c75d0_d814_d16e_a93c_928f021add9c["split_text()"]
  e9c69e37_40ed_2949_d6dc_f6a7770ff7b8 -->|calls| 127c75d0_d814_d16e_a93c_928f021add9c
  3a8f906a_02bf_a0ff_6dbb_2ffbc48f937d["split_text()"]
  e9c69e37_40ed_2949_d6dc_f6a7770ff7b8 -->|calls| 3a8f906a_02bf_a0ff_6dbb_2ffbc48f937d
  style e9c69e37_40ed_2949_d6dc_f6a7770ff7b8 fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

libs/text-splitters/langchain_text_splitters/html.py lines 737–760

    def transform_documents(
        self, documents: Sequence[Document], **kwargs: Any
    ) -> list[Document]:
        """Transform sequence of documents by splitting them.

        Args:
            documents: A sequence of `Document` objects to be split.

        Returns:
            A sequence of split `Document` objects.
        """
        transformed = []
        for doc in documents:
            splits = self.split_text(doc.page_content)
            if self._preserve_parent_metadata:
                splits = [
                    Document(
                        page_content=split_doc.page_content,
                        metadata={**doc.metadata, **split_doc.metadata},
                    )
                    for split_doc in splits
                ]
            transformed.extend(splits)
        return transformed
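
The metadata merge in the listing above can be tried in isolation. The following is a minimal sketch, not the library's code path: `Document` here is a stand-in dataclass rather than langchain's class, `split_on_blank_lines` is a hypothetical splitter standing in for `split_text()`, and the flag is passed as a plain argument instead of being read from `self._preserve_parent_metadata`. It shows that `{**doc.metadata, **split_doc.metadata}` keeps the parent's keys and lets the split's own values win when a key appears in both.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Sequence


@dataclass
class Document:
    """Stand-in for langchain's Document (assumption, for illustration only)."""

    page_content: str
    metadata: dict[str, Any] = field(default_factory=dict)


def split_on_blank_lines(text: str) -> list[Document]:
    """Hypothetical splitter: one chunk per blank-line-separated block."""
    blocks = [block.strip() for block in text.split("\n\n") if block.strip()]
    return [Document(block, {"chunk": i}) for i, block in enumerate(blocks)]


def transform_documents(
    documents: Sequence[Document],
    split_text: Callable[[str], list[Document]],
    preserve_parent_metadata: bool,
) -> list[Document]:
    """Mirror the loop in the listing, with the splitter and flag passed in."""
    transformed: list[Document] = []
    for doc in documents:
        splits = split_text(doc.page_content)
        if preserve_parent_metadata:
            # Parent keys come first; keys from the split override them.
            splits = [
                Document(
                    page_content=split_doc.page_content,
                    metadata={**doc.metadata, **split_doc.metadata},
                )
                for split_doc in splits
            ]
        transformed.extend(splits)
    return transformed


parent = Document(
    "first block\n\nsecond block",
    {"source": "a.html", "chunk": "parent"},
)
merged = transform_documents([parent], split_on_blank_lines, True)
plain = transform_documents([parent], split_on_blank_lines, False)

for doc in merged:
    print(doc.page_content, doc.metadata)
```

With the flag off, the chunks carry only the metadata produced by the splitter, so the parent's `source` key is dropped; with it on, `source` survives and the parent's `chunk` value is overwritten by the split's.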

Frequently Asked Questions

What does transform_documents() do?
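transform_documents() is a method of HTMLSemanticPreservingSplitter, defined in libs/text-splitters/langchain_text_splitters/html.py. It calls split_text() on each input document's page_content and returns all resulting chunks as one flat list. When _preserve_parent_metadata is set, each chunk's metadata is the parent document's metadata merged with the chunk's own, with the chunk's values taking precedence on key conflicts.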
Where is transform_documents() defined?
transform_documents() is defined in libs/text-splitters/langchain_text_splitters/html.py at line 737.
What does transform_documents() call?
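transform_documents() calls split_text(). The source shows a single call site, self.split_text(doc.page_content), executed once per input document; the dependency diagram lists split_text() twice under separate node IDs.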
