create_documents() — langchain Function Reference

Architecture documentation for the create_documents() function in base.py from the langchain codebase.

Dependency Diagram

graph TD
  a4cdf08b_5d25_7d6b_a425_7a96372e8666["create_documents()"]
  c86e37d5_f962_cc1e_9821_b665e1359ae8["TextSplitter"]
  a4cdf08b_5d25_7d6b_a425_7a96372e8666 -->|defined in| c86e37d5_f962_cc1e_9821_b665e1359ae8
  d14f3e1b_dd57_6268_5d47_c8b53356440d["split_documents()"]
  d14f3e1b_dd57_6268_5d47_c8b53356440d -->|calls| a4cdf08b_5d25_7d6b_a425_7a96372e8666
  01cef059_4479_0a04_53ff_2c366fd5c5bf["split_text()"]
  a4cdf08b_5d25_7d6b_a425_7a96372e8666 -->|calls| 01cef059_4479_0a04_53ff_2c366fd5c5bf
  style a4cdf08b_5d25_7d6b_a425_7a96372e8666 fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

libs/text-splitters/langchain_text_splitters/base.py lines 103–129

    def create_documents(
        self, texts: list[str], metadatas: list[dict[Any, Any]] | None = None
    ) -> list[Document]:
        """Create a list of `Document` objects from a list of texts.

        Args:
            texts: A list of texts to be split and converted into documents.
            metadatas: Optional list of metadata to associate with each document.

        Returns:
            A list of `Document` objects.
        """
        metadatas_ = metadatas or [{}] * len(texts)
        documents = []
        for i, text in enumerate(texts):
            index = 0
            previous_chunk_len = 0
            for chunk in self.split_text(text):
                metadata = copy.deepcopy(metadatas_[i])
                if self._add_start_index:
                    offset = index + previous_chunk_len - self._chunk_overlap
                    index = text.find(chunk, max(0, offset))
                    metadata["start_index"] = index
                    previous_chunk_len = len(chunk)
                new_doc = Document(page_content=chunk, metadata=metadata)
                documents.append(new_doc)
        return documents

Calls

split_text()

Called By

split_documents()

Frequently Asked Questions

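What does create_documents() do?
create_documents() is a method of TextSplitter, defined in libs/text-splitters/langchain_text_splitters/base.py. It splits each input text with split_text() and wraps every resulting chunk in a Document that carries a deep copy of the metadata supplied for that text. When the splitter's _add_start_index flag is set, it also stores the chunk's character offset within the source text under the "start_index" metadata key.

Where is create_documents() defined?
create_documents() is defined in libs/text-splitters/langchain_text_splitters/base.py, starting at line 103.

What does create_documents() call?
According to the dependency graph, create_documents() calls one function: split_text().

What calls create_documents()?
According to the dependency graph, create_documents() is called by one function: split_documents().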
