_split_docs_for_adding() — langchain Function Reference

Architecture documentation for the _split_docs_for_adding() function in parent_document_retriever.py from the langchain codebase.

Kind: function | Language: Python | Domain: LangChainCore | Subdomain: LanguageModelBase | Called by: 2

Dependency Diagram

graph TD
  913129df_8767_b41b_518c_6c6d98c8bd9d["_split_docs_for_adding()"]
  eb3f3ce7_75a2_f5aa_7781_56a2f86f1973["ParentDocumentRetriever"]
  913129df_8767_b41b_518c_6c6d98c8bd9d -->|defined in| eb3f3ce7_75a2_f5aa_7781_56a2f86f1973
  92bed5c9_a76d_c063_a022_68683b1b371d["add_documents()"]
  92bed5c9_a76d_c063_a022_68683b1b371d -->|calls| 913129df_8767_b41b_518c_6c6d98c8bd9d
  c068c021_6d63_9a46_4f44_2d7d7141278d["aadd_documents()"]
  c068c021_6d63_9a46_4f44_2d7d7141278d -->|calls| 913129df_8767_b41b_518c_6c6d98c8bd9d
  style 913129df_8767_b41b_518c_6c6d98c8bd9d fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

libs/langchain/langchain_classic/retrievers/parent_document_retriever.py lines 76–114

    def _split_docs_for_adding(
        self,
        documents: list[Document],
        ids: list[str] | None = None,
        *,
        add_to_docstore: bool = True,
    ) -> tuple[list[Document], list[tuple[str, Document]]]:
        if self.parent_splitter is not None:
            documents = self.parent_splitter.split_documents(documents)
        if ids is None:
            doc_ids = [str(uuid.uuid4()) for _ in documents]
            if not add_to_docstore:
                msg = "If IDs are not passed in, `add_to_docstore` MUST be True"
                raise ValueError(msg)
        else:
            if len(documents) != len(ids):
                msg = (
                    "Got uneven list of documents and ids. "
                    "If `ids` is provided, should be same length as `documents`."
                )
                raise ValueError(msg)
            doc_ids = ids

        docs = []
        full_docs = []
        for i, doc in enumerate(documents):
            _id = doc_ids[i]
            sub_docs = self.child_splitter.split_documents([doc])
            if self.child_metadata_fields is not None:
                for _doc in sub_docs:
                    _doc.metadata = {
                        k: _doc.metadata[k] for k in self.child_metadata_fields
                    }
            for _doc in sub_docs:
                _doc.metadata[self.id_key] = _id
            docs.extend(sub_docs)
            full_docs.append((_id, doc))

        return docs, full_docs
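
The listing above depends on the retriever's splitters and on langchain's Document class. As a rough illustration of the same control flow, the sketch below reimplements it with stand-ins: Doc, split_fixed, and the chunk_size parameter are hypothetical and not part of langchain, and the optional parent_splitter step is left out. It is a simplified sketch under those assumptions, not the library's implementation.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for langchain's Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def split_fixed(doc: Doc, size: int) -> list[Doc]:
    """Toy child splitter (assumption): fixed-size chunks, metadata copied."""
    text = doc.page_content
    return [
        Doc(text[i : i + size], dict(doc.metadata))
        for i in range(0, len(text), size)
    ]


def split_docs_for_adding(documents, ids=None, *, add_to_docstore=True,
                          chunk_size=4, id_key="doc_id",
                          child_metadata_fields=None):
    # Without caller-supplied IDs, random ones are generated, so the parents
    # must be stored or the children could never be mapped back to them.
    if ids is None:
        if not add_to_docstore:
            raise ValueError("If IDs are not passed in, `add_to_docstore` MUST be True")
        doc_ids = [str(uuid.uuid4()) for _ in documents]
    else:
        if len(documents) != len(ids):
            raise ValueError("Got uneven list of documents and ids.")
        doc_ids = ids

    children, parents = [], []
    for _id, doc in zip(doc_ids, documents):
        sub_docs = split_fixed(doc, chunk_size)
        for sub in sub_docs:
            # Optionally keep only the whitelisted metadata fields.
            if child_metadata_fields is not None:
                sub.metadata = {k: sub.metadata[k] for k in child_metadata_fields}
            # Every child carries its parent's ID.
            sub.metadata[id_key] = _id
        children.extend(sub_docs)
        parents.append((_id, doc))
    return children, parents


children, parents = split_docs_for_adding(
    [Doc("abcdefghij", {"source": "a.txt", "page": 1})],
    ids=["parent-1"],
    child_metadata_fields=["source"],
)
```

In this run the single parent yields three children ("abcd", "efgh", "ij"), each with metadata reduced to the source field plus the parent ID, while the parent keeps its original metadata. As in the original listing, a field named in child_metadata_fields that is missing from a child's metadata raises KeyError.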

Domain

LangChainCore

Subdomains

LanguageModelBase

Defined In

libs/langchain/langchain_classic/retrievers/parent_document_retriever.py

Called By

add_documents()
aadd_documents()

Frequently Asked Questions

What does _split_docs_for_adding() do?

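_split_docs_for_adding() is a private helper method of ParentDocumentRetriever, defined in libs/langchain/langchain_classic/retrievers/parent_document_retriever.py. It prepares documents for indexing in four steps. First, if parent_splitter is set, it splits the input documents into parent documents. Second, it assigns one ID per parent: the caller's ids if given, otherwise a fresh uuid.uuid4() string for each. Third, it splits every parent into child chunks with child_splitter, optionally reduces each child's metadata to the keys in child_metadata_fields, and writes the parent ID into the child's metadata under id_key. Finally, it returns a tuple of the child documents and a list of (id, parent document) pairs. It raises ValueError if ids is omitted while add_to_docstore is False, or if ids and documents differ in length (the lengths are compared after any parent splitting).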

Where is _split_docs_for_adding() defined?

_split_docs_for_adding() is defined in libs/langchain/langchain_classic/retrievers/parent_document_retriever.py at line 76.

What calls _split_docs_for_adding()?

_split_docs_for_adding() is called by two methods: add_documents() and aadd_documents().
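
This page does not show what the callers do with the returned tuple. The sketch below is an assumption based only on the return type and the add_to_docstore flag: children go to a searchable index, and the (id, parent) pairs go to a key-value store when add_to_docstore is true. The names add_documents_sketch and fake_split are hypothetical, and a plain list and dict stand in for the vector store and document store.

```python
def fake_split(documents, ids=None, *, add_to_docstore=True):
    """Hypothetical stand-in for _split_docs_for_adding(): one child per parent."""
    ids = ids or [f"id-{i}" for i in range(len(documents))]
    children = [{"text": d, "doc_id": i} for i, d in zip(ids, documents)]
    return children, list(zip(ids, documents))


def add_documents_sketch(split_fn, vector_index, docstore, documents,
                         ids=None, *, add_to_docstore=True):
    # Split once, then route the two halves of the result to different stores.
    children, parents = split_fn(documents, ids, add_to_docstore=add_to_docstore)
    vector_index.extend(children)      # children are what gets searched
    if add_to_docstore:
        docstore.update(parents)       # parents are looked up later by ID
    return children, parents


vector_index: list = []
docstore: dict = {}
add_documents_sketch(fake_split, vector_index, docstore,
                     ["alpha", "beta"], ids=["p1", "p2"])
```

Under this assumed flow, a search hit on a child can be resolved to its full parent through the ID stored in the child's metadata, which is why the function refuses to generate random IDs when the parents are not being stored.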
