base.py — langchain Source File

Architecture documentation for base.py, a Python file in the langchain codebase. 7 imports, 0 dependents.



Dependency Diagram

graph LR
  360884f8_60d4_6961_e0a8_bd97850fd2b3["base.py"]
  cccbe73e_4644_7211_4d55_e8fb133a8014["abc"]
  360884f8_60d4_6961_e0a8_bd97850fd2b3 --> cccbe73e_4644_7211_4d55_e8fb133a8014
  8e2034b7_ceb8_963f_29fc_2ea6b50ef9b3["typing"]
  360884f8_60d4_6961_e0a8_bd97850fd2b3 --> 8e2034b7_ceb8_963f_29fc_2ea6b50ef9b3
  2ceb1686_0f8c_8ae0_36d1_7c0b702fda1c["langchain_core.runnables"]
  360884f8_60d4_6961_e0a8_bd97850fd2b3 --> 2ceb1686_0f8c_8ae0_36d1_7c0b702fda1c
  cfe2bde5_180e_e3b0_df2b_55b3ebaca8e7["collections.abc"]
  360884f8_60d4_6961_e0a8_bd97850fd2b3 --> cfe2bde5_180e_e3b0_df2b_55b3ebaca8e7
  5d24a664_4d9b_7491_ea6a_e13ddbcc8eeb["langchain_text_splitters"]
  360884f8_60d4_6961_e0a8_bd97850fd2b3 --> 5d24a664_4d9b_7491_ea6a_e13ddbcc8eeb
  c554676d_b731_47b2_a98f_c1c2d537c0aa["langchain_core.documents"]
  360884f8_60d4_6961_e0a8_bd97850fd2b3 --> c554676d_b731_47b2_a98f_c1c2d537c0aa
  e46f4a9d_5482_b595_2107_c5825c92261e["langchain_core.documents.base"]
  360884f8_60d4_6961_e0a8_bd97850fd2b3 --> e46f4a9d_5482_b595_2107_c5825c92261e
  style 360884f8_60d4_6961_e0a8_bd97850fd2b3 fill:#6366f1,stroke:#818cf8,color:#fff


Source Code

"""Abstract interface for document loader implementations."""

from __future__ import annotations

from abc import ABC, abstractmethod
from typing import TYPE_CHECKING

from langchain_core.runnables import run_in_executor

if TYPE_CHECKING:
    from collections.abc import AsyncIterator, Iterator

    from langchain_text_splitters import TextSplitter

    from langchain_core.documents import Document
    from langchain_core.documents.base import Blob

try:
    from langchain_text_splitters import RecursiveCharacterTextSplitter

    _HAS_TEXT_SPLITTERS = True
except ImportError:
    _HAS_TEXT_SPLITTERS = False


class BaseLoader(ABC):  # noqa: B024
    """Interface for document loader.

    Implementations should implement the lazy-loading method using generators to avoid
    loading all documents into memory at once.

    `load` is provided just for user convenience and should not be overridden.
    """

    # Sub-classes should not implement this method directly. Instead, they
    # should implement the lazy load method.
    def load(self) -> list[Document]:
        """Load data into `Document` objects.

        Returns:
            The documents.
        """
        return list(self.lazy_load())

    async def aload(self) -> list[Document]:
        """Load data into `Document` objects.

        Returns:
            The documents.
        """
        return [document async for document in self.alazy_load()]

    def load_and_split(
        self, text_splitter: TextSplitter | None = None
    ) -> list[Document]:
        """Load `Document` and split into chunks. Chunks are returned as `Document`.

        !!! danger

            Do not override this method. It should be considered to be deprecated!
// ... (96 more lines)
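
The contract described in the `BaseLoader` docstring above (subclasses implement lazy loading with a generator, while `load` only collects the results) can be illustrated without langchain installed. This is a minimal sketch, not the library's code: `Doc`, `MiniLoader`, and `LineLoader` are hypothetical stand-ins, and only the body of `load` mirrors the excerpt.

```python
from abc import ABC
from collections.abc import Iterator
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for langchain_core.documents.Document."""

    page_content: str
    metadata: dict = field(default_factory=dict)


class MiniLoader(ABC):
    """Hypothetical stand-in that mirrors the load/lazy_load split."""

    def lazy_load(self) -> Iterator[Doc]:
        # Subclasses are expected to override this with a generator.
        raise NotImplementedError

    def load(self) -> list[Doc]:
        # Same shape as BaseLoader.load in the excerpt: drain the generator.
        return list(self.lazy_load())


class LineLoader(MiniLoader):
    """Yields one Doc per non-empty line, one at a time."""

    def __init__(self, text: str) -> None:
        self.text = text

    def lazy_load(self) -> Iterator[Doc]:
        for number, line in enumerate(self.text.splitlines(), start=1):
            if line.strip():
                yield Doc(page_content=line, metadata={"line": number})


loader = LineLoader("alpha\n\nbeta\n")
docs = loader.load()
```

Because `LineLoader` overrides only `lazy_load`, callers that want streaming can iterate over `loader.lazy_load()` directly, while callers that want a list call `load()` and get the same documents.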

Domain

DocumentProcessing

Subdomains

DataLoaders

Dependencies

abc
collections.abc
langchain_core.documents
langchain_core.documents.base
langchain_core.runnables
langchain_text_splitters
typing

Frequently Asked Questions

What does base.py do?

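base.py defines the abstract interface for document loader implementations in the langchain codebase and is written in Python. Its `BaseLoader` class expects subclasses to implement lazy loading with generators, and provides `load` and `aload` as convenience methods that collect the results into a list. It belongs to the DocumentProcessing domain, DataLoaders subdomain.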

What functions are defined in base.py?

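The analysis lists 3 function(s) for base.py: _HAS_TEXT_SPLITTERS, collections, langchain_text_splitters. None of these is actually a function: `_HAS_TEXT_SPLITTERS` is a module-level flag, and `collections` and `langchain_text_splitters` are imported modules. The visible excerpt defines the `BaseLoader` class with the methods `load`, `aload`, and `load_and_split`; the remaining 96 lines are not shown here.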

What does base.py depend on?

base.py imports 7 module(s): abc, collections.abc, langchain_core.documents, langchain_core.documents.base, langchain_core.runnables, langchain_text_splitters, typing.

Where is base.py in the architecture?

base.py is located at libs/core/langchain_core/document_loaders/base.py (domain: DocumentProcessing, subdomain: DataLoaders, directory: libs/core/langchain_core/document_loaders).
