BaseLoader Class — langchain Architecture

Architecture documentation for the BaseLoader class in base.py from the langchain codebase.

Dependency Diagram

graph TD
  a03328f5_8313_3e81_e0e5_6b8aa55a53b8["BaseLoader"]
  360884f8_60d4_6961_e0a8_bd97850fd2b3["base.py"]
  a03328f5_8313_3e81_e0e5_6b8aa55a53b8 -->|defined in| 360884f8_60d4_6961_e0a8_bd97850fd2b3
  e0a3f14c_d483_262a_6986_006241b8be81["load()"]
  a03328f5_8313_3e81_e0e5_6b8aa55a53b8 -->|method| e0a3f14c_d483_262a_6986_006241b8be81
  19e26a62_87ab_2cdb_23d2_4df20d99e78e["aload()"]
  a03328f5_8313_3e81_e0e5_6b8aa55a53b8 -->|method| 19e26a62_87ab_2cdb_23d2_4df20d99e78e
  8fbd76b5_602e_6d85_e949_58422d1e40b5["load_and_split()"]
  a03328f5_8313_3e81_e0e5_6b8aa55a53b8 -->|method| 8fbd76b5_602e_6d85_e949_58422d1e40b5
  98484d4f_4547_e4b1_5e16_19321ff7cb65["lazy_load()"]
  a03328f5_8313_3e81_e0e5_6b8aa55a53b8 -->|method| 98484d4f_4547_e4b1_5e16_19321ff7cb65
  70d7cce2_ce08_773d_8019_8661c6065d12["alazy_load()"]
  a03328f5_8313_3e81_e0e5_6b8aa55a53b8 -->|method| 70d7cce2_ce08_773d_8019_8661c6065d12

Source Code

libs/core/langchain_core/document_loaders/base.py lines 26–114

class BaseLoader(ABC):  # noqa: B024
    """Interface for document loader.

    Implementations should implement the lazy-loading method using generators to avoid
    loading all documents into memory at once.

    `load` is provided just for user convenience and should not be overridden.
    """

    # Sub-classes should not implement this method directly. Instead, they
    # should implement the lazy load method.
    def load(self) -> list[Document]:
        """Load data into `Document` objects.

        Returns:
            The documents.
        """
        return list(self.lazy_load())

    async def aload(self) -> list[Document]:
        """Load data into `Document` objects.

        Returns:
            The documents.
        """
        return [document async for document in self.alazy_load()]

    def load_and_split(
        self, text_splitter: TextSplitter | None = None
    ) -> list[Document]:
        """Load `Document` and split into chunks. Chunks are returned as `Document`.

        !!! danger

            Do not override this method. It should be considered to be deprecated!

        Args:
            text_splitter: `TextSplitter` instance to use for splitting documents.

                Defaults to `RecursiveCharacterTextSplitter`.

        Raises:
            ImportError: If `langchain-text-splitters` is not installed and no
                `text_splitter` is provided.

        Returns:
            List of `Document` objects.
        """
        if text_splitter is None:
            if not _HAS_TEXT_SPLITTERS:
                msg = (
                    "Unable to import from langchain_text_splitters. Please specify "
                    "text_splitter or install langchain_text_splitters with "
                    "`pip install -U langchain-text-splitters`."
                )
                raise ImportError(msg)

            text_splitter_: TextSplitter = RecursiveCharacterTextSplitter()
        else:
            text_splitter_ = text_splitter
        docs = self.load()
        return text_splitter_.split_documents(docs)

    # Attention: This method will be upgraded into an abstractmethod once it's
    #            implemented in all the existing subclasses.
    def lazy_load(self) -> Iterator[Document]:
        """A lazy loader for `Document`.

        Yields:
            The `Document` objects.
        """
        if type(self).load != BaseLoader.load:
            return iter(self.load())
        msg = f"{self.__class__.__name__} does not implement lazy_load()"
        raise NotImplementedError(msg)

    async def alazy_load(self) -> AsyncIterator[Document]:
        """A lazy loader for `Document`.

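        Yields:
            The `Document` objects.
        """
        iterator = await run_in_executor(None, self.lazy_load)
        done = object()
        while True:
            doc = await run_in_executor(None, next, iterator, done)
            if doc is done:
                break
            yield doc  # type: ignore[misc]
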
Frequently Asked Questions

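What is the BaseLoader class?
BaseLoader is the abstract base class (it derives from ABC) that defines the document loader interface in the langchain codebase. Subclasses are expected to implement lazy_load() as a generator of Document objects, so that documents are not all held in memory at once. load() is a convenience wrapper that collects the results into a list and should not be overridden.

Where is BaseLoader defined?
BaseLoader is defined in libs/core/langchain_core/document_loaders/base.py, starting at line 26.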
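
The delegation between load() and lazy_load() in the excerpt above can be illustrated with a small, self-contained sketch. To keep it runnable without installing langchain, Document and BaseLoader here are reduced stand-ins that copy only the load()/lazy_load() logic from the excerpt; LineLoader and LegacyLoader are hypothetical subclasses invented for illustration and are not part of the library.

```python
from __future__ import annotations

from collections.abc import Iterator
from dataclasses import dataclass, field


@dataclass
class Document:
    """Stand-in for the real Document class (assumption for this sketch)."""

    page_content: str
    metadata: dict = field(default_factory=dict)


class BaseLoader:
    """Reduced stand-in mirroring load()/lazy_load() from the excerpt."""

    def load(self) -> list[Document]:
        # Convenience wrapper: materialize the lazy iterator into a list.
        return list(self.lazy_load())

    def lazy_load(self) -> Iterator[Document]:
        # Fallback for legacy subclasses that override load() directly.
        if type(self).load != BaseLoader.load:
            return iter(self.load())
        msg = f"{self.__class__.__name__} does not implement lazy_load()"
        raise NotImplementedError(msg)


class LineLoader(BaseLoader):
    """Hypothetical loader: yields one Document per non-empty line."""

    def __init__(self, text: str) -> None:
        self.text = text

    def lazy_load(self) -> Iterator[Document]:
        for number, line in enumerate(self.text.splitlines(), start=1):
            if line.strip():
                yield Document(page_content=line, metadata={"line": number})


class LegacyLoader(BaseLoader):
    """Hypothetical legacy loader that overrides load() instead."""

    def load(self) -> list[Document]:
        return [Document(page_content="legacy")]


# Preferred style: implement lazy_load(), call the inherited load().
docs = LineLoader("alpha\n\nbeta").load()

# Legacy style: lazy_load() falls back to iterating over load().
legacy_docs = list(LegacyLoader().lazy_load())
```

A subclass that overrides neither method raises NotImplementedError when lazy_load() is called, which is the behavior the comment above lazy_load() in the excerpt says will eventually be replaced by an abstractmethod.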
