BaseLoader Class — langchain Architecture
Architecture documentation for the BaseLoader class in base.py from the langchain codebase.
Dependency Diagram
graph TD
    a03328f5_8313_3e81_e0e5_6b8aa55a53b8["BaseLoader"]
    360884f8_60d4_6961_e0a8_bd97850fd2b3["base.py"]
    a03328f5_8313_3e81_e0e5_6b8aa55a53b8 -->|defined in| 360884f8_60d4_6961_e0a8_bd97850fd2b3
    e0a3f14c_d483_262a_6986_006241b8be81["load()"]
    a03328f5_8313_3e81_e0e5_6b8aa55a53b8 -->|method| e0a3f14c_d483_262a_6986_006241b8be81
    19e26a62_87ab_2cdb_23d2_4df20d99e78e["aload()"]
    a03328f5_8313_3e81_e0e5_6b8aa55a53b8 -->|method| 19e26a62_87ab_2cdb_23d2_4df20d99e78e
    8fbd76b5_602e_6d85_e949_58422d1e40b5["load_and_split()"]
    a03328f5_8313_3e81_e0e5_6b8aa55a53b8 -->|method| 8fbd76b5_602e_6d85_e949_58422d1e40b5
    98484d4f_4547_e4b1_5e16_19321ff7cb65["lazy_load()"]
    a03328f5_8313_3e81_e0e5_6b8aa55a53b8 -->|method| 98484d4f_4547_e4b1_5e16_19321ff7cb65
    70d7cce2_ce08_773d_8019_8661c6065d12["alazy_load()"]
    a03328f5_8313_3e81_e0e5_6b8aa55a53b8 -->|method| 70d7cce2_ce08_773d_8019_8661c6065d12
Source Code
libs/core/langchain_core/document_loaders/base.py lines 26–114
class BaseLoader(ABC):  # noqa: B024
    """Interface for document loader.

    Implementations should implement the lazy-loading method using generators to avoid
    loading all documents into memory at once.

    `load` is provided just for user convenience and should not be overridden.
    """

    # Sub-classes should not implement this method directly. Instead, they
    # should implement the lazy load method.
    def load(self) -> list[Document]:
        """Load data into `Document` objects.

        Returns:
            The documents.
        """
        return list(self.lazy_load())

    async def aload(self) -> list[Document]:
        """Load data into `Document` objects.

        Returns:
            The documents.
        """
        return [document async for document in self.alazy_load()]

    def load_and_split(
        self, text_splitter: TextSplitter | None = None
    ) -> list[Document]:
        """Load `Document` and split into chunks. Chunks are returned as `Document`.

        !!! danger
            Do not override this method. It should be considered to be deprecated!

        Args:
            text_splitter: `TextSplitter` instance to use for splitting documents.
                Defaults to `RecursiveCharacterTextSplitter`.

        Raises:
            ImportError: If `langchain-text-splitters` is not installed and no
                `text_splitter` is provided.

        Returns:
            List of `Document` objects.
        """
        if text_splitter is None:
            if not _HAS_TEXT_SPLITTERS:
                msg = (
                    "Unable to import from langchain_text_splitters. Please specify "
                    "text_splitter or install langchain_text_splitters with "
                    "`pip install -U langchain-text-splitters`."
                )
                raise ImportError(msg)

            text_splitter_: TextSplitter = RecursiveCharacterTextSplitter()
        else:
            text_splitter_ = text_splitter
        docs = self.load()
        return text_splitter_.split_documents(docs)

    # Attention: This method will be upgraded into an abstractmethod once it's
    # implemented in all the existing subclasses.
    def lazy_load(self) -> Iterator[Document]:
        """A lazy loader for `Document`.

        Yields:
            The `Document` objects.
        """
        if type(self).load != BaseLoader.load:
            return iter(self.load())
        msg = f"{self.__class__.__name__} does not implement lazy_load()"
        raise NotImplementedError(msg)

    async def alazy_load(self) -> AsyncIterator[Document]:
        """A lazy loader for `Document`.

        Yields:
            The `Document` objects.
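The listing breaks off inside alazy_load(), so its default body is not reproduced here. As a rough sketch of how the async pair fits together, the example below uses stand-in classes rather than the real library: `Document` is a two-field dataclass, `BaseLoader` copies only the aload() method shown above, and `PagedLoader` and `first_page` are hypothetical names invented for illustration. It assumes a subclass that supplies its own alazy_load() as an async generator.

```python
import asyncio
from collections.abc import AsyncIterator
from dataclasses import dataclass, field


@dataclass
class Document:
    """Stand-in for langchain's Document (assumption: only these two fields)."""

    page_content: str
    metadata: dict = field(default_factory=dict)


class BaseLoader:
    """Stand-in that copies only aload() from the listing above."""

    async def aload(self) -> list[Document]:
        # Drain the async iterator into a list, as in the listing.
        return [document async for document in self.alazy_load()]


class PagedLoader(BaseLoader):
    """Hypothetical loader that awaits between documents."""

    def __init__(self, pages: list[str]) -> None:
        self.pages = pages

    async def alazy_load(self) -> AsyncIterator[Document]:
        for index, text in enumerate(self.pages):
            await asyncio.sleep(0)  # placeholder for real async I/O
            yield Document(page_content=text, metadata={"page": index})


async def first_page(loader: PagedLoader) -> Document:
    # Stream and stop early; later pages are never produced.
    async for document in loader.alazy_load():
        return document
    raise ValueError("loader produced no documents")


loader = PagedLoader(["intro", "body", "appendix"])
all_docs = asyncio.run(loader.aload())
first = asyncio.run(first_page(loader))
```

Calling aload() materializes every document, while iterating alazy_load() directly lets the caller stop as soon as it has what it needs.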
Frequently Asked Questions
What is the BaseLoader class?
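BaseLoader is the document loader interface in the langchain codebase, defined in libs/core/langchain_core/document_loaders/base.py. Subclasses are expected to implement lazy_load() as a generator so that documents are not all held in memory at once; load() is a convenience wrapper that should not be overridden.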
Where is BaseLoader defined?
BaseLoader is defined in libs/core/langchain_core/document_loaders/base.py at line 26.
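The relationship between load() and lazy_load() in this class can be illustrated with a small sketch. To stay runnable without langchain installed, it uses stand-ins: `Document` is a two-field dataclass and `BaseLoader` copies only the load() and lazy_load() bodies from the source listing. `LineLoader`, `LegacyLoader`, and `EmptyLoader` are hypothetical subclasses made up for this example. With the library installed, the real class would presumably be imported from `langchain_core.document_loaders.base` instead.

```python
from collections.abc import Iterator
from dataclasses import dataclass, field


@dataclass
class Document:
    """Stand-in for langchain's Document (assumption: only these two fields)."""

    page_content: str
    metadata: dict = field(default_factory=dict)


class BaseLoader:
    """Stand-in that copies only load() and lazy_load() from the listing."""

    def load(self) -> list[Document]:
        return list(self.lazy_load())

    def lazy_load(self) -> Iterator[Document]:
        # Fall back to load() when a subclass overrides it instead.
        if type(self).load != BaseLoader.load:
            return iter(self.load())
        msg = f"{self.__class__.__name__} does not implement lazy_load()"
        raise NotImplementedError(msg)


class LineLoader(BaseLoader):
    """Hypothetical loader: one Document per non-empty line of a string."""

    def __init__(self, text: str) -> None:
        self.text = text

    def lazy_load(self) -> Iterator[Document]:
        for number, line in enumerate(self.text.splitlines(), start=1):
            if line.strip():
                yield Document(page_content=line, metadata={"line": number})


class LegacyLoader(BaseLoader):
    """Hypothetical older-style loader that overrides load() only."""

    def load(self) -> list[Document]:
        return [Document(page_content="legacy")]


class EmptyLoader(BaseLoader):
    """Overrides neither method, so lazy_load() raises."""


lines = LineLoader("alpha\n\nbeta").load()    # load() drains lazy_load()
legacy = list(LegacyLoader().lazy_load())     # lazy_load() wraps load()
try:
    EmptyLoader().load()
    error = ""
except NotImplementedError as exc:
    error = str(exc)
```

The generator-based `LineLoader` is the shape the class docstring recommends. `LegacyLoader` only works because of the `type(self).load != BaseLoader.load` check, which the source comments describe as transitional until lazy_load() becomes an abstract method.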