__init__() — langchain Function Reference
Architecture documentation for the __init__() function in base.py from the langchain codebase.
Dependency Diagram
graph TD
    c7195a4f_fa21_284a_6792_0765592ec3fd["__init__()"]
    c86e37d5_f962_cc1e_9821_b665e1359ae8["TextSplitter"]
    c7195a4f_fa21_284a_6792_0765592ec3fd -->|defined in| c86e37d5_f962_cc1e_9821_b665e1359ae8
    style c7195a4f_fa21_284a_6792_0765592ec3fd fill:#6366f1,stroke:#818cf8,color:#fff
Source Code
libs/text-splitters/langchain_text_splitters/base.py lines 47–90
def __init__(
    self,
    chunk_size: int = 4000,
    chunk_overlap: int = 200,
    length_function: Callable[[str], int] = len,
    keep_separator: bool | Literal["start", "end"] = False,  # noqa: FBT001,FBT002
    add_start_index: bool = False,  # noqa: FBT001,FBT002
    strip_whitespace: bool = True,  # noqa: FBT001,FBT002
) -> None:
    """Create a new `TextSplitter`.

    Args:
        chunk_size: Maximum size of chunks to return
        chunk_overlap: Overlap in characters between chunks
        length_function: Function that measures the length of given chunks
        keep_separator: Whether to keep the separator and where to place it
            in each corresponding chunk `(True='start')`
        add_start_index: If `True`, includes chunk's start index in metadata
        strip_whitespace: If `True`, strips whitespace from the start and end of
            every document

    Raises:
        ValueError: If `chunk_size` is less than or equal to 0
        ValueError: If `chunk_overlap` is less than 0
        ValueError: If `chunk_overlap` is greater than `chunk_size`
    """
    if chunk_size <= 0:
        msg = f"chunk_size must be > 0, got {chunk_size}"
        raise ValueError(msg)
    if chunk_overlap < 0:
        msg = f"chunk_overlap must be >= 0, got {chunk_overlap}"
        raise ValueError(msg)
    if chunk_overlap > chunk_size:
        msg = (
            f"Got a larger chunk overlap ({chunk_overlap}) than chunk size "
            f"({chunk_size}), should be smaller."
        )
        raise ValueError(msg)
    self._chunk_size = chunk_size
    self._chunk_overlap = chunk_overlap
    self._length_function = length_function
    self._keep_separator = keep_separator
    self._add_start_index = add_start_index
    self._strip_whitespace = strip_whitespace
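The three validation checks in the listing can be tried in isolation. The sketch below copies them into a standalone helper so it runs without installing langchain; `check_chunk_params` and `is_accepted` are hypothetical names introduced here for illustration and are not part of the library. It is a minimal sketch of the checks as shown, not the library's own code path.

```python
def check_chunk_params(chunk_size: int, chunk_overlap: int) -> None:
    """Hypothetical helper: same three checks, in the same order, as the listing."""
    if chunk_size <= 0:
        msg = f"chunk_size must be > 0, got {chunk_size}"
        raise ValueError(msg)
    if chunk_overlap < 0:
        msg = f"chunk_overlap must be >= 0, got {chunk_overlap}"
        raise ValueError(msg)
    if chunk_overlap > chunk_size:
        msg = (
            f"Got a larger chunk overlap ({chunk_overlap}) than chunk size "
            f"({chunk_size}), should be smaller."
        )
        raise ValueError(msg)


def is_accepted(chunk_size: int, chunk_overlap: int) -> bool:
    """Hypothetical wrapper: True if the pair passes all three checks."""
    try:
        check_chunk_params(chunk_size, chunk_overlap)
    except ValueError:
        return False
    return True


# A few representative (chunk_size, chunk_overlap) pairs.
cases = [(4000, 200), (0, 0), (100, -1), (100, 101), (100, 100)]
for size, overlap in cases:
    verdict = "accepted" if is_accepted(size, overlap) else "rejected"
    print(f"chunk_size={size}, chunk_overlap={overlap}: {verdict}")
```

Because the last check uses a strict `>`, an overlap equal to the chunk size passes validation; only a strictly larger overlap is rejected.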
Frequently Asked Questions
What does __init__() do?
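__init__() is the constructor of the TextSplitter class, defined in libs/text-splitters/langchain_text_splitters/base.py. It checks that chunk_size is greater than 0, that chunk_overlap is not negative, and that chunk_overlap is not greater than chunk_size, raising ValueError if any check fails. It then stores all six arguments on private attributes (self._chunk_size, self._chunk_overlap, self._length_function, self._keep_separator, self._add_start_index, self._strip_whitespace).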
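The `length_function` parameter in the signature is typed `Callable[[str], int]` and defaults to `len`, so by default sizes are counted in characters. The sketch below illustrates how swapping in a different callable changes what `chunk_size` means. `count_words` and `fits` are hypothetical helpers written for this example, and the comparison in `fits` is an assumption about how a size limit would be applied, based only on the docstring's description of `chunk_size` as a maximum.

```python
from collections.abc import Callable


def count_words(text: str) -> int:
    """Hypothetical length function: counts whitespace-separated words."""
    return len(text.split())


def fits(
    text: str,
    chunk_size: int,
    length_function: Callable[[str], int] = len,
) -> bool:
    """Hypothetical check: does `text` measure at most `chunk_size`?"""
    return length_function(text) <= chunk_size


sample = "split this sentence into chunks"

# Measured in characters (the default), the sample exceeds a limit of 10.
print(fits(sample, chunk_size=10))

# Measured in words, the same sample is well within a limit of 10.
print(fits(sample, chunk_size=10, length_function=count_words))
```

Any callable with the same `str -> int` shape, such as a tokenizer-based counter, could be passed the same way; which unit is appropriate depends on how the chunks will be consumed.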
Where is __init__() defined?
__init__() is defined in libs/text-splitters/langchain_text_splitters/base.py at line 47.