__init__() — langchain Function Reference
Architecture documentation for the __init__() function in markdown.py from the langchain codebase.
Dependency Diagram
graph TD
    fc2b67ed_e223_c683_5ac5_a115772a6829["__init__()"]
    cd7394a9_9856_dc15_cb00_078cf42f0529["ExperimentalMarkdownSyntaxTextSplitter"]
    fc2b67ed_e223_c683_5ac5_a115772a6829 -->|defined in| cd7394a9_9856_dc15_cb00_078cf42f0529
    style fc2b67ed_e223_c683_5ac5_a115772a6829 fill:#6366f1,stroke:#818cf8,color:#fff
Source Code
libs/text-splitters/langchain_text_splitters/markdown.py lines 333–370
def __init__(
    self,
    headers_to_split_on: list[tuple[str, str]] | None = None,
    return_each_line: bool = False,  # noqa: FBT001,FBT002
    strip_headers: bool = True,  # noqa: FBT001,FBT002
) -> None:
    """Initialize the text splitter with header splitting and formatting options.

    This constructor sets up the required configuration for splitting text into
    chunks based on specified headers and formatting preferences.

    Args:
        headers_to_split_on: A list of tuples, where each tuple contains a header
            tag (e.g., "h1") and its corresponding metadata key.
            If `None`, default headers are used.
        return_each_line: Whether to return each line as an individual chunk.
            Defaults to `False`, which aggregates lines into larger chunks.
        strip_headers: Whether to exclude headers from the resulting chunks.
    """
    self.chunks: list[Document] = []
    self.current_chunk = Document(page_content="")
    self.current_header_stack: list[tuple[int, str]] = []
    self.strip_headers = strip_headers
    if headers_to_split_on:
        self.splittable_headers = dict(headers_to_split_on)
    else:
        self.splittable_headers = {
            "#": "Header 1",
            "##": "Header 2",
            "###": "Header 3",
            "####": "Header 4",
            "#####": "Header 5",
            "######": "Header 6",
        }
    self.return_each_line = return_each_line
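The header-selection branch in the constructor tests `headers_to_split_on` for truthiness, so an empty list appears to fall back to the defaults just as `None` does. The sketch below isolates that branch in a standalone function so it can be run without installing langchain. The names `resolve_headers` and `DEFAULT_HEADERS` are hypothetical and introduced only for illustration; they are not part of the library.

```python
from __future__ import annotations

# Default mapping copied from the constructor's else-branch.
DEFAULT_HEADERS = {
    "#": "Header 1",
    "##": "Header 2",
    "###": "Header 3",
    "####": "Header 4",
    "#####": "Header 5",
    "######": "Header 6",
}


def resolve_headers(
    headers_to_split_on: list[tuple[str, str]] | None = None,
) -> dict[str, str]:
    """Hypothetical helper mirroring the constructor's header-selection branch."""
    # A non-empty list of (marker, metadata key) tuples becomes a dict.
    if headers_to_split_on:
        return dict(headers_to_split_on)
    # None and an empty list are both falsy, so both get a copy of the defaults.
    return dict(DEFAULT_HEADERS)


custom = resolve_headers([("#", "Title"), ("##", "Section")])
from_none = resolve_headers(None)
from_empty = resolve_headers([])

print(custom)
print(from_empty == from_none)
```

Because the tuples are passed through `dict()`, a header marker that appears more than once keeps only its last metadata key.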
Frequently Asked Questions
What does __init__() do?
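__init__() is the constructor of ExperimentalMarkdownSyntaxTextSplitter, defined in libs/text-splitters/langchain_text_splitters/markdown.py. It initializes an empty chunk list, an empty current chunk, and an empty header stack, stores the strip_headers and return_each_line flags, and builds the header-to-metadata-key mapping from headers_to_split_on. When that argument is None or empty, the mapping defaults to "#" through "######" keyed to "Header 1" through "Header 6".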
Where is __init__() defined?
__init__() is defined in libs/text-splitters/langchain_text_splitters/markdown.py, spanning lines 333–370.