test_md_header_text_splitter_with_custom_headers() — langchain Function Reference
Architecture documentation for the test_md_header_text_splitter_with_custom_headers() function in test_text_splitters.py from the langchain codebase.
Dependency Diagram
graph TD
    642d6e20_87ef_3076_3f0c_4dda9ba38eab["test_md_header_text_splitter_with_custom_headers()"]
    6d6b8ad4_1cfe_fbb0_e58e_76a50487c135["test_text_splitters.py"]
    642d6e20_87ef_3076_3f0c_4dda9ba38eab -->|defined in| 6d6b8ad4_1cfe_fbb0_e58e_76a50487c135
    style 642d6e20_87ef_3076_3f0c_4dda9ba38eab fill:#6366f1,stroke:#818cf8,color:#fff
Source Code
libs/text-splitters/tests/unit_tests/test_text_splitters.py lines 1556–1609
def test_md_header_text_splitter_with_custom_headers() -> None:
    """Test markdown splitter with custom header patterns like **Header**."""
    markdown_document = """**Chapter 1**
This is the content for chapter 1.
***Section 1.1***
This is the content for section 1.1.
**Chapter 2**
This is the content for chapter 2.
***Section 2.1***
This is the content for section 2.1.
"""
    headers_to_split_on = [
        ("**", "Bold Header"),
        ("***", "Bold Italic Header"),
    ]
    custom_header_patterns = {
        "**": 1,  # Level 1 headers
        "***": 2,  # Level 2 headers
    }
    markdown_splitter = MarkdownHeaderTextSplitter(
        headers_to_split_on=headers_to_split_on,
        custom_header_patterns=custom_header_patterns,
    )
    output = markdown_splitter.split_text(markdown_document)
    expected_output = [
        Document(
            page_content="This is the content for chapter 1.",
            metadata={"Bold Header": "Chapter 1"},
        ),
        Document(
            page_content="This is the content for section 1.1.",
            metadata={"Bold Header": "Chapter 1", "Bold Italic Header": "Section 1.1"},
        ),
        Document(
            page_content="This is the content for chapter 2.",
            metadata={"Bold Header": "Chapter 2"},
        ),
        Document(
            page_content="This is the content for section 2.1.",
            metadata={"Bold Header": "Chapter 2", "Bold Italic Header": "Section 2.1"},
        ),
    ]
    assert output == expected_output
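The expected output depends on how header levels nest: a level 2 header inherits the enclosing level 1 title, and a new level 1 header discards any open level 2 title. The following is a minimal pure-Python sketch of that behaviour, not the library's implementation. The helper name split_on_custom_headers and the plain-dict chunks are hypothetical stand-ins for MarkdownHeaderTextSplitter.split_text and Document, and the sketch assumes a header is a whole line wrapped in the same delimiter on both sides.

```python
from typing import Dict, List, Tuple


def split_on_custom_headers(
    text: str,
    headers_to_split_on: List[Tuple[str, str]],
    custom_header_patterns: Dict[str, int],
) -> List[dict]:
    """Hypothetical helper: split on lines wrapped in a delimiter such as **...**."""
    # Try longer delimiters first so "***" is not mistaken for "**".
    delimiters = sorted(headers_to_split_on, key=lambda h: len(h[0]), reverse=True)

    chunks: List[dict] = []
    active: List[Tuple[int, str, str]] = []  # stack of (level, name, title)
    body: List[str] = []

    def flush() -> None:
        # Emit the accumulated body lines with the titles of all open headers.
        content = "\n".join(body).strip()
        if content:
            metadata = {name: title for _, name, title in active}
            chunks.append({"page_content": content, "metadata": metadata})
        body.clear()

    for raw in text.splitlines():
        line = raw.strip()
        for delim, name in delimiters:
            wrapped = (
                len(line) > 2 * len(delim)
                and line.startswith(delim)
                and line.endswith(delim)
            )
            if wrapped:
                flush()
                level = custom_header_patterns[delim]
                # A new header closes every open header at the same or deeper level.
                while active and active[-1][0] >= level:
                    active.pop()
                title = line[len(delim):-len(delim)].strip()
                active.append((level, name, title))
                break
        else:
            # Not a header: keep non-empty lines as body text.
            if line:
                body.append(line)
    flush()
    return chunks


markdown_document = """**Chapter 1**
This is the content for chapter 1.
***Section 1.1***
This is the content for section 1.1.
**Chapter 2**
This is the content for chapter 2.
***Section 2.1***
This is the content for section 2.1.
"""

chunks = split_on_custom_headers(
    markdown_document,
    headers_to_split_on=[("**", "Bold Header"), ("***", "Bold Italic Header")],
    custom_header_patterns={"**": 1, "***": 2},
)
```

The stack of open headers is what keeps "Section 1.1" out of the metadata for "Chapter 2": when the second level 1 header arrives, every entry at level 1 or deeper is popped before the new title is pushed.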
Frequently Asked Questions
What does test_md_header_text_splitter_with_custom_headers() do?
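test_md_header_text_splitter_with_custom_headers() is a unit test in the langchain codebase, defined in libs/text-splitters/tests/unit_tests/test_text_splitters.py. It checks that MarkdownHeaderTextSplitter treats custom patterns such as **Header** and ***Header*** as headers when they are passed through custom_header_patterns, and that each resulting Document carries its section content together with the titles of its enclosing headers as metadata.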
Where is test_md_header_text_splitter_with_custom_headers() defined?
test_md_header_text_splitter_with_custom_headers() is defined in libs/text-splitters/tests/unit_tests/test_text_splitters.py at line 1556.