test_experimental_markdown_syntax_text_splitter_with_header_on_multi_files() — langchain Function Reference
Architecture documentation for the test_experimental_markdown_syntax_text_splitter_with_header_on_multi_files() function in test_text_splitters.py from the langchain codebase.
Dependency Diagram
graph TD
    128f3301_cc27_3a80_fd61_d6cd96ef2f34["test_experimental_markdown_syntax_text_splitter_with_header_on_multi_files()"]
    6d6b8ad4_1cfe_fbb0_e58e_76a50487c135["test_text_splitters.py"]
    128f3301_cc27_3a80_fd61_d6cd96ef2f34 -->|defined in| 6d6b8ad4_1cfe_fbb0_e58e_76a50487c135
    style 128f3301_cc27_3a80_fd61_d6cd96ef2f34 fill:#6366f1,stroke:#818cf8,color:#fff
Source Code
libs/text-splitters/tests/unit_tests/test_text_splitters.py lines 2173–2259
def test_experimental_markdown_syntax_text_splitter_with_header_on_multi_files() -> (
    None
):
    """Test ExperimentalMarkdownSyntaxTextSplitter with header on multiple files.

    Test experimental markdown splitter by header called consecutively on two files.
    """
    markdown_splitter = ExperimentalMarkdownSyntaxTextSplitter(strip_headers=False)
    output = []
    for experimental_markdown_document in EXPERIMENTAL_MARKDOWN_DOCUMENTS:
        output += markdown_splitter.split_text(experimental_markdown_document)
    expected_output = [
        Document(
            page_content="# My Header 1 From Document 1\n"
            "Content for header 1 from Document 1\n",
            metadata={"Header 1": "My Header 1 From Document 1"},
        ),
        Document(
            page_content="## Header 2 From Document 1\n"
            "Content for header 2 from Document 1\n",
            metadata={
                "Header 1": "My Header 1 From Document 1",
                "Header 2": "Header 2 From Document 1",
            },
        ),
        Document(
            page_content=(
                "```python\ndef func_definition():\n "
                "print('Keep the whitespace consistent')\n```\n"
            ),
            metadata={
                "Code": "python",
                "Header 1": "My Header 1 From Document 1",
                "Header 2": "Header 2 From Document 1",
            },
        ),
        Document(
            page_content="# Header 1 again From Document 1\n"
            "We should also split on the horizontal line\n",
            metadata={"Header 1": "Header 1 again From Document 1"},
        ),
        Document(
            page_content=(
                "This will be a new doc but with the same header metadata\n\n"
                "And it includes a new paragraph"
            ),
            metadata={"Header 1": "Header 1 again From Document 1"},
        ),
        Document(
            page_content="# My Header 1 From Document 2\n"
            "Content for header 1 from Document 2\n",
            metadata={"Header 1": "My Header 1 From Document 2"},
        ),
        Document(
            page_content="## Header 2 From Document 2\n"
            "Content for header 2 from Document 2\n",
            metadata={
                "Header 1": "My Header 1 From Document 2",
                "Header 2": "Header 2 From Document 2",
            },
        ),
        Document(
            page_content=(
                "```python\ndef func_definition():\n "
                "print('Keep the whitespace consistent')\n```\n"
            ),
            metadata={
                "Code": "python",
                "Header 1": "My Header 1 From Document 2",
                "Header 2": "Header 2 From Document 2",
            },
        ),
        Document(
            page_content="# Header 1 again From Document 2\n"
            "We should also split on the horizontal line\n",
            metadata={"Header 1": "Header 1 again From Document 2"},
        ),
        Document(
            page_content=(
                "This will be a new doc but with the same header metadata\n\n"
Frequently Asked Questions
What does test_experimental_markdown_syntax_text_splitter_with_header_on_multi_files() do?
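test_experimental_markdown_syntax_text_splitter_with_header_on_multi_files() is a unit test in the langchain codebase, defined in libs/text-splitters/tests/unit_tests/test_text_splitters.py. It creates a single ExperimentalMarkdownSyntaxTextSplitter with strip_headers=False, calls split_text on each document in EXPERIMENTAL_MARKDOWN_DOCUMENTS in turn, and concatenates the results into one list. The expected output lists a separate group of Document objects for each file, with "Header 1", "Header 2", and "Code" metadata drawn only from that file, which indicates the test is meant to confirm that reusing one splitter instance does not carry header state from one document into the next.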
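The scenario behind this test can be sketched without langchain. The example below is a minimal illustration under stated assumptions: ToyHeaderSplitter and Chunk are hypothetical stand-ins written for this page, not the library's classes, and the toy only handles "#" headers. It shows one instance being called on two texts in a row, with per-call state reset so that metadata from the first text cannot leak into the second.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical stand-in for a Document: text plus header metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


class ToyHeaderSplitter:
    """Hypothetical toy splitter; not the langchain implementation."""

    def __init__(self) -> None:
        self._headers: dict[int, str] = {}
        self._chunks: list[Chunk] = []

    def split_text(self, text: str) -> list[Chunk]:
        # Reset per-call state so a second call starts from a clean slate.
        self._headers = {}
        self._chunks = []
        current: list[str] = []
        for line in text.splitlines(keepends=True):
            match = re.match(r"^(#{1,6}) (.*)", line)
            if match:
                # Emit the text gathered so far under the previous headers.
                self._flush(current)
                current = []
                level = len(match.group(1))
                # A new header replaces headers at the same or a deeper level.
                self._headers = {
                    k: v for k, v in self._headers.items() if k < level
                }
                self._headers[level] = match.group(2).strip()
            # Headers are kept in the content (like strip_headers=False).
            current.append(line)
        self._flush(current)
        return self._chunks

    def _flush(self, lines: list[str]) -> None:
        content = "".join(lines)
        if content.strip():
            metadata = {
                f"Header {k}": v for k, v in sorted(self._headers.items())
            }
            self._chunks.append(Chunk(content, metadata))


# Made-up sample texts, standing in for two markdown files.
docs = ["# A1\ntext a\n## A2\nmore a\n", "# B1\ntext b\n"]

splitter = ToyHeaderSplitter()
output: list[Chunk] = []
for doc in docs:
    output += splitter.split_text(doc)

for chunk in output:
    print(chunk.metadata)
```

If the reset at the top of split_text were removed, the chunk from the second text would be appended to the same list as the first text's chunks, and the combined output would contain duplicates. That is the kind of regression a consecutive-call test is positioned to catch.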
Where is test_experimental_markdown_syntax_text_splitter_with_header_on_multi_files() defined?
test_experimental_markdown_syntax_text_splitter_with_header_on_multi_files() is defined in libs/text-splitters/tests/unit_tests/test_text_splitters.py at line 2173.