test_md_header_text_splitter_mixed_headers() — langchain Function Reference

Architecture documentation for the test_md_header_text_splitter_mixed_headers() function in test_text_splitters.py from the langchain codebase.

Dependency Diagram

graph TD
  fff5cf36_253f_ce26_996e_95b381c896b4["test_md_header_text_splitter_mixed_headers()"]
  6d6b8ad4_1cfe_fbb0_e58e_76a50487c135["test_text_splitters.py"]
  fff5cf36_253f_ce26_996e_95b381c896b4 -->|defined in| 6d6b8ad4_1cfe_fbb0_e58e_76a50487c135
  style fff5cf36_253f_ce26_996e_95b381c896b4 fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

libs/text-splitters/tests/unit_tests/test_text_splitters.py lines 1612–1674

def test_md_header_text_splitter_mixed_headers() -> None:
    """Test markdown splitter with both standard and custom headers."""
    markdown_document = """# Standard Header 1

Content under standard header.

**Custom Header 1**

Content under custom header.

## Standard Header 2

Content under standard header 2.

***Custom Header 2***

Content under custom header 2.
"""

    headers_to_split_on = [
        ("#", "Header 1"),
        ("##", "Header 2"),
        ("**", "Bold Header"),
        ("***", "Bold Italic Header"),
    ]

    custom_header_patterns = {
        "**": 1,  # Same level as #
        "***": 2,  # Same level as ##
    }

    markdown_splitter = MarkdownHeaderTextSplitter(
        headers_to_split_on=headers_to_split_on,
        custom_header_patterns=custom_header_patterns,
    )
    output = markdown_splitter.split_text(markdown_document)

    expected_output = [
        Document(
            page_content="Content under standard header.",
            metadata={"Header 1": "Standard Header 1"},
        ),
        Document(
            page_content="Content under custom header.",
            metadata={"Bold Header": "Custom Header 1"},
        ),
        Document(
            page_content="Content under standard header 2.",
            metadata={
                "Bold Header": "Custom Header 1",
                "Header 2": "Standard Header 2",
            },
        ),
        Document(
            page_content="Content under custom header 2.",
            metadata={
                "Bold Header": "Custom Header 1",
                "Bold Italic Header": "Custom Header 2",
            },
        ),
    ]

    assert output == expected_output
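
The expected metadata in this test is consistent with a level-based header stack: a new header drops every active header at the same or a deeper level before it is recorded. That would explain why "Header 1" disappears once the level-1 `**Custom Header 1**` appears, and why "Header 2" is replaced by the level-2 `***Custom Header 2***`. The following is a minimal, standalone sketch of that idea, not the library's implementation; `HEADERS`, `match_header`, and `split_markdown` are hypothetical names introduced here for illustration, and the parsing is deliberately simplified.

```python
# Hypothetical illustration only: this does not reproduce
# MarkdownHeaderTextSplitter's internals.

# (marker, metadata key, level); longer markers are checked first so that
# "***" is not mistaken for "**".
HEADERS = [
    ("***", "Bold Italic Header", 2),
    ("**", "Bold Header", 1),
    ("##", "Header 2", 2),
    ("#", "Header 1", 1),
]


def match_header(line):
    """Return (metadata key, level, title) if the line is a header, else None."""
    stripped = line.strip()
    for marker, name, level in HEADERS:
        if marker.startswith("*"):
            # Bold-style header: the marker must wrap the whole line.
            if (
                stripped.startswith(marker)
                and stripped.endswith(marker)
                and len(stripped) > 2 * len(marker)
            ):
                return name, level, stripped[len(marker):-len(marker)].strip()
        elif stripped.startswith(marker + " "):
            return name, level, stripped[len(marker) + 1:].strip()
    return None


def split_markdown(text):
    """Split text into (content, metadata) pairs using a header stack."""
    stack = []   # active headers as (level, metadata key, title)
    chunks = []  # finished (content, metadata) pairs
    buffer = []  # content lines collected since the last header

    def flush():
        content = "\n".join(buffer).strip()
        if content:
            metadata = {name: title for _, name, title in stack}
            chunks.append((content, metadata))
        buffer.clear()

    for line in text.splitlines():
        header = match_header(line)
        if header is None:
            buffer.append(line)
            continue
        flush()
        name, level, title = header
        # Drop headers at the same or a deeper level before pushing.
        while stack and stack[-1][0] >= level:
            stack.pop()
        stack.append((level, name, title))
    flush()
    return chunks


MARKDOWN = """# Standard Header 1

Content under standard header.

**Custom Header 1**

Content under custom header.

## Standard Header 2

Content under standard header 2.

***Custom Header 2***

Content under custom header 2.
"""

chunks = split_markdown(MARKDOWN)
for content, metadata in chunks:
    print(metadata, "->", content)
```

Under these assumptions the sketch yields the same four content/metadata pairs that the test asserts, with plain tuples standing in for `Document` objects.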

Frequently Asked Questions

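What does test_md_header_text_splitter_mixed_headers() do?
test_md_header_text_splitter_mixed_headers() is a unit test in the langchain codebase, defined in libs/text-splitters/tests/unit_tests/test_text_splitters.py. It checks that MarkdownHeaderTextSplitter handles a document that mixes standard Markdown headers (# and ##) with custom bold-style headers (** and ***), where custom_header_patterns maps ** to level 1 and *** to level 2. The test asserts that split_text() returns four Document chunks, each carrying the metadata of the headers active at that point.

Where is test_md_header_text_splitter_mixed_headers() defined?
test_md_header_text_splitter_mixed_headers() is defined in libs/text-splitters/tests/unit_tests/test_text_splitters.py, starting at line 1612 (lines 1612–1674).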
