test_experimental_markdown_syntax_text_splitter_header_config_on_multi_files() — langchain Function Reference

Architecture documentation for the test_experimental_markdown_syntax_text_splitter_header_config_on_multi_files() function in test_text_splitters.py from the langchain codebase.

Dependency Diagram

graph TD
  cce7e8af_0f60_c535_811b_7e2d07a4ea3d["test_experimental_markdown_syntax_text_splitter_header_config_on_multi_files()"]
  6d6b8ad4_1cfe_fbb0_e58e_76a50487c135["test_text_splitters.py"]
  cce7e8af_0f60_c535_811b_7e2d07a4ea3d -->|defined in| 6d6b8ad4_1cfe_fbb0_e58e_76a50487c135
  style cce7e8af_0f60_c535_811b_7e2d07a4ea3d fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

libs/text-splitters/tests/unit_tests/test_text_splitters.py lines 2262–2335

def test_experimental_markdown_syntax_text_splitter_header_config_on_multi_files() -> (
    None
):
    """Test ExperimentalMarkdownSyntaxTextSplitter header config on multiple files.

    Test experimental markdown splitter by header configuration called consecutively
    on two files.
    """
    headers_to_split_on = [("#", "Encabezamiento 1")]
    markdown_splitter = ExperimentalMarkdownSyntaxTextSplitter(
        headers_to_split_on=headers_to_split_on
    )
    output = []
    for experimental_markdown_document in EXPERIMENTAL_MARKDOWN_DOCUMENTS:
        output += markdown_splitter.split_text(experimental_markdown_document)

    expected_output = [
        Document(
            page_content="Content for header 1 from Document 1\n"
            "## Header 2 From Document 1\n"
            "Content for header 2 from Document 1\n",
            metadata={"Encabezamiento 1": "My Header 1 From Document 1"},
        ),
        Document(
            page_content=(
                "```python\ndef func_definition():\n   "
                "print('Keep the whitespace consistent')\n```\n"
            ),
            metadata={
                "Code": "python",
                "Encabezamiento 1": "My Header 1 From Document 1",
            },
        ),
        Document(
            page_content="We should also split on the horizontal line\n",
            metadata={"Encabezamiento 1": "Header 1 again From Document 1"},
        ),
        Document(
            page_content=(
                "This will be a new doc but with the same header metadata\n\n"
                "And it includes a new paragraph"
            ),
            metadata={"Encabezamiento 1": "Header 1 again From Document 1"},
        ),
        Document(
            page_content="Content for header 1 from Document 2\n"
            "## Header 2 From Document 2\n"
            "Content for header 2 from Document 2\n",
            metadata={"Encabezamiento 1": "My Header 1 From Document 2"},
        ),
        Document(
            page_content=(
                "```python\ndef func_definition():\n   "
                "print('Keep the whitespace consistent')\n```\n"
            ),
            metadata={
                "Code": "python",
                "Encabezamiento 1": "My Header 1 From Document 2",
            },
        ),
        Document(
            page_content="We should also split on the horizontal line\n",
            metadata={"Encabezamiento 1": "Header 1 again From Document 2"},
        ),
        Document(
            page_content=(
                "This will be a new doc but with the same header metadata\n\n"
                "And it includes a new paragraph"
            ),
            metadata={"Encabezamiento 1": "Header 1 again From Document 2"},
        ),
    ]

    assert output == expected_output
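
The test reuses one splitter instance across two inputs, so it can only pass if nothing from the first split_text() call leaks into the second. The sketch below illustrates that property with a toy splitter built on the standard library only. ToyHeaderSplitter and Chunk are hypothetical names invented for this illustration; they are not langchain classes and do not reproduce how ExperimentalMarkdownSyntaxTextSplitter is implemented (the toy ignores code fences and horizontal rules entirely). Only the call pattern and the headers_to_split_on shape are taken from the test above.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical stand-in for a Document: content plus metadata."""

    page_content: str
    metadata: dict = field(default_factory=dict)


class ToyHeaderSplitter:
    """Toy splitter (an assumption for illustration, not langchain's code)."""

    def __init__(self, headers_to_split_on):
        # Map a markdown prefix such as "#" to the metadata key it is stored under.
        self.headers = dict(headers_to_split_on)
        self.chunks = []

    def split_text(self, text):
        # Reset per-call state so chunks from a previous file do not leak in.
        self.chunks = []
        current_header = None
        lines = []
        for line in text.splitlines():
            prefix, _, title = line.partition(" ")
            if prefix in self.headers:
                # A configured header closes the running chunk and opens a new one.
                self._flush(lines, current_header)
                current_header = (self.headers[prefix], title.strip())
                lines = []
            else:
                # Unconfigured headers (e.g. "##") stay in the content.
                lines.append(line)
        self._flush(lines, current_header)
        return self.chunks

    def _flush(self, lines, header):
        content = "\n".join(lines).strip()
        if content:
            metadata = {header[0]: header[1]} if header else {}
            self.chunks.append(Chunk(content, metadata))


docs = [
    "# Title A\nBody A\n## Sub A\nMore A\n",
    "# Title B\nBody B\n",
]
splitter = ToyHeaderSplitter([("#", "Encabezamiento 1")])
output = []
for doc in docs:
    # Same pattern as the test: one splitter, called once per file.
    output += splitter.split_text(doc)
```

If the reset at the top of split_text() were removed, the second call would return the first file's chunk again alongside its own, and the combined output would contain three chunks instead of two. Comparing the concatenated output against a full expected list, as the test does, is what catches that kind of regression.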

Frequently Asked Questions

What does test_experimental_markdown_syntax_text_splitter_header_config_on_multi_files() do?
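test_experimental_markdown_syntax_text_splitter_header_config_on_multi_files() is a unit test in the langchain codebase, defined in libs/text-splitters/tests/unit_tests/test_text_splitters.py. It builds a single ExperimentalMarkdownSyntaxTextSplitter with headers_to_split_on=[("#", "Encabezamiento 1")], calls split_text() on each entry of EXPERIMENTAL_MARKDOWN_DOCUMENTS in turn, and asserts that the concatenated output equals eight expected Document objects, four per input file. In the expected output, level-1 header text is stored under the custom metadata key "Encabezamiento 1", level-2 headers remain in page_content because only "#" is configured, and each fenced code block is its own Document with a "Code" metadata entry.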
Where is test_experimental_markdown_syntax_text_splitter_header_config_on_multi_files() defined?
test_experimental_markdown_syntax_text_splitter_header_config_on_multi_files() is defined in libs/text-splitters/tests/unit_tests/test_text_splitters.py at line 2262.
