Home / Function/ test_html_header_text_splitter() — langchain Function Reference

test_html_header_text_splitter() — langchain Function Reference

Architecture documentation for the test_html_header_text_splitter() function in test_text_splitters.py from the langchain codebase.

Entity Profile

Dependency Diagram

graph TD
  4e25bc1f_884b_eaf4_1078_ed0ebebea27a["test_html_header_text_splitter()"]
  6d6b8ad4_1cfe_fbb0_e58e_76a50487c135["test_text_splitters.py"]
  4e25bc1f_884b_eaf4_1078_ed0ebebea27a -->|defined in| 6d6b8ad4_1cfe_fbb0_e58e_76a50487c135
  b5d48f00_ae6e_36fa_d4c3_b94b9e390692["html_header_splitter_splitter_factory()"]
  4e25bc1f_884b_eaf4_1078_ed0ebebea27a -->|calls| b5d48f00_ae6e_36fa_d4c3_b94b9e390692
  style 4e25bc1f_884b_eaf4_1078_ed0ebebea27a fill:#6366f1,stroke:#818cf8,color:#fff

Relationship Graph

Source Code

libs/text-splitters/tests/unit_tests/test_text_splitters.py lines 2651–2692

def test_html_header_text_splitter(
    html_header_splitter_splitter_factory: Callable[
        [list[tuple[str, str]]], HTMLHeaderTextSplitter
    ],
    headers_to_split_on: list[tuple[str, str]],
    html_input: str,
    expected_documents: list[Document],
    test_case: str,
) -> None:
    """Test the HTML header text splitter.

    Args:
        html_header_splitter_splitter_factory : Factory function to create the HTML
            header splitter.
        headers_to_split_on: List of headers to split on.
        html_input: The HTML input string to be split.
        expected_documents: List of expected Document objects.
        test_case: Description of the test case.

    Raises:
        AssertionError: If the number of documents or their content/metadata
            does not match the expected values.
    """
    splitter = html_header_splitter_splitter_factory(headers_to_split_on)
    docs = splitter.split_text(html_input)

    assert len(docs) == len(expected_documents), (
        f"Test Case '{test_case}' Failed: Number of documents mismatch. "
        f"Expected {len(expected_documents)}, got {len(docs)}."
    )
    for idx, (doc, expected) in enumerate(
        zip(docs, expected_documents, strict=False), start=1
    ):
        assert doc.page_content == expected.page_content, (
            f"Test Case '{test_case}' Failed at Document {idx}: "
            f"Content mismatch.\nExpected: {expected.page_content}"
            "\nGot: {doc.page_content}"
        )
        assert doc.metadata == expected.metadata, (
            f"Test Case '{test_case}' Failed at Document {idx}: "
            f"Metadata mismatch.\nExpected: {expected.metadata}\nGot: {doc.metadata}"
        )

Domain

Subdomains

Frequently Asked Questions

What does test_html_header_text_splitter() do?
test_html_header_text_splitter() is a function in the langchain codebase, defined in libs/text-splitters/tests/unit_tests/test_text_splitters.py.
Where is test_html_header_text_splitter() defined?
test_html_header_text_splitter() is defined in libs/text-splitters/tests/unit_tests/test_text_splitters.py at line 2651.
What does test_html_header_text_splitter() call?
test_html_header_text_splitter() calls 1 function(s): html_header_splitter_splitter_factory.

Analyze Your Own Codebase

Get architecture documentation, dependency graphs, and domain analysis for your codebase in minutes.

Try Supermodel Free