test_html_no_headers_with_multiple_splitters() — langchain Function Reference
Architecture documentation for the test_html_no_headers_with_multiple_splitters() function in test_text_splitters.py from the langchain codebase.
Entity Profile
Dependency Diagram
graph TD 92c72a6d_b050_17b1_6494_b52311e03131["test_html_no_headers_with_multiple_splitters()"] 6d6b8ad4_1cfe_fbb0_e58e_76a50487c135["test_text_splitters.py"] 92c72a6d_b050_17b1_6494_b52311e03131 -->|defined in| 6d6b8ad4_1cfe_fbb0_e58e_76a50487c135 b5d48f00_ae6e_36fa_d4c3_b94b9e390692["html_header_splitter_splitter_factory()"] 92c72a6d_b050_17b1_6494_b52311e03131 -->|calls| b5d48f00_ae6e_36fa_d4c3_b94b9e390692 style 92c72a6d_b050_17b1_6494_b52311e03131 fill:#6366f1,stroke:#818cf8,color:#fff
Relationship Graph
Source Code
libs/text-splitters/tests/unit_tests/test_text_splitters.py lines 2879–2920
def test_html_no_headers_with_multiple_splitters(
html_header_splitter_splitter_factory: Callable[
[list[tuple[str, str]]], HTMLHeaderTextSplitter
],
headers_to_split_on: list[tuple[str, str]],
html_content: str,
expected_output: list[Document],
test_case: str,
) -> None:
"""Test HTML content splitting without headers using multiple splitters.
Args:
html_header_splitter_splitter_factory: Factory to create the HTML header
splitter.
headers_to_split_on: List of headers to split on.
html_content: HTML content to be split.
expected_output: Expected list of `Document` objects after splitting.
test_case: Description of the test case.
Raises:
AssertionError: If the number of documents or their content/metadata
does not match the expected output.
"""
splitter = html_header_splitter_splitter_factory(headers_to_split_on)
docs = splitter.split_text(html_content)
assert len(docs) == len(expected_output), (
f"{test_case} Failed: Number of documents mismatch. "
f"Expected {len(expected_output)}, got {len(docs)}."
)
for idx, (doc, expected) in enumerate(
zip(docs, expected_output, strict=False), start=1
):
assert doc.page_content == expected.page_content, (
f"{test_case} Failed at Document {idx}: "
f"Content mismatch.\nExpected: {expected.page_content}\n"
"Got: {doc.page_content}"
)
assert doc.metadata == expected.metadata, (
f"{test_case} Failed at Document {idx}: "
f"Metadata mismatch.\nExpected: {expected.metadata}\nGot: {doc.metadata}"
)
Domain
Subdomains
Source
Frequently Asked Questions
What does test_html_no_headers_with_multiple_splitters() do?
test_html_no_headers_with_multiple_splitters() is a function in the langchain codebase, defined in libs/text-splitters/tests/unit_tests/test_text_splitters.py.
Where is test_html_no_headers_with_multiple_splitters() defined?
test_html_no_headers_with_multiple_splitters() is defined in libs/text-splitters/tests/unit_tests/test_text_splitters.py at line 2879.
What does test_html_no_headers_with_multiple_splitters() call?
test_html_no_headers_with_multiple_splitters() calls 1 function(s): html_header_splitter_splitter_factory.
Analyze Your Own Codebase
Get architecture documentation, dependency graphs, and domain analysis for your codebase in minutes.
Try Supermodel Free