test_additional_html_header_text_splitter() — langchain Function Reference
Architecture documentation for the test_additional_html_header_text_splitter() function in test_text_splitters.py from the langchain codebase.
Dependency Diagram
graph TD
  1cd2c40f_c8ef_d87a_f9a4_67d3bc4d599a["test_additional_html_header_text_splitter()"]
  6d6b8ad4_1cfe_fbb0_e58e_76a50487c135["test_text_splitters.py"]
  1cd2c40f_c8ef_d87a_f9a4_67d3bc4d599a -->|defined in| 6d6b8ad4_1cfe_fbb0_e58e_76a50487c135
  b5d48f00_ae6e_36fa_d4c3_b94b9e390692["html_header_splitter_splitter_factory()"]
  1cd2c40f_c8ef_d87a_f9a4_67d3bc4d599a -->|calls| b5d48f00_ae6e_36fa_d4c3_b94b9e390692
  style 1cd2c40f_c8ef_d87a_f9a4_67d3bc4d599a fill:#6366f1,stroke:#818cf8,color:#fff
Source Code
libs/text-splitters/tests/unit_tests/test_text_splitters.py lines 2807–2848
def test_additional_html_header_text_splitter(
    html_header_splitter_splitter_factory: Callable[
        [list[tuple[str, str]]], HTMLHeaderTextSplitter
    ],
    headers_to_split_on: list[tuple[str, str]],
    html_content: str,
    expected_output: list[Document],
    test_case: str,
) -> None:
    """Test the HTML header text splitter.

    Args:
        html_header_splitter_splitter_factory: Factory function to create the HTML
            header splitter.
        headers_to_split_on: List of headers to split on.
        html_content: HTML content to be split.
        expected_output: Expected list of `Document` objects.
        test_case: Description of the test case.

    Raises:
        AssertionError: If the number of documents or their content/metadata
            does not match the expected output.
    """
    splitter = html_header_splitter_splitter_factory(headers_to_split_on)
    docs = splitter.split_text(html_content)

    assert len(docs) == len(expected_output), (
        f"{test_case} Failed: Number of documents mismatch. "
        f"Expected {len(expected_output)}, got {len(docs)}."
    )
    for idx, (doc, expected) in enumerate(
        zip(docs, expected_output, strict=False), start=1
    ):
        assert doc.page_content == expected.page_content, (
            f"{test_case} Failed at Document {idx}: "
            f"Content mismatch.\nExpected: {expected.page_content}\n"
            f"Got: {doc.page_content}"
        )
        assert doc.metadata == expected.metadata, (
            f"{test_case} Failed at Document {idx}: "
            f"Metadata mismatch.\nExpected: {expected.metadata}\nGot: {doc.metadata}"
        )
Frequently Asked Questions
What does test_additional_html_header_text_splitter() do?
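test_additional_html_header_text_splitter() is a unit test in the langchain codebase, defined in libs/text-splitters/tests/unit_tests/test_text_splitters.py. It builds an HTMLHeaderTextSplitter by passing headers_to_split_on to the html_header_splitter_splitter_factory callable, splits html_content with split_text(), and asserts that the number of resulting documents matches expected_output and that each document's page_content and metadata equal those of its expected counterpart.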
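The three checks the test performs (document count, then content and metadata for each pair) can be sketched without langchain installed. This is a minimal illustration only: `Doc` is a hypothetical stand-in for langchain's `Document`, and `check_documents` is a helper name invented here, not something in the codebase. The assertion messages are shortened from the ones in the source.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for langchain's Document (not the real class)."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def check_documents(docs: list[Doc], expected: list[Doc], test_case: str) -> None:
    """Mirror the test's checks: count first, then content and metadata per pair."""
    assert len(docs) == len(expected), (
        f"{test_case} Failed: Number of documents mismatch. "
        f"Expected {len(expected)}, got {len(docs)}."
    )
    # The lengths are known to be equal at this point, so zip pairs every item.
    for idx, (doc, exp) in enumerate(zip(docs, expected), start=1):
        assert doc.page_content == exp.page_content, (
            f"{test_case} Failed at Document {idx}: Content mismatch."
        )
        assert doc.metadata == exp.metadata, (
            f"{test_case} Failed at Document {idx}: Metadata mismatch."
        )


expected = [Doc("Intro text", {"Header 1": "Title"})]

# Identical documents pass silently.
check_documents([Doc("Intro text", {"Header 1": "Title"})], expected, "demo")

# A metadata difference is reported with the 1-based document index.
try:
    check_documents([Doc("Intro text", {"Header 1": "Other"})], expected, "demo")
except AssertionError as err:
    message = str(err)
```

Checking the count before iterating matters because `zip` stops at the shorter input; without the length assertion, a missing or extra document would go unnoticed.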
Where is test_additional_html_header_text_splitter() defined?
test_additional_html_header_text_splitter() is defined in libs/text-splitters/tests/unit_tests/test_text_splitters.py at line 2807.
What does test_additional_html_header_text_splitter() call?
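test_additional_html_header_text_splitter() calls one function tracked in the dependency graph: html_header_splitter_splitter_factory(). It also calls split_text() on the splitter that the factory returns.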
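The factory parameter is typed as a callable that takes a list of (tag, name) tuples and returns a splitter. A rough sketch of that pattern follows, assuming the factory is supplied by the test suite (presumably as a pytest fixture, which the excerpt does not show). `StubSplitter` and `splitter_factory` are hypothetical names used only for illustration; the real factory returns an `HTMLHeaderTextSplitter`.

```python
from collections.abc import Callable


class StubSplitter:
    """Hypothetical stand-in for HTMLHeaderTextSplitter; it only records its config."""

    def __init__(self, headers_to_split_on: list[tuple[str, str]]) -> None:
        self.headers_to_split_on = headers_to_split_on


def splitter_factory() -> Callable[[list[tuple[str, str]]], StubSplitter]:
    """Return a function that builds a fresh splitter for each header configuration."""

    def _create(headers_to_split_on: list[tuple[str, str]]) -> StubSplitter:
        return StubSplitter(headers_to_split_on=headers_to_split_on)

    return _create


# Each call to the factory yields an independent splitter instance.
make_splitter = splitter_factory()
h1_only = make_splitter([("h1", "Header 1")])
h1_and_h2 = make_splitter([("h1", "Header 1"), ("h2", "Header 2")])
```

Handing the test a factory rather than a ready-made splitter lets each test case supply its own headers_to_split_on while sharing the construction logic.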