test_html_splitter_with_nested_preserved_elements() — langchain Function Reference

Architecture documentation for the test_html_splitter_with_nested_preserved_elements() function in test_text_splitters.py from the langchain codebase.

Dependency Diagram

graph TD
  1fde3037_a2b3_e7e0_c89f_a1a36eb8cc95["test_html_splitter_with_nested_preserved_elements()"]
  6d6b8ad4_1cfe_fbb0_e58e_76a50487c135["test_text_splitters.py"]
  1fde3037_a2b3_e7e0_c89f_a1a36eb8cc95 -->|defined in| 6d6b8ad4_1cfe_fbb0_e58e_76a50487c135
  style 1fde3037_a2b3_e7e0_c89f_a1a36eb8cc95 fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

libs/text-splitters/tests/unit_tests/test_text_splitters.py lines 3479–3516

def test_html_splitter_with_nested_preserved_elements() -> None:
    """Test HTML splitter with preserved elements nested in containers.

    Test that preserved elements are correctly preserved even when they are
    nested inside other container elements like <section> or <article>.
    This is a regression test for issue #31569
    """
    html_content = """
    <article>
        <h1>Section 1</h1>
        <section>
            <p>Some context about the data:</p>
            <table>
                <tr><td>Col1</td><td>Col2</td></tr>
                <tr><td>Data1</td><td>Data2</td></tr>
            </table>
            <p>Conclusion about data.</p>
        </section>
    </article>
    """
    with suppress_langchain_beta_warning():
        splitter = HTMLSemanticPreservingSplitter(
            headers_to_split_on=[("h1", "Header 1")],
            elements_to_preserve=["table"],
            max_chunk_size=1000,
        )
    documents = splitter.split_text(html_content)

    # The table should be preserved in the output
    assert len(documents) == 1
    content = documents[0].page_content
    # Check that the table structure is maintained (not flattened)
    assert "Col1" in content
    assert "Col2" in content
    assert "Data1" in content
    assert "Data2" in content
    # Check metadata
    assert documents[0].metadata == {"Header 1": "Section 1"}
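
The behaviour this test guards, finding an element listed in elements_to_preserve even when it sits inside container elements such as <article> or <section>, can be sketched with the standard library alone. The block below is a minimal illustration and is not how HTMLSemanticPreservingSplitter is implemented; PreservedElementCollector, collect_preserved and SAMPLE_HTML are hypothetical names introduced here, and the sample markup is a compacted copy of the test input.

```python
from html.parser import HTMLParser


class PreservedElementCollector(HTMLParser):
    """Collect raw markup of selected elements at any nesting depth.

    Hypothetical helper for illustration only; not part of langchain.
    """

    def __init__(self, tags_to_preserve):
        super().__init__()
        self.tags_to_preserve = set(tags_to_preserve)
        self.preserved = []      # completed blocks, in document order
        self._open_tag = None    # preserved tag currently being captured
        self._depth = 0          # nesting count of that same tag
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if self._open_tag is None:
            if tag not in self.tags_to_preserve:
                return  # a container such as <article> or <section>
            self._open_tag = tag
        if tag == self._open_tag:
            self._depth += 1
        self._buffer.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are kept verbatim while capturing.
        if self._open_tag is not None:
            self._buffer.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self._open_tag is None:
            return
        self._buffer.append(f"</{tag}>")
        if tag == self._open_tag:
            self._depth -= 1
            if self._depth == 0:
                # The outermost preserved element has closed: emit it whole.
                self.preserved.append("".join(self._buffer))
                self._open_tag, self._buffer = None, []

    def handle_data(self, data):
        if self._open_tag is not None:
            self._buffer.append(data)


def collect_preserved(html, tags_to_preserve):
    """Return the markup of every preserved element found in html."""
    parser = PreservedElementCollector(tags_to_preserve)
    parser.feed(html)
    parser.close()
    return parser.preserved


# Compacted copy of the test input: the table is two containers deep.
SAMPLE_HTML = (
    "<article><h1>Section 1</h1><section>"
    "<p>Some context about the data:</p>"
    "<table><tr><td>Col1</td><td>Col2</td></tr>"
    "<tr><td>Data1</td><td>Data2</td></tr></table>"
    "<p>Conclusion about data.</p>"
    "</section></article>"
)

tables = collect_preserved(SAMPLE_HTML, ["table"])
print(len(tables))
print(tables[0])
```

Because the collector ignores container tags until it meets a preserved one, nesting depth does not affect whether the table is found. A search limited to top-level children would miss it, which is the kind of case the test's docstring describes.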

Frequently Asked Questions

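What does test_html_splitter_with_nested_preserved_elements() do?
test_html_splitter_with_nested_preserved_elements() is a unit test in the langchain codebase, defined in libs/text-splitters/tests/unit_tests/test_text_splitters.py. It is a regression test for issue #31569: it checks that HTMLSemanticPreservingSplitter keeps the content of a <table> listed in elements_to_preserve when that table is nested inside <section> and <article> containers. It asserts that the split yields a single document containing every table cell's text, with metadata {"Header 1": "Section 1"}.

Where is test_html_splitter_with_nested_preserved_elements() defined?
test_html_splitter_with_nested_preserved_elements() is defined in libs/text-splitters/tests/unit_tests/test_text_splitters.py, starting at line 3479.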
