test_html_splitter_preserved_elements_reverse_order() — langchain Function Reference

Architecture documentation for the test_html_splitter_preserved_elements_reverse_order() function in test_text_splitters.py from the langchain codebase.

Dependency Diagram

graph TD
  84ba97a6_1e8d_6bce_6a40_9f62650d3c11["test_html_splitter_preserved_elements_reverse_order()"]
  6d6b8ad4_1cfe_fbb0_e58e_76a50487c135["test_text_splitters.py"]
  84ba97a6_1e8d_6bce_6a40_9f62650d3c11 -->|defined in| 6d6b8ad4_1cfe_fbb0_e58e_76a50487c135
  style 84ba97a6_1e8d_6bce_6a40_9f62650d3c11 fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

libs/text-splitters/tests/unit_tests/test_text_splitters.py lines 3978–4015

def test_html_splitter_preserved_elements_reverse_order() -> None:
    """Test HTML splitter with preserved elements and conflicting placeholders.

    This test validates that preserved elements are reinserted in reverse order
    to prevent conflicts when one placeholder might be a substring of another.
    """
    html_content = """
    <h1>Section 1</h1>
    <table>
        <tr><td>Table 1 content</td></tr>
    </table>
    <p>Some text between tables</p>
    <table>
        <tr><td>Table 10 content</td></tr>
    </table>
    <ul>
        <li>List item 1</li>
        <li>List item 10</li>
    </ul>
    """
    with suppress_langchain_beta_warning():
        splitter = HTMLSemanticPreservingSplitter(
            headers_to_split_on=[("h1", "Header 1")],
            elements_to_preserve=["table", "ul"],
            max_chunk_size=100,
        )
    documents = splitter.split_text(html_content)

    # Verify that all preserved elements are correctly reinserted
    # This would fail if placeholders were processed in forward order
    # when one placeholder is a substring of another
    assert len(documents) >= 1
    # Check that table content is preserved
    content = " ".join(doc.page_content for doc in documents)
    assert "Table 1 content" in content
    assert "Table 10 content" in content
    assert "List item 1" in content
    assert "List item 10" in content
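
The docstring's concern, that one placeholder might be a substring of another, can be illustrated in isolation. The sketch below is not the splitter's implementation; the `PRESERVED_<index>` placeholder format and the `restore` helper are assumptions made up for illustration. It shows how naive string replacement in forward order corrupts the longer placeholder, while reverse order restores both elements.

```python
def restore(text: str, preserved: dict[str, str], reverse: bool) -> str:
    """Substitute each placeholder in `text` with its original HTML."""
    items = list(preserved.items())
    if reverse:
        # Process the most recently added (higher-index) placeholders first.
        items.reverse()
    for placeholder, original in items:
        text = text.replace(placeholder, original)
    return text


# Hypothetical placeholder scheme: "PRESERVED_<index>" with no closing
# delimiter, so "PRESERVED_1" is a prefix of "PRESERVED_10".
preserved = {
    "PRESERVED_1": "<table>Table 1 content</table>",
    "PRESERVED_10": "<table>Table 10 content</table>",
}
text = "PRESERVED_1 | PRESERVED_10"
expected = "<table>Table 1 content</table> | <table>Table 10 content</table>"

forward = restore(text, preserved, reverse=False)
backward = restore(text, preserved, reverse=True)

print(forward)   # the second table is replaced by the first plus a stray "0"
print(backward)  # both tables restored intact
```

In forward order, replacing `PRESERVED_1` also consumes the first eleven characters of `PRESERVED_10`, so the second table's content never comes back. Reversing the order handles the longer placeholder before its shorter prefix can match inside it.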

Frequently Asked Questions

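What does test_html_splitter_preserved_elements_reverse_order() do?
test_html_splitter_preserved_elements_reverse_order() is a unit test in the langchain codebase, defined in libs/text-splitters/tests/unit_tests/test_text_splitters.py. It builds an HTMLSemanticPreservingSplitter that splits on h1 headers, preserves table and ul elements, and uses max_chunk_size=100. It then splits a small HTML document and asserts that "Table 1 content", "Table 10 content", "List item 1", and "List item 10" all appear in the combined page content. Per its docstring, the test validates that preserved elements are reinserted in reverse order, which prevents conflicts when one placeholder might be a substring of another.

Where is test_html_splitter_preserved_elements_reverse_order() defined?
test_html_splitter_preserved_elements_reverse_order() is defined in libs/text-splitters/tests/unit_tests/test_text_splitters.py, starting at line 3978.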
