test_html_splitter_with_preserved_elements() — langchain Function Reference

Architecture documentation for the test_html_splitter_with_preserved_elements() function in test_text_splitters.py from the langchain codebase.

Dependency Diagram

graph TD
  d660f993_0ada_1354_abd9_79fdedd99741["test_html_splitter_with_preserved_elements()"]
  6d6b8ad4_1cfe_fbb0_e58e_76a50487c135["test_text_splitters.py"]
  d660f993_0ada_1354_abd9_79fdedd99741 -->|defined in| 6d6b8ad4_1cfe_fbb0_e58e_76a50487c135
  style d660f993_0ada_1354_abd9_79fdedd99741 fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

libs/text-splitters/tests/unit_tests/test_text_splitters.py lines 3443–3475

def test_html_splitter_with_preserved_elements() -> None:
    """Test HTML splitter with preserved elements.

    Test HTML splitting with preserved elements like <table>, <ul> with low chunk
    size.
    """
    html_content = """
    <h1>Section 1</h1>
    <table>
        <tr><td>Row 1</td></tr>
        <tr><td>Row 2</td></tr>
    </table>
    <ul>
        <li>Item 1</li>
        <li>Item 2</li>
    </ul>
    """
    with suppress_langchain_beta_warning():
        splitter = HTMLSemanticPreservingSplitter(
            headers_to_split_on=[("h1", "Header 1")],
            elements_to_preserve=["table", "ul"],
            max_chunk_size=50,  # Deliberately low to test preservation
        )
    documents = splitter.split_text(html_content)

    expected = [
        Document(
            page_content="Row 1 Row 2 Item 1 Item 2",
            metadata={"Header 1": "Section 1"},
        ),
    ]

    assert documents == expected  # Shouldn't split the table or ul
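
The idea the test checks, that text from preserved elements stays in one piece even when the size limit is small, can be illustrated with a minimal standalone sketch. This is not the HTMLSemanticPreservingSplitter implementation; `Unit`, `split_words`, and `chunk_units` are hypothetical names invented for illustration, and the real splitter may combine and order chunks differently.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    """A piece of extracted text (hypothetical helper type)."""

    text: str
    preserve: bool  # True if the text came from an element that must stay whole


def split_words(text: str, max_size: int) -> list[str]:
    """Greedily pack whitespace-separated words into pieces of at most max_size chars.

    A single word longer than max_size is kept whole rather than cut.
    """
    pieces: list[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if current and len(candidate) > max_size:
            pieces.append(current)
            current = word
        else:
            current = candidate
    if current:
        pieces.append(current)
    return pieces


def chunk_units(units: list[Unit], max_size: int) -> list[str]:
    """Split ordinary units by size; emit preserved units untouched."""
    chunks: list[str] = []
    for unit in units:
        if unit.preserve:
            # Preserved text bypasses the size limit entirely.
            chunks.append(unit.text)
        else:
            chunks.extend(split_words(unit.text, max_size))
    return chunks


if __name__ == "__main__":
    units = [
        Unit("alpha beta gamma delta", preserve=False),
        Unit("Row 1 Row 2 Row 3", preserve=True),
    ]
    for chunk in chunk_units(units, max_size=10):
        print(repr(chunk))
```

In this sketch the ordinary paragraph is broken into pieces of at most 10 characters, while the 17-character preserved unit is emitted unchanged, which mirrors the "Shouldn't split the table or ul" assertion in the test above.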

Frequently Asked Questions

What does test_html_splitter_with_preserved_elements() do?
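test_html_splitter_with_preserved_elements() is a unit test in libs/text-splitters/tests/unit_tests/test_text_splitters.py. It builds an HTMLSemanticPreservingSplitter with headers_to_split_on=[("h1", "Header 1")], elements_to_preserve=["table", "ul"], and max_chunk_size=50, splits a small HTML snippet, and asserts that the result is a single Document with page_content "Row 1 Row 2 Item 1 Item 2" and metadata {"Header 1": "Section 1"}, confirming that the table and the list are not split.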
Where is test_html_splitter_with_preserved_elements() defined?
test_html_splitter_with_preserved_elements() is defined in libs/text-splitters/tests/unit_tests/test_text_splitters.py at line 3443.
