test_html_splitter_with_mixed_preserve_and_filter() — langchain Function Reference
Architecture documentation for the test_html_splitter_with_mixed_preserve_and_filter() function in test_text_splitters.py from the langchain codebase.
Dependency Diagram
graph TD
    de09deb1_ef0a_c4ba_5051_d7aab6f920d8["test_html_splitter_with_mixed_preserve_and_filter()"]
    6d6b8ad4_1cfe_fbb0_e58e_76a50487c135["test_text_splitters.py"]
    de09deb1_ef0a_c4ba_5051_d7aab6f920d8 -->|defined in| 6d6b8ad4_1cfe_fbb0_e58e_76a50487c135
    style de09deb1_ef0a_c4ba_5051_d7aab6f920d8 fill:#6366f1,stroke:#818cf8,color:#fff
Source Code
libs/text-splitters/tests/unit_tests/test_text_splitters.py lines 3727–3759
def test_html_splitter_with_mixed_preserve_and_filter() -> None:
    """Test HTML splitting with both preserved elements and denylist tags."""
    html_content = """
    <h1>Section 1</h1>
    <table>
        <tr>
            <td>Keep this table</td>
            <td>Cell contents kept, span removed
                <span>This span should be removed.</span>
            </td>
        </tr>
    </table>
    <p>This paragraph should be kept.</p>
    <span>This span should be removed.</span>
    """
    with suppress_langchain_beta_warning():
        splitter = HTMLSemanticPreservingSplitter(
            headers_to_split_on=[("h1", "Header 1")],
            elements_to_preserve=["table"],
            denylist_tags=["span"],
            max_chunk_size=1000,
        )
    documents = splitter.split_text(html_content)
    expected = [
        Document(
            page_content="Keep this table Cell contents kept, span removed"
            " This paragraph should be kept.",
            metadata={"Header 1": "Section 1"},
        ),
    ]
    assert documents == expected
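The denylist behaviour the test asserts can be illustrated without the library. The following is a minimal standard-library sketch, not the implementation used by HTMLSemanticPreservingSplitter: the `DenylistTextExtractor` class and `visible_text` function are hypothetical names introduced here only for illustration. It shows how dropping every `span`, including one nested inside a table cell, leaves the text the test expects. Unlike the real splitter, this sketch keeps the heading text inline rather than moving it into metadata.

```python
from html.parser import HTMLParser


class DenylistTextExtractor(HTMLParser):
    """Hypothetical helper: collect text, skipping content of denylisted tags."""

    def __init__(self, denylist_tags):
        super().__init__()
        self.denylist_tags = set(denylist_tags)
        self.skip_depth = 0  # greater than 0 while inside a denylisted element
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.denylist_tags:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.denylist_tags and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            # Collapse runs of whitespace and ignore whitespace-only chunks.
            text = " ".join(data.split())
            if text:
                self.parts.append(text)


def visible_text(html, denylist_tags):
    """Return the whitespace-normalised text of `html` minus denylisted tags."""
    parser = DenylistTextExtractor(denylist_tags)
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


html_content = """
<h1>Section 1</h1>
<table>
  <tr>
    <td>Keep this table</td>
    <td>Cell contents kept, span removed
      <span>This span should be removed.</span>
    </td>
  </tr>
</table>
<p>This paragraph should be kept.</p>
<span>This span should be removed.</span>
"""

text = visible_text(html_content, denylist_tags=["span"])
print(text)
```

The printed text contains the table cell and paragraph contents but neither span, which matches the `page_content` the test expects once the heading is set aside.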
Frequently Asked Questions
What does test_html_splitter_with_mixed_preserve_and_filter() do?
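test_html_splitter_with_mixed_preserve_and_filter() is a unit test in the langchain codebase, defined in libs/text-splitters/tests/unit_tests/test_text_splitters.py. It checks that HTMLSemanticPreservingSplitter, configured with elements_to_preserve=["table"] and denylist_tags=["span"], keeps the table and paragraph text, drops every span (including the one nested inside a table cell), and returns a single Document whose metadata records the h1 heading as "Header 1": "Section 1".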
Where is test_html_splitter_with_mixed_preserve_and_filter() defined?
test_html_splitter_with_mixed_preserve_and_filter() is defined in libs/text-splitters/tests/unit_tests/test_text_splitters.py at line 3727.