Home / Function/ test_happy_path_splitting_based_on_header_with_font_size() — langchain Function Reference

test_happy_path_splitting_based_on_header_with_font_size() — langchain Function Reference

Architecture documentation for the test_happy_path_splitting_based_on_header_with_font_size() function in test_text_splitters.py from the langchain codebase.

Entity Profile

Dependency Diagram

graph TD
  02f51f3f_b32d_884d_3c23_ae0f398bc307["test_happy_path_splitting_based_on_header_with_font_size()"]
  6d6b8ad4_1cfe_fbb0_e58e_76a50487c135["test_text_splitters.py"]
  02f51f3f_b32d_884d_3c23_ae0f398bc307 -->|defined in| 6d6b8ad4_1cfe_fbb0_e58e_76a50487c135
  style 02f51f3f_b32d_884d_3c23_ae0f398bc307 fill:#6366f1,stroke:#818cf8,color:#fff

Relationship Graph

Source Code

libs/text-splitters/tests/unit_tests/test_text_splitters.py lines 3009–3055

def test_happy_path_splitting_based_on_header_with_font_size() -> None:
    # arrange
    html_string = """<!DOCTYPE html>
            <html>
            <body>
                <div>
                    <span style="font-size: 22px">Foo</span>
                    <p>Some intro text about Foo.</p>
                    <div>
                        <h2>Bar main section</h2>
                        <p>Some intro text about Bar.</p>
                        <h3>Bar subsection 1</h3>
                        <p>Some text about the first subtopic of Bar.</p>
                        <h3>Bar subsection 2</h3>
                        <p>Some text about the second subtopic of Bar.</p>
                    </div>
                    <div>
                        <h2>Baz</h2>
                        <p>Some text about Baz</p>
                    </div>
                    <br>
                    <p>Some concluding text about Foo</p>
                </div>
            </body>
            </html>"""

    sec_splitter = HTMLSectionSplitter(
        headers_to_split_on=[("h1", "Header 1"), ("h2", "Header 2")]
    )

    docs = sec_splitter.split_text(html_string)

    assert len(docs) == 3
    assert docs[0].page_content == "Foo \n Some intro text about Foo."
    assert docs[0].metadata["Header 1"] == "Foo"

    assert docs[1].page_content == (
        "Bar main section \n Some intro text about Bar. \n "
        "Bar subsection 1 \n Some text about the first subtopic of Bar. \n "
        "Bar subsection 2 \n Some text about the second subtopic of Bar."
    )
    assert docs[1].metadata["Header 2"] == "Bar main section"

    assert docs[2].page_content == (
        "Baz \n Some text about Baz \n \n \n Some concluding text about Foo"
    )
    assert docs[2].metadata["Header 2"] == "Baz"

Domain

Subdomains

Frequently Asked Questions

What does test_happy_path_splitting_based_on_header_with_font_size() do?
test_happy_path_splitting_based_on_header_with_font_size() is a function in the langchain codebase, defined in libs/text-splitters/tests/unit_tests/test_text_splitters.py.
Where is test_happy_path_splitting_based_on_header_with_font_size() defined?
test_happy_path_splitting_based_on_header_with_font_size() is defined in libs/text-splitters/tests/unit_tests/test_text_splitters.py at line 3009.

Analyze Your Own Codebase

Get architecture documentation, dependency graphs, and domain analysis for your codebase in minutes.

Try Supermodel Free