test_nltk_text_splitter_with_add_start_index() — langchain Function Reference

Architecture documentation for the test_nltk_text_splitter_with_add_start_index() function in test_nlp_text_splitters.py from the langchain codebase.

Dependency Diagram

graph TD
  135485c5_44f0_0534_b020_ec00035724c3["test_nltk_text_splitter_with_add_start_index()"]
  a159cbba_51f0_5d34_8696_299b594bb0fe["test_nlp_text_splitters.py"]
  135485c5_44f0_0534_b020_ec00035724c3 -->|defined in| a159cbba_51f0_5d34_8696_299b594bb0fe
  style 135485c5_44f0_0534_b020_ec00035724c3 fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

libs/text-splitters/tests/integration_tests/test_nlp_text_splitters.py lines 105–123

def test_nltk_text_splitter_with_add_start_index() -> None:
    splitter = NLTKTextSplitter(
        chunk_size=80,
        chunk_overlap=0,
        separator="",
        use_span_tokenize=True,
        add_start_index=True,
    )
    txt = (
        "Innovation drives our success.        "
        "Collaboration fosters creative solutions. "
        "Efficiency enhances data management."
    )
    docs = [Document(txt)]
    chunks = splitter.split_documents(docs)
    assert len(chunks) == 2
    for chunk in chunks:
        s_i = chunk.metadata["start_index"]
        assert chunk.page_content == txt[s_i : s_i + len(chunk.page_content)]
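
The invariant this test checks, that each chunk's start_index points back to the chunk's exact position in the original text, can be illustrated without NLTK. The following is a minimal sketch, not the library's implementation: sentence_spans() and split_with_start_index() are hypothetical helpers, and a simple regular expression stands in for NLTK's span tokenizer. It assumes that sentences are merged by slicing the original text, so the whitespace between them is preserved.

```python
import re
from typing import List, Tuple


def sentence_spans(text: str) -> List[Tuple[int, int]]:
    """Hypothetical stand-in for a span tokenizer.

    Returns (start, end) character offsets of each sentence, where a
    sentence is a run of text ending in '.', '!' or '?'.
    """
    return [m.span() for m in re.finditer(r"[^.!?\s][^.!?]*[.!?]", text)]


def split_with_start_index(text: str, chunk_size: int) -> List[Tuple[int, str]]:
    """Greedily merge sentence spans into chunks of at most chunk_size chars.

    Each chunk is a slice of the original text, so the whitespace between
    merged sentences is kept and the recorded start offset stays valid.
    """
    chunks: List[Tuple[int, str]] = []
    start = end = None
    for s, e in sentence_spans(text):
        if start is None:
            start, end = s, e          # open the first chunk
        elif e - start <= chunk_size:
            end = e                    # sentence still fits: extend the chunk
        else:
            chunks.append((start, text[start:end]))
            start, end = s, e          # open a new chunk at this sentence
    if start is not None:
        chunks.append((start, text[start:end]))
    return chunks


txt = (
    "Innovation drives our success.        "
    "Collaboration fosters creative solutions. "
    "Efficiency enhances data management."
)
chunks = split_with_start_index(txt, chunk_size=80)

# The same check the test performs on chunk.metadata["start_index"].
for start_index, content in chunks:
    assert content == txt[start_index : start_index + len(content)]
```

Because each chunk is cut directly from txt rather than rebuilt by joining stripped sentences, the run of spaces after the first sentence survives inside the first chunk, which is what keeps the slice comparison valid.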

Frequently Asked Questions

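What does test_nltk_text_splitter_with_add_start_index() do?
test_nltk_text_splitter_with_add_start_index() is an integration test in the langchain codebase, defined in libs/text-splitters/tests/integration_tests/test_nlp_text_splitters.py. It builds an NLTKTextSplitter with chunk_size=80, chunk_overlap=0, separator="", use_span_tokenize=True, and add_start_index=True, then splits a three-sentence document whose sentences are separated by irregular whitespace. The test asserts that exactly two chunks are produced and that each chunk's start_index metadata locates its page_content in the original text.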
Where is test_nltk_text_splitter_with_add_start_index() defined?
test_nltk_text_splitter_with_add_start_index() is defined in libs/text-splitters/tests/integration_tests/test_nlp_text_splitters.py at line 105.
