split_text() — langchain Function Reference

Architecture documentation for split_text(), a method of the NLTKTextSplitter class defined in nltk.py in the langchain codebase.

Dependency Diagram

graph TD
  18590a0b_5de9_0196_2d21_608d79b9ef70["split_text()"]
  213eacdd_0034_0c08_d507_c27d0748affc["NLTKTextSplitter"]
  18590a0b_5de9_0196_2d21_608d79b9ef70 -->|defined in| 213eacdd_0034_0c08_d507_c27d0748affc
  18590a0b_5de9_0196_2d21_608d79b9ef70 -->|calls| 18590a0b_5de9_0196_2d21_608d79b9ef70
  1ad208c7_864b_e6dd_1344_b5ed70211298["_further_split_chunk()"]
  1ad208c7_864b_e6dd_1344_b5ed70211298 -->|calls| 18590a0b_5de9_0196_2d21_608d79b9ef70
  style 18590a0b_5de9_0196_2d21_608d79b9ef70 fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

libs/text-splitters/langchain_text_splitters/nltk.py lines 58–72

    def split_text(self, text: str) -> list[str]:
        # First we naively split the large input into a bunch of smaller ones.
        if self._use_span_tokenize:
            spans = list(self._tokenizer.span_tokenize(text))
            splits = []
            for i, (start, end) in enumerate(spans):
                if i > 0:
                    prev_end = spans[i - 1][1]
                    sentence = text[prev_end:start] + text[start:end]
                else:
                    sentence = text[start:end]
                splits.append(sentence)
        else:
            splits = self._tokenizer(text, language=self._language)
        return self._merge_splits(splits, self._separator)
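
The span-based branch above can be illustrated in isolation. The following is a minimal sketch, not the library's implementation: span_tokenize() here is a hypothetical regex-based stand-in for the NLTK tokenizer's span_tokenize(), and split_with_spans() is an illustrative name. It shows how attaching the gap between the previous span's end and the current span's start keeps the original whitespace with each sentence.

```python
import re


def span_tokenize(text: str) -> list[tuple[int, int]]:
    """Hypothetical stand-in for an NLTK span tokenizer.

    Returns one (start, end) pair per sentence, treating any run of
    '.', '!' or '?' as a sentence terminator.
    """
    return [m.span() for m in re.finditer(r"[^.!?\s][^.!?]*[.!?]+", text)]


def split_with_spans(text: str) -> list[str]:
    """Mirror the span branch of split_text() shown above."""
    spans = span_tokenize(text)
    splits = []
    for i, (start, end) in enumerate(spans):
        if i > 0:
            # Prepend the whitespace that sits between the previous
            # sentence and this one, so no characters are dropped.
            prev_end = spans[i - 1][1]
            sentence = text[prev_end:start] + text[start:end]
        else:
            sentence = text[start:end]
        splits.append(sentence)
    return splits


sample = "First sentence.  Second one!\nThird?"
pieces = split_with_spans(sample)
print(pieces)
```

Because each piece after the first carries its leading whitespace, joining the pieces with an empty string reproduces the input for this sample, from the start of the first span to the end of the last.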

Frequently Asked Questions

What does split_text() do?
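split_text() is a method of NLTKTextSplitter, defined in libs/text-splitters/langchain_text_splitters/nltk.py. It splits the input text into sentences and passes them to _merge_splits() together with the configured separator. When _use_span_tokenize is set, it takes sentence spans from the tokenizer's span_tokenize() and prepends to each sentence after the first the text between the previous span's end and its own start; otherwise it calls the tokenizer directly with the configured language.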
Where is split_text() defined?
split_text() is defined in libs/text-splitters/langchain_text_splitters/nltk.py at line 58.
What does split_text() call?
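The dependency graph records one call target: split_text. In the source shown above, the method body invokes self._tokenizer.span_tokenize() or self._tokenizer(), followed by self._merge_splits().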
What calls split_text()?
The dependency graph lists two callers: _further_split_chunk() and split_text().
