split_text() — langchain Function Reference
Architecture documentation for the split_text() function in nltk.py from the langchain codebase.
Dependency Diagram
graph TD
    18590a0b_5de9_0196_2d21_608d79b9ef70["split_text()"]
    213eacdd_0034_0c08_d507_c27d0748affc["NLTKTextSplitter"]
    1ad208c7_864b_e6dd_1344_b5ed70211298["_further_split_chunk()"]
    18590a0b_5de9_0196_2d21_608d79b9ef70 -->|defined in| 213eacdd_0034_0c08_d507_c27d0748affc
    18590a0b_5de9_0196_2d21_608d79b9ef70 -->|calls| 18590a0b_5de9_0196_2d21_608d79b9ef70
    1ad208c7_864b_e6dd_1344_b5ed70211298 -->|calls| 18590a0b_5de9_0196_2d21_608d79b9ef70
    style 18590a0b_5de9_0196_2d21_608d79b9ef70 fill:#6366f1,stroke:#818cf8,color:#fff
Source Code
libs/text-splitters/langchain_text_splitters/nltk.py lines 58–72
def split_text(self, text: str) -> list[str]:
    # First we naively split the large input into a bunch of smaller ones.
    if self._use_span_tokenize:
        spans = list(self._tokenizer.span_tokenize(text))
        splits = []
        for i, (start, end) in enumerate(spans):
            if i > 0:
                prev_end = spans[i - 1][1]
                sentence = text[prev_end:start] + text[start:end]
            else:
                sentence = text[start:end]
            splits.append(sentence)
    else:
        splits = self._tokenizer(text, language=self._language)
    return self._merge_splits(splits, self._separator)
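The span-based branch can be tried in isolation with a minimal sketch. It assumes a regex-based stand-in for the tokenizer: `sentence_spans` and `splits_from_spans` are hypothetical names for illustration, not part of langchain or NLTK, and the regex is far cruder than a real sentence tokenizer. The loop itself mirrors the one in the source above.

```python
import re


def sentence_spans(text: str) -> list[tuple[int, int]]:
    """Hypothetical stand-in for span_tokenize(): (start, end) offsets of each
    sentence, excluding the whitespace that separates sentences."""
    return [m.span() for m in re.finditer(r"[^.!?\s][^.!?]*[.!?]?", text)]


def splits_from_spans(text: str, spans: list[tuple[int, int]]) -> list[str]:
    """Rebuild sentences from spans, attaching the gap before each span
    (from the previous span's end) to the sentence that follows it."""
    splits = []
    for i, (start, end) in enumerate(spans):
        if i > 0:
            prev_end = spans[i - 1][1]
            sentence = text[prev_end:start] + text[start:end]
        else:
            sentence = text[start:end]
        splits.append(sentence)
    return splits


text = "Hello world.  How are you? Fine."
splits = splits_from_spans(text, sentence_spans(text))
print(splits)
```

Because each gap is carried along with the sentence after it, concatenating the splits with an empty separator reproduces the input whenever the first span starts at offset 0 and the last span ends at the end of the text.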
Frequently Asked Questions
What does split_text() do?
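split_text() is a method of NLTKTextSplitter, defined in libs/text-splitters/langchain_text_splitters/nltk.py. It splits the input text into sentences with the configured tokenizer and passes the result to _merge_splits() together with the configured separator. When _use_span_tokenize is set, it reads sentence offsets from the tokenizer's span_tokenize() and prepends to every sentence after the first the text lying between the previous span's end and the current span's start, so the characters between sentences are kept. Otherwise it calls the tokenizer directly with the configured language.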
Where is split_text() defined?
split_text() is defined in libs/text-splitters/langchain_text_splitters/nltk.py at line 58.
What does split_text() call?
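The dependency graph records 1 call edge for split_text(), pointing to split_text. The body shown above contains no recursive call: it invokes self._tokenizer.span_tokenize() or self._tokenizer(), and then self._merge_splits().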
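The source above ends by handing the sentence list to `_merge_splits()`, whose body is not shown on this page. As a rough illustration of what a merge step of this kind does, here is a simplified greedy sketch. `merge_splits_sketch` and its `chunk_size` parameter are assumptions made for this example; it ignores chunk overlap and whitespace stripping, and should not be read as the library's implementation.

```python
def merge_splits_sketch(splits: list[str], separator: str, chunk_size: int) -> list[str]:
    """Greedily join consecutive splits with `separator`, starting a new chunk
    whenever adding the next split would push the chunk past `chunk_size`.
    Simplified illustration only: no overlap between chunks."""
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for piece in splits:
        # Cost of adding this piece, including a separator if the chunk is non-empty.
        extra = len(piece) + (len(separator) if current else 0)
        if current and current_len + extra > chunk_size:
            chunks.append(separator.join(current))
            current, current_len = [], 0
            extra = len(piece)
        current.append(piece)
        current_len += extra
    if current:
        chunks.append(separator.join(current))
    return chunks


sentences = ["Hello world.", "  How are you?", " Fine."]
merged = merge_splits_sketch(sentences, separator="", chunk_size=20)
print(merged)
```

A split longer than `chunk_size` is emitted as its own oversized chunk in this sketch rather than being cut further.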
What calls split_text()?
According to the dependency graph, split_text() is called by 2 functions: _further_split_chunk and split_text.