_split_text() — langchain Function Reference
Architecture documentation for the _split_text() function in character.py from the langchain codebase.
Dependency Diagram
graph TD
    90247d99_f9d1_5357_ea60_e7b8e740431f["_split_text()"]
    22d8d30b_9b36_1532_bb1c_4c9aa03a4bb8["RecursiveCharacterTextSplitter"]
    90247d99_f9d1_5357_ea60_e7b8e740431f -->|defined in| 22d8d30b_9b36_1532_bb1c_4c9aa03a4bb8
    cdc32315_d799_46f6_bd91_09d4da023d15["split_text()"]
    cdc32315_d799_46f6_bd91_09d4da023d15 -->|calls| 90247d99_f9d1_5357_ea60_e7b8e740431f
    4df0af83_3c38_0acb_3015_c017f66de0cc["_split_text_with_regex()"]
    90247d99_f9d1_5357_ea60_e7b8e740431f -->|calls| 4df0af83_3c38_0acb_3015_c017f66de0cc
    style 90247d99_f9d1_5357_ea60_e7b8e740431f fill:#6366f1,stroke:#818cf8,color:#fff
Source Code
libs/text-splitters/langchain_text_splitters/character.py lines 107–147
def _split_text(self, text: str, separators: list[str]) -> list[str]:
    """Split incoming text and return chunks."""
    final_chunks = []
    # Get appropriate separator to use
    separator = separators[-1]
    new_separators = []
    for i, s_ in enumerate(separators):
        separator_ = s_ if self._is_separator_regex else re.escape(s_)
        if not s_:
            separator = s_
            break
        if re.search(separator_, text):
            separator = s_
            new_separators = separators[i + 1 :]
            break

    separator_ = separator if self._is_separator_regex else re.escape(separator)
    splits = _split_text_with_regex(
        text, separator_, keep_separator=self._keep_separator
    )

    # Now go merging things, recursively splitting longer texts.
    good_splits = []
    separator_ = "" if self._keep_separator else separator
    for s in splits:
        if self._length_function(s) < self._chunk_size:
            good_splits.append(s)
        else:
            if good_splits:
                merged_text = self._merge_splits(good_splits, separator_)
                final_chunks.extend(merged_text)
                good_splits = []
            if not new_separators:
                final_chunks.append(s)
            else:
                other_info = self._split_text(s, new_separators)
                final_chunks.extend(other_info)
    if good_splits:
        merged_text = self._merge_splits(good_splits, separator_)
        final_chunks.extend(merged_text)
    return final_chunks
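The first loop in the listing decides which separator to split on and which separators remain for recursion. The sketch below isolates that step as a free function so it can be run on its own. The name `pick_separator` and its parameters are illustrative assumptions, not part of the library.

```python
import re


def pick_separator(text, separators, is_separator_regex=False):
    """Mirror the separator-selection loop; hypothetical helper, not library API."""
    separator = separators[-1]  # fallback when no candidate matches
    remaining = []
    for i, candidate in enumerate(separators):
        pattern = candidate if is_separator_regex else re.escape(candidate)
        if not candidate:
            # Empty string: nothing finer remains to recurse with.
            separator = candidate
            break
        if re.search(pattern, text):
            separator = candidate
            remaining = separators[i + 1:]
            break
    return separator, remaining


seps = ["\n\n", "\n", " ", ""]
print(pick_separator("one\n\ntwo three", seps))  # ('\n\n', ['\n', ' ', ''])
print(pick_separator("two three", seps))         # (' ', [''])
print(pick_separator("word", seps))              # ('', [])
```

The order of `separators` therefore acts as a priority list: coarser separators come first, and only those after the chosen one are passed down to the recursive call.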
Frequently Asked Questions
What does _split_text() do?
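_split_text() is a method of RecursiveCharacterTextSplitter in the langchain codebase, defined in libs/text-splitters/langchain_text_splitters/character.py. It picks the first entry in separators that occurs in the text (falling back to the last entry), splits the text on it, merges pieces shorter than the chunk size with _merge_splits(), and recursively re-splits every other piece with the remaining separators. A piece that is still too long when no separators remain is emitted unchanged.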
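The split, merge, and recurse flow can be tried without the library using the simplified sketch below. It is not the library implementation: `recursive_split` and `merge_small` are hypothetical names, separators are treated as literal strings, separators are dropped rather than kept, length is measured with `len`, and the merge step is a plain greedy join with no chunk overlap.

```python
def merge_small(pieces, joiner, chunk_size):
    """Greedy stand-in for the merge step: join neighbours while they fit."""
    chunks, current = [], ""
    for piece in pieces:
        candidate = piece if not current else current + joiner + piece
        if current and len(candidate) > chunk_size:
            chunks.append(current)
            current = piece
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def recursive_split(text, separators, chunk_size):
    # Pick the first separator present in the text, else the last one.
    separator, remaining = separators[-1], []
    for i, sep in enumerate(separators):
        if sep == "":
            separator = sep
            break
        if sep in text:
            separator, remaining = sep, separators[i + 1:]
            break
    # The empty separator means one piece per character.
    pieces = text.split(separator) if separator else list(text)
    pieces = [p for p in pieces if p]

    final_chunks, good = [], []
    for piece in pieces:
        if len(piece) < chunk_size:
            good.append(piece)
            continue
        if good:  # flush the short pieces collected so far
            final_chunks.extend(merge_small(good, separator, chunk_size))
            good = []
        if remaining:
            final_chunks.extend(recursive_split(piece, remaining, chunk_size))
        else:
            final_chunks.append(piece)  # nothing finer left; keep as is
    if good:
        final_chunks.extend(merge_small(good, separator, chunk_size))
    return final_chunks


print(recursive_split("aa bb cc\n\ndddddddddddd ee", ["\n\n", " ", ""], 10))
# ['aa bb cc', 'dddddddddd', 'dd', 'ee']
```

In this run the paragraph break is tried first, the second paragraph is too long and is re-split on spaces, and the 12-character word is finally broken per character and regrouped into pieces of at most 10 characters.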
Where is _split_text() defined?
_split_text() is defined in libs/text-splitters/langchain_text_splitters/character.py at line 107.
What does _split_text() call?
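The dependency graph records one call from _split_text(): _split_text_with_regex(). The source listing also shows calls to self._merge_splits() and self._length_function(), plus a recursive call to _split_text() itself.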
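Before the split call, the listing passes the separator through re.escape() unless _is_separator_regex is set. The sketch below shows why that matters. `split_on` is a hypothetical stand-in for the split step, not the library's _split_text_with_regex(): it drops separators, and its handling of the empty separator (one piece per character) is an assumption.

```python
import re


def split_on(text, separator, is_separator_regex=False):
    """Hypothetical stand-in for the split step, without separator retention."""
    if separator == "":
        return list(text)  # assumed behaviour: one piece per character
    pattern = separator if is_separator_regex else re.escape(separator)
    return [piece for piece in re.split(pattern, text) if piece != ""]


print(split_on("a.b.c", "."))                               # ['a', 'b', 'c']
print(split_on("a.b.c", ".", is_separator_regex=True))      # []
print(split_on("a1b22c", r"\d+", is_separator_regex=True))  # ['a', 'b', 'c']
```

Treated as a regex, "." matches every character, so nothing but empty pieces is left. Escaping keeps a literal separator literal, while the regex mode allows patterns such as `\d+`.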
What calls _split_text()?
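_split_text() is called by split_text(). It also calls itself recursively on any piece that is not shorter than the chunk size while finer separators remain.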