split_text() — langchain Function Reference
Architecture documentation for the split_text() function in character.py from the langchain codebase.
Dependency Diagram
graph TD
    11fcdf48_fab6_27e3_ca6c_8904e5348e23["split_text()"]
    70b3caa4_8308_371e_5891_177bf03efb36["CharacterTextSplitter"]
    11fcdf48_fab6_27e3_ca6c_8904e5348e23 -->|defined in| 70b3caa4_8308_371e_5891_177bf03efb36
    cdc32315_d799_46f6_bd91_09d4da023d15["split_text()"]
    cdc32315_d799_46f6_bd91_09d4da023d15 -->|calls| 11fcdf48_fab6_27e3_ca6c_8904e5348e23
    11fcdf48_fab6_27e3_ca6c_8904e5348e23 -->|calls| cdc32315_d799_46f6_bd91_09d4da023d15
    4df0af83_3c38_0acb_3015_c017f66de0cc["_split_text_with_regex()"]
    11fcdf48_fab6_27e3_ca6c_8904e5348e23 -->|calls| 4df0af83_3c38_0acb_3015_c017f66de0cc
    style 11fcdf48_fab6_27e3_ca6c_8904e5348e23 fill:#6366f1,stroke:#818cf8,color:#fff
Source Code
libs/text-splitters/langchain_text_splitters/character.py lines 25–58
def split_text(self, text: str) -> list[str]:
    """Split into chunks without re-inserting lookaround separators.

    Args:
        text: The text to split.

    Returns:
        A list of text chunks.
    """
    # 1. Determine split pattern: raw regex or escaped literal
    sep_pattern = (
        self._separator if self._is_separator_regex else re.escape(self._separator)
    )

    # 2. Initial split (keep separator if requested)
    splits = _split_text_with_regex(
        text, sep_pattern, keep_separator=self._keep_separator
    )

    # 3. Detect zero-width lookaround so we never re-insert it
    lookaround_prefixes = ("(?=", "(?<!", "(?<=", "(?!")
    is_lookaround = self._is_separator_regex and any(
        self._separator.startswith(p) for p in lookaround_prefixes
    )

    # 4. Decide merge separator:
    #    - if keep_separator or lookaround -> don't re-insert
    #    - else -> re-insert literal separator
    merge_sep = ""
    if not (self._keep_separator or is_lookaround):
        merge_sep = self._separator

    # 5. Merge adjacent splits and return
    return self._merge_splits(splits, merge_sep)
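Steps 1, 3, and 4 of the listing can be tried in isolation. The following is a minimal standalone sketch using only the standard library; the function name plan_split and its parameters are hypothetical stand-ins for the instance attributes (_separator, _is_separator_regex, _keep_separator) and are not part of langchain.

```python
import re

# Same prefixes the listing checks in step 3.
LOOKAROUND_PREFIXES = ("(?=", "(?<!", "(?<=", "(?!")


def plan_split(
    separator: str, is_separator_regex: bool, keep_separator: bool
) -> tuple[str, str]:
    """Hypothetical helper: return (split pattern, merge separator)."""
    # Step 1: use the separator as a raw regex, or escape it as a literal.
    pattern = separator if is_separator_regex else re.escape(separator)
    # Step 3: a regex separator that starts with a lookaround prefix is zero-width.
    is_lookaround = is_separator_regex and separator.startswith(LOOKAROUND_PREFIXES)
    # Step 4: re-insert the separator only if it was dropped by the split
    # and is not a zero-width lookaround.
    merge_sep = "" if (keep_separator or is_lookaround) else separator
    return pattern, merge_sep
```

Under these assumptions, a literal separator with keep_separator=False yields the separator itself as the merge separator, while a lookaround regex or keep_separator=True yields an empty string.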
Frequently Asked Questions
What does split_text() do?
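split_text() is a method of CharacterTextSplitter in the langchain codebase, defined in libs/text-splitters/langchain_text_splitters/character.py. It splits the input text on the configured separator (used as a raw regex or as an escaped literal), then merges the pieces into chunks with _merge_splits(). The separator is re-inserted during merging only when it was not kept in the pieces and is not a zero-width lookaround pattern.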
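The docstring in the source listing says lookaround separators are not re-inserted. A small illustration of why, using only the standard library re module (Python 3.7 or later, where re.split accepts zero-width matches); the sample text and variable names are made up for this sketch, and the plain join stands in for the real merge step, which is not shown on this page.

```python
import re

text = "one. two. three"
# Zero-width lookbehind: matches the position right after ". " and consumes nothing.
separator = r"(?<=\. )"

pieces = re.split(separator, text)

# No characters were consumed by the split, so an empty merge separator
# reproduces the original text.
rejoined_plain = "".join(pieces)

# Joining with the pattern string would inject regex syntax into the output.
rejoined_with_pattern = separator.join(pieces)
```

Here rejoined_plain equals the original text, whereas rejoined_with_pattern contains the literal characters of the pattern between the pieces.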
Where is split_text() defined?
split_text() is defined in libs/text-splitters/langchain_text_splitters/character.py at line 25.
What does split_text() call?
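The dependency graph records 2 call edges from split_text(): _split_text_with_regex and split_text. The source listing itself also calls re.escape() and self._merge_splits(), and contains no direct call to split_text().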
What calls split_text()?
split_text() is called by 1 function: split_text, which appears as a separate node in the dependency graph.