_split_text_with_regex() — langchain Function Reference

Architecture documentation for the _split_text_with_regex() function in character.py from the langchain codebase.

Dependency Diagram

graph TD
  4df0af83_3c38_0acb_3015_c017f66de0cc["_split_text_with_regex()"]
  2928a4a1_9408_cbea_fa7c_7f66eab697a2["character.py"]
  4df0af83_3c38_0acb_3015_c017f66de0cc -->|defined in| 2928a4a1_9408_cbea_fa7c_7f66eab697a2
  11fcdf48_fab6_27e3_ca6c_8904e5348e23["split_text()"]
  11fcdf48_fab6_27e3_ca6c_8904e5348e23 -->|calls| 4df0af83_3c38_0acb_3015_c017f66de0cc
  90247d99_f9d1_5357_ea60_e7b8e740431f["_split_text()"]
  90247d99_f9d1_5357_ea60_e7b8e740431f -->|calls| 4df0af83_3c38_0acb_3015_c017f66de0cc
  style 4df0af83_3c38_0acb_3015_c017f66de0cc fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

libs/text-splitters/langchain_text_splitters/character.py lines 61–85

def _split_text_with_regex(
    text: str, separator: str, *, keep_separator: bool | Literal["start", "end"]
) -> list[str]:
    # Now that we have the separator, split the text
    if separator:
        if keep_separator:
            # The parentheses in the pattern keep the delimiters in the result.
            splits_ = re.split(f"({separator})", text)
            splits = (
                ([splits_[i] + splits_[i + 1] for i in range(0, len(splits_) - 1, 2)])
                if keep_separator == "end"
                else ([splits_[i] + splits_[i + 1] for i in range(1, len(splits_), 2)])
            )
            if len(splits_) % 2 == 0:
                splits += splits_[-1:]
            splits = (
                ([*splits, splits_[-1]])
                if keep_separator == "end"
                else ([splits_[0], *splits])
            )
        else:
            splits = re.split(separator, text)
    else:
        splits = list(text)
    return [s for s in splits if s]
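
The three keep_separator modes are easiest to compare side by side. The sketch below restates the logic above as a standalone function so it can be run without installing the package. The name split_keep_separator is hypothetical and not part of langchain, and the sketch assumes the separator pattern contains no capture groups of its own.

```python
from __future__ import annotations

import re
from typing import Literal


def split_keep_separator(
    text: str, separator: str, keep_separator: bool | Literal["start", "end"]
) -> list[str]:
    """Hypothetical standalone restatement of the excerpt, for experimentation."""
    if not separator:
        # An empty separator means "split into single characters".
        return [ch for ch in text if ch]
    if not keep_separator:
        # Plain split: separators are discarded.
        return [s for s in re.split(separator, text) if s]
    # A capturing group makes re.split return [text, sep, text, sep, ..., text].
    parts = re.split(f"({separator})", text)
    if keep_separator == "end":
        # Attach each separator to the piece that precedes it.
        merged = [parts[i] + parts[i + 1] for i in range(0, len(parts) - 1, 2)]
        merged.append(parts[-1])
    else:
        # True or "start": attach each separator to the piece that follows it.
        merged = [parts[0]]
        merged += [parts[i] + parts[i + 1] for i in range(1, len(parts), 2)]
    # Drop empty strings, as the original does.
    return [s for s in merged if s]


print(split_keep_separator("a,b,c", ",", False))    # ['a', 'b', 'c']
print(split_keep_separator("a,b,c", ",", "start"))  # ['a', ',b', ',c']
print(split_keep_separator("a,b,c", ",", "end"))    # ['a,', 'b,', 'c']
```

Passing True behaves the same as "start", because only the string "end" is compared explicitly.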

Frequently Asked Questions

What does _split_text_with_regex() do?
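_split_text_with_regex() splits text on a regular-expression separator and returns the non-empty pieces. When keep_separator is truthy, each matched separator is kept and attached either to the end of the preceding piece ("end") or to the start of the following piece (True or "start"). An empty separator splits the text into individual characters. The function is defined in libs/text-splitters/langchain_text_splitters/character.py.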
Where is _split_text_with_regex() defined?
_split_text_with_regex() is defined in libs/text-splitters/langchain_text_splitters/character.py at line 61.
What calls _split_text_with_regex()?
_split_text_with_regex() is called by two functions: _split_text() and split_text().
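
Because the function passes separator straight to re.split, any caller that wants a literal separator has to escape regex metacharacters first. The excerpt does not show how the two callers handle this, so the snippet below is only a minimal illustration of the difference, using the standard library's re.escape.

```python
import re

text = "a.b.c"

# Unescaped: "." is a regex wildcard, so every character counts as a separator
# and nothing is left once empty strings are dropped.
unescaped = [s for s in re.split(".", text) if s]

# Escaped: re.escape(".") yields a pattern that matches only a literal dot.
escaped = [s for s in re.split(re.escape("."), text) if s]

# With a capturing group, as in the keep_separator branch, the dots are retained.
with_separators = re.split(f"({re.escape('.')})", text)

print(unescaped)        # []
print(escaped)          # ['a', 'b', 'c']
print(with_separators)  # ['a', '.', 'b', '.', 'c']
```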
