split_text() — langchain Function Reference
Architecture documentation for the split_text() function in sentence_transformers.py from the langchain codebase.
Entity Profile
Dependency Diagram
graph TD
    a7a0dc6a_7652_b658_2bb9_d850d67979ca["split_text()"]
    059dfb7c_30ac_164c_5a3e_708a02d51601["SentenceTransformersTokenTextSplitter"]
    0172994e_4917_2bec_356d_ac072e832565["_encode()"]
    a7a0dc6a_7652_b658_2bb9_d850d67979ca -->|defined in| 059dfb7c_30ac_164c_5a3e_708a02d51601
    a7a0dc6a_7652_b658_2bb9_d850d67979ca -->|calls| a7a0dc6a_7652_b658_2bb9_d850d67979ca
    a7a0dc6a_7652_b658_2bb9_d850d67979ca -->|calls| 0172994e_4917_2bec_356d_ac072e832565
    style a7a0dc6a_7652_b658_2bb9_d850d67979ca fill:#6366f1,stroke:#818cf8,color:#fff
Source Code
libs/text-splitters/langchain_text_splitters/sentence_transformers.py lines 74–99
def split_text(self, text: str) -> list[str]:
    """Splits the input text into smaller components by splitting text on tokens.

    This method encodes the input text using a private `_encode` method, then
    strips the start and stop token IDs from the encoded result. It returns the
    processed segments as a list of strings.

    Args:
        text: The input text to be split.

    Returns:
        A list of string components derived from the input text after encoding and
        processing.
    """

    def encode_strip_start_and_stop_token_ids(text: str) -> list[int]:
        return self._encode(text)[1:-1]

    tokenizer = Tokenizer(
        chunk_overlap=self._chunk_overlap,
        tokens_per_chunk=self.tokens_per_chunk,
        decode=self.tokenizer.decode,
        encode=encode_strip_start_and_stop_token_ids,
    )

    return split_text_on_tokens(text=text, tokenizer=tokenizer)
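The body of split_text_on_tokens is not shown on this page. The following is a minimal sketch of overlapping token-window chunking, assuming a sliding window that advances by tokens_per_chunk minus chunk_overlap tokens. ToyTokenizer, chunk_on_tokens, toy_encode, and toy_decode are hypothetical stand-ins built on whitespace tokens; they illustrate the idea and are not the library's implementation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ToyTokenizer:
    """Hypothetical stand-in for the Tokenizer built in split_text()."""

    chunk_overlap: int
    tokens_per_chunk: int
    decode: Callable[[list[int]], str]
    encode: Callable[[str], list[int]]


def chunk_on_tokens(text: str, tokenizer: ToyTokenizer) -> list[str]:
    """Split text into overlapping windows of token IDs (assumed behavior)."""
    step = tokenizer.tokens_per_chunk - tokenizer.chunk_overlap
    if step <= 0:
        raise ValueError("tokens_per_chunk must be greater than chunk_overlap")

    input_ids = tokenizer.encode(text)
    chunks: list[str] = []
    start = 0
    while start < len(input_ids):
        end = min(start + tokenizer.tokens_per_chunk, len(input_ids))
        chunks.append(tokenizer.decode(input_ids[start:end]))
        if end == len(input_ids):
            break  # the last window reached the end of the input
        start += step
    return chunks


# Toy vocabulary: each distinct whitespace-separated word gets an integer ID.
SENTENCE = "the quick brown fox jumps over the lazy dog"
VOCAB = {word: idx for idx, word in enumerate(dict.fromkeys(SENTENCE.split()))}
INVERSE = {idx: word for word, idx in VOCAB.items()}


def toy_encode(text: str) -> list[int]:
    return [VOCAB[word] for word in text.split()]


def toy_decode(ids: list[int]) -> str:
    return " ".join(INVERSE[i] for i in ids)


tokenizer = ToyTokenizer(
    chunk_overlap=1,
    tokens_per_chunk=4,
    decode=toy_decode,
    encode=toy_encode,
)
chunks = chunk_on_tokens(SENTENCE, tokenizer)
print(chunks)
# ['the quick brown fox', 'fox jumps over the', 'the lazy dog']
```

With an overlap of one token, the last word of each chunk reappears as the first word of the next, which is the effect the chunk_overlap argument is meant to produce.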
Frequently Asked Questions
What does split_text() do?
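split_text() is a method of SentenceTransformersTokenTextSplitter, defined in libs/text-splitters/langchain_text_splitters/sentence_transformers.py. It splits the input text into chunks on token boundaries: it encodes the text with _encode, strips the start and stop token IDs, and passes a Tokenizer configured with the splitter's chunk overlap and tokens per chunk to split_text_on_tokens, which returns the chunks as a list of strings.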
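The nested helper in the source relies on the slice [1:-1] to drop the first and last token IDs. The sketch below illustrates that step in isolation, assuming an encoder that wraps the content IDs in one start marker and one stop marker. toy_encode_with_special_tokens and the marker values 101 and 102 are hypothetical placeholders, not the output of a real model tokenizer.

```python
# Placeholder marker IDs, chosen for illustration only.
START_ID = 101
STOP_ID = 102


def toy_encode_with_special_tokens(text: str) -> list[int]:
    """Hypothetical encoder: one ID per word, wrapped in start/stop markers."""
    content_ids = [len(word) for word in text.split()]  # toy ID = word length
    return [START_ID, *content_ids, STOP_ID]


def encode_strip_start_and_stop_token_ids(text: str) -> list[int]:
    """Drop the first and last IDs, mirroring the [1:-1] slice in split_text()."""
    return toy_encode_with_special_tokens(text)[1:-1]


raw_ids = toy_encode_with_special_tokens("split this text")
content_ids = encode_strip_start_and_stop_token_ids("split this text")
print(raw_ids)      # [101, 5, 4, 4, 102]
print(content_ids)  # [5, 4, 4]
```

Only the content IDs are then windowed into chunks, so under this assumption the markers never land inside a chunk when it is decoded back to text.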
Where is split_text() defined?
split_text() is defined in libs/text-splitters/langchain_text_splitters/sentence_transformers.py at line 74.
What does split_text() call?
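In the source shown, split_text() calls _encode (through the nested helper encode_strip_start_and_stop_token_ids), constructs a Tokenizer, and delegates to split_text_on_tokens. The dependency graph also records a split_text self-reference, although the body contains no recursive call.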
What calls split_text()?
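The dependency graph records 1 caller, split_text itself. The body shown contains no recursive call, so this entry reflects the self-reference in the graph rather than an external caller.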