split_text() — langchain Function Reference

Architecture documentation for the split_text() method of JSFrameworkTextSplitter, defined in jsx.py in the langchain codebase.

Dependency Diagram

graph TD
  9784be88_5c99_10fd_04e7_dc7156a5b963["split_text()"]
  ca9449bb_53da_b30e_bfd5_acbf2e310d44["JSFrameworkTextSplitter"]
  9784be88_5c99_10fd_04e7_dc7156a5b963 -->|defined in| ca9449bb_53da_b30e_bfd5_acbf2e310d44
  style 9784be88_5c99_10fd_04e7_dc7156a5b963 fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

libs/text-splitters/langchain_text_splitters/jsx.py lines 46–102

    def split_text(self, text: str) -> list[str]:
        """Split text into chunks.

        This method splits the text into chunks by:

        * Extracting unique opening component tags using regex
        * Creating separators list with extracted tags and JS separators
        * Splitting the text using the separators by calling the parent class method

        Args:
            text: String containing code to split

        Returns:
            List of text chunks split on component and JS boundaries
        """
        # Extract unique opening component tags using regex
        # Regex matches opening tags; closing tags such as </div> are skipped
        opening_tags = re.findall(r"<\s*([a-zA-Z0-9]+)[^>]*>", text)

        component_tags = []
        for tag in opening_tags:
            if tag not in component_tags:
                component_tags.append(tag)
        component_separators = [f"<{tag}" for tag in component_tags]

        js_separators = [
            "\nexport ",
            " export ",
            "\nfunction ",
            "\nasync function ",
            " async function ",
            "\nconst ",
            "\nlet ",
            "\nvar ",
            "\nclass ",
            " class ",
            "\nif ",
            " if ",
            "\nfor ",
            " for ",
            "\nwhile ",
            " while ",
            "\nswitch ",
            " switch ",
            "\ncase ",
            " case ",
            "\ndefault ",
            " default ",
        ]
        separators = (
            self._separators
            + js_separators
            + component_separators
            + ["<>", "\n\n", "&&\n", "||\n"]
        )
        self._separators = separators
        return super().split_text(text)
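
The tag-extraction step in the listing above can be tried in isolation. The following is a minimal sketch, not part of the langchain source: the helper name `extract_component_separators` and the sample JSX string are hypothetical, and only the regex and the `"<{tag}"` separator format are taken from the listing. It uses `dict.fromkeys` as a shorter way to deduplicate while preserving first-seen order, which should behave the same as the explicit loop in the original.

```python
import re


def extract_component_separators(text: str) -> list[str]:
    """Hypothetical helper mirroring the tag-extraction step of split_text()."""
    # Same pattern as in the listing: captures the tag name after "<".
    # Closing tags ("</div>") do not match because "/" is not alphanumeric.
    opening_tags = re.findall(r"<\s*([a-zA-Z0-9]+)[^>]*>", text)

    # Deduplicate while keeping first-seen order (dicts preserve insertion order).
    unique_tags = list(dict.fromkeys(opening_tags))

    # Each separator is the tag name prefixed with "<", as in the listing.
    return [f"<{tag}" for tag in unique_tags]


# Illustrative input only; any JSX-like string works.
sample = """
function App() {
  return (
    <div className="app">
      <Header title="Hi" />
      <div>text</div>
    </div>
  );
}
"""

separators = extract_component_separators(sample)
print(separators)  # ['<div', '<Header']
```

Note that the self-closing `<Header ... />` tag is captured as well: `[^>]*` consumes the trailing slash, so the pattern does not distinguish self-closing tags from ordinary opening tags.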

Frequently Asked Questions

What does split_text() do?
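split_text() is a method of JSFrameworkTextSplitter, defined in libs/text-splitters/langchain_text_splitters/jsx.py. It extracts the unique opening component tags from the input with a regex, combines them with a fixed list of JS separators and the splitter's existing separators, stores the combined list on self._separators, and then delegates the actual splitting to the parent class's split_text().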
Where is split_text() defined?
split_text() is defined in libs/text-splitters/langchain_text_splitters/jsx.py at line 46.