jsx.py — langchain Source File
Architecture documentation for jsx.py, a Python file in the langchain codebase. 3 imports, 0 dependents.
Dependency Diagram
graph LR
    e969e3be_caa0_f4cc_b1ed_b8ef51787409["jsx.py"]
    67ec3255_645e_8b6e_1eff_1eb3c648ed95["re"]
    e969e3be_caa0_f4cc_b1ed_b8ef51787409 --> 67ec3255_645e_8b6e_1eff_1eb3c648ed95
    8e2034b7_ceb8_963f_29fc_2ea6b50ef9b3["typing"]
    e969e3be_caa0_f4cc_b1ed_b8ef51787409 --> 8e2034b7_ceb8_963f_29fc_2ea6b50ef9b3
    5d24a664_4d9b_7491_ea6a_e13ddbcc8eeb["langchain_text_splitters"]
    e969e3be_caa0_f4cc_b1ed_b8ef51787409 --> 5d24a664_4d9b_7491_ea6a_e13ddbcc8eeb
    style e969e3be_caa0_f4cc_b1ed_b8ef51787409 fill:#6366f1,stroke:#818cf8,color:#fff
Source Code
"""JavaScript framework text splitter."""

import re
from typing import Any

from langchain_text_splitters import RecursiveCharacterTextSplitter


class JSFrameworkTextSplitter(RecursiveCharacterTextSplitter):
    """Text splitter that handles React (JSX), Vue, and Svelte code.

    This splitter extends `RecursiveCharacterTextSplitter` to handle React (JSX), Vue,
    and Svelte code by:

    1. Detecting and extracting custom component tags from the text
    2. Using those tags as additional separators along with standard JS syntax

    The splitter combines:

    * Custom component tags as separators (e.g. `<Component`, `<div`)
    * JavaScript syntax elements (function, const, if, etc)
    * Standard text splitting on newlines

    This allows chunks to break at natural boundaries in React, Vue, and Svelte
    component code.
    """

    def __init__(
        self,
        separators: list[str] | None = None,
        chunk_size: int = 2000,
        chunk_overlap: int = 0,
        **kwargs: Any,
    ) -> None:
        """Initialize the JS Framework text splitter.

        Args:
            separators: Optional list of custom separator strings to use
            chunk_size: Maximum size of chunks to return
            chunk_overlap: Overlap in characters between chunks
            **kwargs: Additional arguments to pass to parent class
        """
        super().__init__(chunk_size=chunk_size, chunk_overlap=chunk_overlap, **kwargs)
        self._separators = separators or []

    def split_text(self, text: str) -> list[str]:
        """Split text into chunks.

        This method splits the text into chunks by:

        * Extracting unique opening component tags using regex
        * Creating separators list with extracted tags and JS separators
        * Splitting the text using the separators by calling the parent class method

        Args:
            text: String containing code to split

        Returns:
            List of text chunks split on component and JS boundaries
        """
        # Extract unique opening component tags using regex
        # Regex to match opening tags, excluding self-closing tags
        opening_tags = re.findall(r"<\s*([a-zA-Z0-9]+)[^>]*>", text)

        component_tags = []
        for tag in opening_tags:
            if tag not in component_tags:
                component_tags.append(tag)
        component_separators = [f"<{tag}" for tag in component_tags]

        js_separators = [
            "\nexport ",
            " export ",
            "\nfunction ",
            "\nasync function ",
            " async function ",
            "\nconst ",
            "\nlet ",
            "\nvar ",
            "\nclass ",
            " class ",
            "\nif ",
            " if ",
            "\nfor ",
            " for ",
            "\nwhile ",
            " while ",
            "\nswitch ",
            " switch ",
            "\ncase ",
            " case ",
            "\ndefault ",
            " default ",
        ]
        separators = (
            self._separators
            + js_separators
            + component_separators
            + ["<>", "\n\n", "&&\n", "||\n"]
        )
        self._separators = separators
        return super().split_text(text)
Domain
- DocumentProcessing
Subdomains
- TextSplitters
Classes
- JSFrameworkTextSplitter
Dependencies
- langchain_text_splitters
- re
- typing
Frequently Asked Questions
What does jsx.py do?
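jsx.py is a Python source file in the langchain codebase. It defines JSFrameworkTextSplitter, a RecursiveCharacterTextSplitter subclass that splits React (JSX), Vue, and Svelte code on component tags and JavaScript syntax boundaries. It belongs to the DocumentProcessing domain, TextSplitters subdomain.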
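As a rough illustration of the tag-detection step, the following sketch applies the same regular expression that appears in split_text to a small JSX snippet, using only the standard library. The helper name extract_component_separators and the sample markup are assumptions made for this example and are not part of jsx.py.

```python
import re

# Same pattern as the one used in split_text.
TAG_PATTERN = re.compile(r"<\s*([a-zA-Z0-9]+)[^>]*>")


def extract_component_separators(text: str) -> list[str]:
    """Return a '<tag' separator for each distinct tag, in first-seen order."""
    tags = TAG_PATTERN.findall(text)
    unique_tags = list(dict.fromkeys(tags))  # de-duplicate, keep order
    return [f"<{tag}" for tag in unique_tags]


# Hypothetical sample markup for the demo.
sample = """
<div className="app">
  <Header title="Hi" />
  <>
    <p>one</p>
    <p>two</p>
  </>
</div>
"""

print(extract_component_separators(sample))
# ['<div', '<Header', '<p']
```

In this sample, closing tags and the empty fragment `<>` are not captured, while the self-closing `<Header />` is, so it also contributes a separator.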
What does jsx.py depend on?
jsx.py imports three modules: langchain_text_splitters, re, and typing.
Where is jsx.py in the architecture?
jsx.py is located at libs/text-splitters/langchain_text_splitters/jsx.py (domain: DocumentProcessing, subdomain: TextSplitters, directory: libs/text-splitters/langchain_text_splitters).