json.py — langchain Source File
Architecture documentation for json.py, a Python file in the langchain codebase. 4 imports, 1 dependent.
Dependency Diagram
graph LR
  c67269cb_3e1f_66bc_89a3_cf12560e7339["json.py"]
  e874d8a4_cef0_9d0b_d1ee_84999c07cc2c["copy"]
  c67269cb_3e1f_66bc_89a3_cf12560e7339 --> e874d8a4_cef0_9d0b_d1ee_84999c07cc2c
  c67269cb_3e1f_66bc_89a3_cf12560e7339 --> c67269cb_3e1f_66bc_89a3_cf12560e7339
  8e2034b7_ceb8_963f_29fc_2ea6b50ef9b3["typing"]
  c67269cb_3e1f_66bc_89a3_cf12560e7339 --> 8e2034b7_ceb8_963f_29fc_2ea6b50ef9b3
  c554676d_b731_47b2_a98f_c1c2d537c0aa["langchain_core.documents"]
  c67269cb_3e1f_66bc_89a3_cf12560e7339 --> c554676d_b731_47b2_a98f_c1c2d537c0aa
  c67269cb_3e1f_66bc_89a3_cf12560e7339 --> c67269cb_3e1f_66bc_89a3_cf12560e7339
  style c67269cb_3e1f_66bc_89a3_cf12560e7339 fill:#6366f1,stroke:#818cf8,color:#fff
Source Code
"""JSON text splitter."""

from __future__ import annotations

import copy
import json
from typing import Any

from langchain_core.documents import Document


class RecursiveJsonSplitter:
    """Splits JSON data into smaller, structured chunks while preserving hierarchy.

    This class provides methods to split JSON data into smaller dictionaries or
    JSON-formatted strings based on configurable maximum and minimum chunk sizes.
    It supports nested JSON structures, optionally converts lists into dictionaries
    for better chunking, and allows the creation of document objects for further use.
    """

    max_chunk_size: int = 2000
    """The maximum size for each chunk."""
    min_chunk_size: int = 1800
    """The minimum size for each chunk, derived from `max_chunk_size` if not
    explicitly provided.
    """

    def __init__(
        self, max_chunk_size: int = 2000, min_chunk_size: int | None = None
    ) -> None:
        """Initialize the chunk size configuration for text processing.

        This constructor sets up the maximum and minimum chunk sizes, ensuring that
        the `min_chunk_size` defaults to a value slightly smaller than the
        `max_chunk_size` if not explicitly provided.

        Args:
            max_chunk_size: The maximum size for a chunk.
            min_chunk_size: The minimum size for a chunk.
                If `None`, defaults to the maximum chunk size minus 200, with a lower
                bound of 50.
        """
        super().__init__()
        self.max_chunk_size = max_chunk_size
        self.min_chunk_size = (
            min_chunk_size
            if min_chunk_size is not None
            else max(max_chunk_size - 200, 50)
        )

    @staticmethod
    def _json_size(data: dict[str, Any]) -> int:
        """Calculate the size of the serialized JSON object."""
        return len(json.dumps(data))

    @staticmethod
    def _set_nested_dict(
        d: dict[str, Any],
// ... (131 more lines)
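The two rules visible in the excerpt above, the fallback for `min_chunk_size` and the size measurement in `_json_size`, can be tried in isolation. The sketch below restates them as standalone functions using only the standard library; the names `default_min_chunk_size` and `json_size` are hypothetical and are not part of the langchain API.

```python
import json
from typing import Any


def json_size(data: dict[str, Any]) -> int:
    """Number of characters in the default json.dumps serialization."""
    return len(json.dumps(data))


def default_min_chunk_size(max_chunk_size: int) -> int:
    """Fallback when no minimum is given: max minus 200, never below 50."""
    return max(max_chunk_size - 200, 50)


print(default_min_chunk_size(2000))  # 1800, matching the class-level default
print(default_min_chunk_size(100))   # 50, because 100 - 200 falls below the floor
print(json_size({"a": 1}))           # 8, the length of '{"a": 1}'
```

Note that the size is the character count of the serialized string, including the spaces that `json.dumps` inserts after `:` and `,` by default, so it is somewhat larger than a compact encoding of the same data.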
Domain
- DocumentProcessing
Subdomains
- TextSplitters
Classes
- RecursiveJsonSplitter
Dependencies
- copy
- json (standard library; reported by the analysis as json.py)
- langchain_core.documents
- typing
Frequently Asked Questions
What does json.py do?
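json.py is a Python source file in the langchain codebase. It defines the RecursiveJsonSplitter class, which splits JSON data into smaller chunks while preserving hierarchy. It belongs to the DocumentProcessing domain, TextSplitters subdomain.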
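As a rough illustration of the size-bounded splitting described in the RecursiveJsonSplitter docstring, the sketch below packs top-level keys into chunks whose serialized size stays within a limit. It is a simplified stand-in, not the library's implementation: the real splitting logic is elided from the source excerpt, and this version does not recurse into nested values or apply a minimum chunk size. The names `json_size` and `split_top_level` are hypothetical.

```python
import json
from typing import Any


def json_size(data: dict[str, Any]) -> int:
    """Number of characters in the default json.dumps serialization."""
    return len(json.dumps(data))


def split_top_level(
    data: dict[str, Any], max_chunk_size: int
) -> list[dict[str, Any]]:
    """Greedily pack top-level keys into chunks no larger than max_chunk_size.

    A single key whose value is already too large still gets its own chunk,
    because this sketch never descends into nested structures.
    """
    chunks: list[dict[str, Any]] = []
    current: dict[str, Any] = {}
    for key, value in data.items():
        candidate = {**current, key: value}
        if current and json_size(candidate) > max_chunk_size:
            # Adding this key would overflow the chunk, so start a new one.
            chunks.append(current)
            current = {key: value}
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


sample = {"a": "x" * 10, "b": "y" * 10, "c": "z" * 10}
chunks = split_top_level(sample, max_chunk_size=40)
print([list(chunk) for chunk in chunks])  # [['a', 'b'], ['c']]
```

Here the first two keys serialize to 38 characters together, which fits under the limit of 40, while adding the third would reach 57, so it starts a second chunk.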
What does json.py depend on?
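json.py imports 4 modules: copy, json, langchain_core.documents, and typing. The analysis reports the json import as json.py, but the `import json` statement in the source refers to the standard-library json module.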
What files import json.py?
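json.py is listed as imported by 1 file: json.py. This self-reference appears to come from the file's own `import json` statement, which normally resolves to the standard-library json module rather than to this file.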
Where is json.py in the architecture?
json.py is located at libs/text-splitters/langchain_text_splitters/json.py (domain: DocumentProcessing, subdomain: TextSplitters, directory: libs/text-splitters/langchain_text_splitters).