json.py — langchain Source File

Architecture documentation for json.py, a Python file in the langchain codebase. 4 imports, 1 dependent.

File · Python · DocumentProcessing · TextSplitters · 4 imports · 1 dependent · 1 class

Dependency Diagram

graph LR
  c67269cb_3e1f_66bc_89a3_cf12560e7339["json.py"]
  e874d8a4_cef0_9d0b_d1ee_84999c07cc2c["copy"]
  c67269cb_3e1f_66bc_89a3_cf12560e7339 --> e874d8a4_cef0_9d0b_d1ee_84999c07cc2c
  %% The source's `import json` targets the standard-library module, not this file.
  json_stdlib["json"]
  c67269cb_3e1f_66bc_89a3_cf12560e7339 --> json_stdlib
  8e2034b7_ceb8_963f_29fc_2ea6b50ef9b3["typing"]
  c67269cb_3e1f_66bc_89a3_cf12560e7339 --> 8e2034b7_ceb8_963f_29fc_2ea6b50ef9b3
  c554676d_b731_47b2_a98f_c1c2d537c0aa["langchain_core.documents"]
  c67269cb_3e1f_66bc_89a3_cf12560e7339 --> c554676d_b731_47b2_a98f_c1c2d537c0aa
  style c67269cb_3e1f_66bc_89a3_cf12560e7339 fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

"""JSON text splitter."""

from __future__ import annotations

import copy
import json
from typing import Any

from langchain_core.documents import Document


class RecursiveJsonSplitter:
    """Splits JSON data into smaller, structured chunks while preserving hierarchy.

    This class provides methods to split JSON data into smaller dictionaries or
    JSON-formatted strings based on configurable maximum and minimum chunk sizes.
    It supports nested JSON structures, optionally converts lists into dictionaries
    for better chunking, and allows the creation of document objects for further use.
    """

    max_chunk_size: int = 2000
    """The maximum size for each chunk."""

    min_chunk_size: int = 1800
    """The minimum size for each chunk, derived from `max_chunk_size` if not
    explicitly provided.
    """

    def __init__(
        self, max_chunk_size: int = 2000, min_chunk_size: int | None = None
    ) -> None:
        """Initialize the chunk size configuration for text processing.

        This constructor sets up the maximum and minimum chunk sizes, ensuring that
        the `min_chunk_size` defaults to a value slightly smaller than the
        `max_chunk_size` if not explicitly provided.

        Args:
            max_chunk_size: The maximum size for a chunk.
            min_chunk_size: The minimum size for a chunk.

                If `None`, defaults to the maximum chunk size minus 200, with a lower
                bound of 50.
        """
        super().__init__()
        self.max_chunk_size = max_chunk_size
        self.min_chunk_size = (
            min_chunk_size
            if min_chunk_size is not None
            else max(max_chunk_size - 200, 50)
        )

    @staticmethod
    def _json_size(data: dict[str, Any]) -> int:
        """Calculate the size of the serialized JSON object."""
        return len(json.dumps(data))

    @staticmethod
    def _set_nested_dict(
        d: dict[str, Any],
// ... (131 more lines)
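
The default sizing rule in the constructor and the size measure in `_json_size` can be checked in isolation. The following is a minimal sketch that restates those two pieces as standalone functions using only the standard library. The names `default_min_chunk_size` and `json_size` are hypothetical and are not part of langchain; the elided remainder of the file is not reproduced here.

```python
import json
from typing import Any


def default_min_chunk_size(max_chunk_size: int) -> int:
    """Mirror the constructor's fallback: max minus 200, never below 50.

    Hypothetical helper; the source computes this inline in __init__.
    """
    return max(max_chunk_size - 200, 50)


def json_size(data: dict[str, Any]) -> int:
    """Return the character length of the serialized JSON, as _json_size does."""
    return len(json.dumps(data))


if __name__ == "__main__":
    print(default_min_chunk_size(2000))  # 1800
    print(default_min_chunk_size(100))   # 50 (lower bound applies)
    print(json_size({"a": 1}))           # 8, the length of '{"a": 1}'
```

Note that the size is a character count of the output of `json.dumps` with its default settings, so non-ASCII characters are measured in their escaped form (for example, `é` is serialized as `\u00e9`) rather than as bytes.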

Dependencies

  • copy
  • json
  • langchain_core.documents
  • typing

Frequently Asked Questions

What does json.py do?
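json.py is a Python source file in the langchain codebase. It defines RecursiveJsonSplitter, a class that splits JSON data into smaller, structured chunks while preserving hierarchy. It belongs to the DocumentProcessing domain, TextSplitters subdomain.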
What does json.py depend on?
json.py imports 4 modules: copy, json, langchain_core.documents, typing.
What files import json.py?
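The analysis lists 1 importing file, json.py itself. That self-reference appears to come from the file's `import json` statement, which targets the standard-library json module rather than this file.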
Where is json.py in the architecture?
json.py is located at libs/text-splitters/langchain_text_splitters/json.py (domain: DocumentProcessing, subdomain: TextSplitters, directory: libs/text-splitters/langchain_text_splitters).
