DocumentIndex Class — langchain Architecture

Architecture documentation for the DocumentIndex class in base.py from the langchain codebase.

Dependency Diagram

graph TD
  b6dfcb0b_d4ed_4574_14c7_fa7e485f1b07["DocumentIndex"]
  2a401977_bd56_ea94_9c8f_d0b77072baae["BaseRetriever"]
  b6dfcb0b_d4ed_4574_14c7_fa7e485f1b07 -->|extends| 2a401977_bd56_ea94_9c8f_d0b77072baae
  44ffc3da_66a5_f9ca_57ac_f9a80e82f0c8["base.py"]
  b6dfcb0b_d4ed_4574_14c7_fa7e485f1b07 -->|defined in| 44ffc3da_66a5_f9ca_57ac_f9a80e82f0c8
  4abf70dc_dd15_60c5_4cce_1312c88e3647["upsert()"]
  b6dfcb0b_d4ed_4574_14c7_fa7e485f1b07 -->|method| 4abf70dc_dd15_60c5_4cce_1312c88e3647
  5982d0ee_4108_afd4_2bd8_488e0f0d8697["aupsert()"]
  b6dfcb0b_d4ed_4574_14c7_fa7e485f1b07 -->|method| 5982d0ee_4108_afd4_2bd8_488e0f0d8697
  84c9f1ba_cde8_3ce9_dc71_858dbd25926e["delete()"]
  b6dfcb0b_d4ed_4574_14c7_fa7e485f1b07 -->|method| 84c9f1ba_cde8_3ce9_dc71_858dbd25926e
  37d43766_b397_82f0_edad_bd75602ef9b1["adelete()"]
  b6dfcb0b_d4ed_4574_14c7_fa7e485f1b07 -->|method| 37d43766_b397_82f0_edad_bd75602ef9b1
  99922a07_b48e_2712_da81_be82e5f2c28d["get()"]
  b6dfcb0b_d4ed_4574_14c7_fa7e485f1b07 -->|method| 99922a07_b48e_2712_da81_be82e5f2c28d
  c862be9c_797a_0bd4_b6f9_db3c695602bd["aget()"]
  b6dfcb0b_d4ed_4574_14c7_fa7e485f1b07 -->|method| c862be9c_797a_0bd4_b6f9_db3c695602bd

Source Code

libs/core/langchain_core/indexing/base.py lines 497–661

class DocumentIndex(BaseRetriever):
    """A document retriever that supports indexing operations.

    This indexing interface is designed to be a generic abstraction for storing and
    querying documents that have an ID and metadata associated with them.

    The interface is designed to be agnostic to the underlying implementation of the
    indexing system.

    The interface is designed to support the following operations:

    1. Storing documents in the index.
    2. Fetching documents by ID.
    3. Searching for documents using a query.
    """

    @abc.abstractmethod
    def upsert(self, items: Sequence[Document], /, **kwargs: Any) -> UpsertResponse:
        """Upsert documents into the index.

        The upsert functionality should utilize the ID field of the content object
        if it is provided. If the ID is not provided, the upsert method is free
        to generate an ID for the content.

        When an ID is specified and the content already exists in the `VectorStore`,
        the upsert method should update the content with the new data. If the content
        does not exist, the upsert method should add the item to the `VectorStore`.

        Args:
            items: Sequence of documents to add to the `VectorStore`.
            **kwargs: Additional keyword arguments.

        Returns:
            A response object that contains the list of IDs that were
            successfully added or updated in the `VectorStore` and the list of IDs that
            failed to be added or updated.
        """

    async def aupsert(
        self, items: Sequence[Document], /, **kwargs: Any
    ) -> UpsertResponse:
        """Add or update documents in the `VectorStore`. Async version of `upsert`.

        The upsert functionality should utilize the ID field of the item
        if it is provided. If the ID is not provided, the upsert method is free
        to generate an ID for the item.

        When an ID is specified and the item already exists in the `VectorStore`,
        the upsert method should update the item with the new data. If the item
        does not exist, the upsert method should add the item to the `VectorStore`.

        Args:
            items: Sequence of documents to add to the `VectorStore`.
            **kwargs: Additional keyword arguments.

        Returns:
            A response object that contains the list of IDs that were
            successfully added or updated in the `VectorStore` and the list of IDs that
            failed to be added or updated.
        """
        return await run_in_executor(
            None,
            self.upsert,
            items,
            **kwargs,
        )

    @abc.abstractmethod
    def delete(self, ids: list[str] | None = None, **kwargs: Any) -> DeleteResponse:
        """Delete by IDs or other criteria.

        Calling delete without any input parameters should raise a ValueError!

        Args:
            ids: List of IDs to delete.
            **kwargs: Additional keyword arguments. This is up to the implementation.
                For example, can include an option to delete the entire index,
                or else issue a non-blocking delete etc.

        Returns:
            A response object that contains the list of IDs that were
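
The upsert and delete contracts described in the docstrings above can be sketched with a small in-memory index. This is a stdlib-only illustration, not the library's implementation: `Doc` and `ToyIndex` are hypothetical stand-ins, the response key names `succeeded` and `failed` are assumptions, reporting unknown IDs as failed on delete is one possible choice, and `aupsert` uses asyncio's `loop.run_in_executor` in place of the `run_in_executor` helper that the excerpt calls.

```python
from __future__ import annotations

import asyncio
import functools
import uuid
from dataclasses import dataclass, field
from typing import Any, Sequence


@dataclass
class Doc:
    """Hypothetical stand-in for a document: content plus optional ID and metadata."""

    page_content: str
    id: str | None = None
    metadata: dict[str, Any] = field(default_factory=dict)


class ToyIndex:
    """Hypothetical in-memory index following the upsert/delete/get contract."""

    def __init__(self) -> None:
        self._store: dict[str, Doc] = {}

    def upsert(self, items: Sequence[Doc], /, **kwargs: Any) -> dict[str, list[str]]:
        succeeded: list[str] = []
        for item in items:
            # Use the caller's ID when present; otherwise generate one.
            doc_id = item.id if item.id is not None else str(uuid.uuid4())
            # Assigning by key adds a new entry or overwrites an existing one.
            self._store[doc_id] = Doc(item.page_content, doc_id, dict(item.metadata))
            succeeded.append(doc_id)
        # The key names "succeeded" and "failed" are assumptions for this sketch.
        return {"succeeded": succeeded, "failed": []}

    async def aupsert(self, items: Sequence[Doc], /, **kwargs: Any) -> dict[str, list[str]]:
        # Delegate to the synchronous upsert on the default thread pool executor.
        loop = asyncio.get_running_loop()
        call = functools.partial(self.upsert, items, **kwargs)
        return await loop.run_in_executor(None, call)

    def delete(self, ids: list[str] | None = None, **kwargs: Any) -> dict[str, list[str]]:
        # Calling delete without any input raises ValueError, as the docstring asks.
        if ids is None:
            raise ValueError("delete() requires a list of IDs")
        succeeded: list[str] = []
        failed: list[str] = []
        for doc_id in ids:
            if doc_id in self._store:
                del self._store[doc_id]
                succeeded.append(doc_id)
            else:
                # Sketch-specific choice: unknown IDs are reported as failed.
                failed.append(doc_id)
        return {"succeeded": succeeded, "failed": failed}

    def get(self, ids: Sequence[str], /, **kwargs: Any) -> list[Doc]:
        # Return only the documents that exist; missing IDs are skipped.
        return [self._store[doc_id] for doc_id in ids if doc_id in self._store]
```

Upserting `Doc("new text", id="1")` into an index that already holds ID `"1"` replaces the stored content, while a `Doc` without an ID is stored under a freshly generated one.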

Extends

BaseRetriever

Frequently Asked Questions

What is the DocumentIndex class?
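DocumentIndex is an abstract document retriever in the langchain codebase that adds indexing operations (upsert, delete, get, and their async counterparts) on top of BaseRetriever. It is defined in libs/core/langchain_core/indexing/base.py.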
Where is DocumentIndex defined?
DocumentIndex is defined in libs/core/langchain_core/indexing/base.py at line 497.
What does DocumentIndex extend?
DocumentIndex extends BaseRetriever.
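
Because `upsert()` and `delete()` are marked with `@abc.abstractmethod` in the source excerpt, a concrete subclass has to implement them before it can be instantiated. The following is a minimal stdlib-only sketch of that behaviour. `IndexBase`, `Incomplete`, `Complete`, and `can_instantiate` are hypothetical names used for illustration and are not the real `DocumentIndex` or `BaseRetriever` classes, and the returned dictionaries are assumed shapes.

```python
import abc


class IndexBase(abc.ABC):
    """Hypothetical stand-in base class with two abstract methods."""

    @abc.abstractmethod
    def upsert(self, items, /, **kwargs):
        """Add or update items."""

    @abc.abstractmethod
    def delete(self, ids=None, **kwargs):
        """Delete items by ID."""


class Incomplete(IndexBase):
    """Implements only upsert, so the class is still abstract."""

    def upsert(self, items, /, **kwargs):
        return {"succeeded": [], "failed": []}


class Complete(Incomplete):
    """Implements the remaining abstract method, so it can be instantiated."""

    def delete(self, ids=None, **kwargs):
        if ids is None:
            raise ValueError("delete() requires a list of IDs")
        return {"succeeded": list(ids), "failed": []}


def can_instantiate(cls) -> bool:
    """Return True if cls() succeeds, False if abc rejects it with TypeError."""
    try:
        cls()
    except TypeError:
        return False
    return True
```

In this sketch `can_instantiate(Complete)` is true, while `IndexBase` and `Incomplete` are rejected because at least one abstract method is still unimplemented.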
