test_hashed_document.py — langchain Source File

Architecture documentation for test_hashed_document.py, a Python file in the langchain codebase. It has 3 imports and 0 dependents.

Dependency Diagram

graph LR
  af3e2fd9_b742_b6ab_ccff_6e4948c48977["test_hashed_document.py"]
  8e2034b7_ceb8_963f_29fc_2ea6b50ef9b3["typing"]
  af3e2fd9_b742_b6ab_ccff_6e4948c48977 --> 8e2034b7_ceb8_963f_29fc_2ea6b50ef9b3
  c554676d_b731_47b2_a98f_c1c2d537c0aa["langchain_core.documents"]
  af3e2fd9_b742_b6ab_ccff_6e4948c48977 --> c554676d_b731_47b2_a98f_c1c2d537c0aa
  ffb0de41_b0f2_997d_ad3d_1a29fba34ab1["langchain_core.indexing.api"]
  af3e2fd9_b742_b6ab_ccff_6e4948c48977 --> ffb0de41_b0f2_997d_ad3d_1a29fba34ab1
  style af3e2fd9_b742_b6ab_ccff_6e4948c48977 fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

from typing import Literal

from langchain_core.documents import Document
from langchain_core.indexing.api import _get_document_with_hash


def test_hashed_document_hashing() -> None:
    document = Document(
        uid="123", page_content="Lorem ipsum dolor sit amet", metadata={"key": "value"}
    )
    hashed_document = _get_document_with_hash(document, key_encoder="sha1")
    assert isinstance(hashed_document.id, str)


def test_to_document() -> None:
    """Test to_document method."""
    original_doc = Document(
        page_content="Lorem ipsum dolor sit amet", metadata={"key": "value"}
    )
    hashed_doc = _get_document_with_hash(original_doc, key_encoder="sha1")
    assert isinstance(hashed_doc, Document)
    assert hashed_doc is not original_doc
    assert hashed_doc.page_content == "Lorem ipsum dolor sit amet"
    assert hashed_doc.metadata["key"] == "value"


def test_hashing() -> None:
    """Test from document class method."""
    document = Document(
        page_content="Lorem ipsum dolor sit amet", metadata={"key": "value"}
    )
    hashed_document = _get_document_with_hash(document, key_encoder="sha1")
    # hash should be deterministic
    assert hashed_document.id == "fd1dc827-051b-537d-a1fe-1fa043e8b276"

    # Verify that hashing with sha1 is deterministic
    another_hashed_document = _get_document_with_hash(document, key_encoder="sha1")
    assert another_hashed_document.id == hashed_document.id

    # Verify that the result is different from SHA256, SHA512, blake2b
    values: list[Literal["sha256", "sha512", "blake2b"]] = [
        "sha256",
        "sha512",
        "blake2b",
    ]

    for key_encoder in values:
        different_hashed_document = _get_document_with_hash(
            document, key_encoder=key_encoder
        )
        assert different_hashed_document.id != hashed_document.id


def test_hashing_custom_key_encoder() -> None:
    """Test hashing with a custom key encoder."""

    def custom_key_encoder(doc: Document) -> str:
        return f"quack-{doc.metadata['key']}"

    document = Document(
        page_content="Lorem ipsum dolor sit amet", metadata={"key": "like a duck"}
    )
    hashed_document = _get_document_with_hash(document, key_encoder=custom_key_encoder)
    assert hashed_document.id == "quack-like a duck"
    assert isinstance(hashed_document.id, str)
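
The tests above check three properties: the id is a string, the same input and encoder always produce the same id, and a callable encoder overrides the built-in hashing. The sketch below illustrates those properties using only the standard library. It is not the implementation of _get_document_with_hash; the function name sketch_document_id, the namespace value, the JSON serialization, and the two-argument callable signature are all assumptions made for this illustration.

```python
import hashlib
import json
import uuid
from typing import Callable, Union

# Hypothetical namespace, chosen only for this illustration.
SKETCH_NAMESPACE = uuid.UUID(int=1984)

# Encoder names mapped to standard-library hashlib constructors.
_HASHERS = {
    "sha1": hashlib.sha1,
    "sha256": hashlib.sha256,
    "sha512": hashlib.sha512,
    "blake2b": hashlib.blake2b,
}

# Hypothetical type: either an algorithm name or a callable returning the id.
KeyEncoder = Union[str, Callable[[str, dict], str]]


def sketch_document_id(
    page_content: str, metadata: dict, key_encoder: KeyEncoder = "sha1"
) -> str:
    """Derive a deterministic string id from content and metadata (sketch only)."""
    # A callable encoder takes full control of the id.
    if callable(key_encoder):
        return key_encoder(page_content, metadata)

    # Serialize with sorted keys so equal inputs always yield equal bytes.
    payload = json.dumps(
        {"page_content": page_content, "metadata": metadata}, sort_keys=True
    ).encode("utf-8")

    # Hash the payload, then fold the hex digest into a UUID-shaped string.
    digest = _HASHERS[key_encoder](payload).hexdigest()
    return str(uuid.uuid5(SKETCH_NAMESPACE, digest))
```

Because the namespace and serialization here are assumptions, the ids this sketch produces will not match the literal value asserted in test_hashing. It only mirrors the behavior the tests verify: repeatable ids per encoder, different ids across encoders, and pass-through of a custom encoder's return value.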

Dependencies

  • langchain_core.documents
  • langchain_core.indexing.api
  • typing

Frequently Asked Questions

What does test_hashed_document.py do?
test_hashed_document.py is a Python unit-test module in the langchain codebase. It exercises _get_document_with_hash from langchain_core.indexing.api, checking that the returned document has a string id, that it is a new Document object with the original page_content and metadata, that sha1 hashing is deterministic and yields a different id than sha256, sha512, or blake2b, and that a custom callable key encoder is honored. It belongs to the CoreAbstractions domain, RunnableInterface subdomain.
What functions are defined in test_hashed_document.py?
test_hashed_document.py defines four functions: test_hashed_document_hashing, test_hashing, test_hashing_custom_key_encoder, and test_to_document.
What does test_hashed_document.py depend on?
test_hashed_document.py imports three modules: langchain_core.documents, langchain_core.indexing.api, and typing.
Where is test_hashed_document.py in the architecture?
test_hashed_document.py is located at libs/core/tests/unit_tests/indexing/test_hashed_document.py (domain: CoreAbstractions, subdomain: RunnableInterface, directory: libs/core/tests/unit_tests/indexing).
