test_incremental_indexing_with_batch_size() — langchain Function Reference

Architecture documentation for the test_incremental_indexing_with_batch_size() function in test_indexing.py from the langchain codebase.

Entity Profile

Dependency Diagram

graph TD
  1a911a4b_a0d4_f08d_9e45_e3bd45adc45d["test_incremental_indexing_with_batch_size()"]
  576ad89d_c8dc_eddf_9cd2_c8ae0e7c9978["test_indexing.py"]
  1a911a4b_a0d4_f08d_9e45_e3bd45adc45d -->|defined in| 576ad89d_c8dc_eddf_9cd2_c8ae0e7c9978
  style 1a911a4b_a0d4_f08d_9e45_e3bd45adc45d fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

libs/core/tests/unit_tests/indexing/test_indexing.py lines 1567–1644

def test_incremental_indexing_with_batch_size(
    record_manager: InMemoryRecordManager, vector_store: InMemoryVectorStore
) -> None:
    """Test indexing with incremental indexing."""
    loader = ToyLoader(
        documents=[
            Document(
                page_content="1",
                metadata={"source": "1"},
            ),
            Document(
                page_content="2",
                metadata={"source": "1"},
            ),
            Document(
                page_content="3",
                metadata={"source": "1"},
            ),
            Document(
                page_content="4",
                metadata={"source": "1"},
            ),
        ]
    )

    with patch.object(
        record_manager,
        "get_time",
        return_value=datetime(2021, 1, 1, tzinfo=timezone.utc).timestamp(),
    ):
        assert index(
            loader,
            record_manager,
            vector_store,
            cleanup="incremental",
            source_id_key="source",
            batch_size=2,
            key_encoder="sha256",
        ) == {
            "num_added": 4,
            "num_deleted": 0,
            "num_skipped": 0,
            "num_updated": 0,
        }

    doc_texts = {
        # Ignoring type since doc should be in the store and not a None
        vector_store.get_by_ids([uid])[0].page_content
        for uid in vector_store.store
    }
    assert doc_texts == {"1", "2", "3", "4"}

    with patch.object(
        record_manager,
        "get_time",
        return_value=datetime(2021, 1, 2, tzinfo=timezone.utc).timestamp(),
    ):
        assert index(
            loader,
            record_manager,
            vector_store,
            cleanup="incremental",
            source_id_key="source",
            batch_size=2,
            key_encoder="sha256",
        ) == {
            "num_added": 2,
            "num_deleted": 2,
            "num_skipped": 2,
            "num_updated": 0,
        }

    doc_texts = {
        # Ignoring type since doc should be in the store and not a None
        vector_store.get_by_ids([uid])[0].page_content
        for uid in vector_store.store
    }
    assert doc_texts == {"1", "2", "3", "4"}

Frequently Asked Questions

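What does test_incremental_indexing_with_batch_size() do?
test_incremental_indexing_with_batch_size() is a unit test in libs/core/tests/unit_tests/indexing/test_indexing.py. It calls index() twice with cleanup="incremental", source_id_key="source", and batch_size=2 on four documents that all share the source "1". The first call must report 4 documents added. The second call, made on the same documents with a later mocked timestamp, must report 2 added, 2 deleted, and 2 skipped. After each call the vector store must contain exactly the documents "1", "2", "3", and "4".

Where is test_incremental_indexing_with_batch_size() defined?
test_incremental_indexing_with_batch_size() is defined in libs/core/tests/unit_tests/indexing/test_indexing.py at line 1567.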
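
Why does the second index() call in the source above expect 2 added, 2 deleted, and 2 skipped rather than 4 skipped?
One plausible reading is that incremental cleanup runs after every batch rather than once at the end. With batch_size=2 and a single shared source, the first batch refreshes documents "1" and "2", cleanup then removes the not-yet-refreshed "3" and "4", and the second batch adds them back. The sketch below is a simplified toy model in plain Python, not langchain's implementation. The function simulate_incremental_index, the records dict, and the use of page content as the record key are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Toy record store: key -> (source, updated_at). Hypothetical, for illustration only.
Records = Dict[str, Tuple[str, float]]


def simulate_incremental_index(
    docs: List[Tuple[str, str]],  # (page_content, source) pairs
    records: Records,
    now: float,
    batch_size: int,
) -> Dict[str, int]:
    """Model per-batch incremental cleanup (an assumption, not langchain's code)."""
    stats = {"num_added": 0, "num_deleted": 0, "num_skipped": 0, "num_updated": 0}
    for start in range(0, len(docs), batch_size):
        batch = docs[start : start + batch_size]
        for content, source in batch:
            if content in records:
                stats["num_skipped"] += 1  # already indexed: only refresh timestamp
            else:
                stats["num_added"] += 1
            records[content] = (source, now)
        # Cleanup after each batch: drop records from the sources seen in this
        # batch that have not been touched during the current run.
        batch_sources = {source for _, source in batch}
        stale = [
            key
            for key, (source, updated_at) in records.items()
            if source in batch_sources and updated_at < now
        ]
        for key in stale:
            del records[key]
        stats["num_deleted"] += len(stale)
    return stats


records: Records = {}
docs = [(str(i), "1") for i in range(1, 5)]  # four documents, one shared source
first_run = simulate_incremental_index(docs, records, now=1.0, batch_size=2)
second_run = simulate_incremental_index(docs, records, now=2.0, batch_size=2)
print(first_run)
print(second_run)
```

Under this model the two runs reproduce the counts asserted by the test, and the record store ends up holding the same four documents after both runs. Setting batch_size to 4 or more in the sketch makes the second run skip all four documents, which suggests the add/delete churn comes from the batch boundary falling inside a single source.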
