aembed_documents() — langchain Function Reference

Architecture documentation for the aembed_documents() function in cache.py from the langchain codebase.

Dependency Diagram

graph TD
  db7dff5e_5bf1_905b_2298_cceb81672a4c["aembed_documents()"]
  b3be4e54_ae5f_c527_4e99_0843e3d30f72["CacheBackedEmbeddings"]
  db7dff5e_5bf1_905b_2298_cceb81672a4c -->|defined in| b3be4e54_ae5f_c527_4e99_0843e3d30f72
  style db7dff5e_5bf1_905b_2298_cceb81672a4c fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

libs/langchain/langchain_classic/embeddings/cache.py lines 201–239

    async def aembed_documents(self, texts: list[str]) -> list[list[float]]:
        """Embed a list of texts.

        The method first checks the cache for the embeddings.
        If the embeddings are not found, the method uses the underlying embedder
        to embed the documents and stores the results in the cache.

        Args:
            texts: A list of texts to embed.

        Returns:
            A list of embeddings for the given texts.
        """
        vectors: list[list[float] | None] = await self.document_embedding_store.amget(
            texts
        )
        all_missing_indices: list[int] = [
            i for i, vector in enumerate(vectors) if vector is None
        ]

        # batch_iterate supports None batch_size which returns all elements at once
        # as a single batch.
        for missing_indices in batch_iterate(self.batch_size, all_missing_indices):
            missing_texts = [texts[i] for i in missing_indices]
            missing_vectors = await self.underlying_embeddings.aembed_documents(
                missing_texts,
            )
            await self.document_embedding_store.amset(
                list(zip(missing_texts, missing_vectors, strict=False)),
            )
            for index, updated_vector in zip(
                missing_indices, missing_vectors, strict=False
            ):
                vectors[index] = updated_vector

        return cast(
            "list[list[float]]",
            vectors,
        )  # Nones should have been resolved by now