from_bytes_store() — langchain Function Reference
Architecture documentation for the from_bytes_store() function in cache.py from the langchain codebase.
Dependency Diagram
graph TD
    fc5a90e3_3529_5688_86a8_34cee618454e["from_bytes_store()"]
    b3be4e54_ae5f_c527_4e99_0843e3d30f72["CacheBackedEmbeddings"]
    fc5a90e3_3529_5688_86a8_34cee618454e -->|defined in| b3be4e54_ae5f_c527_4e99_0843e3d30f72
    4b1f75e8_3a36_4d2e_f5d9_04dbd0255a60["_make_default_key_encoder()"]
    fc5a90e3_3529_5688_86a8_34cee618454e -->|calls| 4b1f75e8_3a36_4d2e_f5d9_04dbd0255a60
    style fc5a90e3_3529_5688_86a8_34cee618454e fill:#6366f1,stroke:#818cf8,color:#fff
Source Code
libs/langchain/langchain_classic/embeddings/cache.py lines 288–370
def from_bytes_store(
    cls,
    underlying_embeddings: Embeddings,
    document_embedding_cache: ByteStore,
    *,
    namespace: str = "",
    batch_size: int | None = None,
    query_embedding_cache: bool | ByteStore = False,
    key_encoder: Callable[[str], str]
    | Literal["sha1", "blake2b", "sha256", "sha512"] = "sha1",
) -> CacheBackedEmbeddings:
    """On-ramp that adds the necessary serialization and encoding to the store.

    Args:
        underlying_embeddings: The embedder to use for embedding.
        document_embedding_cache: The cache to use for storing document embeddings.
        *,
        namespace: The namespace to use for document cache.
            This namespace is used to avoid collisions with other caches.
            For example, set it to the name of the embedding model used.
        batch_size: The number of documents to embed between store updates.
        query_embedding_cache: The cache to use for storing query embeddings.
            True to use the same cache as document embeddings.
            False to not cache query embeddings.
        key_encoder: Optional callable to encode keys. If not provided,
            a default encoder using SHA-1 will be used. SHA-1 is not
            collision-resistant, and a motivated attacker could craft two
            different texts that hash to the same cache key.

            New applications should use one of the alternative encoders
            or provide a custom and strong key encoder function to avoid this risk.

            If you change a key encoder in an existing cache, consider
            just creating a new cache, to avoid (the potential for)
            collisions with existing keys or having duplicate keys
            for the same text in the cache.

    Returns:
        An instance of CacheBackedEmbeddings that uses the provided cache.
    """
    if isinstance(key_encoder, str):
        key_encoder = _make_default_key_encoder(namespace, key_encoder)
    elif callable(key_encoder):
        # If a custom key encoder is provided, it should not be used with a
        # namespace.
        # A user can handle namespacing directly in their custom key encoder.
        if namespace:
            msg = (
                "Do not supply `namespace` when using a custom key_encoder; "
                "add any prefixing inside the encoder itself."
            )
            raise ValueError(msg)
    else:
        msg = (  # type: ignore[unreachable]
            "key_encoder must be either 'blake2b', 'sha1', 'sha256', 'sha512' "
            "or a callable that encodes keys."
        )
        raise ValueError(msg)  # noqa: TRY004

    document_embedding_store = EncoderBackedStore[str, list[float]](
        document_embedding_cache,
        key_encoder,
        _value_serializer,
        _value_deserializer,
    )
    if query_embedding_cache is True:
        query_embedding_store = document_embedding_store
    elif query_embedding_cache is False:
        query_embedding_store = None
    else:
        query_embedding_store = EncoderBackedStore[str, list[float]](
            query_embedding_cache,
            key_encoder,
            _value_serializer,
            _value_deserializer,
        )

    return cls(
        underlying_embeddings,
        document_embedding_store,
        batch_size=batch_size,
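
The three-way handling of query_embedding_cache near the end of the listing can be sketched in isolation. This is only an illustration, not the library's code: select_query_store and wrap are hypothetical names, and plain dicts stand in for ByteStore objects.

```python
from typing import Callable, Optional, Union


def select_query_store(
    document_store: dict,
    query_embedding_cache: Union[bool, dict],
    wrap: Callable[[dict], dict],
) -> Optional[dict]:
    """Mirror the True / False / separate-store branches from the listing."""
    if query_embedding_cache is True:
        # Share the already wrapped document store for queries.
        return document_store
    if query_embedding_cache is False:
        # Query embeddings are not cached at all.
        return None
    # Anything else is treated as a separate backing store and wrapped.
    return wrap(query_embedding_cache)


def wrap(raw: dict) -> dict:
    # Hypothetical stand-in for building an EncoderBackedStore around `raw`.
    return {"wrapped": raw}


doc_store = {"kind": "documents"}
separate = {"kind": "queries"}

shared = select_query_store(doc_store, True, wrap)
disabled = select_query_store(doc_store, False, wrap)
dedicated = select_query_store(doc_store, separate, wrap)
```

The identity checks (is True, is False) are what keep a store object from being mistaken for a flag: in this sketch an empty dict is falsy, yet it is still wrapped as a separate query store rather than treated as False.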
Frequently Asked Questions
What does from_bytes_store() do?
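from_bytes_store() is an alternative constructor on CacheBackedEmbeddings, defined in libs/langchain/langchain_classic/embeddings/cache.py. It wraps the supplied ByteStore in an EncoderBackedStore that encodes keys with key_encoder and serializes embedding values, optionally sets up a query cache the same way, and returns a CacheBackedEmbeddings instance built on those stores.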
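
As a rough picture of what "adds the necessary serialization and encoding to the store" means, the sketch below wraps a plain dict of bytes with a key encoder and a value serializer. TinyEncodedStore and sha256_key are hypothetical stand-ins (for EncoderBackedStore and a key encoder), and JSON is assumed as the vector serialization; the listing does not show what _value_serializer and _value_deserializer actually do.

```python
import hashlib
import json
from typing import Callable, Dict, List, Optional


class TinyEncodedStore:
    """Hypothetical stand-in for EncoderBackedStore over a bytes-valued dict."""

    def __init__(
        self, raw: Dict[str, bytes], key_encoder: Callable[[str], str]
    ) -> None:
        self.raw = raw
        self.key_encoder = key_encoder

    def set(self, text: str, vector: List[float]) -> None:
        # Encode the key, then serialize the vector to bytes (JSON is an assumption).
        self.raw[self.key_encoder(text)] = json.dumps(vector).encode("utf-8")

    def get(self, text: str) -> Optional[List[float]]:
        # Look up by encoded key and deserialize, or return None on a cache miss.
        payload = self.raw.get(self.key_encoder(text))
        return None if payload is None else json.loads(payload.decode("utf-8"))


def sha256_key(text: str) -> str:
    """Example key encoder: hex SHA-256 digest of the text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


raw_bytes: Dict[str, bytes] = {}
store = TinyEncodedStore(raw_bytes, sha256_key)
store.set("hello world", [0.1, 0.2, 0.3])
cached = store.get("hello world")
```

The backing dict only ever sees hashed string keys and byte payloads, which is why any byte-oriented store can sit underneath.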
Where is from_bytes_store() defined?
from_bytes_store() is defined in libs/langchain/langchain_classic/embeddings/cache.py at line 288.
What does from_bytes_store() call?
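from_bytes_store() calls one function, _make_default_key_encoder(), which it uses to build the key encoder when key_encoder is given as an algorithm name rather than a callable.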
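
The key-encoder branch that leads to this call can be sketched as follows. make_hash_encoder is a hypothetical stand-in for _make_default_key_encoder, whose body is not shown on this page; the assumption here is simply that it prefixes the namespace to a hex digest, which may differ from the library's actual key format. resolve_key_encoder is likewise an illustrative name.

```python
import hashlib
from typing import Callable, Union

SUPPORTED = ("sha1", "blake2b", "sha256", "sha512")


def make_hash_encoder(namespace: str, algorithm: str) -> Callable[[str], str]:
    """Hypothetical stand-in: namespace prefix plus hex digest of the text."""
    if algorithm not in SUPPORTED:
        raise ValueError(f"Unsupported algorithm: {algorithm!r}")

    def encode(text: str) -> str:
        digest = hashlib.new(algorithm, text.encode("utf-8")).hexdigest()
        return f"{namespace}{digest}"

    return encode


def resolve_key_encoder(
    namespace: str, key_encoder: Union[str, Callable[[str], str]]
) -> Callable[[str], str]:
    """Mirror the str / callable / invalid branches from the listing."""
    if isinstance(key_encoder, str):
        return make_hash_encoder(namespace, key_encoder)
    if callable(key_encoder):
        if namespace:
            # Custom encoders are expected to do their own prefixing.
            raise ValueError("Do not supply `namespace` with a custom key_encoder.")
        return key_encoder
    raise ValueError("key_encoder must be a supported algorithm name or a callable.")


encoder = resolve_key_encoder("model-a/", "sha256")
custom = resolve_key_encoder("", lambda text: "custom/" + text)
```

Under these assumptions, two caches that use different namespaces never share a key for the same text, which is the collision-avoidance role the docstring assigns to namespace.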