fastembed_sparse.py — langchain Source File
Architecture documentation for fastembed_sparse.py, a Python file in the langchain codebase. 4 imports, 0 dependents.
Dependency Diagram
graph LR
    1610162b_3386_8d1b_c654_904b72105353["fastembed_sparse.py"]
    feec1ec4_6917_867b_d228_b134d0ff8099["typing"]
    1610162b_3386_8d1b_c654_904b72105353 --> feec1ec4_6917_867b_d228_b134d0ff8099
    cb8d0736_e9dd_2b2f_f6cf_9b165c97bf8b["langchain_qdrant.sparse_embeddings"]
    1610162b_3386_8d1b_c654_904b72105353 --> cb8d0736_e9dd_2b2f_f6cf_9b165c97bf8b
    2bf6d401_816d_d011_3b05_a6114f55ff58["collections.abc"]
    1610162b_3386_8d1b_c654_904b72105353 --> 2bf6d401_816d_d011_3b05_a6114f55ff58
    2a608d75_f635_d1c7_b03f_32a2a76f2a9d["fastembed"]
    1610162b_3386_8d1b_c654_904b72105353 --> 2a608d75_f635_d1c7_b03f_32a2a76f2a9d
    style 1610162b_3386_8d1b_c654_904b72105353 fill:#6366f1,stroke:#818cf8,color:#fff
Source Code
from __future__ import annotations

from typing import TYPE_CHECKING, Any

from langchain_qdrant.sparse_embeddings import SparseEmbeddings, SparseVector

if TYPE_CHECKING:
    from collections.abc import Sequence


class FastEmbedSparse(SparseEmbeddings):
    """An interface for sparse embedding models to use with Qdrant."""

    def __init__(
        self,
        model_name: str = "Qdrant/bm25",
        batch_size: int = 256,
        cache_dir: str | None = None,
        threads: int | None = None,
        providers: Sequence[Any] | None = None,
        parallel: int | None = None,
        **kwargs: Any,
    ) -> None:
        """Sparse encoder implementation using FastEmbed.

        Uses [FastEmbed](https://qdrant.github.io/fastembed/) for sparse text
        embeddings.
        For a list of available models, see [the Qdrant docs](https://qdrant.github.io/fastembed/examples/Supported_Models/).

        Args:
            model_name (str): The name of the model to use.
            batch_size (int): Batch size for encoding.
            cache_dir (str, optional): The path to the model cache directory.\
                Can also be set using the\
                `FASTEMBED_CACHE_PATH` env variable.
            threads (int, optional): The number of threads onnxruntime session can use.
            providers (Sequence[Any], optional): List of ONNX execution providers.\
            parallel (int, optional): If `>1`, data-parallel encoding will be used.\
                Recommended for encoding of large datasets.\
                If `0`, use all available cores.\
                If `None`, don't use data-parallel processing,\
                use default onnxruntime threading instead.\
            kwargs: Additional options to pass to `fastembed.SparseTextEmbedding`

        Raises:
            ValueError: If the `model_name` is not supported in `SparseTextEmbedding`.
        """
        try:
            from fastembed import (  # type: ignore[import-not-found] # noqa: PLC0415
                SparseTextEmbedding,
            )
        except ImportError as err:
            msg = (
                "The 'fastembed' package is not installed. "
                "Please install it with "
                "`pip install fastembed` or `pip install fastembed-gpu`."
            )
            raise ValueError(msg) from err
        self._batch_size = batch_size
        self._parallel = parallel
        self._model = SparseTextEmbedding(
            model_name=model_name,
            cache_dir=cache_dir,
            threads=threads,
            providers=providers,
            **kwargs,
        )

    def embed_documents(self, texts: list[str]) -> list[SparseVector]:
        results = self._model.embed(
            texts, batch_size=self._batch_size, parallel=self._parallel
        )
        return [
            SparseVector(indices=result.indices.tolist(), values=result.values.tolist())
            for result in results
        ]

    def embed_query(self, text: str) -> SparseVector:
        result = next(self._model.query_embed(text))
        return SparseVector(
            indices=result.indices.tolist(), values=result.values.tolist()
        )
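The data flow in embed_documents and embed_query can be illustrated without installing fastembed or langchain-qdrant. The following is a minimal sketch under stated assumptions, not the library's implementation: the local SparseVector dataclass and everything named Stub... are hypothetical stand-ins, and the toy token ids are not BM25 weights. It only shows the shape of the conversion: the model yields results lazily, and each result's array-like indices and values are turned into plain lists.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterator


@dataclass
class SparseVector:
    """Hypothetical stand-in for langchain_qdrant.sparse_embeddings.SparseVector."""

    indices: list[int]
    values: list[float]


class StubArray(list):
    """Hypothetical stand-in for a NumPy array; only .tolist() is needed here."""

    def tolist(self) -> list:
        return list(self)


@dataclass
class StubResult:
    """Hypothetical stand-in for one sparse embedding result."""

    indices: StubArray
    values: StubArray


class StubModel:
    """Hypothetical model: one (index, weight) pair per distinct token."""

    def _encode(self, text: str) -> StubResult:
        tokens = sorted(set(text.lower().split()))
        # Toy token ids (sum of byte values) and a constant weight; not BM25.
        ids = StubArray(sum(tok.encode()) % 1000 for tok in tokens)
        return StubResult(indices=ids, values=StubArray(1.0 for _ in tokens))

    def embed(self, texts, batch_size=256, parallel=None) -> Iterator[StubResult]:
        for text in texts:  # lazily yields one result per input text
            yield self._encode(text)

    def query_embed(self, text: str) -> Iterator[StubResult]:
        yield self._encode(text)


def embed_documents(model: StubModel, texts: list[str]) -> list[SparseVector]:
    # Mirrors the listing: consume the generator, convert arrays to plain lists.
    results = model.embed(texts, batch_size=2, parallel=None)
    return [SparseVector(r.indices.tolist(), r.values.tolist()) for r in results]


def embed_query(model: StubModel, text: str) -> SparseVector:
    # Mirrors the listing: take the first (and only) result from the generator.
    result = next(model.query_embed(text))
    return SparseVector(result.indices.tolist(), result.values.tolist())


docs = embed_documents(StubModel(), ["sparse vectors", "dense and sparse"])
query = embed_query(StubModel(), "sparse")
```

With the real class, the same two calls would presumably be made as methods on a FastEmbedSparse instance, which requires the fastembed package to be installed.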
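Domain
- LangChainCore
Subdomains
- MessageInterface
Functions
- FastEmbedSparse.__init__
- FastEmbedSparse.embed_documents
- FastEmbedSparse.embed_query
Classes
- FastEmbedSparse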
Dependencies
- collections.abc
- fastembed
- langchain_qdrant.sparse_embeddings
- typing
Source
- libs/partners/qdrant/langchain_qdrant/fastembed_sparse.py
Frequently Asked Questions
What does fastembed_sparse.py do?
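fastembed_sparse.py is a Python source file in the langchain codebase. It defines FastEmbedSparse, a SparseEmbeddings implementation that uses FastEmbed's SparseTextEmbedding model to produce SparseVector embeddings for use with Qdrant. It belongs to the LangChainCore domain, MessageInterface subdomain.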
What functions are defined in fastembed_sparse.py?
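fastembed_sparse.py defines no module-level functions. It defines one class, FastEmbedSparse, with three methods: __init__, embed_documents, and embed_query. (collections.abc is an import, not a function.)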
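One way to check what a Python file defines is to parse it with the standard-library ast module. The sketch below is illustrative only: SOURCE is an abbreviated, hypothetical stand-in for the file (method bodies elided), not the full listing.

```python
import ast

# Abbreviated, hypothetical stand-in for the file's contents (bodies elided).
SOURCE = '''
class FastEmbedSparse(SparseEmbeddings):
    def __init__(self, model_name="Qdrant/bm25", **kwargs):
        pass

    def embed_documents(self, texts):
        pass

    def embed_query(self, text):
        pass
'''

tree = ast.parse(SOURCE)

# Top-level definitions only: direct children of the module node.
top_level_functions = [n.name for n in tree.body if isinstance(n, ast.FunctionDef)]

# Map each top-level class to the methods defined directly in its body.
classes = {
    n.name: [m.name for m in n.body if isinstance(m, ast.FunctionDef)]
    for n in tree.body
    if isinstance(n, ast.ClassDef)
}

print(top_level_functions)
print(classes)
```

Parsing does not execute the code, so the undefined base class name in the stand-in source is harmless here.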
What does fastembed_sparse.py depend on?
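fastembed_sparse.py imports 4 modules: collections.abc, fastembed, langchain_qdrant.sparse_embeddings, and typing. collections.abc is imported only under TYPE_CHECKING, and fastembed is imported lazily inside FastEmbedSparse.__init__, so a missing fastembed installation surfaces as a ValueError when the class is instantiated rather than when the module is imported.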
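The deferred import of an optional dependency can be sketched in isolation. This is a minimal illustration of the pattern, assuming a hypothetical helper named require_optional and a deliberately nonexistent package name; neither is part of langchain or fastembed.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, install_hint: str) -> ModuleType:
    """Import a module on demand, failing with an actionable message."""
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        msg = f"The '{module_name}' package is not installed. {install_hint}"
        # The listing raises ValueError in this situation; the sketch does the same.
        raise ValueError(msg) from err


# A module that is always available imports normally.
json_module = require_optional("json", "It ships with Python.")

# A hypothetical missing package is reported as ValueError with the hint attached.
try:
    require_optional("hypothetical_missing_pkg_xyz", "Install it before use.")
except ValueError as exc:
    error_message = str(exc)
```

Deferring the import keeps the module importable when the optional package is absent, at the cost of reporting the problem later, at construction time.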
Where is fastembed_sparse.py in the architecture?
fastembed_sparse.py is located at libs/partners/qdrant/langchain_qdrant/fastembed_sparse.py (domain: LangChainCore, subdomain: MessageInterface, directory: libs/partners/qdrant/langchain_qdrant).