FastEmbedSparse Class — langchain Architecture
Architecture documentation for the FastEmbedSparse class in fastembed_sparse.py from the langchain codebase.
Dependency Diagram
graph TD
    0a59d8d5_2457_e267_a195_a6431d0e41e9["FastEmbedSparse"]
    d2c50637_94ac_9030_8f99_d10858fb4c29["SparseEmbeddings"]
    0a59d8d5_2457_e267_a195_a6431d0e41e9 -->|extends| d2c50637_94ac_9030_8f99_d10858fb4c29
    1610162b_3386_8d1b_c654_904b72105353["fastembed_sparse.py"]
    0a59d8d5_2457_e267_a195_a6431d0e41e9 -->|defined in| 1610162b_3386_8d1b_c654_904b72105353
    72abf24a_db3e_7ba5_3186_9f5df0ec3fa4["__init__()"]
    0a59d8d5_2457_e267_a195_a6431d0e41e9 -->|method| 72abf24a_db3e_7ba5_3186_9f5df0ec3fa4
    1d3e82ef_f0fe_231c_ca78_f84ebc04c182["embed_documents()"]
    0a59d8d5_2457_e267_a195_a6431d0e41e9 -->|method| 1d3e82ef_f0fe_231c_ca78_f84ebc04c182
    4e6a2169_e6c2_3147_a2ee_ad6ed4131649["embed_query()"]
    0a59d8d5_2457_e267_a195_a6431d0e41e9 -->|method| 4e6a2169_e6c2_3147_a2ee_ad6ed4131649
Source Code
libs/partners/qdrant/langchain_qdrant/fastembed_sparse.py lines 11–84
class FastEmbedSparse(SparseEmbeddings):
    """An interface for sparse embedding models to use with Qdrant."""

    def __init__(
        self,
        model_name: str = "Qdrant/bm25",
        batch_size: int = 256,
        cache_dir: str | None = None,
        threads: int | None = None,
        providers: Sequence[Any] | None = None,
        parallel: int | None = None,
        **kwargs: Any,
    ) -> None:
        """Sparse encoder implementation using FastEmbed.

        Uses [FastEmbed](https://qdrant.github.io/fastembed/) for sparse text
        embeddings.
        For a list of available models, see [the Qdrant docs](https://qdrant.github.io/fastembed/examples/Supported_Models/).

        Args:
            model_name (str): The name of the model to use.
            batch_size (int): Batch size for encoding.
            cache_dir (str, optional): The path to the model cache directory.\
                Can also be set using the\
                `FASTEMBED_CACHE_PATH` env variable.
            threads (int, optional): The number of threads onnxruntime session can use.
            providers (Sequence[Any], optional): List of ONNX execution providers.
            parallel (int, optional): If `>1`, data-parallel encoding will be used.\
                Recommended for encoding of large datasets.\
                If `0`, use all available cores.\
                If `None`, don't use data-parallel processing,\
                use default onnxruntime threading instead.
            kwargs: Additional options to pass to `fastembed.SparseTextEmbedding`

        Raises:
            ValueError: If the `model_name` is not supported in `SparseTextEmbedding`.
        """
        try:
            from fastembed import (  # type: ignore[import-not-found]  # noqa: PLC0415
                SparseTextEmbedding,
            )
        except ImportError as err:
            msg = (
                "The 'fastembed' package is not installed. "
                "Please install it with "
                "`pip install fastembed` or `pip install fastembed-gpu`."
            )
            raise ValueError(msg) from err
        self._batch_size = batch_size
        self._parallel = parallel
        self._model = SparseTextEmbedding(
            model_name=model_name,
            cache_dir=cache_dir,
            threads=threads,
            providers=providers,
            **kwargs,
        )

    def embed_documents(self, texts: list[str]) -> list[SparseVector]:
        results = self._model.embed(
            texts, batch_size=self._batch_size, parallel=self._parallel
        )
        return [
            SparseVector(indices=result.indices.tolist(), values=result.values.tolist())
            for result in results
        ]

    def embed_query(self, text: str) -> SparseVector:
        result = next(self._model.query_embed(text))
        return SparseVector(
            indices=result.indices.tolist(), values=result.values.tolist()
        )
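The conversion step shared by `embed_documents()` and `embed_query()` can be sketched without installing FastEmbed. The block below is only an illustration under stated assumptions: `FakeSparseModel` and `FakeSparseResult` are hypothetical stand-ins for the FastEmbed model and its results, the local `SparseVector` dataclass merely mirrors the `indices`/`values` fields used in the listing, and `array.array` is used because it offers `.tolist()` in the same way the listing expects of the real result arrays.

```python
from array import array
from dataclasses import dataclass
from typing import Iterator


@dataclass
class SparseVector:
    """Stand-in for the SparseVector used in the listing (assumed: indices + values)."""

    indices: list[int]
    values: list[float]


@dataclass
class FakeSparseResult:
    """Hypothetical model output; array.array provides .tolist() like the real arrays."""

    indices: array
    values: array


class FakeSparseModel:
    """Hypothetical model: index = word length, value = number of words of that length."""

    def embed(self, texts, batch_size=256, parallel=None) -> Iterator[FakeSparseResult]:
        # batch_size and parallel are accepted only to mirror the call in the listing.
        for text in texts:
            counts = {}
            for word in text.split():
                counts[len(word)] = counts.get(len(word), 0.0) + 1.0
            keys = sorted(counts)
            yield FakeSparseResult(array("i", keys), array("d", [counts[k] for k in keys]))

    def query_embed(self, text) -> Iterator[FakeSparseResult]:
        # Like the listing's query path, this yields a single result lazily.
        yield from self.embed([text])


def to_sparse_vector(result) -> SparseVector:
    """Convert array-like fields to plain Python lists, as both methods do."""
    return SparseVector(indices=result.indices.tolist(), values=result.values.tolist())


model = FakeSparseModel()

# Document path: iterate over every result the model yields.
doc_vectors = [to_sparse_vector(r) for r in model.embed(["hello world", "a bb a"], batch_size=2)]

# Query path: take only the first result from the generator with next().
query_vector = to_sparse_vector(next(model.query_embed("hello")))
```

Both paths end in the same conversion; the only difference is that the document path consumes the whole iterator while the query path takes a single item with `next()`.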
Frequently Asked Questions
What is the FastEmbedSparse class?
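FastEmbedSparse is a sparse text embedding class in the langchain codebase, intended for use with Qdrant. It wraps FastEmbed's SparseTextEmbedding model and is defined in libs/partners/qdrant/langchain_qdrant/fastembed_sparse.py.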
Where is FastEmbedSparse defined?
FastEmbedSparse is defined in libs/partners/qdrant/langchain_qdrant/fastembed_sparse.py at line 11.
What does FastEmbedSparse extend?
FastEmbedSparse extends SparseEmbeddings.
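To illustrate what extending a base class of this kind involves, here is a minimal sketch. It assumes the base class declares the two methods that FastEmbedSparse overrides (`embed_documents()` and `embed_query()`) as abstract; `SparseEmbeddingsSketch` and `CharCountSparse` are hypothetical names invented for this example and are not part of langchain, and plain dictionaries stand in for the real vector type.

```python
from abc import ABC, abstractmethod


class SparseEmbeddingsSketch(ABC):
    """Hypothetical stand-in for the base class; the real interface may differ."""

    @abstractmethod
    def embed_documents(self, texts: list[str]) -> list[dict]:
        """Embed several documents."""

    @abstractmethod
    def embed_query(self, text: str) -> dict:
        """Embed a single query."""


class CharCountSparse(SparseEmbeddingsSketch):
    """Toy subclass: index = Unicode code point, value = occurrence count."""

    def embed_query(self, text: str) -> dict:
        counts = {}
        for ch in text:
            counts[ord(ch)] = counts.get(ord(ch), 0.0) + 1.0
        indices = sorted(counts)
        return {"indices": indices, "values": [counts[i] for i in indices]}

    def embed_documents(self, texts: list[str]) -> list[dict]:
        # Reuse the single-text path for each document.
        return [self.embed_query(t) for t in texts]


encoder = CharCountSparse()
query = encoder.embed_query("aab")
docs = encoder.embed_documents(["ab", "b"])
```

Because both methods are abstract in this sketch, instantiating `SparseEmbeddingsSketch` directly raises `TypeError`; only a subclass that implements both can be created.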