base.py — langchain Source File
Architecture documentation for base.py, a Python file in the langchain codebase. 15 imports, 0 dependents.
Entity Profile
Dependency Diagram
graph LR
    e013244a_7e0e_baa7_ce3b_16dab4320e45["base.py"]
    e27da29f_a1f7_49f3_84d5_6be4cb4125c8["logging"]
    e013244a_7e0e_baa7_ce3b_16dab4320e45 --> e27da29f_a1f7_49f3_84d5_6be4cb4125c8
    f3365e3c_fb7a_bb9a_bc79_059b06cb7024["warnings"]
    e013244a_7e0e_baa7_ce3b_16dab4320e45 --> f3365e3c_fb7a_bb9a_bc79_059b06cb7024
    2bf6d401_816d_d011_3b05_a6114f55ff58["collections.abc"]
    e013244a_7e0e_baa7_ce3b_16dab4320e45 --> 2bf6d401_816d_d011_3b05_a6114f55ff58
    feec1ec4_6917_867b_d228_b134d0ff8099["typing"]
    e013244a_7e0e_baa7_ce3b_16dab4320e45 --> feec1ec4_6917_867b_d228_b134d0ff8099
    082af17d_b8ac_eccd_d339_93cabe1a9b40["openai"]
    e013244a_7e0e_baa7_ce3b_16dab4320e45 --> 082af17d_b8ac_eccd_d339_93cabe1a9b40
    48f5485f_680a_97b7_bfc7_aff0508d4ca0["tiktoken"]
    e013244a_7e0e_baa7_ce3b_16dab4320e45 --> 48f5485f_680a_97b7_bfc7_aff0508d4ca0
    918b8514_ba55_6df2_7254_4598ec160e33["langchain_core.embeddings"]
    e013244a_7e0e_baa7_ce3b_16dab4320e45 --> 918b8514_ba55_6df2_7254_4598ec160e33
    a8ec7563_2814_99b3_c6da_61c599efc542["langchain_core.runnables.config"]
    e013244a_7e0e_baa7_ce3b_16dab4320e45 --> a8ec7563_2814_99b3_c6da_61c599efc542
    bd035cf2_5933_bc0f_65e9_0dfe57627ca3["langchain_core.utils"]
    e013244a_7e0e_baa7_ce3b_16dab4320e45 --> bd035cf2_5933_bc0f_65e9_0dfe57627ca3
    dd5e7909_a646_84f1_497b_cae69735550e["pydantic"]
    e013244a_7e0e_baa7_ce3b_16dab4320e45 --> dd5e7909_a646_84f1_497b_cae69735550e
    f85fae70_1011_eaec_151c_4083140ae9e5["typing_extensions"]
    e013244a_7e0e_baa7_ce3b_16dab4320e45 --> f85fae70_1011_eaec_151c_4083140ae9e5
    29eba672_199f_3c52_cdbc_39f5c194182e["langchain_openai.chat_models._client_utils"]
    e013244a_7e0e_baa7_ce3b_16dab4320e45 --> 29eba672_199f_3c52_cdbc_39f5c194182e
    d2b62b81_6a74_9153_fcd4_ff7470a9b3d2["httpx"]
    e013244a_7e0e_baa7_ce3b_16dab4320e45 --> d2b62b81_6a74_9153_fcd4_ff7470a9b3d2
    c29ae04f_6e26_fc73_5938_d57db6543f18["transformers"]
    e013244a_7e0e_baa7_ce3b_16dab4320e45 --> c29ae04f_6e26_fc73_5938_d57db6543f18
    style e013244a_7e0e_baa7_ce3b_16dab4320e45 fill:#6366f1,stroke:#818cf8,color:#fff
Relationship Graph
Source Code
"""Base classes for OpenAI embeddings."""
from __future__ import annotations
import logging
import warnings
from collections.abc import Awaitable, Callable, Iterable, Mapping, Sequence
from typing import Any, Literal, cast
import openai
import tiktoken
from langchain_core.embeddings import Embeddings
from langchain_core.runnables.config import run_in_executor
from langchain_core.utils import from_env, get_pydantic_field_names, secret_from_env
from pydantic import BaseModel, ConfigDict, Field, SecretStr, model_validator
from typing_extensions import Self
from langchain_openai.chat_models._client_utils import _resolve_sync_and_async_api_keys
logger = logging.getLogger(__name__)
MAX_TOKENS_PER_REQUEST = 300000
"""API limit per request for embedding tokens."""
def _process_batched_chunked_embeddings(
num_texts: int,
tokens: list[list[int] | str],
batched_embeddings: list[list[float]],
indices: list[int],
skip_empty: bool,
) -> list[list[float] | None]:
# for each text, this is the list of embeddings (list of list of floats)
# corresponding to the chunks of the text
results: list[list[list[float]]] = [[] for _ in range(num_texts)]
# for each text, this is the token length of each chunk
# for transformers tokenization, this is the string length
# for tiktoken, this is the number of tokens
num_tokens_in_batch: list[list[int]] = [[] for _ in range(num_texts)]
for i in range(len(indices)):
if skip_empty and len(batched_embeddings[i]) == 1:
continue
results[indices[i]].append(batched_embeddings[i])
num_tokens_in_batch[indices[i]].append(len(tokens[i]))
# for each text, this is the final embedding
embeddings: list[list[float] | None] = []
for i in range(num_texts):
# an embedding for each chunk
_result: list[list[float]] = results[i]
if len(_result) == 0:
# this will be populated with the embedding of an empty string
# in the sync or async code calling this
embeddings.append(None)
continue
if len(_result) == 1:
// ... (713 more lines)
Dependencies
- collections.abc
- httpx
- langchain_core.embeddings
- langchain_core.runnables.config
- langchain_core.utils
- langchain_openai.chat_models._client_utils
- logging
- openai
- pydantic
- tiktoken
- tqdm.auto
- transformers
- typing
- typing_extensions
- warnings
Frequently Asked Questions
What does base.py do?
base.py is a source file in the langchain codebase, written in Python. It belongs to the LangChainCore domain, LanguageModelBase subdomain.
What functions are defined in base.py?
base.py defines 1 function: _process_batched_chunked_embeddings.
What does base.py depend on?
base.py imports 15 modules: collections.abc, httpx, langchain_core.embeddings, langchain_core.runnables.config, langchain_core.utils, langchain_openai.chat_models._client_utils, logging, openai, pydantic, tiktoken, tqdm.auto, transformers, typing, typing_extensions, and warnings.
Where is base.py in the architecture?
base.py is located at libs/partners/openai/langchain_openai/embeddings/base.py (domain: LangChainCore, subdomain: LanguageModelBase, directory: libs/partners/openai/langchain_openai/embeddings).