_agenerate_with_cache() — langchain Function Reference
Architecture documentation for the _agenerate_with_cache() function in chat_models.py from the langchain codebase.
Dependency Diagram
graph TD
    1444b9d3_5ad9_5b23_967b_eb8224746e4f["_agenerate_with_cache()"]
    d009a608_c505_bd50_7200_0de8a69ba4b7["BaseChatModel"]
    1444b9d3_5ad9_5b23_967b_eb8224746e4f -->|defined in| d009a608_c505_bd50_7200_0de8a69ba4b7
    e539ab1d_5151_8ba1_cfe0_47ef5adc1f67["agenerate()"]
    e539ab1d_5151_8ba1_cfe0_47ef5adc1f67 -->|calls| 1444b9d3_5ad9_5b23_967b_eb8224746e4f
    4fc37aed_deb5_e40b_e6f9_e9ec4aba8bf4["_get_llm_string()"]
    1444b9d3_5ad9_5b23_967b_eb8224746e4f -->|calls| 4fc37aed_deb5_e40b_e6f9_e9ec4aba8bf4
    1bde846b_f21e_efdc_e2c0_f16538c870eb["_convert_cached_generations()"]
    1444b9d3_5ad9_5b23_967b_eb8224746e4f -->|calls| 1bde846b_f21e_efdc_e2c0_f16538c870eb
    85ef976a_84b2_6421_f697_db0f396b2444["_should_stream()"]
    1444b9d3_5ad9_5b23_967b_eb8224746e4f -->|calls| 85ef976a_84b2_6421_f697_db0f396b2444
    872a786c_fe60_9c05_0023_8702aaba0a90["_astream()"]
    1444b9d3_5ad9_5b23_967b_eb8224746e4f -->|calls| 872a786c_fe60_9c05_0023_8702aaba0a90
    dc6d41df_9fe4_2cb2_509a_71630863945f["_agenerate()"]
    1444b9d3_5ad9_5b23_967b_eb8224746e4f -->|calls| dc6d41df_9fe4_2cb2_509a_71630863945f
    3aa65704_c798_ec7b_b231_6600cb1a6a44["_gen_info_and_msg_metadata()"]
    1444b9d3_5ad9_5b23_967b_eb8224746e4f -->|calls| 3aa65704_c798_ec7b_b231_6600cb1a6a44
    c17d81b4_b688_fefe_8223_a5273a015f2b["generate_from_stream()"]
    1444b9d3_5ad9_5b23_967b_eb8224746e4f -->|calls| c17d81b4_b688_fefe_8223_a5273a015f2b
    a8e91f99_0430_e067_258b_daa1489b6a67["_agenerate()"]
    1444b9d3_5ad9_5b23_967b_eb8224746e4f -->|calls| a8e91f99_0430_e067_258b_daa1489b6a67
    style 1444b9d3_5ad9_5b23_967b_eb8224746e4f fill:#6366f1,stroke:#818cf8,color:#fff
Source Code
libs/core/langchain_core/language_models/chat_models.py lines 1262–1386
async def _agenerate_with_cache(
    self,
    messages: list[BaseMessage],
    stop: list[str] | None = None,
    run_manager: AsyncCallbackManagerForLLMRun | None = None,
    **kwargs: Any,
) -> ChatResult:
    llm_cache = self.cache if isinstance(self.cache, BaseCache) else get_llm_cache()
    # We should check the cache unless it's explicitly set to False
    # A None cache means we should use the default global cache
    # if it's configured.
    check_cache = self.cache or self.cache is None
    if check_cache:
        if llm_cache:
            llm_string = self._get_llm_string(stop=stop, **kwargs)
            normalized_messages = [
                (
                    msg.model_copy(update={"id": None})
                    if getattr(msg, "id", None) is not None
                    else msg
                )
                for msg in messages
            ]
            prompt = dumps(normalized_messages)
            cache_val = await llm_cache.alookup(prompt, llm_string)
            if isinstance(cache_val, list):
                converted_generations = self._convert_cached_generations(cache_val)
                return ChatResult(generations=converted_generations)
        elif self.cache is None:
            pass
        else:
            msg = "Asked to cache, but no cache found at `langchain.cache`."
            raise ValueError(msg)
    # Apply the rate limiter after checking the cache, since
    # we usually don't want to rate limit cache lookups, but
    # we do want to rate limit API requests.
    if self.rate_limiter:
        await self.rate_limiter.aacquire(blocking=True)
    # If stream is not explicitly set, check if implicitly requested by
    # astream_events() or astream_log(). Bail out if _astream not implemented
    if self._should_stream(
        async_api=True,
        run_manager=run_manager,
        **kwargs,
    ):
        chunks: list[ChatGenerationChunk] = []
        run_id: str | None = (
            f"{LC_ID_PREFIX}-{run_manager.run_id}" if run_manager else None
        )
        yielded = False
        index = -1
        index_type = ""
        async for chunk in self._astream(messages, stop=stop, **kwargs):
            chunk.message.response_metadata = _gen_info_and_msg_metadata(chunk)
            if self.output_version == "v1":
                # Overwrite .content with .content_blocks
                chunk.message = _update_message_content_to_blocks(
                    chunk.message, "v1"
                )
                for block in cast(
                    "list[types.ContentBlock]", chunk.message.content
                ):
                    if block["type"] != index_type:
                        index_type = block["type"]
                        index += 1
                    if "index" not in block:
                        block["index"] = index
            if run_manager:
                if chunk.message.id is None:
                    chunk.message.id = run_id
                await run_manager.on_llm_new_token(
                    cast("str", chunk.message.content), chunk=chunk
                )
            chunks.append(chunk)
            yielded = True
        # Yield a final empty chunk with chunk_position="last" if not yet yielded
        if (
            yielded
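The cache path above is normally exercised indirectly: agenerate() and ainvoke() route through _agenerate_with_cache(), and a cache hit returns before any model call is made. Below is a minimal sketch of that flow, assuming langchain-core's InMemoryCache, set_llm_cache(), and the bundled FakeListChatModel test model are available; none of these names appear on this page and they are used only for illustration.

import asyncio

from langchain_core.caches import InMemoryCache
from langchain_core.globals import set_llm_cache
from langchain_core.language_models.fake_chat_models import FakeListChatModel

# With model.cache left as None, _agenerate_with_cache() falls back to the
# global cache returned by get_llm_cache(), so register one globally.
set_llm_cache(InMemoryCache())

model = FakeListChatModel(responses=["first answer", "second answer"])


async def main() -> None:
    first = await model.ainvoke("hello")   # cache miss: the model generates "first answer"
    second = await model.ainvoke("hello")  # cache hit: alookup() returns the stored generations
    print(first.content)   # first answer
    print(second.content)  # first answer (served from the cache, not "second answer")


asyncio.run(main())

Because model.cache is left at None here, check_cache is truthy and the global cache from get_llm_cache() is used, which matches the fallback branch on the first line of the method body.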
Frequently Asked Questions
What does _agenerate_with_cache() do?
_agenerate_with_cache() is an async method of BaseChatModel in libs/core/langchain_core/language_models/chat_models.py that wraps chat generation with result caching. It resolves the active cache (the model's own cache attribute if it is a BaseCache, otherwise the global cache from get_llm_cache()), normalizes message IDs out of the prompt, and looks up the serialized messages together with the model's llm_string; on a hit it returns the cached generations as a ChatResult. On a miss it applies the model's rate limiter and generates a fresh response, streaming via _astream() when streaming is requested and otherwise falling back to _agenerate().
Where is _agenerate_with_cache() defined?
_agenerate_with_cache() is defined in libs/core/langchain_core/language_models/chat_models.py at line 1262.
What does _agenerate_with_cache() call?
_agenerate_with_cache() calls seven distinct functions: _agenerate() (reached from two call sites), _astream(), _convert_cached_generations(), _gen_info_and_msg_metadata(), _get_llm_string(), _should_stream(), and generate_from_stream(). The (prompt, llm_string) cache key it builds with _get_llm_string() is sketched after this FAQ.
What calls _agenerate_with_cache()?
_agenerate_with_cache() is called by one function: agenerate().
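To make the two-part cache key visible, the sketch below wraps langchain-core's InMemoryCache in a hypothetical LoggingCache that prints the (prompt, llm_string) pair _agenerate_with_cache() hands to alookup() and aupdate(). The class and variable names are illustrative and not part of the documented code.

from collections.abc import Sequence
from typing import Any

from langchain_core.caches import BaseCache, InMemoryCache
from langchain_core.outputs import Generation


class LoggingCache(BaseCache):
    """Hypothetical cache wrapper: prints the two-part key used by
    _agenerate_with_cache() before delegating to an InMemoryCache."""

    def __init__(self) -> None:
        self._inner = InMemoryCache()

    def lookup(self, prompt: str, llm_string: str) -> Sequence[Generation] | None:
        # `prompt` is dumps(normalized_messages); `llm_string` comes from
        # self._get_llm_string(stop=stop, **kwargs) in the source above.
        print("lookup key:", prompt[:60], "|", llm_string[:60])
        return self._inner.lookup(prompt, llm_string)

    def update(self, prompt: str, llm_string: str, return_val: Sequence[Generation]) -> None:
        self._inner.update(prompt, llm_string, return_val)

    def clear(self, **kwargs: Any) -> None:
        self._inner.clear(**kwargs)

Setting such an object on a model (for example by passing cache=LoggingCache() to the constructor) takes the isinstance(self.cache, BaseCache) branch in the source above, so the per-model cache is consulted instead of the global one; cache=True would instead fall back to get_llm_cache(). The sync-only methods suffice here because BaseCache's default alookup()/aupdate() delegate to them.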