LLMChainExtractor Class — langchain Architecture
Architecture documentation for the LLMChainExtractor class in chain_extract.py from the langchain codebase.
Dependency Diagram
graph TD
    704a479f_76d4_f2a2_69ac_65c67ff7575d["LLMChainExtractor"]
    1c219081_6061_3fb9_0ccd_08e0b97c9474["BaseDocumentCompressor"]
    704a479f_76d4_f2a2_69ac_65c67ff7575d -->|extends| 1c219081_6061_3fb9_0ccd_08e0b97c9474
    8d3a235d_a08f_2979_f52a_1772067dd1d3["LLMChain"]
    704a479f_76d4_f2a2_69ac_65c67ff7575d -->|references| 8d3a235d_a08f_2979_f52a_1772067dd1d3
    0eb98ce3_194e_b222_d9bd_d9a220621a2c["chain_extract.py"]
    704a479f_76d4_f2a2_69ac_65c67ff7575d -->|defined in| 0eb98ce3_194e_b222_d9bd_d9a220621a2c
    b9d45a28_6355_5ac9_1bfb_fb26756ac3b7["compress_documents()"]
    704a479f_76d4_f2a2_69ac_65c67ff7575d -->|method| b9d45a28_6355_5ac9_1bfb_fb26756ac3b7
    7fd870dd_cd72_7f8f_cdc8_d1a14975590b["acompress_documents()"]
    704a479f_76d4_f2a2_69ac_65c67ff7575d -->|method| 7fd870dd_cd72_7f8f_cdc8_d1a14975590b
    6db53bdd_2433_3dd0_413a_73c7a7fc118d["from_llm()"]
    704a479f_76d4_f2a2_69ac_65c67ff7575d -->|method| 6db53bdd_2433_3dd0_413a_73c7a7fc118d
Source Code
libs/langchain/langchain_classic/retrievers/document_compressors/chain_extract.py lines 51–126
class LLMChainExtractor(BaseDocumentCompressor):
    """LLM Chain Extractor.

    Document compressor that uses an LLM chain to extract
    the relevant parts of documents.
    """

    llm_chain: Runnable
    """LLM wrapper to use for compressing documents."""

    get_input: Callable[[str, Document], dict] = default_get_input
    """Callable for constructing the chain input from the query and a Document."""

    model_config = ConfigDict(
        arbitrary_types_allowed=True,
    )

    def compress_documents(
        self,
        documents: Sequence[Document],
        query: str,
        callbacks: Callbacks | None = None,
    ) -> Sequence[Document]:
        """Compress page content of raw documents."""
        compressed_docs = []
        for doc in documents:
            _input = self.get_input(query, doc)
            output_ = self.llm_chain.invoke(_input, config={"callbacks": callbacks})
            if isinstance(self.llm_chain, LLMChain):
                output = output_[self.llm_chain.output_key]
                if self.llm_chain.prompt.output_parser is not None:
                    output = self.llm_chain.prompt.output_parser.parse(output)
            else:
                output = output_
            if len(output) == 0:
                continue
            compressed_docs.append(
                Document(page_content=cast("str", output), metadata=doc.metadata),
            )
        return compressed_docs

    async def acompress_documents(
        self,
        documents: Sequence[Document],
        query: str,
        callbacks: Callbacks | None = None,
    ) -> Sequence[Document]:
        """Compress page content of raw documents asynchronously."""
        inputs = [self.get_input(query, doc) for doc in documents]
        outputs = await self.llm_chain.abatch(inputs, {"callbacks": callbacks})
        compressed_docs = []
        for i, doc in enumerate(documents):
            if len(outputs[i]) == 0:
                continue
            compressed_docs.append(
                Document(page_content=outputs[i], metadata=doc.metadata),
            )
        return compressed_docs

    @classmethod
    def from_llm(
        cls,
        llm: BaseLanguageModel,
        prompt: PromptTemplate | None = None,
        get_input: Callable[[str, Document], str] | None = None,
        llm_chain_kwargs: dict | None = None,  # noqa: ARG003
    ) -> LLMChainExtractor:
        """Initialize from LLM."""
        _prompt = prompt if prompt is not None else _get_default_chain_prompt()
        _get_input = get_input if get_input is not None else default_get_input
        if _prompt.output_parser is not None:
            parser = _prompt.output_parser
        else:
            parser = StrOutputParser()
        llm_chain = _prompt | llm | parser
        return cls(llm_chain=llm_chain, get_input=_get_input)
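
The control flow of compress_documents() and acompress_documents() in the listing above can be illustrated without any langchain dependency. The following is a minimal sketch, not the library's implementation: Doc, toy_get_input, toy_chain, compress, and acompress are hypothetical stand-ins introduced here, the toy "chain" is a plain keyword filter rather than an LLM call, and asyncio.gather is used as a stand-in for the chain's abatch call. The legacy LLMChain branch of the synchronous method is omitted.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Doc:
    """Hypothetical stand-in for langchain's Document."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def toy_get_input(query: str, doc: Doc) -> dict:
    """Build the chain input from the query and one document (illustrative keys)."""
    return {"question": query, "context": doc.page_content}


def toy_chain(inputs: dict) -> str:
    """Hypothetical 'chain': keep only the sentences that mention the query term."""
    term = inputs["question"].lower()
    sentences = [s.strip() for s in inputs["context"].split(".") if s.strip()]
    return ". ".join(s for s in sentences if term in s.lower())


def compress(docs: list, query: str, chain: Callable[[dict], str]) -> list:
    """Sequential path: one chain call per document; empty extractions are dropped."""
    compressed = []
    for doc in docs:
        extracted = chain(toy_get_input(query, doc))
        if len(extracted) == 0:
            continue  # nothing relevant was extracted, so the document is skipped
        # The original metadata is carried over to the compressed document.
        compressed.append(Doc(page_content=extracted, metadata=doc.metadata))
    return compressed


async def acompress(docs: list, query: str, chain: Callable[[dict], str]) -> list:
    """Batched path: build every input first, run them together, then filter."""
    inputs = [toy_get_input(query, doc) for doc in docs]
    # asyncio.gather returns results in input order, so outputs line up with docs.
    outputs = await asyncio.gather(*(asyncio.to_thread(chain, i) for i in inputs))
    return [
        Doc(page_content=out, metadata=doc.metadata)
        for out, doc in zip(outputs, docs)
        if len(out) > 0
    ]


docs = [
    Doc("Paris is the capital of France. It rains often", {"id": 1}),
    Doc("Bananas are yellow", {"id": 2}),
]
sync_result = compress(docs, "paris", toy_chain)
async_result = asyncio.run(acompress(docs, "paris", toy_chain))
```

In this sketch both paths return a single document (the second input yields an empty extraction and is dropped), and the surviving document keeps its original metadata. The main structural difference mirrored from the listing is that the synchronous method calls the chain once per document inside the loop, while the asynchronous method builds all inputs up front and submits them as one batch.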
Frequently Asked Questions
What is the LLMChainExtractor class?
LLMChainExtractor is a document compressor in the langchain codebase: it uses an LLM chain to extract the relevant parts of each document for a given query.
Where is LLMChainExtractor defined?
LLMChainExtractor is defined in libs/langchain/langchain_classic/retrievers/document_compressors/chain_extract.py at line 51.
What does LLMChainExtractor extend?
LLMChainExtractor extends BaseDocumentCompressor. LLMChain is not a base class; it is referenced only in an isinstance check inside compress_documents().