pdf.py — langchain Source File
Architecture documentation for pdf.py, a Python file in the langchain codebase. 3 imports, 0 dependents.
Dependency Diagram
graph LR
    602562af_b736_3ad6_35e1_b1ef61b1e5f0["pdf.py"]
    8e2034b7_ceb8_963f_29fc_2ea6b50ef9b3["typing"]
    602562af_b736_3ad6_35e1_b1ef61b1e5f0 --> 8e2034b7_ceb8_963f_29fc_2ea6b50ef9b3
    439a4142_6fa6_fe9a_2cba_7c9fb0cdceb7["langchain_classic._api"]
    602562af_b736_3ad6_35e1_b1ef61b1e5f0 --> 439a4142_6fa6_fe9a_2cba_7c9fb0cdceb7
    2ddc7ce1_7a61_1139_6121_0cfc484372d7["langchain_community.document_loaders.parsers.pdf"]
    602562af_b736_3ad6_35e1_b1ef61b1e5f0 --> 2ddc7ce1_7a61_1139_6121_0cfc484372d7
    style 602562af_b736_3ad6_35e1_b1ef61b1e5f0 fill:#6366f1,stroke:#818cf8,color:#fff
Source Code
from typing import TYPE_CHECKING, Any

from langchain_classic._api import create_importer

if TYPE_CHECKING:
    from langchain_community.document_loaders.parsers.pdf import (
        AmazonTextractPDFParser,
        DocumentIntelligenceParser,
        PDFMinerParser,
        PDFPlumberParser,
        PyMuPDFParser,
        PyPDFium2Parser,
        PyPDFParser,
        extract_from_images_with_rapidocr,
    )

# Create a way to dynamically look up deprecated imports.
# Used to consolidate logic for raising deprecation warnings and
# handling optional imports.
DEPRECATED_LOOKUP = {
    "extract_from_images_with_rapidocr": (
        "langchain_community.document_loaders.parsers.pdf"
    ),
    "PyPDFParser": "langchain_community.document_loaders.parsers.pdf",
    "PDFMinerParser": "langchain_community.document_loaders.parsers.pdf",
    "PyMuPDFParser": "langchain_community.document_loaders.parsers.pdf",
    "PyPDFium2Parser": "langchain_community.document_loaders.parsers.pdf",
    "PDFPlumberParser": "langchain_community.document_loaders.parsers.pdf",
    "AmazonTextractPDFParser": "langchain_community.document_loaders.parsers.pdf",
    "DocumentIntelligenceParser": "langchain_community.document_loaders.parsers.pdf",
}

_import_attribute = create_importer(__package__, deprecated_lookups=DEPRECATED_LOOKUP)


def __getattr__(name: str) -> Any:
    """Look up attributes dynamically."""
    return _import_attribute(name)


__all__ = [
    "AmazonTextractPDFParser",
    "DocumentIntelligenceParser",
    "PDFMinerParser",
    "PDFPlumberParser",
    "PyMuPDFParser",
    "PyPDFParser",
    "PyPDFium2Parser",
    "extract_from_images_with_rapidocr",
]
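The source above relies on a module-level `__getattr__` (PEP 562) backed by a name-to-module lookup table. The following is a minimal sketch of that general pattern, not the actual `create_importer` implementation, whose internals are not shown on this page. The names `make_importer`, `LOOKUP`, and `shim` are hypothetical, and the lookup targets are standard-library modules so the sketch runs without langchain installed.

```python
import importlib
import math
import types
import warnings
from typing import Any, Callable, Dict


def make_importer(package: str, deprecated_lookups: Dict[str, str]) -> Callable[[str], Any]:
    """Hypothetical stand-in for create_importer (an assumption, not the real code)."""

    def import_attribute(name: str) -> Any:
        # Unknown names behave like any other missing module attribute.
        if name not in deprecated_lookups:
            raise AttributeError(f"module {package!r} has no attribute {name!r}")
        new_module = deprecated_lookups[name]
        # Warn on every deprecated lookup and point at the new location.
        warnings.warn(
            f"Importing {name} from {package} is deprecated; "
            f"import it from {new_module} instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        # The target module is imported only when the name is first requested.
        return getattr(importlib.import_module(new_module), name)

    return import_attribute


# Illustrative table: stdlib targets replace the langchain_community path.
LOOKUP = {"sqrt": "math", "dumps": "json"}

# A throwaway module object plays the role of pdf.py.
shim = types.ModuleType("shim")
shim.__getattr__ = make_importer("shim", LOOKUP)

# Attribute access on the shim resolves to the object in the target module.
print(shim.sqrt is math.sqrt)
```

Because resolution happens on attribute access, merely importing the shim does not import any target module; the cost and the warning are deferred until a deprecated name is actually used.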
Domain
- CoreAbstractions
Subdomains
- Serialization
Functions
- __getattr__
Dependencies
- langchain_classic._api
- langchain_community.document_loaders.parsers.pdf
- typing
Frequently Asked Questions
What does pdf.py do?
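pdf.py is a compatibility shim in the langchain codebase, written in Python. It defines no parsers of its own: when one of eight deprecated names (seven PDF parser classes plus extract_from_images_with_rapidocr) is accessed, the lookup is resolved through DEPRECATED_LOOKUP, which maps every name to langchain_community.document_loaders.parsers.pdf. According to the source comments, this consolidates the logic for raising deprecation warnings and handling optional imports. The file belongs to the CoreAbstractions domain, Serialization subdomain.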
What functions are defined in pdf.py?
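pdf.py defines 1 function: __getattr__, a module-level hook that forwards each attribute lookup to _import_attribute, the callable returned by create_importer. langchain_community is an imported package, not a function.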
What does pdf.py depend on?
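pdf.py imports 3 module(s): langchain_classic._api, langchain_community.document_loaders.parsers.pdf, typing. The langchain_community import sits under `if TYPE_CHECKING:`, so it is evaluated only by static type checkers and not when the module is loaded.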
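To illustrate why an import guarded by `TYPE_CHECKING` adds no load-time dependency, here is a small sketch. The package name `hypothetical_missing_package` is invented for the example and is assumed not to be installed; the script still runs because the guarded import is never executed.

```python
import sys
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Static type checkers follow this import; the interpreter skips it,
    # because TYPE_CHECKING is False at runtime.
    import hypothetical_missing_package  # hypothetical name, assumed absent

# The guarded import never ran, so the module was never loaded.
guarded_import_ran = "hypothetical_missing_package" in sys.modules
print(TYPE_CHECKING, guarded_import_ran)
```

In pdf.py the same guard gives type checkers and IDEs the real class definitions, while at runtime the names are resolved only through `__getattr__`.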
Where is pdf.py in the architecture?
pdf.py is located at libs/langchain/langchain_classic/document_loaders/parsers/pdf.py (domain: CoreAbstractions, subdomain: Serialization, directory: libs/langchain/langchain_classic/document_loaders/parsers).