pdf.py — langchain Source File

Architecture documentation for pdf.py, a Python file in the langchain codebase. 3 imports, 0 dependents.

File · Python · CoreAbstractions · Serialization · 3 imports · 1 function

Dependency Diagram

graph LR
  602562af_b736_3ad6_35e1_b1ef61b1e5f0["pdf.py"]
  8e2034b7_ceb8_963f_29fc_2ea6b50ef9b3["typing"]
  602562af_b736_3ad6_35e1_b1ef61b1e5f0 --> 8e2034b7_ceb8_963f_29fc_2ea6b50ef9b3
  439a4142_6fa6_fe9a_2cba_7c9fb0cdceb7["langchain_classic._api"]
  602562af_b736_3ad6_35e1_b1ef61b1e5f0 --> 439a4142_6fa6_fe9a_2cba_7c9fb0cdceb7
  2ddc7ce1_7a61_1139_6121_0cfc484372d7["langchain_community.document_loaders.parsers.pdf"]
  602562af_b736_3ad6_35e1_b1ef61b1e5f0 --> 2ddc7ce1_7a61_1139_6121_0cfc484372d7
  style 602562af_b736_3ad6_35e1_b1ef61b1e5f0 fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

from typing import TYPE_CHECKING, Any

from langchain_classic._api import create_importer

if TYPE_CHECKING:
    from langchain_community.document_loaders.parsers.pdf import (
        AmazonTextractPDFParser,
        DocumentIntelligenceParser,
        PDFMinerParser,
        PDFPlumberParser,
        PyMuPDFParser,
        PyPDFium2Parser,
        PyPDFParser,
        extract_from_images_with_rapidocr,
    )

# Create a way to dynamically look up deprecated imports.
# Used to consolidate logic for raising deprecation warnings and
# handling optional imports.
DEPRECATED_LOOKUP = {
    "extract_from_images_with_rapidocr": (
        "langchain_community.document_loaders.parsers.pdf"
    ),
    "PyPDFParser": "langchain_community.document_loaders.parsers.pdf",
    "PDFMinerParser": "langchain_community.document_loaders.parsers.pdf",
    "PyMuPDFParser": "langchain_community.document_loaders.parsers.pdf",
    "PyPDFium2Parser": "langchain_community.document_loaders.parsers.pdf",
    "PDFPlumberParser": "langchain_community.document_loaders.parsers.pdf",
    "AmazonTextractPDFParser": "langchain_community.document_loaders.parsers.pdf",
    "DocumentIntelligenceParser": "langchain_community.document_loaders.parsers.pdf",
}

_import_attribute = create_importer(__package__, deprecated_lookups=DEPRECATED_LOOKUP)


def __getattr__(name: str) -> Any:
    """Look up attributes dynamically."""
    return _import_attribute(name)


__all__ = [
    "AmazonTextractPDFParser",
    "DocumentIntelligenceParser",
    "PDFMinerParser",
    "PDFPlumberParser",
    "PyMuPDFParser",
    "PyPDFParser",
    "PyPDFium2Parser",
    "extract_from_images_with_rapidocr",
]
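
The lazy lookup in the listing above can be illustrated with a small standalone sketch. It does not reproduce the real create_importer from langchain_classic._api, whose internals are not shown here. The helper make_importer, the LOOKUP table, and the shim module are hypothetical names, and standard-library targets stand in for the langchain_community parsers so that the example runs without extra packages.

```python
import importlib
import types
import warnings
from typing import Any, Callable


def make_importer(deprecated_lookups: dict[str, str]) -> Callable[[str], Any]:
    """Hypothetical stand-in for create_importer (not the langchain code)."""

    def import_attribute(name: str) -> Any:
        if name not in deprecated_lookups:
            # Unknown names must raise AttributeError so hasattr() works.
            raise AttributeError(f"no attribute {name!r}")
        module_path = deprecated_lookups[name]
        warnings.warn(
            f"{name} has moved to {module_path}; import it from there.",
            DeprecationWarning,
            stacklevel=2,
        )
        # The target module is imported only when the name is requested.
        module = importlib.import_module(module_path)
        return getattr(module, name)

    return import_attribute


# Assumption: stdlib names replace the PDF parser names for illustration.
LOOKUP = {"OrderedDict": "collections", "sqrt": "math"}

# A module-level __getattr__ (PEP 562) is called for names the module lacks.
shim = types.ModuleType("shim")
shim.__getattr__ = make_importer(LOOKUP)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    resolved = shim.OrderedDict  # triggers the lazy import and the warning
```

In this sketch, accessing shim.OrderedDict resolves to collections.OrderedDict and records a DeprecationWarning, while a name missing from LOOKUP raises AttributeError. Whether the real importer warns in exactly this way is an assumption drawn from the comment above DEPRECATED_LOOKUP.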

Domain

CoreAbstractions

Subdomains

Serialization

Functions

__getattr__

Dependencies

langchain_classic._api
langchain_community.document_loaders.parsers.pdf
typing

Frequently Asked Questions

What does pdf.py do?

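pdf.py is a backward-compatibility shim in the langchain codebase, written in Python. It defines no parsers of its own: a module-level __getattr__ passes each requested name to an importer built by create_importer, and DEPRECATED_LOOKUP maps the eight exported names (seven PDF parser classes plus extract_from_images_with_rapidocr) to langchain_community.document_loaders.parsers.pdf. According to the source comments, this consolidates the logic for raising deprecation warnings and handling optional imports. The file belongs to the CoreAbstractions domain, Serialization subdomain.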

What functions are defined in pdf.py?

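pdf.py defines 1 function: __getattr__, a module-level hook that delegates attribute lookups to _import_attribute. langchain_community is an imported package, not a function.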

What does pdf.py depend on?

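pdf.py imports 3 module(s): langchain_classic._api, langchain_community.document_loaders.parsers.pdf, typing. The langchain_community import sits under an if TYPE_CHECKING: guard, so static type checkers see it but it is not executed at runtime.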
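
In the source, the langchain_community import is placed under an if TYPE_CHECKING: guard. A minimal sketch of why such a guarded import does not run at runtime follows; some_optional_package and SomeParser are hypothetical names, chosen so that the import would fail if it were ever executed.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Static type checkers follow this branch. At runtime TYPE_CHECKING is
    # False, so this hypothetical package is never imported.
    from some_optional_package import SomeParser

# The guarded name was never bound, because the branch above did not run.
runtime_import_ran = "SomeParser" in globals()
```

This is why the file can be imported even when the optional dependency is absent: the real import is deferred until a name is actually requested through __getattr__.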

Where is pdf.py in the architecture?

pdf.py is located at libs/langchain/langchain_classic/document_loaders/parsers/pdf.py (domain: CoreAbstractions, subdomain: Serialization, directory: libs/langchain/langchain_classic/document_loaders/parsers).
