PandasDataFrameOutputParser Class — langchain Architecture

Architecture documentation for the PandasDataFrameOutputParser class in pandas_dataframe.py from the langchain codebase.

Dependency Diagram

graph TD
  3637e9ea_cbb7_422b_bcb3_aa36c4e8aea5["PandasDataFrameOutputParser"]
  de401477_2c08_fd8e_e089_bc90d268b35b["pandas_dataframe.py"]
  3637e9ea_cbb7_422b_bcb3_aa36c4e8aea5 -->|defined in| de401477_2c08_fd8e_e089_bc90d268b35b
  857c45f9_f55a_c767_8d26_7140fd3c5fae["_validate_dataframe()"]
  3637e9ea_cbb7_422b_bcb3_aa36c4e8aea5 -->|method| 857c45f9_f55a_c767_8d26_7140fd3c5fae
  92fee6b5_5c28_2655_cdf7_86370f3fb843["parse_array()"]
  3637e9ea_cbb7_422b_bcb3_aa36c4e8aea5 -->|method| 92fee6b5_5c28_2655_cdf7_86370f3fb843
  43f74c43_859a_4aa3_8865_6d9513a2174a["parse()"]
  3637e9ea_cbb7_422b_bcb3_aa36c4e8aea5 -->|method| 43f74c43_859a_4aa3_8865_6d9513a2174a
  e8d0de31_9e8d_c310_7765_c059f82af152["get_format_instructions()"]
  3637e9ea_cbb7_422b_bcb3_aa36c4e8aea5 -->|method| e8d0de31_9e8d_c310_7765_c059f82af152

Source Code

libs/langchain/langchain_classic/output_parsers/pandas_dataframe.py lines 14–171

class PandasDataFrameOutputParser(BaseOutputParser[dict[str, Any]]):
    """Parse an output using Pandas DataFrame format."""

    """The Pandas DataFrame to parse."""
    dataframe: Any

    @field_validator("dataframe")
    @classmethod
    def _validate_dataframe(cls, val: Any) -> Any:
        import pandas as pd

        if issubclass(type(val), pd.DataFrame):
            return val
        if pd.DataFrame(val).empty:
            msg = "DataFrame cannot be empty."
            raise ValueError(msg)

        msg = "Wrong type for 'dataframe', must be a subclass \
                of Pandas DataFrame (pd.DataFrame)"
        raise TypeError(msg)

    def parse_array(
        self,
        array: str,
        original_request_params: str,
    ) -> tuple[list[int | str], str]:
        """Parse the array from the request parameters.

        Args:
            array: The array string to parse.
            original_request_params: The original request parameters string.

        Returns:
            A tuple containing the parsed array and the stripped request parameters.

        Raises:
            OutputParserException: If the array format is invalid or cannot be parsed.
        """
        parsed_array: list[int | str] = []

        # Check if the format is [1,3,5]
        if re.match(r"\[\d+(,\s*\d+)*\]", array):
            parsed_array = [int(i) for i in re.findall(r"\d+", array)]
        # Check if the format is [1..5]
        elif re.match(r"\[(\d+)\.\.(\d+)\]", array):
            match = re.match(r"\[(\d+)\.\.(\d+)\]", array)
            if match:
                start, end = map(int, match.groups())
                parsed_array = list(range(start, end + 1))
            else:
                msg = f"Unable to parse the array provided in {array}. \
                        Please check the format instructions."
                raise OutputParserException(msg)
        # Check if the format is ["column_name"]
        elif re.match(r"\[[a-zA-Z0-9_]+(?:,[a-zA-Z0-9_]+)*\]", array):
            match = re.match(r"\[[a-zA-Z0-9_]+(?:,[a-zA-Z0-9_]+)*\]", array)
            if match:
                parsed_array = list(map(str, match.group().strip("[]").split(",")))
            else:
                msg = f"Unable to parse the array provided in {array}. \
                        Please check the format instructions."
                raise OutputParserException(msg)

        # Validate the array
        if not parsed_array:
            msg = f"Invalid array format in '{original_request_params}'. \
                    Please check the format instructions."
            raise OutputParserException(msg)
        if (
            isinstance(parsed_array[0], int)
            and parsed_array[-1] > self.dataframe.index.max()
        ):
            msg = f"The maximum index {parsed_array[-1]} exceeds the maximum index of \
                    the Pandas DataFrame {self.dataframe.index.max()}."
            raise OutputParserException(msg)

        return parsed_array, original_request_params.split("[", maxsplit=1)[0]

    @override
    def parse(self, request: str) -> dict[str, Any]:
        stripped_request_params = None
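
The three selector formats that parse_array() accepts can be tried in isolation with the sketch below. It is a standalone approximation, not the library code: the function name parse_array_spec and the max_index parameter are hypothetical, max_index stands in for self.dataframe.index.max(), ValueError stands in for OutputParserException, and the request strings are illustrative only. The regular expressions and the final split are copied from the excerpt above.

```python
from __future__ import annotations

import re


def parse_array_spec(
    array: str, original_request_params: str, max_index: int
) -> tuple[list[int | str], str]:
    """Hypothetical stand-in for parse_array(); max_index replaces the DataFrame lookup."""
    parsed: list[int | str] = []

    # Format [1,3,5]: an explicit list of integer indices.
    if re.match(r"\[\d+(,\s*\d+)*\]", array):
        parsed = [int(i) for i in re.findall(r"\d+", array)]
    # Format [1..5]: an inclusive integer range.
    elif m := re.match(r"\[(\d+)\.\.(\d+)\]", array):
        start, end = map(int, m.groups())
        parsed = list(range(start, end + 1))
    # Format [col_a,col_b]: one or more column names.
    elif m := re.match(r"\[[a-zA-Z0-9_]+(?:,[a-zA-Z0-9_]+)*\]", array):
        parsed = m.group().strip("[]").split(",")

    # An unmatched format or an empty range such as [5..1] leaves the list empty.
    if not parsed:
        raise ValueError(f"Invalid array format in {original_request_params!r}.")
    # As in the excerpt, only the last integer is compared with the maximum index.
    if isinstance(parsed[0], int) and parsed[-1] > max_index:
        raise ValueError(f"Index {parsed[-1]} exceeds the maximum index {max_index}.")

    # Everything before the first "[" is returned as the stripped request.
    return parsed, original_request_params.split("[", maxsplit=1)[0]


if __name__ == "__main__":
    print(parse_array_spec("[1..3]", "mean:num_legs[1..3]", max_index=5))
    # → ([1, 2, 3], 'mean:num_legs')
```

Two properties of the excerpt carry over to the sketch. re.match anchors only at the start of the string, so text after the closing bracket is not rejected by these patterns. The bounds check inspects only the last element, so an unsorted index list is validated against its final value rather than its largest one.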

Frequently Asked Questions

What is the PandasDataFrameOutputParser class?
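PandasDataFrameOutputParser is an output parser class in the langchain codebase. It subclasses BaseOutputParser[dict[str, Any]] and parses output using the Pandas DataFrame format. Its methods are _validate_dataframe(), parse_array(), parse(), and get_format_instructions().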
Where is PandasDataFrameOutputParser defined?
PandasDataFrameOutputParser is defined in libs/langchain/langchain_classic/output_parsers/pandas_dataframe.py at line 14.
