XMLOutputParser Class — langchain Architecture

Architecture documentation for the XMLOutputParser class in xml.py from the langchain codebase.

Dependency Diagram

graph TD
  b51fd3dc_a25e_9ad9_8893_3725f5d436f0["XMLOutputParser"]
  31a97691_3fec_0f65_0661_c4d496bb962f["BaseTransformOutputParser"]
  b51fd3dc_a25e_9ad9_8893_3725f5d436f0 -->|extends| 31a97691_3fec_0f65_0661_c4d496bb962f
  f90cc11a_ee3a_6a74_4781_be7b69a7ed22["xml.py"]
  b51fd3dc_a25e_9ad9_8893_3725f5d436f0 -->|defined in| f90cc11a_ee3a_6a74_4781_be7b69a7ed22
  8444f55c_f205_7d2d_13b8_cddf2b5d4999["get_format_instructions()"]
  b51fd3dc_a25e_9ad9_8893_3725f5d436f0 -->|method| 8444f55c_f205_7d2d_13b8_cddf2b5d4999
  698aa558_20db_aa12_47eb_de459e25e934["parse()"]
  b51fd3dc_a25e_9ad9_8893_3725f5d436f0 -->|method| 698aa558_20db_aa12_47eb_de459e25e934
  f2d3a631_3745_7a99_7dfc_8664b6a06b99["_transform()"]
  b51fd3dc_a25e_9ad9_8893_3725f5d436f0 -->|method| f2d3a631_3745_7a99_7dfc_8664b6a06b99
  32340f36_d7fb_913d_0253_43b61c3f0b73["_atransform()"]
  b51fd3dc_a25e_9ad9_8893_3725f5d436f0 -->|method| 32340f36_d7fb_913d_0253_43b61c3f0b73
  bc7e3795_e9e2_9111_e066_976d85188623["_root_to_dict()"]
  b51fd3dc_a25e_9ad9_8893_3725f5d436f0 -->|method| bc7e3795_e9e2_9111_e066_976d85188623
  21b0b8bb_09c2_2f54_3163_e3d62c24b0ca["_type()"]
  b51fd3dc_a25e_9ad9_8893_3725f5d436f0 -->|method| 21b0b8bb_09c2_2f54_3163_e3d62c24b0ca

Source Code

libs/core/langchain_core/output_parsers/xml.py lines 151–285

class XMLOutputParser(BaseTransformOutputParser):
    """Parse an output using xml format.

    Returns a dictionary of tags.
    """

    tags: list[str] | None = None
    """Tags to tell the LLM to expect in the XML output.

    Note this may not be perfect depending on the LLM implementation.

    For example, with `tags=["foo", "bar", "baz"]`:

    1. A well-formatted XML instance:
        `'<foo>\n   <bar>\n      <baz></baz>\n   </bar>\n</foo>'`

    2. A badly-formatted XML instance (missing closing tag for 'bar'):
        `'<foo>\n   <bar>\n   </foo>'`

    3. A badly-formatted XML instance (unexpected 'tag' element):
        `'<foo>\n   <tag>\n   </tag>\n</foo>'`
    """
    encoding_matcher: re.Pattern = re.compile(
        r"<([^>]*encoding[^>]*)>\n(.*)", re.MULTILINE | re.DOTALL
    )

    parser: Literal["defusedxml", "xml"] = "defusedxml"
    """Parser to use for XML parsing.

    Can be either `'defusedxml'` or `'xml'`.

    - `'defusedxml'` is the default parser and is used to prevent XML vulnerabilities
        present in some distributions of Python's standard library xml. `defusedxml` is
        a wrapper around the standard library parser that sets up the parser with secure
        defaults.
    - `'xml'` is the standard library parser.

    !!! warning

        Use `xml` only if you are sure that your distribution of the standard library is
        not vulnerable to XML vulnerabilities.

    Review the following resources for more information:

    * https://docs.python.org/3/library/xml.html#xml-vulnerabilities
    * https://github.com/tiran/defusedxml

    The standard library relies on [`libexpat`](https://github.com/libexpat/libexpat)
    for parsing XML.
    """

    def get_format_instructions(self) -> str:
        """Return the format instructions for the XML output."""
        return XML_FORMAT_INSTRUCTIONS.format(tags=self.tags)

    def parse(self, text: str) -> dict[str, str | list[Any]]:
        """Parse the output of an LLM call.

        Args:
            text: The output of an LLM call.

        Returns:
            A `dict` representing the parsed XML.

        Raises:
            OutputParserException: If the XML is not well-formed.
            ImportError: If `defusedxml` is not installed and the `defusedxml` parser is
                requested.
        """
        # Try to find XML string within triple backticks
        # Imports are temporarily placed here to avoid issue with caching on CI
        # likely if you're reading this you can move them to the top of the file
        if self.parser == "defusedxml":
            if not _HAS_DEFUSEDXML:
                msg = (
                    "defusedxml is not installed. "
                    "Please install it to use the defusedxml parser. "
                    "You can install it with `pip install defusedxml`. "
                    "See https://github.com/tiran/defusedxml for more details."
                )
                raise ImportError(msg)
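
The excerpt stops at the dependency check, so the rest of `parse()` is not shown here. As a rough illustration of the ideas visible above (the optional `defusedxml` guard, the `encoding_matcher` pattern, and the "dictionary of tags" return shape), the sketch below uses only names defined in the block itself. `pick_backend`, `strip_declaration`, and `element_to_dict` are hypothetical helpers invented for this example, and the dict layout is an assumption; none of this is the library's actual implementation.

```python
import re
import xml.etree.ElementTree as ET
from typing import Any

# Optional dependency guard, in the spirit of the `_HAS_DEFUSEDXML` check above.
try:
    from defusedxml import ElementTree as safe_et  # third-party, may be absent
    HAS_DEFUSEDXML = True
except ImportError:
    safe_et = None
    HAS_DEFUSEDXML = False

# Same pattern as the `encoding_matcher` field shown in the excerpt.
ENCODING_MATCHER = re.compile(
    r"<([^>]*encoding[^>]*)>\n(.*)", re.MULTILINE | re.DOTALL
)


def pick_backend(parser: str) -> Any:
    """Hypothetical helper: return the ElementTree-style module to parse with."""
    if parser == "defusedxml":
        if not HAS_DEFUSEDXML:
            raise ImportError(
                "defusedxml is not installed. "
                "Install it with `pip install defusedxml`."
            )
        return safe_et
    return ET  # standard library parser


def strip_declaration(text: str) -> str:
    """Hypothetical helper: drop a leading tag that mentions `encoding`."""
    match = ENCODING_MATCHER.search(text)
    # Group 2 is everything after the newline that follows the matched tag.
    return match.group(2) if match else text


def element_to_dict(element: ET.Element) -> dict[str, Any]:
    """Hypothetical helper: map a leaf to {tag: text}, a parent to {tag: [children]}."""
    children = list(element)
    if not children:
        return {element.tag: element.text or ""}
    return {element.tag: [element_to_dict(child) for child in children]}


raw = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<foo>\n  <bar>hello</bar>\n  <baz>world</baz>\n</foo>"
)
body = strip_declaration(raw)
backend = pick_backend("xml")  # "xml" keeps the demo free of extra installs
result = element_to_dict(backend.fromstring(body))
print(result)  # {'foo': [{'bar': 'hello'}, {'baz': 'world'}]}
```

The demo selects the `"xml"` backend only so that it runs without extra packages; the warning in the `parser` docstring above still applies to untrusted input.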

Frequently Asked Questions

What is the XMLOutputParser class?
XMLOutputParser is an output parser class in the langchain codebase that parses XML-formatted LLM output and returns a dictionary of tags. It is defined in libs/core/langchain_core/output_parsers/xml.py.
Where is XMLOutputParser defined?
XMLOutputParser is defined in libs/core/langchain_core/output_parsers/xml.py at line 151.
What does XMLOutputParser extend?
XMLOutputParser extends BaseTransformOutputParser.
