_StreamingParser Class — langchain Architecture

Architecture documentation for the _StreamingParser class in xml.py from the langchain codebase.

Dependency Diagram

graph TD
  7c8ef0f6_8408_1c9b_5ed7_f17f7f527cbc["_StreamingParser"]
  b9553aad_b797_0a7b_73ed_8d05b0819c0f["BaseMessage"]
  7c8ef0f6_8408_1c9b_5ed7_f17f7f527cbc -->|uses| b9553aad_b797_0a7b_73ed_8d05b0819c0f
  f90cc11a_ee3a_6a74_4781_be7b69a7ed22["xml.py"]
  7c8ef0f6_8408_1c9b_5ed7_f17f7f527cbc -->|defined in| f90cc11a_ee3a_6a74_4781_be7b69a7ed22
  50deded1_918c_1dd3_112f_623225519701["__init__()"]
  7c8ef0f6_8408_1c9b_5ed7_f17f7f527cbc -->|method| 50deded1_918c_1dd3_112f_623225519701
  e9426db7_ce58_59db_1f0b_befb5eb1d7ae["parse()"]
  7c8ef0f6_8408_1c9b_5ed7_f17f7f527cbc -->|method| e9426db7_ce58_59db_1f0b_befb5eb1d7ae
  6ac405fd_5160_74ca_d2a3_72428bbea335["close()"]
  7c8ef0f6_8408_1c9b_5ed7_f17f7f527cbc -->|method| 6ac405fd_5160_74ca_d2a3_72428bbea335

Source Code

libs/core/langchain_core/output_parsers/xml.py lines 42–148

class _StreamingParser:
    """Streaming parser for XML.

    This implementation is pulled into a class to avoid implementation drift between
    `transform` and `atransform` of the `XMLOutputParser`.
    """

    def __init__(self, parser: Literal["defusedxml", "xml"]) -> None:
        """Initialize the streaming parser.

        Args:
            parser: Parser to use for XML parsing.

                Can be either `'defusedxml'` or `'xml'`. See documentation in
                `XMLOutputParser` for more information.

        Raises:
            ImportError: If `defusedxml` is not installed and the `defusedxml` parser is
                requested.
        """
        if parser == "defusedxml":
            if not _HAS_DEFUSEDXML:
                msg = (
                    "defusedxml is not installed. "
                    "Please install it to use the defusedxml parser. "
                    "You can install it with `pip install defusedxml`."
                )
                raise ImportError(msg)
            parser_ = XMLParser(target=TreeBuilder())
        else:
            parser_ = None
        self.pull_parser = ET.XMLPullParser(["start", "end"], _parser=parser_)
        self.xml_start_re = re.compile(r"<[a-zA-Z:_]")
        self.current_path: list[str] = []
        self.current_path_has_children = False
        self.buffer = ""
        self.xml_started = False

    def parse(self, chunk: str | BaseMessage) -> Iterator[AddableDict]:
        """Parse a chunk of text.

        Args:
            chunk: A chunk of text to parse. This can be a `str` or a `BaseMessage`.

        Yields:
            A `dict` representing the parsed XML element.

        Raises:
            xml.etree.ElementTree.ParseError: If the XML is not well-formed.
        """
        if isinstance(chunk, BaseMessage):
            # extract text
            chunk_content = chunk.content
            if not isinstance(chunk_content, str):
                # ignore non-string messages (e.g., function calls)
                return
            chunk = chunk_content
        # add chunk to buffer of unprocessed text
        self.buffer += chunk
        # if xml string hasn't started yet, continue to next chunk
        if not self.xml_started:
            if match := self.xml_start_re.search(self.buffer):
                # if xml string has started, remove all text before it
                self.buffer = self.buffer[match.start() :]
                self.xml_started = True
            else:
                return
        # feed buffer to parser
        self.pull_parser.feed(self.buffer)
        self.buffer = ""
        # yield all events
        try:
            events = self.pull_parser.read_events()
            for event, elem in events:  # type: ignore[misc]
                if event == "start":
                    # update current path
                    self.current_path.append(elem.tag)  # type: ignore[union-attr]
                    self.current_path_has_children = False
                elif event == "end":
                    # remove last element from current path
                    #
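
The listing breaks off partway through parse(). As a rough illustration of the technique the visible part relies on, the sketch below strips any non-XML preamble with the same start pattern, feeds the rest to the standard library's XMLPullParser, and tracks the current element path on "start" and "end" events. The helper name stream_paths and the sample chunks are hypothetical and not part of langchain. The sketch only reports paths; it does not reproduce the dict-building logic of the real class.

```python
import re
import xml.etree.ElementTree as ET
from typing import Iterator

# Same pattern as `xml_start_re` in the listing: "<" followed by a
# character that can begin an element name.
XML_START_RE = re.compile(r"<[a-zA-Z:_]")


def stream_paths(chunks: list[str]) -> Iterator[tuple[str, str]]:
    """Yield (event, path) pairs as XML arrives in pieces.

    Hypothetical helper for illustration only; not part of langchain.
    """
    pull_parser = ET.XMLPullParser(["start", "end"])
    buffer = ""
    xml_started = False
    current_path: list[str] = []

    for chunk in chunks:
        buffer += chunk
        if not xml_started:
            match = XML_START_RE.search(buffer)
            if match is None:
                # Still inside the non-XML preamble; wait for more text.
                continue
            # Drop everything before the first tag.
            buffer = buffer[match.start():]
            xml_started = True
        # Hand the pending text to the parser and drain whatever
        # events it can already produce.
        pull_parser.feed(buffer)
        buffer = ""
        for event, elem in pull_parser.read_events():
            if event == "start":
                current_path.append(elem.tag)
                yield event, "/".join(current_path)
            else:  # "end"
                yield event, "/".join(current_path)
                current_path.pop()


# Tags are deliberately split across chunk boundaries.
events = list(stream_paths(["Sure, here it is: <fo", "o><bar>hi</b", "ar></foo>"]))
print(events)
```

Because the pull parser keeps its own state between feed() calls, a tag split across two chunks is handled without extra bookkeeping; the buffer is only needed until the first tag has been found.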

Frequently Asked Questions

What is the _StreamingParser class?
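_StreamingParser is a private helper class in the langchain codebase, defined in libs/core/langchain_core/output_parsers/xml.py. It parses XML incrementally from streamed chunks and exists so that the transform and atransform methods of XMLOutputParser share one implementation.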
Where is _StreamingParser defined?
_StreamingParser is defined in libs/core/langchain_core/output_parsers/xml.py at line 42.
What does _StreamingParser extend?
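_StreamingParser does not extend another class; it is declared as a plain class with no base. BaseMessage appears only as one of the input types accepted by parse(), alongside str.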
