_StreamingParser Class — langchain Architecture
Architecture documentation for the _StreamingParser class in xml.py from the langchain codebase.
Dependency Diagram
graph TD
    7c8ef0f6_8408_1c9b_5ed7_f17f7f527cbc["_StreamingParser"]
    b9553aad_b797_0a7b_73ed_8d05b0819c0f["BaseMessage"]
    7c8ef0f6_8408_1c9b_5ed7_f17f7f527cbc -->|uses| b9553aad_b797_0a7b_73ed_8d05b0819c0f
    f90cc11a_ee3a_6a74_4781_be7b69a7ed22["xml.py"]
    7c8ef0f6_8408_1c9b_5ed7_f17f7f527cbc -->|defined in| f90cc11a_ee3a_6a74_4781_be7b69a7ed22
    50deded1_918c_1dd3_112f_623225519701["__init__()"]
    7c8ef0f6_8408_1c9b_5ed7_f17f7f527cbc -->|method| 50deded1_918c_1dd3_112f_623225519701
    e9426db7_ce58_59db_1f0b_befb5eb1d7ae["parse()"]
    7c8ef0f6_8408_1c9b_5ed7_f17f7f527cbc -->|method| e9426db7_ce58_59db_1f0b_befb5eb1d7ae
    6ac405fd_5160_74ca_d2a3_72428bbea335["close()"]
    7c8ef0f6_8408_1c9b_5ed7_f17f7f527cbc -->|method| 6ac405fd_5160_74ca_d2a3_72428bbea335
Source Code
libs/core/langchain_core/output_parsers/xml.py lines 42–148
class _StreamingParser:
    """Streaming parser for XML.

    This implementation is pulled into a class to avoid implementation drift between
    `transform` and `atransform` of the `XMLOutputParser`.
    """

    def __init__(self, parser: Literal["defusedxml", "xml"]) -> None:
        """Initialize the streaming parser.

        Args:
            parser: Parser to use for XML parsing.
                Can be either `'defusedxml'` or `'xml'`. See documentation in
                `XMLOutputParser` for more information.

        Raises:
            ImportError: If `defusedxml` is not installed and the `defusedxml` parser is
                requested.
        """
        if parser == "defusedxml":
            if not _HAS_DEFUSEDXML:
                msg = (
                    "defusedxml is not installed. "
                    "Please install it to use the defusedxml parser."
                    "You can install it with `pip install defusedxml` "
                )
                raise ImportError(msg)
            parser_ = XMLParser(target=TreeBuilder())
        else:
            parser_ = None
        self.pull_parser = ET.XMLPullParser(["start", "end"], _parser=parser_)
        self.xml_start_re = re.compile(r"<[a-zA-Z:_]")
        self.current_path: list[str] = []
        self.current_path_has_children = False
        self.buffer = ""
        self.xml_started = False

    def parse(self, chunk: str | BaseMessage) -> Iterator[AddableDict]:
        """Parse a chunk of text.

        Args:
            chunk: A chunk of text to parse. This can be a `str` or a `BaseMessage`.

        Yields:
            A `dict` representing the parsed XML element.

        Raises:
            xml.etree.ElementTree.ParseError: If the XML is not well-formed.
        """
        if isinstance(chunk, BaseMessage):
            # extract text
            chunk_content = chunk.content
            if not isinstance(chunk_content, str):
                # ignore non-string messages (e.g., function calls)
                return
            chunk = chunk_content
        # add chunk to buffer of unprocessed text
        self.buffer += chunk
        # if xml string hasn't started yet, continue to next chunk
        if not self.xml_started:
            if match := self.xml_start_re.search(self.buffer):
                # if xml string has started, remove all text before it
                self.buffer = self.buffer[match.start() :]
                self.xml_started = True
            else:
                return
        # feed buffer to parser
        self.pull_parser.feed(self.buffer)
        self.buffer = ""
        # yield all events
        try:
            events = self.pull_parser.read_events()
            for event, elem in events:  # type: ignore[misc]
                if event == "start":
                    # update current path
                    self.current_path.append(elem.tag)  # type: ignore[union-attr]
                    self.current_path_has_children = False
                elif event == "end":
                    # remove last element from current path
                    #
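The listing above is cut off partway through parse(), so the following is a minimal, standard-library-only sketch of the same general idea rather than the langchain implementation. It buffers incoming chunks, discards any text that precedes the first XML tag, feeds the rest to xml.etree.ElementTree.XMLPullParser, and tracks the current element path from "start" and "end" events. The names PathTrackingParser and collect are hypothetical and exist only for this illustration. What it yields (a path string and the element text) is also an assumption and differs from the AddableDict values the real class produces.

```python
import re
import xml.etree.ElementTree as ET
from collections.abc import Iterator

# Same pattern as in the listing: "<" followed by a character that can start a tag name.
XML_START_RE = re.compile(r"<[a-zA-Z:_]")


class PathTrackingParser:
    """Hypothetical, simplified stand-in for illustration; not langchain's class."""

    def __init__(self) -> None:
        self.pull_parser = ET.XMLPullParser(["start", "end"])
        self.current_path: list[str] = []
        self.buffer = ""
        self.xml_started = False

    def parse(self, chunk: str) -> Iterator[tuple[str, str]]:
        """Yield (path, text) for every element that closes in this chunk."""
        self.buffer += chunk
        if not self.xml_started:
            match = XML_START_RE.search(self.buffer)
            if match is None:
                return  # still in the non-XML preamble; wait for more text
            # Drop everything before the first tag.
            self.buffer = self.buffer[match.start():]
            self.xml_started = True
        # The pull parser keeps its own state, so partial tags are fine here.
        self.pull_parser.feed(self.buffer)
        self.buffer = ""
        for event, elem in self.pull_parser.read_events():
            if event == "start":
                self.current_path.append(elem.tag)
            else:  # "end"
                path = "/".join(self.current_path)
                self.current_path.pop()
                yield path, (elem.text or "").strip()


def collect(chunks: list[str]) -> list[tuple[str, str]]:
    """Run all chunks through one parser and gather the results."""
    parser = PathTrackingParser()
    results: list[tuple[str, str]] = []
    for chunk in chunks:
        results.extend(parser.parse(chunk))
    return results


if __name__ == "__main__":
    # Tags are deliberately split across chunk boundaries.
    for item in collect(["Sure, here it is: <fo", "o><bar>hi</b", "ar><baz>there</baz></foo>"]):
        print(item)
```

Because an element is reported only when its closing tag arrives, inner elements appear before the element that contains them. Keeping the state on one object is what allows a synchronous and an asynchronous caller to share the same logic.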
Frequently Asked Questions
What is the _StreamingParser class?
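_StreamingParser is a private helper class in the langchain codebase, defined in libs/core/langchain_core/output_parsers/xml.py. It parses XML incrementally from streamed chunks and exists so that the transform and atransform methods of XMLOutputParser share one implementation.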
Where is _StreamingParser defined?
_StreamingParser is defined in libs/core/langchain_core/output_parsers/xml.py at line 42.
What does _StreamingParser extend?
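_StreamingParser does not extend any class; the source declares it as a plain class with no base classes. BaseMessage appears only as one of the input types accepted by parse().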