tokenize() — langchain Function Reference

Architecture documentation for the tokenize() function in mustache.py from the langchain codebase.

Entity Profile

Dependency Diagram

graph TD
  ee317545_08fa_c1b5_d4bc_6d8f5e1e4e7c["tokenize()"]
  88ae0419_8f6b_8b31_dd8f_a3bc6bbcb7b6["mustache.py"]
  ee317545_08fa_c1b5_d4bc_6d8f5e1e4e7c -->|defined in| 88ae0419_8f6b_8b31_dd8f_a3bc6bbcb7b6
  0d68eb8b_8003_f5a3_d717_126f0919d1af["render()"]
  0d68eb8b_8003_f5a3_d717_126f0919d1af -->|calls| ee317545_08fa_c1b5_d4bc_6d8f5e1e4e7c
  5af91928_c2ae_f45b_c4c0_24e4090e4ba6["grab_literal()"]
  ee317545_08fa_c1b5_d4bc_6d8f5e1e4e7c -->|calls| 5af91928_c2ae_f45b_c4c0_24e4090e4ba6
  abfb140d_b449_a24c_b081_77bcd120e90c["l_sa_check()"]
  ee317545_08fa_c1b5_d4bc_6d8f5e1e4e7c -->|calls| abfb140d_b449_a24c_b081_77bcd120e90c
  0fd46fbf_924b_0a2e_ebc6_df49a6d4a2de["parse_tag()"]
  ee317545_08fa_c1b5_d4bc_6d8f5e1e4e7c -->|calls| 0fd46fbf_924b_0a2e_ebc6_df49a6d4a2de
  eee11095_d17a_3de4_36db_e1eb77ac77ea["r_sa_check()"]
  ee317545_08fa_c1b5_d4bc_6d8f5e1e4e7c -->|calls| eee11095_d17a_3de4_36db_e1eb77ac77ea
  style ee317545_08fa_c1b5_d4bc_6d8f5e1e4e7c fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

libs/core/langchain_core/utils/mustache.py lines 199–322

def tokenize(
    template: str, def_ldel: str = "{{", def_rdel: str = "}}"
) -> Iterator[tuple[str, str]]:
    """Tokenize a mustache template.

    Tokenizes a mustache template in a generator fashion, using file-like objects. It
    also accepts a string containing the template.

    Args:
        template: a file-like object, or a string of a mustache template
        def_ldel: The default left delimiter
            (`'{{'` by default, as in spec compliant mustache)
        def_rdel: The default right delimiter
            (`'}}'` by default, as in spec compliant mustache)

    Yields:
        Mustache tags in the form of a tuple `(tag_type, tag_key)` where `tag_type` is
            one of:

            * literal
            * section
            * inverted section
            * end
            * partial
            * no escape

            ...and `tag_key` is either the key or in the case of a literal tag, the
            literal itself.

    Raises:
        ChevronError: If there is a syntax error in the template.
    """
    global _CURRENT_LINE, _LAST_TAG_LINE
    _CURRENT_LINE = 1
    _LAST_TAG_LINE = None

    is_standalone = True
    open_sections = []
    l_del = def_ldel
    r_del = def_rdel

    while template:
        literal, template = grab_literal(template, l_del)

        # If the template is completed
        if not template:
            # Then yield the literal and leave
            yield ("literal", literal)
            break

        # Do the first check to see if we could be a standalone
        is_standalone = l_sa_check(template, literal, is_standalone)

        # Parse the tag
        tag, template = parse_tag(template, l_del, r_del)
        tag_type, tag_key = tag

        # Special tag logic

        # If we are a set delimiter tag
        if tag_type == "set delimiter":
            # Then get and set the delimiters
            dels = tag_key.strip().split(" ")
            l_del, r_del = dels[0], dels[-1]

        # If we are a section tag
        elif tag_type in {"section", "inverted section"}:
            # Then open a new section
            open_sections.append(tag_key)
            _LAST_TAG_LINE = _CURRENT_LINE

        # If we are an end tag
        elif tag_type == "end":
            # Then check to see if the last opened section
            # is the same as us
            try:
                last_section = open_sections.pop()
            except IndexError as e:
                msg = (
                    f'Trying to close tag "{tag_key}"\n'
                    "Looks like it was not opened.\n"
                # ... excerpt truncated here; the full function continues through line 322
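To make the grab-literal / parse-tag loop above concrete, here is a self-contained sketch of the same generator pattern. It handles only literals and plain `{{variable}}` tags with fixed delimiters; sections, partials, standalone-line checks, and delimiter changes from the real tokenize() are omitted, and `tokenize_sketch` is a hypothetical name for illustration, not langchain API.

```python
from typing import Iterator


def tokenize_sketch(
    template: str, ldel: str = "{{", rdel: str = "}}"
) -> Iterator[tuple[str, str]]:
    """Minimal sketch of the grab-literal / parse-tag loop.

    Variables only, delimiters fixed -- the real tokenize() handles far more.
    """
    while template:
        # Grab everything up to the next left delimiter as a literal.
        before, sep, rest = template.partition(ldel)
        if before:
            yield ("literal", before)
        if not sep:
            # No more tags: the template is exhausted.
            return
        # Parse the tag key up to the right delimiter.
        key, sep, template = rest.partition(rdel)
        if not sep:
            raise ValueError(f'unclosed tag "{key.strip()}"')
        yield ("variable", key.strip())


print(list(tokenize_sketch("Hello, {{name}}!")))
# [('literal', 'Hello, '), ('variable', 'name'), ('literal', '!')]
```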

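The "set delimiter" branch in the excerpt packs its logic into two lines; here is a standalone illustration of that splitting, using a hypothetical tag key as it might arrive from a `{{=<% %>=}}` tag.

```python
# Sketch of the "set delimiter" branch from the excerpt: the tag key
# (hypothetically "<% %>" here) is split on spaces, and the first and last
# pieces become the new left/right delimiters. Taking dels[0] and dels[-1]
# rather than dels[0] and dels[1] tolerates multiple spaces between them.
tag_key = "<% %>"
dels = tag_key.strip().split(" ")
l_del, r_del = dels[0], dels[-1]
print(l_del, r_del)  # <% %>
```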
Frequently Asked Questions

What does tokenize() do?
tokenize() lexes a mustache template into a lazy stream of (tag_type, tag_key) tuples, yielding one token at a time as a generator. It is defined in libs/core/langchain_core/utils/mustache.py in the langchain codebase.
Where is tokenize() defined?
tokenize() is defined in libs/core/langchain_core/utils/mustache.py at line 199.
What does tokenize() call?
tokenize() calls four functions: grab_literal(), l_sa_check(), parse_tag(), and r_sa_check().
What calls tokenize()?
tokenize() is called by one function: render().
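Since render() is tokenize()'s only caller, it helps to see how a consumer might walk the yielded (tag_type, tag_key) stream. This is a minimal sketch handling just literal and variable tokens; `render_sketch` is a hypothetical name, not the real render(), which also handles sections, partials, escaping, and the other tag types.

```python
def render_sketch(tokens, context):
    """Sketch of a render()-style consumer of tokenize()'s token stream.

    Literals pass through unchanged; variables are looked up in the
    context dict, with missing keys rendered as the empty string.
    """
    out = []
    for tag_type, tag_key in tokens:
        if tag_type == "literal":
            out.append(tag_key)
        elif tag_type == "variable":
            out.append(str(context.get(tag_key, "")))
    return "".join(out)


tokens = [("literal", "Hello, "), ("variable", "name"), ("literal", "!")]
print(render_sketch(tokens, {"name": "World"}))
# Hello, World!
```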
