test_huggingface_tokenizer() — langchain Function Reference

Architecture documentation for the test_huggingface_tokenizer() function in test_text_splitter.py from the langchain codebase.

Dependency Diagram

graph TD
  1b832fd4_2970_09b2_1327_16a008bb0436["test_huggingface_tokenizer()"]
  d35bbf8f_3f92_b567_0710_bd1ead1e275e["test_text_splitter.py"]
  1b832fd4_2970_09b2_1327_16a008bb0436 -->|defined in| d35bbf8f_3f92_b567_0710_bd1ead1e275e
  style 1b832fd4_2970_09b2_1327_16a008bb0436 fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

libs/text-splitters/tests/integration_tests/test_text_splitter.py lines 24–31

def test_huggingface_tokenizer() -> None:
    """Test text splitter that uses a HuggingFace tokenizer."""
    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    text_splitter = CharacterTextSplitter.from_huggingface_tokenizer(
        tokenizer, separator=" ", chunk_size=1, chunk_overlap=0
    )
    output = text_splitter.split_text("foo bar")
    assert output == ["foo", "bar"]
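
The test above relies on the splitter measuring chunk length in tokens rather than characters. The sketch below illustrates that idea without downloading a model. It is a simplified stand-in, not the langchain implementation: `toy_token_count` and `split_by_token_budget` are hypothetical names, and a whitespace word count is assumed in place of the GPT-2 tokenizer.

```python
from typing import Callable, List


def toy_token_count(text: str) -> int:
    """Hypothetical length function: counts whitespace-separated words.

    A real tokenizer-backed length function would count model tokens instead.
    """
    return len(text.split())


def split_by_token_budget(
    text: str,
    separator: str,
    chunk_size: int,
    length_function: Callable[[str], int],
) -> List[str]:
    """Split on `separator`, then greedily merge pieces within `chunk_size`."""
    pieces = [p for p in text.split(separator) if p]
    chunks: List[str] = []
    current: List[str] = []
    for piece in pieces:
        candidate = separator.join(current + [piece])
        # Close the current chunk when adding this piece would exceed the budget.
        if current and length_function(candidate) > chunk_size:
            chunks.append(separator.join(current))
            current = [piece]
        else:
            current.append(piece)
    if current:
        chunks.append(separator.join(current))
    return chunks


result = split_by_token_budget("foo bar", " ", 1, toy_token_count)
```

With a budget of one unit per chunk, each word ends up in its own chunk, which mirrors the assertion in the test. Raising `chunk_size` to 2 in this sketch keeps "foo bar" together as a single chunk.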

Frequently Asked Questions

What does test_huggingface_tokenizer() do?
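test_huggingface_tokenizer() is an integration test in the langchain codebase, defined in libs/text-splitters/tests/integration_tests/test_text_splitter.py. It loads the "gpt2" tokenizer with AutoTokenizer.from_pretrained, builds a CharacterTextSplitter through from_huggingface_tokenizer with separator=" ", chunk_size=1, and chunk_overlap=0, and asserts that splitting "foo bar" returns ["foo", "bar"].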
Where is test_huggingface_tokenizer() defined?
test_huggingface_tokenizer() is defined in libs/text-splitters/tests/integration_tests/test_text_splitter.py at line 24.
