test_huggingface_tokenizer() — langchain Function Reference
Architecture documentation for the test_huggingface_tokenizer() function in test_text_splitter.py from the langchain codebase.
Entity Profile
Dependency Diagram
graph TD 1b832fd4_2970_09b2_1327_16a008bb0436["test_huggingface_tokenizer()"] d35bbf8f_3f92_b567_0710_bd1ead1e275e["test_text_splitter.py"] 1b832fd4_2970_09b2_1327_16a008bb0436 -->|defined in| d35bbf8f_3f92_b567_0710_bd1ead1e275e style 1b832fd4_2970_09b2_1327_16a008bb0436 fill:#6366f1,stroke:#818cf8,color:#fff
Relationship Graph
Source Code
libs/text-splitters/tests/integration_tests/test_text_splitter.py lines 24–31
def test_huggingface_tokenizer() -> None:
"""Test text splitter that uses a HuggingFace tokenizer."""
tokenizer = AutoTokenizer.from_pretrained("gpt2")
text_splitter = CharacterTextSplitter.from_huggingface_tokenizer(
tokenizer, separator=" ", chunk_size=1, chunk_overlap=0
)
output = text_splitter.split_text("foo bar")
assert output == ["foo", "bar"]
Domain
Subdomains
Source
Frequently Asked Questions
What does test_huggingface_tokenizer() do?
test_huggingface_tokenizer() is a function in the langchain codebase, defined in libs/text-splitters/tests/integration_tests/test_text_splitter.py.
Where is test_huggingface_tokenizer() defined?
test_huggingface_tokenizer() is defined in libs/text-splitters/tests/integration_tests/test_text_splitter.py at line 24.
Analyze Your Own Codebase
Get architecture documentation, dependency graphs, and domain analysis for your codebase in minutes.
Try Supermodel Free