TestTokenCountingWithGPT2Tokenizer Class — langchain Architecture

Architecture documentation for the TestTokenCountingWithGPT2Tokenizer class in test_schema.py from the langchain codebase.

Dependency Diagram

graph TD
  76ce3c98_8b2b_4359_3dd9_9cadd328e8f1["TestTokenCountingWithGPT2Tokenizer"]
  cb0ded9f_b36d_e4cf_eeeb_3d41fc3719e3["test_schema.py"]
  76ce3c98_8b2b_4359_3dd9_9cadd328e8f1 -->|defined in| cb0ded9f_b36d_e4cf_eeeb_3d41fc3719e3
  04ec2c54_5612_70ab_f4f8_451e702a60c4["test_tokenization()"]
  76ce3c98_8b2b_4359_3dd9_9cadd328e8f1 -->|method| 04ec2c54_5612_70ab_f4f8_451e702a60c4
  055049be_6dd0_0c09_db28_aeb2ba1e8e7d["test_empty_token()"]
  76ce3c98_8b2b_4359_3dd9_9cadd328e8f1 -->|method| 055049be_6dd0_0c09_db28_aeb2ba1e8e7d
  e7680fd2_4100_1a84_3e15_bb66d733fd32["test_multiple_tokens()"]
  76ce3c98_8b2b_4359_3dd9_9cadd328e8f1 -->|method| e7680fd2_4100_1a84_3e15_bb66d733fd32
  55cd1802_deec_8c12_a03c_f1bd8bfd4088["test_special_tokens()"]
  76ce3c98_8b2b_4359_3dd9_9cadd328e8f1 -->|method| 55cd1802_deec_8c12_a03c_f1bd8bfd4088

Source Code

libs/langchain/tests/integration_tests/test_schema.py lines 6–19

class TestTokenCountingWithGPT2Tokenizer:
    def test_tokenization(self) -> None:
        # Check that the tokenization is consistent with the GPT-2 tokenizer
        assert _get_token_ids_default_method("This is a test") == [1212, 318, 257, 1332]

    def test_empty_token(self) -> None:
        assert len(_get_token_ids_default_method("")) == 0

    def test_multiple_tokens(self) -> None:
        assert len(_get_token_ids_default_method("a b c")) == 3

    def test_special_tokens(self) -> None:
        # test for consistency when the default tokenizer is changed
        assert len(_get_token_ids_default_method("a:b_c d")) == 6
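
The assertions above pin exact token counts, and the comment on test_special_tokens says the aim is to notice when the default tokenizer changes. The following is a minimal sketch of why a pinned count serves that purpose. It does not use the GPT-2 tokenizer: whitespace_token_ids, punctuation_token_ids, _piece_id, and count_tokens are hypothetical stand-ins invented for this illustration, and their ids bear no relation to GPT-2 vocabulary ids.

```python
import re
import zlib
from typing import Callable, List


def _piece_id(piece: str) -> int:
    # Deterministic placeholder id (hypothetical); a real tokenizer
    # would look the piece up in its vocabulary instead.
    return zlib.crc32(piece.encode("utf-8"))


def whitespace_token_ids(text: str) -> List[int]:
    """Hypothetical tokenizer A: one token per whitespace-separated word."""
    return [_piece_id(p) for p in text.split()]


def punctuation_token_ids(text: str) -> List[int]:
    """Hypothetical tokenizer B: alphanumeric runs and single punctuation
    characters each become their own token."""
    pieces = re.findall(r"[A-Za-z0-9]+|[^A-Za-z0-9\s]", text)
    return [_piece_id(p) for p in pieces]


def count_tokens(text: str, get_token_ids: Callable[[str], List[int]]) -> int:
    # The token count is the length of the id list, mirroring the
    # len(...) assertions in the test class.
    return len(get_token_ids(text))


if __name__ == "__main__":
    for sample in ["", "a b c", "a:b_c d"]:
        print(
            repr(sample),
            count_tokens(sample, whitespace_token_ids),
            count_tokens(sample, punctuation_token_ids),
        )
```

Both stand-ins agree on the empty string and on "a b c", but they split "a:b_c d" differently, so a count pinned for one of them fails as soon as the other is swapped in. Plain inputs alone would not reveal such a swap, which is presumably why the test class includes a string with punctuation.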

Frequently Asked Questions

What is the TestTokenCountingWithGPT2Tokenizer class?
TestTokenCountingWithGPT2Tokenizer is a test class in the langchain codebase, defined in libs/langchain/tests/integration_tests/test_schema.py. Its four test methods check the token IDs and token counts returned by _get_token_ids_default_method.

Where is TestTokenCountingWithGPT2Tokenizer defined?
TestTokenCountingWithGPT2Tokenizer is defined in libs/langchain/tests/integration_tests/test_schema.py, starting at line 6.
