__init__() — langchain Function Reference

Architecture documentation for NLTKTextSplitter.__init__(), the constructor defined in nltk.py in the langchain codebase.

Dependency Diagram

graph TD
  91006521_b301_e1cb_c4a5_d534eba20af9["__init__()"]
  213eacdd_0034_0c08_d507_c27d0748affc["NLTKTextSplitter"]
  91006521_b301_e1cb_c4a5_d534eba20af9 -->|defined in| 213eacdd_0034_0c08_d507_c27d0748affc
  style 91006521_b301_e1cb_c4a5_d534eba20af9 fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

libs/text-splitters/langchain_text_splitters/nltk.py lines 22–55

    def __init__(
        self,
        separator: str = "\n\n",
        language: str = "english",
        *,
        use_span_tokenize: bool = False,
        **kwargs: Any,
    ) -> None:
        """Initialize the NLTK splitter.

        Args:
            separator: The separator to use when combining splits.
            language: The language to use.
            use_span_tokenize: Whether to use `span_tokenize` instead of
                `sent_tokenize`.

        Raises:
            ImportError: If NLTK is not installed.
            ValueError: If `use_span_tokenize` is `True` and separator is not `''`.
        """
        super().__init__(**kwargs)
        self._separator = separator
        self._language = language
        self._use_span_tokenize = use_span_tokenize
        if self._use_span_tokenize and self._separator:
            msg = "When use_span_tokenize is True, separator should be ''"
            raise ValueError(msg)
        if not _HAS_NLTK:
            msg = "NLTK is not installed, please install it with `pip install nltk`."
            raise ImportError(msg)
        if self._use_span_tokenize:
            self._tokenizer = nltk.tokenize._get_punkt_tokenizer(self._language)  # noqa: SLF001
        else:
            self._tokenizer = nltk.tokenize.sent_tokenize
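
The order of the checks in the constructor above matters: the separator check runs before the NLTK availability check, and the default separator ("\n\n") is rejected whenever use_span_tokenize is True. The following is a minimal sketch of that control flow only, so it runs without NLTK installed. The function name check_init_args and its has_nltk parameter are hypothetical stand-ins (for the constructor and the module-level _HAS_NLTK flag) and are not part of langchain.

```python
def check_init_args(separator: str, use_span_tokenize: bool, has_nltk: bool) -> str:
    """Hypothetical stand-in that mirrors the order of checks in __init__."""
    # The separator check comes first, so it fires even when NLTK is missing.
    if use_span_tokenize and separator:
        msg = "When use_span_tokenize is True, separator should be ''"
        raise ValueError(msg)
    # Only after the arguments are valid is NLTK availability checked.
    if not has_nltk:
        msg = "NLTK is not installed, please install it with `pip install nltk`."
        raise ImportError(msg)
    # Report which tokenizer the real constructor would select.
    return "span_tokenize" if use_span_tokenize else "sent_tokenize"


print(check_init_args("\n\n", False, True))  # prints: sent_tokenize
print(check_init_args("", True, True))       # prints: span_tokenize

try:
    # Default separator combined with use_span_tokenize=True is rejected.
    check_init_args("\n\n", True, False)
except ValueError as exc:
    print(f"ValueError: {exc}")
```

In practice this suggests that callers who want span-based tokenization need to pass separator="" explicitly, since leaving the default in place raises ValueError.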

Frequently Asked Questions

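What does __init__() do?
__init__() is the constructor of NLTKTextSplitter, defined in libs/text-splitters/langchain_text_splitters/nltk.py. It stores the separator, language, and use_span_tokenize settings, raises ValueError if use_span_tokenize is True while the separator is non-empty, raises ImportError if NLTK is not installed, and then selects either the Punkt tokenizer for the given language (when use_span_tokenize is True) or nltk.tokenize.sent_tokenize.

Where is __init__() defined?
__init__() is defined in libs/text-splitters/langchain_text_splitters/nltk.py, starting at line 22.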
