split_text_from_url() — langchain Function Reference
Architecture documentation for the split_text_from_url() function in html.py from the langchain codebase.
Dependency Diagram
graph TD
    982f8e7f_63e2_a8f4_7f7f_3def7fb3d84b["split_text_from_url()"]
    86dc20d4_404a_b608_01da_8dea923ef2c9["HTMLHeaderTextSplitter"]
    982f8e7f_63e2_a8f4_7f7f_3def7fb3d84b -->|defined in| 86dc20d4_404a_b608_01da_8dea923ef2c9
    3a8f906a_02bf_a0ff_6dbb_2ffbc48f937d["split_text()"]
    982f8e7f_63e2_a8f4_7f7f_3def7fb3d84b -->|calls| 3a8f906a_02bf_a0ff_6dbb_2ffbc48f937d
    style 982f8e7f_63e2_a8f4_7f7f_3def7fb3d84b fill:#6366f1,stroke:#818cf8,color:#fff
Source Code
libs/text-splitters/langchain_text_splitters/html.py lines 189–210
def split_text_from_url(
    self, url: str, timeout: int = 10, **kwargs: Any
) -> list[Document]:
    """Fetch text content from a URL and split it into documents.

    Args:
        url: The URL to fetch content from.
        timeout: Timeout for the request.
        **kwargs: Additional keyword arguments for the request.

    Returns:
        A list of split `Document` objects.
            Each `Document` contains `page_content` holding the extracted text and
            `metadata` that maps the header hierarchy to their corresponding titles.

    Raises:
        requests.RequestException: If the HTTP request fails.
    """
    response = requests.get(url, timeout=timeout, **kwargs)
    response.raise_for_status()
    return self.split_text(response.text)
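The method body follows a three-step pattern: fetch the page, fail on an HTTP error, then hand the HTML to a splitter. The sketch below mirrors that flow using only the Python standard library so it can run offline. It is not langchain's implementation: `ToyHeaderSplitter` and `fetch_and_split` are hypothetical names invented for this illustration, the splitter handles only `h1` and `h2`, and it returns plain dicts rather than `Document` objects. `urllib.request.urlopen` stands in for `requests.get`, and its built-in `HTTPError` plays the role of `raise_for_status()`.

```python
import base64
import urllib.request
from html.parser import HTMLParser


class ToyHeaderSplitter(HTMLParser):
    """Hypothetical stand-in for a header-aware splitter (not langchain's code)."""

    HEADER_TAGS = {"h1": "Header 1", "h2": "Header 2"}

    def __init__(self) -> None:
        super().__init__()
        self.sections = []      # finished {"page_content", "metadata"} dicts
        self._headers = {}      # header hierarchy active for the current section
        self._in_header = None  # header tag currently open, if any
        self._text = []         # body text collected for the current section

    def _flush(self) -> None:
        # Close the current section, tagging it with a copy of the active headers.
        content = " ".join(self._text).strip()
        if content:
            self.sections.append(
                {"page_content": content, "metadata": dict(self._headers)}
            )
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADER_TAGS:
            self._flush()  # a new header ends the previous section
            self._in_header = tag

    def handle_endtag(self, tag):
        if tag == self._in_header:
            self._in_header = None

    def handle_data(self, data):
        data = data.strip()
        if not data:
            return
        if self._in_header:
            if self._in_header == "h1":
                self._headers = {}  # a new top-level header resets deeper levels
            self._headers[self.HEADER_TAGS[self._in_header]] = data
        else:
            self._text.append(data)

    def split_text(self, html: str) -> list[dict]:
        self.feed(html)
        self.close()
        self._flush()
        return self.sections


def fetch_and_split(url: str, timeout: int = 10) -> list[dict]:
    """Fetch `url`, then split the returned HTML by header."""
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses, which
    # covers the step that raise_for_status() performs in the original.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset)
    return ToyHeaderSplitter().split_text(html)


if __name__ == "__main__":
    # A data: URL keeps the demo offline; any http(s) URL is fetched the same way.
    page = "<h1>Guide</h1><p>Intro.</p><h2>Setup</h2><p>Install it.</p>"
    url = "data:text/html;base64," + base64.b64encode(page.encode()).decode()
    for section in fetch_and_split(url):
        print(section)
```

With langchain installed, the corresponding call would be along the lines of `HTMLHeaderTextSplitter(headers_to_split_on=[("h1", "Header 1"), ("h2", "Header 2")]).split_text_from_url(url)`, wrapped in a `try`/`except requests.RequestException` block if the caller wants to handle fetch failures.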
Frequently Asked Questions
What does split_text_from_url() do?
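split_text_from_url() is a method of HTMLHeaderTextSplitter in the langchain codebase, defined in libs/text-splitters/langchain_text_splitters/html.py. It fetches the page at the given URL with requests.get(), raises an exception if the response has an HTTP error status, and passes the response text to split_text(), returning a list of Document objects.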
Where is split_text_from_url() defined?
split_text_from_url() is defined in libs/text-splitters/langchain_text_splitters/html.py at line 189.
What does split_text_from_url() call?
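split_text_from_url() calls one function in the langchain codebase: split_text(). It also calls requests.get() and response.raise_for_status() from the requests library.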