split_text_from_file() — langchain Function Reference
Architecture documentation for the split_text_from_file() function in html.py from the langchain codebase.
Dependency Diagram
graph TD
    170f66e6_a026_8fd5_9128_33eeefb7dd62["split_text_from_file()"]
    0c8a5f97_7cb0_fe24_746d_9689c4e5426c["HTMLSectionSplitter"]
    170f66e6_a026_8fd5_9128_33eeefb7dd62 -->|defined in| 0c8a5f97_7cb0_fe24_746d_9689c4e5426c
    cf1e77cb_9fca_ca93_1428_c967d5cb0c97["split_text_from_file()"]
    cf1e77cb_9fca_ca93_1428_c967d5cb0c97 -->|calls| 170f66e6_a026_8fd5_9128_33eeefb7dd62
    cdce0dab_74f2_fff9_b284_195643913ed5["split_text()"]
    cdce0dab_74f2_fff9_b284_195643913ed5 -->|calls| 170f66e6_a026_8fd5_9128_33eeefb7dd62
    c2708424_8958_9cb1_390e_d816b56479f3["convert_possible_tags_to_header()"]
    170f66e6_a026_8fd5_9128_33eeefb7dd62 -->|calls| c2708424_8958_9cb1_390e_d816b56479f3
    219ab3b6_0b12_7f58_ba5f_9bfbebda0057["split_html_by_headers()"]
    170f66e6_a026_8fd5_9128_33eeefb7dd62 -->|calls| 219ab3b6_0b12_7f58_ba5f_9bfbebda0057
    cf1e77cb_9fca_ca93_1428_c967d5cb0c97["split_text_from_file()"]
    170f66e6_a026_8fd5_9128_33eeefb7dd62 -->|calls| cf1e77cb_9fca_ca93_1428_c967d5cb0c97
    style 170f66e6_a026_8fd5_9128_33eeefb7dd62 fill:#6366f1,stroke:#818cf8,color:#fff
Source Code
libs/text-splitters/langchain_text_splitters/html.py lines 530–553
    def split_text_from_file(self, file: StringIO) -> list[Document]:
        """Split HTML content from a file into a list of `Document` objects.

        Args:
            file: A file path or a file-like object containing HTML content.

        Returns:
            A list of split `Document` objects.
        """
        file_content = file.getvalue()
        file_content = self.convert_possible_tags_to_header(file_content)
        sections = self.split_html_by_headers(file_content)

        return [
            Document(
                cast("str", section["content"]),
                metadata={
                    self.headers_to_split_on[str(section["tag_name"])]: section[
                        "header"
                    ]
                },
            )
            for section in sections
        ]
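The final list comprehension can be hard to read when flattened, so here is a minimal sketch of just that step. It does not import langchain: the `Document` dataclass is a hypothetical stand-in, `sections_to_documents` is an illustrative helper name, and the sample sections and header mapping are made-up data whose shape is inferred from the keys the source reads (`content`, `tag_name`, `header`).

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for langchain's Document, for illustration only."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def sections_to_documents(sections, headers_to_split_on):
    """Mirror the list comprehension at the end of split_text_from_file()."""
    return [
        Document(
            str(section["content"]),
            # The metadata key is the label configured for the section's tag,
            # and the value is the section's header text.
            metadata={headers_to_split_on[str(section["tag_name"])]: section["header"]},
        )
        for section in sections
    ]


# Assumed shape of the sections produced by split_html_by_headers().
sample_sections = [
    {"header": "Intro", "content": "Welcome.", "tag_name": "h1"},
    {"header": "Setup", "content": "Install it.", "tag_name": "h2"},
]
# Assumed mapping from tag name to metadata label.
header_map = {"h1": "Header 1", "h2": "Header 2"}

docs = sections_to_documents(sample_sections, header_map)
for doc in docs:
    print(doc.metadata, doc.page_content)
```

Each section yields exactly one document with a single metadata entry. A section whose tag name is missing from the mapping would raise a `KeyError` in this sketch, since the lookup uses plain indexing as the source does.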
Frequently Asked Questions
What does split_text_from_file() do?
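split_text_from_file() is a method of HTMLSectionSplitter, defined in libs/text-splitters/langchain_text_splitters/html.py. It reads the HTML from a StringIO via getvalue(), passes it through convert_possible_tags_to_header(), splits the result with split_html_by_headers(), and returns one Document per section. Each Document's metadata maps the label configured in headers_to_split_on for the section's tag name to the section's header text.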
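The source annotates the parameter as StringIO and begins by calling getvalue() on it, so the input needs to be an in-memory text buffer. The sketch below uses only the standard library to show that first step; the HTML snippet and the path string are made-up examples.

```python
from io import StringIO

html = "<h1>Intro</h1><p>Welcome.</p>"

# Wrap the HTML string in a StringIO, matching the annotated parameter type.
buffer = StringIO(html)

# getvalue() returns the whole buffer regardless of the current read position.
buffer.read(4)
file_content = buffer.getvalue()

# A plain path string has no getvalue(), so it would fail at this first step.
path_has_getvalue = hasattr("page.html", "getvalue")

print(file_content)
print(path_has_getvalue)
```

To split a file on disk, one option under these assumptions is to read its text first and wrap it, for example `StringIO(Path("page.html").read_text())`.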
Where is split_text_from_file() defined?
split_text_from_file() is defined in libs/text-splitters/langchain_text_splitters/html.py at line 530.
What does split_text_from_file() call?
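The dependency graph lists three callees for split_text_from_file(): convert_possible_tags_to_header, split_html_by_headers, and split_text_from_file. The source listing shows direct calls only to the first two, both made on self.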
What calls split_text_from_file()?
split_text_from_file() is called by two functions: split_text and split_text_from_file.