htmlTokenTransform() — astro Function Reference

Architecture documentation for the htmlTokenTransform() function in html-token-transform.ts from the astro codebase.

Dependency Diagram

graph TD
  e9ca3bf1_6e0d_113b_7cf1_21a7c23f6932["htmlTokenTransform()"]
  b514e53d_6f58_26e2_b444_6fdf850f8be2["html-token-transform.ts"]
  e9ca3bf1_6e0d_113b_7cf1_21a7c23f6932 -->|defined in| b514e53d_6f58_26e2_b444_6fdf850f8be2
  ae279694_8174_cef4_9b86_503e9ce07415["mutateAndCollapseExtraParagraphsUnderHtml()"]
  e9ca3bf1_6e0d_113b_7cf1_21a7c23f6932 -->|calls| ae279694_8174_cef4_9b86_503e9ce07415
  style e9ca3bf1_6e0d_113b_7cf1_21a7c23f6932 fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

packages/integrations/markdoc/src/html/transform/html-token-transform.ts lines 8–177

export function htmlTokenTransform(tokenizer: Tokenizer, tokens: Token[]): Token[] {
	const output: Token[] = [];

	// hold a lazy buffer of text and process it only when necessary
	let textBuffer = '';

	let inCDATA = false;

	const appendText = (text: string) => {
		textBuffer += text;
	};

	// process the current text buffer w/ Markdoc's Tokenizer for tokens
	const processTextBuffer = () => {
		if (textBuffer.length > 0) {
			// tokenize the text buffer to look for structural markup tokens
			const toks = tokenizer.tokenize(textBuffer);

			// when we tokenize some raw text content, it's basically treated like Markdown, and will result in a paragraph wrapper, which we don't want
			// in this scenario, we just want to generate a text token, but, we have to tokenize it in case there's other structural markup
			if (toks.length === 3) {
				const first = toks[0];
				const second = toks[1];
				const third: Token | undefined = toks.at(2);

				if (
					first.type === 'paragraph_open' &&
					second.type === 'inline' &&
					third &&
					third.type === 'paragraph_close' &&
					Array.isArray(second.children)
				) {
					for (const tok of second.children as Token[]) {
						// if the given token is a 'text' token and its trimmed content is the same as the pre-tokenized text buffer, use the original
						// text buffer instead to preserve leading/trailing whitespace that is lost during tokenization of pure text content
						if (tok.type === 'text') {
							if (tok.content.trim() == textBuffer.trim()) {
								tok.content = textBuffer;
							}
						}
						output.push(tok);
					}
				} else {
					// some other markup that happened to be 3 tokens, push tokens as-is
					for (const tok of toks) {
						output.push(tok);
					}
				}
			} else {
				// some other tokenized markup, push tokens as-is
				for (const tok of toks) {
					output.push(tok);
				}
			}

			// reset the current lazy text buffer
			textBuffer = '';
		}
	};

	// create an incremental HTML parser that tracks HTML tag open, close and text content
	const parser = new Parser(
		{
			oncdatastart() {
				inCDATA = true;
			},

			oncdataend() {
				inCDATA = false;
			},

			// when an HTML tag opens...
			onopentag(name, attrs) {
				// process any buffered text to be treated as text node before the currently opening HTML tag
				processTextBuffer();

				// push an  'html-tag' 'tag_open' Markdoc node instance for the currently opening HTML tag onto the resulting Token stack
				output.push({
					type: 'tag_open',
					nesting: 1,
					meta: {
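
The paragraph-unwrapping step in processTextBuffer() above can be sketched in isolation. This is a minimal sketch, not the astro implementation: SketchToken, fakeTokenize, and unwrapLoneParagraph are hypothetical names, and fakeTokenize only imitates the behavior described in the comments (plain text comes back wrapped in a paragraph with its whitespace trimmed) rather than calling Markdoc.

```typescript
// Simplified stand-in for Markdoc's Token; only the fields used here (assumption).
interface SketchToken {
  type: string;
  content?: string;
  children?: SketchToken[];
}

// Hypothetical tokenizer: wraps plain text in a paragraph and trims it,
// imitating the behavior the comments in the listing describe.
function fakeTokenize(text: string): SketchToken[] {
  return [
    { type: 'paragraph_open' },
    { type: 'inline', children: [{ type: 'text', content: text.trim() }] },
    { type: 'paragraph_close' },
  ];
}

// If the tokens are exactly paragraph_open / inline / paragraph_close,
// return the inline children and restore the original whitespace on a
// text child whose trimmed content matches the trimmed buffer.
// Any other token sequence is returned unchanged.
function unwrapLoneParagraph(textBuffer: string, toks: SketchToken[]): SketchToken[] {
  if (toks.length !== 3) return toks;
  const [first, second, third] = toks;
  const children = second?.children;
  if (
    first?.type !== 'paragraph_open' ||
    second?.type !== 'inline' ||
    third?.type !== 'paragraph_close' ||
    !Array.isArray(children)
  ) {
    return toks;
  }
  return children.map((tok) =>
    tok.type === 'text' && tok.content?.trim() === textBuffer.trim()
      ? { ...tok, content: textBuffer }
      : tok,
  );
}

// Usage: the paragraph wrapper is dropped and the padding is kept.
const buffer = '  hello  ';
const unwrapped = unwrapLoneParagraph(buffer, fakeTokenize(buffer));
```

Unlike the listing, which mutates tok.content and pushes onto a shared output array, the sketch returns a new array so it can be checked on its own.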

Frequently Asked Questions

What does htmlTokenTransform() do?
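htmlTokenTransform() takes a Markdoc Tokenizer and an array of tokens and returns a new token array. As the listing shows, it accumulates text in a lazy buffer, re-tokenizes that buffer with the Tokenizer before each opening HTML tag, and pushes a 'tag_open' token for every tag opened by an incremental HTML parser. It is defined in packages/integrations/markdoc/src/html/transform/html-token-transform.ts.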
Where is htmlTokenTransform() defined?
htmlTokenTransform() is defined in packages/integrations/markdoc/src/html/transform/html-token-transform.ts at line 8.
What does htmlTokenTransform() call?
htmlTokenTransform() calls one function: mutateAndCollapseExtraParagraphsUnderHtml().
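
The buffer-then-flush pattern visible in the source listing (appendText accumulates text, and processTextBuffer runs before each 'tag_open' is pushed) can be illustrated with a small sketch. Everything here is an assumption made for illustration: HtmlEvent, OutToken, and collectTokens are hypothetical, the events stand in for the parser callbacks, and the output tokens are far simpler than the ones the real function builds. The final flush at end of input is also an assumption, since the listing is cut off before that point.

```typescript
// Hypothetical events standing in for HTML parser callbacks.
type HtmlEvent =
  | { kind: 'text'; text: string }
  | { kind: 'open'; name: string };

// Simplified output tokens; the real function emits richer Markdoc tokens.
type OutToken =
  | { type: 'text'; content: string }
  | { type: 'tag_open'; tag: string };

function collectTokens(events: HtmlEvent[]): OutToken[] {
  const output: OutToken[] = [];
  let textBuffer = '';

  // Emit the buffered text as a single token, then reset the buffer.
  const flush = (): void => {
    if (textBuffer.length > 0) {
      output.push({ type: 'text', content: textBuffer });
      textBuffer = '';
    }
  };

  for (const ev of events) {
    if (ev.kind === 'text') {
      // Text is only accumulated, never emitted immediately.
      textBuffer += ev.text;
    } else {
      // Buffered text must land before the tag that follows it.
      flush();
      output.push({ type: 'tag_open', tag: ev.name });
    }
  }
  flush(); // assumption: trailing text is flushed at end of input
  return output;
}

// Usage: adjacent text chunks merge into one token ahead of the tag.
const collected = collectTokens([
  { kind: 'text', text: 'a' },
  { kind: 'text', text: 'b' },
  { kind: 'open', name: 'div' },
  { kind: 'text', text: 'c' },
]);
```

Deferring the flush lets consecutive text chunks be handled as one unit, which matches the "process it only when necessary" comment on the buffer in the listing.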
