
Why document.body.innerHTML Ruins LLM Context Windows

Gasoline MCP gives AI coding assistants real-time browser context via the Model Context Protocol. One of the hardest problems it solves is this: how do you represent a web page to an LLM without blowing up the context window?

The most common answer in the wild is wrong.

Many MCP tools, browser automation scripts, and AI coding workflows grab DOM content the obvious way:

document.body.innerHTML

This dumps the entire raw HTML of the page into the LLM’s context window. Every ad banner. Every tracking pixel. Every inline style. Every SVG path definition. Every base64-encoded image. Every third-party script tag. Every CSS class name generated by your framework’s hash function.

A typical web page might contain 500KB of raw HTML. The actual meaningful content — the text, the form fields, the error messages your AI assistant needs to see — might be 5KB. That’s 99% waste in a context window with hard token limits.
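You can sanity-check that ratio on any page from the browser console. A rough sketch, using innerText as a crude stand-in for meaningful content (exact numbers vary by page; the order of magnitude is the point):

// Paste into DevTools. innerText is only a proxy for "meaningful content",
// but the gap between it and raw markup is usually dramatic.
const rawBytes = document.body.innerHTML.length;
const textBytes = document.body.innerText.length;
console.log(`raw HTML: ${(rawBytes / 1024).toFixed(0)} KB`);
console.log(`visible text: ${(textBytes / 1024).toFixed(0)} KB`);
console.log(`overhead: ${(100 * (1 - textBytes / rawBytes)).toFixed(1)}%`);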

Consider a React dashboard page: a SaaS admin panel with a sidebar, a data table, some charts, and a modal. The two representations compare roughly like this:

Approach                     Token Count        Meaningful Content
document.body.innerHTML      ~200,000 tokens    ~2,000 tokens
Accessibility tree           ~3,000 tokens      ~2,000 tokens

With innerHTML, you are burning 99% of your context budget on <div class="css-1a2b3c"> wrappers, Webpack chunk references, SVG coordinate data, and analytics scripts. In a model with a 128K token context window, a single innerHTML dump can consume more than the entire window — leaving zero room for conversation history, system prompts, or the code your assistant is actually working on.

Worse, the signal-to-noise ratio is so low that the LLM struggles to locate the relevant content even when it fits. Buried somewhere in 200K tokens of markup is the error message you need it to read.

Gasoline takes a fundamentally different approach. Instead of raw HTML, it uses the accessibility tree — the structured, semantic representation that browsers build for screen readers.

The accessibility tree contains only meaningful elements:

  • Headings and document structure
  • Buttons, links, and interactive controls
  • Form fields with their labels and current values
  • Text content that a user would actually read
  • ARIA labels and roles that describe element purpose
  • State information — checked, expanded, disabled, selected

It strips out everything else. No CSS. No scripts. No SVG paths. No base64 blobs. No tracking pixels. What remains is a clean, hierarchical representation of what the page actually shows and does.
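Gasoline captures this tree through its Chrome extension, but you can get a feel for the representation with any tool that exposes an accessibility snapshot. A minimal sketch using Puppeteer (an illustration of the data shape, not Gasoline's code path; the URL is a placeholder):

// Sketch: dump a page's accessibility tree with Puppeteer.
// The snapshot contains roles, names, values, and states --
// no CSS classes, scripts, or SVG path data.
import puppeteer from "puppeteer";

const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto("https://example.com/dashboard"); // placeholder URL
const tree = await page.accessibility.snapshot();
console.log(JSON.stringify(tree, null, 2));
await browser.close();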

Beyond the full accessibility tree, Gasoline provides a query_dom MCP tool that lets AI assistants query specific elements using CSS selectors:

query_dom(".error-message")
query_dom("form#login input")
query_dom("[role='alert']")

Instead of dumping the entire page and hoping the LLM finds the relevant piece, the assistant can request exactly what it needs. A targeted query might return 50 tokens instead of 200,000.

This changes the interaction model from “here’s everything, good luck” to “ask for what you need.”
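Conceptually, a targeted query reduces to "match a selector, keep only the semantic fields." A simplified sketch of that idea (illustrative only; the field names and filtering are assumptions, not Gasoline's actual implementation):

// Illustrative selector-scoped query -- not Gasoline's implementation.
// Keeps only the fields an LLM is likely to care about.
function queryDom(selector: string) {
  return Array.from(document.querySelectorAll<HTMLElement>(selector)).map((el) => ({
    tag: el.tagName.toLowerCase(),
    role: el.getAttribute("role"),
    label: el.getAttribute("aria-label"),
    text: el.innerText.trim(),
    value: el instanceof HTMLInputElement ? el.value : undefined,
  }));
}

// queryDom("[role='alert']") returns a handful of small objects
// instead of the page's full markup.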

Why is a raw innerHTML dump such a poor fit for an LLM? Three reasons:

  1. Token waste. Raw HTML is mostly structural noise — closing tags, class attributes, data attributes, script contents. LLMs pay per token. You are paying to process markup that carries zero information about your bug.

  2. Signal dilution. Even when it fits in context, the LLM must locate a needle in a haystack. Error messages, form validation failures, and visible text get buried under layers of generated markup. Model attention is a finite resource.

  3. Fragility. innerHTML output changes with every framework update, CSS-in-JS hash rotation, and ad network injection. The representation is unstable and framework-dependent. The accessibility tree is stable because it represents semantics, not implementation.

Gasoline captures the accessibility tree directly from the browser via its Chrome extension. When an AI assistant calls the get_console_logs, get_accessibility_tree, or query_dom MCP tools, Gasoline returns structured, token-efficient data:

  • Accessibility tree: Full semantic structure of the page, typically 50-100x smaller than innerHTML
  • DOM queries: Targeted CSS selector queries returning only matching elements
  • Console logs: Errors and warnings already captured in real time, no DOM parsing needed

The result: your AI assistant gets the information it needs to debug your application without consuming the context window budget it needs to actually reason about the problem.
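Under the hood these are ordinary MCP tool calls. The request an assistant's client sends looks roughly like this (the selector argument name is an assumption about Gasoline's tool schema, shown only to illustrate the wire format):

// Illustrative MCP "tools/call" request (JSON-RPC 2.0), as an assistant's
// MCP client might send it. Argument names are assumptions, not the
// documented Gasoline schema.
const request = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "query_dom",
    arguments: { selector: "[role='alert']" },
  },
};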

npx gasoline-mcp@latest

One command. Zero dependencies. Your AI assistant gets clean, structured browser context instead of raw HTML noise.

Learn more about DOM queries ->