A web scraper pulls product prices from a retail site every morning. One day it returns empty results. The prices are still there, displayed correctly on the page. But the site's underlying structure changed—like rearranging furniture in a room where someone's memorized exactly where everything sits. The prices exist, but the agent can't find them because its map of the room is now wrong.
Metadata broke.
Definitions
Data is the content itself: product prices, inventory counts, customer records, transaction details. What you're trying to extract or analyze.
Metadata is everything about that content. Where it lives on a page. How it's structured. What it means in business terms. How it relates to other information. For web agents, metadata includes:
- CSS selectors and XPath expressions that locate elements
- Session cookies that maintain authentication
- Site structure maps that describe page organization
- Semantic information that explains what data represents
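The four kinds of metadata above can be pictured as one bundle the agent carries alongside the data it extracts. A minimal sketch, assuming nothing about any particular framework (every field name here is illustrative):

```python
from dataclasses import dataclass

@dataclass
class PageMetadata:
    """Illustrative bundle of the metadata a scraping agent carries.

    The four fields mirror the four kinds of metadata listed above;
    none of the names come from a real library.
    """
    selectors: dict[str, str]        # logical name -> CSS selector or XPath
    session_cookies: dict[str, str]  # authentication state
    structure_map: dict[str, str]    # page -> description of its layout
    semantics: dict[str, str]        # field -> what it means in business terms

meta = PageMetadata(
    selectors={"price": "span.product-price"},
    session_cookies={"sid": "abc123"},
    structure_map={"/product": "grid of product cards, price inside each card"},
    semantics={"price": "current retail price in USD, tax excluded"},
)
```

The point of keeping these in one place: when a scrape fails, each field is a separate thing that can have gone stale.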
Failure Patterns
Metadata failures compound. One break triggers others, creating operational problems that cascade through systems.
Location knowledge: A scraper searches for an HTML element using a CSS selector. The element moved or changed its identifier. The scraper returns nothing—the pointer to that data became stale. The content is right there on the page. The agent just can't locate it anymore.
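A stale pointer is easy to reproduce. The sketch below uses Python's standard-library `xml.etree.ElementTree` (whose limited XPath support is enough for a well-formed snippet; a real scraper would use an HTML parser). The page contents and the stored selector are invented for illustration:

```python
import xml.etree.ElementTree as ET

# Yesterday's page: the agent's stored XPath still matches.
old_page = '<html><body><span class="price">19.99</span></body></html>'
# Today's page: the site renamed the class. Same data, new structure.
new_page = '<html><body><span class="amount">19.99</span></body></html>'

STORED_XPATH = ".//span[@class='price']"  # the agent's (now stale) location metadata

def extract_price(html):
    node = ET.fromstring(html).find(STORED_XPATH)
    return node.text if node is not None else None

print(extract_price(old_page))  # 19.99
print(extract_price(new_page))  # None -- the price is on the page, the pointer is stale
```

Nothing in the data changed between the two pages; only the metadata describing where the data lives.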
Timing: Single-page applications load content after the initial page render. The HTML a scraper receives may not contain the desired data yet. Agents need metadata about what to wait for, which elements trigger loading, how long processes typically take. Without this structural metadata, they capture incomplete pages at the wrong moment.
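Browser-automation frameworks ship this wait logic built in; the idea reduces to a polling loop driven by timing metadata. A minimal sketch, with a fake page that "finishes rendering" after a short delay (the probe function and delays are invented for illustration):

```python
import time

def wait_for(probe, timeout=5.0, interval=0.1):
    """Poll `probe` until it returns a non-None value or `timeout` elapses.

    The arguments are the timing metadata in miniature: what to wait
    for (`probe`) and how long loading typically takes (`timeout`).
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = probe()
        if result is not None:
            return result
        time.sleep(interval)
    raise TimeoutError("element never appeared -- captured the page too early?")

# Simulate an SPA: the price is absent until "rendering" finishes.
state = {"rendered_at": time.monotonic() + 0.3}

def probe_price():
    return "19.99" if time.monotonic() >= state["rendered_at"] else None

print(wait_for(probe_price))  # 19.99, once the simulated render completes
```

An agent without the `timeout` and `probe` metadata is the agent that grabs the HTML immediately and captures an incomplete page.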
Context tracking: Agents track conversation history, tool execution results, authentication tokens through session metadata. When this tracking fails, agents lose all context after receiving tool responses. An agent executes a database query, receives results, then forgets what it was looking for. The query returned correctly, but the metadata about what happened and why disappeared.
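The failure mode is clearer with the session state written out. A minimal sketch of session metadata (the class and field names are illustrative, not any framework's API):

```python
class Session:
    """Illustrative session metadata: the agent's goal plus an ordered
    record of tool calls and their results."""

    def __init__(self):
        self.goal = None
        self.history = []

    def record(self, tool, args, result):
        self.history.append({"tool": tool, "args": args, "result": result})

session = Session()
session.goal = "find Q3 revenue for customer 42"
session.record("sql_query", {"sql": "SELECT revenue FROM sales"}, [{"revenue": 1200}])

# With intact session metadata, the agent still knows *why* it queried:
last = session.history[-1]
print(session.goal, "->", last["result"])

# If session.goal and session.history were lost here, the query result
# would be orphaned: correct data, no metadata about what it was for.
```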
Semantic understanding: An agent analyzes sales transactions and sees numbers: revenue figures, order counts, customer IDs. AI systems recognize patterns in those numbers, but patterns alone carry no meaning. Semantic metadata supplies the business context the patterns represent—customer hierarchies, seasonal variations, product relationships that explain what the numbers mean operationally.
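The gap between the numbers and their meaning can be shown as a plain lookup table. A sketch with invented rows and column descriptions:

```python
# Raw rows an agent might see: numbers without business meaning.
rows = [
    {"cust": 7, "amt": 1200, "q": 3},
    {"cust": 7, "amt": 400,  "q": 4},
]

# Semantic metadata (illustrative): what each column means operationally.
SEMANTICS = {
    "cust": "customer ID; 7 is a child account of an enterprise parent",
    "amt":  "net revenue in USD after discounts",
    "q":    "fiscal quarter; Q4 is seasonally low for this product line",
}

def explain(row):
    """Pair each value with the business context behind it."""
    return {col: (val, SEMANTICS[col]) for col, val in row.items()}

# Pattern detection alone says "amt dropped 67% in q=4"; the semantics
# say whether that is an anomaly or an expected seasonal dip.
print(explain(rows[1])["q"])
```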
Production Visibility
Organizations often go looking for data quality problems when it is actually their metadata infrastructure that has broken.
A team notices their pricing intelligence agent returning incomplete results. They check data pipelines: Are competitor sites down? Is the data malformed? Days pass while they validate data quality. Meanwhile, page layouts changed and CSS selectors no longer match the new structure.
Teams encounter different symptoms depending on which layer failed. Incorrect values, missing records, inconsistent formats—these point to data quality issues. Agents that can't find anything, lose context between steps, or misinterpret what they're looking at even when the underlying data is accurate—these point to broken metadata infrastructure.
Things to follow up on...
- Knowledge graphs for reasoning: Graph structures enable agents to perform multi-hop traversal for planning workflows where Task A must complete before Task B because A provides input to B.
- Session state architecture: Google's ADK documentation shows how state modifications are automatically tracked through EventActions and persisted by SessionService when agents execute tools.
- Metadata quality gap: Only 22% of organizations have comprehensive metadata for all their assets, yet AI performance depends on consistent metadata to retrieve correct information at the right time.
- Silent failure patterns: Scrapers may "succeed" while data pipelines quietly fail when JavaScript execution is required but not happening, with issues only appearing when running at volume.
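The multi-hop planning mentioned in the knowledge-graph follow-up reduces, in its simplest form, to a topological sort over a task dependency graph. A sketch using the standard-library `graphlib` (task names are invented):

```python
from graphlib import TopologicalSorter

# Illustrative task graph: each task maps to the set of tasks whose
# output it depends on ("A must complete before B because A feeds B").
deps = {
    "fetch_prices":  set(),
    "fetch_catalog": set(),
    "normalize":     {"fetch_prices"},
    "compare":       {"normalize", "fetch_catalog"},
}

# static_order() yields every task after all of its inputs.
order = list(TopologicalSorter(deps).static_order())
print(order)
```

Knowledge-graph traversal in agent frameworks is richer than this, but the ordering guarantee is the same: the graph is metadata about how tasks relate, and losing it means losing the plan.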

