
Market Pulse
Reading the agent ecosystem through a practitioner's lens

What $8 Billion Reveals About Where Agent Value Crystallizes

When enterprises evaluate web agent platforms, the first questions aren't about reasoning. They're operational: What happens when sites change overnight? How do I audit what agents actually did? Can I measure whether this works? Salesforce just spent over $8 billion on answers—eight companies acquired in 2025, each purchase revealing where defensible value actually crystallizes. The pattern only becomes visible when you're running production infrastructure. And it points somewhere most people aren't looking.

What Pricing Fragmentation Reveals About Running AI Agents at Scale

A login sequence that completes in two seconds on one site might require thirty seconds and three retry attempts on another. When you're orchestrating thousands of browser sessions simultaneously, that variance creates resource consumption differences of 10x or more. Most pricing models treat every "action" as equivalent.
Salesforce now offers three different ways to pay for the same agent platform. ServiceNow charges per "assist." Microsoft uses flat per-user fees. The fragmentation looks chaotic, but it's actually vendors learning in public what costs money when agents run continuously. And that learning is happening faster than anyone expected.
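To see why action-counting misprices this, consider a toy meter. Everything below is illustrative: the prices, the telemetry fields, and the cost model are assumptions for the sketch, not any vendor's actual billing logic.

```python
# Minimal sketch: why per-"action" pricing misprices agents.
# Prices and field names are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class ActionTrace:
    name: str
    wall_seconds: float  # how long the browser session stayed open per attempt
    retries: int         # retry attempts before success

def flat_action_cost(traces, price_per_action=0.01):
    """What an action-counting meter charges: every action is equal."""
    return len(traces) * price_per_action

def resource_cost(traces, price_per_session_second=0.0005):
    """What the work actually consumes: session-seconds across all attempts."""
    return sum(t.wall_seconds * (1 + t.retries) * price_per_session_second
               for t in traces)

# The two logins from the paragraph above: same "action", very different work.
fast_login = ActionTrace("login", wall_seconds=2, retries=0)
slow_login = ActionTrace("login", wall_seconds=30, retries=3)

print(flat_action_cost([fast_login]), flat_action_cost([slow_login]))  # 0.01 vs 0.01: equal
print(resource_cost([fast_login]), resource_cost([slow_login]))        # 0.001 vs 0.06: 60x apart
```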

Surface Story, Deeper Pattern

Why MCP's Rapid Adoption Validates More Than Market Momentum
When OpenAI, Google, and Microsoft adopted the Model Context Protocol within months of each other, skeptics saw middleware hype. Over 1,000 community connectors by February 2025. The usual ecosystem enthusiasm. But sometimes adoption numbers reveal something specific about architectural choices. MCP's rapid growth validates a particular kind of problem and a particular kind of solution. Understanding what problem it actually solves matters more than counting connectors.

Where MCP's Cooperative Design Meets Adversarial Reality
MCP's architecture handles cooperative data sources elegantly. Then it meets web environments that actively resist automation. Sites deploy bot detection. Sessions are scored for behavioral legitimacy. Page structure changes without warning. When Microsoft released Playwright-MCP, the implementation needed three session modes and lost storage state on close. Those aren't just technical details. They're signals of what happens when protocol assumptions designed for cooperative environments meet environments that operate on fundamentally different principles.
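The common workaround is to persist session state explicitly rather than trusting the session to survive. A minimal sketch in plain Playwright (Python): snapshot storage state before close, rehydrate it on the next launch. The file path and URL are placeholders, and this is the generic Playwright pattern, not Playwright-MCP's own mechanism.

```python
# Persisting browser session state across runs with Playwright (Python).
# STATE_FILE and the URL below are placeholder assumptions.

from pathlib import Path
from playwright.sync_api import sync_playwright

STATE_FILE = Path("auth_state.json")  # cookies + localStorage snapshot

with sync_playwright() as p:
    browser = p.chromium.launch()
    # Restore the previous session's cookies/localStorage if we saved them.
    context = browser.new_context(
        storage_state=str(STATE_FILE) if STATE_FILE.exists() else None
    )
    page = context.new_page()
    page.goto("https://example.com/dashboard")  # placeholder URL

    # ... agent performs its task here ...

    # Persist state *before* close, so the next session resumes logged in.
    context.storage_state(path=str(STATE_FILE))
    browser.close()
```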

Production Gap Reality Check
OpenAI announced Operator in January 2025: an agent that handles web tasks autonomously. Book restaurants, order groceries, plan vacations. The demo looked smooth.
You get a $200/month research preview that stops at every CAPTCHA and password field. It refuses financial transactions. Early users compared performance to "watching an arthritic half-blind grandma use a rusty typewriter."
The 38.1% success rate on the OSWorld benchmark tells you everything. Rate limits on concurrent tasks. Ninety-day data retention. Computational costs OpenAI calls "cost-prohibitive for widespread use." The gap between "can interact with browsers" and "can reliably complete tasks" remains enormous.
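A back-of-the-envelope calculation shows why that number forecloses autonomy. Assuming tasks fail independently and nobody intervenes (both generous simplifications), chaining even a short workflow collapses fast:

```python
# What a 38.1% per-task success rate implies for unattended use.
# Assumes independent failures and no human rescue; real failures
# correlate, which usually makes things worse.

p = 0.381

# A three-step errand (book a table, order groceries, plan a trip):
print(f"3-task workflow succeeds: {p**3:.1%}")   # ~5.5%, about 1 in 18

# Expected attempts before a single task lands:
print(f"expected attempts per task: {1/p:.1f}")  # ~2.6
```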
Novel architecture, genuine innovation. But production reality? Still distant.

The claim: Agent autonomously navigates websites, processes pixel data to understand interfaces, achieves state-of-the-art benchmark performance on the WebVoyager evaluation suite.
The reality: Research preview requires constant supervision, fails at CAPTCHAs and payment screens, refuses financial transactions, and operates slower than a human baseline.
The constraints: Rate limits prevent concurrent task execution, a ninety-day data retention window, and computational costs acknowledged as prohibitive for scaling beyond research users.
The guardrails: Mandatory human oversight for sensitive operations, U.S.-only availability with no European timeline, and Pro subscription tier gatekeeping to manage computational load.
The verdict: Real architectural progress on browser interaction, but the reliability gap for production deployment hasn't meaningfully closed. Still research, not product.
Quiet Tech That Compounds
The latest model announcement gets the headlines. The newest agent demo that can order pizza and book flights gets the social media buzz. But something else is happening that matters more for anyone building systems meant to run in production.
Infrastructure that makes agents actually work is reaching maturity. Not with press releases, but with incremental progress that compounds: observability standards that prevent vendor lock-in, evaluation frameworks measuring what enterprises care about, cost optimization making continuous operation economically viable.
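Observability is the most concrete of the three. A sketch of what vendor-neutral instrumentation looks like with OpenTelemetry's Python SDK; the attribute names are invented for illustration, since semantic conventions for agent actions haven't settled yet.

```python
# Emitting agent telemetry as standard OpenTelemetry spans, so traces
# survive a switch of backend or agent framework. Attribute names here
# ("agent.action", etc.) are illustrative, not an official convention.

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))  # swap for any OTLP backend
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("web-agent")

# One span per agent action: auditable, queryable, vendor-neutral.
with tracer.start_as_current_span("login") as span:
    span.set_attribute("agent.action", "login")
    span.set_attribute("agent.retries", 3)
    span.set_attribute("agent.session_seconds", 30.0)
```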
This won't trend. But it's what separates impressive demos from systems that ship and stay shipped. Six developments that serious builders are watching because they solve the problems that kill production deployments.
