
Market Pulse
Reading the agent ecosystem through a practitioner's lens

The Infrastructure Gap Between Promising Outcomes and Charging for Them

Been watching the clouds gather around seat-based pricing for months—every customer conversation circles back to "why am I paying for licenses when agents could do this?" Automation Anywhere buying Aisera this week feels like the first real attempt to actually build infrastructure for charging by outcomes, not just talking about it. Their 40% fewer ITSM seats claim isn't what got me—it's that they're betting they can prove what agents accomplished and charge for that at enterprise scale.
That's a completely different infrastructure problem than most vendors are solving. You need telemetry customers actually trust, ways to handle authentication breaks gracefully, observability that catches failures before customers do. Most "outcome-based pricing" is just seat licenses with creative accounting. If this works, it rewrites software economics when agents replace people. If it doesn't, it's expensive theater showing why seats persist.
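To make "telemetry customers actually trust" concrete, here is a minimal sketch of one way a vendor could make outcome events verifiable rather than just asserted: sign each event with a key the customer holds, so billing claims can be checked independently. Every name and field here is hypothetical, not Automation Anywhere's or anyone's actual API.

```python
import hashlib
import hmac
import json
import time

# Hypothetical shared key, exchanged with the customer out of band.
SHARED_KEY = b"customer-held-verification-key"

def record_outcome(agent_id: str, workflow: str, result: str) -> dict:
    """Build a billing-grade outcome event with a verifiable signature."""
    event = {
        "agent_id": agent_id,
        "workflow": workflow,
        "result": result,          # e.g. "ticket_resolved"
        "timestamp": time.time(),
    }
    payload = json.dumps(event, sort_keys=True).encode()
    event["signature"] = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()
    return event

def verify_outcome(event: dict) -> bool:
    """Customer-side check: recompute the signature over the same payload."""
    claimed = event.get("signature", "")
    body = {k: v for k, v in event.items() if k != "signature"}
    payload = json.dumps(body, sort_keys=True).encode()
    expected = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(claimed, expected)
```

The point of the sketch: once an event is signed, the customer can detect tampering after the fact, which is the minimum bar before anyone pays per outcome instead of per seat.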

Which Web Automation Components Are Actually Swappable

Been watching the composability wave build—swap models, connect tools, build from whatever works best. Forty-five enterprise providers just committed to interoperability. The story says everything becomes modular. But when you're actually running web agents at scale, something weird happens: the components that look easiest to replace turn out to be the hardest. And the pieces that seem locked-in? Those compose fine.
The next six months will sort this. Teams operating at scale will figure out which parts of their agent infrastructure can actually be swapped versus which need to work together. The signals are already there—just not where the composability narrative predicts. Worth watching, because the teams that read this right will have infrastructure that scales. The ones that don't will spend 2026 rebuilding what they thought they could outsource.

Surface Story, Deeper Pattern

ServiceNow's Agent Play Isn't About Better Technology
Watched a team deploy ServiceNow agents last week and the thing that got me wasn't the tech quality... it's how each workflow they automate makes leaving harder. Every incident management process that gets agent-ified is another reason they can't switch platforms. That's the real story—ServiceNow doesn't need better AI than LangChain, they just own the layer where work happens. So adding agents extends the moat, right? We see the opposite at TinyFish—workflows that cross system boundaries, no platform owns them—and it's a completely different infrastructure problem.

Why Vertical Platforms Let Horizontal Tools Do the Hard Work
But here's what ServiceNow's actually doing while everyone watches those deployment numbers... they're letting horizontal frameworks do all the hard work. LangChain, AutoGen, all of them—basically free R&D. They take the risk figuring out orchestration patterns, what works at scale, what doesn't. ServiceNow just observes, waits to see what proves out, then integrates the good stuff. No experimentation cost. We're doing this too, stress-testing infrastructure approaches that might not work... and vertical platforms learn from everyone's failures without funding any of it. Kind of brilliant, honestly.

Production Gap Reality Check
OpenAI launched ChatGPT agent in January 2025 promising autonomous browser-based task execution. The demo showed smooth navigation, intelligent decision-making, workflows that run themselves.
Then people tried to run it.
Leon Furze tested it extensively. His conclusion: "OpenAI hasn't released this product because it works. They've released it to be first." The system takes 1-2 seconds per action. It fails silently on dynamic interfaces. No telemetry when things break. Users report treating it "like a junior intern with admin access: watch carefully and limit what they see."
The vision-based approach creates inherent constraints. Taking screenshots and deciding where to click works differently than structured API calls. Modern web interfaces confuse the visual understanding model. Hidden state, hover effects, multi-step forms. What scores 68.9% on BrowseComp breaks in production environments where consistency matters.
The gap between announcement and production reveals where agent technology actually stands right now.
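The silent-failure problem above comes down to the interaction model. A vision agent fires a click at predicted coordinates with no return channel; a structured call returns an explicit status. This is illustrative scaffolding only, with invented function names, not OpenAI's implementation:

```python
# Hypothetical contrast between the two interaction models.

def vision_click(screenshot: bytes, target: str) -> None:
    """Vision-style action: pick coordinates from pixels, click, hope.
    If the element was hidden behind a hover state or the page
    re-rendered mid-action, nothing here reports the failure."""
    x, y = (412, 307)  # stand-in for model-predicted coordinates
    # browser.click(x, y)  <- fire-and-forget; no confirmation comes back

def structured_submit(form_data: dict) -> dict:
    """API-style action: a structured call with an explicit status,
    so failures surface immediately instead of silently."""
    if "ticket_id" not in form_data:
        return {"ok": False, "error": "missing ticket_id"}  # observable failure
    return {"ok": True, "ticket_id": form_data["ticket_id"]}
```

The asymmetry is the whole story: the structured path can be retried, logged, and alerted on, while the vision path needs a human watching the screen to know anything went wrong.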
Promised: Autonomous execution of complex workflows, including calendar analysis, research synthesis, and slide deck creation, with intelligent website navigation and real-time decision-making.
In practice: 1-2 seconds per action, frequent failures on dynamic interfaces, silent errors without telemetry, 30-minute runtime limits, white-screen crashes from WebSocket failures.
Security: Launch delayed by unresolved prompt injection vulnerabilities. No session isolation means tasks share browser state. The vision-based approach struggles with hidden UI elements and hover effects.
Cost: Enterprise pricing requires a $100K+ annual commitment with 150-user minimums. Agents consume substantially more compute per user than standard ChatGPT. Hybrid API approaches are needed for reliability.
Bottom line: Vision-based web automation hits fundamental constraints that API-structured interfaces avoid. The race is to ship first, not because production requirements are solved.
Quiet Tech That Compounds
While everyone watches model benchmarks and demo videos, a different set of developments is quietly solving the problems that actually keep agents out of production.
How do agents from different providers communicate without custom integrations? How do you make API costs predictable when context windows keep growing? How do you guarantee valid JSON without post-processing failures? These questions don't generate headlines. They generate working systems.
This week's developments share a pattern: standards reaching maturity, cost optimization enabled by default, deterministic outputs replacing probabilistic hope, observability that integrates with existing enterprise stacks. Infrastructure that fades into the background. Which is exactly what production systems need.
