
Market Pulse
Reading the agent ecosystem through a practitioner's lens

The Infrastructure Gap Between Promising Outcomes and Charging for Them

Been watching the clouds gather around seat-based pricing for months—every customer conversation circles back to "why am I paying for licenses when agents could do this?" Automation Anywhere buying Aisera this week feels like the first real attempt to actually build infrastructure for charging by outcomes, not just talking about it. Their 40% fewer ITSM seats claim isn't what got me—it's that they're betting they can prove what agents accomplished and charge for that at enterprise scale.
That's a completely different infrastructure problem than most vendors are solving. You need telemetry customers actually trust, ways to handle authentication breaks gracefully, observability that catches failures before customers do. Most "outcome-based pricing" is just seat licenses with creative accounting. If this works, it rewrites software economics when agents replace people. If it doesn't, it's expensive theater showing why seats persist.
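To make "telemetry customers actually trust" concrete, here is a minimal sketch of one way a vendor could make outcome events verifiable rather than just asserted: sign each event with a key the customer holds, so billing claims can be checked independently. Every name and field here is hypothetical, not Automation Anywhere's or anyone's actual API.

```python
import hashlib
import hmac
import json
import time

# Hypothetical shared key, exchanged with the customer out of band.
SHARED_KEY = b"customer-held-verification-key"

def record_outcome(agent_id: str, workflow: str, result: str) -> dict:
    """Build a billing-grade outcome event with a verifiable signature."""
    event = {
        "agent_id": agent_id,
        "workflow": workflow,
        "result": result,          # e.g. "ticket_resolved"
        "timestamp": time.time(),
    }
    payload = json.dumps(event, sort_keys=True).encode()
    event["signature"] = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()
    return event

def verify_outcome(event: dict) -> bool:
    """Customer-side check: recompute the signature over the same payload."""
    claimed = event.get("signature", "")
    body = {k: v for k, v in event.items() if k != "signature"}
    payload = json.dumps(body, sort_keys=True).encode()
    expected = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(claimed, expected)
```

The point of the sketch: once an event is signed, the customer can detect tampering after the fact, which is the minimum bar before anyone pays per outcome instead of per seat.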

Which Web Automation Components Are Actually Swappable

Been watching the composability wave build—swap models, connect tools, build from whatever works best. Forty-five enterprise providers just committed to interoperability. The story says everything becomes modular. But when you're actually running web agents at scale, something weird happens: the components that look easiest to replace turn out to be the hardest. And the pieces that seem locked-in? Those compose fine.
The next six months will sort this. Teams operating at scale will figure out which parts of their agent infrastructure can actually be swapped versus which need to work together. The signals are already there—just not where the composability narrative predicts. Worth watching, because the teams that read this right will have infrastructure that scales. The ones that don't will spend 2026 rebuilding what they thought they could outsource.

Surface Story, Deeper Pattern

ServiceNow's Agent Play Isn't About Better Technology
Watched a team deploy ServiceNow agents last week and the thing that got me wasn't the tech quality... it's how each workflow they automate makes leaving harder. Every incident management process that gets agent-ified is another reason they can't switch platforms. That's the real story—ServiceNow doesn't need better AI than LangChain, they just own the layer where work happens. So adding agents extends the moat, right? We see the opposite at TinyFish—workflows that cross system boundaries, no platform owns them—and it's a completely different infrastructure problem.

Why Vertical Platforms Let Horizontal Tools Do the Hard Work
But here's what ServiceNow's actually doing while everyone watches those deployment numbers... they're letting horizontal frameworks do all the hard work. LangChain, AutoGen, all of them—basically free R&D. They take the risk figuring out orchestration patterns, what works at scale, what doesn't. ServiceNow just observes, waits to see what proves out, then integrates the good stuff. No experimentation cost. We're doing this too, stress-testing infrastructure approaches that might not work... and vertical platforms learn from everyone's failures without funding any of it. Kind of brilliant, honestly.

Production Gap Reality Check
OpenAI launched ChatGPT agent in January 2025 promising autonomous browser-based task execution. The demo showed smooth navigation, intelligent decision-making, workflows that run themselves.
Then people tried to run it.
Leon Furze tested it extensively. His conclusion: "OpenAI hasn't released this product because it works. They've released it to be first." The system takes 1-2 seconds per action. It fails silently on dynamic interfaces. No telemetry when things break. Users report treating it "like a junior intern with admin access: watch carefully and limit what they see."
The vision-based approach creates inherent constraints. Taking screenshots and deciding where to click works differently than structured API calls. Modern web interfaces confuse the visual understanding model. Hidden state, hover effects, multi-step forms. What scores 68.9% on BrowseComp breaks in production environments where consistency matters.
The gap between announcement and production reveals where agent technology actually stands right now.
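The silent-failure problem above comes down to the interaction model. A vision agent fires a click at predicted coordinates with no return channel; a structured call returns an explicit status. This is illustrative scaffolding only, with invented function names, not OpenAI's implementation:

```python
# Hypothetical contrast between the two interaction models.

def vision_click(screenshot: bytes, target: str) -> None:
    """Vision-style action: pick coordinates from pixels, click, hope.
    If the element was hidden behind a hover state or the page
    re-rendered mid-action, nothing here reports the failure."""
    x, y = (412, 307)  # stand-in for model-predicted coordinates
    # browser.click(x, y)  <- fire-and-forget; no confirmation comes back

def structured_submit(form_data: dict) -> dict:
    """API-style action: a structured call with an explicit status,
    so failures surface immediately instead of silently."""
    if "ticket_id" not in form_data:
        return {"ok": False, "error": "missing ticket_id"}  # observable failure
    return {"ok": True, "ticket_id": form_data["ticket_id"]}
```

The asymmetry is the whole story: the structured path can be retried, logged, and alerted on, while the vision path needs a human watching the screen to know anything went wrong.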
Promised: Autonomous execution of complex workflows, including calendar analysis, research synthesis, and slide deck creation, with intelligent website navigation and real-time decision-making.
In practice: 1-2 seconds per action, frequent failures on dynamic interfaces, silent errors without telemetry, 30-minute runtime limits, white-screen crashes from WebSocket failures.
Security: Launch delayed by unresolved prompt injection vulnerabilities. No session isolation means tasks share browser state. The vision-based approach struggles with hidden UI elements and hover effects.
Cost: Enterprise pricing requires a $100K+ annual commitment with 150-user minimums. Agents consume substantially more compute per user than standard ChatGPT. Hybrid API approaches are needed for reliability.
Bottom line: Vision-based web automation hits fundamental constraints that API-structured interfaces avoid. The race is to ship first, not because production requirements are solved.
Quiet Tech That Compounds
While everyone watches model benchmarks and demo videos, a different set of developments is quietly solving the problems that actually keep agents out of production.
How do agents from different providers communicate without custom integrations? How do you make API costs predictable when context windows keep growing? How do you guarantee valid JSON without post-processing failures? These questions don't generate headlines. They generate working systems.
This week's developments share a pattern: standards reaching maturity, cost optimization enabled by default, deterministic outputs replacing probabilistic hope, observability that integrates with existing enterprise stacks. Infrastructure that fades into the background. Which is exactly what production systems need.
