Nearly $8 billion flowed into agent infrastructure companies in February 2026. Temporal, Databricks, Daytona, Basis, Astelia. Every one of them, in announcements and investor quotes, pitched some version of the same word: reliability. They do not mean the same thing.
Temporal's CEO Samar Abbas frames reliability as durability: agents crash mid-workflow and need to resume with full context intact. The orchestration layer never forgets what was happening. Daytona's Ivan Burazin locates the failure somewhere else entirely. Today's cloud, he argues, was built for stateless, immutable production workloads. Agents explore, branch, backtrack. Reliability means the execution environment fits the shape of the work. Two companies, both saying "reliability," describing problems that don't even share a layer of the stack.
Then the scale shifts. Ali Ghodsi told CNBC that 80% of databases on Databricks' platform are now built by AI agents, not people. Their $7 billion raise is partly earmarked for Lakebase, a serverless Postgres database designed for agent workloads. Reliability here means the data layer keeps up with what agents actually do to it. Astelia's $35 million raise targets security prioritization, using agents to surface which vulnerabilities are genuinely exploitable out of millions of candidates. A different definition again. A different layer.
Four plausible diagnoses, each shaped by where its company comes from and what it already knows how to build. Four encoded theories of where agents fail.
Infrastructure choices are load-bearing in ways that application choices simply aren't. Pick the wrong project management tool and you migrate. Painful, but bounded. Infrastructure compounds differently. Choose a durability-first orchestration layer and your agents inherit assumptions about workflow structure. Choose sandboxed runtimes and you inherit assumptions about execution isolation. Those assumptions propagate upward into everything built on top of them, shaping what's easy and what's expensive for years. The organizations choosing agent infrastructure right now are committing to a theory of failure before the market has settled on which failures matter most, and often before they've fully articulated what success looks like for agents in their own environment.
That would be fine if these theories were converging. There's no evidence they are. Orchestration, runtime, data layer, security. Each leads to a different architecture that gets harder to unwind over time. And each vendor's pitch sounds equally reasonable, because each describes real problems that real agents actually have.
Gartner predicts over 40% of agentic AI projects will be canceled by the end of 2027. Model limitations will account for some of those cancellations, and organizational readiness for others. But a meaningful share will trace back to this moment, when teams locked into infrastructure tuned for one layer of the reliability problem while a different layer turned out to be the one that mattered for their workload.
Agents probably fail in all of these places, at different times, for different reasons. And categories do sometimes develop shared vocabulary. "Observability" eventually came to mean something more specific than "monitoring but fancier." But there's no guarantee this category will converge before the commitments being made now have already compounded. Every infrastructure bet carries a quiet additional risk: you may have picked a perfectly good vendor solving a problem that turns out to be the wrong one for your workload.
A genuinely hard thing to evaluate in a procurement cycle. And worth naming clearly, because nearly $8 billion suggests the market is moving considerably faster than the definitions underneath it.
Things to follow up on...

- NIST wants your input: The new AI Agent Standards Initiative has open comment periods on agent security (March 9 deadline) and agent identity/authorization (April 2), which could begin imposing shared definitions on a category that currently lacks them.
- Protocol security is uncharted: A February 2026 preprint analyzing four leading agent communication protocols found that none have formal threat models, meaning the interoperability layer connecting these infrastructure bets has its own unresolved reliability questions.
- Half of agents work alone: Salesforce's latest connectivity report found that 50% of enterprise agents operate in isolated silos rather than as part of multi-agent systems, suggesting the infrastructure fragmentation described here extends into how organizations actually deploy them.
- Amazon fired the first shot: Amazon's January 2026 lawsuit against Perplexity's Comet over automated shopping agents is the first legal challenge to agentic browser technology, adding a legal layer to the question of what "reliable" agent infrastructure even gets to do.

