CURRENT | Market Pulse

The Signal

The Eight-Hour Agent Nobody's Running

By Nora Kaplan— October 27, 2025

Feature image for article: The Eight-Hour Agent Nobody's Running

Been tracking AWS's eight-hour runtime thing since July—longest in the industry, apparently—and the weirdest part is nobody's talking about actually using it. Spent enough time thinking about collaboration tools to know: trusting software to work unsupervised for a full business day requires organizational muscles most teams haven't built yet. The infrastructure's here. The confidence? Nowhere close. Feels like watching storm clouds roll in while everyone's still deciding if they need an umbrella. Capability exists, readiness doesn't.

The Signal

The Eight-Hour Agent Nobody's Running

By Nora Kaplan— October 27, 2025

Been tracking AWS's eight-hour runtime thing since July—longest in the industry, apparently—and the weirdest part is nobody's talking about actually using it. Spent enough time thinking about collaboration tools to know: trusting software to work unsupervised for a full business day requires organizational muscles most teams haven't built yet. The infrastructure's here. The confidence? Nowhere close. Feels like watching storm clouds roll in while everyone's still deciding if they need an umbrella. Capability exists, readiness doesn't.

Field Notes

The Velocity Mismatch

By Nora Kaplan— October 27, 2025

Feature image for article: The Velocity Mismatch

Been watching this pattern for weeks now and it's getting clearer—teams are deploying agents way faster than they're building the governance infrastructure to manage them. McKinsey dropped numbers this month showing 80% already running agents, 96% planning expansion, but when you ask about barriers it's all governance tools. Classic mismatch between what moves fast (spinning up agents) and what moves slow (audit trails, access controls, observability across systems).

The weather's shifting though. Keycard Labs just raised $38M specifically for agent governance, which tells you investors see this gap widening into a real category. Next few months will separate the teams building controls alongside deployment from the ones who'll hit a wall when they can't reconstruct what an agent did overnight. Feels like we're heading into a governance retrofitting wave, similar to technical debt cleanup cycles but with compliance stakes instead of velocity stakes.

Field Notes

The Velocity Mismatch

By Nora Kaplan— October 27, 2025

Been watching this pattern for weeks now and it's getting clearer—teams are deploying agents way faster than they're building the governance infrastructure to manage them. McKinsey dropped numbers this month showing 80% already running agents, 96% planning expansion, but when you ask about barriers it's all governance tools. Classic mismatch between what moves fast (spinning up agents) and what moves slow (audit trails, access controls, observability across systems).

The weather's shifting though. Keycard Labs just raised $38M specifically for agent governance, which tells you investors see this gap widening into a real category. Next few months will separate the teams building controls alongside deployment from the ones who'll hit a wall when they can't reconstruct what an agent did overnight. Feels like we're heading into a governance retrofitting wave, similar to technical debt cleanup cycles but with compliance stakes instead of velocity stakes.

Surface Story, Deeper Pattern

Dual Lens

When Infrastructure Becomes a Category

Kernel raised $22M for browser automation infrastructure this month. Not remarkable because it's novel—remarkable because it's becoming its own category. When Perplexity pays for browser automation instead of building it, that's the signal. Something crossed from "feature we should own" to "infrastructure someone else handles." Spent years watching this pattern: observability, auth, now browser automation. The complexity isn't in spinning up Puppeteer. It's in making it work reliably when websites actively resist you. First piece tracks when that threshold gets crossed.

Dual Lens

The Problem Infrastructure Can't Solve

Second piece is the uncomfortable part. All that infrastructure—session management, CAPTCHA solving, fingerprint rotation—exists because we're making agents pretend to be humans. The web resists programmatic access by design. Not a bug. An architectural reality accumulated through millions of decisions about bot prevention, personalization, security. Infrastructure can wrap that complexity. Can't resolve it. Can't make websites behave like APIs when they're fundamentally experiences testing whether you have eyes and a mouse. We're not fixing the web. We're learning to work around what it became.

Surface Story, Deeper Pattern

Dual Lens

When Infrastructure Becomes a Category

Kernel raised $22M for browser automation infrastructure this month. Not remarkable because it's novel—remarkable because it's becoming its own category. When Perplexity pays for browser automation instead of building it, that's the signal. Something crossed from "feature we should own" to "infrastructure someone else handles." Spent years watching this pattern: observability, auth, now browser automation. The complexity isn't in spinning up Puppeteer. It's in making it work reliably when websites actively resist you. First piece tracks when that threshold gets crossed.

Dual Lens

The Problem Infrastructure Can't Solve

Second piece is the uncomfortable part. All that infrastructure—session management, CAPTCHA solving, fingerprint rotation—exists because we're making agents pretend to be humans. The web resists programmatic access by design. Not a bug. An architectural reality accumulated through millions of decisions about bot prevention, personalization, security. Infrastructure can wrap that complexity. Can't resolve it. Can't make websites behave like APIs when they're fundamentally experiences testing whether you have eyes and a mouse. We're not fixing the web. We're learning to work around what it became.

Surface Story, Deeper Pattern

Dual Lens

When Infrastructure Becomes a Category

Kernel raised $22M for browser automation infrastructure this month. Not remarkable because it's novel—remarkable because it's becoming its own category. When Perplexity pays for browser automation instead of building it, that's the signal. Something crossed from "feature we should own" to "infrastructure someone else handles." Spent years watching this pattern: observability, auth, now browser automation. The complexity isn't in spinning up Puppeteer. It's in making it work reliably when websites actively resist you. First piece tracks when that threshold gets crossed.

Dual Lens

The Problem Infrastructure Can't Solve

Second piece is the uncomfortable part. All that infrastructure—session management, CAPTCHA solving, fingerprint rotation—exists because we're making agents pretend to be humans. The web resists programmatic access by design. Not a bug. An architectural reality accumulated through millions of decisions about bot prevention, personalization, security. Infrastructure can wrap that complexity. Can't resolve it. Can't make websites behave like APIs when they're fundamentally experiences testing whether you have eyes and a mouse. We're not fixing the web. We're learning to work around what it became.

Production Gap Reality Check

AgentKit Promises Complete Tools, Delivers Partial Assembly

OpenAI launched AgentKit as "a complete set of tools for developers and enterprises to build, deploy, and optimize agents." The pitch: solve fragmented tooling, eliminate weeks of frontend work, cut development time by half.

You get visual workflow builders, connector registries, and evaluation hooks. These package existing capabilities into more accessible forms. Useful? Yes. Complete? Only if you're building demos.

What's missing from the announcement: pricing models for inference at scale, security frameworks for prompt injection vulnerabilities, governance structures for multi-agent systems running continuously. The cited success stories—Klarna handling two-thirds of support tickets, Clay's 10x growth—preceded AgentKit's launch. These companies succeeded despite tooling fragmentation, not because someone solved it.

Running agents at scale still requires technical teams to develop and tune systems, hybrid approaches mixing quick wins with long-term architecture, and governance frameworks that treat agents as production infrastructure.

AgentKit lowers the barrier to experimentation. But it shifts complexity from orchestration to operations. The hard parts remain hard, just differently packaged.

Production Gap Reality Check

AgentKit Promises Complete Tools, Delivers Partial Assembly

OpenAI launched AgentKit as "a complete set of tools for developers and enterprises to build, deploy, and optimize agents." The pitch: solve fragmented tooling, eliminate weeks of frontend work, cut development time by half.

You get visual workflow builders, connector registries, and evaluation hooks. These package existing capabilities into more accessible forms. Useful? Yes. Complete? Only if you're building demos.

What's missing from the announcement: pricing models for inference at scale, security frameworks for prompt injection vulnerabilities, governance structures for multi-agent systems running continuously. The cited success stories—Klarna handling two-thirds of support tickets, Clay's 10x growth—preceded AgentKit's launch. These companies succeeded despite tooling fragmentation, not because someone solved it.

Running agents at scale still requires technical teams to develop and tune systems, hybrid approaches mixing quick wins with long-term architecture, and governance frameworks that treat agents as production infrastructure.

AgentKit lowers the barrier to experimentation. But it shifts complexity from orchestration to operations. The hard parts remain hard, just differently packaged.

Timeline Reality:

OpenAI's customer wins represent established deployments that preceded AgentKit's October launch, suggesting the platform packages learnings from production systems rather than enabling fundamentally new capabilities for enterprise adoption.

Security Omission:

The announcement makes no mention of prompt injection vulnerabilities, which remain an unsolved security problem that browser agents and production systems continue to expose in real-world deployments across the industry.

Hybrid Strategies:

Organizations report adopting mixed approaches—piloting quick wins with provider solutions while building internal expertise via frameworks—to capture near-term value without sacrificing long-term control or accumulating vendor lock-in debt.

Inference Economics:

Industry data shows 74% of builders now run majority inference workloads, up from 48% a year ago, but AgentKit provides no detail on pricing models, cost predictability, or evaluation feature impacts.

Framework Competition:

Microsoft launched its Agent Framework the same week, emphasizing open standards and interoperability, revealing how vendors race to define the production stack while the market discovers what production actually requires.

Quiet Tech That Compounds

Standards Work

OpenTelemetry Conventions for Agent Observability

OpenTelemetry is establishing semantic conventions for agent telemetry. The payoff: compare performance across frameworks, switch implementations without rebuilding your observability stack. Interoperability at the infrastructure layer means your monitoring investment survives framework churn.

Framework Consolidation

Microsoft Merges AutoGen and Semantic Kernel

The Agent Framework public preview consolidates research tooling with enterprise foundations. Built-in observability and durability as native features, not afterthoughts. Fewer frameworks to evaluate, more reliability assumptions you can count on. Maturity looks like convergence.

Observability Evolution

Evaluations Join Metrics as Core Observability

Azure now treats evaluations and governance enforcement as observability capabilities, not separate concerns. Non-deterministic systems need systematic quality assessment to run in production. Traditional metrics tell you what happened. Evaluations tell you if it was correct.

Governance Infrastructure

Governance Tooling Blocks AI Adoption

McKinsey found governance tooling is the top barrier to AI adoption. Only 26% of organizations have integrated GenAI standards into governance frameworks. Task adherence, prompt shields, PII detection determine what ships. Demos are easy. Compliance is the gate.

Regulatory Compliance

EU AI Act Enforcement Underway

The EU AI Act started enforcement in August 2024, full compliance expected by 2026. Risk-based classification means high-risk applications face stricter scrutiny. Architectural decisions about data governance, bias prevention, and audit trails are happening now, not later.

Monitoring Maturity

SRE Agents Built Into Observability Platforms

Grafana's Assistant investigations and Middleware's OpsAI shift monitoring from reactive to proactive. AI-powered root cause analysis, predictive alerts, automated remediation. The infrastructure that lets agents run 24/7 without human intervention. Operations capability, not monitoring theater.

Quiet Tech That Compounds

Standards Work

OpenTelemetry Conventions for Agent Observability

OpenTelemetry is establishing semantic conventions for agent telemetry. The payoff: compare performance across frameworks, switch implementations without rebuilding your observability stack. Interoperability at the infrastructure layer means your monitoring investment survives framework churn.

Framework Consolidation

Microsoft Merges AutoGen and Semantic Kernel

The Agent Framework public preview consolidates research tooling with enterprise foundations. Built-in observability and durability as native features, not afterthoughts. Fewer frameworks to evaluate, more reliability assumptions you can count on. Maturity looks like convergence.

Observability Evolution

Evaluations Join Metrics as Core Observability

Azure now treats evaluations and governance enforcement as observability capabilities, not separate concerns. Non-deterministic systems need systematic quality assessment to run in production. Traditional metrics tell you what happened. Evaluations tell you if it was correct.

Governance Infrastructure

Governance Tooling Blocks AI Adoption

McKinsey found governance tooling is the top barrier to AI adoption. Only 26% of organizations have integrated GenAI standards into governance frameworks. Task adherence, prompt shields, PII detection determine what ships. Demos are easy. Compliance is the gate.

Regulatory Compliance

EU AI Act Enforcement Underway

The EU AI Act started enforcement in August 2024, full compliance expected by 2026. Risk-based classification means high-risk applications face stricter scrutiny. Architectural decisions about data governance, bias prevention, and audit trails are happening now, not later.

Monitoring Maturity

SRE Agents Built Into Observability Platforms

Grafana's Assistant investigations and Middleware's OpsAI shift monitoring from reactive to proactive. AI-powered root cause analysis, predictive alerts, automated remediation. The infrastructure that lets agents run 24/7 without human intervention. Operations capability, not monitoring theater.