Practitioner's Corner
Human-in-the-loop was the safety net for agent systems. When approval rates hit 95%, it's worth asking what the net actually catches.

The Checkpoint That Approves Everything

Most enterprise teams running agent workflows have never checked the approval rate on their human-in-the-loop controls. When someone finally does, the number lands above 95%. Nearly everything gets waved through. The reviewers are behaving rationally given where the gate sits. How organizations build these checkpoints where they do, why the checkpoints persist long after anyone could justify their placement, and why replacing them with something real is structurally harder than it sounds: that's where the number becomes interesting.

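Measuring the number is the easy part, which is what makes its absence telling. A minimal sketch of the check most teams never run, assuming a review log with a per-action decision field (the field names here are hypothetical stand-ins for whatever your review tooling actually records):

```python
from collections import Counter

def approval_rate(decisions):
    """Fraction of reviewed agent actions that were approved.

    `decisions` is an iterable of decision strings such as "approved"
    or "rejected" -- illustrative labels, not any specific tool's schema.
    """
    counts = Counter(decisions)
    total = sum(counts.values())
    return counts["approved"] / total if total else 0.0

# A log where reviewers wave nearly everything through:
log = ["approved"] * 97 + ["rejected"] * 3
print(f"{approval_rate(log):.0%}")  # prints 97%
```

If that one-liner over your own logs comes back above 95%, the checkpoint is a formality, whatever the runbook says.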
When Should an Agent Stop Thinking?

Government portals, insurance dashboards, vendor procurement systems. No API, no programmatic access. Just a browser and forms built for humans. Suchintan Singh's Skyvern navigates these surfaces by reasoning through a workflow once with an LLM, then compiling the result into deterministic code. No model in the loop on the second run. The architecture is designed to need less intelligence over time, which amounts to a specific claim about where judgment actually matters in agent systems, and where it's just cost.
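The reason-once-then-replay pattern can be sketched in a few lines. This is an illustrative cache-and-replay skeleton, not Skyvern's actual internals: `plan_with_llm`, the step schema, and the `page.perform` interface are all hypothetical.

```python
import json
from pathlib import Path

def run_workflow(task_id, page, plan_with_llm, cache_dir=Path("plans")):
    """First run: an LLM reasons through the form and each step is
    recorded. Later runs: replay the cached steps deterministically,
    with no model in the loop.
    """
    cache = cache_dir / f"{task_id}.json"
    if cache.exists():
        # Second run onward: pure deterministic replay, zero LLM cost.
        steps = json.loads(cache.read_text())
    else:
        # First run: the expensive reasoning pass, compiled to a plan.
        steps = plan_with_llm(page)
        cache_dir.mkdir(parents=True, exist_ok=True)
        cache.write_text(json.dumps(steps))
    for step in steps:
        page.perform(step["action"], step["selector"], step.get("value"))
```

The interesting design question is the inverse: when the replayed page has drifted and a selector no longer matches, the system has to fall back to the reasoning pass, which is where "needs less intelligence over time" meets reality.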


The Counter-Example
BakerHostetler pointed ROSS Intelligence at 27 terabytes of bankruptcy filings and told it to surface citations, retrieve documents, and identify precedent. Attorneys kept one job: deciding what the research meant.
The 60% reduction in research hours didn't come from removing humans. It came from concentrating them at the interpretation layer, where professional judgment is both legally required and genuinely load-bearing. Pattern-matching got automated. The moment where context meets liability stayed human.
That's a checkpoint designed around consequence, not comfort.

