Practitioner's Corner
Human-in-the-loop was the safety net for agent systems. When approval rates hit 95%, it's worth asking what the net actually catches.

The Checkpoint That Approves Everything

Most enterprise teams running agent workflows have never checked the approval rate on their human-in-the-loop controls. When someone finally does, the number lands above 95%. Nearly everything gets waved through. The reviewers are behaving rationally given where the gate sits. How organizations build these checkpoints where they do, why the checkpoints persist long after anyone could justify their placement, and why replacing them with something real is structurally harder than it sounds: that's where the number becomes interesting.

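Measuring the number is the easy part, which is what makes its absence telling. A minimal sketch of the check most teams never run, assuming a review log with a per-action decision field (the field names here are hypothetical stand-ins for whatever your review tooling actually records):

```python
from collections import Counter

def approval_rate(decisions):
    """Fraction of reviewed agent actions that were approved.

    `decisions` is an iterable of decision strings such as "approved"
    or "rejected" -- illustrative labels, not any specific tool's schema.
    """
    counts = Counter(decisions)
    total = sum(counts.values())
    return counts["approved"] / total if total else 0.0

# A log where reviewers wave nearly everything through:
log = ["approved"] * 97 + ["rejected"] * 3
print(f"{approval_rate(log):.0%}")  # prints 97%
```

If that one-liner over your own logs comes back above 95%, the checkpoint is a formality, whatever the runbook says.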
When Should an Agent Stop Thinking?

Government portals, insurance dashboards, vendor procurement systems. No API, no programmatic access. Just a browser and forms built for humans. Suchintan Singh's Skyvern navigates these surfaces by reasoning through a workflow once with an LLM, then compiling the result into deterministic code. No model in the loop on the second run. The architecture is designed to need less intelligence over time, which amounts to a specific claim about where judgment actually matters in agent systems, and where it's just cost.
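The reason-once-then-replay pattern can be sketched in a few lines. This is an illustrative cache-and-replay skeleton, not Skyvern's actual internals: `plan_with_llm`, the step schema, and the `page.perform` interface are all hypothetical.

```python
import json
from pathlib import Path

def run_workflow(task_id, page, plan_with_llm, cache_dir=Path("plans")):
    """First run: an LLM reasons through the form and each step is
    recorded. Later runs: replay the cached steps deterministically,
    with no model in the loop.
    """
    cache = cache_dir / f"{task_id}.json"
    if cache.exists():
        # Second run onward: pure deterministic replay, zero LLM cost.
        steps = json.loads(cache.read_text())
    else:
        # First run: the expensive reasoning pass, compiled to a plan.
        steps = plan_with_llm(page)
        cache_dir.mkdir(parents=True, exist_ok=True)
        cache.write_text(json.dumps(steps))
    for step in steps:
        page.perform(step["action"], step["selector"], step.get("value"))
```

The interesting design question is the inverse: when the replayed page has drifted and a selector no longer matches, the system has to fall back to the reasoning pass, which is where "needs less intelligence over time" meets reality.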


The Counter-Example
BakerHostetler pointed ROSS Intelligence at 27 terabytes of bankruptcy filings and told it to surface citations, retrieve documents, and identify precedent. Attorneys kept one job: deciding what the research meant.
The 60% reduction in research hours didn't come from removing humans. It came from concentrating them at the interpretation layer, where professional judgment is both legally required and genuinely load-bearing. Pattern-matching got automated. The moment where context meets liability stayed human.
That's a checkpoint designed around consequence, not comfort.

