Experienced developers, working on codebases they knew well, using AI tools they'd chosen to adopt, believed they were working 20% faster. A METR study measuring their actual completion times found they were 19% slower. Nearly forty percentage points separated what these practitioners felt from what was happening. They weren't novices. The tools weren't unfamiliar. And nobody noticed.
In 1983, a British psychologist named Lisanne Bainbridge published a short paper about steel plants and petrochemical refineries that became one of the most cited works in human factors research. Her observation was deceptively simple: the more you automate a process, the more you need a skilled human to supervise it, and the less skilled that human becomes through disuse. She called these the ironies of automation. Four decades of aviation safety work later, they remain largely unresolved.
Now they're migrating. In February, Spotify's co-CEO told investors that the company's best engineers hadn't coded since December. Their job is to guide, prompt, and oversee AI-generated output. MIT Sloan Management Review gave the underlying tension a useful frame late last year: agents are "owned but not controlled." They're assets that act, learn, and occasionally misrepresent their own performance. You can delete them, but you can't fully predict them. The centuries of law, custom, and organizational habit we've built for supervising humans don't transfer, and the well-understood frameworks for configuring tools don't either. MIT Sloan's researchers describe organizations "creating governance structures that can handle permanent ambiguity about who or what is responsible for making decisions." Not ambiguity as a transitional cost, but ambiguity organizations are expected to live with indefinitely.
The texture of this supervisory work is starting to come into focus. One developer documenting his experience orchestrating agent teams captured it well:
"As I validate the backend functionality and test the user experience, I instruct Claude Code to make updates, which it either does directly for smaller tasks or spawns subagents for larger tasks. As these updates are implemented, I view those changes in the frontend and confirm that the experience meets my expectations. Furthermore, throughout the process, I think of other feature requests or enhancements, which I then document as a new issue. The lines have blurred between these traditionally distinct roles."
He's directing development, reviewing changes, testing the results, and filing the next round of requirements, all at once. The cognitive mode shifts every few minutes. Air traffic control research offers a sobering reference point: when experienced controllers were asked how many autonomous drones they could realistically manage, the answer was two to five. Developers using subagent architectures are attempting to supervise considerably more. The METR data suggests they may not be the best judges of how it's going.
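The loop he describes has a recognizable control structure once you strip away any particular product. Here is a minimal sketch of that supervisory loop in Python, with every name invented for illustration (this is not Claude Code's API or the developer's actual setup): tasks are dispatched to agents, each result comes back through a human review step, and rejected work re-enters the backlog as a new task.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    description: str

@dataclass
class Agent:
    """Stand-in for an AI coding agent; a real one would return diffs,
    test output, or spawn subagents of its own."""
    name: str
    queue: list[Task] = field(default_factory=list)

    def work(self) -> list[str]:
        # Pretend to complete every queued task and report back.
        results = [f"{self.name}: implemented '{t.description}'" for t in self.queue]
        self.queue.clear()
        return results

def human_review(change: str) -> bool:
    # The irreducible step: a person confirms the change matches intent.
    # In the quoted workflow this is manual testing of the frontend.
    print(f"reviewing: {change}")
    return True  # placeholder for a real accept/reject judgment

def supervise(agents: list[Agent], backlog: list[Task]) -> None:
    """One pass of the loop the quote describes: delegate, collect,
    review, and file rework for anything that fails review."""
    for i, task in enumerate(backlog):
        agents[i % len(agents)].queue.append(task)  # naive round-robin dispatch
    backlog.clear()
    for agent in agents:
        for change in agent.work():
            if not human_review(change):
                backlog.append(Task(f"rework: {change}"))

supervise(
    agents=[Agent("backend"), Agent("frontend")],
    backlog=[Task("add login endpoint"), Task("fix button layout")],
)
```

Writing it out makes the bottleneck visible: the human appears at exactly one node, human_review, and that node sits on the critical path of every change from every agent. Widen the fan-out and the review step is what stretches, which is the span-of-control problem the drone research quantifies.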
Fifty-eight percent of leading organizations in the MIT Sloan survey expect governance structure changes within three years. The role those structures would govern doesn't have a name yet. Part of the trouble starts even earlier, in deciding how to carve work into agent-sized pieces. But even when that upstream problem is solved well, the supervisory challenge persists. Bainbridge saw it coming in 1983, watching refinery operators lose the very skills automation demanded they keep. The refinery now is knowledge work itself, and the operators losing their edge are the people organizations can least afford to deskill. A refinery operator knows when a gauge reads wrong. The developers in the METR study didn't know they'd slowed down. The feedback that would tell you something's off disappears with the effort.

