Practitioner's Corner
For every task an agent automates, new work emerges that nobody budgeted for. This issue follows the labor that never made it into the spreadsheet.

The Work the Spreadsheet Can't See

A single agent step running at 95% reliability sounds fine. Chain twenty steps and you're below 36%. That gap has to be managed by someone: prompt maintenance, drift detection, failure triage across layers that didn't exist before deployment. None of it appears in the business case that funded the project. The accounting framework used to justify automation has no line item for work the automation itself generates. The costs are real, and they accumulate where no instrument exists to catch them.
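The arithmetic behind that drop can be sketched in a few lines. This is a minimal illustration, not a model of any particular agent stack: it assumes every step fails independently and shares the same reliability, and the function name `chain_success_rate` is hypothetical.

```python
def chain_success_rate(step_reliability: float, steps: int) -> float:
    """Probability that every step in a chain succeeds.

    Assumption (not from the article): steps fail independently and
    all share the same per-step reliability.
    """
    if not 0.0 <= step_reliability <= 1.0:
        raise ValueError("step_reliability must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    # Independent successes multiply, so the chain succeeds with p ** n.
    return step_reliability ** steps


if __name__ == "__main__":
    # Print how end-to-end reliability decays as the chain grows.
    for steps in (1, 5, 10, 20):
        rate = chain_success_rate(0.95, steps)
        print(f"{steps:>2} steps at 95% each: {rate:.1%} end to end")
```

Under those assumptions, twenty steps at 95% each complete roughly 35.8% of the time. Real pipelines rarely have fully independent steps, so treat the figure as an order-of-magnitude guide rather than a forecast.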

Sumeet Vaidya and the Distance Between Writing Code and Shipping It

An AI agent writes a code change in seconds. It compiles. It passes the sandbox. It touches a database schema, a caching layer, an auth service, and nobody finds out whether it actually works until the cost of finding out has already multiplied. Sumeet Vaidya spent a decade at Facebook, Uber, and Discord watching that distance between "looks right" and "works in production" grow wider with every new service dependency. With Crafting, he's placed a very specific bet on where the wall is, and it lives in the space between generated code and the production environment that has to accept it.


The Professional Noticer Keeping AI Agents From Quietly Losing Their Minds

The Maintenance Curve

Gartner predicts that over 40% of agentic AI projects will be canceled by the end of 2027. Most will be narrated as technology failures. Look closer and the pattern is financial: teams that built fast discover they've inherited platform-scale obligations on a prototype-scale budget.
The trajectory is remarkably consistent. Ship an agent, wire up basic logging, call it supervised. Within months, evaluation suites, audit infrastructure, model migration cycles, and governance layers arrive uninvited. Engineering maintenance alone runs $3,000 to $6,000 monthly per mid-complexity agent. Development environments, with their clean data and cooperative inputs, never hinted at any of this.
By the time the true operating cost surfaces, the project is already under executive scrutiny with no clean exit.
Past Articles

During their January 2026 Launch Week, Skyvern shipped a feature that lets users upload a PDF of a human standard operat...

A single step succeeds 95% of the time. Chain twenty of those steps together and the workflow completes 36% of the time....

Most browser agent frameworks begin by taking a screenshot. Feed pixels to a model, ask it where to click. Magnus Müller...

The IRS's Individual Master File has been running since 1961. It was supposed to be replaced decades ago. Instead, layer...


