A facilities manager's purchasing card has a $5,000 monthly limit restricted to hardware stores and maintenance suppliers. Try to buy dinner on it, and the transaction declines at the register before any human reviews anything. Merchant category codes, dollar thresholds, real-time validation. Thirty years of delegated spending authority, distilled into infrastructure so reliable it's invisible.
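That kind of control logic is simple enough to sketch. The following is a hypothetical illustration (the class and its fields are invented for this example, not any issuer's actual API), using two real merchant category codes:

```python
# Illustrative sketch of purchasing-card controls: merchant category
# codes (MCCs) plus a monthly limit, enforced before any human review.
HARDWARE_STORES = "5251"   # real MCC: hardware stores
EATING_PLACES = "5812"     # real MCC: eating places / restaurants

class PurchasingCard:
    def __init__(self, monthly_limit_cents, allowed_mccs):
        self.monthly_limit_cents = monthly_limit_cents
        self.allowed_mccs = allowed_mccs
        self.spent_this_month_cents = 0

    def authorize(self, amount_cents, mcc):
        """Approve or decline a single transaction in real time."""
        if mcc not in self.allowed_mccs:
            return False  # declined at the register: wrong category
        if self.spent_this_month_cents + amount_cents > self.monthly_limit_cents:
            return False  # declined: would exceed the monthly limit
        self.spent_this_month_cents += amount_cents
        return True

card = PurchasingCard(500_000, {HARDWARE_STORES})  # $5,000/month, hardware only
assert card.authorize(12_000, HARDWARE_STORES)     # lumber: approved
assert not card.authorize(8_000, EATING_PLACES)    # dinner: declined
```

Every check operates on one transaction in isolation, which is exactly the property the rest of this piece pokes at.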
This is what card networks are now extending to AI agents. Mastercard's Agent Pay issues dedicated tokens to registered agents, encoding a user's stated intent so the network can decline transactions that violate it. Visa's Trusted Agent Protocol lets merchants distinguish legitimate AI agents from bots. Both have completed live agentic transactions across multiple countries. The fit looks natural. Card networks know delegates.
Purchasing cards govern transactions. Agents operate on goals. The distance between those two things is wider than it looks.
When someone approves "book my team offsite, economy flights, under $10,000 total," the agent decomposes that into flights, hotels, ground transport, maybe a restaurant reservation. Each becomes a separate charge against a separate merchant code. The card infrastructure sees individual line items. It doesn't see the goal that generated them, or whether the collection of purchases, taken together, serves the intent the human actually had.
Mastercard's token design suggests awareness of this gap. If a user specifies "economy class, under £350," the token can reportedly decline a business-class booking at £780 before money moves. That works beautifully for goals that decompose cleanly into per-transaction constraints. "Under £350" translates into a spending cap. "Economy class" maps to a category restriction.
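A constraint set like that reduces to a predicate over a single transaction. The sketch below is an assumption about the shape of such a check, not the actual Agent Pay token schema; the field names are invented:

```python
# Hypothetical per-transaction intent check, in the spirit of
# Mastercard's token design. Constraint and transaction fields
# are illustrative assumptions, not a real schema.
def violates_intent(constraints, txn):
    """Return the reasons a transaction violates the encoded intent."""
    reasons = []
    if txn["amount_gbp"] > constraints["max_amount_gbp"]:
        reasons.append("over spending cap")
    if txn["fare_class"] != constraints["fare_class"]:
        reasons.append("wrong category")
    return reasons

intent = {"max_amount_gbp": 350, "fare_class": "economy"}
booking = {"amount_gbp": 780, "fare_class": "business"}
print(violates_intent(intent, booking))  # → ['over spending cap', 'wrong category']
```

Note what the predicate can see: a number and a category. There is no field it could inspect to decide whether this was the "best option."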
"Find the best option" is a different animal. Best for whom, by what criteria, against what alternatives? The judgment lives in the space between words. No token encodes it. At ChargebackX 2025, Ravelin's Jamie George described what happens when instructions pass through chains of sub-agents: a user's request reaches an orchestration agent, which calls a travel sub-agent, which calls a third-party API, which charges the card. The hotel is wrong. George called it "the most obvious case for Chinese whispers." Each handoff slightly reshapes the instruction, and the drift compounds. The orchestration agent interprets "same weekend as last time" as a date range. The sub-agent matches the date range to available inventory. The API optimizes for price within that inventory. Every individual decision is locally reasonable. The aggregate drifts from what the human meant, and there's no single point where it broke.
Liability gets concrete here. Card networks provide zero-liability protections for unauthorized transactions, routing losses through chargebacks. When an agent acts within its technical credentials and still gets the goal wrong, though, the transaction was authorized in every way the network can verify. The human just didn't mean that. Conference participants observed that card schemes won't absorb liability for agent transactions, customers will disclaim purchases their agent botched, and the AI provider hasn't collected revenue on the transaction. The loss needs somewhere to land. No existing framework cleanly assigns it.
The Ramp-Visa partnership promises "real-time controls built into the transaction itself" for automated corporate bill pay. Real-time transaction controls catch the wrong merchant code, the exceeded limit, the blocked category. A chain of reasonable-looking charges that, taken together, served nobody's actual purpose looks fine at every checkpoint.
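The failure mode is easy to make concrete. In this illustrative sketch (the charges and checks are invented, modeled on the offsite example from earlier), every per-transaction control approves, yet the trip the charges add up to is wrong end to end:

```python
# Four offsite charges, each cleared against a category allowlist and
# a $10,000 running cap, while the drift they encode stays invisible.
ALLOWED = {"airline", "hotel", "ground", "restaurant"}
CAP_CENTS = 1_000_000  # $10,000 total budget

charges = [
    ("airline", 240_000, "economy fare, but to the wrong city"),
    ("hotel", 310_000, "right chain, wrong weekend"),
    ("ground", 40_000, "shuttle matched to the wrong airport"),
    ("restaurant", 55_000, "reservation nobody can attend"),
]

total = 0
for category, amount_cents, _drift in charges:
    assert category in ALLOWED               # category check: passes
    total += amount_cents
    assert total <= CAP_CENTS                # running-cap check: passes
# Every checkpoint approved every line item. The third tuple field,
# the drift, is exactly the information no checkpoint ever reads.
```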
The architecture that governs a facilities manager's hardware-store card was never designed to evaluate whether a sequence of AI decisions added up to what someone wanted.
The first real signal will probably surface in a dispute filing.
Things to follow up on...
- Amazon v. Perplexity ruling: A federal court's preliminary injunction in March 2026 held that a user's permission to an AI shopping agent doesn't override a merchant's prohibition, previewing the dual-authority conflicts agentic commerce will keep generating.
- Agent identity at scale: IBM's 2026 analysis finds enterprises now run 45 to 92 non-human identities per human employee, and compromised agent credentials take 287 days to detect versus 73 for human accounts, suggesting the transaction-level controls card networks are building sit atop an identity layer that's still largely ungoverned.
- Google's authorization protocol: Google's Agent Payments Protocol (AP2) takes a different approach by creating cryptographically signed "Mandates" that record both the user's original instructions and the agent's specific purchase decisions, offering one model for bridging the gap between goal-level intent and transaction-level verification.
- Reliability research implications: Princeton researchers found that agent reliability gains lag capability gains by factors of two to seven, which takes on a different character when the unreliable output isn't a wrong answer but a wrong purchase on someone else's card.
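The mandate idea is worth a rough sketch. This is a loose assumption about the shape of AP2's approach, not its actual specification: field names are invented, and an HMAC stands in for the real cryptographic signatures, but the core move survives — a signed record of the goal, and a signed record of the purchase that points back at it.

```python
# Hypothetical sketch of signed intent/cart records, loosely inspired
# by AP2's "Mandates". HMAC is a stand-in for real signatures; the
# schema is invented for illustration.
import hashlib
import hmac
import json

USER_KEY = b"user-device-key"  # hypothetical user-held signing key

def sign(key, payload):
    """Sign a canonical JSON encoding of the payload."""
    blob = json.dumps(payload, sort_keys=True).encode()
    return hmac.new(key, blob, hashlib.sha256).hexdigest()

intent_mandate = {"instructions": "economy flights, under 350 GBP"}
intent_sig = sign(USER_KEY, intent_mandate)

cart_mandate = {
    "intent_sig": intent_sig,  # binds this cart to the stated goal
    "items": [{"fare_class": "economy", "amount_gbp": 310}],
}
cart_sig = sign(USER_KEY, cart_mandate)

# A dispute handler can later verify both records and compare the
# cart against the instructions it claims to satisfy.
assert hmac.compare_digest(intent_sig, sign(USER_KEY, intent_mandate))
assert hmac.compare_digest(cart_sig, sign(USER_KEY, cart_mandate))
```

What this buys you in a dispute is an auditable pair: what the human said, and what the agent did, each tamper-evident. Whether the cart actually *satisfies* the instructions is still the judgment problem the essay describes.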

