In 1978, Bob Frankston sat at a DEC terminal with no screen, just an endless roll of paper, and remoted into a mainframe to write the program that would make mainframes unnecessary. VisiCalc, the first spreadsheet for personal computers, was coded entirely on a timesharing system, at night, when hourly rates dropped. The software that would liberate a generation of business managers from the data processing department was born inside the data processing department.
Apple turned fifty this month. The company's origin story is well-rehearsed; the architectural proposition it happened to carry tends to disappear inside it. Computation belongs where the person is. The Apple II shipped in 1977 with seven expansion slots and a full schematic in the user manual. The owner decided what the machine became. Not a vendor, not a systems administrator, not whoever controlled the hourly rate on a timesharing account. Ninety percent of Apple IIs ended up in small businesses, because VisiCalc turned a $2,000 microcomputer into something that replaced a subscription costing hundreds per month. Everything the machine knew, it knew locally.
The Macintosh pushed this further in 1984. Steve Jobs described what fit inside 64 kilobytes of ROM: the entire operating system, the graphics subsystem, the windowing and menu and mouse logic. MacPaint ran in 128KB of RAM on an 8 MHz processor. The Xerox Alto, which pioneered the same graphical ideas, had required institutional infrastructure around it. The Mac needed a desk and a wall outlet. Unplug it, carry it somewhere, plug it back in. Everything still there. No connection to maintain. No dependency to manage. No round trip.
You type a question into ChatGPT and the experience looks almost identical. Screen, keyboard, a response appearing as if the machine is thinking. But your keystrokes travel to a data center you will never visit, running models with hundreds of billions of parameters your laptop cannot hold, and the answer travels back. OpenAI handles 2.5 billion of these round trips daily. The surface of personal computing survived while the architecture underneath quietly inverted.
In January, the Brookings Institution named this directly:
"Today's AI resembles the mainframe phase of that earlier era."
They noted that devices increasingly ship with neural processing units, that smaller models keep gaining capability, and that the forces which once pushed computing outward could activate again. Apple's own silicon can now run a 30-billion-parameter model on a MacBook Pro, with time-to-first-token under three seconds.
Brookings offered no timeline, though. And the distance between a capable local model and a frontier system doesn't close with better chips alone. Global AI data center spending is projected at $400 to $450 billion this year. That kind of capital concentration builds its own gravity, the same way mainframe economics did before a $100 program on a $2,000 computer broke the hold.
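To make the local side of that distance concrete, here is a minimal sketch of on-device inference on Apple silicon using the open-source mlx-lm package. The model repo id is a placeholder for any quantized model on the mlx-community hub, not the 30-billion-parameter configuration Apple benchmarked, and API details vary somewhat across mlx-lm versions.

```python
# Minimal on-device inference sketch with mlx-lm (pip install mlx-lm).
# The repo id below is a placeholder for any quantized model published on
# the mlx-community Hugging Face hub, not a specific Apple benchmark setup.
import time
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Qwen2.5-7B-Instruct-4bit")  # placeholder repo

# A plain prompt string; chat-tuned models usually do better if you apply
# the tokenizer's chat template first, omitted here to keep the sketch short.
prompt = "Summarize why spreadsheets moved computing onto the desk."

start = time.time()
# verbose=True streams the reply and prints prompt/generation throughput stats
response = generate(model, tokenizer, prompt=prompt, max_tokens=200, verbose=True)
print(f"\nWall-clock for the whole reply: {time.time() - start:.1f}s")
```

Everything in that loop, weights included, lives on the laptop: no round trip, no hourly rate, no data center.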
Frankston wrote VisiCalc on the system it was about to replace. He could see the whole shape of the thing from inside it. Fifty years into the cycle Apple helped start: is anyone building on today's centralized infrastructure looking at the same opening? Or is the terminal, this time around, where the architecture settles?
Things to follow up on...
- Inference eats the budget: Deloitte projects that inference workloads will account for roughly two-thirds of all AI compute by year-end 2026, up from one-third in 2023, with the inference-optimized chip market crossing $50 billion.
- Apple's local AI push: Apple's M5 chip with Neural Accelerators can push time-to-first-token under three seconds for a 30-billion-parameter model, as detailed in their MLX and Neural Accelerators research.
- The energy question, contested: IEEE Spectrum reports that while OpenAI claims 0.34 watt-hours per average query, researchers estimate the most capable models can consume over 20 watt-hours for complex queries, a gap that matters as inference scales to trillions of annual requests. A back-of-envelope sketch follows this list.
- The System/360 echo: IBM's 1964 bet that architecture should be separated from implementation nearly collapsed under its own software crisis, a pattern IEEE Spectrum traced in detail as the project that almost destroyed the company.
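The back-of-envelope below uses only figures already cited here: OpenAI's claimed 0.34 Wh average, the roughly 20 Wh complex-query estimate, and the 2.5 billion daily queries mentioned earlier. Pricing every query at the high figure is an upper-bound illustration of the gap, not an estimate of actual consumption.

```python
# Scale of the per-query energy gap, using only figures cited in the notes above.
QUERIES_PER_DAY = 2.5e9          # reported daily query volume
WH_LOW, WH_HIGH = 0.34, 20.0     # claimed average vs. complex-query estimate, in Wh

queries_per_year = QUERIES_PER_DAY * 365
for label, wh in [("claimed average", WH_LOW), ("complex-query estimate", WH_HIGH)]:
    twh_per_year = queries_per_year * wh / 1e12  # Wh -> TWh
    print(f"{label}: {wh} Wh/query -> {twh_per_year:.2f} TWh/year")

# Roughly 0.31 TWh/year versus ~18 TWh/year: about a 60x spread at identical volume.
```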

