This interview has not happened yet. It is set in April 2028, constructed from the trajectory of real shifts already underway — IBM's redesigned entry-level roles, KPMG's "Great Skills Reset" framework, the governance infrastructure being built as you read this. The subject is fictional. The situation he describes probably isn't.
By 2028, the first full cohort of "agent oversight" hires will be two years into careers that didn't exist when they were in college. They walked into job descriptions rewritten from scratch, roles that once meant producing work and now mean reviewing what agents produce.1 They were screened for critical thinking and systems reasoning, not domain knowledge.2 They are the human-in-the-loop.
We sat down with one of them.
Tomás "TJ" Jurado-Walsh, 25, is an Associate in Agent Oversight & Validation at a large professional services firm. Philosophy degree from UC Santa Cruz. Six-month AI Operations certificate. Hired in 2026 into a role his manager describes as "the conscience layer." He reads audit trails, validates agent-generated client deliverables, handles escalations, and explains automated decisions to people who experienced the consequences.
He arrived fifteen minutes early and spent the time checking something on his phone that he later described as "a flagged anomaly that turned out to be nothing, which is most of them, which is sort of the whole problem."
Two years in. Walk me through a normal day.
TJ: The boring version is I review agent outputs before they go to clients. Compliance reports, risk assessments, synthesized briefings. I'm looking for reasoning gaps, bias patterns, things that don't track. I also handle exceptions, cases the agents flag as uncertain, or cases that should have been flagged but weren't. That second category is the more interesting one.
The less boring version: I'm a translator. Half my job is explaining to a client why an automated system made a specific recommendation. The other half is explaining to the people who maintain the system why a client is upset about something the system doesn't understand it did.
You came straight out of a certificate program. Did you feel prepared?
TJ: Extremely. Which I now recognize is a data point about the problem, not about my preparation.
How so?
TJ: When I started, I scored really well on the screening assessments. The systems-thinking stuff, the decision-making simulations.3 And the onboarding was genuinely good. They taught us the governance frameworks, the audit trail architecture, the escalation protocols. I could read a Purview log by week three. I felt like I knew what I was doing.
Then about eight months in, I approved a deliverable, a synthesized market analysis, and my manager pulled me aside afterward. "Did anything feel off about the competitive landscape section?" I said no. She said, "The agent weighted two data sources equally that any analyst who'd built these manually would know aren't comparable. One's a survey with a 200-person sample and the other's transaction data from 40,000 accounts."
I just didn't have the reflex. I'd never built one of those analyses by hand. I didn't know what the texture of good sourcing felt like. I knew the checklist. I didn't know the thing the checklist was trying to approximate.
That's a very specific gap.
TJ: Extremely specific, and that's what makes it hard. I'm not missing some big conceptual framework. I have more conceptual frameworks than the people who came before me. I can talk about agent failure modes, I can cite the taxonomies. What I'm missing is... okay, my manager calls it "the flinch." That moment where something looks fine but feels wrong. She built that over years of doing the work herself. I'm supposed to have it from reading the output of the work.
Those are not the same thing.
Can you develop it?
TJ: I think I'm developing something. Whether it's the same thing, I genuinely don't know. I've gotten better at pattern recognition. I can spot when an agent's confidence score doesn't match the hedging in its language. But that's a heuristic I built from reading agent outputs. My baseline for "normal" is agent-generated. I don't have a human-generated baseline to compare against.
And look, I know this sounds like I'm spiraling. I'm good at my job. My reviews are thorough. But there's research suggesting the vast majority of AI failures are invisible, that the system gets it wrong and nobody catches it.4 My entire role exists to catch those. And I sometimes wonder how I'd know if I were sitting inside one of those invisible failures right now.
Does that keep you up at night?
TJ: Not really. Which also might be the problem.
Your firm has invested heavily in governance infrastructure. Does that help?
TJ: Massively. The tooling now versus when I started is night and day. Agent registries, identity controls, data access scoping. I can see exactly what an agent touched and in what order.5 The scaffolding is real. But here's the thing nobody talks about: I grew up inside the scaffolding. I know what the guardrails are. I don't always know what they were built to prevent. My manager lived through the period when this stuff was getting figured out. She has scar tissue from specific failures. I have documentation about those failures.
Scar tissue and documentation do different things to your nervous system.
There's a concern in the industry about the management pipeline, that redesigning entry-level roles might create a gap in mid-level talent.6 Do you think about that?
TJ: Oh, constantly. My firm is very explicit about it. They're investing in us specifically because they're worried about the pipeline. Which creates this weird dynamic where I know I'm being groomed for something, and I know the reason I'm being groomed is that the normal path to getting there was removed. They took out the stairs and installed an elevator and now they're worried we won't have strong legs. Their solution is to put a Peloton in the elevator.
Is the Peloton working?
TJ: My quads look great. Whether I can climb stairs if the power goes out... ask me in 2030.
Last question. If you could go back and do the old version of your job, the pre-redesign entry-level role, the one the agents handle now, would you want to?
TJ: (long pause)
Yeah. For like six months. Not because I think the old way was better. I think what I do now is genuinely more interesting and probably more valuable. But there's something I can't get from oversight alone. I can validate an output. I can flag an anomaly. I can escalate with the right context. What I can't do is feel the difference between an answer that's right and an answer that's almost right, because I've never been wrong in the specific way that teaches you that.
My manager was wrong hundreds of times before she got good. I've been approximately correct since day one. And I'm starting to think approximately correct is the most dangerous place to be.
TJ checked his phone again as we wrapped up. Another flagged anomaly. He opened it, scanned it for about four seconds, and closed it. "Nothing," he said. He was probably right.
Footnotes
1. IBM CHRO Nickle LaMoreaux stated that entry-level job descriptions from two to three years ago have been formally rewritten, shifting from task-driven work to analysis, problem-solving, and responsible AI use. https://itbrief.news/story/explainer-ibm-to-triple-us-entry-level-hiring-amid-ai-shift
2. McKinsey's gamified assessment tool Solve screens for critical thinking, decision-making, and systems thinking rather than prior business knowledge. 73% of recruiters rank critical thinking as their top concern for 2026 hiring. https://www.ibm.com/think/news/entry-level-roles-get-reset-ai
3. McKinsey Chief Learning and Development Officer Heather Stefanski: "We are doubling down on what makes you uniquely human — and inserting more tech." https://www.ibm.com/think/news/entry-level-roles-get-reset-ai
4. Bessemer Venture Partners, "AI Infrastructure Roadmap," March 31, 2026: "78% of AI failures are invisible — AI gets something wrong, but no one catches it."
5. KPMG Q4 2025 AI Pulse Survey: 75% of enterprises prioritize security, compliance, and auditability as the most critical requirements for agent deployment; 60% restrict agent access to sensitive data without human oversight. https://kpmg.com/us/en/media/news/q4-ai-pulse.html
6. Korn Ferry found that 37% of organizations plan to replace early career roles with AI, risking an eventual shortage of mid-level managers. IBM warns that cutting entry-level hiring may save money short-term but weaken the management pipeline. https://fortune.com/2026/02/13/tech-giant-ibm-tripling-gen-z-entry-level-hiring-according-to-chro-rewriting-jobs-ai-era/
