There's a claim running through a lot of AI writing right now: execution gets cheap, judgment stays scarce. I believed it when I first wrote it. Now I think it's half right, and the wrong half is the one people are building on.
Frontier models already make judgment calls that would have impressed a senior analyst a year ago. They weigh tradeoffs, catch edge cases, reason through ambiguity. "Judgment is what humans uniquely bring" is starting to sound a lot like "creativity is what humans uniquely bring." Which aged badly.
Follow the logic all the way down and the scarce thing isn't judgment. It's context.
The window
A model can reason brilliantly, but only over what you put in its context window. The quality of its output is bounded by the quality of what goes in: what's included, what's left out, what format it arrives in, what memory gets surfaced, what tools get connected.
Assembling that context is a skill that compounds with experience. Knowing which three documents matter out of three hundred. Knowing the client's tone shifted last Tuesday and that changes what "urgent" means. Knowing the CRM data is six months stale and the real numbers live in someone's spreadsheet.
The model can reason over whatever you hand it. It can't go find what it doesn't know is missing.
This is why the shift from prompt engineering to context engineering matters more than it sounds. Prompt engineering was about phrasing your question well. Context engineering is about assembling the right world for the model to reason inside: what to surface, what to connect, what to leave out, what format makes the relationships visible. The skill is older than AI. Good managers have always done it: set up the situation so a capable person can make a good call without understanding everything from scratch.
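The assembly step described above can be made concrete with a small sketch. Everything here is illustrative, not a real API: `Doc`, `assemble_context`, the relevance scores, and the staleness flag are all hypothetical stand-ins for whatever selection and curation a real workflow would do.

```python
# A minimal sketch of context assembly: surface the few documents that
# matter, flag stale sources, leave the rest out rather than cram it in.
# All names and thresholds are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class Doc:
    name: str
    text: str
    relevance: float   # 0..1, how much this doc matters for the task
    stale: bool        # e.g. a CRM export that's six months old

def assemble_context(task: str, docs: list[Doc], budget_chars: int = 2000) -> str:
    """Build a prompt context: most relevant docs first, stale ones flagged,
    low-relevance ones left out entirely."""
    parts, used = [f"TASK: {task}"], 0
    for d in sorted(docs, key=lambda d: d.relevance, reverse=True):
        if d.relevance < 0.5:
            continue  # deciding what to leave out is part of the skill
        flag = " [STALE - verify before using]" if d.stale else ""
        block = f"\n## {d.name}{flag}\n{d.text}"
        if used + len(block) > budget_chars:
            break  # respect the window; don't dilute what's already in
        parts.append(block)
        used += len(block)
    return "\n".join(parts)
```

The point of the sketch is the shape, not the scoring: the hard part in practice is knowing which three documents out of three hundred deserve a relevance score above the cutoff.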
The bottleneck
Memory is the mechanism that turns context from a one-shot input into something that accumulates.
Most AI systems today start each session with whatever you feed them plus whatever they remember from before. That memory is thin: conversation snippets, preference flags, maybe some project context. But it's deepening fast, and the gap between a system with good memory and one without is already the gap between useful and useless.
The person who curates and corrects an AI system's memory is doing context engineering whether they call it that or not. They're shaping what the model knows about their work, preferences, and blind spots. That accumulated context is what makes the output good. Not the model's raw capability.
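That curation loop can be sketched in a few lines. This is a hypothetical toy, not how any particular product stores memory: the idea is just that sessions append facts, the user corrects or deletes them, and each new session's context is assembled from what survived.

```python
# A minimal sketch of a curated memory store. The class and method names
# are illustrative assumptions, not a real system's API.

class Memory:
    def __init__(self):
        self.entries: dict[str, str] = {}  # key -> remembered fact

    def remember(self, key: str, fact: str) -> None:
        self.entries[key] = fact

    def correct(self, key: str, fact: str) -> None:
        # Curation: overwrite a stale or wrong memory with what's true now.
        self.entries[key] = fact

    def forget(self, key: str) -> None:
        # Curation also means deleting: a wrong memory is worse than none.
        self.entries.pop(key, None)

    def as_context(self) -> str:
        # What accumulates here is what the model gets to reason over.
        return "\n".join(f"- {k}: {v}" for k, v in sorted(self.entries.items()))
```

The mechanism is trivial; the value is in the accumulated, corrected contents, which is why the flywheel described next runs on memory rather than model quality.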
The system that remembers you best has better context, which means better output, which means you trust it more, which means you give it more. The flywheel runs on memory, not on benchmarks.
Horses and riders
There's an overused analogy about cars replacing horses that I keep coming back to despite myself. Progress in engines was steady for two hundred years. Horses were fine for most of that time. Then, between 1930 and 1950, roughly 90% of them disappeared. Progress was steady. Replacement was sudden.
The useful version isn't "AI will replace us." It's: which parts of what you do are the horse, and which are the rider?
The horse parts are the ones models already do well: synthesis, formatting, first drafts, structured analysis. The rider parts are harder to name. Not judgment in the abstract. Something closer to knowing the terrain before the horse does.
That's context engineering. And it looks, in practice, like product management.
Not the job title. The function. Scoping the problem. Deciding what matters. Sequencing work. Knowing which questions to ask before you let something run. Reviewing output against intent, not just against correctness.
Andy Jones put it plainly: "While it took horses decades to be overcome, and chess masters years, it took me all of six months to be surpassed. Surpassed by a system that costs one thousand times less than I do."
That's the horse talking. The question is what the rider does next.
What you build on
A brilliant model with bad context makes bad calls confidently. A decent model with great context makes surprisingly good calls. The bottleneck was never intelligence.
The people who learn this early won't all be engineers. But they'll be the ones who make these systems work, because they understood what was scarce before everyone else noticed.
It was always context.