Not everyone benefits equally from working with AI. The obvious prediction — that less skilled people gain the most — turns out to be only half right. A growing body of experimental research reveals that the hidden variable determining who benefits from AI assistance is not ability alone but metacognition: the accuracy of people's beliefs about their own abilities. Those who know what they do not know gain enormously. Those who do not know what they do not know may gain little or nothing — and some may actually perform worse.
The ABC Framework
Caplin, Deming, Li et al. (2024), in an NBER working paper from a team spanning NYU, Harvard, and UCSB, present the clearest experimental evidence for this moderating effect. Their controlled experiment examines how AI assistance interacts with three variables: Ability (baseline skill level), Beliefs (what people think their ability is), and Calibration (the accuracy of beliefs relative to actual ability).
The headline finding confirms the intuitive prediction: AI improves performance more for people with low baseline ability. The compression effect is real — AI narrows the gap between high and low performers by lifting the floor more than it raises the ceiling.
But the second finding reframes the entire picture. Holding ability constant, AI assistance is more valuable for people who are calibrated — people whose beliefs about their own ability match their actual ability. A person who accurately recognizes their weakness in, say, data analysis will appropriately defer to AI on analytical questions while trusting their own judgment on tasks where they are strong. A person who overestimates their analytical skill will override correct AI suggestions. A person who underestimates it will defer unnecessarily to AI on tasks they could handle better themselves.
The interaction between ability and calibration produces a sharp hierarchy: people of low ability who know it gain the most from AI, while people of equally low ability who believe themselves strong gain little, because they override the help they most need. The calibrated group defers appropriately, accepts help where it matters, and applies its own judgment where the AI falls short. Their metacognitive accuracy turns AI from a crutch into a genuine cognitive partner.
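To see the mechanism in miniature, consider a toy decision rule (a sketch, not the paper's experimental design): a worker defers to the AI whenever they believe the AI is more accurate than they are. The fixed `AI_ACCURACY` value and the threshold rule below are illustrative assumptions. The key asymmetry is that the decision depends only on believed ability, while the payoff depends on true ability.

```python
AI_ACCURACY = 0.75  # assumed accuracy of the AI assistant (illustrative)

def expected_accuracy(true_ability: float, believed_ability: float) -> float:
    """A worker defers to the AI whenever they *believe* the AI is better;
    what they actually score depends on their *true* ability."""
    if believed_ability < AI_ACCURACY:
        return AI_ACCURACY       # defers: inherits the AI's accuracy
    return true_ability          # overrides: falls back on own accuracy

cases = [
    ("calibrated, low ability     ", 0.55, 0.55),  # defers    -> 0.75
    ("overconfident, low ability  ", 0.55, 0.90),  # overrides -> 0.55
    ("calibrated, high ability    ", 0.85, 0.85),  # overrides -> 0.85
    ("underconfident, high ability", 0.85, 0.60),  # defers    -> 0.75
]
for label, true_a, believed_a in cases:
    print(f"{label}: {expected_accuracy(true_a, believed_a):.2f}")
```

At identical true ability, the calibrated worker captures the AI's full accuracy while the overconfident worker forfeits it, and the underconfident expert gives up ground they could have held, exactly the three personas described above.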
The counterfactual analysis makes the stakes concrete: eliminating miscalibration would cause AI to reduce performance inequality nearly twice as much as it currently does. The gap between what AI could do for equality and what it actually does is almost entirely explained by inaccurate self-knowledge.
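The same toy rule extends to a population-level sketch of that counterfactual. Under an assumed Gaussian calibration error, the numbers below are illustrative only, not the paper's estimates, but they reproduce the qualitative pattern: AI compresses the performance distribution, and removing miscalibration compresses it considerably further.

```python
import random
import statistics

AI_ACCURACY = 0.75               # assumed accuracy of the AI assistant
rng = random.Random(42)

abilities = [rng.uniform(0.40, 0.90) for _ in range(10_000)]

def with_ai(true_ability: float, belief_noise: float) -> float:
    # Believed ability = true ability plus a calibration error;
    # the worker defers to the AI only if they believe it is better.
    believed = true_ability + rng.gauss(0, belief_noise)
    return AI_ACCURACY if believed < AI_ACCURACY else true_ability

for label, scores in [
    ("no AI            ", abilities),
    ("AI, miscalibrated", [with_ai(a, belief_noise=0.15) for a in abilities]),
    ("AI, calibrated   ", [with_ai(a, belief_noise=0.0) for a in abilities]),
]:
    print(f"{label}  performance spread (std dev) = {statistics.stdev(scores):.3f}")
```

Miscalibration leaves overconfident low performers stranded at their unaided accuracy, which is where most of the residual inequality lives.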
The Metacognitive Laziness Trap
Fan, Tang, and Le (2024), in a British Journal of Educational Technology paper that has attracted over 300 citations, identify the mechanism by which AI can undermine the very metacognition it requires. They introduce the concept of metacognitive laziness: the tendency of AI-assisted learners to reduce their monitoring and regulation of their own learning processes.
The phenomenon works through a straightforward psychological pathway. When a student uses ChatGPT to solve problems, the AI's fluent and confident output creates an illusion of understanding. The student reads the solution, finds it convincing, and moves on without engaging the effortful metacognitive processes (self-questioning, error checking, strategy evaluation) that would build genuine understanding. The ease of obtaining answers substitutes for the difficulty of generating them, and it is precisely that difficulty which drives learning.
Fan et al. find that this metacognitive laziness affects learning motivation, learning processes, and learning performance in interconnected ways. Students who rely heavily on AI show reduced effort allocation, less self-monitoring during problem solving, and ultimately worse performance on assessments where AI is not available. The AI becomes a cognitive prosthetic that weakens the muscle it replaces.
Calibration as Teachable Skill
Lee, Han, and colleagues (2025), presenting at CHI 2025, investigate whether AI-powered systems can themselves help solve the calibration problem. Their research examines how learning behaviors mediate AI-powered metacognitive calibration — how interaction with AI tutoring systems can be designed to improve rather than erode learners' self-knowledge.
The approach treats metacognitive calibration not as a fixed trait but as a learnable skill that AI systems can scaffold. When the AI provides feedback not just on the correctness of answers but on the accuracy of confidence judgments — "you said you were 80% confident, but your accuracy on similar problems is closer to 50%" — learners gradually adjust their self-assessments. The learning behaviors that mediate this calibration include self-testing, reflection on past performance, and deliberate practice on areas of recognized weakness.
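A minimal sketch of that kind of feedback, assuming the system logs each answer as a (stated confidence, was it correct) pair; the binning scheme and message wording here are hypothetical, not the interface from the CHI paper:

```python
from collections import defaultdict

def calibration_feedback(records, n_bins=5):
    """records: iterable of (stated_confidence in [0, 1], was_correct bool).
    Buckets past answers by stated confidence and contrasts each bucket's
    average confidence with its actual hit rate."""
    buckets = defaultdict(list)
    for confidence, correct in records:
        index = min(int(confidence * n_bins), n_bins - 1)
        buckets[index].append((confidence, correct))
    messages = []
    for index in sorted(buckets):
        group = buckets[index]
        avg_confidence = sum(c for c, _ in group) / len(group)
        hit_rate = sum(ok for _, ok in group) / len(group)
        messages.append(
            f"When you report ~{avg_confidence:.0%} confidence, "
            f"you are right {hit_rate:.0%} of the time ({len(group)} answers)."
        )
    return messages

# Hypothetical answer history for an overconfident learner:
history = [(0.80, True), (0.80, False), (0.85, False), (0.80, True),
           (0.75, False), (0.80, False), (0.50, True), (0.55, True)]
for message in calibration_feedback(history):
    print(message)
```

Surfacing the gap between the high-confidence bucket's stated confidence (about 81 percent here) and its roughly 40 percent hit rate is exactly the kind of prompt that nudges learners toward better-calibrated self-assessments.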
This research opens a practical pathway through the metacognitive paradox. If the problem with AI assistance is that it erodes self-monitoring, and if the solution is better calibration, then AI systems can be designed to build the metacognitive capacity they would otherwise undermine. The key is shifting the AI's role from answer provider to calibration coach — from a system that tells you what to think to one that helps you understand how well you think.
The Organizational Implication
These findings have implications that extend well beyond education. In any organization deploying AI tools, the distribution of benefits will be shaped by metacognitive calibration. A software engineering team where developers accurately assess their own coding strengths will leverage AI code assistants more effectively than a team with inflated or deflated self-assessments. A consulting firm where analysts know the limits of their domain expertise will delegate to AI more productively than one where confidence and competence are misaligned.
The intervention point is clear: organizations that invest in metacognitive training — helping people develop accurate models of what they know and do not know — will capture more value from AI than organizations that invest only in AI tooling. The technology is necessary but not sufficient. The bottleneck is not the AI's capability but the human's capacity to know when to use it.