A new paper from Christian Catalini (MIT), Xiang Hui (WashU), and Jane Wu (UCLA) reframes the entire AI economics debate. Their argument is simple and devastating: the binding constraint on the AGI economy is not intelligence. It is human verification bandwidth.
As the cost to automate any measurable task races toward zero, a different cost refuses to budge: the cost for humans to verify that the output is correct, safe, and aligned with intent. The gap between these two curves is what Catalini et al. call the Measurability Gap, and it is the central risk of the agentic economy.
Verification bandwidth is the scarce capacity to validate outcomes, audit behavior, and underwrite meaning and responsibility when execution is abundant. That capacity, not intelligence, is what now binds growth.
For anyone building AI agents, this paper is required reading. It provides the economic framework for why raw model capability is commoditizing, why your agent's outputs will increasingly be questioned, and what kind of infrastructure actually creates durable value.
The short version: memory is not a feature. It is verification infrastructure.
The Task Regime Map
Catalini et al. introduce a 2x2 that maps every task by two dimensions: how easy it is to automate (Cost to Automate, cA) and how easy it is for a human to verify the result (Cost to Verify, cH). This replaces the old routine/non-routine divide with something more precise: measurable vs. non-measurable.
The critical quadrant is Runaway Risk: tasks that are cheap to automate but expensive to verify. This is where autonomous agents generate what the paper calls the Trojan Horse Externality (XA): unverified output that looks correct but silently parasitizes the system.
Think about it: when your agent makes an architectural decision, writes a migration, or sends an outreach email, the execution cost is near zero. The verification cost, the cost of a human checking whether that decision was actually correct, is bounded by biology. Your agents can generate 100x faster than you can verify.
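The regime logic above can be sketched as a tiny classifier. This is a minimal illustration, not the paper's formalism: the affordability threshold and every quadrant label except Runaway Risk are placeholder names I chose for the sketch.

```python
def classify_task(cost_to_automate: float, cost_to_verify: float,
                  threshold: float = 1.0) -> str:
    """Place a task in the 2x2 by comparing each cost (cA, cH) to an
    illustrative affordability threshold."""
    cheap_to_automate = cost_to_automate < threshold
    cheap_to_verify = cost_to_verify < threshold
    if cheap_to_automate and cheap_to_verify:
        return "Safe automation"   # measurable: automate freely
    if cheap_to_automate and not cheap_to_verify:
        return "Runaway risk"      # the Trojan Horse quadrant
    if not cheap_to_automate and cheap_to_verify:
        return "Human-led"         # easy to check, hard to execute
    return "Frontier"              # hard on both axes

# An outreach email: near-zero cA, but checking intent alignment is costly.
print(classify_task(cost_to_automate=0.01, cost_to_verify=5.0))
```

The point of the sketch: the dangerous quadrant is reached not by making verification harder, but simply by automation costs falling while verification costs stay put.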
The core equation: Only the verifiable share (sv) of agent output contributes real economic value. Everything else is latent debt, not productivity. A map that expands faster than it can be verified does not go blank. It keeps looking like a map.
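The accounting implied by that equation is simple enough to state in a few lines. The numbers below are purely illustrative:

```python
def realized_value(total_output_value: float,
                   verifiable_share: float) -> tuple[float, float]:
    """Split gross agent output into real value (the s_v share) and
    latent debt (everything unverified)."""
    assert 0.0 <= verifiable_share <= 1.0
    real = verifiable_share * total_output_value
    latent_debt = total_output_value - real
    return real, latent_debt

# 1000 units of apparent output, but only a quarter can actually be verified.
real, debt = realized_value(1000.0, 0.25)
print(real, debt)  # 250.0 of real value, 750.0 of latent debt
```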
Three Ways Verification Fails
The paper identifies three dynamic failure modes that erode verification capacity over time. Each one is already visible in the AI agent ecosystem.
1. The Missing Junior Loop
When you automate entry-level work, you destroy the apprenticeship pipeline that trains future verifiers. Companies doing this are liquidating their future verification capacity into current earnings.
OMEGA: OMEGA persists expert verification knowledge across sessions. When a senior engineer stores a decision, that reasoning is available to every future agent session, functioning as synthetic apprenticeship.
2. The Codifier's Curse
Experts who verify AI output inevitably generate the training data that automates their own expertise. The expert is constantly shrinking the surface area of uncertainty that justifies their premium.
OMEGA: OMEGA's memory evolution pipeline codifies expertise deliberately, but keeps humans in the loop via contradiction detection and decision supersession. The knowledge compounds instead of evaporating.
3. Alignment Drift
Using AI to verify AI creates a false confidence trap. Architecturally similar models share correlated blind spots. Measured verification appears stable while true alignment silently collapses.
OMEGA: OMEGA's drift detection (omega_drift_check) catches when agent outputs diverge from prior verified decisions. Cross-session contradiction detection provides the independent signal that same-model verification cannot.
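The general idea of checking new output against prior verified decisions can be sketched independently of any product. This is a hypothetical token-overlap stand-in, not omega_drift_check's actual mechanics; a production system would use a far richer similarity measure.

```python
# Hypothetical sketch of drift detection against prior verified decisions.
def jaccard(a: set, b: set) -> float:
    """Token-set overlap: 1.0 means identical vocabularies."""
    return len(a & b) / len(a | b) if a | b else 1.0

def drift_score(new_output: str, verified_decisions: list[str]) -> float:
    """1.0 = total divergence from every prior verified decision."""
    tokens = set(new_output.lower().split())
    best = max((jaccard(tokens, set(d.lower().split()))
                for d in verified_decisions), default=0.0)
    return 1.0 - best

decisions = ["use postgres for the billing service"]
print(drift_score("use postgres for the billing service", decisions))  # 0.0
print(drift_score("switch billing to mongodb", decisions))             # high
```

The key property, whatever the similarity measure: the reference set is yesterday's verified record, not a second pass by the same model, so its failure modes are not correlated with the generator's.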
Memory Is Verification Infrastructure
The paper introduces the concept of verification-grade KIP: the accumulated knowledge that makes future verification cheaper. This includes failure logs, near-miss traces, edge-case libraries, and decision history. It is, as Catalini et al. put it, the “negative space of expertise” that general models cannot infer.
This is exactly what a persistent memory system stores. When an agent records a decision, captures a lesson learned, or logs an error pattern, it is building verification-grade KIP. The next agent session can query “what did we decide last time?” and “what went wrong before?” before acting. That is not context. That is verification infrastructure.
The moat is not reasoning; it is context plus underwriting.
The strategic principle that falls out of this framework is what the paper calls “rent cognition, own trust.” Use the best available model for broad reasoning. But strictly privatize your domain context and your verification stack. The model is a commodity. The memory is the moat.
This maps directly to how OMEGA works architecturally. OMEGA is model-agnostic: it works with Claude, GPT, Gemini, or any model that speaks MCP. The intelligence is rented. What OMEGA owns is the trust layer: persistent decisions, contradiction detection, coordination protocols, and audit trails.
Where Durable Moats Actually Live
The paper provides a defensibility hierarchy for network effects in the AI era, ranked from most fragile (tier 1) to most durable (tier 6). The top two tiers are verification-grade networks and coordination equilibria. These are the moats that agents cannot manufacture. From most durable to most fragile:

Tier 6: Community norms, identity, legitimacy. Path-dependent, cannot be manufactured.
Tier 5: Accumulated adjudication histories. Increasing returns to safety.
Tier 4: Value shifts from catalog breadth to liability-backed integrations.
Tier 3: Migration friction evaporates, but compliance lock-in persists.
Tier 2: User-side wrapper agents disintermediate the platform.
Tier 1: Agents manufacture apparent thickness at near-zero cost.
The key metric the paper introduces is Verified Network Scale: NV = ρ × N, where ρ is the authenticated share of activity. Gross scale (N) is meaningless if your network is full of unverified agent interactions. What matters is the fraction that has been validated.
Traditional liquidity-based network effects (tier 1-2) are actively fragile in the agentic era because agents can simulate engagement at scale. The paper warns of network effect inversion: when synthetic content degrades signal-to-noise, more activity becomes actively harmful, driving high-quality participants to exit.
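A toy calculation makes the inversion concrete. The figures below are invented for illustration:

```python
def verified_network_scale(total_activity: int,
                           authenticated_share: float) -> float:
    """N_V = rho * N: only the authenticated share of activity
    counts toward real network scale."""
    return authenticated_share * total_activity

# Gross activity triples, but the authenticated share collapses:
before = verified_network_scale(1_000_000, 0.60)
after = verified_network_scale(3_000_000, 0.10)
print(after < before)  # more raw activity, less verified scale
```

This is network effect inversion in one comparison: headline growth in N coexists with shrinking N_V, and the headline number is the misleading one.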
What This Means for Agent Builders
If the paper is right (and the framework is rigorous enough to deserve serious engagement), several implications follow for anyone building AI agent infrastructure.
1. Store decisions, not just context
Most agent memory systems store what the agent saw. Verification-grade memory stores what the agent decided, why, and what happened afterward. Decisions with rationale, lessons from failures, and contradiction histories are the building blocks of the “negative space of expertise” that Catalini et al. identify as the durable moat.
2. Verification must be independent from execution
Never use the same model to both generate and verify. The paper formalizes why: architecturally similar models share correlated blind spots, creating a false confidence trap. Cross-session memory provides independent verification by grounding today's agent in yesterday's verified decisions, not in the same model's re-derivation of the same answer.
3. Track your verification rate
The paper's ρ metric (verified actions / total actions) should be a first-class dashboard metric for any agent deployment. How many of your agent's decisions went through a coordination gate? How many sessions ended with a structured handoff vs. silent abandonment? How many external actions completed successfully vs. silently failed? If you cannot answer these questions, your Measurability Gap is growing.
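A minimal sketch of what tracking ρ as a first-class metric could look like. The gate names and the API here are hypothetical, not any real dashboard:

```python
class VerificationLedger:
    """Toy ledger for rho = verified actions / total actions."""

    def __init__(self) -> None:
        self.total = 0
        self.verified = 0

    def record(self, action: str, passed_gate: bool) -> None:
        """Count every agent action; mark the ones that cleared a
        coordination gate, structured handoff, or explicit check."""
        self.total += 1
        if passed_gate:
            self.verified += 1

    @property
    def rho(self) -> float:
        return self.verified / self.total if self.total else 1.0

ledger = VerificationLedger()
ledger.record("db migration", passed_gate=True)
ledger.record("outreach email", passed_gate=False)
ledger.record("config change", passed_gate=True)
print(f"rho = {ledger.rho:.2f}")  # rho = 0.67
```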
4. Risk accumulates convexly
While execution scales linearly with compute, verification capacity scales sublinearly. This means risk appears manageable at small scale, then breaches safety thresholds abruptly. A company aggressively scaling agents without proportional verification investment is building on a foundation that looks solid until it suddenly is not.
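A toy model makes the convexity visible. The linear and square-root growth rates are assumptions chosen for illustration, not the paper's functional forms:

```python
import math

def unverified_backlog(t: float, exec_rate: float = 100.0,
                       verify_rate: float = 100.0) -> float:
    """Toy model: execution scales linearly with time (compute),
    verification capacity sublinearly (sqrt). The gap is the backlog."""
    executed = exec_rate * t
    verified = verify_rate * math.sqrt(t)
    return max(0.0, executed - verified)

for t in [1, 4, 16, 64]:
    print(t, unverified_backlog(t))
# Backlog is zero at t=1, then grows faster than linearly.
```

At t=1 the backlog is zero and everything looks fine; by t=16 it has grown sixfold over a fourfold increase in time. Any safety threshold drawn as a horizontal line gets crossed abruptly, not gradually.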
Economic progress has always rested on an implicit compact: that the value claimed was the value produced. The Measurability Gap is the first force in the history of production capable of systematically breaking that compact, not through crisis, but through the ordinary economics of cost minimization.
The Bottom Line
The race to build smarter agents is over before it started. Intelligence is commoditizing. The real race is to build the verification layer: the infrastructure that makes agent output trustworthy, auditable, and accountable.
Persistent memory is not a nice-to-have bolted onto an agent framework. It is the substrate of verification itself. Every decision stored, every contradiction caught, every lesson persisted across sessions is a direct reduction in the Measurability Gap.
Rent cognition. Own trust.
Catalini, C., Hui, X., & Wu, J. (2026). Some Simple Economics of AGI. arXiv:2602.20946. arxiv.org/abs/2602.20946