AI Models Solving Open Problems + Capacity Constraints = Restructuring What Gets Built Next

May 21, 2026

The Signal

Models are now producing novel solutions to unsolved mathematical problems. At the same time, the labs offering the best models are hitting hard capacity limits and moving to long-term token commitments. Together, these two trends are forcing a structural shift: companies and researchers are moving from asking "what model should I use" to asking "how do I architect around scarcity and let different specialized models handle different tasks." The era of the single general-purpose chatbot is ending.

IMPORTANT
We're moving from "best single model" to "right model for the job" architecture, driven by both capability breakthroughs and supply constraints.

What's Moving

  • Capacity Scarcity as Strategy — Sam Altman announced discounted 1-3 year token commitments, signaling that production capacity will remain constrained even as the labs scale. This locks in customer behavior and forces builders to optimize compute spend rather than just chase capability. (via @sama)
  • Math Solved By Generalists — OpenAI's "unit distance result" on an open problem in mathematics is the inflection point Emad Mostaque flagged: once models start solving research-grade problems in novel ways, the industrial adoption curve steepens sharply. This is not incremental progress; it is a different kind of work. (via @sama, @emostaque)
  • Multi-Model Routing Wins — Bindu Reddy and Santiago Valdarrama both point to the emerging pattern: agent swarms and routing systems that send reasoning to Opus, coding to GPT, images to DALL-E, and video elsewhere. On this view, single-model tools are structurally obsolete. (via @bindureddy, @svpino)
  • Google Re-Enters as Price Disruptor — Gemini Flash 3.5 is competitive with Sonnet on benchmarks at a fraction of the cost. This matters not because Google wins the capability race, but because it forces a cost-based sorting: which tasks warrant expensive frontier models, and which can run on Flash-tier efficiency? (via @bindureddy)
  • Open-Weight Practical Parity — MiniMax-M2.7 is reported to be in the Opus/GPT-5.4 performance league while running at 440 tokens/sec on commodity infrastructure. This decouples "best model" from "most economical," splitting the market. (via @svpino)
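
To make the "right model for the job" idea concrete, here is a minimal routing sketch. It is an illustration only: the model names, relative prices, task categories, and the `pick_model` function are all hypothetical placeholders, not any vendor's API or real price list. It shows one simple way to combine the two pressures above: route by task type, then prefer the most capable model that fits a cost budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    model: str                  # hypothetical model label, not a real API name
    cost_per_1k_tokens: float   # made-up relative price, for illustration only


# Hypothetical routing table. Each list is ordered from most capable
# (and most expensive) to cheapest.
ROUTES = {
    "reasoning": [Route("frontier-reasoning", 15.0), Route("flash-tier", 1.0)],
    "coding": [Route("frontier-coding", 10.0), Route("open-weight", 0.5)],
    "image": [Route("image-model", 4.0)],
}


def pick_model(task_type: str, est_tokens: int, budget: float) -> str:
    """Return the most capable model for task_type whose estimated cost fits budget."""
    candidates = ROUTES.get(task_type)
    if candidates is None:
        raise ValueError(f"no route for task type: {task_type!r}")
    for route in candidates:
        estimated_cost = route.cost_per_1k_tokens * est_tokens / 1000
        if estimated_cost <= budget:
            return route.model
    # Nothing fits the budget: fall back to the cheapest option for this task.
    return min(candidates, key=lambda r: r.cost_per_1k_tokens).model


if __name__ == "__main__":
    print(pick_model("reasoning", est_tokens=2000, budget=50.0))
    print(pick_model("reasoning", est_tokens=2000, budget=5.0))
```

A production router would also weigh latency, rate limits, and committed-capacity quotas; this sketch keeps only the cost-versus-capability trade-off the items above describe.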

Crosscurrents

  • LeCun's Cold Water — Yann LeCun's counter-narrative remains essential: models work via brute-force accumulation of declarative knowledge, not reasoning. Flash benchmarks may be gamed for agentic loops. The gap between "solves a math problem" and "understands reality" is still a chasm. (via @ylecun)
  • Regulatory Friction Incoming — Bindu Reddy flagged White House plans for a 90-day government review of frontier models before release. This would throttle velocity for closed labs and create an implicit moat for open-weight models. Most builders have not yet priced this in. (via @bindureddy)

Tradecraft

BULL
Novel capability breakthroughs + capacity scarcity + multi-model routing maturity = companies that can efficiently compose best-of-breed models will own the next 18 months.
WATCH
The timeline for the government review mandate, and whether it drives builders toward open-weight models. Also watch whether the 1-3 year token commitments fill up: if they do, that points to a real supply bottleneck rather than a pricing tactic.

Desk Notes

  • @sama — Betting hard on "personal AGI" as a third pillar; capacity scarcity is a real constraint, not rhetoric.
  • @bindureddy — Flash is good but overpriced in practice; open-source + China models are the actual disruption vector.
  • @svpino — Multi-model routing + agent protocols (AG-UI, MCP) are the infrastructure that matters now, not single models.
  • @emostaque — We're in "final stage of human solutions"—models solving research problems is the inflection.

Get AI Intelligence Brief delivered — AI-synthesized from curated sources, daily.
