The compute deficit is now the binding constraint—and it's reshaping competitive advantage from model to harness

The Signal

Compute scarcity has stopped being theoretical and become operational. OpenAI is explicitly flagging that token supply won't meet demand through 2027-28, forcing a strategic pivot away from raw model capability toward personalization, memory, and enterprise harness—the features that can't be commoditized. This isn't a supply-chain problem anymore; it's a competitive moat problem. The race is no longer "whose LLM is best" but "whose infrastructure can ship usable AI first when compute is rationed."

IMPORTANT

Best model means nothing if you can't serve users at scale. Competition shifts from training to deployment, from inference to latency, from tokens to harness.

What's Moving

Model-agnostic harness layers — OpenAI is positioning itself explicitly around memory, context windows, and personalization rather than claiming GPT 5.6 as a category killer. The play is multi-interface (ChatGPT, Codeex, enterprise tools) not one mega-model. Anthropic and others chasing raw benchmark wins are optimizing for the wrong metric (via @allin podcast)
Vibe-coding security collapse — Agentic code generation is accelerating into production with zero security defaults. Multiple reports of data loss from incompetent deployments. This creates an immediate wedge for platforms offering better defaults (VPC, auth, observability) and a tail risk of regulatory backlash (via @svpino)
Open-source asymmetry — China's 1.5B+ user base running local models (Kimi, DeepSeek, GLM) is scaling compute efficiency in parallel to US token maximization. Token consumption parity with Gemini-class models is imminent. US market is underestimating this velocity (via @bindureddy)
Diversification over allegiance — OpenAI is explicitly multi-cloud (Microsoft, Oracle, CoreWeave, AWS, GCP) and multi-chip (Nvidia, AMD, Cerebras, proprietary) to avoid chokepoint dependency. This is now table stakes for any serious AI player (via @allin)

Crosscurrents

Pricing pressure vs. margin defense — OpenAI can't claim cost advantage while keeping prices high if compute stays constrained. Anthropic and others will undercut. But the real margin move is harness lock-in, not token pricing—this tension is unresolved (via @bindureddy vs. enterprise pricing reality)
Enterprise adoption reality check — McKinsey says 95% of enterprise AI initiatives fail; insiders on @allin confirm "we haven't seen much success yet." This massive gap between hype and execution means whoever ships usable AI with support infrastructure wins, not whoever has the smartest model (via @allin)

Tradecraft

BULL

Multi-interface, harness-first strategy scales faster than chasing benchmarks when compute is scarce. OpenAI's explicit acknowledgment of this shift signals confidence in execution.

BEAR

If compute remains constrained through 2028 and open-source models hit parity on inference efficiency, the value accrual moves entirely to applications and harness. Foundation model moats compress.

WATCH

Next trigger: Q3 2026 data center delivery rates. If 2027-28 compute shortfall is real, expect aggressive enterprise SLA announcements and price tiering wars in H2.

Desk Notes

@allin (Sam Altman) — Signaling harness lock-in over model superiority; multi-interface strategy; compute as the real constraint through 2028
@bindureddy — Open-source parity narrative; aggressive on China's scaling; skeptical of incremental model releases (Opus 4.8)
@svpino — Agentic dev tooling acceleration + security debt as the next crisis surface; orchestration layers now separate junior from senior builders
@ylecun — Structural critique: research budget cuts are the anti-innovation move; world models as the real foundation architecture conversation