Harness beats capability—routing layer now the binding constraint, not model weights

July 1, 2026

The Signal

The frontier model race is functionally over. Same model, different harness: Cline's GLM 5.2 experiments show an 11.2-point spread (57.3% → 68.5%) on coding tasks from orchestration alone. Meanwhile, Meituan-LongCat's 1.6T MoE, trained on 50k Chinese ASICs with zero GPUs, reaches Gemini/Opus parity and holds the most-used slot on OpenRouter at 10T tokens. The US ban didn't pause the market. It revealed it: every team now routes across multiple models to avoid vendor lock-in. Open-source viability just flipped from "maybe someday" to "already shipping."

IMPORTANT
The bottleneck moved from "is the model smart enough?" to "can your harness route it efficiently?"

What's Moving

  • Routing architecture replaces model selection: @svpino's GLM 5.2 data (100 likes) shows harness optimization extracts 11 points without touching weights. Open-source models are "way more capable than we think"; the constraint is orchestration, not intelligence. (via @svpino)
  • Cost-to-capability asymmetry widens fast: @emostaque flags Meituan's ASIC-trained MoE at Opus 4.6 parity on inference, with zero GPU dependency. Training efficiency in China flips US leverage entirely. The most popular model on OpenRouter is now Chinese. (via @emostaque)
  • Vision gap still locks in closed-source dependence: @bindureddy's hard check: GLM 5.2 can't "see images," and Chinese open-source models lack vision wholesale. Until that gap closes, you need Opus/GPT 5.5 for real work. Multi-LLM swarms work only for text-first workflows. (via @bindureddy)
  • Value-per-token-dollar metric kills benchmark chasing: @svpino's framework: measure agent ROI as (value produced / token cost). Below 1 = money sink; above 1 = business model. Two agents on the same model, spending the same tokens, can have wildly different unit economics. (via @svpino)
  • Fable unbans first, OpenAI at a structural disadvantage: @bindureddy flags the reversal: the ban on Fable 5 lifts today, while GPT 5.6 stays banned two more weeks. Anthropic gets narrative momentum while OpenAI's superior Sol model stays locked behind a government preview. Regulatory asymmetry compounds market timing. (via @bindureddy)
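
The value-per-token-dollar ratio above can be sketched in a few lines of Python. This is a minimal illustration, not @svpino's implementation: the `AgentRun` type, the helper names, the per-million-token prices, and the dollar values assigned to each agent's output are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AgentRun:
    name: str
    value_usd: float     # hypothetical: dollar value of the work produced
    input_tokens: int
    output_tokens: int


def token_cost_usd(run: AgentRun, usd_per_m_input: float, usd_per_m_output: float) -> float:
    """Token spend in dollars, given per-million-token prices."""
    total = run.input_tokens * usd_per_m_input + run.output_tokens * usd_per_m_output
    return total / 1_000_000


def value_per_token_dollar(run: AgentRun, usd_per_m_input: float, usd_per_m_output: float) -> float:
    """Value produced divided by token cost: below 1 is a money sink, above 1 is a business model."""
    cost = token_cost_usd(run, usd_per_m_input, usd_per_m_output)
    if cost <= 0:
        raise ValueError("token cost must be positive")
    return run.value_usd / cost


# Two agents on the same model with identical token usage (all figures are made up).
PRICE_IN, PRICE_OUT = 3.0, 15.0  # assumed USD per million tokens
agent_a = AgentRun("tight-harness", value_usd=40.0, input_tokens=2_000_000, output_tokens=500_000)
agent_b = AgentRun("loose-harness", value_usd=9.0, input_tokens=2_000_000, output_tokens=500_000)

for run in (agent_a, agent_b):
    ratio = value_per_token_dollar(run, PRICE_IN, PRICE_OUT)
    verdict = "business model" if ratio > 1 else "money sink"
    print(f"{run.name}: {ratio:.2f} ({verdict})")
```

Both runs cost $13.50 in tokens under these assumed prices, yet one lands near 2.96 and the other near 0.67. The only difference is how much value each harness extracts from the same spend.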

Crosscurrents

  • Robot skill distillation enters production: @drjimfan's ASPIRE (256 likes, 35 RTs) shows multimodal agents building self-evolving skill libraries across simulation + real robots. Training shifts from gradient descent to skill refinement. Early signal for embodied AI becoming tractable at scale, but still research-grade. (via @drjimfan)
  • Sonnet 5.0 disappoints relative to hype: @bindureddy reports it's a token guzzler; promotional pricing helps, but Opus 4.8 still wins the cost/performance trade-off. The new frontier model underperforms expectations, which suggests capability gains are plateauing faster than expected. (via @bindureddy)

Tradecraft

BULL
Multi-LLM orchestration is now the production default, not an experiment. Builders are shipping this today without waiting for frontier models.
BEAR
Chinese open-source models still lack vision, so US closed-source models hold the real gate. That moat lasts only until vision parity arrives.
WATCH
Fable 5 relaunch today + GPT 5.6 unbanning (two more weeks out, per @bindureddy). Watch whether a narrative shift back to frontier models breaks the routing momentum or just adds another layer to the harness.

Desk Notes

  • @svpino — Harness > model; value-per-token-dollar the only metric that matters; skill marketplaces (Agentverse 2.8M agents) show composability is real.
  • @bindureddy — Multi-LLM routing already standard; Opus 4.8 + GPT 5.5 xHigh for planning, Deepseek/GLM as workers; vision gap is the moat that still locks you into closed-source.
  • @emostaque — China's cost structure (energy-scaled intelligence on domestic ASICs) now asymmetric advantage; US labs can't compete on inference economics if training efficiency flips leverage.
  • @drjimfan — Embodied AI entering continual learning phase; skill libraries self-evolve; robot learning compounds indefinitely (not task-reset).
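
A rough sketch of the planner/worker split described in these notes, with a vision fallback to closed-source models. Everything here is an assumption for illustration: the `Model` and `Task` types, the `route` function, the model identifiers, and the prices are hypothetical and do not reflect any vendor's API or actual pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    tier: str                # "planner" or "worker"
    has_vision: bool
    usd_per_m_tokens: float  # hypothetical blended price


@dataclass(frozen=True)
class Task:
    role: str                # "plan" or "execute"
    needs_vision: bool = False


# Hypothetical registry mirroring the split in the notes above.
MODELS = [
    Model("opus-4.8", "planner", True, 15.0),
    Model("gpt-5.5-xhigh", "planner", True, 20.0),
    Model("deepseek", "worker", False, 0.5),
    Model("glm-5.2", "worker", False, 0.4),
]


def route(task: Task, models: list[Model] = MODELS) -> Model:
    """Pick the cheapest model that satisfies the task's capability needs."""
    if task.needs_vision:
        # Vision gap: only models that can see images qualify, regardless of role.
        candidates = [m for m in models if m.has_vision]
    elif task.role == "plan":
        candidates = [m for m in models if m.tier == "planner"]
    else:
        candidates = [m for m in models if m.tier == "worker"]
    if not candidates:
        raise LookupError(f"no model can handle {task}")
    return min(candidates, key=lambda m: m.usd_per_m_tokens)


print(route(Task("plan")).name)
print(route(Task("execute")).name)
print(route(Task("execute", needs_vision=True)).name)
```

With this registry, text-only execution goes to the cheapest open-weight worker, planning goes to a closed-source planner, and any task that needs images falls back to a vision-capable model even when it is nominally worker-tier work.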

Get AI Intelligence Brief delivered — AI-synthesized from curated sources, daily.
