Analysis
You lost Fable 5. Don't buy three models in a trenchcoat.
Orchestration tools promise capabilities beyond any model you can still access. For most people that's a 4-7x tax to paper over one missing model — when a cheap open-weight model already gets you most of the way.
Quick summary
- Three orchestration tools — Hermes MoA, OpenRouter Fusion, Sakana Fugu — promise to beat any single model you can access by combining several. Every headline number is self-reported.
- Two of them, Hermes MoA and Fusion, cost more and run slower than calling a top model directly. The third, Fugu, doesn't disclose its cost, and its real-world quality lagged its benchmarks badly.
- For most people who lost Fable 5, the smart default is the boring one: a strong, cheap, open model (GLM-5.2) — not three models stitched together.
The pitch writes itself — which is the problem
After Fable 5 went dark on June 12, the most-quoted line of the orchestration boom that followed came from Nous Research's own launch post: 'The strongest models are gated and access is granted only to a select few.' The answer that Nous, OpenRouter, and Sakana offer is to stitch together the models you can still reach and call the result a new, better model. It's seductive precisely because it's pitched at the moment you feel the loss most.
But 'capabilities beyond the publicly available frontier' is a claim, not a receipt. Mixture-of-Agents — fan a prompt out to several models, let an aggregator fuse the answers — has been in the literature since 2024. The 2026 products are real engineering, but the honest question isn't 'is this clever?' It's 'am I buying capability, or renting a 4-7x cost-and-latency markup to feel whole again?'
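The fan-out-then-fuse pattern described above can be sketched in a few lines. This is a minimal illustration of the general Mixture-of-Agents idea, not any vendor's implementation: the `Model` type, the stub models, and the prompt format handed to the aggregator are all hypothetical stand-ins so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# A "model" is any callable from prompt text to answer text.
# Hypothetical stand-in; a real setup would wrap provider API clients.
Model = Callable[[str], str]


def mixture_of_agents(prompt: str, references: List[Model], aggregator: Model) -> str:
    """Fan the prompt out to the reference models, then let one aggregator fuse the drafts."""
    with ThreadPoolExecutor(max_workers=max(1, len(references))) as pool:
        drafts = list(pool.map(lambda model: model(prompt), references))
    # The aggregator sees the original prompt plus every numbered draft.
    fused_prompt = prompt + "\n\nCandidate answers:\n" + "\n".join(
        f"[{i + 1}] {draft}" for i, draft in enumerate(drafts)
    )
    return aggregator(fused_prompt)


def calls_per_query(n_references: int) -> int:
    """One call per reference model, plus one aggregator call."""
    return n_references + 1


# Stub models (assumptions for illustration only).
def model_a(prompt: str) -> str:
    return "answer from A"


def model_b(prompt: str) -> str:
    return "answer from B"


def stub_aggregator(fused_prompt: str) -> str:
    # Counts the numbered drafts it was given instead of really synthesizing them.
    n = sum(1 for line in fused_prompt.splitlines() if line.startswith("["))
    return f"fused {n} drafts"


result = mixture_of_agents("What is 2+2?", [model_a, model_b], stub_aggregator)
```

Even in this toy form the cost structure is visible: a panel of N reference models means N + 1 model calls for every query, which is where the markup comes from.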
The contenders, side by side
Here's the honest read on each, before we get to the verdict:
| Tool | What it is | Best for | The catch |
|---|---|---|---|
| Hermes MoA | Self-hostable reference models + aggregator, as one "virtual model" | Tinkerers who want to tune the quality/cost dial | Headline win is self-reported on an unreleased benchmark; more calls per query |
| OpenRouter Fusion | Server-side panel + judge synthesis, one API call | Rare high-stakes questions worth a paid second opinion | ~3x cost; 2-3x slower by OpenRouter's own number; its benchmark has no coding |
| Sakana Fugu | A trained orchestrator routing across a swappable frontier pool | Almost nobody yet — wait and see | Sub-frontier in real use (~30-min waits), black-box routing, undisclosed cost |
| GLM-5.2 · our pick | Open-weight (MIT) single model, near Opus 4.8 on coding | Most people who lost Fable 5 and want capability now | Trails Opus on ultra-long-horizon tasks; frontier-adjacent, not frontier |
Hermes MoA: real, tunable, and self-graded
Nous shipped MoA presets as 'virtual models' in late June: reference models run first, then an aggregator writes the actual response and tool calls using their analyses as private context. The headline figure, about 8% over Opus 4.8 and 11% over GPT-5.5, comes from the launch post and rests on an upcoming, unpublished benchmark. Worth noting: Hermes's own docs report the win as roughly 6 points, not 8%. Percent and points are different units, and neither source says how the two figures relate.
Either way it's self-reported on a benchmark nobody else can run yet, so treat it as a hypothesis. To its credit, the docs are candid that MoA multiplies the number of model calls, and you can tune which models it uses. Our verdict: the most honest of the three, and the only one you can self-host and tune, but its proof is a benchmark of one.
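One way to see why "8%" and "6 points" need not contradict each other: a relative gain and an absolute gain are different quantities, and they coincide only at a particular baseline score. The sketch below works out what that baseline would have to be. It is arithmetic only; neither source publishes the underlying scores, so the implied baseline is a hypothetical, not a reported number.

```python
def relative_gain_pct(baseline: float, challenger: float) -> float:
    """Relative improvement of challenger over baseline, in percent."""
    return (challenger - baseline) / baseline * 100.0


def implied_baseline(point_gain: float, relative_pct: float) -> float:
    """Baseline score at which an absolute gain of `point_gain` points
    equals a relative gain of `relative_pct` percent."""
    return point_gain / (relative_pct / 100.0)


# If "6 points" and "8%" describe the same result, the baseline would be about 75.
# This is an inference from the two quoted figures, not a published score.
baseline = implied_baseline(point_gain=6.0, relative_pct=8.0)
check = relative_gain_pct(baseline, baseline + 6.0)  # back to about 8 percent
```

Until the benchmark is published there is no way to tell whether the two figures are the same result in different units or two different results.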
OpenRouter Fusion: a great escalation lane, a terrible default
Fusion is the slickest productization: one endpoint fans out to a panel of up to eight models, each with web search, and a judge synthesizes consensus, contradictions, and blind spots. On OpenRouter's own deep-research set, a Fable-5-plus-GPT-5.5 panel beat every individual model, and a budget panel came within a point of Fable 5 at half the cost.
The catch is in the fine print. OpenRouter says a Fusion call often takes 2-3x longer than a standard one; one developer who rebuilt the pattern measured 7x slower and 4x the cost versus calling Opus directly. And the benchmark it aces has no coding domain, so it tells you little about how it performs in your IDE. Our verdict: brilliant for the rare high-stakes question where you'd happily pay 3x for a fused second opinion. Use it as an escalation lane, never as autopilot.
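A rough model shows why a panel-plus-judge call costs and waits more than a single call. The sketch assumes the panel runs in parallel and the judge starts only after the slowest panelist finishes; that is an assumption about the general pattern, not a description of Fusion's internals. Every cost and latency figure below is made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Call:
    cost: float     # cost of one call, in arbitrary units
    latency: float  # wall-clock seconds


def fused_call(panel: List[Call], judge: Call) -> Call:
    """Cost adds up across every call; latency is the slowest panelist plus the judge."""
    total_cost = sum(c.cost for c in panel) + judge.cost
    total_latency = max(c.latency for c in panel) + judge.latency
    return Call(cost=total_cost, latency=total_latency)


# Hypothetical numbers: one direct call versus a three-model panel plus a judge.
direct = Call(cost=1.0, latency=10.0)
panel = [Call(1.0, 10.0), Call(0.8, 14.0), Call(0.5, 8.0)]
judge = Call(cost=1.0, latency=12.0)

fused = fused_call(panel, judge)
cost_multiple = fused.cost / direct.cost           # about 3.3x under these assumptions
latency_multiple = fused.latency / direct.latency  # about 2.6x under these assumptions
```

Parallelism caps the latency penalty but does nothing for cost, and adding web search or a larger panel pushes both numbers up.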
Sakana Fugu: the cautionary tale
Fugu is the most ambitious: not a hard-coded pipeline but a trained orchestrator that learns when to answer directly versus delegate to a swappable pool of frontier models. Sakana claims Fugu Ultra stands shoulder-to-shoulder with Fable 5 and Mythos, topping 10 of 11 benchmark rows.
Then people used it. Within a day, independent testers — Ethan Mollick among them — reported a sharp gap between the leaderboard and real work: routine tasks taking around half an hour, with output that was 'fine' but didn't match Fable in practice. Routing is proprietary, so you can't see which models ran, and the announcement is silent on how much orchestration multiplies token cost. Our verdict: the clearest illustration of the genre's risk — frontier benchmarks, sub-frontier reality, black-box billing. Watch it; don't bet on it yet.
The quiet contender: just use GLM-5.2
Here's the take the orchestration pitch needs you not to consider: maybe you don't need an orchestration layer at all. GLM-5.2 is open-weight, MIT-licensed, and lands within about a point of Opus 4.8 on mainstream coding benchmarks — at roughly a sixth of the cost. It's not magic: it trails Opus badly on ultra-long-horizon work. But notice the arithmetic.
Two of the three orchestration tools cost more than running Opus directly and add latency; GLM-5.2 costs a fraction and runs as a single fast call. For the everyday work that fills most days, one strong cheap model beats three expensive ones in a trenchcoat. Our verdict: the smart default for anyone who lost Fable 5 and wants most of the capability at a sixth of the cost, today.
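The arithmetic is easy to make explicit. The sketch below normalizes a direct Opus 4.8 call to 1.0 and plugs in the multipliers quoted in this article: roughly a sixth for GLM-5.2, about 3x for a Fusion-style call, and 4x for the hand-rolled rebuild. Treating the 3x figure as relative to Opus, and assuming every option handles the same query, are simplifications made for the example.

```python
# Cost of one query, relative to a direct Opus 4.8 call (= 1.0).
# Multipliers are the rough figures quoted in the article, not measured prices.
OPUS = 1.0
relative_cost = {
    "Opus 4.8 direct": OPUS,
    "GLM-5.2": OPUS / 6,
    "Fusion-style call": OPUS * 3,
    "Hand-rolled rebuild": OPUS * 4,
}


def times_glm(option: str) -> float:
    """How many GLM-5.2 queries one query of `option` would pay for."""
    return relative_cost[option] / relative_cost["GLM-5.2"]


fusion_vs_glm = times_glm("Fusion-style call")     # about 18
rebuild_vs_glm = times_glm("Hand-rolled rebuild")  # about 24
```

Under these assumptions a single orchestrated query pays for roughly 18 to 24 GLM-5.2 queries, which is the gap the orchestration layer has to justify in quality.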
The takeaway
Just need to ship? Switch to GLM-5.2 today. Frontier-adjacent quality at about a sixth of the cost, no orchestration tax, no black box.
Keep one escalation lane, not a default. Fusion (or a hand-rolled MoA via Hermes) earns its 3x bill on the rare high-stakes question where a fused second opinion matters. Don't make it autopilot.
Treat every "beats Opus 4.8" claim as self-reported until someone you trust reproduces it. Fugu is the reminder why: leaderboard parity and real-world parity are different products.
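The escalation-lane advice above can be sketched as a routing rule plus a blended-cost estimate. The lane names, the `high_stakes` flag, and the 5% escalation rate are hypothetical; the cost multipliers reuse the article's rough figures with a direct Opus 4.8 call normalized to 1.0.

```python
def route(high_stakes: bool) -> str:
    """Send only flagged queries to the expensive fused lane; everything else goes to the default."""
    return "fusion" if high_stakes else "glm-5.2"


def blended_cost(escalation_rate: float, default_cost: float, escalated_cost: float) -> float:
    """Average cost per query when a fraction of traffic is escalated."""
    if not 0.0 <= escalation_rate <= 1.0:
        raise ValueError("escalation_rate must be between 0 and 1")
    return (1.0 - escalation_rate) * default_cost + escalation_rate * escalated_cost


# Hypothetical mix: 5% of queries escalated, with Opus direct = 1.0.
lane_mix = blended_cost(escalation_rate=0.05, default_cost=1 / 6, escalated_cost=3.0)
always_fused = blended_cost(escalation_rate=1.0, default_cost=1 / 6, escalated_cost=3.0)
```

With these numbers the occasional escalation keeps the average well under the cost of a direct Opus call, while running the fused lane on every query costs the full 3x.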
Sources
This page is independent. Official provider pages are the source of record for access, pricing, and policy.
- Hermes Agent — Mixture of Agents docs
- Cryptobriefing — Hermes MoA presets vs Opus 4.8 / GPT-5.5
- OpenRouter — Surpassing the frontier with Fusion
- Sakana AI — Fugu release
- MarkTechPost — Sakana Fugu launch
- The Decoder — Fugu's benchmark-vs-real-world gap
- DigitalApplied — GLM-5.2 vs Opus 4.8 benchmarks and cost