Analysis

You lost Fable 5. Don't buy three models in a trenchcoat.

Orchestration tools promise capabilities beyond any model you can still access. For most people that's a 4-7x tax to paper over one missing model — when a cheap open-weight model already gets you most of the way.

Quick summary

Three orchestration tools — Hermes MoA, OpenRouter Fusion, Sakana Fugu — promise to beat any single model you can access by combining several. Every headline number is self-reported.
Two of them cost more and run slower than calling a top model directly; Fugu's real-world quality lagged its benchmarks badly.
For most people who lost Fable 5, the smart default is the boring one: a strong, cheap, open model (GLM-5.2) — not three models stitched together.

Hyped: MoA / Fusion / Fugu

Our pick: GLM-5.2

Rule: Self-reported until proven

The pitch writes itself — which is the problem

After Fable 5 went dark on June 12, the most-quoted line of the orchestration boom came from Nous Research's own launch post: 'The strongest models are gated and access is granted only to a select few.' The answer that Nous, OpenRouter, and Sakana offer is to stitch together the models you can still reach and call the result a new, better model. It's seductive precisely because it's pitched at the moment you feel the loss most.

But 'capabilities beyond the publicly available frontier' is a claim, not a receipt. Mixture-of-Agents — fan a prompt out to several models, let an aggregator fuse the answers — has been in the literature since 2024. The 2026 products are real engineering, but the honest question isn't 'is this clever?' It's 'am I buying capability, or renting a 4-7x cost-and-latency markup to feel whole again?'
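To make the pattern concrete, here is a minimal sketch of the fan-out-then-aggregate idea described above. It is not any vendor's implementation: the `Model` type, the stub models, and the prompt format are all assumptions made up for illustration, and real reference models would be API calls rather than local functions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# Hypothetical interface: a "model" is any function from prompt text to answer text.
Model = Callable[[str], str]


def mixture_of_agents(prompt: str, references: List[Model], aggregator: Model) -> str:
    """Fan the prompt out to each reference model, then ask the aggregator to fuse the drafts."""
    with ThreadPoolExecutor(max_workers=len(references)) as pool:
        drafts = list(pool.map(lambda model: model(prompt), references))
    numbered = "\n".join(f"[{i + 1}] {draft}" for i, draft in enumerate(drafts))
    return aggregator(f"{prompt}\n\nDrafts:\n{numbered}")


def total_calls(num_references: int) -> int:
    """Model calls per query: one per reference model, plus one for the aggregator."""
    return num_references + 1


# Stand-in models so the sketch runs offline; real ones would call an API.
def stub_a(prompt: str) -> str:
    return "merge sort"


def stub_b(prompt: str) -> str:
    return "insertion sort"


def stub_c(prompt: str) -> str:
    return "timsort"


def stub_aggregator(fused_prompt: str) -> str:
    # A real aggregator would write a new answer; this one only counts the drafts it saw.
    drafts = [line for line in fused_prompt.splitlines() if line.startswith("[")]
    return f"fused {len(drafts)} drafts"


result = mixture_of_agents("Name a stable sort.", [stub_a, stub_b, stub_c], stub_aggregator)
print(result)          # fused 3 drafts
print(total_calls(3))  # 4
```

Even in this toy form the cost structure is visible: three reference models mean four model calls per question, and the aggregator cannot start until the slowest reference model has finished.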

The contenders, side by side

Here's the honest read on each, before we get to the verdict:

Tool	What it is	Best for	The catch
Hermes MoA	Self-hostable reference models + aggregator, as one "virtual model"	Tinkerers who want to tune the quality/cost dial	Headline win is self-reported on an unreleased benchmark; more calls per query
OpenRouter Fusion	Server-side panel + judge synthesis, one API call	Rare high-stakes questions worth a paid second opinion	~3x cost and 2-3x slower by OpenRouter's own number; its benchmark has no coding
Sakana Fugu	A trained orchestrator routing across a swappable frontier pool	Almost nobody yet — wait and see	Sub-frontier in real use (~30-min waits), black-box routing, undisclosed cost
GLM-5.2 · our pick	Open-weight (MIT) single model, near Opus 4.8 on coding	Most people who lost Fable 5 and want capability now	Trails Opus on ultra-long-horizon tasks; frontier-adjacent, not frontier

Hermes MoA: real, tunable, and self-graded

Nous shipped MoA presets as 'virtual models' in late June: reference models run first, then an aggregator writes the actual response and tool calls using their analyses as private context. The headline — about 8% over Opus 4.8 and 11% over GPT-5.5 — comes from the launch post, on an upcoming, unpublished benchmark. Worth noting: Hermes's own docs report the win as roughly 6 points, not 8% — a gap they don't reconcile.

Either way, the result is self-reported on a benchmark nobody else can run yet, so treat it as a hypothesis. To its credit, the docs are candid that MoA multiplies the number of model calls, and you can tune which models it uses. Our verdict: the most honest of the three, and the only one you can self-host and tune, but its only proof so far is a single unpublished benchmark.
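The "8%" versus "6 points" discrepancy matters because a relative gain and an absolute gain are different quantities. The sketch below shows how they relate. The benchmark is unpublished, so every baseline score here is hypothetical, and the function names are ours.

```python
def relative_gain_pct(baseline_score: float, points_gained: float) -> float:
    """Relative improvement, in percent, for an absolute gain of `points_gained`."""
    return 100.0 * points_gained / baseline_score


def baseline_where_equal(points_gained: float, relative_pct: float) -> float:
    """Baseline score at which `points_gained` points equals a `relative_pct` percent gain."""
    return 100.0 * points_gained / relative_pct


# Hypothetical baselines: the same 6-point win reads very differently in relative terms.
for baseline in (60.0, 75.0, 90.0):
    print(baseline, round(relative_gain_pct(baseline, 6.0), 1))

# The two published figures would describe the same result only at this baseline.
print(baseline_where_equal(6.0, 8.0))  # 75.0
```

In other words, "6 points" and "8%" agree only if the model being beaten scored about 75 on a 100-point scale. Nothing in the source says that it did, which is why the gap stays unreconciled until the benchmark is published.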

OpenRouter Fusion: a great escalation lane, a terrible default

Fusion is the slickest productization: one endpoint fans out to a panel of up to eight models, each with web search, and a judge synthesizes consensus, contradictions, and blind spots. On OpenRouter's own deep-research set, a Fable-5-plus-GPT-5.5 panel beat every individual model, and a budget panel came within a point of Fable 5 at half the cost.

The catch is in the fine print. OpenRouter says a Fusion call often takes 2-3x longer than a standard one; one developer who rebuilt the pattern measured it at 7x slower and 4x the cost of calling Opus directly. And the benchmark it aces has no coding domain, so it tells you little about how it will perform in your IDE. Our verdict: brilliant for the rare high-stakes question where you'd happily pay 3x for a fused second opinion. Use it as an escalation lane, never as autopilot.
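The difference between an escalation lane and a default comes down to simple arithmetic. The sketch below assumes the roughly 3x cost figure quoted above and measures everything in units of one direct call; the 5% escalation rate is an illustrative assumption, not a measured number.

```python
def blended_cost(escalation_rate: float, fusion_multiplier: float = 3.0) -> float:
    """Average cost per query, in units of one direct model call.

    escalation_rate: fraction of queries sent to the fused panel (0.0 to 1.0).
    fusion_multiplier: cost of a fused call relative to a direct call (assumed ~3x).
    """
    if not 0.0 <= escalation_rate <= 1.0:
        raise ValueError("escalation_rate must be between 0 and 1")
    direct_share = 1.0 - escalation_rate
    return direct_share * 1.0 + escalation_rate * fusion_multiplier


print(f"{blended_cost(0.05):.2f}")  # 1.10 -> escalating 5% of queries adds about 10%
print(f"{blended_cost(1.00):.2f}")  # 3.00 -> autopilot pays the full multiplier every time
```

Under these assumptions, sending one query in twenty to the panel raises the average bill by about a tenth, while routing everything through it triples the bill.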

Sakana Fugu: the cautionary tale

Fugu is the most ambitious: not a hard-coded pipeline but a trained orchestrator that learns when to answer directly versus delegate to a swappable pool of frontier models. Sakana claims Fugu Ultra stands shoulder-to-shoulder with Fable 5 and Mythos, topping 10 of 11 benchmark rows.

Then people used it. Within a day, independent testers — Ethan Mollick among them — reported a sharp gap between the leaderboard and real work: routine tasks taking around half an hour, with output that was 'fine' but didn't match Fable in practice. Routing is proprietary, so you can't see which models ran, and the announcement is silent on how much orchestration multiplies token cost. Our verdict: the clearest illustration of the genre's risk — frontier benchmarks, sub-frontier reality, black-box billing. Watch it; don't bet on it yet.

The quiet contender: just use GLM-5.2

Here's the take the orchestration pitch needs you not to consider: maybe you don't need an orchestration layer at all. GLM-5.2 is open-weight, MIT-licensed, and lands within about a point of Opus 4.8 on mainstream coding benchmarks — at roughly a sixth of the cost. It's not magic: it trails Opus badly on ultra-long-horizon work. But notice the arithmetic.

Two of the three orchestration tools cost more than running Opus directly and add latency; GLM-5.2 costs a fraction and runs as a single fast call. For the everyday work that fills most days, one strong cheap model beats three expensive ones in a trenchcoat. Our verdict: the smart default for anyone who lost Fable 5 and wants 90% of the capability at a sixth of the cost, today.
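The arithmetic can be spelled out as a short sketch. It uses only the ratios quoted in this article (GLM-5.2 at roughly a sixth of Opus's cost, orchestration at roughly 3x to 4x), treats them as exact for illustration, and ignores differences in quality and token usage.

```python
from fractions import Fraction

OPUS_COST = Fraction(1)     # baseline: one direct Opus call
GLM_COST = Fraction(1, 6)   # "roughly a sixth of the cost", treated as exact here


def cost_vs_glm(multiplier_over_opus: Fraction) -> Fraction:
    """How many GLM-5.2 calls one orchestrated call costs, given its multiple of Opus."""
    return (multiplier_over_opus * OPUS_COST) / GLM_COST


low = cost_vs_glm(Fraction(3))   # the ~3x figure
high = cost_vs_glm(Fraction(4))  # the 4x figure from the developer rebuild
print(low, high)  # 18 24
```

On these rough numbers, one orchestrated query costs about as much as 18 to 24 single GLM-5.2 calls, which is the gap the orchestration layer has to justify in quality.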

The takeaway

Just need to ship? Switch to GLM-5.2 today. Frontier-adjacent quality at about a sixth of the cost, no orchestration tax, no black box.

Keep one escalation lane, not a default. Fusion (or a hand-rolled MoA via Hermes) earns its 3x bill on the rare high-stakes question where a fused second opinion matters. Don't make it autopilot.

Treat every "beats Opus 4.8" claim as self-reported until someone you trust reproduces it. Fugu is the reminder why: leaderboard parity and real-world parity are different products.

Need the short answer?

Fable 5 is back worldwide as of July 1 — but capped at 50% of your weekly limit until July 7. See the live status, or use GLM-5.2 or the new Sonnet 5 for cheaper work.

Sources

This page is independent. Official provider pages are the source of record for access, pricing, and policy.