Analysis

The model is becoming the agent

Frontier models now orchestrate their own sub-agents and run long-horizon work natively — absorbing the easy 80% of what agent frameworks were for. But the week a US order switched Fable 5 off in 90 minutes is the best argument yet for the layer the labs can't sell you.


Quick summary

Frontier models now orchestrate their own sub-agents natively — absorbing the easy part of what agent frameworks did.
But the labs also sell orchestration as a product, and benchmark scores don't equal production reliability — so the framework layer isn't dying, it's splitting into commodity convenience (absorbed) and resilience plus governance (more valuable).
Fable 5 being switched off by a government order in about 90 minutes is the sharpest argument for owning the multi-vendor resilience layer yourself.

Thesis: The model is the agent

But: Reliability ≠ benchmarks

So build: The resilience layer

The orchestration layer is moving inside the model

On June 26 OpenAI previewed GPT-5.6 — three models, Sol, Terra, and Luna — and the headline isn't raw IQ, it's structure. Sol ships a 'max' mode for deeper reasoning and an 'ultra' mode that, in OpenAI's own framing, spawns sub-agents to fan a task across parallel workers instead of grinding through it as one process. The model doesn't just think harder; it delegates to copies of itself.

It isn't a one-off. Anthropic's Opus 4.6 already breaks complex tasks into independent sub-tasks and runs tools and sub-agents in parallel as a native feature; Google's Gemini line is explicitly tuned for long-horizon agentic work that holds a plan across thousands of steps. The things you used to need a framework for — task decomposition, parallel fan-out, a planner that keeps state across many steps — are becoming load-bearing features of the model itself. If your agent product's core pitch was 'we'll split the work and run the steps for you,' the model just learned to do that for free.

Even the 'virtual model' is now a product

The clearest signal comes from the open-source side. Around June 26–27, Nous Research shipped Mixture of Agents in Hermes Agent: you compose several models — even across different providers — into a named preset that Hermes then treats as a single, selectable model. Reference models run first and feed private analysis to an 'aggregator' that writes the actual answer and makes the tool calls. A small orchestration graph, collapsed into one virtual model you pick from a dropdown.
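The reference-then-aggregator flow described above can be sketched in a few lines. This is a minimal illustration, not the Hermes Agent API: `Model`, `make_preset`, and the toy `ref_a`, `ref_b`, and `aggregator` functions are hypothetical names invented here, and a "model" is assumed to be any callable from prompt text to response text.

```python
from typing import Callable, List

# Assumption: a "model" is any callable from prompt text to response text.
# This is a hypothetical stand-in, not the Hermes Agent interface.
Model = Callable[[str], str]


def make_preset(references: List[Model], aggregator: Model) -> Model:
    """Bundle several models into one callable that behaves like a single model."""

    def preset(prompt: str) -> str:
        # Step 1: each reference model analyses the prompt independently.
        notes = [ref(prompt) for ref in references]
        # Step 2: the aggregator sees the prompt plus the private analyses
        # and writes the only answer the caller ever sees.
        briefing = "\n".join(f"[reference {i + 1}] {n}" for i, n in enumerate(notes))
        return aggregator(f"{prompt}\n\nReference analyses:\n{briefing}")

    return preset


# Toy stand-ins so the sketch runs without any provider SDK.
def ref_a(prompt: str) -> str:
    return f"A thinks about: {prompt}"


def ref_b(prompt: str) -> str:
    return f"B thinks about: {prompt}"


def aggregator(prompt: str) -> str:
    return f"final answer based on {prompt.count('[reference')} analyses"


virtual_model = make_preset([ref_a, ref_b], aggregator)
print(virtual_model("summarise the outage"))
```

The caller only ever holds `virtual_model`, which has the same shape as any single model. That is the sense in which a small orchestration graph collapses into one selectable unit.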

Nous's own numbers — self-reported, on a benchmark they haven't released yet — claim these presets beat the individual models they're built from. Take the figures or leave them; the architecture is the point. The unit of consumption is shifting from 'a model' to 'an orchestrated bundle that behaves like a model.' Which makes the tempting conclusion — 'agent frameworks are dead' — far too glib.

The labs are absorbing orchestration — and selling it

Here's the tell. The same companies baking sub-agents into their models are also selling orchestration as a standalone product. OpenAI puts sub-agents inside GPT-5.6 and ships Codex, a cloud service whose manager decomposes work and dispatches it to parallel workers, each with its own context window. Anthropic gives Opus 4.6 native sub-agents and sells Managed Agents — composable APIs that handle the infrastructure, state, permissioning, and upgrade compatibility. If orchestration were really absorbed into the model, you wouldn't also need to buy it as managed infrastructure.

So what's being absorbed is the convenience layer — the happy-path glue you wired up to get a demo working. What's being sold is the part a model can't be: durable state, permissioning, retries, audit — the things that make an agent survive contact with production. The frameworks aren't dying. They're splitting in two.
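To make "retries, audit" concrete, here is a minimal sketch of a step runner that retries a failing tool call and records every attempt. The names (`run_step_with_audit`, `flaky_tool`, the log fields) are hypothetical, and an in-memory list stands in for what would need to be durable storage in production.

```python
import time
from typing import Callable, Dict, List


def run_step_with_audit(
    name: str,
    step: Callable[[], str],
    audit_log: List[Dict[str, object]],
    max_attempts: int = 3,
    delay_seconds: float = 0.0,
) -> str:
    """Run one step, retrying on failure and recording every attempt."""
    for attempt in range(1, max_attempts + 1):
        try:
            result = step()
            audit_log.append({"step": name, "attempt": attempt, "ok": True, "detail": result})
            return result
        except Exception as exc:
            audit_log.append({"step": name, "attempt": attempt, "ok": False, "detail": str(exc)})
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller
            time.sleep(delay_seconds)
    raise ValueError("max_attempts must be at least 1")


# Toy tool that fails once and then succeeds.
calls = {"n": 0}


def flaky_tool() -> str:
    calls["n"] += 1
    if calls["n"] < 2:
        raise TimeoutError("tool timed out")
    return "ok"


log: List[Dict[str, object]] = []
run_step_with_audit("fetch_invoice", flaky_tool, log)
print([(entry["attempt"], entry["ok"]) for entry in log])
```

The log entry for the failed attempt is the part that matters here: it is the record someone reads when asking what the agent actually did.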

Benchmark capability is not production reliability

The reason the hard layer survives is arithmetic. A model that absorbs the easy 80% still leaves the 20% where money and trust live — and that 20% compounds. An agent that chains 20 tool calls, each 95% reliable, finishes the whole job only about a third of the time (0.95^20 ≈ 36%). Multiple 2026 analyses of deployed agents tell the same story: leaderboard scores overstate real-world reliability, and success rates fall sharply as tasks lengthen and as systems move from demo to production.

A better single model raises the per-step number. It does not erase the compounding, the retries, the monitoring, or the question of who is accountable when step 14 of 20 quietly does the wrong thing. That work doesn't live in the model. It lives in the layer around it.

A 95%-reliable step, chained

End-to-end success as the agent takes more steps (0.95 to the power of n):

1 step: 95%
5 steps: 77%
10 steps: 60%
20 steps: 36%

A better model nudges the per-step number up; it doesn't stop the compounding, which is why the layer around the model survives.
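The chained-step figures can be reproduced with a short calculation. The sketch below assumes steps fail independently, and the retry helper additionally assumes a failed step is always detected and that retries are independent. Both are simplifications that real agents rarely satisfy. The function names are illustrative only.

```python
def end_to_end_success(per_step: float, steps: int) -> float:
    """Probability that every one of `steps` independent steps succeeds."""
    return per_step ** steps


def per_step_with_retries(per_step: float, attempts: int) -> float:
    """Per-step success if a detected failure can be retried up to `attempts` times."""
    return 1 - (1 - per_step) ** attempts


for n in (1, 5, 10, 20):
    print(f"{n:>2} steps: {end_to_end_success(0.95, n):.0%}")

# One retry per step, under the idealised assumptions stated above.
with_retry = end_to_end_success(per_step_with_retries(0.95, 2), 20)
print(f"20 steps with one retry each: {with_retry:.0%}")
```

Under those idealised assumptions, a single retry per step lifts the 20-step figure from about 36% to about 95%. The gain comes from detecting failures and retrying them, which is work done around the model, not inside it.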

The argument this site exists to make

And then there's the failure mode no benchmark scores. On June 12, a US Commerce Department directive ordered Anthropic to suspend Fable 5 and Mythos 5 for any foreign national, inside or outside the United States — including the company's own non-citizen employees. Because the platform can't verify nationality in real time, the only compliant move was to take both models fully offline. The most capable models in Anthropic's history went dark in roughly an hour and a half, and stayed dark for weeks.

This is the counterweight to 'just use the one best model.' When your single great model can be gated by a government faster than your on-call can read the incident page, multi-vendor routing, graceful fallback, and lock-in resilience stop being architectural luxuries — they're the only thing between a policy letter and your product going down. Concentration is the risk; the layer around the model is where you manage it.

So the synthesis: agent frameworks aren't dying, they're bifurcating. The commodity convenience layer — decompose, fan out, run the steps — is being absorbed into the models, and will keep getting cheaper and better for free. The resilience, governance, and multi-vendor layer is becoming more valuable, not less — because models getting more powerful is exactly what makes depending on any single one more dangerous.

The takeaway

Stop building the convenience layer. Anything that just decomposes a task and runs steps in parallel is now a native model feature (GPT-5.6 ultra, Opus 4.6) or an off-the-shelf preset (Hermes MoA). Call the model's orchestration directly; it'll be cheaper and better next quarter.

Invest in the layer the labs can't sell you. Vendor-agnostic routing, a warm second model to fail over to, durable state, retries with audit, and accountability for the compounding 20% that benchmarks never test.

Treat single-vendor dependence as a live risk. Fable 5 proved a model can go dark in 90 minutes for reasons that have nothing to do with quality. If you can't fail over to another vendor today, that's your most urgent backlog item.
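As a rough illustration of what "fail over to another vendor" means in code, the sketch below tries providers in priority order and falls through on any error. The `Provider` shape, `route_with_fallback`, and the toy `primary` and `secondary` functions are hypothetical; real vendor SDK clients would need to be wrapped to fit this interface, and prompts usually need per-model tuning before a fallback is genuinely warm.

```python
from typing import Callable, List, Tuple

# Assumption: a provider is a name plus a callable that returns text or raises.
Provider = Tuple[str, Callable[[str], str]]


class AllProvidersFailed(RuntimeError):
    """Raised when every configured provider has errored."""


def route_with_fallback(prompt: str, providers: List[Provider]) -> Tuple[str, str]:
    """Try providers in priority order; return (provider_name, response)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: outages, policy blocks, timeouts
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Toy providers: the primary is unavailable, the secondary answers.
def primary(prompt: str) -> str:
    raise ConnectionError("model suspended")


def secondary(prompt: str) -> str:
    return f"handled: {prompt}"


used, answer = route_with_fallback(
    "triage ticket", [("primary", primary), ("secondary", secondary)]
)
print(used, "->", answer)
```

Returning the provider name alongside the answer is a deliberate choice: it lets the caller log which vendor actually served each request, which is the starting point for the audit trail.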

Need the short answer?

Fable 5 is back worldwide as of July 1 — but capped at 50% of your weekly limit until July 7. See the live status, or use GLM-5.2 or the new Sonnet 5 for cheaper work.


FAQ

Does this mean I should stop using agent frameworks?

Stop building the parts the model now does for you: task decomposition and parallel step-running. Keep and invest in the parts the model can't provide: multi-vendor routing, fallback, durable state, governance, and accountability.

What's the one concrete action?

Make sure you can fail over to a second vendor's model. The Fable 5 shutdown showed a top model can disappear in about 90 minutes for non-technical reasons; single-vendor dependence is now an operational risk, not a footnote.

Sources

This page is independent. Official provider pages are the source of record for access, pricing, and policy.