Fable Shutdown, GLM-5.2 & OpenRouter Fusion, Cognitive Coverage

The week control over AI moved to the state — and why that's the case for running your own.

Last week made one thing clear: control over AI is shifting to the state. The US government took Fable 5 offline for everyone, and in a separate case stepped in to keep an xAI data center running — both in the name of national security. In between, an open-weight Chinese model arrived as the obvious option for anyone who'd rather not depend on a model that can be switched off from outside. That thread runs through the whole edition. We also look at Cognitive Coverage: the part of software work that AI still can't take off your hands.

Fable 5 Shutdown: Why It Happened

Fable 5 didn't go dark over one thing. Two unrelated triggers landed in the same week, and together they were enough. We could only report the shutdown last time; the reporting since — mostly the Washington Post and Wired, both behind paywalls — filled in the why.

First, Mythos — the powerful model that Fable is the guardrailed version of — had been given to partners through a program called Project Glasswing. The government found out that the South Korean telecom company SK Telecom had access too, and the administration suspected it of ties to China (SK Telecom denies this). Anthropic was asked to revoke that access and did so immediately — the first red flag.

The second came via Amazon, which reported it could turn Fable 5 into a vulnerability finder with a simple "fix this code" prompt. Many argued the same is possible with other models like GPT-5.5 — around 150 security leaders, including Alex Stamos and Katie Moussouris, signed an open letter asking for the controls to be lifted — but together with the SK Telecom case it was too much, and both models were pulled.

Anthropic has been negotiating to get them back, with no resolution in sight — though in an interview on The Axios Show, President Trump said he no longer sees the company as a national security threat ("not now, but a week ago, maybe").

The uncomfortable part for anyone building on hosted models: access is only as stable as a government's mood, and the customer gets no vote. None of Anthropic's customers did anything wrong, and none of them could have prevented this.

GLM-5.2 & OpenRouter Fusion

New models, new approaches

The Fable shutdown is the strongest argument we've seen all year for keeping an open-weight option you can run yourself. The market delivered one within days.

Z.ai (formerly Zhipu) is a Chinese AI lab in the same league as DeepSeek or Kimi. They've released version 5.2 of their GLM model: open-weight, free to download. The big surprise? In most benchmarks it places right next to the top frontier models we have, Opus 4.8 and GPT-5.5. For now those are the official numbers only — like every model, GLM will have to prove itself in real work.

The economics are the story. The full weights are about 1.51 TB, and even a quantized build runs to a couple of hundred gigabytes, so running it locally means an enterprise-grade setup — or you reach it through OpenRouter or Z.ai itself. But the price is striking:

Model	Input / M tokens	Output / M tokens
GLM-5.2	USD 1.40	USD 4.40
GPT-5.5	≈ USD 5	≈ USD 30
Fable 5	USD 10	USD 50

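That makes GLM-5.2 about 11 times cheaper than Fable on output (USD 4.40 vs. USD 50) and about 7 times cheaper on input (USD 1.40 vs. USD 10). The one catch is the hidden cost of thinking: the longer a model reasons, the more tokens it spends. If a cheaper model needs more reasoning than a pricier one to solve a hard task, the gap narrows.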
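To make the effect of thinking tokens concrete, here is a minimal Python sketch using the list prices from the table. The token counts are invented for illustration, and the sketch assumes reasoning tokens are billed at the output rate, which the table itself does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    input_per_m: float   # USD per million input tokens
    output_per_m: float  # USD per million output tokens


# List prices from the table above (the GPT-5.5 figures are approximate).
PRICES = {
    "GLM-5.2": Price(1.40, 4.40),
    "GPT-5.5": Price(5.00, 30.00),
    "Fable 5": Price(10.00, 50.00),
}


def request_cost(price, input_tokens, answer_tokens, reasoning_tokens=0):
    """Cost of one request in USD.

    Assumption: reasoning tokens are billed like ordinary output tokens.
    """
    output_tokens = answer_tokens + reasoning_tokens
    total = input_tokens * price.input_per_m + output_tokens * price.output_per_m
    return total / 1_000_000


# Hypothetical easy task: 2,000 input tokens, 1,000 answer tokens, no reasoning.
easy_glm = request_cost(PRICES["GLM-5.2"], 2_000, 1_000)
easy_fable = request_cost(PRICES["Fable 5"], 2_000, 1_000)

# Hypothetical hard task: GLM spends 8,000 reasoning tokens, Fable only 1,000.
hard_glm = request_cost(PRICES["GLM-5.2"], 2_000, 1_000, reasoning_tokens=8_000)
hard_fable = request_cost(PRICES["Fable 5"], 2_000, 1_000, reasoning_tokens=1_000)

print(f"easy task: Fable costs {easy_fable / easy_glm:.1f}x GLM")
print(f"hard task: Fable costs {hard_fable / hard_glm:.1f}x GLM")
```

With these made-up token counts, Fable costs roughly 9.7 times as much as GLM on the easy task but only about 2.8 times as much on the hard one. The real ratio depends entirely on how many reasoning tokens each model actually needs, which only your own workloads can tell you.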

Whether GLM is Chinese or American matters less than one fact: you hold the weights.

A self-hosted model can't be switched off by any government — which is exactly the exposure Fable just demonstrated.

The low price opens a second approach: run the same prompt against several cheaper models at once, then have a judge model pick the best answer. That's what OpenRouter built with Fusion. On the DRACO deep-research benchmark, a panel of Gemini 3 Flash, Kimi K2.6 and DeepSeek V4 Pro beats the individual frontier models and comes within a point of Fable at half the cost, while an Opus 4.8 + GPT-5.5 panel outscores solo Fable. A clever hedge — though, being hosted, it doesn't give you what open weights do.
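The panel-plus-judge idea can be sketched in a few lines of Python. This is not OpenRouter's implementation or API: `fuse`, the stand-in models and the toy judge are all hypothetical names, and the "models" are plain functions so the example runs without network access. In practice each callable would wrap a request to a hosted or local model, and the judge would itself be a model call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Tuple

# A "model" here is any function that maps a prompt to an answer (hypothetical).
Model = Callable[[str], str]
# A judge sees the prompt and all answers and returns the name of the winner.
Judge = Callable[[str, Dict[str, str]], str]


def fuse(prompt: str, panel: Dict[str, Model], judge: Judge) -> Tuple[str, str]:
    """Send one prompt to every panel model in parallel, then let the judge pick."""
    if not panel:
        raise ValueError("panel must contain at least one model")

    with ThreadPoolExecutor(max_workers=len(panel)) as pool:
        futures = {name: pool.submit(model, prompt) for name, model in panel.items()}
        answers = {name: future.result() for name, future in futures.items()}

    winner = judge(prompt, answers)
    if winner not in answers:
        raise ValueError(f"judge returned unknown model name: {winner!r}")
    return winner, answers[winner]


# Stand-in models for illustration only.
panel = {
    "model-a": lambda prompt: "4",
    "model-b": lambda prompt: "2 + 2 = 4, because adding two pairs gives four.",
}


def longest_answer_judge(prompt: str, answers: Dict[str, str]) -> str:
    """Toy judge: prefers the longest answer. A real judge would be a model call."""
    return max(answers, key=lambda name: len(answers[name]))


winner, answer = fuse("What is 2 + 2?", panel, longest_answer_judge)
print(winner, "->", answer)
```

The cost side follows directly from the structure: every request pays for all panel members plus the judge, so the approach only makes sense when the panel models are much cheaper than the single model they replace.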

Cognitive Coverage

Should we review code ourselves?

The case for dropping human code review is getting louder, and it isn't crazy. The AI review assistants are good — they catch bugs humans miss, and they're faster. So why review anything yourself?

Addy Osmani wrote about exactly this in Agentic Code Review, drawing the line between what AI review does and doesn't do:

"What it does not supply is the human judgment about whether this is the right change to build in the first place. That judgment stays with a person, and it happens to be the most interesting part of the job, the part worth keeping."

The numbers cut against dropping it. Osmani reports AI roughly quadruples how much code gets written (≈400%) while delivered value rises only about 12% (GitClear) — and the extra code is buggier, with other datasets he cites showing around 1.7 times more issues, security problems among them. Yet the tendency is to review less, not more.

The real question was never "everything or nothing." Reviewing every line was a myth long before AI; how closely you look has always depended on how business-critical the change is and what the impact is. What changed is that the code is now written by something that doesn't grasp the whole system it's editing — so the judgment falls to whoever owns the merge.

A separate study, which Sarah Guo cites in her essay The Untrainable, points the same way: MIT researchers tracked more than 100,000 developers and found AI raised the amount of code written by about 180%, but the amount actually shipped by only about 30%. Different dataset, same conclusion — writing code is cheap; knowing whether it was the right thing to build for that particular situation is not, and no test and no AI reviewer can settle it.
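The gap between those two percentages is easier to see as a ratio. The short Python sketch below reads "raised by about 180%" and "by only about 30%" as increases over a baseline of 100%, which is our interpretation of the figures rather than something the study states in this form; the function name is hypothetical.

```python
def shipped_per_written(written_increase_pct: float, shipped_increase_pct: float) -> float:
    """Share of written code that ships, relative to the pre-AI baseline (1.0)."""
    written = 1 + written_increase_pct / 100   # e.g. +180% -> 2.8x the baseline
    shipped = 1 + shipped_increase_pct / 100   # e.g. +30%  -> 1.3x the baseline
    return shipped / written


ratio = shipped_per_written(180, 30)
print(f"Each line written is about {ratio:.0%} as likely to ship as before")
```

Under that reading, 2.8 times as much code is written but only 1.3 times as much ships, so a given line of code is less than half as likely to reach production as it was before.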

Microsoft CEO Satya Nadella put a name to this in his interview on the Hard Fork podcast: "cognitive coverage", a term he credited to a colleague. The idea is that the developer's job becomes understanding what the system does and why, not just whether the tests pass. He still sees developers as the ones who connect systems together. That has been our task all along: most of the code in our applications was never written by us but came from libraries and frameworks, and that won't change with AI.

In Other News

SpaceX went public last week in the largest IPO ever — priced at USD 135, up to USD 225 within days, now back around USD 160. Days later it bought Cursor for USD 60 billion, an option it had secured earlier. Cursor is known for its AI coding assistant and its Composer model, which matches frontier models like Opus 4.8 and GPT-5.5 on coding benchmarks at a fraction of the cost. With it, SpaceX now owns the editor, the xAI Grok models, and the Colossus data centers that sell compute to others — including Anthropic.

Separately, the Department of Justice stepped into an ongoing case against xAI over allegations it runs gas turbines illegally and pollutes nearby communities, asking the court to dismiss it on national-security grounds. No judge has ruled yet.

Put next to Fable, the pattern is hard to miss: the state is taking a growing hand in who runs AI, and on what terms.

Watching next: whether Fable and Mythos come back — and on what terms.

Soverius AI

Local models don't need to hide behind the frontier — GLM-5.2 proved that again. The open question is how to get that quality out of them on hardware you actually have, because a 1.51 TB model is not something you run on a laptop.

That's what our webinar next week is about. We won't run GLM-5.2 itself — it's too big to demo live — but you'll learn what running capable local models takes, and what it costs.

Quantized AI 26/03: Fable Shutdown, GLM-5.2 & OpenRouter Fusion, Cognitive Coverage

Software Development with Local LLMs