Quantized AI 26/04: GPT-5.6, Claude Tag, and the Distillation Problem

Rainer Hahnekamp
GPT-5.6 · Claude Tag · Distillation · Quantized AI News

This week, the strongest AI developments are less about raw model capability and more about access, control, and where agents actually work. OpenAI's GPT-5.6 preview, Anthropic's Claude Tag, and the reported distillation case around Alibaba all point to the same practical question: who gets to use powerful models, under which conditions, and what happens once their outputs spread?

GPT-5.6 and Frontier AI Access Control

GPT-5.6 is available as a preview, but only to a limited number of parties. OpenAI says this limited preview is happening at the U.S. government's request.

GPT-5.6 comes in three model types: Sol, Terra, and Luna. Sol is the most powerful one, Terra is the middle tier, and Luna is the smallest and fastest. This is similar to Anthropic's model families, where generations and model types such as Sonnet and Opus can develop at different speeds.

OpenAI says Sol performs better than Mythos and Fable on Terminal Bench 2.1, but it has published few other coding-specific benchmarks. In cybersecurity, the picture is different: OpenAI's preview release says Sol comes close to Mythos but still falls short of it, while using fewer output tokens.

METR's predeployment evaluation adds an important caveat. METR found behavior it classifies as "cheating", such as trying to reveal hidden test suites or hidden source code used to verify the answer. Because this affected the capability estimate, METR did not consider its time-horizon result robust.

On pricing, Sol costs USD 5 per million input tokens and USD 30 per million output tokens. On output pricing, that makes it 20% more expensive than Opus 4.8, but it costs only 60% as much as Fable or Mythos. The announcement did not mention whether GPT-5.6 will be available only through per-token API usage or also as part of ChatGPT subscriptions.
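To make the pricing concrete, here is a minimal sketch of the arithmetic. Only the two Sol prices and the two percentages come from the announcement as summarized above. The function names and the example token counts are hypothetical, and the competitor prices are merely what the percentages imply, not published list prices.

```python
# Sol list prices as stated above (USD per million tokens).
SOL_INPUT_USD_PER_MILLION = 5
SOL_OUTPUT_USD_PER_MILLION = 30


def sol_request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at Sol's per-token prices."""
    total = (input_tokens * SOL_INPUT_USD_PER_MILLION
             + output_tokens * SOL_OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


def implied_output_price(sol_relative_percent: int) -> float:
    """Back out another model's output price from Sol's price relative to it.

    sol_relative_percent=120 means Sol costs 120% of the other model's price.
    The result is an inference from the stated ratio, not a quoted price.
    """
    return SOL_OUTPUT_USD_PER_MILLION * 100 / sol_relative_percent


if __name__ == "__main__":
    # Hypothetical agent run: 200k input tokens, 50k output tokens.
    print(sol_request_cost(200_000, 50_000))  # 2.5
    # Sol is 20% more expensive than Opus 4.8 on output.
    print(implied_output_price(120))          # 25.0
    # Sol costs 60% of Fable or Mythos on output.
    print(implied_output_price(60))           # 50.0
```

Under these assumptions, output tokens dominate the bill for typical agentic workloads, which is why the comparison above focuses on output pricing.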

Apart from the technical details, the rollout and its impact are worth looking at. OpenAI plans an incremental rollout phase. This seems similar to what we saw with Project Glasswing, where Anthropic provided Mythos to selected partners and then later brought out Fable 5, which it had to shut down.

Analysis

OpenAI seems to be working more closely with the U.S. government. OpenAI does not describe this as a formal order, but "at the U.S. government's request" is strong language. It also says it does not believe this kind of government access process should become the long-term default. Our interpretation is that this was not a normal OpenAI-led rollout. The government appears to have shaped the release: access is limited, the partner set is visible to the government, and broader availability depends on a controlled process.

A model launch is no longer just a product launch.

We can see this development from two angles.

The first angle is negative. The government acts as a gatekeeper and wants to control who has access to the most powerful AI models, while blocking public availability. We have seen this in the past with encryption algorithms and other dual-use technologies. It is also the angle OpenAI seems to indicate between the lines.

The second angle is that we are seeing the beginning of a regulated process for releasing frontier AI to the public. This has long been requested by multiple stakeholders, including the AI labs themselves. Given the non-deterministic behavior of AI and its growing role in medicine, engineering, automation, cybersecurity, and other critical areas, a standardized process that verifies the quality of a model's guardrails is crucial.

This is not unusual. It is common in other industries such as automotive, aviation, and medicine. No one wants to sit in a plane that has not gone through the toughest safety checks, backed by regulators.

So the second reading sees extremely powerful AI models meeting an unprepared regulatory process that is only just starting up. That is why it looks bumpy.

We believe it is neither pure restriction nor purely regulation for the good of all, but something in between.

A similar access-control pattern is also showing up at Anthropic. Anthropic says the government has notified it that Mythos 5 can be redeployed to a set of U.S. organizations that operate and defend critical infrastructure. Semafor reports that this carveout covers more than 100 U.S. institutions and is based on a Commerce letter, while Fable 5 remains unresolved. That could be the start of the same re-rollout or regulation process for Fable as we now see for GPT-5.6.


Claude Tag and the Multiplayer Agent

Anthropic released Claude Tag. In its first phase, this is a bot for Slack, which can follow conversations, remember relevant channel context, and, on demand, start doing tasks on its own.

Anthropic hosts the agent, but it can connect to internal company systems via an Agent Proxy that users have to configure first.

Claude Tag is available in beta for Claude Enterprise and Team customers.

At first glance, this does not look like a noteworthy feature. Anthropic already has a large set of tools that run on top of its agents. Slack integration was also available before, and Claude Tag replaces the existing Claude in Slack app.

The more important shift is organizational. Claude Tag has three main advantages:

  • Claude Tag can do many of the same tasks that a single user can already do through Claude Code or Claude Cowork. The main difference is that now non-technical users can operate it as well.
  • There is no per-user technical setup once the administrator has configured the channel access.
  • Claude can understand more of the surrounding context because its memory can grow per channel. It does not only see the technical conversation from developers. It can also see non-technical communication: business needs, product priorities, and decisions that might get lost if only developers interact with the agent.

Individual users may stay with Claude Code. They run it only for themselves and the setup is much simpler.

Analysis

For companies, Claude Tag could become the logical successor to Claude Code. Unlike individual users, they need shared access, central configuration, permissions, auditability, and a way for non-technical users to participate. Claude Tag moves the same agentic workflow from one developer's terminal into the company's shared work surface.

Anthropic has been using an internal version of Claude Tag for some time, and says 65% of its product team's code is created by that internal version. That number should still be treated as vendor-reported and context-specific, but it matters: Anthropic is presenting Claude Tag as something it has already proven in its own workflow, not as an untested public experiment.

The downside is that this feature starts only with Slack. Anthropic says Slack is the initial integration, and it wants to expand Claude Tag to other places where teams work.

The other open question is whether this kind of feature will stay tied to hosted Claude plans, or whether similar workflows will eventually be possible with self-hosted agents and local LLMs.


Anthropic and the Distillation Problem

Anthropic reportedly accused Alibaba-affiliated operators of running what it called the largest known distillation attack against Claude. According to Business Insider, Anthropic told U.S. senators that the operators used almost 25,000 fraudulent accounts and made 28.8 million interactions with Claude between April 22 and June 5, which Anthropic said was meant to extract capabilities for Alibaba's Qwen models. Alibaba had not publicly responded to Business Insider's request for comment.
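To get a feel for the reported scale, the figures above can be turned into rough averages. This is a back-of-the-envelope sketch, not an analysis of the actual traffic: it assumes both end dates are inclusive, takes the year as 2026 (the report as summarized here does not state it, and the day count is the same in any year), and rounds "almost 25,000" up to 25,000, so the per-account numbers are lower bounds. All names are hypothetical.

```python
from datetime import date

ACCOUNTS = 25_000           # reported as "almost 25,000"; treated as an upper bound
INTERACTIONS = 28_800_000   # reported total interactions
START = date(2026, 4, 22)   # year is an assumption
END = date(2026, 6, 5)


def campaign_days(start: date, end: date) -> int:
    """Number of calendar days in the window, counting both endpoints."""
    return (end - start).days + 1


def scale_summary(accounts: int, interactions: int, days: int) -> dict:
    """Average interaction volume per account, per day, and per account per day."""
    return {
        "per_account": interactions / accounts,
        "per_day": interactions / days,
        "per_account_per_day": interactions / accounts / days,
    }


if __name__ == "__main__":
    days = campaign_days(START, END)
    print(days)  # 45
    for name, value in scale_summary(ACCOUNTS, INTERACTIONS, days).items():
        print(f"{name}: {value:,.1f}")
```

On these assumptions, the averages come out to roughly 1,150 interactions per account, or about 26 per account per day. Spread that thinly, individual accounts would look unremarkable, which illustrates why per-account rate limits alone are a weak defense against this kind of extraction.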

This is a serious accusation and must be treated as reported material, not as an independently verified fact. But the topic itself is important because it is a challenge for any AI lab. AI labs make the huge investments required to produce these models. Distilling those models into something close enough to compete with the original, at only a fraction of the cost, is a real risk.

Distillation itself is not the problem. It can also happen in an agreed way. In edition 26-02, we reported that Apple had an agreement with Google which allowed it to distill parts of the Gemini family. The problem starts when this happens without agreement, or when a competitor allegedly uses a frontier model at industrial scale to copy capabilities without paying the training cost.

Analysis

The hard question is what can be done about it. Fable 5 reportedly had certain restrictions built in that degraded the model's output on tasks specific to LLM training. But even if the U.S. government restricts access to a frontier model, the model could still be distilled and then published by another AI lab under a different label. That lab might even be seen as the good guy because it makes the result available to the public.

That is why distillation is becoming part of the access-control problem, not just a training technique.


Other News

  • The European Commission selected the EUROPA consortium to build what it calls a European open frontier AI model. The sovereignty angle is real, but the 24-language headline is not proof of frontier capability. The Grand Challenge text does mention scale and state-of-the-art performance goals, but until results exist, this is better read as an infrastructure project than as proof that Europe already has a frontier model.
  • AI infrastructure pressure keeps spreading beyond Nvidia GPUs. Apple reportedly raised some Mac and iPad prices because of memory scarcity, Qualcomm announced a data-center roadmap around Dragonfly CPUs and AI accelerators, OpenAI is reportedly working with Broadcom on a custom inference chip, and Google, Amazon, and Nvidia/Intel all show the same pattern: compute is becoming strategic capacity, not just a cost center.

Soverius AI

This is where the question becomes practical. If access to frontier models can change, and if hosted agents move deeper into company workflows, organizations need a second path: local LLM setups that they can operate, evaluate, and integrate on their own terms.

This is our mission at Soverius AI. Our current webinar goes into the practical side of running capable local models: what it takes, where the limits are, and what it costs.

Webinar · Jun 30, 2026

Software Development with Local LLMs

Build smarter. Code locally. Stay in control.
