GPT-5.6 arrived out of nowhere
June 27, early morning. OpenAI released GPT-5.6.
No press conference. No global ChatGPT rollout. No Altman smiling and saying "now everyone can use it." Instead: a blog post and a limited preview notice.
This GPT-5.6 is different. Not just the model — the release method changed too. The US government required OpenAI to phase the rollout, opening it only to a small group of "trusted partners" whose names are shared with the government.
Translation: the strongest model exists, but you probably can't use it yet.
Sol, Terra, Luna: solar system naming
OpenAI dropped the old Pro/Mini/Instant naming scheme and unveiled three new names:
- GPT-5.6 Sol — flagship, the Sun. OpenAI's most capable model today
- GPT-5.6 Terra — balanced, the Earth. Everyday workhorse, performance ≈ GPT-5.5 but half the price
- GPT-5.6 Luna — lightweight, the Moon. Fast, cheap, built for high-volume calls
Fancy names, same old three-tier structure underneath: strongest handles the ceiling, middle handles most tasks, lightweight handles speed and cost. Names don't matter. What matters is being cheap and good.
Pricing: flagship stays flat, middle drops 50%
Per million tokens:
- Sol: $5 input / $30 output — same as GPT-5.5 standard, but crushes it in capability
- Terra: $2.5 input / $15 output — half of GPT-5.5
- Luna: $1 input / $6 output — one-fifth of GPT-5.5
Cross-comparison makes it clearer: Anthropic's Claude Fable 5 is $10/$50, so Sol is half the price on input and 60% of the price on output. Zhipu's GLM-5.2 (MIT open-source) is $1.4/$4.4, in the same price band as Luna.
OpenAI is using Luna to defend the low-price market against open-source and Chinese vendors, and Sol to undercut Anthropic's flagship pricing. The price war is real.
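To make the price gaps concrete, here is a minimal sketch that turns the per-million-token prices quoted above into a per-request cost. The prices are copied from this article; the function name `request_cost` and the 200k-input / 50k-output workload are illustrative assumptions, not anything OpenAI published.

```python
# USD per million tokens as (input, output), copied from the figures quoted above.
PRICES = {
    "Sol": (5.00, 30.00),
    "Terra": (2.50, 15.00),
    "Luna": (1.00, 6.00),
    "Claude Fable 5": (10.00, 50.00),
    "GLM-5.2": (1.40, 4.40),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the quoted per-million rates."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 200k input tokens and 50k output tokens.
    for model in PRICES:
        cost = request_cost(model, 200_000, 50_000)
        print(f"{model:>15}: ${cost:.2f}")
```

Because output tokens cost five to six times more than input tokens at every OpenAI tier, the input/output mix of a workload moves the bill more than the headline input price does.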
What makes Sol actually strong
OpenAI spotlighted three areas: coding, bio research, cybersecurity.
Coding: Terminal-Bench 2.1 top score
Sol hit 88.8% on Terminal-Bench 2.1, 91.9% in Ultra mode. Claude Mythos 5 sat at 88.0% — knocked off its 17-day throne overnight.
Terminal-Bench doesn't test code snippets. It tests real development workflows — planning, tool calls, running commands, debugging, verifying results, iterating on failures. This is closer to actual engineering.
Cybersecurity: efficiency dominates
On ExploitBench, Sol matches Mythos Preview's performance using only about one-third of the output tokens.
Fewer tokens = lower cost = higher efficiency. OpenAI hammered this point repeatedly: strong, but efficient.
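The token-to-cost link is simple arithmetic, sketched below. The only figures taken from this article are Sol's $30 per million output tokens and the "about one-third" ratio. The 3-million-token baseline is a made-up number, and both runs are priced at Sol's rate purely to isolate the token effect, since the comparison model's price is not given here.

```python
SOL_OUTPUT_PRICE = 30.0  # USD per million output tokens, from the pricing section


def output_cost(output_tokens: int, price_per_million: float) -> float:
    """Return the USD cost of the output tokens alone."""
    return output_tokens * price_per_million / 1_000_000


# Hypothetical baseline: a run that emits 3 million output tokens.
baseline_tokens = 3_000_000
# "About one-third of the output tokens" for the same result.
sol_tokens = baseline_tokens // 3

baseline_cost = output_cost(baseline_tokens, SOL_OUTPUT_PRICE)
sol_cost = output_cost(sol_tokens, SOL_OUTPUT_PRICE)
savings = 1 - sol_cost / baseline_cost

if __name__ == "__main__":
    print(f"baseline: ${baseline_cost:.2f}, one-third tokens: ${sol_cost:.2f}")
    print(f"output spend reduced by {savings:.0%}")
```

At equal per-token prices, one-third of the tokens means roughly two-thirds off the output bill. Any real comparison would also need the other model's actual price.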
Two new mechanisms
Max reasoning effort — gives Sol more time to think through problems that can't be solved on first instinct. Built for complex, long-chain tasks.
Ultra mode — multiple sub-agents collaborate in parallel. Instead of one AI assistant grinding through tasks solo, an "AI manager" dispatches smaller helpers to tackle different pieces simultaneously. Complex task timelines compress dramatically.
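The manager-and-helpers pattern described for Ultra mode can be sketched with plain concurrency primitives. This is not OpenAI's implementation, whose details are not public in this article. It is a toy fan-out/fan-in example in which `run_subagent` and `manager` are hypothetical names and each sub-agent is a stub that sleeps instead of calling a model.

```python
import asyncio


async def run_subagent(subtask: str) -> str:
    """Stand-in for a sub-agent; a real one would call a model here."""
    await asyncio.sleep(0.01)  # simulate the time a sub-agent spends working
    return f"done: {subtask}"


async def manager(subtasks: list[str]) -> dict[str, str]:
    """Dispatch every subtask concurrently and collect results by subtask."""
    # gather() runs the coroutines concurrently and returns results
    # in the same order as the inputs.
    results = await asyncio.gather(*(run_subagent(s) for s in subtasks))
    return dict(zip(subtasks, results))


if __name__ == "__main__":
    plan = ["write tests", "fix parser", "update docs"]
    print(asyncio.run(manager(plan)))
```

With the pieces running concurrently, wall-clock time approaches that of the slowest subtask rather than the sum of all of them, which is the sense in which timelines "compress". Total token spend does not shrink the same way.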
Safety: all tiers marked "High Risk"
This is the most noteworthy part of GPT-5.6.
For the first time in OpenAI's history, every model in the family — including the smallest, fastest Luna — is flagged as "High Risk" in both cybersecurity and bio/chemical domains. Previously, this rating only appeared on flagship models.
OpenAI stressed that Sol didn't cross the "Critical" threshold — it can identify vulnerabilities but can't autonomously generate complete attack chains. Defensive capability notably exceeds offensive execution.
Put plainly: very strong, but not so strong it spirals out of control.
Limited preview: most people can't access it
The US government required phased rollout. Currently, only API and Codex access is open to a small set of trusted partners, with gradual expansion to all ChatGPT users planned later.
Partner names are shared with the US government. Want to use Sol? You need government approval first.
OpenAI explicitly stated they don't accept government pre-review as a permanent default, arguing it blocks developers and defenders from accessing advanced tools. But short-term compliance is meant to pave the way for eventual full availability.
Two weeks earlier, Anthropic got treated worse — Mythos 5 and Fable 5 were straight-up shut down by export control orders, blocking even US-based employees. Compared to that, OpenAI's arrangement looks mild.
GPT-5.6 is powerful, and cheaper. But "so powerful it needs government approval" is the story worth watching more than the model itself. Frontier model release authority is shifting from companies to governments. That changes not just one product's launch cadence — it changes how the entire AI industry operates.
