Every month, someone asks me: which large language model should I actually be using right now?
That question got harder to answer in June 2026, because all three major players are iterating at a furious pace. Anthropic released Opus 4.8 on May 28, then dropped the Mythos-class Fable 5 on June 9. Google unveiled Gemini 3.5 Flash at I/O as the new default. OpenAI's GPT-5.5 has been steady since April, with GPT-5.6 rumored but not yet materialized.
I spent three days running all three models across six dimensions. The focus is not benchmarks, which you can find everywhere, but how each model feels in real-world use.
The Current Landscape
As of late June 2026, here's where the three giants stand:
Anthropic: The most aggressive pace. Opus 4.6 (Feb) → Opus 4.7 (April) → Opus 4.8 (May 28) → Fable 5 (June 9). That is four releases in about four months. Fable 5 is their first public Mythos-class model, with crushing coding performance. But it's also controversial: the US government asked Anthropic to shut down Fable 5 and Mythos 5 on national security grounds. The situation is still developing, so this comparison focuses on Opus 4.8 as the stable choice, with Fable 5 as a supplement.
OpenAI: The most stable. GPT-5.5 has been the mainstream flagship since April. On June 23, they released GPT-5.5-Cyber for security use cases. GPT-5.6 was planned for June but may be delayed due to the Fable 5 situation. Their strategy: "no rush on new releases, nail the current one first."
Google: The best value. Gemini 3.5 Flash launched at I/O 2026 as the default Gemini model. It has a 1M-token context and supports text, vision, video, audio, and code, which makes five modalities. At $1.50/$9 per 1M tokens (input/output), it is absurdly cheap. Gemini 3.5 Pro was delayed to June, but Flash is already competitive.
Dimension 1: Writing
I tested three scenarios: commercial copywriting, technical articles, and creative writing.
Claude Opus 4.8 writes the most "human." Its text has rhythm — natural variation in sentence length, none of that AI evenness. For commercial copy, it strikes a great balance between professional and conversational. Technical articles are its forte — clear logic, and it proactively fills in background you didn't know you needed.
GPT-5.5 is "competent but uninspiring." It can write anything, but nothing wows you. Commercial copy is workmanlike, creative writing skews conservative, technical articles are accurate but dry. Good for consistent output, but if you care about text quality, it falls short.
Gemini 3.5 Flash was surprisingly good. Especially in Chinese — Google's investment in Chinese language data is visibly paying off. Its style leans lively, great for social media and marketing copy. But on long technical pieces, it occasionally goes off track and needs multiple prompts to correct.
Writing ranking: Claude Opus 4.8 > Gemini 3.5 Flash > GPT-5.5
Dimension 2: Coding
The most competitive dimension right now.
Claude Opus 4.8 hit 69.2% on SWE-Bench Pro, up from 64.3%. In actual coding: exceptional context understanding. Give it a large project, and it remembers all file dependencies — change a function, and it proactively reminds you what else needs updating. Its code comments are genuinely helpful, not filler.
As for Fable 5, if it's still accessible, its coding ability is genuinely crushing. It reportedly broke 75% on SWE-Bench Pro, and its long-horizon programming (multi-hour, multi-step coding tasks) is the best of any model. But given its unstable status, I wouldn't recommend it as a production workhorse.
GPT-5.5 writes solid code but lacks spark. Good for well-specified, low-creativity code — CRUD APIs, unit tests, doc generation. But ask it to do architecture design or debug complex concurrency issues, and it'll give you something that "looks right but doesn't hold up under scrutiny."
Gemini 3.5 Flash is the weakest coder of the three, but still usable. Its advantage is Google ecosystem integration — if you're on Google Cloud or Firebase, Gemini understands that stack best.
Coding ranking: Fable 5 (if available) > Claude Opus 4.8 > GPT-5.5 > Gemini 3.5 Flash
Dimension 3: Reasoning
I tested with math problems and logic puzzles.
Opus 4.8 scored 96.7% on USAMO — ranked first among all public models. In practice, when solving complex logic problems, it decomposes the problem into sub-problems, tackles each one, then synthesizes — a reasoning chain that mirrors how humans solve problems.
GPT-5.5's reasoning is solid but more "direct" — it goes from A to B, skipping intermediate steps. Faster, but occasionally makes errors in those skipped steps without catching them.
Gemini 3.5 Flash isn't a reasoning specialist, but it excels at multimodal reasoning. Give it a chart to analyze trends, or a video to summarize key points — it's the best of the three.
Reasoning ranking: Claude Opus 4.8 > GPT-5.5 > Gemini 3.5 Flash
Dimension 4: Multimodal
No suspense here.
Gemini 3.5 Flash natively supports text, image, video, audio, and code: five modalities. And with 1M token context, you can literally feed it an entire movie to analyze. This is its home turf.
GPT-5.5 supports text and images; video and audio capabilities are limited. Opus 4.8 also focuses on text and images — multimodal isn't Anthropic's current priority.
Multimodal ranking: Gemini 3.5 Flash > GPT-5.5 ≈ Claude Opus 4.8
Dimension 5: Pricing
Probably what most people care about most.
- Gemini 3.5 Flash: $1.50 / $9 per 1M tokens (input/output) — cheapest by far
- GPT-5.5: ~$5 / $15 per 1M tokens — middle of the road
- Claude Opus 4.8: ~$15 / $75 per 1M tokens — most expensive, but also the most capable
If your usage is high, the price gap is significant. At 1M input + 500K output tokens per month: Gemini ~$6, GPT ~$12.5, Claude ~$52.5.
But Claude's premium can be justified: in my testing its output quality is genuinely higher, which means fewer revision rounds, so the cost per finished piece of work might not be worse.
Value ranking: Gemini 3.5 Flash > GPT-5.5 > Claude Opus 4.8
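The monthly estimate in this section can be reproduced with a few lines of Python. This is a minimal sketch that assumes the per-1M-token prices listed above (two of which are approximate) and the same 1M input + 500K output workload; the `PRICES` table and the `monthly_cost` helper are names made up for illustration, not part of any vendor SDK.

```python
# USD per 1M tokens as (input, output), copied from the price list above.
# The GPT-5.5 and Claude Opus 4.8 figures are approximate in the source.
PRICES = {
    "Gemini 3.5 Flash": (1.50, 9.00),
    "GPT-5.5": (5.00, 15.00),
    "Claude Opus 4.8": (15.00, 75.00),
}


def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated monthly bill in USD for one model."""
    input_price, output_price = PRICES[model]
    total = input_tokens * input_price + output_tokens * output_price
    return total / 1_000_000


if __name__ == "__main__":
    # Same workload as the estimate above: 1M input + 500K output tokens.
    for model in PRICES:
        cost = monthly_cost(model, 1_000_000, 500_000)
        print(f"{model}: ${cost:.2f}")
```

Swap in your own token counts to see how the gap changes; output-heavy workloads widen it, because output prices differ more in absolute terms than input prices.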
Dimension 6: Speed
Gemini 3.5 Flash lives up to its name — fast. Low first-token latency, steady generation at 100+ tokens per second.
GPT-5.5 is medium speed, similar to the previous generation. Opus 4.8 has adaptive thinking mode — fast for simple queries, but "thinks" for a moment on complex ones, so perceived speed varies.
Speed ranking: Gemini 3.5 Flash > GPT-5.5 > Claude Opus 4.8
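To see the six rankings side by side, here is a small tally of first-place finishes. It is only a sketch of my own rankings above, not a new measurement. Two assumptions: Fable 5 is left out of the coding ranking because its availability is uncertain, and the GPT-5.5 ≈ Claude Opus 4.8 tie in multimodal is modeled as a shared tier. The names `RANKINGS` and `first_place_counts` are illustrative.

```python
from collections import Counter

# Each ranking is a list of tiers, best first. Models in one tier are tied.
# Assumption: Fable 5 is excluded from "coding" (availability uncertain).
RANKINGS = {
    "writing": [["Claude Opus 4.8"], ["Gemini 3.5 Flash"], ["GPT-5.5"]],
    "coding": [["Claude Opus 4.8"], ["GPT-5.5"], ["Gemini 3.5 Flash"]],
    "reasoning": [["Claude Opus 4.8"], ["GPT-5.5"], ["Gemini 3.5 Flash"]],
    "multimodal": [["Gemini 3.5 Flash"], ["GPT-5.5", "Claude Opus 4.8"]],
    "pricing": [["Gemini 3.5 Flash"], ["GPT-5.5"], ["Claude Opus 4.8"]],
    "speed": [["Gemini 3.5 Flash"], ["GPT-5.5"], ["Claude Opus 4.8"]],
}


def first_place_counts(rankings: dict) -> Counter:
    """Count how many dimensions each model wins (top tier only)."""
    counts = Counter()
    for tiers in rankings.values():
        for model in tiers[0]:
            counts[model] += 1
    return counts


if __name__ == "__main__":
    for model, wins in first_place_counts(RANKINGS).most_common():
        print(f"{model}: {wins} first places")
```

A first-place count ignores how much each dimension matters to you, so treat it as a summary of the sections above rather than a verdict.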
Scenario-Based Recommendations
After all that, which one should you pick? My advice: choose by use case.
For writing: Claude Opus 4.8. Best text quality, great for articles, copy, reports that need a human touch.
For coding: Claude Opus 4.8 (daily development) or Fable 5 (large projects, if available). GPT-5.5 as backup.
For business owners: GPT-5.5. Most stable, fewest surprises, best API docs, most integration options. If budget-sensitive, Gemini 3.5 Flash.
For multimedia: Gemini 3.5 Flash. Images, video, audio — all in one, and cheap.
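On a tight budget: Gemini 3.5 Flash, no contest. At the usage mix from the pricing section, the same budget buys roughly 9x the tokens of Claude Opus 4.8 and about 2x those of GPT-5.5.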
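If you call these models from code, the recommendations above boil down to a lookup table. The sketch below is one hypothetical way to encode them: `ROUTES` and `pick_models` are names invented for this example, the model strings are plain labels rather than real API identifiers, and Fable 5 is left out because its availability is uncertain. The second entry in a list is the backup or budget alternative named above, and unknown use cases fall back to GPT-5.5 as the "fewest surprises" choice.

```python
# Ordered preferences per use case, taken from the recommendations above.
# Model names are display labels, NOT real API model identifiers.
ROUTES = {
    "writing": ["Claude Opus 4.8"],
    "coding": ["Claude Opus 4.8", "GPT-5.5"],       # GPT-5.5 as backup
    "business": ["GPT-5.5", "Gemini 3.5 Flash"],    # Gemini if budget-sensitive
    "multimedia": ["Gemini 3.5 Flash"],
    "budget": ["Gemini 3.5 Flash"],
}

# Assumption: GPT-5.5 is the default for anything not listed.
DEFAULT_MODEL = "GPT-5.5"


def pick_models(use_case: str) -> list[str]:
    """Return the preferred models for a use case, best choice first."""
    key = use_case.strip().lower()
    return ROUTES.get(key, [DEFAULT_MODEL])


if __name__ == "__main__":
    for case in ("writing", "Coding", "translation"):
        print(case, "->", pick_models(case))
```

Keeping the table separate from the calling code means next month's reshuffle is a one-line change.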
One-Line Selection Guide
Budget isn't an issue → Claude Opus 4.8; value-first → Gemini 3.5 Flash; don't want to think about it → GPT-5.5.
One last note: this comparison gets updated monthly. The LLM space moves so fast that June's rankings might not hold in July — especially if GPT-5.6 drops, Fable 5 becomes available again, or Gemini 3.5 Pro lands. The landscape will reshuffle.
My advice: don't bet everything on one model. Use all three and switch by scenario. API prices are now low enough that paying for all three won't hurt. The real cost isn't the bill; it's the time you spend agonizing over which one to choose.
