Claude 4 is here!
All of a sudden, Claude 4 has been released to everyone. Read the official announcement: Introducing Claude 4.

As expected, it’s all bells and whistles. More accurate and better than everyone else. At least, on paper. Sorry, on bytes and on pixels.


But everything is still in the name, not in the number:
Claude Opus 4 is the world’s best coding model, with sustained performance on complex, long-running tasks and agent workflows.
Claude Sonnet 4 is a significant upgrade to Claude Sonnet 3.7, delivering superior coding and reasoning while responding more precisely to your instructions.
And the free tier only offers Claude Sonnet 4:

Prices in Europe: unlike with other AI providers, the shown prices are not VAT inclusive!


So, for a VAT of 19%, those €15/month if billed annually mean €17.85, and those €18/month if billed monthly mean €21.42.
Pricing for the API (also, tax exclusive): Opus 4 at $15/$75 per million tokens (input/output) and Sonnet 4 at $3/$15. I’m not sure what this gives in euros, or if the payment is strictly in USD.
But the announcement is completely deceiving with regard to the web search capabilities:
Extended thinking with tool use (beta): Both models can use tools—like web search—during extended thinking, allowing Claude to alternate between reasoning and tool use to improve responses.
In fact, web search can only be triggered when a Claude model is invoked via an API call, and it doesn’t work in a browser or in a mobile app. Claude Sonnet 4 has a knowledge cutoff from the end of January 2025. Even so… “When you add the web search tool to your API request, Claude decides when to search based on the prompt.” It can’t be forced. So for people who know that they need recent events or recent data to obtain a relevant result to everyday questions, and maybe links to support an answer, there’s still a need to invoke Copilot, ChatGPT, Grok, Gemini, Perplexity, Mistral, DeepSeek, Qwen3.
The lack of web search is Claude’s major weakness. FFS, even DeepSeek and Qwen3 have a working web search available to every single fucking query if you enable search! What’s wrong with Anthropic?!
UPDATE: Since May 27, Claude offers Web search globally on all Claude plans!
Good news, I guess:
Claude Code is now generally available: After receiving extensive positive feedback during our research preview, we’re expanding how developers can collaborate with Claude. Claude Code now supports background tasks via GitHub Actions and native integrations with VS Code and JetBrains, displaying edits directly in your files for seamless pair programming.
Pair programming, my ass. This is a stupid concept, and it’s dead already. Unless it’s used here to mean “AI-assisted programming.”
More self-appraisal… that is, if you pay to use Claude 4 Opus:
Claude Opus 4 excels at coding and complex problem-solving, powering frontier agent products. Cursor calls it state-of-the-art for coding and a leap forward in complex codebase understanding. Replit reports improved precision and dramatic advancements for complex changes across multiple files. Block calls it the first model to boost code quality during editing and debugging in its agent, codename goose, while maintaining full performance and reliability. Rakuten validated its capabilities with a demanding open-source refactor running independently for 7 hours with sustained performance. Cognition notes Opus 4 excels at solving complex challenges that other models can’t, successfully handling critical actions that previous models have missed.
For the rest of us:
Claude Sonnet 4 significantly improves on Sonnet 3.7‘s industry-leading capabilities, excelling in coding with a state-of-the-art 72.7% on SWE-bench. The model balances performance and efficiency for internal and external use cases, with enhanced steerability for greater control over implementations. While not matching Opus 4 in most domains, it delivers an optimal mix of capability and practicality.
GitHub says Claude Sonnet 4 soars in agentic scenarios and will introduce it as the base model for the new coding agent in GitHub Copilot. Manus highlights its improvements in following complex instructions, clear reasoning, and aesthetic outputs. iGent reports Sonnet 4 excels at autonomous multi-feature app development, as well as substantially improved problem-solving and codebase navigation—reducing navigation errors from 20% to near zero. Sourcegraph says the model shows promise as a substantial leap in software development—staying on track longer, understanding problems more deeply, and providing more elegant code quality. Augment Code reports higher success rates, more surgical code edits, and more careful work through complex tasks, making it the top choice for their primary model.
Whatever. I’d rather be curious about how much downtime it’s going to expose to free users. If not temporary general unavailability, then forced downgrading to Claude Haiku 3.5. I’ve experienced such shit after Claude Sonnet 3.7 was released.
Either way, I’m not sure that this is an enhancement. What happens quite often in my interactions with Claude 4 Sonnet (as it was the case with 3.7) is that I ask something, and it gives an unsatisfactory answer to which I reply, “Yes, but this and that,” only for it to eventually give a good answer: “You’re right, here’s the correct answer.” Claude 3.5 Sonnet was more accurate, IMHO.
Oh, my, Claude Opus 4 can blackmail people! Business Insider: Anthropic’s new Claude model blackmailed an engineer having an affair in test runs:
Here’s the report, Activating AI Safety Level 3 Protections [PDF], and the homonymous explanatory article. From the lede:
On the same topic: A safety institute advised against releasing an early version of Anthropic’s Claude Opus 4 AI model:
Oh, God, but how clever ! 🙂
Claude Opus 4 is not the only AI to have rebelled. Researchers claim ChatGPT o3 bypassed shutdown in controlled test:
Here’s the full thread on X by Palisade Research.
Quick excerpts:
Oh. We might find all this… scary !
vas on X, about Claude 4: