Anthropic Releases Claude Opus 4.7 — 3× SWE-bench Gains, Stealth Tokenizer Change, Mixed Reception (April 2026)
Anthropic shipped Claude Opus 4.7 on April 16, 2026, claiming 3× more SWE-bench Verified wins, a leap from 57.7% to 79.5% on visual navigation, and a new xhigh reasoning level — at unchanged prices. Developers are split: benchmark gains are real, but a new tokenizer reportedly burns up to 35% more tokens and long-context retrieval regressed.
Anthropic released Claude Opus 4.7 on April 16, 2026. The company calls it its most capable commercial model to date, claiming a 3× improvement on SWE-bench Verified over Opus 4.6, a jump from 57.7% to 79.5% on visual navigation, and a new intermediate `xhigh` reasoning effort level — all at unchanged prices of $5 per million input tokens and $25 per million output tokens. Within 72 hours, developer sentiment split sharply between benchmark believers and a vocal contingent on Hacker News and Reddit accusing Anthropic of shipping a repackaged Opus 4.6 with a tokenizer change that functions as a stealth price hike.
What Happened
Anthropic's announcement, titled "Introducing Claude Opus 4.7," positions the model as a notable upgrade "across coding, agents, vision, and multi-step tasks, with greater thoroughness and consistency on the work that matters most." The release hit every major distribution channel at once: Claude products (Pro, Max, Team, Enterprise), the Claude API, Amazon Bedrock, Google Cloud's Vertex AI, and Microsoft Foundry. No waitlist, no staged rollout.
The headline capability claims, sourced from Anthropic's own model card and launch post:
- SWE-bench Verified: 3× more production coding tasks resolved compared to Opus 4.6, with double-digit gains in code and test quality on Rakuten-SWE-Bench.
- CursorBench: 70% resolution rate, up from Opus 4.6's 58%.
- Vision: Accepts images up to 2,576 pixels on the long edge (~3.75 megapixels), "more than three times" the 1.15 megapixel cap on Opus 4.6. Visual-navigation score without tools rose from 57.7% to 79.5%.
- New `xhigh` effort level: Sits between `high` and `max` to give developers finer control over the reasoning-vs-latency tradeoff.
- Cybersecurity safeguards: First Claude model to ship with automated detection and blocking of requests flagged as prohibited or high-risk cybersecurity uses — a production outgrowth of Claude Mythos Preview and Project Glasswing.
- Task budgets (public beta) and a new `/ultrareview` command in Claude Code for dedicated code-review sessions.
Key Details
- Pricing unchanged: $5 / 1M input tokens, $25 / 1M output tokens — same as Opus 4.6. Anthropic emphasized delivering "frontier capability on a two-month cadence at the same price."
- Competitive positioning: Anthropic claims Opus 4.7 exceeds OpenAI's GPT-5.4 and Google's Gemini 3.1 Pro on agentic coding, scaled tool-use, agentic computer use, and financial analysis. GPT-5.4 still leads on agentic search (89.3% vs. Opus 4.7's 79.3%).
- Day-one availability: Anthropic API, Amazon Bedrock, Google Vertex AI, and Microsoft Foundry all list Opus 4.7 as generally available, a rare simultaneous distribution milestone.
- Release cadence: Opus 4.7 follows Opus 4.6 by roughly two months, reinforcing Anthropic's publicly stated "frontier-on-a-cadence" cycle.
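At the published rates, per-request cost is easy to estimate. A minimal sketch — the request sizes in the example are illustrative, not figures from the announcement:

```python
def opus_cost_usd(input_tokens: int, output_tokens: int,
                  in_rate: float = 5.00, out_rate: float = 25.00) -> float:
    """Estimate one request's cost at the published $/1M-token rates."""
    return input_tokens / 1_000_000 * in_rate + output_tokens / 1_000_000 * out_rate

# e.g. a 100k-token context producing a 10k-token response
print(f"${opus_cost_usd(100_000, 10_000):.2f}")  # $0.75
```

The default rates match the Opus 4.6/4.7 pricing above; swap them out per model when comparing providers.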
What Developers Are Saying
Reaction split within hours of launch. On the main Hacker News thread, top-voted comments praised the coding improvements — particularly that Opus 4.7 resolves tasks neither Opus 4.6 nor Sonnet 4.6 could solve — and the expanded vision window. DEV.to engineer Gabriel Anhaia wrote after six hours of testing that Opus 4.7 "catches its own logical faults during the planning phase and accelerates execution, far beyond previous Claude models."
But a counter-narrative coalesced on Reddit and the Claude Discord within 48 hours. Multiple users pointed to a collapse on the MRCR long-context retrieval benchmark, where Opus 4.7 reportedly scores 32.2% versus Opus 4.6's 78.3%. Others noted that the new tokenizer consumes up to 35% more tokens for identical inputs — effectively a stealth price hike on workloads that bill by token. A Startup Fortune analysis summarized the community's case that Opus 4.7 may be "the pre-nerf build of 4.6 dressed in a higher model number." Engineers have also reported 4.7 flagging benign code as malware under the new cybersecurity guardrails and refusing routine edits that Opus 4.6 completed without incident. Anthropic has not publicly responded to the tokenizer or MRCR criticisms as of this writing.
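The "stealth price hike" claim is simple arithmetic: if the new tokenizer really emits up to 35% more tokens for identical text, the effective per-token rate rises by the same factor even though the sticker price is unchanged. A minimal sketch — the 35% figure is the community's reported worst case, not an Anthropic number:

```python
def effective_rate(sticker_rate: float, token_inflation: float) -> float:
    """Effective $/1M tokens of pre-change text after tokenizer inflation."""
    return sticker_rate * (1 + token_inflation)

# At the reported worst case, $5/1M input bills like $6.75/1M,
# and $25/1M output bills like $33.75/1M.
print(effective_rate(5.00, 0.35), effective_rate(25.00, 0.35))
```

Actual inflation will vary by content type; measuring it on your own corpus is the only reliable check.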
What This Means for Developers
For teams on Claude Code, Cursor, Cline, or any agentic pipeline that bills by token, three practical actions:
- Benchmark your own workload before defaulting to 4.7. The coding and vision gains are real, but the tokenizer change and MRCR regression mean long-context agents may see higher cost per task with worse retrieval fidelity. Run your eval harness, not just SWE-bench.
- Audit your cybersecurity-adjacent prompts. Opus 4.7's automatic blocking is stricter. Code that touches parsers, binaries, or network primitives may trigger refusals where 4.6 shipped a completion. Consider falling back to Opus 4.6 — still available — for security-tooling pipelines.
- Try the `xhigh` effort level. Anthropic recommends starting with `high` or `xhigh` for coding and agentic workloads. The new tier is the cleanest way to trade latency for reasoning depth without going straight to `max`.
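The "benchmark your own workload" advice can be made concrete with cost per resolved task rather than cost per token. The sketch below uses the CursorBench resolution rates cited above (58% for Opus 4.6, 70% for 4.7) and assumes the reported 35% token inflation; the dollar figures are illustrative:

```python
def cost_per_resolved_task(cost_per_attempt: float, resolution_rate: float) -> float:
    """Expected spend to get one successfully resolved task."""
    return cost_per_attempt / resolution_rate

opus_46 = cost_per_resolved_task(1.00, 0.58)  # $1.00/attempt at 58% resolution
opus_47 = cost_per_resolved_task(1.35, 0.70)  # +35% tokens/attempt at 70%
print(round(opus_46, 2), round(opus_47, 2))
```

Under these assumptions, 4.7's higher resolution rate does not fully offset the token inflation — which is exactly why your own eval harness, with your own prompts, should drive the decision.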
Claude Code users get an immediate practical win: the new `/ultrareview` command spawns dedicated code-review sessions with task budgets (public beta) that cap runaway token spend on long-running agent runs.
What's Next
Anthropic's "two-month cadence at the same price" promise implies an Opus 4.8 or Sonnet 4.7 around mid-June 2026. The company is also in public conversation about chip diversification (the Broadcom and Google TPU deal announced April 10 covers 3.5 GW of capacity from 2027) and is reportedly exploring its own custom silicon. Expect Anthropic to address the MRCR and tokenizer complaints in the coming days — either with a clarification on the model card or a point release — if the backlash sustains.
Sources
- Introducing Claude Opus 4.7 — Anthropic's official announcement with benchmark tables and availability details.
- VentureBeat: Anthropic releases Claude Opus 4.7 — analysis of the competitive positioning versus GPT-5.4 and Gemini 3.1 Pro.
- Hacker News discussion thread — developer reactions, long-context regression reports, tokenizer complaints.
- AI Business: Anthropic Releases Good but not Great Claude Opus 4.7 — skeptical take on the capability gap.
- DEV.to: Six hours with Claude Opus 4.7 — hands-on developer review by Gabriel Anhaia.
- Startup Fortune: Community backlash analysis — aggregated Reddit and Discord criticism of the 4.7 release.