Anthropic Releases Claude Opus 4.7 — 3× SWE-bench Gains, Stealth Tokenizer Change, Mixed Reception (April 2026)
Anthropic shipped Claude Opus 4.7 on April 16, 2026, claiming 3× more SWE-bench Verified wins, a leap from 57.7% to 79.5% on visual navigation, and a new xhigh reasoning level — at unchanged prices. Developers are split: benchmark gains are real, but a new tokenizer reportedly burns up to 35% more tokens and long-context retrieval regressed.
Anthropic released Claude Opus 4.7 on April 16, 2026. The company calls it its most capable commercial model to date, claiming a 3× improvement on SWE-bench Verified over Opus 4.6, a jump from 57.7% to 79.5% on visual navigation, and a new intermediate `xhigh` reasoning effort level — all at unchanged prices of $5 per million input tokens and $25 per million output tokens. Within 72 hours, developer sentiment split sharply between benchmark believers and a vocal contingent on Hacker News and Reddit accusing Anthropic of shipping a repackaged Opus 4.6 with a tokenizer change that functions as a stealth price hike.
What Happened
Anthropic's announcement, titled "Introducing Claude Opus 4.7," positions the model as a notable upgrade "across coding, agents, vision, and multi-step tasks, with greater thoroughness and consistency on the work that matters most." The release hit every major distribution channel at once: Claude products (Pro, Max, Team, Enterprise), the Claude API, Amazon Bedrock, Google Cloud's Vertex AI, and Microsoft Foundry. No waitlist, no staged rollout.
The headline capability claims, sourced from Anthropic's own model card and launch post:
- SWE-bench Verified: 3× more production coding tasks resolved compared to Opus 4.6, with double-digit gains in code and test quality on Rakuten-SWE-Bench.
- CursorBench: 70% resolution rate, up from Opus 4.6's 58%.
- Vision: Accepts images up to 2,576 pixels on the long edge (~3.75 megapixels), "more than three times" the 1.15 megapixel cap on Opus 4.6. Visual-navigation score without tools rose from 57.7% to 79.5%.
- New `xhigh` effort level: Sits between `high` and `max` to give developers finer control over the reasoning-vs-latency tradeoff.
- Cybersecurity safeguards: First Claude model to ship with automated detection and blocking of requests flagged as prohibited or high-risk cybersecurity uses — a production outgrowth of Claude Mythos Preview and Project Glasswing.
- Task budgets (public beta) and a new `/ultrareview` command in Claude Code for dedicated code-review sessions.
Key Details
- Pricing unchanged: $5 / 1M input tokens, $25 / 1M output tokens — same as Opus 4.6. Anthropic emphasized delivering "frontier capability on a two-month cadence at the same price."
- Competitive positioning: Anthropic claims Opus 4.7 exceeds OpenAI's GPT-5.4 and Google's Gemini 3.1 Pro on agentic coding, scaled tool-use, agentic computer use, and financial analysis. GPT-5.4 still leads on agentic search (89.3% vs. Opus 4.7's 79.3%).
- Day-one availability: Anthropic API, Amazon Bedrock, Google Vertex AI, and Microsoft Foundry all list Opus 4.7 as generally available, a rare simultaneous distribution milestone.
- Release cadence: Opus 4.7 follows Opus 4.6 by roughly two months, reinforcing Anthropic's publicly stated "frontier-on-a-cadence" cycle.
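At the published rates, per-request cost is easy to estimate. A minimal sketch — the request sizes in the example are illustrative, not figures from the announcement:

```python
def opus_cost_usd(input_tokens: int, output_tokens: int,
                  in_rate: float = 5.00, out_rate: float = 25.00) -> float:
    """Estimate one request's cost at the published $/1M-token rates."""
    return input_tokens / 1_000_000 * in_rate + output_tokens / 1_000_000 * out_rate

# e.g. a 100k-token context producing a 10k-token response
print(f"${opus_cost_usd(100_000, 10_000):.2f}")  # $0.75
```

The default rates match the Opus 4.6/4.7 pricing above; swap them out per model when comparing providers.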
What Developers Are Saying
Reaction split within hours of launch. On the main Hacker News thread, top-voted comments praised the coding improvements — particularly that Opus 4.7 resolves tasks neither Opus 4.6 nor Sonnet 4.6 could solve — and the expanded vision window. DEV.to engineer Gabriel Anhaia wrote after six hours of testing that Opus 4.7 "catches its own logical faults during the planning phase and accelerates execution, far beyond previous Claude models."
But a counter-narrative coalesced on Reddit and the Claude Discord within 48 hours. Multiple users pointed to a collapse on the MRCR long-context retrieval benchmark, where Opus 4.7 reportedly scores 32.2% versus Opus 4.6's 78.3%. Others noted that the new tokenizer consumes up to 35% more tokens for identical inputs — effectively a stealth price hike on workloads that bill by token. A Startup Fortune analysis summarized the community's case that Opus 4.7 may be "the pre-nerf build of 4.6 dressed in a higher model number." Engineers have also reported 4.7 flagging benign code as malware under the new cybersecurity guardrails and refusing routine edits that Opus 4.6 completed without incident. Anthropic has not publicly responded to the tokenizer or MRCR criticisms as of this writing.
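The "stealth price hike" claim is simple arithmetic: if the new tokenizer really emits up to 35% more tokens for identical text, the effective per-token rate rises by the same factor even though the sticker price is unchanged. A minimal sketch — the 35% figure is the community's reported worst case, not an Anthropic number:

```python
def effective_rate(sticker_rate: float, token_inflation: float) -> float:
    """Effective $/1M tokens of pre-change text after tokenizer inflation."""
    return sticker_rate * (1 + token_inflation)

# At the reported worst case, $5/1M input bills like $6.75/1M,
# and $25/1M output bills like $33.75/1M.
print(effective_rate(5.00, 0.35), effective_rate(25.00, 0.35))
```

Actual inflation will vary by content type; measuring it on your own corpus is the only reliable check.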
What This Means for Developers
For teams on Claude Code, Cursor, Cline, or any agentic pipeline that bills by token, three practical actions:
- Benchmark your own workload before defaulting to 4.7. The coding and vision gains are real, but the tokenizer change and MRCR regression mean long-context agents may see higher cost per task with worse retrieval fidelity. Run your eval harness, not just SWE-bench.
- Audit your cybersecurity-adjacent prompts. Opus 4.7's automatic blocking is stricter. Code that touches parsers, binaries, or network primitives may trigger refusals where 4.6 shipped a completion. Consider falling back to Opus 4.6 — still available — for security-tooling pipelines.
- Try the `xhigh` effort level. Anthropic recommends starting with `high` or `xhigh` for coding and agentic workloads. The new tier is the cleanest way to trade latency for reasoning depth without going straight to `max`.
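The "benchmark your own workload" advice can be made concrete with cost per resolved task rather than cost per token. The sketch below uses the CursorBench resolution rates cited above (58% for Opus 4.6, 70% for 4.7) and assumes the reported 35% token inflation; the dollar figures are illustrative:

```python
def cost_per_resolved_task(cost_per_attempt: float, resolution_rate: float) -> float:
    """Expected spend to get one successfully resolved task."""
    return cost_per_attempt / resolution_rate

opus_46 = cost_per_resolved_task(1.00, 0.58)  # $1.00/attempt at 58% resolution
opus_47 = cost_per_resolved_task(1.35, 0.70)  # +35% tokens/attempt at 70%
print(round(opus_46, 2), round(opus_47, 2))
```

Under these assumptions, 4.7's higher resolution rate does not fully offset the token inflation — which is exactly why your own eval harness, with your own prompts, should drive the decision.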
Claude Code users get an immediate practical win: the new `/ultrareview` command spawns dedicated code-review sessions with task budgets (public beta) that cap runaway token spend on long-running agent runs.
What's Next
Anthropic's "two-month cadence at the same price" promise implies an Opus 4.8 or Sonnet 4.7 around mid-June 2026. The company is also in public conversation about chip diversification (the Broadcom and Google TPU deal announced April 10 covers 3.5 GW of capacity from 2027) and is reportedly exploring its own custom silicon. Expect Anthropic to address the MRCR and tokenizer complaints in the coming days — either with a clarification on the model card or a point release — if the backlash sustains.
Sources
- Introducing Claude Opus 4.7 — Anthropic's official announcement with benchmark tables and availability details.
- VentureBeat: Anthropic releases Claude Opus 4.7 — analysis of the competitive positioning versus GPT-5.4 and Gemini 3.1 Pro.
- Hacker News discussion thread — developer reactions, long-context regression reports, tokenizer complaints.
- AI Business: Anthropic Releases Good but not Great Claude Opus 4.7 — skeptical take on the capability gap.
- DEV.to: Six hours with Claude Opus 4.7 — hands-on developer review by Gabriel Anhaia.
- Startup Fortune: Community backlash analysis — aggregated Reddit and Discord criticism of the 4.7 release.