OpenAI Launches GPT-5.5 With State-of-the-Art Agentic Coding — and Doubles API Prices (April 2026)
OpenAI on April 23, 2026 released GPT-5.5 — its first fully retrained base model since GPT-4.5 — claiming a state-of-the-art 82.7% on Terminal-Bench 2.0 and 58.6% on SWE-Bench Pro, while raising API prices to $5/$30 per million tokens, double the GPT-5.4 rate.
OpenAI on April 23, 2026 released GPT-5.5, calling it the company's "smartest and most intuitive" model yet and claiming a state-of-the-art 82.7% accuracy on Terminal-Bench 2.0, the benchmark that measures multi-step command-line work. The model launched in the API the same day at $5 per million input tokens and $30 per million output tokens, roughly double the GPT-5.4 rate and the largest single-release price hike in the GPT-5.x series.
What Happened
OpenAI published the launch post "Introducing GPT-5.5" on April 23 and rolled the model into ChatGPT Plus, Pro, Business and Enterprise, into Codex, and into the API on the same day. According to the launch post, GPT-5.5 is the first fully retrained base model since GPT-4.5, which the company says explains the unusually large quality jump for what looks like a minor version bump from GPT-5.4.
A separate variant, GPT-5.5 Pro, also shipped to Pro, Business and Enterprise users in ChatGPT and to the API at $30/$180 per million input/output tokens. NVIDIA said in a parallel announcement that GPT-5.5 powers Codex on its infrastructure and that NVIDIA has already begun deploying the model internally for engineering work.
Key Details
- Terminal-Bench 2.0: 82.7% — state-of-the-art on the agentic coding benchmark, narrowly beating Anthropic's Claude Mythos Preview, according to VentureBeat.
- SWE-Bench Pro: 58.6% — solving more real-world GitHub issues end-to-end in a single pass than any previous OpenAI model.
- 1M-token context window with a fully retrained base, marketed as a "new class of intelligence" by OpenAI in its launch materials.
- API pricing: $5 / $30 per million input/output tokens for GPT-5.5 and $30 / $180 for GPT-5.5 Pro — about a 2× increase versus GPT-5.4.
- Latency parity with GPT-5.4 in real-world serving despite the larger model, according to OpenAI's serving notes.
- Hallucination caveat: Artificial Analysis's AA-Omniscience benchmark recorded GPT-5.5 (xhigh) at 57% accuracy, the highest score the lab has recorded, but also at an 86% hallucination rate, likewise the highest it has measured.
What Developers and Users Are Saying
The reaction has been measured rather than euphoric. Writing in his daily weblog, Simon Willison flagged that GPT-5.5 "will cost roughly twice as much as its predecessor once it reaches the API," arguing that GPT-5.4 may have a longer shelf life as a cheaper default for high-volume agent workloads. The recurring theme on Hacker News and on /r/LocalLLaMA is that the price doubling, not the benchmark gains, is the headline — and that smaller open-weights models like DeepSeek's V4-Pro (released the next day) make the gap on price look even wider.
On the positive side, early Codex users praised GPT-5.5 for what The New Stack called "Mythos-like hacking, open to all": the model is notably better than GPT-5.4 at long-horizon engineering work, holding context across large systems, and reasoning through ambiguous failures. NVIDIA's internal testing reportedly showed measurable gains on real engineering tasks, not just on synthetic benchmarks.
The hallucination jump is the dominant complaint. Multiple developers on X pointed at the AA-Omniscience numbers and called the 86% hallucination rate at the highest reasoning setting "a real regression for retrieval and research workflows" — exactly the use cases OpenAI is positioning the model for.
What This Means for Developers
For anyone building on the OpenAI API, the math changes today. At $5 / $30 per million tokens, agent workloads that loop through tool calls and re-read context — exactly the workloads GPT-5.5 is best at — will see API bills roughly double overnight. Teams that have been autoscaling GPT-5.4 in production should benchmark GPT-5.5 on their own evals before swapping defaults; the benchmark gains are real, but so is the price hike, and the hallucination behavior at the highest reasoning setting needs guardrails.
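To make the billing arithmetic concrete, here is a rough Python sketch of how the cost of an agent loop could be estimated at the published rates. The GPT-5.5 and GPT-5.5 Pro prices come from the launch figures above. The GPT-5.4 rate is an assumption, set to half the GPT-5.5 rate because the article only says the new price is roughly double. The helper names and the token counts are hypothetical, and the model ignores prompt caching or batch discounts, so treat the output as an upper-bound estimate rather than a bill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per million tokens."""
    input_per_million: float
    output_per_million: float


# Published launch prices quoted in this article.
GPT_5_5 = Pricing(5.00, 30.00)
GPT_5_5_PRO = Pricing(30.00, 180.00)
# ASSUMPTION: the article only says GPT-5.5 is "roughly double" GPT-5.4,
# so this is simply half the GPT-5.5 rate, not a confirmed price.
GPT_5_4_ASSUMED = Pricing(2.50, 15.00)


def run_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a given number of input and output tokens."""
    return (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    ) / 1_000_000


def agent_run_tokens(steps: int, base_context: int,
                     tool_tokens_per_step: int, output_per_step: int) -> tuple[int, int]:
    """Total (input, output) tokens for a loop that re-reads its context.

    Each step sends the whole accumulated context as input, then appends
    that step's output and tool result to the context for the next step.
    """
    input_total = 0
    context = base_context
    for _ in range(steps):
        input_total += context
        context += output_per_step + tool_tokens_per_step
    return input_total, steps * output_per_step


if __name__ == "__main__":
    # Hypothetical workload: 20 tool-calling steps on an 8k-token starting context.
    tokens_in, tokens_out = agent_run_tokens(20, 8_000, 1_500, 500)
    for name, pricing in [("GPT-5.4 (assumed)", GPT_5_4_ASSUMED),
                          ("GPT-5.5", GPT_5_5),
                          ("GPT-5.5 Pro", GPT_5_5_PRO)]:
        print(f"{name}: ${run_cost(pricing, tokens_in, tokens_out):.4f} per run")
```

Because the loop re-sends its growing context on every step, input tokens grow roughly quadratically with the number of steps, which is why long-horizon agent runs feel a per-token price change more sharply than single-shot calls do.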
Codex users get the model automatically as part of their existing subscription. ChatGPT Plus users see GPT-5.5 in the model picker now; Pro, Business and Enterprise users also get GPT-5.5 Pro. No client changes are required, but the launch post warns that the model is now "more willing to confabulate when it doesn't know," so retrieval-grounded prompting matters more than before.
What's Next
OpenAI's launch post says the gains in GPT-5.5 are "especially strong in agentic coding, computer use, knowledge work and early scientific research" and signals that the next round of Codex updates will lean on the new model. A GPT-5.5-Codex variant is expected within weeks, following the company's recent cadence of shipping a coding-tuned variant 4–6 weeks after each base-model release. Anthropic's still-unreleased Claude Mythos and DeepSeek's V4-Pro give OpenAI an unusually crowded competitive backdrop heading into Q2 2026.
Sources
- OpenAI — "Introducing GPT-5.5" — primary launch post, benchmarks and pricing.
- NVIDIA Blog — GPT-5.5 powers Codex on NVIDIA infrastructure
- VentureBeat — Terminal-Bench 2.0 comparison vs. Claude Mythos Preview
- The Decoder — "new class of intelligence at double the API price"
- The New Stack — Industry reaction to GPT-5.5
- Interesting Engineering — 82.7% Terminal-Bench score deep-dive
- BigGo Finance — Pricing and competitive context vs. Anthropic's Mythos