OpenAI releases GPT-5.5 with faster agentic coding and fewer tokens per task

OpenAI on Thursday released GPT-5.5, a new model the company says achieves higher scores than GPT-5.4 on coding and agentic benchmarks while matching its predecessor’s per-token latency and using fewer tokens to complete the same tasks.

GPT-5.5 succeeds GPT-5.4, which was released last month. According to OpenAI’s announcement, the model targets agentic work: tasks that require a model to plan, use tools, navigate ambiguity, and execute over multiple steps rather than respond to a single prompt. OpenAI describes the release as “the next step toward a new way of getting work done on a computer.”

The company released benchmark comparisons against GPT-5.4, GPT-5.5 Pro, Claude Opus 4.7, and Gemini 3.1 Pro. On Terminal-Bench 2.0, which OpenAI describes as testing complex command-line workflows requiring planning, iteration, and tool coordination, GPT-5.5 scored 82.7% against GPT-5.4’s 75.1%. On Expert-SWE — OpenAI’s internal evaluation for long-horizon coding tasks with a median estimated human completion time of 20 hours — GPT-5.5 outperformed GPT-5.4. On SWE-Bench Pro, which evaluates real-world GitHub issue resolution, GPT-5.5 reached 58.6%. OpenAI said the model “improves on GPT-5.4’s scores while using fewer tokens” across all three coding evaluations.

On OSWorld-Verified, which tests computer use, GPT-5.5 scored 78.7% against GPT-5.4’s 75.0% and Claude Opus 4.7’s 78.0%. On BrowseComp, a web research benchmark, GPT-5.5 scored 84.4% and GPT-5.5 Pro scored 90.1%. FrontierMath Tier 1–3 showed GPT-5.5 at 51.7% against GPT-5.4’s 47.6% and Gemini 3.1 Pro’s 36.9%.
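For readers who want the size of each gain at a glance, the reported scores can be turned into percentage-point differences. The sketch below is illustrative only: it uses the three benchmarks for which the article gives figures for both GPT-5.5 and GPT-5.4, and the helper name `point_gains` is a hypothetical label, not anything from OpenAI’s announcement.

```python
# Scores (percent) as reported in the article. Only benchmarks where both
# GPT-5.5 and GPT-5.4 figures are given are included.
scores = {
    "Terminal-Bench 2.0": {"GPT-5.5": 82.7, "GPT-5.4": 75.1},
    "OSWorld-Verified": {"GPT-5.5": 78.7, "GPT-5.4": 75.0},
    "FrontierMath Tier 1-3": {"GPT-5.5": 51.7, "GPT-5.4": 47.6},
}


def point_gains(table, new="GPT-5.5", old="GPT-5.4"):
    """Return the percentage-point gain of `new` over `old` per benchmark."""
    # Round to one decimal place to match the precision of the reported scores.
    return {name: round(row[new] - row[old], 1) for name, row in table.items()}


gains = point_gains(scores)
for name, gain in gains.items():
    print(f"{name}: +{gain} points")
```

These are differences in percentage points, not relative improvements, and they say nothing about token usage or cost, which the announcement describes separately.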

OpenAI said GPT-5.5 “matches GPT-5.4 per-token latency in real-world serving, while performing at a much higher level of intelligence.” The company also cited Artificial Analysis’s Coding Index, saying GPT-5.5 “delivers state-of-the-art intelligence at half the cost of competitive frontier coding models.”

The announcement includes attributed accounts from early testers. Dan Shipper, founder and CEO of Every, is quoted saying GPT-5.5 is “the first coding model I’ve used that has serious conceptual clarity.” He described a test in which he asked GPT-5.5 to replicate a rewrite that a senior engineer had needed days to produce; GPT-5.4 could not do it, and GPT-5.5 could. Pietro Schirano, CEO of MagicPath, said the model merged a branch with hundreds of frontend and refactor changes into a substantially changed main branch, “resolving the work in one shot in about 20 minutes.”

GPT-5.5 is rolling out to Plus, Pro, Business, and Enterprise users in ChatGPT and Codex. GPT-5.5 Pro is rolling out to Pro, Business, and Enterprise users in ChatGPT. OpenAI said it would bring both variants to the API after working with partners and customers on “the safety and security requirements for serving it at scale.” On April 24, the company updated the announcement to note that GPT-5.5 and GPT-5.5 Pro are now available in the API, and that the system card has been updated to describe additional safeguards.

OpenAI said the model was evaluated across its “full suite of safety and preparedness frameworks,” with internal and external red-teamers and targeted testing for advanced cybersecurity and biology capabilities. The company said it collected feedback from “nearly 200 trusted early-access partners” before release.

The announcement also noted that GPT-5.5 delivers gains “especially in agentic coding, computer use, knowledge work, and early scientific research — areas where progress depends on reasoning across context and taking action over time.” OpenAI did not specify timelines for broader feature rollouts.