GPT-5.5 launches alongside Codex's expansion into a broader agent workspace

OpenAI launched GPT-5.5 and simultaneously shipped substantial Codex upgrades, combining them in a way that Latent Space’s AINews coverage describes as “the critical and retroactively obvious choice to turn Codex into the base of its superapp strategy.” The dual launch, on April 22-23, 2026, came roughly a week after Anthropic’s Opus 4.7 release.

What GPT-5.5 actually is

GPT-5.5 was rolled out across ChatGPT and the Codex integration, with API access described as delayed pending additional safeguards, according to AINews. Community and official benchmark numbers, as reported in AINews, show stronger long-horizon execution, better computer-use behavior, and improved token efficiency rather than a clean across-the-board benchmark improvement.

Reported benchmarks include 82.7% on Terminal-Bench 2.0, 58.6% on SWE-Bench Pro, 84.9% on GDPval, 78.7% on OSWorld-Verified, 81.8% on CyberGym, 84.4% on BrowseComp, and 51.7% on FrontierMath Tier 1-3. Artificial Analysis described GPT-5.5 as leading or tying several headline evaluations and sitting on a new cost/performance frontier.

According to AINews, Artificial Analysis found that “GPT-5.5 (medium) scores the same as Claude Opus 4.7 (max) on our Intelligence Index at one quarter of the cost (~$1,200 vs $4,800) — although Gemini 3.1 Pro Preview scores the same at a cost of ~$900.” The reported API pricing is $5/$30 per million input/output tokens for GPT-5.5, and $30/$180 for the Pro variant.
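The "one quarter of the cost" claim can be checked directly from the quoted figures. The short sketch below uses only the approximate dollar amounts in the quote above; the dictionary and function names are illustrative, not part of any Artificial Analysis tooling.

```python
# Approximate costs (USD) to run the Intelligence Index, as quoted above.
index_cost_usd = {
    "GPT-5.5 (medium)": 1200,
    "Claude Opus 4.7 (max)": 4800,
    "Gemini 3.1 Pro Preview": 900,
}


def cost_ratio(model: str, baseline: str) -> float:
    """Return the cost of `model` as a fraction of the cost of `baseline`."""
    return index_cost_usd[model] / index_cost_usd[baseline]


# GPT-5.5 (medium) relative to Claude Opus 4.7 (max)
print(cost_ratio("GPT-5.5 (medium)", "Claude Opus 4.7 (max)"))   # 0.25
# Gemini 3.1 Pro Preview relative to GPT-5.5 (medium)
print(cost_ratio("Gemini 3.1 Pro Preview", "GPT-5.5 (medium)"))  # 0.75
```

On these numbers GPT-5.5 (medium) costs a quarter of what Opus 4.7 (max) does, while Gemini 3.1 Pro Preview reaches the same score for about three quarters of the GPT-5.5 figure.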

The model was described in AINews as co-designed for NVIDIA GB200/300 systems. The API context window is 1 million tokens, and token use per task was reported as lower than with GPT-5.4, according to AINews.

AINews reports early user reactions as mixed: multiple users described the model as more human-sounding and less formal, and better suited to persistent agent workflows, but also more exploratory and in need of tighter instructions to stay on track in some use cases.

Codex as superapp

AINews describes the Codex upgrades as potentially the more significant part of the launch. OpenAI shipped browser control, Sheets and Slides editing, Docs and PDF handling, OS-wide dictation, and auto-review mode. Codex can now interact with web applications, click through flows, capture screenshots, and iterate until task completion. Auto-review uses what OpenAI calls a secondary “guardian” agent to reduce the number of approvals required on longer runs.
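To make the auto-review idea concrete, here is a minimal sketch of a reviewer that sits between an agent and the human approver. It is not OpenAI's implementation: the coverage does not describe how the guardian works, so every name here (Action, guardian_review, RISKY_TOOLS, run_plan) is hypothetical, and a fixed rule set stands in for what would presumably be a second model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Tuple


class Verdict(Enum):
    ALLOW = "allow"        # safe to run without asking the user
    ESCALATE = "escalate"  # needs explicit human approval


@dataclass(frozen=True)
class Action:
    tool: str
    argument: str


# Hypothetical policy: tools considered risky enough to need a human.
RISKY_TOOLS = {"shell", "payment", "delete_file"}


def guardian_review(action: Action) -> Verdict:
    """Stand-in for a secondary reviewer agent (here: a simple rule)."""
    return Verdict.ESCALATE if action.tool in RISKY_TOOLS else Verdict.ALLOW


def run_plan(
    plan: List[Action], ask_human: Callable[[Action], bool]
) -> Tuple[List[Action], int]:
    """Run a plan, asking the human only when the guardian escalates.

    Returns the actions that were executed and the number of approval prompts.
    """
    executed: List[Action] = []
    prompts = 0
    for action in plan:
        if guardian_review(action) is Verdict.ESCALATE:
            prompts += 1
            if not ask_human(action):
                continue  # human declined; skip this action
        executed.append(action)
    return executed, prompts


plan = [
    Action("browser_click", "#submit"),
    Action("screenshot", "checkout page"),
    Action("shell", "rm -rf build/"),
    Action("edit_sheet", "B2=SUM(A:A)"),
]
executed, prompts = run_plan(plan, ask_human=lambda a: True)
print(len(executed), prompts)  # 4 1
```

Without the reviewer, a conservative harness would prompt on all four steps; with it, only the shell command reaches the human, which is the kind of reduction in approvals the launch notes describe.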

The effect, according to user reports in AINews, is that Codex is expanding from a coding-focused tool into a broader computer-work agent covering QA, spreadsheets, presentations, app building, research loops, and overnight experimental runs. The update also absorbs the now-defunct Prism (noted as “RIP” in the coverage), consolidating OpenAI’s agent surface under the Codex umbrella.

The AINews framing is direct: this is a superapp strategy. Codex becomes the base on which OpenAI stacks computer-use, coding, document editing, and agentic orchestration. Rather than maintaining separate products for different surfaces, the functionality is merging into a single interface with a foundation model underneath.

DeepSeek V4 as same-day competition

DeepSeek released DeepSeek-V4 Preview within hours of the GPT-5.5 announcement, which AINews notes was immediately framed by the community as a competitive response. V4-Pro has 1.6 trillion total parameters with 49 billion active; V4-Flash has 284 billion total with 13 billion active. Both have a 1-million-token context window, support thinking and non-thinking modes, and are released under the MIT license.
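The gap between total and active parameters implies sparse activation, where only a small share of the weights is used per token. A quick calculation from the reported figures shows how small that share is; the variable names are illustrative.

```python
# Reported parameter counts, in billions: (total, active per token).
models = {
    "V4-Pro": (1600, 49),
    "V4-Flash": (284, 13),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of parameters active per token."""
    return active_b / total_b


for name, (total_b, active_b) in models.items():
    print(f"{name}: {active_fraction(total_b, active_b):.1%} active")
# V4-Pro: 3.1% active
# V4-Flash: 4.6% active
```

Per-token compute scales roughly with the active count rather than the total, which helps explain how a 1.6-trillion-parameter model can be priced so far below dense frontier models.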

V4-Flash is priced at $0.14/$0.28 and V4-Pro at $1.74/$3.48 per million input/output tokens. Community commentary in AINews highlighted Flash as potentially the more disruptive product if serving quality holds at those costs, given the combination of very low cost, 1 million token context, and open weights. DeepSeek noted that V4-Pro throughput is currently limited by compute constraints, with future Ascend 950 availability expected to enable further price reductions.
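To put the price gap in concrete terms, the sketch below applies the reported list prices for all four models to one workload. The workload size (2 million input tokens, 500,000 output tokens) is an arbitrary assumption, and the calculation ignores caching discounts and the fact that different models use different numbers of tokens for the same task.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    input_per_m: float   # USD per million input tokens
    output_per_m: float  # USD per million output tokens


# List prices as reported in the coverage.
PRICES = {
    "GPT-5.5": Price(5.00, 30.00),
    "GPT-5.5 Pro": Price(30.00, 180.00),
    "DeepSeek V4-Pro": Price(1.74, 3.48),
    "DeepSeek V4-Flash": Price(0.14, 0.28),
}


def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """List-price cost in USD for one job on the given model."""
    p = PRICES[model]
    return (input_tokens * p.input_per_m + output_tokens * p.output_per_m) / 1_000_000


# Hypothetical workload: 2M input tokens, 500k output tokens.
for name in PRICES:
    print(f"{name:18s} ${job_cost(name, 2_000_000, 500_000):.2f}")
# GPT-5.5            $25.00
# GPT-5.5 Pro        $150.00
# DeepSeek V4-Pro    $5.22
# DeepSeek V4-Flash  $0.42
```

At list prices the same job costs about 60 times more on GPT-5.5 than on V4-Flash, which is why commenters singled out Flash, provided serving quality holds.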

The technical report accompanying the release described two new compressed hybrid attention mechanisms, Muon-based training, FP4 quantization-aware training, and pretraining on roughly 32 trillion tokens. The strongest community discussion centered on efficiency at the 1-million-token context length: roughly 4x better compute efficiency and order-of-magnitude KV-cache reductions relative to earlier DeepSeek architectures.
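For a sense of why KV-cache size dominates the long-context discussion, the sketch below applies the standard formula for an uncompressed attention cache. The layer count, head count, and head dimension are hypothetical placeholders, not DeepSeek's published configuration, and the "order of magnitude" reduction is modeled simply as dividing by ten.

```python
def kv_cache_bytes(
    layers: int,
    kv_heads: int,
    head_dim: int,
    seq_len: int,
    bytes_per_value: int = 2,  # 2 bytes for FP16/BF16
) -> int:
    """Uncompressed KV-cache size for one sequence.

    The leading factor of 2 accounts for storing both keys and values.
    """
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical model shape, chosen only for illustration.
baseline = kv_cache_bytes(layers=60, kv_heads=8, head_dim=128, seq_len=1_000_000)
reduced = baseline // 10  # an order-of-magnitude reduction

print(f"baseline: {baseline / 1e9:.1f} GB per sequence")  # baseline: 245.8 GB per sequence
print(f"reduced:  {reduced / 1e9:.1f} GB per sequence")   # reduced:  24.6 GB per sequence
```

Under these assumed dimensions, a single 1-million-token sequence would need more cache than one accelerator holds, while a tenfold reduction brings it within reach of a single device.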

Agent infrastructure moving to systems problems

A recurring theme in AINews’ broader coverage of this period is that production agent work has become an infrastructure problem, not just a model problem. Several posts emphasized memory, orchestration, harness engineering, and evals as the practical bottlenecks. A writeup on stateless decision memory for enterprise agents proposed replacing mutable per-agent state with immutable decision logs and event sourcing for better horizontal scalability, auditability, and fault tolerance.
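The decision-log proposal follows the general event-sourcing pattern: agents never mutate shared state, they append immutable records, and any worker can rebuild the current state by replaying the log. The sketch below illustrates that pattern only. The class and field names are hypothetical and not taken from the writeup, and an in-memory list stands in for a durable log store.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Decision:
    """One immutable entry in the decision log."""
    seq: int
    agent_id: str
    key: str
    value: str


class DecisionLog:
    """Append-only log; entries are never updated or deleted."""

    def __init__(self) -> None:
        self._events: List[Decision] = []

    def append(self, agent_id: str, key: str, value: str) -> Decision:
        event = Decision(len(self._events), agent_id, key, value)
        self._events.append(event)
        return event

    def events(self) -> Tuple[Decision, ...]:
        return tuple(self._events)  # read-only snapshot


def replay(events: Iterable[Decision], upto: Optional[int] = None) -> Dict[str, str]:
    """Rebuild state by folding over events, optionally stopping at `upto`."""
    state: Dict[str, str] = {}
    for event in events:
        if upto is not None and event.seq > upto:
            break
        state[event.key] = event.value
    return state


log = DecisionLog()
log.append("agent-a", "refund_policy", "manual_review")
log.append("agent-b", "ticket_owner", "billing")
log.append("agent-a", "refund_policy", "auto_approve")

print(replay(log.events()))          # latest state
print(replay(log.events(), upto=1))  # state as of the second decision
```

Because workers hold no state of their own, any replica can serve a request after replaying the log, and the log itself doubles as an audit trail of who decided what and when.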

Sakana AI launched Fugu, a multi-agent orchestration API that dynamically selects and coordinates frontier models. Cua open-sourced a macOS driver for letting agents control arbitrary applications in the background with multi-player support. LangChain continued expanding LangSmith Fleet with file editing and presentation generation.

AINews frames the GPT-5.5 and Codex launches as moves on a product map that is being contested by multiple parties, with OpenAI consolidating computer-use, coding, document editing, and agentic orchestration under the Codex umbrella.