OpenAI launched its newest model, GPT-5.4, on March 5, 2026. Described by the company as its "most capable and efficient frontier model for professional work," GPT-5.4 brings together advanced reasoning, coding, and direct computer control into a single unified system — consolidating capabilities that were previously spread across separate models.
Where to Access GPT-5.4
GPT-5.4 is available across several platforms from launch day:
- ChatGPT (Plus, Team, Pro): Powers the new GPT-5.4 Thinking mode, which is replacing GPT-5.2 Thinking. Free users may also see GPT-5.4 when their queries are automatically routed to it.
- ChatGPT Enterprise and Edu: GPT-5.4 Pro is available for these tiers.
- OpenAI API: Developers can access both GPT-5.4 and the higher-performance GPT-5.4 Pro with priority processing for faster production throughput.
- Codex: GPT-5.4 is integrated into OpenAI's coding tool with native computer-use capabilities.
What Makes GPT-5.4 Different
The main story with GPT-5.4 is unification. OpenAI has merged the industry-leading coding capabilities of GPT-5.3-Codex with improved general reasoning and, for the first time in a general-purpose model, native computer-use abilities. The goal is a model that can handle complete professional workflows from start to finish, such as analyzing spreadsheets, building presentations, and running multi-step agentic tasks, with less back-and-forth from the user.
OpenAI also launched Tool Search alongside GPT-5.4 — a new API system that allows the model to look up tool definitions as needed rather than loading all tools upfront. This reduces token use, improves speed, and lowers costs in systems that rely on large numbers of tools.
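To make the idea of on-demand tool lookup concrete, here is a minimal sketch in plain Python. It does not use the OpenAI API and does not show how Tool Search is actually implemented. The registry, the tool names, the token counts, and the keyword-overlap ranking are all hypothetical, chosen only to show why sending a few matching definitions costs fewer tokens than sending every definition upfront.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolDef:
    name: str
    description: str
    schema_tokens: int  # hypothetical size of the full definition, in tokens


# Hypothetical registry; names and token counts are invented for illustration.
REGISTRY = [
    ToolDef("get_weather", "Look up the weather forecast for a city", 180),
    ToolDef("send_email", "Send an email message to a recipient", 220),
    ToolDef("query_sales_db", "Run a SQL query against the sales database", 350),
    ToolDef("create_invoice", "Create an invoice for a customer order", 300),
]


def search_tools(query: str, registry: list[ToolDef], limit: int = 2) -> list[ToolDef]:
    """Rank tools by how many query words appear in their description."""
    words = set(query.lower().split())
    scored = []
    for tool in registry:
        overlap = len(words & set(tool.description.lower().split()))
        if overlap:
            scored.append((overlap, tool))
    # Highest overlap first; ties keep registry order because sort is stable.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored[:limit]]


def definition_tokens(tools: list[ToolDef]) -> int:
    """Total tokens spent on tool definitions for one request."""
    return sum(tool.schema_tokens for tool in tools)


# Loading everything upfront versus looking up only what the query needs.
upfront = definition_tokens(REGISTRY)
on_demand = definition_tokens(search_tools("weather forecast in Paris", REGISTRY))
print(f"upfront: {upfront} tokens, on demand: {on_demand} tokens")
```

A production system would likely rank tools with something stronger than keyword overlap, such as embeddings, but the cost argument is the same: only the definitions that match the task are paid for.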
Additionally, GPT-5.4 now integrates directly with Microsoft Excel and Google Sheets through new ChatGPT plugins, enabling automated data analysis and task completion inside spreadsheets — a notable move toward enterprise workflow automation.
Smarter Conversations with GPT-5.4 Thinking
In ChatGPT, the new GPT-5.4 Thinking mode introduces an upfront reasoning plan at the start of each response. This lets you interrupt and redirect the model mid-response without forcing a complete restart — a meaningful change from earlier models where any course correction required starting over. The result is more precise outputs aligned with what you actually need, with fewer additional turns required.
Benchmark Performance
GPT-5.4 posts strong numbers across key industry benchmarks:
| Benchmark | GPT-5.4 | GPT-5.3-Codex | GPT-5.2 |
|---|---|---|---|
| GDPval (wins or ties) | 83.0% | 70.9% | 70.9% |
| SWE-Bench Pro | 57.7% | 56.8% | 55.6% |
| OSWorld-Verified | 75.0% | 74.0% | 47.3% |
| Toolathlon | 54.6% | 51.9% | 46.3% |
| BrowseComp | 82.7% | 77.3% | 65.8% |
- GDPval: This benchmark evaluates AI agent performance across 44 professional job categories in major U.S. industries. GPT-5.4 matched or outperformed human professionals in 83% of comparisons — up from 70.9% with GPT-5.2.
- BigLaw Bench: In legal document evaluation, GPT-5.4 scored 91%, according to Niko Grupen at Harvey.
- APEX-Agents: GPT-5.4 took the lead on Mercor's benchmark for professional skills in law and finance. Mercor CEO Brendan Foody noted that it excels at long-horizon deliverables such as slide decks, financial models, and legal analysis, while running faster and at lower cost than competing frontier models.
- BrowseComp: GPT-5.4 improved by about 17 percentage points over GPT-5.2 (82.7% versus 65.8%) on persistent web browsing tasks. GPT-5.4 Pro reaches 89.3% on this benchmark, a new state of the art.
Native Computer Use: A First for OpenAI's General Models
GPT-5.4 is OpenAI's first general-purpose model with native computer-use capabilities. It can interact directly with software by processing screenshots and issuing mouse and keyboard commands — meaning it can navigate desktops, browsers, and applications autonomously on a user's behalf.
- OSWorld-Verified: GPT-5.4 achieved a 75.0% success rate on desktop navigation tasks — surpassing the human benchmark of 72.4% and significantly ahead of GPT-5.2's 47.3%.
- WebArena-Verified: 67.3% success rate on browser-based tasks using both DOM- and screenshot-driven interaction.
- Online-Mind2Web: 92.8% success using screenshot-based observations alone.
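The screenshot-in, command-out cycle described above can be pictured as a simple observe-and-act loop. The sketch below is an illustration only, not OpenAI's interface: `FakeDesktop`, `Action`, and `scripted_policy` are hypothetical stand-ins, the "screenshot" is a small dictionary rather than pixels, and the policy is a hard-coded script where a real agent would call the model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    kind: str  # "type", "click", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


class FakeDesktop:
    """Hypothetical environment: one text field and one submit button."""

    SUBMIT_BUTTON = (100, 200)

    def __init__(self) -> None:
        self.field = ""
        self.submitted = False

    def screenshot(self) -> dict:
        # A real agent would receive an image; a dict keeps the sketch runnable.
        return {"field": self.field, "submitted": self.submitted}

    def apply(self, action: Action) -> None:
        if action.kind == "type":
            self.field += action.text
        elif action.kind == "click" and (action.x, action.y) == self.SUBMIT_BUTTON:
            self.submitted = True


def scripted_policy(observation: dict) -> Action:
    """Placeholder for the model's decision step (an assumption, not a real API)."""
    if not observation["field"]:
        return Action("type", text="hello")
    if not observation["submitted"]:
        return Action("click", x=100, y=200)
    return Action("done")


def run_agent(env: FakeDesktop, policy: Callable[[dict], Action], max_steps: int = 10) -> int:
    """Observe, decide, act, and repeat until the policy signals completion."""
    for step in range(1, max_steps + 1):
        action = policy(env.screenshot())
        if action.kind == "done":
            return step
        env.apply(action)
    raise RuntimeError("step budget exhausted before the task finished")


desktop = FakeDesktop()
steps_taken = run_agent(desktop, scripted_policy)
print(steps_taken, desktop.field, desktop.submitted)
```

The step budget matters in practice: an agent that never reaches a "done" state should fail loudly instead of clicking indefinitely.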
1 Million Token Context Window
Through the API, GPT-5.4 supports up to 1 million tokens of context — OpenAI's largest context window to date, bringing it in line with long-context models from Anthropic and Google. This enables agents to plan, execute, and verify tasks across extended workflows without losing context. Note that input beyond 272,000 tokens is charged at double the standard input rate.
More Accurate, More Efficient
OpenAI says GPT-5.4 is its most factually reliable model to date. Compared to GPT-5.2:
- Individual claims are 33% less likely to be false
- Full responses are 18% less likely to contain errors
- The model uses significantly fewer tokens to solve the same problems — on some tasks, 47% fewer tokens than previous models
That token efficiency matters in practice. Despite a slightly higher per-token price than GPT-5.2, the reduction in tokens required for many tasks offsets that cost for developers.
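A quick way to check that trade-off is to multiply the two effects together. The sketch below assumes cost scales linearly with token count and treats the price increase as a single multiplier. The 1.2x figure is a made-up example, since the article does not give GPT-5.2's price, and the 47% reduction is the best case quoted above, not a typical value.

```python
def relative_cost(token_reduction: float, price_multiplier: float) -> float:
    """New cost as a fraction of old cost, assuming cost = tokens * price."""
    if not 0.0 <= token_reduction < 1.0:
        raise ValueError("token_reduction must be in [0, 1)")
    return (1.0 - token_reduction) * price_multiplier


def break_even_multiplier(token_reduction: float) -> float:
    """Largest price multiplier at which the new model is no more expensive."""
    if not 0.0 <= token_reduction < 1.0:
        raise ValueError("token_reduction must be in [0, 1)")
    return 1.0 / (1.0 - token_reduction)


# Hypothetical scenario: 47% fewer tokens at a 1.2x per-token price.
ratio = relative_cost(token_reduction=0.47, price_multiplier=1.2)
ceiling = break_even_multiplier(0.47)
print(f"relative cost: {ratio:.3f}, break-even price multiplier: {ceiling:.2f}")
```

Under these assumptions, a task that needs 47% fewer tokens stays cheaper until the per-token price is almost 1.9 times higher. Tasks with smaller token savings have a correspondingly lower break-even point.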
Dod Fraser, CEO of Mainstay, shared a real-world example: GPT-5.4 achieved a 95% first-attempt success rate across approximately 30,000 property portals, completed tasks three times faster, and used 70% fewer tokens compared to older computer-use models.
Mario Rodriguez, Chief Product Officer at GitHub, added: "Developers don't just need a model that writes code. They need one that thinks through problems the way they do. We're seeing GPT-5.4 perform exceptionally well at logical reasoning and executing intricate, multi-step, tool-dependent workflows."
Frequently Asked Questions
Who Can Access GPT-5.4 Right Now?
GPT-5.4 Thinking is available today for ChatGPT Plus ($20/month), Team, and Pro ($200/month) subscribers. GPT-5.4 Pro is available for ChatGPT Enterprise and Edu users, and through the OpenAI API. Free users may see GPT-5.4 when queries are automatically routed to it.
What Is GPT-5.4 Replacing?
GPT-5.4 Thinking will replace GPT-5.2 Thinking for ChatGPT subscribers over the coming months, according to OpenAI's release notes.
How Does GPT-5.4 Compare to Claude and Gemini?
GPT-5.4 posts record scores on computer-use benchmarks such as OSWorld-Verified, where its 75.0% success rate exceeds the 72.4% human baseline. According to TechCrunch and Fortune, GPT-5.4 puts OpenAI into more direct competition with Anthropic and Google in the enterprise and agentic market. Independent platform rankings on Arena.ai and Artificial Analysis, however, still show models from Anthropic and Google ranking ahead of OpenAI overall.
What Is Tool Search?
Tool Search is a new API feature launching alongside GPT-5.4. Instead of loading all tool definitions upfront, the model can look up the tools it needs on demand. This reduces unnecessary token consumption, improves response speed, and lowers costs for developers building applications with large numbers of tools.
What Are the API Pricing Details?
Based on current OpenRouter listings, GPT-5.4 is priced at $2.50 per million input tokens and $20.00 per million output tokens. Input beyond 272,000 tokens in a session is charged at double the input rate. OpenAI direct billing and enterprise contract pricing may differ — check the official OpenAI pricing page for the most current figures.
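As a rough worked example, the listed rates can be turned into a small cost estimator. This is a sketch under stated assumptions: it uses the OpenRouter figures quoted above, and it reads the long-context rule as applying the doubled rate only to the input tokens past 272,000 in a single request. Actual billing may work differently, and the function and constant names are illustrative.

```python
INPUT_USD_PER_MILLION = 2.50    # from the OpenRouter listing quoted above
OUTPUT_USD_PER_MILLION = 20.00  # from the OpenRouter listing quoted above
LONG_CONTEXT_THRESHOLD = 272_000


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost, charging input past the threshold at double rate."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    base_input = min(input_tokens, LONG_CONTEXT_THRESHOLD)
    excess_input = max(input_tokens - LONG_CONTEXT_THRESHOLD, 0)
    # Assumption: only the excess portion is billed at twice the input rate.
    billable_input = base_input + 2 * excess_input
    input_cost = billable_input * INPUT_USD_PER_MILLION / 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_MILLION / 1_000_000
    return input_cost + output_cost


short_request = estimate_cost_usd(input_tokens=100_000, output_tokens=5_000)
long_request = estimate_cost_usd(input_tokens=400_000, output_tokens=5_000)
print(f"short: ${short_request:.2f}, long: ${long_request:.2f}")
```

With these numbers, a 100,000-token prompt with 5,000 output tokens comes to about $0.35, while a 400,000-token prompt with the same output comes to about $1.42, because 128,000 of its input tokens fall past the threshold.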