GPT-5.4 Pro jumps to 150 IQ on Mensa Norway test as OpenAI breaks its own record


OpenAI’s newest GPT-5.4 Pro model has now achieved an IQ score higher than 99.96% of all human beings, giving markets a fresh signal that AI capability gains are starting to outpace the usual product-cycle noise.

OpenAI’s GPT-5.4 Pro touches 150 on public IQ benchmark as markets enter another macro-heavy week

TrackingAI’s public leaderboard now places OpenAI’s GPT-5.4 Pro at an IQ score of 150, a sharp step up from the 136 that OpenAI’s o3 posted on the Mensa Norway test last year.

The jump arrives at a moment when market attention has narrowed around Iran, energy, labor softness, and the next inflation print. That creates a different question for the week ahead: how quickly is machine intelligence compounding, and when will that acceleration begin to overlap with economic positioning?

Why this matters: A move from 136 to 150 on a widely understood benchmark compresses a complex capability shift into a simple signal. For businesses, that signal feeds directly into decisions around automation, software budgets, and headcount planning. For markets, it adds another variable alongside rates, inflation, and growth expectations.

OpenAI released GPT-5.4 as its most capable and efficient frontier model for professional work, with stronger coding, tool use, and computer use, and a context window of up to 1 million tokens. In the same release, OpenAI said GPT-5.4 achieved a new state of the art on GDPval and exceeded human performance on OSWorld-Verified.

These benchmarks are separate from a public IQ test, yet the direction of travel aligns. Capability is rising across separate measurement systems, and that rise is becoming fast enough to influence budgeting, hiring plans, workflow design, and software spend.

A score of 150 on a public IQ-style benchmark compresses a broader capability move into a single, portable signal. The number is easy to understand even before the methodology is debated.

The earlier o3 Mensa result established the benchmark and its limits. GPT-4.1’s one-million-token context window showed how OpenAI was extending model utility across long-horizon code and document tasks, while our analysis of OpenAI’s expanding capital loop linked model progress to hardware expansion, financing loops, and infrastructure demand.

Taken together, these developments place the latest IQ score within a broader commercial and economic context. A move from 136 to 150 on a public benchmark is striking on its own. A move from 136 to 150 while OpenAI is pushing deeper into tool use, computer use, enterprise productivity, and capital-intensive infrastructure carries broader implications.

Public IQ benchmarks are limited, but the capability curve is still moving higher

Public IQ-style tests remain imperfect instruments for measuring frontier models. TrackingAI runs a public Mensa-style benchmark and also maintains a harder private offline test.

IQ-style tests compress a narrow slice of cognitive performance into a single number, obscuring variation across reasoning types, context handling, creativity, and real-world problem-solving.

For AI and humans alike, scores are sensitive to test design, training exposure, and pattern familiarity, which makes them a noisy proxy for general capability.

An IQ of 150 sits at the extreme upper tail of the distribution, often associated with individuals such as Albert Einstein or Richard Feynman. In practical terms, it implies very fast abstraction, strong pattern recognition, and the ability to navigate complex, multi-step problems with limited guidance.
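To make the "upper tail" arithmetic concrete, here is a minimal sketch of how an IQ score maps to a population percentile. It assumes scores follow a normal distribution with mean 100 and standard deviation 15; that convention is an assumption here, not something the leaderboard's methodology is confirmed to use, and the function name `iq_percentile` is purely illustrative.

```python
import math


def iq_percentile(iq, mean=100.0, sd=15.0):
    """Return the fraction of the population scoring below `iq`.

    Assumes a normal distribution; mean=100 and sd=15 are a common
    IQ scaling convention, used here only as an assumption.
    """
    z = (iq - mean) / sd
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    for score in (136, 150):
        print(f"IQ {score}: above {iq_percentile(score):.2%} of the population")
```

Under these assumptions, a score of 150 lands at roughly the 99.96th percentile, in line with the figure quoted at the top of this article. A different standard deviation would shift the result, which is one reason the single number should be read with care.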

The platform reports scores as rolling averages across recent completions, and the methodology raises familiar questions around prompt structure, reproducibility, training-set contamination, and format familiarity. Those concerns were already visible when o3 reached 136, and they remain active now that GPT-5.4 Pro sits at 150.
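As a rough illustration of what a rolling average across recent completions means, the sketch below keeps only the most recent scores and averages them. The class name `RollingAverage`, the window size, and the sample scores are all hypothetical; the platform's actual window and aggregation rules are not described in this article.

```python
from collections import deque


class RollingAverage:
    """Average of the most recent `window` scores (hypothetical helper)."""

    def __init__(self, window):
        if window <= 0:
            raise ValueError("window must be positive")
        # deque with maxlen silently drops the oldest score when full.
        self._scores = deque(maxlen=window)

    def add(self, score):
        """Record a new test completion and return the updated average."""
        self._scores.append(score)
        return sum(self._scores) / len(self._scores)


if __name__ == "__main__":
    tracker = RollingAverage(window=3)  # window size is an assumption
    for score in (140, 150, 160, 150):  # made-up scores, not real results
        print(f"new score {score}: rolling average {tracker.add(score):.1f}")
```

Because older completions drop out of the window, a reported figure can move even when the model itself has not changed, which feeds into the reproducibility questions raised above.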

Related Reading

OpenAI’s o3 scores 136 on Mensa Norway test, surpassing 98% of human population

OpenAI’s o3 model reaches Mensa-level IQ in independent testing.

Apr 17, 2025 · Liam ‘Akiba’ Wright

Even with those limits, the broader pattern has become harder to dismiss. One isolated benchmark result can be explained away as a quirk. A cluster of gains across public IQ-style testing, coding, browser use, desktop navigation, and knowledge-work performance carries more analytical weight.

TrackingAI’s latest leaderboard places GPT-5.4 Pro at the top of its public IQ board, ahead of all Claude, Gemini, Qwen, and Grok models, offering an external, legible public benchmark that maps quickly onto the broader capability debate.

Few people need a detailed understanding of benchmark design to grasp that 150 sits in a rare range, and investors don’t need to accept every premise behind an IQ-style test to recognize that a jump of this size suggests acceleration rather than drift.

Chart titled “AI IQ Test Results” showing average Mensa Norway IQ scores for major AI models on a bell curve, with OpenAI’s GPT-5.4 variants plotted near the top end of the range.

Enterprise buyers also don’t need to believe that IQ equals general intelligence to see that systems with stronger pattern recognition, stronger tool use, and stronger long-horizon task handling are moving toward economically useful territory, extending far beyond puzzle-solving.

This points toward systems that can search, plan, verify, navigate, and produce real work across extended contexts. In that setting, the IQ score functions less as a novelty number and more as a signal of the density of frontier reasoning.

There’s also competitive value in the leaderboard itself. A leadership position on a public benchmark reinforces OpenAI’s standing in the race for visible capability leadership, especially at a moment when model differentiation is becoming harder to discern from architecture notes alone.

Benchmark leadership compresses complexity into a simple hierarchy. It offers developers a signal, enterprise buyers a narrative handle, and investors another proxy for where the capability frontier currently sits.

OpenAI’s benchmark climb is beginning to overlap with the economic week ahead

The week ahead still runs through macro. The Bureau of Labor Statistics calendar clearly lays out the next key releases: the FOMC minutes from the March 17 to 18 meeting, due on April 8; the March Consumer Price Index, due on April 10; and the March Producer Price Index, due on April 14.

That schedule keeps rates, inflation, and growth anxiety in the foreground, but beneath that surface, a second economic track is taking shape, and OpenAI sits near its center.

Capability growth in frontier AI increasingly intersects with capital allocation. A model that pushes higher on public reasoning tests while also improving in coding, search, and computer use changes how businesses think about workflow redesign. It changes what software buyers expect from copilots and agents. It changes how quickly enterprises move from experimentation toward deployment.

Jack Dorsey recently posted that Block is moving “from hierarchy to intelligence,” using AI to take over coordination work once handled by management layers as the company reorganizes around individual contributors, directly responsible individuals, and player-coaches.

Capability growth also changes which tasks can be carved out of labor cost structures and reassigned to software. These effects move through narrower channels first, including document workflows, spreadsheet workflows, customer support, research tasks, browser automation, internal operations, code generation, and verification loops.

OpenAI’s commercial path reinforces that interpretation. In its GPT-5.4 launch materials, the company described stronger performance in professional work, stronger tool search, native computer use, and gains in benchmarked knowledge work across occupations that map directly onto the U.S. economy.

That places AI capability growth inside a familiar market question: where spending flows next if these systems continue improving at this pace.

The answer extends beyond model subscription revenue into cloud demand, chips, data centers, networking, power, software licenses, and labor productivity assumptions. OpenAI’s expanding capital loop already reflects part of that structure, and the benchmark gain adds a simpler public-facing signal on top of it.

That overlap is what gives the latest result broader relevance during a macro-heavy week. Markets already know the CPI setup. Markets already know oil prices can feed into inflation expectations. Markets already know the Fed minutes will be parsed for policy tone.

But is the growth in intelligence itself beginning to behave like a macro variable? Faster capability gains can alter enterprise spending plans, tighten competitive pressure across white-collar functions, support higher infrastructure outlays, and strengthen the case for AI-linked capital expenditure even in a slower nominal growth environment.

When TrackingAI shows GPT-5.4 Pro at 150, the number lands in a market that already views OpenAI as more than a lab. It is a platform company, a deployment company, an infrastructure customer, and a signal generator for adjacent sectors.

The next test sits in two places at once. One is methodological: public IQ-style benchmarks will keep drawing scrutiny, and they should. The other is economic: markets will decide, step by step, whether capability jumps of this size should be priced alongside labor data, rate expectations, and capital spending trends.

OpenAI’s latest benchmark climb pushes that decision closer. The score is compact, legible, and easy to circulate. Its deeper relevance comes from the same place as the company’s broader product push: the frontier is still climbing, and the economic footprint of that climb is becoming harder to keep in a separate category.
