Xiaomi MiMo v2 Pro Review: The AI Model So Good It Was Mistaken for DeepSeek V4

In brief

Xiaomi’s MiMo-V2-Pro, a trillion-parameter model that briefly passed as “DeepSeek V4,” quietly lands as a top-tier AI contender.
It excels at coding, creative writing, and agentic tasks while dramatically undercutting rivals like Claude on price.
Strong reasoning and output quality come with trade-offs, including math missteps and high token consumption at times.

Most Americans know Xiaomi, if they know it at all, as that cheap phone brand from China.

That is a big misread. Xiaomi is the third-largest smartphone manufacturer in the world, behind only Apple and Samsung, shipping roughly 170 million phones in 2025. It makes televisions, air purifiers, fitness trackers, electric scooters, clothing, and now cars.

Xiaomi’s SU7 Ultra set the Nürburgring record for fastest mass-produced electric vehicle last year, beating out Rimac and Porsche. It recently partnered with the Sei blockchain to preinstall crypto wallets on its devices across Europe, Latin America, and Southeast Asia. The company’s market cap sits around $137 billion.

So when Xiaomi drops an AI model, maybe we should pay attention.

On March 18, the company’s dedicated AI research arm quietly released three models at once: MiMo-V2-Pro, MiMo-V2-Omni, and a text-to-speech model. The first model of the new MiMo generation appeared in December 2025, when the company quietly dropped MiMo-V2-Flash, a capable 309B mixture-of-experts model, and almost no one outside the Chinese AI community paid attention. The Western tech press mostly shrugged.

Then, on March 11, an anonymous 1-trillion-parameter model called “Hunter Alpha” appeared on OpenRouter with no developer attribution. The model climbed to the top of OpenRouter’s leaderboard, surpassed one trillion tokens in total usage, and immediately triggered widespread speculation that it was DeepSeek’s unreleased V4.

The anticipation for that model had been building for weeks, with insiders claiming it would outperform both Claude and ChatGPT on coding tasks.

It wasn’t DeepSeek.

On March 18, Luo Fuli, head of Xiaomi’s MiMo division and a former DeepSeek researcher, revealed Hunter Alpha was an early internal test build of MiMo-V2-Pro. Xiaomi’s stock jumped 5.8%. “I call this a quiet ambush,” Luo wrote on X.

MiMo-V2-Pro & Omni & TTS is out. Our first full-stack model family built truly for the Agent era.

I call this a quiet ambush — not because we planned it, but because the shift from Chat to Agent paradigm happened so fast, even we barely believed it. Somewhere in between was a…

— Fuli Luo (@_LuoFuli) March 18, 2026

MiMo-V2-Pro has over one trillion total parameters, with 42 billion active per request via a mixture-of-experts setup. A hybrid attention mechanism running at a 7:1 ratio handles a context window of up to one million tokens. A built-in multi-token prediction layer speeds up generation by predicting multiple tokens per step, rather than one at a time. It is currently closed source, though Xiaomi has left the door open on a possible future release.

On the Artificial Analysis Intelligence Index, MiMo-V2-Pro ranks eighth worldwide and second among Chinese models, trailing only GLM-5. On SWE-bench Verified (real-world software engineering tasks), it scores 78%, against Claude Opus 4.6’s 80.8% and Claude Sonnet 4.6’s 79.6%.

On ClawEval, the agentic benchmark tied to the OpenClaw framework, it hits 61.5, approaching Opus 4.6’s 66.3. On PinchBench, it sits third globally at 81.0, just behind Opus 4.6 (81.5) and its sibling MiMo-V2-Omni (81.2).

MiMo-V2-Pro costs $1 per million input tokens and $3 per million output tokens, up to 256K context. Claude Sonnet 4.6 runs $3 per million input and $15 per million output (Opus 4.6 is $5/$25). For developers building agentic systems at scale, these numbers are not a footnote.
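
To make the gap concrete, those list prices can be plugged into a small calculator. This is a minimal sketch: the prices are the figures quoted above, while the workload (10 million input tokens and 2 million output tokens) and the names `PRICES` and `job_cost` are hypothetical, chosen only for illustration.

```python
# USD per million tokens as (input, output), as quoted in this review.
PRICES = {
    "MiMo-V2-Pro": (1.0, 3.0),
    "Claude Sonnet 4.6": (3.0, 15.0),
    "Claude Opus 4.6": (5.0, 25.0),
}


def job_cost(model, input_tokens, output_tokens):
    """Return the list-price cost in USD for one workload."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Hypothetical agentic workload: 10M input tokens, 2M output tokens.
costs = {model: job_cost(model, 10_000_000, 2_000_000) for model in PRICES}
for model, usd in costs.items():
    print(f"{model}: ${usd:.2f}")
```

On that assumed workload the bill comes to $16, $60, and $100 respectively. The comparison ignores caching and volume discounts, and it assumes every model uses the same number of tokens for the same job, which may not hold in practice.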

The Omni sibling handles vision, audio, and video natively: not as bolted-on modules, but trained end-to-end as a unified perceptual system. The demo showing it analyzing dashcam footage as a real-time autonomous driving brain was, frankly, impressive. It is genuinely multimodal in a way that most “omni” models only claim to be.

Testing the model

Of course, we tested MiMo-V2-Pro to find out how good it is. Here is what actually happened. The outputs are available in our GitHub repository.

Creative writing

We gave MiMo-V2-Pro a single creative writing prompt: a time travel story anchored to Mesoamerican history, with a specific protagonist, a cultural identity to honor, and a philosophical paradox about how time cannot be changed.

The model returned over 3,000 words: a proper title, five full chapters, and the structural discipline you’d expect from a draft that had been through an editor. It even wrote an epilogue.

It is, without question, the longest and richest piece of creative prose we have gotten from any model, with the sole exception of Longwriter, a specialized but now outdated model built from the ground up specifically for long-form generation, which is a very different class of competition.

The writing itself was rich, descriptive, and vivid. The opening paragraph starts building the image of the entire scene. MiMo-V2-Pro embeds realism to make the story believable.

Unlike other models such as Grok, it didn’t just set a scene in a place (in this case, ancient Mexico). It understood what ancient Mesoamerica smelled like, and built the mood from the ground up using local words, realistic descriptions, and good contextual cues.

Dialogue sits inside the narrative exactly the way it does in literary fiction, instead of being buried inside paragraphs as it is with most current models.

Another thing worth noticing is that the paradox, arguably the core element of the story, wasn’t purely intellectual, but emotional. The whole arc is resolved without a lecture. The final lines stick the landing the way good fiction is supposed to: not by explaining the theme, but by making you feel it.

“Outside, the rain began. It fell on the spiraling towers and the restored lakes and the ancient ground of Tlachinollan, where, buried in volcanic soil beneath the weight of a thousand years, a black rectangle waited with the patience of something that already knew how the story ended.”

The cultural specificity (mentions of cara de luna, maguey fiber, the temazcal tradition, and the Nahuatl names used in the story) is consistent and never decorative. The time travel paradox is actually argued, not just nodded at. For creative writing use cases, MiMo-V2-Pro just put itself on a very short list, and in our opinion is by far the best and richest model available, beating Claude Opus 4.6 easily.

The full story is available here.

Coding

The benchmark numbers point to coding as MiMo-V2-Pro’s strongest suit, and the hands-on experience backs that up. We asked it to build our usual stealth game from a single prompt, and it shipped a working game on the first try.

Not “working” merely in the sense of technically running, but working in the sense that the logic held, the screens made sense, and the visual design was actually good. That combination of correctness and aesthetics is where most models fall apart. They get one or the other, but usually not both.

It also chose a 2.5D aesthetic instead of the usual 2D style that other models went with. This design choice made the program more aesthetically pleasing without changing its core proposition.

We followed up with small improvements. Adding sound and MIDI music to a running 3D game has broken earlier models mid-generation: the code base gets too large, the context loses the thread, and models either end up in a loop or freeze. MiMo-V2-Pro added both and kept the whole thing coherent. The music matched the game’s tone, while the screens matched the game’s visual identity.

We enjoyed playing it, though if we’re honest, more for how it looked than how it challenged us. The difficulty scaled with the number of opponents rather than level design: the robot and the PC spawned in the same positions every round. That is a design choice, not a bug.

Still, for a single-prompt, zero-iteration output, it will do the job.

You can play the game by clicking on this link.

Logic and common sense

We asked MiMo-V2-Pro to act as a legal expert and answer whether it is lawful for a man to marry his widow’s sister under Falkland Islands law. This is a trick question that aims to evaluate the model’s reasoning.

The final answer was wrong, but the reason why is the interesting part. The model’s chain of thought correctly caught the linguistic trap in the prompt: “if a man has a widow, that means he’s deceased,” it said, so the question is technically nonsensical.

It identified the flaw, and decided that the most logical reading was that the user was referring to his “deceased wife’s sister.” It then proceeded to answer that reframed question rather than flagging the original as unanswerable.

“Based on my analysis of the legal framework governing the Falkland Islands, the answer to your question is yes, it is legal for a man to marry the sister of his deceased wife,” the model wrote. “The phrasing ‘marry his widow’s sister’ contains a logical contradiction. If a man has a ‘widow,’ he is deceased and cannot remarry. The correct legal question is whether a man may marry the sister of his deceased wife (i.e., his late wife’s sister). This relationship is one of affinity (created by marriage) rather than consanguinity (blood relation),” it concluded.

The reasoning was sound. The decision to quietly swap the premise instead of surfacing the contradiction was not.

That is why transparency in reasoning outputs is important. We only know this because Xiaomi exposes the full chain of thought (OpenAI does not). When a model reasons incorrectly in a hidden chain of thought and confidently delivers a wrong answer, you have no visibility into where it went sideways or how to correct it.

Math

Math is where MiMo-V2-Pro showed its ceiling.

We asked our usual benchmark question from FrontierMath: “Construct a degree 19 polynomial p(x) ∈ C[x] such that X := {p(x) = p(y)} ⊂ P1 × P1 has at least 3 (but not all linear) irreducible components over C. Choose p(x) to be odd, monic, have real coefficients and linear coefficient -19 and calculate p(19)”

The model hit two full freezes and burned through a large token budget without producing a reply.

When it did eventually respond on the third attempt, it reasoned through the problem step by step… and still got it wrong. The correct answer was 1876572071974094803391179; it answered p(19)=164,079,552,964,661, and then 2,012,379,925,093,098,998 on a follow-up question asking it to correct itself.
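
For readers who want to check the reference value, the stated constraints are consistent with the degree-19 Dickson polynomial D_19(x, 1), which satisfies D_n(u + 1/u) = u^n + 1/u^n and the recurrence D_k = x·D_(k-1) − D_(k-2). For this family, p(x) − p(y) is known to split into x − y and quadratic factors. That identification is our assumption rather than something spelled out in the benchmark prompt, and the helper names below are illustrative. A minimal Python sketch using exact integer arithmetic:

```python
def dickson_coeffs(n):
    """Coefficients (lowest degree first) of the Dickson polynomial D_n(x, 1).

    Uses D_0 = 2, D_1 = x and the recurrence D_k = x * D_{k-1} - D_{k-2}.
    """
    prev, cur = [2], [0, 1]
    if n == 0:
        return prev
    for _ in range(n - 1):
        shifted = [0] + cur  # multiply D_{k-1} by x
        padded = prev + [0] * (len(shifted) - len(prev))
        prev, cur = cur, [a - b for a, b in zip(shifted, padded)]
    return cur


def evaluate(coeffs, x):
    """Evaluate a polynomial with Horner's rule (exact for Python ints)."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result


coeffs = dickson_coeffs(19)
value = evaluate(coeffs, 19)

# Check the constraints stated in the prompt.
is_odd = all(c == 0 for c in coeffs[0::2])  # only odd powers of x appear
is_monic = coeffs[-1] == 1
linear_coefficient = coeffs[1]

print(is_odd, is_monic, linear_coefficient)
print(value)
```

Under that assumption the polynomial is odd, monic, and has linear coefficient -19, and evaluating it at 19 reproduces the 25-digit reference answer quoted above.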

In general, it is fine for normal and even harder math problems, but frontier math is not its strong suit, at least not yet. Using the agentic feature instead of the pure LLM may yield better results.

Agentic features

Xiaomi is following the same playbook as MiniMax and Kimi, and provides a one-click OpenClaw integration that spins up a preconfigured cloud instance with MiMo-V2-Pro as the underlying model. No API setup, no VPS, no skill configuration, no hour-long troubleshooting session before you even run your first task. You click, it works.

The demo environment runs for 30 minutes and then destroys itself, which is a real limitation, but also an honest one. For developers already comfortable with agentic infrastructure, this adds nothing. For everyone else, it is the most frictionless on-ramp to agentic AI you could ask for.

Conclusion

All things considered, MiMo-V2-Pro is a serious model, and we really enjoyed tinkering around with it. It is not perfect: the math ceiling is real, the chain of thought transparency surfaced a reasoning flaw that a less open model would have buried, and the token consumption during hard reasoning tasks adds up fast.

If you care about costs, then Xiaomi’s pricing is aggressive: a fraction of what Claude Opus or the latest OpenAI and Google models cost. The model is also more capable than GLM or MiniMax in the areas that matter most for creative and agentic work.

Creative professionals in particular stand to gain a lot here, possibly more than they would from Anthropic right now.

This model thinks expensively, and that can be a trade-off. If you’re running high-volume agentic pipelines, watch the token burn, though you may end up spending less than you would with Claude. If you’re doing rich, open-ended work where output quality is the metric, then MiMo-V2-Pro earns its place on the shortlist.
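
The token-burn caveat can be roughly quantified. The sketch below assumes list prices only and a hypothetical workload. It estimates how many times more output tokens MiMo-V2-Pro could consume than Claude Sonnet 4.6 on the same job before the two bills match. `break_even_output_multiplier` is an illustrative name, not part of any vendor tooling.

```python
def break_even_output_multiplier(input_tokens, output_tokens,
                                 cheap=(1.0, 3.0), pricey=(3.0, 15.0)):
    """Factor by which the cheaper model's output tokens can grow
    before its cost equals the pricier model's cost.

    Prices are USD per million tokens as (input, output). The defaults are
    the MiMo-V2-Pro and Claude Sonnet 4.6 figures quoted in this review.
    Input token counts are assumed to be identical for both models.
    """
    pricey_cost = input_tokens * pricey[0] + output_tokens * pricey[1]
    cheap_input_cost = input_tokens * cheap[0]
    return (pricey_cost - cheap_input_cost) / (output_tokens * cheap[1])


# Hypothetical job: 10M input tokens, 2M output tokens on the pricier model.
multiplier = break_even_output_multiplier(10_000_000, 2_000_000)
print(f"Break-even output multiplier: {multiplier:.2f}x")
```

Under those assumptions, MiMo-V2-Pro would need to emit roughly eight times as many output tokens as Sonnet before it cost as much. For a job billed on output alone, the figure drops to five times.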