Xiaomi’s MiMo V2.5 Multimodal AI Slashes Tokens, Beats Benchmarks

Xiaomi rolls out MiMo V2.5 with multimodal AI and improved efficiency

Xiaomi has released its new MiMo-V2.5 line, which can process multiple types of data – like text, images, and audio – and represents a significant step forward in the company’s development of advanced artificial intelligence.

Summary

Xiaomi has launched the MiMo-V2.5 and MiMo-V2.5-Pro models, combining text, image, audio, and video capabilities into a single system.
MiMo-V2.5-Pro has delivered near top-tier benchmark results, resolving 57.2% of tasks on SWE-bench Pro while competing with leading AI models.
The company has priced the models lower while improving token efficiency, using up to 42% fewer tokens than comparable systems for similar performance.

Xiaomi’s new MiMo-V2.5 and MiMo-V2.5-Pro combine image, audio, and video processing into a single system. This means features previously found in separate models are now all included in one release.

What sets MiMo-V2.5 apart?

The previous MiMo-V2-Pro handled text and code well, but relied on a separate, less capable model for images, videos, and audio. MiMo-V2.5 combines all these abilities into one, so users can work with any type of content – text, code, images, videos, and audio – without needing to change tools.

In practice, users can upload photos for ideas, get help from video tutorials, and pull key takeaways from recorded meetings, all in one place.

Xiaomi says the new Pro version is a significant upgrade over its previous model, with much better performance in complex tasks, software development, and long-running projects. The company claims it now performs on par with leading systems such as Claude Opus 4.6 and GPT-5.4 on most coding and task-handling tests.

Performance, pricing, and positioning

The MiMo-V2.5-Pro is designed for complex and challenging jobs. According to Xiaomi, it can independently handle professional tasks requiring over 1,000 separate instructions – work that would typically take human experts days to complete.

The Pro model generates around 60 to 80 tokens per second and costs $1.00 per million input tokens and $3.00 per million output tokens. The MiMo-V2.5 base model is designed for common tasks, running faster at 100 to 150 tokens per second and costing less: $0.40 per million input tokens and $2.00 per million output tokens. Both models support a context window of up to one million tokens, making them suitable for working with extensive data or long conversations.
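To make the pricing concrete, here is a rough sketch of how a developer might estimate the cost and generation time of a single request. The prices and speeds are the figures quoted above; the dictionary keys and function names are illustrative labels, not official API identifiers.

```python
# Prices in USD per million tokens, as quoted in the article.
# The keys are hypothetical labels, not confirmed API model names.
PRICES = {
    "mimo-v2.5-pro": {"input": 1.00, "output": 3.00},
    "mimo-v2.5": {"input": 0.40, "output": 2.00},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def estimate_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Return the approximate generation time at a given output speed."""
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    # A full one-million-token prompt with a 100,000-token response.
    for model in PRICES:
        cost = estimate_cost(model, 1_000_000, 100_000)
        print(f"{model}: ${cost:.2f}")
    # Pro speed range quoted in the article: 60 to 80 tokens per second.
    slow = estimate_seconds(100_000, 60)
    fast = estimate_seconds(100_000, 80)
    print(f"Pro generation time: {fast:.0f} to {slow:.0f} seconds")
```

Under these figures, filling the entire one-million-token window on the Pro model and receiving a 100,000-token response would cost about $1.30. Actual bills depend on how the provider counts tokens, which this sketch does not model.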

In reported benchmarks, the Pro model places among the top performers. On SWE-bench Pro, it resolved 57.2% of the tasks – more than double the average of around 25%. It also showed strong results on τ3-bench and ClawEval, keeping pace with leading models. It fell behind on harder reasoning tests such as Humanity’s Last Exam, where it scored 48.0% compared with GPT-5.4’s 58.7%.

Efficiency is a key part of the pitch. Xiaomi claims its MiMo-V2.5-Pro model achieves results similar to Kimi K2.6 while using 42% fewer tokens. Even the standard model is said to use almost half the tokens of comparable systems. Because usage is billed per token, lower consumption translates directly into cost savings for developers working on large projects.
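The effect of the claimed token reduction on a bill can be sketched with simple arithmetic. The 42% figure is Xiaomi's own claim and the $3.00 price is the Pro output rate quoted earlier; the baseline workload of 10 million output tokens and the function names are assumptions chosen purely for illustration.

```python
def tokens_after_reduction(baseline_tokens: float, reduction: float = 0.42) -> float:
    """Tokens needed if a model uses `reduction` fewer tokens than a baseline."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return baseline_tokens * (1.0 - reduction)


def output_cost(tokens: float, price_per_million: float) -> float:
    """Cost in USD for a number of output tokens at a per-million price."""
    return tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    baseline = 10_000_000  # hypothetical workload on a comparison model
    reduced = tokens_after_reduction(baseline)  # Xiaomi's claimed 42% cut
    print(f"Tokens needed: {reduced:,.0f}")
    print(f"Cost at $3.00 per million: ${output_cost(reduced, 3.00):.2f}")
```

With these assumptions, the same job would need about 5.8 million output tokens. How that compares with a rival's bill also depends on the rival's per-token price, which the article does not give.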

Rapid rollout and ecosystem push

Xiaomi has been releasing new models at a steady pace. The company launched MiMo-V2-Flash in late 2025, followed in March by the V2-Pro, Omni, and TTS models. The V2.5 series arrived shortly after.

Xiaomi CEO Lei Jun revealed plans to invest $8.7 billion in artificial intelligence over the next three years, and the company appears to have already begun moving quickly on that commitment.

OpenRouter usage data gives a sense of adoption. Xiaomi’s AI models made up around 21% of traffic on OpenRouter in early April, and their use jumped over 42% in just one week. The increase came after the models became available for free through the Hermes AI tool, which helped more people discover and start using them.

Alongside the launch, Xiaomi updated its pricing. It eliminated the extra fees for using the full 1 million token context window and refreshed user credits. The models are currently accessible through the MiMo API, though access via AI Studio is still restricted.

Xiaomi announced that upcoming versions will emphasize improved problem-solving skills, better compatibility with other applications, and a stronger connection to real-world information, hinting that a new release could happen relatively soon.

2026-04-23 11:16
