The Xiaomi MiMo large-model team has published an in-depth article on its official technical blog, disclosing for the first time the technical approach behind the permanent price reduction for the MiMo-V2.5 series API. The article describes five technical improvements in MiMo-V2.5: a KVCache dual-pool design combined with an SWA-aware prefix tree, GCache distributed caching, KVCache affinity scheduling, MTP acceleration in the Decode phase, and multimodal inference optimization. Together, these improvements allow the API to remain profitable after the price reduction. In addition, the 'Trillion-Token Creator Incentive Program', launched on April 28, has drawn more than 540,000 applicants in total and distributed 100 trillion free tokens, with an estimated value of more than RMB 65 million.
