The Ant Bailing large-model team has announced the open-sourcing of two new, highly efficient reasoning models, Ring-flash-linear-2.0 and Ring-mini-linear-2.0, together with two self-developed high-performance fused operators. Through architectural optimization and close coordination between the models and these operators, the new releases substantially cut the cost of deep-reasoning workloads: inference costs are roughly one-tenth those of dense models of comparable scale, and more than 50% lower than those of the previous Ring series.
Because the training and inference engine operators are tuned to work in concert, both models remain among the top performers on multiple demanding reasoning benchmarks. They are now available on Hugging Face and ModelScope, giving developers efficient, ready-to-use tools for deep-reasoning applications.