Alibaba's Tongyi Qianwen Unveils Its Natively Omni-Modal Large Model, Qwen3-Omni
Author: Site editor

On September 26, 2025, Alibaba's Tongyi Qianwen officially announced Qwen3-Omni, its natively omni-modal large model. The model was pre-trained so that no modality's intelligence is sacrificed for another. Across a suite of 36 audio and audio-visual benchmarks, it achieved open-source state-of-the-art (SOTA) results on 32 and overall SOTA results on 22, surpassing closed-source models such as Gemini-2.5-Pro.

Built on a Thinker-Talker architecture, Qwen3-Omni offers a wide range of capabilities. It supports text-based interaction in 119 languages, speech understanding in 19 languages, and speech generation in 10 languages. In real-time conversation it delivers low latency: pure audio exchanges incur as little as 211 ms of delay, and video conversations just 507 ms. The model can also comprehend audio recordings up to 30 minutes long, making it a versatile tool for a variety of applications.