Outperforming GPT-Realtime-2: Alibaba’s Large Voice Models Claim Top Spots in Three Categories
Author: Editor

On May 21, reports emerged that Alibaba’s large voice models, Fun-Realtime-ASR and Fun-Realtime-AudioChat, recently took first place on Artificial Analysis, a widely cited AI evaluation platform. The models outperformed leading international counterparts, including GPT-Realtime-2, and ranked first on three metrics: ‘Listening Accuracy’ (measured by word error rate), ‘Listening Comprehension’ (speech reasoning capability), and ‘Conversational Fluency’ (dialogue smoothness).

Alibaba’s large voice models are now widely deployed in popular applications such as the QianWen App, Gaode Maps, and DingTalk. There they power services including real-time speech-to-text conversion, intelligent navigation interaction, and automated generation of meeting minutes, with the aim of improving both efficiency and user experience.