Outperforming GPT-Realtime-2: Alibaba’s Large Voice Models Claim Top Spots in Three Categories

1 day ago

Author: Editor

On May 21, reports emerged that Alibaba’s large voice models, Fun-Realtime-ASR and Fun-Realtime-AudioChat, had taken first place in three metrics on Artificial Analysis, a globally recognized AI evaluation platform: ‘Listening Accuracy’ (measured by word error rate), ‘Listening Comprehension’ (speech reasoning capability), and ‘Conversational Fluency’ (dialogue smoothness). In these categories the models outperformed leading international counterparts, including GPT-Realtime-2.

Alibaba’s large voice models are now deployed in popular applications such as the QianWen App, Gaode Maps, and DingTalk, where they provide real-time speech-to-text conversion, voice interaction for navigation, and automated generation of meeting minutes.
