Alibaba Unveils Two Voice Models for Custom Voice Roles and Realistic Soundscape Simulation
Author: Editor

Alibaba has released two voice models: Fun-CosyVoice3.5 and Fun-AudioGen-VD. Fun-CosyVoice3.5 is a voice cloning model that works from reference audio, while Fun-AudioGen-VD is a timbre design model that needs no reference audio at all. Both models support instruction following, which makes them suitable for a wide range of applications.

Fun-CosyVoice3.5 performs strongly on the Chinese 'difficult cases' subset of the Seed-TTS benchmark, significantly lowering error rates on uncommon characters and phrases. It also supports free-form instruction control, addressing common shortcomings of traditional voice cloning models.
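For readers unfamiliar with how such error rates are typically measured, here is a minimal sketch. The announcement does not state which metric was used; this example assumes character error rate (CER), a common choice for Chinese text, where the synthesized audio is transcribed by a speech recognizer and the transcript is compared with the input text. The function names and the sample strings are illustrative and do not come from the benchmark.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a character from ref
                curr[j - 1] + 1,     # insert a character into ref
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER = character-level edit distance / length of the reference text."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical example: one rare character is misread out of four.
reference = "魑魅魍魉"   # text given to the speech model
transcript = "魑魅网魉"  # what a recognizer heard in the synthesized audio
print(character_error_rate(reference, transcript))  # 0.25
```

A lower CER on rare characters means the synthesized speech is recognized as the intended text more often, which is the kind of improvement the announcement describes.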

Fun-AudioGen-VD, by contrast, focuses on 'creating something out of nothing': designing timbres from scratch rather than copying an existing voice. It supports personalized customization of timbre and emotion, as well as the simulation of complex soundscapes.

Edited by Yang Juanjuan; proofread by Chen Diyan.