Doubao Audio Generation Model 1.0 Unveiled, Heralding the Advent of the 'Audio Director' Era
2 days ago
Author: Editor

Doubao Audio Generation Model 1.0 has officially launched, built on two core technologies: multimodal reference generation and long-duration timbre consistency. Users can generate complete audio from a single prompt that specifies character dialogue, emotional tone, background music, and ambient atmosphere. In long-form audio, the model keeps each character's voice consistent throughout. It also supports zero-shot multimodal audio creation, generating high-quality target audio from a text description or a reference audio clip without additional training. This enables deep decoupling of timbre and style, lets a single voice perform multiple roles, and significantly lowers the barrier to entry for professional audio production.