Google Enhances Gemini 2.5 TTS Model to Boost Speech Expressiveness
2025-12-11
Author: Editor

On December 11, 2025, Google announced a substantial update to the Gemini 2.5 Flash and Pro text-to-speech (TTS) preview models. The update replaces the previous version, which was introduced in May.

The updated models improve on the previous version in three areas. First, speech is more expressive, allowing for more nuanced and engaging vocal delivery. Second, users have greater control over speech speed and can adjust the pace to suit specific needs. Third, consistency across multiple speakers is better: whether the content involves different virtual characters or several narrators, each voice stays uniform throughout.

The models support 24 languages while keeping character voices stable. Even when switching between languages or speakers, each character retains its distinctive vocal traits, giving it a consistent and recognizable auditory identity.

The updated models are already in practical use. They have been integrated into platforms such as Wondercraft, where they power multi-character dialogues in which different virtual characters hold natural, lifelike conversations. The director modes they offer also generate natural-sounding speech while giving users more creative control over the vocal output.
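
As an illustration of how a multi-character script with delivery and pacing notes might be prepared before being handed to a TTS model, here is a minimal Python sketch. The `Line` class, the `build_dialogue_prompt` function, and the prompt layout are all hypothetical and are not taken from Google's or Wondercraft's documentation. The sketch only assembles text and does not call any API.

```python
from dataclasses import dataclass


@dataclass
class Line:
    speaker: str         # character name, reused verbatim for every line by that character
    text: str            # what the character says
    direction: str = ""  # optional delivery note, e.g. "calm, slow pace"


def build_dialogue_prompt(scene: str, lines: list[Line]) -> str:
    """Render a scene note plus speaker-labelled lines as one plain-text prompt.

    The layout is an assumption made for illustration only.
    """
    parts = [f"Read the following dialogue. Scene: {scene}", ""]
    for line in lines:
        # Attach the delivery note in parentheses only when one was given.
        note = f" ({line.direction})" if line.direction else ""
        parts.append(f"{line.speaker}{note}: {line.text}")
    return "\n".join(parts)


script = [
    Line("Narrator", "The storm had passed by morning.", "calm, slow pace"),
    Line("Mira", "Did you hear that?", "whispering, quickly"),
    Line("Narrator", "No one answered."),
]
prompt = build_dialogue_prompt("a quiet cabin at dawn", script)
print(prompt)
```

Keeping speaker names and delivery notes in a structured form like this makes it easier to reuse the same character label across a long script, which is the kind of content where consistent voices matter most.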

For those who want to try the new TTS capabilities, Google provides two avenues: Google AI Studio and Playground. These platforms are particularly well suited to high-fidelity voice scenarios. In audiobook production, the enhanced expressiveness and consistent character voices can bring stories to life. In educational videos, clear and engaging speech can convey information more effectively. In marketing content, control over speech speed and consistent voices can help create more impactful and memorable messages.