On January 22, 2026, Alibaba's Qwen team announced the open-source release of Qwen3-TTS, a full series of multi-codebook speech generation models. The series comes in two sizes: 1.7B and 0.6B. The 1.7B version targets scenarios that demand the highest performance, while the 0.6B variant balances efficiency and capability to cover a wider range of use cases.
Qwen3-TTS supports four main capabilities. Voice cloning replicates a specific voice. Voice creation generates entirely new voices. Human-like speech generation makes interactions sound more natural. Finally, voice control based on natural language descriptions lets users direct how the speech should sound by describing it in plain text.
In terms of language coverage, Qwen3-TTS supports 10 mainstream languages, including Chinese, English, Japanese, and Korean, along with a range of dialects. This lets the model serve users across different regions and cultural backgrounds.
A key technical strength of Qwen3-TTS is its in-house encoder and dual-track architecture. Together they allow the model to reconstruct speech with low latency while preserving high-fidelity audio output. This combination of speed and quality makes Qwen3-TTS well suited to applications where real-time speech generation is crucial.
Qwen3-TTS is available on GitHub and HuggingFace, where developers and researchers can download the model and integrate it into their own projects.
