As reported by The Decode, AI startup Resemble AI has released an open-source text-to-speech model named "Chatterbox Turbo." The model can clone a human voice from just a 5-second audio clip and reportedly delivers sound quality that surpasses that of ElevenLabs and Cartesia. With a latency of less than 150 milliseconds to the first audio output, Chatterbox Turbo is suited to real-time applications, including AI agents, customer service systems, gaming, virtual avatars, and social media platforms. The model is distributed under the MIT license, which permits commercial use, modification, and redistribution at no cost. It is already available on various platforms, and the complete code can be found on GitHub.
