Inworld Unveils TTS-1.5: Low Latency at 25 Times Lower Cost Than Comparable Alternatives
2026-01-22
Author: Editor

Inworld AI has released Inworld TTS-1.5, a real-time speech AI model that is being billed as the fastest and highest-quality speech generation system currently available. The 1.5 Max version has a P90 (90th-percentile) first-sound latency of under 250 milliseconds. The 1.5 Mini version goes further, with a latency of under 130 milliseconds, roughly four times faster than its predecessor.
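
To make the P90 figure concrete, here is a minimal sketch of how a 90th-percentile latency can be computed from a set of measurements, using the nearest-rank method. The sample values and the `p90` helper are hypothetical illustrations, not Inworld measurements or Inworld code, and the company may use a different percentile method.

```python
def p90(samples_ms):
    """Nearest-rank 90th percentile: the smallest sample such that
    at least 90% of all samples are less than or equal to it."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    n = len(ordered)
    # 1-based rank = ceil(0.9 * n), computed with integers to avoid float rounding
    rank = (9 * n + 9) // 10
    return ordered[rank - 1]


# Hypothetical first-sound latencies in milliseconds (made-up values)
latencies_ms = [110, 118, 121, 125, 127, 129, 131, 140, 180, 260]
print(p90(latencies_ms))  # 9 of the 10 requests were at or below this value
```

A P90 under 250 milliseconds therefore means that nine out of ten requests produced their first audio within that time, while the slowest tenth may take longer.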

The Max version pairs this speed with the highest audio quality and the widest range of vocal expression. Compared with the previous model, TTS-1.5 is 30% more expressive and has a 40% lower word error rate, which reduces problems such as misheard words and makes the speech sound more lifelike. Multilingual support has expanded to 15 languages, and the cost is more than 25 times lower than that of competing solutions. The 1.5 Max version is intended for most use cases, while the 1.5 Mini version is tuned for latency-critical applications.
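
For readers unfamiliar with the metric, word error rate (WER) is conventionally the word-level edit distance between a reference text and a transcript, divided by the number of reference words. For speech synthesis it is commonly measured by transcribing the generated audio and comparing the transcript with the input text. The sketch below shows that standard calculation; the function names and example sentences are hypothetical, and the article does not say how Inworld measured its figure.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Word-level Levenshtein distance, keeping only the previous row
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(old, new):
    """Fractional drop from old to new, e.g. 0.4 for a 40% reduction."""
    return (old - new) / old


# Hypothetical example: one wrong word out of five
print(word_error_rate("the quick brown fox jumps", "the quick brown box jumps"))
```

Under this definition, a 40% reduction is relative: a model that previously got 5 words wrong per 100 would now get about 3 wrong per 100.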