Fish Audio Releases S2-Pro Model, Setting a New Standard for High-Fidelity Real-Time Speech Synthesis
Author: Editor

According to MarkTechPost, Fish Audio has officially launched its flagship text-to-speech (TTS) model, S2-Pro, which is built on a dual autoregressive architecture. The model outputs 44.1 kHz high-fidelity audio and supports zero-shot voice cloning, reproducing a speaker's identity and emotional state from just 10 to 30 seconds of reference audio. It also reaches a time to first audio of approximately 100 milliseconds on NVIDIA H200 hardware, setting a new benchmark for real-time interactive AI applications.
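As a rough illustration of the 10 to 30 second reference-audio figure, the sketch below checks whether a WAV clip falls inside that range before it is handed to any cloning workflow. It uses only Python's standard `wave` module and does not call any Fish Audio API; the function names, the inclusive bounds, and the silent test clip are assumptions made for the example.

```python
import io
import wave


def reference_clip_seconds(wav_file) -> float:
    """Return the duration in seconds of a PCM WAV (path or file-like object)."""
    with wave.open(wav_file, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_usable_reference(wav_file, min_s: float = 10.0, max_s: float = 30.0) -> bool:
    """Check a clip against the 10-30 s range (bounds assumed inclusive)."""
    return min_s <= reference_clip_seconds(wav_file) <= max_s


def make_silent_wav(seconds: float, rate: int = 44100) -> io.BytesIO:
    """Build an in-memory mono 16-bit silent WAV, for demonstration only."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    buf.seek(0)  # rewind so the buffer can be read back
    return buf


if __name__ == "__main__":
    clip = make_silent_wav(15)
    print(is_usable_reference(clip))
```

In practice the file-like object would be replaced with a path to a real recording; the check says nothing about audio quality, only clip length.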