OpenAI Unveils Three Cutting-Edge Real-Time Voice Models, Elevating Voice Interaction to New Heights
Author: Editor

On May 8, 2026, OpenAI introduced three real-time voice models: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper. All three have been integrated into the Realtime API.
GPT-Realtime-2 offers GPT-5-level reasoning, which lets it handle interruptions and call tools in the middle of a conversation.
GPT-Realtime-Translate supports input in 70 languages and output in 13 for simultaneous translation.
Lastly, GPT-Realtime-Whisper provides low-latency streaming transcription, converting speech to text quickly and accurately.
All three models are billed either per token or per minute.